Slashdot Mirror


Programming As If Performance Mattered

Junks Jerzey writes "Saw the essay 'Programming as if Performance Mattered', by James Hague, mentioned at the Lambda the Ultimate programming language weblog. This is the first modern and sensible spin on how optimization has changed over the years. The big 'gotcha' in the middle caught me by surprise. An inspiring read." Hague begins: "Are particular performance problems perennial? Can we never escape them? This essay is an attempt to look at things from a different point of view, to put performance into proper perspective."

615 comments

  1. Damn! by Spruce+Moose · · Score: 4, Funny

    If only I had written my first post program with performance in mind I could have not failed it!

    1. Re:Damn! by nacturation · · Score: 2, Funny

      Next up... Slashdot: Posting as if Karma Mattered

      --
      Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
    2. Re:Damn! by Anonymous Coward · · Score: 0
      try the "Post Humously" option.

      Still retarded. I'll bet you also think it's hilarious to say "bear with me" while standing next to the cage at the zoo. Actually, as bad as that joke is, it's much better than yours.

    3. Re:Damn! by Anonymous Coward · · Score: 0

      tough crowd... tough crowd...

  2. speed/easy coding by rd4tech · · Score: 2, Interesting

    The golden rule of programming has always been that clarity and correctness matter much more than the utmost speed. Very few people will argue with that

    Really? How about the server side of things?

    Shameless bragging: Why don't you take a look at my page to get a whole new view on performance?

    1. Re:speed/easy coding by Anonymous Coward · · Score: 1

      yes even on the server. fast but broken is useless.

      otoh, sometimes (not usually) fast *is* important, and you have to design for it.

      btw, weird comment about cpuedge. Why do you think a few tuned libs would give anyone 'a whole new view on performance'? same old same old.

    2. Re:speed/easy coding by ArbitraryConstant · · Score: 4, Insightful

      On the server side security is an issue (also on the client side, clearly). If your code isn't clear and correct, the number of bugs is likely to be higher than average, and bugs lead to exploits. Your libraries may be well written, I don't know specifically. It's possible to do both, just hard.

      --
      I rarely criticize things I don't care about.
    3. Re:speed/easy coding by edwdig · · Score: 3, Insightful

      I'd say his points are more true on the server side than the client side.

      Say you're a large business, and you have a mix of client side and server side applications. Both have significant processing time requirements. Which do you spend more time optimizing?

      In this scenario, you're going to have a large number of client machines and a small number of servers. If servers need a little more power, you can upgrade the machine without too much disruption or money spent. The upgrade will benefit all users of the system. In this case, it's more cost effective to upgrade the server than it is to pay developers to optimize the hell out of the code.

      The client machines are a different story. There are a lot of machines in use. Upgrading any one will only help the user of that computer. Optimizing the code will help every user. In this case, paying a developer to optimize your code will be a lot cheaper than doing a company wide hardware upgrade.

      This is all of course assuming you're designing things well in the first place. Of course you should do things like use a quick sort (or whatever may be more appropriate in the case at hand) instead of a bubble sort. The point is it's not worth spending days to get the last 1% of performance.

    4. Re:speed/easy coding by rd4tech · · Score: 1

      Mostly, it is time consuming.

    5. Re:speed/easy coding by rd4tech · · Score: 1

      But in every case, it depends on the time available. If you are a business, you have to pay for development time if you want to improve things. Interestingly though, sometimes even a small change can yield totally different performance. It happened to me that switching the position of two lines in the most critical part of the algorithm changed the total performance by 5%.

    6. Re:speed/easy coding by Lord+Kano · · Score: 5, Insightful

      Really? How about the server side of things?

      On the server side, I'd say that correctness and clarity are even more important. I guess it's all a matter of opinion as to where the "sweet spot" is, but most programming involves finding the right balance between speed and clarity.

      If you're in a situation where you need the servers to process large amounts of data, you're most likely in a position to be able to justify the expense of throwing better hardware at the problem.

      LK

      --
      "Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
    7. Re:speed/easy coding by imbaczek · · Score: 1

      The golden rule becomes a platinum one.

    8. Re:speed/easy coding by maxwell+demon · · Score: 2, Insightful

      At the server, correctness matters even more: A slow server may get overloaded by too many requests, but a fast but incorrect server process may be a security problem.

      Of course, a correct and fast server is much better than a correct and slow server.

      --
      The Tao of math: The numbers you can count are not the real numbers.
    9. Re:speed/easy coding by groot · · Score: 1

      The golden rule of programming has always been that clarity and correctness matter much more than the utmost speed. Very few people will argue with that



      was in a recent Intel annual report :) but misread by someone at MS as:

      blah...blah...{mentally replaced with whatever was rolling around inside the MS cranium at the time}...blah...matters more than the utmost speed...blah...blah...blah...
      --
      "Just remember, it takes a village idiot." -- The Motley Fool.
    10. Re:speed/easy coding by pboulang · · Score: 1
      Why do you bother to link to cpuedge.com? Not like you have any useful information on it. Or support for claims like "Award Winning". When one sees things like that without a nice badge or link to proof, hell even what the award is, one must assume it was from one's mother.

      Anyway, what is your point? That speed is more important than correctness? I mean, obvious from the

      The Software is not fault-tolerant and is not designed, manufactured or intended for use or resale as on-line control equipment in hazardous environments requiring fail-safe performance, such as in the operation of nuclear facilities, aircraft navigation or communication systems, air traffic control, direct life support machines, or weapons systems, in which the failure of the Software could lead directly to death, personal injury, or severe physical or environmental damage ("High Risk Activities").

      There is no attempt being made at true software validation. That would require some kind of "correctness". Besides, your speed charts aren't mindblowing; it is what, a 40% increase? Cheaper for me to spend money on hardware than to risk it on a site that is pretty, but sorely lacking in content.

      --

      This comment is guaranteed*

      *not guaranteed

    11. Re:speed/easy coding by Anonymous Coward · · Score: 0

      The awards which were won are clearly indicated on his site. Learn how to read, and don't be an ass.

    12. Re:speed/easy coding by rd4tech · · Score: 1

      Ad hominem, eh? :)
      1. I bother to link because I can.
      2. You obviously cannot appreciate the libraries offered and therefore you draw fast and wrongful assumptions about their usefulness (~120 downloads in the past 24 hours)
      3. Award winning it is. Check out this, and this, and this and this and this, or just search this.
      Just because you are incapable of grasping it, that doesn't mean that you have the right to bitch about it. (Well, you actually can bitch about it, /. ) For a healthy exercise, do try to optimize any existing algorithm that's designed for *speed*, and to gain, shall we say, 1%, then, you'll surely appreciate 40%.
      f) I know that the webpage isn't perfect, but considering my time spending for university, girlfriend, work, research, weight lifting, reading and optimizing those things I *can* and *love*, I think that the webpage is quite cool for now.
      g) I would bring myself that low to say that you, and whoever moded you up, are idiots.

    13. Re:speed/easy coding by TheLink · · Score: 1

      You prefer getting an almost/usually correct answer really fast?

      I hope that's not how you do your MD5s...

      --
  3. The question I always ask is by Anonymous Coward · · Score: 5, Insightful

    Is the time it takes me to do the performance optimization worth it in time or money.

    1. Re:The question I always ask is by rpozz · · Score: 2, Informative

      Performance can be quite a major thing if you're doing a lot of forking/threading (ie like a daemon). If you create 100 threads, any memory leaks or bottlenecks are multiplied 100 times.

      However, 0.1s delay after clicking an 'OK' button is perfectly acceptable. It all depends on what you're coding.

    2. Re:The question I always ask is by irokitt · · Score: 4, Insightful

      Probably not, but if you are working on an open source project, we're counting on you to make it faster and better than the hired hands at $COMPANY. That's what makes OSS what it is.

      --
      If my answers frighten you, stop asking scary questions.
    3. Re:The question I always ask is by prockcore · · Score: 2, Insightful

      Is the time it takes me to do the performance optimization worth it in time or money.

      The question I ask is, can the server handle the load any other way? As far as my company is concerned, my time is worth nothing. They pay me either way. The only issue is, and will always be, will it work? Throwing more hardware at the problem has never solved a single performance problem, ever.

      We've been slashdotted twice. In the pipeline is a database request, a SOAP interaction, a custom apache module, and an XSLT transform.

      Our server never even came close to its breaking point. I attribute it to optimizing for performance.

    4. Re:The question I always ask is by bm_luethke · · Score: 3, Insightful

      "Is the time it takes me to do the performance optimization worth it in time or money."

      To a certain extent. I've seen that excuse for some pretty bad/slow code out there.

      Writing efficient and somewhat optimised code is like writing readable, extensible code: if you design and write with that in mind you usually get 90% of it done for very, very little (if any) extra work. Bolt it on later and you usually get a mess that doesn't actually do what you intended.

      A good programmer should always keep both clean code and fast code in mind while writing software.

      --
      ------- Sorry about the spelling, I suffer from two problems. Dyslexia makes it difficult to spell well, lazy makes it
    5. Re:The question I always ask is by techno-vampire · · Score: 0
      Throwing more hardware at the problem has never solved a single performance problem, ever.

      Tell that to Micro$oft. Every version of Windows since 95 has needed more memory and speed and still works slower.

      --
      Good, inexpensive web hosting
    6. Re:The question I always ask is by dhalgren99 · · Score: 1

      Yeah, tell that to my manager...

    7. Re:The question I always ask is by Anonymous Coward · · Score: 0

      Well, if you're talking about Open Source, then if you measure it in dollars per hour spent, the ratio is always zero. Given that you're not getting paid at all, what does it matter whether coding takes a short or long amount of time, and whether the program runs fast or slow? You get what you pay for!

    8. Re:The question I always ask is by tanksalot · · Score: 2, Funny

      Is the time it takes me to do the performance optimization worth it in time that I could be browsing /.

      --
      "I am not denying the existence of stupidity, or of stupid people." - phyruxus
    9. Re:The question I always ask is by Anonymous Coward · · Score: 0, Funny

      Throwing more hardware at the problem has never solved a single performance problem, ever.

      I'm not even going to start trying to talk sense into you. You are obviously a pig-headed imbecile. That's OK, keep living in your own world where your leet optimisation skills are worth so much more than a CPU upgrade.

      Your job will be going to India soon, my friend.

    10. Re:The question I always ask is by corngrower · · Score: 3, Funny

      Yes, you always have to worry about those forking threads.

    11. Re:The question I always ask is by Anonymous Coward · · Score: 0

      The question I always ask is, 'is the time it takes me to do the performance optimization worth it in time or money'.

      Then I proceed to optimize performance, whatever the answer :)

    12. Re:The question I always ask is by maxwell+demon · · Score: 4, Insightful

      If working on OSS, clarity of the code has to be one of the top goals. Because if the code is not clear, you're less likely to find others interested in improving it.

      --
      The Tao of math: The numbers you can count are not the real numbers.
    13. Re:The question I always ask is by BlackHawk-666 · · Score: 1
      The short answer is maybe. Most of the time when I ask a client if they want me to spend more of their money making a fast routine a little bit faster they say they'd rather more features or to have it delivered on time :-)

      The exception however is one quite large client with a single (but beefy) webserver and several SQL Servers attached to it. Time after time the cheapskates refused to upgrade the poor ailing servers so I had three successive rounds of wringing more performance from the beasts. Each time I was able to at least double performance in some area, and usually quadruple or more. This allowed them to keep using the hardware for the full five year expected life, whilst nearly tripling the number of users to their site. As far as I know they still haven't replaced that hardware :-)

      --
      All those moments will be lost in time, like tears in rain.
    14. Re:The question I always ask is by khakipuce · · Score: 1
      Faster and better for who? Faster to execute or faster to maintain? Better usability or a better match to business requirements?

      Given a finite amount of resource, any project has to make compromises and in most cases performance optimisation is a long way down the list. Everyone is rightly concerned about security and quite often the most performant code is not the most secure. Checking variable lengths and types uses clock cycles but blocks security holes.

      Let's be honest - OSS is good because it is low cost, we can modify the code if we need to, and over time the feature/performance/security balance will be improved in line with what the userbase requires.

      --
      Art is the mathematics of emotion
    15. Re:The question I always ask is by ultranova · · Score: 2, Interesting

      So throwing more hardware at Windows doesn't solve its performance problems ? So the parent poster was right ?

      So what's your point ?

      In any case, the main usability problem with Windows XP isn't slowness - it's the thousand faces of madness within ! I mean the stupid popups that keep on harassing me ! "Do you want to clean the desktop ?" No, I want to work with the app I just started ! Why is there "never bother me again" -button on the darn popup ?!?

      I've only used XP at school, so I admit that my skills are probably lacking, and that a more skillful person could probably turn all of these annoyances off. Still, it is not fun having those popups disturb me all the time, it is not funny having the stupid operating system and Office programs hide half their menus, and it is NOT funny having that idiotic paperclip jumping up and offering useless "advice" all the time !

      Sorry about the rant, but I had to get that off my chest. Feel free to mod flamebait/offtopic/troll/whatever.

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    16. Re:The question I always ask is by (trb001) · · Score: 1

      A well documented piece of code is easy to read, no matter what algorithm you are using. I'd rather someone figure out the most efficient way of doing something, then document *exactly what they did* than write something up, quick and obvious, that isn't documented. I can usually read through code and get the general gist of what they're doing, no matter how complicated it is *.

      --trb

      * Perl doesn't count, you HAVE to consider readability with Perl or else even a seasoned veteran will get lost.

    17. Re:The question I always ask is by k12linux · · Score: 1
      The answer, I think, is "it depends." You need to know your data and how your code is going to be used. Here is a real example:

      Back around 1999, a coworker wrote a "round" routine for FoxPro for DOS. (Database language for those who have to ask.) It didn't round to the decimal, it rounded to the nearest 1/2, 1/4, 1/8, 1/16, etc.

      The way it did this was to build an array of every fractional value in decimal form, grab the decimal portion of the number, and then search the array for the nearest match.

      Even then, this could find the nearest 1/8th in just a small fraction of a second. But it fell down (miserably) if you were looking for the nearest 1/128th, or (since this WAS for database work) you were rounding values from a million records or so.

      The solution was to multiply the original number by the denominator (i.e. x * 8), round to the nearest whole number and divide it back. So, the code ended up looking something like

      * Round num to the nearest 1/frac (e.g. frac = 8 rounds to eighths)
      FUNCTION frac_round
      PARAMETERS num, frac
      RETURN ROUND(num * frac, 0) / frac
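For comparison, here is a minimal Python sketch of both approaches described above: the original table search and the multiply, round, divide version. The function names, the half-up rounding via `math.floor`, and the zero check are assumptions made for illustration; FoxPro's `ROUND` may treat ties and negative values differently.

```python
import math


def frac_round_table(num, frac):
    """Slow approach: list every multiple of 1/frac in [0, 1], then search for the nearest."""
    table = [i / frac for i in range(frac + 1)]
    whole = math.floor(num)
    part = num - whole  # fractional portion, always in [0, 1)
    nearest = min(table, key=lambda candidate: abs(candidate - part))
    return whole + nearest


def frac_round(num, frac):
    """Fast approach: scale by the denominator, round to a whole number, scale back."""
    if frac == 0:
        # Guard added for illustration; the original snippet had no such check.
        raise ValueError("frac must be non-zero")
    return math.floor(num * frac + 0.5) / frac


if __name__ == "__main__":
    for value in (2.3, 0.7, 5.49):
        print(value, frac_round_table(value, 8), frac_round(value, 8))
```

The table version does work proportional to the denominator on every call, which is why it would bog down at 1/128 over a million records, while the arithmetic version costs the same regardless of the denominator.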

      The point is that optimization, even then, wasn't necessarily about squeezing every ounce of speed out of your hardware. It could be more about not doing stupid things in your code, especially if the "stupid" part was going to be run repeatedly during execution.

      Someone could have spent a day converting his slow round code to C or assembly and building a new function in a library to call from FoxPro. Not much payoff doing that though.

      You could even do the same thing for the code I listed, but then maintainability drops a ton. Now your DB programmers would need to know C or assembly to make changes. (Like adding a check to prevent a divide-by-zero fault?) As it was, it could process a million records in around 6 seconds if I remember... more than fast enough for the job it had to do. Why optimize any further?

      So, as much as possible, know how your code is likely going to be used and what the data is going to be. Look for choke points where optimization will actually make a difference. And know when it's "good enough" vs when it's "as good as possible."

    18. Re:The question I always ask is by bobaferret · · Score: 1

      I run a website that gets about 150,000 db queries a day across 3 million court cases and 4 million people. And we add about 50,000 hits/day/month and 100,000 cases/month. I spend a good deal of my time optimizing code & queries. And the payoff is noticeable. It saves us from having to buy more hardware to keep up with demand, and when we do add more hardware the improvements are substantial, for a minimal amount of cost. I think that when doing predictable queries against large databases the user should see their results in under a second when the system is fully taxed.

      -jj-

    19. Re:The question I always ask is by be951 · · Score: 1
      Is the time it takes me to do the performance optimization worth it in time or money.

      The answer, naturally, depends on a number of factors such as the type and size of application.

      Also, there is a difference (to me, at least) between really doing performance optimization and making bad code less bad. Many responders are arguing the case for the latter. To my way of thinking, fixing bad code is just like it sounds: fixing defects. It shouldn't count as "optimization" (performance enhancement, maybe) if you're just going from bad to acceptable, codewise.

    20. Re:The question I always ask is by techno-vampire · · Score: 1

      What's my point? My point was that not only is the parent poster right, but NanoLimp Winderz is a textbook example of the problem. Even after getting a bigger, faster computer, Win2K will be slower than 98, and XP even slower. Micro$not has taken Winderz way past the point where faster computers can compensate for the pessimized bloatware they slap out and don't care.

      --
      Good, inexpensive web hosting
    21. Re:The question I always ask is by strobert · · Score: 1

      Very good take on this. A lot of the programmers I am seeing these days don't care or think about performance or how the code will run in operation. They think, oh, if we have problems we can fix it later, when anyone who has been doing it for a while knows it is far easier to put in some thought and get it decent (not perfect, mind you) the first time.

      Right now I am flabbergasted at a web application that REQUIRES 64-bit hardware so that it can have 4GB and up heap allocations. And this is for something close to a hello world for web apps (think online store with a couple of items and some back-end payment processing).

      I completely agree with the poster above: just think as you write. Yes, you want to keep the code readable and maintainable, but don't completely toss performance, and making sure it will run right, out the window.

    22. Re:The question I always ask is by Anonymous Coward · · Score: 1

      Your opinion is not universally held. See for example the Python project which has rejected ideas for improving performance in their code base which would have required more complex and less accessible code.

    23. Re:The question I always ask is by Minna+Kirai · · Score: 1

      If working on OSS, clarity of the code has to be one of the top goals.

      But I'm a major corporation using GPL code as the basis for my new product, and can't afford my customers hiring outside coders for support, you insensitive clod!

    24. Re:The question I always ask is by Minna+Kirai · · Score: 1

      doing a lot of forking/threading (ie like a daemon)

      Bah. Threading inside a daemon is for the weak! Only those whose minds are too small to contemplate handling multiple unsynchronized tasks within a single process space need lean on the crutch of threaded coding.

  4. What annoys me by Anonymous Coward · · Score: 4, Insightful

    is that ms word 4 did all I need, and now the newest office is a thousand times the size and uses so much more cpu and ram but does no more.

    a sad inditement

    1. Re:What annoys me by Planesdragon · · Score: 1, Offtopic

      is that ms word 4 did all I need

      And you aren't still using it why?

      (hint--your answer is the reason why MS 4 doesn't do all you need.)

      the newest office is a thousand times the size and uses so much more cpu and ram but does no more.

      Wrong. Office does a LOT more. Tasks that used to require running a specific process now run idly in the background, multiple times in-between keystrokes. If my PC falls behind me in typing now, I know that there's a problem--not just that I'm typing too fast.

      Of course, the world would be a cleaner place if MS made up their minds between "easy to use" and "powerful", instead of trying to be both and failing miserably at each.

    2. Re:What annoys me by Anonymous Coward · · Score: 5, Funny

      > a sad inditement

      Well, it does have a spell checker now...

    3. Re:What annoys me by DrEasy · · Score: 5, Insightful
      And you aren't still using it why? (hint--your answer is the reason why MS 4 doesn't do all you need.)
      Or maybe because you are forced to upgrade to read files that were created with a more recent version?

      --
      "In our tactical decisions, we are operating contrary to our strategic interest."
    4. Re:What annoys me by Bush+Pig · · Score: 2, Insightful

      Probably the only thing Word 4 doesn't do that he needs is read the Word 97 (or whatever) files that other people keep sending him.

      --
      What a long, strange trip it's been.
    5. Re:What annoys me by Anonymous Coward · · Score: 0

      No kidding.

      Inditement? The only word I could think he was trying to say was "indictment" ("a formal document written for a prosecuting attorney charging a person with some offense "), but what he was looking for was probably closer to "predicament".

    6. Re:What annoys me by Anonymous Coward · · Score: 2, Funny

      a sad inditement

      Your own post is a sad indictment of your spelling.

    7. Re:What annoys me by Anonymous Coward · · Score: 1, Informative

      Indictment means the act of indicting, and to indict means to accuse of wrongdoing. There are also the legal definitions, but indictment is correct for common usage.

    8. Re:What annoys me by ratsnapple+tea · · Score: 1

      Yeah, I think that was the point. That's the reason he needs the newest Office--he needs to read the files people send him. As far as features go, maybe he doesn't need all the new features of Office 3000 Turbo or whatever they're calling it these days, but other people probably do--and if he wants to be able to read their files, he's going to have to upgrade.

      That said, it sure would be nice if Microsoft would learn to write a word processor that wasn't slow as balls on my old iBook.

    9. Re:What annoys me by smallpaul · · Score: 1

      is that ms word 4 did all I need, and now the newest office is a thousand times the size and uses so much more cpu and ram but does no more.

      Newer versions of Word may not do new things that you need it to do, but many people do use the newer features of Word. For instance, the ability to see syntax errors without a separate syntax error pass and the superior WYSIWYG display. I expect that there are people out there for whom notepad and lpr are good enough. You want a little more than that but less than what Word of today does. You are in the minority.

    10. Re:What annoys me by shrykk · · Score: 1

      I use Office 2000 (well, Word-Excel-Powerpoint), and it's excellent. I won't consider anything new (free or not) until it can't access files I need - THEN I'll be an OpenOffice user.

      General consensus seems to be that Office 97 was Good Enough, and everything since is treading water. Gives free software a good chance to catch up.

      Seriously, as microsoft stuff gets re-released and re-released with few changes, people will wonder why they keep paying for something that was fine in 1997. Better copy-protection will help too. When home users can't get (pirated) MS software for free, free software will be a better alternative.

      --
      #define struct union /* Reduce memory usage */
    11. Re:What annoys me by Anonymous Coward · · Score: 0

      many people do use the newer features of Word. For instance, the ability to see syntax errors without a separate syntax error pass

      Word's grammar checker *still* gets more wrong than it gets right. Parsing English ain't easy. I don't know anyone who doesn't turn it off as soon as they start Word.

    12. Re:What annoys me by c0p0n · · Score: 1

      I thought that MS Word jumped from v2 to v6...

      --

      Your head a splode
    13. Re:What annoys me by julesh · · Score: 1

      Or write a word processor with a file format that supports graceful degradation when features added in newer versions are used. It isn't as if the concept is new... SGML-based applications supported it in the late 80s, and it is clearly a useful feature.

      MS employed people smart enough to realise that the feature set of their software would increase over time, and they were smart enough to design for forward compatibility. The only reason not to is to force the upgrade cycle, and that is why they didn't.
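As a rough illustration of what graceful degradation can look like in a markup-based format, here is a minimal Python sketch. The tag names (`doc`, `para`, `bold`, `sparkle`) and the `render` function are hypothetical and are not taken from any real word processor format; the idea is only that an older reader keeps the text inside elements it does not recognise and drops the formatting it cannot apply.

```python
import xml.etree.ElementTree as ET

# Renderers an "old" reader knows about (hypothetical tag set).
RENDERERS = {
    "doc": lambda text: text,
    "para": lambda text: text + "\n",
    "bold": lambda text: "*" + text + "*",
}


def render(element):
    """Render an element to plain text, degrading unknown tags to their bare content."""
    inner = (element.text or "") + "".join(
        render(child) + (child.tail or "") for child in element
    )
    # Unknown tag: fall back to passing the inner text through unchanged.
    renderer = RENDERERS.get(element.tag, lambda text: text)
    return renderer(inner)


if __name__ == "__main__":
    # A document written by a "newer" version that uses a tag the old reader lacks.
    newer = "<doc><para>Plain, <bold>bold</bold> and <sparkle>sparkly</sparkle> text.</para></doc>"
    print(render(ET.fromstring(newer)))
```

The old reader loses the newer formatting but still shows every word, which is the behaviour the comment above is asking for.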

    14. Re:What annoys me by Epistax · · Score: 1

      The majority of documents can be written fine in wordpad. Want to spell check it? Paste it into a box on Yahoo or something with a spell checker. Want tables? Open up outlook, excel, etc draw the desired shape, paste it into wordpad. Too bad you still need MS Office.

      Don't believe Office is bloated? Oh wait, how much space does a blank MS Word document take? 24,000 bytes...

    15. Re:What annoys me by N1KO · · Score: 1

      Maybe he shouldn't need to upgrade. I've never needed to upgrade to open someone else's PostScript, JPEG, MPEG, PDF, tar, TeX, DVI... and hundreds of other formats.

    16. Re:What annoys me by ultranova · · Score: 1
      Word's grammar checker *still* gets more wrong than it gets right. Parsing English ain't easy. I don't know anyone who doesn't turn it off as soon as they start Word.

      Sometimes, I just like to look at the psychedelic patterns of red and green and try to guess what the program thinks is wrong this time :).

      This is especially fun when I'm using my native language (Finnish). You see, Finnish is based on mangling words, as opposed to using prepositions like in English. For example, "to the house - taloon, in the house - talossa, from the house - talosta". There are 15 such mangled forms for each substantive, and of course other words mangle as well, following no particular rules... Add the fact that, despite the best efforts of language teachers, the Finnish language simply doesn't follow any strict rules of grammar, and you have a nightmare for a grammar checker writer.

      I wonder if torturing a computer program makes me a sadistic person ?-)

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    17. Re:What annoys me by hswerdfe · · Score: 1

      err I thought MS provided a free viewer...(so I hear I've never used it myself)

      --
      --meh--
    18. Re:What annoys me by Hast · · Score: 1

      Adding Finnish to the list of languages I'm not going to learn. ;-)

    19. Re:What annoys me by IceFreak2000 · · Score: 1
      I thought that MS Word jumped from v2 to v6...

      Word for Windows jumped from version 2 to version 6, to bring it into line with the DOS version of Word

      --
      Life is like a sewer; what you get out of it depends on what you put into it...
    20. Re:What annoys me by Bush+Pig · · Score: 2, Interesting

      Sure, but far too many people send Word97 files when a plain text file would have been adequate. Most people assume that you're going to have the same version of Word as they do, and go all blank when you ask for something else (I'm speaking from personal experience).

      --
      What a long, strange trip it's been.
    21. Re:What annoys me by cwis42 · · Score: 1
      And you aren't still using it why? (hint--your answer is the reason why MS 4 doesn't do all you need.)
      Or maybe because you are forced to upgrade to read files that were created with a more recent version?

      Doesn't that precisely count as a reason why MS 4 doesn't do all you need?

    22. Re:What annoys me by Surt · · Score: 1

      If only word weren't so huge, he could have loaded it, used the spellchecker, and then copy pasted his post.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    23. Re:What annoys me by Anonymous Coward · · Score: 0

      maybe you're just not taking advantage of the new features... like the spell checker...

    24. Re:What annoys me by DrEasy · · Score: 1
      Or maybe because you are forced to upgrade to read files that were created with a more recent version? Doesn't that precisely count as a reason why MS 4 doesn't do all you need?
      Stricto sensu, yes. But this is not a need that came from the user initially, it was rather created artificially by Microsoft (or as the result of other people's needs). Darn network effect! I guess you could keep around the old version for composing your own documents, and use the new version for opening complicated files. Kind of like the juggling we all need to do with web browsers these days...

      --
      "In our tactical decisions, we are operating contrary to our strategic interest."
    25. Re:What annoys me by DrEasy · · Score: 1

      Viewing is one thing, but you need to edit other people's documents too sometimes!

      --
      "In our tactical decisions, we are operating contrary to our strategic interest."
    26. Re:What annoys me by c0p0n · · Score: 1

      thanx!

      --

      Your head a splode
    27. Re:What annoys me by DigiShaman · · Score: 1

      Does this free viewer run on other OSs (Mac, Linux)? Can you cut the text from the viewer and paste into another word processing app?

      Hell, why doesn't Adobe make a word processing version of Acrobat?

      --
      Life is not for the lazy.
    28. Re:What annoys me by Planesdragon · · Score: 1

      Does this free viewer run on other OSs (Mac, Linux)?

      Not AFAIK. Mac, maybe. But both Linux and Mac have, essentially, free-as-in-beer DOC readers.

      Can you cut the text from the viewer and paste into another word processing app?

      YES!

      Hell, why doesn't Adobe make a word processing version of Acrobat?

      Because PDFs are a pain in the butt to edit, and, at current, it simply wouldn't make much sense. (As I understand it, the PDF file model has each line of text as a separate entry in the file, which means that it'd be well nigh impossible to do a decent PDF writer.)

      Then again, maybe they just have an agreement with MS to not compete; Adobe doesn't try to out-Word Word, and MS doesn't try to out-Acrobat Acrobat.

    29. Re:What annoys me by Sax+Maniac · · Score: 1

      I don't even remember the existence of Word version 4. I remember 1.0, 2.0, then 6.0 (gotta catch up with WordPerfect), 95 (the marketing weenies took over), 97, then it's a blur.

      --
      I can explanate how to administrate your network. You must configurate and segmentate it, so it can computate.
    30. Re:What annoys me by baryon351 · · Score: 1

      The Macintosh also had versions 4 and 5 at least, perhaps earlier.

    31. Re:What annoys me by Anonymous Coward · · Score: 0

      And you aren't still using it why?


      I do still use it

      (hint--your answer is the reason why MS 4 doesn't do all you need.)

      "I do still use it" is the reason why MS 4 doesn't do all I need? huh?

  5. Managed environments by Nick+of+NSTime · · Score: 4, Funny

    I code in managed environments (.NET and Java), so I just let some mysterious thing manage performance for me.

    1. Re:Managed environments by rpozz · · Score: 1

      I code in managed environments (.NET and Java), so I just let some mysterious thing manage performance for me.

      .NET/Java help with organising performance issues to a certain extent, however if you happen to create a large number of threads in Java, you will get quite a major speed decrease. Running in a VM isn't an excuse for inefficient code (in fact it can make it even worse in some cases).

    2. Re:Managed environments by Erratio · · Score: 1

      Oh come on... .NET or Java programs running slow? That's impossible. Next thing you're going to try to tell me is that they take up unnecessary resources too.

      --
      I don't try to be right, I just try to make people think
    3. Re:Managed environments by metlin · · Score: 4, Informative

      Contrary to popular belief, managed code environments do optimize code a whole lot more than you would think!

      Joe Beda, the guy from Microsoft behind Avalon, had a discussion on Channel9 where he talked about why managed code is not that bad a thing after all.

      Like I mentioned in an earlier post, managed code helps optimize the code for some of the bad programmers out there who cannot do it themselves, and takes care of a lot of exceptions and other "troublesome" things :) So, in the long run, it may not be that bad a thing after all.

      There are two facets to optimization - one is optimization of the code per se, and the other is the optimization of the project productivity - and I think managed code environments do a fairly good job of the former and a very good job of the latter.

      My 0.02.

    4. Re:Managed environments by asb · · Score: 1

      however if you happen to create a large number of threads in Java, you will get quite a major speed decrease

      No shit, Sherlock? How about this: "if you happen to fork a large number of processes in C you will get quite a major speed decrease."

      There is no silver bullet with which you will be able to write efficient programs. Not creating enough threads can be just as bad performance bottleneck. The performance issues in Java are just the same that are issues in other languages. In the end you only have to know what you are doing and what the APIs you are using are doing.

      --
      Antti S. Brax - Old school - http://www.iki.fi/asb/
    5. Re:Managed environments by rpozz · · Score: 1

      No shit, Sherlock? How about this: "if you happen to fork a large number of processes in C you will get quite a major speed decrease."

      The point I was making was that the speed decrease with excessive threads in Java is enormous, hence the need to watch how you code in Java, particularly with stuff like threading. I've also seen an awful lot of over-use of threads in Java, due to how easy it is to create them - which is why I used that as an example.

    6. Re:Managed environments by ashkar · · Score: 1

      Like I mentioned in an earlier post, managed code helps optimize the code for some of the bad programmers out there who cannot do it themselves, and takes care of a lot of exceptions and other "troublesome" things :) So, in the long run, it may not be that bad a thing after all.

      Why should I use a language designed for bad programmers? No self-respecting code monkey would ever say "Well, they designed this to do minor optimizations for people like me that can't code for shit, and that's why I use it." Either you have no self-respect, or you're the most honest person I've ever met.

    7. Re:Managed environments by metlin · · Score: 4, Interesting

      You assume that I made that reference to myself as being a bad programmer.

      The reason I made that statement was because just last week I was at Redmond for an interview for internship at Microsoft, and I was interviewed by the team that was trying to prevent just this sort of thing from happening.

      The idea was to design heuristics-enabled compilers that would effectively detect any "bad-code" and help make managed code and pseudo-managed code the norm, or convert existing code into managed code.

      I did not say that I was using a programming language that had such protections, merely that such programming languages have their own advantages. I was interviewed for creating compilers, linkers and OS-level protection that did not allow those troublesome things to exist - not use them - and hence my justification :)

      That said, you may knowingly or unknowingly use a language designed for bad programmers even when you program C or C++ in upcoming versions of compilers that insist on managed code - they may just wrap up your code in a nice wrapper to prevent mishaps and hand it over to the linker after having taken care of your holes.

    8. Re:Managed environments by Anonymous Coward · · Score: 2, Interesting

      I don't think you are really being fair.

      I've come to the belief that sending out machine-code packages is flawed, because you don't really know what the target platform is going to look like.

      Example: can the processor support SIMD instructions? This can make a _HUGE_ performance difference in some applications. I would argue that if you are shipping a binary, you just wouldn't know.

      Example: Code optimized for a Pentium III did not run as fast on a Pentium IV. Why? The pipeline changed and this had an effect on the way the compiler should schedule instructions.

      Summary: You want the binding between program -> machine code to be as late as you can push it. This doesn't mean I advocate source distribution, since that has its own issues (see Gentoo). But an easily translatable representation like Microsoft's MSIL or Java's bytecode seems to be the solution. This is a classic versioning problem: do you want every app to deal with it? Or deal with it once at the OS level?

      Not to mention having really stupid loaders causes other problems down the pipeline. Was this code compiled for thread safety? How about with debug information?

      But back to performance, low-level efficiency matters. But how much it matters depends on the application. Think in terms of processing large amounts of data. Obviously you want to touch it as few times as possible: therefore O(log n) is better than O(n), which is better than O(n^2). But to get good performance: BE CACHE FRIENDLY. Make sure you access data in an array, so that if you access the same data again, you access it soon, and if you look at one element you are likely to look at one of its neighbors.

      This will keep your code running well today, and will allow future hardware to process it faster (assuming new machines use the same type of memory hierarchy only bigger). This does mean preferring indexing tricks into the array over pointers. And yes, linked lists are your enemies.

      Anything else, it is going to be a wash. The first thing that researchers of some new language find out is how to get rid of 80% of the inefficiencies that make it slower than C.

      In sum, concentrate more on the algorithms. Be cache friendly. Program in a language that you enjoy. And leave the pissing contest over whose language (and therefore their unit) is fundamentally faster to people who have nothing better to do.

    9. Re:Managed environments by Anonymous Coward · · Score: 0

      And at my workplace, we are finding that this is not nearly as fast as we'd like.
      We have migrated 80% of our database-backed front end apps from the old, and horrible, MS Access linked clients to .NET based intranet pages.
      The reason was mainly to speed up distribution and remove versioning problems. And also because Access was just too limiting in certain respects.
      It's a simple architecture, with the aspx accessing the databases directly and pushing html out to the clients. No middle tiers.
      But to get any useful functionality requires one of 2 things:
      1) Lots of postbacks to gather data and regenerate pages
      2) Lots of javascript.
      We chose (2) in order to "speed things up". But the javascript needs to be loaded, and then interpreted. And much of it is used just when generating the pages, not responding to client events. We have had to implement our own data and javascript caching on the clients in order to make it tolerable. And now the main GUIB (G U I bitch - "gweeb") is being made to do his work via a 56k connection into the LAN in order to see just how slow the whole process can be. These client machines are running windows 98, and you'd be lucky to find one over 1 gigahertz, but I simply cannot accept that it actually takes as long as it does to process a hundred lines of javascript and render a page worth of HTML. The underlying software architecture has slowed down much, much faster than hardware has sped up. Coding (or more often re-coding) as if performance matters is taking up vastly more time in our organisation than it used to.

    10. Re:Managed environments by Torne · · Score: 1

      I have yet to observe a significant difference between Java threads and pthreads performance; perhaps you are thinking of *very* old Java versions that used 'green' threads (threads implemented in the VM by doing interpreter-level context switches)? All modern Javas use native threads, and the overheads associated with using them are pretty much identical to those of using the native threading API of your platform.

      I have converted Java network server apps that use one thread per client into threadpooled apps using asynchronous I/O - switching from java.io to java.nio (so you can manage your own buffers with full knowledge of your app's requirements, and use native OS buffers) is a large performance boost; going one step further and using asynchronous I/O and a small number of worker threads is almost not worth doing (though it's the way I do it when writing *new* apps) because the performance gain is just so small.
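
      For readers who haven't seen the pattern, here is a minimal sketch of the "small pool of workers instead of one thread per client" idea. It is in Python rather than Java, it leaves out the java.nio and asynchronous I/O parts entirely, and `handle_client` and the pool size are hypothetical stand-ins, not anything from the apps described above.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_client(client_id):
    # Hypothetical stand-in for per-client work (parse a request, build a reply).
    return f"reply to client {client_id}"


def serve_with_pool(client_ids, workers=4):
    # A fixed pool of worker threads services every client, instead of
    # creating one new thread per client.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the inputs.
        return list(pool.map(handle_client, client_ids))


replies = serve_with_pool(range(10))
print(len(replies), "clients served by 4 worker threads")
```

      The thread count stays fixed however many clients show up, which is the property the pooled design is after.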

    11. Re:Managed environments by Anonymous Coward · · Score: 0
      1. Joe Beda, the guy from Microsoft behind Avalon, had a discussion on Channel9 ...

      Channel 9; one of Microsoft's marketing mouthpieces? Why was this moderated up?

      I'll find other sources of news, thanks.

    12. Re:Managed environments by merphle · · Score: 1
      The catch, however, is that this is only really true in relatively long-running applications. The interpreter (CLR/JVM/etc) is best at optimizing code after said code has been executed a number of times during any given instantiation.

      Don't get me wrong -- I love managed code environments. I find that I'm significantly more productive when coding in C# or Java than in C++. The performance hit for the managed code environments becomes negligible when averaged over the course of the day, for the server-style apps I write, due to runtime optimization.

    13. Re:Managed environments by Fweeky · · Score: 1

      A binary can choose different code paths to follow based on the CPU it's running on; games do it all the time.

      Did you see this latest Python JIT-alike btw? I think it was on SlashDot not long ago.. ah, Psyco. Shame I don't like Python, but it makes for a nice proof-of-concept to show that high level and dynamic doesn't have to mean slow (as if SmallTalk and friends haven't already done that).

    14. Re:Managed environments by phasm42 · · Score: 1

      I'd also like to note that switching to jdk1.4.2 (I think we were using 1.3.1 previously) under Linux gave our servers an enormous boost in speed -- a lot of CPU/disk/network intensive code ran 5 times as fast. Btw, the JDK we used was from Blackdown.

      --
      "No one likes working in a hamster wheel, and your shop smells of cedar shavings from here." - TaleSpinner
    15. Re:Managed environments by Torne · · Score: 1

      Yup, the runtime optimiser in Sun's 1.4 VM is really terrifyingly good. As long as you stay away from GUI components and stick on the server side, Java performs very well, though it does use a lot of memory =)

    16. Re:Managed environments by Brandybuck · · Score: 1

      No matter how awesome that Java/C# optimizer is, it still won't fix a bad algorithm. Bubble sorts won't magically transform into quicksorts, for example. Of course, the new trend is to include all possible algorithms in the framework. Pretty soon Windows developers will be doing nothing more than drag-n-dropping controls onto a form and connecting them to a backend database. Oh wait...

      --
      Don't blame me, I didn't vote for either of them!
    17. Re:Managed environments by TheLink · · Score: 1

      The last I checked there are only a few programmers who put personal/monetary guarantees on their code - e.g. Knuth, DJB. So IMO the rest of the programmers suck and any "self respect" they have as programmers can't be worth very much.

      A brief look at bugtraq will show you that the other programmers can't code for shit and should use languages designed for bad programmers.

      But they're not even smart enough to realize that. They think they are good programmers and many even use stuff like C and C++[1].

      If your program operates in a hostile environment (network server, online app etc), I recommend you stick to languages designed for bad programmers. I do, which is why even though I am a crappy programmer I know my program won't have those buffer/stack overflow crap (it'll be the language programmer's mistake not mine), instead of counting bits one by one I can stick to getting the job done.

      Maybe stuff like the NX thingy and other stuff might help eventually. But meanwhile, the IT security business is still OK.

      [1] Dangerous languages where common silly mistakes will "allow the attacker to execute _arbitrary_code_ of the attacker's choice". Safe and sane languages won't do that.

      --
  6. Funny thing about performance by ObviousGuy · · Score: 5, Interesting

    You can spend all your time optimizing for performance and when you finally release your product, your competition whose main objective was to get the product out the door faster, who uses a slower algorithm, is already first in mindshare with your customers. Not only that, the processors that you thought you would be targeting are already a generation behind and that algorithm that was going to hold back your competition runs perfectly fast on new processors.

    Performance gains occur at the hardware level. Any tendency to optimize prematurely ought to be avoided, at least until after v1.0 ships.

    --
    I have been pwned because my /. password was too easy to guess.
    1. Re:Funny thing about performance by Anonymous Coward · · Score: 1, Funny

      Yeah. I bet you use bubblesort too...

    2. Re:Funny thing about performance by ObviousGuy · · Score: 2, Interesting

      No, I use the language's sort routine. This typically means quicksort or heapsort.

      Do you code all your own algorithms?

      --
      I have been pwned because my /. password was too easy to guess.
    3. Re:Funny thing about performance by corngrower · · Score: 5, Insightful
      Any tendency to optimize prematurely ought to be avoided, at least until after v1.0 ships.


      Assuming there is a second version, which there may not be because potential customers found that the performance of v1.0 sucked.

    4. Re:Funny thing about performance by Anonymous Coward · · Score: 0
      You can spend all your time optimizing for performance and when you finally release your product, your competition whose main objective was to get the product out the door faster, who uses a slower algorithm, is already first in mindshare with your customers.

      Either that or you are successful and make lots of money at which point Microsoft decides they will take your customers by writing a faster app using their undisclosed API calls.

    5. Re:Funny thing about performance by metlin · · Score: 4, Insightful

      Well said.

      However, I will dispute the claim that performance gains happen only at the hardware level - although programmers cannot really optimize every tiny bit, there is no harm in encouraging good programming.

      The thing is that a lot of programmers today have grown NOT to respect the need for performance - they just assume that the upcoming systems would have really fast processors and infinite amounts of RAM and diskspace, and write shitty code.

      I agree that like Knuth said, premature optimization is the root of all evil. However, writing absolutely non-optimized code is evil in itself - when a simple problem can be simplified in order and time, it's criminal not to :)

      A lot of times, programmers (mostly the non-CS folks who jumped the programming bandwagon) write really bad code, leaving a lot of room for optimization. IMHO, this is a very bad practice, something that we have not really been paying much attention to because we always have faster computers coming up.

      Maybe we never will hit the hardware barrier, I'm sure this will show through.

    6. Re:Funny thing about performance by Anonymous Coward · · Score: 2, Informative

      Obviously you have never done any programming wrt cryptography. Optimization is *_N-E-V-E-R_* done in the hardware!!! The difference between using a good algorithm and a crappy one is the difference between 2 days for the program to run, and fifty trillion centuries (literally). Hardware upgrades are merely incremental. Moore's law says speed doubles every 18 months, but doubling is a tiny incremental increase, if you want an exponential/logarithmic change, you have to use software. I'm not talking about "oh, twice as fast as a year ago", but "10,000 times as fast as that other software" or "1 billion times as fast".
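
      One familiar instance of this, picked purely for illustration (the post above doesn't name a specific algorithm): modular exponentiation by repeated multiplication versus square-and-multiply. The function names and the multiplication counters are assumptions added for the sketch.

```python
def naive_modpow(base, exponent, modulus):
    # Multiply by base once per unit of the exponent: O(exponent) multiplications.
    result, mults = 1 % modulus, 0
    for _ in range(exponent):
        result = (result * base) % modulus
        mults += 1
    return result, mults


def fast_modpow(base, exponent, modulus):
    # Square-and-multiply: O(log exponent) multiplications.
    result, mults = 1 % modulus, 0
    base %= modulus
    while exponent > 0:
        if exponent & 1:
            # This bit of the exponent is set, so fold the current power in.
            result = (result * base) % modulus
            mults += 1
        base = (base * base) % modulus
        mults += 1
        exponent >>= 1
    return result, mults


print(naive_modpow(3, 1000, 1009))
print(fast_modpow(3, 1000, 1009))
```

      Both return the same residue; the second element of each tuple is the number of multiplications it took. With exponents hundreds of bits long, the naive loop simply never finishes, and no hardware upgrade changes that.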

    7. Re:Funny thing about performance by Erratio · · Score: 1

      The releasing of v1.0 is of course a somewhat arbitrary decision. I think what is implied is that the program should originally be written with clarity and reusability in mind, then after an initial version is completed, any bottlenecks should be optimized (for me that happens well before v1). Saving a half a second on a routine that runs once in a while can be delayed for a while if not indefinitely, but code that is passed through a million times every time the program is used will benefit immensely from even the slightest tweak.
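
      Finding which code is "passed through a million times" is a job for a profiler rather than guesswork. A rough sketch using Python's standard cProfile and pstats modules; `slow_part`, `fast_part`, `work` and `profile_report` are made-up names for the example.

```python
import cProfile
import io
import pstats


def slow_part(n):
    # Deliberately heavy: touches every value up to n.
    return sum(i * i for i in range(n))


def fast_part(n):
    # Trivial by comparison.
    return n


def work():
    slow_part(50_000)
    fast_part(50_000)


def profile_report(func):
    # Run func under the profiler and return the top entries as text,
    # sorted by cumulative time so the real bottleneck floats to the top.
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
    return out.getvalue()


print(profile_report(work))
```

      The report makes it obvious that any tweaking belongs in `slow_part`; time spent polishing `fast_part` would be wasted.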

      --
      I don't try to be right, I just try to make people think
    8. Re:Funny thing about performance by naden · · Score: 2, Insightful

      Assuming there is a second version, which there may not be because potential customers found that the performance of v1.0 sucked.

      Better a version 1.0 that sucked than none at all.

      And funny how Microsoft seems to release so many crappy 1.0 releases yet usually ends up clawing back to become the market leader.

      --
      Funtage Factor: Purple
    9. Re:Funny thing about performance by techno-vampire · · Score: 4, Interesting
      The thing is that a lot of programmers today have grown NOT to respect the need for performance - they just assume that the upcoming systems would have really fast processors and infinite amounts of RAM and diskspace, and write shitty code.

      That's not the only reason. Programmers usually get to use fast machines with lots of RAM and diskspace, and often end up writing programs that need everything they have.

      Back in the DOS days, I worked on a project that had a better way of doing things. We had one machine with reasonable speed as the testbed. It wasn't well optimized as we didn't expect our customers to know how to do that and the programs we were writing didn't need expanded or extended memory. If what you wrote wouldn't run on that machine, it didn't matter how well it worked on your machine, you had to tweak it to use less memory.

      --
      Good, inexpensive web hosting
    10. Re:Funny thing about performance by mibus · · Score: 1

      IIRC, the sucking-performance-killing-a-product problem was what caused Infocom's Cornerstone database software to be largely a flop. By the time they'd gotten towards solving performance issues, the company was virtually dead...

    11. Re:Funny thing about performance by kubrick · · Score: 5, Insightful

      All Microsoft have to do is pre-announce features that won't be in their products until v3, well before the release of v1 (vide Go Corporation & Pen Windows), and that's enough to kill off the competition. Microsoft's success has become a self-fulfilling prophect for most of the market these days...

      --
      deus does not exist but if he does
    12. Re:Funny thing about performance by Anonymous Coward · · Score: 0

      s/prophect/prophecy/

      It's a new keyboard, I'm not used to it yet. :)

    13. Re:Funny thing about performance by Anonymous Coward · · Score: 2, Informative

      Here's a somewhat relevant anecdote.

      I interviewed at a company that makes a big deal about being super duper technical on their web site. They had a written coding problem as part of the interview. (A good sign!)

      They left me in a room with a non-networked PC with instructions to get as far as possible in writing a program in Java to take an initial date and a number of days to add or subtract, and figure out what the resulting date would be. The test instructions contained a detailed explanation of the workings of the Gregorian calendar system. The PC had Windows and the JDK installed on it, and just about nothing else. They gave me a pretty short period of time to do it in - 15 or 20 minutes, if I remember right.

      At first I had to call the interviewer back in so that I could show him that there were about TEN different past solutions still sitting on the hard drive, and that I was going to delete them all while he watched and ask him to start the clock again. (Lamers...)

      When I read the problem I realized that it was very easily solved using the java.util.GregorianCalendar class that comes with the JDK. I didn't remember exactly how to use it but fortunately the installed JDK on this PC also included the JDK source. I javadoc'd the source to GregorianCalendar and Calendar and read the docs, wrote my app, and tested it thoroughly. Of course it didn't take long to get it working, since the hard work was already done. I had to walk all over the office looking for the interviewer, who apparently wasn't expecting me to actually complete the task within the allotted time.
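
      I no longer have the Java I wrote that day, so as an illustration of the same "let the library do it" approach, here is roughly what it looks like in Python with the standard datetime module. The `add_days` name is made up for this sketch; it is not the GregorianCalendar code from the interview.

```python
from datetime import date, timedelta


def add_days(start, days):
    # The library already knows month lengths and leap-year rules,
    # so adding or subtracting days is a single expression.
    return start + timedelta(days=days)


print(add_days(date(2004, 2, 28), 2))
print(add_days(date(2001, 1, 1), -1))
```

      All of the Gregorian calendar rules from the test instructions are hidden inside the library, which was exactly my point to the proctor.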

      When I reviewed my very short program with the proctor and explained all of the things that I had done in order to do it that way, he seemed upset, as though I cheated. I tried to make a case for the fact that I had passed up a chance to actually cheat and then been resourceful, but he wasn't convinced. I didn't do it exactly the slow and tedious way they wanted, so I was wrong. I pointed out that if I was on a project and caught a developer duplicating base JDK functionality due to plain ignorance of the class library, I'd consider that a *bad* thing, not an example of technical excellence.

      The rest of the interview went OK, but they eventually called me back and said there was a hiring freeze. Well maybe so, since it was in 2000 or 2001 (I don't remember exactly when), or maybe not. I wasn't exactly crushed.

      Since then, the hard-skills tests that I use when interviewing developer candidates include something like this for the relevant environment... kinda like you said. Something like "read in a text file, sort the lines, and print it out in sorted order". If their program includes a sort routine, BZZT, they failed the test.
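
      For the curious, a passing answer is about this long. This is a sketch in Python; `sort_lines` is a name made up for the example, and `io.StringIO` stands in for an opened text file so it runs without anything on disk.

```python
import io


def sort_lines(stream):
    # Strip the trailing newline from each line and hand the sorting to
    # the built-in sorted(); no hand-written sort routine anywhere.
    return sorted(line.rstrip("\n") for line in stream)


# io.StringIO stands in for an opened text file in this sketch.
sample = io.StringIO("pear\napple\nfig\n")
for line in sort_lines(sample):
    print(line)
```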

    14. Re:Funny thing about performance by Henk+Poley · · Score: 1

      The thing is that a lot of programmers today have grown NOT to respect the need for performance - they just assume that the upcoming systems would have really fast processors and infinite amounts of RAM and diskspace, and write shitty code.

      Not only that, but they are teached that this is the case. Which is IMHO rather odd, as since the average consumer PC is 1-2y old.

    15. Re:Funny thing about performance by Anonymous Coward · · Score: 0

      Hate to be pedantic, but past tense of teach is taught :)

    16. Re:Funny thing about performance by mcrbids · · Score: 1

      Any tendency to optimize prematurely ought to be avoided, at least until after v1.0 ships.

      Assuming there is a second version, which there may not be because potential customers found that the performance of v1.0 sucked.


      I wrote an application not too long ago. Being that PHP is my language of choice, and being that PHP-GTK was available, I wrote it in that language.

      The software manages paperwork for schools. There are about 1,000 courses in files in the program. The user selects courses, gives assignments to students, grades the results, and prints it off.

      I initially paid no attention to performance - and performance sucked. It would tie down a P4-2000 badly in places.

      So, I spent a week or two optimizing performance. I discovered a few things about what really slowed down the computer, and built indexes in memory rather than reading from disk anytime you "want to know".
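
      The real code is PHP, but the idea is simple enough to sketch in a few lines of Python. The course records and function names below are invented for the example; the list scan stands in for going back to the files every time.

```python
def find_by_scan(courses, course_id):
    # Walks the whole list on every lookup (stand-in for re-reading the files).
    for course in courses:
        if course["id"] == course_id:
            return course
    return None


def build_index(courses):
    # One pass up front; every later lookup is a single dictionary hit.
    return {course["id"]: course for course in courses}


# Hypothetical data: about 1,000 courses, as in the application described.
courses = [{"id": n, "title": f"Course {n}"} for n in range(1000)]
index = build_index(courses)
print(index[999]["title"], find_by_scan(courses, 999)["title"])
```

      The index costs some memory and one pass at startup, which is a trade I was happy to make.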

      Bottom line is that it's a terrible performer. Just pitiful. It's also completely irrelevant since the average user has about an 800 MHz system, on which it performs just fast enough to never appear slow.

      And, in a few years, it'll be "lean and mean" compared to whatever else is out there.

      Oh, and version 1 (called 4.0 for largely historical reasons - it's a rewrite of an earlier, fundamentally broken codebase) has been quite successful, and I'm in the middle of pounding out version 2 as quick as I can.

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    17. Re:Funny thing about performance by Molina+the+Bofh · · Score: 4, Funny
      I do Dumbsort.

      Dumbsort works something like this:
      :loop
      randomize (array)
      if (sorted) goto next
      goto loop
      :next
      Straight from MS-style programming books.
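
      For anyone who wants to try it at home, a runnable sketch of the loop above in Python. The function names and the shuffle counter are additions for illustration; the expected number of shuffles grows roughly like n!, so keep the input tiny.

```python
import random


def is_sorted(items):
    # True when every element is <= the one after it.
    return all(a <= b for a, b in zip(items, items[1:]))


def dumbsort(items, rng=random):
    # Shuffle until the list happens to come out in order.
    items = list(items)
    shuffles = 0
    while not is_sorted(items):
        rng.shuffle(items)
        shuffles += 1
    return items, shuffles


print(dumbsort([3, 1, 2]))
```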
      --

      -
      Roses are #FF0000, Violets are #0000FF, find / -name '*base*' |xargs chown -R us && mv zig greatjustice
    18. Re:Funny thing about performance by Anonymous Coward · · Score: 0

      While Knuth said that premature optimization is evil, he said it in the context of optimizing the implementation, that's very different from the problems discussed here.

      Choosing a bad algorithm is something your hardware can't solve. Replacing instructions with cheaper ones, adding caching and all this are implementation details, but trashing CPU / filesystem caches is whole lot different.

    19. Re:Funny thing about performance by almaw · · Score: 4, Insightful

      > Performance gains occur at the hardware level.
      > Any tendency to optimize prematurely ought to be
      > avoided, at least until after v1.0 ships.

      Performance gains occur at the algorithm level. It doesn't matter how much hardware you throw at a problem if it needs to scale properly and you have an O(n^3) solution.
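
      A back-of-the-envelope sketch of why, assuming running time grows exactly as n raised to some power (a simplification). `max_problem_size` is a made-up helper for the arithmetic.

```python
def max_problem_size(n_now, speedup, exponent):
    # If running time grows as n**exponent, hardware that is `speedup`
    # times faster handles n_now * speedup**(1/exponent) items in the
    # same wall-clock time.
    return n_now * speedup ** (1.0 / exponent)


# Hardware 8 times faster, cubic algorithm vs. linear algorithm:
print(max_problem_size(1000, 8, 3))
print(max_problem_size(1000, 8, 1))
```

      With the cubic algorithm, hardware that is eight times faster only doubles the problem size you can handle; with a linear one it handles eight times as much.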

    20. Re:Funny thing about performance by Anonymous Coward · · Score: 1, Funny
      Performance gains occur at the algorithm level. It doesn't matter how much hardware you throw at a problem if it needs to scale properly and you have an O(n^3) solution.

      On the other hand, lower complexity isn't always better. Quicksort is O(n^2) in the worst case, but in practice it's almost always an order of magnitude faster than the O(n log n) heap sort.

      And here's a wrapper function which will turn any sort algorithm into an O(1) one:
      void sort (char** arr) {
        sort_strings (arr);
        for (;;);
      }
      I suspect, however, that it will not improve performance.
    21. Re:Funny thing about performance by Moraelin · · Score: 5, Insightful

      Well, yes and no.

      I still don't think you should start doing every single silly trick in your code, like unrolling loops by hand, unless there's a provable need to do so. Write clearly, use comments, and use a profiler to see what needs to be optimized.

      That is coming from someone who used to write assembly, btw.

      But here's the other side of the coin: I don't think he included better algorithms in the "premature optimization". And the same goes for having some clue of your underlying machine and architecture. And there's where most of the problem lies nowadays.

      E.g., there is no way in heck that an O(n * n) algorithm can beat an O(log(n)) algorithm for large data sets, and data sets _are_ getting larger. No matter how much loop unrolling you do, no matter how you cleverly replaced the loops to count downwards, it just won't. At best you'll manage to fool yourself that it runs fast enough on those 100 record test cases. Then it goes productive with a database with 350,000 records. (And that's a small one nowadays.) Poof, it needs two days to complete now.

      And no hardware in the world will save you from that kind of a performance problem.

      E.g., if most of the program's time is spent waiting for a database, there's no point in unrolling loops and such. You'll save... what? 100 CPU cycles, when you wait 100,000,000 cycles or more for a single SQL query? On the other hand, you'd be surprised how much of a difference it can make if you retrieve the data in a single SQL query, instead of causing a flurry of 1000 individual connect-query-close sequences.

      (And you'd also be surprised how many clueless monkeys design their architecture without ever thinking of the database. They end up with a beautiful class architecture on paper, but a catastrophic flurry of queries when they actually have to read and write it.)

      E.g., if you're using EJB, it's a pointless exercise to optimize 100 CPU cycles away, when the RMI/IIOP remote call's overhead is at least somewhere between 1,000,000 and 2,000,000 CPU cycles by itself. That is, assuming that you don't also have network latency adding to that RPC time. On the other hand, optimizing the very design of your application, so it only uses 1 or 2 RPC calls, instead of a flurry of 1000 remote calls to individual getters and setters... well, that might just make or break the performance.

      (And again, you'd be surprised how many people don't even know that those overheads exist. Much less actually design with them in mind.)

      So in a nutshell, what I'm saying is: Optimize the algorithm and design, before you jump in to do silly micro-level tricks. That's where the real money is.
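
      To put rough numbers on the remote-call point, here is a small cost model in C. It is only a sketch: the function names are made up for illustration, and the inputs suggested below (1,000,000 cycles of overhead per call, 100 cycles of useful work per item, 1000 items) are the ballpark figures from this comment, not measurements.

```c
#include <stdint.h>

/* Cycles spent when each of n items is fetched with its own remote call:
   every item pays the full per-call overhead. */
uint64_t cycles_per_item_calls(uint64_t n, uint64_t call_overhead,
                               uint64_t work_per_item) {
    return n * (call_overhead + work_per_item);
}

/* Cycles spent when all n items travel in one batched call:
   the overhead is paid once. */
uint64_t cycles_batched_call(uint64_t n, uint64_t call_overhead,
                             uint64_t work_per_item) {
    return call_overhead + n * work_per_item;
}
```

      With those inputs, 1000 individual calls cost 1,000,100,000 cycles and one batched call costs 1,100,000, roughly a 900x gap that no amount of loop unrolling will recover.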

      --
      A polar bear is a cartesian bear after a coordinate transform.
    22. Re:Funny thing about performance by Anonymous Coward · · Score: 0

      And don't underestimate that OSS version out there going very slowly. They will eventually overtake you, as those programmers ARE making sure it's faster and better.

      You might get to market 36 months before them, but if your customers see that this "new free thing" is faster 3 years later, the upgrade that you "count on" selling may not sell at all.

      Look at Linux right now. Within the next 5 years it will overtake all other OSes just because it's plodding along at its own pace. No deadlines, no moron executive VPs screaming "ship it! ship it!", and OSS has radical project managers calling for "out there" ideas like "make it smaller" and "make it faster", and the mother of all radical ideas..... "make it better".

      I use many OSS and commercial apps. One of my favorite commercial apps is Lightwave.... but the OSS Blender is overtaking it quickly. It performs better, renders faster, and they are working on increasing its stability to make Lightwave look silly! (Granted, some of Lightwave's instability is the dongle support, and at times it "loses" the dongle and locks up, but an insanely expensive app should NOT crash like it does at times.)

      All I can say is... commercial developers need to keep doing things the way they are now... the rest of the OSS world will silently pass by in the night and the commercial companies will start asking "what happened".

      Slow code = bad code, you CAN write clear and fast code. Embedded developers do it every day.

    23. Re:Funny thing about performance by Threni · · Score: 1

      Microsoft produces products that work. Not the best, not the cheapest, but they work. And they're easy for users to use without too many stupid questions. That's pretty much all there is to it.

      It's like music. There's really good music....then there's the crap that gets on the radio/tv. Most people aren't very discerning, so as long as there's something on the radio, or in the background or whatever, then that's fine.

      Must dash - there's a Britney song on the radio...

    24. Re:Funny thing about performance by hankaholic · · Score: 3, Insightful

      I was going to moderate this post "Overrated", but I'd rather just explain why you're wrong in stating that the "algorithm that was going to hold back your competition runs perfectly fast on new processors".

      Certain algorithms take more-than-proportionately longer as the data size increases. For example, if you're writing route-planning software, each additional stop on a route might cause the number of calculations required to (roughly) double.

      In such a case, having hardware which is twice as powerful would mean that the run time would halve, although as soon as the user added two more data points, it would run slower than it did on the original machine.

      To clarify a tad, let's say FedEx decides to optimize the routes drivers in Montana are travelling. Assume that there are 10,000 stops and 200 drivers, and that your code runs in, say, an hour on FedEx's machines.

      Assume that you've used an algorithm for which each additional data point doubles the amount of computation required. Now FedEx deciding to hire 10 more drivers means that your route planning software is going to take 2^10 times as long to plan their routes (since it doubles for each new data point, that's 2^1 for one driver, 2^2 for two, 2^3 for three...).

      The point is that tiny operations add up when you've chosen the wrong algorithm. Despite the fact that runtime was fine using FedEx's CPU farm in the original situation, your disregard for efficiency will cause the route-planning time to take not the overnight-batch-job-friendly hour, but a stunning 1024 times as long (hint: over a month).

      Say a new big fast machine enters the market, with four times the CPU power. FedEx will still need 256 times as many machines to perform the same calculations in under an hour, or at least, say, 32 times as many in order to be able to perform them overnight.

      All because you decided that choosing algorithms based on performance was poppycock.

      Prematurely optimizing on a microscopic level may be "bad", but choosing the proper algorithm can make the difference between a product with a future and a product with too many designed-in limitations to be able to handle greater-than-expected loads.

      (CS fans will note that TSP was an unrefined example to have pulled out, given the whole P/NP thing, but that's the point -- sticky situations can and will arise for which no amount of source-level optimization will save the day.)
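
      The arithmetic in this example can be checked with a few lines of C. This is a sketch under the comment's own assumption that every extra data point doubles the work; the function names are invented here, the 8-hour "overnight" window is my guess at what the 32x figure implies, and perfect scaling across machines is assumed.

```c
#include <stdint.h>

/* Run time in hours after adding `extra` data points,
   if each added point doubles the amount of computation. */
uint64_t hours_after_growth(uint64_t base_hours, unsigned extra) {
    return base_hours << extra;   /* base_hours * 2^extra */
}

/* Factor by which the machine count must grow to finish `hours` of work
   within `window_hours`, given machines `speedup` times faster.
   Rounds up; assumes the work splits perfectly across machines. */
uint64_t machine_factor(uint64_t hours, uint64_t speedup,
                        uint64_t window_hours) {
    uint64_t capacity = speedup * window_hours;
    return (hours + capacity - 1) / capacity;
}
```

      hours_after_growth(1, 10) gives 1024 hours (a bit under 43 days), machine_factor(1024, 4, 1) gives 256, and machine_factor(1024, 4, 8) gives 32, which matches the figures above.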

      --
      Somebody get that guy an ambulance!
    25. Re:Funny thing about performance by PhotoBoy · · Score: 1

      I think one of the bigger performance killers these days is unfair deadlines.

      Management types these days expect code to be finished yesterday, and IMHO it's this which causes inefficient code to be written, as the programmer gets pressured to be finished ASAP, leaving no time for them to properly architect their system.

      I always try to get programmers to be honest about how long they think something will take them to do, but I often find new recruits, who have previously had deadlines dictated to them, are nervous about telling me how long something will take because they think they might get fired if they take too long.

      Optimising code can be worthwhile if you are given the time, I'm certain that these days many developers are being squeezed to the point where just finishing a project within the time limit is hard.

    26. Re:Funny thing about performance by Vengie · · Score: 3, Insightful

      You're ignoring constants. Constants can sometimes be large. That is why Strassen's matrix multiply method takes longer than the naive method on small matrices.

      Scarily, you have just enough knowledge to sound like you know what you're talking about. Sometimes it DOES matter how much hardware you throw at the problem, lest you forget the specialized hardware DESIGNED to crack DES.

      How about I replace all the carry-lookahead adders in your next computer with ripple-carry adders? Please look up those terms if you don't know them. I'm sure you'd be unpleasantly surprised.

      --
      When in doubt, parenthesize. At the very least it will let some poor schmuck bounce on the % key in vi. (Larry Wall)
    27. Re:Funny thing about performance by kubrick · · Score: 1

      I think most people have an area within which they know more and have some strong aesthetic choices, but it's difficult or impossible to be that picky in almost every area. As I work with software, I'm picky about that. However, I can't, e.g. really tell the difference between a $15 wine and a $30 wine ($Australian, I'm not sure what the equivalent amounts would be in the US market.) The larger companies tend to cater for the mass market, not the fussy types... which can be annoying if those compromises rub you up the wrong way (try feeding a wine buff a glass of the most popular $8/bottle wine, for example). Still, you can't argue with success -- but popularity rarely means quality, unless we define quality as 'near enough is good enough'.

      --
      deus does not exist but if he does
    28. Re:Funny thing about performance by p3d0 · · Score: 1
      It's best to "think big and code small". That is, make your code's internal interfaces flexible enough to accommodate any implementations you can brainstorm; then, with the interfaces in place, choose the most straightforward implementations that are most likely to be correct. Once it's working, if the program is too slow, then your brilliant interface design should allow you to re-implement the slow parts with ease.

      After the 1.0 release, you can continue to tune for performance, and even re-architect some of the interfaces if your designs were too shortsighted to allow for the improvements you want to make.

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    29. Re:Funny thing about performance by p3d0 · · Score: 1

      Bertrand Meyer has argued that the increasing performance of hardware makes algorithm design even more important. If you choose an algorithm with exponential complexity, then its performance will increase only linearly with time as hardware follows Moore's law, while the performance of a linear algorithm will improve exponentially.
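
      One way to make that argument concrete, as a back-of-the-envelope sketch: suppose the number of operations a machine can perform within a fixed time budget doubles every period T. The symbols B_0, T, c and n_max are introduced here for illustration and are not taken from Meyer.

```latex
% B(t): operations affordable within a fixed wall-clock budget at time t
B(t) = B_0 \cdot 2^{t/T}

% Exponential algorithm, cost 2^n: largest solvable input size
2^{n} \le B(t) \;\Longrightarrow\; n_{\max}(t) = \log_2 B_0 + \frac{t}{T}

% Linear algorithm, cost c\,n: largest solvable input size
c\,n \le B(t) \;\Longrightarrow\; n_{\max}(t) = \frac{B_0}{c} \cdot 2^{t/T}
```

      Under that assumption the exponential algorithm gains one extra input element per doubling period, while the linear algorithm doubles the input size it can handle.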

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    30. Re:Funny thing about performance by elwell642 · · Score: 0

      Totally unrelated to your post (*runs from the evil Troll*), I noticed that your effortlessis.com website has a broken PHP script under Products. Just some ./ FYI =)

      --

      <insert witty linux comment here>

    31. Re:Funny thing about performance by Hast · · Score: 1

      What would have annoyed me with that example is that they didn't have the API description locally available. The entire point of having an API is to allow your coders to ignore irrelevant details like what type a certain function takes and such.

      OTOH I think most tests like these tend to miss their mark. Unless you get "out of the box" behaviour like you described and the interviewer recognises it. Unless you are hiring code monkeys there seems to be little point in having trivial coding assignments (input file, sort output file). It seems like a more complex problem without coding but instead reasoning is more relevant.

      Quite a lot harder to set up and evaluate though.

    32. Re:Funny thing about performance by Anonymous Coward · · Score: 0

      Not too shabby performance, either: O(2^n)

    33. Re:Funny thing about performance by bobaferret · · Score: 1

      I had the craziest feeling last year, when I realized that the normal bottlenecks are changing. It used to be that mem was the fastest, followed by drives, followed by the network (commodity hw). Suddenly, with the availability of gigabit network cards, the disk is the slowest thing out there, and in some cases the network can even go up against the memory. So now, my distributed apps all try to minimize disk use before they minimize net use. Seems wrong on some level, ya know.

      Just a random thought.

    34. Re:Funny thing about performance by The+Mayor · · Score: 1

      Moore's Law states that transistor density will double every 18 months. As a corollary, the cost per transistor will halve every 18 months. But Moore's Law never promised any specific performance improvements.

      Of course, smaller transistors lead to shorter circuit paths (i.e. higher speed). Furthermore, higher transistor density leads to greater power consumption, leading manufacturers to work on lower power requirements for transistors (this leads to faster switching times for transistors, and therefore higher speeds). In other words, it can be inferred from Moore's Law that speed will increase. But Moore's Law says nothing about predicting speed increases from circuits.

      --
      --Be human.
    35. Re:Funny thing about performance by An+Onerous+Coward · · Score: 1

      "Performance gains occur at the hardware level?" Not really. At least in the realm of high performance computing, the speed increases due to hardware over the decades has actually been dwarfed by improvements in algorithms.

      [I don't remember the name of the book where I learned this. It was originally an IBM whitepaper on multiprocessor systems versus beowulf cluster-style computing. It had a really cool drawing of a five-headed dog fighting five one-headed dogs. Two Karma points to anyone who remembers the title.]

      'Sides, whatever technology you're trying to ship, it won't actually exist in the minds of consumers until Microsoft "invents" it.

      --

      You want the truthiness? You can't handle the truthiness!

    36. Re:Funny thing about performance by BeerMilkshake · · Score: 1

      Good point about the RPC calls.

      If you develop a system within a high-speed environment, it is easy to take that network connection for granted.

      Recently, I worked at home with a slow connection to work, and noticed the LED on the network hub blinking -a lot-. Guess what, I found a place in the code where I was making an RPC call inside a big loop.

      Some of these things you never notice unless you test on a slow network...

    37. Re:Funny thing about performance by Valar · · Score: 2, Interesting

      Actually, when I was TAing data structures, we called that 52 card pick up sort. You take the deck of cards and throw it in the air. Pick up the cards and if they end up sorted, stop. If not, throw them in the air again. We used it as an example of "just because it works, doesn't mean you should do it" and as an example of algorithms with big-Os in the 'bad' column.

    38. Re:Funny thing about performance by Mr.+Piddle · · Score: 1

      Performance gains occur at the hardware level. Any tendency to optimize prematurely ought to be avoided, at least until after v1.0 ships.

      So, if your app did a lot of 2-D graphics...I guess it is perfectly okay to refresh the whole window and every object in it with every repaint at 60Hz?

      Remember, there are still lots of people out there with 4MB PCI graphics cards. Also, lots of Linux users have to run a non-accelerated X server for various reasons.

      Also, for non-graphical applications (web servers, databases), there are still oodles of marketing people and "journalists" who will resort to benchmark scores in declaring a "winner" and will be using only v1.0. Benchmarks are a sad state of reality (just look at mostly-baseless CPU flamewars over SPEC this and that).

      --
      Vote in November. You won't regret it.
    39. Re:Funny thing about performance by Anonymous Coward · · Score: 0

      It has a name:

      It is called the "bumble sort".

      The improved bumble sort checks to see if it is sorted first.

    40. Re:Funny thing about performance by TwistedSquare · · Score: 1
      E.g., there is no way in heck that an O(n * n) algorithm can beat an O(log(n)) algorithm for large data sets

      Yep, those fantastical O(log (n)) algorithms sure can't be beat ;-)

    41. Re:Funny thing about performance by Moraelin · · Score: 1

      Guess the wonder of one-liners is that it just leaves me wondering what you meant.

      A binary search is O(log2(n)), which is to say, searching through 1 million records takes a mere 15 steps more than searching through 32 records. So there's nothing particularly fantasy about them. Such algorithms exist.

      Other kinds of algorithms involve even fewer steps. E.g., the kinds of trees used in an English spell-checker (well, some spell checkers anyway), have 26 as a base for the logarithm. They also have other advantages over a binary search, such as being able to calculate a "distance" between two words.

      Can they be beat? Sure. A hash table is O(1) for retrieving an element, so it executes faster than a binary search for most data sets. (I.e., given enough data so that calculating the hash isn't slowing you down. For string keys this pretty much means every time.)

      But I'm guessing you knew such elementary things already, and I'm probably just boring you. So if you'd kindly clarify what you had in mind, I might even be able to give a less boring answer :)
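
      For anyone who wants to see the step counts, here is a minimal C sketch of a binary search that counts its probes, plus a helper that gives the worst case for this particular loop. The names binary_search and max_probes are just illustrative.

```c
#include <stddef.h>

/* Binary search over a sorted int array. Returns the index of key,
   or -1 if it is absent. *steps receives the number of elements probed. */
long binary_search(const int *a, size_t n, int key, unsigned *steps) {
    size_t lo = 0, hi = n;            /* search the half-open range [lo, hi) */
    *steps = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        ++*steps;
        if (a[mid] == key) return (long)mid;
        if (a[mid] < key) lo = mid + 1;   /* key can only be to the right */
        else hi = mid;                    /* key can only be to the left  */
    }
    return -1;
}

/* Worst-case probe count of the loop above for n elements:
   the number of times n can be halved before it reaches zero. */
unsigned max_probes(size_t n) {
    unsigned probes = 0;
    while (n > 0) {
        n /= 2;
        probes++;
    }
    return probes;
}
```

      max_probes(32) is 6 and max_probes(1000000) is 20, so going from 32 records to a million costs about 14 extra probes with this variant, in the same ballpark as the estimate above.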

      --
      A polar bear is a cartesian bear after a coordinate transform.
    42. Re:Funny thing about performance by s00p41337h4x0r · · Score: 2, Informative
      That's a well known algorithm called "Bogosort" in the Jargon File.

      Interesting thing about it is that it is one of the few algorithms that has an expected running time of O(n!). If you're teaching an intro algorithms class it's easy to come up with examples of O(lg n), O(n), O(n lg n), O(n^2), and O(2^n) in lecture but O(n!) is tricky. Useful as an extra credit question.
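
      For the curious, a minimal bogosort sketch in C. It uses the standard rand() for shuffling (with a slight modulo bias that does not matter for a demo), and the helper names are my own. The expected number of shuffles grows on the order of n!, so only try it on tiny arrays.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* True if a[0..n-1] is in non-decreasing order. */
static bool is_sorted(const int *a, size_t n) {
    for (size_t i = 1; i < n; i++)
        if (a[i - 1] > a[i]) return false;
    return true;
}

/* Fisher-Yates shuffle driven by rand(). */
static void shuffle(int *a, size_t n) {
    for (size_t i = n; i > 1; i--) {
        size_t j = (size_t)rand() % i;   /* pick a slot in [0, i) */
        int tmp = a[i - 1];
        a[i - 1] = a[j];
        a[j] = tmp;
    }
}

/* Shuffle until the array happens to be sorted.
   Returns how many shuffles were needed (0 if already sorted). */
unsigned long bogosort(int *a, size_t n) {
    unsigned long shuffles = 0;
    while (!is_sorted(a, n)) {
        shuffle(a, n);
        shuffles++;
    }
    return shuffles;
}
```

      Note that this version checks for sortedness before the first shuffle, so an already sorted input costs nothing.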

    43. Re:Funny thing about performance by TwistedSquare · · Score: 1

      Dammit, I ruined my own witty one-liner by mis-reading your message and malphrasing mine. So I guess it was ruined from the outset. I read your message as referring to O(log(n)) sorting algorithms (them being impossible of course), and then missed out the word sorting in my reply. So sorry bout that, I was clearly off thinking about something else at the time!

    44. Re:Funny thing about performance by MarkCollette · · Score: 1

      Hey, if it was already sorted, then you still randomize it!

    45. Re:Funny thing about performance by Moraelin · · Score: 1

      Ahh... Ok, I suppose O(n^2) can make one think of bubble sort.

      Given enough lack of clue or lack of care for performance, though, one can squeeze an O(n^2) or worse in lots of other places. E.g., it's pretty easy to end up doing an O(n^2) even for searching through a table.

      Or an example of clueless coding that's particularly dear to me, comes from some guys who basically tried to shaft me back in my freelancer days. Except they couldn't work with trees. At all.

      They had data which was a simple binary tree. Or rather the perfect textbook example for something fit to be stored in a tree. (A big rectangle which could be cut horizontally or vertically into two. Each of which could then be cut in two. And so on.) They needed to save and load it.

      But they couldn't work with trees. So they replaced the tree with an array, losing all the tree data in the process. (No, not a tree-inna-array, with indexes instead of pointers. Just a flat array with no pointers.) And saved it as such. Which upon reading the file left them with the problem of which piece depends on which one. So they used backtracking to find out which pieces have already been cut.

      The result? It took minutes to load even the smallest trees. For a piece of processing which shouldn't even have existed at all, if they only knew how to recursively traverse a tree.

      Sad but true.
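
      For illustration, this is roughly what a recursive save and load of such a cut tree can look like in C. It is a sketch, not the code from that project: the Piece struct, the 'L'/'H'/'V' tags and the function names are all invented here, cut positions are left out to keep it short, and the loader trusts its input.

```c
#include <stdlib.h>

/* A piece is either uncut ('L', a leaf) or cut horizontally ('H')
   or vertically ('V') into two child pieces. */
typedef struct Piece {
    char kind;
    struct Piece *first, *second;
} Piece;

Piece *piece_new(char kind, Piece *first, Piece *second) {
    Piece *p = malloc(sizeof *p);
    if (!p) abort();
    p->kind = kind;
    p->first = first;
    p->second = second;
    return p;
}

/* Pre-order save: write this node's tag, then both subtrees.
   Returns the position just past the last character written. */
char *piece_save(const Piece *p, char *out) {
    *out++ = p->kind;
    if (p->kind != 'L') {
        out = piece_save(p->first, out);
        out = piece_save(p->second, out);
    }
    return out;
}

/* Pre-order load: the tag just read says whether two subtrees follow,
   so the structure is rebuilt in one pass with no searching. */
Piece *piece_load(const char **in) {
    char kind = *(*in)++;
    if (kind == 'L') return piece_new('L', NULL, NULL);
    Piece *first = piece_load(in);
    Piece *second = piece_load(in);
    return piece_new(kind, first, second);
}

void piece_free(Piece *p) {
    if (!p) return;
    piece_free(p->first);
    piece_free(p->second);
    free(p);
}
```

      A rectangle cut horizontally, with its second half cut vertically, saves as the five characters HLVLL and loads back in time proportional to the number of pieces.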

      --
      A polar bear is a cartesian bear after a coordinate transform.
    46. Re:Funny thing about performance by mcrbids · · Score: 1

      Thanks. It's fixed now - but the product in question was not Contact Manager, but Report Writer, hosted on the same server.

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    47. Re:Funny thing about performance by TomDLux · · Score: 1

      Optimization should be deferred!

      What people seem confused about is what should be done in the meantime.

      The point of deferring optimization is to select data structures and algorithms which are clear, relevant to the application, scalable, and have reasonable performance potential.

      Whether you call it XP, Agile Programming, or waterfall design, you want to get some idea of performance fairly early. If things look good, and scaling tests are promising, you can defer changes till some time in the future.

      While you do need to consider the hardware that will be available by the time your software gains any significant audience, the MS greed for hardware is a significant factor in first world / third world competition. Traditional North American and European corporations, especially the large ones, assume you need to upgrade to the newest terabyte, gigahertz, super-duper system to run up-to-date software. In the meantime, smaller companies, and organizations in India or even Mali can compete just fine with obsolete hardware at a fraction of the speed, running Linux, BeOS or other non-greedy operating systems.

    48. Re:Funny thing about performance by TomDLux · · Score: 1

      I have some really horrible news for you ... doubling performance every 18 months IS exponential.

      Performance = 2^N

      where N is the number of 18 month periods that have passed

    49. Re:Funny thing about performance by Anonymous Coward · · Score: 0

      Bzzzzzzt.

      Not an algorithm. randomize(array) might spew forth unsorted output until the end of time - admittedly very unlikely, but not impossible. One criterion for an algorithm is that it must terminate itself.

  7. Method of Payment... by inf0rmer · · Score: 0

    What, you mean I don't get paid per line of code I write?

  8. the software taketh what the hardware giveth. by equex · · Score: 4, Insightful

    i remember times when 7.14mhz and 256k ram was enough to drive a multitasking windowed os. (amiga)
    ive seen glenz vectors and roto-zoomers on the commodore 64.
    modern os's, especially windows seem super-sluggish when you see what is possible on those old computers if you just care to optimize the code to the max.

    --
    Can I light a sig ?
    1. Re:the software taketh what the hardware giveth. by neil.orourke · · Score: 2, Interesting

      But the great demos on the Amiga and C64 never hit the OS.

      Have a look at some of the PC demos that boot from DOS and take over the machine (eg. www.scene.org) and tell me that they aren't just as amazing.

    2. Re:the software taketh what the hardware giveth. by cr0sh · · Score: 1
      One of the most amazing PC demos I ever saw was a 256 byte intro that ran under DOS (I forget the name of it).

      It was so well written, it would run perfectly under NT4.0 (mode switches and everything worked perfectly). As if that wasn't enough - the graphics - OMFG!

      The intro sped you through a 3D tunnel with twists and turns, with computed-on-the-fly texture map. Spinning (or you were rotating around it - been a while since I watched it) in the middle of the tunnel was this 3D warped texture "blob".

      This ran by itself - all in 256 bytes. My brief look at the code (under debug) showed a very simple program, which hit heavily at the math co-processor - lots of math ops - and interfaced with mode 13h via VESA (IIRC). Most of the bytes were for the main meat of the code - but 256 bytes! I am still amazed.

      In fact, I may look at that code again, and see if it could be compiled to run under Linux in some manner (from the console, maybe?) - since it seemed so simple...

      --
      Reason is the Path to God - Anon
    3. Re:the software taketh what the hardware giveth. by Anonymous Coward · · Score: 3, Informative
      One of the most amazing PC demos I ever saw was a 256 byte intro that ran under DOS (I forget the name of it).
      This one?
    4. Re:the software taketh what the hardware giveth. by Anonymous Coward · · Score: 3, Informative

      The program you're looking for is "tube". If you want to get seriously impressed, have a look at "lattice" instead.
      However, you can't achieve the same easily in linux since
      a) putting pixels is more than just writing to 0a0000h and
      b) elf format has actually some structure. (iirc program that merely returns 42 takes 53 bytes and uses quite obscene amount of trickery to achieve that)

      These will probably bloat the linux version to something like 512 bytes;) Oh dear.

  9. Longer code is faster code by Anonymous Coward · · Score: 1, Funny

    Unroll those loops by hand. You'll get a little bump in speed. And a little bump in your pocketbook.

    1. Re:Longer code is faster code by Anonymous Coward · · Score: 0

      I want more of Duff's device. :)

  10. If everyone paid attention in english class... by fervent_raptus · · Score: 2, Informative

    this slashdot post would read:

    I just finished reading the essay "Programming as if Performance Mattered", by James Hague. The essay covers how compiler optimization has changed over the years. If you get bored, keep reading; there's a big 'gotcha' in the middle. Hague begins: "Will performance issues haunt us forever? This essay puts performance analysis in perspective."

    1. Re:If everyone paid attention in english class... by Anonymous Coward · · Score: 0

      bzzzt... split infinitive, use of passive voice, etc.

      Recently, I finished reading James Hague's essay, "Programming as if Performance Mattered". Hague discusses compiler optimization, and how this has changed over time.

      hell... right about there I *did* get bored, so someone who's much more pedantic than I can continue.

    2. Re:If everyone paid attention in english class... by Anonymous Coward · · Score: 0

      You're fucking stupid. Morons like you don't deserve to live.

  11. I think... by rms_nz · · Score: 2, Interesting

    ...it would have been better for him to show the run times for all the versions of his program to show us what difference each of the changes had made...

  12. Make the common case fast by DakotaSandstone · · Score: 2, Interesting
    Yes, yes, yes. Do optimize. But, come on people, do we really need to turn that nice readable device init code that only executes once into something like:
    for (i=0,j=0,init();i!=initDevice(j);j++,writeToLog()) ;

    Sheesh!

    --
    Nothing is so smiple that it can't get screwed up.
    1. Re:Make the common case fast by CosmeticLobotamy · · Score: 0

      Why not? If it takes you more than a second to read that and figure out what it does, you're in the wrong business.

      Why are you using i to compare initDevice(j) to 0?

      And to preempt the likely response, yes, I am in the wrong business.

    2. Re:Make the common case fast by Anonymous Coward · · Score: 0

      Maybe I've just been programming too much but...

      for (int j=-1, init(); initDevice(++j); writeToLog());

      Also I'd be interested to know why it's writing to the log there.

    3. Re:Make the common case fast by Anonymous Coward · · Score: 0

      That's not optimization! You are merely removing whitespace. That's obfuscation. Big difference.

    4. Re:Make the common case fast by BasilBrush · · Score: 1

      You didn't quite work it out then did you? You don't know the scope of i, so for all you know i may be altered in initDevice() or writeToLog(). It'd be terribly badly written code if it was, but then what can you expect from someone cramming a for statement in such an unreadable fashion. A second is too long to spend on getting an impression of what a line does when you are skimming through code. Splitting the extra initialisation and body of the loop out from the for statement costs nothing but makes the code much more readable.
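
      As a sketch of what that split might look like: the one-liner from the original post with the setup and the loop body pulled out. The three helper functions are hypothetical stand-ins (the thread never shows them), and the assumption that initDevice() returns nonzero while there are still devices to set up is mine.

```c
enum { DEVICE_COUNT = 3 };   /* hypothetical number of devices */
static int log_lines;        /* stands in for a real log */

/* Hypothetical stand-ins for the functions named in the one-liner. */
static void init(void) { log_lines = 0; }
static int initDevice(int j) { return j < DEVICE_COUNT; }
static void writeToLog(void) { log_lines++; }

/* Same control flow as
     for (i=0,j=0,init();i!=initDevice(j);j++,writeToLog()) ;
   but with the setup before the loop and the logging in the body. */
int init_all_devices(void) {
    int j;
    init();
    for (j = 0; initDevice(j) != 0; j++) {
        writeToLog();
    }
    return j;   /* number of devices initialised */
}
```

      The i variable disappears entirely, since it was only ever compared against zero, and a compiler should generate much the same code for both forms.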

    5. Re:Make the common case fast by Anonymous Coward · · Score: 0

      What about i? It is still in scope outside the loop, and it won't necessarily be 'true' depending on what writeToLog does (exceptions, goto etc), and if writeToLog is inline or heavily optimised by the compiler.

    6. Re:Make the common case fast by CosmeticLobotamy · · Score: 1

      You didn't quite work it out then did you? You don't know the scope of i, so for all you know i may be altered in initDevice() or writeToLog().

      I assumed it was an error and not massive retardation, and there's no need to be an asshole.

      "The cat jumped over the fence," is unreadable if you're really bad at reading. Though the guy a few replies down is right. It's obfuscation, not optimization, and it really serves no purpose. Not arguing with you that it couldn't be better. I'm just saying it's not that bad.

    7. Re:Make the common case fast by Anonymous Coward · · Score: 0

      I thought the original looked more obfuscated.

    8. Re:Make the common case fast by BasilBrush · · Score: 1

      Don't complain about me being an asshole when you are trying to tell people they are in the wrong job for complaining about code being "optimized" by writing it in a non-reader friendly way. The fact is that code could be written in an easier to read way with absolutely no run time penalty. It therefore *should* be written in an easier to understand way. Macho statements about how quickly *you* can understand it and "you must be in the wrong job" are not helpful. It's an attitude that doesn't make for a good programmer.

    9. Re:Make the common case fast by Anonymous Coward · · Score: 0

      I just got done saying it wasn't optimized and that it should be written better, and you're still being unnecessarily asshole-ish.

      I wasn't addressing anyone specifically about being in the wrong job. I'm sure the original poster understood the line, and I got his point. My point was that his overly complicated line wasn't overly complicated enough to warrant complaint. Not 'cause I've got macho-ass code-reading skills, but because it's five-ish (I'm not looking at it right now) statements separated by commas in a for loop. If you take more than a second to read that particular one, you're probably still in school. Which, I concede, does not mean you're in the wrong business. I could have worded that better, and I apologize. If you can't read it quickly, then at the very least you need to go practice so you can be like the cool, macho, 99% of programmers that can. If I'm mistaken that most people who use the language frequently wouldn't have a problem with it, I'm sorry, but I don't think I am.

    10. Re:Make the common case fast by zhenlin · · Score: 1
      I don't think that's optimised... Just obfuscated. (You could say, optimised for small source code size)

      A real example? Well, I can't really think of any that don't involve major alterations to the algorithm.

      Then again...
      unsigned int f(unsigned int n)
      {
      unsigned int a = 1, b = 1, c = 1, i;
      for (i = 3; i <= n; i++)
      {
      c = a + b;
      a = b;
      b = c;
      }
      return b;
      }
      Eliminate c and i;
      unsigned int f(unsigned int n)
      {
      unsigned int a = 1, b = 0;
      while (n--)
      {
      a += b;
      b ^= a ^= b ^= a;
      }
      return b;
      }
      Can always cheat though... change the algorithm:
      unsigned int f(unsigned int n)
      {
      static double q = sqrt(5.0);
      static double P = log((1.0 + q)) / 2.0);
      static double p = log((q - 1.0)) / 2.0);
      if (n % 2)
      return (exp(P * n) + exp(p * n)) / q;
      else
      return (exp(P * n) - exp(p * n)) / q;
      }
      Eliminate the branch:
      unsigned int f(unsigned int n)
      {
      static double q = sqrt(5.0);
      static double P = log((1.0 + q) / 2.0);
      static double p = log((q - 1.0) / 2.0);
      /* (1.0 - 2.0 * (n % 2)) is +1 for even n and -1 for odd n */
      return (exp(P * n) - (1.0 - 2.0 * (n % 2)) * exp(p * n)) / q;
      }
    11. Re:Make the common case fast by BasilBrush · · Score: 1
      You are still making the mistake of thinking that people that ask for more readable code are doing it because they don't understand it, or on behalf of those that don't. Actually the people that champion writing code in a more readable fashion are usually the most experienced programmers in an organisation.

      Clearly you aren't going to get this now because you've got your hackles up, but consider that the line in question could have made a first step to improved readability by simply adding a space after the first two semicolons. It wouldn't make a damn bit of difference to those that don't understand the syntax, but for those that do, it would be more readable.

    12. Re:Make the common case fast by dtfinch · · Score: 1

      That's quite interesting. Didn't know there was a formula. After fixing the few typos, I verified it for all inputs up to 47, after which the integer overflows. So you typed all that from memory?

      If you really wanted to do it fast, you could try:
      #define f(i) _f[i]

      const unsigned int _f[]={0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233,
      377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025,
      121393, 196418, 317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887,
      9227465, 14930352, 24157817, 39088169, 63245986, 102334155, 165580141,
      267914296, 433494437, 701408733, 1134903170, 1836311903, 2971215073U};

      Same output for all supported inputs.

      I bet someone, somewhere is using the recursive version exactly as defined in many math textbooks:
      int f(int n) {
      if(n<3) return 1;
      else return f(n-1)+f(n-2);
      }
      So that the time taken to calculate the result is the result.
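
      For contrast, a minimal sketch of that same recursion with a cache, so each value is computed only once. The name fib_memo and the 1..47 input range (the range the table above covers for 32-bit results) are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the textbook recursion plus a cache of earlier results.
 * F(47) = 2971215073 is the largest Fibonacci number that fits in 32 bits. */
uint32_t fib_memo(unsigned n)
{
    static uint32_t cache[48];   /* zero-initialised; 0 means "not computed yet" */

    assert(n >= 1 && n <= 47);
    if (n < 3)
        return 1;
    if (cache[n] == 0)
        cache[n] = fib_memo(n - 1) + fib_memo(n - 2);
    return cache[n];
}
```

      Each entry is filled at most once, so the work grows linearly with n instead of with the result.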

  13. You don't optimize, that's the job of the compiler by Anonymous Coward · · Score: 2, Insightful

    If you write clear and simple code the compiler or interpreter does all the other work. It will automatically remove unused code and simplify complex segments. So long as your code is not unnecessarily convoluted often the machine optimizations are better than the human brain optimizations. It's like register allocation. You don't do that by hand. That's just crazy! Some poor fools 20 years ago had to do it by hand and came up with an algorithm to do it that the computer just does for you.

    That's the difference between modern languages and more archaic ones. Sure you can't get the "absolute best" most optimized optimization, but you're probably going to get a better optimization than you can think of just from the interpreter/compiler doing its job.

    The only thing that really needs optimization is streamlining data structures because the compiler can't predict what part of the data structure isn't used during runtime. You just need to make sure you use the right data structure for the job and put the basic pen-and-paper (optimized) algorithm down in plain code. No strange hacker tricks needed.

  14. Don't agree by Docrates · · Score: 4, Interesting

    While the author's point seems to be that optimization and performance are not all that important, and that you can achieve better results with how you do things and not what you use, I tend to disagree with him.

    The thing is, in real life applications, playing with a Targa file is not the same as service critical, 300 users, number crunching, data handling systems, where a small performance improvement must be multiplied by the number of users/uses, by many many hours of operation and by years in service to understand its true impact.

    Just now I'm working on an econometric model for the Panama Canal (they're trying to make it bigger and need to figure out if it's worth the effort/investment) and playing with over 300 variables and 100 parameters to simulate dozens of different scenarios can make any server beg for more cycles, and any user beg for a crystal ball.

    --

    There are two kinds of people in the world: Those with good memory.
    1. Re:Don't agree by Anonymous Coward · · Score: 0
      The thing is, in real life applications, playing with a Targa file is not the same as service critical, 300 users, number crunching, data handling systems, where a small performance improvement must be multiplied by the number of users/uses, by many many hours of operation and by years in service to understand its true impact.

      Holy run on sentence batman!

    2. Re:Don't agree by mfago · · Score: 4, Insightful

      Not what I got out of it at all, rather:

      Clear concise programs that allow the programmer to understand -- and easily modify -- what is really happening matter more than worrying about (often) irrelevant details. This is certainly influenced by the language chosen.

      e.g. I'm working on a large F77 program (ugh...) that I am certain would be much _faster_ in C++ simply because I could actually understand what the code was doing, rather than trying to trace through tens (if not hundreds) of goto statements. Not to mention actually being able to use CS concepts developed over the past 30 years...

    3. Re:Don't agree by fbform · · Score: 1

      author's point seems to be that optimization and performance are not all that important, and that you can achieve better results with how you do things and not what you use

      I have to disagree with the author too, but on a different point. What the author has done is to write some image-processing code, manually profile it (recognizing that several pixels get the value FF00FF looks like profiling to me) and then modify the algorithm (making those pixels transparent) to make it faster.

      I would say that any such change to the algorithm, made after knowing the nature of the data, is bound to make it faster than blind compiler-level or language-level optimization that's oblivious to the data.

      --
      Time flies like an arrow. Fruit flies like a banana.
    4. Re:Don't agree by avelth · · Score: 1

      Actually, his point seems to be that optimization doesn't necessarily rely on your choice of tools. He wrote (and optimized) his program in a VM environment, and his performance was pretty good.

      He even goes about it in a rational way:

      1-write the program correctly (includes functional testing)
      2-test with an eye on performance
      3-make changes
      4-goto 2

      Notice he didn't obsess about performance in 1.
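
      Step 2 can be as simple as timing the suspect routine before touching it. A rough sketch using the standard clock() function; time_per_call and sample_work are made-up names, and a real profiler will tell you far more.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef void (*work_fn)(void);

static volatile unsigned long sink;   /* keeps the compiler from deleting the loop */

/* Hypothetical workload standing in for the code under test. */
void sample_work(void)
{
    unsigned long sum = 0;
    for (unsigned long i = 0; i < 100000UL; i++)
        sum += i;
    sink = sum;
}

/* Average CPU seconds per call of fn over reps calls. */
double time_per_call(work_fn fn, int reps)
{
    assert(fn != NULL && reps > 0);
    clock_t start = clock();
    for (int i = 0; i < reps; i++)
        fn();
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / reps;
}
```

      Run it before and after each change in step 3; if the number doesn't move, the change wasn't worth making.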

    5. Re:Don't agree by bloosqr · · Score: 1

      It's possible, but Fortran has a lot of things going for it; if nothing else, our compilers still are pretty bad. I use C++ in all our numerical code w/ "C like" numerics using gsl/blas wherever possible and have a lot of simple benchmarks of parts of our code using profiling. One thing I've noticed using the Intel compilers is that the compiler has a much easier time vectorizing parts of the code whenever possible. The other irony of using C++ is it's pretty easy to "abstract" the code away, it's easier to see what the code is doing rather than what it is supposed to be doing, let alone the issues of having routines hiding in operators, destructors, constructors etc.

    6. Re:Don't agree by Anonymous Coward · · Score: 0

      I tried running the pseudocode you described, but got caught in an infinite loop.
      any suggestions?

    7. Re:Don't agree by rangek · · Score: 1
      I'm working on a large F77 program

      I feel your pain. As a computational chemist, I am often forced to work with f77 code and I hate it. But I love (well maybe not love, but compared to f77...) fortran95. Try it out. It is a superset of f77, so refactoring your code over to f95 shouldn't be a total disaster. And you get to use some of those "CS concepts developed over the past 30 years".

    8. Re:Don't agree by OoSync · · Score: 1

      e.g. I'm working on a large F77 program (ugh...)

      Are you able to start working in modern Fortran? As in Fortran 90/95. I'm working on a similar task, though I'm actually rewriting the entire code from F66 + Cray pointers to modern Fortran (and not a pointer in sight).

      Yeah, gotos can be a pain if the author uses them as control structures (such as loops and if/then/else structures), which are available in F77. If they are used by control structures to direct the flow to chunks of code, then rewriting those chunks as subroutines or functions may help your understanding.

      --

      I always get the shakes before a drop.
    9. Re:Don't agree by globalar · · Score: 1

      Exactly, the programmer needs to know what is important and focus design, development, and finalizing all around optimizing for what is important.

      Optimization begins by focusing on the design goals, not "where will our hacks save the most CPU cycles". If you don't have a goal to your optimizing that lines up with the project's direction, then reconsider. Your time may be better spent thinking a few steps ahead, rather than focusing on just any detail. After all, your time is far more valuable than billions of cycles.

    10. Re:Don't agree by Frnknstn · · Score: 1

      Step 3 breaks out of the loop when there are no more changes the programmer can make.

      --
      If it's in you sig, it's in your post.
    11. Re:Don't agree by Frobnicator · · Score: 5, Insightful
      Just now I'm working on an econometric model ... in real life applications, playing with a Targa file is not the same as service critical, 300 users, number crunching, data handling systems, where a small performance improvement must be multiplied by the number of users/uses, by many many hours of operation and by years in service to understand its true impact. ... I'm playing with over 300 variables and 100 parameters to simulate dozens of different scenarios can make any server beg for more cycles, and any user beg for a crystal ball.

      I don't think that fits into the description the article was talking about.

      The point of this article is not targeted to you. I've seen interns as recent as last year complain about the same things mentioned in the article: division is slow, floating point is slow, missed branch prediction is slow, use MMX whenever more than one float is used, etc.

      The point I get out of the article is not to bother with what is wasteful at a low level, but be concerned about the high levels. A common one I've seen lately is young programmers trying to pack all their floats into SSE2. Since that computation was not slow to begin with, they wonder why all their 'improvements' didn't speed up the code. Even the fact that they are doing a few hundred unnecessary matrix ops (each taking a few hundred CPU cycles) didn't show up on profiling. Their basic algorithms, in the few cases I'm thinking about, were either very wasteful or could have been improved by a few minor adjustments.

      The article mentions some basic techniques: choosing a different algorithm, pruning data, caching a few previously computed results, finding commonalities in data to improve the algorithm. Those are timeless techniques, which you probably have already learned since you work on such a big system. Writing your code so that you can find and easily implement high-level changes; that's generally more important than rewriting some specific block of code to run in the fewest CPU cycles.

      A very specific example. At the last place I worked, there was one eager asm coder who wrote template specializations on most of the classes in the STL for intrinsic types in pure asm. His code was high quality, and had very few bugs. He re-wrote memory management so there were almost no calls to the OS for memory. When we used his libraries, it DID result in some speed gains, and it was enough to notice on wall-clock time.

      However... Unlike his spending hundreds of hours on this low-return fruit, I could spend a day with a profiler, find one of the slower-running functions or pieces of functionality, figure out what made it slow, and make some small improvements. Usually, a little work on 'low-hanging fruit', stuff that gives a lot of result for a little bit of work, is the best place to look. For repeatedly computed values, I would sometimes cache a few results. Other times, I might see if there are a few system functions that can be made to do the same work. On math-heavy functions, there were times when I'd look for a better solution or 'accurate enough but much faster' solution using calculus. I'd never spend more than two days optimizing a bit of functionality, and I'd get better results than our 'optimize it in asm' guru.

      Yes, I would spend a little time thinking about data locality (stuff in the CPU cache vs. ram) but typically that doesn't give me the biggest bang for the buck. But I'm not inherently wasteful, either. I still invert and multiply rather than divide (it's a habit), but I know that we have automatic vectorizers and both high-level and low-level optimizers in our compilers, and an out-of-order core with AT LEAST two floating point, two integer, and one memory interface unit.
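
      The invert-and-multiply habit, as a small sketch. scale_by_reciprocal is a hypothetical name; note that x * (1.0 / d) can differ from x / d in the last bit, which is why a compiler will generally only make this substitution itself when told it may relax floating-point rules.

```c
#include <assert.h>
#include <stddef.h>

/* Divide every element of v by d using one division and n multiplications. */
void scale_by_reciprocal(double *v, size_t n, double d)
{
    assert(v != NULL && d != 0.0);
    const double inv = 1.0 / d;   /* the only division, hoisted out of the loop */
    for (size_t i = 0; i < n; i++)
        v[i] *= inv;
}
```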

      And did I mention, I write software with realtime constraints; I'm constantly fighting my co-workers over CPU quotas. I read and refer to the Intel and AMD processor documentation, but usually only to see which high-level functionality best lends itself to the hardware. I am tempted to 'go straight to the metal' occasionally, or to count the CPU cycles of everything, but I know that I can get bigger gains elsewhere. That's what the point of the article is, I believe.

      --
      //TODO: Think of witty sig statement
    12. Re:Don't agree by Mithrandir · · Score: 1

      That's the difference between static optimisation and dynamic. In the dynamic case, such as Java and .NET runtime, the VM knows exactly what your code is doing, and probably much better than you do. In that case, it optimises your code explicitly for the data it is working with Right Now(TM) rather than what the programmer thought was a fairly common case. Pretty much every study done on dynamic optimisation has shown that programmer pre-optimisations actually perform worse on these sorts of VMs than non-optimised code. The programmer-optimised version typically does stuff The Wrong Way that is slow for the given current conditions.

      --
      Life is complete only for brief intervals in between toys or projects -- John Dalton
    13. Re:Don't agree by po8 · · Score: 1

      Just now I'm working on an econometric model for the Panama Canal (they're trying to make it bigger and need to figure out if it's worth the effort/investment) and playing with over 300 variables and 100 parameters to simulate dozens of different scenarios can make any server beg for more cycles, and any user beg for a crystal ball.

      As a practicing SE with 20 years experience as well as a University prof who teaches SE and AI for a living, I suspect you're wrong. Talk to someone at your local Uni who is an expert in "combinatorial optimization". What you will find out is that (1) there are good algorithms that might help you a lot (only 100 parameters? hah :-) and that (2) because this class of problem tends to scale exponentially with effective problem size, any performance gain you get from slicing off constant factors of 3-20 or so will be swamped by even slight improvement in the search algorithm used.

      I'm not there and YMMV, of course, but check it out---it will probably be worth it.
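
      To put rough numbers on point (2), a sketch under made-up assumptions: search cost grows like b^n, and a better algorithm trims the effective branching factor b only slightly. The function name and the figures used below (b going from 2.0 to 1.9 over n = 100 decisions) are purely illustrative and not taken from the canal model.

```c
#include <assert.h>

/* Speed-up from lowering the effective branching factor of an exhaustive
 * search from b_old to b_new over n decisions: (b_old / b_new)^n. */
double branching_speedup(double b_old, double b_new, int n)
{
    assert(b_old > 0.0 && b_new > 0.0 && n >= 0);
    double ratio = 1.0;
    for (int i = 0; i < n; i++)
        ratio *= b_old / b_new;
    return ratio;
}
```

      branching_speedup(2.0, 1.9, 100) comes out well over 100, which already dwarfs a constant-factor gain of 20.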

    14. Re:Don't agree by MechaStreisand · · Score: 1

      Ie... never.

      --
      Disclaimer: IANAL. This post is, however, legal advice, and creates an attorney-client relationship.
    15. Re:Don't agree by alex_tibbles · · Score: 2, Insightful

      Exactly. The point of the article (as someone else pointed out) is that clear, high-level code is easy to optimize, since it is easy to understand, and thus it's easy to reason about the code.
      It doesn't sound like any low level work is going to get you anywhere in your simulation. The best bet is to buy lots of hardware to brute-force it...
      ... or get smart! Reason about the problem. Is it important to evaluate the function for all possible combinations of all possible values of all the variables and parameters? Is there any hidden constraint or relationship between those variables? The economic model which provides those variables may well have made a distinction between two variables, or an assumption of the independence of two variables, which is not relevant to your modelling.
      Or might statistics/ heuristics help? Picking the most likely region(s) (based on other theory) for computation, calculating that (those) first, and then working out into (assuming probability is smooth) less likely region(s).
      As the article points out, these kinds of optimizations (very similar conceptually to those in the article) are those easiest to do in a very high-level language.

    16. Re:Don't agree by Xugumad · · Score: 1

      I'd just like to second this. I'm doing large amounts of data crunching (30Gb of raw data, being processed in several passes to generate summary statistics) for a paper I'm writing. I started writing a parser in C, and it was difficult. Someone suggested, as the data was text, that Perl would be a good language, so I wrote a parser in Perl. And lo, for it was easy, and the code made sense, and the performance really, really sucked.

      So while I waited for the Perl to finish processing, I went back and finished my C parser. The code was ugly (something about a hand-coded finite state automaton for parsing), and performance still kinda sucked, but I managed to finish the code and do the parsing in less time than it took the Perl code to complete.

      In particular, all these benchmarks comparing C, Java, and Perl, etc. compare them doing similar tasks. Except, it's the features that Java and Perl don't have that makes C so fast, by which I primarily mean pointers, although having fine-grain control over data structures seems to provide a large performance boost too.

    17. Re:Don't agree by Luna-tic · · Score: 1
      The thing is, in real life applications, playing with a Targa file is not the same as service critical, 300 users, number crunching, data handling systems, where a small performance improvement must be multiplied by the number of users/uses, by many many hours of operation and by years in service to understand its true impact.

      Erlang is a functional programming language which supports concurrency, communication, distribution, fault-tolerance, automatic memory management, and on-line code updates. It was designed for soft real-time control systems which are commonly developed by the telecommunications industry. It runs in systems that demand 99.999% uptime. Ever tried to use a phone? Most likely the software for connecting you was written (at least partly) in Erlang.

      I think that Erlang handles "service critical, 300 users" systems with no problems at all. Watch that C/C++ server go down when those numbers rise to 3,000, or 30,000.

      Just my two cents...

    18. Re:Don't agree by tedgyz · · Score: 1

      While the author's point seems to be that optimization and performance are not all that important

      I think you have missed the point of the article. He is saying that optimization and performance still matter, but it is being done by the programmer at a higher level.

      As a Java webapp developer, I know exactly what he means. The optimization process is safer. You don't have to worry about twiddling the wrong bits and blowing up your program. However, he neglects to say that optimization at the higher level can still be dangerous. For example a caching algorithm could introduce out-of-sync errors.

      I learned early on that before you tackle a performance problem, you must first measure. I am fond of tools like JProbe for making this task easier. One of the other points made by the author is that old assumptions can be made irrelevant by improvements in hardware. Just look at the first three letters in that word.

      --
      "No matter where you go, there you are." -- Buckaroo Banzai
    19. Re:Don't agree by beachy · · Score: 1

      Yeah, F77 really does suck. (I think its biggest problem is its lack of structs - either you end up with subroutines that have a billion arguments or you pass around everything in common blocks, which are essentially global variables.) F90/95 may be a solution, but at this point my hatred of anything fortran is so strong (irrational, probably) that I'm not sure I'm willing to look into it.

      The article linked to and mfago correctly point out that being able to understand and easily modify a program are the keys to getting performance that can be gained by algorithmic hacks. If you could program and modify C++/F95 as quickly as Erlang/Python then you would have a truly fast program, but it ain't gonna happen.

      I sure know which of the two groups of languages I prefer to program in, though...

  15. The Longhorn developers... by ErichTheWebGuy · · Score: 4, Funny


    should really read that essay! Maybe then we wouldn't need dual-core 4-6 GHz CPUs and 2GB ram to run their new OS.

    --
    bash: rtfm: command not found
    1. Re:The Longhorn developers... by Linsaran · · Score: 0

      But won't someone please think of the little codelings! All that legacy code they'd have to delete to make Longhorn run efficiently, it'd be Codeocide, we'd have millions of lines of little codelings whose legacy ancestors were deleted 'cause they're no longer needed. Won't someone please think of all the poor little codelings! Without their elders around to tell them about how they had to climb uphill both ways up the system bus to query Interrupt 9, we'll have 1337 h4xor gangs of code all over the place, selling their spam, and spreading their viri all over the place.

      --
      In a bit of shameless internet panhandling, I accept Litecoin Donations at Lbd2oH9QsthD1GfuUXPyka12YxvWJYnBVf
    2. Re:The Longhorn developers... by NegativeK · · Score: 1

      should really read that essay! Maybe then we wouldn't need dual-core 4-6 GHz CPUs and 2GB ram to run their new OS.

      Good lord, that article was _written_ for the general /. non-RTFA-readership.

      I hate to spoil the article for people (hint, hint,) but Longhorn in Erlang would be scary at best. At worst.. Well, let's just say that we'd have to wait for the Earth Simulator to come out before running it.

      Or a computer that can handle Doom 3.

      --
      This statement is false.
    3. Re:The Longhorn developers... by TrancePhreak · · Score: 1

      I guess that's why it currently runs just fine on slightly above average hardware right now.... AND IT'S IN DEBUG MODE.

      --

      -]Phreak Out[-
    4. Re:The Longhorn developers... by Anonymous Coward · · Score: 0

      YOU INSENSITIVE CLOD!!!! It was a joke. Funny. Haha. LOL. Laugh.... Sheesh, some people!!!

    5. Re:The Longhorn developers... by Anonymous Coward · · Score: 0

      Why, so they can get their targa image readers to work a little bit better? Good plan!

    6. Re:The Longhorn developers... by julesh · · Score: 2, Informative

      Maybe then we wouldn't need dual-core 4-6 GHz CPUs and 2GB ram to run their new OS.

      The reason they're targeting this kind of system is because the hardware will probably be cheaper than Windows itself by the time Longhorn comes out.

      I'm sure they'll let you switch off the flash features that need it, though. All recent versions of Windows have been able to degrade to roughly the same performance standard as the previous version if you choose the right options.

  16. Premature Optimization by Godeke · · Score: 4, Insightful

    One of the concepts touched upon is the idea that optimization is only needed after profiling. Having spent the last few years building a system that sees quite a bit of activity, I have to say that we have only had to optimize three times over the course of the project.

    The first was to get a SQL query to run faster: a simple matter of creating a view and supporting indexes.

    The second was also SQL related, but on a different level: the code was making many small queries to the same data structures. Simply pulling the relevant subset into a hash table and accessing it from there fixed that one.

    The most recent one was more complex: it was similar to the second SQL problem (lots of high overhead small queries) but with a more complex structure. Built an object to cache the data in with a set of hashes and "emulated" the MoveNext, EOF() ADO style access the code expected.

    We have also had minor performance issues with XML documents we throw around, may have to fix that in the future.

    Point? None of this is "low level optimization": it is simply reviewing the performance data we collect on the production system to determine where we spend the most time and making high level structural changes. In the case of SQL vs a hash based cache, we got a 10 fold speed increase simply by not heading back to the DB so often.
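
    A minimal sketch of the hash-cache idea in C, assuming integer keys and a small working set. Everything here (row_cache, cached_lookup, the stub query_backend that stands in for a database round trip) is hypothetical, and it fills the cache lazily on first use rather than pulling the whole subset up front.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 64   /* power of two, larger than the expected working set */

struct row_cache {
    int keys[CACHE_SLOTS];
    int values[CACHE_SLOTS];
    unsigned char used[CACHE_SLOTS];
    size_t count;
    unsigned long backend_hits;   /* how often we had to go to the "database" */
};

/* Stand-in for a real query: counts the round trip and makes up a value. */
static int query_backend(struct row_cache *c, int key)
{
    c->backend_hits++;
    return key * 10;
}

int cached_lookup(struct row_cache *c, int key)
{
    /* Multiplicative hash, masked down to a slot index. */
    size_t i = ((unsigned)key * 2654435761u) & (CACHE_SLOTS - 1);

    /* Linear probing: stop at the key or at the first empty slot. */
    while (c->used[i] && c->keys[i] != key)
        i = (i + 1) & (CACHE_SLOTS - 1);

    if (!c->used[i]) {
        assert(c->count < CACHE_SLOTS - 1);   /* keep a slot free so probing ends */
        c->keys[i] = key;
        c->values[i] = query_backend(c, key);
        c->used[i] = 1;
        c->count++;
    }
    return c->values[i];
}
```

    Start from a zeroed struct row_cache; repeated lookups of the same key then cost one backend call in total. A real version also needs invalidation when the underlying rows change.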

    Irony? There are plenty of other places where similar caches could be built, but you won't see me rushing out to do so. For the most part performance has held up in the face of thousands of users *without* resorting to even rudimentary optimization. Modern hardware is scary fast for business applications.

    --
    Sig under construction since 1998.
    1. Re:Premature Optimization by prockcore · · Score: 1

      The first was to get a SQL query to run faster: a simple matter of creating a view and supporting indexes.

      Every database programmer out there is cringing right now. Throwing indexes at a poorly designed table is not going to solve your problem. You'll find that as your table grows, your insert time is going to start bogging down the table heavily.

      Eventually you'll find that a simple insert locks up the table for several seconds, and the requests will start to pile up.

      By all means, don't spend days optimizing useless things, but spending a few hours planning good table structure will save you a lot of headaches a few years down the line.

    2. Re:Premature Optimization by G-funk · · Score: 1

      The first was to get a SQL query to run faster: a simple matter of creating a view and supporting indexes.

      Ah, views.... I miss real databases.... Stupid cheap bastards and their MySQL....

      --
      Send lawyers, guns, and money!
    3. Re:Premature Optimization by davebarz · · Score: 1

      Yeah, but you're assuming he knows nothing about what operations will be necessary in the systems. There are plenty of systems in which inserts only happen extremely rarely but reads occur very frequently, in which case the insert penalty won't matter.

      In ninety-nine percent of the apps a programmer builds, he knows something about the data that means he can design around the eventual usage of the app, rather than assuming random usage.

    4. Re:Premature Optimization by kuhneng · · Score: 1

      Um, PostgreSQL anyone?

    5. Re:Premature Optimization by Anonymous Coward · · Score: 1, Funny

      Nooooo...... must..... use.......MysqL!!!!!!!!!!!!!!!11111111
      TIS SO UBAR!1111
      OMG

      begin;
      *insert 200k rows*
      rollback;
      connection to database lost?
      hm..

    6. Re:Premature Optimization by G-funk · · Score: 1

      Can't, postgres' windows version isn't stable enough, and we need com objects to interface with proprietary prism data :'(

      --
      Send lawyers, guns, and money!
    7. Re:Premature Optimization by Godeke · · Score: 1

      Actually, we took into consideration the balance between indexing costs during inserts and the read costs. We have indexed very little except primary keys in our database, which is exactly why we had the opportunity to introduce a covering index for this operation when it was determined that this *particular* operation was going to be the most heavily read of the system.

      If we had more inbound transactions, I would have broken this into a materialized view which would have been background updated on another server, but our inbound transaction count is actually fairly low compared to the amount of reads these tables get.

      You are making a pretty wide assumption that I have no idea how to design a database: these tables are not denormalized: the reason we needed to create a view and indexes was in fact to overcome the many normalized tables that needed to be unified to produce the data required. On the other hand, if you are one of those who thinks you should build large systems *on* denormalized tables, may I refer you to www.dbdebunk.com.

      --
      Sig under construction since 1998.
    8. Re:Premature Optimization by Anonymous Coward · · Score: 0

      Yup. Over and over I've run monitors that reported that the REAL bottleneck was nowhere near where everyone "thought" it would be, so they wasted their time optimizing the wrong part of the system.

      Almost always, selecting an algorithm more optimal for the actual workload would blow the doors off code-level optimizations.

      Once I even had to deal with a MAINFRAME being brought down on a daily basis just because the indexes on the CICS files were grossly sub-optimal. A minor config tweak and the problem vanished forever. No code mods required.

    9. Re:Premature Optimization by Darth_Burrito · · Score: 1

      Not that I trust anything to make any release (for any product), and not that it does any good for you now, but I think stored procedures and views are due in some form in version 5.

  17. Performance, shmerformance by jargoone · · Score: 1

    The First Rule of Program Optimization: Don't do it.
    The Second Rule of Program Optimization -- For experts only: Don't do it yet.
    -- Michael Jackson (not the molestor one)

  18. Performance is relative by jesup · · Score: 4, Interesting

    66 fps on a 3 GHz machine, doing a 600x600 simple RLE decode...

    Ok, it's not bad for a language like Erlang, but it's not exactly fast.

    The big point here for the author is "it's fast enough". Lots of micro- (and macro-) optimizations are done when it turns out they aren't needed. And writing in a high level language you're comfortable in is important, if it'll do the job. This is a good point.

    On the other hand, even a fairly naive implementation in something like C or C++ (and perhaps Java) would probably have achieved the goal without having to make 5 optimization passes (and noticeable time examining behavior).
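
    For a sense of how little code the naive version takes, here is a sketch of decoding Targa-style run-length packets in C, assuming the usual packet layout: a header byte whose top bit marks a run and whose low seven bits hold the pixel count minus one. The function name, the bounds handling, and the choice to return 0 on bad input are assumptions; it ignores the file header, color maps, and pixel ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode RLE packets from src into dst until max_pixels pixels are written.
 * bpp is bytes per pixel. Returns the number of pixels written, or 0 if the
 * input is truncated or a packet would overrun the image. */
size_t tga_rle_decode(const uint8_t *src, size_t src_len,
                      uint8_t *dst, size_t max_pixels, size_t bpp)
{
    size_t in = 0, out = 0;

    assert(src != NULL && dst != NULL && bpp >= 1 && bpp <= 4);
    while (out < max_pixels) {
        if (in >= src_len)
            return 0;                                /* ran out of input */
        uint8_t header = src[in++];
        size_t count = (size_t)(header & 0x7F) + 1;  /* 1..128 pixels */

        if (count > max_pixels - out)
            return 0;                                /* packet overruns the image */

        if (header & 0x80) {                         /* run: one pixel, repeated */
            if (src_len - in < bpp)
                return 0;
            for (size_t i = 0; i < count; i++)
                memcpy(dst + (out + i) * bpp, src + in, bpp);
            in += bpp;
        } else {                                     /* raw: count literal pixels */
            if (src_len - in < count * bpp)
                return 0;
            memcpy(dst + out * bpp, src + in, count * bpp);
            in += count * bpp;
        }
        out += count;
    }
    return out;
}
```

    No tricks, one pass over the input, and the inner work is memcpy; whether that beats the 66 fps figure on a given machine is something to measure, not assume.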

    And even today, optimizations often do matter. I'm working on code that does pretty hard-real-time processing on multiple threads and keeps them synchronized while communicating with the outside world. A mis-chosen image filter or copy algorithm can seriously trash the rest of the system (not overlapping DMA's, inconvenient ordering of operations, etc). The biggest trick is knowing _where_ they will matter, and generally writing not-horrible-performance (but very readable) code as a matter of course as a starting point.

    Disclaimer: I was a hard-core ASM & C programmer who for years beta-tested 680x0 compilers by critiquing their optimizers.

    1. Re:Performance is relative by Jeffrey+Baker · · Score: 1

      I agree with your view here. The author claims "hey I made these great speedups with only a few passes at the high level" but a Targa decoder in C on a 3,000,000,000Hz P4 would probably have run in 10 milliseconds, even with the obvious implementation and no manual optimizations. I bet you could get it under 1 millisecond by using the SSE2/SSE/MMX/3dNow/Altivec or whatever vector unit was at hand.

    2. Re:Performance is relative by Lazy+Jones · · Score: 1
      The big point here for the author is "it's fast enough".

      Fast enough for what? For an amateur programmer coding for himself, in the language he likes best. I certainly wouldn't buy a code library written by that guy ;-)

      (disclaimer: I'm a "1985 cycle counting programmer")

      --
      "I love my job, but I hate talking to people like you" (Freddie Mercury)
    3. Re:Performance is relative by Wocko · · Score: 2

      Jesus, it's a 4-digit UID post-off!

    4. Re:Performance is relative by Anonymous Coward · · Score: 0

      Please followup with a URL where you have a robust, bug-free targa decoder, written in C and assembly, that achieves either of your above goals.

    5. Re:Performance is relative by turgid · · Score: 1
      (disclaimer: I'm a "1985 cycle counting programmer")

      Then you might appreciate Math Toolkit for Real-Time Programming by Crenshaw. I'm no expert, but I bought it and it's an entertaining, if not informative, read.

    6. Re:Performance is relative by Anonymous Coward · · Score: 1, Interesting

      It's sad that the gems of truth about speed and optimisation are buried in the Erlang flag waving.

      Optimising the algorithm is certainly the best way to speed it up but his apparent belief that C programmers would have trouble using the same optimisations is truly bizarre. In truth C programmers probably wouldn't bother because they'd have stopped earlier, doing less work for a similar final speed, reaping the benefit of making it 'just fast enough'. I know I made little effort to optimise image readers even on ancient 16MHz PCs; the naive, readable C++ was fast enough.

      If it had anything useful to say about micro-optimisation (and when not to bother) it would be useful, but that's not really mentioned in the article, just an invention in the slashdot preamble.

    7. Re:Performance is relative by Anonymous Coward · · Score: 0

      His argument for fast enough was that running it on a fast computer gave better time than running it on a slow computer. Duh.

      He never even compared his Erlang implementation to any other language implementation. He just declared that his was fast enough. Maybe a C++ version can do it in 1/10 the time.

      I don't think the article author knows what he is talking about or he wants to sell Erlang so much that he would compromise his academic integrity.

    8. Re:Performance is relative by Anonymous Coward · · Score: 0

      I agree. The article is pretty much a waste because the author can't even evaluate his ideas.

    9. Re:Performance is relative by Anonymous Coward · · Score: 0

      Performance has (at least) 2 dimensions: the speed of your implementation and the size of your problem. The author of the article had a trivially small problem to solve and a very loose definition of what was acceptable. If he had to decode 1000 targas in a 60th of a second, his solution would have been laughable.

      I'm all for using the most appropriate tool for the job, but if you don't take an initial guess at your performance goals and requirements before you start then you're going to find yourself rewriting your code a lot and in many environments that is too expensive. Even in a proof-of-concept prototype, if performance is a factor then you will have to account for it in choosing how to build your prototype.

      That said, I think the author makes a good point that most people don't have a good grasp of what a billion really is... a billion cycles, a billion dollars, etc. It is such a huge number that you can't trust your intuition when dealing with it. If you are going to think about how much you can do with that much computation, you'd better not go with your gut feel. If you're thinking about it before you've written code, then write down numbers and do math. If you are thinking about it after you've written code, then break out the profiler and measure. Do not operate based on your intuition.
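
      Writing the numbers down for the figures quoted in this thread (66 fps, 600x600 pixels, a 3 GHz clock) takes a few lines. A rough sketch in Python; the function name is made up for illustration, and it assumes a single core and ignores memory stalls entirely:

```python
def cycles_per_pixel(clock_hz: float, frames_per_second: float,
                     width: int, height: int) -> float:
    """Clock cycles available per pixel at a given frame rate."""
    return clock_hz / (frames_per_second * width * height)


# The figures from the thread: 66 fps, 600x600 image, 3 GHz machine.
measured = cycles_per_pixel(3e9, 66, 600, 600)
print(f"{measured:.1f} cycles per pixel")  # prints "126.3 cycles per pixel"

# The hypothetical above: 1000 such images every 1/60th of a second.
demanding = cycles_per_pixel(3e9, 60 * 1000, 600, 600)
print(f"{demanding:.2f} cycles per pixel")  # prints "0.14 cycles per pixel"
```

      Roughly 126 cycles per pixel is what the 66 fps result works out to, while the 1000-images case leaves well under one cycle per pixel, which no single-core implementation in any language would meet.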

    10. Re:Performance is relative by JamieF · · Score: 1

      >even a fairly naive implementation in something like C or C++ (and perhaps Java) would probably have achieved the goal without having to make 5 optimization passes (and noticeable time examining behavior).

      How long would that have taken to write? More importantly, since it was the author and not you writing it, how long would it have taken him to write in C++?

      The performance of the program is not the only thing that matters. The performance of the development team (of 1, in this case) matters too. This is the reason for the canonical "then why don't you just write everything in assembly" argument. I'm surprised at how many people in this thread don't see this, and are happy to replace that with "everything should be written in C++", as though all possible requirements are best addressed with C++ (and a tad of hand tuned assembly, of course!).

      The right way to evaluate this is to include performance requirements with your functional specs. Once you have that, then the "right" architecture / language / process / team members etc. are the ones that let you satisfy the *complete* set of requirements (including features, maximum bug counts, and performance goals) in the shortest amount of time. This also addresses the argument in the other thread, regarding "but a slow 1.0 release is better than none." Sure, you could use a low-level programming language to squeeze out some more performance, but that's only important if the current performance isn't acceptable.

    11. Re:Performance is relative by Anonymous Coward · · Score: 0

      The disclaimer goes at the beginning, idiot

    12. Re:Performance is relative by jesup · · Score: 1

      Your point is well taken - he may not be a C/C++/java/whatever programmer, and certainly he's most comfortable in Erlang - and as I mentioned, that's important to his point.

      However, my point was that if a certain performance level was important, doing the initial implementation in a language (any of several) that was appropriate might have obviated the need to do multiple optimization passes, analysis, etc. I.e. a simpler algorithm implemented in a "faster" language might have run as fast as his 5-phase-optimized Erlang algorithm.

      And yes, the performance of the development team matters a lot. There are all sorts of parameters that go into deciding how to solve a problem, obviously: team skills, tools, how often will this be run, what the requirements are, expected hardware, lifetime, maintenance (a simple algorithm in a "faster" language will have an advantage here over a more complex/heavily-optimized one in a slower language, for example), etc, etc.

      In his "test" case, apparently on the order of 60+ fps was "good enough", though he was trying to illustrate what seemed general optimization methods - looking for work that can be factored out; work that can be collapsed; work that wasn't really needed; and ways to do the work faster (better O() algorithms) before micro-optimizing. And if you do optimize, make sure it's actually a part that's contributing to it being slow (profile, or analyze).
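
      As an illustration of the "profile, or analyze" step, here is a minimal sketch using Python's standard cProfile and pstats modules. The workload functions and the profile_call helper are invented for the example and have nothing to do with the article's decoder:

```python
import cProfile
import io
import pstats


def square(x: int) -> int:
    return x * x


def sum_of_squares(n: int) -> int:
    # Deliberately naive: one function call per element.
    total = 0
    for i in range(n):
        total += square(i)
    return total


def profile_call(func, *args):
    """Run func under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries only
    return result, stream.getvalue()


result, report = profile_call(sum_of_squares, 10_000)
print(report)
```

      The report lists call counts and cumulative time per function, which is the evidence you want before deciding that any particular piece is worth rewriting.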

    13. Re:Performance is relative by Tim+Browse · · Score: 1

      You kids! Get a haircut!

      I know your Dad!

    14. Re:Performance is relative by Brandybuck · · Score: 1

      The biggest disappointment for me with the article was that it's a C/C++ bashing in disguise. The article's conclusion? "If we did [believe clarity and correctness matter], then 99% of all programs would be written in something like Python. Or Erlang."

      After plowing through some interesting bits on optimization, suddenly he comes out with the thesis that optimization doesn't matter and that we should use high level languages. I speak English. No matter how clear and concise the Greek is spoken, I won't understand it. Likewise, as a C++ programmer, no matter how clear and concise the Erlang is written, I won't understand it.

      If the author wasn't so insistent on setting the stage for an attack on C/C++, he might have gotten around to seeing how fast the same program would have been in C or C++.

      --
      Don't blame me, I didn't vote for either of them!
    15. Re:Performance is relative by Brandybuck · · Score: 1

      The performance of the program is not the only thing that matters. The performance of the development team (of 1, in this case) matters too.

      Let's say I need two people on the development team. What are my odds of finding another C++ developer versus finding another Erlang developer? And what about the poor schmuck that has to maintain the program next year after I move on to better things. Is he going to know C++ or Erlang?

      From the forty developers at my work, all know C. All but one knows C++. About a dozen know Perl. Two know Python. One knows Ruby. NONE know Erlang. What language should we choose for our new project?

      --
      Don't blame me, I didn't vote for either of them!
    16. Re:Performance is relative by JamieF · · Score: 1

      Hmm, you've just elaborated on the point I was making, but you phrased it as though you're arguing with me...

      so, um, yeah, like I said: the developer resources that are available (Erlang programmers, C++ programmers, etc.) should be factored into the decision of which language to use for a given project.

      >What language should we choose for our new project?

      That depends on the project, of course.

  19. Writing by jetfuel · · Score: 1

    "Are particular performance problems perennial?"
    ...
    "point of view, to put performance into proper perspective."
    Brought to you by the "Alliteration is the Only Literary Device I Ever Learned" School of Writing.

  20. He forgot to mention stability. by Phidoux · · Score: 1

    The golden rule of programming has always been that clarity and correctness matter much more than the utmost speed.

    In the "real world", not only are correctness and clarity more important than speed, but so is stability.

    1. Re:He forgot to mention stability. by Anonymous Coward · · Score: 0

      umm, how exactly do you think stability is not part of correctness?

    2. Re:He forgot to mention stability. by Phidoux · · Score: 1

      Easy... Would you consider a well and correctly written VB app to be stable?

  21. Depends on your target by KalvinB · · Score: 4, Insightful

    When working on a heavily math-based application, speed is necessary to the point that the user is not expected to wait a significant amount of time without something happening. I have a large background in game programming working on crap systems and it comes in handy. My tolerance for delays goes to about half a second for a complete operation. It doesn't matter how many steps are needed to perform the operation, it just all has to be done in less than half a second on a 1200MHz system. My main test of performance is seeing how long it takes for Mathematica to spit out an answer compared to my program. Mathematica brags about being the fastest and most accurate around.

    When operations take several seconds a user gets annoyed. The program is perceived to be junk and the user begins looking for something else that can do the job faster. It doesn't matter if productivity is actually enhanced. It just matters that it's perceived to be enhanced or that the potential is there.

    You also have to consider if the time taken to complete an operation is just because of laziness. If you can easily make it faster, there's little excuse not to.

    For distributed apps you have to consider the cost of hardware. It may cost several hours of labor to optimize but it may save you the cost of a system or few.

    In the world of games half a second per operation works out to 2 frames per second which is far from acceptable. Users expect at minimum 30 frames per second. It's up to the developer to decide what's the lowest system they'll try to get that target on.
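
    The conversion between time per operation and frame rate is simple enough to keep as a helper. A small sketch in Python, with function names chosen only for this example:

```python
def fps_from_operation_time(seconds_per_frame: float) -> float:
    """Frames per second if every frame takes this long."""
    return 1.0 / seconds_per_frame


def frame_budget_ms(target_fps: float) -> float:
    """Milliseconds available per frame at the target frame rate."""
    return 1000.0 / target_fps


half_second_fps = fps_from_operation_time(0.5)  # 2.0 frames per second
budget_at_30 = frame_budget_ms(30)              # about 33.3 ms per frame
```

    So going from "half a second is tolerable" to "30 frames per second" means fitting the same work into about one fifteenth of the time.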

    You have to consider the number of users that will have that system vs the amount it will cost to optimize the code that far.

    In terms of games you also have to consider that time wasted is time possibly better spent making the graphics look better. You could have an unoptimized mesh rendering routine, or a very fast one and time left over to apply all the latest bells and whistles the graphics card has to offer.

    There are countless factors in determining when something is optimized enough. Games more so than apps. Sometimes you just need to get it out the door and say "it's good enough."

    Ben

    1. Re:Depends on your target by Anonymous Coward · · Score: 0

      I like your style of developing for slower machines.

  22. Its not just MS by Anonymous Coward · · Score: 0

    Performance is an issue for us too.

    I just want a GNU distro that runs as fast as windows 98. Debian based. And a pony.

  23. Atari... by foog · · Score: 1

    Is that the same James Hague that wrote articles for ANTIC and Analog back in the era of the Atari 8-bits?

  24. But in some cases performance counts by jbms · · Score: 2, Interesting

    As another user commented, server software can benefit greatly from a large variety of optimizations, since better performance translates directly into supporting more users on fewer/cheaper servers.

    Optimizations also have significant effect in software designed to perform complex computations, such as scheduling.

    Also, the trend of ignoring performance considerations with the claim that modern hardware makes optimizations obsolete is precisely what leads to the trend, particularly among Microsoft software, for the software to become significantly slower with each revision.

  25. Article puts it all in perspective by Debian+Troll's+Best · · Score: 4, Funny
    I'm currently completing a degree in computer science, with only a few courses left to take before I graduate. Man, I wish I had read that article before last semester's 'Systems Programming and Optimization' course! It really puts a lot of things into perspective. So much of a programmer's time can get caught up in agonizing over low-level optimization. Worse than that are the weeks spent debating language and design choices with fellow programmers in a team. More often than not, these arguments boil down to personal biases against one language or another, due to perceived 'slowness', rather than issues such as 'will this language allow better design and maintenance of the code', or 'is a little slow actually fast enough'?

    A particular illustration of this was in my last semester's 'Systems Programming and Optimization' course. The professor set us a project where we could choose an interesting subsystem of a Linux distro, analyze the code, and point out possible areas where it could be further optimized. I'm a pretty enthusiastic Debian user, so I chose to analyze the apt-get code. Our prof was very focused on low-level optimizations, so the first thing I did was to pull apart apt-get's Perl codebase and start to recode sections of it in C. At a mid-semester meeting, the professor suggested that I take it even further, and try using some SIMD/MMX calls in x86 assembly to parallelize package load calls.

    This was a big ask, but me and my partner eventually had something working after a couple of weeks of slog. By this stage, apt-get was *flying* along. The final step of the optimization was to convert the package database to a binary format, using a series of 'keys' encoded in a type of database, or 'registry'. This sped up apt-get a further 25%, as calls to a machine-readable-only binary registry are technically superior to old fashioned text files (and XML was considered too slow)

    Anyway, the sting in the tail (and I believe this is what the article highlights) was that upon submission of our project, we discovered that our professor had been admitted to hospital to have some kidney stones removed. In his place was another member of the faculty...but this time, a strong Gentoo supporter! He spent about 5 minutes reading over our hand-coded x86 assembly version of apt-get, and simply said "Nice work guys, but what I really want to see is this extended to include support for Gentoo's 'emerge' system...and for the code to run on my PowerMac 7600 Gentoo PPC box. You have a week's extension"

    Needless to say, we were both freaking out. Because we had focused so heavily on optimization, we had sacrificed a lot of genericity in the code (otherwise we could have just coded up 'emerge' support as a plug-in for 'apt-get'), and also we had tied it to Intel x86 code. In the end we were both so burnt out that I slept for 2 days straight, and ended up writing the 'emerge' port in AppleScript in about 45 minutes. I told the new prof to just run it through MacOnLinux, which needless to say, he wasn't impressed with. I think it was because he had destroyed his old Mac OS 8 partition to turn it into a Gentoo swap partition. Anyway, me and my partner both ended up getting a C- for the course.

    Let this be a lesson...read the article, and take it in. Optimization shouldn't be your sole focus. As Knuth once said, "premature optimisation is the root of all evil". Indeed Donald, indeed. Kind of ironic that Donald was the original professor in this story. I don't think he takes his work as seriously as he once did.

    1. Re:Article puts it all in perspective by jcain · · Score: 1

      Great story. It really shows the tradeoff you make when doing serious low level optimizations.

      I had a similar situation where I lost some points on a CS exam, even though I followed the instructions to the letter. After typing in the code and running it (the exam was hand-written), it worked perfectly and did exactly what it was supposed to. I went up and asked the professor why I had lost points, and he said it was because the linked list the program used was empty when my program exited. I told him that it didn't matter, since the linked list would be destroyed when the program exited either way. He still didn't give me credit.

    2. Re:Article puts it all in perspective by Xoro · · Score: 2

      Oh, come on. Who modded this up? Funny, I could see, but "Interesting"?

      The final step of the optimization was to convert the package database to a binary format, using a series of 'keys' encoded in a type of database, or 'registry'.

      It's a joke.

      --
      Kill, Tux, kill!
    3. Re:Article puts it all in perspective by kyoko21 · · Score: 1

      Off topic:

      I would have switched majors to pre-law and come back 5 years later to sue that prof's sorry butt. And then I'd use his computer to download some illegal mp3s and get RIAA to dump on his butt, too.

      Seriously though, that second prof sounded like a real nut...

    4. Re:Article puts it all in perspective by joib · · Score: 2, Informative

      WTF are you talking about?

      I'm staring at the apt codebase on my screen just now, and it's all C++, baby. Ok, so there is a trivial amount of perl; sloccount summary:

      Totals grouped by language (dominant language first):
      cpp: 26481 (89.75%)
      sh: 2816 (9.54%)
      perl: 209 (0.71%)

      This is for apt-0.5.14, but I can't imagine that the newest version in unstable (0.5.24) would be that different.

      Now, if the rest of your story is true, that's mind-boggling. If the new teacher refused to judge your, from your description very fine, work just because he has a serious hard-on for gentoo, I seriously believe you should have taken it up with the dean of the faculty instead of just swallowing it and later complaining on /..

      That being said, why choose apt in the first place? Now, I haven't profiled apt, but I guess it spends the majority of time waiting on network i/o or waiting for dpkg to finish anyway.

    5. Re:Article puts it all in perspective by Anonymous Coward · · Score: 0
      Totals grouped by language (dominant language first): cpp: 26481 (89.75%) sh: 2816 (9.54%) perl: 209 (0.71%)

      Yeah, but that 209 lines of Perl is doing the work of 100,000 lines of C++.

    6. Re:Article puts it all in perspective by Anonymous Coward · · Score: 0

      I really enjoy your posts. I wish I was a girl, I would marry you! The moderation is quite entertaining too. Even better would be to have "apt-get" somewhere in the subject field of all your posts.

    7. Re:Article puts it all in perspective by defile · · Score: 2, Interesting

      How is this fair? He completely and utterly changed the entire assignment on you forcing you to throw all of your work away. And gave you one week for it!?

      apt-get and emerge are two totally different implementations of the same idea. Changing the environment on you may have taught you a lesson about how optimizing eliminates robustness, but if the last professor encouraged you to try MMX/SIMD instructions then you were totally right to tie yourself to the x86.

      I would've kicked that moron's ass.

    8. Re:Article puts it all in perspective by Anonymous Coward · · Score: 0

      Teehee, I'm running apt-get right now...and I'm NOT WEARING ANY PANTIES!!!

    9. Re:Article puts it all in perspective by prockcore · · Score: 0, Offtopic

      How is this fair? He completely and utterly changed the entire assignment on you forcing you to throw all of your work away. And gave you one week for it!?

      He sounds just like my boss.

    10. Re:Article puts it all in perspective by Anonymous Coward · · Score: 0

      For Christ's sake, his *name* has Troll in it. It's fake! He made it up!

    11. Re:Article puts it all in perspective by Anonymous Coward · · Score: 0

      I agree, that was great--the best work I have seen all week.

    12. Re:Article puts it all in perspective by Anonymous Coward · · Score: 0

      Don Knuth is not the originator of that quote. Anthony Hoare is.

      Btw, congrats on a well-crafted troll.

    13. Re:Article puts it all in perspective by Anonymous Coward · · Score: 0

      But what if somebody works on your program and adds some significant amount of code to the end of the program. Now it is not the end anymore.

    14. Re:Article puts it all in perspective by Godeke · · Score: 1

      Welcome to school: I got busted for using the fact that VAX (well, let's just date myself here) registers overlap depending on the bit length you use them at. The teacher said that "you can code this function in 13 assembly instructions" during class. Well, most everyone was using 20+ instructions. I had it down to 16, but was driven insane by not meeting the quoted number. Then I had a lightbulb go off. If I were to use adjacent registers but do the rotation of bits using double length registers, I could move my temporary and current accumulated around all at once. A bit later, I was down to 13 and declared victory.

      When I came to class the next day (about 5 minutes late, I must admit) I heard "ah... there he is now" from the teacher. I asked the girl next to me what that was all about, and she told me he went off on how I was using non portable optimization techniques. When I got that assignment back, he docked me points for meeting the goal. I complained that he said it could be done in 13, and yet his code was 18 instructions long. He denied ever saying such.

      Bah. On the other hand, I would nearly never resort to such an optimization today, for all the reasons brought up in the article.

      --
      Sig under construction since 1998.
    15. Re:Article puts it all in perspective by LittleLebowskiUrbanA · · Score: 1

      You just got trolled, buddy.

  26. If feature X were important, we'd code in Y by wintermute42 · · Score: 2, Offtopic

    The economist Brian Arthur is one of the proponents of the theory of path dependence. In path dependence something is adopted for reasons that might be determined by chance (e.g., the adoption of MS/DOS) or by some related feature (C became popular in part because of UNIX's popularity).

    The widespread use of C and C++, languages without bounds checking in a world where we can afford bounds checking, is not so much a matter of logical decision as history. C became popular, C++ evolved from C and provided some really useful features (objects, expressed as classes). Once C++ started to catch on, people used C++ because others used it and an infrastructure developed (e.g., compilers, libraries, books). In short, the use of C++ is, to a degree, a result of path dependence. Once path dependent characteristics start to appear, choices are not necessarily made on technical virtue. In fact, one could probably say that the times when we make purely rational, engineering based decisions (feature X is important so I'll use language Y) are outweighed by the times when we decide on other criteria (my boss says we're gonna use language Z).

    1. Re:If feature X were important, we'd code in Y by Hobobo · · Score: 1

      Path dependence seems like another word for industry standard.

    2. Re:If feature X were important, we'd code in Y by plover · · Score: 1
      It sounds like you are believing that path dependence is a completely bad thing, chaining us down and preventing us from achieving the loftiest goals of bug-free programming. I think you're confusing "technical merit above all else" with "the real world."

      In the real world, decisions such as 'choice of language' are not made in a vacuum. To create a fictitious example, let's say that an expert in the field might recognize that perl would be the ideal language for feature X because of the existence of a perlmod that already performs complex operation Y. Path dependence simply means there are many, many considerations to weigh. First, do I already have more than one perl expert on my development staff? Do I have support teams that can read perl? If not, do I have the budget to give to the support manager enough money for perl training? Will they be competent enough to figure out what might be going wrong at a remote site? Do my developers and support staff have all the perl development tools they need? Do all my developers and support people even have access to the same version of perl? Are there already perl environments established on my target platforms? Are there licensing restrictions preventing our usage or deployment of that perlmod?

      Real people in the real world have to produce code that other people can maintain. This code has to run on a platform that exists (or one that can be set up for a reasonable cost.)

      As you have no doubt already guessed, I'm approaching this from a large organization's viewpoint. We already have tens of thousands of machines spread around thousands of diverse geographic locations. Our users do not maintain their own machines, we do it all remotely. So if someone comes along and says "hey, I can do this in one line of perl" we may have to pack up an awful lot of baggage just to get that one camel to take that one step.

      Or, we can decide to use the existing infrastructure and tools to write it in ten lines of C++.

      Does that mean we would never consider a perl solution? Not at all -- we make these sorts of decisions frequently. But we already know we're an elephant, unable to turn on a dime. That means we know we can only walk down elephant-sized paths. If a mouse says "run over this piece of rope with me and you will quickly get to the other side of the gorge," well, we're still an elephant. We're not running anywhere, much less over a piece of rope. Decisions like these are just much bigger than simply perl vs. C++.

      --
      John
    3. Re:If feature X were important, we'd code in Y by Mr.+Piddle · · Score: 1

      The economist Brian Arthur is one of the proponents of the theory of path dependence [bearcave.com].

      It sounds like "path dependence" is academic-speak for "_blatantly_obvious_ history of software and management decision-making." Does Brian Arthur make good money off of his theories? If so, I think I need a career change!

      --
      Vote in November. You won't regret it.
    4. Re:If feature X were important, we'd code in Y by wintermute42 · · Score: 1

      What is "blatantly obvious" is not always obvious. Also, some of the things that we are sure are obvious and clearly true turn out to be false. Human history is so full of examples that I will not attempt to list them here. Rational inquiry is a framework for trying to show that what we "know" is actually true. Economics as a broad topic has many faults. Economists adopt theories as a matter of faith or simply because the math works out. But there are good economists who do not fall prey to these faults. In my opinion, Brian Arthur is one of them.

      The theory of "path dependence" is actually controversial. This was argued out in the US government's antitrust case. Brian Arthur was one of the people who argued for the Government. Other economists, like Paul Krugman, have been rather scathing in their rebuttal of Arthur's theories. So this is certainly not an area that everyone thinks is obvious.

      And I don't know if Brian Arthur makes "good money" off his theories. I think that he is a full professor, so he probably makes a salary in the low six figures. And so far I have not read that they are offshore outsourcing jobs for economists. So yeah, perhaps a career change would be a good thing.

  27. optimize with discretion by kaan · · Score: 2, Insightful

    All projects are an exercise in scheduling, and something is always bound to fall off the radar given the real-world time constraints. In my experience, the right thing to do is get a few smart people together to isolate any problem areas of the product, and try to determine whether that code might produce performance bottlenecks in high-demand situations. If you find any warning areas, throw your limited resources there. Don't fret too much about the rest of the product.

    In the business world, you have to satisfy market demands and thus cannot take an endless amount of time to produce a highly optimized product. However, unless you are Microsoft, it is very difficult to succeed by quickly shoving a slow pile of crap out the door and calling it "version 1".

    So where do you optimize? Where do you concentrate your limited amount of time before you miss the window of opportunity for your product?

    I know plenty of folks in academia who would scoff at what I'm about to say, but I'll say it anyway... just because something could be faster, doesn't mean it has to be. If you could spend X hours or Y days tweaking a piece of code to run faster, would it be worth it? Not necessarily. It depends on several things, and there's no really good formula, each case ought to be evaluated individually. For instance, if you're talking about a nightly maintenance task that runs between 2am and 4am when nobody is on the system, resource consumption doesn't matter, etc., then why bother making it run faster? If you have an answer, then good for you, but maybe you don't and should thus leave that 2 hour maintenance task alone, spend your time doing something else.

    For people who are really into performance optimization, I say get into hardware design or academia, because the rest of the business world doesn't really seem to make time for "doing things right" (just an observation, not my opinion).

    1. Re:optimize with discretion by prockcore · · Score: 1

      If you have an answer, then good for you, but maybe you don't and should thus leave that 2 hour maintenance task alone, spend your time doing something else.

      What else should, say, the programmers responsible for OpenOffice be spending their time on? Features? Compatibility? Other projects altogether?

      I think we all can agree that the biggest problem with OpenOffice is speed. If only they had designed it with speed in mind.

      Optimization takes longer (and is often impossible) if it is the last thing you do. It should be in mind every step of the process.

    2. Re:optimize with discretion by prockcore · · Score: 1

      Just another follow up on my previous reply..

      the rest of the business world doesn't really seem to make time for "doing things right" (just an observation, not my opinion).

      This may be true in certain businesses, but not for most commercial websites. People on the web have little tolerance for slow loading pages, and we lose viewership if something is slow.

    3. Re:optimize with discretion by JamieF · · Score: 1

      >the rest of the business world doesn't really seem to make time for "doing things right"

      Software exists to solve problems.

      If "doing things right" means solving a business need in the least expensive way possible, or in a way that costs a small amount now and will need more investment later, that tends to piss off developers, even though it may be the best thing to do from a business point of view.

      If "doing things right" means spending trillions of dollars building new CPUs to run a brand new revolutionary programming language so you can spend eight more years writing a provably correct program that does something really fascinating but which no one can justify paying even $100 for, developers love that shit.

      Plus, there's always some dick who will tell you that your program is a hack because it's lacking some feature that was never part of the requirements, but which they think every single program in the universe MUST have because they had to do it once. It has to run on 64-bit CPUs or it's a hack. It has to work on little-endian and big-endian CPUs or it's a hack. It has to work well in low memory situations, and handle multiple currencies, and support Hebrew, and have font smoothing, and a themed UI, and be portable to any GUI framework, and be multithreaded, and... blah blah blah. Developers are guilty of just as much scope creep as business stakeholders, but we hide it in the architecture instead of putting it in a feature list.

      If you can't stand the idea of shipping something until you couldn't possibly think of any way to improve it, you definitely should stay in academia.

  28. One thing new programmers often miss by xant · · Score: 2, Insightful

    Less code is faster than more code! Simply put, it's easier to optimize if you can understand it, and it's easier to understand if there's not so much of it. But when you optimize code that didn't really need it, you usually add more code; more code leads to confusion and confusion leads to performance problems. THAT is the highly-counterintuitive reason premature optimization is bad: It's not because it makes your code harder to maintain, but because it makes your code slower.

    In a high-level interpreted language with nice syntax--mine is Python, not Erlang, but same arguments apply--it's easier to write clean, lean code. So high-level languages lead to (c)leaner code, which is faster code. I often find that choosing the right approach, and implementing it in an elegant way, I get performance far better than I was expecting. And if what I was expecting would have been "fast enough", I'm done -- without optimizing.

    --
    It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
    1. Re:One thing new programmers often miss by AaronW · · Score: 4, Interesting

      Less code does not equal faster code. You can usually get the best performance increases by using a better algorithm. For example, if you're doing a lot of random adding and deleting of entries, a hash table will be much faster than a linked list. This will have the greatest impact.
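      A rough Python sketch of the hash table versus linked list point. The `LinkedList` class and its `steps` counter are made up for illustration; it counts node visits instead of timing anything, so the result does not depend on the machine.

```python
class Node:
    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node


class LinkedList:
    """Toy singly linked list that counts how many nodes deletes have to visit."""

    def __init__(self):
        self.head = None
        self.steps = 0  # total nodes visited by delete() calls

    def add(self, key):
        # Inserting at the head is O(1).
        self.head = Node(key, self.head)

    def delete(self, key):
        # Deleting means walking the list until the key is found: O(n).
        prev, node = None, self.head
        while node is not None:
            self.steps += 1
            if node.key == key:
                if prev is None:
                    self.head = node.next
                else:
                    prev.next = node.next
                return True
            prev, node = node, node.next
        return False


# The hash-table version is just Python's built-in set:
# add() and discard() each take O(1) time on average.
table = set()
for k in range(1000):
    table.add(k)
for k in range(1000):
    table.discard(k)
```

      With 1000 keys added and then deleted oldest-first, the list walks roughly half a million nodes in total, while the set does 1000 constant-time removals.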

      Other things that can help are doing your own memory management at times (i.e. freelists) since that will be faster than malloc/new, and will have less memory overhead. Also, design your storage to your data. If you know you'll allocate up to 64K of an item, and the item is small, allocate an array of 64K of them and maintain a freelist. This will use a lot less memory than dynamically allocating each item and will result in better locality.
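      A minimal sketch of the preallocated-array-plus-freelist idea, written in Python only to show the bookkeeping (real embedded code would do this in C over a static array). `FixedPool` and its method names are hypothetical.

```python
class FixedPool:
    """Fixed-capacity pool: all slots exist up front, free slots form a freelist."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        # next_free[i] is the index of the next free slot after i, or -1 at the end.
        self.next_free = list(range(1, capacity)) + [-1]
        self.free_head = 0 if capacity > 0 else -1

    def alloc(self, value):
        # Pop the first free slot off the freelist: O(1), no searching.
        if self.free_head == -1:
            raise MemoryError("pool exhausted")
        i = self.free_head
        self.free_head = self.next_free[i]
        self.slots[i] = value
        return i

    def free(self, i):
        # Push the slot back onto the front of the freelist: O(1).
        self.slots[i] = None
        self.next_free[i] = self.free_head
        self.free_head = i
```

      Note that nothing here guards against freeing the same slot twice or freeing a slot that was never allocated; that kind of misuse is where hand-rolled allocators tend to bite.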

      I write code in the embedded space, where memory usage and performance are both equally important. Usually getting the last ounce of performance out of the compiler doesn't make much difference.

      A good real-world example is that I replaced the malloc code provided by a popular embedded OS with dlmalloc, on which glibc's malloc is based. The dlmalloc code is *much* more complicated, and the code path is much longer, but due to much better algorithms, operations that took an hour with the old simple malloc dropped down to 3 minutes. It went from exponential to linear time.

      -Aaron

      --
      This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
    2. Re:One thing new programmers often miss by 16K+Ram+Pack · · Score: 1
      I disagree. I've optimised code, and in the process had to add a load more code (often caching information in memory etc.).

      The one performance area that more code hampers is the performance of programmers picking it up. If you have something going to a table in memory to get some information rather than the database, you then have to understand where the table in memory is getting its information from.
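      Something like the following is the kind of extra code meant here, sketched in Python. `CachedTable` and the `load_row` callback are invented names; `load_row` stands in for whatever slow lookup (say, a database query) sits behind the cache.

```python
class CachedTable:
    """In-memory table in front of a slow loader, with a counter for backend hits."""

    def __init__(self, load_row):
        self._load_row = load_row  # hypothetical slow lookup, e.g. a database query
        self._rows = {}
        self.loads = 0  # how many times the slow loader was actually called

    def get(self, key):
        # Only go to the backend on a miss; later reads come from memory.
        if key not in self._rows:
            self.loads += 1
            self._rows[key] = self._load_row(key)
        return self._rows[key]

    def invalidate(self, key):
        # Whoever changes the underlying data has to remember to call this.
        self._rows.pop(key, None)
```

      The speedup comes from `get`, but the maintenance cost is in `invalidate`: a reader now has to work out who fills the table and who is responsible for keeping it fresh.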

    3. Re:One thing new programmers often miss by Ristretto · · Score: 1

      You make one horrible suggestion and one great one. The great one is to address allocation performance problems by substituting a better allocator (e.g., Doug Lea's). This is a good idea for lots of reasons. It takes almost no time and involves no source code changes, and leverages heavily-used robust code.

      The horrible suggestion is to roll your own memory manager. This approach causes numerous software engineering problems -- consider what happens if you inadvertently call free on one of your custom-allocated objects. Worse, it often yields no performance improvement compared to DLmalloc. See my OOPSLA 2002 paper Reconsidering Custom Memory Allocation for a full analysis. One of the morals of the story is that you should use DLmalloc instead of freelists, which yield neither cycle-level nor locality performance advantages.

    4. Re:One thing new programmers often miss by Anonymous Coward · · Score: 0
      I write code in the embedded space, where memory usage and performance are both equally important. Usually getting the last ounce of performance out of the compiler doesn't make much difference

      My last embedded work had three parts, all optimized differently. About 1% was optimized for speed, where any nasty trick to buy an extra cycle actually mattered. That was asm code that allowed two instruction slots in the shadow of a delayed branch. At one point, I used those two slots to speculate down both sides of the branch.

      About another 1% was space-constrained. The code had to fit a small, fixed buffer, and people kept demanding more of it: serial line protocol, error control, flash updates etc. in 256 words. Again, any nasty trick was justified to stay within the limit. We applied the tricks as needed, in increasing order of nastiness.

      The other 98% was optimized for time to market, i.e. human readability. I had an intern back out his "optimization", on the grounds that the particular nanosecond he saved would cost thousands in maintenance/update costs. It was about like optimizing the null process.

    5. Re:One thing new programmers often miss by dtfinch · · Score: 1

      I'd have to agree with you on the rolling your own memory manager part, at least under modern operating systems and compilers. My last freelist allocator only achieved a few % gain over the glibc malloc() and free(), and was a bit better against the new and delete keywords. At first it performed 50% worse, which I determined to be a memory alignment problem and corrected by inserting an unused int into one of the structures. Using a one time allocator, which didn't have a free() method but deallocated all at once, I was able to cut the time in half.

      But then I noticed the intrinsic alloca() function. It temporarily allocated space directly on the stack, and beat my one time allocator by over 1/2 again. I looked at the assembly code generated by gcc, and the alloca() call had been reduced to just a sub and a mov, two instructions, probably 1 cycle of cpu time. Dynamic allocation is almost FREE when it's only temporary, and damn fast otherwise if you use malloc() instead of new for structures and primitive types.

      I retried all my benchmarks on a WinXP system using VC++ 6.0 to compile, and my gain over the built in allocators grew by about 5x, except for alloca(). The alloca() test caused stack overflows, forcing me to greatly reduce the number of items I allocated with it compared to what I did under gcc+linux. I was surprised to see the extent that gcc+glibc whipped VC++ at memory allocation. Microsoft clearly rolled their own.

    6. Re:One thing new programmers often miss by funbobby · · Score: 1

      Additional code can obviously make things faster, but that wasn't the point of the original post.

      The point was that if your code is simpler, it will be much easier to understand what the important algorithms and data structures are, and much more likely that you will be able to choose the correct ones.

      If you obfuscate your code with a lot of pre-emptive attempts to fix perceived performance problems that might or might not actually matter, you're more likely to miss the important high level optimizations.

      So in the end you might add code to make things run faster, but by keeping the code small to start with it's much more clear what code needs to be added.

      funbobby

  29. Why aren't optimized algorithms best practices? by ObviousGuy · · Score: 3, Interesting

    You would think that with all the years put into developing computer languages, as well as the decades of software engineering, that these algorithms and techniques would make their way into best practices.

    This, of course, has already begun with many frequently used algorithms like sorting or hashing being made part of the language core libraries, but even so, duplicated effort still seems to occur far more often than it should.

    This is one instance where Microsoft has really come through. Their COM architecture allows for inter-language reuse of library code. By releasing a library which is binary compatible across different languages, as well as backwards compatible with itself (v2.0 supports v1.9), the COM object architecture takes much of the weight of programming difficult and repetitive tasks out of the hands of programmers and into the hands of library maintainers.

    This kind of separation of job function allows library programmers the luxury of focusing on optimizing the library. It also allows the client programmer the luxury of ignoring that optimization and focusing on improving the speed and stability of his own program by improving the general structure of the system rather than the low level mundanities.

    Large libraries like Java's and .Net's as well as Smalltalk's are all great. Taking the power of those libraries and making them usable across different languages, even making them scriptable, would make the speed optimizations in those libraries available to everyone.

    --
    I have been pwned because my /. password was too easy to guess.
    1. Re:Why aren't optimized algorithms best practices? by Anonymous Coward · · Score: 0

      >This is one instance where Microsoft has really come through. Their COM architecture allows for inter-language reuse of library code

      Hooray for VARIANT!

      Man, there's nothing like casting everything to void* to make data types interchangeable.

  30. Re:You don't optimize, that's the job of the compi by Anonymous Coward · · Score: 0

    Unless the compiler is VC++ 5.0-try looking at an asm dump from that puppy. Ouch.

  31. alternate article for Java programmers by next1 · · Score: 1

    Programming: As If Performance Mattered

  32. You need optimisation here: by Anonymous Coward · · Score: 0

    Database applications.

    The potential for a SQL statement to go tragically awry, hanging the user session and sending the CPU to 100%, is significant. In a well-designed database, it won't happen too often, but it will happen often enough for you to need a good DBA close at hand to deal with it.

    It's probably at its worst with Oracle, which now possesses a black box called the Cost-Based Optimiser. This little piece of voodoo uses a large range of metrics to decide on an execution plan for your query, and woe betide you if it gets it wrong - you'll be tearing your hair out trying to persuade it to do things differently.

    Mind you, I've also seen programmers run Cartesian joins against two tables that had several million rows each. But that's not optimisation; it's trying to decide what to hit them with.
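    To put a number on the Cartesian join problem, here is a small sketch using Python's built-in sqlite3 module rather than Oracle. The table names `a` and `b` and the helper `join_row_counts` are made up for the example.

```python
import sqlite3


def join_row_counts(n):
    """Build two n-row tables and return (cartesian_rows, keyed_join_rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY)")
    rows = [(i,) for i in range(n)]
    conn.executemany("INSERT INTO a (id) VALUES (?)", rows)
    conn.executemany("INSERT INTO b (id) VALUES (?)", rows)
    # No join condition: every row of a is paired with every row of b.
    cartesian = conn.execute("SELECT COUNT(*) FROM a, b").fetchone()[0]
    # Join on the key: each row of a matches exactly one row of b.
    keyed = conn.execute(
        "SELECT COUNT(*) FROM a JOIN b ON a.id = b.id"
    ).fetchone()[0]
    conn.close()
    return cartesian, keyed
```

    The unconstrained join produces n times n rows, so two tables of a few million rows each work out to trillions of row pairs, which no amount of query tuning will rescue.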

    1. Re:You need optimisation here: by swordfishBob · · Score: 1

      Yes yes!
      Many queries' execution times grow exponentially with the size of the data set.
      I'm sure that happens in other areas, but usually it's not so dramatic.

      I've been using IBM's DB2. They've done some great things with optimisation, but every time they improve it, they also add new (useful) query features that can blow out the optimisation if not used carefully.

      Having reports written by someone who doesn't understand which side is which in a LEFT JOIN rather got up my nose for a while...

      --
      -- All your bass are below two Hz
  33. Efficiency is what separates man from monkeys by yuriismaster · · Score: 1, Funny

    If you compare a man to a monkey today, you see cognitive differences (granted, some similarities too) that tend to separate these two related species. When a monkey attempts to open a locked box, he will try many times to pry the box open, bash at it with his fists, and the like. A man, however, quickly realizes the box is locked and uses a tool to break the lock. While the monkey's strategy is simple, and will _eventually_ get the box open, the man's strategy is more complex, but much more efficient.

    Same goes with programming. If you have to search through a massive sorted database, no skilled programmer alive would use a linear traversal (simple, yet inefficient); they would use a binary search (more complex, yet far more efficient).

    So when customers pay for you to get that locked box open, whose strategy will you choose?
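    For the sorted-search comparison above, a small Python sketch that counts how many elements each strategy has to look at. The function names and the probe counters are just for illustration.

```python
def linear_search(sorted_items, target):
    """Return (index, probes); index is -1 if target is absent."""
    for probes, item in enumerate(sorted_items, start=1):
        if item == target:
            return probes - 1, probes
    return -1, len(sorted_items)


def binary_search(sorted_items, target):
    """Return (index, probes); index is -1 if target is absent."""
    lo, hi, probes = 0, len(sorted_items), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] < target:
            lo = mid + 1  # target can only be to the right of mid
        else:
            hi = mid      # target is at mid or to the left of it
    if lo < len(sorted_items) and sorted_items[lo] == target:
        return lo, probes
    return -1, probes
```

    On a million sorted items the linear version may inspect all of them, while the binary version needs at most 20 probes, because each probe halves the remaining range.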

    1. Re:Efficiency is what separates man from monkeys by Anonymous Coward · · Score: 0

      Whoever modded this funny needs to think a little harder.

      This is an apt analogy.

  34. Re:You don't optimize, that's the job of the compi by neil.orourke · · Score: 1

    No matter how good the compiler is, it can't possibly compensate for poor program design.

    Right at the outset, you need to decide if the program is performance critical or not. Take, for example, this code:

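    class fred {
    public:
        int q;

        // Set q through a member function.
        void setQ(int newQ)
        {
            q = newQ;
        }
    };

    fred myFred;  // plain object, so myFred.setQ() and myFred.q are both valid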

    Now, which is going to execute faster:
    myFred.setQ(10);

    or

    myFred.q=10;

    Using your super optimising compiler, how is it going to know the best way of setting q in myFred? It can't, because no compiler could make an assumption like that.

  35. Code tweaking by Frequency+Domain · · Score: 5, Insightful
    You get way more mileage out of choosing an appropriate algorithm, e.g., an O(n log n) sort instead of O(n^2), than out of tweaking the code. Hmmm, kind of reminds me of the discussion about math in CS programs.
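    A quick way to see the gap between the two complexity classes is to count comparisons rather than time anything. The sketch below uses a hand-written insertion sort as the O(n^2) example and a merge sort as the O(n log n) one; both are illustrative, and in real Python code you would simply call sorted().

```python
def insertion_sort(items):
    """O(n^2) comparisons in the worst case. Returns (sorted_list, comparisons)."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] <= a[j]:
                break
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return a, comparisons


def merge_sort(items):
    """O(n log n) comparisons. Returns (sorted_list, comparisons)."""
    a = list(items)
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, c_left = merge_sort(a[:mid])
    right, c_right = merge_sort(a[mid:])
    merged, comparisons = [], c_left + c_right
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; the rest of the other is already in order.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons
```

    On 1000 items in reverse order, the insertion sort makes 499,500 comparisons and the merge sort fewer than 10,000. No amount of tweaking the inner loop of the first one closes a gap like that.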

    Every time I'm tempted to start micro-optimizing, I remind myself of the following three simple rules:

    • 1) Don't.

    • 2) If you feel tempted to violate rule 1, at least wait until you've finished writing the program.

    • 3) Non-trivial programs are never finished.
    1. Re:Code tweaking by Anonymous Coward · · Score: 0

      ..but then you realize that it's fun, like playing solitaire and do it anyway?

    2. Re:Code tweaking by mike260 · · Score: 1

      You get way more mileage out of choosing an appropriate algorithm, e.g., an O(n log n) sort instead of O(n^2), than out of tweaking the code.

      True, but that's pretty irrelevant here since the algorithm in the article is inherently O(n). As for your 3 golden rules, perhaps you should investigate a recent innovation called 'modularisation'.

      Anyway, a properly written TGA decoder will spend pretty much all of its time waiting around for IO, so a much more realistic optimisation would be to address that by using a format with better compression like PNG.

    3. Re:Code tweaking by MarkCollette · · Score: 1

      Or just do what I do: make all optimizations at the design phase.

  36. Optimize the algorithm by jgardn · · Score: 1

    One of the golden rules that Python is founded on is to optimize the algorithm and let the compilers / interpreters do all the rest. Python, like perl and most other high-level languages, allows you to tweak the code and try out many different algorithms, without worrying about pointers and useless trivia like how much memory is available to your process.

    If you master the algorithm, and you still want more speed, you can go ahead and usually cut run times by half or more by implementing it in C. But it takes a lot of work to get it right in C, and once in C, it isn't very friendly to tweaking.

    About the compiler / interpreter doing the real optimizations - psyco is freely available and regularly gets close to C speeds for python. The new parrot compiler, when it comes out, will redefine the speed barriers for all high level languages ported to it.

    I work at a company where we have half C-developers and half perl-developers. The perl developers code circles around the C developers, getting projects done within constraints on time and with fewer bugs. The perl programs are easier to maintain and modify. However, everyone always wants to push code into C. It's amazing when I sit down and plan out projects, and show them, "If we implemented it in Perl, we would be done in 1/4 the time with 1/4 the number of bugs, and 100% of the features." But the retort is, "But C will be 3 times faster!" And I respond, "Buy 3 times as many machines to run the perl code on for all I care. It'll still be cheaper to do it in perl because hardware costs far less than the developer's, tester's, and manager's time!" Hey, as long as they sign my paycheck, I'll do what they ask. But if they still want to live in the 80's, that's their problem.

    --
    The radical sect of Islam would either see you dead or "reverted" to Islam.
    1. Re:Optimize the algorithm by cperciva · · Score: 1

      The perl programs are easier to maintain and modify.

      I almost believed you until I reached this point.

    2. Re:Optimize the algorithm by Anonymous Coward · · Score: 0

      s/perl/python and you've got yourself a consistent statement still. I've just started with python, and it's amazing how easy even the largest of projects are to understand, tweak, and modify (well, easier than c, c++, java, etc.).

    3. Re:Optimize the algorithm by jgardn · · Score: 1

      Perl gives you the freedom of writing crap; it also gives you the freedom of writing clear, concise code that almost looks like Python. I'll let you guess which style I use and enforce on large, multi-developer projects.

      C allows no such opportunity.

      --
      The radical sect of Islam would either see you dead or "reverted" to Islam.
  37. Re:You don't optimize, that's the job of the compi by prockcore · · Score: 1

    So long as your code is not unnecessarily convoluted often the machine optimizations are better than the human brain optimizations.

    That's not what optimising is. It is a logic problem, one that a computer cannot solve. It is organization, it involves using your mind. It is making sure that your code isn't doing more work than it should. Restructuring code to remove redundant operations. Finding a better way to do the things you need to do.

    Claiming that a compiler can do a better job of optimizing than a human is exactly the same thing as claiming that a computer makes a better opponent in UT2k3 than a human.

    Humans have creativity on their side.

  38. Painful P-ful Post by Dominic_Mazzoni · · Score: 4, Funny

    Proper programming perspective? Please. People-centered programming? Pretty pathetic.

    Programmer's purpose: problem-solving. Programmers prefer power - parallelizing, profiling, pushing pixels. Programmers prefer Pentium PCs - parsimonious processing power. Pentium-optimization passes Python's popularity.

    Ponder.

    [Previous painful posts: P, D]

    1. Re:Painful P-ful Post by Lord+Kano · · Score: 1

      Participant's post presents profuse "P" periodicity.

      Personally prefer personality.

      Pathetic performance!

      LK

      --
      "Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
    2. Re:Painful P-ful Post by Anonymous Coward · · Score: 0

      Precisely.

    3. Re:Painful P-ful Post by Anonymous Coward · · Score: 0

      This P'ing in public has got to stop!

  39. me too... by interactive_civilian · · Score: 0, Offtopic
    Yeah...my first version of "Hello World" could also have used a bit more optimization...

    ;p

    --
    "Empathise with stupidity, and you're halfway to thinking like an idiot." - Iain M. Banks
    1. Re:me too... by Lord+Kano · · Score: 5, Funny

      One of my C++ instructors told us a story about one of his former coworkers who used to go through his programs and delete ALL whitespace from the code because he thought it would make the programs smaller and faster.

      LK

      --
      "Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
    2. Re:me too... by Billly+Gates · · Score: 0, Offtopic

      hehe

      God.

      Talk about some people who should not be programmers. No wonder we are shipping jobs to India.

    3. Re:me too... by maxwell+demon · · Score: 2, Funny

      In some sense he was right:
      The program source code gets smaller (by as many bytes as the removed whitespace occupied), and compiles faster (because the parser doesn't have to read and ignore all those whitespace characters).

      Now, the size reduction might be quite measurable (esp. if the original program was quite readable), though not substantial. However, if the improved compile speed was measurable, the compiler must have had an incredibly slow parser (or maybe the system had mounted the disk with NFS through ppp over ssh through a 14.4 kbps modem link :-).

      --
      The Tao of math: The numbers you can count are not the real numbers.
    4. Re:me too... by Lord+Kano · · Score: 1

      I guess I should have been clearer, the man's belief was that his executables would be smaller and faster if he got rid of all whitespace and comments.

      LK

      --
      "Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
    5. Re:me too... by some+guy+I+know · · Score: 4, Interesting

      If the man was coming from a BASIC programming background, his belief may have been understandable.
      Some (very old) BASIC interpreters used to parse each source line each time it was executed.
      Doing it that way saved memory (no intermediate code to store).

      --
      Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
    6. Re:me too... by SmackCrackandPot · · Score: 1

      That sounds like a piece of knowledge inherited from the days of BASIC programming.

    7. Re:me too... by ultranova · · Score: 1
      Some (very old) BASIC interpreters used to parse each source line each time it was executed. Doing it that way saved memory (no intermediate code to store).

      I remember programming a simulator in one of those for examining a star's (as in big ball of hot gas, not as in actor or rock star, even if they might be full of hot air as well ;) life cycle from a gas cloud to a white dwarf in a visual manner. The whole simulation only took about 10 seconds to run. And I didn't even remove whitespaces.

      It's amazing what a Basic interpreter and an 8086 can do in the right hands :).

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    8. Re:me too... by Patrik_AKA_RedX · · Score: 4, Funny
      No wonder we are shipping jobs to India.
      Bad solution. Every Darwinist knows how to solve this. Let only the best programmers reproduce. That way we'll be breeding a race of superprogrammers!

      And perhaps we could breed some very small furry humans as pets. I'm very sure there is a market for pet-humans as no less than 25% of the Andromedians voted yes on a survey asking if they would spend more than 100 Astrobucks on a pet-human if they would be smaller and less noisy.
    9. Re:me too... by StrawberryFrog · · Score: 1, Insightful

      The program source code gets smaller (by as many bytes as the removed whitespace occupied),

      Who cares? I'll bet that for a typical techie, the collection of all the source code they've ever written is smaller than their current MP3 collection. By at least one order of magnitude. Give me readability.


      and compiles faster (because the parser doesn't have to read and ignore all those whitespace characters).


      Technically you are right. However this effect will be utterly negligible next to the things that really take up time when compiling a C++ program, such as preprocessing #defines, expanding template code, checking types, resolving variable references, and generating machine code. C++ compilers are generally quite slow (as compared to, say, Borland Delphi); this is due to the complexities and terseness of the language, not the whitespace.

      --

      My Karma: ran over your Dogma
      StrawberryFrog

    10. Re:me too... by Patrik_AKA_RedX · · Score: 1

      hmmm, I doubt he would be winning time.
      It would take time to remove all those white spaces (from a copy of the source code) and the end product is unreadable source code. So if something gets changed he'll have to remove the white spaces from another copy.
      I wonder what would take the most time: Removing all the whitespaces or letting the parser skip them.

    11. Re:me too... by Anonymous Coward · · Score: 1, Interesting

      ... And if you were customizing and programming Vantive (customer support backend software where all customization is done in either VB for Applications or in stored procs on the SQL server), you'd still find yourself shortening variable names and reducing whitespace.

      Vantive stores VB for Application data in a set size blob on the server. Max size 64k. Go over that, and it won't save.

      Of course you could always refactor the blocks to move things out, but when you're working 70 hour days under the gun to "get-it-done-now-however" you just substitute a variable or two and damn the consequences.

    12. Re:me too... by Anonymous Coward · · Score: 0

      In old BASIC programs (think C64 and Atari 8-bit), shorter programs DID run faster because the program was read each time it was executed.

      No clue why someone would do it nowadays...

    13. Re:me too... by CastrTroy · · Score: 1

      I'm sure that since the white space doesn't matter, some C++ compilers might actually just run a parser that removes all the white space, to make the rest of the compilation faster/easier to program.

      --

      Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
    14. Re:me too... by ggeens · · Score: 2, Interesting

      Some (very old) BASIC interpreters used to parse each source line each time it was executed.

      One time (around 1985 IIRC), I read an article in the local computer club's magazine. The author had written a BASIC program (GW-BASIC I think) to "shorten" BASIC programs. It would:

      • Shorten all variable names to 1 or 2 characters
      • Remove whitespace
      • Renumber all lines so that all GOTO n became shorter

      The source code that went along with the article looked like it was used on itself (very terse).

      With the machines you had back then, it probably made a difference.
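      A toy sketch of two of those three steps (whitespace removal and line renumbering) in Python, to show how little it takes. Everything here is an assumption for illustration: it expects each line to be a line number, one space, then a statement; it only rewrites GOTO and GOSUB targets; it would also rewrite a GOTO that appears inside a quoted string; and it does not attempt the variable-renaming step.

```python
import re


def strip_spaces(code):
    """Remove spaces outside double-quoted string literals."""
    out, in_string = [], False
    for ch in code:
        if ch == '"':
            in_string = not in_string
        if ch == ' ' and not in_string:
            continue
        out.append(ch)
    return ''.join(out)


def shorten(program):
    """program: list of 'NUMBER STATEMENT' lines. Returns the shortened lines."""
    parsed = [line.strip().split(' ', 1) for line in program]
    # Map old line numbers to 1, 2, 3, ... so that jump targets get shorter.
    new_number = {old: str(i) for i, (old, _) in enumerate(parsed, start=1)}

    def retarget(match):
        # Unknown targets are left alone rather than guessed at.
        target = match.group(2)
        return match.group(1) + new_number.get(target, target)

    result = []
    for old, stmt in parsed:
        stmt = re.sub(r'\b(GOTO|GOSUB)\s+(\d+)', retarget, stmt)
        result.append(new_number[old] + strip_spaces(stmt))
    return result
```

      Fed a line such as `30 IF COUNT < 3 THEN GOTO 10` in a program whose first line is 10, it produces `3IFCOUNT<3THENGOTO1`, which gives a fair idea of how readable the article's listing must have been.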

      --
      WWTTD?
    15. Re:me too... by FictionPimp · · Score: 1

      Something changes? Are you sure you are a programmer? I've never rewritten or edited my code. I do it right the first time. Well except that one time, but how was I to know the world would last after 2000. *It's a joke!*

    16. Re:me too... by Anonymous Coward · · Score: 0

      Uhhh... why didn't he write a program to do that?

    17. Re:me too... by Guignol · · Score: 1

      Yeah ?
      well I remember doing a whole multiverse expanding and collapsing simulation in a visual manner,
      and it only took 3 seconds to run (up to the first syntax error)
      I agree, it's amazing what you can do with two right hands

    18. Re:me too... by Eric+Giguere · · Score: 1

      These techniques are still in use today, actually. Although it's getting better, many Java-enabled cellphones (those that support J2ME, Java 2 Micro Edition) have a very small amount of space available for installing and running applications. So an important thing to do is to make your application as small as possible. You do some of this by refactoring your code, but you can also make your code smaller by running it through an obfuscator, because the obfuscator will replace all the long symbolic names (which are stored in the .class files) with shorter, meaningless ones as well as remove unreachable code, etc. But same basic techniques.

      Eric
      http://www.ericgiguere.com/j2me

    19. Re:me too... by Surt · · Score: 2, Insightful

      Worse, the time it takes him to delete one space or tab will always be much longer than the time saved in the parser/compiler.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    20. Re:me too... by Rufus88 · · Score: 1

      Every Darwinist knows how to solve this. Let only the best programmers reproduce

      Every mathematician knows you can't solve an over-constrained system of equations.

    21. Re:me too... by Sepper · · Score: 2, Funny

      Like this fellow 'Coder' sitting next to me, who was putting most of his code in between /* and */ because 'It compiled faster'...

      he was wondering why his program wasn't working

      I was wondering what he was doing in Computer Engineering school...

      --
      I live in Soviet Canuckistan you insensitive clod!
    22. Re:me too... by randomencounter · · Score: 2, Interesting
      I know where he might have gotten the idea.
      The Vic20/C64 basic allowed you to merge program lines using a semicolon. This took 1 byte less per merged line, and did indeed run somewhat faster. Since the Vic20 had only 2.3K usable without tricks this was a big deal.

      Of course, anyone inflexible enough to carry that through to a C++/C/Cobol project shouldn't be programming.

      --
      Forget diamonds, copyright is forever.
    23. Re:me too... by Anonymous Coward · · Score: 2, Funny

      That only works if programming is heritable.

      In your thought experiment, a few generations down all programmers will have greasy hair & shaggy beards (heritable) & the social need (heritable) to have greasy hair & shaggy beards, with no programming ability (not heritable) ....

    24. Re:me too... by KlomDark · · Score: 1

      Um, you're thinking of a colon ":", not a semicolon ";". :)

      (Yes, I did way too much C64 programming back in the day... Living in a house in the country in the middle of Nebraska, with no car, and no Internet even possible. And was a metalhead, so no way I was walking a mile to the closest redneck neighbors to hang out. Gives you a lot of time to learn coding and explore other worlds by huffing gasoline. :) )

    25. Re:me too... by BorgCopyeditor · · Score: 1
      when you're working 70 hour days under the gun

      No offense, but what planet are you on?

      --
      Shop as usual. And avoid panic buying.
    26. Re:me too... by Phurd+Phlegm · · Score: 1
      Yeah...my first version of "Hello World" could also have used a bit more optimization...

      For starters, try making it "Hi World."

    27. Re:me too... by Anonymous Coward · · Score: 0


      2 people working 24 hour days + one person working a 22 hour day = a 70 hour work day.

    28. Re:me too... by Progman3K · · Score: 1

      >delete ALL whitespace from the code because he thought it would make the programs smaller and faster.

      It won't make it execute faster, but it'll compile faster.

      --
      I don't know the meaning of the word 'don't' - J
    29. Re:me too... by innosent · · Score: 1

      Well put, if only more people on slashdot could understand what you wrote...

      --
      --That's the point of being root, you can do anything you want, even if it's stupid.
    30. Re:me too... by Anonymous Coward · · Score: 0

      Remembrances from my earliest computing days on the Apple ][+:
      In Applesoft BASIC, white space was not a problem because the lines were parsed, reserved words (GOTO,PRINT,etc) were replaced with special token values and excess white space was removed.

      You could get speed ups by changing commonly used constants (3.1416) into variables (PI) since it didn't have to parse the floating point numbers.

      It also sped things up to place commonly used subroutines at the beginning of the program because BASIC would do a linear search to find the line number.

    31. Re:me too... by maduro55 · · Score: 1

      I welcome our new SuperProgrammer masters. May I carry your bit bucket?

  40. A little bump in your pants by Anonymous Coward · · Score: 0

    bump in your pants whaaa!!!

  41. Optimizations in the Real World by NegativeK · · Score: 2, Informative

    Optimization isn't really a hard topic. Should a programmer spend days nitpicking fifty lines of code that won't be used frequently? No. When initially writing code, should someone use Bogosort instead of Quicksort? I'll let you figure that one out.
    My biggest (reasonable) beef in the optimization area is software bloat. Programs like huge office suites containing excessive, poorly implemented crap that people won't use really tick me off. KISS. Even the stuff that has to be complicated.

    Of course, I'll always be a sucker for tweaking code for the fun of it, when I have the time. =)

    --
    This statement is false.
  42. Optimizations are a varied lot by corngrower · · Score: 2, Interesting
    Often times to get improved performance you need to examine the algorithms used. At other times, and on certain cpu architectures, things that slow your code can be very subtle.

    If your code must process a large amount of data, look for ways of designing your program so that you serially process the data. Don't try to bring large amounts of data from a database or data file all at once if you don't have to. Once you are no longer able to contain the data in physical memory, and the program starts using 'virtual' memory, things slow down real fast. I've seen architects forget about this, which is why I'm writing this reminder.
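
The advice about processing data serially can be sketched in Python. This is only an illustration, not the poster's code: the file layout (one integer per line) and the names `iter_values`, `streaming_total` and `demo_streaming_total` are assumptions made for the example.

```python
import os
import tempfile
from typing import Iterator


def iter_values(path: str) -> Iterator[int]:
    """Yield one integer per line; only the current line is held in memory."""
    with open(path, "r", encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield int(line)


def streaming_total(path: str) -> int:
    """Sum the values without ever building a list of all of them."""
    total = 0
    for value in iter_values(path):
        total += value
    return total


def demo_streaming_total() -> int:
    """Write a hypothetical data file (1..1000) and sum it by streaming."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="ascii"
    ) as tmp:
        for n in range(1, 1001):
            tmp.write(f"{n}\n")
        path = tmp.name
    try:
        return streaming_total(path)
    finally:
        os.remove(path)
```

Memory use here stays roughly constant however long the file is, whereas reading every line into a list first would grow with the input and could eventually push the process into swap.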

    On the other hand, I've worked on a C++ project where, in a certain segment of the code, it was necessary to write our own container class to replace one of the std:: classes, for performance on the SPARC architecture. Using the std:: container would cause the subroutines to nest deeply enough that the CPU registers needed to be written out to slower memory. The effect was enough to be quite noticeable in the app.

    With today's processors, to optimize for speed, you have to think about memory utilization, since running within cache is noticeably faster than from main memory. Things are not as clear cut, so far as speed optimization goes, as they once were.

    1. Re:Optimizations are a varied lot by zhenlin · · Score: 1

      Reminds me of the 68k Mac emulator for MacOS (Classic). It had to fit in the cache... and leave space for the program too.

  43. Performance oriented coding in real life... by Anonymous Coward · · Score: 1, Interesting

    While I agree that it is silly to spend eternity optimizing small routines that trivially achieve the required level of performance for their intended purpose, a lot of scientific packages and simulation software absolutely demand serious performance optimization. The mantra about premature optimization being the root of all evil only holds if the optimization doesn't save you significant development and testing time. If you have a simulation application that requires an hour to do enough work to complete a minimal function/correctness test, time spent optimizing the code pays off immediately if it cuts your test times down to a half hour, for example.

    I've worked on a lot of software packages that run calculations for days and/or weeks at a time on parallel computers. You always want to start out with fast sequential algorithms and data structures before you get into using multiple processors. Parallelism is inherently more complex, so it's often worth it to squeeze the most you can out of a single thread/process before going to multiple threads/processors.

    While desktop apps typically consume a negligible amount of CPU time/resources, apps that are candidates for running on parallel computers, clusters, or big SMP machines are inherently more costly to run in both CPU time and user's wall-clock time, so those applications don't fall within the same logic that trivial apps like image decompression/formatting do.

  44. Re:You don't optimize, that's the job of the compi by techno-vampire · · Score: 4, Insightful
    If you write clear and simple code the compiler or interpreter does all the other work.

    I remember looking over something once that was clear, simple and very slow. It was a set of at least twenty if statements, testing the input and setting a variable. The input was tested against values in numeric order, and the variable was set the same way. Not even else if's so that the code had to go through every statement no matter the value. I re-wrote it to a single if, testing to see if the input were in the appropriate range and calculating the variable's value. No compiler is going to do that. Brute force can be clear, simple and slow.
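
The rewrite described above can be sketched in Python. The original code is not shown in the comment, so everything concrete here is an assumption: inputs 1 through 20, a result equal to the input plus 100, and the names `lookup_chain` and `lookup_range`.

```python
from typing import Optional

LOW, HIGH = 1, 20   # assumed range of inputs the original tested for
OFFSET = 100        # assumed mapping: result = input + OFFSET


def lookup_chain(value: int) -> Optional[int]:
    """Brute-force version: the loop stands in for twenty separate `if`
    statements with no `else`, so every comparison runs for every input."""
    result = None
    for candidate in range(LOW, HIGH + 1):
        if value == candidate:          # no early exit, as in the original
            result = candidate + OFFSET
    return result


def lookup_range(value: int) -> Optional[int]:
    """Rewritten version: one range test, then compute the result."""
    if LOW <= value <= HIGH:
        return value + OFFSET
    return None
```

Both functions return the same answers; the second does one comparison pair instead of twenty, and it only works because the inputs and outputs follow a regular pattern that the programmer happened to know about.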

    --
    Good, inexpensive web hosting
  45. Re:You don't optimize, that's the job of the compi by Anonymous Coward · · Score: 0

    I agree with your statement.

    In my opinion there are two ways to optimize code. The bad way, optimize line by line, writing ASM, etc. And then there is the good way which is using better program design and algorithms.

    Of course the "bad way" can be good in certain specific occasions (scientific, embedded, etc), but I would say that in general, it's just a waste of time. For starters, the compiler takes into account things like the number of registers and parallel pipelining that we cannot optimize manually in C.

    Also, it can often be cheaper to just buy a 200MHz faster processor than spend piles of cash optimizing software to the last bit and maintaining it. Of course, this is not so true with widely distributed desktop products, but it is for large in-house or consultation projects.

  46. Performance, an aspect of design and understanding by StevenMaurer · · Score: 2, Insightful

    This article seems to be something that I learned twenty years ago... performance is an aspect of good design.

    That is why I insist on "optimization" in the beginning. Not peephole optimization - but design optimization. Designs (or "patterns" in the latest terminology) that are fast are also naturally simple. And simple - while hard to come up with initially - is easy to understand.

    But that's also why I discount any "high level language is easier" statement, like this fellow makes. It is significantly harder to come up with a good architecture than learning to handle a "hard" language. If you can't do the former (including understanding the concepts of resource allocation, threads, and other basic concepts), you certainly aren't going to do the latter. Visual Basic is not an inherently bad language because you can't program well in it. It just attracts bad programmers.

    And that goes the same for many of the newer "Basics": these "managed languages" that make it so that people can "code" without really understanding what they're doing. Sure, you can get lines of code that way. But you don't get a good product.

    And then the whole thing falls apart.

  47. Bad performance is built in. by BigZaphod · · Score: 2, Insightful

    There seem to be two basic causes of bad performance:

    1. Mathematically impossible to do it any other way.
    2. Modularity.

    Of course crap code/logic also counts, but it can be rewritten.

    The problem with modularity is that it forces us to break certain functions down at arbitrary points. This is handy for reusing code, of course, and it saves us a lot of work. It's the main reason we can build the huge systems we build today. However, it comes with a price.

    While I don't really know how to solve this practically, it could be solved by writing code that never ever calls other code. In other words, the entire program would be custom-written from beginning to end for this one purpose. Sort of like a novel which tells one complete story and is one unified and self-contained package.

    Programs are actually written more like chapters in the mother of all choose-your-own-adventure books. Trying to run the program causes an insane amount of page flipping for the computer (metaphorically and actually :-))

    Of course this approach is much more flexible and allows us to build off of the massive code that came before us, but it is also not a very efficient way to think about things.

    Personally, I think the languages are still the problem because of where they draw the line for abstractions. It limits you to thinking within very small boxes and forces you to express yourself in limited ways. In other words, your painting can be as big as you want, but you only get one color (a single return value in many languages). It is like we're still stuck at the Model T stage of language development--it comes in any color you want as long as it's black!

    1. Re:Bad performance is built in. by Lord+Kano · · Score: 2, Funny

      Sort of like a novel which tells one complete story and is one unified and self-contained package.

      In other words, "a book".

      LK

      --
      "Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
    2. Re:Bad performance is built in. by BigZaphod · · Score: 1

      Uhh. yeah. What's your point?

    3. Re:Bad performance is built in. by Lord+Kano · · Score: 1

      Uhh. yeah. What's your point?

      To point out the irony of the situation. Here we are talking about performance mattering and you use 14 words in the place of 2.

      LK

      --
      "Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
    4. Re:Bad performance is built in. by BigZaphod · · Score: 1

      Ohhhh.. Ok. :-) Hee hee...

    5. Re:Bad performance is built in. by Nasarius · · Score: 1

      How is this insightful? Do you really honestly believe that your code would be faster if you wrote a giant main() function rather than a few dozen classes? If so, you're incredibly ignorant about how compilers actually work.

      --
      LOAD "SIG",8,1
    6. Re:Bad performance is built in. by ahem · · Score: 1

      That's kind of an interesting idea. Sort of the ultimate in unrolled loops. Follow every possible execution path and translate all storage into global variables and string it all together with gotos.

      Hm.

      It's early, so maybe this sounding like a good idea is only my first-thing-in-the-morning fuzz keeping me from seeing the problem.

      --
      Not A Sig
  48. Followed your link by ccoakley · · Score: 3, Insightful

    1. Is that your sig or is that part of your comment? If it is part of your comment, please explain why it would give me a whole new view on performance. If it's your sig, then spooky how it was related to the topic.

    2. Assuming your stuff is good, when are you going to code up SHA-1 (*MY* favorite hash)?

    3. On the server side of things, I would argue that correctness is more important than otherwise. If an app crashes 1 in 100 times for a desktop user, the developer blames windows and the user is satisfied (don't flame me on this, please). On the server, if the app crashes 1 in 100 times, it may bring down the transactions for 100s of users, making things very bad for the developer. For non-crash correctness problems, consider a problem which makes a minor, but cumulative error in subsequent runs. That would likely be disastrous for the server situation.

    As far as clarity, find me one developer who has taken over a project and not complained about the quality of the inherited code ever. Seriously. (that's not directed at parent)

    --
    Network Security: It always comes down to a big guy with a gun.
    1. Re:Followed your link by rd4tech · · Score: 1

      1. Part of the comment. My argument was that performance does matter. I've seen animations for DOS on my old 486 that operate faster with 2D bitmaps than a standard windows start menu on some older PII computers. When coding anything, most of the code should be done to be fast. On the other hand, when $$$ are there just to do the job quick, it is a whole new story.

      2. My next project is supposed to be either an encryption or a hash algorithm. I still haven't decided though. On the other hand if I hear few more SHA comments I'll make my pick really soon :)

      3. I agree.

      4. I've yet to see that. But we all think we are unique, just like everyone else :)

    2. Re:Followed your link by Lord+Kano · · Score: 1

      1. Is that your sig or is that part of your comment? If it is part of your comment, please explain why it would give me a whole new view on performance. If it's your sig, then spooky how it was related to the topic.

      Take a look at the site he's talking about. It looks like he just made it. Most of the links take you to pages that say "Section to be available soon."

      LK

      --
      "Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
    3. Re:Followed your link by ccoakley · · Score: 0

      1. Actually, my question was more intended as a probe into your target market (although re-reading it I feel it sounds rather rude. Sorry). I rarely use hashes that need to process large amounts of data quickly. For work, I've used hashing to store passwords. For school/research, I had to hash a large list of keywords. For home use, I actually check my checksums on downloaded stuff (sure I do).

      So, I have single use small, many use small, and single use large. I am trying to imagine a scenario where I need a super high performance hash (well, I'll give you one: a commercial product comparable to tripwire). I imagine that your market would be people who hash lots of large blobs. If it is a single large chunk, then a slow hash is merely an inconvenience. If it is lots of small chunks, then IO probably dominates processing (in fact, when is this not the case?).

      Have you encountered such a situation yourself? Is business good (as other poster commented, it appears as if you are just starting, so this is a positive-only litmus test)? Who is your target customer? Akamai maybe (any mirror-for-profit business)?

      2. For reasons discussed in 1, I suggest you do encryption first. Fast encryption is always good.

      --
      Network Security: It always comes down to a big guy with a gun.
    4. Re:Followed your link by rd4tech · · Score: 1

      It's a student-tuition situation :), so instead of spamming people for money, I decided to do something I have skills and experience in.

      The website is new so all of my expectations are yet to be seen if they'll come true.

      To answer your question, my guess (business dictionary: market estimation analysis :) ) was that coders do need fast data algorithms, and I had UltraHash (research paper and everything) prepared for CEC2004 (two prizes), so I started with it and with md5 after that. I noticed that I like that kind of abstract thinking quite a lot (optimization problems and such), so I dug into it.

      IO really dominates the process of working on small chunks, however if you start prefetching your data from memory into the cache, the effect will be reduced a bit. From my experience, the speed gain is up to 5-10% when accessing a linear array, just by telling the processor to load something beforehand.

      As for the target market, there are some things I'm planning to do with that website. I promise they'll be fun to hear/take part in :)

    5. Re:Followed your link by BlackHawk-666 · · Score: 5, Insightful
      When coding anything, most of the code should be done to be fast

      This may be true when you are producing libraries of math routines and similar stuff like you are doing. It doesn't hold an ounce of water when you do the sort of work I do. My projects are generally medium sized, mixed languages, developers of all different skill levels. Code clarity is far more important for 98% of the stuff we do. I need my juniors to be able to follow the code the seniors write, even if they can't write it themselves. The other 2% of the time it's fine to sacrifice clarity for speed to get the performance to an acceptable level on the target platform.

      I have generally found that clear code is usually good code, so long as you are aware of the cost implications of your design decisions. For instance, I seem to recall the bubble sort (mentioned earlier) was actually faster than a qsort under some circumstances. Deep data knowledge would help you to make the decision as to which would need to be used...don't just reach for that qsort, it may be the fastest under most cases, but not all.

      --
      All those moments will be lost in time, like tears in rain.
    6. Re:Followed your link by Anonymous Coward · · Score: 0

      Specifically, qsort is fastest when you have a reasonably large chunk of *unsorted* data. It's slow on sorted data, so not very suited for insertion of single elements, for instance.

    7. Re:Followed your link by ecb29 · · Score: 1
      I seem to recall the bubble sort (mentioned earlier) was actually faster than a qsort under some circumstances.
      Bubble sort is an O(n^2) algorithm, and quicksort is an O(n log n) algorithm on average. Given that quicksort will have a larger constant time factor, we can compare something like f(n) = 6n^2 to g(n) = 32 n log n and find that the two curves cross at about n = 14.12 (taking log as the natural logarithm), so for arrays of roughly 14 or fewer elements, bubble sort will actually be faster.

      So, the use of "inefficient" algorithms needs to be taken with a grain of salt depending on the context in which you are using those algorithms. If you need to sort short lists, bubblesort will work just fine!
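
The crossover between the two example cost functions can be checked numerically. The sketch below uses the constants 6 and 32 from the comment, which are illustrative and not measurements; the function names are made up for this example.

```python
import math


def bubble_cost(n: int) -> float:
    """Example cost model for bubble sort: 6 * n^2."""
    return 6 * n * n


def quick_cost(n: int, base: float = math.e) -> float:
    """Example cost model for quicksort: 32 * n * log(n) in the given base."""
    return 32 * n * math.log(n, base)


def largest_n_where_bubble_wins(base: float = math.e, limit: int = 1000) -> int:
    """Return the largest n in [2, limit] where the bubble model is cheaper."""
    best = 0
    for n in range(2, limit + 1):
        if bubble_cost(n) < quick_cost(n, base):
            best = n
    return best
```

With the natural logarithm the function returns 14, which matches a crossover near 14.12. With a base-2 logarithm the same constants give 24, so the exact cutoff depends heavily on the assumed model and is best found by measurement.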

    8. Re:Followed your link by lederhosen · · Score: 1
      So, the use of "inefficient" algorithms needs to be taken with a grain of salt depending on the context in which you are using those algorithms. If you need to sort short lists, bubblesort will work just fine!


      And quicksort will work just fine too. Sometimes O(n^2) will *not* work. Therefore never use bubblesort.
    9. Re:Followed your link by CastrTroy · · Score: 1

      If you're sorting 10 or so elements, any algorithm will do fine on today's computers; no noticeable performance difference will be seen. You could probably even use random sort, and with any luck, it would work pretty quickly.

      If you know you are only going to be sorting < 1000 items, and that your code will only be run on pentium 3's, coding a correct and readable sorting algorithm is probably more important than programming a quick algorithm. Unless you're in the business of designing a database server, where you know your code will be used to sort 1 million items, the sort you use probably doesn't matter.

      Although I think the best solution is to use standard built in sorting algorithms whenever possible, as these are probably quite efficient, and also correct.

      --

      Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
    10. Re:Followed your link by BlackHawk-666 · · Score: 3, Informative
      I should have qualified my statement a little better, and I suspect qsort vs bubblesort is not the best illustration possible. Each sort algorithm has strengths and weaknesses e.g. easy to implement but slow to run, ruthlessly difficult to code but fast as hell, good at sorting random data but worst case scenario on near sorted data. If qsort were always faster than every other algorithm then we wouldn't still be talking about them. QSort is generally faster than most other sorts, and the O(n log n) is an average sort cost, not a guarantee.

      When I was in uni our lecturer gave us an example from the QU campus where he used to lecture. There was a computer (remember, this is back in the eighties) that needed to sort rather a lot of data and it took three days to do it with the qsort algorithm. The main problem was, I believe, due to memory restrictions i.e. all the data could not fit into memory at once. It was recoded to use a different algorithm, one that could work from disk and in small chunks, and ran orders of magnitude faster. The recoded algorithm was theoretically slower, but faster in actuality due to the nature of the data and the machine it had to run on.
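
The comment does not say which algorithm replaced quicksort, so the following is only one plausible shape for "work from disk and in small chunks": an external merge sort. It sorts chunks that fit in memory, writes each one to a temporary file, then merges the files with `heapq.merge`. The function names and the one-integer-per-line run format are assumptions for this sketch.

```python
import heapq
import os
import tempfile
from typing import Iterable, Iterator, List


def _write_run(sorted_chunk: List[int]) -> str:
    """Write one sorted chunk to a temporary file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".run", delete=False) as tmp:
        for value in sorted_chunk:
            tmp.write(f"{value}\n")
        return tmp.name


def _read_run(path: str) -> Iterator[int]:
    """Stream a sorted run back from disk, one value at a time."""
    with open(path, "r") as handle:
        for line in handle:
            yield int(line)


def external_sort(values: Iterable[int], chunk_size: int = 1000) -> Iterator[int]:
    """Yield values in sorted order, holding about chunk_size items in memory."""
    run_paths: List[str] = []
    chunk: List[int] = []
    try:
        for value in values:
            chunk.append(value)
            if len(chunk) >= chunk_size:
                run_paths.append(_write_run(sorted(chunk)))
                chunk = []
        if chunk:
            run_paths.append(_write_run(sorted(chunk)))
        # heapq.merge lazily merges already-sorted iterables.
        yield from heapq.merge(*(_read_run(path) for path in run_paths))
    finally:
        for path in run_paths:
            os.remove(path)
```

The total work is still O(n log n), but every pass over the data is sequential, which is the property that matters when the data set is larger than physical memory.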

      --
      All those moments will be lost in time, like tears in rain.
    11. Re:Followed your link by Anonymous Coward · · Score: 0
      In many cases, quicksort is written as a recursive algorithm. Let's say you have the following call within your QuickSort function:

          QuickSort(p_data, length);

      If you also have a BubbleSort function, you could just do this:

          if (length < 15) {
              BubbleSort(p_data, length);
          } else {
              QuickSort(p_data, length);
          }

      So with only a minor modification, you can get the best of both worlds.

    12. Re:Followed your link by dtfinch · · Score: 1

      What a lot of people will do is take a quicksort, and when start and end are close enough together, use either a bubble or insertion sort. In that case, using an O(n^2) sort can actually speed up sorting of very large arrays, often by quite a bit.
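
A minimal Python sketch of the hybrid described in the last two comments: quicksort that hands small subranges to insertion sort. The cutoff of 15 is taken from the snippet above and should be tuned by measurement; the median-of-three pivot and the function names are choices made for this example, not something the posters specified.

```python
from typing import List

CUTOFF = 15  # subranges shorter than this go to insertion sort


def insertion_sort(items: List[int], lo: int, hi: int) -> None:
    """Sort items[lo..hi] (inclusive) in place."""
    for i in range(lo + 1, hi + 1):
        current = items[i]
        j = i - 1
        while j >= lo and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current


def hybrid_quicksort(items: List[int], lo: int = 0, hi: int = None) -> None:
    """Sort items in place: quicksort on large ranges, insertion sort on small."""
    if hi is None:
        hi = len(items) - 1
    while lo < hi:
        if hi - lo + 1 < CUTOFF:
            insertion_sort(items, lo, hi)
            return
        # Median of first, middle and last element as the pivot value.
        mid = (lo + hi) // 2
        pivot = sorted((items[lo], items[mid], items[hi]))[1]
        i, j = lo, hi
        while i <= j:
            while items[i] < pivot:
                i += 1
            while items[j] > pivot:
                j -= 1
            if i <= j:
                items[i], items[j] = items[j], items[i]
                i += 1
                j -= 1
        # Recurse into the smaller part, loop on the larger one,
        # which keeps the recursion depth logarithmic.
        if j - lo < hi - i:
            hybrid_quicksort(items, lo, j)
            lo = i
        else:
            hybrid_quicksort(items, i, hi)
            hi = j
```

Calling `hybrid_quicksort(data)` sorts the list `data` in place. In practice the built-in `sorted` is the better choice in Python; the sketch only shows how the two algorithms fit together.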

    13. Re:Followed your link by Anonymous Coward · · Score: 0

      A classic example where insertion sort (whose best case asymptotic time is Omega(n)) is better than, say, an asymptotically optimal sort like heap sort, which is Theta(n log n), is sorting checks for an account at a bank. The application data is nearly sorted, making insertion sort a better fit for the application.

      The issue is good understanding of theory and knowing the application.
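
The effect of nearly sorted input on insertion sort can be made visible by counting element shifts. The sketch below is illustrative; the name `insertion_sort_counting` and the check-number data are invented for the example.

```python
from typing import List, Tuple


def insertion_sort_counting(items: List[int]) -> Tuple[List[int], int]:
    """Return a sorted copy of items and the number of element shifts made."""
    result = list(items)
    shifts = 0
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]   # move the larger element one slot right
            shifts += 1
            j -= 1
        result[j + 1] = current
    return result, shifts


def nearly_sorted_checks(count: int = 1000) -> List[int]:
    """Hypothetical check numbers: in order except for two adjacent swaps."""
    checks = list(range(count))
    checks[10], checks[11] = checks[11], checks[10]
    checks[500], checks[501] = checks[501], checks[500]
    return checks
```

The shift count equals the number of out-of-order pairs in the input: 0 for sorted data, 2 for the nearly sorted checks above, and n(n-1)/2 for reversed data. That is why the same algorithm ranges from linear to quadratic depending on what it is fed.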

    14. Re:Followed your link by E_elven · · Score: 1

      >Most of the links take you to pages that say "Section to be available soon."

      Optimization at work :)

      --
      Marxist evolution is just N generations away!
  49. A few points that come to mind... by ivec · · Score: 2, Interesting

    - Decoding a RLE data buffer is short of impressive as a benchmark. RLE was designed as a simple and specific (generally inefficient) compression approach for age-old hardware (i.e. 8MHz, not 333MHz as the base system used here).
    How about JPEG or PNG ?

    - The author actually spent several iterations optimizing this Erlang code. And these optimizations required handling special cases. (So performance eventually did matter to the author?) Now, would a 'first throw' implementation in C/C++ have been written faster while immediately performing better than the Erlang version? (simpler code)

    - I agree that the compiled/interpreted code performance matters less and less, because processors are so much more powerful. For instance, the processing for RLE decompression should in any case be negligible wrt the memory or disk i/o involved.
    What is becoming increasingly important, however, is the data structures and algorithms that are used. In this perspective, C++ still shines, thanks to the flexibility that its algorithms and containers library provides.
    C++ offers both a high level of abstraction (working with containers), and provides the ability to convert to a different implementation strategy with ease - if and when profiling demonstrates a need.
    For large system and library development, the strong static typing of C++ is also a real plus (it doesn't matter to me whether it is faster than dynamic typing or not).

    I totally agree that performance should not be a concern during program implementation (other than avoiding 'unnecessary pessimization', which involves the KISS principle and knowledge of language idioms). Optimization should only be performed where the need for a speed-up has been demonstrated.
    Other than saying "wow this interpreted language runs damn fast on current hardware", this article does a poor job at making any relevant point.

    radix omnia malorum prematurae optimisatia est -- Donald Knuth

  50. Performance is IMPORTANT by Jason+Pollock · · Score: 4, Informative

    I am hearing a lot of people saying that you shouldn't optimise prior to the first release. However, it is very easy to select a design or architecture that limits your high end performance limit. Therefore, there is some optimisation that needs to be done early.

    When you're architecting a system that is going to take tens of man years of effort to implement, you need to ensure that your system will scale.

    For example, a project I recently worked on hit a performance wall. We had left optimisation for later, always believing that it shouldn't be done until last. However, while the architecture chosen was incredibly nice and clear, it limited the performance to one third of what was required. Back to the drawing board, we just doubled the project cost - ouch.

    Even worse, there are performance differences on each platform! For example, did you know that throwing an exception is 10,000 times slower than a return statement in HP/UX 11? Solaris is only a little better at 2 orders of magnitude. Linux is (I understand) a dead heat.

    So, while low-level optimisation of statements is silly early in the project, you do need to ensure that the architecture you choose is going to meet your performance requirements. Some optimisations are definitely necessary early in the project.

    The article also talks about tool selection, suggesting that the extra CPU could be better used to support higher level languages like Erlang. If a system has CPU to spare, I agree, use what you can. The projects I work on always seem to lack in CPU cycles, disk write speed, and network speed. You name it, we're short of it. In fact, a large part of our marketing strategy is that we are able to deliver high performance on low end systems. What would happen to us if we dropped that edge? We're working with a company that has implemented a real-time billing system in Perl. Not a problem, until you try and send it 500 transactions/second. Their hardware budget? Millions to our 10s of thousands. Who do you think the customer likes more?

    Jason Pollock
    1. Re:Performance is IMPORTANT by ben_ · · Score: 1

      For example, a project I recently worked on hit a performance wall. We had left optimisation for later, always believing that it shouldn't be done until last


      I don't think the paper, or for that matter anyone I've read on the subject, advocates leaving all optimisation until last. Performance is affected by design, as by many other aspects of a system, and the design should have included an analysis of how it might affect performance.

      The level of optimisation that is better left until later on in development is the tweaking of code to gain performance and remove inefficiency. Programming time is always a limited resource in any software engineering project, and time spent on unnecessary optimisations is time not spent on debugging, testing or useful refactoring. That's the lesson to take away. IMVHO, natch ;)

      --
      ben_ the technologist and platform agnostic
    2. Re:Performance is IMPORTANT by haral · · Score: 1

      For example, did you know that throwing an exception is 10,000 times slower than a return statement in HP/UX 11?

      Unless you were throwing exceptions to return values, this is not a fair comparison.

      I would compare throwing exceptions to returning a value and testing for an error code every time you call a function regardless of whether an error occurred or not. If you have a deep call stack, you will also have to check for an error at multiple levels, manually propagating them.

    3. Re:Performance is IMPORTANT by Jason+Pollock · · Score: 1

      The system was throwing an exception whenever a socket didn't have the requested number of bytes available. So, it would happen pretty frequently - every message. It was very pretty code though. :)
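
The two styles under discussion can be put side by side. This Python sketch is not the system described above (which was presumably C++ on HP/UX), and the relative costs in Python are different; it only shows the design choice of treating "not enough bytes yet" as an expected result instead of an exception. All names here are hypothetical.

```python
from typing import Optional


class NotEnoughData(Exception):
    """Raised by the exception-based reader when the buffer is too short."""


def read_or_raise(buffer: bytearray, count: int) -> bytes:
    """Exception style: a short buffer is reported by raising."""
    if len(buffer) < count:
        raise NotEnoughData(f"need {count} bytes, have {len(buffer)}")
    data = bytes(buffer[:count])
    del buffer[:count]
    return data


def read_or_none(buffer: bytearray, count: int) -> Optional[bytes]:
    """Return-value style: a short buffer is an ordinary outcome (None)."""
    if len(buffer) < count:
        return None
    data = bytes(buffer[:count])
    del buffer[:count]
    return data
```

When the short-buffer case happens on every message, as described above, the return-value form keeps the common path free of exception handling. Exceptions remain a reasonable fit for conditions that really are rare.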

      Even if the exception doesn't happen frequently, the systems that I work on have a definite preference for worst-case performance close to average performance. Determinism is _very_ important. So it is worth a performance loss to ensure that when things start to go bad they don't go _really_ bad. :)

      That's what you get when there's a high likelihood of your company being on the national news if your software goes bad. Heh. :)

      Jason Pollock

    4. Re:Performance is IMPORTANT by Jason+Pollock · · Score: 1

      I agree that the paper doesn't argue that. It was however, a view being put forward by other posters, and one I felt needed to be argued against. :)

      You are entirely correct that more analysis should have been done. Most of the problems could have been simulated. But then, that would have doubled the cost of the project to the end result anyways. :) It would have just been a bit more comfortable along the way.

      I am a HUGE fan of throwing hardware at a problem. I never understand why companies are willing to spend 100 person days of effort (with a loaded labour cost of $1k/day) to fix a problem that could be solved by spending $50k on hardware. Just doesn't make sense to me...

      As for the tweaking of code. Yep, that should be left until you run into a problem. But there are always exceptions. :)

    On the same project, we had a library that would hide the fact that certain configuration information was stored in the database. This information was retrieved every time the function was called. Not really a problem, until you realise that the function was called several times/transaction. It was another foreseeable problem that should have been optimised earlier. More time was spent tracking down the problem (even before the fix was applied) than would have been spent doing it correctly the first time.
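
The comment does not say how the configuration lookup was eventually fixed. One common approach is to cache the values, sketched here with Python's `functools.lru_cache`. The dictionary standing in for the database, the counter, and the function names are all assumptions for the example.

```python
from functools import lru_cache

# Stand-in for the real database; the keys and values are hypothetical.
_FAKE_DB = {"timeout_seconds": "30", "region": "nz"}

# Counts simulated database round trips so the effect is observable.
STATS = {"queries": 0}


def fetch_setting_uncached(name: str) -> str:
    """Simulates the original behaviour: one database query per call."""
    STATS["queries"] += 1
    return _FAKE_DB[name]


@lru_cache(maxsize=None)
def fetch_setting_cached(name: str) -> str:
    """Queries the database only the first time each name is requested."""
    return fetch_setting_uncached(name)
```

Five calls to `fetch_setting_cached("region")` cost one query instead of five. The trade-off is staleness: if the configuration can change while the process runs, something has to call `fetch_setting_cached.cache_clear()` or the cache needs an expiry policy.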

      People really do need to design and write software with the performance limiters of their platform in mind. On the platform that I've worked on, everything was always O(number of database operations) + O(number of network round trips), nothing else mattered. :)

      Jason Pollock

  51. Throw hardware at it. by aiyo · · Score: 2, Interesting

    My software engineering prof. believes that optimization should never be done during a project. Instead he thinks the programmer should wait until the project is complete, then give careful consideration as to whether to optimize or not. He says most problems can be fixed by upgrading to better hardware and hours of optimization are not worth 3-4k more in hardware costs. I thought he was crazy to preach this during lecture. What do you guys think? Would you spend a day designing a better algorithm or finish the project and buy faster hardware?

    1. Re:Throw hardware at it. by defile · · Score: 3, Insightful

      Well, that depends.

      You probably picked the simplest, dumbest algorithm and probably used the most basic data structure. Why do all of the hard work when you don't even know if the easy work will suffice?

      If they don't suffice, your options are to develop your own algorithm/find a better one and a more natural data structure, or to throw hardware at it. Chances are, you won't be lucky enough that you can just upgrade so you'll have to spend valuable programmer time implementing a more complex algorithm that will need more careful maintenance that is likely to have more bugs that is probably less robust. You'll probably have to convert the data to a more machine-friendly format. Maybe you'll have to inconvenience the user or ship a lot of precompiled data. Whatever.

      It's rare that the easy algorithm is slow enough that it won't do as-is, but fast enough that doubling cpu power makes it tolerable. Usually there are orders of magnitude differences between the "best" algorithm and the easy algorithm, and only incremental speed bumps in computer offerings.

      On the other hand, maybe with an extra GB of RAM you'll never have to touch swap. Maybe that's good enough. ;)
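
      To put a rough number on the "orders of magnitude" point, here is a small sketch comparing operation counts for a quadratic algorithm and an n log n one. The input size and the idealised cost models are assumptions for illustration, not measurements.

```python
import math


def speedup_from_better_algorithm(n: int) -> float:
    """Ratio of idealised operation counts: O(n^2) versus O(n log n)."""
    quadratic = n * n
    linearithmic = n * math.log2(n)
    return quadratic / linearithmic


n = 1_000_000
algorithm_gain = speedup_from_better_algorithm(n)
hardware_gain = 2.0  # a machine twice as fast

print(f"better algorithm: ~{algorithm_gain:,.0f}x fewer operations")
print(f"faster hardware:  {hardware_gain:.0f}x")
```

      For a million items the algorithmic ratio comes out at roughly 50,000, which is why doubling the CPU rarely rescues the easy algorithm once it has become too slow.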

    2. Re:Throw hardware at it. by SatanicPuppy · · Score: 4, Interesting

      Depends on how bad it is. I've seen stuff that runs so slow there really isn't a way to throw more hardware at it. Of course that was written by a guy who had two goals: 1) to make sure no one but him could support his work, and 2) to do as little work as possible.

      I don't know. Clean, elegant, functional code is beautiful. If you're ever going to have to work on it again, I think it's better for it to be clean and optimized.

      Also depends on the size of the app. With a small app, what excuse do you have for not optimizing? Wouldn't take that long. With a big project? Depends on your work environment.

      The bosses will never know if it's optimal or not. If you tell them you've maxed out the server, they just think you write big badass code. A lot of times though, there isn't time to thoroughly bug check a big app (That's what users are for, eh?), much less optimize it.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    3. Re:Throw hardware at it. by BlackHawk-666 · · Score: 2, Informative

      My charge out rate at my last company was 1000/day (about $1500USD) so I'm going to say yes to the optimisation in this case, because it is cheaper. If it were going to take me 3-4 days I'd say get the better hardware and try to keep the code as lean and as fast as possible without wasting too much time trying to wring it for performance.

      --
      All those moments will be lost in time, like tears in rain.
    4. Re:Throw hardware at it. by dtfinch · · Score: 1, Insightful

      To answer your question, whichever's cheapest, but taking into account that hardware costs may be recurring as you deploy the code on other systems, adding more servers only increases the volume that can be handled, and might not improve the response time of a slow loading page, and slow pages push away customers, which costs money. But the professor's right about waiting on many optimizations until you're finished. Code changes, and code you've spent hours optimizing may not make it into the final product, or may need to be altered in ways that require some optimizations to be undone.

      Most of my server side code (javascript) works fast enough on the first try, with most of the optimization being in the design, rather than in coding tweaks. I'll do other simple optimizations as I see them if there's a noticeable slowdown and they won't adversely affect the clarity, maintainability, or reusability of the code. My biggest optimization on recent sites was an AES encryption/decryption library written in javascript, which I managed to speed up several fold. I've done research to determine the optimal tweaks for our servers, to get speedups without writing better code. Client side things that work fast enough from the start I don't bother to optimize. 90% of the time, reusable code is much better than efficient code. Most of what I write is 10x as fast as it needs to be, and with our web sites, bandwidth is the most limiting factor. I do optimize a bit to reduce bandwidth, but not excessively. And we spend roughly $0 a year on server hardware, and have since probably the year 2000.

      There's one last optimization that we're planning on making that will eliminate 99% of the need for future optimization, allow us to write horribly inefficient code in favor of code reuse (such great reuse that our non-programmers can jump into development), and it'll give us maximum performance from our database driven web sites. It's a home-grown site generator, for all the pages that are created from a database but are otherwise static. The use of static pages means they can be gzipped and cached on the server, allowing for near instantaneous page loads with low server overhead.
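
      A site generator along those lines might look something like the sketch below. It is not the poster's tool: the `PRODUCTS` rows, the page template, and the URL scheme are all made up for illustration, and an in-memory list stands in for the database.

```python
import gzip

# Hypothetical rows, standing in for a database query result.
PRODUCTS = [
    {"slug": "widget", "name": "Widget", "price": "9.99"},
    {"slug": "gadget", "name": "Gadget", "price": "19.99"},
]


def render_page(row: dict) -> str:
    """Build the HTML for one record; runs at build time, not per request."""
    return (
        "<html><body>"
        f"<h1>{row['name']}</h1><p>Price: ${row['price']}</p>"
        "</body></html>"
    )


def build_site(rows: list) -> dict:
    """Render every page once and keep a gzipped copy, keyed by URL path."""
    site = {}
    for row in rows:
        html = render_page(row).encode("utf-8")
        site[f"/{row['slug']}.html.gz"] = gzip.compress(html)
    return site


site = build_site(PRODUCTS)
print(sorted(site))
```

      Because the rendering cost is paid once per rebuild rather than once per request, the generator itself can be written for reuse and clarity without affecting page load times.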

      Then there's the CS side of the fence:

      Right now I'm working on a capstone project that will be one big exercise in optimization, being extremely 3d, memory, cpu, and hard disk intensive, working with data that's too big to fit into memory without culling but needs to be rendered at high frame rates anyway, so everything I just said can go out the window for the next couple months.

    5. Re:Throw hardware at it. by the_thunderbird · · Score: 1

      I have always found that if you keep your code as simple as possible (C/C++), it's a lot more stable and a hell of a lot faster! Plus abstraction (mentioned somewhere around here) is your friend. For example, I wrote a database class (C++) for a Point of Sale application. I kept it very simple by creating a simple front end to any kind of database; the abstraction in it is great, as all I have to do is add an extra case to a switch and then add the corresponding enum and function! But implementing it is a dream! Also, the best piece of advice I can give to a programmer is to stick to using integers; enums are great! Be careful with new (if you use C++) or find an API that can manage memory allocation for you like QT or GTKMM -> (my favourite!!). Another performance hint: stay away from too many embedded loops. First of all, they make your code ugly; second of all, if you make one mistake, say hello to mister reboot because your processor usage hits the roof!

    6. Re:Throw hardware at it. by EastCoastSurfer · · Score: 3, Insightful

      I hesitate to first throw hardware at the problem, but I do agree that optimizations generally should be left as the last thing to do in a project. Code should be written first to be readable and correct. Once those goals have been met, testing and profiling will find the few areas that are critical and may need some optimization.

      The problem your prof is probably trying to get you to avoid is wasting time tuning code that rarely gets executed. It comes down to the old 80/20 rule. Sure, you can spend weeks hand tuning some import routine, but all your time was wasted if that import is only run once a month, at night while the system is offline.

    7. Re:Throw hardware at it. by chooks · · Score: 1

      He says most problems can be fixed by upgrading to better hardware and hours of optimization is not worth 3-4k more in hardware costs.

      This only works if hardware is actually the bottleneck. This doesn't help if your object lifecycles are so short that you are doing major garbage collections every two minutes and loading down your app server, or your SQL does full table scans and updates primary keys such that your DB is operating at 90% capacity all the time.

      Also, I would like to live in the fantasy land of 3-4K of hardware costs. Let's say you are running Oracle and need to add CPUs -- sure the hardware and installation might not be that bad (although even 3-4K is way on the low side when you factor in downtime, overtime, etc...), but wait till you get the licensing hit from the additional CPU. We're talking tens of thousands (if not hundreds of thousands, depending on the scale) of dollars.

      Granted - I think that your prof's point is that premature optimization or blind optimization (w/o knowing where the bottlenecks are) is generally a waste of time/money. I am not sure what the "right" time to optimize is (generally when the customers complain, right? :) but ideally the code would be written such that blatant inefficiencies are avoided (e.g. using the built-in quicksort instead of writing your own bubble sort). However, thinking that throwing hardware at it is going to fix your performance problems is pretty much deluded thinking.

      --
      -- The Genesis project? What's that?
    8. Re:Throw hardware at it. by borgboy · · Score: 1

      Performance requirements are like any other aspect of the requirements gathering process - they have to be identified and quantified. If the customer needs a particular level of performance, it IS incumbent upon the developers to design that performance into the system and test it as rigorously as any "functional" requirement.
      That said, optimization is a somewhat loaded word. Good engineering means determining the best tradeoff between cost (readability and fault risk) and benefit (performance).

      --
      meh.
    9. Re:Throw hardware at it. by revscat · · Score: 1

      Would you spend a day designing a better algorithm or finish the project and buy faster hardware?

      It would depend upon the cost/benefit of each.

      Funny you should ask this. Where I work we have a bunch of AS/400's running on the back end handling all of our raw data. The code on these boxes is terrible: it works, but has been written by inexperienced coders over many years, and is basically a nightmare of obfuscation and redundant code. Well, these things typically run at around 80-95% utilization, depending upon the hour of the day. We recently had a project whose sole purpose was to speed up the AS/400 code, and we realized a whopping 200% improvement over all servers. Those servers went from ~85% utilization to an average of 30%.

      Now, AS/400's are a bit more expensive than eMacs on eBay. Just a bit. So the time we spent optimizing this horrible, horrible code was worth it to the tune of several tens of thousands of dollars. If we had instead added another AS/400 to the cluster, we would have had to (a) purchase the system, (b) upgrade the support contract, (c) inject it into the cluster, (d) provide internal resources to monitor it and do basic support.

      So in this case, anyway, optimizing the code was far cheaper than upgrading the hardware, especially in the long term. As our business grows, we will need to add/upgrade hardware at a slower rate due to this refactoring effort.
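
      As a back-of-the-envelope check on those figures, assuming the workload stayed the same before and after the rewrite (the post does not say so explicitly), the drop in utilization translates into capacity like this:

```python
def capacity_multiple(util_before: float, util_after: float) -> float:
    """How many times more of the same work the hardware can now absorb."""
    return util_before / util_after


gain = capacity_multiple(0.85, 0.30)
print(f"{gain:.2f}x capacity, {100 * (gain - 1):.0f}% improvement")
# prints "2.83x capacity, 183% improvement"
```

      Under that assumption, going from ~85% to ~30% utilization is in the same ballpark as the quoted 200% improvement.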

    10. Re:Throw hardware at it. by arkanes · · Score: 3, Insightful
      The most important reason to wait is because, almost inevitably, the part that you THINK is slow is not the part that actually hangs you up. You may spend an extra couple days working on your super-fast optimized sort & data structure only to find when you deploy that your bottleneck is RAM usage and all your clever caching is just slowing stuff down. Another good example is earlier in this thread, with the super-fast optimized MD5 libraries - spending money or time writing/buying those libraries if your data set is IO bound doesn't make much sense.

      Optimization is great, but profiling to make sure that your optimization isn't wasted is more important.

    11. Re:Throw hardware at it. by Glonoinha · · Score: 1

      Just remember, you get almost a 100% increase in performance every year for free - hardware pretty much doubles in performance every year. I remember reading somewhere some really big dataset was being processed (recreating at the molecular level everything within a couple hundred yards of a nuclear explosion or something like that) that was expected to run for 12 years. After four years they stopped the run, ported everything to a new computer that was four times faster, re-ran the program and it was done in 3 years, five years ahead of schedule - and they didn't change a line of code. Granted that was probably a fabricated story but it does make a point.

      --
      Glonoinha the MebiByte Slayer
    12. Re:Throw hardware at it. by bobaferret · · Score: 1

      The best investment a coder can make is in a profiler. Write your clean code, doing it by the book, with some basic clear optimizations thrown in. Then see if it's fast enough. If it isn't, then throw a profiler at it. This is the quickest way to see where your code is killing time. Every time we use one we see 1000% improvements after a day or two's (at most) work. You get a nice pointer to where your code is working the hardest. There is also the benefit that you only optimize the code that you need to, so the rest of your code doesn't end up with useless optimizations that have no real effect and only serve to obfuscate your code.

    13. Re:Throw hardware at it. by An+Onerous+Coward · · Score: 3, Insightful
      Also depends on the size of the app. With a small app, what excuse do you have for not optimizing? Wouldn't take that long. With a big project? Depends on your work environment.
      The level of optimization needed for small projects varies wildly. If it's a one-shot deal to allow one secretary to generate a report twice a week, who really cares if it takes two seconds or twenty? Even if you assume that's thirty-six seconds every week, it's going to take years of use before it would have been worthwhile to optimize it.

      On the other hand, if it's something that hundreds of people are going to be using four or five times a day, then it's probably worthwhile to do some algorithmic/data structure improvements.

      Finally, you get the extreme case: some library that will end up being used by millions. Those are the times when you want to eke out every bit of performance you can. The size of the project doesn't always determine its importance, nor does the importance of the project always determine how much optimization is needed.
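
      The break-even reasoning above can be written out as a quick calculation. The one hour of developer time, the 200 users, and the four runs per working day are assumptions picked for illustration; only the 18 seconds saved per run (twenty versus two) comes from the example in the post.

```python
def weeks_to_break_even(seconds_saved_per_run: float,
                        runs_per_week: float,
                        dev_hours: float) -> float:
    """Weeks of use before time saved equals time spent optimizing."""
    saved_per_week = seconds_saved_per_run * runs_per_week
    return dev_hours * 3600 / saved_per_week


# One secretary, two runs a week, 18 seconds saved each time.
print(weeks_to_break_even(18, 2, dev_hours=1))  # 100.0 weeks

# Assumed: 200 users, four runs each per working day, five days a week.
print(weeks_to_break_even(18, 200 * 4 * 5, dev_hours=1))
```

      With those assumptions, an hour of tuning takes about two years to pay off for the single secretary, and a small fraction of a week for the shared tool.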
      --

      You want the truthiness? You can't handle the truthiness!

    14. Re:Throw hardware at it. by mysticgoat · · Score: 1

      Back in the day, I was taught this approach:

      1. Program something that works.
      2. Throw away what you've done and program something that works right

        Only after the above is done:

      3. Gross profile at the module level (remember HIPO charts?) for the slow spots, then program something that works better by rewriting bottlenecks with better algorithms and data structures.
      4. Only then profile and optimize at the code level-- if that is actually required.

      The main product of #1 is a detailed binding agreement with the client that will tell you when the work is done. If you use comments well, then keep those, too. But the code itself is crap: toss it. Neither you nor the client actually knew what you wanted the computer to do until you got your feet wet (one of you might think you knew, but that's the ass/u/me thing). (or the problem is truly trivial.)

      Step #2 produces a working prototype, version 0.9x, that does everything in the specification. That might be all the client wants to pay for (right now). It solves his immediate computer problem, but it might cause his business to evolve in unexpected ways. Which could mean you'll be negotiating a new contract with him rather than trying to sell him on the advantages of optimizing the prototype-- which is both more lucrative and more fun.

      A successful project might never get to step #3. So what? Especially if it generates a string of child projects...

      As to buying new hardware to solve a software inefficiency-- it depends. Would the client be replacing or expanding existing hardware pretty soon anyway? If so, the hardware solution isn't going to cost him any more than what he is already budgeting for. I think much of the time, your software engineer prof's generalization makes a lot of sense.

    15. Re:Throw hardware at it. by GlassHeart · · Score: 1
      I hesitate to first throw hardware at the problem,

      Why? If you're delivering a system, then it doesn't matter whether it's the hardware or the software doing the heavy lifting. $1,000 hardware with $100 software is just as expensive as $100 hardware with $1,000 software. If one solution has a desirable property (such as fewer bugs or faster delivery time), by all means take it.

      but I do agree that optimizations generally should be left as the last thing to do in a project. Code should be written first to be readable and correct. Once those goals have been met, testing and profiling will find the few areas that are critical and may need some optimization.

      Great goals, and this is something I would tell a student of computer science. The problem is that the industry is so bad at scheduling software projects that there's rarely ever any time scheduled for optimization. In practice, the only things that get optimized are the things that are Really Too Slow, and little or no effort goes to make things run Fast.

    16. Re:Throw hardware at it. by Anonymous Coward · · Score: 0

      Also having large amounts of code inside a loop will cause what is known as thrashing. This is when two pages in memory are constantly swapped in and out because execution is looping and covering large amounts of code. Causes much IO and slows much down.

    17. Re:Throw hardware at it. by WNight · · Score: 1

      Another benefit is that optimized code is often simply reworked code - having seen where the bottlenecks are and working in a better algorithm (buying a faster MD5 library, replacing bubble-sort, etc). That code should now be much easier to fix and debug, potentially a millions-of-dollars-per-day savings that far outweighs any hardware savings.

      Re your sig: You only *think* Saddam didn't support Al Qaeda, there's no proof either way and it's the kind of thing he'd do. Like SCUDing Israel during GW1.

    18. Re:Throw hardware at it. by cr0z01d · · Score: 1

      I think 'clean' and 'optimized' are very different goals. I like for my code to work first, be clean and readable second, and be optimized third.

      With a small app, I have a very big excuse for not optimizing: in the time it takes me to figure out the bottlenecks and rewrite them, I could have gotten some actual work done on a different project. Check the Portland Pattern Repository's entry on the rules of optimization (http://c2.com):

      - First rule of optimization: don't

      - Second rule of optimization: don't, yet

      - Third rule of optimization: the bottleneck isn't where you think it is

      Here's a suggestion for them programming folk: RUN the program first, if the speed is acceptable, then don't optimize. If it isn't acceptable, don't optimize yet, profile your code. If you don't profile your code but you do optimize, I hate to say this, but you're being a moron.

      The cool thing about profiling is that it makes optimizing a large application much, much easier. Instead of staring down 100 klocs to optimize, you're looking at the 1 or 2 klocs that use 80% of the time.
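
      For anyone who has not tried it, here is a minimal sketch of the profile-first workflow using Python's standard `cProfile` and `pstats` modules. The functions `slow_lookup`, `cheap_setup`, and `main` are invented for the example; the deliberately slow part is a list membership test inside a loop.

```python
import cProfile
import io
import pstats


def slow_lookup(items, targets):
    # Membership tests on a list scan it from the start every time.
    return sum(1 for t in targets if t in items)


def cheap_setup(n):
    return list(range(n))


def main():
    items = cheap_setup(2000)
    targets = list(range(0, 4000, 2))
    return slow_lookup(items, targets)


profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Sort by cumulative time and show only the top few entries.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

      The report should point at `slow_lookup` rather than `cheap_setup`, which is the cue to fix that one spot (for instance by turning `items` into a set) and leave the rest of the code alone.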

    19. Re:Throw hardware at it. by EastCoastSurfer · · Score: 1

      I hesitate to first throw hardware at the problem,

      Why?


      I'm never opposed to throwing hardware at a problem, but many times people use hardware to try to overcome poor design. A poorly designed and architected system may appear for the short term to work OK because you put it on more hardware, but it will eventually fail because it is poorly designed. This is why I hesitate to first throw hardware at a problem.

    20. Re:Throw hardware at it. by Minna+Kirai · · Score: 1

      My software engineering prof. believes that optimization should never be done during a project.

      I hope he doesn't phrase it quite so stupidly- because according to what you just wrote, optimization will never happen. ("Once the project is complete" means that the CDs are in the mail to customers, and by then it's too late!)

      It's more likely he used one of the more sensible aphorisms, like:

      "First make it run. Then make it run correctly. Then make it run fast"

      (There is disagreement among software-engineering theorists as to what order the first two come in, but the idea that performance comes last is close to universal... barring the quite obvious exceptions)

      Would you spend a day designing a better algorithm or finish the project and buy faster hardware?

      That's a misleading question- it begs the question that optimization can be performed quickly (or even in a predictable time). If it really only took one day to make a better algorithm, of course you should do it. But the point is that it's very difficult to predict where the performance bottlenecks in a piece of software will come from... while it's relatively trivial to measure the hotspots in a running process.

      Any effort expended "optimizing" something that wasn't really a bottleneck is probably wasted.

      The ROI for $ spent "optimizing algorithms" is uncertain... but with a scalable design, it can be rather safe to predict that each additional $4k will increase speed by X jobs/sec.

    21. Re:Throw hardware at it. by Minna+Kirai · · Score: 1

      Just remember, you get almost a 100% increase in performance every year for free - hardware pretty much doubles in performance every year

      Only CPU hardware! Disks, memory, and networks certainly improve much, much more slowly.

      In days gone by, optimizing for the CPU was more important. But now, optimizing for network utilization can be both more profitable and more difficult (being that instead of rewriting a few loops in pure assembly, you need to re-think how data flows around the whole system)

    22. Re:Throw hardware at it. by Minna+Kirai · · Score: 1

      You only *think* Saddam didn't support Al Qaeda, there's no proof either way and it's the kind of thing he'd do.

      Al Qaeda supported Islamist theocracies, such as the Taleban government of Afghanistan. Saddam was too far away from Afghanistan to interact with them, so we must instead look at his behavior towards Islamic fundamentalist states that border on Iraq. Such as Kuwait, Iran, and Saudi Arabia.

      Hmm... what did he do to them??

      (Or a more simplified version: Al Qaeda called non-bearded men sinners)

    23. Re:Throw hardware at it. by WNight · · Score: 1

      The enemy of my enemy...

      I think Saddam could have gotten huge payoffs for a comparatively small donation to the terrorists, why wouldn't he have done it? Al Qaeda proved that they weren't above 'sinning' to accomplish their goals. Their suicide pilots in the USA were drinking, not observing religious rituals, etc.

      Fanatics are usually willing to overlook many things in order to accomplish their goals.

      There's no way we could prove that Saddam didn't help the terrorists and, I think, it's reasonable that he might have common goals, despite their long-term views. Because of that, it's wrong to say that we *know* that Saddam didn't help the terrorists.

      There's enough FUD from Bush and friends, the other sides don't need to play that same game.

  52. My company by Sarojin · · Score: 0, Troll

    My company which does a good bit of open source development uses very similar methodologies to those outlined in the essay.

    --
    HOW'S MY POSTING? CALL 1-800-POSTING
  53. right for the wrong reasons by epine · · Score: 4, Insightful


    Sigh. One of the best sources of flamebait is being right for the wrong reasons.

    Surely C++ must rate as the least well understood language of all time. The horrors of C++ are almost entirely syntactic, beginning with the decision to maintain compatibility with the C language type declaration syntax and then adding several layers of abstraction complexity (most notably namespaces and templates).

    There are only two areas where I fear C++ for program correctness. The first is making a syntactic brain fart leading to an incorrect operator resolution or some such. These can be tedious to ferret out, but most of these battles are fought with the compiler long before a defect makes it into the production codebase.

    My second source of fear concerns interactions of exception unwinding across mixtures of object oriented and generic components. I see this as the only case where managed memory provides a significant advantage: where your program must incorporate exception handling. If you can't manage your memory correctly in the absence of the exception handling mechanism, I really don't believe you can code anything else in your application correctly either. I think exceptions are mostly a salvation for poor code structure. If all your code constructs are properly guarded, you don't need an error return path. Once a statement fails to achieve a precondition for the code that follows, the code path that follows will become a very efficient "do nothing" exercise until control is returned to a higher layer by the normal return path, whereupon the higher layer of control can perform tests about whether the objectives were achieved or not and take appropriate measures. I think the stupidest optimization in all of programming is cutting a "quick up" error return path that skips the normal path of program execution so that the normal path of execution can play fast and loose with guard predicates.

    The four languages I use regularly are C, C++, PHP, and Perl. Perl is the language I'm least fond of maintaining. Too many semantic edge cases that offer no compelling advantage to motivate remembering the quirk. C++ has many strange cases, but for C++ I can remember the vast majority of these well enough, because I've stopped to think about how they evolved from taking the C language as a starting point.

    I happen to love PHP for the property of being the most forgettable of all languages. I forget everything I know about PHP after every program I write, and it never slows me down the next time I sit down to write another PHP program. The managed memory model of PHP appeals to me in a way that Java doesn't, because as an inherently session-oriented programming model, PHP has a good excuse for behaving this way.

    I have a love/hate relationship with both C and C++. I write one program at a high level of abstraction in C++ and then when I return to C it feels like a breath of fresh air to live for a while in an abstraction free zone, until the first time I need to write a correctness safe string manipulation more complicated than a single sprintf, and then I scream in despair.

    The part of my brain that writes correct code writes correct code equally easily in all of these languages, with Perl 5 slightly in the rear.

    If I really really really want correct code I would always use C++. The genericity facilities of C++ create an entire dimension of correctness calculus with no analog in most other programming languages. The template type mechanism in C++ is a pure functional programming language just as hard core as Haskell, but because C++ is a multi-paradigm language, in C++ you only have to pull out the functional programming hammer for the slice of your problem where nothing less will do.

    What I won't dispute is that C++ is a hard language to master to the level of proficiency where it becomes correctness friendly. It demands a certain degree of meticulous typing skills (not typing = for ==). It demands an unflagging determination to master the sometim

    1. Re:right for the wrong reasons by Dr.Knackerator · · Score: 1

      Yep I'm pretty much PHP only at the moment, coming from a machine code/asm/Forth/C/C++/Java and C#. PHP is ideally suited to web based solutions. I spent a year writing Java/Struts pages and it killed my love of writing software. only 2 things i'd like php to do. get rid of the $ signs and having to do $this-> to call member functions on an object. that last one always tricks me out.

    2. Re:right for the wrong reasons by smallpaul · · Score: 1

      Maybe someday if slashdot regains its past glories we can have a thread devoted to the subject of whether ultimate code correctness bears any relationship to personal discipline, or if the entire matter rests with finding a suitable womb in which to program which protects the programmer from his or her own nature.

      I would be amazed if you could find anybody to support the latter assertion. Rather you will hear them say that there are a variety of factors that lead to correctness, including memory correctness, type correctness, mathematical correctness, correct business process modeling etc. Every form of correctness takes effort (i.e. time and money) to maintain. Language features can reduce this cost. Furthermore, language features can dramatically cut the cost of maintaining correctness across a team of people with diverse mental models and skills.

      The thing I find your post deeply lacking is any mention of teams or of other people at all. It is frankly of no relevance to me that "epine" writes equally correct code in C++, C, PHP and Perl 5 (in addition, you seem to suggest, to Haskell). None of the twenty high quality programmers I work with on a day to day basis will report the same experience. If your brain works in a way that you write equally correct code in Haskell and C then good for you, you should probably go write kernels for small devices. Back in the world I live in, various languages dramatically affect the ability of programmers to write correct code: though none of C, C++, PHP or Perl are languages that are really designed with correctness as a priority.

    3. Re:right for the wrong reasons by RedWizzard · · Score: 1
      Surely C++ must rate as the least well understood language of all time. ... There are only two areas where I fear C++ for program correctness. ... The four languages I use regularly are C, C++, PHP, and Perl.
      I wonder if you'd be so forgiving of C++'s flaws if you used an alternative, well designed OO language as much. I didn't used to mind C++ at all, but then I learnt Java and Sather, and now I avoid C++ wherever possible.
    4. Re:right for the wrong reasons by Anonymous Coward · · Score: 0

      I learnt Java and its flaws and now I avoid it wherever possible.

      To Polarize is the root of all evil.

    5. Re:right for the wrong reasons by gillbates · · Score: 1
      Maybe someday if slashdot regains its past glories we can have a thread devoted to the subject of whether ultimate code correctness bears any relationship to personal discipline, or if the entire matter rests with finding a suitable womb in which to program which protects the programmer from his or her own nature.

      I think that sums up the core problem of "language debates" rather nicely. I've found that certain languages are better than others when it comes to certain types of problems, and I will readily implement in the most appropriate language:

      • Assembly is very good for coding simple problems which must be fast, reliable, and small. Typically, this means OS components and other hardware-level code.
      • C is good for general utility programs, or programs meant to work on relatively small tasks in a larger context. It particularly excels at managing flat-files and other data structures of fixed size.
      • C++ is good for building robust applications. It is more complex than C, but in the hands of a well disciplined coder, is much more powerful and much more productive.
      • Visual Basic and Java are good for building enterprise level systems. VB is particularly problematic in regard to external dependencies; setting up a box to run your VB app can take as long as coding itself. Java, OTOH, has a much more sane, portable, approach, but it is decidedly limited in accessing system-specific details.

      I know that some are going to take issue with my depictions of their favorite language, and these are the people most in need of correction. With few exceptions, the inability to solve a problem with a given language is almost never the fault of the language.

      Granted, some languages do restrict what a programmer can do, but I've found that, more often than not, the supposed "failure" of a language has more to do with the ineptitude of the coder than the deficiencies of the language paradigm. The fundamental difference between the l33t c0dR and the programmer analyst is that the analyst approaches the problem in terms of processes and algorithms, where the coder thinks only in terms of their favorite language; if said language doesn't solve the problem nicely, they blame the language rather than themselves.

      Having worked on enterprise level systems, I can truly say that the implementation language is of almost trivial relevance from a problem-solution perspective. The primary difference between the different languages is the time required to build the system. Since some of the higher level languages (and their libraries) do much for the programmer, they require few low-level design details, and are hence faster to develop in. But while this makes design a little easier, it prevents the programmer neither from choosing inefficient algorithms nor from making logical errors. Someone who gave up C++ in favor of Java for any reason other than development time will find that the problem of complexity hasn't gone away; rather it has simply been moved up a level. Those same complexity problems C++ faced at the statement level will now reappear at the module level (such as two objects which support the same syntactic interface, yet do subtly different things...). And so it goes with every language transition - the problem of complexity doesn't go away, but rather, is shifted to a higher level. A programmer analyst, who thinks from a top-down perspective, addresses the problem of complexity with the high level design, rather than the facilities of the language. Thus, once the overall design is done, there is considerable leeway in choosing the correct language for implementation.

      --
      The society for a thought-free internet welcomes you.
    6. Re:right for the wrong reasons by mattgreen · · Score: 1

      Well said. C++ being misunderstood doesn't affect my perception of the language. It is a "kitchen-sink" language that has a TON of depth. It has a lot of idiosyncrasies, and can be extremely unforgiving in the wrong hands. Despite it seeming like a confused, multi-paradigm high/low level language, I like the amount of choice it offers me, I feel more expressive. On the other hand, Java's conscious design policy to scale the language down to the absolute minimal subset that average programmers can use is insulting.

      I am also looking forward to the D language as a simplified, GC'd C++.

    7. Re:right for the wrong reasons by Alan+Shutko · · Score: 1

      The problem with the "programmer analyst" view you describe is that it implies that software development is like a cow: you feed the cow the parameters of the problem, and the system falls to the ground at the other end. And the system you get is generally of the same quality.

      After you're done with your top-down work of art, all those details you elided as "the time required to build the system" can mean the difference between solving the problem or watching the company go bankrupt due to poor choice of tools. Some languages will be better able to express the higher level design than others. Sure, you could describe the Mona Lisa to someone using Swahili over morse code, but showing them a picture of it will be a lot quicker. Different languages don't just shift around the essential complexity of programming, they also add their own complexities which have to be managed. And you have to face those complexities each time you modify the system. Remember that maintenance of systems is usually the highest cost in software development... and when it isn't, it's usually because the system was scrapped.

      So, no, language choice will rarely mean that you can't solve a problem. But poor language choice can mean you're eventually unable to afford continuing to solve the ever-changing problem. Of course, by that time, your "programmer analyst" has long since moved on to other projects, leaving the people in the trenches to gripe about legacy systems.

  54. This isn't an article about optimization by the_skywise · · Score: 4, Insightful

    So much as an attempt to "prove" that programming to the metal is no longer necessary or desirable. (I.e., "After all, if a C++ programmer was truly concerned with reliability above all else, would he still be using C++?")

    The analogy is all wrong. These days there are distinctly two types of "optimization". Algorithmic and the traditional "to the metal" style.

    During college I worked with the English department training English students to use computers as their work had to be done on a computer. (This was before laptops were commonplace) The theory was that word processing allowed students a new window into language communication. To be able to quickly and painlessly reorganize phrases, sentences and paragraphs showed the students how context, clarity and meaning could change just by moving stuff around.

    This is what the author has discovered. That by being able to move code actions around, he can experiment and "play" with the algorithm to boost speed while keeping error introduction to a minimum. (Ye olde basic anyone?)

    He mistakenly equates this to "advanced technologies" like virtual machines and automatic memory buffer checking. In reality, we've just removed the "advanced technologies" from the process. (IE Like pointers, dynamic memory allocation, etc) (IE, ye olde basic anyone?)

    There's nothing wrong with this. Though I am a C++ programmer by trade, I was far more productive when I was professionally programming Java. But that was because I had LESS creative control over the solution because of the language syntax. No passed in variable changing, no multiple inheritance, etc. So when I'm thinking about how to lay out the code, there's pretty much a limited number of ways I can go about doing that.

    It's like the difference between having the Crayola box of 8 crayons and the mondo-uber box of 64. If you're going to color the green grass with the box of 8, you've got: Green. If you've got 64 colors, you're going to agonize over blue-green, green-blue, lime green, yellow-green, pine green and GREEN.

    That doesn't make C++ less "safe" than Java. Sure, you can overwrite memory. But you can also create a Memory class in C++ ONCE which will monitor the overflow situation FOR you and never have to worry again.
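
    A minimal sketch of the kind of "write the check once" class described above. The name CheckedBuffer and the helper write_is_rejected are hypothetical, made up for illustration; this is one possible shape for such a class, not a description of any particular library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical illustration only: "CheckedBuffer" is not a standard class.
class CheckedBuffer {
public:
    explicit CheckedBuffer(std::size_t size) : data_(size, 0) {}

    // Every access goes through this one bounds check, written once.
    unsigned char& at(std::size_t index) {
        if (index >= data_.size()) {
            throw std::out_of_range("CheckedBuffer: index past end of buffer");
        }
        return data_[index];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<unsigned char> data_;
};

// Returns true if a write at `index` was rejected as an overflow.
bool write_is_rejected(CheckedBuffer& buffer, std::size_t index) {
    try {
        buffer.at(index) = 0xFF;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

    The check costs a comparison per access, which is the same trade a managed runtime makes; the difference is only that here the programmer opts in.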

    But back to optimization:
    66 fps seems really fast. But in game context it's still kind of meaningless. Here's why. You're not just displaying uncompressed images. You're also doing AI, physics, scoring, digital sound generation, dynamic music, user input, possibly networking. As a game programmer, you don't stop at 66 fps. Because if you can do 132 fps, then you can really do 66 fps and still have half of each frame's time left over to do some smarter AI or pathfinding. Or if you get it up to 264 fps, then you can spend 1/4 of the cycle doing rendering, and maybe you can add true dynamic voice synthesis so you don't have to prerecord all your speech!
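
    The frame-budget arithmetic behind that argument can be sketched as follows. The function names are made up for illustration, and the model assumes rendering cost is the only thing measured by the "render fps" figure.

```cpp
// Milliseconds available per displayed frame at a given target frame rate.
double frame_budget_ms(double target_fps) {
    return 1000.0 / target_fps;
}

// Fraction of each displayed frame left over for AI, physics, sound, etc.,
// if rendering alone could run at `render_fps` but we only show `target_fps`.
double spare_fraction(double target_fps, double render_fps) {
    return 1.0 - target_fps / render_fps;
}
```

    With a 66 fps target, a renderer capable of 132 fps leaves half of each frame free, and one capable of 264 fps leaves three quarters free.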

    Ultimately, my point is this. (and I think this is what the author intended) You're going to get bugs in whatever language you write in. That's the nature of the beast. VM's and 4th generation languages take away the nitty gritty of programming while still providing a lot of performance power. And in a lot of cases, that's a good thing. But it's still nothing more than a "model" of what's really going on in the hardware. If you're really going to push the limits of the machine you have to be able to control all aspects of it. Now, it's getting harder to do that in Windows. We spend more time coding to the OS than the metal. But in the embedded systems category, and in console video game systems the metal still reigns and if you're going to develop a game that will push the hardware, you're going to need a programming language that will let you speak machine language. Not one that's going to protect you from yourself.

    As it was in the beginning, as it always will be: Right tool for the right job.

    1. Re:This isn't an article about optimization by Anonymous Coward · · Score: 2, Insightful
      But it's still nothing more than a "model" of what's really going on in the hardware. If you're really going to push the limits of the machine you have to be able to control all aspects of it.


      For instance, the famous 'Goto is considered harmful'.

      In actual machine code, the processor's equivalent of 'goto' (usually called a 'jump') is one of the most common operations...

      Another way of looking at this is antilock brakes in cars.

      It's not so much that the 'new' way of doing things is really any better to a skilled user. But they sure help reduce headaches caused by a lack of skill on the part of a new and/or less talented person.
    2. Re:This isn't an article about optimization by master_p · · Score: 1

      The argument that "variety means problems" only applies to people like you, as you understand. There is an equally large group of people who don't have a problem with choices: they know the choices, and they are perfectly able to choose from one that suits them best.

    3. Re:This isn't an article about optimization by Brandybuck · · Score: 1

      If you've got 64 colors, you're going to agonize over blue-green, green-blue, lime green, yellow-green, pine green and GREEN.

      Somehow I think if the new crop of language police were in charge of art, they would be telling artists to use only eight Crayons.

      "I've never had the need to use 'lime green', so there's no reason you should use a palette that contains it."

      --
      Don't blame me, I didn't vote for either of them!
  55. Depends on the design and the bottleneck by www.sorehands.com · · Score: 2, Insightful

    What you say makes sense, but is completely wrong.

    You have to consider the entire system design when looking at the best place to make the optimization. You need to look at what the bottleneck is and attack that, but keep in mind the issue in upgrading the system.

    1. Re:Depends on the design and the bottleneck by edwdig · · Score: 1

      That's exactly what I meant by assuming your design is ok first.

      If there's a huge bottleneck, fix that first. I'm talking about what to do if you still need more performance after that.

  56. Re:You don't optimize, that's the job of the compi by scot4875 · · Score: 2, Informative

    An intelligent compiler (i.e. any modern compiler you'd be likely to use) will automatically __inline the fred::setQ function, and then the peephole optimizer will reduce it down to the equivalent of myFred.q = 10;
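
    The original fred::setQ code isn't quoted in this post, so the following is a hypothetical reconstruction of the pattern being discussed: a trivial setter defined in the class body. Whether a given compiler actually inlines it depends on the compiler and optimization level; the point is only that both functions below have the same observable result.

```cpp
// Hypothetical reconstruction: the original fred class is not shown here.
class fred {
public:
    // Defined inside the class body, so it is implicitly an inline function.
    void setQ(int value) { q = value; }

    int q = 0;
};

// Goes through the setter; an optimizing compiler will typically
// reduce the call to a plain store.
int set_via_setter() {
    fred myFred;
    myFred.setQ(10);
    return myFred.q;
}

// The hand-written equivalent of the inlined form.
int set_directly() {
    fred myFred;
    myFred.q = 10;
    return myFred.q;
}
```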

    --Jeremy

    --
    Jesus was a liberal
  57. Mental images by Anonymous Coward · · Score: 0

    CS: Coder who always does low-level optimizations
    UT: Guy pressing fire as fast as possible

    CS: Coder who relies on compiler for optimizations
    UT: Low-life using an aimbot

  58. Damn!-Survivor! Stashdot style. by Anonymous Coward · · Score: 0

    Performance? I don't have a problem with performance. Just ask my wife.

  59. Re:You don't optimize, that's the job of the compi by Anonymous Coward · · Score: 3, Informative

    I don't know why his example was so bad.

    A good example would be how to detect if a king is in check in a chess program. There are a few different approches. Some are fast, some are slow, and a compiler just cannot "optimize" a slow approach into a fast one. The function is called millions of times per second in a chess program, so you want it optimized.

  60. When to optimize by lejordet · · Score: 2, Interesting

    When I start on a program, I usually make "place holder" functions where necessary to get the program up. Sure, this will be slow, but at least I can get the program up and running quickly (the place holders usually do what they're supposed to in the most convenient-to-code way I could think of, or emulate their final functionality - for example by returning true all the time).

    What this achieves for me, is that I can look at the program as a whole, and _then_ identify where the problem areas are - most likely not where I thought they were... Even if the first version takes 5 minutes to run (as my first attempt at a depth-first tree search did), it works passably, and is often easier to optimize than trying to optimize each function as I write it.

    Might not work for everyone, but I like coding this way :)

    --
    Yes?
  61. Managed environments-Coffe cuts. by Anonymous Coward · · Score: 0

    Maybe you do take a hit? But look at what you get for your troubles. I'd say it's worth it.

  62. Fragility of the decoder by Animats · · Score: 3, Interesting
    "I was also bothered--and I freely admit that I am probably one of the few people this would bother--by the fragility of my decoder."

    And, sure enough, there's a known, exploitable buffer overflow in Microsoft's RLE image decoder.

  63. Re:You don't optimize, that's the job of the compi by BasilBrush · · Score: 1
    What do you mean? Many compilers would take that function and inline it. Why on earth not?

    Is setQ returning an int or not?

  64. uhh.. yeah by XO · · Score: 2, Interesting

    ok, so this guy is saying that.. he found 5 or 6 ways to improve the performance of his program by attacking things in an entirely different fashion... ok..

    back in the day, i discovered a really great trick... you might represent it as something like... :

    boolean a;
    a = 1 - a;

    this is a zillion times more efficient than if(a == 1) a = 0; else a = 1;

    it is also about the same as a ^= 1; if you were going to use bitwise functions.

    OK. Great.

    --
    "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
    1. Re:uhh.. yeah by prockcore · · Score: 2, Informative
      this is a zillion times more efficient than if(a == 1) a = 0; else a = 1;

      This is the one time where I'll step up and say that VC actually does a few neat tricks for the ternary operator.
      c=(a>b)?0:1 /* or c=!(a>b), it's the same code */
      translates to
      cmp b,a
      sbb c,c
      inc c
      there are other variants of this, I'll leave it as an exercise to the reader to figure out what is going on.
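
      For anyone who wants to check their answer, here is a rough emulation of that three-instruction sequence in portable code. It assumes unsigned operands (the carry flag reflects an unsigned borrow) and the function names are made up for illustration.

```cpp
#include <cstdint>

// The straightforward version: c = (a > b) ? 0 : 1.
std::uint32_t select_with_branch(std::uint32_t a, std::uint32_t b) {
    return (a > b) ? 0u : 1u;
}

// Step-by-step emulation of cmp / sbb / inc.
std::uint32_t select_branchless(std::uint32_t a, std::uint32_t b) {
    // cmp b,a : computes b - a; the carry flag is set when that borrows, i.e. b < a.
    std::uint32_t carry = static_cast<std::uint32_t>(b < a);
    // sbb c,c : c - c - carry, which is 0 or 0xFFFFFFFF (all ones).
    std::uint32_t c = 0u - carry;
    // inc c   : all ones wraps around to 0; 0 becomes 1.
    return c + 1u;
}
```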
    2. Re:uhh.. yeah by Anonymous Coward · · Score: 1, Informative

      XOR it up!

      boolean a;
      a = a ^ 1;

      One assembly instruction on most processors.
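
      A small sketch putting the three toggles from this thread side by side, assuming the value is only ever 0 or 1. Whether any of them is actually faster depends on the compiler; an optimizer may well emit the same code for all three.

```cpp
// Branching version from the parent post.
int toggle_branch(int a) {
    if (a == 1) return 0;
    return 1;
}

// Arithmetic version: 1 - 0 is 1, 1 - 1 is 0.
int toggle_subtract(int a) {
    return 1 - a;
}

// XOR version: flips the lowest bit.
int toggle_xor(int a) {
    return a ^ 1;
}
```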

    3. Re:uhh.. yeah by Fuzzums · · Score: 1

      and this is the most logical implementation I would say :)

      --
      Privacy is terrorism.
  65. Re:You don't optimize, that's the job of the compi by Anonymous Coward · · Score: 0

    *most* of the time I trust the compiler, but where performance really matters I do it myself because sometimes the optimizer produces slower or incorrect code. Slower is fine, but incorrect code causes problems when your stuff works in debug, but not in release.

  66. What are you talking about? by Anonymous Coward · · Score: 1, Informative

    bzzzt... split infinitive, use of passive voice, etc.

    Sheesh. Don't make the corrections unless you know what you're talking about. Where's the split infinitive? In "I just finished reading..."? There's no split infinitive there.

    That post doesn't use passive voice, either -- "the essay" is a perfectly valid subject. Passive voice would be "compiler optimization is discussed."

    Hmph.

  67. Some perspective by Wumpus · · Score: 1

    I think it's worth noting that this guy's optimized version, while fast enough for his purposes, is probably a couple of orders of magnitude slower than it could be: The 333 MHz Pentium II he squeezes 5 frames and some change out of can decode 30 frames/sec of MPEG-2 video and stereo AC-3 audio without breaking a sweat. The machine I use to watch DVDs is a 300MHz Pentium II. Decoding an MPEG-2 frame involves a bit more computation than decoding an RLE encoded image. Just reading and decoding the DCT coefficients for an MPEG-2 I Frame would take more computation than decoding an entire targa file.

    So, yes, interpreted languages are great, and fast enough for most applications, but they're not suitable for everything, as the author ironically illustrates.

  68. Re:You don't optimize, that's the job of the compi by neil.orourke · · Score: 1

    Ok, fair enough :) For a spectacularly simple example like this, of course the compiler will inline the code. And thanks for pointing out that setQ() wasn't supposed to be returning an int :)

    That, however, wasn't the point I was trying to make. What I was (extremely badly) trying to say was that the compiler doesn't have a global view of the code, it can only be looking at the code from a local point of view. So, if your class design is badly thought out, the compiler's performance, no matter how clever, isn't going to help for the real world.

    This isn't to say that a modern compiler can't do some amazing things. If you want real performance, though, depending on the compiler to magically fix a bad design is not the smartest thing.

  69. Asymptotic performance by alanwj · · Score: 4, Insightful

    The problem, in my opinion, is that people go about optimizing in the wrong place.

    You can spend all day optimizing your code to never have a cache-miss, a branch misprediction, divisions, square roots, or any other "slow" things. But if you designed an O(n^2) algorithm, my non-optimized O(n) algorithm is still going to beat it (for sufficiently large n).

    If the asymptotic performance of your algorithm is good, then the author is right, and you may not find it worth your time to worry about further optimizations. If the asymptotic performance of your algorithm is bad, you may quickly find that moving it to better hardware doesn't help you so much.
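
    As a concrete (and deliberately simple) illustration of the gap, here are two ways to ask "does this list contain a duplicate?". The task and function names are my own choice for the example; the second version is O(n) on average, assuming the hash set behaves well.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// O(n^2): compare every pair of elements.
bool has_duplicate_quadratic(const std::vector<int>& values) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        for (std::size_t j = i + 1; j < values.size(); ++j) {
            if (values[i] == values[j]) return true;
        }
    }
    return false;
}

// O(n) on average: one pass, remembering what has been seen so far.
bool has_duplicate_linear(const std::vector<int>& values) {
    std::unordered_set<int> seen;
    for (int v : values) {
        // insert() reports via .second whether the value was newly added.
        if (!seen.insert(v).second) return true;
    }
    return false;
}
```

    No amount of cache tuning on the first version changes the fact that doubling n quadruples its work.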

    Alan

    1. Re:Asymptotic performance by Flyboy+Connor · · Score: 4, Insightful
      To put it in different terms: Optimisation is in finding a good algorithm, not in tweaking code details.

      To give a nice example: a colleague of mine worked on a program that took two months to execute (it consisted of finding the depth of all connections between all nodes in a graph containing 50,000 nodes). Since the customer needed to run this program once a month, this took far too long. So my colleague rewrote the whole program in assembly, which took him a few months, managing to reduce the required time to, indeed, one month.

      My boss then asked me to take a look at it. Together with a mathematician I analysed the central function of the program, and we noticed that it was, basically, a matrix multiplication. We rewrote the program in Delphi in an hour or so, and reduced the required running time to less than an hour.

      I won't spell out the lesson.

    2. Re:Asymptotic performance by Anonymous Coward · · Score: 2, Interesting

      (Posting as anonymous coward to protect the guilty)

      When I started my Ph.D. work, I came into a project doing compiler stuff in functional languages. There was a home-brew lexer that my adviser had written, that did 2d "array" lookups by scanning all the way through a list of lists. We thought it was broken, as it never finished. I changed it to use real arrays, and got it down to taking a matter of minutes. Merely using a O(n^2) implementation rather than O(n^4) :) (Language and implementation still sucked, though)

      Moral: It's the O-factors that will kill you. Optimizing anything but that is a waste of time until you have seen the profiling data. And even the O-factors are irrelevant in code that's not going to be executed often or that is outside of high-performance areas.

      I still catch myself trying to avoid single instruction improvements in handling of, say, user dialog actions. But the user and window system together probably took over a second already to cause the action, so doing a bit of extra CPU work to be clearer or safer is The Right Thing.

    3. Re:Asymptotic performance by lahi · · Score: 2, Insightful

      I disagree: Finding a good algorithm (indeed, finding the *best* algorithm for a task), is merely good programming. (And *inventing* a good algorithm is *excellent* programming!) Implementing it in the best possible manner, including applying shortcuts which are known to be possible due to knowledge of the specific task to which the algorithm is applied, is optimising.

      You might be a good optimiser, or you might just be a good programmer. Your colleague however, is a bad programmer.

      -Lasse

    4. Re:Asymptotic performance by jbert · · Score: 1

      But be careful. If you know (because of knowledge of the problem domain) that N will be 50 for the next few years, there is no shame in a linear pass to find your entry.

      Adding code to maintain a sorted list and perform a binary search or adding hashing, all so you can scale nicely to N=1,000,000 kind of misses the point.

      I think my point is that dumb algorithms are fine first implementations. Don't waste your time with a "better" algo if the problem doesn't demand it. (And how do you know the problem demands it? Well, really you should use profiling again, rather than "I may have lots of these").
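
      Something like this is all I mean by a "dumb" first implementation. The Entry type and find_linear name are made up for the example; the table is assumed to hold a few dozen entries at most.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical small lookup table: a handful of (name, value) pairs.
using Entry = std::pair<std::string, int>;

// Plain linear pass. Returns a pointer to the value, or nullptr if absent.
// The pointer is only valid while `entries` is alive and unmodified.
const int* find_linear(const std::vector<Entry>& entries, const std::string& key) {
    for (const Entry& entry : entries) {
        if (entry.first == key) return &entry.second;
    }
    return nullptr;
}
```

      If profiling later shows this lookup matters, swapping in a sorted vector or a hash map behind the same function signature is a small change.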

    5. Re:Asymptotic performance by rangek · · Score: 1

      Beware of linear scaling algorithms. If n is small, you are actually going to do worse with an O(n) algorithm. I was so annoyed when a colleague of mine replaced all of the "normal" code with "linear scaling code". It made the cases I was interested in 10 times slower! And the code was harder to follow.

    6. Re:Asymptotic performance by Brandybuck · · Score: 1

      Unfortunately, all too many people will hear your story and think that Delphi was the solution.

      PHB: "You got it down to one HOUR using Delphi? Go fire all those C and assembly developers because they aren't needed any more!"

      --
      Don't blame me, I didn't vote for either of them!
    7. Re:Asymptotic performance by exp(pi*sqrt(163)) · · Score: 1

      That's a good distinction. An ex-colleague of mine, years ago, needed to implement directed acyclic graphs to represent a workflow. He thought: "it's a graph, graphs can be represented by adjacency matrices, so I'll use a matrix". So an N element graph became an NxN matrix. Inserting new nodes in the graph took time O(N) as entire rows were updated. I pointed out that this was a braindead idea (in polite language). His response was that my suggestion (er...just sticking pointers to parents and children in each node like normal programmers) was an optimization and as such should be left until later. But the truth is, he wasn't delaying optimization, he was programming badly.
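
      Roughly the representation I had in mind, as a sketch. The names are invented for this example, it stores indices rather than raw pointers so that growing the node vector can't leave anything dangling, and it does not check that the graph stays acyclic.

```cpp
#include <cstddef>
#include <vector>

// Each node just records who its parents and children are.
struct Node {
    std::vector<std::size_t> parents;
    std::vector<std::size_t> children;
};

// Hypothetical workflow graph; the caller is responsible for keeping it acyclic.
class WorkflowGraph {
public:
    // Amortized O(1): there is no row or column of an N x N matrix to update.
    std::size_t add_node() {
        nodes_.emplace_back();
        return nodes_.size() - 1;
    }

    // Records the edge on both endpoints; at() throws on a bad index.
    void add_edge(std::size_t from, std::size_t to) {
        nodes_.at(from).children.push_back(to);
        nodes_.at(to).parents.push_back(from);
    }

    const Node& node(std::size_t index) const { return nodes_.at(index); }
    std::size_t size() const { return nodes_.size(); }

private:
    std::vector<Node> nodes_;
};
```

      Memory use is proportional to nodes plus edges rather than N squared, which for a sparse workflow graph is the whole point.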

      --
      Doesn't it make you feel good to know that our freedoms are protected by politicans, lawyers and journalists.
  70. parameters and variables by Anonymous Coward · · Score: 0

    wtf is the difference between a parameter and a variable?

  71. BIG OH WAS here.....whoever says optimizations.... by SauroNlord · · Score: 1

    BIG OH WAS here.....whoever says optimizations are not important a) is not talking about the proper optimizations or b) has never heard of big OH. It has to do with input size N and its rate of growth: a linear (O(N)) algorithm, compared to a doubly nested for loop over input size N, which will run in O(N^2). That can be the difference between years of straight processing on 100000 machines and a few microseconds on 1. lookup: http://c2.com/cgi/wiki?BigOhNotation

  72. I don't know about Erlang but... by BitwizeGHC · · Score: 2, Interesting

    Some implementations of popular dynamic languages (e.g., LISP, Scheme), let you do some type inference and/or some explicit declarations, and will spit out machine code or C that will do the job that much faster. Tweak your algorithm in the slow version of the language and then produce a program that runs ten times faster with an optimizing compiler.

    The Squeak VM is a great example of this. The whole thing is written in Squeak itself. Running a Smalltalk VM this way is painfully slow, but a Smalltalk->C translator generates the code that will be compiled and used as the actual, runtime VM (which can support a whole host of things, including raster and vector graphics, sound, MP3 audio and MPEG video!).

    --
    N4st0r, trixx0r h0bb1tz0rz! Th3y st0l3 0ur pr3c10uzz!
  73. They might... by warrax_666 · · Score: 1

    if the program contains lots of strings with embedded spaces. :)

    (Well, probably not measurably faster, but you could probably shave a few bytes off the executable... depending on padding and such)

    --
    HAND.
    1. Re:They might... by Surt · · Score: 1

      I don't know, if you do things like print those strings or iterate over them in any way, you may see a measurable improvement.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    2. Re:They might... by Grayswan · · Score: 1

      Thatmightbehardtoread.
      Thatmightbehardtoread.
      Th atmightbehardtoread.

      --
      If you open your mind too wide, people will throw trash in it.
    3. Re:They might... by hayriye · · Score: 1
      You can solve "strings with embedded spaces" problem like this
      10PRINT"HELLO"+CHR$(32)+"WORLD":END
  74. nice troll by Anonymous Coward · · Score: 0

    even with troll in your username, you got modded all the way to 5.

    1. Re:nice troll by Anonymous Coward · · Score: 0

      Isn't that cool? Debian Troll's Best often gets modded up to 5 one way or another. 3/4 of the replies to this one are taking it seriously. The posts are always full of deliberate red flags: I count 4 or 5 in this one. But the average clue-impaired /. reader just plows blindly on.

      I hate trolls in general, but it's hard to hate DTB. Nice work, DTB.

  75. Programming == Optimization by Frans+Faase · · Score: 1

    The only reason we program is because performance is an issue. If we had automatic systems for making specifications executable, we would not have to write programs. One of the reasons, I believe, that we do not have such automatic systems is that they would not perform. We need human optimizers, people like you and me, that can perform the necessary optimizations. The von Neumann computer architecture is rather slow; it is only because of the memory pyramid that it is able to perform reasonably. 90% of code is dealing with moving data around different types of memory (registers, cache, RAM, hard-disk, network) and doing differential queries, e.g. changing the result of the query when the underlying data has been modified. The screen, for example, is not completely recomputed when we move the mouse around, or with every keyboard hit. Only the area that is changed is refreshed. And this is just one of the examples. My conclusion is that the only reason we program (and not write specifications) is because performance is an issue.

  76. Re:Article puts it [some] in perspective by swordfishBob · · Score: 1

    High-level optimisation before worrying about low-level? Absolutely. You could start even higher up, and discover that often the wrong problem is being solved. How's that for inefficient?

    Regarding the assignments, it's a sad story but actually comes back to documented requirements and documenting the change in requirements. (Did the original assessor sign his name to his mid-task comments?)

    Back on the article, yeah, ok, sounds fair, but..
    - My kids still use a 200MHz machine, and simple flash animations bog it down. I remember Wolf 3D running on 33MHz machines. What the?

    - My mobile phone (with palmos) has more CPU & non-volatile storage than my first five personal computers did, and my current desktop has 200 times as much again. If I weren't developing software, viewing inefficient web-hosted graphics/animations or Word documents, and being in an Active Directory, I'd probably settle for a PDA with external screen and keyboard -- or any of the PCs we're currently chucking out as "worthless".

    --
    -- All your bass are below two Hz
  77. Let me guess. You haven't done much programming. by anti-NAT · · Score: 1

    I used to write programs the way you describe. In BASIC, around 1982. I was self taught, so I didn't know any better.

    The problem with code in the form you suggest, from a programmer's point of view, is that once a program gets to a particular size - I'd suggest around 200 to 300 lines - you can't keep track of every detail or every variable, and what they do, are for, and what their current value is, or whether the current value is valid.

    Modularity helps solve this problem, by allowing you to forget irrelevant details. In other words, it allows you to focus on what you should be focusing on, at that moment. As long as the other modules are well written, and debugged, you just mentally treat them like a black box.

    Your concern about modularity impacting performance is misplaced, for a few reasons. Low performance in a program is usually a result of slow algorithms, not bad programming structure - the slight performance decrease of jumping to a module will be infinitely smaller than the result of picking an inappropriate sorting routine.

    Secondly, if calling routines does have a measurable performance impact, you can "inline" them. This means that a copy of the routine's assembly code is inserted in the position where the module call would have been. This may improve performance, although it comes at a cost - the size of the binary increases, as there are now multiple copies of the same routine in the binary. This impacts on memory though, which, on a low end system, may mean swapping to disk, which, of course, impacts performance so significantly, that any other optimisations performed were just a waste of time.

    Everything comes at a cost. Modularity comes at the cost of time spent planning the modular architecture of the program, as well as the additional actual code to implement the modules. However, the benefits of modularity to program maintenance, data hiding and code reuse far exceed these costs.

    If you're interested in further reading on this topic, I'd suggest getting a copy of "Code Complete", by Steve McConnell. His first edition has been available since 1993, however, I've heard a new version is coming some time this year, so it may be worth waiting for that.

    --
    The Internet's nature is peer to peer - 20050301_cs_profs.pdf
  78. This guy is out on a limb by ibullard · · Score: 3, Informative
    Quote:
    Even traditional disclaimers such as "except for video games, which need to stay close to the machine level" usually don't hold water any more.

    Yeah, as long as you write simple, 2D games (like the author of the essay does) that would be true. Complex, 3D games are another matter. I write games for a living and even if you're within sight of cutting edge you're writing at least some assembly and spending a lot of time optimizing C++.

    Now I'm not knocking all he says or saying that good games need to be in C++ and assembly. Some games rely heavily on scripting languages to handle the game mechanics and world events. There's a lot less assembly code than there used to be. However, the core engine that handles graphics, physics, AI, and I/O is going to be written in C++ and assembly and will be for the foreseeable future.

    If I published a game that required a 3GHz computer to display 576x576 images at 66fps, I'd be laughed off the internet. A PS2 has a 300MHz processor and needs to display a 512x448 image every 30-60 seconds.

    1. Re:This guy is out on a limb by ibullard · · Score: 2, Insightful
      That should read "a 512x448 image 30-60 TIMES a second."

      I should have ended the post by typing "HEY, GRAMMER FREAKS! LOOK AT ME! I SUCK!" instead of writing that last sentence.

    2. Re:This guy is out on a limb by Anonymous Coward · · Score: 0

      ditto.

      although i'd have a go at the claims from a different angle. at work i use both python and c++, so i know a little about their respective strengths and weaknesses.

      the main weakness of c++, as i see it, is that it's difficult to use. if you write 'c with classes', that is use only _some_ of the features of c++ and completely forgo the stl or boost containers, algorithms and templates, you're likely to end up with code that is as unsafe as the author of that article claims. the proper use of c++ makes it just as safe, and almost as easy to use as python (i can't speak about erlang, never having written it).

      now i'm no games programmer, so i know little about the performance requirements for games. instead i write webserver software and to some extent clustering libraries for web applications. performance _does_ matter a lot here, but is always limited by considerations such as the connection speed of clients and whether another machine behind the loadbalancer isn't cheaper than trying to optimize more...

      the point i'm trying to make here is: quite often, once you've used c++ properly, that's performant enough. if not, _then_ you optimize, and yes, that can include cutting out some of the nicer c++ features and replacing them with more c-style paradigms. most of the time, though, you just rearrange code to avoid some repeated expensive function call in the middle of a loop.
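
      a rough c++ sketch of that kind of rearrangement. everything here is made up for illustration: expensive_prefix() stands in for whatever costly call sits in your loop, and the counter is only there so you can see how often it runs.

      ```cpp
      #include <cstddef>
      #include <string>
      #include <vector>

      // hypothetical stand-in for an expensive call (config lookup, db hit, ...);
      // the counter exists only so the number of calls can be observed.
      static int expensive_calls = 0;
      std::string expensive_prefix() {
          ++expensive_calls;
          return "user:";
      }

      // before: the expensive call is repeated on every iteration.
      std::vector<std::string> label_slow(const std::vector<std::string>& names) {
          std::vector<std::string> out;
          for (std::size_t i = 0; i < names.size(); ++i) {
              out.push_back(expensive_prefix() + names[i]);
          }
          return out;
      }

      // after: the result doesn't change inside the loop, so fetch it once.
      std::vector<std::string> label_fast(const std::vector<std::string>& names) {
          const std::string prefix = expensive_prefix();
          std::vector<std::string> out;
          out.reserve(names.size());
          for (std::size_t i = 0; i < names.size(); ++i) {
              out.push_back(prefix + names[i]);
          }
          return out;
      }
      ```

      with three names the first version makes three expensive calls and the second makes one; the output is identical.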

      with python, some things are incredibly easy to code, and due to specialization in the interpreter, pretty fast. the main problem with python is that everything's a named object and hugely dynamic, so the python interpreter will spend a huge amount of time performing string comparisons. that's why a good number of python extensions are written in c++, which is made all the easier by - again - boost. make sure that the number of calls into c++ code is small (type conversions cost time), and rewrite the python code you've got into a c++ extension, and things should speed up. (do try to profile your python code first, though. and use psyco.)

      my point? i'll wholeheartedly agree that semi-interpreted (bytecode-compiled) languages are usually safer to use, and may well be fast enough. i'll also agree that nevertheless optimizations will be necessary, and that they're likely to be on a higher level. if you need more performance than that, i don't think you'll be able to avoid lower level languages for a long time to come. and some of these (c++) are similarly safe as well.

      oh, well.

    3. Re:This guy is out on a limb by prockcore · · Score: 3, Insightful

      Yeah, as long as you write simple, 2D games (like the author of the essay does) that would be true.

      Not only that, but even simple 2d games can need optimizing. Perhaps they need optimizing because they're on an inherently slow platform (like Flash or a cell phone), or perhaps they need optimizing because they're multiplayer (and games with bad network code are immediately obvious and usually fail miserably)

      I find it strange that so many programmers here talk about things being "fast enough" or "not worth my time"... yet any article about mozilla, openoffice, windows, osx, damn near any software package with a gui is filled with complaints about slowness and bloat.

      Makes you wonder what IS worth their time.

    4. Re:This guy is out on a limb by julesh · · Score: 2, Insightful

      Mozilla is slow because its GUI is written using a flexible script interpreter, rather than being hard coded in C++.

      I don't know why OpenOffice is slow, I've never analysed the way it works in enough detail. I'm sure the reason is fairly obvious to anyone who knows the code base well enough to comment.

      Windows isn't really slow, but has some annoying features that have been added recently that can slow you down; for instance in the user interface it will try to open files of certain types to display information about them whenever you perform any operations on them, which isn't exactly helpful if opening the file is too slow...

      My only experience of using OSX is that it's blindingly fast. But I've used it for about 3 hours, so that's hardly conclusive.

      But you see the pattern -- these systems are slowed down by features that are _necessarily_ slow. You couldn't have the same features without the performance problems they bring. Windows can't give you a preview of an image, or tell you how many files are in an archive, without opening it (although I _really_ wish it'd do it in another thread...). Mozilla can't support its really easy-to-write user interface extensions without an interpreted UI.

      The people complaining are people who don't actually want these features, and don't see why they should suffer for them, which is a fair point.

    5. Re:This guy is out on a limb by Anonymous Coward · · Score: 0

      Um, the word is "grammar." With an a.

    6. Re:This guy is out on a limb by scrytch · · Score: 1

      > Windows isn't really slow, but has some annoying features that have been added recently that can slow you down

      Forget recent, some of them are just a result of truly dumb-ass design. For example, ever wonder why in explorer, right-click "file->new" is so slow to pop up the menu the first time, to the point of freezing for a minute or so? Because instead of there being a simple "templates" directory (like there is for office files) for all file types, it literally scans through every single registry key in \HKEY_CLASSES_ROOT looking for a ShellNew subkey. At least it caches it somewhere so it's not so expensive the second time ... til you reboot or restart explorer anyway. It's just inexcusable. And windows is just chock full of brute force approaches like that.

      --
      I've finally had it: until slashdot gets article moderation, I am not coming back.
  79. "fast enough" by Clod9 · · Score: 1
    This is the kicker. What is fast enough? What hardware do you have to run on?


    In the early nineties, I chose a tool with high-level scripting abilities because I naively thought learning C would be overly complex, and overkill. Processing and displaying color images on a Mac, I could do one frame every few seconds. That was just barely "good enough" on our development machines but it wasn't good enough in a lab full of old machines, where the software had to run. Then someone showed us a program called NIH Image, which could do a slideshow at several frames per second at full resolution even on the old machines. The difference? THEY USED THE RIGHT TOOL, and got results that appeared almost magical.


    A decade later, I figured CPU speeds had improved so much the hardware would no longer be an issue for any practical application short of weather forecasting, but even now I find people soaking processor cycles so badly with poorly chosen software architectures that a 2GHz machine slows to a crawl.


    Think: HARDWARE IS NOT FREE. If you write software that requires 10 times the horsepower, then your hardware will cost 10 times as much. If your company is spending 100K a year installing new boxes, wouldn't it justify the effort to use the right tools, do your profiling, and reduce it even by half?

  80. Re:You don't optimize, that's the job of the compi by Guilly · · Score: 1

    yeah.. that must be the most horrible example .. of both coding and optimization.. since the coding is poor and the optimization is going to be done by the compiler anyway.

  81. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  82. http://www.javaperformancetuning.com/ by Kunta+Kinte · · Score: 2, Interesting
    Maybe a little offtopic.

    But if you haven't heard of it, http://www.javaperformancetuning.com/ is a good source of performance tips for Java.

    --
    Based on upvotes, Ageism is the only "-ism" Slashdotters care about and think isn't SJW
  83. Re:Performance, an aspect of design and understand by k8to · · Score: 1

    The point is not that the language is "too hard", but that the low level language creates an order of magnitude more clutter and errors which get in the way of creating a reasonable, reliable program. They also obstruct creating a reliable efficient program.

    --
    -josh
  84. C++ has bounds checking. by rjh · · Score: 1

    And if you don't use it, you're a damnfool idiot. Consider:

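    #include <iostream>
    #include <stdexcept>
    #include <vector>

    int main() {
        try {
            std::vector<int> foo(10, 0);
            std::cout << foo.at(10) << std::endl;  // valid indices are 0..9, so at() throws
        } catch (const std::exception &e) {
            std::cerr << e.what() << std::endl;
        }
    }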

    Note: I haven't compiled this, so please consider it just pseudocode. But the point of the matter is, old non-bounds-checked arrays exist in C++ only because they're necessary for the C heritage. The vast majority of new C++ code out there should not use non-bounds-checked arrays, and if it does, it reflects more on the programmer than the language.

  85. Quick, go get Fortran 95. by mbkennel · · Score: 3, Insightful

    Do NOT convert to C++ under any circumstances!

    Fortran 77 sucks.

    But C++ sucks, in different ways.

    Fortran 95 is a much better language than Fortran 77, and for many things, better than C++ as well.

    It is practically a new language with an old name.

    If you currently have a F77 code, it is almost certainly far better to start using Fortran 95.

    Essentially all Fortran 95 implementations have compile and run-time checks which can make things as safe as Java, and when you take off the checks, things will run very fast. With the Intel Fortran 8.0, probably faster than anything other than hand-tuned assembly. You will probably whip GCC C++.

    It is also quite doubtful you will get significantly better performance in C++.

    No, I am not an old-fogey (I'm 35 now, programming since age 13). I learned Basic on the Apple II+, then Fortran 77 and C simultaneously when I got my first summer job. Then C++. Then Eiffel and Sather. Then Fortran 95.

    Yes, indeed, fully knowing C++ I choose Fortran 95 for technical superiority in the problems I want and need to solve. (Sather was the best ever, but now dead, Eiffel good if you have sophisticated data structures and you don't need multi-dim arrays, and F95 best for any linkage to Fortran and multi-dim arrays, modules but not objects).

    The problem is that C++ bugs, though less frequent than bugs in C, can be deep, subtle and severe. The language has very opaque bits. Include files are antediluvian. Pointers and references, baroque and archaic. Object model brittle. Templates powerful and dangerous. A hideous and error-prone syntax.

    This is not the case in Fortran 95. Bugs other than purely algorithmic ones are shallow.

    "computer science" truly misunderestimates Fortran 95.

    1. Re:Quick, go get Fortran 95. by Anonymous Coward · · Score: 1, Interesting

      I think your love for F95 is a bit overzealous. It suffers from many of the same problems that C++ does.

      Like C++, F95 has opaque types and function (and operator) overloading, and "pointers". These offer all the same merits and demerits as in C++.

      Many F95 compilers do not check runtime array bounds, and none do by default. And when they do, the executable slows noticeably. Nobody ships F95-based apps or runs production code with array bounds checking turned on. It just isn't done.

      F95 bugs are at least as common as in C/C++, simply because the language's programming idiom does not routinely check return status variables, much less trap exceptions. Maybe that's not the fault of the language, but idioms count just as much as syntax.

      I agree that F95 is almost certainly superior to most/all other languages if you need to manipulate arrays. F95's compilers also produce better optimized code than the others (aside from C/C++). That's simply where the priorities of its users lie.

      But since most users of F95 attempt a different kind of programming task, with its own characteristic set of bugs, it's not meaningful to compare the subtleties of F95's floating-point exceptions with C++'s null pointers.

      Finally, perhaps the reason that "computer science" often underestimates F95 is that there are no books written on the language for computer scientists. Fortran terminology was coined in 1955 and NEVER updated. Rather than use common terms like "record" or "procedure", fortran persists in promulgating its own antiquated argot.

      Until fortran "gets with the program", it will remain a language that only an engineer could love.

      Randy

  86. Re:You don't optimize, that's the job of the compi by murr · · Score: 1

    Now, which is going to execute faster

    Neither, since both q and setQ are private to the class and thus can't be accessed.

  87. Re:You don't optimize, that's the job of the compi by oskillator · · Score: 3, Insightful
    If you write clear and simple code the compiler or interpreter does all the other work. It will automatically remove unused code and simplify complex segments. So long as your code is not unnecessarily convoluted often the machine optimizations are better than the human brain optimizations.

    A compiler can do low-level optimization, but it can't figure out a better algorithm for you, and the simplest, least convoluted algorithm is usually not the fastest.

    All the assembly language fiddling in the world -- by the optimizer or by hand -- will give you maybe a 2x performance over C, 10x over perl, but a better algorithm will often increase performance by many orders of magnitude.
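
    A small sketch of what that means in practice, in C++ with made-up function names: both routines answer "does this list contain a duplicate?", but the second swaps the algorithm rather than polishing the first.

    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Obvious version: compare every pair, O(n^2) comparisons in the worst case.
    bool has_duplicate_quadratic(const std::vector<int>& v) {
        for (std::size_t i = 0; i < v.size(); ++i)
            for (std::size_t j = i + 1; j < v.size(); ++j)
                if (v[i] == v[j]) return true;
        return false;
    }

    // Better algorithm: sort a copy so equal values end up adjacent, O(n log n).
    bool has_duplicate_sorted(std::vector<int> v) {  // taken by value: we sort a copy
        std::sort(v.begin(), v.end());
        return std::adjacent_find(v.begin(), v.end()) != v.end();
    }
    ```

    For a million elements with no duplicates that's on the order of 5x10^11 comparisons for the first version against roughly 2x10^7 for the second, a gap that no amount of instruction-level tuning of the nested loops is likely to close.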

  88. Optimise last by Kris_J · · Score: 2

    Initially develop the entire project in a language you can develop in quickly. Once it works (and you're sure it does what the client wants), find out where the most CPU time is spent, then optimise those bits. "Optimise" may just mean having a good look at your code and working out a better way of doing it, or it might mean writing a library in assembler. Either way, optimise last.

  89. Re:Clarity in Code by Designadrug · · Score: 2, Insightful
    As far as clarity, find me one developer who has taken over a project and not complained about the quality of the inherited code ever.

    Guilty. But at least I'm thinking about the poor SOB who's going to be maintaining my code. In fact we were just implementing new functionality using a superfast but arcane algorithm and were having trouble debugging it (mucho matrix maths - yuk). Instead of finishing that, we researched another algorithm that instead uses triply-nested loops with two conditionals. It won't be half as fast (because of conditionals within loops of course) but it will be a heck of a lot easier for my successor to maintain. (Took 10 minutes to implement and worked first time. Had to check I wasn't stuck in a BTL simulation)

    --
    Cogitum Ergo Hatto
  90. Pre:Painful P-ful Post by zCyl · · Score: 1

    Perhaps peon posting parent post prefers posts precognitively portending prematurely passing profiling.

    Performance profiling presents plentiful possibilities per producing programs performing past previous paces.

    1. Re:Pre:Painful P-ful Post by Lord+Kano · · Score: 1

      Peck posterior protuberances.

      --
      "Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
    2. Re:Pre:Painful P-ful Post by maxwell+demon · · Score: 1

      P-ful posts?
      Cruel! C creates clearly creative comments! Cancel C-less crap!

      --
      The Tao of math: The numbers you can count are not the real numbers.
  91. For some things, enough hasn't ever been enough by grimen · · Score: 2, Interesting

    The author argues (I think correctly) that for many tasks we overstress optimization in places where it isn't necessary. Well and fine for tasks where it's not necessary, such as the example he gives.

    However, as available processing power increases, some tasks change. Many technologies follow a trajectory that starts at "unthinkable", then moves to "if you have special hardware", and then moves gradually to software. Often along the way, features and computational complexity are added that keep a technology barely in reach (of both HW and SW implementations). It can be many, many years before some technologies settle into a stage where they can be comfortably supported in SW at acceptable performance.

    Examples include: sound (which started with clicks and beeps and moved through to multichannel 3D audio), graphics, games (text-based to ever-more-complex 3D) and video codecs (simple RLE moving to ridiculously complex stuff like the H.264 codec). In games, for example, there are often preference panels controlling which features should be disabled for performance reasons. This seems evidence that the authors/publishers feel they can't count on their customers having enough power to run the games without cutting features to gain performance.

    I think for those applications where processing power trails needs and desires of customers and where optimization can make up the difference, developers will need to optimize or be eaten by the competition. In my experience, in things like codec and graphics development, you can get many-times performance increases over solid but poorly optimized implementations (sometimes even when you're just feeding HW).

    I think those gains can be critical.

  92. Re:You don't optimize, that's the job of the compi by colinleroy · · Score: 1

    Please give me a call the day our compilers are able to transform some poor O(n^3) algorithm into a nice O(1). I'm very interested.

    Compilers can optimise lots of stuff, but certainly not the programmer's logic.

    --
    blah
  93. Similar story by complete+loony · · Score: 1

    When I was in high school I wrote an AI player for connect 4. In Basic. On a 386 equivalent machine. (errrrg slow)
    I had to optimize the hell out of the algorithm to eliminate repeated calculations and simplify loop structures.
    When I got to uni one of the assignments was to write an AI player for gomoku. Well all I had to do was re-implement the algorithm, this time in C, and on a more powerful machine.
    Boy was it fast, and that allowed me to spend some serious time looking at future moves that probably wouldn't have been possible if I hadn't spent the time optimizing the algorithm previously.
    Don't underestimate what you can accomplish on a modern CPU once you've saved time optimizing.

    --
    09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
  94. Postmature optimization by nimblebrain · · Score: 5, Informative

    After years of developing, I really take to heart two things:

    1. Premature optimization often makes better optimizations down the line much more difficult
    2. It's 90% guaranteed that the slowdown isn't where or what you thought it was

    Profilers are the best thing to happen to performance since compilers - really. I encounter a number of truths, but many myths about what degrades performance. A few examples of each:

    Performance degraders

    • Mass object construction
    • Searching sequentially through large arrays
    • Repeated string concatenation (there are techniques to mitigate this)
    • Staying inside critical sections for too long
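
    For the string concatenation item, one common mitigation is to size the buffer once and append in place. It is sketched here in C++ purely for illustration, with hypothetical function names:

    ```cpp
    #include <cstddef>
    #include <string>
    #include <vector>

    // Naive: 'out = out + part' builds a brand-new string every pass, copying
    // everything accumulated so far, so total work grows roughly quadratically.
    std::string join_naive(const std::vector<std::string>& parts) {
        std::string out;
        for (std::size_t i = 0; i < parts.size(); ++i) {
            out = out + parts[i];
        }
        return out;
    }

    // Mitigated: work out the final size, reserve it once, then append in place.
    std::string join_reserved(const std::vector<std::string>& parts) {
        std::size_t total = 0;
        for (std::size_t i = 0; i < parts.size(); ++i) total += parts[i].size();
        std::string out;
        out.reserve(total);
        for (std::size_t i = 0; i < parts.size(); ++i) {
            out += parts[i];
        }
        return out;
    }
    ```

    Both return the same string; the difference only shows up in a profile once the number of pieces gets large.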

    Not performance degraders

    • Lots of object indirection
    • Lots of critical sections

    The "lots of object indirection" myth is one I encounter frequently. Object A calls Object B calls Object C, and it "intuitively" looks like it must be slow (Computer A calling Computer B, etc. would be slow), but even with stack frame generation, these are lightning fast compared with even the likes of "date to string" functions, never mind line-drawing commands or notification-sending.

    The reason that particular myth is dangerous is that it's the single most pervasive myth (IMHO) that leads to premature optimization. People take out layers of object indirection and make it harder to put in better solutions later. I had an object that recorded object IDs in a list and let you look them up later. If I had "flattened" that into the routine that needed it, I might have effected a 0.1% speed increase (typical range for many premature optimizations). As it stood, because it hid behind an interface (equivalent to an ABC for C++ folks), when I had finally implemented a unit-tested red/black tree, it was trivial (~5 minutes) to drop in the new functionality. That's not an isolated case, either.

    Mind you, I profiled the program to determine the slowdown first. Searching on the list, because so many were misses (therefore full scans), the search was taking up 98.6% of the entire operation. Switching to the red/black tree dropped the search down to 2.1%.
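
    That kind of swap can be sketched roughly as follows. This is C++ for illustration only, not the code from the project described above: IdRegistry, ListRegistry, TreeRegistry and count_misses are hypothetical names, and std::set stands in for the hand-written red/black tree.

    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <set>
    #include <vector>

    // Hypothetical interface (an ABC in C++ terms); callers only ever see this.
    class IdRegistry {
    public:
        virtual ~IdRegistry() {}
        virtual void record(int id) = 0;
        virtual bool contains(int id) const = 0;
    };

    // First cut: a plain list, where every miss is a full scan.
    class ListRegistry : public IdRegistry {
    public:
        void record(int id) { ids_.push_back(id); }
        bool contains(int id) const {
            return std::find(ids_.begin(), ids_.end(), id) != ids_.end();
        }
    private:
        std::vector<int> ids_;
    };

    // Drop-in replacement: std::set is a balanced tree (typically red/black),
    // so hits and misses both cost O(log n).
    class TreeRegistry : public IdRegistry {
    public:
        void record(int id) { ids_.insert(id); }
        bool contains(int id) const { return ids_.count(id) != 0; }
    private:
        std::set<int> ids_;
    };

    // Calling code is written once against the interface and never changes.
    int count_misses(const IdRegistry& reg, const std::vector<int>& queries) {
        int misses = 0;
        for (std::size_t i = 0; i < queries.size(); ++i)
            if (!reg.contains(queries[i])) ++misses;
        return misses;
    }
    ```

    The indirection through the virtual call is the part people are tempted to "flatten" away, and it is also the part that makes the replacement a one-line change where the object is constructed.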

    All in all, if you have a slow program, profile it. There is no substitute for a well-written profiler. Stepping through and "feeling" how long it takes in a debugger, while it can point you in rough directions, will miss those things that take 50 ms out of the middle of each call to the operation you're checking. Manually inserting timing calls can be frustrating enough to maintain or slow down your program enough that you can't narrow down the performance hit.

    gprof works well with gcc and its relatives (make sure to add -pg to your flags), but I'm not sure if there's a good open source option out there for people using other tools that doesn't require you to alter your source.

    In the Windows world, we recently got in the professional version of AQTime 3. It's an astounding package, allowing you numerous reports, pie charts and call graphs, saving the last few runs, calculating differences in performance between runs, allowing attachment to running processes, on top of a pretty nice way to define areas of the program to profile. The single nicest thing about it, though, is the performance. We turned on full profiling (that is, profiling all methods in all modules, including all class library and third party components) on the largest project we had, and it ran with perhaps a 30% slowdown. If you've used profilers before, you know how astounding that is ;)

    Profiling applications always surprises me. In one case, a space-making algorithm I was running on controls seemed a little pokey; I found out more than 50% of the time spent was on constantly verifying that the lists were sorted. Today, I was investigating a dialog that looked like it must hav

    --
    Binary geeks can count to 1,023 on their fingers :)
    1. Re:Postmature optimization by unwesen · · Score: 1

      kcachegrind is a wonderful and very visual (=intuitive) tool that really helps you with profiling (it's a frontend to valgrind). Instead of showing how much time is spent in a particular function, it shows how many instructions are executed in that context - often a much better approach, as functions may be waiting (and necessarily!) most of their time.

      Altogether, with gprof and kcachegrind, it's pretty simple to find bottlenecks in an application. One of my colleagues added/helped add support for reading python profiling logs, so profiling python works well, too.

    2. Re:Postmature optimization by oogoody · · Score: 1

      Great post. I disagree with this though:
      > Lots of critical sections

      If you actually have many threads in your program
      critical sections can cause really bad performance
      and high unpredictable latency. With a few threads
      it won't matter so much. Plus mutexes vary in
      performance on different OSs.

    3. Re:Postmature optimization by nimblebrain · · Score: 1

      That looks excellent, unwesen :) I'm impressed with the drive to modularity and working with multiple output, and kcachegrind looks just plain purdy :) It's nice to see profilers moving towards instrumenting behind the scenes at profile time, requiring no more from a user other than turning on standard debug information.

      Thanks for the pointer!

      (*laugh* Reading valgrind's documentation, I get a chuckle about having extra instrumentation specified with a "--skin" parameter)

      --
      Binary geeks can count to 1,023 on their fingers :)
    4. Re:Postmature optimization by nimblebrain · · Score: 1

      If you actually have many threads in your program critical sections can cause really bad performance and high unpredictable latency.

      That doesn't actually have to be the case. The default case for a critical section is lightning fast. I've stepped through the assembly for NT's critical section, and the non-contention path is a mere handful of instructions long, including a LOCK INC instruction, which takes about 3x as long as an INC (which is not much).

      The contention path, however, involves a WaitForSingleObject call, which, while not substantial, is certainly longer, and the performance hit mostly comes from the block, not the call per se.

      The critical section itself is represented by a small, basically passive structure, representing the lock count and depth count, and not a whole lot else.

      There are some simple rules that can make critical sections take the non-contention path 99.99 times out of 100, and profiling results bear it out:

      • Contention is the performance-killer; exit the critical section as soon as the appropriate data is safe
      • Enter and exit the critical section multiple times if there are non-contentious lines in the middle, rather than making the critical section big
      • You can often work with local variables (or local copies) without critical sections, or with a critical section for the copy - performance can vary, so use your judgement
      • Never, ever make a call to something you don't have complete control of, or that is even remotely large, inside a critical section
      • Corollary: If you're sharing output, use a critical-section-protected queue rather than trying to call the output within a critical section
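
      As a sketch of the first and third rules, assuming standard C++ with std::mutex standing in for an NT critical section (the Samples class is hypothetical):

      ```cpp
      #include <mutex>
      #include <numeric>
      #include <vector>

      // Hypothetical shared sample buffer guarded by a single mutex.
      class Samples {
      public:
          void add(int value) {
              std::lock_guard<std::mutex> lock(mutex_);  // held only for the push
              values_.push_back(value);
          }

          // Copy under the lock, then do the slower work on the private copy.
          long sum() const {
              std::vector<int> snapshot;
              {
                  std::lock_guard<std::mutex> lock(mutex_);
                  snapshot = values_;
              }  // lock released here; other threads can call add() again
              return std::accumulate(snapshot.begin(), snapshot.end(), 0L);
          }

      private:
          mutable std::mutex mutex_;
          std::vector<int> values_;
      };
      ```

      Whether the copy is cheaper than holding the lock for the whole computation depends on the data size, so this is the "use your judgement" case.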

      I've seen violations of all of the above 'rules', and heck, yeah, performance suffers :)

      Critical sections can badly impact performance, but I would guarantee that the problem is their use, not the construct itself!

      Other mutexes on operating systems are typically equally fast unless they're being used for inter-process communication and must be accessed/looked up by name/atom.

      --
      Binary geeks can count to 1,023 on their fingers :)
    5. Re:Postmature optimization by oogoody · · Score: 1

      How is windows dealing with priority inversion in
      just a few instructions?

      >Never, ever make a call to something you don't have complete control of, or that is even remotely large, inside a critical section

      Good luck on that one :-)

    6. Re:Postmature optimization by Anonymous Coward · · Score: 0

      >> It's 90% guaranteed that the slowdown isn't where or what you thought it was

      Dude. Think about this. Perhaps you just lack talent for this?

    7. Re:Postmature optimization by nimblebrain · · Score: 1

      How is windows dealing with priority inversion in just a few instructions?

      Priority inversion? Critical sections don't have anything to do with priority inversion - the operating system is free to switch thread context at any time, including right in the middle of entering or exiting the critical section.

      Are you thinking about the actual OS-internal constructs that prevent task switching or interrupts, or switch "rings"? Last time I dealt with those was the SEI and CLI instructions on the old 6510 processors. The only time those types of constructs should be touched is writing device drivers, and sparingly at that :)

      Actually, to tell you the truth, when I first heard about critical sections, that's what I thought they were as well, and I steered completely clear of them, thinking that a single long critical section could grind the entire computer to a halt.

      >Never, ever make a call to something you don't have complete control of, or that is even remotely large, inside a critical section<

      Good luck on that one :-)

      In practice, it's actually not that hard to avoid. Essentially, it means avoiding constructs like this:

      MyPackets::ProcessPackets
      packet_critical_section.enter
      . packet = packets.dequeue
      . process_packet(packet)
      . destroy_packet(packet)
      packet_critical_section.leave

      But rather, use a construct like the following:

      MyPackets::ProcessPackets
      packet_critical_section.enter
      . packet = packets.dequeue
      packet_critical_section.leave
      process_packet(packet)
      packet_critical_section.enter
      . destroy_packet(packet)
      packet_critical_section.leave

      The idea being that process_packet is potentially long and not under your control, so if someone was to add a packet (MyPackets::AddPacket) while process_packet was going on, that thread would be blocked, and it doesn't need to be (there's nothing inherent in adding a packet that's going to affect the processing of a previous packet).
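
      The same shape in C++, as a sketch rather than the actual code: std::mutex stands in for the critical section, and the class and method names are hypothetical.

      ```cpp
      #include <deque>
      #include <mutex>
      #include <string>

      // Hypothetical packet queue; std::mutex plays the critical section.
      class MyPackets {
      public:
          void add_packet(const std::string& packet) {
              std::lock_guard<std::mutex> lock(mutex_);
              packets_.push_back(packet);
          }

          // Returns false when the queue is empty. 'process' may be slow and
          // is deliberately called with the lock released.
          template <typename Fn>
          bool process_one(Fn process) {
              std::string packet;
              {
                  std::lock_guard<std::mutex> lock(mutex_);
                  if (packets_.empty()) return false;
                  packet = packets_.front();
                  packets_.pop_front();
              }  // lock released before the potentially long call
              process(packet);
              return true;  // 'packet' is a local, so destroying it needs no lock
          }

      private:
          std::mutex mutex_;
          std::deque<std::string> packets_;
      };
      ```

      Because the lock is not held during process(), another thread (or even the callback itself) can call add_packet() without blocking.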

      destroy_packet can be assumed to be short and totally under your control (destroying objects shouldn't be a grand affair), and if the operation is simple enough, may not need to be in a critical section at all.

      Does that shed some better light on what I was saying? :)

      --
      Binary geeks can count to 1,023 on their fingers :)
    8. Re:Postmature optimization by oogoody · · Score: 1

      >Priority inversion? Critical sections don't have anything to do with priority inversion

      Um, they have everything to do with
      priority inversion.
      http://www.us.design-reuse.com/articles/article2425.html
      I have found NT to have horrible thread scheduling
      in a real multithreaded app.

    9. Re:Postmature optimization by MarkCollette · · Score: 1

      Binary geeks can count to 1,023 on their fingers :)

      I guess I'm special, since I can count up to 1024 on my fingers! 1, 2, ..., 1024

    10. Re:Postmature optimization by nimblebrain · · Score: 1

      I guess I'm special, since I can count up to 1024 on my fingers! 1, 2, ..., 1024

      *laugh* Ah, but can you count down to zero? :)

      Right hand: thumb = 1, index = 2, middle = 4, ring = 8, pinky = 16

      Left hand: pinky = 32, ring = 64, middle = 128, index = 256, thumb = 512

      All fingers: 1+2+4+8+16+32+64+128+256+512=1,023 :)

      No fingers: 0

      Especially rude: 132 :)

      --
      Binary geeks can count to 1,023 on their fingers :)
    11. Re:Postmature optimization by nimblebrain · · Score: 1

      Um, they have everything to do with priority inversion. (http://www.us.design-reuse.com/articles/article2425.html)

      I would find it an odd design to schedule two threads of different priority on the same critical section, outside of perhaps device drivers and RTOS applications. Mutex protocols are an optional add-in to threading libraries (e.g. on AIX, here), for those instances where you would design such interaction - it's not going to happen by chance. Those interactions wouldn't be likely to happen in more than one or two "hot spots" in an application - and I would surmise those would be on the 'edges'.

      Mind you, I don't know what kind of specialized software you write. It sounds like something that approaches needing realtime priority.

      I have found NT to have horrible thread scheduling in a real multithreaded app.

      For the priority inversion scenario, certainly. Here's an article on how NT deals with priority inversion. In practice, I've found that there's a subtle order-of-operation difference that can make a Linux/Unix-optimized approach go slower (relative to its theoretical performance) on NT, and vice versa. If I recall correctly from some of my porting efforts, NT tends to set up the new thread, continue running on the current thread, and let the scheduler switch to the new thread later, while Linux tends to fork to the new thread immediately and let the scheduler switch to the old thread later. I was glad of the difference at one point; my thread pool suffered from a race condition I had overlooked - Linux's switching model ran into my bad assumption situation almost every time.

      --
      Binary geeks can count to 1,023 on their fingers :)
    12. Re:Postmature optimization by MarkCollette · · Score: 1

      *laugh* Ah, but can you count down to zero? :)

      Why yes I can... by putting my hands behind my back :)

  95. Re:You don't optimize, that's the job of the compi by Captain+Segfault · · Score: 1

    Please give me a call the day our compilers will be able to transform some poor O(3) algorithm into a nice O(1). I'm very interested. What's your phone number?

  96. wtf by fred+fleenblat · · Score: 4, Insightful

    The article made it sound like the optimizations he was doing at the Erlang level were somehow "better" than optimizations done in a language like C++ because he could just try out new techniques w/o worrying about correctness. His array bounds and types would be checked and all would be good.

    BS.

    First of all, Erlang won't catch logical or algorithm errors, which are quite common when you're optimizing.

    Second, you can optimize just fine in C++ the same way just as easily, IF YOU ARE A C++ programmer. You just try out some new techniques the same way you always do. So array bounds aren't checked. You get used to it and you just stop making that kind of mistake or else you get good at debugging it. Hey at least you have static type checking.

    In fact you might be able to do a better job of optimization because you'll be able to see, right in front of you, low level opportunities for optimization and high level ones also. C++ programmers aren't automatically stupid and blinded by some 1:1 source line to assembly line ratio requirement.

  97. Re:You don't optimize, that's the job of the compi by Pseudonym · · Score: 4, Insightful

    Wrong. Dead wrong.

    You don't micro-optimise unless the compiler doesn't do the job well enough. But nowadays, you almost never have to. Your superior brainpower can mostly be freed from the mundane details of your hardware and instead you can concentrate on using more suitable algorithms or data structures.

    Indeed, the best thing you can do to get your code running fast is to write it with good abstractions. That way, when you find a performance problem, you can swap some old code out and swap some new code in and everything else will still work.

    --
    sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
  98. You just don't understand by CwazyWabbit · · Score: 1

    Code runs faster if you take out all possible white space between the instructions.

  99. A quibble by CwazyWabbit · · Score: 1

    If you are testing against a non-volatile value, then there are no side effects from skipping, once one if has matched, the following if statements that can't possibly be true. A decent optimising compiler will just jump to the end of the set of if statements. However, as you say, it probably wouldn't turn the whole lot into a range check and a calculation.

  100. Whose time and money? by Moderation+abuser · · Score: 1

    10 hours spent optimizing code by the programmer vs 100,000 people spending an extra 30 seconds (if you're lucky) twenty times per day waiting for a computer to do something.
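
    Working through the numbers in the post makes the imbalance concrete. This is only arithmetic on the figures quoted above; nothing here is measured, and the function name is made up for the illustration.

```python
# All figures come from the comment above; nothing here is measured.
USERS = 100_000
EXTRA_SECONDS_PER_WAIT = 30
WAITS_PER_DAY = 20
PROGRAMMER_HOURS = 10


def user_hours_lost_per_day(users, seconds, waits):
    """Total hours all users together spend waiting in one day."""
    return users * seconds * waits / 3600


lost = user_hours_lost_per_day(USERS, EXTRA_SECONDS_PER_WAIT, WAITS_PER_DAY)
print(f"{lost:,.0f} user-hours lost per day vs {PROGRAMMER_HOURS} programmer-hours spent once")
```

    On those assumptions the users collectively lose well over a thousand times the programmer's ten hours, and they lose it every day.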

    It all depends whether you want quantity or quality, doesn't it.

    --
    Government of the people, by corporate executives, for corporate profits.
  101. Re:You don't optimize, that's the job of the compi by quannump · · Score: 1

    umm O(3) = O(1) lameness lame

    --

  102. All very good ... till you compare with asm.. by Anonymous Coward · · Score: 0

    I wonder how a program written in C or asm with the same external behavior would compare with the one written in Erlang. I am surprised that the author never brought that up. Performance is not about how much time it takes. It's about how much time it takes multiplied by how many times it is called.

    Speed doesn't get you anywhere if you are going in the wrong direction.

    1. Re:All very good ... till you compare with asm.. by Anonymous Coward · · Score: 0

      The author wanted widespread use of other hand-holding languages instead of C, or other fast and easy to err languages. The only real perspective in optimizing for performance is to show how big a gain there is between Erlang and C or equivalents. Without this comparison, the performance gain by optimizing in Erlang means little compared to the real performance gain from another language, like C.

      The author's benchmark in Erlang may be good enough for picture operations, but it won't do for video operations.

      There's this common fallacy that the current hardware is too powerful for software, so it's acceptable to be inefficient because the inefficiency is trifling. The opposite is true because there is a physical limitation to hardware, while software is limitless depending on the programmer's imagination - look at cryptography, 3D rendering (Earth Simulator), banking, etc.

  103. An article about Erlang on Slashdot, but... by Anonymous Coward · · Score: 0

    None of the readers seem to know anything about it...

    A few points:
    * interpreted languages nowadays tend to compile to native code
    * the compiler or run time is better at optimising than you are
    * performance is not the same as scalability, a scalable architecture is more important than Mips (Mips are cheap commodities which you buy in inexpensive boxes from HP or Dell - if your architecture supports it)
    * most real life systems have latencies (io, kernel activities, screen painting) which are only tangentially related to the grunt of your piece of code (see architecture)
    * Erlang is uber scalable, designed to build reliable high performance systems. It provides its own (very lightweight) concurrency and means the developer doesn't use expensive OS process and thread spawning and doesn't have to worry about managing threads and synchronisation.

    Check out the Yaws (yet another web server) versus Apache benchmarks for throughput and concurrency.

    Erlang systems also stay up. The main BT telephony system only had 31ms downtime in its first year (see here, page 30).

  104. So true... by warrax_666 · · Score: 2, Interesting
    You have to work much, much harder in C++ to get anywhere near FORTRAN performance, so much so that it's almost never worth the effort.

    One of the most dangerous things (optimization-wise) in C++, I've found is the temporary-creation problem. You have to be insanely careful to avoid creating temporaries to get any sort of reasonable performance... (or maybe I just need a better compiler than GNU GCC?)

    Templates powerful and dangerous.


    Not quite sure why you would consider them dangerous, but they are Turing Complete (i.e. they are a compile-time language all of their own). Which some people have used to create this. It looks almost as fast as Fortran, but the syntax is a lot more complex than just A*B for a matrix-multiplication.
    --
    HAND.
  105. Performance tuning. by Moderation+abuser · · Score: 1

    Not entirely true that you can't throw hardware at a problem, but pretty close.

    What gets me are all the people who want me to tune systems because an application is slow. Thing is, unless there's something specifically wrong with a hardware or software configuration, performance tuning at the hardware, OS, network and rdbms layers will only make a relatively marginal difference to the performance of the application.

    Performance tuning or optimisation at the application code level on the other hand can make several orders of magnitude difference to the performance of an app.

    --
    Government of the people, by corporate executives, for corporate profits.
    1. Re:Performance tuning. by Old+Uncle+Bill · · Score: 2, Interesting

      Amen. Poorly written queries, excessive XML parses/transforms and too much bandwidth utilization are all things NOT solved by tuning the architecture. We typically make 2-3X improvements in our product through tuning the system and up to 100X by tuning the above. I've worked on projects where the system (in this case 1 4way db and two 2way app servers), could support 2 users. No amount of throwing hardware at that thing would improve the performance. Funny thing is, the client was a bit frosted because they had paid (at that time) about $4 million for the project. As a performance architect, lazy and inefficient programmers will keep me employed for centuries.

      --
      Yes, I am an agent of Satan, but my duties are largely ceremonial.
    2. Re:Performance tuning. by Glonoinha · · Score: 5, Insightful

      Problem is that your client is still running on the prototype of their project, not the real release. They just don't know it, and I'm guessing the original programmers don't know it.

      The most effective, well used (if unintentionally used) development methodology is the prototype methodology. The first pass is simply a reality check: can we even accomplish what needs to be accomplished on the hardware and development tool we have available? The prototype is then shown to management as a proof of concept, to show them that their ideas are possible, and then a second generation is re-engineered from the ground up using the lessons learned in the first generation as a foundation for a solid, well engineered deliverable product. This breaks down in one of two ways: management says screw the rewrite, let's just run what we have - or the developers are not smart enough to understand that their first pass at it wasn't production quality code, only a prototype.

      What your client has right now is a prototype, a proof of concept. It 'works' inasmuch as a kite flies - as a demonstration that the concept is viable, but not meant for real work. You could probably push a big kite hard enough to 'fly' two people, but that doesn't make it a good idea. You could continue to 'tweak' a kite in order to even double the performance, get 4 people off the ground - but I wouldn't recommend using it for commercial applications.

      Odds are the app needs to be understood from top to bottom so a set of software engineers know the concepts, what the package is intended to do, how it currently does it, what the expectations are for performance and growth - and then the SEs that understand it need to rewrite it from the ground up, developing performance engineered code that is production quality.

      --
      Glonoinha the MebiByte Slayer
    3. Re:Performance tuning. by pboulang · · Score: 2

      This is a wise post. Add in that it is extremely important to get this concept across to the one-who-signs-the-checks. This is the difference between writing software and software engineering. I appreciate your post.

      --

      This comment is guaranteed*

      *not guaranteed

    4. Re:Performance tuning. by Old+Uncle+Bill · · Score: 1

      I agree on all points. One of my newer clients (the example above is from about 5 years ago) says it this way: An application goes through three revisions: It is written once for functionality, second time for performance and third time for serviceability/maintainability. Unfortunately, there are many programmers that do not know how to do 2 & 3. I guess, like you said, there is a difference between programming and software engineering.

      --
      Yes, I am an agent of Satan, but my duties are largely ceremonial.
  106. Re:You don't optimize, that's the job of the compi by colinleroy · · Score: 1

    Of course, I meant O(n^3) and O(n).

    --
    blah
  107. Performance is QUITE important by jubitzu · · Score: 1

    It may not be now as the hardware more than meets the demands of software. What if the demands of software one day catch up with the hardware again? Hopefully programmers will still know how to optimize software when that happens. I think that the matter is definitely perennial. Used to be the hardware could not meet the needs of the software, now the software cannot fully utilize the hardware. It is only a matter of time before that changes again.

    This post was written with O(1) complexity.

  108. Persuasive, widely accepted, and wrong. by Anonymous Coward · · Score: 1, Insightful
    20 years ago these arguments would have been revolutionary, and partly correct.

    Today they're widely accepted. Most of the responses on /. agree with the basic premise.

    And that's probably why my 2GHz cpu doesn't really feel much snappier than the first computer I ever had, a 10MHz 286 box.


  109. The compiler can't do all micro-optimizations by r6144 · · Score: 2, Informative
    Some good habits in coding help the compiler to do its job better, and also result in clearer (at least not uglier) code.

    Example 1: in C, if you use "int" for a variable "x" that should have a type of "unsigned", "x/4" will not just be a simple shift, instead three or four instructions are involved. Indeed, it would be very hard for the compiler to infer that "x" is always non-negative and optimize for you, except in the simplest cases.

    Example 2: in floating-point math, "divide by 10" is not exactly the same as "multiply by 0.1", thus many compilers (gcc 3.4 without "-ffast-math", icc8 by default, and probably the Java VM) won't optimize the former into the latter, even in the many cases where it won't matter. This results in code that is 10-40 times slower on the P4.

    Example 3: in Haskell, since lazy evaluation has much more overhead than eager evaluation, compilers always try to optimize the former into the latter. However, in many cases it is impossible for the compiler to do that, since it can't decide if using eager evaluation will prevent the evaluation from terminating.

    In short, it is good to rely on the compiler to do the optimization (such as register allocation) that is known to be done well, but what the compiler can do is very limited, since (1) it can't know your intent if you have not expressed it, so (for example) it has to make sure that every floating-point operation conforms to very stringent error bounds, often at the cost of significant speed, even if you don't really care about that; and (2) some code-optimization problems take extortionate time to solve, or might even be theoretically infeasible in general. Therefore, when writing code that is going to take some significant CPU-time, it is good to have some good habits that help the compiler, as long as the code isn't uglified too much.
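
    Example 2 is easy to check by hand. The sketch below uses Python floats, which are assumed here to be IEEE 754 doubles (true on mainstream CPython builds); the two helper names are made up for the illustration. It shows why a compiler that must preserve exact results cannot silently swap the division for the multiplication.

```python
def divide_by_ten(x):
    """The operation the programmer wrote."""
    return x / 10


def multiply_by_tenth(x):
    """The cheaper operation a compiler might like to substitute."""
    return x * 0.1


# Integers from 1 to 100 for which the two forms give different doubles.
mismatches = [
    n for n in range(1, 101)
    if divide_by_ten(float(n)) != multiply_by_tenth(float(n))
]

print(divide_by_ten(3.0), multiply_by_tenth(3.0))
print(len(mismatches), "of 100 inputs differ")
```

    The differences are in the last bit, which is exactly the kind of change the programmer may not care about but the compiler is not allowed to assume.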

    1. Re:The compiler can't do all micro-optimizations by raxx7 · · Score: 1

      About 1: many architectures have arithmetic shifts, not just logical shifts.
      About 2: Are you sure about ICC8? And you're right about Java VM. In both cases, it has to do with IEEE 754 conformance, which is part of Java and C99 but not of C90, by the way.

    2. Re:The compiler can't do all micro-optimizations by r6144 · · Score: 1
      1. Arithmetic shifts are divisions that round downwards, which is different from the C integer division operator which rounds toward zero. Therefore the C compiler is obliged to use something more clumsy although very few people actually need that.

      2. ICC8 does generate a division with "-O3 -march=pentium4", although I think it doesn't have to do strict math by default. Strange.
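
      Point 1 can be sketched with Python integers, whose >> is an arithmetic shift. The function names are hypothetical, and int(x / 4) is used only to imitate C's round-toward-zero division for small values; the bias of 3 is the usual fix-up for dividing by 4, not output copied from any particular compiler.

```python
def c_style_div4(x):
    """Divide by 4 rounding toward zero, as C's '/' does on ints."""
    return int(x / 4)  # int() truncates toward zero


def shift_div4(x):
    """Arithmetic right shift: divides by 4 rounding toward negative infinity."""
    return x >> 2


def shift_div4_fixed(x):
    """Shift with a bias of 3 added to negative inputs, matching C's rounding."""
    return (x + 3 if x < 0 else x) >> 2


for x in (-8, -7, -1, 7):
    print(x, c_style_div4(x), shift_div4(x), shift_div4_fixed(x))
```

      The extra compare-and-add is the "something more clumsy" the compiler has to emit for a signed x; with an unsigned type the plain shift is already correct.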

    3. Re:The compiler can't do all micro-optimizations by andrewgreen · · Score: 1

      Nitpicking a couple of your examples.
      Example 1: "...will not just be a simple shift...". No, it will be an *arithmetic* shift, where the topmost bit is preserved and replicated into the second-to-topmost bit. Something even the 6502 had.
      Example 2: "'divide by 10' is not exactly the same as 'multiply by 0.1'" -- particularly as 0.1 cannot be represented precisely in IEEE floating point.

  110. Algorithms by Ann+Elk · · Score: 1

    To summarize the essay: A suboptimal implementation of an optimal algorithm usually beats an optimal implementation of a suboptimal algorithm.

    Also, the author is clearly enjoying the fruits of those "optimization aware" programmers that created Erlang, especially those "cycle-counters" who wrote the virtual machine.

  111. Re:You don't optimize, that's the job of the compi by Anonymous Coward · · Score: 0

    Compiler optimization....

    would be great, but... I have never seen a compiler that actually does some real magic. OK, they do optimize to some point, but that's _nothing_ compared to your broken algorithm. When searching for data, maybe you put everything in a list and then search the list from start to end. If you instead put the data into a binary tree, it would be A LOT faster to search as the data grows. This is also a kind of optimization, and one that will result in a REAL speedup, even if the compiler managed to optimize 5% on the list search.
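
    A rough sketch of the difference, counting steps instead of timing anything. It swaps in a binary search over a sorted list for the binary tree mentioned above (same logarithmic idea, less code); both function names are made up for the illustration.

```python
def linear_search(items, target):
    """Scan from start to end; returns (found, comparisons made)."""
    comparisons = 0
    for item in items:
        comparisons += 1
        if item == target:
            return True, comparisons
    return False, comparisons


def binary_search(sorted_items, target):
    """Halve the search range each step; returns (found, steps taken)."""
    lo, hi = 0, len(sorted_items)
    steps = 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_items) and sorted_items[lo] == target
    return found, steps


data = list(range(1_000_000))
print(linear_search(data, 999_999))
print(binary_search(data, 999_999))
```

    For the last element of a million-entry list the scan looks at every entry, while the binary search needs at most 20 steps; no constant-factor compiler tweak closes a gap like that.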

    Most of the time the compilers are happy just _if_ the code compiles! Look at GCC: it likes to allocate more memory for values than is actually needed. Why? Nobody knows. Adding -O6 minimizes the overhead, but not totally. Why is this? Because it was easier to code and get right this way. Is it the optimal solution? No, of course it isn't. You can hand-optimize everything, but that will take you a lifetime for a bigger project. Get used to the fact that not everything is as fast as it can be; it's better that it does what it should. And then you optimize away the BIG things, and those are most likely bad algorithms. When you have fixed those, then you can start looking at the 95% loop (where your program probably spends most of its time).

    Compiler optimization is overrated, and I don't believe in it. Sorry. Compilers are programs like any other and suffer from the same things as any software.

  112. MOD PARENT UP! by Anonymous Coward · · Score: 0

    Excellent original poetry. Slightly insulting, but creative.

  113. It's not fast enough by vrt3 · · Score: 1

    The thing is, the author claims it's fast enough, but I don't buy it. In any commercial setting, many customers will have 1 GHz and slower machines. Couple that with slower hard drives, larger images and other stuff running on the machine, and there's nothing left of your nice framerate.

    --
    This sig under construction. Please check back later.
    1. Re:It's not fast enough by JamieF · · Score: 1

      >the author claims it's fast enough, but I don't buy it. In any commercial setting,

      Well, first of all it's a benchmark, so its purpose is to measure relative performance. There is no particular performance goal. "Fast enough" doesn't even apply here. The author says he's impressed with how fast the initial trivial implementation is, but that doesn't equate to a hard performance target. For my uses, a Targa decoder could take all year, because I have exactly 0 Targa images that I need decoded.

      Secondly, it's a little benchmark he whipped up to prove a point, and his point was that high-level optimizations yield more performance benefit than low-level ones (not calling a function is faster than inlining it, etc.) But it's not a commercial application, and nobody's asking you to buy it.

      You might as well say "I won't pay the exorbitant $0 Winbench license fee for all the desktop users in my company because it doesn't run as fast as they need it to on modern PCs." Okaaaay....

  114. A bit more complicated when "enough is enough" by SLOGEN · · Score: 1

    There are several external influences that decide if a program is "optimized enough". These influences are often local to specific users, and thus cannot be tested easily by benchmarking.

    Now, I'm all for optimizations based on profiling, and for not optimizing until the need is there. I'm just trying to say that it's pretty complicated trying to figure out when the need is there in a more general setting.

    Repetition: (probably the most common problem) It's OK if a task takes long if it's not done too often. If you have to repeat the task many times, performance of that task becomes as many times more critical as you repeat it.

    Multitasking/Parallelism: Other tasks would like to get to the processor too. If your program is just fast enough to do its job, it's too slow to do it while notepad is running alongside it. Now, I run colinux alongside on my windows desktop, so performance of "other" programs does matter. Also, excessive memory usage slows down all the other programs on my machine too (hi there mozilla :)

    Power: My laptop can scale the CPU frequency and use that to save power, so I can sit out in the sun programming longer. If a program spends too many cycles on a job, I'm gonna have to go inside sooner, and this program BEGS for optimization from my point of view.

    Many more of these side effects of optimization occur, so don't think you can easily (or exactly) evaluate when certain code is efficient enough for others, only for yourself.

    This also shows a fundamental advantage in the OpenSource model. I can run a profiler myself and try to perform some optimization, instead of selecting not to use the program (or use another).

    --
    SLOGEN [ http://ungdomshus.nu : Sebastian cover music]
  115. It can be true.. by adeyadey · · Score: 1

    When I first started I wrote a "Compiler" for the 8K PET BASIC - that removed all unneeded whitespace, automatically put lots of statements on 1 line, used short var names, etc. It did speed up a program a little, saved some RAM, and also protected the code a bit, since the result was unreadable.. Needless to say this is a final "before publication" step..

    --
    "You lied to me! There is a Swansea!"
  116. engineering by sir_cello · · Score: 2, Interesting


    A lot of this discussion here is either crap, a rehash or was covered in Engineering 101.

    Basically, you have some requirements for the product, and you optimise according to those requirements. Performance is just one variable (time to market, scalability, reliability, security, usability, cost, etc - are the many others).

    The requirements for a product in a fast moving market entry company are less about performance and more about rollout ASAP.

    The requirements for the same product two years later may be to improve performance to achieve scalability requirements.

    If you're writing some sort of overnight build or batch application: whether it takes an extra hour or not may not matter, because it has a 12 hour window to run in.

    If you're writing an order processing system, then performance and end-to-end turnaround will be vitally important, but you won't focus on the assembly, you'll focus on the algorithms, design and architecture.

    If you're writing a compression or encryption module: you probably will work on the assembly.

    In all of the above cases, before you optimise anything, you profile and understand how the optimisation is going to pay back in real terms.

    In my experience, you cannot prescribe any of this: you need to take it on case by case basis because every product and circumstance is different.

  117. Where's the gotcha? by PhilHibbs · · Score: 1

    Sorry, but I missed that one, nothing in the article really surprised me.

    Was it the fact that he was using a pcode-interpreted language? I've already been surprised so many times at how fast Perl is, that this does not surprise me any more. I've parallel-written a few programs in Perl, C, and C++, and Perl won every time.

  118. Re:You don't optimize, that's the job of the compi by ecb29 · · Score: 1

    Most compilers will actually reduce sets of if statements to a hardcoded jump table, which is going to be as fast as rewriting it. See http://www.codeguru.com/Cpp/Cpp/cpp_mfc/comments.php/c4007/?thread=48816 for more information. However, the clarity of the single test is indisputable....

  119. Optimise where needed. by Anonymous Coward · · Score: 0

    Here is the way I program a project:
    Start with an overview of the whole project. What must be done in each step, and what is the structure?
    Then I choose the algorithm for everything. It is very important to choose the right algorithm at the beginning, since it is more cumbersome to go back and replace a simple little algorithm with a more complex one that uses a different data structure than to just do the right thing from the beginning.
    Simple data structures should be used where it is easy to swap in a better one later and there isn't a performance hit.
    Then I code as much as I can, with a focus on stability. I don't care about performance at all, since my goal is to implement the right algorithms.
    When I am done with this, I run a profiler and find all the bottlenecks.
    Here comes the most important point!
    Whenever I find a bottleneck, I check whether the problem is in my code, in one of the dependencies, or in the compiler.
    It is usually pretty easy to see where the problem lies. 'Can this structure be easily recognised by a program?' - then it is the compiler.
    'Is there no possible way a program could guess this optimisation?' - then it is the program.
    If the bottleneck is in some system call, I go through the code (mailing list, mailing list, mailing list) for that call.

    The point is I don't do micro-optimisation unless there is no other way to do it. The main benefits are that I do far less labour in the long run - and I get a better understanding of other programs.
    It usually takes a lot longer to fix performance bottlenecks elsewhere in the system, but the benefits in the long run are well worth it.

  120. Program as if users mattered. by vaxer · · Score: 1

    Screw performance -- that can be solved by hardware.

    No amount of hardware spending will make your program easy to use, easy to understand, and easy to adapt for new purposes.

    Program as if the user mattered. They ain't getting twice as smart every 18 months, but they sure are spending money on software.

  121. Knowledge of the language is more important by gangz · · Score: 1

    It is a common misconception that to achieve better performance, one needs to drill down to the assembly code. It has been shown that with VM languages like Java / C#, people were able to get performance close to that of languages compiled to native code. The trick is to know how the language behaves. A simple thing like notifying the GC to delete an object can bring a good performance boost. I have also known applications that were written in C++ but had bad performance. So I think that to get good performance, knowledge of the language you are writing in is essential. I am not getting into the "with such fast CPUs, why do we have to worry about execution speed" argument.

  122. Somebody doesn't understand O notation... by Theatetus · · Score: 2, Insightful
    And quicksort will work just fine too. Sometimes O(n^2) will *not* work. Therefore never use bubblesort.

    You totally missed the point, didn't you? There are situations where a bubble sort is faster than a merge sort or a quicksort. It has almost no setup overhead, so if you're sorting sufficiently small arrays (and what I remember from CS101 is that "sufficiently small" goes up to about 1000 members) bubble sort is actually significantly faster.

    So, as a matter of fact, if you had to sort a million small arrays, bubble sort would be the only feasible option.

    --
    All's true that is mistrusted
    1. Re:Somebody doesn't understand O notation... by lederhosen · · Score: 1
      I do understand O notation.

      Do not use bubblesort.

      You totally missed the point, didn't you? There are situations where a bubble sort is faster than a merge sort or a quicksort. It has almost no setup overhead, so if you're sorting sufficiently small arrays (and what I remember from CS101 is that "sufficiently small" goes up to about 1000 members) bubble sort is actually significantly faster.


      My point is: if you're sorting sufficiently small arrays, the speed almost never makes a big difference. It does not change the complexity.


      So, as a matter of fact, if you had to sort a million small arrays, bubble sort would be the only feasible option.


      Sort 1000 elements: cost O(1)
      do it n times: cost O(n)

      So the complexity of the algorithm is not depending on the sorting algorithm.

      The bubble sort algorithm may be faster, but the complexity is in no way better. So do not start talking about "understanding" big oh notation.
    2. Re:Somebody doesn't understand O notation... by Eivind+Eklund · · Score: 4, Insightful
      The previous post misses many aspects of algorithmic optimization, and leads to the wrong conclusions.

      For a better analysis of optimization in this specific part of the sort space, I recommend Jon Bentley's classic "Engineering a sort function".

      This paper discusses how to implement an optimal sort, after having done real-life measurements. Conclusions include dropping to an O(N^2) sort algorithm when qsort partitions become small enough - insertion sort was chosen. (The selected cutoff was seven elements at that point; it may be that it would be sensible to choose a higher cutoff for the generic case now, as the cache locality might help. However, I won't bet on this either way without doing measurements.)
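
      The general shape of such a hybrid can be sketched as follows. This is not the paper's implementation: the random pivot and the simple two-pointer partition are assumptions made to keep the sketch short, and only the cutoff value of seven comes from the text above.

```python
import random

CUTOFF = 7  # the cutoff quoted above; the best value is machine-dependent


def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] (inclusive) in place."""
    for i in range(lo + 1, hi + 1):
        value = a[i]
        j = i - 1
        while j >= lo and a[j] > value:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = value


def hybrid_quicksort(a, lo=0, hi=None):
    """Quicksort that hands small partitions to insertion sort."""
    if hi is None:
        hi = len(a) - 1
    while hi - lo + 1 > CUTOFF:
        pivot = a[random.randint(lo, hi)]
        i, j = lo, hi
        while i <= j:  # two-pointer partition around the pivot value
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        # Recurse into the smaller side and loop on the larger one,
        # which keeps the recursion depth logarithmic.
        if j - lo < hi - i:
            hybrid_quicksort(a, lo, j)
            lo = i
        else:
            hybrid_quicksort(a, i, hi)
            hi = j
    insertion_sort(a, lo, hi)
```

      Whether seven is still the right cutoff is exactly the kind of question that only measurement on the target machine can answer.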

      The qsort implemented there is the one still used in at least FreeBSD. I don't know the status for other OSen.

      As for big O notation: The discussion in the previous post is so imprecise as to be misleading. It uses "cost" and "complexity" where it discusses asymptotic complexity; these are distinctly different, and it is necessary to be quite clear on the distinctions to do correct analyses.

      Big-O notation measures asymptotic complexity over an arbitrarily selected set of basic operations assumed to have unit cost. It discards all constants to make the analysis easy to do and easy to work with. This is a useful tool, but it only measures asymptotic complexity, and it only does so based on arbitrary basic operations.

      In practice, a mere factor 1000 speed difference (one second to twenty minutes) might be quite noticeable. This will be REMOVED from the big-O analysis, which can make it point in a quite different direction from the truth.

      In the parent post, sorting 1000 elements is assigned a unit cost, claiming that the time will be similar for a bubble sort and a quick sort, and "low enough not to matter". Further, the conclusion is "never use bubble sort". Assuming a naive implementation of both bubble sort and quick sort, and a set of arrays that is already sorted, the quicksort will be O(N^2) and the bubble sort will be O(N) in the number of items in each bin. This is a quite noticeable difference in asymptotic complexity.
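
      As a rough illustration of the sorted-input case, the Python sketch below counts comparisons. It assumes the "naive" bubble sort stops after a pass with no swaps and the "naive" quicksort always picks the first element as pivot; both function names are made up for this example.

```python
def bubble_sort_comparisons(items):
    """Bubble sort with an early exit; returns the number of comparisons."""
    data = list(items)
    comparisons = 0
    n = len(data)
    while True:
        swapped = False
        for i in range(n - 1):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:
            return comparisons
        n -= 1


def quicksort_comparisons(items):
    """First-element-pivot quicksort; returns the comparison count."""
    if len(items) <= 1:
        return 0
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    # Counted as one comparison against the pivot per remaining element.
    return len(rest) + quicksort_comparisons(smaller) + quicksort_comparisons(larger)


already_sorted = list(range(200))
print(bubble_sort_comparisons(already_sorted))  # 199, i.e. N - 1
print(quicksort_comparisons(already_sorted))    # 19900, i.e. N * (N - 1) / 2
```

      A bubble sort without the early exit, or a quicksort with a better pivot choice, would give different counts, which is why the assumptions matter.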

      A naive programmer is in my opinion the only relevant assumption if we're to give absolute advice on simple sort functions. A non-naive programmer will know how to do complexity evaluation, will know the tradeoffs on startup of the various algorithms, and will only be implementing a sort him- or herself because actual speed measurements or specific knowledge of the sort behaviour show that the system supplied sort is not fast enough for the case in question, and that a custom sort can do better. (S)he will also evaluate whether the data to sort is likely to be almost sorted or highly random, and thus which kind of algorithm is likely to go faster. (And insertion sort/bubble sort is actually faster also for large data sets if they're almost sorted beforehand.)

      Eivind, who if he had to give general advice would give "evaluate qsort, mergesort, heapsort, insertion sort, and using a data structure that keeps order before choosing bubble sort."

      --
      Doubting the existence of evolution is like doubting the existence of China: It just shows that you're uninformed.
    3. Re:Somebody doesn't understand O notation... by Anonymous Coward · · Score: 0
      The bubble sort algorithm may be faster, but the complexity is in no way better. So do not start talking about "understanding" big oh notation.

      He is not talking about complexity, he is talking about time. You are being obtuse, intentionally or not. GP's point is spot on that with small enough arrays of elements the constant time becomes a factor that *overwhelms* the complexity. Where this happens between bubblesort and qsort is up for debate, since his numbers were made up examples anyway.

      For example, you would never do quicksort on a two element array that must be ordered, but you are applying bubble sort with if b > a b=a, a=b. That one statement will always be faster for the two element case than partitioning and all the other set-up stuff involved for qsort.

      This is time, not complexity.
    4. Re:Somebody doesn't understand O notation... by jeremyp · · Score: 1

      if (b > a) { t = b; b = a; a = t; }

      (t is a temporary variable of the same type as b and a).

      --
      All I want is a secure system where it's easy to do anything I want. Is that too much to ask ~~ Randall Munroe
    5. Re:Somebody doesn't understand O notation... by GlassHeart · · Score: 1
      There are situations where a bubble sort is faster than a merge sort or a quicksort. It has almost no setup overhead, so if you're sorting sufficiently small arrays (and what I remember from CS101 is that "sufficiently small" goes up to about 1000 members) bubble sort is actually significantly faster.

      You remember wrong. According to sortchk, for random data sets Quicksort is comparable to Bubblesort at 10 items. At 100 items, Bubblesort requires 4x to 7x more comparisons, and 5x to 13x more data moves, depending on the variant of Quicksort. At 1,000 items, Bubblesort requires some 32x more comparisons and some 255x more data moves.

      I don't disagree if what you want to say is that there are special case data sets where Bubblesort can be the best performer. However, it should not be the "default" algorithm of choice.

    6. Re:Somebody doesn't understand O notation... by Anonymous Coward · · Score: 0

      You don't understand Big-Oh at all. In fact, you're describing the exact thing I hate about the way everything is taught nowadays. In my Programming Languages class, we were told to do various things 'without recursion' (in languages without loops). Things such as reversing a list, etc. The teacher's solution was that a single call to 'reverse' was done without recursion. She then asked which way was faster: A call to reverse, or creating our own. The correct answer would be "There's no way to tell: Reverse could be the exact code we'd write, or it could be optimized in C++ outside of the bounds of the functional language." But EVERYONE said that the single reverse call would be faster, because it ran in constant time (O(1)). This isn't true.

      Big-Oh notation is, in general, useless. There can be a O(n) algorithm that runs extremely slow, while an O(n^2) algorithm runs really fast. Why? Because it's really c*n, and d*(n^2), and c might be .. say.. 1,000 times your average n. But it's still O(n), so if you increase n by a factor of 1,000, then yeah, the O(n) might perform better. But look at your average data sets first. If you have O(10000000n) and O(1n^2) algorithms, well.. I'd rather choose the n^2 algorithm if my data sets are going to be 1,000 items or less.
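
      The arithmetic behind that choice can be sketched in a few lines of Python. The cost functions are hypothetical models using the constants from the paragraph above (c = 10,000,000 and d = 1), not measurements of any real algorithm.

```python
def cost_linear(n, c=10_000_000):
    """Modelled cost of the O(n) algorithm with a large constant factor."""
    return c * n


def cost_quadratic(n, d=1):
    """Modelled cost of the O(n^2) algorithm with a small constant factor."""
    return d * n * n


def cheaper(n):
    """Name the cheaper model for an input of size n."""
    return "quadratic" if cost_quadratic(n) < cost_linear(n) else "linear"


print(cheaper(1_000))        # quadratic
print(cheaper(100_000_000))  # linear
```

      With these constants the two models cross at n = 10,000,000; below that the n^2 model is cheaper.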

      Even better are the people who don't realize things might get unrolled/optimized in the compiler, and their 'pure' algorithms course told them that O(n) is always better than O(n^2). I'm sorry, but if the choice is running a really complex algorithm that traverses my list once, or a really simple and straightforward one that traverses it n times, it's often likely (especially for smaller data sets) that the n^2 algorithm will run faster, again.

      Examples include O(n) algorithms that do a lot of logs, sqrts, powers, multiplication, etc... but where the n^2 algorithm might use only addition and subtraction. I don't know of one offhand, but there are hidden costs to each function call. You can't just assume that said functions don't take up more time such that it'd be slower in the end anyway.

    7. Re:Somebody doesn't understand O notation... by Anonymous Coward · · Score: 0

      I remembered the maximum size for using bubble sort to be about 10 :) I haven't tested it though.

    8. Re:Somebody doesn't understand O notation... by jjoyce · · Score: 1

      Just to be nitpicky (although people make this mistake all the time), big Oh means asymptotic upper bound, not lower bound. When I want to argue the "slowness" of an algorithm, I need to provide an asymptotic lower bound with Omega. If f(n) is asymptotically bounded above and below by g(n), then we can say f(n) is Theta(g(n)).

    9. Re:Somebody doesn't understand O notation... by cakoose · · Score: 1

      I don't think the parent poster had a problem understanding big-Oh. He was just responding to this statement of yours:

      Therefore never use bubblesort.

      The situation he presented is one in which bubble sort would be preferable.

  123. Huh? by Theatetus · · Score: 2, Informative

    What are you talking about? I get paid to write open-source software. Where did you get the idea that open-source software is written entirely by volunteers?

    --
    All's true that is mistrusted
    1. Re:Huh? by Anonymous Coward · · Score: 0
      by Theatetus (521747) * on 04:37 AM May 6th, 2004 (#9072009)

      And that asterisk means that you pay for the same "open" content we volunteers read and write.

  124. Coding while blind by xyote · · Score: 4, Interesting

    The biggest problem I see with performance is lack of visibility of performance factors. At the hardware level there is cache (which is supposed to be transparent) and really deep pipelined processors. This can have a major effect on what would otherwise be an optimal algorithm. And the hardware keeps changing, so what may have been optimal at one point will become suboptimal later on.

    In software, the biggest problem is lack of performance directives. POSIX pthreads is one of the biggest offenders here. Best performance practices in pthreads are based on how common implementations work. POSIX allows implementations that would cause major performance problems for so-called best pthread programming practices. For example, POSIX allows pthread_cond_signal implementations to wake all waiting threads, not just one. There are programs that depend on pthread_cond_signal to wake only one thread for performance in order to avoid "thundering herd" problems. So while standards allow portability of correct programs, they do not necessarily allow portability of performance.
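
    The correctness half of that point can be sketched with Python's threading.Condition standing in for pthreads (an analogy, not the POSIX API). The hypothetical run_workers helper uses notify_all() to model the worst case a signal is allowed to cause, so every waiter re-checks the queue in a loop; the result stays correct, and the wasted wakeups are the performance cost the standard does not let you rule out.

```python
import threading
from collections import deque


def run_workers(num_workers, jobs):
    """Start num_workers threads that each take exactly one job."""
    cond = threading.Condition()
    queue = deque()
    done = []
    wakeups = [0]

    def worker():
        with cond:
            # Re-check the predicate after every wakeup: the wakeup alone
            # does not prove that a job is still available.
            while not queue:
                cond.wait()
                wakeups[0] += 1
            done.append(queue.popleft())

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job in jobs:
        with cond:
            queue.append(job)
            cond.notify_all()  # worst case: the whole herd wakes up
    for t in threads:
        t.join()
    return sorted(done), wakeups[0]


finished, wakeups = run_workers(3, ["a", "b", "c"])
print(finished)  # ['a', 'b', 'c']; the wakeup count varies from run to run
```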

    We need explicit performance directives.

  125. What About Windows? by Blackbird_Highway · · Score: 1

    If the hardware really has improved so much, shouldn't Windows now boot in like .001 seconds?

    --
    By the perception of illusion, we experience reality
  126. Those who code so no one can do their work... by Anonymous Coward · · Score: 2, Insightful
    I tell all programmers who work for me and let all who work with me know this:

    Given that the only reason to deliberately make it hard for others to understand your work is to increase your job security, that must mean that you don't think you bring enough other skills to the job to keep it on merit.

    In other words, you don't think you're good enough.

    And given that most programmers think they are better than they actually are, if you don't think you're good enough, why the hell should anyone else?

  127. Two quotes by CondeZer0 · · Score: 1

    A program that produces incorrect results twice as fast is infinitely slower.
    -- John Ousterhout

    The cheapest, fastest, and most reliable components are those that aren't there.
    -- Gordon Bell

    --
    "When in doubt, use brute force." Ken Thompson
  128. You're ignoring the "gotcha" by Theatetus · · Score: 4, Informative
    It doesn't matter how much hardware you throw at a problem if it needs to scale properly and you have an O(n^3) solution.

    Well, maybe you're not ignoring it since you said "if it needs to scale properly". But that's a very crucial "if", and the "scale properly" only refers to certain situations.

    If the array you need to sort might have several million members and you won't be sorting more than a few dozen of those arrays, yes you should use an O(n lg n) or whatever sort routine. OTOH, if the array itself is smaller (a few hundred members) but you have to sort several hundred thousand of them, quicksort or merge sort will be remarkably slow compared to the much-maligned bubble sort.

    Big-O notation is an asymptotic upper bound, not the function itself. For small datasets, not only is there no guarantee that the lower big-O algorithm will be faster, it's in fact usually the case that the allegedly "less efficient" algorithm will actually be faster.

    --
    All's true that is mistrusted
    1. Re:You're ignoring the "gotcha" by An+Onerous+Coward · · Score: 3, Informative

      A project I was doing last semester had just what you described: thousands of arrays of twenty members each. I was still able to double the performance by switching from bubblesort to quicksort. Besides, you never know when those arrays are going to get bumped up from a few hundred members to a few thousand.

      I'm still a firm believer in the principle that bubblesort is never the way to go.

      --

      You want the truthiness? You can't handle the truthiness!

    2. Re:You're ignoring the "gotcha" by Darth_Burrito · · Score: 1

      A long time ago, an algorithms class I took did a series of tests to try sorting by different methods. For small list sizes (~100), bubble sort was always faster. Since illustrating this point was the purpose of the assignment, it was my general understanding that this was an accepted fact. I could be mistaken, but it doesn't really matter. Bubble Wikipedia Entry

      My personal take on this kind of optimization is to mostly ignore it until you reach a point where it becomes important. Of course, I mostly write software that talks to databases, might be running distributed components, etc. Most of the large data set work gets done by the database engine. The vast majority of nontrivial optimizations are in reducing the number of network trips.

      Of course, even in what I do, I have met code that somehow managed to be so dreadfully awful that network trips weren't necessarily the major culprit. However, in those cases, it's often some pretty scary code written by an even scarier programmer (think 10,000 line procedures with 25 parameters).

    3. Re:You're ignoring the "gotcha" by gumbi+west · · Score: 1

      Bubble sort is actually very slow in all cases; check out this wonderful page on sorting.

    4. Re:You're ignoring the "gotcha" by Eivind · · Score: 1
      This is true, and well known. Although quicksort is O(n*log(n)) and bubble sort is O(n^2), that only describes how the average "number of steps" grows when sorting a large collection. It doesn't account for the fact that one "step" with bubblesort is smaller and thus quicker.

      I surprised my not-terribly-bright professor in algorithms-101 by claiming I'd improve on his quicksort by incorporating bubblesort.

      He found the idea ridiculous, and said so, but I persisted, saying that by incorporating bubblesort, I'd make the algorithm run faster for any size data-set.

      What I did was run quicksort with a cutoff value of around 20, and to bubble-sort if the set to be sorted is smaller than that.

      This means a small set will be only bubble-sorted, a large set will be quick-sorted until the recursive calls work on collections smaller than 20, at which point those get bubble-sorted.

      Fairly standard technique. It ran something like twice as fast as the standard quicksort, astonishing everyone who thought that Big-Oh notation is the be-all and end-all of performance.
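
      A minimal Python sketch of that hybrid follows. The cutoff of 20 is the figure from the post; the random pivot and Lomuto partition are assumptions made for this example, and whether it is actually faster on a given machine would need measuring.

```python
import random

CUTOFF = 20  # the post's "around 20"; the best value needs measuring


def bubble_sort(data, lo, hi):
    """Sort data[lo..hi] (inclusive) in place with bubble sort."""
    for end in range(hi, lo, -1):
        swapped = False
        for i in range(lo, end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:
            break


def hybrid_sort(data, lo=0, hi=None):
    """Quicksort that hands ranges smaller than CUTOFF to bubble sort."""
    if hi is None:
        hi = len(data) - 1
    if hi - lo + 1 < CUTOFF:
        bubble_sort(data, lo, hi)
        return
    # Lomuto partition around a randomly chosen pivot.
    p = random.randint(lo, hi)
    data[p], data[hi] = data[hi], data[p]
    pivot = data[hi]
    store = lo
    for i in range(lo, hi):
        if data[i] < pivot:
            data[i], data[store] = data[store], data[i]
            store += 1
    data[store], data[hi] = data[hi], data[store]
    hybrid_sort(data, lo, store - 1)
    hybrid_sort(data, store + 1, hi)


values = random.sample(range(10_000), 2_000)
hybrid_sort(values)
print(values == sorted(values))  # True
```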

  129. Re:You don't optimize, that's the job of the compi by zhenlin · · Score: 1

    A compiler has yet to figure out how to change a slow algorithm for a faster one, or a greedy algorithm for a conservative one etc.

    For instance, (iterative) bubblesort is relatively conservative in terms of memory usage, as opposed to recursive quicksort; however, bubblesort is usually slower than quicksort. The compiler does not yet know how to recognise bubblesort and replace it with quicksort where appropriate.

    Another example: Fibonacci numbers can be calculated recursively (SLOW), or iteratively (slow), or with an effectively O(1) closed-form formula: phi = (1+sqrt(5))/2; f[n] = ((phi^n) - ((-phi)^(-n)))/sqrt(5).
    But I doubt the compiler can do that kind of optimisation.
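
    For concreteness, here are the three variants as a Python sketch. The closed form uses floating point, so "effectively O(1)" assumes a constant-time power function, and the rounding is only trustworthy for moderate n (roughly up to n = 70 with double precision).

```python
import math


def fib_recursive(n):
    """Direct translation of the recurrence; exponential time."""
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n):
    """One pass over the sequence; linear time."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_closed_form(n):
    """Closed form from the post, rounded to the nearest integer."""
    phi = (1 + math.sqrt(5)) / 2
    return round((phi ** n - (-phi) ** (-n)) / math.sqrt(5))


print(fib_recursive(10), fib_iterative(10), fib_closed_form(10))  # 55 55 55
```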

    Remember, the compilers have yet to figure out intention. Without knowing intention, it cannot optimise very well. (Which is why higher level languages tend to be more optimisable -- the higher up you go, the more intention is encoded in to the code itself)

  130. What a load of.... by JFMulder · · Score: 1

    Seriously. Let's all code in Python, Java or Erlang. Then render stuff like the special effects in LOTR or composite scenes with 4k media with motion blur and a few keyers thrown into the mix. In realtime. Yeah, right...

    Performance DOES matter in a LOT of places.

  131. JIT optimization is just peephole optimization by Speare · · Score: 2, Insightful

    People keep saying that the JIT-style optimizers in .NET and Java can radically optimize the application "for programmers who can't or won't."

    Peephole optimization and clock-scheduling are among the simplest of optimization. The machine looks at a few low-level instructions and might suggest an alternative which would operate identically but with better performance. That's really all that the VM has time or capability to perform today.

    Mid-range optimizations include vectorizing, unrolling of loops, and register reduction. These are still machine-analyzable, so I expect the JIT-style optimizers to continue to make strides here.

    But I don't think you're ever going to see JIT-style optimizers which replace an O(n^2) algorithm with an O(log n) algorithm. That is real optimization. That's where you win the performance races. That's the one that programmers should care about, and should learn how to do. The level of analysis required to "divine" the whole meaning of a large routine, realize the alternative algorithm equivalent, and fix up the code is far beyond any JIT solution.

    I think we will have to wait far longer than the 6 GHz Longhorn machines before you see any meaningful machine optimization of sloppy code.

    --
    [ .sig file not found ]
    1. Re:JIT optimization is just peephole optimization by Surt · · Score: 1

      I don't think it's as far out as you might imagine. For example, a JIT could monitor your object storage access patterns. Suppose it detects that you are using a linked list when your access pattern would be better served by a tree. It could then replace all instances of LinkedList with TreeMap.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
  132. Re:You don't optimize, that's the job of the compi by ultranova · · Score: 1

    Actually, wouldn't the clear and simple way to code this be to use a single "switch" statement and not twenty "if" statements ?

    --

    Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

  133. small is beautiful by lexluther · · Score: 1

    The title of this essay is based on the title of an out-of-print book by Nathaniel S. Borenstein, Programming as if People Mattered: Friendly Programs, Software Engineering, and Other Noble Delusions

    Which is actually a play on E.F. Schumacher's seminal work. Small is Beautiful: Economics as if people mattered Which probably should be required reading for all /.ers .

  134. thanks! by zogger · · Score: 1

    Joe User here, who lurks on devel to try the knowledge osmosis absorption technique..

    Speaking as an antique box driver,bleeding edge of 5-6 year old technology when I'm lucky, I REALLY appreciate apps that can run well* on older machines. To ME, always thinking of the end users of the code is vital in development. I wish it was the complete industry standard, even if it meant some more days work before release.

    *define "well" OK, to me = lowest ram usage,(most machines shipped in the past buncha years never shipped with all the ram slots filled, and they STAY that way) lowest CPU usage (how come older machines did a lot of the same stuff you want to do now, but now you need CPUs orders of magnitude faster to do it? Huh?), stability (no leaks/conflicts), security (no idea what this thoroughly e-vile "buffer" is, but that thing always seem to be overflowing whenever you hear of the latest security exploit. Wazzup with that? Someone needs to invent the dang buffer valve and turn that thing off once your sink is filled with enough buffer whatsis stuff so it don't overflow. I am assuming that is what all these emergency "patches" do whenever there's a new exploit. Uh, patch em in advance please)

    signed Joe User

  135. Simplified version of the article by Anonymous Coward · · Score: 0

    Carefully examining a problem will lead to efficient ways to solve it. This even works with higher level languages and those languages provide important benefits that make them the natural choice for solving most problems.

    My favorite example of this is a program that I wrote for a VIC20. (6502 and 4k of memory) The program solved a statistical problem known as m choose n. This involved large factorials. Naturally, 100! can't be solved directly on such a machine so I found a way around it. Since this solved a problem we were having with a piece of equipment, I sent the program and the results to head office. They sent it to the supplier of the equipment. The supplier looked at the equation but not the program. They plugged the equation into their mainframe which promptly crashed on overflow.
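
    The post does not say which workaround was used. One common way to avoid the large factorials, sketched here in Python as an assumption rather than the poster's actual method, is to multiply and divide alternately so that every intermediate value is itself a smaller binomial coefficient.

```python
def choose(m, n):
    """Return m choose n without ever forming m!."""
    if n < 0 or n > m:
        return 0
    n = min(n, m - n)  # symmetry keeps the loop short
    result = 1
    for i in range(1, n + 1):
        # After this step, result == C(m - n + i, i), always a whole number.
        result = result * (m - n + i) // i
    return result


print(choose(100, 3))  # 161700
```

    Python integers never overflow, so the benefit here is only illustrative; on a machine with small fixed-size numbers, keeping the intermediates small is the whole point.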

    The article teaches a lesson that many programmers never understand.

  136. Ah, but they have improved by JeanBaptiste · · Score: 1

    "a sad inditement "

    their spellchecker is much more comprehensive with the later releases than ms word 4...

  137. Re:You don't optimize, that's the job of the compi by Anonymous Coward · · Score: 0

    There are things to be said about writing good code versus well-optimized code.
    Many of us today need to support more than one platform (and more than one type of CPU), which means for example, different memory architectures, different instruction sets, and different CPU capabilities.

    It is possible to hand tune source code such that it will be the most optimal for one platform. However, the resulting source code would most likely not be optimal for other platforms. Think of a loop tiling optimization - one that divides the iteration-space of a multi-level nest into smaller "tiles". Choosing optimal tile sizes and tiling "depth" depends on the memory architecture aimed for. Some platforms provide O/S means to query the cache sizes, etc., and the decision can be delayed to run-time, and thus be made more accurate, but some changes cannot be delayed, such as the tiling depth (i.e. the depth of the tiling hierarchy), which means that the code would need to be versioned for different target memory architectures. A compiler optimizer can do that easily, since it is the same code, only in "slightly" different forms. Versioning the code might then expose some additional optimization opportunities that are specific for each version, etc. The user, on the other hand, would have to maintain multiple copies, or apply an extremely creative use of macros, to achieve the same goal, which would only result in unmaintainable (well, almost...) code.

    There are many more examples of target dependent tuning and optimization that may change from target to target (think of unrolling, outer loop unrolling, register allocation - as someone mentioned in a previous post, instruction scheduling, etc.).

    My point is, the platform specific compilers should take responsibility for tuning for the specific platform.

    An optimizer should not replace a bad algorithm for a good one. That should be the programmer's responsibility. The programmer's code should be clean, efficient (at his abstraction level) and easy to maintain. Let the compiler do the job of making that code run best on its targeted platform. Use higher optimization levels if necessary, giving the optimizer more time for analysis and transformations. The end result may be well worth it :)

    With all this in mind, compilers have their limits, and there are differences between optimizers (and optimization levels) on different target platforms. That really depends on the application and the type of optimizations required to make it run optimally.

  138. De-commenting by RogL · · Score: 4, Interesting

    Back in the late '80s, early '90s, worked on (DOS-based) commercial products written in APL. APL is an interpreted language, written in Greek & math symbols, some overstrikes.

    Each byte of that 640K was precious, so it was common practice to "de-comment" the code before release; remove all comments, reduce whitespace, move multiple statements onto a single line, possibly shorten variable names. You could gain a substantial (for the time) amount of memory that way. You also dynamically imported/destroyed functions.

    I regularly debugged client systems with "de-commented" APL; if you could read that, you could read anything!

    1. Re:De-commenting by Anonymous Coward · · Score: 0

      If you can read de-commented APL, you can read minds!

      I used Sharp APL on a mainframe system and loved its power and flexibility. Functional languages are so cool. But APL is definitely NOT easy to read.

  139. Why the article shot itself in the foot a bit by Anonymous+Brave+Guy · · Score: 1
    However, I will dispute the claim that performance gains happen only at the hardware level - although programmers cannot really optimize every tiny bit, there is no harm in encouraging good programming.

    Absolutely. Serious performance tuning is a difficult task, and it's foolish to attempt it without good information to work from, hence the usual recommendations to use profilers, disassemblers, etc. That doesn't mean a competent developer shouldn't be aware of the issues, and act on them if appropriate, though.

    Actually, I think the article shot itself in the foot slightly, by undermining its own argument with its own information. Sure, the performance of its example application improved by around an order of magnitude using newer hardware. Sure, it was written in a language not designed for high performance work. But the example was a trivial manipulation of around 1GB of data, and the program was getting that done a whole 60 or so times in a second on a 3GHz beast? That's pathetic performance, and seems a pretty damn good argument that either low level hackery in something like C or assembler, or choosing a high level language with good performance like OCaml, are still the best ways to go if performance really matters.

    The article also suggests that if you can do those 60 manipulations a second, you're up to modern video game standards. Somebody should mention to the author what's actually involved in rendering a state-of-the-art FPS, because it's a bit more than a trivial display of a graphic file every 1/60 of a second! :-) (Yeah, OK, everybody's got fancy accelerated graphics cards today, but still...)

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  140. Videogame Load Times... by PenguiN42 · · Score: 1

    Hm.

    The next time you're pulling your hair out because your new videogame takes 20 minutes to load every level, ask yourself again if 0.176 seconds per texture "seems quick enough" (and double-check to make sure your game isn't shipping with an Erlang interpreter...)

    Or hell, just buy a shiny new 3Ghz system -- how dare you make the programmer's lives more difficult by expecting their code to run decently on anything less?

    --
    The following sentence is true. The preceding sentence was false.
  141. An O(n) optimization by Latent+Heat · · Score: 1
    OK, here's a question for you.

    I have two widgets on the screen that give two different "views" of the result of a mathematical calculation on a model. Each widget runs the same mathematical calculation on the model.

    Refactoring by moving the mathematical calculation into the model would require adding a large amount of storage to hold the calculation result. Benchmarks indicate the saving in time for a screen refresh would be from 2 seconds down to 1.5 seconds (P-III 1.2 GHz). The screen refresh is broken down into steps so the UI is not frozen during the 2 seconds, allowing the user to interrupt the refresh to do anything else.

    Do I bother with this optimization?

    1. Re:An O(n) optimization by Wumpus · · Score: 1

      Yes, but not because of the small speedup. (Going from 2 seconds to 1.5 seconds won't be very obvious - if it isn't twice as fast, the difference is lost on most users.)

      The refactoring you're proposing, however, makes it easier to change the code later, since you don't have to worry about modifying the view widget and breaking the calculation. The results are computed in one place, making maintenance easier. It's the right thing to do, and being faster is just a bonus.
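
      A minimal Python sketch of that shape, with a hypothetical Model class and a stand-in calculation (squaring each value) in place of the real one:

```python
class Model:
    """Owns the data and the one shared calculation over it."""

    def __init__(self, values):
        self._values = list(values)
        self._result = None
        self.calculations = 0  # how often the calculation actually ran

    def set_values(self, values):
        """Change the data and drop the stale cached result."""
        self._values = list(values)
        self._result = None

    def result(self):
        """Return the calculation, computing it only when needed."""
        if self._result is None:
            self.calculations += 1
            self._result = [v * v for v in self._values]  # stand-in calculation
        return self._result


model = Model([1, 2, 3])
view_a = model.result()    # first widget triggers the calculation
view_b = model.result()    # second widget reuses the stored result
print(model.calculations)  # 1
```

      The stored result is the extra storage the question mentions; the views no longer contain any calculation code of their own.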

  142. All your optimizations are wrong. by scorp1us · · Score: 4, Interesting
    If you spend hours tweaking code to eliminate a few instructions, even instructions in a loop, then you are just wasting your time.


    Real optimizations come before you write your program. Take for example that loop that you removed an instruction or two from. Say it is searching an array, and looks like:

    for (i=0; i<strlen(x); i++){
    if (x[i]=='&') ;
    }


    There are two things wrong. One, you call strlen repeatedly. strlen() is theta(n), so you have a loop that executes n times at a cost of n each: n*n = n^2. That's one of the slowest algorithms around. Maybe your compiler is smart enough to see that x is not being modified and will do s=strlen(x); and then compare against s for you, but probably not.
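
    Python's own len() is constant time, so the sketch below models C's strlen with a hypothetical c_strlen that walks the bytes and counts how many characters it inspects. The loop bodies count '&' characters, which is an assumption; the loop above leaves its body empty.

```python
def c_strlen(buf, counter):
    """Model of C's strlen: walk the bytes up to the terminating NUL."""
    n = 0
    while buf[n] != 0:
        n += 1
    counter[0] += n  # characters inspected by this call
    return n


def count_amp_slow(buf):
    """Re-measures the string on every loop test, as in the loop above."""
    counter = [0]
    hits = i = 0
    while i < c_strlen(buf, counter):
        if buf[i] == ord("&"):
            hits += 1
        i += 1
    return hits, counter[0]


def count_amp_fast(buf):
    """Measures the string once, before the loop starts."""
    counter = [0]
    length = c_strlen(buf, counter)
    hits = sum(1 for i in range(length) if buf[i] == ord("&"))
    return hits, counter[0]


text = b"x" * 99 + b"&" + b"\0"  # 100 characters plus the NUL
print(count_amp_slow(text))      # (1, 10100)
print(count_amp_fast(text))      # (1, 100)
```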


    The other thing is when searching an array, try to give it structure. If your array contains sorted characters, then you can find it in log_2(n). Of course, if you sort by frequency (most commonly accessed at the top) then your n^2 loop *might* do better.
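
    A small Python sketch of the sorted case, using the standard bisect module; contains_sorted is a name made up for this example.

```python
from bisect import bisect_left


def contains_sorted(sorted_items, target):
    """Binary search: about log2(n) probes instead of up to n."""
    i = bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


letters = sorted("performance")
print(contains_sorted(letters, "m"))  # True
print(contains_sorted(letters, "z"))  # False
```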


    The article is right: constant-time operations (type checking, etc.) are asymptotically infinitesimal in algorithms. The article's real problem is that it is n, but on a 2d image (x*y)=n you can't do any better. Note that it is not n^2 (though it makes a square picture), because you're operating on pixels. So that will be your unit of measure - NOT time.


    Which is my 2nd point. Don't measure in time or instructions. Measure in OPERATIONS. Operations are not instructions or lines of code. An operation is every time you have to look at a unit. It is a logical unit of cost. Hardware can be changed out. We know that hardware (performance) doubles every 18 months. The constant-time instructions will get smaller. (Clocks per cycle are irrelevant as well.) But your loops will remain the biggest part of your program.


    With that, here's a crash course in CS:

    1. Loops are most of (time, operations) the program. Try not to use them.
    2. To avoid loops, structure your data. Giving structure means assumptions, and assumptions means you can skip irrelevant sections of data.
    3. Determine your dataset, minimize worst-case occurrences. Find out what order of data or instructions will make your n*log2(n) algorithm become n^2. Then find a way around it.
    4. Optimize for the average case. That is, if you never sort more than 6 numbers at a time, an n^2 will beat an n*log_2(n) algorithm.
    5. If your data structure introduces overhead (most will), find your most common or costly operation. Optimize your data structure for that (searching, sorting, etc). If you do a combination, determine the ratio and optimize for that. The cost of overhead is usually small compared to the reason why you're using a data structure to speed up your common operation.
    6. The most obvious and easiest to code algorithm is the slowest. (Bubble sort vs. Radix or quick-sort)
    7. Mastery of the above is the difference between a $50k programmer and a $90K programmer.


      To learn more, take a datastructures class at your local university. Please review Calculus II before doing that though.

    --
    Slashdot's rate-of-post filter: Preventing you from posting too many great ideas at once.
    1. Re:All your optimizations are wrong. by biobogonics · · Score: 1

      [snip points 1-5]

      6. The most obvious and easiest to code algorithm is the slowest. (Bubble sort vs. Radix or quick-sort)

      7. Mastery of the above is the difference between a $50k programmer and a $90K programmer.


      Even better is to match the solution to the problem.

      1. When I need to write a sorting routine in a programming language as part of an application, as opposed to writing a routine for a library call, I usually use a comb sort instead of a quick sort. Speeding up the sort by a factor of 2 is less important than having code I can understand or I can write at 3 AM and get right.

      2. I frequently tabulate the number of people on a mailing list by zip code. After exporting the data as text, I run it through a simple AWK program:

      {a[$0]++}
      END{for(i in a)print i,a[i]}

      This is fast enough on a few thousand items even on an original PC-XT and I can write it in my sleep.
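
      For readers who don't know AWK: the program counts how many times each whole input line occurs, then prints every distinct line with its count. A rough Python equivalent, with a made-up tabulate function and sample zip codes:

```python
from collections import Counter


def tabulate(lines):
    """Count how many times each whole line (here, a zip code) appears."""
    return Counter(line.rstrip("\n") for line in lines)


counts = tabulate(["90210\n", "10001\n", "90210\n"])
for zip_code, n in counts.items():
    print(zip_code, n)
```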

      3. Shuffling a deck of cards is easy, even in BASIC:

      10 RANDOMIZE TIMER
      20 DIM C(52)
      30 FOR I=1 TO 52
      40 C(I)=I
      50 NEXT I
      60 FOR I=52 TO 1 STEP -1
      70 SWAP C(I),C(INT(1+I*RND))
      80 NEXT I
      90 FOR I=1 TO 52
      100 PRINT C(I);
      110 NEXT I

    2. Re:All your optimizations are wrong. by scorp1us · · Score: 1

      Reasons why you don't make $90k.
      1) If you don't understand a simple sort...
      2) Lines 60-80 are not even needed AT ALL.

      --
      Slashdot's rate-of-post filter: Preventing you from posting too many great ideas at once.
  143. Re:You don't optimize, that's the job of the compi by Anonymous+Brave+Guy · · Score: 1
    If you write clear and simple code the compiler or interpreter does all the other work.

    That will never be completely true, for the simple reason that the programmer will usually know more about the problem than will be expressed completely in the clear and simple code.

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  144. maybe depends on your product category by zogger · · Score: 1

    If for example you are developing scientific/engineering/ professional audio/visual creation industry apps,etc, or games, ya, your end users are probably always getting newer faster machines. But the hardware side of the industry is a real severe slowdown on new machines, adoption peaked a few years ago from what I have read (and noticed in meatworld). What I am saying is, for developers and their buddies and their peers at work, high end machines are the norm, that's what they are used to and have *forgotten* that they are the exception, not the rule. For 90% of the rest of the world, upgrading hardware every 6 months to a year doesn't happen, and the trend for people at home and a ton of businesses is to hang on to hardware for a longer time now,years longer in a lot of cases, upgrades are being postponed a lot.

    In the US, last I looked, median family income is around 45 grand or a skosh more. That's FAMILY income, in a lot of cases that's 2 paychecks combined. People who are making a lot more than that with one paycheck tend to forget they aren't the norm either, and usually they hang out with other people closer to their economic level, and their consumer purchasing choices tend to be vastly different from the rest of the people. A side note, but that is also something the music/video industry forgets when it wonders why people aren't as fast to snap up expensive CDs as they used to be. Same with a lot of other industries, you can't look at it with blinders on. People are having a harder time now paying necessary bills, and that goes up and down and sideways throughout all industries. It's why we have interest rates so low, because people have maxed out credit already, and why bankruptcies and mortgage defaults are at all time highs, etc. Developers (I mean the sales aspect of software, obviously), who are quite specialised into a niche way of thinking and doing business, maybe are not noticing these things, which might be a reason why it's getting harder to sell stuff. Maybe, but I bet it's part of it. It's all connected, chaos theory and whatnot.

    Mindshare is both an immediate and long term process. Longer -term thinking will get you more loyal customers *for a longer time* for your apps as people are happy to use ones that still run well on what hardware they have. They'll remember that your product, written for THEM and their hardware worked well. Cost is BOTH, if a new program actually requires an additional one thousand dollars or more in new hardware just to run, you might actually LOSE mindshare as people think about it and go "no thanks", even if the program itself is just totally spiffy.

    Just wondering, but I wonder what a median set of hardware specs is now for the millions and millions of machines out there still being used. CPU in the medium pentium II class, and 64 megs RAM? That would be my guess. Isn't win98 still the dominant OS that shows up in most website logs? That would be a clue as to what machines for specs are the bulk of this "the market" thing, VERY broadly speaking taking my original point into consideration on types of apps.

  145. Depends what you use it for by Anonymous Coward · · Score: 0

    The performance question often depends on what you're actually doing. I think it's safe to say that some specific applications (like games) will always need performance. I guarantee you a modern 3D game in Erlang would never be able to compete with one written in C++. Sure, in absolute terms it might be able to do a decent job, but games are always judged relative to other games, and if even one person can create really fast code by using C++, everyone else is obliged to do the same to keep up.

    Also, I've always been a little distressed at the trend towards bloated programs. Maybe that's a holdover from my days of DOS game programming where you really needed to consider every clock cycle, but I think it still matters in a modern context. Sure, maybe that TGA loader can load 60 images a second. But was the author of that article listening to music, browsing Slashdot, downloading files, IMing with friends, protecting his computer with a firewall and virus scanner, etc. etc. AT THE SAME TIME? Probably not. I've read articles similar to that one, and all of them seem to ignore the fact that, as a programmer, you don't have the full 3GHz of the machine at your disposal. Even with games and other such things that fully grab the user's attention, I wouldn't rely on having more than 75% of the actual CPU available. And for the kinds of applications we run on a daily basis (web browsers), I'd plan for much lower than that. So increasing resources doesn't translate into an automatic license to bloat up code to horrific levels as so many people appear to be fond of doing.

  146. You mean performance *matters*? by jwpacker · · Score: 1

    A decade ago, when I was graduating, the CS courses I took were all of the same mind: don't optimize, don't bother writing hairy processes in assembly to speed things along - the computer manufacturers will just make faster machines with larger drives and more memory...

    Ten years ago. I'm guessing things aren't much different anymore...they've just added a new level for client/server, that they'll eventually get faster and faster networking.

    --
    Software is like a goldfish - it'll grow to fit the size of its bowl...
  147. twisted crap by Anonymous Coward · · Score: 0

    This article doesn't say anything useful.

    It says: I wrote a (not typically used benchmark) in erlang and ran it on a slow computer, then I ran it on a fast computer. The benchmark ran faster on a fast computer! Amazing!

    It says: Virtual machine languages with extra features like erlang has can perform well! Despite the fact that he didn't compare the performance to any other language.

    Somebody needs to teach this guy about the scientific method.

  148. But who'll want to improve it? by Chemisor · · Score: 1

    If you are working on OSS, quality of the result is more important than the readability of the code. That's because if your application sucks, nobody will want to improve it. They'll just start from scratch, saying "what moron wrote this? I bet there isn't a single line of good code in there."

  149. This is a well-known problem, by warrax_666 · · Score: 2, Insightful

    especially in databases where the data set that you have to sort is often so big that it doesn't fit into memory. The (usual) solution is to use a variation on the well-known Merge Sort algorithm, where blocks are merged into larger and larger "runs" of sorted data (which are then merged). (The number of runs of course depends on how much data there is and how much memory you have).
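    A minimal sketch of that idea in Python, under some loud assumptions: the data is a stream of integers, each sorted run goes to its own temporary file, and the hypothetical run_size parameter stands in for "how much memory you have". It does a single k-way merge of all runs with heapq.merge rather than the repeated merging into larger and larger runs described above, which a real database would use once there are too many runs to merge at once.

```python
import heapq
import os
import tempfile


def external_sort(values, run_size):
    """Yield the ints in `values` in sorted order, holding at most
    `run_size` unsorted items in memory at any time."""
    run_paths = []

    def flush(run):
        # Sort one memory-sized block and write it out as a sorted run.
        run.sort()
        fd, path = tempfile.mkstemp(suffix=".run")
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{v}\n" for v in run)
        run_paths.append(path)

    run = []
    for v in values:
        run.append(v)
        if len(run) >= run_size:
            flush(run)
            run = []
    if run:
        flush(run)

    files = [open(p) for p in run_paths]
    try:
        # Each stream reads its run lazily, one line at a time;
        # heapq.merge keeps only one pending item per run in memory.
        streams = [(int(line) for line in f) for f in files]
        yield from heapq.merge(*streams)
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)
```

    With run_size=3 and seven input values this writes three runs and merges them in one pass; the number of runs is just the input size divided by run_size, rounded up.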

    --
    HAND.
  150. one other thing by Anonymous Coward · · Score: 0

    I guess no one else picked up on it, but 66 small targa files decoded per second is dead slow. His only possible hope for vindication is if random access to his hard drive took up most of the time. The author seems a prime example of this.

  151. Better compilers, clean code by Goodbyte · · Score: 2, Insightful

    I don't know Erlang, but if it is a pure functional language, the compiler/interpreter can use "special" optimizations, e.g.

    decode_rgb(Pixels) ->
    list_to_binary(decode_rgb1(binary_to_list(Pixels))).
    will not produce intermediate lists, instead the compiler will use lazy evaluation when decoding the data.

    My point is that many optimizations do not sacrifice readability. Many times it is possible to refactor slow code that improves both readability and execution speed, but you must know the pros and cons of the tools you are using!

  152. Re:Funny how. .. by Bastian · · Score: 1

    Funny how the opposite is usually the case. Best example in my experience: FOSS libraries like oh, I don't know, GTK+ 1.2. The story generally goes like this: Start working with the libraries. Read the documentation. 10 minutes later, realize that the documentation is written for an old crufty 2-year-old version of the library that doesn't really work the same way as the current version, and that it wasn't even halfway complete for that version, either. Get annoyed, but start reading the source code. Realize that the people responsible for this project are huge fans of 'clever hacks' and have religious problems with commenting. Give up, go back to VisualStudio.NET.

  153. This is backward thinking with no vision by mrnick · · Score: 1

    If you read this article, it advocates not worrying about writing efficient code because advancements in hardware will cover for your incompetence. This is happening every day! Every year code gets more and more bloated and programmers become more and more sloppy. Who cares, we will have 10GHz CPUs soon, right? This makes me crazy! If humanity's software developing skills progressed at the same speed as hardware we would have sentient thinking computers right now.

    I can't wait until Moore's Law breaks and we reach a hard barrier on computing speed. Right now we have far faster computers than anyone actually needs. I think with proper coding we could be running all our current applications without performance degradation on 386 class systems. If the hardware crutch is removed maybe people will start focusing on writing quality optimized code.

    Nick Powers

    --

    Encryption: I may not agree with what you say, but I will defend your right to encrypt it...
    1. Re:This is backward thinking with no vision by GnuVince · · Score: 1

      C and C++ programmers can't seem to write a single large application that does not have at least one buffer overflow security hole. We don't need that, especially now that more and more stuff is done on the internet such as banking.

    2. Re:This is backward thinking with no vision by Anonymous Coward · · Score: 1, Interesting
      If humanity's software developing skills progressed at the same speed as hardware we would have sentient thinking computers right now.

      You're a bit too early to be making that kind of statement. Try again in 15-20 years. Computers are still THAT far behind the processing/memory capabilities of a human.

  154. It's a trick! by Anonymous Coward · · Score: 0

    The images he's decoding are repetitive.
    He's using Erlang's built-in hash table functionality to cache the decoding of particular runs of data. His implementation will almost certainly NOT be fast enough for all images (e.g. not for photographs).

    Plus, you get the sense of how much work it was to identify and implement these so called "high level" optimizations. Surely it would have been easier just to use C (or Java, for that matter, since it has more straightforward data representation and JIT compilation) and the resulting code would have been simpler and much more useful.

  155. I don't think you thought that out too well... by ta+bu+shi+da+yu · · Score: 1, Funny

    "Let only the best programmers reproduce". Hmm. The best programmers are people like Richard Stallman. Well, there goes that plan for increasing overall programming quality.

    --
    XML is like violence. If it doesn't solve the problem, use more.
  156. Ummmm... by Rufus88 · · Score: 2, Insightful

    Were they C++ programs, or 20 year old shell scripts?

  157. Intel's VTune is your friend. by yecrom2 · · Score: 3, Informative

    We were introduced to vtune during a 2-week trip to Intel. Profilers are good. vtune is the best one that I've found.

    The way that we use it is to not even touch it until we have the feature completely working in the simplest form possible. Then we do some performance testing. If everything works well under load, we don't even bother profiling it. Otherwise we run it in vtune and see what the bottleneck is. 90% of the time, there is some type of minor oversight. Occasionally, there is an algorithmic change that needs to take place, like adding a secondary index to something, or making some temporaries thread-local.

    We run both event-counters and call-tracing, but I've found that call-tracing is far more accurate. The best use of VTune is to smite arrogant developers. The result of our trip to Intel was that one of our developers, who had to write everything from scratch, was shown that all of his "high performance components" were completely worthless.

    Just my $0.02.

    Matt

  158. Funny, but not so funny by Rufus88 · · Score: 1

    Funny, but that's not entirely unlike a non-deterministic algorithm. Just get rid of the loop. When quantum-computing becomes commonplace, this may become the way sorting is actually done.

  159. what's the score by dutky · · Score: 2, Insightful
    This guy, for some reason only vaguely defined, wants to demonstrate something about computer performance and the need (or lack thereof) for optimization. So, rather than use any of a large number of established benchmarks, he pulls a targa graphics file decoder out of his ass. This is the justification he gives:

    It gets away from tired old benchmarks involving prime numbers and replaces them with something more concrete and useful. It also involves a lot of data: 576x576 24-bit pixels, for a total of 995,328 bytes. That's enough raw data processing to require performance-oriented coding, and not some pretty but unrealistic approach.

    This despite the fact that nobody (other than him) seems to be interested in targa graphics these days, and that the total amount of data involved (less than a megabyte) is minuscule compared to limiting bandwidths in modern computers (ranging from ~100 MB/sec for the disk I/O subsystem, ~500 MB/sec for the main memory, and in the multiple GB/sec for the caches and internal registers).

    Then he shows us just what a wonderful benchmark this program is: it is so wonderful that it runs nearly instantaneously! Maybe it's just me, but I like my benchmarks to take a little while to complete, either because I don't trust the sub-second accuracy of the system time routines, or because I like to get a reasonable sample of system states contributing to the overall performance of the benchmark.

    Next, the guy tells us how he used unsatisfactory tools to implement an ill-conceived algorithm and, glory-glory, later fixed his own dumb-ass mistakes!

    Finally, he claims that he would not have been able to make the same kinds of algorithmic adjustments if he had implemented in C rather than Erlang, though it is not obvious why it would have been more difficult (and he doesn't give any argument to support his assertion).

    From this exercise he concludes that, GASP, optimization is still important!

    What a senseless waste of skin.

  160. Optimizing from intent rather than algorithm by Baldrson · · Score: 1
    It is much easier to optimize from a high level statement of intent than an algorithm simply because it is easier to derive algorithms from intent than it is to derive intent from algorithms.

    This is the main reason compiling to intermediate "virtual machines" (JVM, Parrot etc) is a bad strategy.

    A better strategy, which has been around since the days of the first HP-UX systems (under the HP BASIC incremental compiler), is to dynamically compile and keep track of source code dependencies so as to be able to void cached compiled machine code and recompile as necessary. This allows you to trade compile time against run time. Once you can trade those two off against each other you can run background recompilation tasks that take into account constraints of increasingly-global scope imposed by the source code. If you have a set of source that sits around idle for a long time you can end up with highly optimized machine code -- but only if your source starts out at a high enough level to retain the intent rather than specifying algorithms. Human intervention to assist the compiler should be in terms of pragmas stated simply as theorems the human believes to be provable from the source code.

    This all looks doable within predicate calculus as a formalism (with appropriate syntactic sugars as needed).

  161. Moving up the ladder by Tomster · · Score: 1

    When I wrote code for the 8086 (yes, the 4.77MHz CPU), it made sense to drop down into assembly on a regular basis so I could extract every cycle of performance out of a bit of code. This was roughly a .33 MIPS chip with 29,000 transistors. Shaving 10 cycles off a loop that executed 100,000 times saved 1,000,000 cycles -- close to a quarter of a second.

    Today's desktop CPUs are approaching 10,000 MIPS, running at 4GHz. In simplistic terms, this is about 30,000 times the performance of an 8086. If I did my math right, saving those 1 million cycles now improves performance of that loop by .00025 seconds.
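    To check that arithmetic, here is a small Python sketch. It only divides cycles saved by clock rate, which is a simplifying assumption: it treats a saved cycle as one clock tick on both machines and ignores that a modern CPU retires several instructions per tick. The function name seconds_saved is made up for the illustration.

```python
# 10 cycles shaved off each of 100,000 loop iterations.
CYCLES_SAVED = 10 * 100_000


def seconds_saved(cycles, clock_hz):
    """Wall-clock time represented by `cycles` at a given clock rate."""
    return cycles / clock_hz


old = seconds_saved(CYCLES_SAVED, 4.77e6)  # 8086 at 4.77 MHz
new = seconds_saved(CYCLES_SAVED, 4e9)     # desktop CPU at 4 GHz

print(f"8086 at 4.77 MHz: {old:.3f} s")
print(f"4 GHz CPU:        {new:.5f} s")
```

    The first figure comes out a little over a fifth of a second and the second at a quarter of a millisecond, which matches the numbers above.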

    There are still applications which are compute-bound. But for typical applications, the type of performance optimizations that used to make sense are a waste of time today.

  162. Re:Funny how. .. by Panaflex · · Score: 1

    Windows.Forms in .net is simply a wrapper for win32... and win32 is as much or more of a hack job than gtk+.

    If you're looking for beautiful code, check out QT.. truly a beautiful C++ implementation...

    Pan

    --
    I said no... but I missed and it came out yes.
  163. You are mistaken about hardware generations by AHumbleOpinion · · Score: 1

    Not only that, the processors that you thought you would be targetting are already a generation behind and that algorithm that was going to hold back your competition runs perfectly fast on new processors.

    I agree that algorithms are the primary place that optimizations should occur, that premature optimization is wasteful, and that for most applications the minor gains are not worth it. That said, your comment above is ill-informed. Optimizations done for one architecture are not necessarily counterproductive on the next. For example I performed some Pentium UV pipe and Pentium FP stack optimizations on some core 3D math code. That code is still faster than compiler-optimized code targeting the Pentium 4, as it was for Pentium III, Pentium II, and Pentium Pro. The performance improvement is now around 7%. I would not bother writing the assembly language code today, but four architecture generations later it is still a win. YMMV.

    In addition, "new processors" are often irrelevant. Few people have them and the older systems are the ones that need the performance tuning to reach acceptable levels of performance. There is value in enabling your product to reach back one more processor generation.

    Any tendency to optimize prematurely ought to be avoided, at least until after v1.0 ships.

    Again, ill-informed, unless you believe in releasing 1.0's as-is and that 1.0's will be immediately followed with 1.01's and 1.02's that were what 1.0 should have been. I understand that patches happen and that no matter how good the QA on a product, customers will probably find new and creative ways to use a tool and probably find a flaw. I just object to planning on having 1.0x's to finish a product. Your statement seems to imply the latter. The proper time to perform optimization is after a piece of code has been identified as a bottleneck, after the algorithms have been thoroughly reviewed, and probably after the rest of the product is feature complete.

  164. You missed a class by multipartmixed · · Score: 1

    There are *three* kinds of people in a dev shop:

    - Programmers
    - Transcribers
    - Optimizers

    Programmers come up with algorithms.
    Transcribers turn algorithms into code.
    Optimizers make a given algorithm perform its best.

    The author of the article is suggesting that we eliminate optimizers, since he claims they are the least important.

    Some people are good programmers. Many of these are good transcribers. Some of these are good optimizers. The people that are all three, and who know when to wear which hat, are what I like to have working for me.

    A lousy programmer tasked with a programming job will come up with a lousy algorithm. Often, he'll be able to transcribe it successfully and optimize the hell out of it. This type of program will almost always be slow.

    Here's a stunning example. My highschool had a bunch of Turbo 8086 machines (8MHz, whoo hoo!). My highschool CS teacher wanted to teach us about "optimization", so he challenged us to write a program to factor large numbers. He "optimized" his program by re-writing it in assembly, and showed us how it was faster than our BASIC programs, even when the assembled version was running on the old IBM box (4.77 MHz).

    I "optimized" my program by using a better algorithm. (Only checking up to the square root. The stupid fool was checking every freakin' number!). His program was indeed very fast for small N, but mine *smoked* his for large values of N.
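    For anyone who hasn't seen the square root trick, a minimal Python sketch follows. It is not the original BASIC program, just plain trial division with the same bound: once d * d exceeds what is left of n, the remainder cannot have a divisor other than itself, so it must be prime.

```python
def factor(n):
    """Return the prime factors of n (n >= 2) in ascending order."""
    factors = []
    d = 2
    # Only test divisors up to sqrt(n); n shrinks as factors are removed,
    # so the bound tightens as we go.
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever is left has no divisor <= its square root, so it is prime.
        factors.append(n)
    return factors
```

    For a prime near 2^31 this runs about 46,000 loop iterations instead of two billion, which is the whole difference between the two programs for large N.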

    He gave me an F because I "cheated" by not checking every number.

    I played Budokan for the rest of the year.

    --

    Do daemons dream of electric sleep()?
  165. My TGA Reader by MuValas · · Score: 1

    I already have a TGA reader that doesn't do quite the same thing as the essay's reader, but probably comparable. It reads the various headers, makes sure the tga file is the type I like (32 bit the last byte being alpha), reads it in, steps through all the bytes and swaps the red and blue bytes (tga is bgra, I want rgba, if I remember correctly). The code uses C++, does no optimization other than what .net 2003 pro does in standard "release" build, and was written by me in an afternoon a few years back. It has performed flawlessly since written.
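    The red/blue swap step is small enough to sketch. This is Python rather than the C++ described above, and it is only an illustration of the byte shuffle, assuming tightly packed 4-byte BGRA pixels; the function name bgra_to_rgba is made up.

```python
def bgra_to_rgba(pixels):
    """Swap the blue and red channels of packed 4-byte BGRA pixel data."""
    if len(pixels) % 4 != 0:
        raise ValueError("pixel data must be a whole number of 4-byte pixels")
    out = bytearray(pixels)
    # Bytes 0, 4, 8, ... are blue and bytes 2, 6, 10, ... are red.
    # The right-hand side is copied before assignment, so the swap is safe.
    out[0::4], out[2::4] = out[2::4], out[0::4]
    return bytes(out)
```

    Green and alpha (offsets 1 and 3 in each pixel) are left alone, so every byte is read once and half of them are rewritten.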

    Profiling under WinXP on an Athlon, reading in a complex image (not that it matters) the same size as that in the essay (576x576) although with an extra byte of alpha, I get:
    1,226 microseconds
    -or-
    0.001226 seconds
    -or-
    816 images/second

    So basically 12x speed improvement on a processor roughly 2/3rd the speed. Remember, I did no optimization iterations. Wrote it, ran it, debugged it, put it in CVS for everyone to use the past 5+ years. Also remember that I'm not doing the same thing, but it seems to me that RLE would be faster than what I'm doing, since I need to read and write every byte of the image.

    Now, having said that, I agree with the essay in principle. Choose the right algorithms to start with, choose different algorithms if needed, and optimize hot spots when profiling shows a problem. I am a gaming/simulation/graphics programmer by trade and I am amused when someone worries about a 100 iteration loop that has some trig functions in it that runs every time the user pops open a certain dialog box. The massive amounts of processing that goes on in graphics/simulation inner loops has given me a profound understanding of the power of a modern machine, such that I don't sweat the small stuff. However, it has also given me a profound appreciation for the concept of O() notation. O(2^n) will cripple any machine for any software. Except, of course, when n is small. With small n, who gives a crap about the inside of the O() - care about the unspoken constant in front of it!

    1. Re:My TGA Reader by Anonymous Coward · · Score: 0

      When the author compares his TGA decoder to a game, he shows that he doesn't really understand complex software... he is a lamer who maybe only knows a bit of Erlang... I really want to code a TGA decoder like his in C or C++, optimize it with gcc flags for specific hardware, and send it to this lamer... he will have a surprise... maybe he needs to wake up, see the real world, and get out of his matrix.

  166. Nitpick by Anonymous Coward · · Score: 0
    the example was a trivial manipulation of around 1GB of data, and the program was getting that done a whole 60 or so times in a second on a 3GHz beast?

    Er, make that 1MB of data. If it was processing 60GB of data per second, that would indeed be very impressive.

    You would need a pretty hairy RAID system to keep up with his program's sustained 60MB/s of data in and out anyway. Not too long ago, the average PC's RAM wouldn't keep up with that sort of pace, for Pete's sake. In the real world, this level of performance is way past "good enough," even if the CPU is capable of better.

  167. I'm not surprised by cat_jesus · · Score: 4, Funny
    E.g., there is no way in heck that an O(n * n) algorithm can beat an O(log(n)) algorithm for large data sets, and data sets _are_ getting larger. No matter how much loop unrolling you do, no matter how you cleverly replaced the loops to count downwards, it just won't. At best you'll manage to fool yourself that it runs fast enough on those 100 record test cases. Then it goes productive with a database with 350,000 records. (And that's a small one nowadays.) Poof, it needs two days to complete now.

    And no hardware in the world will save you from that kind of a performance problem.

    E.g., if most of the program's time is spent waiting for a database, there's no point in unrolling loops and such. You'll save... what? 100 CPU cycles, when you wait 100,000,000 cycles or more for a single SQL query? On the other hand, you'd be surprised how much of a difference can it make if you retrieve the data in a single SQL query, instead of causing a flurry of 1000 individual connect-query-close sequences.

    (And you'd also be surprised how many clueless monkeys design their architecture without ever thinking of the database. They end up with a beautiful class architecture on paper, but a catastrophic flurry of queries when they actually have to read and write it.)

    E.g., if you're using EJB, it's a pointless exercise to optimize 100 CPU cycles away, when the RMI/IIOP remote call's overhead is at least somewhere between 1,000,000 and 2,000,000 CPU cycles by itself. That is, assuming that you don't also have network latency adding to that RPC time. On the other hand, optimizing the very design of your application, so it only uses 1 or 2 RPC calls, instead of a flurry of 1000 remote calls to individual getters and setters... well, that might just make or break the performance.

    (And again, you'd be surprised how many people don't even know that those overheads exist. Much less actually design with them in mind.)
    Not much surprises me these days. I had to rewrite a SQL trigger one time and I was very concerned about optimization because that sucker would get called all the time. I was shocked to discover that in this one particular instance a cursor solution was more efficient than utilizing a set processing methodology. I was so surprised that I wrote up a very nice paper about it for a presentation to my department, along with standard O notation and graphs.

    No one knew what O notation was.

    Not long after that I found out about knoppix. I burned a few disks and gave them out. Only one other person knew what linux was. It wasn't my manager.

    Just last week one of our servers had a problem with IIS. "What's IIS?", My manager asks.

    Here are some other gems

    "We can't put indexes on our tables! It will screw up the data!"

    "I've seen no evidence that shows set processing is faster than iterative processing" -- this one from our "Guru".

    "What is a zip file and what am I supposed to do with it?" -- from one of our senior systems programmers in charge of citrix servers.

    "What do you mean by register the dll?" -- from the same sysprog as above

    They pushed a patch out for the sasser worm and about 2% of the machines got the BSOD. I finally decided to give the fix a try on my machine and it died too. I booted into safe mode and rolled it back. Everyone else had to get their machines reimaged because desktop support couldn't figure out what was wrong. Lucky for my neighbor I was able to recover most of his data before they wiped it clean. He made the mistake of letting his machine loop through reboots for two days, which hosed his HD up. Of course the PC "experts" couldn't recover any data because the machine wouldn't boot up.

    Yes, I am in programmer purgatory. I am reluctant to say hell because I'm sure it can get worse. No, I'm not kidding.
    1. Re:I'm not surprised by Anonymous Coward · · Score: 0

      Except that purgatory isn't supposed to be a step better than hell, purgatory is where you go if your good and bad are nearly equal, you then hope someone prays that you go to heaven...

    2. Re:I'm not surprised by cat_jesus · · Score: 1

      OK then, that Klingon barge on the way to hell.

  168. alpha by h4x0r-3l337 · · Score: 1

    Looks to me like the writer of the article has his notion of 'alpha' backwards. An alpha value of 0 means fully transparent, 255 is opaque. The article has it the other way around.

  169. I call shenanigans by Euphonious+Coward · · Score: 1
    Notice that nowhere in the article does he actually report the speed of any program not in Erlang; in particular, not C++. Whenever he mentions comparing the readability of his code to the same thing in some other language (although he never actually does) he compares to C, despite that everything he does would be easy and clean in C++.

    This essay is a disgracefully misleading piece of legerdemain.

    The lesson to take away is that (1) C is almost always a poor choice, if alternatives (such as C++) are organizationally possible; (2) Erlang is not appreciably nicer to code or optimize than C++ (else he would have compared to it instead of C); and (3) the C++ code, optimized in the same way he did his Erlang code, would have been embarrassingly faster (else he would have clocked it).

    We are still waiting to see the language that can take the mantle from C++, for industrial uses. Unfortunately, we have become used to reading lies from promoters of slow or otherwise insufficiently capable languages. It will be hard to recognize the language that ought to replace C++, because what is said about it will read a lot like the lies.

  170. Useless article by ChaosDiscord · · Score: 2, Insightful
    Even traditional disclaimers such as "except for video games, which need to stay close to the machine level" usually don't hold water any more.

    Sure, if you're talking about puzzle games.

    However, for most retail games pushing the graphics as far as possible is important. If you can squeeze a 5% improvement out of the engine you can use the freed up time to make the game a bit prettier. Or put another way, art expands to fill available processing power. Graphics blocking on the video card? Well, you can use processing power for increasingly realistic physics simulations and artificial intelligence.

    If you play a lot of games you know that there is great variance between games. Some games coast along at a bare minimum while others surprise you with their ability to create compelling visuals with older hardware.

    After all, who ever thought you could use an interpreted, functional language to decode Targa images, especially without any performance concerns?

    Ummmm, just about anyone sane? Wow, decoding a measly megabyte of data. And the encoding? Simple run length encoding. That's not a real programming problem; that's a homework assignment for Computer Science 101. If you're keen on loading graphics you could at least pick something that is slightly complicated like JPEG.
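
    For reference, a minimal sketch of the run-length decoding in question, assuming the usual Targa packet layout: a header byte whose high bit marks a run and whose low 7 bits hold the count minus one. The function name and parameters are illustrative and not taken from the essay.

```python
def decode_tga_rle(data: bytes, pixel_size: int, pixel_count: int) -> bytes:
    """Expand Targa-style RLE packets into raw pixel bytes."""
    out = bytearray()
    pos = 0
    while len(out) < pixel_count * pixel_size:
        header = data[pos]
        pos += 1
        count = (header & 0x7F) + 1          # low 7 bits store count - 1
        if header & 0x80:
            # Run packet: one pixel value repeated `count` times.
            pixel = data[pos:pos + pixel_size]
            pos += pixel_size
            out += pixel * count
        else:
            # Raw packet: `count` literal pixels follow.
            nbytes = count * pixel_size
            out += data[pos:pos + nbytes]
            pos += nbytes
    return bytes(out)
```

    The whole decoder is one loop with two branches, which is the point being made about the size of the problem.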

    Is it true that optimization is massively overrated; that most programs are plenty fast? Sure. But this article doesn't provide a bit of evidence for that.

  171. Scientific computing by gnuLNX · · Score: 2, Insightful

    Sorry pal....but as long as there are scientific applications to solve there will be a need to write highly optimized code...Heck I wrote some inline assembly today. Yes we can ignore performance in some areas, but in high performance computing, speed is still king, and quite frankly always will be.

    --
    what?
  172. Re:You don't optimize, that's the job of the compi by techno-vampire · · Score: 1
    Actually, wouldn't the clear and simple way to code this be to use a single "switch" statement and not twenty "if" statements ?

    If it still needed the twenty statements, yes. In this case I tested to see if the input was within the right limits, and if so, set the variable to (input + offset). That changed twenty if's to one. Clear, concise, simple, maintainable.
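
    The shape of that change, as a rough sketch. The original code is not shown in the thread, so the limits, the offset and both function names are hypothetical, and the "before" chain is cut down to three branches.

```python
def remap_chain(value):
    # "Before": one equality test per accepted input (three shown here).
    if value == 1:
        return 101
    if value == 2:
        return 102
    if value == 3:
        return 103
    return None


def remap_range(value, low=1, high=3, offset=100):
    # "After": a single range check, then input + offset.
    if low <= value <= high:
        return value + offset
    return None
```

    Both functions agree on every input; only the second one stays the same length when the accepted range grows to twenty values.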

    --
    Good, inexpensive web hosting
  173. Re:Funny how. .. by Brandybuck · · Score: 1

    Then you get the opposite: people choosing to write Qt software simply because of the amazing documentation. I know Windows developers who use Qt over the "native" toolkits for this reason.

    --
    Don't blame me, I didn't vote for either of them!
  174. MOD PARENT UP +1 TROLL by Anonymous Coward · · Score: 0

    Wow, look at the all the replies. Hook, line, and sinker, sir, I bow to your expertise. +1 Troll, well deserved.

  175. VIC20 BASIC: Tokens mattered by solprovider · · Score: 3, Interesting

    When I was finally allowed to program, I did it on a Commodore PET in elementary school. Of course I was writing games, and I optimized because games are not fun when they are slow. I was too lazy to type in the source from magazines, so all of my programs grew until they were usable, then grew some more as people played them and asked for features.

    Junior High did not have computers yet. I finally convinced my family to get me a computer if I paid half. With my budget, that meant a VIC20 for under $100. The VIC20 had 4KB of RAM. You could buy a 16KB expansion, but I could not afford it.

    The language was the same as the PET, so I tried to run my existing programs. They ran. I tried to modify them, save, and run them, and they would not work, even if the change was to remove code. I finally tried changing all the commands to "tokens" to shorten them. IIRC, a token was the first 2 characters of a command and an underscore. Since most of the commands were 4 letters, this saved quite a few characters. I also renamed all my variables to shorten them. Then I saved and the program ran. Yeah!

    Then I made another change, and the problem reappeared.

    I decided that:
    10 The program loaded as written to the tape. (Hard drive? Floppy disk? Never heard of them.)
    20 If the program fit in memory, it would run.
    30 When the program was loaded for editing, all tokens were expanded to the full command.
    40 The program was saved as text, except...
    50 If the tokenized version of a command was encountered, then it was saved as the token. I never figured out if they were saved as the 2 Hex number, the dollar sign and the number, or the 3-character shorthand "token" I typed.
    60 GOTO 10 (and see if it runs.)

    So every time I wanted to modify the longer programs, I had to change every command to the "token" format. (About half of my programs were under the 4KB limit, about half could be "fixed" using this technique, and a few were large enough that I never got them working again.) Any changes to the longer programs required 20 minutes of "tokenizing" the commands before saving it. That killed much of the fun of programming. (Today I get upset if a build takes longer than a game of Solitaire, but "getting upset" means deciding to fix the build process.)

    Commodore bought BASIC from MS, and then modified it, so I do not know who to blame for the hours I wasted on this, but Commodore is gone and MS continues to take the fun out of computers, so I blame MS.

    ---
    My next venture into computers was the C64. They had "Sprites". Half of the code in my games was controlling the graphics, and this improvement to the platform made that code obsolete. For the challenge, I upgraded one game to use Sprites. They took much of the fun out of it, and (IIRC) you were limited to 4 of them, so you had to play games (pun intended) to write PacMan. (4 Ghosts and PacMan required 5 Sprites. The dots and cherries would be handled without Sprites. It was easy to write a 3-ghost PacMan game, and really difficult to write a 4-ghost game.)

    Since the C64s at school did not have tape drives, my old programs had to be typed in, if I had a printout from elementary school (no printer at home.) I already stated that I was lazy, so they are gone. Well, I still have the tapes, but they are 2 decades old, and my PC does not have a tape drive anyway.

    --
    I spend my life entertaining my brain.
    1. Re:VIC20 BASIC: Tokens mattered by CarrionBird · · Score: 1

      I don't know about commodore, but I've seen MC-10 emulators that came with utilities to read in tapes recorded to wav files. Maybe the same exists for c64.

      --
      Free Mac Mini Yeah, it's
  176. Re:You don't optimize, that's the job of the compi by Anonymous Coward · · Score: 0

    Wow, how much time did it take to go through 20 if statements anyway?

  177. Situation-dependent by Anonymous Coward · · Score: 0

    I don't think it's about "one way of optimization is always better than another" so much as recognizing, hopefully early on, that you can design software in an optimized fashion, or you can design it in a "try it and let's see what works" fashion.

    If you're a good programmer - a good *engineer* - you've already probed through things to try and you can figure out where optimization is most likely necessary and get it in from the beginning. If you're not, you'll take more time to build a slower product in the long run and that's why you aren't "good."

    Nobody can possibly get in every conceivable optimization on the first run and at the same time generate clean code, which is why rewrites are often necessary. But someone with experience will get in the crucial ones, every time.

  178. Get programming as if SIZE matters... by Kazoo+the+Clown · · Score: 2, Insightful

    I think size is a far more serious problem than speed-- just because I can put multi-gigabytes in my PC doesn't mean I want to waste them loading bloatware. In fact, I probably wouldn't NEED the multi-gigabytes if it wasn't for code bloat. Of course, Gates loves such things because the machine retailers love them-- easier to sell more memory to someone who needs it because they just tried to upgrade to XP or something.

    So which compilers space-optimize by rolling loops instead of unrolling them?

  179. algorithm...algorithm....algorithm... by rudga · · Score: 1

    algorithm...that one word says it all and does it all too, so all you math whizkids ..it's time to rule.. :)

    --
    ~~~~~ rudga ~~~~~
  180. Broken C++ code by d-rock · · Score: 2, Insightful

    Actually, that could break C++ code that uses templates. There's a difference between

    vector <pair <int, float> > myVector

    and

    vector<pair<int,float>>myVector

    It's only subtle until you try to compile it :)

    Derek

    --
    Don't Panic...
    1. Re:Broken C++ code by Frizzle+Fry · · Score: 1

      It would break plenty of other things in C and C++ as well. For example
      a = b / *c;
      is not the same as
      a=b/*c;

      and

      #define DEBUG 1
      #ifdef DEBUG
      is going to work a lot better than
      #defineDEBUG1
      #ifdefDEBUG

      --
      I'd rather be lucky than good.
    2. Re:Broken C++ code by maxwell+demon · · Score: 1

      Well, I certainly implicitly assumed he only removed unnecessary whitespace. Because, after all, even

      int main(){}

      will not compile if you remove the whitespace between "int" and "main".

      And of course, if you remove the whitespace in the string "hello world", users of your hello world program will notice :-)

      Now the example with "a = b / *c;" is especially nice because I don't see a way to write it in C++ (or C99) without using whitespace (where newline counts as whitespace). In C90, you could write "a=b//**/*c;", but in C++ and C99, the "//" will be seen as comment starter, and since all comments start with a slash, you cannot break it apart with a comment in between.

      --
      The Tao of math: The numbers you can count are not the real numbers.
  181. The real truth by Gilk180 · · Score: 1

    I think the article over-simplifies the issue.

    The real truth is that EVERY decision about a program needs to be made in the context of its creation and deployment.

    If you're writing something like 'ls' that will run often on large multiuser systems, speed should be the ultimate goal.

    Alternatively if you are writing something that will run once a week and will only be deployed on the two systems in your lab, ease of coding and maintainability should be the goal, since in this case it will probably be cheaper to throw more hardware at it than to pay a programmer (or use your own precious time) to spend the extra time on optimization.

    It's all about context. Hardware is cheap if you only need one more box. Hardware is expensive if 10,000 users need to buy new boxes because your program runs slow. Programmer time is expensive if a single user is footing the bill, but if there are 10,000 users to amortize the cost, programmers become cheap.
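
    The trade-off can be written down as one comparison. This is only a sketch: the function name is made up, and the dollar figures in the usage note are hypothetical placeholders, not numbers from the article or this comment.

```python
def cheaper_to_optimize(programmer_cost, hardware_cost_per_user, users):
    """True when paying for optimization once costs less than every
    user buying extra hardware to cope with the slow version."""
    return programmer_cost < hardware_cost_per_user * users
```

    With a hypothetical 20,000 of programmer time against 1,000 per new box, cheaper_to_optimize(20_000, 1_000, 1) is False for the single lab machine and cheaper_to_optimize(20_000, 1_000, 10_000) is True for the 10,000-user case.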

    The author also seems to think that choosing a programming language is part of the performance picture. It is to a small extent, but this must also be viewed in the context of the application. Different languages have different advantages and disadvantages. Some are slow for mathematical calculations but can push strings around incredibly fast. Some are really 'safe' when it comes to bounds checking. Some are easy to use and are fast when most calls are to libraries. Some are great for long-running processes, but thrash the drive on startup and are atrocious for short-lived programs.

  182. Wow, dumb article by iammaxus · · Score: 0, Flamebait

    This has to be one of the crappiest articles in the developers section, ever. The author has absolutely no explanation of what is "fast enough". He just says, wow 66/second, sounds fast. Guess performance doesn't matter. That is retarded. He never compares it to a non-interpreted, more performance-oriented language, so how can he say it's fast? There are many applications where I wish image decoding happened faster, so you can't just assume it's fast enough.

  183. Re:Funny how. .. by Bastian · · Score: 1

    Of course, this isn't really the opposite so much as further support for the argument. Qt is commercial software.

  184. Optimization vs. Using "good" programmers (STORY) by solprovider · · Score: 1

    I am currently involved with making a very simple change to an application. It has been implemented using 3 very different algorithms. The change is to automatically set 2 fields for geographical groupings from a Country field. I wrote the application, and I suggested during the original design we should automatically fill in the other two fields, but I was overruled. (If they would not configure what country belonged to what grouping, my design would have failed, so I did it their way. The deadline was too short for me to argue about it.)

    The users requested it a year later, and they realized the poor data quality from depending on users to set all 3 fields, so they decided to implement the change.

    The original design pulled 3 "multi-value" text fields (like arrays) from a single record. The platform will automatically parse each element so what is before the pipe '|' is displayed to the user, and what is after the pipe is stored. Each field used its own choices.

    The first revision was without me. (I am expensive, and it seemed an easy change. Even I thought the full-timers could handle it.) They created a new table with a 6-field record for each Country: 3 for the codes, and 3 for the display names. Then they created 3 new fields on the form to display the names, while using the original fields to store the codes. The code was written so that every time the screen refreshed, the 3 code fields and the 2 display fields were recalculated. Even worse, each of the 5 fields did its own lookup to the table. So we have 5 lookups (across a global WAN) for every screen refresh. They also set each field twice, which may cause the issues that made them call me.

    Then they told me what they wanted. In less than 2 hours (more than half spent testing and writing instructions for implementation), I delivered code that only triggers if the Country changes. It pulls from the single record (as described above) when the document loads, does one lookup (from the in-memory copy) for all the data, which is then parsed into the other 2 fields. (I did not use 3 display-only fields, since the platform supports translating the code to text as mentioned above.) I am hoping they will switch to this code soon.

    Then I learned that another group has modified their own copy of the application to do this in a third method. One of the groupings is static, because everybody using this copy belongs to one of the regions, so they just hardcoded those 2 fields. They hardcoded very long IF statements that decide the other grouping based on the Country. There are 3 of these hardcoded blocks: one each for the country code, the region code, and the region display text. Each of the 3 blocks of code runs every time the screen is refreshed. Their performance is much better than the first developer's since it is hardcoded, but it is unmaintainable. (Their version is supposed to allow integration back into the corporate version, but they arbitrarily changed many of the field names and some of the data formats, so any integration will require translation.)

    We currently have 3 versions of the same code:
    #1 is very slow (and is not functioning properly.) This application runs globally from a central server, so those 5 lookups every refresh will eat bandwidth.
    #3 is hardcoded, so maintenance requires a programmer.
    #2 (mine) is configurable and is fast since it runs only when needed and uses a single lookup done once when the record loads.
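
    A rough sketch of the shape of version #2: load the configuration once, and recompute the two grouping fields only when the Country actually changes. The platform is never named in the post, so the class, the field names and the table layout here are all hypothetical; the fallback for unknown countries follows the behavior the author describes (show the country code in the region fields).

```python
class CountryForm:
    def __init__(self, load_table):
        # One lookup when the document loads; kept in memory afterwards.
        # Assumed table layout: {country_code: (region, subregion)}.
        self._table = load_table()
        self.country = None
        self.region = None
        self.subregion = None

    def set_country(self, code):
        if code == self.country:
            return  # Country unchanged: nothing to recompute on refresh.
        self.country = code
        # Unknown country: show the raw code in both fields as a visible flag.
        self.region, self.subregion = self._table.get(code, (code, code))
```

    However many times the screen refreshes, load_table runs once per document, compared with five lookups per refresh in version #1.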

    Did I spend time optimizing? I probably put 5 minutes of thought into how the configuration data should be stored, but the algorithm was obvious because it was the simplest method.

    Was it more expensive? I may charge 6 times what the full-time developers get paid, but my 2 hours included testing all exceptions. (If the Country is not in the lookup, then the Country code is displayed in the region fields, which alerts the user and admins that there is a problem. I could have blanked the fields, and may still if this is not acceptable.) I

    --
    I spend my life entertaining my brain.
  185. My favorite comment ever by jasomill · · Score: 1

    That's not bad -- I was in charge of maintaining several QuickBASIC programs. The code was poorly written and uncommented, with the exception of one comment that appeared at the beginning of each source file:

    REM If changes are made to this program, it must be recompiled.

    It's also interesting to note that GOSUB was used liberally, but RETURN was not used once. Not to mention the fact that temporary files were always created in fixed locations on a shared network drive, so the programs tended to fail in creative ways (to give an example of "creative": the same temp file name was often used for, say, a list of files to view in one program and a list of files to delete in a second) when multiple users were using them...the list goes on.
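
    For contrast, a minimal sketch of the usual fix for the shared temp file problem, using Python's standard tempfile module in place of QuickBASIC. The helper name and the prefixes are made up for illustration.

```python
import os
import tempfile


def make_work_file(prefix, directory=None):
    """Create a uniquely named, empty temp file and return its path."""
    # mkstemp picks a name that does not exist yet, so two users (or two
    # programs) never end up sharing the same file by accident.
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=".tmp", dir=directory)
    os.close(fd)
    return path
```

    Each call returns a different path, so a "files to view" list and a "files to delete" list can no longer overwrite each other.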

  186. Do I really want my VIC20 BASIC programs? by solprovider · · Score: 1

    A friend mentioned at lunch that he missed playing a game that ran on MS-DOS BASIC. He last tried to play it on a 486, but it moved too fast. We wondered whether there was an interpreter for MSWindows98 that would run it slow enough to be usable on modern hardware.

    I remember trying to play Populous on a Pentium100. A game lasted almost 2 minutes. I flattened a mountain 3 times, but the computer player would keep wrecking it. I was going to use MoSlo Deluxe to slow it down, but Civ3 was released and I was "busy" for a few years.

    Even if I could get the games from the tapes, would I want them? Do I really want to see code from when I was learning to program? It might have historical value if I become very famous, but I have not heard of people asking for the source of Bill and Paul's BASIC compiler, or their Traff-o-matic system, or Carmack's early efforts, so it is unlikely anyone would care.

    I could probably write similar games in a week using today's technology. Most were variants on PacMan or SpaceInvaders. None were as good as what was in the arcade a few years later. They would not be a good resume to enter computer game development.

    The tapes I have are for the VIC20. The C64 (and other color Commodore computers) used a slightly different version of BASIC. My games used PEEK and POKE to control the display, so even if I had a BASIC interpreter, I would need a VIC20 (or PET) virtual machine for them to run.

    I really doubt I will ever make the effort to recover the code, even if the 10-minute cassette tapes are usable after 25 years.

    --
    I spend my life entertaining my brain.
  187. I disagree with that article completely by Ekted · · Score: 1

    You cannot take a given programming task out of context. Just because a given implementation is "good enough" on your machine with your hardware and your operating system doesn't mean anything. Your process has to run with other processes. They are all sharing cpu, memory, buses, storage, network. What if this image convertor was being used as part of a system to generate live streaming video on a major web server? Using the entire CPU to convert 60 FPS for a single video would then be so incredibly poor, it would be a failure. I'm not saying you have to lose sleep over squeezing every last micro-code operation out of your executables, just that performance always matters. Writing "good enough" routines leads to bloated, slow, terrible software (windows media player, real player, winamp, internet explorer, ICQ, on and on).

  188. That guy doesn't know ANYTHING about performance by Anonymous Coward · · Score: 1, Insightful

    Why was this even posted? This guy has no clue about real performance!! Being able to process 66 images a second is nothing to brag about.. Sure he may "look" cool to non-programmers because he used edgar or whatever the hell, but i wouldn't hire him to work on anything performance intensive.. plus who the hell talks about performance in terms of execution speed times? saying 66 images a second says nothing about the relative performance about his program, algorithm OR the underlying language.

  189. How to Optimize Programming by solprovider · · Score: 1

    I have never thought of "unrolling loops" as optimization techniques. They are fun tricks to learn, but are bad for performance and maintenance. A good compiler will do better optimization, and using those tricks in high-level source code may hurt the optimization the compiler can do. If performance needs that kind of tuning, you should insert some assembler.

    Programmers usually get to use fast machines with lots of RAM and diskspace, and often end up writing programs that need everything they have.
    I mentioned in another post to this article how I had to optimize a program that I wrote on a 386, but customers were running on an XT. My quickie add-the-functionality-so-it-works was not good enough, but changing the algorithm was almost painless once we knew the feature would be used.

    How to optimize
    Remember that tasks cause performance hits in this order:

    1. Writing to the network with confirmation.
    If you are just sending to a port, it does not matter. Just write and forget. If you are waiting for a response, then this is the worst. Make certain that the response is handled as in #2.

    2. Reading from the network.
    Do not allow your program to idle while waiting for input. The first use of threading should be to continue processing as much as possible while waiting for the input.
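
    A minimal sketch of that idea using Python's standard concurrent.futures module. The slow_read function is a hypothetical stand-in for a blocking network read, and the local work is a placeholder sum.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_read():
    # Hypothetical stand-in for a blocking network read.
    time.sleep(0.1)
    return b"payload"


def process_while_waiting():
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(slow_read)   # the read starts in the background
        local = sum(range(1000))           # work that does not need the input
        data = pending.result()            # block only once the data is needed
    return local, data
```

    The main thread waits only at result(), after it has run out of work that does not depend on the input.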

    2.a User interaction
    User input (keyboard, mouse, etc.) is also slow, but it is very important. Make certain that your program will pause everything else to respond to the user. Try to redraw as little of the screen as possible. Do not recalculate any of the screen that was not changed. VB is the worst for this, since it wants to run all the code every time the screen is refreshed.

    Even MS has trouble not refreshing the entire screen every time something minor changes. Rename a file in Windows Explorer. Did the entry stay in place (desirable)? Did it resort so you have to find it again? Did it resort and move the data so you forget where you were? Did it drop to the end of the list (Windows98)? Were you able to type the new name while another process wrote to a file in that directory (WindowsNT)? If the user wanted it sorted, they would click the title of the column. Do not try to anticipate them. (And remember where they were. Why does it always go to the first entry if you use the Up or Back buttons?)

    3. Writing to a local hard drive.
    Keep as much in memory as possible so it can be written once. The only reason to write more than once is during debugging, and that code should not be in production. If you really need to log the intermediate steps, think about using write-and-forget across the incredibly-fast local network to a separate box that can do the writing.

    The other side is that if you are done with something, then store it so you can free the RAM. Unless you will reuse it. See #5.

    4. Reading from a local hard drive.
    Do it in one operation. Get all the data possible, then work with it. See #6.

    5. Requesting more resources.
    If you need much RAM, ask for it in one chunk. For VB, do not "Redim Preserve" arrays for every new entry. Start with an estimated size, and double it if you reach the upper bound. If you write C, then this concept was beat into you early. If you use Java, the Hashtable class does it for you.
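
    The estimate-then-double policy, sketched as a small class. CPython's built-in list already over-allocates on its own, so this class exists only to make the policy visible; its name and the resizes counter are illustrative.

```python
class GrowableArray:
    def __init__(self, capacity=4):
        self._slots = [None] * capacity   # start from an estimated size
        self._size = 0
        self.resizes = 0                  # how many times we had to grow

    def append(self, item):
        if self._size == len(self._slots):
            # Upper bound reached: double the capacity in one step.
            self._slots.extend([None] * len(self._slots))
            self.resizes += 1
        self._slots[self._size] = item
        self._size += 1

    def __len__(self):
        return self._size
```

    Appending 1,000 items to an array that starts at 4 slots triggers 8 resizes, where growing by one slot per new entry would resize 996 times.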

    If you constantly use several network ports, use a pool. Add each to a pool, and check if one is available before requesting new ones. Your pool should send them to garbage collection if the number being used is MUCH lower than the number being reserved. This depends on your platform. You can always pool outbound ports, but the platform usually assigns the inbound ports as needed, so optimization must be handled by the OS or platform software.
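
    A bare-bones sketch of such a pool. It is single-threaded and keeps a fixed cap on idle resources rather than comparing used against reserved, so treat it as an illustration of the idea only; the class name and the create callback are hypothetical.

```python
class Pool:
    def __init__(self, create, max_idle=4):
        self._create = create       # factory for a new resource
        self._idle = []             # resources waiting to be reused
        self._max_idle = max_idle

    def acquire(self):
        # Reuse an idle resource if there is one; otherwise request a new one.
        return self._idle.pop() if self._idle else self._create()

    def release(self, resource):
        if len(self._idle) < self._max_idle:
            self._idle.append(resource)
        # Otherwise drop the reference and let garbage collection reclaim it.
```

    A real pool for network ports would also need locking and a way to close what it discards.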

    6. Know the physical resources available.
    If you try to load a 2GB database into 1GB of RAM, the database is going to be swapped to the much-slower-than-RAM hard drive anyway. It would be more efficient to use 200MB ch

    --
    I spend my life entertaining my brain.
  190. Re:That guy doesn't know ANYTHING about performanc by innosent · · Score: 2, Informative

    Exactly, he says it's not about (discrete) mathematics, but when it comes down to what a programmer is supposed to do, it's all discrete math. You have a Turing machine (albeit limited), and the whole point is to do what needs to be done correctly and as efficiently as possible. Some things still need to be written the same way they were in "1985", unlike the author's view that optimal code doesn't matter.

    Yes, machines are faster now, by at least an order of magnitude, but optimizing poor code can speed things up more than engineering a new processor.
    Bubble Sort a list of one billion points of data (O(n^2) compares = k * (1e18)) on a new 3GHz machine (assuming 1 compare per clock), and you need about 3.333e8 (333333333.33) seconds (about 10.5 years).
    Weak Heap (best), Quick, or Heap sort the same billion points (O(n*log n) compares = k * (~3e10)) on an old 486/33MHz (again assuming 1 compare per clock), and you need about 9.0909e2 (909.09) seconds (about 15 minutes).

    There you go, the author can use the new Pentium 5s, and I guess the rest of us can go dig out our 486s. Sure, you don't often need to sort a billion records, but next time you do, make sure your algorithm is reasonable, then use a language that allows you to implement it. One-size-fits-all library, garbage collection, and run-time language error checking might be good for rapid development, but doing things efficiently requires lower-level interaction, sometimes even below what C allows. Bubble sort 2 records, and you won't see much benefit, but for performance critical sections of code, sometimes it's better to optimize first, then use comments to make it readable, than to write code that looks like comments. Even adding bounds checking to the sorts would at least double the time required, which in this case could mean anywhere from another 15 minutes or so up to the time left until you collect social security.
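
    The estimate can be rechecked in a few lines, assuming k = 1 and one compare per clock as the comment does. The variable names are illustrative.

```python
import math


def seconds(compares, clock_hz):
    """Run time when the machine does one compare per clock cycle."""
    return compares / clock_hz


n = 1_000_000_000                      # one billion data points

bubble_compares = n ** 2               # O(n^2) with k = 1
nlogn_compares = n * math.log2(n)      # O(n log n) with k = 1

bubble_seconds = seconds(bubble_compares, 3e9)    # 3 GHz machine
nlogn_seconds = seconds(nlogn_compares, 33e6)     # 486 at 33 MHz

bubble_years = bubble_seconds / (365.25 * 24 * 3600)
nlogn_minutes = nlogn_seconds / 60
```

    Printing bubble_years and nlogn_minutes gives the two figures to compare against the prose.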

    --
    --That's the point of being root, you can do anything you want, even if it's stupid.
  191. Cycles are cheap, but whole servers cost a bundle by Nygard · · Score: 1

    The knife's edge in performance is irrelevant for 99.9% of today's desktop applications.

    But when you look at the server side, it matters. If you have a server app that takes 10% longer to generate a page, that translates almost directly (with some nonlinearity, for the nitpickers) to 10% less overall capacity. Or, conversely, it means adding about 12% more CPU/memory/disk. Since enterprise-class server units are quantized in pretty large grains, that can mean a big chunk of change.

    I'll use real numbers, but no names:

    A particular app server can normally handle 1,000 concurrent sessions on one instance. One instance wants 1 CPU to itself and 2 GB RAM. So, supporting 50,000 concurrent users (the site's goal) means carrying around 60,000 concurrent sessions (sessions > users due to sessions waiting to expire).

    That should have required 60 CPUs to handle requests, plus one CPU per server for the OS and one CPU per server for "auxiliary" services needed by this app server. It also means about 120GB of RAM.

    That's no small chunk of change, but the picture really looks bad when you consider that inefficient application design hobbled the app server so badly that it could only handle 200 sessions per instance.

    That means five times the number of CPUs and five times the amount of RAM.
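
    The arithmetic, as a small sketch. It counts only the request-handling CPUs and RAM; the per-server OS and auxiliary CPUs are left out because the post does not say how many servers are involved. The function name is illustrative.

```python
import math


def capacity(sessions, sessions_per_instance,
             cpus_per_instance=1, ram_gb_per_instance=2):
    """Return (instances, cpus, ram_gb) needed to carry `sessions`."""
    instances = math.ceil(sessions / sessions_per_instance)
    return (instances,
            instances * cpus_per_instance,
            instances * ram_gb_per_instance)
```

    capacity(60_000, 1_000) gives (60, 60, 120), matching the figures above, while capacity(60_000, 200) gives (300, 300, 600), which is the fivefold jump.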

    There's a similar (coupled) calculation you can do with page latency and capacity. And another one you can do with page weight. A good application developer must pay attention to all of these things, because even a 1K change in page weight makes a very noticeable difference to overall site capacity, bandwidth costs, and hardware costs.

    --
    "Genius may have its limitations, but stupidity is not thus handicapped." --Elbert Hubbard (1856-1915)
  192. Re:90% of the IT industry needs to read this book by rjwilson01 · · Score: 1

    Gee I'm offshore and I've read the book. I mustn't have understood it. Any statement like "group of" is prejudiced and so probably wrong

  193. sig by Anonymous Coward · · Score: 0
    I think you're taking the comment way too seriously . . .

    "stripped" and "no support" sounds like a joke to me.

    Anonymous Coward since I've moderated comments for this story.

    1. Re:sig by WNight · · Score: 1

      Perhaps. I don't know either way, but I just wanted to point out that making groundless claims one way is no better than the other side with their groundless claims.

      The fact that there's no proof is damning enough, I don't need to manufacture the idea that we *know* he didn't do it.

  194. Re:Funny how. .. by Bingo+Foo · · Score: 1

    I would agree, except for Q_OBJECT and "moc."

    Anyone know how to seamlessly compile Qt apps in XCode?

    --
    taken! (by Davidleeroth) Thanks Bingo Foo!