Slashdot Mirror


Programming As If Performance Mattered

Junks Jerzey writes "Saw the essay 'Programming as if Performance Mattered', by James Hague, mentioned at the Lambda the Ultimate programming language weblog. This is the first modern and sensible spin on how optimization has changed over the years. The big 'gotcha' in the middle caught me by surprise. An inspiring read." Hague begins: "Are particular performance problems perennial? Can we never escape them? This essay is an attempt to look at things from a different point of view, to put performance into proper perspective."

11 of 615 comments (clear)

  1. Funny thing about performance by ObviousGuy · · Score: 5, Interesting

    You can spend all your time optimizing for performance and when you finally release your product, your competition whose main objective was to get the product out the door faster, who uses a slower algorithm, is already first in mindshare with your customers. Not only that, the processors that you thought you would be targetting are already a generation behind and that algorithm that was going to hold back your competition runs perfectly fast on new processors.

    Performance gains occur at the hardware level. Any tendency to optimize prematurely ought to be avoided, at least until after v1.0 ships.

    --
    I have been pwned because my /. password was too easy to guess.
    1. Re:Funny thing about performance by techno-vampire · · Score: 4, Interesting
      The thing is that a lot of programmers today have grown NOT to respect the need for performance - they just assume that the upcoming systems would have really fast processors and infinite amounts of RAM and diskspace, and write shitty code.

      That's not the only reason. Programmers usually get to use fast machines with lots of RAM and diskspace, and often end up writing programs that need everything they have.

      Back in the DOS days, I worked on a project that had a better way of doing things. We had one machine with reasonable speed as the testbed. It wasn't well optimized as we didn't expect our customers to know how to do that and the programs we were writing didn't need expanded or extended memory. If what you wrote wouldn't run on that machine, it didn't matter how well it worked on your machine, you had to tweak it to use less memory.

      --
      Good, inexpensive web hosting
  2. Don't agree by Docrates · · Score: 4, Interesting

    While the author's point seems to be that optimization and performance are not all that important, and that you can achieve better results with how you do things and not what you use, I tend to disagree with him.

    The thing is, in real life applications, playing with a Targa file is not the same as service critical, 300 users, number crunching, data handling systems, where a small performance improvement must be multiplied by the number of users/uses, by many many hours of operation and by years in service to understand its true impact.

    Just now I'm working on an econometric model for the Panama Canal (they're trying to make it bigger and need to figure out if it's worth the effort/investment) and playing with over 300 variables and 100 parameters to simulate dozens of different scenarios can make any server beg for more cycles, and any user beg for a crystal ball.

    --

    There are two kinds of people in the world: Those with good memory.
  3. Performance is relative by jesup · · Score: 4, Interesting

    66 fps on a 3 GHz machine, doing a 600x600 simple RLE decode...

    Ok, it's not bad for a a language like Erlang, but it's not exactly fast.

    The big point here for the author is "it's fast enough". Lots of micro- (and macro-) optimizations are done when it turns out they aren't needed. And writing in a high level language you're comfortable in is important, if it'll do the job. This is a good point.

    On the other hand, even a fairly naive implementation in something like C or C++ (and perhaps Java) would probably have acheived the goal without having to make 5 optimization passes (and noticable time examining behavior).

    And even today, optimizations often do matter. I'm working on code that does pretty hard-real-time processing on multiple threads and keeps them synchronized while communicating with the outside world. A mis-chosen image filter or copy algorithm can seriously trash the rest of the system (not overlapping DMA's, inconvenient ordering of operations, etc). The biggest trick is knowing _where_ they will matter, and generally writing not-horrible-performance (but very readable) code as a matter of course as a starting point.

    Disclaimer: I was a hard-core ASM & C programmer who for years beta-tested 680x0 compilers by critiquing their optimizers.

  4. Re:Throw hardware at it. by SatanicPuppy · · Score: 4, Interesting

    Depends on how bad it is. I've seen stuff that runs so slow there really isn't a way to throw more hardware at it. Of course that was written by a guy who had two goals: 1) to make sure no one but him could support his work, and 2) to do as little work as possible.

    I don't know. Clean, elegant, functional code is beautiful. If you're ever going to have to work on it again, I think it's better for it to be clean and optimized.

    Also depends on the size of the app. With a small app, what excuse do you have for not optimizing? Wouldn't take that long. With a big project? Depends on your work environment.

    The bosses will never know if its optimal or not. If you tell them you've maxed out the server, they just think you write big badass code. A lot of times though, there isn't time to thoroughly bug check a big app (That what users are for, eh?), more less optimize it.

    --
    ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
  5. Re:Managed environments by metlin · · Score: 4, Interesting

    You assume that I made that reference to myself as being a bad programmer.

    The reason I made that statement was because just last week I was at Redmond for an interview for internship at Microsoft, and I was interviewed by the team that was trying to prevent just this sort of thing from happening.

    The idea was to design heuristics-enabled compilers that would effectively detect any "bad-code" and help make managed code and pseudo-managed code the norm, or convert existing code into managed code.

    I did not say that I was using a programming language that had such protections, merely that such programming languages have their own advantages. I was interviewed for creating compilers, linkers and OS-level protection that did not allow those troublesome things to exist - not use them - and hence my justification :)

    That said, you may knowingly or unknowingly use a language designed for bad programmers even when you program C or C++ in upcoming versions of compilers that insist on managed code - they may just wrap up your code in a nice wrapper to prevent mishaps and hand it over to the linker after having taken care of your holes.

  6. Re:One thing new programmers often miss by AaronW · · Score: 4, Interesting

    Less code does not equal faster code. You can usually get the best performance increases by using a better algorithm. For example, if you're doing a lot of random adding and deleting of entries, a hash table will be much faster than a linked list. This will have the greatest impact.

    Other things that can help are doing your own memory management at times (i.e. freelists) since that will be faster than malloc/new, and will have less memory overhead. Also, design your storage to your data. If you know you'll allocate up to 64K of an item, and the item is small, allocate an array of 64K of them and maintain a freelist. This will use a lot less memory than dynamically allocating each item and will result in better locality.

    I write code in the embedded space, where memory usage and performance are both equally important. Usually getting the last ounce of performance out of the compiler doesn't make much difference.

    A good real-world example is that I replaced the malloc code provided by a popular embedded OS with DLMalloc, which glibc is based. The dlmalloc code is *much* more complicated, and the code path is much longer, but due to much better algorithms, operations that took an hour with the old simple malloc dropped down to 3 minutes. It went from exponential to linear time.

    -Aaron

    --
    This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
  7. Re:me too... by some+guy+I+know · · Score: 4, Interesting

    If the man was coming from a BASIC programming background, his belief may have been understandable.
    Some (very old) BASIC interpreters used to parse each source line each time it was executed.
    Doing it that way saved memory (no intermediate code to store).

    --
    Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
  8. Coding while blind by xyote · · Score: 4, Interesting

    The biggest problem I see with performance is lack of visiblity of performance factors. At the hardware level there is cache (which is supposed to be transparent) and really deep pipelined processors. This can have a major effect on what would otherwise be an optimal algorithm. And the hardware keeps changing, so what may have been optimal at one point will become suboptimal later on.

    In software, the biggest problem is lack of performance directives. POSIX pthreads is one of the biggest offenders here. Best performance practices in pthreads are based on how common implementations work. POSIX allows implementations that would cause major performance problems for so called best pthread programming practices. Example, POSIX allows pthread_cond_signal implementations to wake all waiting threads, not just one. There are programs that depend on pthread_cond_signal to wake only one thread for performance in order to avoid "thundering herd" problems. So while standards allow portability of correct programs, whey do not necessarily allow portability of performance.

    We need explicit performance directives.

  9. De-commenting by RogL · · Score: 4, Interesting

    Back in the late '80s, early '90s, worked on (DOS-based) commercial products written in APL. APL is an interpreted language, written in Greek & math symbols, some overstrikes.

    Each byte of that 640K was precious, so it was common practice to "de-comment" the code before release; remove all comments, reduce whitespace, move multiple statements onto a single line, possibly shorten variable names. You could gain a substantial (for the time) amount of memory that way. You also dynamically imported/destroyed functions.

    I regularly debugged client systems with "de-commented" APL; if you could read that, you could read anything!

  10. All your optimizations are wrong. by scorp1us · · Score: 4, Interesting
    If you spend hours tweaking code to eliminate a few instructions, even instructions in a loop, then you are just wasting your time.


    Real opimizations come before you right your program. Take for example that loop that you removed an instruction or two from. Say it is a searching an array. and looks like:

    for (i=0; i<strlen(x); i++){
    if (x[i]=='&') ;
    }


    There are two things wrong. One you cal strlen repetitively. Strlen() is theta(n) So you have a loop that executes n times at a cost of n . n*n=n^2. That's one of the slowest algorithms around. Maybe your compiler is smart enough to see that x is not being modified and will to a s=strlen(x); then compare against X for you, but probably not.


    The other thing is when searching an array, try to give it structure. If your array contains sorted characters, then you can find it in log _2 (n). Of course, of you sort by frequency (most commonly accessed at the top) then your n^2 loop *might* do better.


    The article is right: constant-time operations (type checking, etc) are asymtotically infitessimal in algorithms. The article's real problem is that it is n, but on a 2d image (x*y)=n you can't do any better. Note that it is not n^2, (though it makes a square picture) because you're operating on pixels. So that will be your unit of measure - NOT time.


    Which is my 2nd point. Don't measure in time or instructions. Measure in OPERATIONS. Operations are not instructions or lines of code. An Operation is everytime you have to look at a unit. It is a logical unit of cost. Hardware can be changed out. We know that hardware (performance) doubles every 18 months. The constant-time instructions will get smaller. (Also clocks per cycle are irrelevant as well). But your loops will remain the biggest part of your program.


    With that, here's a crash course in CS:

    1. Loops are most of (time, operations) the program. Try not to use them.
    2. To avoid loops, structure your data. Giving structure means assumptions, and assumptions means you can skip irrelevant sections of data.
    3. Determine your dataset, minimize worst-case occurences. Find out what order of data or instructions will make your n*log2(n) algorithm become n^2. Then find away around it.
    4. and optimize for average case. That is, if you never sort more than 6 numbers at a time, an n^2 will beat a n*log_2 (n) algorithm.
    5. If your data structure introduces overhead (most will) find yuor most common or costly operation. Optimize your datastructure for that (searching, sorting, etc) If you do a combination determine the ratio and optimize for that. The cost of overhead is usually small compared to the reason why your using a datastructure to speed up your common operation.
    6. The most obvious and easiest to code algorithm is the slowest. (Bubble sort vs. Radix or quick-sort)
    7. Mastery of the above is the difference between a $50k programmer and a $90K programmer.


      To learn more, take a datastructures class at your local university. Please review Calculus II before doing that though.

    --
    Slashdot's rate-of-post filter: Preventing you from posting too many great ideas at once.