Programming As If Performance Mattered

← Back to Stories (view on slashdot.org)

Programming As If Performance Mattered

Posted by simoniker on Wednesday May 5, 2004 @04:38PM from the my-programming-can-reverse-time dept.

Junks Jerzey writes "Saw the essay 'Programming as if Performance Mattered', by James Hague, mentioned at the Lambda the Ultimate programming language weblog. This is the first modern and sensible spin on how optimization has changed over the years. The big 'gotcha' in the middle caught me by surprise. An inspiring read." Hague begins: "Are particular performance problems perennial? Can we never escape them? This essay is an attempt to look at things from a different point of view, to put performance into proper perspective."

4 of 615 comments (clear)

Min score:

Reason:

Sort:

Re:Managed environments by metlin · 2004-05-05 17:08 · Score: 4, Informative

Contrary to popular belief, managed code environments do optimize code a whole lot more than you would think!

Joe Beda, the guy from Microsoft behind Avalon, had a discussion on Channel9 where he talked about why managed code is not that bad a thing afterall.

Like I mentioned in an earlier post, managed code helps optimize the code for some of the bad programmers out there who cannot do it themselves, and takes care of a lot of exceptions and other "troublesome" things :) So, in the long run, it may not be that bad a thing afterall.

There are two facets to optimization - one is optimization of the code per se, and the other is the optimization of the project productivity - and I think managed code environments do a fairly good job of the former and a very good job of the latter.

My 0.02.
Performance is IMPORTANT by Jason+Pollock · 2004-05-05 17:38 · Score: 4, Informative

I am hearing a lot of people saying that you shouldn't optimise prior to the first release. However, it is very easy to select a design or architecture that limits your high end performance limit. Therefore, there is some optimisation that needs to be done early.

When you're architecting a system that is going to take tens of man years of effort to implement, you need to ensure that your system will scale.
For example, a project I recently worked on hit a performance wall. We had left optimisation for later, always believing that it shouldn't be done until last. However, while the architecture chosen was incredibly nice and clear, it limited the performance to 1/3th what was required. Back to the drawing board, we just doubled the project cost - ouch.
Even worse, there are performance differences on each platform! For example, did you know that throwing an exception is 10,000 times slower than a return statement in HP/UX 11? Solaris is only a little better at 2 orders of magnitude. Linux is (I understand) a dead heat.

So, while low-level optimisation of statements is silly early in the project, you do need to ensure that the architecture you choose is going to meet your performance requirements. Some optimisations are definitely necessary early in the project.

The article also talks about tool selection, suggesting that the extra CPU could be better used to support higher level languages like Erlang. If a system has CPU to spare, I agree, use what you can. The projects I work on always seem to lack in CPU cycles, disk write speed, and network speed. You name it, we're short of it. In fact, a large part of our marketing strategy is that we are able to deliver high performance on low end systems. What would happen to us if we dropped that edge? We're working with a company that has implemented a real-time billing system in Perl. Not a problem, until you try and send it 500 transactions/second. Their hardware budget? Millions to our 10s of thousands. Who do you think the customer likes more?
Jason Pollock
Postmature optimization by nimblebrain · 2004-05-05 19:35 · Score: 5, Informative
After years of developing, I really take to heart two things:
1. Premature optimization often makes better optimizations down the line much more difficult
2. It's 90% guaranteed that the slowdown isn't where or what you thought it was
Profilers are the best thing to happen to performance since compilers - really. I encounter a number of truths, but many myths about what degrades performance. A few examples of each:
Performance degraders
- Mass object construction
- Searching sequentially through large arrays
- Repeated string concatenation (there are techniques to mitigate this)
- Staying inside critical sections for too long
Not performance degraders
- Lots of object indirection
- Lots of critical sections
The "lots of object indirection" myth is one I encounter frequently. Object A calls Object B calls Object C, and it "intuitively" looks like it must be slow (Computer A calling Computer B, etc. would be slow), but even with stack frame generation, these are lightning fast compared with even the likes of "date to string" functions, never mind line-drawing commands or notification-sending.
The reason that particular myth is dangerous is that it's the single most pervasive myth (IMHO) that leads to premature optimization. People take out layers of object indirection and make it harder to put in better solutions later. I had an object that recorded object IDs in a list and let you look them up later. If I had "flattened" that into the routine that needed it, I might have effected a 0.1% speed increase (typical range for many premature optimizations). As it stood, because it hid behind an interface (equivalent to an ABC for C++ folks), when I had finally implemented a unit-tested red/black tree, it was trivial (~5 minutes) to drop in the new functionality. That's not an isolated case, either.
Mind you, I profiled the program to determine the slowdown first. Searching on the list, because so many were misses (therefore full scans), the search was taking up 98.6% of the entire operation. Switching to the red/black tree dropped the search down to 2.1%.
All in all, if you have a slow program, profile it. There is no substitute for a well-written profiler. Stepping through and "feeling" how long it takes in a debugger, while it can point you in rough directions, will miss those things that take 50 ms out of the middle of each call to the operation you're checking. Manually inserting timing calls can be frustrating enough to maintain or slow down your program enough that you can't narrow down the performance hit.
gprof works well with gcc and its relatives (make sure to add -pg to your flags), but I'm not sure if there's a good open source option out there for people using other tools that doesn't require you to alter your source.
In the Windows world, we recently got in the professional version of AQTime 3. It's an astounding package, allowing you numerous reports, pie charts and call graphs, saving the last few runs, calculating differences in performance between runs, allowing attachment to running processes, on top of a pretty nice way to define areas of the program to profile. The single nicest thing about it, though, is the performance. We turned on full profiling (that is, profiling all methods in all modules, including all class library and third party components) on the largest project we had, and it ran with perhaps a 30% slowdown. If you've used profilers before, you know how astounding that is ;)
Profiling applications always surprises me. In one case, a space-making algorithm I was running on controls seemed a little pokey; I found out more than 50% of the time spent was on constantly verifying that the lists were sorted. Today, I was investigating a dialog that looked like it must hav
--
Binary geeks can count to 1,023 on their fingers :)
You're ignoring the "gotcha" by Theatetus · 2004-05-05 23:55 · Score: 4, Informative

It doesn't matter how much hardware you throw at a problem if it needs to scale properly and you have an O(n^3) solution.

Well, maybe you're not ignoring it since you said "if it needs to scale properly". But that's a very crucial "if", and the "scale properly" only refers to certain situations.

If the array you need to sort might have several million members and you won't be sorting more than a few dozen of those arrays, yes you should use an O(n lg n) or whatever sort routine. OTOH, if the array itself is smaller (a few hundred members) but you have to sort several hundred thousand of them, quicksort or merge sort will be remarkably slow compared to the much-maligned bubble sort.

Big-O notation is an asymptotically-tight bound, not the function itself. For small datasets, not only is there no guarantee that the lower big-O algorithm will be faster, it's in fact usually the case that the allegedly "less efficient" algorithm will actually be faster.

--
All's true that is mistrusted