Programming As If Performance Mattered
Junks Jerzey writes "Saw the essay 'Programming as if Performance Mattered', by James Hague, mentioned at the Lambda the Ultimate programming language weblog. This is the first modern and sensible spin on how optimization has changed over the years. The big 'gotcha' in the middle caught me by surprise. An inspiring read." Hague begins: "Are particular performance problems perennial? Can we never escape them? This essay is an attempt to look at things from a different point of view, to put performance into proper perspective."
If only I had written my first post program with performance in mind I could have not failed it!
I code in managed environments (.NET and Java), so I just let some mysterious thing manage performance for me.
Unroll those loops by hand. You'll get a little bump in speed. And a little bump in your pocketbook.
Yeah. I bet you use bubblesort too...
should really read that essay! Maybe then we wouldn't need dual-core 4-6 GHz CPUs and 2GB ram to run their new OS.
bash: rtfm: command not found
> a sad inditement
Well, it does have a spell checker now...
A particular illustration of this was in my last semester's 'Systems Programming and Optimization' course. The professor set us a project where we could choose an interesting subsystem of a Linux distro, analyze the code, and point out possible areas where it could be further optimized. I'm a pretty enthusiastic Debian user, so I chose to analyze the apt-get code. Our prof was very focused on low-level optimizations, so the first thing I did was to pull apart apt-get's Perl codebase and start to recode sections of it in C. At a mid-semester meeting, the professor suggested that I take it even further, and try using some SIMD/MMX calls in x86 assembly to parallelize package load calls.
This was a big ask, but me and my partner eventually had something working after a couple of weeks of slog. By this stage, apt-get was *flying* along. The final step of the optimization was to convert the package database to a binary format, using a series of 'keys' encoded in a type of database, or 'registry'. This sped up apt-get a further 25%, as calls to a machine-readable-only binary registry are technically superior to old fashioned text files (and XML was considered too slow)
Anyway, the sting in the tail (and I believe this is what the article highlights) was that upon submission of our project, we discovered that our professor had been admitted to hospital to have some kidney stones removed. In his place was another member of the faculty...but this time, a strong Gentoo supporter! He spent about 5 minutes reading over our hand-coded x86 assembly version of apt-get, and simply said "Nice work guys, but what I really want to see is this extended to include support for Gentoo's 'emerge' system...and for the code to run on my PowerMac 7600 Gentoo PPC box. You have a week's extension'
Needless to say, we were both freaking out. Because we had focused so heavily on optimization, we had sacrificed a lot of genericity in the code (otherwise we could have just coded up 'emerge' support as a plug-in for 'apt-get'), and also we had tied it to Intel x86 code. In the end we were both so burnt out that I slept for 2 days straight, and ended up writing the 'emerge' port in AppleScript in about 45 minutes. I told the new prof to just run it through MacOnLinux, which needless to say, he wasn't impressed with. I think it was because he had destroyed his old Mac OS 8 partition to turn it into a Gentoo swap partition. Anyway, me and my partner both ended up getting a C- for the course.
Let this be a lesson...read the article, and take it in. Optimization shouldn't be your sole focus. As Knuth once said, "premature optimisation is the root of all evil". Indeed Donald, indeed. Kind of ironic that Donald was the original professor in this story. I don't think he takes his work as seriously as he once did.
If you compare a man to a monkey today, you see cognitive differences (granted, some similarities too) that tend to seperate these two related species. When a monkey attempts to open a locked box, he will try many times to pry the box open, bash at it with his fists, and the like. A man, however, quickly realizes the box is locked and uses a tool to break the lock. While the monkey's strategy is simple, and will _eventually_ get the box open, the man's strategy is more complex, but much more efficient.
Same goes with programming. If you have to search through a massive sorted database, no skilled programmer alive would use a linear traversal (simple, yet inneficient), they would use a binary search (more complex, yet far more efficient).
So when customers pay for you to get that locked box open, who's strategy will you choose?
Proper programming perspective? Please. People-centered programming? Pretty pathetic.
Programmer's purpose: problem-solving. Programmers prefer power - parallelizing, profiling, pushing pixels. Programmers prefer Pentium PCs - parsimonious processing power. Pentium-optimization passes Python's popularity.
Ponder.
[Previous painful posts: P, D]
One of my C++ instructors told us a story about one of his former coworkers who used to go through his programs and delete ALL whitespace from the code because he thought it would make the programs smaller and faster.
LK
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
a sad inditement
Your own post is a sad indictment of your spelling.
Is the time it take me to do the performance optimization worth it in time that I could be browsing /.
"I am not denying the existence of stupidity, or of stupid people." - phyruxus
Throwing more hardware at the problem has never solved a single performance problem, ever.
I'm not even going to start trying to talk sense into you. You are obviously a pig headed imbicile. That's OK, keep living in your own world where your leet optimisation skills are worth so much more than a CPU upgrade.
Your job will be going to India soon, my friend.
Yes, you always have to worry about those forking threads.
Sort of like a novel which tells one complete story and is one unified and self-contained package.
In other words, "a book".
LK
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
In some sense he was right:
:-).
The program source code gets smaller (by as many bytes as the removed whitespace occupied), and compiles faster (because the parser doesn't have to read and ignore all those whitespace characters).
Now, the size reduction might be quite measurable (esp. if the original program was quite readable), though not substantial. However, if the improved compile speed was measurable, the compiler must have had an incredibly slow parser (or maybe the system had mounted the disk with NFS through ppp over ssh through a 14.400 kbps modem link
The Tao of math: The numbers you can count are not the real numbers.
Dumbsort works something like this:Straight from MS-style programming books.
-
Roses are #FF0000, Violets are #0000FF, find / -name '*base*' |xargs chown -R us && mv zig greatjustice
Nooooo...... must..... use.......MysqL!!!!!!!!!!!!!!!11111111
TIS SO UBAR!1111
OMG
begin;
*insert 200k rows*
rollback;
connection to database lost?
hm..
On the other hand, lower complexity isn't always better. Quicksort is O(n^2) in the worst case, but in practice it's almost always an order of magnitude faster than the O(n log n) heap sort.
And here's a wrapper function which will turn any sort algorithm into an O(1) one:I suspect, however, that it will not improve performance.
And perhaps we could breed some very small furry humans as pets. I'm very sure there is a market for pet-humans as no less than 25% of the Andromedians voted yes on a survey asking if they would spend more than 100 Astrobucks on a pet-human if they would be smaller and less noisy.
"Let only the best programmers reproduce". Hmm. The best programmers are people like Richard Stallman. Well, there goes that plan for increasing overall programming quality.
XML is like violence. If it doesn't solve the problem, use more.
Like this fellow 'Coder' sitting next to me, that putting most of his code in between /* and */ because 'It compiled faster'...
he was wondering why his program wasn't working
I was wondering was he was doing in Computer Engineering school...
I live in Soviet Canuckistan you insensitive clod!
That only works if programming is heritable.
....
In your thought experiment, a few generations down all programmers will have greasy hair & shaggy beards (heritable) & the social need (heritable) to have greasy hair & shaggy beards, with no programming ability (not heritable)
No one knew what O notation was.
Not long after that I found out about knoppix. I burned a few disks and gave them out. Only one other person knew what linux was. It wasn't my manager.
Just last week one of our servers had a problem with IIS. "What's IIS?", My manager asks.
Here are some other gems
"We can't put indexes on our tables! It will screw up the data!"
"I've seen no evidence that shows set processing is faster than iterative processing" -- this one from our "Guru".
"What is a zip file and what am I supposed to do with it?" -- from one our our senior systems programmers in charge of citrix servers.
"What do you mean by register the dll?" -- from the same sysprog as above
They pushed a patch out for the sasser worm and about 2% of the machines got the BSOD. I finally decided to give the fix a try on my machine and it died too. I booted into safe mode and rolled it back. Everyone else had to get their machines reimaged because desktop support couldn't figure out what was wrong. Lucky for my neightbor I was able to recover most of his data before they wiped it clean. He made the mistake of letting his machine loop through reboots for two days, which hosed his HD up. Of course the PC "experts" couldn't recover any data because the machine wouldn't boot up.
Yes, I am in programmer purgatory. I am reluctant to say hell because I'm sure it can get worse. No, I'm not kidding.