Knuth Got It Wrong
davecb writes "Think you've mastered the art of server performance? Think again. Poul-Henning Kamp, in an article at ACM Queue, finds an off-by-ten error in btrees, because they fail to take virtual memory into account. And he solves the problem in the open source 'Varnish' HTTP accelerator, for all of us to see and use."
wow, my mind just got blown
Funny how often I wind up using those tags.
Hail Eris, full of mischief...
E pluribus sanguinem
Not to mention the author's frame of reference is totally invalid, because in the real world, we make sure we have so much RAM in our servers that virtual memory never becomes that large of a factor. So, bottom line, some nobody trolling for some fame on the back of one of our most respected predecessors.
cat
If a set of closely related nodes can fit in a cache line, this improves locality so that the CPU can do more work instead of twiddling its thumbs.
That's sort of ... obvious? But off-topic.
RTFA: it revolves around I/O optimization. It was tested on system with 16GB RAM and 300GB mmaped storage. LOL. Essentially author claims that if you use binary tree for on-disk data look-ups it would be *surprise* very slow. Then he goes on to reinvent some b-trees, tuned for the task at hand, and shows how they are 10 times faster.
All hope abandon ye who enter here.
Not to mention the author's frame of reference is totally invalid, because in the real world, we make sure we have so much RAM in our servers that virtual memory never becomes that large of a factor.
You are wrong - about where he's wrong. The author essentially takes a very niche application, optimizes it and then goes to wildly extrapolate his finding on CS in general. That's where he's wrong.
Lots of server applications (RDBMSs, HTTPDs) would love to get better use of VM and load all the junk once and serve it as if from memory. The problem is that in RTFA's niche application (very dumb cache) one doesn't have to take care for things like coherency or consistency, something most other applications are bound to provide out of box and what contributes lion share of their complexity. And yes, instead of increasing the applications complexity, we'd rather throw some buck at RAM and keep it simple instead.
All hope abandon ye who enter here.