Slashdot Mirror


Knuth Got It Wrong

davecb writes "Think you've mastered the art of server performance? Think again. Poul-Henning Kamp, in an article at ACM Queue, finds an off-by-ten error in btrees, because they fail to take virtual memory into account. And he solves the problem in the open source 'Varnish' HTTP accelerator, for all of us to see and use."

4 of 298 comments (clear)

  1. Agreed by eldavojohn · · Score: 5, Insightful

    Knuth's analysis is valid in the framework of his assumptions, and what is described in the linked article has been known as "cache oblivious b-tree" for not so short time.

    Yeah, using this logic, once quantum computers get to the point of solving general problems there's going to be an awful lot of people who "got it wrong" because their algorithms do not apply in a quantum computer. Advancements in technology causing algorithms & data structures to be tweaked means the original person who thought them up "got it wrong"? Ridiculous.

    "Oh, RSA was brilliant. But they got it wrong. You see, they failed to account for xanxinglydiumapping in 2056 computers. Poor stupid wrong bastards."

    --
    My work here is dung.
  2. Knuth didn't get anything wrong by jfengel · · Score: 5, Insightful

    There isn't any "off by ten error", and this isn't telling us anything we don't already know (in CS terms): implementation on an actual computer can be different in performance from an abstract machine.

    What the author is saying (quite well) is that the virtual memory performance amounts to cache misses, which cause extra performance overhead. He found a case where it was significant and got a 10x speedup in his particular application.

    The article is a little over-zealous its characterization, though it's careful to note that this is not actually a theoretical novelty. The summary, on the other hand, bastardizes and exaggerates it.

    The article is interesting, and worth reading, but if you RTFS without RTFA you'll be dumber than you were before. Thanks, kdawson.

  3. Re:Nope, he didn't by SashaMan · · Score: 5, Insightful

    Mod parent up. There was a comment on the article that referenced cache oblivious algorithms, which was a new concept to me and very interesting. Basically, this set of algorithms assumes a memory hierarchy (e.g. fast ram vs slow disk) that is optimized to limit the number of times the slower memory is accessed. Importantly, cache oblivious algorithms are optimal REGARDLESS of the size of the cache. That's opposed to a cache aware algorithm, like a normal b-tree, where the size of each node is set according to the page size of the machine.

    A very helpful overview here from MIT Opencourseware: http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-fourteen/

  4. Re:Nope, he didn't by yuhong · · Score: 5, Insightful

    Especially when virtual memory did not exist 46 years ago.