Slashdot Mirror


How the Wayback Machine Works

tregoweth writes: "O'Reilly has an interview with Brewster Kahle about how The Internet Archive's Wayback Machine works, with lots of juicy details about how the biggest database ever built works."

3 of 134 comments

  1. They haven't got http://web.archive.org/ by Rentar · · Score: 5, Funny

    They don't seem to think the history of their site would be interesting: http://web.archive.org/web/*/http://web.archive.org/ redirects you to their index.html! boring!

    Now, that would really be a test for their apps. Same as if Google indexed www.google.com (entirely).

  2. Not the biggest DB by costas · · Score: 5, Informative

    100 TB does not make the biggest DB ever. I am personally working on a 60-70 TB ERP system that's also writeable; I am sure there are bigger systems out there (e.g. Wal-Mart's or GM's ERP systems come to mind).

    A read-only DB containing highly-compressible text does not really make for a very challenging datamine. Just because it's on and about the Web and sexier than a stodgy ERP system should not make you overlook the real technology.

  3. Noooooooooo !!! by morzel · · Score: 5, Funny
    Please please please please do _NOT_ google it... It was embarrassing enough when Google acquired Deja News and put the old Usenet archives online. :-)
    I just visited some sites that I had hoped had disappeared completely from cyberspace. The only defense I've got now is the old cryptic URLs of these monstrosities... Indexing that database would be a disaster, especially with an unusual name like mine...
    (Yes, I was stupid enough to use my real name ;-)
    Damn you, wayback :p

    --
    Okay... I'll do the stupid things first, then you shy people follow.
    [Zappa]