Slashdot Mirror


Interview with Brewster Kahle

Netmonger writes "A fascinating interview with the man behind The Wayback Machine. Some specs from the article: "It's 150-odd standard PC cases, with four drives in each.. 'Over 100 terabytes.. As plain text in book form, that'd be over 3000 miles of shelf space.." All I can say is.. Wow!"

10 of 195 comments

  1. Wait wait by PygmyTrojan · · Score: 2, Insightful

    Now how many Libraries of Congress is that?

    --

    Trying is the first step towards failure.

  2. Like all those crappy old buildings... by FreeUser · · Score: 3, Insightful

    A lot of internet information is crap... So why would you want to preserve all of it? Why not just get the good stuff and maybe he won't need so many computers.

    And of course, you're going to decide what is "good" and what "isn't?" He is providing the resource for, among other things, scholarly researchers. Of what use is the data if it has been hand edited according to one person's aesthetics or another's?

    Indeed, your comment reminds me of one that was heard quite often, shortly before beautiful and irreplaceable old buildings were razed to make way for a new strip mall, or, in downtown Chicago, a couple of new government buildings whose architectural style is best described as "Federal Drab." Preserving as much as possible is a good thing, because none of us can tell what will be valuable, and what will not, in another 20 or 30 years, and no one's aesthetic should be dictating such a decision to entire generations to come.

    --
    The Future of Human Evolution: Autonomy
  3. Sounds good... by C0LDFusion · · Score: 2, Insightful

    ...except for the fact that he allowed the Church of Scientology to bend him over and use him like a toy. Why doesn't he get some Google backbone and refuse to bow to their DMCA threats?

    Oh, I forget that honor is dead on the internet.

    --
    Only in slashdot are posts of solidarity modded at -1 Redundant, while posts of antagonism are modded as -1 Flamebait.
  4. Re:A lot of internet information is crap... by Vellmont · · Score: 2, Insightful

    He brings up this point in the article. It's important to archive everything because we never know what's going to be useful information in the future.
    In other words, perspective and context are a huge part of determining value and meaning. At some point these annoying popup ads may be important for someone studying the evolution of advertising on the net. In fact, popups, or the frequency or timing of them, might be something that's missing from the archive.
    Most of the culture is invisible to most of us most of the time. The things we take for granted are the most ingrained into us, and possibly the most interesting to someone after the culture has changed.

    --
    AccountKiller
  5. Re:A lot of internet information is crap... by kscguru · · Score: 2, Insightful
    50 years from now when historians dig through 2002 e-mail logs, they'll probably think the most heavily consumed product in the country was (insert random spam product here).

    Ah, the legacies we'll leave... based on YOUR e-mail, what will YOU be remembered as?

    --

    A witty [sig] proves nothing. --Voltaire

  6. Odd, no copyright questions by dsanfte · · Score: 5, Insightful

    I was curious as to how the Wayback Machine's operators view its legal status... I mean, it's not really a search engine in the broadly accepted meaning of the term. It doesn't just search what's out there, it archives entire pages of old information. And while search engine sites do this (Google), this is ALL the Wayback Machine site does.

    Surely they must know they're treading on untested legal ground. All it might take is one offended copyright holder to bring the whole thing to its knees. Basing it in a country other than the USA might have been smarter, then, given the existence of laws like the DMCA which could serve to shut the site down.

    --
    occultae nullus est respectus musicae - originally a Greek proverb
  7. Re:A lot of internet information is crap... by garcia · · Score: 4, Insightful

    I think that storing everything on computers will make historians' jobs MUCH less difficult but a lot less fun.
    Doing historical research is fun b/c you get to get your hands dirty (literally). I spent 6 hours a day for three weeks researching crime rates in Toledo, OH during prohibition (before, during, and after) and b/c the books were all handwritten and they were so old my hands turned black for days at a time...
    It would have been MUCH easier if all the information was sorted and easily found. I guess it would make future historians' jobs easier, but what fun would that be?

    Just my worthless .02

  8. Why only four? by pla · · Score: 4, Insightful

    Out of curiosity, why only four drives per PC?

    With a simple $10 PCI IDE card (per additional 4), you could have gotten at *least* 8 drives, possibly as many as 16, per case. Granted, not many cases will let you *mount* that many, but I would expect paying a few bucks extra for the IDE cards and a better case would save quite a bit of money (and physical space) by halving or quartering the number of PCs you need ($100 extra to save $1500 per $2000, not counting the drives themselves?).

    88 linear feet of machines vs. 22. One requires an entire room, one would fit on a standard sized 3-or-4-tier storage rack. Of course, speaking of racks (of a different sort)... What on earth made you go with an array of standard PCs rather than a RAID-in-a-rack?
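
    The arithmetic behind that comparison can be sketched roughly as below. This assumes exactly 150 machines with four drives each (the article only says "150-odd") and that shelf length scales linearly with the number of machines. The function and constant names are illustrative, not taken from the article.

```python
import math

# Assumption: exactly 150 PCs holding four drives each (the article says "150-odd").
TOTAL_DRIVES = 150 * 4

# Assumption: the 88 linear feet quoted above is spread evenly over those 150 machines.
FEET_PER_MACHINE = 88 / 150


def machines_needed(total_drives: int, drives_per_case: int) -> int:
    """Smallest number of cases that can hold every drive."""
    return math.ceil(total_drives / drives_per_case)


def shelf_feet(machines: int) -> float:
    """Linear feet of shelf taken up by that many machines."""
    return machines * FEET_PER_MACHINE


if __name__ == "__main__":
    for per_case in (4, 8, 16):
        count = machines_needed(TOTAL_DRIVES, per_case)
        print(f"{per_case:>2} drives/case: {count:>3} machines, "
              f"about {shelf_feet(count):.0f} linear feet")
```

    Under those assumptions, sixteen drives per case is what brings the footprint down to roughly a quarter, which matches the 88 vs. 22 figures.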

  9. Re:A lot of internet information is crap... by aengblom · · Score: 3, Insightful

    I think that storing everything on computers will make historians' jobs MUCH less difficult but a lot less fun.

    I think it's more that it will be different people. Understanding most of history is constrained by the lack of data about that time. Our age is precisely the opposite. We try and save EVERYTHING we can possibly afford--because we know that crap will be valuable to many people later on. For the next century's historians it will be about data sampling and extracting the gold nuggets from all the crap we have saved.

    It will be the folks who built Google. Not the current type of folks.

    That said, it's better to have too much than too little.

    --


    So close and yet so far from the world's perfect ID number
  10. The Wayback machine is a lie by corebreech · · Score: 5, Insightful

    Try accessing news stories immediately prior to and after the September 11 attack and you'll see just how valuable this website is... or rather, isn't.

    I have also personally run a website which contained fairly controversial material (based on this story) that I saw listed on their website and then removed shortly thereafter. Tell me, why would a service like this ever have occasion to remove material once it's been archived, especially if there are *NO* copyright issues and the webmaster of the archived site never asked them to remove it?

    The answer is simple: the powers-that-be saw how dangerous it was to make all this information available to anyone on demand so they took control. It would be a great service were it allowed to operate unfettered, but the reality is quite different.

    And I'm the first to mention this here so far? You should all be modded down -1 for naiveté.