Interview with Brewster Kahle
Netmonger writes "A fascinating interview with the man behind The Wayback Machine. Some specs from the article: "It's 150-odd standard PC cases, with four drives in each.. 'Over 100 terabytes.. As plain text in book form, that'd be over 3000 miles of shelf space.." All I can say is.. Wow!"
So why would you want to preserve all of it? Why not just get the good stuff, and maybe he won't need so many computers? I understand that just choosing the good stuff would be very subjective, but do we really need archives of pr0n sites and popups?
Visualize Whirled Peas
It's a shame that some of the more interesting moments in Internet history are so transient the Wayback Machine can't catch them.
E.g. the Ded Kitty picture we put up when Napster shut down at the start of September: it was only there for a few hours, but it will be lost.
Of course, some of the more interesting transient events are websites that are hacked, but there exist dedicated archives for this kind of event, so you can relive the hilarity of RIAA.org being repeatedly defaced.
As Borges once said about the Library of Babel wayback now...
...The Library is a sphere whose exact center is any one of its hexagons and whose circumference is inaccessible.
The universe (which others call the Library) is composed of an indefinite and perhaps infinite number of hexagonal galleries, with vast air shafts between, surrounded by very low railings.
Looks like he wasn't too far off...
Well, maybe not...
Perhaps we need to propose an extension to the robots.txt file to tell certain classes of search crawler to visit more frequently or at specific times?
Here. They seek to create physical items (clocks and libraries are two items they name) that will last for very, very long periods of time. This diagram shows what is meant by the "long now", and this is a link to their first prototype clock that is on display in the Science Museum in the UK (the second clock on the page).
Hint: Don't put security pages in your robots.txt which aren't supposed to be linked.... or at least secure them with a password.
http://www.zone-h.org/en/news/read/id=894/
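To illustrate the robots.txt hint above: the file is publicly readable, so every Disallow line tells a curious visitor exactly which paths you would rather not have found. Here is a rough Python sketch of how little effort that takes. The sample file and the /admin-backup/ path are made up for illustration, and the parsing is deliberately simplified (it ignores User-agent grouping).

```python
def disallowed_paths(robots_txt):
    """Return every non-empty Disallow path found in a robots.txt body."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        if field.strip().lower() == "disallow" and value:
            paths.append(value)
    return paths


# Hypothetical robots.txt contents, invented for this example.
SAMPLE = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /admin-backup/   # the "hidden" page is now advertised
"""

print(disallowed_paths(SAMPLE))
```

Anything that shows up in that list needs real access control (a password, at minimum), since the listing itself does nothing to keep people out.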
Small personal thanks from me. I had put an online exhibit of my artwork up a few years ago, but unfortunately lost all of it to a hard drive failure. Much to my surprise I was able to find nearly all of my site, http://www.gpapassavas.com, online and backed up on the WBM.
I always have to chuckle when I see these analogies. "If you printed all of the data on a CD-ROM, it would reach Mars!"... that's super.
There are at least two problems with such analogies:
1) People use them to comment on the marvelous efficiency of technology - but in reality, it's only a comment on the hideous inefficiency of print. It doesn't say much at all about technology. It might be useful to convince people to digitize/OCR their printed matter - but is anyone *not* doing this? Even the Library of Congress is scanning its texts now.
2) In this case it's a particularly bad analogy, because it assumes that all data is printed as hex. Example: images, which are obviously a huge, huge chunk of the Wayback archive. Virtually all website images are small enough to print on a printed page at full resolution. But consider a 500x500-pixel image, at 16 bits (2 bytes per pixel, 2 chars to represent each byte)... that's 1,000,000 characters, or 1,000 pages!
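For anyone who wants to check the arithmetic in point 2, here is a small Python sketch. The image size, bytes per pixel, and hex characters per byte come straight from the example above; the 1,000 characters per page figure is not stated anywhere and is only what the "1,000 pages" number implies, so treat it as an assumption.

```python
# Figures from the example: 500x500 pixels, 2 bytes per pixel,
# 2 hex characters to print each byte.
WIDTH = 500
HEIGHT = 500
BYTES_PER_PIXEL = 2
HEX_CHARS_PER_BYTE = 2

# Assumption: implied by "1,000,000 characters, or 1,000 pages".
CHARS_PER_PAGE = 1000


def hex_dump_chars(width, height, bytes_per_pixel,
                   chars_per_byte=HEX_CHARS_PER_BYTE):
    """Characters needed to print an uncompressed image as hex."""
    return width * height * bytes_per_pixel * chars_per_byte


def pages_needed(chars, chars_per_page=CHARS_PER_PAGE):
    """Whole pages needed for that many characters (rounded up)."""
    return -(-chars // chars_per_page)


chars = hex_dump_chars(WIDTH, HEIGHT, BYTES_PER_PIXEL)
print(chars, "characters,", pages_needed(chars), "pages")
```

Change CHARS_PER_PAGE to a denser page and the page count drops, but the point stands: one small image printed as hex still dwarfs the single page it would take to print as a picture.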
Basically the analogy is good for wildly inflating some numbers to stun the 0.00001% of the population that doesn't already realize these things.
- David Stein
Computer over. Virus = very yes.
In presentations, Brewster says his policy is to take out the complainers. So if you think having your site in the Wayback Machine is a copyright infringement, he'll just take it out. Meanwhile he's taking the Napster approach: assume what you're doing is legal until someone tells you to stop. Hopefully that day won't arrive.
If there is a way to permanently erase pages from the archive, I would be a little less worried. But I can never tell if they let you delete stuff, or just "block" it. "Blocking" is crap; we all know what that will be worth if somebody really wants the info someday and knows the Archive has it.
on how long before a politician has to resign because of some over the top statements he/she made in a flamewar back in college? Or maybe that webpage of ethnic jokes that seemed so hilarious back in high school.
I have a feeling we are either going to have to become way more forgiving, or we're going to be stuck with only faceless boring types with no opinions as our leaders (no wisecracks, it could be much worse than it is now).
"It is easier to ask for forgiveness than permission."