Slashdot Mirror


Inside the Internet Archives

blackbearnh writes "O'Reilly Media is running an interview with Gordon Mohr, Chief Technologist for the Internet Archive (archive.org). If you've ever wondered how pages are selected for archiving, or just how they manage such a huge quantity of data, the answers are here. The interview also touches on the problems of intellectual property in archives, archiving the Internet in a post Web 2.0 world, and the potential vulnerabilities exposed by archiving web sites that may include security exploits."

2 of 85 comments

  1. Re:I wished archive.org stored even more stuff by RareButSeriousSideEf · · Score: 3, Interesting
    Yeah, how exactly do pages go AWOL from archive.org? I've encountered that, plus pages suddenly acquiring META refresh tags (maybe through an external script or iframe?) that redirect to some domain squatter's site now. Extremely annoying. I'm going to have to mess around with wget to see what's in the markup, unless someone can suggest an easier way to get at such content.

    Combining a bookmarking / caching service would be really handy. Furl fits that bill, doesn't it?
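
    For anyone wanting to check saved markup for the META refresh redirects mentioned above, here is a minimal sketch using only Python's standard library. It assumes the page has already been saved locally (with wget, for instance); the names `MetaRefreshFinder` and `find_meta_refresh` are made up for illustration. It only catches refresh tags present in the static HTML, so a redirect injected by an external script or iframe would not show up.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects the redirect targets of <meta http-equiv="refresh"> tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute usually looks like "5; url=http://example.com/".
        _, _, rest = attr.get("content", "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            rest = rest[4:].strip().strip("'\"")
        # An empty string means the tag reloads the same page (no URL given).
        self.targets.append(rest)


def find_meta_refresh(html):
    """Return the redirect target of every META refresh tag in the HTML text."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.targets


if __name__ == "__main__":
    sample = '<head><META HTTP-EQUIV="Refresh" CONTENT="0; URL=http://squatter.example/"></head>'
    print(find_meta_refresh(sample))
```

    To run it against a saved page, read the file into a string and pass that to `find_meta_refresh`; an empty list means no META refresh tag was found in the markup itself.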

  2. Remember Slashdot in its Infancy? by dbarron · · Score: 5, Interesting

    Check this out....it reads like a free software update blog :)
    http://web.archive.org/web/19980113191222/http://slashdot.org/