Archiving Web Pages - Legal or Illegal?
Dyer asks: "I used to run several high-traffic anonymous surfing sites. If I wasn't getting emailed by a lawyer telling me to block someone's site from being accessed, I was being woken up at 2am by a telephone call from a crazy person yelling, sometimes swearing at me, under the impression that my site had copied theirs and that the copy resided on my server, when in actuality their site was being accessed by my server at that instant and relayed to the user. This is my point: how do services like Archive.org and Google's cache get away with what they're doing? You can call their services whatever you like, but it doesn't change the fact that they are copying people's websites and saving them onto their servers for everyone to access."
Archive.org FAQ
How can I remove my site's pages from the Wayback Machine?
The Internet Archive is not interested in preserving or offering access to Web sites or other Internet documents of persons who do not want their materials in the collection. By placing a simple robots.txt file on your Web server, you can exclude your site from being crawled as well as exclude any historical pages from the Wayback Machine.
See our exclusion policy.
You can find exclusion directions at exclude.php. If you cannot place the robots.txt file, opt not to, or have further questions, email wayback2@archive.org.
In other words, by NOT including a robots.txt file, you are implicitly granting them permission to cache your content. Also, the content is cached as it was published, complete with the appropriate markings, and only publicly accessible content is cached, so you'd be hard pressed to argue there is any economic harm from the caching. That means there would likely be no damages from a successful copyright suit, which means a copyright suit would be pretty damned unlikely.
IANAL.
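For anyone wondering what that opt-out looks like in practice, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents, the example.com URLs, and the helper name allowed() are made up for illustration, and "ia_archiver" is assumed here to be the user-agent name the Archive's crawler answers to; check the Archive's own exclusion directions before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only.
# "ia_archiver" is an assumed crawler name, not taken from the FAQ text above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The archiving crawler is shut out of the whole site.
    print(allowed("ia_archiver", "http://example.com/index.html"))
    # Other crawlers may fetch public pages but not /private/.
    print(allowed("SomeOtherBot", "http://example.com/index.html"))
    print(allowed("SomeOtherBot", "http://example.com/private/notes.html"))
```

Note that this only shows how a well-behaved crawler would read the file. Nothing in robots.txt forces a crawler to obey it, which is part of why the legal question comes up at all.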
Well, it should be legal/allowed. If you don't want it read and archived, don't put it on the Web.
You know, I've been wondering about Java/Shockwave games. Certainly most kids would love a CD full of those games, and many companies have many different games online which mostly disappear a few months later.
Is anybody archiving these? Do we need to start?
Would the companies object?
You can play The Hitchhiker's Guide to the Galaxy on Douglas Adams' web site. As it happens, if you know what you're doing you can also download the .z5 file and play it offline on any Z-machine interpreter. Would the copyright owners object to it? I own that Infocom 33-game collection and all 5 books; the reason the game wasn't included in the collection is copyright hassles. Am I "entitled" to play it offline?
This ties in to today's "is ROM collecting wrong" story, except in this case you're actually offered the games, under mostly unclear terms.