Archiving Web Pages - Legal or Illegal?
Dyer asks: "I used to run several high-traffic anonymous surfing sites. If I wasn't getting emailed by a lawyer telling me to block someone's site from being accessed, I was being woken up at 2am by a telephone call from a crazy person yelling, sometimes swearing at me, under the impression that my site had copied theirs and that the copy resided on my server, when in actuality it was being accessed by my server at that instant and relayed to the user. This is my point: how do services like Archive.org and Google's cache get away with what they're doing? You can call their services whatever you like, but it doesn't change the fact that they are copying people's websites and saving them onto their servers for everyone to access."
Well, it should be legal/allowed. If you don't want it read and archived, don't put it on the Web.
Everything should be fair game, except for things like malicious alteration and theft (taking stuff and claiming it is yours).
On the day of 9/11, I began to think that maybe a lot of things would be online that would disappear on the next update, forever. We tend to think of 1880 newspaper clippings as being perishable, not online media, but the opposite is true. So all day on 9/11 I archived news sites and about two hundred blogs using "wget -p".
...what can I do with this data?
Over the next week I archived some 4,600 blogs. They've kind of been sitting around waiting for me to weed through and organize them. I've also been wgetting the front pages of 30 or so large news sites every 15 minutes or so, on the hunch that I'll grab something emerging even if I'm AFK. Well
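For anyone wanting to try the same kind of periodic front-page grab without wget, here is a minimal sketch in Python. It is not the poster's actual setup: the function names, the `snapshots` directory, and the timestamped file layout are all assumptions made for illustration, and unlike `wget -p` it saves only the HTML, not the images and stylesheets the page needs.

```python
import hashlib
import time
import urllib.request
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import urlparse

INTERVAL_SECONDS = 15 * 60  # "every 15 minutes or so"


def snapshot_path(base_dir, url, when):
    """Build a path that is unique per host and fetch time (hypothetical layout)."""
    host = urlparse(url).netloc or "unknown-host"
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    return Path(base_dir) / host / f"{stamp}.html"


def save_snapshot(base_dir, url, body, when):
    """Write one fetched page to disk; return its path and SHA-256 digest."""
    path = snapshot_path(base_dir, url, when)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(body)
    return path, hashlib.sha256(body).hexdigest()


def fetch(url, timeout=30):
    """Download the raw bytes of a single page."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def run_forever(urls, base_dir="snapshots"):
    """Fetch every URL, sleep for the interval, repeat until interrupted."""
    while True:
        now = datetime.now(timezone.utc)
        for url in urls:
            try:
                save_snapshot(base_dir, url, fetch(url), now)
            except OSError as exc:  # network errors from urllib are OSError subclasses
                print(f"skipped {url}: {exc}")
        time.sleep(INTERVAL_SECONDS)
```

Calling `run_forever(["http://example.com/"])` would keep one timestamped file per fetch under `snapshots/example.com/`. The digest is returned so that identical consecutive snapshots can be spotted later when weeding through the pile.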
The answer(s) to this question will definitely be of use to me. Thanks for asking it. Slash, thanks for posting it.
My
Limekiller
(FWIW, IANAL) Web site content is copyrighted. Therefore, you have a right to make your own personal copy, and backup copies, but it is not legal to redistribute those copies without the site owner's permission. I cannot imagine that the Wayback machine or the Google cache is legal. They are blatantly disregarding the site owners' copyright.
That said, I think the law should be changed or at least clarified, because it is patently (pun intended) obvious that those services are doing a vast social good, and should be encouraged.
I prefer the wicked to the foolish, because the wicked sometimes rest. -- Alexandre Dumas