On The Preservation Of Endangered Web Resources ...
An unnamed correspondent asks: "Recently Mathworld, what many would consider one of the more valuable Web resources, fell victim to a copyright lawsuit. We've seen in the past that through sufficient mirroring the community can save such resources (DeCSS for example) from similar legal onslaughts. What Web resources do you consider most valuable and/or most vulnerable to legal attack and is there any effort under way to mirror and preserve these resources?"
"Personally, I'd like to see an official community set up to protect such resources. Call them the Information League perhaps. Set up a mailing list for members and whenever some (perhaps corporate) entity tries to snuff out a Web site a member sends an e-mail to the list and all other concerned members could mirror the site."
the tablature / chord archive. it's been around as long as i have, and always seems to be on the verge of shutting down.
pezpunk
Internet killed the video star,
i could live a little longer in this prison
You don't have nearly the "copyright" issues at that point (e.g., news-site.com might not want you to be a public mirror while they're selling ad space and trying to make a living, but when they're getting their ass kicked in court, they may not be nearly as inclined to go after you for preserving their livelihood and image), and you can basically keep the "mirroring" active for a certain period of time and then drop it (e.g., if I put out an APB today to "mirror slashdot", but in 60 days slashdot is still around, drop the mirror; the crisis is over). You do that to conserve resources. Obviously, if the system has been told (as in the above example) that a site IS down, then it holds on to the mirror for as long as necessary (potentially forever), and interested parties could then mirror it themselves.
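The retention rule described above can be sketched roughly as follows. This is only an illustration: the `MirrorRecord` type, the `should_keep` function, and the fixed 60-day `WATCH_PERIOD` are made-up names for this example (the 60 days comes from the slashdot example in the post), not part of any existing system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical watch window, taken from the 60-day example in the post.
WATCH_PERIOD = timedelta(days=60)


@dataclass
class MirrorRecord:
    site: str
    alert_raised: datetime        # when the "mirror this site" call went out
    confirmed_down: bool = False  # set once the system is told the site IS down


def should_keep(record: MirrorRecord, now: datetime) -> bool:
    """Keep a mirror during the watch window, or indefinitely if the site is down."""
    if record.confirmed_down:
        return True
    return now - record.alert_raised < WATCH_PERIOD
```

A mirror whose site survives the watch window gets dropped to save disk and bandwidth, while one flagged as down is kept until somebody else has had a chance to copy it.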
There is something similar called Mojo Nation that sets up a peer-to-peer network over the internet in which people both use and provide resources. If you provide a resource, such as hard drive space, you earn "mojo"; if you use up a resource, you spend mojo.
It seems well organized, and it encourages people to provide resources and not just take them. Also, everything is encrypted and your files don't disappear with time!
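As a rough illustration of the earn-and-spend idea, here is a toy ledger. It is not how Mojo Nation actually does its accounting (I don't know those details); `MojoLedger`, `provide`, and `consume` are invented names, and a single in-memory ledger is assumed purely to keep the sketch short.

```python
from collections import defaultdict


class MojoLedger:
    """Toy credit ledger: providing a resource earns mojo, consuming one spends it."""

    def __init__(self) -> None:
        self.balances = defaultdict(int)  # peer name -> mojo balance

    def provide(self, peer: str, units: int) -> None:
        """Credit a peer for resources (e.g. disk space) it made available."""
        self.balances[peer] += units

    def consume(self, peer: str, units: int) -> bool:
        """Debit a peer for resources used; refuse if it cannot pay."""
        if self.balances[peer] < units:
            return False
        self.balances[peer] -= units
        return True
```

The point of the design is the refusal branch: a peer that only takes and never gives eventually runs out of credit.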
I haven't tried it yet, but if you have, I would like to hear from you.
________________
Private Essayist
There are agencies that are basically godzilla-sized racks of VCRs and tape recorders (well, it's probably all digital now) connected to satellite dishes, antennas, and cable. And they record EVERYTHING. And I mean EVERY channel, every radio station, everything, so that there is a "backup" of whatever was broadcast.
I almost worked at one of these places.
And if you wanted, say, to use a clip from some TV station, you could go and get appropriate copyright permission from the copyright owner, and then get the clip from the billions of tapes in the warehouse.
I'm surprised that there aren't people archiving every UseNet post. It would certainly be an interesting exercise.
--- Jump!! Fire!! Bullet time!! - Lego version of the Matrix
Fresh from the FAQ:
Q: What's this about a lawsuit?
A: In March 2000, CRC Press LLC, a subsidiary of Information Holdings Inc., filed a copyright infringement lawsuit in the Southern District of Florida, claiming that the web site mathworld.wolfram.com violates their copyright in Eric Weisstein's CRC Concise Encyclopedia of Mathematics published by CRC in November 1998.
Q: Why do they think the site violates their copyright?
A: Three and one-half years ago, Eric signed a book deal with CRC in which he agreed to provide printed, camera-ready pages for the encyclopedia. He thought he was selling them a printed snapshot of his existing web site, not the whole web site. CRC now claims that he sold them his whole web site, not just a printed book.
Q: So, did he sell them the web site or not?
A: Eric did not believe he was selling them his web site: he thought he was selling them the right to print a book and that he would be able to keep his web site up. If he had had more experience in the publishing industry, he would have insisted on a contract that made this crystal clear, but he didn't. Eric's contract, which is a standard boilerplate book contract that has probably been signed by many other CRC authors, does not give CRC explicit rights to the website. However, the court found that the contract is ambiguous on this point. What Eric intended to sell CRC is at the heart of this lawsuit.
Q: Doesn't the standard "right to reproduce in all media" clause cover the web site?
A: The web site is not based on or derived from the printed book: it existed for years beforehand. We believe and argue that the printed book is a derivative work. We don't dispute that CRC would have the right to put up a web site containing, for example, PDF files of the printed book. But we strongly object to the idea that their copyright in the printed book allows them to reach back and gain control of Eric's preexisting, ever-changing, collaborative internet community.
Q: Did Wolfram Research just cave in and yank the site to avoid trouble?
A: Absolutely not. We have kept the site up as long as we were able, but unfortunately CRC requested and was granted a preliminary injunction that orders us to take the site down until the case goes to trial. By direct order of the court, we had no choice and no alternative but to take it down.
Q: Isn't a lot more harm being caused by taking it down than leaving it up?
A: We respect the judge's well-reasoned opinion that the site should be taken down until the dispute is settled: he considered the evidence available to him in the legal record. He simply did not agree that the harm to the community at large would be enough to justify keeping the site available.
Browser? I barely know her!
Many major university computer science departments also have whole-Web archives for the purpose of running simulations of spiders and other automated information collecting and processing tools.
The main problem is that this information is not always publicly accessible and is within the long arm of the lawyers. Maybe the best way to implement this would be to arrange to have somebody like HavenCo purchase these snapshots on a monthly basis, keep them in near-line storage and move censored content that is deemed important by the Information League back "into print".
Weisstein certainly wouldn't be in this predicament if his website were being sold in book form: it predates the contract with CRC, and he says that it is not derived from that work (in fact, it's more likely that the reverse is true). Why should his website be treated any differently than any other former publication?
For a VERY long time, Eric absolutely demanded NO mirrors, and would firewall off and permanently deny access to anyone who tried to mirror it.
This is why I like to mirror. That way, if a wonderful resource gets blocked/denied/taken down, I can still use it. (treasuretroves, digitalblasphemy, ....)
-- Spoken as a small contributor and as someone who tried to mirror and was firewalled off.
However, considering the pending sale of Deja, the existing usenet archives do need to be recorded (if just for the Big 7 minus alt., just as references) instead of being sent to the bit bucket.
"Pinky, you've left the lens cap of your mind on again." - P&TB
"I can see my house from here!" - ST:
You can post your web sites on Mojo Nation (warning: this is in beta! It is not stable, but it works.). Documents posted to Mojo Nation are not deletable. (This is due to some complicated peer to peer architecture and RAID-like splitting of the data into multiple redundant shares, of which you need only a subset to reconstruct the original document. See the web site for docs.)
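To give a feel for "redundant shares, of which you need only a subset", here is a deliberately simple stand-in: k data chunks plus one XOR parity chunk, in the style of RAID 5. This is NOT Mojo Nation's actual encoding (see their docs for that); it survives the loss of only one share, and `split`, `reconstruct`, and `xor_bytes` are names made up for this sketch.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size chunks plus one XOR parity share (k + 1 total)."""
    size = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(size * k, b"\0")      # pad so every chunk has the same size
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]


def reconstruct(shares: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the data when at most one of the k + 1 shares is missing (None)."""
    missing = [i for i, s in enumerate(shares) if s is None]
    if len(missing) > 1:
        raise ValueError("this toy scheme tolerates only one lost share")
    shares = list(shares)
    if missing:
        present = [s for s in shares if s is not None]
        # XOR of all surviving shares equals the lost one.
        shares[missing[0]] = reduce(xor_bytes, present)
    return b"".join(shares[:-1])[:length]     # drop parity, strip padding
```

For example, `split(doc, 4)` yields five shares that could sit on five different peers, and any four of them are enough for `reconstruct` to return the original bytes. A real system would use a stronger erasure code so that several shares can be lost at once.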
Regards,
Zooko, Evil Geniuses For A Better Tomorrow
People mention Freenet, but Freenet only protects information that is already there. You've got to make sure that if police or military forces come bursting in one night, the information is already stored somewhere they can't shut down, and can be distributed from there.
Also, it is important that people who support a site don't use too much bandwidth and HD space before it gets serious. Otherwise, people may not be able to give the necessary resources.
What I have in mind is a network where those providing endangered resources can issue a call for support (CFS). Those who respond to the CFS set up software to download an image of the site every now and then (say once a month or once a week), and at least after controversial information has been published.
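The download schedule could be decided by something as small as the sketch below. Everything here is hypothetical: `snapshot_due` and its parameters are names invented for the example, and how a supporter would learn that controversial material was just published (the `last_flagged_update` timestamp) is left open.

```python
from datetime import datetime, timedelta
from typing import Optional


def snapshot_due(
    last_snapshot: Optional[datetime],
    now: datetime,
    interval: timedelta,
    last_flagged_update: Optional[datetime] = None,
) -> bool:
    """Decide whether a supporter should fetch a fresh image of the site."""
    if last_snapshot is None:
        return True  # never mirrored yet
    # Controversial material published since the last image: fetch right away.
    if last_flagged_update is not None and last_flagged_update > last_snapshot:
        return True
    # Otherwise fall back to the regular weekly or monthly schedule.
    return now - last_snapshot >= interval
```

Keeping the regular interval long and only jumping on flagged updates is one way to stay within the bandwidth budget mentioned above.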
Next, we need something that sets off an alarm when the endangered site is being attacked. This has to include the possibility that the site just goes down without warning (military forces shoot the webmaster and blow everything to pieces, to take an extreme). This could be done by checking every now and then whether the server is up; if it stays down for any extended period of time, the alarm goes off. Naturally, there must not be too many false alarms, or the system will lose credibility. This pretty much rules out Windows as a platform.... :-) Also, it should be possible for the administrator to set off the alarm with a single command, so that if somebody comes bursting in, they have to act fast to stop the information from being transmitted. Other features could help too, such as the administrator saying "if my site goes down at 12:15 and you don't hear from me, we're under attack". Also, intelligence agencies might try to fool the system into mirroring useless or bogus information, so we would have to work hard to stay one step ahead.
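The "stays down for an extended period" rule and the single-command trigger might look something like this. It is a sketch under assumptions: `DowntimeAlarm` is an invented name, the actual up/down probe is left to the caller, timestamps are plain seconds, and the threshold value is whatever the network agrees on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DowntimeAlarm:
    threshold: float                    # seconds of continuous downtime before alarming
    down_since: Optional[float] = None  # time of the first failed check in this outage
    triggered: bool = False             # stays latched until a human clears it

    def record_check(self, now: float, site_up: bool) -> bool:
        """Feed in one probe result; returns True once the alarm is active."""
        if site_up:
            self.down_since = None      # a successful check ends the current outage
        else:
            if self.down_since is None:
                self.down_since = now
            if now - self.down_since >= self.threshold:
                self.triggered = True
        return self.triggered

    def manual_trigger(self) -> None:
        """The administrator's single-command panic button."""
        self.triggered = True
```

Requiring continuous downtime rather than a single failed check is what keeps false alarms down: a brief outage followed by a successful probe resets the clock.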
If a site is under attack, there are a number of things that could be done. First, put up a mirror of any information that you have stored and dump it on Freenet. Maybe some sort of system could be set up so that nameservers are updated to point at one of the mirrors, so that the web site has very little downtime? Perhaps a global network of name servers similar to the two provided by Granite Canyon's Public DNS service, where authority can be transferred as part of an alarm. One can also attempt to keep e-mail working, but that's of little use if the admin has been shot.... If the alarm has been set off by the admin, one should try to download a fresh mirror as part of the alarm response to get the latest content.
I have also been thinking about how to use the internet to try to keep those suppressed online using minimalist solutions, e.g. TCP/IP over ham radio. It might have low bandwidth, but perhaps sufficient for e-mails...?
Employee of Inrupt, Project Release Manager and Community Manager for Solid
For me, the most valuable resource on the Web that is at risk of disappearing is Dejanews.
We've already lost everything older than one year, and now what's left is being sold off to some unnamed party.
How much legal strong arming by some pro-censorship or copyright protection group will be required to remove it forever? Not to mention the COS.
I use Dejanews daily, and would sorely miss even this now diminished archive.