On The Preservation Of Endangered Web Resources ...
An unnamed correspondent asks: "Recently Mathworld, what many would consider one of the more valuable Web resources, fell victim to a copyright lawsuit. We've seen in the past that through sufficient mirroring the community can save such resources (DeCSS for example) from similar legal onslaughts. What Web resources do you consider most valuable and/or most vulnerable to legal attack and is there any effort under way to mirror and preserve these resources?"
"Personally, I'd like to see an official community set up to protect such resources. Call them the Information League perhaps. Set up a mailing list for members and whenever some (perhaps corporate) entity tries to snuff out a Web site a member sends an e-mail to the list and all other concerned members could mirror the site."
the tablature / chord archive. it's been around as long as i have, and always seems to be on the verge of shutting down.
pezpunk
Internet killed the video star,
i could live a little longer in this prison
But what about in practice? I, for one, would sign up, guaranteed. But as tends to happen, when the information "bat signal" was sent out I probably wouldn't (A) have the space, (B) have the energy, or (C) have the bandwidth to attempt to mirror one site, let alone many. Plus, all the gov has to do is get a warrant, get hold of the sign-up list, and go after each one of us. Yeah, I know I'm being paranoid. And I know that there are probably some people with the time, energy, space and bandwidth to do this. But out of, say, 15000 people I bet only 2 or 3 actually do something. Just my two cents, so don't hate me.
----------------------------------
Looking for hardware (Currently need: Large Etch-a-Sketch) Have one? See my journal!
Wouldn't you then end up with the lawyers targeting the Information League and putting them out of business (by suing the individual members or the ISPs that provide their access, if nothing else), then going back and targeting the original victim?
While I realize that "copyright" is kind of a nasty word these days, any time you talk about doing this sort of thing, you're going to run into copyright laws. If the laws are wrong, then you work to change them either by going through the system, or by being prepared to stand up and take your medicine if (when) the authorities come down on you (see any discussion of Civil Disobedience). Otherwise, you earn the distinction of being a scofflaw and get no respect from society at large.
...phil
"For a list of the ways which technology has failed to improve our quality of life, press 3."
Information cannot be wiped from it, because requests for the information only increase its popularity, and thus spread it further across more hosts.
The only problem with FreeNet is that it is not, and does not intend to be, a permanent storage repository for information. If no-one requests your document, it will eventually fade off of the network.
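The fading behaviour described above can be pictured as a fixed-size store that throws out whichever document was requested least recently. This is only a rough sketch of the idea, not Freenet's actual datastore; the class name `FadingStore` and its `capacity` parameter are made up for the example.

```python
from collections import OrderedDict


class FadingStore:
    """Hypothetical node store: keeps at most `capacity` documents and
    drops the least recently requested one when it runs out of room."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.docs = OrderedDict()  # key -> content, least recently used first

    def insert(self, key, content):
        self.docs[key] = content
        self.docs.move_to_end(key)
        while len(self.docs) > self.capacity:
            # The document nobody has asked for lately fades away.
            self.docs.popitem(last=False)

    def request(self, key):
        if key not in self.docs:
            return None
        # Every request moves the document to the "fresh" end of the store.
        self.docs.move_to_end(key)
        return self.docs[key]
```

In this toy model a document survives only as long as people keep asking for it, which is why popular material spreads and forgotten material disappears.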
--
This message brought to you by Colin Davis
Colin Davis
You don't have nearly the "copyright" issues at that point (e.g., news-site.com might not want you to be a public mirror while they're selling ad-space and trying to live, but when they're getting their ass kicked in court, they may not be nearly as inclined to go after you for preserving their livelihood and image), and you can basically keep the "mirroring" active for a certain period of time and then drop it (e.g., if I put out an APB today to "mirror slashdot", but in 60 days slashdot is still around, drop the mirror; the crisis is over). You do that to conserve resources. Obviously if the system has been told (as in the above example) that a site IS down, then it holds on to it for as long as necessary (forever, potentially), so that interested parties could then mirror it themselves.
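The retention rule above is simple enough to write down. A minimal sketch, assuming a 60-day window as in the slashdot example; the function name `keep_mirror` and its parameters are hypothetical.

```python
from datetime import date, timedelta


def keep_mirror(alert_date, today, site_confirmed_down, window_days=60):
    """Decide whether a mirror started on alert_date is still worth keeping."""
    if site_confirmed_down:
        # The original is gone: hold on to the copy indefinitely.
        return True
    # Otherwise keep it only while the crisis window is still open.
    return today - alert_date < timedelta(days=window_days)
```

For example, `keep_mirror(date(2000, 10, 1), date(2000, 10, 31), False)` keeps the mirror, while the same call 60 days after the alert drops it.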
There is something similar called Mojonation that sets up a peer to peer network over the internet by using and granting resources to people. If you provide a resource, such as hard drive space, you get "mojo"; if you use up a resource, you spend mojo.
It seems well organized and it promotes people providing resources and not just taking them. Also, everything is encrypted and your files don't disappear with time!
I haven't tried it yet, but if you have then I would like to hear from you.
________________
Private Essayist
Isn't the IMDB just a review site? Last time I checked, they hadn't made reviewing things illegal, at least not yet... (although several software EULAs have had a decent shot at it).
The very power of the deCSS mirrors is their association -- they have none. There's no authority or listing of who has what, so would-be litigators are hard-pressed to do anything about it.
In creating a 'membership', you are creating a mechanism for the dismemberment.
The idea is really cool, but the transfer rate is slow. Dog slow. Which is the opposite of how it's supposed to work. It's supposed to be that an army of modem users can each send a separate piece of the file to a DSL user, (or the reverse, of course). The latest version is supposed to be much faster. It doesn't work on my Linux box.
The DeCSS experience shows that corporations and trade groups with vast financial resources and legal clout have no problem firing off unlimited barrages of form "cease-and-desist" letters to ISPs, universities, webmasters ... etc.
Ultimately, I believe mirroring is a temporary solution to the copyright conundrum. It's high time a membership-based organization was formed -- kinda like the EFF of intellectual property -- to protect valuable online resources from succumbing to the profit-driven proprietarization of the Internet.
Sincerely,
Vergil
Insects and Grafitti Photos
Of course Mathworld is still with us! Of course it is!
o rld.wolfram.com/PolynomialEquation.html
;))
What is the answer? Forget freenet, gnutella, and whatever else you're thinking of. The pages are still out there. What we need is a great big URL filter, where dead pages can be resurrected, Lazarus-like, with this simple function:
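#include <stdio.h>

/* Write the Google cache URL for dead_url into alive, which holds size bytes. */
void alive_url(const char *dead_url, char *alive, size_t size)
{
    snprintf(alive, size, "http://www.google.com/search?q=cache:%s", dead_url);
}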
Thus, when we ask "Where is the mathworld page on polynomials?" we get the response http://www.google.com/search?q=cache:http://mathw
Now all we need to do is convert that to CGI, and run everything through this simple filter. Presto, live site.
Lord Google will look after us all.
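To make the "run everything through this simple filter" step concrete, here is a rough sketch of what the filter could do to a fetched page: rewrite each absolute link so that it points at the cache as well. It assumes links appear as double-quoted `href="http://..."` attributes; the helper name `resurrect_links` is made up, and the cache prefix is the one from the function above.

```python
import re

CACHE_PREFIX = "http://www.google.com/search?q=cache:"


def alive_url(dead_url):
    """Return the cache lookup URL for dead_url (same idea as the C version)."""
    return CACHE_PREFIX + dead_url


def resurrect_links(html):
    """Rewrite absolute http:// links in html so they go through the cache."""
    def fix(match):
        url = match.group(1)
        if url.startswith(CACHE_PREFIX):
            return match.group(0)  # already rewritten, leave it alone
        return 'href="' + alive_url(url) + '"'

    # Only double-quoted absolute links are handled; relative links and
    # fragment links are left untouched in this sketch.
    return re.sub(r'href="(http://[^"]+)"', fix, html)
```

A real CGI wrapper would also have to resolve relative links against the page's original address, which this sketch does not attempt.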
"Elmo knows where you live!" - The Simpsons
Useful when you see someone in a movie and think 'where'd I see that guy before?'. Look up the movie, find the actor, click his link, see a list of other movies he's been in, etc.
In any case, IMDB should be safe, I've never heard of laws against collecting this sort of information, and they built the database themselves (the database as a whole work would be copyrighted by them, actually).
There are agencies that are basically godzilla-sized racks of VCRs and tape recorders (well, it's probably all digital, now) connected to satellite dishes, antennas, and cable. And they record EVERYTHING. And I mean EVERY channel, every radio station, everything, so that there is a "backup" of whatever was broadcast.
I almost worked at one of these places.
And if you wanted, say, to use a clip from some TV station, you could go and get appropriate copyright permission from the copyright owner, and then get the clip from the billions of tapes in the warehouse.
I'm surprised that there aren't people archiving every UseNet post. It would certainly be an interesting exercise.
--- Jump!! Fire!! Bullet time!! - Lego version of the Matrix
Deja seems to be a pretty unique resource. They've already stopped allowing us access to "older" parts of their archives that I think are relevant. What happens when they decide they no longer want to support the current system at all? IMHO, these usenet archives need to be free and accessible to all. Damn! It bugs me that I can't even find something that I posted at the end of August... I need to find the answer again. It's like there is a black hole for about 10 days.
Fresh from the FAQ:
Q: What's this about a lawsuit?
A: In March 2000, CRC Press LLC, a subsidiary of Information Holdings Inc., filed a copyright infringement lawsuit in the Southern District of Florida, claiming that the web site mathworld.wolfram.com violates their copyright in Eric Weisstein's CRC Concise Encyclopedia of Mathematics published by CRC in November 1998.
Q: Why do they think the site violates their copyright?
A: Three and one-half years ago, Eric signed a book deal with CRC in which he agreed to provide printed, camera-ready pages for the encyclopedia. He thought he was selling them a printed snapshot of his existing web site, not the whole web site. CRC now claims that he sold them his whole web site, not just a printed book.
Q: So, did he sell them the web site or not?
A: Eric did not believe he was selling them his web site: he thought he was selling them the right to print a book and that he would be able to keep his web site up. If he had had more experience in the publishing industry, he would have insisted on a contract that made this crystal clear, but he didn't. Eric's contract, which is a standard boilerplate book contract that has probably been signed by many other CRC authors, does not give CRC explicit rights to the website. However, the court found that the contract is ambiguous on this point. What Eric intended to sell CRC is at the heart of this lawsuit.
Q: Doesn't the standard "right to reproduce in all media" clause cover the web site?
A: The web site is not based on or derived from the printed book: it existed for years beforehand. We believe and argue that the printed book is a derivative work. We don't dispute that CRC would have the right to put up a web site containing, for example, PDF files of the printed book. But we strongly object to the idea that their copyright in the printed book allows them to reach back and gain control of Eric's preexisting, ever-changing, collaborative internet community.
Q: Did Wolfram Research just cave in and yank the site to avoid trouble?
A: Absolutely not. We have kept the site up as long as we were able, but unfortunately CRC requested and was granted a preliminary injunction that orders us to take the site down until the case goes to trial. By direct order of the court, we had no choice and no alternative but to take it down.
Q: Isn't a lot more harm being caused by taking it down than leaving it up?
A: We respect the judge's well-reasoned opinion that the site should be taken down until the dispute is settled: he considered the evidence available to him in the legal record. He simply did not agree that the harm to the community at large would be enough to justify keeping the site available.
Browser? I barely know her!
In contrast, someone that goes out there and sets up a "Slimey Sex Site" has got to know that they will see some sort of opposition, whether from:
The "porn" site would seem to me to be more likely to have some funding and concern about such attacks.
In effect, it may be more likely that the "pornsters" will get attacked, in one way or another; the fact that they can expect such attacks leads to them "hardening" themselves, at least from a legal perspective.
Thus, the taxonomy may be more like:
If you're not part of the solution, you're part of the precipitate.
Lawsuits and threatening letters are expensive. Massive mirroring schemes work by making so many copies of the "forbidden" data that it would be prohibitively expensive to sue all the archives. If a company thinks that having some abandonware game available for free on the net will cost them $10,000 in lost revenue, and a nasty letter from the legal department costs $100, it makes financial sense to go after 1 or 10 or 50 mirrors. However, if there are more than 100 mirrors, shutting them all down would be more expensive than forgoing the revenue lost due to downloading. Mirroring won't stop lawsuits, but it can make them too costly to use in some cases.
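The arithmetic above, spelled out as a small sketch. The function names are made up, and the model is deliberately crude: one letter per mirror at a flat cost, which is all the example assumes.

```python
def worth_pursuing(mirrors, lost_revenue, cost_per_letter):
    """True if sending a letter to every mirror costs less than the revenue at stake."""
    return mirrors * cost_per_letter < lost_revenue


def mirrors_to_deter(lost_revenue, cost_per_letter):
    """Smallest number of mirrors at which shutting them all down costs
    more than simply forgoing the revenue."""
    return lost_revenue // cost_per_letter + 1
```

With the figures from the example ($10,000 at stake, $100 per letter), going after 50 mirrors costs $5,000 and still pays off, 100 mirrors is break-even, and `mirrors_to_deter(10000, 100)` gives 101.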
0 1 - just my two bits
Many major university computer science departments also have whole-Web archives for the purpose of running simulations of spiders and other automated information collecting and processing tools.
The main problem is that this information is not always publicly accessible and is within the long arm of the lawyers. Maybe the best way to implement this would be to arrange to have somebody like HavenCo purchase these snapshots on a monthly basis, keep them in near-line storage and move censored content that is deemed important by the Information League back "into print".
Weisstein certainly wouldn't be in this predicament if his website were being sold in book form: it predates the contract with CRC, and he says that it is not derived from that work (in fact, it's more likely that the reverse is true). Why should his website be treated any differently than any other former publication?
For a VERY long time, Eric absolutely demanded NO mirrors, and would firewall off and permanently deny access to anyone who tried to mirror it.
This is why I like to mirror. That way, if a wonderful resource gets blocked/denied/taken down, I can still use it. (treasuretroves, digitalblasphemy, ....)
-- Spoken as a small contributor and as someone who tried to mirror and was firewalled off.
You can post your web sites on Mojo Nation (warning: this is in beta! It is not stable, but it works.). Documents posted to Mojo Nation are not deletable. (This is due to some complicated peer to peer architecture and RAID-like splitting of the data into multiple redundant shares, of which you need only a subset to reconstruct the original document. See the web site for docs.)
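For readers wondering how "you need only a subset of the shares" can work, here is the simplest possible illustration: two data halves plus an XOR parity share, where any two of the three rebuild the document. This is not Mojo Nation's actual scheme (see its docs for that), just a sketch of the general idea; the names `make_shares` and `rebuild` are invented for the example.

```python
def xor(x, y):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(i ^ j for i, j in zip(x, y))


def make_shares(data):
    """Split data into shares "a", "b" and parity "p"; any two are enough."""
    padded = data + b"\0" * (len(data) % 2)  # pad to an even length
    half = len(padded) // 2
    a, b = padded[:half], padded[half:]
    return {"a": a, "b": b, "p": xor(a, b)}


def rebuild(shares, length):
    """Reconstruct the original `length` bytes from any two shares."""
    if len(shares) < 2:
        raise ValueError("need at least two shares")
    a, b, p = shares.get("a"), shares.get("b"), shares.get("p")
    if a is None:
        a = xor(b, p)  # a = b XOR (a XOR b)
    elif b is None:
        b = xor(a, p)  # b = a XOR (a XOR b)
    return (a + b)[:length]
```

Losing any single share (say the host holding "a" gets shut down) still leaves enough to get the document back. Real systems use codes that tolerate the loss of many shares, not just one.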
Regards,
Zooko, Evil Geniuses For A Better Tomorrow
CRC representatives will be at a number of technical conferences this year, including the Computer Security Conference in Chicago next week. I intend to visit their booth and talk to their representative about this shameful action. You should, too.
Assuming they don't just use an attractive freelancing schoolteacher, which other book companies seem to do...
just search bn.com for "crc math". They're still $100 tho.
try { do() || do_not(); } catch (JediException err) { yoda(err); }
You can still get the MathWorld site out of the google cache. Here's a quick and dirty hack to make the google cache "navigable": http://net127.com/gc/index.cgi/mathworld.wolfram.com/topics/
this one, without a doubt.
wishus
---
People mention Freenet, but Freenet only protects information that is already there. You've got to make sure that if police or military forces come bursting in one night, the information is already stored where they can't close it down, and can be distributed from there.
Also, it is important that people who support a site don't use too much bandwidth and HD space before it gets serious. Otherwise, people may not be able to give the necessary resources.
What I have in mind is a network where those providing endangered resources can call for support (CFS). Those who respond to the CFS set up software to download an image of the site every now and then (say once a month, once a week or something), and at least after controversial information has been published.
Next, we need something that sets off an alarm when the endangered site is being attacked. This has to cover the possibility that the site just goes down without warning (military forces shoot the webmaster and blow everything to pieces, to take an extreme). This could be done by checking every now and then whether the server is up; if it stays down for any extended period of time, the alarm goes off. Naturally, there must not be too many false alarms, or the system will lose credibility. This pretty much rules out Windows as a platform.... :-) Also, it should be possible for the administrator to set off the alarm with a single command, so that if somebody comes bursting in, they have to act fast to stop the information from being transmitted. Other features could include the administrator saying "if my site goes down at 12:15 and you don't hear from me, we're under attack". Also, intelligence services might try to fool the system into mirroring useless or bogus information, so we would have to work hard to stay one step ahead.
If a site is under attack, there are a number of things that could be done. First, put up a mirror of any information that you have stored, and dump it on Freenet. Maybe some sort of system could be set up so that nameservers are updated with information about one of the mirrors, so that the web site has very little downtime? Perhaps a global network of name servers similar to the two provided by Granite Canyon's Public DNS service, where authority can be transferred as part of an alarm. One can also attempt to keep e-mail working as well, but that's of little use if the admin has been shot.... If the alarm has been set off by the admin, one should try to download a mirror as part of the alarm response to get the latest.
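The alarm logic could look something like the sketch below: a single failed check is ignored, the alarm only goes off after several failures in a row, and the administrator can trip it by hand. The class name `DownAlarm` and the default threshold of three checks are assumptions for illustration; how the "is the server up?" check itself is done is left out.

```python
class DownAlarm:
    """Raise the alarm after `threshold` consecutive failed checks,
    or immediately when the administrator triggers it by hand."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0
        self.alarmed = False

    def record_check(self, site_is_up):
        """Feed in the result of one periodic check; returns the alarm state."""
        if site_is_up:
            self.failures = 0  # a brief outage does not count against the site
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.alarmed = True
        return self.alarmed

    def manual_trigger(self):
        """The single command the administrator runs when someone bursts in."""
        self.alarmed = True
```

Requiring consecutive failures is one cheap way to keep false alarms down; once the alarm is set it stays set, on the assumption that a human should decide when the crisis is over.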
I have also been thinking about how to use the internet to try to keep those suppressed online using minimalist solutions, e.g. TCP/IP over ham radio. It might have low bandwidth, but perhaps sufficient for e-mails...?
Employee of Inrupt, Project Release Manager and Community Manager for Solid
If the resource is popular then it gets mirrored automatically by greedy block servers who are hoping to sell copies in return for Mojo to people that download it. (Note that you earn Mojo by running a Mojo Nation client, so it is more like "trading" your bandwidth and your disk space and the blocks you've collected for the blocks that the other guy has collected.)
So as far as I can tell, mirroring useful web resources that a large community uses is a perfect use for Mojo Nation. I wouldn't recommend depending solely on Mojo Nation at this point (BETA! BETA! It's the letter that comes before Gamma which is the kind of radiation that made Spiderman and The Incredible Hulk!), but I would recommend experimenting: take a web site that you are mirroring, do a `wget -r -k' on it, then run the Mojo Nation utility "cmdpub" on the resulting directory.
Regards,
Zooko, Evil Geniuses For A Better Tomorrow
The most threatened material like decss or cuecat should be preserved, along with various software that may be patented in some countries but not others (Lame, BladeEnc etc). Also, anything useful that has the potential to draw legal threats, like reverse engineered device drivers for WinPrinters (Lexmark), WinModems, parallel port win-scanners. Think about it. If Digital Convergence can go after software drivers on Linux and claim that their secret protocol is protected IP, then every other dumb hardware manufacturer out there could go after all sorts of drivers for Linux or any other OS other than Windows. Some companies have been successful in squashing certain software, such as cp4break and glide wrappers, simply because mirroring didn't happen fast enough.
For me the most valuable resource on the Web that is in risk of disappearing is Dejanews.
We've already lost everything older than one year, and now what's left is being sold off to some unnamed party.
How much legal strong arming by some pro-censorship or copyright protection group will be required to remove it forever? Not to mention the COS.
I use Dejanews daily, and would sorely miss even this now diminished archive.
What does matter is getting mathworld back online. And I see that as easily done. The simple fact is that CRC is not behaving in their own best interest. The web site is not competition for the book -- it's free advertising for the book. Besides which, who will sell them web content after this incident?
So CRC is just sabotaging their own product. And drying up any further web-originated product. And creating a lot of ill will in the process. They may have the legal right to screw themselves, but if enough people point out that they are screwing themselves, they might well stop.
Slashdotters have considerable power to communicate this point. There are a lot of them, they know about web economics, and they are precisely the kind of technical audience CRC depends upon. So here's the relevant contact info, taken from their web site:
CRC Press LLC Headquarters 2000 NW Corporate Blvd Boca Raton,FL, USA 33431
Phone 1(800)272-7737 x6066 (561)994-0555 Fax - 1(800)374-3401 (561)989-9732
Please, make this an exercise in lobbying, not a DoS attack. One short fax or phone call per person. Anything else is self-defeating.
__________________