Archiving Web Pages - Legal or Illegal?

It SHOULD be legal by Anonymous Coward · 2003-06-30 08:00 · Score: 4, Interesting

Well, it should be legal/allowed. If you don't want it read and archived, don't put it on the Web.

Everything should go, except for things like malicious alteration and theft (taking stuff and claiming it is yours).

Re:It SHOULD be legal by lightspawn · 2003-06-30 08:39 · Score: 5, Interesting

Well, it should be legal/allowed. If you don't want it read and archived, don't put it on the Web.

You know, I've been wondering about Java/Shockwave games. Certainly most kids would love a CD full of those games, and many companies have many different games online which mostly disappear a few months later.

Is anybody archiving these? Do we need to start?

Would the companies object?

You can play The Hitchhiker's Guide to the Galaxy on Douglas Adams' web site. As it happens, if you know what you're doing you can also download the .z5 file and play it offline on any zip interpreter. Would the copyright owners object to it? I own that Infocom 33-game collection and all 5 books; the reason the game wasn't included in the collection is copyright hassles. Am I "entitled" to play it offline?

This ties in to today's "is ROM collecting wrong" story, except in this case you're actually offered the games, under mostly unclear terms.

RTFF by kalidasa · 2003-06-30 08:04 · Score: 5, Informative

Archive.org FAQ

How can I remove my site's pages from the Wayback Machine?
The Internet Archive is not interested in preserving or offering access to Web sites or other Internet documents of persons who do not want their materials in the collection. By placing a simple robots.txt file on your Web server, you can exclude your site from being crawled as well as exclude any historical pages from the Wayback Machine.
See our exclusion policy.
You can find exclusion directions at exclude.php. If you cannot place the robots.txt file, opt not to, or have further questions, email wayback2@archive.org.

In other words, by your NOT including a robots.txt file, you are implicitly granting them permission to cache your content. Also, the content is cached as it was published, complete with the appropriate markings, and is only publicly accessible content, so you'd be hard pressed to argue there is any economic harm from the caching, which means there would likely be no damages from a successful copyright suit, which means a copyright suit would be pretty damned unlikely.

IANAL.
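
The opt-out the FAQ describes can be checked mechanically. Here is a minimal Python sketch using the standard library's urllib.robotparser. The user-agent token "ia_archiver" and the example.com URL are assumptions for illustration only; neither appears in the FAQ text quoted above, so check the Archive's own exclusion directions for the token they actually honor.

```python
from urllib.robotparser import RobotFileParser


def may_archive(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that shuts out one crawler and says nothing else.
ROBOTS = """\
User-agent: ia_archiver
Disallow: /
"""

PAGE = "http://example.com/page.html"  # placeholder URL

if __name__ == "__main__":
    print(may_archive(ROBOTS, "ia_archiver", PAGE))
    print(may_archive(ROBOTS, "SomeOtherBot", PAGE))
    print(may_archive("", "ia_archiver", PAGE))
```

Note what the last call shows: an empty or missing robots.txt is read by the parser as "everything allowed". That is the technical default the implied-permission argument leans on; whether it amounts to legal permission is exactly what the rest of this thread argues about.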

Re:RTFF by strider415 · 2003-06-30 08:14 · Score: 0

And by leaving my door unlocked am I implicitly agreeing to let people come in and take pictures?
Re:RTFF by Anonymous Coward · 2003-06-30 08:36 · Score: 0

You already expected them to take pictures anyway, just less permanent ones.
Re:RTFF by jgoemat · 2003-06-30 08:40 · Score: 2, Insightful

No, but you do have a door. People are free to drive by your house and take a picture of it, or anything else out in public view. That reminds me of the girl that lost the lawsuit against Girls Gone Wild. If someone doesn't want their web page to be archived or cached, they can "put up a door" by using "robots.txt". If they really don't want to let the public at large see it, lock the door by protecting the content with a password. If they want to make absolutely sure no one sees it that they don't specifically show it to, they should save it as an HTML file on their personal computer and not even publish it on the web.
Re:RTFF by Lord+Bitman · 2003-06-30 08:57 · Score: 0, Flamebait

"with all the markings", yeah, I love that. Yet another idiot who thinks they can make copies of and distribute an author's work provided they tell others that they are doing so illegally. GOOD ONE.
"But they're giving it for free anyway!" how the fuck is that remotely relevant? And how the fuck, just to let the question be raised, do you know that they are publishing on a "public" network? How fucking infeasible would it be to send only to IP addresses which they wanted to? Not at fucking all? Whoopy! I guess you're full of shit.

--
-- 'The' Lord and Master Bitman On High, Master Of All
Re:RTFF by anthony_dipierro · 2003-06-30 09:41 · Score: 1

People are free to drive by your house and take a picture of it, or anything else out in public view.

Not if that thing out in public view is copyrighted.
Re:RTFF by Anonymous Coward · 2003-06-30 10:18 · Score: 0

As you correctly point out in another post, copyright law has an exception for caching Internet content.
Re:RTFF by sir_cello · 2003-06-30 10:29 · Score: 2, Insightful

You don't properly understand the legal process.

In a copyright case, the courts first establish whether infringement has taken place, and this is determined irrespective of economic issues. It is determined purely on issues of subsistence, ownership, duration, etc - in terms of the statutory provisions and the existing case law. It is only then that exceptions (such as fair use, and specific exemptions - say - for public archives and libraries) are considered.

Then, finally, when remedies are considered (e.g. damages), the economic harm is taken into account. No damages may be awarded if there is no economic harm, but you still have the right to prevent the party from using an infringing copy of your work.

This is because copyright is a right on the work conferred to you. You can choose how you exercise that right, and that may include you refusing to allow others to use your work even in situations where it does no economic or moral harm to you (I mean, basically, you have the right and you can do damn well what you like with it!). There are of course some "essential facility" copyright cases where courts have ruled that an owner must license or allow use of a work, but these do not come along often (e.g. the Magill case in the EU).

You can argue that this is not a good way to do it, but the fact is that this is how it works now, and in terms of new technologies such as the Internet, it is not likely to change immediately.
Re:RTFF by anthony_dipierro · 2003-06-30 10:43 · Score: 1

Yes but it doesn't have an exception for taking pictures of "anything out in public view."
Re:RTFF by sharkey · 2003-06-30 12:12 · Score: 1

Not if that thing out in public view is copyrighted.
So, I'd be breaking the law if I went to Stephen King's next book-signing, and snapped a shot of the cover art of the books on the table?

--
"Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
Re:RTFF by anthony_dipierro · 2003-06-30 12:41 · Score: 1

Probably not, because that would fall under fair use. But if you then went on to redistribute that picture over the internet, you probably would be breaking the law.
Re:RTFF by kalidasa · 2003-06-30 16:25 · Score: 1

Read more carefully. The implications of my posting: The cachers are providing a mechanism to have your work excluded at your request, providing you with a non-court means to remedy the caching if you choose. Since it is all publicly available information anyway, the potential economic damage is minimal. There are usually two remedies provided to a plaintiff after a lawsuit over copyright: the violator is ordered to stop violating, and the violator is ordered to provide monetary compensation. In this case, the first remedy is provided by the potential violator *without the need for court intervention*, and as I said, monetary considerations are likely to be minimal to nonexistent. So, given that there isn't likely to be a worthwhile remedy resulting from a lawsuit, lawsuits in these cases seem to me to be unlikely in the extreme.
Re:RTFF by cthugha · 2003-06-30 19:52 · Score: 1

I think you're sort-of-right. The mere fact that a search engine gives you this facility to opt out does not create an implicit licence to use content by itself: there is an old principle of law that silence does not mean consent. If this were not the case, I could, e.g., write to you offering an opportunity to engage in a Nigerian money-laundering scam with the rider that "if you don't reply to this I will take it to mean you have accepted my offer" and then enforce that through contract law if you didn't in fact take the effort to tell me to go jump.

However, the practice of search engines archiving content is so prevalent that the social conventions and behavioural norms of the Internet would give rise to an implied licence if you did not create a robots.txt file to block archiving. There's a useful analogy in trespass: if you leave an open or unlocked gate and don't indicate otherwise through a sign or other means then there is an implied licence for friends, family, travelling salesdroids, etc to enter your property in order to call on you. This does not extend to other types of visitors though, otherwise you wouldn't have an action in trespass against a would-be burglar.

In short, exercise caution when archiving somebody else's web content. If it's not a common activity amongst web users at large, you probably don't have a right to do it.
Re:RTFF by ScuzzMonkey · 2003-07-01 02:52 · Score: 2, Interesting

In this case, the first remedy is provided by the potential violator...

Yes, but it places the burden in the wrong place and so is not likely to be considered an adequate remedy by the courts. More properly, the violator should be seeking permission prior to re-distributing the content, rather than essentially saying to the copyright holder "Stop me before I copy again!"

I'm not sure I think that caching sites should be subject to traditional copyright law. It has some nasty implications for anyone who cuts traffic loads using a proxy server (insert humorous image of AOL Time Warner suing themselves for caching their own content) and really strikes me as yet another area where technology outstrips law. But if they are subject to it, their chosen remedy isn't likely to hold much water.

--
No relation to Happy Monkey
Re:RTFF by ChrisKnight · 2003-07-01 08:07 · Score: 1

> In other words, by your NOT including a robots.txt file, you are implicitly granting them permission to cache your content.

Bullshit.

Your argument is like saying "If you leave your front door unlocked you are giving your neighbors implicit permission to loot your house."

Many website creators don't even know about archive.org, so how will they know to go read the document? You cannot assume permission from the absence of a robots.txt file.

Now, if archive.org only copied your site if you DID have a robots.txt file with an explicit statement allowing them to duplicate your site, then that would be a different story.

-Chris

--
-- This sig is only a test. If this were a real sig it would say something witty. --
Re:RTFF by NanoGator · 2003-07-05 09:27 · Score: 1

"'But they're giving it for free anyway!' how the fuck is that remotely relevant?"

It's called 'publishing'.

"do you know that they are publishing on a 'public' network? How fucking infeasible would it be to send only to IP addresses which they wanted to?"

Yeah, they wouldn't want to password-protect their files or anything.

I can see that you're no more 'enlightening' here than you are in our thread about your inability to see your own idiocy.

--
"Derp de derp."

My 9/11 Archive by limekiller4 · 2003-06-30 08:05 · Score: 4, Interesting

On the day of 9/11, I began to think that maybe a lot of things would be online that would disappear on the next update, forever. We tend to think of 1880 newspaper clippings as being perishable, not online media, but the opposite is true. So all day on 9/11 I archived news sites and about two hundred blogs using "wget -p".

Over the next week I archived some 4,600 blogs. They've kind of been sitting around waiting for me to weed through and organize. I've also been wgetting 30 or so large news sites' front page every 15 minutes or so on the hunch that I'll grab something emerging even if I'm AFK. Well ...what can I do with this data?

The answer(s) to this question will definitely be of use to me. Thanks for asking it. Slash, thanks for posting it.

--
My .02,
Limekiller

Re:My 9/11 Archive by ralphclark · 2003-06-30 09:48 · Score: 1

Here are a couple of ideas:

1) Burn it onto DVD. But I don't know which format is likely to survive the longest!

2) Hand it over in whatever form you can to your nearest major University and let them work out how to archive it. If they can find a way to do so reliably, it will be very valuable to their Faculty of History in a hundred years or so!

If you can do both, then great - you could distribute it to several Universities. Be sure to include a few European Unis that have already been around for at least 600-700 years as these are surely the most likely to survive intact over the long term ;o)
Re:My 9/11 Archive by Anonymous Coward · 2003-06-30 10:16 · Score: 0

Others are mentioning archive.org. In my experience, they'll take pretty much anything.
Re:My 9/11 Archive by SlamMan · 2003-06-30 11:02 · Score: 1

Give the Smithsonian Institution a call. They are working on an extremely extensive media and 9/11 project. I went to see their current offerings. Very impressive.

--
Mod point free since 2001
Re:My 9/11 Archive by Oopsz · 2003-06-30 11:31 · Score: 1

Print it out.

Paper will last far, far, far longer than any electronic media. I can still read the master's thesis my dad wrote in the 70s, but the box of punch cards to go along with it is utterly useless.
Re:My 9/11 Archive by damiam · 2003-06-30 14:53 · Score: 1

CDs will last at least as long as the average paper archive, and will still be readable in 50 years. Presumably the equipment to do so won't be widespread, but it'll be there.

--
It's hard to be religious when certain people are never incinerated by bolts of lightning.
Re:My 9/11 Archive by Dunkalis · 2003-06-30 16:53 · Score: 1

I say try and set up a server for all this. You personally may not have the money, but I'm betting that your local university would be willing to help. Now, if they don't, you could get people to donate money to help you set up a server for all that stuff. I'd love to see some of it, since it's got to be an interesting cross-section of post-9/11 America and such. As others have said, the Smithsonian may be interested too, but giving everyone access to your archives would be a great public service. I know I'm definitely interested in what happens to your massive archive.

--
Slashdot is a waste of time. I enjoy wasting time.

An idea by revmoo · 2003-06-30 08:06 · Score: 4, Insightful

Here's a thought, a rather complicated one, but I think it just might do the trick...

DON'T POST THINGS YOU DON'T WANT PEOPLE TO SEE ON A PUBLIC NETWORK.

It's quite simple really.

--
I would expect such blatant racism on Fark, but on Slashdot? Mods please ban this asshole.

Re:An idea by Anonymous Coward · 2003-06-30 09:43 · Score: 0

I think you misunderstood his question.
Re:An idea by revmoo · 2003-06-30 10:46 · Score: 1

I think you misunderstood his question.

No, my post was in response to people that get angry when their sites are mirrored. They seem to feel that even though they are distributing the content on an international network with millions of users, they can still control the information, solely through litigation, and that is not how things SHOULD be.

--
I would expect such blatant racism on Fark, but on Slashdot? Mods please ban this asshole.
Re:An idea by Anonymous Coward · 2003-06-30 10:50 · Score: 0

No, my post was in response to people that get angry when their sites are mirrored.

Oh, then you just posted it in the wrong place.
Re:An idea by Anonymous Coward · 2003-06-30 15:26 · Score: 0

revmoo@dipfish.org

Got it, thanks.

It might be useful to note... by stienman · 2003-06-30 08:10 · Score: 3, Informative

It might be useful to note that the archive servers are located outside the US, and that they act on requests to have information and websites removed from their archive. (IIRC). I would state that the Archive serves a compelling public interest, both in the sense of free speech, and in the basic idea of keeping a history or record of the internet. The archive is a museum of sorts.

Google, on the other hand, is gathering data for its search engine, and, of necessity, must have what essentially amounts to a copy of each web page in its stores in order to provide this service. If one does not want to have their data in Google, they simply use robots.txt, and Google does not spider, cache, or store any data from that site if robots.txt is filled out. However, the site owner also denies themselves the ability to be listed, for 'free', in Google's search pages. This could be thought of as the cost of being listed.

So I don't think either of those two situations have any problems defending themselves. An anonymizer could also be seen as providing a useful, protected service. An anonymizer is nothing more than a proxy service, and many ISPs use proxies now, not to mention caches and many other tools that store website information or meta information without notifying or requesting explicit permission to do so - they request implicit permission by sending a GET command.

-Adam

Re:It might be useful to note... by simoniker · 2003-06-30 09:42 · Score: 2, Informative

Actually, the Internet Archive's main Wayback Machine servers are located in a co-location center in San Francisco, so it's not correct to say they're located outside the US. There is a mirror of the Archive's web content at the Library of Alexandria in Egypt, however - maybe that's what you're thinking of?

In any case, the Archive's work with the Library Of Congress and, increasingly, national libraries who want to archive the Web content of their countries, proves that the establishment also thinks Web archiving is a vital thing to do for posterity. But the rights issues are definitely tricky.
Re:It might be useful to note... by adelton · 2003-07-01 00:21 · Score: 1

Please note that robots.txt affects whether Google crawls various parts of your website at all. To prevent your pages from being stored in the Google cache (even if they are searchable using Google), you need to specify the META tag <META NAME="GOOGLEBOT" CONTENT="NOARCHIVE"> in each and every of your pages.
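
For anyone who wants to check whether a page already carries that tag, here is a rough Python sketch using only the standard library's html.parser. The function name forbids_archiving and the decision to also honor a generic "robots" meta tag are assumptions made for this illustration, not something stated in the comment above.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the comma-separated directives of every named <meta> tag."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # meta name (lowercased) -> set of lowercased tokens

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> works.
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").strip().lower()
        if not name:
            return
        tokens = {t.strip().lower() for t in attr.get("content", "").split(",")}
        tokens.discard("")
        self.directives.setdefault(name, set()).update(tokens)


def forbids_archiving(html: str, crawler: str = "googlebot") -> bool:
    """True if the page asks this crawler (or all robots) not to keep a cached copy."""
    parser = RobotsMetaParser()
    parser.feed(html)
    tokens = parser.directives.get(crawler.lower(), set())
    tokens = tokens | parser.directives.get("robots", set())
    return "noarchive" in tokens


if __name__ == "__main__":
    page = '<html><head><META NAME="GOOGLEBOT" CONTENT="NOARCHIVE"></head></html>'
    print(forbids_archiving(page))
```

This only reports what the page asks for. Whether a given crawler honors the request is up to the crawler.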

Email? by Anonymous Coward · 2003-06-30 08:10 · Score: 2, Funny

We do not accept email from lawyers as a legitimate form of communication.

Email from lawyers is /dev/null'd.
As for the waking up in the middle of the night...
Um, turn off the ringer? Stop sleeping in the NOC? Maybe invest in a second phone line for your business instead of using mom's POTS line.

Re:Archive sites are valid and important resources by Mundrid · 2003-06-30 08:19 · Score: 0

Warning! DO NOT follow that link.

Be Happy by Apreche · 2003-06-30 08:19 · Score: 2, Insightful

I'd be damn happy if someone made backups and mirrors of a site I made. People will visit my site without using bandwidth I pay for. Also, if disaster strikes I can get my site back because someone else was kind enough to back me up. The more the merrier.

--
The GeekNights podcast is going strong. Listen!

Putting your trash out by berb · 2003-06-30 08:30 · Score: 0

I would argue that this is like putting your trash out on the curb. There is no expectation of privacy (see this). When something is posted to the InterWeb it becomes part of the public domain, and even if you copyright it, yes those words and images and layout still belong to you, as does your trash sitting on your property, but there is no legal recourse preventing someone from looking at it, taking photos of it or even taking it home.

--
In teh event of an actual emergency this space might provide useful information.

Off topic, but... by UnrefinedLayman · 2003-06-30 08:50 · Score: 0, Offtopic

...my girlfriend works somewhere where they tend to keep logs going back the last decade on what sites people visit. I've been unable to find a good proxying bit of software that is opensource so I could run an anonymizer for her.

Basically what I'm looking for is something like anonymizer.com, where she can put in a URL in an HTML form, click "Go", and it will display the site in a frame with a new form in the top frame with the "go" button, that way instead of seeing that she went to yahoo.com they'd see she went to mysite.com/go.php?1293874 or perhaps go.pl?239861.

Does anyone know of any opensource software for that other than PHP-Proxy, which doesn't seem to have been updated, ever? (And yes, I can do some hacking on my own, but I'd prefer a mature project to save time.)

Re:Off topic, but... by outlier · 2003-06-30 08:54 · Score: 1

Try CGIProxy.

Re:Archive sites are valid and important resources by Anonymous Coward · 2003-06-30 08:52 · Score: 0

why don't you just add

127.0.0.1 goatse.cx

to your HOSTS file?

Re:Archive sites are valid and important resources by Anonymous Coward · 2003-06-30 08:56 · Score: 0

Probably because I don't host goatse.cx on my local webserver.

Hitchhiker's Guide to the Galaxy (off topic) by jgoemat · 2003-06-30 08:59 · Score: 0, Offtopic

Thanks for the link! I used to play THGTTG on my Commodore 64 :) Playing the game actually got me interested enough to buy the books. I bought a bundle of Infocom games about 10 years ago that had it. There were 20 games on 5 floppy disks (Zork 0 took up almost two whole disks, which is why there were so many). I still have those backed up, although I haven't played them in years. Good thing they weren't copy protected, eh?

Oh crap. by Anonymous Coward · 2003-06-30 09:02 · Score: 0

I played it for about 3 minutes. I ended up going back to sleep and being killed by the Vogons :(

Honestly... by lptport1 · 2003-06-30 09:04 · Score: 2, Informative

This sounds sort of cynical to me, but it strikes me that the people who might be concerned about that don't comprehend the word "cache" and therefore never click on that link in the search results...

Thus, never discovering that their site has been archived somewhere else. That, and Google has a rather chunky disclaimer-type-deal at the top--I'm sure it's in response to just that behaviour.

*copy* right by ccady · 2003-06-30 09:06 · Score: 4, Interesting

(FWIW, IANAL) Web site content is copyrighted. Therefore, you have a right to make your own personal copy, and backup copies, but it is not legal to redistribute those copies without the site owner's permission. I cannot imagine that the Wayback machine or the Google cache is legal. They are blatantly disregarding the site owners' copyright.

That said, I think the law should be changed or at least clarified, because it is patently (pun intended) obvious that those services are doing a vast social good, and should be encouraged.

--
J'aime mieux les méchants que les imbéciles, parce qu'ils se reposent. -- Alexandre Dumas

Re:*copy* right by stanwirth · 2003-06-30 09:19 · Score: 2, Interesting

Web site content is copyrighted. Therefore, you have a right to make your own personal copy, and backup copies, but it is not legal to redistribute those copies without the site owner's permission. I cannot imagine that the Wayback machine or the Google cache is legal. They are blatantly disregarding the site owners' copyright.

That would imply that every ISP running a public squid cache is breaking the law, and Akamai's entire business model is based on illegal content-smuggling. I really don't think so!
Re:*copy* right by aridhol · 2003-06-30 09:45 · Score: 1

Akamai's entire business model is based on illegal content-smuggling
Can you clarify that? Last time I checked, Akamai only distributes for those who pay them to do so, so I'm pretty sure they have permission.

--
I can't say that I don't give a fuck. I've just run out of fuck to give.
Re:*copy* right by limekiller4 · 2003-06-30 09:45 · Score: 2, Informative

stanwirth writes:
"...and Akamai's entire business model is based on illegal content-smuggling. I really don't think so!"

Akamai caches sites of people who pay them to cache them, so that would be one hell of a lawsuit. I know this because I worked for them for a few years.

--
My .02,
Limekiller
Re:*copy* right by anthony_dipierro · 2003-06-30 09:50 · Score: 2, Informative

(FWIW, IANAL)

Obviously.
Re:*copy* right by SeanAhern · 2003-06-30 10:03 · Score: 4, Informative

Mod parent up! This link to the US Code is very useful in this context.

Heck, it's so useful that I'm going to quote some of it here:

TITLE 17 > CHAPTER 5 > Sec. 512.

Sec. 512. - Limitations on liability relating to material online

(a) Transitory Digital Network Communications. -

A service provider shall not be liable for monetary relief, or, except as provided in subsection (j), for injunctive or other equitable relief, for infringement of copyright by reason of the provider's transmitting, routing, or providing connections for, material through a system or network controlled or operated by or for the service provider, or by reason of the intermediate and transient storage of that material in the course of such transmitting, routing, or providing connections, if -

(1) the transmission of the material was initiated by or at the direction of a person other than the service provider;

(2) the transmission, routing, provision of connections, or storage is carried out through an automatic technical process without selection of the material by the service provider;

(3) the service provider does not select the recipients of the material except as an automatic response to the request of another person;

(4) no copy of the material made by the service provider in the course of such intermediate or transient storage is maintained on the system or network in a manner ordinarily accessible to anyone other than anticipated recipients, and no such copy is maintained on the system or network in a manner ordinarily accessible to such anticipated recipients for a longer period than is reasonably necessary for the transmission, routing, or provision of connections; and

(5) the material is transmitted through the system or network without modification of its content.

(b) System Caching. -

(1) Limitation on liability. -

A service provider shall not be liable for monetary relief, or, except as provided in subsection (j), for injunctive or other equitable relief, for infringement of copyright by reason of the intermediate and temporary storage of material on a system or network controlled or operated by or for the service provider in a case in which -

(A) the material is made available online by a person other than the service provider;

(B) the material is transmitted from the person described in subparagraph (A) through the system or network to a person other than the person described in subparagraph (A) at the direction of that other person; and

(C) the storage is carried out through an automatic technical process for the purpose of making the material available to users of the system or network who, after the material is transmitted as described in subparagraph (B), request access to the material from the person described in subparagraph (A),

if the conditions set forth in paragraph (2) are met.

(2) Conditions. -

The conditions referred to in paragraph (1) are that -

(A) the material described in paragraph (1) is transmitted to the subsequent users described in paragraph (1)(C) without modification to its content from the manner in which the material was transmitted from the person described in paragraph (1)(A);

(B) the service provider described in paragraph (1) complies with rules concerning the refreshing, reloading, or other updating of the material when specified by the person making the material available online in accordance with a generally accepted industry standard data communications protocol for the system or network through which that person makes the material available, except that this subparagraph applies only if those rules are not used by the person described in paragraph (1)(A) to prevent or unreasonably impair the intermediate storage to which this subsection applies;
Re:*copy* right by darksaber · 2003-06-30 10:05 · Score: 1

(FWIW, IANAL) Web site content is copyrighted. Therefore, you have a right to make your own personal copy, and backup copies, but it is not legal to redistribute those copies without the site owner's permission. I cannot imagine that the Wayback machine or the Google cache is legal. They are blatantly disregarding the site owners' copyright.

This confuses fair use on purchased items that you own with what you are allowed to do with temporary copies for viewing. By the same logic, you could legally take a video camera to see Terminator 3 in a couple days and make a copy for personal use to watch at home later all by yourself (let's pretend you don't have friends over to watch it too or post it on the internet afterwards). And you could save it "just in case" the MPAA lost every single copy they had.

However, the latest HTTP protocol has provisions for caching, and headers for controlling it, so if you get a file with headers allowing caching, then you should be able to cache it. I don't remember if HTTP 1.0 had them, but HTTP 1.1 certainly does. If you get the file through HTTP 1.1 and they don't include the headers to prevent caching, then the protocol with which they chose to provide you the file is stating that you can cache it. So, if you play by those rules, you should be able to check if you can cache content legally. If they don't want you to cache, then they can say so when they send it to you. (I'm ignoring the messy rules regarding stale content for simplicity.)
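
To make the "play by those rules" idea concrete, here is a simplified Python sketch of how a shared cache (a proxy, say) might read the HTTP 1.1 Cache-Control response header. The function names are made up for this example, and the parsing is deliberately naive: it splits on commas and so would mishandle quoted arguments that contain commas. It describes what the protocol permits, not what copyright law permits.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into {directive: argument}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def shared_cache_may_store(headers: dict) -> bool:
    """False if the response tells shared caches not to keep a copy."""
    lowered = {key.lower(): value for key, value in headers.items()}
    directives = parse_cache_control(lowered.get("cache-control", ""))
    # "no-store" forbids storing at all; "private" forbids shared caches only.
    return "no-store" not in directives and "private" not in directives


def must_revalidate_before_reuse(headers: dict) -> bool:
    """True if a stored copy has to be checked with the origin before reuse."""
    lowered = {key.lower(): value for key, value in headers.items()}
    directives = parse_cache_control(lowered.get("cache-control", ""))
    # "no-cache" allows storing but requires revalidation on every reuse.
    return "no-cache" in directives


if __name__ == "__main__":
    print(shared_cache_may_store({"Cache-Control": "public, max-age=3600"}))
    print(shared_cache_may_store({"Cache-Control": "no-store"}))
    print(shared_cache_may_store({}))
```

A response with no Cache-Control header at all comes back as storable, which is the default the argument above relies on.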
Re:*copy* right by Anonymous Coward · 2003-07-01 12:14 · Score: 0

I find it somewhat amusing that my direct copy/paste of your link is getting moderated higher than your original link. You should get these mod points, not me. I guess the moderators are rewarding laziness. Sorry about that.

-SeanAhern

Archiving is important by pdoucy · 2003-06-30 09:07 · Score: 1

I really think archiving is important, and is one strength of the internet: your data gets archived without you paying, or even asking for it. I mean, there must be a lot of companies or organizations (I'm thinking of NASA, etc.) who probably have hundreds of terabytes of data, and don't want to spend money or time making backups. Add that most archival media won't last more than a couple of decades, and you'll understand that archiving is great because everyone can back up a little something, and all that wonderful data isn't lost...

--
Cats are intended to teach us that not everything in nature has a function.

Re:Archive sites are valid and important resources by Anonymous Coward · 2003-06-30 09:17 · Score: 0

So you are not a Linux user then? =)

I think it is, unless... by arcadum · 2003-06-30 09:23 · Score: 0

A popup brought you to it. I suspect the following makes it legal in the US.

*** NOTICE: In accordance with Title 17 U.S.C. Section 107, this material is distributed without profit to those who have expressed a prior interest in receiving the included information for research and educational purposes. Feel free to distribute widely but PLEASE acknowledge the original source. ***

Actually it's the DMCA by anthony_dipierro · 2003-06-30 09:37 · Score: 1

That's right, the DMCA contains provisions protecting companies like Google from liability for copyright infringement. Read it some time.

ON A RELATED NOTE by exhilaration · 2003-06-30 09:51 · Score: 1

Any easy way for me to save pages I'm looking at? Perhaps a little button in Mozilla that automatically saves the page with graphics, and places everything neatly into a timestamped folder?

Re:ON A RELATED NOTE by Anonymous Coward · 2003-06-30 10:01 · Score: 0

Kind of like what IE has had for years?
Re:ON A RELATED NOTE by Anonymous Coward · 2003-06-30 11:27 · Score: 0

You mean like File/Save As/Complete Web Page?

It's there in both Moz and IE.
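
If you'd rather script the timestamped-folder part, here is a bare-bones Python sketch. It saves only the HTML of one page, not the graphics (something like the "wget -p" mentioned elsewhere in this thread pulls in page requisites). The host/timestamp folder layout and all function names are assumptions chosen for this example.

```python
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import urlopen


def snapshot_dir(base: Path, url: str, when: datetime) -> Path:
    """Build base/<host>/<UTC timestamp> for one snapshot of url."""
    host = urlsplit(url).hostname or "unknown-host"
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return base / host / stamp


def save_snapshot(base: Path, url: str, body: bytes, when: datetime) -> Path:
    """Write the page body into its timestamped folder and return the file path."""
    target = snapshot_dir(base, url, when)
    target.mkdir(parents=True, exist_ok=True)
    out = target / "index.html"
    out.write_bytes(body)
    return out


def fetch_and_save(base: Path, url: str) -> Path:
    """Download url once and store it under the current UTC time."""
    when = datetime.now(timezone.utc)
    with urlopen(url) as response:
        return save_snapshot(base, url, response.read(), when)
```

Calling fetch_and_save(Path("archive"), "http://example.com/") would put the page at archive/example.com/<timestamp>/index.html. Run it on a schedule and you get one folder per visit.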

Re:Archive sites are valid and important resources by Anonymous Coward · 2003-06-30 10:11 · Score: 0

it works for both windows and linux. And it doesn't need a webserver running on your machine - it'll just time out.

legality by sir_cello · 2003-06-30 10:18 · Score: 2, Informative

There are limited provisions in copyright law (at least in the UK, and I expect elsewhere in the world) for public libraries and archives. But these are indeed limited provisions and do not apply to a random commercial organisation that decides to provide such a service.

Firstly, in the general case of search engines providing indexing of content, this is legal and there are legal cases to back it up (in the UK: antiquesportfolio) so long as the indexes are not copies.

Secondly, in the case of USENET groups and mailing lists, by submitting a message to the mailing list or group you have given an implicit license for the message to be reproduced within the nature of the particular technology at hand. This means if at a later date you object to a message in a mailing list that you wrote in the past, you don't really have the ability to retract it. In all cases, anyone deciding to use the material in another way (e.g. creating a commercial CDROM of USENET material for a marked-up price) would be violating your (and others') copyright. However, if they were providing that CDROM as a distribution service for USENET itself (e.g. "get your monthly USENET CDROM") then this is probably within the bounds of legality, as it is still transfer via the USENET system, and the cost is likely to reflect media/distribution costs rather than some specific aim to make a commercial product out of your material.

Finally, in the specific case of copies of websites, yes this is a violation of copyright - but as far as I know this has not been tested in a court of law. The use of the Robots Exclusion Protocol and the NOARCHIVE, NOINDEX and NOFOLLOW elements allows a weasel argument suggesting that it is inherent in the WWW itself (as a new form of media / technology) that search engine indexing and archiving / caching is legal unless you specifically disallow it with this mechanism. It may also be the case that if this archiving / caching was carried out for profit or at a price greater than fair for distribution/media, then a party is making an economic gain out of your material, and this suggests an inequitable violation of your economic rights.
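To make the opt-out mechanism concrete, here is a rough sketch of the two checks a well-behaved archiver might run before keeping a copy: one against robots.txt and one against the page's robots META tag. The crawler name `ExampleArchiver`, the sample rules and the helper names are all invented for illustration; only `RobotFileParser` and `HTMLParser` come from the Python standard library.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Sample rules: shut out one (hypothetical) archiver entirely,
# and keep everyone else out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleArchiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def crawler_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if robots.txt lets this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def meta_directives(html: str) -> set:
    """Return the lower-cased robots directives found in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    print(crawler_allowed(ROBOTS_TXT, "ExampleArchiver", "http://example.com/index.html"))
    print(crawler_allowed(ROBOTS_TXT, "OtherBot", "http://example.com/index.html"))
    print(meta_directives('<meta name="ROBOTS" content="NOARCHIVE, NOFOLLOW">'))
```

Note that both checks are purely voluntary on the crawler's side, which is exactly why the legal argument above rests on convention rather than enforcement.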

Another point to remember is that in WTO treaties that resulted in DMCA provisions, as enacted in the UK and EU, there are specific fair use allowances for intermediate copies of a copyright work as necessary for the telecommunications medium itself (this would seem to allow things like store-and-forward systems, and caching).

But is it still caching once the original is gone? by Anonymous+Brave+Guy · 2003-06-30 10:29 · Score: 1

As you correctly point out in another post, copyright law has an exception for caching Internet content.

That may be true in some places, I don't know. Regardless, if the archive continues after the original site is taken down, it is no longer a cache, it is an outright copy.

And yes, this could be damaging. To give a close-to-home example, consider a case where a site gets /.ed so only a few people can see the real content. If that site is then updated in some critical way, the numerous caches all over the web won't be (at least not immediately, and it is clearly unreasonable to expect anyone publishing a website to notify them all). This means all the people following the link posted on /. to the Google cache or whatever will be reading out of date information, which could easily be detrimental to whoever owns the site.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Not so black and white as most here are saying! by Anonymous+Brave+Guy · 2003-06-30 10:40 · Score: 2, Interesting

In other words, by your NOT including a robots.txt file, you are implicitly granting them permission to cache your content.

Riiiiight. See you in court.

As I've just posted elsewhere, it is quite feasible that a site owner could be damaged if caches maintain information after the original site has been changed or taken down. For example, if updated information is placed on the original, this leaves the "cached" versions out of date and misleading anyone who reads them thinking they're seeing a perfect copy of the real thing.

There is also the issue of a site owner's right to know who is visiting them. Many popular web sites can and do collect information about how visitors move around their sites, the browsers and resolutions they use, etc. If the information on the site is being offered according to the normal conventions of the Internet, it is only fair to provide them the feedback normally returned by the conventions of the Internet. This information is valuable to them when they come to revise the site. Ultimately it is also in the site visitors' best interests for the site owner to have accurate information available, so that if they want to make the effort to improve usability, support minority browsers that some of their visitors use or whatever, they can do so.

On a related note, there are questions of advertising revenue etc. if a site is supported by sponsors who pay per-hit. It's not at all guaranteed that they will get their fair amount of sponsorship if most of those hits are seeing a web cached version.

This whole issue isn't nearly as black and white as the "information should be free" crowd are inevitably shouting already.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Re:But is it still caching once the original is go by anthony_dipierro · 2003-06-30 10:48 · Score: 1

Regardless, if the archive continues after the original site is taken down, it is no longer a cache, it is an outright copy.

I'm not sure what you mean by "an outright copy." It's always an outright copy. But if the archive continues after the original site is taken down, it's still a cache.

If that site is then updated in some critical way, the numerous caches all over the web won't be (at least not immediately, and it is clearly unreasonable to expect anyone publishing a website to notify them all).

The legal exception requires that you adhere to internet standards. Thus you have to adhere to the robots-exclusion standard, the "Cache-Control: no-cache" HTTP header, and the "Expires" HTTP header, among others.
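As a rough illustration of what honoring those two headers could look like inside a cache, here is a simplified sketch. The function name and the plain dict of headers are assumptions for the example; it ignores `max-age`, treats header names as case-sensitive, and conservatively refuses to reuse anything that carries no explicit expiry, none of which is claimed to match what any real cache does.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def may_serve_from_cache(headers: dict, now: datetime) -> bool:
    """Decide whether a stored copy may be reused without asking the origin."""
    cache_control = headers.get("Cache-Control", "").lower()
    directives = {d.strip() for d in cache_control.split(",") if d.strip()}
    # no-cache: revalidate with the origin before reuse; no-store: never keep it.
    if "no-cache" in directives or "no-store" in directives:
        return False

    expires = headers.get("Expires")
    if expires is None:
        # Assumption of this sketch: no explicit lifetime means revalidate.
        return False
    try:
        expiry = parsedate_to_datetime(expires)
    except (TypeError, ValueError):
        # An unparseable date is treated as already expired.
        return False
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)
    return now < expiry


if __name__ == "__main__":
    now = datetime(2003, 6, 30, 10, 48, tzinfo=timezone.utc)
    print(may_serve_from_cache({"Expires": "Mon, 30 Jun 2003 12:00:00 GMT"}, now))
    print(may_serve_from_cache({"Cache-Control": "no-cache"}, now))
```

A site owner worried about stale copies can therefore send `Cache-Control: no-cache` or a short `Expires`; whether a given cache actually obeys is a separate question.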

Re:But is it still caching once the original is go by Anonymous+Brave+Guy · 2003-06-30 14:09 · Score: 1

I'm not sure what you mean by "an outright copy." It's always an outright copy. But if the archive continues after the original site is taken down, it's still a cache.

My point is that the term "cache", as commonly used in computing, carries an inference that the cached material is identical to the original but faster to access. If the original is no longer there, or has changed, then you are no longer caching it, you are simply keeping a copy of the old data.

The legal exception requires that you adhere to internet standards.

Can you give a reference for this, and tell us where it applies (which jurisdiction), please? I'd be interested to see the actual legal wording.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Interesting by Anonymous+Brave+Guy · 2003-06-30 14:28 · Score: 1

Interestingly, the law cited makes explicit provision for several of the concerns I expressed in earlier posts in this thread, notably the issues of keeping the data up-to-date and of the information provider getting information from those visiting their site directly.

The normal Internet convention is that when I update my site, changes are immediately visible to everyone. (NB: browser caching is not equivalent to web caching here for several reasons.) Also, visitors to my site normally leave information about their browsing that I might use, for example, to bill sponsors for advertising revenue. In light of this, it seems to me that typical web caches are pretty clearly in breach of the above conditions for legal protection.

It also seems that if I make available some information, which gets cached, and I then password protect the same information on my own site, the cache is again breaking the rules, since it wouldn't require the same password to access the information.

All in all, if that is the exemption I was referred to earlier in this thread, it looks as though the web caches are skating on very thin ice. If they did something like cloning material on a web site that was later removed in order to publish it in a book, I imagine they could wind up having a serious dispute with the publisher, or perhaps the author himself, either of whom might have a strong case that they suffered financially because of the actions of the caching site.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Re:Interesting by anthony_dipierro · 2003-07-01 14:54 · Score: 1

All in all, if that is the exemption I was referred to earlier in this thread, it looks as though the web caches are skating on very thin ice. If they did something like cloning material on a web site that was later removed in order to publish it in a book, I imagine they could wind up having a serious dispute with the publisher, or perhaps the author himself, either of whom might have a strong case that they suffered financially because of the actions of the caching site.

I'm not sure you're talking about this thread, so I'll assume you're talking about the exemption I referred to. Specifically, I was referring to Google, who removes content from its cache after a short period of time.
Re:Interesting by Anonymous+Brave+Guy · 2003-07-01 23:46 · Score: 1

I was referring to the post where someone said there was an exemption under copyright law for web caches. I assumed the parts of the DMCA that were cited here were that exemption. In that case the validity of the original claim appears to be less clear than was suggested.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Not as bad as me by RachaelAnne · 2003-06-30 14:29 · Score: 1

My first try I didn't realize I could say "turn on light" so that I could see where my robe was that had the aspirin in it. :) The bulldozer killed me *inside the house*.

Rachael

--
"Go Forth Ye Lemmings and Propagate"

Re:But is it still caching once the original is go by anthony_dipierro · 2003-06-30 14:30 · Score: 1

I gave references in two other posts, but here it is again, for the US at least. http://www4.law.cornell.edu/uscode/17/512.html It's part of the DMCA. In part, "the service provider described in paragraph (1) complies with rules concerning the refreshing, reloading, or other updating of the material when specified by the person making the material available online in accordance with a generally accepted industry standard data communications protocol for the system or network through which that person makes the material available, except that this subparagraph applies only if those rules are not used by the person described in paragraph (1)(A) to prevent or unreasonably impair the intermediate storage to which this subsection applies".

could you send that to me? by zoloto · 2003-06-30 14:43 · Score: 1

I made a mistake by not doing that and I was wondering if you could send me some of this info?

Email me with details, and use my public key!

Re:could you send that to me? by limekiller4 · 2003-06-30 14:50 · Score: 1

You mean you want the archives?

--
My .02,
Limekiller
Re:could you send that to me? by zoloto · 2003-06-30 15:08 · Score: 1

yeah if you could :) any method you'd be willing to send them to me in?
Re:could you send that to me? by dattaway · 2003-06-30 15:54 · Score: 1

in the spirit of requests:

"me too!"
Re:could you send that to me? by limekiller4 · 2003-06-30 23:03 · Score: 1

Sure. Ummm. I'm going in for surgery today so it'll have to wait until tomorrow, but drop me an email at slash@php.us and I'll get you a URL where you can just dl it. If you have dialup or some such, let me know and I'll snailmail you a CD with the data.

--
My .02,
Limekiller
Re:could you send that to me? by limekiller4 · 2003-06-30 23:11 · Score: 1

I wrote this to someone else, so I'll just cut and paste...

"Sure. Ummm. I'm going in for surgery today so it'll have to wait until tomorrow, but drop me an email at slash@php.us and I'll get you a URL where you can just dl it. If you have dialup or some such, let me know and I'll snailmail you a CD with the data."

--
My .02,
Limekiller
Re:could you send that to me? by zoloto · 2003-07-01 15:19 · Score: 1

mail sent ;) you should have it soon my friend.

That's not what I've read. by MarkusQ · 2003-06-30 17:36 · Score: 1

CDs will last at least as long as the average paper archive

Paper can easily last a hundred years (I have a number of books from the late 1800's & early 1900's); IIRC the typical MTF for CDs is on the order of 20 years, and can be as low as 5.

-- MarkusQ

Re:That's not what I've read. by damiam · 2003-07-01 01:44 · Score: 2, Insightful

Paper can potentially last a long time (the US Constitution is still intact, for example). However, the average paper archive the size of a CD (which would physically be quite substantial) would require enough upkeep to make the cost of storing and maintaining it much greater than the cost of burning a new copy of the CD every ten or twenty years.

--
It's hard to be religious when certain people are never incinerated by bolts of lightning.

shrinkwrap/acceptable use policy? by lpq · 2003-06-30 18:46 · Score: 2, Interesting

Some people are arguing robots.txt as the determiner, however remember
the court case that a company *lost* because it copied the data of a
competitor site and set its prices lower.

This is equivalent to Kroger hiring a few clerks to go down each day and
take prices of various objects on their wifi-equipped phones/handhelds in
a store so Safeway can undercut prices.

What, you didn't read the fine print on the Safeway door that says no price
comparisons or making up price lists? Or what...were they supposed to look
for a robots.txt file behind the Safeway door?

There seems to be a general lack of common sense here (especially on the
part of the judge that ruled against the company scanning for competing
prices). If it is allowed in the real world, it shouldn't be different in
the computer world without a lot of sound reasoning behind why it should be
different. The idea that Safeway could have a 3-page acceptable use policy
that I accept when my bodily presence opens the door is ludicrous.

Now you talk about advertising losses -- what about whatever major network
it was, deleting a competing major network's logo, bought and paid for, on a
tall building in Times Square for New Year's Eve? The competing network modified
the image in realtime and inserted their own logo for the price of an SGI
workstation -- a heck of a lot cheaper. Legal? Not legal? Can you say a
real life image is "copyright" and if two people take a picture of the same
real life picture, is one the rightful owner? What if one or both alter
the "real life picture", have they violated someone's rights? Reality's
rights (ok, in this case it would have been the network that paid to rent the
entire side of the building), but it's really a matter of who owns what you
see? If a picture is taken of what you see, who owns the picture?

This is a complete mishmash of conflicting legal decisions with computer
copying, caching, alteration and adding to the mess. What if I load a page
but I don't load the images? Have I violated copyright because I either
chose or cannot load the images? What if I selectively blocked them based
on their IP or name? If I don't load flash player, am I violating a
copyright on a site by not viewing the flash content advertising?

Random judges in random jurisdictions are going to be making random calls on
right/wrong that will collide with each other and with what makes sense in
the real world.

I'm not sure what the collective approach should be -- should I be required to
watch TV advertising or am I stealing programming if I go to the loo during
a panty spot? If I block popups, am I stealing computer time?

This is all just one big gigantic growing mass of living worms that promises to be one of the larger headaches of times to come.

Any unified field theories to solve this mess? :-)

Re:Archive sites are valid and important resources by Rip!ey · 2003-06-30 19:37 · Score: 1

Go into your /. user settings (preferences) and on the comments page, set 'Display Link domains' to 'Always show link domains'. It does at least give you the chance to think before you click. I mean, what do you think a link to a site named Goatsex is going to reveal? Happy hunting.

Re:But is it still caching once the original is go by Anonymous+Brave+Guy · 2003-06-30 23:30 · Score: 1

OK, I've read the relevant parts of that. I fail to see how a web cache "...complies with rules concerning the refreshing, reloading, or other updating of the material when specified by the person making the material available online in accordance with a generally accepted industry standard data communications protocol for the system or network..."

The industry standard is that when you request information from a web site, you get the current version. (As I noted elsewhere, browser caching is quite different to web caching in this respect.) Web caches may not match that expectation.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Re:But is it still caching once the original is go by anthony_dipierro · 2003-07-01 04:35 · Score: 1

The industry standard is that when you request information from a web site, you get the current version.

Sometimes. If the current version isn't available (for instance because you're offline), then you get whatever is in the cache.

I'm mainly thinking of Google here. Google isn't intentionally displaying old content, and they take it down after a rather short period of time. Presumably they adhere to the "Expires" header and other relevant information. Certainly they adhere to the robots exclusion standard and things like that. Archive.org is a different story. They'd probably have to rely at least in part on fair use, which is much less black and white.

Re:But is it still caching once the original is go by Anonymous+Brave+Guy · 2003-07-01 05:59 · Score: 1

If the current version isn't available (for instance because you're offline), then you get whatever is in the cache.

Sorry, but I don't think this is reasonable. That caching is part of the browser software, and as I've noted repeatedly, that is a different issue to the web caches we're discussing here.

Nothing in the HTTP spec, or in any other relevant Internet standards, provides for any caching of old content and supplying it when a straightforward HTTP request for a file is sent. You get the current file, or a defined error code.

Archive.org is a different story. They'd probably have to rely at least in part on fair use, which is much less black and white.

That's lovely. Remind me again where in UK law there is any such provision? (For those who missed that, US copyright law provides for various "fair use" exemptions to the normal rule, while UK law provides far fewer.) If the Wayback Machine records content off my web site, and allows others to view it against my wishes when I have taken it down, then I think it is breaking UK copyright law, and this is completely black and white.

This is actually quite relevant to me, because I do run a web site on which I've put various technical articles in the past. There is a distinct possibility that a book will be published based significantly on the content of those articles. How do you think the publisher would feel if people could go and look up the articles that I used to make available publicly on my web site, although I no longer choose to do so?

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Re:But is it still caching once the original is go by anthony_dipierro · 2003-07-01 06:34 · Score: 1

Nothing in the HTTP spec, or in any other relevant Internet standards, provides for any caching of old content and supplying it when a straightforward HTTP request for a file is sent.

I don't think that's what the spec says. It says that you have to explicitly warn the end-user when semantic transparency is relaxed by a cache. It only says that the request must be explicit when relaxed by client or origin server.

You get the current file, or a defined error code.

Or you get an old file and a warning.
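The HTTP/1.1 spec of the time (RFC 2616) defines a Warning header for this, with code 110 meaning "Response is stale". A toy sketch of a cache that hands out an old copy with that warning attached is below; the function name, the return shape and the `cache.example` agent name are made up for the example, and the choice to serve stale content only when the origin is unreachable is an assumption of this sketch.

```python
from typing import Optional


def serve_cached(body: bytes, fresh: bool, origin_reachable: bool) -> Optional[dict]:
    """Return a response built from the cache, or None if the caller must revalidate."""
    if fresh:
        # Still within its lifetime: serve as-is, no warning needed.
        return {"status": 200, "headers": {}, "body": body}
    if origin_reachable:
        # Stale but the origin is up: the caller should go and revalidate.
        return None
    # Stale and the origin cannot be reached: serve the old copy, clearly marked.
    return {
        "status": 200,
        "headers": {"Warning": '110 cache.example "Response is stale"'},
        "body": body,
    }


if __name__ == "__main__":
    print(serve_cached(b"old page", fresh=False, origin_reachable=False))
```

Whether anything on the client side actually shows that warning to the reader is another matter, and this sketch says nothing about it.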

Remind me again where in UK law there is any such provision?

Why do I give a shit about the UK?

If the Wayback Machine records content off my web site, and allows others to view it against my wishes when I have taken it down, then I think it is breaking UK copyright law, and this is completely black and white.

So contact them and ask them to remove it, and they'll comply. Or you could sue them and try to extradite. Good luck.

This is actually quite relevant to me, because I do run a web site on which I've put various technical articles in the past. There is a distinct possibility that a book will be published based significantly on the content of those articles. How do you think the publisher would feel if people could go and look up the articles that I used to make available publicly on my web site, although I no longer choose to do so?

Depends on the publisher. Maybe you should self-publish, using an open content license. I suggest you release your work into the public domain. That'll solve your problems.

Re:But is it still caching once the original is go by Anonymous+Brave+Guy · 2003-07-01 12:13 · Score: 1

I'll have to take your word about the HTTP spec; I don't recall ever seeing what you describe, but I can't say I've read it in that much detail recently.

Regarding your other points...

Why do I give a shit about the UK?

Because if you're smart, you respect reasonable laws of other countries with which yours deals regularly, and not just your own.
Why do I give a shit about the US? Many of these sites aren't based there any more than they are in the UK.

Depends on the publisher. Maybe you should self-publish, using an open content license. I suggest you release your work into the public domain. That'll solve your problems.

Unfortunately, it won't pay my rent.

I spend a lot of my spare time helping out with free information and advice on-line, occasionally on this forum and frequently elsewhere. Mostly, I do it because people have helped me in the past, and I feel both a certain moral obligation to "return the favour" and a certain satisfaction knowing that I've helped someone out.

There comes a point, however, when I feel reasonably entitled to something more than a grateful newbie's "thank you", particularly when we're talking about the result of months of hard work. According to the law in my country and probably in yours, I'm entitled to do that without some freeloading commercial entity nicking my work and giving it away for free.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Re:Not so black and white as what you are saying! by Discoflamingo13 · 2003-07-01 14:00 · Score: 1

As I've just posted elsewhere, it is quite feasible that a site owner could be damaged if caches maintain information after the original site has been changed or taken down.

Damaged in what way? Aren't there archives of newspapers, journals, and magazines? And if time-sensitive information is present on a website, does the public have a right to see what was previously there? Websites can get away with a lot of instant censorship that way - you can check out this site for an archive designed in response to that very issue. If you have a specific example related to this problem, I would love to hear it.

There is also the issue of a site owner's right to know who is visiting them.

Right - just like WalMart has the right to pat down and run a credit check on everyone who walks through their doors. While a site admin might like to know everything about a person who wanders on to their site, they have no "de facto" claim to information about any/everybody who browses their site.

On a related note, there are questions of advertising revenue etc.

If your site is being sponsored by ad revenue, I think the site owners need to find a better business model. (Might I recommend the "customers pay a premium to not have ads" model?) And if your content is worth viewing, your viewers will want the latest, greatest version of it - I haven't seen a web archive yet that claims this information is the most current and up-to-date.

This issue isn't so black and white as the "information belongs to me" crowd seems to believe it is.

Re:But is it still caching once the original is go by anthony_dipierro · 2003-07-01 14:46 · Score: 1

Because if you're smart, you respect reasonable laws of other countries with which yours deals regularly, and not just your own.

I don't think copyright is a reasonable law. I only respect it to the extent I think I might get caught.

Why do I give a shit about the US? Many of these sites aren't based there any more than they are in the UK.

I never said you had to give a shit about the US.

Unfortunately, it won't pay my rent.

So why don't you get a job?

There comes a point, however, when I feel reasonably entitled to something more than a grateful newbie's "thank you", particularly when we're talking about the result of months of hard work. According to the law in my country and probably in yours, I'm entitled to do that without some freeloading commercial entity nicking my work and giving it away for free.

You're entitled to try, anyway. But I'm not going to feel sorry for you if you fail.

Re:But is it still caching once the original is go by Anonymous+Brave+Guy · 2003-07-01 23:32 · Score: 1

I don't think copyright is a reasonable law. I only respect it to the extent I think I might get caught.

Ah, I see. You're one of the people who, instead of discussing the issue on merits, decides unilaterally that he is above the law. So much for your credibility in any discussions around here, then.

So why don't you get a job?

It was a figure of speech.
I have a job, at which I work hard, and get paid fairly.
That remark was pretty crass considering the state of the industry and the number of good people who currently aren't in employment in spite of having useful skills.
If I were one of those people, and I went and flipped burgers rather than writing a book that will hopefully help thousands of newbies to improve their skills, who loses out? Everyone.

You're entitled to try, anyway. But I'm not going to feel sorry for you if you fail.

If people like me listened to people like you, open source and many other good things would be dead.

Tell me, how often do you volunteer 10+ hours of your spare time in a week, to help out others with your knowledge, experience or skills? 20+ hours? Do you run a high traffic information web site? Do you answer questions on bulletin boards or Usenet? Do you contribute to open projects? From the tone of your comments, it doesn't sound like it, or you might have a little more respect for those who do.

Now tell me how often you personally use those resources. (You do subscribe to Slashdot, right? And you don't run any free or open source software, nor visit freely available web sites run by volunteers to get some information you wanted?) I'm guessing honest answers to these questions would make you look more than a little selfish.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Re:Not so black and white as what you are saying! by Anonymous+Brave+Guy · 2003-07-02 00:06 · Score: 2, Interesting

Damaged in what way? Aren't there archives of newspapers, journals, and magazines? And if time-sensitive information is present on a website, does the public have a right to see what was previously there?

If I put up information on a web site, for free, as a volunteer, then the public has no rights whatsoever, either legally or morally. Why the hell should they? They didn't do anything to earn them.

If you have a specific example related to this problem, I would love to hear it.

I'll give you a couple of examples where real damage can be done. There are certainly several other instances, but I hope these will suffice for now.

There have been cases where someone published some material on a subject that interested them on a web site, but later wanted to publish work based on it in something like a journal or a book. (Disclosure: I am currently in a similar position myself.)

Now, publishers get very nervous about publishing material that has previously been available in another form. If you're arguing that by putting it up on the web an author effectively forfeits all rights to control their work -- i.e., that the usual principles of copyright shouldn't apply for some reason in this medium -- then you're basically saying that anyone who might ever want to publish original material they wrote shouldn't ever make anything available on the web first. Given how much both the public and the author can potentially get out of that, provided that reasonable controls are in place -- there was a Slashdot story about a new programming book citing a preprint temporarily placed on the web just a few days ago -- this seems to be needlessly counterproductive to me.

Secondly, a bit closer to home, consider a company that has a critical story about it published on Slashdot. That company is likely to get a lot of traffic to its web site if the site is linked, and might well want to put up a rebuttal of any points made against it. It's only fair that visitors who go to check out the Slashdot story also see the company's response.

Now, we all know that Slashdot articles have seriously criticised businesses in the past, sometimes with justification, sometimes without. We all know that web sites get Slashdotted. We all know that people post links here to Google caches of sites, or just copy whole pages and post them here. In this sort of case, someone could suffer serious harm to their reputation because the audience of Slashdot only get to read things supporting a critical claim, without seeing (or even being aware of) a response from the criticised party in their defence.

Nicking someone's material and posting it here is blatant copyright infringement, and just because it's done by an AC and Slashdot claims that all posts are the responsibility of their authors doesn't necessarily make it legal. It amazes me, given a few of the things that get posted around here, that no-one has ever really attempted to sue Slashdot over this. Certainly things like circumventing the NYT's "free reg required" are very dicey, and given that everyone (including those running Slashdot) knows that it happens, I don't see how they'd have much of a defence.

In my personal opinion, and looking at the actual US law that's been quoted here, it seems that web sites caching material are also likely to be in breach of copyright laws for much the same reasons, doing much the same damage in some cases, and potentially subject to much the same penalties.

Right - just like WalMart has the right to pat down and run a credit check on everyone who walks through their doors.

No, it doesn't. But it has the right to refuse entry to anyone who doesn't provide the information it requires. Banks do this if you try to enter before removing your crash helmet. Bars do it if you look under-age and can't produce ID.

While a site admin might like to know everythin

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Re:But is it still caching once the original is go by anthony_dipierro · 2003-07-02 06:30 · Score: 1

Ah, I see. You're one of the people who, instead of discussing the issue on merits, decides unilaterally that he is above the law. So much for your credibility in any discussions around here, then.

We're all above the law. The government derives its power from the people, not the other way around.

That remark was pretty crass considering the state of the industry and the number of good people who currently aren't in employment in spite of having useful skills.

I'm unemployed despite having useful skills. I fail to see how my remark was crass.

If I were one of those people, and I went and flipped burgers rather than writing a book that will hopefully help thousands of newbies to improve their skills, who loses out? Everyone.

If no one was willing to pay you to write a book, then apparently you aren't a very good writer. Maybe you're better at burger flipping.

If people like me listened to people like you, open source and many other good things would be dead.

How so?

Tell me, how often do you volunteer 10+ hours of your spare time in a week, to help out others with your knowledge, experience or skills?

Quite often.

20+ hours?

Not really.

Do you run a high traffic information web site?

I help run one... Well, it's more of an entertainment web site.

Do you answer questions on bulletin boards or Usenet?

No, but I answer questions on Slashdot... Same difference.

Do you contribute to open projects?

A little. Not much. Usually my help isn't wanted.

From the tone of your comments, it doesn't sound like it, or you might have a little more respect for those who do.

From your tone it sounds like you do, only you don't do it voluntarily, but in order to toot your horn or make a profit.

Now tell me how often you personally use those resources.

Very often. I'd say I've put in my own fair share, though.

(You do subscribe to Slashdot, right? And you don't run any free or open source software, nor visit freely available web sites run by volunteers to get some information you wanted?)

Slashdot is not a volunteer organization. They make money off me, not the other way around. Sure, I run free and open source software. I've contributed patches to some, and bug reports to others. I do what I can when I can. Most people don't like my coding style, so I'm not usually asked to help.

What I'm sick of is the duality of people who volunteer and then demand to get something back. That's not the way volunteering works. If you want to demand to get something back, then you get a job, you don't volunteer. Most of the time you will get something back, but it's almost never the same thing you gave. We each have different skills, after all.

Re:But is it still caching once the original is go by Anonymous Brave Guy · 2003-07-02 08:05 · Score: 1

We're all above the law. The government derives its power from the people, not the other way around.

Those two statements aren't in any way equivalent.

Just to be clear, I do not volunteer my time just to toot my horn or make a profit. I have given thousands of hours over the past 5-6 years helping out in very technical forums (not just writing amusing anecdotes on Slashdot for my own entertainment, which is hardly the same thing) and never made a penny from it. I post here anonymously, and do you see me plugging anything I've done for commercial reasons? No.

However, what you're proposing would mean that I could no longer volunteer my time to help in some ways, because doing so would directly affect my ability to take part in other activities that are rent-paying. I'm not volunteering my time in the latter case, I'm doing a job, for which I expect to be given credit and paid. At present, publishers might be interested in my work, because it's not publicly available. I know that several major ones will refuse to publish material that has been previously available without pretty solid guarantees that the author has exclusive rights to it, etc.

Slashdot is not a volunteer organization. They make money off me, not the other way around.

How do they do that, if you're not paying subscription fees?

Re:But is it still caching once the original is go by anthony_dipierro · 2003-07-02 08:36 · Score: 1

Those two statements aren't in any way equivalent.

Ah, but they are. When the government tries to force an unjust law on the people, the people are under no obligation to accept it. Now sometimes it's in our own best interests to follow it anyway, just as we would sometimes give up certain freedoms under other gunpoint situations, but it's not always the best idea.

Just to be clear, I do not volunteer my time just to toot my horn or make a profit. I have given thousands of hours over the past 5-6 years helping out in very technical forums (not just writing amusing anecdotes on Slashdot for my own entertainment, which is hardly the same thing) and never made a penny from it.

Toot toot.

However, what you're proposing would mean that I could no longer volunteer my time to help in some ways, because doing so would directly affect my ability to take part in other activities that are rent-paying.

I thought you already had a job. Do you do any real work, or do you make all your money by threatening people with copyright infringement lawsuits?

How do they do that, if you're not paying subscription fees?

Two ways. 1) Advertising, and 2) Subscription fees from others.

Re:But is it still caching once the original is go by Anonymous Brave Guy · 2003-07-02 09:09 · Score: 1

When the government tries to force an unjust law on the people, the people are under no obligation to accept it.

And who is to say that it is unjust? You? Copyright is a well-established legal principle, and there are very good reasons for it. The fact that you don't like it doesn't make it unjust.

Perhaps you have a better idea for how to make laws? Or should we dispense with them altogether, since no doubt someone thinks every illegal thing should be legal, typically those who want to break the law and get away with it.

I'll gloss over the tooting thing; how exactly can an anonymous posting on an Internet bulletin board refuting a critical comment possibly be tooting my own horn?

I thought you already had a job. Do you do any real work, or do you make all your money by threatening people with copyright infringement lawsuits?

I do have a real job. I'm looking to make some extra money by doing some extra work. Do you have a problem with that?

And no, I'm not threatening anyone with copyright infringement lawsuits, though I would have every right to do so if people took my material and posted it elsewhere without my permission. I'm looking to make extra money by doing some honest work making use of my other skills. The only time action over copyright infringement would become relevant is if somebody like you tried to take advantage of my hard work for his own benefit, and I quite reasonably took legal action to prevent you from doing so.

Two ways. 1) Advertising, and 2) Subscription fees from others.

Thank you. Think about that for a minute, and understand that you just made my whole point beautifully. If everyone read Slashdot via web caches that didn't pass on relevant information, they would make no money from either of those sources. Then how would they support themselves?

Re:But is it still caching once the original is go by anthony_dipierro · 2003-07-02 09:29 · Score: 1

And who is to say that it is unjust? You?

Yep, me. We each have to decide for ourselves what is moral and immoral.

Copyright is a well-established legal principle

So was slavery.

and there are very good reasons for it.

I disagree.

The fact that you don't like it doesn't make it unjust.

You're right. But the fact that it is unjust makes me not like it.

Perhaps you have a better idea for how to make laws?

Yes. You should never go to jail for breaking a law which does not cause direct physical harm to someone.

Or should we dispense with them altogether, since no doubt someone thinks every illegal thing should be legal, typically those who want to break the law and get away with it.

The law is the law. I just don't think people should always follow laws which they don't agree with, especially when those laws are not enforced. This is already what people do. As you see, copyright infringement, drug abuse, and speeding run rampant.

I'll gloss over the tooting thing; how exactly can an anonymous posting on an Internet bulletin board refuting a critical comment possibly be tooting my own horn?

You're not anonymous. You're pseudonymous. What was the critical comment you were refuting? I simply told you you should release your works into the public domain.

I do have a real job. I'm looking to make some extra money by doing some extra work. Do you have a problem with that?

I don't have a problem with that, but I do have a problem with your methods of going about it. Further, you keep talking about paying the rent. Obviously you already have enough to pay the rent. You're looking for extra money for something else, not for paying the rent.

And no, I'm not threatening anyone with copyright infringement lawsuits

You don't have a copyright notice on any of your works?

I'm looking to make extra money by doing some honest work making use of my other skills.

There's nothing honest about threatening people with copyright infringement lawsuits.

The only time action over copyright infringement would become relevant is if somebody like you tried to take advantage of my hard work for his own benefit

Yeah, I should go sue all the people who have benefited from my blood donations, cause they're taking advantage of my hard work for their own benefit. Please. No one forced you to do that hard work. You chose to do it.

If everyone read Slashdot via web caches that didn't pass on relevant information, they would make no money from either of those sources. Then how would they support themselves?

They wouldn't. They'd go out of business, and people would move on to some other site which isn't trying to make a profit off other people's words. Despite what you believe, I believe the world would be a better place without VA Software, not worse.

Re:But is it still caching once the original is go by Anonymous Brave Guy · 2003-07-02 10:55 · Score: 1

OK, I give up. You're worse than RMS. You persistently ignore the positives of things you don't like, you exaggerate the negatives, you put words into people's mouths, you ignore the wording of the law or just dismiss it outright when you happen not to agree with it, and your arguments are illogical, emotional and utterly without objective merit. The best you can do is attack figures of speech and twist what I've written to give it meanings I did not, so as to set up a range of straw men at which you can shoot. I've tried to persuade you with reasoned argument, but you appear not to want to be convinced, or even to consider any other point of view than your own.

I don't know why you bother participating in a forum like this, particularly if you believe it shouldn't exist, but I for one no longer have the time to reply. Fortunately, if you ever try to act on your views in the real world, you're likely to get sued, thrown in jail, or otherwise discover the truth about these things the hard way.

Re:But is it still caching once the original is go by anthony_dipierro · 2003-07-02 11:14 · Score: 1

OK, I give up. You're worse than RMS.

Wow. Thank you for the compliment. I wish I really could compare myself to RMS. I don't agree with him all the time, but his combination of practicality with stubbornness has been inspirational :).

Re:Not so black and white as what you are saying! by Discoflamingo13 · 2003-07-02 12:55 · Score: 1

If I put up information on a web site, for free, as a volunteer, then the public has no rights whatsoever, either legally or morally. Why the hell should they? They didn't do anything to earn them.

The fact that the public has a right to anything you produce is the reason that the public domain exists. Copyright is instituted by governments to keep creative people in a position to keep creating - but when you're dead, the information should go somewhere to enhance the public good. If the human race is to advance, worthy knowledge needs to be transmitted to people - and some knowledge is too important to charge money for. If your information hasn't been saved by somebody else, where are we supposed to get it when your limited, exclusive rights to it expire? Keep in mind that I don't think copyright is wrong - I give away what I feel should be given away, and I sell what I think I should sell. But I think the current state of copyright law is in the rights - this is, of course, debatable.

If you're arguing that by putting it up on the web an author effectively forfeits all rights to control their work -- i.e., that the usual principles of copyright shouldn't apply for some reason in this medium -- then you're basically saying that anyone who might ever want to publish original material they wrote shouldn't ever make anything available on the web first.

If you put something on the web that's world-readable, you've published it electronically. Regardless of how many people have seen it, most publishers (that I know of) will consider a dead-tree publication a "reprint". If I want to submit something on the web for people to read, I will make them log in and identify themselves first - with a disclaimer. Publishing then becomes interpersonal communication - which is a very different thing.

No, it doesn't. But it has the right to refuse entry to anyone who doesn't provide the information it requires.

Actually it does - it's not public property - if they wanted to search you for weapons before you step on the premises, they can bar your entry if they so desire. They will lose an enormous amount of influence in the business community by doing so, but they are perfectly within their rights to do so.

At banks and bars, both of their access restrictions make sense.

In Minnesota (where I currently reside) a concealed-carry law just went into effect. With a permit, you can carry a concealed weapon on you wherever you go - except for buildings which don't allow them on the premises. Depending on the buildings you go in, they may have to wand you / search you before you get in. I don't like it, but that's the way it is - because sometimes it's better to be cautious than assume that people will do the right thing.

That is debatable. The normal protocol on the Internet is that if you visit my site, I get certain information about your visit in exchange. That's your side of the bargain. If you don't like it, don't visit my site; no-one's forcing you to, and you have no right to my material "just because". People like you seem to want an exemption to the usual principles of fair deals and copyrights because it would be to their advantage. Hey, robbing a bank would be to my advantage, maybe the government will change the theft laws so I can do it. I'm guessing the banks might object, though, as unreasonable of them as that would be.

People like me? Well, you don't exactly know a whole lot about me, do you? If you look around on the net, you'll find out enough, and I don't care what you do find - because I honestly think I'm really boring. I haven't heard mention of this "protocol" before - the Internet is set up to be whatever you want it to be. If I want to wander it anonymously, you have the right to refuse my access to your webpage because my browser doesn't tell you anything about me. Regardless of what the law says, some people are driven by their inner morality and principles rather than adhering to the letter of every little o
