Slashdot Mirror


The Wayback Machine, Friend or Foe?

ShaunC asks: "As the webmaster of numerous sites, I'm curious how others feel about the Wayback Machine. What particularly interests me is the fact that the Machine is a relatively new animal, yet it contains snapshots from my sites dating back to 1998. I can't help but wonder: where did they get such old copies of my websites, and who gave them permission to make those copies? I certainly didn't provide either. Perhaps I'm too much of a purist, but I've always seen the internet as an ever-changing medium, not a permanent one. Archives have bothered me ever since the fledgling days of DejaNews." This site last made an appearance on Slashdot earlier this year. Internet archival sites are right smack in the crosshairs of copyright, but they are useful. Anyone who has ever used Google's cache (and there are plenty of those links on Slashdot) can attest to this. Of course, the issue that may bug many content providers is how to opt out of such services, since some see it as a copyright violation. Is it possible to balance the issues of copyright and history, or will these two Internet resources find themselves in legal trouble in the future?

"The way I see it, archives are much like SPAM; I never opted in, why should it be my responsibility to opt out? I manage a number of domains and the process of refining robots.txt files and submitting myself to the Wayback Machine for removal seems to be intrusive. Worse, domains I've abandoned (which have lapsed or been re-registered by someone else) are forever archived in the Machine and I have no way to exclude them. Why should I have to deliberately remove my copyrighted material from an archive which was never granted permission to replicate that material in the first place?"

209 of 508 comments

  1. Erm by adamwright · · Score: 3, Insightful

    Isn't this exactly the point of robots.txt? Google won't cache content it doesn't spider, and it won't spider content forbidden by your robots.txt. Does the WayBack Machine obey the robots rules?

    1. Re:Erm by JebusIsLord · · Score: 2, Informative

      Yes, it does follow the robots.txt protocol. Therefore there really isn't a problem now, is there?

      --
      Jeremy
    2. Re:Erm by JebusIsLord · · Score: 2, Informative

      Shoot, that should be:

      User-agent: *
      Disallow: /

      --
      Jeremy
    3. Re:Erm by HP+LoveJet · · Score: 2, Funny

      Clearly an RFC is needed here:

      "Retro-Temporal Automated User Agent Exclusion Protocol"

      I'll try to put a draft together by April 1.

      --
      spawn_of_yog_sothoth
    4. Re:Erm by 1g$man · · Score: 2

      Why do webmasters have to "opt-out" rather than "opt-in" to be cached?

      Shouldn't the default be "don't allow spiders and caching" ? And if I want it then I should specifically allow it.

    5. Re:Erm by zootread · · Score: 2, Informative

      Yeah, you can add a robots.txt file and ask them to remove your site and it'll be wiped from their records. The problem is, if you don't have access to the site anymore, you can't throw in the robots.txt file. But I just checked on a web page I requested they remove, which no longer existed, so I couldn't put up a robots.txt file, but I made the request anyway.

      It looks like the page has been removed! My guess is if you request to remove a page and it doesn't exist anymore, they probably will remove it for you. This web page revealed me as the pothead and pro-marijuana person that I was (and still am, though in private) back in college. I was afraid my employers were going to find my old web page, but they're probably potheads too... But still, it's good to be able to cover up the silliness of my past.

      --
      Zoot!
    6. Re:Erm by kevinank · · Score: 5, Insightful
      The goal of the person who started archive.org was to record the history of the world wide web. The assumption was that whatever anyone thinks about the archive, there will never be another chance to go back and get that data once it is lost.

      The copies that they have archived in their databases are individual copies served from the original web requests, so they have the right to keep them. They became their copy when they were originally downloaded. Whether they have the right to make new copies and redistribute them depends on how you think fair use applies to that content.

      Ultimately if a lot of people start suing them they will probably shut down the archive to public access and only allow researchers to view their original copies on site. And if you'd prefer that, well, you'll end up with the world you deserve.

      --
      LibBT: BitTorrent for C - small - fast - clean (Now Versio
    7. Re:Erm by Ross+C.+Brackett · · Score: 5, Funny

      Well, the default is to not plug your server into the Internet in the first place, now isn't it? To quote Doug from Ghost World, "It's America, dude, learn the rules."

      Seriously, if someone's precious intellectual property - as if anything worthwhile was ever posted on the Internet in the first place - becomes compromised because they don't know a basic principle of how to run a website, well then boo hoo.

      It's worth the tradeoff. That the Wayback Machine exists is seriously cool, and some day will be of definite historical worth. If the occasional Brady Bunch erotic slash fiction author has to take a ride on the waaahmbulance because "A Very Brady Gangbang (M/m/F/f nc b/d)" got copied without their permission for the greater historical good, then that's a price worth paying.

    8. Re:Erm by dswensen · · Score: 5, Informative

      Yes it does, and how. In fact, immediately upon reading this story, I went to the Wayback Machine and checked out my personal website archive. There it was, material dating back to 1996 ("Oh God, no, not the digging man GIF!"). I made a new robots.txt file:

      User-agent: *
      Disallow: /
      # BITE ME WAYBACK MACHINE

      ... uploaded it, went back to the Wayback Machine, and got:

      Robots.txt Query Exclusion.

      We're sorry, access to [site] has been blocked by the site owner via robots.txt.
      Read more about robots.txt
      See the site's robots.txt file.
      Try another request or click here to search for all pages on [site]

      So, yeah, they seem to check the site for the most current robots.txt file before they show the archive. And if the robots.txt disallows archiving the site, ALL the entries are marked unavailable, not just the current ones.

      So, it's pretty easy to solve the problem of the Wayback Machine -- and probably without going balls-out with the "disallow everything everywhere" like I did.
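
A minimal sketch of the check described above, using Python's standard `urllib.robotparser`. It only shows how the two-line robots.txt from the comment is interpreted for a crawler such as `ia_archiver`; the helper name `is_blocked` and the `example.com` URL are placeholders made up for illustration, and nothing here is the Wayback Machine's actual code.

```python
from urllib.robotparser import RobotFileParser

# The "disallow everything for everyone" file quoted in the comment above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
# BITE ME WAYBACK MACHINE
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Placeholder URL; any path on the site is covered by "Disallow: /".
blocked = is_blocked(ROBOTS_TXT, "ia_archiver", "http://example.com/index.html")
print(blocked)  # prints True
```

An archive that re-reads the live robots.txt before serving old snapshots, as the comment suggests the Wayback Machine does, could apply a check like this to every stored URL for the site, which would explain why all entries become unavailable at once.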

    9. Re:Erm by treat · · Score: 2
      Why do webmasters have to "opt-out" rather than "opt-in" to be cached?


      You are opting in when you make data publicly accessible. It is part of the implicit social contract, due to the nature of information, since caching is such an obvious, natural, and desirable feature. A large proxy server will probably have several sites cached in their entirety. Retention time need not be considered at issue, due to the low cost of storage and the simply natural idea that if the information has even a slight value, it will recover the cost of storing it.


      When I view anything, it is my natural right, as much as access to air is, to be able to electronically retain a copy of it, if for no other reason than to aid my memory. You have no right to prevent me from retaining a picture I took that your car was in the background of.

    10. Re:Erm by M-G · · Score: 2

      Well, the default is to not plug your server into the Internet in the first place, now isn't it?

      That's quite possibly the most perfect comeback I've ever seen....

    11. Re:Erm by Qrlx · · Score: 2, Interesting
      I agree with kevinank completely. What is wrong with having old copies of your site archived? Take this quote from the front page of this article:
      Perhaps I'm too much of a purist, but I've always seen the internet as an ever-changing medium, not a permanent one. Archives have bothered me ever since the fledgling days of DejaNews.

      I don't know what kind of a "purist" this person thinks they are. DejaNews (now Google) is one of the *best* places to look for info that's relevant but not this week's headline. We might as well burn all the libraries to the ground, since they contain books with embarrassing misprints or factual errors.

      It might not be easy to get your site out of the Wayback Machine, but it doesn't sound like it's impossible either. Consider the alternative: would you rather live in a world where the past can be "updated" as needed, like the (purportedly reputable) New York Times did to the web version of a Sep. 9 story warning about Osama bin Laden? Right after September 11 they replaced it with a puff piece -- full details here. (Warning, contains links to the NYT registration-required pages, and I think the content may have been re-scrubbed since this appeared on BuzzFlash.)

      If there's no record of content, how am I supposed to provide a bibliography or references for "something I saw on the web somewhere?"
    12. Re:Erm by uncoveror · · Score: 2, Insightful

      I like the Wayback Machine's reason for being: preserving history. In 20 or 100 years, it will be very valuable information. I found old copies of my website, The Uncoveror, there. It really took me back. What I didn't like, though, is that it tried to force-feed me spyware, namely Gator and Bonzi Buddy. If the spyware and ads were removed, then it would be a true historical archive, the kind real historians and students can use for research. With the garbage on it, however, it has little, if any, academic value.

      --
      The Uncoveror: It's the real news.
    13. Re:Erm by guttentag · · Score: 3, Informative
      A number of people who don't want their content archived by the Internet Archive may still want search engines to direct traffic to their sites (The Washington Post does this). If that's the case, use this in your robots.txt file:

      User-agent: ia_archiver
      Disallow: /

      Most (all?) search engines provide information on how to specifically exclude their spiders (while allowing everyone else). Just go to the engine's site and search for info on how they treat robots.txt.
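
To illustrate the per-crawler exclusion described above, here is a small sketch using Python's standard `urllib.robotparser`. It checks that a robots.txt naming only `ia_archiver` blocks that agent while leaving other crawlers alone. The `Googlebot` agent name and the `example.com` URL are assumptions picked for the example, not something taken from the comment.

```python
from urllib.robotparser import RobotFileParser

# The archiver-only exclusion quoted in the comment above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from memory, no network access

url = "http://example.com/articles/today.html"  # placeholder URL

# The named archiver is shut out of the whole site...
archiver_allowed = parser.can_fetch("ia_archiver", url)

# ...while an agent not named in the file (Googlebot is just an example)
# falls through to the default, which is "allowed".
search_allowed = parser.can_fetch("Googlebot", url)

print(archiver_allowed, search_allowed)  # prints False True
```

Because there is no `User-agent: *` group in this file, any crawler other than the one named is treated as unrestricted.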

    14. Re:Erm by Lumpy · · Score: 2

      Here's the thing that needs to be sorted out. The internet is globally public. If you put anything out there that is not password or access protected, it is public domain, property of the world's population. (Just like that handbill pasted to a wall or telephone pole, your website is nothing more than that.) We really need to get some sane laws and regulations out there: if it's publicly displayed, you have no control over how many copies of said page are made, distributed, or used. (The individual graphics are different; let's use (GASP) existing copyright laws to protect them.) But the snapshot in time of the page you produced is mine, my neighbor's, and that stinky kid down the street's property now.... Don't like that? Get the hell off the web, you loser.

      This is the crux and design of the internet. No laws passed can change this. Until you start an advertising campaign that the internet is NOT a community, that it is NOT open to the world, and that it IS the property of corporate America and equivalent to walking in a store... (and with entry webpages stating that, just like real stores (GASP AGAIN... making people responsible for their actions? The horrors!))

      So to the person who submitted the story: quit your whining, you big baby. If you don't want your webpage viewed and reproduced publicly... don't make it public.

      --
      Do not look at laser with remaining good eye.
  2. Yummy by sheepab · · Score: 2, Informative

    Slashdot from 1997.

    1. Re:Yummy by quintessent · · Score: 2

      Very nice. And it's good to know they were using the same careful journalism back then. I like this headline:

      Judge Uninstalls IE in 90 seconds.

    2. Re:Yummy by digitalsushi · · Score: 2

      In the process of digging this up, you have also apparently answered the question of "who originally archived this", as the bottom of the page has a "welcome user from .alexa.com" footer.

      --
      slashdot: where everyone yells sarcastic metaphors to themselves to understand the issue
  3. "The Wayback Machine" by pb · · Score: 3, Informative

    "The Wayback Machine" has been a pet project for a long time, and we're only now seeing results. I know for a fact that they have pages back at least as far as 1996, and it's a damn shame they don't have anything that much earlier...

    And yes, it obeys the Robot Exclusion Principle.

    "Ask Google" strikes again; I would hope that you could find all of this information by searching, or reading an "About" page, or something. Fortunately, these abortions to journalism don't appear on the Front Page very often.

    --
    pb Reply or e-mail; don't vaguely moderate.
    1. Re:"The Wayback Machine" by Disevidence · · Score: 4, Insightful

      I think the question is not about its being publicly available, but rather about it archiving web pages that were taken down at later dates for various reasons.

      It's legally grey, and all it really takes is for some paranoid person to sue, and then the fireworks start.

      IANAL.

      --
      Think nothing is impossible? Try slamming a revolving door.
    2. Re:"The Wayback Machine" by martyn+s · · Score: 4, Insightful

      So I suppose libraries should just stop carrying books because the author doesn't like what he wrote anymore? I mean, what the fuck?

    3. Re:"The Wayback Machine" by Disevidence · · Score: 2

      I'm not saying what's right or wrong; I'm saying he could possibly sue, however.

      Person A puts up a website about X. Wayback Machine archives this website. Later, unbeknownst (sp?) to the Wayback Machine, Person A is sued by someone who controls or has copyright on X, and it's taken down. Yet that copy is still on the Wayback Machine.

      What if the Wayback Machine archives a link to DeCSS? They do archive forums; I've checked numerous old, old forum posts through the Wayback Machine.

      I just think it's a bit iffy about archiving stuff...

      --
      Think nothing is impossible? Try slamming a revolving door.
    4. Re:"The Wayback Machine" by Reality+Master+101 · · Score: 2

      So I suppose libraries should just stop carrying books because the author doesn't like what he wrote anymore? I mean, what the fuck?

      The issue is more akin to a library making a copy of a book and giving out copies of that copy to anyone who asks.

      --
      Sometimes it's best to just let stupid people be stupid.
    5. Re:"The Wayback Machine" by rodgerd · · Score: 2

      Actually, if a book is declared obscene or libellous, a library may well stop carrying it, and the Wayback machine has the same problem.

      And while it is sometimes delightful that it preserves things that, eg, Big Companies may prefer we didn't see, it's less delightful that the ramblings of a 17 year old's blog may come back to haunt them years later...

    6. Re:"The Wayback Machine" by Mr+Windows · · Score: 2

      Possibly sue for what? It's not libellous to (truthfully) say "n years ago, so and so said 'whatever'".

      It's always been the case that "if you don't want a future potential employer to read it, don't put it out in public". If a newspaper prints a libellous story, they issue a retraction, they don't seek out and destroy all copies of the paper.

    7. Re:"The Wayback Machine" by Rick+the+Red · · Score: 5, Insightful
      No, the issue is more akin to a library carrying newspapers and magazines for years, and their publishers suddenly telling the libraries "those copies are out of date, stop letting people read them." Why? If you didn't want anyone to read it, why did you put it out on the web?

      Are you ashamed of what you did back then, when you were young and foolish? Grow up -- we're all ashamed of what we did when we were young and foolish, and years from now you'll be ashamed of what you're doing today. Get over it.

      Personally, I think archives are great. Whenever I design an application I always ask about archiving, because inevitably they're gonna want it and it's easier to design in from the start. Oh, you want to know what your top 10 customers ordered last Christmas? Now you tell me! Geeze, we flushed that data last February, 'cause you said once the credit card cleared you didn't care to pay for the storage. But I digress.

      Someday your next client will want examples of your previous work, then you'll go crawling on your hands and knees to the Wayback Machine, begging them to show you what your pages looked like. And they'll honor your robots.txt file and tell you to get lost.

      --
      If all this should have a reason, we would be the last to know.
    8. Re:"The Wayback Machine" by budgenator · · Score: 2

      IANAL, but I don't think you can sue someone because they truthfully reported that you were stupid, or did something embarrassing a few years ago; but then again, I didn't think you could be sued for serving your hot coffee hot either.

      But on the other hand, someone just asked me if I might work on some web pages they had about a year ago, and sure enough there they were, with just a few broken images. This could be useful.

      We'll just have to remember that anything posted is forever now, or at least until they run out of storage space.

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    9. Re:"The Wayback Machine" by Lord+Ender · · Score: 2

      If you posted it to a publicly accessible web page, then you gave everyone in the world permission to copy it, in my opinion. Anybody who viewed your page made a duplicate of it in their browser cache. And copying is copying. Don't make it publicly available on the web if you want to restrict the copying of it. Ass.

      --
      A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
    10. Re:"The Wayback Machine" by Disevidence · · Score: 2

      I'm an 18 year old Australian Uni Student.

      So yeah, how did you work it out?

      --
      Think nothing is impossible? Try slamming a revolving door.
    11. Re:"The Wayback Machine" by gerardrj · · Score: 2

      No, the issue is more like the library deciding to sell copies of the books it carries, without the author or publisher's permission.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    12. Re:"The Wayback Machine" by global_diffusion · · Score: 2

      I think the question is not about its being publicly available, but rather about it archiving web pages that were taken down at later dates for various reasons.

      If this is the question, it implies that there is content out there that should be unavailable to certain people. I strongly disagree with that because I feel that data should be free. From anything like Bertrand Russell papers to kiddie porn, if it was on the net, it should be considered part of the net. If we select what should be part of our archive by select standards, we are in effect choosing how we want history to look at us. I say that we should store all the data and allow interpretations of it to change over time. Ask any anthropologist or historian and I bet you that they would love it if everything, even the obscene, had been recorded. If we want a true picture of the net, we need to include everything.

      Copyright on the web is a silly notion. If you put something on the "world wide web," then it is public to the world. You can't just take it off and expect it to disappear. If you take that idea to the extreme, next we would have people suing us for not deleting their websites from our browser cache. Copyright is silly. I just don't get it these days.

    13. Re:"The Wayback Machine" by martyn+s · · Score: 2

      I dunno, I take a look back at my early usenet posts, and although I may blush a little, and I'm embarrassed about it, I just deal with it. If it's just a matter of people knowing that they're being archived, then I can solve the problem very easily: You are being archived. Consider yourself informed.

    14. Re:"The Wayback Machine" by martyn+s · · Score: 2

      As someone said in another post, I think it's tragic when any piece of information, no matter how trivial, is lost forever.

    15. Re:"The Wayback Machine" by Suppafly · · Score: 2

      No, it's not. Reading anything on the internet constitutes copying, and anyone that places information on the internet knows or reasonably can be expected to know that, so the issue of redistribution doesn't really apply. So if you are going with a book analogy, it really is just like a library allowing people to come in and read books.

    16. Re:"The Wayback Machine" by Suppafly · · Score: 2

      No, the issue is more like the library deciding to sell copies of the books it carries, without the author or publisher's permission

      But it's not like that at all, since they aren't selling anything. Reading anything on the internet constitutes copying, and anyone that places information on the internet knows or reasonably can be expected to know that, so the issue of redistribution doesn't really apply, and one can't legitimately complain about copying since using http implies you want stuff to be copied.

      So if you are going with the book analogy, it really is just like a library allowing people to come in and read books without the author's or publisher's permission (which is just fine with physical books).

    17. Re:"The Wayback Machine" by Rick+the+Red · · Score: 2
      Gee, I must have missed the part where the Wayback Machine charges to look at their archive. I guess I'm stealing from them, eh? Or else maybe they're not selling anyone's copyrighted work.

      Oh, and every year I pay for my public library, whether I use it or not, in my property taxes. So in a sense they are selling me those books (only I don't get to keep them, so I guess I'm just renting them).

      The Wayback Machine doesn't even do that; they just let you see them for free, just like you could see them for free at the original sites. Imagine that! They're giving away free what the copyright holder gave away for free. Now, you'd have a case if they archived pay sites and let you see them for free -- point me to their pay-for-porn collection!

      --
      If all this should have a reason, we would be the last to know.
    18. Re:"The Wayback Machine" by gerardrj · · Score: 2

      The money is not the issue. Selling, giving, loaning whatever. The works are not theirs to do any of these with. The fact that I allowed you to view my web page at one point does not extrapolate to your right to make that page viewable by others on demand for eternity.
      The library neither copies nor republishes the works they hold. The Wayback Machine does both.

      You drive your car in public. Does that make it public domain? Would you report it stolen if I borrowed it (without asking) for a drive to Montana for a week, even though you have two other cars (essentially copies)?

      Every year I pay property tax also. Much of it goes to schools, but I don't have kids. Are they selling me the schools, or the children? Can I enroll in high-school classes as a refresher course? But again, money is not the issue.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    19. Re:"The Wayback Machine" by Lord+Ender · · Score: 2

      Posting your own intellectual property to the web (or giving someone else permission to do so) is much different than posting someone else's. Think about it. Totally different.

      --
      A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
    20. Re:"The Wayback Machine" by pjrc · · Score: 3, Insightful
      Are you ashamed of what you did back then, when you were young and foolish?

      I am. Well, sorta anyway. My site has all of the pages that have ever appeared, all the way back to 1995. For example, this circuit board schematic page got a lot of hits in 1995. For years, I got emails from people who attempted to build it... a few were successes but most were failures. So, in 1997 I redesigned the board/schematic so that it would be much easier to build and troubleshoot, and then I made another new rev in 1999 (because the flash rom chip became obsolete).

      Based on lots of user feedback, I redesigned it yet again in 2001, mainly to increase the speed, add more memory to be C compiler friendly, and I added the most user-requested feature, a port to plug in a standard LCD.

      Today, those old pages (well, still need to update the '99 ones) have a message at the top of the page that tells the visitor they're viewing obsolete material and strongly suggests they follow a link to the new version of the circuit board, which is easier to build (added in 1997), uses parts that are currently available on the market (added in 1999), and has more features (added in 2001).

      An archive of the original 1995 page, even archived in 1996, isn't going to warn the poor user about the usability improvements added in 1997, the part that became obsolete in 1999, and the nice new features that were added in 2001. At the very least, it'd be proper for archive.org to link to the current version of the page (if it's on-line)... but even that would be difficult since the site moved from a university to its permanent domain name in 1999 (the old site kept a redirect for a couple years, but even that is gone now).

      So, while it sucks that someone might find that old material and suffer through all the problems that have been corrected and miss out on the improvements of the last several years, it doesn't suck enough that I'd hire a lawyer, or even bother to tell them to exclude my material.

      But I can understand how a large company would not want its old products displayed with the then-current literature in a way that might confuse potential customers.

    21. Re:"The Wayback Machine" by anshil · · Score: 2

      Well, to be honest, that's a bit of nonsense; somebody _sees_ anyway that he is viewing a site from, for example, 1996.

      Take magazines, for example. I really have old ones in my cellar, and it is so funny to take a computing magazine from 1992 or so and read it (for example Powerplay, if anybody remembers it), when they, for instance, discuss how Bard's Tale 3 on the CPC is the most fantastic thing ever seen.

      Do I get confused by this magazine? Certainly not; I look at the front cover and see the date the information applies to. The same goes for reading very old newspapers, or old social magazines like the ones lying on the tables in doctors' waiting rooms. Does anybody get confused? No, anybody can read the date on the front cover. Like last time, I read a magazine from 1999 at the doctor's, telling about the Y2K problem etc.; it's very funny today.

      No, what you're telling us is to burn the Powerplays in the cellar, the magazines at the doctor's, and the old newspapers in the libraries, because somebody might get confused by not reading state-of-the-art information. Welcome to 1984.

      --
      Karma 50, and all I got was this lousy T-Shirt.
    22. Re:"The Wayback Machine" by gerardrj · · Score: 2

      Your comment that "...this is a different medium..." is at the heart of my arguments...
      Why should publishing on the Internet have any more or fewer rights and restrictions than real world publishing? Just because the new medium requires something that is illegal in the real world does not necessarily mean the real-world rules should suddenly not be germane.

      It's also interesting that you mention the Washington Post, as they have "opted out" of the Wayback Machine.

      Wayback does differ in several major ways from the operation of a standard library:
      1. A library does not provide, allow or condone copying and/or redistributing their content (books, periodicals, reference materials) except for fair use. You may use the copies they have rights to. They do not produce as many copies of a work as there are patrons wanting to use it.
      On Wayback, the copies you implicitly get rights to when viewing a web page are the one in memory and on your screen, and the one temporarily in your browser's cache. You have no distribution rights, or rights to make any other copies beyond fair use (such as to a backup CD or tape).
      To stay within their rights, you would have to go to the Wayback building and look at the page on their computer that captured/cached the site/page.

      2. A library does not keep all copies of a work indefinitely. They rotate stock to keep up to date.
      The Wayback is specifically attempting to maintain all data, even if wrong, false or outdated. Your library will destroy and replace such items.

      3. A library purchases content (generally) through channels that specifically know the purchase is for a library, and certain special rights may be conveyed or restricted.
      The Wayback is taking pages offered for one use and using them for something else entirely. They further make, or condone the making of, multiple copies while redistributing the works.

      4. A Library never alters stored works.
      Due to space, technology and other limitations, not all pages in the archive are re-rendered as initially offered to the public. This is tantamount to re-writing those pages, and could be considered plagiarism.

      There's also common knowledge. The public at large is well aware that if they write a book, it may well end up in a library. Wayback has no such ubiquity, and they seem happy to remain in the background, collecting these pages without most authors' knowledge.

      I'm not for shutting down the Wayback, or others like it. I'm just saying that overall things would be less legalistic (is that a word?) if they would simply take a solid "opt-in" stance. That is, they would only store pages that specifically allow them to do so, via the robots.txt, meta information, sign-up at their site, whatever. This surreptitious gathering of copyrighted works, and the questionably legal issue of re-distributing them in whole or in part, possibly altered from their original form, is just not right.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
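
The "opt-in" stance proposed above could be sketched roughly as follows. This is a purely hypothetical convention, not how robots.txt is defined and not how any real archive behaves: the made-up helper `has_opted_in` treats a site as archivable only if its robots.txt has a group that names the crawler explicitly and contains `Allow: /`. A wildcard `User-agent: *` group deliberately does not count as opting in.

```python
def has_opted_in(robots_txt: str, crawler: str) -> bool:
    """Hypothetical opt-in test: True only if a group that explicitly
    names `crawler` contains the rule 'Allow: /'."""
    in_group = False        # inside a group that names the crawler
    reading_agents = False  # the previous rule line was a User-agent line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                in_group = False  # a new group of rules starts here
            reading_agents = True
            if value.lower() == crawler.lower():
                in_group = True
        else:
            reading_agents = False
            if in_group and field == "allow" and value == "/":
                return True
    return False


explicit = "User-agent: ia_archiver\nAllow: /\n"
wildcard = "User-agent: *\nAllow: /\n"
print(has_opted_in(explicit, "ia_archiver"))  # prints True
print(has_opted_in(wildcard, "ia_archiver"))  # prints False
```

Under this assumed rule a site with no robots.txt at all is never archived, which is the reverse of the opt-out default discussed elsewhere in the thread.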
    23. Re:"The Wayback Machine" by gerardrj · · Score: 2

      The problem is that money is not the issue. Copyright law still pertains. I did not give them any permission to redistribute my site.
      The footer on my site even says so. The funny bit was they reproduced the footer stating that what they were doing was prohibited use.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    24. Re:"The Wayback Machine" by gerardrj · · Score: 2

      Let me take money out of the equation with another example:

      An artist takes a picture. He then makes 300 prints of that negative.

      He gives away the prints to the first 300 people who come asking for them.

      Can those 300 people then make copies of their prints and give them away?
      As the developer, can I run more prints from the negative and sell or give them away?

      In both cases, the people giving away the copies they made are violating copyright law.

      In this example, the negative is analogous to the initially published web site; the prints to the archived site.

      As I've said in many of the other threads, the money is not the issue. The issue is copying rights, who gets them, when and why. You only get rights to my work that I explicitly transfer to you and those of fair use.
      In publishing my web site, I automatically have copyright to the content. When I offer the pages on the web, I implicitly transfer to the viewer the right to view that content, and to cache it in RAM and on disk for a relatively short period. I understand that caching is an inherent part of the medium in which I publish my work. You also have certain rights under fair use. You may discuss my work, reference it in your own works, make backups, cite portions of it, etc...
      Nowhere do I, explicitly, by reference, or by implication, transfer rights whereby you can re-transmit any or all of my content.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    25. Re:"The Wayback Machine" by Rick+the+Red · · Score: 2
      Simple fact. I own my work, I retain all rights to that work. I didn't pass any of those rights on to Wayback. They have no right to archive my work without my permission.
      Gee, when I visited those sites (c. 1995) I don't recall seeing anything on them that said I didn't have the right to read them. If you don't want people to read your stuff, don't put it on the Internet! If you put it on the World Wide Web, don't be surprised if someone takes that to mean you don't mind if anyone reads it. Remember, if you put a "robots.txt" disclaimer, they'll honor your request.

      --
      If all this should have a reason, we would be the last to know.
  4. Robots.txt by mshowman · · Score: 5, Informative

    I had recently placed a restricted robots.txt file on my site and when trying to access any of the past revisions, I get a message saying that the owner has restricted access to the site via robots.txt. They seem to have that aspect under control.
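
    A minimal sketch of the check a well-behaved crawler would run before fetching or serving a page, using Python's standard urllib.robotparser. The robots.txt contents, the example.com URLs, and the "SomeOtherBot" name are made up for illustration, and "ia_archiver" is assumed here to be the user agent the archive's crawler identifies as; check the archive's own exclusion instructions for the current value.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: shut one crawler out entirely,
# and keep every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for agent, url in [
        ("ia_archiver", "http://example.com/index.html"),
        ("SomeOtherBot", "http://example.com/index.html"),
        ("SomeOtherBot", "http://example.com/private/notes.html"),
    ]:
        print(agent, url, may_fetch(ROBOTS_TXT, agent, url))
```

    Note that this only describes what a cooperating robot does with the file; nothing in robots.txt forces a crawler to run the check.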

    1. Re:Robots.txt by ShaunC · · Score: 3, Interesting
      Sigh. This, I suppose, is what happens when Slashdot keeps stories in the queue too long:
      2002-03-30 10:12:57 The Wayback Machine, friend or foe? (askslashdot,news) (accepted)
      At the time, I was having severe problems getting in touch with anyone at The Wayback Machine. Yes, their site makes it quite clear how to have your site removed. Yes, I placed the appropriate entry in my robots.txt files. Yes, I submitted my sites for exclusion. Then nothing happened. After emailing them several times with a list of domains I'd prefer to have removed from the archive, I got a reply back saying they should disappear by the end of the following day. No go.

      That's all changed. They've got the kinks worked out, as best I can tell, and have begun obeying robots.txt files. They weren't so diligent about it three months ago, or I wouldn't have gotten ticked at 'em.

      BTW, my submission was edited in at least one place: I don't capitalize the word "SPAM," as the capitalized version is Hormel's trademark. (Maybe my submission was combined with someone else's; hard to remember what I wrote 3 months ago.)

      Everything else I'd say has already been said; I wish I'd noticed the story sooner.

      Shaun
      --
      Thanks to the War on Drugs, it's easier to buy meth than it is to buy cold medicine!
  5. There are more than copyright concerns... by Anonymous Coward · · Score: 4, Insightful

    It's a scary thought that things kids are saying on message boards when they're teenagers are going to be back to haunt them when they apply for jobs in their mid 40s...

    I mean, if everything I posted on BBSes in the 1980s were still attributable to me... yikes.

    Remember kids. Use a nickname, and change it frequently if you ever want to run for any kind of office.

    1. Re:There are more than copyright concerns... by TheMonkeyDepartment · · Score: 4, Insightful

      Well, that's a great point, and it's a good illustration of the double-edged sword of free speech. You are free to say whatever dumbshit, ridiculous things you want. But you are also free to deal with the social consequences.

    2. Re:There are more than copyright concerns... by rhaig · · Score: 4, Interesting

      dejanews was my best tool to weed out resumes

      before I scheduled even a phone interview, I'd always search dejanews for the person in question. Sometimes I'd come up with a definite hit (first and last name as well as email and mentioning the local area or some work that was on their resume) and I'd be able to see what kind of person I was really dealing with. That's when I started looking at what I'd posted.

      --
      "We are not tolerant people. We prefer drastically effective solutions"
    3. Re:There are more than copyright concerns... by gad_zuki! · · Score: 2

      The problem with this is that as the copyright owner you cannot convince Google to remove your old usenet posts unless you still have that 1998 email address. It's a ridiculous requirement for Google to ask people to still have their college email accounts when they want the posts they own and wrote removed from Google's system. A simple proof of ID should be enough for them.

      An inaccessible copyright policy is like having no policy at all. Expect fallout sooner or later.

    4. Re:There are more than copyright concerns... by Fencepost · · Score: 2

      Wrong. You can get Google to remove postings you made with no-longer-extant email addresses. See the Google Groups help, specifically this entry.

      --
      fencepost
      just a little off
    5. Re:There are more than copyright concerns... by Suppafly · · Score: 2

      I don't understand why people can't grasp that concept and also the concept of 'if you are going to put something out that is publicly viewable, you can't take it back'. It's dumb to whine about caches; if you put something out that is world viewable, that's the risk you take.

    6. Re:There are more than copyright concerns... by Suppafly · · Score: 2

      Why would you want to have them removed if you posted them?

    7. Re:There are more than copyright concerns... by madmancarman · · Score: 4, Interesting
      dejanews was my best tool to weed out resumes

      before I scheduled even a phone interview, I'd always search dejanews for the person in question. Sometimes I'd come up with a definite hit (first and last name as well as email and mentioning the local area or some work that was on their resume) and I'd be able to see what kind of person I was really dealing with. That's when I started looking at what I'd posted.

      This kind of freaked me out when I started teaching in 1998 - I'd been running a large fan web site devoted to one of my favorite bands, and being heavily into the band, I posted a lot in their newsgroup and participated in more than one flame war. Of course, I was in college and in my very early 20's and late teens, but it's all archived on DejaNews now, with no way to remove it. I really doubt any public school districts are going to wise up to this (or even care, considering the national teacher shortage), but I wouldn't be surprised if it came back to haunt me in some way some day. As a previous poster mentioned, such is the burden of free speech.

      An interesting thing did happen to me at the beginning of this school year. I teach high school computer classes, and I was talking about managing that fan web site when one of my students (a junior) opened his eyes really big and pointed at me with his jaw dropped, sort of aghast. I paused and asked him what was wrong, and he exclaimed that he downloaded and used the guitar tabs I'd written years earlier when he was in junior high. I found that kind of amusing!

      I think the archiving of the internet is particularly scary when I can still find a lousy guitar tab of Pearl Jam's "Footsteps" that I did back in 1992, when I was a senior in high school piggybacking off an account at the nearby university, on my parents' Apple //e, while I was still learning how to play guitar. Obviously, the internet can have a much longer shelf life than a ProDOS 5.25" floppy (excluding news sites that "expire" their articles after limited availability).

      --
      First they ignore you, then they laugh at you, then they fight you, then you win. -- Gandhi
    8. Re:There are more than copyright concerns... by rhaig · · Score: 2

      when they put me somewhere doing more than just busywork, I'll call you. Until people start hiring sysadmins again, I'm stuck working for the state.

      --
      "We are not tolerant people. We prefer drastically effective solutions"
    9. Re:There are more than copyright concerns... by guttentag · · Score: 2
      Remember kids. Use a nickname, and change it frequently if you ever want to run for any kind of office.
      20 years later...

      MTV: "PresidentNeal, have you ever posted on Slashdot?"
      PresidentNeal: "Yes, but I never inhaled controlh controlh controlh controlh controlh controlh controlh modded anyone down for being a troll."
      MTV: "Mr. President, control-H doesn't work on Microsoft TV. It looks like you think you're using Linux."

    10. Re:There are more than copyright concerns... by sql*kitten · · Score: 2

      I wouldn't be surprised if it came back to haunt me in some way some day. As a previous poster mentioned, such is the burden of free speech.

      The thing is, people posted to usenet believing that it was an ephemeral medium, and that everything they said was essentially a throwaway comment that would expire within a week at most. The idea that someone was actually saving all this stuff simply didn't occur to 99% of posters (myself included). Partly it was because way back when, the storage to keep a usenet history online would have been prohibitively expensive, and partly because who would even want to preserve alt.*?

      Google do have a procedure for removing posts from their archive, but either it doesn't work or they are simply autoresponding then ignoring the request.

  6. Opting out -- of publicly available HTTP??? by TheMonkeyDepartment · · Score: 4, Interesting

    When you publish something on the web, it is publicly available via HTTP. End of story. Responsible netizens can observe the requests of "robots.txt" but they don't have to. If you want something more controlled, create a VPN or intranet or some other kind of non-public data server.

    Your argument is similar to that of newspaper publishers who didn't like "deep linking." What they couldn't (or didn't want to) understand is that the nature of an HTTP web server is quite simple. A client asks for a file, the server gives it back. Using that protocol implies that you are OK with that. If you're not, I suggest you look into different technologies, instead of complaining about lack of control, in a medium that was never intended to provide it.

    1. Re:Opting out -- of publicly available HTTP??? by krogoth · · Score: 2

      Exactly what I wanted to say. Of course, when you put something on the Internet you don't expect it to be archived forever, but you have to keep in mind that anyone can download it and do what they want.

      --

      They that quote Benjamin Franklin on liberty and safety deserve neither.
    2. Re:Opting out -- of publicly available HTTP??? by KillerCow · · Score: 4, Insightful

      When you publish something on the web, it is publicly available via HTTP. End of story.

      I don't think that that is a good enough standard. When a television show is broadcast, or when a book is published, it is publicly available -- but we don't think that the publisher loses their right to copyright protection in these cases. Publishing on the web is similar. The creator wants people to see his/her creation, but does not automatically give visitors the right to archive and retransmit the works.

    3. Re:Opting out -- of publicly available HTTP??? by FreeUser · · Score: 2

      When you publish something on the web, it is publicly available via HTTP. End of story.

      Exactly. By publishing online and publicly you've already opted-in.

      This is just another example of how incompatible copyright is with any kind of normalcy vis-a-vis individual freedom and, in this particular case, the freedom to archive information and hold someone accountable if they try to change it retroactively (and on the sly). Unless we want Orwellian-style changing of the facts post facto, copyright must lose to the right of archivists to preserve information from being lost. Any other policy would be disastrous.

      --
      The Future of Human Evolution: Autonomy
    4. Re:Opting out -- of publicly available HTTP??? by sckeener · · Score: 2

      When you publish something on the web, it is publicly available via HTTP. End of story.

      Ah...as a previous post pointed out, I don't think kids should have their remarks recorded forever. I doubt I would have made it as far as I have if my BBS quotes were still around...

      --
      "Only one thing, is impossible for god: to find any sense in any copyright law on the planet." Mark Twain
    5. Re:Opting out -- of publicly available HTTP??? by I_redwolf · · Score: 2

      Comparing a library of books or whatever you are thinking about and the HTTP protocol doesn't work.

      You go to the library and borrow a book, which has a value and has to be returned; if you keep it, then you must pay the value of the book. You could also take the book, make a copy wherever you are, and redistribute it, and most likely pay heavy fines when caught.

      With the HTTP protocol, information is put on a public network that operates on the basis that information exchange should be free (this is why there is an httpS://). The information, files, etc. you receive have value, except they don't have to be returned. You could also take those files, make a copy wherever you are, and redistribute them, and most likely pay heavy fines when caught. HOWEVER, if you archive those files, images, etc. that you receive, it's not a big issue. It's just like when you read a book and it has value: you archive as much as possible in your brain for a test, or you pull pieces of it out or whatever. Except in those cases you DO redistribute, on paper or in a report or wherever else.

      If you go your route, then eventually teachers wouldn't be able to teach, you kill off the internet's original purpose, and you usher in a 1984-like society; in general this is a very bad idea.

      I'm going overboard, right? If I'm not mistaken, your email address says you go to Cornell; look around campus and tell me if I'm going overboard.

    6. Re:Opting out -- of publicly available HTTP??? by elmegil · · Score: 2
      I doubt I would have made it as far as I have if my BBS quotes were still around...

      I don't see why not. I have lots of fun quotes floating around in google, deja* and slashdot, but my employers have never called me on them. Do you think HR has nothing better to do than find ways to embarrass potential employees? Perhaps some places, but I wouldn't want to work there....

      Public office is a different matter, but honestly, I don't see how embarrassing things said on the net are any worse than embarrassing things done in the public record (like the famous Newt G. divorce-your-wife-while-she's-in-the-hospital-with-cancer debacle).

      --
      7 November 2006: The day Americans realized corruption and incompetence weren't addressing 11 September 2001
    7. Re:Opting out -- of publicly available HTTP??? by TheCarp · · Score: 5, Interesting

      The other question is one of historical record.

      What you say does not BELONG to you. It is not property. Once you write it, it exists. You may own the medium it is on, but once it is out in the world it is uncontrollable and no longer owned. You may hold copyright... but a hundred years from now when you are long since dead and copyright is expiring, then what?

      We have the works of Galileo, we have letters that Thomas Jefferson wrote to people, why? because they were written. Many years later, long after the fact, these were made public and part of historic record because they survived.

      On the net, we have a culture of written information appearing and disappearing. This information is part of our culture; it's things that we read and see. When it goes away - for whatever reason - we have lost something.

      I have websites from '96 that exist now only in the Wayback Machine. Yeah, some of the stuff I said back then I don't agree with now, and would rather not have associated with me, but by that same token, I wouldn't want it to be lost forever. If someone read it and what I wrote had enough impact on them that they want to see it again... then I would not even dream of trying to stop them (even if the impact was one of disgust - an impact is an impact) - even if it's just someone wanting to see what the web looked like 5 years ago... I think that's valid... I think that's an important record of our culture.

      The only thing I can see a case for, really, is the removal of personal information that shouldn't have been public in the first place. Beyond that though, I think it's good... I mean... it's not something that is ever going to be mistaken for a live current site - you have to actually go to the Wayback Machine and ask for it.

      All in all this is a good thing and I hope it survives longtime.

      -Steve

      --
      "I opened my eyes, and everything went dark again"
    8. Re:Opting out -- of publicly available HTTP??? by Jobe_br · · Score: 2

      Archiving is one thing, rebroadcasting (or rehosting as is the case here) is another. By copyrighting my site, I reserve the sole right to host a server that distributes that content. Nobody else is given a right, expressly or implied, to 'mirror' my site, regardless of whether it's for archival purposes or not. That's the consideration that needs to be understood here. Archives are great - I often make use of Google's cache, but only if the *real* content I'm trying to reach is behind a slow connection or down entirely. Technically, Google cache should respect copyright, too - a concept that would certainly kill off the practice. Is it worth it to lose the convenience? Possibly ... think about the ramifications and think hard. If it's permissible to archive and host a site's content, why wouldn't it be permissible to archive and broadcast the ST:TNG episodes you've so faithfully taped, w/o paying the royalties to whoever holds the rights to that? Seems like it's pretty much the same thing to me, eh?

      Now - if you yourself want to archive a site that you're interested in, or if you want to contact the maintainer of a site for something you're looking for that used to be on his/her site, that's perfectly legit and respectable. I personally have all the previous sites for my company archived - if someone wants something that's no longer on our current site, they can certainly ask and we'll try to fulfill their request.

      I dunno ... I'm up in the air on this, but I entirely understand the copyright ramifications of the situation.

    9. Re:Opting out -- of publicly available HTTP??? by I_redwolf · · Score: 3, Interesting

      If the creator wants people to see his/her creation, but not give them the right to archive and retransmit the works, then just like always they can put a (C) at the bottom of their webpage stating that redistribution of the work without express permission of blah is prohibited. Obviously that would raise the question of how many people the creator would want to see his/her work in the first place. If they want to be selective then be selective: write a webapp that only allows registered users who have agreed to a non-disclosure or no-redistribution license, etc. There are many ways to go about it so long as the creator understands "When you publish something on the web, it is publicly available via HTTP".

    10. Re:Opting out -- of publicly available HTTP??? by osgeek · · Score: 2

      When you publish something on the web, it is publicly available via HTTP. End of story. Responsible netizens can observe the requests of "robots.txt" but they don't have to. If you want something more controlled, create a VPN or intranet or some other kind of non-public data server.

      Your argument is analogous to those of spammers and telemarketers. You have a publicly available email box, so we can use it to send you spam - You have a publicly reachable phone, so we can use it to call you to sell you stuff.

      Respecting the wishes of those who create content, own email boxes, or have telephones would trump the wishes of those who wish to use those resources -- in my ideal society.

    11. Re:Opting out -- of publicly available HTTP??? by krypto246 · · Score: 4, Insightful

      People are just pissed about this archiving because they like the internet to be a 100% responsibility-free zone - no matter what you say or do, you can always change, edit or delete it later. How about standing behind your comments and opinions, instead of just deleting them when they can be held against you? Yes - use nicknames and aliases, but don't expect the things you put out there to be temporary. You put something out into the internet, it stays there, and it can be found later. That's the power of the net, and the price you pay for it.

    12. Re:Opting out -- of publicly available HTTP??? by RovingSlug · · Score: 2
      Okay, recast it in this direction:

      It's the same problem we're having with Napster, Kazaa, Blizzard, etc. That information can trivially be copied, that certain "copies" of information are absolutely fundamental for a computer to work properly ... these issues eat at the original preconditions to copyright.

      My computer needs a copy of your information in its registers, in its L1/L2/L3 caches, in its system RAM. Software may and often does save (archive) copies to a cache on the hard drive -- admins usually appreciate this because it reduces server load. A transparent web proxy to an intranet may cache web requests for its internal clients if it has a slow outgoing connection.

      Surely you shouldn't have to "opt in" the first few cases. But, it's all the same principle, caching/archiving. So, as we go out, especially to the transparent web proxy, where do you have to opt in? And what about further out, as computers become just one component in a cluster of computers? Where does broadcasting begin and caching end?

      I think there has to be a lot more philosophy than ST:TNG analogies to make a sound decision about copyright ramifications to computers (see Taking the Copy Out of Copyright). It's a very broad issue, and it will significantly determine the way we use both information and computers/electronics in the future.

    13. Re:Opting out -- of publicly available HTTP??? by delta407 · · Score: 2

      HTTP makes provision for caching and caching proxy servers, which does give visiting machines the right to archive and retransmit the works. Of course, there are expiration headers, but there is nothing that says it has to be purged from the cache once it expires.

      Are cache servers in violation of copyright?
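
      To make the point about expiration concrete, here is a rough sketch (not any real proxy's code) of the freshness test a cache might apply. It is deliberately simplified: it only reads max-age from a Cache-Control header and ignores Expires, no-store, and everything else; the cache dictionary and the example.com entry are made up for illustration.

```python
import re


def max_age_seconds(cache_control):
    """Return the max-age value from a Cache-Control header, or None if absent."""
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


def is_fresh(cache_control, age_seconds):
    """True while the stored copy is younger than its max-age."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age


# Hypothetical cache entry: stored two hours ago, allowed to be fresh for one.
cache = {
    "http://example.com/": {
        "cache_control": "public, max-age=3600",
        "age": 7200,
        "body": "<html>old copy</html>",
    }
}

if __name__ == "__main__":
    entry = cache["http://example.com/"]
    # The entry is stale, so a cache should revalidate before reusing it...
    print("fresh:", is_fresh(entry["cache_control"], entry["age"]))
    # ...but the bytes are still sitting in storage until something evicts them.
    print("still stored:", entry["body"])
```

      In this sketch, staleness only changes what the cache should do before reusing the copy; deleting it is a separate decision.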

    14. Re:Opting out -- of publicly available HTTP??? by M-G · · Score: 2

      Yeah, but the magazine publisher isn't gonna go into your bathroom, remove the old magazine content, and replace it with new stuff, which is in effect what happens with web sites.

    15. Re:Opting out -- of publicly available HTTP??? by budgenator · · Score: 2

      My site not only has the copyright notice on the bottom of the page, but ad banners for companies long since bankrupt during the DOT-bomb phase of the internet; but they are archived for posterity by the Wayback Machine, so the little (c) is ignored.

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    16. Re:Opting out -- of publicly available HTTP??? by Dan+D. · · Score: 2
      Then why aren't people suing the TV Guide for deep linking infringement? And yes, you can record TV, and quite often you can find archives of TV that may or may not actually have been collected by the original producer of the show.

      If it's broadcast on publicly available wavelengths (or electrons), it's publicly available.

      --
      People who quote themselves bug the crap out of me -- Me.
    17. Re:Opting out -- of publicly available HTTP??? by I_redwolf · · Score: 2

      Cable TV is a protocol? I can broadcast my own TV shows without FCC permits?! Wow, ACs sure do know a lot of stuff, it seems. If you could, forward that information to me. Or read the rules of public access.

    18. Re:Opting out -- of publicly available HTTP??? by I_redwolf · · Score: 2

      The ad banners were for companies long since bankrupt during the DOT-bomb phase of the internet; but the banner ads don't belong to you, you were redistributing them on your website. Maybe with the permission of the advertiser, including a small fee they might have paid you. I bet if you ask them, they won't complain about these same banner ads still promoting whatever product. But hey, I could be wrong; there are some advertisers who only want people who saw the website in June of '98 to buy their stuff.

    19. Re:Opting out -- of publicly available HTTP??? by nexthec · · Score: 3, Interesting

      Actually, in Canada (I am an American, but I'm married to a Canuck) anybody can rebroadcast anything. The deal is, though, that they cannot change it: they can't remove advertisements, shorten or lengthen it, add commentary over it, or put up their logo. Kinda a neat idea.

    20. Re:Opting out -- of publicly available HTTP??? by hymie3 · · Score: 2

      How about standing behind your comments and opinions, instead of just deleting them when they can be held against you?

      Okay, for usenet, sure. That's why I haven't removed my stuff from the google news thing. I've got posts from 1992 on there, many of them not all that flattering (I was 18, that's my excuse).

      My websites, on the other hand, are *my* creation, not "released to the public" as has been argued is the case for email and usenet. I *still* own the copyrights, but they are not being respected.

      I can stand behind stupid/offensive websites I made in my younger days. Can they stand behind their claims that they respect copyright? Last time I checked, copyright (at least in Berne signatory countries) was not an opt-in thing.

    21. Re:Opting out -- of publicly available HTTP??? by subsolar2 · · Score: 2

      On the net, we have a culture of written information appearing and disappearing. This information is part of our culture; it's things that we read and see. When it goes away - for whatever reason - we have lost something.

      I have to agree wholeheartedly. I wish the archive went farther back to the beginnings of the web so people could really see how it started out. It's always bothered me that there was no way of saving it, because of the ability to basically rewrite history with nobody able to prove it.


      There are also sites that I wish would have made it to the archive back when I first started out in 95 ... it would be cool to look at them again.

    22. Re:Opting out -- of publicly available HTTP??? by stinky+wizzleteats · · Score: 2

      The creator wants people to see his/her creation, but does not automatically give visitors the right to archive and retransmit the works.

      It's amazing to me how people can be so enthusiastic about using technology to spread information and yet be so capable of an unreasonable need to control that information once spread. To demand ownership of HTML, when storage and retransmission are a normal part of the operation of web browser software, and when you really don't even have control over how your page is presented by the browser, is patently absurd.

      You can't have it both ways. If you want to play in a world where information is freely and rapidly exchanged, then you must be prepared for exactly that.

    23. Re:Opting out -- of publicly available HTTP??? by Lars+T. · · Score: 2
      A copy of (almost) every printed book or newspaper is stored in at least one public library. In some countries it is even required that you give one copy to a national archive. Why should publishing on the web be any different?

      It is somewhat odd that the Slashdot crowd both wants to get rid of IP and cheers for Open Whatever, but wants their copyright protected if somebody archives their webpage or Usenet post.

      --

      Lars T.

      To the guy who modded me down from perfect to terrible Karma - Apple haters still suck

    24. Re:Opting out -- of publicly available HTTP??? by Simon+Brooke · · Score: 2
      I don't think that that is a good enough standard. When a television show is broadcast, or when a book is published, it is publicly available -- but we don't think that the publisher loses their right to copyright protection in these cases. Publishing on the web is similar. The creator wants people to see his/her creation, but does not automatically give visitors the right to archive and retransmit the works.

      We cannot have it both ways.

      Either there is an information commons, with rights to fair use of copyright information, or there is a DMCA-inspired world where ultimately it becomes illegal to so much as quote a phrase anyone else has previously used. If you said a thing, you said it; it's a fact and it's part of the historical record. You might reasonably have reason to complain if the wayback machine altered what you said in any way, or made it appear that you had said something you had not said; but so long as it merely archives and keeps an historical record, as far as I am concerned it is entirely legitimate and proper.

      I also hope that the wayback machine is archiving material that is hidden by robots.txt files, and will make them public after normal copyright has lapsed. It is part of the historical record, too.

      --
      I'm old enough to remember when discussions on Slashdot were well informed.
    25. Re:Opting out -- of publicly available HTTP??? by Twylite · · Score: 2

      Wrong, your website IS released to the public, unless you have taken steps to make it private (say, using password protection and having a limited set of permitted members). Publishing on the web is just that - publishing. That gives libraries the right to take your publication and make it available to the general public.

      The issue is more accurately one of whether the archive site is simply making your publication accessible, or republishing (which is protected by copyright). Copyright ownership does NOT give you carte blanche to decide how and who has access to your material; there are limitations and provisions, in particular that public libraries can provide access to that material on a loan basis irrespective of the licensing clause you try to apply.

      --
      i-name =twylite [http://public.xdi.org/=twylite], see idcommons.net
  7. Talk about a time machine... by wompser · · Score: 3, Interesting

    Went back and looked at the site for the .com I used to work for, very nostalgic. The wayback machine is a good resource for people who create content on someone's site (a.k.a. me), and then lose access to it because the company goes under. Now I'm able to add my old content to my portfolio, now that the company who once owned it is gone.

    --
    .....
  8. Permission... by gorf · · Score: 3, Insightful

    who gave them permission to make those copies?

    The way I see it, you implicitly give people some limited form of permission by putting it up on the internet freely available to download in the first place. You put it up for people to download, print out and so forth (which amounts to copying), and therefore you've implied that people may do so.

    Sure, you own copyright, and blatant plagiarism is something that clearly is wrong. But I see nothing wrong with taking an article that you published on the web and reproducing it, as long as it is taken in context and is clearly attributed (and it is made obvious that the copy isn't the original, but proper attribution would do this and therefore suffice).

    Of course, this is republication and so the issue is not so clear and obviously subjective. That's just my opinion.

    1. Re:Permission... by gorf · · Score: 2

      Well, assuming you didn't pay to watch TV (if you did, then you're cutting into their revenue by selling copies, so that's a good reason for it not to be right, as otherwise you wouldn't be able to get pay TV) then as long as you kept the advertisements intact, then I don't see a problem.

      As for website owners prohibiting things, I don't really consider that kind of notice valid. It's already implied that you can view them by the fact that you can (HTTP and all that). Restricting you after you've already done it is meaningless and therefore (in my humble opinion) invalid.

    2. Re:Permission... by gorf · · Score: 2

      That's interesting. My stance on that would be that the publisher should have considered that it had been on the internet, therefore publicly disseminated and thus impossible to prevent the article from being reproduced elsewhere when they made any agreement to transfer ownership of copyright.

    3. Re:Permission... by gorf · · Score: 2

      Because you paid for the book (the library did on your behalf) and the same for a film. The owner of the copyright didn't implicitly give you permission to do anything, because in the case of a film you paid for the privilege to see it, and for a book your library did.

    4. Re:Permission... by gorf · · Score: 2

      Yes, but say a writer published a book, owned the copyright and gave permission in the book for anyone to freely make copies, but then sold the copyright to the publisher. The publisher can't then prevent me from freely making copies if I'd bought the original publication of the book. What I'm saying is that publishing something on the internet for anyone to download for free implies permission to freely make copies.

      Like you say, it's the text that's copyrighted and not the content, so that whole (other) thread is irrelevant.

    5. Re:Permission... by budgenator · · Score: 2

      Yes, my original banners are there, and yes, if someone clicked them we would make money (if the banner's company wasn't bankrupt, that is), so technically we are potentially getting paid for content that is no longer available on our site.

      A bit off topic, but here goes: why don't the TV ad sponsors negotiate revenues for time-shifted programming?

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    6. Re:Permission... by gorf · · Score: 2

      Who said anything about Copyright Law? I'm talking about what is right and wrong, not what the law says you should and should not do. Have you not noticed in your infinite wisdom that there is actually a difference?

      There is a very good reason that copyright law exists; it's there to promote production of works by giving a financial incentive to do so. If someone is publishing something on the internet, then in my view he is effectively saying that he has more interest in dissemination of what he has to say than to make money, and therefore permission to copy verbatim is implied. If there are ads on the web page that he uses to fund himself, then that's fine; they'll get copied just as well as the other stuff.

  9. Legally you can stop them, but why? by the_womble · · Score: 3, Informative
    If you own the copyright they can not archive it without your permission, legally, that is all there is to it.

    Of course in practice you have to pursue this and ask them to remove it.

    If you really object I suggest a list of every site you have or have had and dates with a request to remove everything. Then you only need to notify them when you put up a new site that should also be excluded. That would not be such a nuisance, would it?

    That said I think they are providing a service that is interesting so unless you are harmed by it, why object?

    I am interested in knowing how they had such old versions of your site though. Do search engines keep archives?

    1. Re:Legally you can stop them, but why? by zangdesign · · Score: 2

      Nuisance, but not illegal. That is actually a good idea from a personal responsibility standpoint. Are you still willing to stand by words you spoke many years ago? If not, why?

      Who knows? It might actually get people talking for a change.

      --
      To celebrate the occasion of my 1000th post, I will post no more forever on Slashdot. Goodbye.
    2. Re:Legally you can stop them, but why? by poot_rootbeer · · Score: 2

      If you own the copyright they can not archive it without your permission, legally, that is all there is to it.


      So if I want to establish an old-fashioned library full of books that are made of paper, I can't do so until I get permission from the author/publishers of every single book in the building?

      The issue is not as cut-and-dried as you represent it to be.

    3. Re:Legally you can stop them, but why? by hymie3 · · Score: 2

      Nuisance, but not illegal. That is actually a good idea from a personal responsibility standpoint. Are you still willing to stand by words you spoke many years ago? If not, why?

      How old are you? I'll be 29 this year. I've been on the internet since I was 18. My first two or three years saw me posting quite a bit of offensive/tasteless stuff. At the time, I had a reasonable expectation to not have my words archived for my great-great-grandchildren to read.

      Somehow, jokes about Roland De Graaf having sex with Chelsea Clinton in the back row during the premiere of Jurassic Park seem a lot less funny now.

      Anyhow, my copyrights are being violated. I don't have to opt-in to be granted copyright. The mere act of authoring grants implicit copyright under the Berne convention (US signed on in 1989, which covers all of my web sites *and* gopher sites). Where's my satisfaction?

    4. Re:Legally you can stop them, but why? by zangdesign · · Score: 2

      I will be 35. I've been on some form or another of the internet since 1986. Which hardly invalidates your complaint, of course. I have probably made an equal amount or more of inconsiderate, illiterate, or even downright stupid comments online, so I do not relish the idea of those things being resurrected.

      However, the question arises: which weighs more - historical record or personal rights?

      We have no way of determining, at this time, what may be historically relevant in the future. If we have the means to archive these things for historical purposes it behooves us to do so for future generations to study and pull some pearls of wisdom or knowledge therefrom (perhaps: "Don't do that" might have more meaning because of your statements).

      The question of copyright is a specious argument, in my opinion. There is no violation of copyright, since your words have not been altered, no profit is gained by making a copy of your words, and, in fact, no readily apparent benefit has yet arisen from having archived your ill-considered words. Any benefit exists, in potentia, but one can hardly mortgage a potential metaphorical benefit for a gain now. In fact, one can use the same argument that a library would use for keeping books on the shelf. The author may retain the copyright, but the library has the right to take publicly available material and make it available to others. The potential public benefit outweighs the needs of the individual in that case.

      I would say that you fail, or at one time did fail, to understand the nature of digital media, in that it is intended to be a permanent medium, one certainly more permanent than paper storage. The truth of this argument falls far short of the dream in most cases, but advances are still being made.

      As for your "satisfaction", what satisfaction do you demand? Without a clear definition of your "satisfaction", your moral outrage is mere noise, signifying nothing. Do you demand that your websites and gopher sites be removed from the Wayback Machine? The Wayback Machine makes no attempt to usurp your copyright. You still retain all rights to your material, including the right of removal if you so desire.

      I suggest that you start trying to figure out how to explain to your great-great-grandchildren just who the heck Roland De Graaf is.

      And probably Chelsea Clinton.

      --
      To celebrate the occasion of my 1000th post, I will post no more forever on Slashdot. Goodbye.
  10. The story should read 'since 1996' by forged · · Score: 2

    www.cisco.com, 1 page (1996)
    www.microsoft.com, 5 pages (1996)
    www.ibm.com, 7 pages (1996)

    This is in the FAQ.

  11. As an creator... by Bonker · · Score: 2

    As someone who makes lots of free sellable and unsellable [http://www.furinkan.net/fanfic/] content, I think The Wayback Machine is an invaluable resource. I can look back and see how big a dork I was and still am. I've also found stuff of mine that I've lost over time, amazed that anyone ever bothered to hold on to it.

    --
    The next Slashdot story will be ready soon, but subscribers can beat the rush and slashdot the links early!
    1. Re:As an creator... by rknop · · Score: 2

      I've also found stuff of mine that I've lost over time, amazed that anyone ever bothered to hold on to it.

      Yes, I've used it for this too. I'm a volunteer webmaster for a site (www.fudgerpg.com) where we have a "monthly spotlight", but foolishly I wasn't keeping track of past spotlights. Eventually I wanted to put together a list of past spotlights, and realized that I hadn't kept that list. I felt stupid. The Wayback Machine (mostly) came to my rescue there.

      -Rob

  12. Ah, Gee! by Dark+Paladin · · Score: 2, Funny

    Sherman: Mr. Peabody, I want to go back in time!

    Mr. Peabody: Be quiet, Sherman. This new Wayback Machine is now accessible via a browser. Be happy with that.

    Sherman: But I wanted to go back in time and watch Cleopatra taking one of those milk baths again.

    Mr. Peabody: .... Damn it, boy, fire up the Wayback machine. And fetch me my chew toy.

  13. Who DOES have permission to copy your site? by allism · · Score: 3, Insightful

    Do I have permission to copy the content of your site to my browser history directory, and if so, how long do I have permission to keep it? Can I show a copy of an html document that is stored in my browser history to my mother? What about my neighbor? Or the dude in another country I happen to be chatting with online?

    IANAL blah blah blah, but once you open your files up to being downloaded and stored by a browser, you've pretty much given up the right to tell people they can't be re-distributed--I would think the best you could hope for is that people would re-distribute them, in whole, the way you originally released them.

    1. Re:Who DOES have permission to copy your site? by Pseudonym · · Score: 2

      To answer your question: You have always had fair use rights. Plus, thanks to the DMCA[1], you now have proxying and caching rights. You have never had republishing rights except where explicitly granted. Indeed, on most corporate web sites that I've seen, republishing is explicitly disallowed in the relevant corporate disclaimers.

      IANAL either, but lawyers and courts are usually not impressed by the "slippery slope" arguments that we geeks (I do include myself) usually come up with in these situations.

      [1] You heard that right, by the way. If it weren't for the constitution-violating Chapter 12, the DMCA would actually be a pretty good law. It has some lovely shiny new rights for Americans to enjoy. This is one of them.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
  14. I like it but... by rknop · · Score: 4, Insightful

    When I first discovered it, it was a lot of fun. Much nostalgia; it was fun seeing earlier versions of my webpages. Some go back quite a number of years.

    On the other hand, I was horrified when I realized that there was full archiving of www.dramex.org. If you visit that site, you will see that there are a large number of scripts (as in plays), many of which have restrictions on use. Over the years, we've had people request that scripts be removed from the site; of course, we did so. However, they weren't necessarily removed from the archive, and an archive keeps them forever. Specifically with the wayback machine, I was able to submit stuff that removed the specific directories I was worried about (they don't archive the scripts from www.dramex.org, just the "front page" stuff which is all part of the fun), and keep them from doing it again.

    I like the idea of archives; it preserves history. The web is a transient medium, but not entirely. Yes, much of the content is dynamic and should only be dynamic. Some of it, though, is like the front page of a newspaper. Each day, what's on "today's front page" is different-- but there is value and use in seeing what was on the front page in any day in history.

    But sometimes you need to delete something and make sure it really is no longer available. When you don't completely control your site (i.e. somebody else archives it, rather than just mirrors it), that becomes impossible.

    (Incremental backups can have a similar issue. If you only back up files which are "newer than the last backup", your backup doesn't have the information about files which have been *deleted* since the last backup. When you restore, you might find some files there you thought shouldn't exist any more.)
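
    To make the incremental-backup point concrete, here is a minimal Python sketch. It assumes each backup run records a manifest mapping file path to modification time; the function name, the manifest layout, and the file names are all hypothetical, made up purely for illustration. Comparing the previous manifest with the current one is what exposes deletions, which a plain "newer than the last backup" test never sees.

```python
def incremental_plan(previous, current, last_backup_time):
    """Return (changed, deleted) for one incremental backup run.

    previous and current are hypothetical manifests: {path: mtime}.
    """
    # The naive rule: copy only files modified since the last backup.
    changed = sorted(p for p, mtime in current.items() if mtime > last_backup_time)
    # The part the naive rule misses: files that existed at the last
    # backup but are gone now. Without recording these, a restore
    # brings the deleted files back.
    deleted = sorted(set(previous) - set(current))
    return changed, deleted


previous = {"a.txt": 100, "b.txt": 100}
current = {"a.txt": 150, "c.txt": 120}
changed, deleted = incremental_plan(previous, current, last_backup_time=110)
print("copy:", changed)
print("mark as deleted:", deleted)
```

    The same reasoning applies to an archive of a website: unless deletions are propagated, the archive keeps serving what the original site has removed.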

    (Dramex.org has changed so that it's not straightforward to get directly to the scripts any more. META tags tell the search engines to leave the actual scripts alone, and you can only get the text itself via CGI. Yes, it's easy to subvert if you put your mind to it, but at least you do have to put your mind to it, and automated search engines or archivers won't. 90% of the security for 1% of the effort.)
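
    For readers wondering what the META-tag approach looks like from the crawler's side, here is a rough Python sketch using the standard library's html.parser. The sample page and the helper name robots_directives are assumptions for illustration; this is not Dramex.org's actual markup, and whether a given crawler honors a given directive is up to that crawler.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def robots_directives(html):
    """Return the set of robots directives declared in an HTML page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page asking well-behaved robots not to index or archive it.
page = '<html><head><META NAME="robots" CONTENT="noindex, noarchive"></head><body>script text</body></html>'
print(robots_directives(page))
```

    A polite crawler would check for "noindex" or "noarchive" in that set before storing the page; an impolite one simply would not bother, which is the "you do have to put your mind to it" caveat above.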

    -Rob

  15. I love it. by gripdamage · · Score: 3, Informative

    What's the problem?

    If you do something illegal on your website, you won't be held responsible more than once just because the data persists on the Wayback machine. If you remove the offensive material from your site, that's all you can do. The Wayback machine can deal with their own lawsuit threats. And I'm sure they'll remove material if you are the site owner and ask nicely.

    As far as outdated information, anyone reading pages on the wayback machine and expecting them to be current would have to be crazy. It's an archive after all.

    It's easy to opt out. Google provides instructions in their webmaster FAQ, which points out "There is a standard for robot exclusion at http://www.robotstxt.org/wc/norobots.html."

  16. As a webmaster of various sites... by schon · · Score: 5, Insightful

    As a webmaster of various sites, I have no problem with archives.. if I didn't want people to see my stuff, I wouldn't have put it on the internet in the first place.

    where did they get such old copies of my websites, and who gave them permission to make those copies?

    They probably got the copies the same way everybody else did - by surfing. You (implicitly) gave them permission to cache your sites by not including an appropriate entry in your robots.txt.

    The way I see it, archives are much like SPAM; I never opted in, why should it be my responsibility to opt out?

    Archives are nothing like spam. Spam is primarily harassment. These guys aren't harassing you. They did ask your permission (by way of checking your robots.txt). If you've since changed your mind, it's your responsibility to notify them.

    Google caches material too - do you consider them to be spam as well?

    Archive sites provide a valuable resource to the rest of the 'net. If you don't like it, put an appropriate entry in your robots.txt file, and be done with it.
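
    As an illustration of what "an appropriate entry" might look like and how a well-behaved crawler would interpret it, here is a small Python sketch using the standard library's urllib.robotparser. It assumes the archive's crawler identifies itself as ia_archiver (check the archive's own FAQ for the name it actually uses); example.com, the /scripts/ path, and the helper may_fetch are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named crawler entirely, and keep
# every other robot out of a single directory.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /scripts/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for agent, url in [
    ("ia_archiver", "http://example.com/index.html"),
    ("SomeOtherBot", "http://example.com/index.html"),
    ("SomeOtherBot", "http://example.com/scripts/play.txt"),
]:
    print(agent, url, may_fetch(ROBOTS_TXT, agent, url))
```

    Note that robots.txt is only a request: it works because crawlers choose to honor it, not because the server enforces anything.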

    1. Re:As a webmaster of various sites... by Twylite · · Score: 2

      It should also be pointed out that in most countries it is a legal requirement that a number of copies of all publications be lodged with the central/national library. Because of the ad hoc nature of Internet publication (which means ANY web site) this is largely overlooked or ignored.

      Archive sites provide a facility which can be equated to a public library. Once you have published material publicly, you have no right to demand that it cannot be presented in a public library.

      --
      i-name =twylite [http://public.xdi.org/=twylite], see idcommons.net
    2. Re:As a webmaster of various sites... by MrFredBloggs · · Score: 2

      Is there a sensible comment anywhere in this story giving reasons why stuff *shouldn't* be archived? I can see sensible pro-archiving comments, and anti-archiving whining, but is there a rational argument against archiving freely accessible information?

  17. Can libraries keep old newspapers? by cperciva · · Score: 2

    The submitter states that he never gave the Internet Archive permission to replicate his work. He is wrong.

    By placing material on the web, one is implicitly granting permission for it to be read. If I put a poster up in my window, I lose the right to complain if someone walking by on the street reads it.

    Equally, I lose the right to complain if someone walks by and takes a photograph of the front of my house, including the poster. The fact that someone might then be able to read the poster ten years from now is irrelevant.

    If the Internet Archive were required to seek permission before archiving freely and publicly available material, then the same argument would require libraries to seek permission prior to archiving (free) newspapers.

    Timeshifting is fair use, and it applies to web pages just as well as TV signals.

  18. Quit simply, without Google ... by Vicegrip · · Score: 2

    I would never have visited countless sites I regularly surf to. Google has definitely been a major gateway to the internet for me.

    I think making an issue of the caching is a moot point, as about 99% of the time I go to the website for the content, since the source is always better than the cache. I use the cache only in cases when the content has disappeared or in some cases when the website itself is gone.

    This is a valuable service Google is providing-- and webmasters get it for free.

    --
    Do not spread "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" over the internet, thank you.
    1. Re:Quit simply, without Google ... by Vicegrip · · Score: 2

      Perhaps, but the article did. Try reading it again.

      --
      Do not spread "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" over the internet, thank you.
  19. Preserving information is important. by Chiasmus_ · · Score: 5, Insightful

    I doubt that I'm alone in my belief that it is always tragic when any piece of information--no matter how trivial--is lost forever.

    If a person has offered that information for free at any point, to the extent that an automated script could access it, then I believe that information can be safely considered public domain. I doubt that there's any mechanism by which Richard M. Stallman could lose his mind and "rein in" all copies of GNU, or by which Stephen King could recall all his novels and refund the purchase price; once something is offered to the public, it no longer belongs exclusively to the publisher.

    In my opinion, the value of archives in the future immeasurably outweighs occasional inconveniences of having information stick around longer than the author would have wished.

    --
    "Beware he who would deny you access to information, for in his heart he deems himself your master."
    1. Re:Preserving information is important. by quintessent · · Score: 2

      Throwing out so much dated information would mean discarding a critical part of our written history. Did you notice how the multitude of Y2K disaster sites changed from 1999 to 2000? That is history.

      If the courts are going to outlaw archives of the Internet, I suggest they do a complete job of suppression and order the burning of all books, newspapers, and magazines more than a year old.

      Then authors will be free to rewrite history as they wish.

    2. Re:Preserving information is important. by ImaLamer · · Score: 2

      This is one point I agree with 100%.

      Maybe some sites aren't meant to be "archived", no matter how cool it would be to see sites such as Yahoo '96 again.

      There are sites however out there with good information that should be available forever. This is our history folks! One of the advantages of the Web and the Internet in general is access to data. If archive.org wants to be the ones who house this data let us praise them!

      I think they were smart by doing this. Now we don't have to rely on nightly news (or other horrible, skewed sources) to replay their tapes or interviews. Now I don't have to save every newspaper I get.

      Archive.org demonstrates what the web should be!

      Also, check out their movies section. The greatest.

      Of course, opt out with robots.txt (as said above; they won't delete their copy, just exclude it from the site).

  20. err okay... by NanoGator · · Score: 2

    "I can't help but wonder: where did they get such old copies of my websites, and who gave them permission to make those copies?"

    You sound like Television broadcasters when you say something like that. "We'll broadcast content over the airwaves, but you better not capture it!"

    Well, let me make it simple for you: When you make something public you cannot expect to bottle it up later. That's the whole reason that the internet is in existence: Extreme redundancy so that data is never lost. The original idea was to build a data network that could survive a nuclear attack.

    I don't think anybody should ever post stuff on the web without expecting it to last forever in some form or another, regardless of whether permission is granted.

    --
    "Derp de derp."
  21. How so? by SkyLeach · · Score: 2

    "Of course, the issue that may bug many content providers is how to opt-out of such services, since some see it as a copyright violation."

    So I need to burn all my old comics? Or perhaps I don't need to ever allow anybody to look at them?

    Caches aren't republishing information, they are archiving it. That's what libraries do too. Hell, they can even charge for the service if they want and still be in the moral right.

    --
    My $0.02 will always be worth more than your €0.02, so :-p
  22. Fork over your caches by Eponymous,+Showered · · Score: 3, Funny

    I browsed all of your sites (even the abandoned ones) and since my browser cache is set to 782TB (and I'm still running Netscape 1.0N), your sites are still there. And my cache is publicly accessible via my webserver. Yet another way you're being violated. Ah, the risks and perils of publishing on a public network.

    1. Re:Fork over your caches by quantum+bit · · Score: 2

      (and I'm still running Netscape 1.0N)

      And it hasn't crashed yet...?

  23. Archives need to be made by Waffle+Iron · · Score: 4, Insightful
    If the courts determine that it is technically illegal to make archives of electronic content, then the copyright laws should be changed to explicitly allow archiving. Otherwise, we could eventually lose track of history. The only written record of large portions of our civilization would be relegated to a few rusting web server hard drives buried in landfills.

    If you read 1984, you might remember that the government tightly controlled all old copies of documents so that they could manipulate history as they wished. We might get into a similar situation by accident if we don't allow independent archives of electronic information.

    With traditional media, you publish something on paper, but you don't get to control who puts the paper copies in which archives. That has served us well for keeping track of history, and an equivalent system needs to be maintained for electronic content.

    1. Re:Archives need to be made by Target+Drone · · Score: 2

      Reminds me of the article about Online News Stories that Change Behind Your Back. Granted it's nothing like what the Ministry of Truth did in 1984 but it's still scary that news agencies have actually taken that first step of changing a news story.

  24. A Real World Example/Question by GeekLife.com · · Score: 2, Insightful

    Do libraries have to get permission to save and allow browsing of copies of newspapers (both physical and microfiche)?

  25. And what to do when info must die? by Nf1nk · · Score: 2, Insightful

    For the most part I don't have a problem with them archiving my sites (after all, they can show me what a site used to look like faster than digging out my backups), but recently one of my customers told me to remove all traces of a product from their site (something about nasty litigation). I pulled the info off our servers quickly, but three hours later I got a nasty phone call from the customer saying he could still see the product on the site. It seems it was hung up in some proxy server between here and there.

    Back to the point: how do you deal with an archive when you need to get rid of information that is a liability to you now? Maybe we are better off without them in some cases.

    --
    I used to have a cool sig, back when I cared
  26. Friend to Hosting Companies by Da+J+Rob · · Score: 5, Funny

    I was talking to this guy who works for a web hosting company, and he says a fourth of his sales calls are people calling him up because they're pissed that their last hosting company 'lost' their site. (In reality, most of the time it's later found out that the guy deleted it himself or renamed index.html to index2.html, etc.) He says that for 90% of the sites he can find a copy on the Wayback Machine. He'll then start to quote the website's contents to the guy on the phone and usually will have the amazed (and dumbfounded) customer signing a hosting contract by the end of the day.

  27. Uh, robots.txt! by Tom7 · · Score: 2

    Use robots.txt, stupido. It lets you prevent search engines from indexing and archiving your property. However, if you're that concerned about people copying your pages, you might try avoiding the internet.

    I personally love the internet archive and google's cache.

  28. The web is a public medium! by Steveftoth · · Score: 2

    The parent post said almost everything I was going to, but one thing I wanted to add: if a spider is able to get to a page at all (even a spider that doesn't follow the robots protocol, which the Wayback Machine does), it is only seeing a public page that anyone with an internet connection can get to.

    Otherwise you have bad control over your content and need to update your web server to not serve that content. If you don't want people to be able to copy your information then don't give it to them. Or only give it to them in a signed format that cannot be easily duplicated.

    It's like being surprised that someone has forwarded an email that you sent them.

  29. robots.txt DUUUUUUUUHHHHHHH!!!!!!!! by jsimon12 · · Score: 2

    For such a "webMASTER" this guy doesn't seem to know a lot about the Internet; he seems more concerned with keeping his "Intellectual Property" safe than with actually understanding the way things work.

    People like this ruin the concept of the Internet, the free exchange of knowledge. I hope other people on /. feel the same.

  30. Copyright and websites. by www.sorehands.com · · Score: 3, Interesting
    It could be argued that the site is publicly available and thus anyone can copy it. There is also the issue of fair use. That is why many people place terms of use and robots.txt files on their sites. It could even be a DMCA violation where an IP (or range) has been blocked, so people from that IP use the Google cache to bypass the block.


    I don't mind that my site is being added to indexes that the public have use of for free. I have a problem where a company uses my site to make a profit, with no public benefit.


    There is case law where unauthorized access to a website is a copyright violation.


    I am trying to use copyright law against some of the spammers who scrape my site for email addresses. Then, go after the spam software companies for contributory infringement (let the napster rulings serve some good).

  31. Re:Copyright must die! There is no such right by Chiasmus_ · · Score: 2

    According to Locke, the "natural rights" of man are life, liberty, and the ability to own property; when you enter into a society, you turn over all those rights to the State in return for whatever rights it deems fit to grant you.

    Thus, no one has the right to eat, have children, work, or be sheltered, unless their government sees fit to grant those rights. Certainly, America does not acknowledge a right to be employed or to eat; in fact, it's been known to blacklist people in the hope that they'll do neither.

    And no, no society I'm aware of has ever given its citizens the right to copy information indiscriminately. Personally, I would love to see a society do so, because I suspect that such a society would actually probably end up richer in technology and culture. Both sides of the argument make some sense, but only one is actually tried, and it's apparent that excessively restrictive copyright laws actually retard cultural and economic growth. But, no, as it stands, society has deemed that the exclusive right to copy a piece of work is something a government can hand out.

    --
    "Beware he who would deny you access to information, for in his heart he deems himself your master."
  32. Get Used to It, please by pyrrho · · Score: 2

    I understand the concerns, but I think it's a part of the net, a good part, that we have to wrap our minds around.

    Especially when you mention Usenet archives, which are (ok, get ready to laugh) historically important. I'm not kidding! There is a little signal in there, it's a cultural brain dump, and that's of historic interest.

    I think the rub is, if the archive presents the data exactly as you presented it (that is, it doesn't play with your content, present it in a frame or otherwise embed it as their own content), then it is a fair archive, a ghost of your site still walking the internet. There is no taking it back once you post it.

    --

    -pyrrho

  33. TV Broadcast analogy by rknop · · Score: 4, Interesting

    Some have already drawn analogies to TV broadcasts, saying hey, it was broadcast, you get to keep a copy. You can't bitch now if people still have that copy, unless you're Jack Valenti.

    You can spin this how you want. Here's one valid way to think about it though: a TV network broadcasts a show. You make a private copy on a VCR tape. Jack Valenti aside, you can watch that copy again as often as you like, and it's no big deal. However, you do not have the right to rebroadcast your copy of that show to the public without the permission of the original copyright holder. (I have my B5 tapes. I'm watching them through again now, showing them to my wife. I'm sure nobody is upset about this. But I'd be in deep doo-doo if I managed to broadcast them on a local access station, or uploaded them to a public website.)

    If you are inclined to be negative about the Wayback Machine, you could view it this way. While the page existed on the original site, it was broadcast to the public. If somebody made a personal copy, they have it and will always have it, even if the site goes down. However, when the site goes down, individuals do not necessarily have the right to then "rebroadcast" (i.e. post) themselves the content they downloaded and kept. This, however, is what the WayBack machine is doing.

    Mind you, except for the issue with www.dramex.org that I noted above (and which I fixed long ago), I like the WayBack machine, and am happy that they archived the content which was implicitly copyrighted to me. I would have opted in if I had wanted to. But, of course, I didn't know about it back in 1996 to opt in.

    I don't have a good answer to the questions. Just thoughts.

    -Rob

    1. Re:TV Broadcast analogy by Suppafly · · Score: 2

      The problem with the internet is that you can't really compare it to books. On the internet, all access to material implies redistribution.

      If you could compare it to books, you'd call the Way Back Machine a library and no one would complain, because you'd go to the library (the way back website) and view archived versions of publicly available content. In the book world, there is nothing wrong with this. If Michael Crichton sells copies of Jurassic Park and then later decides to rerelease Jurassic Park with extra chapters, the old version doesn't go away and there is nothing to keep me from freely allowing you to read my copy. The problem with the real world is that people don't see it like that; they bitch and moan because someone copied their content to another place and is distributing it from there, totally ignoring the fact that basic usage of the internet relies on the fact that information must be copied before it can be viewed (and the concept of proxy servers and the various legitimate reasons for caching content).

      There is nothing wrong with the way back machine; if people didn't want things publicly viewable forever, they shouldn't have put them on the internet. Things such as plays that people don't want to be totally publicly available need to actually have their entries protected in some way instead of expecting everyone to notice or care about a little (C) that may or may not be valid.

  34. best thing since sliced bread by John+Sokol · · Score: 2, Insightful

    There is nothing worse than revisionist history. I can't stand seeing a site post something and a bit later have it vanish forever, or have it altered to remove the very thing I was interested in.
    There are several GPL'ed Open Source software packages that I have copies of, that have vanished with all references to them and are no longer available on the net. Also a number of great sites that have come and gone for lack of either cash or time. I think if someone open sources something it should stay that way.

    Also if it's open on the net for public viewing, then it should be fair game. Especially if the original author is credited and it is in the original context, like the Wayback Machine is. I know there are always special cases where something was put up that the webmaster was not entitled to like a copyrighted book or something, but for most stuff this is invaluable and a great service to humanity.

    Also think of all those users whose web site was lost without a backup. Now they can get that data back.

    The Wayback Machine is one of the few web services I'd be willing to pay for.

    John

    --
    I am always doing that which I can not do, in order that I may learn how to do it. - Pablo Picasso
  35. Library archives are given broader copyright uses by tiltowait · · Score: 5, Informative

    .... and wayback is sponsored, amongst others, by the Library of Congress. The archive is itself a 501(c)(3) public nonprofit. See 17 U.S.C. Section 108(a)(3) for more information.

    Strange that such a complaint would appear within a group espousing that "information wants to be free." :)

  36. For what it's worth... by Reality+Master+101 · · Score: 2

    What particularly interests me is the fact that the Machine is a relatively new animal, yet it contains snapshots from my sites dating back to 1998.

    Interestingly, if you look at Slashdot's earliest entry (man, that page was ugly back then!), and then look at the bottom of the page, it shows the domain that was used to pull the page: "Welcome User From firestone.alexa.com".

    Alexa.com appears to be some web search ("powered by Google") toolbar thingy. I can't determine if they are the same people as the wayback machine or not.

    --
    Sometimes it's best to just let stupid people be stupid.
    1. Re:For what it's worth... by MushMouth · · Score: 2, Informative

      Alexa does the Archive's crawl. Notice that Brewster Kahle's name is attached to both.

    2. Re:For what it's worth... by adolf · · Score: 2

      IIRC, Alexa is responsible for the content of Netscape's "What's Related" button, and they've apparently been taking snapshots of whatever they could for years. I seem to recall some discussion about this button, and the data-collection policies at Alexa, about the same time the button started appearing in Netscape.

      Ironically, despite archive.org's extensive cache and slashdot's search feature, I can't find it. Hrmph.

      According to whois, archive.org and alexa.com are both registered to companies in San Francisco. Additionally, the 9/11 TV news archive page, as linked from archive.org's main page, credits a number of Alexa employees in the right-hand sidebar.

      http://tvnews3.televisionarchive.org/tvarchive/html/

      I'd say they're all the same people, more or less. Different corporations, perhaps, but at least the same faces.

  37. Purist? Pure what? by American+AC+in+Paris · · Score: 5, Insightful
    Perhaps I'm too much of a purist, but I've always seen the internet as an ever-changing medium, not a permanent one. Archives have bothered me ever since the fledgling days of DejaNews.

    I'd say it makes you more of a control freak than a purist, personally.

    Seriously, how did you ever get it into your head that a medium that serves documents to the general public on demand would be somehow exempt from archiving?

    Would it bother you if John Q. Savant could recite the contents of your web pages from memory ten years after you'd taken them down?

    Would it bother you to learn that stock prices, perhaps the most "ever-changing" thing out there, are permanently archived by a variety of services?

    Or are you just jittery at the thought that your spouse/boss/Friendly Neighborhood Representative of The Man/kids may be able to someday look at the shite you plastered all over the web in your younger days? ("Ech, that stupid Netscape 2 animated title hack--honey, you actually -did- that?")

    --

    Obliteracy: Words with explosions

  38. But!!! by www.sorehands.com · · Score: 2
    A person may take a picture of the front of your house and of you and your painting for personal use.

    Now, when that person redistributes it, then it becomes an issue of fair use, copyright and license.

    1. Re:But!!! by hymie3 · · Score: 2

      Yeah, but the wayback machine/internet archive isn't creating a derivative work--they're republishing (without my consent!) my copyrighted material. Copyright isn't opt-in, is it? If it is, I'll be adding a lot more mp3s to my archive.

  39. Microsoft.com in 1996 by dasheiff · · Score: 2

    http://web.archive.org/web/19961020014044/http://www.microsoft.com/

    Well, back in 1996 you really could win a million dollars from Bill Gates... well, at least a cruise.

    See all the exciting things happening on the Internet in Latin America, and win big prizes at the same time! Register for the first Latin American Internet Explorer Race. You'll have a great time, and perhaps even win a Caribbean cruise!
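
    Judging from the link above, an archive URL appears to consist of a /web/ prefix, a 14-digit capture timestamp (YYYYMMDDhhmmss), and then the original address. Here is a minimal Python sketch that pulls those pieces apart, assuming that layout holds; parse_wayback_url is a hypothetical helper name, not part of any archive.org API.

```python
from datetime import datetime
from urllib.parse import urlsplit


def parse_wayback_url(url):
    """Split an archive-style URL into (capture time, original URL).

    Assumes the layout /web/<YYYYMMDDhhmmss>/<original URL>, as inferred
    from the link in the comment above.
    """
    parts = urlsplit(url)
    # Re-attach any query string so the original URL survives intact.
    tail = parts.path + ("?" + parts.query if parts.query else "")
    prefix, stamp, original = tail.lstrip("/").split("/", 2)
    if prefix != "web":
        raise ValueError("not a /web/ archive URL")
    return datetime.strptime(stamp, "%Y%m%d%H%M%S"), original


when, original = parse_wayback_url(
    "http://web.archive.org/web/19961020014044/http://www.microsoft.com/"
)
print(when.isoformat(), original)
```

    For the Microsoft link this yields a capture time of 20 October 1996 and the original address http://www.microsoft.com/.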

  40. easy to remove and stop from archiving by arson1 · · Score: 2

    robots.txt

    User-agent: ia_archiver
    Disallow: /
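
    If you want to check what those two lines do before deploying them, here is a small sketch using the robots.txt parser in Python's standard library. The example.com URL and the "SomeOtherBot" agent are made-up placeholders, and the check only shows how a rule-following parser reads the file, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# The rules from the comment above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def allowed(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The archive's crawler is shut out of the whole site...
print(allowed(ROBOTS_TXT, "ia_archiver", "http://example.com/index.html"))   # False
# ...while agents not named in the file are unaffected.
print(allowed(ROBOTS_TXT, "SomeOtherBot", "http://example.com/index.html"))  # True
```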

    --
    Don't sweat the petty things, and don't pet the sweaty things.
  41. You have given permission by MrResistor · · Score: 4, Insightful

    By the very act of posting your site on the web you have given permission to make copies of it. Otherwise, how would anyone view it? And if no one is supposed to view it, why have you published it in a publicly accessible space?

    If I went to your website 2 years ago and never closed or refreshed that browser window, would I now be violating your copyright? What if I saved the page so I could view it later offline? What if I never erased that file, would that mean that I'm violating your copyright? I have several floppies of web sites I saved at school for viewing at home from the days when I was stuck on a crappy dial-up service. Does that make me a pirate? What about all the copies of sites held in my browser's cache?

    Don't get me wrong, I understand where the sentiment is coming from, even if I disagree with it. I'm just trying to point out how incongruous it is with the basic nature of computers and the internet and how they work.

    These questions aside, though, I have to come down in favor of the historians. People here are always whining about old movies/books/music being lost because their owners refuse to let them go, even if they aren't using them. Why should the web suffer the same fate? The rate of destruction is far faster on the internet, and since it isn't a physical medium, the information has to be actively archived if it is to be preserved.

    --
    Under capitalism man exploits man. Under communism it's the other way around.
    1. Re:You have given permission by SuiteSisterMary · · Score: 2

      So, anybody sitting behind a caching proxy...or an offline cache...is doing something you don't want them to do? Because the first, and under a strict interpretation, the second, fall under the heading 'republishing.'

      --
      Vintage computer games and RPG books available. Email me if you're interested.
    2. Re:You have given permission by gerardrj · · Score: 2

      The line gets drawn when you re-publish the web page in question.
      You are correct: by publishing a web page on a non-protected, publicly accessible server, you explicitly provide rights for your content to be viewed and retained by browsers.
      In your example, you would be wrong when you decided to later put the page you stored back on-line at something other than the original URL.

      If these people want to archive the internet, they should specifically be required to do it on an opt-in basis. You go to their site and enter the URL that you want them to keep archives of.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    3. Re:You have given permission by gerardrj · · Score: 2

      Your own example shows that the URL required to access the document (now archived) has been altered.

      Doing evil in the name of good is still evil.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    4. Re:You have given permission by MrResistor · · Score: 2

      I have to disagree with opt-in for archiving, simply on the grounds that too much would be lost simply due to laziness. Chances are I wouldn't go out of my way to opt in to an archiving program, and I'm willing to bet that 99% of webmasters wouldn't either. If I were running the archive I probably wouldn't offer an opt-out either, but then I also probably wouldn't put my archive up on the web. It makes more sense to me to make such an archive available on some physical media such as CD or DVD.

      I could see making republishing on the web opt-in, but not the archiving itself. If you take that step, you're opening the whole browser cache can of worms, and before you know it some idiot is suing people for using caching web-proxies.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
    5. Re:You have given permission by gerardrj · · Score: 2

      But why must "everything" be preserved for posterity?

      My dog took a dump on the lawn this morning. I have no record of what the lawn looked like before the event, of me cleaning up the event, or what the lawn looked like afterward. What is the loss to society for this lack of information?

      MOST of the web is just that... someone's brain dump. Most of it has no socially or intellectually reasonable need to be archived.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    6. Re:You have given permission by MrResistor · · Score: 2

      Because it's the everyday stupid stuff that nobody thinks is important that gives true insight into a society.

      Someone reading your post knows that:

      You have a dog, and that it is probably a pet since you trust it enough for it to have the opportunity to crap on your lawn.

      You have a lawn, an area around your residence which you care for enough to clean up when your dog takes a crap on it.

      It can also be inferred that you live in a society which has domesticated animals and a concept of individual property ownership, and that your society places a value on hygiene, and most likely the appearance of cleanliness, not just of the person but of the area surrounding the person.

      The loss to our society for not having this information is nothing, since pet dogs and lawns are common in our society and we all know that. The loss is to future societies trying to understand ours. Did the Egyptians keep pet dogs? Did they have lawns? Did they clean it up when their dogs crapped on their lawns? If they did, how would we know except that someone wrote "My dog crapped on my lawn and I had to clean it up" and that writing was somehow preserved?

      Academic and philosophical writings, while arguably more worthy of preservation, generally give little or no insight into the life of the average person, and at this point in time the way the average Athenian lived and the everyday things they did, and took for granted, is of more intellectual interest to us than that Aristotle knew a few things about Geometry.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
    7. Re:You have given permission by gerardrj · · Score: 2

      But how do you know what I wrote is true? Perhaps I just made it up.
      The problem with these statements and stories is that there is no way to prove them true. Hence, for all we know, a future society will attempt to build a religion on my posts.

      Look what's happened with the Bible... Some lonely people wrote down versions of stories that had been passed down via word of mouth for millennia. Once written down in the collective, they were suddenly looked at as authoritative, and people began killing each other over them. The archiving of those stories has caused more grief, misguided decisions, and war than anything else.

      Archiving without related, supporting data, and without explanation of the archived documents by the authors can be a dangerous thing.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    8. Re:You have given permission by MrResistor · · Score: 2

      Once written down in the collective, they were suddenly looked at as authoritative, and people began killing each other over them. The archiving of those stories has caused more grief, misguided decisions, and war than anything else.

      Every fabrication has a kernel of truth. Even fiction has historical value. While I agree that more suffering has been caused in the name of the Bible than any other written work, more good has been done as well. To say that the Bible advocates or encourages violence one would have to be totally ignorant of what it actually says. Jesus wasn't exactly vague on that point. To put it another way; many, many people have been killed by being hit with hammers. Would we be better off without hammer technology? No. It isn't the hammer that kills people, just as it isn't the Bible that starts wars. It all comes down to people twisting everything around them to serve their own ends. To blame the hammer, or the Bible, is to ignore the real problem.

      Archiving without related, supporting data, and without explanation of the archived documents by the authors can be a dangerous thing.

      All the more reason why we should archive everything we can. Even the things which, on their face value, seem worthless or worse. Everything anyone writes down gives an insight into their thoughts, state of mind, dreams, desires, etc. When all of these things are taken together, you have a society.

      I hate white supremacists. I think the world would be much better off without them. I think they spread ignorance and lies, and I'm ashamed to be a part of the same species, let alone race, as them. But, it would be impossible to understand the dynamics of the society I live in without knowing they exist and what they are about, or that there are people who oppose the beliefs they espouse.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
  42. A great tool for future historians /archeologists by msoldo · · Score: 2, Insightful

    Could you imagine if there were the equivalent of the wayback machine for everything published in 5th century Athens? We'd know an incredible amount more about where the human race had already been intellectually and where it's going.

    I publish several websites and I don't mind this a bit - If someone wants to host my content for free and offer my customers a way to get at older versions of the site for whatever reason (maybe they want to know what prices were 2 years ago), then they've done me a service. Cool.

  43. Historical Records by JonBuck · · Score: 2, Interesting

    As a historian and future librarian, one thing has always bothered me about the Internet. Because change is a constant, it's very difficult to keep records. It isn't like newspapers, pamphlets, books, or any other form of written record of the past five thousand years. Unless they're printed out, our writings here leave no physical evidence of their existence. Because I feel that the Internet is as significant as the printing press five centuries ago, the prospect of having no records from its early days is frightening.

    We have books from five centuries ago. Will anything here still exist in a readable form five centuries from now? Unless something is done to preserve it, I feel there will be a massive gap in history.

    And this is why I do not object to web archives. They are a half step to printed and more permanent storage mediums, but preferable to nothing at all.

    1. Re:Historical Records by SimJockey · · Score: 2

      Maybe just to play devil's advocate here, but is there anything on the web that is historically significant that is not also in a more permanent (say, dead tree) format? I'll agree that the Internet is important, but in the scope of history I would think that the structure would be of more interest than the content.

      --
      Laugh while you can, monkey boy!
  44. Re:Even better.... by WEFUNK · · Score: 2

    This is just....mind blowing. Look at Ebay from 1997 [archive.org].

    You fool! You've just Slashdotted Ebay!

    I think we've also taken out Slashdot, and we're probably on our way to taking out the whole damn history of the internet. It's one thing to knock out somebody's geocities account or web serving PDA, but the Slashdot effect has finally gone totally out of control!

    --
    My next sig will be ready soon, but friends can beat the rush!
  45. libel? by sckeener · · Score: 3, Interesting

    I didn't know that the wayback machine went that far back. I wonder if anyone is going to go to jail from posts they made in the past....

    --
    "Only one thing, is impossible for god: to find any sense in any copyright law on the planet." Mark Twain
  46. opting out by josepha48 · · Score: 3, Interesting
    At least for google to opt out of its service add the following tag in the "head" of your web page:
    <meta NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
    This will tell google not to cache your pages. If you don't want them to index your page and include the page in the search engine use:
    <meta NAME="ROBOTS" CONTENT="NOINDEX">
    Now I am not sure about this other site that is caching old pages, and right now I cannot get through, but if they are caching any of my pages I will tell them to take them off, as ALL my pages are just that: MY pages. I think you can sue them; I'd imagine with all the other internet lawsuits it would be valid. They are stealing your pages.
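
    For anyone sorting out which tag does what: as far as the robots meta conventions go, NOINDEX keeps a page out of the index, NOFOLLOW asks the crawler not to follow its links, and NOARCHIVE is the directive documented for suppressing the cached copy. The Python sketch below simply reads back whichever directives a page declares, so you can confirm what you are actually asking for; the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of robots directives declared in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta NAME="ROBOTS" CONTENT="NOARCHIVE"></head></html>'
print(robots_directives(page))  # {'noarchive'}
```
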
    --

    Only 'flamers' flame!

    1. Re:opting out by Buran · · Score: 2

      With only a robots.txt entry to stop Google, my site's entry in their index reads:

      www.buran.org/
      Similar pages

      ... I assume it's drawing off the domain name itself (the search term was "buran") to put the site in the index. The robots.txt file reads:

      # go away
      User-agent: *
      Disallow: /

    2. Re:opting out by josepha48 · · Score: 2
      Yes I messed up the tags, but that is a page that I don't want cached or indexed.

      As far as stealing my pages, yes they are. If people want to search for me then they should find my latest site, not an old snapshot of it. Many of the images on my site are mine, except for the ones that are taken from sites that say, "if you want to link to this site use this image". Yes I created them. What about using images from apple.com or some other site, wouldn't that be stealing images? What about caching pictures from a site that does not want you to cache the pictures? Isn't that stealing?

      As the poster says "Why should I have to deliberately remove my copyrighted material from an archive which was never granted permission to replicate that material in the first place?"

      --

      Only 'flamers' flame!

  47. Damn you slashdot! by Aanallein · · Score: 2

    I was just digging through a few hundred pages of information in the wayback machine when the site became sluggish. I jokingly told my friends (you know, the kind that live in my head?) it seemed I was singlehandedly slashdotting the site.
    *sighs* Seems I had some help...

    Anyway, I love the Wayback Machine. Besides being an extremely useful tool, it proves that Zindell was right. Information is never lost, only ever created.

  48. I had my sites removed by kstumpf · · Score: 2

    I used to run a Half-life map review site, and a TFC map review site called "radium". I took my sites down a couple of years ago, and recently some friends pointed out that they showed up on one of these archival sites. I took my sites down for a reason, and didn't appreciate them hovering about on someone else's server without my permission. Say what you will, but I just don't like it. I emailed them and had my property removed from their servers. It took a bit of badgering, but it finally got done.

    1. Re:I had my sites removed by PurpleBob · · Score: 2

      "A bit of badgering?" They will automatically remove any site whose robots.txt denies them, so it's not like they're trying to make it hard to get out of their archive.

      Perhaps they were more uncooperative because you were being nasty in your e-mail to them.

      --
      Win dain a lotica, en vai tu ri silota
  49. The backup copy of the archive by Animats · · Score: 2
    The Wayback Machine started as Brewster Kahle's project. He also did Alexa, which provided some of the old archive tapes to start up the Wayback Machine.

    The long-term plan is to have a copy of the history of the Internet, beyond the power of any single government to censor. To this end, there are copies of the archive at multiple locations around the world.

    One of them is in the Bibliotheca Alexandrina, in Egypt. They too have a Wayback Machine. It's jointly operated by the Government of Egypt and the United Nations Educational, Scientific and Cultural Organization. While they will usually honor removal requests, they don't have to do so.

    There are plans for two more archive sites around the world, affiliated with major national libraries.

  50. function like search engines. by Restil · · Score: 2

    Wayback machines should function exactly like search engines. If there's a robots.txt file, check it. If it tells you to get lost, do so. A search engine is going to cache at least the text part of your site, and you know it. And you can prevent it if you wish. And depending on the engine, it can take months or years to update.
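
    The crawler-side behavior described above could look roughly like the following Python sketch: read the site's robots.txt first, and only fetch and store the page if the rules permit it. This is an illustration under stated assumptions, not how any real archive works. The agent name "hypothetical_archiver", the example hosts, and the fetch_text/store parameters are all invented, and the "network" is a plain dictionary so the sketch runs offline.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def archive_if_permitted(url, agent, fetch_text, store):
    """Fetch `url` into `store` only if the site's robots.txt allows `agent`.

    `fetch_text` is any callable mapping a URL to its text; it is injected
    so the robots.txt check can be exercised without a real network.
    """
    parts = urlsplit(url)
    robots_url = f"{parts.scheme}://{parts.netloc}/robots.txt"
    rules = RobotFileParser()
    rules.parse(fetch_text(robots_url).splitlines())
    if not rules.can_fetch(agent, url):
        return False  # told to get lost, so do so
    store[url] = fetch_text(url)
    return True


# Stand-in for the network: one open site, one that disallows everything.
fake_web = {
    "http://open.example/robots.txt": "",
    "http://open.example/": "<h1>hello</h1>",
    "http://closed.example/robots.txt": "User-agent: *\nDisallow: /\n",
    "http://closed.example/": "<h1>private</h1>",
}

archive = {}
for page in ("http://open.example/", "http://closed.example/"):
    ok = archive_if_permitted(page, "hypothetical_archiver", fake_web.__getitem__, archive)
    print(page, "archived" if ok else "skipped")
```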

    Besides, wayback machines will run into the same snags that search engines do. They can't replicate cgi scripts any better than search engines can, so to deny them access to those resources for their sake as well as the server's makes sense.

    I don't know how wayback works. At the very least they SHOULD read the robots file. If they do, then I consider most of the copyright issues to be a moot point.

    -Restil

    --
    Play with my webcams and lights here
  51. dating back to 1998 by quantaman · · Score: 4, Funny

    Anyone else find it mildly disturbing that 1998 is considered to be distant history?

    --
    I stole this Sig
  52. The other issue by corebreech · · Score: 2

    It is suspected by many that archive.org also removes archives based on content.

    For instance, try accessing news sites back in the days immediately before and after 9/11. It is a very spotty record.

    I have seen this for myself as well, as a web site I am struggling to find the time to build, and which has controversial content, was at one time retrievable under archive.org, but no longer is.

    For that matter, it seems impossible to get Google to index it anymore either (though they too once included the site.)

    Presenting themselves as having a complete record of the Internet's web sites, and then selectively deleting or restricting access to sites based on content, is a very pernicious form of censorship. It isn't a First Amendment issue, perhaps, since dotgov presumably isn't the one restricting content, but it is worrisome nonetheless.

  53. Copyright *is* archiving. by blair1q · · Score: 2

    You can't unregister a copyright.

    You give a copy of your work to the Library of Congress, and there the evidence sits for eternity, free to be accessed by anyone with a request slip.

    The price you pay for copyright protection is public availability and persistence of your old rantings.

    --Blair

    1. Re:Copyright *is* archiving. by zenyu · · Score: 2

      You give a copy of your work to the Library of Congress, and there the evidence sits for eternity, free to be accessed by anyone with a request slip.

      Unfortunately this is not the case; if a librarian at some point feels your book isn't historically significant they will chuck it. They don't have offsite archiving, like some more reputable university libraries do, so there just isn't the space to keep every book that's sent to them. They do however have a right to keep those two books you sent them in perpetuity and copy them into archival formats if they want to.

  54. Nothing you can do by litewoheat · · Score: 2

    Once you're on the Internet you can never get out. It's a simple fact. Someone will always have a copy of that e-mail you sent professing your love to Missy Gringlebach or the nntp post about how brilliant Hitler was or your web site dedicated to New Kids on The Block.

    Trying to get that stuff off is futile at best. A professor of mine once said that there is not a nanosecond when some computer isn't processing or storing something about you somewhere. And that was in 1991. I've got to side with McNealy on this. There is no such thing as privacy anymore.

  55. Some one hasn't done their research by mfos.org · · Score: 4, Informative

    A few things

    1) They've been archiving since 1998, but they've only recently had the horse power to provide a live connection to it

    2) It is very easy to not have your stuff indexed. the directions are here.

  56. court evidence? by dubiousmike · · Score: 2, Funny

    It's so funny that I've been sending around links to my friends of their old corporate websites for months now. Totally freaks them out.

    On a different note, how long until the wayback machine is used as evidence in court?

    "No, Your Honor, we never posted slanderous comments about XYZ Company." *Oh CRAP! Not the Wayback Machine?!?*

    1. Re:court evidence? by MushMouth · · Score: 2, Informative

      Already used in the Go.com / GoTo.com trademark suit 3+ years ago.

  57. Excuse me? by innocent_white_lamb · · Score: 2, Insightful

    Er, you posted content on the WWW for world+dog to read. After all, that's the purpose of posting said content. And now you're unhappy because folks are reading it?

    If you don't want folks reading your stuff, for heaven's sake don't post it on the web!

    Seems obvious to me, somehow...

    --
    If you're a zombie and you know it, bite your friend!
  58. Re:Mississippi Trollse by TheMonkeyDepartment · · Score: 2

    You know what, I actually found that amusing.

  59. Dead-tree publishing parallel by Todd+Knarr · · Score: 2

    Why, somehow, does this strike me as similar to an author having published an utterly bad, horribly stinky book that, later in life, he regrets ever having let see the printing press, and complaining that some people won't turn in their copies to him to destroy now that he wants to unpublish it? Remember that copyright isn't an unlimited right to prevent copies. IMHO most of these archival sites fall into the same category as a library that bought a newspaper, scanned it onto microfilm and then subsequently had the original newspapers destroyed in a flood: they had legitimate access to the originals, the copies were legitimate fair-use copies when made, the originals haven't been transferred to anyone else, the copies remain legitimate fair-use copies.

    It may be embarrassing to the creators to have copies of their sites preserved for posterity, but copyright isn't about preventing an author from being embarrassed.

  60. What damages? by blair1q · · Score: 2

    Since material put on the web and made available for free access has no value, there can be no damage due to copying should someone copy it for their own use, or to use it against you in the future.

    Your copyright is valid, but valueless.

  61. Nebulous argument... by Codex+The+Sloth · · Score: 2

    While all that is true, proxy servers cache information to re-transmit and nobody complains about that. Don't my Usenet posts from 1990 implicitly have my copyright on them? Where do you draw the line? I say if you put it out there, you should just live with it and let the chips fall where they may. It's more like archeology than copyright theft...

    --
    I am not a number! I am a man! And don't you ... oh wait, I'm #93427. Ha ha! In your face #93428!
  62. Who archived it and why by Robotech_Master · · Score: 2

    It's funny the submitter should mention this...because I remember when the people who archived it started archiving it in the first place. A rather big to-do was made about it, as I recall; it was archived as a side-project of the folks at Alexa--you know, the ones who provide the "what's related" technology to Netscape? At the time they started, they didn't know for sure what they would do with it except store it for future generations...but they clearly had some ideas, judging from what they've done with it recently.

    As to the poster's complaint about his old stuff being archived...my immediate response is to say, "Well, tough...you should have thought about that before you put your content out there in the open for anybody who wanted to look at it."

    I mean, seriously, if you do something in public, you have no reasonable expectation of privacy thereafter.

    --
    Editor Emeritus and Senior Writer, TeleRead.org
  63. The purpose of copyright... by kcbrown · · Score: 3, Insightful
    The Congress shall have Power To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries
    -- United States Constitution

    The purpose of copyright is to promote progress, to entice authors and inventors to release their works and discoveries to the public.

    But that is not an end unto itself. The true end is the benefit to society that the release of such works brings.

    Now, remember that the whole incentive here, the entire reason for granting the monopoly privilege of copyright, is to allow the originators of works to make money from their works, which in turn (theoretically) gives them incentives to release their works to the public.

    When you publish something on the web, you're publishing your works for free, unless you go to the extra trouble of implementing some kind of access control. The Wayback Machine won't work on a site that has access control, so all it ends up archiving is stuff that was published for free public consumption.

    So the real question is: if a work has already been released for free to the general public, how would letting authors restrict the republication of that work after the fact bring greater benefit to society than not letting the author impose such restrictions?

    My opinion is that it is much more beneficial to society as a whole if the release of a work for free public consumption automatically implied that members of the public have the right to redistribute that work. So if an author doesn't want people in the general public to be able to redistribute his work, he has to control who receives the work and who doesn't. Certainly requiring payment for the work in question is sufficient to meet the requirement of controlling access. But whatever method the author chooses, it should be one that makes it clear that the work in question is not being released for free to the public.

    --
    Use 'slashdot stuff' in the subject line in any email you send me if you want to get past the spam filter.
  64. Hey, it's just a really... by neo · · Score: 2

    really slow proxy server. It's just got lots of options for which cached version you want to see. :-)

  65. Re:robots.txt won't work by gerardrj · · Score: 2

    BUT...
    You have to KNOW the thing exists in order to put it in your robots file.

    This means that there are MANY sites in that archive that are being captured and re-published without any knowledge by the authors.

    The robots.txt file is like a burglar alarm that only stops the burglars you already know about. Wayback should instead use the robots file to archive only sites that specifically allow it to do so.

    --
    Article X: The powers not delegated... by the Constitution...are reserved...to the people
  66. Removal Instructions by akiy · · Score: 2

    There are removal instructions at:

    http://www.archive.org/internet/remove.html

    --
    http://www.aikiweb.com - AikiWeb Aikido Information

  67. Your keyboard is a microphone... by surfcow · · Score: 2

    .. which anyone can listen to.

    Do you use caution when speaking into a microphone? Why?

    Anything you publish can be used against you. Data wants to be free, remember?

    =brian

  68. Errr...i disagree by Archfeld · · Score: 2

    If you want to OPT OUT, then don't put it up on the net. The NET is a public utility; put content out there and expect it to be accessed, cached, and backed up in numerous ways by LOTS of individuals, intentionally or unintentionally. If you want your data private DON'T put it on the net. Seems fairly straightforward and simple.

    --
    errr....umm...*whooosh* *whoosh* Is this thing on ?
  69. Coincidentally enough... by Kickstart70 · · Score: 2, Interesting

    Yesterday I used the Wayback Machine for one of the lawyers at the law firm I work at to prove that a company at one point had an office in a certain location. The company in question was trying to duck out of a contracted agreement by saying they were not the people who signed the contract.

    The Wayback Machine proved that they indeed knew of, approved, and granted authorization to this specific office, and the other people had a valid contract. In this specific case, the Wayback Machine prevented an apparently scumbag company from trying to screw some apparently good people over.

    Kickstart

  70. WOW! SEX.COM! by wo1verin3 · · Score: 2

    This wayback machine is invaluable!

    I was able to travel back to the early days of internet pr0n (click here to launch sex.com from '96) and research ancient authentication methods including "Click here if you are over 18".

  71. There are really two issues by Jerf · · Score: 2

    There are really two issues: 1. Should the archives be made? Which is what everyone seems to be discussing, and 2. Should the archives be publicly accessible?

    I agree that any interpretation of copyright law that says the answer to "1" is "No" means that copyright law needs to be changed, not that it is "illegal and therefore immoral".

    But a case can be made for "2" that the distribution should only be made when copyright on the material has either expired, or could reasonably be expected to have expired. Which brings up two other issues: the absurd length of copyright terms, and the near impossibility of determining whether a given work is still copyrighted.

    So, I don't have any answers, just better questions.

  72. Anybody heard of the Library of Alexandria???? by fdiaz5583 · · Score: 3, Interesting

    If anyone has ever heard of the Library of Alexandria it was supposedly the most impressive knowledge base the world had ever assembled. Some crazy guy came by and burnt it to the ground -- setting the entire industrialized planet back hundreds, perhaps thousands, of years. We are now in the process of surpassing this great library, and are making it even easier for people to have access to knowledge. That knowledge may be porn, may be the morning news, or sports scores, it may even be how to construct a nuclear bomb. Nevertheless it is knowledge and EVERY person who is alive has the God (and any other higher power) given right to knowledge, despite what any government agency, or copyright may say. 21st century libraries such as the WayBack Machine are providing the tools necessary for researchers to go "back to the future." This is a great service to mankind, and its overall importance should not be outweighed by greedy and/or overparanoid privacy rights activists. If you do not wish to be known, please do not post any information on the web, and move to the jungles of Africa and step away from a time and place known as the PRESENT.

  73. Re:I dont care much about copyrights..... by josepha48 · · Score: 2, Offtopic

    Happy lawsuits... when you steal a logo from a corporation that just wants to screw someone...

    --

    Only 'flamers' flame!

  74. What would really be useful... by Junior+J.+Junior+III · · Score: 2

    Would be an archive site that kept versions of news articles before and after they were changed by editors. Often, an article making allegations of corruption or bad intent gets changed shortly after it is published, and the replacement gives a more neutral stance, which doesn't give readers the whole story anymore, and in many instances makes the story a non-story, leading me to wonder why it was even published in the first place.

    --
    You see? You see? Your stupid minds! Stupid! Stupid!
  75. Not strange at all. by Ungrounded+Lightning · · Score: 2

    Strange that such a complaint would appear within a group espousing that "information wants to be free." :)

    Not strange at all.

    Slashdot is not populated by a bunch of lockstepping conformists. Its postership is large and diverse. The individuals are NOT the average, nor are they the stereotype.

    Perhaps on the average the posters think that IP laws are 'way too tight. But some think they're too loose. Post an article about somebody making them tighter and the make-em-loosers will complain, post one about somebody apparently not respecting them at all and the make-em-tighters will sound off.

    Further: Few if any Slashdot posters think a published author has no rights at all over the distribution of his work. (How would Copyleft work if that were true? B-) ) So when it looks like a service may be copying and republishing past works far beyond the authors' intended distribution they may sound off.

    And even the most fanatic of the "information wants to be free" faction may still post a cautionary note about how a particular act of radically freeing it may attract opposition.

    Which seems to be what happened here.

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
  76. Re:Anarchists Cookbook by martyn+s · · Score: 2

    Yeah, that's actually the book I had in mind when I said this. That and "How to get the women you desire into bed", by Ross Jeffries. He also mentioned in the foreword that he was thinking about, but was talked out of, taking the book off the market.

    See one day I might regret having admitted that I read that book "How to get the women you desire into bed", but there ain't nothing I can do about it :)

  77. 2 little points by Sabalon · · Score: 2

    1) I was glad that they had one of my old pages on there. I lost it due to a crash (my brain crashed and I wiped it out). I was able to pull it back off their site and get it right back running.

    2) Are we not the same collective group that gets mad at NBC for not wanting us to use our TiVos? I realize that there are a crapload of people on /., but sometimes the irony is just too much.

  78. Of course, archives should be legal. by shimmin · · Score: 2
    Archiving Web sites ought to be fair use.

    Arguing otherwise is like saying retaining old copies of magazines after the new ones have come out is an infringing use of those magazines.

  79. OT: Local archiving of Wayback machine results? by HEbGb · · Score: 2

    I found some information on the Wayback that I would really like to archive myself - for legally defensive reasons (i.e. trademark use, and to kill patents).

    Is there a way to archive sites from the Wayback machine in a clean (linked) way? I tried using standard web downloaders (Webreaper, Offline Explorer), but they didn't work correctly. Their FAQ says it can't be done, but for some reason I don't believe them... :)

    Anyone have advice? Thanks.

    1. Re:OT: Local archiving of Wayback machine results? by HEbGb · · Score: 2

      Duh. I wasn't asking about saving one page, but an entire site.

  80. Freaky by filtersweep · · Score: 2

    My ancient vanity site that received no traffic, nor deserved any, has been duly archived. I'm dying of embarrassment at my rudimentary HTML- back in the day.

    My question is: why was I even on their radar?

    --


    Those that suggest you "dance like no one is watching" really want to see you make a complete fool of yourself.
  81. Re:Old encryption can be broken! by SuiteSisterMary · · Score: 2

    What's that quote from Cryptonomicon, when the guy tells his buddy to use 4096 bit encryption? Something like "I want this encrypted until men no longer do evil."

    --
    Vintage computer games and RPG books available. Email me if you're interested.
  82. No less stringent than the GPL by blueskies · · Score: 2, Insightful
    The copies that they have archived in their databases are individual copies served from the original web requests, so they have the right to keep them. They became their copy when they were originally downloaded.


    You have the right to something once you download it?

    If I copyright my content, other people are not allowed to distribute it without my consent. There is no way around this. I don't have to add extra disclaimers, just a copyright notice. How can there be any argument about this?

    Ok, someone GPLs some software they wrote and puts it on their website. If you download a compiled version of the software, you can't redistribute the compiled executable without making the source available. Why? Because the copyright owner (via the GPL) only gives you permission to redistribute if you also make the source available. The owner can do this because the GPL is backed by copyright laws, just like copyrighted web content. Notice I said owner, because the law grants special privileges to people that create content and copyright it. There is no implied social contract that says the content is up for grabs. And there is also no reason fair use even comes close to applying if you are talking about a large quantity of content.

    I do think the archive provides a useful service, but I think they are on shaky legal ground.
    1. Re:No less stringent than the GPL by kevinank · · Score: 2
      If I copyright my content, other people are not allowed to distribute it without my consent. There is no way around this. I don't have to add extra disclaimers, just a copyright notice. How can there be any argument about this?

      Assuming that you were the copyright owner of the original web page, then when you made a copy for the original download to the people running archive.org you were within your rights. Since you gave the copy you made to them, the data is now theirs to dispose of as they please (this is a reasonably straightforward mapping of Copyright law into the digital domain.)

      Within the limits of copyright law, you can make your single (or multiple) originals available to other people without the Copyright owner's consent, assuming we can apply the first sale doctrine to alienation of the data by transfer over public networks.

      Likewise you can do anything else with the original legal copy you have that is permitted under copyright law, such as make fair use of the original. Fair use might be stretched to include the use that archive.org is making of the documents, or it might not, but it has yet to be tested. The only reason you can't say for sure that it isn't a fair use is that fair use isn't a specified set of uses, but any use that the courts consider fair. There are guidelines that have been created for judging fair use, but so far I don't know of any case law establishing archive.org's use as fair or not fair.

      My point was that if you really want them to lock away their database to a location where only they can use their originals then you can probably force them to do so in court. I'm merely of the opinion that the world will be poorer for the loss of readily available information.

      --
      LibBT: BitTorrent for C - small - fast - clean (Now Versio
    2. Re:No less stringent than the GPL by anshil · · Score: 2

      If I copyright my content, other people are not allowed to distribute it without my consent. There is no way around this. I don't have to add extra disclaimers, just a copyright notice. How can there be any argument about this?

      Read and understand the HTTP protocol. HTTP is by its original design not only a server-to-client communication, but allows a lot of proxies in between _caching_ the data (keeping copies of the content). Now, the cache usually goes back some days to weeks, but what's really different between a week and a year?
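
      The "week versus a year" point can be made concrete: at the protocol level, how long an intermediary may keep a copy is just a value in the Cache-Control response header. Below is a minimal sketch, not anyone's actual proxy code, that reads the standard `no-store`, `private` and `max-age` directives. The function names are made up for illustration, and the parser is simplified (it does not handle quoted values that contain commas).

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control header value into {directive: value-or-None}."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without "=" (e.g. "public") map to None.
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def shared_cache_may_store(header: str) -> bool:
    """True unless the origin asked shared caches not to keep a copy."""
    d = parse_cache_control(header)
    return "no-store" not in d and "private" not in d


def freshness_seconds(header: str):
    """Return max-age in seconds, or None if absent or malformed."""
    value = parse_cache_control(header).get("max-age")
    return int(value) if value and value.isdigit() else None


if __name__ == "__main__":
    one_week = "public, max-age=604800"      # example header, 7 days
    print(shared_cache_may_store(one_week), freshness_seconds(one_week))
    print(shared_cache_may_store("no-store"))
```

      Nothing in the protocol forces a cache to act on these values; a well-behaved one does, which is the same kind of voluntary compliance robots.txt relies on.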

      --
      Karma 50, and all I got was this lousy T-Shirt.
    3. Re:No less stringent than the GPL by kevinank · · Score: 2
      If you are writing from Europe then you are correct. Under United States copyright law you would be mistaken however. Once a copy of a copyrighted work has been handed out it no longer is under your control. The only rights you maintain over that copy are the ones spelled out in the copyright act which are roughly: the right to publish, the right to publicly perform, and the right to create derivative works. Any other uses, such as the use of reading the work, the use of selling it to another party, or the use of storing it for posterity are not exclusive rights granted to the copyright holder.

      You might be arguing that there was no alienation. That is, that even though you gave a copy to me, you didn't really give it to me, but only loaned it to me for a while or something like that. Whether that position would be held factual would be for a court to decide.

      In any case what you are asking for is simply and plainly contrary to the technological nature of the Web. Cache controls given by the web page designer are advisory, not mandatory. There is no technical means on the Web for doing what you ask. A smart attorney might use that to show that you gave implied consent to have the data copied and cached (even if there was no alienation) by placing the data on a medium where that copying and caching is implicitly a part of the technological means of communication.

      I imagine we will be seeing more case law in the next couple of years on this topic, and the results will probably surprise both of us.

      --
      LibBT: BitTorrent for C - small - fast - clean (Now Versio
  83. I hope that one day the net credit card by Rareul · · Score: 2, Funny

    transaction companies decide to integrate
    their historical transaction databases.
    That way, when this game is over, we get all of our money back.

    ?sp

  84. Copyright and robots by dsoltesz · · Score: 2
    Making a copy for archival purposes is not a violation of your copyright. It's fair use.

    The bigger issue is the rudeness of the archive in ignoring robots.txt and rifling through files that one does not wish to have linked or accessed (e.g. stuff under development that isn't ready for 'prime time' yet).

  85. History is more important than your copyright. by Jamie+Zawinski · · Score: 2
    I read somewhere that, when the archive.org folks are asked to delete something from their archive, their response is, "I will gladly delete you from the historical record. Enjoy oblivion."

    I sincerely hope that they don't ever really delete things, and that they ignore robots.txt as far as archiving goes. It's fine for them to not serve back your pages if you ask them not to. For a while. Say, until you are long dead.

    But this information might be interesting to future generations, and frankly, any librarian or archivist owes more to those unborn people than they have any obligation to obey your transitory wishes.

    Copyright laws change.

    Oblivion is forever.

  86. Websites from 1998.. by Chicane-UK · · Score: 2, Informative

    I read an article about the site.. the project has actually been running since 1998 - that's when they started collecting people's websites, and adding hardware to their 'collective' to store all the data.. they only made the site public in like 2001 (or whenever it was) despite collecting it for so long.

    I think if you use the Wayback Machine to go back to their own site in 1998/1999 their front page tells you this.

    --
    "Hey! Unless this is a nude love-in, get the hell off my property!!"
  87. Copyright vs. employer right by heroine · · Score: 2

    How much should employers find out about you based on the Wayback Machine?

  88. Information wants to be free by Rupert · · Score: 2

    ... in the same way that water wants to run downhill. Finding it strange that people object to certain uses of their information is like finding it strange that people object when you spill their beer.

    --
    E_NOSIG
  89. Re:robots.txt won't work by MushMouth · · Score: 2

    Learn the robots.txt protocol: you can shut off all bots and allow only the ones you want simply by having
    User-agent: good_bot
    Allow: /

    User-agent: *
    Disallow: /
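
    For anyone who wants to check what such an allow-list actually permits, here is a small sketch using Python's standard-library `urllib.robotparser`. It assumes the policy is written one directive per line; `good_bot`, `some_other_bot` and the example.com URL are placeholders. Note that `Allow` was not part of the original robots exclusion draft, so whether any particular crawler honours it is an assumption.

```python
from urllib.robotparser import RobotFileParser

# The allow-list policy from the comment, one directive per line.
ROBOTS_TXT = """\
User-agent: good_bot
Allow: /

User-agent: *
Disallow: /
"""


def may_fetch(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("good_bot", "some_other_bot"):
        print(agent, may_fetch(agent, "http://example.com/index.html"))
```

    This only tells you what a compliant robot would do; the file is a request, not an access control.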

  90. Re:robots.txt won't work by gerardrj · · Score: 2

    I know the robots.txt, but I (along with most web publishers) have better things to do than to keep track of every web bot that may visit my site. Given how fast crawlers come and go, just keeping up with a list would probably be a daunting task.

    Maybe the robots.txt spec should have a new tag that the archive bots look for:

    archive-agent: *
    Disallow: /
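
    To show how little an archiver would need to do to honour a tag like that, here is a rough sketch. `archive-agent` is NOT part of any robots.txt specification; it is only the tag proposed here, and `archiving_allowed` and `example_archiver` are made-up names. The sketch assumes ordinary robots.txt layout (one directive per line) and, to stay short, one agent line per record.

```python
def archiving_allowed(robots_txt: str, archiver: str, path: str) -> bool:
    """Check `path` against hypothetical `archive-agent` records.

    Only records introduced by an `archive-agent:` line are considered;
    ordinary `User-agent:` records are ignored by this check.
    """
    applies = False  # are we inside a record that names this archiver?
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (s.strip() for s in line.split(":", 1))
        field = field.lower()
        if field == "archive-agent":
            applies = value == "*" or value.lower() == archiver.lower()
        elif field == "user-agent":
            applies = False  # a normal crawler record starts here
        elif field == "disallow" and applies and value and path.startswith(value):
            return False
    return True


if __name__ == "__main__":
    policy = "archive-agent: *\nDisallow: /\n"
    print(archiving_allowed(policy, "example_archiver", "/index.html"))
```

    With no `archive-agent` record in the file this returns True, i.e. it is still opt-out; flipping the default to False would give the opt-in behaviour argued for earlier in the thread.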

    --
    Article X: The powers not delegated... by the Constitution...are reserved...to the people