Slashdot Mirror


The Wayback Machine, Friend or Foe?

ShaunC asks: "As the webmaster of numerous sites, I'm curious how others feel about the Wayback Machine. What particularly interests me is the fact that the Machine is a relatively new animal, yet it contains snapshots from my sites dating back to 1998. I can't help but wonder: where did they get such old copies of my websites, and who gave them permission to make those copies? I certainly didn't provide either. Perhaps I'm too much of a purist, but I've always seen the internet as an ever-changing medium, not a permanent one. Archives have bothered me ever since the fledgling days of DejaNews." This site last made an appearance on Slashdot, earlier this year. Internet archival sites are right smack in the crosshairs of copyright, but they are useful. Anyone who has ever used Google's cache (and there are plenty of those links on Slashdot) can attest to this. Of course, the issue that may bug many content providers is how to opt-out of such services, since some see it as a copyright violation. Is it possible to balance the issues of copyright and history, or will these two Internet resources find themselves in legal trouble in the future?

"The way I see it, archives are much like SPAM; I never opted in, why should it be my responsibility to opt out? I manage a number of domains and the process of refining robots.txt files and submitting myself to the Wayback Machine for removal seems to be intrusive. Worse, domains I've abandoned (which have lapsed or been re-registered by someone else) are forever archived in the Machine and I have no way to exclude them. Why should I have to deliberately remove my copyrighted material from an archive which was never granted permission to replicate that material in the first place?"

508 comments

  1. Erm by adamwright · · Score: 3, Insightful

    Isn't this exactly the point of robots.txt? Google won't cache content it doesn't spider, and it won't spider content forbidden by your robots.txt. Does the WayBack Machine obey the robots rules?

    1. Re:Erm by JebusIsLord · · Score: 2, Informative

      Yes, it does follow robots.txt protocol. Therefore there really isn't a problem now is there?

      --
      Jeremy
    2. Re:Erm by JebusIsLord · · Score: 1

      Little karma whoring here, but if you are not familiar:
      Just make a file named robots.txt in your webroot and fill it with the following 2 lines:

      User-agent: *
      Disallow:
      This will prevent any webcrawler that is compliant (i.e., most of them) from indexing your site at all. Problem solved.

      --
      Jeremy
    3. Re:Erm by Anonymous Coward · · Score: 0

      The Wayback Machine claims to honor robots.txt files and meta tags, but there's no way to remove a site once it's in there. I had a site back in 1996 and I didn't know anything about robots.txt files back then. That site's long gone -- at least I thought -- but I found it with the Wayback Machine. You can't make a robots.txt file for a site that no longer exists.

    4. Re:Erm by JebusIsLord · · Score: 2, Informative

      Shoot, that should be:

      User-agent: *
      Disallow: /

      --
      Jeremy
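The difference between the two versions posted above is easy to check with Python's standard `urllib.robotparser`. This is only a minimal sketch: the `allowed` helper and the `example.com` URL are made up for illustration, and `ia_archiver` is the crawler name mentioned elsewhere in this thread.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# First version posted: an empty Disallow value excludes nothing.
EMPTY_DISALLOW = "User-agent: *\nDisallow:\n"
# Corrected version: "/" excludes every path on the site.
ROOT_DISALLOW = "User-agent: *\nDisallow: /\n"

URL = "http://example.com/index.html"  # placeholder URL

print(allowed(EMPTY_DISALLOW, "ia_archiver", URL))  # crawler is still allowed
print(allowed(ROOT_DISALLOW, "ia_archiver", URL))   # crawler is excluded
```

So an empty `Disallow:` line lets a compliant crawler in everywhere, which is why the correction to `Disallow: /` matters.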
    5. Re:Erm by HP+LoveJet · · Score: 2, Funny

      Clearly an RFC is needed here:

      "Retro-Temporal Automated User Agent Exclusion Protocol"

      I'll try to put a draft together by April 1.

      --
      spawn_of_yog_sothoth
    6. Re:Erm by 1g$man · · Score: 2

      Why do webmasters have to "opt-out" rather than "opt-in" to be cached?

      Shouldn't the default be "don't allow spiders and caching" ? And if I want it then I should specifically allow it.

    7. Re:Erm by MushMouth · · Score: 1

      Not True!
      The Wayback Machine will retroactively honor robots.txt.

    8. Re:Erm by zootread · · Score: 2, Informative

      Yeah, you can add a robots.txt file and ask them to remove your site and it'll be wiped from their records. The problem is, if you don't have access to the site anymore, you can't throw in the robots.txt file. But I just checked on a web page I requested they remove, which no longer existed, so I couldn't put up a robots.txt file, but I made the request anyway.

      It looks like the page has been removed! My guess is if you request to remove a page and it doesn't exist anymore, they probably will remove it for you. This web page revealed me as the pothead and pro-marijuana person that I was (and still am though in private) back in college. I was afraid my employers were going to find my old web page, but they're probably potheads too.. But still, it's good to be able to cover up the silliness of my past.

      --
      Zoot!
    9. Re:Erm by Anonymous Coward · · Score: 0

      the ia_archiver does abide by the policies set by robots.txt. all (pretty much normal) info is here:

      http://pages.alexa.com/help/webmasters/index.html

      the wayback machine is a tool developed to browse the archives collected by the ia_archiver. there was much discussion about how to display the results; whether to change the code to display the page(s) the way it was meant to be seen by a viewer, or to leave the code intact, making most pages unviewable.

      coming from a former, and one of the first employees, the intention is good, although some of the things it has been used for have been sketchy at times. lawyers are your friend... (:

    10. Re:Erm by kevinank · · Score: 5, Insightful
      The goal of the person who started archive.org was to record the history of the world wide web. The assumption was that whatever anyone thinks about the archive, there will never be another chance to go back and get that data once it is lost.

      The copies that they have archived in their databases are individual copies served from the original web requests, so they have the right to keep them. They became their copy when they were originally downloaded. Whether they have the right to make new copies and redistribute them depends on how you think fair use applies to that content.

      Ultimately if a lot of people start suing them they will probably shut down the archive to public access and only allow researchers to view their original copies on site. And if you'd prefer that, well, you'll end up with the world you deserve.

      --
      LibBT: BitTorrent for C - small - fast - clean (Now Versio
    11. Re:Erm by amRadioHed · · Score: 1
      Why do webmasters have to "opt-out" rather than "opt-in" to be cached?
      Shouldn't the default be "don't allow spiders and caching" ? And if I want it then I should specifically allow it.

      I would have to say that the reason is because the internet is a public medium. No one should need your permission to read your webpage. That goes for users as well as spiders. Of course, if you disagree, then you can deny spiders access via the robots.txt file, and most spiders are kind enough to respect your wishes.
      --
      We hope your rules and wisdom choke you / Now we are one in everlasting peace
    12. Re:Erm by Ross+C.+Brackett · · Score: 5, Funny

      Well, the default is to not plug your server into the Internet the first place, now isn't it? To quote Doug from Ghost World, "It's America, dude, learn the rules."

      Seriously, if someone's precious intellectual property - as if anything worthwhile was ever posted on the Internet in the first place - becomes compromised because they don't know a basic principle of how to run a website, well then boo hoo.

      It's worth the tradeoff. That the Wayback Machine exists is seriously cool, and some day will be of definite historical worth. If the occasional Brady Bunch erotic slash fiction author has to take a ride on the waaahmbulance because "A Very Brady Gangbang (M/m/F/f nc b/d)" got copied without their permission for the greater historical good, then that's a price worth paying.

    13. Re:Erm by dswensen · · Score: 5, Informative

      Yes it does, and how. In fact, immediately upon reading this story, I went to the Wayback Machine and checked out my personal website archive. There it was, material dating back to 1996 ("Oh God, no, not the digging man GIF!"). I made a new robots.txt file:

      User-agent: *
      Disallow: /
      # BITE ME WAYBACK MACHINE

      ... uploaded it, went back to the Wayback Machine, and got:

      Robots.txt Query Exclusion.

      We're sorry, access to [site] has been blocked by the site owner via robots.txt.
      Read more about robots.txt
      See the site's robots.txt file.
      Try another request or click here to search for all pages on [site]

      So, yeah, they seem to check the site for the most current robots.txt file before they show the archive. And if the robots.txt disallows archiving the site, ALL the entries are marked unavailable, not just the current ones.

      So, it's pretty easy to solve the problem of the Wayback Machine -- and probably without going balls-out with the "disallow everything everywhere" like I did.
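The behaviour described above, where the live robots.txt is consulted before any stored snapshot is shown, could be sketched roughly as follows. This is a guess at the logic based only on what the comment observed, not the Internet Archive's actual code; `serve_snapshots`, `is_blocked`, and the snapshot dictionary are hypothetical names invented for the illustration.

```python
from urllib.robotparser import RobotFileParser

ARCHIVE_AGENT = "ia_archiver"  # crawler name mentioned elsewhere in this thread


def is_blocked(current_robots_txt: str, url: str) -> bool:
    """True if the site's *current* robots.txt excludes the archive crawler."""
    parser = RobotFileParser()
    parser.parse(current_robots_txt.splitlines())
    return not parser.can_fetch(ARCHIVE_AGENT, url)


def serve_snapshots(snapshots: dict, url: str, current_robots_txt: str) -> list:
    """Return the stored snapshot years for `url`, or nothing if it is blocked now.

    `snapshots` is a hypothetical store: {url: {year: html}}.
    """
    if is_blocked(current_robots_txt, url):
        return []  # every snapshot is hidden, old ones included
    return sorted(snapshots.get(url, {}))


store = {"http://example.com/": {1996: "<html>old</html>", 1998: "<html>newer</html>"}}
print(serve_snapshots(store, "http://example.com/", ""))
print(serve_snapshots(store, "http://example.com/", "User-agent: *\nDisallow: /\n"))
```

Because the check uses today's robots.txt rather than the one in force at crawl time, a single new file hides every older snapshot at once, which matches what the comment reports.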

    14. Re:Erm by treat · · Score: 2
      Why do webmasters have to "opt-out" rather than "opt-in" to be cached?


      You are opting in when you make data publicly accessible. It is part of the implicit social contract, due to the nature of information, since it is such an obvious, natural, and desirable feature. A large proxy server will probably have several sites cached in their entirety. Retention time need not be considered at issue, due to the low cost of storage and the simply natural idea that if the information has even a slight value, it will recover the cost of storing it.


      When I view anything, it is my natural right, as much as access to air is, to be able to electronically retain a copy of it, if for no other reason than to aid my memory. You have no right to prevent me from retaining a picture I took that your car was in the background of.

    15. Re:Erm by M-G · · Score: 2

      Well, the default is to not plug your server into the Internet the first place, now isn't it?

      That's quite possibly the most perfect comeback I've ever seen....

    16. Re:Erm by g_attrill · · Score: 1

      You can ask Google to index but not cache a page:

      <META NAME="googleBOT" CONTENT="NOARCHIVE">

      Gareth
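For anyone curious how a crawler might read that tag, here is a small sketch using Python's standard `html.parser`. The `RobotsMetaParser` class and `may_cache` function are illustrative names, not part of any search engine's code, and treating the generic `robots` meta name the same as `googlebot` is an assumption made for the example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the comma-separated directives from <meta name=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # lowercased meta name -> set of lowercased directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name:
            parts = {p.strip().lower() for p in attr.get("content", "").split(",") if p.strip()}
            self.directives.setdefault(name, set()).update(parts)


def may_cache(html: str, bot: str = "googlebot") -> bool:
    """Return False if the page asks `bot` (or all robots) not to keep a cached copy."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives.get(bot.lower(), set()) | parser.directives.get("robots", set())
    return "noarchive" not in found


print(may_cache('<META NAME="googleBOT" CONTENT="NOARCHIVE">'))
print(may_cache('<meta name="description" content="my home page">'))
```

Tag and attribute names are matched case-insensitively here, so the mixed-case form quoted above is recognised.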

    17. Re:Erm by HD+Webdev · · Score: 1

      Dejanews always allowed people to opt out.

      None of my newsgroup messages were ever archived by DJN.

      --
      This is not a dream, not a dream...we are transmitting from the year 1-9-9-9.
    18. Re:Erm by KCRWreck · · Score: 1

      They may end up in the world THEY deserve, but the rest of us will end up in a world we DO NOT deserve.

      If it's not worth archiving why was it worth posting in the first place?

    19. Re:Erm by Qrlx · · Score: 2, Interesting
      I agree with kevin completely. What is wrong with having old copies of your site archived? Take this quote from the front page of this article:
      1. Perhaps I'm too much of a purist, but I've always seen the internet as an ever-changing medium, not a permanent one. Archives have bothered me ever since the fledgling days of DejaNews.

      I don't know what kind of a "purist" this person thinks they are. DejaNews (now google) is one of the *best* places to look for info that's relevant but not this week's headline. We might as well burn all the libraries to the ground, since they contain books with embarrassing misprints or factual errors.

      It might not be easy to get your site out of the Wayback machine, but it doesn't sound like it's impossible either. Consider the alternatives; would you rather live in a world where the past can be "updated" as needed, like the (purportedly reputable) New York Times did to the web version of a Sep. 9 story warning about Osama bin Laden. Right after September 11 they replaced it with a puff piece-- full details here. (Warning, contains links to the NYT registration-reqd pages and I think the content may have been re-scrubbed since this appeared on BuzzFlash.)

      If there's no record of content, how am I supposed to provide a bibliography or references for "something I saw on the web somewhere?"
    20. Re:Erm by Wyatt+Earp · · Score: 1

      I was going to say that very thing. Then I went and made a pizza and watched TV.

      I don't understand what the point of the story was about in all honesty.

      Someone puts something up on the web, and someone might see something later and he doesn't think that's fair? Crap, then let's yell at the browser makers for including the Save As Web Archive too.

      The Wayback Machine is cool. If you publish to the web, expect to be archived.

      "...where did they get such old copies of my websites, and who gave them permission to make those copies? I certainly didn't provide either. Perhaps I'm too much of a purist, but I've always seen the internet as an ever-changing medium, not a permanent one."

      That's a statement from someone that must think the "internet" is a nebulous "thing" that exists without "borders" or "rules" beyond "nations". And doesn't understand that it lives on pieces of spinning disks with magnetic particles and can be easily saved elsewhere.

      Frankly, it sounded more like a Jon Katz article.

    21. Re:Erm by dan+the+person · · Score: 1

      Exactly, he opted in when he chose to put his content up onto the public internet.

    22. Re:Erm by DaveOMatic · · Score: 1
      . If the occasional Brady Bunch erotic slash fiction author has to take a ride on the waaahmbulance because "A Very Brady Gangbang (M/m/F/f nc b/d)" got copied without their permission for the greater historical good, then that's a price worth paying.
      Sir, please stop maligning both my character and my work.
    23. Re:Erm by uncoveror · · Score: 2, Insightful

      I like the wayback machine's reason for being: preserving history. In 20 or 100 years, it will be very valuable information. I found old copies of my website, The Uncoveror there. It really took me back. What I didn't like, though, is that it tried to force-feed me spyware, namely Gator and Bonzi Buddy. If the spyware and ads were removed, then it would be a true historical archive; the kind real historians and students can use for research. With the garbage on it, however, it has little, if any, academic value.

      --
      The Uncoveror: It's the real news.
    24. Re:Erm by Anonymous Coward · · Score: 0

      >The copies that they have archived in their databases are individual copies served from the original web requests, so they have the right to keep them

      Complete and utter nonsense. Viewing my copyrighted work does *not* give The Internet Archive the right to keep a copy, no right whatsoever. That's 100% iron-clad pure fact.

    25. Re:Erm by Anonymous Coward · · Score: 0

      # robots.txt for Slashdot.org
      User-agent: *
      Disallow: /authors.pl
      Disallow: /index.pl
      Disallow: /article.pl
      Disallow: /comments.pl
      Disallow: /journal.pl
      Disallow: /messages.pl
      Disallow: /metamod.pl
      Disallow: /users.pl
      Disallow: /search.pl
      Disallow: /pollBooth.pl
      Disallow: /pubkey.pl
      Disallow: /topics.pl
      Disallow: /zoo.pl
      Disallow: /palm
      Disallow: authors.pl
      Disallow: index.pl
      Disallow: article.pl
      Disallow: comments.pl
      Disallow: journal.pl
      Disallow: messages.pl
      Disallow: metamod.pl
      Disallow: users.pl
      Disallow: search.pl
      Disallow: pollBooth.pl
      Disallow: pubkey.pl
      Disallow: topics.pl
      Disallow: zoo.pl
      Disallow: /~
      Disallow: ~

    26. Re:Erm by Anonymous Coward · · Score: 0

      Wow, and you just admitted to being a pothead on a page that will end up in Google's cache forever. Great idea.

    27. Re:Erm by guttentag · · Score: 3, Informative
      A number of people who don't want their content archived by the Internet Archiver may still want search engines to direct traffic to their sites (The Washington Post does this). If that's the case, use this in your robots.txt file:

      User-agent: ia_archiver
      Disallow: /

      Most (all?) search engines provide information on how to specifically exclude their spiders (while allowing everyone else). Just go to the engine's site and search for info on how they treat robots.txt.
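A quick way to confirm that such a rule blocks only the archive crawler is to run it through Python's standard `urllib.robotparser`. This is a sketch only: the `can_crawl` helper and the `example.com` URL are invented for the example, and `Googlebot` is used simply as a stand-in for any other crawler.

```python
from urllib.robotparser import RobotFileParser

# The two-line file suggested above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def can_crawl(agent: str, url: str = "http://example.com/page.html") -> bool:
    """Return True if `agent` may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


print(can_crawl("ia_archiver"))  # the archive crawler is excluded
print(can_crawl("Googlebot"))    # other crawlers have no matching rule, so they are allowed
```

Since there is no `User-agent: *` entry, any crawler not named in the file falls through to the default of being allowed.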

    28. Re:Erm by Anonymous Coward · · Score: 0

      Exactly. So why is this little shot bitching about something so useful and really available. Just shut the fuck up you little turd.

    29. Re:Erm by imperator_mundi · · Score: 1

      "verba volant scripta manet"

    30. Re:Erm by Lumpy · · Score: 2

      Here's the thing that needs to be sorted out. The internet is globally public. If you put anything out there that is not password or access protected it is public domain, property of the world's population. (Just like that handbill pasted to a wall or telephone pole, your website is nothing more than that.) We really need to get some sane laws and regulations out there: if it's publicly displayed, you have no control over how many copies of said page are copied, distributed, or used. (The individual graphics are different, let's use (GASP) existing copyright laws to protect them.) But the snapshot in time of the page you produced is mine, my neighbor's, and that stinky kid down the street's property now.... Don't like that? Get the hell off the web you loser.

      This is the crux and design of the internet. No laws passed can change this. Until you start an advertising campaign that the internet is NOT a community, it is NOT open to the world, and IS the property of corporate America and equivalent to walking in a store.. (and with entry webpages stating that.... just like real stores (GASP AGAIN... making people responsible for their actions? the Horrors!))

      So to the person who submitted the story, quit your whining you big baby, if you don't want your webpage viewed and reproduced publicly... don't make it public.

      --
      Do not look at laser with remaining good eye.
    31. Re:Erm by julesh · · Score: 1
      Seriously, if someone's precious intellectual property - as if anything worthwhile was ever posted on the Internet in the first place - becomes compromised because they don't know a basic principle of how to run a website, well then boo hoo.

      Which basic principle are you talking about? The one that states, presumably, that if you don't want somebody to download an old copy of your site and then redistribute it to the public without requesting your permission for it you should utter some obscure incantation and put in a file called 'robots.txt' on your server as a charm against such things happening?

      Please. Give me a break. Without knowing that this site exists, which I suspect is the situation that 99.9% of web publishers are in, you cannot know what you need to do to prevent the archival of your data without affecting other perfectly legitimate uses of the data.

      Yes, it's a good cause. That's the only reason why they don't get their asses sued over this, because what they are doing is illegal. They have no right to distribute content that is my copyright property in this fashion. I could sue them, and would almost certainly win, if I had any inclination to do so.

      They legally need permission of the copyright holder to do the things they do. No lack of action by the copyright holder can grant them this permission. Not setting up a robots.txt file is not enough.

      "Basic principles of how to run a web site" are not allowed to override laws. Sorry to bring you down to earth on this one, but that's just the way it is. Don't get too upset if these people get sued for copyright violation, because that is what they are doing. Their use goes way beyond "fair use" as it is set down in the laws of most civilised countries.

    32. Re:Erm by jbarr · · Score: 1

      I don't think it's the "precious intellectual property" that people are worried about. It is the stuff that we don't want to remember that worries us!! ;-)

      --
      My mom always said, "Jim, you're 1 in a million." Given the current population, there are 7000 of me. God help us all!
    33. Re:Erm by rapid+prototype · · Score: 1

      you seem to be arguing for this scenario:

      step 1: you put your copyrighted content on the WWW.

      step 2: i visit your website and your copyrighted content goes into my cache.

      step 3: i make my cache available on the WWW.

      step 4: you sue me for copyright infringement.

      it actually DOES make a little sense. i mean, i could understand a search engine providing a LINK to your WWW page based on the KEYWORDS of the site which they have cached, but i really don't know how they can get away with republishing cached COPIES of your WWW page if those pages are copyrighted.

      -rp

    34. Re:Erm by ebonkyre · · Score: 1
      >Ultimately if a lot of people start suing them they will probably shut down the archive to
      >public access and only allow researchers to view their original copies on site.

      This is pretty much how the WayBack Machine operated until very recently. While you didn't have to be "on site" to access the archives, you *did* have to submit a request explaining what you were researching and why you needed access. It was only opened to the public in the last year or so.

      --
      "Time is an abstract concept devised by carbon-based lifeforms to monitor their ongoing decay." - Thundercleese
    35. Re:Erm by nanojath · · Score: 1

      I think you hit this one on the head. The fact is if we stand on the small print and to the letter on copyright violations with regards to the internet, the internet becomes worse than useless. I mean, technically I'm violating the hell out of copyright right now - I've no doubt there are hundreds if not thousands of cache files of copyrighted materials my browser has stored for convenience of recall. So instead of getting all huffy, we accept all the purely systematic copying, and for the usefulness of things like Google we accept that if we want to avoid "robotic" scrutiny we have the option to opt out - and every legitimate agency will obey these means.

      --

      It Is the Nature of Information to Transgress Artificial Boundaries

  2. Yummy by sheepab · · Score: 2, Informative

    Slashdot from 1997.

    1. Re:Yummy by quintessent · · Score: 2

      Very nice. And it's good to know they were using the same careful journalism back then. I like this headline:

      Judge Uninstalls IE in 90 seconds.

    2. Re:Yummy by mongoks · · Score: 1
      Already /.'d.

      "Even in the future nothing works!" - Dark Helmet

    3. Re:Yummy by digitalsushi · · Score: 2

      In the process of digging this up, you have also apparently answered the question of "who originally archived this", as the bottom of the page has a "welcome user from .alexa.com" footer.

      --
      slashdot: where everyone yells sarcastic metaphors to themselves to understand the issue
    4. Re:Yummy by Anonymous Coward · · Score: 0

      What? No cowboyneal option?

    5. Re:Yummy by egreB · · Score: 1

      And how about Google from 1998? It's really great!

  3. "The Wayback Machine" by pb · · Score: 3, Informative

    "The Wayback Machine" has been a pet project for a long time, and we're only now seeing results. I know for a fact that they have pages back at least as far as 1996, and it's a damn shame they don't have anything that much earlier...

    And yes, it obeys the Robot Exclusion Principle.

    "Ask Google" strikes again; I would hope that you could find all of this information by searching, or reading an "About" page, or something. Fortunately, these abortions to journalism don't appear on the Front Page very often.

    --
    pb Reply or e-mail; don't vaguely moderate.
    1. Re:"The Wayback Machine" by Disevidence · · Score: 4, Insightful

      I think the question is not about its being publicly available, but rather about it archiving web pages that were taken down at later dates for various reasons.

      It's legally grey, and all it really takes is for some paranoid person to sue, and then the fireworks start.

      IANAL.

      --
      Think nothing is impossible? Try slamming a revolving door.
    2. Re:"The Wayback Machine" by martyn+s · · Score: 4, Insightful

      So I suppose libraries should just stop carrying books because the author doesn't like what he wrote anymore? I mean, what the fuck?

    3. Re:"The Wayback Machine" by Disevidence · · Score: 2

      I'm not saying what's right or wrong, I'm saying he could possibly sue however.

      Person A puts up a website about X. Wayback Machine archives this website. Later, unbeknownst (sp?) to the Wayback Machine, Person A is sued by someone who controls or has copyright on X, and it's taken down. Yet that copy is still on the Wayback Machine.

      What if the Wayback Machine archives a link to DeCSS? They do archive forums, I've checked numerous old old posts of forums through the wayback machine.

      I just think its a bit iffy about archiving stuff...

      --
      Think nothing is impossible? Try slamming a revolving door.
    4. Re:"The Wayback Machine" by Reality+Master+101 · · Score: 2

      So I suppose libraries should just stop carrying books because the author doesn't like what he wrote anymore? I mean, what the fuck?

      The issue is more akin to a library making a copy of a book and giving out copies of that copy to anyone who asks.

      --
      Sometimes it's best to just let stupid people be stupid.
    5. Re:"The Wayback Machine" by rodgerd · · Score: 2

      Actually, if a book is declared obscene or libellous, a library may well stop carrying it, and the Wayback machine has the same problem.

      And while it is sometimes delightful that it preserves things that, eg, Big Companies may prefer we didn't see, it's less delightful that the ramblings of a 17 year old's blog may come back to haunt them years later...

    6. Re:"The Wayback Machine" by Mr+Windows · · Score: 2

      Possibly sue for what? It's not libellous to (truthfully) say "n years ago, so and so said 'whatever'".

      It's always been the case that "if you don't want a future potential employer to read it, don't put it out in public". If a newspaper prints a libellous story, they issue a retraction, they don't seek out and destroy all copies of the paper.

    7. Re:"The Wayback Machine" by Rick+the+Red · · Score: 5, Insightful
      No, the issue is more akin to a library carrying newspapers and magazines for years, and their publishers suddenly telling the libraries "those copies are out of date, stop letting people read them." Why? If you didn't want anyone to read it, why did you put it out on the web?

      Are you ashamed of what you did back then, when you were young and foolish? Grow up -- we're all ashamed of what we did when we were young and foolish, and years from now you'll be ashamed of what you're doing today. Get over it.

      Personally, I think archives are great. Whenever I design an application I always ask about archiving, because inevitably they're gonna want it and it's easier to design in from the start. Oh, you want to know what your top 10 customers ordered last Christmas? Now you tell me! Geeze, we flushed that data last February, 'cause you said once the credit card cleared you didn't care to pay for the storage. But I digress.

      Someday your next client will want examples of your previous work, then you'll go crawling on your hands and knees to the Wayback Machine, begging them to show you what your pages looked like. And they'll honor your robots.txt file and tell you to get lost.

      --
      If all this should have a reason, we would be the last to know.
    8. Re:"The Wayback Machine" by scotti · · Score: 1

      Should we erase our history or allow it to be lost forever? A copyrighted material does eventually become public domain. But public domain can be circumvented if somebody is allowed to disallow copies or archives of their material.

      Let's say that I write a really keen book that everybody likes. I encrypt it with a really good ebook encryption that's not likely to be broken in the next 1000 years. I allow people to use my website to read the book but after a year a depression sets in and people can't afford to read my book and I go out of business because I can't make enough money. My site goes off line and my clear text version of the book ends up getting deleted through neglect. It's gone without hope of it being used for public domain or anything.

      Of course this stuff happens right now. We call it abandonware. Since the source is not published or archived, the ideas behind it are simply lost someday because the writer dies or decides to delete it.

    9. Re:"The Wayback Machine" by Anonymous Coward · · Score: 0

      Good idea, let's delete everything from history that we don't like.

      What about newspapers? They are replaced daily, but we keep them for reference purposes. If a newspaper article is libellous, the original copy can still be viewed, so what's the difference with web pages?

    10. Re:"The Wayback Machine" by Anonymous Coward · · Score: 0

      Yeah, but I didn't know that all my web forum posts from 4 or 5 years ago would be archived. Think of how much more careful you'd be in a conversation with someone if you knew they were taping it.

    11. Re:"The Wayback Machine" by budgenator · · Score: 2

      IANAL, but I don't think you can sue someone because they truthfully reported that you were stupid, or did something embarrassing a few years ago; but then again I didn't think you could be sued for serving your hot coffee hot either.

      But on the other hand, someone just asked me if I might work on some web pages they had about a year ago, and sure enough there they were, with just a few broken images. This could be useful.

      We'll just have to remember that anything posted is forever now, or at least until they run out of storage space

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    12. Re:"The Wayback Machine" by Lord+Ender · · Score: 2

      If you posted it to a publicly accessible web page, then you gave everyone in the world permission to copy it, in my opinion. Anybody who viewed your page made a duplicate of it in their browser cache. And copying is copying. Don't make it publicly available on the web if you want to restrict the copying of it. Ass.

      --
      A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
    13. Re:"The Wayback Machine" by Richard_at_work · · Score: 1

      hmmm were you once employed by Enron?

    14. Re:"The Wayback Machine" by Disevidence · · Score: 2

      Im an 18 year old Australian Uni Student.

      So yeah, how did you work it out?

      --
      Think nothing is impossible? Try slamming a revolving door.
    15. Re:"The Wayback Machine" by Anonymous Coward · · Score: 0

      Umm, yeah. When I first started online, I know a lot of people, including myself, posted resume information on their web pages. Being in college and all, it was sort of the thing to do, esp. when trying to get a job (times then were different than they are now). In '96 and '97, being able to make a web page was considered decent enough as a job skill. And we included information on there that we NEVER would even think about including in a resume today, much less online and esp. not on a web page. Like addresses (some still valid, as they were parent addresses), social security numbers (real stupid, I know, but it was done).

      Times change. I wouldn't want that info there.

    16. Re:"The Wayback Machine" by gerardrj · · Score: 2

      No, the issue is more like the library deciding to sell copies of the books it carries, without the author or publisher's permission.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    17. Re:"The Wayback Machine" by global_diffusion · · Score: 2

      I think the question is not about its being publicly available, but rather about it archiving web pages that were taken down at later dates for various reasons.

      If this is the question, it implies that there is content out there that should be unavailable to certain people. I strongly disagree with that because I feel that data should be free. From anything like Bertrand Russell papers to kiddie porn, if it was on the net, it should be considered part of the net. If we select what should be part of our archive by select standards, we are in effect choosing how we want history to look at us. I say that we should store all the data and allow interpretations of it to change over time. Ask any anthropologist or historian and I bet you that they would love it if everything, even the obscene, had been recorded. If we want a true picture of the net, we need to include everything.

      Copyright on the web is a silly notion. If you put something on the "world wide web," then it is public to the world. You can't just take it off and expect it to disappear. If you take that idea to the extreme, next we would have people suing us for not deleting their websites from our browser cache. Copyright is silly. I just don't get it these days.

    18. Re:"The Wayback Machine" by martyn+s · · Score: 2

      I dunno, I take a look back at my early usenet posts, and although I may blush a little, and I'm embarrassed about it, I just deal with it. If it's just a matter of people knowing that they're being archived, then I can solve the problem very easily: You are being archived. Consider yourself informed.

    19. Re:"The Wayback Machine" by martyn+s · · Score: 2

      As someone said in another post, I think it's tragic when any piece of information, no matter how trivial, is lost forever.

    20. Re:"The Wayback Machine" by Anonymous Coward · · Score: 0

      Yes, as soon as content such as mp3s also become available on the web, everyone in the world should also be free to copy that as well. Ass.

    21. Re:"The Wayback Machine" by Suppafly · · Score: 2

      No, it's not. Reading anything on the internet constitutes copying, and anyone that places information on the internet knows or reasonably can be expected to know that, so the issue of redistribution doesn't really apply. So if you are going with a book analogy, it really is just like a library allowing people to come in and read books.

    22. Re:"The Wayback Machine" by Suppafly · · Score: 2

      No, the issue is more like the library deciding to sell copies of the books it carries, without the author or publisher's permission

      But it's not like that at all, since they aren't selling anything. Reading anything on the internet constitutes copying, and anyone that places information on the internet knows or reasonably can be expected to know that, so the issue of redistribution doesn't really apply, and one can't legitimately complain about copying since using HTTP implies you want stuff to be copied.

      So if you are going with the book analogy, it really is just like a library allowing people to come in and read books without the authors' or publishers' permission (which is just fine with physical books).

    23. Re:"The Wayback Machine" by Rick+the+Red · · Score: 2
      Gee, I must have missed the part where the Wayback Machine charges to look at their archive. I guess I'm stealing from them, eh? Or else maybe they're not selling anyone's copyrighted work.

      Oh, and every year I pay for my public library, whether I use it or not, in my property taxes. So in a sense they are selling me those books (only I don't get to keep them, so I guess I'm just renting them).

      The Wayback Machine doesn't even do that; they just let you see them for free, just like you could see them for free at the original sites. Imagine that! They're giving away free what the copyright holder gave away for free. Now, you'd have a case if they archived pay sites and let you see them for free -- point me to their pay-for-porn collection!

      --
      If all this should have a reason, we would be the last to know.
    24. Re:"The Wayback Machine" by gerardrj · · Score: 1

      Okay, so take the money out of the equation. What if the library started making free copies of the newest best-sellers to give away?

      Reading on the Internet constitutes temporary copying. In most cases the copy goes away when you close the browser, or within a day or two depending on the size of your cache.

      Copying and republishing/redistributing are two different things, as I've seen pointed out elsewhere in this discussion:

      You may make several copies of a song from a CD (perhaps one on your computer, one on a "mix" CD, one on your MP3 player), but you still haven't distributed it.
      On the other hand, one copy can be distributed via mass media without other copies being kept at the receiving end.

      Your last line's analogy is the first phase, where I published something on-line and you go to read it. The problems come in that this company is making a copy in the archive and then allowing others to make copies, circumventing the original publisher.

      If you want a real-world example... try to get the back issues of the New York Times (or some equally prestigious periodical, movie or television show). Make unlimited copies of those back issues. Set up a catalog or web site where you sell or give those copies to anyone who asks for one. I guarantee you will be slapped with a copyright infringement lawsuit. I can almost guarantee you would lose such a battle if you challenged it.

      The only difference between that scenario and what Wayback is doing is that the content at Wayback was never in the physical world (on paper, film, video, actors performing, etc.). There seems to be this illusion today that the Internet is a special place where the real-world rules don't apply.
      The Government seems to think this too at some level, judging by the way they allegedly are attempting to tap it, or censor it in ways that are illegal in the real world.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    25. Re:"The Wayback Machine" by gerardrj · · Score: 2

      The money is not the issue. Selling, giving, loaning whatever. The works are not theirs to do any of these with. The fact that I allowed you to view my web page at one point does not extrapolate to your right to make that page viewable by others on demand for eternity.
      The library neither copies nor republishes the works they hold. The Wayback Machine does both.

      You drive your car in public. Does that make it public domain? Would you report it stolen if I borrowed it (without asking) for a drive to Montana for a week, even though you have two other cars (essentially copies)?

      Every year I pay property tax also. Much of it goes to schools, but I don't have kids. Are they selling me the schools, or the children? Can I enroll in high-school classes as a refresher course? But again, money is not the issue.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    26. Re:"The Wayback Machine" by Lord+Ender · · Score: 2

      Posting your own intellectual property to the web (or giving someone else permission to do so) is much different than posting someone else's. Think about it. Totally different.

      --
      A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
    27. Re:"The Wayback Machine" by pjrc · · Score: 3, Insightful
      Are you ashamed of what you did back then, when you were young and foolish?

      I am. Well, sorta anyway. My site has all of the pages that have ever appeared, all the way back to 1995. For example, this circuit board schematic page got a lot of hits in 1995. For years, I got emails from people who attempted to build it... a few were successes but most were failures. So, in 1997 I redesigned the board/schematic so that it would be much easier to build and troubleshoot, and then I made another new rev in 1999 (because the flash rom chip became obsolete).

      Based on lots of user feedback, I redesigned it yet again in 2001, mainly to increase the speed, add more memory to be C compiler friendly, and I added the most user-requested feature, a port to plug in a standard LCD.

      Today, those old pages (well, still need to update the '99 ones) have a message at the top of the page that tells the visitor they're viewing obsolete material and strongly suggests they follow a link to the new version of the circuit board, which is easier to build (added in 1997), uses parts that are currently available on the market (added in 1999), and has more features (added in 2001).

      An archive of the original 1995 page, even archived in 1996, isn't going to warn the poor user about the usability improvements added in 1997, the part that became obsolete in 1999, and the nice new features that were added in 2001. At the very least, it'd be proper for archive.org to link to the current version of the page (if it's on-line)... but even that would be difficult since the site moved from a university to its permanent domain name in 1999 (the old site kept a redirect for a couple years, but even that is gone now).

      So, while it sucks that someone might find that old material and suffer through all the problems that have been corrected and miss out on the improvements of the last several years, it doesn't suck enough that I'd hire a lawyer, or even bother to tell them to exclude my material.

      But I can understand how a large company would not want its old products displayed with the then-current literature in a way that might confuse potential customers.

    28. Re:"The Wayback Machine" by Anonymous Coward · · Score: 0

      That's not the same thing at all. I mean, what the fuck? Are you that much of a fucking dumb fuck child-molesting Taliban supporter? God are you fucking stupid!
      This is like COPYING a book on a COPY MACHINE like what Canon and Xerox manufacture. Then taking that copy and making your own LIBRARY from those ILLEGAL copies for the sake of making copies.

    29. Re:"The Wayback Machine" by Rivard · · Score: 1
      If you want a real-world example... try to get the back issues of the New York Times (or some equally prestigious periodical, movie or television show). Make unlimited copies of those back issues. Set up a catalog or web site where you sell or give those copies to anyone who asks for one. I guarantee you will be slapped with a copyright infringement lawsuit. I can almost guarantee you would lose such a battle if you challenged it.

      I thought this at first, but it's a different medium. With the New York Times (the physical paper) they print copies, send them to libraries, the libraries pay for the subscription and allow anyone with a library card to come in and look at the Times' articles.

      But the Wayback Machine will take nytimes.com and archive what is shown to the world, for free (let's, for a second, ignore that nytimes.com requires registration to read articles--this is a separate issue, but an interesting one). What did nytimes.com do? Instead of "slathering ink on dead trees" they pushed some pixels onto a screen. They made their content available via a web server that served the thousands of people that visit their site each day.

      But the server is different from the paper. The server isn't a one-to-one supplier, it is a distribution center for content. The Wayback Machine re-distributes that content in the manner in which it was originally distributed. Yes, it does things differently from the New York Public Library, because it can serve thousands of people at a time, but that is the medium in which the content was presented, to be distributed to thousands of people at a time.

      The Wayback Machine does what a library does in a medium-specific manner. It catalogues the content that it receives from the site, content that is willingly and legally put there by willing writers, and hands it out to any person visiting their library.

      However, there are interesting legal arguments. For instance, nytimes.com charges a fee for viewing (via web browser, and, optionally, saving and printing of) articles. If the Wayback Machine curtails that business by offering those articles for free, is it doing something illegal? And, on a similar note, if it does archive nytimes.com articles that are available to subscribers only, should it now allow that information to be accessed only by nytimes.com subscribers?

      But taking a non-registration-requiring site, like the Washington Post, and doling out the information they put out seems like a perfectly legitimate, docile and resourceful tool that can only be considered valuable for future generations.

      The web is acid-paper, only worse. With each second paragraphs are moved, sentences shifted, facts corrected, new things added. This provides for the consistent-with-the-now nature that is the Internet; however, there is no point in relishing what is now if we have no idea what was.
    30. Re:"The Wayback Machine" by anshil · · Score: 2

      Well, to be honest, that's a bit of nonsense; somebody _sees_ anyway that he is viewing a site from, for example, 1996.

      Take magazines, for example. I really have old ones in my cellar, and it is so funny to take a computing magazine from 1992 or so and read it (for example Powerplay, if anybody remembers it), when they, e.g., discuss how BardsTale 3 on the CPC is the most fantastic thing ever seen.

      Do I get confused by this magazine? Certainly not. I look at the front cover and see the date the information applies to. Same as reading very old newspapers, or old social magazines like the ones lying on the tables in doctors' waiting rooms. Does anybody get confused? No, anybody can read the date on the front cover. Like last time at the doctor's, I read a magazine from 1999 telling about the Y2K problem etc. It's very funny today.

      No, what you're telling us is to burn down the Powerplays in the cellar, the magazines at the doctor's, and the old newspapers in the libraries because somebody might get confused by not reading state-of-the-art information. Welcome, 1984.

      --
      Karma 50, and all I got was this lousy T-Shirt.
    31. Re:"The Wayback Machine" by Anonymous Coward · · Score: 0

      My GOD are you stupid. I guess at Ohio State you can be a real slack-jawed yokel fuck and still get in. That logic is so fucking stupid I can't even imagine what kind of a stunted brain you have. Did your Mother eat too much dog shit when she was pregnant with you or what?
      If you make a movie and it winds up in the theater, does that give everyone the right to bring in a camera to copy it so they can encode it in divx and distribute it on IRC? What the fuck you moron?
      I've got a good career move for you. Join the GOP, they need real genius like yours.

    32. Re:"The Wayback Machine" by gerardrj · · Score: 2

      Your comment that "...this is a different medium..." is at the heart of my arguments...
      Why should publishing on the Internet have any more or fewer rights and restrictions than real world publishing? Just because the new medium requires something that is illegal in the real world does not necessarily mean the real-world rules should suddenly not be germane.

      It's also interesting that you mention the Washington Post, as they have "opted out" of the Wayback Machine.

      Wayback does differ in several major ways from the operation of a standard library:
      1. A library does not provide, allow or condone copying and/or redistributing their content (books, periodicals, reference materials) except for fair use. You may use the copies they have rights to. They do not produce as many copies of a work as there are patrons wanting to use it.
      On Wayback, the copies you implicitly get rights to when viewing a web page are the one in memory and on your screen, and the one temporarily in your browser's cache. You have no distribution rights, or rights to make any other copies beyond fair use (such as to a backup CD or tape).
      To stay within their rights, you would have to go to the Wayback building and look at the page on their computer that captured/cached the site/page.

      2. A library does not keep all copies of a work indefinitely. They rotate stock to keep up to date.
      The Wayback is specifically attempting to maintain all data, even if wrong, false or outdated. Your library will destroy and replace such items.

      3. A library purchases content (generally) through channels that specifically know the purchase is for a library, and certain special rights may be conveyed or restricted.
      The Wayback is taking pages offered for one use and using them for something else entirely. They further make, or condone the making of, multiple copies while redistributing the works.

      4. A Library never alters stored works.
      Due to space, technology and other limitations, not all pages in the archive are re-rendered as initially offered to the public. This is tantamount to re-writing those pages, and could be considered plagiarism.

      There's also common knowledge. The public at large is well aware that if they write a book, it may well end up in a library. Wayback has no such ubiquity, and they seem happy to remain in the background, collecting these pages without most authors' knowledge.

      I'm not for shutting down the Wayback, or others like it. I'm just saying that overall things would be less legalistic (is that a word?) if they would simply take a solid "opt-in" stance. That is, they would only store pages that specifically allow them to do so, via the robots.txt, meta information, sign-up at their site, whatever. This surreptitious gathering of copyrighted works, and the questionably legal issue of re-distributing them in whole or in part, possibly altered from their original form, is just not right.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    33. Re:"The Wayback Machine" by LordLucless · · Score: 1

      Actually, it's more like a newspaper giving away copies of its paper one day, then demanding that anyone who received a copy burn it the next day because it's now out of date.

      Archiving doesn't hurt anyone financially - it was free when it was archived, it's still free now.

      Archiving doesn't steal anyone's glory - archives usually give the URL the page came from, so you know whose work it is.

      Is there a way archiving can actually devalue the original? If there is, there might be a valid reason for complaining about it, but I can't find any.

      --
      Just because you're paranoid doesn't mean there isn't an invisible demon about to eat your face
    34. Re:"The Wayback Machine" by shilly · · Score: 1

      I'm sorry to say this, but your point 2 clearly shows that you have absolutely no idea how a library works. While *some* libraries will rotate *some* content, many libraries deliberately store old versions of texts, even where newer versions are available. Have you never heard of copyright libraries? They must surely exist in the US too--libraries, such as the British Library and the University Library of Cambridge (that's England, not MA) that are entitled by law to hold a copy of any copyrighted text published in (in the case of these two) the UK.
      It's not just these particular libraries either--most libraries have a policy of updating some texts and keeping others for historical reference. That's why you can go to a library and read gardening manuals or travel guides or chemistry textbooks written five, ten or 20 years ago. As for newspapers...as you acknowledge, libraries frequently keep back issues of newspapers. Many also have an archival policy for newspapers to allow access to old material (e.g., microfiching copies more than a couple of years old). Content providers have no say in this matter. And in general, they should have no say on the web either.

    35. Re:"The Wayback Machine" by dossen · · Score: 1

      Exactly. The wayback machine is just a reasonable extrapolation of these principles to the context of the web.

      One of the important aspects of copyright is that it is limited in time, and something WE give authors in exchange for their work. If there are no libraries and archives, then there will likely not be a copy of the work, once it enters the public domain. One would expect that the retro-active robots.txt feature on the wayback machine is limited to the term of copyright (so in... what... 65 years or so... they should allow free access to the first sites captured, robots.txt or not).

      Allowing access now, while respecting the common ways of restricting such use, is consistent with the function of a library, adapted for the web.

    36. Re:"The Wayback Machine" by Anonymous Coward · · Score: 0

      >Excatly. The wayback machine is just a reasonable extrapolation of these principles to the context of the web.

      It's an utter circumvention of my rights as Copyright holder.

    37. Re:"The Wayback Machine" by Anonymous Coward · · Score: 0

      > Or else maybe they're not selling anyone's copyrighted work.

      Which means absolutely nothing. What they're doing is a violation of my rights as copyright holder. Simple fact. I own my work, I retain all rights to that work. I didn't pass any of those rights on to Wayback. They have no right to archive my work without my permission.

    38. Re:"The Wayback Machine" by Anonymous Coward · · Score: 0

      libraries do this -- it's called a library. The library obtains a copy of the book legally. Someone goes to the library, gives the lady his card, she stamps the due date on the back flap, he takes the book home, *READS IT*, takes it back and then someone else does the same thing all over again. And the author doesn't get paid by either one. What's more, a library may have multiple copies of a book, and until its patrons start getting 503, they will keep downloading the books!

    39. Re:"The Wayback Machine" by Anonymous Coward · · Score: 0

      If you loan me your car to drive to Montana, what if someone goes with me?

    40. Re:"The Wayback Machine" by Anonymous Coward · · Score: 0

      you are aware that when someone goes to an archive program and punches in 1995, it is reasonable to expect them to know that what they are viewing is an archived copy from 1995 and not the 2001 update, correct?

    41. Re:"The Wayback Machine" by Hassan79 · · Score: 1

      It's legally grey, and all it really takes is for some paranoid person to sue, and then the fireworks start.

      Hmmm... Imagine that there would be a Web Content Writers Association Of America (WCWAA). Then we would have the biggest copyright lawsuit in human history ;-)

      --

      Don't drink and su! antidisestablishmentariazationally
    42. Re:"The Wayback Machine" by gerardrj · · Score: 1

      It's not the archiving that I think is wrong. It's the redistribution.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    43. Re:"The Wayback Machine" by LordLucless · · Score: 1

      I still can't see a problem.

      The content was originally being distributed for free.

      It is still being distributed for free, and the author is still being credited.

      --
      Just because you're paranoid doesn't mean there isn't an invisible demon about to eat your face
    44. Re:"The Wayback Machine" by gerardrj · · Score: 2

      The problem is that money is not the issue. Copyright law still pertains. I did not give them any permission to redistribute my site.
      The footer on my site even says so. The funny bit was they reproduced the footer stating that what they were doing was prohibited use.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    45. Re:"The Wayback Machine" by gerardrj · · Score: 2

      Let me take money out of the equation with another example:

      An artist takes a picture. He then makes 300 prints of that negative.

      He gives away the prints to the first 300 people who come asking for them.

      Can those 300 people then make copies of their prints and give them away?
      As the developer, can I run more prints from the negative and sell or give them away?

      In both cases, the people giving away the copies they made are violating copyright law.

      In this example, the negative is analogous to the initially published web site; the prints to the archived site.

      As I've said in many of the other threads, the money is not the issue. The issue is copying rights, who gets them, when and why. You only get rights to my work that I explicitly transfer to you and those of fair use.
      In publishing my web site, I automatically have copyright to the content. When I offer the pages on the web, I implicitly transfer to the viewer the right to view that content, and to cache it in RAM and on disk for a relatively short period. I understand that caching is an inherent part of the medium in which I publish my work. You also have certain rights under fair use. You may discuss my work, reference it in your own works, make backups, cite portions of it, etc...
      Nowhere do I explicitly, by reference or implication, transfer rights whereby you can re-transmit any or all of my content.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    46. Re:"The Wayback Machine" by Rick+the+Red · · Score: 2
      Simple fact. I own my work, I retain all rights to that work. I didn't pass any of those rights on to Wayback. They have no right to archive my work without my permission.
      Gee, when I visited those sites (c. 1995) I don't recall seeing anything on them that said I didn't have the right to read them. If you don't want people to read your stuff, don't put it on the Internet! If you put it on the World Wide Web, don't be surprised if someone takes that to mean you don't mind if anyone reads it. Remember, if you put a "robots.txt" disclaimer, they'll honor your request.

      --
      If all this should have a reason, we would be the last to know.
  4. Robots.txt by mshowman · · Score: 5, Informative

    I had recently placed a restricted robots.txt file on my site and when trying to access any of the past revisions, I get a message saying that the owner has restricted access to the site via robots.txt. They seem to have that aspect under control.
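
    For anyone wondering what such a "restricted" robots.txt looks like and how a crawler would read it, here is a minimal sketch using Python's standard urllib.robotparser. The user-agent name ia_archiver and the example.com URLs are assumptions for illustration only; nothing in this thread confirms which agent name the archive's crawler answers to, so check their own removal instructions before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one archiving crawler entirely,
# and keep every other robot out of /private/ only.
# "ia_archiver" is an assumed agent name, not taken from this thread.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "SomeOtherBot"):
        for url in ("http://example.com/index.html",
                    "http://example.com/private/notes.html"):
            verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
            print(f"{agent:12s} {url} -> {verdict}")
```

    This only models the crawler's side of the check. Hiding already-captured snapshots, as described above, is a policy the archive applies on top of it.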

    1. Re:Robots.txt by Dwedit · · Score: 1

      Tell that to the squatters who bought up great old sites and restrict their spamsites from being viewed by a new robots.txt file!

    2. Re:Robots.txt by spacefight · · Score: 1

      Remove the robots.txt and the Archive is back online... contrary to what they said about also "deleting indexed content". Liars.

    3. Re:Robots.txt by MulluskO · · Score: 1

      Next thing you know they'll be putting all the robots in reservations, and I know you humans won't honor those treaties you signed.

      -Speaking as a brain in a robot body (sealab)

      --

      Too busy staying alive... ~ R.A.
    4. Re:Robots.txt by ShaunC · · Score: 3, Interesting
      Sigh. This, I suppose, is what happens when Slashdot keeps stories in the queue too long:
      2002-03-30 10:12:57 The Wayback Machine, friend or foe? (askslashdot,news) (accepted)
      At the time, I was having severe problems getting in touch with anyone at The Wayback Machine. Yes, their site makes it quite clear how to have your site removed. Yes, I placed the appropriate entry in my robots.txt files. Yes, I submitted my sites for exclusion. Then nothing happened. After emailing them several times with a list of domains I'd prefer to have removed from the archive, I got a reply back saying they should disappear by the end of the following day. No go.

      That's all changed. They've got the kinks worked out, as best I can tell, and have begun obeying robots.txt files. They weren't so diligent about it three months ago, or I wouldn't have gotten ticked at 'em.

      BTW, my submission was edited in at least one place: I don't capitalize the word "SPAM," as the capitalized version is Hormel's trademark. (Maybe my submission was combined with someone else's; hard to remember what I wrote 3 months ago.)

      Everything else I'd say has already been said, I wish I'd noticed the story sooner.

      Shaun
      --
      Thanks to the War on Drugs, it's easier to buy meth than it is to buy cold medicine!
  5. There are more than copyright concerns... by Anonymous Coward · · Score: 4, Insightful

    It's a scary thought that things kids are saying on message boards when they're teenagers are going to be back to haunt them when they apply for jobs in their mid 40s...

    I mean, if everything I posted on BBSes in the 1980s were still attributable to me... yikes.

    Remember kids. Use a nickname, and change it frequently if you ever want to run for any kind of office.

    1. Re:There are more than copyright concerns... by TheMonkeyDepartment · · Score: 4, Insightful

      Well, that's a great point, and it's a good illustration of the double-edged sword of free speech. You are free to say whatever dumbshit, ridiculous things you want. But you are also free to deal with the social consequences.

    2. Re:There are more than copyright concerns... by rhaig · · Score: 4, Interesting

      dejanews was my best tool to weed out resumes

      before I scheduled even a phone interview, I'd always search dejanews for the person in question. Sometimes I'd come up with a definite hit (first and last name as well as email and mentioning the local area or some work that was on their resume) and I'd be able to see what kind of person I was really dealing with. That's when I started looking at what I'd posted.

      --
      "We are not tolerant people. We prefer drastically effective solutions"
    3. Re:There are more than copyright concerns... by Anonymous Coward · · Score: 0

      Hey, if you did it in Houston, look out. One of these days I'm going to dig out my 12 year old BBS message archives and start putting it online. Then it'll be in Google and the Wayback Machine, too.

      The only trick is making the stupid 1541 talk to the parallel port under Linux...

    4. Re:There are more than copyright concerns... by gad_zuki! · · Score: 2

      The problem with this is that as the copyright owner you cannot convince google to remove your old usenet posts unless you still have that 1998 email address. It's a ridiculous requirement for google to ask people to have their college email accounts when they want the posts they own and wrote removed from google's system. A simple proof of ID should be enough for them.

      An inaccessible copyright policy is like having no policy at all. Expect fallout sooner or later.

    5. Re:There are more than copyright concerns... by Anonymous Coward · · Score: 0

      I would definitely check for simple spelling errors as a sign of quality too.

    6. Re:There are more than copyright concerns... by Anonymous Coward · · Score: 0
      I'm one of those kids who posted prolifically and embarrassingly to USENET since day one. I'm now a hacker in my 40's and all I need to do to unearth my ancient grievous acts by the dozens is to search groups.google.com for certain unique and clever turns of phrase. Yeeks. I just hope that the potential employer who finds these pearls realizes that they were written twenty years ago, and I've mellowed some since then.

      But do I love google groups and the wayback machine? You betcha.

    7. Re:There are more than copyright concerns... by Anonymous Coward · · Score: 0
      i said a bunch of silly things back in 93, 94, and i stand by every stupid, thoughtless thing i said. i was younger, and i grew. anyone who thinks i am still that person and who judges me based on that, well, that's their problem.

      hell i suppose i still say a bunch of (different) thoughtless stupid things and i stand by them b/c i can admit my mistakes and work forward from there.

      besides, if the pres of the states can snort coke and still get elected into office, i'm betting i can say silly things and get elected as well.

    8. Re:There are more than copyright concerns... by Fencepost · · Score: 2

      Wrong. You can get Google to remove postings you made with no-longer-extant email addresses. See the Google Groups help, specifically this entry.

      --
      fencepost
      just a little off
    9. Re:There are more than copyright concerns... by Anonymous Coward · · Score: 0

      The man who said, "I didn't inhale," still got into the highest office in the land ;) And you're worried about what you said twenty years ago?

      You want to work for someone who digs that deeply into your past? I'm so disappointed that I can't work for Castro or his like (sarcasm).

    10. Re:There are more than copyright concerns... by Suppafly · · Score: 2

      I don't understand why people can't grasp that concept and also the concept of 'if you are going to put something out that's publicly viewable, you can't take it back'. It's dumb to whine about caches; if you put something out that is world viewable, that's the risk you take.

    11. Re:There are more than copyright concerns... by Suppafly · · Score: 2

      Why would you want to have them removed if you posted them?

    12. Re:There are more than copyright concerns... by Sloppy · · Score: 1

      It's a scary thought that things kids are saying on message boards when they're teenagers are going to be back to haunt them when they apply for jobs in their mid 40s...

      Look at it from the other side: you're considering hiring someone. Do you care what they said 20 years ago? (No, I'm not asking if the tabloid media will care what a presidential candidate said 20 years ago, I'm asking about real people.) Probably not.

      You probably take it for granted that they did or said something dumb (who hasn't?), and then dig it up to rib 'em with at the next office party.

      --
      As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
    13. Re:There are more than copyright concerns... by Shalda · · Score: 1

      I mean, if everything I posted on BBSes in the 1980s were still attributable to me... yikes.

      Not to freak you out, but I have archives of the BBS I ran from '93 - present. Well, chunks of it anyway. I know several other past and present sysops who keep much better records than I. While the people I know may not be a statistical sample, I'd say there's a good chance that unless you ran the board yourself, someone has an archive of it.

      That said, I need to plug my ongoing BBS... The Dark Tower, currently operated out of Richmond, VA.

    14. Re:There are more than copyright concerns... by DrMaurer · · Score: 1

      Great, so YOU'RE the reason I can't get a job. Looking back at stupid messages I posted back in high school now that I've graduated college and have a child . . . I can CHANGE! GOD DAMN YOU! I CAN CHANGE! FOR THE LOVE OF ALL THINGS HOLY, I CAN CHANGE!

      I promise I won't post any grotesque misspellings on Usenet any more! I promise I won't post to the alt.talk.origins newsgroups! I promise I'll be good! I'll never play Quake again!!!

      So, can I have a job now?

      --
      Dan
    15. Re:There are more than copyright concerns... by Anonymous Coward · · Score: 0

      My nickname is "Anonymous Coward"... Should I change it??

    16. Re:There are more than copyright concerns... by madmancarman · · Score: 4, Interesting
      dejanews was my best tool to weed out resumes

      before I scheduled even a phone interview, I'd always search dejanews for the person in question. Sometimes I'd come up with a definite hit (first and last name as well as email and mentioning the local area or some work that was on their resume) and I'd be able to see what kind of person I was really dealing with. That's when I started looking at what I'd posted.

      This kind of freaked me out when I started teaching in 1998 - I'd been running a large fan web site devoted to one of my favorite bands, and being heavily into the band, I posted a lot in their newsgroup and participated in more than one flame war. Of course, I was in college and in my very early 20's and late teens, but it's all archived on DejaNews now, with no way to remove it. I really doubt any public school districts are going to wise up to this (or even care, considering the national teacher shortage), but I wouldn't be surprised if it came back to haunt me in some way some day. As a previous poster mentioned, such is the burden of free speech.

      An interesting thing did happen to me at the beginning of this school year. I teach high school computer classes, and I was talking about managing that fan web site when one of my students (a junior) opened his eyes really big and pointed at me with his jaw dropped, sort of aghast. I paused and asked him what was wrong, and he exclaimed that he downloaded and used the guitar tabs I'd written years earlier when he was in junior high. I found that kind of amusing!

      I think the archiving of the internet is particularly scary when I can still find a lousy guitar tab of Pearl Jam's "Footsteps" that I did back in 1992, when I was a senior in high school piggybacking off an account at the nearby university, on my parents' Apple //e, while I was still learning how to play guitar. Obviously, the internet can have a much longer shelf life than a ProDOS 5.25" floppy (excluding news sites that "expire" their articles after limited availability).

      --
      First they ignore you, then they laugh at you, then they fight you, then you win. -- Gandhi
    17. Re:There are more than copyright concerns... by rhaig · · Score: 2

      when they put me somewhere doing more than just busywork, I'll call you. Until people start hiring sysadmins again, I'm stuck working for the state.

      --
      "We are not tolerant people. We prefer drastically effective solutions"
    18. Re:There are more than copyright concerns... by guttentag · · Score: 2
      Remember kids. Use a nickname, and change it frequently if you ever want to run for any kind of office.
      20 years later...

      MTV: "PresidentNeal, have you ever posted on Slashdot?"
      PresidentNeal: "Yes, but I never inhaled controlh controlh controlh controlh controlh controlh controlh modded anyone down for being a troll."
      MTV: "Mr. President, control-H doesn't work on Microsoft TV. It looks like you think you're using Linux."

    19. Re:There are more than copyright concerns... by sql*kitten · · Score: 2

      I wouldn't be surprised if it came back to haunt me in some way some day. As a previous poster mentioned, such is the burden of free speech.

      The thing is, people posted to usenet believing that it was an ephemeral medium, and that everything they said was essentially a throwaway comment that would expire within a week at most. The idea that someone was actually saving all this stuff simply didn't occur to 99% of posters (myself included). Partly it was because way back when, the storage to keep a usenet history online would have been prohibitively expensive, and partly because who would even want to preserve alt.*?

      Google do have a procedure for removing posts from their archive, but either it doesn't work or they are simply autoresponding then ignoring the request.

    20. Re:There are more than copyright concerns... by jolshefsky · · Score: 1
      I gotta say, it excites me to see the potential evolution of human society.

      It seems two things are true of people (as a group) today. (Among other things.)

      First, individuals assume that people will hear exactly what they say in exactly the context they said it and in exactly the way they were thinking it. Do you remember in 1999 when David Howard used the word "niggardly" properly and was forced to resign because people thought it sounded too much like a racial epithet? All you have to do is (1) clarify the pronunciation of the word, (2) mention that it was in the context of budgetary matters and nothing to do with blacks, and (3) state that it never crossed your mind that it could be misconstrued.

      Right?

      Second, individuals don't believe that others can change. Mark David Chapman killed John Lennon in 1980. Logically, I think the guy could have changed his behavior over the last twenty-one years, but the rest of my brain claims that he must still be a dangerous nutcase. Everybody is different from the person they were twenty years ago, and there's one thing that each person can tag as the worst behavior back then that they don't do anymore. Sure, some things stay the same but some do change.

      Truth sometimes comes along and punches you in the head and says, "wake up--you're being an idiot." Things like the fact that WebArchive is public have the potential to do that to a large part of society. People (as a group, explicitly self-inclusive) need to look at other people as someone who could be like themselves rather than as automatons programmed with a simple set of moral and behavioral instructions.

      Let me just add that I've got my share of embarrassing things on the web. Heck, even a search of my RIT user ID revealed that someone archived a part of VAX Notes from the early nineties (http://www.csh.rit.edu/~tonyl/ancient/levhall.htm) ... boy was I a dolt sometimes in college--and that's not even through WebArchive. There's even worse stuff that could surface ... such as the public flogging over my condescending view of homosexuality. I have to believe that people are able to change because I have.

      --
      --- Jason Olshefsky

      Karma: Poser (mostly affected by adding this line long after everyone else did)

  6. Opting out -- of publicly available HTTP??? by TheMonkeyDepartment · · Score: 4, Interesting

    When you publish something on the web, it is publicly available via HTTP. End of story. Responsible netizens can observe the requests of "robots.txt" but they don't have to. If you want something more controlled, create a VPN or intranet or some other kind of non-public data server.

    Your argument is similar to that of newspaper publishers who didn't like "deep linking." What they couldn't (or didn't want to) understand is that the nature of an HTTP web server is quite simple. A client asks for a file, the server gives it back. Using that protocol implies that you are OK with that. If you're not, I suggest you look into different technologies, instead of complaining about lack of control, in a medium that was never intended to provide it.
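
    For readers who haven't written one, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard urllib.robotparser module. The rules, the example.com URLs, and the may_fetch helper are hypothetical illustrations, and ia_archiver is used only as an assumed crawler name. Nothing in HTTP enforces the answer, which is the point made above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named crawler entirely,
# and keep everyone else out of /private/ only.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "SomeBrowser"):
        for page in ("http://example.com/index.html",
                     "http://example.com/private/notes.html"):
            print(agent, page, may_fetch(ROBOTS_TXT, agent, page))
```

    A crawler that never runs a check like this still gets the page, since the server simply hands back whatever is requested.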

    1. Re:Opting out -- of publicly available HTTP??? by ajmarks · · Score: 1

      One of the problems with archiving is simple copyright violation. If I make a site, regardless of the fact that HTTP is open, it is legally very questionable (understatement) to save a copy of it and redistribute it without my permission.

      --
      Opinions are not Informative, though they may be Insightful or Interesting.
    2. Re:Opting out -- of publicly available HTTP??? by I_redwolf · · Score: 1

      amen.

    3. Re:Opting out -- of publicly available HTTP??? by billybobSDK · · Score: 1

      So should magazine publishers also be allowed to opt out of un-requested archival in my bathroom too?

    4. Re:Opting out -- of publicly available HTTP??? by krogoth · · Score: 2

      Exactly what I wanted to say. Of course, when you put something on the Internet you don't expect it to be archived forever, but you have to keep in mind that anyone can download it and do what they want.

      --

      They that quote Benjamin Franklin on liberty and safety deserve neither.
    5. Re:Opting out -- of publicly available HTTP??? by KillerCow · · Score: 4, Insightful

      When you publish something on the web, it is publicly available via HTTP. End of story.

      I don't think that that is a good enough standard. When a television show is broadcast, or when a book is published, it is publicly available -- but we don't think that the publisher loses their right to copyright protection in these cases. Publishing on the web is similar. The creator wants people to see his/her creation, but does not automatically give visitors the right to archive and retransmit the works.

    6. Re:Opting out -- of publicly available HTTP??? by FreeUser · · Score: 2

      When you publish something on the web, it is publicly available via HTTP. End of story.

      Exactly. By publishing online and publicly you've already opted-in.

      This is just another example of how incompatible copyright is with any kind of normalcy vis-a-vis individual freedom and, in this particular case, the freedom to archive information and hold someone accountable if they try to change it retroactively (and on the sly). Unless we want Orwellian-style changing of the facts post facto, copyright must lose to the right of archivists to preserve information from being lost. Any other policy would be disastrous.

      --
      The Future of Human Evolution: Autonomy
    7. Re:Opting out -- of publicly available HTTP??? by sckeener · · Score: 2

      When you publish something on the web, it is publicly available via HTTP. End of story.

      Ah...as a previous post pointed out, I don't think kids should have their remarks recorded forever. I doubt I would have made it as far as I have if my BBS quotes were still around...

      --
      "Only one thing, is impossible for god: to find any sense in any copyright law on the planet." Mark Twain
    8. Re:Opting out -- of publicly available HTTP??? by Anonymous Coward · · Score: 0

      i tried a site i did and it gave me my original post, none of the updates were in the "updates"
      odd no?

    9. Re:Opting out -- of publicly available HTTP??? by I_redwolf · · Score: 2

      Comparing a library of books or whatever you are thinking about and the HTTP protocol doesn't work.

      You go to the Library, borrow a book which has a value and has to be returned; If you keep it then you must pay the value of the book. You could also take the book and make a copy wherever you are and redistribute and most likely pay heavy fines when caught.

      With the HTTP protocol, information is put on a public network that operates on the basis that information exchange should be free (this is why there is an httpS://). The information, files, etc you receive have value except they don't have to be returned. You could also take those files etc and make a copy wherever you are and redistribute and most likely pay heavy fines when caught. HOWEVER if you archive those files, images etc that you receive it's not a big issue. Just like you read a book and it has value, you archive as much as possible in your brain for a test, or you pull pieces of it out or whatever. Except in those cases you DO redistribute on paper or into a report or wherever else.

      If you go your route then eventually teachers wouldn't be able to teach, you kill off the internet's original purpose and you usher in a 1984-like society; in general this is a very bad idea.

      I'm going overboard right? If I'm not mistaken your email address says you go to Cornell, look around campus and tell me if I'm going overboard.

    10. Re:Opting out -- of publicly available HTTP??? by elmegil · · Score: 2
      I doubt I would have made it as far as I have if my BBS quotes were still around...

      I don't see why not. I have lots of fun quotes floating around in google, deja* and slashdot, but my employers have never called me on them. Do you think HR has nothing better to do than find ways to embarrass potential employees? Perhaps some places, but I wouldn't want to work there....

      Public office is a different matter, but honestly, I don't see how embarrassing things said on the net are any worse than embarrassing things done in the public record (like the famous Newt G. divorce your wife while she's in the hospital with cancer debacle).

      --
      7 November 2006: The day Americans realized corruption and incompetence weren't addressing 11 September 2001
    11. Re:Opting out -- of publicly available HTTP??? by TheCarp · · Score: 5, Interesting

      The other question is one of historical record.

      What you say does not BELONG to you. It is not property. Once you write it, it exists. You may own the medium it is on, but once it is out in the world it is uncontrollable and no longer owned. You may hold copyright... but a hundred years from now when you are long since dead and copyright is expiring, then what?

      We have the works of Galileo, we have letters that Thomas Jefferson wrote to people, why? because they were written. Many years later, long after the fact, these were made public and part of historic record because they survived.

      On the net, we have a culture of written information appearing and disappearing. This information is part of our culture, it's things that we read and see, when it goes away - for whatever reason - we have lost something.

      I have websites from 96 that exist now only in the way back machine. Yea, some of the stuff I said back then I don't agree with now, and would rather not have associated with me but, by that same token, I wouldn't want it to be lost forever. If someone read it and what I wrote had enough impact on them that they want to see it again... then I would not even dream of trying to stop them (even if the impact was one of disgust - an impact is an impact) - even if it's just someone wanting to see what the web looked like 5 years ago... I think that's valid... I think that's an important record of our culture.

      the only thing I can see a case for really is the removal of personal information that shouldn't have been public in the first place. Beyond that though, I think it's good... i mean... it's not something that is ever going to be mistaken for a live current site - you have to actually go to the way back machine and ask for it.

      All in all this is a good thing and I hope it survives longtime.

      -Steve

      --
      "I opened my eyes, and everything went dark again"
    12. Re:Opting out -- of publicly available HTTP??? by Jobe_br · · Score: 2

      Archiving is one thing, rebroadcasting (or rehosting as is the case here) is another. By copyrighting my site, I reserve the sole right to host a server that distributes that content. Nobody else is given a right, expressly or implied, to 'mirror' my site, regardless of whether it's for archival purposes or not. That's the consideration that needs to be understood here. Archives are great - I often make use of Google's cache, but only if the *real* content I'm trying to reach is behind a slow connection or down entirely. Technically, Google cache should reserve copyright, too - a concept that would certainly kill off the practice. Is it worth it to lose the convenience? Possibly ... think about the ramifications and think hard. If it's permissible to archive and host a site's content, why wouldn't it be permissible to archive and broadcast the ST:TNG episodes you've so faithfully taped, w/o paying the royalties to whoever holds the rights to that? Seems like it's pretty much the same thing to me, eh?

      Now - if you yourself want to archive a site that you're interested in, or if you want to contact the maintainer of a site for something you're looking for that used to be on his/her site, that's perfectly legit and respectable. I personally have all the previous sites for my company archived - if someone wants something that's no longer on our current site, they can certainly ask and we'll try to fulfill their request.

      I dunno ... I'm up in the air on this, but I entirely understand the copyright ramifications of the situation.

    13. Re:Opting out -- of publicly available HTTP??? by GoatEnigma · · Score: 1
      Uh, I think KillerCow is the only person here who apparently understands the difference between "publicly available" and "copyright". These are not exclusive of each other - just because something is public does not mean it's free for anyone to do whatever they want with. I can't go to the library and copy entire books - that is a violation of the copyright act. Yet I can go and read them because they are publicly available.

      Same with magazines - I can buy one and read it because it is published (published comes from same root as public, I'd bet). But that doesn't give me the right to go reprint articles and hand them out. So no, the magazine company can't opt out of you archiving in your bathroom, but they CAN hit you with a lawsuit if you try to reprint them. That's why libraries aren't sued for archiving books - they aren't reprinting them, just storing them.

      The issue here is whether the Machine is reprinting stuff or not. Is it? If the content is exactly the same, is it a reprint or just a stored copy? If I made something publicly available, I don't see why someone can't archive it - just as long as they don't reprint it. So if they're not the exact same bits and bytes that I made, it's a violation. But if it is..... Very interesting distinction.

    14. Re:Opting out -- of publicly available HTTP??? by Anonymous Coward · · Score: 0

      yea, so in other words...shut the fuck up

    15. Re:Opting out -- of publicly available HTTP??? by LordNimon · · Score: 1

      TV is not the WWW. We can't allow people to believe that they can have such strict control over the contents of their web site, because otherwise the WWW is dead.

      --
      And the men who hold high places must be the ones who start
      To mold a new reality... closer to the heart
    16. Re:Opting out -- of publicly available HTTP??? by LordNimon · · Score: 0

      So you're saying that Google should be legally responsible if its cache or search engine is out of date? That's crazy. Just think of the Wayback Machine as a Google cache that's way out of date. After all, that's pretty much what it is.

      --
      And the men who hold high places must be the ones who start
      To mold a new reality... closer to the heart
    17. Re:Opting out -- of publicly available HTTP??? by Anonymous Coward · · Score: 0

      Then you shouldn't distribute your content in a public manner. The law can't protect what you're giving away yourself. If you want to protect it, don't post it.

    18. Re:Opting out -- of publicly available HTTP??? by I_redwolf · · Score: 3, Interesting

      If the creator wants people to see his/her creation but not give them the right to archive and retransmit the works, then just like always they can put a (C) at the bottom of their webpage stating that redistribution of the work without express permission of blah is prohibited. Obviously that would raise the question of how many people the creator would want to see his/her work in the first place. If they want to be selective then be selective: write a webapp that only allows registered users who have agreed to a non-disclose or redistribute license, etc. There are many ways to go about it so long as the creator understands "When you publish something on the web, it is publicly available via HTTP".

    19. Re:Opting out -- of publicly available HTTP??? by osgeek · · Score: 2

      When you publish something on the web, it is publicly available via HTTP. End of story. Responsible netizens can observe the requests of "robots.txt" but they don't have to. If you want something more controlled, create a VPN or intranet or some other kind of non-public data server.

      Your argument is analogous to those of spammers and telemarketers. You have a publicly available email box, so we can use it to send you spam - You have a publicly reachable phone, so we can use it to call you to sell you stuff.

      Respecting the wishes of those who create content, own email boxes, or have telephones would trump the wishes of those who wish to use those resources -- in my ideal society.

    20. Re:Opting out -- of publicly available HTTP??? by krypto246 · · Score: 4, Insightful

      People are just pissed about this archiving because they like the internet to be a 100% responsibility free zone - no matter what you say or do, you can always change, edit or delete it later. How about standing behind your comments and opinions, instead of just deleting them when they can be held against you? Yes - use nicknames and aliases, but don't expect the things you put out there to be temporary. You put something out into the internet, it stays there, and it can be found later, that's the power of the net, and the price you pay for it.

    21. Re:Opting out -- of publicly available HTTP??? by RovingSlug · · Score: 2
      Okay, recast it in this direction:

      It's the same problem we're having with Napster, Kazaa, Blizzard, etc. That information can trivially be copied, that certain "copies" of information are absolutely fundamental for a computer to work properly ... these issues eat at the original preconditions to copyright.

      My computer needs a copy of your information in its registers, in its L1/L2/L3 caches, in its system RAM. Software may and often does save (archive) copies to a cache on the hard drive -- admins usually appreciate this because it reduces server load. A transparent web proxy to an intranet may cache web requests for its internal clients if it has a slow outgoing connection.

      Surely you shouldn't have to "opt in" the first few cases. But, it's all the same principle, caching/archiving. So, as we go out, especially to the transparent web proxy, where do you have to opt in? And what about further out, as computers become just one component in a cluster of computers? Where does broadcasting begin and caching end?

      I think there has to be a lot more philosophy than ST:TNG analogies to make a sound decision about copyright ramifications to computers (see Taking the Copy Out of Copyright). It's a very broad issue, and it will significantly determine the way we use both information and computers/electronics in the future.

    22. Re:Opting out -- of publicly available HTTP??? by bhsx · · Score: 1

      Let me ask you this... is a proxy server illegal/immoral? That's all this is.

      --
      put the what in the where?
    23. Re:Opting out -- of publicly available HTTP??? by delta407 · · Score: 2

      HTTP makes provision for caching and caching proxy servers, which does give visiting machines the right to archive and retransmit the works. Of course, there are expiration headers, but there is nothing that says it has to be purged from the cache once it expires.

      Are cache servers in violation of copyright?
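
      As a rough illustration of the expiration point, the sketch below checks whether a cached response is still fresh based on a Cache-Control max-age value. The helper names and the sample header are assumptions made for this example, and real caches weigh more than this one directive. Note that the check only labels an entry as stale; nothing here deletes it.

```python
import re
from typing import Mapping, Optional


def freshness_lifetime(headers: Mapping[str, str]) -> Optional[int]:
    """Return the max-age value in seconds, or None if the header has none."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"max-age=(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(headers: Mapping[str, str], age_seconds: int) -> bool:
    """True while the stored copy is younger than its stated lifetime."""
    lifetime = freshness_lifetime(headers)
    return lifetime is not None and age_seconds < lifetime


if __name__ == "__main__":
    stored = {"Cache-Control": "public, max-age=3600"}  # sample header
    print(is_fresh(stored, 10))     # stored ten seconds ago
    print(is_fresh(stored, 7200))   # stored two hours ago, still on disk
```

      Whether a cache serves, revalidates, or keeps a stale copy is a policy decision of the cache, not something this check decides.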

    24. Re:Opting out -- of publicly available HTTP??? by M-G · · Score: 2

      Yeah, but the magazine publisher isn't gonna go into your bathroom, remove the old magazine content, and replace it with new stuff, which is in effect what happens with web sites.

    25. Re:Opting out -- of publicly available HTTP??? by Anonymous Coward · · Score: 0

      Comparing a library of books or whatever you are thinking about and cable TV doesn't work.

      You go to the Library, borrow a book which has a value and has to be returned; If you keep it then you must pay the value of the book. You could also take the book and make a copy wherever you are and redistribute and most likely pay heavy fines when caught.

      With cable TV, information is put on a public network that operates on the basis that information exchange should be free (this is why there is Pay Per View). The sitcoms, documentaries, etc you receive have value except they don't have to be returned. You could also take those shows etc and make a copy wherever you are and redistribute and most likely pay heavy fines when caught. HOWEVER if you archive those shows, documentaries etc that you receive it's not a big issue. Just like you read a book and it has value, you archive as much as possible in your brain for a test, or you pull pieces of it out or whatever. Except in those cases you DO redistribute on paper or into a report or wherever else.

      If you go your route then eventually teachers wouldn't be able to teach, you kill off the television's original purpose and you usher in a 1984 like society; in general this is a very bad idea.

    26. Re:Opting out -- of publicly available HTTP??? by budgenator · · Score: 2

      My site not only has the copyright notice on the bottom of the page, but ad banners for companies long since bankrupt during the DOT-bomb phase of the internet; but they are archived for posterity by the wayback machine. So the little (c) is ignored.

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    27. Re:Opting out -- of publicly available HTTP??? by Anonymous Coward · · Score: 0

      Bullshot !
      If you broadcast it, it is beyond your control.
      How can you say " I want this house to get my program, but not these."
      You put it on the web, it becomes part of my cache on my computer, you can't tell me what to do with it.
      You're lucky I don't charge you storage fees.

    28. Re:Opting out -- of publicly available HTTP??? by Uncle+Ira · · Score: 1
      When a television show is broadcast, or when a book is published, it is publicly available -- but we don't think that the publisher loses their right to copyright protection in these cases

      Based on the uproar over PVRs lately, I would have to say that many people on this forum do think just that.

      Of the two examples above, book publishers have a stronger argument as far as copyright protection since there is a physical object that can be used to establish ownership. A television broadcast is just thrown out over public airwaves. At that point any control you may have over the content effectively disappears.

      HTML archiving has more of a parallel to the television show concept - it technically has to be copied before it can be used. If you publish to the web, you're publishing for free. Do so at your own risk.

    29. Re:Opting out -- of publicly available HTTP??? by GoatEnigma · · Score: 1
      Hence the purpose of archiving. Essentially, every time a web site is updated, it is a new publication (or an updated version of a previous publication). Think of it as a new book to be archived in a library. As long as Wayback Machine isn't altering or selling their archived information, they are essentially acting as a public library.

      Would you sue a public library for having a copy of a novel you wrote, even though they didn't ask?

    30. Re:Opting out -- of publicly available HTTP??? by Dan+D. · · Score: 2
      Then why aren't people suing the TV Guide for deep linking infringement? And yes, you can record TV, and quite often you can find archives of TV that may or may not actually have been collected by the original producer of the show.

      If it's broadcast on publicly available wavelengths (or electrons) it's publicly available.

      --
      People who quote themselves bug the crap out of me -- Me.
    31. Re:Opting out -- of publicly available HTTP??? by I_redwolf · · Score: 2

      Cable TV is a protocol? I can broadcast my own tv shows without FCC permits?! Wow, AC's sure do know a lot of stuff it seems. If you could forward that information to me. Or read the rules of public access.

    32. Re:Opting out -- of publicly available HTTP??? by I_redwolf · · Score: 2

      The ad banners were for companies long since bankrupt during the DOT-bomb phase of the internet; but the banner ads don't belong to you; you were redistributing them on your website. Maybe with the permission of the advertiser, including a small fee they might have paid you. I bet if you ask them, they won't complain about these same banner ads still promoting whatever product. But hey, I could be wrong; there are some advertisers who only want people who saw the website in June of '98 to buy their stuff.

    33. Re:Opting out -- of publicly available HTTP??? by nexthec · · Score: 3, Interesting

      Actually, in Canada (I am an American, but I'm married to a Canuck) anybody can rebroadcast anything. The deal is, though, that they cannot change it: they can't remove advertisements, shorten or lengthen it, add commentary over it, or put up their own logo. Kind of a neat idea.

    34. Re:Opting out -- of publicly available HTTP??? by ajmarks · · Score: 1

      Redistributing materials to students is one thing. Storing a copy of a webpage from three years ago and redistributing that copy publicly, for purposes other than things like instruction, is something completely different. Hell, I think the limits on what percentage of a book may be redistributed to students are draconian.

      --
      Opinions are not Informative, though they may be Insightful or Interesting.
    35. Re:Opting out -- of publicly available HTTP??? by npsimons · · Score: 1
      When a television show is broadcast, or when a book is published, it is publicly available -- but we don't think that the publisher loses their right to copyright protection in these cases. Publishing on the web is similar.


      You're right; it is similar. When a book gets published, I can go to the library to check it out, free of charge. When a television show is broadcast, I can pick it up with an antenna, free of charge. And when something is published online, I can copy it and link to it, free of charge. If you don't like that, don't use the web!

    36. Re:Opting out -- of publicly available HTTP??? by hymie3 · · Score: 2

      How about standing behind your comments and opinions, instead of just deleting them when they can be held against you?

      Okay, for Usenet, sure. That's why I haven't removed my stuff from the Google news thing. I've got posts from 1992 on there, many of them not all that flattering (I was 18, that's my excuse).

      My websites, on the other hand, are *my* creation, not "released to the public" as has been argued is the case for email and usenet. I *still* own the copyrights, but they are not being respected.

      I can stand behind stupid/offensive websites I made in my younger days. Can they stand behind their claims that they respect copyright? Last time I checked, copyright (at least in Berne signatory countries) was not an opt-in thing.

    37. Re:Opting out -- of publicly available HTTP??? by Anonymous Coward · · Score: 0

      No. One of the good things about the DMCA is that it makes caching legal.

    38. Re:Opting out -- of publicly available HTTP??? by PerryMason · · Score: 1

      Then how do you allow caching? Surely a proxy cache simply archives and retransmits the website. The Wayback Machine is just a proxy cache with a cache that doesn't get overwritten very often ;)

      --
      "I'm tired of all this 'Aren't humanity great' bullshit. We're a virus with shoes" - Bill Hicks
    39. Re:Opting out -- of publicly available HTTP??? by subsolar2 · · Score: 2

      On the net, we have a culture of written information appearing and disappearing. This information is part of our culture; it's things that we read and see. When it goes away - for whatever reason - we have lost something.

      I have to agree wholeheartedly. I wish the archive went farther back to the beginnings of the web so people could really see how it started out. It's always bothered me that there was no way of saving it, because that makes it possible to basically rewrite history without anyone being able to prove it.


      There are also sites that I wish would have made it to the archive back when I first started out in 95 ... it would be cool to look at them again.

    40. Re:Opting out -- of publicly available HTTP??? by stinky+wizzleteats · · Score: 2

      The creator wants people to see his/her creation, but does not automatically give visitors the right to archive and retransmit the works.

      It's amazing to me how people can be so enthusiastic about using technology to spread information and yet be so capable of an unreasonable need to control that information once spread. To demand ownership of HTML, when storage and retransmission are a normal part of the operation of web browser software, and when you really don't even have control over how your page is presented by the browser, is patently absurd.

      You can't have it both ways. If you want to play in a world where information is freely and rapidly exchanged, then you must be prepared for exactly that.

    41. Re:Opting out -- of publicly available HTTP??? by Lars+T. · · Score: 2
      A copy of (almost) every printed book or newspaper is stored in at least one public library. In some countries it is even required that you give one copy to a national archive. Why should publishing on the web be any different?

      It is somewhat odd that the Slashdot crowd both wants to get rid of IP and cheers for Open Whatever, but wants their copyright protected if somebody archives their webpage or Usenet post.

      --

      Lars T.

      To the guy who modded me down from perfect to terrible Karma - Apple haters still suck

    42. Re:Opting out -- of publicly available HTTP??? by Simon+Brooke · · Score: 2
      I don't think that that is a good enough standard. When a television show is broadcast, or when a book is published, it is publicly available -- but we don't think that the publisher loses their right to copyright protection in these cases. Publishing on the web is similar. The creator wants people to see his/her creation, but does not automatically give visitors the right to archive and retransmit the works.

      We cannot have it both ways.

      Either there is an information commons, with rights to fair use of copyright information, or there is a DMCA-inspired world where ultimately it becomes illegal to so much as quote a phrase anyone else has previously used. If you said a thing, you said it; it's a fact and it's part of the historical record. You might reasonably have reason to complain if the wayback machine altered what you said in any way, or made it appear that you had said something you had not said; but so long as it merely archives and keeps an historical record, as far as I am concerned it is entirely legitimate and proper.

      I also hope that the wayback machine is archiving material that is hidden by robots.txt files, and will make them public after normal copyright has lapsed. It is part of the historical record, too.

      --
      I'm old enough to remember when discussions on Slashdot were well informed.
    43. Re:Opting out -- of publicly available HTTP??? by Twylite · · Score: 2

      Wrong, your website IS released to the public, unless you have taken steps to make it private (say, using password protection and having a limited set of permitted members). Publishing on the web is just that - publishing. That gives libraries the right to take your publication and make it available to the general public.

      The issue is more accurately one of whether the archive site is simply making your publication accessible, or republishing (which is protected by copyright). Copyright ownership does NOT give you carte blanche to decide how and who has access to your material; there are limitations and provisions, in particular that public libraries can provide access to that material on a loan basis irrespective of the licensing clause you try to apply.

      --
      i-name =twylite [http://public.xdi.org/=twylite], see idcommons.net
    44. Re:Opting out -- of publicly available HTTP??? by Anonymous Coward · · Score: 0

      >If the creator wants people to see his/her creation, but not give them the right to archive and retransmit the works, just like always they can put a (C) at the bottom of their webpage expressing that redistribution of the work without express permission of blah is prohibited.

      Not necessary. As creator of a work you retain copyright unless you expressly release it into the public domain; no copyright notice is required. You create it, you own it.

    45. Re:Opting out -- of publicly available HTTP??? by Anonymous Coward · · Score: 0

      >HTML archiving has more of a parallel to the television show concept: it technically has to be copied before it can be used. If you publish to the web, you're publishing for free. Do so at your own risk.

      And again, where does it say in HTML that I give up my rights to my own copyrighted material just because I post it on my web site? I created it, I own it, end of story. You people can mumbo-jumbo and dance around that all you want; it doesn't budge the simple truth an inch.

    46. Re:Opting out -- of publicly available HTTP??? by Anonymous Coward · · Score: 0

      >When you publish something on the web, it is publicly available via HTTP. End of story.

      Not at all the end. I own the material I have created. I can make that material available to readers in whatever manner I choose. I still retain the copyright, *me* not Wayback, *me*. I have the right to decide where, when and how my works are used, stored and viewed. *Me* not them.

    47. Re:Opting out -- of publicly available HTTP??? by Anonymous Coward · · Score: 0

      >Then you shouldn't distribute your content in a public manner. The law can't protect what you're giving away yourself. If you want to protect it, don't post it.

      I see. If someone steals my work, then it's my fault? I have rights as a copyright holder. What Wayback does violates those rights. Using your logic it's okay to write down a song I heard someone sing and do whatever I wanted to with it, right? Or use the plot and characters of a street play or live theatre, etc. After all if they didn't want it stolen they shouldn't have performed it in public.

    48. Re:Opting out -- of publicly available HTTP??? by Jobe_br · · Score: 1

      I think I agree ... copyright is an intrinsically difficult concept to wrap my brain around and particularly difficult to apply to new areas. However, I think I stand on solid footing when I say that in absolutely *no* case is *anyone* allowed to publicly display in its *entirety* my copyrighted material without my prior authorization. Since this is the case in many respects with the web archive, I think that at least on this point the concept is flawed.

      Now - how 'bout sites that no longer exist? If the copyright holder (company or individual) still exists and can be reached then the copyright should be respected (as it would be if the issue were taken to court by said individual or company). However, if a company or individual has "sunk into the ether" then I could see archived material being released into the public domain, no sweat.

      And that could be done on a try/fail basis - so if due diligence is practiced in attempting to ascertain if the holder of a copyright still exists and that fails then rehosting the content should be allowed - if later the copyright holder *does* turn up and ask that the material be removed or restricted then that should be respected at that time.

      A practical way might be to determine whether a particular domain is still active. If so, chances are that it's the same owner (I realize that this isn't always the case). An automated check could query whois and determine if the owner information for a particular domain has changed - if so, the archive could query the current owner and determine if they now hold the copyright to the archived material (this could be an automated process, for the most part).

      If a domain is no longer active, that's a good sign that the company no longer exists, at least for a phase 1. Checks like this would go a long way towards assuaging the fears of current content providers, including myself.
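
      A rough sketch of the check described above, in Python. It assumes the whois lookup has already been done elsewhere and only classifies the result; `classify_domain`, `Action`, and the registrant strings are hypothetical names made up for illustration, not anything the archive is known to run.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    """Possible outcomes of the due-diligence check (hypothetical names)."""
    ASSUME_SAME_OWNER = "registrant unchanged; treat as the original holder"
    ASK_CURRENT_OWNER = "registrant changed; ask whether they hold the copyright"
    TREAT_AS_ORPHANED = "domain inactive; rehost until a holder objects"


def classify_domain(archived_registrant: Optional[str],
                    current_registrant: Optional[str]) -> Action:
    """Decide what to do with archived pages for one domain.

    archived_registrant: registrant recorded when the pages were crawled
        (None if it was never recorded).
    current_registrant: registrant from a fresh whois lookup
        (None if the domain is no longer registered).
    """
    # Phase 1: an inactive domain suggests the owner may no longer exist.
    if current_registrant is None:
        return Action.TREAT_AS_ORPHANED

    # Same registrant, ignoring case and surrounding whitespace.
    if (archived_registrant is not None
            and archived_registrant.strip().lower()
            == current_registrant.strip().lower()):
        return Action.ASSUME_SAME_OWNER

    # Registrant changed or was never recorded: contact the current owner.
    return Action.ASK_CURRENT_OWNER
```

      Whatever the result, a copyright holder who turns up later and asks for removal would still be honored; the function only decides the default.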

    49. Re:Opting out -- of publicly available HTTP??? by dfackrell · · Score: 1
      When you publish something on the web, it is publicly available via HTTP. End of story.

      I don't think that that is a good enough standard. When a television show is broadcast, or when a book is published, it is publicly available -- but we don't think that the publisher loses their right to copyright protection in these cases. Publishing on the web is similar. The creator wants people to see his/her creation, but does not automatically give visitors the right to archive and retransmit the works.


      Please remember two points when talking about copyright:

      1. It was never intended to protect publishers, but to protect authors from publishers and plagiarists.
      2. It has always been an artificial scheme to try to encourage the production of more creative works.
      --
      "What is the purpose of reality?" When you can answer the question, it will be time for you to leave.
    50. Re:Opting out -- of publicly available HTTP??? by Anonymous Coward · · Score: 0

      so you can't have your own opinion or disagree with, critique, or comment on what is fed to you?

    51. Re:Opting out -- of publicly available HTTP??? by Rose+Meir · · Score: 1
      The problem with the analogy is that HTTP is a different medium (so to speak.)

      When you (or I or anyone) look at a web page we make a copy of the page that was on the "owners" server and place it on our disk. The whole functionality of HTTP is to make an identical copy of the data that was on the remote web site. When a person buys a book (or borrows it for that matter) they would have to take a second step to photocopy it (or transcribe or whatever) to get another copy of the book. The same with the broadcast, the receiving medium (television for example) does not function by making an exact copy and then showing it. Additional mechanisms can be added to do this (Tivo, VCR, etc.) but the basic "protocol" does not work in this manner.

      The truth is that HTTP was designed for sharing information and exchanging ideas. When you read this message you've just copied it to your disk and effectively archived the page. These are still my words; they could be copyrighted. But by publishing them on a web page which is served up using HTTP, I have implicitly given you (and whoever wants it) permission to copy them. You can delete them by clearing your cache, but the copy was already made.

      -Rose

  7. Talk about a time machine... by wompser · · Score: 3, Interesting

    Went back and looked at the site for the .com I used to work for, very nostalgic. The wayback machine is a good resource for people who create content on someone's site (a.k.a. me), and then lose access to it because the company goes under. Now I'm able to add my old content to my portfolio, now that the company who once owned it is gone.

    --
    .....
    1. Re:Talk about a time machine... by Prof.Phreak · · Score: 1
      Totally agree. It's very nostalgic... brings back lots of memories. I'm actually kinda upset it doesn't go back farther in time. (my oldest site there is from 1997, kinda sad can't see the 'original' though).

      And to the people who complain about copyrights: It's public content. If you don't want your "copyrighted" stuff on the internet, then simply don't put it there in the first place. Nobody is complaining about Google's cache, and this is something similar, except it goes back years.

      I think this is a great thing! You can go back to see how the internet used to be. (go see how corny microsoft.com or ibm.com looked in 1996 :-) The only bad thing is that it doesn't go back to the very beginning, other than that, it's one of the sites that will be on my favorites from now on.

      --

      "If anything can go wrong, it will." - Murphy

  8. Simple rule by npsimons · · Score: 1
    There's a very simple rule to remember on the Internet: if you don't want it copied or linked to, don't put it online.


    Come on people, wake up! First NPR, now this brain dead crack monkey who calls himself a "webmaster". Anyone who doesn't understand the simple rule stated above is not qualified to be a webmaster.


    I can understand clueless users, but clueless sysadmins is something with which I will not put up.

    1. Re:Simple rule by galejt · · Score: 1

      Roger that!!!

  9. Permission... by gorf · · Score: 3, Insightful

    who gave them permission to make those copies?

    The way I see it, you implicitly give people some limited form of permission by putting it up on the internet freely available to download in the first place. You put it up for people to download, print out and so forth (which amounts to copying), and therefore you've implied that people may do so.

    Sure, you own copyright, and blatant plagiarism is something that clearly is wrong. But I see nothing wrong with taking an article that you published on the web and reproducing it, as long as it is taken in context and is clearly attributed (and it is made obvious that the copy isn't the original, but proper attribution would do this and therefore suffice).

    Of course, this is republication and so the issue is not so clear and obviously subjective. That's just my opinion.

    1. Re:Permission... by rector · · Score: 1

      You put it up for people to download, print out and so forth (which amounts to copying), and therefore you've implied that people may do so.

      Your argument is similar to the following:
      If a program is shown on TV, why can't someone record it and sell copies?
      Note that copying the content of a website is the same. And showing a banner on the website that contains the copy is the same as selling the content.

      And some website owners explicitly prohibit even printing. See Bloomberg.

    2. Re:Permission... by JebusIsLord · · Score: 1

      No, he said absolutely nothing about "selling" it. Your example is flawed.

      --
      Jeremy
    3. Re:Permission... by rector · · Score: 1

      But I see nothing wrong with taking an article that you published on the web and reproducing it, as long as it is taken in context and is clearly attributed

      In fact, there is a special case where this is not a correct approach. It is popular among scientists to distribute their unpublished work over the Web. But as soon as an article is accepted for publication, the copyright is usually transferred to the publisher, who in turn prohibits the article from being published elsewhere. In this case caching such articles not only involves legal issues, but also contradicts common practice.

    4. Re:Permission... by jgilbert · · Score: 1

      The way I see it, you implicitly give people some limited form of permission by putting it up on the internet freely available to download in the first place.

      Amen. I would have to wholeheartedly agree. The internet is a public resource. If the content is not in some way restricted from access (Basic Auth, etc.) then you should be able to do this. The same goes for linking of any sort to another site. If you don't want it linked to, don't put it up unprotected.

      This is getting ridiculous. Is it just me?

    5. Re:Permission... by rector · · Score: 1

      Selling (even through posting banners) is a special case. But would the copyright owner of a movie be happy, if you record the movie and distribute copies free of charge (say, at a cost of magnetic tape) or encode them in MP2 and place on the Web? While at the same time the same movie is sold in shops?

    6. Re:Permission... by elmegil · · Score: 1

      What exactly is the Wayback Machine selling again?

      --
      7 November 2006: The day Americans realized corruption and incompetence weren't addressing 11 September 2001
    7. Re:Permission... by rector · · Score: 1

      If they place banners on the website, they make money off it. It is essentially the same as selling. The Wayback Machine DOES have banners. Even if a banner says just "Support the Internet Archive. Make a donation", it is a way to make money off someone else's intellectual property.

    8. Re:Permission... by MushMouth · · Score: 1

      There are no banners on the archived pages except the original banner that may have been on the page.

    9. Re:Permission... by gorf · · Score: 2

      Well, assuming you didn't pay to watch TV (if you did, then you're cutting into their revenue by selling copies, so that's a good reason for it not to be right, as otherwise you wouldn't be able to get pay TV) then as long as you kept the advertisements intact, then I don't see a problem.

      As for website owners prohibiting things, I don't really consider that kind of notice valid. It's already implied that you can view them by the fact that you can (HTTP and all that). Restricting you after you've already done it is meaningless and therefore (in my humble opinion) invalid.

    10. Re:Permission... by rector · · Score: 1

      But do you go to the pages directly? Or through the interface of the Wayback Machine?
      It is the same as when you distribute a copy of a movie and place your own commercial at the beginning. Of course, there are no commercials that interrupt the movie. But first you watch the beginning of the tape. And only after that do you try to skip the commercials and go to the beginning of the movie.

    11. Re:Permission... by gorf · · Score: 2

      That's interesting. My stance on that would be that the publisher should have considered that it had been on the internet, therefore publicly disseminated and thus impossible to prevent the article from being reproduced elsewhere when they made any agreement to transfer ownership of copyright.

    12. Re:Permission... by rector · · Score: 1

      In such a case, if you borrow a book from the library, why not make a photocopy? Copying the film is the same. Both usually violate the copyright. (Some books allow copying.)

    13. Re:Permission... by rector · · Score: 1

      Imagine a writer published a book. It became a bestseller. Then he sells the copyright to someone. The new copyright owner can then, say, publish N more copies and sell them profitably. The case with the article is the same. At least legally there is nothing wrong. And the text of a publicly disseminated article benefits from readers' comments and suggestions until it is ready for publication. Moreover, some people still keep "early editions" of articles on their websites. The copyright concerns the text of the articles but not the scientific achievements. So, it is legal as long as there is no textual coincidence (in whole or in part).

    14. Re:Permission... by gorf · · Score: 2

      Because you paid for the book (the library did on your behalf) and the same for a film. The owner of the copyright didn't implicitly give you permission to do anything, because in the case of a film you paid for the privilege to see it, and for a book your library did.

    15. Re:Permission... by gorf · · Score: 2

      Yes, but say a writer published a book, owned the copyright and gave permission in the book for anyone to freely make copies, but then sold the copyright to the publisher. The publisher can't then prevent me from freely making copies if I'd bought the original publication of the book. What I'm saying is that publishing something on the internet for anyone to download for free implies permission to freely make copies.

      Like you say, it's the text that's copyrighted and not the content, so that whole (other) thread is irrelevant.

    16. Re:Permission... by budgenator · · Score: 2

      Yes, my original banners are there, and yes, if someone clicked them we would make money (if the banner's company wasn't bankrupt, that is), so technically we could potentially get paid for content that is no longer available on our site.

      A bit off topic, but here goes: why don't the TV ad sponsors negotiate revenues for time-shifted programming?

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    17. Re:Permission... by charon_on_acheron · · Score: 1

      Flawed example. You normally have to pay to see a movie, either in a theater, or by buying or renting the VHS or DVD release. They are not publicly shown in the town square, free for anyone to watch or record at their leisure.

      Websites on the other hand, are often free, and encourage people to visit many pages. If these pages are retained on the visitor's computer, that is the nature and intent of the WorldWideWeb.

      Other arguments may survive more scrutiny, but that line of comparison is a dead-end.

    18. Re:Permission... by elmegil · · Score: 1

      Are you paying attention, or are you a congenital idiot? The only banners are the original banners, if any. That means the Wayback folks aren't adding banners to make their own money.

      --
      7 November 2006: The day Americans realized corruption and incompetence weren't addressing 11 September 2001
    19. Re:Permission... by Anonymous Coward · · Score: 0

      >Sure, you own copyright, and blatant plagiarism is something that clearly is wrong. But I see nothing wrong with taking an article that you published on the web and reproducing it, as long as it is taken in context and is clearly attributed (and it is made obvious that the copy isn't the original, but proper attribution would do this and therefore suffice).

      The fact that you see nothing wrong with it only shows that you and a lot of other people here have no understanding of Copyright law at all.

    20. Re:Permission... by gorf · · Score: 2

      Who said anything about Copyright Law? I'm talking about what is right and wrong, not what the law says you should and should not do. Have you not noticed in your infinite wisdom that there is actually a difference?

      There is a very good reason that copyright law exists; it's there to promote production of works by giving a financial incentive to do so. If someone is publishing something on the internet, then in my view he is effectively saying that he has more interest in dissemination of what he has to say than to make money, and therefore permission to copy verbatim is implied. If there are ads on the web page that he uses to fund himself, then that's fine; they'll get copied just as well as the other stuff.

  10. Friend or Foe? Hmmmm... by Navaash+Fenwylde · · Score: 1

    If I choose Friend, I can get half or none of the Wayback Machine's content...

    but if I choose Foe, I can get all or none of its content?

    Better choose Foe.

  11. Legally you can stop them, but why? by the_womble · · Score: 3, Informative
    If you own the copyright they can not archive it without your permission, legally, that is all there is to it.

    Of course in practice you have to pursue this and ask them to remove it.

    If you really object I suggest a list of every site you have or have had and dates with a request to remove everything. Then you only need to notify them when you put up a new site that should also be excluded. That would not be such a nuisance, would it?

    That said I think they are providing a service that is interesting so unless you are harmed by it, why object?

    I am interested in knowing how they had such old versions of your site though. Do search engines keep archives?

    1. Re:Legally you can stop them, but why? by iammichael · · Score: 1
      If you really object I suggest a list of every site you have or have had and dates with a request to remove everything. Then you only need to notify them when you put up a new site that should also be excluded. That would not be such a nuisance, would it?
      Well, if you only consider the one archive, maybe not. But how do you know there aren't others now or in the future? What happens when every medium-to-large company starts maintaining one to get "background" information on job applicants? Talk about a nuisance.
    2. Re:Legally you can stop them, but why? by zangdesign · · Score: 2

      Nuisance, but not illegal. That is actually a good idea from a personal responsibility standpoint. Are you still willing to stand by words you spoke many years ago? If not, why?

      Who knows? It might actually get people talking for a change.

      --
      To celebrate the occasion of my 1000th post, I will post no more forever on Slashdot. Goodbye.
    3. Re:Legally you can stop them, but why? by poot_rootbeer · · Score: 2

      If you own the copyright they can not archive it without your permission, legally, that is all there is to it.


      So if I want to establish an old-fashioned library full of books that are made of paper, I can't do so until I get permission from the author/publishers of every single book in the building?

      The issue is not as cut-and-dried as you represent it to be.

    4. Re:Legally you can stop them, but why? by Anonymous Coward · · Score: 0

      > So if I want to establish an old-fashioned library full of books that are made of paper, I can't do so until I get permission from the author/publishers of every single book in the building?
      > The issue is not as cut-and-dried as you represent it to be.

      Yes, it is.

      Libraries don't archive copies that they themselves have republished, they archive originally published works. No copyright involved.

      TWM, by necessity, archives copies of the originals - and copyrights to the originals of the story here were expressly reserved to the original site.

    5. Re:Legally you can stop them, but why? by hymie3 · · Score: 2

      Nuisance, but not illegal. That is actually a good idea from a personal responsibility standpoint. Are you still willing to stand by words you spoke many years ago? If not, why?

      How old are you? I'll be 29 this year. I've been on the internet since I was 18. My first two or three years saw me posting quite a bit of offensive/tasteless stuff. At the time, I had a reasonable expectation to not have my words archived for my great-great-grandchildren to read.

      Somehow, jokes about Roland De Graaf having sex with Chelsea Clinton in the back row during the premiere of Jurassic Park seem a lot less funny now.

      Anyhow, my copyrights are being violated. I don't have to opt-in to be granted copyright. The mere act of authoring grants implicit copyright under the Berne convention (US signed on in 1989, which covers all of my web sites *and* gopher sites). Where's my satisfaction?

    6. Re:Legally you can stop them, but why? by zangdesign · · Score: 2

      I will be 35. I've been on some form or another of the internet since 1986. Which hardly invalidates your complaint, of course. I have probably made an equal amount or more of inconsiderate, illiterate, or even downright stupid comments online, so I do not relish the idea of those things being resurrected.

      However, the question arises: which weighs more - historical record or personal rights?

      We have no way of determining, at this time, what may be historically relevant in the future. If we have the means to archive these things for historical purposes it behooves us to do so for future generations to study and pull some pearls of wisdom or knowledge therefrom (perhaps: "Don't do that" might have more meaning because of your statements).

      The question of copyright is a specious argument, in my opinion. There is no violation of copyright, since your words have not been altered, no profit is gained by making a copy of your words, and, in fact, no readily apparent benefit has yet arisen from having archived your ill-considered words. Any benefit exists, in potentia, but one can hardly mortgage a potential metaphorical benefit for a gain now. In fact, one can use the same argument that a library would use for keeping books on the shelf. The author may retain the copyright, but the library has the right to take publicly available material and make it available to others. The potential public benefit outweighs the needs of the individual in that case.

      I would say that you fail, or at one time did fail, to understand the nature of digital media, in that it is intended to be a permanent medium, one certainly more permanent than paper storage. The truth of this argument falls far short of the dream in most cases, but advances are still being made.

      As for your "satisfaction", what satisfaction do you demand? Without a clear definition of your "satisfaction", your moral outrage is mere noise, signifying nothing. Do you demand that your websites and gopher sites be removed from the Wayback Machine? The Wayback Machine makes no attempt to usurp your copyright. You still retain all rights to your material, including the right of removal if you so desire.

      I suggest that you start trying to figure out how to explain to your great-great-grandchildren just who the heck Roland De Graaf is.

      And probably Chelsea Clinton.

      --
      To celebrate the occasion of my 1000th post, I will post no more forever on Slashdot. Goodbye.
  12. The story should read 'since 1996' by forged · · Score: 2

    www.cisco.com, 1 page (1996)
    www.microsoft.com, 5 pages (1996)
    www.ibm.com, 7 pages (1996)

    This is in the FAQ.

  13. As an creator... by Bonker · · Score: 2

    As someone who makes lots of free sellable and unsellable [http://www.furinkan.net/fanfic/] content, I think The Wayback Machine is an invaluable resource. I can look back and see how big a dork I was and still am. I've also found stuff of mine that I've lost over time, amazed that anyone ever bothered to hold on to it.

    --
    The next Slashdot story will be ready soon, but subscribers can beat the rush and slashdot the links early!
    1. Re:As an creator... by rknop · · Score: 2

      I've also found stuff of mine that I've lost over time, amazed that anyone ever bothered to hold on to it.

      Yes, I've used it for this too. I'm a volunteer webmaster for a site (www.fudgerpg.com) where we have a "monthly spotlight", but foolishly I wasn't keeping track of past spotlights. Eventually I wanted to put together a list of past spotlights, and realized that I hadn't kept that list. I felt stupid. The Wayback Machine (mostly) came to my rescue there.

      -Rob

    2. Re:As an creator... by Paradoxish · · Score: 1

      I agree that the Wayback Machine is pretty cool, although I don't think I'd call it "invaluable" or even a "resource". But it is very nice to go back and see old websites of mine stored there. Websites that I had taken down a looong time ago but are still being preserved in one way or another. Ultimately, if the Wayback Machine manages to last for long enough it'll be a way for anyone that has ever put content on the web to have something they created stored and available.

      --
      If you need to interpret my post, then you don't get it.
  14. Ah, Gee! by Dark+Paladin · · Score: 2, Funny

    Sherman: Mr. Peabody, I want to go back in time!

    Mr. Peabody: Be quite, Sherman. This new Wayback Machine is now accessable via a browser. Be happy with that.

    Sherman: But I wanted to go back in time and watch Cleopatra taking one of those milk baths again.

    Mr. Peabody: .... Damn it, boy, fire up the Wayback machine. And fetch me my chew toy.

    1. Re:Ah, Gee! by Anonymous Coward · · Score: 0

      Maybe Mr. Peabody can buy Sherman a spellchecker.

  15. Even better.... by sheepab · · Score: 1

    This is just....mind blowing. Look at Ebay from 1997.

    1. Re:Even better.... by WEFUNK · · Score: 2

      This is just....mind blowing. Look at Ebay from 1997 [archive.org].

      You fool! You've just Slashdotted Ebay!

      I think we've also taken out Slashdot, and we're probably on our way to taking out the whole damn history of the internet. It's one thing to knock out somebody's geocities account or web serving PDA, but the Slashdot effect has finally gone totally out of control!

      --
      My next sig will be ready soon, but friends can beat the rush!
  16. Who DOES have permission to copy your site? by allism · · Score: 3, Insightful

    Do I have permission to copy the content of your site to my browser history directory, and if so, how long do I have permission to keep it? Can I show a copy of an html document that is stored in my browser history to my mother? What about my neighbor? Or the dude in another country I happen to be chatting with online?

    IANAL blah blah blah, but once you open your files up to being downloaded and stored by a browser, you've pretty much given up the right to tell people they can't be re-distributed--I would think the best you could hope for is that people would re-distribute them, in whole, the way you originally released them.

    1. Re:Who DOES have permission to copy your site? by Pseudonym · · Score: 2

      To answer your question: You have always had fair use rights. Plus, thanks to the DMCA[1], you now have proxying and caching rights. You have never had republishing rights except where explicitly granted. Indeed, on most corporate web sites that I've seen, republishing is explicitly disallowed in the relevant corporate disclaimers.

      IANAL either, but lawyers and courts are usually not impressed by the "slippery slope" arguments that we geeks (I do include myself) usually come up with in these situations.

      [1] You heard that right, by the way. If it weren't for the constitution-violating Chapter 12, the DMCA would actually be a pretty good law. It has some lovely shiny new rights for Americans to enjoy. This is one of them.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
  17. I like it but... by rknop · · Score: 4, Insightful

    When I first discovered it, it was a lot of fun. Much nostalgia; it was fun seeing earlier versions of my webpages. Some go back quite a number of years.

    On the other hand, I was horrified when I realized that there was full archiving of www.dramex.org. If you visit that site, you will see that there are a large number of scripts (as in plays), many of which have restrictions on use. Over the years, we've had people request that scripts be removed from the site; of course, we did so. However, they weren't necessarily removed from the archive, and an archive keeps them forever. Specifically with the wayback machine, I was able to submit stuff that removed the specific directories I was worried about (they don't archive the scripts from www.dramex.org, just the "front page" stuff which is all part of the fun), and keep them from doing it again.

    I like the idea of archives; it preserves history. The web is a transient medium, but not entirely. Yes, much of the content is dynamic and should only be dynamic. Some of it, though, is like the front page of a newspaper. Each day, what's on "today's front page" is different-- but there is value and use in seeing what was on the front page in any day in history.

    But sometimes you need to delete something and make sure it really is no longer available. When you don't completely control your site (i.e. somebody else archives it, rather than just mirrors it), that becomes impossible.

    (Incremental backups can have a similar issue. If you only back up files which are "newer than the last backup", your backup doesn't have the information about files which have been *deleted* since the last backup. When you restore, you might find some files there you thought shouldn't exist any more.)
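
    (To make the backup point concrete, here is a minimal sketch in Python. It models a filesystem as a dict of name -> (mtime, contents); the file names, timestamps, and both helper functions are made up for illustration and are not how dramex.org or any real backup tool works.)

```python
def incremental_backup(live, last_backup_time):
    """Copy only the files modified after the previous backup."""
    return {name: entry for name, entry in live.items()
            if entry[0] > last_backup_time}


def restore(full_backup, increments):
    """Replay the full backup, then each increment on top of it, oldest first."""
    restored = dict(full_backup)
    for increment in increments:
        restored.update(increment)
    return restored


# Time 1: full backup of a site with two files (hypothetical names).
live = {"index.html": (1, "front page v1"), "play.txt": (1, "restricted script")}
full = dict(live)

# Time 2: the author asks for play.txt to be removed; index.html is edited.
del live["play.txt"]
live["index.html"] = (2, "front page v2")
increment = incremental_backup(live, last_backup_time=1)

# The increment only knows about newer files, so the deletion is never recorded.
restored = restore(full, [increment])
print(sorted(restored))
```

    The restored set still contains play.txt even though it was deleted from the live site, because nothing in a "newer than" increment says a file went away. An archive of a web site has the same blind spot.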

    (Dramex.org has changed so that it's not straightforward to get directly to the scripts any more. META tags tell the search engines to leave the actual scripts alone, and you can only get the text itself via CGI. Yes, it's easy to subvert if you put your mind to it, but at least you do have to put your mind to it, and automated search engines or archivers won't. 90% of the security for 1% of the effort.)

    -Rob

    1. Re:I like it but... by Anonymous Coward · · Score: 0

      You still have the same sort of problem with newspapers and the people and libraries that keep them forever. A paper can print a story that is just WRONG. It is asked to retract the story and it does. But what stops anyone from taking that original story out of context and attributing it to the newspaper? Not much. It's very hard to put the toothpaste back into the tube. And really, you do not have much to worry about. They have taken a copy of the stuff you provided for free and published it. They are taking responsibility for that, not you. You have made a good-faith effort to rid your site of things like that. There is not much you can do except ask them to remove either the content or the whole cache of your site. They would probably be willing to do either. It really is their problem now, not yours. Don't sweat it.

  18. its a good thing... by negativethirsty · · Score: 1

    " Perhaps I'm too much of a purist, but I've always seen the internet as an ever-changing medium, not a permanent one."

    If you don't have a record of what something was before, how do you know it's changed?

    Personally I love seeing older versions of previous work and watching the trends in web development as they progress.

    --

    thirsty*i^2

    "Ya I finished that last week, it just doesn't work"
  19. I love it. by gripdamage · · Score: 3, Informative

    What's the problem?

    If you do something illegal on your website, you won't be held responsible more than once just because the data persists on the Wayback machine. If you remove the offensive material from your site, that's all you can do. The Wayback machine can deal with their own lawsuit threats. And I'm sure they'll remove material if you are the site owner and ask nicely.

    As far as outdated information, anyone reading pages on the wayback machine and expecting them to be current would have to be crazy. It's an archive after all.

    It's easy to opt out. Google provides instructions in their webmaster FAQ, which points out "There is a standard for robot exclusion at http://www.robotstxt.org/wc/norobots.html."

  20. As a webmaster of various sites... by schon · · Score: 5, Insightful

    As a webmaster of various sites, I have no problem with archives.. if I didn't want people to see my stuff, I wouldn't have put it on the internet in the first place.

    where did they get such old copies of my websites, and who gave them permission to make those copies?

    They probably got the copies the same way everybody else did - by surfing. You (implicitly) gave them permission to cache your sites by not including an appropriate entry in your robots.txt.

    The way I see it, archives are much like SPAM; I never opted in, why should it be my responsibility to opt out?

    Archives are nothing like spam. Spam is primarily harassment. These guys aren't harassing you. They did ask your permission (by way of checking your robots.txt). If you've since changed your mind, it's your responsibility to notify them.

    Google caches material too - do you consider them to be spam as well?

    Archive sites provide a valuable resource to the rest of the 'net. If you don't like it, put an appropriate entry in your robots.txt file, and be done with it.
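
    For anyone wondering what an "appropriate entry" looks like, here is a minimal sketch using Python's standard urllib.robotparser to show how a polite crawler would read one. The robots.txt contents, the example.com URLs, and the SomeSearchBot name are invented for illustration; ia_archiver is, as far as I recall, the agent name the Internet Archive's FAQ tells you to exclude, so treat it as an assumption and check their current instructions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut one archiver out entirely, and keep
# every other well-behaved robot out of /scripts/ only.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /scripts/
"""


def may_fetch(agent, url, robots_txt=ROBOTS_TXT):
    """Answer the question a polite crawler asks before requesting a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Show the decision for each agent and path combination.
for agent in ("ia_archiver", "SomeSearchBot"):
    for path in ("/index.html", "/scripts/play.txt"):
        print(agent, path, may_fetch(agent, "http://www.example.com" + path))
```

    Note that the check runs on the crawler's side. The file is a request, not a lock: a robot that never reads it is not stopped by it.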

    1. Re:As a webmaster of various sites... by Anonymous Coward · · Score: 0

      You (implicitly) gave them permission to cache your sites by not including an appropriate entry in your robots.txt.

      Yes, and I specifically denied them permission to redistribute my intellectual property when I wrote "copyright XXXX, by YYYY. All rights reserved."

    2. Re:As a webmaster of various sites... by schon · · Score: 1

      Yes, and I specifically denied them permission to redistribute my intellectual property when I wrote "copyright XXXX, by YYYY. All rights reserved."

      Well obviously they didn't see that - so why don't you call them and get them to remove it, instead of bitching about it here on /.?

    3. Re:As a webmaster of various sites... by Anonymous Coward · · Score: 0

      Google caches are a bit different - they're caching info available now, or in the recent past. You can be reasonably sure anything you put on the web now is going to hang around, but I for one didn't realise it 4/5 years ago. Not that I've said anything too horrendous, but I'm not entirely happy with it. These things should be opt-in, at least when they're being retrospective about caching pages.

    4. Re:As a webmaster of various sites... by Twylite · · Score: 2

      It should also be pointed out that in most countries it is a legal requirement that a number of copies of all publications be lodged with the central/national library. Because of the ad hoc nature of Internet publication (which means ANY web site) this is largely overlooked or ignored.

      Archive sites provide a facility which can be equated to a public library. Once you have published material publicly, you have no right to demand that it cannot be presented in a public library.

      --
      i-name =twylite [http://public.xdi.org/=twylite], see idcommons.net
    5. Re:As a webmaster of various sites... by MrFredBloggs · · Score: 2

      Is there a sensible comment anywhere in this story giving reasons why stuff *shouldn't* be archived? I can see sensible pro-archiving comments, and anti-archiving whining, but is there a rational argument against archiving freely accessible information?

  21. Can libraries keep old newspapers? by cperciva · · Score: 2

    The submitter states that he never gave the Internet Archive permission to replicate his work. He is wrong.

    By placing material on the web, one is implicitly granting permission for it to be read. If I put a poster up in my window, I lose the right to complain if someone walking by on the street reads it.

    Equally, I lose the right to complain if someone walks by and takes a photograph of the front of my house, including the poster. The fact that someone might then be able to read the poster ten years from now is irrelevant.

    If the Internet Archive were required to seek permission before archiving freely and publicly available material, then the same argument would require libraries to seek permission prior to archiving (free) newspapers.

    Timeshifting is fair use, and it applies to web pages just as well as TV signals.

    1. Re:Can libraries keep old newspapers? by Anonymous Coward · · Score: 0

      > By placing material on the web, one is implicitly granting permission for it to be read. If I put a poster up in my window, I lose the right to complain if someone walking by on the street reads it.

      Reading != copying; note the complaint was about COPYright, not "readright". He's objecting to the archive having an unauthorized copy of his site, in violation of his explicit statement that he granted no such right.

      Yes, it may be impractical to enforce, except via conventions like robots.txt, but at least argue the correct issue.

    2. Re:Can libraries keep old newspapers? by Jack9 · · Score: 1

      And archiving old webpages != (reading || copying)
      It is an undefined area and I'm going to use the awful SHOULD word...
      This type of archiving SHOULD be considered what it is, archiving information. The comparison between webpages and newspapers and bulletin boards applies here. You can archive papers for viewing, why did this poster submit a flame about webpages?

      --

      Often wrong but never in doubt.
      I am Jack9.
      Everyone knows me.
  22. Quit simply, without Google ... by Vicegrip · · Score: 2

    I would never have visited countless sites I regularly surf to. Google has definitely been a major gateway to the internet for me.

    I think making an issue of the caching is a moot point, as about 99% of the time I go to the website for the content, since the source is always better than the cache. I use the cache only in cases when the content has disappeared or, in some cases, when the website itself is gone.

    This is a valuable service Google is providing-- and webmasters get it for free.

    --
    Do not spread "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" over the internet, thank you.
    1. Re:Quit simply, without Google ... by MushMouth · · Score: 1

      The Wayback Machine has nothing to do with Google.

    2. Re:Quit simply, without Google ... by Vicegrip · · Score: 2

      Perhaps, but the article did. Try reading it again.

      --
      Do not spread "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" over the internet, thank you.
  23. Preserving information is important. by Chiasmus_ · · Score: 5, Insightful

    I doubt that I'm alone in my belief that it is always tragic when any piece of information--no matter how trivial--is lost forever.

    If a person has offered that information for free at any point, to the extent that an automated script could access it, then I believe that information can be safely considered public domain. I doubt that there's any mechanism by which Richard M. Stallman could lose his mind and "rein in" all copies of GNU, or by which Stephen King could recall all his novels and refund the purchase price; once something is offered to the public, it no longer belongs exclusively to the publisher.

    In my opinion, the value of archives in the future immeasurably outweighs occasional inconveniences of having information stick around longer than the author would have wished.

    --
    "Beware he who would deny you access to information, for in his heart he deems himself your master."
    1. Re:Preserving information is important. by quintessent · · Score: 2

      Throwing out so much dated information would mean discarding a critical part of our written history. Did you notice how the multitude of Y2K disaster sites changed from 1999 to 2000? That is history.

      If the courts are going to outlaw archives of the Internet, I suggest they do a complete job of suppression and order the burning of all books, newspapers, and magazines more than a year old.

      Then authors will be free to rewrite history as they wish.

    2. Re:Preserving information is important. by Anonymous Coward · · Score: 0

      Whenever an elderly person dies a library burns.

      - dunno where I heard it.

    3. Re:Preserving information is important. by ImaLamer · · Score: 2

      This is one point I agree with 100%.

      Maybe some sites aren't meant to be "archived", no matter how cool it would be to see sites such as Yahoo '96 again.

      There are sites however out there with good information that should be available forever. This is our history folks! One of the advantages of the Web and the Internet in general is access to data. If archive.org wants to be the ones who house this data let us praise them!

      I think they were smart by doing this. Now we don't have to rely on nightly news (or other horrible, skewed sources) to replay their tapes or interviews. Now I don't have to save every newspaper I get.

      Archive.org demonstrates what the web should be!

      Also, check out their movies section. The greatest.

      Of course, you can opt out with robots.txt (as said above; they won't delete their copy, just exclude it from the site).

  24. It has its uses. by Helmholtz+Coil · · Score: 1

    I like it...I'm just the latest in a long line of webmasters for the site I run, my boss ran it before me. I will gleefully pull out his work for him anytime he gripes about the current incarnation. :)

  25. err okay... by NanoGator · · Score: 2

    "I can't help but wonder: where did they get such old copies of my websites, and who gave them permission to make those copies?"

    You sound like Television broadcasters when you say something like that. "We'll broadcast content over the airwaves, but you better not capture it!"

    Well, let me make it simple for you: When you make something public you cannot expect to bottle it up later. That's the whole reason that the internet is in existence: Extreme redundancy so that data is never lost. The original idea was to build a data network that could survive a nuclear attack.

    I don't think anybody should ever post stuff on the web without expecting it to last forever in some form or another, regardless of whether permission is granted.

    --
    "Derp de derp."
  26. How so? by SkyLeach · · Score: 2

    "Of course, the issue that may bug many content providers is how to opt-out of such services, since some see it as a copyright violation."

    So I need to burn all my old comics? Or perhaps I don't need to ever allow anybody to look at them?

    Caches aren't republishing information, they are archiving it. That's what libraries do too. Hell, they can even charge for the service if they want and still be in the moral right.

    --
    My $0.02 will always be worth more than your €0.02, so :-p
    1. Re:How so? by Anonymous Coward · · Score: 0

      So I need to burn all my old comics? Or perhaps I don't need to ever allow anybody to look at them?

      No, but you can't take them down to the local Kinko's and pass out free photocopies of them.

    2. Re:How so? by Anonymous Coward · · Score: 0

      > Caches aren't republishing information, they are archiving it. That's what libraries do too. Hell, they can even charge for the service if they want and still be in the moral right.

      Tough to say; libraries aren't archiving copies of the books, they're archiving the originally published books. No copyright violations involved.

      The Wayback Machine, by definition and necessity, is archiving copies they've made, ie republishing, the original works. So no, this isn't what libraries do unless your library has made copies of original books and is loaning out the copies.

  27. Excellent idea by synthox · · Score: 1

    I myself am a fan of the Wayback Machine. I really like to see snapshots of how my sites and some of my favorite websites have evolved over the years. I would also like to think that I could actually show my grandchildren what the internet was like in my prime instead of saying "back in my day we read Slashdot and we liked it, now pass me my teeth".

    --
    ~~Some people never go crazy what truly horrible lives they must lead.~~ Charles Bukowski
  28. Fork over your caches by Eponymous,+Showered · · Score: 3, Funny

    I browsed all of your sites (even the abandoned ones) and since my browser cache is set to 782TB (and I'm still running Netscape 1.0N), your sites are still there. And my cache is publicly accessible via my webserver. Yet another way you're being violated. Ah, the risks and perils of publishing on a public network.

    1. Re:Fork over your caches by quantum+bit · · Score: 2

      (and I'm still running Netscape 1.0N)

      And it hasn't crashed yet...?

    2. Re:Fork over your caches by Anonymous Coward · · Score: 0

      I don't think Netscape ever figured out quite how to make it not do that.

  29. awwwwww by red_five_standing_by · · Score: 0

    how cute...baby slashdot...

  30. Archives need to be made by Waffle+Iron · · Score: 4, Insightful

    If the courts determine that it is technically illegal to make archives of electronic content, then the copyright laws should be changed to explicitly allow archiving. Otherwise, we could eventually lose track of history. The only written record of large portions of our civilization would be relegated to a few rusting web server hard drives buried in landfills.

    If you read 1984, you might remember that the government tightly controlled all old copies of documents so that they could manipulate history as they wished. We might get into a similar situation by accident if we don't allow independent archives of electronic information.

    With traditional media, you publish something on paper, but you don't get to control who puts the paper copies in which archives. That has served us well for keeping track of history, and an equivalent system needs to be maintained for electronic content.

    1. Re:Archives need to be made by Target+Drone · · Score: 2

      Reminds me of the article about Online News Stories that Change Behind Your Back. Granted it's nothing like what the Ministry of Truth did in 1984 but it's still scary that news agencies have actually taken that first step of changing a news story.

    2. Re:Archives need to be made by xactoguy · · Score: 1

      By accident? Do you really think that if we sit here and don't do anything, we are going to lose track of history because it was an accident?

      - I think that if anything, we are going to lose track of history, not because we accidentally caused it to happen, but because we are too lazy to care, sitting on our rears in front of our computers, televisions, etc... and because the government said "Well, they don't care, and it is to our advantage to control the past, so why not?"

      - Things like the Wayback Machine are a definite help in preventing this, and it is a good thing that someone took the initiative to get something like this started.

      --


      And so we go, on with our lives
      We know the truth, but prefer lies
      Lies are simple, simple is bliss
    3. Re:Archives need to be made by Anonymous Coward · · Score: 0

      If the courts determine that it is technically illegal to make archives...

      Sorry to break your logic bubble, but the courts rule on law, not on the technical ability to create something like an archive. Therefore your argument is, well, trash, because it's not well reasoned or thought out.

  31. A Real World Example/Question by GeekLife.com · · Score: 2, Insightful

    Do libraries have to get permission to save and allow browsing of copies of newspapers (both physical and microfiche)?

  32. Copyright must die! There is no such right by WetCat · · Score: 0, Troll

    We have right to live, feed, have children, work, be under cover. We have no right to copy. Copying is free! And don't restrict rights of other to access to information, please!
    (yes I know about to copy and copyright)

    1. Re:Copyright must die! There is no such right by Chiasmus_ · · Score: 2

      According to Locke, the "natural rights" of man are life, liberty, and the ability to own property; when you enter into a society, you turn over all those rights to the State in return for whatever rights it deems fit to grant you.

      Thus, no one has the right to eat, have children, work, or be sheltered, unless their government sees fit to grant those rights. Certainly, America does not acknowledge a right to be employed or to eat; in fact, it's been known to blacklist people in the hope that they'll do neither.

      And no, no society I'm aware of has ever given its citizens the right to copy information indiscriminately. Personally, I would love to see a society do so, because I suspect that such a society would actually probably end up richer in technology and culture. Both sides of the argument make some sense, but only one is actually tried, and it's apparent that excessively restrictive copyright laws actually retard cultural and economic growth. But, no, as it stands, society has deemed that the exclusive right to copy a piece of work is something a government can hand out.

      --
      "Beware he who would deny you access to information, for in his heart he deems himself your master."
  33. And what to do when info must die? by Nf1nk · · Score: 2, Insightful

    For the most part I don't have a problem with them archiving my sites (after all, they can show me what a site used to look like faster than I can dig out my backups), but recently one of my customers told me to remove all traces of a product from their site (something about nasty litigation). I pulled the info off our servers quickly, but three hours later I got a nasty phone call from the customer saying he could still see the product on the site. It seems it was hung up in some proxy server between here and there.

    Back to the point: how do you deal with an archive when you need to get rid of information that is a liability to you now? Maybe we are better off without them in some cases.

    --
    I used to have a cool sig, back when I cared
    1. Re:And what to do when info must die? by crath · · Score: 1

      how do you deal with an archive when you need to get rid of information that is a liability to you now

      The first rule of email is: write every email assuming that everyone in the world will eventually read it.

      The first rule of web-posting is not too dissimilar: post every document/picture/program/you-name-it assuming that it will always be readable by everyone.

      Anyone who ignores these rules will suffer well deserved consequences. If you don't want your content cached, copied, archived, printed, et al then don't post it in the first place.

    2. Re:And what to do when info must die? by Anonymous Coward · · Score: 0

      Use No-Cache meta tags if you have to. Also, any decent proxy runs a "has this page changed" request to the server even when serving a cached copy.
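
      A rough sketch of that "has this page changed" check, as a toy model in Python rather than real proxy code. The Origin and Proxy classes, the page text, and the revalidate switch are all hypothetical; the only real mechanism borrowed is the HTTP idea of sending back the page's last-modified date and getting a 304 when nothing is newer.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


class Origin:
    """Toy origin server holding one page and its last-modified time."""

    def __init__(self, body, modified):
        self.body, self.modified = body, modified

    def get(self, if_modified_since=None):
        # Conditional request: answer 304 if the page is no newer than the date sent.
        if if_modified_since is not None:
            if self.modified <= parsedate_to_datetime(if_modified_since):
                return 304, None, None
        return 200, self.body, format_datetime(self.modified, usegmt=True)


class Proxy:
    """Toy caching proxy; revalidate=False models one tuned to skip the check."""

    def __init__(self, origin, revalidate=True):
        self.origin, self.revalidate = origin, revalidate
        self.body = self.stamp = None

    def get(self):
        if self.body is not None and not self.revalidate:
            return self.body  # serve the cached copy without asking the origin
        status, body, stamp = self.origin.get(self.stamp)
        if status == 200:  # first fetch, or the page changed: refresh the cache
            self.body, self.stamp = body, stamp
        return self.body


# Both proxies cache the page, then the site owner replaces it.
origin = Origin("product page", datetime(2002, 1, 1, tzinfo=timezone.utc))
checking, lazy = Proxy(origin), Proxy(origin, revalidate=False)
first = (checking.get(), lazy.get())
origin.body, origin.modified = "page removed", datetime(2002, 1, 2, tzinfo=timezone.utc)
after = (checking.get(), lazy.get())
print(first, after)
```

      The proxy that revalidates picks up the removal on the next request; the one that skips the check keeps handing out the old product page until its cache is cleared.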

    3. Re:And what to do when info must die? by crath · · Score: 1

      ...any decent proxy runs a "has this page changed" request...

      Actually, most decent proxy servers allow themselves to be tuned such that they do not perform any checks. If you're on a very slow dial-up connection, this can be a life saver.

    4. Re:And what to do when info must die? by Anonymous Coward · · Score: 1, Insightful


      Let me paraphrase:

      "Archives make it harder to sweep nasty secrets under the rug"

      And that is bad how?

  34. it's even... by gabvalois · · Score: 1

    It's even as slow as it was back then!

  35. Friend to Hosting Companies by Da+J+Rob · · Score: 5, Funny

    I was talking to this guy who works for a web hosting company, and he says a fourth of his sales calls are people calling him up 'cause they're pissed that their last hosting company 'lost' their site. (In reality, most of the time it's later found out that the guy deleted it himself or renamed index.html to index2.html, etc.) He says that for 90% of the sites he can find a copy on the Wayback Machine. He'll then start to quote the website's contents to the guy on the phone and usually will have the amazed (and dumbfounded) customer signing a hosting contract by the end of the day.

  36. Hah! by Anonymous Coward · · Score: 0

    Move it to Sealand or something like that, or some other country where copyrights are meaningless.

  37. actually, Fan or Freak. by sulli · · Score: 0

    depending on whether the site you had up when you were scanned is/was any good!

    --

    sulli
    RTFJ.
  38. caching proxy servers by bigpat · · Score: 1

    Caching of web pages on the internet is considered fair use and is central to the Web. Isn't this like a time-delayed caching server? This is just caching for a different purpose... and they aren't making money off of other people's content.

    1. Re:caching proxy servers by Anonymous Coward · · Score: 0

      > Caching of web pages on the internet is considered fair use and is central to the Web.

      Considered "fair use" by whom? What court decided this?..

  39. Uh, robots.txt! by Tom7 · · Score: 2

    Use robots.txt, stupido. It lets you prevent search engines from indexing and archiving your property. However, if you're that concerned about people copying your pages, you might try avoiding the internet.

    I personally love the internet archive and google's cache.

    1. Re:Uh, robots.txt! by Anonymous Coward · · Score: 0

      > Use robots.txt, stupido. It lets you prevent search engines from indexing and archiving your property. However, if you're that concerned about people copying your pages, you might try avoiding the internet.

      No, robots.txt only lets you ask search engines not to index & archive. If a given site (..like, say, a spammer's spider..) doesn't follow the convention, tough luck.

      Your last sentence is the only real remedy.

  40. The web is a public medium! by Steveftoth · · Score: 2

    The parent post said almost everything I was going to, but one thing I wanted to add: if a spider is able to get to a page at all (even one that ignores the robots protocol, which the Wayback Machine does follow), it is only seeing a public page that anyone with an internet connection can get to.

    Otherwise you have bad control over your content and need to update your web server to not serve that content. If you don't want people to be able to copy your information then don't give it to them. Or only give it to them in a signed format that cannot be easily duplicated.

    It's like being surprised that someone has forwarded an email that you sent them.

  41. robots.txt won't work by tps12 · · Score: 0

    I know everyone is going to say, "just make a robots.txt file and everything will be okay." Sadly, that is naive and incorrect. What makes you think that the people who send out 'bots looking for content (rather than create their own or use hyperlinks!) would honor such a noble convention?

    This is like trying to solve music piracy by putting a "No Napster" sticker on the jewel box. Nice thought, but it's a dead-end.

    --

    Karma: Good (despite my invention of the Karma: sig)
    1. Re:robots.txt won't work by gerardrj · · Score: 2

      BUT...
      You have to KNOW the thing exists in order to put them in your robots file.

      This means that there are MANY sites in that archive that are being captured and re-published without any knowledge by the authors.

      The robots.txt file is like making a burglar alarm that only stops people you know to rob houses. Wayback should use the robots file to only archive sites that specifically allow them to do so.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    2. Re:robots.txt won't work by MushMouth · · Score: 2

      Learn the robots.txt protocol, you can shut off all bots and only allow the ones you want by simply having

      User-agent: good_bot
      Allow: /

      User-agent: *
      Disallow: /
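
      The whitelist rules above can be sanity-checked with Python's standard urllib.robotparser. This is only a sketch: the helper name may_fetch, the bot name some_other_bot and the example.com URL are made up for illustration, and it only shows how a compliant parser reads the rules, not what any given crawler really does.

```python
from urllib.robotparser import RobotFileParser

# The rules from the comment, written one directive per line.
ROBOTS_TXT = """\
User-agent: good_bot
Allow: /

User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a rule-following crawler named user_agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.com/index.html"  # hypothetical page
    print("good_bot:", may_fetch(ROBOTS_TXT, "good_bot", url))
    print("some_other_bot:", may_fetch(ROBOTS_TXT, "some_other_bot", url))
```

      Note that robots.txt is purely advisory, so a check like this says nothing about spiders that ignore the convention.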

    3. Re:robots.txt won't work by gerardrj · · Score: 2

      I know the robots.txt, but I (along with most web publishers) have better things to do than to keep track of every web bot that may visit my site. Given how fast crawlers come and go, just keeping up with a list would probably be a daunting task.

      Maybe the robots.txt spec should have a new tag that the archive bots look for:

      archive-agent: *
      Disallow: /

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
  42. it's a good thing by red_five_standing_by · · Score: 1

    someone backed up the Internet to floppy.

  43. Euro friendly :) by Anonymous Coward · · Score: 1, Interesting

    Well, the wayback machine helped me in confronting some companies for raising their prices when we changed to the euro :)

    Especially dominio's pizza. They raised their prices more than 12%. I printed out the page and got a 15% discount :)

  44. robots.txt DUUUUUUUUHHHHHHH!!!!!!!! by jsimon12 · · Score: 2

    For such a "webMASTER" this guy doesn't seem to know a lot about the Internet; he seems more concerned with keeping his "Intellectual Property" safe than actually understanding the way things work.

    People like this ruin the concept of the Internet, the free exchange of knowledge. I hope other people on /. feel the same.

  45. Copyright and websites. by www.sorehands.com · · Score: 3, Interesting
    It could be argued that the site is publicly available and thus anyone can copy it. There is also the issue of fair use. That is why many people place terms of use and robots.txt files on their sites. It could even be a DMCA violation where an IP (or range) has been blocked, so people from that IP use the Google cache to bypass the block.


    I don't mind that my site is being added to indexes that the public have use of for free. I have a problem where a company uses my site to make a profit, with no public benefit.


    There is case law where unauthorized access to a website is a copyright violation.


    I am trying to use copyright law against some of the spammers who scrape my site for email addresses. Then, go after the spam software companies for contributory infringement (let the napster rulings serve some good).

  46. Get Used to It, please by pyrrho · · Score: 2

    I understand the concerns, but I think it's a part of the net, a good part, that we have to wrap our minds around.

    Especially when you mention Usenet archives, which are (ok, get ready to laugh) historically important. I'm not kidding! There is a little signal in there, it's a cultural brain dump, and that's of historic interest.

    I think the rub is, if the archive presents the data exactly as you presented it (that is, it doesn't play with your content, present it in a frame or otherwise embed it as their own content), then it is a fair archive, a ghost of your site still walking the internet. There is no taking it back once you post it.

    --

    -pyrrho

  47. TV Broadcast analogy by rknop · · Score: 4, Interesting

    Some have already drawn analogies to TV broadcasts, saying hey, it was broadcast, you get to keep a copy. You can't bitch now if people still have that copy, unless you're Jack Valenti.

    You can spin this how you want. Here's one valid way to think about it though: a TV network broadcasts a show. You make a private copy on a VCR tape. Jack Valenti aside, you can watch that copy again as often as you like, and it's no big deal. However, you do not have the right to rebroadcast your copy of that show to the public without the permission of the original copyright holder. (I have my B5 tapes. I'm watching them through again now, showing them to my wife. I'm sure nobody is upset about this. But I'd be in deep doo-doo if I managed to broadcast them on a local access station, or uploaded them to a public website.)

    If you are inclined to be negative about the Wayback Machine, you could view it this way. While the page existed on the original site, it was broadcast to the public. If somebody made a personal copy, they have it and will always have it, even if the site goes down. However, when the site goes down, individuals do not necessarily have the right to then "rebroadcast" (i.e. post) themselves the content they downloaded and kept. This, however, is what the WayBack machine is doing.

    Mind you, except for the issue with www.dramex.org that I noted above (and which I fixed long ago), I like the WayBack machine, and am happy that they archived the content which was implicitly copyrighted to me. I would have opted in if I had wanted to. But, of course, I didn't know about it back in 1996 to opt in.

    I don't have a good answer to the questions. Just thought.

    -Rob

    1. Re:TV Broadcast analogy by bhsx · · Score: 1

      I replied with this thought a bit earlier. Is a proxy server illegal/immoral? That's doing the exact same thing: copying copyrighted material and rebroadcasting it, it just doesn't do it over port 80.

      --
      put the what in the where?
    2. Re:TV Broadcast analogy by Anonymous Coward · · Score: 0

      Here's one valid way to think about it though: a TV network broadcasts a show. You make a private copy on a VCR tape. Jack Valenti aside, you can watch that copy again as often as you like, and it's no big deal. However, you do not have the right to rebroadcast your copy of that show to the public without the permission of the original copyright holder.


      However! There is a nonprofit called The Museum of Television and Radio, which does allow the public to view old television and radio broadcasts. I don't know if they have the permission of the copyright holders or not, but if they do, it's very similar to what the Wayback machine is attempting to do.
    3. Re:TV Broadcast analogy by Suppafly · · Score: 2

      The problem with the internet is that you can't really compare it to books. On the internet, all access to material implies redistribution.

      If you could compare it to books, you'd call the Way Back Machine a library and no one would complain, because you'd go to the library (the way back website) and view archived versions of publicly available content. In the book world, there is nothing wrong with this. If Michael Crichton sells copies of Jurassic Park and then later decides to rerelease Jurassic Park with extra chapters, the old version doesn't go away and there is nothing to keep me from freely allowing you to read my copy. The problem with the real world is that people don't see it like that; they bitch and moan because someone copied their content to another place and is distributing it from there, totally ignoring the fact that basic usage of the internet relies on the fact that information must be copied before it can be viewed (and the concept of proxy servers and the various legitimate reasons for caching content).

      There is nothing wrong with the way back machine. If people didn't want things publicly viewable forever, they shouldn't have put them on the internet. Things such as plays that people don't want to be totally publicly available need to actually have their entries protected in some way instead of expecting everyone to notice or care about a little (C) that may or may not be valid.

  48. best thing since sliced bread by John+Sokol · · Score: 2, Insightful

    There is nothing worse than revisionist history. I can't stand seeing sites that post something and a bit later have it vanish forever, or have it altered to remove the very thing I was interested in.
    There are several GPL'ed Open Source software packages that I have copies of, that have vanished with all references to them and are no longer available on the net. Also a number of great sites have come and gone for either lack of cash or time. I think if someone open sources something it should stay that way.

    Also if it's open on the net for public viewing, then it should be fair game. Especially if the original author is credited and it is in the original context, like the Wayback Machine is. I know there are always special cases where something was put up that the webmaster was not entitled to like a copyrighted book or something, but for most stuff this is invaluable and a great service to humanity.

    Also think of all those users whose web site was lost without backup. Now they can get that data back.

    The Wayback Machine is one of the few web services I'd be willing to pay for.

    John

    --
    I am always doing that which I can not do, in order that I may learn how to do it. - Pablo Picasso
    1. Re:best thing since sliced bread by dossen · · Score: 1

      If you have archives of GPL'ed software, no longer available, please setup a site with it. You have the right to distribute under the GPL, and I'm sure there are people who are interested.

  49. Re:Permission... phhtht. by Anonymous Coward · · Score: 0

    And they can bite my shiny metal ass. Especially Bloomberg.

  50. p2p by mephist01 · · Score: 1

    this reminds me alot of the old opt-in/ opt-out p2p debate.

  51. Don't publish a website ... by Anonymous Coward · · Score: 0

    Don't publish a website available to anyone on the Internet if you don't want a "snapshot" taken. I'm personally very comfortable with my work and writings being available to anyone, forever. If I wasn't I wouldn't have put them online.

  52. Wayback machine = free backups! by FamousLongAgo · · Score: 1

    I like to think of the Wayback Machine as my personal backup server.

    I just put all my most vital files in a web folder, and their crawlers take care of the rest.

    And for encryption? Two words, baby:

    ROT-13

    --

    A customer service representative will be with me shortly.
  53. Library archives are given broader copyright uses by tiltowait · · Score: 5, Informative

    .... and wayback is sponsored, amongst others, by the Library of Congress. The archive itself is a 501(c)(3) public nonprofit. See 17 U.S.C. § 108(a)(3) for more information.

    Strange that such a complaint would appear within a group espousing that "information wants to be free." :)

  54. For what it's worth... by Reality+Master+101 · · Score: 2

    What particularly interests me is the fact that the Machine is a relatively new animal, yet it contains snapshots from my sites dating back to 1998.

    Interestingly, if you look at Slashdot's earliest entry (man, that page was ugly back then!), and then look at the bottom of the page, it shows the domain that was used to pull the page: "Welcome User From firestone.alexa.com".

    Alexa.com appears to be some web search ("powered by Google") toolbar thingy. I can't determine if they are the same people as the wayback machine or not.

    --
    Sometimes it's best to just let stupid people be stupid.
    1. Re:For what it's worth... by MushMouth · · Score: 2, Informative

      Alexa does the Archive's crawl. Notice that Brewster Kahle's name is attached to both.

    2. Re:For what it's worth... by jasonkohles · · Score: 1

      Alexa.com appears to be some web search ("powered by Google") toolbar thingy. I can't determine if they are the same people as the wayback machine or not.

      And you weren't tipped off by the fact that they tell you on their front page that they are?

      pay attention to the part that says:

      The Internet Archive, working with Alexa Internet, has created the Wayback Machine.

    3. Re:For what it's worth... by Anonymous Coward · · Score: 0

      Alexa Internet is the company owned by the Brewster Kahle, who is responsible for the archive. I believe some funding for the archive comes from Alexa.

    4. Re:For what it's worth... by adolf · · Score: 2

      IIRC, Alexa is responsible for the content of Netscape's "What's Related" button, and they've apparently been taking snapshots of whatever they could for years. I seem to recall some discussion about this button, and the data-collection policies at Alexa, about the same time the button started appearing in Netscape.

      Ironically, despite archive.org's extensive cache and slashdot's search feature, I can't find it. Hrmph.

      According to whois, archive.org and alexa.com are both registered to companies in San Francisco. Additionally, the 9/11 TV news archive page, as linked from archive.org's main page, credits a number of Alexa employees in the right-hand sidebar.

      http://tvnews3.televisionarchive.org/tvarchive/html/

      I'd say they're all the same people, more or less. Different corporations, perhaps, but at least the same faces.

    5. Re:For what it's worth... by Anonymous Coward · · Score: 0

      As others have noted Alexa and the Internet Archive were both founded by Brewster Kahle some years ago. However approximately 3 years ago Amazon acquired Alexa (I'm sure you could use the WayBack machine to pull up copies of the announcements 8-).

      Since then the Internet Archive and Alexa have become more and more divorced, and today there is little or no official connection between the two - Brewster is no longer on the staff at Alexa, the archive is a registered non-profit, Alexa on the other hand as a division of Amazon is very much in the profit game (or attempting).

      The archive is now hosted on a totally separate server farm from Alexa; it's not even in the same part of town.

      Disclosure time - my wife works for the Internet Archive

  55. Purist? Pure what? by American+AC+in+Paris · · Score: 5, Insightful
    Perhaps I'm too much of a purist, but I've always seen the internet as an ever-changing medium, not a permanent one. Archives have bothered me ever since the fledgling days of DejaNews.

    I'd say it makes you more of a control freak than a purist, personally.

    Seriously, how did you ever get it into your head that a medium that serves documents to the general public on demand would be somehow exempt from archiving?

    Would it bother you if John Q. Savant could recite the contents of your web pages from memory ten years after you'd taken them down?

    Would it bother you to learn that stock prices, perhaps the most "ever-changing" thing out there, are permanently archived by a variety of services?

    Or are you just jittery at the thought that your spouse/boss/Friendly Neighborhood Representative of The Man/kids may be able to someday look at the shite you plastered all over the web in your younger days? ("Ech, that stupid Netscape 2 animated title hack--honey, you actually -did- that?")

    --

    Obliteracy: Words with explosions

  56. But!!! by www.sorehands.com · · Score: 2
    A person may take a picture of the front of your house and of you and your painting for personal use.

    Now, when that person redistributes it, then it becomes an issue of fair use, copyright and license.

    1. Re:But!!! by hymie3 · · Score: 2

      Yeah, but the wayback machine/internet archive isn't creating a derivative work--they're republishing (without my consent!) my copyrighted material. Copyright isn't opt-in, is it? If it is, I'll be adding a lot more mp3s to my archive.

  57. Definately foe by brandonsr · · Score: 1

    It's always been a monument to bad grammer and spelling on my part. So years down the road I can go back to see how terrible it was, then realize it hasn't improved one bit.

    Plus the darn thing crawls my web sites everyday.

  58. Denmark solved that problem by law by Jezral · · Score: 1

    In Denmark, it is a legal obligation to hand over a copy of any and all publicized material to the Royal Library, including anything publicized on websites, for archiving and historical services/research.

    That so few do it just indicates that nobody knows about that law.

    But, I think it's a wonderful law that there is one central place that at least tries to be complete...

    I'd like to see a similar law passed in international media, regarding services such as the WayBack Machine, so that they are not only allowed to, but required to take copies of every and all public material.

    For academic, research, history, whatever reasons...

    -- Tino Didriksen / Project JJ

    1. Re:Denmark solved that problem by law by Anonymous Coward · · Score: 0

      Yeehaw!!! A law requiring me to hand over documents and information to some government body! Yeeeeaah baby!!

  59. Microsoft.com in 1996 by dasheiff · · Score: 2

    http://web.archive.org/web/19961020014044/http://www.microsoft.com/

    Well, back in 1996 you really could win a million dollars from Bill Gates... well, at least a cruise.

    See all the exciting things happening on the Internet in Latin America, and win big prizes at the same time! Register for the first Latin American Internet Explorer Race. You'll have a great time, and perhaps even win a Caribbean cruise!

  60. aside from robots.txt by archen · · Score: 1

    For those who aren't in control of the root domain, and still want to exclude portions of their site, you can (try to) use meta-tags. No guarantees if a spider will honor it.

    <META NAME="ROBOTS" CONTENT="NOINDEX">
    <META NAME="ROBOTS" CONTENT="NOFOLLOW">
    <META NAME="ROBOTS" CONTENT="NOARCHIVE">
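
    A minimal sketch of how a well-behaved spider might read these tags, using Python's standard html.parser. The names RobotsMetaParser and robots_directives are made up for illustration, and nothing here describes how any particular crawler actually treats the directives.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # CONTENT may hold one directive or a comma-separated list.
        for token in (attr.get("content") or "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of lowercased robots directives declared in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<META NAME="ROBOTS" CONTENT="NOINDEX">'
        '<META NAME="ROBOTS" CONTENT="NOFOLLOW">'
        '<META NAME="ROBOTS" CONTENT="NOARCHIVE">'
        '</head><body>hello</body></html>'
    )
    print(sorted(robots_directives(page)))
```

    A spider that honors the tags would skip archiving when "noarchive" is in the returned set; one that doesn't honor them simply never looks.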

  61. easy to remove and stop from archiving by arson1 · · Score: 2

    robots.txt

    User-agent: ia_archiver
    Disallow: /
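
    To see who these two lines actually shut out, here is a rough check with Python's standard urllib.robotparser. The function name blocked_agents, the agent name some_browser and the example.com URL are assumptions for illustration; only ia_archiver comes from the rule above.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the comment above.
ROBOTS_TXT = "User-agent: ia_archiver\nDisallow: /\n"


def blocked_agents(robots_txt, agents, url="http://example.com/"):
    """Return the subset of agents that the rules forbid from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    # Only the archive crawler is named in the file; with no "*" entry,
    # every other agent is allowed by default.
    print(blocked_agents(ROBOTS_TXT, ["ia_archiver", "some_browser"]))
```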

    --
    Don't sweat the petty things, and don't pet the sweaty things.
  62. You have given permission by MrResistor · · Score: 4, Insightful

    By the very act of posting your site on the web you have given permission to make copies of it. Otherwise, how would anyone view it? And if no one is supposed to view it, why have you published it in a publicly accessible space?

    If I went to your website 2 years ago and never closed or refreshed that browser window, would I now be violating your copyright? What if I saved the page so I could view it later offline? What if I never erased that file, would that mean that I'm violating your copyright? I have several floppies of web sites I saved at school for viewing at home from the days when I was stuck on a crappy dial-up service. Does that make me a pirate? What about all the copies of sites held in my browsers cache?

    Don't get me wrong, I understand where the sentiment is coming from, even if I disagree with it. I'm just trying to point out how incongruous it is with the basic nature of computers and the internet and how they work.

    These questions aside, though, I have to come down in favor of the historians. People here are always whining about old movies/books/music being lost because their owners refuse to let them go, even if they aren't using them, why should the web suffer the same fate? The rate of destruction is far faster on the internet, and since it isn't a physical media, the information has to be actively archived if it is to be preserved.

    --
    Under capitalism man exploits man. Under communism it's the other way around.
    1. Re:You have given permission by MURL · · Score: 1

      Um, no.

      By posting content to the web, I have given people permission to view it.

      I have NOT given anyone permission to republish my work on another site.

      --Anton

      --
      --- Have you seen MURL?
    2. Re:You have given permission by SuiteSisterMary · · Score: 2

      So, anybody sitting behind a caching proxy...or an offline cache...is doing something you don't want them to do? Because the first, and under a strict interpretation, the second, fall under the heading 'republishing.'

      --
      Vintage computer games and RPG books available. Email me if you're interested.
    3. Re:You have given permission by illerd · · Score: 1

      If this ever gets to a judge, I don't see how he/she could distinguish between this phenomenon and rebroadcasting a television show. Eventually express written consent will be required.

    4. Re:You have given permission by gerardrj · · Score: 2

      The line gets drawn when you re-publish the web page in question.
      You are correct: by publishing a web page on a non-protected, publicly accessible server, you explicitly provide rights for your content to be viewed and retained by browsers.
      In your example, you would be wrong when you decided to later put the page you stored back on-line at something other than the original URL.

      If these people want to archive the internet, they should specifically be required to do it on an opt-in basis. You go to their site and enter the URL that you want them to keep archives of.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    5. Re:You have given permission by Teutates · · Score: 0

      But...they are keeping the original url and not republishing it as something else. Look at any URL in that site:
      http://web.archive.org/web/20010407020209/http://slashdot.org/
      for instance.

      It's keeping all the original copyright information intact, so there is little to be said about it. People should be happy that they are getting free hosting from this company.
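
      The way the archive URL wraps the original one can be sketched in a few lines of Python. This assumes, judging only from the links quoted in this thread, that the path holds a 14-digit YYYYMMDDhhmmss capture time followed by the untouched original URL; the function name parse_wayback_url is made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout: http://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_wayback_url(url):
    """Split an archive URL into (capture time, original URL)."""
    match = WAYBACK_RE.match(url)
    if match is None:
        raise ValueError("not a recognised archive URL: " + url)
    timestamp, original = match.groups()
    # The timestamp is read as year, month, day, hour, minute, second.
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S"), original


if __name__ == "__main__":
    when, original = parse_wayback_url(
        "http://web.archive.org/web/20010407020209/http://slashdot.org/"
    )
    print(when.isoformat(), original)
```

      So the original address survives as a suffix, while the address a visitor actually requests is on web.archive.org.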

    6. Re:You have given permission by version5 · · Score: 1

      Better talk to these guys then.

      --

      "It's Dot Com!"

    7. Re:You have given permission by gerardrj · · Score: 2

      Your own example shows that the URL required to access the document (now archived) has been altered.

      Doing evil in the name of good is still evil.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    8. Re:You have given permission by jdcook · · Score: 1
      "If I went to your website 2 years ago and never closed or refreshed that browser window, would I now be violating your copyright?"

      Nevermind the copyright issues. Where did you find a browser that would run for two years without crashing?

      --
      Q:How many libertarians does it take to stop a Panzer division? A:None. Obviously market forces will take care of it.
    9. Re:You have given permission by Kvan · · Score: 1

      And that is precisely why we need legislation to make this explicitly legal (and even to make it impossible for people to opt out).

      --

      "A *person* is smart. People are dumb, panicky, dangerous animals and you know it."
      - 'K' in Men in Black.

    10. Re:You have given permission by MrResistor · · Score: 2

      I have to disagree with opt-in for archiving, simply on the grounds that too much would be lost simply due to laziness. Chances are I wouldn't go out of my way to opt in to an archiving program, and I'm willing to bet that 99% of webmasters wouldn't either. If I were running the archive I probably wouldn't offer an opt-out either, but then I also probably wouldn't put my archive up on the web. It makes more sense to me to make such an archive available on some physical media such as CD or DVD.

      I could see making republishing on the web opt-in, but not the archiving itself. If you take that step, you're opening the whole browser cache can of worms, and before you know it some idiot is suing people for using caching web-proxies.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
    11. Re:You have given permission by gerardrj · · Score: 2

      But why must "everything" be preserved for posterity?

      My dog took a dump on the lawn this morning. I have no record of what the lawn looked like before the event, of me cleaning up the event, or what the lawn looked like afterward. What is the loss to society for this lack of information?

      MOST of the web is just that... someone's brain dump. Most of it has no socially or intellectually reasonable need to be archived.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    12. Re:You have given permission by MrResistor · · Score: 2

      Because it's the everyday stupid stuff that nobody thinks is important that gives true insight into a society.

      Someone reading your post knows that:

      You have a dog, and that it is probably a pet since you trust it enough for it to have the opportunity to crap on your lawn.

      You have a lawn, an area around your residence which you care for enough to clean up when your dog takes a crap on it.

      It can also be inferred that you live in a society which has domesticated animals and a concept of individual property ownership, and that your society places a value on hygiene, and most likely the appearance of cleanliness, not just of the person but of the area surrounding the person.

      The loss to our society for not having this information is nothing, since pet dogs and lawns are common in our society and we all know that. The loss is to future societies trying to understand ours. Did the Egyptians keep pet dogs? Did they have lawns? Did they clean it up when their dogs crapped on their lawns? If they did, how would we know except that someone wrote "My dog crapped on my lawn and I had to clean it up" and that writing was somehow preserved?

      Academic and philosophical writings, while arguably more worthy of preservation, generally give little or no insight into the life of the average person, and at this point in time the way the average Athenian lived and the everyday things they did, and took for granted, is of more intellectual interest to us than that Aristotle knew a few things about Geometry.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
    13. Re:You have given permission by gerardrj · · Score: 2

      But how do you know what I wrote is true? perhaps I just made it up?
      The problem with these statements and stories is that there is no way to prove them true. Hence for all we know, perhaps a future society will attempt to build a religion on my posts.

      Look what's happened with the Bible... Some lonely people wrote down versions of stories that had been passed down via word of mouth for millennia. Once written down in the collective, they were suddenly looked at as authoritative, and people began killing each other over them. The archiving of those stories has caused more grief, misguided decisions, and war than anything else.

      Archiving without related, supporting data, and without explanation of the archived documents by the authors can be a dangerous thing.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
    14. Re:You have given permission by MrResistor · · Score: 2

      Once written down in the collective, they were suddenly looked at as authoritative, and people began killing each other over them. The archiving of those stories has caused more grief, misguided decisions, and war than anything else.

      Every fabrication has a kernel of truth. Even fiction has historical value. While I agree that more suffering has been caused in the name of the Bible than any other written work, more good has been done as well. To say that the Bible advocates or encourages violence one would have to be totally ignorant of what it actually says. Jesus wasn't exactly vague on that point. To put it another way; many, many people have been killed by being hit with hammers. Would we be better off without hammer technology? No. It isn't the hammer that kills people, just as it isn't the Bible that starts wars. It all comes down to people twisting everything around them to serve their own ends. To blame the hammer, or the Bible, is to ignore the real problem.

      Archiving without related, supporting data, and without explanation of the archived documents by the authors can be a dangerous thing.

      All the more reason why we should archive everything we can. Even the things which, at face value, seem worthless or worse. Everything anyone writes down gives an insight into their thoughts, state of mind, dreams, desires, etc. When all of these things are taken together, you have a society.

      I hate white supremacists. I think the world would be much better off without them. I think they spread ignorance and lies, and I'm ashamed to be a part of the same species, let alone race, as them. But it would be impossible to understand the dynamics of the society I live in without knowing they exist and what they are about, or that there are people who oppose the beliefs they espouse.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
  63. A great tool for future historians /archeologists by msoldo · · Score: 2, Insightful

    Could you imagine if there was the equivalent of the wayback machine for everything published in 5th century Athens? We'd know an incredible amount more about where the human race had already been intellectually and where it's going.

    I publish several websites and I don't mind this a bit - If someone wants to host my content for free and offer my customers a way to get at older versions of the site for whatever reason (maybe they want to know what prices were 2 years ago), then they've done me a service. Cool.

  64. get laid by porkface · · Score: 1

    Some people worry too much. If I want information, I want to be able to find it whether someone wants to host it anymore or not. If I'm bored, I can find entertainment from the Wayback Machine. If you don't want your site to be part of the Wayback Machine, program it so that it can't be snagged by the Wayback Machine. The wayback machine will not be confused with the real thing, and since the HTML / images format of most of the www is inherently unprotectable, content owners have no claim to stop linking and caching. It was the nature of the beast they signed up for.

  65. Historical Records by JonBuck · · Score: 2, Interesting

    As a historian and future librarian, one thing has always bothered me about the Internet. Because change is a constant, it's very difficult to keep records. It isn't like newspapers, pamphlets, books, or any other form of written record of the past five thousand years. Unless they're printed out, our writings here leave no physical evidence of their existence. Because I feel that the Internet is as significant as the printing press five centuries ago, the prospect of having no records from its early days is frightening.

    We have books from five centuries ago. Will anything here still exist in a readable form five centuries from now? Unless something is done to preserve it, I feel there will be a massive gap in history.

    And this is why I do not object to web archives. They are a half step to printed and more permanent storage mediums, but preferable to nothing at all.

    1. Re:Historical Records by SimJockey · · Score: 2

      Maybe just to play devil's advocate here, but is there anything on the web that is historically significant that is not also in a more permanent (say, dead tree) format? I'll agree that the Internet is important, but in the scope of history I would think that the structure would be of more interest than the content.

      --
      Laugh while you can, monkey boy!
    2. Re:Historical Records by dossen · · Score: 1

      Well, the structure of the web changes all the time too. And why should we rely on dead trees surviving, when the information is available digitally, ready for storage?

    3. Re:Historical Records by SimJockey · · Score: 1

      Yes, but we are notoriously bad at coming up with digital storage mediums that are readable after 20 years (have you seen an 8" disk drive lately?) or that are stable for extended periods (CDs). Even hard drives require a certain amount of ongoing maintenance to keep entropy at bay. But written text can easily survive thousands of years with minimal upkeep costs.

      --
      Laugh while you can, monkey boy!
    4. Re:Historical Records by dossen · · Score: 1

      Well, it might not work for thousands of years, but short term one could simply move to new disks periodically.

  66. public domain by qubit64 · · Score: 1

    The way I see it (maybe it's been said before on here; I don't have time to check now), if you put up something on the web that is FREELY available to anyone, you don't exactly lose the rights to it, but you have to expect that people may distribute it around long after your site is down. If you don't want people seeing stuff in a few years' time, don't put it up on your website.

    --
    "Save me jebus!" - Homer Simpson (btw, I'm probably talkin out of me arse)
  67. use robots.txt by Anonymous Coward · · Score: 0

    The wayback machine recognizes the poster's viewpoint. Not only will they pass over your site for archiving if robots.txt advises so, but they will also make your previous archive entries unavailable until such time that you change your robots.txt policy to allow indexing by web crawlers.

    Cheers.
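
    For what it's worth, the behavior described above (old snapshots hidden for as long as the site's current robots.txt says no) can be modeled roughly like this. It's only a sketch under assumptions: `visible_snapshots` and the snapshot list are made up for illustration, the crawler name `ia_archiver` is the one mentioned elsewhere in this thread, and nothing here is the Archive's actual code.

```python
from urllib.robotparser import RobotFileParser


def visible_snapshots(snapshots, current_robots_txt, site_url,
                      user_agent="ia_archiver"):
    """Return the snapshots that stay viewable under the site's CURRENT rules.

    Hypothetical model: if today's robots.txt forbids the archiving bot,
    every stored snapshot is hidden; if the rules are relaxed later, the
    same snapshots become visible again (nothing is deleted here).
    """
    parser = RobotFileParser()
    parser.parse(current_robots_txt.splitlines())
    if parser.can_fetch(user_agent, site_url):
        return list(snapshots)
    return []


if __name__ == "__main__":
    snaps = ["1998-12-01", "2001-06-15"]  # made-up capture dates
    blocked = "User-agent: ia_archiver\nDisallow: /\n"
    print(visible_snapshots(snaps, blocked, "http://example.com/"))
    print(visible_snapshots(snaps, "", "http://example.com/"))
```

    The point of modeling it as a filter rather than a delete is that the exclusion is reversible: change the robots.txt back and the old entries reappear.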

  68. libel? by sckeener · · Score: 3, Interesting

    I didn't know that the wayback machine went that far back. I wonder if anyone is going to go to jail from posts they made in the past....

    --
    "Only one thing, is impossible for god: to find any sense in any copyright law on the planet." Mark Twain
  69. Something you should know by Anonymous Coward · · Score: 0

    Is that the wayback machine is part of archive.org - which sits in the same room (and network) as alexa.com, which is probably where they got the web pages. Don't ask too much how I know this - just that when I was working for another company I used to have contacts with them (for instance, I know their sys admin on a first name basis). In other words - archive.org has really been around a lot longer than you think.

    So stop freaking out and go back to browsing porn.

  70. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  71. opting out by josepha48 · · Score: 3, Interesting
    At least for Google, to opt out of its service, add the following tag in the "head" of your web page:
    <meta NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
    This will tell Google not to cache your pages. If you don't want them to index your page and include the page in the search engine, use:
    <meta NAME="ROBOTS" CONTENT="NOINDEX">
    Now, I am not sure about this other site that is caching old pages, and right now I cannot get through, but if they are caching any of my pages I will tell them to take them off, as ALL my pages are just that: MY pages. I think you can sue them; I'd imagine with all the other internet lawsuits it would be valid. They are stealing your pages.
    --

    Only 'flamers' flame!

    1. Re:opting out by mbauser2 · · Score: 1

      Those instructions are seriously messed up. Those META values will keep Google (and at least a dozen other search engines) from including a page in search results. You'll make the web page completely "unfindable" in the major search engines.

      Web page authors who want to prevent a page from being archived/cached should use this tag:

      <meta name="ROBOTS" content="NOARCHIVE">

      That'll stop Google, the Internet Archive, and a couple of other caching sites from caching the HTML file, while still allowing them to index it for other purposes.
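
      If you want to double-check what a page is actually telling crawlers, here's a rough sketch that pulls the directives out of the ROBOTS meta tag using Python's standard html.parser. The names `RobotsMetaParser`, `may_cache` and `may_index` are my own inventions for illustration; whether any given crawler honors the directives is up to that crawler.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names already lowercased.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of lowercased robots directives found in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def may_cache(html):
    """True unless the page asks not to be archived/cached."""
    return "noarchive" not in robots_directives(html)


def may_index(html):
    """True unless the page asks not to be indexed at all."""
    return "noindex" not in robots_directives(html)


if __name__ == "__main__":
    page = '<html><head><meta name="ROBOTS" content="NOARCHIVE"></head></html>'
    print(may_cache(page), may_index(page))
```

      This keeps the two questions separate, which is the whole point: NOARCHIVE only affects caching, while NOINDEX takes the page out of search results entirely.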

      (As an aside: "stealing your pages"? Yeah, right. Have your pages disappeared from your server? Is the Wayback Machine claiming they wrote your pages? Get a grip.)

      --
      Proud to be / Smiley-free / Since Nineteen / Ninety-Three
    2. Re:opting out by Buran · · Score: 2

      With only a robots.txt entry to stop Google, my site's entry in their index reads:

      www.buran.org/
      Similar pages

      ... I assume it's drawing off the domain name itself (the search term was "buran") to put the site in the index. The robots.txt file reads:

      # go away
      User-agent: *
      Disallow: /
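
      As a quick sanity check, a file like the one above can be run through Python's standard urllib.robotparser to see what a well-behaved crawler would conclude. This is only a sketch: `is_allowed` is a made-up helper name, and it models the "may I fetch this URL" decision only. It says nothing about whether a search engine will still list the bare URL, which is what seems to be happening here.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt quoted above, verbatim.
ROBOTS_TXT = """\
# go away
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "ia_archiver"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "http://www.buran.org/"))
```

      With "User-agent: *" and "Disallow: /", every compliant bot gets the same answer for every path, so nothing on the site should be fetched.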

    3. Re:opting out by josepha48 · · Score: 2
      Yes I messed up the tags, but that is a page that I don't want cached or indexed.

      As far as stealing my pages, yes they are. If people want to search for me then they should find my latest site, not an old snapshot of it. Many of the images on my site are mine, except for the ones that are taken from sites that say, "if you want to link to this site use this image". Yes, I created them. What about using images from apple.com or some other site, wouldn't that be stealing images? What about caching pictures from a site that does not want you to cache the pictures? Isn't that stealing?

      As the poster says "Why should I have to deliberately remove my copyrighted material from an archive which was never granted permission to replicate that material in the first place?"

      --

      Only 'flamers' flame!

  72. His-to-ry by Fapestniegd · · Score: 1

    Yeah, I don't know why we keep all of those pesky pre-contemporary books around either. Let's get rid of it all. It really bothers me that you would want to rewrite or erase history. Even if your site is just a blog or some crap like that, it tells us something about the mental state of those that came before us, which can help us understand each other.

    He who does not know about the dot com bubble is doomed to repeat it.

  73. Serious flaw in the internet's design by l33t-gu3lph1t3 · · Score: 1

    Browsers by default have a history folder that is only what, 15 days long? Websites rarely last longer than, what, 2 years?

    The "internet" seems to be a transitory medium. Unlike paper, digital information is intangible, and can be easily wiped and replaced, or edited. A perfect example of this is the way that news sites often take their articles offline once they've been up for a week, or just look at webtracking software, which shows that links are dying faster than ever before.

    If this trend continues (and given the current architecture of the 'net, I don't see it changing) then we might have a serious problem. I won't analyze it, but there is definitely something wrong when data is forever lost after existing for such a short period of time.

    --
    ------- "From bored to fanboy in 3.8 asian girls" ----------
  74. Bad use? by sheepab · · Score: 1

    What happens if it archives a website with the Nimda/Code red virus?

  75. Mine! Mine! Mine! by bryny · · Score: 1

    With some of the attitudes about control of information on the web that we're seeing, maybe we should flip the WWW over and call it MMM....

  76. Damn you slashdot! by Aanallein · · Score: 2

    I was just digging through a few hundred pages of information in the wayback machine when the site became sluggish. I jokingly told my friends (you know, the kind that live in my head?) it seemed I was singlehandedly slashdotting the site.
    *sighs* Seems I had some help...

    Anyway, I love the Wayback Machine. Besides being an extremely useful tool, it proves that Zindell was right. Information is never lost, only ever created.

  77. I had my sites removed by kstumpf · · Score: 2

    I used to run a Half-life map review site, and a TFC map review site called "radium". I took my sites down a couple of years ago, and recently some friends pointed out that they showed up on one of these archival sites. I took my sites down for a reason, and didn't appreciate them hovering about on someone else's server without my permission. Say what you will, but I just don't like it. I emailed them and had my property removed from their servers. It took a bit of badgering, but it finally got done.

    1. Re:I had my sites removed by PurpleBob · · Score: 2

      "A bit of badgering?" They will automatically remove any site whose robots.txt denies them, so it's not like they're trying to make it hard to get out of their archive.

      Perhaps they were more uncooperative because you were being nasty in your e-mail to them.

      --
      Win dain a lotica, en vai tu ri silota
    2. Re:I had my sites removed by Anonymous Coward · · Score: 0

      You shouldn't have to re-register a domain name that you let expire just to stick up a robots.txt file.

  78. Wayback Machine and Privacy!! by jdriller · · Score: 1

    Scarier than the archives not asking your permission is their connection to Alexa Internet and their ownership by Amazon for use as a marketing tool and guide. Also of interest is the change in Alexa's search tool's privacy notice from the original aggregated/generalized data only to the newer we-track-who-you-are-and-where-you-go version - but old users likely did not notice the change. Bezos is no dope....there's gold in them thar archives....

  79. The backup copy of the archive by Animats · · Score: 2
    The Wayback Machine started as Brewster Kahle's project. He also did Alexa, which provided some of the old archive tapes to start up the Wayback Machine.

    The long-term plan is to have a copy of the history of the Internet, beyond the power of any single government to censor. To this end, there are copies of the archive at multiple locations around the world.

    One of them is in the Bibliotheca Alexandrina, in Egypt. They too have a Wayback Machine. It's jointly operated by the Government of Egypt and the United Nations Educational, Scientific and Cultural Organization. While they will usually honor removal requests, they don't have to do so.

    There are plans for two more archive sites around the world, affiliated with major national libraries.

  80. Enough is enough by Anonymous Coward · · Score: 0

    OK, all this "you can copy this, you can't link to that" junk should end. And how should it end, you ask? Well, the guy/body/organization that holds the copyright/patent/pink fuzzy thing on the HTTP protocol spec needs to stipulate that all content delivered via this method is archivable, linkable, and generally available to be munged with as people desire. Why? Because you're using MY IP, the protocol, to deliver it, and that's what I say. Don't like it? Then use another protocol. And no, HTTPS doesn't count; that's just HTTP over SSL, a protocol in a protocol. So this would leave companies that want to complain about this serving up via HTTP only the things they want open.

  81. archive this by Anonymous Coward · · Score: 0

    fucking jews

  82. Wayback Machine and Websense Enterprise!! by jimwelch · · Score: 1

    Access to this web page is restricted at this time.

    Reason:
    The Websense category "Proxy Avoidance Systems" is filtered.
    URL:
    http://web.archive.org/

    --
    Never trust a man wearing a coat and tie!
  83. function like search engines. by Restil · · Score: 2

    Wayback machines should function exactly like search engines. If there's a robots.txt file, check it. If it tells you to get lost, do so. A search engine is going to cache at least the text part of your site, and you know it. And you can prevent it if you wish. And depending on the engine, it can take months or years to update.

    Besides, wayback machines will run into the same snags that search engines do. They can't replicate cgi scripts any better than search engines can, so to deny them access to those resources for their sake as well as the server's makes sense.

    I don't know how wayback works. At the very least they SHOULD read the robots file. If they do, then I consider most of the copyright issues to be a moot point.

    -Restil

    --
    Play with my webcams and lights here
    1. Re:function like search engines. by Teutates · · Score: 0

      They do read the robots file...

      it's how you tell it NOT to archive your site.

  84. dating back to 1998 by quantaman · · Score: 4, Funny

    Anyone else find it mildly disturbing that 1998 is considered to be distant history?

    --
    I stole this Sig
  85. The other issue by corebreech · · Score: 2

    It is suspected by many that archive.org also removes archives based on content.

    For instance, try accessing news sites back in the days immediately before and after 9/11. It is a very spotty record.

    I have seen this for myself as well, as a web site I am struggling to find the time to build, and which has controversial content, was at one time retrievable under archive.org, but no longer is.

    For that matter, it seems impossible to get Google to index it anymore either (though they too once included the site.)

    Presenting themselves as having a complete record of the Internet's web sites, and then selectively deleting or restricting access to sites based on content, is a very pernicious form of censorship. It isn't a First Amendment issue perhaps, since dotgov presumably isn't the one restricting content, but it is worrisome nonetheless.

  86. Copyright *is* archiving. by blair1q · · Score: 2

    You can't unregister a copyright.

    You give a copy of your work to the Library of Congress, and there the evidence sits for eternity, free to be accessed by anyone with a request slip.

    The price you pay for copyright protection is public availability and persistence of your old rantings.

    --Blair

    1. Re:Copyright *is* archiving. by zenyu · · Score: 2

      You give a copy of your work to the Library of Congress, and there the evidence sits for eternity, free to be accessed by anyone with a request slip.

      Unfortunately this is not the case, if a librarian at some point feels your book isn't historically significant they will chuck it. They don't have offsite archiving, like some more reputable university libraries do, so there just isn't the space to keep every book that's sent to them. They do however have a right to keep those two books you sent them in perpetuity and copy them into archival formats if they want to.

  87. Saved my butt more than once.... by jafiwam · · Score: 1

    A few points about why I think the Wayback Machine is good:

    We had an old "emergency pager" (read: customer calls because they can't get the spell checker to work right at 3am) which was turned off and then hidden so it wouldn't be carried. We lost the phone number, and then couldn't cancel the pager without it. Wayback had the number in an old copy of our support site.

    New web hosting client had 50% of the files that went down in the hosting company's servers in WTC. Wayback had them. We got all the verbiage from them.

    We also occasionally need to point out to a customer what state their web site was in when we turned it over for maintenance by them. Having a third party demonstrate that the wiggly email gif was not ours in the first place helps a lot.

    I totally disagree with the original article, the Wayback Machine has some practical uses, and is fun for looking at old cheesy web sites. They also seem to be cooperating with people to take things out that need to be taken out. So I have no problem with it at all.

  88. Nothing you can do by litewoheat · · Score: 2

    Once you're on the Internet you can never get out. It's a simple fact. Someone will always have a copy of that e-mail you sent professing your love to Missy Gringlebach or the nntp post about how brilliant Hitler was or your web site dedicated to New Kids on The Block.

    Trying to get that stuff off is futile at best. A professor of mine once said that there is not a nanosecond when some computer isn't processing or storing something about you somewhere. And that was in 1991. I've got to side with McNealy on this. There is no such thing as privacy anymore.

  89. Are we now advocating for the RIAA? by fermion · · Score: 1
    I don't see the issue. If something is published, there is some probability it will be archived. In the past such archiving was expensive, and therefore was limited in scope and availability. However, new technology makes archiving, at least for electronic resources, relatively cheap. You no longer need large amounts of real estate, staff, and shelves to archive information. A small room with dozens of computers is sufficient.

    By present standards no one says that I have to destroy a book after a certain amount of time, or with whom I may share it. No one says I can't print out a web site and, within fair use laws, share it with posterity. No one says that I can't take a book whose copyright has expired and post it on the net. The kind of laws that would be necessary to protect on-line work beyond what is already granted for other works would lead to the kind of legislation promoted by the RIAA and their ilk.

    That said, it is scary that everything we say may be saved for the future. There should be some social standard on what can be saved and what can't. I would say that general emails may not be good candidates for archiving, as they are not published (although notice that many people's personal letters do make it into books, so there is some wiggle room here). On the other hand, publicly accessible web pages are pretty much subject to the same archiving standards as other published works. We can certainly pick nits over copyrights, but this is slashdot after all :-).

    --
    "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
  90. Some one hasn't done their research by mfos.org · · Score: 4, Informative

    A few things

    1) They've been archiving since 1998, but they've only recently had the horsepower to provide a live connection to it

    2) It is very easy to not have your stuff indexed. the directions are here.

  91. If I print a book, can a library display it ?!?!? by Anonymous Coward · · Score: 0

    OK, clowns, get a grip.

    If you print a book, a library is free to put it in its collection. Does it matter if you come out with other versions, or does it matter that more than one person might read it? NO!

    I *highly* disrespect the notion that saying "this is Joe's website circa 1998" and showing that exact page to anyone who wants is a copyright violation. It's libel if I change it, and copyright violation if I take credit for it, but otherwise, I'm more like a library than someone selling illegal copies of a book or movie.

    If you don't want it available- and reproduced - don't release it, especially digitally.

  92. court evidence? by dubiousmike · · Score: 2, Funny

    It's so funny that I've been sending around links to my friends of their old corporate websites for months now. Totally freaks them out.

    On a different note, how long until the wayback machine is used as evidence in court?

    "No, Your Honor, we never posted slanderous comments about XYZ Company." *Oh CRAP! Not the Wayback Machine?!?*

    1. Re:court evidence? by MushMouth · · Score: 2, Informative

      Already used in the Go.com GoTo.com trademark suit 3+years ago

  93. Excuse me? by innocent_white_lamb · · Score: 2, Insightful

    Er, you posted content on the WWW for world+dog to read. After all, that's the purpose of posting said content. And now you're unhappy because folks are reading it?

    If you don't want folks reading your stuff, for heavens sake don't post it on the web!

    Seems obvious to me, somehow...

    --
    If you're a zombie and you know it, bite your friend!
  94. Re:Mississippi Trollse by TheMonkeyDepartment · · Score: 2

    You know what, I actually found that amusing.

  95. anyone want a blowjob? by Anonymous Coward · · Score: 0

    I'm not gay, I just hate my life..

    Anyone want a blowjob? My only requirements is that you have at least 2 major STDs, one being AIDS.

    You must also ejaculate in my mouth, and possibly pound my ass if youd like.

    I must die. Thanks

  96. Dead-tree publishing parallel by Todd+Knarr · · Score: 2

    Why, somehow, does this strike me as similar to an author having published an utterly bad, horribly stinky book that, later in life, he regrets ever having let see the printing press, and complaining that some people won't turn in their copies to him to destroy now that he wants to unpublish it? Remember that copyright isn't an unlimited right to prevent copies. IMHO most of these archival sites fall into the same category as a library that bought a newspaper, scanned it onto microfilm and then subsequently had the original newspapers destroyed in a flood: they had legitimate access to the originals, the copies were legitimate fair-use copies when made, the originals haven't been transferred to anyone else, the copies remain legitimate fair-use copies.

    It may be embarrassing to the creators to have copies of their sites preserved for posterity, but copyright isn't about preventing an author from being embarrassed.

  97. ShaunC by Ryu2 · · Score: 1

    I guess you object to libraries keeping copies of all those old books that the author doesn't "like" anymore either, too?

    There needs to be some sort of archive of the web, make it free or payware, I don't care (as long as it is not a commercial company that controls it), like the Library of Congress does for books, like LexisNexis does for printed media.

    It's called preserving history, the main medium of this page is digital bits, and ironically, it's the most transient compared to all past media.

    Those who fail to learn about the past are doomed to repeat it...

    --
    There's 10 types of people in this world, those who understand binary and those who don't.
  98. What damages? by blair1q · · Score: 2

    Since material put on the web and made available for free access has no value, there can be no damage due to copying should someone copy it for their own use, or to use it against you in the future.

    Your copyright is valid, but valueless.

  99. A serious question? If so, it's OT by Anonymous Coward · · Score: 1, Insightful

    The author writes: and who gave them permission to make those copies?

    Honestly, is this a serious question to pose to /.? I don't know that /. is in any way connected to this site. So what's going on here? It sounds like the author is trying to rally public outrage by claiming to be a victim.

    Personally, I found the writer just a little bit insulting and selfish. (No offense, but that's how I read it.) To the author, I say: if you have copyright disputes with the site, contact the maintainers. Copyright problems happen all the time, and are handled gracefully and quickly. You don't need my help or /.'s involvement in this.

    I suspect the only injury was to the writer's pride. Had there been any commercial loss from the infringement, he would not have used this "wounded bird" ruse in his story.

    On the same topic, you might consider a more enlightened view, and place your old sites under the GNU Free Documentation License. Details are available at: http://www.fsf.org/copyleft/fdl.html

    So, to the author's (seemingly) rhetorical question, I reply: if you are serious, your question is completely off topic.

  100. yeah, sure, let's destroy history by Anonymous Coward · · Score: 0

    "I never opted in, why should it be my responsibility to opt out?"

    Because the burden of caching/archiving falls on the archiver, not the archived. You are not paying for the long-term storage space. You are not having to wade through an inbox full of junk to get to it.

    If there's something you don't want archived, you'd better have a damn good reason for it. Because in 50 years there will otherwise be no evidence that this discussion - ANY discussion, work of art or content that is online RIGHT NOW - ever existed in the first place.

    What do you want the record of your generation's online activities to be? A mere footnote saying "no data"? Or an archive that can be browsed, read and appreciated (for good or bad) exactly as it was at the time?

    If nobody remembers - why do it in the first place?

  101. Alexa ~= Wayback Machine by mbauser2 · · Score: 1

    The Internet Archive and Alexa were founded more-or-less simultaneously by Brewster Kahle in April 1996. (I'm really surprised you haven't heard of Alexa. It's old news by now.)

    Alexa crawls the web with a bot named ia_archiver as part of their site analysis. archive.org and alexa.com are legally separate organizations, but Kahle runs both, and Alexa still donates a copy of everything they crawl to the Archive.

    --
    Proud to be / Smiley-free / Since Nineteen / Ninety-Three
  102. Nebulous argument... by Codex+The+Sloth · · Score: 2

    While all that is true, proxy servers cache information to re-transmit and nobody complains about that. Don't my Usenet posts from 1990 implicitly have my copyright on them? Where do you draw the line? I say if you put it out there, you should just live with it and let the chips fall where they may. It's more like archeology than copyright theft...

    --
    I am not a number! I am a man! And don't you ... oh wait, I'm #93427. Ha ha! In your face #93428!
  103. Leave a better legacy..... by Anonymous Coward · · Score: 0

    If you worry about people saving things you've said or produced,
    maybe you should say/produce better things....

    If your grandkids (or a Grand Jury) were to see
    this, would you be ashamed?
    (or be shown guilty of a crime?)

    Simply put...watch what you say/do...and leave a good legacy.

    (BTW, I understand "kids will be kids" but you have to grow up eventually and take responsibility for your past actions!)

  104. Archiving since September 1996 by mbauser2 · · Score: 1

    I'm killing 2 quotes with one fact:

    "where did they get such old copies of my websites"

    and

    "I know for a fact that they have pages back at least as far as 1996"

    ia_archiver (the bot that collects files for the Internet Archive) was unveiled in September 1996, just a few months after the Archive was founded.

    Here's a copy of the original robot announcement from 5 Sep 1996.

    --
    Proud to be / Smiley-free / Since Nineteen / Ninety-Three
  105. Re:Library archives are given broader copyright us by Anonymous Coward · · Score: 0

    Strange that such a complaint would appear within a group espousing that "information wants to be free." :)

    Indeed. But there are two things to note about public archives of websites: One is that the archives were obviously started way before they were made available or even announced. That caused the misperception that the web is volatile by nature and many did not realize how their actions could come back to haunt them. The other thing is that we'd like to see everyone treated the same. Copyright for all or no one. Usually the big players get all the rights while the rest might complain or not - it doesn't make a difference. The well known archives Google and Waybackmachine do honor requests for content being taken down, but instead of deleting sites from their archive, they only block them. They claim that it's due to technical reasons, but the suspicion remains that paying customers might still be able to access these blocked parts of the archive. BTW, I'd still opt for free flow of information, but if that were to be the rule, I'd like to be informed about it at the same time as everybody else and I wouldn't want exceptions made for anyone.

  106. Who archived it and why by Robotech_Master · · Score: 2

    It's funny the submitter should mention this...because I remember when the people who archived it started archiving it in the first place. A rather big to-do was made about it, as I recall; it was archived as a side-project of the folks at Alexa--you know, the ones who provide the "what's related" technology to Netscape? At the time they started, they didn't know for sure what they would do with it except store it for future generations...but they clearly had some ideas, judging from what they've done with it recently.

    As to the poster's complaint about his old stuff being archived...my immediate response is to say, "Well, tough...you should have thought about that before you put your content out there in the open for anybody who wanted to look at it."

    I mean, seriously, if you do something in public, you have no reasonable expectation of privacy thereafter.

    --
    Editor Emeritus and Senior Writer, TeleRead.org
  107. This is just angering by Mr.+Buckaroo · · Score: 1

    I dug out my account, which I haven't logged in to for a year or so, to reply to this thread. I don't really have time to write this concisely, but hopefully it will catch at least some.

    The idea of shutting down services like google or archive.org is a virulent stream of bad thought. It is largely predicated on fear and the fact that copyright law has become so perverted.

    There is a _very_ strong societal interest in having history. If organizations are not allowed to archive things, then we end up with less history. We (society) are depending on organizations to accurately store their records and faithfully give them to us in the future. Historically, it has been the case that organizations are likely to clean up their history to reflect favorably on themselves (Disney and their WWII propaganda cartoons, for example).

    Most organizations do not want to pay the cost of storing and have a general interest in avoiding the liability implicit in storing in this day and age.

    The argument that some organization must get permission from _every single_ web page it archives is just insanity. Based on that theory everything comes to a halt. I can't take a picture of folks in Times Square because I need everyone's permission, permission of building owners, permission of advertisers, etc.

    This is all so frustrating because the U.S.'s constitutional framers were concerned about creating IP rights at all. They feared an overarching monopoly on ideas, which is unfortunately what we are running towards. Surprisingly, copyright wasn't designed to give folks a monopolistic production right. It was designed to give a right ensuring the accuracy of reproductions of your work.

    It WAS NOT the idea of copyright that some entity could control production rights EFFECTIVELY FOREVER.

    Archive.org obeys robots and you can opt out. They are providing imo a very useful service. They are one of the organizations out there on the front line arguing and fighting battles to preserve _your_ right to history.
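
    For anyone who wants to see what "obeys robots" means in practice, here is a minimal sketch of how a well-behaved crawler could check a robots.txt before fetching a page. It uses Python's standard urllib.robotparser. The robots.txt text, the example.com URL, and the may_crawl helper are made up for illustration, and "ia_archiver" is assumed here to be the user-agent token the archive's crawler answers to.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one crawler, allow everyone else.
# "ia_archiver" is an assumed user-agent token for the archive's crawler.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/index.html"
    for agent in ("ia_archiver", "SomeOtherBot"):
        print(agent, may_crawl(ROBOTS_TXT, agent, page))
```

    Note that this only models a crawler that chooses to honor the file. Nothing in robots.txt forces compliance.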

  108. Wayback gives accountability to the Net by Anonymous Coward · · Score: 0

    I think it is imperative that the Web have archives like the Wayback Machine so people can't go back and erase history. The argument that the Web should be fluid is nice, but what about accuracy? If you have read George Orwell's "1984," you will remember that the main character in the book had a job in which he deleted excerpts from newspapers and media in order to make history match the present. The web would have made his job a cinch, because all he would have to do is delete or replace a Web site if people didn't want anyone to believe it anymore... With the Wayback Machine taking snapshots, there is no chance that people will be able to erase history. A bit of a rant, but you get the point.

  109. The purpose of copyright... by kcbrown · · Score: 3, Insightful
    The Congress shall have Power To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries
    -- United States Constitution

    The purpose of copyright is to promote progress, to entice authors and inventors to release their works and discoveries to the public.

    But that is not an end unto itself. The true end is the benefit to society that the release of such works brings.

    Now, remember that the whole incentive here, the entire reason for granting the monopoly privilege of copyright, is to allow the originators of works to make money from their works, which in turn (theoretically) gives them incentives to release their works to the public.

    When you publish something on the web, you're publishing your works for free, unless you go to the extra trouble of implementing some kind of access control. The Wayback Machine won't work on a site that has access control, so all it ends up archiving is stuff that was published for free public consumption.

    So the real question is: if a work has already been released for free to the general public, how would letting authors restrict the republication of that work after the fact bring greater benefit to society than not letting the author impose such restrictions?

    My opinion is that it is much more beneficial to society as a whole if the release of a work for free public consumption automatically implied that members of the public have the right to redistribute that work. So if an author doesn't want people in the general public to be able to redistribute his work, he has to control who receives the work and who doesn't. Certainly requiring payment for the work in question is sufficient to meet the requirement of controlling access. But whatever method the author chooses, it should be one that makes it clear that the work in question is not being released for free to the public.

    --
    Use 'slashdot stuff' in the subject line in any email you send me if you want to get past the spam filter.
  110. Hey, it's just a really... by neo · · Score: 2

    really slow proxy server. It's just got lots of options for which cached version you want to see. :-)

  111. Removing yourself from the Archive... by spacefight · · Score: 1

    is not quite that easy. Although they say that with the correct robots.txt your site will not be searchable, and they also say that all indexed content will be erased, they do not erase it.

    I checked this by removing the robots.txt entries and wham, the content is back in the machine. I call them a bunch of liars....

    1. Re:Removing yourself from the Archive... by dossen · · Score: 1

      They might be misstating it, but they are doing the right thing. If the copyright holder wants his site removed, it is made inaccessible. But the copy they have on file was made within the law, so they keep that one. Then if the site's owner changes his mind, or (I hope) when the work enters the public domain (remember, copyright is for a limited time only), they allow access again.

      In fact I'm hoping that they try their hardest to archive everything, robots.txt or not. That way we will have, in the future, a nice archive of the past, and not a period full of blanks and question marks.

  112. If Wayback machines are outlawed, only Outlaws... by bubbaD · · Score: 0

    Unfortunately the present copyright system encourages defensiveness and the "but where's my share?" mentality.
    If the "Wayback machine" is crippled or killed, entire old internet sites would probably be pirated, traded or sold.
    BTW: I know I wouldn't want to work for someone who searched my adolescent transgressions, which may be found on school records, police records and who knows where else. It would reflect more on the seeker than the writer, I think.

  113. Great for getting around corporate content filters by lscotte · · Score: 1

    This is great for getting around the corporate firewall so I can once again browse porn on company time! Woohoo!

    --
    This post is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.
  114. Removal Instructions by akiy · · Score: 2

    There are removal instructions at:

    http://www.archive.org/internet/remove.html

    --
    http://www.aikiweb.com - AikiWeb Aikido Information

  115. archive by happy+monday · · Score: 1

    you're making a problem where one doesn't exist. why did you even ask this question? what do you care, you should be happy they've archived your web site. it doesn't make any difference to anybody. stop whining about pointless stuff. and no you can't sue them, if you posted stuff on the web it's publicly accessible. i can't believe you're even asking this question.

  116. loses by Anonymous Coward · · Score: 0

    looses != loses

  117. the wayback machine is great for old porn by Anonymous Coward · · Score: 0

    it is very nice to search for wayback porno and
    remember all of the good wayback porno memories

  118. Here's the scoop on copyright... by dsrtegl · · Score: 1

    Fair use does include provisions for archiving for your_own_use. Just as you cannot tape Major League Baseball ("Free" broadcast) and rebroadcast it later without consent, the copyright holders of the websites have every right to be upset that their websites are being "retransmitted" without their knowledge or consent.

  119. It's time for a robots INCLUSION standard. by SplatFileGoo · · Score: 1

    Archive.org does obey robots.txt. Unfortunately, it will still crawl a site even though the robots.txt ban is there. So, you have to add them to your htaccess ip/agent ban list.

    Additionally, this isn't just the Wayback Machine we are dealing with - remember, there is a relationship with Alexa. You remember Alexa from their days in the cross hairs of privacy problems right?

    There are so many big questions left hanging about archive.org. I can't figure it -can you? There is something more going on here. This isn't a normal site.

    Unanswered or short answered Q's:

    What is/has been archive.org doing with all that text for all these years? They haven't been sharing it publicly for any time at all.

    Have they been selling data (your site), to third parties other than Alexa?

    Does archive.org have contractual agreements with any govts?

    Who are they feeding? Hmmm, collecting data for how long? And now just putting it online.

    How are they making any money? Where's the revenue stream to fund such a mass collection?

    Who is funding such a massive long term effort?
    Think about how long they have been doing this. Since 96 when a good work station would cost a couple years salary. This is massive, just massive tech investment that would probably put most of the search engines on the net to shame. Where's it coming from?

    Finally, with rogue bots being the #1 problem of many sites, it is time for a robots INCLUSION standard. All bots are banned unless specifically allowed. That is a whole lot different than the deprecated, unworkable joke known as the robots.txt standard (that was never endorsed by any major net organization).

    "Welcome to ABC's Monday Night Football. This telecast is for the sole exclusive use of our viewing audience. Any retransmission...."

    Why should the web be any different? Copyright is copyright whether it is TV, MP3, or text on a SlashDot story.

    /tanstaafl
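
    For what it's worth, the "all bots are banned unless specifically allowed" idea can already be written down in the existing robots.txt syntax, at least for robots that choose to comply. The sketch below generates such a file and checks it with Python's standard urllib.robotparser. The function names and the agent names in the usage part are hypothetical, and this does nothing against rogue bots, which would still need server-side blocking.

```python
from urllib.robotparser import RobotFileParser


def build_inclusion_robots(allowed_agents):
    """Build robots.txt text that bans every robot except those listed."""
    records = []
    for agent in allowed_agents:
        # An empty Disallow means nothing is off limits for this agent.
        records.append(f"User-agent: {agent}\nDisallow:\n")
    # Catch-all record: any robot not named above is banned site-wide.
    records.append("User-agent: *\nDisallow: /\n")
    return "\n".join(records)


def is_allowed(robots_txt, user_agent, url):
    """Return True if a compliant robot with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    text = build_inclusion_robots(["Googlebot"])
    print(text)
    for agent in ("Googlebot", "ia_archiver", "UnknownBot"):
        print(agent, is_allowed(text, agent, "http://www.example.com/"))
```

    The design choice is simply to make the catch-all record a ban and list the exceptions explicitly, which is the reverse of how most sites use the file.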

    1. Re:It's time for a robots INCLUSION standard. by dossen · · Score: 1
      As far as funding goes, here's a bit from their front page:

      Archive Donors
      • Alexa Internet
      • AT&T Research
      • Compaq
      • the Kahle/Austin Foundation
      • Prelinger Archives
      • Quantum DLT
      • Xerox PARC

      And it wouldn't surprise me if the Library of Congress (mentioned as a "user") chips in a little.
  120. It's a FRIEND *and* a FOE by newerbob · · Score: 1
    I like the Wayback Machine! I wish it had everything in it.

    But it could be embarrassing, in the way that Google's "Complete USENET Archive" is. Reading my posts there from 12-14 years ago makes me wince!

    Anyway, I was involved with a site that was pulled down because we got a credible threat of a lawsuit. I'm pleased to see it's in the WayBack machine!

    --
    Ask the Ya-Hoot Oracle Anything!
  121. Fair Use by Anonymous Coward · · Score: 0

    It's fair use to keep a personal copy in your browser's cache. It's arguably not fair use to redistribute that copy to millions of others through the Wayback machine.

  122. The WayOff Machine by TheJohn · · Score: 1

    One thing that may affect copyright claims is that the dates it gives for pages are not always correct. I just checked a former employer, and the page that the WayBack Machine said was from Dec. 1998 had a 1999/2000 copyright notice, and announced a product I know was not available in 1998.

    So a copyright holder could claim the WayBack Machine misrepresents their site.

  123. Your keyboard is a microphone... by surfcow · · Score: 2

    .. which anyone can listen to.

    Do you use caution when speaking into a microphone? Why?

    Anything you publish can be used against you. Data wants to be free, remember?

    =brian

  124. What a stupendous waste of DASD. by crovira · · Score: 1, Troll

    Man, I've found pages from old porn sites I worked on that never made it out of the fuckin' ISP. (Management troubles, A.K.A. internecine warfare.)

    What a STUPENDOUS waste of storage.

    Who the fuck paid for all these drives?

    Do his doctors know he's off his meds?
    Could I get him to donate a few terabytes to my boxen?

    --
    MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
  125. Wayback machine by Anonymous Coward · · Score: 1, Interesting

    In my opinion, when you post publicly on the web, you are essentially saying "This is public information, it may be copyrighted, but it is public". Then it's a question of whether or not the Wayback Machine is considered "fair use", and I believe it is. If it is, then you can't stop them. End of discussion, right?

    Now, if you don't want this stuff to be publicly accessible on the web, there is now a precedent (set by Google) for SSL sites. There is also the robots.txt convention you mentioned.

    The only real issue I see in the archival sites is "How do they know that domain ownership changed hands?". If a porn site comes along and buys the domain after you're done with it, how does the Wayback Machine protect you from consequential damages that might arise?

    I don't know... But I do know that the web and the internet in general was never intended for privacy or copyright, as such, and maybe we just need a new protocol?

    Dave

  126. Errr...i disagree by Archfeld · · Score: 2

    If you want to OPT OUT, then don't put it up on the net. The NET is a public utility; put content out there and expect it to be accessed, cached, and backed up in numerous ways by LOTS of individuals, intentionally or unintentionally. If you want your data private, DON'T put it on the net. Seems fairly straightforward and simple.

    --
    errr....umm...*whooosh* *whoosh* Is this thing on ?
  127. I dont care much about copyrights..... by O.F.+Fascist · · Score: 1

    So the wayback machine is just fine by me.

    I'm of the belief that if you put something online, it's public domain for anyone to do whatever they want with it.

    1. Re:I dont care much about copyrights..... by josepha48 · · Score: 2, Offtopic

      Happy lawsuits... when you steal a logo from a corporation that just wants to screw someone...

      --

      Only 'flamers' flame!

  128. Re:Library archives are given broader copyright us by Teutates · · Score: 0

    Information wants to be free as long as it doesn't involve me or what I don't want to see...then i want total control because I am a hypocrite.

  129. Coincidentally enough... by Kickstart70 · · Score: 2, Interesting

    Yesterday I used the Wayback Machine for one of the lawyers at the law firm I work at to prove that a company at one point had an office in a certain location. The company in question was trying to duck out of a contracted agreement by saying they were not the people who signed the contract.

    The Wayback Machine proved that they indeed knew of, approved, and granted authorization to this specific office, and the other people had a valid contract. In this specific case, the Wayback Machine prevented an apparently scumbag company from trying to screw some apparently good people over.

    Kickstart

  130. WOW! SEX.COM! by wo1verin3 · · Score: 2

    This wayback machine is invaluable!

    I was able to travel back to the early days of internet pr0n (click here to launch sex.com from '96) and research ancient authentication methods including "Click here if you are over 18".

  131. Here is how to opt out by kennethrona · · Score: 1

    http://www.archive.org/internet/remove.html

  132. There are really two issues by Jerf · · Score: 2

    There are really two issues: 1. Should the archives be made? Which is what everyone seems to be discussing, and 2. Should the archives be publicly accessible?

    I agree that any interpretation of copyright law that says the answer to "1" is "No" means that copyright law needs to be changed, not that it is "illegal and therefore immoral".

    But a case can be made for "2" that distribution should happen only once copyright on the material has either expired or could reasonably be expected to have expired. Which brings up two other issues: the absurd length of copyright terms, and the near impossibility of determining whether a work is still copyrighted.

    So, I don't have any answers, just better questions.

  133. Anybody heard of the Library of Alexandria???? by fdiaz5583 · · Score: 3, Interesting

    If anyone has ever heard of the Library of Alexandria, it was supposedly the most impressive knowledge base the world had ever assembled. Some crazy guy came by and burnt it to the ground, setting the entire industrialized planet back hundreds, perhaps thousands, of years. We are now in the process of surpassing this great library, and are making it even easier for people to have access to knowledge. That knowledge may be porn, may be the morning news, or sports scores; it may even be how to construct a nuclear bomb. Nevertheless it is knowledge, and EVERY person who is alive has the God (and any other higher power) given right to knowledge, despite what any government agency or copyright may say. 21st century libraries such as the WayBack Machine are providing the tools necessary for researchers to go "back to the future." This is a great service to mankind, and its overall importance should not be outweighed by greedy and/or overparanoid privacy rights activists. If you do not wish to be known, please do not post any information on the web, and move to the jungles of Africa and step away from a time and place known as the PRESENT.

  134. What if... by rimsky · · Score: 1

    What would happen if the wayback machine starts archiving its own site?

  135. What would really be useful... by Junior+J.+Junior+III · · Score: 2

    Would be an archive site that kept versions of news articles before and after they were changed by editors. Often, an article making allegations of corruption or bad intent gets changed shortly after it is published, and the replacement gives a more neutral stance, which doesn't give readers the whole story anymore, and in many instances makes the story a non-story, leading me to wonder why it was even published in the first place.

    --
    You see? You see? Your stupid minds! Stupid! Stupid!
  136. Lawsuits in the making? by SnappingFish · · Score: 1

    A friend of mine discovered that Google Groups, when searching on his name, returns lots of spoofed Usenet postings under his name. Really asinine, weird stuff. Not at all my friend's style, but a prospective employer might not understand this or even give him an opportunity to explain. Not only that, but there are tons of follow-ups quoting the articles and attributing them to him. I don't know what the hell he is going to do about it, since Google says it's his problem. Get all the follow-up posters to understand and remove the flame-fest posts? Ha! Is Google a publisher or a republisher of this, I wonder? Anyone with a legal opinion? It certainly could damage his reputation when people search for him. Life in the new era, yikes!

    1. Re:Lawsuits in the making? by mlk · · Score: 1

      Tell your friend to sign everything he sends.
      Sending email or news posts, or creating websites, that look like they were sent or created by anyone at all is child's play; your only real defence is digital signatures.

      mlk

      --
      Wow, I should not post when knackered.
    2. Re:Lawsuits in the making? by SnappingFish · · Score: 1

      That is nice in theory, but really work retroactively, and many people (read some of the comments in this thread) consider articles posted to Usenet under your name as credibly coming from you. A prospective employer may not say "hey we want to hire you.... but did you really write this insane shit in this post?" You are not going to get the opportunity to explain that some jerk pulled a prank on you. If you piss someone off or have a vindictive ex-wife or who-knows-what, do they get to use Google to destroy your life? This will be an interesting area for the courts to explore, I think.

    3. Re:Lawsuits in the making? by SnappingFish · · Score: 1

      Re: digital signatures, it was meant to say ... but WON'T really work retroactively.
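
      To make the "sign everything" advice in this subthread concrete, here is a toy sketch of the sign/verify round trip behind a digital signature. It is textbook RSA over a SHA-256 digest, written with the Python standard library only. The key is built from two well-known Mersenne primes and has no padding, so it is hopelessly insecure and purely illustrative; in practice one would use a real tool such as PGP. All names here (sign, verify, digest, the constants) are made up for the example.

```python
import hashlib

# Toy key built from two well-known Mersenne primes. Far too small and
# unpadded for real use; it only illustrates the sign/verify round trip.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest(message: str) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    raw = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % N


def sign(message: str) -> int:
    """Only the holder of the private exponent D can compute this value."""
    return pow(digest(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Anyone who knows the public pair (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)


if __name__ == "__main__":
    post = "I really did write this post."
    sig = sign(post)
    print(verify(post, sig))                         # genuine post
    print(verify("Spoofed text in my name.", sig))   # forged post
```

      As noted above, none of this helps with posts that were never signed in the first place.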

  137. Like it or not, it's the law by macwhiz · · Score: 1

    There's a difference between copies made as a necessary part of reading, and the copies made by the Wayback Machine. The very nature of the Internet means that an intermediate copy must be made to read a web page. That's a fair use copy. A browser cache copy makes reloading the page more convenient for the user, and doesn't give profits to anyone--again, fair use.

    Retaining a copy indefinitely and serving it up to other users isn't the same thing at all. The Wayback Machine isn't a necessary part of using the Internet. With the addition of a banner ad, it instantly becomes an income-generating enterprise. In that case, clearly there's a copyright issue, because they would be profiting from the work of others without compensation.

    By putting up a web page, I'm not giving any permission for people to copy it beyond those copies strictly required to view it in the first place. Putting something up for public view does not place it in the public domain! U.S. law is quite clear on that count.

    If I put my garden tractor on the front lawn, where others can see it, does that give my next door neighbor the right to come take it and use it on his lawn without asking? Nope.

    Why are libraries different from Wayback Machine? Photocopies are expensive. It wouldn't be cost effective to photocopy a whole book. It'd be cheaper to buy a new copy. The costs of copying a web page are much lower... so there's no disincentive that keeps people from violating copyright flagrantly. In this case, though, it's not about profit like it is with the record companies and MP3s. It's about an author's right to decide who may profit from his work. Even if Wayback Machine isn't in it for the money, their reputation profits from other people's work. At best, this is a shady practice.

    A better analogy would be: Wayback Machine is a public library consisting of photocopies of books. Anyone may check out books. It costs them nothing to do so. No profit is made... but photocopying the book in the first place was still illegal, because of copyright. Once the library buys a book, the First Sale Doctrine says they can lend it out. There's a consideration paid for the work. Wayback Machine isn't giving any monetary consideration for their use... and they aren't even being polite and asking permission!

    (If someone sets up a tent on your lawn and camps out, do you think the cops will not arrest them for trespassing if they say "hey, the property owner never said we couldn't!" ? )

    Like it or not, copyright is the law. Everything created in the U.S. has copyright invested in its author from the moment of creation until the copyright expires (if Congress ever permits that again) or until the author explicitly places it in the public domain. Publishing it, whether on paper or electronically, doesn't put it in the public domain. If I printed out my web page and handed the printouts to passersby on the street, I'm not giving away my copyright on the work.

    1. Re:Like it or not, it's the law by Anonymous Coward · · Score: 1, Informative

      err, as someone pointed out earlier, copyright
      law gives libraries and archives special fair use
      powers.

  138. Not strange at all. by Ungrounded+Lightning · · Score: 2

    Strange that such a complaint would appear within a group espousing that "information wants to be free." :)

    Not strange at all.

    Slashdot is not populated by a bunch of lockstepping conformists. Its postership is large and diverse. The individuals are NOT the average, nor are they the stereotype.

    Perhaps on the average the posters think that IP laws are 'way too tight. But some think they're too loose. Post an article about somebody making them tighter and the make-em-loosers will complain, post one about somebody apparently not respecting them at all and the make-em-tighters will sound off.

    Further: Few if any Slashdot posters think a published author has no rights at all over the distribution of his work. (How would Copyleft work if that were true? B-) ) So when it looks like a service may be copying and republishing past works far beyond the authors' intended distribution they may sound off.

    And even the most fanatic of the "information wants to be free" faction may still post a cautionary note about how a particular act of radically freeing it may attract opposition.

    Which seems to be what happened here.

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
  139. Re:Library archives are given broader copyright us by Anonymous Coward · · Score: 0

    No, all the mp3 snorkers out there now KNOW what it's like to have their warez stolen. It's exactly the same thing, just with a different set of files. If it is the Library of Congress, they made themselves exempt from these sorts of laws YEARS ago. They want to keep large volumes of recorded, non-recorded, digital, or whatever material, to help preserve our history as a nation... What they are doing is a good thing, even if there is copyrighted work on there. You can get the same sort of service just by going to your local public library and requesting a copy of something from them. They may charge you to cover the shipping costs, but they will do it. They have been trying for years to make it easier and easier to get at their content. Think I'll go try mp3.com on there.

  140. Anarchists Cookbook by lunchlady+doris · · Score: 1

    As an interesting aside to this, William Powell, author of everyone's favorite tome of tyranny the Anarchist Cookbook, publicly denounced the book. There used to be a note from the author on Amazon urging people not to buy the book, but I see that it's been removed. Guess Barricade Books wasn't too fond of his idea.

    1. Re:Anarchists Cookbook by martyn+s · · Score: 2

      Yeah, that's actually the book I had in mind when I said this. That and "How to get the women you desire into bed", by Ross Jeffries. He also mentioned in the foreword that he was thinking about, but was talked out of, taking the book off the market.

      See one day I might regret having admitted that I read that book "How to get the women you desire into bed", but there ain't nothing I can do about it :)

  141. Old encryption can be broken! by Jon+Howard · · Score: 1

    I've got some encrypted messages which I posted a long time ago that have been archived. I'm not going to tell you where, you'll have to find them yourself. Their contents are not catastrophically embarrassing, but they're definitely not something I would enjoy having out there completely in public view - hence the encryption.

    My problem is: the encryption I used when I was working on a 386 is now trivial to decrypt on modern machinery - potentially rendering my messages fully in the public view - at least to anyone who is marginally motivated.

    If Internet archiving is more than a passing trend, I urge you to be very careful about what you put online - period. Encrypted content may be safe now, but when you're applying for a job 20 years down the road and your potential employer can view all of your PGP'd email from today, you might have one less job opportunity.

    I'm not even inclined to entertain thoughts about how bad things could get for you if the changing climate of politics were to count your antiquated encrypted correspondence as disloyal.
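
    A rough back-of-the-envelope sketch of why old encryption ages badly: the expected time for an exhaustive key search depends on the key length and on how many keys per second the attacker can try. The search rates below are made-up round numbers, not measurements of any real machine, and the function name is hypothetical. The sketch also ignores weaknesses in the cipher itself, which can make things much worse than brute force suggests.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_search(key_bits: int, keys_per_second: float) -> float:
    """Expected years to find a key by brute force (half the key space)."""
    expected_trials = 2 ** key_bits / 2
    return expected_trials / keys_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Both rates are assumed round numbers for illustration only.
    for label, rate in (("older machine", 1e4), ("newer machine", 1e9)):
        for bits in (40, 56, 128):
            years = years_to_search(bits, rate)
            print(f"{label}: {bits}-bit key, about {years:.3g} years")
```

    Every extra key bit doubles the work, and every doubling of attacker speed takes one bit back, which is the whole problem with leaving ciphertext in a permanent archive.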

    1. Re:Old encryption can be broken! by GuNgA-DiN · · Score: 1
      "I urge you to be very careful about what you put online...."

      Very good advice! I deal with clients all the time who say things like: "We want to put all these images on the web...but, we don't want anyone to copy them". To which I reply (in a nicer sort of way): "If you don't want them to take your shit...don't put it on the Internet you morons!"

      Fact: The Internet is a public network of networks
      Fact: If you put shit on the public Net anyone can access it
      Fact: If people steal your shit it is your fault for putting it on the Internet

    2. Re:Old encryption can be broken! by SuiteSisterMary · · Score: 2

      What's that quote from Cryptonomicon, when the guy tells his buddy to use 4096 bit encryption? Something like "I want this encrypted until men no longer do evil."

      --
      Vintage computer games and RPG books available. Email me if you're interested.
    3. Re:Old encryption can be broken! by Jon+Howard · · Score: 1

      I think that any number whose length we could express would be too optimistic.

  142. Re:Library archives are given broader copyright us by Anonymous Coward · · Score: 0
    That caused the misperception that the web is volatile by nature and many did not realize how their actions could come back to haunt them.

    Waaaah. Waaah. Civil Lib-babies want to opt out of consequences!

  143. Copies of copyrighted material by nixterino · · Score: 1

    So making archival copies of copyrighted material on the Web is bad, but making copies of other copyrighted material (music, etc.) is okay?

    Boo hoo - how dare somebody copy my Web site! The nerve of them!

  144. My thoughts on this.. by zeno_2 · · Score: 1

    If you don't want your data to be cached, then put it behind a password-protected site.

    You put something on the internet, and it's going to be cached by a lot of places; some places may dump that cache weekly (proxy servers), or copies may stay up for a while (Google).

  145. Deal with it.... or else... by Anonymous Coward · · Score: 0

    Seems to me that everyone is a little confused. Any web site is like a book. Once it's written and posted, it's out in the world. So if this opt-in/out conversation is carried all the way, I guess we can burn the Bible/Koran/Buddhist texts... etc.... since they were scribed in the past. The Net NEEDS an archive of past sites/pages/texts. Without it, how can one quote/research past ideas/thoughts and back it up with an actual document? Someone PLEASE explain to me why a site that is in effect a library of history is causing such a stir... I just don't get it. If retaining history is such a bad idea... let's burn all books and forget about the past.... OH... unlike 'spam', you don't have to look at it... and every browser I know has a history... so do I need your permission for that too? I think I see a class action suit against all browser makers...

    Deal with it people...or stop writing something you consider "private" (PERIOD)

    NOW GET OVER IT FOR F**K's SAKE.

    oh, btw, yes /. does suck these days, nothing but a**es like me

    (censored by me, typos and spelling mistakes by me too, deal with that)

  146. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  147. The Net as a publication medium by erichill · · Score: 1
    In Project Xanadu, the idea was that documents are presented to the world as a publication, and that publication is permanent (modulo court order or such). There would be an explicit distinction between publishing something and letting the world view a work in progress.

    Remember that once you've made a public statement in the real world, it's out there and there's nothing anyone can really do about it. You can issue(*) a later correction, retraction, clarification, or whatever, but the original doesn't go away, despite what politicians and other public figures might wish. Now that we all have access to the world's screens, we need to be careful what we say if we care about later consequences.

    (*) There are other aspects of the Web that make this difficult. See the "Related Projects..." section at crit.org for more.

    (apologies if I'm fuzzy on any details)

    --
    Credo sim. - I think I am.
  148. opt-in by EdMcMan · · Score: 1

    Putting something on the web is 'opt-ing in' to allowing people to look at it, index it, and save it if they wish. Do you have a problem with search engines? If not, you shouldn't be complaining about this. Personally, I think the WBM is a good thing (tm), but I think webmasters should be able to opt out if they want.. for whatever reason. I don't know for sure, but maybe the WBM reads robots.txt?

  149. Just wait... by Eythian · · Score: 1

    ...until the Wayback machine archives Google...and Google caches the Wayback machine...and the Wayback machine archives Google...and the entire internet gets sucked into a singularity.

  150. 2 little points by Sabalon · · Score: 2

    1) I was glad that they had one of my old pages on there. I lost it due to a crash (my brain crashed and I wiped it out). I was able to pull it back off their site and get it right back running.

    2) Are we not the same collective group that gets mad at NBC for not wanting us to use our Tivo's? I realize that there are a crapload of people on /., but sometimes the irony is just too much.

  151. Old, "useful" information by Anonymous Coward · · Score: 0

    I wonder if the archives contain any information that the government has since taken down in the wake of 9/11?

  152. Slashdot history... by __aawavt7683 · · Score: 1

    The Wayback Machine goes back to 1998. Upon searching, the Slashdot search engine goes back to 25558. Here's a link. Problem is, with the older Slashdot archive articles, there's NO YEAR. :-( I don't wanna go back through each page marking every December, trying to figure out about where they occur, counting the years backwards... someone else maybe?

    -DrkShadow

  153. Of course, archives should be legal. by shimmin · · Score: 2
    Archiving Web sites ought to be fair use.

    Arguing otherwise is like saying retaining old copies of magazines after the new ones have come out is an infringing use of those magazines.

  154. OT: Local archiving of Wayback machine results? by HEbGb · · Score: 2

    I found some information on the Wayback that I would really like to archive myself - for legally defensive reasons (i.e. trademark use, and to kill patents).

    Is there a way to archive sites from the Wayback machine in a clean (linked) way? I tried using standard web downloaders (Webreaper, Offline Explorer), but they didn't work correctly. Their FAQ says it can't be done, but for some reason I don't believe them... :)

    Anyone have advice? Thanks.

    1. Re:OT: Local archiving of Wayback machine results? by Anonymous Coward · · Score: 0

      Have you tried "File > Save As..." ???

    2. Re:OT: Local archiving of Wayback machine results? by HEbGb · · Score: 2

      Duh. I wasn't asking about saving one page, but an entire site.

  155. Wayback Machine == Friend by GuNgA-DiN · · Score: 1

    Just because you are too lazy to create a robots.txt doesn't make me feel sorry for you. I have had nothing but good luck with the Wayback Machine. I was able to find work I did 4 years ago that I thought was long gone. I was able to find phone numbers to people who had taken down their web sites. I was able to research press releases and license agreements that had been changed by the authors (without telling anyone!).

    So, in my opinion, the Wayback Machine is a great tool to data-mine the past. Just because you don't like it -- tough shit. Create a robots.txt file and maybe you won't get spidered. Your argument is weak.

  156. What do you expect? by humblecoder · · Score: 1

    IANAL, so I can't comment on the legality of archive.org. However, based upon my own sense of "fair play", I think that if you put information on a public web server and allow people free access to it (as opposed to making people pay to view), you can hardly cry foul when somebody actually makes use of the information. If you didn't want the information to be distributed across the Internet, then why did you post it on there in the first place?

    I could see you having a beef if somebody took what you put on your web page, copied it, and claimed it as their own work. That is wrong. However, archive.org doesn't claim to have authored the pages. Visitors know up front that it is an _archive_.

  157. Mind boggles by cicho · · Score: 1

    If Wayback Machine is slashdotted, can I find their archive in Google cache? And does the Wayback Machine archive items from Google cache? And does Google cache the Wayback archives of its cache? And does Wayback archive the Goog**stack overflow**

    --
    "Only the small secrets need to be protected. The big ones are kept secret by public incredulity." - Marshall McLuhan
  158. Why does your post have a 2-line body? by Drunken+Coward · · Score: 0
    --
    Have you been stalked by Seth today?
  159. Fair use it's not by Anonymous Coward · · Score: 0

    Ok... a lot of people here seem to be bringing up robots.txt as a solution, that it is the same as indexing. It isn't.

    People are saying that it is like libraries being asked to pull papers, journals, books, etc. if the publisher no longer wants them published. It isn't.

    First, indexing is not making a copy of the pages at all; most indexes work via a dictionary method, and store at most phrases of about 3 words. Those words/phrases are entered into a database. When you do a search for "microsoft windows" the database looks up that key and sees that the following URLs have that word/phrase in them; it does that for all the words, phrases, and subphrases and then computes the join of all of the records... the result is what you see on your search (basic theory, everyone does it their own way). If one could even consider it archiving (only operating on words or short phrases) it would certainly fall under fair use. What the Wayback Machine is doing is copying WHOLE documents (questionable), and then REPUBLISHING them (just plain wrong, and in violation of copyright unless the author gave them permission or placed the document in the public domain). So now that we have established that indexing and archiving are different, how is robots.txt going to help? If I use it to prevent archiving, I apparently prevent indexing; that is not a real solution. (Note that in this case I should need to opt in to archiving, as the default of copyright is to NOT allow others to copy my work.)
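    The "dictionary method" described above can be sketched roughly as follows. This is only a minimal illustration of a single-word inverted index, not how any particular search engine works; the function names `build_index` and `search` and the example.com URLs are made up for the example, and the "join" is taken to be a set intersection.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query (the 'join' step)."""
    words = query.lower().split()
    if not words:
        return set()
    # Start from the URLs for the first word, then intersect with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical pages, purely for illustration.
pages = {
    "http://example.com/a": "Microsoft Windows release notes",
    "http://example.com/b": "Cleaning windows with vinegar",
    "http://example.com/c": "Microsoft Office tips",
}
index = build_index(pages)
print(sorted(search(index, "microsoft windows")))
```

    Note that the index only records which words appear on which URLs; the page text cannot be read back out of it in order, which is the distinction being drawn against whole-document archiving.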

    Ok, now to address the issue of libraries (or individuals for that matter). It is quite obvious that a publisher cannot reach into your home or library and pull back something you (or they) purchased (well, unless it has a EULA which says they can, but that's a different flame). What they do prevent is copying, regardless of whether the publisher stopped printing it or not. If your library or you started copying entire papers, books, software and gave it out to whoever asked and were discovered, do you really believe that you would be allowed to continue? (Makes a lot more sense when you look at it like software, doesn't it? Same laws apply to both.)

    Ok, so what is the solution? The simplest... keep archiving it, and publish it in 70 years, or 90, or whenever the copyright expires. No it isn't "what people want", it is however the "right" solution (legally).

  160. i can see where this would be helpful by sab0tage · · Score: 1

    if say a federal agency or someone was tracking a child pornographer who had erased his site's content along the way, if archive.org had archived bits of the site they could find enough evidence to take the owner of the website to court and perhaps jail him for life for distributing child pornography

    1. Re:i can see where this would be helpful by xiaix · · Score: 1

      The flipside of this being that now thanks to the archiving, the pedophiles that were members of the site may still be able to find their material in the archives.

      --

      Have you read the Moderator Guidelines yet?

    2. Re:i can see where this would be helpful by gerardrj · · Score: 1

      The other flipside is that since the Wayback is re-distributing the content they may be criminally responsible also.

      Logically:
      1. You know there is child porn on the Internet
      2. Your archive specifically attempts to retrieve, copy and redistribute all Internet content
      3. You knowingly are distributing child pornography.

      --
      Article X: The powers not delegated... by the Constitution...are reserved...to the people
  161. Freaky by filtersweep · · Score: 2

    My ancient vanity site that received no traffic, nor deserved any, has been duly archived. I'm dying of embarrassment at my rudimentary HTML- back in the day.

    My question is why I was even on their radar?

    --


    Those that suggest you "dance like no one is watching" really want to see you make a complete fool of yourself.
  162. Why opt out? by objekt404 · · Score: 1

    If something is put out there for everyone, then why shouldn't there be an archive system of some kind? All it seems to look like is a simple, historical structure of what has gone on.

    Complaining would be like Disney complaining about "Steamboat Willie" being played again & again. Unlike them (Disney) you can't show any loss of revenue from a quick replay....

    --
    "Good, bad, I'm the guy with the gun."
  163. you idiot by blisspix · · Score: 1

    wayback is new? if you have been on the net for any time at all you would know that the wayback project was started many years before your website. They have been capturing this data for at least the last five years now.

    you don't lose control of your copyright because wayback has a copy. they have a copy made under fair use for research purposes, which is how libraries can copy vast amounts of work for college courses and research.

  164. The way I see it ... by Anonymous Coward · · Score: 0

    ... you are a moron. You publish stuff, it gets seen and archived. Tough. Can't be helped, move on.

  165. Re:DAVE WINER - Why he's been missing by SnappingFish · · Score: 1

    I am sure Dave Winer would not find the previous post ironic, nor funny.

  166. No less stringent than the GPL by blueskies · · Score: 2, Insightful
    The copies that they have archived in their databases are individual copies served from the original web requests, so they have the right to keep them. They became their copy when they were originally downloaded.


    You have the right to something once you download it?

    If I copyright my content, other people are not allowed to distribute it without my consent. There is no way around this. I don't have to add extra disclaimers, just a copyright notice. How can there be any argument about this?

    Ok, someone GPLs some software they wrote and put it on their website. If you download a compiled version of the software, you can't redistribute the compiled executable without making the source available. Why? Because the copyright owner (via the GPL) only gives you permission to redistribute if you also make the source available. The owner can do this because the GPL is backed by copyright laws, just like copyrighted web content. Notice I said owner, because the law grants special privileges to people that create content and copyright it. There is no implied social contract that says the content is up for grabs. And there is also no reason fair use even comes close to applying if you are talking about a large quantity of content.

    I do think the archive provides a useful service, but I think they are on shaky legal ground.
    1. Re:No less stringent than the GPL by kevinank · · Score: 2
      If I copyright my content, other people are not allowed to distribute it without my consent. There is no way around this. I don't have to add extra disclaimers, just a copyright notice. How can there be any argument about this?

      Assuming that you were the copyright owner of the original web page, then when you made a copy for the original download to the people running archive.org you were within your rights. Since you gave the copy you made to them, the data is now theirs to dispose of as they please (this is a reasonably straightforward mapping of Copyright law into the digital domain.)

      Within the limits of copyright law, you can make your single (or multiple) originals available to other people without the Copyright owner's consent, assuming we can apply the first sale doctrine to alienation of the data by transfer over public networks.

      Likewise you can do anything else with the original legal copy you have that is permitted under copyright law, such as make fair use of the original. Fair use might be stretched to include the use that archive.org is making of the documents, or it might not, but it has yet to be tested. The only reason you can't say for sure that it isn't a fair use is that fair use isn't a specified set of uses, but any use that the courts consider fair. There are guidelines that have been created for judging fair use, but so far I don't know of any case law establishing archive.org's use as fair or not fair.

      My point was that if you really want them to lock away their database to a location where only they can use their originals then you can probably force them to do so in court. I'm merely of the opinion that the world will be poorer for the loss of readily available information.

      --
      LibBT: BitTorrent for C - small - fast - clean (Now Versio
    2. Re:No less stringent than the GPL by anshil · · Score: 2

      If I copyright my content, other people are not allowed to distribute it without my consent. There is no way around this. I don't have to add extra disclaimers, just a copyright notice. How can there be any argument about this?

      Read and understand the HTTP protocol. HTTP was, from its original design, not only a server-to-client communication; it also allows a lot of proxies in between _caching_ the data (keeping copies of the content). Now, the cache usually goes back some days to weeks, but what's really different between a week and a year?

      --
      Karma 50, and all I got was this lousy T-Shirt.
    3. Re:No less stringent than the GPL by Anonymous Coward · · Score: 0

      >Assuming that you were the copyright owner of the original web page, then when you made a copy for the original download to the people running archive .org you were within your rights. Since you gave the copy you made to them, the data is now theirs to dispose of as they please (this is a reasonably straight forward mapping of Copyright law into the digital domain.)

      Again, they don't own my material. Simple fact. I created my work, I own it, I decide where and when it gets archived. To say otherwise is a complete and total perversion of the Copyright law.

    4. Re:No less stringent than the GPL by Anonymous Coward · · Score: 0

      they own the copy that you gave them.

    5. Re:No less stringent than the GPL by kevinank · · Score: 2
      If you are writing from Europe then you are correct. Under United States copyright law you would be mistaken however. Once a copy of a copyrighted work has been handed out it no longer is under your control. The only rights you maintain over that copy are the ones spelled out in the copyright act which are roughly: the right to publish, the right to publicly perform, and the right to create derivative works. Any other uses, such as the use of reading the work, the use of selling it to another party, or the use of storing it for posterity are not exclusive rights granted to the copyright holder.

      You might be arguing that there was no alienation. That is, that even though you gave a copy to me, you didn't really give it to me, but only loaned it to me for a while or something like that. Whether that position would be held factual would be for a court to decide.

      In any case what you are asking for is simply and plainly contrary to the technological nature of the Web. Cache controls given by the web page designer are advisory, not mandatory. There is no technical means on the Web for doing what you ask. A smart attorney might use that to show that you gave implied consent to have the data copied and cached (even if there was no alienation) by placing the data on a medium where that copying and caching is implicitly a part of the technological means of communication.

      I imagine we will be seeing more case law in the next couple of years on this topic, and the results will probably surprise both of us.

      --
      LibBT: BitTorrent for C - small - fast - clean (Now Versio
  167. cache = memory by paul_cairney · · Score: 1

    Personally I think that society will very soon progress to a stage where the line between human memory (in our brains, information stored by biological reactions) and computer memory (information stored in bits and bytes, currently on magnetic disk drives and silicon substrate "memory" chips, but feasibly in the future stored on biological or sub-atomic storage arrays) will become so blurred that society will cease to differentiate between the two.

    I ask you to ponder the difference between the copies of the website/BBS/usenet comments everybody is so paranoid about which are stored in a reader's human memory (admittedly in 99.9999% of cases inaccurate) and the copy stored (perfectly) by electronic means. Is it breach of copyright if I tell my friend what you said in usenet ten years ago? What if I forward him the post?

    As a species we have progressed by learning from our parents; if in the animal world parents were to refuse to teach their young, they would die very quickly. If we had had to re-invent the wheel every generation I'm sure I wouldn't be posting this comment now. We have progressed as a species by passing on information, and electronic copies of data are merely an extension of our own memories which have the advantage of being a lot more accurate than the human memory. This however doesn't address the line between what is public and what is private....

  168. Like job discrimination by Anonymous Coward · · Score: 0

    And what purpose does that serve other than to discriminate against people. Would you hire a usenet troll? What if he was trolling alt.os.windows? Would you hire someone who got into flame wars? What if they were flame wars against the win-trolls? Would you not hire someone because they subscribed to a scientology group, what about a gay/lesbian group?

    Using any evaluation tool which places any applicant at a disadvantage is not only unscrupulous, but also against employment laws in the US.

    Perhaps you did it to find the people who blasted you out of ng's for being a kook and wanted to hurt those people like they hurt your feelings.

    OOOOOh maybe you based employment on misspelling on usenet. All those damn illiterate people who wrongly understood usenet to be for informal discussions and didn't mind their P's & Q's and dot their i's and cross their t's.

    Well I guess you wouldn't hire me because you would have seen all the people I called "fairy-mushroom-rapists" and "meal-worm-cornhole-fucker." But that is good, because I really wouldn't want to work for a "tutu-wearing-mole-humping" net kook like you.

    PS.... I work for you, you "snivel-shit-cavitated-peanut-headed-half-wit."

  169. Is this where they get all the latest 5 year old.. by Anonymous Coward · · Score: 0

    Slashdot stories.

  170. Billionaire Jimmy James owner of WNYX by Conrad_Bombora · · Score: 1

    has a way back machine, so it must be good.

    Obscure reference only hard-core News Radio fans would get, but not necessarily find funny...

  171. I hope that one day the net credit card by Rareul · · Score: 2, Funny

    transaction companies decide to integrate
    their historical transaction databases.
    That way, when this game is over, we get all of our money back.

    ?sp

  172. Publishing versus sharing by intermodal · · Score: 1

    The internet is a medium for sharing information. It was created for military and later for educational sharing of data and other information. Commercialization of the internet and copyrighted content is nothing but a bastardization of it. Simple truth: The internet is a giant collection of stacks of papers. If you grab a copy of one, no big deal. It was provided without cost anyway. As long as you don't claim that you own it or that you created it if you didn't, there's nothing wrong with it. After all, facts cannot be copyrighted. If someone creates an archive of the internet and makes it available freely, I see no reason whatsoever for anyone to object without a flagrantly correct reason why not (i.e. you indexed my passworded site, you publicly published my email that I didn't make public, etc.). But when you make something freely available to the public over the internet, you have no justification for complaining if someone passes it around in its original form (or an unmodified text doc of it, if it's something that can be distributed in that manner, for that matter).

    --
    In SOVIET RUSSIA... erm...NSA AMERICA, the Internet logs onto YOU!
  173. Copyright and robots by dsoltesz · · Score: 2
    Making a copy for archival purposes is not a violation of your copyright. It's fair use.

    The bigger issue is the rudeness of the archive in ignoring robots.txt and rifling through files that one does not wish to have linked or accessed (e.g. stuff under development that isn't ready for 'prime time' yet).

  174. moderators blow - if you want to know how alexa... by Anonymous Coward · · Score: 0

    got such old data you'll read the above message.

  175. I can't resist.... by madcow_ucsb · · Score: 1
  176. Archives, and their use by XO · · Score: 1

    I am an adoptee.

    I posted a message on May 17th, 1989, on what was then the FidoNet ADOPTION message board. It was then gated to UseNet, and sent presumably across the world, seeking my birth family.

    Because my mother found that message, in a Usenet archive, on the Internet, on May 17th, 1999 (ten years later), we have met. I know who my family is.

    If you think archives are bad, Fuck You.

    --
    "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
  177. This is getting ridiculous. by eatenn · · Score: 1
    What particularly interests me is the fact that the Machine is a relatively new animal, yet it contains snapshots from my sites dating back to 1998. I can't help but wonder: where did they get such old copies of my websites, and who gave them permission to make those copies?

    I'm sorry, this seems ignorant to me.

    You can't just put information on the internet and only have people using it the way you want. Once information is on the internet, available to the general public, it's no longer your concern what is done with it. Saying it's wrong to archive something you've found on the web is almost exactly the same as the RIAA saying you can't convert a CD to mp3 for archival purposes... Heck, it's even worse because we're supposedly talking about *FREE* information here (I'm assuming the Wayback machine doesn't try to crack through protection in order to archive private data, but correct me if I'm wrong).

    It's your own responsibility (and it always has been) to make copyright information available alongside the documents being accessed, that way this data will be archived along with the rest.

    If you want to put something on the web, you have to deal with the pros and cons. You can't have it one way without the other.

    --
    "But the cars are all flashing me, bright lights are passing me, I feel life passing me by" - Stiff Little Fingers
  178. History is more important than your copyright. by Jamie+Zawinski · · Score: 2
    I read somewhere that, when the archive.org folks are asked to delete something from their archive, their response is, "I will gladly delete you from the historical record. Enjoy oblivion."

    I sincerely hope that they don't ever really delete things, and that they ignore robots.txt as far as archiving goes. It's fine for them to not serve back your pages if you ask them not to. For a while. Say, until you are long dead.

    But this information might be interesting to future generations, and frankly, any librarian or archivist owes more to those unborn people than they have any obligation to obey your transitory wishes.

    Copyright laws change.

    Oblivion is forever.

  179. So, Basil, by willpost · · Score: 1

    if I travel back to 1994 and I'm using Mosaic in 1993, I could go look at my old Mosaic web page. But, if I'm still using Mosaic in 1993, how could I have loaded the Wayback javascript page in '02 and traveled back to '94? Oh, no, I've gone cross-eyed!

    Relive an old browser at http://dejavu.org/

  180. robots.txt by Anonymous Coward · · Score: 0

    Not only does it respect robots.txt, but it does so retroactively. In other words, if you create a robots.txt which blocks your site today, all previous content will be blocked the next time they spider your site.

    I know this because we had a robots.txt blocking everything on our site during some new development when the Wayback Machine was announced, and found that we couldn't access anything. Fixed up the robots.txt and now we have archives going back to 1996.
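    For anyone who wants to check what a given robots.txt actually blocks, here is a rough sketch using Python's standard `urllib.robotparser`. It assumes the archive's crawler identifies itself as `ia_archiver` (treat that user-agent string as an assumption to verify against the archive's own documentation), and the rules, the `allowed` helper, and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one crawler entirely, and keep every
# other robot out of /private/ only. "ia_archiver" is an assumed name.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for agent in ("ia_archiver", "Googlebot"):
    for url in ("http://example.com/index.html",
                "http://example.com/private/notes.html"):
        print(agent, url, allowed(ROBOTS_TXT, agent, url))
```

    With per-agent rules like these, blocking an archiving crawler need not also block search-engine indexing, provided the crawler honours its own entry.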

  181. why copyright's important by Anonymous Coward · · Score: 0

    I don't care if it's a simple test page or a great work of art -- if someone made it, it's theirs. It's not public domain. It's not free-for-all.

    When you post something on the web, you don't forfeit copyright. Since when does robots.txt supersede copyright? That's ridiculous. I have pages with copyright notices in the Wayback Machine -- they chose to ignore them.

    Web authors depend on copyright laws. Open source software depends on copyright laws. The only way you can enforce GPL is if you have strong copyright laws.

    The web is dynamic, immediate, and conversational. People express their ideas freely. This is the way it should be. An archive threatens that freedom. Have you ever pulled out a video camera and watched people's behaviour change? They act a little differently when they're being recorded, right. I think content on the web might change too when people find out that everything is being stored in an archive.

    I don't have a problem with the concept of an archive, I just have a problem with the Wayback Machine's implementation. I appreciate the desire to preserve knowledge and information, but an archive needs to be made openly with the cooperation of web authors and administrators, not clandestinely by a third party. It needs to be "Opt-In" only. Right now it's without the knowledge or consent of the site owners.

  182. Old Slashdot, c. 1998 by piranha(jpl) · · Score: 1
    Since I don't have anything better to contribute... =)

    Here's the oldest copy of Slashdot that seemed to work on the Wayback Machine: Nov. 11, 1998. It doesn't look that much different design-wise, but the atmosphere of the comments seems to be significantly different.

    The whole list.

  183. I like it by scottgfx · · Score: 1

    It has allowed me to go back and view a website I designed for NTT (The Japan Telco) back in 1998. As it appears that internet avatars aren't part of their business anymore, it's nice to be able to show people, "Hey look I did this!" :)

    Also, it allows me to go back and laugh at failed post production companies that had websites. (www.brickhouse-editorial.com)

    --
    It's mandatory to wash your hands before returning to the land of Dairy Queen.
  184. Archive is good for keeping companies honest. by ariocksayssquee · · Score: 1

    PacBell just jacked up my DSL price from 39 to 49 a month. When I called to ask why, they lied and said the price was a special promotion.

    Considering I signed up in March of 2000, how the hell should I remember?

    Checking back in the archive I can find that they never said it was a promotion at all. They had a promotion where they gave away a free computer....I never got a free computer.

    There seem to be two main objections. Concern over copyright violations? These are all items that were freely available to the public. They are also all no longer available. Sounds like abandonware. If you want to make your work private, put up a password.

    The other concern seems to be people embarrassed over something they used to have up. Who publishes something with the expectation that they will be able to disavow the publication in the future? Don't write it if you think you might be embarrassed by it later.

    I propose that the possible positive impacts arising from consumer protection vastly outweigh the concerns over

    1) a non-profit making free copies of something previously provided for people to freely copy.

    2) poster's remorse (to poorly coin a term)

  185. Obligatory /. Slashback by Mizery+De+Aria · · Score: 0
    --
    If you're religishitty, KILL YOURSELF!
  186. Doesn't affect me.... (that much) by Arricc · · Score: 1

    Ages ago, I put a quick bit of javascript in the HEAD of my webpages that checked the URL and if it wasn't as expected, booted you to where you should be.

    As a result, when I heard about the Wayback Machine and tried to view old copies of my website I got booted to the current one...

    Oh well.
    ~Fizzgig

  187. Websites from 1998.. by Chicane-UK · · Score: 2, Informative

    I read an article about the site.. the project has actually been running since 1998 - that's when they started collecting people's websites, and adding hardware to their 'collective' to store all the data.. they only made the site public in like 2001 (or whenever it was) despite collecting it for so long.

    I think if you use the Wayback Machine to go back to their own site in 1998/1999 their front page tells you this.

    --
    "Hey! Unless this is a nude love-in, get the hell off my property!!"
  188. The Reason... by Anonymous Coward · · Score: 0

    I'm sure the reason that slashdot posted this story was because they knew it would be flamebait. And, could you think of an easier way to increase traffic to a site like slashdot if their parent company complained to the slashdot founders that their site's banner click-through numbers had dropped? And under the threat of losing funding for this money consuming venture, I'm sure the slashdot founders would be like OSDN lapdogs.

    And, like a true democracy, this article will be moderated out of existence.

  189. What do you have to hide? by Vacilando · · Score: 1

    Two comments from me: 1) What do you have to hide that you're so bothered by archives? 2) OK, so now we go and copyright history as well?

  190. Wrong Cache by Wastl · · Score: 1

    Actually it seems that the wayback machine contains wrong contents also. For my site (www.wastl.net) it reports for the year 2000 the start page of MS IIS. However, the site has been running under Linux since it exists and I have always had full control over DNS...

    Makes me think even more. Actually this is kind of forgery.

    Sebastian

  191. consider yourself lucky! by XLR · · Score: 1

    With today's shitty economy and the multitudes of "web developers" who are out of a job, the wayback machine is a real life saver.
    Most of the projects I've worked on the last few years have gone bankrupt for this or that reason. The sites themselves were technically fine, and there would be no way for me to prove I've done them if it wasn't for the Wayback Machine. My CV/Resume would have been reduced 50% if I couldn't back up a project description with a url..... It's not the developers' fault people can't run their company (or actually, CAN run it, but only towards the ground....)
    And besides, we've all witnessed some of the dumbest ideas ever been put online in the name of 'business'... how else can we teach our children about 'stupidity' if we can't show them those blunders? eh?

    "If we worked on it, we want to show it off" (mao tse bong)

    --
    -----------RL------------ http://www.harelmalka.com
  192. You DID opt in by hummassa · · Score: 1

    Hey, you published the stuff in the Web, right? You had the work to get an HTTP server up and running (or leased one), put the right files in the right dirs, and voila... there were your pages. So, that's it. Now, to the information to be un-published, you have to take special steps (opt-out).

    ---h.

    --
    It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
  193. Double standards by Nephrite · · Score: 1

    Oh, how I love those geeks who cheerfully pirate mp3s and warez, support p2p networks and such and then when somebody absolutely legally takes their publicly distributed content they raise that fuss all over Slashdot. Come to think of it "archives are like SPAM" !!! Even RIAA speaking heads haven't come to such a ridiculous formula.

  194. Re:Opting out ... what about RF recording? by jolshefsky · · Score: 1
    The Star Trek comment struck me: it is certainly illegal to rebroadcast a television show, but what if you made a recording of a range of the electromagnetic spectrum covering all TV channels in a particular area and retransmitted that? For instance, if you had a 200 megasample-per-second PCM encoder and recorded the electromagnetic spectrum from 0-100MHz in New York City, you could play back that spectrum--right into the back of a TV, for instance--and watch any show that was on at the time. This isn't completely unrealistic ... it'd only be 400MB/second for 16-bit samples which is what FireWire can theoretically deliver today, and you could currently pack about 8 minutes onto 200GB (some arbitrarily "large" hard drive) ... about as much audio as you can put into 80MB, and 80MB was huge not so long ago (in real-world terms, kids.)
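
    The figures in the paragraph above can be checked with a quick back-of-the-envelope calculation. This is only a sketch: the constant and function names are illustrative, the drive is assumed to hold 200 decimal gigabytes, and the audio comparison assumes CD-quality stereo (44.1 kHz, 16-bit), which the comment does not specify.

```python
# Assumed inputs, taken from the comment: 200 MS/s sampling (the Nyquist
# rate for a 0-100 MHz band), 16-bit samples, and a 200 GB drive.
SAMPLE_RATE_HZ = 200_000_000
BYTES_PER_SAMPLE = 2
DISK_BYTES = 200 * 10**9          # assumption: decimal gigabytes

# Assumption for the audio comparison: CD-quality stereo PCM.
CD_BYTES_PER_SECOND = 44_100 * 2 * 2


def data_rate(sample_rate_hz: int, bytes_per_sample: int) -> int:
    """Raw PCM data rate in bytes per second."""
    return sample_rate_hz * bytes_per_sample


def minutes_of_recording(capacity_bytes: int, bytes_per_second: int) -> float:
    """How many minutes of a stream fit into the given capacity."""
    return capacity_bytes / bytes_per_second / 60


rf_rate = data_rate(SAMPLE_RATE_HZ, BYTES_PER_SAMPLE)
rf_minutes = minutes_of_recording(DISK_BYTES, rf_rate)
audio_minutes = minutes_of_recording(80 * 10**6, CD_BYTES_PER_SECOND)

print(f"RF capture rate: {rf_rate / 10**6:.0f} MB/s")
print(f"RF minutes on 200 GB: {rf_minutes:.1f}")
print(f"CD audio minutes in 80 MB: {audio_minutes:.1f}")
```

    The 400 MB/second and "about 8 minutes" figures hold up under these assumptions. Note that FireWire 400 is rated at 400 megabits per second (roughly 50 MB/s), so the FireWire comparison appears to mix up bits and bytes.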

    I think things like WebArchive straddle that line ... in some ways, they're making a snapshot of the entirety of the Internet as a whole and providing access to that. However, since it's done by copying data from individual servers, it isn't really all that similar.

    --
    --- Jason Olshefsky

    Karma: Poser (mostly affected by adding this line long after everyone else did)

  195. Upper Limit?? by Anonymous Coward · · Score: 0

    In one of the articles about the wayback machine, one of the creators commented that the amount of content on the internet has an upper bound (5 billion people typing 60 words per minute, 24 hours per day.. etc)

    I was just thinking, if Google caches the Wayback machine, and the Wayback machine caches Google, don't we have an infinitely growing, ever changing cache? (Assuming that the systems constantly checked for changes in the other site...)
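
    For scale, the upper bound quoted above can be worked out roughly as follows. This is a sketch only: the bytes-per-word figure is an assumption (about five letters plus a space), and the names are made up for illustration.

```python
# Inputs quoted in the comment: 5 billion people typing 60 words per
# minute, around the clock.
PEOPLE = 5_000_000_000
WORDS_PER_MINUTE = 60
MINUTES_PER_DAY = 24 * 60

# Assumption: an average English word is about 5 letters plus a space.
BYTES_PER_WORD = 6


def words_per_day(people: int, words_per_minute: int) -> int:
    """Maximum words typed per day by the whole population."""
    return people * words_per_minute * MINUTES_PER_DAY


def bytes_per_day(words: int, bytes_per_word: int) -> int:
    """Plain-text size of that many words."""
    return words * bytes_per_word


daily_words = words_per_day(PEOPLE, WORDS_PER_MINUTE)
daily_bytes = bytes_per_day(daily_words, BYTES_PER_WORD)

print(f"{daily_words:.2e} words per day")
print(f"{daily_bytes / 10**15:.2f} petabytes of text per day")
```

    The bound limits how fast hand-typed text can grow per day. It says nothing about machine-generated copies, which is exactly the loophole the mutual-caching scenario relies on.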

  196. Re:Library archives are given broader copyright us by zBoD · · Score: 1

    > Strange that such a complaint would appear
    > within a group espousing that "information
    > wants to be free."

    Who told you we were a "group"?

    BoD

    --
    BoD
  197. Copyright vs. employer right by heroine · · Score: 2

    How much should employers find out about you based on the Wayback Machine?

  198. The Value Of Archiving by chaoticset · · Score: 1
    "The way I see it, archives are much like SPAM; I never opted in, why should it be my responsibility to opt out? I manage a number of domains and the process of refining robots.txt files and submitting myself to the Wayback Machine for removal seems to be intrusive. Worse, domains I've abandoned (which have lapsed or been re-registered by someone else) are forever archived in the Machine and I have no way to exclude them. Why should I have to deliberately remove my copyrighted material from an archive which was never granted permission to replicate that material in the first place?"

    Well, frankly, if your primary concern is not "how you feel" but "whether other people will view my site or not", then you should let WayBack do its job.

    Whenever I've gone to a web page and found that they've blocked themselves (usually only obvious if their main page is unavailable on WayBack) I know that the people running that website don't give a damn about the content there. People who underestimate the value of content usually aren't worth my time; they say stupid things, like "Why would I read a book?" or "So what if the plot sucks, it's just an action movie."

    If your concern is appearing intelligent to your customers/readers, then you want WayBack crawling all over your pages. If you have no such concern, then feel free to tell WayBack to stop archiving you.

    As for why you should have to ask someone to remove their copies of your crap from their archive...you offered it publicly. If you can't handle a public archive of your public site, get the hell off the Net.

    Think of the trust you can gain when a user ends up at the WayBack and sees that you've been publishing for X years. Think of the spirit of cooperation produced when you tell WayBack, "Yeah, archive me. I'm a valuable source, and I participate in a community."

    WayBack is a free resource. If what you're doing has no return value, no ability to be updated, and no reason to be archived...then what's worthwhile about what you're doing?

    --

    -----------------------
    You are what you think.
  199. No Wayback by Anonymous Coward · · Score: 0

    The company I work for (a big telecom concern) has wayback blocked by the s-e-c-u-r-i-t-y p-r-o-x-y f-i-r-e-w-a-l-l.

    Your old content may not be getting seen by as many people as you fear.

  200. the Ghost of Slashdot past by Anonymous Coward · · Score: 0
    IBM announces a 25 gigger Hardware Posted by Hemos on Wednesday November 11, @10:11AM from the why-i-could-put-3/4-my-cd-collection dept. Booker writes "So IBM announces a 25 gig hard drive... does the world need this yet? Unless this is in a RAID, would you really want to trust 25 gigs on a single drive? What would you use this for? 400+ hours of MP3s comes to mind... " Read More... 64 comments
  201. Re: Permission by dbialac · · Score: 0
    I can't help but wonder: where did they get such old copies of my websites, and who gave them permission to make those copies? I certainly didn't provide either.

    Actually, the day you posted your web site, way back when, you gave permission. You put that content out on the web with the intention of allowing others to view your material. The fact that some people chose to store it and redisplay it is simply an element beyond your control.

  202. archive this! by zootread · · Score: 1

    Wow, and you just admitted to being a pothead on a page that will end up in Google's cache forever. Great idea.

    Oh no, everyone will know the one and only "zootread" is a pothead. What will I ever do? I've ruined the reputation of my web alias. There's only one thing I can do now... cough hack cough.. damn, that's some good stuff. now what were we talking about?

    --
    Zoot!
  203. How they got stuff from 1998 by boyko · · Score: 1

    I believe that the Wayback Machine started as "Alexa" - a browser enhancement tool for MSIE and Netscape that predated (and probably gave the idea for) the "find similar" buttons on modern web browsers. Alexa was a very good tool, and I used it. One of its features was the cache, which did archive items - that in itself has turned into Alexa's killer app. With spidering and the Alexa clients still out there, I'm not surprised if they have stuff from 1998.

    As for "opting out" - legally, you don't have a leg to stand on. Wayback does acknowledge robots.txt, and you published the information publicly. Wayback provides a service - the ability to archive the internet - that would not work under an "opt-in" policy. The "opt-out" is neither intrusive, nor illegal, nor a violation of copyright. Implied in the fact that the material was published online is the fact that in order to access that information, copies of it would be downloaded to multiple hard drives - otherwise the information would not have been accessible. Once on the hard drive, the physical bits and bytes become the property of the user and can be accessed at any time (while the content those bits and bytes translate into may remain yours, copyright-wise).
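
    For anyone wondering what the robots.txt opt-out looks like in practice, here is a minimal sketch using Python's standard urllib.robotparser. The user-agent name "ia_archiver" is an assumption (it is the name the archive's crawler was generally reported to use), and the example.com URL and helper function are purely illustrative.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that shuts out one archiving crawler but leaves the
# site open to everyone else. "ia_archiver" is an assumed crawler name.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


page = "http://example.com/index.html"  # hypothetical page
print("ia_archiver:", is_allowed(ROBOTS_TXT, "ia_archiver", page))
print("other crawler:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

    This only shows how a well-behaved crawler would read the file. Whether a given crawler honors it, or applies it to copies already archived, is up to that crawler's operator.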

    What the Wayback Machine does is take the cache on the hard drive and make it available. It makes no claims to ownership of the property, it provides opt-out mechanisms to the owners of the property, and it does not alter the content. In that respect, I cannot possibly see any violations of copyright law. Yes, through advertising they may make a profit on other people's work, but if a specific complainant wishes to have his work no longer indexed by Alexa or Wayback, the information is removable. Lack of complaint becomes implied consent, until such time as one complains.

    The similarities to SPAM are not useful in the least. Spam is unsolicited advertising that forces the recipient to bear the burden of the receipt of the message. Wayback's "victims" have recourse and are not bothered or harassed. Furthermore, spam (usually) provides no service - unlike Wayback, which provides an archive.

    Yes, there is concern for copyright - Wayback does redistribute material that perhaps the original owner did not want redistributed. But the "opt-out" mechanism doesn't burden the operation of the database or the owner of the original content any further than is required for routine maintenance... Cease & Desist letters are neither necessary nor effective, since Wayback has the recourse of saying that non-authorized material could have been removed at any time.

    I have found Wayback invaluable. In high school, I took a lot of web design jobs on the cheap that now look good on a resume. Since many of those companies have gone under, however, I thought these items were lost permanently - Wayback has been a savior.

    Brian.

  204. Oh, my God that thing is good... by jcpii · · Score: 1

    I'm rather amazed that the wayback machine found *6* old versions of my college-days website! Does anyone have an impressively obscure website they've found on the machine?

  205. Information wants to be free by Rupert · · Score: 2

    ... in the same way that water wants to run downhill. Finding it strange that people object to certain uses of their information is like finding it strange that people object when you spill their beer.

    --

    --
    E_NOSIG
  206. It was a friend for me, today by Kwantus · · Score: 1

    I just discovered the Washington Post killed off NewsBytes.com, and I had three of their articles in my timeline. Unfortunately WebArchive's last NewsBytes record was Jan 24, but I recovered one article.

    IMO it's hard to use the WWW as a *serious* resource when stuff like news articles just *vanish.* Or, arguably worse, get silently diddled.

    BTW does anyone know why there's such a hole in Web Archive news-site records from mid-July to 9/11?

    my thing: http://geocities.com/hclsmith/my-tl/

    WA's NB records: http://web.archive.org/web/*/http%3A//newsbytes.com

    Sorry I didn't make pretty HTML but /. gives me such hell when I do that I now use plain text.

  207. Get real by belg4mit · · Score: 1

    Oh wait, so you like it when the same question gets asked time after time on USENET? Yes, yes,
    there are FAQs, but they aren't always read and
    don't contain everything. And what if it isn't a FAQ, but merely an occasionally recurring question?

    Are search engines spam? No? Right. They aren't.
    And yet you must manually opt out with robots.txt
    (which doesn't guarantee your protection).

    --
    Were that I say, pancakes?
  208. Parent exaggerates greatly. by Anonymous Coward · · Score: 0

    This site has a more historically accurate analysis of the burning of the royal library. There is no possible basis for the claim that the planet was set back hundreds or thousands of years by this event, as Mediterranean civilization at the time was not much more advanced than it was 100 years later, if it was more advanced at all (on any of the three possible dates suggested - it seems most likely the crazy guy was Julius Caesar). Also, there were several other relatively advanced civilizations in existence at the time (e.g. China, India) which were completely unaffected. We have already surpassed the achievement of the library several times over: the most inflated accounts of the Alexandria holdings number 700,000 scrolls, which is orders of magnitude less information than is contained in, say, the Library of Congress. When we lose an information store like a library or an internet archive, the greatest loss is not to the advancement of industrialization, which tends to work on human expertise, but to the knowledge of later historians and anthropologists. The lesson we should be learning is that a single repository of information presents a single point of failure, and the Wayback Machine presents a means to keep our history from disappearing.