Slashdot Mirror


Wayback Machine Safe, Settlement Disappointing

Jibbanx writes "Healthcare Advocates and the Internet Archive have finally resolved their differences, reaching an undisclosed out-of-court settlement. The suit stemmed from HA's anger over the Wayback Machine showing pages archived from their site even after they added a robots.txt file to their webserver. While the settlement is good for the Internet Archive, it's also disappointing because it would have tested HA's claims in court. As the article notes, you can't really un-ring the bell of publishing something online, which is exactly what HA wanted to do. Obeying robots.txt files is voluntary, after all, and if the company didn't want the information online, they shouldn't have put it there in the first place."

182 comments

  1. Simple post by Kagura · · Score: 3, Informative
    1. Re:Simple post by Anonymous Coward · · Score: 0, Flamebait

      Why is the parent redundant? The summary does not list the URL at all

    2. Re:Simple post by freehunter · · Score: 2, Insightful

      Uhh, yeah, the summary linked to www.archive.org/web/web.php or something like that, which is the site in question. I know everyone likes to rag on the editors, but they aren't horrible, at least not all the time.

  2. Jimmy James says... by Anonymous Coward · · Score: 1, Informative

    "Dave, don't mess with the man with the wayback machine."

  3. I want.... by Whiney+Mac+Fanboy · · Score: 4, Funny
    Obeying robots.txt files is voluntary, after all, and if the company didn't want the information online, they shouldn't have put it there in the first place.

    I want a search engine that only indexes items excluded in the robots.txt file :-)
    --
    There are shills on slashdot. Apparently, I'm one of them.
    1. Re:I want.... by hackstraw · · Score: 2, Interesting

      I want a search engine that only indexes items excluded in the robots.txt file :-)

      What's interesting is that I've heard of robots that do that exclusively. It may have been here on Slashdot, but I've heard of people putting stuff in the exclude list in their robots.txt, and some robots _ONLY_ searched those files.

    2. Re:I want.... by Kamineko · · Score: 1

      Can't you do this yourself with the Google API?

  4. Autolawyers by Doc+Ruby · · Score: 3, Insightful

    What's really disappointing is that it's apparently cheaper to pay lawyers to settle a case than it is to defend your right to ignore optional guidelines like robots.txt in US courts.

    If Congress were serious about keeping the US economy "safe and effective", it would reform the "lawyers' job security" laws. Instead it will surely make them even worse, and make the lawyer tax on technology mandatory.

    --
    make install -not war

    1. Re:Autolawyers by arthurpaliden · · Score: 3, Insightful

      Unless lawyers are paid by the state, like doctors in Canada, they cannot be considered officers of the court who's job it is to represent your rights before said court. Once they accept payment from a client, either actual or pending, they become no more that hired sales consultants peddaling their clients version of the truth.

    2. Re:Autolawyers by hackstraw · · Score: 2, Informative

      If Congress were serious about keeping the US economy "safe and effective", it would reform the "lawyers' job security" laws. Instead it will surely make them even worse, and make the lawyer tax on technology mandatory.

      I don't see that happening any time soon -- http://www.yourcongress.com/ViewArticle.asp?article_id=1671

    3. Re:Autolawyers by The+Only+Druid · · Score: 1

      I don't agree with any of the statements in your post.

      "Unless lawyers are paid by the state, like doctors in Canada, they cannot be considered officers of the court who's job it is to represent your rights before said court. Once they accept payment from a client, either actual or pending, they become no more [than] hired sales consultants [peddling] their [clients'] version of the truth."


      Second, there is no distinction between being an advocate for a client's version of the truth, and being an advocate for that same client's rights in court. Unless you presume that your client is intentionally lying or otherwise misrepresenting their case to deprive the other party/parties of rights, then these two concepts are identical. If you do, in fact, presume that the client is intentionally attempting to deprive the other party/parties of their rights, then no bar in the country believes it is ethical for that lawyer to proceed.

      Third, there is no proper analogy between the hypothetical socializing of medicine and of law. Doctors attempting to heal a single patient are not competing with one another, while lawyers for adverse clients in a single case are by definition competing with one another. While I will not claim there are no possible benefits to socializing lawyers, those benefits are not based in some such analogy.

      --
      "Stumble before you crawl"
    4. Re:Autolawyers by Doc+Ruby · · Score: 4, Insightful

      There's a good case to be made for lawyers being paid by the state, as they certainly are working in those offices on that business. But even more than doctors they cannot be allowed to make their own interests coincide with those of the state. Lawyers often work for people against the state, which must be recognized by the state as a primary responsibility of lawyers. Doctors rarely find their interests conflicting with those of the state (except when they're not getting paid on time ;), so that structure isn't as dangerous.

      There's probably a way to ensure that lawyers represent people's rights better than they do now. Regular random audits of billings and practices. More "contempt of court" punishment. More suspended/revoked licenses, especially for repeated frivolous representation. More "malpractice" awards. There ought to be more competition, with more standardized reviews contextualizing all those "scores", published for consumers.

      Lawyers even more than doctors hide behind consumer ignorance and blind "respect". Exposing their performance as part of the shopping process would make them more competitive, and better adhere to the required "ethics" that usually are assumed to come with the tie.

      --
      make install -not war

    5. Re:Autolawyers by Doc+Ruby · · Score: 1

      Interesting stats. I'd love to see the percentage of challengers to incumbents who are lawyers. Every second November, like this coming November 7, 2006, we can fire all the lawyers in the House, and probably about 30% of the lawyers in the Senate. And replace them with people who legislate, rather than lawyer.

      --
      make install -not war

    6. Re:Autolawyers by The+Only+Druid · · Score: 1

      Wow, worst formatting errors I've ever let through. Obviously, my text shouldn't be italic, and the paragraphs should be introduced as "first" and "second". Ugh.

      --
      "Stumble before you crawl"
    7. Re:Autolawyers by Bloke+down+the+pub · · Score: 1

      I don't agree with any of your use of italics. Which are belong to us. Or something.

      --
      It's true I tell you, feller at work's next door neighbour read it in the paper.
    8. Re:Autolawyers by nebaz · · Score: 1

      They don't have to be paid by the state, merely licensed by the state. That license comes with certain responsibilities, I think some pro-bono work must occur every year under some circumstances, for example.

      --
      Rhymes that keep their secrets will unfold behind the clouds.There upon the rainbow is the answer to a neverending story
    9. Re:Autolawyers by The+Only+Druid · · Score: 1

      It's worth noting that your suggestions about increased contempt and malpractice damages (against lawyers) are possible today, without any new legislation: you would probably be surprised how much existing leeway there is for judges to make such damages. For a variety of reasons, they rarely do so. I like the idea of random audits, but it'd require a very sophisticated system of deployment to prevent harassment (for example, how do you weight a lawyer's likelihood of being audited? Should a more prolific lawyer be more likely to be audited?).

      --
      "Stumble before you crawl"
    10. Re:Autolawyers by Doc+Ruby · · Score: 2, Interesting

      Lawyers should be required to instruct (off the clock) clients how to complain, and judges should ask clients if they've been informed (checking against a form the client signs). Failure should be like violating Miranda rights.

      Yes, a more prolific lawyer should be more likely to be audited. Probably every nth case (by all lawyers) should have an audit initiated secretly to follow the proceedings, reporting malpractice as it's observed, so corrections aren't applied only after the case is derailed. That doesn't sound so sophisticated, but it does seem like lawyers would spend their careers learning to abuse it. NP complete, but best effort counts.

      Another big reform that seems essential is to direct all punitive damages (not compensation damages) to the state, or perhaps even to some certified victim's fund, rather than to the plaintiff (and a percentage to their lawyers). That seems like a fundamental abuse that needs to be fixed, and would help fund a better justice department to make better decisions. Oh, and big penalties for lawyers introducing invalid evidence, all evidence determined before trial in separate hearings... anything for lawyer accountability to standards would make big improvements.

      --
      make install -not war

    11. Re:Autolawyers by AuMatar · · Score: 1

      The main reason they rarely do- most judges used to be lawyers.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    12. Re:Autolawyers by Blakey+Rat · · Score: 1

      Yeah, but the problem here is that archive.org kept the material accessible even though their own policy is to delete material if robots.txt says to. It has nothing to do with the right of archive.org to ignore the robots.txt file, it's all about whether archive.org must follow their own published policies.

    13. Re:Autolawyers by MindStalker · · Score: 1

      From what I understand there is a group of lawyers who are assigned to you if you are charged with a crime and can't afford a lawyer yourself. Despite what you may see on TV these lawyers do a decent job (not always) of disagreeing with the State. And if they do a bad job you can often get a judge (who seems to be really good at disagreeing with the State) to rule that your lawyer was incompetent. Sure, if we went to a system of all public lawyers we would need some tougher checks and balances, but so far it seems the third branch of the government is pretty good at disagreeing with the first two. /I'm sure I'm gonna get some liberal to jump on this and argue about recent appointments and such and such..... Flame On!! :)

    14. Re:Autolawyers by Doc+Ruby · · Score: 1

      It's probably worth trying an experiment expanding public defenders and prosecutors to encompass a greater percentage of criminal cases. Maybe even require private lawyers to rotate through those offices something like 1 of every 10 years. If successful, maybe it's worth trying with civil law, too.

      FWIW, I'm not really "a liberal", but I did notice that more Conservative justices overturn Congress more than less Conservative justices. Which makes calling them "Conservative" ironic, and makes the Conservatives "activist judges". I don't think any of them, no matter how "liberal", act to overturn laws nearly enough.

      --
      make install -not war

    15. Re:Autolawyers by MindStalker · · Score: 1

      I think it's become fairly obvious by now that "liberal" judges are ones that use the constitution to say things we've been doing all along were bad. While "conservative" judges are ones that use the constitution to say things we've just started doing are bad. /Don't like any judge that tries too hard to read what they want to hear out of the constitution. //Don't think anyone really purposefully does that, it's just how their mind is set.

    16. Re:Autolawyers by Doc+Ruby · · Score: 1

      Obvious to Republicans, maybe. What about recent crimes like the NSA wiretapping, lying us into Iraq, Guantanamo, Abu Ghraib, Terri Schiavo, signing statements...

      --
      make install -not war

    17. Re:Autolawyers by MindStalker · · Score: 1

      And you think these things are new in war??? HAHA.

      I wasn't trying to make a Republican/Democrat statement; I was trying to cover the old adage of
      "Conservative" = Likes things the way they are.
      "Liberal" = Likes to try new things.

    18. Re:Autolawyers by Doc+Ruby · · Score: 1

      War on Americans like domestic wiretapping, Iraq lies, American torture gulags, pandering to zombie lovers and unitary executive tyranny is recent, but not what "Conservative" judges are ruling against. They're ruling against civil rights, labor, environmental, "Watergate" oversight laws. Just because "Conservatives" want to "roll us back" to an imaginary past doesn't mean they're really "conserving" anything. Just like "liberals" don't necessarily want to "try new things", but usually do want to keep free from old constraints.

      And just because these recent American crimes are old war crimes doesn't make them OK, or something we can accept, especially having found ways to move past them.

      And just because you prefer "Conservative" activist judges to "liberal" ones doesn't make the Conservatives any less activist. They're more activist, and more dangerous.

      --
      make install -not war

    19. Re:Autolawyers by MindStalker · · Score: 1

      And just because you prefer "Conservative" activist judges to "liberal" ones doesn't make the Conservatives any less activist. They're more activist, and more dangerous.

      I don't think I said that. I keep trying to make neutral statements, and I keep getting attacked.
      I guess I just learned lesson one in Internet, don't bother arguing a neutral position against someone who obviously has an axe to grind.

    20. Re:Autolawyers by Doc+Ruby · · Score: 0, Troll
      You said
      so far it seems the third branch of the government is pretty good at disagreeing with the first two. /I'm sure I'm gonna get some liberal to jump on this and argue about recent appointments and such and such
      .

      Then you said
      I think it's become fairly obvious by now that "liberal" judges are ones that use the constitution to say things we've been doing all along were bad. While "conservative" judges are ones that use the constitution to say things we've just started doing are bad.


      When I specified recent crimes which "Conservative" judges say are OK, you said
      And you think these things are new in war??? HAHA.


      You're obviously a "Conservative". You obviously prefer "Conservative" judges. You obviously think that "Conservative" judges are more likely than "liberals" think to "disagree" with the rest of the government. They're not, and they're more dangerous. And you're grinding your ax while hiding it behind your back, under judges' robes, in true Conservative fashion.

      When you have a neutral position, I'll argue it in neutral terms, unless you're denying the highly unbalanced context of the issue. You won't catch me helping pretend that Conservative judges aren't activists, especially when they are tacit activists ignoring the crimes of fellow "Conservatives".

      The Internet lesson is that you can say something to someone who won't play along with the denial that people in a familiar echo chamber will play. You just might learn something new.
      --
      make install -not war

    21. Re:Autolawyers by MindStalker · · Score: 1

      Weeeee this is fun. I generally vote libertarian, but yes you're right, for the most part loud liberals annoy me slightly more than loud conservatives, but they both annoy me.

      Let me break down the points you make.

      My quote I'm sure I'm gonna get some liberal to jump on this and argue about recent appointments and such and such

      Ok yea sure, I was trying to be offensive there, sorry. I was just thinking that in general judges are pretty neutral, but I was sure I was going to hear arguments that recent judges are overly conservative... Which obviously I did... And no I'm not going to argue the point, because you're not completely wrong.. I just hear it from every side, about "activist" judges. And I get tired of it.

      I think it's become fairly obvious by now that "liberal" judges are ones that use the constitution to say things we've been doing all along were bad. While "conservative" judges are ones that use the constitution to say things we've just started doing are bad.

      As I already stated, all that means is

      "Conservative" = Likes things the way they are.
      "Liberal" = Likes to try new things.

      You can try to deconstruct my viewpoint out of that, but it's a stretch.

      And NO I don't think these recent crimes are OK. Just that war crimes aren't new, but I think the Supreme Court judges do have the ability to separate their political leaning from what this "administration" wants. As it really owes nothing to this "administration". So maybe it's their own opinions.. Just maybe...

      Anyways don't try to label me so fast. It really pisses me off. /I'm sure I'm gonna get some liberal to jump on this and argue about recent appointments and such and such

    22. Re:Autolawyers by MindStalker · · Score: 1

      Ok somehow that last "/I'm sure I'm gonna get some liberal to jump on this and argue about recent appointments and such and such" was pasted to the bottom of the page. Ignore it :)

    23. Re:Autolawyers by Doc+Ruby · · Score: 1

      I knew you'd plead "Libertarian". That's the weasel move of cryptorepublicans. Like ignoring the Supreme Court justices' party affiliation in their votes, like voting Bush into office. Like the two "Conservative" Bush appointed justices who helped create Bush's torture gulags and unitary executive tyranny.

      I didn't deconstruct your viewpoint. I just reconstructed your side of the discussion, which is contradictory to some of your defensive statements. Like denying that recent crimes "Conservatives" ignore or commit would make them "liberal". There's nothing new under the Sun, but there are new abuses against old laws. While your definitions are a nice theory, the actions of actual "Conservatives" don't abide by them, nor do "liberals". The reason I keep putting those terms in quotes is that they're only nominally accurate, and practically meaningless. It's actions that I care about, that speak louder than any "loud" partisan, whether "liberal" or "Conservative".

      --
      make install -not war

    24. Re:Autolawyers by MindStalker · · Score: 1

      Alright, I took the flame-bait hook, line, and sinker. You win! /Arguing on the internet is... you know the rest

    25. Re:Autolawyers by Doc+Ruby · · Score: 1

      You took the flamebait? You are the flamebait. Your original post concluded with "Flame On!! :)".

      I took the bait, replied without flaming, though apparently you can't tell the difference between holding your own contradictory words up to your face and a flame.

      Along the way, you admitted "Ok yea sure, I was trying to be offensive there,", while I offered "The Internet lesson is that you can say something to someone who won't play along with the denial that people in a familiar echo chamber will play. You just might learn something new." . Which you'd probably think is nothing but a "flame", because '"Liberal" = Likes to try new things.'.

      You prefer the old things, like "I keep trying to make neutral statements, and I keep getting attacked", the old conservative pretense of partisanship masked with neutrality denials.

      You keep flamebaiting, flaming, denying, and ignoring your own serious defects. If I wanted to just flame you, I'd just point out that you're from Florida. Damn right I win.

      --
      make install -not war

    26. Re:Autolawyers by MindStalker · · Score: 1

      So you disagree with '"Liberal" = Likes to try new things.'.

      Have you taken a civics class? Ever??

  5. Exclusion policy.... by Anonymous Coward · · Score: 1, Informative
    The whole exclusion policy

    Thought I'd go karma slutting, maybe have a load of karma hit you too. ;-)

  6. Don't need no Wayback by kaizenfury7 · · Score: 5, Funny

    If you go directly to their site, you get a version of their site that looks like it's from 1995.

    1. Re:Don't need no Wayback by cptgrudge · · Score: 2, Funny

      Quick! Get those people some Rounded Corners and Gradients!

      Welcome to Web 2.0!

      --
      Qualitas edurus commercium, nullus penitus net rimor, nullus deus beneficium
    2. Re:Don't need no Wayback by alsundma · · Score: 1

      Wow! I thought you were linking somewhere else as a joke. That site takes me right back to the glorious 1990s.

    3. Re:Don't need no Wayback by loraksus · · Score: 1

      Not anymore...
      Go Slashdot!

      --
      1q2w3e4r5t6y7u8i9o0pqawsedrftgthyjukilo;p'azsxdcfv gbhnjmk,l.;/
    4. Re:Don't need no Wayback by MindStalker · · Score: 1

      No, nobody needs Web 2.0.

      But the site doesn't even look mildly professional. I could have made that page back in high school, and I SUCK at web design.

    5. Re:Don't need no Wayback by Al+Dimond · · Score: 1

      Since when did "professional" mean "difficult to make"? If the site conveys its content in a clear way who cares if you could have made it in high school? A web site that's simple to implement is a great thing, and extra technologies (that usually will increase development, maintenance and bandwidth costs) need to be justified in terms of how they actually make the site's experience better.

    6. Re:Don't need no Wayback by MindStalker · · Score: 2, Insightful

      I don't know, maybe I just don't expect my local newspaper to look like my high school newspaper.
      Initial impressions go a long way. It may seem silly to some people, but in business it can mean the difference between people taking you seriously and buying your product, or not.

    7. Re:Don't need no Wayback by Anonymous+Brave+Guy · · Score: 1

      Oh, man, you're sooooo behind the curve.

      Triangles, baby. Web 3.0 is going to be all about triangles. Hundreds of thousands of them, all lovingly rendered in real time...

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  7. I sense a little two-faced opinion here by InsaneGeek · · Score: 4, Insightful

    which is exactly what HA wanted to do. Obeying robots.txt files is voluntary, after all, and if the company didn't want the information online, they shouldn't have put it there in the first place

    So by the logic, if I didn't want AOL to release my search information I shouldn't be mad as it's my fault to have used them in the first place? Or that if I want my copyrighted information to not be republished by someone else, I should just simply not publish at all? How about, if I don't want my GPL code resold by someone in a closed source product I should just know better and not put it out in the open to begin with. And that if I post something stupid when I'm 9 we believe it should follow me around throughout my entire lifetime, because a 9 year old should know better.

    1. Re:I sense a little two-faced opinion here by Amouth · · Score: 1

      it should.. follow you around for ever.. but it should also be noted that you were 9.. and the other party has to decide if you at 9 knew better or not.. it is their point of view.

      the way back thing always told you when it was.. never trying to show it off as now

      --
      '...if only "Jumping to a Conclusion" was an event in the Olympics.'
    2. Re:I sense a little two-faced opinion here by fm6 · · Score: 4, Interesting

      Another example: someone I know wrote an essay that he thought only people in his class would ever see. It contained one or two mildly embarrassing disclosures, not terribly personal, but not something you'd want a complete stranger to know about you. Some idiot put it up on the school web site without his permission.

      Here's a nasty possibility. Suppose somebody unintentionally publishes information useful to terrorists. DHS drops by and points out the error, and the information is withdrawn. Does Wayback Machine have a right to keep the information online?

      In fact, Wayback Machine has never asserted their right to keep anything online. As the article points out, they'll remove stuff that's noncompliant with the current robots.txt, even though it was compliant at the time it was spidered. This lawsuit wasn't about their right to keep stuff online. It was just somebody accusing them of being negligent about enforcing their own policies.
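
For anyone wondering what "noncompliant with the current robots.txt" might look like in practice, here is a rough sketch. It is not the Internet Archive's actual code; the `Snapshot` type, the `visible_snapshots` function, the `archivebot` user agent and the example.com URLs are all made up for illustration. It only assumes Python's standard `urllib.robotparser` and shows the idea of checking already-archived pages against whatever robots.txt the site serves today.

```python
from typing import List, NamedTuple
from urllib.robotparser import RobotFileParser


class Snapshot(NamedTuple):
    """A hypothetical archived page: where it came from and when it was captured."""
    url: str
    captured: str  # capture date, e.g. "2003-07-09"


def visible_snapshots(current_robots_txt: str,
                      snapshots: List[Snapshot],
                      user_agent: str = "archivebot") -> List[Snapshot]:
    """Return only the snapshots the site's *current* robots.txt still permits.

    The capture date is deliberately ignored: a page that was allowed when
    it was spidered is hidden if today's rules disallow its path.
    """
    parser = RobotFileParser()
    parser.parse(current_robots_txt.splitlines())
    return [s for s in snapshots if parser.can_fetch(user_agent, s.url)]


# Illustrative data only.
SNAPSHOTS = [
    Snapshot("http://example.com/about.html", "2003-07-09"),
    Snapshot("http://example.com/members/list.html", "2003-07-09"),
]

if __name__ == "__main__":
    rules_added_later = "User-agent: *\nDisallow: /members/\n"
    for snap in visible_snapshots(rules_added_later, SNAPSHOTS):
        print(snap.url, snap.captured)
```

The point of the sketch is that the filter runs at display time, so a robots.txt added years after the crawl can still hide old captures, provided the archive actually runs the check.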

    3. Re:I sense a little two-faced opinion here by gsn · · Score: 4, Insightful

      That's crazy - when you typed in your search term into AOL you had an expectation of privacy and you did not for one minute believe that they would release that data. All webpages are copyright and the Wayback machine is using fair use to archive copies for educational use. If you publish information (it's automatically copyright) and someone reproduces it they might be able to under fair use or they might be infringing your copyright - talk to your lawyer. And yes if you posted something on the net when you were 9 that was stupid it might well follow you around for the rest of your life. Same goes if you were in a porno in college and you put it online. Sorry. Tough shit. Maybe your parents should have paid more attention to your online activities. Or you should have known better. IANAL and 9 year olds may get some protection as minors but basic point remains - you publish something online you had no expectation of privacy. This is not at all what you were doing when you sent AOL your search queries - you published zilch.

      If you post something on the net then I can point my browser to it - there is no privacy, and nor was there any expectation of it. I could have used wget -r -erobots=off on your page every day and got all its content - and I'd have that archive even when you deleted it or moved it into some private archive, and it happily ignored your robots.txt. Since obeying robots.txt is voluntary I simply chose not to.
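
To make the "voluntary" part concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The `may_fetch` function, the `examplebot` name and the example.com URLs are assumptions for illustration, not anything from wget or the Wayback Machine. The only thing standing between a crawler and a disallowed page is the crawler's own decision to run the check.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str,
              honor_robots: bool = True) -> bool:
    """Decide whether a (hypothetical) crawler will request `url`.

    With honor_robots=True the crawler consults robots.txt; with
    honor_robots=False it skips the check entirely, which is roughly
    what turning robots handling off in a mirroring tool amounts to.
    """
    if not honor_robots:
        return True  # nothing in HTTP itself enforces the exclusion rules
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.com/private/a.html"
    print(may_fetch(ROBOTS_TXT, "examplebot", url))
    print(may_fetch(ROBOTS_TXT, "examplebot", url, honor_robots=False))
```

The server never sees the difference between the two calls except in which requests arrive, which is why robots.txt works as a courtesy rather than an access control.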

      News websites often want you to pay for older content but there is nothing theoretically stopping you from saving all the content day by day. You are comparing apples and oranges.

      Here's the summary - we posted evidence online that was used against us in a court of law, we lost, we sued the people who provided that evidence, and because it's cheaper to settle than deal with bloody lawyers we settled with them.

      --
      Reality must take precedence over public relations, for nature cannot be fooled.
    4. Re:I sense a little two-faced opinion here by Chosen+Reject · · Score: 1

      So by the logic, if I didn't want AOL to release my search information I shouldn't be mad as it's my fault to have used them in the first place?

      AOL's privacy policy was not such that your searches would be released to the world at the time people made those searches. That ended up not being the case, so you would have a legitimate concern. Also, AOL searches were not being made public a la a web page at the time people were making them. I am sure many people would not have used AOL, or at least changed their search habits, had they known all of their searches would be immediately posted on the web.

      Or that if I want my copyrighted information to not be republished by someone else, I should just simply not publish at all?

      If this were a copyright issue, it would have been brought up as such. HA is not saying that archive.org violated copyright, only that they ignored a voluntary robots.txt file. If your copyrighted material is being infringed upon, you are more than welcome to stop the perpetrators. However, this is more like time shifting on TV, which in the US and many other countries, is considered perfectly legal.

      How about, if I don't want my GPL code resold by someone in a closed source product I should just know better and not put it out in the open to begin with.

      No, you knew better and GPLd the code. Therefore it falls under a license that gives you the legal right to stop the offending party from reselling it in a closed-source project. But this isn't about licenses, it's about a voluntary robots.txt file.

      And that if I post something stupid when I'm 9 we believe it should follow me around throughout my entire lifetime, because a 9 year old should know better.

      It's unfortunate that sometimes we do some dumb things when we might not have known any better yet still have to live with people knowing it forever. But then, this is no different than how things have gone on forever, it's just that now we have a much larger audience. Lots of kids do dumb things before they fully understand the consequences of their actions, and lots of people remember those things. However, that is no basis for a legal challenge.

      There is nothing two-faced about this. Even if all of /., except for you of course, were actually all of one mind on those issues, this issue doesn't concern any issue you have brought up, except for maybe the 9-year-old thing, but even then, that is just a reality we have to deal with.

      --
      Stop Global Warming!
      Just say no to irreversible processes!
    5. Re:I sense a little two-faced opinion here by DeadboltX · · Score: 2, Interesting

      why do people make such god awful analogies?

      if you give private information to AOL and they release it publicly then you can get upset
      if you post private information on "check-out-my-ssn.com" and it's public to the whole world then you can't get mad.

    6. Re:I sense a little two-faced opinion here by wik · · Score: 1

      What is their policy for websites that no longer exist? Their website says nothing about this.

      I want to remove archives of my websites for hostnames/domains that are no longer connected to the internet. Obviously, the robots.txt method cannot work here.

      --
      / \
      \ / ASCII ribbon campaign for peace
      x
      / \
    7. Re:I sense a little two-faced opinion here by MadEE · · Score: 1
      That's crazy - when you typed in your search term into AOL you had an expectation of privacy and you did not for one minute believe that they would release that data. All webpages are copyright and the Wayback machine is using fair use to archive copies for educational use.
      I am certain that when they published the site, concern over the wholesale copying of it was about as high on their list as privacy is to a search engine user, less so when most search engines' TOSs allow them to pull this crap. Regardless, the use being educational in nature doesn't make something automatically fair use, particularly when it's published publicly.
    8. Re:I sense a little two-faced opinion here by alexhs · · Score: 2, Insightful

      Maybe you need to inform yourself of what Robot Exclusion is and isn't.

      Its purpose is not to censor information but to avoid incidents caused by aggressive robots that could stress WWW servers (introduction in the first link).

      HA's action is revisionism. Like a politician yelling something then a few years later claiming he never said such a thing and threatening people with a piece of evidence to the contrary.

      --
      I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
    9. Re:I sense a little two-faced opinion here by DamnStupidElf · · Score: 1

      Here's a nasty possibility. Suppose somebody unintentionally publishes information useful to terrorists. DHS drops by and points out the error, and the information is withdrawn. Does Wayback Machine have a right to keep the information online?

      Why don't you just play the child pornography card instead? At least that's *illegal*, unlike putting publicly available information online instead of hidden in some dusty library guarded against terr'ists by a librarian.

      The fact is, if something is actually illegal to possess, the Internet Archive can't possess it either. That said, hopefully no DHS flunky notices this case and subpoenas the whole archive to make sure it's clean of terrorist-helping information...

    10. Re:I sense a little two-faced opinion here by Drooling+Iguana · · Score: 1

      Now say "KHAAAN!!!"

      --
      ... I'm addicted to placebos
    11. Re:I sense a little two-faced opinion here by fm6 · · Score: 1

      Why don't you ask them?

    12. Re:I sense a little two-faced opinion here by fm6 · · Score: 1

      Who said anything about putting publically available information online? It might, for example, be private information about a building that makes it easier to blow it up. "Our new death star has a state of the art venting port, located for easy access at ..."

      It's funny that you accuse me of bad faith, since you're lumping me in with the Bush administration's crazy attempts to control information. I didn't say anything about censorship. I simply pointed out that a web site can have legitimate reasons for wanting to withdraw information that was previously on its site.

    13. Re:I sense a little two-faced opinion here by soft_guy · · Score: 1

      Suppose somebody unintentionally publishes information useful to terrorists.

      Your information is useful to people. Terrorists are people. Therefore, your information is useful to terrorists.

      Therefore, you need to refrain from posting any information that is useful to people. Therefore, Slashdot is OK.

      --
      Avoid Missing Ball for High Score
    14. Re:I sense a little two-faced opinion here by DamnStupidElf · · Score: 1

      Who said anything about putting publically available information online? It might, for example, be private information about a building that makes it easier to blow it up. "Our new death star has a state of the art venting port, located for easy access at ..."

      Frankly, anything that has been posted on the Internet should not be considered private anymore. There are too many search engines and archives and caches that can keep it around. If damaging information is ever made available on the Internet, then whatever the information was about has to be changed to prevent that information from being damaging anymore. Like closing the venting port (or putting a grille in it) after lots of Bothans have been dying for some vague but important goal recently.

      It's funny that you accuse me of bad faith, since you're lumping me in with the Bush administration's crazy attempts to control information. I didn't say anything about censorship. I simply pointed out that a web site can have legitimate reasons for wanting to withdraw information that was previously on its site.

      I'm sorry, it's just that when the terrorists are used in an argument about preventing access to information, it's usually information that any private citizen could discover within a few hours of searching in the right places. I'm sure there are plenty of state secrets that shouldn't be on the Internet, but that was my point: If it's illegal for Secret or Top Secret information to be on some web site, it's equally illegal on the Wayback machine and can be removed through the normal legal process. For secrets that websites wish hadn't been published, perhaps they should reevaluate their security if simple information can make them vulnerable. For instance guard schedules and routes can be changed, security codes can be changed, cameras can be added to cover dead spots in coverage, etc. If the entire security of some building or resource depends on some vulnerability that can't be remedied if it's disclosed, then it was never secure to begin with.

    15. Re:I sense a little two-faced opinion here by fm6 · · Score: 1

      My only point was that there are legitimate reasons for wanting to withdraw information. You don't seem to disagree with that, so why are you arguing with me? You should try reading posts before responding to them, instead of trotting out the standard response to the standard argument you assume the other person is making.

  8. If you don't want it read... by saskboy · · Score: 3, Insightful

    ...Don't put it on the Internet. In fact, don't even type it into a computer, or write it down.
    People shouldn't put anything on the Internet that they wouldn't want their worst enemy, boss, NSA, or grandmother to see. Obviously since the porn industry exists online, few people follow this rule, but it's a good one nonetheless.

    I enjoy Archive.org and when I get nostalgic about my websites of the past, it's there to show me a glimpse into history.

    --
    Saskboy's blog is good. 9 out of 10 dentists agree.
    1. Re:If you don't want it read... by Anonymous Coward · · Score: 0

      Of course that's a reasonable precaution to take, but that's exactly because things are the way they are. That doesn't mean there couldn't or shouldn't be change in the way things work.

      (Oh, and what if your company puts it online?)

    2. Re:If you don't want it read... by MindStalker · · Score: 1

      What about my financial information for almost every single bank, credit card, and bill I have? There is little I can do about it.
      It might be fairly secure... But it's on the web. Point is, everything will eventually be on the web; it's only a matter of whether you trust the security of the site. Should you trust the security of myspace? No..

    3. Re:If you don't want it read... by From+A+Far+Away+Land · · Score: 1

      My company already is online. And per its objectives, has links to many other companies who have "it" online too. If you're familiar with the term Free Software or Open Source, you'll have heard the phrase that software wants to be free. It may sound strange to anthropomorphize lines of code, but to me it means that the natural state of information is "free" to everyone, and to conceal it requires work. The natural state of the universe tries to balance vacuums and areas of higher pressure, so when there isn't enough work put into keeping a secret safe, the natural tendency is for it to slip.

      A recent example is of the CNN reporter caught with her mic on in the bathroom. She badmouthed her sister-in-law when she wasn't diligently working at keeping the secret of how she felt. The universe conspired against her secret keeping, and now the whole world knows the real information.

      That's my long winded way of saying, "Shit happens, so plan for it."
      Either don't create a secret, or plan for when it gets out.

    4. Re:If you don't want it read... by saskboy · · Score: 1

      "It might be fairly secure... But its on the web."
      Lack of real information security is the trade we made as a computerized networked society, for convenience in banking. With the effort saved in banking I'd say it's worth it, even with the potential identity scams that plague thousands of people every year. Crime happens whether it's online or off.

      --
      Saskboy's blog is good. 9 out of 10 dentists agree.
  9. unringing the bell? by hguorbray · · Score: 0, Offtopic

    Good thing this isn't anything to do with the Bush Administration, or else they'd have retroactively classified all this stuff as 'Top Secret' and then charged the Wayback Machine with treason under the Patriot Act ....

    and then the machine would find itself held without trial or charges in Gitmo until it turned to rust.

    Sometimes you gotta laugh to keep from crying.....Hopefully you're laughing at this

    1. Re:unringing the bell? by Kainaw · · Score: 1

      Um... The Patriot Act is terrible, but Congress passed the Patriot Act, not Bush. Nobody in Gitmo will ever be charged with anything related to the Patriot Act because it is for surveillance, not prisoners of war. Have you been watching too much Michael Moore? It is idiotic statements about the Patriot Act that keep the public from understanding the truth of why it is bad. So, it can never be fixed. I often wonder if Congress paid Moore and the ACLU to go after the act with idiotic (and completely unrealistic) statements so the stupid public would never know what was really in it.

      --
      The previous comment is purposely vague and generalized, but all of the facts are completely true.
  10. metaphorically speaking by nizo · · Score: 1, Troll
    ... you can't really un-ring the bell of publishing something online...


    For the life of me I can't figure out what ringing a bell and publishing something online have in common. Maybe if we didn't use digital clocks we could turn back the sands of time and use a different mixed metaphor instead?

    1. Re:metaphorically speaking by LordNimon · · Score: 2, Informative

      There's only one metaphor - "you can't unring a bell", so there is no mixed metaphor.

      --
      And the men who hold high places must be the ones who start
      To mold a new reality... closer to the heart
    2. Re:metaphorically speaking by The+Only+Druid · · Score: 1

      The use of the phrase "you can't unring the bell" in the discussion of Free Speech is an old one, based on the concept that no matter what you do after someone rings a bell, you can't "unring" it. The use here as an analogy is appropriate, in that you can't "un-release" information from the internet.

      --
      "Stumble before you crawl"
    3. Re:metaphorically speaking by Anonymous Coward · · Score: 0

      Nah. I heard it on "My Name Is Earl" last night.

      -----

      Earl: Sorry about that. Don't worry, I'll find your dog. But that's it, right, then your life is exactly back to the way it was seven months ago, we're done.

      Scott: yeah, I think that's completely everything back to normal.

      Earl: good.

      Scott: unless (to Tess) you didn't have sex with anyone else while we were broken up , did you?

      Tess: I used my hand on a guy a little.

      Earl: yeah I'm not sure how to unring that bell.

      -------
      MY NAME IS EARL
      1X07 - BROKE STOLE BEER FROM A GOLFER
      Original Airdate (NBC): 08-NOV-2005
      link
    4. Re:metaphorically speaking by whitehatlurker · · Score: 1
      I've never heard this expression either, and I agree that it is poor. You just wait until the echoes of the bell have dissipated and it's like the bell never rang.

      [Curmudgeon]Un-ring? Bah! Nonsense.[/Curmudgeon]

      --
      .. paranoid crackpot leftover from the days of Amiga.
  11. But.... by Stanislav_J · · Score: 2, Informative

    ....even if Wayback did respect the robots.txt (which I was under the impression that they generally do), any pages archived before the robots.txt was placed on the server aren't going to automatically disappear -- they are still there. You have to directly ask them to remove the previously archived pages if you don't want them to be accessible.

    --
    "Every great cause begins as a movement, becomes a business, and eventually degenerates into a racket." -- Eric Hoffer
  12. Wayback of missing documentation? by Anonymous Coward · · Score: 0
    Sometimes you gotta laugh to keep from crying.....Hopefully you're laughing at this

    Laughter is induced by the ironic or unexpected. Unfortunately, I fully expect what you say would be how things would play out :-(
  13. What REALLY pisses me OFF by scenestar · · Score: 4, Insightful

    Is that some sites that used to exist had no robots.txt file, yet still get blocked

    After a certain domain was no longer in use for years some adware search rank linkpharm whatever it is added a robots.txt file to a "hijacked" domain.

    One can now get formerly accessible sites removed from archive.org. EVEN IF THE ORIGINAL OWNER NEVER INTENDED TO.

    --
    perpetually dwelling in the -1 pits
  14. Check out their robots.txt... by Anonymous Coward · · Score: 3, Interesting

    Check out their robots.txt: http://www.healthcareadvocates.com/robots.txt They ONLY restrict the Internet Archive from accessing their web site, but don't restrict any other spider... Haven't they heard of Google's cache?

    1. Re:Check out their robots.txt... by Sir+Pallas · · Score: 2, Interesting

      Which is funny, because ia_archiver is actually the Alexa Internet crawler; it's a throwback to before Amazon.com bought Alexa. (To this day, Alexa donates crawl data to the Archive.)
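
      To see how a file that names only one crawler behaves, here is a minimal Python sketch using the standard library's robots.txt parser. The file contents and the example.com URL below are assumptions made up for illustration, not a copy of HA's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that singles out one crawler by name.
# This is an illustrative stand-in, not HA's real file.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""

def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    page = "http://example.com/page.html"  # placeholder URL
    for agent in ("ia_archiver", "Googlebot"):
        print(agent, "allowed" if allowed(agent, page) else "blocked")
```

      With a file like this, only a client that identifies itself as ia_archiver (and chooses to honor the file) is turned away; every other spider is still free to crawl and cache the pages.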

  15. A world without cooperation by Anonymous Coward · · Score: 5, Insightful

    Obeying robots.txt is "voluntary" in the same sense that obeying RFCs is voluntary. In other words, it isn't. You can technically ignore any and all standards, but there will be sanctions. In the case of robots.txt, these sanctions can very well be a court ruling against you, because robots.txt is an established standard for regulation of the interaction between automated clients and webservers. As such it is an effective declaration of the rights that a server operator is willing to give to automated clients in contrast to human clients. This is especially important with regard to services which mirror webpages. Doing so without the (assumed) consent of the author is a straightforward copyright violation and if the author explicitly denies robot access, then the service operator knowingly redistributes the work against the author's will.

    Even if you don't fear the legal system, disregarding robots.txt can quickly get you in trouble. There are junk-scripts which feed bots endlessly and there is automatic blocklisting of misbehaving bots. If people program their bots to ignore robots.txt, these and possibly more proactive self-defense mechanisms will become the norm. Is that the net you want? Maybe obeying robots.txt is the better alternative, don't you think?

    1. Re:A world without cooperation by pimpimpim · · Score: 2, Interesting
      Yeah, and in a world with full cooperation, you wouldn't have to lock your door because no-one would enter your house, as that would mean that there would be serious actions against them. Dream on, mr AC! robots.txt is a flaky way of security, and everyone knows it. If I wanted to find out something nasty/interesting about a certain company, I'd look at the robots.txt files to see what I could find.

      Furthermore, there are perfectly good ways to lock content away from the outside in a more rigorous way: password-protected pages, pages only accessible via VPN to the intranet, etc. All other information that is put unlocked and unencrypted on the internet can be considered open. There will be some chance that you will find it accidentally, for example.

      --
      molmod.com - computing tips from a molecular modeling
    2. Re:A world without cooperation by Anonymous Coward · · Score: 5, Insightful

      An attitude like yours is exactly why people go to court over these things. If you don't even adhere to the most basic rules, then it's easier and less costly to have you pay my lawyers and a fine instead of trying to stop robots from reading information that human users are supposed to see without difficulty. The lack of common courtesy on the net is disconcerting. The server tells you in no uncertain terms that you are not welcome, but you keep requesting "forbidden" pages. Consider an analogous situation in real life: You are walking in the park and someone asks you for a dollar. You decline, but the beggar keeps asking. You're saying that accepting your first denial as binding is "voluntary" and the beggar can keep bugging you as long as he likes. If that happened to me twice, I'd have the asshole arrested, and that's exactly what you're going to see online if people don't behave, especially when their behaviour leads to copyright violations which would have been avoided if they had followed the robot exclusion standard.

    3. Re:A world without cooperation by Anonymous Coward · · Score: 0

      WRONG!!! In this case voluntary means there is no law forcing anyone to follow the robots.txt file. Web browsers do not follow it, so I can easily view the material that is "protected" from robots. There is no implied contract for the visitor (human or automated) following the prescription of another file on the same system. It is simply a standard for friendly (or dare I say moral) use of a system.

      Copyrights and robots.txt files have nothing to do with each other. If a search engine can create a set of keywords from your publicly available information, or cache your publicly available information, without copyright infringement (I do not know how the law stands on this point), then a robots.txt file does not change the copyright status of the information.

      A robots.txt file is not the same as a door with a lock. Simply placing a file in a publicly accessible place is like leaving a very expensive HDTV in a dumpster (garbage is public domain in most localities, police don't even need a search warrant for it), with a big sign that reads, "If you want to be nice, don't take the TV in the dumpster."

    4. Re:A world without cooperation by Anonymous Coward · · Score: 0

      You are an idiot.

      A more apt analogy would be wearing a t-shirt with writing on it while holding a sign that says "do not take my picture"

      You can't sue me for taking your picture.

    5. Re:A world without cooperation by grumbel · · Score: 2, Informative
      Obeying robots.txt is "voluntary" in the same sense that obeying RFCs is voluntary. In other words, it isn't.

      How about we have a look at what the RFC-drafts (it's not even official) say about robots.txt:

      "Web site administrators must realise this method is voluntary, and is not sufficient to guarantee some robots will not visit restricted parts of the URL space."

      "It is not an official standard backed by a standards body, or owned by any commercial organisation. It is not enforced by anybody, and there no guarantee that all current and future robots will use it."

      It's really that simple: robots.txt is not a security tool, it's a guideline, nothing else. If you don't want robots to collect your data, simply don't send it to them.

      This is especially important with regard to services which mirror webpages. Doing so without the (assumed) consent of the author is a straightforward copyright violation

      It's a straightforward copyright violation, yep, but that has nothing to do with robots.txt, since having it or not doesn't make it any less a violation.

    6. Re:A world without cooperation by iminplaya · · Score: 1

      ...and the beggar can keep bugging you as long as he likes.

      As long as he doesn't physically assault you, you should have no recourse. Copyright violations do not relate to physical assault and removal of your physical property that would deny you the use of that property. Copyright violation is nothing more than the denial of a special privilege granted by the government. A privilege that has been abused for far too long.

      --
      What?
    7. Re:A world without cooperation by Anonymous Coward · · Score: 0

      Spidering isn't a copyright violation you twats, republishing it might be. Just because I ignore robots.txt means nothing. I may be gathering statistics or building a hash of content to determine if your site's updated. Is it rude...yes--but fortunately that isn't unlawful in most of the world.

      And to the parent--obeying RFCs most certainly *is* voluntary--subject to the same rules as any other social club. Or are you going to have me fined for violation of RFC 1855 (http://www.faqs.org/rfcs/rfc1855.html) and pointing out that I think you're fscking tools?

      Go to hell with your mindless legal threats. Having a beggar arrested for asking for change twice--it's people like you who give a bad name to freedom loving people.

    8. Re:A world without cooperation by sasdrtx · · Score: 1

      Wishing and hoping that everyone will obey the rules has never worked. Not ever.

      The poster isn't the one you want to take issue with. There are plenty of people out there who obey only rules that cause them pain for violating. Always have been, always will be. Most of them don't read /., and therefore won't have the benefit of your sternly-worded admonitions.

      --
      Most people don't even think inside the box.
    9. Re:A world without cooperation by Anonymous Coward · · Score: 0

      First of all, no, I would not have a beggar arrested for asking for change twice. I'd have him arrested if he continuously kept asking me for change despite a clear refusal to give him money, and only if that happened twice. There are laws against this sort of aggressive panhandling. Freedom is not freedom to harass others.

      The bindingness of the robots exclusion standard comes from it being a well known and, unlike RFC1855, mostly respected de-facto standard for declaring certain parts of a website off-limits for automated clients. Visiting these parts of the site with an automated client does not in itself break any laws, but if it results in content being redistributed, then you cannot argue that you assumed that the author, by putting that content online, agreed to have it mirrored. Robots.txt is not a security measure. Seeing it as that would be silly, because usually human visitors are supposed to be able to access these urls without authentication. But robots.txt is a clear declaration of intent and as such it has legal significance and should not be taken lightly. It is also wise to obey robots.txt because it is regularly used to mark parts of a website which are hard to swallow for automated clients, for example areas with an infinite number of urls which are created on demand.

    10. Re:A world without cooperation by Anonymous Coward · · Score: 0

      Obeying robots.txt is "voluntary" in the same sense that obeying RFCs is voluntary. In other words, it isn't.

      No, obeying robots.txt is less voluntary than obeying an RFC. Obeying RFCs isn't a legal requirement. However, obeying copyright law is a legal requirement. The only reason you can download a document on the web at all is because the owner gives (implied) consent by placing it on the web: for the sole purpose of downloading that document.

      Robots.txt removes the implied consent that allows you to copy the author's documents from his computer to yours without a copyright violation taking place.

      Republishing someone else's copyrighted work is illegal without their consent. If you "archive" someone's work by re-publishing it on the web to all comers, you violate their copyright. Just because you happen to have a copy of something doesn't give you the right to copy it: that's the entire point of copyright.

      If I hand out flyers, you can't legally photocopy them and distribute them after I choose to stop distributing them. The paper I gave you belongs to you: but the right to copy it remains mine.

      This is very, very simple stuff: the reason people can't understand it is because they assume copyright isn't supposed to be restrictive. It is restrictive: that's why it's so powerful, for good or for ill.

  16. I sense a collection of poor analogies here by Anonymous Coward · · Score: 2, Interesting

    So by the logic, if I didn't want AOL to release my search information I shouldn't be mad as it's my fault to have used them in the first place?

    You never intended to make your search results publicly available. These guys intentionally made their web page publicly available.

    Or that if I want my copyrighted information to not be republished by someone else, I should just simply not publish at all?

    That's a better point, but the question is whether the Wayback Machine "republished copyrighted material". If they instead archived material available in the public domain, it is a different matter entirely, regardless of what the creators of that material want.

    How about, if I don't want my GPL code resold by someone in a closed source product I should just know better and not put it out in the open to begin with.

    If you don't want something to be used freely, don't release it into the public, unless there are legal protections in place. If it's the GPL, people are legally forbidden from incorporating it into a (publicly released) closed source product. If it's the LGPL, people can do so. If you don't like that, don't release it publicly.

    And that if I post something stupid when I'm 9 we believe it should follow me around throughout my entire lifetime, because a 9 year old should know better.

    This is a fact of Internet life and always has been. This isn't different from other activities of 9-year-olds or anyone else in the public sphere. If you streak through a mall naked and someone snaps your picture, too bad: you can't make the photos disappear.

    1. Re:I sense a collection of poor analogies here by Anonymous Coward · · Score: 1, Funny

      If you streak through a mall naked and someone snaps your picture, too bad: you can't make the photos disappear.

      It wasn't me. It was my imaginary twin brother!!!

    2. Re:I sense a collection of poor analogies here by soft_guy · · Score: 1

      How about, if I don't want my GPL code resold by someone in a closed source product I should just know better and not put it out in the open to begin with.

      It is more similar to releasing it as public domain code, then someone puts it in a commercial product, then you change your mind and re-release it as GPL, then you sue the people who made the commercial product. And you should lose that case.

      --
      Avoid Missing Ball for High Score
  17. Retroactive robots.txt by Kelson · · Score: 5, Insightful

    I recently discovered exactly how the Wayback Machine deals with changes to robots.txt.

    First, some background. I have a weblog I've been running since 2002, switching from B2 to WordPress and changing the permalink structure twice (with appropriate HTTP redirects each time) as nicer structures became available. Unfortunately, some spiders kept hitting the old URLs over and over again, despite the fact that they forwarded with a 301 permanent redirect to the new locations. So, foolishly, I added the old links to robots.txt to get the spiders to stop.

    Flash forward to earlier this week. I've made a post on Slashdot, which reminds me of a review I did of Might and Magic IX nearly four years ago. I head to my blog, pull up the post... and to my horror, discover that it's missing half a sentence at the beginning of a paragraph and I don't remember the sense of what I originally wrote!

    My backups are too recent (ironic, that), so I hit the Wayback Machine. They only have the post going back to 2004, which is still missing the chunk of text. Then I remember that the link structure was different, so I try hitting the oldest archived copies of the main page, and I'm able to pull up the summary with a link to the original location. I click on it... and I see:

    Excluded by robots.txt (or words to that effect).

    Now this is a page that was not blocked at the time that ia_archiver spidered it, but that was later blocked. The Wayback machine retroactively blocked access to the page based on the robots.txt content. I searched through the documentation and couldn't determine whether the data had actually been removed or just blocked, so I decided to alter my site's robots.txt file, fire off a request for clarification, and see what happened.

    As it turns out, several days later, they unblocked the file, and I was able to restore the missing text.

    In summary, the Wayback Machine will block end-users from accessing anything that is in your current robots.txt file. If you remove the restriction from your robots.txt, it will re-enable access, but only if it had archived the page in the first place.
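
    The rule in that summary can be written down as a small Python sketch. This is only a model of the behaviour I observed from the outside, not the Archive's actual code; the function name, the set of archived URLs, and the example.com addresses are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

def viewable(url, archived_urls, current_robots_txt, agent="ia_archiver"):
    """Model of the observed rule: a snapshot is shown only if it was
    archived at some point AND the site's *current* robots.txt permits it."""
    if url not in archived_urls:
        # Never crawled, so lifting a restriction later cannot bring it back.
        return False
    parser = RobotFileParser()
    # Parse the robots.txt as it stands today, not as it was at crawl time.
    parser.parse(current_robots_txt.splitlines())
    return parser.can_fetch(agent, url)

if __name__ == "__main__":
    archived = {"http://example.com/old/post"}    # hypothetical snapshot list
    blocking = "User-agent: *\nDisallow: /old/\n"  # old links added to robots.txt
    print(viewable("http://example.com/old/post", archived, blocking))  # blocked
    print(viewable("http://example.com/old/post", archived, ""))        # restored
```

    The point of the sketch is that the check runs against today's robots.txt at viewing time, which is why an edit to the file can hide or restore pages that were crawled years earlier.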

    1. Re:Retroactive robots.txt by ebyrob · · Score: 1

      In summary, the Wayback Machine will block end-users from accessing anything that is in your current robots.txt file. If you remove the restriction from your robots.txt, it will re-enable access, but only if it had archived the page in the first place.

      That's pretty cool. I wish more software behaved in a manner that well thought out.

    2. Re:Retroactive robots.txt by rthille · · Score: 1

      Cool maybe, but also bad. I can gain control over content [at least to prevent access] I never originally published if I now control the domain.

      That's uncool.

      --
      Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
    3. Re:Retroactive robots.txt by Jeremy+Erwin · · Score: 1
      An acquaintance of mine was interested in basing a wargame on modern day protest movements. One of the sources he planned to use was a19, an ad hoc organization devoted to producing some sort of protest march on August 19th (of some year). They had a website called a19.org. It was no longer of any value to them, and the domain eventually found its way into the hands of a net parasite.

      You know the type:

      You searched for quantum chromodynamics. Would you like to buy flowers instead?


      and of course, robots.txt was used to block the really interesting stuff that was published on a19.org.

      The moral of the story?

      Use wget -r as your web browser.
  18. Wayback Machine essential for public domain by proxima · · Score: 3, Interesting

    Many people think of the Wayback Machine as being a tool for history and nostalgia. However, consider copyright expiration (IANAL, etc.). Many web pages have items like "Copyright 1995-2006 Blah". Some of the content was created as early as 1995. Assuming, of course, that items created in modern times eventually have their copyright expire, we will need a record of the content of these pages at that time.

    As more content moves online, the idea of publishing a work becomes blurred. Revisions years later can effectively update the copyright of the work, if the reader cannot distinguish when the content was created. So the Wayback Machine will hopefully provide that resource. The amount of potentially public-domain content there is huge.

    As a side note, it will be interesting to note when the first GPL programs (for example) lose their copyright. Of course, by then, the languages will seem more than archaic.

    --
    "The universe seems neither benign nor hostile, merely indifferent." --Carl Sagan
    1. Re:Wayback Machine essential for public domain by MindStalker · · Score: 1

      Actually, it's not without case law. If you change and then republish something, you get a new copyright on it. BUT someone can still copy the old material if they can find an old revision whose copyright has expired. Yes, even you can take Shakespeare, change a few words, and copyright your publication. :)

    2. Re:Wayback Machine essential for public domain by proxima · · Score: 1
      Actually, it's not without case law. If you change and then republish something, you get a new copyright on it. BUT someone can still copy the old material if they can find an old revision whose copyright has expired. Yes, even you can take Shakespeare, change a few words, and copyright your publication. :)

      Right, I was operating under that assumption. Therefore, it is very important that we have a record of what existed at a given point in time.

      What I don't know for certain is the answer to this hypothetical situation: A PDF or text file (or whatever) is made available on X date. X+100 years later (or whatever), the file is still available (perhaps not from the same source, but assume the file itself is dated). Is the file in the public domain, if accessed at a later date? I think it is, so long as the file is the same, bit-for-bit. Translate it into a new format, and you might have a new copyright, I'm not sure (and I'm not sure there is case law on it).

      This brings up an analogy with other types of creative works. Are photographic reproductions of old artwork copyrighted? From what I understand, this depends on the country (a U.S. ruling held that such a photograph is not copyrightable).
      --
      "The universe seems neither benign nor hostile, merely indifferent." --Carl Sagan
    3. Re:Wayback Machine essential for public domain by Anonymous Coward · · Score: 0

      Actually, you'll find that copyright on almost nothing will expire, thanks mostly to Disney. The latest round of copyright extension (1998) extended copyrights to the life of the creator + 70 years (or 95 years if it was a work of corporate authorship).

      Basically, "[u]nder this act, no additional works made in 1923 or afterwards that were still copyrighted in 1998 will enter the public domain until 2019". http://en.wikipedia.org/wiki/Sonny_Bono_Copyright_Term_Extension_Act
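
      The arithmetic behind that 2019 date can be sketched as below. This is a rough illustration only: the helper name public_domain_year is made up for this example, and it assumes a fixed term that runs through the end of its final calendar year. It ignores renewal, notice, and country-specific rules, all of which matter in practice.

```python
def public_domain_year(start_year: int, term_years: int) -> int:
    """Return the first calendar year in which a work is out of copyright.

    start_year is the publication year (for a fixed term such as 95 years)
    or the author's year of death (for a life-plus term such as 70 years).
    Assumption: the term runs through December 31 of its last year, so the
    work becomes free on January 1 of the following year.
    """
    return start_year + term_years + 1


# A work published in 1923 with a 95-year term.
print(public_domain_year(1923, 95))

# A hypothetical author who died in 2000, under life + 70 years.
print(public_domain_year(2000, 70))
```

      The first call gives 2019, which agrees with the quoted sentence; the second shows how far off the date is for anything written recently.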

  19. Isn't ignoring robots.txt unauthorised access? by datajack · · Score: 1

    First, let me get two points out of the way: 1) IANAL, 2) I wholeheartedly agree with the aims of the Wayback Machine and support that organisation. I am playing devil's advocate here.

    In the UK Computer Misuse laws, there is the concept of unauthorised access. It is an offence to access data on a computer system without authorisation.

    Typically it is assumed that access to data held on a publicly available website, without notice to the contrary, is authorised. A notice displayed stating that you should not look at the data unless you are me is sufficient to make you aware that you should not access it. Similarly, a robots.txt file is the place to explicitly define what data is unauthorised for access by automated spider systems. Anyone writing such a system can be reasonably expected to know that robots.txt contains such information and should therefore have the spider check that to see if access to the data is unauthorised. Failure to check that does not magically make the access any more legal. I would imagine that the US has similar provisions.

    The creation of a robots.txt file after the spider has collected the information will not make the previous access and data collection illegal, nor should it affect the presentation of that data. Copyright law may have an impact though.

    1. Re:Isn't ignoring robots.txt unauthorised access? by dangitman · · Score: 1
      Typically it is assumed that access to data held on a publicly available website, without notice to the contrary, is authorised. A notice displayed stating that you should not look at the data unless you are me is sufficient to make you aware that you should not access it.

      That sounds rather absurd. It's like posting a massive page of text in a busy public location, with a sticky note attached saying "do not read this text."

      I would think that in terms of computer networks, "unauthorized access" means breaking into a site that is protected by password or other security measures. The fact that your machine can reach a site and get content without any password or hacking amounts to authorisation in my opinion. If you aren't authorising public access, then why did you post it in public?

      In more inflammatory terms - how about a page that was public, but said something like "You are not authorized to read this if you are a Jew. Offenders will be prosecuted." Somehow, I don't think the courts would take a positive view of that.

      --
      ... and then they built the supercollider.
    2. Re:Isn't ignoring robots.txt unauthorised access? by Anonymous Coward · · Score: 1, Informative

      The robots exclusion standard was primarily designed to exclude robots from the parts of the server's namespace that robots can't handle, like (practically) infinite url trees or shop sites. You don't want bots to crawl a neverending swamp of dynamically generated content that points to ever more dynamically generated content. You also don't want bots to order stuff or vote for comments when they crawl the scripts (the webmonkey should have used POST, not GET, but if he chose to use robots.txt instead, you're going to at least get an angry call). There are many more reasons to exclude robots from certain url prefixes. If you're operating a robot, follow that standard, for your own good. Some servers are actively hostile if you don't follow robots.txt.
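
      For anyone writing a robot, the check described above can be sketched with Python's standard urllib.robotparser module. The rules, the example.com URLs, and the "ExampleBot" user agent below are all invented for illustration; a real crawler would fetch the site's own /robots.txt rather than use an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules of the kind described above: keep robots out of an
# endless dynamically generated tree and away from a GET-based vote script.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /calendar/
Disallow: /vote
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules allow this user agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)
for url in (
    "http://example.com/index.html",
    "http://example.com/calendar/2006/08/",
    "http://example.com/vote?id=3&dir=up",
):
    verdict = "fetch" if may_fetch(parser, "ExampleBot", url) else "skip"
    print(verdict, url)
```

      Nothing in this enforces the rules. The robot has to choose to call the check before each request, which is the sense in which the standard is voluntary.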

    3. Re:Isn't ignoring robots.txt unauthorised access? by ratboy666 · · Score: 1

      Datajack: To whom are you playing devil's advocate?

      The IA does exactly that -- it respects robots.txt. Further, it RETROACTIVELY applies robots.txt. Now, this may not work (which is what the complaint was about). And AFAIK the retroactive edit doesn't remove data, it simply doesn't allow visibility (which is one of the reasons it may not work -- if there are two separate paths to the data, and the data is there, it can still be retrieved).

      The devil's advocate argument would be that IA may be necessary to retain copyrighted works, but since everything is copyrighted, IA has no right to mirror or serve such content (not until the copyright expires, or under suitable subpoena).

      Ratboy
      YMMV

      --
      Just another "Cubible(sic) Joe" 2 17 3061
    4. Re:Isn't ignoring robots.txt unauthorised access? by datajack · · Score: 1
      That sounds rather absurd. It's like posting a massive page of text in a busy public location, with a sticky note attached saying "do not read this text."

      The UK law does not seem to make any mention of protection mechanisms, just authorisation.
      If I leave an unlocked bicycle outside a shop on a busy street while I go and buy something, it is still illegal for you to take it away.
    5. Re:Isn't ignoring robots.txt unauthorised access? by datajack · · Score: 1
      The robots exclusion standard was primarily designed to exclude robots from the parts of the server's namespace that robots can't handle
      Agreed, but it is still notification that bots are not authorised to access those pages. Therefore it likely has an effect on the legality of the situation.
    6. Re:Isn't ignoring robots.txt unauthorised access? by dangitman · · Score: 1

      But it's not illegal to read a poster someone has placed in a public place. Very different to stealing property.

      --
      ... and then they built the supercollider.
  20. Does anyone here know what copyright is?! by Anonymous+Brave+Guy · · Score: 2, Insightful

    Pretty much every time we have a discussion about the legality of web/Usenet archive sites, the only argument with any legal weight that's given for what would otherwise be a clear infringement of copyright is that the rightsholder is implicitly consenting to certain uses by making the material available on that medium. The degree to which this holds in general is debatable, and AFAIK has never been tested in any major court case in any jurisdiction. However, even if robots.txt is voluntary, it's a clear statement of intent. There is no way you can claim implicit permission to copy the material when the supplier explicitly indicated, using a recognised mechanism, that they did not want it copied.

    That makes comments like this one by Doc Ruby and this one by saskboy seem a little presumptuous, IMNSHO.

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  21. Info published on the Internet... by msauve · · Score: 2, Insightful

    shouldn't be copyrightable - there is nowhere more "public domain" than the Internet. Same with radio/TV - anyone who makes use of the public airwaves should sacrifice any claim to copyright for that privilege. If someone wants to control their works through copyright, they should use controlled, private distribution.

    I'll no doubt have lawyers (and lawyer wannabes) protesting - but that only follows the literal and common sense meaning of "public domain," instead of the legal rationalization which has been brought about by those who want to have their cake, and eat it too.

    --
    "National Security is the chief cause of national insecurity." - Celine's First Law
    1. Re:Info published on the Internet... by Khuffie · · Score: 1

      Erm, if I post something on my website (which I bought the domain for and paid for hosting), it is not a public space, since I paid for it. Stuff on www.whitehouse.gov, on the other hand, would be, since tax payer money paid for it.

    2. Re:Info published on the Internet... by Lactoso · · Score: 3, Insightful

      And just what does that check to your hosting company pay for aside from the physical location and maintenance of the webserver? Propagation of your website's IP address to DNS and bandwidth. And what do you need bandwidth for if not to share your web pages with the internet at large...

    3. Re:Info published on the Internet... by phulegart · · Score: 5, Informative

      so if my content is behind a protected "members area" then it is still public domain and should be freely available? If I am a photographer, and my site clearly states that all images are copyright of a certain date and that use of them without my permission is forbidden, that means nothing? If someone uses images of me without my permission, that they got from a website or protected members area, how is it that I can get them removed by complaining? If they are public domain, then it should be my tough luck, right?

      If I post your credit card and bank information on a forum site, does that mean it is now public domain and you have no protection?

      If I post on a forum site that I am selling stolen credit card info and bank info, my post should not be touched, because it is public domain and it should be freely available?

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    4. Re:Info published on the Internet... by phulegart · · Score: 2, Insightful

      here's a little story... it deals with archiving and the like.

      My friend's hosting service got hacked. We caught it right away, before a site had been put into place, but the individuals attempted to put up the site http://paypal-protect.org./ We shut them down quick. They went on to hack another hoster, and currently have their little phishing site up and running. I suggest you go to the site, and without using ANY real information, login with a bogus email and password, and check it out. If you take a look at the WHOIS entry for paypal-protect.org, you will see a name and address of an actual individual. We called this guy and told him that it was likely his name, info and credit card were used illegally to register the domain.

      The important thing to notice is the EMAIL contact in the WHOIS entry. Go ahead, and do a Google search for that email address. You will turn up two forum posts this guy made, where he is selling credit card info, bank info, CVV2 numbers and more. Now, the first result in your Google search is a post at paypalsucks.com. You would not BELIEVE what it took to get the admin there to remove the post. And his policy wasn't to remove posts normally, but to just move them to a "garbage" thread, which would still be publicly available. The second and third results in your Google search were a post left on a free board that was created at anyboard.net. I was able to get that board taken down within 12 hours of notifying the host, netbula. The board was being used by lots of CC resellers for at least 5 years before I got it shut down. How do I know? Three of those years are archived at archive.org.

      However... EACH OF THOSE POSTS is still there in the google cache. Go ahead and see. Why is this important? Because all you need to see, if you are in the market to buy stolen Identities and credit cards, is the contact information. It does not matter if it is in an archive, or if it is in an active forum. Archiving it has made it virtually impossible to remove from the net, because now there is no way of knowing exactly who has archived this information.

      Now, I've not provided clickable links for a reason. I've provided enough information here, that if you want to check my facts, you can do so.

      A library might be public domain, but the books within are not. There are some books that ARE considered public domain, but that does not mean that EVERY book is public domain.

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    5. Re:Info published on the Internet... by iminplaya · · Score: 3, Interesting

      If I post your credit card and bank information on a forum site, does that mean it is now public domain and you have no protection?

      If anything bad comes from it, it only means that the banks employ weak security. That information by itself should mean nothing. Complain to the financial institutions, not the person who posts it. Make it the bank's problem and it will go away. Don't use their services until they make it secure without making it unduly inconvenient for the customer. The silly passwords and 20 minute waits for failed logins do nothing for security. Make financial security the institution's responsibility instead of suppressing the flow of information. And furthermore, you know what you can do with your copyrights. If you don't want people to use your photos keep them to yourself. If you don't want your information divulged, then don't reveal it to anybody.

      --
      What?
    6. Re:Info published on the Internet... by phulegart · · Score: 2, Insightful

      what you are saying, is that the person who puts the information on the internet, is the one who decides if it is public domain. As opposed to the person to whom the information belongs.

      You know the current standard the US follows, for copyright of printed works, is LIFE+70 years? That means that once the author copyrights their work, the copyright is good for 70 years after they die. Only after the copyright expires and is not renewed does the work become public domain.
      http://onlinebooks.library.upenn.edu/okbooks.html
      There are some specific exceptions based on when the work was copyrighted, when the work was published, in what country it was published, whether or not the copyright notice was properly added to the work, and more.

      To continue the library analogy I started earlier, the internet is a library. Websites are the books. Each must be treated as an individual entity. If someone steals your identity through a phishing scam, and uses that info immediately, then sure you might be able to get out of liability by appealing to your bank. Does that mean that phishers should be allowed to run their scams freely and uncontested, because they can just post your info and declare it public domain, which would then in turn give them license to use that info however they wanted?

      What if YOU didn't put those photos on the internet? What if your ex-girlfriend stole them by using your spare key when you were at work? Sorry charlie, they are on the net now and are public domain? I don't think so.

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    7. Re:Info published on the Internet... by iminplaya · · Score: 1

      Phishing is only an issue due to the ineffective (and, I believe, intentionally so) authentication employed by the financial institutions. And now we use that scam to suppress the flow. Make no mistake, it's the feeble security methods that make phishing so profitable, not the exposure of your information. If you reveal the info to anybody, it's no longer exclusively yours. Make the trustholders trustworthy. Weak, selectively enforced laws will never cut the mustard, as we are witnessing today.

      --
      What?
    8. Re:Info published on the Internet... by Anonymous Coward · · Score: 0
      Are you really this dense or do you have some sekrit, soon to be revealed point you're trying to make? If your content is not 'published' to the masses (in a passworded user area, intended to be private), then no. If you have a separate and evident copyright in effect over your content, then it is obviously governed by whatever copyright that is.

      Your other examples are obviously illegal acts and not covered by fair-use or copyright laws.

    9. Re:Info published on the Internet... by phulegart · · Score: 2, Informative

      Phishers do not deal with security. Phishers deal with unsuspecting and uneducated internet users. I'm sorry you are so scared to do it, but really.. go ahead and visit http://paypal-protect.org./ It is a phishing site that we are attempting to take down. Go ahead and login with a bogus email and garbage password. It doesn't check anything beforehand. It simply takes you into a site that, aside from the URL, does look like Paypal. You are then asked to provide everything. Name, address, social security, even the PIN for your credit card. It won't even allow you to proceed without your PIN. Then, after you submit your information (which is then sent to whoever is running the scam), you are redirected to the actual paypal site.

      Now, if a poor sap fell for it, anything that sap could have done online that involved money, the phisher can do.

      You want to try to make the distinction about "If you reveal your info".. well, what if I worked at the gas station you frequent, and I copied your credit card info and CVV2 number from the back, when you made a purchase? OOPS, it was YOUR fault for actually buying something. According to you, the only way to be safe is to isolate yourself from the world, and make everything you need from scratch. No one should be responsible for protecting your interests.

      If I grabbed your info from your trash, it's your fault, right? because you didn't incinerate your trash, right?

      You are wrong, in that everything posted on the internet is public domain. That is an assumption you are attempting to back up with obfuscation. What is posted on the internet is no different than what is on the shelf in a library, what is on TV, and what is on the radio. You have the right to enjoy it. You do not have the right to rebroadcast it without permission.

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    10. Re:Info published on the Internet... by Kelson · · Score: 1
      If someone wants to control their works through copyright, they should use controlled, private distribution.

      But isn't the purpose of copyright to extend legal protection beyond "controlled, private distribution"?

      After all, photocopiers, VCRs, audio tape recorders, CD/DVD writers -- heck, the printing press -- mean that distribution is no longer controlled or private, unless you restrict access to people who can use them. (Or you try to make it technically difficult via DRM, but that's only a temporary hurdle.) Every technological advance in information distribution has put that distribution into the hands of more people, loosening the distribution control. That's the "wants to be free" half of the slogan that often gets tossed around without the second half, that information "also wants to be expensive."

      Copyright, at its core, says, "OK, it's technically easy for you to copy this and redistribute it, but we're going to declare it illegal for you to do so without permission except under very specific circumstances."

      The medium -- broadcast, network, paper or plastic -- is irrelevant. Unless you think that the Internet is somehow the ultimate expression of free information, and distribution technology can't get any more "free" than this. 'Cause, you know, everything worth inventing already has been, and 640K should be enough for anyone, especially with that world market for 3, maybe 5 computers.

    11. Re:Info published on the Internet... by Anonymous Coward · · Score: 0
      ... the legal rationalization which has been brought about by those who want to have their cake, and eat it too.

      Damned right -- this is just more of the grasping bastard IP culture which wants full control of all content "from my server to your eyeball".

      Mainly it allows them to cover their asses when they have to correct the content.

      Some years back, some vast media outfit went after Akamai (or whatever the name of the caching server outfit was). They were claiming that if something like the WSJ or some such corrected an article, viewers might see the uncorrected, original, cached version and the poor, little, defenseless corporation would be seen as being in error.

      That probably explains that little fucker who used to follow me around all day. He'd pencil in corrections on my paper copy so I wouldn't have to wait for tomorrow's edition to figure out where they'd screwed up.

    12. Re:Info published on the Internet... by 1u3hr · · Score: 1
      However... EACH OF THOSE POSTS is still there in the google cache. Go ahead and see. Why is this important? Because all you need to see, if you are in the market to buy stolen Identities and credit cards, is the contact information.

      I don't see what your point is. Surely if these guys are engaged in criminal activity as you suggest, and have contact information, they should be investigated and arrested. The FBI (etc) should take over the contacts and shut them down, or use them to entrap other thieves. It should not be up to you to root them out from the Internet. Giving vigilantes the ability to erase all records of an event is a short, sharp and very slippery slope leading to the "Ministry of Truth". For a preview, see google.cn.

    13. Re:Info published on the Internet... by msauve · · Score: 1

      After all, photocopiers, VCRs, audio tape recorders, CD/DVD writers -- heck, the printing press -- mean that distribution is no longer controlled or private,

      Those are examples of private distribution. By "controlled private distribution," I do not mean avoiding distribution to the public through regular sales channels, where there exists a definite relationship between buyer and seller. When you buy a CD, that is a private transaction between you and the seller. It is controlled (you get the CD after you've paid for it, and are agreeing that your use is bound by copyright).

      However, when you publish a web page, it is open without restriction to anyone on the Internet to read. That is placing information into the public domain (literally, practically, but maybe not legally).

      To another commenter: if you use some method to stop open access to some content on your web site, such as password protection, I would not consider that content to be "on the Internet" for purposes of this discussion, but merely accessible through it. I'm addressing content which is literally in the domain of the public - available for all to access without limitation.

      If someone posts a handbill on a telephone pole, it is in the public domain, and a photographer should not have to be concerned with a picture he takes infringing some copyright. Same with archiving the web.

      --
      "National Security is the chief cause of national insecurity." - Celine's First Law
    14. Re:Info published on the Internet... by phulegart · · Score: 1

      If you do not already know exactly what to search for in these archives, you aren't going to find the evidence to lock anyone up. I only know about THIS particular individual because of the tracks he left behind after hacking my friend's host.

      Now, you must have missed.. what.. HALF of my post that you replied to? The jerk in question posted in two forums, attempting to sell stolen credit cards. The free forum that was hosted by anyboard.net was up and running FOR 5 YEARS! Archive.org has got 2002, 2003, and 2004 in their archives. Over that time, it has been used by countless individuals to sell credit cards, number dumps, blank cards, stolen personal information, Egold trojans and more. The board only got yanked from the web after I found it in my search for this hacker.

      5 years of illegal activity. SURE these people should be investigated and arrested. Officials and people who actually gave a shit didn't know where to look. But anyone who went to this board to get themselves a few THOUSAND credit card accounts, would be able to find it in the archives.

      And don't start thinking that those contact emails and Instant Messenger accounts can't possibly be any good after all this time in the archives. Yahoo has been notified and notified and notified to shut down the email address of at least one hacker. Yahoo does nothing. Not surprisingly, a lot of these resellers use Yahoo.

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    15. Re:Info published on the Internet... by 1u3hr · · Score: 1
      Now, you must have missed.. what.. HALF of my post that you replied to? The jerk in question...

      No, I read it all. But the point is you want to be judge, jury and executioner. Let the police deal with him. Don't complain about being obstructed from going vigilante. Sure, you may be righteous, but not everyone is.

      Besides, who is actually going to search in an archive of an old forum when they want to find a reliable criminal to deal with? There are plenty of live forums where you can do this almost openly. You seem to have a personal vendetta going.

    16. Re:Info published on the Internet... by phulegart · · Score: 1

      You are correct. I've been doing what I can to shut down phishers for a few years now. At first, like you and most people, I did nothing. The old "Let someone else handle it" attitude. Then, I did nothing. The old "The authorities will get them" attitude. Guess what? Nothing got done.

      Then, I started observing the life span of some of the phishing sites I was directed to. The entire time I was observing, they continued to live and apparently thrive. I thought... Why has no one done anything? I thought... Where are the authorities?

      My next step was to bring the phishing site to the attention of the major organizations that were being targeted. Companies like Paypal (there is a policy for reporting phishing sites at paypal), and Yahoo, ICQ, and others. Still nothing was done, and these sites stayed up.

      So, that's when I fired off my first email to a host. BAM! the phishing site was removed within an HOUR of my email.

      Now, maybe you are just too fucking apathetic for humanity's sake, but I'm not. You want to protect these phishers and scammers with your inactivity, and your ENCOURAGEMENT of others' inactivity, you go right ahead and continue to be a waste of space. You are the kind of person who will witness a pedestrian get hit by a car and ignore it, because you are sure that someone else will take care of it.

      You are right that there are plenty of live forums that criminals use to pursue their illegal activities. I found one that had been in operation for YEARS until I stepped up and did something. That single example justifies my actions. In 5 years, no one did a thing to try to stop what was going on at that site. Not you, not your friends, not the "authorities".

      I can't fucking believe you are actually giving me shit because I did something good, for people I don't even know. What the fuck is wrong with you? Who pisses in your cornflakes every morning?

      I don't want to be judge jury and executioner. I don't just run around hacking hosts to pull down these sites. I do everything I can to follow all the appropriate protocols. First, I read the TOS/AUP of a host, not only to see their procedure for reporting abuse, but also to see if their TOS even covers the activity. Then I bring it to their attention. That is all it takes... in every case other than dealing with Yahoo. I don't judge their activity to be illegal or wrong. Someone else already did that. I don't listen to the phisher's point of view as to why they should be allowed to phish. We already know it's wrong. I'm not judge, jury, or executioner. I'm just no different than someone who watched an episode of America's Most Wanted, and called the hotline because I know where someone featured on the show is.

      And I'm appalled that you can't tell the fucking difference. Go crawl back into your apathetic little hole, and keep praying to your genitals that "the powers that be" will make the world a rainbow bright my-little-pony place for you, while you do nothing.

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    17. Re:Info published on the Internet... by iminplaya · · Score: 1

      Phishers do not deal with security...OOPS, it was YOUR fault for actually buying something....

      Did I say anything remotely like that? You're not even making the feeblest effort to understand what I'm saying. I said security should be the institution's problem. They are the weak link. They are the ones who fail to verify the veracity of the info, much less where it's coming from. They simply drop a bolt into the hasp where a padlock is needed. They are the ones who make phishing so easy and profitable. The phishers are simply exploiting that. And remember, with fraud, it takes two.

      If I grabbed your info from your trash, it's your fault, right?

      Fascinating twist on the argument. Once I discard my trash, it's out of my control. You can't seem to grasp that info by itself shouldn't have that kind of power. Whatever I leave in my trash should be of no consequence to me. It is a weak system that allows the damage, not the access to info.

      You do not have the right to rebroadcast it without permission.

      Permission is not yours to give. Once you "discard" your info to the net, it is indeed in the public domain. Copyright steals(ok borrows) from that domain. Copyright violations take absolutely, positively nothing but the privilege of exclusivity, granted by powers that will take it away should there ever be a worldwide epiphany. The hair on your head is only yours to control while it remains attached. The second it falls out, it's ours for the taking. It's quite clear that you can't let go of this 17th century thinking that has no place in the 21st century world. You refuse to understand that it's time for a complete re-evaluation of how we interact with each other, and you all say nothing that can justify maintaining doing things the same way we always have. It's a form of laziness that is keeping us in technological dark ages. It's why we still move around in kerosene burning jalopies instead of self powered maglevs or something similar. It disallows the ability to build on the works of others. The wheel has to be constantly re-invented or reverse engineered. It's why our computers remain such kludges and run so slowly.

      So I will leave you to ponder the words of a great genius:

      "Bow tie daddy dontcha blow your top
      Everything's under control
      Bow tie daddy dontcha blow your top
      'Cause you think you're gettin' too old
      Don't try to do no thinkin'
      Just go on with your drinkin'
      Just have your fun, you old son of a gun
      Then drive home in your Lincoln"

      --
      What?
    18. Re:Info published on the Internet... by phulegart · · Score: 1
      ok then

      Phishing is only an issue due to ineffective (and I believe intentionally) authentication employed by the financial institutions. And now we use that scam to suppress the flow. Make no mistake, it's the feeble security methods that make phishing so profitable, not the exposure of your information.

      Ok, that is exactly what you said. Let's say that security has been beefed up so that in order to make a purchase, a fingerprint scan is required to prove the money you are spending is yours and does not belong to someone else. All a phisher has to do is to trick you into using HIS fingerprint scanner. Guess what? He just stole your identity. Like I said. Phishers rely on the ignorant and uneducated. It doesn't matter what security measures institutions use. If you buy something, you expose yourself to being scammed. Sometimes, even if you use cash. So since any security measures that institutions can implement, can be defeated by a phisher that is counting on their victim not knowing any better, it is not the fault of the institutions.

      If I have EVERYTHING that I need to pretend to be YOU, and I go to a branch of your bank where none of the tellers have ever physically seen you... you are blaming the bank if I get your money? You do realize that with the right info, I can get your birth certificate. Once I have that, I can get a valid state ID with my photo and your info. The three physical identifiers on that birth certificate are Sex, Eye color, and Hair color. Now, if I don't match in all three, yeah, pretending to be you might be hard. But, people do get sex changes, wear colored contacts, and dye their hair. Ok, so that is pushing it, but not that far.

      Info discarded in the trash and out of your control DOES have that power. Authorities use the tactic of looking for evidence in trash all the time. Some Identity thieves also search trash. Private Investigators do too. Whether or not your personal info SHOULD have that power is irrelevant. It currently does. And because people are people, and NO SECURITY MEASURE IS UNDEFEATABLE, personal info will always have power.

      Your entire argument is flawed, in that you view the internet as a huge trash bin. You do.
      Permission is not yours to give. Once you "discard" your info to the net, it is indeed in the public domain.

      It isn't. Books are not "discarded" into a library. Books do not become public domain once they are included in a library. You are given permission to visit a web site, and enjoy it. You do not have the right to take what is there, and do whatever you want with it. Just because I have the ability to copy everything about a website (content, design, etc.) and slap a new domain name on it and call it my own, does not mean that it is right, or legal.

      You might see Copyright as toothless. That's your opinion and your problem. But there has to be some kind of protection system in place that allows me to create something, and keep someone else with more resources from just taking it away from me and deciding that it is ours. We aren't flying around in anti-grav sleds because large unscrupulous companies are keeping us in an automotive dark age... by abusing a good system. Copyright needs MORE strength to protect the individuals with good ideas and no resources.

      The hair on my head is mine until I say it isn't. Not until YOU say it isn't.
      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    19. Re:Info published on the Internet... by soft_guy · · Score: 1

      anyone who makes use of the public airwaves should sacrifice any claim to copyright for that privilege. If someone wants to control their works through copyright, they should use controlled, private distribution.

      I think that is too restrictive. You should automatically give up your copyright if you show your work to the public by any means.

      --
      Avoid Missing Ball for High Score
    20. Re:Info published on the Internet... by 1u3hr · · Score: 1
      Now, maybe you are just too fucking apathetic...And I'm appalled that you can't tell the fucking difference. GO crawl back into your apathetic little hole, and keep praying to your genitals that "the powers that be" will make the world a rainbow bright my-little-pony place for you, while you do nothing.

      Get your shotgun and blow them away. Let God sort them out. Hoo-rah!

    21. Re:Info published on the Internet... by lerxstz · · Score: 1

      Just because something is on the internet does NOT mean the person is waiving their right to copyright. Putting something on the internet is a *publishing* (and distribution) method. It has *nothing* to do with defining ownership. By your reasoning, all book authors should sacrifice any claim to copyright as well since they are putting their books in a public place; a bookstore. And all musicians should sacrifice their claim to copyright since their music is sold via a public venue. So just exactly how are copyright holders able to sell their works if they don't make the information available to the public? Just exactly how would you define "controlled, private distribution" if you dispute that distributing via the internet (for example, a private transaction between an author's website and a buyer) is not a private transaction?

      --
      I chose to end my comments, not with a rim shot, but a long decaying F#7sus4
    22. Re:Info published on the Internet... by phulegart · · Score: 1
      you still don't get it.

      People with your attitude are the reason why the forum I took down, was in operation for 5 years. All it took was a single email. One little email. You have put more effort into posting here, than it took to take down that forum. But you would rather direct your efforts into getting in the way and trying to stop OTHERS from doing what is right, than actually doing what is right yourself. People with your attitude are the reason why problems like this flourish.

      Get your shotgun and blow them away. Let God sort them out. Hoo-rah!

      That's your solution, not mine. I don't own a shotgun, nor will I ever. But when you need help, and no one gives a shit, you'll understand. You won't remember that I said it, but you will get it then. Not the help, of course. You are too efficient in spreading the apathy.

      Acting without care is reckless and irresponsible. Whether it is archiving material without care and regard, or it is zooming by that broken down car on the highway in need of a good samaritan.
      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    23. Re:Info published on the Internet... by 1u3hr · · Score: 1
      you still don't get it.

      I get it when you call me obscene names.

      That's the end of taking you seriously.

    24. Re:Info published on the Internet... by phulegart · · Score: 1

      I hope to god no one takes you seriously. That could be dangerous and even deadly. You want to act like an asshole who doesn't give a shit, you should expect to be called an asshole who doesn't give a shit.

      I find your attempt to label me a vigilante to be offensive and obscene, especially when it was clear that I was not. That was when you lost ANY respect that I might have put in responses to you.

      But by implying that I should just mind my own business and let it all be, well that is also obscene. Especially when the evidence is right in front of you that expecting someone else to solve the problem, is what allowed the problem to continue unchecked for so long.

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    25. Re:Info published on the Internet... by iminplaya · · Score: 1

      ...you are blaming the bank if I get your money?

      Yes, I am, and I always will. They failed to take proper precautions. Like I said, we must use new, as yet invented methods of verification. We can speed that up if we take charge and demand it. We are not doing that. So we continue on with 17th century ways because it's more convenient...and, as it turns out, more profitable. Take the profit away from lax security and it will become a little bit better.

      You might see Copyright as toothless.

      I never said it was toothless. It has shown to be anything but. And individual liberties(considered an obscenity in "today's post 9/11" hysteria) are suffering because of it. It is well protected...by atomic bombs if desired. I'm saying it's evil. Evidence of this is quite plentiful. People with more resources than you can quite easily take away a copyright. FM radio is one of the more well known examples.

      The hair on my head is mine until I say it isn't. Not until YOU say it isn't.

      Not if it falls on my floor. Unless you pick it up before I do. I can do what I want with it, including making a clone of you. And it is the state's/corporation's responsibility to make sure it's a clone, not yours, not mine. I will not permit you, or anybody else to lay that on me. So look out, as OJ would say. The clones could already be out there. And try to make a tiny effort to think "out of the box", to coin an old cliche. You just might see that suppression isn't necessary. It's merely of convenience for the lazy.

      --
      What?
    26. Re:Info published on the Internet... by phulegart · · Score: 1
      I find that funny.

      Like I said, we must use new, as yet invented methods of verification.
      Nothing like ignoring the issue today, and thumbing your nose at everyone who won't agree with your view of the future. The problems are here and now... it doesn't matter if they are going to get worse in the future. We need a solution now.

      "Sorry sir. I know you have cancer. Sorry, we're not going to do anything for you, because we haven't found a cure yet. So go away, and we will let you know when a cure is available that will work without fail."

      I notice you criticize, but provide nothing constructive. I get the impression that you have no clue whatsoever as to what these future security measures might be. How about a way to be able to implement these future security measures so that they encompass everyone, OOTB? Because those 17th century measures you loathe so much will still be in place while your new 22nd century ones are being applied. Of course, as you like to point out human nature, what we would REALLY see is that once these newfangled 22nd century security measures are put in place, they will be available to those who can afford them. Those who can't will be stuck at first with the old 17th century model... then they will be locked out completely.

      I can easily say that we need this undeveloped and never thought of thing, and we need that never before thought of concept, and that out of the box idea. We need a lot of things. But for all your calling for people to start thinking of new ways of doing things, we need people with their feet firmly planted on the ground attacking the current issues. Because while you spend your time wrapped up in theoretical models, there are hundreds of people who have to live in the mundane to support you. Yes. The store employees, the restaurant workers, and everyone else who cannot afford to be a dreamer living in the future. They are too busy living in the present.

      If we don't pay enough attention to our problems NOW, there won't be a future to develop security procedures for.
      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    27. Re:Info published on the Internet... by 1u3hr · · Score: 1
      But by implying that I should just mind my own business and let it all be,

      If you have evidence put it before the authorities. Pressure them to take action. That's what they're for. Otherwise, you are being a vigilante. But if you act with them as you have here, I'm not surprised no one is interested in your ranting.

    28. Re:Info published on the Internet... by phulegart · · Score: 1

      Nope. No ranting. Just information.

      You know, I'd like you to quote me where I called you an obscene name... as opposed to my using words that you just did not like.

      However, here's a phishing site I posted about at 8:50 this morning, EST.
      http://yahoo-security-dept.5u.com/
      I posted about it here, because I wanted to show how Yahoo has taken no action in months, about the site that directs people there. Now, because I wanted to actually show the site, and because I wanted to give the "authorities" the time to do something, I have done nothing else.

      Guess what? it's been over 10 hours. How many people have lost their Yahoo accounts in that time? You don't think that's a big deal? Some people use Yahoo legitimately, and even have it as their official contact email for things like their domains, their hosting accounts, and more. They lose their Yahoo account, they lose quite a bit more. And phishers have a new domain and hosting space to work from.

      Time is of the essence when a phishing site is spotted. Every minute can mean more accounts and identities are stolen. How many people are now screwed because I did not notify the host about the site? My local FBI office knows about it though.

      Why do you feel that it is not your duty to report a crime when you spot it?

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    29. Re:Info published on the Internet... by iminplaya · · Score: 1

      We need a solution now.

      Oh my, where have I heard that before? We must act now! Doesn't matter if we do the wrong thing and make the problem worse. We must do something! There should be a law... It turns out we have a solution now. Just watch what would happen if everyone took their money out of the banks, or they quit using Pay Pal, or their credit cards, automobiles, etc. You would see virtually instant results. But no, it's "give me convenience or give me death". The addiction is so overwhelming that a suitable solution will never happen, not while the present situation is so profitable. But your exquisite tunnel vision of the here and now completely blinds you to future consequences and you are bound to repeat the mistakes of the past as has been happening throughout history. And we continue to feed the authoritarian monster. Your wish to scratch a minor itch has turned it into an infected, festering, cancerous open wound. I guarantee you that inaction is better than the wrong action. But with life being so short you will never see the results. Your grandkids will. Your responses indicate to me that you're more concerned with protecting the interests of business and the authorities that protect it, with some kind of weird idea that if the corporations fail, society will crumble. Essential freedoms of the individual just don't seem to matter to you as long as the electrodes aren't attached to your nuts. Well, sometimes you just might have to "reformat". Erase everything and start with a clean slate. I bet it wouldn't have to even be that painful. But you all are too frightened of uncertainty. Well guess what? Food actually does grow on trees. Without our help believe it or not. Water falls freely from the skies, and we are capable of transporting it anywhere we want. We don't need Pay Pal or any other "pal" to survive. We certainly don't need copyright to live well. These things exist for the benefit of a select few, and for the convenience of many. So I could give a damn what happens to them in the effort to make our lives easier. All these rules you want to impose upon us are for their benefit, not ours. And as the hypocrisy of our great leaders becomes ever more obvious (which is the real reason for this mad desire to control the net), you will find that the future is indeed very bleak. Respect for law, and for people in general for that matter, is reaching a new low and getting lower every day. The anger is rising all over, and there will be trouble. You can bet on it. Your desire for the iron fist will only make things worse.

      ...we need people with their feet firmly planted on the ground attacking the current issues.

      Yes, we need the people who provide nothing but war and destruction to give us what we want. That's what the people you support have given us so far....for thousands of years that's what they have given us. It's time to throw those people out into the ocean to drown. Now is the time for something completely different. Or we will be the ones that are drowning. The thought that doing what we have been doing for millennia will bring different results this time around is indeed ludicrous. It's no different from the hangover victim screaming, "I promise, Lord, I'll never have another drink. And this time I mean it". You just can't admit failure.

      --
      What?
    30. Re:Info published on the Internet... by phulegart · · Score: 1

      As far as that first paragraph is concerned... what are you doing on the internet then? Why are you not living on a piece of land where you eat what you grow, do not use anything that requires electricity, and everything else you are involved with was made with your own two hands? Why are you not living the life you say can be lived now? Of course, you only have your own two feet for transportation, although I suppose you could build a wagon and use a horse. Viking longships made it here across the Atlantic (with stops) so I suppose that you could get a few people together if you wanted to visit Europe. For someone who wants to think of securities that are better suited for the future, you seem to want to live a life that died out 2000 years or more ago.

      I would embrace a world where there was no need for money, police, or governments. Unfortunately, nature proves that in such an environment, only the strong survive. The weak are culled from the herd. That would be great as far as overpopulation is concerned. However, what it would mean is that those who were willing and able to use force to get what they want, would do so. Society would quickly turn into collections of dictatorships where the strongest and his cronies ruled over peasants who grew the food and built the buildings. At best, we would be looking at serfdom for the masses, working plots of land that "belonged" to the local lord, who in this case would be self-appointed. Essentially the same as the dictatorship, but the lord would be a nicer guy (I guess).

      I would love a Star Trek type world where we worked as one, globally, without the petty concerns of corporations and individual profit. I hope we reach that. It will never happen instantaneously. There will be revolutions, and dark times between now and then. There are too many people now that use force to take what they want without earning it to just switch over to a system like the one you describe. Yes, it is currently an authoritarian type of society we live in. Do you know why? Because someone will always attempt to place themselves in charge. Someone will always try to place themselves above others. History teaches us that. If we forget or ignore history, we are doomed to repeat it.

      Yes, I want convenience. So do you. You just want different convenient things than I do. So I ask you. Until new and better security measures are in place (not the ones you imagine, because you haven't imagined any specific ones yet), how do we handle the security situation now? Do we ignore it?

      How do we get EVERYONE to climb aboard the bandwagon you wish the world to become? We can't agree on a religion. We can't agree on a language. We can't agree on the money that I would like to abolish. We can't agree on basic rules of conduct; basic human rights.

      I'd like to see all weapons destroyed. How would that be accomplished? That would have to be part of the Utopia. They couldn't be destroyed anyway, until everyone agreed with the program. Because if we established this Utopia here in the USA, you can bank on the fact that another country would take advantage of the US now being defenceless, and commence with the slaughter. It doesn't matter if we cried out "We've changed! We aren't capitalists anymore!"

      It would be great if everyone took their money out of paypal and banks, and destroyed their credit cards. That won't happen. You can wish and wish till you are blue in the face, but everyone would not do it. See paragraph about how we can't agree on anything. It is unfortunate that you do not understand a lot of what I have said. I am not really concerned with protecting the interests of corporations. I am interested in protecting those people who are out here on the internet, but just do not KNOW enough to properly protect themselves. Unscrupulous individuals are looking to prey on them. I want to do what I can to reduce that as much as I can. You say it is better to do nothing, than to do the wrong thing. That is because you want socie

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    31. Re:Info published on the Internet... by 1u3hr · · Score: 1
      Nope. No ranting. Just information.

      "I can't fucking believe you are actually giving me shit because I did something good, for people I don't even know. What the fuck is wrong with you? Who pisses in your cornflakes every morning?"

    32. Re:Info published on the Internet... by iminplaya · · Score: 1

      However, what it would mean is that those who were willing and able to use force to get what they want, would do so.

      You don't think we have that now? Life inside the empire might be nice for everybody, but in general that's precisely how the world still works today. Strong nations rob the weak ones. Nothing has changed. Go visit some of those weak places and you will see a true wayback machine, going back thousands of years.

      At best, we would be looking at serfdom for the masses, working plots of land that "belonged" to the local lord...

      Like the corporate farms fed by Monsanto that are running same family farms into the ground. You are saying nothing new. We just happen to be on the right end of the big stick. We still live in a world of might makes right. As long as we are one of the mighty, all is well.

      If we forget or ignore history, we are doomed to repeat it.

      What do you mean "if"? We are doing just that.

      I am interested in protecting those people who are out here on the internet, but just do not KNOW enough to properly protect themselves.

      Then educate and advise them. After that let them live or die by the choices they make. If they refuse to follow your advice, then let them rot. They have been warned. What we're talking about here after they do receive the advice is a grand version of battered wife syndrome.

      Before we had any of the conveniences you loathe...

      I don't loathe convenience. I do question how it comes about. If you have to steal to get those conveniences then something is horribly wrong.

      I'm being pushed out the door. To be continued...

      --
      What?
    33. Re:Info published on the Internet... by iminplaya · · Score: 1

      Part deux...or more likely a lot of repetition

      I find that I always end up saying things that have been said thousands of times, so I'll basically leave it alone. Suffice it to say I've already repeated what you said in your post about might makes right, blah, blah, blah. And as it turns out, since we are just as natural as nature itself, we might not have a choice as to how to act. We do repeat history roughly every or every other generation. That is easy to see. But I do insist that we do have a solution. It's that we shouldn't blindly accept what we are being told as we have done throughout all of time. We must question authority if we are to progress. The net, with all its copyright violations, is helping to make that a little easier. Yes, we do have something that we never had, and that's the instant communications and virtually instant travel where you can be dropped anywhere on the planet in less than 24 hours, but for some reason it still takes months to get permission. The regulations we create are little more than an attempt to make it as slow as the pony express. And unfortunately most everybody accepts it without question. I will always believe that it's the lawbreakers who brought us the freedoms we enjoy today. And many of today's lawbreakers will bring us greater freedoms tomorrow. And now I'm off to watch the kindergartner's version of quantum mechanics "What the bleep do you know?" so I can learn which end of the screwdriver I should hit. Thanks for the good discussion.

      --
      What?
  22. I disagree by Anonymous Coward · · Score: 0

    "Publishing" something online is hardly a reason for that information to stay online and be available indefinitely. After all, the latest AOL fiasco has just shown us that not all information should be available in perpetuity. With technology getting ever simpler it becomes trivial to expose online documents that were never meant to be seen by others. Claiming that any exposure is grounds for the information to be available to anyone in perpetuity is clearly wrong.

    For a simple example, say your personal diary with your private thoughts and writings somehow falls out of your bag and ends up on the street. It is available for anyone to read. Would you agree to have its content published and disseminated to all the world in newspapers or some such? Or would you rather someone returns it to you quietly and the information stays private.

    Archiving information produced by other people without their express consent is wrong and, potentially, harmful. This is one case where I strongly believe copyright law should be applied and enforced.

  23. No it isn't. by Anonymous Coward · · Score: 1, Informative

    robots.txt is not about whether accesses are "authorized" or not. Because the web server will still serve up the content if the robot asks for it! If you only want "authorized" users accessing the content, you should put some sort of access control mechanism where users have to type a password or something. Not only will that keep the robot out, but it demonstrates a clear intent to keep the robot out.

    robots.txt is more of a "please don't look at this" request to spiders. If the spider asks for the content anyway and your server happily sends it, then you can't claim this is "unauthorized" access.
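
    To make the contrast concrete, here is a minimal sketch of what an actual access control check looks like on the server side, as opposed to a robots.txt request that the client is free to ignore. It assumes HTTP Basic authentication; the function names and the "alice" / "s3cret" credentials are made up for illustration and are not anything the Archive or HA used.

```python
import base64
import hmac


def is_authorized(headers, expected_user, expected_password):
    """Return True only if the request carries matching Basic credentials."""
    scheme, _, encoded = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # Constant-time comparison so timing does not leak how much matched.
    return (hmac.compare_digest(user.encode(), expected_user.encode())
            and hmac.compare_digest(password.encode(), expected_password.encode()))


def respond(headers):
    """Hypothetical handler: 200 for authorized requests, 401 for everyone else."""
    return 200 if is_authorized(headers, "alice", "s3cret") else 401
```

    A robot that never sends credentials gets a 401 no matter what its User-agent says, which is the "clear intent" point: the server itself refuses, rather than asking politely.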

    1. Re:No it isn't. by Anonymous+Brave+Guy · · Score: 1

      robots.txt is more of a "please don't look at this" request to spiders. If the spider asks for the content anyway and your server happily sends it, then you can't claim this is "unauthorized" access.

      Doors are more of a "please don't come into my home" request to strangers. If a thief tries the door and finds it unlocked, you can't claim this is "unauthorised access".

      No, wait, that's a stupid idea, because it ignores your explicit wishes and then argues that you consented to an act that would be illegal without permission.

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  24. HA by law should have to give up the data by MushMouth · · Score: 1

    IIRC this was in response to a situation where someone was suing HA, and the plaintiff's law firm hammered archive.org and was able to get some of the pages that they were interested in. HA then sued the archive for copyright infringement, because they had changed their robots.txt to prevent the information from getting to the plaintiff's attorneys. The problem with this whole thing is that adding the robots file after the lawsuit is akin to destroying evidence during a trial, and they should have been found in contempt of court. Expecting the archive to delete the data is unrealistic, since unless the archive is serving the data there is no copyright violation. I don't see why the plaintiff's lawyer didn't serve the archive with a subpoena for the information, like gmail users have had their "deleted" email subpoenaed.

  25. A world without culture. by Anonymous Coward · · Score: 0

    "Consider an analogous situation in real life: You are walking in the park and someone asks you for a dollar. You decline, but the beggar keeps asking. You're saying that accepting your first denial as binding is "voluntary" and the beggar can keep bugging you as long as he likes. If that happened to me twice, I'd have the asshole arrested, and that's exactly what you're going to see online if people don't behave, especially when their behaviour leads to copyright violations which would have been avoided if they had followed the robot exclusion standard."

    And yet no one sees the analogy between the above, and those "please do not copy" reminders on artists' web pages. Maybe we can pull out that old slash-standby (you locking up MY culture with your robots.txt file).

  26. Violated their Own Policies by jafiwam · · Score: 1

    Their policy is pretty simple, and direct, and involves minimal interaction with a human. (A bonus.)

    Put in a robots.txt.

    Direct wayback to index what you want or don't.

    THAT DIRECTION IS APPLIED TO FILES ON THEIR SITE FROM PREVIOUS VERSIONS.

    Meaning, if you deny all, and their bot sees it, all of your stuff is supposed to get deleted from the archive.

    If they didn't do that they violated their own policy.

    True, there can be complications (such as switching domain names) that might keep any given text in there without interaction.

    What they do is a great and tremendously useful tool. But not entirely out of the "gray area" for copyright problems.
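
    The retroactive part of the policy described above can be sketched roughly like this. This is only an illustration of the idea (the site's current robots.txt is applied to snapshots captured earlier), not the Archive's actual code; the snapshot records and the visible_snapshots name are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def visible_snapshots(snapshots, current_robots_txt, user_agent="ia_archiver"):
    """Keep only the archived snapshots that the site's CURRENT robots.txt permits.

    `snapshots` is a list of dicts with a "url" key (hypothetical data model).
    """
    parser = RobotFileParser()
    parser.parse(current_robots_txt.splitlines())
    return [s for s in snapshots if parser.can_fetch(user_agent, s["url"])]


# Hypothetical archive contents captured before any robots.txt existed.
OLD_CAPTURES = [
    {"url": "http://www.yourdomain.com/index.html", "captured": "2003-01-01"},
    {"url": "http://www.yourdomain.com/private/memo.html", "captured": "2003-02-01"},
]
```

    With a deny-all file nothing from the old captures is shown, and with a narrower Disallow only the matching paths drop out. The switching-domains complication mentioned above is exactly the case this kind of check would miss, since the old URLs would no longer be covered by the new site's robots.txt.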

  27. ... I could make it so you were never born. by Corngood · · Score: 3, Interesting

    You missed the best part of the quote.

  28. What about robots.txt in/from the future? by Bozzio · · Score: 1
    Obeying robots.txt files is voluntary, after all,

    It may still be voluntary today, but who knows what the future will bring?

    I, for one, welcome our robot.txt overlords.
    --
    I just pooped your party.
  29. wrong by oohshiny · · Score: 2, Interesting

    The US has copyright laws, and lots of people rely on it, including open source projects.

    The robots.txt file is a clear indication of the conditions under which a copyright holder gives you access to their copyrighted materials. As such, it is not "voluntary".

    In addition to probably being in violation of copyright law, it is simply rude for companies to ignore robots.txt files; if the Internet Archive does this, they are badly behaved.

    If courts should decide that robots.txt files can be ignored at will, then more sites will require registration, click-through licenses, and those annoying "try to read this" safeguards, making life more miserable for all of us.

    The best thing for everybody, including the Internet Archive, would be for the robots.txt standard to be enforced strongly by courts.

  30. Wrong, wrong, wrong by kimvette · · Score: 3, Informative
    As the article notes, you can't really un-ring the bell of publishing something online, which is exactly what HA wanted to do. Obeying robots.txt files is voluntary, after all, and if the company didn't want the information online, they shouldn't have put it there in the first place."


    Wrong, wrong, wrong. archive.org explicitly tells you that if you want your content removed from their index, that you should modify your robots.txt and re-submit your site, and when their bot reads your robots.txt and sees the appropriate directives, your content will be dropped from the index. See:

    http://www.archive.org/about/faqs.php#2

    http://web.archive.org/web/20050305142910/http://www.sims.berkeley.edu/research/conferences/aps/removal-policy.html

    Let's review the text here, just in case someone from archive.org scurries to change it:

    Addendum: An Example Implementation of Robots.txt-based Removal Policy at the Internet Archive

     


    To remove a site from the Wayback Machine, place a robots.txt file at the top level of your site (e.g. www.yourdomain.com/robots.txt) and then submit your site below.

    The robots.txt file will do two things:

              1. It will remove all documents from your domain from the Wayback Machine.

              2. It will tell the Internet Archives crawler not to crawl your site in the future.

    To exclude the Internet Archive's crawler (and remove documents from the Wayback Machine) while allowing all other robots to crawl your site, your robots.txt file should say:

                                                  User-agent: ia_archiver

                                                  Disallow: /

    Robots.txt is the most widely used method for controlling the behavior of automated robots on your site (all major robots, including those of Google, Alta Vista, etc. respect these exclusions). It can be used to block access to the whole domain, or any file or directory within. There are a large number of resources for webmasters and site owners describing this method and how to use it. Here are a few:

                          http://www.global-positioning.com/robots_text_file/index.html

                          http://www.webtoolcentral.com/webmaster/tools/robots_txt_file_generator

                          http://pageresource.com/zine/robotstxt.htm

    Once you have put a robots.txt file up, submit your site (www.yourdomain.com) on the form on http://pages.alexa.com/help/webmasters/index.html#crawl_site.

    The robots.txt file must be placed at the root of your domain (www.yourdomain.com/robots.txt). If you cannot put a robots.txt file up, submit a request to wayback2@archive.org.
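
    The two directives quoted above can be checked with a minimal sketch using Python's standard urllib.robotparser. The bot name "SomeOtherBot" and the page URL are placeholders made up for illustration, and this only shows how a well-behaved crawler would read the file; it says nothing about how the Internet Archive's crawler is actually implemented.

```python
from urllib.robotparser import RobotFileParser

# The exclusion quoted in the policy text: block only ia_archiver.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page on the placeholder domain used in the policy text.
PAGE = "http://www.yourdomain.com/index.html"

ia_allowed = is_allowed(ROBOTS_TXT, "ia_archiver", PAGE)
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", PAGE)

print("ia_archiver allowed:", ia_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

    Nothing in the file forces a crawler to run a check like this; the parser only reports what the site owner asked for.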


    By not honoring those directives, are they not engaging in both copyright infringement and fraud?
    --
    The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
    1. Re:Wrong, wrong, wrong by Anonymous Coward · · Score: 0

      Let's review the text here, just in case someone from archive.org scurries to change it

      bah... if they do change it, use the Wayback Machine

    2. Re:Wrong, wrong, wrong by kimvette · · Score: 1

      *snickers* that's funny. The question I would answer your suggestion with is:

      Would their robot obey or ignore the directives when crawling archive.org? ;)

      --
      The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
  31. Mod parent up. by piper-noiter · · Score: 1

    He makes an interesting point about how archives make it hard to delete previous illicit activity.

    --
    Shick's Law: There is no problem a good miracle can't solve.
    1. Re:Mod parent up. by ibbey · · Score: 1

      They also make it hard to delete EVIDENCE of past illicit activity. Would you (or the parent) be upset if the info in the Archive were used to convict the perpetrators? If it contains a link to his valid email address, it very easily could be (and probably has been) used by law enforcement for that very purpose. In reality, in the wrong hands, just about any information can be used for nefarious purposes. The archive can be used for such purposes, but it has FAR more positive uses than negative ones.

    2. Re:Mod parent up. by phulegart · · Score: 1

      I agree that archives can be used for good purposes. We are in fact going to depend on a few archives when presenting this package of information to the FBI tomorrow in our meeting at the local office.

      But blanket archiving without moderation is just wrong. It's similar to going through your house and archiving everything. Trash, dead pets, dirty laundry, old food, etc. I'm not proposing that I know how to moderate such an undertaking, but there are plenty of good things that, because they can easily be turned around and used for bad ends, are now restricted. If necessary, Vodka can be used to keep a wound from getting infected. That's good. Should it be available to 6 year olds for consumption? Hell no. Matches are used to ignite combustibles. Mmmmm. Fire Good. Should we give those matches to the 6 year old to play with? Hell no. Go ahead and argue about how using a 6 year old is a bad example. But there are good examples too, right? And in the case of this 6 year old, there are restrictions because some things have good and bad uses.

      What is getting archived is as important as the archiving itself. Restricting access to all-encompassing archives is important too. I've seen mention of information terrorists could make use of, kiddie porn, etc. Hell, if a Kiddie porn site gets taken down, do we really want the people who frequented it, to just go to the archive site to get their kiddie porn?

      Come on. It's not just all or nothing; black and white. The fact that it is all shades of grey means that we have to apply a certain amount of intelligence to the entire picture. Not just rush blindly forward and grab everything in sight. I don't want to see archive.org go byebye. I also don't want to be able to search through archive.org for http://www.anyboard.net/comp/www/hostwatch to browse through people selling trojans, exploits, credit cards and more. Where is the moderation? Where is the middle ground?

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    3. Re:Mod parent up. by piper-noiter · · Score: 1

      Actually I'd have to disagree on the 'FAR more positive uses.'

      Yes, I agree that there are reasons to archive and I'd never suggest we just get rid of them all. What I would suggest, in this case, is that archives edit out personal data that could be used for easy identity theft (credit card numbers shouldn't be saved). That's all.

      And no, my blog entry on spores and fungus can hardly be used for nefarious purposes. When you put information on the internet about yourself, you've made your bed and taken that risk. When someone else posts personal information about you on the internet, that's something else entirely. Not that you can place a blanket law against that either, as that would restrict freedom of the press.

      --
      Shick's Law: There is no problem a good miracle can't solve.
    4. Re:Mod parent up. by ibbey · · Score: 1

      Archiving everything in your house is a bad idea. Archiving the Internet is a very different thing. First off, who is going to be the moderator? You? What if I disagree with your opinion of what is worthy of archiving or not? What happens if, as I pointed out, something that you chose not to archive would have turned out to be useful in prosecuting an identity thief (or worse)? Any moderator will have absolutely no way of knowing what information will be useful in the future and what information won't be. Then there's the practical aspect: How many people do you suspect it would take to screen each and every page in the Archive for its suitability for six year olds?

      And do they also do the same for the Internet proper? All of the information that you object to exists on the Internet itself. It's not hard to find. You can find the exact same information on Google, so should we moderate them also? Every other search engine? Usenet? Yahoo Groups, Google Groups? Do you begin to see the futility of your quest?

      Your argument is also unfounded considering the IA has a removal policy in place for exactly the sort of info you object to: "Occasionally, data disclosed in confidence by one party to another may eventually be made public by a third party. For example, medical information provided in confidence is occasionally made public when insurance companies or medical practices shut down. These requests are generally treated as requests by authors or publishers of original data." If they are archiving your credit card number, let them know and they will remove it. As for Kiddie Porn, considering they would go to prison for hosting it, I suspect that they do their best to remove it ASAP.

      Regarding your story about Paypal-protect.org, why didn't you just contact Yahoo to have his email account shut down? In their terms of service: "You agree to not use the Service to: 1. upload, post, email, transmit or otherwise make available any Content that is unlawful, harmful, threatening, abusive, harassing, tortuous, defamatory, vulgar, obscene, libelous, invasive of another's privacy, hateful, or racially, ethnically or otherwise objectionable;". It doesn't much matter how enthusiastic the guy is about selling your CC data, if you have no way to contact him he won't be too successful.

    5. Re:Mod parent up. by ibbey · · Score: 1

      Actually I'd have to disagree on the 'FAR more positive uses.'

      So you think the possibility of compromised personal information, even though they do have a policy to remove it, outweighs all of the other valuable uses? You've cited effectively one negative use, and (considering the removal policy, see my reply to the parent poster) one that isn't very sound at that. Positive uses: academic research, law enforcement, catching racist or otherwise offensive language on politicians' websites, tracking down product info for discontinued products... There are thousands of positive uses, there are a few negative ones. To me, that meets my definition of "FAR more".

      And with regards to your blog entry, I said "just about any information", so that particular entry may not be useful. But you'd be surprised what seemingly useless or trivial info can be used by a skilled social engineer. One of the fundamental skills of a good con man is his ability to extract useful info out of the seemingly useless. Maybe in that Blog post, you name your city. In one you made last month, you name your street. In a third, you show a picture of your new house, complete with street number. No single bit of information could be used to identify you (unless someone just happened to recognize your house from the photo), but put the three of them together, and I now have your home address (granted, these are obvious examples, but you can see my point). Or maybe the con man, knowing of your interest in spores & fungus, casually mentions something on the topic in conversation. Since he knows more about you than you know about him, you are at a significant disadvantage. And since you are (presumably) an honest sort of guy and he's not, most likely you won't have an inkling that you are being conned until long after it's over. My point wasn't that all info can be used, just that it's very difficult to foresee what information -could- be used improperly when someone has an agenda that you don't know about.

    6. Re:Mod parent up. by phulegart · · Score: 1

      It's a nice assumption that Yahoo would remove the email address. However, the reality is that after repeated requests to Yahoo to do exactly that, Yahoo has done NOTHING.

      It gets better. I uncovered a phishing site, specifically targeted at stealing Yahoo accounts, hosted on a Yahoo business server. I've reported it multiple times, and Yahoo won't do a thing. It has been there for MONTHS, and now the phishers use it to redirect to other temporary phishing sites. So a posted AUP/TOS apparently means nothing... whether it deals with what an archive site will remove, or what a host will consider against their policy. Don't believe it? Http://www.goodyseth.com is the site. That index page is innocuous enough. If you are quick with your browser's stop button, visit http://yahoo.abuse.dept.goodyseth.com/ and look at the source code. It has been edited YET AGAIN, and points to a NEW temp phishing site (Which I'm gonna have to get shut down). Do a WHOIS on goodyseth.com. See where it is hosted. Then report it to Yahoo and watch how nothing happens.

      Now, I already admitted that I have no idea how moderating an archiving site could be done, only that it SHOULD be done. However, it seems to me that a complete archive of the internet should be kept under lock and key. Should all those records of illegal activity stored in there be freely available to everyone?

      Also, how can I get archive.org to remove my stolen, posted, and archived personal information, if I don't know all of the places it has been posted? I don't know if you realize this, but while I can search for old websites I have had in the past by their domain names, and I find them archived... I can't search by information ON those websites. I've tried. Nothing comes up. Yet, there those sites are, when I search by domain. So their promise to remove archived material is a toothless guard dog. It might look nice from a distance, but it doesn't fit in practical application.

      And obviously they AREN'T going to remove Kiddie porn from their archives, just like they aren't going to remove other illegal material... because you can search there NOW, and the illegal material is there. So, when I say they won't, it is because they HAVEN'T.

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    7. Re:Mod parent up. by ibbey · · Score: 1

      If Yahoo fails to act in accordance with their stated policy, then your fight really should be with them, not the Archive. You have very valid complaints about how they have handled the situation.

      With regard to the Archive, I can certainly see your point, but I disagree. The Archive has too many valuable uses to allow it to be watered down. It goes back to my earlier point, who decides what merits inclusion? It's a ridiculously difficult challenge, and it's the sort of question that people have been debating since the birth of the Internet, and to a lesser extent since the birth of libraries. Any information that actually violates privacy, and any information that is actually illegal should certainly be removed. But from what I saw, the links you point to don't actually show a significant amount of personal info, but link to ways that you can find or buy personal information (granted, I didn't spend much time looking). Information on how to commit a crime is legally protected by the First Amendment. Actual offers to sell credit card data are not legally protected, but the offer is useless if Yahoo does its job and deactivates the email address/URL. Once the email account is deleted, then suddenly the offer to sell the credit card data is actually useful, both to law enforcement and to researchers, privacy experts, and for consumer education. Once again, your complaints against Yahoo are valid, your complaints against the Archive are misguided.

      For that matter, do you know for certain that the email account isn't still active specifically because Yahoo is working with law enforcement to capture the guy? Once they cancel his email address, it will be almost impossible to catch the guy. With a valid email, they can fairly easily track him down.

      As for kiddie porn, if it's there, have you told them about it? I would be VERY surprised to find out that they failed to act on removing actual child pornography. I could be wrong, but I believe that their failure to act to remove known child pornography is a serious legal offense in the US, so it's pretty doubtful that they would just ignore it. Note, there is some material that while highly objectionable to some people, is not legally considered child pornography. The photos of Robert Mapplethorpe or David Hamilton come to mind. While you may find the content offensive, they are art, and do not come close to meeting the definition of pornography. If that is the sort of material you object to, then you will not get very far having them removed (just as you won't succeed in getting them removed from the Library of Congress). Also things like short stories depicting sex with a minor are not child pornography, even if they claim to be true, and regardless of any apparent artistic merit. Child pornography has a very specific legal definition, so things that fall outside of that definition, however tasteless, are perfectly appropriate for inclusion in the Archive. Since I don't know any kiddie porn websites, I can't verify your claims that they haven't removed it... I tend to suspect that you are wrong on this one.

    8. Re:Mod parent up. by phulegart · · Score: 1

      Most likely I am wrong about the kiddie porn. It was an example based upon theory, not fact.

      I'm not opposed to the archiving PER SE, as much as I am opposed to the free access. If material that proves illegal activity is archived, then only those in a position to pursue and stop that activity should have access to it. A poor analogy would be that just because some people can use a gun responsibly, that does not mean that guns should be free and freely available. There should at the very least be a policy in place where if you insist that you MUST have access to the archives of illegal activity, you can be easily tracked down and held accountable for what you do with that information.

      Being able to request that information be removed from the archive kind of defeats the purpose of archiving everything. Since that implies that some information should not be archived, isn't it now a matter of coming up with a standard definition of what is to be kept and what isn't? I know I could never make that list. I also know that the information on how to contact those selling stolen goods should not be kept in an easy place to access. This only encourages the sale of stolen goods. But with the creation of that standard list, there is no need for a removal policy.

      The Yahoo thing... the hackers behind the phishing site, the ones attached to the yahoo email address, did their work from 4 IP addresses. Two out of Taiwan, and two out of Vietnam. Maybe Yahoo is currently working with authorities to track them down and arrest them. They have been in operation for a little while now, and they are collecting more victims as I type. The Yahoo address is still active, because yahoo does not reject communication attempts to that address. Now, Yahoo months ago was informed of a different phishing site, that was hosted on their server. Maybe the investigation is still pending, but I've been watching that site be updated again and again and again. Yahoo has the logs of all that activity. There has been more than enough time to shut it down. They do not. Now, both of these inactions on the part of Yahoo do not form a pattern. But they do indicate a tendency.

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
    9. Re:Mod parent up. by ibbey · · Score: 1

      I'm not opposed to the archiving PER SE, as much as I am opposed to the free access. If material that proves illegal activity is archived, then only those in a position to pursue and stop that activity should have access to it.

      But the information may be valuable to people OTHER than law enforcement and criminals. Also, not everything on Anyboard is illegal, so are you saying that just because 1% of the content is questionable, none of it should be archived?

      Being able to request that information be removed from the archive kind of defeats the purpose of archiving everything. Since that implies that some information should not be archived, isn't it now a matter of coming up with a standard definition of what is to be kept and what isn't?

      Not everything is or should be archived. As a web author, I have the right to say that my content is copyrighted and cannot be archived. Personal information won't be archived. Child Porn won't be archived. They honor a robots.txt file, so if you don't want your site archived, just set one up. If your information was archived without your permission, they have a removal policy, I linked to it previously.

      I know I could never make that list. I also know that the information on how to contact those selling stolen goods should not be kept in an easy place to access. This only encourages the sale of stolen goods. But with the creation of that standard list, there is no need for a removal policy.

      But again, it shouldn't matter that that info is in there to begin with. Every page in the Archive is AT LEAST 6 months old. Yahoo (or whoever) should be the ones you are angry with, not trying to arbitrarily censor the Archive because Yahoo is incompetent.

      And, again, keep in mind that the information regarding credit card numbers is EASILY available on the internet. Get onto Usenet and you can find CURRENT ads offering the same information, and the email addresses are much more likely to still work. The simple fact that the Archive is inherently obsolete makes your objections really silly. I can't imagine that there are a whole lot of script kiddies using the Archive to get their stolen CC numbers when there are hundreds or thousands of other sites that will be much more likely to have valid email addresses.

      The Yahoo thing... the hackers behind the phishing site, the ones attached to the yahoo email address, did their work from 4 IP addresses. Two out of Taiwan, and two out of Vietnam. Maybe Yahoo is currently working with authorities to track them down and arrest them. They have been in operation for a little while now, and they are collecting more victims as I type. The Yahoo address is still active, because yahoo does not reject communication attempts to that address. Now, Yahoo months ago was informed of a different phishing site, that was hosted on their server. Maybe the investigation is still pending, but I've been watching that site be updated again and again and again. Yahoo has the logs of all that activity. There has been more than enough time to shut it down. They do not. Now, both of these inactions on the part of Yahoo do not form a pattern. But they do indicate a tendency.

      I have stated previously, your complaints about Yahoo are perfectly reasonable. This discussion is about the Internet Archive, so it's a bit off topic. Have you contacted YahooDomains and had them shut down the domain itself? Since they are the registrar, that would be my first action.

      Anyway, I'm clearly not going to convince you that you are wrong, and I'm probably becoming even more resolute in my opinions the more I think about it, so what say we just drop it here?

  32. DMCA Violation by Anonymous Coward · · Score: 0

    Since robots.txt is an access control mechanism, bypassing it is illegal under the DMCA.

  33. The wayback machine is worthless anyway... by Captain_Carnage · · Score: 1

    It's full of lies. It told me that the Internet was invented by Larry Roberts, working for DARPA. Everyone knows it was Al Gore...

  34. You've got it backwards by phr2 · · Score: 1

    It's more like: bum asks you for a dollar. You give him one. Two weeks later you decide you don't want to give handouts any more, so you write on your forehead "no soliciting". Next you go to court and claim that writing "no soliciting" on your forehead means you not only won't give more handouts, but the bum who you PREVIOUSLY gave a dollar to, now has to return it.

    See: that company DID NOT HAVE a robots.txt directive active when the Wayback machine archived it. They put the robots directive up two weeks later, once they realized that the archived file showed they were doing bad stuff that would embarrass them.

    1. Re:You've got it backwards by Anonymous Coward · · Score: 0

      I am arguing against the general notion that obeying robots.txt is voluntary. The case with Healthcare Advocates is different, but the author of the Slashdot summary used it to transport his message that if it's online it's fair game, regardless of robots.txt.

    2. Re:You've got it backwards by Jtheletter · · Score: 1

      Please cite the law that says robots.txt is NOT voluntary. Until then you're wishing for a law, but you can't defend your stance legally based on a wish. Morally? Ethically? Sure, you have an argument. But you're demanding the ability to prosecute, which does not currently exist.

      --
      -- I'm not a pessimist, I'm a realist. It's not my fault that life sucks so much. --
    3. Re:You've got it backwards by Anonymous Coward · · Score: 0

      Reading comprehension is required if you want to participate in a meaningful textual discussion. The law isn't the only thing that can make you do things non-voluntarily. Even though RFCs are not law, you will quickly regret it if you ignore some of the more fundamental ones. The same can happen if you ignore robots.txt. In addition to that, robots.txt has some legal relevance because it is a declaration of intent. Ignoring robots.txt is not illegal, but the decision is not voluntary either.

    4. Re:You've got it backwards by Jtheletter · · Score: 1

      Wow, aren't you a prick. Here's reading comprehension for you - these are all from the AC posts in this thread that I assume come from you, the same person in each. Ready?

      In the case of robots.txt, these sanctions can very well be court ruling against you

      This is especially important with regard to services which mirror webpages. Doing so without the (assumed) consent of the author is a straightforward copyright violation

      Even if you don't fear the legal system, disregarding robots.txt can quickly get you in trouble.

      If that happened to me twice, I'd have the asshole arrested, and that's exactly what you're going to see online if people don't behave,


      In each case you have expressly implied that ignoring robots.txt has some legal consequences, and in your [poor] analogy of a bum soliciting for cash you explicitly stated you would have the person arrested, the logical extension through the analogy being that you have some legal repercussion against violators of robots.txt. You're also confusing copyright violations with ignoring robots.txt, and as others have pointed out, there are plenty of other reasons robots spider sites other than to wholesale copy and redistribute the data, including but not limited to checking a hash of the site to see if it has been updated, or collecting statistical data for commercial or educational purposes. Copyright violations are illegal and can be prosecuted, and while they can be facilitated by ignoring robots.txt, they by no means require it or vice-versa. Had you actually read my post you would see that I stated only that there are no LEGAL repercussions from ignoring robots.txt. That was my whole point. You are trying to make a connection where there is none. Your reading comprehension is clearly lacking since I never denied that there are other consequences to ignoring robots.txt. Also, read Grumbel's response elsewhere in this thread where he was kind enough to quote the relevant [draft] RFC text which states in no uncertain terms that robots.txt is wholly voluntary. To recap, here's the text from the RFC: "Web site administrators must realise this method is voluntary, and is not sufficient to guarantee some robots will not visit restricted parts of the URL space."
      "It is not an official standard backed by a standards body, or owned by any commercial organisation. It is not enforced by anybody, and there no guarantee that all current and future robots will use it."
      and here's your statement: Ignoring robots.txt is not illegal, but the decision is not voluntary either.

      Just because there are consequences to actions (other than through force of law) does not mean that those actions are no longer voluntary. And stop posting as AC or at least sign the damn posts with some identifier if you expect to continue a conversation.

      --
      -- I'm not a pessimist, I'm a realist. It's not my fault that life sucks so much. --
    5. Re:You've got it backwards by Anonymous Coward · · Score: 0

      You just don't get it. There is a difference between being arrested for ignoring robots.txt, getting your ip space sanctioned for ignoring robots.txt and being sued for doing something which you wouldn't have done if you had abided by the robots exclusion standard. If you can't differentiate between these cases, further discussion is useless.

      If you think that obeying robots.txt is voluntary, visit one of the sites which employ countermeasures against robots which tread onto forbidden urls. Go ahead and make "fair use" copies of data in paths that robots are excluded from, which is often data of which webauthors don't accept mirrors. The fine will be higher because you willingly or negligently ignored robots.txt. Looking left and right before crossing the street is also "voluntary", but if you cause trouble by not looking before you step on the road, that will count against you. Robots.txt is a civilized and easily implemented method of regulating automated clients. If you don't subscribe to civilized regulation, then someone will go legal or berserk on you. It's your - voluntary - choice.

  35. what your government DOESN'T want you to know by way2trivial · · Score: 1

    http://www.whitehouse.gov/robots.txt

    think about it-- anything on this list IS NOT on google..

    why???

    --
    every day http://en.wikipedia.org/wiki/Special:Random
  36. erasing history by 1u3hr · · Score: 1
    Wayback Machine has never asserted their right to keep anything online. As the article points out, they'll remove stuff that's noncompliant with the current robots.txt, even though it was compliant at the time it was spidered.

    I really hate that. When I want to find some info about some hardware made by a long-defunct company, I find old usenet posts referencing their website. This is now taken over by some scumbag who has filled it full of porn and viagra ads. I go to the Wayback Machine and find ALL the history of the site is inaccessible because the current owner of the domain has blocked them in the robots.txt, despite the fact the owners of the original site have no relation to them at all.

    1. Re:erasing history by fm6 · · Score: 1

      You make a good point. (Which is unfair of you, since you're on my foes list!) It would make sense for the Wayback machine to not enforce robots.txt retroactively when the web site has obviously changed hands. Problem is, that's something you can't automate — and checking millions of web sites by hand is just not doable!

  37. George? Is that you? by pointbeing · · Score: 1
    Suppose somebody unintentionally publishes information useful to terrorists.
    Fearmongering. Great way to make your point - Sagan called this an argument from adverse consequences.
    --
    we see things not as they are, but as we are.
    -- anais nin
  38. When the next lawsuit comes. by austinpoet · · Score: 1
    If a computer connected to the web one day, was then disconnected from the web, and had a website in its cache, would that somehow be a DMCA violation because the cached website was taken down by the owner?

    From the Ars article, it would seem that one of the arguments from Healthcare Advocates was that looking at a cached version of an out-dated website was a violation of the DMCA.

    That's just crazy.

  39. Dizzying intellect by Lactoso · · Score: 1
    I was simply pointing out that your payment for maintaining a website does not automatically make all of its content yours or assure you of any degree of privacy (in and of itself).

    If you have a "member's area" or some other area not intended for public consumption, then I'd imagine you have a reasonable expectation of privacy. That's what it all comes down to - if you post legal (not illegal or stolen such as your examples) information on a website, for all to see, then you really can't complain later when an archive company shows you what you had up there.

    I saw your other post on that credit card scammer thing, and that's outside the scope of this argument. Obviously illegal content should not be reproduced.

    1. Re:Dizzying intellect by phulegart · · Score: 1

      But it IS being reproduced. And it is still there on archive.org. And it is still there in Google's cache links. And companies like Yahoo refuse to do anything about the illegal activities they are being informed of, on their own servers and within their own userbase. By your argument, if ABC puts on a television program for all to see (with just an antenna and an old black and white TV for example), then they really can't complain when someone else tapes the show and rebroadcasts it for all to see whenever they want.

      --
      "I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
  40. Re:George? Is that you? by fm6 · · Score: 1

    You're the second jackass to accuse me of imitating the idiot pres, and I'd be insulted if I took you at all seriously.

    One sign of diminished intelligence is a fondness for quoting fallacy definitions without really understanding them. Though I can't blame you in this case. "Argument from adverse consequences (putting pressure on the decision maker by pointing out dire consequences of an "unfavorable" decision)" is so vague as to be meaningless. "Captain, slow down! There are icebergs ahead!" "Oh, stop arguing from adverse consequences!"

  41. you're talking about two different things by phr2 · · Score: 1
    1) Robots.txt being a mandatory instruction to not spider a site bearing the instruction. Fine, I can go along with that.

    2) Robots.txt being a mandatory instruction to retroactively get rid of any archives collected before the robots.txt directive went up. That is much harder to justify.

    Do you understand the difference?
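
    To make the two readings concrete, here is a rough Python sketch. It is only an illustration: the helper names, the "ia_archiver" agent string, the example.com URL, and the dates are all made up for the example, and nothing here claims to be how the Wayback Machine actually works. Reading (1) only affects whether a page may be fetched now; reading (2) also hides snapshots taken before the robots.txt went up.

```python
from datetime import date
from urllib.robotparser import RobotFileParser


def build_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set (no network access needed)."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def may_crawl(rules: RobotFileParser, agent: str, url: str) -> bool:
    """Reading (1): robots.txt only decides whether we may spider the URL today."""
    return rules.can_fetch(agent, url)


def visible_snapshots(snapshots, rules, agent, url, robots_added, retroactive):
    """Return the archived snapshot dates that stay visible under a given policy.

    retroactive=False: keep snapshots collected before the directive went up.
    retroactive=True:  reading (2), the directive also hides older snapshots.
    """
    if rules.can_fetch(agent, url):
        return list(snapshots)  # nothing is excluded, so everything stays visible
    if retroactive:
        return []
    return [taken for taken in snapshots if taken < robots_added]


# Hypothetical example data, not taken from the case discussed in the story.
rules = build_rules("User-agent: *\nDisallow: /\n")
agent = "ia_archiver"
url = "http://example.com/page.html"
snapshots = [date(2003, 1, 1), date(2005, 6, 1)]
robots_added = date(2004, 1, 1)

print(may_crawl(rules, agent, url))
print(visible_snapshots(snapshots, rules, agent, url, robots_added, retroactive=False))
print(visible_snapshots(snapshots, rules, agent, url, robots_added, retroactive=True))
```

    Under the non-retroactive policy the snapshot from before the directive survives; under the retroactive one, the same robots.txt wipes out the whole visible history.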

  42. The Internet is no different than print by chicago_scott · · Score: 1

    Why should the Internet be different than print media?

    Has anyone (other than the Government) ever gone to the Library of Congress and successfully demanded that they destroy print media in their archives? How about digital media?

    The answer is no.