Web Copyright Crackdown On the Way

Posted by kdawson on Friday March 5, 2010 @02:38AM from the eighty-percent-rule dept.

Hugh Pickens writes "Journalist Alan D. Mutter reports on his blog 'Reflections of a Newsosaur' that a coalition of traditional and digital publishers is launching the first-ever concerted crackdown on copyright pirates on the Web. Initially targeting violators who use large numbers of intact articles, the first offending sites to be targeted will be those using 80% or more of copyrighted stories more than 10 times per month. In the first stage of a multi-step process, online publishers identified by Silicon Valley startup Attributor will be sent a letter informing them of the violations and urging them to enter into license agreements with the publishers whose content appears on their sites. In the second stage Attributor will ask hosting services to take down pirate sites. 'We are not going after past damages' from sites running unauthorized content says Jim Pitkow, the chief executive of Attributor. The emphasis, Pitkow says is 'to engage with publishers to bring them into compliance' by getting them to agree to pay license fees to copyright holders in the future. Offshore sites will not be immune from the crackdown: almost all of them depend on banner ads served by US-based services, and the DMCA requires the ad service to act against any violator. Attributor says it can interdict the revenue lifeline at any offending site in the world." One possible weakness in Attributor's business plan, unless they intend to violate the robots.txt convention: they find violators by crawling the Web.
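The summary does not say how Attributor measures "80% or more" of a story, so the following is only a rough sketch of how such a threshold rule could be expressed. It assumes a word-shingle overlap as the similarity measure; the function names, the shingle size, and the parameter names are hypothetical and are not taken from Attributor.

```python
def shingles(text, n=5):
    """Return the set of n-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < n:
        # Short texts yield a single shingle made of all their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def copied_fraction(original, candidate, n=5):
    """Fraction of the original's shingles that also appear in the candidate."""
    source = shingles(original, n)
    if not source:
        return 0.0
    return len(source & shingles(candidate, n)) / len(source)


def meets_first_wave_rule(fractions, min_fraction=0.8, max_copies=10):
    """True if more than max_copies stories in one month were copied at
    min_fraction or more (the 80% / 10-times figures from the summary)."""
    heavy_copies = sum(1 for f in fractions if f >= min_fraction)
    return heavy_copies > max_copies
```

Here `fractions` would hold one `copied_fraction` value per story found on a site during a month. Exactly 10 heavy copies does not trip the rule, because the summary says "more than 10 times per month".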

Robots.txt by Jaysyn · 2010-03-05 02:39 · Score: 2, Insightful

I'm sure these guys have no compunction about ignoring robots.txt if doing so makes them money.

--
There is a war going on for your mind.
1. Re:Robots.txt by yincrash · 2010-03-05 02:43 · Score: 5, Insightful
  
  Seriously. Following robots.txt is not law, only convention. I'm sure it doesn't take much to convince themselves to ignore it. Money, "doing the right thing", etc. If you view the copyright infringers as pirates, then why should Attributor follow their wishes?
2. Re:Robots.txt by notgm · 2010-03-05 02:43 · Score: 4, Insightful
  
  is there some written law that holds people to following robots.txt? if not, how is it even possible to call it a weakness?
3. Re:Robots.txt by HungryHobo · 2010-03-05 02:46 · Score: 1
  
  First on the chopping block:
  Slashdot for its copy-pasted copies of linked blogs with copy-pasted copies of magazine articles copy-pasted directly from press releases.
4. Re:Robots.txt by HungryHobo · 2010-03-05 02:48 · Score: 1
  
  nah, it's just considered bad manners.
5. Re:Robots.txt by Joe+U · 2010-03-05 02:50 · Score: 1, Interesting
  
  If they are going to extend the DMCA to other countries, then let's extend computer trespassing laws to cover robots.txt violations.
  I'm being somewhat serious (but not super-serious). If courts want to hold that a website TOS is binding, then isn't the robots.txt binding as well?
6. Re:Robots.txt by Anonymous Coward · 2010-03-05 02:55 · Score: 1, Informative
  
  Seriously. Following robots.txt is not law, only convention.
  
  Unauthorised access to a computer system isn't against the law? Which country are you talking about? robots.txt is the standard method to express that certain access methods are not authorised.
7. Re:Robots.txt by houghi · 2010-03-05 03:01 · Score: 1
  
  I wish the standard would be opt-in and not opt-out. Sure that would mean that many sites won't be found by google and others, but if they want to be found, they could just add the robots.txt
  
  --
  Don't fight for your country, if your country does not fight for you.
8. Re:Robots.txt by Jaysyn · 2010-03-05 03:02 · Score: 1
  
  That would be interesting. While Google has a lot more money, Geeknet would be a much softer target.
  
  --
  There is a war going on for your mind.
9. Re:Robots.txt by LordAndrewSama · 2010-03-05 03:05 · Score: 1
  
  if a website's TOS is binding, why not just put what's in the robots.txt file in the TOS in legalese, or just state in the TOS that the robots.txt file must be obeyed or whatever?
10. Re:Robots.txt by DaTroof · 2010-03-05 03:07 · Score: 2, Insightful
  
  I can see ways that their service could be effective while respecting robots.txt settings. They'd simply need to crawl the indexes of other search engines. After all, if a violator is not accessible through Google or Bing, it's probably a low priority.
11. Re:Robots.txt by MtHuurne · 2010-03-05 03:08 · Score: 1
  
  If the infringing sites have a robots.txt that tells all crawlers to skip them, they will not show up in search engines. If they single out Attributor's crawler's user agent string, they would look very suspicious.
12. Re:Robots.txt by Joe+U · 2010-03-05 03:08 · Score: 2, Interesting
  
  That's the point I was trying to make. I posted this somewhere else:
  http://blog.internetcases.com/2010/01/05/browsewrap-website-terms-and-conditions-enforceable/
  So now you can turn around and sue them for crawling your site if you specifically disallow it in the terms and robots.txt.
  The results should be interesting to watch.
13. Re:Robots.txt by petermgreen · 2010-03-05 03:19 · Score: 1
  
  What if they only allow known crawlers from major search engines?
  
  --
  note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
14. Re:Robots.txt by Anonymous Coward · 2010-03-05 03:22 · Score: 1, Insightful
  
  Crawlers which ignore robots.txt can be detected and blocked. Therefore, even though adhering to the robots exclusion standard isn't the law, it is a good idea.
  On the other hand I don't see how robots.txt is even relevant here: Surely most of these sites depend on search engine traffic, so they can't hide the articles from crawlers.
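One common way to detect a crawler that ignores robots.txt is a trap path: list a directory under Disallow that no page links to in a way a human would follow, then treat any client that requests it as a violator. A minimal sketch, assuming request data is already available as (ip, path) pairs; the `/trap/` path and the function name are made up for illustration.

```python
# Hypothetical trap directory. The site's robots.txt would contain:
#   User-agent: *
#   Disallow: /trap/
TRAP_PREFIX = "/trap/"


def robots_violators(requests, trap_prefix=TRAP_PREFIX):
    """Return the set of client IPs that requested a disallowed trap path.

    requests is an iterable of (ip, path) pairs, for example parsed from
    an access log. A well-behaved crawler never fetches the trap path.
    """
    return {ip for ip, path in requests if path.startswith(trap_prefix)}
```

The resulting set can then feed whatever blocking mechanism the site already uses. A curious human who reads robots.txt and visits the trap by hand would be flagged too, so this is a heuristic rather than proof.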
15. Re:Robots.txt by Xest · 2010-03-05 03:28 · Score: 1
  
  Because if they use a robot, you can just identify it and feed it shit.
  It won't be long before people know the details of their crawlers and can just serve them something random.
16. Re:Robots.txt by DaTroof · 2010-03-05 03:29 · Score: 1
  
  Then they can be found in the major search engines' indexes, and Attributor doesn't even need to crawl their site.
17. Re:Robots.txt by poetmatt · 2010-03-05 03:34 · Score: 1
  
  don't worry. They're going to break a lot of laws, break a lot of legs, and basically commit suicide. At least when it's through we'll have fewer dinosaur industries to deal with.
  They're literally planning to go to domain providers and threaten DMCA to get content taken down. Instead of, you know, DMCA'ing the website appropriately, this is an end run around the legal process. Expect a quick smackdown. And why would they host such a company in California of all places, where Cali is the most clear that 3rd parties are not liable?
18. Re:Robots.txt by poetmatt · 2010-03-05 03:38 · Score: 1
  
  I'm quite sure that people whose robots.txt is being ignored by a crawler are going to express that in quite hostile ways, in comparison.
19. Re:Robots.txt by rnturn · 2010-03-05 03:39 · Score: 1
  
  That's only if their web crawler even looks at robots.txt. It's not required, only a courtesy. I'm sure they'll not be so courteous and claim that they need to do this because the violators they're looking for would block them anyway.
  The sure fire way to keep them out would be to find out what IP address Attributor is using and block that at your firewall. The trouble with that is they could easily change their IP address or even employ something akin to a botnet to do their web crawling so that their probes appear to be coming from a large number of different addresses on different networks. Try keeping up with that moving target.
  
  --
  CUR ALLOC 20195.....5804M
20. Re:Robots.txt by gbjbaanb · 2010-03-05 03:41 · Score: 1
  
  They'd simply need to crawl the indexes of other search engines
  after purchasing a licence to use the search engine's data, naturally :)
21. Re:Robots.txt by mea37 · 2010-03-05 03:46 · Score: 2, Informative
  
  Really?
  Do you also believe that ToS violations constitute unauthorized access to a computer? That approach was tried recently by the U.S. prosecutors. Ultimately the court didn't buy that position.
  So... why would robots.txt, which advises me of your wishes but to which I never actually agree, carry any more legal authority than a ToS document to which I do supposedly agree as a condition of using your system?
22. Re:Robots.txt by omnichad · 2010-03-05 03:47 · Score: 1
  
  And what's the real difference in lost performance between a hit for robots.txt returning a 404 and a hit for robots.txt returning 200 and a few hundred bytes of text?
  
  Either your site is public or it's not. And robots.txt is such a simple standard, why is it so hard to do? You don't even have to write your own.
23. Re:Robots.txt by clone53421 · 2010-03-05 03:49 · Score: 2, Funny
  
  More work for the /. editors! Horror!~
  
  --
  Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
24. Re:Robots.txt by Registered+Coward+v2 · 2010-03-05 03:49 · Score: 3, Interesting
  
  Seriously. Following robots.txt is not law, only convention. I'm sure it doesn't take much to convince themselves to ignore it. Money, "doing the right thing", etc. If you view the copyright infringers as pirates, then why should Attributor follow their wishes?
  I'd go even farther to say that sites that use robots.txt to eliminate crawling are probably not major targets - if they don't show up in search engines then they probably don't generate enough traffic to be worth the effort. Sites that are high traffic are much better targets - their revenue stream from ads is probably significant enough that they don't want to risk losing it. Once enough fall into line they can worry about the ones that are not indexed - in fact they may just want to kill them off to preserve traffic to licensed sites.
  
  --
  I'm a consultant - I convert gibberish into cash-flow.
25. Re:Robots.txt by DaTroof · 2010-03-05 03:53 · Score: 2, Informative
  
  after purchasing a licence to use the search engine's data, naturally :)
  Depending on the search engine and its terms of service, they might not even need to purchase a license. Google, Bing, and Yahoo all provide search APIs for third-party software.
26. Re:Robots.txt by clone53421 · 2010-03-05 03:57 · Score: 1
  
  robots.txt is an opt-out. If it isn’t present, you can crawl.
  Furthermore, you are in no way legally obligated to even check robots.txt before crawling a site. It’s merely a standard of politeness to do so.
  
  --
  Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
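For reference, this is roughly what the polite check looks like from the crawler's side, sketched with Python's standard `urllib.robotparser`. The user agent name `ExampleCopyrightBot` and the sample rules are invented for illustration. An empty or missing robots.txt allows everything, which is the opt-out behaviour described above.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt, user_agent, path):
    """Return True if the given robots.txt text permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical rules: shut out one named bot, keep everyone else
# out of /private/ only.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleCopyrightBot
Disallow: /

User-agent: *
Disallow: /private/
"""

print(may_crawl(SAMPLE_ROBOTS_TXT, "ExampleCopyrightBot", "/story.html"))  # False
print(may_crawl(SAMPLE_ROBOTS_TXT, "SomeOtherBot", "/story.html"))         # True
print(may_crawl("", "AnyBot", "/anything"))                                # True
```

Nothing in this check is enforced by the server. The crawler author has to choose to call it, which is the whole point of the "politeness" argument.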
27. Re:Robots.txt by DaTroof · 2010-03-05 03:58 · Score: 1
  
  Attributor wouldn't even need to crawl their sites. As I noted above, both Google and Bing provide search APIs.
28. Re:Robots.txt by clone53421 · 2010-03-05 03:59 · Score: 1
  
  If a robot passes the Turing test, does it have to check robots.txt before it crawls the website?
  If I manually crawl through all the pages on their site and bookmark all the links, am I a robot?
  Such difficult questions... how on earth would we legislate something?
  
  --
  Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
29. Re:Robots.txt by Mashdar · 2010-03-05 04:47 · Score: 1
  
  Why don't you ask the robot?
30. Re:Robots.txt by aztracker1 · 2010-03-05 05:22 · Score: 1
  
  Most likely the sites in question aren't blocking via robots.txt anyhow, since they are relying on search engines for inbound traffic. Also, copying an entire article without proper linking and attribution is a far cry from fair use. It's not the same as having a few large quotes from another source, with citation and links back to said source article. I suspect for now the first targets will be the whole copy sites... I hate when I find tech articles from people's blogs re-posted on other sites, usually without even a cursory link back to the author's site. Many people won't realize they aren't on the author's site.
  
  --
  Michael J. Ryan - tracker1.info
31. Re:Robots.txt by Crudely_Indecent · 2010-03-05 05:38 · Score: 2, Informative
  
  Anyone interested in finding out what's really going on with a website would look at robots.txt first and ask themselves 'now why do they want the robots to avoid these pages?'
  Of course, some of those entries will be dead-ends (dynamic pages that make no sense to crawl, password protected pages that would detract from a site's rankings, etc...).
  What's going to be interesting is what happens when their method is identified and/or the IP addresses they're using to make those identifications. There is no way to bypass .htaccess type restrictions. If their bot identifies itself (or can be identified), or their IP (range(s)) can be identified, the site owners can become invisible to the copyright bot and/or the agency tasked with detecting violations.
  A clever administrator might even build a script to deliver alternate content to the bot/agency so as to not appear suspiciously invisible.
  The exact method to thwart their efforts hinges on exactly how they detect the violations. It can definitely be done.
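As a rough illustration of the IP-range idea, here is a sketch using Python's standard `ipaddress` module. The netblocks are reserved documentation ranges standing in for whatever ranges an administrator actually identifies; nothing here reflects addresses Attributor uses.

```python
import ipaddress

# Placeholder netblocks (reserved documentation ranges), assumed for
# illustration only.
SUSPECT_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_suspect(client_ip, networks=SUSPECT_NETWORKS):
    """Return True if client_ip falls inside any of the listed netblocks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)
```

A request handler could call `is_suspect` on the client address and then either refuse the request or serve alternate content, the same decision an .htaccess rule would encode.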
  
  --
  
  "Lame" - Galaxar
32. Re:Robots.txt by paazin · 2010-03-05 05:39 · Score: 1
  
  is there some written law that holds people to following robots.txt? if not, how is it even possible to call it a weakness?
  Nope:
  
  There is no law stating that /robots.txt must be obeyed, nor does it constitute a binding contract between site owner and user, but having a /robots.txt can be relevant in legal cases. (from here
33. Re:Robots.txt by nacturation · 2010-03-05 05:49 · Score: 2, Insightful
  
  Right... because a judge will find that offer, consideration, and acceptance of a contract took place between a webserver and a bot? The court case you cite is irrelevant to an automated program that has no understanding and cannot accept conditions presented online.
  
  --
  Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
34. Re:Robots.txt by Joe+U · 2010-03-05 05:59 · Score: 3, Insightful
  
  Right... because a judge will find that offer, consideration, and acceptance of a contract took place between a webserver and a bot? The court case you cite is irrelevant to an automated program that has no understanding and cannot accept conditions presented online.
  Awesome, so anyone can DoS a server, send mass spam or distribute a virus as long as a bot does it, because a judge will rule that the bot acted on its own and wasn't developed or set loose by anyone at all.
  If the software wrote itself you might have a point, otherwise the people who wrote it are the ones responsible for how it acts.
35. Re:Robots.txt by nacturation · 2010-03-05 06:52 · Score: 1
  
  Awesome, so anyone can DoS a server, send mass spam or distribute a virus as long as a bot does it, because a judge will rule that the bot acted on its own and wasn't developed or set loose by anyone at all.
  You do understand that those activities are illegal but requesting information from a webserver is not illegal, right?
  
  If the software wrote itself you might have a point, otherwise the people who wrote it are the ones responsible for how it acts.
  How it acts is by requesting information from a webserver. If such activity does not violate any laws then it follows that it must be a legal activity. However, your premise is not that requesting information from a webserver is illegal, but that a computer program requesting information from a webserver is capable of entering into a contract on behalf of the computer program's authors based on the contents of the information being returned. That is ridiculous. What evidence can you supply that a court would accept this legal theory of yours?
  
  --
  Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
36. Re:Robots.txt by theArtificial · 2010-03-05 07:54 · Score: 1
  
  A clever administrator might even build a script to deliver alternate content to the bot/agency so as to not appear suspiciously invisible.
  Clever, but what happens when the bot doesn't identify itself as "CopyrightBOT 123"? Google bots don't always identify as Googlebot to prevent exactly this. I'm sure this behavior isn't unique to Google.
  
  --
  Man blir trött av att gå och göra ingenting.
37. Re:Robots.txt by The+End+Of+Days · 2010-03-05 08:05 · Score: 1
  
  I find a certain amount of humor in the concept that one should obey the theoretical privacy-seeking wishes of people who are attempting to hide their lawbreaking activities.
  Oh kdawson, you are a genius rabble rouser.
38. Re:Robots.txt by silverglade00 · 2010-03-05 08:35 · Score: 2, Funny
  
  Work for the /. editors! Horror!~
  Fixed that for ya.
39. Re:Robots.txt by Crudely_Indecent · 2010-03-05 09:17 · Score: 1
  
  Clever but what happens when...
  Didn't I mention IP address range(s)?
  Why, yes....yes I did.
  
  --
  
  "Lame" - Galaxar
40. Re:Robots.txt by theArtificial · 2010-03-05 09:29 · Score: 1
  
  True to your name :P How would one obtain and identify the IP blocks if the bots don't identify themselves?
  
  --
  Man blir trött av att gå och göra ingenting.
41. Re:Robots.txt by Joe+U · 2010-03-05 09:55 · Score: 1
  
  You do understand that those activities are illegal but requesting information from a webserver is not illegal, right?
  You do understand that the website operator/owner can revoke access to any part of the server on a whim. Also, people have been prosecuted for going to URLs they weren't supposed to, even though they were not password protected. Combine all that with the fact that the courts have ruled that the terms of service are enforceable. This means you either respect the wishes of the website operator/owner or you will find yourself with a minimum of a civil suit and possibility of a criminal one, especially if your robot causes even the slightest change in the server's normal operation.
  
  a computer program requesting information from a webserver is capable of entering into a contract on behalf of the computer program's authors based on the contents of the information being returned.
  Yeah, like I said, if you write an app for a specific purpose you're responsible for the actions of that app.
42. Re:Robots.txt by Joebert · 2010-03-05 10:24 · Score: 1
  
  Oooh, word jumble.
  
  Ok, whorrres for /. editor Thor!!~
  
  Does /. have an editor named Thor ?
  
  --
  Wanna fight ? Bend over, stick your head up your ass, and fight for air.
43. Re:Robots.txt by silverglade00 · 2010-03-05 10:41 · Score: 1
  
  I'm sure they'll not be so courteous and claim that they need to do this because the violators they're looking for would block them anyway.
  Isn't that how it always starts? "You wouldn't need robots.txt if you have nothing to hide!" "Only criminals and pirates have a robots.txt!!" Yecch. Next thing you know, Comcast is throttling bandwidth to websites with a robots.txt and RIAA will be serving takedown notices if they find one on your site.
44. Re:Robots.txt by Crudely_Indecent · 2010-03-05 11:20 · Score: 1
  
  Didn't I write the word "IF" as a qualifier in the 5th sentence.
  Why, yes....yes I did.
  Perhaps you should re-read the entire post you're commenting against before you press Submit.
  I operate under the assumption that things CAN be done until I'm convinced that they cannot be done. I remain optimistic that the copyright police scans can be identified in some way.
  All of this is theoretical. I don't host copyrighted material on any of my sites, so I don't really care about that. I am concerned about additional usage on my sites by yet another bot where I will never see any benefit.
  And my chosen name on /. was taken from a dictionary definition of the word vulgar. My response to you was not vulgar.....although, it was impudent. I attempt at all times to remain dissimilar to the name.
  
  --
  
  "Lame" - Galaxar
45. Re:Robots.txt by theArtificial · 2010-03-05 14:23 · Score: 1
  
  Didn't I write the word "IF" as a qualifier in the 5th sentence.
  Why, yes.... yes you did.
  
  Perhaps you should re-read the entire post you're commenting against before you press Submit.
  So one can disregard it? See above. You have yet to answer my question about HOW. Do you understand what I am asking? HOW would YOU identify a bot when you don't know whether it is a bot or an actual visitor? Let's assume that you are just looking at logs. What are your thoughts on that?
  
  All of this is theoretical. I don't host copyrighted material on any of my sites, so I don't really care about that. I am concerned about additional usage on my sites by yet another bot where I will never see any benefit.
  An academic exercise, great. I didn't think you were hosting anything of that nature. Why, no.... no I didn't suggest that.
  
  And my chosen name on /. was taken from a dictionary definition of the word vulgar [thefreedictionary.com]. My response to you was not vulgar.....although, it was impudent. I attempt at all times to remain dissimilar to the name.
  Perhaps pompous would be more fitting.
  
  --
  Man blir trött av att gå och göra ingenting.
46. Re:Robots.txt by shnull · 2010-03-05 16:47 · Score: 1
  
  It's sometimes actually amusing to watch the last desperate twitches of this dying technophobic breed who thought they were the last generation with a true identity. Pity the fools 20 years from now when technological acceleration explodes and they finally realize that all their efforts to control the future were futile ... it's a bit ironic since they did nothing but protest 'the system' that tried to control their lives ... i have to quote Shirley Bassey again : all just a little bit of hippies turning into nazis repeating
  
  --
  beware he who denies you access to information for in his mind, he already deems himself to be your master (SMAC-ish)
47. Re:Robots.txt by OrwellianLurker · 2010-03-05 19:10 · Score: 1
  
  True to your name :P How would one obtain and identify the IP blocks if the bots don't identify themselves?
  You identify the crawlers you want, such as Google's, and that leaves you with the crawlers you don't want. Whatever methods you have for dealing with those are up to you, but identifying them won't be overly difficult. However, they would be smart to just use search engines and target large sites.
  
  --
  'Political power grows out of the barrel of a gun.' - Mao Tse-tung
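One way to "identify the crawlers you want" without trusting the user agent string is a forward-confirmed reverse DNS check: look up the hostname for the client IP, confirm it ends in a domain the search engine publishes, then resolve that hostname back and confirm it maps to the same IP. A sketch follows; the suffix list is an assumption and should be checked against each search engine's own documentation, and the lookups shown cover IPv4 only.

```python
import socket

# Assumed hostname suffixes for crawlers the site wants to allow.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_crawler(ip, suffixes=TRUSTED_SUFFIXES,
                        reverse_lookup=socket.gethostbyaddr,
                        forward_lookup=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a client IP.

    The lookup functions are injectable so the logic can be exercised
    without network access.
    """
    try:
        hostname = reverse_lookup(ip)[0]
        if not hostname.lower().endswith(suffixes):
            return False
        # The hostname must resolve back to the original address.
        return ip in forward_lookup(hostname)[2]
    except OSError:
        # Covers failed lookups (socket.herror, socket.gaierror).
        return False
```

Anything that claims to be a major crawler but fails this check, or that crawls without claiming to be anything, lands in the "crawlers you don't want" pile.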
48. Re:Robots.txt by mdwh2 · 2010-03-06 02:52 · Score: 1
  
  Bot requests access to a public page. Server authorises access to that bot.
  Sorry, what does robots.txt have to do with anything?
49. Re:Robots.txt by Crudely_Indecent · 2010-03-06 03:27 · Score: 1
  
  HOW is a matter of collaboration and diligence. I often watch the logs roll by, looking for anything that doesn't seem right. It's fairly easy to distinguish a bot from a real user by the way their logs appear. A real user loads the page, then the CSS and JS files, then images. Bots load page after page without picking up the other linked files. Even if it doesn't identify itself as a bot, there are telltale signs.
  Once I've identified an anomalous visitor, I can make a quick visit to ARIN or APNIC or whatever numbering authority is responsible for the IP address in question and look up the owner, their assigned netblock and other associated netblocks.
  Again, collaboration is also a factor. Webmasterworld will certainly have a forum topic about this once there are real-world cases. All it takes is one good administrator to come up with a method for identification and the rest will expand and extend it.
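The log heuristic described here (real users fetch a page and then its CSS, JS, and images, while bots fetch page after page and nothing else) can be sketched in a few lines. This assumes the access log has already been parsed into (ip, path) pairs; the asset suffix list and the `min_pages` threshold are arbitrary choices for illustration.

```python
from collections import defaultdict

# File suffixes treated as page assets rather than pages (an assumption).
ASSET_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def likely_bots(requests, min_pages=20):
    """Return IPs that fetched at least min_pages pages and no assets at all.

    requests is an iterable of (ip, path) pairs taken from an access log.
    """
    pages = defaultdict(int)
    assets = defaultdict(int)
    for ip, path in requests:
        bare_path = path.split("?", 1)[0].lower()  # ignore query strings
        if bare_path.endswith(ASSET_SUFFIXES):
            assets[ip] += 1
        else:
            pages[ip] += 1
    return {ip for ip, count in pages.items()
            if count >= min_pages and assets.get(ip, 0) == 0}
```

A crawler that also downloads assets, or a human behind a caching proxy, will defeat this, so the output is a list of candidates to look up, not a verdict.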
  
  Perhaps pompous would be more fitting
  Not bad, but I like the idea of a name that is opposite, the antithesis of who I am. Haven't you ever met a fat guy called "Tiny" or a bald guy called "Curly?"
  Maybe you could find a new name for me that has a two word definition that's cooler than "Crudely_Indecent"?
  
  --
  
  "Lame" - Galaxar
50. Re:Robots.txt by nacturation · 2010-03-06 21:15 · Score: 1
  
  You do understand that those activities are illegal but requesting information from a webserver is not illegal, right?
  You do understand that the website operator/owner can revoke access to any part of the server on a whim.
  Now you're just trolling. That's not even a rational response to the question I asked.
  
  Yeah, like I said, if you write an app for a specific purpose you're responsible for the actions of that app.
  So if your web server sends information that crashes my browser, will you accept responsibility for that? After all, you're responsible for the actions of the web server you choose to run.
  
  --
  Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
51. Re:Robots.txt by Joe+U · 2010-03-07 10:16 · Score: 1
  
  So if your web server sends information that crashes my browser, will you accept responsibility for that? After all, you're responsible for the actions of the web server you choose to run
  Sure. If I deliberately code my web server to overload your computer and crash your browser by deliberately ignoring settings that your browser sent to me via a published protocol that we both agreed on, then yes, I'm responsible for crashing your browser.
  Here's a real world example: I run a shop, you visit the shop to get a free flyer. I have a note that says please don't go behind the counter to get or read flyers. You ignore that and go in the back and start looking around. I tell you to stop, but you continue to rummage around looking for flyers. I have you arrested for trespassing.
  You visit my server to get a free webpage with a special browser. I have a file called robots.txt and a notice (tos.html) linked to on every page in the site that says, you can request this page, but don't go visiting pages in this directory. You ignore that and go and index all the files, including that 600GB file I had lying around, and you eat up all my bandwidth for the month. I have you arrested for computer trespassing.
  You're a user on my site. You annoy everyone on the site and cause people to leave. I tell you to take a hike and ban your account. You sign up for a new account against the terms of service and annoy everyone on the site again. I sue you for disrupting my service and according to precedent that I cited in my original post, you're not only going to lose that case, but there's a good chance you can see some jail time if I convince the DA to press criminal charges.
  Get it?
52. Re:Robots.txt by nacturation · 2010-03-07 12:52 · Score: 1
  
  You visit my server to get a free webpage with a special browser. I have a file called robots.txt and a notice (tos.html) linked to on every page in the site that says, you can request this page, but don't go visiting pages in this directory. You ignore that and go and index all the files, including that 600GB file I had lying around, and you eat up all my bandwidth for the month. I have you arrested for computer trespassing.
  The thrust of your matter will revolve around whether you can prove to a court that robots.txt, something which has not been codified into any law, establishes a binding contract between a bot and a site owner. Every court case I'm aware of which has attempted to show this for a bot has failed as it doesn't meet the three criteria for a contract: offer, consideration, and acceptance. The same applies for your terms of service, something which isn't in a computer-readable format like robots.txt. If robots.txt has never been shown to establish a contract, then an arbitrarily linked file would similarly fail.
  Again, this is all about bots. The exception you noted for a human doesn't apply as the human is capable of receiving an offer, considering that offer, then accepting it and therefore meets the three conditions for a binding contract.
  
  You're a user on my site. You annoy everyone on the site and cause people to leave. I tell you to take a hike and ban your account. You sign up for a new account against the terms of service and annoy everyone on the site again. I sue you for disrupting my service and according to precedent that I cited in my original post, you're not only going to lose that case, but there's a good chance you can see some jail time if I convince the DA to press criminal charges.
  Get it?
  As it applies to a human? Makes total sense as to sign up for an account, there is offer, consideration, and acceptance. I would fully support your right to sue and/or press for criminal charges. And if someone wrote a bot which maliciously signed up for accounts, detected when it was banned, and signed up for a new account, then that makes sense as well and you would be right to sue and/or press for criminal charges against the author of said malicious bot.
  For a well-behaving bot whose sole purpose is to request a page once in order to check for copyrighted content? Sorry, I'm not aware of any legal precedent which would apply here. Google has lost lawsuits for copyright infringement for displaying cached versions of spidered pages, but there was no finding against Google with relation to requesting the content in the first place. I'll believe you when you can demonstrate a court case which applies, but so far your argument by human examples fails to convince me that this legal theory would apply.
  
  --
  Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
53. Re:Robots.txt by Joe+U · 2010-03-07 13:31 · Score: 1
  
  Did you skip over my original post?
  There was a recent court ruling that says terms of service for a website can be enforced.
  I asked, does that make robots.txt part of the terms? Can it be added to the terms? If you specifically say 'no bots' in the terms can you sue someone for using a bot?
  You responded that bots can't enter into contracts. Well, that's great, except a bot is just an application that a person wrote that follows instructions given to them by that person.
  Here's the logic:
  If the bot is breaking the terms of use of the site, and courts say that terms are enforceable (even if you don't read them), then bot actions are enforceable.
  Here's a bot:
  I have a program that runs on my network, it hides licence agreements during software installation and clicks ok/install. Since I never saw the agreement, I'm not bound by the license? After all, the app did it for me, all I did was set it loose on my personal network. "There might have been an agreement, but I wrote the application to ignore it, not read it, and continue anyway" will not go over well in court.
54. Re:Robots.txt by nacturation · 2010-03-07 15:03 · Score: 1
  
  Did you skip over my original post? There was a recent court ruling that says terms of service for a website can be enforced.
  Did you skip over my reply? I'll repeat it:
  The exception you noted [in the recent court ruling you provided a link for] for a human doesn't apply as the human is capable of receiving an offer, considering that offer, then accepting it and therefore meets the three conditions for a binding contract.
  
  --
  Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
55. Re:Robots.txt by Joe+U · 2010-03-07 15:38 · Score: 1
  
  Yes yes, I know, the bot can't enter into a contract. I suggest you patent your contract avoidance system. You just write an app to ignore the agreement and yay! no agreement because the app did it.
  Of course your application is downloading and analysing material for your own personal gain. But yes, I believe judges will look favorably on your 'My proxy bot hid it so it doesn't apply to me' legal theory.
  No matter what I say from this point, you're going to continue with the idea that you can have a program act on your behalf and avoid agreements, so fine, you keep thinking that, now go post another comment so you can have the last word.
56. Re:Robots.txt by nacturation · 2010-03-07 18:17 · Score: 1
  
  What a strange little man you are.
  
  --
  Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
57. Re:Robots.txt by nacturation · 2010-03-08 14:26 · Score: 1
  
  My other comment was uncalled for, so my apologies.
  
  I suggest you patent your contract avoidance system. You just write an app to ignore the agreement and yay! no agreement because the app did it.
  I just find it interesting that you seem to think that Google's bot, for example, has entered into millions of agreements because of the terms of service on the millions of sites it has indexed. If you don't think that Google has entered into millions of agreements because of its bot, do you think they are operating under a theory of a "contract avoidance system" as you claim? Same goes for Microsoft, and every other search engine which indexes content on the web. If your theory is right, you may have hit the jackpot and you should be starting class actions against these companies... no doubt they have violated some provisions of all these millions of terms of service they have entered into contractual agreements under.
  
  Of course your application is downloading and analysing material for your own personal gain.
  Check it out: http://www.google.com/search?q=%22terms+of+service%22 There are 438,000,000 instances of "terms of service" that Google has indexed. Google has downloaded, analyzed, and stored material from those sites for their own personal gain (selling advertising against the search results). By your claims, Google has entered into agreements on every one of these.
  
  --
  Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
DMCA.. by ltning · 2010-03-05 02:42 · Score: 2, Interesting

What on earth is the DMCA supposed to achieve, in the context of Ad-providers?
Sounds pretty scary to me.

--
Love over Gold.
1. Re:DMCA.. by Yvan256 · 2010-03-05 03:18 · Score: 1
  
  In this case they're referring to the Downloadable Media Computer Advertising.
2. Re:DMCA.. by julesh · 2010-03-05 04:09 · Score: 1
  
  What on earth is the DMCA supposed to achieve, in the context of Ad-providers?
  Sounds pretty scary to me.
  Agreed. I've never heard of this, and a quick scan of the legislation doesn't turn up anything that appears to relate to this; the categories of service it regulates appear to be (a) telecoms providers transmitting data at user request, (b) those hosting temporary copies of content (e.g. caches), (c) those hosting content at the request of third parties, and (d) search engines, directories and other link collections. I see no suggestion in the text of the legislation that it applies to people providing additional content that is aggregated on the same page, or in any other way I can think of that would catch out ad networks.
  Also, there are plenty of non-US ad networks and affiliate sites, many of whom probably don't give a fuck about DMCA notifications even if they do apply.
Re:i'm a little clueless here by Tim+C · 2010-03-05 02:53 · Score: 4, Insightful

This one.
On the other hand, that's an utterly asinine comment to have made (the one you quote, not yours). Of course they'll ignore it, why on Earth wouldn't they? It is in no way binding, and robots are free to ignore it, just as site owners are free to block connections from specific incoming IP addresses, the owners of those IPs are free to switch to new ones, and so on, ad infinitum.
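To illustrate the point that robots.txt is only advisory, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents and the bot name `ExampleCrawler` are made up for the example. The function only reports what a well-behaved crawler would conclude; nothing in it stops a crawler that never asks.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; "ExampleCrawler" is an invented bot name.
ROBOTS_TXT = """\
User-agent: ExampleCrawler
Disallow: /

User-agent: *
Disallow: /private/
"""


def polite_fetch_allowed(user_agent: str, url: str) -> bool:
    """Return what a crawler that honors robots.txt would decide for this URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(polite_fetch_allowed("ExampleCrawler", "http://example.com/index.html"))
    print(polite_fetch_allowed("OtherBot", "http://example.com/index.html"))
```

A crawler that skips this check gets the page anyway, which is why the file works as a convention rather than as access control.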

--
It's official. Most of you are morons.
Lesson learned from RIAA by KnownIssues · 2010-03-05 02:54 · Score: 4, Insightful

Sounds like they've learned their lesson from the RIAA. I'm not saying I agree with them and think they are right to do this. But, if you're going to try to enforce your interpretation of the law, this is at least a sane philosophy of doing so. Not going after damages is a smart move.
Will that ultimately include slashdot? by elrous0 · 2010-03-05 02:54 · Score: 5, Interesting

A lot of aggregator sites like this one base a lot of their topical content on articles printed elsewhere. While most (incl. /.) don't print whole articles intact, a lot of them do quote heavily (what used to be called "fair use," back when that phrase actually meant anything). So their first step is to go after the sites that reprint the articles whole-cloth. But will they stop there?

--
SJW: Someone who has run out of real oppression, and has to fake it.
1. Re:Will that ultimately include slashdot? by c-reus · 2010-03-05 03:15 · Score: 1
  
  Initially targeting violators who use large numbers of intact articles
  (emphasis mine)
  No, they will not stop there.
2. Re:Will that ultimately include slashdot? by Yvan256 · 2010-03-05 03:20 · Score: 2, Funny
  
  And will Slashdot be targeted again and again? (you know... all the dupes)
3. Re:Will that ultimately include slashdot? by MtHuurne · 2010-03-05 03:30 · Score: 3, Insightful
  
  Unless an article is very short, quoting 80% of it is not fair use. So for now, I think they have every right to take steps against sites making money from their content without compensation.
  Yes, I am cynical enough to expect the reasonable 80% limit to be lowered over time until it reaches unreasonable levels. But let's hold the flames until they have actually crossed that line.
4. Re:Will that ultimately include slashdot? by Abcd1234 · 2010-03-05 03:43 · Score: 2, Insightful
  
  Since when did Slashdot ever use 80% of an article verbatim?
  Sorry, no, any website doing *that* should be shut down. I hate those assholes. They're the reason why a search for a given term in Google pops up thousands of sites with the *exact same content*, just ripped from one another.
5. Re:Will that ultimately include slashdot? by elrous0 · 2010-03-05 03:57 · Score: 1
  
  By the time they actually cross that last line, I suspect it will be too late.
  
  --
  SJW: Someone who has run out of real oppression, and has to fake it.
6. Re:Will that ultimately include slashdot? by elrous0 · 2010-03-05 04:04 · Score: 1
  
  Yes, but 80% is where they're *starting*. I'm asking if that's where they're going to *end* it.
  
  --
  SJW: Someone who has run out of real oppression, and has to fake it.
7. Re:Will that ultimately include slashdot? by Abcd1234 · 2010-03-05 04:14 · Score: 1
  
  Well, given the fair use doctrine still exists, there will always be a lower bound at which their legal actions will no longer have any basis.
8. Re:Will that ultimately include slashdot? by MtHuurne · 2010-03-05 04:27 · Score: 1
  
  In my opinion preemptive protests against valid copyright enforcement only weaken the argument against copyright abuse.
9. Re:Will that ultimately include slashdot? by cpt+kangarooski · 2010-03-05 04:44 · Score: 1
  
  Unless an article is very short, quoting 80% of it is not fair use.
  Well, that depends on the circumstances. The amount and substantiality of the portion of the work used is a factor in determining if the use is fair, but there isn't a hard number.
  
  --
  -- This and all my posts are in the public domain. I am a lawyer. I am not your lawyer, and this is not legal advice.
10. Re:Will that ultimately include slashdot? by natehoy · 2010-03-05 04:56 · Score: 3, Insightful
  
  80% is a reasonable starting point. If they start lowering it, we'll have to express our righteous indignation then. Fair use, when interpreted, is generally considered a LOT lower than routinely cutting-and-pasting 80% of articles, so they have a long way to lower it before we can honestly call our indignation righteous.
  Seriously, this really isn't a "slippery slope" situation. It seems to be a well-thought-out and sane set of guidelines. If anything, they are being a bit generous for now, and they can still tighten this quite a bit without coming close to busting "fair use" or even "reasonable use".
  Basically they are saying, "if you routinely use 80%+ of our articles as your own content, we're asking you to stop. We won't sue you for any past uses, we just want to make it clear that this isn't cool any more."
  A fair usage (note the lack of quotes, I am not talking about a legal doctrine) would be to use about 20% of the source article (properly attributed) with a link back to the original article. Give credit where it's due (and cite your sources). Then add your own thoughts, or don't. But don't take whole-cloth articles and post them on your own site with your own ads.
  Every discussion board I've ever participated in has pretty much recommended some really close variant to this anyway. It usually reads something like "cite a paragraph or two at most and have a link to the source article plainly visible nearby".
  
  --
  "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
11. Re:Will that ultimately include slashdot? by DragonWriter · 2010-03-05 05:50 · Score: 1
  
  Yes, but 80% is where they're *starting*. I'm asking if that's where they're going to *end* it.
  They're going to end it at the point where both of the following occur frequently enough to make the legal strategy a net cost:
  1) Intimidation fails, the target issues counter-notices, and a court battle ensues, and
  2) The court rules that the use is fair use.
12. Re:Will that ultimately include slashdot? by elrous0 · 2010-03-05 07:14 · Score: 1
  
  Well, that's assuming that:
  a) The person running the website has the money to defend themselves in court
  and
  b) The increasingly conservative courts interpret "fair use" as liberally as we do
  Those are two pretty big "if's."
  
  --
  SJW: Someone who has run out of real oppression, and has to fake it.
Offshore sites WILL be immune by unity100 · 2010-03-05 02:56 · Score: 1, Interesting

all this harassment is going to do will be to push the global small internet publishers to services in other countries. Datacenters, Ad services in u.s. will lose customers. There are already strong companies servicing in those areas in Eu. Eu will be happy to receive that amount of business.
the stupor of american corporatism is overwhelming. they can even go to the extent of shooting themselves in the foot.

--
Read radical news here
1. Re: Offshore sites WILL be immune by Jaysyn · 2010-03-05 02:58 · Score: 1
  
  All we can do is sit back & watch the fireworks.
  
  --
  There is a war going on for your mind.
2. Re: Offshore sites WILL be immune by Sockatume · 2010-03-05 03:03 · Score: 3, Insightful
  
  Are you kidding? ACTA's going to harmonise everything so closely to the US that they'll be able to prosecute anyone.
  
  --
  No kidding!!! What do you say at this point?
3. Re: Offshore sites WILL be immune by cpghost · 2010-03-05 03:06 · Score: 2, Insightful
  
  Yes, but ACTA is not the whole world.
  
  --
  cpghost at Cordula's Web.
4. Re: Offshore sites WILL be immune by daveime · 2010-03-05 03:43 · Score: 1
  
  When did that fact ever stop the US doing whatever the hell it wants ?
5. Re: Offshore sites WILL be immune by bickerdyke · 2010-03-05 03:59 · Score: 1
  
  No. Just the part of the world that's interested in doing any business with any other part of the world.
  Yes, It may not be North Korea.
  
  --
  bickerdyke
6. Re: Offshore sites WILL be immune by cpghost · 2010-03-05 04:11 · Score: 2, Informative
  
  According to this, only Australia, Canada, USA, EU, Japan, South Korea, Mexico, Morocco, New Zealand, Singapore and Switzerland are currently part of that treaty. This (currently) leaves more than enough room for a whole lot of other countries (some of them as big as Russia and China) that are not part of it.
  
  --
  cpghost at Cordula's Web.
7. Re: Offshore sites WILL be immune by julesh · 2010-03-05 04:14 · Score: 1
  
  Are you kidding? ACTA's going to harmonise everything so closely to the US that they'll be able to prosecute anyone.
  If you think Vanuatu et al are going to be signing up to ACTA, then I want some of what you're smoking.
  Sure, most of the large economies will probably be signing, but there's no reason not to base an Internet business on a little island somewhere nice with friendly laws (and, as a nice side benefit, zero taxation).
8. Re: Offshore sites WILL be immune by natehoy · 2010-03-05 05:06 · Score: 1
  
  I have to ask.. what kind of small internet publisher do you think would be hurt, and why?
  If the article is yours, then you can post 100% of it and claim it as your own. No one disputes this. This coalition is not going after work they don't own, and if they do you have valid grounds to sue them 'till it hurts.
  If the article belongs to someone else, and you as a publisher are routinely taking more than 80% of someone else's articles and using them to generate your own revenue, you really need to reconsider your business model.
  Unless I'm fundamentally misunderstanding what this new coalition is about to do, I see it as a reasoned and even somewhat generous response to a real problem. We're not talking about Slashdot citations or RSS summaries, because those rarely contain the entire article. We're talking about wholesale copying of entire articles off the sites that paid to have them written.
  
  --
  "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
9. Re: Offshore sites WILL be immune by petermgreen · 2010-03-05 06:02 · Score: 1
  
  It's an option, the thing is most people doing this are in it to make money. That means they have to have some kind of financial transactions with the rich world whether to pay for access directly or (more likely) to pay for advertisements.
  Just like with online gambling I would expect the pressure to be applied to the companies who help the profits flow to the offshore companies performing actions that are legal (or at least impractical to prosecute) in the offshore companies' jurisdiction but illegal in the customers' jurisdiction.
  Remember the big advert networks are what enable these sites to make money by placing customers' adverts on sites that the customer would be unlikely to associate with by choice.
  
  --
  note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
10. Re: Offshore sites WILL be immune by unity100 · 2010-03-05 06:34 · Score: 1
  
  you very well know that they wont leave it at that. like, remember how ioc claimed rights on a sportswoman's name. like, some organization claiming rights to a photo of a monument, randomly taken by some visitor. the 'rights' concept in america is so out of control and corrupted that people are able to claim rights to even fundamental thoughts and concepts.
  anyone can be hurt. someone posting some photo of something that a person or company thinks that they 'own' anything related to it can take an entire forum down, with this thing and acta (if).
  
  --
  Read radical news here
11. Re: Offshore sites WILL be immune by natehoy · 2010-03-05 07:25 · Score: 1
  
  IF and when that happens, I'll be on the front lines with you, Brother.
  However, this isn't what this group is looking for. If you routinely take 80% or more of entire articles, copy them, and put them on your own web site, you really are profiting from something that isn't yours to profit from.
  If, on the other hand, you are routinely taking excerpts from sites, typing up your own thoughts, and posting a link to the articles you are using as source material, there really isn't anything to worry about from this specific group.
  I'm not saying that abuses of DMCA copyright takedowns do not happen, or that this consortium represents all that is good and holy and bright. But this movement is UNDERenforcing rights they actually have.
  The excessive length and reach of copyright, the draconian silliness of DRM, the constant abuse of DMCA, the erosion of personal backup rights, I'm with you on all of that. We need to shorten copyright back to 20 years or less. We need to reinstitute real fair-use provisions.
  This isn't eroding any of that. This is going after the real abusers. The ones that the rest of the IP industry is always citing in their ongoing quest to erode our rights.
  
  --
  "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
12. Re: Offshore sites WILL be immune by Ungrounded+Lightning · 2010-03-05 07:55 · Score: 1
  
  Are you kidding? ACTA's going to harmonise everything so closely to the US that they'll be able to prosecute anyone.
  Only among countries that both sign and implement it.
  If there's big bucks to be made by providing a safe haven, some small and possibly impoverished country will likely do so - either explicitly or by giving lip service while doing little to enforce.
  (Example: Nigeria and the 419 con men.)
  
  --
  Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
13. Re: Offshore sites WILL be immune by unity100 · 2010-03-05 09:33 · Score: 1
  
  now the thing is, as you concede to these people, they just keep pushing to their end. that's why im saying that it wont stop there. to stop the trend we need to push back, so we can reach an equilibrium.
  considering how badly the equilibrium is leaning to the draconian side, i think that we shouldnt condone anything until we get to a reasonable point.
  
  --
  Read radical news here
14. Re: Offshore sites WILL be immune by Compaqt · 2010-03-05 18:49 · Score: 1
  
  The problem with this is that when the US wanted to ban offshore gambling sites, they did so not directly but through the backdoor by prohibiting credit card transactions to such offshore sites.
  Whether it gets to that point or not is anybody's guess, but don't underestimate the US Government's penchant for trying to control anything and everything inside or outside its borders.
  
  --
  I'm not a lawyer, but I play one on the Internet. Blog
15. Re: Offshore sites WILL be immune by unity100 · 2010-03-07 14:43 · Score: 1
  
  well, chinese payment providers stepped in at that point. even worse for u.s., since even despite wto, no one in the world can do shit against china in any respect.
  
  --
  Read radical news here
Please do so by OzPeter · 2010-03-05 02:59 · Score: 5, Insightful

And in the process take down all those inane blogs whose sole purpose is to scrape and repost articles so they get an advertising hit.

--
I am Slashdot. Are you Slashdot as well?
1. Re:Please do so by Anonymous Coward · 2010-03-05 03:13 · Score: 3, Insightful
  
  While they're at it, can they take down forum/mailinglist mirrors too?
  It is extremely annoying when searching to find that the top 30 results all contain the exact same forum or blog post.
2. Re:Please do so by garcia · 2010-03-05 03:22 · Score: 4, Insightful
  
  And in the process find all the commercial sites using my copyrighted Flickr photos for their own purposes without my permission or payment. I'm tired of sending invoices and dealing with companies who tell you that your photo wasn't worth the $300 you charge and instead send you $50 thinking that it will clear up the matter.
  I love the hypocrisy of all of this. They are just as much at fault as any of those aggregation blogs. They just have more money to be a pain in the ass.
3. Re:Please do so by clone53421 · 2010-03-05 04:19 · Score: 2, Interesting
  
  I'm tired of sending invoices and dealing with companies who tell you that your photo wasn't worth the $300 you charge and instead send you $50 thinking that it will clear up the matter.
  They’re basically giving you the finger. Don’t fuck around playing their little games... show them you mean business. Slap on a surcharge to cover your additional expense and send their name and remaining balance to a debt collector. It’s probably cheaper and less of a hassle than suing them in small claims court.
  IANAL... you may want to ask a real lawyer what your options are, but seems to me you have a few.
  
  --
  Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
4. Re:Please do so by garcia · 2010-03-05 04:24 · Score: 1
  
  It's easier to make an ass out of them on the Internet. Twitter is an effective tool (especially with Google indexing it in real time) in the fight against these assholes.
  I eventually did get paid by a newspaper (including late charges) after three months. I have not been so successful with other businesses using my images in their marketing materials w/o my permission.
  Oh and debt collection (when it's $300) isn't worth my time--neither is small claims.
5. Re:Please do so by clone53421 · 2010-03-05 04:39 · Score: 1
  
  Eh, I’ve been sent to debt collection for $50 doctor visits that the bill for got lost...
  I mean, after the initial reaction (seriously?), I called them up and paid it. But yeah... seriously?
  In any case I’d think $300 is significant enough to justify going after them with some heavier ammo than just bad rep on Twitter.
  
  --
  Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
6. Re:Please do so by MartinSchou · 2010-03-05 05:15 · Score: 1
  
  Your pictures, your copyright. And we've already seen that they are responsible for massive damages for each infringement. And unlike P2P it's very easy to tell just how many times your copyrighted item has been distributed to others.
  Just imagine how much fun it'll be to send a second letter to them, pointing out that the 300 dollars you originally charged as a settlement has now been changed to a much more reasonable 1,000 dollars per infringement.
7. Re:Please do so by GooberToo · 2010-03-05 05:35 · Score: 1
  
  Collectors typically work two different ways. Sometimes they simply buy the debt at something like 25% - 50%, and often purchased in bulk. In this case, anything they collect over the % they paid for the debt is gross income. The second way is to collect and charge a percent; typically 50% or so. IMHO, most people are familiar with the latter of the two.
  So if there is a $300 debt, even after they take their 50%, that's $150 for both parties. Considering their labor is generally pretty cheap, that pays for A LOT of phone calls and even a modest amount of legal work (lien, etc). It's simply not likely a collector is going to pass on a $300 debt.
8. Re:Please do so by Ungrounded+Lightning · 2010-03-05 08:00 · Score: 1
  
  I'm tired of sending invoices and dealing with companies who tell you that your photo wasn't worth the $300 you charge and instead send you $50 thinking that it will clear up the matter.
  What fools.
  By paying the $50 they've admitted they know they owe you SOMETHING.
  
  --
  Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
9. Re:Please do so by lwsimon · 2010-03-05 08:27 · Score: 1
  
  I run one of those news aggregator sites. It is very small, and its purpose is to stockpile news articles on a topic in one place, so they don't disappear and drop off the net. Yes, I've been copying articles verbatim, though at the very least I link to the host site first and foremost, and quote the article so that it is clear that I did not write the content.
  As for ads, I have one Adsense block on there, and have made $8 in the 3 months the site has been up.
  How do you propose balancing legitimate archival with the needs of the copyright owners? I think I have an equitable scheme in place, though I realize it may or may not be a legal one.
  
  --
  Learn about Photography Basics.
Not So Good for the Economy by lobiusmoop · 2010-03-05 03:00 · Score: 1

"Offshore sites will not be immune from the crackdown: almost all of them depend on banner ads served by US-based services, and the DMCA requires the ad service to act against any violator. "
Not sure this is such a great idea - when you're broke you don't starve off the little income you're still getting... I'm inclined to think that in the near future, things will more likely go in the opposite direction, grey-legal stuff will be fully legalized to provide as much extra economic stimulus as possible.

--
"I bless every day that I continue to live, for every day is pure profit."
1. Re:Not So Good for the Economy by Duane13 · 2010-03-05 03:35 · Score: 1
  
  "Offshore sites will not be immune from the crackdown: almost all of them depend on banner ads served by US-based services, and the DMCA requires the ad service to act against any violator. "
  Not sure this is such a great idea - when you're broke you don't starve off the little income you're still getting... I'm inclined to think that in the near future, things will more likely go in the opposite direction, grey-legal stuff will be fully legalized to provide as much extra economic stimulus as possible.
  I for one hope the economy isn't based off of people scraping legitimate websites to make money for themselves. How about the people who are barely getting by because of "gray area" websites get off their lazy butts and do their own legitimate work instead of stealing others'.
  
  Theft of another's work should not be rewarded, quite the opposite.
  
  What Attributor are doing is not even evil, they are not looking for past damages, just protecting their future interests.
Re:i'm a little clueless here by KingSkippus · 2010-03-05 03:01 · Score: 4, Interesting

The Robots exclusion standard. Not that it will stop them; as others have pointed out, if they think they're "doing the right thing," I'm sure they will not be concerned about such a standard.
The worry here really isn't so much for the people who are hosting sites with infringing content. I'm sure a moral argument could be made that Attributor is well within its rights to disregard the wishes of those who are breaking copyright law. However, I run several sites that have no infringing content whatsoever, sites with content that, while not private, I don't particularly want spiders crawling. I'm not so naive as to think that they don't do it anyway; I have server logs proving that they do. However, in this case, we have a company that claims to be legitimate completely ignoring the wishes of someone who is not infringing.
Put another way, by convention, my neighbors don't use binoculars to peer into my house windows to see what I'm doing although there's currently not really anything stopping them from doing so. Even though I don't particularly have anything to hide, if I find that they are violating our polite social contract, then I'll put up shades just because it's none of their damn business.
I don't think that the robots.txt convention will be the thing that stops Attributor. I think that it won't take long for web site authors to figure out which user agents, IP addresses, etc. Attributor is using, and they will block access from Attributor to their sites. Like I said, I have no infringing content on my sites, but if Attributor is going to ignore me politely asking their robots not to scan my sites, then I'm fully in the right to take further steps to forcibly prevent them from doing so.
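The "further steps" could look roughly like the sketch below: a check a site might run on each request before serving it. The user-agent substring and the address range are placeholders (192.0.2.0/24 is a reserved documentation range); the thread does not say what agent string or addresses Attributor's crawler actually uses.

```python
import ipaddress

# Hypothetical values for illustration only.
BLOCKED_AGENT_SUBSTRINGS = ("examplecrawler",)
BLOCKED_NETWORKS = (ipaddress.ip_network("192.0.2.0/24"),)


def should_block(user_agent: str, remote_ip: str) -> bool:
    """Return True if the request matches a blocked agent string or address range."""
    agent = user_agent.lower()
    if any(fragment in agent for fragment in BLOCKED_AGENT_SUBSTRINGS):
        return True
    address = ipaddress.ip_address(remote_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    print(should_block("ExampleCrawler/1.0", "203.0.113.5"))
    print(should_block("Mozilla/5.0", "192.0.2.77"))
```

Both signals are easy to change on the crawler's side (a new agent string, a new address), so a list like this needs ongoing upkeep.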
The 80 percent mark by tepples · 2010-03-05 03:01 · Score: 2, Insightful

Slashdot for it's copy-pasted copies
News publishers using Attributor probably won't attack Slashdot for excerpting one paragraph from a ten-paragraph story any time soon. From the summary:

the first offending sites to be targeted will be those using 80% or more of copyrighted stories
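For a feel of what "80% or more of copyrighted stories more than 10 times per month" might mean in practice, here is a rough sketch. It is an assumption for illustration, not Attributor's actual method, which neither the summary nor the article describes: it measures what fraction of an original article's five-word runs also appear in a candidate page, then counts how many articles in a month cross the threshold.

```python
def shingles(text: str, size: int = 5) -> set:
    """Split text into overlapping runs of `size` lowercase words."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def fraction_copied(original: str, candidate: str, size: int = 5) -> float:
    """Fraction of the original's word runs that also appear in the candidate."""
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(candidate, size)) / len(source)


def site_is_targeted(fractions, threshold: float = 0.8, max_per_month: int = 10) -> bool:
    """True if more than `max_per_month` articles were copied at or above `threshold`."""
    heavy_copies = sum(1 for fraction in fractions if fraction >= threshold)
    return heavy_copies > max_per_month


if __name__ == "__main__":
    article = " ".join(f"word{i}" for i in range(50))
    print(fraction_copied(article, article))
    print(site_is_targeted([0.9] * 11))
```

Under this reading, a site that excerpts one paragraph from a ten-paragraph story stays far below the threshold, whatever the exact matching technique is.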
1. Re:The 80 percent mark by HungryHobo · 2010-03-05 03:04 · Score: 1
  
  I'm fairly sure they quote the entirety of very small articles every now and then.
  more than a few times a month? absolutely!
  I'm curious if they're going to start hitting forums when people do the "hey look at this guys" quote of a news article.
  It could really hurt a lot of free forums.
2. Re:The 80 percent mark by Jaysyn · 2010-03-05 03:08 · Score: 1
  
  A lot of forums require credentials to view & have systems in place to keep automated accounts from being generated.
  
  --
  There is a war going on for your mind.
3. Re:The 80 percent mark by tomhudson · 2010-03-05 08:47 · Score: 1
  
  Offshore sites will not be immune from the crackdown: almost all of them depend on banner ads served by US-based services, and the DMCA requires the ad service to act against any violator.
  Attributor says it can interdict the revenue lifeline at any offending site in the world."
  
  [X] My web site doesn't have ads, you insensitive clod!
  ... though I do smell an opportunity here for non-US-based adservers. In Soviet Russia, ads serve YOU!
Re:i'm a little clueless here by Joe+U · 2010-03-05 03:02 · Score: 1, Interesting

Ok, here's an argument.
http://blog.internetcases.com/2010/01/05/browsewrap-website-terms-and-conditions-enforceable/
So, the terms of use of a website are binding, at least according to this court. If the terms spell out mandatory following of robots.txt, is robots.txt now binding?
Ad networks geotarget their ads by tepples · 2010-03-05 03:05 · Score: 1

As I understand it, advertisers targeting readers in the United States tend to choose ad networks that operate or at least have some sort of assets in the United States, not ad networks that operate in the European Union. Advertisers who target readers in the European Union probably will not want to pay to reach readers in the United States, especially for a product not available in the United States.
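A minimal sketch of the geotargeting idea, assuming a tiny hand-made table that maps address ranges to countries. The ranges are reserved documentation networks and the ad names are invented; a real ad network would rely on a full geolocation database rather than a table like this.

```python
import ipaddress

# Hypothetical lookup tables for illustration only.
COUNTRY_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "US",
    ipaddress.ip_network("198.51.100.0/24"): "DE",
}
ADS_BY_COUNTRY = {"US": "ad-for-us-readers", "DE": "ad-for-eu-readers"}


def pick_ad(remote_ip: str, default: str = "untargeted-ad") -> str:
    """Choose an ad based on the country the reader's address maps to."""
    address = ipaddress.ip_address(remote_ip)
    for network, country in COUNTRY_BY_NETWORK.items():
        if address in network:
            return ADS_BY_COUNTRY.get(country, default)
    return default


if __name__ == "__main__":
    print(pick_ad("192.0.2.10"))
    print(pick_ad("203.0.113.9"))
```

The lookup depends only on where the reader is, not on where the ad network is based.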
1. Re:Ad networks geotarget their ads by arthurpaliden · 2010-03-05 03:46 · Score: 1
  
  So .. when the ad is placed the customer selects the target country / region. Using IP addresses takes care of the rest. Yes there will be some misdirection but on the whole it works.
  
  --
  Undetectable Steganography? Yep, there's an app fo
2. Re:Ad networks geotarget their ads by tepples · 2010-03-05 03:57 · Score: 1
  
  So .. when the ad is placed the customer selects the target country / region.
  So I take it you're imagining an EU based ad network that deals with advertisers in foreign markets. But how would such an ad network efficiently deal with US advertisers while having zero assets in the US or in any other country with a takedown system remotely like that of the US?
3. Re:Ad networks geotarget their ads by arthurpaliden · 2010-03-05 04:26 · Score: 1
  
  You log on to the advertisers site and submit your ad along with your credit card number. And you do it from a country that is not subject to US or US like control. Remember with the internet you do not need a brick and mortar or even a flesh and blood presence anyway to do business.
  
  --
  Undetectable Steganography? Yep, there's an app fo
the article, for your convenience by mdemonic · 2010-03-05 03:05 · Score: 5, Funny

A coalition of traditional and digital publishers this month will launch the first-ever concerted crackdown on copyright pirates on the web, initially targeting violators who use large numbers of intact articles.
Details of the crackdown were provided by Jim Pitkow, the chief executive of Attributor, a Silicon Valley start-up that has been selected as the agent for several publishers who want to be compensated by websites that are using their content without paying licensing fees.
In a telephone interview yesterday, Pitkow declined to identify the individual publishers in his coalition, but said they include “about a dozen” organizations representing wire services, traditional print publishers and “top-tier blog networks.”
The first offending sites to be targeted will be those using 80% or more of copyrighted stories more than 10 times per month.
In the first stage of a multi-step process aimed at encouraging copyright compliance instead of punishing scofflaws, Pitkow said online publishers identified by his company will be sent a letter informing them of the violations and urging them to enter into license agreements with the publishers whose content appears on their sites.
If copyright pirates refuse to pay, Attributor will request the major search engines to remove offending pages from search results and will ask banner services to stop serving ads to pages containing unauthorized content. The search engines and ad services are required to immediately honor such requests by the federal Digital Millennium Copyright Act (DMCA).
If the above efforts fail, Attributor will ask hosting services to take down pirate sites. Because hosting services face legal liability under the DMCA if they do not comply, they will act quickly, said Pitkow.
“We are not going after past damages” from sites running unauthorized content, said Pitkow. The emphasis, he said, is “to engage with publishers to bring them into compliance” by getting them to agree to pay license fees to copyright holders in the future.
License fees, which are set by each of the individual organizations producing content, may range from token sums for a small publisher to several hundred dollars for yearlong rights to a piece from a major publisher, said Pitkow.
Attributor identifies copyright violators by scraping the web to find copyrighted content on unauthorized sites. A team of investigators will contact violators in an effort to bring them into compliance or, alternatively, begin taking action under DMCA.
click the link to read the last 21%
1. Re:the article, for your convenience by Anonymous Coward · 2010-03-05 04:00 · Score: 1, Insightful
  
  You bring up an interesting point. According to the Slashdot TOS you own that comment. Who should Attributor pursue?
2. Re:the article, for your convenience by clone53421 · 2010-03-05 04:44 · Score: 1
  
  Depends on who cooperates and to what extent.
  First, Slashdot (dear Slashdot, please delete the comment); second, him personally (dear Slashdot, please give us his IP); and finally, Slashdot’s ISP (dear ISP of Slashdot, Slashdot isn’t cooperating with us, please shut down their domain).
  
  --
  Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
3. Re:the article, for your convenience by natehoy · 2010-03-05 05:13 · Score: 3, Interesting
  
  No one.
  He posted the article, cited it as the original article (knowing there was a proper citation link above), and posted less than 80% of it. This is a completely legitimate use of the article as per Attributor's new rules. Two or three more words from the article would have made it an "80% rule" bust, but would still have been OK as long as he didn't make a habit of it. It's repeated use of more than 80% of source article text that Attributor wants to go after.
  Most discussion boards already limit direct citation to a paragraph or two, or approximately 20% of the article.
  So Attributor's 80% limit is making a clear statement that they are really only interested in pursuing people who make a routine habit of copying entire articles. And if the bulk of your content is coming from copying 100% of someone else's original news articles, you aren't exactly someone I want to waste my righteous indignation defending.
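The "80% or more, more than 10 times per month" rule can be sketched roughly as below. This is only an illustration under stated assumptions: nothing in the thread or the article says how Attributor actually measures overlap, so the word-level `difflib` matching and the helper names `copied_fraction` and `is_targeted` are hypothetical.

```python
from difflib import SequenceMatcher


def copied_fraction(source: str, candidate: str) -> float:
    """Fraction of the source article's words that reappear, in order, in the candidate page.

    Assumption: a simple word-level comparison, not Attributor's real method.
    """
    src = source.lower().split()
    cand = candidate.lower().split()
    if not src:
        return 0.0
    matcher = SequenceMatcher(None, src, cand, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(src)


def is_targeted(fractions_this_month, threshold=0.80, max_hits=10):
    """True if a site used >= threshold of a story more than max_hits times in a month."""
    hits = sum(1 for fraction in fractions_this_month if fraction >= threshold)
    return hits > max_hits
```

Under this reading, a single post at 79% never counts, and even posts at 100% only matter once there are more than ten of them in the month.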
  
  --
  "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
4. Re:the article, for your convenience by noidentity · 2010-03-05 05:13 · Score: 2, Insightful
  
  Click the link to read the first 21%
  The first offending sites to be targeted will be those using 80% or more of copyrighted stories more than 10 times per month.
  In the first stage of a multi-step process aimed at encouraging copyright compliance instead of punishing scofflaws, Pitkow said online publishers identified by his company will be sent a letter informing them of the violations and urging them to enter into license agreements with the publishers whose content appears on their sites.
  If copyright pirates refuse to pay, Attributor will request the major search engines to remove offending pages from search results and will ask banner services to stop serving ads to pages containing unauthorized content. The search engines and ad services are required to immediately honor such requests by the federal Digital Millennium Copyright Act (DMCA).
  If the above efforts fail, Attributor will ask hosting services to take down pirate sites. Because hosting services face legal liability under the DMCA if they do not comply, they will act quickly, said Pitkow.
  "We are not going after past damages" from sites running unauthorized content, said Pitkow. The emphasis, he said, is "to engage with publishers to bring them into compliance" by getting them to agree to pay license fees to copyright holders in the future.
  License fees, which are set by each of the individual organizations producing content, may range from token sums for a small publisher to several hundred dollars for yearlong rights to a piece from a major publisher, said Pitkow.
  Attributor identifies copyright violators by scraping the web to find copyrighted content on unauthorized sites. A team of investigators will contact violators in an effort to bring them into compliance or, alternatively, begin taking action under DMCA.
  Offshore sites will not be immune from the crackdown, said Pitkow, because almost all of them depend on banner ads served by U.S.-based services. Because the DMCA requires the ad service to act against any violator, Attributor says it can interdict the revenue lifeline at any offending site in the world.
  Attributor already has been engaged by several major book publishers to get unauthorized eBooks off unauthorized sites. "And we have 99% success rate," he said.
so instead- google or other cache by way2trivial · 2010-03-05 03:07 · Score: 1

easy enough to search Google's cache and bypass the robots.txt problem....
heck.. they SHOULD proclaim the spider name-- drum up a lot of information
and focus on sites that mention it in robots.txt to check from other sources

--
every day http://en.wikipedia.org/wiki/Special:Random
1. Re:so instead- google or other cache by clone53421 · 2010-03-05 04:23 · Score: 1
  
  Google cache doesn’t index pages that the robots.txt told it not to crawl...
  
  --
  Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
2. Re:so instead- google or other cache by way2trivial · 2010-03-05 15:45 · Score: 1
  
  You've missed the point.
  Sites that steal text articles for webvertising pay-per-clicks will NOT reject Google, but rather welcome it in.
  They would reject the company this article is about.
  So if they announce the name of their spider, any site that rejects them--
  the company doing the checking then goes and slams Google's cache for what would be denied them via robots.txt directly live.
  
  --
  every day http://en.wikipedia.org/wiki/Special:Random
3. Re:so instead- google or other cache by clone53421 · 2010-03-05 16:51 · Score: 1
  
  Yes, but there’s not much help in that, because they could just as easily not announce the name of their crawler.
  And automating queries to Google’s cache is against their TOS, which has a little more legal validity than robots.txt.
  
  --
  Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
Go back to dial-up? by tepples · 2010-03-05 03:07 · Score: 1

Sometimes I really wish we could just go back to the early 90's when big media thought the internet was a joke. We didn't need them then, and frankly I usually think we would be better off without them now.
Home Internet access in the early to mid 1990s was dial-up. Do you want to go back to that?
1. Re:Go back to dial-up? by grapeape · 2010-03-05 04:07 · Score: 1
  
  If that was the tradeoff needed to prevent the internet from becoming one big corporately guided tour pay as you go presentation of what they want us to see or do...sure. Actually by the mid 90's I was on ISDN.
2. Re:Go back to dial-up? by grapeape · 2010-03-05 04:31 · Score: 1
  
  That is exactly my point. I wasn't trying to troll, simply pointing out that the internet was supposed to be a great equalizer. Most media outlets have no desire to be part of the community; they want to be the community, and have gone out of their way to shut out anything that even resembles equality online. Linking has traditionally been the way that sites aggregate news, and many simply use RSS summaries provided by the original source. What 80% are they going after? 80% of the RSS summary would often mean that if you quote more than a sentence you're in violation.
  The comment about "licensing" going back to the smaller sites is legit too. How many times have you seen an article that appears on someone's blog get picked up by a major news source with no credit at all? I've seen it lifted word for word regularly.
3. Re:Go back to dial-up? by Hatta · 2010-03-05 05:19 · Score: 1
  
  I'd rather go back to the BBS days of the 1980s than the upcoming mega-corp controlled clusterfuck the internet is turning into.
  
  --
  Give me Classic Slashdot or give me death!
Re:i'm a little clueless here by Tim+C · 2010-03-05 03:13 · Score: 1

I think the key there is the visibility of the terms:

But in this case the court found the terms and conditions (including the forum selection clause) to be enforceable. In contrast to Specht, the ServiceMagic site did give immediately visible notice of the existence of the terms of the agreement.
If I write a robot to crawl a site looking for certain keywords (e.g. Metallica), I will not necessarily ever have had visibility of those terms.

--
It's official. Most of you are morons.
in other words by circletimessquare · 2010-03-05 03:14 · Score: 1

this is the beginning of an arms race

--
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
Re:i'm a little clueless here by wjousts · 2010-03-05 03:16 · Score: 1

Put another way, by convention, my neighbors don't use binoculars to peer into my house windows to see what I'm doing although there's currently not really anything stopping them from doing so.
Curtains?
Re:i'm a little clueless here by fuzzyfuzzyfungus · 2010-03-05 03:17 · Score: 3, Informative

Since, as you say, robots.txt will likely do nothing against them, the bigger question becomes "how do they plan to do their crawling?". Crawling from a well defined IP block, using software with user agent Attributor_copy_cop, will be laughably simple to block or present false noninfringing content to.

Spoofing the UA strings and (if necessary) some of the behavior of common web browsers is a simple software problem, so I assume that they'll do that (unless they are terminally incompetent). Out of curiosity, though, does anybody know how easy and cheap it would be (using legitimate methods, not botnet-style stuff) for such a commercial entity to obtain a reasonably large number of, ideally "residential looking", IPs that change fairly often? Do you just call Verizon and say "I want 500 residential DSL lines brought out to so-and-so location"? Would you obtain the services of one of the sleazy datacenter operators who cater to spammers and the like and know how to switch IP blocks frequently? Do you pay to have second lines installed at your employees' houses, with company scanner boxes attached?
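To make the user-agent point concrete, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URLs, and the `allowed` helper are made up for illustration, and `Attributor_copy_cop` is just the joke name from the comment above, not a known crawler string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named crawler, keep everyone else
# out of /private/ only.
ROBOTS_TXT = """\
User-agent: Attributor_copy_cop
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Ask the parsed robots.txt whether this user agent may fetch this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A crawler that announces itself honestly is refused everywhere, while the same crawler claiming to be `Mozilla/5.0` is treated like any other visitor. The check is also purely voluntary: it only has an effect if the crawler's author chooses to call it.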
Robots.txt is irrelevant by NevarMore · 2010-03-05 03:21 · Score: 1

If a site posts articles yet has them excluded by robots.txt doesn't that defeat the purpose of posting the article where it can be indexed and found?
In other words, if an article is posted but robots.txt says not to index it, that article isn't going to show up in a search. It's a bit like rebroadcasting an NFL game in a movie theatre with no one in the theatre to watch it.
Binoculars by Mateo_LeFou · 2010-03-05 03:23 · Score: 1

Prosser, in both his article and in the Restatement (Second) of Torts at 652A-652I, classifies four basic kinds of privacy rights:
1. unreasonable intrusion upon the seclusion of another, for example, physical invasion of a person's home (e.g., unwanted entry, looking into windows with binoculars or camera, tapping telephone), searching wallet or purse, repeated and persistent telephone calls, obtaining financial data (e.g., bank balance) without person's consent, etc.
http://www.rbs2.com/privacy.htm

--
My turnips listen for the soft cry of your love
oh wait.. woops by Mateo_LeFou · 2010-03-05 03:25 · Score: 1

pasted too soon
"Only the second of these four rights is widely accepted in the USA. In addition to these four pure privacy torts, a victim might recover under other torts, such as intentional infliction of emotional distress, assault, or trespass.
Unreasonable intrusion upon seclusion only applies to secret or surreptitious invasions of privacy. An open and notorious invasion of privacy would be public, not private, and the victim could then choose not to reveal private or confidential information. For example, recording of telephone conversations is not wrong if both participants are notified before speaking that the conversation is, or may be, recorded. There certainly are offensive events in public, but these are properly classified as assaults, not invasions of privacy."

--
My turnips listen for the soft cry of your love
Re:i'm a little clueless here by yourlord · 2010-03-05 03:29 · Score: 2, Informative

I welcome them to crawl my sites and ignore my robots.txt files. They won't get very far though. When my server detects that behavior it passes the IP to my firewall which adds it to the "drop these packets into a black hole" list.
I have quite a large table of IP addresses of idiots that violated robots.txt.
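One way the detection step above could work, sketched under assumptions: the `/trap/` honeypot path (listed as Disallow in robots.txt and linked nowhere a human would click), the Common Log Format access log, and the `offenders` helper are all hypothetical, since the comment does not say how the server actually spots violators. Handing the resulting addresses to a firewall is left out.

```python
import re

# Hypothetical honeypot prefix: disallowed in robots.txt, so only a crawler
# that ignores robots.txt should ever request it.
DISALLOWED_PREFIXES = ("/trap/",)

# Matches the start of a Common Log Format line: client IP, two ignored
# fields, a bracketed timestamp, then the quoted request method and path.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)'
)


def offenders(log_lines, disallowed=DISALLOWED_PREFIXES):
    """Return the set of client IPs that requested any disallowed path."""
    bad = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").startswith(disallowed):
            bad.add(match.group("ip"))
    return bad


# Made-up log lines using reserved documentation addresses.
SAMPLE_LOG = [
    '192.0.2.10 - - [05/Mar/2010:03:29:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '198.51.100.23 - - [05/Mar/2010:03:29:05 +0000] "GET /trap/page1.html HTTP/1.1" 200 128',
    '198.51.100.23 - - [05/Mar/2010:03:29:06 +0000] "GET /trap/page2.html HTTP/1.1" 200 128',
]
```

As the earlier comment about residential-looking IPs suggests, a blocklist like this only stops crawlers that keep coming back from the same addresses.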
Business Plan by DaveAtFraud · 2010-03-05 03:40 · Score: 1

1) Put up a file sharing site with lots of music and movie files.
2) Craft a robots.txt to keep out the RIAA and MPAA.
...
Profit!!!

Robots.txt is a convention that was never intended to restrict checking for illegal content. The idea behind robots.txt is only to keep site indexers such as Google, Yahoo, etc. out of certain directories.

Cheers,
Dave

--
They that can give up essential liberty to obtain a little temporary safety deserve neither safety nor liberty.
Ben
Re:i'm a little clueless here by twidarkling · 2010-03-05 03:41 · Score: 1

Except that a recent RIAA case ruled that you don't need to have actually seen a copyright notice in order to be bound by it, due to the ubiquity of the notice. ToS are similarly ubiquitous, so you should be bound by them as well, whether you saw them or not.

--
Canada: The US's more awesome sibling.
Re:i'm a little clueless here by plague3106 · 2010-03-05 03:42 · Score: 1

A robot can't enter into a contract though, I would imagine.
my experience with Attributor by bcrowell · 2010-03-05 03:47 · Score: 5, Informative

I've had an experience with Attributor myself, and it's given me a pretty low opinion of them. I'm the author of a CC-BY-SA-licensed calculus textbook, titled "Calculus." Someone posted a copy of the pdf on Scribd, as allowed by the license. So one day I got an email from one of the people who runs Scribd, saying that Attributor had sent them a takedown notice, which they were skeptical about. Attributor hadn't supplied any useful information about what they thought was a violation. I called Scribd, and they checked and said it was a mistake -- they were working for Macmillan, which publishes another book titled "Calculus." So here they were, serving a DMCA notice under penalty of perjury, and they hadn't even checked whether the name of the author was the same, or whether any of the text was the same. Their bot just found that the title, "Calculus," was the same as the title of one of their clients' books. Pretty scummy.

--
Find free books.
1. Re:my experience with Attributor by bcrowell · 2010-03-05 03:59 · Score: 3, Informative
  
  Oops, important correction to the parent post: "I called Attributor, and they checked and said it was a mistake -- they were working for Macmillan..."
  
  --
  Find free books.
2. Re:my experience with Attributor by Anonymous Coward · 2010-03-05 04:19 · Score: 1, Insightful
  
  Wouldn't we have to see the actual takedown notice to be sure it was faulty? Maybe there was another copy of the Calculus book on Scribd and that one was MacMillan's, and perhaps the DMCA was a bit sloppily written, or perhaps Scribd just didn't read it right.
  It could happen. I can look at Scribd right now and see several books with the title Calculus on them, only one can be yours, the others? Maybe they approve of their works being on Scribd, maybe not.
  So do I think Attributor was particularly scummy just because you got an email from Scribd? No, not on its own. Everybody can make a mistake without it being malicious.
3. Re:my experience with Attributor by jfengel · 2010-03-05 04:37 · Score: 1
  
  It sounds like this sort of thing wouldn't happen under their new tactic, which actually does compare the text rather than just the title.
  Pretty stupid of them to have sent a takedown notice based on nothing more than the title.
4. Re:my experience with Attributor by Anonymous Coward · 2010-03-05 04:41 · Score: 1, Informative
  
  Disclaimer: a Slashdot forum discussion is no substitute for professional legal advice; seek professional advice if you need it.
  To be a valid 17 USC 512 (c) takedown notice, it has to clearly identify the infringing content, i.e. with a link. If it doesn't, that's not a takedown, it's just an angry email.
  Also, it does require there to be a 'good-faith belief' that the material in question is infringing (i.e. that is not the perjury part), and a statement on penalty of perjury that the distribution of the material they are purporting it to be is not authorised by the copyright holder (that one could get some of the agents behind the recent takedowns of music blogs in hot water, because some of those definitely have been authorised by the copyright holders), and that they are either the copyright holder or the copyright holder's appointed agent.
  They should NEVER be sending out takedowns based on the whim of a bot with no human oversight; that represents overt negligence, not a good-faith belief, and fraud on their part as they claim to the copyright holders that they never do this.
  Please post the DMCA takedown in question to Chilling Effects, and contact Macmillan directly to inform them of the unfounded, mistaken threats Attributor are making in their name.
  I don't know about you, but we send out an invoice for costs for every false DMCA takedown we receive (so far they have all been misidentifications, some of which are repeated misidentifications, and we have contacted the copyright holder of that particular work directly - as a result, they are no longer using the services of Mark Ishikawa's "BayTSP"). Legislation in my country may change to reflect this, although much of the rest of it is going to go down the pan I don't doubt.
5. Re:my experience with Attributor by clone53421 · 2010-03-05 05:00 · Score: 1
  
  Does it OCR the pages itself, if the PDF hasn’t already been OCRd?
  
  --
  Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
6. Re:my experience with Attributor by coofercat · 2010-03-05 05:05 · Score: 2, Insightful
  
  It's one thing to make a mistake, and entirely another to invoke the law to enforce a mistake. You're right, it's entirely possible the takedown was poorly written, but therein lies the problem with the takedown mechanism - there's no standard that it must reach before it can be served. Thus mistakes, honest or otherwise, threaten people with very real, very wide-ranging and scary/expensive actions - completely in error. As such, as reasonable people, we expect that anyone taking action as serious as a takedown would apply a good dose of due diligence. Sadly though, not everyone views takedowns as "serious action", and so they aren't taking the care with them that perhaps they should.
7. Re:my experience with Attributor by Hatta · 2010-03-05 05:17 · Score: 2, Interesting
  
  So, did you press charges?
  
  --
  Give me Classic Slashdot or give me death!
8. Re:my experience with Attributor by slashqwerty · 2010-03-05 13:32 · Score: 1
  
  You're pretty close but this part isn't quite right.
  
  Also, it does require there to be a 'good-faith belief' that the material in question is infringing (i.e. that is not the perjury part), and a statement on penalty of perjury that the distribution of the material they are purporting it to be is not authorised by the copyright holder (that one could get some of the agents behind the recent takedowns of music blogs in hot water, because some of those definitely have been authorised by the copyright holders), and that they are either the copyright holder or the copyright holder's appointed agent.
  
  The actual law requires:
  
  (v) A statement that the complaining party has a good faith belief that use of the material in the manner complained of is not authorized by the copyright owner, its agent, or the law.
  
  (vi) A statement that the information in the notification is accurate, and under penalty of perjury, that the complaining party is authorized to act on behalf of the owner of an exclusive right that is allegedly infringed.
  
  They don't have to have a good faith belief that the material infringes. They only need a good faith belief that the use described in the complaint was not authorized. It does not matter if the use described in the complaint is totally fictitious.
  
  The penalty of perjury part only applies to a statement that the person filing the complaint (i.e. lawyer) is authorized to act on behalf of a copyright holder whose rights have allegedly been infringed. The law doesn't even require that it be the same copyright holder.
well by unity100 · 2010-03-05 03:52 · Score: 1

I don't think that France, Germany, Spain, the Scandinavian countries and the rest of the EU will just sit and accept the U.S. as dominator of the world's information.

--
Read radical news here
1. Re:well by s1lverl0rd · 2010-03-05 04:13 · Score: 1
  
  Isn't Google?
Re:i'm a little clueless here by clone53421 · 2010-03-05 04:07 · Score: 1

No, but the person who deployed the robot can implicitly do so by its actions.
Same argument used by the guy who had his cat click the EULA confirmations, and same flaw. He’s still liable.

--
Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
Impeach while we still can. by impeach · 2010-03-05 04:08 · Score: 1

Copyright treaty and "Cybersecurity" ruses are massive police state measures which will fundamentally alter life as we know it. We need as many minds as we can muster working out Articles of Impeachment just as fast as we can.
1. Re:Impeach while we still can. by Burz · 2010-03-05 06:31 · Score: 1
  
  Agreed! Let's start with Judges Scalia and Thomas.
Oh, goody by Mathinker · 2010-03-05 04:08 · Score: 1

A robot can't enter into a contract though, I would imagine.
If I cannot be liable for my robot breaking a contract, why couldn't I just make a robot which spiders the net and copies everything, violating copyright? Somehow I think your logic is flawed.
1. Re:Oh, goody by plague3106 · 2010-03-05 05:41 · Score: 1
  
  Oh, you mean like google?
2. Re:Oh, goody by bws111 · 2010-03-05 08:16 · Score: 1
  
  No, the logic is good. You can be liable for your robot breaking a contract (or a law). What he said however was that a robot can not enter into a contract, and that is correct. Entering a contract requires two parties to agree to something (be of one mind), and a robot can not do that (at least not something like a web crawler). Since no contract was entered into, neither party is bound by it.
Speaking as a publisher.... by Ponyegg · 2010-03-05 04:14 · Score: 1

To me this is a natural culmination of larger traditional media outlets that are still, on the whole, managed and run by people who simply don't understand what the internet is, how it works, or how people engage with and use this information. I'm increasingly surrounded by people who have little or no background in media or internet publishing (they call themselves professional managers) who are telling us how things will work in the future without so much as a week's worth of shop-floor experience (I've worked for a large UK media owner for 11+ years).
Look at Murdoch's utter inability to understand what the web represents and his reactionary walled garden approach to media delivery and consumption. What you've got are senior managers all desperately trying to create mechanisms to restrict access to their content because they believe that scarcity will somehow shore up revenues. What they've failed to understand is that the fundamental rules of engagement have changed. It's time they stood aside/down and let those people with the understanding and foresight get on with building their company's future.
If publishers want to build a future for themselves, then listen up. Open up your content to developers, engage with your audience & readership, partner with well-selected commercial entities to extend your markets, limit the amount of advertising you provide but make that advertising relevant and engaging for you and your audience (because otherwise everyone will start using Adblock), offer unique content, know where your content is being consumed and what revenues you're generating on the back of it and, crucially, understand that as a publisher you are no longer able to call the shots as you once did on how people access your content. Technological innovation is something they should be embracing in all its scary, unfettered raw glory, not something to hide away from and build walls to defend against.
1. Re:Speaking as a publisher.... by Garwulf · 2010-03-05 05:11 · Score: 1
  
  As somebody who owns a small publishing company myself, I am left wondering just what sort of experiences you've had, and what branch of publishing you're in.
  The reason I ask is because I've now got experience in the book market at just about every level (I started off as an author with Simon & Schuster and Osborne/McGraw-Hill), and most of my experience has been that publishing companies are early adapters. Even publishing attitudes towards e-books are heavily informed by having tried to make them work in a serious way back around 2000 (an experiment I was part of).
  One of the big problems that Attributor is dealing with is that when somebody copies an article wholesale, it doesn't necessarily result in additional publicity for the original source - in fact, all too frequently the original source isn't mentioned at all, and the website claims it as their own. That's just straight-up plagiarism, and it is a problem.
  Now, there are certainly dinosaurs, and there are fields where the way people engage with content is very different from the original form of the field. To take newspapers, for example, people consume news in an active way - they don't just read the article, they comment on and discuss it. So, understanding that consuming news is very much a communal experience is the key to success today, and those who don't realize that are in for a rough time (on the internet, it is very easy to vote with your feet).
  (Compare that to books, which are not consumed in an active, communal way, and you can see why the traditional model there still stands - the e-book represents something like 2.5% of the total book market on a good day. As somebody in that field, however, I can assure you that it has undergone quite a few internal revolutions, and another is in the process of starting with the Espresso Book Machine.)
  But, at the same time, there's a very big difference between failing to adapt to your market and having your content stolen (or copied - it's a semantic issue, and the important thing is that the content in question is not the pirate's content to reproduce). And, there's nothing wrong with taking action there. The fact that somebody can burgle a house does not make it immoral to catch and jail the burglar for doing it...or to put it the cliched way, just because one CAN do a thing, it does not follow that one SHOULD do that thing.
  So - and part of this may be that your branch of publishing is not the same as mine, and you've had a lot of bad luck with idiotic higher-ups - I see this as a justified reaction to a "pirate culture," if you want to call it that, and content producers and distributors do have a right to protect themselves within reason. So long as it doesn't fall into RIAA-level thuggery and stupidity, I think this is a good thing in the end.
  
  --
  Robert B. Marks
  Author, Demonsbane in Diablo Archive
2. Re:Speaking as a publisher.... by pydev · 2010-03-05 06:24 · Score: 1
  
  and most of my experience has been that publishing companies are early adapters [sic]
  They may well be, but their entire model is outdated. What value do they add other than marketing? And why should they get such obscenely large cuts for that?
  And, there's nothing wrong with taking action there.
  These companies aren't going to stop at prosecuting wholesale copying. They are going to see their revenues fall further and further (that's inevitable), and then they are going to go first after fair use and then they are going to try to copyright facts and news itself (they are already floating proposals for that).
  Wholesale copying is wrong, but that's not the problem of publishers and news organizations. Their problem is that they are very inefficient. They could afford those inefficiencies when their product was a fairly expensive physical product; they can't afford them anymore today, when their product costs essentially nothing to replicate and distribute.
  The problem of those companies is, in short, that they are dinosaurs and are becoming irrelevant. But they still have a lot of political clout and they are not going to go down easy.
  And if you think that that's not a risk, think again. More and more businesses are hugely subsidized by the public: agriculture, music, broadcasting, steel, cars, airlines, oil, etc. In an efficient market, many of those industries would either not exist in the US or look radically different. Publishing will want its cut as well as it becomes obsolete.
3. Re:Speaking as a publisher.... by Garwulf · 2010-03-05 09:07 · Score: 1
  
  "They may well be, but their entire model is outdated. What value do they add other than marketing? And why should they get such obscenely large cuts for that?"
  You really need to get a clue.
  I'm serious about that, and I see stuff like this a lot. You are completely ignorant of the publishing world, and yet you are quite willing to make statements about how our entire model is outdated. Do you even know what our model is?
  First off, you really need to look up the word "outdated." For something to be outdated, it must be operating in a way that was valid in the past, but is no longer valid. Taking the current book market, around 95% of the market consists of printed books, with e-books and audiobooks fighting over the other 5%. This means that the present model has not been replaced. Don't take my word for it - do a search for "Association of American Publishers" and look at their news - they track the market and the sales figures on a month by month basis.
  Second, besides marketing and distribution being pretty big things, there's also quality control and editing, which make a huge difference. And that part of publishing a book takes months of work.
  And finally, I suggest you do some research on wholesalers, distribution, and production costs. Or do you really think that Amazon offers a 25% discount off cover prices at a loss? For that matter, some basic business knowledge would be a good start.
  
  --
  Robert B. Marks
  Author, Demonsbane in Diablo Archive
4. Re:Speaking as a publisher.... by pydev · 2010-03-05 22:40 · Score: 1
  
  You really need to get a clue.
  No, you really need to get a clue. Publishing books about Greek humor to aging classics majors gives you a very warped perspective on publishing. In fact, your kind of publishing may well continue to exist--after all, you really sell decoration, not literature.
  there's also quality control and editing
  Editing? Quality control? Is that some kind of joke? In the technical world, publishers let the authors do their own editing, they publish whatever they can get, and then they charge upward of $100 for it. If people really need editing, they can contract it out for next to nothing over the Internet.
  As for quality control, that should never have been in the hands of organizations whose primary purpose is to make money. In the past, that was a necessary evil; these days, it's just evil. I don't want you or any other publisher being a gatekeeper to the kind of information I can access. If I want quality control, I'd rather go to Oprah's book club than to you.
  Second, besides marketing and distribution being pretty big things
  Distribution is electronic and costs next to nothing. Marketing is the only legitimate function of publishers that's left; but why would I want someone like you to do my marketing for me and take away 85% of the net (!) revenue?
  And finally, I suggest you do some research on wholesalers, distribution, and production costs.
  I suggest you do the same, because if you think that those costs even matter anymore, you're stuck some time in the last century.
5. Re:Speaking as a publisher.... by Garwulf · 2010-03-06 03:43 · Score: 1
  
  You know, you remind me of one of those conspiracy theorists who no matter what evidence is laid before them, can't be turned away from some bizarre and inexplicable theory. So, there's no point in arguing with you - have fun in your fantasy world. I'll continue to live in the real one.
  
  --
  Robert B. Marks
  Author, Demonsbane in Diablo Archive
6. Re:Speaking as a publisher.... by pydev · 2010-03-06 11:42 · Score: 1
  
  You haven't provided any "evidence", you've simply been waving your hands and spewing the usual bullshit that we hear from publishers about editing, quality control etc.
  The evidence, on the other hand, is clear: just go to the online sites for print-on-demand publishing, electronic distribution, editing and writing services, and marketing and advertising services. An author can get a book published and marketed for less than $1000 in both print and online form, and get a much higher cut than you or any other publisher is willing to give. And the author controls the terms and the where and how. That's the evidence.
  In ten years, you're going to be out of business.
  (As for "conspiracies", pulling one off requires that one knows what is going on; you don't.)
Web Crackdown Full Stop by ObsessiveMathsFreak · 2010-03-05 04:18 · Score: 2, Insightful

It's not just copyright. The slow but steady alignment of copyright holders, oppressive governments, legal changes, media pressure and surveillance technology has wound itself around the internet worldwide, and now the real pressure is being applied. This is a secular change, largely unobservable over smaller intervals, but the end result is that the web in 10 and 20 years time will be a noticeably less free place than it is today. Everything you do online will be monitored, everything will be logged, everything will be legally defined and controlled, and every infringement will be subject to criminal penalties.
The parties responsible have the support of the politicians, the censors, the press, the money men and most of the public. We used to have the support of the geeks and their creativity in bypassing censorship. But let's face it; geeks have not created a truly disruptive technology since BitTorrent almost ten years ago. While Geekdom slept, the likes of Cisco and the major Telcos have constructed a frightening array of technologies for surveillance and control of the internet, and the fruit of their efforts can be seen in China, Iran and now even countries like Australia. Soon it will be seen all over the world.
The Web has changed. Governments are no longer going to tolerate the freedom and anarchy that it grants to the population at large. They now have the means, method and opportunity to put this genie back in the bottle. This crackdown is the first offensive on what is going to be a wide front. Expect the free net to lose.

--
May the Maths Be with you!
1. Re:Web Crackdown Full Stop by night_flyer · 2010-03-05 04:59 · Score: 1
  
  10 to 20 YEARS? I think you are being optimistic...
  
  --
  
  Thanks to file sharing, I purchase more CDs
  Thanks to the RIAA, I buy them used...
2. Re:Web Crackdown Full Stop by cpghost · 2010-03-05 06:43 · Score: 1
  
  Expect the free net to lose.
  And expect the anonymous overlay networks (a la freenet et al.) to replace it. However, we're in a sad state in general when only anonymous speech remains truly free speech.
  
  --
  cpghost at Cordula's Web.
Here is how I solved this problem myself... by MarcoF · 2010-03-05 04:25 · Score: 1

just a few days ago, when I found a real magazine had copied without permission, integrally and without attribution, an article of mine. I wrote this: http://stop.zona-m.net/node/112 then asked them to please cancel their copy and they immediately did it.
I hope their algorithm can keep up by aarenz · 2010-03-05 04:28 · Score: 3, Interesting

I suspect that many sites that are using this type of content will find ways of hiding that fact by using non-display characters, breaking the article into multiple pages and the like to cover the fact that they are using the content. Would love to see their system in action on some test sites to figure out how much you need to do to cover the content and make it not match the original.
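For anyone who wants to try that on a test site, here is a rough sketch of how a matcher might shrug off the two tricks mentioned above. This is not Attributor's algorithm, which isn't described anywhere in the story; the function names, the 5-word shingle size, and the sample text are all assumptions made up for illustration. It strips invisible Unicode format characters, glues split pages back together, and measures what fraction of the original's word sequences reappear in the suspect copy.

```python
import re
import unicodedata


def normalize(text):
    """Drop invisible format characters (Unicode category Cf, which covers
    zero-width spaces and soft hyphens), collapse whitespace, lowercase."""
    visible = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    return re.sub(r"\s+", " ", visible).strip().lower()


def shingles(text, k=5):
    """Return the set of k-word sequences in the normalized text."""
    words = normalize(text).split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def fraction_copied(original, suspect_pages, k=5):
    """Fraction of the original's shingles found in the suspect copy.
    suspect_pages is a list of strings, one per page of the suspect site."""
    orig = shingles(original, k)
    if not orig:
        return 0.0
    suspect = shingles(" ".join(suspect_pages), k)
    return len(orig & suspect) / len(orig)


# Hypothetical sample: the copy hides zero-width spaces and splits in two.
ORIGINAL = "the quick brown fox jumps over the lazy dog near the river bank"
DISGUISED = ["the qu\u200bick brown fox jumps ov\u200ber", "the lazy dog near the river bank"]
UNRELATED = ["an entirely different piece of writing about something else altogether"]
```

With these samples, `fraction_copied(ORIGINAL, DISGUISED)` comes out at 1.0, comfortably over the 80% figure from the summary, so neither trick helps against even this naive matcher. Paraphrasing and reordering sentences would do far more damage to the score than invisible characters do.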
Re:IT. IS. LAW!!! by clone53421 · 2010-03-05 04:52 · Score: 1

Wrong. robots.txt asks you to not index certain pages. It does not give permission to index the rest of the pages.
Permission to read the pages is implicit in the fact that you’re serving them freely to whoever or whatever makes an HTTP request for them.
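To make the point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the bot name, and the example.com URLs are made up for illustration. The parser only answers the question "did the site ask this crawler to stay away from this URL?"; acting on the answer is left entirely to the crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def asked_to_stay_out(robots_txt, user_agent, url):
    """Return True if the robots.txt rules ask this user agent to skip url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)
```

Here `asked_to_stay_out(ROBOTS_TXT, "ExampleBot", "http://example.com/private/a.html")` is true and the same call for `http://example.com/articles/a.html` is false. In both cases the server will still answer the HTTP request; the file is a request, not an access control.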

--
Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
Re:So where were you when Fox by clone53421 · 2010-03-05 05:02 · Score: 1

without explicit permission, copyright states that you cannot copy.
It's only convention (not law) that says you can.
Actually, you’re wrong. The law defines when you can copy. It’s called fair use.
Asking the server for a copy of a page so that you can read it is considered fair use, and there’s nothing in the law that says a robot is any different than a human in this regard.

--
Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
Final 21% by thefear · 2010-03-05 05:07 · Score: 1

Offshore sites will not be immune from the crackdown, said Pitkow, because almost all of them depend on banner ads served by U.S.-based services. Because the DMCA requires the ad service to act against any violator, Attributor says it can interdict the revenue lifeline at any offending site in the world.
Attributor already has been engaged by several major book publishers to get unauthorized eBooks off unauthorized sites. "And we have 99% success rate," he said.

--
:(
pirate my @ss by Anonymous Coward · 2010-03-05 05:14 · Score: 1

Since when did not toeing a corporate line with regard to intangibles turn otherwise upstanding citizens into criminals?
This BS began with the EULA, gained traction with the DMCA, and will be solidified with ACTA.
Stop the madness! Vote the bums out!
Re:Better stop syndicating, then by natehoy · 2010-03-05 05:19 · Score: 1

Huh?
If they are RSS syndicating their content, that constitutes a legitimate use of their work. The 100% copy is done with the permission of the author. You are allowed to read that all you want.
If YOU are RSS syndicating complete articles that someone else wrote, this is not a legitimate use of their work. You could, however, RSS syndicate 79.9% of each and every one of their articles and put a link to the original article for your readers to read the other 20.1%. Common courtesy would actually dictate that you post a few paragraphs only, then link back to the source for the rest of the article. Credit where credit is due.
Seriously, if you are routinely using complete articles written by someone else, and you aren't compensating them for that use, you are violating copyright.

--
"This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
Re:i'm a little clueless here by ASBands · 2010-03-05 05:21 · Score: 3, Interesting

One idea would be to use the many available cloud services like EC2, Google App Engine and Azure. The IP blocks those services come in are going to remain fairly regular, but they are so common that it might not be acceptable for a site to block everything from ghs.l.google.com (and whatever EC2 and Azure live on). It is still blockable, though, so it probably would have been better for them (from a technical standpoint) if they hadn't announced their existence and these sites had been slowly indexed by their service before anybody knew what was happening.
Another (better) idea would be to use a service like Tor. Sure, their latency is going to skyrocket, but that's not a big deal since interactivity isn't a primary concern of an indexing service. It's still blockable, if infringing site admins block Tor nodes. This may or may not be doable, as I would imagine many users of said infringing sites use anonymizing networks for their normal traffic.
Sure, either of the solutions I've come up with in five minutes can be circumvented, but the idea isn't to totally eliminate piracy, it's to make it inconvenient enough to make getting the legitimate version easier.
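For what "blockable" looks like from the site admin's side, here is a minimal sketch using Python's standard `ipaddress` module. The CIDR ranges are reserved documentation addresses standing in for whatever blocks a cloud provider or a Tor exit list would actually publish, so treat them and the function name as placeholders.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), not real provider blocks.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(addr, networks=BLOCKED_NETWORKS):
    """Return True if addr falls inside any of the blocked networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in networks)
```

The check itself is trivial; the hard part is the trade-off described above, since every range added to the list also turns away whatever legitimate visitors happen to come from it.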

--
My UID is a prime number. Yeah, I planned that.
GoodLuckWithThat by Hurricane78 · 2010-03-05 05:38 · Score: 1

Will they be writing angry letters to Google too? You know... those that index all their content, so it can be found in the first place?
Let them kill themselves with their delusional business model. The space and jobs will quickly be filled with something else.

--
Any sufficiently advanced intelligence is indistinguishable from stupidity.
Re:IT. IS. LAW!!! by nacturation · 2010-03-05 05:44 · Score: 1

IT. IS. LAW!!! You need to make a copy of a copyrighted work to view that work on the internet. You cannot make a copy unless you have permission. robots.txt gives that permission BY CONVENTION on the internet.
Let me guess: you check robots.txt every time you browse to a new website and if that website has no robots.txt, you leave because you don't have permission to view anything?

--
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
Re:i'm a little clueless here by nacturation · 2010-03-05 05:52 · Score: 1

On the other hand, that's an utterly asinine comment to have made (the one you quote, not yours).
It's a kdawson quote... what more can you expect?

--
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
news organizations are pretty arrogant by pydev · 2010-03-05 06:07 · Score: 1

They expect people to provide them with free information (they call it "interviews" and "fact gathering" and all that) and then turn around and try to sell it. Oh, but they do add something: overpaid upper-middle-class bias and political favoritism in return for being allowed to hobnob with the important people and getting invited to all the right parties.
We can only hope that the big players in this industry will go bankrupt.
FAQs by BigSes · 2010-03-05 07:53 · Score: 1

Wonder if they will apply this to sites that feature FAQ-type writeups. I remember reading a small strategy guide for MW2 Multiplayer mode on a website that I Googled. It was nearly verbatim to the original one on a competitor's site, just without the pictures and the same formatting. Hell, they even tried to use slightly different sentence structure in some places, but still used the same adjectives and adverbs in many places (much like how someone plagiarizing a term paper would "re-write" it in their own words). All with zero attribution to the original source.
Another corporate overlord? by Whuffo · 2010-03-05 08:39 · Score: 1

These folks don't seem to have thought their cunning plan all the way through. No matter how they try to dress it up, they're vigilantes in pursuit of their idea of justice and there are some legal issues that they are going to have to deal with.
I'll let someone like NYCL describe those issues in detail - and I don't have any of anyone else's material online but it might be fun to do so, collect a DMCA complaint from these clowns - then sue them and watch them try to dance for the judge.
Re:So where were you when Fox by taoye · 2010-03-05 10:30 · Score: 1

I was in the bathroom. You got a problem with that?
Re:i'm a little clueless here by bertoelcon · 2010-03-05 16:20 · Score: 1

Try posting with your account, you dickless wonder.
At least have the decency to do so yourself.

--
Anything can be found funny, from a certain point of view.
What about a court order? by Puppet+Master · 2010-03-06 01:23 · Score: 1

Hosting providers shouldn't just take down a site based on a letter from Attributor. There's something called "Fair Use". Also, why should the hosting provider take the risk of taking down a site? Who's to say that Attributor is not making a mistake and accusing the wrong publisher? Let a court decide, not some stupid startup trying to make a buck.

--
The day Microsoft creates a product that doesn't suck, it will be known as the Microsoft Vacuum Cleaner!
goose that lays the golden eggs by vuffi_raa · 2010-03-06 05:46 · Score: 1

Is it just me, or have none of these business execs ever read "The Goose That Lays the Golden Eggs" as a kid? The fact of the matter is that restricting information actually discourages people from wanting to read it; it doesn't in any way encourage them to pay for it. People will in all likelihood just watch the news or look for stories that are free somewhere else, which means both more disinformation and more bias being treated as news.
Re:i'm a little clueless here by JNSL · 2010-03-06 17:41 · Score: 1

I think you're mixing up two distinct issues here. Copyright notice is not necessary to somebody owning a copyright and enforcing the associated rights. It has nothing to do with "the ubiquity of the notice."
Re:i'm a little clueless here by twidarkling · 2010-03-06 18:10 · Score: 1

I guess you missed this story:
http://www.wired.com/threatlevel/2010/02/former-teen-cheerleader-dinged-27750-for-infringing-37-songs/
Most specifically this part: "the Copyright Act precludes such a defense if the legitimate CDs of the music in question provide copyright notices."
This, despite her claim that she never had the actual CDs to see the notice.

--
Canada: The US's more awesome sibling.
Re:i'm a little clueless here by JNSL · 2010-03-06 19:00 · Score: 1

While I had not read that story, it doesn't materially change my post. I just read the case (2010 WL 653322 - it's not available in a federal reporter yet), and here is the gist of why my original post stands and your analogy is a false analogy.

To start, you said that "in order to be bound by [the copyright notice]", "you don't need to have actually seen a copyright notice." This is always true. It has nothing to do with the ubiquity of notice. After the US signed Berne, we eliminated formalities like notice, and - as I said - "Copyright notice is not necessary to somebody owning a copyright and enforcing the associated rights."

This is different from the ToS because ToS are contractual. Copyright is statutory. That is why ubiquity matters to ToS. We should know better that many sites have ToS, so we are bound. That said, I don't think every jurisdiction is so liberal with ToS application.

In any case, for copyright, it is not that "we should know better, so we're bound by copyright." This is why your analogy ("except. . .") to Maverick Recording Co. v. Harper is a non-starter.

And even false analogy aside, you misunderstood what happened in the case. Harper was trying to assert the innocent infringer defense in order to lower damages. But "§ 402(d) . . . gives publishers the option to trade the extra burden of providing copyright notice for absolute protection against the innocent infringer defense."

So what happened was that Harper made out a prima facie case for innocent infringement according to the district court, thus (as an issue of fact) the matter would be left to the jury. However, the court of appeals said, "hold on, even if you make the prima facie case (§ 405(b)) the publisher has an absolute defense."

Harper lost because of 402(d), not because notice is ubiquitous. 402(d) would apply even if nobody knew that CDs had copyright notices on them. Once the notice is on the original, it's good to go.