Slashdot Mirror


Web Copyright Crackdown On the Way

Hugh Pickens writes "Journalist Alan D. Mutter reports on his blog 'Reflections of a Newsosaur' that a coalition of traditional and digital publishers is launching the first-ever concerted crackdown on copyright pirates on the Web. Initially targeting violators who use large numbers of intact articles, the first offending sites to be targeted will be those using 80% or more of copyrighted stories more than 10 times per month. In the first stage of a multi-step process, online publishers identified by Silicon Valley startup Attributor will be sent a letter informing them of the violations and urging them to enter into license agreements with the publishers whose content appears on their sites. In the second stage, Attributor will ask hosting services to take down pirate sites. 'We are not going after past damages' from sites running unauthorized content, says Jim Pitkow, the chief executive of Attributor. The emphasis, Pitkow says, is 'to engage with publishers to bring them into compliance' by getting them to agree to pay license fees to copyright holders in the future. Offshore sites will not be immune from the crackdown: almost all of them depend on banner ads served by US-based services, and the DMCA requires the ad service to act against any violator. Attributor says it can interdict the revenue lifeline at any offending site in the world." One possible weakness in Attributor's business plan, unless they intend to violate the robots.txt convention: they find violators by crawling the Web.

13 of 224 comments

  1. DMCA.. by ltning · · Score: 2, Interesting

    What on earth is the DMCA supposed to achieve, in the context of Ad-providers?

    Sounds pretty scary to me.

    --
    Love over Gold.
  2. Re:Robots.txt by Joe+U · · Score: 1, Interesting

    If they are going to extend the DMCA to other countries, then let's extend computer trespassing laws to cover robots.txt violations.

    I'm being somewhat serious (but not super-serious). If courts want to hold that a website TOS is binding, then isn't the robots.txt binding as well?

  3. Will that ultimately include slashdot? by elrous0 · · Score: 5, Interesting

    A lot of aggregator sites like this one base a lot of their topical content on articles printed elsewhere. While most (incl. /.) don't print whole articles intact, a lot of them do quote heavily (what used to be called "fair use," back when that phrase actually meant anything). So their first step is to go after the sites that reprint the articles whole-cloth. But will they stop there?

    --
    SJW: Someone who has run out of real oppression, and has to fake it.
  4. Offshore sites WILL be immune by unity100 · · Score: 1, Interesting

    All this harassment is going to do is push small internet publishers around the globe to services in other countries. Datacenters and ad services in the U.S. will lose customers. There are already strong companies serving those areas in the EU. The EU will be happy to receive that amount of business.

    The stupor of American corporatism is overwhelming. They can even go to the extent of shooting themselves in the foot.

  5. Re:i'm a little clueless here by KingSkippus · · Score: 4, Interesting

    The Robots exclusion standard. Not that it will stop them; as others have pointed out, if they think they're "doing the right thing," I'm sure they will not be concerned about such a standard.
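
    For readers unfamiliar with the convention, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard urllib.robotparser. The robots.txt contents, the "ExampleBot" user agent, and the example.com URLs are placeholders made up for illustration. Nothing in this check is enforced: a crawler that skips it can still fetch the page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every crawler to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" and example.com are placeholders, not a real crawler or site.
private_ok = may_fetch(ROBOTS_TXT, "ExampleBot", "http://example.com/private/page.html")
public_ok = may_fetch(ROBOTS_TXT, "ExampleBot", "http://example.com/index.html")
```

    The check runs entirely on the crawler's side, which is why the standard is a courtesy rather than a barrier.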

    The worry here really isn't so much for the people who are hosting sites with infringing content. I'm sure a moral argument could be made that Attributor is well within its rights to disregard the wishes of those who are breaking copyright law. However, I run several sites that have no infringing content whatsoever, sites with content that, while not private, I don't particularly want spiders crawling. I'm not so naive as to think that they don't do it anyway; I have server logs proving that they do. However, in this case, we have a company that claims to be legitimate completely ignoring the wishes of someone who is not infringing (me) and crawling anyway.

    Put another way, by convention, my neighbors don't use binoculars to peer into my house windows to see what I'm doing although there's currently not really anything stopping them from doing so. Even though I don't particularly have anything to hide, if I find that they are violating our polite social contract, then I'll put up shades just because it's none of their damn business.

    I don't think that the robots.txt convention will be the thing that stops Attributor. I think it will be that it won't take long for web site authors to figure out what user agents, IP addresses, etc. Attributor is using and block Attributor's access to their sites. Like I said, I have no infringing content on my sites, but if Attributor is going to ignore me politely asking their robots not to scan my sites, then I'm fully in the right to take further steps to forcibly prevent them from doing so.
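
    A rough sketch of that kind of blocking, assuming a site operator has already worked out which user agent string and address range to refuse. The "examplecrawler" substring and the 192.0.2.0/24 network (a range reserved for documentation) are placeholders; the crawler's real identifiers are not given anywhere in this discussion.

```python
import ipaddress

# Placeholder identifiers, assumed for illustration only.
BLOCKED_AGENT_SUBSTRINGS = ("examplecrawler",)
BLOCKED_NETWORKS = (ipaddress.ip_network("192.0.2.0/24"),)


def is_blocked(remote_addr, user_agent):
    """Return True if a request should be refused based on its agent or address."""
    agent = user_agent.lower()
    # User agent strings are trivially spoofed, so this catches only honest bots.
    if any(fragment in agent for fragment in BLOCKED_AGENT_SUBSTRINGS):
        return True
    # Address checks still apply when the user agent is disguised.
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in network for network in BLOCKED_NETWORKS)
```

    A request handler would call is_blocked(client_ip, user_agent_header) and answer with a 403 when it returns True. The list only helps as long as the crawler keeps using the same agents and addresses.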

  6. Re:i'm a little clueless here by Joe+U · · Score: 1, Interesting

    Ok, here's an argument.

    http://blog.internetcases.com/2010/01/05/browsewrap-website-terms-and-conditions-enforceable/

    So, the terms of use of a website are binding, at least according to this court. If the terms spell out mandatory following of robots.txt, is robots.txt now binding?

  7. Re:Robots.txt by Joe+U · · Score: 2, Interesting

    That's the point I was trying to make. I posted this somewhere else:

    http://blog.internetcases.com/2010/01/05/browsewrap-website-terms-and-conditions-enforceable/

    So now you can turn around and sue them for crawling your site if you specifically disallow it in the terms and robots.txt.

    The results should be interesting to watch.

  8. Re:Robots.txt by Registered+Coward+v2 · · Score: 3, Interesting

    Seriously. Following robots.txt is not law, only convention. I'm sure it doesn't take much to convince themselves to ignore it. Money, "doing the right thing", etc. If you view the copyright infringers as pirates, then why should Attributor follow their wishes?

    I'd go even farther to say that sites that use robots.txt to eliminate crawling are probably not major targets - if they don't show up in search engines then they probably don't generate enough traffic to be worth the effort. Sites that are high traffic are much better targets - their revenue stream from ads is probably significant enough that they don't want to risk losing it. Once enough fall into line they can worry about the ones that are not indexed - in fact they may just want to kill them off to preserve traffic to licensed sites.

    --
    I'm a consultant - I convert gibberish into cash-flow.
  9. Re:Please do so by clone53421 · · Score: 2, Interesting

    I'm tired of sending invoices and dealing with companies who tell you that your photo wasn't worth the $300 you charge and instead send you $50 thinking that it will clear up the matter.

    They’re basically giving you the finger. Don’t fuck around playing their little games... show them you mean business. Slap on a surcharge to cover your additional expense and send their name and remaining balance to a debt collector. It’s probably cheaper and less of a hassle than suing them in small claims court.

    IANAL... you may want to ask a real lawyer what your options are, but seems to me you have a few.

    --
    Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
  10. I hope their algorithm can keep up by aarenz · · Score: 3, Interesting

    I suspect that many sites that are using this type of content will find ways of hiding that fact by using non-display characters, breaking the article into multiple pages and the like to cover the fact that they are using the content. Would love to see their system in action on some test sites to figure out how much you need to do to cover the content and make it not match the original.
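
    To get a feel for how much disguise survives a simple matcher, here is a small sketch. It is not Attributor's algorithm, which is not described anywhere in the story; it is a generic word-shingle comparison that first strips invisible Unicode "format" characters (category Cf, which covers zero-width spaces and soft hyphens). The function names and the shingle size of 3 are choices made for this example.

```python
import unicodedata


def normalize(text):
    """Drop invisible format characters, lowercase, and split into words."""
    visible = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    return visible.lower().split()


def shingles(words, k=3):
    """Return the set of every run of k consecutive words."""
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def fraction_copied(original, candidate, k=3):
    """Fraction of the original's k-word shingles that also occur in candidate."""
    source = shingles(normalize(original), k)
    if not source:
        return 0.0
    return len(source & shingles(normalize(candidate), k)) / len(source)


original = "the quick brown fox jumps over the lazy dog"
# Same text with a zero-width space and a soft hyphen hidden inside two words.
disguised = "the qu\u200bick brown fox jumps over the la\u00adzy dog"
score = fraction_copied(original, disguised)
```

    An exact string comparison fails on the disguised copy, while the normalized comparison still reports a full match. Splitting an article across pages could be handled the same way by joining the pages before comparing. Rewording or reordering sentences would lower the score, which is presumably where the real cat-and-mouse game would be.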

  11. Re:the article, for your convenience by natehoy · · Score: 3, Interesting

    No one.

    He posted the article, cited it as the original article (knowing there was a proper citation link above), and posted less than 80% of it. This is a completely legitimate use of the article as per Attributor's new rules. Two or three more words from the article would have made it an "80% rule" bust, but would still have been OK as long as he didn't make a habit of it. It's repeated use of more than 80% of source article text that Attributor wants to go after.

    Most discussion boards already limit direct citation to a paragraph or two, or approximately 20% of the article.

    So Attributor's 80% limit is making a clear statement that they are really only interested in pursuing people who make a routine habit of copying entire articles. And if the bulk of your content is coming from copying 100% of someone else's original news articles, you aren't exactly someone I want to waste my righteous indignation defending.
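
    The rule as stated in the summary reduces to simple arithmetic. A sketch, assuming some matcher has already produced, for each article a site reused in a month, the fraction of the original that was copied (how Attributor actually measures that fraction is not described). The constant and function names are made up for this example.

```python
THRESHOLD_FRACTION = 0.80   # "80% or more" of a story, per the summary
MAX_COPIES_PER_MONTH = 10   # a site is targeted only when it exceeds this count


def is_targeted(copied_fractions):
    """copied_fractions: one value in [0, 1] per article a site reused this month."""
    heavy_copies = sum(1 for f in copied_fractions if f >= THRESHOLD_FRACTION)
    return heavy_copies > MAX_COPIES_PER_MONTH


# A discussion board quoting a paragraph or two from many articles.
board = [0.2] * 50
# A site reprinting eleven articles in full.
scraper = [1.0] * 11
```

    Under these assumptions the board is not flagged no matter how many articles it quotes, while the scraper is flagged on its eleventh full copy.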

    --
    "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
  12. Re:my experience with Attributor by Hatta · · Score: 2, Interesting

    So, did you press charges?

    --
    Give me Classic Slashdot or give me death!
  13. Re:i'm a little clueless here by ASBands · · Score: 3, Interesting

    One idea would be to use the many available cloud services like EC2, Google App Engine and Azure. The IP blocks those services use are going to remain fairly stable, but they are so common that it might not be acceptable for a site to block everything from ghs.l.google.com (and whatever EC2 and Azure live on). It is still blockable, though, so it probably would have been better for them (from a technical standpoint) if they hadn't announced their existence and these sites had been slowly indexed by their service before anybody knew what was happening.

    Another (better) idea would be to use a service like Tor. Sure, their latency is going to skyrocket, but that's not a big deal since interactivity isn't a primary concern of an indexing service. It's still blockable, if infringing site admins block Tor nodes. This may or may not be doable, as I would imagine many users of said infringing sites use anonymizing networks for their normal traffic.

    Sure, either of the solutions I've come up with in five minutes can be circumvented, but the idea isn't to totally eliminate piracy, it's to make it inconvenient enough to make getting the legitimate version easier.

    --
    My UID is a prime number. Yeah, I planned that.