Slashdot Mirror


Copyright Tool Scans Web For Violations

The Wall Street Journal is reporting on a tech start-up that proposes to offer the ultimate in assurance for content owners. Attributor Corporation is going to offer clients the ability to scan the web for their own intellectual property. The article touches on previous use of techniques like DRM and in-house staff searches, and the limited usefulness of both. They specifically cite the pending legal actions against companies like YouTube, and wonder about what their attitude will be towards initiatives like this. From the article: "Attributor analyzes the content of clients, who could range from individuals to big media companies, using a technique known as 'digital fingerprinting,' which determines unique and identifying characteristics of content. It uses these digital fingerprints to search its index of the Web for the content. The company claims to be able to spot a customer's content based on the appearance of as little as a few sentences of text or a few seconds of audio or video. It will provide customers with alerts and a dashboard of identified uses of their content on the Web and the context in which it is used. The content owners can then try to negotiate revenue from whoever is using it or request that it be taken down. In some cases, they may decide the content is being used fairly or to acceptable promotional ends. Attributor plans to help automate the interaction between content owners and those using their content on the Web, though it declines to specify how."

185 comments

  1. Wager by Baricom · · Score: 3, Insightful

    Anybody care to place a friendly wager that they're not going to honor robots.txt?

    1. Re:Wager by Anonymous Coward · · Score: 0

      Anybody care to place a friendly wager that they're not going to honor robots.txt?

      Who cares? If they don't lie in their User-agent you can just block your entire site. And why wouldn't you? They are leeching your bandwidth and you get nothing in return.

      If they do lie then their scanning would appear to be a clear breach of the unauthorized access provisions of the UK Computer Misuse Act 1990, a criminal offence.

    2. Re:Wager by Anonymous Coward · · Score: 0

      "The company claims to be able to spot a customer's content based on the appearance of as little as a few sentences of text or a few seconds of audio or video."

      Uh, isn't this de-minimis?

    3. Re:Wager by Crudely_Indecent · · Score: 2, Informative
      Another company "Cyveillance" already does this for major corporations and the government. I've used htaccess rules to disallow all from their assigned netblocks after they racked up almost 20,000 hits to my personal site in one day. As you mentioned, they didn't follow robots.txt and attempted to index parts of my site that are password protected as well as content names that did not exist (music and videos and such), all the while identifying their bot as a variant of IE.

      Here's how to block two subnets using htaccess and mod_rewrite on apache:

      RewriteEngine On
      RewriteCond %{REMOTE_ADDR} "^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$" [OR]
      RewriteCond %{REMOTE_ADDR} "^63\.146\.13\.(6[4-9]|[7-8][0-9]|9[0-5])$"
      RewriteRule ^(.*)$ - [F]
      Line 1 activates the rewrite engine
      Line 2 sets the condition to include remote addresses 63.148.99.224-255 and includes [OR] to allow further processing
      Line 3 sets the condition to include remote addresses 63.146.13.64-95
      Line 4 sets the rule that any url be forbidden

      So, save your bandwidth by denying access to your content from unauthorized viewers (bots)
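Octet regexes like these are easy to get subtly wrong, so it can help to check them before deploying. The following is a minimal Python sketch, not part of the original post: `COND_1` is copied from the first RewriteCond above, `COND_2` is a pattern written for this sketch to cover 64-95, and the /27 blocks are simply the CIDR equivalents of the two ranges described in the line notes. The helper names are hypothetical.

```python
import ipaddress
import re

# Pattern copied from the first RewriteCond above (63.148.99.224-255).
COND_1 = re.compile(r"^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$")
# Pattern written for this sketch to cover 63.146.13.64-95. The leading 6
# sits inside the alternation so that 70-95 match as two-digit octets.
COND_2 = re.compile(r"^63\.146\.13\.(6[4-9]|[7-8][0-9]|9[0-5])$")


def matched_octets(pattern, prefix):
    """Return the last octets (0-255) for which prefix + octet matches."""
    return [n for n in range(256) if pattern.match(f"{prefix}{n}")]


def cidr_octets(cidr):
    """Return the last octets covered by a CIDR block no larger than a /24."""
    return [int(addr) & 0xFF for addr in ipaddress.ip_network(cidr)]


if __name__ == "__main__":
    # Each comparison is True when the regex covers exactly the CIDR block.
    print(matched_octets(COND_1, "63.148.99.") == cidr_octets("63.148.99.224/27"))
    print(matched_octets(COND_2, "63.146.13.") == cidr_octets("63.146.13.64/27"))
```

If either comparison comes back False, the regex and the intended range disagree and the rule would block the wrong addresses.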
      --


      "Lame" - Galaxar
    4. Re:Wager by Anonymous Coward · · Score: 0

      I already block 38.x.x.x because of some jokers using the PSI block for similar robots.txt ignorant site ripping.

      We make a collection of large documents available in multiple formats (txt, xml, html, swx, pdf and doc), guess what the genius copyright enforcers did? If they'd repeated from another netblock I would have followed through with abuse tickets.

      We could always use HTTP auth, then again so could sites doing unauthorized redistribution of copyright materials. That renders the entire "content policing" concept totally worthless.

    5. Re:Wager by PalmKiller · · Score: 1

      If they don't honor them, I will bet that the new startup's IP address blocks are filtered at most routers though.

    6. Re:Wager by markana · · Score: 1

      If they're fingerprinting such a small amount of source material, then they'll generate *mostly* false positives. Of course, that won't stop them from sending takedowns and auto-suits based on just the supposed match. You just can't get a very unique fingerprint with so few input bits.

      I hope everyone is prepared for the massive flood of notices this is going to generate...
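The article does not say how Attributor computes its fingerprints, so the following is only an illustrative Python sketch of one common text technique (hashing overlapping word windows, often called shingling), not their method. It shows why a detector tuned to "a few sentences" will also flag an ordinary quotation. All function names and the window size `k` are assumptions made for the example.

```python
import hashlib
import re


def shingles(text, k=5):
    """Return the set of overlapping k-word windows in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def fingerprint(text, k=5):
    """Hash each shingle down to a short hex digest."""
    return {
        hashlib.sha1(s.encode("utf-8")).hexdigest()[:16]
        for s in shingles(text, k)
    }


def overlap(source, candidate, k=5):
    """Fraction of the source's fingerprint that also appears in candidate."""
    src = fingerprint(source, k)
    cand = fingerprint(candidate, k)
    return len(src & cand) / len(src) if src else 0.0


if __name__ == "__main__":
    source = ("the quick brown fox jumps over the lazy dog "
              "near the quiet river bank")
    quote = ('As the author wrote, "the quick brown fox jumps over '
             'the lazy dog" in chapter one')
    print(overlap(source, quote))
```

A short attributed quote still shares several windows with the source, so a match score by itself says nothing about whether the use is fair.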

    7. Re:Wager by antarctican · · Score: 1

      Anybody care to place a friendly wager that they're not going to honor robots.txt?

      I had a similar thought. How much extra bandwidth is this going to suck from sites hunting for copyright material on completely legitimate sites? Particularly sites which might have a lot of large media content.

      If I put up a terms of service forbidding the crawling of my site, can I then sue them for bandwidth costs? Seems reasonable to me. Why should I be presumed to be guilty?

    8. Re:Wager by BrynM · · Score: 2, Informative
      There's an easier way. You can hand mod_access netblocks and more. This method will avoid eating cycles with mod_rewrite. If you can put it in your conf instead of .htaccess, you'll save even more time/processing. Just put it in for your doc root. From my httpd.conf:

      <Directory "/var/www/htdocs/">
      # BRYN'S DENIALS
      # allresearch.com
      deny from 209.73.228.160/28
      # branddimensions.com user-agent: BDFetch
      deny from 204.92.59.0/24
      # cyveillance.com
      deny from 63.148.99.224/27
      deny from 65.118.41.192/27
      # www.markwatch.com user-agent: markwatch
      deny from 204.62.224.0/22
      deny from 204.62.228.0/23
      deny from 206.190.160.0/19
      # nameprotect.com user-agent: NPBot
      deny from 12.40.85.0/24
      deny from 12.148.196.128/25
      deny from 12.148.209.192/26
      deny from 12.175.0.32/28
      # rocketinfo.com
      deny from 209.167.132.224/28
      # END BRYN'S DENIALS
      </Directory>
      Now I gotta look up IPs for these clowns... damn copyright ambulance chasers... arin.net here I come!
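To sanity-check a deny list like this against an address seen in the logs, a small Python sketch along the following lines could be used. It is an illustration only: the three `deny from` lines are a subset copied from the list above, and `parse_deny` and `is_denied` are hypothetical helper names.

```python
import ipaddress

# A subset of the deny lines from the list above.
DENY_LINES = """
# cyveillance.com
deny from 63.148.99.224/27
# www.markwatch.com
deny from 204.62.224.0/22
# nameprotect.com
deny from 12.148.196.128/25
"""


def parse_deny(text):
    """Collect the networks named in 'deny from <cidr>' lines."""
    nets = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and [p.lower() for p in parts[:2]] == ["deny", "from"]:
            nets.append(ipaddress.ip_network(parts[2]))
    return nets


def is_denied(ip, nets):
    """Return True if ip falls inside any of the denied networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in nets)


if __name__ == "__main__":
    nets = parse_deny(DENY_LINES)
    for ip in ("63.148.99.230", "204.62.226.10", "192.0.2.1"):
        print(ip, is_denied(ip, nets))
```

Comment lines and blanks are skipped because they do not split into exactly three fields beginning with `deny from`.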
      --
      US Democracy:The best person for the job (among These pre-selected choices...)
    9. Re:Wager by Anonymous Coward · · Score: 0

      So what exactly is robot.txt all about for us dickheads

    10. Re:Wager by Anonymous Coward · · Score: 0

      It's a way to find out who is new here.

  2. Can't they just use google or torrent sites? by LiquidCoooled · · Score: 3, Informative

    Can't they just use google or torrent sites?
    If users can find items they want, presumably the copyright holders could use the same methods...

    --
    liqbase :: faster than paper
    1. Re:Can't they just use google or torrent sites? by Anonymous Coward · · Score: 0

      Oh my God, they've discovered Google!

    2. Re:Can't they just use google or torrent sites? by owlnation · · Score: 2, Funny

      And the opposite situation shows why this tool is a waste of time.

      Imagine a tool where you could reliably return accurate search results for images and video. Does this exist yet? No, as one who searches the web daily for pics and video for my own sordid uses, let me assure you that it most certainly does not yet exist.

      And what an horrific waste to have such a tool - if it works - for policing content for copyright violations. Bearing in mind also that such "violations" are no such thing in some countries, regardless of the imperial arrogance of media companies.

      As always, and tell your family and friends, only buy music directly from the artist or secondhand. It's the only way to win.

    3. Re:Can't they just use google or torrent sites? by advocate_one · · Score: 1
      As always, and tell your family and friends, only buy music directly from the artist or secondhand. It's the only way to win.

      or else make it yourself... but then again you've got to pay the nickel for the bl00dy sheet music or tabs... and they don't half try to rip you off there as well... it's that or write your own... and then try and stop them from ripping you off...

      --
      Donald 'Duck' Dunn: We had a band powerful enough to turn goat piss into gasoline.
    4. Re:Can't they just use google or torrent sites? by Lissajous · · Score: 1
      As always, and tell your family and friends, only buy music directly from the artist or secondhand. It's the only way to win.

      What rubbish - the only way to win is not to play!
  3. Dupe by gravesb · · Score: 1

    Pretty sure this is a dupe, or so closely related to an earlier story as to not matter. Anyway, to recast points made earlier, how hard will it really be to "smudge" these digital fingerprints? It's not really different from any other DRM, and has the same issues involved. Also, who thinks that someone is going to pay for this service, and then allow their works to remain for promotional reasons? They are going to sue the heck out of the person violating copyright.

    --
    http://bgcommonsense.blogspot.com
    1. Re:Dupe by xlordtyrantx · · Score: 1
      Pretty sure this is a dupe, or so closely related to an earlier story as to not matter
      Yessir, I remembered that too .. http://slashdot.org/article.pl?sid=06/12/05/1346229
      One example would be Audible Magic, a 'fingerprinting program' for video released a few days ago that promises to use peculiarities of recording and editing to tag and identify forbidden material.
      --
      Eagles may soar, but weasels never get sucked into jet engines...
    2. Re:Dupe by AKAImBatman · · Score: 2, Interesting
      Pretty sure this is a dupe, or so closely related to an earlier story as to not matter.

      It's not a dupe. (Unless you count anything that appears on Digg first to be a dupe.) However, it's also not the first story of its kind. About a gazillion companies have formed with the exact same business plan (save for the "hotness" at the time being digital music) and about a gazillion of those companies have failed to develop software that catches anything but the most obvious infractions.

      Every so often, some RIAA/MPAA fair-haired boy manages to get funding for yet another attempt. He then fails miserably and the cycle repeats. You'd think the investors would learn. Unfortunately, they keep getting dazzled by the latest, buzzword-compliant technologies.
    3. Re:Dupe by Maximum+Prophet · · Score: 3, Interesting
      Since copyright lasts a long time and doesn't depend on being defended like trademark, there will be some allowances "for promotional reasons" like this:
      1. Leak copyrighted material in an easy-to-copy format to places where it will be copied
      2. Watch viral marketing campaign take over
      3. Profit
      4. Wait 'til revenue falls
      5. Find infringers using new scan tools
      6. Sue them
      7. Profit more!!!
      --
      All ideas^H^H^H^H^Hprocesses in this post are Patent Pending. (as well as the process of patenting all postings)
    4. Re:Dupe by PTBarnum · · Score: 1

      How do DRM and fingerprinting have the same issues involved? One is trying to prevent you from decoding a file you possess, and the other is trying to recognize it. The usually stated problem with DRM is that they have to give you both the content and the key, thus it really just amounts to security by obscurity.

      While it would certainly be a difficult AI problem to write software that can recognize music or video as well as a human can, I see no reason why it should be theoretically impossible. If you try to change the file so much that it is no longer recognizable to the software, what guarantee is there that it will still be recognizable to humans? When I download the latest episode of my favorite TV show, I want it to be a high fidelity copy, not a badly distorted mess.

      I concede that the level of distortion necessary to defeat recognition by today's technology is probably very minor and not noticeable by humans, but unlike DRM, there is no fundamental guarantee that software recognition will always be so lame.

    5. Re:Dupe by novus+ordo · · Score: 1

      The legal implications of this tool greatly outweigh the technical considerations. Especially when you consider that there is a good chance somebody from another country might be infringing and then you get into a big mess of bureaucracy. But I think these sorts of ventures will ultimately fail because they underestimate the honesty of most people. See this interesting little tidbit from Freakonomics for a telling example.

      --
      "You're everywhere. You're omnivorous."
  4. Oh noez!! by Anonymous Coward · · Score: 0

    Yeah, so what? Is it unknown that the internet is 98% of illegal crap?

    1. Re:Oh noez!! by Anonymous Coward · · Score: 0

      Yeah. 98% is illegal crap. The rest is GOOD illegal stuff.

  5. buh by lucky130 · · Score: 5, Insightful

    "as little as a few sentences of text or a few seconds of audio or video"

    Like quotations in a paper, or video snippets in an educational presentation?

    1. Re:buh by brouski · · Score: 1
      The software won't know the difference of course.

      I suppose it will be up to the copyright owner to determine whether any given hit is actionable.

      --
      Proud member of the American Non Sequitur Society. We might not make much sense, but boy do we love pizza!
    2. Re:buh by silentounce · · Score: 1

      Yes, it flags them. Then you can check the source to see if they are infringing on the entire document. I doubt anyone is going to come down hard on something if it is only a two second sample or a few sentences and the source is notated.

      --
      There are many tongues to talk, and but few heads to think. -Victor Hugo
    3. Re:buh by Anonymous Coward · · Score: 0

      I suppose it will be up to the copyright owner to determine whether any given hit is actionable.

      Hopefully the copyright owners will actually go through steps to determine if action needs to be pursued rather than the apparent tactics of the RIAA. Unfortunately, though, I fear the latter. Imagine all those slide shows people have put together with the family photographs and some song in the background. I doubt that most of them contain original work.

      Jim

    4. Re:buh by NeutronCowboy · · Score: 4, Insightful

      You're assuming anyone is going to manually verify any of the results. From my experience with people using monitoring software (especially non-techies who are simply consumers of the technology, but who provided the money for it), the vast majority of them are simply going to call their lawyers when they see the dashboard light up. I see vast letter writing campaigns come from this, with little actual infringing being prosecuted.

      This is a scary product. Not so much because of the technology behind it, but because of how it is going to be implemented and (ab)used.

      --
      Those who can, do. Those who can't, sue.
    5. Re:buh by Reziac · · Score: 1

      I had the same thought. This is going to catch an awful lot of "fair use" snippets in the crossfire.

      It wouldn't be so bad if the crawler would then further verify that the ENTIRE work was present and infringed, but you can bet it'll lead to a flurry of half-cocked threats instead.

      --
      ~REZ~ #43301. Who'd fake being me anyway?
    6. Re:buh by grimJester · · Score: 1

      Does it matter? Normal people can't afford to defend themselves in court, so the money will roll in regardless.

      Hmm.. Is that profit I smell?

    7. Re:buh by lucky130 · · Score: 1

      I was actually assuming they wouldn't manually verify things like that and just automate the lawsuits too, it's just that the first two things I thought of when reading that statement were legitimate uses.

  6. No fear ! by Rastignac · · Score: 1

    There's no copyrighted pirated things on my computer, so I don't fear th+++$£%+ NO CARRIER

    --
    -- Rastignac was here.
  7. Spam obfuscation techniques suddenly useful... by scottsk · · Score: 1

    Seems like spam obfuscation techniques will be useful against this sort of scan, too, if someone really wanted to infringe on copyright.

  8. At least somebody knows it: by Veetox · · Score: 1

    "'We all know that as soon as somebody comes up with a way to secure a piece of property, somebody else will come within days and crack it,' says Lawrence Iser." ...It's the damn-hard truth...

  9. i don't like robots.txt anyway. by Anonymous Coward · · Score: 0, Insightful

    If you don't want it on the public Web, don't put it there in the first place.

    1. Re:i don't like robots.txt anyway. by FooAtWFU · · Score: 5, Informative
      You're absolutely right that "if you don't want it on the public Web, don't put it there in the first place" -- but there are still times when you have a legitimate reason that you don't want a page indexed, downloaded, or otherwise visited by a robot. Dynamically generated content is one example reason; sometimes certain pages can be a big drain on your website, and you'd prefer not to have every spider in the world hitting them up every few minutes.

      Let's take a fun legitimate site like, oh... Wikipedia:

      # Folks get annoyed when VfD discussions end up the number 1 google hit for
      # their name. See bugzilla bug #4776
      # en:
      Disallow: /wiki/Wikipedia:Articles_for_deletion/
      Disallow: /wiki/Wikipedia%3AArticles_for_deletion/
      Disallow: /wiki/Wikipedia:Votes_for_deletion/
      Disallow: /wiki/Wikipedia%3AVotes_for_deletion/
      Disallow: /wiki/Wikipedia:Pages_for_deletion/
      Disallow: /wiki/Wikipedia%3APages_for_deletion/
      Disallow: /wiki/Wikipedia:Miscellany_for_deletion/
      Disallow: /wiki/Wikipedia%3AMiscellany_for_deletion/
      Disallow: /wiki/Wikipedia:Miscellaneous_deletion/
      Disallow: /wiki/Wikipedia%3AMiscellaneous_deletion/
      Disallow: /wiki/Wikipedia:Copyright_problems
      Disallow: /wiki/Wikipedia%3ACopyright_problems
      (They also disallow certain specially generated pages like Special:Random, and any of the pages which actually let you edit the site).

      Let's see, what are some other sites? Ooh. Take a look at Slashdot's robots.txt! (disallows a variety of fun pages.) Microsoft's? How about whitehouse.gov? Google?
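For the curious, this is roughly what "honoring robots.txt" looks like from the crawler's side, sketched with Python's standard `urllib.robotparser`. It is a minimal illustration: the `User-agent: *` line is an assumption added here because the excerpt above omits it, and `ExampleBot` is a made-up crawler name.

```python
from urllib.robotparser import RobotFileParser

# Two Disallow lines taken from the excerpt above. The User-agent line is an
# assumption added for this sketch; rules only apply under a User-agent record.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/Wikipedia:Articles_for_deletion/
Disallow: /wiki/Wikipedia:Copyright_problems
"""


def build_parser(robots_text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed(parser, url, agent="ExampleBot"):
    """Return True if a polite crawler named agent may fetch url."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    rp = build_parser(ROBOTS_TXT)
    base = "https://en.wikipedia.org/wiki/"
    print(allowed(rp, base + "Wikipedia:Articles_for_deletion/Some_page"))
    print(allowed(rp, base + "Slashdot"))
```

Nothing enforces this check, of course. A crawler that skips it simply fetches the page, which is the point of the wager at the top of the thread.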

      --
      The World Wide Web is dying. Soon, we shall have only the Internet.
    2. Re:i don't like robots.txt anyway. by Anonymous Coward · · Score: 1, Interesting

      I did that a while ago to my university's website and found an excel spreadsheet that was used as an inventory of chemicals, what room they were in, and the access codes to the rooms. No I didn't do anything with said information.

    3. Re:i don't like robots.txt anyway. by Monoliath · · Score: 1

      Yeah, see, intelligent people understand this concept, but those with dollar signs in their eyes and ass just 'don't get it'.

      This is only going to fuel the fire and cause programmers to write scripts to screw up such scans.

      I think someone said it best in another post earlier on today in another article:

      "Freedom never gets easier to defend"

    4. Re:i don't like robots.txt anyway. by mandelbr0t · · Score: 5, Informative

      Dynamically generated content is one example reason; sometimes certain pages can be a big drain on your website

      And dynamic content is, of course, the answer. If I'm going to put up copyrighted content in the future, I'd use one of a dozen schemes that regenerate the download link on a per-session basis. Obviously they're not going to honour robots.txt, but why are your links readable by such a basic spider? You need to:

      1. Disallow anonymous downloads. You need to be logged onto the site to download anything, torrent or otherwise
      2. Use a CAPTCHA to prevent spiders from signing up for said accounts
      3. Use the session id to generate unique download links on a per-session basis
      4. Change the key on your BitTorrent tracker every 12-24 hours. This will require that a downloader get the latest torrent from the original website (which requires login), reducing the impact of a leaked torrent
      5. Compress and possibly encrypt the content so that it's less obvious what it is

      Anyone who follows the above steps (and most sites already do most or all of this) won't be found by the spider. Period.
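Step 3 above (per-session download links) could be sketched in Python as follows. This is one possible approach using an HMAC over the session id and file id, not necessarily what any given site does. `SECRET_KEY`, the `/download/` URL layout, and the function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real site would load a random value
# from configuration rather than hard-coding it.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def make_download_token(session_id, file_id):
    """Derive a token that is only valid for this session and this file."""
    msg = f"{session_id}:{file_id}".encode("utf-8")
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()


def make_download_link(session_id, file_id):
    """Build the link shown to a logged-in user (URL layout is illustrative)."""
    token = make_download_token(session_id, file_id)
    return f"/download/{file_id}?token={token}"


def is_valid_token(session_id, file_id, token):
    """Check a presented token against the requesting session."""
    expected = make_download_token(session_id, file_id)
    return hmac.compare_digest(expected, token)


if __name__ == "__main__":
    link = make_download_link("session-abc", "file42")
    print(link)
    token = link.split("token=")[1]
    print(is_valid_token("session-abc", "file42", token))
    print(is_valid_token("session-xyz", "file42", token))
```

A link copied out of one session fails the check in any other session, so a spider that harvests URLs from a leaked page gets nothing it can fetch.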

      The only thing I can think of that this product would be useful for is to find people who have blatantly copied my website, but I'm sure you could find those people equally easily with Google.

      mandelbr0t

      --
      "Please describe the scientific nature of the 'whammy'" - Agent Scully
    5. Re:i don't like robots.txt anyway. by Anonymous Coward · · Score: 1, Funny

      Seems like it would have been easier for whitehouse.gov to just use the following:

      Disallow: *

      (if that's even the correct syntax)

    6. Re:i don't like robots.txt anyway. by Anonymous Coward · · Score: 0
      Disallow: *

      (if that's even the correct syntax)

      Disallow: /

      Posted by k2spider
      Fighting the machine
  10. My first thought.. by FunWithKnives · · Score: 1

    My first thought upon reading this is that we're going to see this type of thing on a wider scale. I wouldn't doubt it for a second. Corporations in this age have a tendency to blindly target anyone for anything relating to copyright or trademark.

    I guess we just have to wait and see. Maybe these companies' collective IQ has suddenly jumped twenty points or so.

    --
    "We may face a scorched and lifeless earth, but they're accountable to their shareholders first."
  11. Yeah.. good luck with that. by Rob+T+Firefly · · Score: 1
    The Wall Street Journal is reporting on a tech start-up that proposes to offer the ultimate in assurance for content owners.
    This almost had me going until the second half of the sentence. When has anyone ever offered any product as the "ultimate" anything that ultimately proved to actually ultimately be the ultimate whatever it was?
    1. Re:Yeah.. good luck with that. by banerjek · · Score: 0

      This almost had me going until the second half of the sentence. When has anyone ever offered any product as the "ultimate" anything that ultimately proved to actually ultimately be the ultimate whatever it was?

      My reaction was similar. I'm wondering how they intend to do an effective scan without getting locked out of everything. It's not nice to systematically scan systems and download files. Many folks will treat that as an attack and take appropriate measures.

  12. Fighting an avalanche with a snow shovel by TheWoozle · · Score: 4, Insightful

    Doesn't this merely serve to point out the absurdity of "Intellectual Property"?

    --
    Insisting on "correct" English is like saying that there is only one, definitive recipe for chili.
    1. Re:Fighting an avalanche with a snow shovel by Anonymous Coward · · Score: 0, Troll

      no it just points to the efforts required to stop freeloading scumbags taking other people's hard work.

    2. Re:Fighting an avalanche with a snow shovel by cryfreedomlove · · Score: 1

      TheWoozle,

      Today's world of copy protection is voluntary. You have the right to produce content that people want and to waive copyright on it. That's your free choice. Are you doing that? If not, then why not?

    3. Re:Fighting an avalanche with a snow shovel by Anonymous Coward · · Score: 0

      It seems it also points to the effort required by freeloading scumbags taking other people's hard work.

      What is the moral difference between a cartel using its de facto control over the industry to profit from others' works and pirates distributing said works for no profit?

    4. Re:Fighting an avalanche with a snow shovel by TheWoozle · · Score: 1

      At least in the U.S. (where I'm from), copyright is an "opt out" form of copy protection. I'd rather it was "opt in".

      Early physical and psychological development in humans is spurred by, and social behavior is learned through, imitation. We are, it appears, hard-wired to imitate other humans. Art and self-expression are rooted in imitation of others and almost all art forms are taught by imitation (called "technique") and most art is derivative of earlier expression.

      In light of all this, it seems absurd to insist that anything I think, say, or do is "original." It seems doubly absurd to insist that because I thought of something "first", that I and my descendants deserve compensation in perpetuity for my "original" idea.

      As widely discussed here on Slashdot, the current basis for "Intellectual Property" in the U.S. is patent and copyright. It appears that the framers of the U.S. Constitution shared the same idea that all works are at least in part derivative. They attempted to create a framework wherein people would have an incentive to *share* their ideas, after a *limited* time in which to capitalize on the *expression* of their ideas. I personally take issue with the idea that this conveys "property" rights, in either the legal or practical sense.

      In closing, I defy anyone to identify my personal infringement of intellectual "property" that may or may not be contained in the firings of the neurons in my brain, or the unlicensed performances for friends of portions of movies, etc. That's some "property" (*laugh*)...you can't even identify who is currently in possession of it at any given point in time!

      --
      Insisting on "correct" English is like saying that there is only one, definitive recipe for chili.
    5. Re:Fighting an avalanche with a snow shovel by Anonymous Coward · · Score: 0

      The moral difference is VERY simple to see - in the first instance, the artist is agreeing to it. In the second one, they are not. How do you not see that?

    6. Re:Fighting an avalanche with a snow shovel by cryfreedomlove · · Score: 1

      I'd rather it was opt in as well. However, you could still be a producer and opt out. Have you done that?

  13. Raise. by Tackhead · · Score: 3, Funny
    > Anybody care to place a friendly wager that they're not going to honor robots.txt?

    127.0.0.1: $ cat robots.txt
    # robots.txt for 127.0.0.1
    # This file is copyright 2006 by me.
    User-agent: AttributorCorporationDMCABot
    Disallow: *

    And if they do honor robots.txt, I'll be able to sue the fuckers for infringing on my copyright, because they must have read it in order to honor it.

    1. Re:Raise. by Hijacked+Public · · Score: 1
      Good luck with that.

      Unless you also sell a few companies and put together a few billion as a stake to hand over to attorneys I suspect you'll fare as poorly as everyone else does.

      --
      "Sacrifice for the good of The State" - The State
    2. Re:Raise. by rhartness · · Score: 2, Insightful

      You know, I've actually had a thought along those lines in trying to explain to untechnologically savvy individuals why Digital Rights laws are screwed up and that handling digital content on the web is a grey area. Consider the following.

      Most web sites have a copyright statement on them somewhere (even this one!). Technically speaking, if I go to that web site, my browser copies the page along with all its media content and caches it. Since many of those sites do not have a terms of service posted allowing the viewing of the content through regular web browsing my computer is therefore violating copyright laws, right?

      Every single web user out there is breaking the law!

    3. Re:Raise. by commodoresloat · · Score: 1

      Reading something does not violate its copyright. If they distribute copies of robots.txt you might have a case of some sort.

    4. Re:Raise. by Mayhem178 · · Score: 5, Funny

      127.0.0.1: $ cat robots.txt
      # robots.txt for 127.0.0.1
      # This file is copyright 2006 by me.
      User-agent: AttributorCorporationDMCABot
      Disallow: *


      Hahaha! You screwed up! I have your IP address now! I will send 127.0.0.1 to every company that uses the sniffer and tell them the person at that IP is an evil, evil person who exploits innocent people for their own profit and power!

      --

      "You will pay for your lack of vision..." - Emperor Palpatine to Ray Charles

    5. Re:Raise. by advocate_one · · Score: 1
      Reading something does not violate its copyright. If they distribute copies of robots.txt you might have a case of some sort.

      how can you read it on the web then without having made a copy of it somewhere on your computer... you've pulled in a copy of it using your browser, there is now a copy of it in ram and also maybe in the cache... so you've made at least two unauthorised copies.

      --
      Donald 'Duck' Dunn: We had a band powerful enough to turn goat piss into gasoline.
    6. Re:Raise. by FooAtWFU · · Score: 3, Interesting

      You joke, of course, but there are tools out there to detect when a bot is abusing your site and not following robots.txt. The usual technique is to hide a few links in your page, and also have these links blocked by robots.txt. When a user visits the link, they're banned from viewing the site. (Sometimes, a CAPTCHA-like utility for unblocking yourself is presented along with the 403 page, in the event that a particularly curious user manages to find the link and activate it manually.)
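The hidden-link trap described above can be sketched in a few lines of Python. This is a toy in-memory model rather than any particular tool: the trap path and the `TrapBanList` class are hypothetical, and a real deployment would persist bans and add the unblocking step mentioned above.

```python
# Hypothetical trap URL. The same path would be listed under Disallow in
# robots.txt and linked invisibly from real pages.
TRAP_PATHS = {"/trap/do-not-follow"}


class TrapBanList:
    """Ban any client that requests a path robots.txt told it to avoid."""

    def __init__(self, trap_paths):
        self.trap_paths = set(trap_paths)
        self.banned = set()

    def handle(self, client_ip, path):
        """Return the HTTP status this request should receive."""
        if client_ip in self.banned:
            return 403
        if path in self.trap_paths:
            # Only a client ignoring robots.txt should ever get here.
            self.banned.add(client_ip)
            return 403
        return 200


if __name__ == "__main__":
    site = TrapBanList(TRAP_PATHS)
    print(site.handle("198.51.100.7", "/index.html"))
    print(site.handle("198.51.100.7", "/trap/do-not-follow"))
    print(site.handle("198.51.100.7", "/index.html"))
```

Once a client touches the trap, every later request from the same address is refused, including requests for ordinary pages.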

      --
      The World Wide Web is dying. Soon, we shall have only the Internet.
    7. Re:Raise. by eosp · · Score: 1

      And then when the FBI knocks on your door asking for that IP...

    8. Re:Raise. by civilizedINTENSITY · · Score: 1

      Perhaps it doesn't matter because you aren't distributing the copies?

    9. Re:Raise. by Anonymous Coward · · Score: 0

      anyone know what range of IP addresses they use ??
      a little bit of htaccess and they'll only ever see the clean version of my site

      actually that sounds like too much work, i'll just redirect them to their own company website

    10. Re:Raise. by PPH · · Score: 1

      What you need is some language about prohibiting web crawlers from using the site in violation of the settings contained in this file and the penalties for such violations (get an actual lawyer to write it up). Then, go after the violators.

      --
      Have gnu, will travel.
    11. Re:Raise. by PPH · · Score: 1

      Make sure you include a line in your copyright to send all notices of legal action to Postmaster@mouse-potato.com

      --
      Have gnu, will travel.
    12. Re:Raise. by Da_Weasel · · Score: 1

      It's called fair use. Maybe using one of the "Offline" browsing options in browsers might step over the fair use line, but the cached copy and in memory copy don't.

      --
      If you must!
    13. Re:Raise. by Kamiza+Ikioi · · Score: 2, Insightful

      True, but there's a way around that as well. Any robot service worth its weight in fiber has more than one IP, and can have multiple subnets. Best way is to dump robots.txt links to a separate subnet, have it check later in the day. If the IP gets banned, it can check by trying to access the main page, see if it starts getting errors. It can then mark "booby-trap" sites on a list, and route around either the specific triggers or actually honor the robots.txt.

      You have to have more links than they have IPs to stop a full scan. Of course, if even one link bans, they can just pay a guy to sit on a few major ISP provider accounts and manually check your robot links. Then they don't care if you ban, because you'll have to ban entire regions of the world as they bounce around with multiple dynamic IPs. If you have this as automated subnet banning, you'd actually help them out by allowing them to set your bans across major ISPs... especially if you have any content they deem questionable, you just gave them a way to shut you down remotely.

      Of course, the rule here is to never ever automate subnet bans on a public access site... but then you still can't stop them either way.

      --
      I8-D
    14. Re:Raise. by Anonymous Coward · · Score: 0

      One word: tarpit.

      Make dynamic pages which a normal user cannot (reasonably) get into. Make an infinite number of these pages (dynamically creating them makes this trivial). Make the pages download slowly (a few characters per IP address per second). No robot is ever going to survive that.
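
      A minimal sketch of such a tarpit page generator in Python. The /trap/<n> URL scheme, the function name and the default delay are assumptions made for illustration; a real deployment would also need per-IP bookkeeping and a web server around it, both left out here.

```python
import time
from typing import Iterator


def tarpit_page(page_id: int, delay_s: float = 1.0, chunk_size: int = 4) -> Iterator[str]:
    """Yield one dynamically generated page a few characters at a time."""
    # Every page links to the next page id, so the set of pages never ends
    # and nothing has to be stored on disk.
    body = (
        f"<html><body><p>Page {page_id}</p>"
        f'<a href="/trap/{page_id + 1}">next</a></body></html>'
    )
    for start in range(0, len(body), chunk_size):
        time.sleep(delay_s)  # throttle: chunk_size characters per delay_s seconds
        yield body[start:start + chunk_size]


if __name__ == "__main__":
    # Demo with the delay switched off so it finishes immediately.
    print("".join(tarpit_page(1, delay_s=0.0)))
```

      Listing /trap/ under Disallow in robots.txt would keep well-behaved crawlers out, so only robots that ignore the file end up in the slow lane.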

  14. Yeah by Hijacked+Public · · Score: 3, Interesting
    FTFA:

    If it works, it's a fantastic invention


    Its purpose aside, yes, it would be a fantastic thing to be able to scan the entire web and reliably identify the context and content of any specific media file type. Video, audio, image, etc. Particularly if it could identify purposely obfuscated content.

    I'm in what is almost certainly a tiny minority of Slashdotters in that I actually create copyrightable material rather than only consume it. I'm again in the minority in that I think copyrights are a good thing and again in the minority in that I can separate out the purpose of copyrights and the evil actions of the legal arms of **AA companies.

    Regardless, while scanning the internet for improperly used material sounds great on paper this will probably end up being as effective as finding water with a divining rod. The current tactic of locking down things at the hardware and OS levels will get more support from the media companies, not that they seem all that good at choosing tactics when the internet is involved.

    --
    "Sacrifice for the good of The State" - The State
    1. Re:Yeah by jedidiah · · Score: 3, Insightful

      There's a wide gulf between copyright being a good idea in concept and being sensibly implemented in its current form.

      Not everyone that creates content thinks that draconian enforcement attempts are a good idea, or even in the best interests of those that create content.

      If your work can't survive in the marketplace, which includes the prospect of everyone on the planet getting to use it for free, then perhaps you should get some sort of more conventional day job.

      The difference between a game that sells 50K and one that sells 5 Million has nothing to do with DRM.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    2. Re:Yeah by AdamKG · · Score: 4, Interesting
      and again in the minority in that I can separate out the purpose of copyrights and the evil actions of the legal arms of **AA companies.
      Let's make one thing clear: the RIAA/MPAA lawsuits are not, in any way, shape, or form, an abuse, negative side of, misapplication or malicious use of Copyrights. They fulfill the role of Copyrights in the first place; they are the logical end result of a system that says citizens are allowed to distribute ideas (or expressions of ideas), then stop any further distribution of them.

      The **AA lawsuits are ridiculous, yes. But the ridiculous part is not the litigation itself, it's the laws under which the lawsuits are brought.
      --
      groupthink: It's good for self-esteem.
    3. Re:Yeah by kanweg · · Score: 3, Interesting

      I'm a patent attorney and no stranger to IP. Having said that, any IP law is, or at least should be, a balance between, on the one hand, freedom to operate (both for IP users and for IP creators) and, on the other hand, a means of compensation for IP creators. For patents, that balance is not there for patents on software. Also for patents, at least they last for 20 years max. For copyright, that balance is not there. And I'm curious to hear whether you think it is a good thing that whatever you create is still under copyright more than 40 years after you die.

      Bert

    4. Re:Yeah by Anonymous Coward · · Score: 0

      Hell, most slashdotters do create copyrightable material. That email you sent to your sysadmin? Copyrightable (oops, almost said girlfriend there). That comment you wrote on Slashdot? Copyrightable (well, nevermind. Most are dupes).

      Copyright © me, 2006

    5. Re:Yeah by fatman22 · · Score: 1

      Copyrights and patents are there to protect the ownership of, and distribution/licensing rights to, original works created or invented by people. They should belong solely to the creator(s) or inventor(s) of the works or ideas and be nontransferable and non-inheritable.

    6. Re:Yeah by DeadChobi · · Score: 1

      Just one little niggle, but citizens are most certainly not required to stop distributing an idea once the implementation of that idea is copyrighted. Otherwise there would be no more crappy songs about high school relationships on the radio after the first, as the idea of obsessively romantic love will have been copyrighted. The idea is to expressly prohibit the copying of a specific expression of an idea while still maintaining everyone's right to love each other like idiots, for example.

      --
      SRSLY.
    7. Re:Yeah by Hijacked+Public · · Score: 1

      And I'm curious to hear whether you think it is a good thing that whatever you create is still under copyright more than 40 years after you die


      No I do not think that life+40 years is a good thing. Any length of time is likely to be some arbitrary guess, but anything more than the life of the creator is too long in my estimation.

      These repeated attempts by media companies to extend the time periods for both their copyright and sometimes mine make a lot of news here and are often held up as examples of the way copyrights have been bent against the public. When compared with the reality of file sharing they matter very little though. A look at the most popular torrents or the number of files returned from search results in traditional P2P applications reveals that the bulk of material being traded is very recently released stuff.

      --
      "Sacrifice for the good of The State" - The State
    8. Re:Yeah by Anonymous Coward · · Score: 0

      If your work can't survive in the marketplace, which includes the prospect of everyone on the planet getting to use it for free, then perhaps you should get some sort of more conventional day job.

      You know, the Beatles were rejected by Decca records who didn't think they were any good. EMI wasn't too sure about them either. If the Internet had been around in the 50s and 60s, Decca and EMI would have been irrelevant. And the Beatles still would have been *rich* even though everyone would have been downloading their stuff for free. How ironic.

    9. Re:Yeah by teamhasnoi · · Score: 1

      Hell, most slashdotters do create copyrightable material. That email you sent to your sysadmin? Copyrightable (oops, almost said girlfriend there). That comment you wrote on Slashdot? Copyrightable (well, nevermind. Most are dupes).

      Copyright © me, 2006

      I'm in ur Slashdot
      Infringing ur copyrights
      Fair Use in da house?
    10. Re:Yeah by Anonymous Coward · · Score: 0

      It actually DOES make a difference for me, but not what they think - I am about 4x less likely to buy a game with DRM, and if there is a game that is just as good with no DRM, I will always buy that instead. It will be an endless battle, more DRM, less buying because of it, media companies screaming "See, our sales have dropped because the restrictions aren't strict enough! More DRM!", putting more restrictive DRM in their media, see step one.

    11. Re:Yeah by Laur · · Score: 1
      I'm in what is almost certainly a tiny minority of Slashdotters in that I actually create copyrightable material rather than only consume it.
      Everyone who posts on Slashdot creates copyrighted material.
      --
      When you lose something irreplaceable, you don't mourn for the thing you lost, you mourn for yourself. - Harpo Marx
    12. Re:Yeah by DamnStupidElf · · Score: 1, Insightful

      I'm in what is almost certainly a tiny minority of Slashdotters in that I actually create copyrightable material rather than only consume it. I'm again in the minority in that I think copyrights are a good thing and again in the minority in that I can separate out the purpose of copyrights and the evil actions of the legal arms of **AA companies.

      Tiny minority? Everyone who posts to slashdot is creating copyrighted material. Everyone who sends an email or writes on a post-it note is creating copyrighted material. Everyone with a myspace account creates copyrighted material. Don't pretend that you are part of an elite minority with special rights, the fact is that the creation and distribution of information is a normal human activity that everyone in the civilized world participates in. As part of this activity it is extremely common to quote other people and to comment on what they've said or created. It is likewise very common to share books, movies, and pictures with friends and other people either to simply share with them or to get their opinion on something. The fact that the Internet allows copies of information to be shared instead of one single physical object in no way changes the social implications of the sharing. Basically, copyright law is a horribly broken restraint on free society in the information age and is just being milked for money by the **AA and other companies.

    13. Re:Yeah by Relic+of+the+Future · · Score: 1
      "I'm in what is almost certainly a tiny minority of Slashdotters in that I actually create copyrightable material"

      Well aren't we all high-and-mighty. Forget something though?

      "All trademarks and copyrights on this page are owned by their respective owners. Comments are owned by the Poster."

      (Virtually) EVERY expression of an idea is copyrightable; including every lame post made to /.. You've fallen for the same trap as so many others (artists, politicians, even everyday people) of believing that it only "counts" if it's used to turn a profit.

      --
      Those who fail to understand communication protocols, are doomed to repeat them over port 80.
    14. Re:Yeah by grcumb · · Score: 1
      I'm a patent attorney and no stranger to IP.

      Are you indeed? Then you should know better than to use the term 'Intellectual Property'.

      You of all people should know that no such thing exists - certainly not under the laws of any country I've ever had the leisure to study. A lawyer of all people should know better than to bandy inaccurate, misleading terms about. I believe the reason is that unwise talk such as that can come back to... what's the legal term again? Ah yes: bite you in the ass. 8^)

      --
      Crumb's Corollary: Never bring a knife to a bun fight.
    15. Re:Yeah by kanweg · · Score: 1

      First of all, thanks for the reply. I was really curious about it. I'm also pleased to hear that the current period is long in your view.

      I appreciate your point about bit torrent. Personally I do buy all my music and movies (but get very angry about being forced to watch the copyright notices, something copyright infringers don't have to endure). But the opposite is also lamentable: There currently is a push to create a broadcast right. Take a non-copyrighted creation, broadcast it, and you're the new copyright owner! Things like software patents blur the vision of many about the clever workings and usefulness of a good patent law. Similarly, things like silly copyright terms, underpayment of creators by megacorps and said silly broadcast proposal will not help creators as people will start losing faith/will readily grab the excuse not to abide by it. That is neither to your advantage, nor to that of the society.

      Bert

    16. Re:Yeah by rohan972 · · Score: 1

      Copyrights and patents are there to protect the ownership of, and distribution/licensing rights to, original works created or invented by people.

      bzzzt! wrong!

      In the US, at least, copyrights and patents are there to promote the progress of science and useful arts. Protecting "ownership" of the works is a means to an end, not the purpose itself. The government is not permitted by the US constitution to use patents and copyrights to pursue any other goal.

      To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries;

    17. Re:Yeah by rohan972 · · Score: 1

      A look at the most popular torrents or the number of files returned from search results in traditional P2P applications reveals that the bulk of material being traded is very recently released stuff.

      I think this could well change if copyrights were 14-20 years as they were. I think many people, if faced with the choice of downloading a movie from the 80's legally or downloading a recent movie illegally would choose the legal download. I know I would. Of course, many wouldn't care and would take the recent movie. Court action against infringers would also probably be seen as more reasonable.

      As it is, I can rent old movies for $1 each for a week (if I get them on Tuesday), which is cheaper than my download costs for a movie anyway.

  15. Thank goodness... by Anonymous Coward · · Score: 0

    ... someone is finally looking out for the little guy, the defenseless one who is always run over roughshod and never has the resources to defend himself... thank goodness someone is finally defending the copyright holder.

  16. and in little pieces, they will consume bandwidth by way2trivial · · Score: 1

    roughly equal to the entire volume of the publicly available internet..

    think about it, to do what they say, they have to request ALL the data they can lay their hands on,
    and then chuck it.. and for comparative purposes, they'll have to do it again.

    so Sony hires 'jfm copyright trackers'
    and microsoft hires 'sco copyright trackers'
    and mgm hires yo momma

    and each of these 'ip owners' representatives have to scour the entire net, bit by byte by megabyte, for their clients.

    holy crap! think about the potential for bandwidth abuse- it's a corporate ddos- upping bandwidth bills for everyone
    who pays for use...

    as to the asswipes who suggest they 'use google' think about that- how much luck do you expect they'll have hitting google for their entire cache.... (and google pays for bandwidth too)

    --
    every day http://en.wikipedia.org/wiki/Special:Random
  17. Software is in beta by Weaselmancer · · Score: 2, Funny

    Attributor plans to help automate the interaction between content owners and those using their content on the Web, though it declines to specify how.

    And apparently being written by underpants gnomes.

    --
    Weaselmancer
    rediculous.
  18. Some interesting questions... by PingSpike · · Score: 4, Insightful

    Great, now all the torrent sites will require captcha verification too! ;P

    Actually, can they even scan torrents without downloading the entire file? And what's to stop everyone from just blocking them from accessing their websites? Are they going to go in covertly, pretending to be actual users? I can see every legit website blocking their access as well, why pay for bandwidth to supply that?

    Sure, youtube can be more efficiently attacked...but youtube has been dancing in front of the cannons since its inception, we all knew it was going to get shot eventually.

    1. Re:Some interesting questions... by Anonymous Coward · · Score: 0

      > Great, now all the torrent sites will require captcha verification too! ;P

      I'm not being facetious here, but most high-level private torrent sites already do. I could name three or four off the top of my head, if naming them on a public site wouldn't earn me instant bans, probably not just for me but everyone on my ISP, for doing so.

    2. Re:Some interesting questions... by NeutronCowboy · · Score: 1

      Here's another thought: what if your copyright license expressly forbids this kind of downloading? Can you then sue whoever downloaded your home grown musical, fanfic or picture of your cat via that tool?

      Then again, this entire counter-suing point is completely moot. Very few individuals have the money to slug it out in court with large media publishers, and not too many businesses can either.

      --
      Those who can, do. Those who can't, sue.
  19. Dashboard by AVee · · Score: 1

    This must be really essential business software. It has a Dashboard! Wanna bet the next version is SOA enabled?

  20. search by hash? by straponego · · Score: 3, Interesting
    Does Google allow searching by md5sum or equivalent? I'm sure they have the capability. While not as impressive as what this company claims, it'd also be more reliable for unaltered media files.

    But it looks like the real "innovation" these guys are pushing toward is fully automated filing of lawsuits. I think that was in Accelerando, which is fantastic, and which you can download for free.

    1. Re:search by hash? by Johann+Lau · · Score: 4, Informative

      "Unaltered media files" are the exception, not the rule. Changing even a bit of metadata (stripping exif from an image, changing an mp3 tag) would change the checksum, not to mention things like putting things into an archive, resizing images, (re)recompressing music.

      But yeah, it might make sense for Google to become "aware" of unique content and variations of it.. but I doubt they'd ever use that openly for (aiding in) hunting down copyright infringement, simply for PR reasons.
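
      To make the checksum point concrete, here is a small Python illustration. The two byte strings are made-up stand-ins for a media file before and after a tag edit, not real MP3 data; only the one metadata character differs.

```python
import hashlib

# Hypothetical stand-ins for a media file: a tiny "tag" followed by payload bytes.
PAYLOAD = b"\x00\x01\x02\x03" * 1000
original = b"title:Song A|" + PAYLOAD
retagged = b"title:Song B|" + PAYLOAD  # same payload, one metadata byte changed


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the whole byte string as hex."""
    return hashlib.md5(data).hexdigest()


if __name__ == "__main__":
    print(md5_hex(original))
    print(md5_hex(retagged))
    print("match:", md5_hex(original) == md5_hex(retagged))
```

      The payload is byte-for-byte identical in both cases, yet a search keyed on the whole-file digest would treat them as unrelated files.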

    2. Re:search by hash? by sootman · · Score: 1

      They're still common enough to make md5-based searching a very useful tool. And, in fact, a lot of stuff does just get downloaded from here and posted there with no change.

      --
      Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
    3. Re:search by hash? by sootman · · Score: 1

      I wrote to google and yahoo a year or two ago suggesting they implement this but never heard back from either and have not seen it implemented. (Please, everyone WRITE!*) It would be the COOLEST THING EVER for a number of reasons. Say you downloaded a picture off the web. A year later, you stumble across it and decide you want to see if the site has any more similar pics. You could just md5 the image and search for that. (Of course this could be made very easy for non-technical users: Google could have a little 'search for this file' page, with an upload button just like you use for webmail attachments.)

      Of course it would also be easy to see if anyone else is offering a file you created, and there are bunches and bunches of other uses, too. If you're downloading a file and the hosting server is slow, you could learn its md5sum (from google, the same way you can see cached pages) and then search for that md5sum and see if anyone else is hosting it and know that you're getting exactly the file you want. (Very useful when you know that version x.y.n of a program works for you and x.y.(n+/-1) doesn't.) Plus, with a huge catalog of checksums, Google could also do some interesting research on real-world hash collisions as well.

      Yes, any time someone changes the image, the md5sum would be different, so it wouldn't be the be-all, end-all of course, but it would still be very useful in a lot of ways. Just because it isn't perfect is no reason not to do it.

      * http://www.google.com/support/pages/bin/request.py , http://feedback.yahoo.com/?prop=SiteExplorer (I think)

      --
      Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
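
      A toy sketch of the search-by-checksum idea in Python. The HashIndex class, its method names and the example.org/example.net URLs are all hypothetical; nothing here reflects how any real search engine stores or exposes checksums.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Set


class HashIndex:
    """Map the MD5 digest of a file's bytes to every URL it was seen at."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, Set[str]] = defaultdict(set)

    def add(self, url: str, content: bytes) -> str:
        """Record that `content` was found at `url`; return its digest."""
        digest = hashlib.md5(content).hexdigest()
        self._by_digest[digest].add(url)
        return digest

    def find(self, content: bytes) -> List[str]:
        """Return all known URLs hosting exactly these bytes."""
        digest = hashlib.md5(content).hexdigest()
        return sorted(self._by_digest.get(digest, set()))


if __name__ == "__main__":
    index = HashIndex()
    picture = b"fake image bytes"
    index.add("http://example.org/gallery/cat.jpg", picture)
    index.add("http://example.net/mirror/cat.jpg", picture)
    index.add("http://example.org/gallery/dog.jpg", b"other bytes")
    print(index.find(picture))
```

      Looking up the bytes of a saved picture returns every place the identical file was indexed, which covers both the "find the original site" and the "find a faster mirror" cases, as long as nobody has altered the file.
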
    4. Re:search by hash? by stivi · · Score: 2, Interesting

      Hm, what about computing a checksum of the actual media contents? For example, compute checksum only for sound data in MP3 or image data in image files, ignore all other data/metadata. Usually media files are containers for smaller objects or data streams... Resampled or modified contents would not be detected though.

      --
      First they ignore you, then they laugh at you, then they fight you, then you win.
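
      A rough Python sketch of the payload-only checksum for MP3 files. It assumes the only metadata is an ID3v2 tag at the front and/or a 128-byte ID3v1 tag at the end; other tag formats (APE, Lyrics3) are ignored, and the function names are made up for this example.

```python
import hashlib


def strip_id3(data: bytes) -> bytes:
    """Return the bytes between a leading ID3v2 tag and a trailing ID3v1 tag."""
    start, end = 0, len(data)

    # ID3v2: 10-byte header "ID3" + version (2) + flags (1) + size (4).
    # The size is "synchsafe": four bytes carrying 7 bits each.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)
        start = 10 + size
        if data[5] & 0x10:  # footer-present flag adds another 10 bytes
            start += 10

    # ID3v1: fixed 128-byte block at the very end, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return data[start:end]


def tagless_md5(data: bytes) -> str:
    """MD5 of the audio payload only, ignoring ID3 metadata."""
    return hashlib.md5(strip_id3(data)).hexdigest()
```

      Two copies of a file that differ only in their tags get the same digest this way. As the comment says, anything re-encoded or resampled still produces a different one.
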
    5. Re:search by hash? by straponego · · Score: 1

      I'd mod you up if I could. Another benefit of this would be the network effect on hashing tools. Yeah, any linux/osx/unix user has them already, and they're easy to get for Windows as well. But if google started exposing this, tool makers would follow. This would really boost infrastructure and standards for things like p2p apps, desktop search, backup tools, Internet-hosted storage, etc. The ??AA would also want to use it, of course, and this might even be a reason google has refrained from making it public so far but... I think the tech is more useful than damaging.

    6. Re:search by hash? by Johann+Lau · · Score: 1

      Shareaza peeps call that "tagless hashing": http://forums.shareaza.com/showthread.php?threadid=51258

  21. More reason to procure your warez... by Anonymous Coward · · Score: 0

    ...from Usenet. Still going strong after all these years.

    1. Re:More reason to procure your warez... by Thraxen · · Score: 1

      The first rule of Usenet...

      You know the rest.

  22. Copying is great! by MarkByers · · Score: 1

    After all they just copied http://copyscape.com/ 's idea.

    --
    I'll probably be modded down for this...
    1. Re:Copying is great! by asadodetira · · Score: 1

      That's the first thing that came to mind when I saw the article. It's been around for years. I've used it a few times and was amazed to find one of my random website texts in other people's work (It was properly cited so I don't complain).

  23. Negotiate Monitization? by eno2001 · · Score: 1

    Why the fuck does everyone want to be paid for every little thing these days? Sure, wholesale piracy is one thing. I disagree with the idea that people should be trading movies and music online with no restrictions at all. If you want an album, buy it. If you want software that costs something, buy it or learn to use free/open software. If you want to see a movie, pay to watch it in the theater or rent the DVD when it comes out. But, where this all falls apart is when someone quotes someone else online and that someone else feels they need payment for the quote. Or... someone uses a popular song as the music bed in their Youtube video and the entire video clip is only 25 seconds long or the quality is so poor that no one in their right mind would consider keeping it as something to put on their iPod. Or, someone edits together a bunch of clips from a popular movie to make a funny statement about something. These are all reasonable uses of copyrighted information that SHOULDN'T be charged for. If the industry had their way, rap music would have never happened (because it used previously recorded material) since many of the early rappers didn't have the money to pay for sample clearance. Everyone is being nickeled and dimed to death. Why? Why does it have to be this way. Whatever happened to the concept of fair use and encouraging people to build upon the works of others?

    --
    -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
    1. Re:Negotiate Monitization? by FireFury03 · · Score: 2, Funny

      If the industry had their way, rap music would have never happened

      I don't understand... your post seems to imply this is a Bad Thing?

    2. Re:Negotiate Monitization? by blugu64 · · Score: 1

      I've been waiting all my life to say what you said. Since you already did, if copyright nazis would kill rap...well....that almost makes it worth it....almost...

      --
      "Personal ownership is a hallmark of conservative capitalism. And I don't believe I am entitled to anything that I did n
    3. Re:Negotiate Monitization? by jo42 · · Score: 1

      There is a reason that rap is part of the word Crap...no?

  24. It's just a tool by 91degrees · · Score: 2, Insightful

    As long as it respects basic internet rules of conduct (including respecting robots.txt), then this is ethically neutral.

    It all depends on how it's used. Many companies would prefer to avoid copyright infringing material, and will take it down if the existence is pointed out to them. Many companies will simply be asking others to remove material which clearly and flagrantly breaches their copyright. This is perfectly reasonable behaviour.
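
    For reference, this is roughly what "respecting robots.txt" looks like from the crawler's side, sketched with Python's standard urllib.robotparser. The user-agent name ExampleCopyrightBot and the example.com URLs are invented; the article does not say what user agent the real service sends.

```python
from urllib.robotparser import RobotFileParser

# A site that shuts out one specific crawler and hides /private/ from the rest.
ROBOTS_TXT = """\
User-agent: ExampleCopyrightBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("ExampleCopyrightBot", "http://example.com/index.html"))
    print(allowed("SomeOtherBot", "http://example.com/index.html"))
    print(allowed("SomeOtherBot", "http://example.com/private/notes.html"))
```

    The check is entirely voluntary: it only works if the crawler runs it and reports its user agent honestly.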

    1. Re:It's just a tool by Anonymous Coward · · Score: 0

      Is it ethically neutral?

      According to them, they are actively going to known (or suspected) pirate sites, and downloading things at random. That's a lawsuit waiting to happen, especially if one of their competitors happens to have had some of their precious copyrighted information posted there.

  25. Current Engines... by Neutari · · Score: 1

    I would have thought that someone at google or microsoft would have thought this up long before. They are in the perfect position to make it happen.

  26. Maybe it can work both ways by Anonymous Coward · · Score: 1, Insightful

    Corporate plagiarism hurts the little guy too
    so maybe it's time a tool like this was in everybody's hands?

    http://www.robmanuel.com/2006/12/13/is-coke-ripping-off-the-little-guy/

  27. Freedom Tool Scans For +1, Patriotistic by Anonymous Coward · · Score: 0

    for Democracy Violations.

    Sincerely,
    K. Trout, EX-patriot

    P.S: F The President

  28. Fair Use Issues by MrLizard · · Score: 1

    Of course, "a few sentences of text or a few seconds of video" most likely are being used within legal fair use boundaries. So what's going to happen is that the corporate law firm will grab this program, then send out auto-takedown notices without a human being (to the extent anyone working in the legal department meets that criteria) ever looking to see if the use is even arguably a violation of copyright. Then you'll get the backlash where at least one such auto-generated letter makes its way to someone with the knowledge to fight back and the platform to do it from, and someone will have to issue an embarrassed apology, and then probably turn around and sue the software makers.

  29. Fool proof? by Virtualtaco · · Score: 0

    This is one of those ideas that when the piracy community decides to do something like reverse the names for all pirated content, or just use numbered files with lookup indices, the search software fails and the company goes out of business.

  30. what's their probability of false alarm? by Anonymous Coward · · Score: 2, Insightful

    This may be much less helpful than its promoters claim.

    First of all, what's their probability of a false alarm? Even if they false alarm fairly infrequently, the vast amount of content on the Web means they could easily have a flood of false alarms, in addition to whatever actual copies are found. The user of the system is then going to have to have human beings sift through that flood to identify what's A) really a copy, B) whether that copy is infringing or not, and C) if so, is it worth taking action against the infringer?

    The above may be more trouble/expense than it's worth in many cases.

    Not that the RIAA always bothers to verify actual infringement has taken place before suing, but some organizations may be a little more ethical, or at least a little less trigger-happy.
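
    The base-rate arithmetic behind this worry can be sketched in a few lines of Python. All the numbers in the demo (a billion pages scanned, a thousand real copies, a one-in-ten-thousand false alarm rate) are invented for illustration; the company has published no such figures.

```python
from typing import Tuple


def expected_alerts(
    pages_scanned: int,
    true_copies: int,
    false_alarm_rate: float,
    detection_rate: float = 1.0,
) -> Tuple[float, float, float]:
    """Return (expected true hits, expected false alarms, share of alerts that are real)."""
    true_hits = true_copies * detection_rate
    false_hits = (pages_scanned - true_copies) * false_alarm_rate
    total = true_hits + false_hits
    precision = true_hits / total if total else 0.0
    return true_hits, false_hits, precision


if __name__ == "__main__":
    hits, false_alarms, precision = expected_alerts(
        pages_scanned=1_000_000_000,  # assumed index size
        true_copies=1_000,            # assumed number of real copies
        false_alarm_rate=0.0001,      # assumed: 1 false match per 10,000 pages
    )
    print(f"real copies flagged: {hits:.0f}")
    print(f"false alarms:        {false_alarms:.0f}")
    print(f"share of alerts that are real: {precision:.2%}")
```

    Under those assumed numbers the scan flags roughly 100,000 innocent pages alongside the 1,000 real copies, so only about one alert in a hundred is genuine, and each one still needs a human to answer A, B and C.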

    1. Re:what's their probability of false alarm? by DamnStupidElf · · Score: 1

      First of all, what's their probability of a false alarm? Even if they false alarm fairly infrequently, the vast amount of content on the Web means they could easily have a flood of false alarms, in addition to whatever actual copies are found. The user of the system is then going to have to have human beings sift through that flood to identify what's A) really a copy, B) whether that copy is infringing or not, and C) if so, is it worth taking action against the infringer?

      You must be new here. The solution is D) Send a generic DMCA takedown notice to everyone and their ISP regardless of fair use.

  31. Wait a minute by Billosaur · · Score: 1

    Ok, it's supposed to be unlawful to access copyrighted information on the Internet without the copyright holder's permission, right? I mean, that's the gist of the *AA's arguments right -- we hold the rights, you can't access this material unless we say so. So if the tool has to access the information to determine the copyright, wouldn't it be violating that principle? Nitpicking I know, but an interesting thought. They'd have to get dispensation from the *AAs to do it, wouldn't they?

    --
    GetOuttaMySpace - The Anti-Social Network
    1. Re:Wait a minute by edraven · · Score: 1

      Copyright protects the right to make a copy. It's nothing to do with access to the information. If you find a CD in the gutter and it still plays, bonus. You have every legal right to listen to that CD. So do all your friends. But even if you purchase the CD you don't have the right to copy its contents. It's not about possession or access. It's about making copies.

  32. If you value your "property" so much... by Anonymous Coward · · Score: 2, Insightful

    ...then do not put it on the Internet.

    In fact, burn it to a DVD and lock it up in a safe, and never talk about it. That way nobody else will ever have access to your "intellectual property".

    1. Re:If you value your "property" so much... by AmberBlackCat · · Score: 1

      Unless you're selling books, CD's and DVD's and somebody else puts it on the Internet.

  33. Scan Blocking by Daemonstar · · Score: 1

    Proactive firewalls (IDS) properly configured should shut the "scan" down relatively quickly, no? Besides, if the service is provided by a specific location (IP block), then IP blocking is trivial.

    On another note, so now they are going to throw more traffic over the Internet? :P

    --
    I don't reply to Anonymous posts; if you have something to say to me, identify yourself or I won't reply.
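
    If the scanner really does come from a known address block, the check is a few lines with Python's standard ipaddress module. The 192.0.2.0/24 range below is a reserved documentation network standing in for the crawler's block, which nobody has published.

```python
import ipaddress
from typing import Iterable, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Placeholder block list; 192.0.2.0/24 is reserved for documentation (TEST-NET-1).
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_blocked(client_ip: str, blocked: Iterable[Network] = BLOCKED_NETWORKS) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in blocked)


if __name__ == "__main__":
    print(is_blocked("192.0.2.55"))
    print(is_blocked("198.51.100.7"))
```

    This only helps while the crawler stays inside the listed block; requests routed through other networks pass straight through.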
    1. Re:Scan Blocking by Opportunist · · Score: 1

      It's just a little more noise in the static between Spammails and Windows boxes phoning home.

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
  34. Now SCO can continue... by filesiteguy · · Score: 1

    This is the tool Micros - um, I mean - SCO has been waiting for. They can now just scan all those millions of Linux Servers on the intraweb and see their copyrighted code right there in the open....

    ...or maybe not.

  35. Property? by Cybert4 · · Score: 1, Insightful

    Asshole lawyer.

  36. Finally an actual useful purpose for leet-speak? by kevintron · · Score: 1

    When such companies comb the Web for snippets of text, could their engines of litigation be defeated by the simple expedient of translating Gr34t w0rKs uv L1t3r4tur3 into leet-texts?

    (I sure hope not!)
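
    Probably not for long. A simple character substitution is cheap to undo before matching, as this Python sketch shows. The substitution table is a small hand-picked assumption, not anything the company has described, and it would also mangle genuine digits in ordinary text.

```python
# Assumed, deliberately tiny leet-to-letter table.
LEET_MAP = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t"})


def normalize(text: str) -> str:
    """Lower-case the text and map common leet digits back to letters."""
    return text.lower().translate(LEET_MAP)


if __name__ == "__main__":
    print(normalize("Gr34t w0rKs uv L1t3r4tur3"))  # prints "great works uv literature"
```

    Phonetic respellings such as "uv" survive this pass, so a matcher would still need fuzzy comparison on top of it.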

  37. What a waste by j00r0m4nc3r · · Score: 1

    Like there's any copyright infringement on The Interweb. I don't see how a whole book could fit in those tubes...

    1. Re:What a waste by glenstar · · Score: 1

      From what I understand They (the Illuminati) have secretly made the tubes bigger. You can even put whole CDs and DVDs through them.

  38. Ringtone by tepples · · Score: 1

    If you want an album, buy it. If you want software that costs something, buy it or learn to use free/open software.

    So where's the free/open alternative to an album?

    Or... someone uses a popular song as the music bed in their Youtube video and the entire video clip is only 25 seconds long

    A ringtone is 25 seconds long, as that's how long it takes for the call to be routed to voice mail.

    or the quality is so poor that no one in their right mind would consider keeping it as something to put on their iPod.

    Over a mobile phone's ringer, quality matters little.

    Whatever happened to the concept of fair use and encouraging people to build upon the works of others?

    Sonny Bono happened.

  39. Evidence of a disease. by GodInHell · · Score: 1
    The problem: your services as a content mitigator have been rendered useless by the appearance of a medium which is so cheap as to appear free, so fast as to appear instant, and so easy as to appear effortless.

    The cure, corrosive, caustic and highly dangerous responses flooded into the arteries of your survival - a general failing of the organs of service, and an increasingly gruesome appearance as you stamp on the consumer and turn on your distributors looking for signs of theft and duplicity.

    Prognosis - Death.

  40. Copyright protection for the rich only. by John+Sokol · · Score: 1


        I find my stuff copied and plagiarized all the time, and it's nearly impossible to enforce without a large budget for lawyers. From inventions to source code to writing.
      More than I could ever possibly list here, but I have come to realize it's in the nature of things.

        So now big corporate America is going to get even better at chasing stuff down and coming after everyone that even borrows a paragraph, using their intimidation tactics.

        The place where it really gets interesting is when they steal your stuff and then threaten to sue you for copyright or patent infringement.

      I know it sounds crazy till you have had it done to you several times.

        Example. In 1985 I named my audio card product for the PC the "Sound Byte" and showed it at a trade show; then a few months later a very small competitor filed for the trademark and had their lawyer send me a nasty letter. Being very broke, just out of high school and living off the sales of each audio card, I had to change the name to "audio byte".

        Example 2. I released a .com file of compiled assembly code to CompuServe, a program that played 6-bit digital audio out the PC's internal speaker. Several years later a company, "First Byte", disassembled the code and filed a patent on it.
        At that time I was selling a sound library to game developers; they sent me a nasty letter. Then they threatened several large game companies. Activation, who had also disassembled my code and borrowed it, contacted me and paid me to help them win their patent case.
        But I was threatened with a lawsuit for using my own invention!!!

      Anyhow I guess that's enough pissing and moaning.

      This system can tell when you copy from them, but not when they copy from you.....

        This automated copyright enforcer is a dangerous thing.

    --
    I am always doing that which I can not do, in order that I may learn how to do it. - Pablo Picasso
    1. Re:Copyright protection for the rich only. by Reziac · · Score: 1

      "This system can tell when you copy from them, but not when they copy from you....."

      That's the best point anyone's made here today. How does the tool know if the person doing the scanning is the actual originator of the content? It can't. It can only go by the subscriber's say-so.

      --
      ~REZ~ #43301. Who'd fake being me anyway?
  41. Well, that's Ironic by cfulmer · · Score: 1

    They're going to be COPYING stuff from websites into their index so they can perform paid searches on it. Why isn't that copyright infringement all by itself?

    If somebody were to sue them, they would have to claim that theirs is a fair use. But, many large copyright holders (i.e. their potential customers) would vehemently disagree with such a position. That's an interesting position to be in.

  42. I can see another use for this software by exp(pi*sqrt(163)) · · Score: 1

    It'll save me the time I spend doing 'vanity' web searches.

    --
    Doesn't it make you feel good to know that our freedoms are protected by politicans, lawyers and journalists.
  43. Profit by future+assassin · · Score: 1

    Put photos (content) on your website
    Don't include a water mark to make it tempting for someone to download the image
    Make the content available at full size or large size
    Wait a few years
    Send out the hounds to sniff for your content
    Send out invoices for content usage
    ??????? Corbis ???????????
    Profit

    --
    by TheSpoom (715771) Uncaring Linux user here. I have nothing to add to this but please continue. *munches popcorn*
    1. Re:Profit by future+assassin · · Score: 1

      ??????? Or was that Getty ???????????

      --
      by TheSpoom (715771) Uncaring Linux user here. I have nothing to add to this but please continue. *munches popcorn*
  44. Whack-a-mole by SirGarlon · · Score: 1

    I don't see how this will change anything; copyright holders still have to pay lawyers to go after infringing sites/servers so there is still a bottleneck. This is kind of like using video surveillance in the ol' whack-a-mole game. You may see more moles, but it doesn't mean you can whack them faster.

    --
    [Sir Garlon] is the marvellest knight that is now living, for he destroyeth many good knights, for he goeth invisible.
    1. Re:Whack-a-mole by SeaFox · · Score: 1
      I don't see how this will change anything; copyright holders still have to pay lawyers to go after infringing sites/servers so there is still a bottleneck.

      Well, since the major media conglomerates have lawyers on salary, it won't affect their costs at all. They'll just send a letter to your ISP/host and the host, fearing legal costs of their own (since they DON'T have a lawyer always available), will bow down and pull your whatever from the servers.

      Many don't seem to follow the proper procedure of like, you know, ASKING THE USER about the infringing content so they can show that they can defend themselves first, or even requiring the "copyright holder" to prove they do in fact own the content.

      Even if you were using a snippet for fair use I don't think it would be done right. You would have to settle things with the holder before you can have the content posted again, which isn't the way it is supposed to work.
  45. *sigh* by WWWWolf · · Score: 1

    I'd love to see this technology available for public use. The idea is brilliant. The fact that they restrict it to members of an association is not.

    I'm a Wikipedia contributor. In Wikis, you have to be paranoid about the copyrights of the contributions. We have a not-that-glamorous, as of yet a little bit limited bot that does exactly what this tool appears to do - find suspected copyright violations.

    I'm sure wiki editors, bloggers, and other open content creators would be terribly interested to see where their material gets copied and would be terribly interested to know if someone's misusing the content too.

    But of course, no one's listening to the little guy. Even the Wikipedia bot has to use Yahoo API (hint hint, Google folks. =)

  46. Sounds like TurnItIn by Kelson · · Score: 1

    Anyone else ever had their site visited by the Turnitin bot?

    And the article mentioned Copyscape, which is more aimed at finding dupes of web pages (you enter a website, and it looks for similar pages in their index).

  47. Buzzkill by edraven · · Score: 1

    Sure I get that it's a joke. But that being said, A) you don't have to label something as copyrighted for it to be protected by copyright. Copyright is granted the moment you produce an original creative work. However, B) there's nothing creative about the contents of your robots.txt file. Labelling something as copyrighted doesn't make it copyrighted.

  48. Man... by Anonymous Coward · · Score: 0

    Archiving the internet? That is a LOT of porn.

  49. "...may decide the content is being use fairly..." by yar · · Score: 1

    Of course, some nice things about fair use are that
    a) the creator of the copyrighted content does not get to decide whether the use is or is not fair;
    b) although the amount being used is one of the factors used to evaluate fair use, it is by no means the only factor, and in some situations using more than a limited amount is fair.
    No technology can make that evaluation, and copyright holders don't get to, either.

  50. How to detect your IP! by merc · · Score: 1

    /sbin/ifconfig -a

    Walla!

    --
    It's true no man is an island, but if you take a bunch of dead guys and tie 'em together, they make a good raft.
  51. A real use on /. by EmbeddedJanitor · · Score: 2, Funny

    The editors could run this tool just on /. to check for dupes!

    --
    Engineering is the art of compromise.
  52. Would they turn to hacking? by Tarinth · · Score: 1

    I can imagine this progression of events:

    1) They don't honor robots.txt
    2) Sites that don't want to be scanned by them will add code to their rewriting rules and/or dynamic pages so that their search bot gets directed to a dead-end page.
    3) The search engine needs to be modified in such a way to hide its identity, operate through proxies, etc., in an attempt to get around #2.

    Upon 3, are they criminally liable for hacking?

    1. Re:Would they turn to hacking? by sowth · · Score: 1

      I think by step 3 you and many others may have a case for a class action lawsuit for wasting your bandwidth. I'm not sure if it is against any law, but if you put some sort of bandwidth wasting or no bots clause into your terms of service, then you should be able to sue. The judge would toss out the case for people who were actually violating copyright, but everyone else should get some money from them.

      They would also be violating many people's first amendment rights by artificially making it more expensive to publish, but I don't know if they'd be technically breaking any law there.

      I didn't even go into false DMCA reports. Somebody ought to get them for that...maybe with a slander of title lawsuit?

      I'm thinking of putting up a website. I don't want to be screwed by these bastards. From their patterns, I don't think they are even trying to protect their copyrights, I think they are trying to eliminate anything they perceive as competition. Even free speech. As long as someone can be entertained by reading and posting on websites, usenet and such, it will take away time from buying / watching / listening to TV, music, DVDs, whatever. I don't want the media companies to do to the internet what they did to radio. What little bandwidth you can use for two way communication you either have to pay some company (cellphone) or take a bunch of tests for an obscure license (hams) or use unwanted frequencies (such as WiFi net sharing with microwave ovens).

    2. Re:Would they turn to hacking? by VJ42 · · Score: 1

      Upon 3, are they criminally liable for hacking?

      IANAL, but I believe they would be in breach of the Computer Misuse Act 1990 here in the UK. I don't know about your part of the world.
      --
      If I have nothing to hide, you have no reason to search me
    3. Re:Would they turn to hacking? by Fred_A · · Score: 1
      2) Sites that don't want to be scanned by them will add code to their rewriting rules and/or dynamic pages so that their search bot gets directed to a dead-end page.

      What if the copyright violation scanning bot uses "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" for its UA string ?

      Presumably since they are looking for "questionable content" they won't be playing by the rules.
      --

      May contain traces of nut.
      Made from the freshest electrons.
    4. Re:Would they turn to hacking? by Tarinth · · Score: 1

      I think that's an example of my Case #3 -- modifying your software to conceal its identity. If they did use a Mozilla UA string, then it seems that it would be only a matter of time before people realized what the originating IP/domains are, at which point the service would need to use proxies to conceal its true identity. Of course, since most ordinary websites don't want to stop their users from accessing them by a proxy, there's probably not much that can be done in terms of detection--but it still raises the issue of any liability the company might have for accessing the site in a way that's expressly forbidden by its owners.

    5. Re:Would they turn to hacking? by RockDoctor · · Score: 1
      As long as someone can be entertained by reading and posting on websites, usenet and such, it will take away time from buying / watching / listening to TV, music, DVDs, whatever. I don't want the media companies to do to the internet what they did to radio.

      You have advertising on radio?
      [MODE = Scotty ON]
      How quaint.
      [MODE = Scotty OFF]
      --
      Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
  53. robots.txt can be bypassed. .htaccess may not by Psicopatico · · Score: 0
    127.0.0.1:/www/htdocs/: $ cat .htaccess

    RewriteEngine On
    Options FollowSymLinks
    RewriteCond %{HTTP_USER_AGENT} AttributorCorporationDMCABot
    RewriteRule ^.*$ error.html [L]
    Please do not forget to enable mod_rewrite in your Apache 2 configuration. Check the manual if needed.
    --
    Mastering the English language is fucking easy: all you have to do is to put an f* word in every fucking sentence.
    1. Re:robots.txt can be bypassed. .htaccess may not by CantStopDancing · · Score: 1

      While this is a useful partial solution, it bypasses one of the reasons for robots.txt's existence, that of reducing load on the server. If apache has to stop and examine every request and look at the user-agent, that causes resource consumption that a well-behaved robot would not.

      That said, these copyright-sniffing robots are likely to be extremely poorly behaved, and so your advice (or other equivalent server configs) may indeed be the only practical solution.
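For comparison, here is roughly what "well-behaved" means on the robot's side. This is only a sketch using Python's standard urllib.robotparser; the robots.txt content is made up, and the bot name is the hypothetical one from the parent post, not a User-agent anyone has confirmed. A polite crawler runs a check like this before each fetch, so the server never sees the request at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; the bot name is the invented
# one from the parent post, not a confirmed User-agent string.
ROBOTS_TXT = """\
User-agent: AttributorCorporationDMCABot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("AttributorCorporationDMCABot", "http://example.com/index.html"))
print(allowed("SomeBrowser", "http://example.com/index.html"))
print(allowed("SomeBrowser", "http://example.com/private/notes.html"))
```

Nothing forces a crawler to run this check, which is the whole point of the parent's server-side rule.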

      --
      I'm running a pirated copy of Linux.
  54. Should scan for edu proper usage by WillAffleckUW · · Score: 1

    Education is permitted broader rights under copyright laws, and fair use of satire and - quite frankly - mockery of corporate copyrighted materials should be enforced as well, resulting in people making fun of weirdos like Disney who want copyright to last until the sun burns out.

    And then send teams with pies to smoosh in the faces of the CEOs.

    But that's in a just world, not one like we live in.

    --
    -- Tigger warning: This post may contain tiggers! --
  55. I've seen the code for their copyright scanner... by mmell · · Score: 1
    10 IP_ADDRESS=RND(from_IPv4_addr_space)

    20 if RND(between_0-1) < .5 then print "IP_ADDRESS GUILTY! SUBPOENA COMPUTER DISK AND SUE OWNER." else print "We'll get 'im next time!"

    30 GOTO 10

  56. Australia? by Anonymous Coward · · Score: 0

    I hope they don't try to use this in Australia - I hear giving people URLs to copyrighted material is illegal...

  57. Re:I've seen the code for their copyright scanner. by WillAffleckUW · · Score: 1

    I think you meant:


    10 IP_FORM = 0 'FALSE
    20 IP_FORM = GETIPVER(baseNet) '4 for IPv4, 6 for IPv6
    30 DO CASE
    40 CASE IP_FORM = 4
    50 IP_ADDRESS=RND(from_IPv4_addr_space)
    60 CASE IP_FORM = 6
    70 IP_ADDRESS=RND(from_IPv6_addr_space,minus_chi_gold_farm_flag,minus_compromised_DRM_rootkit,plus_mil_sites)
    80 CASE ELSE
    90 IP_ADDRESS=RND(sony_ps3_rootkit_pirates)
    100 END CASE
    110 if RND(between_0-1)

    --
    -- Tigger warning: This post may contain tiggers! --
  58. Countermeasure by hoggoth · · Score: 1

    So now as a countermeasure someone will produce a tool to scramble the lowest order/frequency information in the file. For example, randomize the lowest order bit in an image, randomly exchanging black[#020202] and black [#020302]. For videos and music randomize the lowest frequency that is below the threshold of viewing. It will take horsepower to reencode the files, but it only has to be done once. You only need to change one bit for a fingerprint to fail.

    And the arms race goes on...

    --
    - For the complete works of Shakespeare: cat /dev/random (may take some time)
    1. Re:Countermeasure by Da_Weasel · · Score: 1

      I don't claim to know how they have or might develop this system, but it seems to me that if they plan on dealing with a file being encoded by different people in different formats, with different quality levels then your "low order bit" theory isn't going to do jack to stop them. It seems to me like a pretty trivial thing to add thresholds to these checks to allow slight to moderate variations in the finger print.

      Remember they don't have to 100% identify content as unauthorized copyrighted material with the automated process. They just need to identify suspect material and then notify a human to further investigate the situation.
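To make the threshold idea concrete, here is a toy sketch. Nobody outside the company knows what Attributor actually does, so treat everything here as an assumption: the "average hash" below is a simple, well-known perceptual hash, the 64x64 gradient is a made-up test image, and all function names are invented for the example. It compares an exact SHA-256 digest with the perceptual hash after randomizing every pixel's lowest bit.

```python
import hashlib
import random


def exact_hash(pixels):
    """SHA-256 over the raw bytes: any single-bit change gives a new digest."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()


def average_hash(pixels, grid=8):
    """Toy perceptual hash: one bit per grid cell, set when the cell's mean
    brightness is above the mean of all cells."""
    cell = len(pixels) // grid
    means = []
    for gy in range(grid):
        for gx in range(grid):
            block = [pixels[y][x]
                     for y in range(gy * cell, (gy + 1) * cell)
                     for x in range(gx * cell, (gx + 1) * cell)]
            means.append(sum(block) / len(block))
    overall = sum(means) / len(means)
    return [1 if m > overall else 0 for m in means]


def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def scramble_low_bits(pixels, seed=0):
    """Randomly flip the lowest-order bit of every pixel (the countermeasure)."""
    rng = random.Random(seed)
    return [[p ^ rng.getrandbits(1) for p in row] for row in pixels]


size = 64  # made-up grayscale test image: a left-to-right gradient
image = [[(x * 255) // (size - 1) for x in range(size)] for _ in range(size)]
noisy = scramble_low_bits(image)

print("exact digests equal:", exact_hash(image) == exact_hash(noisy))
print("perceptual hash distance:", hamming(average_hash(image), average_hash(noisy)))
```

The exact digest changes, but the perceptual hash does not move, because each cell's mean shifts by at most one brightness level. A matcher that accepts any distance below some threshold would still flag the scrambled copy.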

      These people are as retarded as you might think. Do not under estimate them...

      --
      If you must!
    2. Re:Countermeasure by Opportunist · · Score: 1

      These people are as retarded as you might think.

      I wonder what Dr. Freud would think 'bout that slip...

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
  59. His IP is my IP to by Anonymous Coward · · Score: 2, Funny

    and whenever I go out, the FBI begins to shout Title 17 U.S.C...

  60. CopyScape by rakerman · · Score: 1

    Nice venture-capital-boosting announcement there, but CopyScape has already been doing this for years, albeit for text only.

  61. Digimarc has been doing this for a few years by Anonymous Coward · · Score: 0

    Digimarc (the company that has their watermark reader plug-in included with Adobe Photoshop) has a service just like this.

    You give them like 20 URLs that you want to monitor (or more if you pay them more money), and they have a web spider that checks images on those sites and checks to see if one of your Digimarc watermarks is embedded in any of the images. They'll then present you a report (from a password protected website I believe) where you can see who has been using your images without permission. Apparently their technology can persist across image format and size changes (to a point), so it will even catch web sized versions of print quality images in theory. The watermarks have a variable strength, and the more resilient you make them the more visible they are.

    I thought it seemed like kind of a neat idea actually.

    The other thing I'm waiting for is for Google to do something like this. Maybe you could upload a small picture as a search query, and the search engine would make a low-res internal fingerprint of the image with some fuzzy tolerance, and then show you sites with matching or similar images. I'm sure it will happen sooner or later.

  62. What concerns me: by botlrokit · · Score: 2, Interesting

    I'm bothered by this type of scenario:
     
    "Dear [webmaster]:
     
    It has come to our attention that your website, [sh*touttaluck.com], does not meet compliance in terms of a variety of copyright laws of the United States and other countries. Infractions indicated by our software include, but are not limited to:
     
    Images created with an unregistered copy of Adobe Photoshop
    Flash files created with an unregistered copy of Macromedia Studio MX 2004
    PDFs created with an unregistered copy of Adobe Acrobat Professional
    Content and structure created with an unregistered copy of Macromedia Studio MX 2004
    Content and structure created with an unregistered copy of Microsoft Office Frontpage 2003
    Images created with an unregistered copy of . . . "
     
    ...starting to see where I'm going with this? I understand they're likely talking about copyrighted content such as prior art images or mp3 files, or maybe even damaging company secrets that are leaked by a whistleblower and then redistributed with the intent of airing dirty laundry, but I'm thinking about the structure of a page itself. A person, group, or company who commissions a webpage from a web design studio would now have to ensure that the studio itself is in compliance, or that the products it uses to create the pages are legal. That's where I get all nervous.

    1. Re:What concerns me: by PPH · · Score: 2, Funny

      ...html created with an unregistered copy of vi.

      --
      Have gnu, will travel.
  63. They should be prepared... by reebmmm · · Score: 1

    ... to be sued for copyright infringement. From TFA:

    Attributor analyzes the content of clients, who could range from individuals to big media companies, using a technique known as "digital fingerprinting," which determines unique and identifying characteristics of content. It uses these digital fingerprints to search its index of the Web for the content.

    It is unlikely that this company will be able to rely upon the same defense the traditional search engines do. Putting something on the internet is not an invitation to copy or create derivative works of the materials put online. If they are essentially building gigantic databases of materials to search, they ought to be in for a world of hurt.

    Moreover, at the first action they will probably be countersued for any number of other issues: breach of contract, misappropriation, trespass etc. Not to mention those companies paying for the service will also be sued for their own acts of copyright infringement in connection with this service.

  64. dummy files... by themindfantastic · · Score: 0

    Now I wanna go forth and make dummy files of all the top rated copyrighted files out there, place 'em on a domain, and wait for the lawsuits to roll in and show their ignorance that just because something has the same hash, size, or name doesn't mean it's actually said file... just to piss them f&*kers off.

  65. Been around for a while by stoneycoder · · Score: 0

    Just a few weeks ago I noticed a certain IP address was hitting weird pages that didn't even exist, causing a boatload of 404 traffic, and doing all sorts of other nasty things it shouldn't be doing. So I did a whois on it and found it was some place called turnitin.com, which advertises itself as a service to assist educators in finding plagiarism. I promptly blocked them and immediately uploaded every paper I wrote in college and high school.

  66. Two words by kippers · · Score: 1

    Fair Use.

    1. Re:Two words by Shaltenn · · Score: 1

      Fair Use.
      Doesn't apply when it stands in the way of corporate profit.
      --
      If you were offended by anything I said... No, I'm not sorry. Please lighten up.
  67. Public vs. Searchable. by Anonymous Coward · · Score: 0

    My blog is public but I sure as hell don't want it to show up in search queries.

    Like my cell number. Anyone can call it but only those who know the number.

    1. Re:Public vs. Searchable. by Da_Weasel · · Score: 1

      You can't leave naked pictures of your girl friend laying around on the side walk and then get mad at people for looking at them.

      Putting a non-password protected web site online is about the same thing.

      --
      If you must!
  68. They simply can not scan for subsets by teece · · Score: 1

    The company claims to be able to find "a customer's content based on the appearance of as little as a few sentences of text or a few seconds of audio or video."

    This is nonsense, setting aside the fact that such things are quite probably fair use. Having any kind of complete catalog of "digital fingerprints" for a given work is (practically) impossible. At best, a few select snippets of a given document could be fingerprinted. Changing even a single bit will change a one-way encryption hash (which is, presumably, the method used here), and it won't change the fingerprint in a predictable way. One would need to catalog hashes for every subset of the given document, and the number of such hashes would grow as n^2, where n is the "word-size" of the document.

    I wrote two articles on it on my blog, one general, one mathematical. Read 'em if you'd like: "Beware the Digital Snake Oil" and "How Many Substrings in a Given Text?"
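A rough illustration of the counting argument, with a caveat. The article says the company declines to specify its method, so nothing below is Attributor's algorithm; it is a sketch of one well-known alternative, k-word shingling, with invented function names and sample sentences. An n-word text has n(n+1)/2 contiguous word runs, which is the quadratic growth described above. Hashing only the runs of one fixed length k needs just n - k + 1 hashes, and a copied snippet of k or more words still shares shingles with the original.

```python
import hashlib


def count_substrings(n):
    """Number of contiguous, non-empty word runs in an n-word text."""
    return n * (n + 1) // 2


def shingles(text, k=5):
    """Hashes of every run of k consecutive words (n - k + 1 of them)."""
    words = text.lower().split()
    return {
        hashlib.sha1(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()
        for i in range(len(words) - k + 1)
    }


def overlap(original, suspect, k=5):
    """Fraction of the suspect's shingles that also occur in the original."""
    a, b = shingles(original, k), shingles(suspect, k)
    return len(a & b) / len(b) if b else 0.0


# Made-up sample texts for the demonstration.
original = "the quick brown fox jumps over the lazy dog near the quiet river bank"
suspect = "someone wrote that the quick brown fox jumps over the lazy dog yesterday"

print(count_substrings(len(original.split())), "substrings vs",
      len(shingles(original)), "shingles")
print("overlap:", overlap(original, suspect))
```

The trade-off is that snippets shorter than k words go unnoticed, and paraphrasing defeats it entirely, so this does not settle whether the "few sentences" claim holds up in practice.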

    --
    -- Hello_World.c: 17 Errors, 31 Warnings
  69. Slashdot standard business plan by hummassa · · Score: 1

    1. con the *AA into giving off loads of money saying that a new fingerprinting thingy will help them find them "arr pirates";
    2. fail miserably, because it's impossible;
    3. PROFIT!!!

    --
    It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
  70. may decide content is fair use by Da_Weasel · · Score: 1

    In some cases, they may decide the content is being used fairly or to acceptable promotional ends.

    Riiiiiiiiiiight.....!
    --
    If you must!
  71. I've experienced it from both sides. by bcrowell · · Score: 2, Informative

    I've experienced this from both sides.

    I have a bunch of my books on the web, and every once in a while I do a search on some text from my own books to see who else is mirroring them. The books happen to be copylefted (dual-licensed GFDL/CC-BY-SA), but I'd like to know who's mirroring them, and check whether they're violating the license. A lot of people just seem to be hoarding the PDF files on their university servers, maybe because they're afraid my web site will disappear; that's flattering. One guy was selling them on CDs on e-bay, violating my license (claimed they were PD, didn't propagate the license). Another guy translated them to html, with lots of errors, changed the license to a more restrictive one, and put his own ads up; he fixed the licensing violation when I complained, and in a way it was a good thing, because it motivated me to make my own html versions (which are now bringing me a significant amount of money from adsense every month). One kind of annoying thing about mirroring is that the people who are mirroring never bother to update their mirrors, but in general I just figure there's no such thing as bad publicity :-)

    From the other side, I once received an e-mail from a museum in the UK that was complaining that I was using a 17th century oil painting of Isaac Newton. I guess they own the original, and they may also have been the ones who did the scan that I found in a google image search, but under U.S. law (Bridgeman Art Library, Ltd. v. Corel Corp.), a realistic reproduction of a PD two-dimensional art work is not copyrightable. What really surprised me was that they came across it at all, because at that time I think my book was only in PDF format, and hadn't been indexed by google because the file size was too big.

    The whole thing doesn't seem negative to me in general. It makes just as much sense as people doing a vanity search in Google before they apply for a job, or authors watching their amazon.com sales rankings obsessively. I guess the most obvious potential for abuse would be if they send a nastygram to your webhost, and your webhost is a low-end one that figures it's not worth their time to keep your account, so they just shut off your account.

  72. robots.txt may be moot by MooseTick · · Score: 1

    It wouldn't be too hard to make this software by looking up key phrases of a web site in google. If there is an exact hit, then there may be a copyright violation.

    How hard would it be to intelligently grab chunks of YOUR web site and then Google those parts. Then grep the results. If there is/are positive hits (not from your domain) then light up the dashboard. If you wanted to be extra picky, query yahoo, msn, google, and whoever else you like to search with.
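The steps above can be sketched as follows. This is a hedged outline, not a working scanner: no real search API is called, and `search` is a hypothetical function supplied by the caller that maps a quoted phrase to a list of result URLs. The phrase length and sampling step are arbitrary assumptions.

```python
from urllib.parse import urlparse


def key_phrases(text, words_per_phrase=8, step=40):
    """Sample quoted exact-match phrases from the text, one every `step` words."""
    words = text.split()
    last_start = max(len(words) - words_per_phrase + 1, 0)
    return ['"' + " ".join(words[i:i + words_per_phrase]) + '"'
            for i in range(0, last_start, step)]


def foreign_hits(result_urls, own_domain):
    """Keep only results hosted outside own_domain and its subdomains."""
    hits = []
    for url in result_urls:
        host = (urlparse(url).hostname or "").lower()
        if host != own_domain and not host.endswith("." + own_domain):
            hits.append(url)
    return hits


def scan(text, own_domain, search):
    """Map each sampled phrase to the off-site URLs where it was found.

    `search` is a hypothetical caller-supplied function: phrase -> list of URLs.
    """
    report = {}
    for phrase in key_phrases(text):
        hits = foreign_hits(search(phrase), own_domain)
        if hits:
            report[phrase] = hits
    return report


# Stand-in for a real search engine, returning canned results.
def fake_search(phrase):
    return ["http://example.com/a", "http://www.example.com/b",
            "http://mirror.example.org/c"]


page = "one two three four five six seven eight nine ten"
print(scan(page, "example.com", fake_search))
```

Anything left in the report is only a candidate for a human to look at, since an exact phrase match says nothing about who copied whom or whether the use is fair.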

  73. Information Feudalism by Anonymous Coward · · Score: 0

    "Intellectual property" is meant to make you think that all information and thought is worthy of being owned forever.

    Of course in the long run, this idea is counter-productive to innovation and progress.
    There must be a better term to describe this behavior.

    Probably there are better phrases to describe this behavior where an eternal organization wishes to acquire some information or thought, and "own" it so as to have exclusive profits on the information or thought for 100-1000 years?

    It seems heading toward an information class system which could be like Feudalism.
    The Lords are eternal corporations.
    The Vassals are possibly the executives in the corporation.
    And Peasants are the workers that find new "information" property to own (if profitable for the corporation), and then pay taxes to the corporation to have access to information.

    Of course this system requires that the government keep this idea of infinite information ownership. But that has not seemed to be a problem so far for the long-living corporations.

  74. Duplicate! There's a surprise! by Baloo+Ursidae · · Score: 1

    Mmmmm, eight year old duplicate. This one really takes me back to my high school days.

    --
    Help us build a better map!
  75. A decent tool already exists by trawg · · Score: 1

    See: www.google.com

    Searching for +mp3 intitle:index.of +[insert your favourite artist here] would be enough to keep these jerks busy for a while.

  76. Re:and in little pieces, they will consume bandwid by rohan972 · · Score: 1

    as to the asswipes who suggest they 'use google' think about that- how much luck do you expect they'll have hitting google for their entire cache.... (and google pays for bandwidth too)

    As I understand the suggestion, it is that if regular users can find the content using google, so can the copyright owners. No robot required, just search in the normal way. If you can't find it like that, probably not many people can download it anyway.

  77. False positives!!! by Geofs · · Score: 1

    Let's overflow them with false positive results. Hidden links to specially crafted files could lead to millions of inappropriate cease and desist letters; their horror tools will be useless. In a word, let's SPAM them ;-) If everyone is suspect, let's all seem guilty so they can't actually distinguish. I'm not against copyright, but I'm against spies and arbitrary investigations. Having nothing to hide doesn't mean letting everybody spy.

  78. Five words corporate response by Opportunist · · Score: 1

    We don't give a fu..

    --
    We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
  79. Hrmmm.... a new tool for piracy! by Amphetam1ne · · Score: 1

    Anyone got a torrent for this software up yet? Where's the best sites to download movie sigs from?

    --
    I only buy pepper spray that's been tested on anti-vivisectionists.
  80. Solution by d_54321 · · Score: 1

    The solution seems simple:
    Make a tool that scans for scans from the Copyright Web Scanner (CWS), and then if the CWS is detected, do something.