Slashdot Mirror


Amazon Patents Changing Authors' Words

theodp writes "To exist or not to exist: that is the query. That's what the famous Hamlet soliloquy might look like if subjected to Amazon's newly-patented System and Method for Marking Content, which calls for 'programmatically substituting synonyms into distributed text content,' including 'books, short stories, product reviews, book or movie reviews, news articles, editorial articles, technical papers, scholastic papers, and so on' in an effort to uniquely identify customers who redistribute material. In its description of the 'invention,' Amazon also touts the use of 'alternative misspellings for selected words' as a way to provide 'evidence of copyright infringement in a legal action.' After all, anti-piracy measures should trump kids' ability to spell correctly, shouldn't they?"
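
As a rough illustration of the idea in the summary, here is a minimal sketch of how a customer identifier could be hidden in synonym choices. The synonym table, the function names `mark_text` and `read_mark`, and the bit-per-word encoding are all assumptions made for this example; nothing here is taken from the patent's claims.

```python
# Hypothetical synonym table: each slot maps an original word to one alternative.
SYNONYMS = {
    "be": "exist",
    "question": "query",
    "nobler": "worthier",
    "suffer": "endure",
}

PUNCTUATION = ".,:;!?"


def mark_text(text, customer_id):
    """Embed customer_id bit by bit: if bit i is set, slot i uses its synonym."""
    slots = list(SYNONYMS)
    out = []
    for word in text.split():
        core = word.strip(PUNCTUATION)
        if core in SYNONYMS and (customer_id >> slots.index(core)) & 1:
            word = word.replace(core, SYNONYMS[core])
        out.append(word)
    return " ".join(out)


def read_mark(marked):
    """Recover the identifier by checking which synonyms appear in the text."""
    words = {w.strip(PUNCTUATION) for w in marked.split()}
    customer_id = 0
    for i, original in enumerate(SYNONYMS):
        if SYNONYMS[original] in words:
            customer_id |= 1 << i
    return customer_id


TEXT = "To be or not to be: that is the question."
marked_copy = mark_text(TEXT, customer_id=3)
recovered_id = read_mark(marked_copy)
```

With `customer_id=3` the first two slots are switched, which turns the line into the variant quoted in the summary. A slot only carries a bit if its original word actually occurs in the text, so a real scheme would need far more slots than this toy table.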

22 of 323 comments

  1. Patentable? by OnlyPostsWhilstDrunk · · Score: 5, Insightful

    This bugs me about patents. This sounds like an exact copy of what they've done with maps for years. They add/remove/rename tiny roads in the middle of nowhere and if you distribute maps with those roads then they know you copied their stuff.

    Everything is a damn patent these days. Yo dawg, I put a clock in your clock so I can sue you while you check the time.

    --
    Sig: I don't spell check and this is legit. This was written while I was drunk, and quite possibly with m eyes closed, b
    1. Re:Patentable? by tinkertim · · Score: 4, Funny

      Aww come on. This is the smuckin fartest invention ever!

    2. Re:Patentable? by Your.Master · · Score: 4, Informative

      I don't blame people for not reading the claims section, because it's necessarily an obtuse fusion of legalese and jargon.

      But no, they did not patent *doing* this, they patented the *way* that they do this. Patents cover implementations, and not ideas. Some have argued that the line has been blurred with certain classes of patents, but it hasn't blurred so far that the concept in the slashdot summary is actually locked up as IP.

      Frankly, I can't be bothered to look at the claims either. But the idea itself certainly lends itself to ideas that are patentable (whether they should be patentable or will be rendered retroactively invalid is another question). For instance, I'm curious how they identify which words should be replaced, and the system by which they choose a synonym that hopefully doesn't destroy rhyming patterns, metrical rhythm, puns, shades of meaning, and ambiguity in words with multiple meanings that don't completely intersect the candidate synonym's meaning.

      Also, what are they doing to prevent the trivial case of three copies being compared to recover the original? Maybe they have a bunch of sets of synonyms that are commonly replaced so you need more to get the original, but even then, do they arrange it in some way so that the source of the leaks can be traced down despite the alteration? Or maybe they just assume that book pirates are morons.

      They might do nothing for any of those cases, mind you. Once again, I can't be bothered to read these damned things. Which is part of why I don't submit articles about ones that I've decided I think are actually stupid.

    3. Re:Patentable? by pvera · · Score: 4, Informative

      Yes. And that is a variation of the classic canary trap (http://en.wikipedia.org/wiki/Canary_trap): copies of classified documents that are not 100% identical. When the leaks surface, you can trace the original recipient of the compromised copy. I like the thing with the maps because it is the kind of thing that makes the violator look like a complete idiot, and it's impossible to defend in court.

      --
      Pedro
      ----
      The Insomniac Coder
  2. Advertising by Kell+Bengal · · Score: 4, Insightful

    Yup - that's the killer application.

    Change "Johnny nervously wrinkled his brow as he reached for his Coke" into "Johhny nervously wrinkled his brow as he reached for his Pepsi".

    If this doesn't happen, I will eat my hat/del/ ACME Brand Prestige Fedora TM.

    --
    Scientists point out problems, engineers fix them
    altslashdot.org: The future of slashdot.
    1. Re:Advertising by Steve+Franklin · · Score: 4, Funny

      Scientists point out problems. Engineers use them to kill people overseas.

      --
      Hic iacet Arthurus, rex quondam rexque futurus.
    2. Re:Advertising by VValdo · · Score: 5, Funny

      Coming soon...

      "Well, well, well. What do we have here?" Crockett exclaimed.

      "Looks like pure uncut Pepsi(TM)," said Tubbs.

      "The Microsoft(TM) doesn't fall far from the tree, does it, pal? Well... we got here in the nick of Newsweek(TM). Get immigration on the iPhone(TM) and tell them to revoke Carlos' work American Express(TM)."

      "But Sonny Delite(TM), I don't know if we'll Heinz(TM) with Carlos before he gets to the border! Besides, he's already wanted for assault and Duracell(TM), let alone Rite Aid(TM) smuggling. Plus I think he's a Kelloggs(TM) killer."

      "Oh, we'll catch him alright. You can take that to the Chase(TM)."

      --
      -------------------
      This is my SIG. There are many like it, but this one is mine.
  3. Sounds familiar by cjfs · · Score: 4, Insightful

    Amazon also touts the use of 'alternative misspellings for selected words' as a way to provide 'evidence of copyright infringement in a legal action.'

    Sabotaging your product out of fear someone might violate your copyrights. Where have we seen that before?

    If it wasn't obvious infringement prior to the changes, what's the big deal?

  4. Canary trap by dido · · Score: 4, Informative

    Intelligence agencies have been doing this sort of thing for decades, giving slightly different versions of a sensitive document to suspected spies or places where possible spies might have access to it, with some subtle changes in the words, seeing which one gets leaked or appears elsewhere. Tom Clancy coined the term Canary trap for the technique. Patriot Games was published in 1987, but its real-world use for exposing information leaks most likely predates the novel.

    --
    Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
  5. Huh, so the nook won't be able to corrupt books? by straponego · · Score: 4, Insightful

    That thing looks better all the time.

    Amazon, free tip: words matter. Especially in books.

  6. Don't shop amazon if you like artistic integrity! by plasmacutter · · Score: 5, Insightful

    A synonym is not reflective of the intent of the author.

    As Al Franken points out, 'friendly' is a synonym for 'intimate', so Coulter obviously stated she was having a tryst with Franken when asked by a reporter!

    Authors choose their diction carefully, at least good ones do, and that should not be tampered with.

    Lesson learned: do not shop at amazon if you respect artistic integrity.

    --
    VLC FOR MAC IS DYING! IF YOU DEVELOP, PLEASE SAVE IT!!
  7. Re:Sounds Dodgy at Best by reashlin · · Score: 5, Funny

    Sounds much more like Amazon infringing on copyright by selling an item subtly changed from a prior copyrighted work.

  8. It's not about the patent, it's about the lying by localroger · · Score: 5, Insightful

    It's an heretical thing when mapmakers do it, lying (even trivially) and corrupting their craft because of the threat of being copied. It should not be tolerated there nor should the practice claimed by this patent application be tolerated, not because the patent is bad but because the practice itself is an affront to all of us.

    --
    Brackets contain world's first nanosig, highly magnified:[.]
    1. Re:It's not about the patent, it's about the lying by Techmeology · · Score: 5, Insightful

      Pirates can work together. Suppose you have ten pirates. They each download a copy of the book. They then compare their copies with each other - crosschecking them (after, of course, stripping the DRM). Nine of the ten books use "to be or not to be", and one uses "to exist or not to exist", and similarly for other words. They may then produce a more accurate copy of the book. So now, instead of pirate versions being technically superior (due to the lack of DRM), they're also more accurate! Well done, Amazon, you've patented a wonderful scheme to ensure people don't trust genuine products! Normally I am very anti-intellectual property. On this occasion, however, I do hope Amazon is granted it and enforces it. Perhaps it would some day prevent someone else doing the same.

      --
      Excuse for why is your room always messy?
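
      The cross-checking described in the comment above amounts to a per-word majority vote. A minimal sketch, assuming every substitution is one word for one word (so all copies have the same word count) and that fewer than half of the copies are altered at any given position; `majority_reconstruct` is a name made up for this example:

```python
from collections import Counter


def majority_reconstruct(copies):
    """Pick the most common word at each position across the given copies."""
    token_rows = [copy.split() for copy in copies]
    if len({len(row) for row in token_rows}) != 1:
        raise ValueError("this sketch assumes one-for-one word substitutions")
    # zip(*rows) yields one column per word position; keep the top vote.
    return " ".join(
        Counter(column).most_common(1)[0][0] for column in zip(*token_rows)
    )


copies = ["to be or not to be"] * 9 + ["to exist or not to exist"]
restored = majority_reconstruct(copies)
```

      Substitutions that change the word count (a synonym phrase, say) would need a proper sequence alignment before voting, which this sketch does not attempt.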
  9. Moral rights by davidwr · · Score: 4, Informative

    Canada and some other countries have "moral rights" which belong to the author.

    Changing words without his permission could violate these rights.

    In some countries these rights are inalienable and non-assignable. This means the author can't be ordered to waive them by the publisher or other copyright-holder.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  10. If I was an author . . . by Tanman · · Score: 5, Insightful

    If I was an author who had slaved a year over a book, and anyone but my editor (with my approval on each change) altered my precious words and distributed it as my work, I'd sue the pants off of them. It'd be like if someone was selling prints of my painting and changing a brush stroke. You just don't do that. Words are the author's paint.

  11. Re:Canary trap by jeisner · · Score: 5, Informative

    Intelligence agencies have been doing this sort of thing for decades, giving slightly different versions of a sensitive document to suspected spies or places where possible spies might have access to it, with some subtle changes in the words, seeing which one gets leaked or appears elsewhere. Tom Clancy coined the term Canary trap for the technique. Patriot Games was published in 1987, but its real-world use for exposing information leaks most likely predates the novel.

    But the classic Canary Trap requires someone to modify the document manually, which is hard to do on a large scale. Here it is being done automatically by an algorithm.

    However, I am aware of published methods for this problem dating back to 2001 by Mikhail Atallah at Purdue. In fact Atallah received a patent for followup work in 2007, a year after the Amazon patent was filed.

    Here are a few hundred papers on the subject, via Google Scholar. Some adjust whitespace, some modify images of the text, and some attempt fairly sophisticated syntactic analysis and restructuring of selected sentences.

    I apologize that I haven't read the Amazon patent, or read the prior literature carefully, or gone to law school, so I can't comment on whether the patent seems valid or not.

  12. Re:Prior art by Anonymous Coward · · Score: 4, Informative

    ... the word "aluminum". Some use the US spelling, the other use the correct spelling.

    FTFY

  13. Not a Wise Practice by Frightened_Turtle · · Score: 4, Insightful

    First, there are already pre-existing examples of this practice. Indeed, Tom Clancy described this very technique in one of his novels and called it, "The Smoking Word Processor."

    Second, as an author, I go through quite an effort to ensure that the spelling and grammar are correct throughout any work that I create. To have Amazon completely throw away my efforts and ruin my work would really anger me. This might encourage me to inhibit Amazon from selling any of my work.

    --


    Whew! This water sure is cold!
  14. The logical progression by Impeesa · · Score: 5, Insightful

    If this becomes widespread, here's how it'll go: first, pirate groups will only have to pay for/obtain a couple extra copies, and come up with an automated reconstruction system that will compare the copies and perform error correction. Then the publishers will start obfuscating things more and more, and the pirate groups will develop more and more advanced algorithms. Eventually, the publishers will be publishing near-100% noise, with their heads too far up their asses to realize it, the only people buying copies will be the dedicated pirate groups, who will afford it by charging for their services, and before you know it, "content miners" will just be another step in the chain. The establishment is just last generation's rebels, am I right?

  15. Re:Prior art by Kaenneth · · Score: 4, Interesting

    With specs it's a bit more difficult, but with books it's not really that hard to get 2 copies from 2 separate sources. Diff the two and you can create a unique sig that matches neither.

    Incorrect, with current methods you can identify both.

    Depending on the number and distribution of intentional errors, you can tweak such a system to identify any number of mixed sources. For example, if you insert 30 errors into each copy at unique points, and 3 copies are blended randomly, it will contain an average of 10 errors from each source, possibly enough to identify all 3 sources. With overlapping points, even if a best 2 out of 3 method is used to generate the copy, you can still find out which sources. Consider each point at which an error is inserted or not as a bit, and think of RAID, ECC, Parity, etc.

    I believe that a particular large software company already uses this type of method on their source code distributions, to identify leaks. I recall a presentation from someone working at that company on the local university learning channel where they described fingerprinting source code in this manner.
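
    The "30 errors per copy, 3 copies blended" arithmetic in the comment above can be sketched as a small simulation. This is an illustration under stated assumptions, not any vendor's actual scheme: every customer gets 30 mark positions that nobody else shares, the blend picks one colluder's version at random for each position, and the names `make_fingerprints`, `blend`, and `trace` are invented for the example.

```python
import random

MARKS_PER_COPY = 30


def make_fingerprints(num_customers, seed=0):
    """Give every customer 30 mark positions that no other customer shares."""
    rng = random.Random(seed)
    positions = list(range(num_customers * MARKS_PER_COPY))
    rng.shuffle(positions)
    return {
        c: set(positions[c * MARKS_PER_COPY:(c + 1) * MARKS_PER_COPY])
        for c in range(num_customers)
    }


def blend(fingerprints, colluders, seed=1):
    """At each position, take the word from one randomly chosen colluder."""
    rng = random.Random(seed)
    total_positions = len(fingerprints) * MARKS_PER_COPY
    leaked_marks = set()
    for pos in range(total_positions):
        source = rng.choice(colluders)
        if pos in fingerprints[source]:
            leaked_marks.add(pos)  # this colluder's mark survived the blend
    return leaked_marks


def trace(fingerprints, leaked_marks):
    """Count how many of each customer's marks appear in the leaked copy."""
    return {c: len(marks & leaked_marks) for c, marks in fingerprints.items()}


fingerprints = make_fingerprints(num_customers=100)
colluders = [7, 42, 91]
leaked = blend(fingerprints, colluders)
scores = trace(fingerprints, leaked)
suspects = sorted(c for c, score in scores.items() if score > 0)
```

    Each of a colluder's 30 marks survives with probability 1/3, so the expected count is 30 × 1/3 = 10 per source, matching the figure in the comment. Because positions are unique in this sketch, customers who did not take part always score zero; the overlapping-position variant with a 2-out-of-3 vote that the comment also mentions is not modeled here.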

  16. Simple solution! by KingSkippus · · Score: 4, Interesting

    It seems to me that there's a pretty easy way to defeat this. Use the technology against itself.

    If you ever want to distribute something, make your own minor spelling variations and substitute your own synonyms into the original, thus further altering the altered work. If someone sues you, just point out the fact that their copy "proving" you're guilty doesn't even match the copy of the work that was distributed.

    You could use this idea for just about anything that is digitally watermarked. Don't want that MP3 traced? Introduce your own small, imperceptible variations into the waveform. Don't want your printer tracing you through microdots on your hardcopies? Write a driver that adds its own microdots, and lots of 'em. And so on...