Slashdot Mirror


Altering Text In eBooks To Track Pirates

wwphx writes "According to Wired, 'German researchers have created a new DRM feature that changes the text and punctuation of an e-book ever so slightly. Called SiDiM, which Google translates to 'secure documents by individual marking,' the changes are unique to each e-book sold. These alterations serve as a digital watermark that can be used to track books that have had any other DRM layers stripped out of them before being shared online. The researchers are hoping the new DRM feature will curb digital piracy by simply making consumers paranoid that they'll be caught if they share an e-book illicitly.' I seem to recall reading about this in Tom Clancy's Patriot Games, when Jack Ryan used this technique to identify someone who was leaking secret documents. It would be so very difficult for someone to write a little program that, when stripping the DRM, randomized a couple of pieces of punctuation to break the hash that the vendor is storing along with the sales record of the individual book."

31 of 467 comments

  1. So... by Impy the Impiuos Imp · · Score: 5, Informative

    Normal book publishers have been doing this for decades, inserting the occasional misspelling here or there. Later, they inserted correct spellings, but of the wrong word, to get around auto-correction in scanner software.

    So...no, they can't patent it.

    --
    (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
    1. Re:So... by Shompol · · Score: 5, Insightful
      Yes but this is different because

      ... on a computer

      So yes, they can (and will)

    2. Re:So... by Z00L00K · · Score: 5, Insightful

      And if the publisher does change the text in different e-books, anyone who wants to get around it would just need a few copies and a statistical analysis to blank out the differences.
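
      A rough sketch of that kind of collation, assuming the copies line up token for token (marks that only swap a word or a punctuation character, with no reordering). The function name `majority_merge` and the sample sentences are made up for illustration; nothing here comes from the SiDiM researchers.

```python
from collections import Counter


def majority_merge(copies):
    """Merge token-aligned copies by keeping the most common token at each position."""
    token_lists = [copy.split() for copy in copies]
    if len({len(tokens) for tokens in token_lists}) != 1:
        raise ValueError("copies must have the same number of tokens")
    merged = []
    for column in zip(*token_lists):
        # Keep whichever token most copies agree on at this position.
        merged.append(Counter(column).most_common(1)[0][0])
    return " ".join(merged)


copies = [
    "Peter and John went for lunch, then left.",
    "Peter and John went for lunch; then left.",
    "Peter and John went to lunch, then left.",
]
print(majority_merge(copies))
```

      With only a handful of copies this keeps whatever most of them agree on, so a mark shared by the majority of the copies would survive the merge.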

      This is similar to what steganography does, so if you mess up the inserted punctuation it will be really hard to look up the perpetrator, or the wrong party may even get pointed out.

      So now the Pandora's box is opened.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    3. Re:So... by Will.Woodhull · · Score: 4, Insightful

      There would be no need to reverse engineer a pristine copy of the work. Simply proofreading a single copy and correcting some of the existing errors, while at the same time, introducing a few new errors of the same type would be enough to confound any attempt to make a positive identification of the source.

      This approach has an incredibly high bogosity factor. I can't imagine anyone in the publishing industry with half a brain who would spend any money on its implementation... Oh wait. We are talking about the partially brain dead idjits who thought DRM was the best thing since sliced bread....

      If I was going to do this, I would probably also play with the kerning to force some repagination, add some space characters before the newline at the end of some paragraphs, and so on. This approach to DRM is about as simple to get around as using a black magic marker on the edge of an "uncopyable" CD disk.

      --
      Will
    4. Re: So... by Opportunist · · Score: 3, Insightful

      C'mon, where have you been hiding? Adding "on a computer", or, more recently, "on the internet" makes everything patentable again.

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    5. Re:So... by Opportunist · · Score: 5, Funny

      Uh... yes. When you find misspelled words in my messages here, it's just my new DRM. It's just that. It's not that I'm too dumb to use a spellchecker.

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    6. Re:So... by Idaho · · Score: 3, Insightful

      In fact, this is one more reason for good authors to avoid traditional publishers. I can think of quite a few authors who would have a thing or two to say about algorithms like these being used to modify their work.

      Just like in the music industry, big publishers are simply not necessary anymore. Editors most certainly are, but publishers?

      --
      Every expression is true, for a given value of 'true'
    7. Re:So... by gnasher719 · · Score: 3, Funny

      I think that map makers have been doing this for a century or more.

      I remember a pub guide with 1,200 pub reviews including three fake ones. A newspaper copied ten of their reviews (slightly rearranging the words) and managed to copy one of the fake ones. Good fun.

    8. Re:So... by Arrepiadd · · Score: 5, Informative

      There would be no need to reverse engineer a pristine copy of the work. Simply proofreading a single copy and correcting some of the existing errors, while at the same time, introducing a few new errors of the same type

      I didn't read the article because I had seen it earlier in another news source, so I don't know if this is mentioned in the one linked here, but proofreading may not do it in this case. The source I read mentioned two specific types of change that do not introduce any typos (I'm choosing the examples myself):
      - One of them was reordering of nouns when the order does not matter, e.g. "Peter and John went for lunch" vs "John and Peter went for lunch";
      - The other was playing with negatives: e.g. "something is unclear" vs "something is not clear"

      Since there are no actual typos, it's hard to spot the identifying bits. You'd have to change the text substantially, in order to have a good chance of being free from discovery. Adding your own typos may not serve any purpose, since the company selling can focus just on the changes they made, not looking for other changes introduced after.

      Of course, if there is a concerted effort to release documents, all the pirates would need to do is buy a few copies and diff the documents. You may not get the original back, but if the changes are randomly put in a specific set of words, you certainly can end up with something closer to the original than any of the sold copies and still free from pirate identification.

    9. Re: So... by Threni · · Score: 5, Funny

      And then "on a mobile device but with slightly rounded corners".

  2. Defeated in one... by NFN_NLN · · Score: 4, Funny

    1. Sign up to service with alias
    2. Use untraceable account (prepaid credit card, bitcoin, points card)
    3. Share files with "watermarks"
    4. Don't give a shit that it gets traced back to a throw away account

    They could have saved a significant amount of effort if they had asked me first...

    1. Re:Defeated in one... by SuricouRaven · · Score: 3, Insightful

      It'd be easy to make minor alterations to the text itself. Perhaps a character can be described as dark-haired and wearing a red shirt in one version, but wearing a red shirt and dark-haired in another. Find 32 such places and you can identify four billion unique versions.
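
      The arithmetic is just one bit per place: 32 two-way choices give 2^32 combinations. A minimal sketch with three places instead of 32, assuming each place has exactly two interchangeable phrasings; `VARIANTS`, `TEMPLATE`, `mark` and `read_mark` are hypothetical names, and the sample text is invented.

```python
# Hypothetical variation points: each has two interchangeable phrasings.
VARIANTS = [
    ("dark-haired and wearing a red shirt", "wearing a red shirt and dark-haired"),
    ("Peter and John", "John and Peter"),
    ("is unclear", "is not clear"),
]

TEMPLATE = "He was {0}. {1} went for lunch. The motive {2}."


def mark(copy_id):
    """Render the text, picking phrasing 0 or 1 at each point from the bits of copy_id."""
    if not 0 <= copy_id < 2 ** len(VARIANTS):
        raise ValueError("copy_id does not fit in the available variation points")
    choices = [pair[(copy_id >> i) & 1] for i, pair in enumerate(VARIANTS)]
    return TEMPLATE.format(*choices)


def read_mark(text):
    """Recover copy_id by checking which phrasing appears at each point."""
    copy_id = 0
    for i, (_zero, one) in enumerate(VARIANTS):
        if one in text:
            copy_id |= 1 << i
    return copy_id


print(mark(5))
print(read_mark(mark(5)))
print(2 ** 32)  # 4294967296 distinct copies from 32 two-way choices
```

      Three places already distinguish eight copies; every extra place doubles the count.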

    2. Re:Defeated in one... by NFN_NLN · · Score: 4, Interesting

      And once you bought a book (with your own credit card), and then decide afterwards that you want to put it out there for pirates, suddenly, you realize that it's not such a good idea.

      You realize it's not such a good idea... and 3/4 of a second later you just download it from another source. So you've really accomplished little.

      Has Apple's similar approach impacted music piracy?

      "Apple embeds your account information in all songs sold on the store, not just DRM-free songs. Previously it wasn't much of a big deal, since no one could imagine users sharing encrypted, DRMed content. But now that DRM-free music from Apple is on the loose, the hidden data is more significant since it could theoretically be used to trace shared tunes back to the original owner."

      http://arstechnica.com/apple/2007/05/apple-hides-account-info-in-drm-free-music-too/

    3. Re:Defeated in one... by Opportunist · · Score: 3, Interesting

      Well, it's not so far fetched that there will be various files that reach back to one source. I remember a certain song that had a quite noticeable glitch somewhere, a compression mistake or something like that. I know for a fact that it wasn't meant to be that way because it was played up and down on every radio station and music TV station, every time without that glitch (and it just sounded like a compression bug, too). The same applies to the pressed CD because I later bought it just for the sole reason to find out whether that glitch is supposed to be there, and on the original pressed disc there was no such artifact.

      But no matter where I went and at what party I heard it, I always heard exactly the same glitch. Ok, one may say, it's a local thing. So I thought, too, until I heard it at a party on a different continent. I waited for it, and I was quite amazed to hear that well known glitch.

      And then on YouTube...

      And it wasn't some obscure, barely known song, it was something that clogged the airwaves for quite a while. I later tried to create an MP3 of the file myself to check whether it was some obscure reason why it "has to" end up with that glitch when converted and no, at least my converter managed to encode it flawlessly.

      So I guess the only conclusion I could come up with is that everyone on this PLANET downloaded the same file from the same crappy source. One person encoded it and everyone downloaded from him.

      Kinda amazing that it still was such a seller. I mean, isn't the big complaint of the music industry that everyone is just downloading it? And obviously, for this song one sold CD would have sufficed to satisfy the damned... I mean the demand.

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    4. Re:Defeated in one... by Opportunist · · Score: 5, Funny

      No, I said it was music!

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
  3. Goddammit. by Chrontius · · Score: 5, Funny

    I catch all the typos in my books.

    They irritate me.

    I'd probably crack 'em, fix them all, and goddammit, that'd be "circumvention".

    1. Re:Goddammit. by Bremic · · Score: 3, Insightful

      Imagine going to Shakespear and saying "Sure we will publish your plays, but every person who buys a copy will get a different version where we change the words and the cadence a bit."

      Buy a copy of a play for every actor, all of them have minor variations which cause massive confusion.

      Hell, change the Bible randomly; that wouldn't get noticed at all.

    2. Re:Goddammit. by TrollstonButterbeans · · Score: 4, Funny

      It's "Shakespeare" but you not know that because you stole ebook and DRM has caught you red-handed as Ebook-pirate-thief.

      --
      Priest: "Universe from nothing, no laws of physics, sped up time"+ huge discrepancies. Creationism? No. Big Bang Theory
  4. That's not how traitor-tracing algorithms work by _Knots · · Score: 5, Informative

    They don't hash the whole shebang into one number. Rather, they take a (random) number and use that to generate a set of mutations and then probe for that set of mutations in the leaked document. So now, even if you alter the document further, you probably didn't undo the mutations in question. Even if you did, you probably didn't undo all of them and you almost certainly didn't produce a high-confidence result that it's somebody else's copy.
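
    A minimal sketch of that probing, assuming the marks can be modelled as one bit per variation point and that the vendor keeps the per-sale random number. `NUM_POINTS`, `pattern_for`, `score` and `trace` are hypothetical names for illustration, not any real scheme's API.

```python
import random

NUM_POINTS = 64  # hypothetical number of variation points in the book


def pattern_for(seed):
    """Derive a buyer's mark pattern (one bit per variation point) from their seed."""
    rng = random.Random(seed)
    return [rng.randrange(2) for _ in range(NUM_POINTS)]


def score(leaked, seed):
    """Fraction of variation points where the leaked copy agrees with this buyer's pattern."""
    expected = pattern_for(seed)
    return sum(a == b for a, b in zip(leaked, expected)) / NUM_POINTS


def trace(leaked, seeds):
    """Return (best_seed, best_score) over all sales records."""
    return max(((s, score(leaked, s)) for s in seeds), key=lambda pair: pair[1])


seeds = list(range(1000, 1100))   # stand-ins for the per-sale random numbers
leaked = pattern_for(1042)        # buyer 1042 leaks their copy...
for i in range(0, NUM_POINTS, 8):
    leaked[i] ^= 1                # ...after flipping every eighth mark

print(trace(leaked, seeds))
```

    The leaker still agrees with their own pattern at 56 of 64 points, while an unrelated buyer should agree at roughly half, so partial tampering leaves the match standing out.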

    --
    Anarchy$ dd if=/dev/random of=~/.signature bs=120 count=1
  5. Similar to something Amazon patented by dido · · Score: 3, Informative

    There was an article about it here a few years ago. A followup someone made to a comment I wrote to the article mentions some work being done by some guy from Purdue that sounds a lot like what's being done here. IBM also seems to be doing work on canary trap-based ideas.

    --
    Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
  6. What does this actually prove? by XaXXon · · Score: 4, Insightful

    Is accidentally leaving a copy somewhere copyright infringement? How do they know the person they sold it to is the person who leaked it?

    Also, it's never been clear to me when copyright infringement actually occurs.

  7. Learn by Scutter · · Score: 4, Insightful

    Or, you know, maybe learn from the success of Apple iTunes and start selling eBooks for a reasonable cost and maybe they won't be pirated nearly as much. I know that the publishing process costs money that you deserve to recoup, and you deserve to make a profit, but it is offensive to charge as much as (or more than) a physical book for an eBook.

    --

    "Tell me doctor, with all of your defenses, are there any provisions for an attack by killer bees?"
    1. Re:Learn by Macgrrl · · Score: 5, Funny

      That's not how dyeing industries work.

      Your negative attitude is colouring your response.

      --
      Sara
      Designer, Gamer, Macgrrl in an XP World
  8. Great trick to remove the watermark by Rosco P. Coltrane · · Score: 5, Funny

    - Scan/OCR book
    - Google translate into German
    - Google translate back into English
    - Print book

    Voila! No more watermark. You can share with confidence.

    --
    "A door is what a dog is perpetually on the wrong side of" - Ogden Nash
  9. strip by Tom · · Score: 5, Insightful

    It depends. If it's done well, it can be fairly resistant to any noise introduced into the system.

    As an author myself, I see a very different issue with this. I don't want some robot changing my text. Some of the words it decides to swap because they are similar may be ones I agonized over, choosing this one and not the other for a reason. Granted, few authors pick every single word intentionally, but the software won't know which ones are carefully selected.

    Often times, there is subtle meaning. For example, I might decide to always use the same phrase in certain contexts, giving a very subtle hint to the reader which things are alike and which ones are different. One he might not even notice consciously.

    It also will cause all sorts of trouble to quoting. How will teachers handle this if a student quotes a text but the quote differs slightly from the version the teacher has read? One of the most important things we teach students is that quotes need to be exactly as they appear, with any omissions or changes clearly marked.

    That also extends to quotes within the text. If character A reports what character B said, I doubt the system will have enough text understanding to change both texts the same way, so the reader will be left wondering if it is intentional that there's a slight difference and what the author wants to hint at, when there's no such thing implied.

    --
    Assorted stuff I do sometimes: Lemuria.org
  10. It's understandable by 93 Escort Wagon · · Score: 3, Funny

    After all, we saw how quickly the iTunes Store withered and died after the DRM got removed from all that music. It'd be crazy for the publishers NOT to double down on DRM!

    --
    #DeleteChrome
  11. This idea is as new as my grandma by Stonefish · · Score: 3, Insightful

    There were printers in areas with classified documents which used to do this automatically. They worked with whitespace, fonts and punctuation. Photocopies of the documents could still be tracked. Great work guys, you deserve a badge.
    Amazon will be able to close the loop by automatically downloading the books that you have on your Kindle to "check" that you don't infringe and stomp on those bad guys.

  12. Re:Done already by Anarchduke · · Score: 4, Funny

    Yes, which is why they have successfully stamped out piracy, it is part of the sordid past of the Internet. Thank god we'll never see pirated e-books again.

    --
    who prays for Satan? Who in 18 centuries has had the humanity to pray for the 1 sinner that needed it most? ~Mark Twain
  13. Decades? Try centuries... by dbc · · Score: 4, Interesting

    Shortly after the moveable type press got going in Europe, books of tables of interest rates were popular among the merchants. Of course, they all had to be laboriously hand calculated by mathematicians (long division was college undergraduate math in those days...). Publishers would sprinkle errors into the least significant digits of various entries to use as evidence in copyright cases. Because, you know, if you had a printing press, you could make good money by pirating somebody else's table of interest rates.

  14. Re:Too much work by ACE209 · · Score: 4, Insightful

    ..., but it would be too much work and expense for the average ripper.

    Until someone writes a program for it, so the average ripper only has to push a button.

    --
    "we are all atheists about most of the gods that societies have ever believed in. Some of us just go one god further."
  15. Re:Too much work by flyingfsck · · Score: 4, Funny

    why just strip out all the punctuation who needs commas full stops and capital letters anyway everything is still perfectly readable

    --
    Excuse me, but please get off my Pennisetum Clandestinum, eh!