Altering Text In eBooks To Track Pirates
wwphx writes "According to Wired, 'German researchers have created a new DRM feature that changes the text and punctuation of an e-book ever so slightly. Called SiDiM, which Google translates to 'secure documents by individual marking,' the changes are unique to each e-book sold. These alterations serve as a digital watermark that can be used to track books that have had any other DRM layers stripped out of them before being shared online. The researchers are hoping the new DRM feature will curb digital piracy by simply making consumers paranoid that they'll be caught if they share an e-book illicitly.' I seem to recall reading about this in Tom Clancy's Patriot Games, when Jack Ryan used this technique to identify someone who was leaking secret documents. It would be so very difficult for someone to write a little program that, when stripping the DRM, randomized a couple of pieces of punctuation to break the hash that the vendor is storing along with the sales record of the individual book."
Normal book publishers have been doing this for decades, inserting the occasional misspelling here or there. Later, they inserted correct spellings, but of the wrong word, to get around auto-correction in scanner software.
So...no, they can't patent it.
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
1. Sign up to service with alias
2. Use untraceable account (prepaid credit card, bitcoin, points card)
3. Share files with "watermarks"
4. Don't give a shit that it gets traced back to a throwaway account
They could have saved a significant amount of effort if they had asked me first...
I catch all the typos in my books.
They irritate me.
I'd probably crack 'em, fix them all, and goddammit, that'd be "circumvention".
The next e-book you buy might not exactly match the printed version. And those changes are there to make sure you're not a pirate.
German researchers have created a new DRM feature that changes the text and punctuation of an e-book ever so slightly. Called SoDoMy, which Google translates to âoesecure documents by individual fornicating,â the changes are unique to each e-book sold. These alterations serve as a digital penis that can be used to track books that have had any other DRM dildoes stripped out of them before being shared online. The researchers are hoping the new DRM feature will inspire butt piracy by simply making consumers paranoid that theyâ(TM)ll be caught if they share an e-book illicitly.
Current e-book DRM restricts the movement of cocks between broes and hoes and ties a cock to a single accountant. A e-book bought in the Fondle bookstore, for example, will only work on a Faggot. The same is true for books bought in the Butts & Plugs and iButts digital bookstores â" theyâ(TM)ll only work on the Nook or Apple devices, respectively. This makes publishers happy because their books are locked to one person. And it makes digital book vendors happy because it keeps readers tied to their proprietary devices and ecosystems.
But stripping the DRM from any of the e-books purchased at the big-name stores is as easy as downloading strap-on, and thereâ(TM)s little special genetalia required beyond knowing how to properly connect a penis to an asshole. These cocks usually convert the CUM-heavy e-cocks to a new climax, such as the open-source E-Pub standard, or to the STD-less version of the Kindleâ(TM)s fuck format. From there, the relatively small penises of asians make them perfect for sharing on the Internet.
Of course, readers may not be happy knowing that their licensed e-books are being altered because democrats and republicans donâ(TM)t trust them. By studying a list of example words and phrases that could be changed in purchased books, you can see that the changes are minor â" like from âoevery gayâ to âoenot that gay, actually.â The examples are translated from German pornography, so itâ(TM)s difficult to gauge how profound the changes will be when they occur in your favorite Harry Potter scat film. Itâ(TM)s also unknown if the top U.S. bookstores are interested in more sodomy.
The SoDoMy consortium currently has two German bookselling partners (4Readers and MVB) that it reports to, according to Dr. Martin felchbach, a researchers working on the SoDoMy system whom I reached over email. Democrats & Republicans and Amazon did not reply to queries about if or when the technology would make its way into their digital bookstores as of press time.
They don't hash the whole shebang into one number. Rather, they take a (random) number and use that to generate a set of mutations and then probe for that set of mutations in the leaked document. So now, even if you alter the document further, you probably didn't undo the mutations in question. Even if you did, you probably didn't undo all of them and you almost certainly didn't produce a high-confidence result that it's somebody else's copy.
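To make the seed-and-probe idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `SWAPS` table, the function names, and the simplification that the leaked copy still lines up word for word with the master text. It is not SiDiM's actual method, just one way a per-buyer number could drive a set of mutations that is later scored against a leak.

```python
import random

# Hypothetical swap table: original word -> near-synonym used as the mark.
SWAPS = {"big": "large", "quick": "fast", "begin": "start", "almost": "nearly"}

def chosen_sites(master_words, seed, k=3):
    """Derive one buyer's mutation set (word positions) from their seed."""
    candidates = [i for i, w in enumerate(master_words) if w in SWAPS]
    rng = random.Random(seed)  # same seed always gives the same sites
    return set(rng.sample(candidates, min(k, len(candidates))))

def mark(master_words, seed, k=3):
    """Produce the buyer's copy by swapping words at the chosen sites."""
    picked = chosen_sites(master_words, seed, k)
    return [SWAPS[w] if i in picked else w for i, w in enumerate(master_words)]

def score(master_words, leaked_words, seed, k=3):
    """Fraction of this seed's mutations still present in the leaked copy.

    Assumes the leak is word-aligned with the master (a simplification).
    """
    picked = chosen_sites(master_words, seed, k)
    if not picked:
        return 0.0
    hits = sum(1 for i in picked if leaked_words[i] == SWAPS[master_words[i]])
    return hits / len(picked)

master = ("the big dog made a quick turn to begin the almost "
          "endless big race and begin again").split()
copy_for_buyer = mark(master, seed=42)
print(score(master, copy_for_buyer, seed=42))
```

The vendor would run `score` for every seed in its sales records and look at the best match. Changing unrelated punctuation does not touch the chosen sites, which is why random edits alone don't clear the mark.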
Anarchy$ dd if=/dev/random of=~/.signature bs=120 count=1
There was an article about it here a few years ago. A followup someone made to a comment I wrote to the article mentions some work being done by some guy from Purdue that sounds a lot like what's being done here. IBM also seems to be doing work on canary trap-based ideas.
Give me six lines written by the hand of the most honest man, and I will find something in them to hang him.
Is accidentally leaving a copy somewhere copyright infringement? How do they know the person they sold it to is the person who leaked it?
Also, it's never been clear to me when copyright infringement actually occurs.
Or, you know, maybe learn from the success of Apple iTunes and start selling eBooks for a reasonable cost and maybe they won't be pirated nearly as much. I know that the publishing process costs money that you deserve to recoup, and you deserve to make a profit, but it is offensive to charge as much as (or more than) a physical book for an eBook.
"Tell me doctor, with all of your defenses, are there any provisions for an attack by killer bees?"
- Scan/OCR book
- Google translate into German
- Google translate back into English
- Print book
Voila! No more watermark. You can share with confidence.
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
Who says it's a hash? Just add one extra space somewhere in the book in an unusual place or replace an apostrophe with a similar character or something. Then if someone adds something else, you're still checking for that one single location of the alteration to prove it's them. It'd be awfully unlikely in a long book that you'd replicate the exact alteration that they made to someone else's book, thus appearing to be 2 different people.
It depends. If it's done well, it can be fairly resistant to any noise introduced into the system.
As an author myself, I see a very different issue with this. I don't want some robot changing my text. Some of the words it decides to swap for similar ones may be words I agonized over and chose for a reason. Granted, few authors pick every single word intentionally, but the software won't know which ones were carefully selected.
Oftentimes, there is subtle meaning. For example, I might decide to always use the same phrase in certain contexts, giving a very subtle hint to the reader about which things are alike and which ones are different. One he might not even notice consciously.
It will also cause all sorts of trouble for quoting. How will teachers handle this if a student quotes a text but the quote differs slightly from the version the teacher has read? One of the most important things we teach students is that quotes need to be exactly as they appear, with any omissions or changes clearly marked.
That also extends to quotes within the text. If character A reports what character B said, I doubt the system will have enough text understanding to change both texts the same way, so the reader will be left wondering if it is intentional that there's a slight difference and what the author wants to hint at, when there's no such thing implied.
Assorted stuff I do sometimes: Lemuria.org
I'm going to stop sending every typo and punctuation mistake I catch to Amazon. I thought I was helping.
After all, we saw how quickly the iTunes Store withered and died after the DRM got removed from all that music. It'd be crazy for the publishers NOT to double down on DRM!
#DeleteChrome
There were printers in areas with classified documents which used to do this automatically. They worked with whitespace, fonts and punctuation. Photocopies of the documents could still be tracked. Great work, guys, you deserve a badge.
Amazon will be able to close the loop by automatically downloading the books that you have on your Kindle to "check" that you don't infringe and stomp on those bad guys.
Yes, which is why they have successfully stamped out piracy; it is part of the sordid past of the Internet. Thank god we'll never see pirated e-books again.
who prays for Satan? Who in 18 centuries has had the humanity to pray for the 1 sinner that needed it most? ~Mark Twain
Shortly after the movable type press got going in Europe, books of tables of interest rates were popular among the merchants. Of course, they all had to be laboriously hand calculated by mathematicians (long division was college undergraduate math in those days...). Publishers would sprinkle errors into the least significant digits on various entries to use as evidence in copyright cases. Because, you know, if you had a printing press, you could make good money by pirating somebody else's table of interest rates.
No issue there. Changing a few letters in Harry Potter doesn't make it your work, either. Under copyright, copies don't have to be exact (otherwise taping a song from radio would never have been an issue); they only have to be very similar. Likewise a band playing covers of another band: they're different, some notes are wrong, rhythms are slightly off, yet it's still the same song.
Furthermore it's fully legal to get inspiration from someone else's work - and use elements of copyrighted works in your own works. You just have to make sure it is obviously a different work.
Ah, I see--that clears it up well. I still think the idea of altering the writer's words and punctuation in the name of fighting piracy is going too far though.
So if my phone gets stolen and my eBooks get leaked, I'm now double screwed?
Your solution is plausible, but it would be too much work and expense for the average ripper.
The idea is not to have an unbreakable DRM scheme, which would be impossible to create anyway, but to raise the cost and difficulty of breaking the scheme to dissuade the casual ripper.
I'm not even sure that the average joe knows how to "use a statistical analysis to blank out the differences". I certainly don't.
Plus, it doesn't sound like the results they obtain from that exercise are applicable across the board to different books, meaning they need to repeat this process for every single DRMed book, ad infinitum.
Any publishers using this technique had better have iron-clad contracts with their authors permitting arbitrary alterations to their works. Otherwise, they are in clear violation of the authors' moral rights to protection against distortion and mutilation of their original work.
It's eerily reminiscent of the 'We had to incinerate the village in order to protect it' military communique.
Anybody know if standard boilerplate agreements from the major publishers actually sign away the authors' moral rights against deliberate mutilations (as opposed to inadvertent proofing errors)?
The trivial counter measure is to get multiple (two might be sufficient) copies with different markings, then run diff on the content and merge (perhaps manually). Of course it gets tricky if the content is closed binary format, but it's still doable.
The better solution is to have the author (or translator in case of translated literature) provide multiple versions of a few sentences in the book. And the work-around is to upload only a fraction, randomly sprinkled through the book, to the sharing site which then assembles the pieces from multiple copies, garbling the watermark.
If you can find 3 independent sources (shouldn't be hard for something popular), then all that should be required is a 3-way diff, keeping whatever is common to any 2 or more. If all 3 are different at the same place, then use some manual intervention and make your result different again, or add another source. The final product cannot then be traced to a single source. Am I missing something?
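The 3-way vote described above can be sketched in a few lines of Python. This is only an illustration under a strong assumption: the copies are already split into words and line up position for position (a real tool would need a proper diff/alignment step first). The function name `merge_by_majority` is made up for this example.

```python
from collections import Counter

def merge_by_majority(copies):
    """Word-by-word vote across aligned copies.

    Returns the merged word list plus the positions where no word was
    shared by at least two copies (those need manual attention).
    """
    merged, unresolved = [], []
    for i, variants in enumerate(zip(*copies)):
        word, count = Counter(variants).most_common(1)[0]
        if count >= 2:
            merged.append(word)        # at least two sources agree
        else:
            merged.append(None)        # every source differs here
            unresolved.append(i)
    return merged, unresolved

a = "it was a very dark night".split()
b = "it was a really dark night".split()
c = "it was a very dim night".split()
merged, unresolved = merge_by_majority([a, b, c])
print(" ".join(merged), unresolved)
```

Each copy's own mark gets outvoted wherever the other two agree. Whether that is enough depends on the marks being independent per copy, which this sketch simply assumes.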
One word: PRISM.
Perhaps I'm scaremongering, but are you willing to bet against mission creep from using such intelligence assets against so-called terrorism via kiddie porn to copyright infringement? Given how US election campaigns are being financed?
So you have to upload your book to somewhere secret where you trust and hope Mr Bookz will strip out your ID. And if your uploaded book does leak into the wild (because Mr Bookz is an asshole or incompetent about stripping the ID), you've just incriminated yourself for no reason. If there is a book in the wild already, why risk uploading another copy at all? Why even buy a copy in the first place if you are uploading books and therefore not especially concerned about the ethics of piracy?
Of course I suppose 1000 people could crowd-compile a book, each submitting one page to produce a frankenbook from the pieces, but it would still have to be canonicalized in case the markup, contents, or style rule names embed the ID somehow. Perhaps the frankenbook tool would hash each canonicalized page, and the pages that have the same hash would be used when the book is stitched together.
But for all the effort maybe it's easier to scan the paper book in the first place, or hook up a cracked Kindle / Nook / tablet to a flat bed scanner or a screen capture device and make extensive use of the analogue hole to strip out most of the watermark.
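A rough Python sketch of that canonicalize-then-hash step, with loud caveats: the canonicalization here only strips tags and collapses whitespace, the names `canonicalize`, `page_hash` and `stitch` are invented for this example, and it does nothing about marks hidden in the words themselves or in look-alike characters.

```python
import hashlib
import re
from collections import Counter

def canonicalize(page):
    """Drop markup and collapse whitespace so styling cannot carry an ID."""
    text = re.sub(r"<[^>]+>", " ", page)   # remove tags and their attributes
    return re.sub(r"\s+", " ", text).strip()

def page_hash(page):
    return hashlib.sha256(canonicalize(page).encode("utf-8")).hexdigest()

def stitch(submissions):
    """submissions maps page number -> list of raw pages from different buyers.

    A page is accepted only when at least two submissions hash identically
    after canonicalization; other pages are left out for more submissions.
    """
    book = {}
    for number, pages in submissions.items():
        counts = Counter(page_hash(p) for p in pages)
        digest, n = counts.most_common(1)[0]
        if n >= 2:
            agreed = next(p for p in pages if page_hash(p) == digest)
            book[number] = canonicalize(agreed)
    return book

submissions = {
    1: ['<p class="a1">Hello  world</p>', '<p class="zz">Hello world</p>',
        '<p>Hello, world</p>'],
    2: ['<p>One</p>', '<p>Two</p>'],
}
print(stitch(submissions))
```

Page 1 is accepted because two submissions agree once the markup is gone; page 2 stays missing until a matching pair turns up.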
In summary, it would be a hard problem to crack.
Now the copyright mafia comes banging on his door claiming he uploaded/pirated the book? WTF???
Just like taking an IP address and suing the user/owner of that IP for uploading music/movies, this tactic has no teeth. Unless someone has corroborating evidence, there's no proof that *I* am the source of the uploaded file. Only that it is the file that I originally purchased.
The whole copyright system, and behaviors of content owners, has gotten completely out of control...