Breaking Google's DRM
An anonymous reader writes "Google's new Google Print service (that lets you see scanned pages from printed books) has a pile of advanced browser-disabling DRM in it ('Pages displaying your content have print, cut, copy, and save functionality disabled in order to protect your content.'). This works with JavaScript turned off, even in Free Software browsers. Seth Schoen has posted preliminary notes on some breaks to the DRM (beyond just automating a screenshotting process), including a proposal for a circumventing proxy that would fetch Google Print pages and strip out the DRM. A full exploration of the html obfuscation and DRM employed by Google would be very interesting; certainly the ability for a remote attacker to disable critical browser features like save, right-click, copy and cut against the user's wishes is a major security vulnerability in Moz/Firefox and should be fixed ASAP."
Google DRM
To further protect your book content, printing and image copying functions are disabled on all Google Print content pages.
Similarly:
We've put a number of measures in place to prevent the downloading, copying, or printing of your content [...] Pages displaying your content have print, cut, copy, and save functionality disabled in order to protect your content.
I'm surprised at how much effort Google went to here. I would have expected my browser not to be vulnerable to having any of its "functionality disabled", yet, with a recent Firefox, I found that I couldn't
1. print the page to a PostScript file,
2. right-click on the page at all,
3. save the page to disk (the image would somehow not be downloaded at all),
4. view the precious image in Page Info/Media (although I could see which image it was),
5. save the precious image in Page Info/Media,
6. find the precious image in the DOM Inspector (which seemed like the really heavy artillery), although the DOM Inspector did let me see its URL as part of an uninterpreted style definition, and seemed to reveal the trick: defining a style called ".theimg", with the definition
{ background-image:url("http://print.google.com/long url with cryptographic signature"); background-repeat:no-repeat; background-position:center left; background-color:white; }
and then invoking that style inside a tag:
So I tried turning off JavaScript, and I found that I was essentially no better off: right-clicking caused a copy of cleardot.gif, not the .theimg background, to be saved to disk. For some reason, Save Page As.../Web Page (complete) still declined to download the background image at all, even in the absence of JavaScript, as if perhaps the CSS parser in the display logic in Firefox is smarter than the CSS parser in the Save Page As... code.
The two ways I've found so far that work to capture images from Google Print are a screen capture (I used xwd, which of course worked perfectly) and looking in the on-disk cache (ls -lrt .mozilla/firefox/default.*/Cache/[0-9A-F]*). I'm still puzzled about why Page Info and the DOM Inspector won't actually reveal the image referenced in the .theimg style or allow it to be saved.
If you wanted to write a proxy that would make Google Print pages capable of being saved to disk, you would presumably want to match
background-image:url("http://print.google.com/\([^ "]+\)")
(although you'd need to be careful to match only the one in the definition of ".theimg", because it looks like there may be at least one other background-image:url) and then replace
I haven't tried this because it felt like too much work relative to the previous two methods.
Contrary to what I expected, Google Print does not seem to check referer, so it seems to be possible merely to extract the URL from the definition of .theimg, and then to load it directly. Perhaps that will change in the future.
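The extraction step described above (pull the URL out of the ".theimg" definition, skipping any other background-image:url) might look roughly like this in Python. This is only a sketch: the function name find_page_image_url and the sample markup are made up for illustration, and it assumes the rule is written as `.theimg { background-image:url("...") ... }`, which may not hold for every page.

```python
import re

# Python rendering of the grep-style pattern quoted in the post, anchored to
# the ".theimg" rule so that other background-image declarations are ignored.
# Assumption: the URL is double-quoted and contains no spaces.
THEIMG_RULE = re.compile(
    r'\.theimg\s*\{[^}]*?background-image:\s*'
    r'url\("(http://print\.google\.com/[^ "]+)"\)'
)


def find_page_image_url(page_source):
    """Return the URL from the .theimg rule, or None if the rule is absent."""
    match = THEIMG_RULE.search(page_source)
    return match.group(1) if match else None


# Hypothetical page source, shaped like the style definition quoted above.
sample = (
    '<style>.other { background-image:url("http://print.google.com/other.gif"); }\n'
    '.theimg { background-image:url("http://print.google.com/print?id=X&pg=3&img=1&sig=Y");'
    ' background-repeat:no-repeat; }</style>'
)
print(find_page_image_url(sample))
```

Since the referer is reportedly not checked, the returned URL could then be opened directly in a browser.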
Google must have hired some experts on html image protection or html obfuscation. To be sure, there are lots of other tricks in Google Print that I had never seen before. It is hard to think that the author of that HTML obfuscation was not the subject of Richard Stallman's accidental haiku. It is amusing to think that Mr. Bad's "other" DeCSS might at last be used for some kind of circumvention (although I doubt it, because presumably Google Print simply won't work at all with the CSS removed).
A full exploration of the html obfuscation and DRM employed by Google would be very interesting
I've been looking at this - there's a blog post with some preliminary discussions, and a follow-up giving some ways of getting around it. The short answer is that if you just want to save the image to disk, it's not too hard in a decent browser.
Gerv
Search for "economic development".
Gerv, a Mozilla developer, has a few blog entries that talk about how the print service tries to stop you from getting to the JPEGs, and how to bypass that.
Google Print, And Clue Barriers
Google Print Hacking Ideas
nostrils
Now they're both mysteriously restricted to general viewing.
Your hair look like poop, Bob! - Wanker.
this is a damn good point.
I copied this from a post I saw earlier on slashdot - I have lost the link but still have the text.
That's why they need the dumb-ass DMCA, because it's impossible to make secure DRM. DRM is not and can never be cryptographically secure because it is not actually a cryptography problem. Cryptography is about keeping secrets away from unauthorized people. That's fairly easy. DRM is about GRANTING people authorized access and GIVING them the key and then attempting to keep what you've given to them a secret from them.
DRM is a schizophrenic and fundamentally impossible task.
All they can do is hide the key obscurely inside the player and hope that no one makes the effort to look at it.
It was written about SACDs, but it applies equally to stopping people copying text. In the long run, DRM won't work. It's just a serious pain in the ass, especially for legitimate users (how can you get fair use if the damn copy/paste functionality is disabled?).
-- james
It's not tough "DRM"... my university's local online student newspaper equivalent effectively does the same thing.
The World Wide Web is dying. Soon, we shall have only the Internet.
First, turn off JavaScript. Then turn on image dimensions. Right-click on the dimensions for the main image, and click View Background Image.
http://print.google.com/print?id=ULQSG0Zs7vcC&pg=3&img=1&q=mastering+digital+photography&sig=gv2nFptEf0dj7Gzb8eZ4U8UdtUo
is the URL that is used, and surprisingly it is linkable from outside; it doesn't appear to check IPs, browsers, or anything else. (deep link away!)
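For anyone curious what is actually in that link, here is a quick way to take it apart with Python's standard library. The URL is the one quoted above with the stray whitespace removed. What the individual parameters mean to Google's backend is not stated anywhere in this thread, so the code only lists them; the helper name split_print_url is made up.

```python
from urllib.parse import parse_qs, urlsplit

# The image URL quoted above, with whitespace removed.
IMAGE_URL = (
    "http://print.google.com/print?id=ULQSG0Zs7vcC&pg=3&img=1"
    "&q=mastering+digital+photography&sig=gv2nFptEf0dj7Gzb8eZ4U8UdtUo"
)


def split_print_url(url):
    """Return (host, path, params) with each query parameter as one string."""
    parts = urlsplit(url)
    # parse_qs returns a list per key; every key appears once here.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.netloc, parts.path, params


host, path, params = split_print_url(IMAGE_URL)
print(host, path)
for name in sorted(params):
    print(name, "=", params[name])
```

Everything needed to fetch the page image travels in the query string, which fits the observation that the link works from outside.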
Gerv, who works for Mozilla/Bugzilla, already went through this, and found several ways around Google's hackery. He then went and summarized the multiple ways to do it in good browsers.
Get Firefox!
Here is an excerpt from a Mozilla blog regarding this. The parent URL of the print.google.com example is http://print.google.com/print?id=ULQSG0Zs7vcC&lpg=3&pg=0_1&sig=O0-GVU5AdfrMmUtu0N5mNM7sUCg.
:-(
.theimg { background-image:url("http://print.google.com/print?id=ULQSG0Zs7vcC&pg=3&img=1&sig=gv2nFptEf0dj7Gzb8eZ4U8UdtUo") }
Next idea: use the DOM Inspector to inspect the entire browser XUL. This means that the context menu will still work. It's more difficult to do, because you can't locate elements by clicking in the content area - it only works for the chrome. Still, we finally track down the clear GIF and delete it. Boom! This time Firefox crashes (taking with it an earlier version of this blog post.)
OK, let's try another approach. Let's find the surrounding in the DOM Inspector, look at its computed style, and copy the URL out of it. Except that the Computed Style view doesn't support copying. Undeterred, and feeling close to the goal, we view the applied styles for the and try and copy the URL out of the individual background style rule.
Success! This works. We can chop off the CSS gubbins, paste the result into a web browser URL bar, and finally get an image we can save.
In fact, you can also get the URL of the page graphic by viewing the source. It turns out that it's not as hard as I made out, because currently, the in question has a sensible class name:
so it's easy to find.
And censorship. You forgot their Chinese censorship ;)
Prosperity is only an instrument to be used, not a deity to be worshipped. Calvin Coolidge
Question #5 states:
What can I do with books that I find?
Well, you can browse a few pages, learn more about the topics explored by the book, buy it, or commit a selection to memory. To further protect your book content, printing and image copying functions are disabled on all Google Print content pages.
I don't see the big deal. As long as they let me still use "back", "forward" and "exit" I'll be happy. Sure it sucks that you might have to buy a book or write down your favorite quote, but it's free as in gratis at this point.
Amazon only lets you get about 3 pages into a book and usually you can't leave the introduction.
Get your Unix fortune now!
Google ceased to be good in my book when they used the DMCA to take down an RSS feed of Google News.
I am trolling
Whilst I'm all for breaking DRM that hinders the rights you have to use your content in the way you want - this just looks like breaking DRM to get stuff for free.
Which DRM? I have no DRM installed on my machine. I have agreed to no contracts or EULAs with regard to DRM.
Google sends me some copyrighted information. The copyright law limits what I can do with it (e.g. I cannot republish), but for my own private use I can do pretty much anything I want with it.
That image already exists as a file (or part of a file) on my machine. What Google is doing is trying to prevent me from looking at it in non-approved ways. Well, it can try, but I have no legal or ethical obligations to follow its wishes. If I want to take that image, load it into Photoshop and play with it there, I am completely within my rights.
So, no, I don't see any problems (either legal or ethical) with breaking this pseudo-DRM -- and I am willing to bet it will be breakable very easily -- and using these images however I want within the limits set by the copyright law.
Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.
I just looked at the page source code... they actually did something very similar to this. They create a table cell, set the background image to the book page (it's fed out of their search engine as opposed to being a static image link, so I imagine the backend screens based on http_referer or something), and then stretch a 1x1 transparent gif over the table cell. "Show Image" then shows the transparent gif, and there is no "show background image" since we are over a foreground image.
They also use the standard context-menu disabling Javascript, which IE respects (and Mozilla does as well if you tell it to). Other than this (standard-issue) trick, they aren't doing anything sneaky to the user's browser at all. They could even disable the DRM for non-copyright pages if they wanted to (don't use the transparent cover image, and don't disable the context menu). All in all, it seems like a pretty slick implementation!
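Undoing the overlay is mostly a text substitution, which is roughly what a rewriting proxy would do. The sketch below is untested against the real site and rests on assumptions: that the style rule is called ".theimg" and the transparent image is cleardot.gif (both reported elsewhere in this thread), and that swapping the overlay's src for the background URL is an acceptable fix. The function name expose_page_image and the sample markup are invented for illustration.

```python
import html
import re

# Assumption: the page image is declared in a ".theimg" rule.
BACKGROUND_URL = re.compile(
    r'\.theimg\s*\{[^}]*?background-image:\s*url\("([^"]+)"\)'
)
# Assumption: the overlay is an <img> whose src ends in cleardot.gif.
CLEARDOT_SRC = re.compile(r'src="[^"]*cleardot\.gif"')


def expose_page_image(page):
    """Point every cleardot.gif overlay at the real page image instead."""
    match = BACKGROUND_URL.search(page)
    if match is None:
        return page  # nothing recognisable to rewrite
    # Escape "&" and quotes so the URL is safe inside an HTML attribute.
    new_src = 'src="%s"' % html.escape(match.group(1), quote=True)
    # Note: this also rewrites any other spacer that uses cleardot.gif.
    return CLEARDOT_SRC.sub(lambda _m: new_src, page)


# Hypothetical markup modelled on the description above.
sample_page = (
    '<style>.theimg { background-image:url("http://print.google.com/print?id=X&img=1"); }</style>'
    '<td class="theimg"><img src="images/cleardot.gif" width="500"></td>'
)
print(expose_page_image(sample_page))
```

With the page image in the foreground, the ordinary Save Image and View Image entries would apply to it rather than to the transparent GIF.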
Save Maine's economy: write stuff down. All comments are exclusively my own, not my employer.
They became an Evil Company last April
I am trolling
It's not that hard to mess with a browser in this way. For example, hiding content when you print takes only a little CSS2:
@media print {
#content { display: none; }
}
Toss in half a dozen other spoilers such as multi-part mime & redirects (to hide URLs), DOM event handlers (to handle & ignore mouse clicks), transparent gifs (to mangle context menus), transparent DIVs that become opaque when printed and you achieve the desired effect.
They're all surmountable, but I suppose Google want to be seen to be making a conscious effort to block people from printing out pages.
* Set Adblock to "Hide Ads"
* Block: http://print.google.com/images/cleardot.gif
* Prevent websites from changing the context menu: Web features > Advanced
* et voila
I'd like to see something like this, for instance, in Firefox's security settings near the Javascript permission settings:
Block sites from:
[X] Disabling right-click context menus
In Firefox:
* "Edit" -> "Preferences"
* Select "Web Features"
* Click the "Advanced" button next to "Enable JavaScript"
* Uncheck "Disable or replace context menus"
(This was bug 86193, checked into the code in March. It's in 1.0PR)
As for single-window mode, there are plenty of extensions. Try the one called "Tabbrowser Extensions", for instance.
So:
- Start at the beginning of the book
- Read 3 pages
- Pick a phrase on the third page
- Search for that phrase within the book
- Click the search result for the third page
- Read the next two pages
- Pick a phrase on the fifth page
- Search for that phrase within the book
- Click the search result for the fifth page
- Read the next two pages
- Repeat until end of book
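To get a feel for how many searches that takes, here is a small model of the steps above. It assumes each search result lets you read a fixed window of pages (three, going by the steps) starting at the result page; the real limit may behave differently. The function name search_pages is made up.

```python
def search_pages(total_pages, window=3):
    """Return the pages on which a new in-book search is needed.

    Model (an assumption based on the steps above): each search result
    opens `window` consecutive pages, starting at the page searched for.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to make progress")
    pages = []
    last_readable = min(window, total_pages)  # the first `window` pages
    while last_readable < total_pages:
        pages.append(last_readable)  # pick a phrase on this page and search
        # The result lands on the same page, so only window - 1 pages are new.
        last_readable = min(last_readable + window - 1, total_pages)
    return pages


print(search_pages(10))
print(len(search_pages(300)))
```

Under this model each search buys two new pages, so the number of searches grows to about half the page count.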
It's irritating, but when you're trying to find a passage in the book and the three-page limit smacks you, you can use this method to get more of the book (or all of it, if you have the patience).
IE. Default settings. No proxy, no modifications. Nothing particularly special about it.
-Load up the book in the browser.
-Click the View menu, select Source.
-Search for "div class=browse"
-Immediately before that, you'll find something like this in a CSS style:
{ background-image:url("http://print.google.com/print?blablahblah");bunch of other stuff;}
-Take that URL, copy and paste it into a new browser window and voila, you have the full size image. Save As or Print on this image works fine. No problems at all.
Seriously, this is trivial to break.
What's not trivial is getting an entire book. How to figure out how to get every page is the tough part. Getting the image itself is a cakewalk. It's just Javascript tricks to break right-clicking and CSS tricks to break direct printing from that window. Saving gets broken because of the tricky CSS using the IMG as a background image. The browser doesn't think to save the image, is all.
- Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.