PDF Tracking On the Way
(el)Capitan.Nick writes "PDFzone reports that the company Remote Approach has launched a service to track the movement of PDF documents with its tool Map-Bot. The purpose of this service is to allow PDF publishers the ability to measure their audience, as web publishers can already. Though personal information is not gathered from machines, IP addresses are. PDFs can require users to be connected to the Internet in order to read them, and every person you email the PDF to is subject to the service. As PDFzone's opinion article states, while 'the chances of running into a Remote Approach PDF right now -- and in the near future -- are pretty remote ... the potential for the technology to tarnish PDF's image [of security] is staggering.'"
Oh.. soon as they can track views of PDFs, people will start putting ads in them... I guarantee it!
I can see it now.. Google introduces AdWords for PDFs...
Excuse me, I don't mean to impose, but I am the ocean
It's simple... Refuse to read PDFs that require the technology. Publishers won't get any data from it, and given a loud enough voice, will find that the tool reduces their distribution. It does them no good if the users won't read their documents because of it.
- AMW
How is it any different from collecting the I.P. of everyone who visits your website?
Okay... Print, Save as PDF on the Mac, or Print, select PDF Writer on Windows, or print to ps and "distill" with gs on anything else, and there goes the tracking. Not right?
--Jim (me)
Oh, wait...
Your reality is lies and balderdash and I'm delighted to say that I have no grasp of it whatsoever. - Baron Munchausen
doesn't PDF stand for "personal document file?"
how does this application keep pdf's private?
Will PDFs work without an internet connection? (I often transport PDFs to a secondary computer for viewing, and it is not connected to the internet!)
Check journal for info on Anti-TextBook, an idea by me.
Let me see.. how about a DoS attack.. spam a PDF to a bunch of people and have the PDF phone home to a site you wish to attack. Or... can we run arbitrary code from PDFs?
The remote logging is done through embedded Javascript in the PDF file. Most free viewers such as gpdf, xpdf and kpdf don't support Javascript so you're safe with them.
Adobe Acrobat Reader started supporting embedded Javascript with version 7.0, although you can disable it in the preferences dialog. Apparently it bugs you every time you start the program to re-enable it, though.
Bottom line: Stick with free software.
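For anyone who wants to check a file before opening it in a scripting-capable viewer, here is a minimal sketch that scans the raw bytes of a PDF for name tokens associated with scripting or network actions. The token list and the function name `find_active_content` are illustrative assumptions, not part of any Remote Approach or Adobe tooling. A clean result is not proof of safety, since such tokens can sit inside compressed streams or be written with hex escapes.

```python
import re

# Name tokens that commonly indicate scripting or network actions in a PDF.
# This list is an assumption chosen for illustration, not an exhaustive set.
SUSPICIOUS_NAMES = (b"/JavaScript", b"/JS", b"/URI", b"/SubmitForm", b"/Launch")


def find_active_content(pdf_bytes: bytes) -> dict:
    """Count occurrences of each suspicious name token in raw PDF bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # Require a non-alphanumeric character (or end of data) after the
        # token so that /JS does not match a longer name such as /JSFoo.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        hits = len(re.findall(pattern, pdf_bytes))
        if hits:
            counts[name.decode("ascii")] = hits
    return counts


# Tiny hand-written fragment standing in for a real file.
sample = b"%PDF-1.6\n1 0 obj\n<< /S /JavaScript /JS (app.alert('hi');) >>\nendobj\n"
print(find_active_content(sample))
```

To check a real file, read it in binary mode and pass the bytes in, e.g. `find_active_content(open("manual.pdf", "rb").read())`, where `manual.pdf` is a placeholder name.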
Don't you have to accept an EULA to use their service like any other's? Wouldn't that be a dead giveaway when you're opening a lame PDF like this?
Just like I can shop elsewhere if I don't like being captured on a store's video surveillance camera. Except that they ALL have cameras. If there's no true alternative, you're screwed. Am I going to forego opening that online manual that I desperately need to troubleshoot a problem? I don't think so. A better solution is for some enterprising hackers to find a way to break this technology.
PDFs can require users to be connected to the Internet in order to read them,
No, they can't, PDF is nothing but a data format. Some broken PDF viewers (especially those from Adobe) may do this, but since PDF is an open format, there will always be some other viewers that don't promote spying on their users. Basically, this is the same nonsense as the "no printing" option.
OS Reviews: Free and Open Source Software
spyware companies will just classify it as spyware, and virus companies and spam filters will classify PDF the same as they do with EXE's and remove them
It's probably good news for open source, as security and threat containment can be sold.
Rather than tarnish the PDF name, they should create the Tracked Document Format or TDF and that way users can distinguish between the two. Making people suspicious of PDF right after versions 5 and 6.0 were found to contain security holes will be bad for Adobe.
Saskboy's blog is good. 9 out of 10 dentists agree.
Disabling Javascript will keep the tracking from working, but if you don't, the transmission is completely invisible to you. It will look like normal HTTP traffic to your firewall.
Thankfully, if Adobe wants to, they could change their Acrobat license agreement to ban this sort of crap.
I know they aren't totally private, but since when has it become something that any software I might load can give away?
No! It's a *SIG*. Keep the Special Interest Groups away! (Con joke!)
Also, I definitely do not want to risk exposing my static IP to anyone, especially in a way that involves new technology that may be quite exploitable, just by clicking on a PDF link on google. I'm sorry but c'mon, that's just too much. Nevertheless, assuming the technology is viable, there'll be a demand that will outweigh objection for this new feature and Adobe will do it and make more money.
FORCE me to go online??? I just hope that technical papers never use this tool.
Denizens of the PDF world, however, take note. We enjoy--and sell--the differences between PDF, e-mail and HTML, and a lot of those differences are in the realm of security...
Remote Approach, however, is the beginning of a movement that could chip away at PDF's sterling rep, one document at a time...
Since the Map-Bot can chase a PDF through e-mail forwarding, it's more powerful data mining than that associated with Web pages, where the vital information gets thrown out when the user's cache is emptied.
One would think they would come up with a better name than Map-BOT!!!
Pretty damning, if I may say so.
Yup, the Catholic Church is the scourge of the earth.
Many "commercial" PDFs (usually whitepapers and such) are crap anyway. How far can these guys go beyond offering them for free download (perhaps with simple user registration) remains to be seen. Probably not too far.
But, for the good ones someone has to "pay" authors one way or another. And for docs that are worth it (from good authors or respectable sources, or with good ratings from other users) people will be willing to pay (with their privacy, for example, or money).
Thank you.
Tarnish?
Fire Wall
Seriously, unless they have something that prevents you from looking, but then, what happens when people are offline?
Nope. A Fire Wall will fix this problem fast.
As others pointed out, this potential for a security breach comes from embedded JavaScript in a PDF document. Adobe's reader is vulnerable by default. Does anyone know whether Foxit (a totally free PDF reader for Windows) is safer?
any other file formats next in line?
The number one method of distributing pdf's is via website download, and that can already be tracked. So what is being gained (or lost) here? Tracking pdf's that are passed from person-to-person? *yawn*
I couldn't count the number of times my well-meaning but technologically-inept relatives sent around chains for free gift certificates to the Cracker Barrel and monochromatic clothing stores, or worse 'for each email you pass on $.10 goes to this kid dying of cancer.'
Heaven help us.
net. So, I guess I won't be able to read spyware pdf's.
PDF's are great for printing, but not as easy to view on the Internet as regular html files. The Google "view as html" tool will help greatly.
Don't blame Durga. I voted for Centauri.
Behaving as though you have any privacy on the internet, or almost anywhere else, is living in a fool's paradise. I know one guy who is now in jail because he was foolish enough to supply a kiddie-porn provider with his credit card number.
The trouble is that we need a certain amount of privacy so we can express ideas that those in power may not like. The idea that I might lose the mortgage on my house because I offend Walmart terrifies me. The idea that I end up on a no-fly list because I read the wrong pdf also terrifies me.
Well, if we want to keep our freedom we better be prepared to fight for it.
That PDF sucks. Use HTML.
Ok, so I downloaded the demo document, and captured the packets. /remoteapproach/logging.asp?type=view&DocID=1234567890&GroupID=123456789&ChannelID=123456789 HTTP/1.1
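To see which fields the tracker receives, the captured request line can be split into its path and query parameters. This is a minimal sketch using only the placeholder values quoted above (with the stray space in the DocID removed); the function name `parse_tracking_request` is made up for illustration, and the HTTP method is left out because the capture as posted does not show it on this line.

```python
from urllib.parse import urlsplit, parse_qs


def parse_tracking_request(request_line: str) -> dict:
    """Split '<target> HTTP/x.y' into path, query parameters and version."""
    # The protocol version is the last space-separated token.
    target, version = request_line.rsplit(" ", 1)
    parts = urlsplit(target)
    # parse_qs returns a list per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"path": parts.path, "params": params, "version": version}


captured = (
    "/remoteapproach/logging.asp?type=view&DocID=1234567890"
    "&GroupID=123456789&ChannelID=123456789 HTTP/1.1"
)
info = parse_tracking_request(captured)
print(info["path"])
for key, value in info["params"].items():
    print(f"{key} = {value}")
```

So each view reports an event type plus a document, group and channel identifier, on top of whatever the server logs about the connection itself (such as the IP address).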
.PDF files can be opened with Ghostscript, and (obviously) do not send tracking information. Simply re-saving the document as PDF doesn't remove the tracking, but converting it (File--Convert) via pdfwrite APPEARS to remove the tracking.
There's a POST to remoteapproach.com (you could block all traffic going to remoteapproach.com, or just repoint remoteapproach.com to 127.0.0.1 or something in your hosts file).
The POST message looks like:
POST
The thing that gets me is that the content of the request also contains this:
1 0 obj]/F(/C/Documents and Settings/Administrator/Desktop/MBRemote Approach Manual.pdf)>>>>
As you can see, it contains the full system path to the file that I opened. This seems like a big privacy issue. After all, Acrobat didn't ASK if it could open the URL.
The
Some technology.
No. DRM will never end, because those who actually spend time and money producing content like to pay the bills like everyone else. Simple as that.
"Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
Open a PDF with Adobe Reader. Print. Under "Output Options" in the Print Dialog Box, click "Output as file" and choose "PostScript" from the type menu. Adobe won't stop you and Preview.App converts the .ps file back to a .pdf.
Leave me alone, would you - please! No - nose, nose, nosing around, on top of everything else!
I know that some PDFs that I've come across will only open in Adobe Reader. I'm sure the data is in there, but the only way I've found to get it out is via Adobe's Reader. PDFs with forms, for example, don't open with OS X's Preview.App. Some PDFs I've found won't open on Linux at all.
What lies, exactly?
File under 'M' for 'Manic ranting'
Any file format that allows scripting and connection to the net will eventually be subverted to accomplish tracking, right?
I once experimented with Macromedia Flash movies that are trackable because the first thing they do is use ActionScript to load a static JPEG or SWF (very small) on a known server. After that tracking the movie is as simple as watching for HTTP requests for that particular JPEG.
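The "watch for requests for that one file" part is easy to sketch. Assuming the server writes Common Log Format access logs and the beacon lives at a known path (the path `/track/pixel.jpg`, the sample log lines and the function name are all hypothetical), counting views per client IP looks roughly like this:

```python
from collections import Counter

BEACON_PATH = "/track/pixel.jpg"  # hypothetical path of the tiny beacon file


def count_beacon_hits(log_lines, beacon_path=BEACON_PATH):
    """Return a Counter mapping client IP -> number of beacon requests."""
    hits = Counter()
    for line in log_lines:
        fields = line.split()
        # Common Log Format:
        # ip ident user [date tz] "METHOD path proto" status size
        if len(fields) < 7:
            continue  # skip malformed or truncated lines
        ip, path = fields[0], fields[6]
        # Ignore any query string when comparing paths.
        if path.split("?", 1)[0] == beacon_path:
            hits[ip] += 1
    return hits


sample_log = [
    '192.0.2.1 - - [10/Oct/2005:13:55:36 -0700] "GET /track/pixel.jpg HTTP/1.0" 200 43',
    '198.51.100.7 - - [10/Oct/2005:13:56:01 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/2005:14:02:11 -0700] "GET /track/pixel.jpg?id=7 HTTP/1.0" 200 43',
]
beacon_hits = count_beacon_hits(sample_log)
print(beacon_hits)
```

Giving each distributed copy its own query string or file name would let the same log tell copies apart, which is the general idea behind this kind of tracking.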
Oh, great, now they are embedding spyware in our pdf's.
Just say no!
My Windows firewall asks for permission for Acrobat Reader to access the Net all the time, and I always deny it. With no effect on the documents. They better not make that connection required, or I'll drop Acrobat entirely, for a snitchfree open alternative. PDF is an open format, with real alternative apps - Adobe would drive people into the arms of their open competition if they required such spyware.
--
make install -not war
I've got the free version of ZoneAlarm running. If a pdf (or media file, or app, or whatever) asks for net access when it shouldn't need it, I deny it. If the software then won't read/function, I won't use it.
Big Brother is here.
"I hate quotations. Tell me what you know." - Ralph Waldo Emerson
"Bottom line: Stick with free software."
Bottom line: Stick with free and open standards that can't be corrupted by corporate influences.*
*Like Java, or Flash.
By the time these features are implemented, alternative standardized formats will be available (e.g. XML or open document format). Most people will not be willing to trade privacy for viewing PDF files. The problem will be, however, in the private companies and the publishers who will "oblige" people to read their files in a PDF format (manuals, scientific articles, books, etc...) just like many online companies used to (and still) block non-IE users.
Does anyone know how adequate alternative formats are as replacements for PDF?
As a long-time user of Acrobat, I know you can disable plugins (which includes JavaScript) by holding the Shift key at the splash screen. Just hold Shift while opening the PDF, and voila.
Nice try, though!
Nathan
As soon as I bust my nut.
There is nothing new about this. We've been (unfortunately) using a 3rd party document encryptor to protect some of our clients' documents. Users require a plugin installed, but the document is actually encrypted, no javascript involved.
The document can be configured to ping the server every time any action on the document is performed. (Printing, opening, etc). The server can decide to deny any action too.
It does support a one-time-online-to-authorize mode (much like Windows Activation), but that's about it.
Why aren't you encrypting your e-mail?
If the content is unencrypted, and inside the file, then anyone can read it if they want. PDF is a documented format, where you can read the specification, and simply make a reader that discards the tracking. Or simply add a line in /etc/hosts redirecting the tracker to 127.0.0.1.
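The hosts-file trick can be sketched as a small helper that builds the lines to add. The function name `hosts_block_lines` is made up, and adding the `www.` variant is an assumption: hosts files have no wildcards, so if the tracker answers on some other subdomain, that exact name has to be listed as well.

```python
def hosts_block_lines(domains, sink="127.0.0.1"):
    """Build hosts-file lines pointing each domain and its www. form at sink."""
    lines = []
    for domain in domains:
        domain = domain.strip().lower()
        if not domain:
            continue  # ignore blank entries
        # Hosts files match exact names only, so list both common forms.
        for name in (domain, "www." + domain):
            lines.append(f"{sink}\t{name}")
    return lines


block = hosts_block_lines(["remoteapproach.com"])
print("\n".join(block))
```

The printed lines would then be appended (as root or Administrator) to `/etc/hosts` or the Windows equivalent. That only stops the report from leaving the machine; it does nothing for a document that refuses to open without a reply from the server.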
The point about web is that it is easy to track because (most) people download pages from the server, and don't email the html-source to each other. They mail links. With PDF's, they mail the pdf.
Assembling etherkillers for fun and profit
From the website FAQ:
Using our MAP-TAG technology, you can not only track the document but you can shut them down. You can deactivate your PDF files - in general or for specific people to help prevent unauthorized readers.
Elsewhere in the FAQ:
How can I track them if they're not on the Internet?
We are currently beta testing a version of Remote Approach that allows you to specify that if your reader is not connected to the Internet, then they cannot read the document.
Does this mean that some sort of encryption is involved? I can imagine something similar to password protected PDF files, except perhaps that the reader must provide the correct information, which is sent to the server in exchange for the "key" that actually unlocks the document. It sounds like this can be used to limit viewing by unintended audiences, but neither the website nor the articles hint at how this could be enforced. For example, will I need a password, or will viewing be restricted to computers with certain IP addresses?
I wonder what the effect will be in programs that do not support javascript? It doesn't seem like any special reader software is needed, but the javascript requirement apparently rules out viewing these in programs like Preview on OSX. Not to mention the impact this will have on people who save PDF files for offline browsing (maybe to be read during a long flight), or who read them on a PDA.
Does anybody know of a link to one of these specially tagged PDF's?
lol
Clearly, since it is the pdf reader and not the pdf that will report. What is needed is a public-domain, open-source pdf reader.
Hey, moron. You obviously don't understand the left and right.
The only problem with this kind of feedback is that I want to control when it is given (by me). Not when I receive an e-mail (read confirmation, etc.), not when I read a document, not when I open a web site, etc. I may not have time to deal with feedback - so, if I could select the time I acknowledge the e-mail, document, or whatever - then no problems.
BUT feedback is important and very valuable. Wouldn't you like to know if someone is more than a little interested in whatever you deliver?
Shouldn't be too hard to figure out a way that makes the data they collect useless.
And since a PDF is an open spec, writing some code to remove this "feature" can't be too hard.
so, you can always run a PDF file through a cleanup utility. Stupid idiots...
Oh well, what the hell...
Awww, getting a widdle upset, are we?
My company is already using AlphaMail which does exactly the same thing. And my next build of our document delivery system will add javascript to pdfs and webbugs to htmls.
:-)
We're not protecting documents in any way, only capturing the tracking information. A lot of organizations don't know that a 1 seat license means 1 person, and this tracking information would highlight offenders.
Our subscriptions are 5k+/yearly
Well then someone will eventually figure out how to trick Adobe Acrobat into saving the decrypted form of the document somewhere.
At that point the document is untrackable. All it takes is once.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
say xpdf
By the way, PDF is an open format. There are MANY non-Adobe applications, some of them open source (many not), that both read and write PDF files.
This is impossible. There is no way that you can track movements of files over the internet unless some governmental agency decides to play big brother. The reason why it would be impossible to track PDFs is that there is no way you can track p2p networks, and a lot of PDF files get distributed over p2p networks. Also, even if some company decides to extend the PDF format so that the file works kind of like a worm and sends information back to the creator, PDF is an open format and soon after there will be a program that allows you to remove the tracker. Plus no one makes you use the PDF creator with the tracker if such a thing ever shows up. Anyway, if this was a late April Fool's joke it is pretty dumb.
"By the way, PDF is an open format. There are MANY non-Adobe applications, some of them open source (many not), that both read and write PDF files."
Same with DOC.
Adobe decided that they wanted to control the market for access control of PDFs so that they changed the licensing scheme for add-ins that can be used by the free reader software. If you write an add-in for the free reader, the PKI key and license will run you $1k. If your add-in does any access control, the key and license runs $25k/year.
Ok so they get your IP when you download a PDF and now they want it whenever you read it as well..... which means that you need to be on the web to read it. No thanks. Paper looks better and better
This message was brought to you by "Lack of Sleep."
Hey suck face - bite my @#$#@
In the worst case, if one really had to look at the document, just load it onto a laptop, venture out into the world, find some random wireless bandwidth, and read it there. For good measure, buy the wireless card at a flea market and toss it in a dumpster afterward. Just don't drive there in a car that's registered in Texas!
With SELinux, just block net access to the acroread binary. Or use Evince.
Look at this ebook format:
http://www.ebookgold.com/
I once purchased an "ebook" in this format. When their server was wack I couldn't even connect to it to read my ebook. But technology got the last laugh: I electronically reversed that purchase via a chargeback on my credit card.
Just the thought of something I purchase watching every move I make gives me the creeps.
Don't use a computer without one if you value your privacy.
Almost *every* app these days does some kind of outgoing communication - whether it's update checking, phoning home, or serial number checking.
It's trivially easy to configure most reverse firewalls to disallow any outgoing activity from specific apps. For Windows there is obviously ZoneAlarm and others. With OS X, I recommend Little Snitch.
in london, for example, it is impossible to leave your home--let alone go shopping--without being caught on multiple cameras.
Works fine for me under Win2k, too. Thanks!
-- Language is a virus from outer space.
It's not enough to have software that won't work without authentication through the net (HL2), now e-books also?
Doesn't anyone care about us, who do most of our reading offline on a laptop in public transport?
I wouldn't even want to get into a debate about embedded tracking IDs from online stores that sue people if their copy gets copied on the p2p networks...
Q: How does this tracking mechanism differ from web log analysers?
A: Simple, web log analysers aren't capable of tracking redistributions of the same document. If you copy a web page, say about theories in free-market macroeconomics, and e-mail the copy to a friend, say in China, no one will ever know your friend has read it. But if you copy one of those and it's read by your friend there, then certainly your friend will have a red flag (pun intended) on him.
HTH
It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
Some of them have. This one is chatting with St. Peter. If you cannot see the obvious saintliness of the man then you are blind. May his memory be eternal.
PDF = Privacy Depleted Format
cpghost at Cordula's Web.
here they come. The AdobeBufferOverflowExploit() function call should come in handy.
Join the Slashcott! Feb 10 thru Feb 17!
1. Download PDF
2. Let it phone home while sniffing
3. Append that data to the end of the file
Now when you or anyone else need to read the file the connection can be emulated.
Can this technology be used to create viruses or spyware inside of PDF files?
Rethinking email
yes you can disable it
yet they want to forcefeed it to you.
so they bug you every time in order for you to accept this so-nice feature.
same kind of abuse we got 5, 7 years ago when each web browser and each media player checked if it was the "preferred" application and bugged you or reset itself as said "preferred" application.
so many years later, and some editors still use these shitty tricks. so much for customer respect, right?
I write stories and let people download them. I wouldn't mind being able to make them available in a form that lets me know how they are moving from person to person. BUT I would want it to be an opt-in. As in a pop-up asking permission to track back to the author, letting me know who read it. The key is of course opt-in.
Do tell.
This isn't a technology I am particularly fond of.
...
It reminds me of how when I check-out at ToysRUs, they always ask for my telephone number. I know they are just collecting demographic data, but it is an invasion that really doesn't pay off for me directly.
The reason I am OK with webpages knowing what IP address I am coming from is:
1) apart from using an anonymous proxy - it is a necessary trade-off,
2) it has always been this way, so I don't _feel_ like I am getting hosed
3) it is something I know is happening and expect.
The Remote Approach PDF...
1) is not necessary (it just feeds the marketing drones)
2) introduces new privacy compromises, so I _feel_ abused
3) implements behavior a user does not expect from a document, without their knowledge.
It also seems that my local path to the document is being sent in the clear. The only people who could use this information are people who are up to no good.
Raisinettes are my raison d'etre
What about people who download PDFs to view offline? I hope they won't actually *force* people to be online just to read a document, or these people are screwed.
Unless you:
Get forwarded from another site.
Legit software updates jumps to site/webpage.
Have spyware.
'Hosts' file mixup.
Auto opening HTML emails.
IP spoof by other party using your current IP.
.
Hmmm how many other ways to trick mark?
Hi Folks,
I'm John Bielby from Remote Approach. I was hoping to jump in and answer some of your questions and concerns. I'm very open to discussion of the concept and the company. We didn't start Remote Approach for reasons beyond giving PDF publishers the same measurement tools that web publishers have. The origin was actually a colleague of ours who was trying to advocate PDF use within their company but was hitting a lot of brick walls because there was no way for the client to know how people were using their documents. If they posted something, it was permanently in the ether and either 100 or a million people could be using it. They wanted instead to stick with HTML so they could track direct readership, sacrificing the usability the PDF provided the users.
A few responses to comments:
hummassa commented that with web analysis no one knows if you copy the page and send it to a friend. That's not really true. It really depends on the design of the page and with the vast majority that use graphics, and in particular advertising, the links to the live images (or javascript, etc) will be saved and called every time you open the page. That's not to say a savvy user can't suck it down and edit the html to make sure everything resolves locally, but that's a lot of effort and I think it's fair to say not something the average user would do, or want to do.
Rolan advocated that users shouldn't read PDF documents that use this technology so they won't use it. The reverse is also true. Using this technology will allow publishers to create more PDF resources. In our beta tests this Spring, for example, we found that one client had their private documents being distributed to an audience 30% larger than they had any idea existed. Based on those numbers, they will be removing the registration/login features from their site and making the existing - and more - documents available to the general public. Before, they really had no idea if people were actually reading their documents and were happy to find they were providing a free service that could be expanded to help promote their business. A case study on them, and a few other clients, should be going up shortly.
sanityspeech questioned a feature being beta tested right now that would check for an internet connection. While that particular feature is only available to a few beta clients right now, its intent is only for PDF documents that require a high level of security and responsibility (for example, a business plan or a project proposal meant for a few eyes only). Documents like manuals and other public material shouldn't use this feature. It's for a similar reason that documents like that are often unsuitable for Digital Rights Management in general (e.g. with a username/password or keyed to your hard drive). We will work with our clients to make sure they understand that - both for their benefit and for their customers. BTW, our feedback from our clients so far has jibed with this thinking. They don't want to lock down documents - they want to prove the business case behind distributing them so they can produce more of them.
An Anonymous Coward pointed out that the http reference contains the name/path of the file being viewed. We already had filters in place to ensure that any information of this sort is not saved and accordingly not available in any type of audience measurement or analysis, but we are investigating whether it is possible to change the way Acrobat deals with the Internet in general (since any interaction of any PDF file - tagged by Remote Approach or not - with the Internet would pass this same information).
Redwing brings up an interesting point that he feels web logging is ok because he expects it but does not expect it in PDF files. I think most people would agree that the majority of average users don't actually know about web logs or session states or even understand how cookies really work.
The fact is that for PDF documents to grow as a viable distribution method some sort of audience mea
1. Just like postscript, PDF is a turing-complete language too;
2. These "phone-home" documents can be implemented in such a way that the text in the PDF is encrypted, with a decryption key to be retrieved from "home". Got it?