A New Form of Online Tracking: Canvas Fingerprinting
New submitter bnortman (922608) was the first to write in with word of "a new research paper discussing a new form of user fingerprinting and tracking for the web using the HTML 5 <canvas>." globaljustin adds more from an article at Pro Publica: Canvas fingerprinting works by instructing the visitor's Web browser to draw a hidden image. Because each computer draws the image slightly differently, the images can be used to assign each user's device a number that uniquely identifies it. ... The researchers found canvas fingerprinting computer code ... on 5 percent of the top 100,000 websites. Most of the code was on websites that use the AddThis social media sharing tools. Other fingerprinters include the German digital marketer Ligatus and the Canadian dating site Plentyoffish. ... Rich Harris, chief executive of AddThis, said that the company began testing canvas fingerprinting earlier this year as a possible way to replace cookies ...
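For anyone wondering how "draw a hidden image" turns into "a number", here is a minimal sketch in Python. It is only an illustration: the real scripts run as JavaScript in the browser and read the canvas back there, while the byte strings below are made-up stand-ins for pixel data from two machines, and fingerprint_from_pixels is a hypothetical helper name, not anything from the paper.

```python
import hashlib


def fingerprint_from_pixels(pixel_bytes: bytes) -> str:
    """Reduce rendered pixel data to a short hex identifier."""
    # Any stable hash works for the illustration; SHA-256 is an assumption.
    return hashlib.sha256(pixel_bytes).hexdigest()[:16]


# Hypothetical RGBA bytes read back after two machines draw the same image.
# They differ in a single colour channel by 1, e.g. from anti-aliasing.
machine_a = bytes([255, 255, 255, 255, 120, 120, 120, 255])
machine_b = bytes([255, 255, 255, 255, 121, 120, 120, 255])

print(fingerprint_from_pixels(machine_a))
print(fingerprint_from_pixels(machine_b))
```

The point is that a tiny rendering difference gives a completely different identifier, while the same machine gives the same identifier on every visit with nothing stored on it.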
I guess this is probably the best place to plug Privacy Badger https://www.eff.org/privacybad... (although I'm not sure if it would defeat this... NoScript + Privacy Badger?)
I just learned about Privacy Badger 2 days ago at HOPE.
It looks like the technical details would be found in this link: http://cseweb.ucsd.edu/~hovav/...
In that first article the CEO of AddThis says that "It's not uniquely identifying enough" and the guy who originally developed it says it's only 90% accurate.
There are a number of other sites that are hosting the code. Check the summary link to see what they are.
Since the sites using this exploit are sorted by Alexa rank, I gave up looking after a while, but here are "the biggies":
127.0.0.1 addthis.com
127.0.0.1 ligatus.com
127.0.0.1 cloudfront.net
127.0.0.1 vcmedia.vn
127.0.0.1 cloudflare.com
127.0.0.1 kitcode.net
127.0.0.1 pof.com
127.0.0.1 shorte.st
127.0.0.1 ringier.cz
127.0.0.1 insnw.net
127.0.0.1 domainsigma.com
Not sure how seriously this would break things, but some are hosting the exploit on Amazon's cloud: 127.0.0.1 amazonaws.com
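One caveat with hosts entries like the ones above: as far as I know the hosts file only matches exact hostnames and has no wildcards, so an entry for a bare domain does nothing for its subdomains. A rough Python sketch of that lookup behaviour follows; parse_hosts and is_blocked are hypothetical helpers, and static.addthis.com is a made-up subdomain used purely for illustration.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each hostname in hosts-file text to the address it is pinned to."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table[name.lower()] = address
    return table


def is_blocked(hostname: str, table: dict[str, str]) -> bool:
    """Exact-match lookup only, mirroring how a hosts file resolves names."""
    return table.get(hostname.lower()) == "127.0.0.1"


HOSTS_TEXT = """
127.0.0.1 addthis.com
127.0.0.1 cloudfront.net
"""

table = parse_hosts(HOSTS_TEXT)
print(is_blocked("addthis.com", table))         # listed name
print(is_blocked("static.addthis.com", table))  # hypothetical subdomain, not listed
```

So to block scripts served from subdomains you would have to list each one by name, which gets unwieldy quickly for CDN-style hosts.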
I come here for the love
Noooo! Don't mention /etc/hosts, lest you summon ... him.
John
They're already tracking you by your termcap.
Use the RequestPolicy addon in Firefox. It's a whitelist for allowing certain sites to load resources (of any kind) from other sites. If the pairing between the site you're on and another site is not explicitly added to RequestPolicy, nothing gets loaded (the request is not even made to begin with). It covers JS, CSS, images, anything.
IMO it's a more practical approach than NoScript, although not as ultra-secure.
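The pair-based whitelist idea can be sketched in a few lines of Python. This is not RequestPolicy's actual code, just an assumption-laden illustration: the hostnames are made up, should_request is a hypothetical name, and the sketch compares full hostnames where the real addon has its own domain-matching rules.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist of (site you are on, site it may load from) pairs.
ALLOWED_PAIRS = {("news.example", "cdn.example")}


def host(url: str) -> str:
    """Extract the hostname from a URL (empty string if there is none)."""
    return urlsplit(url).hostname or ""


def should_request(page_url: str, resource_url: str, allowed=ALLOWED_PAIRS) -> bool:
    """Allow same-host loads; allow cross-site loads only for whitelisted pairs."""
    origin, destination = host(page_url), host(resource_url)
    if origin == destination:
        return True
    return (origin, destination) in allowed


print(should_request("https://news.example/story", "https://cdn.example/app.js"))
print(should_request("https://news.example/story", "https://tracker.example/fp.js"))
```

Because the decision is made before the request goes out, a third-party fingerprinting script that is not in the whitelist never gets fetched, regardless of what it would have done once loaded.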
In case you're wondering what's the difference between RequestPolicy and Ghostery:
i ate crayons when i was a kid and now i have two braincells and the blue ones taste nicer
I can see the privacy implications this has, but how in the world would such a method successfully discern between 2 identical devices?
I work with marketing software on and off. There are thousands of data points collected when you visit a site that cares enough to ID you. This would be just one. If this ID narrows the device down to 10 or so... and they also have date stamps, general location data based on your IP, browser type, etc? They can ID you specifically, pretty easily. I've not seen this particular method come up myself... in fact, most of the time the way the marketing software IDs you is irrelevant to the site owner. They just buy the software and install it. Done. The general doesn't care that there's 1 new landmine in his arsenal when he's already blanketed the field with thousands of them.
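To put rough numbers on "narrows the device down to 10 or so, then the rest finishes the job", here is a back-of-the-envelope sketch in Python. Every figure is invented for illustration, and it assumes the signals are statistically independent, which real ones are not, so treat it as the shape of the argument rather than a measurement.

```python
import math


def bits_of_identity(share: float) -> float:
    """Bits of identifying information in a value seen in `share` of devices."""
    return -math.log2(share)


def expected_matches(population: int, shares: list[float]) -> float:
    """Devices expected to match every signal, assuming independence."""
    return population * math.prod(shares)


POPULATION = 10_000_000  # hypothetical pool of devices

# Hypothetical share of devices showing the same value as yours, per signal.
SIGNALS = {
    "canvas_hash": 1 / 1_000_000,
    "browser_type": 1 / 5,
    "ip_region": 1 / 2,
}

for name, share in SIGNALS.items():
    print(f"{name}: {bits_of_identity(share):.1f} bits")

print(expected_matches(POPULATION, [SIGNALS["canvas_hash"]]))  # canvas alone
print(expected_matches(POPULATION, list(SIGNALS.values())))    # all signals
```

With these made-up shares the canvas hash alone leaves about ten candidate devices, and two coarse extra signals bring that down to about one.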
Also, you need to understand the goal here... they don't care who you are. They just want to know that you are visitor 52467, and all the other times you were here you looked at products X, P and Q so they can display more information on those products. They also salt the site with "Free" offers where all you need to do to claim them is input your contact information. Once you do that they link that contact information to your browsing history and shoot it over to a salesman and/or send a personalized advertisement to your email.
This may all sound dumb and horribly invasive... but it's amazingly successful. There is absolutely no way these companies would give it up voluntarily. Many of them wouldn't be in business without that sort of data... I'm not even sure you'd like it if it were gone. Getting ads is annoying, getting ads for African American hair styling products when you're a redhead is infuriating. Targeted ads are a good thing; it's the completely unaddressed side effects of that data collection that are the problem.
What needs to happen is that laws governing how long the data can be kept get passed. As of now, it's kept forever as far as I know... because... well, why not? And who the data is shared with needs to be regulated. The intercooperation of these companies is pretty scary. Amazon should not know what I'm searching for on WebMD, and the fact of the matter is, as of now, pretty much every major site you visit is sharing data with every other site you visit for mutual profit. This likely includes government websites. I've seen the marketing companies brag about their government contracts so that's a tad scary. Lastly, pretty much all regulation is not-so-cleverly avoided by simply changing the tech. The regulation needs to be broad and easy to understand. As of now they do things like "Well, that's not a person, that's a device!" or "Is that really data?" etc... Bill Clinton-style wordplay shouldn't absolve you of negligence.
Depending on what you mean by 'block', there may or may not be a properly satisfactory answer:
'Block' as in 'make this specific mechanism fail' is the relatively easy question. If the attacker can't manipulate a canvas element and read the result, it won't work. So the usual javascript blockers or more selective breaking of some or all of the canvas element (the TOR browser apparently already does this for methods that can be used to read back the contents of a canvas element, so you can still draw on one but not observe your handiwork) will do the job.
Unfortunately the attacker doesn't actually care about making your browser draw a picture, they care about achieving as accurate a UID as they can. Given that, you might actually make yourself more distinctive if your attempt to break a given fingerprinting mechanism succeeds. In the case of the TOR browser, for instance, attempts to read a canvas will always be handled as though the canvas is all opaque white. This does prevent the attacker from learning anything useful about font rendering peculiarities or other quirks of your environment's canvas implementation; but it's also a behavior that, for the moment at least, only the TOR browser has. Relatively uncommon. Possibly less common than the result that you'd receive from an unmodified browser.
That's the nasty thing about fingerprinting attacks. Fabricating or refusing to return many types of identifying information is relatively easy (at least once you know that attackers are looking for them); but unless you lie carefully, your fake data may actually be less common (and thus more trackable) than your real data.
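A toy Python illustration of why a rare fake answer can be worse than a common real one. The sample of readback values is entirely invented, and anonymity_set is a hypothetical helper that simply counts how many observed browsers returned the same value.

```python
from collections import Counter

# Invented sample of canvas readback values seen by one tracker.
observed = (
    ["hash_a"] * 400         # a common hardware/driver/font combination
    + ["hash_b"] * 350
    + ["hash_c"] * 245
    + ["blank_white"] * 5    # browsers that return a blank canvas instead
)
counts = Counter(observed)


def anonymity_set(value: str, counts: Counter) -> int:
    """Number of observed browsers that returned exactly this value."""
    return counts[value]


print(anonymity_set("hash_a", counts))
print(anonymity_set("blank_white", counts))
```

In this made-up sample the browser returning the blank canvas hides among far fewer peers than one returning a common real hash, which is the sense in which blocking can make you stand out until enough other people block the same way.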
The research paper discusses two entirely different things: canvas fingerprinting, and "Evercookies & Respawning". Canvas fingerprinting is just another method of trying to determine which browser the user is running, by looking at differences in the way the canvas renders text and the like. "fingerprinting doesn’t work well on mobile" because of the homogeneous nature of mobile devices - 90% of iOS devices are running version 7.1, for example, so they are all using the same web browser version and rendering code, thus they are going to draw canvas fingerprints exactly the same. Nothing in the research article says anything about canvas fingerprinting being used to track people.
Now the other topic "Evercookies & Respawning" is about tracking users. That is using multiple storage vectors to try and keep users from deleting cookies. For example, using tiny hidden Flash apps which have their own caching, actual cookies, HTML5 persistent storage, embedding unique identifiers directly in the HTML so when the cached page is pulled up the identifier is once again active.
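The respawning trick is easy to show with a simulation. This Python sketch is not how any real evercookie library is written; the store names and the respawn function are hypothetical, and a plain dict stands in for the browser's separate storage mechanisms.

```python
import uuid


def respawn(stores: dict) -> str:
    """Recover the ID from any surviving store and copy it back into all of them."""
    survivors = [value for value in stores.values() if value is not None]
    # Only mint a fresh ID if every storage vector was wiped.
    uid = survivors[0] if survivors else uuid.uuid4().hex
    for name in stores:
        stores[name] = uid
    return uid


# Hypothetical storage vectors a tracking script might write to.
stores = {"http_cookie": None, "flash_lso": None, "html5_storage": None}

first_id = respawn(stores)          # first visit: new ID written everywhere
stores["http_cookie"] = None        # user clears cookies
second_id = respawn(stores)         # next visit: cookie restored from the others

print(first_id == second_id)
print(stores["http_cookie"] == first_id)
```

The user only escapes the old ID by clearing every store in the same sitting, which is why deleting cookies alone does not help.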
So at this point canvas fingerprinting isn't about tracking, but browser identification. The leap to "A New Form of Online Tracking: Canvas Fingerprinting" comes from the Pro Publica article:
A new, extremely persistent type of online tracking is shadowing visitors to thousands of top websites, from WhiteHouse.gov to YouPorn.com.
First documented in a forthcoming paper by researchers at Princeton University and KU Leuven University in Belgium, this type of tracking, called canvas fingerprinting, works by instructing the visitor’s Web browser to draw a hidden image. Because each computer draws the image slightly differently, the images can be used to assign each user’s device a number that uniquely identifies it.
Well that's completely wrong - the bold text should read "this type of tracking, called Evercookies & Respawning". The persistent tracking has nothing to do with the canvas fingerprinting. It's mainly due to Flash (which also explains why it too is ineffective on mobile devices).
Better known as 318230.