A New Form of Online Tracking: Canvas Fingerprinting
New submitter bnortman (922608) was the first to write in with word of "a new research paper discussing a new form of user fingerprinting and tracking for the web using the HTML 5 <canvas>." globaljustin adds more from an article at Pro Publica: Canvas fingerprinting works by instructing the visitor's Web browser to draw a hidden image. Because each computer draws the image slightly differently, the images can be used to assign each user's device a number that uniquely identifies it. ... The researchers found canvas fingerprinting computer code ... on 5 percent of the top 100,000 websites. Most of the code was on websites that use the AddThis social media sharing tools. Other fingerprinters include the German digital marketer Ligatus and the Canadian dating site Plentyoffish. ... Rich Harris, chief executive of AddThis, said that the company began testing canvas fingerprinting earlier this year as a possible way to replace cookies ...
Skipping all images to avoid tracking? Back to ncurses it is then
I guess this is probably the best place to plug Privacy Badger https://www.eff.org/privacybad... (although I'm not sure if it would defeat this... NoScript + Privacy Badger?)
I just learned about Privacy Badger 2 days ago at HOPE.
It looks like the technical details would be found in this link: http://cseweb.ucsd.edu/~hovav/...
In that first article the CEO of AddThis says that "It's not uniquely identifying enough" and the guy who originally developed it says it's only 90% accurate.
There are a number of other sites that are hosting the code. Check the summary link to see what they are.
Since the sites using this exploit are sorted by Alexa rank, I gave up looking after a while, but here are "the biggies":
127.0.0.1 addthis.com
127.0.0.1 ligatus.com
127.0.0.1 cloudfront.net
127.0.0.1 vcmedia.vn
127.0.0.1 cloudflare.com
127.0.0.1 kitcode.net
127.0.0.1 pof.com
127.0.0.1 shorte.st
127.0.0.1 ringier.cz
127.0.0.1 insnw.net
127.0.0.1 domainsigma.com
Not sure how seriously this would break things, but some are hosting the exploit on Amazon's cloud: 127.0.0.1 amazonaws.com
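One caveat with a list like this: the hosts file matches exact hostnames only, with no wildcards, so an entry for addthis.com does nothing for a subdomain. Here is a rough Python sketch of that lookup behavior. The parser is my own simplification, not how any real resolver is implemented, and `cdn.addthis.com` is a made-up subdomain used purely for illustration.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each hostname found in hosts-file text to its address."""
    table: dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table[name.lower()] = address
    return table


def is_blocked(table: dict[str, str], hostname: str) -> bool:
    """True only if this exact hostname points at a loopback or null address."""
    return table.get(hostname.lower()) in {"127.0.0.1", "0.0.0.0"}


HOSTS = """
127.0.0.1 addthis.com
127.0.0.1 ligatus.com
"""

table = parse_hosts(HOSTS)
print(is_blocked(table, "addthis.com"))      # exact entry exists
print(is_blocked(table, "cdn.addthis.com"))  # hypothetical subdomain, no entry
```

So if a script is served from a subdomain, that subdomain needs its own line.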
I come here for the love
Noooo! Don't mention /etc/hosts, lest you summon ... him.
John
NoScript or Ghostery already block AddThis. It's just JavaScript.
John
sudo echo '0.0.0.0 addthis.com' >> /etc/hosts
That would lead to a "Permission denied" error because the >> redirection is performed by your own unprivileged shell, not by sudo; only the echo runs as root.
Try instead: sudo sh -c "echo '0.0.0.0 addthis.com' >> /etc/hosts"
Yeah, but the Amish also don't receive telemarketing calls or email spam.
Instead of focusing on the privacy issue, I'm more curious about why "different computer draws the image slightly differently". Browsers are supposed to provide abstraction from the machine, and the same script run on different computers is supposed to behave the same way. At most, it could tap into things like the user id, but it shouldn't have access to more than that.
NSA Guy 1: Hey, there's that one guy that shows up as a black hole on the Internet.
NSA Guy 2: He is up a little early, isn't he?
NSA Guy 1: Yeah, he usually doesn't post his slashdot privacy rants until after browsing those "furry" sites for a half hour or so.
NSA Guy 2: He must not be in the mood.
Use the RequestPolicy addon in Firefox. It's a whitelist for allowing certain sites to load resources (of any kind) from other sites. If the pairing between the site you're on and another site is not explicitly added to RequestPolicy, nothing gets loaded (the request is not even made to begin with). It covers JS, CSS, images, anything.
IMO it's a more practical approach than NoScript, although not as ultra-secure.
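To make the pairing idea concrete, here is a minimal Python sketch of a whitelist keyed on (site you are on, site being requested). This is only a model of the behavior described above, not RequestPolicy's actual code, and the `.example` hostnames and function name are invented.

```python
# Hypothetical whitelist of (site you are on, site being requested) pairs.
ALLOWED_PAIRS = {
    ("news.example", "cdn.example"),
}


def is_request_allowed(page_host: str, request_host: str) -> bool:
    """Same-site requests pass; cross-site requests need an explicit pairing."""
    if page_host == request_host:
        return True
    return (page_host, request_host) in ALLOWED_PAIRS


print(is_request_allowed("news.example", "news.example"))  # same site
print(is_request_allowed("news.example", "cdn.example"))   # whitelisted pair
print(is_request_allowed("news.example", "addthis.com"))   # no pairing, request never made
print(is_request_allowed("blog.example", "cdn.example"))   # pairing is per site, not global
```

The last line is the point: allowing a CDN for one site does not allow it everywhere.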
In case you're wondering what's the difference between RequestPolicy and Ghostery:
i ate crayons when i was a kid and now i have two braincells and the blue ones taste nicer
I can see the privacy implications this has, but how in the world would such a method successfully discern between 2 identical devices?
I work with marketing software on and off. There are thousands of data points collected when you visit a site that cares enough to ID you. This would be just one. If this ID narrows the device down to 10 or so... and they also have date stamps, general location data based on your IP, browser type, etc? They can ID you specifically, pretty easily. I've not seen this particular method come up myself... in fact, most of the time the way the marketing software IDs you is irrelevant to the site owner. They just buy the software and install it. Done. The general doesn't care that there's 1 new landmine in his arsenal when he's already blanketed the field with thousands of them.
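A back-of-the-envelope version of that narrowing, as a Python sketch. Every number here is invented for illustration, and the calculation assumes the signals are statistically independent, which real data points often are not.

```python
import math


def bits(fraction: float) -> float:
    """Identifying information, in bits, of a value shared by `fraction` of visitors."""
    return -math.log2(fraction)


def expected_matches(population: int, fractions: list[float]) -> float:
    """Expected number of visitors sharing every value, assuming independence."""
    return population * math.prod(fractions)


POPULATION = 1_000_000  # hypothetical visitor pool

# Hypothetical share of visitors who have the same value as you for each signal.
signals = {
    "canvas hash": 10 / POPULATION,  # "narrows the device down to 10 or so"
    "browser type": 0.20,
    "IP-based region": 0.05,
}

for name, fraction in signals.items():
    print(f"{name}: {bits(fraction):.1f} bits")

remaining = expected_matches(POPULATION, list(signals.values()))
print(f"expected visitors matching all signals: {remaining:.2f}")
```

With these made-up figures the expected number of matching visitors falls to about 0.1, so the combination is effectively unique even though no single signal is.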
Also, you need to understand the goal here... they don't care who you are. They just want to know that you are visitor 52467, and all the other times you were here you looked at products X, P and Q so they can display more information on those products. They also salt the site with "Free" offers where all you need to do to claim them is enter your contact information. Once you do that they link that contact information to your browsing history and shoot it over to a salesman and/or send a personally designed advertisement to your email.
This may all sound dumb and horribly invasive... but it's amazingly successful. There is absolutely no way these companies would give it up voluntarily. Many of them wouldn't be in business without that sort of data... I'm not even sure you'd like it if it were gone. Getting ads is annoying, getting ads for African American hair styling products when you're a redhead is infuriating. Targeted ads are a good thing; it's the completely unaddressed side effects of that data collection that are the problem.
What needs to happen is that laws governing how long the data can be kept need to be passed. As of now, it's kept forever as far as I know... because... well, why not? And who the data is shared with needs to be regulated. The intercooperation of these companies is pretty scary. Amazon should not know what I'm searching for on WebMD, and the fact of the matter is, as of now, pretty much every major site you visit is sharing data with every other site you visit for mutual profit. This likely includes government websites. I've seen the marketing companies brag about their government contracts so that's a tad scary. Lastly, pretty much all regulation is not-so-cleverly avoided by simply changing the tech. The regulation needs to be broad and easy to understand. As of now they do things like "Well, that's not a person, that's a device!" or "Is that really data?" etc... Bill Clinton-style wordplay shouldn't absolve you of negligence.
The paper "Pixel Perfect: Fingerprinting Canvas in HTML5" by Keaton Mowery and Hovav Shacham is from 2012.
Depending on what you mean by 'block', there may or may not be a properly satisfactory answer:
'Block' as in 'make this specific mechanism fail' is the relatively easy question. If the attacker can't manipulate a canvas element and read the result, it won't work. So the usual JavaScript blockers or more selective breaking of some or all of the canvas element (the Tor Browser apparently already does this for methods that can be used to read back the contents of a canvas element, so you can still draw on one but not observe your handiwork) will do the job.
Unfortunately the attacker doesn't actually care about making your browser draw a picture, they care about achieving as accurate a UID as they can. Given that, you might actually make yourself more distinctive if your attempt to break a given fingerprinting mechanism succeeds. In the case of the Tor Browser, for instance, attempts to read a canvas will always be handled as though the canvas is all opaque white. This does prevent the attacker from learning anything useful about font rendering peculiarities or other quirks of your environment's canvas implementation; but it's also a behavior that, for the moment at least, only the Tor Browser has. Relatively uncommon. Possibly less common than the result that you'd receive from an unmodified browser.
That's the nasty thing about fingerprinting attacks. Fabricating or refusing to return many types of identifying information is relatively easy (at least once you know that attackers are looking for them); but unless you lie carefully, your fake data may actually be less common (and thus more trackable) than your real data.
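To put the "less common than your real data" worry into numbers, here is a toy Python sketch. The counts and hash labels are entirely made up; the only thing it shows is that a blocked read-back is itself a value the tracker can bucket you by.

```python
from collections import Counter

# Hypothetical canvas read-back results seen by a tracker across 1,000 visitors.
observed = Counter({
    "hash_stock_browser_A": 620,
    "hash_stock_browser_B": 377,
    "all_opaque_white": 3,  # read-back blocked, blank canvas returned
})


def anonymity_set(value: str) -> int:
    """How many observed visitors returned exactly this value."""
    return observed[value]


def share(value: str) -> float:
    """Fraction of all observed visitors who returned this value."""
    return observed[value] / sum(observed.values())


for value in observed:
    print(f"{value}: {anonymity_set(value)} visitors ({share(value):.1%})")
```

In this invented population the visitor who blocks the read-back hides in a crowd of 3, while the visitor with a stock browser hides in a crowd of several hundred.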
The research paper discusses two entirely different things: canvas fingerprinting, and "Evercookies & Respawning". Canvas fingerprinting is just another method of trying to determine which browser the user is running, by looking at differences in the way the canvas renders text and the like. "fingerprinting doesn’t work well on mobile" because of the homogeneous nature of mobile devices - 90% of iOS devices are running version 7.1, for example, so they are all using the same web browser version and rendering code, thus they are going to draw canvas fingerprints exactly the same. Nothing in the research article says anything about canvas fingerprinting being used to track people.
Now the other topic "Evercookies & Respawning" is about tracking users. That is using multiple storage vectors to try and keep users from deleting cookies. For example, using tiny hidden Flash apps which have their own caching, actual cookies, HTML5 persistent storage, embedding unique identifiers directly in the HTML so when the cached page is pulled up the identifier is once again active.
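Respawning is easy to model. The Python sketch below treats each storage vector as a plain dictionary; the vector names and functions are hypothetical stand-ins for the mechanisms listed above, not code from any real evercookie implementation.

```python
import uuid

# Hypothetical storage vectors, each modeled as a simple key/value store.
VECTORS = ("http_cookie", "flash_lso", "html5_local_storage", "cached_page")


def set_everywhere(stores: dict[str, dict[str, str]], visitor_id: str) -> None:
    """Write the identifier into every storage vector."""
    for name in VECTORS:
        stores[name]["uid"] = visitor_id


def respawn(stores: dict[str, dict[str, str]]) -> str:
    """Recover the ID from any surviving store (or mint one), then rewrite all stores."""
    survivors = [stores[name]["uid"] for name in VECTORS if "uid" in stores[name]]
    visitor_id = survivors[0] if survivors else uuid.uuid4().hex
    set_everywhere(stores, visitor_id)
    return visitor_id


stores: dict[str, dict[str, str]] = {name: {} for name in VECTORS}
first_id = respawn(stores)      # first visit: new ID written to every vector
stores["http_cookie"].clear()   # user deletes cookies
second_id = respawn(stores)     # ID is restored from the vectors that survived
print(first_id == second_id)
```

The identifier only goes away if every vector is cleared before the next visit.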
So at this point canvas fingerprinting isn't about tracking, but browser identification. The leap to "A New Form of Online Tracking: Canvas Fingerprinting", as described in the Pro Publica article:
A new, extremely persistent type of online tracking is shadowing visitors to thousands of top websites, from WhiteHouse.gov to YouPorn.com.
First documented in a forthcoming paper by researchers at Princeton University and KU Leuven University in Belgium, this type of tracking, called canvas fingerprinting, works by instructing the visitor’s Web browser to draw a hidden image. Because each computer draws the image slightly differently, the images can be used to assign each user’s device a number that uniquely identifies it.
Well that's completely wrong - the bold text should read "this type of tracking, called Evercookies & Respawning". The persistent tracking has nothing to do with the canvas fingerprinting. It's mainly due to Flash (which also explains why it too is ineffective on mobile devices).
Better known as 318230.
I'm more curious about why "different computer draws the image slightly differently".
Slight rounding differences, shape edge antialiasing behavior, font antialiasing behavior, installed fonts, and the like are the big ones I can think of. HTML5 Canvas behavior isn't specified down to the bit level.
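Here is a small Python sketch of why a difference that tiny matters once the pixels are hashed. The pixel bytes are invented (four RGBA pixels along an antialiased glyph edge), and SHA-256 is just my choice of hash for the example; a real script would run in the browser and read the canvas back there.

```python
import hashlib


def fingerprint(pixels: bytes) -> str:
    """Reduce a raw pixel buffer to a short identifier."""
    return hashlib.sha256(pixels).hexdigest()[:16]


# Two hypothetical renderings of the same edge, as RGBA bytes.
# They differ only in one alpha value: 128 versus 127.
machine_a = bytes([0, 0, 0, 255,  0, 0, 0, 128,  0, 0, 0, 64,  0, 0, 0, 0])
machine_b = bytes([0, 0, 0, 255,  0, 0, 0, 127,  0, 0, 0, 64,  0, 0, 0, 0])

print(fingerprint(machine_a))
print(fingerprint(machine_b))
print(fingerprint(machine_a) == fingerprint(machine_b))
```

One off-by-one rounding result in a single pixel is enough to produce a completely different identifier, while the same machine reproduces its own identifier every time.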
Not really. The Amish reject technology across the board, whether useful or not. People that are on the internet are obviously not rejecting technology across the board - JavaScript-in-the-browser is a single, very problematic technology, which is responsible for the vast majority of computer infections.
So no, people that do not allow JavaScript are not much like the Amish of the internet. We are more like the 'people who know how to use condoms' of the internet.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Friends don't let friends enable ecmascript.
"Getting ads is annoying, getting ads for African American hair styling products when you're a redhead is infuriating"
No it isn't for most people, because we got used to this a LOT with TV. TV almost never showed us advertising targeted at us specifically, but rather at a class of viewers. But you know who is infuriated by untargeted ads? Marketing people. Because targeted ads mean a better probability of turning an ad into a sale. In fact, if marketing people could totally break our privacy and put cameras everywhere to raise that probability further, they would do it, and pretend people like it. That's post hoc justification. It lets most marketing people avoid ever discussing their own moral and ethical choices: just pretend people like it and are infuriated when ads are not targeted at them, as opposed to being totally creeped out.
C. Sagan : A demon haunted world:
http://www.amazon.com/gp/product/0345409469/
visit randi.org
echo '0.0.0.0 addthis.com' | sudo tee /etc/hosts
also works.
That'll overwrite the whole file.
echo '0.0.0.0 addthis.com' | sudo tee -a /etc/hosts
will append.
There are those who say you need to use RequestPolicy and Ghostery and AdBlock and NoScript (and some other stuff, like a cookie blocker) to catch everything....
"[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz
Well, the other real issue here is that such fingerprinting is in place specifically to work around the "limitations" of cookies.
What are those "limitations"? That users can delete them. Honestly, most of the people I've dealt with when they ask for "better" fingerprinting cite that very cause. Not that cookies are per-browser and not per-user (which is what they want to track and what would be understandable at least). Not that cookies don't work with embedded devices. Not any of those real limitations, but the fact that users can opt to delete them.
So, really, they're working against users directly, explicitly and consciously.