How Identifiable Are You On the Web?
An anonymous reader writes: How identifiable are you on the web? This updated browser fingerprinting tool implements the current state of the art in browser fingerprinting techniques (including canvas fingerprinting) to show you how unique your browser is on the web.
Good food for thought when three-letter agencies talk about "mere metadata."
I haven't seen a /. effect for a long time.
The real "Libtards" are the Libertarians!
Always have been, and always will be, for as long as light echoes through space and time. But nobody really cares who I am. They know who I am, nevertheless.
I am the walrus.
Google serves my computer ads for men's watches; it serves my wife's computer, on the same NAT (the same PC, same screen resolution), ads for shoes. Both have cookies blocked and Flash disabled by default. Mine also blocks lots of Google sites, yet I have yet to find a way to block DoubleClick. Our browsers are both set to tell sites "do not track". Neither of us uses Google for search these days, having switched to DuckDuckGo.
So the fingerprinting is enough for Google to send us personalized adverts.
Now if someone can tell me the full list of domains I need to block to prevent DoubleClick (also from Google) from serving ads, I'd appreciate it.
First, the simplest of script blockers completely prevented the home page from loading at all.
Second, when I allowed the site in my script blocker, it was slow as hell to load.
But third, and more to the point: EFF's Panopticlick has been around for a long time now, and it's far better.
... of course they know who you are. You need an IP address to send and receive information; just the nature of making a connection leaves a trail all by itself. Next, it's not that hard to develop mathematical techniques to analyze the text and language of posts. Most people, being finite beings, have limited memory and interests, so anyone who wanted to could build profiles simply by combining all the tiny bits of different info into some unique ID.
The nature of our technology has augmented our ability to see and detect so much that it's increasingly difficult to hide anymore. I shudder to think how small cameras are becoming and how they will be all-pervasive where it matters. We're basically moving into a "tripwire" society where hidden and not-so-hidden automated systems track wherever you go and what you do, and all that data can be stored, analyzed, etc.
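As a rough illustration of "combining all the little bits into a unique ID", here is a minimal Python sketch. The attribute names and values are hypothetical, and hashing a canonical JSON string is just one simple way to do it, not a description of what any particular tracker actually runs.

```python
import hashlib
import json


def fingerprint(attributes: dict[str, str]) -> str:
    """Hash a set of browser attributes into one stable identifier."""
    # Sort the keys so the same attributes always serialize the same way,
    # regardless of the order in which they were collected.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, for illustration only.
browser_a = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "timezone": "UTC-6",
    "language": "en-US",
    "screen": "1920x1080x24",
}
# Same machine profile, but with one attribute changed.
browser_b = dict(browser_a, language="fr-FR")

print(fingerprint(browser_a) == fingerprint(browser_b))
```

None of the individual values is identifying on its own; the ID only becomes distinctive because the whole combination is hashed together, and changing any single value yields a different ID.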
Standard Mozilla behaviour, last time this question came up, was to include a list of fonts that your browser can display; I don't know whether other browsers do the same, or if they've changed it, but it's the kind of "feature" that hopelessly breaks your chances of non-uniqueness if you've ever installed fonts.
My work laptop has a font that's the Official Corporate-Branded font for $DAYJOB's corporate logo. Almost every Windows machine at my company has that (at least, every physical machine and the virtual machines running on the hosted virtual desktop cloud; there may be some lab machines that don't, and maybe some contractors, etc.) You might work for a smaller company that does the same. In my case, I've installed all sorts of other random fonts, either to see what they looked like, or simply because back in the 80s of course you wanted Elvish and Dwarvish fonts on your computer, or because I wanted a better monospaced programming font than the default MS one or Courier New.
Lots of other things leak information as well (cookies, etc.), but fonts are a quick and dirty way around identifying people who block those.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
0.3459*0.2254*0.5871*0.4004*0.2696*0.0109 = 0.00005386, which means about 1 in 20,000 browsers would share this combination. That is likely enough for tracking to be useful.
Fonts seem to be what does it. With many programs coming with extra/special fonts, it quickly narrows the users down based on what they have installed.
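For anyone who wants to check that arithmetic, here is a small Python sketch. It uses the six frequencies quoted in the comment and assumes, as the comment implicitly does, that the attributes are statistically independent; the function name is made up for this example.

```python
import math

# Per-attribute frequencies as quoted in the comment.
frequencies = [0.3459, 0.2254, 0.5871, 0.4004, 0.2696, 0.0109]


def joint_frequency_if_independent(freqs):
    """Multiply the frequencies; only valid if the attributes are independent."""
    return math.prod(freqs)


joint = joint_frequency_if_independent(frequencies)
one_in = 1 / joint

print(f"joint frequency: {joint:.8f}")
print(f"roughly 1 in {one_in:,.0f}")
```

The product comes out a little under 1 in 19,000, which the comment rounds to 1 in 20,000. If the attributes are correlated, the true share of matching browsers can differ a lot from this product.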
Of course, for fonts that only come as part of a software package but install fonts as system fonts (why?), it also tells remote sites what you have installed, which is an additional privacy concern.
Well, they claim 1 in 11000, as opposed to 1 in 20000. I question their math. (And yours.) You don't get to multiply the likelihood of Chrome and Chrome 39 together; they are highly correlated. See also Windows and Windows 7.
Your ad here. Ask me how!
Their sample size is 11-thousand. According to my results, 1-in-6 computers are running Linux!
This is absurd, unscientific to the extreme, fear-mongering.
In your example, based only on the statistics you provided, there were 11099 x 0.0109, or about 121, people in the central time zone *in their sample*, which is the sample size of UTC-6 users.
Their data is useless.
In comparison, https://panopticlick.eff.org/i... has almost 5-million in their database. This is somewhat more helpful.
The only thing I found interesting was this:
Use of AdBlock 49.28%
But that probably says more about the people who would visit the site than it does of AdBlock users.
Especially with the sample size as small as it is. https://panopticlick.eff.org/ has a much, much larger sample base.
Other things that could be checked but which aren't include whether the browser allows SSL2, SSL3, TLS1.0, TLS1.1, and what kind of encryption.
Also, the ballpark speed at which it evaluates Javascript.
Given that most of the data is what's reported by the browser, why don't browsers filter the data?
Especially if "Do Not Track" is set to on - why don't they limit the data they send back?
Fonts - Microsoft released 6 fonts for the web over a decade ago - just report those 6 across all platforms and maybe a few standard system ones (you can get this from the User-Agent anyways). Make it a whitelist of fonts.
Sure, some data is gathered through plugins, but I thought many are now click-to-run so you can't get that data unless you specifically run those plugins.
Is there a reason why browsers like Firefox return everything?
I agree that fonts seem to be the worst offender when it comes to browser fingerprinting. Surely browsers shouldn't need to send lists of installed fonts to web servers; a web page should simply list the desired fonts and the browser should decide, based on that, what font to use. Is the current behavior part of a standard? Even if it is, I hope browser makers are planning on stopping this leak.
The problem is not in fonts (on non-embedded there's no such thing as too many good fonts!), but in letting a random webpage poke that deeply into your system.
The message "No Flash or Java fonts detected" suggests who the culprits are. Flash belongs behind FlashBlock, Java belongs in /dev/null.
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
It's not the excessive tracking you should be afraid of. What you should worry about is the usage of incomplete data.
As has been covered on Slashdot before, the NSA kills people based on metadata.
Now add that together with the accidental killing of a person with the same name:
A Reprieve team investigating on the ground in Pakistan turned up what it believes to be a confirmed case of mistaken identity. Someone with the same name as a terror suspect on the Obama administration’s “kill list” was killed on the third attempt by US drones.
What this tells me is that what I really should worry about is accidentally having metadata that correlates with someone the government wants dead.
What are you talking about? Browsers don't send a list of installed fonts to anybody!
The detection occurs when in CSS you specify font-family: XYZ. This is going to be displayed in the default font, unless the font XYZ is installed. By analyzing the width of the element you specified the font for (or drawing it into a canvas element) you can distinguish the cases where the font is installed from the case where the default font is used instead.
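The comparison logic described above can be sketched in a few lines. This is a Python simulation, not browser code: in a real page the `measure` function would be JavaScript reading an element's rendered width (or drawing to a canvas), whereas here it is a stand-in with made-up widths, and all function and font names are chosen just for this example.

```python
from typing import Callable


def detect_fonts(candidates, measure: Callable[[str], float], fallback="monospace"):
    """Return the candidate fonts whose rendered width differs from the fallback's."""
    baseline = measure(fallback)
    installed = []
    for font in candidates:
        # A real page would style a test string with
        # `font-family: <font>, <fallback>` and read back its width.
        if measure(f"{font}, {fallback}") != baseline:
            installed.append(font)
    return installed


def make_fake_measure(installed_widths, fallback_width):
    """Simulated renderer: installed fonts get their own width, others fall back."""
    def measure(font_family):
        first_choice = font_family.split(",")[0].strip()
        return installed_widths.get(first_choice, fallback_width)
    return measure


# Made-up widths, for illustration only.
fake_measure = make_fake_measure({"Comic Sans MS": 412.0, "Tengwar": 377.5}, 400.0)
print(detect_fonts(["Comic Sans MS", "Helvetica Neue", "Tengwar"], fake_measure))
```

One limitation of this simple version: a font whose test string happens to render at exactly the fallback's width would go undetected, which is presumably why comparing against more than one fallback family helps.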
Hard to circumvent...
Write boring code, not shiny code!
This page will detect the fonts on your system without Java or Flash.
With NoScript enabled, it shows no fonts at all.
None of the buttons work, either.
Dunno what you're talking about.
But I wonder why my browser needs to provide details about the plugins I have installed to any website I visit. What kind of legitimate use could that have?
Sites retrieve the plugin list to see if you support whatever content they want to send you. If you don't have a certain plugin, the site can fall back to some other way of displaying the information, or it can refuse to do anything. For example, trying Flash to display a video, then falling back to HTML5.
Is it useful?
Somewhat, albeit less and less with HTML5. Also, there are many plugins sites don't need to know about, for example a PDF plugin. Some plugins should be totally transparent because they don't interact with the site.
Is it bad for anonymity? Yes, it's terrible.
Your understanding of their last statement is mistaken. The 1 in 11099 has nothing to do with the above statistics. It only says that of the 11099 browsers tested, only 1 has this exact combination of the above elements.
You're spot on, that's exactly what it says.
How big a set is, is irrelevant when considering its intersection with one or multiple other sets.
However, what the statistics do tell you is which of those parameters is more or less common within the ensemble. Eliminating a rarely occurring parameter could move you to a more common set intersection, thus making you less traceable. But deducing the intersection probability from the set statistics is not trivial, if possible at all without further constraints.
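To make the point about correlated attributes concrete, here is a minimal Python sketch with an entirely made-up sample of 100 browsers. It compares the product of the per-attribute frequencies with the share of browsers that actually have the full combination; the data and function names are invented for illustration.

```python
from collections import Counter

# Made-up sample of (browser, version, os) tuples, for illustration only.
sample = (
    [("Chrome", "39", "Windows")] * 40
    + [("Firefox", "34", "Windows")] * 30
    + [("Firefox", "34", "Linux")] * 20
    + [("Chrome", "39", "Linux")] * 10
)


def marginal(rows, index, value):
    """Share of rows whose attribute at `index` equals `value`."""
    return sum(1 for row in rows if row[index] == value) / len(rows)


def joint(rows, target_row):
    """Share of rows that match the full combination exactly."""
    return Counter(rows)[target_row] / len(rows)


target = ("Chrome", "39", "Windows")

# Naive estimate: multiply the per-attribute shares as if independent.
naive = 1.0
for index, value in enumerate(target):
    naive *= marginal(sample, index, value)

observed = joint(sample, target)
print(f"product of marginals: {naive:.3f}")
print(f"observed combination: {observed:.3f}")
```

In this toy sample "Chrome" and "39" always occur together, so multiplying both shares counts the same information twice and the product badly underestimates how common the combination really is. Only counting the combination directly gives the right answer.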
We're looking into putting in a recommendation system to help users improve their anonymity.
But I am wondering if 11099 trials can be considered significant in this case. They are looking at 6 or more parameters which have countless possible values.
It's sufficient for us to do quite a bit of analysis on the data and possibly to implement and provide the recommendation system. The data is, however, highly skewed towards geeks and towards users in France (a.k.a. French geeks!).
Disclaimer: a couple of colleagues and I created amiunique.org to get some data to understand fingerprinting better. It's a small student project but we feel there's potential. We were not ready for so many people to take an interest :)
Hard to circumvent...
NoScript takes care of most signature methods including tests for installed fonts.
I am becoming gerund, destroyer of verbs.
As others have noted, the EFF Panopticlick is the better service.
I just spent far too much time playing around with this, on an extended lunch break. I note the following things:
- You had better block explicit tracking services (e.g., with Ghostery), or it all doesn't matter anyway.
- Fonts are a big factor. Fonts are identified through Flash. There is a configuration file "mms.cfg" that can disable this. The location of this file depends on your operating system and on your browser - it took me a good half-hour to find it for my particular configuration.
- However, even after disabling fonts, and even using a "user-agent switcher" to look like a Windows/Chrome combination (instead of Linux/Chrome), I was still uniquely identifiable. The biggest factors were my language preferences, the list of plugins, and the precise browser version. Refusing to report system fonts was also pretty important :-/
In short, there's not much way around it - if you include other information available, like your IP address, you will be uniquely identifiable, and trackable across websites.
What is missing from this picture: Browsers provide an "incognito" mode. This mode needs to be extended to provide only absolutely essential information to the server. The server needs to know roughly what level of standards support you have (e.g., "Mozilla/5.0"), and what language to send content in (one language, not a list with weights). Everything else could be omitted, and virtually all websites would work perfectly.
Go a step farther and disable JavaScript in incognito mode, to prevent explicit sniffing. That will disable more websites, but if those sites start losing traffic, they'll offer versions that don't require JS.
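A rough Python sketch of what such a stripped-down mode might send, assuming the browser keeps only a generic User-Agent token and a single preferred language. The function names are hypothetical and no browser is known to implement exactly this.

```python
def preferred_language(accept_language: str) -> str:
    """Pick the single highest-weighted tag from an Accept-Language value."""
    best_tag, best_q = "", -1.0
    for part in accept_language.split(","):
        fields = [field.strip() for field in part.split(";")]
        tag = fields[0]
        if not tag:
            continue
        q = 1.0  # A tag without an explicit weight counts as q=1.
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0  # Treat a malformed weight as lowest priority.
        if q > best_q:
            best_tag, best_q = tag, q
    return best_tag


def minimal_request_headers(headers: dict[str, str]) -> dict[str, str]:
    """Keep only a generic User-Agent token and one language; drop the rest."""
    minimal = {}
    user_agent = headers.get("User-Agent", "")
    if user_agent:
        # Keep just the leading product token, e.g. "Mozilla/5.0".
        minimal["User-Agent"] = user_agent.split(" ", 1)[0]
    language = preferred_language(headers.get("Accept-Language", ""))
    if language:
        minimal["Accept-Language"] = language
    return minimal


# Example request headers, made up for illustration.
full_headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
    "Accept-Language": "fr-FR,fr;q=0.8,en;q=0.5",
    "DNT": "1",
}
print(minimal_request_headers(full_headers))
```

This only covers what is sent in the request headers. Anything a script can probe after the page loads (fonts, plugins, canvas) would still need the JavaScript restrictions mentioned above.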
Enjoy life! This is not a dress rehearsal.
"Your browser fingerprint appears to be unique among the 4,789,097 tested so far.
Currently, we estimate that your browser has a fingerprint that conveys at least 22.19 bits of identifying information."
Unique amongst the browsers tested. Is there a selection bias going on amongst people who would check to see if their browser is unique? I tried with IE from a generic install of Windows 7 and still got the "you appear to be unique" message.
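The 22.19-bit figure in the quoted message is consistent with being unique among 4,789,097 browsers: a fingerprint seen once in N carries log2(N) bits. A quick Python check (the function name is made up for this example):

```python
import math


def identifying_bits(one_in: float) -> float:
    """Bits of information carried by a fingerprint seen once in `one_in` browsers."""
    return math.log2(one_in)


# Unique among the 4,789,097 browsers quoted above.
print(round(identifying_bits(4_789_097), 2))  # prints 22.19
```

This also explains the "at least" in the message: being unique in the sample only puts a lower bound on the bits, since a larger sample could reveal that the fingerprint is rarer still.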
NoScript will be disabled on the websites you want to do something with. Those will be able to track you.