HTTP GZIP Compression Leaks Data On the Location of Tor Web Servers
An anonymous reader writes: The GZIP compression format includes a header field that records the date on which the data was gzipped, taken from the Web server's local clock. Almost all Web servers pad this field with zeros by default, citing performance issues. Around 10% of Tor site operators have changed this default and are writing the compression date into the header. Unknown to them, this "server local date" leaks the Tor site's timezone, which law enforcement can then use to narrow the server's location down to a specific geographical area. Coupled with other Tor protocol leaks, this could help deanonymize .onion sites.
Or just pad it with zeros like everything else does, apparently.
Better to go with the flow in this case instead of trying to be clever.
My eyes reflect the stars and a smile lights up my face.
For very large values of location.
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
You thought it was the onion ring because of the layers, no it's because of the hole in the middle.
Nullius in verba
Relevant parts of the Gzip specification, RFC-1952:
2.3.1
MTIME (Modification TIME)
This gives the most recent modification time of the original
file being compressed. The time is in Unix format, i.e.,
seconds since 00:00:00 GMT, Jan. 1, 1970. (Note that this
may cause problems for MS-DOS and other systems that use
local rather than Universal time.) If the compressed data
did not come from a file, MTIME is set to the time at which
compression started. MTIME = 0 means no time stamp is
available.
7.
When compressing or decompressing a file, gzip preserves the
protection, ownership, and modification time attributes on the local
file system, since there is no provision for representing protection
attributes in the gzip file format itself. Since the file format
includes a modification time, the gzip decompressor provides a
command line switch that assigns the modification time from the file,
rather than the local modification time of the compressed input, to
the decompressed output.
RFC1952 clearly states that the mtime header is a POSIX timestamp, i.e., it is in universal time and not local time. The author of TFA somehow either completely missed or neglected to mention the fact that, per spec, there is no leakage of the timezone, and in fact two of his examples demonstrate exactly that.
Of the three examples cited in TFA, two of them - reddit.com and instagram.com - follow the spec and use POSIX time. Just run the php tool from TFA and you'll see that the time returned matches the current UTC time. Those servers aren't leaking their location because they follow the spec.
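For anyone who wants to check this without the PHP tool, here is a minimal Python sketch of reading the field. Per RFC 1952 the fixed header is ID1, ID2, CM, FLG, then a 4-byte little-endian MTIME. The helper names `gzip_mtime` and `make_gzip` are made up for this example, and the data is compressed locally rather than fetched from any of the sites mentioned.

```python
import gzip
import io
import struct
from datetime import datetime, timezone


def gzip_mtime(blob: bytes) -> int:
    """Return the MTIME field (header bytes 4-7, little-endian) of a gzip stream."""
    magic, _method, _flags, mtime = struct.unpack("<HBBI", blob[:8])
    if magic != 0x8B1F:  # ID1=0x1f, ID2=0x8b, read as one little-endian word
        raise ValueError("not a gzip stream")
    return mtime


def make_gzip(data: bytes, mtime: int) -> bytes:
    """Compress in memory with an explicit header timestamp (hypothetical helper)."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


if __name__ == "__main__":
    stamped = make_gzip(b"hello", 1_450_000_000)
    ts = gzip_mtime(stamped)
    # A spec-compliant value is seconds since the epoch in UTC, so it converts
    # cleanly to a UTC datetime and says nothing about the server's timezone.
    print(ts, datetime.fromtimestamp(ts, tz=timezone.utc).isoformat())
    print(gzip_mtime(make_gzip(b"hello", 0)))  # 0 means "no time stamp available"
```

To inspect a real response you would feed the raw, still-compressed body into `gzip_mtime` instead of the locally built sample.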
Only one example - bing.com - uses something other than POSIX time. Surprise surprise, some Windows-based server - presumably IIS? - ignores the standard and leaks the timezone in the process.
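To make the leak concrete, here is a rough sketch of how an observer could turn a non-compliant MTIME into a timezone guess. It assumes the server wrote its local wall-clock time into the field as if it were epoch seconds, and that the observer knows the true UTC time of the response; the function name and the quarter-hour rounding are choices made for this example, not anything taken from TFA.

```python
def apparent_utc_offset_hours(mtime: int, observed_utc: int) -> float:
    """Estimate a server's UTC offset from a gzip MTIME holding local wall-clock time.

    mtime        -- value read from the gzip header
    observed_utc -- true epoch seconds (UTC) when the response was received
    Rounds to the nearest 15 minutes so small clock skew and network delay drop out.
    """
    quarter_hours = round((mtime - observed_utc) / 900)
    return quarter_hours / 4


if __name__ == "__main__":
    now = 1_450_000_000
    # Compliant server: MTIME is already UTC, so the apparent offset is zero.
    print(apparent_utc_offset_hours(now + 2, now))
    # Hypothetical non-compliant server eight hours behind UTC.
    print(apparent_utc_offset_hours(now - 8 * 3600 + 3, now))
```

An offset of zero is ambiguous: it fits a compliant server anywhere, or a non-compliant one that happens to sit at UTC.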
Now the question is, are people seriously running TOR hidden services on Windows machines? That just seems like asking for trouble. The operational security requirements of TOR hidden services are significantly higher than your average server, and I bet the chances of screwing that up with a Windows server are much higher. Leaking the timezone is probably the least of your worries in that case.
TL;DR Some Windows web server mis-implements the gzip standard and leaks the local timezone in the process. Spec-compliant web servers are not affected. TFA mis-identified two compliant servers as being affected. TFA did not list any Tor hidden services that are affected to allow for confirmation. This is mostly a non-issue.
All my servers are set to GMT. Why? Because when you're running across multiple TZs, it's a hell of a lot easier to trace logs when they use a single common global time. My activities don't care if they're in Asia/Tokyo, Europe/Berlin, Australia/Melbourne, or America/New_York, especially when services cross those regions.
The cesspool just got a check and balance.
This is not a problem with Tor. This is the server operator failing to properly anonymize their server.
It's like if I go and download and use the Tor Browser, but then fall victim to a phishing scam and give out personal information while using it. Tor will anonymise your connection to websites perfectly fine, but you the user are leaking information about yourself and Tor can't do anything about that. This is the same kind of issue.
Or just pad it with zeros like everything else does, apparently.
Even better would be to fill it with a value for a randomly selected TZ. That way you are poisoning the data, so "they" cannot be sure if any TZ fields are valid.
There are undocumented gzip command line switches (-m, -M) that control embedding timestamps in gzip archives. They're not mentioned in the man page or --help output, but you can see them in the source here (line 344): http://git.savannah.gnu.org/cgit/gzip.git/tree/gzip.c
#ifdef UNDOCUMENTED
" -m, --no-time do not save or restore the original modification time",
" -M, --time save or restore the original modification time",
#endif
I learned about this because I had to ensure consistent hash values of build artifacts for regulatory reasons, and I believe it is a misfeature. For me, the Principle of Least Surprise would have gzip produce the exact same output given the same input, by default. As it is, you get a slightly different output each time you compress the same set of bits, and that is entirely down to this timestamp. I think the fact that switches to achieve that behavior exist yet are undocumented betrays some conflict about this.
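The same reproducibility problem shows up outside the command-line tool. As a minimal sketch, assuming Python's standard `gzip` module rather than GNU gzip, pinning the header timestamp is enough to get byte-identical output; `gzip_bytes` is a name invented for this example.

```python
import gzip
import hashlib
import io
from typing import Optional


def gzip_bytes(data: bytes, mtime: Optional[int] = None) -> bytes:
    """Compress in memory; mtime=None lets the library stamp the current time."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


if __name__ == "__main__":
    payload = b"build artifact contents"
    first = hashlib.sha256(gzip_bytes(payload, mtime=0)).hexdigest()
    second = hashlib.sha256(gzip_bytes(payload, mtime=0)).hexdigest()
    print(first == second)  # same input, same pinned timestamp, same hash
    # Different timestamps change only the header, but that is enough to
    # change the hash of the whole archive.
    print(gzip_bytes(payload, mtime=1) == gzip_bytes(payload, mtime=2))
```

On the command line, the documented `-n` (`--no-name`) option of GNU gzip also leaves the original name and timestamp out of the header when compressing.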
Lurking at the bottom of the gravity well, getting old
find a way to slap a VPN after TOR
You don't have any control over the "after TOR" side of the connection. You could slap a VPN before TOR, or operate an exit node that uses a VPN, but there's no way you'd want to be using your own exit node if you wanted the protection of TOR.
This has nothing to do with Tor and has everything to do with incompetent sysadmins.
And if the sysadmins of Tor nodes are incompetent, it has everything to do with Tor.
I see TOR kind of like HTTPS: it won't necessarily keep your transmission from being decrypted and deanonymized, but it probably makes it much harder to do so. As such it just sort of raises your default level of privacy (from plain HTTP).
but there's no way you'd want to be using your own exit node if you wanted the protection of TOR.
And using someone else's exit node to access your own VPN is also a bad idea.
Yeah, because knowing the server's location at the resolution of a timezone will help a lot...
If the time zone is the Winamac time zone in Indiana, or some of the other very regional time zones, it may.
It's just another datum in fingerprinting, but in some cases, it may be the crucial one.
It could be more helpful than you think. If the server says its timezone is in the US, for example, that may be enough for a judge to grant the FBI a warrant authorizing god-knows-what attacks against it.
"BSD: Free as in speech. Linux: Free as in beer. Windows 10: Free as in herpes." --Man On Pink Corner in #52607549.
You're thinking of BST. GMT is constant; the UK switches to BST during the summer.
http://spamdecoy.net - free throwaway anonymous email - avoid spam!
Almost every attempt to poison data turns into another datapoint. That datapoint is likely more valuable than a NULL value.
For instance, that leaks data about your pseudo-random number generator, opens up timing based identification, etc.
Your ad here. Ask me how!
Let's see... The FBI takes over onion servers, supposedly paid universities for "research", managed to find the operator of Silk Road 2.0, and managed a rather large bust of criminals. Then you've got the issue of "fake Tor clients"... Seems to me that whether you use Tor with a VPN before or after is really irrelevant, and that the underlying technology of Tor has just been under attack for too long and needs to be replaced.
Select from tblFriends where interesting >= 4;
Exactly...
You would think nobody who decided to rob a bank would write the note on the back of mail addressed to themselves.
Hell, you might think that nobody would think "hey, in this live production stock exchange trading system, let's try entering a value of -6".
Then you might think "Surely nobody developing a live production stock exchange trading system would ever simply cast a signed integer into an unsigned integer and allow a user to accidentally post a 69 trillion dollar trade as a result?"
You would be wrong on all three, but the first has happened so many times that you can find those for yourself:
http://news.slashdot.org/story...
"I opened my eyes, and everything went dark again"