HTTP GZIP Compression Leaks Data On the Location of Tor Web Servers
An anonymous reader writes: The GZIP compression format includes a field in its header that shows the Web server's local date, at which the data was gzipped. Almost all Web servers use "zeros" to pad this field by default, citing performance issues. Around 10% of Tor site operators have removed this feature and are printing the packet's compression date. Unknown to them, this "server local date" leaks the Tor site's timezone, which law enforcement can then use to narrow the server's location down to a specific geographical area. Coupled with other Tor protocol leaks, this could help deanonymize .onion sites.
Tor is looking more and more "holey" all the time.
I can't help but wonder if the recent glibc DNS issue is not also a help in this deanonymization.
It seems to me there are fewer and fewer possibilities to escape the global panopticon.
The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
Patch it to use the same time-zone (e.g. UTC+0)?
For very large values of location.
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
Relevant parts of the Gzip specification, RFC-1952:
2.3.1
MTIME (Modification TIME)
This gives the most recent modification time of the original
file being compressed. The time is in Unix format, i.e.,
seconds since 00:00:00 GMT, Jan. 1, 1970. (Note that this
may cause problems for MS-DOS and other systems that use
local rather than Universal time.) If the compressed data
did not come from a file, MTIME is set to the time at which
compression started. MTIME = 0 means no time stamp is
available.
7.
When compressing or decompressing a file, gzip preserves the
protection, ownership, and modification time attributes on the local
file system, since there is no provision for representing protection
attributes in the gzip file format itself. Since the file format
includes a modification time, the gzip decompressor provides a
command line switch that assigns the modification time from the file,
rather than the local modification time of the compressed input, to
the decompressed output.
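For anyone who wants to look at the field itself, here is a minimal Python sketch of reading MTIME out of a gzip header. Per the RFC text quoted above, it sits in bytes 4 to 7 of the member header as a little-endian 32-bit count of seconds since the Unix epoch. The helper name `read_gzip_mtime` is made up for illustration, and the sample data is compressed locally with a fixed timestamp rather than fetched from any real server.

```python
import gzip
import struct
from datetime import datetime, timezone


def read_gzip_mtime(blob: bytes) -> int:
    """Return the raw MTIME field (header bytes 4..7, little-endian)."""
    if len(blob) < 10 or blob[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    (mtime,) = struct.unpack_from("<I", blob, 4)
    return mtime


# Compress with an explicit timestamp so the example is deterministic.
sample = gzip.compress(b"hello", mtime=1_000_000_000)

mtime = read_gzip_mtime(sample)
# MTIME = 0 means "no time stamp is available" according to the spec.
if mtime == 0:
    print("no timestamp in header")
else:
    print(mtime, datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat())
```

The same function applied to a gzipped HTTP response body would show whether a server zeroes the field or fills it in.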
So it effectively narrows things down to 6 billion divided by 24, and it's easily mitigated: just set a different timezone if it's your server, or connect to a Tor server in a different time zone.
Build a Man a Fire, and He'll Be Warm for a Day. Set a Man on Fire, and He'll Be Warm for the Rest of His Life.
Pretty sure that if someone is running an anonymous server, for whatever purpose, they aren't going to fill in their location details accurately and just stick with the defaults. It's an interesting leak, but it's a bit of a stretch to think this is a significant issue. Besides, now that it's been reported, whatever leverage it could have given law enforcement is basically zero.
It's not a Tor problem, it's a metadata mindset problem.
RFC1952 clearly states that the mtime header is a POSIX timestamp, i.e., it is in universal time and not local time. The author of TFA somehow either completely missed or neglected to mention the fact that, per spec, there is no leakage of the timezone, and in fact two of his examples demonstrate exactly that.
Of the three examples cited in TFA, two of them - reddit.com and instagram.com - follow the spec and use POSIX time. Just run the php tool from TFA and you'll see that the time returned matches the current UTC time. Those servers aren't leaking their location because they follow the spec.
Only one example - bing.com - uses something other than POSIX time. Surprise surprise, some Windows-based server - presumably IIS? - ignores the standard and leaks the timezone in the process.
Now the question is, are people seriously running Tor hidden services on Windows machines? That just seems like asking for trouble. The operational security requirements of Tor hidden services are significantly higher than those of your average server, and I bet the chances of screwing that up with a Windows server are much higher. Leaking the timezone is probably the least of your worries in that case.
TL;DR Some Windows web server mis-implements the gzip standard and leaks the local timezone in the process. Spec-compliant web servers are not affected. TFA mis-identified two compliant servers as being affected. TFA did not list any Tor hidden services that are affected to allow for confirmation. This is mostly a non-issue.
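The check described above boils down to comparing the header timestamp with real UTC. Below is a rough Python sketch, assuming the observer already has the MTIME value and a trustworthy UTC clock. The function name and the sample numbers are invented for illustration and are not measurements from any of the sites mentioned. A compliant server should come out at roughly 0, while a server that writes its local wall-clock time into the field shows up as a whole-hour (or quarter-hour) offset.

```python
def inferred_utc_offset_hours(observed_mtime: int, true_utc_epoch: int) -> float:
    """Round the gap between a header timestamp and real UTC to quarter hours."""
    diff_seconds = observed_mtime - true_utc_epoch
    quarter_hours = round(diff_seconds / 900)  # 900 s = 15 minutes
    return quarter_hours / 4


# Assumed observation time, in seconds since the epoch (UTC).
now = 1_456_000_000

# Spec-compliant server: MTIME is within a few seconds of real UTC.
print(inferred_utc_offset_hours(now + 3, now))             # 0.0

# Hypothetical non-compliant server writing local time at UTC-8.
print(inferred_utc_offset_hours(now - 8 * 3600 + 5, now))  # -8.0
```

Rounding to quarter hours absorbs small clock skew and network delay. A server whose clock is simply wrong would produce a misleading offset, so a single sample proves little.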
There are undocumented gzip command line switches (-m, -M) that control embedding timestamps in gzip archives. They're not mentioned in the man page or --help output, but you can see them in the source here (line 344): http://git.savannah.gnu.org/cgit/gzip.git/tree/gzip.c
#ifdef UNDOCUMENTED
" -m, --no-time do not save or restore the original modification time",
" -M, --time save or restore the original modification time",
#endif
I learned about this because I had to ensure consistent hash values of build artifacts for regulatory reasons, and I believe it is a misfeature. For me, the Principle of Least Surprise would have gzip produce the exact same output given the same input, by default. As it is, you get a slightly different output each time you compress the same set of bits, and that is entirely down to this timestamp. I think the fact that switches to achieve that behavior exist yet are undocumented suggests some conflict about this.
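The same effect can be shown outside the gzip command line tool. This is a small sketch using Python's standard `gzip` module (the `mtime` argument to `gzip.compress` needs Python 3.8 or later), not the build setup described above. Pinning the timestamp to 0 makes repeated compressions of the same bytes hash identically, and changing only the timestamp changes the output.

```python
import gzip
import hashlib

payload = b"same bits in, same bits out\n"

# With the timestamp pinned to 0 ("no time stamp is available"),
# two independent compressions are byte-for-byte identical.
first = gzip.compress(payload, mtime=0)
second = gzip.compress(payload, mtime=0)
print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())

# Changing nothing but the embedded timestamp changes the output.
stamped_a = gzip.compress(payload, mtime=1)
stamped_b = gzip.compress(payload, mtime=2)
print(stamped_a == stamped_b)

# The payload itself round-trips either way.
print(gzip.decompress(first) == gzip.decompress(stamped_a) == payload)
```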
Lurking at the bottom of the gravity well, getting old
This has nothing to do with Tor and has everything to do with incompetent sysadmins.
And if the sysadmins of Tor nodes are incompetent, it has everything to do with Tor.
Aren't you asking for trouble by compressing data and using HTTPS? My understanding was that BREACH was still a viable attack against TLS when using HTTPS. Seems to me that the last hop between you and the actual hidden service could use such an attack.
Don't most servers use the UTC timezone anyway?
Law enforcement can use the information, but the other 99% of your adversaries can't?!
Or they can intentionally set their timezone to a different value to mislead...
Chances are, if zeroes are the default and Tor sites have explicitly turned this off, then that's exactly what they've done... People running sites via Tor are likely to be privacy conscious, so if they've changed a setting to a non-default value, they probably did it for a reason.
http://spamdecoy.net - free throwaway anonymous email - avoid spam!
Hide the offset, Use UTC exclusively
Maybe the Metadata Anonymization Toolkit could help with this. https://mat.boum.org/
Set your server to UTC and don't worry about it.
I usually set mine to UTC, no matter where they are.
For this kind of leak I might "accidentally-on-purpose" select a timezone the machine doesn't happen to be in.
Why would you be doing a compression operation OVER AN EXTERNAL NETWORK, rather than crunching the file locally and then transmitting the compressed data over the internet? Unless, that is, you're seriously of the opinion that you can get higher actual communication speeds over a shared line servicing hundreds or thousands of other customers (compared to cables in your own wiring loom). That's to say nothing of the latencies and delays that are inherent in the Tor system itself.
Birds are not dinosaur descendants; birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"