Slashdot Mirror


HTTP GZIP Compression Leaks Data On the Location of Tor Web Servers

An anonymous reader writes: The GZIP compression format includes a field in its header that shows the Web server's local date and time at which the data was gzipped. Almost all Web servers pad this field with zeros by default, citing performance reasons. Around 10% of Tor site operators have changed this default and are emitting the compression date. Unknown to them, this "server local date" leaks the Tor site's timezone, which law enforcement can then use to narrow the server's location down to a specific geographical area. Coupled with other Tor protocol leaks, this could help deanonymize .onion sites.

5 of 79 comments

  1. What the gzip spec says about MTIME by Anonymous Coward · · Score: 5, Informative

    Relevant parts of the Gzip specification, RFC-1952:

    2.3.1
                      MTIME (Modification TIME)
                            This gives the most recent modification time of the original
                            file being compressed. The time is in Unix format, i.e.,
                            seconds since 00:00:00 GMT, Jan. 1, 1970. (Note that this
                            may cause problems for MS-DOS and other systems that use
                            local rather than Universal time.) If the compressed data
                            did not come from a file, MTIME is set to the time at which
                            compression started. MTIME = 0 means no time stamp is
                            available.

    7.
    When compressing or decompressing a file, gzip preserves the
          protection, ownership, and modification time attributes on the local
          file system, since there is no provision for representing protection
          attributes in the gzip file format itself. Since the file format
          includes a modification time, the gzip decompressor provides a
          command line switch that assigns the modification time from the file,
          rather than the local modification time of the compressed input, to
          the decompressed output.
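
For anyone who wants to look at the field directly, here is a minimal Python sketch (not from the RFC or the article) that reads MTIME out of the fixed 10-byte gzip header. It assumes the layout quoted above, with MTIME stored as a little-endian unsigned 32-bit value at byte offset 4. `read_gzip_mtime` is a made-up helper name, and the `mtime=` argument to `gzip.compress` needs Python 3.8 or later.

```python
import gzip
import struct


def read_gzip_mtime(blob: bytes) -> int:
    """Return the MTIME field (seconds since the Unix epoch) of a gzip stream."""
    # The fixed part of the header is 10 bytes and starts with 0x1f 0x8b.
    if len(blob) < 10 or blob[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    # MTIME occupies bytes 4..7, least-significant byte first.
    (mtime,) = struct.unpack_from("<I", blob, 4)
    return mtime


if __name__ == "__main__":
    stamped = gzip.compress(b"hello", mtime=1234567890)
    blank = gzip.compress(b"hello", mtime=0)
    print(read_gzip_mtime(stamped))  # the value passed in above
    print(read_gzip_mtime(blank))    # 0 means "no time stamp is available"
```

Pointing the same function at the first few bytes of a gzipped HTTP response body would show whatever the server chose to put there.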

    1. Re:What the gzip spec says about MTIME by unrtst · · Score: 5, Informative

      Vote parent up.

      The article the summary references is just a summary of this: http://jcarlosnorte.com/securi...

      In it, he notes:
      Offset  Size  Value      Description
      0       2     0x1f 0x8b  Magic number to identify gzip streams
      2       1                Compression method
      3       1                Flags
      4       4                Compression Date
      8       1                Compression flags
      9       1                Operating system

      He references that as coming from: http://www.forensicswiki.org/w...
      But that document does not say "Compression Date". It actually says:

      4 4 Last modification time. Contains a POSIX timestamp.

      Even his proof of concept shows that he's parsing that field as a POSIX timestamp: https://github.com/jcarlosn/gz...

      echo date('l jS \of F Y h:i:s A', $rdate);

      It appears that either:

      a) Something else in his php script is setting the TZ before doing that parse
      b) The server is calculating the POSIX timestamp incorrectly, which is a similar issue but quite a different root cause.
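
To illustrate why a correct POSIX timestamp carries no timezone by itself, here is a small Python sketch (an illustration only, not a port of the PHP proof of concept). The same integer is rendered under two UTC offsets; only the formatting step depends on a zone. The `render` helper and the sample timestamp are assumptions picked for the example.

```python
from datetime import datetime, timedelta, timezone


def render(ts: int, offset_hours: int) -> str:
    """Format a POSIX timestamp as wall-clock time at a fixed UTC offset."""
    tz = timezone(timedelta(hours=offset_hours))
    return datetime.fromtimestamp(ts, tz).strftime("%Y-%m-%d %H:%M:%S %z")


if __name__ == "__main__":
    ts = 1_000_000_000  # arbitrary sample value
    print(render(ts, 0))   # 2001-09-09 01:46:40 +0000
    print(render(ts, -8))  # 2001-09-08 17:46:40 -0800
```

Both lines describe the same instant, so a timezone can only leak if the server gets the integer itself wrong, or if the parsing side applies its own zone.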

    2. Re:What the gzip spec says about MTIME by unrtst · · Score: 5, Informative

      ... just to confirm, the answer is "b": The server is calculating the POSIX timestamp incorrectly, which is a similar issue but quite a different root cause.

      I updated his script to print the difference between the current POSIX timestamp and the value returned by the server.
      bing.com: current - server_value = 28800
      reddit.com: 0
      instagram.com: 0

      Those were his three tests. I'm not surprised the Microsoft server is the one calculating a POSIX timestamp incorrectly. MS folks tend to do timestamp math very poorly. I suspect this only affects Microsoft servers, or horribly misconfigured $anything_else.
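
A rough Python sketch of how a difference like that maps to a UTC offset (this is not the modified script mentioned above). 28800 seconds is exactly 8 hours, which is what you would see if a clock running 8 hours behind UTC were written out as if it were UTC. The `implied_offset_hours` name and the 15-minute rounding used to absorb network delay and clock jitter are assumptions.

```python
def implied_offset_hours(current_ts: int, server_mtime: int,
                         granularity: int = 900) -> float:
    """Estimate the UTC offset (in hours) implied by a skewed gzip MTIME.

    A correct server yields roughly 0. A server that encodes its local
    wall-clock time as if it were UTC yields its local offset.
    """
    skew = server_mtime - current_ts
    # Snap to the nearest multiple of `granularity` seconds (15 min default).
    rounded = round(skew / granularity) * granularity
    return rounded / 3600


if __name__ == "__main__":
    now = 1_000_028_800  # hypothetical "current" timestamp
    print(implied_offset_hours(now, now - 28800))  # -8.0, like the bing.com result
    print(implied_offset_hours(now, now))          # 0.0, like the other two
```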

  2. Re:Use a single timezone by Gr8Apes · · Score: 3, Informative

    All my servers are set to GMT. Why? Because when you're running across multiple TZs, it's a hell of a lot easier to trace logs when they use a single common global time. My activities don't care if they're in Asia/Tokyo, Europe/Berlin, Australia/Melbourne, or America/New_York, especially when services cross those regions.
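
If changing the system timezone is not an option, the same idea can be applied per application. A minimal sketch with Python's standard `logging` module, assuming the stock `Formatter` is in use; the format strings and the `make_utc_formatter` helper are just examples.

```python
import logging
import time


def make_utc_formatter() -> logging.Formatter:
    """Build a formatter whose %(asctime)s is always rendered in UTC."""
    formatter = logging.Formatter(
        fmt="%(asctime)s %(levelname)s %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%SZ",
    )
    # By default the formatter uses time.localtime; gmtime gives UTC instead.
    formatter.converter = time.gmtime
    return formatter


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(make_utc_formatter())
    log = logging.getLogger("demo")
    log.addHandler(handler)
    log.warning("same clock everywhere")
```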

    --
    The cesspool just got a check and balance.
  3. undocumented gzip by TopSpin · · Score: 5, Informative

    There are undocumented gzip command line switches (-m, -M) that control embedding timestamps in gzip archives. They're not mentioned in the man page or --help output, but you can see them in the source here (line 344): http://git.savannah.gnu.org/cgit/gzip.git/tree/gzip.c

    #ifdef UNDOCUMENTED
    " -m, --no-time do not save or restore the original modification time",
    " -M, --time save or restore the original modification time",
    #endif

    I learned about this because I had to ensure consistent hash values of build artifacts for regulatory reasons, and I believe it is a misfeature. For me, the Principle of Least Surprise would have gzip produce the exact same output given the same input, by default. As it is, you get slightly different output each time you compress the same set of bits, and that is entirely down to this timestamp. I think the fact that switches to achieve that behavior exist yet are undocumented betrays some conflict about this.
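
The effect on hashes is easy to reproduce outside the gzip command line. Below is a hedged Python sketch using the standard `gzip` module rather than the switches above: pinning MTIME to 0 makes the output depend only on the input bytes (for a given Python/zlib build and compression level). `deterministic_gzip` is a made-up name, and `mtime=` requires Python 3.8 or later.

```python
import gzip
import hashlib


def deterministic_gzip(data: bytes) -> bytes:
    """Compress with MTIME forced to 0 so repeated runs give identical bytes."""
    return gzip.compress(data, compresslevel=9, mtime=0)


if __name__ == "__main__":
    payload = b"build artifact contents"

    # Same input, pinned timestamp: same bytes, same hash.
    a = hashlib.sha256(deterministic_gzip(payload)).hexdigest()
    b = hashlib.sha256(deterministic_gzip(payload)).hexdigest()
    print(a == b)

    # Same input, different embedded timestamps: different bytes.
    x = gzip.compress(payload, mtime=1)
    y = gzip.compress(payload, mtime=2)
    print(x == y)
```

Only the header differs between `x` and `y`, so both still decompress to the same payload.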

    --
    Lurking at the bottom of the gravity well, getting old