New Attack Steals SSNs, E-mail Addresses, and More From HTTPS Pages (arstechnica.com)
Security researchers at KU Leuven have discovered an attack technique, dubbed HEIST (HTTP Encrypted Information can be Stolen Through TCP-Windows), which can exploit an encrypted website using only a JavaScript file hidden in a maliciously crafted ad or page. ArsTechnica reports: Once attackers know the size of an encrypted response, they are free to use one of two previously devised exploits to ferret out the plaintext contained inside it. Both the BREACH and the CRIME exploits are able to decrypt payloads by manipulating the file compression that sites use to make pages load more quickly. HEIST will be demonstrated for the first time on Wednesday at the Black Hat security conference in Las Vegas. "HEIST makes a number of attacks much easier to execute," Tom Van Goethem, one of the researchers who devised the technique, told Ars. "Before, the attacker needed to be in a Man-in-the-Middle position to perform attacks such as CRIME and BREACH. Now, by simply visiting a website owned by a malicious party, you are placing your online security at risk." Using HEIST in combination with BREACH allows attackers to pluck out and decrypt e-mail addresses, social security numbers, and other small pieces of data included in an encrypted response. BREACH achieves this feat by including intelligent guesses -- say, @gmail.com, in the case of an e-mail address -- in an HTTPS request that gets echoed in the response. Because the compression used by just about every website works by eliminating repetitions of text strings, correct guesses result in no appreciable increase in data size while incorrect guesses cause the response to grow larger.
I'm not a fan of that article summary.
New summary:
It is the same as CRIME, but we're using your browser's performance timing JS API as the man-in-the-middle.
A review:
Stick sensitive info into compressed stuff, and you make that sensitive info less private. If the encryption is zlib-like, then the attacker can guess the information quite quickly-- a good compressor compresses substrings, not just the whole thing.
That means that if you have a SSN in there, the attacker can guess some substrings of your SSN, and the response won't be much bigger.
Guesses that don't share substrings with your SSN will be larger-- the attacker can reject those as bad guesses and not try those substrings again.
With HTTP2's HPACK compressor (only used for info in the headers), this side-channel is eliminated-- only an exact guess of the data will allow this to happen.This is completely unrelated, however, to someone using entity-body compression with HTTP2. If you mix sensitive data with everything else in the compressed-entity body... side channel attacks galore!
A mitigation: Don't put the sensitive data in the same resource as the non-sensitive data, and then don't compress the sensitive data.
HTTP2 makes this cheaper. If sites do this, then these attacks simply do not work any better than the brute-force guessing would.
Ensuring that this happens (no sensitive data compressed) isn't necessarily the most easy thing...
Another obvious one is disable the timing API for 3rd party stuff. This is not as effective theoretically, but it is way easier to deploy and makes these kinds of attacks require an external 3rd party.