New, Faster Attack against SHA-1 Revealed
VxSote writes "According to Bruce Schneier's
blog, a team of Chinese cryptographers has announced new results against SHA-1 that speed up the time required to find collisions compared to their previously published attack. Schneier says that a SHA-1 collision search is now 'squarely in the realm of feasibility,' and that further improvements are expected."
The NSA doesn't release its finding about new attacks against encryption algos. They use the info to crack and keep secure. Promote AES as a standard, and have a decades worth of research about useful attacks against AES that no-one knows about but the NSA.
Like public-key encryption. People in Britain discovered it first, but kept the research secret.
Two of the Chinese researchers (Xiaoyun Wang and Hongbo Yu) were due to present their SHA results at the CRYPTO 2005 conference in the US, but were denied visas in time to attend. Adi Shamir (the A in RSA) ended up announcing this latest result on their behalf.- chinese-crypto-researchers.html
http://cipher-text.blogspot.com/2005/08/visas-for
The NSA already said that people should stop using SHA-1 in favor of other methods. It can be assumed they already knew about attacks.
Still IMing in the stone age?
Second, RFC 4109 refers to the HMAC algorithms used for computing per packet integrity checksum that is resistant to tampering by a man in the middle. HMACs take as input both a known message and a shared secret (often, a session key for a symmetric key algorithm like DES, triple DES, AES, RC4, etc) and compute hash result ( MD5, or SHA1, or SHA-256, etc. ). In other words, part of the input to the hash algorithm is unknown. This makes it very difficult to find two messages, X and Y that compute the same HMAC. I.e. find X and Y such that HMAC(X, K) == HMAC(Y, K), where K is the shared secret. The attacks on MD5 and SHA-1 so far assume that there is no K, or if there is, it is known. And if the man in the middle knows K, he doesn't need to use these new cool attacks to tamper with messages; he's the man in the middle, he just tampers and re-computes the HMAC with far less computational overhead.
I've see no indication in Schneir's blog entry that these attacks break HMACs.
That said, it is surprising that SHA-256 wasn't added to the MUST list for RFC4109, given that when this RFC was published, it was known that SHA1 had be shown to be vulnerable to attacks of less than O(2^69). But then again, the RFC also mentions AES128 as MUST, but not AES256. Odd.
1) Figure out what the secret key is based on the data and hashes being sent. If you could do this, you could generate your own HMAC and send your own data without being detected.
2) Intercept the packet and find a way to change the data in a way that will still generate the same hash value. This would allow you to change packets without being detected.
The vulnerability the researchers found doesn't help you accomplish either of these tasks, but it does show that SHA-1 has some weaknesses. Therefore, it's probably a good idea to start moving to a different hashing algorithm before somebody figures out a practical way to do #1 or #2. Also keep in mind that an HMAC exploit wouldn't help an attacker decrypt IPsec packets. To do this, a vulnerability would need to be found in either the encryption algorithm or the key exchange algorithm. In reality, a HMAC exploit by itself would only allow an attacker to send garbage data to the target address.
Ok, bear with me for a moment...
In this example for simplicity's sake, I'm going to use a ficticious function to shorten the output. The same logic will apply to any hashing function that gives output of a specific length.
Let's give this function a name, so that people don't assume I'm using SHA-1, and flame me for it. Let's call the function "melandy_hash", and say that for whatever input it receives, it gives a 6 digit hex number for the output (yes, I know that this is ridiculously short, but this is a simplified example).
For the input "This is my input string", you might receive the output 82a78b. For the input "This is another input string", you might receive the output 1721ca. Note that although the values for the output are different, the length is the same, because we specified that the output would be a 6 digit hex number.
To expand this further, and to get directly to the point of your question... say we have a list of strings that we want to hash:
"This is my input string"
"This is another input string"
"And another one"
"And yet another one"
"X"
These five strings collectively are the "set" of inputs. When we compute the hashes of these strings using melandy_hash, we get the following output:
82a78b
1721ca
82a78b
1b82ac
97f25b
The above list of hex values is then the "set" of output.
Note that the strings "This is my input string" and "And another one" both have the same hash value. This is known as a collision. As a result, there are really only 4 unique values in out set of output. This means that the set of inputs (with 5 unique elements) is larger than the set of outputs (with 4 unique elements).
Where you may have been confused before is that the output for the string "X" is 97f25b, and for this particular element, the output is larger than the input. The original point that was made was that the number of unique elements in the input set is larger than the number of unique elements in the output set.
Make sense now?