New, Faster Attack against SHA-1 Revealed
VxSote writes "According to Bruce Schneier's
blog, a team of Chinese cryptographers has announced new results against SHA-1 that speed up the time required to find collisions compared to their previously published attack. Schneier says that a SHA-1 collision search is now 'squarely in the realm of feasibility,' and that further improvements are expected."
Would someone with mod points please mod parent funny?
I, for one, would not have thought of that... :-)
- Leon Mergen
http://www.solatis.com
Yeah but us Brits cracked Enigma, no matter what Holly Wood would have y'all believe.
OK so thats bait.
The NSA doesn't release its finding about new attacks against encryption algos. They use the info to crack and keep secure. Promote AES as a standard, and have a decades worth of research about useful attacks against AES that no-one knows about but the NSA.
Like public-key encryption. People in Britain discovered it first, but kept the research secret.
Also because their most recent attack is 2^63 complexity, which is under the 2^64 complexity that people have already run.
I am unamerican, and proud of it!
The problem is that these algorithms rely on external characteristics of the data sources and render them to a short description. Indeed, a "DNA" approach would look at what makes up the files (binary) rather than the obvious (ASCII characters) and create a profile that could only match that file.
Er, what are you talking about? SHA-1 works with blocks of binary data, not ASCII characters, and it's impossible to "create a profile that could only match that file" since such a hash would be at least as large as the file itself.
Near as I can tell, you've just strung some clever-sounding words together to try and appear clever yourself. It might get you karma, but it's nonsense, all the same.
> You forgot about lossy compression. In a way, you could view gzip, and bzip2 as hashing algorithms that take a larger file and reduce it to a smaller size.
:)
> This "hash" would only match one file per "hash",
AFAICT, you wrote "lossy" where you meant "lossless". The difference is large, and crucial to your entire discussion
Uunless the size of the "DNA" is larger than the size of the information being hashed, there will always be duplicates in the hash space.
Basic information theory. Nothing to do with file formats, transfer protocols, or endianness.
The article has links to the previous SHA attack papers and Xiaoyun Wang's publication list. (These links were added after the article was posted, so maybe you missed them.)
Two of the Chinese researchers (Xiaoyun Wang and Hongbo Yu) were due to present their SHA results at the CRYPTO 2005 conference in the US, but were denied visas in time to attend. Adi Shamir (the A in RSA) ended up announcing this latest result on their behalf.- chinese-crypto-researchers.html
http://cipher-text.blogspot.com/2005/08/visas-for
The NSA already said that people should stop using SHA-1 in favor of other methods. It can be assumed they already knew about attacks.
Still IMing in the stone age?
Second, RFC 4109 refers to the HMAC algorithms used for computing per packet integrity checksum that is resistant to tampering by a man in the middle. HMACs take as input both a known message and a shared secret (often, a session key for a symmetric key algorithm like DES, triple DES, AES, RC4, etc) and compute hash result ( MD5, or SHA1, or SHA-256, etc. ). In other words, part of the input to the hash algorithm is unknown. This makes it very difficult to find two messages, X and Y that compute the same HMAC. I.e. find X and Y such that HMAC(X, K) == HMAC(Y, K), where K is the shared secret. The attacks on MD5 and SHA-1 so far assume that there is no K, or if there is, it is known. And if the man in the middle knows K, he doesn't need to use these new cool attacks to tamper with messages; he's the man in the middle, he just tampers and re-computes the HMAC with far less computational overhead.
I've see no indication in Schneir's blog entry that these attacks break HMACs.
That said, it is surprising that SHA-256 wasn't added to the MUST list for RFC4109, given that when this RFC was published, it was known that SHA1 had be shown to be vulnerable to attacks of less than O(2^69). But then again, the RFC also mentions AES128 as MUST, but not AES256. Odd.
1) Figure out what the secret key is based on the data and hashes being sent. If you could do this, you could generate your own HMAC and send your own data without being detected.
2) Intercept the packet and find a way to change the data in a way that will still generate the same hash value. This would allow you to change packets without being detected.
The vulnerability the researchers found doesn't help you accomplish either of these tasks, but it does show that SHA-1 has some weaknesses. Therefore, it's probably a good idea to start moving to a different hashing algorithm before somebody figures out a practical way to do #1 or #2. Also keep in mind that an HMAC exploit wouldn't help an attacker decrypt IPsec packets. To do this, a vulnerability would need to be found in either the encryption algorithm or the key exchange algorithm. In reality, a HMAC exploit by itself would only allow an attacker to send garbage data to the target address.
The laws of thermodynamics and the amount of energy our Sun will emit in it's lifetime also come in to play when talking about bruteforcing encryption algorithms.
SHA-1 is a hashing algorithm - meaning that it was never intended to derive the original message from the hash - just that the hash should uniquely identify the original message (and a few known collisions that _should_ occur in only once in 2^[number of bits in hash length] other messages).
What is significant about these discoveries (the whole chain of them) is that it is far easier than generating 2^[number of bits in hash length] messages to identify collisions.
The NSA also changed the s-boxes, making it more resistant to differential cryptanalysis (a technique not discovered by others until years later). Both differential and linear cryptanalysis are more efficient than a brute force search, but massive amounts of known cyphertext is required.
Not Brits, Poles. http://en.wikipedia.org/wiki/Bomba_(cryptography)
Ok, bear with me for a moment...
In this example for simplicity's sake, I'm going to use a ficticious function to shorten the output. The same logic will apply to any hashing function that gives output of a specific length.
Let's give this function a name, so that people don't assume I'm using SHA-1, and flame me for it. Let's call the function "melandy_hash", and say that for whatever input it receives, it gives a 6 digit hex number for the output (yes, I know that this is ridiculously short, but this is a simplified example).
For the input "This is my input string", you might receive the output 82a78b. For the input "This is another input string", you might receive the output 1721ca. Note that although the values for the output are different, the length is the same, because we specified that the output would be a 6 digit hex number.
To expand this further, and to get directly to the point of your question... say we have a list of strings that we want to hash:
"This is my input string"
"This is another input string"
"And another one"
"And yet another one"
"X"
These five strings collectively are the "set" of inputs. When we compute the hashes of these strings using melandy_hash, we get the following output:
82a78b
1721ca
82a78b
1b82ac
97f25b
The above list of hex values is then the "set" of output.
Note that the strings "This is my input string" and "And another one" both have the same hash value. This is known as a collision. As a result, there are really only 4 unique values in out set of output. This means that the set of inputs (with 5 unique elements) is larger than the set of outputs (with 4 unique elements).
Where you may have been confused before is that the output for the string "X" is 97f25b, and for this particular element, the output is larger than the input. The original point that was made was that the number of unique elements in the input set is larger than the number of unique elements in the output set.
Make sense now?