Implications Of The Recent Hash Function Attacks
An anonymous reader writes "Cryptography Research has issued a Q&A that explains the security implications of the hash function
collision
attacks recently announced at CRYPTO 2004. Apparently the consequences can be catastrophic for certain kinds of code signing and digital signatures, but MD5 sums for checking binaries are (mostly) OK. While the
speculation that SHA-1 is about to fail seems to be overblown, updating the many legacy systems and protocols that rely on MD5 is going to be a massive undertaking."
"While the speculation that SHA-1 is about to fail seems to be overblown, updating the many legacy systems and protocols that rely on MD5 is going to be a massive undertaking."
Any time I've tried to point this out, I've been shouted down by hysterical people (such as relex) squawking that because it may be possible to generate two messages with the same MD5 hash, SHA-1 is automatically broken. Um, no. They're two totally different algorithms. Use some common sense, people. I'm as cautious as the next person but screaming about how "all hash algos are insecure" is hyperbole at its worst.
Note to M1-ers: a curt but otherwise insightful message is not "Flamebait" or "Troll".
Yes, too bad ECC is not a hashing algorithm and has absolutely nothing to do with this, or else we'd be set.
Say I have program A that I distributed and I supply the md5/sha1 sum to insure it's 'validity'. From what I read you can get two bitstreams to produce the same sum, ok that was expected. But what I dont get is this still doesn't allow somebody to arbitrarily pick whatever sum they want for their code right? I mean still the chances of somebody writing some trojan'd program and magically somehow getting the sum's to match is extreemly small and/or really computationally expensive. If they were that smart, wouldn't they be working for one of the TLA's (Three letter Acronyms) already?
Your hair look like poop, Bob! - Wanker.
In the wake of stories like this is this a message that we need more secure forms of encryption than we already have? RSA is great so far, but how long until 1024 is broken? Or any other schemes, like the MD5 hashing that's used for digital signatures?
Keeping ahead of the crackers is a big concern not only for security of transactions, but for personal privacy as well.
CMDRTACO CHECK YOUR EMAIL!
For example, a devastating attack would be one that enabled adversaries to obtain a legitimate server certificate with a collision to one containing a wildcard for the domain name and an expiration date far in the future.
quick questions:
1) Don't the browser check for wildcard domain names in the certificates???
2) If not, why not???
Consensus is good, but informed dictatorship is better
And too bad that ECC is a) not provably secure and b) is rumored to have been broken already.
I am disrespectful to dirt! Can you see that I am serious?!
Comment removed based on user account deletion
Use both!
You say its easier than ever to find the soloution to each one of these hashes, so just use em both. I really think that the number of collisions that occur similtaniously are a bit fewer and farther between. I think that will be secure until we find a better and decently fast hash.
md5sum
d41d8cd98f00b204e9800998ecf8427e
This isn't "a faster computer". This is "a better technique" or "a deeper understanding" - much more dangerous.
Has a collision been found yet concerning data which has both the same MD5 sum and the same SHA-1 sum?
It would seem as though even if SHA-1 were to fail, the two algorithms used together could bolster each other security-wise. This slows things down, of course, but would it not suffice for the time being?
you don't have to generate specific malicious code in order to exploit md5.
merely creating pure trash would be sufficient, think of the case of BIOS or other firmware. create random garbage with the same md5 hash and voila, you've turned your victim's PC/laptop/celphone/pda/etc into a doorstop.
there are many other ways that md5 can be exploited, this is only one.
Because wildcards are not necessarily a bad thing. The concept is that you have a single SSL accelerator in front of a whole pool of servers, and it absorbs the "security context" of all the hosts behind it.
If you want universal SSL deployment, this is one of the ways you get it.
--Dan
it would be better to post both the MD5 hash _and_ the SHA-1 sum. What's the chance of 2 different binaries having the same MD5 and the same SHA-1 at the same time??
Artaxerxes
ECC has been cracked...it just takes roughly the combined powers of 50 PCs to do it. This is a really old link but its valid for this thread.
http://cristal.inria.fr/~harley/ecdl7/readMe.html
Is here.
I didn't read the attack too well, but from the Q&A, it appears that the attacks are collision attacks (like the Birthday attack, but, I imagine, more efficient). The Q&A states "In contrast [to a preimage attack], a collision attack finds two messages with the same hash, but the attacker can't pick what the hash will be."
So, shouldn't it be possible to edit something in the document that doesn't change the meaning (such as a misspelling, or punctuation) before you sign it, thereby changing the hash to something completely different? It would seem that now the attacker is forced to find a document that has a given hash, which is essentially a preimage attack, or is there something I am missing?
...interesting if true.
512 bits made from 2 hashes, one weak and one strong will be weaker than a single 512 bit hash from the stronger algorithm.
The world is going to end! Giant asteroids will destroy all life on earth!
Oops, wrong article. Um... The world is going to end! Global warming... um, well... the Patriot Act... umm...
Well, it's not that bad. Somebody might be able to flip four very carefully selected bits in a file, and still produce the same MD5 hash. This could let me, for example, create an executable that had a normal, benign behaviour, and an evil trojan behaviour, and have one of the bits that I flip change a conditional so that the trojan behaviour was activated. (Note that open source tends to be immune to this kind of nonsense, since in the source code, the actual trojan part - not the conditional that activates it, but the actual evil payload - tends to stick out like a sore thumb.)
Note well that this does not let me create an evil version of somebody else's file. It only lets me create two closely related files, one of which differs by four bits from the other. I have to be able to construct the benign file in such a way that I can turn it into an evil file by changing four bits. And it can't be just any four bits, either; it's a very specific four bits.
So this isn't the end of the world. What it means is that you can't quite trust MD5 to guarantee that you got exactly, bit-for-bit, what you think you got.
But really, this new situation isn't much worse than what we had before. I mean, I could simply have the evil behaviour activated by the date, or by the IP address of the installed machine, or whatever, and get somebody else (who never saw the evil part run) to state that the program did what it was supposed to. Having an MD5 hash doesn't guarantee that the program isn't evil. Bottom line: don't run code written by bad people, whether it has a valid MD5 or not. (I know, I know. How do you tell who the bad people are? That's a hard question, but my point is that a valid MD5 has never told you whether the authors were bad people or not.)
A bad day for me.
I write 2 programs, lets call one "Cool Whizzy Must Have Util" and the other "Soul Sucking Destruction" and I tweak and tweak one of the binaries so that they have the same checksum.
Then I release the first one, everybody is eventually using it.
So then on my servers, I replace the first one with the 2nd one.
Gotcha!
Is that the danger here?
Yes but what does this mean to me, "Mr.MSAccess Guru/Administrator"?
Microsoft certification available upon request.
If you think
Some forms of ECC have been 'broken', Len Adlemann (A of RSA) showed that ECC in dimensions higher than 2 was no more secure. He has been working on some further attacks and thinks that ECC as a whole might be vulnerable.
I don't like ECC for two reasons. The first is that ECC is a very new field of mathematics, new results come regularly. It is entirely possible that someone would find an efficient means of transforming ECC problems into discrete math problems and come up with a solution.
The other reason is that ECC is patented up the wazoo. The most efficient ways of using ECC are patented and if you can't use them there is no efficiency advantage over RSA in a discrete field so why bother?
The hash algorithm thing is massively overblown. MD5 was already toast. SHA1 was due to be withdrawn in 2010 in any case and has already been superceded by SHA-256 and SHA-512. New versions of DSA for the larger hash sizes are also due.
It remains to be seen whether the construction of SHA-256 needs to be adjusted in the light of the MD5 results. It may well be that it shares the same vulnerability as SHA-1 and we should forget about the new hash functions and move straight to something else. Alternatively all might be right with the world. We do not know yet.
A lot of people are suggesting a competition similar to the AES competition for a new digest algorithm. There is already something underway for stream ciphers. This seems like a good plan, not least since the cryptographers seemed to have fun with the last one.
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/
While it's still not certain whether a similar jump might be made in the case of the MD5 and SHA-1 hashing algorithms, you can bet that a lot of crypto people are looking. What's that OSS saying about many eyes making flaws shallow again? Even if there is a fatal flaw though, I doubt it's not going to be the show stopper for hashes some people seem to think; you just use more of them. RPM supports DSA, SHA1, MD5 *and* GPG checksums for example, even if all four algorithms were broken, I doubt there is a method for generating an equivalent file that matches all four checksums at the same time.
UNIX? They're not even circumcised! Savages!
"SHA1 is a totally different algorithm, so it's still perfectly safe."
Yes and no. MD5 collisions are not SHA1 collisions, and the attack that generated the MD5 collisions doesn't seem to be applicable to SHA1, or its authors would have published collisions on SHA1. The published collisions on several other algorithms: HAVAL-128, MD4, and RIPEMD. They also say that their method will work against SHA0. All these hash functions share similar design principles. It seems highly probable that the MD5 attack will have at least some applicability to SHA1 even though it isn't directly an attack against SHA1. Also, other researchers have published results against SHA1. In particular, Biham and Chen con produce collisions on reduced versions of SHA1 with up to about 40 rounds (the full hash function has 80). That isn't a break of the full hash function, and there's no guaranteed it can be extended to more rounds, but it looks worrisome.
"This attack produces two messages with the same hash, no guarantee what that hash would be, instead of one message with a chosen desired hash, so it isn't a threat to real systems."
That's just stupid. "No practically-findable collisions" is one of the design requirements for a secure hash function. Protocols using secure hash functions are based on the assumption that the functions used are secure hash functions. If your hash function doesn't guarantee collision resistence, then your protocols must be assumed to be broken unless you can go back and prove, for every protocol, "This one is still secure even if we use something that is not a real secure hash function."
One way a hash collision could be useful, for instance, would be against some signature schemes where the secret key is revealed if you ever sign an identical message more than once. People who use those schemes are careful to avoid signing the same message twice... but if you had two different messages and they had the same hash, it's quite possible to imagine that you could be tricked into signing the same hash more than once (because people sign hashes, not actual messages) and making trouble for yourself. Similarly, if you use hash output for initialization vectors in cipher modes that use those, the result could be encrypting two messages with the same keystream, which means an attacker can probably recover both messages (and then use them as stepping-stones to breaking the rest of your system).
Also, a fast way of finding collisions may well be extensible to a somewhat-slower, but still faster-than-brute-force, way of finding the preimages that you think an attacker really wants.
"This attack depends on the messages having a special form; they don't look like real plaintext, so it isn't a threat to real systems."
One of the conditions for a secure cryptographic system is that you don't depend on the plaintext having (or NOT having) a specific form. If your system doesn't work regardless of the content of the data I put through it, then I will punt on your system, and recommend to my clients some other system that will actually work. It's also not clear that the attack on MD5 really does require a specific form... those strings look randomly-generated to me, even though the XOR difference of them clearly is not. Maybe with just a little more work they can produce collisions of two meaningful and interesting strings with opposite meanings.
"All hash functions have collisions, so it was bound to happen and isn't a threat."
The important question is whether people can actually find collisions. With a good hash function, collisions should be rare enough that nobody has any reasonable chance of finding them on purpose any time soon. Wang, Feng, Lai, and Yu can find collisions on MD5 deliberately, with practical amounts of computer power. They have done this more than once, and have at least outlined a plausible theoretical explanation of how they can do it. That means MD5 does not provide the guarantees that a secure hash function must
The work that went into putting together AES was really fantastic.
I'm just looking forward to a similar effort around an advanced hashing standard.
Where would an effort like this form?
Upon hearing these news, Tom Ridge raised the level of alert to "Amber".
At least this time he had something a tad more substantial to instill fear in the hearts of all patriotic Americans such as myself.
Thank you Department of Homeland Defense! I sleep so much better at night!
Wearing pants should always be optional.
Posted anonymously to avoid offending any of my colleagues.
More info on the implications at Educated Guesswork. (It isn't my work, so anonymously it is.)
ROT-13 is completely invulnerable to hash collisions; no two non-identical inputs will ever result in identical outputs!
I recommend that everybody replace their existing encryption systems with ROT-13 immediately.
-Cbbg
http://eprint.iacr.org/2004/199.pdf
Here's an actual example of two different binary files having the same md5sum.
With a collision attack, you can perform an attack that matters - here's an example:
Imagine that Microsoft won't sign any audio drivers for Windows XP that allow raw audio data to be output to disk. Also imagine that you are the driver release engineer at Creative (Sound Blaster division) and you want to release a driver that can do that.
What you do is build both drivers (one that Microsoft will sign, and one that you want to release with the "unacceptable" feature) with a large static data buffer that isn't used in the binary. You then try to modify both buffers in such a way as the two files will have the same hash (doesn't matter what hash, just that it's the same). This will take about 2^40 worth of work for MD5 instead of the 2^64 that it should take because of this security issue.
Once you've created your two binaries with the same hash, you send the acceptable binary to Microsoft and they sign it. Then, in the release section of your website you post the other binary with the signature you got from Microsoft... and the signature verifys just like they signed it.
There is also a break in the digital check situation, *if* the digital check protocal has random padding (many do) *and* the payee generates the check (also possible).
-- The act of censorship is always worse than whatever is being censored. Always.
DSA and GPG aren't hashes. They're signature schemes (well, GPG implements PGP, an encoding of either RSA or ElGamal signatures), and they use other hashes in their checksums, like SHA1 and MD5. Of course, you can make GPG do RIPEMD160, which IIRC has not been broken yet either.
I hereby place the above post in the public domain.
On a lark I decided to run the purported collisions in the paper through MD5 to verify the claim, and I got a weird result. The two examples given are indeed collisions, but the hash is not what the paper says it is. The paper says that the hashes for the two examples are supposed to be:
9603161f f41fc7ef 9f65ffbc a30f9dbf
and
8d5e7019 6324c015 715d6b58 61804e08
but the hashes I get are:
74BE7342 8C5BDB65 9BE40E00 CF6AE31C
and
BC5E1391 D31E52F3 D41CBE8C 05D7DBC1
I'm using the MD5 library built in to Darwin (OS X) and I've verified that it passes the standard MD5 test suite in RFC 1321.
parent's trying to say more along the lines of "it'll be a lot less easy to find a collision dataset that's simultaneously a collision for md5 and sha1"
A lot of stuff I've seen floating around carries multiple verification methods (apache uses md5 and pgp sigs for example).
Even if one verification technique is rendered "broken" -- together, the two hash algorithms are still that much more complex to break (though your point is also valid: wasting 32 bits on crc32 isn't going to make it more secure than adding those 32 bits to your new nonbroken cryptographic hashing algorithm).
No. The new technique allows you to construct two pieces of data which have the same hash. It doesn't allow you to construct a piece of data which matches a given hash. This was explicitly spelled out in the Q&A document.
Also, MD5 password hashes are salted. This means that, although you could potentially find a collision with the hash in the password file, the colliding data almost certainly would not have the same salt, and therefore would not be accepted as a valid password.
Furthermore, think about it. If you had access to the password hashes, you would be root, and could just 'su' to the user account in question anyway.
Now, suppose that your goal was to guess the password and hope that the user uses the same password on other machines -- i.e., you want to boost yourself to other hosts. But you're still SOL, because the MD5 password hashes are salted. Therefore, you must recompute the equivalent password on the other hosts, even if the user used the same password on those hosts.
As the Q&A document explained, the ability to produce colliding hashes makes MD5 unsuitable for some tasks. Protecting UNIX logins is not one of the ways in which its use has been compromised.
Yes, some crypto people are already saying you should change the whitespace in any pre-generated document you are asked to digitally sign.
Changing spelling or punctuation would also protect against collision attacks.
Nothing to see here; Move along.