Implications Of The Recent Hash Function Attacks


Posted by timothy on Wednesday September 1, 2004 @05:40AM from the some-but-not-all dept.

An anonymous reader writes "Cryptography Research has issued a Q&A that explains the security implications of the hash function collision attacks recently announced at CRYPTO 2004. Apparently the consequences can be catastrophic for certain kinds of code signing and digital signatures, but MD5 sums for checking binaries are (mostly) OK. While the speculation that SHA-1 is about to fail seems to be overblown, updating the many legacy systems and protocols that rely on MD5 is going to be a massive undertaking."

20 of 262 comments

This is what I've been saying! by Dthoma · 2004-09-01 05:45 · Score: 5, Insightful

"While the speculation that SHA-1 is about to fail seems to be overblown, updating the many legacy systems and protocols that rely on MD5 is going to be a massive undertaking."

Any time I've tried to point this out, I've been shouted down by hysterical people (such as relex) squawking that because it may be possible to generate two messages with the same MD5 hash, SHA-1 is automatically broken. Um, no. They're two totally different algorithms. Use some common sense, people. I'm as cautious as the next person but screaming about how "all hash algos are insecure" is hyperbole at its worst.

--
Note to M1-ers: a curt but otherwise insightful message is not "Flamebait" or "Troll".
1. Re:This is what I've been saying! by Peter+Cooper · 2004-09-01 05:47 · Score: 2, Insightful
  
  Frankly I think people who are crying that we should all ditch the MD5 hash are similarly hysterical.
  
  It's like if the same person won the lottery two weeks in a row. Would that mean the method they used for choosing the numbers was null and void? No. Likewise, finding a collision (or even several collisions) in MD5 does not invalidate its use.
  
  What would invalidate its use is having some programmatic way of changing the hash of some data by merely throwing in some junk to make it match a hash of choice.
2. Re:This is what I've been saying! by AKAImBatman · 2004-09-01 05:48 · Score: 4, Insightful
  
  All hash algorithms have collisions. That's just the nature of the beast. The issue is that a way has been found to create and exploit MD5 collisions. SHA-1 will eventually be a target of something similar, just not today.
  
  --
  Javascript + Nintendo DSi = DSiCade
3. Re:This is what I've been saying! by swillden · 2004-09-01 05:59 · Score: 5, Insightful
  
  finding a collision (or even several collisions) in MD5 does not invalidate its use.
  No, but having an algorithm to generate collisions in a practical period of time *does* make it suspect.
  What would invalidate its use is having some programmatic way of changing the hash of some data by merely throwing in some junk to make it match a hash of choice
  That would make it completely useless for all security-related applications, yes, but a weaker break (such as being able to generate collisions) can break its usage for some security applications. Read the Cryptography Research Q&A for some examples.
  
  --
  Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
4. Re:This is what I've been saying! by Fros1y · 2004-09-01 06:56 · Score: 2, Insightful
  
  Finding 2 things that give the same hash is not such a big deal. Having a search algo that lets you find that second thing (given the first) in substantially less computational time / complexity than previously thought possible is a huge deal. Given that MD5 can hash more than 2^128 objects, the first is inherent. It's that second trait, and recently discovered algo, that's the real kicker.
5. Re:This is what I've been saying! by tyler_larson · 2004-09-01 09:31 · Score: 4, Insightful
  
  Have we forgotten what a hash function is?
  Finding a few collisions, or even an algorithm to generate collisions, doesn't change a damn thing. We've always known that there are collisions. A hash function maps an infinite input set to a finite output set. Of course there are collisions. There are an infinite number of collisions for ANY hash function. We already knew that; it's a mathematical certainty. Yet somehow we're shocked and horrified when we actually find some.
  Tell me something I didn't already know. Then I'll be impressed.
  
  --
  "With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea...."
  RFC 1925
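
The distinction this thread keeps circling (collisions always exist, what matters is how cheaply they can be found) can be sketched with a deliberately weakened hash. This is only an illustration under stated assumptions: `truncated_md5` and `find_collision` are hypothetical names, the 3-byte truncation is a toy choice, and the search is a generic birthday search, not the technique announced at CRYPTO 2004.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, nbytes: int = 3) -> bytes:
    """Toy hash (assumption): the first nbytes of the real MD5 digest."""
    return hashlib.md5(data).digest()[:nbytes]


def find_collision(nbytes: int = 3):
    """Generic birthday search: hash distinct inputs until two share a tag."""
    seen = {}  # truncated digest -> first message that produced it
    for i in count():
        msg = str(i).encode()
        tag = truncated_md5(msg, nbytes)
        if tag in seen:
            # Two different messages, same truncated digest.
            return seen[tag], msg, i + 1
        seen[tag] = msg


a, b, tries = find_collision()
print(a, b, tries)
```

With only 2^24 possible outputs, the pigeonhole principle guarantees a collision within 2^24 + 1 inputs, and a birthday search usually needs only a few thousand. Note that the search does not let the caller choose which digest collides; matching a given digest is the separate, harder preimage problem.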
Idiot Question by OverlordQ · 2004-09-01 05:45 · Score: 4, Insightful

Say I have program A that I distributed and I supply the md5/sha1 sum to ensure its 'validity'. From what I read you can get two bitstreams to produce the same sum; OK, that was expected. But what I don't get is this: it still doesn't allow somebody to arbitrarily pick whatever sum they want for their code, right? I mean, the chances of somebody writing some trojan'd program and magically somehow getting the sums to match are still extremely small, and/or it's really computationally expensive. If they were that smart, wouldn't they be working for one of the TLAs (three-letter acronyms) already?

--
Your hair look like poop, Bob! - Wanker.
1. Re:Idiot Question by gordyf · 2004-09-01 06:08 · Score: 2, Insightful
  
  These hashes ALWAYS have collisions. A checksum could never definitively prove that the data has not been tampered with, as there are far more bits in a hard drive than there are bits in a hash.
  
  The new discovery is that it's not hard to generate two messages with the same hash. The discovery is not "hashes have collisions" - this has always been the case.
2. Re:Idiot Question by Brandybuck · 2004-09-01 07:07 · Score: 3, Insightful
  
  Correct me if I'm wrong, but isn't it still just as difficult to create a collision with a *known* message as it was before this "attack" was discovered? In other words, there's no feasible way for a crooked forensics technician to alter the data. And even if it were practical to create a collision with a known message, the result of this on a hard drive is not going to be data that looks like it came from a hard drive. Rather, it's going to look like data that came from a random number generator.
  
  Hmmm, I guess that's one thing a crooked cop could do. He could make it look like the entire disk is encrypted. In France that's enough to convict your Grandmother with...
  
  --
  Don't blame me, I didn't vote for either of them!
3. Re:Idiot Question by rsmith-mac · 2004-09-01 10:28 · Score: 2, Insightful
  
  In many situations any data inconsistency can cause catastrophe
  Bingo. Take BitTorrent for example; it uses SHA-1 hashing to make sure every piece downloaded isn't corrupt, and otherwise helps ensure that no one is tampering with the torrent (and if they are, most clients kick them). However, if you can generate an arbitrary SHA-1 hash, anyone could seed the swarm with bad data, and the clients would never know. This could not only be used by the **AA to disrupt illegitimate torrents, but script kiddies could disrupt legitimate torrents too, resulting in everyone downloading a corrupt version of Knoppix, for example.
  
  While this is only 1 case, the point remains: in many cases, you only need to disrupt the data, not explicitly replace it.
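
The per-piece check described in the BitTorrent comment can be sketched roughly as follows. This is a simplified illustration, not the BitTorrent protocol itself: `PIECE_SIZE`, `piece_hashes`, and `bad_pieces` are hypothetical names, the piece size is tiny so the demo stays readable, and the sketch assumes the received data has the same length as the original.

```python
import hashlib

PIECE_SIZE = 4  # hypothetical; real torrents use far larger pieces


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """SHA-1 digest of each fixed-size piece of the data."""
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]


def bad_pieces(received: bytes, expected: list[bytes],
               piece_size: int = PIECE_SIZE) -> list[int]:
    """Indices of pieces whose digest differs from the published one."""
    actual = piece_hashes(received, piece_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


original = b"knoppix-image-bytes"
expected = piece_hashes(original)          # what the publisher would list
tampered = original[:5] + b"X" + original[6:]  # corrupt one byte in piece 1

print(bad_pieces(original, expected))  # [] : every piece checks out
print(bad_pieces(tampered, expected))  # [1] : the altered piece is flagged
```

The check only catches tampering as long as an attacker cannot produce different piece data with the same digest, which is the scenario the comment above worries about.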
New methods needed? by scumbucket · 2004-09-01 05:46 · Score: 3, Insightful

In the wake of stories like this, is the message that we need more secure forms of encryption than we already have? RSA is great so far, but how long until 1024 is broken? Or any other schemes, like the MD5 hashing that's used for digital signatures?

Keeping ahead of the crackers is a big concern not only for security of transactions, but for personal privacy as well.

--
CMDRTACO CHECK YOUR EMAIL!
1. Re:New methods needed? by merlin_jim · 2004-09-01 05:59 · Score: 4, Insightful
  
  The question isn't how long until it's broken... the question is how long until it's broken for cheap.
  
  RSA-512 was already broken. It took a major portion of the world's computing resources for several years. You're not really in danger that your wife is going to find out about your girlfriend. Or that your state is gonna find out about your cocaine habit.
  
  If you're using RSA-512 you just might be on the cusp of being in danger: the government, having caught you and trying you for terrorism, might be able to decode enough of your e-mail in the six months before your trial to convict you.
  
  See, it's all about level of effort. How long till RSA-512 can be broken by anyone in a few minutes?
  
  Well 40-bit SSL was supposedly bulletproof when it was introduced. My P4 1.8 can decode SSL messages in about 10 seconds. So RSA-512 should be good for another 3-5 years.
  
  Honestly, always use the maximum number of bits. If your data is important enough to encrypt, it's important enough to encrypt right.
  
  --
  I am disrespectful to dirt! Can you see that I am serious?!
Ok... by doublebackslash · 2004-09-01 05:51 · Score: 2, Insightful

Use both!
You say it's easier than ever to find the solution to each one of these hashes, so just use 'em both. I really think that the collisions that occur simultaneously are a bit fewer and farther between. I think that will be secure until we find a better and decently fast hash.

--
md5sum /boot/vmlinuz
d41d8cd98f00b204e9800998ecf8427e /boot/vmlinuz
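
A minimal sketch of the "use both" idea: compute MD5 and SHA-1 in a single pass and accept the data only if both published sums match. The function names `md5_and_sha1` and `matches_both` are hypothetical, and the sketch says nothing about how much extra security the combination actually buys.

```python
import hashlib


def md5_and_sha1(chunks):
    """Feed every chunk to both hashes; return (md5_hex, sha1_hex)."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    for chunk in chunks:
        md5.update(chunk)
        sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def matches_both(chunks, expected_md5: str, expected_sha1: str) -> bool:
    """True only if BOTH digests equal the published values."""
    got_md5, got_sha1 = md5_and_sha1(chunks)
    return (got_md5 == expected_md5.lower()
            and got_sha1 == expected_sha1.lower())


# Hashing no data at all gives the well-known empty-input digests;
# the MD5 one is the value shown in the signature above.
print(md5_and_sha1([b""]))
```

For a real file, `chunks` could be an iterator that reads the file in fixed-size blocks, so the file is read only once for both sums.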
Re:The cycle repeats by rewt66 · 2004-09-01 05:55 · Score: 4, Insightful

This isn't "a faster computer". This is "a better technique" or "a deeper understanding" - much more dangerous.
Re: Are things really that bad? by quasi_steller · 2004-09-01 06:14 · Score: 2, Insightful

I didn't read the attack too well, but from the Q&A, it appears that the attacks are collision attacks (like the Birthday attack, but, I imagine, more efficient). The Q&A states "In contrast [to a preimage attack], a collision attack finds two messages with the same hash, but the attacker can't pick what the hash will be."

So, shouldn't it be possible to edit something in the document that doesn't change the meaning (such as a misspelling, or punctuation) before you sign it, thereby changing the hash to something completely different? It would seem that now the attacker is forced to find a document that has a given hash, which is essentially a preimage attack, or is there something I am missing?

--
...interesting if true.
Summary for those too lazy to read it by rewt66 · 2004-09-01 06:16 · Score: 5, Insightful

The world is going to end! Giant asteroids will destroy all life on earth!

Oops, wrong article. Um... The world is going to end! Global warming... um, well... the Patriot Act... umm...

Well, it's not that bad. Somebody might be able to flip four very carefully selected bits in a file, and still produce the same MD5 hash. This could let me, for example, create an executable that had a normal, benign behaviour, and an evil trojan behaviour, and have one of the bits that I flip change a conditional so that the trojan behaviour was activated. (Note that open source tends to be immune to this kind of nonsense, since in the source code, the actual trojan part - not the conditional that activates it, but the actual evil payload - tends to stick out like a sore thumb.)

Note well that this does not let me create an evil version of somebody else's file. It only lets me create two closely related files, one of which differs by four bits from the other. I have to be able to construct the benign file in such a way that I can turn it into an evil file by changing four bits. And it can't be just any four bits, either; it's a very specific four bits.

So this isn't the end of the world. What it means is that you can't quite trust MD5 to guarantee that you got exactly, bit-for-bit, what you think you got.

But really, this new situation isn't much worse than what we had before. I mean, I could simply have the evil behaviour activated by the date, or by the IP address of the installed machine, or whatever, and get somebody else (who never saw the evil part run) to state that the program did what it was supposed to. Having an MD5 hash doesn't guarantee that the program isn't evil. Bottom line: don't run code written by bad people, whether it has a valid MD5 or not. (I know, I know. How do you tell who the bad people are? That's a hard question, but my point is that a valid MD5 has never told you whether the authors were bad people or not.)
Re:The cycle repeats by Zocalo · 2004-09-01 06:25 · Score: 3, Insightful

Exactly. With crypto, it's often that little bit of extra insight or improved technique which can bring the entire thing crashing down. As an example, take Charles Babbage's (yes, that one) breaking of what up until then had been the presumed unbreakable Vigenere cipher. The observation Babbage made was that certain sequences of letters might recur in the ciphered text, and the insight was that these might be the same plaintext letters encrypted against the same cipher sequence. From this small foothold, Babbage was able to ascertain the length of the keyword used to drive the encryption and from there break the complexity down into a limited number of substitution ciphers which are much easier to tackle.
While it's still not certain whether a similar jump might be made in the case of the MD5 and SHA-1 hashing algorithms, you can bet that a lot of crypto people are looking. What's that OSS saying about many eyes making flaws shallow again? Even if there is a fatal flaw, though, I doubt it's going to be the show stopper for hashes some people seem to think; you just use more of them. RPM supports DSA, SHA1, MD5 *and* GPG checksums, for example; even if all four algorithms were broken, I doubt there is a method for generating an equivalent file that matches all four checksums at the same time.

--
UNIX? They're not even circumcised! Savages!
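
The repeated-sequence observation described above (commonly known as Kasiski examination) can be sketched in a few lines. This is an illustration under stated assumptions: the helper names, the sample plaintext, and the key `KEY` are made up for the demo, and the input is assumed to be uppercase A-Z only.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Vigenere cipher over uppercase A-Z (no spaces or punctuation)."""
    out = []
    for i, ch in enumerate(plaintext):
        shift = ord(key[i % len(key)]) - ord("A")
        out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)


def repeat_distances(ciphertext: str, n: int = 3) -> list[int]:
    """Distances between consecutive occurrences of each repeated n-gram."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    dists = []
    for pos in positions.values():
        dists.extend(b - a for a, b in zip(pos, pos[1:]))
    return dists


key = "KEY"  # hypothetical 3-letter keyword
ciphertext = vigenere_encrypt("ATTACKATDAWNATTACKATDUSK", key)
dists = repeat_distances(ciphertext)
print(ciphertext)
print(reduce(gcd, dists))  # 12 here, a multiple of the key length 3
```

Repeated plaintext that lines up with the same part of the keyword encrypts to repeated ciphertext, so the distances between repeats are multiples of the key length. The key length is then a divisor of their common factor, which narrows the search to a handful of candidates.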
Re:yes, it does invalidate its use by JSBiff · 2004-09-01 07:04 · Score: 5, Insightful

But it is still worth pointing out to people that the uses of this collision-finding technique are still *very* limited. Someone can't take a digitally signed contract, make arbitrary changes to it, and still have the signature valid. In most cases, the change would be non-beneficial (to the attacker, that is), like maybe changing 3 characters in the document (this statement is based upon the fact that, in the collision examples given by the author of the paper, the two messages differed only by like 4 bits or something like that), and the odds are slim that the 3 characters would end up being in the right place in the text, and have an appropriate value, to make it useful. For example, what *would* be useful, but unlikely, would be to change the string '$1995' to '$2995', but as likely as not, to get it to hash right, you'd end up with like '$#g95' or some other rubbish, even if you managed to get the changed bits to line up with the critical bit of data (in this case a dollar amount). It's more likely that you'd end up changing some word like 'benefactor' to '2knefactor'.

However, for the example you gave, of firmware code, where you want it to be exactly right or else it will cause problems (even 4 bits of difference in BIOS code can make a computer inoperable), you are right that the hash collision can be a much bigger, much harder to detect, problem.
Re:Unix /etc/passwd files vulnerable... by pclminion · 2004-09-01 09:40 · Score: 2, Insightful

In this case, wouldn't it be easy for anyone with access to the passwd file to generate passwords that match the hashes (collisions) allowing them to login to accounts which were previously thought secure?
No. The new technique allows you to construct two pieces of data which have the same hash. It doesn't allow you to construct a piece of data which matches a given hash. This was explicitly spelled out in the Q&A document.
Also, MD5 password hashes are salted. This means that, although you could potentially find a collision with the hash in the password file, the colliding data almost certainly would not have the same salt, and therefore would not be accepted as a valid password.
Furthermore, think about it. If you had access to the password hashes, you would be root, and could just 'su' to the user account in question anyway.
Now, suppose that your goal was to guess the password and hope that the user uses the same password on other machines -- i.e., you want to boost yourself to other hosts. But you're still SOL, because the MD5 password hashes are salted. Therefore, you must recompute the equivalent password on the other hosts, even if the user used the same password on those hosts.
As the Q&A document explained, the ability to produce colliding hashes makes MD5 unsuitable for some tasks. Protecting UNIX logins is not one of the ways in which its use has been compromised.
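
The salting point can be illustrated with a deliberately simplified scheme. This is an assumption-laden sketch, not the actual UNIX MD5 password format (which is iterated and more involved): `salted_hash`, `new_entry`, and `check` are hypothetical names, and the stored value here is just MD5 over salt plus password.

```python
import hashlib
import os


def salted_hash(password: str, salt: bytes) -> str:
    """Simplified stand-in for a salted password hash: MD5(salt || password)."""
    return hashlib.md5(salt + password.encode()).hexdigest()


def new_entry(password: str):
    """Create a (salt, hash) pair with a fresh random salt, as a host would."""
    salt = os.urandom(8)
    return salt, salted_hash(password, salt)


def check(password: str, salt: bytes, stored: str) -> bool:
    """A login attempt succeeds only if the hash under THIS salt matches."""
    return salted_hash(password, salt) == stored


# Same password stored on two hosts, each with its own salt.
host_a = new_entry("hunter2")
host_b = new_entry("hunter2")
print(host_a[1] != host_b[1])      # the stored hashes differ
print(check("hunter2", *host_a))   # the real password still works
```

Because each host mixes in its own salt, a value that happens to match the stored hash on one host carries no advantage on another; the work has to be redone per salt.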