Slashdot Mirror


Implications Of The Recent Hash Function Attacks

An anonymous reader writes "Cryptography Research has issued a Q&A that explains the security implications of the hash function collision attacks recently announced at CRYPTO 2004. Apparently the consequences can be catastrophic for certain kinds of code signing and digital signatures, but MD5 sums for checking binaries are (mostly) OK. While the speculation that SHA-1 is about to fail seems to be overblown, updating the many legacy systems and protocols that rely on MD5 is going to be a massive undertaking."

262 comments

  1. The cycle repeats by ravenspear · · Score: 0

    No matter how complex encryption algorithms get, there will always be a faster computer that comes along to break them. That is, until quantum computing becomes widespread.

    1. Re:The cycle repeats by rewt66 · · Score: 4, Insightful

      This isn't "a faster computer". This is "a better technique" or "a deeper understanding" - much more dangerous.

    2. Re:The cycle repeats by ahsile · · Score: 1

      Ah, but I have said before... There are more people trying to break through this than there are trying to keep it secure.

    3. Re:The cycle repeats by Zocalo · · Score: 3, Insightful
      Exactly. With crypto, it's often that little bit of extra insight or improved technique which can bring the entire thing crashing down. As an example, take Charles Babbage's (yes, that one) breaking of what up until then had been the presumed unbreakable Vigenere cipher. The observation Babbage made was that certain sequences of letters might recur in the ciphered text, and the insight was that these might be the same plaintext letters encrypted against the same cipher sequence. From this small foothold, Babbage was able to ascertain the length of the keyword used to drive the encryption and from there break the complexity down into a limited number of substitution ciphers which are much easier to tackle.

      While it's still not certain whether a similar jump might be made in the case of the MD5 and SHA-1 hashing algorithms, you can bet that a lot of crypto people are looking. What's that OSS saying about many eyes making flaws shallow again? Even if there is a fatal flaw though, I doubt it's not going to be the show stopper for hashes some people seem to think; you just use more of them. RPM supports DSA, SHA1, MD5 *and* GPG checksums for example, even if all four algorithms were broken, I doubt there is a method for generating an equivalent file that matches all four checksums at the same time.

      --
      UNIX? They're not even circumcised! Savages!
    4. Re:The cycle repeats by wirelessbuzzers · · Score: 2, Informative

      DSA and GPG aren't hashes. They're signature schemes (well, GPG implements PGP, an encoding of either RSA or ElGamal signatures), and they use other hashes in their checksums, like SHA1 and MD5. Of course, you can make GPG do RIPEMD160, which IIRC has not been broken yet either.

      --
      I hereby place the above post in the public domain.
  2. Re:gentleMEN by MikeXpop · · Score: 0, Flamebait

    TIME to get YOURSELF a new KEYboard with a FUNCTIONING caps lock KEY.

    --
    Etiquette is etiquette. He kills his mother but he can't wear grey trousers.
  3. This is what I've been saying! by Dthoma · · Score: 5, Insightful

    "While the speculation that SHA-1 is about to fail seems to be overblown, updating the many legacy systems and protocols that rely on MD5 is going to be a massive undertaking."

    Any time I've tried to point this out, I've been shouted down by hysterical people (such as relex) squawking that because it may be possible to generate two messages with the same MD5 hash, SHA-1 is automatically broken. Um, no. They're two totally different algorithms. Use some common sense, people. I'm as cautious as the next person but screaming about how "all hash algos are insecure" is hyperbole at its worst.

    --

    Note to M1-ers: a curt but otherwise insightful message is not "Flamebait" or "Troll".

    1. Re:This is what I've been saying! by Peter+Cooper · · Score: 2, Insightful

      Frankly I think people who are crying that we should all ditch the MD5 hash are similarly hysterical.

      It's like if the same person won the lottery two weeks in a row.. would that mean the method they used for choosing the numbers was null and void? No. Likewise, finding a collission (or even several collissions) in MD5 does not invalidate its use.

      What would invalidate its use is having some programatic way of changing the hash of some data by merely throwing in some junk to make it match a hash of choice..

    2. Re:This is what I've been saying! by AKAImBatman · · Score: 4, Insightful

      All hash algorithms have collisions. That's just the nature of the beast. The issue is that a way has been found to create and exploit MD5 collisions. SHA-1 will eventually be a target of something similar, just not today.

    3. Re:This is what I've been saying! by ajna · · Score: 1

      "The problem as I see it is if a method is found to predict collisions in a hash. Is this possible with md5 and sha-1?"

      If by this you mean "Is it possibly to construct a message to have a pre-determined hash value?" the answer is no. It's in the article, search for "preimage" if you're lazy.

    4. Re:This is what I've been saying! by swillden · · Score: 5, Insightful

      finding a collission (or even several collissions) in MD5 does not invalidate its use.

      No, but having an algorithm to generate collisions in a practical period of time *does* make it suspect.

      What would invalidate its use is having some programatic way of changing the hash of some data by merely throwing in some junk to make it match a hash of choice

      That would make it completely useless for all security-related applications, yes, but a weaker break (such as being able to generate collisions) can break its usage for some security applications. Read the Cryptography Research Q&A for some examples.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    5. Re:This is what I've been saying! by sk8king · · Score: 2, Interesting

      Why not generate hashes that use both SHA-1 and MD5. The combined result would be that more unique. Finding two files that generate the same hash value in BOTH algorithms would be that much more improbable. Take the hash generated by SHA-1 and concatentate it [with a sensible delimiter if required] with the hash generated by MD5. Ta dah.

    6. Re:This is what I've been saying! by wiedmann · · Score: 3, Informative

      I think the SHA-1 speculation originated because of the SHA-0 collision, not the MD5 collision.

      SHA-0 obviously is related to SHA-1. So although no one has yet extended the SHA-0 collision to SHA-1, it is conceivable it might be possible.

    7. Re:This is what I've been saying! by CodeMaster · · Score: 1

      Finally another voice of sense. I'm with you brother!

      All the "security community" got all rattled up ebcause of SHA. My clients came over to me all shaken saying how does this affect us. Sheesh. Had to explain that almost nobody uses SHA anymore, so this is mostly legacy code and implementations.

      I hate it how all the paranoids jump up and down when they see something they can't even attest to...

      get a free ipod! This really works. (Free gmail invite to the ones using this referal and completing the offer!)

    8. Re:This is what I've been saying! by Anonymous Coward · · Score: 2, Interesting

      Wrong.
      They cannot "create" collisions. They can only try and find them.
      In fact the chances of finding a colision that could be used in a malicious manor are slim. Did you even look at the colision they found? It was two huge strings of random characters. Now if they find lots and lots more, and patters emerge, we're in trouble.

    9. Re:This is what I've been saying! by MarsDefenseMinister · · Score: 1

      Very good, particular for the common use in password authentication. No attacker is going to bother with breaking MD5 when it's just easier to guess poorly chosen passwords. So I'd say that using MD5 to hash passwords is going to be with us for a long time.

      --
      No weapon in the arsenals of the world is so formidable as the will and moral courage of free men.-Ronald Reagan
    10. Re:This is what I've been saying! by minus_273 · · Score: 1, Informative

      "What I fail to see is how finding 2 things that give the same hash is a big deal."

      wow, didn't expect to see that here. Well, in very simple terms, assume that you have a an OS where the password is saved in a file, say passwd, as a hash. Someone come along and finds another word that makes the same hash. Now when that person logs in as you, he can use that other 'password' and not your password and still login.

      --
      The war with islam is a war on the beast
      The war on terror is a war for peace
    11. Re:This is what I've been saying! by Anonymous Coward · · Score: 0
      It's like if the same person won the lottery two weeks in a row.. would that mean the method they used for choosing the numbers was null and void? No. Likewise, finding a collission (or even several collissions) in MD5 does not invalidate its use.

      EXACTLY. Take OJ Simpson, for example. Just because the courts found a 99.9999999% probability that the DNA was HIS, doesn't make him guilty. There's still a chance the killer was from outer space.

      /butthead mode off

    12. Re:This is what I've been saying! by LnxAddct · · Score: 2, Interesting

      Even if you could quickly be given a message and generate a separate message which has the same hash but is the same size of the original message, it still won't really matter simply because you have no control over what the second message is. For example, in the article they use "I, Bob, agree to pay Charlie $ 5000.00 on 4/12/2005.", well a 2nd message that has the same hash and is the same length as the first message would probably look like this "bhedjfgd70gdgr6rhg2rkjhgvrek342rgkhverghvgkhkfgrr e2wd". What I'm getting at is that since you have no control over the new message, and because it is decided by an algorithm, it will probably be a bunch of random characters, even if a word or two were formed, it would be a non-sensical statement. The article is making appear like you can take any two messages and modify the second one to have the same hash. Its bullshit. The best you could do is add a bunch of random characters at the end of your 2nd message until it has the same hash as the real message, but then its easy to point out, "Hey I didn't sign a paper with a bunch of random shit all over it." Oh yea, and it would also take many millenia.
      Regards,
      Steve

    13. Re:This is what I've been saying! by Zeinfeld · · Score: 3, Informative
      Why not generate hashes that use both SHA-1 and MD5. The combined result would be that more unique.

      SSL does this. It is not a very good idea.

      The problem is that MD5 and SHA-1 are both variations of MD4. Each one has an extra cycle. SHA-1 has in addition a mysterious expansion function that blocks many attacks and has five chaining variables rather than four. But at root there is no real difference. Both use the exact same functions for the transformation.

      There would be slightly more point to using SHA-1 with a hash algorithm with an entirely different construction mechanism. But even then the keying mechanism is not very satisfactory.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
    14. Re:This is what I've been saying! by Ewan · · Score: 1

      You've really not thought this through yourself have you?

      Which is harder - steal the shadow password file, generate an md5 collision using the method in this article which is still extremely taxing on the system involved, use the text that generated the md5 to break into the system who's shadow password you just managed to steal, or, just brute force the entry of the user-entered password?

      Doing a dictionary attack where you know the result will be human readable is by far the easier way.

      Ewan

    15. Re:This is what I've been saying! by 't+is+DjiM · · Score: 0
      Whether it's 32 characters depends on what encoding do you use ;-)... Actually, it's 32 hexadecimal digits.

      As far as I know, md5 128 bit-digests do indeed have 2^128 possible values. This is "only" 340,282,366,920,938,463,463,374,607,431,768,211,45 6 possible hashes.

      When looking closer at the "find a collision"-challenge, you can distinguish 3 different challenges:
      1. just find 2 random messages having the same hash
      2. find a random message that has the same hash as a given message
      3. find a message for the given hash
      The last problem is the most difficult one to solve. So far, nobody can do this in a reasonable amount of time. So your md5-hashed passwords are safe for a while.
      The second one apparears to be solveable in a reasonable amount of time (I saw the formula somewhere but I can't find it. Anyone?). So this breaks md5 for using it as a hash function for certificates.
      The first problem also gets solved in a reasonable amount of time but it doesn't pose a threat on any algorithm. As you stated in your post, there ARE collisions due to the fact that there are less hashes than possible inputs.
      --
      --Use ant to make .war
    16. Re:This is what I've been saying! by Thuktun · · Score: 1

      [...] finding a collission (or even several collissions) in MD5 does not invalidate its use.

      Indeed, mathematically speaking, there MUST be collisions when you map a set of larger cardinality to one with a smaller cardinality. The probability of a collision between two arbitrary elements of the larger set is still very, very remote.

      The issue to worry about is if you can somehow derive or predict the input from the output, as in most cryptographic uses of the hash.

      However, if you're only using MD5 for UUID/GUID generation in non-cryptographic environments, this shouldn't be an issue.

    17. Re:This is what I've been saying! by Zeriel · · Score: 1

      NO NO ALL WRONG.

      You can generate two messages with the same hash.

      You cannot yet generate a message with a predetermined hash.

      So the passwd/shadow files are still safe (from this particular attack)

      --
      "America has done some terrible things. But I know that Americans don't cheer when innocents die." -Dave Barry
    18. Re:This is what I've been saying! by Fros1y · · Score: 2, Insightful

      Finding 2 things that give the same hash is not such a big deal. Having a search algo that lets you find that second thing (given the first) in substantially less computational time / complexity than previously thought possible is a huge deal. Given that MD5 can hash more than 2^32 objects, the first is inherent. It's that second trait, and recently discovered algo, that's the real kicker.

    19. Re:This is what I've been saying! by jwdb · · Score: 2, Interesting

      Why it's a big deal is explained in the article:

      Let's say someone uses this attack to generate two ssl certificates with the same hash, one benign, one malignant (ie * for host, 2200 AD for expiry). This person then sends the benign one in to Verisign and gets it signed as a trusted certificate. He then applies the malignant certificate, with a valid Verisign signature, to his little scam website - people log on, check the certificate, see that it's signed, trust the website...

      Jw

    20. Re:This is what I've been saying! by r2q2 · · Score: 1

      Sha-1 isn't has no attack at this time. Why not just use that?

      --
      My UID is prime is yours?
    21. Re:This is what I've been saying! by xxxJonBoyxxx · · Score: 4, Informative
      What is means is "MD5 is not a secure hash".

      In the United States, the National Institute of Standards and Technology determines what is and what is not to be considered secure enough for federal data processing using the definition below. I highlighted the part where MD5 would run into trouble because a method has been discovered to predict collisions in MD5. (NIST never classified MD5 as a "secure" hash.)

      From the NIST site, FIPS 180-2 (http://www.nist.gov):

      Federal Information Processing Standards Publication 180-2

      3. Explanation: This Standard specifies four secure hash algorithms - SHA-1, SHA-256, SHA-384, and SHA-512 - for computing a condensed representation of electronic data (message). When a message of any length < 264 bits (for SHA-1 and SHA-256) or < 2128 bits (for SHA-384 and SHA-512) is input to an algorithm, the result is an output called a message digest. The message digests range in length from 160 to 512 bits, depending on the algorithm. Secure hash algorithms are typically used with other cryptographic algorithms, such as digital signature algorithms and keyed-hash message authentication codes, or in the generation of random numbers (bits).

      The four hash algorithms specified in this standard are called secure because, for a given algorithm, it is computationally infeasible 1) to find a message that corresponds to a given message digest, or 2) to find two different messages that produce the same message digest. Any change to a message will, with a very high probability, result in a different message digest. This will result in a verification failure when the secure hash algorithm is used with a digital signature algorithm or a keyed-hash message authentication algorithm.

    22. Re:This is what I've been saying! by Carnildo · · Score: 1

      The SHA-0 collision technique cannot be applied to SHA-1.

      Way back in 1993, SHA-0 was published by one of the TLA government agencies. Less than a year later, they published SHA-1, which is SHA-0 with a modification they didn't bother to explain, and a note that SHA-0 was no longer recommended. It appears that the purpose of the modification was to block this exact attack.

      --
      "They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
    23. Re:This is what I've been saying! by andman42 · · Score: 2, Informative

      Sure there are bound to be collisions, but the number of outputs is so large (2^160 for SHA) that the odds of you ever accidentally "finding" one should be slim to none.

      What is important about the recent results is that it is possible to generate hash collisions. This means that an adversary can make two things that have the same hash value.

      Using the example from the article, suppose an adversary creates two messages with the same has value:
      1. "I, Bob, agree to pay Charlie $ 5000.00 on 4/12/2005."
      2. "I, Bob, agree to pay Charlie $18542841.54 on 9/27/2012."

      If the adversary can get you to sign the original message (by selling you something for $5000), the adversary could then claim that you agreed to the second message because the messages have the same hash.

    24. Re:This is what I've been saying! by Anonymous Coward · · Score: 0

      Wow, I DID expect to see that here. You didn't read the article. What you describe is what did NOT get cracked. What got cracked would be making up two new passwords that hash to the same value so I could use either one when I log in. There is still no way known simpler than brute force to crack md5 passwords even if I know the hashes (/etc/shadow these days, noone puts them in /etc/passwd).

    25. Re:This is what I've been saying! by minus_273 · · Score: 1

      my reply has nothing to do with the article, i havent even read it. but i am answering the guy's question.

      --
      The war with islam is a war on the beast
      The war on terror is a war for peace
    26. Re:This is what I've been saying! by tyler_larson · · Score: 4, Insightful

      Have we forgotten what a hash function is?

      Finding a few collisions, or even an algorithm to generate collisions doesn't change a damn thing. We've always known that there are collisions. A hash function maps in infinite input set to a finite output set. Of course there are collisions. There are an infinite number of collisions for ANY hash function. We already knew that--it's a mathematical certainty. Yet somehow we're shocked and horrified when we actually find some.

      Tell me something I didn't already know. Then I'll be impressed.

      --
      "With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea...."
      RFC 1925
    27. Re:This is what I've been saying! by bradkittenbrink · · Score: 2, Funny

      Read the Cryptography Research Q&A for some examples.

      That's the most polite RTFA I've ever seen

    28. Re:This is what I've been saying! by ArsonSmith · · Score: 1

      Except that finding that second password is just has hard as finding the real password, if not harder. passwords are sometimes limited to 8 characters and something like 16 on newer systems. the number of combinations of all valid password characters from one letter passwords to 16 letter passwords is still smaller than the total number of combinations of md5 sum possibilities. It is very possible that there are no collisions in the available password space.

      Signed binaries are a little different though as if you are able to just pad a trojan with random bits that create a file with the same md5 and about the same size as the original you can then use that to infect systems under the guise of a legitimate program.

      --
      Paying taxes to buy civilization is like paying a hooker to buy love.
    29. Re:This is what I've been saying! by Anonymous Coward · · Score: 0

      They state that they are able to find many collisions. They also state that they can do this with any chosen IV. They can also do this relatively quickly. I'm disapointed off that there is all this talk about this result with two Slashdot stories, and yet it seems that the method is still not published. What's the deal? How did they do it?

    30. Re:This is what I've been saying! by RWerp · · Score: 1

      There are an infinite number of collisions for ANY hash function. We already knew that--it's a mathematical certainty. Yet somehow we're shocked and horrified when we actually find some.

      Think of an analogy. We know that there are an infinite number of prime numbers. Yet we don't know how to find all of them, methodically (like having a formula for the n-th prime number). This is the same. Researchers found a way to find two messages with the same hash. It's different from just knowing that such messages exist.

      --
      "Long run is a misleading guide to current affairs. In the long run we are all dead." (John Maynard Keynes)
    31. Re:This is what I've been saying! by Xabraxas · · Score: 1

      Actually the possiblity that the DNA belonged to OJ was 99.5%, meaning that every 1 in 200 people had DNA similar enough to match the sample the police had.

      --
      Time makes more converts than reason
    32. Re:This is what I've been saying! by ahdeoz · · Score: 1

      Do you know why, back in the good old days you would sign your name to the bottom of a contract? It was so someone couldn't add a bunch of random shit after that to make it look like you agreed to something you didn't. Of course, nowadays, they throw the random shit on extra pages anyways, because contract law is dead.

    33. Re:This is what I've been saying! by thogard · · Score: 2, Informative

      for a given algorithm, it is computationally infeasible to find two different messages that produce the same message digest.

      This is the trick to figure out if MD5 is secure in how you use it. If I'm using md5 to sign a 4 character string, then md5 is secure for that reason because there are no collisions in the 4 billion input strings (its insecure because I can reverse it because I easily do 2^32 md5s)

      There are a number of applications where md5 is still secure. One of them is hmac since the sending end controls the random seed and there isn't any place to inject and funny bits.

      This isn't a killer for md5. Its a killer for md5 in some cases but not all. For now I'm going to continue to use md5 for a number of uses and one of them is integrity checking of what I downloaded from open source binaries. If they can hack the web site to put a virus in the binary, can't they also update the md5 displayed on the web site? In which case its a great tool to verify the download is what was on the web site. I have binary checkers that use md5 to make sure my binaries aren't corrupted on my systems. Right now I md5 every binary that gets backed up I may consider adding a seed to the front of the md5 I use. Someone may come up with a funny RH7 binary of ps that has the same md5 as the stock one, but they aren't going to come up with one that has the same hash that also has the same hash when I tack on "xyzzy" on the front.

    34. Re:This is what I've been saying! by Kymermosst · · Score: 1

      Any time I've tried to point this out, I've been shouted down by hysterical people (such as relex) squawking that because it may be possible to generate two messages with the same MD5 hash, SHA-1 is automatically broken. Um, no. They're two totally different algorithms. Use some common sense, people. I'm as cautious as the next person but screaming about how "all hash algos are insecure" is hyperbole at its worst.

      Additionally, because they are totally different, it is unlikely that one could find two messages that cause collisions in both algorithms at the same time, a message digest composed simply of hashes from both algorithms would probably still be secure.

      This is how it seems to me, at least.

      --
      "Alcohol, Tobacco, Firearms, and Explosives" should be a convenience store, not a government agency.
    35. Re:This is what I've been saying! by Anonymous Coward · · Score: 0

      It appears that the purpose of the modification was to block this exact attack.

      Can you back this up at all? What leads you to think this is the case?
    36. Re:This is what I've been saying! by eam · · Score: 1

      > So I'd say that using MD5 to hash passwords is
      > going to be with us for a long time.

      That's not really saying much when you consider that using crypt to hash passwords will also be with us for a long time.

    37. Re:This is what I've been saying! by Red+Pointy+Tail · · Score: 1


      Sounds great, except that even if you could generate a malignant one from the hash, it would be gibberish and nothing like a certificate which you can apply to a website.

    38. Re:This is what I've been saying! by Alsee · · Score: 1

      We've always known that there are collisions.
      Of course there are collisions.


      Well, we've always known that there's 200-year-old buried pirate treasure. Of course there's buried treasure. Go ahead, spend your whole life digging up the planet trying to find some. Good freaking luck.
      Tell me something I didn't already know. Then I'll be impressed.

      This development pretty much amounts to a handy scanner directing you to the exact spot to dig, at least for certain types of treasures. And the scanner is expected to have an ad-on pack to pick up other types of treasure pretty soon.

      And just to be clear, hashes are used by banks, hases are used by government security systems, hashs are used in all sorts of critical systems.

      On the bright side Trusted Computing also relies on the security of hashes. As far as I'm concerned anything that helps block or defeat Trusted Computing is a Good Thing.

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
    39. Re:This is what I've been saying! by Red+Pointy+Tail · · Score: 1

      They didn't find a way to exploit the collisions, they found a way to generate it, that is, given a hash, generate a message that will hash to it.

      Of course, this is not likely to be the original, but the crucial thing is that it can still be used as a usable substitute - like in password checking. There are ways to make this harder to attack on the application layer - for example, by salting the message before hashing, so that if the hash has been intercepted, you cannot get back a usable password you can generate collisions.

    40. Re:This is what I've been saying! by Anonymous Coward · · Score: 0

      The SHA-1 rumour originated becuase of cryptanalysis being done on SHA-1. Read the links in the article(You too mods!).

    41. Re:This is what I've been saying! by Anonymous Coward · · Score: 0

      Also, ROT13 password hashing.

      Oh wait, crypt isn't a hash. Neither is ROT13. Oh well, what were you saying again?

    42. Re:This is what I've been saying! by God!+Awful+2 · · Score: 1

      Finding a few collisions, or even an algorithm to generate collisions doesn't change a damn thing. We've always known that there are collisions. A hash function maps in infinite input set to a finite output set. Of course there are collisions. There are an infinite number of collisions for ANY hash function. We already knew that--it's a mathematical certainty. Yet somehow we're shocked and horrified when we actually find some.

      You seem to misunderstand the difference between "a hash function" and "a cryptographic hash function". Perhaps you should look it up.

      Tell me something I didn't already know. Then I'll be impressed.

      Ignorance is not an excuse for making a dumb post.

      -a

    43. Re:This is what I've been saying! by jwdb · · Score: 1

      It's not about generating a malignant one from a hash, it's about searching for two non-gibberish certificates that have the same hash. This is now possible.

      You find the two certificates first, and THEN get the hasn signed. This is what the attack is all about - you can't choose the collision hash but you can find a collision between a valid benign and valid malicious string of bytes, and if you get the benign one signed...

      Jw

    44. Re:This is what I've been saying! by alex_tibbles · · Score: 1

      "There are an infinite number of collisions for ANY hash function."

      That's what I thought until I read this article: SHA-0,1 are only defined for messages of less than 2^64 bits. Therefore there can only be a finite number of collisions.

    45. Re:This is what I've been saying! by Red+Pointy+Tail · · Score: 1


      The probability of finding two certificates that have the same hash is almost nil. This is because just a random string is not a certificate - a certificate have a definite and meaningful structure and the probability that you can find two certs having the same structure and definition AND same hash is virtually zero.

      My understanding of the 'attack' is that given the hash it is now computationally inexpensive to generate a possible 'message'. This 'message' would probably not be a valid certificate you can use.

    46. Re:This is what I've been saying! by Red+Pointy+Tail · · Score: 1


      Uhh, if you actually use the attack to find a collision to 1., it will probably not at all like 2., but would be more like: JJHWDF&*@#UB@#$JKGW*&F@GR@BFKJEWFBWEJF#EFNJ$KRF#$( R$#@R#$F#$JFJ#IRF#$KLCLD:LASWX_WR@#$R@#...

    47. Re:This is what I've been saying! by jwdb · · Score: 1

      Not so sure about that...

      The article was pretty clear about the fact that you cannot generate a message for a given hash. It is impossibly unlikely that someone can find a collision for one of the hashes out there.
      However, there is now a documented, working process for finding two documents who'se hash is identical, which is the basis for the certificate attack described in the article.
      Indeed, there is no guarantee that you will find two valid certificates, but in the last month it has become a whole lot more possible...

      Jw

    48. Re:This is what I've been saying! by julesh · · Score: 1

      SHA-0,1 are only defined for messages of less than 2^64 bits.

      MD5 also has a similar restriction; at one point in the process the size of the message in bits must be stored in a fixed width field (I believe it is also 64 bits for MD5, although I can't remember exactly).

    49. Re:This is what I've been saying! by Anonymous Coward · · Score: 0

      plus the glove didn't fit, so you must acquit.

  4. Re:gentleMEN by ponds · · Score: 2, Informative

    Yes, too bad ECC is not a hashing algorithm and has absolutely nothing to do with this, or else we'd be set.

  5. Idiot Question by OverlordQ · · Score: 4, Insightful

    Say I have program A that I distributed and I supply the md5/sha1 sum to insure it's 'validity'. From what I read you can get two bitstreams to produce the same sum, ok that was expected. But what I dont get is this still doesn't allow somebody to arbitrarily pick whatever sum they want for their code right? I mean still the chances of somebody writing some trojan'd program and magically somehow getting the sum's to match is extreemly small and/or really computationally expensive. If they were that smart, wouldn't they be working for one of the TLA's (Three letter Acronyms) already?

    --
    Your hair look like poop, Bob! - Wanker.
    1. Re:Idiot Question by cpeikert · · Score: 3, Informative

      But what I dont get is this still doesn't allow somebody to arbitrarily pick whatever sum they want for their code right? I mean still the chances of somebody writing some trojan'd program and magically somehow getting the sum's to match is extreemly small and/or really computationally expensive.

      Right. The breaks that were announced of the variety: "Here are two totally contrived documents that hash to the same value (which I can't control)." The attack does not allow someone to "hit" a desired hash value. So for the use you described, MD5 is still OK (so far).

    2. Re:Idiot Question by ponds · · Score: 5, Interesting

      In many situations any data inconsistancy can cause catastrophe. When distributing binaries it isn't that big of a deal, however there are other applications of hashing algorithms.

      Think about forensics: Someone gets arrested, computer confiscated. The first thing that happens is a hash checksum is ran of the disk, then a disk image is made, then the image checksum is verified to make sure that it is the same as the original disk. If the checksum of the original disk ever changes, the evidence is useless. When there are collisions in the algorithm, the checksum cannot prove, beyond a reasonable doubt, that the data has not been tampered with. Especially when the hashing algorithm is ran on 20 or more gigabytes of data, which is the typical case in forensics.

    3. Re:Idiot Question by Westley · · Score: 4, Informative

      I *think* this is the difference the article talked about between a preimage attack and a collision attack. An attacker still can't find a message (program, whatever) that produces the same hashcode as your one - but they *can* produce two messages which produce the same hashcode as each other. One of those may be perfectly acceptable, the other not - but once a user/system/whatever has accepted (or signed) the first one, they've effectively signed the second one.

      That's only my impression based on the article, and I wouldn't like to swear to it - but it does make a certain amount of sense.

    4. Re:Idiot Question by Vexler · · Score: 1

      You are right in describing what happens in a "collision" scenario whereby you can get two streams to match each other without knowing what the hash would be. But there is still the problem of "preimage attack", whereby you *CAN* choose an alternative input that has the same hash as another output. Both would be similarly problematic and could potentially be catastrophic should an attacker attempt to compromise, say, a CA system.

    5. Re:Idiot Question by merlin_jim · · Score: 1

      I believe this collision algorithm takes a seed value and munges it (using neutral bits theory?) until you have two values that hash the same.

      If you use your code as the seed, and design the algorithm so that the way it munges one part of it to only introduce NOPs and the other

      oh yeah I see.

      Yeah a collision algorithm can't really be used in this case. Or any case now that I'm thinking about it. Unless you can find a way to design the algorithm to give you specific outputs. But then its not a collision algorithm anymore; its a preimage algorithm.

      --
      I am disrespectful to dirt! Can you see that I am serious?!
    6. Re:Idiot Question by Anonymous Coward · · Score: 0

      it may be possible that you can construct another document which is totally arbitrary and only a small portion of it needs to be changed to get hash collision. so assume you have one program A signed with MD5. now i create another program B which is a trojan horse and it has some 2 kbytes of dummy segment. i can then set this 2 kbytes of dummy segment to a particular value such that it gives hash collision.

    7. Re:Idiot Question by hsoft · · Score: 1

      That's what I think too. I mean, there are an infinite number of strings with the same MD5 sum than 'foobar' MD5 sum, however, most of these strings don't make sense (and most of them must be insanely long). And a binary trojan certainly isn't a part of those strings.

      --
      perception is reality
    8. Re:Idiot Question by qwijibo · · Score: 1

      MD5 is likely to be safe in general, but it depends on your security needs. However, it's good that this information is readily available with detailed analysis for those who are in a position to evaluate their risk potential.

      For those who are handling financial transactions, the potential threat justifies the added effort in changing to SHA-1.

      If you're using MD5's to make sure your executables on servers aren't tampered with, it's probably safe. What are the chances that someone is going to come across a collision with a very specific subset of commands that would be worth replacing on a compromised host?

      Using multiple hashes is still a good general approach. If you have both MD5 and SHA-1, what are the chances of a compromise of one being critical? Nothing can be made totally secure, but buying time between discovery and exploitability is worthwhile.

    9. Re:Idiot Question by Anonymous Coward · · Score: 0

      This can't be how they really do it, can it?

      Wouldn't just turning on the computer affect the checksum of the entire disc if just a single file was accessed, thereby changing its last accessed date, or if a single temp file was modified?

    10. Re:Idiot Question by EvilIdler · · Score: 1

      Maybe the disk example isn't the best - simply mounting the disk
      might change data on it. File access times, mount count..

    11. Re:Idiot Question by ponds · · Score: 2, Interesting

      Yes, if you mount a disk, it is completely useless as evidence. Any forensics practitioner who has been on the job for more than a day would never mount a disk. Thats why an image is made.

    12. Re:Idiot Question by tzanger · · Score: 1

      When there are collisions in the algorithm, the checksum cannot prove, beyond a reasonable doubt, that the data has not been tampered with.

      Uh... all hashing algorithms will have collisions. It's a simple fact of life when digesting x bits in to y bits, given that y << x.

    13. Re:Idiot Question by gordyf · · Score: 2, Insightful

      These hashes ALWAYS have collisions. A checksum could never definitively prove that the data has not been tampered with, as there are far more bits in a harddrive than there are bits in a hash.

      The new discovery is that it's not hard to generate two messages with the same hash. The discovery is not "hashes have collisions" - this has always been the case.

    14. Re:Idiot Question by david+duncan+scott · · Score: 1
      But there are always and necessarily possible collisions any time the checksum space is smaller than the data space (in this case, any time the data is more than 32 characters, never mind 20GB).

      I'd say "reasonable doubt", rather than mathematical certainty, is the key term here. Any checksum raises the odds that the data hasn't changed, but it can't prove that it hasn't unless there are at most the same number of possible checksums as there possible versions of the data.

      The question is one of likelihood and your idea of reasonable. There are no absolutes.

      --

      This next song is very sad. Please clap along. -- Robin Zander

    15. Re:Idiot Question by ponds · · Score: 4, Informative

      "Wouldn't just turning on the computer affect the checksum of the entire disc if just a single file was accessed, thereby changing its last accessed date, or if a single temp file was modified?"

      Correct, usually what happens when a computer is confiscated is this:

      1.) Power is removed. IE, plug pulled on desktop or battery removed on laptop. If you turn the power off, APM or ACPI will kick in and write to the disk.

      2.) Disk is removed and a chain of custody form is written.

      3.) Disk is checksummed and imaged, either using a standard computer or a forensics machine that is designed to image disks. The disk does not have to be mounted to do this, you can get a raw dd without mounting a disk and without accessing any files.

      4.) Forensic analysis is performed on images, usually on copies of the images. When evidence is found, the checksum is checked again to make sure that this image is the same as what was on the disk.

    16. Re:Idiot Question by PCM2 · · Score: 1
      Think about forensics: Someone gets arrested, computer confiscated. The first thing that happens is a hash checksum is ran of the disk, then a disk image is made, then the image checksum is verified to make sure that it is the same as the original disk. If the checksum of the original disk ever changes, the evidence is useless. When there are collisions in the algorithm, the checksum cannot prove, beyond a reasonable doubt, that the data has not been tampered with.
      Well, the operative word there would be reasonable doubt, wouldn't it? The courts aren't a binary computing system. Just because it's theoretically possible for an algorithm to fail doesn't automatically invalidate its use. Blood typing, for example, can't prove that the blood found in the back seat of a car belonged to any one person. But if a certain person shows up dead, and that person knew the owner of the car, and the blood type matches ... well, that's not proof, but it's evidence. All evidence works toward eliminating that reasonable doubt. Likewise, you would have the option of trying to reinforce doubt as to the accuracy of the data on your hard drive by suggesting that the data had in fact been tampered with and then covered up by "forging" the MD5 hash using a known collision attack. I guess they might believe you...
      --
      Breakfast served all day!
    17. Re:Idiot Question by ajna · · Score: 3, Informative

      Hash functions map a bigger space (unbounded strings, for example) to a smaller space (64 bits, perhaps), so collisions are inevitable. The linked article is not significant because it points out that MD-5 has collisions, since this is trivially true as I've attempted to make clear. It's significant because researchers have found a way to find multiple inputs which both hash to the same value. Since they have not found out how to create an input string that hashes to a value of their choice (preimage problem) MD-5 is not fundamentally broken.

      "When there are collisions in the algorithm, the checksum cannot prove, beyond a reasonable doubt, that the data has not been tampered with." This preceding sentence demonstrates a remarkable lack of understanding about hash functions. Collisions are inevitable, see above. How hash functions work is by making the hash values unpredictable and spread out evenly over the space of the hash values, given random input. Read up: http://www.cs.sunysb.edu/~skiena/214/lectures/lect 21/lect21.html or by googling for "hash functions collisions probablility".

    18. Re:Idiot Question by WolfWithoutAClause · · Score: 4, Informative
      But what I dont get is this still doesn't allow somebody to arbitrarily pick whatever sum they want for their code right?

      It's based on the birthday party paradox.

      For two randomly chosen hashes, the chances of them being the same is 1/p where p is the maximum size of the hash.

      However, if you pick n hashes at random, then the chance that any two of them match is approximately n^2/2p, since any one of the 'n' could match with any other of the 'n'.

      So if p is 1/(2^160) and you generate 2^80 hashes of random (or partly random) data, then theres about a probability of 1 that two of them match each other. 2^80 is still a big number, but they've managed to reduce it further with some clever tricks, and modern processors can do billions of operations per second.

      So, if you write two programs one evil, one good, and then add 2^80 different random fillers on the end of each, chances are, two of the good/bads will have the same hash.

      But the chances of any of the bads matching a given hash that somebody hands you is only 2^80/2^160 which is 1/2^80- much too small.

      --

      -WolfWithoutAClause

      "Gravity is only a theory, not a fact!"
    19. Re:Idiot Question by qwijibo · · Score: 1

      They use their own computers for forensics, allowing them to mount the drive as read only. That way they can look all they want, copy data, etc without risking changing the contents.

    20. Re:Idiot Question by ponds · · Score: 1

      Sorry, I meant [b]known[/b] collisions. Tampering the data and "getting lucky" and having the hashes match, is much much less probably than "reasonable doubt." Tampering the data and having the hashes match when you know how to cause an algorithm collision is much more probable.

    21. Re:Idiot Question by BdosError · · Score: 1

      But, as this article said, this weakness does not let you choose the hash value to create a collision with. It only allows you to create 2 sets of data that have a hash collision, so that won't help you with tampering with evidence unless you produced both sets of data originally.

      --
      Complexity is Easy. Simplicity is Hard.
    22. Re:Idiot Question by ponds · · Score: 1

      This is irrelevant, because no one will use MD5 for forensics anymore.

    23. Re:Idiot Question by Grond_the_Hammer · · Score: 2, Interesting
      If the checksum of the original disk ever changes, the evidence is useless. When there are collisions in the algorithm, the checksum cannot prove, beyond a reasonable doubt, that the data has not been tampered with.

      Neither of those statements are true. Hashing algorithms are useful for forensic verification but changing a single bit on a disk image will not cause it to be tossed from a case, as long as any changes can be explained as a result of something legitimate the forensic analyst did. Booting the original image (while risky) is sometimes necessary in forensics but it will not, contrary to popular opinion, invalidate the evidence. A slick defense attorney could, however, point to analyst incompetence as the reason and if successful could have the evidence tossed. Also, MD5 collision have been known about for years, and are an acceptable issue within forensics. DNA typing has collisions as well, but since collisions are in a statistical range that is nearly unassailable, a match still meets the "reasonable doubt" standard in criminal court.

    24. Re:Idiot Question by scruffy · · Score: 1
      The specific result is that it's much easier than random chance to generate two strings with the same MD5 value. This implies that MD5 is much less secure than previously assumed.

      No, it's not an algorithm that will efficiently find a string with the same MD5 value as your program A, but it's a significant step in that direction. MD5 is now known to have enough of a flaw that it is reasonable to assume that it's only a matter of time and ingenuity to exploit it.

    25. Re:Idiot Question by cft_128 · · Score: 1

      The particular attack mentioned in the article and your parent post does not apply in your forensics example - you are thinking of a preimage attack (where you can find a collision to an existing key) and not a collision attack (where you can find two messages that match a new arbitrary key).

      --

      Underloved Movies and Pub Quiz: donotquestionme.org

    26. Re:Idiot Question by jwdb · · Score: 1

      You're absolutely correct that they've only found a collision attack, but this is not 'perfectly acceptable.' The best example he gave is of someone who finds a collision between two SSL certificates, one pretty standard and one granting an absurd number of privleges. If he can then submit the first certificate and get it signed...

      Jw

    27. Re:Idiot Question by Brandybuck · · Score: 3, Insightful

      Correct me if I'm wrong, but isn't it still just as difficult to create a collision with a *known* message, as it was before this "attack" was discovered? In other words, there's no feasible way for a crooked forensics technician to alter the data. And even if it were practical to create a collision with a known message, the result of this on a harddrive is not going to be data that looks like it came from a harddrive. Rather, it's going to look like data that came from a random number generator.

      Hmmm, I guess that's one thing a crooked cop could do. He could make it look like the entire disk is encrypted. In France that's enough to convict your Grandmother with...

      --
      Don't blame me, I didn't vote for either of them!
    28. Re:Idiot Question by LurkerXXX · · Score: 1
      As stated by the previous posters, you never ever want to turn on and run the computer as the suspect left it. They may well have included a logic bomb in the system to say erase the hard drive if a certain keyword isn't typed in within 30 seconds, etc.

      The drives are always removed and accessed read-only from the forensic expert's own system.

    29. Re:Idiot Question by fibonacci2000 · · Score: 2, Informative

      When the dcfldd version of dd is used, the investigator can acquire the MD5 hashes of chunks of data; even if the whole disk hash changes, you can still have valid pieces of the disk images.

    30. Re:Idiot Question by Westley · · Score: 1

      I wasn't suggesting that collision attacks are perfectly acceptable - I was suggesting that one of the things to be signed could be perfectly acceptable (the "pretty standard" SSL certificate in your example while the other wouldn't.

    31. Re:Idiot Question by pclminion · · Score: 1
      When there are collisions in the algorithm, the checksum cannot prove, beyond a reasonable doubt, that the data has not been tampered with.

      It could never prove it in the first place. Prove it beyond a reasonable doubt maybe, but never really prove it.

      However, if the MD5 sum is different, then you certainly have proven that the data DID change. The MD5 can prove the data changed. It cannot prove it did not change.

      Similarly, DNA evidence is able to rule out suspects, but it cannot prove that a person committed a crime. This is why DNA expert witnesses say something like, "The probability that the defendent did not commit this crime is two hundred million to one." However, if your DNA doesn't match that found at the crime scene, you are instantly exonerated.

    32. Re:Idiot Question by RWerp · · Score: 1

      So now defence lawyers have a new tool in their posession, right? "This is not evidence, since MD5 hash has been compromised, Your Honor".

      --
      "Long run is a misleading guide to current affairs. In the long run we are all dead." (John Maynard Keynes)
    33. Re:Idiot Question by rsmith-mac · · Score: 2, Insightful
      In many situations any data inconsistency can cause catastrophe

      Bingo. Take BitTorrent for example; it uses SHA-1 hashing to make sure every piece downloaded isn't corrupt, and otherwise helps ensure that no one is tampering with the torrent(and if they are, most clients kick them). However, if you can generate an arbitrary SHA-1 hash, anyone could seed the swarm with bad data, and the clients would never know. This could not only be used by the **AA to disrupt illegitimate torrents, but script kiddies could disrupt legitimate torrents too, resulting in everyone downloading a corrupt version of Knoppix, for example.

      While this is only 1 case, the point remains: in many cases, you only need to disrupt the data, not explicitly replace it.

    34. Re:Idiot Question by jwdb · · Score: 1

      aah, my bad

      Jw

    35. Re:Idiot Question by ahdeoz · · Score: 2, Interesting

      For most Financial Transactions, you're probably safe enough throwing a `wc -c` on to your message to make MD5 impregnable, if you're really paranoid.

    36. Re:Idiot Question by Eythian · · Score: 1
      So if p is 1/(2^160) and you generate 2^80 hashes of random (or partly random) data, then theres about a probability of 1 that two of them match each other.

      Just being pedantic, but ITYM "a probability of 0.5", no? You need 2^160 to have p=1 :)

    37. Re:Idiot Question by dave420 · · Score: 1

      You forgot the part where they break the rest of the computer and keep it in a warehouse for 18 months, then send it back to the owner as if nothing happened :-P hehehe

    38. Re:Idiot Question by WolfWithoutAClause · · Score: 1
      Yeah, that's why I said 'about'- a probability of 0.5 is 'about' 1 :-)

      The equation I gave overestimates the probability somewhat, but it's probably only out by an order of magnitude (much less so for small numbers than big ones near one). The equation also gives a probability of greater than one in certain situations :-)

      The exact equation is a lot more fiddly, so I didn't want to use it.

      --

      -WolfWithoutAClause

      "Gravity is only a theory, not a fact!"
  6. New methods needed? by scumbucket · · Score: 3, Insightful

    In the wake of stories like this is this a message that we need more secure forms of encryption than we already have? RSA is great so far, but how long until 1024 is broken? Or any other schemes, like the MD5 hashing that's used for digital signatures?

    Keeping ahead of the crackers is a big concern not only for security of transactions, but for personal privacy as well.

    --
    CMDRTACO CHECK YOUR EMAIL!
    1. Re:New methods needed? by merlin_jim · · Score: 4, Insightful

      The question isn't how long until its broken... the question is how long until its broken for cheap.

      RSA-512 was already broken. It took a major portion of the world's computing resources for several years. You're not really in danger that your wife is going to find out about your girlfriend. Or that your state is gonna find out about your cocaine habit.

      If you're using RSA-512 you just might be on the cusp of being in danger of the government, having caught you and trying you for terrorism, is able to decode your e-mail enough in the six months before your trial to convict you.

      See its all about level of effort. How long till RSA-512 can be broken by anyone in a few minutes?

      Well 40-bit SSL was supposedly bulletproof when it was introduced. My P4 1.8 can decode SSL messages in about 10 seconds. So RSA-512 should be good for another 3-5 years.

      Honestly; always use the maximum number of bits. If your data is important enough to encrypt, its important enough to encrypt right.

      --
      I am disrespectful to dirt! Can you see that I am serious?!
    2. Re:New methods needed? by gordyf · · Score: 2, Informative

      I think you're confusing symmetric and asymmetric cryptography. Current browsers use "128-bit SSL", which refers to 128-bit symmetric keys, which are still very secure (as long as the algorithm isn't weak and the implementation isn't flawed). 1024 or 2048-bit keys for asymmetric crypto are considered secure, I believe.

      But yes, 40-bit SSL is too weak to use. I don't think 512-bit RSA is considered very secure either.

    3. Re:New methods needed? by Meostro · · Score: 5, Informative

      Slight correction: AFAIK RSA-512 was not broken, it was brute-forced. There is a huge difference between the two.

      Breaking a combination lock is figuring out that you can hear the tumblers go *click* when you hit the right number. It will take you twenty seconds and five tries to get the right combination.

      Brute-Forcing a combination lock is trying every combination from 00-00-00 through 99-99-99 until you get the right one. You will get the right combination, it will just take you long enough that someone will notice you.

      Just to give you back a little bit of a warm-fuzzy feeling about RSA strength, realize that every bit added doubles the brute-force keyspace. So if you can brute-force 40-bit SSL in 10 seconds, you can do 41-bit SSL in 20 seconds, but it'll take 98 billion-billion years for the same computer to do 128-bit SSL.

      For the combo lock analogy, it would be adding on another number to guess, a 4 number lock instead of 3, which would give you 100x as much work (original amount of work to get numbers A-B-C with D=00, then lather, rinse, repeat until D=99). If the combo lock were truly broken instead, it would take you about a minute and seven tries, instead of 100x as long.

    4. Re:New methods needed? by cft_128 · · Score: 1

      As a side note, RSA relies on hashing for it to work in signing messages, if you can break hashing algorithms, you can break most public key crypto. (It has been a while since I kept up on crypto, but IIRC all public key signing required using hashing to sign the message and then use the private/public key crypto to encrypt the key, they article actually alludes to this problem.)

      --

      Underloved Movies and Pub Quiz: donotquestionme.org

    5. Re:New methods needed? by Idarubicin · · Score: 1
      ust to give you back a little bit of a warm-fuzzy feeling about RSA strength, realize that every bit added doubles the brute-force keyspace. So if you can brute-force 40-bit SSL in 10 seconds, you can do 41-bit SSL in 20 seconds, but it'll take 98 billion-billion years for the same computer to do 128-bit SSL.

      Of course, that presumes that your computer gets no faster in the next hundred pentillion years. If Moore's Law holds--really, more Moore's Observation--then you get an extra bit off the key 'for free' every eighteen months. You might be able to do the 128 bit SSL in ten seconds about a hundred and thirty years from now. Not that I believe Moore's Trend will continue unabated for that long, but who knows? Regardless, as long as processing speed doubles in less than half the time it takes to exhaust the keyspace, you might as well hold off on starting to try to brute-force the key.

      --
      ~Idarubicin
    6. Re:New methods needed? by ajayvb · · Score: 1

      According to this (pages 35 and 36), around 74 bits for symmetric and 1024 bits for asymmetric key cryptosystems works right now. An old paper, but still the benchmark for sufficient key lengths in cryptosystems.

      Moral of the story: You may need to upgrade to 2048-bit PGP keys in the long run.

    7. Re:New methods needed? by Meostro · · Score: 1

      ...but it'll take 98 billion-billion years for the same computer to do 128-bit SSL.

      Yea, I remembered Moore's Law after I started writing that, so I included that little disclaimer to keep from having to factor him in. Also, I forgot to mention that RSA keyspace isn't really 1:1 with symmetric keyspace, so 1 bit of RSA is more like 0.8 bits or so of DES / IDEA / whatever, don't remember the equivalencies. In either case, use RSA-4096 to make brute-force impossible for anything but quantum computers, and hope that nobody breaks RSA itself.

      Regardless, as long as processing speed doubles in less than half the time it takes to exhaust the keyspace, you might as well hold off on starting to try to brute-force the key.

      I don't understand this line... If you're talking about starting from scratch for a single processor, then yes that's true. But if it's gonna take 132 years to brute-force, there's no better time than now to start. Moreso, even, if you go parallel, then you can just add on a new processor whenever you like, double-the-speed or not, and cut that time in half (or better) with just one additional box.

    8. Re:New methods needed? by pclminion · · Score: 1
      Just to give you back a little bit of a warm-fuzzy feeling about RSA strength, realize that every bit added doubles the brute-force keyspace.

      Not true for RSA. Remember that an RSA key is constructed from an extremely large number with exactly two prime factors. Not just any number is a valid RSA key. There are actually much, much fewer than 2^1024 possible 1024-bit RSA keys.

      For symmetric key cryptosystems where all possible values are valid keys, then it is true that each extra key bit doubles the average brute force effort. But definitely not for RSA.

    9. Re:New methods needed? by vadim_t · · Score: 1

      No, 128 bit SSL is secure pretty much forever.

      A 1000 GHz CPU does 10^12 operations per second.
      Let's suppose in one clock cycle we can try one key.
      Let's take a million of those CPUs. We get 10^18 tries per second.

      Number of seconds needed to try them all:
      340282366920938463463374607431768211456

      Or:
      10790283070806014188970529154990 years.

      If you could get computers 10^6 faster, and 10^6 times more of them, then you could get it done in somewhere about 10 years.

      However, I doubt we'll ever have so much processing power, or so many computers.

    10. Re:New methods needed? by ahdeoz · · Score: 1

      If the time frame to brute force will take 6 years, then you will get it done twice as fast if you start 3 years from now (with quadruple the computing power.)

    11. Re:New methods needed? by Anonymous Coward · · Score: 0

      The probability of a random n-bit number being a prime is about 1/(n*ln 2), thus there are about 2^1007 or so valid 1024-bit RSA keys. No big deal.

  7. browsers check for wildcard in domain names??????? by stonebeat.org · · Score: 3, Interesting

    For example, a devastating attack would be one that enabled adversaries to obtain a legitimate server certificate with a collision to one containing a wildcard for the domain name and an expiration date far in the future.

    quick questions:
    1) Don't the browser check for wildcard domain names in the certificates???
    2) If not, why not???

  8. Re:gentleMEN by merlin_jim · · Score: 2, Informative

    And too bad that ECC is a) not provably secure and b) is rumored to have been broken already.

    --
    I am disrespectful to dirt! Can you see that I am serious?!
  9. Comment removed by account_deleted · · Score: 3, Insightful

    Comment removed based on user account deletion

  10. Ok... by doublebackslash · · Score: 2, Insightful

    Use both!
    You say its easier than ever to find the soloution to each one of these hashes, so just use em both. I really think that the number of collisions that occur similtaniously are a bit fewer and farther between. I think that will be secure until we find a better and decently fast hash.

    --
    md5sum /boot/vmlinuz
    d41d8cd98f00b204e9800998ecf8427e /boot/vmlinuz
    1. Re:Ok... by psmears · · Score: 1

      Use both!

      IANAC (cryptographer) but this seems to make sense to me. It would also seem to make sense to "chain" the hashes - that is:

      HASH1 = SHA1(MESSAGE)
      HASH2 = MD5(MESSAGE + HASH1)
      Send MESSAGE + HASH1 + HASH2
      That way, the attacker not only has to tweak the message so that the MD5 hash is what they want, they also have to tweak it so that the SHA1 hash of that message produces a message that gives the desired MD5 hash...
    2. Re:Ok... by Anonymous Coward · · Score: 0

      That's how the SSL 3.0 shared secret step works. Paul designed it that way with concerns about this attack in mind.

    3. Re:Ok... by YoJ · · Score: 2, Interesting

      You have to be really careful when you combine things. A lot of times you can end up with something that is less secure than either part separately. That said, I think this form of chaining is good. Being able to find MD5 or SHA1 collisions/preimages alone is not enough to break this hash. The hash is also at least as secure as SHA1 to any type of attack, since the SHA1 hash is included verbatim. The remaining item is to show why this chaining is better than simply using SHA1 with a longer signature length (i.e. the length of HASH1 + HASH2). I would guess that against known attacks, the longer single hash is better than two half-size hashes. The added safety of a chaining scheme such as this is the safety against unknown attacks that might completely break SHA1 or MD5 somehow.

  11. Hacking by Sharp+Rulez · · Score: 0, Funny

    In a nearly day, i'll be possible to hack d008960fa6b395dca1c8362165bb31be

  12. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  13. How about this... by Millennium · · Score: 3, Interesting

    Has a collision been found yet concerning data which has both the same MD5 sum and the same SHA-1 sum?

    It would seem as though even if SHA-1 were to fail, the two algorithms used together could bolster each other security-wise. This slows things down, of course, but would it not suffice for the time being?

    1. Re:How about this... by merlin_jim · · Score: 4, Informative

      FTA...

      SSL3.0/TLS does use both, and is therefore immune to this attack.

      Also FTA, SHA-1 is still believed to be secure, and Cryptography Researchers does not believe it is in danger to this same attack.

      --
      I am disrespectful to dirt! Can you see that I am serious?!
    2. Re:How about this... by Anonymous Coward · · Score: 5, Informative
      It would seem as though even if SHA-1 were to fail, the two algorithms used together could bolster each other security-wise. This slows things down, of course, but would it not suffice for the time being?

      Using two hashes in conjunction does not work as well as you would expect it to work. There are at least half a dozen posters here proposing this idea, so I will try to explain in some detail why it does not work.

      In general an n-bit hash can be collided in 2^(n/2) time using the birthday paradox attack. When you concactenate two hashes of lengths n and m bits, you get a hash of length n+m bits. However, this (n+m)-bit hash can in fact be collided in m*2^(n/2) + 2^(m/2) time (assuming n is greater than or equal to m). This is only slightly more effort than it takes to collide both hashes separately. In the case of SHA-1 and MD5, n is 160 and m is 128, so colliding both hashes would take 128*2^80 + 2^64 = 2^87.00000017 effort versus 2^80 effort for SHA-1 alone.

      It must be especially stressed that m*2^(n/2) + 2^(m/2) is considerably smaller than the attack time of 2^((n+m)/2) which you would normally expect from a well designed hash having n+m output bits.

      So how does the attack on two hashes work, you ask? It exploits a curious property of the birthday attack which says that generating a multicollision (three or more messages all with the same hash) by brute force takes only marginally more effort than generating a single collision. Specifically, generating a 2^(m/2) way multicollision takes only m/2 times as much effort as generating a single collision. So what you do to collide two hash functions is: you generate a huge multicollision in the first hash function, and then from that set you look randomly for a pair that collides the second function. It seems very counterintuitive, but the point is you can break the hash functions one by one instead of having to break both of them at once. Strength in numbers doesn't apply here.

      If one of the hash functions (say MD5) has a better than brute force attack, then that can be used to speed up the attack against both hash functions by the same factor. The only uncertainty is if both of the hash functions have better than brute force attacks; in this case it would depend on the particulars of the attacks as to whether one can make them interact in such a way as to break both. However, no matter what, the idea of concactenating two hash functions has such low security compared to designing a good hash function of the same length from scratch that it is unlikely that this concept will ever be useful from a pure cryptography standpoint.

      For more information on multicollisions and attacking concactenated hash functions, see A. Joux "Multicollisions in Iterated Hash Functions", Proceedings of Crypto 2004, LNCS 3152.

    3. Re:How about this... by Anonymous Coward · · Score: 1, Interesting

      Remember, if you hash to a 32 byte value, there are 2^256 possible outcomes. That's the best you can do without collisions (2^256 inputs). With two different 32 byte values, you have what is basically a 64 byte value.

      What's the difference between 2 32 byte hashing algorithms and one 64 byte one? Well, if the algorithms do things in a similar way, the extra 32 bytes may be useless. The trivial case is if the algorithms are the same -- you have basically 32 bytes of lack-of-collision. The math to express how "similar" two algorithms are (and hence what "equivalent encryption strength" you have) is not really appropriate for a /. post, however.

    4. Re:How about this... by Anonymous Coward · · Score: 0

      In general an n-bit hash can be collided in 2^(n/2) time using the birthday paradox attack. When you concactenate two hashes of lengths n and m bits, you get a hash of length n+m bits. However, this (n+m)-bit hash can in fact be collided in m*2^(n/2) + 2^(m/2) time (assuming n is greater than or equal to m). This is only slightly more effort than it takes to collide both hashes separately. In the case of SHA-1 and MD5, n is 160 and m is 128, so colliding both hashes would take 128*2^80 + 2^64 = 2^87.00000017 effort versus 2^80 effort for SHA-1 alone.

      Except that the difference between 2^80 and 2^1 is smaller than the difference between 2^87 and 2^80. So it takes _a lot_ more effort. That's exponential growth for ya. The economists call it the rule of diminishing returns.

    5. Re:How about this... by ericpi · · Score: 1
      I, like many people here, was curious if it was possible to effectively chain multiple hash functions. I found your post very informative. However, I was a bit confused by the statement:
      no matter what, the idea of concactenating two hash functions has such low security compared to designing a good hash function of the same length from scratch that it is unlikely that this concept will ever be useful from a pure cryptography standpoint.

      If the purpose of chaining two hash functions of size n & m was to get one "super" hash function with effective size n+m, then I agree with your analysis: At the end, you get a new hash that isn't much better than hash 'n' alone.

      However, I thought the intent was to combine two hashes, each of adequate length, one of which may be later found to be flawed. In the most extreme case, a major flaw reduces effective size down to m=1 (i.e., trivial to break). By your same analysis, the combination of the broken & unbroken hash has an effective size of

      m*2^(n/2) + 2^(m/2) = 1*2^(n/2) + 2^(1/2) ~= 2^(n/2)= same effectiveness as original "n" hash

      Meaning that if one algorithm is completely broken, you still have the full strength of the remaining function. In a perfect world, it would be simpler to use just one algorithm known from the start to be secure. However, I don't know of any crypto algs, aside from One Time Pads, that have ever been 'proven' secure. Chaining, therefore, seems a reasonable approach, given that we don't know in advance which algorithms may be broken in the future.

  14. yes, it does invalidate its use by bani · · Score: 5, Interesting

    you don't have to generate specific malicious code in order to exploit md5.

    merely creating pure trash would be sufficient, think of the case of BIOS or other firmware. create random garbage with the same md5 hash and voila, you've turned your victim's PC/laptop/celphone/pda/etc into a doorstop.

    there are many other ways that md5 can be exploited, this is only one.

    1. Re:yes, it does invalidate its use by quasi_steller · · Score: 5, Informative

      Yes, but remember that this is a collision attack, not a preimage attack. You can find two pieces of plaintext that have the same hash, but you don't get to choose what the hash is. Thus it is still computationally difficult to find a document (even garbage) that has the same hash as some preexisting document.

      --
      ...interesting if true.
    2. Re:yes, it does invalidate its use by argent · · Score: 4, Informative

      create random garbage with the same md5 hash

      A collision attack doesn't let you create random garbage that generates a given md5 hash. It lets you generate two pieces of random garbage (or nongarbage) that generate the same hash. It can't be used to attack a third party's existing hash.

    3. Re:yes, it does invalidate its use by JSBiff · · Score: 5, Insightful

      But, it still is worth pointing out to people that the uses of this collision finding technique is still *very* limited. Someone can't take a digitally signed contract, make arbitrary changes to it, and still have the signature valid. In most cases, the change would be non beneficial (to the attacker that is), like maybe changing 3 characters in the document (this statement is made based upon the fact that, in the collision examples given by the author of the paper, the two messages differed only by like 4 bits or something like that), and the odds are slim that the 3 characters would end up being in the right place in the text, and have an appropriate value, to make it useful. For example, what *would* be useful, but unlikely, would be to change the string '$1995' to '$2995', but as likely as not, to get it to hash right, you'd end up with like '$#g95' or some other rubbish, even if you managed to get the changed bits to line up with the critical bit of data (in this case a dollar amount). It's more likely that you'd end up changing some word like 'benefactor' to '2knefactor'.

      However, for the example you gave, of firmware code, where you want it to be exactly right, or else it will cause problems (even 4 bits of difference in bios code can make a computer inoperable), you are right that the hash collision can be a much bigger, much harder to detect, problem.

    4. Re:yes, it does invalidate its use by Anonymous Coward · · Score: 0

      Oh, come on. In most formats there is the posibility for garbage bits. Say it's a word-document. Some random formatting that would have no impact and be ignored could be changed - an invisible area could be made and filled with the necessary bits. Your example would work with plaintext ASCII - pretty much anything else it would be very much possible to spoof.

    5. Re:yes, it does invalidate its use by Hans+Lehmann · · Score: 2, Informative
      For example, what *would* be useful, but unlikely, would be to change the string '$1995' to '$2995', but as likely as not, to get it to hash right, you'd end up with like '$#g95' or some other rubbish, even if you managed to get the changed bits to line up with the critical bit of data (in this case a dollar amount). It's more likely that you'd end up changing some word like 'benefactor' to '2knefactor'.

      Or, you could get it to hash right by making other, less noticable changes. Extra spaces between words or at the end of the document, extra non-displaying bytes, etc., that won't give away the fact that the original has been tampered with.
      Notice that the two plaintext messages that were found by Joux were very similar, large sections of each message were identical. If you can create a plaintext message that looks superficially similar to the original (except that $1995 is now $2995) in significantly less time than would be required using brute force, then that certainly is a big problem.

      --
      09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
    6. Re:yes, it does invalidate its use by Anonymous Coward · · Score: 0

      even if it is possible to generate collisions
      for md5 and sha-0, it may be harder to generate
      a collision for both md5 and sha-0 (or even sha-1);

      unless using more than one hash function would not
      be more secure...

    7. Re:yes, it does invalidate its use by sr180 · · Score: 1
      It's more likely that you'd end up changing some word like 'benefactor' to '2knefactor'.

      So that explains why people have been trying to sell me v1AgrA, xeaneix, and derugs all lieglly via my email then...

      --
      In Soviet Russia the insensitive clod is YOU!
  15. Re:browsers check for wildcard in domain names???? by Effugas · · Score: 4, Informative

    Because wildcards are not necessarily a bad thing. The concept is that you have a single SSL accelerator in front of a whole pool of servers, and it absorbs the "security context" of all the hosts behind it.

    If you want universal SSL deployment, this is one of the ways you get it.

    --Dan

  16. to protect binaries by Anonymous Coward · · Score: 2, Interesting

    it would be better to post both the MD5 hash _and_ the SHA-1 sum. What's the chance of 2 different binaries having the same MD5 and the same SHA-1 at the same time??

    Artaxerxes

    1. Re:to protect binaries by Shard013 · · Score: 0

      I thought of this too, and I think it will be quite a good solution atleast for a fair while. I'm sure with enough time, more computer power and better techniques that people will be able to get just about any hashes they want with whatever data they want. (I think this is what pre-imaging is about)

  17. Re:gentleMEN by Rightcoast · · Score: 2, Informative

    ECC has been cracked...it just takes roughly the combined powers of 50 PCs to do it. This is a really old link but its valid for this thread.
    http://cristal.inria.fr/~harley/ecdl7/readMe.html

  18. Re:browsers check for wildcard in domain names???? by MassacrE · · Score: 1

    SSL certificates (and certificates in general) do not attest to the trustworthiness of an individual or machine, but to the authenticity of that entity.

    Wildcard certificates are prefectly valid because they are issued to an individual who has proved their authenticity to a certificate issuer, and because you need to have the private key (which is known only to the individual requesting the certificate) in order to be validated as authentic.

    So, whether I put the certificate on one machine or ten, I'm still the same person, and the certificate still asserts that.

  19. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  20. Another writeup on this by Anonymous Coward · · Score: 2, Informative

    Is here.

  21. Re: Are things really that bad? by quasi_steller · · Score: 2, Insightful

    I didn't read the attack too well, but from the Q&A, it appears that the attacks are collision attacks (like the Birthday attack, but, I imagine, more efficient). The Q&A states "In contrast [to a preimage attack], a collision attack finds two messages with the same hash, but the attacker can't pick what the hash will be."

    So, shouldn't it be possible to edit something in the document that doesn't change the meaning (such as a misspelling, or punctuation) before you sign it, thereby changing the hash to something completely different? It would seem that now the attacker is forced to find a document that has a given hash, which is essentially a preimage attack, or is there something I am missing?

    --
    ...interesting if true.
  22. Re:MOD UP INFORMATIVE by Neon+Spiral+Injector · · Score: 2, Informative

    512 bits made from 2 hashes, one weak and one strong will be weaker than a single 512 bit hash from the stronger algorithm.

  23. Summary for those too lazy to read it by rewt66 · · Score: 5, Insightful

    The world is going to end! Giant asteroids will destroy all life on earth!

    Oops, wrong article. Um... The world is going to end! Global warming... um, well... the Patriot Act... umm...

    Well, it's not that bad. Somebody might be able to flip four very carefully selected bits in a file, and still produce the same MD5 hash. This could let me, for example, create an executable that had a normal, benign behaviour, and an evil trojan behaviour, and have one of the bits that I flip change a conditional so that the trojan behaviour was activated. (Note that open source tends to be immune to this kind of nonsense, since in the source code, the actual trojan part - not the conditional that activates it, but the actual evil payload - tends to stick out like a sore thumb.)

    Note well that this does not let me create an evil version of somebody else's file. It only lets me create two closely related files, one of which differs by four bits from the other. I have to be able to construct the benign file in such a way that I can turn it into an evil file by changing four bits. And it can't be just any four bits, either; it's a very specific four bits.

    So this isn't the end of the world. What it means is that you can't quite trust MD5 to guarantee that you got exactly, bit-for-bit, what you think you got.

    But really, this new situation isn't much worse than what we had before. I mean, I could simply have the evil behaviour activated by the date, or by the IP address of the installed machine, or whatever, and get somebody else (who never saw the evil part run) to state that the program did what it was supposed to. Having an MD5 hash doesn't guarantee that the program isn't evil. Bottom line: don't run code written by bad people, whether it has a valid MD5 or not. (I know, I know. How do you tell who the bad people are? That's a hard question, but my point is that a valid MD5 has never told you whether the authors were bad people or not.)

    1. Re:Summary for those too lazy to read it by Isao · · Score: 3, Interesting
      What it means is that you can't quite trust MD5 to guarantee that you got exactly, bit-for-bit, what you think you got.

      You never could. It merely said that it was unlikely for you to be getting something else. The difficulty of arranging such a situation just got easier. Not easy. Not trivial. Just easier. Probably by the same factor it got easier over the past four years due to Moore's law increases. Eventually this will become a real issue, and we should be prepared for that, much the same way we don't use plain DES any more.

    2. Re:Summary for those too lazy to read it by 1000StonedMonkeys · · Score: 1

      A clarification:

      The current attacks cannot take a given file and modify it so that the hash does not change. All they can do is generate two files who have the same hash. Neither the two files nor the hash are known at the beginning of the attack, so it's really not a danger for verifying programs (yet). We'll see if this attack leads to other attacks where a file could be generated to match a specific hash.

    3. Re:Summary for those too lazy to read it by vidarlo · · Score: 1

      You never could. It merely said that it was unlikely for you to be getting something else. Exactly. And for my main use, which is checking iso files and so for download erreors, it is sufficent really. If you flip *any* random bit in a iso, it still should be hard to get the exactly same checksum. And for *important* stuff, I tend to use gpg or other signing mechanisms, with AFAIK is pretty bulletproof at the time of being... So for md5's main use (for me atleast, running sha1 passwd's) it ain't that important if you can match different content to the same hash, as it still will probably stop random bits flipped.

  24. Oh no! my nick is compromised! by d41d8cd98f00b204e980 · · Score: 4, Funny

    A bad day for me.

    1. Re:Oh no! my nick is compromised! by Anonymous Coward · · Score: 0

      nice nick. what's the original?

    2. Re:Oh no! my nick is compromised! by Anonymous Coward · · Score: 0

      It's the MD5 hash of NULL. Well, it's the first part of the hash, at least: d41d8cd98f00b204e9800998ecf8427e

    3. Re:Oh no! my nick is compromised! by subsentio · · Score: 1
      % echo -n | md5
      d41d8cd98f00b204e9800998ecf8427e
  25. How to Expoit by notthepainter · · Score: 3, Informative
    So as I understand it, from read many of the above comments, the exploint is as thus.

    I write 2 programs, lets call one "Cool Whizzy Must Have Util" and the other "Soul Sucking Destruction" and I tweak and tweak one of the binaries so that they have the same checksum.

    Then I release the first one, everybody is eventually using it.

    So then on my servers, I replace the first one with the 2nd one.

    Gotcha!

    Is that the danger here?

    1. Re:How to Expoit by phats+garage · · Score: 1

      From the article, it looks like you have to be able to tweak both images. If you're only required to tweak one image, that implies that you know which hash to target, and then that would be a preimage attack which is still infeasable (at this time).

    2. Re:How to Expoit by Shard013 · · Score: 0
      notthepainter: Short answer yes

      phats garage: Not quite. You only need tweak the second image. Its not that you know what hash to target but you know the origional image data. So for a longer explination for notthepainter, you would create "Cool Whizzy Must Have Util" and generate a MD5 sum. You would then slightly modify "Cool Whizzy Must Have Util" with the explits you want, and then some extra changes that will not break your program and fiddle with these "extra" changes until you get the same MD5 from the first image.

      The main point is that both images are nearly identical and for now only certain changes will be able to be made based on the limitations of the exploit found. So for a certain exploit it may be impossible with image A, so you recode image A to be slightly different with the same functionaly and create a MD5 hash for that. Then you hope you can create your evil image from your new image.

    3. Re:How to Expoit by phats+garage · · Score: 1

      What I'm thinking is that the moment the innocent image, A, is frozen and unchangeable, the hash is now a fixed target to be matched by the changing of the malicious image B. According to my understanding, its still too difficult to match the two hashes when one is fixed.

    4. Re:How to Expoit by Shard013 · · Score: 0
      This is how they currently do it, modifying A in the correct way to produce B with the same hash as A.

      It is too difficult to get a match if image A is unknown, its like A is a hint for how to produce that certain hash and that hint lets them make B.

  26. Access by lateralus_1024 · · Score: 2, Funny

    Yes but what does this mean to me, "Mr.MSAccess Guru/Administrator"?

    Microsoft certification available upon request.

    --
    If you think /. comments are bad, check out Digg.
  27. Re:gentleMEN by Zeinfeld · · Score: 4, Informative
    And too bad that ECC is a) not provably secure and b) is rumored to have been broken already. And according to Denis Hastert George Soros is 'rumoured' to be a drugs dealler, Brittney Spears is rumoured to be a virgin and George W. Bush is 'rumoured' to have been an AWOL coke head during Vietnam, not a sissy getting three purple hearts.

    Some forms of ECC have been 'broken', Len Adlemann (A of RSA) showed that ECC in dimensions higher than 2 was no more secure. He has been working on some further attacks and thinks that ECC as a whole might be vulnerable.

    I don't like ECC for two reasons. The first is that ECC is a very new field of mathematics, new results come regularly. It is entirely possible that someone would find an efficient means of transforming ECC problems into discrete math problems and come up with a solution.

    The other reason is that ECC is patented up the wazoo. The most efficient ways of using ECC are patented and if you can't use them there is no efficiency advantage over RSA in a discrete field so why bother?

    The hash algorithm thing is massively overblown. MD5 was already toast. SHA1 was due to be withdrawn in 2010 in any case and has already been superceded by SHA-256 and SHA-512. New versions of DSA for the larger hash sizes are also due.

    It remains to be seen whether the construction of SHA-256 needs to be adjusted in the light of the MD5 results. It may well be that it shares the same vulnerability as SHA-1 and we should forget about the new hash functions and move straight to something else. Alternatively all might be right with the world. We do not know yet.

    A lot of people are suggesting a competition similar to the AES competition for a new digest algorithm. There is already something underway for stream ciphers. This seems like a good plan, not least since the cryptographers seemed to have fun with the last one.

    --
    Looking for an Information Security student project suggestion?
    Try http://dotcrimeManifesto.com/
  28. Frequently questioned answers by Anonymous Coward · · Score: 5, Informative

    "SHA1 is a totally different algorithm, so it's still perfectly safe."

    Yes and no. MD5 collisions are not SHA1 collisions, and the attack that generated the MD5 collisions doesn't seem to be applicable to SHA1, or its authors would have published collisions on SHA1. The published collisions on several other algorithms: HAVAL-128, MD4, and RIPEMD. They also say that their method will work against SHA0. All these hash functions share similar design principles. It seems highly probable that the MD5 attack will have at least some applicability to SHA1 even though it isn't directly an attack against SHA1. Also, other researchers have published results against SHA1. In particular, Biham and Chen con produce collisions on reduced versions of SHA1 with up to about 40 rounds (the full hash function has 80). That isn't a break of the full hash function, and there's no guaranteed it can be extended to more rounds, but it looks worrisome.

    "This attack produces two messages with the same hash, no guarantee what that hash would be, instead of one message with a chosen desired hash, so it isn't a threat to real systems."

    That's just stupid. "No practically-findable collisions" is one of the design requirements for a secure hash function. Protocols using secure hash functions are based on the assumption that the functions used are secure hash functions. If your hash function doesn't guarantee collision resistence, then your protocols must be assumed to be broken unless you can go back and prove, for every protocol, "This one is still secure even if we use something that is not a real secure hash function."

    One way a hash collision could be useful, for instance, would be against some signature schemes where the secret key is revealed if you ever sign an identical message more than once. People who use those schemes are careful to avoid signing the same message twice... but if you had two different messages and they had the same hash, it's quite possible to imagine that you could be tricked into signing the same hash more than once (because people sign hashes, not actual messages) and making trouble for yourself. Similarly, if you use hash output for initialization vectors in cipher modes that use those, the result could be encrypting two messages with the same keystream, which means an attacker can probably recover both messages (and then use them as stepping-stones to breaking the rest of your system).

    Also, a fast way of finding collisions may well be extensible to a somewhat-slower, but still faster-than-brute-force, way of finding the preimages that you think an attacker really wants.

    "This attack depends on the messages having a special form; they don't look like real plaintext, so it isn't a threat to real systems."

    One of the conditions for a secure cryptographic system is that you don't depend on the plaintext having (or NOT having) a specific form. If your system doesn't work regardless of the content of the data I put through it, then I will punt on your system, and recommend to my clients some other system that will actually work. It's also not clear that the attack on MD5 really does require a specific form... those strings look randomly-generated to me, even though the XOR difference of them clearly is not. Maybe with just a little more work they can produce collisions of two meaningful and interesting strings with opposite meanings.

    "All hash functions have collisions, so it was bound to happen and isn't a threat."

    The important question is whether people can actually find collisions. With a good hash function, collisions should be rare enough that nobody has any reasonable chance of finding them on purpose any time soon. Wang, Feng, Lai, and Yu can find collisions on MD5 deliberately, with practical amounts of computer power. They have done this more than once, and have at least outlined a plausible theoretical explanation of how they can do it. That means MD5 does not provide the guarantees that a secure hash function must

    1. Re:Frequently questioned answers by ahsile · · Score: 1

      A very well thought out argument. I applaud you for it. You've got some very good info regarding some of the common misconceptions I've already read about this subject.

    2. Re:Frequently questioned answers by Idarubicin · · Score: 3, Informative
      Wang, Feng, Lai, and Yu can find collisions on MD5 deliberately, with practical amounts of computer power. They have done this more than once, and have at least outlined a plausible theoretical explanation of how they can do it. That means MD5 does not provide the guarantees that a secure hash function must provide; MD5 is not a secure hash function.

      It depends on how you define 'secure' and for what purposes you intend to use MD5. For a lot of cases, MD5 is still 'secure'.

      Wang et al. have demonstrated that they can generate collisions in a reasonable period of time. They have not demonstrated that they can generate a collision for a given hash--a so-called preimage attack.

      In other words, it is possible to produce two files full of junk that have the same MD5 hash. It's not possible (yet) to produce a file that has the same MD5 hash as a Linux kernel. It's not possible to create Trojan malware that shares an MD5 hash with a useful application. For most of us, that still counts as 'secure'.

      --
      ~Idarubicin
    3. Re:Frequently questioned answers by Anonymous Coward · · Score: 0

      That point is discussed in some detail in the comment you're replying to.

    4. Re:Frequently questioned answers by anakog · · Score: 1
      One way a hash collision could be useful, for instance, would be against some signature schemes where the secret key is revealed if you ever sign an identical message more than once.

      Hmm, it looks like this is worse than the initial impression I got from reading the FAQ. I felt that the FAQ was a fairly good explanation of the consequences of the recent development, but I obviously don't know enough about cryptography to make a sound judgement on that.

      Since you did such a good job of pointing out this problem, could you give us more info on which signature schemes have this "property" (of revealing info of the secret key when used to sign the same message twice) and what signature schemes are ones that appear safer to use these days?

    5. Re:Frequently questioned answers by Anonymous Coward · · Score: 0

      I'm not sure of the specific schemes. I think DSA is one of them. It is normally used in conjunction with a padding protocol that appends random data to the hash result in order to protect against this kind of attack - but combined with an attack on the random number generator, it might be threatened. I wouldn't stop using DSA over that issue, though - I would stop using MD5. It's MD5 that is the broken piece; the signature schemes that could have this kind of weakness are always reinforced with protocol measures to stop it from being a problem. The issue would be if someone was depending on collision resistence of a hash function to serve that reinforcement purpose, and it's possible that someone might do that.

      Note that I'm pretty sure DSA is specified to be used with SHA1, not MD5, anyway, and SHA1 is not known to be broken at this point.

      My point was not that I know of any specific system that is directly threatened by pure collisions - my point was that it's not safe to assume that pure collisions are no threat at all, as some people here are doing.

      - MS

    6. Re:Frequently questioned answers by julesh · · Score: 1

      "This attack produces two messages with the same hash, no guarantee what that hash would be, instead of one message with a chosen desired hash, so it isn't a threat to real systems."

      That's just stupid. "No practically-findable collisions" is one of the design requirements for a secure hash function. Protocols using secure hash functions are based on the assumption that the functions used are secure hash functions.


      Well, yes and no. There are two reasons for using a secure hash function: to prevent a message from being changed by its originator, and to prevent a message from being changed by a third party who wasn't the message's originator. This latter application is unaffected by these breaks, and is also the more common requirement of the two.

      In fact, you could say that in most real world situations it is enough. There are few applications where it is necessary to prevent the originator of a message changing it; most of these can be secured by having the signer modify the message before signing. Only if the originator _and_ signer cannot be trusted is there a problem.

      If your hash function doesn't guarantee collision resistence, then your protocols must be assumed to be broken unless you can go back and prove, for every protocol, "This one is still secure even if we use something that is not a real secure hash function."

      Thanks for your advice. I will continue using the eDonkey2000 file sharing network, which uses MD4 for its file verification, because I don't believe this to be a real threat to that application. (Yes, I can see threat scenarios where it might be a problem, but I can also see other attack scenarios that use social engineering and are much easier to carry out)

      However, I will not use Kazaa, which uses an utterly broken hashing algorithm (where it is trivially easy to construct a collision for a known hash if you already have a sufficiently large file that generates that hash) for its file verification.

  29. The next hash by augustz · · Score: 2, Interesting

    The work that went into putting together AES was really fantastic.

    I'm just looking forward to a similar effort around an advanced hashing standard.

    Where would an effort like this form?

    1. Re:The next hash by xxxJonBoyxxx · · Score: 3, Informative

      NIST (http://csrc.nist.gov/cryptval/) ran the AES contest (http://csrc.nist.gov/CryptoToolkit/aes/index.html ). They would be the body to run future contests of the same sort.

      BTW, NIST never approved MD5 for government use (as per FIPS), but they have been validating implementations of SHA-1 for several years. NIST also now validates SHA-224, SHA-256, SHA-384, and SHA-512, each essentially a longer version of SHA-1 ("160").

    2. Re:The next hash by augustz · · Score: 1

      I've found that support for longer version SHA's is surprisingly minimal. I'd personally like to be at SHA-256 for many items.

      I'd also love to see ASCII armoring of the hash, rather then the hex fingerprints, which I beleive would result in a more compact hash.

  30. Is MD5 broken? I can't tell from the article... by tstoneman · · Score: 0

    I thought the original post on slashdot said something about finding a collision with MD5 using a simplified version or different initialization numbers or something like that.

    Is tried-and-true MD5 broken?

    1. Re:Is MD5 broken? I can't tell from the article... by Shard013 · · Score: 0
      Well that was from the first attempt, but they quickly extended it to the read MD5 soon after.

      I have tested the collisions on my home computer, freebsd 5.0 with the standard MD5 function that comes with it and they come up same hashes but different files.

      I am really hoping that my version of MD5 is the real version.

    2. Re:Is MD5 broken? I can't tell from the article... by Smallpond · · Score: 1

      You might be thinking of the 40-round version of SHA-1 that got broken. Boring, since SHA-1 is 80 rounds.

      The claim for MD5 is that there is a way to generate two messages with the same hash in much less time then previously thought. However, this is already known to be much simpler than generating a message that has a specific hash, which is the usual purpose of a hash function. RFC 1321 claims it takes 2^128 operations to do the latter, but only 2^64 for the former. I haven't seen a claim that finding a message with a given hash is any easier.

      Any protocol that is worried about somebody creating two messages with the same hash function should not have been using MD5 alone anyway. 2^64 isn't as "big" as it used to be.

  31. Our Goverment is on the ball by Lieutenant_Dan · · Score: 2, Funny

    Upon hearing these news, Tom Ridge raised the level of alert to "Amber".

    At least this time he had something a tad more substantial to instill fear in the hearts of all patriotic Americans such as myself.

    Thank you Department of Homeland Defense! I sleep so much better at night!

    --
    Wearing pants should always be optional.
  32. Re: Are things really that bad? by Shard013 · · Score: 0
    From what I have read a preimage attack is picking exactly what hash you want.

    For example I may have a document that currently creates a mostly random hash, with preimage I could add extra bits into the document and make the hash equal to exactly what I want. Warez people could make all their fileid.diz hashes "L33T_L33T_L33T_L33T_L33T" with some extra padding at the end of the file.

    Right now all we can really do is live with the hash that is currently generated.

  33. Real scoop by Anonymous Coward · · Score: 4, Interesting
    I wasn't there this year. A friend told me that the embarrassing thing was that the Chinese paper was REJECTED from the conference. They presented their results at the rump session. Other non-Asian researchers with hash collisions got papers in the conference. This doesn't help one's faith in academia, does it, when one of the most important developments at a conference is rejected by the program committee. There is a growing rift between Asian research and Western research. The Asian side has much lower standards, but also has some good results. Sometimes good Asian papers end up being rejected by association with so many mediocre Asian papers.

    Posted anonymously to avoid offending any of my colleagues.

    1. Re:Real scoop by Anonymous Coward · · Score: 0

      I don't think that academic descrimination is limited to Asians, and in fact may be less pronounced then for South American papers, for example.

    2. Re:Real scoop by randombit · · Score: 3, Informative

      I wasn't there this year. A friend told me that the embarrassing thing was that the Chinese paper was REJECTED from the conference. They presented their results at the rump session.

      Of course, it would have helped the paper's chances of being accepted if:

      a) They had actually presented the methods they used

      and/or

      b) The results had been correct. Initially, they were not finding collisions for MD5, but what they *thought* was MD5 (due to a translation error).

      So what the reviewers read was a claimed attack on MD5, with no details, and their examples did not work. If I were reviewing that paper, I would have rejected it too. They didn't correct the paper until (IIRC) the day before the rump session.

    3. Re:Real scoop by jeif1k · · Score: 1

      A friend told me that the embarrassing thing was that the Chinese paper was REJECTED from the conference. They presented their results at the rump session. Other non-Asian researchers with hash collisions got papers in the conference

      I don't know the particular paper in question, but the paper may have been rejected because it contained errors; even if a paper claims to show something extraordinary, if it contains errors, it will get rejected.

      Even if the research itself was correct, the may well have been rejected for a reason indirectly related to having been written by Chinese authors: the presentation may have been unacceptably poor.

      Like it or not, when the official language for a conference or journal is English, papers must be written in clear, correct, readable English and follow the academic standards of the English-speaking world.

    4. Re:Real scoop by Anonymous Coward · · Score: 0

      Your friend lied to you(I actually doubt this, you're probably just a troll). Everyone who modded this up should be ashamed.

  34. Re: Are things really that bad? by Shard013 · · Score: 0
    One extra thing, I think I just clicked onto your exact point of confusion.

    In the current attack from what I have seen, the origional document makes hash A, and then the origional document is slightly modified to make hash A again from the modified document. I don't think they can make hash A without the origional document, its just too hard.

  35. Re:gentleMEN by Anonymous Coward · · Score: 1, Funny
    it just takes roughly the combined powers of 50 PCs to do it

    Go ahead and say it - "Beowulf cluster".

  36. Hash function attack roundup at Educated Guesswork by Anonymous Coward · · Score: 2, Informative

    More info on the implications at Educated Guesswork. (It isn't my work, so anonymously it is.)

  37. MOD PARENT UP! by rewt66 · · Score: 1

    I wish I'd said what he said...

  38. Birthday Attacks by Bob+4knee · · Score: 1
    The problem is that you can say basically the same thing, many different ways. Each of these ways will (probably) have a different hash.

    I'm Bob Fourney
    I'm bob Fourney
    I'm Bob Fourney.
    I'm Bob Fourney:
    I am Bob Fourney
    I'm Robert Fourney
    I'm Robert Anthony Fourney

    So, you could write the malicious code (or email) and play games with the syntax until you get the malicious functionality and a hash you like.

    It's easier if you can play with both the "good" message and the malicious one.

  39. Could be as big as Y2K by Anonymous Coward · · Score: 0

    Oh No! The sky is falling! This could be as big as Y2K!

  40. not only that... by 192939495969798999 · · Score: 1

    If you sign for one hash, you've signed for anything that can generate that hash... which is a great deal more than 2 different sets of data. It's an infinite set of data (in theory)!

    --
    stuff |
    1. Re:not only that... by Westley · · Score: 1

      It's certainly an infinite set, sure - but the important thing is to be able to take one "nasty" thing to be effectively signed by finding one "nice" thing with the same hash. You don't need more than two things to have the same hash for it to be a problem.

  41. Re:really now by phaze3000 · · Score: 1

    Well, personally I'm more of a skunk fan, but Abraxas in Amsterdam does a superb hash milkshake.

    --
    Blaming GW Bush for the Iraq war is like blaming Ronald McDonald for the poor quality of food.
  42. Re:How to Exploit by notthepainter · · Score: 1
    so you tweak both binaries before releasing the good one. The exploit is still there. Right?

    I'm asking questions, educate me.

  43. a better solution already exists by poot_rootbeer · · Score: 4, Funny


    ROT-13 is completely invulnerable to hash collisions; no two non-identical inputs will ever result in identical outputs!

    I recommend that everybody replace their existing encryption systems with ROT-13 immediately.

    -Cbbg

    1. Re:a better solution already exists by Anonymous Coward · · Score: 0

      Arrgghhh... I know it was a joke... but ROT13 is not a hash function. Sorry, the math major in me cringed when I read that.. :D

    2. Re:a better solution already exists by Neophytus · · Score: 1

      ROT13 is not a hash function.
      Even better! The chance of a hash collision in ROT-13 is even better than nil!

    3. Re:a better solution already exists by Chris+Burke · · Score: 2, Funny

      ROT-13 is considered horribly weak with modern computing power.

      Much better to use double-ROT-13.

      --

      The enemies of Democracy are
    4. Re:a better solution already exists by dave420 · · Score: 1

      Damn that's old. I use ROT-26 instead. Double the protection. ph33r my data!

    5. Re:a better solution already exists by gnu-user · · Score: 1

      Quadruple rot13....

      One better then triple DES

  44. Link to the MD5 Paper by xxxJonBoyxxx · · Score: 2, Informative

    http://eprint.iacr.org/2004/199.pdf

  45. Re: Are things really that bad? by RealAlaskan · · Score: 1
    ...something in the document that doesn't change the meaning (such as a misspelling, or punctuation) ...

    If you think that punctuation doesn't affect meaning, you are illiterate.

  46. Hash, beef, corned, 20 metric tons. by kris_lang · · Score: 1
    But even if someone finds a technique to pad junk-bytes on a malicious code package so that the malicious package's MD5 hash is equivalent to the actual code that this doppelganger is claiming to be, a quick workaround would be to publish BOTH an md5sum hash and another hash-function identifier for the package. Now even if supervillian Alice could mask Trojan-ribbed to appear to be Trojan-regular to Bob under the MD5sum, the other hash function would show it to be not the case.

    In fact, that's also the reason that having a PGP-signed message which exclaims what the md5sum of a package is good for: double-wrapped-goodness double-happiness security. Someone trying to pass off the bad Trojan to Bob would not only have to create the md5sum collision of the bad Trojan to match the purported good program, but would have to present a... whoops, I see the error in my own argument. Yeah, stick with publishing multiple hash values with multiple hash algorithms.

    The probability of being able to create a doppelganger which can cause a collision in BOTH hash functions simultaneously is multiplicatively harder. Right?

    1. Re:Hash, beef, corned, 20 metric tons. by ars · · Score: 1
      The probability of being able to create a doppelganger which can cause a collision in BOTH hash functions simultaneously is multiplicatively harder. Right?
      If that was the case then wouldn't you think that you should do this inside the one hashing algorythm?

      And that's exactly what MD2 does! And I quote from an RSA FAQ

      MD2 was developed by Rivest in 1989. The message is first padded so its length in bytes is divisible by 16. A 16-byte checksum is then appended to the message, and the hash value is computed on the resulting message. Rogier and Chauvaud have found that collisions for MD2 can be constructed if the calculation of the checksum is omitted [RC95]. This is the only cryptanalytic result known for MD2.
      If you read about other hashing functions you'll see that most of them say if you leave out this or that part of it they fail, it's exactly the same thing: they basically merge two or more different hashing functions.

      Another interesting thing is that all the secure hashing functions in use today basically all use the same algorithym, by Prof. Ronald Rivest of MIT!

      --
      -Ariel
    2. Re:Hash, beef, corned, 20 metric tons. by kris_lang · · Score: 1

      interesting comments, I'll have to look back at the complexity issues on this and that damn kolmogorov complexity book that's lying around somewhere. and let's not forget about shamir and adelman, eh?

      Thanks for the pointer. I hadn't thought seriously about this stuff in quite a few years.

  47. Re:MOD UP INFORMATIVE by Anonymous Coward · · Score: 0
    OTOH, it will be stronger than single "stronger" algorightm if someone breaks the "strong" algorithm.

    The time it'll take to produce a crack that breaks all three of DES, AES, and Rot-13 may be no stronger than AES alone is today; but if someone breaks AES tomorrow, at least some h4x0r5 can't break DES yet.

  48. Help me understand this. by inertia187 · · Score: 1
    The example in the Q&A had two messages...
    I, Bob, agree to pay Charlie $ 5000.00 on 4/12/2005.
    ...and...
    I, Bob, agree to pay Charlie $18542841.54 on 9/27/2012.
    MD5 on the first one is...
    4d4f44971dc6b71171c01abf5cbc7593
    ...and on the second one is...
    269a931c35a592458ad96e768e5f9d37
    Was the example supposed to result in an identical MD5 output or not? They look different to me. I want to see this work for myself. Any working examples?
    --
    A programmer is a machine for converting coffee into code.
    1. Re:Help me understand this. by Anonymous Coward · · Score: 2, Informative

      Here's an actual example of two different binary files having the same md5sum.

    2. Re:Help me understand this. by Anonymous Coward · · Score: 1, Informative

      The Q/A does not say they have the same hash, but rather say "Suppose they both have the same hash..." The keyword is "Suppose"....

      For example, suppose the attacker (Charlie) discovers that the message "I, Bob, agree to pay Charlie $ 5000.00 on 4/12/2005." has the same hash as "I, Bob, agree to pay Charlie $18542841.54 on 9/27/2012."

  49. Re:fristage postage by Anonymous Coward · · Score: 0

    guess again faggot.

    GNAA > j00
    fristage postage is mine

  50. p2p consequence by alexislashdot · · Score: 1

    This would allow labels to insert broken music/video content into p2p networks without the network being able to defend. They can take a pirated file, change it a little and add it back to the p2p network. The MD5 is the same but the file may be unplayable.

    1. Re:p2p consequence by Anonymous Coward · · Score: 0

      That would require a "preimage" attack - RTFA

  51. implications largely academic (for now) by kirkjobsluder · · Score: 1

    The implications are largely academic for now. What seems to have been discovered is that there are some shortcuts in MD5 and SHA0 that make these algorithms less robust than previously expected. The existance of one "hole" in an encryption or hash algorithm suggests that there might be more waiting to be discovered. This might take 1 year or it might never happen.

    So, right now the primary risk is that someday we wake up to find that enough holes have been discovered to compromise preimage resistance. If this happens, it might be easier (but probably still horrendously expensive) to perform certain kinds of attacks on digital signatures and password files.

    Another risk is to find out that these algorithms have an undiscovered bias in their output. MD5 and SHA are sometimes used to generate keys for encryption algorithms. If a bias is discovered, then it might be a possible (but probably still horrendously expensive) attack on some encrypted data.

    Again, we are talking about stuff that could happen tomorrow, or could never happen. The consensus I'm reading is that the odds of a worse break in MD5 being discovered have just gone up by a significant level. The odds are not big enough to justify another Y2K panic (partly because quite a bit of software has already made the transition). However, where possible it is prudent to pick SHA1 over MD5.

  52. The attack does matter. by Chandon+Seldon · · Score: 3, Informative

    With a collision attack, you can perform an attack that matters - here's an example:

    Imagine that Microsoft won't sign any audio drivers for Windows XP that allow raw audio data to be output to disk. Also imagine that you are the driver release engineer at Creative (Sound Blaster division) and you want to release a driver that can do that.

    What you do is build both drivers (one that Microsoft will sign, and one that you want to release with the "unacceptable" feature) with a large static data buffer that isn't used in the binary. You then try to modify both buffers in such a way as the two files will have the same hash (doesn't matter what hash, just that it's the same). This will take about 2^40 worth of work for MD5 instead of the 2^64 that it should take because of this security issue.

    Once you've created your two binaries with the same hash, you send the acceptable binary to Microsoft and they sign it. Then, in the release section of your website you post the other binary with the signature you got from Microsoft... and the signature verifys just like they signed it.

    There is also a break in the digital check situation, *if* the digital check protocal has random padding (many do) *and* the payee generates the check (also possible).

    --
    -- The act of censorship is always worse than whatever is being censored. Always.
    1. Re: The attack does matter. by Anonymous Coward · · Score: 0

      Very nice. Finally, a post from someone who actually understands the implications of this result. It's very telling that this has still only been moderated by one person so far.

    2. Re:The attack does matter. by dave420 · · Score: 1

      You'd just install the unsigned drivers anyway - windows doesn't require you to use WHQL-certified drivers, it just recommends it (which is good, as most crashes are through crappy drivers).

    3. Re:The attack does matter. by WuphonsReach · · Score: 1

      Nice post.

      This attack is also useful in forging signatures on digital documents that have been signed (since the public/private key pair only signs the hash of the digital document).

      Moderately troublesome.

      --
      Wolde you bothe eate your cake, and have your cake?
  53. Re:gentleMEN by shish · · Score: 1
    it just takes roughly the combined powers of 50 PCs to do it

    Or one PC from 5 years in the future...

    --
    I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
  54. Brute force complexity by Hoch · · Score: 1

    I don't think people realize how complicated it would be to brute force one of these algorithms. Lets take one that has a 160 bit hash. This means that you would find a collision in ~2^80 tries, right? Wrong, this assumption requires that you store each of the hashes you receive in the brute force attack. This would amount to 2^80 x 160 bits, or 21990232555520 terrabytes. It is clearly not feasible to store that much. A 128 bit hash would find a collision in ~2^64 tries and would need 2^64 x 128 bits or 268435456 terrabytes of storage! Lets see someone distribute that.

    --
    2*31*37*263
    1. Re:Brute force complexity by Anonymous Coward · · Score: 0

      You can trade time for space however, e.g. 2^54 memory and 2^74 operations.

    2. Re:Brute force complexity by Anonymous Coward · · Score: 0

      Look up Pollard's rho method.

  55. MOD PARENT DOWN by grouse · · Score: 1

    This is not true. Any input that contains no alphabetic characters will result in identical outputs. ;)

  56. Weird results by Lisper · · Score: 2, Interesting

    On a lark I decided to run the purported collisions in the paper through MD5 to verify the claim, and I got a weird result. The two examples given are indeed collisions, but the hash is not what the paper says it is. The paper says that the hashes for the two examples are supposed to be:

    9603161f f41fc7ef 9f65ffbc a30f9dbf

    and

    8d5e7019 6324c015 715d6b58 61804e08

    but the hashes I get are:

    74BE7342 8C5BDB65 9BE40E00 CF6AE31C

    and

    BC5E1391 D31E52F3 D41CBE8C 05D7DBC1

    I'm using the MD5 library built in to Darwin (OS X) and I've verified that it passes the standard MD5 test suite in RFC 1321.

    1. Re:Weird results by andfarm · · Score: 2, Informative

      The paper used an incorrect (wrong endianness, I think) implementation of MD5. You can reverse every chunk of 4 bytes in the data in the paper, or just look around for someone else who did the same thing.

      --

      TANSTAAFI: There Ain't No Such Thing As A Free iPod.

    2. Re:Weird results by Lisper · · Score: 1

      Correction: the hashes I get are:

      A4C0D35C 95A63A80 5915367D CFE6B751 and 79054025 255FB1A2 6E4BC422 AEF54EB4

      These still bear no obvious resemblance to the published results.

  57. Well duh... by Phishcast · · Score: 1

    Thanks for pointing out the obvious...Wait, what now?

  58. Re:MOD UP INFORMATIVE by ComputerizedYoga · · Score: 2, Interesting

    parent's trying to say more along the lines of "it'll be a lot less easy to find a collision dataset that's simultaneously a collision for md5 and sha1"

    A lot of stuff I've seen floating around carries multiple verification methods (apache uses md5 and pgp sigs for example).

    Even if one verification technique is rendered "broken" -- together, the two hash algorithms are still that much more complex to break (though your point is also valid: wasting 32 bits on crc32 isn't going to make it more secure than adding those 32 bits to your new nonbroken cryptographic hashing algorithm).

  59. Potential problem here... by Reteo+Varala · · Score: 1

    What if some attacker gets ahold of a Linux MD5 /etc/passwd file? That would likely now be enough to get access to the computers...

    I hope the distros migrate to sha-1 for their default authentication mechanisms...

  60. Unix /etc/passwd files vulnerable... by Anonymous Coward · · Score: 0

    Correct me if I'm wrong, but aren't the hashes in many unix /etc/passwd files MD5 hashes of the actual passwords? In this case, wouldn't it be easy for anyone with access to the passwd file to generate passwords that match the hashes (collisions) allowing them to login to accounts which were previously thought secure? This seems like an enormous security hole that wasn't even mentioned in the Q&A document.

    Nate

    1. Re:Unix /etc/passwd files vulnerable... by pclminion · · Score: 2, Insightful
      In this case, wouldn't it be easy for anyone with access to the passwd file to generate passwords that match the hashes (collisions) allowing them to login to accounts which were previously thought secure?

      No. The new technique allows you to construct two pieces of data which have the same hash. It doesn't allow you to construct a piece of data which matches a given hash. This was explicitly spelled out in the Q&A document.

      Also, MD5 password hashes are salted. This means that, although you could potentially find a collision with the hash in the password file, the colliding data almost certainly would not have the same salt, and therefore would not be accepted as a valid password.

      Furthermore, think about it. If you had access to the password hashes, you would be root, and could just 'su' to the user account in question anyway.

      Now, suppose that your goal was to guess the password and hope that the user uses the same password on other machines -- i.e., you want to boost yourself to other hosts. But you're still SOL, because the MD5 password hashes are salted. Therefore, you must recompute the equivalent password on the other hosts, even if the user used the same password on those hosts.

      As the Q&A document explained, the ability to produce colliding hashes makes MD5 unsuitable for some tasks. Protecting UNIX logins is not one of the ways in which its use has been compromised.

  61. A bad habit I'm guilty of... by Spoing · · Score: 1
    Maybe you do this too?

    To check a file manually, I should do the following;

    Check the MD5sum against a known good source.

    Check the GPG signature of that source.

    Check the file size (might be harder to fake an MD5 for files of the same size?)

    What I actually do most of the time is quite a bit is different;

    Check the first and last few characters in the MD5sum against what is posted on the web/FTP site.

    To get a complete MD5 collision is currently something the NSA might be able to do (paranoia hat not on). To get a look-alike that matches part of the original MD5 (just the part I tend to check) should be possible.

    (Forging the original MD5 is probably the easiest thing to do since the GPG signature is rarely provided and if it is is probably rarely checked.)

    --
    A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
  62. Re:gentleMEN by Anonymous Coward · · Score: 0

    I see that you have been paying close attention to the ECC FUD. Good for you.

  63. Stored Passwords by boatboy · · Score: 1

    A common security system practice, especially in web development, is to use MD5 to hash users' passwords and store them in a database. When the user enters their password for access, it is hashed and checked against the db. This means you can check passwords without having to store it plain-text anywhere.

    My understanding is that this problem in MD5 means it is slightly easier to take an MD5 hash (if the database were stolen, for example), and find a password that will generate the same hash, and thus allow access. Is this correct? What are some developers doing out there to address this issue in their security systems? Is it really an issue in this scenario?

  64. I, for one... by RWerp · · Score: 1

    welcome our new bluefish password hashes.

    --
    "Long run is a misleading guide to current affairs. In the long run we are all dead." (John Maynard Keynes)
  65. Re:gentleMEN by Anonymous Coward · · Score: 0

    Ummm... RSA has been cracked too:
    http://www.rsasecurity.com/rsalabs/node.asp?id=209 6

  66. MOD DOWN UNINFORMATIVE by grahamsz · · Score: 1

    The grandparent hasn't really grasped this.

    It's not like i can download a BIOS form abit and make garbage which matches the hash.

    It allows me to make two pieces of garbage which have the same hash

  67. Re: Are things really that bad? by bobbozzo · · Score: 2, Informative

    Yes, some crypto people are already saying you should change the whitespace in any pre-generated document you are asked to digitally sign.
    Changing spelling or punctuation would also protect against collision attacks.

    --
    Nothing to see here; Move along.
  68. Re:gentleMEN by Anonymous Coward · · Score: 0

    Wait... Did I understand you correctly? George W. Bush likes virgin coke and is secretly married to Britney Spears?

  69. pooh-pooh the findings at your own expense... by bani · · Score: 1

    what's shocking and horrifying is not that it actually happened , but that it was far easier to generate them than anyone ever thought .

  70. Re:gentleMEN by clap_hands · · Score: 1

    There's something like an AES-style competition for stream ciphers? Sounds fun, who's organising it?

  71. serial or parallel years? by NuShrike · · Score: 1

    so that's 98 billion billion serial-years?

    But, we've proven already with distributed attacks it's not so many parallel-years if you gather enough computing power together, so how many is that in parallel-years, or even quantum-years?

  72. Re: Are things really that bad? by quasi_steller · · Score: 1

    Now, I have no degree in English, but it certinally is possible to change punctuation without changing meaning. I didn't say "change any punctuation because it will not change the meaning". I said "...change something in the document that doesn't change the meaning..." (you even quoted this!) I can think of an example of a case where a change in punctuation will not change the meaning. For example I could write:

    "The cryptographic hashes are broken. The world is going to fall apart."

    or

    "The cryptographic hashes are broken; the world is going to fall apart."

    or even (adding a word)

    "The cryptographic hashes are broken, and the world is going to fall apart."

    All three of these are legal English and have the exact same meaning. Granted the second one may not be the most common way to write the sentences, but all are changes that nobody would argue with (unless they had ulterior motives).

    Maybe you should try thinking of more examples where punctuation doesn't change the meaning before calling someone illiterate.

    --
    ...interesting if true.
  73. Have you flashed BIOS lately?? by Anonymous Coward · · Score: 0

    Dingle berry, put away the pentium 60, newer flash BIOS has 2 areas, one that can be re-writen and another for basic start-up - I have had power failures during writes in newer systems and they recovered fine.

  74. Shyster Lawyer Scam by CustomDesigned · · Score: 1
    You are correct, but finding two messages with the same MD5 allows a shyster lawyer attack. The lawyer finds two contracts with the same MD5. One says something you would find reasonable. The other says you owe him $1000/mo for life (or your firstborn child or whatever). He presents you with the first contract to digitally sign. He then drags you into court with the second contract - your protests that "that isn't what I signed" not withstanding.

    I would hope that you could present the original contract you signed with the same MD5 as evidence.

    I would hope that the judge and jury would realize that it is easy (given the recent finding) for the shyster to find two contracts with the same MD5, but that it is computationally infeasible for you to find an alternate contract with a given MD5. Therefore, the existence of the two documents should be enough evidence to put the shyster in jail! One would hope ...

    However, as others have suggested, it would be prudent to make minor alterations to any important document you digitally sign to defeat the shyster lawyer attack.

    In real life crypto systems, the minor alterations take the form of random salt added to the hash. So a properly designed crypto system is not broken by this finding. In fact, any document signing system should include random salt in the hash and signature so that you don't have to go changing color to colour before signing.

  75. Dangerous! Parent does not understand (bad mod) by Anonymous Coward · · Score: 0

    The partent is NOT insightful. The breaking of MD5 is completely separate from the near-breakage of SHA-1. The SHA-0 algorithm, which is almost identical to SHA-1, has been broken. The reason people are worried about SHA-1 is that new techniques have been developed and they are partially successful on it. With some extra work, they may be ale to break it. The fact that MD5 was broken recently too is a coincidence and is NOT the reason people are worried about SHA-1.

  76. Bzzt. You do not understand by Anonymous Coward · · Score: 0

    Yes, we know these exist for any hash which can take inputs of greater length than the hash (insert simple pigeon-hole argument here).

    But the reason this _is_ something different is that we found an algorithm to generate them without having to do a brute force search.

    For example, for a 64 bit hash, you would expect to have to test 2^63 different inputs before finding a collision with the hash of a fixed input. Or, if you just want a collision of any two inputs, it would take many fewer attempts (the birthday paradox). But with the new techniques, it is even easier. Hashes are assumed to not have this property, thus cryptographic protocols built on top of any of these broken hashes may not be safe after this discovery.

  77. Re:That's what shadow passwords are for by Anonymous Coward · · Score: 0

    The whole point of shadow passwords it to force an attacker to gain root to even be able to get your password file (/etc/shadow). If an attacker can gain root, the attacker can either install a rootkit or create their own account, say root2 with a UID of 0. Shadow passwords were originally designed to provide additional protection against dictionary attacks by trying to keep the file with the passwords in it out of an attacker's hands, but work perfectly against an md5 collision attack. As long as the root account is properly secured, md5 collision attacks would not be possible. The only way to crack an /etc/passwd file nowadays is through something like a rubber hose attack or purchase key attack (i.e. torture/bribe/blackmail the password from someone who knows it).

  78. Might crash some software? by r6144 · · Score: 1
    I guess there are quite a number of software that use MD5 or SHA-1 hashes as if a collision is impossible, and indeed even when there is code to handle hash collision, it had been hard to test that code when no real collisions were known. Now, if such software is fed with two pieces of data with identical MD5 hashes, it might well break in horrible ways --- at least a DoS attack.

    Disclaimer: this is a repost.

  79. Looks like it was previously broken in sppokland by ErroneousBee · · Score: 1

    Q: What is the difference between SHA-0 and SHA-1? Is SHA-0 widely used?

    A: SHA-0 was initially proposed in FIPS 180 (May 1993) as hashing standard by the U.S. government, but was replaced by SHA-1 in FIPS 180-1 (April 1995). SHA-1 adds an additional circular shift operation that appears to have been specifically intended to address the weaknesses found in SHA-0. SHA-0 is not widely used and should not be used in new systems.

    This indicates that the US Govt had already shown SHA-0 was weak by 1995. I dimly recall there was a similar involvement in RSA or DES, where a weakness was avoided by the US govt requesting a (at the time) seemingly irrelevant change to the algorythm.

    --
    **TODO** Steal someone elses sig.
  80. RIAA and MPAA break bittorrent by Deliveranc3 · · Score: 1

    Can they now use this to break bittorrent?

    Would seem to me the server could have a secret hash code and use it on data that turned out to be invalid and ban based on it but more security is always a pain in the ass...

    Anyone seen any evidence of this or heard anything about it happening in the future?

  81. Re: Are things really that bad? by Darren+Winsper · · Score: 1

    I'm not sure you're entirely right there.

    "The cryptographic hashes are broken. The world is going to fall apart."
    You've got two separate statements there. Because of the full stop, you can't tell if the second statement was caused by the first.

    "The cryptographic hashes are broken; the world is going to fall apart."
    This implies causality. Thus, the world is going to fall apart *because* the cryptographic hashes are broken.

    "The cryptographic hashes are broken, and the world is going to fall apart."
    I was told to never put "and" after a comma, so I'm not sure this sentance *is* legal English. As with the first sentance, you again can't be certain of the causality of the statements.

    However, you are right in your general statement, the first and last sentances have the same meaning.

  82. Your sig is nuts by Anonymous Coward · · Score: 0

    How can you blame 9000 deaths on AI. Their highlighting government mischief has created awareness, forced investigations and lowered the number of deaths.

  83. Add arbitrary salt? LESS secure! by KWTm · · Score: 1

    Adding an arbitrary salt makes it less secure. Here's why:

    MD5 will still be able to protect your file (binary or authenticated message, whatever) from being tampered with by SOMEONE ELSE, but not by yourself. One of the problems with the hash collision attack is, as suggested in TFA, someone creates two files that have the same hash (easier than creating a file that has the same has as another pre-specified file) and submits the good one to be trusted before substituting it for the evil one.

    Suppose you write a program:

    if (user == root) {
    omnipotent_flag = True;
    }

    But what you really want is to write this malware:

    if (user == backdoor) {
    omnipotent_flag = True;
    }

    But --gosh darnit, the resulting two programs generate two different MD5 hashes! What to do?

    Well, use the excuse of some arbitrary salt string:

    if (user == root) {
    omnipotent_flag = True; /* I'm just inserting this arbitrary string to randomize the hash --yeah, that's it.
    ujiUFIDO94305-8345JFKL:JKDFLS:f */
    }

    Hey, now you can generate your own malware program:

    if (user == backdoor) {
    omnipotent_flag = True; /* Inserting whatever random characters it takes to generate the same MD5 hash:
    rewAFDSADSF5435435#$%#$% */
    }

    Of course, the example doesn't have to be as blatant as this, but you can see where this is going:
    - the "random" salt can be at the beginning, with a nice-looking comment, and if it is short (like your example of "xyzzy"), it might be accepted. A 5-letter "salt" can increase your number of available files by 64^5 (assuming 64 characters are acceptable as "letters"). One of those might just generate a hash that matches the malware!
    - Yes, I know the idea is that the hashes match for both the file with the salt and the file without the salt. But you can imagine someone saying, "Yeah, I added this RANDOM salt [laughs evilly to himself] and the hashes still match!" and the sheeple will say, "Wow, it must be a match --I guess I don't need to bother checking the hashes of the original unsalted file."

    What is needed is a widely accepted Standard Salt String that is pre-pended to the file, and when people list the MD5's of a file, they also list the MD5's for the file with the Standard Salt String, and both must match. As long as it's the community that chooses the SSString, and not the contributor, the SSString can be any arbitrary sequence of characters, like, oh, say, "m1cr0$0f7 5Ux0R5".

    KWTm

    --
    404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
    [GPG key in journal]
  84. Re:Add arbitrary salt? LESS secure! by julesh · · Score: 1

    Your explanation makes no sense to me.

    How is it any easier to produce a file that will cause a collision in MD5 when I hash it in MD5 after an arbitrary (and unknown to the attacker) string has been hashed?

  85. Re:Add arbitrary salt? LESS secure! by KWTm · · Score: 1

    In this case, the attacker and the creator of the file are the same person. (I'll say why this is the only important case.)

    The attacker/creator creates two files that have the same hash, by adding an arbitrary string of the attacker/creator's choice, using some excuse that will fool some people. Then you get a hash collision.

    In your question, you're asking what if YOU add an arbitrary salt string of your choice, and then do the MD5 hash. In this case, the attacker cannot choose your string. But then what use would your new MD5 hash be? You would need to compare it to another MD5 hash (presumably, that of the originating site) which also used the salt string of your choice.

    Suppose I post my file, and the MD5 hash is "123abc" (for argument's sake). You get the same MD5 hash after downloading my file. But now you say, "Hey, just to be sure: when I add 'xyzzy' to the front, the MD5 hash is '456def'. Did you get the same thing?" So now I have to go back and rehash it for xyzzy+file.

    Then someone else says, "Hey, when I add 'plugh' to the front, I get '321cba'. Did you get the same thing?" Pretty soon I'll be having all these requests and have to post all these hashes. So I say, "Okay, I'm just going to pick some arbitrary string, like 'sf9FD798dfs' and do the hash." The danger is that some people would get fooled into thinking that this is the ONLY hash that needs to be checked (and not the file without the salt), and so it's much easier for me to create a collision.

    --
    404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
    [GPG key in journal]
  86. Re:MOD UP INFORMATIVE by julesh · · Score: 1

    512 bits made from 2 hashes, one weak and one strong will be weaker than a single 512 bit hash from the stronger algorithm.

    True. However, using 2 different algorithms that are not known to be weak is probably stronger than using a single algorithm that is not known to be weak but produces twice as many bits.

    This follows from the fact that similar methods will be used to generate all of the bits in the latter case, therefore if there is some systematic flaw it is reasonably likely to apply to all of the bits. Whereas in the former case, you'd have to find 2 systematic flaws to get you as far (assuming that the algorithms used to generate them were dissimilar, and therefore unlikely to both contain the same flaw).

  87. Re:Add arbitrary salt? LESS secure! by julesh · · Score: 1

    In your question, you're asking what if YOU add an arbitrary salt string of your choice, and then do the MD5 hash. In this case, the attacker cannot choose your string. But then what use would your new MD5 hash be? You would need to compare it to another MD5 hash (presumably, that of the originating site) which also used the salt string of your choice.

    Yes. This is the same situation that the original poster who suggested this was talking about -- he was monitoring files on his system for changes (presumably in order to catch rootkits being installed, etc.).

    I understand the problems if you don't choose your own salt.