Slashdot Mirror


MD5 Collision Source Code Released

SiliconEntity writes "The crypto world was shaken to its roots last year with the announcement of a new algorithm to find collisions in the still widely-used MD5 hash algorithm. Despite considerable work and commentary since then, no source code for finding such collisions has been published. Until today! Patrick Stach has announced the availability of his source code for finding MD5 collisions and MD4 collisions (Coral cache links provided to prevent slashdotting). MD4 collisions can be found in a few seconds (but nobody uses that any more), while MD5 collisions (still being used!) take 45 minutes on a 1.6 GHz P4. At last we will be able to implement various attacks which have been purely hypothetical until now. This more than anything should be the final stake in the heart of MD5, now that anyone can generate collisions whenever they want."

16 of 411 comments

  1. SHA1 by mysqlrocks · · Score: 5, Funny

    So is SHA1 the recommended alternative?

    1. Re:SHA1 by psykocrime · · Score: 5, Informative

      So is SHA1 the recommended alternative?

      No, see:

      http://www.computerworld.com/securitytopics/security/story/0,10801,99852,00.html

      and

      http://www.computerworld.com/softwaretopics/software/story/0,10801,105875,00.html

      I like this quote:

      "SHA-1 is a wounded fish in shark-infested waters, but I'm more worried about MD5 because it's used everywhere," said Niels Ferguson, a cryptographer at Microsoft Corp. "Try to switch away from SHA-1 as quickly as you can, but switch away from MD5 first," he said when asked what recommendations he has regarding the algorithms during a panel discussion at the conference.

      --
      // TODO: Insert Cool Sig
    2. Re:SHA1 by Anonymous Coward · · Score: 5, Informative
      No, MD5 and SHA1 were found to have better than brute-force attacks within a few months of each other.

      Crypto people are recommending SHA-256 or SHA-512, which are like SHA-1 in name only.

      Obviously, check the hash length beforehand and make sure your database column is wide enough.

      When migrating existing hashes to the new hash, be careful not to store the old hash anywhere; that can be the weak link in the chain. For example, if the old MD5 hashes are still around, attackers can find inputs that match the MD5 first and then try those against the more computationally expensive hash. It gives them an approach to attacking your stronger hash.

      Take a copy of your database and run all the existing MD5 password hashes through SHA-512. You'll need some way of distinguishing the MD5-to-SHA512 hashes from the plain SHA512 hashes, so add a date column with today's date in it. Then write a wrapper function, "hashString", that can identify when something was hashed and go down a different branch of code based on that.

      The first branch does MD5 then SHA512, the second branch does SHA512, and it does this based on the date column.

      And, of course, re-salt both branches.
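
      A rough sketch of the two-branch wrapper described above, in Python. The cutover date, the function names and the per-row random salt are assumptions made for illustration, not anything taken from a real system. Rows dated on or before the cutover are treated as MD5-then-SHA512; anything later is plain SHA512.

```python
import hashlib
import os
from datetime import date
from typing import Tuple

# Hypothetical cutover date: rows dated on or before it hold wrapped MD5 hashes.
MIGRATION_DATE = date(2005, 11, 15)


def sha512_hex(salt: bytes, data: bytes) -> str:
    """Salted SHA-512 over raw bytes, returned as hex."""
    return hashlib.sha512(salt + data).hexdigest()


def migrate_row(old_md5_hex: str) -> Tuple[bytes, str]:
    """Wrap a stored MD5 hex digest in salted SHA-512.

    The caller stores the returned salt and hash and throws the MD5 away.
    """
    salt = os.urandom(16)
    return salt, sha512_hex(salt, old_md5_hex.encode("ascii"))


def hash_string(password: str, salt: bytes, hashed_on: date) -> str:
    """Choose the branch from the date column."""
    data = password.encode("utf-8")
    if hashed_on <= MIGRATION_DATE:
        # Legacy branch: MD5 first, then SHA-512 over the MD5 hex digest.
        data = hashlib.md5(data).hexdigest().encode("ascii")
    # Both branches end in the same salted SHA-512 step.
    return sha512_hex(salt, data)
```

      At login you would call hash_string with the row's salt and date and compare the result with the stored value. New passwords need a date strictly after the cutover so they take the second branch.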

  2. Should I care? by SlashAmpersand · · Score: 5, Funny

    This is all really interesting theoretically, but who has the money to run a 1.6 GHz P4?

  3. Replacement Hash Functions by Anonymous Coward · · Score: 5, Informative

    Recommended replacements are SHA (preferably SHA-2), WHIRLPOOL and/or RIPEMD.

    http://en.wikipedia.org/wiki/SHA-2
    http://en.wikipedia.org/wiki/WHIRLPOOL
    http://en.wikipedia.org/wiki/RIPEMD-160

  4. Re:So what the hell do I do now? by DreadSpoon · · Score: 5, Insightful

    Do nothing.

    MD5 has not been invalidated for those uses. Checking the MD5 sum of an ISO download is not done for security purposes, it's done so that you can make sure you didn't get a bad byte or two somewhere in that 650MB. I mean, if hackers could upload a malware-filled ISO to the FTP server, they could upload a new MD5SUMS file too, right?
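
    For the corruption check described here, something like the following Python sketch would do. The function names are made up for illustration; the only library used is the standard hashlib module.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks
    so a 650MB ISO never has to fit in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: str, expected_hex: str) -> bool:
    """True if the file's MD5 matches the published value (case-insensitive)."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

    A match only says the bytes arrived as they were published. It says nothing about who published them, which is the point made above.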

  5. This is misleading - MD5 is still useful by hoggoth · · Score: 5, Insightful

    This new algorithm does not ruin the usefulness of MD5 hashes. The algorithm can generate two documents that have the same MD5 hash, an MD5 collision. But it can NOT generate an MD5 collision starting with an existing document. In practical terms, this means a file that has been signed with an MD5 hash is STILL secure. Nobody can replace the file with a different file that will have the same MD5 hash. However someone can prepare in advance two documents with the same MD5 hash and trick someone into believing one document is really the other. So if you trust the original source (a Linux distro for example) you can be confident you are downloading the original document.

    --
    - For the complete works of Shakespeare: cat /dev/random (may take some time)
  6. Collisions do not mean the end of MD5 by afaik_ianal · · Score: 5, Insightful

    This more than anything should be the final stake in the heart of MD5, now that anyone can generate collisions whenever they want.

    No, no, no. This does not allow an attacker to generate any collision they like. They cannot find data that collides with a piece of data I provide them with. All they can do is provide me with 2 pieces of data that happen to collide.

    This means that an attacker can theoretically provide 2 different documents to people with the same hash, but they cannot easily produce a document that has the same hash as a document I have written.

    (Disclaimer: I haven't actually been able to RTFA (it's /.'d), but unless they have made an enormous breakthrough since this was last reported, this attack has very few implications for those of us who use MD5.)

  7. "broken" does not mean broken by Edgewize · · Score: 5, Informative

    This program is an efficient way to generate two source blocks with the same resulting MD5. This program does NOT allow you to match an arbitrary MD5 hash. That may come some day, but unless I've missed a very important paper somewhere, it has not happened yet.

    This does not totally invalidate MD5 for verification. This attack still does not let you poison a torrent feed, etc, unless you are the author of the original source data and you engineered the data specifically to be vulnerable to this attack.
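
    The difference between "two blocks that collide" and "matching a given hash" can be seen in a toy model. The Python sketch below is NOT the released attack. It is plain brute force against MD5 truncated to 16 bits (the BITS value and all function names are assumptions chosen so it runs in well under a second). With an n-bit digest, finding any colliding pair takes on the order of 2^(n/2) tries, while matching the hash of a message somebody else chose takes on the order of 2^n.

```python
import hashlib
from itertools import count

BITS = 16  # toy digest size (assumption); real MD5 is 128 bits


def toy_hash(data: bytes) -> int:
    """First BITS bits of the MD5 digest, as an integer."""
    full = int.from_bytes(hashlib.md5(data).digest(), "big")
    return full >> (128 - BITS)


def find_collision():
    """Find ANY two distinct inputs with equal toy hashes.

    The attacker chooses both messages. Returns (msg1, msg2, tries).
    """
    seen = {}
    for i in count():
        msg = b"a%d" % i
        h = toy_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


def find_second_preimage(target: bytes):
    """Find a different input whose toy hash equals that of a message
    the attacker did not choose. Returns (msg, tries)."""
    want = toy_hash(target)
    for i in count():
        msg = b"b%d" % i
        if msg != target and toy_hash(msg) == want:
            return msg, i + 1
```

    Calling both and comparing the returned try counts shows the gap. On full 128-bit MD5 the released code shortcuts only the first kind of search; the second kind is the one you would need to poison someone else's torrent.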

  8. Re:Why? by einhverfr · · Score: 5, Insightful

    Even if SHA1 and MD5 have attackable collisions the chances are very low that you can find a meaningful collision that affects both algorithms.
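
    In practice that means publishing and checking both digests. A minimal Python sketch, with made-up function names:

```python
import hashlib
from typing import Tuple


def dual_digest(data: bytes) -> Tuple[str, str]:
    """Return (md5_hex, sha1_hex) for the same data."""
    return hashlib.md5(data).hexdigest(), hashlib.sha1(data).hexdigest()


def matches_both(data: bytes, md5_hex: str, sha1_hex: str) -> bool:
    """Accept the data only if BOTH published digests match."""
    got_md5, got_sha1 = dual_digest(data)
    return got_md5 == md5_hex.lower() and got_sha1 == sha1_hex.lower()
```

    A forged file would have to collide under both algorithms at once to pass this check.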

    --

    LedgerSMB: Open source Accounting/ERP
  9. Coral cache? by Viper+Daimao · · Score: 5, Funny

    (Coral cache links provided to prevent slashdotting)

    I'm sorry, you must be new here.

    --
    "In the game of life, someone always has to lose. To me, if life were fair, that someone would always be Oklahoma." -DKR
  10. Re:So you found a collision, big deal by Krischi · · Score: 5, Informative

    See this: http://www.cits.rub.de/MD5Collisions/

    It demonstrates the generation of two postscript files with the same MD5 hash that nevertheless display completely differently.

  11. Re:Managed to get just the last few lines... by Anonymous Coward · · Score: 5, Funny

    I downloaded the source, but it doesn't seem to be working properly. Does anyone have an md5sum of the original so I can verify I got the right code?

    -confused

  12. Re:SHA-1??? by poemofatic · · Score: 5, Informative

    Huh? The SHA-2 family have been standardized, approved by NIST, and recommended by the NSA as part of their suite B for some time now. They are *much* more proven than Whirlpool and required for government use, whereas Whirlpool is not allowed for government use. Look at the SHA-512, SHA-384, SHA-256 CMVP instructions and validation lists before you say that NIST has not approved these hashes.

    --

    When in doubt, have a man come through a door with a gun in his hand.

  13. Re:Q and A by CodeRx · · Score: 5, Interesting
    sha1(md5($password . '¥1i9k') . 'a-thirty-five-ch4racter-l0ng-str1ng' . md5($password))

    This is a very bad password salting scheme and vulnerable to a dictionary attack. Once I have your database and salts, I can run a dictionary of common passwords through your scheme and crack any weak passwords.

    You can make things much harder by having your salt change for each password - include the username for example. Now I have to run my entire dictionary through the sha/md5 function for each user. By doing this, you make the attack O(m*n) instead of O(m) (where m = the number of words in my dictionary and n = the number of users).

    And as you mentioned in a follow-up post, this code only generates documents with identical MD5 sums; it does not generate a document with a given sum. So MD5 is broken for document signing and the like, but secure for password hashing for the time being.
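
    To make the O(m*n) point concrete, here is a small Python sketch. It swaps in a random per-user salt and a single SHA-512 instead of the username and the sha1/md5 nesting quoted above, and every name in it is hypothetical. The dictionary_attack function is the attacker's side: because each row has its own salt, no hash computed for one user can be reused for another.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> str:
    """Salted SHA-512 of the password, as hex."""
    return hashlib.sha512(salt + password.encode("utf-8")).hexdigest()


def new_record(username: str, password: str) -> dict:
    """Create a stored row with a fresh random salt for this user."""
    salt = os.urandom(16)
    return {"user": username, "salt": salt, "hash": hash_password(password, salt)}


def verify(record: dict, password: str) -> bool:
    """Check a login attempt against the stored row."""
    return hmac.compare_digest(record["hash"], hash_password(password, record["salt"]))


def dictionary_attack(records, words):
    """Attacker's view: hash every word under every user's salt.

    There is no early exit, so the count is exactly m * n hashes.
    """
    cracked, hashes = {}, 0
    for rec in records:
        for word in words:
            hashes += 1
            if hash_password(word, rec["salt"]) == rec["hash"]:
                cracked[rec["user"]] = word
    return cracked, hashes
```

    With one salt shared by everybody, the attacker could hash the dictionary once and look every user up in the result. Weak passwords still fall either way; the per-user salt only multiplies the work.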

  14. MOD ME DOWN by swillden · · Score: 5, Informative

    The parent comment, which I wrote, was based on a severe misunderstanding of the extent of the capability of the attack. In particular, I didn't realize that the attack could find collisions even with arbitrary, attacker-specified IVs. What that means is that it is indeed possible to generate X.509 certificates containing different keys but the same MD5 hash (and therefore the same signature). In fact, it's been done.

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.