Slashdot Mirror


MD5 Collision Source Code Released

SiliconEntity writes "The crypto world was shaken to its roots last year with the announcement of a new algorithm to find collisions in the still widely-used MD5 hash algorithm. Despite considerable work and commentary since then, no source code for finding such collisions has been published. Until today! Patrick Stach has announced the availability of his source code for finding MD5 collisions and MD4 collisions (Coral cache links provided to prevent slashdotting). MD4 collisions can be found in a few seconds (but nobody uses that any more), while MD5 collisions (still being used!) take 45 minutes on a 1.6 GHz P4. At last we will be able to implement various attacks which have been purely hypothetical until now. This more than anything should be the final stake in the heart of MD5, now that anyone can generate collisions whenever they want."

85 of 411 comments (clear)

  1. SHA1 by mysqlrocks · · Score: 5, Funny

    So is SHA1 the recommended alternative?

    1. Re:SHA1 by Mind+Booster+Noori · · Score: 3, Informative

      SHA-1 is not the sollution.

    2. Re:SHA1 by psykocrime · · Score: 5, Informative

      So is SHA1 the recommended alternative?

      No, see:

      http://www.computerworld.com/securitytopics/securi ty/story/0,10801,99852,00.html

      and

      http://www.computerworld.com/softwaretopics/softwa re/story/0,10801,105875,00.html

      I like this quote:

      "SHA-1 is a wounded fish in shark-infested waters, but I'm more worried about MD5 because it's used everywhere," said Niels Ferguson, a cryptographer at Microsoft Corp. "Try to switch away from SHA-1 as quickly as you can, but switch away from MD5 first," he said when asked what recommendations he has regarding the algorithms during a panel discussion at the conference.

      --
      // TODO: Insert Cool Sig
    3. Re:SHA1 by Anonymous Coward · · Score: 5, Informative
      No, MD5 and SHA1 were found to have better than brute-force attacks within a few months of each other.

      Crypto people are recommending SHA-256 or SHA-512 which is only like SHA-1 in name.

      Obviously check your the hash length beforehand and make sure your database column is wide enough.

      When migrating existing hashes to the new hash be careful not to store the old hash anywhere -- that can be the weak link in the chain. For example, generating passwords and having the MD5 around lets attackers generate valid inputs and then try them against the more computationally complex hash. It gives them an approach to attacking your stronger hash.

      Take a copy of your database and hash all the existing passwords into SHA-512 form, and you'll need some way of distinguishing the MD5-to-SHA512 hashes from the SHA512 hashes, so add a date column with todays date in it. Then write a function "hashString" as a wrapper that can identify when something was hashed, and go down a different branch of code based on that.

      The first branch does MD5 then SHA512, the second branch does SHA512, and it does this based on the date column.

      And, of course, re-salt both branches.

    4. Re:SHA1 by flosofl · · Score: 3, Insightful

      For the love of _______! (fill in appropriate name for your particular beliefs)

      Someone mod the GP post Funny, before we get more "informative" posts. It looked like a tongue-in-cheek comment to me. I actually laughed. Then I saw the follow-ups from the Unfunny Brigade... It was a joke!

      Seriously, who doesn't know about the SHA-1 weakness by now.

      --
      "This calls for a very special blend of psychology and extreme violence" - Vyvyan "The Young Ones"
    5. Re:SHA1 by jacksonj04 · · Score: 3, Informative

      Yes, so

      hash1 = md5(input1)
      hash2 = sha1(input1)

      Then to validate you run the input through the same thing. Like I said, the chances of a valid input colliding with both are ignorably slim.

      --
      How many people can read hex if only you and dead people can read hex?
    6. Re:SHA1 by Anonymous Coward · · Score: 4, Informative

      If you can find a collision in one, you can, for pretty much the same work, find a collision in the second hash. Cascading hash functions like this DOES NOT WORK.

      Look up the Joux Multicollision paper from 2004 CRYPTO. This is a famous result.

      This scheme you propose has been broken by Joux.

      -Matt

  2. Managed to get just the last few lines... by Saint+Aardvark · · Score: 4, Funny
    ...before even the Coral cache was Slashdotted, and it turns out they've written it in LISP:

    ))))))) ))))))))

    (With sincere apologies to Bryce Jasmer.)

    1. Re:Managed to get just the last few lines... by Anonymous Coward · · Score: 5, Funny

      I downloaded the source, but it doesn't seem to be working properly. Does anyone have an md5sum of the original so I can verify I got the right code?

      -confused

  3. shaken to our what? by tomstdenis · · Score: 3, Insightful

    It's important news but not really that shocking. MD5 was not something professionals would recommend for a few years already.

    Any half-way intelligent cryptographer would have suggested SHA-1, TIGER or perhaps HAVAL since quite some time already.

    Tom

    --
    Someday, I'll have a real sig.
    1. Re:shaken to our what? by ZachPruckowski · · Score: 2, Insightful

      It's important news but not really that shocking. MD5 was not something professionals would recommend for a few years already.

      But a household user can crack in an hour on a normal mid-line computer something that "a few years" ago professionals might have recommended. That's the news. If low-end PCs can crack encryption that's only a few years outdated, then the hounds are nipping at the heels of the cyptography industry. And imagine what hackers could do with more powerful computers (yes, I know there is a non-applicability issue with some forms of encryption). Or the government with all those Cray super-monstrosities.

  4. MD5 broken, not useless by Anonymous Coward · · Score: 2, Insightful

    Keeping in mind where MD5 is broken is important, so that good uses for this tool are not disposed of out-of-hand.

    md5 is still good for keeping track of if your files have changed. It should not be used for document signing.

  5. So what the hell do I do now? by jeblucas · · Score: 4, Interesting

    I'm essentially crypto ignorant. About all I've known to do was verify MD5 hashes on downloads. Now that this is by-and-large pointless, how to check the veracity of things like Linux ISO's, video drivers, etc, ad inifintum?

    --
    blarg.
    1. Re:So what the hell do I do now? by DreadSpoon · · Score: 5, Insightful

      Do nothing.

      MD5 has not been invalidated for those uses. Checking the MD5 sum of an ISO download is not done for security purposes, it's done so that you can make sure you didn't get a bad byte or two somewhere in that 650MB. I mean, if hackers could upload a malware-filled ISO to the FTP server, they could upload a new MD5SUMS file too, right?

    2. Re:So what the hell do I do now? by yamla · · Score: 4, Informative

      That's not what MD5 sums are used for. TCP/IP already has packet integrity. MD5 sums are indeed used to make sure you don't have a malware-filled ISO. The trick is that you grab the MD5 sum from a trusted source, then you can grab the ISO image from any mirror site. Assuming MD5 is safe (obviously not the case), you know your downloaded ISO is exactly the same as the one distributed from the central repository.

      --

      Oceania has always been at war with Eastasia.
    3. Re:So what the hell do I do now? by Zocalo · · Score: 3, Informative
      Most things use multiple checksums, for instance on the updated copy of Lynx I just grabbed for Fedora:

      $ rpm --checksig lynx-2.8.5-23.2.x86_64.rpm
      lynx-2.8.5-23.2.x86_64.rpm: (sha1) dsa sha1 md5 gpg OK
      $

      So, even if it is also possible to generate collisions for DSA and GPG keys as well as SHA1 and MD5, the chances of being able to generate a collision for all four checksums/signatures at the same time is quite likely infinitesimally small. And that's just for a random file, things are going to get much more complex if you want that random file to can pass as whatever format the original was supposed to be and actually deliver a payload that might do something useful for the cracker.

      --
      UNIX? They're not even circumcised! Savages!
    4. Re:So what the hell do I do now? by Aranth+Brainfire · · Score: 3, Insightful

      "That's not what MD5 sums are used for."

      Heh, sorry, but you lose. I use them for that.

      I've never had a problem with an "infected" iso, but I have had one or two that downloaded wrong and flat out failed to work after I burned them. Sure enough, I check the md5 sum and it's not correct- download the iso again from the same site and it works.

      So I guess maybe either my hard drive is funky (or some other system failure on one end) or this "packet integrity" isn't so great.

      --
      "Quoting yourself is stupid." -Me
    5. Re:So what the hell do I do now? by swillden · · Score: 3, Informative

      So, even if it is also possible to generate collisions for DSA and GPG keys as well as SHA1 and MD5, the chances of being able to generate a collision for all four checksums/signatures at the same time is quite likely infinitesimally small.

      Actually, only SHA1 and MD5 need collide. DSA and GPG aren't used for hashing. DSA is used for signing the hashes, and GPG just implements and structures all of the above.

      Still, you're right that finding collisions for both MD5 and SHA-1 on the same pair of files is extremely unlikely.

      And that's just for a random file, things are going to get much more complex if you want that random file to can pass as whatever format the original was supposed to be and actually deliver a payload that might do something useful for the cracker.

      Hammerhead, meet nail.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  6. The cycle begins anew... by rcbarnes · · Score: 2, Funny

    Great. Now that MD5 is dead, the slow/theoretical attacks on SHA1 can be the focus of collision research. I look forward to changing hash algorythms again from SHA1 in a year. :-/

    --
    "Fight for lost causes. You may discover they weren't."
  7. Should I care? by SlashAmpersand · · Score: 5, Funny

    This is all really interesting theoretically, but who has the money to run a 1.6 GHz P4?

  8. Replacement Hash Functions by Anonymous Coward · · Score: 5, Informative

    Recommended replacements are SHA (preferably SHA-2), WHIRLPOOL and/or RIPEMD.

    http://en.wikipedia.org/wiki/SHA-2
    http://en.wikipedia.org/wiki/WHIRLPOOL
    http://en.wikipedia.org/wiki/RIPEMD-160

  9. Why? by mboverload · · Score: 3, Insightful

    Why not just use 2 different algorithms? Yes, it's possible. Or hell, use 3. Can some one tell me why not this isn't a standard practice? Even if one has a weakness, you still have the other to back it up

    I use HMAC-SHA-256 with PasswordMaker.

    https://passwordmaker.org/passwordmaker.html

    1. Re:Why? by einhverfr · · Score: 5, Insightful

      Even if SHA1 and MD5 have attackable collisions the chances are very low that you can find a meaningful collision that affects both algorithms.

      --

      LedgerSMB: Open source Accounting/ERP
    2. Re:Why? by ninjaz · · Score: 2, Informative
      Why not just use 2 different algorithms? Yes, it's possible. Or hell, use 3. Can some one tell me why not this isn't a standard practice? Even if one has a weakness, you still have the other to back it up
      I noticed that NetBSD's source-based package management system, pkgsrc, already does this using SHA1 and RMD160 (apparently RIPEMD-160 is the official name for the digest). Here's what it looks like in the archive fetching phase of a package installation:

      => Checksum SHA1 OK for unzip-5.52/unzip552.tar.gz.
      => Checksum RMD160 OK for unzip-5.52/unzip552.tar.gz.
      ===> Extracting for unzip-5.52nb2

      One might also imagine that colliding two different hash types at the same time would be much more difficult than only at a time, anyway.

    3. Re:Why? by jonabbey · · Score: 2, Insightful

      True, but you could use a hash function like SHA1('data').MD5('data'), where the . operator stands for string concatenation.

      The reason that this isn't generally done is that should not provide more security than a proper cryptographic hash algorithm that produces hashes as long as the two different hash algorithms concatenated together.

      If you want additional collision resistance, just generate a longer hash. I believe this is how people are advised to handle SHA-class algorithms right now.

    4. Re:Why? by einhverfr · · Score: 2, Insightful

      If you want additional collision resistance, just generate a longer hash. I believe this is how people are advised to handle SHA-class algorithms right now.

      Hmmm....

      I actually like the concatenation more. The reason is that the longer hash solution only provides protection provided that the hash algorythm is not broken. Once it is broken your entire system fails. In other words, the security doesn't have much of a concept of depth.

      If you use two independant hash algorythms that are different, then the chance that a given change will provide a collision on both hashes is quite remote. This means that one not only has to break both algorythms but has to break both *in the same way.* This means that you could quite possibly use two broken hash algorythms safely.

      --

      LedgerSMB: Open source Accounting/ERP
    5. Re:Why? by Phleg · · Score: 3, Informative

      Just like mixing medications can have very bad synergistic side effects, so should encryption or hashing technologies be mixed and matched.

      As an example, when DES was first known to be broken, the most intuitive solution would be to double-encrypt the plaintext. However, upon cryptographic analysis, this acutally fails to improve the complexity of an attack (and in some cases may simplify it). Thus, Triple DES.

      Be very wary of trying to combine "broken" algorithms in an attempt to gain security, especially if you have no real grounding in cryptanalysis. Vulnerabilities in each have a nasty tendency to either amplify or at least complement each other in highly unpredictible ways.

      Remember one of the basic tenets of cryptography: it's easy to create an algorithm that you can't break. But just because you can't think of a way to break it doesn't mean there's not a trivial way to do so.

      --
      No comment.
    6. Re:Why? by einhverfr · · Score: 2, Interesting

      As an example, when DES was first known to be broken, the most intuitive solution would be to double-encrypt the plaintext. However, upon cryptographic analysis, this acutally fails to improve the complexity of an attack (and in some cases may simplify it). Thus, Triple DES.

      Your point is well taken. However, in this case, I believe you are mistaken. Indeed 3DES is just an extended form of DES which uses 2 keys and three encryptions. But here you are speaking of access so the analogy is likely to be fatally flawed anyway.

      My initial response was to the idea that a hash of a hash would provide any security and indeed would reduce it as your example illustrates. If you MD5 a SHA1 hash, you have reduced your security because a failure in either hash algorythm renders the entire chain ineffective (weakest link principle).

      Now, with my proposal, one would include independant hashs which would be checked independantly. If either one fails, one assumes that the data has been tampered with. The issue is that it would be difficult to defeat both simultaneously for this specific type of check. Being able to do so on demand while editing the file in a meaningful way might well prove impossible.

      Mixing encryption systems is indeed the bread and butter of the way our systems usually work. For example almost any key exchange algorythm uses a combination of encryption algorythms to ensure that symmetric encryption keys can be exchanged in a secure manner among previously unknown parties. But the key is that you want to make it a requirement to break every link on the chain and not have the system fail when one link is compromised. And yes, double-encrypting with the same key is stupid :-)

      --

      LedgerSMB: Open source Accounting/ERP
    7. Re:Why? by jonabbey · · Score: 2, Insightful

      Now, with my proposal, one would include independant hashs which would be checked independantly. If either one fails, one assumes that the data has been tampered with. The issue is that it would be difficult to defeat both simultaneously for this specific type of check. Being able to do so on demand while editing the file in a meaningful way might well prove impossible.

      Yes, but only if you mean 'might well prove impossible' in the same way that it 'might well prove impossible' to break SHA-1 or MD5. There's nothing magical in the mathematics that makes a hash generated partially by SHA-1 and partially by MD5 harder to break than a hash generated by SHA-256 or the like, which generates a longer hash than SHA-1 or MD5 alone.

      Remember, as long as the domain of source files to be hashed includes all possible data files longer than the generated hash, there will be collisions in the function's range. This is true even in the SHA1('data').MD5('data') case. And as long as there are collisions, it's just a question of how difficult it is to find them.

      Yes, prefixing your MD5 hash with an SHA1 hash should make it much harder to find a collision in both algorithms simultaneously, but the very same difficulty could be achieved with a single hash algorithm which generates a longer hash. The magic is in the quality of the algorithm and the length of the output, not in the fact that two algorithms were put to use.

    8. Re:Why? by Lord+Ender · · Score: 2, Insightful

      Combining hashes is effectively just inventing a new hash algorithm with longer digests.

      As a mathematician, this "seems" like it would be intuitively much more difficult for someone to discover an attack against.

      As an engineer, this seems like it is obviously NOT a "good" solution.

      If H1() and H2() are both weak, the hybrid H12() is obviously stronger than either, but also fundamentally flawed.

      --
      A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
  10. Right now it's time to... by Tackhead · · Score: 2, Interesting
    > This more than anything should be the final stake in the heart of MD5, now that anyone can generate collisions whenever they want."

    In other words, right now it's time to...
    ...LICK OUT THE KAMS, NOTHERGUCKERS!

  11. bittorrent? by rayde · · Score: 4, Insightful

    doesn't bittorrent use md5 to verify the sections of files it has downloaded? will this facilitate poison seeds? or does BT use something more complex than md5?

    1. Re:bittorrent? by Wesley+Felter · · Score: 4, Informative

      BitTorrent uses SHA-1.

    2. Re:bittorrent? by Breakfast+Pants · · Score: 2, Informative

      He was talking about poisoned seeds. And theoretically SHA-1 isn't fine for this and is vulnerable to the same flaws as MD5. It isn't practical now (and certainly not as practical as poisoning kazaa shares... where every other byte in a file is skipped(!) for performance when generating a hash).

      --

      --

      WHO ATE MY BREAKFAST PANTS?
  12. WTF this source is useless by RootsLINUX · · Score: 3, Interesting

    I guess someone thinks he's a little too cool to comment his code properly. Yeah, like "/* B2 */" tells me anything useful you insensitive clog!

    --
    Hero of Allacrost, a FOSS RPG for *NIX/*BSD/OS X/Win
    1. Re:WTF this source is useless by plalonde2 · · Score: 2, Insightful
      It's thoroughly documented: mind you you need to start with the reference given in the first comment. Variable names and comments relate to that paper.

      Not all code documentation lives in the source.

    2. Re:WTF this source is useless by Theatetus · · Score: 3, Informative

      If you RTFRFC you might notice that those are the variable and section names used in the algorithm specificaiton.

      --
      All's true that is mistrusted
  13. This is misleading - MD5 is still useful by hoggoth · · Score: 5, Insightful

    This new algorithm does not ruin the usefulness of MD5 hashes. The algorithm can generate two documents that have the same MD5 hash, an MD5 collision. But it can NOT generate an MD5 collision starting with an existing document. In practical terms, this means a file that has been signed with an MD5 hash is STILL secure. Nobody can replace the file with a different file that will have the same MD5 hash. However someone can prepare in advance two documents with the same MD5 hash and trick someone into believing one document is really the other. So if you trust the original source (a Linux distro for example) you can be confident you are downloading the original document.

    --
    - For the complete works of Shakespeare: cat /dev/random (may take some time)
    1. Re:This is misleading - MD5 is still useful by MoogMan · · Score: 2, Insightful

      Nobody can replace the file with a different file that will have the same MD5 hash.

      Yet.

  14. Collisions do not mean the end of MD5 by afaik_ianal · · Score: 5, Insightful

    This more than anything should be the final stake in the heart of MD5, now that anyone can generate collisions whenever they want.

    No, no, no. This does not allow an attacker to generate any collision they like. They cannot find data that collides with a piece of data I provide them with. All they can do is provide me with 2 pieces of data that happen to collide.

    This means that an attacker can theoretically provide 2 different documents to people with the same hash, but they cannot easily produce a document that has the same hash as a document I have written.

    (Disclaimer: I haven't actually been able to RTFA (it's /.'d), but unless they have made an enormous breakthrough since this was last reported, this attack has very little implications for those of us who use MD5).

  15. "broken" does not mean broken by Edgewize · · Score: 5, Informative

    This program is an efficient way to generate two source blocks with the same resulting MD5. This program does NOT allow you to match an arbitrary MD5 hash. That may come some day, but unless I've missed a very important paper somewhere, it has not happened yet.

    This does not totally invalidate MD5 for verification. This attack still does not let you poison a torrent feed, etc, unless you are the author of the original source data and you engineered the data specifically to be vulnerable to this attack.

  16. Will people stop saying this? by ivan256 · · Score: 2, Insightful

    This more than anything should be the final stake in the heart of MD5

    No, no it won't be. It won't be, because MD5 is useful for many things where the existance of an "easy" (in quotes because easy is relative) method of generating colisions is irrelavant.

    It won't even kill off the use of MD5 checksums as a signature for verifying authenticity, because if your data is smaller than the checksum there may not be a colision at all, and an exploit wouldn't matter.

    This is an important discovery, but it doesn't make MD5 useless any more than CRC32 is useless.

  17. Got Salt? by Anonymous Coward · · Score: 2, Insightful

    OK, so clearly a scripted attack against MD5 is bad.

    But aren't most people using MD5 using salted (as opposed to unsalted) hashes? (for those unclear on the difference, a "salted" has basically uses a local seed as part of it's MD5 hash, in addition to the value to be encrypted)

    Doesn't seem likely that salted hashes can be easily broken by this technique, although clearly it's a concern that, should the salt value become known, all your passwords, etc, become breakable...

  18. Not -that- bad by Parity · · Score: 2, Interesting

    The only attacks that these md5 collisions allow are denial-of-service/destruction-of-data attacks, they don't generally allow the compromise of protected data or access to systems or suchlike. The collision blocks that are generated are effectively random data. It has yet to be shown how to -craft- a collision block.

    If we could craft a collision block that contained a specified string at a specified position, that would be another issue altogether.

    The ability to find collision blocks easily does suggest that crafted collision blocks might be possible, but for now, you have as good a chance of getting a viable exploit out of /dev/random as out of a collision block.

    This doesn't mean we shouldn't look to other options for the newest releases of high-security software, but it doesn't mean that the md5 algorithm should be purged from our systems altogether either. It's still extremely valuable at detecting accidental corruption, and useful-with-caveats at detecting malicious corruption (45 minutes to discover a block of data that matches the sum is not really useful in either speed or resulting data for any kind of man in the middle attack, for example, so using md5 to validate network packets is safer than using it to validate disk files).

    Of course, the black hats may know more than we do about md5 weaknesses, but 'may know' is just as true of any other algorithm.

    --
    --Parity
    'Card carrying' member of the EFF.
  19. MD5 and verification by n0dalus · · Score: 4, Insightful

    Just because collisions can be generated doesn't mean that MD5 is dead.
    It might only take minutes to calculate two random strings with the same hash, but it would still take a very long time to calculate a second string that collides with a pre-existing string. So even though it is now cryptographically weak, it can still be used effectively to check the integrity of files.

  20. Coral cache? by Viper+Daimao · · Score: 5, Funny

    (Coral cache links provided to prevent slashdotting)

    Im sorry, you must be new here.

    --
    "In the game of life, someone always has to lose. To me, if life were fair, that someone would always be Oklahoma." -DKR
  21. SHA-1??? by jd · · Score: 3, Insightful
    SHA-1 has known attacks, although none have (yet) proven to be useful for an exploit. The SHA-2 family (eg: SHA-256) are "unproven" and not part of the FIPS-180 standard (so cannot be used for US Government work), but I would regard them as being "probably safer" than SHA-1 for secure work.


    TIGER is good, as is Whirlpool. Whirlpool has the advantage that it uses AES as the basis, and AES is regarded as secure. It was also certified for secure work by NESSIE - a European group trying to do for the EU what NIST does for the US - which means that it's probably certified for use in secure environments in Europe.


    According to the Hashing Function Lounge, there are other hashing functions that have not been broken (eg: cellhash and fft-hash) but these are sufficiently obscure that a lack of a known exploit may be through lack of study and not through the presence of good security. It would make them good for beating skript-kiddies, as they won't have the skills to find exploits and those skilled enough at finding them aren't studying those algorithms much. (I don't like security through obscurity, but technically these aren't obscured as anyone CAN study the algorithm.)

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    1. Re:SHA-1??? by poemofatic · · Score: 5, Informative

      Huh? The SHA-2 family have been standardized, approved by NIST, and recommended by the NSA as part of their suite B for some time now. They are *much* more proven than Whirlpool and required for government use, whereas Whirlpool is not allowed for government use. Look at the SHA-512, SHA-384, SHA-256 CMVP instructions and validation lists before you say that NIST has not approved these hashes.

      --

      When in doubt, have a man come through a door with a gun in his hand.

    2. Re:SHA-1??? by Anpheus · · Score: 2, Informative

      You're half right, SHA-256, 384, 512 are part of FIPS 180-2. That is, they are approved for government use. FIPS 180-2 (PDF Warning!)

      Brief Caveat: SHA-1 is still more algorithmically secure, as finding a hash is not feasible on desktop computers. SHA-256 and higher length are in the same vein of logic that provides this security, but like any new algorithm, there is an insufficient amount of study to verify this. This applies to Whirlpool as well, AES might be sufficiently secure, but Whirlpool != AES.

    3. Re:SHA-1??? by jafac · · Score: 2, Interesting

      For my application, SHA-1 incurred a HUGE performance penalty. (factor of 1000). Given that there are few other variables I am free to change in this system, which hash, among these others you mention, tends to be more lightweight?

      --

      These are my friends, See how they glisten. See this one shine, how he smiles in the light.
    4. Re:SHA-1??? by jd · · Score: 2, Interesting
      It depends a little on the implementation, but using mhash I get the following results when generating hashes of a gigabit file from DVD: (Results calculated by using gnu time, and using the user time, not the real time, result)


      • SHA-1 - 16 seconds
      • SHA-512 - 1 min, 17 seconds
      • Haval - 27 seconds
      • Whirlpool - 1 min, 14 seconds
      • Tiger - 18 seconds


      And then, when generating hashes of a 5 gigabyte file on my hard drive (there's been a lot of good stuff on Freshmeat, recently):


      • SHA-1 - 2 seconds
      • SHA-512 - 3 seconds
      • Haval - 5 seconds
      • Whirlpool - 5 seconds
      • Tiger - 3 seconds


      I'm going to guess, on the basis of these results, that you want to try Tiger as a first choice, Haval as a second choice, and that the mhash implementation will probably be OK for Tiger but if you opt for Haval you're probably better with a different hashing engine.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  22. So you found a collision, big deal by goombah99 · · Score: 3, Interesting

    Maybe someone could explain why collisions are a serious problem for MD5. Or at least in what instances they are. I can see that in some cases, such as password hashing this could be a problem. But for many other uses I don't see what harm a collision has. Maybe I misunderstand but as I understand it MD5s are normally used in a checksum manner to sign or provide a fingerprint of a document. If you have an original document and compute it's MD5 then it can match some certified MD5 check sum. If someone were to generate a fake document they coul dnot design it to match the MD5 fingerprint. They could create some bit of gibberish that did match it but not a document that was useful as a forgery.

    I guess if one were trying to deliberately pedal gibberish, like say if you were the RIAA trying to destabilize a torrent net then that would be all you care about. But for more general issues I don't get it.

    Or is it now possible to generate a collision that also contains some intentional content like a binary program or source code or bank statement. That woul dbe be bad.

    It seems like even in the case of gibberish generation that some simple hacks could extend the useful life of this. For example, if you were to MD5 a whole document and then MD5 the concatenation of every other byte in the document, it woul dbe pretty hard to find two collisions that had that property simulateously. Sure I wont doubt there are better ways to hash something than adding hacks to MD5. I'm just saying that it seems like a simple hack might well be good enough to extend its useful lifespan for passwords and file shareing.

    But I invite you to explain to me why I'm wrong.

    --
    Some drink at the fountain of knowledge. Others just gargle.
    1. Re:So you found a collision, big deal by Krischi · · Score: 5, Informative

      See this: http://www.cits.rub.de/MD5Collisions/

      It demonstrates the generation of two postscript files with the same MD5 hash that nevertheless display completely differently.

    2. Re:So you found a collision, big deal by andyh1978 · · Score: 4, Informative
      Maybe someone could explain why collisions are a serious problem for MD5. Or at least in what instances they are. I can see that in some cases, such as password hashing this could be a problem.
      It's not a problem in password hashing. There is still no feasible way to compute one of the infinite plaintexts that would generate a given MD5 from just the MD5. Rainbow Tables are the main threat there, but they're defeated by salting (e.g. HMAC-MD5) as you have to regenerate the tables all over again (and find the salt in the first place). It doesn't hurt to go to a larger, more complex hash, but for this purpose, there's no additional worries. It's still "preimage resistant".
    3. Re:So you found a collision, big deal by iabervon · · Score: 3, Insightful

      This generates "weak collisions", which are where the attacker finds a pair of texts which hash to the same value, not "strong collisions", which are where the attacker finds a text that hashes to the same value as a text chosen by the user. So, someone could now set their password to some string, and then be able to type a different string to get in. (Except that neither are likely to be ascii text, making it a bit tough to type).

      The actual issues are for document signing; the attacker could give you one document to sign, and use the signature on a different document with the same hash. There are smaller issues in the case where code expects no two documents to have the same hash. Obviously, collisions must exist, but the code to handle the case is likely not to be well-tested, since test cases were previously impractical to find.

    4. Re:So you found a collision, big deal by andyh1978 · · Score: 2, Informative
      the problem is that if someone knows the MD5 of a password, they can use this code to generate another password with the same MD5. since passwords are usually stored hashed, an attacker wouldn't need to know the original password, only another password that would generate the same MD5

      How?

      The collision vulnerabilities do not allow this. They require both the MD5, and the original plaintext, to produce a modified plaintext that has the same MD5.

      Think about it - how do you know it's a collision at all, unless you have the original plaintext? A collision is two different plaintexts that produce the same MD5. You can't know if you have a different plaintext unless you have the original plaintext.

      If you had the original plaintext, that means you've got the original password, so collisions are entirely irrelevant. You've already got what you need to log in.

      There is still no way, other than the brute force enumeration which is made easier to look up through Rainbow Tables, to get from an MD5 to a plaintext that hashes to that MD5 value. The discovery of methods to produce collisions has not weakened MD5 any further - so far only the increase in computing power to produce Rainbow Tables has weakened this particular use. But trivial salting of the values makes Rainbow Tables useless once more.

      The use of MD5 as a method to checksum files has been blown out of the water, of course. That's the other use, which I'm not arguing about at all. You know both the plaintext and the MD5 there, because you've downloaded the file and the MD5 for the file which you trust can't be forged - which is no longer true.

  23. Re:The end of edonkey by n0dalus · · Score: 4, Informative

    No. This only helps you find collisions in two randomly generated strings.
    It is still very difficult to produce a colliding file given a pre-existing file on the network.
    It should also be noted that edonkey splits a file into 9500KB chunks, and then into smaller chunks again, and hashes each one. It would be far more difficult to produce a chunk that causes collisions on all three levels.
    Anyway, I expect an eMule extension will come out soon to allow for sharing of SHA1 hashes between clients (if it doesn't already exist).

  24. Weak code. by kg_o.O · · Score: 4, Funny

    This code is weak. I fired it up like 20 minutes ago and still haven't r00ted my box.

  25. Re:MD5 is not an encryption algo by swillden · · Score: 4, Informative

    t's also the default algorithm to hash passwords (i.e. if you type in your password, it gets hashed into an MD5 sum which is then compared to what the MD5-ed password should be, thereby avoiding plaintext password storage).

    Doesn't matter. This attack has no significant effect on password hashing, with or without salt.

    This attack allows you to find a pair of plaintexts that hash to the same value; you don't get to pick either the plaintexts or the hash value. It does not help you find a plaintext that hashes to a given value. To use this to attack an unsalted password hashing system you would need to first generate a collision, then convince the target of your attack to set one of those plaintexts to be his/her password, then you could log in using the other plaintext. But if you can convince the target to use a particular password, why not just use that to log in?

    This is not an insignificant cryptologic result, and people should move away from MD-5 as fast as practically possible (actually, people have been moving away from it for years due to some results against MD-4, which MD-5 is very similar to) but it doesn't really have any practical implications right now.

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  26. Q and A by Sheepdot · · Score: 4, Informative
    For those of you who store passwords as hashes in your web apps, I've developed a little "Q & A" post here that explains this as best as possible.

    Question: Does this mean all my MD5 passwords for all my users can be cracked?
    Answer: The short answer is yes, they can be cracked. The long answer is no, not if you used a salt, and the attacker has to get those md5 hashes first. You are not safe you are storing your user's password field input directly to the database ala the php/sql method of:
    INSERT INTO users VALUES ('user','" . md5($password) . "');

    Question: How should I remedy this?
    Answer: Always use a salt or salts. For example in the case above you could use this php/sql method instead:
    INSERT INTO users VALUES ('user','" . sha1(md5($password . '¥1i9k') . 'a-thirty-five-ch4racter-l0ng-str1ng' . md5($password)) . "');

    Question: How/Why is this safer?
    Answer: Collisions are only direct input for the md5 function to get the same md5 hash. So in the above case, $password was directly taken from the user and made into a hash. Assuming an attacker got an SQL injection in and grabbed the database, they could run this collision creator on a hash and produce an input that gave that exact hash.

    But, this would be much more difficult with any code that introduced a salt. That is why the second code is better, it includes two salts that the attacker (through his SQL injection) is unable to account for.
    1. Re:Q and A by CodeRx · · Score: 5, Interesting
      sha1(md5($password . '¥1i9k') . 'a-thirty-five-ch4racter-l0ng-str1ng' . md5($password))

      This is a very bad password salting scheme and vulnerable to a dictionary attack. Once I have your database and salts, I can run a dictionary of common passwords through your scheme and crack any weak passwords.

      You can make things much harder by having your salt change for each password - include the username for example. Now I have to run my entire dictionary through the sha/md5 function for each user. By doing this, you make the attack O(m*n) instead of O(m) (where m = the number of words in my dictionary and n = the number of users).

      And as you mentioned in a follow up post, this code only generates documents with identical md5 sums, it does not generate a document with a given sum. So MD5 is broken for document signing and the like, but secure for password hashing for the time being.

    2. Re:Q and A by PhiRatE · · Score: 3, Insightful

      No, a salt can be public knowledge, a salt is there to prevent the use of a pre-hashed dictionary, it does not help in the event of a matcher.

      Way back when the hash was unix crypt(), everyone got a little nervous about the idea that someone with a lot of computing power to burn might just fill up a large hard-drive with a map of plaintext->hash for the entire oxford dictionary for example. If they did that, then all they'd need to do subsequently to figure out the plaintext for any of the values they did the crypt()'ing for would be to look through the rive for the crypt value that matched the crypt value in the password file.

      The idea of a salt is simple, it changes the hashed result in a computably repeatable fashion, but one that requires that the entire hash has to be performed with the salt in mind in order to get the correct end result.

      So using a completely public bit of knowledge is fine. What is preferable is that the salt be random yet repeatable. That's why you see unix passwd files have the salt prepended to the hash (in crypt, it use to be the first 2 characters for example), you generate it randomly once, then whenever you want to see if the plaintext you've been given is the same, you grab the salt, hash the plaintext with it, and then check to see if the result matches the original hash.

      In summary, a salt protects you from the problem of a pre-computed hash dictionary, nothing else.

      --
      You can't win a fight.
    3. Re:Q and A by slashdotmsiriv · · Score: 2, Informative

      You got the whole deal wrong. The published attacks are not able to find a second preimage that is: for a certain y = h(x) find x' such that h(x')=h(x)=y.
      All this salt deal is completely redundant. Salts would make finding second preimages harder but it is already hard.
      Ppl need to realize that collision resistance is not the same as second preimage resistance. What the so called attack on MD5 does is finding collisions, random pairs that actually hash to the same value. The probability those pairs hash to hash values that you find in password records is negligible. It is almost as probable as trying random passwords until you find the one you are looking for.

      MD5 was designed to be both collision and second preimage resistant. These guys in China showed it is not collision resistant. That is a great theoretical result, but far from useful in practice. It remains second preimage resistant, so enough with this fake hype.

      Your passwords are completely safe if stored as hashes.

  27. Re:SHA1 (NO) by photon317 · · Score: 3, Informative


    MD5 is dead. SHA-1 is currently dying. They're both based on the same fundamental math, and the attacks on SHA1 are getting stronger and stronger. Now would be a really good time to get off of that entire family of hashes if you can (MD4, MD5, RIPEMD-* SHA-*, etc).

    The crypto world is in a little bit of a bind when it comes to strong hashes now. We simply don't have any left that both have a long proven track record of analysis and haven't been seriously compromised. Best bet, IMHO, is to go with a new-blood hash algorithm. They seem to be the answer we're looking for - but of course what they lack is more years of intensive study by the experts for flaws they themselves might harbor.

    So if you want to give them a whirl, I'd recommend looking at Tiger and Whirlpool:

    http://en.wikipedia.org/wiki/Tiger_(hash)
    http://en.wikipedia.org/wiki/Whirlpool_(algorithm)

    --
    11*43+456^2
  28. Doesn't Microsoft networking still use MD4? by Medievalist · · Score: 2, Interesting
    MD4 collisions can be found in a few seconds (but nobody uses that any more)
    I thought MD4-encrypted passwords were still flying around most of the world's MS-windows networks. I coulda sworn I had to explicitly turn them off on all my nets...
  29. breaking torrents? by OrangeTide · · Score: 4, Insightful

    Ah! That's a very good point.

    now if you you were a software company you could put torrents out (I assume they use blocks of MD5sum), and then after the torrent becomes popular start randomly seeding people with blocks that hash correctly but are complete garbage (since you can't pick what exactly you hash). if you do it right you would have games out there that would still mostly run. but would crash, or have garbled game data, etc.

    I'm not sure if this is really all that useful. but this exploit certainly seems to make it easy to do.

    --
    “Common sense is not so common.” — Voltaire
    1. Re:breaking torrents? by dougmc · · Score: 3, Insightful
      That's what I was thinking -- this being used to break torrents and other p2p setups.

      Though to be fair, most games seem to come in the form of a compressed archive of some sort -- either a bunch of .rar files (for warez) or a .exe file with .cab files (for Windows installers) or something similar. In that case, the corruption would be detected at installation, though it wouldn't be easy to determine from the torrent exactly which blocks are corrupt.

      In short, MD5 being broken (and now the code being available) is very bad. I expect to see the anti-piracy vigilantes jumping on this very quickly to create code to totally break bit torrent and similar things -- it would come up and look like a seed, but would be spewing garbage that the other clients couldn't detect as garbage. Of course, such poisoning would also end up being used to corrupt completely legitimate torrents as well, just because people think it's fun.

      (The fix? Update things like bittorrent to use hash routines that haven't been cracked yet, or to use multiple hash routines on the same blocks.)

    2. Re:breaking torrents? by swillden · · Score: 2, Interesting

      now if you you were a software company you could put torrents out (I assume they use blocks of MD5sum), and then after the torrent becomes popular start randomly seeding people with blocks that hash correctly but are complete garbage (since you can't pick what exactly you hash)

      Yeah, they could do that. They could even use the same trick as the guys who showed the Postscript exploit and make the software do different things based on which of the collision pair is in the file. Get the right version and the software runs fine. Get the wrong version and the software formats your drive.

      In case it's not clear, the attack is to write some code like:

      byte a[] = { /* One collision plaintext block here */ };
      byte b[] = { /* Other collision plaintext block here */ };

      // ...

      if (memcmp(a, b, sizeof(a)) == 0)
      {
      // Do something useful
      }
      else
      {
      // Do something malicious
      }

      The legit version contains the same value for a and b, but the bogus version contains different values. Both programs hash to the same value, but they behave differently.

      This sort of attack can work wherever you can compare blocks and then follow different codepaths based on the result. It could probably work with the scripting language that DVD menuing systems are implemented in, too, so it could work for DVD ISOs. Audio and video streams would probably not be vulnerable.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    3. Re:breaking torrents? by joecr · · Score: 2, Informative

      One Minor problem is that BitTorrent actually uses SHA not MD5 for it's hashing. Check out the following link for more information. http://dessent.net/btfaq/#hash-check

  30. You can spoof (almost) arbitrary documents! by ChaosDiscord · · Score: 4, Interesting
    Maybe I misunderstand but as I understand it MD5s are normally used in a checksum manner to sign or provide a fingerprint of a document. If you have an original document and compute it's MD5 then it can match some certified MD5 check sum. If someone were to generate a fake document they coul dnot design it to match the MD5 fingerprint. They could create some bit of gibberish that did match it but not a document that was useful as a forgery.

    Most document formats have lots of "dead space", parts you can pretty much modify at will without changing what the user actually sees. Comments in HTML or PostScript. Old junk data in Word documents. Executables can have just about anything you like added if you know your stuff. The MD5 attacks currently available only 128 "dead space" bytes to generate a collision. So far from being a gibberish document, one can generate almost any document you want. This page has a simple example with PostScript files. Both files have the same MD5 hash, but one is a relatively harmless letter of recommendation while the other is a grant of security clearance. Get your boss to sign your letter of recommendation digitally, swap in the security clearance file, and pass it on. This is a Big Deal and a Major Problem.

  31. Editors: Change the Links by theelemur · · Score: 2, Informative

    The Coral Cache is extremely slow. Please point the links to the original site www.stachliu.com/collisions.html as the html is small/static and the machine it's sitting on has a decent (100mb) connection. Thanks.

  32. Re:This cannot be used to crack passwords by cnettel · · Score: 2, Informative

    The program generates byte stream A and B, both having the same hash C. You can't input C and get an arbitrary B out (with this algorithm).

  33. I may not have been clear enough... by ^BR · · Score: 3, Interesting

    Or you may have a brain tumor...

    The generaly accepted definition for "cracking a password" is, given the encrypted password, being able to generate a password the once entered in the authentication system will generate the same encrypted form.

    For the second time, this attack does not permit that. This attack permits build two byte streams that hashes to the same value. So in a password context, assuming the password system permits the entry permits more than 1024 bits of arbitrary data to be entered as password (I don't remind the details well, but I think this is the amount of data that you must be able to change between the two streams) one could generate to "passwords" (being that long they don't qualify for this name anymore IMHO) that would hash to the same value.

    How does that amount to an attack on MD5 passwords again? Never at any point the attack had been being able given a unknown to him MD5 hash to produce a stream that would hash to the same value.

  34. SHA1 is not a good alternative in some cases by chongo · · Score: 3, Interesting
    SHA1 is not a good alternative in some cases. For details on the cryptographic hash problem, see my paper:
    SHA1 Cryptographic Hash Update
    My paper talks about the general problem at a high level. It gives a summary of common opinions expressed at the NIST Cryptographic Hash conference. Moreover, it gives developers some specific cryptographic hash recommendations.

    For the impatient, here is a summary for my recommendations for 2005-2006:

    • Avoid non-standard cryptographic hashes
    • Stop using MD5 now except for:
      • MD5 HMAC and MD5 hashed Passwords
      • Replace MD5 HMAC and MD5 hashed Passwords with SHA256 or SHA1 before end of 2007
    • Existing applications that use SHA1, where possible, should changed to use SHA256 before the end of 2008
      • For interoperability with older applications and hardware, these applications may have to also support SHA1
      • If you must support both SHA256 and SHA1, take care so that a "man in the middle" cannot inappropriately downgrade
    • Until a new Advanced Hash Standard (AHS) is adopted, new applications and hardware should be designed to use SHA256
      • For interoperability with older applications and hardware, these applications may have to also support SHA1
      • If you must support both SHA256 and SHA1, take care so that a "man in the middle" cannot inappropriately downgrade
    • All new applications and protocols must be designed to be algorithm agile
    • Existing applications and protocols should be modified to be algorithm agile by the end of 2008, if not sooner
    • SHA384 or SHA512 may be used in place of SHA256 in the above examples
      • Keep in mind that SHA384 and SHA512 are slower and larger than SHA256 or SHA1
    • Because it is possible that SHA1 will become unacceptably weak before 2008, and because SHA256 may become vulnerable to attack before Advanced Hash Standard (AHS) is adopted, a defense in depth approach must be taken

    See the paper for mode details.

    --
    chongo (was here) /\oo/\
  35. Re:MD5 is not an encryption algo by swillden · · Score: 2, Insightful

    For one: any system that uses one-way hashed password storage or key generation mechanism, as many sites use, is now some percentage easier to violate because multiple strings may resolve to same hash.

    It's the case with all one-way hashes that multiple strings hash to the same value. Since an n-bit hash function can only produce 2^n distinct hash values, if there are more than 2^n possible input plaintexts, than multiple plaintexts hash to single values. In fact, since MD5 and similar can hash inputs of any length, on average there are an infinite number of plaintexts that hash to any given n-bit value.

    The existence of this attack doesn't change the existence or distribution of collsions... we always knew they existed. It just means that we can find some of them. Ideally, we'd like it to be practically impossible to find *any*, even though we know they exist.

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  36. Re:MD5 is not an encryption algo by dotgain · · Score: 2, Informative
    You don't seem to understand... Having the MD5 hash of a piece of software and the ability to generate a collision for that hash will logically lead to more code which can take any dll/exe, inject malicous code along with whatever other random garbage characters are needed to produce the collision. Yes, the sizes will be different but even that can be fudged given the sophistication of today's hackers. You don't seem to read the article. We don't yet have the ability to create a plaintext that will map to a predetermined hash value.

    While we might discover that "Donald Duck" has the same hash as "a", we don't get to choose either. We can't say "gimme a plaintext that hashes to this: 0x4faed...", all we can do so far is say "find two plaintexts that will give us the same hash"

    Not that this doesn't mean this is a very serious matter. My take on the situation is: "Oh. Shit."

  37. Re:MD5 is not an encryption algo by Haeleth · · Score: 4, Informative

    You don't seem to understand... Having the MD5 hash of a piece of software and the ability to generate a collision for that hash will--

    Stop right there, because it's quite clear that the one who doesn't understand is you. Nobody has the ability to generate a collision for a given MD5 hash. All we have is the ability to generate two bits of random junk that share the same hash. This makes some attacks possible, but it does not make it possible for you to distribute a malicious version of someone else's software that has the same MD5 hash as their version.

  38. MD5 and digital signatures by dido · · Score: 2, Informative

    The main problem with MD5 as it's used today comes when MD5 is used as a component of a digital signature scheme. Most digital signature schemes based on public key crypto work like this:

    1. Generate hash of document to be signed
    2. Encrypt hash of document using signer's private key (this is the signature)
    3. Send document along with signed hash to whoever cares

    To verify a digital signature, the following is performed:

    1. Recompute hash of signed document
    2. Decrypt signature using signer's public key, producing calculated hash
    3. Compare computed hash and signed hash--if they match, signature is authentic

    Now, it's easy to see why this spate of collision attacks on hashing algorithms is so deadly. If, given some signed document, you can produce another document that verifies to the same signature, well, I guess you're in a world of hurt. If these documents happen to be public key certificates, well, the whole PKI more or less collapses. And well, here's a bit of news: someone has done just that with X.509 certificates based on MD5.

    --
    Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
    1. Re:MD5 and digital signatures by swillden · · Score: 2, Interesting

      And well, here's a bit of news: someone has done just that with X.509 certificates based on MD5.

      Ouch. I'm not sure how I missed that one. The first page of their paper shows that I greatly underestimated the severity of the attack. What I hadn't realized is:

      Due to the ability of Wang's method to produce MD5 compression function collisions for any IV, and due to the iterative structure of MD5, we can append a collision to any block of data of our choice (provided that the bitlength is a multiple of the MD5 block length), while maintaining the collision property.

      My understanding was that the collision generation algorithm only created relatively small blocks, both effectively random, and that appending those blocks to different chunks of data would produce different hashes, even though the blocks themselves hash to the same value. But apparently the method can produce collisions even with arbitrary, attacker-specified IVs. That means effectively that you *can* craft a collision to any file, even a pre-existing file, as long as the file format has some "dead" space at an appropriate offset and the changes you want to make can precede this location.

      This is significantly worse than I understood it to be. It still doesn't affect password hashing, but it affects many other uses of hashing. The exploit that showed a pair of different postscript documents with different visible content but the same hash didn't require the property that Lenstra et al used to create the certs, which led me to my erroneous assumption.

      Thanks for pointing out this attack.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  39. Re:MD5 collisions is a non-issue by slashdotmsiriv · · Score: 2, Informative

    "You can generate collisions at any point in a file simply by starting with the partial MD5 sum up to the point in the file that you want to insert the block at.
    " That's incorrect. MD5 operates over 512-bit blocks and uses state generated from the hash of the previous blocks to generate the new state for the current block. The state is actually the 4 words of the 128 digest.
    Here is the alg from wikipedia:
    init h0
    init h1
    init h2
    init h3
    for each 512-bit chunk of message
    break chunk into sixteen 32-bit little-endian words w(i), 0 i 15 //Initialize hash value for this chunk: var int a := h0
    var int b := h1
    var int c := h2
    var int d := h3

    //Main loop:
    for i from 0 to 63
    if 0 i 15 then
    f := (b and c) or ((not b) and d)
    g := i
    else if 16 i 31
    f := (d and b) or ((not d) and c)
    g := (5×i + 1) mod 16
    else if 32 i 47
    f := b xor c xor d g := (3×i + 5) mod 16 else if 48 i 63
    f := c xor (b or (not d)) g := (7×i) mod 16

    temp := d
    d := c
    c := b
    b := ((a + f + k(i) + w(g)) leftrotate r(i)) + b
    a := temp
    //Add this chunk's hash to result so far: h0 := h0 + a
    h1 := h1 + b
    h2 := h2 + c
    h3 := h3 + d
    var int digest := h0 append h1 append h2 append h3 //(expressed as little-endian) What these guys have done is to find 512 bit blocks or larger that return the same hash given the same initial state h0, h1, h2, h3. As u can see above, if you find a collision for state i, ho, h1, h2, h3 the same collision will not work for the different state MD5 will have say after processing a number of blocks. It is computationally feasible to find many collisions for the initial state but to find collisions for all possible states and to have these collisions occuring in the original file is very hard. So, no you cannot simply search the file for blocks that are among the blocks for which u found a collision and then simply replace them with their collision. The MD5 for which you found the collision and the MD5 that is used over the file are essentially two different functions and will not hash to the same value. As for the rest of your comments: I hope you realize that you don't just give their program an MD5 hash and it outputs two values that hash to that output. MD5 has been shown not to be collision resistant but it is still preimage and second preimage resistant. I explicitly said that we assume the signer is trusted, you did not need to say anything about trusted parties. If the signer is not trusted the whole signed code scheme is meaningless.

  40. MOD ME DOWN by swillden · · Score: 5, Informative

    The parent comment, which I wrote, was based on a severe misunderstanding of the extent of the capability of the attack. In particular, I didn't realize that the attack could find collisions even with arbitrary, attacker-specified IVs. What that means is that it is indeed possible to generate x.509 certificates containing different keys but the same MD5 hash (and therefore the same signature). In fact, it's been done.

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  41. How to calculate SHA-2 under GNU/Linux by YA_Python_dev · · Score: 3, Informative

    A lot of people have the programs md5sum and sha1sum installed, but they often don't have equivalent programs for the SHA-2 family. To calculate those you can use the command:

    gpg --print-mds [files]

    or you can download a dedicated program: http://www.ouah.org/~ogay/sha2/.

    --
    There's a hidden treasure in Python 3.x: __prepare__()
  42. gnupg configuration help by kabloom · · Score: 2, Interesting

    Is it advised to switch GnuPG from SHA-1 for signatures to SHA-256 (like immediately)? How can I make this configuration change?

  43. Adi Shamir's Hash Function *IS* 'unbreakable'.... by iamcf13 · · Score: 3, Informative
    Let Ron 'RSA' Rivest tell you why....

    (from material at the Pure Crypto Project - http://senderek.de/pcp/ )

    Quote below from http://senderek.de/pcp/pcp-security.html


    Adi Shamir once proposed the following hash function:

              Let n = p*q be the product of two large primes, such that
              factoring n is believed to be infeasible.

              Let g be an element of maximum order in Z_n^* (i.e. an
              element of order lambda(n) = lcm(p-1,q-1)).

              Assume that n and g are fixed and public; p and q are secret.

              Let x be an input to be hashed, interpreted as a
              non-negative integer. (Of arbitrary length; this may be
              considerably larger than n.)

              Define hash(x) = g^x (mod n).

    Then this hash function is provably collision-resistant, since
    the ability to find a collision means that you have an x and
    an x' such that

              hash(x) = hash(x')

    which implies that

              x - x' = k * lambda(n)

    for some k. That is a collision implies that you can find a
    multiple of lambda(n). Being able to find a multiple of lambda(n)
    means that you can factor n.

    I would suggest this meets the specs of your query above.

                      Cheers,
                      Ron Rivest

    Ronald L. Rivest
    Room 324, 200 Technology Square, Cambridge MA 02139
    Tel 617-253-5880, Fax 617-258-9738, Email



    The nice thing about Adi Shamir's hash function is that it, as well as the RSA cryptosystem co-created with Rivest and Len Adleman is all based on simple modular exponentiation.

    Too bad the Feds consider arbitrary precision mathematics used for encryption purposes to be 'a munition' and 'a controlled export'.... :(

    Years ago, they raked Phil Zimmerman over the coals over his email cryptosystem PGP then (eventually) left him alone.

    Can't cryptosavvy individuals secure the details of their affairs with strong encryption WITHOUT being hassled by 'the Man'?...

    P.S. However, Rivest came up with a scheme that gives you 'confidientiality *without* encryption' through a scheme he calls Chaffing and Winnowing

    Enjoy! :)

  44. Exactly! by 0xB00F · · Score: 3, Informative

    Because that would require the hashed password and a preimage attack.

    See here.

    Summary for those who are lazy:

    • This is a collision attack. The attacker will be able to find two messages that will produce the same hash, but the attacker cannot choose what the hash will be. So this rules out attacks on hashed passwords.
    • Since a collision attack can find to messages that will produce the same hash, it is possible to use this to break message signing, such as DSA and RSA. where a hash of the message is generated first and then signed cryptographically.
    • Collision attacks cannot be used to tamper with existing SSL certificates. It can be used to craft a CSR which will allow you to receive a server certificate with a collision to one containing a wildcard for the domain name and an expiration date far in the future. This by far is one of the most dangerous exploits because most CA's will issue certificates without completely verifying the identity of the requester.
    • Because of the way MD5 is used in SSL 3.0/TLS, these attacks do not affect it.
    • Collision attacks do not affect MD5 and SHA-1 when they are used in an HMAC. So even though a hash function can be broken by either by collision attacks, they can still be used safely in HMAC.
    • Tampering a signed binary is only possible using a preimage attack.
    • All will have naturally collisions. How exploitable they are depends on how easy it is to find those collisions.

    In conclusion, the value of this attack/exploit is only relative to how the hash function is used in an application. Just because this exploit and source code for it exists, that does not that these hashing algorithms are completely useless.