Slashdot Mirror


Practical Exploits of Broken MD5 Algorithm

jose parinas writes "A practical sample of an MD5 exploit can be found, with source code included,in codeproject, a site for .Net programmers. The intent of the demos is to demonstrate a very specific type of attack that exploits the inherent trust of an MD5 hash. It's sort of a semi-social engineering attack. At Microsoft, the MD5 hash functions are banned. The main problem is that the attack is directed to the distribution of software process, as you can understand reading the paper, Considered Harmful Someday. Some open source programs, like RPM, use MD5, and in many open source distributions MD5 is used as check sum."

18 of 253 comments (clear)

  1. Checksums are always going to be vulnerable by LiquidCoooled · · Score: 3, Insightful

    If you contract your file from x bytes down to a fixed size, no matter what algorithm you use, you will always have collisions.

    Unless you start to give your hash keys as the same size as the original file, there is not anything that can be done about it, ever.

    --
    liqbase :: faster than paper
    1. Re:Checksums are always going to be vulnerable by Ckwop · · Score: 5, Insightful

      If you contract your file from x bytes down to a fixed size, no matter what algorithm you use, you will always have collisions. Unless you start to give your hash keys as the same size as the original file, there is not anything that can be done about it, ever.

      While this technical correct it is slightly misleading. The aim of a hash function is to make it has hard as possible to find a collision. For an n-bit has it takes roughly 2^(n/2) operations to find a collision. Any attack faster than this is considered a break of the hash function. It typically takes less than five minutes to break MD5 so it is horribly broken.

      While removing the possibility of collision all together is provably impossible you can design a hash function for which finding a collision is computationally infeasible. The standard size to achieve this today is 256-bits and the better designed functions like Whirlpool use this hash length.

      Simon

    2. Re:Checksums are always going to be vulnerable by LentoMan · · Score: 2, Insightful

      Use of two or more different checksums algorithms would make it substantially harder to create a file with a fake hash.

    3. Re:Checksums are always going to be vulnerable by m50d · · Score: 2, Insightful
      No, it wouldn't, *sigh*. Stop posting this, I'm tired of correcting it.

      As another reply said, however, doubling the bit count would improve security. But, a simple double-MD5 would have exactly the same problems. Therefore, the solution is to use a longer and more secure hash, like, for example, SHA-256.

      --
      I am trolling
    4. Re:Checksums are always going to be vulnerable by ajs · · Score: 2, Insightful

      You are looking at this incorrectly. The problem is that, given file A, which has hash X, if it is possible to create file B which has hash X in a non-exaustive way, then the hash is weak. Most such techniques work well regardless of file B being a from-scratch random pile of data or a combination of well-formed data plus random data. So, for example, you might have an RPM with 3 files in it. Add a fourth file, and start permuting to find your similar hash.

      Now, as you point out, we're not yet to the point that this is trivial with MD5 (it has always been possible, just not practical). So, the only valid thing to do is to move to something more secure as soon as possible so that when we finally do discover a trivial exploit for MD5, we no longer care as much about it. This also helps us by forcing software vendors and projects to re-consider how they use hashing, and to make it easier to modularly replace the hashing functions they use. You will note that all of the oldest such software has already had to do this, and thus the task will be easier for them.

  2. Re:H(x) == H(y) - H(x + q) == H(y + q) ? by igb · · Score: 4, Insightful
    It's almost universal to use the byte count as part of the checking of equivalence, either by storing it as a distinct item or by using it as inital or final salt to the calculation of the hash.

    ian

  3. Re:So if you need a freely available hash algorith by thomasdn · · Score: 2, Insightful

    AFAIK there are no known vulnerabilities or attacks for these two yet

    I am no cryptography expert so I can not read and understand those algorithms. But the fact that there are no known vilnerabilities for an algorithm doesn't make it secure.
    Maybe they are just not used as much as other well known algorithms. And therefore nobody has found vulnerabilities for them yet?

  4. Re:File Integrity Checkers ? by ArsenneLupin · · Score: 3, Insightful
    So the next time someone installs a root kit, he just needs to do it in a way TFA points out.

    Wouldn't work. TFA points out a way to generate to packages, a good one, and an evil twin. Both are constructed by the algorithm.

    It does not however show how to make an evil twin for an existing package. This considerably lessens its danger.

    Think about it: the only guy who could pull this off would be the original author. But then, the original author could achieve that same goal much more easily than by "breaking" MD5.

  5. Don't panic yet by Anonymous Coward · · Score: 4, Insightful

    While this is a very serious attack, it doesn't yet mean that every use of MD5 is totally unsafe.

    It is feasible now to generate 2 different pieces of data with the same MD5 hash. As many file formats allow one to embed invisible 'junk', it is possible to create a 'good' version and an 'evil' version with the same MD5 hash.

    BUT it is NOT (yet) feasible to create a piece of data with a given MD5 hash. This means that if you can not modify the 'good' version, you can't create a matching 'evil' version.

    An example of where the usage of MD5 isn't broken are *nix passwords. Your password is hashed with MD5 (a salt is added to your password too, but that's not important here), and that hash value is stored. Anyone who can supply a password (doesn't have to be the same one!) which has the same MD5 hash, is allowed access.

    So if it would be feasible to generate data with a given MD5 hash, one could easily generate a matching password, when given the MD5 hash (which you can often easily acquire, especially in NIS/yp environments). But luckily, this is not possible ... yet :-)

    So you'd better start protecting those hashes (that's what a shadow password file does), en better yet, move to a better algorithm. Like the Blowfish algorithm that OpenBSD has been using for years now.

    1. Re:Don't panic yet by Haeleth · · Score: 3, Insightful

      An example of where the usage of MD5 isn't broken are *nix passwords. Your password is hashed with MD5 (a salt is added to your password too, but that's not important here)

      Actually, the salt is extremely important. A large proportion of unsalted MD5 passwords can be cracked trivially by looking them up in pregenerated databases. MD5 alone is broken for passwords - it's only the salt that makes MD5 passwords still "good enough" for use on low-security machines... for the time being.

  6. Why it IS a problem by Moraelin · · Score: 3, Insightful

    The problem is that a broken algorithm just makes that piece of social engineering a lot easier.

    If I just told you you can download the latest auto-installer of the latest WoW patch from www.i-pwn-ur-puter.ru instead through the slow Blizzard installer, you might think "uh, wtf, I think I'll play it safe anyway and get it directly from Blizzard. I trust them more than I trust a warez and script-kiddie site."

    Now picture that I tell you "and here's a link to the MD5 sum on Blizzard's site. You can check for yourself that the the file on our site is the original file and it hasn't been tampered with." In fact, I would even _urge_ you to make a habit to check all your downloads against the original MD5 sums, for your own good.

    It already looks a lot safer and more legitimate. Well, maybe not to _you_, but to a lot of people it does.

    That's the whole problem. That false sense of security makes the "if we can convince you to run our insecure extractor code" part a helluva lot easier.

    --
    A polar bear is a cartesian bear after a coordinate transform.
    1. Re:Why it IS a problem by archeopterix · · Score: 3, Insightful
      Now picture that I tell you "and here's a link to the MD5 sum on Blizzard's site. You can check for yourself that the the file on our site is the original file and it hasn't been tampered with." In fact, I would even _urge_ you to make a habit to check all your downloads against the original MD5 sums, for your own good.

      It already looks a lot safer and more legitimate. Well, maybe not to _you_, but to a lot of people it does.

      That's the whole problem. That false sense of security makes the "if we can convince you to run our insecure extractor code" part a helluva lot easier.

      The scheme described in the article enables you to: take a good file and generate:

      A. Another "good" file (one that generates the same exec while extracting).
      B. A "bad" file.
      Such that hash(A)=hash(B)

      For this scheme to work, you would first have to convince Blizzard to use your "good" file A for distribution (more exactly: computing the published hash). Hey Blizzard, I have a file that extracts the same files as your distribution, only has a different hash value, why don't you replace your file with my file? "Helluva lot" easier than just convincing them to distribute your "bad" file? I don't think so.

  7. Re:hashtrust by SillyNickName4me · · Score: 2, Insightful

    Any practical use of hash algorithms is extremely easy to fool with social engineering. All you have to do is get someone to trust your hash..

  8. Re:Not a problem for software distribution.. by provolt · · Score: 2, Insightful

    This is certainly a problem with software distribution. Here's an example of why. The good-faith programmer creates good.bin. A Bad-faith programmer creates bad.bin with the same hash. All the bad-faith programmer has to do is find a way to swap bad.bin in for good.bin (hacking, misrepresentation, etc).

    Now everyone who gets bad.bin will hash the data, verify the signature and see that the signature verifies. When the signature verifies, everyone believes bad.bin is really good.bin.

  9. Banning MD5 is stupid and small minded by ivan256 · · Score: 3, Insightful

    It's dumb to ban MD5. It is still, and will continue to be a useful tool in situations where a cursory comparison is needed for reasons other than security. It would be silly, stupid even, to use a larger hash that requires more (slower) computation in these situations. Banning MD5 can mean one of two things: Either it's a stupid publicity stunt that will result in slower code overall, or that Microsoft doesn't trust it's developers to be smart enough to know when a particular hash is appropriate.

  10. replace both the "Evil file" AND the extractor ? by Jedi_Gunsmith · · Score: 2, Insightful

    I just read the article, and i was thinking :If we can replace the extractor, then why bother creating an evil file with the same checksum ? The extractor can do the "evil patch" at extraction. The method in this article isn't very usefull for evil purposes.

  11. About this attack by Ernesto+Alvarez · · Score: 3, Insightful

    I've been reading the posts in this thread and I've noticed that there are two types of posters here: the ones who got it 100% right, and the clueless ones (there appear to be little or no posters in the middle ground).

    Now, the clueless ones are thinking of lots of "attacks" using this vulnerability, some of them really wrong. Since this has the potential of getting lots of people to do stupid things (like not trusting MD5 when they should), let's talk a little bit about the vulnerability and its effects.

    First of all: this is not new. There was an article here explaining the same attack a few months ago (about x.509 certificate collisions and how to fake postscript orders, if you know what I am refering to, please post a link).

    The attack goes like this:

    You have a block B1 that is known to collide with another block B2.
    You have some custom made code that looks like this:

    -----BEGIN SNEAKY CODE---------------
    If DATA[1] = DATA[2] then
    do something good
    else
    do something bad
    end
    DATA[1]
    DATA[2]
    -----END SNEAKY CODE-----------------

    The trick is that since there's a collision between B1 and B2 and MD5 makes the hash by reading sequentially, the hash for the whole program will be the same whether you fill DATA[1] and DATA[2] with B1 or B2 (in any combination). Since the code is DESIGNED to do different things depending on the collision area, by changing the contents of DATA[1] and DATA[2] you can have programs that do "good" or "bad" things, with the same hash. Please note it's been DESIGNED with that in mind.

    From now on I'll talk on absolute terms, while in reality there is a very small probability of things being right for an attack without being planned that way, so keep in mind that before saying "but that's not the whole truth.....".

    Now let's discuss what's possible to do and what's not:

    1.Oh no! Now, someone will create a virus that has the same hash than my favorite app!

    False: the app (or installer) would have to have been designed with that "feature" in advance.

    2.MD5 is worthless and should not be used anymore.

    False: MD5 is useless in the situation presented above. There are some very good uses of MD5 that are safe (like access control: this attack does nothing practical to you salted MD5 shadow file). MD5 should probably be watched for other undesirable properties, though. An alternate cryptographically secure function should be kept in reserve.

    3.I'll use another hash function, I'll be invulnerable to this attack.

    (somewhat) False: You'll be invulnerable until someone finds ONE collision in your new hash function (it might take a long time but....). Then you'll be vulnerable again. But now we all know what can be done with ONE collision. What you're thinking is probably good, but it's no silver bullet.

    4.Microsoft will forbid the use of MD5 and DES, and use SHA-1 and AES. We should do the same.

    (somewhat) True: Not for the reasons you're thinking though. If MS is really doing this, this attack is a lame excuse to do it. MD5 is still useable for some things, and SHA-1 is not much better than MD5 in the things related to this attack. IIRC these collisions were found using an attack derived from an attack on SHA-1. Right now, SHA-1 collisions can be found in 2^63 operations (and the clock is ticking). We should probably consider using a new hash function someday, but leave the decision to the cryptologists. About AES, it's about time. DES can be brute forced in reasonable time, and that's been like that for a few years. 3DES is slow. That's the reason for the AES contest, we should use since we have it.

    5.Someone could distribute some sort of binary and the switch it so it does lots of damage to unsuspecting people.

    True: That's exactly what the attack is about. Maybe you were wrong to trust [insert a name here].

    6.Who should be doing what and when?

    If you work in crypto, you probably k

  12. Re:Yadda Yadda by NightHwk1 · · Score: 1, Insightful

    Both pages are exactly the same, but the javascript output changes depending on the URL used. This doesn't have anything to do with MD5, and it's hard to see how this would count as an exploit.