Practical Exploits of Broken MD5 Algorithm
jose parinas writes "A practical sample of an MD5 exploit can be found, with source code included,in codeproject, a site for .Net programmers.
The intent of the demos is to demonstrate a very specific type of attack that exploits the inherent trust of an MD5 hash. It's sort of a semi-social engineering attack.
At Microsoft, the MD5 hash functions are banned.
The main problem is that the attack is directed to the distribution of software process, as you can understand reading the paper, Considered Harmful Someday. Some open source programs, like RPM, use MD5, and in many open source distributions MD5 is used as check sum."
Now we know why people distribute modified game ISOs on the net and check it with md5 :P In all technicality, couldnt this mean that someone could land a virus on someone else's machine because the person trusted the hash?
Viable Slashdot alternatives: https://pipedot.org/ and http://soylentnews.org/
Isn't this the problem with all algorithms? Only way to know if something is something... is to see something instead of its checksum?
As for md5... with only 32bits, it should've came up with repetitive hashes in end anyways?
I guess since the article explains some issues against md5 security, the only answer would be to trust the source that is supplying the hash in the first place? Coming down to the fact that a system is only as secure as its user?
This isn't a problem for software distribution, really, since the good.bin file needs to start with a vector designed to enable a collision. A good-faith programmer wouldn't include that vector.
It is a problem for stuff like contracts; you draw up two versions of a contract, a good one and an evil one, let someone sign the good one, and later keep them to the clauses in the evil one.
So while there IS a very big problem, the example is a bit contrived.
Is this true for other popular hash functions?
No it is not. The newer hashes, such as Whirlpool, do not have this problem. You're correct in saying this is a "well known result" and every cryptographer worth his salt says that this fact constitutes a break of the algorithm. We've known since the middle of the nineties that breaking MD5 was within reach. The fact there has been so much inertia in getting people to change is quite incredible really.
At Toorcon this year, Dan Kaminsky showed a way to create two different webpages that render properly in a browser but have the same MD5 hash. Anybody who thinks this attack is theortical and ignorable is grossly mistaken.
Simon
Two pages, same hashes, etc. (This is the guy who wrote the MD5 someday paper.)
http://www.doxpara.com/t1.html
http://www.doxpara.com/t2.html
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
You almost got it right. The reverse problem of a checksum is to produce the original file that will give the checksum.
Actually, the functions are called hash functions, and they can be written as:
H(file) = h
H() is the hash function, h is the hash value (checksum) the reverse would be RH()
RH(h) = file
Now we can of course prove that reverse is not a unique function, since any fixed-length h has a limited number of possible values, whereas file can be of any length.
Therefore, reverse has many possible solutions, and in a simple case like the real checksum of bits (having two values 1 and 0) has a simple reverse algorithm, too. Given the value of h, produce the file == easy.
Complex hash functions like MD5 have the interesting property of difficulty of reversion : it is difficult to even produce any file that will compute to any given MD5 value, hence it is difficult to fake it.
It has been known all along that reverse is possible, and has many solutions, but it was thought they are too difficult to find. Now it has turned out that some can be, and were, found.
Not necessarily - it depends on how much the author is trusted to begin with. Certain types of software are very closely checked by the open source community, and any trojan will be discovered if it exists in the package.
Say I write a bit of security software (for which most people take the time to compare checksums). As a relative nobody, lots of people are going to scrutinise the source code before using it. Any new release will also be checked. Only after it's been scrutinised and built up a reasonable userbase is it worth switching it for the evil version - otherwise, the evil version would be discovered early and nobody would use it.
Someone like Schneier could probably release a trojan directly and people would install it without thinking (I trust he's too responsible for that, though). For the rest of us, this gives a feasible way to sneak in evil code without anyone checking it.
``If you contract your file from x bytes down to a fixed size, no matter what algorithm you use, you will always have collisions.''
Yes.
``Unless you start to give your hash keys as the same size as the original file, there is not anything that can be done about it, ever.''
No. If I run deflate or some other compression algorithm on a file, chances are it will come out smaller. Still, the compressed file maps one to one with the original.
Please correct me if I got my facts wrong.