Generating Fast MD5 Collisions With ATI Video Cards
An anonymous reader writes "Yesterday at Black Hat USA 2009, a talk entitled
MD5 Chosen-Prefix Collisions on GPUs
(whitepaper) (Both PDFs)
presented an implementation written in assembly language for ATI video cards that achieves
1.6 billion MD5 hash/sec, or 2.2 billion MD5 hash/sec with reversing,
on an ATI Radeon HD 4850 X2. This is faster than the much-publicized 1.4-1.9 billion hash/sec figure that was
supposedly reached on a PlayStation 3 by Nick Breese at Black Hat Europe 2008 (he
later noticed an error in his benchmarking tool). Compared to the cluster of 215 PlayStation 3s that was used to
create a rogue CA in December 2008,
Marc Bevand claimed a cluster of 12 machines with 24 video cards would be
a bit faster, consume 5 times less power, and be 10 times cheaper."
Generated with the help of an ATI card, I assume.
If all you want is a signed SSL certificate, I suspect it would be easier to bribe an employee at a CA to skip a few steps when validating you.
Somewhat off-topic, but I guess related all the same...
Nobody should use MD5 for authentication and whatnot... and even as a 'checksum' of sorts you have to be careful (i.e. make sure that the source of the MD5 text/file isn't the very same source as the file it was generated for, as a compromised file probably means the MD5 string would be equally compromised).
But I'm curious.. are any of the attacks capable of injecting new data that..
1. doesn't affect filesize - the wiki mentions that successful attacks can prepend and append, but presuming you'd include the file size with the MD5 string, that would be another parameter to check
2. actually does something.. be it useful or nefarious, rather than just crash the app or insert gibberish in a text document, etc.
e.g. if I took the declaration of independence as a .txt file, are there any attacks that could subtly, or non-subtly, change the wording without increasing or decreasing the size of the file, and still match an original MD5?
--
On-topic: cool; but not particularly new? Most everybody knows that GPUs are great at taking in a tiny bit of data, crunching it, and spitting a result back out. Kudos for actually writing optimized code for the given platform (in this case an AMD/ATi GPU), but it's still the same number crunching instead of an improved method.. correct?
or 2.2 billion MD5 hash/sec with reversing
Keep in mind I have completely no idea what "reversing" means.
ATI cards are programmable, Brook+ is just a little too high level for writing simple computational kernels (you drop too much performance) and CAL too low level for most people (it's basically assembly). So generally people just stick to CUDA, even in the few cases where ATI's architecture is superior.
This problem is ideal for ATI, very little input necessary (NVIDIA has more texture samplers) and no inter thread communication necessary (ATI does not have random writes on it's local data share at the moment, making that communication harder than it is with NVIDIA). So basically it just comes down to FLOPS and ATI wins big there.
Basically this was done in CAL because it was done by a hacker and not by an academic researcher (who doesn't really care about performance if he can just as easily get his paper published on a slower GPU with less effort, easier in fact since editors know CUDA).
The numbers don't add up no matter how I turn them. He claims to be getting 14% more performance from each graphics card than from each PS3. That means he need 12 machines with 24 graphics cards each to match the speed of a 215 node PS3 cluster. So because he get 14% more performance per node, he only need 34% more nodes to achieve the same performance. That does just not make sense to me. The 24 graphics cards in each machine also sounds unlikely. Maybe it was 24 in total, so 2 per machine. In that case 14% more performance per node means he need 89% fewer nodes. That does not make sense either. So, how are the numbers supposed to be interpreted?
I don't understand why anybody still finds it newsworthy when somebody come up with faster collision attacks against MD5. We already know, that collisions can be generated for MD5, and they can be generated fast enough, that we have to worry about it. It no longer matters exactly how fast they can be generated. If somebody managed to come up with a practical second preimage attack against MD5, then it would be newsworthy.
Do you care about the security of your wireless mouse?