Slashdot Mirror


Finding MD5 Collisions With Chinese Lottery

Stanislav Shalunov writes "Jean-Luc Cooke posted a Usenet article describing a distributed webpage-based effort (Chinese Lottery) to find a collision in the MD5 function. All you need to do to participate in the effort is visit the URL that loads the code. The author comments: 'What is interesting about this approach - when we reach final release stage - is that any website that adds this small snippet of code to their pages will have their visitors working on the problem for the duration of their visit to the site'."

69 of 303 comments

  1. Uhh.. by TCM · · Score: 5, Insightful

    From the link:

    You run an Applet, it reports to us the search results. Distributed computing without installing anything...and without people knowing you're stealing their idle CPU time. ;)

    I don't know about you but I wouldn't lean out the window with the fact that I'm stealing from others.

    Idle CPU time might be unused but I still want to know what my box is doing and why.

    --
    Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
    1. Re:Uhh.. by Phillup · · Score: 4, Insightful

      I personally wouldn't call it "stealing". You pretty much agreed to run Java. Yes, you could be a clueless noob and not *know* that your browser has it enabled... but, nobody is *making* you run Java applets.

      I don't shove it down your pipe... you ask for it.

      Of course this line of reasoning could be extended too far... like the case of all the porn pop-ups... but, even there... I tend to feel that the user is ultimately in control (or should be!) of their own computer. Install Mozilla and don't suffer the pop-ups.

      Better yet... and this is the approach I myself practice... go away. Any time I find a site that ticks me off (bad Java/JavaScript that causes browser naughtiness), I add them to my banned list on my proxy... and never have to suffer the site again.

      Not even unintentionally.

      ---

      Not only that... but my CPU monitor went to a hundred percent.

      Yeah, it is a low priority thread... but... I did notice.

      P.S. "you" does not mean you personally...

      --

      --Phillip

      Can you say BIRTH TAX
    2. Re:Uhh.. by cmallinson · · Score: 3, Insightful
      I personally wouldn't call it "stealing". You pretty much agreed to run Java. Yes, you could be a clueless noob and not *know* that your browser has it enabled... but, nobody is *making* you run Java applets.

      I don't shove it down your pipe... you ask for it.

      OK, come on. Leaving Java enabled is a very poor definition of "asking for it". What percentage of internet users know the difference between Java and JavaScript, and can determine which one if any should be turned off or on? I would say less than 1-2%. Taking advantage of the rest is just not cool.

    3. Re:Uhh.. by Phillup · · Score: 2, Insightful

      I do understand what you are saying.

      But, at the end of the day... you look in the web server logs and you see a request from a computer asking for a Java applet.

      What is it supposed to do... somehow know that the person in front of the browser was not smart enough to really make the call?

      At some point you have to say that a valid request was made... and honor the request.

      --

      --Phillip

      Can you say BIRTH TAX
  2. Oh, lovely, distributed Javascript computing by Anonymous Coward · · Score: 5, Interesting

    Perhaps we could tie this to some sort of micropayment system. You come do distributed work on my website, and you get to view it. Some third party pays me for the cycles, and I have a new revenue stream!

    1. Re:Oh, lovely, distributed Javascript computing by illustir · · Score: 3, Insightful

      Why don't the slashdot editors who put this online embed the code in the story page? That way the slashdotting would have some use at least.

      --
      -- Alper
    2. Re:Oh, lovely, distributed Javascript computing by sinistral · · Score: 2, Informative

      It's not JavaScript, it's Java. Despite the names, they're vastly different.

  3. Are there any known MD5 collisions today? by GGardner · · Score: 2, Interesting

    Last time I looked into this, which was several years ago, there were no known different strings which had the same MD5 hash. I thought this was remarkable. Are there any known ones today?

    1. Re:Are there any known MD5 collisions today? by mattdm · · Score: 4, Funny

      Well, if there were, that'd make the question this project is trying to answer remarkably easy.

    2. Re:Are there any known MD5 collisions today? by iggymanz · · Score: 2, Interesting

      More accurate to say it's very unlikely that two strings have the same md5 value - but raise two to the power of the number of bits in an md5 hash, and there's at least a one-in-that-many probability that two strings will have the same hash. Of course, the question is whether, with real-world strings, it is even more likely than that huge 1:n number that 2 will match??? Hence this project, which I don't think is an ethical or good way to find out.

    3. Re:Are there any known MD5 collisions today? by ilsa · · Score: 2, Interesting

      Reason #83 that MD5 is an inadequate method of identifying MP3s. Hashsums are only "practically unique."

      --
      -- I Am Not A Terrorist.
    4. Re:Are there any known MD5 collisions today? by lostchicken · · Score: 2, Informative

      It can't be reversed. That's the point of MD5.

      However, it is trivial to prove the fact that there are strings that have the same MD5 hash due to the fact that you can't represent 2^65 different numbers with only 2^64 keys.
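
A minimal sketch of that counting argument, shrunk down so it runs instantly. It assumes a toy 8-bit hash (`tiny_hash`, a made-up helper that keeps only the first byte of an MD5 digest), so 257 distinct inputs cannot all land on different outputs:

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """Toy 8-bit hash: the first byte of the MD5 digest (hypothetical helper)."""
    return hashlib.md5(data).digest()[0]

def forced_collision(n_inputs: int = 257):
    """Hash n_inputs distinct strings and return the first pair that collides.

    With 257 distinct inputs and only 256 possible outputs, a collision
    is guaranteed by the pigeonhole principle.
    """
    seen = {}  # hash value -> first input that produced it
    for i in range(n_inputs):
        msg = str(i).encode()
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg
    return None  # only reachable when n_inputs <= 256

a, b, h = forced_collision()
print(f"{a!r} and {b!r} both map to {h}")
```

The same argument applies to the full 128-bit output; the only difference is that nobody can afford to walk through 2^128 + 1 inputs.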

      --
      -twb
    5. Re:Are there any known MD5 collisions today? by jrstewart · · Score: 2

      Umm, the fact that hashsums are only "practically unique" isn't why they're an inadequate method of identifying MP3s (and that's not what Schneier is saying in the article). The reason they're inadequate is that depending on what encoder you use and the settings there will be a bunch of different MD5s of the same song.

      The RIAA could get around this by setting up a battery of tools to try to get all of the relevant hashes, but it would be possible to create encoders that perturb the compression process to get different bits in the file while sounding essentially the same. A trivial way to do this would be to watermark your MP3s with random data.

    6. Re:Are there any known MD5 collisions today? by spongman · · Score: 2, Insightful

      Moreover, most programs that hash MP3s fail to exclude the ID3v1/ID3v2 tags, so it's pretty simple (and common) for MP3s with different hashes to sound exactly the same.
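
A rough sketch of the tag problem, assuming only the trailing ID3v1 layout (a 128-byte block at the end of the file that starts with `TAG`). The "audio" here is fake filler bytes, and `strip_id3v1` and `make_tag` are hypothetical helpers, not anything a real file-sharing tool is known to use:

```python
import hashlib

ID3V1_SIZE = 128  # an ID3v1 tag is the last 128 bytes of the file, starting with b"TAG"

def strip_id3v1(data: bytes) -> bytes:
    """Drop a trailing ID3v1 tag if one is present."""
    if len(data) >= ID3V1_SIZE and data[-ID3V1_SIZE:-ID3V1_SIZE + 3] == b"TAG":
        return data[:-ID3V1_SIZE]
    return data

def make_tag(title: str) -> bytes:
    """Build a fake 128-byte ID3v1-style tag (only the title field is filled in)."""
    return (b"TAG" + title.encode("latin-1")[:30]).ljust(ID3V1_SIZE, b"\x00")

audio = bytes(range(256)) * 4          # stand-in for the encoded audio frames
file_a = audio + make_tag("Song")
file_b = audio + make_tag("song (ripped by me)")

whole = lambda d: hashlib.md5(d).hexdigest()
print(whole(file_a) == whole(file_b))                            # whole-file hashes differ
print(whole(strip_id3v1(file_a)) == whole(strip_id3v1(file_b)))  # audio-only hashes match
```

Tags stored at the front of the file would need their own handling; this only covers the trailing case.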

    7. Re:Are there any known MD5 collisions today? by The+Snowman · · Score: 2, Interesting

      Last time I looked into this, which was several years ago, there were no known different strings which had the same MD5 hash. I thought this was remarkable. Are there any known ones today?

      MD5 is a hash. Hashes have three defining characteristics. First, the same input always produces the same output. Second, a small change in input produces a large change in output. Third, collisions are relatively rare -- it should be uncommon for two input strings to produce the same output string. Of course, with 2^128 output values and an infinite number of input values, there are an infinite number of inputs that produce the same output, theoretically.
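
Those three properties are easy to poke at with Python's standard `hashlib`. This is only an illustration on a couple of short inputs; `md5_hex` and `differing_bits` are helper names made up for the sketch:

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """MD5 digest of data as 32 hex digits."""
    return hashlib.md5(data).hexdigest()

def differing_bits(x: bytes, y: bytes) -> int:
    """Count the bits that differ between the MD5 digests of x and y."""
    dx = int.from_bytes(hashlib.md5(x).digest(), "big")
    dy = int.from_bytes(hashlib.md5(y).digest(), "big")
    return bin(dx ^ dy).count("1")

# 1. Same input, same output.
print(md5_hex(b"abc") == md5_hex(b"abc"))

# 2. A one-character change flips many of the 128 output bits.
print(differing_bits(b"abc", b"abd"), "of 128 bits differ")

# 3. The digest is always 128 bits (32 hex digits), whatever the input size.
print(len(md5_hex(b"")), len(md5_hex(b"x" * 10_000)))
```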

      Anyway, there are a few strings that produce identical outputs, using two dictionary words. I cannot find them at this moment, although I know where I saw them. Google and on-site searching mechanisms aren't helping. Oh well. I tried.

      --
      24 beers in a case, 24 hours in a day. Coincidence? I think not!
    8. Re:Are there any known MD5 collisions today? by Tom7 · · Score: 3, Interesting

      Considering there are an infinite number of strings that will map to a single MD5

      That's probably, but not necessarily, true.

      I'd say there is a chance we'll find one sooner or later.

      Yeah, it's about 1 in 2^128. There aren't even enough electrons in the universe to write down all the possible MD5 hashes, not to mention the strings that might hash to them.

  4. Re:How do I add this to my site? by coene · · Score: 2, Informative

    Just embed the applet into your HTML, view the source of that page - you'll get it.

  5. That's really interesting... by herrvinny · · Score: 5, Informative

    That's a really interesting way of doing it. For the people who don't know, here's a quick explanation:

    Java Applets, because of the sandbox they're run in, can't open up a network connection to any website, except for the website they came from. Presumably, what they're doing is creating a small Java applet, that when loaded, executes some logic, then opens up a network connection back home and sends the results.

    Fascinating. This way, you don't have to bother installing something and hope it doesn't fsck up your computer. It might be slightly less efficient than a dedicated, installed program, but this way, they can harness the power of a computer just casually browsing a web page. Very innovative.

    1. Re:That's really interesting... by herrvinny · · Score: 2, Interesting

      It's run in a sandbox, and the sandbox is pretty restrictive. No writing to the hard drive, no network access other than connecting back to the website the applet came from, a requirement that all applet created windows have a "WARNING: APPLET WINDOW" box on the bottom, etc. And the process of signing an applet is downright screwy and often doesn't work for all platforms.

    2. Re:That's really interesting... by Rich0 · · Score: 2, Informative

      Keep in mind that many websites use two-way communications with a Java applet. How is this a privacy violation?

      A Java applet can't see what you're doing on your computer. It can't see your hard drive. It can't see what other processes are running, etc. It can only communicate within the confines of the browser window and well-marked pop-up windows that it can spawn. Security is enforced by the local JVM - which the user installs from a trusted source.

      Java was designed the "right way". This isn't ActiveX - in which an applet can rummage through your files and send a copy of every one of them to whoever the applet author wants. Java applets run in a sandbox and can only execute a subset of the full Java language.

      There really isn't anything to see here... Move along...

  6. Whoever made this... by coene · · Score: 2, Interesting

    Make sure to take out the warning message "ok fine then, you don't want cookies..." that pops up when you disallow it yer cookies (buy yer own thx!). This was surely a debug message, it's not useful anymore ;)

  7. bitch, bitch, bitch by Anonymous Coward · · Score: 3, Funny

    First thing it did when the applet loaded was to bitch at me for not accepting cookies. Just like my wife.

  8. Not ethical by Bill_Royle · · Score: 3, Insightful

    I respect the effort and ingenuity, but the rationale that "hey, we're helping solve a problem" somehow justifies stealing someone else's resources... it's just wrong.

    Be upfront with people - tell them why it's so important, what can be accomplished with it, and what it does. You'd be surprised - people might help out of *gasp* the goodness of their own hearts. A good example might be SETI, etc.

    1. Re:Not ethical by pla · · Score: 4, Interesting

      I respect the effort and ingenuity, but the rationale that "hey, we're helping solve a problem" somehow justifies stealing someone else's resources... it's just wrong.

      Although letting visitors know about this would certainly seem nicer, I don't think I'd actually consider it as outright unethical.

      For one thing, considering the number of websites out there that try to feed outright malicious code into our browsers, this looks very very tame by comparison. It uses a few CPU cycles, but has no long-term effects on the visitor.

      For another, this seems no different than sending the visitor a few banner ads - Just a way of "paying" for the content. For most of the world, bandwidth costs far more than CPU time, so in effect, this "charges" the user less per visit than most advertisements. From some quick n' dirty calculations, the bandwidth for 35k of banner ads costs me 0.082 cents, while the electricity for a full hour of CPU time (on a PIII/933) costs me only 0.0045 cents... Literally 18 times more.


      Finally, I can (and do) keep Javascript disabled in my browser. Advertisements, on the other hand, I do my best to block, but a few still manage to sneak through.

    2. Re:Not ethical by Phillup · · Score: 5, Insightful

      While I completely agree with your sentiment about being upfront... I don't agree with calling it "stealing".

      Who clicked on the link?

      Who has Java enabled on their browser?

      Who has cookies enabled on their browser?

      It isn't like he is doing anything "tricky" or using some "bug" to pull this off. The page doesn't "trap" you. It doesn't eat your CPU and make it impossible to quit the app or go to another page. And, for me, it didn't crash anything.

      I *really* don't understand how this can even remotely be considered stealing. Every single item is being used *as*designed* both by the web author and you.

      The way I see it... someone jumped in a pool... and now they are bitching about their clothes being wet?

      --

      --Phillip

      Can you say BIRTH TAX
  9. Not very intensive. by LoneIguana · · Score: 4, Informative

    It certainly isn't using very many cpu cycles, the OS reports that my webbrowser is using less than 1% of the available cpu power

    1. Re:Not very intensive. by smart_ass · · Score: 2, Interesting

      With Mozilla I got the same ... but when I opened it up in IE 6.0 it hogged all resources.

      --
      Ouch ... did I just say that.
    2. Re:Not very intensive. by LucidityZero · · Score: 2, Funny

      No, no. You're wrong. IE 6.0 just hogs all resources by default.

      --
      Sig.
  10. ./ effect = benefit?? by bluelip · · Score: 4, Funny

    put the snippet on slashdot.org. The collisions should all be found within an hour.

    --

    Yep, I never spell check.
    More incorrect spellings can be found he
  11. Re:Would be great for LOTR by deadsaijinx* · · Score: 3, Insightful

    Have you ever tried even using a dedicated renderfarm? The complications that can arise if you don't have all the textures and files locally, not to mention the fact that rendering is so heavy a tax on the CPU people would NEVER want to do it. Plus, that would involve them releasing files that go into making the movie. And so on and so forth. The idea is so terrible I couldn't imagine anyone ever trying it. Peace out and try to talk about something you know for once.

    --
    YOU SUCK BALLS!
  12. Normal Thread Priority by cybermancer · · Score: 4, Funny

    Interesting idea, but most distributed computing tasks that run in the background run at low priority. Since this is running inside your browser (more or less) it will run at the priority of the browser. Unless your browser is running at low priority then this process will push all the lower priority processes out of process cycles.

    This could prevent contact with ET!

    --
    "Anything is possible with enough programmers, time and pizza." (Substitute caffeine for time as needed.)
    1. Re:Normal Thread Priority by mlk · · Score: 5, Informative

      Java applets run as a different process to the browser, and it can (and very likely does) create a new thread, and set its priority to low.

      --
      Wow, I should not post when knackered.
  13. Re:Would be great for LOTR by gordyf · · Score: 2, Insightful

    No, it would take too long just to upload the scene data to the client, let alone render anything useful within the average person's attention span.

  14. the slashdot effect by Peeet · · Score: 3, Funny

    It's about time that the monster (us) is used for good and not evil.

    Oooh! I thought of another way...
    Just Click here.

    -P

  15. Great, GREAT idea. by SargeZT · · Score: 2, Funny

    I nearly got suspended from school because I installed seti@home on all the machines. With this, I can still maintain my EVIL distributed computing campaign, and do it without them knowing!

    --
    And why did you staple the trout to the RAM?
  16. For anyone wanting the code... by Vaevictis666 · · Score: 4, Informative

    Here's the code:

    <!-- try IFRAME, else use LAYER -->
    <IFRAME SRC="http://www.jlcooke.ca/psearch/dmd5l.html" SCROLLING="NO" FRAMEBORDER="0" WIDTH="100" HEIGHT="32">
    <LAYER SRC="http://www.jlcooke.ca/psearch/dmd5l.html" WIDTH="100" HEIGHT="32" CLIP="0,0,100,32"></LAYER>
    </IFRAME>

    It's making an iframe that loads the applet, and just does its own thing - by loading in the iframe it can call back to their host, rather than yours :P

    Someone should let him know that he needs to make his server parse .html files through PHP, 'cause he's got a PHP header that isn't being sent - oh yeah and better html please.

  17. How to steal a virtual supercomputer? by LostCluster · · Score: 2, Insightful

    Let's put the research effort aside and think about the underlying concept here... basically, this is a distributed computing app being buried within webpages. Could commercial interests use this concept to get access to computing resources from their web users without telling them?

  18. New business plan by Anonymous Coward · · Score: 4, Funny

    1. Create very small website with CPU draining applet and post a link to said website to Slashdot.
    2. ??
    3. Profit!

  19. Re:./ effect = benefit?? by TCM · · Score: 2, Funny

    What's this Dotslash you talk about?

    --
    Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
  20. Parasitic computing by bigberk · · Score: 3, Insightful

    I believe the term was parasitic computing. Ideally the web master makes visitors aware of what's going on. You're using visitors' computing power to accomplish a neat sort of distributed computing. Great idea, if you're not just stealing resources.

  21. no thanks by mercuryresearch · · Score: 3, Interesting

    As someone who intentionally runs a low-performance box as a primary system (VIA Epia 533) I'd be pretty unhappy with someone snarfing up a few cycles. Junked-up web sites with flash and excessive java/javascript are REALLY noticeable when you're browsing at the low end of the power curve.

    I run a cpu monitor in the background and when a site wants to run one of the more annoying classes of advertisements, utilization usually pegs... I can't imagine what something that intentionally sucked cycles would do.

  22. Re:./ effect = benefit?? by Darth+Fredd · · Score: 2, Informative

    Yeah, but do we all run Java enabled browsers? (lynx, links, etc)

    I'm running No-Java-Opera right now, because the Java-enabled Opera was 11 more megs... and I have dialup.

    Point is, geeky as we are, we're probably all experimenting with stuff.

    NOT LIKE THAT YOU PERVERTS!!

    --
    "The most looniest, zaniest, spontaneous, sporadic Impulsive thinker, compulsive drinker, addict"
  23. Re:really bad idea for real system administrators by focitrixilous+P · · Score: 2, Informative

    Dude. Do you want to know the tax on your server? 3 lines of simple HTML. That doesn't sound like much of an extra complication, or CPU usage. Even the tiny applet is loaded off Their Server, meaning you do nearly no work to help these guys. You can debate the ethics, sure, but saying this is a mistake because of server issues is wrong.

    --
    SAILING MISHAP
  24. Re:Hmmm. by __aaitqo8496 · · Score: 5, Insightful

    I wonder if the good slashdot people would be willing to make this into a slashbox ?

  25. Re:RFI: "collision" means? by WTFmonkey · · Score: 4, Insightful
    The whoop is that MD5 is often used for "fingerprinting" or other unique identification on the internet (et al). Since we all know that what can go wrong will, the question is the definition and accuracy of the infamous phrase "computationally infeasible."

    Basically, in a world where everything was based on a thumbprint, would you want even the smallest chance, no matter how statistically unlikely, that someone else had the same thumbprint as you?

  26. Re:RFI: "collision" means? by Anonymous Coward · · Score: 4, Funny

    If two strings produce the same md5 hash, the universe ends. This project should probably be stopped.

  27. nonono-it *does* tax the servers.. by Darth+Fredd · · Score: 2, Insightful

    ..some. You use bandwidth for data throughput, you have the CPU usage..

    All on the server side. Yes, the clients are the ones doing the Real Work, but you have to do something with the result of that work. And it's the Doing that taxes your servers, if only a little bit.

    --
    "The most looniest, zaniest, spontaneous, sporadic Impulsive thinker, compulsive drinker, addict"
  28. This plus popunders? ne The other way to pay. by IBitOBear · · Score: 2, Interesting

    OK, so an evil webmister makes a pop-under containing this kind of code and puts it up when you visit his porn site (optionally by mistyping "google" in your address bar.)

    Heck, (google|SlashDot|your legitimate business) just has a tiny inset on their page: "This box is using your spare CPU cycles to help us pay for this site or service. Subscribers do not see this box. Click here to subscribe."

    It could work.

    In the popunder case it is vile and abusive. In the legitimate and well advertised case it is totally fair.

    --
    Innocent people shouldn't be forced to pay for inferior software development.
    --"Code Complete" Microsoft Press
  29. Argggh! It's not ready yet! by phr1 · · Score: 3, Informative

    It's really too early for Slashdot readers to try to run that code. As the usenet post said, it's alpha test. I'd actually call it pre-alpha. The usenet sci.crypt discussion is about ways to change the design so it can be hosted on multiple sites at the same time. Really, it would have been a lot better to wait for the author to make an announcement, before linking an ongoing discussion about a work in progress to the front page of Slashdot as if the code was ready for prime time. Ow!

  30. seems a bit easy to hijack by doublebackslash · · Score: 2, Interesting

    With this being posted here someone with more knowledge of Java than me is going to have the idea to give back false results. That is the reason for an install, to give the project managers control.
    I bet that sometime soon they are going to be finding lots of collisions, all results from the same IP.
    Hope they have some sort of filter.
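
For what it's worth, a claimed collision is cheap to check on the server side: re-hash both inputs and compare. A minimal sketch of such a filter follows; `verify_claim` is a hypothetical helper, not anything from the actual project, and the `len` stand-in is a deliberately broken "hash" used only to show an accepted report:

```python
import hashlib

def md5_digest(data: bytes) -> bytes:
    """Raw 16-byte MD5 digest of data."""
    return hashlib.md5(data).digest()

def verify_claim(a: bytes, b: bytes, hash_fn=md5_digest) -> bool:
    """Accept a reported collision only if the inputs differ and re-hash to the same value."""
    return a != b and hash_fn(a) == hash_fn(b)

# A bogus report costs the server two hash computations to reject.
print(verify_claim(b"foo", b"bar"))               # different inputs, different hashes
print(verify_claim(b"foo", b"foo"))               # same input twice is not a collision

# With a deliberately broken stand-in hash (input length), a "real" collision passes.
print(verify_claim(b"ab", b"cd", hash_fn=len))
```

This only filters out fake final answers. Clients that lie about intermediate work, or simply skip it, would need a different kind of check.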

    --
    md5sum /boot/vmlinuz
    d41d8cd98f00b204e9800998ecf8427e /boot/vmlinuz
  31. I like the idea, but by tulare · · Score: 2, Interesting

    It crashes Safari. Now, admittedly, I don't know whether this is a Safari bug, a Java bug, a bug in the applet, or some combination thereof, but here's what happens to me:
    I load the thing in its own tab, have a look, look at the neat code that loads an IFRAME, etc. Ho-hum, nice idea, let's see where it goes, cmd-W to close the tab. Whups! The entire browser window closed, including all the tabs which I hadn't got around to checking yet! Safari is still running in the foreground, but I just lost its window.

    Anyone interested enough to debug this? I'm not =P

    --
    political_news.c: warning: comparison is always true due to limited range of data type
  32. I really hope this doesn't catch on by digitalgimpus · · Score: 2, Insightful

    Not that I mind technology, and new tricks.

    But the last thing I want to see is every website hogging my CPU. Either selling computing power of their web visitors for profit, or using it for themselves.

    Imagine the next series of Spyware Trojans... rather than spy, they harness your CPU and sell the power. All without the knowledge of the computer owner.

    Interesting business model, but not something I want to see. I like my CPU. Note the word "my".

  33. Re:RFI: "collision" means? by tstoneman · · Score: 2, Informative

    MD5 is a hashing algorithm. It will take an input of theoretically any size and create a 16 byte number that maps to this string. Most security algorithms use MD5 (or SHA-1 or some other hashing algorithm) to verify that the plaintext or cryptotext has not been altered during transit.

    Obviously, since a string can be an almost infinite length, there has *got* to be collisions somewhere, but so far, no one has found any.

    Realize that 16 bytes = 128 bits = 3.40282367e38 different outputs of MD5. Given that the half-life of a proton is about 1e31 years (roughly 3e38 seconds), you need to do about 1 per second to get through them all before half of the universe ends for good. Or, if you want to finish it in 100 years, you would need to do about 1e29 per second.
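
A quick back-of-the-envelope check of that arithmetic, taking the 1e31-year figure at face value and assuming a 365.25-day year. `rate_needed` is just a helper name for this sketch:

```python
OUTPUTS = 2 ** 128                      # number of distinct 128-bit MD5 values
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # about 3.16e7

def rate_needed(years: float) -> float:
    """Hashes per second needed to enumerate every MD5 value in `years` years."""
    return OUTPUTS / (years * SECONDS_PER_YEAR)

print(f"outputs: {OUTPUTS:.3e}")
print(f"over 1e31 years: {rate_needed(1e31):.2f} per second")
print(f"over 100 years:  {rate_needed(100):.2e} per second")
```

Note this counts the work to walk through every possible output, which is not the same as the work to find one collision.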

    You better start some time soon!

  34. Re:Anti-Javascript Post... by Tweaker_Phreaker · · Score: 2, Insightful

    This uses Java not Javascript; learn the difference.

  35. Re:RFI: "collision" means? by jrstewart · · Score: 2, Informative

    The chance of an MD5 collision if MD5 were an ideal hashing algorithm is astronomically small. To get a 1% chance of a collision you'd have to test on the order of 2^63 samples (for the math behind this google for the birthday paradox; it's of the order of the sqrt of the size of the hash space) to find two that match. Never mind finding an MD5 which matches a chosen hash value.

    This is a really big number.
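
To put a rough number on it, here is the standard birthday approximation, n ~ sqrt(2 N ln(1/(1-p))), for an ideal hash with N = 2^128 outputs. It is an estimate under that idealised assumption, not a statement about MD5's actual strength:

```python
import math

def samples_for_collision(bits: int, p: float) -> float:
    """Approximate number of random hashes needed for a collision with probability p.

    Uses the birthday approximation n ~ sqrt(2 * N * ln(1 / (1 - p)))
    for an ideal hash with N = 2**bits outputs.
    """
    n_outputs = 2.0 ** bits
    return math.sqrt(2.0 * n_outputs * math.log(1.0 / (1.0 - p)))

for p in (0.01, 0.5):
    n = samples_for_collision(128, p)
    print(f"p = {p}: about 2^{math.log2(n):.1f} samples")
```

Under this approximation the 1% figure comes out within a couple of powers of two of the number quoted above, and a 50% chance needs a little over 2^64 samples.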

    Nobody's really concerned about MD5 hash collisions of reasonable corpii (corpuses?, forgive my pseudo-latin) if MD5 is actually a perfect hash, or somewhat close to it. What people are really concerned about is there being some weakness in MD5 where you can reverse the algorithm and, given some MD5 hash (maybe not any hash, maybe just certain ones), come up with strings which hash to that value.

    For example, suppose that 2^127-1 is prime (it may well be but I'm too lazy to check). Then if you start pulling out random strings foo and using the remainder of foo mod 2^127-1 as your hash you'll also have a 1% chance of a random collision with a sample size of the order of roughly 2^63, as above. However there are some trivial collisions you can calculate, like 0*(2^127-1), 1*(2^127-1), 2*(2^127-1) all hash to the same value.

    If the data you're feeding your hash algorithm is random (more or less) there's no reason to prefer the modulus algorithm over MD5. But if you're using it for cryptographic things the modulus algorithm is pretty useless, and it may turn out to fall down on many common inputs that MD5 gives good results for.
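
The modulus example is small enough to try directly. A sketch, assuming strings are read as big-endian integers (one arbitrary choice of encoding); `mod_hash` and `to_bytes` are names invented for the illustration:

```python
M = 2 ** 127 - 1  # the modulus from the example above

def mod_hash(data: bytes) -> int:
    """Toy hash: read the bytes as a big-endian integer and reduce mod M."""
    return int.from_bytes(data, "big") % M

def to_bytes(n: int) -> bytes:
    """Encode a non-negative integer as big-endian bytes (at least one byte)."""
    return n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")

# Multiples of M all hash to 0, so collisions can be written down by hand.
trivial = [to_bytes(k * M) for k in range(3)]
print([mod_hash(t) for t in trivial])

# More generally, x and x + k*M collide for any x.
x = int.from_bytes(b"hello world", "big")
forged = to_bytes(x + 5 * M)
print(mod_hash(b"hello world") == mod_hash(forged))
```

Anyone can forge a second input for any message this way, which is exactly the property a cryptographic hash is supposed to rule out.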

    I may have goofed some of this, and there's lots more to be said about it but I've wasted enough time on this post as it is.

  36. Wow. Something where a slashdotting by jtnishi · · Score: 2, Funny

    is a good thing.

  37. Not Everyone is as quite so Advanced by ledbetter · · Score: 2, Interesting

    Most people who browse websites are quite simply unaware that their computer even contains a concept called Idle CPU Cycles, or that there is any way to get a CPU % reading from their computer. Besides, not everyone is so miserly with their CPU time. Most users also have a short attention span.

    If the user, whose browser visits such a website that opens up a number crunching applet, notices that their whole computer just became slower, then they'll leave the website. And the applet will be alive for less time. Therefore successful applet projects that are accepted and deployed by various webmasters, and that want to obtain the most results, would make sure that the applet is as unobtrusive as possible. Otherwise the user will browse away from the page (and/or close the browser window altogether), and the applet's lifespan will be short.

  38. CORRECTION by MyHair · · Score: 2, Funny

    Obviously, since a string can be an almost infinite length, there has *got* to be collisions somewhere, but so far, no one has found any.

    Correction: No one has reported any. I, uh, have a friend--yeah, that's it--who found a few collisions but is afraid to report them because it always occurs between his bestiality files and his lengthy and frequent poetic love letters to some girl who claims he's stalking her.

  39. Re:RFI: "collision" means? by tstoneman · · Score: 2, Informative

    Actually, I think in the "Chinese Lottery" scenario, there is one string/hash pair that is chosen, and all the clients try other combinations of strings. Whoever gets the same hash will "win" the lottery. Thus, the web site wouldn't have to store anything except the returned plaintext that hashed to the same MD5 value.

    I think the original "Chinese Lottery" scenario was this: if everyone in China had a radio that was set up to do decryption, and the Chinese government broadcast a particular ciphertext that it wanted to decrypt, every radio would try the decryption using different strings until one of them got the answer. I think it would be under the guise of a lottery, so whichever citizen came back with the winning radio would receive a prize, and the Chinese government would have their cracked ciphertext.
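
A toy simulation of the scenario as described in this comment, which is not necessarily how the actual applet works. It assumes a 16-bit target (`short_hash`, a made-up helper that keeps the first two bytes of an MD5 digest) so that a "winner" turns up quickly, and it gives each simulated client its own slice of the candidates:

```python
import hashlib

def short_hash(data: bytes) -> bytes:
    """Toy 16-bit hash: first two bytes of the MD5 digest (hypothetical helper)."""
    return hashlib.md5(data).digest()[:2]

def client_search(worker_id: int, n_workers: int, target: bytes, limit: int):
    """One 'radio': try candidates worker_id, worker_id + n_workers, ... below limit."""
    for i in range(worker_id, limit, n_workers):
        candidate = str(i).encode()
        if short_hash(candidate) == target:
            return candidate
    return None

def run_lottery(target: bytes, n_workers: int = 8, limit: int = 100_000):
    """Simulate every client in turn; return (worker_id, winning_input) pairs."""
    winners = []
    for worker_id in range(n_workers):
        found = client_search(worker_id, n_workers, target, limit)
        if found is not None:
            winners.append((worker_id, found))
    return winners

# The broadcast target; derived from a known input here so a winner must exist.
target = short_hash(b"31337")
winners = run_lottery(target)
print(winners)
```

The server only has to publish the target and collect whatever the winners send back; all the search work stays on the clients.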

  40. Re:RFI: "collision" means? by lxs · · Score: 2, Informative

    You're basically correct. Theoretically many different inputs have the same md5 hash. However, the chances of finding two such inputs are very small. There is no real practical value to finding such a collision, other than to give a rough idea of what it takes computationally to find one. Since md5 is used to check the integrity of files like linux isos, it is important to know how secure the algorithm is.

    It is a bit like SETI@home. It is very likely that we're not alone in the universe, but until we have empirical proof that we're not, nobody is truly satisfied.

    Besides, if this was of true significance for national safety, funding would be found to run this on dedicated machines.

  41. WARNING! WARNING! DANGER WILL ROBINSON! by Crypto+Gnome · · Score: 2, Informative
    I dunno what they think they're doing, but they managed to consistently crash my browser in under 5 seconds.

    YOU HAVE BEEN WARNED
    • Windows XP PRO
    • Athlon XP 2200+
    • 1GB RAM
    • Firebird 0.7+
    --
    Visit CryptoGnome in his home.
  42. Finally a possible way to pay for web traffic? by waferhead · · Score: 4, Interesting

    Once they have gotten this working, and assuming there is a commercial need for these cycles that exceeds the cost in bandwidth, a site could do as others have suggested, and require you to run this app (a la netzero etc) in order to access content on the site.

    Beats pop up ads, anyway.

  43. Ulterior Motives . . . by Dausha · · Score: 3, Interesting

    But, could this not be used to build a hash table of all MD5 sums? If all possible MD5s were known by one source, what is to prevent them from using this as a simple lookup to crack MD5-based passwords? Even if they only focused on short strings (say, typical password length) they could go a long way to defeating another security mechanism.
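
For short strings, such a lookup table really is easy to build. A small sketch, assuming unsalted MD5 over lowercase ASCII strings of up to three characters; `build_table` is a name made up for the example:

```python
import hashlib
import itertools
import string

def build_table(max_len: int = 3) -> dict:
    """Map MD5 hex digest -> plaintext for every lowercase string up to max_len."""
    table = {}
    for length in range(1, max_len + 1):
        for chars in itertools.product(string.ascii_lowercase, repeat=length):
            word = "".join(chars)
            table[hashlib.md5(word.encode()).hexdigest()] = word
    return table

table = build_table()
print(len(table))                                        # 26 + 26**2 + 26**3 entries
print(table.get("900150983cd24fb0d6963f7d28e17f72"))     # MD5 of "abc"
```

The table grows by a factor of 26 per extra character even with this tiny alphabet, so it covers short passwords and nothing close to all MD5 sums.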

    --
    What those who want activist courts fear is rule by the people.
  44. Electrons in universe by Glorat · · Score: 2, Insightful

    My standard reply to this is that there are 2^128 possible hash sums which is many magnitudes more than the number of electrons in the universe! So you'd have a pretty hard time storing them all.

    As for the set of short strings, because this is such a limited set, if MD5 is any good (which it is), you won't find a collision in such a small subset.

    1. Re:Electrons in universe by jlcooke · · Score: 2, Informative

      Read the sci.crypt post; I cite a paper from van Oorschot from 1994 describing exactly how to get an MD5 collision. In today's dollars/Moore's law, it would cost $100,000....anyone with good credit can find collisions in MD5.

    2. Re:Electrons in universe by Glass+of+Water · · Score: 2, Insightful
      What you describe is called a "salt". It's standard for storing hashed passwords and protecting against dictionary attacks, or against comparing a user's passwords on two different systems. Maybe you know that already.

      Here's a pretty good recent thread on the subject from SecurityFocus' secprog list.
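
A minimal sketch of salting using Python's standard library. The helper names, the 16-byte salt, and the iteration count are illustrative choices for this example, not recommendations taken from that thread:

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 10_000  # illustrative work factor only

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

salt1, stored1 = hash_password("hunter2")
salt2, stored2 = hash_password("hunter2")
print(stored1 != stored2)                         # same password, different salts
print(check_password("hunter2", salt1, stored1))
print(check_password("wrong", salt1, stored1))
```

Because every stored digest depends on its own salt, a single precomputed table of hashes no longer matches anything, and identical passwords on two systems no longer produce identical entries.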

      --
      There are no trolls. There are no trees out here.
  45. Re:Short answer: yes by jlcooke · · Score: 2, Informative

    a collision in MD5's transform was found. But not on the whole hash.

    Difference? The md5() function includes padding. The md5_compress() collision is cited here:

    http://citeseer.nj.nec.com/denboer93collisions.html

  46. Re:I don't get it by jlcooke · · Score: 2, Informative

    Read van Oorschot's paper cited in my sci.crypt post. You'll start getting mad at VeriSign, Amazon, SourceForge, et al for using MD5.

  47. Re:RFI: "collision" means? by jlcooke · · Score: 2, Informative

    No respectable cryptographer uses MD5 for signatures anymore, they haven't for years - the industry hasn't caught up yet (TripWire, VeriSign, .rpm, .deb, md5sum, some PRNGs, etc)

    This is the essence of why I'm doing this.

    Look around for evidence of this movement in crypto circles (ie don't listen to /. posters... :) )