Slashdot Mirror


SHA-1 Broken

Nanolith writes "From Bruce Schneier's weblog: 'SHA-1 has been broken. Not a reduced-round version. Not a simplified version. The real thing. The research team of Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu (mostly from Shandong University in China) have been quietly circulating a paper announcing their results...'" Note, though, that Schneier also writes "The paper isn't generally available yet. At this point I can't tell if the attack is real, but the paper looks good and this is a reputable research team."

42 of 751 comments (clear)

  1. May be a big deal... by ThisNukes4u · · Score: 2, Interesting

    This may be a big deal, because if I understand correctly, SHA-1 is a similiar algorithm to MD5, which is commonly used to uniquely identify files. If that could be cracked using a similiar technique, a better method of hashing files may have to be found.

    --
    thisnukes4u.net
  2. Now what do we use? by enos · · Score: 2, Interesting

    With SHA-1 being MD5's replacement after that was broken, which hash function do we use now?

    --
    boldly going forward, 'cause we can't find reverse
    1. Re:Now what do we use? by Anonymous Coward · · Score: 1, Interesting

      Run data through SHA-1, then MD5, then XOR.

  3. what's left by Yonkeltron · · Score: 2, Interesting

    wait a sec....no MD5 and no SHA-1. what is going to take the place of those things? something like anubis or whirlpool?

    maybe more people will use GPG now!

    --
    Keep the faith, share the code
  4. Question by Anonymous Coward · · Score: 1, Interesting

    Why is it they found a hash collision in, but could not break, MD5; but have apparently broken SHA-1? This could just be due to a quirk of the respective algorithms, but it could also mean that the nature of their SHA-1 paper purported by the weblogs is mistaken.

  5. Re:Info on what exactly SHA-1 is ... by interiot · · Score: 5, Interesting

    So SHA-1 was created by the NSA, and was broken nine years after it was released. Is there any chance that the NSA knew it had a secret weakness, and promoted it for that specific reason?

  6. Bittorrent? by oman_ · · Score: 4, Interesting

    Is it time to update bittorrent?
    How hard is it going to be for people to provide garbage data with correct SHA-1 hashes to screw up downloads?

    --
    Rats would be more funny if they could fart.
  7. So what's the big deal for the rest of us? by beaststwo · · Score: 5, Interesting
    I've been reading about hash collisions for the last few years and haven't figured out why this is a crisis problem.

    I'm not a cryptographer, just a nerdy engineer, but let me explain my rationale: a hash algorithm takes an arbitrary message and generates a fixed-length signature that has a high probability (10**50 or better for most modern algorithms) of being the original.

    Let's assume that your hash algorithm generates a 128-bit hash. Anyone who knows anything about probability can see that is the original message is greater than 128 bits, there MUST be more than one message that will generate the same hash. For long messages, there may be thousands or millions of messages out of a filed of 10**50 (or better) that have the same hash, although many of them will be meaningless garbage.

    So SHA-1 has been broken by a group of cryptographers/mathematicians. Does this really mean that they can generate can alter any message in a way that will generate the same hash as the original, thus fooling the math that we use to validate content? No Way! I read Bruce Scheier's Cryptogram every month and he often makes the same argument.

    So yes, this means that from a long-term systems security standpoint, we should all move to stronger hashes. Does it mean that SHA-1-based transactions are inherently secure right now?

    I think not!

  8. Re:Well by FireballX301 · · Score: 2, Interesting

    Of course they're not supposed to be all-powerful, but considering details as to how exactly the algorithm is broken are not available, I'm quite interested as to how they broke it.

    I'm particularly worried about BT users, personally. The breaking of SHA-1 will essentially allow the RIAA and others to corrupt many bittorrent downloads.

  9. Impact on Digital Certificates & Issuer Liabil by Anonymous Coward · · Score: 2, Interesting

    Is SHA-1 used in x509 digital certificates, and if so does this mean that people can forge digital certificates ?

    If someone can do this, then what are the liability concerns for certificate issuers (or even their customers) ?

  10. Re:Hmm by Anonymous Coward · · Score: 1, Interesting

    > sha1 and md5 are generally considered so weak that they should only be used to combat error or accidents, not fraud

    No. At least, until very recently, SHA-1 was considered a serious cryptographic hash function, well suited for digital signatures and such.

    Maybe you're mixing them with CRC-32, which is indeed well suited to combat transmission errors, but not intentional foul-plays.

  11. Unfortunately the SHA series seems to be suspect by jd · · Score: 5, Interesting
    The Hashing Function Lounge lists other problems with the SHA functions:


    • (R04) V. Rijmen, "Update on SHA-1", accepted for CT-RSA'2005
    • P. Hawkes, M. Paddon, G. G. Rose, "On Corrective Patterns for the SHA-2 Family", Cryptology ePrint Archive, Report 2004/207


    If this definite break is confirmed, I think we will need to conclude that the entire family is suspect for any genuinely important purpose.


    There are a bunch of hashing algorithms on the Hashing Function Lounge that are listed as having no known attacks. At present, the most widespread is Whirlpool. I think it likely that one of these will replace SHA as the hashing function of choice in major cryptographic areas.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  12. Time to dust off your XBox by kernel_dan · · Score: 1, Interesting

    "The copy prevention system of Microsoft's Xbox game console relies on the security of SHA-1." -- The all knowing Wikipedia

    Maybe now we can crack some XBox stuff and run homebrew code on a homeburned dvd (dual-layer, of course.) Please correct me if I don't understand something. I thought M$ used RSA or something to encrypt their games and used SHA-1 signatures as checkpoints?

    --

    Illegal? Samir, This is America.
  13. Not true. by Ayanami+Rei · · Score: 2, Interesting

    Clinton signed a bill that ceased the definition of cryptographic algorithms as munitions. Now there is no strength requirements, checking by the NSA, nothing. Like since 96.

    Where've you been?

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  14. Re:Info on what exactly SHA-1 is ... by OverlordQ · · Score: 4, Interesting

    DES had a weakness nobody but the NSA knew about, so they recommended changes to it without saying the reasons for them. years later an attack was found against DES, but the NSA changes prevented it from being useful. Why would they change their tune to SHA-1?

    --
    Your hair look like poop, Bob! - Wanker.
  15. Re:Xbox-Linux Project B Complete? by Ant2 · · Score: 2, Interesting

    Factoring the public key for signed XBE files might now be the best option in running arbitrary code (i.e. Linux) on unmodded XBoxes.

    Have: Public key (it's inside the Xbox kernel), in decimal:
    2074011932725872376027602350906301713845 5993606274 88352673195511324110900735 43623741289960962910463535723067421103054569468248 62203867115042369878729703 47576511228016749818904643779460296616881241942336 51969796694319295889511268 04648743029387833666031765734337165949634731375592 47167029424618087781510481 26746269674500970450051175466570687005452630641050 24888769118032059917845867 65304041940400368455988250919539863092282405040537 96205135896999939802056942 66973236095772153476388267418476533663512746243310 31785386194643005307289050 29493197037650237921611449426113236294444096001738 94963797156859916567288947 565058003

  16. Re:Broken, but not for everything... by JM · · Score: 2, Interesting

    Even if the NSA could do it in a week...

    Suppose you're signing a tar.gz file. If the NSA could find a collision, the collision will still need to fit:
    - filesize has to stay the same
    - you don't want to get errors with gzip
    - you don't want to get errors with tar
    - the files in the archive needs to make sense

    What's the probability of all this happening?

  17. Re:Info on what exactly SHA-1 is ... by pchan- · · Score: 4, Interesting

    So SHA-1 was created by the NSA, and was broken nine years after it was released. Is there any chance that the NSA knew it had a secret weakness, and promoted it for that specific reason?

    I don't know about this, but when SHA (the Secure Hash Algorithm) was submitted as an approved algorithm for government use, the NSA reviewed it and suggested a minor change. That modified algorithm is what we now know as SHA-1. It was a few years before public-sector cryptographers caught on to what the significance of the changes was (I wish I could explain it, but it is beyond me).

  18. Encryption vs. fingerprint hashing.... by otis+wildflower · · Score: 2, Interesting

    .... So these hashes are still good for uniqueness out to 2^32 size fingerprints?

    What's the best hash for file fingerprinting, for stuff like version databases, tripwire, etc?

  19. Re:Not a problem (yet) by grennis · · Score: 1, Interesting

    Not quite. Sorry but you are wrong and the GP is correct. There is a big difference between generating 2 strings that collide, and being handed a hash and asked to find a collision.

  20. INSIGHTFUL? by Anonymous Coward · · Score: 0, Interesting

    Parent poster is an IDIOT. And you mods, who dont read RTFA, or even the SUMMARY, are even WORSE... you are pathetic sheep and I pity your pointless existence.

  21. Re:Is it that bad? by Anonymous Coward · · Score: 2, Interesting

    Shit, just retested, got 150,000 SHA-1's per second.

    1 Billion computers now comes out to 22 days.

    But, 1 million computers is still 60 years.

    The ten-year mark comes out to 6 million computers.

    (BYW, these tests were done on a 256 byte file, bigger files would take longer.)

    Still not that bad.

  22. Re:Info on what exactly SHA-1 is ... by ottffssent · · Score: 2, Interesting

    Sure, there's a chance, but it's fairly remote.

    The original SHA was released in 1993. 2 years later, SHA-1 was released (by the NSA, through NIST) to address unspecified weakness in the original algorithm. In 1998, a weakness in the original SHA was discovered, and in 2004 various attacks on SHA, SHA-1, and other hashes were published. It is unknown which, if any, of the known SHA weaknesses were discovered by the NSA and resulted in the 1995 release of SHA-1, though at least one known attack affects SHA but not SHA-1. It seems likely that if the NSA were capable of releasing SHA-1 as the original SHA in 1993, they would have done so. Thus, it appears that the NSA discovered a weakness in SHA in 2 years that took the academic community 5-10 years to discover. The gap is not as wide as it was when DES was released, but it's still very impressive.

    End brief synthesis. Check out the wikipedia article (and links) for more information. That's where I got mine.

  23. Re:Not a problem (yet) by Spy+Hunter · · Score: 4, Interesting

    Thought by much of the Slashdot community, as general reaction to this article shows. Until today, the prevailing Slashdot wisdom was that MD5 was weak and broken and SHA-1 was strong. Now we know that's not the case. Maybe this is no surprise to your circle of cryptography guru friends, but nobody told me until now.

    --
    main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
  24. Re:Info on what exactly SHA-1 is ... by Ninja+Programmer · · Score: 5, Interesting
    DES had a weakness nobody but the NSA knew about, so they recommended changes to it without saying the reasons for them. years later an attack was found against DES, but the NSA changes prevented it from being useful. Why would they change their tune to SHA-1?


    You know, of course, that the NSA did the same thing with SHA right? The original algorithm submitted was SHA-0, then the NSA recommended an unexplained minor change.

    Last August SHA-0 was broken, so their tweak appears to have added about 6 months of extra life for SHA-1.
  25. Re:Quick Measures? by whitis · · Score: 2, Interesting

    I have barely any cryptography knowledge, but would the SHA series be any safer if the size of the data was part of the signature? From glancing over Bruce's post (and remembering how MD5 and others were broken), data has to be padded. You can't just change arbitrary bits and produce the same signature. So, couldn't you just add the size of the file to the signature? Does that decrease security, because the size is now known?

    It would greatly add to the security in theory except that it wouldn't really work well in practice. I am not an expert on cryto algorithms but the limitations of this approach can be seen even without knowing anything about the actual has algorithm.

    The trick with forging messages is that you need to add some amount of garbage to the forged message to make it match the hash of the original (there are other ways in theory that are likely to be less practical). This is why the length inforormation would appear to help. Then you have to disguise the purpose of the garbage so it doesn't draw attention to itself. Perhaps as another fake signature or an uncompressed image. Steganography would give you the requisite number of bits while still letting you include an image (of a hand written signature, a document page, or a company logo)that wasn't obviously junk but the bits would not be consequtive which makes it harder). Instead of hiding them in low order bit steganography, you could replace a number of consequtive pixels; the result would be visable but would be mistaken for a smudge. By including the length in the signature, you make it much harder to make a message that says what you want it to say and still matches the hash.

    However, the problem with your suggestion is protecting the new size information. If you include the size in the message before hashing, then you simply search for a hash that matches the forged message with modified size which could be about the same amount of work as forging in the first place for a brute force attack though it may well add additional constraints that might prevent taking particular shortcuts vs. a brute force attack in which case it would help. It would really depend on whether the hash crack is bothered by this constraint. If the crack algorithm knows in advance how many extra bits will be generated then running the length through the hash doesn't help at all. If you just add some message size bits to the end of the hash, then the information is unprotected and can be changed to match the message. So, you probably would want to sign the length separately. But if you are going to do that, you could use a hash with twice as many bits in the first place or hash the same message twice with two different initial values for the hash (cracking the first hash may or may not provide a shortcut to cracking the second). Both of these approaches would probably be much more secure than adding the length. Even more secure would be to hash the message with two different hash algorithms.

    There are also ways of returning to the original message length even when adding garbage. Suppose I want to change a contract to say you will pay me $100,000 instead of $10,000. To make room for the extra hash fooling bits, I can delete an entire paragraph from the contract that is of no importance to me (or even makes it more favorable to me by its absence). Longer hashes or double hashes would effectively prevent this but length information would not.

  26. Re:Not a problem (yet) by gnuman99 · · Score: 4, Interesting

    What about what OpenBSD is doing? Have multiple hashes per file (MD5, SHA1, etc..) for a given signature.

  27. not that useful yet... by DangerTenor · · Score: 2, Interesting
    I think it is important to note that (from what I've heard, I haven't seen the paper either...) this collision attack is not very "real world" useful. Their attack is focused on taking a certain number of operations to come up with two hunks of data that result in the same hash.

    In my opinion, a "real world" attack would be one which given a blob which has already been hashed, would come up with another blob which results in the same hash. To my knowledge, nobody has any useful attacks in that direction yet, although some would argue based upon this research that it may just be a matter of time.

    Then we of course need to get into whether that is really useful either. If I find out that
    "I agree to purchase 100 units for $500"
    and
    "*(\D$Hw&72d98a %93di(hd eLKH%ap$#"
    results in the same hash, how helpful is that to me? How is a lawyer is going to prove to a jury that I may have actually signed the garbage instead of the purchase agreement? So, there is even more work to be done to make it a useful real world attack, wherein you might take the original signed text (modified for your evil purposes), append a null character, and then add garbage until the hashes are equal--and hope the UI was poorly written and just displays up to the first null.
    --
    Check out our infosecurity industry blog: http://securitymusings.com/
  28. How about using MD5 and SHA-1 togeher by geo_2677 · · Score: 2, Interesting

    I am not a cryptographer but shouldn't it be possible to use MD5 and SHA-1 for the same piece of text together. How likely is the hash sum for a particular text to be cracked for both the algos and have the same modified text. Will it make it a bit more tougher to be cracked?

  29. Re:Sigh by Nos. · · Score: 2, Interesting

    Well maybe then they wouldn't be attempting to send my mail server spam every 2 minutes from what appears to be some very poor version of a dictionary attack. Its fun to watch some of the attempts. It appears a lot of harvesters pull out 'and' in an email address. In my case, this is actually a *good* thing.

  30. missing the overall point by jnf · · Score: 3, Interesting

    What you have to figure is that with any hash thats shorter than the max amount of data, then the possibility of collisions will occur;
    figure that if you could represent every possible combination in 128 bits, you would never need to have 129 bits of data.
    Because this is not true all hashes will have collisions. However the chances of multiple hashes all having collisions with altered data is 'pretty damn slim'. So therefore the best solution, most likely in the future, and presently is to authenticate messages, identification (ala ssl certificates**) and binaries with multiple hashs known to be reasonably strong. One doesnt need to be a cryptologist to realize that using something like md5, sha256 and like ripemd160, the chances of collision in all 3 hashes are quite slim, and within the range of acceptable risk.

  31. Can someone by Deliveranc3 · · Score: 2, Interesting

    Explain to me why this isn't useful for compression?

    I know it's next to impossible to create the data from the hash but shouldn't it be theoretically possible?

    If the hash reduces the possible files which match it by 99.999% then shouldn't it be possible to send that much less data?

    1. Re:Can someone by Rothron+the+Wise · · Score: 2, Interesting

      If the hash reduces the possible files which match it by 99.999% then shouldn't it be possible to send that much less data?

      Yes, but that's not what the hash does. It only makes it much harder to find files that hash to the same value because the hash values are randomly distributed. If you have a 160-bit hash then (ideally) every 2^160th file will have the same hash.

      This means basically that there are an infinite number of files that hash to the same value. For this to be used for compression there must be a 1:1 relation between inputs files and outputs hashes. In other words, the algorithm must be reversible, something which cryptographic hashes are not. When a file is hashed, data is thrown away, data which can never be recovered.

      --
      A witty .sig proves nothing
  32. What to use in new apps? by BigZaphod · · Score: 2, Interesting

    I'm actually working on an app that was going to use SHA-1 for integrity verification. I may just stick with SHA-1 because I'm not terribly familiar with the other options out there in this realm. So ideally, what should new apps use these days? What would be the recommended "safe" algorithm? And can I find a nice, tested C library/code for it? :-)

  33. Re:Info on what exactly SHA-1 is ... by Eivind · · Score: 4, Interesting
    Not quite.

    For quite a few applications the hash is broken even if I cannot easily find a second string with the same hash as one given. Even if I can "only" at will find two strings with the same hash, that is a pretty serious weakness.

    I could, for example, create two documents with the same hash, have you sign one, and then claim you signed the other one. Since the hashes are the same your digital signature will be valid for both.

    For other applications, like replacing a signed document with another without being detected you're rigth -- that would only work if one could easily find a document with a given hash.

  34. Important info on crypto hashes by ars · · Score: 3, Interesting

    I guess I missed posting this before the bulk of the posts, but maybe it'll help someone.

    First: MD* SHA-* etc - they are all basically the SAME algorithm! The are just minor modifications of the same exact thing, so a break in one is a break in all.

    Second: Tons and tons of people ask: can't we merge two hashes together and get a stronger one? Yes you can that's EXACTLY what MD* and HA-* DO! They are a combination of different hashes! That's how they work.

    So if you really did have a good combo of hashes then just give them a name and use them as a hash - don't bother just plain merging existing ones.

    Also, merging say MD5 and SHA-1 is pointless - they are both based on the same hashing code! You are gaining nothing by merging them.

    --
    -Ariel
  35. Now I get the "Use 2 hash algorithms" comments by The_Dougster · · Score: 2, Interesting

    Ok, if my file consists of the line "Hello World." then I get the following hashes:

    770b95bb61d5b0406c135b6e42260580 for MD5

    b924c2f360b572e17c971f1b1b667e0732944df7 for SHA-1

    Trying to tinker around with the file and make both hashes come out the same as above would presumably be much more difficult than for any single hashing algorithm, and it might very well be nigh impossible. The little light bulb has finally come on. Now I get it. Yeah using two hash algorithms together would probably work nicely. Don't combine the results mathematically, just append the keys together into a big long string. The final MD5+SHA1 hash key for my file would be:

    770b95bb61d5b0406c135b6e42260580b924c2f360b572e1 7c 971f1b1b667e0732944df7

    I don't know whether this would be stronger than a SHA-2 of equivalent bit length or not, but now I get what some of you have been saying. From a common sense view, it would seem that something like this would be pretty darn tough to crack, because you would have to make two different algorithms compute matching keys for a given dataset.

    --
    Clickety Click ...
  36. Metadata by bogado · · Score: 2, Interesting

    Maybe we should start encoding meta-data along with the hash, so instead of trusting only on the hash to confirm that the message is from who sign it, we would encode along the message, the size, type and whatever characteristic could define the message.

    For instance, suppose I sign the message "Hi, I'm Victor", along with the hash it would contain the size (14 bytes), type (English text), encoding (7bits ASCII) and how about the range of codes used in the messages (from U+0027 - U+0074).

    A good hash would give a uniformly distributed random hash for the message, so it is safe to assume that even if we could find a collision, it would be highly unprovable that it would satisfy all the meta-data. In some cases it could be provable that this kind of hash is unbreakable, since there is a finite number of messages that satisfy the meta-data (if you could hash all possibilities and verify that there were no collisions you're 100% safe).

    --
    []'s Victor Bogado da Silva Lins

    ^[:wq

  37. Re:Pigeon-hole principle by Shano · · Score: 2, Interesting

    To nit-pick further, the pigeon-hole principle says nothing about not reusing any slot. It states that if you place n items in to n slots, either every slot is filled, or (at least) one slot has more than one item.

    Equivalently, if there are n+1 items, there must be a slot with more than one item. Your statement is a special case of the principle, but not as general.

    It is possible to prove (by induction) that there are an infinite number of collisions for some hash value using this. However, proving that collisions exist for every hash value requires detailed knowledge of the algorithm, and doesn't follow directly from the pigeon-hole principle.

  38. Doubt it by hey! · · Score: 2, Interesting

    This doesn't seem likely even with my tinfoil hat mode fully engated.

    If we were talking about an encryption scheme, the temptation would be there. If it were and encryption scheme adopted by The Bad Guys (tm) then NSA would of course be able to read their secret communications.

    But that's not what SHA is for. It's to allow a piece of data to be authenticated. You can satisfy that that this file is indeed from me based on a simple number computed from it and a secret we both share. When thegovernment procures a piece of software that is going to do something like launch nuclear missiles, it would be nice if that software could figure out whether the order really came from PotUS. On the other hand, when the order comes from Osama to fly the plane into the WTC, authentication of that order, while useful, is not as critical.

    So, the national security interests here are clearly in favor of their being a publicly available, secure hash function.

    --
    Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  39. Re:Not a problem (yet) by melandy · · Score: 2, Interesting
    Sounds prudent to me, as long as you have the CPU time to spare.
    CPU time on legit transactions is not really an issue. Hashes are designed to be computationally expensive to crack, and not computationally expensive to use normally.

    So for legit use you have the following (assuming use of MD5 and SHA1):
    Time to compute ONE hash via MD5
    + Time to compute ONE hash via SHA1
    = Ain't much
    Compared to what a brute force would have to do:
    Time to compute MD5 hashes until you find a match
    + Time to compute SHA1 hashes based on MD5 matches until you find a match
    = A whole lot
    ... or (depending on which is less expensive) ...
    Time to compute SHA1 hashes until you find a match
    + Time to compute MD5 hashes based on SHA1 matches until you find a match
    = A whole lot
    Either way, it's a lot more difficult brute force plaintext that will match hashes for both MD5 and SHA1 than for either alone, but the additional CPU time on the system for legitimate use is negligible.
  40. Not quite the end of the world by steve_stern · · Score: 3, Interesting
    So the paper says I can find two values, X and Y, such that they hash to the same SHA-1 value. Great. What can I do with this?

    As others have pointed out, I can create 2 documents, X and Y, have a target sign one, then substitute the other. His digital signature will be valid for both. Great - it takes only 2^69 attempts to get a collision - I'm sure the chances that the X and Y found will both be valid English documents, one of which I could convince a target to sign, the other allowing me to scam him out of enough money to make the whole ordeal worthwhile.

    However, people keep copies of what they sign. Even if I did find a collision, and even if both documents were valid English text, the guy could say "I didn't sign Y - look, my signature is valid for X - he scammed me". Great.

    The more likely scenario is someone signing their own document, then claiming it was fraudulent. They could create their own X and Y, sign X that somehow involves another party, then claim they actually signed Y and this other party was the scammer. But they still have to find X and Y in 2^69 steps such that both make logical sense in the English language - no simple task.

    This is cool in a theoretical sense, but in a practical sense, its like saying you don't need a million monkeys on a million typewriters typing for a million years to generate Shakespeare; it'll only take 999,999 monkeys on 999,999 typewriters...

    Or, to go back to the theoretical world: with processor speeds doubling every 1.5 years, and this team shaving 11 factors of 2 off of the break time, the lifetime of SHA-1 just shortened by about 16.5 years. Not quite the end of the world as we know it.

    Step 1: Break SHA-1
    Step 2: ?
    Step 3: Profit!