SHA-3 Finalist Candidates Known

Posted by timothy on Friday December 10, 2010 @12:58PM from the not-enough-like-line-noise dept.

Skuto writes "NIST just announced the final selection of algorithms in the SHA-3 hash competition. The algorithms that are candidates to replace SHA-2 are BLAKE, Grøstl, JH, Keccak and Skein. The selection criteria included performance in software and hardware, hardware implementation size, best known attacks and being different enough from the other candidates. Curiously, some of the faster algorithms were eliminated as they were felt to be 'too fast to be true.' A full report with the (non-)selection rationale for each candidate is forthcoming."

194 comments

SHA-SHA-SHA-KE YOUR BOOTY !! by Anonymous Coward · 2010-12-10 12:59 · Score: 0, Offtopic

Yeah, man !! Do it to it !!
1. Re:SHA-SHA-SHA-KE YOUR BOOTY !! by larry+bagina · 2010-12-10 14:05 · Score: 5, Funny
  
  That's funny, but SHAKEs ("elder") are arabic, SHAs ("king") are persian/iranian. There is a difference and they get mad when you confuse them. They all look alike to me, but whatever.
  For those of us that didn't read the article, wikileaks revealed that the SHA has terminal cancer and will die soon. That's why they're looking for a new SHA-3. The SHA is kind of like the Dalai Lama, but with a unix greybeard. I'm glad they've narrowed down the candidates. Hopefully, the next one will bring peace in the middle east.
  
  --
  Do you even lift?
  These aren't the 'roids you're looking for.
2. Re:SHA-SHA-SHA-KE YOUR BOOTY !! by Anonymous Coward · 2010-12-10 14:12 · Score: 2, Funny
  
  Sheik Yerbouti?
3. Re:SHA-SHA-SHA-KE YOUR BOOTY !! by morgan_greywolf · 2010-12-10 14:56 · Score: 2
  
  For those of us that didn't read the article, wikileaks revealed that the SHA has terminal cancer and will die soon.
  
  SHA-1 has had terminal cancer a very long time: it was cracked over 4 years ago. Anything Wikileaks may have revealed about SHA-1 is very old news indeed.
  
  --
  My blog
4. Re:SHA-SHA-SHA-KE YOUR BOOTY !! by Martin+Blank · 2010-12-10 15:24 · Score: 4, Informative
  
  SHA-1 was not "cracked." A weakness was found in it that reduced the strength by 2^11 to 2^69 instead of 2^80 when conducting preimage attacks. Even on specialized hardware, this is not a practical attack, requiring thousands of years to come up with a message that hashes to the same value. Papers since then have found variations on the weakness, but they have only been demonstrated in reduced-round variants of SHA-1, not in full implementations due to the processing power required.
  The weakness was recognized as a potential problem, hence the recommended move to SHA-2, particularly the stronger variants of it. The SHA-3 competition was born out of concern that SHA-2 could suffer from similar weaknesses, which may doom the SHA-3 contestants that draw from SHA-2 at a political level if not a technical level.
  
  --
  You can never go home again... but I guess you can shop there.
5. Re:SHA-SHA-SHA-KE YOUR BOOTY !! by Anonymous Coward · 2010-12-10 16:13 · Score: 2, Informative
  
  Wrong. The same year (2005) improvements reduced the complexity to 2^63. See http://www.rsa.com/rsalabs/node.asp?id=2927
  Also, the attack was for finding collisions, not preimage attacks. A preimage attack would be more devastating, but collisions still allow for faking certificates and checksums, depending on the protocol.
  SHA-1 might not be broken, but it's about as close to being broken as any crypto primitive can be without being officially broken. Everybody should have begun the process of moving away from SHA-1 in 2005.
6. Re:SHA-SHA-SHA-KE YOUR BOOTY !! by Ethanol-fueled · 2010-12-10 16:48 · Score: 0
  
  Sheik Yerbouti(Peace Be Upon Him) died of prostate cancer in 1993.
  
  :(
7. Re:SHA-SHA-SHA-KE YOUR BOOTY !! by Anonymous Coward · 2010-12-10 16:49 · Score: 0
  
  You mean 2^69 instead of 2^80 for collision attacks.
  A preimage attack would still take 2^160 calls.
8. Re:SHA-SHA-SHA-KE YOUR BOOTY !! by Dc0der · 2010-12-11 00:37 · Score: 1
  
  It's a collision, not preimage, attack. The attack was later improved to 2^63, which is certainly doable.
9. Re:SHA-SHA-SHA-KE YOUR BOOTY !! by Anonymous Coward · 2010-12-11 13:10 · Score: 0
  
  If I had mod points, you would have them. I have heard this misinformed SHA-1-is-cracked argument too many times now. It probably all stems back to a poorly written slashdot "story".
"Too fast to be true" by MrEricSir · 2010-12-10 13:05 · Score: 4, Insightful

Well that's mathematically sound reasoning!

--
There's no -1 for "I don't get it."
1. Re:"Too fast to be true" by icebike · 2010-12-10 13:13 · Score: 4, Insightful
  
  Exactly my reaction.
  Is this a beauty contest or what?
  There may be some tendency to think that something that hashes too quickly would be trivial, but without even a glance at the methodology and a modicum of trials this is just like assuming the cute girl is an air-head without so much as a conversation.
  Who are these guys anyway? You expect better from NIST.
  
  --
  Sig Battery depleted. Reverting to safe mode.
2. Re:"Too fast to be true" by Haedrian · 2010-12-10 13:13 · Score: 1
  
  Probably so that brute-forcing the plaintext never sounds like a good idea.
  
  Yeah I know that the increase is exponential (based on the string length) - but if you only do dictionary words... you'll still hit a few passwords.
3. Re:"Too fast to be true" by Anonymous Coward · 2010-12-10 13:26 · Score: 4, Funny
  
  what if they were optimized? would sleep(10) make them finalists?
4. Re:"Too fast to be true" by Arancaytar · 2010-12-10 13:34 · Score: 1
  
  Short strings are supposed to be salted anyway.
5. Re:"Too fast to be true" by Omnifarious · 2010-12-10 13:47 · Score: 4, Insightful
  
  Tangential? What are you talking about? The cryptographic uses of hashes are the whole reason SHA-1, SHA-2 224,256,384,512 were created in the first place. It's also the reason the competition is being run.
  I would also submit that your use case is not as security insensitive as you might think.
  
  --
  Need a Python, C++, Unix, Linux develop
6. Re:"Too fast to be true" by Anonymous Coward · 2010-12-10 13:50 · Score: 5, Informative
  
  There are some cryptographic uses of hashes but they are tangential for the most part.
  This is the Secure Hash Algorithm - 3 selection competition. The cryptographic uses are pretty much at the forefront of the judges' minds.
  A perfectly acceptable hash for error correction purposes can be doomed for cryptographic purposes. For example, being able to find "a different plaintext input that would have given the same hash as input X" is not a problem for an error correction hash provided that the pair of inputs are not similar (and so transmission errors are unlikely to turn one into the other). However it would make many uses of cryptographic hashes totally unviable.
7. Re:"Too fast to be true" by jewelises · 2010-12-10 14:03 · Score: 2
  
  When you want to slow down a fast hash you just do it a lot of times. See PBKDF2, for example.
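A minimal sketch of that idea, using the PBKDF2 implementation in Python's standard `hashlib`. The function name `stretch_password`, the sample password, and the iteration count are illustrative assumptions, not recommendations from the thread.

```python
import hashlib
import os

def stretch_password(password: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA-256.

    The iteration count is the work factor: a fast hash becomes
    `iterations` times more expensive per guess, for attacker and defender alike.
    """
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)

salt = os.urandom(16)  # random per-password salt, stored alongside the result
key = stretch_password(b"correct horse", salt)
print(key.hex())
```

The underlying hash stays fast for ordinary uses; only the password-handling path pays the extra cost.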
8. Re:"Too fast to be true" by Anonymous Coward · 2010-12-10 14:07 · Score: 0
  
  are you daft? cryptographic use is the POINT of the SHA-3 competition
9. Re:"Too fast to be true" by Darinbob · 2010-12-10 14:13 · Score: 1
  
  Those examples are where you just want a hash algorithm. You want a SECURE hash algorithm for extra security. I.e., not to tell if the file was corrupted in transit, but whether someone has hacked the server and put up a compromised version of the software, or that a certificate is valid, or that the bank transfer order is valid, or that the order to launch really came from the POTUS.
10. Re:"Too fast to be true" by PopeRatzo · 2010-12-10 14:32 · Score: 1
  
  Beauty contest? Nah, this is fantasy football. Maybe Dancing with the Stars.
  Anybody know what the point spread is on this SHA-3 hash competition? I'd like to get a bet down before the line moves.
  
  --
  You are welcome on my lawn.
11. Re:"Too fast to be true" by Fry-kun · 2010-12-10 14:40 · Score: 2
  
  FTFA: "Cryptologist Ron 'The R in RSA' Rivest withdrew his MD6 process - it was highly-rated but conspicuously sluggish"
  Someone simply misread or misunderstood "sluggish" for "too fast"
  
  --
  Did you know that "FTW" ("for the win") is a direct translation of "Sieg Heil"?
12. Re:"Too fast to be true" by Anonymous Coward · 2010-12-10 14:54 · Score: 0
  
  No it was really really slow.
13. Re:"Too fast to be true" by nedlohs · 2010-12-10 15:47 · Score: 3, Insightful
  
  You believe what you read in a slashdot summary???
14. Re:"Too fast to be true" by Surt · 2010-12-10 15:53 · Score: 2, Insightful
  
  Technically, if your hash algorithm is too fast, it gets easier to brute force. So it isn't completely unscientific.
  
  --
  "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
15. Re:"Too fast to be true" by Anonymous Coward · 2010-12-10 16:01 · Score: 0
  
  Actually it was just too slow.
16. Re:"Too fast to be true" by Anonymous Coward · 2010-12-10 16:48 · Score: 0
  
  Don't you mean before the line is shifted/xor'd? :P
17. Re:"Too fast to be true" by swillden · 2010-12-10 16:55 · Score: 1
  
  Technically, if your hash algorithm is too fast, it gets easier to brute force. So it isn't completely unscientific.
  Only if the input is small, which translates to "only if the protocol designer is clueless". Also, you can always make a fast algorithm slower by iterating it, so your point is irrelevant.
  
  --
  Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
18. Re:"Too fast to be true" by Anonymous Coward · 2010-12-10 17:31 · Score: 2, Insightful
  
  checksum != hash table function != cryptographic hash != hashish
19. Re:"Too fast to be true" by ocdscouter · 2010-12-10 18:55 · Score: 1
  
  You're right. It looks like it was written too quickly to be accurate.
20. Re:"Too fast to be true" by Skuto · 2010-12-10 20:26 · Score: 4, Informative
  
  "We preferred to be conservative about security, and in some cases did not select algorithms with exceptional performance, largely because something about them made us “nervous,” even though we knew of no clear attack against the full algorithm."
  William Burr, NIST
21. Re:"Too fast to be true" by moderatorrater · 2010-12-10 20:44 · Score: 2
  
  It's not unreasonable to leave out an algorithm that's as secure mathematically as the others as far as we can tell but that has a concerning characteristic. Previously, they've eliminated competitors for having simple mathematical representations and things like that. Since those algorithms were no more secure than the ones without the worrisome attribute, they could be eliminated without much problem. Remember, these are security guys, so they're paranoid about stuff like that.
  
  I'm a little curious about that portion of the summary, though, since one of Skein's distinguishing features is that it runs nearly as fast on a processor as it would on specialized hardware due to the way that it's designed. If those algorithms were much faster than that then I would probably agree with the committee that the speed was suspicious.
22. Re:"Too fast to be true" by moderatorrater · 2010-12-10 20:48 · Score: 1
  
  The SOAP WSSE standard uses SHA-1 for authentication and all password storage should be hashes. If all you're doing is a checksum, then use something faster than SHA; you're using a sledgehammer to open a walnut.
23. Re:"Too fast to be true" by LainTouko · 2010-12-10 20:57 · Score: 2
  
  If you're only concerned about accidental corruption, you should use a CRC, which will be much faster than a cryptographic hash. Spending a load of extra CPU time on acquiring good cryptographic properties is silly if you're not interested in any cryptographic properties.
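For the accidental-corruption case, a rough sketch with the CRC-32 from Python's standard `zlib` module. The helper name `crc_ok` and the sample payload are made up for illustration.

```python
import zlib

def crc_ok(payload: bytes, expected_crc: int) -> bool:
    """Return True if the payload still matches the CRC-32 recorded earlier."""
    return zlib.crc32(payload) == expected_crc

original = b"SHA-3 finalists: BLAKE, Groestl, JH, Keccak, Skein"
checksum = zlib.crc32(original)

# Simulate a transmission error by flipping a single bit.
corrupted = bytearray(original)
corrupted[0] ^= 0x01

print(crc_ok(original, checksum))          # True
print(crc_ok(bytes(corrupted), checksum))  # False: CRC-32 catches every single-bit error
```

A 32-bit CRC offers no protection against someone who modifies the data on purpose, which is the case the cryptographic hashes are for.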
24. Re:"Too fast to be true" by omfgnosis · 2010-12-10 21:17 · Score: 1
  
  != hash browns :(
25. Re:"Too fast to be true" by omfgnosis · 2010-12-10 21:20 · Score: 1
  
  Wait, so you're saying it was slow?
26. Re:"Too fast to be true" by Anonymous Coward · 2010-12-10 22:13 · Score: 0
  
  D'oh. That doesn't change anything if the algorithm is fast, since the salt is stored _with_ the hash. If you have the hash, you have the salt. Dictionary based means that the _original_ string is looked up in a dictionary. Comprende??
27. Re:"Too fast to be true" by Stellian · 2010-12-10 22:47 · Score: 2
  
  Well that's mathematically sound reasoning!
  In cryptographic lingo, it means that although the algorithms aren't broken, they have a small security margin, for example 14 of 16 rounds are broken. Since attacks always get better, it's a good idea to pick an algorithm twice as slow with, say, 32 rounds, than to be on the bleeding edge. Sure, you get twice the speed, but you are only one good research paper away from hell.
  In regard to AES, it's largely agreed in the crypto community that NIST went for the performance, and we now trust an algorithm with a comparatively low security margin. If advances in cryptography continue at the same rate as they did in the past 30 years, then surely AES will be insecure 30 years from now [citation needed]. That's not sound mathematical reasoning, but it is sound pragmatic reasoning to reject an algorithm that's "too fast".
28. Re:"Too fast to be true" by kasperd · 2010-12-10 23:09 · Score: 2
  
  Technically, if your hash algorithm is too fast, it gets easier to brute force.
  Let's assume somebody came up with a hash function that is 10 times faster than what we would otherwise use, and let's assume it is just as secure except for the minor detail that by being 10 times faster it also becomes 10 times faster to perform a brute force attack. If those assumptions are true, then instead of discarding it altogether, we should find a way to make the brute force attacks slower again. Making the algorithm slower would of course achieve that, but that's not a good idea because it would become less useful. Instead we can use the same principles for the hash function, but increase the size of the output by one byte. That extra byte makes the time to find a collision 16 times larger (and preimage attacks take 256 times longer). But why stop at one byte? We could make the hash value a lot larger. Even if that meant it would slow down by a factor of two, it would still be five times faster than the alternatives and a lot more secure.
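The 16x and 256x figures follow from the generic attack costs: roughly 2^(n/2) hash calls for a birthday collision and 2^n for a preimage on an n-bit output. A small sketch of that arithmetic, assuming an ideal hash with no shortcut attacks (the function names are illustrative):

```python
def collision_work(bits: int) -> int:
    """Approximate hash calls for a generic birthday collision (even bit lengths)."""
    return 2 ** (bits // 2)

def preimage_work(bits: int) -> int:
    """Approximate hash calls for a generic brute-force preimage."""
    return 2 ** bits

for bits in (128, 160, 256):
    collision_factor = collision_work(bits + 8) // collision_work(bits)
    preimage_factor = preimage_work(bits + 8) // preimage_work(bits)
    # One extra output byte: collisions get 16x harder, preimages 256x harder.
    print(bits, collision_factor, preimage_factor)
```

The factors are the same at every starting size, which is why widening the output is a much cheaper defence against brute force than slowing the function down.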
  
  This is exactly the reason why modern hash functions no longer output just 128 bits. These days nobody in his right mind would try to design a new cryptographic hash that output less than 256 bits.
  
  --
  
  Do you care about the security of your wireless mouse?
29. Re:"Too fast to be true" by Anonymous Coward · 2010-12-11 00:45 · Score: 0
  
  That doesn't mean much when you're searching for a needle in a haystack with 2^256 or more straws.
30. Re:"Too fast to be true" by mattpalmer1086 · 2010-12-11 01:07 · Score: 1
  
  By brute forcing, I assume you mean finding another input that hashes to a predetermined value, not finding two inputs which happen to hash to the same value. Without some mathematical attack, this is very unlikely, no matter how fast the hash algorithm is.
  A slower hash will make dictionary attacks on salted password tables more difficult - but generally you would simply run the hash many times to increase the computational load for an attacker. Again, the raw speed of the hash algorithm is unlikely to make much of a difference.
31. Re:"Too fast to be true" by Anonymous Coward · 2010-12-11 01:10 · Score: 0
  
  Short strings are supposed to be salted anyway.
  D'oh. That doesn't change anything if the algorithm is fast, since the salt is stored _with_ the hash. If you have the hash, you have the salt. Dictionary based means that the _original_ string is looked up in a dictionary. Comprende??
  IIRC, the whole point of salting is that it slows things down because you can't maintain a database of dictionary words and their hash values calculated in advance (which could be used to reverse-lookup *any* password based on a dictionary word if salting wasn't used).
  With salting in place, a dictionary attack would require recalculating the dictionary every time for each password, since different passwords have different salt values stored with them.
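A rough illustration of that point, assuming a toy three-word dictionary and plain SHA-256 (both chosen only for the example): a table built once reverses every unsalted hash, but is useless against a salted one, which forces a fresh pass over the dictionary per record.

```python
import hashlib
import os

WORDS = [b"password", b"letmein", b"asdf"]  # stand-in for a real dictionary

def hash_password(salt: bytes, password: bytes) -> str:
    return hashlib.sha256(salt + password).hexdigest()

# Without salt: compute the table once, reuse it against every stored hash.
PRECOMPUTED = {hash_password(b"", word): word for word in WORDS}

def crack_with_table(stored_hash: str):
    return PRECOMPUTED.get(stored_hash)

# With salt: the table no longer applies; the dictionary is rehashed per record.
def crack_salted(salt: bytes, stored_hash: str):
    for word in WORDS:
        if hash_password(salt, word) == stored_hash:
            return word
    return None

salt = os.urandom(16)
stored = hash_password(salt, b"asdf")
print(crack_with_table(stored))    # None: the precomputed table misses
print(crack_salted(salt, stored))  # b'asdf': found, but only after rehashing
```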
32. Re:"Too fast to be true" by Anonymous Coward · 2010-12-11 02:00 · Score: 0
  
  I don't think you understand how mind-bogglingly large 2^256 actually is. The algorithm could be a thousand times faster and it wouldn't matter.
33. Re:"Too fast to be true" by Anonymous Coward · 2010-12-11 02:23 · Score: 0
  
  OpenBSD slowed an internal hash function down to slow possible brute force attacks against the passwd file (if I remember correctly), so it makes sense to presume that an algorithm that is too fast will (even if theoretically very secure) be practically weaker, because it would take less time on the same hardware to complete a brute force attack.
34. Re:"Too fast to be true" by Anonymous Coward · 2010-12-11 03:31 · Score: 0
  
  "We preferred to be conservative about security, and in some cases did not select algorithms with exceptional performance, largely because something about them made us “nervous,” even though we knew of no clear attack against the full algorithm."
  William Burr, NIST
  To paraphrase my philosophy professor: Saying exactly what's wrong with an argument can be one of the hardest things to do.
35. Re:"Too fast to be true" by amorsen · 2010-12-11 04:06 · Score: 2
  
  The "something" which made them nervous wasn't the speed, and it wasn't necessarily the same for each algorithm...
  
  --
  Finally! A year of moderation! Ready for 2019?
36. Re:"Too fast to be true" by Surt · 2010-12-11 04:12 · Score: 1
  
  Actually, to defeat a hash, you need only defeat the last repetition, so, no, iteration doesn't help.
  
  --
  "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
37. Re:"Too fast to be true" by amorsen · 2010-12-11 04:15 · Score: 1
  
  You often have a cryptographically secure and reasonably fast hash algorithm available in some library, no matter what you program for. It's easier to just pick that by default instead of spending a lot of energy on developing and debugging something which might not end up all that much faster or smaller in the end. Skein is in the region of 6 cycles per byte on modern Intel CPUs. That should be fast enough for almost all uses. SHA is slow on modern CPUs, but even that is often fast enough.
  
  --
  Finally! A year of moderation! Ready for 2019?
38. Re:"Too fast to be true" by RaymondKurzweil · 2010-12-11 04:19 · Score: 1
  
  Teh stupid. It burns.
39. Re:"Too fast to be true" by Surt · 2010-12-11 04:32 · Score: 1
  
  That only matters if the hash's output is perfectly distributed. Unless they have some proof of that, a 256 bit hash is actually much less than 256 bits of security.
  Now if you start from a 4K hash, I'd stop worrying about brute force.
  
  --
  "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
40. Re:"Too fast to be true" by Anonymous Coward · 2010-12-11 05:03 · Score: 0
  
  Actually, to defeat a hash, you need only defeat the last repetition, so, no, iteration doesn't help.
  Please explain further, because it is not obvious what you mean. Iterating a hash is surely equivalent to increasing the number of rounds in a modern block cipher primitive, which absolutely does increase its security.
41. Re:"Too fast to be true" by Ramble · 2010-12-11 05:49 · Score: 1
  
  Actually it is - if you have a fast hash algorithm it means attackers can easily hash a whole database. For example, if they found a weakness in your system (e.g. you used the same salt for all your passwords) then they could very easily rehash their dictionary with this new salt with a trivial amount of computing time.
  
  --
  "Oh boy"
42. Re:"Too fast to be true" by Anonymous Coward · 2010-12-11 10:22 · Score: 0
  
  conclusion if u wanna win and you make an exceptionally good algorithm that is also quick, put 100 more rounds to please William Burr, NIST.
43. Re:"Too fast to be true" by kasperd · 2010-12-11 11:41 · Score: 1
  
  Iterating a hash is surely equivalent to increasing the number of rounds in a modern block cipher primitive, which absolutely does increase its security.
  Not exactly. The multiple rounds for a typical hash is not for the full hash, but rather for each individual block. And after running all the rounds for a block usually the input from before all these rounds is merged back with the output of all these rounds. Also, when running these rounds the input to the next round is not just the output of the previous round, but also the block you are hashing.
  
  --
  
  Do you care about the security of your wireless mouse?
44. Re:"Too fast to be true" by kasperd · 2010-12-11 11:53 · Score: 2
  
  That only matters if the hashes output is perfectly distributed. Unless they have some proof of that, a 256 bit hash is actually much less than 256 bits of security.
  That's part of what this whole process is about. If anybody can show that one of the hashes has a skewed distribution of the outputs, then that hash is very likely to leave the competition. However giving a proof that a hash has a uniform output is not easy unless it has a very simple structure, that is likely to suffer from other weaknesses. Even proving that all 2^256 combinations are possible is difficult, because if there was a constructive proof for that, then you would have given a preimage attack.
  
  Now if you start from a 4K hash, I'd stop worrying about brute force.
  That large a hash value would make a lot of the use cases totally impractical. And increasing the size doesn't guarantee you a better hash. I think the current peer review process does more for the security than increasing the size would.
  
  Sure, if the same people would go through the same process to design a hash with a 4K output, then it probably would also end up being very collision resistant, but also very slow. Most people think it is more useful to spend the time on designing a hash with a smaller output.
  
  --
  
  Do you care about the security of your wireless mouse?
45. Re:"Too fast to be true" by kasperd · 2010-12-11 11:58 · Score: 2
  
  OpenBSD slowed an internal hash function down to slow possible brute force attacks against the passwd file
  A slower implementation of the same hash doesn't add any security. You should expect the attackers to use the fastest possible implementation of the hash. Some uses of hashes for passwords will apply the hash multiple times (each iteration should use both the output from the previous iteration and the password itself). This makes the calculation equally slower for both the system using it and the attackers. The purpose of this is not to protect against weaknesses in the hash, but rather to protect against weak passwords. This is a very special use case for a hash function, and the requirements are quite different from what hash functions are usually used for.
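A hand-rolled sketch of the construction described above, where each iteration mixes in both the previous output and the password. This is only an illustration built on SHA-256 with an assumed function name; in practice a vetted scheme such as PBKDF2 or bcrypt would be the usual choice.

```python
import hashlib

def iterated_password_hash(password: bytes, salt: bytes, rounds: int) -> bytes:
    """Apply SHA-256 `rounds` times; every round sees the password again."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    state = hashlib.sha256(salt + password).digest()
    for _ in range(rounds - 1):
        # Input to each round: previous output plus the password itself.
        state = hashlib.sha256(state + password).digest()
    return state

digest = iterated_password_hash(b"asdf", b"per-user-salt", 50_000)
print(digest.hex())
```

The cost scales with `rounds` for the login system and the attacker equally, so the count is chosen to be tolerable for one login but painful across a whole dictionary.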
  
  --
  
  Do you care about the security of your wireless mouse?
46. Re:"Too fast to be true" by kasperd · 2010-12-11 12:08 · Score: 1
  
  You often have a cryptographically secure and reasonably fast hash algorithm available in some library
  If you want a hash function to use as a checksum to protect against random data corruption, then md5 is actually a quite good choice. It is much faster than sha1 and sha2. The output is large enough to make the risk of random collisions small. The design that aimed to make it resilient to an adversarial modification of the data means that it is totally unlikely that some pattern in the data corruptions could by chance cause changes to the hash to cancel out. And there are portable implementations available. If you try to write something that works on different byte orderings it is actually hard to write something that is simple and faster than md5.
  
  --
  
  Do you care about the security of your wireless mouse?
47. Re:"Too fast to be true" by farmkid · 2010-12-11 13:26 · Score: 1
  
  I've worked with NIST, and largely sorta respect 'em: yeah, they're bureaucrats and, yeah, they share some characteristics with the rest of the dumbocracy. On the other hand, they _do_ try, and my impression has been, after working with other fed agencies, that they're better than most. But not perfect, and probably not as 'alert' (i.e. as guarded with development funds) as private corporations would be.
  And, yeah: let's see the details on the fast ones. If they can be disproved, so much the better, but if good, they're math triumphs.
48. Re:"Too fast to be true" by mbkennel · 2010-12-11 13:59 · Score: 2
  
  Read the above very very carefully. This is superb government misdirection.
  The reader is encouraged to infer that the exceptional performance made them nervous. That is not what he claimed.
  I suspect (without insider knowledge) that forthcoming statement would be:
  "We preferred to be conservative about security, and in some cases did not select algorithms with exceptional performance, largely because an attack and theory known to government scientists but not to the public can crack a variant or limited case. Even though we knew of no clear attack against the full algorithm we did have suspicions of potential strategies in the future based on this knowledge that I know that you don't. "
49. Re:"Too fast to be true" by swillden · 2010-12-11 19:02 · Score: 2
  
  Actually, to defeat a hash, you need only defeat the last repetition, so, no, iteration doesn't help.
  Cite?
  The sort of attack you're talking about, where speed is a factor, is a dictionary attack. The attacker has reason to suspect that the input is from a relatively small set (e.g. it's a human-selected password) and it's therefore feasible to hash every element in the set and compare each output with the known hash value. If the hash is fast enough and the set is small enough, this may be feasible.
  One way to defeat that attack is to increase the set size, but in many cases (like passwords) that's not feasible. So another way to defeat the attack is to use a slow hash, because then testing each dictionary entry will take long enough that searching the dictionary isn't practical. On the other hand, the computation required to compute one hash during a login is fast enough to be acceptable.
  So, under that scenario, look at iterating your fast hash to create a slow one. How do you "defeat the last repetition"? What does that even mean? Are you assuming that you can actually reverse the hash, a pre-image attack? If that's the case, then brute force is the least of your worries. And without that, what does it even mean?
  Suppose you could somehow find out what the output of the n-1 iteration on the actual input was. What have you achieved? Well, to find out which input to iteration 1 maps to that n-1 iteration output you need to... search your dictionary applying n-1 iterations to each entry.
  Either you're talking about something completely different, or you're up in the night.
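The dictionary attack against an iterated hash can be sketched as follows. The word list, the iteration count, and the choice of SHA-256 are assumptions for illustration. The call counter shows that the attacker pays the full iteration count for every candidate, with no way to attack "just the last repetition".

```python
import hashlib

def iterated_sha256(data: bytes, iterations: int) -> bytes:
    """Apply SHA-256 `iterations` times, feeding each digest into the next call."""
    digest = data
    for _ in range(iterations):
        digest = hashlib.sha256(digest).digest()
    return digest

def dictionary_attack(target: bytes, words: list, iterations: int):
    """Return (matching word or None, number of SHA-256 calls spent)."""
    calls = 0
    for word in words:
        calls += iterations  # every guess costs the full chain
        if iterated_sha256(word, iterations) == target:
            return word, calls
    return None, calls

words = [b"123456", b"password", b"asdf", b"hunter2"]
target = iterated_sha256(b"hunter2", 10_000)
found, calls = dictionary_attack(target, words, 10_000)
print(found, calls)  # b'hunter2' 40000
```

Total work is dictionary size times iteration count, so raising the count by a factor of k makes the whole search k times slower.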
  
  --
  Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
50. Re:"Too fast to be true" by plover · 2010-12-11 19:17 · Score: 1
  
  Short strings are supposed to be salted anyway.
  That doesn't change anything if the algorithm is fast, since the salt is stored _with_ the hash. If you have the hash, you have the salt. Dictionary based means that the _original_ string is looked up in a dictionary.
  Salting doesn't change the security of a single hash against brute force attack, but it does change the security of a collection of hashes.
  Salt provides a measure of hiding otherwise innocuous collisions. The classic example is a password database. If I enter a password of "asdf" and you enter a password of "asdf", our unsalted hashed values would be identical in the database, where any random admin could spot that they were identical. It's like a poor-man's dictionary attack.
  Remember, short strings are inherently more susceptible to collisions than long strings, particularly if they're strings people can choose, and not just random data.
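The "asdf" example above, as a short sketch. SHA-256 and the 16-byte random salt are assumptions chosen for the illustration.

```python
import hashlib
import os

def unsalted_entry(password: bytes) -> str:
    return hashlib.sha256(password).hexdigest()

def salted_entry(password: bytes, salt: bytes = None):
    """Return (salt, hash); a fresh random salt is drawn if none is given."""
    if salt is None:
        salt = os.urandom(16)
    return salt, hashlib.sha256(salt + password).hexdigest()

# Two users pick the same password.
print(unsalted_entry(b"asdf") == unsalted_entry(b"asdf"))  # True: visible to any admin

_, mine = salted_entry(b"asdf")
_, yours = salted_entry(b"asdf")
print(mine == yours)  # the entries no longer reveal the shared password
```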
  
  --
  John
51. Re:"Too fast to be true" by tkalfigo · 2010-12-12 10:10 · Score: 1
  
  Who are these guys anyway? You expect better from NIST.
  They are the same guys who came up with this piece of scientific work
52. Re:"Too fast to be true" by Anonymous Coward · 2010-12-13 08:12 · Score: 0
  
  * s l o w c l a p *
Bruce Schneier by Jackanackanoree · 2010-12-10 13:09 · Score: 2

Bruce Schneier helped to make skein http://www.schneier.com/skein.html
1. Re:Bruce Schneier by fuzzyfuzzyfungus · 2010-12-10 13:16 · Score: 1
  
  But only he is capable of using it for lossless compression...
2. Re:Bruce Schneier by Anonymous Coward · 2010-12-10 13:17 · Score: 0
  
  Rumor has it that djb has broken skein, though.
3. Re:Bruce Schneier by Martin+Blank · 2010-12-10 15:27 · Score: 1
  
  That may well be true -- djb is one of the smartest guys out there -- but if he hasn't provided the proof to either NIST or the Skein team, it shouldn't really factor into the results.
  
  --
  You can never go home again... but I guess you can shop there.
4. Re:Bruce Schneier by e9th · 2010-12-10 15:54 · Score: 1
  
  I see that his own entry, CubeHash, made it into Round Two, but not the finals.
good! by larry+bagina · 2010-12-10 13:12 · Score: 3, Funny

Our lawyers won't let us convert our svn repositories to git since git uses SHA-1, which is known to be vulnerable to collisions. Hopefully, they pick a SHA-3 soon!

--
Do you even lift?
These aren't the 'roids you're looking for.
1. Re:good! by Anonymous Coward · 2010-12-10 13:15 · Score: 0
  
  Why do your insec^Wlawyers even have a say in the matter?
2. Re:good! by Haedrian · 2010-12-10 13:22 · Score: 1
  
  I find the suggestion that Lawyers make purely technical decisions in the company to be incredibly confusing.
  
  Isn't this the sort of decision that the IT staff take?
3. Re:good! by CoderJoe · 2010-12-10 13:23 · Score: 1
  
  I don't understand why your lawyers have a say in the matter, but I think you are out of luck. ANY hash function is going to have collisions. That is just the nature of the beast. The only thing you get from SHA-2 or SHA-3 over SHA-1 is better probability of not colliding, and a more difficult time of deliberately creating a collision.
4. Re:good! by Anonymous Coward · 2010-12-10 13:26 · Score: 0
  
  i'm assuming it's a joke. if they're waiting for a hash without collisions they'll be waiting a while...
5. Re:good! by noidentity · 2010-12-10 13:26 · Score: 1
  
  Actually, some hash functions have no collisions, for example one that returns the entire input as the hash. They should use that as their git hashes. Oh, wait...
6. Re:good! by icebike · 2010-12-10 13:34 · Score: 2
  
  And if they were waiting for a lawyer who understood the issue they would be waiting longer.
  
  --
  Sig Battery depleted. Reverting to safe mode.
7. Re:good! by John+Hasler · 2010-12-10 13:35 · Score: 3, Insightful
  
  The only thing you get from SHA-2 or SHA-3 over SHA-1 is better probability of not colliding, and a more difficult time of deliberately creating a collision.
  And the risk of accidental collisions is negligible while deliberate collisions are irrelevant to the use of hashes in Git as they have no security-related function there.
  
  --
  Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
8. Re:good! by Anonymous Coward · 2010-12-10 13:43 · Score: 0
  
  I'm sure you know this, but for the record, when stupid people say a hash function "has teh collisions", they intend to say that it is easier to find collisions than to randomly stumble over them in a brute-force attack. Collisions are a natural necessity of a hash function.
9. Re:good! by arth1 · 2010-12-10 13:51 · Score: 1
  
  Collisions are a natural necessity of a hash function.
  Not necessarily, no. If the hash is bigger than the hashed data, that's not a certainty.
  And yes, hashes longer than the hashed data can be useful too. You don't have to look further than /etc/shadow or a .htpasswd file for practical examples.
10. Re:good! by Anonymous Coward · 2010-12-10 14:07 · Score: 5, Informative
  
  The only thing you get from SHA-2 or SHA-3 over SHA-1 is better probability of not colliding, and a more difficult time of deliberately creating a collision.
  And the risk of accidental collisions is negligible while deliberate collisions are irrelevant to the use of hashes in Git as they have no security-related function there.
  Actually SHA-1s do have a security related function. I don't remember where I read this explanation, but it is plausible, although difficult.
  SHA-1s are used to uniquely identify the object in GIT. An attacker could write a new patch and generate a collision for it. The attacker would then submit the good patch and get the maintainers to accept the patch and sign it with their GPG key. The attacker would then create a rogue mirror site and replace the good patch with the malicious collision. Because the SHA-1s would match this would not invalidate the GPG signature of the maintainers. If anyone went to the rogue site they would receive a poisoned copy of the git repository that appears cryptographically valid.
  Now the collision would be pretty easy to see if the replaced object was plain source code, because generating a collision usually involves writing out a whole bunch of garbage to a file. However if the replaced object was a binary blob for a driver or a checked in library or something, then it would be much less obvious.
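  For readers wondering what "uniquely identify the object" means in practice, the sketch below shows how a blob ID is derived in a SHA-1 based git repository, as far as I know: SHA-1 over a short header plus the file content. The function name `git_blob_id` is made up for this example. Two different contents with the same ID would be indistinguishable to anything that only compares IDs, which is what the attack described above relies on.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 over the header b"blob <size>" + NUL byte, followed by the raw content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b""))                 # ID of the empty blob
print(git_blob_id(b"hello world\n"))    # ID of a one-line file
```

  If git is installed, `git hash-object <file>` should print the same ID for the same bytes.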
11. Re:good! by Anonymous Coward · 2010-12-10 14:31 · Score: 0
  
  True. That said, when you hash data smaller than the hash, you are usually using regular hash functions with collisions in their original intended use case (DES, MD5, Blowfish).
12. Re:good! by Omnifarious · 2010-12-10 15:28 · Score: 2
  
  People are always saying "Oh, collisions aren't important for this application.". And they're almost always wrong. Stop trying to be a security expert and just quit using an algorithm when it's broken instead of coming up with excuses not to change it.
  
  --
  Need a Python, C++, Unix, Linux develop
13. Re:good! by hedwards · 2010-12-10 15:44 · Score: 1
  
  Eh, is letting lawyers make purely technical decisions really that much worse than letting the accountants or the non-IT managers do it?
14. Re:good! by TheLink · 2010-12-10 16:09 · Score: 1
  
  Now the collision would be pretty easy to see if the replaced object was plain source code, because generating a collision usually involves writing out a whole bunch of garbage to a file.
  
  That assumes people would actually 1) look at the source code and 2) notice the problem.
  In most source code you can insert comments. So the whole bunch of garbage can be commented out.
15. Re:good! by Mysteray · 2010-12-10 21:00 · Score: 3, Insightful
  
  An attacker could write a new patch and generate a collision for it. The attacker would then submit the good patch and get the maintainers to accept the patch and sign it with their GPG key. The attacker would then create a rogue mirror site and replace the good patch with the malicious collision.
  That would definitely win you the prize for "the most absurdly over-complicated and difficult way of pwning a Linux box".
  Why don't you just watch [Full-disclosure] for the 0-day of the week like everyone else?
  The bear only has to be faster than the first of the two hunters.
16. Re:good! by Skuto · 2010-12-10 21:19 · Score: 1
  
  I don't know, there's been several cases where lawyers were arguing about the possibility of false DNA matches, which somewhat amounts to the same thing.
17. Re:good! by maestroX · 2010-12-10 23:31 · Score: 1
  
  Our lawyers won't let us convert our svn repositories to git since git uses SHA-1, which is known to be vulnerable to collisions.
  
  Just wait till you tell them about branching..
18. Re:good! by oxygene2k2 · 2010-12-11 01:32 · Score: 1
  
  That issue has been debated and refuted before git even existed: http://www.monotone.ca/docs/Hash-Integrity.html
  There's also venti of plan9 that uses SHA-1 for content addressing: http://en.wikipedia.org/wiki/Venti
  But of course, they must be all wrong.
19. Re:good! by amorsen · 2010-12-11 04:23 · Score: 1
  
  False DNA matches will happen. Not because of mathematical chance, but because it is extremely easy to contaminate DNA samples. I wish they would always send the samples from the crime scene to an entirely different lab from the samples from the suspected individual, for instance, but AFAIK it is often the same lab handling both. The same thing applies to samples from different crime scenes, where the lab could end up establishing links between unrelated crimes.
  
  --
  Finally! A year of moderation! Ready for 2019?
20. Re:good! by Ant+P. · 2010-12-11 07:47 · Score: 1
  
  Does Git actually have a test-suite test to simulate a collision? Is there data loss involved?
Skein by betterunixthanunix · 2010-12-10 13:26 · Score: 1

Skein is broken, last I heard...

--
Palm trees and 8
1. Re:Skein by Omnifarious · 2010-12-10 13:51 · Score: 2
  
  I've been following the progress on the SHA-3 Zoo and I haven't seen anything indicating Skein is broken. I've been following Skein with particular interest because I like how it can be tweaked in various ways to serve particular needs.
  
  --
  Need a Python, C++, Unix, Linux develop
2. Re:Skein by Sancho · 2010-12-10 14:13 · Score: 1
  
  It was broken, but it has been fixed.
  Actually, Threefish was broken (which Skein relied upon.)
3. Re:Skein by Anonymous Coward · 2010-12-10 15:07 · Score: 0
  
  djb has released some stuff:
  http://eprint.iacr.org/2010/623.pdf
  No details at all, but .. certainly .. heh .. provoking. :)
4. Re:Skein by Omnifarious · 2010-12-10 15:15 · Score: 1
  
  Is this the attack by djb that even he hasn't posted clear details of? Or is this a previous attack that Schneier and company solved with their 2nd round tweaks that improved diffusion?
  
  --
  Need a Python, C++, Unix, Linux develop
5. Re:Skein by Omnifarious · 2010-12-10 15:16 · Score: 1
  
  Well, that's amusing. But without details, it's only slightly better than saying "Skein sucks!".
  
  --
  Need a Python, C++, Unix, Linux develop
6. Re:Skein by Sancho · 2010-12-10 15:22 · Score: 1
  
  I was referring to the previous attack which was solved in the 2nd round tweaks.
7. Re:Skein by Anonymous Coward · 2010-12-10 16:26 · Score: 2, Insightful
  
  Woosh!
  Definition of skein: A loosely-wound, oblong ball of yarn
8. Re:Skein by Omnifarious · 2010-12-12 18:29 · Score: 1
  
  So, it was intended as a joke, and not a hint that there's some terrible flaw in the skein algorithm? As a joke, it's pretty darned funny. :-)
  
  --
  Need a Python, C++, Unix, Linux develop
Bah! by jd · 2010-12-10 13:29 · Score: 5, Interesting

None of the good names survived!
Still, there was a lot of debate on the SHA3 mailing list governing the criteria as it was felt that some of the criteria were being abused and others were being ignored. I, and a few others, advocated an approach where the best compromise solution was the "winner" for SHA3 but the runner-up that was best for some specific specialist problem (and still ok at everything else, since it's a runner-up, and also free of known issues) would then be considered the winner as "SHA3b". That way, you'd also get a strong specialist hash. The idea for this compromise was due to SHA2 not being widely adopted because it IS ok for everything but not good for anything. Some people wanted SHA3 to be wholly specialised, others wanted it to be as true to the original specs as possible, the compromise was suggested as a means of providing both without making the bake-off unnecessarily complex or having to have a whole parallel SHA3 contest for the specialist system.
The main problem with the finalists is the inclusion of Skein. The use of narrow-pipe algorithms has been widely criticised by people far more knowledgeable than myself because it violates some of the security guarantees that are supposed to be present. The argument for Skein is that the objection is theoretical.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
1. Re:Bah! by Omnifarious · 2010-12-10 13:53 · Score: 2
  
  I'm really curious as to why Blue Midnight Wish wasn't selected. I've read a bunch of the papers and nobody seemed to be able to come up with any reasonable reason it was weak, and it's very fast.
  
  --
  Need a Python, C++, Unix, Linux develop
2. Re:Bah! by Anonymous Coward · 2010-12-10 16:10 · Score: 0
  
  That way, you'd also get a strong specialist hash. The idea for this compromise was due to SHA2 not being widely adopted because it IS ok for everything but not good for anything. Some people wanted SHA3 to be wholly specialised, others wanted it to be as true to the original specs as possible, the compromise was suggested as a means of providing both without making the bake-off unnecessarily complex or having to have a whole parallel SHA3 contest for the specialist system.
  Nothing stops protocol designers from using other algorithms. Just make sure you use a protocol field so implementations can negotiate, and perhaps set "preference levels".
  Some implementers will need to use SHA-3 (whatever algorithm that ends up being) because of regulatory restrictions (government vendors in the US will need to use NIST-approved stuff), but the general public may be able to profit from one of the runner-up algorithms that could be a better, specialized fit for the application in question. SSL/TLS support multiple algorithms, as do SSH and PGP. If what you're doing is best suited to a particular algorithm, use it—just make sure it can be expanded—because, after all, all algorithms tend to be "broken" eventually, so you might as well put some flexibility in from the start.
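  A minimal sketch of that kind of flexibility, in Python: the digest carries the name of its algorithm, and each side picks the first algorithm on its own preference list that the peer also supports. Everything here (the `LOCAL_PREFERENCE` list, the `name:hexdigest` format, the function names) is a hypothetical illustration, not how TLS, SSH or PGP encode their choices.

```python
import hashlib
import hmac

# Hypothetical preference list, most preferred first; entries are hashlib algorithm names.
LOCAL_PREFERENCE = ["sha512", "sha256", "sha1"]


def negotiate(local_prefs, remote_supported):
    """Pick the first locally preferred algorithm that the peer also supports."""
    for name in local_prefs:
        if name in remote_supported:
            return name
    raise ValueError("no hash algorithm in common")


def tagged_digest(algorithm: str, data: bytes) -> str:
    """Encode the digest together with its algorithm name, e.g. 'sha256:ab12...'."""
    return f"{algorithm}:{hashlib.new(algorithm, data).hexdigest()}"


def verify(tagged: str, data: bytes) -> bool:
    """Recompute the digest with the algorithm named in the tag and compare."""
    algorithm, _, _ = tagged.partition(":")
    if algorithm not in LOCAL_PREFERENCE:  # refuse algorithms we do not accept
        return False
    return hmac.compare_digest(tagged, tagged_digest(algorithm, data))


chosen = negotiate(LOCAL_PREFERENCE, {"sha1", "sha256"})
tag = tagged_digest(chosen, b"hello")
print(chosen, verify(tag, b"hello"))
```

  Retiring a broken algorithm then means deleting one entry from the list rather than redesigning the format.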
3. Re:Bah! by mdmkolbe · 2010-12-10 18:42 · Score: 1
  
  Which specialist problem would SHA3b have been for? Any random specialist problem or is there a particularly important specialist problem? The former doesn't sound very useful, but I don't know what the latter would be.
4. Re:Bah! by jhnphm · 2010-12-10 18:53 · Score: 1
  
  It has a non-regular structure that causes it to be very large when implemented in hardware - it's one of the largest, if not the largest period, of the round 2 candidates in terms of chip area, and the performance isn't all that great in hardware to make up for this either.
5. Re:Bah! by jhnphm · 2010-12-10 18:57 · Score: 2
  
  In addition, the NIST email said "No algorithm survived to become a finalist that did not have a clear round structure that could be readily adjusted to trade security for performance." This probably refers to BMW.
6. Re:Bah! by Omnifarious · 2010-12-10 23:08 · Score: 1
  
  Thanks! Those two reasons make a lot of sense.
  
  --
  Need a Python, C++, Unix, Linux develop
7. Re:Bah! by TheRaven64 · 2010-12-11 00:26 · Score: 2
  
  I, and a few others, advocated an approach where the best compromise solution was the "winner" for SHA3 but the runner-up that was best for some specific specialist problem (and still ok at everything else, since it's a runner-up, and also free of known issues) would then be considered the winner as "SHA3b".
  
  That would entirely defeat the point of the competition. The purpose is to select one, good, cryptographic hash that will be called SHA-3 and can therefore be used in places where US government regulations require a strong cryptographic hash.
  The other algorithms aren't going away at the end of the competition. If you have a specialist purpose where one of them is better suited than SHA-3, then you can still pick your own algorithm. It just won't be called SHA-anything.
  
  --
  I am TheRaven on Soylent News
8. Re:Bah! by jd · 2010-12-15 08:50 · Score: 1
  
  It may be bad in hardware, but if it's good in software then I'd consider it superb for software-only uses of hashes.
  This also goes back to an argument I (and a few others) made on the list - since some of the original requirements were being dropped anyway, why not have a runner-up that is acceptably good at everything but is especially good at one that is frequently used?
  That way, you have a "winner" that is good overall but you also have something that has a specialist use but is decent elsewhere.
  In this case, I'd say BMW is ideal for file signatures (be it for file transfers, Tripwire/AIDE-style uses, etc). This isn't something hardware is normally used for and being very fast in software makes it ideal.
  
  --
  It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
9. Re:Bah! by jd · 2010-12-15 08:56 · Score: 1
  
  The purpose is to select one good (overall) universal-purpose cryptographic hash and call it SHA-3.
  The problem a specialist hash has is that it isn't SHA-something. As such, it can't be used in Federal applications. At all.
  Let's say that Blue Midnight Wish is the ideal for file validation (which is mostly done in software). The Feds can't use it.
  Now, you can't produce an infinite number of alternatives, but a number of us felt that one - just one - Federally-usable special-purpose hash had a place, where it was not that important what that special purpose was, it was so that the Feds would have a very controlled level of flexibility when the general-purpose "ideal" is unsuitable.
  
  --
  It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
10. Re:Bah! by jd · 2010-12-15 09:13 · Score: 1
  I wanted NIST to be able to say, at the end of the final bake-off "hey, criterion X is vitally important in a substantial portion of cases where cryptographic hashes are used and criterion Y is really not that critical in most of those cases; SHA-3 is ok at everything, but algorithm xyzzy is massively better at X - it wasn't picked because it's massively worse at Y, but that just doesn't matter".
  Since NIST is in a better position to know if X and Y are even real cases and what the hell those cases would be, I didn't want to suggest things that wouldn't actually be that useful.
  However, since that route isn't getting taken, here's what inspired me:
  Cryptographic hashes on Mondex-style smart cards would want to be very fast in hardware, to hell with software. Great for not just money but secure handling of data in a portable medium.
  Hashes for files (as per Tripwire or AIDE) would want to be very fast in software, but this just isn't done in hardware so who cares what speed it is there?
  Network security has to consider ATM (cells with 48-byte payloads); passwords, likewise, have to consider very short strings. Ethernet's largest jumbo packet is about 9K. A hash that can guarantee no pre-images or other even minor weaknesses for extremely short inputs would be perfect in these cases. Doesn't matter if it's slow for data of greater size.
  Three specialist cases that the Federal Government could realistically use on a large-scale basis but won't be able to.
  --
  It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Yet another crappy summary... by msauve · 2010-12-10 13:55 · Score: 3, Informative

Curiously, some of the faster algorithms were eliminated as they were felt to be "too fast to be true."
Not only is the claimed quote ("too fast to be true") nowhere to be found in the linked article, but there isn't even a basis for that claim.

--
"National Security is the chief cause of national insecurity." - Celine's First Law
1. Re:Yet another crappy summary... by Anonymous Coward · 2010-12-10 16:33 · Score: 0
  
  That's not how Slashdot works. Slashdot guarantees that any claims made in the Slashdot summary cannot be found in TFA.
2. Re:Yet another crappy summary... by udittmer · 2010-12-10 20:19 · Score: 5, Informative
  
  Not only is the claimed quote ("too fast to be true") nowhere to be found in the linked article, but there isn't even a basis for that claim.
  There is in fact a basis for that claim, even if it isn't mentioned in that particular article. See http://crypto.junod.info/2010/12/10/sha-3-finalists-announced-by-nist/ for more about that.
3. Re:Yet another crappy summary... by Skuto · 2010-12-10 21:10 · Score: 4, Informative
  
  >Not only is the claimed quote ("too fast to be true") nowhere to be found in the linked article, but there isn't even a basis for that claim.
  People read the articles? That's new.
  My original post had no links, because the original announcement was on a password protected mailing list. If you read that (it's been posted elsewhere since), you will see the statement it refers to.
  Some fast algorithms were eliminated based on partial attacks or observations that are not real attacks. This means there's a potential we miss out on a faster but good algorithm, because most partial attacks never make it to full attacks. Using this to eliminate ciphers means the selection is a bit of a black art (that shouldn't surprise insiders too much).
  Some people were advocating the opposite approach, namely to just pick the fastest/smallest ciphers and then see which one wasn't broken at the end of the process. Clearly, NIST has taken a very different approach. And given hash function history, an understandable one.
4. Re:Yet another crappy summary... by Anonymous Coward · 2010-12-10 22:21 · Score: 0
  
  Well, but in each case it wasn't just speed per se - there were some theoretical partial attacks that didn't work as stated but made people nervous. That is according to the nice link you have given.
  So no, sleep(10) wouldn't help.
Mod parent up. by John+Hasler · 2010-12-10 14:30 · Score: 0

n/t

--
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
1. Re:Mod parent up. by John+Hasler · 2010-12-10 15:59 · Score: 0
  
  Um, I meant mod the AC's response to me up, not my comment.
  
  --
  Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
Use them all! by Anonymous Coward · 2010-12-10 14:38 · Score: 1, Funny

Use them all, and XOR the results together to get your final hashvalue.
That way, you're safe unless they're all broken, right?
1. Re:Use them all! by Anonymous Coward · 2010-12-10 14:59 · Score: 0
  
  As a bonus, it's not "too fast"!
2. Re:Use them all! by PseudonymousBraveguy · 2010-12-10 21:04 · Score: 1
  
  For extra security, use each of them twice!
3. Re:Use them all! by Mysteray · 2010-12-10 21:14 · Score: 2
  
  It might help, it might not help much, it might make things slightly worse. It will be measurably slower and not measurably more secure.
  You'll be on your own with it because it will not be an interoperable, accepted standard. Hashes are often used for data shared by multiple parties.
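  For what it's worth, the construction from the joke is easy to write down. The sketch below XORs two 256-bit digests of the same input and, for comparison, concatenates them. SHA-256 and SHA3-256 are used only because they ship with Python's hashlib; they are stand-ins, not a recommendation. The comparison matters because a collision on the concatenation is a collision on both parts at once, whereas a collision on the XOR need not be a collision on either part, so XOR comes with no "safe unless all are broken" guarantee for collisions.

```python
import hashlib


def both_digests(data: bytes):
    """Two 256-bit digests of the same input (stand-in algorithms, see above)."""
    return hashlib.sha256(data).digest(), hashlib.sha3_256(data).digest()


def xor_combined(data: bytes) -> bytes:
    """XOR the digests byte by byte: output stays at 32 bytes."""
    a, b = both_digests(data)
    return bytes(x ^ y for x, y in zip(a, b))


def concatenated(data: bytes) -> bytes:
    """Concatenate the digests: output grows to 64 bytes."""
    a, b = both_digests(data)
    return a + b


print(xor_combined(b"hello").hex())
print(concatenated(b"hello").hex())
```

  Either way you pay for every hash you compute, which is the "measurably slower" part.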
"password" by pgn674 · 2010-12-10 14:38 · Score: 0, Troll

A friend of mine discovered and I verified the other day that BASE64(SHA256("password")) == XohImNooBHFR0OVvjcYpJ3NgPQ1qq73WKhHvch0VQtg=
Is that "ohImNooB" just a coincidence? If so, then it's quite the coincidence. Taking the SHA256 of a password and converting it to BASE64 is a fairly common way of storing and displaying a password on a system. To have the representation of the word "password", which is a very noobish password to choose, contain the string "ohImNooB". Quite the coincidence indeed.
Unless it's not a coincidence. Would that be possible?
1. Re:"password" by johanatan · 2010-12-10 15:13 · Score: 1
  
  It is most surely a coincidence.
2. Re:"password" by mx_mx_mx · 2010-12-10 15:23 · Score: 1
  
  Who modded that Troll?
  
  --
  Linux forever
3. Re:"password" by Anonymous Coward · 2010-12-10 15:33 · Score: 0
  
  A friend of mine discovered and I verified the other day that BASE64(SHA256("password")) == XohImNooBHFR0OVvjcYpJ3NgPQ1qq73WKhHvch0VQtg=
  Is that "ohImNooB" just a coincidence? If so, then it's quite the coincidence. Taking the SHA256 of a password and converting it to BASE64 is a fairly common way of storing and displaying a password on a system. To have the representation of the word "password", which is a very noobish password to choose, contain the string "ohImNooB". Quite the coincidence indeed.
  Unless it's not a coincidence. Would that be possible?
  Is "jcYpJ3NgPQ" just a coincidence also? I think not!
4. Re:"password" by ducomputergeek · 2010-12-10 18:18 · Score: 1
  
  Yeah, must be noobs. They forgot the ROT13 step...pfffst, bloody amateurs.
  
  --
  "The problem with socialism is eventually you run out of other people's money" - Thatcher.
5. Re:"password" by pgn674 · 2010-12-10 19:05 · Score: 1
  
  Dunno. I had meant to also link it to the article, but I forgot.
  I was thinking that it may not be a coincidence. What if, while developing the SHA256 algorithm, they needed an arbitrary starting point or seed, like deciding where to begin drawing a circle? And so, whoever made that choice chose the one that gave this hash for this string? And what if someone did something similar in the SHA3 algorithm? That would be cool to find.
  But, I don't know how the SHA2 algorithms work, and there probably is not any arbitrary starting point or seed.
6. Re:"password" by Mysteray · 2010-12-10 21:28 · Score: 2
  
  http://en.wikipedia.org/wiki/SHA-2
  So for SHA-256 the starting constants are the "first 32 bits of the fractional parts of the square roots of the first 8 primes 2..19" and "first 32 bits of the fractional parts of the cube roots of the first 64 primes 2..311".
  That only takes a few words to explain, and most of it is dictated by the design (e.g., "32 bits"). The hash designer is signaling that he only had freedom to select a few general concepts here and there.
  http://en.wikipedia.org/wiki/Nothing_up_my_sleeve_number
  You can be sure that the people who approve these kinds of things are pretty paranoid about the possibility of someone sneaking a back door in there. If the constants had been proposed as "bits from the base-2 representation of pi starting at bit position 2364826687681" there would have been some serious eyebrow raising.
  Still, it's a pretty cool find. I can't wait for the upcoming holiday party, I will surely impress the ladies with that!
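  The "nothing up my sleeve" recipe quoted above is short enough to check by hand. The sketch below recomputes the SHA-256 constants from square and cube roots of primes using exact integer arithmetic; the helper names are made up for this example, and `math.isqrt` needs Python 3.8 or later.

```python
from math import isqrt


def first_primes(n: int) -> list:
    """First n primes by trial division (fine for n = 64)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def icbrt(n: int) -> int:
    """Integer cube root: the largest r with r**3 <= n, found by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


# floor(sqrt(p) * 2**32) equals isqrt(p << 64); its low 32 bits are the fractional part.
H = [isqrt(p << 64) & 0xFFFFFFFF for p in first_primes(8)]
# floor(cbrt(p) * 2**32) equals icbrt(p << 96); again keep the low 32 bits.
K = [icbrt(p << 96) & 0xFFFFFFFF for p in first_primes(64)]

print([f"{h:08x}" for h in H])
print(f"{K[0]:08x} ... {K[63]:08x}")
```

  The printed words can be compared against the tables in the Wikipedia article linked above.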
7. Re:"password" by SigmundFloyd · 2010-12-11 00:03 · Score: 1
  
  A friend of mine discovered and I verified the other day that
  BASE64(SHA256("password")) == XohImNooBHFR0OVvjcYpJ3NgPQ1qq73WKhHvch0VQtg=
  Can't reproduce that.
  ~> echo -n password | sha256sum | base64
  NWU4ODQ4OThkYTI4MDQ3MTUxZDBlNTZmOGRjNjI5Mjc3MzYwM2QwZDZhYWJiZGQ2MmExMWVmNzIxZDE1NDJkOCAgLQo=
  ~> echo password | sha256sum | base64
  NmIzYTU1ZTAyNjFiMDMwNDE0M2Y4MDVhMjQ5MjRkMGMxYzQ0NTI0ODIxMzA1ZjMxZDkyNzc4NDNiOGExMGY0ZSAgLQo=
  
  Unless it's not a coincidence. Would that be possible?
  Lame troll.
  
  --
  Knowledge is power; knowledge shared is power lost.
8. Re:"password" by ElMiguel · 2010-12-11 00:51 · Score: 1
  
  As it happens, you're doing it wrong, because the output of sha256sum is a hex string, not binary. You should have realised because 256 bits in base64 should be ceil(256/6) = 43 characters long, not the ~90 you get.
  This produces the correct result:
  $ echo -n password | sha256sum | perl -ane "print pack('H*', @F)" | base64
  XohImNooBHFR0OVvjcYpJ3NgPQ1qq73WKhHvch0VQtg=
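  The same check in Python, as a small sketch, with the mistake from the parent post reproduced next to it. The only assumption is the exact shape of the sha256sum output line (64 hex digits, two spaces, "-", newline), which is what the longer base64 string in the parent decodes to.

```python
import base64
import hashlib

digest = hashlib.sha256(b"password")

# Right: base64 over the 32 raw digest bytes.
correct = base64.b64encode(digest.digest()).decode()

# Wrong: base64 over the text line that sha256sum prints for stdin input.
sha256sum_line = (digest.hexdigest() + "  -\n").encode()
mistaken = base64.b64encode(sha256sum_line).decode()

print(len(correct), correct)
print(len(mistaken), mistaken)
```

  The raw digest encodes to 43 significant characters plus one "=" of padding; the hex text encodes to about twice that.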
9. Re:"password" by ElMiguel · 2010-12-11 00:53 · Score: 1
  
  The greater noobishness here would be storing the unsalted hash in plaintext.
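  A hedged sketch of the less noobish alternative: a random per-user salt and a deliberately slow key-derivation function, here PBKDF2-HMAC-SHA256 from Python's standard library. The iteration count and the function names are assumptions for illustration; pick a work factor that suits your own hardware.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, not a recommendation


def hash_password(password: str, salt: bytes = None):
    """Return (salt, derived_key); a fresh random salt is drawn unless one is supplied."""
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, key


def check_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)


salt, key = hash_password("password")
print(check_password("password", salt, key), check_password("passw0rd", salt, key))
```

  With a salt, two users who both choose "password" get different stored values, so a single recognisable string no longer gives the choice away.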
10. Re:"password" by SigmundFloyd · 2010-12-11 02:47 · Score: 1
  
  Point taken. I still think it's a troll, though.
  
  --
  Knowledge is power; knowledge shared is power lost.
11. Re:"password" by Serious+Callers+Only · 2010-12-11 22:10 · Score: 1
  
  I don't think it's a troll, as they're certainly not lying about the input/output, however it's more probable that the base64 algorithm was engineered to have this output than SHA digests, since that's what actually produces this output based on that particular hash, and it has no pretensions to security. Here it is in ruby - same result.
  require 'base64'
  require 'digest'
  puts Base64::encode64(Digest::SHA256.digest('password'))
  => XohImNooBHFR0OVvjcYpJ3NgPQ1qq73WKhHvch0VQtg=
  There's no way this is a simple coincidence, given it's three words which form a sentence.
SHAs decrypts just like a woman? by Zero__Kelvin · 2010-12-10 14:56 · Score: 1

You didn't think that when sha gave up the goods that fast that you were the only one sha was giving it up to, did you?

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
One of the many, many reasons why IANAL by Zero__Kelvin · 2010-12-10 15:10 · Score: 1

"Our lawyers won't let us convert our svn repositories to git since git uses SHA-1, which is known to be vulnerable to collisions."
That makes perfect sense. Better to use an SCM that gives no assurance that what you get back is the same as what you committed than use one that was designed in large part to fix that known problem with Subversion, and has been used to make hundreds of thousands of changes to one of the biggest software products on the planet without any such problem.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
1. Re:One of the many, many reasons why IANAL by CoderJoe · 2010-12-10 17:10 · Score: 1
  
  an SCM that gives no assurance that what you get back is the same as what you committed
  
  I'm going to have to ask for more concrete demonstrations of that claim than "Linus said so" before I believe it.
2. Re:One of the many, many reasons why IANAL by Zero__Kelvin · 2010-12-10 17:22 · Score: 2
  
  "I'm going to have to ask for more concrete demonstrations of that claim than "Linus said so" before I believe it."
  Damn. I was just about to go to sleep too. Now I have to stay up all night worrying what you think, and how I'm going to do that thinking for you 8-(
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
3. Re:One of the many, many reasons why IANAL by CoderJoe · 2010-12-10 19:43 · Score: 1
  
  You're trying to make a persuasive argument that using SVN is a bad choice, and the only thing you have provided to back that claim up is a video of Linus saying the exact same statement you made. So far, I have not found any evidence to back up this argument.
  I'm going to guess your (and Linus') reasoning is that because git has the SHA-1 hashes, and checks them, that you get the same data out as you put in. I have not seen anything that ensures that git will not silently overwrite or discard data in the case that there is a hash collision.
  Yes, the probability is small, but that does not mean you can ignore it completely. There is a small probability that a hard drive will be DOA (yes, gigantic compared to hash collisions, but bear with me). However, I have had the bad luck of having two drives of different models, purchased 9 months apart, be bad out of the box. This is without buying large quantities of drives. Just because something is statistically improbable does not mean that it cannot happen.
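  To put a number on "small", here is a back-of-the-envelope sketch using the usual birthday approximation, p ≈ 1 - exp(-n(n-1) / 2^(b+1)) for n objects and a b-bit hash. It assumes the hash behaves like a uniformly random function, so it only speaks to accidental collisions, not to deliberate attacks; the function name is made up for this example.

```python
import math


def collision_probability(num_objects: int, hash_bits: int) -> float:
    """Birthday approximation for at least one accidental collision among num_objects hashes."""
    exponent = num_objects * (num_objects - 1) / 2 ** (hash_bits + 1)
    return -math.expm1(-exponent)  # expm1 keeps precision when the result is tiny


for n in (10**6, 10**9, 10**12):
    print(f"{n:>17,} objects: p ~ {collision_probability(n, 160):.3e}")
```

  For a 160-bit hash the approximation only reaches even odds somewhere around 2^80 objects, which is why the DOA hard drive is the far likelier failure.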
4. Re:One of the many, many reasons why IANAL by Anonymous Coward · 2010-12-10 20:49 · Score: 1
  
  I think Linus may have referred to svn backend implementations, which at that time used Berkeley DB, which had a nasty corruption habit, or fsfs, which was their in-house project, had always had a lot of shortcomings and was generally an immature project
  (look, they say this themselves, they're going to throw it away for the next big release iteration in favor of WC-NG)
  it has to be said that I've used svn for many years now without a problem.
5. Re:One of the many, many reasons why IANAL by Zero__Kelvin · 2010-12-10 22:23 · Score: 1
  
  "I'm going to guess your (and Linus') reasoning is ..."
  This is covered in the video. If you can't be bothered to watch and learn, don't expect others to go out of the way to teach you.
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
6. Re:One of the many, many reasons why IANAL by CoderJoe · 2010-12-11 01:51 · Score: 1
  
  I did watch the video. All he really said about the matter is that with other SCMs, you don't always get back what you put in. The only other marginally related thing was at the end of the talk (40 minutes later!) about git using the SHA-1 hashes to verify the integrity. Ok, so git verifies the integrity, but that still does not demonstrate how other SCMs corrupt data. (I am not going to count filesystem corruption against the SCM, either. There are a few filesystems that regularly check the integrity of the data stored on them, which is one way to counteract this problem.)
7. Re:One of the many, many reasons why IANAL by Zero__Kelvin · 2010-12-11 02:44 · Score: 1
  
  I don't speak for Linus of course, but I think that if he meant that other SCMs corrupt data, he would have said that other SCMs corrupt data. He likely means that if someone else either foolishly or maliciously modifies the code in the repository directly by hand, your code will not be the same as what is in the repository for the same revision and you'll never know it.
  
  "There are a few filesystems that regularly check the integrity of the data stored on them, which is one way to counteract this problem.)"
  Great. Now all we have to do is make sure that every user of the repository is using those filesystems. How do you plan on doing that, especially on an Open Source project?
  
  Finally, how do you solve the problem with SVN where my code revisions change but all of my code is the same?
  
  Just face it, Subversion is badly broken and is stuck in an antiquated client/server paradigm and git is the present for those with a clue and the future for those who hope to someday get one.
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
8. Re:One of the many, many reasons why IANAL by gnapster · 2010-12-11 12:38 · Score: 1
  
  Aside from the apparent lack of rigor in demonstrating your point (which is perfectly understandable for an audience where apparently only 10 were familiar with distributed SCM), that is an interesting talk. Thanks for sharing it.
  For those in a hurry, the relevant remarks are at 10:55 and 56:15 in the video.
9. Re:One of the many, many reasons why IANAL by Dahan · 2010-12-11 13:51 · Score: 1
  
  (look, they say this themselves, they're going to throw it away for the next big release iteration in favor of WC-NG)
  No, they don't say that. WC-NG is a working copy format, whereas Berkeley DB and fsfs are used for the repository. Switching to a different working copy format does not change anything about the repository. fsfs is still the preferred repo format.
10. Re:One of the many, many reasons why IANAL by Zero__Kelvin · 2010-12-11 14:21 · Score: 1
  
  I'm glad someone got something out of it besides me! ;-)
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
11. Re:One of the many, many reasons why IANAL by gnapster · 2010-12-11 15:37 · Score: 1
  
  Yeah, on one hand, I can see why CoderJoe took issue with it. The only way it really supported your point was empirically: that kernel developers probably see a lot of these problems, and I suppose that Linus is as aware of these things as anyone. I am a novice SCM user, at best, and I can definitely say that if I had to ask svn to show me whether my repository had been corrupted or compromised, I would not know where to start. I can easily believe that svn provides little to no assurances of such things. On the other hand, here is a short discussion on a related problem that Google unearthed for me quickly, so I think it's not hard to find examples where svn falls short.
  I wonder how git catches these things; Linus talks about "knowing the hash" – I can't tell whether he means the user or the system itself; I would love to understand how git draws attention to unauthorized changes to a repository, if that is actually what is going on. That would be fascinating.
  All these issues aside, I have been trying to understand git over the last few months, although I have not actually used it with an active project. In that respect, this was a very timely exposure to the video, and why I am grateful to have come upon your comment. :c)
It will never work by Zero__Kelvin · 2010-12-10 15:44 · Score: 1

"An attacker could write a new patch and generate a collision for it. The attacker would then submit the good patch and get the maintainers to accept the patch and sign it with their GPG key. The attacker would then create a rogue mirror site and replace the good patch with the malicious collision."
If you use source code it will not compile. If you use a blob it will not run. Even if those things were not true, whatever you came up with would certainly not do what you wanted it to do.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
1. Re:It will never work by Omnifarious · 2010-12-10 15:55 · Score: 1
  
  That's an ignorant defense. There is a really nice example of someone creating two Postscript files that both generate perfectly intelligible pages and that have the same hash. Doing this and still hiding the exploit just requires sticking it in code that nobody will review. You have the exploit code and the non-exploit code in the same file and have a decision based on a bunch of random garbage that's different in the two files.
  
  --
  Need a Python, C++, Unix, Linux develop
2. Re:It will never work by pavon · 2010-12-10 16:22 · Score: 1
  
  That isn't an issue at all. Most collision generating attacks append filler data to the end of the desired file to get the desired hash. For source code you just insert the filler data in a comment at the end of the file. It is a trivial modification to the existing algorithms.
3. Re:It will never work by Zero__Kelvin · 2010-12-10 16:48 · Score: 1
  
  I believe that you need to have a git object that is the same size to replace your target object with, or git will sense the corruption. I don't have time to look more closely at the moment, but I trust that Linus did ;-)
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
4. Re:It will never work by pavon · 2010-12-10 19:07 · Score: 1
  
  Needing to match the length of the original object doesn't make the problem that much harder (unless the desired code is already close in length to the original code). In fact many collision attacks are designed for fixed size inputs, because it makes things easier.
  Requiring valid formatting of the file is not a hard problem compared to the much more fundamental problem of finding a practical preimage attack in general. If SHA-1 were broken (it isn't yet), then it would certainly be plausible to attack git in the manner that the AC described.
5. Re:It will never work by Anonymous Coward · 2010-12-10 21:15 · Score: 0
  
  Mod up. Converting as few as 160 tabs to spaces or vice versa is enough to (in theory) collide with any possible hash.
  Thus it's almost trivial(*) to create a pair of "birthday paradox" SHA1 hash collisions based on any 160+ line file.
  (* almost trivial = for an NSA supercomputer or a black hat bot net with a few hundred CPU-years to spare)
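  A minimal sketch of the tabs-versus-spaces idea, scaled down so it finishes instantly. It assumes a toy 20-bit hash (the first 20 bits of SHA-1) in place of the full 160 bits, and a made-up 32-line source file; `toy_hash`, `variant`, and `find_collision` are hypothetical names used only for illustration. Each line gets either a trailing space or a trailing tab, so every variant has the same length and the same visible text.

```python
import hashlib

BITS = 20  # toy digest size; real SHA-1 has 160 bits


def toy_hash(data: bytes) -> int:
    """First BITS bits of SHA-1, standing in for a much weaker hash."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big") >> (160 - BITS)


def variant(lines, n: int) -> bytes:
    """Encode the bits of n as trailing whitespace: bit set -> tab, else space."""
    out = []
    for i, line in enumerate(lines):
        out.append(line + ("\t" if (n >> i) & 1 else " "))
    return "\n".join(out).encode()


def find_collision(lines):
    """Birthday search: remember every hash seen until one repeats."""
    seen = {}
    for n in range(1 << len(lines)):
        data = variant(lines, n)
        h = toy_hash(data)
        if h in seen:
            return seen[h], data
        seen[h] = data
    raise RuntimeError("no collision found")


# Hypothetical 32-line file: more whitespace choices than hash bits.
LINES = [f"int x{i} = {i};" for i in range(32)]
a, b = find_collision(LINES)
print(len(a) == len(b), toy_hash(a) == toy_hash(b), a != b)
```

  With 20 bits a repeat turns up after roughly a thousand variants. The same search against all 160 bits needs on the order of 2^80 hashes, which is where the supercomputer or botnet comes in.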
6. Re:It will never work by dhasenan · 2010-12-10 21:33 · Score: 1
  
  This is still irrelevant to the original problem, since anyone who has access to modify the SCM's internal files is already trusted. They could easily enough submit a change in the ordinary fashion to accomplish their goals, even as a different user. You're worrying about a bank employee pulling off a high-tech heist rather than simply removing money from the till.
7. Re:It will never work by Rich0 · 2010-12-11 02:27 · Score: 1
  
  They don't need to modify any internal files. You just submit a file to a repository and get the maintainers to check it in. The file has to look innocuous to any inspection. Once they do that, you instead put a rogue version of that file with the same hash and size on your own mirror, and get somebody to use that mirror. You make sure the rogue file is signed by the official maintainers so that end-users trust your mirror. Since the hash matches you can just extract their signature and put it on your rogue file.
  Now, for a source file this isn't going to be very practical, since if I submit a file that is 100 lines of source followed by a 1000-line pile of random characters in a comment they're going to strip that out, defeating the attack. However, what about binary submissions?
  Wait, you ask, who accepts binary submissions into an SCM? Well, check out any of the android distros - the phone manufacturers don't distribute source for some of their drivers, which means lots of binary blobs floating around. I'm sure they get checked into SCMs. All I need to do is say that I patched a blob to make some fix, and chances are somebody will accept it after testing. It is very easy to sneak a bunch of binary garbage in a binary blob - as long as you make sure that it isn't in the code path. If you're REALLY clever you might be able to put it in the code path - just look at any polymorphic virus.
  Bottom line is that if your hash function allows pre-generated collisions, LOTS of bad things can happen. Coming up with clever reasons why they're not likely to happen won't save you when they actually do happen.
8. Re:It will never work by gnapster · 2010-12-11 10:21 · Score: 1
  
  Yay! Another good reason to use Python!
Code is NOT English prose (FTFY) by Zero__Kelvin · 2010-12-10 16:06 · Score: 1

Let me know when the replacement page says exactly what they want it to rather than merely something that appears intelligible, and using SHA-1 rather than MD5. Don't forget that changing a period to a semicolon in a page of text has little implication, but in source code it changes everything completely.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
1. Re:Code is NOT English prose (FTFY) by Omnifarious · 2010-12-10 16:52 · Score: 1
  
  I googled for 'md5 postscript attack' and found this: Hash Collisions (The Poisoned Message Attack). Using this technique they were able to make either document say exactly and precisely what they wanted it to say. The postscript merely makes a decision about what display code to run based on the contents of some garbage in the middle of the file that's different in each file. The rest of both files is the same.
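  The structure of that trick can be sketched in a few lines. This is only an illustration of the layout, not the attack itself: the document format, `make_document`, and `render` are invented here, and the two blocks `AAAA` and `BBBB` are plain placeholders. In the real attack they would be a pair of MD5 collision blocks, so that both files hash to the same value.

```python
def make_document(block: bytes) -> bytes:
    """Both documents share this template; only `block` differs."""
    reference = b"AAAA"  # placeholder for the first collision block
    return (
        b"HEADER|" + block + b"|"
        + b"IF_BLOCK_EQ:" + reference + b"|"
        + b"THEN:Pay Alice $10|ELSE:Pay Mallory $10000"
    )


def render(doc: bytes) -> str:
    """Tiny interpreter: show the THEN text if the embedded block matches."""
    _header, block, cond, texts = doc.split(b"|", 3)
    reference = cond.split(b":", 1)[1]
    then_text, else_text = texts.split(b"|")
    chosen = then_text if block == reference else else_text
    return chosen.split(b":", 1)[1].decode()


good = make_document(b"AAAA")
evil = make_document(b"BBBB")
print(render(good))
print(render(evil))
```

  Everything after the block is byte-for-byte identical in both files, which is what lets a collision in the block carry through to the end of the hash computation.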
  
  --
  Need a Python, C++, Unix, Linux develop
2. Re:Code is NOT English prose (FTFY) by fatphil · 2010-12-10 21:35 · Score: 1
  
  Changes everything completely? /* Apparently you've not heard of comments. */ /* Apparently you've not heard of comments; */
  
  --
  Also FatPhil on SoylentNews, id 863
3. Re:Code is NOT English prose (FTFY) by TheRaven64 · 2010-12-11 00:21 · Score: 1
  
  PostScript files are not English prose either. They are programs in a Turing-complete stack-based language that draw bezier curves on a screen. The output can be English prose, but the same is true of C programs.
  
  --
  I am TheRaven on Soylent News
4. Re:Code is NOT English prose (FTFY) by Zero__Kelvin · 2010-12-11 02:33 · Score: 1
  
  OK. That is correct, and I was mistaken about that fact. Of course, it has nothing to do with why the proposed attack is infeasible if not impossible, and it was a mistake on my part to bring it up.
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
5. Re:Code is NOT English prose (FTFY) by kasperd · 2010-12-11 12:52 · Score: 1
  
  That is just a practical application of the weakness that was demonstrated in 2004. The original demonstration of the weakness just got people saying that it would be easy to tell the difference between a legitimate file and a file crafted to have the same hash value. The article you mention seems to have been a reaction to this demonstrating how it can be applied. Such applications should have been obvious to everyone from the original result.
  
  A stronger attack was demonstrated later. In the original attack the only part of the inputs that differed would be chosen by the attack, so you would have no control over them. The stronger attack would take any two prefixes of identical length and add a few blocks to each to line up the intermediate state of the hash. That way it would produce a collision where you had a lot more control over the contents. And the attack could be repeated to produce many files with an identical hash. They used that method to predict the outcome of an election in a large North American country. They published the md5sum of a file with the name of the winner a long time before the election. They had produced a sequence of files with the names of each candidate (I think they were pdf files), all of which had the same md5sum.
  
  A similar attack was later used to construct a pair of SSL certificates with identical md5sums, and they got a CA to sign one of them which was a valid signature for the other. Even this was not enough to get browsers to stop accepting certificates signed with md5.
  
  --
  
  Do you care about the security of your wireless mouse?
6. Re:Code is NOT English prose (FTFY) by Omnifarious · 2010-12-11 23:46 · Score: 1
  
  That is just a practical application of the weakness that was demonstrated in 2004. The original demonstration of the weakness just got people saying that it would be easy to tell the difference between a legitimate file and a file crafted to have the same hash value. The article you mention seems to have been a reaction to this demonstrating how it can be applied. Such applications should have been obvious to everyone from the original result.
  I am somewhat aware of this history. Thank you for clarifying and mentioning the attack that's even better than this one.
  Strangely enough, a lot of people haven't gotten the message. A lot of people keep on claiming MD5 is perfectly fine in cases where this attack is possible. I've had to fight it so many times. I don't understand people.
  And really, I think it's only a matter of time (and not much time at that) before the same issue crops up for SHA-1 given the weaknesses that are currently known.
  Switching a hash algorithm out isn't that hard. And if you're writing new code, using a good one as opposed to a bad one is even easier. I don't understand why people resist and try to claim the old one is 'just fine' for whatever it is they're doing.
  
  --
  Need a Python, C++, Unix, Linux develop
7. Re:Code is NOT English prose (FTFY) by Anonymous Coward · 2010-12-14 15:48 · Score: 0
  
  But what if you pad out your source code with comments? /* or would this be excluded from the hash ? */
git objects don't live in a vacuum by Zero__Kelvin · 2010-12-10 17:09 · Score: 1

You are ignoring the fact that git doesn't blindly store the object and hash independently. It is a hierarchical tree of objects, each with a size element. If you plug your new object in I believe it will break the hashes of the other objects. For example a directory is an object with a hash that includes the size of the object. For this reason I am almost certain that your object must not only have the same hash, it has to be the same size as well.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
1. Re:git objects don't live in a vacuum by PseudonymousBraveguy · 2010-12-10 20:58 · Score: 1
  
  So you have a situation where an attacker may substitute a patch with a malicious patch. That may or may not invalidate other hashes, depending on several circumstances of the attack, which are basically speculation. You can now either simply change the hash function, eliminating the problem, or ignore the problem and hope nothing will go wrong. Which option is better from a security standpoint?
2. Re:git objects don't live in a vacuum by Zero__Kelvin · 2010-12-10 22:16 · Score: 1
  
  "So you have a situation where an attacker may substitute a patch with a malicious patch. "
  There is no such situation. I am simply trying to explain why that is true. Your argument is that we should change from technique A to technique B to fix the situation where technique A works already ;-)
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
3. Re:git objects don't live in a vacuum by Omnifarious · 2010-12-10 23:11 · Score: 1
  
  Unfortunately, the links to the postscript files in question are no longer valid. :-( If I recall correctly though, the blocks of garbage in the middle were exactly the same size in both files.
  
  --
  Need a Python, C++, Unix, Linux develop
4. Re:git objects don't live in a vacuum by PseudonymousBraveguy · 2010-12-10 23:14 · Score: 1
  
  Your argument is based upon the assumption that the attacker can not generate a malicious patch the same size as the original patch. That may or may not be true, depending on how the attack works. And in security questions, it's usually better to go with the more pessimistic assumption.
5. Re:git objects don't live in a vacuum by Skuto · 2010-12-10 23:50 · Score: 1
  
  Mod parent up and grandparent down. If you have a break and can generate hash collisions, making sure it's the same size tends to be the EASY part of the break.
6. Re:git objects don't live in a vacuum by Zero__Kelvin · 2010-12-11 02:31 · Score: 1
  
  Oh My F'ing God! You people are THICK! Making sure it is the same size and that it compiles and does what you want it to do is frigging impossible!
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
7. Re:git objects don't live in a vacuum by moonbender · 2010-12-11 04:16 · Score: 1
  
  No, you're thick. Breaking SHA-1 is the difficult part, the things you keep bringing up are trivial in comparison.
  
  --
  Switch back to Slashdot's D1 system.
8. Re:git objects don't live in a vacuum by Zero__Kelvin · 2010-12-11 04:54 · Score: 1
  
  Be careful who you're calling thick there Captain Molasses. The entire thread is about what could be done if you break SHA-1. I am saying that if and when you break SHA-1 you still face all these additional hurdles that cannot be overcome.
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
9. Re:git objects don't live in a vacuum by pavon · 2010-12-11 09:20 · Score: 2
  
  Here is a collection of real world implementations of collision attacks in which two legitimate executable binaries were created to have the same MD5 hash and size.
  Here is the PostScript collision attack that Omnifarious was referring to. Both files are the same length and have the same MD5 hash. Furthermore, PostScript is a Turing-complete programming language, with as picky a syntax as C.
  All the collision attacks I have seen used fixed length blocks in both files which are modified. Inserting a fixed length comment block into a piece of code is not hard.
  Preimage attacks (where you only modify one file not both) are harder, and to date, not even MD5 has any known practical preimage attacks. But if it did, it would be trivial to implement it by tweaking a block comment in a source code file, or a data segment in a binary file. There is no challenge there whatsoever.
  It took me less than a minute to find those on Google. I don't expect people to know everything, but if you are going to run around insulting people and being an asshole, you better know what the fuck you are talking about.
10. Re:git objects don't live in a vacuum by Zero__Kelvin · 2010-12-11 10:11 · Score: 1
  
  I know about all of these, and none of them come close to meeting the criteria. In case you didn't notice, you called me an asshole, which is an insult. Saying that you are THICK is not an insult, but rather a statement of fact, as you have just shown. You spent all this time trying to argue that it can be easily done, even when you yourself admit that: Preimage attacks (where you only modify one file not both) are harder, and to date, not even MD5 has any known practical preimage attacks.
  
  Still, you don't admit that it isn't going to happen. So yes, you have now graduated from THICK to exactly what you called me.
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
11. Re:git objects don't live in a vacuum by moonbender · 2010-12-11 11:12 · Score: 1
  
  No, this thread is about whether the attack outlined before is viable. It is, but you'd have to break SHA-1 to do it. You also have to do a couple of other things, but those are trivial compared to managing a preimage attack on SHA-1.
  
  --
  Switch back to Slashdot's D1 system.
12. Re:git objects don't live in a vacuum by larry+bagina · 2010-12-11 11:25 · Score: 1
  
  that's wrong. directories (trees) store the mode, type, hash, and name. They do not store the size. this is a good read of git's internals.
  
  --
  Do you even lift?
  These aren't the 'roids you're looking for.
13. Re:git objects don't live in a vacuum by Anonymous Coward · 2010-12-11 13:00 · Score: 0
  
  No.. dumbass. That's the point.. It's not that difficult. For certain types of collision breaks, it is all very easy you fucking toolbag moron.
  If the vulnerability is one such that I can modify a subset of a given plaintext, then you're screwed.
  See, fuckwit. In C, and most other languages, there are these things called comments. With few exceptions, I can put arbitrary data in there and it will still compile and operate the same on most C compilers. So if I have some C file, with a block of comments, I can possibly alter that area of comments. Usually the size part (as has been mentioned), is the easy bit, and that will of course depend on the nature of whatever code I added or removed.
  You are simply too thick to use your goddamn brain.
  I thought the discussion here was replacing hash functions with certain types of *vulnerabilities* with ones without them. No one is saying if the hash function were ideal that this would be a problem. MD5 length is still pretty good from a statistical point of view (2^128)... that's not the reason it (MD5) is problematic.
14. Re:git objects don't live in a vacuum by kasperd · 2010-12-11 13:17 · Score: 1
  
  If you plug your new object in I believe it will break the hashes of the other objects. For example a directory is an object with a hash that includes the size of the object.
  All the collision attacks demonstrated against md5 would produce collisions where the inputs had identical size. Due to the way the padding to make the input an integral number of blocks works, none of the attacks can be used to construct collisions with different size inputs.
  
  Since sha1 uses the exact same padding as md5, it seems quite likely that any attack against sha1 will produce collisions of identical size. So, if you do replace an object with another one with the same hash, it will also be of the same size.
  
  Assuming a chosen prefix attack against sha1 (similar to what has been demonstrated against md5), you can choose two prefixes of identical length, then have the attack generate a few blocks of "random" data to line up the hashes, and then finally choose a suffix for both files.
  
  Producing something that compiles and does what you want it to is easy. The random garbage in the middle can be put inside a comment. The hard part is to get a file with a block of random garbage in the middle accepted. For a C source file it is easy to recognize. Maybe the attack could be performed in a way that it only has to tweak the low order bits of some bytes, then you could produce a hex encoding of some binary data that you may be able to argue serves some purpose. Notice that for the malicious version, the prefix could end by starting a comment, such that the attack part and anything after that until the first comment terminator is ignored.
  
  --
  
  Do you care about the security of your wireless mouse?
15. Re:git objects don't live in a vacuum by Zero__Kelvin · 2010-12-11 13:55 · Score: 1
  
  Never mind. Linus Torvalds, who invented git and knows more about the Linux kernel development model than anyone else on the planet is wrong, and you are right.
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
16. Re:git objects don't live in a vacuum by Zero__Kelvin · 2010-12-11 14:07 · Score: 1
  
  tree.h, which you can check out yourself from the git repository at git://git.kernel.org/pub/scm/git/git.git is a better read of git internals ;-)
  
  struct tree {
      struct object object;
      void *buffer;
      unsigned long size;
  };
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
17. Re:git objects don't live in a vacuum by Zero__Kelvin · 2010-12-11 15:51 · Score: 1
  
  "All the collision attacks demonstrated against md5 would produce collisions where the inputs had identical size. Due to the way the padding to make the input an integral number of blocks works, none of the attacks can be used to construct collisions with different size inputs. "
  Maybe I'm wrong, but as far as I know the padding is merely to make sure the input is an even multiple of the number of bits in the hash. I admit that I'm not a cryptographer, but I'm fairly certain that there is no requirement for the objects to be identical in size in order for a collision to occur.
  
  Also, it would be easy to assume that if the hash collides then all higher-level hashes (e.g. the tree hash of a branch which includes the file) would be identical, but I am actually leaning in the direction that they won't be for SHA-1 while they will be for MD5. It is this "nested hash" feature that people seem to be missing. In other words I believe you need to collide with the file hash, the tree hash, and the commit hash simultaneously, and because it is not a CRC type hash like MD5 it is impossible.
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
18. Re:git objects don't live in a vacuum by Zero__Kelvin · 2010-12-11 17:04 · Score: 1
  
  Also, in addition to the actual git source code excerpt I already posted, there is this gem from The git Community Book section on the Object Model:
  
  Objects
  Every object consists of three things - a type, a size and content. The size is simply the size of the contents, the contents depend on what type of object it is, and there are four different types of objects: "blob", "tree", "commit", and "tag".
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
19. Re:git objects don't live in a vacuum by kasperd · 2010-12-11 22:44 · Score: 1
  
  Maybe I'm wrong, but as far as I know the padding is merely to make sure the input is an even multiple of the number of bits in the hash.
  That is the main purpose of the padding, but the padding ends with a 64 bit field containing the length of the full message. The padding between the message and the length field is a one bit followed by all zeros. This way you can always find out how long the padding was by looking for the last one bit. So it is guaranteed that two different messages will always be different after padding as well, and there are two different ways to reverse the padding operation.
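  A sketch of that padding rule, under the assumption that the message is a whole number of bytes (the specs work at bit level). `md_pad` is a hypothetical helper name; SHA-1 stores the length field big-endian and MD5 little-endian, which is the only difference between the two here.

```python
import struct


def md_pad(message: bytes, big_endian_length: bool = True) -> bytes:
    """Pad to a multiple of 64 bytes: a 1 bit, zeros, then the 64-bit bit length."""
    bit_length = (len(message) * 8) % (1 << 64)
    padded = message + b"\x80"  # a single 1 bit followed by seven 0 bits
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)  # zeros up to byte 56 of a block
    fmt = ">Q" if big_endian_length else "<Q"  # SHA-1: big-endian, MD5: little-endian
    return padded + struct.pack(fmt, bit_length)


for n in (0, 3, 55, 56, 64):
    print(n, "->", len(md_pad(b"a" * n)))
```

  A 55-byte message still fits in one 64-byte block, while a 56-byte message spills into a second block because the 1 bit and the length field no longer fit.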
  
  I admit that I'm not a cryptographer, but I'm fairly certain that there is no requirement for the objects to be identical in size in order for a collision to occur.
  Correct, but all the methods faster than brute force produce collisions of identical length. The best attack on MD5 produces a collision by taking two chosen prefixes and then adding some blocks to bring the hash state to the same for the two strings. Once the hash state is identical for both you can add the same suffix to both and have the same resulting hash. The first part of this attack doesn't depend on the message length. You could start with two prefixes of different length and use the attack to get the same state of the hash except for the length counter. Once the final padding was added they would most likely end up with different hash values.
  
  It is not proven that there exist collisions of messages of different size. However, it is highly likely that they do. If you were to perform a brute force attack, it would not be much harder to find a collision of two messages of different length. You could take 2^64 messages of one length and 2^65 messages of a different length, and you have a good chance that one message of one length collides with a message of the other length.
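  The "good chance" can be put in numbers with a back-of-the-envelope estimate. This assumes an ideal 128-bit hash (MD5's output size) whose outputs behave like independent uniform values, and uses the usual Poisson approximation; `cross_collision_probability` is a hypothetical helper.

```python
import math


def cross_collision_probability(log2_n1: int, log2_n2: int, hash_bits: int):
    """Expected number of cross pairs that collide, and P(at least one)."""
    # n1 * n2 pairs, each colliding with probability 2**-hash_bits
    expected = 2.0 ** (log2_n1 + log2_n2 - hash_bits)
    return expected, 1.0 - math.exp(-expected)


expected, p = cross_collision_probability(64, 65, 128)
print(expected, round(p, 4))
```

  With 2^64 messages on one side and 2^65 on the other there are 2^129 pairs, so about two colliding pairs are expected and the chance of at least one is roughly 86%.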
  
  But why would you want to use a slower method of generating a collision of messages of different length, when that is also less useful since the layer above the hash might check length and hash value?
  
  Also, it would be easy to assume that if the hash collides that the all higher level hashes (e.g. Tree hash of a branch which includes the file) would be identical
  The next level up the tree doesn't hash the file contents, it hashes the hash of the file contents. And since both of those hashes are identical, the next level will produce identical hashes because it is hashing identical inputs.
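  This propagation is easy to see with a toy model. The sketch below assumes a deliberately weak 16-bit hash (the first two bytes of SHA-1) so that a collision can be found by brute force; `toy_hash`, `tree_hash`, and the file names are made up for illustration and do not reproduce git's real object format.

```python
import hashlib


def toy_hash(data: bytes) -> bytes:
    """First 2 bytes of SHA-1: a deliberately weak stand-in hash."""
    return hashlib.sha1(data).digest()[:2]


def tree_hash(entries) -> bytes:
    """Hash a listing of (name, child_hash) pairs, never the file contents."""
    listing = b"".join(name.encode() + b"\0" + h for name, h in sorted(entries))
    return toy_hash(b"tree" + listing)


def find_collision():
    """Brute-force two different same-length files with the same toy hash."""
    seen = {}
    i = 0
    while True:
        data = b"file version %08d" % i
        h = toy_hash(data)
        if h in seen:
            return seen[h], data
        seen[h] = data
        i += 1


good, evil = find_collision()
readme = toy_hash(b"docs")
root_good = tree_hash([("README", readme), ("patch.c", toy_hash(good))])
root_evil = tree_hash([("README", readme), ("patch.c", toy_hash(evil))])
print(good != evil, root_good == root_evil)
```

  Because the parent only ever sees the child's hash, swapping `good` for `evil` leaves the root unchanged, and the same holds for every level above it.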
  
  but I am actually leaning in the direction that it won't be for SHA-1 while it will be for MD5.
  First of all the tree structure is not part of the hash definition itself, it is just a way to use the hash function. Second, SHA1 and MD5 are very similar. The round function is different, and SHA1 has a state of 5 32 bit words where MD5 only has 4 32 bit words. Everything else is identical. And this similarity is actually not a problem since as long as the round function is collision resistant, then the full hash will be as well. You cannot produce a collision for the full hash without doing it at least once for the round function. However this similarity does mean that most arguments about behaviour of the hash will be exactly the same for both hashes.
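  Those sizes can be read straight off Python's `hashlib`, as a quick sanity check. This assumes both algorithms are available in the local build (some FIPS-restricted builds disable MD5); `shape` is a hypothetical helper.

```python
import hashlib


def shape(name: str):
    """Return (number of 32-bit state words, block size in bytes)."""
    h = hashlib.new(name)
    return h.digest_size * 8 // 32, h.block_size


for name in ("md5", "sha1"):
    words, block = shape(name)
    print(name, "state words:", words, "block bytes:", block)
```

  Both functions consume 64-byte blocks; the difference is four 32-bit state words for MD5 against five for SHA1.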
  
  It is this "nested hash" feature that people seem to be missing.
  I'm not familiar with that term. A quick search for it found me information about some perl library, which did not seem to have anything to do with cryptographic hashes. I am assuming the term you wanted to use was a hash tree. In a hash tree you just have to produce a collision in one node of the tree, and it will propagate up all the way to the root as a collision.
  
  and because it is not a CRC type hash like MD5 it is impossible.
  MD5 is not a CRC. MD5 and SHA1 are both very different from a CRC; MD5 and SHA1 are, however, very similar to each other.
  
  --
  
  Do you care about the security of your wireless mouse?
20. Re:git objects don't live in a vacuum by larry+bagina · 2010-12-13 10:27 · Score: 1
  
  here's an actual tree entry from an actual git repository:
  100644 blob ce013625030ba8dba906f756967f9e9ca394464a hello.txt
  mode, type, hash, name. No file size. The file size is stored in the ce013625030ba8dba906f756967f9e9ca394464a file, but not in the tree.
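  For anyone who wants to check this without reading the git source, here is a rough reconstruction of how loose object ids are computed, based on the documented object format. `git_object_id` and `tree_body` are hypothetical helper names, and real git also sorts tree entries, which is skipped here because there is only one.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over '<type> <size>\\0' followed by the object body."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries) -> bytes:
    """Each entry is '<mode> <name>\\0' plus the child's 20 raw hash bytes."""
    return b"".join(
        f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
        for mode, name, hex_id in entries
    )


blob_id = git_object_id("blob", b"hello world\n")
tree_id = git_object_id("tree", tree_body([("100644", "hello.txt", blob_id)]))
print(blob_id)
print(tree_id)
```

  The size does appear in the header of every object, so it is covered by that object's own hash. The tree entry for a file, though, carries only the mode, the name, and the child's hash.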
  
  --
  Do you even lift?
  These aren't the 'roids you're looking for.
21. Re:git objects don't live in a vacuum by Zero__Kelvin · 2010-12-13 11:23 · Score: 1
  
  ... and presumably the tree name is hello.txt ;-)
  
  What you have shown is not the contents of the object, but rather the output of a git command. In the book you reference, every look at the "internals" is done by using git itself to interpret the information. The source code shows the actual internal representation unprocessed by any git commands.
  
  For example, the output you have posted is the contents of the tree as interpreted by git, but you missed an important option. Try git ls-tree -l
  
  The size of every object in git is stored internally. Nothing is left to chance.
  
  I have shown the actual source code and the official git documentation that explains the internal representation. I hope this helps you to understand exactly what is going on, but you will need to focus on learning what is really happening internally rather than focusing on ways to prove you were right and I am wrong first.
  
  --
  Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Skein break by Bernstein by Skuto · 2010-12-10 21:12 · Score: 3, Informative

UNOFFICIAL COMMENT: Cryptanalysis of Skein
http://cr.yp.to/hash/skein-20101206.pdf
1. Re:Skein break by Bernstein by Anonymous Coward · 2010-12-10 22:35 · Score: 4, Informative
  
  ...that's a joke. :)
  CubeHash was eliminated due to poor short-message MAC performance. A parameter tweak could have fixed that, but was too late for the selection to change their mind, and would have had security implications.
  Still, it was a fascinating design that degraded extremely well to reduced versions for some fascinating simplified cryptanalysis. I think we all learned a lot from it.
  Skein's ThreeFish needed a rotational constant tweak. That's done, and it's through.
  Now that the suspiciously fast has gone, I suspect they'll choose the fastest which survives with no severe attacks. My money's on Skein (which, if it survives the competition as a finalist with no severe attacks, I will use anyway because it has a native tree-hash mode which is extremely useful to me) or at a push Keccak (which can derive advantage from AES round function hardware acceleration, at the cost of using the AES round function, which is a bit like putting your eggs in one possibly dodgy basket).
2. Re:Skein break by Bernstein by Skuto · 2010-12-10 23:48 · Score: 1
  
  >...that's a joke. :)
  No shit!
  >push Keccak (which can derive advantage from AES round function hardware acceleration, at the cost of using the AES round
  >function, which is a bit like putting your eggs in one possibly dodgy basket).
  Keccak has no relation to AES except for one of the authors, and is a new design. I think you are confusing it with another hash.
  It has strong advantages over Skein in hardware and embedded platforms. In fact, I think Skein is weak enough there compared to the others that it won't make it.
  As for tree-mode hashing: all candidates support it, even if not in the algorithm definition itself.
3. Re:Skein break by Bernstein by Anonymous Coward · 2010-12-11 01:47 · Score: 0
  
  he's just telling a yarn. but there's lots of pretty pictures..
4. Re:Skein break by Bernstein by kasperd · 2010-12-11 12:29 · Score: 1
  
  at the cost of using the AES round function, which is a bit like putting your eggs in one possibly dodgy basket).
  If you have a system that uses both a block cipher and a hash function, it is probably because it needs both, and a weakness in either would result in a weakness in the system as a whole. So, it isn't really much of a problem that they are similar.
  
  It is possible to combine two block ciphers to produce a system that is secure even if one of the ciphers is broken. If they use the same block size, then you can just apply one cipher after the other, using two independent keys. It doesn't make brute force harder, but it does mean you don't have to worry if one of the two is broken. Of course this reduces performance to half.
  
  It is surprisingly difficult to combine two hash functions in a similar way. The most obvious ways to combine two hash functions won't be secure in the typical security models. Try to design a way to combine two hash functions, where you assume one of them is secure in the random oracle model, and the other is designed by an adversary to be as insecure as possible. The aim then is to make a hash function out of those two, which will be secure in the random oracle model without knowing which of the two hash functions is secure.
  
  Keccak has no relation to AES except for one of the authors, and is a new design. I think you are confusing it with another hash.
  You may be thinking of Grøstl. It uses some constructions from AES. They have been modified a bit, but there is still a lot of similarity, and they can reuse some of the hardware designed for accelerating AES.
  
  --
  
  Do you care about the security of your wireless mouse?
Read slashdot much? by Bill,+Shooter+of+Bul · 2010-12-11 06:52 · Score: 1

Breaking sha1 with amazon & Cuda

--
Well.. maybe. Or Maybe not. But Definitely not sort of.
Thanks for the info (It'll still never work) by Zero__Kelvin · 2010-12-12 00:08 · Score: 1

OK. It sounds like I have a lot to learn on this topic, and I misunderstood a number of things about what git was doing. The fact that git doesn't hash the whole object, but rather the hash of the object makes perfect sense now that you say it, since it would obviously be much more costly to do it the way I was thinking of it. (BTW: I used quotes around "nested hash" because I wouldn't expect it to be an actual term. It is the (costly) idea of hashing an object, and then hashing a collection of objects and their hashes in a hierarchy to which I was referring.)

Of course, none of this changes my original point, which is that this purely theoretical attack will never work in application no matter how feasible it is in theory.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun