A Competition To Replace SHA-1
SHA who? writes "In light of recent attacks on SHA-1, NIST is preparing for a competition to augment and revise the current Secure Hash Standard. The public competition will be run much like the development process for the Advance Encryption Standard, and is expected to take 3 years. As a first step, NIST is publishing draft minimum acceptability requirements, submission requirements, and evaluation criteria for candidate algorithms, and requests public comment by April 27, 2007. NIST has ordered Federal agencies to stop using SHA-1 and instead to use the SHA-2 family of hash functions."
The draft can be found (in PDF) here.
Er Galvão Abbott - IT Consultant and Developer
Schneier proposed such a competition in March 2005: http://www.schneier.com/crypto-gram-0503.html#1
Please correct me if I got my facts wrong.
``Maybe secure hashing needs to store a mixture of the low level and the high level details but in a context specific way - the face picture example should also store the detailed iris pattern as well as an overall face picture, both should match to allow this person through. It might be easy to find someone who looks like me, but the specific portion cannot be modified without surgery.''
The idea is that, in a good hash function, each input bit affects all the output bits more or less equally. This is especially true of cryptographic hashes, and for a good reason. The stronger the correlations between input and output, the weaker the hash function.
Please correct me if I got my facts wrong.
You clearly don't know what a crytographic hash is about. And this is not what is ment by collisions resitant. What it means is that there is minum amount of work needed to produce a collision.
There are a number of different type of collisions as well. Lets assume we have a 256-bit hash. There is the kind of colision where you just find *any* 2 strings that produce the same hash, which should require on avarage 2**128 "operations". A harder task is given a string and its hash find another string with the same hash. For a secure hash 256-bit hash function this will require on avarage 2**256 "operations".
There are other properties that are important as well. Its a well established idea. Hashes are very very usefull and are used for a lot more that file verification and we know what properties they need. We are just not very good at producing very good hashes yet.
If information wants to be free, why does my internet connection cost so much?
Most modern protocols and standards are designed to be agile. Basically, this means that they don't mandate any one particular algorithm, but rather are designed such that alternatives can be used. Otherwise, many specs would be woefully out-of-date every few years as computing power and cryptographic algorithms advance. The 3 examples you give above are all considered "agile", read the specs and note that they use algorithm identifiers and allow for a wide variety of different algorithms to be used, none of the above are strictly bound to use SHA-1 or MD5.
I always wonder about what would happen if we used multiple hash functions together. E.g. you provide an SHA-1 hash, an MD5 hash, and an RMD-160 hash, all for the same message. Would that be harder to fool (i.e. make the system think you got the original, but it's actually a forgery) than one hash function that generated as many bits? What about weaknesses in the individual hash functions; would you be worse off because a flaw in any one of your hash functions affects you, or better off, because you have more hash functions that need to be fooled?
By the way, IIRC, OpenBSD and NetBSD include multiple hashes per archive in their ports trees, but use only one for verification.
Please correct me if I got my facts wrong.
I think that you are missing the point of a hash.
A hash is a signature of the file, its designed to give a good confidence that a given file that you have been supplied matches the one that you think has been supplied.
The theory being that being able to create a file that is of the same length as the orignal, is not corrupt (eg a zip file still unzips, an executable still runs, a pdf still displays) and is different from the original but still hash should be infeasable (not impossible, most cryptography doesn't look for impossible, not practical within a given time frame is sufficient for most needs)
Another use of hashes is on data storage systems, especailly with backup systems, where two files with the same hash and length are treated as the same file (so no need to write it to tape twice) this way you only have to sort the list of hashs and look for matches, rathering than having to diff every file against every other one.
Personally I think I'd rather binary diff matches hashes just to be safe - but thats time intensive. The chances of two files each having the same size and SHA-256 hash and being different is less than the chance of your sotrage device being destoryed (meteroite, fire, flood, plane) before you are able to back up either file
$_="Slashdotter";$syn="OTT";s;..;;;sub _{print shift||$_};s!ash!Perl !;s=$syn=ack=i;tr+LLEd+BLAH+;_"Just Another ";_
Its not like everyone is starting from a blank slate on the first day of the contest. It's basically a call for the math geeks who design this stuff to polish up whatever they are working on.
Conformity is the jailer of freedom and enemy of growth. -JFK
2000 times quicker than brute force (where brute force is average time 2^159 attempts) means the algorithm is not as secure as it used to be thought.
This has demonstrated a cryptographic weakness, there could quite well be more, look at the research over the years on weakening md5, therefore moving to different algorithm would be advisable.
Its doesn't mean that you are going to be able to find a collision in non trivial time, but it did lower the bar. Lowering it enough that people wanting high grade protection should switch to a more secure algorithm.
Context specific data has no place in a hash, it would only weaken it.
$_="Slashdotter";$syn="OTT";s;..;;;sub _{print shift||$_};s!ash!Perl !;s=$syn=ack=i;tr+LLEd+BLAH+;_"Just Another ";_
No, SHA-224 is truncated SHA-256 and SHA-384 is a truncated SHA-512.
SHA-256 and SHA-512 are different hash functions (same basic design though). On 32-bit boxes SHA-256 is faster, and on 64-bit boxes SHA-512 is faster.
There is no point in 224 or 384, but they're there just for completeness (e.g. to comply with some specs that don't allow the arbitrary truncatage of a hash).
Tom
Someday, I'll have a real sig.
Sure there must be collisions, but that's not the point.
The point is that you can verify that data is correct with a good amount of confidence, from a relatively small hash code. So I can download a lot of data through, say, bittorrent, and despite the fact that I don't necessarily trust the people I actually download from, I can verify that the hash is right and therefore I am confident that the data I receive is what the original seeder put out: no-one's decided to play games and (say) sneak their CC number grabber into the data.
So what you want is an algorithm which is reasonably easy to run, which SHA-1 is, but where it is not easy to find a collision. For example, if my hash code was simply to give the total byte sum modulo 1000, then while it would almost certainly catch accidental errors in data, it would be very easy for an attacker to stick in his CC number grabber to your data then fiddle the byte sum back to where it should be.
Your idea pretty clearly shows you have no idea of what hashes are used for: there is no point preserving the data structure, it takes a lot of extra space and gives virtually no security. For example, SHA-1 produces a 20 byte hash. I can put something that size up on my personal website without getting huge bandwidth charges even if millions of people want to download it - and then I can distribute my 1GB zipfile by way of people I don't necessarily trust (but who have more bandwidth than I) and still the eventual recipients can be confident that what they receive is what I sent out. If I include the virtual FAT table of this zipfile, my hash size goes up by about 500,000 percent (literally), and so do my bandwidth charges. And I get virtually no extra security, because all that an attacker has to do above finding an SHA-1 collision is ensure that the change doesn't affect the FAT table: i.e. he replaces some suitable virtual file of mine with one of his, keeps the name and size the same and he's done.
Let's start with the facts: SHA1 is cryptographically "broken" in the sense there's a "better than brute force" attack which takes about 2^63 operations instead of 2^80 of finding a colliding pair of two random strings.
It's not a practical attack because 2^63 is still a huge number.
It's not a "find a collision to a known string" attack which would be stage 2.
It's not a "find a collision to a known string by appending to a fixed string" attack which would be stage 3.
It is a sratch in the armor which creates doubt if there are more powerful attacks, nothing more.
There are strong alternatives like SHA-512 and Whirlpool (AES-based) which it is possible to use today, if you're paranoid more is better. Is it urgent? Not really, even a practical stage 1 and 2 attack would just be "stuff breaks, files corrupt, migrate away". The only one with really nasty consequences is stage three with code injection attacks in software and such.
Live today, because you never know what tomorrow brings
>This is what a hash is by design: obscurity.
"Security through obscurity" means trying to depend on indefensible secrets. The classic example from 19th century crypto theory is that it's stupid to try to keep your crypto algorithm secret, so you should keep keys secret instead.
Security through obscurity leads to worldwide breaks when it fails.
The existing secure hashes have nothing obscure about them. The algorithms are published and open for review. The fact that they're vulnerable to brute force is not being hidden and is the same problem that all the workhorse encryption algorithms have.
"Security through obscurity" would be trying to hide the fact that there's a work factor reduction attack and hoping that nobody rediscovered it.
Again you are wrong (and somewhat right about the incorrect title at the same time, iI suppose). The point of this workshop is to revise and amend FIPS 180-2. Now, while the SHA-2 line of hashes are laid out in FIPS 180-2, it is not the case that SHA-2 and the like will be thrown out. They meet the requirements laid out in the call, and frankly NIST would be insane to not make it one of the workshop's submissions. It may very well fall out that the SHA-2 is just fine and indeed the best candidate submission.
As for the Chinese attacks, they haven't shown any real applicability to SHA-2 as of yet.
-30-
I would have thought that brute force no longer implied 2^X attempts as the 'standard' for brute force. Wikipedia gives a lovely equation (hope it's accurate!,) such that if we want a 50% chance of finding a collision, we let H be the number of possible values in the hash, and let n(P) be a function representing the number of values that must be tested for a desired probability of collision P. n(P) = 1.1174sqrt(H). Applying that to a 160-bit value gives us 'only' 1,350,853,710,837,386,639,816,681 values. That's a phenomenal result (1.35x10^24) because it is drastically lower than 2^160 (1.46x10^48.) And the neat thing about the birthday attack is that as we increase P, n'(P) decreases. (dn/dP)