NIST Announces Round 1 Candidates For SHA-3 Competition
jd writes "NIST has announced the round 1 candidates for the Cryptographic Hash Algorithm Challenge. Of the 64 who submitted entries, 51 were accepted. Of those, in mere days, one has been definitely broken, and three others are believed to have been. At this rate, it won't take the couple of years NIST was reckoning to whittle down the field to just one or two. (In comparison, the European Union version, NESSIE, received just one cryptographic hash function for its contest. One has to wonder if NIST and the crypto experts are so concerned about being overwhelmed with work for this current contest, why they all but ignored the European effort. A self-inflicted wound might hurt, but it's still self-inflicted.) Popular wisdom has it that no product will have any support for any of these algorithms for years — if ever. Of course, popular wisdom is ignoring all Open Source projects that support cryptography (including the Linux kernel) which could add support for any of these tomorrow. Does it really matter if the algorithm is found to be flawed later on, if most of these packages support algorithms known to be flawed today? Wouldn't it just be geekier to have passwords in Blue Midnight Wish or SANDstorm rather than boring old MD5, even if it makes no practical difference whatsoever?"
What is the point if they only got one submission for the Hash contest? Doesn't that make it the automatic winner?
Surely you want to have more than one choice to pick from.
And yes it will take years to pick the winner. Duh. You don't want to just throw something out there that will get broken immediately.
Actually, it's probably much better to have MD5 which is known broken in understood ways, than Jo3#a$# which is broken but we don't know how, where and why. There are fairly simple rules for MD5 (start phasing out now; only use in situations where you in some way control the input, not your adversary) which make it possible to use in a relatively safe way. If you don't know what way the hash is broken you don't know how to avoid those problems. Having said that, SHA256 should probably be considered the minimum for a temporarily secure system with a lifetime limited until something better has been available and tested. As Mr Schneier says "attacks only get better; they never get worse".
It's also not a surprise that some hashes got broken. There are many entries and they come from all types of cryptographer from teenager to aged expert; from unknown to known mostly by initials (e.g. A, S or R). There was not much hope that all of them would be of good quality.
=~ s,(.*),<sarcasm>$1</sarcasm>,g if any_point_you_wish();
s/geekier/stupid and irresponsible
Let me guess, the submitter likes to enable all the useless bling effects on Compiz but never gets any work done, and has racing stripes on his Civic....
I went to Carnegie Mellon and took classes from a bunch of professors who were all freakin' geniuses and here is the second most important lesson I learned about cryptography: NEVER DO IT YOURSELF. And a corollary to that is never use a cryptographic system someone else cooked up until it has been through the vigorous peer review that these hash functions will go through. This was an important lesson to a bunch of egotistical CMU students, and I hope the ones who were actually smart listened. (The first most important lesson is an old one: if you think cryptography is the solution to all your security problems, you don't understand cryptography or your security problems).
"Whaa! But the ciphers we have now are already broken!!" The current hash functions that are "broken" like SHA-1 are not trivially broken, but broken in a sense that in some scenarios might make it somewhat easier to conduct either a pre-image attack (useful if you know somebody's password hash and want to make a password that will hit the same hash) or a collision attack (useful in some cases where you are trying to forge a message to match a digital signature.... but if the fake message has to contain lots of garbage bytes even a successful collision might not pass the smell test). "Somewhat Easier" does not mean you can do it on your iPhone, it just means that it might take a supercomputer 100 years instead of the heat death of the universe to do it. This is still very important, but it is a world apart from an algorithm that has never been tested... those could be blown wide open and cracked almost instantly with trivial computing power. To use a bad car analogy, just because a seat belt won't save your life in every car accident doesn't mean it's just as safe to strap plastic explosives to your gas tank and hook them up to a mercury switch detonator.
As for "open source" making these cryptographic models available quickly, I wasn't aware that text editors froze up and stopped you from writing code if it wasn't going to be open source. The reason commercial vendors won't jump on a new cryptographic protocol before they are validated is that their customers would (rightly) go ballistic and their credibility would be smashed. Fortunately for all of us the leaders of the open source community have a little more sense than you and you won't see any of these hashes in the Linux kernel or OpenSSL until they are at least in the final rounds of competition and there is some evidence that they have value. OSS has the advantage that its software implementation can be publicly validated and peer reviewed, but having your code opened up to the world is actually much MORE dangerous if you are just screwing around because you think a hash function has a badass sounding name. I'm glad Torvalds is in charge of Linux and not "jd".
AntiFA: An abbreviation for Anti First Amendment.
Because rainbow tables are useless if the hash is salted
Wikipedia:
"The ideal hash function has four main properties: it is easy to compute the hash for any given data, it is extremely difficult to construct a text that has a given hash, it is extremely difficult to modify a given text without changing its hash, and it is extremely unlikely that two different messages will have the same hash. These requirements call for the use of advanced cryptography techniques, hence the name."
The whole point of the exercise is to find an algorithm that can't be easily reversed and that's far from impossible.
Besides, hashes are never completely broken, at most you can make various collision attacks, you never get away with putting in arbitrary data.
I hate to state the obvious, but a hash by nature is breakable. You are (typically) distilling a large number of unique bits down to a smaller number of bits.
Of course there will be more than one set of inputs that generate the same output.
It's more an issue of:
1. How hard it is to find colliding inputs.
2. What the hash is used for.
Passwords typically generate more bits, so different rules apply.
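To make the pigeonhole point concrete, here is a minimal sketch that shrinks a hash down to 16 bits so a collision turns up almost immediately. Truncating SHA-256 to two bytes is an illustrative assumption, not something any real system does; the names `truncated_hash` and `find_collision` are made up for this example.

```python
import hashlib

def truncated_hash(data: bytes, nbytes: int = 2) -> bytes:
    """Keep only the first nbytes of SHA-256: a deliberately tiny hash."""
    return hashlib.sha256(data).digest()[:nbytes]

def find_collision(nbytes: int = 2):
    """Hash the strings "0", "1", "2", ... until two inputs share an output.

    With a 16-bit output there are only 65536 possible values, so a
    collision is guaranteed after at most 65537 distinct inputs.
    """
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode()
        digest = truncated_hash(msg, nbytes)
        if digest in seen:
            return seen[digest], msg, digest
        seen[digest] = msg
        counter += 1

first, second, digest = find_collision()
print(first, second, digest.hex())
```

With a full-width output the same search still terminates in principle; it just takes so long that point 1 in the list above (how hard it is to find colliding inputs) becomes the whole story.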
MD6 (similarity in name to MD5 is entirely intentional) looks very interesting:
While raw speed isn't great (the default single-threaded 32-bit md5sum in Linux can do 325 MB/s on a 2.4 GHz CPU) maybe its multi-core friendly design is the right way to do it right now. The original MD5 will probably not entirely disappear because of its speed.
(OTOH if you're hashing SSL web traffic it's probably worse to have your hash bog down other CPUs that are busy with their own jobs)
-- Sig down
No hash, even the very worst, is reversible. The reason for this is that an infinite number of input strings will produce the same, finite, output string. See http://stackoverflow.com/questions/330207/how-come-md5-hash-values-are-not-reversible for more information.
Disconnect and self-destruct, one bullet at a time.
In answer to - "have passwords in Blue Midnight Wish or SANDstorm rather than boring old MD5, even if it makes no practical difference whatsoever?"
I'm going into the "no practical difference whatsoever" camp. In fact you're taking a huge risk if any of them are broken and you gain nothing that simply salting your hashes doesn't already give you.
We know that MD5 is secure to a degree. Just salt that bad boy up so rainbow tables no longer have any impact.
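A rough sketch of what "salting" means in practice, assuming a random 16-byte salt stored next to each hash. SHA-256 is swapped in for MD5 here, and a single fast hash is used only to keep the example short; `hash_password` and `verify_password` are hypothetical helper names.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is drawn if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the salted hash and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

# The same password hashed twice gives two unrelated digests,
# so a table precomputed for unsalted hashes does not apply.
salt_a, digest_a = hash_password("hunter2")
salt_b, digest_b = hash_password("hunter2")
print(digest_a.hex() != digest_b.hex())
```

The salt is not secret; it only has to be different per password so that one precomputed table cannot be reused across accounts.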
I spent a few hours the other day looking over all of the submissions; Keccak and Skein are my favorite contributions. My criterion was "does the hash generate a fixed-length output, or is the hash capable of also being used as a stream cipher".
There are only four unbroken contributions that can generate arbitrarily long streams of numbers: Keccak, LUX, MeshHash, and Skein. Of these contributions, LUX and MeshHash, while not broken, already have cryptanalysis done against them that makes me a little uneasy using them.
I prefer Keccak over Skein, for the simple reason there is a bona fide 32-bit variant of Keccak that can run quickly on 32-bit systems. Skein is designed to run well only on 64-bit systems. Part 5.4 of the Skein paper talks about the possibility of making a 32-bit variant of Skein but that they need to come up with rotation and permutation constants, and figure out how many rounds a 32-bit Skein variant would need. I would like to see Schneier et al. (the team responsible for Skein) actually do this. Skein is more flexible than Keccak (I think Threefish is the first tweakable block cipher since the somewhat broken Hasty Pudding Cipher), and is faster on 64-bit systems, but I would like to see it run on embedded and legacy systems better.
Did you really need a link to explain that? I mean, the fact that I'm deriving a 16-byte hash from a multi-gigabyte file should be a pretty good indication that there's no way to turn it around. Otherwise we'd have some really cool compression algorithms.
The fact that people still think hashes are reversible makes it pretty clear that they need more than "no, you can't do that".
Disconnect and self-destruct, one bullet at a time.
Here is a torrent of all 51 submissions: http://thepiratebay.org/torrent/4592403
Popular wisdom has it that no product will have any support for any of these algorithms for years, if ever. Of course, popular wisdom is ignoring all Open Source projects that support cryptography (including the Linux kernel) which could add support for any of these tomorrow. Does it really matter if the algorithm is found to be flawed later on, if most of these packages support algorithms known to be flawed today?
It matters a lot. Say OpenSSL added all of these algorithms tomorrow. Some idiot developer (hint: go read DailyWTF) will build on top of it. OpenSSL now has to maintain backwards compatibility - so they can never take out the algorithm. A month from now, the algorithm gets broken completely. But because OpenSSL shipped with it, they can never take it back out.
The "popular wisdom" standard for proliferating a new algorithm is not how shiny it looks at first glance. Popular wisdom waits months or years until algorithms seem good enough. MD5 (or even MD4), SHA1 - all are good enough for some purposes (generally, when the attacker does not control the input). And if the attacker does control the input, the only sure solution is to send the whole thing - anyone believing otherwise needs to review the meaning of the word "hash". A secure hash is merely an irreversible hash with a very low risk of collision.
Even this article is mostly "security theater". There are very, very few uses of secure hashes where SHA1 (or even MD5, for that matter) is not good enough.
A witty [sig] proves nothing. --Voltaire
The article is already out of date. The round 1 candidates were announced back on December 11. Since that time, 11 candidates have been broken. For the latest information, I recommend visiting the SHA-3 Zoo.
Also, the article suggests that candidates will continue to be broken quickly, but I doubt this will happen. The weak hashes will be broken quickly, but there are likely to be many strong candidates which will not be broken during the contest. Other factors (speed, simplicity, etc.) will determine the ultimate winner.
I'll ignore your misuse of the term 'reversible', others have explained it.
Rainbow tables are only feasible against poor implementations, e.g. the Windows SAM hashes. Even the stored LM2 hash is susceptible to a rainbow table that can fit on a dual layer DVD for over 99% of the keyspace. The old crypt in Unix systems is similarly weak (though still not nearly as much). The implementation of MD5 crypt in /etc/shadow would require about 10^73 yottabytes of a rainbow table to achieve the same end in the same way.
In other words, a dictionary attack on the password space, rather than precomputed tables of hashes, remains the biggest threat to /etc/shadow. Any application in its right mind would use a similar strategy to remember how to prove client knowledge of a password.
MD5 is not sufficiently broken yet to induce panic. As far as I understand, there is no attack yet that has sufficient control over the colliding data to be of consequence yet.
Besides, what would your proposal be? The other logical class of cryptography would be two-way, which fundamentally provides no security in these instances. Passwords are hashed so a server can prove a password is valid without having to know the password. If it were two-way, the crypted data and the key would both have to be accessible, making it trivial to break if you achieve privilege to get the password file today. The other major application is download verification, to enable a small amount of data to be distributed in a more trustworthy way to validate data transmitted in the most expedient way, or to validate future transfers once trust is established.
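For the download-verification case, a minimal sketch might look like this. It assumes the publisher distributes a SHA-256 hex digest over a trusted channel; `stream_digest` and `verify_download` are hypothetical names, and the in-memory payload stands in for a real file.

```python
import hashlib
import hmac
import io
from typing import BinaryIO

def stream_digest(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Hash a stream in fixed-size chunks so large files need little memory."""
    hasher = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()

def verify_download(stream: BinaryIO, published_hex: str) -> bool:
    """Compare the computed digest with the one published by the sender."""
    return hmac.compare_digest(stream_digest(stream), published_hex.lower())

# Stand-in for a downloaded file and its separately published checksum.
payload = b"x" * 200_000
published = hashlib.sha256(payload).hexdigest()
print(verify_download(io.BytesIO(payload), published))
```

The bulk data can travel over any fast, untrusted channel; only the short digest needs to arrive by a route you trust.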
XML is like violence. If it doesn't solve the problem, use more.
Not strictly true. Rainbow tables are only feasible for very small inputs -- like 8-character-or-less passwords. Salting makes the minimum input larger (much larger, since salts are usually full binary, whereas password characters are almost always out of a small subset of possible characters). Of course, rainbow tables are absolutely useless if what you're hashing is, say, an entire file for a digital signature.
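Some back-of-the-envelope arithmetic for the point about salts enlarging the input space. The 95-character printable ASCII alphabet and the 128-bit random salt are assumptions chosen for illustration, not figures from any particular system.

```python
PRINTABLE = 95   # printable ASCII characters (assumed password alphabet)
MAX_LEN = 8      # "8-character-or-less passwords"
SALT_BITS = 128  # hypothetical 16-byte random binary salt

# Number of distinct passwords of length 1..8 over that alphabet.
unsalted = sum(PRINTABLE ** n for n in range(1, MAX_LEN + 1))

# A precomputed table would need to cover every (salt, password) pair.
salted = unsalted * 2 ** SALT_BITS

print(f"unsalted inputs: about 2^{unsalted.bit_length() - 1}")
print(f"salted inputs:   about 2^{salted.bit_length() - 1}")
```

Under these assumptions the unsalted space fits comfortably in a 64-bit counter, while the salted one does not come close.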
Does it really matter if the algorithm is found to be flawed later on, if most of these packages support algorithms known to be flawed today?
does it matter? does it matter?? fuck me it fucking matters.
example 1
there's a type of encryption algorithm principle - the feistel cipher - see http://en.wikipedia.org/wiki/Feistel_cipher - where you perform one simple transform function as "round 1", then for rounds 2 and 3 you do a one-way hash function, and then for round 4 you do a simple transform function.
if the one-way hash is ever broken, your encryption cipher is also broken.
game over: any traffic that's ever been using that cipher can be decrypted.
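For readers who have not seen one, here is a generic balanced Feistel network with a keyed hash as the round function. This is a textbook-style sketch, not the specific four-round layout described above, and it is not a vetted cipher; the block size, round count, and the use of SHA-256 as round function are all illustrative assumptions.

```python
import hashlib

HALF = 8  # bytes per half-block, so blocks are 16 bytes (illustrative)

def round_function(half: bytes, key: bytes, round_no: int) -> bytes:
    """One-way round function: a keyed hash truncated to half a block."""
    return hashlib.sha256(key + bytes([round_no]) + half).digest()[:HALF]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_block(block: bytes, key: bytes, rounds: int = 4) -> bytes:
    if len(block) != 2 * HALF:
        raise ValueError("block must be exactly 16 bytes")
    left, right = block[:HALF], block[HALF:]
    for r in range(rounds):
        # (L, R) -> (R, L xor F(R))
        left, right = right, xor(left, round_function(right, key, r))
    return left + right

def decrypt_block(block: bytes, key: bytes, rounds: int = 4) -> bytes:
    if len(block) != 2 * HALF:
        raise ValueError("block must be exactly 16 bytes")
    left, right = block[:HALF], block[HALF:]
    for r in reversed(range(rounds)):
        # Undo one round; the round function is only ever run forwards.
        left, right = xor(right, round_function(left, key, r)), left
    return left + right

key = b"example key"
plaintext = b"sixteen byte msg"
ciphertext = encrypt_block(plaintext, key)
print(decrypt_block(ciphertext, key) == plaintext)
```

Decryption never inverts the round function, which is why a one-way hash can sit inside a reversible cipher, and also why the cipher's strength leans on that function behaving unpredictably.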
example 2
your credit cards you carry around? the PIN number isn't stored on the card - but an MD5 hash of the PIN number *is* stored on the card (making replay attacks possible, believe it or not).
if MD5 is ever cracked...
game over: anyone can get your PIN number.
example 3
your peer-to-peer filesystem, your git source control system, they use one-way hashes to store an index of the data blocks. let's say that someone deliberately wants to break deployed systems, they work out what file chunks could end up being mapped to the same one-way hash...
game over: anyone can corrupt the database or the peer-to-peer filesystem by _deliberately_ making file or file chunks write to the same block.
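A small sketch of why a collision matters for hash-indexed storage. The blob id below follows git's object naming for blobs (SHA-1 over a "blob <size>" header, a NUL byte, then the content); the `ContentStore` class is a hypothetical toy, not git's actual object database.

```python
import hashlib

def git_blob_id(data: bytes) -> str:
    """SHA-1 of b"blob <len>\\0" + data, the id git assigns to a blob."""
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()

class ContentStore:
    """Toy content-addressed store: the hash of the data is its only key."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        key = git_blob_id(data)
        # If a key already exists the store assumes the content is identical.
        # Two different blobs with the same hash would silently alias here.
        self._objects.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def __len__(self) -> int:
        return len(self._objects)

store = ContentStore()
key = store.put(b"some file chunk\n")
print(key, store.get(key))
```

The store never compares contents, only keys, so its integrity is exactly as good as the collision resistance of the hash.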
i could go on with the list of examples - authentication systems that would fall over; internet bank systems that could be broken in to - we _totally_ rely on one-way hashes working correctly.
it's important beyond _belief_ that these one-way hash functions work, so much so that i was staggered that the question even had to be asked as part of the article-announcement.
Triple MD5 anyone? Hey it worked to extend DES!
(Triple MD5 is composed of the XOR of standard MD5 first byte to last byte, MD5 last byte to first byte, MD5 middle out to the ends. Faster hardware makes this feasible now.)
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
Anyone who has access to a set of password hashes will break some of them quickly. Just make sure your system is robust despite that (i.e., make sure that you can't get to a given set of password hashes unless you can already get to everything accessible using every password in that set).
Humans choose short, weak passwords, and always will. Make your system OK with that. There are plenty of ways, from limiting retries to using physical tokens. 4-digit PINs *work* for ATM cards, because the PIN isn't the only element providing security. The physical token, the ability to invalidate or capture the physical token, the camera in the ATM, and the daily withdrawal limit are all important there.
Hash functions are more interesting for digital signatures. Depending on a hash not being reversible given a very short input is a bit silly. That's not really the point of a hash function. The point is to make it practically impossible to create a document that matches a given signature.
Socialism: a lie told by totalitarians and believed by fools.
IIRC, Skein is getting about 6 cycles a byte in 512-bit mode on 64 bit platforms, which on a 2.4GHz dual core CPU would yield a theoretical 800 MB/s in a parallel tree hashing mode, 400 MB/s in standard mode. Apparently MD6 has a parallel mode also, and it's striking that both hash functions are trying to be minimalist by employing only three fundamental operations (AND, XOR, SHIFT for MD6; XOR, ADD, ROLL for Skein) and lots of rounds. It's odd that MD6 should be so much slower. Perhaps it hasn't been fully optimized yet?
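The throughput figures above follow from simple arithmetic, sketched here. The function name is made up, and the model assumes perfect scaling across cores with no memory or I/O bottleneck, which is why the result is only a theoretical ceiling.

```python
def throughput_mb_per_s(clock_hz: float, cycles_per_byte: float, cores: int = 1) -> float:
    """Bytes hashed per second = clock rate / cycles per byte, times cores."""
    return clock_hz / cycles_per_byte * cores / 1e6

single = throughput_mb_per_s(2.4e9, 6)           # standard (sequential) mode
dual = throughput_mb_per_s(2.4e9, 6, cores=2)    # tree mode across two cores
print(f"{single:.0f} MB/s on one core, {dual:.0f} MB/s on two")
```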