Slashdot Mirror


BLAKE2 Claims Faster Hashing Than SHA-3, SHA-2 and MD5

hypnosec writes "BLAKE2 has been recently announced as a new alternative to the existing cryptographic hash algorithms MD5 and SHA-2/3. With applicability in cloud storage, software distribution, host-based intrusion detection, digital forensics and revision control tools, BLAKE2 performs a lot faster than the MD5 algorithm on Intel 32- and 64-bit systems. The developers of BLAKE2 insist that even though the algorithm is faster, there are no loose ends when it comes to security. BLAKE2 is an optimized version of the then SHA-3 finalist BLAKE."

96 of 134 comments (clear)

  1. ORAC by MickyTheIdiot · · Score: 4, Funny

    He was always good with codes...

    1. Re:ORAC by Nyder · · Score: 1

      He was always good with codes...

      woot!

      --
      Be seeing you...
  2. I don't trust new, untried crypto. by SuricouRaven · · Score: 5, Funny

    Give it, oh, five more versions.

    1. Re:I don't trust new, untried crypto. by davester666 · · Score: 1

      So, you want to wait until the version string the tool reports is at least 1.0.5?

      Done!

      --
      Sleep your way to a whiter smile...date a dentist!
    2. Re:I don't trust new, untried crypto. by Anonymous Coward · · Score: 2, Interesting

      He's waiting for Blake's 7.

      http://en.wikipedia.org/wiki/Blake%27s_7

  3. links to NIST by girlinatrainingbra · · Score: 5, Interesting

    The BLAKE hash function was an also-ran finalist for the NIST Hash function competition ( http://en.wikipedia.org/wiki/NIST_hash_function_competition ). There is not yet a wikipedia page for BLAKE2, but the winner of the NIST competition was Keccak now known simply as SHA-3 since it won the competition.
    .
    Why would an optimized (optimized for run time speed? optimized for low memory footprint while running? optimized to minimize the likeliness of hash collisions) version of the same BLAKE entrant be more useful? Perhaps an improved algorithm that made it better competition for Keccak would make more sense. I don't know enough math to say completely, and still need to read the details.

    1. Re:links to NIST by Nexion · · Score: 2, Insightful

      Yeah... I was under the impression that more hashes faster was a bad thing. That is something for which crackers tune their code, but they didn't explicitly mention password storage. Perhaps it is targeted more for checksum like usage.

    2. Re:links to NIST by nazsco · · Score: 1, Informative

      Time to encode is not always related to time to decode. So it would benefit dictionary attacks only,

    3. Re:links to NIST by Anonymous Coward · · Score: 1

      If you can decode a hash, your hash is very very broken.

    4. Re:links to NIST by Anonymous Coward · · Score: 2, Informative

      There is no "decode" - breaking hashes is done by brute force (or by using dictionaries + modifiers).

    5. Re:links to NIST by Zero__Kelvin · · Score: 2, Funny

      Yes. I hate optimized versions of things. Why does everyone think it's better just because they made it better?

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
    6. Re:links to NIST by Anonymous Coward · · Score: 5, Informative

      It is optimized for runtime speed, though they also claim a slightly lower memory footprint.
      The chance of hash collisions is about the same for any unbroken cryptographic hash.

      Keccak has good performance in dedicated hardware, but was one of the slower SHA-3 candidates in software implementations. Compared to MD5, the best SHA-3 implementation is between 2 to 4 times slower on recent Intel processors, or 5 times slower on a recent ARM CPU.

      Note that NIST has stated that any of the five SHA-3 finalists would have been an acceptable SHA-3. Keccak was selected in part due to having a very different composition than the SHA-2 family, decreasing the risk that both are broken at once. That does not mean Keccak is superior to Blake for all purposes - if software speed is the main issue, it is almost certainly worse.

    7. Re:links to NIST by Anonymous Coward · · Score: 1

      >> Time to encode is not always related to time to decode.
      What do you mean with "decode" ?

      >> they didn't explicitly mention password storage
      a very fast hash is bad for password storage. Password storage needs very slow hashes. generic hashing of a lot of data needs fast hashes.

    8. Re:links to NIST by Johann+Lau · · Score: 1

      That also means you can hash more often in the same time, it cancels out. I'm not a crypto expert or even non-doofus, but basically you keep inreasing the number of iterations as processing power increases, e.g. calibrate it to take at least N seconds once a year. Which has nothing to do with the algorithm used really; I just like to throw dangerous half-knowledge out there to see what happens haha.

    9. Re:links to NIST by marcansoft · · Score: 5, Informative

      For password hashing, that is correct. However, cryptographic hash functions are not designed for such use (and yes, all the websites and services out there using plain hashes for passwords are Doing It Wrong, even if they are using a salt). You can build a good password hashing scheme out of a cryptographic hash function (for example, PBKDF2), but plain hash functions are not suitable (precisely because they are too fast).

      Fast cryptographically secure hash functions are a Good Thing, so you can hash a given block of data (and thus compute its digest and e.g. verify its digital signature) as fast as possible. This speeds up things like GPG, SSL, Git*, good old sha1sum checksums, etc. If you then want to use such a hash function as the core of a password hashing scheme, you can compensate for the extra speed by simply increasing the number of iterations. Making a hash function slower is always easy.

      *Git is pretty much stuck with SHA-1 for now, but future incompatible versions of the repo format could conceivably switch to a faster hash function if it made sense.

    10. Re:links to NIST by diamondmagic · · Score: 1

      Don't confuse a hash with a key derivation function. A hash is just a cryptographic primitive, you shouldn't use it directly (like you don't use block ciphers directly). It should be fast, and if you need it to be slower, you can iterate its output through itself and slow it artificially.

      A MAC like HMAC uses a hash, but it accepts a secret key as a parameter and it does some stuff to prevent so-called length-extension attacks (whereby given Hash(x) you can determine very quickly Hash(Concat(x, y)) for some C and x, meaning attackers can append data to your secret communications and go unnoticed). Speed is important in this application, and your secret key should already have enough security that speed doesn't matter: 128 bits of security (= 256 bit key) should outlive much of the combined computational power of the Earth. (There's 512 bit hashes because, among other reasons, one use of hashes is to cut their output in half thereby eliminating information about their internal state, and then the effective security is one-half of that, or 128 bits.)

      Key derivation functions, like PBKDF2, by contrast, have to work with very low input strength, usually 40-80 bits. By iterating the hash 1000 times, you add 10 bits of security on top of your input strength. It's still not very secure, but it's better than the alternative of nothing.

    11. Re:links to NIST by Anonymous Coward · · Score: 5, Informative

      A common misconception, but untrue. Cryptographic hashes can be used for many purposes - for storing passwords, to sign documents, to verify file integrity, as a short unique identifier for content, etc.

      The only of these for which slower is better, is password storing.
      If you are using e.g. cryptographic hash functions with a 512-bit output, it is irrelevant for practical security that one hash function is a billion times faster than another. Either way, you will not find a collision in the lifetime of the universe without finding a weakness in the hash.

      Now, this applies when we can assume that any needed keys are of good randomness. This is not the case with passwords, because human-chosen passwords have low enough entropy that they can be guessed.
      However, it is very simple to make a slow hash from a fast one by using a key derivation function like PBKDF2, which iterates over the hash function a configurable number of time, often hundreds of thousands.

      In short, many applications benefit from a fast hash algorithm, and none suffer, because you can always easily make a slower one.

    12. Re:links to NIST by Anonymous Coward · · Score: 1

      You obviously have never used full disk encryption, ZFS, etc.

      Faster hashes will be _very_ welcome!!!!

    13. Re:links to NIST by mysidia · · Score: 1

      The only of these for which slower is better, is password storing. If you are using e.g. cryptographic hash functions with a 512-bit output, it is irrelevant for practical security that one hash function is a billion times faster than another.

      No, it's not irrelevent. The 64-bit hash that cannot be sped up may be stronger than a 512-bit hash that can be computed quickly.

      If the fast hash enables me to build an ASIC that can perform 10^150 hashing operations per second instead of 10^80, then I will be able to find a collision on the weak hash within a few hours, whereas, the slower 64-bit hash might have been something that would take hundreds of years.

    14. Re:links to NIST by Johann+Lau · · Score: 2

      All but the biggest web services will be dealing with roughly one login at a time (ie. you're usually finished with the previous login before the next user tries to login) whereas people attacking will be trying to run as many hashes as possible as quickly as possible to find any matches.

      So how does a slower hash help here? It changes nothing about this fundamental principle.

      An algorithm that's too memory hungry or esoteric to run on something like a GPU core is all to the good right now.

      Botnets.

    15. Re:links to NIST by Anonymous Coward · · Score: 1

      A fast cryptographically secure hash function is always nice to have, for example, deduplication in a filesystem. You want it to be as difficult as possible to create a hash collision, while not needlessly wasting time and CPU cycles.

      For authentication, you'll of course be adding a salt and as much recursion as necessary to ensure brute force attacks aren't feasable. Nobody is going to depend on the hashing function alone.

    16. Re:links to NIST by rthille · · Score: 1

      And has a very large number of bits in the resulting hash...

      --
      Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
    17. Re:links to NIST by steviesteveo12 · · Score: 2

      I thought you weren't a crypto expert?

    18. Re:links to NIST by SirMarth01 · · Score: 2

      NIST's comments on their selection of Keccak to be SHA-3: (PDF)

      Additionally, KECCAK complements the existing SHA-2 family of hash algorithms well. NIST remains confident in the security of SHA-2 which is now widely implemented, and the SHA-2 hash algorithms will continue to be used for the foreseeable future, as indicated in the NIST hash policy statement. One benefit that KECCAK offers as the SHA-3 winner is its difference in design and implementation properties from that of SHA-2. It seems very unlikely that a single new cryptanalytic attack or approach could threaten both algorithms. Similarly, the very different implementation properties of the two algorithms will allow future application and protocol designers greater flexibility in finding one of the two hash algorithms that fits well with their requirements.

      So, Keccak wasn't necessarily chosen because it was "superior" to the other finalists (note that it's slow when implemented in software, especially in comparison to the other finalists), but because it was different enough that, should SHA-2 be found to be fundamentally broken, SHA-3 should remain unaffected.

      An optimized version of BLAKE would be useful because it can run faster than SHA-3 in software, while also being (theoretically) stronger than SHA-2 (or other older algorithms).

      Mind you, I'm not an expert in cryptography; regardless, it probably wouldn't be the greatest choice to use BLAKE2 right away. It's still not been held up to the level of scrutiny that something like SHA-2 has. (Although I'm certain that being a finalist has helped it come under much closer scrutiny than if it hadn't been one.)

      As for why a fast cryptographically secure hash function is desirable, others have already answered that question better than I can.

    19. Re:links to NIST by Anonymous Coward · · Score: 1

      If the fast hash enables me to build an ASIC that can perform 10^150 hashing operations per second

      Ohh, I see we're not working within the laws of thermodynamics anymore. 10^150 hashing ops you just did just consumed the entire MilkyWay for energy.

    20. Re:links to NIST by aysa · · Score: 1

      Why would an optimized (optimized for run time speed? optimized for low memory footprint while running? optimized to minimize the likeliness of hash collisions) version of the same BLAKE entrant be more useful?

      BLAKE2 is an improved version of the SHA-3 finalist BLAKE. Like BLAKE or SHA-3, BLAKE2 offers the highest security, yet is fast as MD5 on 64-bit platforms and requires at least 33% less RAM than SHA-2 or SHA-3 on low-end systems. I can find applications in cloud storage, software distribution, host-based intrusion detection, digital forensics, revision control and embedded systems https://blake2.net/blake2_20121223.pdf

    21. Re:links to NIST by Anonymous Coward · · Score: 1

      your theoretical CPU is slow

      custom ASICs can compute hashes for different data in parallel, for 12 character password to be secure against them, it would have to be completely random, contain letters of mixed case, numbers and symbols

    22. Re:links to NIST by WaywardGeek · · Score: 2

      No, his point was valid. You can always make a slow hash algorithm out of a fast one, for example by rehashing over and over again a 100K times. Speed in a hashing algorithm is always a good thing, so long as its used properly, which unfortunately often not the case. I'm hoping to avoid making a mistake myself. I'm considering using a blake2 64-bit authentication has in P2P messages. Any good reasons not to?

      --
      Celebrate failure, and then learn from it - Nolan Bushnell
    23. Re:links to NIST by Jorl17 · · Score: 2

      Why not Ask Slashdot?

      --
      Have you heard about SoylentNews?
    24. Re:links to NIST by sqlrob · · Score: 1

      No, nobody SHOULD depend on the hashing function alone.

      There's a big difference between "shouldn't" and "doesn't"

    25. Re:links to NIST by marcansoft · · Score: 1

      First of all, you mention the TCP/IP checkum, at the same time as key exchanges, long PSKs, and money transfer. Which is it, simple message integrity or message authentication? If you need authentication, you shouldn't even be mentioning TCP/IP checksums. If all you need is integrity, then just append an MD5.

      Assuming you need authentication: I don't have any opinion on Blake2, but you should use an HMAC instead of just appending the shared secret key and taking the hash. HMAC can be built on top of any hash function, including Blake2. You should also make your MACs at leasts 128 bits long, preferably 192 or 256. 64 bits is definitely bruteforceable these days. I wouldn't settle for anything shorter than 256 for a protocol that involves money.

      Keep in mind that a basic message authentication code is only one piece of a secure protocol. There are many possible attacks that might slip through (e.g. replay attacks, MITM, timing attacks, ...). For example, you need to ensure that an attacker can neither reinject a message from the current session back into it (e.g. by using a sequence number in the signed payload) nor reinject a message from a different session (e.g. by ensuring that there is a random component to the shared secret) nor somehow guess the right MAC code (e.g. by making sure that the wrong-MAC responses do not leak any information, neither in contents nor timing), and of course there's always the host identification problem (preventing a simple MITM where the attacker performs a key exchange with both sides and resigns/reencrypts traffic both ways). Honestly, your first choice should be to use an existing well-tested protocol, such as TLS (you can build your own host verification rules on top of it, you don't have to use the "well-known" root CA list). Failing that, consult a real crypto expert.

    26. Re:links to NIST by Johann+Lau · · Score: 1

      I'm not trying to be dense, I just really don't get it.

      For me, at least when just considering CPU, 1000 iterations of Algorithm A take as long as 1 iteration of an Algorithm B which is 1000 times slower, and a slower hash makes it slower for both sides in equal measure, likewise for a faster hash. That's why I claimed "it's a wash".

      If there is a way to use memory/IO in a way that increases the cost for attacking it more than than the cost of defense, someone has to actually spell it out for me, or I'll remain ignorant :(

    27. Re:links to NIST by mysidia · · Score: 2

      I'm considering using a blake2 64-bit authentication has in P2P messages. Any good reasons not to?

      That may not be a very conservative choice from a security standpoint. The algorithm is new, and has less review/scrutiny and has stood less of a test of time than some other algorithms such as SHA1, RIPEMD-160, Tiger, VSH, Whirlpool. The protocol or its optimization may be broken in a subtle way that will be discovered within 2 to 6 years.

      Even BLAKE the SHA-3 finalist may have stood less scrutiny so far. Although; the finalist protocols for SHA3 will eventually be under a microscope, and a great deal of scrutiny from many crypto researchers. It remains to be seen of there will be an attack against new "optimized" version; frankly, I'm not sure how much scrutiny the optimized version will get if it's not also a finalist for SHA3, and it's not right now a widely used algorithm either.

      I would suggest implementing the protocol in some manner with support for multiple choices of hashing algorithm, so you can upgrade later, or replace/transition away from the BLAKE-fast algorithm without breaking the network, if it turns out to have issues.

    28. Re:links to NIST by DMUTPeregrine · · Score: 1

      Fast hashes are bad for passwords. Slow hashes are bad for data integrity checking, message authentication codes, and other such things. They are similar constructs, but used for very different purposes.

      If you want to verify your 1TB video collection got transmitted properly you want a fast hash to keep it from taking a long time. That same hash would be totally unsuitable for password hashing.

      --
      Not a sentence!
    29. Re:links to NIST by Dr_Barnowl · · Score: 1

      When you're hashing your password, you only have to do it once. Your multiplier is 1.

      When you has to defeat a password, you have to do it as many times as you need to, to match the previously hashed password. Let's say you have a really crap password (a 6 digit number) and your attacker knows this. Your multiplier is between 1 and 1,000,000

      The hash is the difficulty factor. If your hash is optimized for throughput, it's cheap. The advantage is that it's really good for establishing data integrity - large volumes of data pass through it quickly. This is the opposite of what you want in a password hash, where the data volume is small.

      If we take hypothetical hashes A and B, where the cost of A is 1 and the cost of B is 1,000 , we can see that the cost of hashing data with B is too much for bulk data, but the cost of doing it for passwords makes it harder for an attacker, who has to hash a vast amount of data compared to someone logging into their account.

      A : 1 x 1 - 1,000,000 = 1 - 1,000,000
      B : 1,000 x 1 - 1,000,000 = 1,000 - 1,000,000,000

    30. Re:links to NIST by mattpalmer1086 · · Score: 1

      Follow this simple rule: don't invent your own crypto. And that goes double for protocols. It's hard. I mean, really, really hard to get it right. If you are not an expert, then there are almost certainly known attacks which will undermine the security of your entire system. If you are an expert, there are still probably subtle attacks (but fewer of them which may not be discovered for years).

      I understand most standard modern crypto and protocols quite well. I have studied and worked with it long enough to recognise that using tried and tested methods is almost always the right thing to do. I am not an expert, but I am hopefully a reasonable practitioner.

      With that disclaimer out of the way, you need to identify what security you are trying to achieve first, before tacking on cryptographic primitives left right and center. You seem to want message confidentiality and message authentication. For confidentiality with message authentication, use standard constructions. Preferably use an authenticated encryption mode, which has the advantage that they can take less message space than the old style encrypt-then-HMAC constructions, and are usually faster than doing either separately. And you typically only need a single key. If you use the encrypt-then-MAC construction, don't use the same key for encryption as you do for MACing.

      Does confidentiality extend to the attacker knowing whether he has seen the same encrypted plaintext before? I.e. should the same plaintext always encrypt to a different ciphertext, or are you happy for the same ciphertext to always be produced for the same plaintext? Do you care about replay attacks? What happens if an attacker replays an old message? Message counters or timestamps can help here. What about non-repudiation? Does it matter whether you can prove which side generated the message or not? For payments, this might be important.

      This only scratches the surface. Get proper advice to analyse the true security requirements of your protocol, and use standardised constructions in the implementation. Use BLAKE by all means if it fits the security you need to achieve and performance is a primary requirement. But figure out what security you need first.

    31. Re:links to NIST by steviesteveo12 · · Score: 1

      Dr_Barnowl puts it very well too. I'm taking his example of a 6 digit password here. IIRC someone (maybe at Google?) did research that most users will sit for around a second on a login before they start wondering if it's broken, so that gives you a whole second -- billions of processor cycles -- to run anything you want on the single password. An attacker doesn't have the luxury of spending that much time on a single password. For example, a million seconds (000000-999999 @ 1s each) is about a week and a half. Parallelism helps attackers, really quite a bit -- a single Nvidia GeForce GTX690 has 3072 cores which means that with all cores running you can cut that down to 5 minutes with one graphics card which is why you don't want your hash to be easily run on hugely parallel hardware. The important thing is that trying billions of possibilities in parallel is only something an attacker ever would do. That means you want to make the hash function too big to fit into a GPU core's cache, use instructions that a GPU can't process, lock in some way etc etc.

    32. Re:links to NIST by steviesteveo12 · · Score: 1

      Exactly, that's very well put.

    33. Re:links to NIST by WaywardGeek · · Score: 1

      Thanks! Both to you and the others who replied. These comments are quite useful, and exactly what I was hoping for. First, I really am focusing on authentication of small messages. If encryption is needed, I expect the communication channel to be secured using existing mechanisms. My basic P2P protocol does not address secrecy, and is meant to be ISP friendly. In short, I'm hoping to build a Ripple based system for exchange of value, though that's just part of the P2P platform, which I intend to be a rapid application development tool for high performance P2P apps. While secrecy is an aim of Ripple, it's not one of my aims, and users of the system may request the state of any or all of the Ripple network. I also intend the system to be IRS friendly, and frankly, Ripple becomes much simpler if you have access to the whole network.

      Based on you're guys feedback, I'm thinking SipHash sounds like an excellent fit, but I'll provide a software switch to the P2P server to switch to HMAC-SHA2. I'll include a message sequence number between each pair of peers which are never reused in a session. Is there anything else I need to do?

      Authenticating messages between peers who already have a shared key is code I'm writing now, so thanks for the instant feedback. For signing Ripple money, and key exchange, I'll need a proven public key crypto system. BitCoin uses an elliptic curve system, which considerably reduces the size the transaction history. Do you guys have any advice for choice of e-money signing systems? There's no really hurry here, I'm just looking for initial advice. An important consideration is that I wish to support micro-transactions as small as 0.01 cent. The typical cost of computation and bandwidth for such transactions needs to be far less than this, say 0.0001 cents or lower.

      --
      Celebrate failure, and then learn from it - Nolan Bushnell
    34. Re:links to NIST by ILongForDarkness · · Score: 1

      But doesn't that also mean that your user authentication process will be 1000 slower (at least for the CPU part of the process)? I'm assuming the user types their password and the hash gets sent over the wire. Hashing might be a bigger part of the type with Blake on a non-Intel piece of hardware (say a smart phone) since the instruction set might not be there, and definitely the CPU horsepower isn't.

    35. Re:links to NIST by Johann+Lau · · Score: 1

      Ah! Now that I got it, I feel a bit stupid for not seeing that right away.. thanks a lot to you all for explaining over and over ^^ Not knowing what the big deal / difference is kinda bugged me, now I have closure.

    36. Re:links to NIST by ILongForDarkness · · Score: 1

      Not to mention liability issues: using standards and getting exploited anyways is much different than trying to roll your own and being the only one vulnerable to the attack ... "the buck stops ... hopefully somewhere else".

    37. Re:links to NIST by steviesteveo12 · · Score: 1

      That's not a problem. You would never do any of this on the client end. It's far too easy for someone to strip out the hashing the password stage and replace it with a false "yeah, it worked" signal. Sony did a howler where they implemented a client side CAPTCHA where they sent out the right answer and asked if you'd typed it in. http://cryptogasm.com/2011/07/sony-captcha-fail/

      It's might be 1000x slower but, so what? You can afford to take half a second longer. Users generally won't even notice.

    38. Re:links to NIST by WaywardGeek · · Score: 1

      I agree. I'm planing to use SipHash and having a switch to use HMAC-SHA2. Thanks for the advice.

      --
      Celebrate failure, and then learn from it - Nolan Bushnell
    39. Re:links to NIST by mysidia · · Score: 1

      No, his point was valid. You can always make a slow hash algorithm out of a fast one, for example by rehashing over and over again a 100K times.

      Not true. The word always is very dangerous when you don't restrict it. You can always make a slow implementation of a hash algorithm by rehashing over and over again 100K times.

      It may very well be that the result of performing all those 100K repetitions can be computed in a different way, that is barely any slower than 1 more repetition of the original hash, so your slow implementation that does 100K repetitions, is not the fastest way to accomplish them.

      Ergo, you don't have a slow hash at all.

  4. Time by Anonymous Coward · · Score: 1

    If security is important to you, don't use it until its been around for a few years at least. Boy was it annoying replacing all the md5 usages in various codebases... and that's been around for more than "a few years."

    1. Re:Time by magic+maverick+ · · Score: 1

      So use a generic hashthis() function (or class, whatever), and then you don't have to replace sha3() or blake2() or whatever through your code, merely modify the hashthis() function to use the new algorithm instead. Forward thinking amirite?

      --
      HELP MY ACCOUNT HAS BEEN HACKED BY AN ILLIBERAL ART STUDENT SET TO DESTROY THE INTERWEBZ!
    2. Re:Time by Gaygirlie · · Score: 4, Informative

      So use a generic hashthis() function (or class, whatever), and then you don't have to replace sha3() or blake2() or whatever through your code, merely modify the hashthis() function to use the new algorithm instead. Forward thinking amirite?

      Otherwise yes, but it would make all the existing data unreadable. If you make it easy to change the hashing method like that then you will also have to always track what data was hashed with what method.

    3. Re:Time by preaction · · Score: 1

      Which is exactly what /etc/shadow does, and it's been around for a long time. These are all solved problems if anyone would pay attention.

    4. Re:Time by Gaygirlie · · Score: 1

      Which is exactly what /etc/shadow does, and it's been around for a long time. These are all solved problems if anyone would pay attention.

      No one was talking about unix passwords here. Data can be almost anything that needs to be hashed and it may be stored in SQL - databases, random text files, proprietary binary files and so on. In other words your referral to /etc/shadow is irrelevant.

    5. Re:Time by zippthorne · · Score: 1

      You have to do that anyway...

      For passwords, I'd assume (but probably be wrong...) that you'd convert the password to the new hash either when the user signs in next, or when the user changes their password next, but you still have to be able to check against the old hash, so you'd always need to know what type of hash that is. Seems to me that the best place to store that is in a field in the database right next to the hash itself.

      One would assume that hashthis() would be implemented to either be aware of the hash's algorithm (a property of its object, perhaps?), or take an argument that indicates the hash.

      I would assume that the feature, "able to change the hash algorithm" would tend to get added either around the first time that it needed to be updated, or by experienced developers who have already experienced that event in other applications.

      --
      Can you be Even More Awesome?!
    6. Re:Time by magic+maverick+ · · Score: 1

      I almost said something like that, along with the solution, but figured people could figure it out on their own.

      With password hashes, the type of hash tends to be part of the end result (e.g. $B$ indicates a hash of type B, while $A$ indicates a hash of type A; not real examples, 'cause I can't be bothered looking them up). You could easily do something similar with generic hashes.

      zippthorne (in a sibling to this post) gave some other thoughts.

      --
      HELP MY ACCOUNT HAS BEEN HACKED BY AN ILLIBERAL ART STUDENT SET TO DESTROY THE INTERWEBZ!
    7. Re:Time by preaction · · Score: 1

      People were talking about storing the hash algorithm with the hash (and implying it was a bad thing, or too much to ask for a simple programmer), which is what /etc/shadow does (which they did back before we had as much power and space as we do now.) The fact that /etc/shadow is storing passwords is, as you said, irrelevant. It stores info about the hash with the hash so that later you know what hash to use to verify.

    8. Re:Time by mysidia · · Score: 1

      There is a common method of doing this, by inserting a "HASH TYPE" header in front of a password. RFC2307

      SHA256 CRYPT() with 10000 rounds - {SHA256-CRYPT}$5$rounds=100000$hK.9TGSGMfuVhMY1$ukBEvc3eTCJNgiK.BIsNIWmq9JU3D8tMapmqwJB6ip0

      Cleartext password - {plain}password {clear}password
      Traditional Unix crypt - {crypt}FHrLg6eHMx2nA
      Unsalted MD5 - {md5}
      Salted MD5 - {smd5}p47J0UeLzLY+/Lk7q0ohOnMJTGI=
      Unix MD5 Crypt - {md5-crypt}$1$QgAqzRlm$Wo69omVJhJHZR2aOKJ4qs1
      Unsalted SHA - {sha1}...
      Unsalted SHA - {sha}W6ph5Mm5Pz8GgiULbPgzG37mj9g=
      Salted SHA1 - {ssha}Dh+8hirJLTZa79Pupbn4W5qbRMP9C7Bv
      SHA256 - {sha256}
      SHA512 - {sha512}
      Salted SHA 256 - {ssha256}
      Salted SHA 512 - {ssha512}
      Windows NT Hash / MD5 - {nt}
      NTLM - {ntlm}
      Lanman - {lm}
      {CRAM-MD5}
      {plain-md4}(MD4 in hex)
      {SKEY}....

    9. Re:Time by aliquis · · Score: 1

      Because it's so hard to store hash,MD5 or hash,SHA3 or whatever in two fields in a database.

  5. Re:They just need five more versions then.. by MickyTheIdiot · · Score: 1

    Version 2 was code named Jenna.

  6. Missing the point by Anonymous Coward · · Score: 1, Interesting

    There are 3 aspects that are important to the security of hashing algorithms.

    1) one-directional. You cannot get back to the original from the hash.
    2) unique-ness. Fewer different sources with hash to the same result.
    3) takes time to make. The quicker it is to create the hash, the quicker it is to brute-force your way through all possible combinations to find a hash that matches.

    Most password cracking happens by getting a list of hashed password values through some other securitu exploit, and running as many hashes as you can as quickly as you can to find match.
    If they're claiming a fast hash is a good thing, they're missing the point.

    1. Re:Missing the point by Georules · · Score: 5, Insightful

      Fast hashing has its uses as well. Such as verifying data transmissions. Passwords aren't the only things that we hash.

    2. Re:Missing the point by pipatron · · Score: 1

      3) takes time to make. The quicker it is to create the hash, the quicker it is to brute-force your way through all possible combinations to find a hash that matches.

      Most password cracking happens by getting a list of hashed password values through some other securitu exploit, and running as many hashes as you can as quickly as you can to find match. If they're claiming a fast hash is a good thing, they're missing the point.

      Bullshit. You don't bruteforce a 512 bit hash, no matter how fast it is. It simply doesn't work like that. You must find a weakness in the algorithm to break even much smaller hashes.

      --
      c++; /* this makes c bigger but returns the old value */
    3. Re:Missing the point by magic+maverick+ · · Score: 3, Informative

      Nah, you are. Hashes are used for a lot more than just passwords. Yes, for passwords a fast generic hash function like SHA2 or SHA3 (let alone MD5) is not such a good option. But for verifying that a downloaded executable or other file has not been modified, it's mostly fine. But don't use MD5, because it's completely broken (e.g. with it being possible to have two distinct PDF files having the same hash).

      For password hashing use Blowfish/Bcrypt.

      --
      HELP MY ACCOUNT HAS BEEN HACKED BY AN ILLIBERAL ART STUDENT SET TO DESTROY THE INTERWEBZ!
    4. Re:Missing the point by Agent+ME · · Score: 1

      Taking time is only important if your input data is small and you don't want someone guessing the input data and verifying it against the hash.

      If you want a hashing algorithm to take time, it's not that hard: just run it X times.

    5. Re:Missing the point by OneAhead · · Score: 1

      No weakness required, simply the ability to make educated guesses about the data being hashed. As in brute forcing password hashes. Granted, hash functions are used for a host of other applications, but password hashes happen to be pretty high-profile, and most definitely have a requirement that they shouldn't be too fast. You can call GP's claim 3) not generally true, but to call it bullshit is, well, bullshit.

    6. Re:Missing the point by Desler · · Score: 1

      Hashes are used for far more than storing passwords securely.

    7. Re:Missing the point by PhrostyMcByte · · Score: 2

      3) takes time to make. The quicker it is to create the hash, the quicker it is to brute-force your way through all possible combinations to find a hash that matches.

      I wouldn't say that. The hash itself should be as quick as possible, so long as it satisfies the other requirements. We have things like PBKDF2 when we want a hash to eat up a lot of CPU, and scrypt when we want to eat up lots of RAM as well.

    8. Re:Missing the point by eric_herm · · Score: 1

      You forgot : changing a little bit the input should ideally produce a totally different output ( ie, good cascading effect, if I am not wrong on the property ).

    9. Re:Missing the point by Immerman · · Score: 1

      Just to clarify - it's not the duplicate hashes that are the problem. Hash collisions should *always* be considered inevitable since the number of possible pdfs (or whatever) will exceed the number of possible hash values by many, many orders of magnitude unless the hash size is as large as the original file. The problem is that it's possible to *intentionally* cause a hash collision - for example by appending some "garbage" data to a compromised executable to make it appear to as though it's the original. And that's really only a problem in situations where you're using the hash to guard against malicious intervention, which is only one application of hashes. If you're simply using it to detect transmission or storage errors it's a non-issue. And if you're using it to generate something like a really good hash-table key (something secure hashes can also be quite valuable for) then you still need to be able to recognize and gracefully handle duplicate keys because collisions *will* eventually occur, a good hash just makes them uncommon enough that they're unlikely to cause performance penalties.

      --
      --- Most topics have many sides worth arguing, allow me to take one opposite you.
    10. Re:Missing the point by Pseudonym · · Score: 1

      Hash collisions should *always* be considered inevitable since the number of possible pdfs (or whatever) will exceed the number of possible hash values by many, many orders of magnitude unless the hash size is as large as the original file.

      The number of possible PDFs is larger, but the number of actual PDFs is lower. Much lower.

      A very reliable rule of thumb is that if there are N possible hash values, then you should expect 0.5 hash collisions after hashing N items. (Exercise: Prove this. You want to consider the limit as N goes to infinity.) So if you're hashing PDF files and looking for a collision, then for a 512-bit hash, you need around 2^256 PDF files to have a realistic chance of a collision. I'd wager that there aren't anywhere near 2^256 PDF files.

      Of course hash collisions should always be considered inevitable. As you say, that's just good programming practice. However, if your hash function is good and has a large range, it can safely be considered rare.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    11. Re:Missing the point by magic+maverick+ · · Score: 1

      Well yes. Good point that I should have made, it is the intentional aspect that is most worrisome. You take a PDF file, take the hash, and then can easily generate another PDF file with the same hash. Which apparently could be used for fraud. Or, take an executable and then generate a malicious one with the same hash and trick someone into downloading it instead of the correct one. I should have expanded on that point in my original post.

      --
      HELP MY ACCOUNT HAS BEEN HACKED BY AN ILLIBERAL ART STUDENT SET TO DESTROY THE INTERWEBZ!
    12. Re:Missing the point by WillerZ · · Score: 1

      A very reliable rule of thumb is that if there are N possible hash values, then you should expect 0.5 hash collisions after hashing N items.

      No you shouldn't. You should expect 0.5 collisions after hashing N/2 items. Trivially you are guaranteed at least M collisions when hashing N+M items.

      I don't actually know off the top of my head how many collisions you should expect if you hashed N items.

      --
      I guess today is a passable day to die.
    13. Re:Missing the point by Immerman · · Score: 2

      Okay, I can only assume that that's a typo - if you have N possible hash values and have hashed N items then you almost certainly already have many, many collisions, and are guaranteed a collision on the (N+1)th hash by the pigeonhole principle.

      Consider that a good hash function will bear many similarities to a random number generator and you'll see that it's a variant of the birthday problem - you don't need 180 people in a room to get a 0.5 chance of two people having the same birthday, only something like 20-30 (I don't remember exactly), and the effect scales sub-linearly. Logarithmically if I remember correctly. So yes, a good hundreds-of-bits hash function will generate values that can safely be considered rare, just NOT unique. And even events with billions-to-one against odds happen all the time in the real world, just not in a predictable fashion.

      --
      --- Most topics have many sides worth arguing, allow me to take one opposite you.
    14. Re:Missing the point by fnj · · Score: 1

      Are you sure you don't mean "you need around 2^511 PDF files"?

      2^256 is not anywhere near half of 2^512.

    15. Re:Missing the point by fnj · · Score: 2

      Good and fast hashes are all you need for any purpose. You can't make a slow hash fast, but it's trivially easy to make a fast hash slow. You just use sequential rounds, enough of them to make it slow.

      SHA-512 is fine for passwords. It's the best there is. You're nuts to use anything else, the US military agrees with me, and it is the default in RHEL6 and other recent security conscious distros. The thing is, nobody with any sense uses a single round for this purpose. They use 5000, or 50,000, or 719,246. When you configure your password security on linux to SHA-512, glibc defaults to 5000 rounds.

    16. Re:Missing the point by Pseudonym · · Score: 1

      Okay, I can only assume that that's a typo

      What I wrote was √N, but Slashdot in its wisdom cut off the square root sign.

      If there are N possible hash values, you should expect 0.5 collisions after hashing sqrt(N) items.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    17. Re:Missing the point by Pseudonym · · Score: 1

      Slashdot screwed up my comment. Please see this correction.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    18. Re:Missing the point by Pseudonym · · Score: 1

      Slashdot screwed up my comment. Please see this correction. 2^256 is the square root of 2^512.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    19. Re:Missing the point by Pseudonym · · Score: 1

      By the way, to answer your question: If there are N hash values and you hash N items, you should expect N(1-2/e) collisions. The proof of this is also left as an exercise.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    20. Re:Missing the point by DrVxD · · Score: 1

      Take a look at this

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    21. Re:Missing the point by DrVxD · · Score: 1

      whether or not you celebrate Christmas, Christmas celebrates you.

      But only in Soviet Russia

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    22. Re:Missing the point by DrVxD · · Score: 1

      I don't actually know off the top of my head how many collisions you should expect if you hashed N items.

      One of the characteristics of a good cryptographic hash is that it has a uniform distribution in the hash space - so really, you have a generalisation of the birthday paradox; the section there on 'Collision Counting' gives the expected number of collisions as
      n - d + d ( ( d -1 ) / d ) ^ n
      (where d is the number of possible hash values, and n is the number of items).

      Since we're hashing N items, with N outcomes, then we can substitute d==N, n==N
      N - N + N ( ( N - 1 ) / N ) ^ N
      and simplify a little:
      N ( ( N - 1 ) / N ) ^ N

      My calculus is embarrassingly rusty, so I did a quick graph of that (and its differential) on Wolfram Alpha, and it looks to be tending to around 36.7% collisions, which I thought was surprisingly high - but then again, the birthday paradox still strikes me as a surprising result...

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
  7. Re:Faster? by Buzer · · Score: 1

    It depends on what hashing function is used or. For passwords, yeah, that kind of function is good. But that's not the only use for hash function. If you need to know if files X and Y in different systems are identical, you want that checksum pretty fast. If you are transmitting data at 10Gbit/s and want to make sure it's going without problems, you are going to need something very fast. But you most likely want to make sure it's still slow enough that it cannot be bruteforced on the wire.

  8. Re:Faster? by Anonymous Coward · · Score: 4, Informative

    Faster hashing is better. The good guys have to pay a cost of X for hashing at b bits. The attacker has to, ideally, pay a cost of 2^b*X (well, a little less due to the Birthday paradox, but it'll still be exponential). Halving X helps the good guys, because hashing is now faster for everyone. It doesn't help the attacker because you should have chosen b to be so big that halving X makes just absolutely no difference because 2^b is so big that it's completely impractical even if X is just the time for a single cycle on a CPU.

    Here's another way of looking at it: trying to impede the attacker by increasing X is the wrong idea because you ALSO have to pay for X. Instead, you want to focus on b because here your cost grows slowly with b while the attacker's cost grows exponentially with b. So smaller X is better, that is, faster hashes are better.

  9. Before "too fast = bad" by Anonymous Coward · · Score: 2, Interesting

    I know I’m already too late for a few, but fast hashing is good. The only thing you want a slow hash for is a PBKDF—and, to be honest, that is and should be an entirely different construct to a hash, although you can make PBKDFs out of cryptographic hashes.

    I’m still not seeing anything that makes this better than Skein, however, which is very fast in software, can perform signature pinned and keyed hashes and rounds (which could be used to make a PBKDF of arbitary strength), and even has a built-in tree hash function, and parallelises well. Skein’s still my favourite SHA-3 finalist by far, and I’m going to be using it in production instead of SHA-3—given FIPS won’t be driving SHA-3 adoption in short order, we won’t get hardware support for some time, and the NSA value hardware speed far more than software speed (as they mostly work with hardware CSMs). Further, I am not entirely convinced the round function underneath the sponge construction is necessarily different enough to survive an attack that would weaken AES.

    Keccak’s way, way too slow given Skein and BLAKE are faster and of comparable security (although BLAKE was the second least favourite of my picks), and I suspect that SHA-3 will face slow adoption: there’s nothing obviously wrong with SHA-2, and absolutely no reason to move from SHA-256 to something significantly slower than SHA-256 when other hashes of comparable security exist which have received a similar amount of cryptographic attention and run faster. Keccak’s a poor sell.

    As for PBKDFs, perhaps we need to hold another contest entirely for them, although it is also getting to the point that by the time we’d finish such a competition, passwords which the average person can remember have too little entropy to survive even brute force on any practical PBKDF you’d want to run (bearing in mind that if it runs too slow it represents a potential point of DoS), and no asymmetric PBKDF can exist that resists offline attacks.

    It might also be worthwhile to hold a contest to define a tweakable block cipher and define more modes based on that, and perhaps that would be an interesting way forward.

    1. Re:Before "too fast = bad" by TechyImmigrant · · Score: 2

      > the NSA value hardware speed far more than software speed

      I don't think this is true for NIST though. Witness the results of the SHA competition. Keccak sucks in hardware compared to Skein. Skein is lovely to implement in hardware. Ultimately the sponge construction is what won it.

      They certainly ignored my advice on hardware implementation.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    2. Re:Before "too fast = bad" by TechyImmigrant · · Score: 3, Insightful

      The issue is not so much how small and fast you can make one instance of the algorithm, but rather, how it scales from small/slow to fast/big. An algorithm like a hash or block cipher gets baked into many silicon contexts, like instructions, or memory datapaths, or offload processors, or IO datapaths. The size/speed/power requirements of these different contexts varies.

      Keccek is harder to divide down into smaller execution chunks in hardware than skein.
      Skein has an add-xor core operation that is repeated width-wise and depth-wise. So you can easily scale it in width, depth and pipeline depth in order to meet the needs of the situation.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  10. Re:Famous last words by Desler · · Score: 1

    Faster is good for plenty of applications of hashes such as checksums on data.

  11. Re:Faster? by sydneyfong · · Score: 2

    Generally you're right.

    However, for passwords specifically, you'll need longer plaintext passwords for this to work. It doesn't matter how many bits your hash function is at, if the average password is only 8 characters of Alphanumeric (which, unfortunately, has entropy much less than 64 bits...). Salts help against things like rainbow attacks, but it won't help if an attacker is trying to crack one specific account that is protected by a 8 character password -- in that case, the speed of the hash function is the only variable in the cracking effort.

    We all know we should have longer plaintext passwords, but the fact is -- our puny brains cannot remember random characters that well.

    And passphrases haven't really caught on, unfortunately.

    --
    Don't quote me on this.
  12. Re:Faster? by Anonymous Coward · · Score: 2, Interesting

    That's why there are dedicated hash functions for password hashing like bcrypt and scrypt that should be used for this. One should never use generic cryptographic hash function for storage of passwords directly. Being slow was never considered as a goal in design of cryptographic hash functions, they are intended to be practical replacements for random oracles. They could, however, be used as a building blocks for PBKDF that might be appropriate for password storage.

  13. The SW speed of BlakeX is moot by TechyImmigrant · · Score: 3, Interesting

    The software speed of the SHA algorithms is somewhat moot in the medium terms because over the medium term, crypto primitives (encryption, hashing, RNGs etc) are moving to hardware and moving to an instruction model instead of a device+device_driver model.

    So the hardware implementations available to software through instructions will be faster than software implementations and have much better security properties in terms of attack surface and side channels. Modern crypto tends to fall to side channels and implementation error before it falls to crypto attacks and hardware is the best place to solve these problems.

    At the recent NIST RBG Workshop http://www.nist.gov/itl/csd/ct/rbg_workshop2012.cfm
    I presented a short talk on where Intel is going. http://csrc.nist.gov/groups/ST/rbg_workshop_2012/johnston.pdf

    Basically, we've started putting standards based crypto primitives in hardware, on the CPU die, presented through the instruction interface (E.G. AES-NI, RdRand, RdSeed) to provide for more secure crypto on PCs. This is our publicly stated intent going forward. So who cares how many cycles it takes when there's a constant time instruction available that is faster?

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    1. Re:The SW speed of BlakeX is moot by TechyImmigrant · · Score: 1

      Yes, but the NIST competition method to define new algorithms that yielded AES and SHA3 seems to have focused more attention on the algorithms, so the odds of them being broken is reduced. SHA-0 and SHA-1 were essentially handed to the world on a plate, with little outside analysis.

      So I have greater confidence in AES and SHA-3 that I did in the older algorithms. Hence me pushing to put such algorithms into CPU silicon.

       

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    2. Re:The SW speed of BlakeX is moot by TechyImmigrant · · Score: 1

      Sorry, I can't say more that I said in the NIST slides.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  14. Re:Fastest Encryption by TechyImmigrant · · Score: 1

    > [Fastest Encryption] Is this: source_byte_to_encrypt XOR some_random_byte.

    No it isn't. The complex part is the generation of 'some_random_byte'. Check out SP800-90A,B & C. Those are not efficient algorithms.

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  15. Re:Faster? by Pseudonym · · Score: 1

    The birthday paradox only lets you find a collision faster, not crack a specific password faster.

    If you have a good b-bit hash function, then the expected cost of "undoing" a specific hash is X*2^(b-1), but the expected cost of finding half a collision (the birthday paradox scenario) is X*2^(b/2).

    --
    sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
  16. Re:cool by Taco+Cowboy · · Score: 1

    According to wikipedia's page on the NISH-sponsored competition - @

    http://en.wikipedia.org/wiki/NIST_hash_function_competition

    - one of the criteria of the elimination of potential candidates was ...

    Security: "We preferred to be conservative about security, and in some cases did not select algorithms with exceptional performance, largely because something about them made us 'nervous,' even though we knew of no clear attack against the full algorithm."

    Would the clause above invalidate Blake2??

    --
    Muchas Gracias, Señor Edward Snowden !
  17. Why encrypt? by PacRim+Jim · · Score: 1

    It occurs to me that, under certain circumstances, encryption might be unnecessary. An eavesdropper expected encrypted text might be skeptical of brazenly published open text. Hiding in plain sight, so to speak.

  18. Re:cool by GameboyRMH · · Score: 1

    Wow how scientific of them |:-|

    --
    "When information is power, privacy is freedom" - Jah-Wren Ryel
  19. Software only? by martijn+hoekstra · · Score: 1

    What about simplicity and performance for hardware implementations? Or is this meant as a software hash? TFA only gives software runtimes.