Schneier: We Don't Need SHA-3
Trailrunner7 writes with this excerpt from Threatpost: "For the last five years, NIST, the government body charged with developing new standards for computer security, among other things, has been searching for a new hash function to replace the aging SHA-2 function. Five years is a long time, but this is the federal government and things move at their own pace in Washington, but NIST soon will be announcing the winner from the five finalists that were chosen last year. Despite the problems that have cropped up with some versions of SHA-2 in the past and the long wait for the new function, there doesn't seem to be much in the way of breathless anticipation for this announcement. So much so, in fact, that Bruce Schneier, a co-author of one of the finalists not only isn't hoping that his entry wins, he's hoping that none of them wins. ... It's not because Schneier doesn't think the finalists are worthy of winning. In fact, he says, they're all good and fast and perfectly capable. The problem is, he doesn't think that the world needs a new hash function standard at all. SHA-512, the stronger version of the SHA-2 function that's been in use for more than a decade, is still holding up fine, Schneier said, which was not what cryptographers anticipated would be the case when the SHA-3 competition was conceived. 'I expect SHA-2 to be still acceptable for the foreseeable future. That's the problem. It's not like AES. Everyone knew that DES was dead — and triple-DES was too slow and clunky — and we needed something new. So when AES appeared, people switched as soon as they could. This will be different,' Schneier said via email."
Faster computation of hash functions is very important, especially to low-power devices. In other words, even if the improvements in cryptographic strength are irrelevant I'd expect the new standard to be adopted quickly.
However, SHA-2 could be broken tomorrow, and this time we won't have a decade's wait while a suitable replacement is designed.
I guess today is a passable day to die.
I think the next hash should be called B455 DR0PP3R 2K12
Disclaimer: I'm not a security expert so don't expect what I'm saying to be accurate.
Dictionary attacks have nothing to do with breaking hashes. If you mean stuff like rainbow tables, that's specific to hashes used to store passwords, which doesn't even need anything > SHA-256, because passwords don't have that much entropy to begin with.
What you need for security are essentially two properties: the entropy in the hash system (how random the values seem to be, in relation to the input), and the collision resistance (how hard it is to generate two inputs that result in the same hash, AND how hard it is to generate an input for a given hash number).
Cryptographic hashes are used for a lot of other purposes, and many of them DO need to be fast and to have very high collision resistance. The most notable may be generating signatures for cryptographic purposes (generally, to verify a message was sent by the entity that claims to have sent it).
How about we link to Schneier's actual blog post? https://www.schneier.com/blog/archives/2012/09/sha-3_will_be_a.html
Wonder what the public key field is for?
As I understood, it has to be slow to be hard to break via dictionary attacks etc. ...
No - you use long, cryptographically random, salts to avoid dictionary attacks. Any hash or cryptographic function that is fast enough to use will be fast enough to attack with a dictionary unless you do this. Of course user education and password rules forbidding short alpha-only words are important too!
Is it really necessary to have a snide remark about supposed government inefficiency there? Can't we bury these ideological attacks that are not really supported by facts or data, add nothing to the point and are in fact grossly misleading?
This is a hard mathematical problem. Ordinary research papers in mathematics often spend a year or more in peer review in order to verify their correctness. If you're building a key component of security infrastructure a couple of years of review is not at all unreasonable.
> Dictionary attacks have nothing to do with breaking hashes.
There's two kinds of hashes you should use - those which are meant to be slow (for password hashes), and those which are meant to be fast (for message signing). SHA is meant to be fast.
There are two types of dictionary attacks. One is used for breaking passwords: you take a dictionary of likely passwords and try them one by one, hoping that the password is in the dictionary. This can be coupled with brute-forcing techniques to try things like adding a number to the end, replacing e with 3, etc. Some crackers will even start brute-forcing combinations not in the dictionary when the dictionary runs out. Now, in the hash world collisions exist: a different input (like a different password) can yield the same hash (and so would allow authentication). Dictionaries don't just apply to passwords, but to any data set one could build into a dictionary to help find collisions.
The other is used to break certain block ciphers in ECB mode: you build a dictionary of what a particular block maps to, since in ECB each block is encrypted entirely independently, so if another block happens to have the same pattern, it gets encrypted to exactly the same ciphertext.
The first is related to hashing, as that is usually the way passwords are hidden in database tables, and so is relevant, though cryptographic hashes are used for many other things, not just passwords, and so there are other attacks against cryptographic hashes. ECB dictionary attacks, on the other hand, are totally irrelevant and out of context in this case and can be discarded.
No, they avoid certain classes of dictionary attack, like rainbow table attacks, where the dictionary has the matching hash precalculated for each entry. Me taking a dictionary, salting and hashing each word, and seeing if it matches is still a dictionary attack.
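A minimal sketch of that distinction, assuming a single SHA-256 over salt plus password (the function names and the tiny word list are hypothetical, not taken from any real system). The salt makes a precomputed table useless, but an attacker who holds the salt can still hash each candidate word:

```python
import hashlib
import os

def salted_hash(salt: bytes, password: str) -> bytes:
    # Single-round salted hash, for illustration only.
    return hashlib.sha256(salt + password.encode("utf-8")).digest()

def dictionary_attack(salt: bytes, target: bytes, words):
    # Hash every candidate word with the leaked salt and compare.
    for word in words:
        if salted_hash(salt, word) == target:
            return word
    return None

salt = os.urandom(16)                      # random per-user salt
stored = salted_hash(salt, "letmein")      # what the database would hold
words = ["password", "123456", "letmein", "qwerty"]  # hypothetical dictionary
recovered = dictionary_attack(salt, stored, words)
```

The work has to be repeated for every distinct salt, which is the whole benefit of salting; it does nothing for a password that is in the word list.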
If you rely on hashing speed to hash passwords, you are doing it wrong. Computers get faster, constantly. It's not speed that matters, it's the number of possible combinations making it exponentially too large to brute force, relative to the time to calculate each hash. Who cares if you can calculate millions of hashes in one second, if you still need to spend longer than the age of the universe to get a reasonable number of inputs to use as a dictionary? It's just simpler to use a plain-text dictionary and perform the hashing element by element. In which case the hashing speed does not matter AT ALL, it's how many attempts the site allows before either locking you out or increasing the time between attempts too much.
As I understand it, that's why you salt the passwords AND use a user-specific string (based on username, email and/or similarly constant data) to introduce more variation so that they can't use generic rainbow tables or even site-specific rainbow tables.
So much work from everyone involved and we just throw it away??
This is a standard for many years in the future. SHA-1 is still used in some current applications and is considered secure and people are still using MD5.
Everyone can just ignore the new standard and researchers can have a decade or two to try to break it before it's needed. Where is the harm?
This is a common misconception. The source of this misconception is the way people have tried to protect weak passwords through the use of hashing. If you take a strong password and hash it with a hash function satisfying all the requirements for a cryptographic hash function, then that password is well protected. That construction doesn't work for weak passwords. If you apply a salt while hashing, you move the threshold for the strength of passwords which can be brute forced. This is quite clearly an improvement over plain hashing. I know of no cryptographer who has disputed that it is a good idea to use salts while hashing passwords.
But there are still some passwords which are too weak to be protected by a salted hash. This has led to some people saying this hash function is insecure, because it is too fast. What they should have been saying was, this password mechanism is insecure, because it is using the wrong kind of primitive. This is an important distinction. Even if the hash function satisfies all the requirements of a cryptographic hash, a salted hash cannot protect a weak password.
When building cryptographic systems you often need to apply different classes of primitives. Common primitives are hash functions, block ciphers, asymmetric encryption, digital signatures. Examples of primitives in each of these four classes are SHA512, AES128, RSA, RSA (yes RSA does fall in two different classes, there are other primitives, which fall in only one of those two classes). If you want to send an encrypted and signed email, you typically use all those four classes of primitives.
To protect semiweak passwords better than through a salted hash you really need a new class of primitive. For lack of better term I'll call that primitive a slow function. Claiming that a hash function is insecure because it is fast would be like claiming AES128 is secure because you can derive the decryption key from the encryption key.
The formal properties I would say a slow function should have are first of all that it is a (mathematical) function mapping bitstrings to bitstrings, and that it requires a certain amount of computing resources to compute the full output from any single input. Properties that I would not require a slow function to have includes collision resistance and fixed size outputs. Those are properties you expect from a hash function, which is a different kind of primitive.
People have tried to squeeze both kinds of properties into a single primitive, which if they succeeded, would be both a cryptographic hash and a slow function. But they haven't always paid attention to the differences in requirements. And often the result has been called a hash function, which confuses people, since it is different from a cryptographic hash.
One nice property of slow functions as I would define them is that given multiple candidate functions, you can just compute all of them and concatenate the results. And you will have another slow function, which is guaranteed to be at least as strong as the strongest of the functions you started with.
Once you have all the primitives you need, you can combine them into a cryptographic system, that achieves some goal. If you want to protect passwords, I think you are going to need both a slow function and a hash function. For the combined system, you actually give a formal proof of the security. The proof of course assumes, that the underlying primitives satisfy the promised criteria. I guess the password protection you would implement given the above primitives would compute the slow function on the password, and then compute a hash of password, salt and output of the slow function.
Additionally you could prove that, regardless of the strength of the slow function, the password is as well protected as it would have been with just a salted hash. That way, by handling those as separate primitives, you can reason about the security assuming the failure of one of the primitives. Such a construction would eliminate the main argument about some of the existing slow functions, which is that they haven't had sufficient review.
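The construction described above (slow function on the password, then a hash of password, salt and slow-function output) could be sketched roughly as follows. The choice of PBKDF2 as the stand-in slow function, the iteration count, and all function names are assumptions made for illustration; the post does not name any of them.

```python
import hashlib
import hmac
import os

def slow_function(password: bytes, salt: bytes) -> bytes:
    # Stand-in slow function (assumption): PBKDF2-HMAC-SHA256 with a
    # hypothetical iteration count.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)

def length_prefixed(*parts: bytes) -> bytes:
    # Prefix each part with its length so the concatenation is unambiguous.
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)

def protect(password: str, salt: bytes) -> bytes:
    pw = password.encode("utf-8")
    slow_out = slow_function(pw, salt)
    # Cryptographic hash over password, salt and slow-function output.
    return hashlib.sha512(length_prefixed(pw, salt, slow_out)).digest()

def verify(password: str, salt: bytes, stored: bytes) -> bool:
    # Constant-time comparison of the recomputed and stored values.
    return hmac.compare_digest(protect(password, salt), stored)

salt = os.urandom(16)
record = protect("correct horse", salt)
```

Even if `slow_function` turned out to return a constant, the final SHA-512 call still covers the password and the salt, so the result would be no weaker than a plain salted hash.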
Do you care about the security of your wireless mouse?
As I was trying to explain in the reply below, the time it takes to calculate the hash is meaningless. Relying on that time as a way to prevent intrusions would be like a bank using a maths puzzle to lock the safe, and then claiming that it takes too long to solve, so they would notice the attempt before it happens. It just doesn't work that way.
You have two strengths in preventing such intrusions: first is the exponential complexity of reversing the hashing process (brute forcing, unless the algorithm is proven broken), and second is the artificial delay used to prevent mass attempts at the password. There's attacks for everything, but if any of those 2 fail, everything fails.
Not strictly true. One reason bcrypt/scrypt/PBKDF2 is recommended over straight salting and hashing is that it is slower to hash, and in the case of bcrypt it is also deliberately designed to make it harder to build dedicated accelerators or use parallel processing to speed it up, therefore slowing down a brute force attack. Yes, time shouldn't be the only factor, but most cryptography has a time element: given enough time one can break the whole bank's password database through brute force, so don't you want to make it as slow as possible to even make attempts (offline as well as online)? If I can break this diplomatic cable, great, but if it takes 70 years and it's already declassified before I break it, does it matter that I could break it given 70 years?
That obviously should have said "Claiming that a hash function is insecure because it is fast would be like claiming AES128 is insecure because you can derive the decryption key from the encryption key."
Put differently. If you use AES when you should have used RSA, you don't blame AES for being insecure. If you use a SHA512 when you should have used a slow function, you shouldn't blame SHA512 for being insecure.
When you reason about the security of cryptographic systems, you usually show how many times a certain primitive must be invoked to break it. And if that number is large (usually meaning 2^128 or more), then you consider the system to be secure. It is not a threat if the primitive itself is fast, because once you have to execute it 2^128 times, it will still take "forever".
But for protection of weak passwords you can't give such guarantee. Those can be broken with a relatively small number of invocations of the primitive. At that point the resources it takes to compute the primitive matters, and adding another requirement to a primitive means it is no longer the same kind of primitive.
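A back-of-the-envelope calculation makes the contrast concrete. The guessing rate and the size of the weak-password list below are hypothetical figures picked only for illustration:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_exhaust(candidates: int, guesses_per_second: float) -> float:
    # Time to try every candidate once at a fixed guessing rate.
    return candidates / guesses_per_second / SECONDS_PER_YEAR

RATE = 1e12  # hypothetical attacker speed: 10^12 primitive calls per second

# Full 2^128 search space versus a list of one million likely passwords.
full_space_years = years_to_exhaust(2**128, RATE)
weak_space_seconds = years_to_exhaust(10**6, RATE) * SECONDS_PER_YEAR
```

Under these assumptions the full space takes on the order of 10^19 years while the weak list falls in about a microsecond. Making the primitive 100 times slower or faster changes neither conclusion for the first case, but it matters a great deal for the second.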
SHA-2 Veriditas, Neato!
Visit CryptoGnome in his home.
The proper name for these "Slow functions" is Key Derivation Function. They've been around a long time and are what OSes use to protect login credentials and what encrypted archive formats like RAR use.
Some examples are crypt (obsolete, vulnerable), PBKDF2 (repeated application of salt-and-hash), bcrypt (repeated rounds of a special extra-slow variant of Blowfish), and scrypt (an attempt to defeat GPU and custom hardware attacks by requiring lots of low-latency RAM).
Single-round salted hash is only a "better than plaintext" hack solution, it's never been the correct way to store passwords.
I just found an SQL injection attack and downloaded the whole password database. I now crack it at my own leisure. Now I can come back any time and use those user names and passwords. Now what is the bet some of those user names and passwords are used somewhere else by some of the users? When salting you need to be very specific: you do not want to use the same salt as another system, and you do want your salts to all be unique to a given user on your system. My suggestion is random data from a PRNG (technically for salting it doesn't need to be cryptographically secure random, though it doesn't hurt). Finally, when salting, don't just append the salt to the password and hash, as it may open up other avenues of collision attacks; instead prepend the password length too.
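A small sketch of why plain concatenation can be ambiguous and how a length prefix removes the ambiguity. The encoding (4-byte big-endian lengths on both fields) is an assumption chosen for the example, not something specified in the post:

```python
import hashlib

def naive(salt: bytes, password: bytes) -> bytes:
    # Plain concatenation: different (salt, password) pairs can collide.
    return hashlib.sha256(salt + password).digest()

def length_prefixed(salt: bytes, password: bytes) -> bytes:
    # Prepend each field's length (4 bytes, big-endian) before hashing.
    data = (len(salt).to_bytes(4, "big") + salt
            + len(password).to_bytes(4, "big") + password)
    return hashlib.sha256(data).digest()

# b"ab" + b"c" and b"a" + b"bc" are the same byte string b"abc".
same_naive = naive(b"ab", b"c") == naive(b"a", b"bc")
same_prefixed = length_prefixed(b"ab", b"c") == length_prefixed(b"a", b"bc")
```

With fixed-length salts the boundary is already unambiguous, so the prefix mostly matters when salts or other fields vary in length.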
It's not at the scale of 70 years. Brute forcing a 128-bit space would take at best millions of years and require that most of the planet mass be converted to energy.
If the passwords are decently salted and the salt is unknown good luck with that. Remember to switch planets when the Sun goes nova.
This sig is intentionally left blank
Hashes like bcrypt are configurable too so the number of rounds is a parameter to the power of two so it can be made more secure / slower if necessary as time progresses. With 2^10 rounds it's approximately 8000x slower to make a hash than SHA1 which server side isn't a big deal but it is for someone running through a dictionary.
It's so bad that attackers would probably only bother to try a subset of common passwords (e.g. top 1000 passwords) before moving onto the next one. Enforcing password quality during signup would probably block these from hitting too.
As the saying goes. Why don't we continue to run candidates in parallel to SHA-2 usage and when there are signs that SHA-2 is about to be compromised or obsolete we'll just switch to whatever candidate was the best afterwards. Naturally if SHA-3 is significantly better in speed, security and compression etc then we have already made SHA-2 obsolete and need not wait until it "breaks". My two pence.
A 'singular oddity' is an event that cannot be explained and only happens when you are alone.
If the algorithm is slow, it doesn't really help prevent it from being cracked. Because the hacker can just put more computing power into it.
Then you also mean the person who is trying to protect the data now has to get a faster computer to keep the load and the application running at the right speed... Thus giving an extra cost to the protectee however the hardware speed increase will negate the disadvantage to the hacker. It is just a losing proposal for the system owner.
However if the algorithm is fast that means we can increase the performance on the box, and give more resources to extra security measures, such as adding a salt (and pepper) to the data to be hashed, or putting in a delay that makes the system sleep if the value is incorrect...
1x 2x 100x slower isn't a big deal. Big O is.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
would be like a bank using a maths puzzle to lock the safe, and then claiming that it takes too long to solve
Um, isn't that *exactly* how encryption works? :p
The point is, the timescales are exponential. A hash that's 100 times faster to compute only knocks 2 orders of magnitude off the time it takes to crack the hash (10^10 universe-lifetimes instead of 10^12, w00t), but it makes it 100 times more usable in the golden path case of computing a hash on an in-core string.
Users will use weak passwords*, web servers will get compromised.
Ideally you would have a separate "password checking server" that did nothing but store and check passwords and was locked down very tight, but most sites can't really justify the cost of that. So on most sites the password database and any related secrets, such as a fixed part of the salt, are just one bug in a PHP webapp away from being revealed to an attacker.
Using a deliberately slow hashing technique will increase the time taken for the hacker to crack passwords in that scenario potentially buying you time to warn your users.
*and if you make rules to try and stop them they will do the minimum needed to comply with those rules.
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
If you rely on hashing speed to hash passwords, you are doing it wrong.
If you rely only on hashing speed to protect your passwords, you're doing it wrong.
The problem with fast hashing is that it facilitates brute force password searches. Salt prevents rainbow table attacks, but targeted brute force attacks against a specific password can be quite feasible given typical user password selection. There are two solutions to this: Make users pick better passwords or find a way to slow down brute force search. The best approach is to do both; do what you can to make users pick good passwords (though there are definite limits to that approach), and use a parameterized hash algorithm that allows you to tune the computational effort.
The common way to slow down the hash is simple enough: iterate it. Then as computers in general get faster you can simply increase the number of iterations. In fact, you can periodically go through your password table and apply a few hundred more iterations to each of your stored password hashes. The goal is to keep password hashing as expensive as you can afford, since whatever your cost is, the cost to an attacker is several orders of magnitude higher (since the attacker has to search the password space). Oh, and it's also a good idea to try to keep attackers from getting hold of your password file. Layered defense.
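The idea of adding iterations to already-stored hashes can be sketched as below, assuming the simplest possible scheme (SHA-256 applied repeatedly to its own output). The names and the round counts are hypothetical:

```python
import hashlib

def iterate(value: bytes, rounds: int) -> bytes:
    # Apply SHA-256 repeatedly to the previous output.
    for _ in range(rounds):
        value = hashlib.sha256(value).digest()
    return value

def initial_hash(salt: bytes, password: str, rounds: int) -> bytes:
    return iterate(salt + password.encode("utf-8"), rounds)

def strengthen(stored: bytes, extra_rounds: int) -> bytes:
    # Add iterations to a stored hash without knowing the password.
    return iterate(stored, extra_rounds)

salt = b"hypothetical-salt"
old = initial_hash(salt, "hunter2", 1000)
upgraded = strengthen(old, 200)   # now equivalent to 1200 rounds
```

This only works for plain iteration, and the current round count has to be stored with each record. Schemes such as PBKDF2 key every round with the password, so their stored outputs cannot be extended this way.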
As I understand it, that's why you salt the passwords AND use a user-specific string (based on username, email and/or similarly constant data)
User-specific strings are good too, as another layer to the defense, but you have to assume that an attacker who gets access to your password file probably gets that data as well.
to introduce more variation so that they can't use generic rainbow tables or even site-specific rainbow tables.
Salt is sufficient to eliminate rainbow table attacks.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
No - you use long, cryptographically random, salts to avoid dictionary attacks.
There are basically two types of salt, fixed salts stored in the server configuration and per-password salts stored in the password database. They defend against different things.
Fixed salts stored in the server configuration defend against someone who has got your password DB but not your server configuration.
per-password salts stored in the password DB defend against precomputed attacks.
Neither provides a defense against someone who has both your password DB and server configuration and is going after an individual password. That is where deliberately slow hash functions come in.
Has anyone ever actually read NIST 800? I just had to review 800-30 and 800-39 yesterday. Hand to god they're designed to put you in a coma. There is not enough Ritalin on the planet for that.
Really, any increase in key-length or change in algorithm ought to be done to save us from the issues that could arise from things like quantum computers, super computer bot nets, or further into the future quantum computer bot nets. I mean we don't have those things yet, but we can kinda see them coming, and ought to be thinking long and hard about how to break that issue permanently.
Hmm, the humour and sarcasm seem to have been lost on you.
They are related, but not exactly the same. The slow function that I was suggesting does not require every bit of its output to be unpredictable. It just requires the output as a whole to not be easily computable. It doesn't matter if it turns out some long subsequence of the slow function output is easily computable. There is also no requirement that the output of the slow function be random looking. It could start with a sequence of 1MB of zeros, as long as it is followed by something that cannot be computed as quickly. There is also no requirement that the slow function isn't reversible.
So a slow function as I suggest it is not guaranteed to be usable as key derivation function. OTOH it may be that any key derivation function would satisfy my suggestion for a slow function. But I am not entirely sure about that either. It boils down to the question about whether a key derivation function is required to be slow?
Can you point to a formal definition which shows the difference between a key derivation function and a cryptographic PRNG?
What would you do if you are required to design an ultra secure password protection based on the above? You have four suggestions, but you might then be faced with the requirement that every stored password must be secure against an attacker who can defeat three of the four candidates. You need something that is secure as long as one of the four is not broken, but you don't know in advance which of the four.
If the formalization I was suggesting was being used, then you could just compute all four and concatenate them. You'd still need a hash function, and composing multiple candidate hash functions into one secure hash function is harder. But you'd just need a cryptographic hash function, which means simpler requirements than the more complicated primitive.
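Computing several candidates and concatenating the outputs might look like this. Only two candidates are shown instead of four, and both are PBKDF2 variants with a hypothetical iteration count, standing in for whatever candidate slow functions are actually under consideration:

```python
import hashlib

def candidate_a(password: bytes, salt: bytes) -> bytes:
    # Hypothetical candidate 1: PBKDF2 over HMAC-SHA256 (32-byte output).
    return hashlib.pbkdf2_hmac("sha256", password, salt, 50_000)

def candidate_b(password: bytes, salt: bytes) -> bytes:
    # Hypothetical candidate 2: PBKDF2 over HMAC-SHA512 (64-byte output).
    return hashlib.pbkdf2_hmac("sha512", password, salt, 50_000)

def combined_slow(password: bytes, salt: bytes) -> bytes:
    # Compute every candidate and concatenate the outputs; producing the
    # full result costs at least as much as the most expensive candidate.
    return candidate_a(password, salt) + candidate_b(password, salt)

out = combined_slow(b"example password", b"example salt")
```

The concatenated output is not meant to be stored directly; in the scheme described earlier in the thread it would still be fed, together with password and salt, into a cryptographic hash.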
You're missing the distinction between an online attack and an offline attack. In an online attack, where the attacker goes to the website and starts typing in passwords, then lockout will do just fine. But when the attacker has stolen the password file, he gets as many guesses as he wants, bounded only by computing power. And in that case, the hashing speed will be a limiting factor in how long it takes him to break the passwords.
FTFY
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
The sun will not go nova. It will turn into a red giant and then a white dwarf.
Conveniently, converting most of the planet's mass into energy serves as an effective substitute for diplomacy in many situations.
The salt is difficult to keep unknown. Every part of the web application which needs access to the salted hashed password also needs access to the salt, so if your security fails and allows access to the salted password, it probably allows access to the salt as well.
Sometimes you get lucky and the attackers get only the salted hashed password, but you cannot design your security around getting lucky.
Finally! A year of moderation! Ready for 2019?
I just found an SQL injection attack and downloaded the whole password database. I now crack it at my own leisure.
Sure, but while the site is exploitable can't you pwn the rest of the site? You probably can pwn the rest of the database.
The solution it seems is to use different passwords for every site (or at least sites that matter). It doesn't even matter if the passwords are short. Once the hacker has enough access to get the passwords they normally have enough access to get the rest of the juicy data, or even change it.
Given the vast numbers of sites with weak security it seems a waste of time to use very long passwords. Just use passwords long enough that they won't brute force it via HTTP (which will probably look like a DoS/DDoS attack).
Perhaps I'm misunderstanding your point, but the idea of the salt is not to keep it secret. The idea is that each user's password is combined with a unique string (the salt) so that if you try to attack the password database with a dictionary attack you have to process each password individually.
What a salt does do, however, is make rainbow tables impractical. It doesn't need to be private, you can store it in plaintext in the same table as the password hashes.
I had a sig once. It was lost in the great storm of '09.
Well, it's certainly exactly how safes work.
Of course, but I'm betting even your educated users do not do that. And yeah, you need about 12 characters before brute force is out against salted MD5, this is why slower algorithms like bcrypt help (blowfish/sha-1/sha256 multiple times with some special stuff thrown in to make it hard to build hardware accelerators for it.)
Very valid point on pw length as it's what I tend to follow. For those sites with a "critical" pw, I tend to use as high an entropy and length as possible. For places like /. and other forums that are not important, I use a lower quality pw as I can and will replace the account if the forum is important enough to me. Otherwise, I post AC if I'm just sticking my nose in the tent to see if there's anything interesting.
Mod me up/Mod me down: I wont frown as I've no crown
No need for the additional overhead of using a rainbow table for this. Just apply a generic brute force dictionary attack without using rainbow tables. It will be much faster, if there is just a single leak.
The rainbow table only helps if it can be precomputed, which you can only do if that salt is leaked before the database. If a site repeatedly has leaks of its password database, and the salt remains unchanged between leaks, then for any subsequent leak the salt is previously known, and a rainbow table can be used. But after the second leak the only users still using the site will be those who don't care about security anyway.
That is correct, and it should change whenever the user updates their password. You can combine the two and have one value which is constant for the site and one that is unique for every user. It is the salt, which is unique for every user, that provides the majority of the security. The other value, which is fixed for all users on the site, is sometimes referred to as pepper. If the pepper value is leaked, it provides no additional security, but you still have the security provided by the unique salt values. But if the pepper value is not leaked, then nobody can even start brute forcing the password hashes or compute a rainbow table, even if the entire password database is leaked. That gives more time to get passwords changed, if you are lucky and only the password database is leaked and not the pepper.
For example if the pepper value is a hardcoded value in the source code for the site, then it doesn't have to be present anywhere in the password database. Some attacks cause a leak of only one of the two, and in those cases you have more time to get passwords changed. If they are leaked simultaneously, you are left with only the security of the salt, which is what you'd have, if you weren't using pepper in the first place.
If either the pepper or the database is leaked, you have to change the pepper value (as soon as you have closed the hole, which allowed the leak in the first place). But until all passwords have been changed, you still have to support the old pepper value. That means to properly use this, you need to have a hardcoded array of pepper values in the application, and store an array index in the user entry in the password database.
The use of such a pepper value makes sense if you believe in defence in depth with many layers of protection. Using the pepper without simultaneously using a per user salt, means you don't know what you are doing. The per user salt is easier to implement than a pepper value stored in the application code itself, and the per user salt also provides more security.
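The hardcoded pepper array with an index stored per user could be sketched like this. The pepper values, the record layout, and the choice of HMAC followed by PBKDF2 are all assumptions made for the example:

```python
import hashlib
import hmac
import os

# Hypothetical hardcoded pepper values; index 0 is retired, the last is current.
PEPPERS = [b"old-pepper-value", b"new-pepper-value"]
CURRENT = len(PEPPERS) - 1

def hash_password(password: str, salt: bytes, pepper_index: int) -> bytes:
    # Key the password with the pepper, then mix in the per-user salt.
    keyed = hmac.new(PEPPERS[pepper_index], password.encode("utf-8"),
                     hashlib.sha256).digest()
    return hashlib.pbkdf2_hmac("sha256", keyed, salt, 100_000)

def new_record(password: str) -> dict:
    # The pepper index is stored with the user entry; the pepper is not.
    salt = os.urandom(16)
    return {"salt": salt, "pepper_index": CURRENT,
            "hash": hash_password(password, salt, CURRENT)}

def check(password: str, record: dict) -> bool:
    expected = hash_password(password, record["salt"], record["pepper_index"])
    return hmac.compare_digest(expected, record["hash"])

def needs_rehash(record: dict) -> bool:
    # True for entries still protected by a retired pepper.
    return record["pepper_index"] != CURRENT
```

On a successful login against an old index, the application can rebuild the record with `new_record`, so the retired pepper can be dropped once no entries refer to it.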
The main point with my proposal to split the hashing and slowness into separate primitives is that each of them has a much smaller set of requirements, and thus a much smaller set of possible vulnerabilities. The specific problem you mention is not considered a major threat, but my proposal still protects against it.
In my suggested model, the slow function is where you could choose to use an iterated hash function. But there is no requirement that the slow function preserves all of the entropy. If the slow function was to lose a bit of entropy, that would not be a major problem. If it loses so much entropy that the output is completely predictable, then it is a real problem. But you'd be able to demonstrate the entropy leak by using the birthday paradox to find an example collision a long time before the lost entropy became a real problem.
The point of my construction is, that you can still argue about the security of the system even if the security of the slow function completely breaks down. The worst case for the slow function would be if the output was constant (if every input results in the same output, you just need to compute it once). But even in that case, you can make a full system, which requires one invocation of the cryptographic hash for every password that is attempted.
Protecting against the entropy loss in iterated hashing is by no means a new idea. What I think is a new idea is that of splitting the slow part and the hashing into separate primitives and providing two proofs for the security of the system, one with a broken slow function and one with an unbroken slow function.
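The entropy loss from iterating a hash is easy to see on a toy scale. The sketch below assumes a deliberately tiny 16-bit "hash" (SHA-256 truncated to two bytes) so the whole input space can be enumerated; it is an illustration of the effect, not a statement about full-width SHA-2:

```python
import hashlib

def tiny_hash(value: bytes) -> bytes:
    # Toy 16-bit hash (assumption): SHA-256 truncated to 2 bytes.
    return hashlib.sha256(value).digest()[:2]

def image_sizes(rounds: int) -> list:
    # Start from all 65536 two-byte values and count how many distinct
    # outputs remain after each round of hashing.
    current = {i.to_bytes(2, "big") for i in range(65536)}
    sizes = [len(current)]
    for _ in range(rounds):
        current = {tiny_hash(v) for v in current}
        sizes.append(len(current))
    return sizes

sizes = image_sizes(5)
```

Each round can only keep or shrink the set of reachable outputs, because colliding inputs merge. With a full-width hash the same shrinkage exists but starts from a space so large that, as the post argues, a collision would be found long before the loss mattered.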
NIST started the SHA-3 competition when SHA-1 was proven weak and no one was sure how long SHA-2 would last.
No one liked the idea of relying solely on the wide-pipe SHA-512 when the underlying building blocks had been proved weak (using SHA-512 is a bit like using triple-DES).
However, it is difficult to predict advances in cryptography, and though SHA-512 is not nearly as weak as we predicted it would be a few years ago, we don't know what new cryptanalysis will show up tomorrow, forcing us to leave the SHA-2 family in a hurry.
So it is very good we have 5 new well studied hash functions. Choosing one now would do little good, because it could prove weaker tomorrow just like SHA-2 could.
If we don't pick a winner now and keep them all on ice, we could easily and quickly pick a replacement from them when we need it.
The salt is known if you have the password file. The point is to use enough salt that rainbow tables are infeasible.
The thing about security is that it is an ongoing fight with those that want to cause mischief (or worse). You always need to do your best. That is why we need SHA-3. Maybe it isn't required right now, but I am fairly sure that at some point we will need a new solution, either because of flaws that have become known in a current system, increases in computing power which mean we need a new solution, or maybe something much worse. I would rather spend time coming up with a new solution now, ironing out any flaws or issues in an open way, than have to knock something up quickly to get out of a tricky situation. So we should all support this rather than sticking our heads in the sand and thinking everything will be all right.
Or you drop support for the old pepper and mass-expire all the passwords and require users to use a password reset mechanism (equivalent to the "lost password" functionality you should have) after the breach before they can log in.
This is somewhat less convenient to users, but simpler and -- assuming the security of your password reset mechanism -- more secure than continuing to support passwords with the old pepper until they are changed.
Salt also protects against rainbow tables in itself. If I add 4 bytes of salt, your rainbow table has to become 4 billion times larger to have the same effectiveness as it had when there was no salt (assuming the 4 bytes can hold any value).
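The "4 billion" figure is just the number of values 4 bytes can hold. A small sketch, assuming Python's hashlib and os modules; salted_hash and new_record are hypothetical names, and the single SHA-256 call is only there to show the effect of the salt, not as a recommended password hash.

```python
import hashlib
import os
from typing import Tuple

SALT_BYTES = 4
# Number of distinct salts, i.e. the factor by which a precomputed table grows.
SALT_VALUES = 256 ** SALT_BYTES  # 2**32 = 4294967296


def salted_hash(password: bytes, salt: bytes) -> bytes:
    """Hash the salt together with the password (illustrative only)."""
    return hashlib.sha256(salt + password).digest()


def new_record(password: bytes) -> Tuple[bytes, bytes]:
    """Pick a fresh random salt and return (salt, hash) for storage."""
    salt = os.urandom(SALT_BYTES)
    return salt, salted_hash(password, salt)
```

The salt is stored next to the hash, so it is not secret; it only forces a separate table (or a separate brute-force run) per salt value.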
If you keep a common extra salt somewhere in your server configuration, you can get lucky that the adversary only gets a copy of the password database and the per-password salts. That way the adversary has a hard time breaking even the easy passwords, and the post I replied to is right: "If the passwords are decently salted and the salt is unknown good luck with that."
However, that protection relies on luck. The salt is rarely unknown.
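A sketch of the "common extra salt in the server configuration" idea combined with a per-password salt, assuming Python's hashlib and hmac modules. The function name hash_password, the iteration count and the way the server-wide secret is mixed in (an HMAC before PBKDF2) are assumptions for illustration, not something the posts specify.

```python
import hashlib
import hmac


def hash_password(password: bytes, salt: bytes, pepper: bytes,
                  iterations: int = 100_000) -> bytes:
    """Mix a server-wide secret (pepper) with a per-user salt.

    The pepper would live in server configuration, not in the password
    database; the salt is stored next to each hash as usual.
    """
    # Key the password with the pepper first...
    peppered = hmac.new(pepper, password, hashlib.sha256).digest()
    # ...then stretch it with the per-user salt.
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, iterations)
```

If only the database leaks, the attacker is missing the pepper; if the configuration leaks too, the per-user salt and the iteration count are what is left, which is why the pepper cannot replace them.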
Finally! A year of moderation! Ready for 2019?
I know how salts work. I was educating AIXtreme, who apparently believes that they can be kept hidden in general.
I disagree. You don't wait to build a fire escape until the building is on fire. Similarly, we need a good alternative hash algorithm now, not when disaster strikes.
I believe that, in general, we should always have two widely-implemented crypto algorithms for any important purpose. That way, if one breaks, everyone can just switch their configuration to the other one. If you only have one algorithm... you have nothing to switch to. It can take a very long time to deploy things "everywhere", and it takes far longer to get agreement on what the alternatives should be. Doing it in a calm, careful way is far more likely to produce good results.
The history of cryptography has not been kind, in the sense that many algorithms that were once considered secure have been found not to be. Always having two algorithms seems prudent, given that history. And yes, it's possible that a future break will break both common algorithms. But if the algorithms are intentionally chosen to use different approaches, that is much less likely.
Today, symmetric key encryption is widely implemented in AES. But lots of people still implement other algorithms, such as 3DES. 3DES is really slow, but there's no known MAJOR break in it, so in a pinch people could switch to it. There are other encryption algorithms obviously; the important point is that all sending and receiving parties have to implement the same algorithms for a given message BEFORE they can be used.
Similarly, we have known concerns about SHA-2, SHA-256, and SHA-512. Maybe there will never be a problem. So what? Build the fire escape NOW, thank you.
- David A. Wheeler (see my Secure Programming HOWTO)
If you rely on hashing speed to hash passwords, you are doing it wrong. Computers get faster, constantly. It's not speed that matters, it's the number of possible combinations making it exponentially too large to brute force, relative to the time to calculate each hash.
It's my understanding that speed does matter, which is why if you use a fast algorithm like SHA then you're advised to run it many times because you want to slow down any adversaries by several orders of magnitude. http://en.wikipedia.org/wiki/Bcrypt is an example of a KDF that is designed to be slow.
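As an illustration of "run it many times", here is a minimal sketch using PBKDF2 from Python's standard hashlib, which iterates an HMAC a configurable number of times. PBKDF2 is swapped in here because it ships with Python; it is not bcrypt, and the function name stretch and the default iteration count are assumptions.

```python
import hashlib


def stretch(password: bytes, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key by iterating HMAC-SHA256 `iterations` times.

    Each extra iteration costs the legitimate server a little and costs an
    attacker the same amount again for every single guess.
    """
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
```

Raising the iteration count by a factor of 1000 slows every guess by roughly three orders of magnitude, which is the slowdown the post is talking about.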
Schneier does not see difference between SHA2 and SHA3. He decodes both on the fly.
It doesn't need to be private
However, it should be sufficiently complex. A salt that adds a few numbers to the beginning and/or end of the password string does little to no good. A salt generated by a hash of a random value is, however, very effective.
"If a nation expects to be ignorant and free in a state of civilization, it expects what never was and never will be."
That's one way to use salt. Another way is to keep the salt secret. A secret salt for example, can be used to validate that a value you've handed to someone else hasn't changed.
Let's use this example...
I send you a session ID, uniquely identifying you. This session ID is tied to your username, and is involved in access control. If I simply send you the ID and trust the ID you return, you could easily change it, and possibly hijack someone else's session.
If I send you the session ID, and a salted hash of the ID (but I don't send you the salt), I can validate that you haven't changed your session ID, by requiring that you return both the session ID and the hash. I'll simply re-hash the ID with the salt, and confirm that it matches the hash you send me.
This can be used as a form of input validation for pretty much anything.
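A minimal sketch of the session ID check described above, assuming Python's standard hmac module. It uses HMAC with a server-side key rather than a bare hash of salt plus ID, which is a swapped-in technique; SERVER_KEY, issue and verify are hypothetical names.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Hypothetical server-side secret; it is never sent to the client.
SERVER_KEY = os.urandom(32)


def issue(session_id: str, key: bytes = SERVER_KEY) -> Tuple[str, str]:
    """Return the session ID together with its authentication tag."""
    tag = hmac.new(key, session_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return session_id, tag


def verify(session_id: str, tag: str, key: bytes = SERVER_KEY) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(key, session_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)
```

The client has to return both values; changing the ID without knowing the key makes verify return False.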
Agreed, salted hashes are very valuable even when the salt is available. Salted hashes break rainbow tables, and make it difficult to identify users with the same password.
To be fair, pretty much all cryptography has a time/memory element. This element is the main limitation on brute-force attacks.
The point of cryptography is to make it more time consuming/expensive to brute force the key-space than to try to guess the contents of the hash. The difficulty of breaking modern cryptography is typically described in terms of astronomic scales (to brute force this cipher, you'd need a bit of memory for every atom in the solar system).
Attacks against cryptography usually involve finding "short cuts," which reduce the time to attack a given cipher. The birthday attack is one very well known approach that most (all?) hash algorithms are vulnerable to.
The only perfect unbreakable encryption I'm aware of is the one time pad, and that only works if you observe proper key management.
Now as the other guy said not a security expert or crypto guy so maybe I'm missing something, but with the ability to rent time on these huge monsters like the Amazon cloud and those big Nvidia Tesla boxes won't generating those inputs become a lot more doable?
Maybe somebody here can explain what is meant by "longer than the length of the universe" and more specifically what kind of FLOPS they were looking at when they wrote that? After all just 20 years ago we were using computers below 100MHz and this 6 core AMD I built for cheap would have been a million dollar supercomputer. If we continue to keep cranking like this, won't the ability to distribute the load across so many massive number crunchers for relatively cheap make breaking these a whole lot more doable?
Again if I'm missing something my bad, the deep level math that they use in heavy crypto is honestly beyond me, but I do know cranking through math is what CPUs do best and if Amazon cloud is any indication soon you'll be able to rent something the size and power of Blue Gene for less than a really nice gaming PC which could put it into black hat useful territory.
ACs don't waste your time replying, your posts are never seen by me.
Properties that I would not require a slow function to have includes collision resistance
Unless you are very careful about how you define slowness, I think collision resistance (or something like it) might actually be important. For example, suppose 90% of all inputs result in the same output but to determine whether a particular value hashes to that common value might still require a lot of computation. Then if I want to crack a leaked password table, I compute a rainbow table assuming the slow function is just the constant function that always produces that common value. It is an invalid assumption, but statistically it will break a number of passwords.
In any case, your ideas are intriguing to me and I look forward to a peer-reviewed paper with full mathematical proofs to see whether they actually work.
It's not even just distributed computing. Some commodity hardware, like AMD GPUs, can compute current (fast) hashes at a ludicrous speed (billions per second, and no, that's not a typo, although memory throughput tends to limit the effective rate to hundreds of millions). Dedicated hardware, either custom-fabricated or using FPGAs, can improve on even that order of magnitude... and that's today's tech. As you say, hardware just keeps getting faster and faster, plus of course there's the distributed ("cloud") aspect.
For an example of what dedicated hardware can do, there's now a commercial service that can brute-force any DES key (56 bits, 7*10^16 possible values; 10 bits is just over 3 orders of magnitude) in a day or so (under two days for worst-case). Of course, as the summary points out, 3DES is considered inadequate these days, and that's 7*10^16 times as hard to crack as basic DES (112 bits, 5*10^33).
Now, even the shortest SHA-2 digest is four times the length of a DES key, which means about 3*10^67 possible values. Even assuming a very fast SHA-2 implementation (compute a hash in less than 1/70 the time it takes to do a block of DES), computing every possible SHA-224 value would take about 10^63 years. Suppose you got a *really* big cluster/cloud/botnet/whatever, like a billion machines. Then, with modern state-of-the-art hardware, it would take 1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 (10^54) years (give or take a bit). Heat death of the universe indeed! But wait, what about Moore's Law? Well, after the first twenty years, you could knock three orders of magnitude off that. Still too long. After three hundred years, you can take 45 orders of magnitude off; at that point it'll only take a billion years using a billion machines! The solar system might even still exist in something resembling its current form by the time you finished, then!
So, if Moore's Law (as it relates to computing in general, not strictly as stated) continues for three times as long as it has so far, somewhere around the start of the 25th century CE "you" (or rather, your great-great-great-great-great... grandchildren) should be able to brute-force the shortest digest of SHA2 in a year or so with reasonably-priced hardware. That's well within "the length of the universe" (most likely) but still quite impractical.
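The keyspace sizes and the Moore's Law factors above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the two-year doubling period is an assumption, the names are hypothetical, and it does not try to reproduce the time estimates, which depend on hardware assumptions.

```python
import math

# Number of possible values for each key or digest length.
DES_KEYS = 2 ** 56        # about 7.2 * 10^16
TDES_KEYS = 2 ** 112      # about 5.2 * 10^33
SHA224_VALUES = 2 ** 224  # about 2.7 * 10^67


def orders_of_magnitude(n: int) -> float:
    """Base-10 logarithm, i.e. how many decimal orders of magnitude n spans."""
    return math.log10(n)


def moore_speedup(years: float, doubling_period_years: float = 2.0) -> float:
    """Speed-up factor if performance doubles every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)
```

Twenty years gives ten doublings (a factor of 1024, about three orders of magnitude), and three hundred years gives 150 doublings, a little over 45 orders of magnitude.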
People just do not comprehend exponential values; they're too big for our brains to really understand. Computers are really, really fast (relative to the numbers we commonly use), with prefixes meaning "billion" or even "trillion" thrown around these days. Great... but a trillion is 10^12. A trillion operations per second (1 TFLOPS) sounds fast today, but breaking modern crypto (say, AES) via brute-force requires so many operations that if every single atom on the Earth were a 1TF computer, you still wouldn't manage it once before the Earth was swallowed by the expanding sun. Crazy, huh?
BTW, apologies if I misplaced a few orders of magnitude here or there; it could happen. My point remains, though.
There's no place I could be, since I've found Serenity...
That is a valid point. And that is something that the formal definition would need to take into account. But to address that point it is sufficient that the probability of two inputs producing the same output is small, it does not need to be negligible.
For example if the probability that two random inputs produce the same output is 10^-10, then it is impractical to rely on collisions if you want to break passwords. It would be much more efficient to simply compute the slow function than to test enough passwords to find the one where the 10^-10 chance of producing the right output comes out in your favour. So given today's computational power a 10^-10 probability is small enough. On the other hand a hash function with 10^-10 probability of collision between any random pair of inputs is absolutely not collision resistant. You only need to evaluate such a hash function on roughly 10^5 inputs to find a collision. Even if each input took five seconds to hash, it would only take a week to hash enough inputs to find a collision.
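The numbers in this example follow from the birthday bound. A small sketch of the arithmetic, assuming Python's math module; the function names are hypothetical and the square-root rule is the usual rough approximation, not an exact threshold.

```python
import math

SECONDS_PER_DAY = 86_400


def inputs_for_collision(pair_collision_prob: float) -> float:
    """Rough number of inputs needed before some pair collides (birthday bound)."""
    return math.sqrt(1.0 / pair_collision_prob)


def guesses_for_specific_match(pair_collision_prob: float) -> float:
    """Expected number of guesses to hit one specific target output."""
    return 1.0 / pair_collision_prob


def days_to_find_collision(pair_collision_prob: float,
                           seconds_per_eval: float) -> float:
    """Time to evaluate enough inputs for a collision, in days."""
    return inputs_for_collision(pair_collision_prob) * seconds_per_eval / SECONDS_PER_DAY
```

For a pair probability of 10^-10, finding any collision needs on the order of 10^5 evaluations (just under six days at five seconds each), while hitting one particular stored output by collision would need on the order of 10^10 guesses.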
So, the requirement that would be needed would be much weaker than collision resistance. Moreover the probability of two inputs producing the same output is something that has been considered for non-cryptographic hash functions in the past. And there are even constructions with formal proofs of such probabilities. And for those constructions I am not talking about something that relies on assumptions about the security of some cryptographic primitives. For example if you want to produce message-authentication-codes using a shared secret, it can be done provably secure. And a large part of the construction comes from hash functions with a known probability of collisions.
That little detour was just to point out some of the work in that area, which could be relied upon. Now getting back to the slow functions, I should elaborate on something I said earlier.
The part about not requiring fixed size outputs is a bit more important than it may seem from what I wrote. If you look at input and output sizes of hash functions and PRNGs you'll find a difference. Whether you are designing a cryptographic hash function or a PRNG, you want the output to be indistinguishable from random. The most important difference between those kinds of primitives is that you use a hash function when you want the output to be small and a PRNG when you want the output to be larger than the input.
A hash function always produces outputs of the same size, which in most applications is smaller than the input, hence collisions are guaranteed to exist, but may be hard to find. A PRNG always produces an output that is larger than the input.
The slow functions I suggest do not require the output to be nearly as random looking as that of a hash function or a PRNG. They also don't have to produce outputs of some specific size, the output can be smaller or larger than the input. But in order to satisfy some of the other requirements it is a good idea to produce a large output. It is easy to avoid collisions if the output is larger than the input. If you design a slow function from the start to satisfy the definition I suggest, it may as well append every single intermediate calculation along the way to the output. As such a slow function is being evaluated on a tiny password, the output may very well be multiple MB in size.
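A sketch of a slow function that appends every intermediate value to its output, as described above, assuming Python's hashlib. The name transcript_slow, the use of SHA-256 as the step function and the round count are assumptions for illustration; a real design would pick the step so that the work cannot be shortcut.

```python
import hashlib


def transcript_slow(password: bytes, salt: bytes, rounds: int = 1000) -> bytes:
    """Hypothetical slow function whose output is the whole computation trace.

    Every intermediate state is appended, so the output is rounds * 32 bytes
    long, far larger than the password that went in.
    """
    state = hashlib.sha256(salt + password).digest()
    trace = []
    for _ in range(rounds):
        state = hashlib.sha256(state).digest()
        trace.append(state)
    return b"".join(trace)


def final_digest(password: bytes, salt: bytes, rounds: int = 1000) -> bytes:
    """Compress the large slow output with an ordinary cryptographic hash."""
    return hashlib.sha512(transcript_slow(password, salt, rounds)).digest()
```

With 32,768 rounds the trace is already 1 MB, which matches the "multiple MB" scale mentioned in the post; only the final digest would be stored.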
That is another reason why I would imagine the slow function being combined with a cryptographic hash. Any design involving a slow function is almost certainly going to pass the output through a cryptographic hash with the usual collision resistance requirement.
You are assuming that the slow function is used in an insecure high level design. The way cryptographic systems are designed is to take primitives where you make assumptions about the security properties of the primitives, then you put those primitives together in a high level construction, where you prove that it is secure. The proof relies on the security properties of the low level primitives.
When done this way, you cannot break the high level system. You can only break the primitives. If you do break one of the primitives, the high level system would become insecure until you replace the broken primitive with another primitive of the same class, but not broken yet. For example many constructions use cryptographic hash functions. If you have such a system using MD5, then it is insecure. But if you were to replace MD5 with SHA512, then it would be secure again.
At no point did I say the slow function was supposed to be difficult to invert. If you design a high level protocol with the assumption that the slow function is difficult to invert, you are at fault for designing the high level protocol incorrectly. This is just as bad as assuming that a cryptographic hash function is slow to compute, which is not one of the accepted assumptions about a cryptographic hash function.
I did point out that I expect the slow function to be combined with a cryptographic hash function. In fact the only sensible thing you can do with the output of a slow function as I described it is to pass it through a cryptographic hash function (perhaps combining with other data, which could be as part of an HMAC construction). The point is that each of the primitives serves a well defined purpose, and you can reason about them independently.
The reference to site-specific rainbow tables implies the same salt was used for all passwords.
That's not salt, that's just a modification of the hash algorithm -- basically a tagged hash. Salt is defined as a per-entry random value.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
The scheme you describe is a Message Authentication Code, not a salted hash. If you use a salted hash when you actually need a MAC, you're potentially compromising your system's security.
Range Voting: preference intensity matters
A perfectly random one-time pad the length of the message, combined bit for bit with the message, is the only provably secure algorithm. Just don't ever use the same key twice, and find some secure way to do key management (trusted sneakernet?). But with most key management systems for such cryptography you might as well just transport the message instead of the key, as the message length and the key length are the same.
> it is very likely they also know how you salt and/or create the user-specific string. so in that case, trying to find the password by a user still becomes trying all possible password until you find one that matches.
False, if the site is using double passwords.
If you passwordHash2( userId + passwordHash1( plaintext )) good luck even trying to "crack" that password.
Functions passwordHash1 and passwordHash2 could be the same one-way-hash or passwordHash2 could be the "super" strong one-way-hash. As long as both are sufficiently "strong enough" you are fucked.
I think the proper definition would have to include: "An adversary who has spent x% of the resources required to compute the output has at most x% probability of guessing the correct output." In reality the actual probability as a function of the time spent isn't going to grow linearly, but more likely as a convex function. The definition just requires a linear upper bound, which in some cases could be much higher than the actual probability.
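One way to write that condition down, as a sketch only: the symbols are assumptions, with f the slow function, T the resources an honest evaluation of f needs, and A_t any adversary limited to t resources.

```latex
\Pr\bigl[\,A_t(x) = f(x)\,\bigr] \;\le\; \frac{t}{T}
\qquad \text{for all } 0 \le t \le T .
```

The true success probability as a function of t may stay well below this line for most of the range; the requirement is only that it never rises above it.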