MD5crypt Password Scrambler Is No Longer Considered Safe
As reported here recently, millions of LinkedIn password hashes have been leaked online. An anonymous reader writes "Now, Poul-Henning Kamp, a developer known for work on various projects and the author of the md5crypt password scrambler, asks everybody to migrate to a stronger password scrambler without undue delay. From the blog post: 'New research has shown that it can be run at a rate close to 1 million checks per second on COTS GPU hardware, which means that it is as prone to brute-force attacks as the DES based UNIX crypt was back in 1995: Any 8 character password can be found in a couple of days. The default algorithm for storing password hashes in /etc/shadow is MD5. RHEL / CentOS / FreeBSD users can migrate to SHA-512 hashing algorithms.'" Reader Curseyoukhan was one of several to also point out that dating site eHarmony got the same treatment as LinkedIn. Update: 06/07 20:13 GMT by T : An anonymous reader adds a snippet from Help Net Security, too: "Last.fm has piped up to warn about a leak of their own users' passwords. Users who have logged in to the site were greeted today by a warning asking them to change their password while the site investigates a security problem. Following the offered link to learn more, they landed on another page with another warning."
rot13 isn't safe either.
Good info about storing passwords properly: http://www.f-secure.com/weblog/archives/00002095.html
"... RHEL / CentOS / FreeBSD user can migrate to SHA-512 hashing algorithms."
That's fine and dandy and all, but what about King Shit Debian? Pretty much all of my systems run something Debian-based in one form or another. Is this a simple change for Debian users too?
It astounds me that LinkedIn and eHarmony used unsalted password hashes. That's much worse than using md5 (and, yes, you shouldn't use md5, but, still, first things first).
From the LinkedIn press release:
The passwords are stored as unsalted SHA-1 hashes,
Come on, guys, get up to at least 1978 in your security policy.
Looks like it's time to change my password to "password1".
JADBP
There are 95 ASCII characters, which makes 95**8 = 6,634,204,312,890,625 possible 8 character passwords. At one million checks per second a brute force attack will take 6,634,204,312 seconds (210 years).
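The arithmetic above is easy to check. A minimal sketch, assuming a full search of every 8-character password over the 95 printable ASCII characters (the function name is made up for illustration):

```python
# Worst-case time to try every password of a given length.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def exhaustive_search_seconds(alphabet_size: int, length: int,
                              guesses_per_second: float) -> float:
    """Seconds needed to try every password of exactly `length` characters."""
    return alphabet_size ** length / guesses_per_second

seconds = exhaustive_search_seconds(95, 8, 1_000_000)
print(f"{95 ** 8:,} candidates")
print(f"{seconds:,.0f} seconds, about {seconds / SECONDS_PER_YEAR:.0f} years")
```

On average a match turns up after about half the keyspace, and real attackers try dictionary words first, so this is an upper bound for truly random passwords only.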
There's a fad going around right now to use ridiculously slow password hashing algorithms on the web, which the poster apparently has bought into:
If you do this you're opening your site up to an easy DoS attack - a few 10s of login requests per second will slow your server to a crawl. The place where slow hashing algorithms ought to be used is exactly the opposite of where they're used today: encryption of local files, where the user actually has to remember the password, unlike web passwords where you can just use 32 random characters and let your local browser remember it (preferably with the browser's password file encrypted with a slow hash).
So, if someone steals your database full of hashed passwords, you just call them up and ask them nicely to slow down their brute force?
If you get your password wrong, you can't try again for 1 second. Every failure doubles the time required to try again.
Why doesn't everyone do that?
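For reference, a rough sketch of the doubling delay described above. The class and method names are hypothetical, and state is kept in memory for a single process. It only slows guessing through the login form, not guessing against stolen hashes.

```python
import time

class LoginThrottle:
    """Refuse login attempts for a delay that doubles after each failure."""

    def __init__(self, base_delay=1.0, clock=time.monotonic):
        self.base_delay = base_delay
        self.clock = clock            # injectable so the logic can be tested
        self._failures = {}           # account -> consecutive failure count
        self._blocked_until = {}      # account -> time before which we refuse

    def allowed(self, account):
        """True if the account may attempt a login right now."""
        return self.clock() >= self._blocked_until.get(account, 0.0)

    def record_failure(self, account):
        """Register a wrong password; return the delay now imposed in seconds."""
        count = self._failures.get(account, 0) + 1
        self._failures[account] = count
        delay = self.base_delay * 2 ** (count - 1)   # 1s, 2s, 4s, 8s, ...
        self._blocked_until[account] = self.clock() + delay
        return delay

    def record_success(self, account):
        """A correct login clears the failure history for the account."""
        self._failures.pop(account, None)
        self._blocked_until.pop(account, None)
```

A real site would need shared storage for these counters and some cap on the delay, otherwise anyone can lock a victim out by failing on purpose.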
It doesn't help if your attacker has got hold of the list of hashes.
1. Steal hashes
2. Brute-force on your own system/cloud/botnet/whatever
3. Use password
The problem is if the hashed password database is recovered (as in LinkedIn). Then you can run hashes as fast as you want to.
For instance, the SHA-1 hash of "password1" is e38ad214943daad1d64c102faec29de4afe9da3d -- you cannot reverse "e38ad214943daad1d64c102faec29de4afe9da3d" to get "password1", but you can guess things until you get "e38ad214943daad1d64c102faec29de4afe9da3d" and then you know that my password is "password1".
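The guessing loop described above fits in a few lines. A sketch, assuming an unsalted SHA-1 hash and a tiny made-up wordlist; here the "leaked" hash is computed locally rather than taken from any real dump:

```python
import hashlib

def guess_sha1(target_hex, candidates):
    """Return the first candidate whose unsalted SHA-1 matches, else None."""
    for candidate in candidates:
        digest = hashlib.sha1(candidate.encode("utf-8")).hexdigest()
        if digest == target_hex:
            return candidate
    return None

# Hypothetical wordlist; real attacks use lists with millions of entries.
wordlist = ["123456", "letmein", "password", "password1"]
leaked = hashlib.sha1(b"password1").hexdigest()
print(guess_sha1(leaked, wordlist))
```

Nothing is reversed here. The hash is simply recomputed for each guess, which is why the cost per guess matters so much.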
Password selection depends on the place you're using the password.
For most websites, enter something like abc321, hit reset password and they kindly reset the password to something and email me the new relatively good password.
It doesn't need that much security, so those are stored in my email.
For places that need better passwords, $ md5sum - lot of random text pounded on the keyboard and result is something like 24a53bc05c6f216e340aa8d5dc08b605
That checksum becomes the password.
For places where I actually have to enter the password without copypaste, something generated like xkcd's battery horse staple correct.
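A Python equivalent of the md5sum trick above, next to an alternative that skips the keyboard pounding. Both function names are made up for this sketch; the second uses the standard `secrets` module instead of the poster's method.

```python
import hashlib
import secrets

def password_from_mashing(mashed_text: str) -> str:
    """Like running md5sum over typed text: 32 hex characters from the input."""
    return hashlib.md5(mashed_text.encode("utf-8")).hexdigest()

def password_from_csprng(num_bytes: int = 16) -> str:
    """Alternative: 32 hex characters straight from the OS random source."""
    return secrets.token_hex(num_bytes)

print(password_from_mashing("lkjasd;flkj 0982345 asdf,mnzxcv"))
print(password_from_csprng())
```

The first result is only as unpredictable as the typing that went into it, while the second does not depend on how random the keyboard mashing really was.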
There are no atheists when recovering from tape backup.
SHA-1 is not sufficient, but the summary refers to SHA-512, which is in the SHA-2 set. Now whether SHA-2 is sufficient, or if developers should be migrating to something like bcrypt or SRP, is a bigger conversation.
In any case, while I would never tell someone to use MD5, the lack of a salt is much more egregious and made the leak much worse than it had to be.
Because whoever downloaded the database of hashes will probably ignore your request that they only check one password per second.
If I have been able to see further than others, it is because I bought a pair of binoculars.
Indeed.
The effort to use a more secure hash is generally trivial, but there's still going to be a lot of people who either know and don't, or don't know.
For the first category, nothing you can do about it. Same people running wep on their wifi. They either don't see anyone ever attacking them, are tied in due to old systems, or don't care.
For the second category, stuff like this may help. I think at this point most people know md5 isn't as secure as once considered, but I don't think people realize just how insecure it is becoming. In people's minds it's still in the "theoretically if someone was really dedicated they could break it" stage, whereas it's actually entering the "feasible to do it on large scale" stage. Breaking that perception might speed things along.
Hashes are not cryptos.
608b2d50a6521a27c12626cedfea0fc3
Whether MD5 is "secure" or not is irrelevant.
Machines that are accessed by users should not be the same servers storing the account security data. One of the key benefits of domain authentication provided by Kerberos and its relatives is that the authentication data is isolated on a server that is supposed to be doing nothing but authentication and authorization.
That makes it damn hard to break into the security server to steal the password lists in the first place, regardless of what algorithms are used to hash the passwords. The problem is a poorly designed system, not a poorly equipped algorithm.
I do not fail; I succeed at finding out what does not work.
Amusing that people rag on about MD5Crypt being weak, when...
It's still much stronger than the unsalted simple hash being used by sites like linkedin.
Both of which are still massively stronger than the unsalted MD4 based algorithm used by windows (which is not only fast to crack, but can also be used as-is without needing to crack anyway), on which virtually all companies in the world are currently reliant.
Solaris still defaults to DES, although it does support MD5/Blowfish if you explicitly enable them.
Incidentally, until a breach like this occurs you have no idea what algorithm a website uses to store your passwords... There are still many sites out there which store them in plain text.
http://spamdecoy.net - free throwaway anonymous email - avoid spam!
You have to distinguish two cases:
a) Collisions of hashes -- two documents have the same hash, and you can alter a document, but it will still have the same hash.
b) The hashing algorithm is insecure (not one-directional) for passwords, i.e. you can reconstruct the original password.
If the algorithm is susceptible to a), as were the attacks you mention, this does not mean anything for the password security! You don't want to create an alternate password that has the same hash as the password you already have. Additionally, you have length limitations with passwords you do not have for collisions.
Of course, susceptibility of hash algorithms to a) and b) is weakly correlated, but just because people understand the algorithm better.
Specifically, what are the drawbacks of storing md5 hashed password? Except for rainbow tables that can be produced for any algorithm and are evaded by salts.
I wouldn't choose MD5 for designing a new system, but I think understanding the difference is important. This has some similarity to using ridiculous key lengths for public-key encryption.
The article is arguing that MD5 and SHA1 are just too fast, so rainbow tables can be computed quickly once the attacker has the salt, and that algorithms requiring more computation should be preferred. Should PBKDF2 thus be chosen for hashing documents? No, because a) and b) are different problems with different requirements.
NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
First of all, WTF is a "password scrambler"? If you feel the need to dumb down the phrase "hash algorithm", you're probably submitting to the wrong site.
I LOLed at this article[1] on ZDNet this morning for its sensationalist, lowest-common-denominator "OMG computer hackery stuff" reporting, with its implied link between MD5's weakness (which has been known for years) and the LinkedIn breach (even though they use SHA1), and its ridiculous accompanying screen cap (running user-space tools while logged in as root, which no security-minded user would ever do, but hey "root@" at a shell prompt with lots of hackery output looks l33t).
And now here's basically the same thing on Slashdot. Yawn...
[1] http://www.zdnet.com/blog/security/md5-password-scrambler-no-longer-safe/12317
If only there were a website where they could connect with other security professionals, exchange ideas and maybe even find people to hire....
If telephones are outlawed, then only outlaws will have telephones.
"The default algorithm for storing password hashes in /etc/shadow is MD5. RHEL / CentOS / FreeBSD user can migrate to SHA-512 hashing algorithms."
FreeBSD has long (like, 10+ years) had support for Blowfish password hashes. Blowfish was never itself an AES candidate (its successor, Twofish, was a finalist), but the Blowfish-based password hash is quite strong. Enabling it only requires editing /etc/login.conf and afterwards updating any pre-existing passwords.
/dev/random
Algorithms designed to burn CPU, like PBKDF2 recommended in the article, are great if used correctly.
The important thing to remember is that you want to make the client burn CPU, NOT the server. If you let the client trivially initiate 10 http requests that cause a server's CPU to peg for 1 second each, you've created a nice DoS vector.
Are there any existing Javascript crypto libraries that safely offload this work to the client?
As the summary notes, 8 character passwords can be cracked pretty quickly. 15 Characters with the crappy password rules we've enforced for minimum 8 character passwords become hard for users. It's time we start demanding correcthorsebatterystaple style random word passwords with maximum lengths of 255 characters (and a minimum > 8 characters).
That and WTF the passwords were unsalted? Salt them and DON'T keep the salt in the database.
I put on my robe and wizard hat..
How many times does this have to be said? Having literally hundreds of thousands of sites on the web that require you to create an account with a password is not a good security model. This should be patently obvious to anyone who spends enough time on the web.
If any of you out there are devs (for consumer-facing web companies), I beg of you to push your company to start supporting OpenID as a relying party.
AccountKiller
Once you can solve problem a), problem b) will follow soon. Notice that in many cases you might not need to recover the password: any password that generates the same hash will do.
If the MD5-encoded /etc/shadow of your system leaks, solving problem a) allows me to log onto the system. That is a security problem. A lesser one than leaking the password itself, but it is still a problem. Passwords need to be changed.
With a strong hashing function, you'll post your /etc/shadow on the web and still sleep like a baby at night.
1) server + http is stateless; it is not trivial to delay attempts every second. You would have to maintain a database of accounts and failure timestamps. On occasion, you'd have to scrub that database too. Not difficult to implement but I suspect few do. Busy distributed sites have more complications as this database may need to be in sync between servers; creating a possible bottleneck and another attack vector.
2) An attack on 100s of accounts could rotate between accounts to get around the time limit. So now you are storing a short history in that database; or tracking an IP address but not being too aggressive with the IP due to NAT users... and bot nets do not have as much trouble getting IP addresses.
3) Security holes. Some simple little add-on to your website written in PHP just compromised your password database. The server may still be "secure" but the data could leak out and you may never know about it. Your password hashes are now on the internet with ZERO time delay between password attempts, and any method known to man can be employed in parallel against those password hashes. Many people use the SAME password for all their accounts, so there is motivation to crack them: even if everybody later changes their passwords, they probably keep the old ones in use elsewhere.
4) Some users have EMAIL ADDRESSES for account names, so it becomes easy to find that person again. Also, identification information may leak as well. Some sites produce different errors for unknown account names, so then you know they have an account, especially if the account name is an email address. Even with a 1 second delay, I can quickly (in parallel) check a huge list of email addresses to see who has accounts with XXX with animals and kids .com. In addition, one has enough to send phishing emails...
5) Lost password questions. These questions are usually pathetic and tolerant of variations on input. This provides an easier password to crack probably without as much protection. 1 second delay will do nothing against "What is your mother's maiden name?"
So:
Learn something from DES, MD5 and soon SHA -- use bcrypt hashing!
Keep a timestamp database to filter out simple attacks and identify accounts under attack and log more data.
Do not use emails for account names. Encrypt identification (emails) in the database; store the keys outside the database's reach.
Forbid stupid passwords. Probably BAD to have secure questions at all.
Do not mindlessly ban the use of autocomplete since it allows many of us to generate long random passwords. Do not limit the length of passwords or the characters used; too many sites are overly restrictive.
Do not output errors that leak information.
Democracy Now! - uncensored, anti-establishment news
The argument is that "we can do 1 million hashes in one second on a GPU, thus we can perform a dictionary attack in just a few days." It is an argument about execution speed versus size of the dictionary. The size of the dictionary is limited by the human brain, and is not going to change any time soon. The execution speed is expected to increase as a function of Moore's law, GPU, etc. The solution cannot be to move to SHA1, or even SHA256, because these algorithms don't take much longer to run than MD5. They can't, because they are used in scenarios where servers have to process millions of messages.
The solution is probably to do something special for password storage. Use salt of course. But also do something like "run SHA-xxx N times" where N is a number that grows larger as Moore's law progresses.
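The "salt plus N rounds" idea above is roughly what PBKDF2 does, and Python ships it in `hashlib`. A minimal sketch; the helper names and the iteration count are assumptions for illustration, not recommendations from the thread:

```python
import hashlib
import hmac
import os

def hash_password(password, iterations=200_000):
    """Return (salt, iterations, digest) for storage alongside the account."""
    salt = os.urandom(16)                      # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)
    return salt, iterations, digest

def verify_password(password, salt, iterations, expected):
    """Recompute with the stored salt and round count, compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                    salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

Storing the iteration count next to each hash is what lets N grow over time: new passwords get the larger count while old records still verify.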
The issue with doing the hash client-side is that now the hash has become the password. If someone steals the list of hashes it's game over, they can just emulate the client sending the hash and the server won't know that they didn't start from the password and perform a hash. The hash must be done server-side.
I've been playing with a dedicated hash database that is on its own server, so hosts bounce a request off this appliance and get a "yes", "no", or "timeout". Too many "no"s in too short a time make the hash validation appliance refuse to give any answers for a period of time.
If done correctly, for someone to get the hash database, they would have to find a way to physically get access to the appliance, then dump the box. It isn't perfect (which is why a better algorithm like bcrypt should be used to store hashes), but having a layer of security whose sole purpose is to keep the list of hashes out of the hands of the blackhats means that after a breach is rectified, users are not forced to change all their passwords (unless their accounts were directly involved).
I think at this point most people know md5 isn't as secure as once considered, but I don't think people realize just how insecure it is becoming.
Why not? Are they not paying attention? This is a quote from TFA:
In 2004, researchers revealed a number of weaknesses in regularly-used hash functions. Later in 2005, MD5 was declared “broken” by security expert Bruce Schneier.
I remember reading that back in 2005 (7 years ago!) and not being surprised. I mean, what the hell people? Who in 2012 is using MD5 for new systems thinking that it's "good enough"? It hasn't been "good enough" since SHA-1 came out in 1995. I mean, all other things being equal, MD5 results in a hash that is 128 bits, a SHA-1 hash is 160 bits. The hash space is larger for SHA-1, so why would anyone be using MD5 for anything at all, even CRC checks? Not even SHA-1 is good enough any more, so again, why would anyone think MD5 is fine to use? I just don't get it, if people in 2012 still think that MD5 is appropriate to use in any circumstance then they'll never learn.
"Our two-party system is like a bowl of shit looking at itself in a mirror." - Lewis Black
I thought of this just as I hit the post button, but I would LOVE to see PHP deprecate and outright remove the MD5 function. Maybe copy-paste programmers will start paying attention at that point. If it breaks software, well, that software was already broken. PHP is just requiring that you fix it now.
Wishful thinking...
"Our two-party system is like a bowl of shit looking at itself in a mirror." - Lewis Black
So one could trivially deny service to all of your users by bouncing a bunch of bogus login attempts off your front end?
It is an easy way to go, but what I saw people do is log in just to make sure a legit user would be locked out.
Some old IBM systems would lock a user account indefinitely after 3-5 wrong guesses. So, what people would do for petty revenge is just type in the user name of someone they don't like, type in some wrong guesses, and that person's account is locked out until the next weekday when the IT staff comes in.
Instead, there needs to be multiple levels of lockout to prevent brute force guessing. The lowest level would be at the hash database server, where it would allow 10-20 wrong guesses before it would block access for 1-2 minutes (good enough to slow down brute force dictionary attacks, but not so long that it means a legit user is locked out forever.) From there, the app server would lock out access via IP ranges, say 15 minutes if there were 3+ guesses wrong. This way, if someone was coming from one IP range to guess passwords, it would not completely lock a legit user out who was coming from somewhere else.
Who told you my password?
b) differs from a) in common practice in that you must find data+determined salt=hash. If your collision doesn't end in the salt, it's useless in that application.
XML is like violence. If it doesn't solve the problem, use more.
I think the problem is overhyped. If you're an online site and some hacker already has the hashes and salts of user passwords to bruteforce, you're typically already so pwned it doesn't matter if you are using SHA2048 or whatever.
For the same reason it doesn't matter that much even if I use 8 character passwords for noncritical online sites. I'd be flattered if the attackers are going to DDoS the site just to crack my password via the site's login page! If the site is famous enough I might even have enough warning to switch to a longer password when the DDoS attack hits the news ;).
Whereas if they are bruteforcing my password offline - it means the site has already been compromised. And they are likely to be able to access the rest of my data in that site, possibly do actions using my account and perhaps with a bit more effort get the plaintext of my password the next time I log in.
So use different passwords for different sites, don't use passwords so short or obvious that they can be bruteforced online, but don't sweat making them super long unless it's important or you're paranoid, since the site is more likely to get pwned before they bruteforce it online.
Getting pwned or compromised isn't a rare thing. I've signed up for different stuff using unique email addresses, and I've noticed spam coming to a few of those addresses. Maybe one day I'll have to create new slashdot/facebook/etc accounts when my current ones get pwned. Big deal.
For "offline" stuff like GPG, truecrypt, yes please do use strong and long passphrases.
Salt means rainbow tables become impractical. Salt also means a collision attack is mitigated. Salt is not expected to be any more secret than the hash; it just changes the way it works.
XML is like violence. If it doesn't solve the problem, use more.
It would be interesting to try to create some cryptosystem providing:
- a Hash(Key, Password) function;
- a Compare(Key1, Key2, Hash1, Hash2) function which returns true only when Password1=Password2, where Hash1=Hash(Key1, Password1) and Hash2=Hash(Key2, Password2).
When a user registers an account:
- Server gives the client a randomly generated challenge Key1;
- Client provides H = Hash(Key1, Password) to the server;
- Server stores Key1 and H.
When a user logs in:
- Server gives the client a randomly generated challenge Key2;
- Client provides H' = Hash(Key2, Password);
- Server computes Compare(Key1, Key2, H, H').
I'm not sure creating such a cryptosystem is possible; I have not given it much thought. It would be nice if Hash were expensive to compute but Compare were cheap. This has the added bonus that the server will never know your password.
The argument is that "we can do 1 million hashes in one second on a GPU, thus we can perform a dictionary attack in just a few days." It is an argument about execution speed versus size of the dictionary. The size of the dictionary is limited by the human brain, and is not going to change any time soon. The execution speed is expected to increase as a function of Moore's law, GPU, etc. The solution cannot be to move to SHA1, or even SHA256, because these algorithms don't take much longer to run than MD5. They can't, because they are used in scenarios where servers have to process millions of messages.
The solution is probably to do something special for password storage. Use salt of course. But also do something like "run SHA-xxx N times" where N is a number that grows larger as Moore's law progresses.
True, all true, but entirely missing the point. Never store any customer data at all unencrypted, even password hashes. There are many ways to have your data pwnt, but bulk copy of the data at rest is the easiest and most common. That data (bulk data at rest) should never be unencrypted, ever.
Socialism: a lie told by totalitarians and believed by fools.
Firstly, a bad tool becoming so popular to the point where it's nearly impossible to convince people not to use it is not an uncommon effect. Just like w3schools is so widely spread although it is one of the poorest sources of documentation for html and the like, MD5 is still pretty much the first thing most programmers are taught by others when it comes to storing passwords (especially PHP programmers, who often have no clue what they are doing, as they're pretty much the only people not using prepared statements in 2012).
Secondly, it often appears to be human nature to simply not care about consequences when they seem so unlikely to happen to you. Just like you know your door lock can be broken easily and there are better alternatives, yet you are not switching, many programmers simply come to the conclusion that whether they choose MD5 or something else, it will likely not matter at all. "Who's going to manage to break into my database and steal my hashes in the first place, anyway?" Unsalted MD5 is still a trend among PHP programmers (just like mysql). Besides, a reasonably strong password with a salt is still not very "easy" to break, and pragmatic programmers simply don't care that you can generate collision pairs efficiently.
Also, I'm not sure HMAC MD5 is considered weak at all.
That is what one tries to avoid. The best is to have multiple blocking mechanisms. First is by IP address, so if someone is hacking user Alice's account, the site trying to hack in as Alice gets blocked on the IP level.
The hashing appliance is a work in progress. Either a delay between handing out replies for the same user or an outright lockout serve the same purpose. The trick is to slow down a dictionary attack, as well as make it difficult for an attacker to grab the list of user password hashes.
Storing is meaningless if companies ** cough ** virgin-mobile ** cough** send you your passwords or PINs in an email every time you change it. And they include your phone number and name in the same email. And when asked to stop it they claim to adhere to industry standard security standards.
You missed the point.
Spoofed source IPs would defeat this method easily.
It's useless for an actual online brute-force attack, since the attacker clearly wants to receive a reply from his password attempts, but if the aim is just to DoS the authentication server, that's not a problem. You should send the client a token and wait to get it back before you start blowing CPU cycles on determining whether their password hash is legit.
DRM: Terminator crops for your mind!
There are ways to keep the data passed back and forth from being constant: for example a challenge-response system, where the server sends a random blob and requires the client to manipulate it using the password in a fashion that it can verify using the database-stored hash, without actually transmitting either the password or hash. Of course, now we're well beyond the simple "hash this and see if it matches" and into automatic handshaking protocols...
Does "run SHA-xxx N times" actually increase the computational cost in proportion to N? I don't happen to know enough about SHA-x to answer this, but I do remember that triple-DES did not actually increase complexity in proportion to the tripled key length.
http://en.wikipedia.org/wiki/Meet-in-the-middle_attack
DRM: Terminator crops for your mind!
With a strong hashing function, you'll post your /etc/shadow on the web and still sleep like a baby at night.
That's going a bit far... if the hashing function is known and a password in the list is known, the rest of the variables can usually be filled in pretty quickly. At that point, a brute force attack against the rest of the hashes will only be limited by the speed at which the attack can be performed -- which is the point of what's being argued in the original article: not just that md5crypt is bad, but that any replacement should be system-configurable (so that it's harder to guess the algorithm and algorithm settings/salt used) and computationally expensive (which most web servers are NOT going to like, but as sites like LinkedIn allow you to stay logged in via cookie anyway, I can't see how it's that much of an issue).
http://www.buzzfeed.com/jwherrman/the-23-most-depressing-leaked-linkedin-passwords
Why is Snark Required?
I'm always interested in the latest status of how fast the various popular hashes (and encryption) can be brute forced by:
- home computer / GPU
- super computer
- (blade) server cluster (enterprise/government scale)
- distributed.net-like project
- maybe ASICs or dedicated hardware? Like EFF DES cracker/Deep Crack
And the expectations for the next 20 to 50 years, assuming no revolutionary thing as quantum computing will emerge.
Does anyone know an up to date website dedicated to this? Articles tend to be obsolete in a few years.
You cannot spoof an IP in this case. You are establishing a TCP connection and sending login info over HTTP. The TCP connection starts with a 3-way handshake, which cannot complete from a spoofed IP. IP-based throttling and a dedicated hash processor (maybe someone could build a co-processor card for this) sound great actually.
When talking about password security you must assume that the attacker has access to your salt method, the hash function that you use and the hashed passwords.
That doesn't mean that you don't also do defense in depth such as account lockouts, increased response times for consecutive failures and limiting the number of tries. But at the end of the day you must make the assumption that the attacker can bypass those and make as many attempts per second as they have the money/resources for.
Then there's the enforcement of password complexity, which the marketing folks are at complete odds with the security folks about. Forcing users to use stronger passwords will drive them away and marketers will try to get those barriers lowered so they can build a user base.
Wolde you bothe eate your cake, and have your cake?
I've never used RoR, I prefer to write correct PHP code. What exactly do you use MD5 for where SHA-1 would not be more appropriate (less prone to collisions, due to a larger hash space)?
"Our two-party system is like a bowl of shit looking at itself in a mirror." - Lewis Black
Also, I'm not sure HMAC MD5 is considered weak at all.
Maybe not, but would it be better to use HMAC MD5 or HMAC SHA-1? Or HMAC SHA-512? What is the argument for using a weaker alternative?
Firstly, a bad tool becoming so popular to the point where it's nearly impossible to convince people not to use it is not an uncommon effect.
Right, that's why I'd love to see PHP remove the MD5 function. It would force people to wonder why and look for alternatives.
"Our two-party system is like a bowl of shit looking at itself in a mirror." - Lewis Black
The hashing algorithm is far more important in offline attacks than in online attacks. Please don't suggest otherwise, as you don't appear to have a background in security (this isn't a personal attack, but people without security backgrounds should never give security advice).
If the table containing password hashes is compromised and leaked, the only thing preventing the plaintext being bruteforced is the strength of the algorithm. Salting prevents using a rainbow table, but you have to assume the salt is also compromised. If there's a global salt, it's just a matter of building a new rainbow table with the known shared salt: this is time consuming, but not that bad especially for weaker passwords IF THE ALGORITHM WAS CHOSEN POORLY. When each password has its own salt, an attacker must go one at a time, slowing the entire process down by a factor of [number of users in the compromised system].
The algorithm's speed comes into play here. MD5 and SHA1 are meant to be used as checksums and are designed to run as fast as possible - hundreds of thousands of times per second on password-length strings with modern hardware. This of course disregards the fact that they are not cryptographically secure. SHA256, same situation: it's a bit slower and while it's currently still considered cryptographically secure, it's designed for speed. I can compute a rainbow table for a dictionary attack in about two seconds (or two seconds per row, assuming unique per-row salts). Compare to something like bcrypt or PBKDF2* which include a number of rounds specifically to slow things down: with even a relatively low number of rounds ($08$ to $09$ in bcrypt, for example) modern hardware caps at about ten hashes per second. Now going down /usr/share/dict/words takes 25,000 seconds - seven hours - instead of 1-2 seconds. That timeframe is going to get exponentially longer when you consider variations, substitutions, mixed case, and multi-word passwords, assuming the password being attacked is dictionary-based at all.
You do raise a valid point about the other damage a data breach may cause, but aside from encrypted data (such as financial information), the password is the most damaging thing an attacker could retrieve since that probably grants access to a whole host of other sites since so many people re-use passwords. If you're OK with that, fine - go ahead and re-use crappy passwords everywhere. But for the love of security, please don't give advice on how people/companies should hash passwords.
*PBKDF2 is meant to be used as a key derivation function to convert a password into a cryptographically-secure encryption key. It's not the best choice for password digests, although because it includes a work factor like bcrypt, it is still relevant to the discussion, and is certainly a better choice than MD5 or SHA-family alone.
How are sites slashdotted when nobody reads TFAs?
Lack of a salt makes no difference. Salting is designed to defeat rainbow table attacks. However no actual criminals who are cracking passwords are using rainbow tables. It's all done using GPUs, which don't care about salts. Also the summary of TFA is wildly misleading. 1 million attempts per second? Uh, no. Last I checked oclhashcat-plus was capable of about 50 billion attempts per second on a 4xATI5970 rig. You can break most "normal" passwords in under one second with such hardware.
md5 is definitely more collisionable than sha1!
GP was ragging on md5, not MD5Crypt, whatever that is.
Take off every 'sig' !!
Ever logged in to a mail server before? You logged in with that method.
How it works:
1. Make standard hash of user's password.
2. Make random 'challenge'. Give to user.
3. User sends back HASH(HASH(password) + challenge)
4. Server computes HASH(Stored hash + challenge)
5. Authentication with no password sent or stored in cleartext.
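The five steps above can be sketched roughly like this. HASH is assumed to be SHA-256 and "+" plain concatenation; the function names and the sample password are made up, and real mail-server mechanisms differ in the details (CRAM-MD5, for instance, uses HMAC).

```python
import hashlib
import hmac
import secrets

def h(data: bytes) -> bytes:
    # Stand-in for HASH in the steps above (assumed to be SHA-256 here).
    return hashlib.sha256(data).digest()

# Step 1: the server keeps only the hash of the password.
stored_hash = h(b"s3cret")

# Step 2: the server issues a fresh random challenge for this login attempt.
challenge = secrets.token_bytes(16)

def client_response(password: bytes, challenge: bytes) -> bytes:
    # Step 3: the client sends HASH(HASH(password) + challenge).
    return h(h(password) + challenge)

def server_check(stored: bytes, challenge: bytes, response: bytes) -> bool:
    # Step 4: the server computes HASH(stored hash + challenge) and compares.
    return hmac.compare_digest(h(stored + challenge), response)

# Step 5: the password itself never crosses the wire.
ok = server_check(stored_hash, challenge, client_response(b"s3cret", challenge))
bad = server_check(stored_hash, challenge, client_response(b"wrong", challenge))
print(ok, bad)
```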
If the server can decrypt it, an attacker can as well.
It's just a minor speed bump along the way.
An attack on 100s of accounts could rotate between accounts to get around the time limit. So now you are storing a short history in that database; or tracking an IP address but not being too aggressive with the IP due to NAT users... and bot nets do not have as much trouble getting IP addresses.
You are aware that such large-scale attacks are pretty rare? The vast majority involve one or two hosts. (You can tell this by looking at logs and overall net traffic levels.) They also tend to try all sorts of obvious things first; it's really easy to spot what's going on, and it usually reads like a litany of all that's wrong with IIS and PHP... but I digress. While theoretically, a botnet could be used to get past techniques like denyhosts, most botnets work on far more valuable things than breaking into a single account on a single computer. Hardly anyone has data valuable enough to be worth that effort (and with those I know who do, no botnet could break in anyway, as the data is held strictly offline).
"Little does he know, but there is no 'I' in 'Idiot'!"
Great... so instead of one password per site, someone just needs to log into your DropBox account and crack your (hopefully fairly strong) KeePass password, and they get everything -- not just all your passwords, but what sites they're for and what the associated usernames are. All sitting out there on a public server 24x7.
How strong is your KeePass password?
And the hash is the password again...
If I have stolen $hash from server, I won't need to break that hash to emulate the user.
I just need to get challenge and return hash($stolen_hash + challenge) to the server. Congratulations, you re-invented saving "plaintext" passwords! ;)
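A small sketch of that replay, assuming the server stores HASH(password) and checks HASH(stored hash + challenge), with SHA-256 standing in for HASH. All names and values are illustrative.

```python
import hashlib
import hmac
import secrets

def h(data: bytes) -> bytes:
    # Stand-in for the scheme's HASH function (assumed to be SHA-256 here).
    return hashlib.sha256(data).digest()

# What the server keeps on disk, and therefore what leaks in a breach.
stored_hash = h(b"s3cret")

def server_check(stored: bytes, challenge: bytes, response: bytes) -> bool:
    # Server side of the challenge-response: HASH(stored hash + challenge).
    return hmac.compare_digest(h(stored + challenge), response)

# The attacker holds only the stolen hash, never the password.
stolen_hash = stored_hash
challenge = secrets.token_bytes(16)

# The stolen hash is all that is needed to answer any challenge correctly.
forged_response = h(stolen_hash + challenge)
attacker_accepted = server_check(stored_hash, challenge, forged_response)
print(attacker_accepted)
```

No cracking step appears anywhere in the attacker's half, which is the sense in which the stored hash is the password.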
It's The Golden Rule: "He who has the gold makes the rules."
The attack on multiple encryption relies on two things: 1) Input and output space are both finite 2) You can invert encryption under possible key values to "meet in the middle" Neither of these is true for hash functions (although in this specific password case you can put reasonable bounds on the size of the password to get (1) under assumptions).
They have chosen suffix attacks on MD5 that would work for this but you have to have the freedom to add some large amount of data (many bytes) to the front of the input, which would most likely be rejected from any reasonable login system.
All of security is just speedbumps along the attackers way. That's what security does, whether physical or IT: it delays the attacker by some amount.
If all of your customer data is plaintext at rest, then the attacker just needs to pwn any server anywhere that can see that data, and bulk-copy it off. That might not even raise any internal alarms, since it's just a file copy.
If all of your customer data is encrypted at rest, then the attacker needs to pwn each machine that knows how to decrypt each part of that data, and not just to run an arbitrary process/shell on that server, but actually understand and interact with the process that holds the keys. That's a smaller hole. And if it's something like an SQL injection attack to get at DB data, you have to be vulnerable to that specific attack, and there's at least a chance that some massive query that returns all rows of all tables will get noticed, so even if you don't stop it maybe you at least know it just happened.
Socialism: a lie told by totalitarians and believed by fools.
The execution speed is expected to decrease as a function of Moore's law, GPU, etc.
Wouldn't the execution speed increase? Or did you mean that the execution time would decrease?
MD5 is a checksum algorithm. I have never seen it used to actually "encrypt" something, although I'm sure it happens from time to time. Regardless, removing it would be a bad idea.
Fanboy Status: Apache Flex, C#, Eclipse, KDE, Pirate Party, Ron Paul, Slackware, Windows 7
It is not MD5 we are talking about but MD5crypt: http://en.wikipedia.org/wiki/Crypt_(Unix)#MD5-based_scheme
Which is 1000 times MD5, so you get not 50 billion attempts on your rig but 0.05 billion
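Spelling that arithmetic out as a back-of-the-envelope sketch: the 50 billion figure and the factor of 1000 come from the comments above, while the keyspace (8 characters drawn from the 95 printable ASCII characters) is an assumption added here to get a sense of scale.

```python
# Figures quoted in the thread.
raw_md5_rate = 50_000_000_000   # plain MD5 attempts per second on the GPU rig
md5crypt_rounds = 1000          # MD5 iterations per md5crypt guess

# Each md5crypt guess costs roughly 1000 MD5 computations.
md5crypt_rate = raw_md5_rate // md5crypt_rounds

# Assumed keyspace: every 8-character string over 95 printable ASCII characters.
keyspace = 95 ** 8

# Worst-case time to try every candidate at that rate.
seconds = keyspace / md5crypt_rate
days = seconds / 86_400

print(f"{md5crypt_rate:,} md5crypt guesses per second")
print(f"about {days:,.0f} days to exhaust the assumed keyspace")
```

A smaller alphabet or shorter passwords shrinks that number quickly, so the result is only as good as the keyspace assumption.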
Atari rules... ermm... ruled.
That method is nothing like I described and I doubt anyone serious about security is using it.
If I remember correctly, PHP's main developer broke the crypt() function and their regression testing was so poor that the bug went into a stable release. Not to mention that everyone was like "everyone can misunderstand the manpage for strcpy!". I wouldn't put too much faith in anything good coming out of PHP.
Using HMAC-SHA1 over HMAC-MD5 is no more convincing than using AES-192 over AES-128. For regular security needs, AES-128 is considered sufficient. For greater needs, there's AES-192 and AES-256. The same logic applies for HMAC-MD5, HMAC-SHA1 and HMAC-SHA512.
SASL does store the password in clear text though if you use CRAM-MD5 or DIGEST-MD5 authentication.
Use strings on the SASL db to confirm this.
I worked on a search engine in '95, and used md5 hashes (of content minus tags) as keys for pages to detect identical pages, and I remember having 2-3 md5 collisions out of maybe the first 100,000 indexed pages.
On second thought, the idea above is merely a zero-knowledge proof which can be easily achieved with homomorphic encryption. I think this could be achieved with javascript so website developers can easily integrate a more secure login with almost zero overhead server-side, and unnoticeable overhead client-side.
You seem to be replying to the wrong post.
For regular security needs, AES-128 is considered sufficient. For greater needs, there's AES-192 and AES-256.
I don't like that type of thought, it seems lazy. It's the reason why MD5 is still being used. What is "sufficient" today may not be sufficient in 10 years, so why choose lesser security when the difference between that and greater security is a bigger database field? And how would you quantify "regular security" versus "greater needs"?
"Our two-party system is like a bowl of shit looking at itself in a mirror." - Lewis Black
Yes. This is slashdot.
Some other OCD nerd would have pointed it out as an obvious attack OR if I mentioned it point out it is unlikely...
Likelihood is not a wise excuse when listing security threats! Save that for the implementation phase! You document the threats and what you are doing about them, which in some cases will be nothing because it is not worth the effort at that point.
One has to keep in mind that when common attack vectors are GONE the less common ones become the new common attacks - and real crackers are skilled computer users who evolve their tools and skills (sometimes lending help to others.) It also depends on the details; if you don't track IPs and I have a long list of accounts I could stay under the radar by rotating the attacks; if you track IPs then I'd have a problem of getting a lot of IPs to do the attack which as you pointed out is quite unlikely.
Democracy Now! - uncensored, anti-establishment news
You are confusing the story about the Flame malware with the LinkedIn breach.
Funny enough, in one case the FBI wants to know who told everyone the USA wrote Flame to hack sites, and in the other case the FBI wants to know who hacked LinkedIn. Perhaps the two teams should, you know, Link up? :)
Flame actually does use an (apparently previously unknown) cryptographic attack method on the MD5 hash itself (creating collisions in a novel way) to sign fake Microsoft certificates. This was already a known attack since last year, but that time it appeared to come from the Middle East and was aimed at Iranian dissidents (apparently). Funny how it was used against Iran then. Someone seems to have a well-developed sense of irony. Unfortunately the new method also opens new attack avenues at other cryptographic security systems. The stakes are high, but not every country or involved party in what looks like a covert war can afford this research (or do it with success), so RSA and certificate providers will be having a really bad time in the coming years: I predict physical break-ins, kidnappings, and assaults on people working for those companies.
If I was working for RSA or a similar company, I'd leave facebook and LinkedIn and other social media right now. Social media sites are about to become decidedly unhealthy pastimes for employees of those companies.
Therefore, by the (faulty) logic you're using, you're just a cow with a keyboard - osu-neko (2604)
To check the integrity of transmitted files from a list of MD5 checksums provided by the software on the other end, that can only do MD5 checksums. Not critical but still pretty useful when we go live in 4 weeks. Validated system and all that so forget about patching it (unless you want to spend a lot of time+money).
Therefore, by the (faulty) logic you're using, you're just a cow with a keyboard - osu-neko (2604)
So the argument for using a hash/checksum algorithm with a smaller collision space is because it would be difficult to update the software, not for a technical reason? I'm looking for a single technical reason why someone would favor MD5 over SHA-1.
"Our two-party system is like a bowl of shit looking at itself in a mirror." - Lewis Black
Your question inspired me to write a proof of concept.
Do you care about the security of your wireless mouse?
The issue with doing the hash client-side is that now the hash has become the password. If someone steals the list of hashes it's game over, they can just emulate the client sending the hash and the server won't know that they didn't start from the password and perform a hash. The hash must be done server-side.
Not necessarily. If the salt or part of the salt was a "Proof of work" package, it could be offloaded to the client side.
i.e.
my password is stored in regular hash/salt format, the salt being a challenge string.
The proof of work would be to vary the challenge string on a pre-agreed upon mutation, and produce a hash of the mutated string so that the value starts with 0000. That hash added to a secondary salt based on the challenge string constitutes the actual salt. Naturally the proof of work can vary with hardware improvements.
Now I can off load most of that work to the client to calculate the salt for me.
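A minimal sketch of the proof-of-work part, under some loud assumptions: the "pre-agreed mutation" is taken to be appending a counter to the challenge string, the hash is SHA-256, and the target is four leading hex zeros. How the result then gets folded into the salt is left out, and the function names are made up.

```python
import hashlib
from itertools import count

def pow_digest(challenge: str, nonce: int) -> str:
    # Assumed mutation: the challenge string with a counter appended.
    return hashlib.sha256(f"{challenge}:{nonce}".encode("utf-8")).hexdigest()

def solve(challenge: str, prefix: str = "0000"):
    # Client side: try counters until the digest starts with the agreed prefix.
    for nonce in count():
        digest = pow_digest(challenge, nonce)
        if digest.startswith(prefix):
            return nonce, digest

def check(challenge: str, nonce: int, prefix: str = "0000") -> bool:
    # Server side: a single hash is enough to confirm the client did the work.
    return pow_digest(challenge, nonce).startswith(prefix)

nonce, digest = solve("example-challenge")
verified = check("example-challenge", nonce)
print(nonce, digest, verified)
```

Lengthening the prefix by one hex digit makes the client's expected work 16 times larger while the server's check stays at one hash, which is how the difficulty could track hardware improvements.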