MD5crypt Password Scrambler Is No Longer Considered Safe
As reported here recently, millions of LinkedIn password hashes have been leaked online. An anonymous reader writes "Now, Poul-Henning Kamp, a developer known for work on various projects and the author of the md5crypt password scrambler, asks everybody to migrate to a stronger password scrambler without undue delay. From the blog post: 'New research has shown that it can be run at a rate close to 1 million checks per second on COTS GPU hardware, which means that it is as prone to brute-force attacks as the DES based UNIX crypt was back in 1995: Any 8 character password can be found in a couple of days. The default algorithm for storing password hashes in /etc/shadow is MD5. RHEL / CentOS / FreeBSD user can migrate to SHA-512 hashing algorithms.'" Reader Curseyoukhan was one of several to also point out that dating site eHarmony got the same treatment as LinkedIn. Update: 06/07 20:13 GMT by T : An anonymous reader adds a snippet from Help Net Security, too: "Last.fm has piped up to warn about a leak of their own users' passwords. Users who have logged in to the site were greeted today by a warning asking them to change their password while the site investigates a security problem. Following the offered link to learn more, they landed on another page with another warning."
rot13 isn't safe either.
Good info about storing passwords properly: http://www.f-secure.com/weblog/archives/00002095.html
"... RHEL / CentOS / FreeBSD user can migrate to SHA-512 hashing algorithms."
That's fine and dandy and all, but what about King Shit Debian? Pretty much all of my systems run something Debian-based in one form or another. Is this a simple change for Debian users too?
It astounds me that Linkedin and eHarmony used unsalted password hashes. That's much worse than using md5 (and, yes, you shouldn't use md5, but, still, first things first).
From the Linkedin Press Release :
The passwords are stored as unsalted SHA-1 hashes,
Come on, guys, get up to at least 1978 in your security policy.
Looks like it's time to change my password to "password1".
JADBP
So, if someone steals your database full of hashed passwords, you just call them up and ask them nicely to slow down their brute force?
If you get your password wrong, you can't try again for 1 second. Every failure doubles the time required to try again.
Why doesn't everyone do that?
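For anyone wondering what the doubling lockout would look like, here is a minimal sketch in Python. The class name LoginThrottle, the in-memory dictionaries and the 1 second base delay are all assumptions made for illustration; a real site would need shared, persistent storage. It only slows guessing through the login form and does nothing once the hashes themselves have been copied.

```python
import time


class LoginThrottle:
    """Hypothetical per-account throttle: each failure doubles the wait."""

    def __init__(self, base_delay=1.0):
        self.base_delay = base_delay
        self.failures = {}      # account -> consecutive failure count
        self.locked_until = {}  # account -> time when the next try is allowed

    def may_attempt(self, account, now=None):
        """Return True if the account is not currently locked out."""
        now = time.time() if now is None else now
        return now >= self.locked_until.get(account, 0.0)

    def record_failure(self, account, now=None):
        """Register a failed login and return the new lockout in seconds."""
        now = time.time() if now is None else now
        count = self.failures.get(account, 0) + 1
        self.failures[account] = count
        delay = self.base_delay * 2 ** (count - 1)  # 1s, 2s, 4s, ...
        self.locked_until[account] = now + delay
        return delay

    def record_success(self, account):
        """Clear the failure history after a correct password."""
        self.failures.pop(account, None)
        self.locked_until.pop(account, None)
```

The `now` parameter is there only so the behaviour can be checked without actually sleeping.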
It doesn't help if your attacker has got hold of the list of hashes.
1. Steal hashes
2. Brute-force on your own system/cloud/botnet/whatever
3. Use password
SHA-1 is not sufficient, but the summary refers to SHA-512, which is in the SHA-2 set. Now whether SHA-2 is sufficient, or if developers should be migrating to something like bcrypt or SRP, is a bigger conversation.
In any case, while I would never tell someone to use MD5, the lack of a salt is much more egregious and made the leak much worse than it had to be.
Because whoever downloaded the database of hashes will probably ignore your request that they only check one password per second.
If I have been able to see further than others, it is because I bought a pair of binoculars.
Indeed.
The effort to use a more secure hash is generally trivial, but there's still going to be a lot of people who either know and don't, or don't know.
For the first category, nothing you can do about it. Same people running wep on their wifi. They either don't see anyone ever attacking them, are tied in due to old systems, or don't care.
For the second category, stuff like this may help. I think at this point most people know md5 isn't as secure as once considered, but I don't think people realize just how insecure it is becoming. In people's minds it's still in the "theoretically if someone was really dedicated they could break it" stage, whereas it's actually entering into the "feasible to do it on large scale" stage. Breaking that perception might speed things along.
608b2d50a6521a27c12626cedfea0fc3
Whether MD5 is "secure" or not is irrelevant.
Machines that are accessed by users should not be the same servers storing the account security data. One of the key benefits to domain authentication provided by Kerberos and its relatives is that the authentication data is isolated on a server that is supposed to be doing nothing but authentication and authorization.
That makes it damn hard to break into the security server to steal the password lists in the first place, regardless of what algorithms are used to hash the passwords. The problem is a poorly designed system, not a poorly equipped algorithm.
I do not fail; I succeed at finding out what does not work.
You have to distinguish two cases:
a) Collisions of hashes -- two documents have the same hash, and you can alter a document, but it will still have the same hash.
b) The hashing algorithm is insecure (not one-directional) for passwords, i.e. you can reconstruct the original password.
If the algorithm is susceptible to a), as were the attacks you mention, this does not mean anything for the password security! You don't want to create an alternate password that has the same hash as the password you already have. Additionally, you have length limitations with passwords you do not have for collisions.
Of course, susceptibility of hash algorithms to a) and b) is weakly correlated, but just because people understand the algorithm better.
Specifically, what are the drawbacks of storing md5 hashed password? Except for rainbow tables that can be produced for any algorithm and are evaded by salts.
I wouldn't choose MD5 for designing a new system, but I think understanding the difference is important. This has some similarity to using ridiculous key lengths for public-key encryption.
The article is arguing that MD5 and SHA1 are just too fast, so rainbow tables are cheap to compute once the attacker has the salt, and that algorithms that require more computation should be preferred. Should PBKDF2 therefore be chosen for hashing documents? No, because a) and b) are different problems with different requirements.
NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
First of all, WTF is a "password scrambler"? If you feel the need to dumb down the phrase "hash algorithm", you're probably submitting to the wrong site.
I LOLed at this article[1] on ZDNet this morning for its sensationalist, lowest-common-denominator "OMG computer hackery stuff" reporting, with its implied link between MD5's weakness (which has been known for years) and the LinkedIn breach (even though they use SHA1), and its ridiculous accompanying screen cap (running user-space tools while logged in as root, which no security-minded user would ever do, but hey "root@" at a shell prompt with lots of hackery output looks l33t).
And now here's basically the same thing on Slashdot. Yawn...
[1] http://www.zdnet.com/blog/security/md5-password-scrambler-no-longer-safe/12317
If only there were a website where they could connect with other security professionals, exchange ideas and maybe even find people to hire....
If telephones are outlawed, then only outlaws will have telephones.
Algorithms designed to burn CPU, like PBKDF2 recommended in the article, are great if used correctly.
The important thing to remember is that you want to make the client burn CPU, NOT the server. If you let the client trivially initiate 10 http requests that cause a server's CPU to peg for 1 second each, you've created a nice DoS vector.
Are there any existing Javascript crypto libraries that safely offload this work to the client?
As the summary notes, 8 character passwords can be cracked pretty quickly. 15 Characters with the crappy password rules we've enforced for minimum 8 character passwords become hard for users. It's time we start demanding correcthorsebatterystaple style random word passwords with maximum lengths of 255 characters (and a minimum > 8 characters).
That and WTF the passwords were unsalted? Salt them and DON'T keep the salt in the database.
I put on my robe and wizard hat..
How many times does this have to be said? Having literally hundreds of thousands of sites on the web that require you to create an account with a password is not a good security model. This should be patently obvious to anyone who spends enough time on the web.
If any of you out there are devs (for consumer-facing web companies) - I beg of you to push your company to start supporting OpenID as a relying party.
AccountKiller
1) server + http is stateless; it is not trivial to delay attempts every second. You would have to maintain a database of accounts and failure timestamps. On occasion, you'd have to scrub that database too. Not difficult to implement but I suspect few do. Busy distributed sites have more complications as this database may need to be in sync between servers; creating a possible bottleneck and another attack vector.
2) An attack on 100s of accounts could rotate between accounts to get around the time limit. So now you are storing a short history in that database; or tracking an IP address but not being too aggressive with the IP due to NAT users... and bot nets do not have as much trouble getting IP addresses.
3) Security holes. Some simple little add on to your website written in PHP just compromised your password database. The server may still be "secure" but the data could leak out and you may never know about it. Your password hashes are now on the internet with ZERO time delay between password attempts and any method known to man can be employed in parallel against those password hashes. Many people use the SAME password for all their accounts, so one can be motivated to crack them: even if everybody later changes their passwords, they probably keep the old ones in use elsewhere.
4) Some users have EMAIL ADDRESSES for account names, so it becomes easy to find that person again. Also, identification information may leak as well. Some sites produce different errors for unknown account names so then you know they have an account - especially if the account name is an email address. Even with a 1 second delay, I can quickly (in parallel) check a huge list of email addresses to see who has accounts with XXX with animals and kids .com. In addition, one has enough to send phishing emails...
5) Lost password questions. These questions are usually pathetic and tolerant of variations on input. This provides an easier password to crack probably without as much protection. 1 second delay will do nothing against "What is your mother's maiden name?"
So:
Learn something from DES, MD5 and soon SHA -- use bcrypt hashing!
Keep a timestamp database to filter out simple attacks and identify accounts under attack and log more data.
Do not use emails for account names. Encrypt identification (emails) in the database; store the keys outside the database's reach.
Forbid stupid passwords. Probably BAD to have secure questions at all.
Do not mindlessly ban the use of autocomplete since it allows many of us to generate long random passwords. Do not limit the length of passwords or the characters used; too many sites are overly restrictive.
Do not output errors that leak information.
Democracy Now! - uncensored, anti-establishment news
The issue with doing the hash client-side is that now the hash has become the password. If someone steals the list of hashes it's game over, they can just emulate the client sending the hash and the server won't know that they didn't start from the password and perform a hash. The hash must be done server-side.
I tend to use 'apg' when generating passwords, neat little tool. Aliased as 'apg -a 1 -m 12 -x 16' though, as the default generator goes for pronounceable passwords that are too short for my taste:
% apg
;a-_)wg}~*Xu~z
9&}v3Q/'n5O6UN
]%LE\!TLUt?Z]jjj
$i4&zmOxh-wmfGu
N6.H+i/^rcGo5`p
rKv4JoC6wO0`\6,j
If someone brute-forces those they have earned it.
This sig is intentionally left blank
I've been playing with a dedicated hash database that is on its own server, so hosts bounce a request off this appliance and get a "yes", "no", or "timeout". Too many "no"s in too short a time make the hash validation appliance refuse to give any answers for a period of time.
If done correctly, for someone to get the hash database, they would have to find a way to physically get access to the appliance, then dump the box. It isn't perfect (which is why a better algorithm like bcrypt should be used to store hashes), but having a layer of security whose sole purpose is to keep the list of hashes out of the hands of the blackhats means that after a breach is rectified, users are not forced to change all their passwords (unless their accounts were directly involved).
Who told you my password?
I think the problem is overhyped. If you're an online site and some hacker already has the hashes and salts of user passwords to bruteforce, you're typically already so pwned it doesn't matter if you are using SHA2048 or whatever.
For the same reason it doesn't matter that much even if I use 8 character passwords for noncritical online sites. I'd be flattered if the attackers are going to DDoS the site just to crack my password via the site's login page! If the site is famous enough I might even have enough warning to switch to a longer password when the DDoS attack hits the news ;).
Whereas if they are bruteforcing my password offline - it means the site has already been compromised. And they are likely to be able to access the rest of my data in that site, possibly do actions using my account and perhaps with a bit more effort get the plaintext of my password the next time I log in.
So use different passwords for different sites, don't use passwords that are so short or obvious that they can be bruteforced online, but don't sweat making them super long unless it's important or you're paranoid, since the site is more likely to get pwned before they bruteforce it online.
Getting pwned or compromised isn't a rare thing. I've signed up for different stuff using unique email addresses, and I've noticed spam coming to a few of those addresses. Maybe one day I'll have to create new slashdot/facebook/etc accounts when my current ones get pwned. Big deal.
For "offline" stuff like GPG, truecrypt, yes please do use strong and long passphrases.
The argument is that "we can do 1 million hashes in one second on a GPU, thus we can perform a dictionary attack in just a few days." It is an argument about execution speed versus size of the dictionary. The size of the dictionary is limited by the human brain, and is not going to change any time soon. The execution speed is expected to keep increasing as a function of Moore's law, GPUs, etc. The solution cannot be to move to SHA1, or even SHA256, because these algorithms don't take much longer to run than MD5. They can't, because they are used in scenarios where servers have to process millions of messages.
The solution is probably to do something special for password storage. Use salt of course. But also do something like "run SHA-xxx N times" where N is a number that grows larger as Moore's law progresses.
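The "salt plus N rounds" idea is roughly what PBKDF2 already does, so here is a hedged sketch using Python's standard hashlib.pbkdf2_hmac. The function names hash_password and verify_password, the 16 byte salt and the default of 200,000 iterations are assumptions picked for illustration, not a recommendation from anyone in this thread.

```python
import hashlib
import hmac
import os


def hash_password(password, iterations=200_000):
    """Return (salt, iterations, digest) for storage next to the account."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, iterations, digest


def verify_password(password, salt, iterations, digest):
    """Recompute with the stored salt and iteration count, then compare."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return hmac.compare_digest(candidate, digest)  # constant-time compare
```

Because the iteration count is stored with each record, N can be raised for new passwords as hardware gets faster while old records still verify.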
True, all true, but entirely missing the point. Never store any customer data at all unencrypted, even password hashes. There are many ways to have your data pwnt, but bulk copy of the data at rest is the easiest and most common. That data (bulk data at rest) should never be unencrypted, ever.
Socialism: a lie told by totalitarians and believed by fools.
Storing is meaningless if companies ** cough ** virgin-mobile ** cough** send you your passwords or PINs in an email every time you change it. And they include your phone number and name in the same email. And when asked to stop it they claim to adhere to industry standard security standards.
Spoofed source IPs would defeat this method easily.
It's useless for an actual online brute-force attack, since the attacker clearly wants to receive a reply from his password attempts, but if the aim is just to DoS the authentication server, that's not a problem. You should send the client a token and wait to get it back before you start blowing CPU cycles on determining whether their password hash is legit.
DRM: Terminator crops for your mind!
http://www.buzzfeed.com/jwherrman/the-23-most-depressing-leaked-linkedin-passwords
Why is Snark Required?
The hashing algorithm is far more important in offline attacks than in online attacks. Please don't suggest otherwise, as you don't appear to have a background in security (this isn't a personal attack, but people without security backgrounds should never give security advice).
If the table containing password hashes is compromised and leaked, the only thing preventing the plaintext being bruteforced is the strength of the algorithm. Salting prevents using a rainbow table, but you have to assume the salt is also compromised. If there's a global salt, it's just a matter of building a new rainbow table with the known shared salt: this is time consuming, but not that bad especially for weaker passwords IF THE ALGORITHM WAS CHOSEN POORLY. When each password has its own salt, an attacker must go one at a time, slowing the entire process down by a factor of [number of users in the compromised system].
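To make the per-row salt point concrete, here is a small Python sketch. The helper names are made up, and plain SHA-256 is used only to show what the salt does; it is still a fast hash and not what you would pick for real password storage.

```python
import hashlib
import os


def new_salt():
    """Generate a random 16 byte salt, one per user row."""
    return os.urandom(16)


def unsalted_digest(password):
    """Same password always gives the same digest for every user."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_digest(password, salt):
    """The digest depends on the row's salt as well as the password."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Two users who picked the same password.
alice_salt, bob_salt = new_salt(), new_salt()
same_unsalted = unsalted_digest("password1") == unsalted_digest("password1")
same_salted = (
    salted_digest("password1", alice_salt)
    == salted_digest("password1", bob_salt)
)
```

Without a salt, one precomputed table cracks both rows at once; with per-row salts the attacker has to redo the work for each row.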
The algorithm's speed comes into play here. MD5 and SHA1 are meant to be used as checksums and are designed to run as fast as possible - hundreds of thousands of times per second on password-length strings with modern hardware. This of course disregards the fact that they are not cryptographically secure. SHA256, same situation: it's a bit slower and while it's currently still considered cryptographically secure, it's designed for speed. I can compute a rainbow table for a dictionary attack in about two seconds (or two seconds per row, assuming unique per-row salts). Compare to something like bcrypt or PBKDF2* which include a number of rounds specifically to slow things down: with even a relatively low number of rounds ($08$ to $09$ in bcrypt, for example) modern hardware caps at about ten hashes per second. Now going down /usr/share/dict/words takes 25,000 seconds - seven hours - instead of 1-2 seconds. That timeframe is going to get exponentially longer when you consider variations, substitutions, mixed case, and multi-word passwords, assuming the password being attacked is dictionary-based at all.
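The arithmetic behind those figures, as a quick Python sketch. The 250,000 word count is inferred from "25,000 seconds at ten hashes per second", and the 200,000 hashes per second figure for a fast hash is an assumed stand-in for "hundreds of thousands".

```python
def seconds_to_exhaust(candidates, hashes_per_second):
    """Worst-case time to try every candidate against one salted hash."""
    return candidates / hashes_per_second


WORDS = 250_000      # assumed size of /usr/share/dict/words
FAST_RATE = 200_000  # assumed rate for MD5/SHA1-style hashes
SLOW_RATE = 10       # rate quoted above for low-round bcrypt

fast_seconds = seconds_to_exhaust(WORDS, FAST_RATE)
slow_seconds = seconds_to_exhaust(WORDS, SLOW_RATE)
slow_hours = slow_seconds / 3600
```

With per-row salts, multiply either number by the number of rows the attacker wants to crack.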
You do raise a valid point about the other damage a data breach may cause, but aside from encrypted data (such as financial information), the password is the most damaging thing an attacker could retrieve since that probably grants access to a whole host of other sites since so many people re-use passwords. If you're OK with that, fine - go ahead and re-use crappy passwords everywhere. But for the love of security, please don't give advice on how people/companies should hash passwords.
*PBKDF2 is meant to be used as a key derivation function to convert a password into a cryptographically-secure encryption key. It's not the best choice for password digests, although, because it includes a work factor like bcrypt, it is still relevant to the discussion and is certainly a better choice than MD5 or SHA-family alone.
How are sites slashdotted when nobody reads TFAs?
There are 600K words in the Oxford dictionary. That gives, by your formula, a formidable 46,656,000,000,000,000,000,000,000,000,000,000 possible passwords to crunch into the rainbow table.
But most Americans have a lexicon of about 8K - 10K words! 9000^6 is only 531,441,000,000,000,000,000,000.
As you are aware, "correct horse battery staple" is only four words: 9000^4 = 6,561,000,000,000,000. Not an improvement in complexity, but considerably more memorable. God, passwords are so shitty, they do my head in.
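The numbers above are easy to reproduce. A small Python sketch, assuming words are picked uniformly at random from the vocabulary (the function names are made up):

```python
import math


def search_space(vocabulary_size, word_count):
    """Number of passphrases of word_count words from the vocabulary."""
    return vocabulary_size ** word_count


def bits_of_entropy(space):
    """Equivalent key length in bits for a uniformly chosen passphrase."""
    return math.log2(space)


oxford_six = search_space(600_000, 6)   # six words, Oxford-sized dictionary
lexicon_six = search_space(9_000, 6)    # six words, 9K-word lexicon
lexicon_four = search_space(9_000, 4)   # four words, 9K-word lexicon
lexicon_four_bits = bits_of_entropy(lexicon_four)
```

Each extra word multiplies the space by the vocabulary size, which is why word count matters more than fancy character rules.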
Take off every 'sig' !!
And the hash is the password again...
If I have stolen $hash from server, I won't need to break that hash to emulate the user.
I just need to get challenge and return hash($stolen_hash + challenge) to the server. Congratulations, you re-invented saving "plaintext" passwords! ;)
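A hedged sketch of that attack in Python, assuming a scheme where the server stores hash(password) and checks hash(stored_hash + challenge). All function names are invented for illustration and SHA-256 stands in for whatever hash the site uses.

```python
import hashlib
import hmac
import os


def h(text):
    """Hex SHA-256 of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def server_challenge():
    """Random one-time challenge sent to the client."""
    return os.urandom(16).hex()


def response_from_hash(password_hash, challenge):
    """What any client sends: hash(password_hash + challenge)."""
    return h(password_hash + challenge)


def honest_response(password, challenge):
    """Legitimate client: starts from the real password."""
    return response_from_hash(h(password), challenge)


def server_verify(stored_hash, challenge, response):
    """Server recomputes the expected response from its stored hash."""
    expected = response_from_hash(stored_hash, challenge)
    return hmac.compare_digest(expected, response)


stored = h("correct horse battery staple")  # row in the leaked table
challenge = server_challenge()
# The attacker never learns the password, only the stolen hash.
attacker_ok = server_verify(
    stored, challenge, response_from_hash(stored, challenge)
)
```

The server cannot tell the two clients apart, so in this scheme the stored hash is as good as the password.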
It's The Golden Rule: "He who has the gold makes the rules."
All of security is just speedbumps along the attackers way. That's what security does, whether physical or IT: it delays the attacker by some amount.
If all of your customer data is plaintext at rest, then the attacker just needs to pwn any server anywhere that can see that data, and bulk-copy it off. That might not even raise any internal alarms, since it's just a file copy.
If all of your customer data is encrypted at rest, then the attacker needs to pwn each machine that knows how to decrypt each part of that data, and not just to run an arbitrary process/shell on that server, but actually understand and interact with the process that holds the keys. That's a smaller hole. And if it's something like an SQL injection attack to get at DB data, you have to be vulnerable to that specific attack, and there's at least a chance that some massive query that returns all rows of all tables will get noticed, so even if you don't stop it maybe you at least know it just happened.
Socialism: a lie told by totalitarians and believed by fools.