LinkedIn Password Leak: Salt Their Hide
CowboyRobot writes "Following yesterday's post about Poul-Henning Kamp no longer supporting md5crypt, the author has a new column at the ACM where he details all the ways that LinkedIn failed, specifically related to how they failed to 'salt' their passwords, making them that much easier to crack. 'On a system with many users, the chances that some of them have chosen the same password are pretty good. Humans are notoriously lousy at selecting good passwords. For the evil attacker, that means all users who have the same hashed password in the database have chosen the same password, so it is probably not a very good one, and the attacker can target that with a brute force attempt.'"
Isnt salting and hashing a standard practice for passwords even for low security stuff?
With something as high profile as Linkedin, how did it get missed?
Arent there audits,etc to check for this type of stuff?
Isnt it similar to releasing a range of cars, all of which have the same key (or something similar. Analogies are my weak point, as is the English language)
Funny! Learned to "always salt your passwords" in Udacity CS253: Webapps.
I think you might be missing the point about duplicate passwords--it's an argument FOR salting the hashes.
Yo dawg, I heard you like the Ackermann function, so OH GOD OH GOD OH GOD
Exactly. Both the summary and article are being stupid about the reason for salting in hashed passwords. It's main benefit isn't hiding two same password. It's main purpose is to make brute force much more work, even if the user supplied short password. Even Google isn't stupid enough to pull stuff like that. The salt should consist of general site wide salt, and personal salt calculated from user values that do not change (UID, birth date, some extra field in db).
I'll say it again: OpenID.
AccountKiller
I think I will start visiting websites I no longer use, and deactivate my accounts. If that doesn't work, I'll fill the "user profile" with a bunch of nonsense & scramble the password. It worries me that over the last ~20 years I've visited a bunch of places and left behind details about myself. I never used linked-in but I imagine if I did, I'd be in potential trouble now (leaked info).
My AC stalker: " I personally agree with your posts most of the time, but that won't keep me from modding you troll"
Do they understand how hashes work?
Yes, Poul-Henning Kamp understand how hashes work. Much, much, much better than you do. But if you feel compelled to lecture the writer of MD5crypt on your wonderful insights into how hashes work, please, feel free.
https://www.browserid.org/
It's the closest I've seen to SSH Keys. That's all I want, SSH Keys for web auth.
This may well be a very daft question but:
It appears that the default is to use a simple solution when people are first developing their websites.
Isn't this what libraries are for? Why isn't the default secure method in development environments/website template/distros etc to use a much more secure method?
Why is there no standard library that I just say (for example) store_user_password($user_name); and it takes care of it?
If there is such a thing why isn't it more universally used and why blame the website implementers for what is a dev environment/toolkit/whatever problem?
If the technology the article talks about has been known since the 80s why isn't it the standard in all the modern environments? And whatever the answer I'm not sure the blame rests in the users of however they develop this(unless they are writing their website in C based CGI...).
"The weirdest thing about a mind, is that every answer that you find, is the basis of a brand new cliche" -
Or more important, makes it so you would have to try to brute force each one because the hash won't exist in previously generated rainbow tables.
So if I crack other people's passwords on Linkedin, can I put that up as a skill in my resume on Linkedin?
Always wanting to salt everything. Maybe LinkedIn just wanted to be leaner??
This LinkedIn hack could lead to even more high-profile hacks, due to the unique user base that LinkedIn has
On most sites (like Facebook) most of the stupid passwords will belong to stupid 13 year old kids with nothing of value to hackers, but on a site like LinkedIn you are more likely to get the password for some computer illiterate corporate executive. In many cases this is the same simplistic password he uses at work, where he insisted he be given admin-level access on all of their servers "because hes an executive"
Computer security is always about the weakest link in the chain, and when one of those "links" partied his way though business college and never thinks twice about password reuse, you have a pretty weak chain. LinkedIn is like an x-ray showing the hackers who the weak links are.
If you're looking for hash collisions, you're not going to target a particular hash; you're going to leverage the magic of the birthday paradox and hash your way through a dictionary with passwords listed in order of decreasing probability (with "password" and "12345" listed first, then "p@ssw0rd" and "OU812", etc.) and match the resulting hash with entries in the password file. And you're going to do this once, building one rainbow table, and try it on every unsalted, SHA1 hashed password list you can get your grubby hands on, until you have a nice little dictionary of username / password pairs - and try it on whatever services you're targeting, because most people tend to reuse the same password over and over again (usually Joshua97 or MaryJune05 or some other combination of their kids' names and birthdates).
Salting does not make brute forcing more work, it makes rainbow tables less useful. This is its primary purpose, though the secondary purpose of hiding matched passwords (which are also likely to be weak passwords) is a convenient bonus.
One to Many... But a lot of those Many are characters you are not going to type with your keyboard.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
Phew, close one, my password is MaryJane420.
Salting used to be very important to hide weak passwords from rainbow tables. Now that it takes mere minutes of GPU time to replicate a large rainbow table, salting doesn't buy you nearly as much as it used to. Assuming the attacker gets the salts along with the passwords, but in case of a breach you have to assume they do.
id put linkedin or any other social networking site on my usual password list where i wouldnt put ebay or PayPal .... what baffles me most is why no one is cashing in on the password gap the risk maybe? the impossibility to secure? so far i never heard of facebook, or microsoft or google being hacked , i wonder what happened to sony still i still suspect some inside penetration there who's gonna make a buck providing the world with that one simple login then? probably some indian or asian guy i guess
Free speech was meant to be free for all... how can anyone grow up in a nanny state ?
With all the talk these days about low-sodium diets, they just wanted to provide their users with a healthy alternative.
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
I checked my Linkedin password against the list to see if it was there or not. It turns out it wasn't but I thought I'd change it anyway to be on the safe side. I checked my 2nd, 3rd and 4th choices on the leaked list before changing. All of those were on the list. It surprised me a little because they're not common passwords or particularly obvious. I guess it shows if you get a large enough group of people together, similar though processes are going to lead to similar passwords.
Somewhere, an astute, MBA-sporting LinkedIn user just proudly placed a salt shaker atop of their post-it note of passwords and smugly called it a day.
Don't use his md5crypt and think you did anything useful he says. http://phk.freebsd.dk/sagas/md5crypt_eol.html
Imagine if you weren't allowed to use roads because a bus company complained about your driving 3 times. --skunkpussy
Salting does not make brute forcing one password more work. It does make bruteforcing a list of passwords more work, however.
Both the summary and article are being stupid about the reason for salting in hashed passwords. It's main benefit isn't hiding two same password. It's main purpose is to make brute force much more work,
Yes, but you should also mention that salts with a large amount of entropy also protect against Rainbow tables and other forms of pre-computed hashed passwords. Make sure you have enough entropy in your salt(128 bits is very high) to prevent these kinds of attacks.
I'd recommend a randomly generated salt for each password, and not based on some user details. This guarantees a large amount of entropy in the salt. Some people also recommend an added site wide salt as well that's not stored in the same place as the password (embedded in the code for instance). This might increase your security a bit, but it's going to cost you quite a bit in added complexity.
AccountKiller
So a useless site is easily cracked for people who use poor passwords of insufficient length such that they would have the same hash codes. Not a big deal.
Even though salting makes it "much more work", your algorithm could be not CPU(GPU) intensive enough. That's the largest flaw in most systems, and that includes, like the author of MD5crypt stated, too computationally simple to break. So on one simple box, even when salted, we're talking about 2 days time to crack MD5crypt to brute force and 8 character password (probably less with better hardware). Without the salt of course, I suspect that all 6.5M accounts were cracked that day (especially if they can scale it to say 50 ordinary boxes). Use Blowfish or Twofish, and it would take years to even do brute force as each calculation takes in the tenths of seconds (given 12 rounds or greater).
Or perhaps the hash method used should make it so it's not possible to guess the length of the password, filling in the password with some collection of characters after password entry. What I mean is, suppose the maximum password length is 128 characters. The user chooses "12345" as a password.
The server takes "12345", then adds "The quick red fox jumps over the lazy brown dogs" over and over until the length reaches 128 characters. So even though from the user's perspective, his password is "12345" the server regards the password as
"The quick red fox jumps over the lazy brown dogsThe quick red 12345 fox jumps over the lazy brown dogsThe quick red fox jumps ove" (if I counted it right...)
and then it encrypts/hashes that, and ends up having your password hash as something like:
"Uj&8Anv0 H6Hfg*ggYly54jbt0Gy0tygvulTY78Jbyi~Mbi=6Bt8p,mYUILMN$InQRup($ylp56vcd%^OKHtOJG7690>J%^^7D3w43swazzZA"
The portion that is the users password is located in the middle, or something similar.
When the user with the "weak" password of "12345" goes to log-in, the password he types in is then hashed to the 128 character garbled string...
he only sees " ***** " in the password field, but what is stored on the server is the long string. Now the customer doesn't have to bring his own strongbox to the bank.
Salting only protects you from precomuted "rainbow" brute force methods which means if you have a big enough table your password is cracked in seconds to minutes rather than oh I don't know what is the average for your typical password? Hour, day..two days? week tops...? Does this difference really mean anything substaintial to the vicitim?
Now I may be wrong, but that would only be the case if the salt was stored with the hashes, correct? Which to me seems rather dumb (from a security perspective, not a performance one). To maximize the benefit of salts, the password hashed and their associated salts should be stored in two different databases, running on different servers so that a hacker would have to compromise both to get access to the list. Lock down the Salt DB server so that's it's only able to communicate with the Hash DB server (and nothing else) and will only return one hash request at a time to it.
Experience: Director if IT Security for a large Social Media Product.
Objective: A position where I can apply my experience and lessons learned in a fast paced and dynamic environment, targeting a large customer base. ::Hired:: By Facebook.
Huh?
I suggest all premium users, who have paid for the service in some way or another, request some for of discount for a while. Any service that that effectively holds the keys to part of your public identity doesn't understand and implement basic credentials security (especially after the few high-profile cases that have hit in the few years prior to this incident) is simply not fit for purpose.
It is one-to-many, not one-to-one. It is unlikely that you will see a hash collision with the same password string.
Fail. Hashes are a many-to-one mapping, not one-to-many.
Your not really adding security by splitting it up, a well protected server that only stored uuid and salted hashes would do the same.
No sir I dont like it.
The file going around is simply a pile of hashes, no logins. Did the crackers get both? Or did they just get the hashes?
Because the hashes may as well be piles of random data if you can't pair them with a login.
I have not heard a peep about the logins themselves. Are we just assuming they were taken?
--
BMO
It's called a password manager. KeePass is a nice one. There are many others. How many passwords are so important to you that must internalize all of them? For me, the answer is "very few". Never reuse. Never recycle.
Still, you're right that passwords are unideal--a PGP-like solution would be better. Even done poorly, all they could leak would be the information that you have an account. But if you stored a different PGP-key for each site in Keepass, then they couldn't even do that.
Insert self-referential sig here.
Huh?
sha1sum blah = 5bf1fd927dfb8679496a2e6cf00cbe50c1c87145
sha1sum 5bf1fd927dfb8679496a2e6cf00cbe50c1c87145 = 9dbc5291ce0269baafbf68a6346b5510a1b30860
sha1sum 9dbc5291ce0269baafbf68a6346b5510a1b30860 = caa4e37211fda75d91b9bf3231f9fbeb434d56e8
[...]
Obviously, linkedin is learning fast from the publicity here, but that doesn't stop other companies from sending emails with your password if you click "forgot password". (*cough*T-Mobile*cough*) I deny that proper password security is either hard or expensive. What we really need is a good way to shame companies into doing it right. Preferably one that doesn't involve leaking a few million passwords.
I was implying using a different/random salt value for every individual password. It is my understanding that it would be impossible to brute-force a password for a salted hash whose salt is not known. You can get the original value passed to the hash function but not the actual password. You would then have to try and deduce the salt. If this is true then separating the hash from the salt does add security.
These passwords can still be brute force at todays mega ridiculous n core, GPU accelerated rates at extremely low cost.
Go ahead, try bruteforcing high-cost bcrypt.
For the sake of argument assume a database of millions of salted SHA-1 passwords was compromised. What is the effective difference?
Sorry, but this is just silly. Salting takes you from needing trivial effort to crack a sizable percentage of a password pool (using rainbow tables) to needing substantial brute-force effort just to crack a single password. It takes you from having obvious targets for additional brute-force attack (password hash collisions) to having none.
If you're an evil hacker and someone gives you a corpus of unsalted password hashes that's an instant goldmine in compromised accounts. If someone gives you a corpus of salted password hashes? Who's going to spend the time brute-forcing each individual password? You're better off finding an easier target.
So yeah, there's a freaking HUGE difference between a database of millions of salted passwords being compromised and a database of millions of unsalted ones.
Actually, it does a few things:
* It makes rainbow tables ineffective.
* It prevents identification and lookup of common passwords (e.g., by pasting the SHA-1 hash into Google)
* It make cracking a list of N passwords a factor of N harder.
The last is fairly important for large database leaks. A single password is no harder to compute if you add salt, but if you're brute-force cracking against the entire password list (which is what you want to do), it's a factor of a million slower because you can't just compute the SHA-1 for a single test password and compare it against every entry in the database; you need to compute the SHA-1 for each (password, user) combination. That's a substantial slowdown, and cracking against a large leaked database is a much more common way of passwords being exposed than a concerned cracking effort against a single account.
Salting only protects you from precomuted "rainbow" brute force methods which means if you have a big enough table your password is cracked in seconds to minutes rather than oh I don't know what is the average for your typical password? Hour, day..two days? week tops...? Does this difference really mean anything substaintial to the vicitim?
To a single victim no. But you've taken the complexity of the problem from hacking a password to hacking 6 million passwords each taking as long. So to the 'average' victim your expectation time before being hacked is 3 million times longer, on a 6 million element set, than one that wasn't salted. There are just over 31 million seconds in a year. So even if your password takes 1 second to crack you've still bought yourself about a month. If it takes a day... well you'll be dead before it's brute forced so who cares.
I only expect to live exactly 50 more years. So if it takes 525.95 seconds to crack my password I have a 50/50 chance of being dead before they manage to brute force it under salting (at 525.95 seconds per they will be able to hack about 3 million in 50 years, and the other 3 million in the next 50 years). If on the other hand it takes 525 seconds for the first one, and small fractions of a second for the next 6 million, I could reasonably expect to have been compromised already.
A different random salt per hash is expected (otherwise it's nearly useless). In the good better best of security you have salted hashes stored along with everything else, a box that gets sent the password and uuid (or whatever unique identifier is used), and a box that is sent a uuid and random string to hash with the password and the client does that same. Splitting up the hash parts onto two separate servers does very little as they are both involved and one have to send it's part of the hash to the other to work it out, so one of them at some point with halves the two halves anyways and you can game the system to get it to have the ones your interested in at that time.
Honestly web sites need to get past mucking with passwords, there are numerous better ways to authenticate, that do not require individual web sites to have any sort of password on file. This makes otp, keyfobs, smartcards etc much more palatable and does not require anything extra on the web site. Sites need to learn that handling passwords is a liability they should try and avoid. Users can pick whoever they want to handle there identity including themselves if they are technically inclined. Sure it's not perfect and can still be gamed but little in security is perfect.
No sir I dont like it.
Actually, they didn't miss just that, they missed the entire story. Honestly, who wrote this? The article SHOULD have said that because they're unsalted, you can get ALL the passwords up to 22 characters turned back into their original text extremely quickly without brute forcing anything. There's no comparison or anything necessary. If they were salted, THEN you'd have to compare them and brute force them. Otherwise you run them all through a rainbow table database on a system with a buttload of RAM and tada, you can reverse them at a very high rate of speed.
So, I already changed my linkedin password.
This point about making password hashing take a whole second is critical. For some reason, the geniuses in charge of our Internet security are totally screwing this up. Take my GPG private key, for example. I have to encrypt the damned thing with a humongous pass phrase, because some dork, who you would expect to know something about security (having written GPG), didn't bother making the private key's decryption key a compute intensive hash of my pass phrase. This allows attackers to do efficient brute-force attacks on our PGP private keys. They don't even have to run the compute intensive RSA algorithm... they just do AES with their password guess and detect if it's correct instantly, because GPG goes to the trouble of providing an instant "Your guess was correct" feature. Worse, sites like github not only allow, but require these insecure PGP keys to be used to log into their systems, even though they have no control over what, if any, password is protecting the user's private keys. I just got a private key from a rather good IT guy recently, and the file was named "ThePasswordIsFoo"! I used to encrypt my freaking OpenVPN keys with a password designed to give you an RSI injury, but the new IT policy is to leave it unencrypted on every Windows machine! And I thought our old SSH on port 22 was scary (actually... that is pretty damned scary). This same problem is true of encrypted documents. The most secure document encryption software I currently know of is my own stupid little tinycrypt algorithm. Do you use encfs, or TrueCrypt? Don't count on any CPU intensive hashing to protect your password. Brute force away!
Instead of leaving your documents wide open to brute force attacks like GPG using Blowfish or AES, it uses the old broken WEB encryption algorithm, RC4, with only a tiny mod: it drops the first 768 bytes of the key stream, which last I checked means no one knows any useful attack against it. It salts the encrypted file with a random 20 byte nonce, and requires about 1 second to compute the hash of the user's password used to do the encryption. Why should average non-security geeks like me have to write our own encryption software for documents, and live with 20+ character passwords on everything that uses GPG? Couldn't we at least count on GPG to get it right?
On top of all that, I run freaking Windows, meaning that I have no clue what software is sniffing for my passwords. I'm careful about what I install, but I installed ZIP, for example, a clear nasty piece of spy/annoy-ware, and Skype, one of the worst security leaks in history. God forbid that I install a free game or anything really stupidly dangerous. I sure hope all those laptops employees carry back and forth from home don't have any dangerous exe files installed :-P A couple friends who actually study security technology rave about the security tech in Windows. But I still press the "Yes" button on that freaking "Allow this program to make changes to your computer" dialog multiple times a day.
What a mess! So, let's drag freaking Linked-In over the coals! Then, let's have a quick look in the mirror. GPG, hears looking at you! Windows... let's not even go there. Mt Gox... really?
Celebrate failure, and then learn from it - Nolan Bushnell
Hmm, funny, that's exactly how I described my system last wednesday in an earlier thread on the subject. The added complexity is not a problem at all, the extra password is only stored in the authentication server (which runs on a different system than the database contaning the hashed passwords.
Think about it. 6 million unsalted password hashes without matching use data. If this is real password data, how big is the chance that someone would find their password in there?
Perhaps as big as the chance that you get a Google hit when you search for your password?
AFAIK all we have is:
- Someone posting a list claiming it's from LinkedIn
- Some people confirming that the hash of their LinkedIn password is on that list
That doesn't really prove anything, right?
- People tend to pick similar passwords
- People use the same password on different sites
I read this in some blog, but I already had my doubts then.
I guess you don't use terminal type 3270 too often. Control E is often configured to clear your field from the cursor to the end. So your password would be saved as what you typed minus the Ctrl-E... Unless you intentionally go back a few characters and do the ctrl-e last.
Back in the olden days of DOS my password often had Alt-219 atl-176 alt-177 and alt-178
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
Who's going to spend the time brute-forcing each individual password? You're better off finding an easier target.
So yeah, there's a freaking HUGE difference between a database of millions of salted passwords being compromised and a database of millions of unsalted ones.
Your making some assumptions:
1. Cracking a single password has no value.
2. The value of every user is equal.
3. Out of a space of millions just running well known/shit passwords at a cost of 1min/core would not easily yield many thousands of passwords.
Compromising a site like Linkedin may pose little reward to an attacker however there may be huge payouts if the same password can be used to gain access to corporate or financial resources.
If the only threat was spam/phishing I would tend to agree with your assessment. In my view it is dangerous and unwarranted to make this leap.
it's a factor of a million slower because you can't just compute the SHA-1 for a single test password
That's a substantial slowdown,
I had already point this out myself. See my origional comment "Salting only protects you from precomuted "rainbow" brute force methods".
Whether salted passwords are worth compromising is a value proposition for the ATTACKER to decide. Maybe they have a botnet at their disposal sitting idle which can be tasked with the cracking initiative? I simply claim if one password is worth trying to brute force then a million passwords will yield a much larger ROI.
If your password cracker follows an algorithm where it exhausts the entire keyspace on each password sequentially rather than scanning for high probability passwords across the entire space first then it might be time to upgrade.
You can still yield countless thousands of passwords easily with minimal effort with salted passwords. To be successfull you might not need to crack them all or even a majority.
Go ahead, try bruteforcing high-cost bcrypt.
Password amplification schemes are an interesting idea they seem to afford practical protection against offline attack for all but the most easily guessed of passwords.
To a single victim no. But you've taken the complexity of the problem from hacking a password to hacking 6 million passwords each taking as long
I agree.. In fact it is much much better than this as a rainbow table need only be computed once and used eternally for all subsequent cracking campaigns against passwords using the same algorithm. Storage is not free however.
. So to the 'average' victim your expectation time before being hacked is 3 million times longer, on a 6 million element set, than one that wasn't salted
I would caution against simple extrapolation. Password guessing is an activity with a long tail.
You will quickly recover many low hanging fruits initially. Only later will you see fewer and fewer returns as the amount of computation required to recover more difficult to guess passwords approaches unprofitable.
I only expect to live exactly 50 more years. So if it takes 525.95 seconds to crack my password I have a 50/50 chance of being dead before they manage to brute force it under salting
A static assumption of attacker resources seems unlikely to occur in the real world. For any of us know chunks could have been sold off to friends or unfreindly governments each with their own separate compute resources. It could have been distributed to a botnet with hundreds of thousands of nodes or attacked by custom asics.
It also depends on the market value of a successfully compromised account. It may be out of the 6 million only a few thousand accounts are actually worth anything or worth much more than others.
I'm not so sure safety in numbers is a safe bet for everyone... I don't see how it is possible for a potential victim to know what their odds are in the first place as this would require them to know who has the passwords, what recovery resources they have and what they plan on doing with them.
Of course to all of your points.
I was being illustrative of what salting does not a technical analysis of compromising passwords.
it was sort of obviously contrived, as passwords are never going to take exactly 525.95 seconds, and I'm not going to live exactly 50 years (now minus 2 days).
You might not recover low hanging fruits quickly with salting. It depends on the salting, if it just adds a couple of alphanumeric characters it's not adding a great deal, but a good salting algorithm could take a trivial password 1234 and turn it into #1_2_June*10_sqrt((to 8 digits)asciicodeof(3))@accountname4.
I had already point this out myself.
No, you said, "[s]alting only protects you from precom[p]uted "rainbow" brute force methods". I agreed that this was one thing that they protect you against, but it is not the only thing. I listed three things, one of which was rainbow tables.
The "factor of a million slowdown" is referring to the third item I listed. If you have a list of N possible passwords and K password hashes, the total cost of testing the passwords is O(N) for unsalted passwords and O(K * N) for salted passwords. This is because when you compute the hash for one of the passwords, you must use one of the salts. So, you have to compute K different hashes, one for each salt, whenever you test a single password. This is not the same as protecting you against rainbow tables, which are effective regardless of the number K.
The "factor of a million slowdown" is referring to the third item I listed. If you have a list of N possible passwords and K password hashes, the total cost of testing the passwords is O(N) for unsalted passwords and O(K * N) for salted passwords. This is because when you compute the hash for one of the passwords, you must use one of the salts. So, you have to compute K different hashes, one for each salt, whenever you test a single password. This is not the same as protecting you against rainbow tables, which are effective regardless of the number K.
I don't see any difference between a "rainbow table" and taking the fruit of a hash operation and searching the space of K hashes for duplicates. It is logically the same operation although technically there are likely to be differences in implementation. The only distinction is in the weeds of who computed what when.
It is not really O(N) because searching K space may be cheap but it is not free the same way searching a precomputed table is not free.
When I say rainbow tables don't work on salts it implies they also don't work when you carry out the same logical operation and call it something else.
They're not the same operation at all. A rainbow table isn't just some precomputed lookup table. It enables you to do a reasonably space-efficient partial precomputation of hashes, giving you a factor of X speedup in reversing a hash.
Notably, the speedup a rainbow table provides is the same regardless of whether you're cracking one password or N.
The slowdown due to users having different salts is always a factor of K, where K is the number of users.
It is not really O(N) because searching K space may be cheap but it is not free
It's not free, but (a) an efficient implementation of searching the space K is constant-time and (b) every other password-cracking operation is so much cheaper than SHA-1 that they might as well be free. So yes, it's even technically O(N) since the K-space search is constant in K.