Slashdot Mirror


Analysis of 32 Million Breached Passwords

An anonymous reader writes "Imperva released a study analyzing 32 million passwords exposed in the Rockyou.com breach. The data provides a unique glimpse into the way that users select passwords and an opportunity to evaluate the true strength of these as a security mechanism. In the past, password studies have focused mostly on surveys. Never before has there been such a high volume of real-world passwords to examine." Most interesting to me was that in the sample, less than 4% used any non alpha-numerics in their #$#%'ing passwords.

2 of 499 comments

  1. Re:Given the sample set, is it a surprise? by Blade · · Score: 4, Interesting

    Until they break into your Facebook account and use that to socially engineer access to something else and escalate their way into something beyond that. Or they access your Facebook account and start taking guesses at the answers to the security questions you're forced to use (what school did you go to, what was your first pet called, etc., etc.)

    There are so many links between so much of what we do online that you would do well to treat it all as worth securing equally.

  2. Re:Password strength vs. how often you change it by epine · · Score: 4, Interesting

    technically an all lowercase password is just as secure as any other password

    You must have missed the bulletin which explains that security consists of becoming a less inviting target than the guy beside you. If the sheep tend to use all lower-case passwords (baaaaaa), then you're best off wearing a different cloak.

    it is probably also better to start all of your passwords with a 'z' since they tend to check in alphabetical order [citation needed]

    I thought script kiddies were all playing on the streets of the Facebook favela these days, and that unemployed Russian PhDs were out there flexing their combinatorics.

    From that training set, it would be pretty easy to code up a Markov letter bigram or trigram model and enumerate from least entropy on up (a near approximation to this is plenty good enough). My guess is that nine-letter all-lowercase passwords would be on roughly the same tier as six-letter passwords with multiple punctuation marks.
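
A rough sketch of the bigram idea, not anything from the study itself. It scores candidates rather than enumerating them, so an attacker would sort a candidate list by score. The names `train_bigram` and `surprisal_bits`, the 96-symbol alphabet, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical boundary markers, assumed absent from passwords

def train_bigram(passwords):
    """Count character-to-character transitions over a training set."""
    counts = defaultdict(Counter)
    for pw in passwords:
        chars = [START] + list(pw) + [END]
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts

def surprisal_bits(counts, pw, alphabet_size=96, alpha=1.0):
    """Bits of surprisal for pw under the model, with add-alpha smoothing.

    Lower scores mean the model finds the password more predictable,
    so those candidates would be tried first.
    """
    bits = 0.0
    chars = [START] + list(pw) + [END]
    for a, b in zip(chars, chars[1:]):
        seen = counts.get(a, Counter())
        total = sum(seen.values())
        p = (seen[b] + alpha) / (total + alpha * alphabet_size)
        bits -= math.log2(p)
    return bits

if __name__ == "__main__":
    # Toy training set; a real attack would train on the breached list.
    model = train_bigram(["password", "passw0rd", "pass123", "123456"])
    candidates = ["xq7#zk!v", "password", "pass1234"]
    for pw in sorted(candidates, key=lambda c: surprisal_bits(model, c)):
        print(f"{surprisal_bits(model, pw):6.1f} bits  {pw}")
```

Scores are only comparable between candidates of similar length, since every extra character adds bits.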

    This study was a bit stupid in reporting password strength. A nine letter password from two symbol sets will be close in strength to an eight letter password from three symbol sets, as long as the nine letter password doesn't build upon trivial substrings.
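
A quick check of that comparison, assuming every character is drawn uniformly at random and taking "two symbol sets" to mean lowercase plus digits and "three" to mean lowercase, uppercase and digits. Both readings are my assumptions, and human-chosen passwords will score lower than this upper bound.

```python
import math

def uniform_bits(length, alphabet_size):
    """Entropy in bits of a uniformly random string over the alphabet."""
    return length * math.log2(alphabet_size)

LOWER, UPPER, DIGITS = 26, 26, 10

nine_two_sets = uniform_bits(9, LOWER + DIGITS)             # about 46.5 bits
eight_three_sets = uniform_bits(8, LOWER + UPPER + DIGITS)  # about 47.6 bits

if __name__ == "__main__":
    print(f"9 chars, two sets:   {nine_two_sets:.1f} bits")
    print(f"8 chars, three sets: {eight_three_sets:.1f} bits")
```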

    I think this is why the recommendation demands three symbol sets: it gives users less scope to squander entropy that a longer, ordinary character password ought to have.

    One time, as a joke, a very long time ago, a devious coworker put a keystroke logger on a paranoid coworker and the password revealed was 6uldv8. Apparently there's more than one reason to keep your passwords secret.

    I generate all my own passwords starting from suggestions offered by OpenBSD's apg utility. For crap sites, I try to achieve an estimated entropy in the vicinity of 30 bits and scale up to about 60 bits at the paranoid end: 5*6 (a brief burst of line noise), 6*5, 7*4, 8*4, 9*3, 10*3 (baby talk).
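
Reading each pair above as length times estimated bits per character (my interpretation of the notation), the totals work out as follows. The helper name `estimated_bits` is made up for the example.

```python
def estimated_bits(length, bits_per_char):
    """Total estimated entropy for a password of the given length."""
    return length * bits_per_char

# (length, bits per character) pairs, from line noise down to baby talk
SCHEMES = [(5, 6), (6, 5), (7, 4), (8, 4), (9, 3), (10, 3)]

if __name__ == "__main__":
    for length, bits_per_char in SCHEMES:
        total = estimated_bits(length, bits_per_char)
        # 2**bits_per_char is the size of an equivalent uniform alphabet
        print(f"{length:2d} chars x {bits_per_char} bits "
              f"(~{2 ** bits_per_char} equally likely symbols) = {total} bits")
```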

    For longer passwords, you can pair two words from a large dictionary (about 13 bits entropy each) and then add another four bits with a single symbol corruption. Routinely sticking an ! in between two obscure dictionary words is not a good idea if you're concerned about cross entropy, where the attacker already knows some of your passwords by other means. I avoid consistent corruption templates, because I don't want to lower the cross-entropy on a set of partially exposed passwords too severely.
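
A minimal sketch of the two-word scheme, assuming you supply your own word list (8192 words gives 13 bits per word). The 16-symbol set and the function names are arbitrary choices for illustration. The corruption lands at a random position rather than a fixed template, in keeping with the point about cross entropy.

```python
import math
import secrets

SYMBOLS = "!@#$%^&*-_=+?/~."  # 16 symbols, an arbitrary illustrative set

def make_passphrase(words):
    """Join two randomly chosen words, then overwrite one random character
    with a random symbol."""
    chars = list(secrets.choice(words) + secrets.choice(words))
    pos = secrets.randbelow(len(chars))
    chars[pos] = secrets.choice(SYMBOLS)
    return "".join(chars)

def word_pair_bits(dictionary_size):
    """Entropy of two independent uniform picks from the dictionary."""
    return 2 * math.log2(dictionary_size)

if __name__ == "__main__":
    demo_words = ["quagmire", "zephyr", "ocelot", "bivouac"]  # toy list only
    print(make_passphrase(demo_words))
    print(word_pair_bits(8192), "bits from the words alone")
```

With a real dictionary, the words contribute 26 bits and the corruption adds a few more on top.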

    For most purposes, even 20 bits of entropy is a good start, if the attack involves knocking on the front door. Not so good if the hashed password file is compromised behind the scenes. Even 30 bits is pathetic in the latter case, but this is reasonably well mitigated by never sharing a password across multiple sites.

    At 40 bits, the attacker begins to ask whether there's any money involved. A high-end video card, properly coded, would sneeze at 40 bits. However, properly coded still isn't free.

    By the time you get to 50 bits, it's time to start asking whether you've seriously pissed off the wrong person. Quite doable, with a modicum of enmity, but not worth the bother if the game is shooting fish in a barrel at least expense. Armour piercing rounds are deployed sparingly.
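
To put rough numbers on those tiers: on average an attacker searches half the space before hitting the password. The guess rates below are placeholders I picked for illustration (10 per second at the front door, a billion per second against a stolen hash file), not measurements.

```python
def expected_seconds(bits, guesses_per_second):
    """Average time to find one password: half of 2**bits guesses."""
    return 2 ** (bits - 1) / guesses_per_second

ONLINE_RATE = 10     # assumed: throttled guesses against a login form
OFFLINE_RATE = 1e9   # assumed: fast unsalted hashes on one GPU

if __name__ == "__main__":
    for bits in (20, 30, 40, 50, 60):
        online_days = expected_seconds(bits, ONLINE_RATE) / 86400
        offline_hours = expected_seconds(bits, OFFLINE_RATE) / 3600
        print(f"{bits} bits: ~{online_days:,.1f} days online, "
              f"~{offline_hours:,.4f} hours offline")
```

Each extra 10 bits multiplies the work by about a thousand, which is why the tiers feel so different.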

    I wouldn't be the least bit surprised if the NSA has accumulated a dictionary of the trillion most common passwords, sorted by descending order of frequency, covering all languages and source lexicons of the world (pets, pet names, Klingon, Thalassian, Qenya) permuted into all manner of imposed password template schema. I'd be shocked if they hadn't. For that matter, Google could build a good approximation to that dictionary just using their lexigram index, on roughly the terascale.

    Shedding about 10 bits of protection per decade, we'll soon need to return to Beowulf era culture where reciting your ancestors back to the garden of Eden was the gold standard for accurate recall.

    I wish every login box on every site had a