GitHub Accidentally Exposes Some Plaintext Passwords In Its Internal Logs (zdnet.com)
GitHub has sent an email to some of its 27 million users alerting them of a bug that exposed some user passwords in plaintext. "During the course of regular auditing, GitHub discovered that a recently introduced bug exposed a small number of users' passwords to our internal logging system," said the email. "We have corrected this, but you'll need to reset your password to regain access to your account." ZDNet reports: The email said that a handful of GitHub staff could have seen those passwords -- and that it's "unlikely" that any GitHub staff accessed the site's internal logs. It's unclear exactly how this bug occurred. GitHub's explanation was that it stores user passwords with bcrypt, a stronger password hashing algorithm, but that the bug "resulted in our secure internal logs recording plaintext user passwords when users initiated a password reset." "Rest assured, these passwords were not accessible to the public or other GitHub users at any time," the email said. GitHub said it "has not been hacked or compromised in any way."
How can a clear text password be available to them at all to record it in a log?
We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
the bug "resulted in our secure internal logs recording plaintext user passwords when users initiated a password reset."
"We have corrected this, but you'll need to reset your password to regain access to your account."
Er... are you really sure that this has been corrected?
Ask me about repetitive DNA
I guess http basic auth over TLS.
The connection is encrypted using TLS but the password is transferred in the clear (base64).
Don't use basic-auth.
If you absolutely have to, use an application specific password with restricted rights.
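To illustrate the "(base64)" point above: base64 is an encoding, not encryption, so anyone who sees the header value (for example in a server-side log, after TLS has been terminated) can recover the password without a key. A minimal sketch; the username and password are made-up demo values:

```python
import base64

def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

def decode_basic_auth(header_value: str) -> tuple[str, str]:
    """Recover username and password from the header value; no key is needed."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic auth header")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password

header = basic_auth_header("alice", "hunter2")  # demo credentials
print(header)
print(decode_basic_auth(header))
```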
Okay, so for those who do not know quite how password storage works there are 3 parts to it
1) the original string of password text
2) salt - this is a random string used to create randomness in what is to come
3) you feed the string and the salt into an encryption algorithm like sha512, which produces a HASH; this is what gets stored
Now for the fun part: the original string of password text is discarded, NOT STORED ANYWHERE. When you need to check a password, you feed the salt and the string into the encryption algorithm and check that the HASH is the same as the hash stored in the database.
If GitHub was actually storing plain text passwords...the pure amateur stupidity, I mean who to trust to be competent at a certain point beyond yourself?
you feed the string and the salt into an encryption algorithm like sha512, which produces a HASH; this is what gets stored
Argh!
No!
NO!!!
NO-NO-NO-NO!!!!
DO NOT USE HASHES ! (like Sha512).
These are designed to be *fast* (1), meaning that an attacker could plausibly guess the password from the hash simply by brute-forcing the most common passwords, and variations thereof, with the same salt and checking whether they match.
(1 - And remember that the "terahashes" that ASIC bitcoin miners report are exactly that: trillions of SHA256-like computations per second.)
USE KEY-DERIVATION FUNCTIONS (KDF) INSTEAD !
Like the Bcrypt used by GitHub, as mentioned in the summary. Or Scrypt (the same one used by tarsnap). Or Argon2, etc.
These also produce a value out of a password and a salt, but they are extremely slow on purpose (e.g., by repeating a hash function over and over for a high number of iterations).
If each computation takes some time, it doesn't impact login that much (after all, you only need to log in once at the beginning of your session), but it hinders anyone wanting to brute-force your password out of a stolen hash.
It makes data breaches that managed to steal your user database a lot less dangerous (because once you have successfully guessed the password from the hash, the next step is to see all the other places where the user has re-used the same password).
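As a rough sketch of the idea, here is salted password hashing with scrypt from Python's standard library (hashlib.scrypt, which needs a Python build linked against OpenSSL 1.1 or later). The cost parameters and function names are assumptions chosen for illustration, not GitHub's actual setup:

```python
import hashlib
import hmac
import os

# Assumed cost parameters for illustration only; tune them for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); both get stored, the password itself does not."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the slow KDF with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)

salt, digest = hash_password("hunter2")  # demo password
print(verify_password("hunter2", salt, digest))
print(verify_password("wrong guess", salt, digest))
```

Each call is deliberately expensive, which is barely noticeable for one login but adds up quickly for someone trying billions of guesses.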
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
GitHub has been going down the drain since the day they proudly threw out their Meritocracy rag and focused on the Social Justice Evangelism rag.
So to see the substitution of people with talent and pure interest in coding, by random idiots who are only interested in what was meme'd to them in the inclusion program, result in a general degradation of quality is nothing new.
OMG, like how can the server ever know the password in the first place?
From the summary:
the bug "resulted in our secure internal logs recording plaintext user passwords when users initiated a password reset."
p.s. You're making a fool of yourself tonight / this morning, Opportunist. You should go (back?) to sleep.
This still means that they are doing it wrong.
So how is this "random salt" recovered when you need to check the password's validity?
It's stored alongside it in the database.
Most stored passwords have a form like:
${type of algorithm used}${parameters used}${data}
where:
- "type of algorithm used" tells you what was used to generate this (e.g., Bcrypt, which GitHub uses, as mentioned in the summary).
- "data" is the actual salted output that you need to replicate to successfully log in.
- "parameters" is any extra data that the algorithm needs to generate password checks.
Like the salt.
Or like the number of iterations. Because nobody sane actually uses a hash function such as SHA512 anymore. Instead you use a Key Derivation Function (KDF) such as Bcrypt (or Scrypt or Argon2), and those are *slow* on purpose, to make brute-forcing much less feasible (e.g., they slow things down by repeating a hash for a large number of iterations).
The exact implementations vary (the above is typical of the "crypt" function used, e.g., for Linux log-ins),
but they are basically the same: the salt (and iterations) are stored together with the "hash" that you need to test.
And most KDF functions can work as "hash_to_compare = KDF(password_login_attempt, old_hash_from_database)", i.e., they can automatically extract the parameters if you give them the string that is in the database, and generate the hash the exact same way.
They'll invent a new salt (and guess the optimal number of iterations) only if you omit the old hash and give the new password as the single parameter.
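A minimal sketch of that "everything in one string" idea, using PBKDF2 from Python's standard library. The record layout and the names make_record / check_password are hypothetical, loosely modelled on the crypt-style format described above rather than on any real library's exact output:

```python
import base64
import hashlib
import hmac
import os

def _b64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")

def make_record(password: str, iterations: int = 200_000) -> str:
    """Create a new record: a fresh salt is invented because no old record is given."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Hypothetical layout: $scheme$iterations$salt$digest
    return f"$pbkdf2-sha256${iterations}${_b64(salt)}${_b64(digest)}"

def check_password(attempt: str, record: str) -> bool:
    """Extract scheme, iterations and salt from the stored record, then recompute."""
    _, scheme, iterations, salt_b64, digest_b64 = record.split("$")
    if scheme != "pbkdf2-sha256":
        raise ValueError(f"unsupported scheme: {scheme}")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    candidate = hashlib.pbkdf2_hmac(
        "sha256", attempt.encode("utf-8"), salt, int(iterations)
    )
    return hmac.compare_digest(candidate, expected)

record = make_record("hunter2", iterations=1000)  # low count just for the demo
print(record)
print(check_password("hunter2", record))
```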
OK, that's naughty and needs fixing, but it's internal logs, did it need a slashdot story?
Kanye lost his mind ever since his mom died. He's clearly become completely ego driven and is constantly trying to fill the void his mom left.
This is the right way to deal with a security cockup. Notify the users and explain. They could very easily have swept this under the carpet.
I say well done, GitHub
Google "Secure Remote Password protocol (SRP)" and implement that.
Your password never leaves your computer. Instead you sign a challenge, and the server validates only you signed it.
Basic auth is an HTTP header, and HTTP headers are just as protected by TLS as response headers and bodies. Otherwise, HTTPS would be ineffective against Firesheep-style attacks that clone a session cookie. The other common means of authentication is submitting a password that has been entered into a field of an HTML form as part of an HTTP POST request body. What's any more "in the clear" with HTTP basic authentication than with the form route?
And in case you believe both forms and basic authentication ought to be replaced, what other means would you prefer? I can think of three, each with serious drawbacks:
- HTTP Digest authentication: This does hashing using a random initialization vector. However, it requires the server to store the password rather than only an irreversible hash for verification.
- Some zero-knowledge proof means: Because this is not built into the HTML5 standard, it requires running script in the browser. Though web browsers by default run all scripts, many users change this for security and data cap reasons. Extensions exist to restrict script execution to a domain whitelist (JavaScript Switcher), a fine-grained whitelist (NoScript), or only those scripts whose source code is machine-readably available to the public under a free software license (LibreJS). Some go so far as to regularly browse the web with all scripts turned off.
- Client certificates: TLS supports the use of a client certificate that identifies a user, which is exactly analogous to key-based authentication in SSH. However, browser publishers have thus far given no significant attention to usability of common use cases, such as choosing the right client certificate for a particular origin, synchronizing client certificates across devices that a user uses, or even something as simple as logging out.
A better question is why doesn't the HTML standard for password fields allow automatic hashing with a custom salt?
It does; it's called digest authentication. But depending on how digest authentication is implemented, it is vulnerable to one of two attacks. If the realm portion is fixed, digest is vulnerable to a replay attack that passes the hash. If the realm portion is variable, it requires the server to store the unhashed password. In addition, digest authentication still uses MD5, which is deprecated and whose immediate successor (SHA-1) is also deprecated.
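For reference, the digest calculation can be sketched as below. This is the original form of HTTP Digest without the later "qop" extensions, and all the input values are made up. Note that the server has to know either the password or the intermediate HA1 value, and HA1 works as a password equivalent for that realm, which is the storage problem mentioned above:

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Compute the response a client sends for a given server nonce."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # what the server must be able to derive
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")

# Demo values only; a real server issues a fresh nonce per challenge.
first = digest_response("alice", "example", "hunter2", "GET", "/repo", "nonce-1")
second = digest_response("alice", "example", "hunter2", "GET", "/repo", "nonce-2")
print(first)
print(second)
```

The password never appears on the wire, and a different nonce gives a different response, so a captured response is of no use against a later challenge.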
Dear lord just saw him on TMZ Live, I did not think he could get any more insane but I was proven wrong, way wrong
ntr
Is 27 million a small number?..
In Soviet Washington the swamp drains you.
Logging a password is a beginner's mistake, like SQL injection. I found the same bug in unreleased code many years back, and raised it to management so we could track down the engineer who did it. It's the kind of (cough) mistake that can be the "straw that broke the camel's back" when dealing with an engineer who has (cough) "negative productivity."
Ideally, this kind of bug should be caught in code reviews. As someone who reviews a lot of code, even I'll admit that it's possible for something like this to slip through.
No, I will not work for your startup
There is almost no good reason to save plaintext passwords to permanent storage or, for that matter, to keep them in transient storage for more than a few seconds.
Barring a legal reason or the clear knowledge and permission of the password's owner, the only time cleartext passwords should ever appear in a log is if they get caught in a diagnostics log. For this reason, the ability to run sensitive diagnostics and see the results needs to be controlled, and people need to think first before running them and dispose of the sensitive parts of the log as soon as possible.
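One way to reduce the chance of this kind of slip is to scrub known-sensitive fields before a log record is ever formatted. A minimal sketch with Python's logging module; the field names and the logger name are hypothetical, and this is not a description of how GitHub's logging works:

```python
import logging

# Hypothetical names of request fields that must never reach a log.
SENSITIVE_KEYS = {"password", "new_password", "token"}

def redact(params: dict) -> dict:
    """Return a copy of params with sensitive values masked."""
    return {
        key: ("[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else value)
        for key, value in params.items()
    }

class RedactingFilter(logging.Filter):
    """Mask sensitive values in dict arguments before handlers format the record."""

    def filter(self, record: logging.LogRecord) -> bool:
        if isinstance(record.args, dict):
            record.args = redact(record.args)
        elif isinstance(record.args, tuple):
            record.args = tuple(
                redact(arg) if isinstance(arg, dict) else arg for arg in record.args
            )
        return True  # never drop the record, only scrub it

logger = logging.getLogger("reset-demo")
logger.addFilter(RedactingFilter())
logger.warning("password reset request: %s", {"user": "alice", "new_password": "hunter2"})
```

A filter like this only catches fields it knows about, so it complements code review rather than replacing it.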
I got the email.
I was impressed that it was handled quickly.
I'm even more confident because I actually use a proper password manager making sure I have unique passwd's for everything.
I give them credit -- here they found their own security issue before it became a breach, they fixed it, and they didn't sweep it under the rug but instead they notified their users. Kudos for being forthcoming.
This at -1 is moderation abuse.
PBKDF2 uses SHA variants in its iterations.
Despite "Shattered", it's not "broken" yet.
There are just better, more modern KDFs (like the Bcrypt used by GitHub, the Scrypt designed for use in tarsnap, or Argon2, which is the latest competition winner) that don't have PBKDF2's shortcomings (e.g., the collision between a long input passphrase and its SHA-1).
Regarding "Shattered": you have to understand its context.
SHA-1 has been known for quite some time to be less secure than it could be (160-bit SHA-1 does not reach the security level its output size would suggest).
(That is the main reason why SHA-2 was developed and is now widely used in cryptography, and part of the reason why SHA-3 was recently developed through a competition, the other reason being that SHA-3 / Keccak also introduces some novel, interesting concepts.)
Because of this, it was widely speculated that a collision could be found.
A team of security researchers spent massive resources (lots of computation time) searching for a collision (not brute-forcing SHA-1's whole output space, which would not be feasible in any reasonable time, but cleverly exploiting the known limitations and vulnerabilities of SHA-1 mentioned above).
After spending a considerable amount of time, they managed to create two different blocks of (complete nonsense, random) data that happen to hash to the exact same value.
It's not that they can generate collisions on a whim; they can generate a collision at a tremendous computational cost (but still an achievable cost, unlike searching the whole output space), and thus far they have managed to generate exactly 1 such collision.
Also, due to the block-iterative way SHA (and most other pre-SHA-3 hashes) operates, you can stick one block in a file in a specific way and get the same hash as if you had stuck the other colliding block in the otherwise identical file.
That severely limits the possible uses of this collision. You need a situation where you can store arbitrary noisy binary data, and have a program that can react to the presence of one or the other piece of data.
Currently, the only successful demo of Shattered is in a PDF file, because PDF can store arbitrary blobs (e.g., used for storing bitmap data for illustrations, fonts, etc.) and the PostScript language used in PDF is Turing-complete (some people are even writing ray-tracers in PostScript).
So you can craft a special PDF that hashes to the same SHA-1 sum but whose PostScript will generate two different documents, depending on which of the two collision blocks is stored in the blobs.
It's pretty limited in practical use.
In PBKDF2, it means that you can have two long passphrases that will generate the same SHA-1 in the first round of PBKDF2 (so you have a triple collision: both long passphrases containing the 2 blocks of Shattered, and their SHA-1 sum).
But the exploitability of such a situation is quite limited (complex scenarios like an oracle giving out passwords, and Eve secretly colluding with the oracle, so the oracle gives two provably different passwords to Alice and Eve (e.g., if they compare the SHA256 or SHA3 of the passwords, they are different), but Eve can use her password to unlock Alice's stuff. And vice versa).
So :
TL;DR: Shattered isn't affecting PBKDF2 directly that much, but people have moved to more modern KDFs anyway, because they are better.
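The simpler PBKDF2 shortcoming mentioned earlier (a long passphrase colliding with its own SHA-1) does not need Shattered at all and is easy to reproduce. HMAC first hashes any key longer than the hash's block size (64 bytes for SHA-1), so PBKDF2-HMAC-SHA1 cannot tell a long passphrase from the raw SHA-1 digest of that passphrase. A small sketch with made-up values:

```python
import hashlib

salt = b"example-salt"   # arbitrary demo value
iterations = 1000        # kept low so the demo runs quickly

# 87 bytes, i.e. longer than SHA-1's 64-byte block size.
long_passphrase = b"correct horse battery staple " * 3
# The 20 raw digest bytes act as a second, different "password".
short_equivalent = hashlib.sha1(long_passphrase).digest()

key_a = hashlib.pbkdf2_hmac("sha1", long_passphrase, salt, iterations)
key_b = hashlib.pbkdf2_hmac("sha1", short_equivalent, salt, iterations)

print(long_passphrase != short_equivalent)  # the two inputs differ
print(key_a == key_b)                       # but the derived keys match
```

As with Shattered, this is more of a curiosity than a practical attack, since finding the second password requires already knowing the first.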
Don't tell people not to use hashes. The next thing they'll think is "Oh, I should use plaintext instead".
A key-derivation function is also insufficient, since the output is only as strong as the input. Meaning if you have a 10-bit password, the resulting KDF strength will still only be 10 bits.
You must use a Password-Based Key Derivation Function. A PBKDF can add ~10 bits of security to a password. So if a user gives "password" as their password (2 bits), the resulting hash has ~12 bits of security.
Wonder what the public key field is for?
seriously, bcrypt is RIGHT FUCKING THERE and it creates an opaque string with both the salt and the hash.
AGAIN: SECURITY 101 -- databases SHOULD NEVER STORE PASSWORD PLAINTEXT
I've even had shitty devops tell me "It IS encrypted" when all they did is run it through PHP Mcrypt and store the fucking PASSWORD in the PHP CONFIG FILE.
Fucking unbelievable.
I don't really see why people are so against hashes that they need to shout.
My main reason was comically over-exaggerated "hysteria".
The actual reason why people are against hashes, is a combination of three factors :
1) So guessing passwords out of (fast) hashes is completely doable for anyone with a little bit of resources (paying a tiny sum to rent GPUs in the cloud).
2) Just have a look at http://haveibeenpwned.com/ . Very often (though not always), attackers manage to get the password hashes. If you've been using a fast hashing function like SHA, guessing a significant proportion of the passwords is largely possible (as in point 1 above) at the cost of some cloud GPU rental.
3) We humans are stupid and tend to reuse passwords. Once you have successfully guessed a password from point 2, you can try to see if it unlocks the e-mail account associated with the account in the database, or any other account you can find online associated with the same email and/or username and/or real identity (depending on what the leaked db provides to you).
That last one gives you tons of social engineering and identity theft/impersonation possibilities to "profit!!!" from. So you can guess it is something that could happen in the wild.
---
(*) -- (when asked to follow password rules, humans will generally put the capital letter at the beginning, use 5-to-6 letters, then put 2-to-4 numbers, and the special character at the end; most of the time it will be "!". The number of combinations that follow these rules is vastly smaller than what "[A-Za-z0-9_!#@-]{8,16}" would imply)
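To put rough numbers on that footnote, here is a back-of-the-envelope count. The exact "human" pattern (one capital, 4 to 5 lowercase letters, 2 to 4 digits, a trailing "!") is an assumed reading of the description above:

```python
# Size of the character class [A-Za-z0-9_!#@-]
CHARSET = 26 + 26 + 10 + 5

# Every string of length 8 to 16 over that character class.
full_space = sum(CHARSET ** length for length in range(8, 17))

# Assumed "human" pattern: 1 capital + 4 or 5 lowercase, then 2 to 4 digits, then "!".
letters = sum(26 * 26 ** lowercase for lowercase in (4, 5))
digits = sum(10 ** count for count in (2, 3, 4))
pattern_space = letters * digits * 1  # the final "!" adds no choice

print(f"full space:    {full_space:.3e}")
print(f"pattern space: {pattern_space:.3e}")
print(f"ratio:         {full_space / pattern_space:.3e}")
```

The pattern space comes out in the trillions, which is within reach of the ASIC and GPU hash rates mentioned earlier in the thread when the hash is a fast one.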
Yes, bcrypt and similar are better and should be used. But I'd consider a hash, if properly used, still reasonably secure.
The vast difference is that bcrypt, scrypt and argon2 are designed on purpose to slow down brute-forcing and to make FPGA and ASIC attacks difficult (by using lots of iterations, and by requiring lots of memory).
Point 1 from the list above doesn't hold true anymore, so if the KDF's hashes get leaked as in point 2, you can't gain much from them.
By properly used I mean hash(hash(password + salt) + salt), where + stands for concatenation. Even better if it has some concatenated pepper, too.
You don't even need to remember that formula if you remember the letters "hmac"...
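For what it's worth, a sketch of what reaching for HMAC might look like in Python. HMAC is a standardised nested construction in the same spirit as the formula quoted above, not literally the same thing, and treating salt plus pepper as the key is an assumption made for this example. It is still a fast function, so the brute-force argument from earlier in the thread applies to it as well:

```python
import hashlib
import hmac

def keyed_digest(password: str, salt: bytes, pepper: bytes = b"") -> bytes:
    """HMAC-SHA256 of the password, keyed with salt (+ optional pepper)."""
    key = salt + pepper
    return hmac.new(key, password.encode("utf-8"), hashlib.sha256).digest()

# Demo values only.
stored = keyed_digest("hunter2", b"per-user-salt", b"site-wide-pepper")
attempt = keyed_digest("hunter2", b"per-user-salt", b"site-wide-pepper")
print(hmac.compare_digest(stored, attempt))
```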
For a typical /. geek who :
- generated a purely random string from /dev/random+base64 (good luck using patterns or common password lists on that !)
- and uses 1 different password for each typical site (no password reuse)
( - and uses a secure password manager to keep them organised)
- and has activated 2-factor auth (like Google Auth) on each website that supports it (so even if a password is somehow guessed correctly by sheer luck, it's not useful on its own).
Yup, salted hashes are good enough.
For the rest of the normal humans, the 3 points I've listed above are a real danger.
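The "/dev/random+base64" recipe can be approximated in a few lines of Python. This sketch uses the standard secrets module, which draws from the operating system's random source rather than reading /dev/random directly; the function name and the default length are arbitrary choices:

```python
import base64
import secrets

def random_password(num_bytes: int = 18) -> str:
    """Return a base64 string built from num_bytes of OS-provided randomness."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")

# 18 random bytes give a 24-character password with no "=" padding.
print(random_password())
```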
"It's unclear exactly how this bug occurred"... riiiiiiiight, git blame?