Gravatars Can Leak Users' Email Addresses
abell writes "Gravatar offers a global avatar service, using an MD5 hash of the user's email as avatar ID. This piece of information in some cases is enough to retrieve the original email address. Testing a simple attack on stackoverflow.com, I was able to determine the email addresses of more than 10% of the site's users."
No it's not related to MD5 itself. period.
It would have been trivial for them to just add a secret salt string to the email before hashing, and that would have solved most of the problem. It is possible that they wanted to be "nice", in that in the case they go out of business, anyone can regenerate the ID's without them. But, as this guy has shown, that's not a great idea.
Here's my own Gravatar hash:
b835b33911b93c136d8e61cbbbe6736d
Who will be the first to crack it?
Can anyone tell me if the "you can add extra stuff after a +" that GMail lets you do is standard in the RFC for all email addresses? If it is, to "fix" this, if you should sign up to Gravatar with an email address using a random string after an added "+" the brute force search on hashes will be much, much harder. (Assuming that your email provider is implementing that part of the standard.)
The attack doesn't rely on MD5 itself or MD5 collisions. It would work no matter what hashing algorithm was used.
I disagree.
Granted, those are basically very unsophisticated databases that just store lookup values, but it's relatively easy to bruteforce an MD5 hash down into one of the possible original strings (obviously with any algorithm that has a fixed output size with limitless inputs like MD5 there are infinite inputs that will hash down to a single md5sum, but when you're trying to get a valid email address out of a hash it's easy to pick the right one). Couple that with the fact that in this situation, you know that the entire string is lowercased and probably 60% of the gravatar emails (probably more like 90% actually) are going to come from one of four or five domains... reversal becomes quite easy. If you're bored, you could spin up a few Amazon EC2 or Rackspace Cloud Server instances to dump out some large tables. One each for gmail, yahoo, msn, aol, whatever else; it'd be a very simple script to make. You could probably cover every alphanumeric email address under 12 characters overnight, at a cost of about a dollar and ten minutes of scripting.
The thing to realize here is that gravatar doesn't md5 emails to hide them from people who want to obscure their identity, just to obscure them from spambots. So it's really a non-issue. If you're that concerned, leave your blog comments with a fake email address.
How are sites slashdotted when nobody reads TFAs?
What I'm wondering is why this matters at all. A spammer would just send emails [your username]@[every common email domain]. Why would they bother to check if it's the correct address or not?
Not really, since the salt would need to be publicly known for Gravatar to work (and it would break any backwards compatibility to add it in now). This was a 'social engineering' attack, not a rainbow table lookup – it pieced the name together with common providers to find a matching MD5. Salt would just add a single extra step.
I believe it's exactly the same problem/attack as was brought up about MicroID in the past. The idea of Pavatar is a much better way to do this sort of avatar-finding (though the decentralisation comes with its own problems), since it relies on a public web address instead of a semi-private e-mail address.
Your email is: tyler.szabo _AT_ gmail.com
md5 -s "tyler.szabo@gmail.com"
MD5 ("tyler.szabo@gmail.com") = e9af4cb49c97162d6be3ea8c6ca90a46
For bonus points, your name is Tyler Szabo, you go to University of Waterloo and plan on graduating in 2011. You work at Amazon. You are in a relationship with a Kaylan Elizabeth L. (last name withheld as a courtesy, I'm sure you know who I mean :) ).
I found out you registered this, looked up your avatar on Gravatar, found you on Stack Overflow which gave me your real name (searched for Szabo assuming that was something to do with you). Using this, I looked you up on Facebook, Twitter, and various other sites. Your single avatar helped me link everything together. Once I had your real name from Stack Overflow it became easy.
Good times. Perhaps this reveals another security vulnerability? One avatar links -ALL- your social networking.
I also have your parents, previous employers, etc, but won't post those here :)
Email addresses are usernames. They are not secret information. If somebody can be bothered enough to find your email address through brute-forcing the MD5 hash of it; you've got bigger problems.
Far more than "10% of stackoverflow.com's users" can have their email addresses GUESSED far faster. Likely your email address is also FAR easier to establish through a simple Google search on your pseudonyms.
If you for some odd reason want your email address to be secret; for the same name as wanting a secret pseudonym or using a false name when signing up; register a fake email address instead (and set it up for forwarding). You're giving your email address in clear text to the site's owner and all the internet hops inbetween him and you ANYWAY.
It's important to learn to distinguish between what is a secret and what is not; and if you want to make things secret, at what level you should put your trust.
``OK, so ten out of ten for style, but minus several million for good thinking, yeah?''
Doubt it. there's 26 letters and 10 digits, in addition to that . is very common in email-adresses. Thus you get 37 possibilities for each position. 37 to the 12th power is 6582952005840035281 hashes to run, and even if you do 10^9 Hz (i.e. one giga-hash-a-second, which would require on the order of a few hundred cores), you'd still need 208 years to do that many hashes -- then you need to look up each of them in gravatar, and analyze the result for a hit-or-miss.
"every alphanumeric email-address under 12 characters" is infact much too large a keyspace to reasonably cover overnight with a "very simple script".
It's not a large enough keyspace to be cryptographically secure, but it's large enough to not be trivially exhaustible.
1) register as a website with gravatar, find out how long the salt is
2) register on stackoverflow with your email address
3) enumerate the possibilities until you find the hash of your own address and therefore the salt
4) extract 8000+ emails from stackoverflow
5) repeat for other sites
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
Your email is: tyler.szabo _AT_ gmail.com
md5 -s "tyler.szabo@gmail.com"
Nice job obfuscating his email in the first line.