Slashdot Mirror


Gravatars Can Leak Users' Email Addresses

abell writes "Gravatar offers a global avatar service, using an MD5 hash of the user's email as avatar ID. This piece of information in some cases is enough to retrieve the original email address. Testing a simple attack on stackoverflow.com, I was able to determine the email addresses of more than 10% of the site's users."
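A minimal sketch of how such an avatar ID can be derived, assuming the address is trimmed and lowercased before hashing (a later comment mentions the lowercasing). The function name `avatar_id` and the sample address are illustrative only and are not taken from the article.

```python
import hashlib


def avatar_id(email: str) -> str:
    """Return the hex MD5 digest of a normalized email address.

    Assumption: normalization is "strip whitespace, then lowercase".
    """
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # The same address in different spellings maps to the same ID,
    # which is what makes the ID guessable from a candidate address.
    print(avatar_id("  Alice@Example.COM "))
    print(avatar_id("alice@example.com"))
```

Because the function is deterministic and takes no secret input, anyone who can guess an address can compute its ID and compare.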

6 of 170 comments

  1. Re:So let's change the algorithm. by sam0737 · · Score: 3, Insightful

    No, it's not related to MD5 itself. Period.

  2. No need by Mathinker · · Score: 3, Insightful

    It would have been trivial for them to just add a secret salt string to the email before hashing, and that would have solved most of the problem. It is possible that they wanted to be "nice", in that if they go out of business, anyone can regenerate the IDs without them. But, as this guy has shown, that's not a great idea.
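    A rough sketch of the salting idea, assuming the service prepends one secret string to the normalized address before hashing. The salt value and the name `salted_avatar_id` are hypothetical, and nothing here reflects what Gravatar actually does.

```python
import hashlib

# Hypothetical secret known only to the service; not a real value.
SECRET_SALT = b"example-secret-salt"


def salted_avatar_id(email: str, salt: bytes = SECRET_SALT) -> str:
    """Hash salt + normalized email, so outsiders cannot recompute the ID."""
    normalized = email.strip().lower().encode("utf-8")
    return hashlib.md5(salt + normalized).hexdigest()


if __name__ == "__main__":
    print(salted_avatar_id("alice@example.com"))
```

    The trade-off is the one the comment describes: without the salt, nobody outside the service can regenerate the IDs from the addresses.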

  3. Possible workaround by Mathinker · · Score: 3, Insightful

    Can anyone tell me if the "you can add extra stuff after a +" trick that Gmail lets you do is standard in the RFC for all email addresses? If it is, then to "fix" this, you could sign up to Gravatar with an email address that has a random string after an added "+", and the brute-force search on hashes would be much, much harder. (Assuming that your email provider implements that part of the standard.)
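    A small sketch of that workaround, assuming the mail provider delivers `local+tag@domain` to the same mailbox as `local@domain` (this depends on the provider). The helper name `tagged_address` and the sample address are illustrative.

```python
import secrets


def tagged_address(email: str, tag_bytes: int = 8) -> str:
    """Insert a random hex tag after a '+' in the local part of the address."""
    local, _, domain = email.rpartition("@")
    tag = secrets.token_hex(tag_bytes)  # 2 hex characters per byte
    return f"{local}+{tag}@{domain}"


if __name__ == "__main__":
    print(tagged_address("alice@example.com"))
```

    With 8 random bytes the tag adds 64 bits that an attacker would have to guess on top of the address itself.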

  4. Re:So let's change the algorithm. by Firehed · · Score: 4, Insightful

    I disagree.

    Granted, those are basically very unsophisticated databases that just store lookup values, but it's relatively easy to brute-force an MD5 hash down into one of the possible original strings. (Obviously, with any algorithm that, like MD5, has a fixed output size and limitless inputs, there are infinite inputs that will hash down to a single md5sum, but when you're trying to get a valid email address out of a hash it's easy to pick the right one.) Couple that with the fact that in this situation you know that the entire string is lowercased, and that probably 60% of the Gravatar emails (probably more like 90%, actually) are going to come from one of four or five domains, and reversal becomes quite easy. If you're bored, you could spin up a few Amazon EC2 or Rackspace Cloud Server instances to dump out some large tables: one each for gmail, yahoo, msn, aol, whatever else. It'd be a very simple script to make. You could probably cover every alphanumeric email address under 12 characters overnight, at a cost of about a dollar and ten minutes of scripting.
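    The kind of script described above could look roughly like this: a toy dictionary search that hashes candidate `local@domain` strings and compares them with known hashes. The candidate names, the domains, and the function names are made up for illustration; a real table would be far larger.

```python
import hashlib
from itertools import product


def md5_hex(text: str) -> str:
    """Hex MD5 digest of a lowercased string."""
    return hashlib.md5(text.lower().encode("utf-8")).hexdigest()


def recover_emails(target_hashes, local_parts, domains):
    """Return {hash: email} for every candidate whose hash is a target."""
    targets = set(target_hashes)
    found = {}
    for local, domain in product(local_parts, domains):
        candidate = f"{local}@{domain}"
        digest = md5_hex(candidate)
        if digest in targets:
            found[digest] = candidate
    return found


if __name__ == "__main__":
    # Hypothetical data: one hash that is in the candidate set, one that is not.
    hashes = [md5_hex("alice@example.com"), md5_hex("nobody@example.org")]
    print(recover_emails(hashes, ["alice", "bob"], ["example.com", "example.net"]))
```

    The search only succeeds for addresses that appear in the candidate list, so its hit rate depends entirely on how well the list matches real users.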

    The thing to realize here is that Gravatar doesn't MD5 emails to protect people who want to hide their identity, just to obscure the addresses from spambots. So it's really a non-issue. If you're that concerned, leave your blog comments with a fake email address.

    --
    How are sites slashdotted when nobody reads TFAs?

  5. Not A Bug by lhunath · · Score: 3, Insightful

    Email addresses are usernames. They are not secret information. If somebody can be bothered enough to find your email address by brute-forcing its MD5 hash, you've got bigger problems.

    Far more than "10% of stackoverflow.com's users" can have their email addresses GUESSED far faster. Likely your email address is also FAR easier to establish through a simple Google search on your pseudonyms.

    If, for some odd reason, you want your email address to be secret (for the same reason as wanting a secret pseudonym or using a false name when signing up), register a fake email address instead and set it up for forwarding. You're giving your email address in clear text to the site's owner and all the internet hops in between him and you ANYWAY.

    It's important to learn to distinguish between what is a secret and what is not; and if you want to make things secret, at what level you should put your trust.

    --
    ``OK, so ten out of ten for style, but minus several million for good thinking, yeah?''

  6. Re:So let's change the algorithm. by Eivind · · Score: 3, Insightful

    Doubt it. There are 26 letters and 10 digits, and in addition "." is very common in email addresses. Thus you get 37 possibilities for each position. 37 to the 12th power is 6582952005840035281 hashes to run, and even if you do 10^9 hashes per second (i.e. one gigahash a second, which would require on the order of a few hundred cores), you'd still need 208 years to do that many hashes -- then you need to look up each of them in Gravatar, and analyze the result for a hit or miss.

    "Every alphanumeric email address under 12 characters" is in fact much too large a keyspace to reasonably cover overnight with a "very simple script".

    It's not a large enough keyspace to be cryptographically secure, but it's large enough to not be trivially exhaustible.
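
    The arithmetic above can be checked with a few lines. This sketch uses the comment's own figures (37 symbols, 12 positions, 10^9 hashes per second); the 365-day year and the function names are assumptions made here.

```python
ALPHABET_SIZE = 26 + 10 + 1          # letters, digits, and "."
LENGTH = 12                          # characters in the local part
HASHES_PER_SECOND = 10**9            # one gigahash per second
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumption: 365-day year


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of exactly `length` symbols."""
    return alphabet_size ** length


def years_to_exhaust(total: int, rate: int) -> float:
    """Years needed to hash `total` candidates at `rate` hashes per second."""
    return total / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    total = keyspace(ALPHABET_SIZE, LENGTH)
    print(total)                                         # 6582952005840035281
    print(int(years_to_exhaust(total, HASHES_PER_SECOND)))  # 208
```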