Slashdot Mirror


Gravatars Can Leak Users' Email Addresses

abell writes "Gravatar offers a global avatar service, using an MD5 hash of the user's email as avatar ID. This piece of information in some cases is enough to retrieve the original email address. Testing a simple attack on stackoverflow.com, I was able to determine the email addresses of more than 10% of the site's users."

39 of 170 comments

  1. Re:So let's change the algorithm. by sam0737 · · Score: 3, Insightful

    No, it's not related to MD5 itself. Period.

  2. No need by Mathinker · · Score: 3, Insightful

    It would have been trivial for them to just add a secret salt string to the email before hashing, and that would have solved most of the problem. It is possible that they wanted to be "nice", in that if they go out of business, anyone can regenerate the IDs without them. But, as this guy has shown, that's not a great idea.
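
    A minimal sketch of the secret-salt idea, in Python. The secret value and the function names are hypothetical, and HMAC is swapped in for plain concatenation as one common way to key a hash; the lowercasing step is an assumption too. Nothing here is Gravatar's actual scheme.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in this sketch it never leaves the service.
SECRET_SALT = b"example-secret-known-only-to-the-service"

def plain_avatar_id(email: str) -> str:
    """Unsalted MD5 of the (assumed lowercased) address."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()

def keyed_avatar_id(email: str, secret: bytes = SECRET_SALT) -> str:
    """Avatar ID that cannot be recomputed without knowing the secret."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(secret, normalized, hashlib.md5).hexdigest()

if __name__ == "__main__":
    address = "someone@example.com"  # placeholder address
    print("plain:", plain_avatar_id(address))
    print("keyed:", keyed_avatar_id(address))
```

    The trade-off is the one mentioned above: with a secret in the mix, nobody outside the service can regenerate the IDs on their own.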

  3. Re:So let's change the algorithm. by Mad+Merlin · · Score: 2, Informative

    It's not, any hashing function would be subject to the same problem. If you RTFA you'll find that they just brute force combinations of the user name and common email domains.

    To actually fix this would require not hashing (only) the email address: you could mix in some secret salt with the email before hashing, or you could use encryption (with a secret key), or you could just hand out unique identifiers which are associated with addresses only in the Gravatar database. I don't know if any of these are feasible for this particular application, though.
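
    For illustration, the brute force described above might look roughly like this in Python. The domain list, the function names and the sample address are all made up for the sketch; the article's actual script is not shown here.

```python
import hashlib
from typing import Iterable, Optional

# Illustrative list only; which domains were actually tried is an assumption.
COMMON_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "aol.com"]

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def guess_email(target_hash: str, usernames: Iterable[str],
                domains: Iterable[str] = COMMON_DOMAINS) -> Optional[str]:
    """Return the first username@domain candidate whose MD5 matches, else None."""
    domains = list(domains)
    wanted = target_hash.lower()
    for name in usernames:
        for domain in domains:
            candidate = f"{name.lower()}@{domain}"
            if md5_hex(candidate) == wanted:
                return candidate
    return None

if __name__ == "__main__":
    # Made-up address, used only to build a hash to test against.
    target = md5_hex("jdoe@gmail.com")
    print(guess_email(target, ["john.doe", "jdoe"]))
```

    Note that nothing in the loop depends on MD5 in particular; swapping in another unkeyed hash function changes only md5_hex.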

  4. Public address by AlpineR · · Score: 4, Funny

    Here's my own Gravatar hash:

    b835b33911b93c136d8e61cbbbe6736d

    Who will be the first to crack it?

    1. Re:Public address by Yvan256 · · Score: 4, Funny

      Is it wagnerr@umich.edu?

    2. Re:Public address by palegray.net · · Score: 4, Funny

      I'm certain his email must be umich@wagnerr.edu. Now I just need to figure out why he's attending Wagner of all schools, and how the heck they managed to typo their own domain name.

    3. Re:Public address by grcumb · · Score: 4, Funny

      That took all of one second to find in an md5 lookup database. And thirty seconds for me to realize that I could have looked two lines higher to see it in plaintext next to your userid. :wallbash:

      Upside: You get to keep your geek card.

      Downside: You'll never survive the world outside your basement.

      8^)

      --
      Crumb's Corollary: Never bring a knife to a bun fight.
    4. Re:Public address by Ethanol-fueled · · Score: 2, Informative

      how the heck they managed to typo their own domain name.

      Wagner Computer Science program -- Page Not Found. Looks like that answered your question.

  5. Possible workaround by Mathinker · · Score: 3, Insightful

    Can anyone tell me if the "you can add extra stuff after a +" that GMail lets you do is standard in the RFC for all email addresses? If it is, then to "fix" this, you could sign up to Gravatar with an email address that has a random string after an added "+", and the brute force search on hashes would be much, much harder. (Assuming that your email provider implements that part of the standard.)

    1. Re:Possible workaround by bennomatic · · Score: 2, Informative

      I've looked through RFC822, and the inclusion of "+" in an email is not excluded, so it's perfectly legal. GMail's functional use of it, however (account+foo@gmail.com and account+bar@gmail.com both go to account@gmail.com, for easy tagging/filing) is just an implementation that takes advantage of the fact that most people do not have + signs in their email addresses.

      The RFC is actually pretty promiscuous; it's only implementations of it that fall short. Did you know that apostrophes are legal in the username portion of the email address? Yet how many web sites do you think would allow you to sign up as "First_O'Last@mailserver.net"? Heck; it's amazing how many sites forbid the '+' sign that Google takes advantage of.

      --
      The CB App. What's your 20?
    2. Re:Possible workaround by TubeSteak · · Score: 2, Informative

      Heck; it's amazing how many sites forbid the '+' sign that Google takes advantage of

      Here's what happened in hotmail when I tried to e-mail to [name]+bananas@hotmail.com
      http://i49.tinypic.com/fbjh1j.png
      I googled that odd character and it seems to be Chinese

      Hotmail treats the "send a message from one of your disposable addresses" generated by Spamgourmet as a typo.

      --
      [Fuck Beta]
      o0t!
  6. Re:So let's change the algorithm. by geekpowa · · Score: 5, Informative

    The attack doesn't rely on MD5 itself or MD5 collisions. It would work no matter what hashing algorithm was used.

  7. Re:So let's change the algorithm. by Mad+Merlin · · Score: 2, Informative

    MD5 collisions actually don't help the attacker here, in fact, an MD5 collision would simply be a false positive for this case (the attacker thinks they've found the email address, but they haven't).

  8. Re:So let's change the algorithm. by Firehed · · Score: 4, Insightful

    I disagree.

    Granted, those are basically very unsophisticated databases that just store lookup values, but it's relatively easy to bruteforce an MD5 hash down into one of the possible original strings (obviously with any algorithm that has a fixed output size with limitless inputs like MD5 there are infinite inputs that will hash down to a single md5sum, but when you're trying to get a valid email address out of a hash it's easy to pick the right one). Couple that with the fact that in this situation, you know that the entire string is lowercased and probably 60% of the gravatar emails (probably more like 90% actually) are going to come from one of four or five domains... reversal becomes quite easy. If you're bored, you could spin up a few Amazon EC2 or Rackspace Cloud Server instances to dump out some large tables. One each for gmail, yahoo, msn, aol, whatever else; it'd be a very simple script to make. You could probably cover every alphanumeric email address under 12 characters overnight, at a cost of about a dollar and ten minutes of scripting.

    The thing to realize here is that gravatar doesn't md5 emails to hide them from people who want to obscure their identity, just to obscure them from spambots. So it's really a non-issue. If you're that concerned, leave your blog comments with a fake email address.

    --
    How are sites slashdotted when nobody reads TFAs?
  9. Re:At first glance... by Psaakyrn · · Score: 2, Informative

    And you didn't think of Gravitar instead? Kids these days...

    http://en.wikipedia.org/wiki/Gravitar

  10. Why is this a problem? by gman003 · · Score: 2, Insightful

    Do you consider your email address private info, need-to-know only? With a decent spam filter and easy-to-use block features, it really isn't a problem. I provide mine to pretty much anyone who asks. The only thing I do is keep it in a non-scrapable format, to keep it from getting on too many spam lists.

  11. Re:So let's change the algorithm. by jamesh · · Score: 2, Interesting

    It's quite well known that MD5 shouldn't be used for anything privacy related, given the fact that it's been exploited quite publicly in recent history.

    An email address isn't private... I suspect that MD5 was just a convenient way to get a fixed length id. I'd be more worried about collisions, but I'm too lazy to calculate how many avatars would be required before that might become a problem.

  12. e9af4cb49c97162d6be3ea8c6ca90a46 by iSzabo · · Score: 2, Funny

    I actually *just* (20 minutes ago) put my picture up there. Can you guess my email ;)

    1. Re:e9af4cb49c97162d6be3ea8c6ca90a46 by Anonymous Coward · · Score: 5, Interesting

      Your email is: tyler.szabo _AT_ gmail.com

      md5 -s "tyler.szabo@gmail.com"
      MD5 ("tyler.szabo@gmail.com") = e9af4cb49c97162d6be3ea8c6ca90a46

      For bonus points, your name is Tyler Szabo, you go to University of Waterloo and plan on graduating in 2011. You work at Amazon. You are in a relationship with a Kaylan Elizabeth L. (last name withheld as a courtesy, I'm sure you know who I mean :) ).

      I found out you registered this, looked up your avatar on Gravatar, found you on Stack Overflow which gave me your real name (searched for Szabo assuming that was something to do with you). Using this, I looked you up on Facebook, Twitter, and various other sites. Your single avatar helped me link everything together. Once I had your real name from Stack Overflow it became easy.

      Good times. Perhaps this reveals another security vulnerability? One avatar links -ALL- your social networking.

      I also have your parents, previous employers, etc, but won't post those here :)

    2. Re:e9af4cb49c97162d6be3ea8c6ca90a46 by Anonymous Coward · · Score: 3, Funny

      Your email is: tyler.szabo _AT_ gmail.com

      md5 -s "tyler.szabo@gmail.com"

      Nice job obfuscating his email in the first line.

  13. use email+whatever@domain.com by topham · · Score: 2, Insightful

    Use your email address with "+randomsequence"@

    Randomsequence will have to be consistent between the user and the sites they want the gravatar to work at, but it will generate an MD5 hash different than their actual address; yet if the site sends email to the user with it the user will receive it.
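
    A small Python sketch of that idea, with a placeholder address. The function names are hypothetical, and whether mail to the tagged address is actually delivered depends on the email provider.

```python
import hashlib
import secrets

def tagged_address(email: str, tag_bytes: int = 8) -> str:
    """Insert '+<random hex tag>' before the '@' of an address."""
    local, _, domain = email.rpartition("@")
    if not local or not domain:
        raise ValueError("expected an address of the form local@domain")
    return f"{local}+{secrets.token_hex(tag_bytes)}@{domain}"

def md5_hex(text: str) -> str:
    return hashlib.md5(text.lower().encode("utf-8")).hexdigest()

if __name__ == "__main__":
    plain = "user@example.com"      # placeholder address
    tagged = tagged_address(plain)  # generate once, then reuse everywhere
    print(tagged)
    print("plain hash: ", md5_hex(plain))
    print("tagged hash:", md5_hex(tagged))
```

    The tag has to be generated once and then reused on every site, otherwise each site would compute a different hash and the avatar would not follow the user.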

  14. easier than other methods? by bcrowell · · Score: 2, Interesting

    But is this significantly easier than other methods of harvesting email addresses? Spammers already do dictionary attacks on big providers like yahoo. It's not clear to me that this method is a better way of generating a list of email addresses. If you carry out a dictionary attack on yahoo.com, you're going to come up with probably tens of millions of valid email addresses. If you carry out this attack on gravatar.com, how many addresses are you going to get for your trouble? 10% of gravatar's users, apparently -- which I'm guessing is not really that big a number. Remember, once a spammer has a botnet, it costs him zero to send out one more spam to test whether a particular address is valid. Therefore the dictionary attack is free.

    The defense against dictionary attacks is also exactly the same as the defense against this attack: either don't use a big email provider, or use a big email provider but pick a username that has a lot of characters (so it's not vulnerable to brute-forcing) and is also not vulnerable to dictionary attacks.

  15. Re:So let's change the algorithm. by Korin43 · · Score: 3, Interesting

    What I'm wondering is why this matters at all. A spammer would just send emails [your username]@[every common email domain]. Why would they bother to check if it's the correct address or not?

  16. Re:So let's change the algorithm. by broken_chaos · · Score: 3, Informative

    Not really, since the salt would need to be publicly known for Gravatar to work (and it would break any backwards compatibility to add it in now). This was a 'social engineering' attack, not a rainbow table lookup – it pieced the name together with common providers to find a matching MD5. Salt would just add a single extra step.

    I believe it's exactly the same problem/attack as was brought up about MicroID in the past. The idea of Pavatar is a much better way to do this sort of avatar-finding (though the decentralisation comes with its own problems), since it relies on a public web address instead of a semi-private e-mail address.

  17. Re:nope by KDingo · · Score: 2, Informative

    It is, actually. If you don't include the -n option for echo, it will insert a \n to the string, changing the md5, which is the hash you got.
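
    The effect is easy to reproduce. A small Python sketch with a placeholder address, where appending "\n" stands in for what echo does when -n is left out:

```python
import hashlib

def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

address = b"user@example.com"  # placeholder address

without_newline = md5_hex(address)        # bytes produced by `echo -n`
with_newline = md5_hex(address + b"\n")   # bytes produced by plain `echo`

print(without_newline)
print(with_newline)
print("same hash?", without_newline == with_newline)
```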

  18. Not A Bug by lhunath · · Score: 3, Insightful

    Email addresses are usernames. They are not secret information. If somebody can be bothered enough to find your email address through brute-forcing the MD5 hash of it; you've got bigger problems.

    Far more than "10% of stackoverflow.com's users" can have their email addresses GUESSED far faster. Likely your email address is also FAR easier to establish through a simple Google search on your pseudonyms.

    If for some odd reason you want your email address to be secret, for the same reason you might want a secret pseudonym or use a false name when signing up, register a fake email address instead (and set it up for forwarding). You're giving your email address in clear text to the site's owner and all the internet hops in between him and you ANYWAY.

    It's important to learn to distinguish between what is a secret and what is not; and if you want to make things secret, at what level you should put your trust.

    --
    ``OK, so ten out of ten for style, but minus several million for good thinking, yeah?''
  19. Re:So let's change the algorithm. by Eivind · · Score: 3, Insightful

    Doubt it. There are 26 letters and 10 digits, and in addition "." is very common in email addresses. Thus you get 37 possibilities for each position. 37 to the 12th power is 6582952005840035281 hashes to run, and even if you do 10^9 hashes per second (i.e. one giga-hash a second, which would require on the order of a few hundred cores), you'd still need 208 years to do that many hashes -- then you need to look up each of them in gravatar, and analyze the result for a hit-or-miss.

    "every alphanumeric email address under 12 characters" is in fact much too large a keyspace to reasonably cover overnight with a "very simple script".

    It's not a large enough keyspace to be cryptographically secure, but it's large enough to not be trivially exhaustible.
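
    The arithmetic can be checked with a few lines of Python. The hash rate is the one assumed above; the year length of 365.25 days and the extra "lengths 1 through 12" total are additions for this sketch, since the figure above counts strings of exactly 12 characters.

```python
ALPHABET_SIZE = 26 + 10 + 1          # letters, digits, and "."
LENGTH = 12
HASHES_PER_SECOND = 10**9            # assumed rate: one giga-hash per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Strings of exactly 12 characters.
exactly_12 = ALPHABET_SIZE ** LENGTH
# Strings of every length from 1 to 12, for comparison.
up_to_12 = sum(ALPHABET_SIZE ** n for n in range(1, LENGTH + 1))

years_exactly_12 = exactly_12 / HASHES_PER_SECOND / SECONDS_PER_YEAR
years_up_to_12 = up_to_12 / HASHES_PER_SECOND / SECONDS_PER_YEAR

print(exactly_12)                    # 6582952005840035281
print(f"{years_exactly_12:.1f} years for length 12 only")
print(f"{years_up_to_12:.1f} years for lengths 1 through 12")
```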

  20. In the grand scheme of things this is pretty minor by Just+Brew+It! · · Score: 2, Funny

    It's not exactly big news that a system based on MD5 hashes is susceptible to dictionary-style attacks; this should be obvious to anyone who understands how hashes work. In order for this particular attack to work, the attacker already has to have some reasonable guesses as to what your e-mail address is; the Gravatar trick only confirms the address. So it seems to me that the amount of additional data leaked is fairly small.

    OTOH, I suppose I'm somewhat desensitized to this sort of thing, since I've had the same primary e-mail address for something like 15 years (going back to the days when I was rather active on Usenet). My e-mail address is already in every spammer database on the planet, so I don't see how a few more people knowing it could make things any worse!

  21. Re:So let's change the algorithm. by rve · · Score: 2, Insightful

    That's assuming email addresses are random sequences of letters, digits and dots.

    If you're a spammer and don't mind missing the email of mr. q9x7.3f.1zzp@hotmail.com, a phone book would probably provide an effective dictionary for narrowing that keyspace considerably

  22. Could provide an API by Mathinker · · Score: 2, Interesting

    From Gravatar's FAQ:

    MD5 isnt strong enough encryption, they’ve cracked that havent they?

    MD5 is plenty good for obfuscating the email address of users across the wire. if you’re thinking of rainbow tables, those are all geared at passwords (which are generally shorter, and less globally different from one another) and not email addresses, furthermore they are geared at generating anything that matches the hash, NOT the original data being hashed. If you are thinking about being able to reproduce a collision, you still don’t necessarily get the actual email address being hashed from the data generated to create the collision. In either case the work required to both construct and operate such a monstrocity would be prohibitively costly. If we left your password laying around in the open as a plain md5 hash someone might be able to find some data (not necessarily your password) which they could use to log in as you... Leaving your email address out as an md5 hash, however, is not going to cause a violent upsurge in the number of fake rolex watch emails that you get. Lets face it there are far more lucrative, easier, ways of getting email address. I hope this helps ease your mind.

    So, they might have already thought about this vulnerability and dismissed it as not interesting.

    They could still fix their concept by providing an API where a website wanting to discover the avatar for a given email first hashes the email with MD5 and then the Gravatar URL which is generated redirects them to a link to the image (which contains no information about the email address, or perhaps uses a salted hash). This, in conjunction with rate limiting the number of queries per website, could provide a relatively secure way to do what they want.

  23. Re:So let's change the algorithm. by supernova_hq · · Score: 2, Insightful

    Security through Obscurity is a reference to the METHOD being obscure. Your encryption codes and salts are SUPPOSED to be obscure!!!

  24. Simple way to protect yourself by Umangme · · Score: 2, Insightful

    Some email providers have a simple way of giving you a throwaway id. E.g., example+slashdotnospam@gmail.com is sent to example@gmail.com.

    Say my name is Lary Page. If my email id is lary.page@gmail.com, I can still protect myself so that you will never get my email id.

    MD5 (lary.page@gmail.com) = "1b8dbe98e2b1138fd3ba34e26fc55107".

    So I provide my email id as lary.page+1b8dbe98e2b1138fd3ba34e26fc55107@gmail.com. If I gave you the md5 of that id, you'll find it hard to get back to lary.page@gmail.com.

    Try it: the MD5 hash of the above email id is 803efbc80ead933f28d0704d43d1f63b.

  25. Re:So let's change the algorithm. by KiloByte · · Score: 2, Insightful

    Or, use john -incremental -stdout. This will test reasonable names first, while not being restricted to RL names only.

    --
    The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
  26. Who cares? by johny42 · · Score: 2, Interesting

    Using &#64; instead of @ is enough to stop most e-mail harvesting bots, I don't see them brute-forcing MD5s any time soon.

  27. Not the algorithm by panaceaa · · Score: 2, Interesting

    This is not related to the MD5 algorithm or use of salts. The fact is that Gravatar wants sites to use Gravatar without sending loads of requests to gravatar.com. Therefore Gravatar must provide a "client-side" API for generating Gravatar avatar URLs based on the known constant, email addresses. Sure, they could have salted things, but whatever they do, there's an essentially open source function somewhere that takes an email address and converts it to a Gravatar URL. As the algorithm is available to anyone, any attack can use it to check intelligent guesses against the known algorithm result.

    There really isn't anything Gravatar can do without changing their design to decouple avatar URLs from email addresses. Basically whenever anyone registers an account with a blog, the site would have to ask Gravatar for the user's Gravatar avatar URL -- and probably poll on some regular basis in case users add Gravatar avatars later. The blog would then have to persist this data in their databases for later look-up when comments are viewed. This is certainly possible, and could probably be designed in a way that doesn't add additional load to Gravatar's servers. But compared to the current implementation, which can be added to blogs with very minimal coding (probably just a couple lines in PHP), doing this more safely would require persistence-layer/database schema changes that would severely limit the attractiveness of Gravatar.

  28. Re:So let's change the algorithm. by MikeDX · · Score: 2, Informative

    Bolex make [motion picture] cameras, not watches, and were very important in the early television news reels. Even today they are a staple in film schools.

  29. Re:So let's change the algorithm. by DrSkwid · · Score: 4, Informative

    1) register as a website with gravatar, find out how long the salt is
    2) register on stackoverflow with your email address
    3) enumerate the possibilities until you find the hash of your own address and therefore the salt
    4) extract 8000+ emails from stackoverflow
    5) repeat for other sites
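
    Step 3 could be sketched like this in Python, on toy values. The MD5(salt + email) construction, the lowercase-only salt alphabet and the function names are all assumptions made for the sketch; how a salted Gravatar would really combine the two is unknown.

```python
import hashlib
from itertools import product
from string import ascii_lowercase
from typing import Optional

def salted_md5(salt: str, email: str) -> str:
    # Assumed construction: MD5(salt + email).
    return hashlib.md5((salt + email).encode("utf-8")).hexdigest()

def recover_salt(known_email: str, observed_hash: str, salt_length: int,
                 alphabet: str = ascii_lowercase) -> Optional[str]:
    """Try every salt of the given length until the known email reproduces the hash."""
    for chars in product(alphabet, repeat=salt_length):
        salt = "".join(chars)
        if salted_md5(salt, known_email) == observed_hash:
            return salt
    return None

if __name__ == "__main__":
    # Toy demonstration with a 3-letter salt and a placeholder address.
    demo_hash = salted_md5("abc", "me@example.com")
    print(recover_salt("me@example.com", demo_hash, salt_length=3))
```

    The search space grows as alphabet size to the power of the salt length, so this only works against a short salt; a long random one makes the enumeration impractical.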

    --
    There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
  30. Re:So let's change the algorithm. by pAnkRat · · Score: 2, Insightful

    Correct: the attack here is:

    Take a big site with thousands of users, many using their (sorta) "real names".
    Permute these names with some known big email provider hostnames.
    Send them all some spam.

    It does not really matter if 90% of those email addresses are incorrect, the rest will hit.

    I would not do the MD5 validation thing, why should I?

    --
    we need an "-1 Plain wrong" moderation option!
  31. Re:So let's change the algorithm. by maevius · · Score: 2, Insightful

    The salt can be user and website dependent (4 bytes user/4 bytes website for an 8 byte salt). Although I think that the added complexity won't be welcomed by the website owners.