Hashing Email Addresses For Web Considered Harmful
cce writes "The MicroID standard, despite getting thrashed soundly by Ben Laurie two years ago, has since been recommended by the DataPortability Project and published on the user profiles of millions of users at Digg and Last.fm. MicroID is basically a hash calculated using a user's profile page URL and registered email address, producing a token that makes the email address vulnerable to dictionary attacks.
To see how easy it was to crack these tokens, I conducted a small study, choosing 56,775 random Digg users, and cracking the email addresses of 14,294 of them (25%) using just their MicroID, username, and a list of popular email domains. Digg has more than 2 million users, and that means half a million of them — mostly people who had never heard of MicroID, and had probably not logged in for a long time — had their email addresses exposed to this trivial attack. I also applied this attack to Last.fm (19%) and ClaimID (34%).
Digg and Last.fm have since removed support for MicroID, but the lesson is clear: don't publish a hash of my email address online, guys!"
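For readers wondering why these tokens fall so quickly, here is a minimal Python sketch of the kind of guessing attack the summary describes. It assumes the MicroID construction is sha1(sha1("mailto:" + email) + sha1(url)) over hex digests, which is my reading of the MicroID draft and is not stated in the summary. The profile URL, username, and domain list are made up for illustration, and this is not cce's actual code.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 hex digest of a UTF-8 string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def microid(email: str, url: str) -> str:
    # Assumed construction (not confirmed by the summary):
    # sha1( sha1("mailto:" + email) + sha1(url) ), using hex digests.
    return sha1_hex(sha1_hex("mailto:" + email) + sha1_hex(url))


def crack(token: str, username: str, url: str, domains):
    """Try username@domain for each popular domain; return the first hit."""
    for domain in domains:
        candidate = f"{username}@{domain}"
        if microid(candidate, url) == token:
            return candidate
    return None


# Hypothetical data: a short domain list and an invented profile page.
POPULAR_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "aol.com"]
profile_url = "http://example.com/users/alice"
published = microid("alice@yahoo.com", profile_url)

print(crack(published, "alice", profile_url, POPULAR_DOMAINS))
```

Each guess costs three SHA-1 calls, so a list of a few hundred domains per username is cheap to try, and a match confirms the guess without sending a single email.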
I suppose this is yet another reason why it's nice that a few email services (most notably Gmail) allow you to append a string to your email address using the + symbol (e.g. youremail+string@gmail.com will go to the inbox of youremail@gmail.com). In effect it allows you to "salt" your email address, which adds a layer of complexity for anyone trying to match these hashes to valid addresses (not to mention that it lets you check which site compromised your address if you use a different 'salt' for each site you give it to). If more email services start to allow this (doubtful), more sites start realizing that an address with a + in it is still valid (more doubtful), and more users start using it effectively (even more doubtful still), then I don't think MicroID will be a huge problem.
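A rough Python sketch of the per-site tagging idea, assuming the mail provider delivers local+tag@domain to local@domain as described above. The function names are hypothetical helpers, not part of any mail library.

```python
def tagged_address(address: str, tag: str) -> str:
    """Build a per-site address such as youremail+digg@gmail.com."""
    local, domain = address.rsplit("@", 1)
    return f"{local}+{tag}@{domain}"


def split_tag(address: str):
    """Return (base_address, tag); tag is None when there is no + part."""
    local, domain = address.rsplit("@", 1)
    base, _, tag = local.partition("+")
    return f"{base}@{domain}", (tag or None)


signup = tagged_address("youremail@gmail.com", "digg")
print(signup)
print(split_tag(signup))
```

If spam later arrives at the tagged address, split_tag tells you which signup it came from. The same function also shows the weakness: anyone who recovers the tagged address can strip the tag just as easily.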
I've read up on it, but I don't understand how it benefits the user, vulnerability aside.
To find out valid e-mails, couldn't a spammer just send out an e-mail blast to username@top5emaildomains.com and throw away all the bounces?
You wouldn't need a hash of any sort to do that kind of trivial attack and it isn't like the serious spammers are lacking in bandwidth or resources.
[Fuck Beta]
o0t!
Slashdot uses an e-mail scheme like that. Yeah, there it is, right there ^^^.
This concern that you may have your email address *discovered* by spammers because you post it on a web page is so 5-years-ago. They already have your email address, and they probably didn't get it by scraping web pages.
When you have sent a couple emails out with a given address, you can figure that at least one of them will sit around in someone's Outlook mailstore for the next couple years. (Someone you know uses Windows!) When that person's computer gets infected with spam gang malware (as they all do), they have your address.
Once one of them has it, they probably all have it.
What would be a better solution that is as easy to implement?
This is exactly the reason I don't use Gravatar. They even tell everyone they are morons right here:
http://en.gravatar.com/site/implement/url
I didn't know anything about them except that someone in a forum was describing how you could have the same avatar in compatible forums that you participate in. The second I read that your hashed email address was part of the URL I turned around and never looked back knowing full well that if someone wanted to, they could eventually get my email address.
tm
Support TBI Research: http://www.raisinhope.org
cce, how did you confirm a successful application of your method? If each site used a unique 'secret key' to salt the hash, would it prevent breakability? I run a small site that uses globally recognized avatars, which are implemented with hashed email addresses. Thanks for doing this study!
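On the 'secret key' question, one possible approach is a keyed hash (HMAC) in place of a bare hash, sketched below in Python. This is an illustration under assumptions, not something the study tested: SITE_SECRET and public_token are hypothetical names, and the key has to stay on the server. Note also that a third-party avatar service that looks avatars up by an unkeyed hash of the address could not resolve such a token, so this probably only fits sites that serve or proxy the avatars themselves.

```python
import hashlib
import hmac

# Hypothetical per-site key; in practice it would be loaded from
# server-side configuration and never published.
SITE_SECRET = b"replace-with-a-long-random-value"


def public_token(email: str, site_secret: bytes = SITE_SECRET) -> str:
    """Keyed hash of a normalized email address (HMAC-SHA256, hex)."""
    normalized = email.strip().lower()
    return hmac.new(site_secret, normalized.encode("utf-8"),
                    hashlib.sha256).hexdigest()


token = public_token("Alice@Example.com")
# Without the key, hashing a guessed address does not reproduce the token.
guess = hashlib.sha256(b"alice@example.com").hexdigest()
print(token == guess)
```

The guessing attack only works because anyone can recompute the published value from a candidate address. With a secret key in the mix, a guess can no longer be checked offline, as long as the key itself does not leak.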
The bottom line is that unless you don't have any online presence, your email address is going to leak, and it's going to wind up on spammers' lists. If you want to avoid getting spam, some other solution is called for.
People still use E-Mail?
Huh, I guess I won't be using gravatar anymore :(
slashdot rocks
Oh the terrible price we paid for salvation from goto...
What's the difference between attacking the MicroID to collect email addresses, and running a dictionary attack on email servers using people's usernames?
Personally, I can't get clients unless they know how to get in touch with me.
And don't moan about spam. My E-mail address is widely published and maybe one or two messages a week get through the filters.
I piss off bigots.
Assuming you're using postfix and virtual, you can do something like this:
main.cf:
    recipient_delimiter = +
    virtual_alias_maps = hash:/etc/postfix/virtual, regexp:/etc/postfix/virtual-regexp
virtual-regexp:
    /(.*)\-(.*)@example.com/ ${1}+${2}@example.com
and then you can do:
bob-somesite.com@example.com
this works for every site I've tried but oracle.com, who apparently doesn't want you tracking their mail. :)
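To see what that rewrite rule does to an address, here is a small Python sketch that applies the same pattern with the re module. It only illustrates the substitution; Postfix's regexp tables use their own regex engine and replacement syntax, so treat the results as an approximation of what Postfix would do.

```python
import re

# Pattern copied from the virtual-regexp rule above; the replacement is
# translated to Python syntax (\1, \2 instead of ${1}, ${2}).
RULE = re.compile(r"(.*)\-(.*)@example.com")


def rewrite(address: str) -> str:
    """Turn user-tag@example.com into user+tag@example.com."""
    return RULE.sub(r"\1+\2@example.com", address, count=1)


print(rewrite("bob-somesite.com@example.com"))
print(rewrite("bob@example.com"))  # no hyphen, so no rewrite
```

In Python the first group is greedy, so an address with several hyphens is split at the last one. That is worth keeping in mind if your real usernames contain hyphens.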
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
I fully agree with the parent. The idea of keeping an email address that you actually use private is several orders of magnitude sillier than thinking your credit card number and social security number haven't been stolen a dozen times already.
But there is one place I won't "publish" my email address (jeffrey@goldmark.org), and that is in the From line of a Usenet posting. Reply-to is fine, and there is absolutely no problem in the body of messages, but tests have shown that putting something in the From line of a Usenet posting will give you a very noticeable increase in spam.
Prime numbers are exactly what Alan Greenspan says they are -S. Minsky
Writing junk like foo [at] bar [dot] com simply wastes the time of your colleagues and friends, who now have to rewrite your address by hand, and confuses the non-techies.
How dare you!? The hours I spend every week crafting clever rewrites of my email address are precisely what keeps the spammers on their toes. How else do you think Gmail is capable of filtering any spam mails out? I'm keeping their volume down. It's not just some stupid security superstition, either: it really works! And way better than whatever algorithm they're training over in Mountain View.
From,
seinjunkie@gmail.com
I guess Gravatar.com will now have to encourage proxying of avatars via sites' web servers.
It sounds like what the author did is take each username, and then try hashing @gmail.com, @yahoo.com, etc. If that's the case, a spammer could just skip the hashing trouble and try emailing those same addresses directly.
Apart from the fact that "+" is a perfectly valid character in an email address, if you're using Gmail, you can insert random dots in your address, and your mail will still get delivered.
my.name@gmail.com
is equivalent to
my.na.me@gmail.com
my....name@gmail.com
m.y.n.a.m.e@gmail.com
etc
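The flip side is that an attacker can undo these variations before guessing. A hedged Python sketch, assuming (as the comments here describe) that Gmail ignores dots and anything after a + in the local part; canonical_gmail is a hypothetical helper, and other providers may treat dots as significant.

```python
def canonical_gmail(address: str) -> str:
    """Collapse dot and +tag variants of a gmail.com address."""
    local, domain = address.rsplit("@", 1)
    if domain.lower() != "gmail.com":
        # Leave other providers alone; their rules may differ.
        return address
    local = local.split("+", 1)[0].replace(".", "")
    return f"{local.lower()}@gmail.com"


for variant in ["my.name@gmail.com", "my.na.me@gmail.com",
                "m.y.n.a.m.e@gmail.com", "my.name+digg@gmail.com"]:
    print(variant, "->", canonical_gmail(variant))
```

So the dots do make a published hash harder to match, since each variant hashes differently, but once any variant is known the underlying mailbox is trivial to recover.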
Ah, arrogance and stupidity, all in the same package. How efficient of you. -- Londo Mollari
http://www.linuxjournal.com/article/9585
This is the most important post in the whole comment thread. That by publishing your MicroID you publish your e-mail address is one thing, but that the MicroID isn't actually an ID in any sense of the word is quite another. Why does Windows expose its cryptography tools as an API, while Microsoft never shipped an applet that uses them? For example, they could have added PGP support to Notepad and the e-mail and messenger applets with relative ease, but they didn't do so. Why? And that of course was the reason why MicroID didn't ask users to sign it, because most users don't have a clue how to do it. And those that do know often refuse to install GnuPG because then they 'need to install something'. Seriously, Windows should have more accessible crypto support.
You are worried because someone, if they really wanted to send you some mail, could go to the trouble of doing a CPU-intensive search against some hash shown on a website and find out that ultimate, embarrassing secret: your *email address*??
What gives? Email addresses are designed to be public. If you don't want people you do not know to be able to contact you, then you are free to drop all mail from unrecognized addresses. If you want to set up some kind of secret knowledge that people must have in order to contact you, then ask them to put a particular word in the subject line when first sending you a message. Neither of these relies on keeping the address secret, which just isn't likely to happen.
The only thing more broken than trying to keep an email address secret is trying to make a 'private' web page by keeping the URI secret. Again, the system is designed so that the address itself is not sensitive, but other information such as a password or PGP key can be.
Actually, what it reminds me of most is the crazy situation in the US where a basically public identifier, the social security number, is abused as some kind of secret token. Hence all the fuss made when it is possible to find out someone's SSN. The answer is not to add more and more baroque means to stop the SSN from leaking out: one breach, and it's no longer a secret.
I understand the desire to stop spam address harvesters, but really, there are hundreds of web sites which display email addresses with only light obfuscation, enough to stop a harvester bot but not a determined human being (or someone determined enough to use an OCR engine). The kind of hashing talked about here is way more difficult to undo than that. If you are even more paranoid, you need to revisit your assumptions of what is public and what is secret.
-- Ed Avis ed@membled.com
We have all the right stuff in place. However all the clients are nubs and they sure as hell don't...
Outlook is stupid. I've nearly moved this organisation off of it. Exchange no longer handles email, Google Apps does. Next step is to build an all encompassing app that eradicates the ability to send emails randomly.
Mod parent up. Please. Please.
... where you have to hide your email address from harvesters. I respect the privacy of others and am very careful about revealing or publishing email addresses. However, for the decades that I have had an email address, I have never made an attempt to hide it.
Marko <marko@pacujo.net>
...in the username portion, as it's not part of the list of characters in RFC821/RFC1521 (SMTP) that have a special meaning:
<special> ::= "<" | ">" | "(" | ")" | "[" | "]" | "\" | "." | "," | ";" | ":" | "@" """ | the control characters (ASCII codes 0 through 31 inclusive and 127)
neither is + listed in rfc822/rfc1522 (message format)...
SMTP itself allows funny things such as bob@somewhere-else.com@somewhere.org .
If MTA rules now prevent such specialties for security/anti-spam reasons, it's a matter of choice.
Or if one doesn't want to see + when working with strings in a javascript form, except for concatenating them, it's the programmer's choice.
gmail is RFC compliant (in this regard at least)
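As a quick check of the list quoted above, here is a small Python sketch that encodes those special characters as a set and tests a local part against it. It only mirrors the quoted list; it is not a full address validator, and SPECIALS and special_chars_in are names chosen for illustration.

```python
# The specials quoted above: < > ( ) [ ] \ . , ; : @ " plus the
# control characters (ASCII 0 through 31, and 127).
SPECIALS = set('<>()[]\\.,;:@"') | {chr(c) for c in range(32)} | {chr(127)}


def special_chars_in(text: str) -> list:
    """Return the characters of text that appear in the quoted list."""
    return [ch for ch in text if ch in SPECIALS]


print("+" in SPECIALS)
print(special_chars_in("bob+somesite"))
print(special_chars_in("bob@somewhere-else.com"))
```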
If you have an email address you don't want to be public, don't make your username-on-public-websites the same as your username-at-email-provider.
Spammers wouldn't even bother checking the hash - if you have a spam botnet and a username harvester, you can just scrape the usernames and send email to username@gmail.com, username@hotmail.com, etc.
It seems like the attack is just taking user names and other publicly-known data and trying to determine an email address from them. Spammers don't need microid to confirm that their guess is correct; they'll just send to all 50 or 100 top email domains, hoping to get a hit.
The whole point of MicroID is that if someone knows your email address, they can tell that you are the author of the page. If your email address is easy to guess, then your email address will be revealed, _whether_or_not_ there's a microid here, there, or anywhere.
If an email address is easy to guess, then the email address is easy to guess. Not clear what new ground we're covering here.
Evan Prodromou | evan@prodromou.name | http://evan.prodromou.name/