Hashing Email Addresses For Web Considered Harmful
cce writes "The MicroID standard, despite getting thrashed soundly by Ben Laurie two years ago, has since been recommended by the DataPortability Project and published on the user profiles of millions of users at Digg and Last.fm. MicroID is basically a hash calculated using a user's profile page URL and registered email address, producing a token that makes the email address vulnerable to dictionary attacks.
To see how easy it was to crack these tokens, I conducted a small study, choosing 56,775 random Digg users, and cracking the email addresses of 14,294 of them (25%) using just their MicroID, username, and a list of popular email domains. Digg has more than 2 million users, and that means half a million of them — mostly people who had never heard of MicroID, and had probably not logged in for a long time — had their email addresses exposed to this trivial attack. I also applied this attack to Last.fm (19%) and ClaimID (34%).
Digg and Last.fm have since removed support for MicroID, but the lesson is clear: don't publish a hash of my email address online, guys!"
I suppose this is yet another reason why it's nice that a few email services (most notably gmail) allow you to append a string to your email address using the + symbol (e.g. youremail+string@gmail.com will go to the inbox of youremail@gmail.com). In effect it allows you to "salt" your email, which adds a layer of complexity when trying to match these hashes with valid email (not to mention it allows you to check which site compromised your email if you use different 'salts' for each site you use your address on). If more email services start to allow this (doubtful), more sites start realizing that a + in your email is still a valid email (more doubtful), and more users start using it effectively (even more doubtful still), then I don't think the MicroID will be a huge problem.
This concern that you may have your email address *discovered* by spammers because you post it on a web page is so 5-years-ago. They already have your email address, and they probably didn't get it by scraping web pages.
When you have sent a couple emails out with a given address, you can figure that at least one of them will to sit around in someone's Outlook mailstore for the next couple years. (Someone you know uses Windows!) When that person's computer gets infected with spam gang malware (as they all do), they have your address.
Once of them has it, they probably all have it.
tm
Support TBI Research: http://www.raisinhope.org
I read up on it and I'm still confused, but I think this is the idea:
1. You set up an account at website Alpha.
2. You have a publicly-viewable profile page at Alpha. On the page is your MicroID.
3. You set up an account at website Beta.
4. You tell Beta about your Alpha profile page.
5. Beta verifies that your Alpha profile page is really yours by checking the MicroID.
Beta can't really do anything with your Alpha page except link to it. I guess the point would be to prevent people who aren't you from linking to your Alpha page on their Beta pages. That way, other people can be sure that the same person owns both accounts.
The attack mentioned in the article doesn't compromise the proper use of the MicroID, since Beta is assumed to have verified that you own your email address and you wouldn't link to a profile page claiming to be yours that wasn't. All it does is make it possible for spammers to harvest your email.
Do they still do that? I know from a distant past they tried it with smaller providers too, but haven't seen them for a long time. As far as I can tell, spammers do still use malware which harvests/sniffs email-address directly from peoples computers.
This is a definite tactic. I see it all the time on a mail server that I administer. From the results, there are definitely spammers that monitor user's e-mail, address book, or other sources of e-mail addresses on their computer. (Basically, on a brand new e-mail address, the user started getting spam within a few hours of contacting someone else.)
But we still see dictionary attacks on our mail server, so that's a popular tactic too.
Wolde you bothe eate your cake, and have your cake?
Use gmail. I'll get a thousand or so spams a month, but I've had maybe four make it to my inbox in the past three years.
It obviously doesn't eliminate the problem of spam, but in theory if it didn't make it to anyone's inbox, idiots would stop acting on it and suddenly spam wouldn't be profitable and would fizzle away.
How are sites slashdotted when nobody reads TFAs?
What's the difference between attacking the MicroID to collect email addresses, and running a dictionary attack on email servers using people's usernames?
Assuming you're using postfix and virtual, you can do something like this:
main.cf:
recipient_delimiter = +
virtual_alias_maps = hash:/etc/postfix/virtual, regexp:/etc/postfix/virtual-regexp
virtual-regexp: /(.*)\-(.*)@example.com/ ${1}+${2}@example.com
and then you can do:
bob-somesite.com@example.com
this works for every site I've tried but oracle.com, who apparently doesn't want you tracking their mail. :)
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
I fully agree with the parent. The idea of keeping an email address that you actually use private is several orders of magnitude sillier than thinking your credit card number and social security number hasn't been stolen a dozen times already.
But there is one place I won't "publish" my email address (jeffrey@goldmark.org), and that is in the From line of a Usenet posting. Reply-to is fine, and there absolutely no problem in the body of messages, but tests have shown that putting something in the From line of a Usenet posting will give you a very noticeable increase in spam.
Prime numbers are exactly what Alan Greenspan says they are -S. Minsky
Apart from the fact "+" is a perfectly valid character in an email address, if you're using Gmail, you can insert random dots in your address, and your mail will still get delivered.
my.name@gmail.com
is equivalent to
my.na.me@gmail.com
my....name@gmail.com
m.y.n.a.m.e@gmail.com
etc
Ah, arrogance and stupidity, all in the same package. How efficient of you. -- Londo Mollari
You are worried because someone, if they really wanted to send you some mail, could go to the trouble of doing a CPU-intensive search against some hash shown on a website and find out that ultimate, embarassing secret: your *email address*??
What gives? Email addresses are designed to be public. If you don't want people you do not know to be able to contact you, then you are free to drop all mail from unrecognized addresses. If you want to set up some kind of secret knowledge that people must have in order to contact you, then ask them to put a particular word in the subject line when first sending you a message. Either of these does not rely on keeping the address secret, which just isn't likely to happen.
The only thing more broken than trying to keep an email address secret is trying to make a 'private' web page by keeping the URI secret. Again, the system is designed so that the address itself is not sensitive, but other information such as a password or PGP key can be.
Actually, what it reminds me of most is the crazy situation in the US where a basically public identifier, the social security number, is abused as some kind of secret token. Hence all the fuss made when it is possible to find out someone's SSN. The answer is not to add more and more baroque means to stop the SSN from leaking out: one breach, and it's no longer a secret.
I understand the desire to stop spam address harvesters, but really, there are hundreds of web sites which display email addresses with only light obfuscation, enough to stop a harvester bot but not a determined human being (or someone determined enough to use an OCR engine). The kind of hashing talked about here is way more difficult to undo than that. If you are even more paranoid, you need to revisit your assumptions of what is public and what is secret.
-- Ed Avis ed@membled.com
It seems like the attack is just taking user names and other publicly-known data trying to determine an email address from them. Spammers don't need microid to confirm that their guess is correct; they'll just send to all 50 or 100 top email domains, hoping to get a hit.
The whole point of MicroID is that if someone knows your email address, they can tell that you are the author of the page. If your email address is easy to guess, then your email address will be revealed, _whether_or_not_ there's a microid here, there, or anywhere.
If an email address is easy to guess, then the email address is easy to guess. Not clear what new ground we're covering here.
Evan Prodromou | evan@prodromou.name | http://evan.prodromou.name/