Is E-Mail Obscuration Worth It?
ThenAgain asks: "Many sites obscure e-mail addresses by adding noise (like 'STOPSPAM') or by translating the punctuation into words (Ex: 'me at domain dot com'). This makes users feel good but does it actually help? Ten lines of perl could defeat any of the present schemes with ease and the spammers have shown plenty of adaptability. So if we're not helping hold back the flood of spam, why are we decreasing the utility of the web by eliminating mailto tags and forcing users to hand-correct the addresses in their mail clients?"
Ten lines of perl could defeat any of the present schemes with ease...
.01% of people who responds to this crap, and anything you send me will just hit my spam-filter anyways, so don't even try."
Yes, but, for now at least, there are still plenty of addresses from people who don't spam-guard, enough that writing those 10 lines of perl isn't even really worth it.
Also, if you have your address spam-guarded, it's effectively a message to the spammers that, "I'm not one of the
And they don't, because it's just not worth it for both those reasons.
A Minesweeper clone that doesn't suck
So much energy is put into securing networks that ends up inconveniencing users while tons of exploits abound and social engineering completely bypasses it. Why bother?
The reason people obscure their email is
a) It's fast, easy and doesn't require external software.
b) Sometimes that's all the protection you can get when you post to some sites.
Nothing wrong here. Web utilization is still high. It's the spam that is the problem -- not the countermeasures.
A study by the Center for Democracy & Technology in 2002 concluded that by either replacing email addresses with the HTML equivalent or human-readable equivalents like "example at domain dot com" signficantly cut down on spam. From their Major Findings: "E-mail addresses posted to Web sites using these conventions did not receive any spam." While, yes, it's relativley easy to write a script that would recombine the addresses, apparenlty most harvesters for whatever reason just aren't. My email address, which is posted online, is 'hidden' in HTML and I get very little spam after many years of having it up.
Go have a look around cotton fields just after harvest. Literally tons of the stuff is left behind at the edges of fields, blown along the roadside, lying on the stubble etc. Sure, you could go along and pick it up but the cost of doing so would outweigh the price you'd get for the extra x bushels you'd collect.
It's the same with e-mail addresses - why should a spammer go to the trouble of modifying their bots to detect obscured addresses, when there are plenty of unobscured ones ready for harvest?
I'm sure some spammers do try to pick up obscured addresses, but until they start running out of unobscured addresses, they'll keep going for the masses of low hanging fruit and not bother with the rest.
Of course, obscurity doesn't save your address from brute forcing...
For example, while you might post your address as:
user@NOSPAM.domain.com
I may post mine as
user2@no_spam_damnit.domain.com
To me, using relatively simple tricks like this to make the job of a spammer harder is definitely worthwile.
My blog
Yes, trivial obscuring like user(at)example(dot)com with various special characters can be done in 10 lines. (Could be hard to get the last 3 lines filled with code.)
But what if the user does not use English language, but German? And what if (s)he does not mark the obscured charachters? user klammeraffe example punkt com or with some funny synonymes user a im kringel example klecks com. Decoding this in 10 lines of Perl becomes harder, and it becomes harder with every new language. Decode this with 10 lines for English, German, French, Polish, Russian, Bantu, Spanish, ...
What happens if the user is really "evil" to spammers? Meine Mail-Adresse besteht aus dem Domainnamen meines Providers example unter der Top-Level-Domain fur kommerzielle Webseiten, dem wird mein Kundenpseudonym user und ein Klammeraffe vorangestellt. (I'm still hiding user@example.com - translation: My mail address is composed from the domain name of my provider example undet the top level domain for commercial websites, prefixed with my client pseudonym user and an at sign.) Decode this and similar examples in 10 lines of Perl for 10 languages, while still being able do decode all trivial variants and all slashdot mail obscurations.
Getting more evil: Meine e-Mail ist catch-those-spammers@example.com mit user vor dem Klammeraffen. Schicken Sie keine Mails an die falsche Adresse. (My email is catch-those-spammers@example.com with user in front of the at sign. Don't send mail to the wrong address.) Set up an account catch-those-spammers that marks and blocks all computers that test that acocunt or send mail to it. Now decode this and all examples above and all slashdot obscuration and don't run into the trap, and do not use more than 10 lines (with 80 characters each) of Perl code.
I bet it can't be done in 10 lines with 80 characters each, using Perl 5 and no external modules.
With nearly no work it is possible to make automatic address collecting harder and thus more expensive. Spammers don't want to spend much money, they want to maximise their profit. So they will do at most only trivial decoding, if they can't collect enough unobscured mail adresses. This is why images containing the mail address won't be OCRed for a while. It simply costs too much. On the other hand, just guessing names for existing domains works pretty well and it is very cheap. I have an unpublished six-letter account at a big German mail provider, and it is permanently hit by spam. The generic (unused and unpublished) accounts (sales, info, mail, accounting, vertrieb) of my domain are also spammed very often. Guessing is cheaper than collecting addresses.
So while this is not a mathematical proof, you can see that non-trivial obscuration will help. See also What You Get When You Buy a Spam CD.
Tux2000
Denken hilft.