Best Method For Foiling Email Harvesters?
pjp6259 writes "One of the common ways that spammers generate email mailing lists is by harvesting email addresses from websites. But in many cases you also need to make it easy for your customers to reach you. I have found three common solutions to this problem: 1.) Use an image to replace your email address. 2.) Use ascii encodings for some/all of the characters. 3.) Use javascript to concatenate and/or obfuscate your email address. Which of these methods is most effective? Are email harvesters able to interpret javascript? What do you use?"
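For option 2, here is a minimal Python sketch of what "ascii encodings" could look like when the page is generated server-side. The function names and the example address are made up for illustration. Note that the last line undoes the encoding with a single standard-library call, so this probably only stops harvesters that scan raw HTML without decoding character references.

```python
import html


def entity_encode(address: str) -> str:
    """Replace every character with its decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in address)


def mailto_link(address: str) -> str:
    """Build a mailto: link whose href and text are both entity-encoded."""
    encoded = entity_encode(address)
    return f'<a href="mailto:{encoded}">{encoded}</a>'


# Hypothetical address, for illustration only.
encoded = entity_encode("user@example.com")
print(mailto_link("user@example.com"))

# Any entity-aware parser recovers the plain address again.
print(html.unescape(encoded))
```

A browser renders the link normally, so customers still get a clickable mailto.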
My two favorite methods are:
- Putting the e-mail in a distorted picture (like a captcha) - this is very difficult for spam crawlers to read
- Using a long human readable message "tset ta tset tod moc.reverse.each.word.prior.to.first.dot.for.addr"
In general, your best defense is to employ some method that requires human interpretation.
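As a rough illustration of the reverse-each-word trick, here is a small Python sketch of the decoding a human reader does in their head. The function names are made up. It also shows that the scheme only holds up as long as nobody bothers to write these few lines for it.

```python
def reverse_each_word(text: str) -> str:
    """Reverse the letters of every space-separated word, keeping word order."""
    return " ".join(word[::-1] for word in text.split(" "))


def spoken_to_address(text: str) -> str:
    """Turn 'name at host dot tld' (after un-reversing) into name@host.tld."""
    words = reverse_each_word(text).split()
    symbols = {"at": "@", "dot": "."}
    return "".join(symbols.get(word, word) for word in words)


print(reverse_each_word("tset ta tset tod moc"))
print(spoken_to_address("tset ta tset tod moc"))
```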
Crack - Free with every butt and set of boobs
If you make it hard for 'bad guys', you make it hard for your customers/friends too. Some people like having mail-to links, and you won't be able to do that easily with an image.
If you have a form to submit to on-line, tag it and let it go to the head of the class.
v4sw6PU$hw6ln6pr4F$ck 4/6$ma3+6u7LNS$w2m4l7U$i2e4+7en6a2X h
Spend 10 minutes and make an HTML form for people to contact you. Be careful what you name your fields, though, as there are spam bots that can target web forms.
If people need to send you files, they can do so after you reply back to them.
IP geolocation and a shotgun.
Works for me.
Think of the Children; Sleep with your Sister
As for whether the harvesters can interpret javascript, I think that it depends on the particular harvester. You could analyze the source or the created page.
I have one email that I use specifically for REPLYING to emails and that one is the one that gets the MOST Spam.
I like microcars
Use a mailto URL and deal with the resulting spam at the mail level; the cost of doing so is less than the cost of alienating potential customers.
However, on a personal site, images.
Use a table with 3 columns: the first with the first part of your email address, the second with @, and the third with domain.com. That makes the address hard to find with a simple search of the page, and with a border of 0 the user won't notice the table.
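A minimal Python sketch of generating that markup, assuming the page is built server-side. The function name and the example address are hypothetical, and the padding and spacing attributes are my own addition to keep the three cells visually tight.

```python
import html


def split_address_table(address: str) -> str:
    """Render an address as a borderless 3-cell table: local part, @, domain."""
    local, _, domain = address.partition("@")
    cells = "".join(
        f"<td>{html.escape(part)}</td>" for part in (local, "@", domain)
    )
    return (
        '<table border="0" cellspacing="0" cellpadding="0">'
        f"<tr>{cells}</tr></table>"
    )


markup = split_address_table("user@example.com")
print(markup)

# The contiguous address never appears in the page source.
print("user@example.com" in markup)
```

The obvious cost is that the result cannot be a mailto: link, and copy-and-paste may pick up stray whitespace between the cells.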
There exists some positive integer N that you are the Nth person to read this signature.
SpamGourmet.com
Makes it trivially easy to create a unique forwarding address for any website you care to visit, then set the domain of that site as an exclusive sender for that address.
If a 3rd party starts spamming you at that address, Spam Gourmet just drops it, but continues to deliver relevant mail.
Oh, and it's completely free.
gvcormac@uwaterloo.ca -- Bring it on!
Seriously, if we cower in fear, the spammers win. Obfuscation, Turing tests, whatever: they all show fear.
Hide a bogus email address in the webpage. Maybe in comments, maybe in the corner with a super tiny font which matches the background. Whatever mail gets sent to that address should be automagically blocked from reaching all your other accounts.
----
Go canucks, habs, and sens!
I've heard the following works fairly well, but haven't tried it m'self.
Put 2 email addresses on your web site, the real one, and a 'decoy' one which is hidden from normal users (eg white-on-white text right at the bottom of the screen).
Any email that arrives at the 'decoy' address is parsed, and the sender added to a blacklist.
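A rough sketch of the decoy idea in Python, using only the standard `email` package. The class name, the decoy address and the sample messages are all made up. It keys the blacklist on the From header, which is an assumption on my part; since spam often carries a forged sender, a real setup might prefer to blacklist the connecting IP instead.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr


class DecoyFilter:
    """Blacklist any sender who writes to the hidden decoy address."""

    def __init__(self, decoy: str) -> None:
        self.decoy = decoy.lower()
        self.blacklist = set()

    def accept(self, raw_message: str) -> bool:
        msg = message_from_string(raw_message)
        sender = parseaddr(msg.get("From", ""))[1].lower()
        headers = msg.get_all("To", []) + msg.get_all("Cc", [])
        recipients = {addr.lower() for _, addr in getaddresses(headers)}
        if self.decoy in recipients:
            # Only a harvester could have found this address.
            self.blacklist.add(sender)
            return False
        return sender not in self.blacklist


mail_filter = DecoyFilter("decoy@example.com")  # hypothetical decoy address
trap = "From: bulk@spam.example\nTo: decoy@example.com\nSubject: x\n\nbuy\n"
later = "From: bulk@spam.example\nTo: me@example.com\nSubject: y\n\nbuy\n"
print(mail_filter.accept(trap), mail_filter.accept(later))
```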
Quidquid Latine dictum sit, altum videtur (anything said in Latin sounds important)
You know when they said you were special? They were trying to tell you to just do something different than everyone else. If everyone did a table trick or wrote "blank at blank dot com" or did any other clever little thing, a programmer could come along and regex the hell out of it. Be unique and make them deal with your site individually.
That being said, I don't think spammers crawl the net looking for addresses so much. Their zombies have all the addresses they need. Just try to give out your email address to people that don't have an affinity for virus infections. In my case, I protect my customers so my address hasn't been abused too heavily thus far.
check+the+rfc+this+is+legal+but+nobody+codes+for+it@yourdomain.com
Help poke pirates in the eyepatch, arr.
My actual e-mail address, in convenient text format and as a mailto: link, is at the bottom of every single web page at my personal web sites. I really don't see why I should change that just because spammers might harvest it. My e-mail address has been up there since about 1996, so that's at least a decade's worth of harvesting. I've also used the same e-mail address on Usenet posts.
Yes, I get quite a lot of spam. But with the usual techniques (greylisting, SpamAssassin, etc.) I only actually receive maybe half a dozen spam e-mails a day. And more importantly, all my actually valid e-mail still seems to get through just fine. I'm happy with it, and I get the personal satisfaction of being able to use my e-mail address wherever I damn well like without having to cower from spammers.
Put it in plain sight: on your homepage which you submit to Google for indexing.
It's so obvious, they'd NEVER think to look there.
I then use separate email addresses for everything I sign up for. E.g. my bank email address is different from my health fund email address, which is different from my allofmp3 email address etc. I use a little code which isn't obvious (similar to a lookup table) to code each website into the username portion of the email address... That's why I'm a little annoyed at allofmp3.com at the moment, as I've supplied two email addresses to them on only two occasions, and both are huge spam recipients. So it's clear that not only does their financial arm sell my email address, but their online store does too.
This method is good for 2 reasons: It's very easy to direct all email from particular addresses straight to the trash should they become spam targets and secondly, it's very easy for me to figure out (such as the allofmp3.com case) who sold my email address to spammers and when.
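One way the per-site coding could be done, sketched in Python. This is not the poster's lookup table: it swaps in a keyed hash (HMAC) so the tag isn't guessable from the site name, and the secret, the domain and the site names are placeholders.

```python
import hashlib
import hmac

SECRET = b"change-me"   # hypothetical private key, never published
DOMAIN = "example.com"  # hypothetical domain with a catch-all mailbox


def address_for(site: str) -> str:
    """Derive a stable, non-obvious address for one website."""
    digest = hmac.new(SECRET, site.lower().encode("utf-8"), hashlib.sha256)
    return f"u{digest.hexdigest()[:8]}@{DOMAIN}"


def site_for(address: str, known_sites: list) -> object:
    """Work out which site an address was handed to, or None if unknown."""
    for site in known_sites:
        if address_for(site) == address.lower():
            return site
    return None


sites = ["mybank.example", "allofmp3.com"]
leaked = address_for("allofmp3.com")
print(leaked, "was given to", site_for(leaked, sites))
```

Once an address starts receiving spam, `site_for` tells you who leaked it, and a single mail rule on that address sends everything to the trash.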
I try to run any mailtos through an email obfuscator. As the link says, a 6 month study showed that obfuscated emails "do not receive junk mail."
My theory is that harvesters have enough email addresses out there to gather and that the spammers are too lazy/have no need to write algorithms that interpret these types of mailtos.
...unfortunately no one can be told what The Mat^H^H^HGoatse is...they must experience it for themselves...
I have found that using SPAM as your username works wonders.
Just post it right there on the webpage or leave it as a mailto:spam@example.com
So many people use NOSPAMjohn@NOSPAMexample.com (remove the NOSPAM to reply) or some variation of that.
I tried using spam@example.com as my email address on Google Groups and previously on Usenet.
I got pretty much nothing. No spam. Not then, not now.
Since the email harvesters apparently filter out variations of addresses with SPAM, NOSPAM, DIESPAMMERS etc in them, once they filter out the "SPAM" part of spam@example.com they are left with @example.com which is not a valid email address.
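To make the theory concrete, here is a guess at what such a harvester clean-up step might look like in Python. The marker pattern and the validity check are my assumptions, not anything taken from a real harvester.

```python
import re

# Assumed list of anti-spam markers a harvester might strip out.
MARKERS = re.compile(r"(?:NO|DIE)?SPAM(?:MERS)?", re.IGNORECASE)
SHAPE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def strip_markers(address: str) -> str:
    """Remove SPAM, NOSPAM, DIESPAMMERS and similar markers from an address."""
    return MARKERS.sub("", address)


def looks_valid(address: str) -> bool:
    """Very loose shape check: something@something.something."""
    return SHAPE.fullmatch(address) is not None


for candidate in ("NOSPAMjohn@NOSPAMexample.com", "spam@example.com"):
    cleaned = strip_markers(candidate)
    print(candidate, "->", cleaned, looks_valid(cleaned))
```

Under those assumptions the munged address is "repaired" into a usable one, while spam@example.com loses its whole username and gets thrown away.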
I like microcars
A lot of these suggestions are fine for personal sites; but if you're actually in business they aren't practical.
We use Javascript. You don't want to make life more difficult for the person trying to correspond - the point is to raise the cost to the spammer. If they have to add a Javascript parser to their spider, it's going to slow them way down. It's not going to make financial sense for them to do a custom solution for each site (and if they do, the "image" methods will break down as well).
When someone writes to me and says "reply to joe at gmail dot com" (or whatever), they generally don't get a reply. Why is their time more valuable than mine?
#DeleteChrome
They use "sender verify" on the mail server.
When the mail server gets an incoming email, it sends a request back to the "sending" email server listed in the headers. Since most spam is sent with falsified headers, the "sending" email server will respond that no mail was sent. Then my host mail server simply sends the spam to /dev/null. In the case of real mail, the sending server responds that it did indeed send the mail and my host then delivers it.
The only troubles I've run into are servers that don't support "sender verify". If the email doesn't get a verification message, it's returned to the sender. Oddly enough, the servers I've found that don't support "sender verify" have been IIS servers. While there are still other IIS servers that do support it, I find it interesting that most of the servers not running IIS seem to have this feature turned on.
The nice thing about it is 90% of the spam never reaches a mailbox, and the filters from SpamAssassin catch the rest. This also removes the image-only spam.
-Goran
Carpe Scrotum - The only way to deal with your competition.
Excellent idea, it'd be ignored by humans and scripts alike.
LK
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
A while ago, I set up an article on my homepage that combines all techniques without compromising usability:
http://www.thany.org/article/73/E-mail_hiding
If the spammers want email addresses so badly, why not give them some? List poisoning will sting them right in the buttocks, and will make them think twice before they even consider sending their dumb spiders to your servers again. Take a look at the following sites for more info:
http://www.monkeys.com/wpoison/
http://www.spampoison.com/
My other OS is the MCP!
That's called a challenge-response system.
Those are EVIL and should be banned from the Internet.
My personal domain has been hijacked by spammers. Despite my having a valid SPF record, they still send spam with my domain forged as the sender. Consequently, when someone has a challenge-response spam filter configured, those challenge messages come to ME, despite the fact that I had nothing to do with the original message. I consider those challenge messages spam themselves, and report them to spamcop as such.
There are better ways of filtering spam. Forcing other people to filter your mail for you is extremely inconsiderate.
"The guide is definitive, reality is frequently inaccurate."
I mean, yeah, some of the tips and tricks may (or may not) work in the short run, but eventually the spammers will get your id (not to mention the trouble to your customers if you obfuscate the id too much). It's not always how you displayed your mail id on your website or webpage that ultimately gets it harvested. More often than not, it's stupid users with your address in their contact lists who get it out in the open.
Like most people, I use multiple mail ids for different uses. Lots of them are fakes just to register to sites and such, and a couple are private ones which are used only to correspond with the closest friends and family members. Recently one of my friends told me that he has used my address to register for a gaming site since his was already being used for one account and apparently creating a new id takes ages and he may die before he gets a new one so why not use mine which is totally personal to me but who gives a damn. He actually has no idea why he should not be doing it. And he is a CS major from one of the best colleges in the country! Now think of the regular users you may have corresponded with and how easy it is for them to fuck up every trick you have tried to evade harvester bots.
Politicians and Pedophiles: Two groups of exploitive bastards who are most dangerous when they're thinking of children.
I didn't even think of that. It seems that you would have to make a website that was readable (by a software page reader) and easily usable by the blind, but still difficult to extract the email address. Maybe you could put an audio clip of contact info, akin to a voicemail message.
Perhaps I don't want to send mail to companies who have broken only-tested-on-IE-on-WindowsXP preferences anyway...
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
What we need is for someone to, instead of talking, perform two experiments:
1. Create 10 new email addresses, and post them around the net with 10 obfuscation tricks (plenty of examples can be found in this thread). Which of these tricks actually foiled the spammers, and which did not? Obviously, spammers can theoretically get around any obfuscation, but which obfuscations are still "safe"?
2. Do an experiment to figure out how much "safer" an address is if it was never posted on the Web. Does it just cause a small delay in spam (say, you only start getting spam after a month) or does it get noticeably less spam?
The answer to #2 isn't as obvious as some may think. One important problem to consider is spamming worms which use fake "from" addresses. These worms take your friends' email addresses - potentially addresses which have never been published - and use them as the fake senders of spam to random people. If a spammer also receives these mails, he gets a constant stream of real email addresses which were never published on the web. Another obvious issue is dictionary attacks, which are especially practical on large domains (e.g., gmail).
I am glad you used the car analogy, I cannot understand new concepts without one.
Have your code produce a unique contact e-mail address on the page for each visitor, so for instance:
support-312321@example.com
Then set up a catch all on the first part of the address.
If you get any spam, just block out that one receiving address.
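A minimal sketch of that scheme in Python, assuming the mail server has a catch-all for everything starting with "support-". The class and method names are made up, and a random token is used instead of a counter so visitors can't guess each other's addresses.

```python
import secrets


class VisitorAddresses:
    """Hand out one contact address per visitor; block any that attract spam."""

    def __init__(self, prefix: str = "support", domain: str = "example.com"):
        self.prefix = prefix
        self.domain = domain
        self.issued = set()
        self.blocked = set()

    def issue(self) -> str:
        """Generate a fresh address to print on the page for this visitor."""
        while True:
            address = f"{self.prefix}-{secrets.token_hex(4)}@{self.domain}"
            if address not in self.issued:
                self.issued.add(address)
                return address

    def block(self, address: str) -> None:
        """Stop accepting mail for one address that has started getting spam."""
        self.blocked.add(address.lower())

    def accepts(self, address: str) -> bool:
        """Catch-all on the prefix, minus whatever has been blocked."""
        address = address.lower()
        local, _, domain = address.partition("@")
        matches = domain == self.domain and local.startswith(self.prefix + "-")
        return matches and address not in self.blocked


addresses = VisitorAddresses()
contact = addresses.issue()
print(contact, addresses.accepts(contact))
addresses.block(contact)
print(contact, addresses.accepts(contact))
```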
Obfuscating email addresses on websites is one way of tackling the spam harvester problem. Training filters by becoming somewhat of a spam magnet is another way. The only problem here lies in the differentiation between ham and spam. Spam is here and will be here for a long time to come because people do make (a lot of) money with it. So you could say detecting it is more sensible than avoiding it.
I've been experimenting by adding an automatically generated code to my email addresses on my page (recipientDELIMcode@domain.ext). Spammers keep on sending me spam on these addresses, and I accept it and train my mail filter this way. The only thing I have to do is add 'contaminated' email addresses to my shitlist once I've found spam being sent to them. As you might already have guessed... the shitlist is a simple forward to sa-learn.
Adding an auto whitelister based on my own address book (LDAP is sweet) tackles the problem of address book harvesters: mail from these sources will not be fed to sa-learn, even if the email address it's received on is shitlisted.
A friend of mine, who listens to the name of 'the wanker who can't keep his antivir up to date'/Paul, created the need for me to implement this feature by becoming infected by an _addressbook_leechin_virus_
To receive even more spam to feed to my hungry sa-learn there's a set of email addresses on my site (>50% of all email addresses there are in hidden fields/autogen'd pages) which are passed thru to sa-learn by default.
I've also been thinking of combining the unique id email address with a database in which I store served (generated) email addresses and giving them a grace period of N mins. If I receive an email within these N mins, I assume this email was sent by a visitor on my site who clicked the mailto: link, and the message is passed to my mailbox and the unique id generated email address is flagged as a non-spam source. However, if I receive mail on that email address after the N mins, I assume it's a spam run and feed it to sa-learn. I'm not sure about the ROI (code time/overhead/extra server-side dependencies) of this technique because what I have now works well enough for me.
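The grace-period idea could be sketched like this in Python. It is only an in-memory illustration: the class name and the "deliver"/"spam" labels are made up, a real version would use the database mentioned above, and I have assumed that an address flagged as a non-spam source keeps being delivered after the window closes.

```python
import time


class GraceWindow:
    """Trust a generated address only if its first mail arrives within N seconds."""

    def __init__(self, grace_seconds: float) -> None:
        self.grace_seconds = grace_seconds
        self.served_at = {}   # address -> time it was shown on a page
        self.trusted = set()  # addresses flagged as non-spam sources

    def serve(self, address: str, now: float = None) -> None:
        """Record the moment an address was generated for a visitor."""
        self.served_at[address] = time.time() if now is None else now

    def classify(self, address: str, now: float = None) -> str:
        """Return 'deliver' for the mailbox or 'spam' for sa-learn."""
        now = time.time() if now is None else now
        if address in self.trusted:
            return "deliver"  # assumption: trusted addresses stay trusted
        served = self.served_at.get(address)
        if served is not None and now - served <= self.grace_seconds:
            self.trusted.add(address)
            return "deliver"
        return "spam"


window = GraceWindow(grace_seconds=600)
window.serve("me-abc123@example.com", now=0)
print(window.classify("me-abc123@example.com", now=120))
```

The `now` parameter exists only so the timing can be tested without waiting.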
The downside is you can't give out your email address on things like a business card (lastname@domain.ext). A possible solution to this is replacing your email address with a URL like http://lastname.domain.ext/ on which a mailto: refresh is generated with the unique id'ed email address. Or trusting the intelligence of the lean-mean-(and pretty well trained)-spamkilling-machine, which is good enough for me.
My 2ct.