What E-Mail Validation Tools Do You Use?
morcego asks: "As we are all too much aware, spam is an increasing problem. Each of us has our own set of tools and methods to try and reduce the amount of spam we receive, each with different pros and cons. Also, on a more broad front, we have options like SPF (+ SRS), Microsoft's own Caller-ID, and Yahoo's DomainKeys that we can use. These days, it is incredibly easy to implement any (or all) of these, using publicly available frameworks and libraries (libspf2, and milter, to name a few). I have been using SPF for quite some time now with some measurable results, although nothing earth shattering. Which of these are you using, if any? Why, or why not? Do you think any of them really contribute anything to fight spam?"
Well, I use my brain!
sex is better than war!
I don't use anything other than dspam. It filters 99% of my spam for me. What more could I want?
The problem is that they can be spoofed, although not quite easily. That's because they're having folks self-setup the various systems.
Me, I would rather say "If your domain isn't in the same netblock as the ISP it represents, score heavily against."
--
# Canmephians for a better Linux Kernel
$Stalag99{"URL"}="http://stalag99.net";
I am an anonymous coward.
I wrote my own Bayesian filter, Mailvisa, to gain a better understanding of how Bayesian filtering works, and to be able to tweak the parameters. When I last measured it, it caught 93% of spam. Of all the filters I tried at the time (I think it was all filters in Debian sarge), only Bogofilter scored better. This applies to both the amount of spam caught and the filtering speed. The closest thing to false positives I've gotten over the years were a few advertisement mails from my domain registrar.
I have only two problems with it: 1. I have to train it regularly, and 2. nowadays, lots of mail slips through, because it contains words related to programming languages.
Please correct me if I got my facts wrong.
Greylisting and DSPAM work for me. The odd spam still gets through, though the majority of those can be rejected with various postfix settings.
Works pretty well.
The latest Slashdot meme.
After trying to tune SpamAssassin to work well for months, and being unimpressed by the hit/miss rate, I took to forwarding all of my incoming email to GMail. I then forward all my email from GMail that is not spam back to my other account :0
I find this way I get 99.95% accuracy - things that GMail misses as spam, my local SpamAssassin catches. As a side bonus I have GMail's awesome interface to read my mail when on the road (much better than the Squirrel Mail I was using, and still better than RoundCube).
This brings up another point - I don't know why Google doesn't add IMAP connectivity to GMail, so you could use its interface to read email from other hosts. I don't see why their ad technology would not work with this scheme.
I get about 99% success with Spamassassin. (I do train it on its errors, about every couple of weeks.) The most common leakage I was getting was bounces from domains when the spammer spoofed my domain name; I finally put an SPF record in place, and those seem to have stopped.
One thing I wish it would allow would be to train it on all rules, not just those that the Bayesian filters use. Some of the rules give me a lot of false positives, but they'd be fine for others: so why do we have to manually change the scores on them?
I use several methods, none of which look at the message apart from headers, but the biggest hit is from greylisting which knocks out more than 90% of spam at a stroke. Most of the rest is rejected by refusing email from servers with no reverse-lookup.
While not necessarily e-mail validation tools, greylisting and the SBL+XBL blocking lists by Spamhaus have eradicated nearly all the spam I used to get through all of the other filters.
Greylisting alone helped to lower e-mail traffic drastically, and blocking lists take care of known spamming hosts. I'd recommend using both to anyone running an e-mail server.
The telephone.
If you really, really want to know if a message is from someone it's the only way.
all that SPF CallerID and DKIM does is validate the sender !
this cuts out about 70% of (stupid) spammers
you also need to blacklist people who send you spam (and you can be confident that you get them because of the above technologies)
if you Ever want to send lots of mail to hotmail users you need to have callerID setup yahoo and gmail both trust you more if you have domainKeys
so things are moving on and there is no reason why people should not have at least one of SPF CallerID or DKIM set up on their domain !
you will note that people here also use filtering but the question is does the filtering feedback to the blacklists ?
regards
John Jones
p.s. I work in the mail vendor world...
SpamBayes. After enough training it is spookily accurate at getting spam. I used to run SpamAssassin as a POP3 proxy and then filter the rest with SpamBayes, but recently (past year or so) SpamBayes has been enough.
This *might* be due to ISPs doing a better job of bulk filtering out the obvious junk before we even see it. Some of the domains I have that are on other than my main ISP do seem to end up with more spam, but after filtering via SpamBayes I see very little...
I use the OS fingerprinting options from pf to block windows machines from delivering mail on the primary mx. This saves approximately between 300 and 1600 spams a day. Beside that, rejecting mail from hosts without an A record, blacklisting all hosts sending mail to spamtraps with spamikaze, rejecting hosts which falsely claim to be a host in my domain and filtering with bogofilter.
This is the list of most of the stuff we run at the border:
Exim + greylisting + clamav + Spamassassin.
Here are the plugins to spamassassin and custom rulesets:
Plugins:
---------
Razor2
SpamCop
AWL
MIMEHeader
ReplaceTags
Custom Rulesets
----------------
We use a selection of the SARE rulesets
70_sare_adult.cf
70_sare_bayes_poison_nxm.cf
99_FVGT_Tripwire.cf
bogus-virus-warnings.cf
This was stopping most of our spam...however we were still getting a lot of spam that contained images with the spammy message. So about 2 weeks ago I implemented the FuzzyOcr plugin on all of our border systems. I've received a single image spam since I launched it. It works great. I also tweak the rule scores based upon our situation and the type of spam that we get and I monitor the effectiveness of those changes on a fairly regular basis.
======== In the future, everything will be artificial. ========
I wrote a Thunderbird Extension for Sender Verification which implements SPF and DK on the client side, which may not be the best place to do it, but it's better than nothing at all. The extension is aimed at phishing, rather than spam. It also checks sender domains in several blacklists.
https://addons.mozilla.org/thunderbird/345/
http://razor.occams.info/code/spf
I personally use ASSP for my spam filtering. I use the SPF validation, RBL, Spam bucket address, multiple HELO checks, and of course Bayesian filtering. I've found that with all of this I've yet to see a spam mail in my inbox with 40+ days of uptime. Before I started using ASSP I would probably receive two to three spams a day.
Don't you hate glorious self-promotion? Visit my Blog
SPF (and related technologies) are not designed to cut down on spam. They are designed to prevent Joe jobs and address forgery. (It just so happens that most Joe Jobs are spam).
The World Wide Web is dying. Soon, we shall have only the Internet.
The combination of 8 DNS blacklists, Amavis and Spamassassin works very well.
I used to get more than 300 spam mails per day (intercepted by Spamassassin). Thanks to the DNS blacklists, I now only receive about 15 spam mails per day which are intercepted by Spamassassin.
Only about 3 spam e-mails per day actually make it into my mailbox, with zero false positives.
The good thing about DNS blacklists is that the spam e-mails are actually rejected in the mail protocol, therefore it will hit spammers directly and renders their spam bots useless.
The blacklists also reject dynamic ip addresses, which are all virus infected home computers.
The most effective blacklists I use:
spamcop.net
uceprotect.net (L1, L2, L3)
spamhaus.org (sbl-xbl)
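For anyone wondering what a DNS blacklist check looks like on the wire: the usual convention is to reverse the octets of the connecting IPv4 address, append the list's zone, and look that name up. Here is a minimal sketch that only builds the query name. The zone `dnsbl.example.org` is a placeholder, not one of the lists above, and the actual DNS lookup is left out.

```python
import ipaddress

def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the hostname a DNSBL client would look up for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)          # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

# By convention an A-record answer means "listed" and NXDOMAIN means
# "not listed"; the resolver call is deliberately omitted here.
print(dnsbl_query_name("192.0.2.25", "dnsbl.example.org"))
```

Because the answer comes back before the message body is sent, the MTA can refuse the mail inside the SMTP conversation, which is the point made above.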
I use libspf2; however, using it is quite useless when you come to think of it. In reality the concept is amazing, but if you think of it, it relies on third parties' involvement. When you implement SPF, you check other users' domain SPF records for validation purposes, but what if the other users haven't specified their own records? Some reputable and large ISPs still do not have SPF set up. In reality, using SPF is great... as long as everyone else uses it. Having to rely on others when it comes to spam, however, has proven to be futile on many occasions.
I'm the entire IT Dept at my work and I do not have the time to manage our own email server, let alone worry about keeping it secure. Most of our business comes in via email and most of those are crafted to look exactly like spam with huge lists of names in the TO: or CC: boxes and no subject line.
My problem was finding a way to filter spam without filtering even a single legit email. Lost email means a lot of lost revenue. The only solution I found in a year of searching was mxlogic.com. We still get spam, but not nearly as much and since you get a filter report daily of what email was filtered, our people can see if anything was caught that shouldn't have been. The result, much less spam and ZERO lost revenue.
Yeah I wish I had the time and expertise to run my own email server and keep it secure but the fact of the matter is that there are lots of shops that just can't. This is seriously the only solution I have found.
And that's been keeping the ones that get through down to two or three a week. Not enough for me to turn on hard SPF checking or demanding that email to me be encrypted with my personal PGP key. Configuring all that stuff certainly is a pain though -- it'd be nice if they could get it down to drop in components for the most common configurations.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
The random main-body text spam is all over the place lately. It seems that as soon as spammers realize X won't pass the filters, they send much less X and more Y. The problem with the random text is that it's very hard to discern from legitimate e-mail (statistically speaking). Filters don't have a sense of context and conversation, even if they're so extravagant that they can perform cunnilingus on a hardwood floor. A simple validation system (SPF isn't a bad idea) would be a good step forward, if it was ubiquitous enough. Perhaps somebody can make it trendy to "get your SPF on" ?
--TheOrangeSquid Is it any wonder things seem so awry? We swim in a sea of confusion and don't have to think to survive
They indirectly help fight spam by fighting spoofing.
I have published my SPF data - so at least other people have the option of identifying whether stuff that claims to have come from my domain is legitimate or not. But our mailers are not yet doing SPF lookups. When we have a little time, we will probably add it to the postfix server. If the site specifies in their SPF record to 'hard' drop email that comes from anywhere but x (and the connecting server is at y), we'll treat it like an RBL.
The down side to having GWAVA one hop in from the postfix server is that some spammers get paid if the receiving mail server accepts the whole message - a (250 OK) by postfix means the spammer gets paid even if GWAVA later throws away the message. The GWAVA for Linux product is in beta test; once it goes official, I might be able to move it onto the postfix server, and hang up on the bad messages earlier.
"The most sensible request of government we make is not, "Do something!" But "Quit it!"
I've pondered over SPF myself, but I'm not really enamoured of it after reading all the pros and cons. I do publish a TXT record with SPF data for miggy.org, but only to say "these are the hosts/IPs that are DEFINITELY ok to receive email from claiming to be from miggy.org, but don't go dropping things on the floor just because I don't list another host here". i.e. people can use that record to whitelist (or upscore) the genuine miggy.org email, but won't use it to definitely blacklist miggy.org email from other hosts, although I guess they can downscore such if they like.
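To make the "advisory only" idea concrete: what a record says about unlisted hosts comes down to the qualifier on its trailing `all` mechanism (`-` fail, `~` softfail, `?` neutral, `+` pass). Below is a rough sketch that reads that qualifier out of a record string. The records in it are made-up examples, not the actual miggy.org record, and it ignores `redirect=` and `include:` entirely.

```python
from typing import Optional

QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def spf_default_result(record: str) -> Optional[str]:
    """Return the result an SPF record assigns to hosts it does not list.

    Returns None if the text is not an SPF record or has no 'all' term.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return None
    for term in terms[1:]:
        t = term.lower()
        if t == "all":                      # no qualifier means "+"
            return "pass"
        if len(t) == 4 and t[0] in QUALIFIERS and t[1:] == "all":
            return QUALIFIERS[t[0]]
    return None

# Hypothetical records: the first only vouches for listed hosts,
# the second asks receivers to reject everything else.
print(spf_default_result("v=spf1 a mx ip4:192.0.2.0/24 ?all"))
print(spf_default_result("v=spf1 mx -all"))
```

A receiver that only wants advice can treat anything short of `fail` as a scoring hint rather than a verdict.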
I don't use SPF at all at the MTA level, although I do allow Spamassassin's SPF rules to add to its scoring.
My main problem with SPF is the maillist one, and of course at least one solution to that, VERP, then interacts badly with greylisting. And of course that objection applies to the variations on SPF as well, to the best of my knowledge.
Actually the way I'm using SPF sums up my approach to spam counter-measures; try to use anything only as an advisory about the likelihood of the email being spam. My one exception to this is the use of Spamhaus' RBL as past experience has shown it to work near enough to 100% accurately to not be a problem (I've never had a user report a problem sending or receiving email with the culprit turning out to be an SBL-XBL false positive).
I use a (usually) sophisticated biological neural network consisting of a multi-billion plus nodes with some primitive pre-determined wiring structures serving as a foundation. Oh yes, and as a preliminary step, I use dual-stage filters: spamassassin followed by crm114. Spamassassin seems to be fairly well behaved by not giving too many false positive spam indications, and CRM114 picks through the remaining false negatives to my satisfaction. I still end up picking through the spam folders, but it's bulk and not too difficult to plow through many at a time once they're initially sorted.
I've found there is rarely one single solution/software that provides *effective* anti-SPAM when you run your own mailserver.
Small shops or IT depts. without the technical know-how may be better served by farming their anti-SPAM out to a service. While running everything thru GMail sounds great, a quick read of the GMail TOS (or the TOS for any other "free" E-Mail service) should convince those who need reliability *and* accountability in their E-Mail service to either do it themselves or *pay* someone else to do it.
My personal anti-SPAM defenses use MILTERs, most-notably MIMEDefang. I invoke ClamAV and SpamAssassin thru MIMEDefang. I use sendmail, but its facilities for spotting the obviously fraudulent garbage are, like most stock MTAs, woefully inadequate. Beyond RBLs, the three really effective anti-SPAM features in sendmail are only available in v8.13.x, and are:
GreetPause - configurably delays presentation of the HELO banner, and rejects hosts that send commands prior to the HELO banner display
RateConn - configurably limits the number of new connections a given IP/network/host/Domain may request at any one time
ClientConn - configurably limits the number of connections a given IP/network/host/Domain may have at any one time
Use of these features significantly (but not dramatically) reduces SPAM - best of all, the reduction is *very* early in the SMTP conversation, minimizing resource wastage.
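For readers who don't run sendmail, the two connection limits boil down to a per-client rate cap and a per-client concurrency cap. The sketch below is plain Python illustrating that idea only; it is not sendmail configuration, and the class name and thresholds are invented for the example.

```python
from collections import defaultdict, deque

class ConnectionLimiter:
    """Per-IP cap on new connections per time window and on open connections."""

    def __init__(self, max_new_per_window, window_seconds, max_concurrent):
        self.max_new_per_window = max_new_per_window
        self.window_seconds = window_seconds
        self.max_concurrent = max_concurrent
        self._recent = defaultdict(deque)   # ip -> timestamps of accepted connects
        self._open = defaultdict(int)       # ip -> currently open connections

    def try_open(self, ip, now):
        """Return True if the connection is allowed, False to refuse it."""
        recent = self._recent[ip]
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()                # forget connects outside the window
        if len(recent) >= self.max_new_per_window:
            return False                    # too many new connections lately
        if self._open[ip] >= self.max_concurrent:
            return False                    # too many simultaneous connections
        recent.append(now)
        self._open[ip] += 1
        return True

    def close(self, ip):
        if self._open[ip] > 0:
            self._open[ip] -= 1

limiter = ConnectionLimiter(max_new_per_window=2, window_seconds=60, max_concurrent=1)
print(limiter.try_open("192.0.2.7", now=0))   # first connection is allowed
print(limiter.try_open("192.0.2.7", now=1))   # refused: one is still open
```

Both checks happen before any SMTP command is processed, which is why they cost so little.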
MIMEDefang gives a lot of power to the mail system admin - of course, the power needs knowledge to be really useful. But some of the checks I do via MIMEDefang (exempting my own mail hosts, of course), by SMTP conversation step, are:
HELO - if it's an IP HELO, reject the connection if it lacks square brackets or the IP does not match the actual IP of the connecting host; if it's a string, reject the connection if it is not an FQDN, or if it claims to be a host in any of my Domains, or it contains "localhost"
MAIL FROM - reject if the sender claims to be from any of my Domains
RCPT TO - reject if the recipient address is not valid or not in any of my Domains
The nice thing about MIMEDefang is that I can do those checks long before DATA - the spammer's ability to waste my resources is minimized.
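The HELO checks listed above could be sketched roughly like this. It is standalone Python rather than MIMEDefang filter code, `example.org` stands in for "my Domains", it handles IPv4 literals only, and the exemption for one's own mail hosts is not modelled.

```python
import ipaddress
import re

MY_DOMAINS = {"example.org"}   # placeholder for the locally hosted domains
_LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")

def helo_rejection_reason(helo, peer_ip, my_domains=MY_DOMAINS):
    """Return a reason to reject the HELO argument, or None to accept it."""
    name = helo.strip().lower()
    try:
        literal = ipaddress.IPv4Address(name.strip("[]"))
    except ValueError:
        literal = None
    if literal is not None:                       # the client sent an IP
        if not (name.startswith("[") and name.endswith("]")):
            return "IP HELO without square brackets"
        if literal != ipaddress.IPv4Address(peer_ip):
            return "IP HELO does not match connecting address"
        return None
    if "localhost" in name:
        return "HELO contains localhost"
    host = name.rstrip(".")
    labels = host.split(".")
    if len(labels) < 2 or not all(_LABEL.match(label) for label in labels):
        return "HELO is not an FQDN"
    for domain in my_domains:
        if host == domain or host.endswith("." + domain):
            return "HELO claims to be a host in a local domain"
    return None

print(helo_rejection_reason("[192.0.2.1]", "192.0.2.1"))
print(helo_rejection_reason("mail.example.org", "198.51.100.4"))
```

Everything here needs only the HELO string and the peer address, so it can run before MAIL FROM is even sent.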
After DATA, I run the E-Mail thru ClamAV, look for suspicious characters in the headers, malformed MIME, etc. If the E-Mail passes all that, I run it thru SpamAssassin and tag it if needed. I allow my users to set individual SPAM score limits, so the final act is to delete any recipients for whom the SPAM score exceeds their personal limit (and discard the E-Mail entirely if no one is left to receive it), re-build it so it conforms to standards, and then deliver it.
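The per-user limit step is simple enough to show in a few lines. This is an illustration in Python with invented addresses and scores, not the actual MIMEDefang filter; users without a personal limit keep receiving everything.

```python
def recipients_to_keep(score, recipients, limits):
    """Drop recipients whose personal spam limit the message score exceeds.

    `limits` maps a recipient address to that user's limit; recipients
    with no entry have not set a limit and always stay on the message.
    """
    kept = []
    for rcpt in recipients:
        limit = limits.get(rcpt)
        if limit is None or score <= limit:
            kept.append(rcpt)
    return kept

limits = {"alice@example.org": 5.0, "bob@example.org": 10.0}
rcpts = ["alice@example.org", "bob@example.org", "carol@example.org"]

remaining = recipients_to_keep(7.5, rcpts, limits)
print(remaining)            # alice is removed, bob and carol still get it
if not remaining:
    print("discard the message entirely")
```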
For those users who take advantage of the SPAM score limits, very little SPAM leaks thru. Even for those who don't, the SPAM is usually tagged as such.
My personal anti-SPAM philosophy is "Reject early, reject often". The sooner you spot the obviously fraudulent connection and drop it, the less of your resources the spammer gets to waste. A lot of SPAM can be stopped long before the DATA step - save "expensive" (in terms of bandwidth, CPU and disk) tools like ClamAV and SpamAssassin for the more-clever SPAM.
See their web site here...
Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
The Theorem Theorem: If If, Then Then.
In other words, it's crotchety as hell. I have all the "speak the RFC's exactly or thy shall not pass" options turned on. I publish an SPF record, for what good it will do. I also 5xx reject anything from overseas.
Even though this is my own personal mail server, I haven't had too many false positives as far as rejects go... certainly nothing that a tweak here or there in the allow/deny hosts file wouldn't take care of.
All in all, I've received less than a dozen pieces of spam in the last year and a half. Not too shabby, I think.
I use spamd on OpenBSD to do greylisting. That cuts an enormous amount of spam out.
For those who aren't familiar with greylisting, when an smtp server attempts to deliver an e-mail the from address, to address, and IP address of the sender are put in a database and the mail is refused with a non-permanent error code.
Assuming the smtp server sending the e-mail follows the RFC, it will try again later. When it tries again at least 20 minutes after the original attempt, the greylisting server accepts the e-mail and adds the IP address of the source to a whitelist. For the next 30 days, any e-mails from it are white-listed. After that, the server is verified again.
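Here is a toy version of that logic, using the 20-minute and 30-day figures from the description. It is only meant to show the state machine; it is not how OpenBSD's spamd is implemented, and times are passed in as plain seconds so the example stays deterministic.

```python
MIN_RETRY_DELAY = 20 * 60              # seconds before a retry is honoured
WHITELIST_LIFETIME = 30 * 24 * 3600    # seconds a verified IP stays trusted

class Greylist:
    def __init__(self):
        self.first_seen = {}   # (ip, sender, recipient) -> time of first attempt
        self.whitelist = {}    # ip -> time at which the whitelist entry expires

    def check(self, ip, sender, recipient, now):
        """Return 'accept' or 'tempfail' (a non-permanent 4xx refusal)."""
        expiry = self.whitelist.get(ip)
        if expiry is not None:
            if now < expiry:
                return "accept"
            del self.whitelist[ip]            # expired: verify the server again
        key = (ip, sender, recipient)
        first = self.first_seen.get(key)
        if first is None:
            self.first_seen[key] = now        # first attempt: remember and refuse
            return "tempfail"
        if now - first < MIN_RETRY_DELAY:
            return "tempfail"                 # retried too soon
        del self.first_seen[key]
        self.whitelist[ip] = now + WHITELIST_LIFETIME
        return "accept"

g = Greylist()
print(g.check("192.0.2.1", "a@example.net", "me@example.org", now=0))
print(g.check("192.0.2.1", "a@example.net", "me@example.org", now=1200))
```

Spam software that never retries simply never gets past the first `tempfail`.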
I also keep a separate white-list for non-RFC compliant servers and for frequent senders. Some servers only try one to three times and quit. Another problem is that e-mail from some large e-mail farms may make each attempt to deliver the e-mail from a different server with a different IP address, so I'll add their e-mail addresses to the white-lists as well.
One method I use for adding IP addresses of selected senders that send a lot of legitimate e-mail to the whitelist is to look up their SPF records and use that to identify the usual e-mail servers for the domain.
A few ISPs appear to put their entire address space in the SPF record. For example, panix.com's SPF record is
Needless to say they don't get whitelisted since I only want to whitelist e-mail servers, not their users spam-zombie computers.
In other words, I use the SPF records to identify legitimate e-mail servers from selected domains only.
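Pulling whitelist candidates out of an SPF record can be sketched as below. The record is invented (it is not the panix.com record mentioned above), only `ip4:` terms are handled, and the cut-off of /24 for "too broad to be just mail servers" is my own arbitrary choice.

```python
import ipaddress

def mail_server_networks(spf_record, min_prefix=24):
    """Return the ip4 networks in an SPF record that are narrow enough to whitelist."""
    networks = []
    for term in spf_record.split():
        term = term.lstrip("+")                  # "+ip4:" is the same as "ip4:"
        if not term.lower().startswith("ip4:"):
            continue                             # skips -ip4:, mx, include:, all, ...
        net = ipaddress.ip_network(term[4:], strict=False)
        if net.prefixlen >= min_prefix:          # ignore whole address spaces
            networks.append(net)
    return networks

# Hypothetical record: two plausible mail-server ranges and one /16
# that looks like an ISP's entire customer space.
record = "v=spf1 ip4:192.0.2.10 ip4:198.51.100.0/28 ip4:203.0.113.0/16 ~all"
for net in mail_server_networks(record):
    print(net)
```

The wide range is dropped, so customer machines behind it still have to pass greylisting like everyone else.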
I used to "roll my own" with SpamAssassin and MimeDefang. Then I started using CanIt at work (I liked them initially because the author is the author of MimeDefang). They have a free version that works well for me at home now. We have been using it for about 4 years at work and it does a great job incorporating grey listing, SA, MimeDefang, ClamAV, etc. into an easy to install and maintain system with a nice web interface and a database backend. It can scale well when we need it to and the support is great (a MAJOR factor for my company).
Did I mention it is cheaper than the other commercial offerings as well? OSS, great support, low cost!
Dennis (I know this sounds like a commercial, but I am not affiliated with Roaring Penguin in any way other than being a very satisfied customer)
www.spampal.org
I killed da wabbit -Elmer Fudd
The last three years I used a free client side filter on Unix, just a shell-awk script, a Bayesian-like filter. Also a white list of the addresses I have accepted, not a challenge-response type of white list, but a passive one of the addresses I accept.
I train it every couple weeks to generate the white list and the spam probability tokens.
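For anyone curious what "spam probability tokens" amount to, here is a stripped-down Bayesian-style scorer. It is Python rather than shell and awk, it is not the script described above, and the smoothing and the way token odds are combined are just one common choice.

```python
import math
from collections import Counter

def train(spam_msgs, ham_msgs):
    """Return {token: probability that a message containing it is spam}."""
    spam_counts = Counter(w for m in spam_msgs for w in set(m.lower().split()))
    ham_counts = Counter(w for m in ham_msgs for w in set(m.lower().split()))
    probs = {}
    for token in set(spam_counts) | set(ham_counts):
        # Add-one smoothing keeps every probability strictly between 0 and 1.
        s = (spam_counts[token] + 1) / (len(spam_msgs) + 2)
        h = (ham_counts[token] + 1) / (len(ham_msgs) + 2)
        probs[token] = s / (s + h)
    return probs

def spam_probability(message, probs):
    """Combine the per-token probabilities of known tokens via summed log-odds."""
    log_odds = 0.0
    for token in set(message.lower().split()):
        p = probs.get(token)
        if p is not None:
            log_odds += math.log(p) - math.log(1 - p)
    log_odds = max(-50.0, min(50.0, log_odds))   # avoid overflow in exp()
    return 1 / (1 + math.exp(-log_odds))

probs = train(["buy cheap pills", "cheap pills now"],
              ["meeting at noon", "lunch at noon"])
print(spam_probability("cheap pills", probs) > 0.5)
print(spam_probability("meeting at noon", probs) < 0.5)
```

Retraining every couple of weeks just means rebuilding `probs` from the current spam and ham folders.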
I pay poor children in [???] $0.01 / hour to filter my mail for me. It's cheaper than buying SPAM filtering software.
No, I will not work for your startup
> They've done so for the past few years, and it seems to work *very* well.
My previous ISP imposed Postini on me with no notice (they sent me an email bragging about it three days after they started using it). It passed 50% of the spam and stopped 20% of the ham. I turned it off.
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
TMDA catches all my spam. It does not examine content. It sends a request for response to all unknown senders. Since the vast majority of spam has forged return addresses, no responses are sent back and the mail stays in the TMDA pending queue until it expires. Humans, on the other hand, reply, and their mail is removed from the pending queue and gets through. When I set up TMDA, I populated the whitelist with all the email addresses of my correspondents and lists.
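The challenge-response flow can be modelled in a few lines. This is a conceptual sketch only, not TMDA's code or configuration; the class, the return values and the 14-day pending lifetime are all made up for illustration.

```python
class ChallengeResponseFilter:
    def __init__(self, whitelist, pending_lifetime=14 * 24 * 3600):
        self.whitelist = set(whitelist)
        self.blocked = set()
        self.pending = {}                  # sender -> [(arrival time, message)]
        self.pending_lifetime = pending_lifetime

    def receive(self, sender, message, now):
        """Return 'deliver', 'drop', 'challenge' or 'hold' for an incoming mail."""
        if sender in self.blocked:
            return "drop"
        if sender in self.whitelist:
            return "deliver"
        first_from_sender = sender not in self.pending
        self.pending.setdefault(sender, []).append((now, message))
        # Only the first held message triggers a confirmation request.
        return "challenge" if first_from_sender else "hold"

    def confirm(self, sender):
        """The sender answered the challenge: whitelist and release held mail."""
        self.whitelist.add(sender)
        return [msg for _, msg in self.pending.pop(sender, [])]

    def expire(self, now):
        """Discard pending mail nobody confirmed within the lifetime."""
        for sender in list(self.pending):
            kept = [(t, m) for t, m in self.pending[sender]
                    if now - t < self.pending_lifetime]
            if kept:
                self.pending[sender] = kept
            else:
                del self.pending[sender]

f = ChallengeResponseFilter({"friend@example.org"})
print(f.receive("friend@example.org", "hi", now=0))
print(f.receive("stranger@example.net", "hello?", now=0))
```

Mail with a forged return address never gets a `confirm`, so it just ages out in `expire`.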
Around 75% (150/200 daily) of my email is spam. After a month my false positive rate was around 0.5% (1/200) and most of those were mass mailed offers I would not miss. My false negative rate is around 0.02% (1/6000); every month or so a spam message is validated; I just move the address from validated to blocked so I'll never see it again.
I never have to see the 150 spam messages that come each day. I check the pending queue only when a business that sends email with a non-responding return address is sending me a message, like an online order confirmation.
The user can generate a keyword address when signing up for a list; messages from this address are allowed through without whitelisting. If later compromised, that address can be put on the revoked list.
I use tmda.cgi for configuration.
We use ASSP at work (a government entity) and it is effective enough that when we DO have a spam slip through, users usually call to complain about it. It happens rarely enough that they forget to forward it to spam@.
:) We looked at SpamAssassin, DSPAM, plain bayesian filtering (libmilter), ip blacklisting, RBL, forced validation schemes, .... ad nauseum. Unfortunately I hadn't gotten around to testing ASSP yet.
I also use it at home and have nearly the same effectiveness.
As far as various technologies go, I don't believe any solution which relies solely upon one or two technologies will be that effective. ASSP seems to be the best so far at combining SPF/Greylisting/bayesian/various others. I implemented several versions of anti-spam systems for filtering an average 300k+ messages per day at an ISP and NOC peaking around 500-650k during holidays, so I do have SOME prior experience with this issue.
Interesting. My ISP introduced it as an opt-in service (just like they introduced SpamAssassin and various other tools to the user base), and while it did require some fine tuning, I've had very few problems with it (I get a handful of Spams a day which it doesn't catch, and I see one or two false positives a month).
I don't blame you for dropping it given how it was introduced at that ISP, but I think you also lost a chance to use a fairly effective anti-spam tool.
Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
The Theorem Theorem: If If, Then Then.
There are lots of short-term solutions, to which spammers adapt as they get widely adopted.
For example, content filtering in general is largely a short-term solution. Spammers invent and use obfuscation tricks; tools detect them, spammers invent new ones. Rinse, repeat.
Longer term solutions have to address root causes. These increase the consequences of spamming. IP blacklists, URI blacklists, domain blacklists, for example, result in negative consequences for bad actors and their associates. (Including folks who claim that they're not associates, where that association consists of sending money to the same folks for network connectivity, i.e. being customers of an ISP or webhost or ESP that harbors spamming customers.)
The way things are going, I see a continuing trend toward reputation services, where the reputation is that of an identity confirmed using one (or more) of the Email Authentication technologies - CSV (my favorite), SPF or DKIM/DomainKeys. (I've been building one, so I'm biased.) Only senders with positive (not just neutral) reputations will get through. Greylisting will, as another poster mentioned, be key in preventing spammers from getting one step ahead.
Complementary long term solutions include HashCash (e.g RubberStamp) -type solutions and better security.
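The HashCash idea is that the sender has to burn CPU time to produce a stamp that the receiver can verify with a single hash. A simplified proof-of-work sketch follows. The stamp layout here (`resource:counter`) is invented for the example and is not the real Hashcash stamp format, and 12 bits is far too easy for real use.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits

def mint(resource: str, bits: int = 12) -> str:
    """Search for a stamp whose SHA-1 hash starts with `bits` zero bits."""
    for counter in count():
        stamp = f"{resource}:{counter}"
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp

def verify(stamp: str, resource: str, bits: int = 12) -> bool:
    """Cheap check for the receiver: one hash, however long minting took."""
    digest = hashlib.sha1(stamp.encode()).digest()
    return stamp.startswith(resource + ":") and leading_zero_bits(digest) >= bits

stamp = mint("me@example.org")
print(verify(stamp, "me@example.org"))
```

Each extra bit doubles the expected minting work while verification stays a single hash, which is where the cost to bulk senders comes from.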
SpamAssassin is a victim of its own success - it's so widely used that the first thing spammers do is send their mail through a server running it, and tweaking the message until it gets through that portion of its filters that are content-based. Of course SpamAssassin's Bayesian filter component helps in that regard, as do RulesDuJour, and other features that are not on by default. It works very well when tuned.
The unfortunate fact is that most ISPs and end users refuse to step up and shoulder the costs to keep their systems secure enough not to be sources of spam. They take on spamming customers and allow infected computers to remain on their networks. Until antispam measures impose consequences that force these costs to be borne (i.e. internalize the externalities), there will be more false positives and negatives.
Make 'em pay! http://Payola.org #include "stddisclaimer