Interview With The SpamAssassin
comforteagle writes "Howard Wen has conducted an interview with Daniel Quinlan of SpamAssassin. In it he explores what keeps Daniel motivated in the face of the unrelenting torrent of spam and new spamming techniques, as well as, what is working - what is not, and what he predicts spammers have up their sleeves next for defeating spam detection." From the interview: "If you don't mind deleting spam manually, that's your prerogative, but don't complain about it. If your ISP doesn't do a good job fighting spam, then switch ISPs or install your own anti-spam software. There are a lot of choices out there."
When I got to over 300 spam a day was just about the time I tried gmail (google mail). So far this is the best spam protection I have come across. My spam folder is getting about 400 a day now but I can't remember the last time a "good" message went in there. I still get about five spam a day that I need to manually deal with.
http://www.busyweather.com/
Disclaimer: No interest in the company. Just a satisfied customer.
v1agr@ r0g@1n3
Who has noticed a decrease in the effectiveness of SpamAssassin? I have! Anyone else?
"If you don't mind deleting spam manually, that's your prerogative, but don't complain about it. If your ISP doesn't do a good job fighting spam, then switch ISPs or install your own anti-spam software. There are a lot of choices out there."
How the hell do you think the national do-not-call list came about? Because people bitched and complained! I agree there are spam solutions out there but I still think there should be an easier, more fool-proof, and legally backed way of opting out of spam.
I find laziness to be an excellent motivator.
Quinlan: That would probably be advance fee fraud, also known as "Nigerian" or "419" scams. These messages are often literally sent individually to each recipient, mutating each time, by scammers typically located somewhere in West Africa. Because they often are sent in low volume, and almost every one is somewhat different, they are a bit tricky to catch.
An easy solution for home users who don't happen to know anyone from West Africa is to just block all e-mail from there. But even without that, I have had decent success in the past with a combination of SpamAssassin tagging e-mails and Thunderbird filtering. Stay away from OE. Far, far away.
The SURBL can be found here: http://www.surbl.org. It's a very good thing, so much so that spammers are starting to try to get around it by doing stuff like this:
John.
IT IS THE BOMB. Spam loads to my work account dropped by orders of magnitude. Now, Mail.app identifies maybe 2 per day, instead of 200+.
Charles
There exists no way of exchanging information without making judgments. --Bene Gesserit Axiom
I've said it before, but I have to promote PopFile (http://popfile.sourceforge.net/) again. Since doing a bit of training, it now correctly sorts about 99% of my e-mail. I get about 600 messages a day not including mailing lists, and my accuracy is 99.65%. It is generally not susceptible to new spam techniques unless they can match the subject matter that my e-mail typically covers.
When they start spamming "Linux IPF Apache LOOK! Vi@GR@ makes your peNi$ PHP Bug CSS" I will be concerned.
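For anyone wondering what "training" means here, this is a minimal sketch of the word-frequency (naive Bayes) idea behind this kind of filter. It is not POPFile's actual code; the class name, the tokenizer, and the two-bucket setup ("spam" and "ham") are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


class NaiveBayesFilter:
    """Toy two-bucket word-frequency classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase and keep letters, digits and a few symbols spammers like.
        return re.findall(r"[a-z0-9$@!']+", text.lower())

    def train(self, text, label):
        # Record every token of a message the user has sorted by hand.
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def _log_score(self, tokens, label):
        counts = self.word_counts[label]
        total_words = sum(counts.values())
        vocabulary = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        total_messages = sum(self.message_counts.values())
        # Smoothed prior: how common this bucket is overall.
        score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
        for token in tokens:
            # Add-one smoothing so an unseen word never zeroes the score.
            score += math.log((counts[token] + 1) / (total_words + vocabulary))
        return score

    def classify(self, text):
        tokens = self.tokenize(text)
        return max(("spam", "ham"), key=lambda label: self._log_score(tokens, label))
```

Because the scores come from your own sorted mail, a message only looks like "ham" if it uses the words your real correspondence uses, which is the point made above about spam having to match your subject matter.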
Maybe I'm the lucky minority here, or my mail host has some crazy filters I don't know about, but I very, very rarely receive any type of spam. Now, I don't go handing out my email address either. If I'm signing up for something shady, I use another address at a web-based email account, which does get a lot of spam... but otherwise I use the mail host that comes with my website http://www.surpasshosting.com/ and Thunderbird as a client, and never see any type of spam.
From TFA:
The greater challenge is that the new techniques never stop coming. It's possible spammers will eventually run out of tricks, but it definitely hasn't happened yet. Most techniques backfire fairly in the long run, and make it more obvious that a message is spam.
You gotta wonder if there is a spam "bubble" that will burst pretty much like every other bubble. It started the same way: a few scammers got the idea of sending out scams via email and were quite successful, and everyone else started to jump on board. But soon enough (hopefully) people will learn their lesson and spam will slow... maybe I'm putting too much faith in people.
But it is interesting to see how many "me too" trends there are in spam. Up until about 2 years ago, I never received a 419 scam, but now I get at least one a week. Up until about a year ago, I never received a Rolex email (typically the domain of brick and mortar (ok, urine soaked streetcorner) drifters), but now I get a few a day.
Monstar L
Two words: Spam Arrest. Zero spam, no filters to nurse, no lost mail.
Slashdot entertains. Windows pays the mortgage.
SpamAssassin got 'native' SURBL support in 3.0
I bet he has cool business cards:
Daniel Quinlan - Spam Assassin
He can tell people his job is to kill spammers. Which reminds me, I wonder if anyone at the IRS actually checks what job title you put on your tax forms?
I Am My Own Worst Enemy
Don't publish your e-mail address in a public forum, only idiots do crap like that and they get what they deserve.
E.
Never rub another man's rhubarb - The Joker
I can second that. I have been using popfile for months, and it is currently doing an excellent job of putting my spam in a separate folder from my other correspondence.
John Sauter (J_Sauter@Empire.Net)
...God bless Daniel Quinlan and people like him. I have had a hell of a time with my daughter's email. A LOT of Web sites for kids have a "mail a friend" option. At one point my daughter wanted to use that option on a few sites. These are kid-oriented sites with privacy statements, so the sites felt trustworthy.
Fast forward to two weeks later, and one of those #@!&^ing sites has sold her email address to every spammer in the nation. My little kid got 196 spams yesterday -- for Viagra, lesbian cheerleader porn, you name it. So I have become heavily interested in every anti-spam product known to man. I've got 'em on the server, and got 'em on the client. Right now, with redundancy, this is 99% accurate, and my daughter gets only messages from friends and family. My biggest problem is not that spam gets through, but that false-positives block a legit message every now & then. That is the area I hope improves the most.
My Greasemonkey scripts for Digg &
When I get spam I don't want I just unsubscribe from it. "WHAT DO YOU MEAN IT WILL TAKE 72 HOURS TO REMOVE MY ADDRESS?"
"If you don't mind deleting spam manually, that's your prerogative, but don't complain about it. If your ISP doesn't do a good job fighting spam, then switch ISPs or install your own anti-spam software. There are a lot of choices out there."
It seems pretty simple to me: complaining leads to awareness, which leads to action. Maybe a bunch of people on Slashdot griping about spam won't amount to jack, but let Oprah or someone else with a grappling hook or two on the office/church/bar water cooler complain about it and they can make a difference in social attitudes.
SpamAssassin is a good step but the real problem is the social system which makes spamming possible. How else can you explain a 60-year-old grandmother 1) using her computer as a spam relay, 2) acknowledging it on television, and 3) not seeing it as a problem because it's "legal" and she's getting regular cheques to do so?
How is it that a social/legal system can be designed to bankrupt and scare the shit out of people who share a few movies or songs but barely put a dent in the people sending out millions of useless, offensive, and content-bordering-on-the-illegal emails? Is there nothing wrong with this?
A pop3 proxy works great. I recommend SpamBayes.
http://spambayes.sourceforge.net/
My company uses a spam appliance called Meridius. It's based on some proprietary technology and uses spam assassin as a second layer. It has a very slick interface and stops about 97% of spam. Oh and it's made by a Canadian company called BlueCat Networks.
What's wrong with personalized training? I get more spam than almost anyone I know, and SpamBayes does a fantastic job for me.
i've been using yahoo for years and i get about 3000 spams per month which averages to about 100 spams a day. not quite as much as you, but i get about 1 spam a week that falls into my regular inbox. however, some of my newsletters which i subscribed to did get marked as spam and after marking them as not spam a couple of times, yahoo spam filter was smart enough not to do it again.
HD Trailers
I admin a handful of domains and I don't use anything except blocklisting by IP address. I get a handful of spam emails per week that regularly get reported to Spamcop. Since I am in regular contact with many of the people that email me, I can be sure to know if I am falsely blocking innocent domains - hasn't happened yet. For some reason it makes many people crazy that my method works for me - so many people think they have the absolute right to contact me if it suits them. I feel that if you do business with a spam-supporting ISP, you have nothing to say that I need to hear.
It's simple: I demand prosecution for torture.
I bet the third post will be by a Google-employee bragging about how great Gmail's spam protection is (of course omitting the fact that every mail read with gmail conjures 20 AdNonSense-ads by Google's online-pharmacy and casino-spam friends).
Disgusting.
DSPAM is what worked best for me. It is not easy to set up but definitely worth the trouble.
As of today, 99.985% spam filtering rate.
Some of us have to earn a living. If potential clients can't contact us as easily as possible they'll just try someone else.
This has both good and bad aspects. First, the good news: responsible ISPs will be able to block a good portion of spam at their routers and mailservers; it's not hard to detect and blacklist a PC which is spewing the same email to 20,000 different recipients. Unfortunately, it only takes a few poorly-configured ISPs to provide a great deal of bandwidth to spammers. Couple this with Windows' known security holes, and home users' typical apathy regarding patches and security updates, and you have a large pool of potential spam-hosts which cannot be as easily targeted as open relays or specialized spam-spewing servers. After all, if spammers are using a legitimate ISP's mail server to send spam, a remote admin can't block that mail server without also condemning large amounts of legitimate email to deletion, which may well be unacceptable.
The upshot of all this? The onus of spam filtering is going to be, more and more, on ISPs rather than on recipients. While this has its good side - spam filtered at the source doesn't take up as much precious bandwidth - it also means that filtering will be more difficult for those not close to the source.
That's it. I'm no longer part of Team Sanity.
We run a cluster of Barracuda Networks spam firewalls. They use mainly open-source software (spam-assassin on Linux, plus lots of other stuff), are super-easy to install, and they advertise on Slashdot. What more do you want?
Help save the critically endangered Blue Iguana
It depends on how you define "spam-free." If you mean that nobody is sending spam, posting blog spam, sending spam over chat networks, etc. then I think the chances are rather slim. If you mean that most people will rarely see [email] spam, then I think it's possible.
But I think that one would lead to the other. If relatively few people are seeing spam, then suddenly spamming is no longer making money for the spammers, and they would eventually stop actually sending it.
Of course that's an optimistic scenario. It would probably lie somewhere in the middle. Fewer and fewer people see the spam, so spamming itself is less and less cost effective. Fewer and fewer spammers participate, while the remaining ones will have to reduce their fees since there will be fewer views. Fewer spammers and less money mean less innovation. Eventually (hopefully), the entire movement will slow down until spamming is only done by a few recluses targeting only the most oblivious users.
Punctanym: alternate spelling of words using punctuation or numerals in place of some or all of its letters; see 'leet'
Trying to keep your email address private is the modern equivalent of tilting at windmills. All it takes is one friend sending you an "e-card" or something similar, and your email address is spreading through spammers' lists faster than.. uh.. something that spreads really quickly.
Also, people don't deserve to get spammed to hell because they post their email addresses in public forums. Slashdotters take things like "don't publish your email address" for granted, but it's only common sense to us because we know how all this works. The average user has likely never heard of an email harvester.
- Reject if on the spamhaus list
- Reject if claiming to be your mail server in the helo
- Reject if claiming to be RFC1918 space in the helo
- Reject if there isn't a '.' somewhere in the middle of the helo (simple way of checking for FQDN)
In addition, configure sendmail to do rcpt flood rejects, and even better, enable greet_pause. I've rejected quite a few with those. Anything that gets through all of that is then analyzed by spamassassin. With Bayesian training, my current threshold is 3.0. Anything legit is normally -2.0 or less. I totally DROP through mimedefang anything greater than 7.0. Anything from 3-7 is dumped in a special folder on my local account via procmail. I analyze that stuff every now and then to see if it is time to once again lower the thresholds.
Also, continue to do the RBL checks in spamassassin (although it's a little redundant since I check spamhaus in mimedefang). That way you also get scoring based on SURBL..good stuff.
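The HELO checks and score thresholds described above can be sketched roughly like this. It is plain Python for illustration, not a mimedefang filter; the function names and return strings are made up, and the Spamhaus lookup is left out because it needs a live DNS query.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def helo_rejection_reason(helo, my_hostnames):
    """Return a reason string if the HELO should be rejected, else None."""
    name = helo.strip().strip("[]").lower()
    # A remote client claiming to be our own server is lying.
    if name in {host.lower() for host in my_hostnames}:
        return "HELO claims to be this mail server"
    try:
        address = ipaddress.ip_address(name)
    except ValueError:
        address = None
    if address is not None:
        # Private address space should never show up from the outside.
        if address.version == 4 and any(address in net for net in RFC1918_NETWORKS):
            return "HELO is an RFC 1918 address"
        return None
    # Crude FQDN test: require a dot somewhere in the middle of the name.
    if "." not in name.strip("."):
        return "HELO is not a fully qualified name"
    return None


def route_by_score(score, tag_threshold=3.0, drop_threshold=7.0):
    """Map a spam score to the three outcomes described in the comment."""
    if score > drop_threshold:
        return "drop"
    if score >= tag_threshold:
        return "quarantine"
    return "deliver"
```

The defaults of 3.0 and 7.0 are the thresholds quoted above; they are parameters so they can be lowered as the Bayes training improves.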
Spam Bayes with Outlook correctly handles over 95% of my spam.
In fact I've found it works great as a personal filter, if you configure it somewhat differently from the way the documentation suggests. That is, increase the weight of the Bayes filter, and have it train itself on every message it classifies. Then correct it on any mistakes it makes - which rapidly become few and far between.
Here's a paper showing that SpamAssassin can achieve as good results as others touted for personal use.
Unfortunately SpamAssassin is a bit hard to install and set up. But if you have RedHat or Debian Linux, it is available by rpm/apt and you can install a few scripts to make it work.
I wish I had a better shrink-wrapped version, but I don't. So I'm supplying the raw files for one user in the hopes that (a) somewhat technical people can reproduce the setup and be happy, (b) somebody will make a shrink-wrapped version, perhaps with plugins or extensions or macros for more mail clients.
Here is the Linux Personal Spamassassin setup.
Sort of. See that button above an e-mail that says 'not spam'? Yes, that's the one. If a message appears in your spam box, click that button, and it will be moved from the spam box to the inbox and taken off the 'spam' list, effectively adding it to a whitelist.
With a full screen terminal window, I can mark spam based on the name and the subject header. I can recognize spam at a rate of about 10 per second this way. With the names spammers pick, and the mis-spelled subject headers, it is pretty easy to pick them out.
Using pine, I never give a spammer info by opening web bugs. I can look at the raw email by typing "h" to show the headers, so all those phishing emails are immediately obvious.
Keeping the email on the isp's server means that when I rebuild a machine, I don't have to worry about backing up my email.
But the real question is: does the interview pass SpamAssassin's filters?
(Posting as AC so no karma whoring here...)
I'd have to strongly disagree with people here saying that SpamAssassin doesn't work, etc. I run four different SA installations for different types of companies. After an initial training period, we've never looked back. Granted, I spend some time writing custom rules to be able to train it for their specific environment. But regular expressions aren't that difficult, people. Our accuracy for each site runs in the 96-98% range, and we've also implemented some stricter sendmail access lists to keep crap from getting in to begin with. I've got very happy customers. So sit down over the weekend, read all the faq's, and get your SA installation tuned properly so you can stop bitching about such a great piece of software!
... and some oh-so-insightful shithead that comes on and waggles his finger and clucks at someone making a complaint, adding nothing but their righteous prattle to the conversation.
I use gmail as my primary email. Good enough for you?
I am no longer wasting my time with slashdot
Since I implemented the above as a Postfix ruleset, I don't get spam anymore, and it's not exactly like I've actually kept my primary address secret. No, I'm not kidding or exaggerating - basically, my mailbox is my own once again. Viva Postfix! Viva greylisting!
Dewey, what part of this looks like authorities should be involved?
I manage a couple of ISP incoming MTAs. They come looking for an anti-spam and anti-virus solution, which is easy to provide them in OSS land.
...
First Qmail setup to use RBLs
cbl.abuseat.org sbl-xbl.spamhaus.org relays.ordb.org dynablock.njabl.org list.dsbl.org dul.dnsbl.sorbs.net
That bunch will block a whole lotta spam before it ever gets to discuss sending mail with the SMTP server.
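For the curious, an RBL check is just a DNS lookup: reverse the octets of the connecting IP, append the list's zone, and see whether the name resolves. A minimal sketch, assuming IPv4 only; the function names are made up, and the zone is a parameter because some of the lists named above may no longer be in service.

```python
import ipaddress
import socket


def dnsbl_query_name(client_ip, zone):
    """Build the DNS name to look up for an IPv4 address in a blocklist zone."""
    address = ipaddress.IPv4Address(client_ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(client_ip, zone, resolver=socket.gethostbyname):
    """True if the blocklist answers for this address, False if the name does not resolve."""
    try:
        resolver(dnsbl_query_name(client_ip, zone))
    except OSError:
        # Lookup failures (including "no such name") are treated as not listed.
        return False
    return True
```

The resolver is injectable so the logic can be tried without touching the network; a real MTA does this lookup itself before the SMTP conversation gets going.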
Next, SimScan from Inter7.com, this little c app runs at the front end of the SMTP process, it will scan incoming mail at SMTP level with ClamAV and SpamAssassin, anything scoring over 10 in SA is dropped at SMTP level with a 5xx error.
SimScan allows you to fine tune settings on a per domain and per user level if you so desire, so it is easy to turn SA off entirely for a user who wants all the spam they can get, ditto for those who'd rather not be protected from viruses.
Using these features you stop a LOT of spam, likely in the 80% or higher range. Most domains we've applied this to have gone from hundreds per day to less than 10 per day.
It is imperative you also use the SURBL features in SA to stop more spam than ever. You should also use Razor2, DCC and Pyzor. I suggest upping the Razor2 scores a bit as well; the defaults are quite low.
Gmail is not really a 'FREE' service you pay for it by viewing the ads. Just like broadcast tv.
Taco?
Unfortunately it seems that a lot of the zombies out there aren't so much spambots as they are proxybots. It may not seem like much of a difference, but it has a tendency to open up a whole new set of possibilities for a spammer looking for a new network to spam from. Plus, because most of the bots are on dynamic ip's they move around enough that the blocklist entries are outdated within a few days.
I released a small paper on such a network back in October.
http://lowkeysoft.com/proxy/
-steele out
Use SpamPal. It comes with blacklists, but you can turn it off because the reg expressions that came with it are very effective. There are also modules to decode base64, filter on spammed URLs, clean up web bug crap, block by country etc. & it's free.
Just set up a rule so that your kid cannot open any email that isn't signed with pgp/gpg, with a key in your web of trust. I'm tempted to impose that rule on myself and force my friends to install gpg. (Sadly I'm lazy - I haven't gotten around to making myself a key yet)
From the article:
Don't complain? Don't complain about dealing with spam? Don't complain about paying money (ISP mail servers cost money, and you pay for them) so that some fucktard breaking the law (spamming is illegal in many places) can waste the time of millions of people every day?
I'm complaining about you Daniel Quinlan. Go write a filter for me, you're good at it. I'll complain exactly as much as I like. I'll write to my elected representatives. I'll campaign to change the law. I'll demand good service from enterprises I patronise. I'll advise my friends how to do the same.
You think I should give up my power as a voter and consumer because there's a bit of software which helps? You think I should just sit here and take it in the ass from the spammers because 93.6174% of the time some code stops their dicks before they reach my butt-cheeks? Fuck you. I'm going to bitch, complain, campaign and whine till the day comes that I don't need your stinkin' software.
Chernobyl 'not a wildlife haven' - BBC News
It's hard to picture a shorter route to corruption. When law enforcement officers fund themselves by taking stuff, the main incentive isn't to serve justice any more, it's to ... take stuff. This is exactly the problem faced by a lot of the former Soviet Union and Latin America: When the government can't (or won't) pay police enough to have a decent standard of living, they go into business for themselves. Not good.
All these server-based solutions are great, but what about the average (ok, slashdot average) cable modem or dsl user? We get lots of spam, too, and the isp's filtering is pretty bad in my case. For me, something I can attach to Thunderbird or place between Thunderbird and my isp would be very helpful.
"Hi, I'm Dr. Adams, and welcome to the Planet Arium."
"I thought this was the planetarium."
"It is, I have a bone disease that prevents me from saying the 't' in Planet Arium."
Interesting - the whole "copy and remove space" idea is something you see all over the place as an anti-harvesting technique for email addresses (like slashdot employs). Now spammers themselves are using anti-spam measures from within their spam to combat spam filters....
This is an old technique. Haven't you ever gotten email about "V1a gra"?
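A filter can undo a fair amount of this by normalizing the text before matching. A rough sketch, assuming a small hand-picked substitution table (the table and function names are illustrative, not taken from any real filter):

```python
import re

# Hand-picked look-alike substitutions; real spam uses many more.
LOOKALIKES = str.maketrans({
    "1": "i", "!": "i", "@": "a", "4": "a",
    "0": "o", "3": "e", "$": "s",
})


def normalize(text):
    """Lowercase, map look-alike characters to letters, drop everything else."""
    mapped = text.lower().translate(LOOKALIKES)
    return re.sub(r"[^a-z]", "", mapped)


def contains_obfuscated(text, keyword):
    """True if the keyword appears once spacing and look-alikes are removed."""
    return keyword in normalize(text)
```

With this, "V1a gra" and "Vi@GR@" both normalize to "viagra". The catch is false positives: stripping spaces also makes "via graphite" match, which is one reason this sort of check is usually one score among many rather than a reject rule by itself.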
I received the following chain letter recently:
> IT DOESN'T MATTER IF YOU ARE REPUBLICAN OR DEMOCRAT!
> KEEP IT GOING!!!!
> 200 8 Election Issue!!
>
> GET A BILL STARTED TO PLACE ALL POLITICIANS ON SOC. SEC.
> This must be an issue in "200 8 ". Please! Keep it going.
[remaining BS deleted. Snopes has an identical version with "2004" (no space) if you're really interested.]
So it looks like something somewhere is looking for the string "2008", because chain letters are adapting themselves to it.
I can't believe no one has mentioned Yahoo! yet. Automatic, accurate spam-filtering? Yes. White-listing? Yes. Black-listing? Yes. And if you want to stick with the free account, use Yahoo!POPs to download messages into Thunderbird.
Personally, I have the upgraded (2GB) account so I can take advantage of what I consider the best anti-spam feature available anywhere: disposable email addresses.
Not sure if you want to divulge your address for a free iPod contest? Give them a disposable address where email is directed straight past your inbox and into a separate folder. When you lose that iPod contest and the spam starts pouring in, just delete the disposable address.
Sure, you can set up a free "junk mail" address with Hotmail, Yahoo!, but I've found that "checking in" on my spam is a waste of time.
Of course, the best solution is to not give out your email address.
Not sure if this is already "common knowledge" or not, but my employer runs a small mail server for several of our customers - and he's started using a technique which seems to drastically reduce incoming spam.
He set things up so whenever a new piece of email arrives from an unknown source, it sends back a "try again later" request and trashes the message. Apparently, there's a function built into the specs for SMTP servers so these "try again" requests normally get processed, and the mail is resent an hour or two later (sometimes longer if from a big ISP like AOL or something).
Since much of the spam coming in is just being blasted out by a "zombie" client, or some spam-sending software package, they generally ignore the "try later" request and simply move on down their list of addresses they're trying to shoot the spam out to.
When the mail does come through after a "try later" attempt, his mail server adds that address to a "white list" database, so future email from the same place won't get the "try later" treatment (and possibly irritate the owner of that mail server at some point!).
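This technique is usually called greylisting. A minimal in-memory sketch of the logic described above, assuming a five-minute minimum retry delay and a whitelist keyed on (client IP, sender); both choices, and all the names, are assumptions for illustration:

```python
import time


class Greylist:
    """Toy greylisting policy: defer first contact, accept a later retry."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # seconds a sender must wait before retrying
        self.clock = clock          # injectable so the logic can be tested
        self.first_seen = {}        # (ip, sender, recipient) -> first attempt time
        self.whitelist = set()      # (ip, sender) pairs that retried properly

    def check(self, client_ip, sender, recipient):
        """Return "accept", or "tempfail" (a 4xx "try again later" reply)."""
        if (client_ip, sender) in self.whitelist:
            return "accept"
        key = (client_ip, sender, recipient)
        now = self.clock()
        first = self.first_seen.get(key)
        if first is None:
            # First time we see this combination: remember it and defer.
            self.first_seen[key] = now
            return "tempfail"
        if now - first < self.min_delay:
            # Retried too quickly; keep deferring.
            return "tempfail"
        # A patient retry looks like a real mail server, so whitelist it.
        self.whitelist.add((client_ip, sender))
        del self.first_seen[key]
        return "accept"
```

A real deployment would also expire old entries and persist the whitelist, which this sketch skips.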
Permission-based email works great for me. While new correspondents need to respond to a challenge, I can have every response automatically accepted. This minimizes the delay in getting new messages, and ensures that the message is being sent by a person and not a bot. If it turns out I don't want to receive future mail from an approved new sender, I can black list them, with or without an explanation to them, with about two clicks. ChoiceMail http://www.digiportal.com/
Has single handedly raised my brownie points at the company I work for that must use Outlook.
Where around the web would there be better information about Spamassassin that any novice can understand and use without having to research vocabulary and arcane references?...
Spamassassin has difficulties for people with little mastery of computers. The difficulty of instructive materials for Spamassassin is an overabundance of jargon and references unfamiliar to people with little mastery of computers.
That's all I have to say. Gmail uses SURBL as well ;)
I remember when SURBL was first announced on /. EVERYONE said they doubted it would work! Not a single post said ,"Oh I've had this idea for ages!", "Oh I invented this idea!", "I've been using this for a long time..." Blah blah blah....
Now that SURBL IS kicking butt, suddenly everyone had the idea first! Yeah, whatever!
Go ahead and dig up the article if you don't believe me.
Some of you guys just don't get it at all. Whoever posted the RANT to DQ about not complaining.... you have no reading comprehension. Go reread what he said.
I suggest more of you so called spam experts should actually spend some time at an antispam conference. Pretty much every antispam technique you can think of, has already been thought of.
Name me one other FREE antispam program that will work as well as spamassassin.
Also if you're using an older version of SA, and not using rules from www.rulesemporium.com, then it's your own fault.
AND, the DNS bug is in Net::DNS, not SURBL.
I have wasted lots of time on spam, especially as I used to read every one that wasn't porn and had fun tracing the servers. Most of my time is spent surfing the Internet anyway. My only regret is not figuring out how to get into the spam business back before the CAN-SPAM Act.
That's why we buy machines. To do our work for us. I mean, come on. Filtering HELOs? On what criteria? And why do I care? How do I even make qmail do that? Why would I care how? And then you go straight to DNS BLs? Which ones? And what do you do if there is a match? On one BL? On two BLs? When do you reject? Do you then let SA repeat the effort down in step 6?
spamd and clamd do not consume as many resources as people whine about. If you get under 100,000 emails per day at your server, I wouldn't even bother with anything else. A poorly tuned celeron would handle that load just fine.
"Avoid employing unlucky people - throw half of the pile of CVs in the bin without reading them." -- David Brent
After I'm through with that, my server will go from humming along at a 0.00 load average to humming along at a 0.00 load average.
Thank you for helping me to see the light!
"Avoid employing unlucky people - throw half of the pile of CVs in the bin without reading them." -- David Brent
Nice sig.
"Avoid employing unlucky people - throw half of the pile of CVs in the bin without reading them." -- David Brent
I could:
- Spend time learning how to install a new MTA.
- Spend more time learning how to configure a new MTA
- Spend more time tweaking that configuration so that it meets my requirements (multiple domains/accounts, IMAP, spam and virus filtering with rejection during the SMTP session, SMTP AUTH)
- Spend more time testing my new configuration
- Realize a week later that something doesn't work the way I thought it did
- GOTO 3.
The end result would be: I have an email solution that meets my requirements. I am not in the email hosting business, and frankly I don't find email delivery to be that interesting. Qmail sucks balls--everything you want it to do requires the selection, application, and testing of a patch. If I had it to do over there is no way in hell I would have chosen it. But I have no time to dedicate to replacing it.
"Avoid employing unlucky people - throw half of the pile of CVs in the bin without reading them." -- David Brent