SpamAssassin 3.0 Released
davemabe writes "At long last, SpamAssassin 3.0.0 has been released. I've been using the release candidates for a month or so, and the results have been far improved over previous versions. Its use of SURBL along with Bayes auto learning make it seem like this solution is the one to beat. It looks like they've introduced a new logo as well. Snazzy!"
For those not in the know, SURBL is really cool. It lets you scan the message (well, SA does that) and look for the URLs it links to. It compares those against a realtime blocklist fed by other people getting spam like you, and if one is a known spam TARGET URL then it blocks the message based on that.
It makes it really hard for them unless they just register countless domains.
Excellent technology, and I will be upgrading to the newest stable.
Chris
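For anyone curious what that lookup looks like, here is a minimal sketch in Python. It is not SpamAssassin's actual code. It assumes the usual DNS blocklist convention of querying "<domain>.<zone>" and treating any answer as "listed", and it uses a naive "last two labels" rule to get the base domain, which is wrong for names like example.co.uk. The function names are made up for illustration.

```python
import re
import socket
from urllib.parse import urlparse

# Naive URL matcher: good enough for a sketch, not for real mail parsing.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

def extract_domains(body):
    """Return the set of base domains linked from a message body."""
    domains = set()
    for url in URL_RE.findall(body):
        host = urlparse(url).hostname
        if not host:
            continue
        labels = host.lower().split(".")
        # Assumption: the last two labels are the registered domain.
        domains.add(".".join(labels[-2:]))
    return domains

def blocklist_query_name(domain, zone="multi.surbl.org"):
    """Build the DNS name to look up for a domain in a URI blocklist zone."""
    return f"{domain}.{zone}"

def is_listed(domain, zone="multi.surbl.org"):
    """True if the query name resolves (needs network access and a resolver)."""
    try:
        socket.gethostbyname(blocklist_query_name(domain, zone))
        return True
    except socket.gaierror:
        return False
```

A filter would call extract_domains() on the body and then is_listed() on each result, adding to the spam score on a hit.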
The real news here is not Bayes filtering or SURBL, but the totally rebuilt plug-in architecture of SA 3.0. Plug-ins for the 2.x version were quite a bit harder to write.
Version 3.0 will result in a proliferation of good third party plug-ins that are going to put SA into more direct competition with some of the commercial vendors out there.
You know, *assassins* are the type to take out single, lone, "high value" targets, right?
Sneaking into fortresses/castles, creeping up and then offing the bad guy, or else maybe using some nice long-distance sniper rifle to take out the bad guy, or maybe choice application of poisons at the right bottle of wine, etc.
This is not appropriate for *spam*, where we're talking about waves upon waves upon unending waves in what we would call a "target rich environment". Assassination? No, more like machine-gunning, or artillery, or, I dunno, nukes.
Assassination would take too long.
I've been playing with DSPAM, which seems very good. They claim 99.991% accuracy. Apparently this is 10 times more accurate than a human. But I've heard that most anti-spam solutions are very good.
"Here I am, brain the size of a planet and they ask me to filter your spam."
Thank you for your coordination,
the Buzzword Police.
I use SA and like it. I only get about 75% reduction because SA-Learn doesn't seem to work very well. I've been told it takes a lot of mail to get it to learn. Though I would think, "If you see this again kill it" wouldn't take but once. hehe
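On why one sighting isn't enough: a Bayes-style filter scores individual tokens, and a token seen only once carries little weight. The sketch below uses a Robinson-style smoothed token probability. It is an illustration only, not the code sa-learn runs, and the parameter names (strength, prior) are my own.

```python
def token_spam_probability(spam_hits, ham_hits, total_spam, total_ham,
                           strength=1.0, prior=0.5):
    """Smoothed probability that a message containing this token is spam.

    spam_hits/ham_hits: how many learned spam/ham messages had the token.
    total_spam/total_ham: how many spam/ham messages were learned overall.
    strength/prior: how hard to pull rarely seen tokens toward 'unknown'.
    """
    n = spam_hits + ham_hits
    if n == 0 or total_spam == 0 or total_ham == 0:
        return prior  # nothing learned about this token yet
    spam_rate = spam_hits / total_spam
    ham_rate = ham_hits / total_ham
    raw = spam_rate / (spam_rate + ham_rate)
    # Few observations: stay close to the prior. Many: trust the raw ratio.
    return (strength * prior + n * raw) / (strength + n)
```

With 100 spam and 100 ham learned, a token seen in a single spam scores 0.75, while the same token seen in 20 spams scores about 0.976. The filter gets more certain as it sees more mail.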
And will SpamAssassin's effectiveness erode as spammers adopt smarter methods in response? Escalation is not a long-term solution to any arms race or conflict. We can continue to fight spam, but the only way we will decisively defeat it is by acknowledging it as a social problem and legislating against it, with a common-sense certainty and determination no one in Western governments seems to be providing.
I've been using RC1 for over a month now, and I'll tell you confidently that
-- Performance is MUCH better than it used to be. It scans messages much faster than I've ever seen SA 2.x do, and doesn't hog my server's resources anymore.
-- THIS THING ROCKS. For almost two weeks after I installed it I kept instinctively sending myself test emails to make sure I hadn't broken my mail system, because my volume of incoming mail had reduced so drastically. I was used to getting at least a new spam every 2 minutes. After installing SA 3.0 I got one false negative in a 72 hour period. It is *that* good. To date I still have not recorded a single false positive. I really had to convince myself that this thing was real.
This spamfilter rocks. I'd award it product of the year if I could.
Am I a hipster-doofus?
Didja notice the Apache feathers on the arrow in the new logo? Nice touch!
What I would like to know: how does SA scale? About a year ago I talked to my ISP about it and they said they could not use it as it did not scale well and could not handle big loads.
It would be nice if it could be implemented now as I personally receive about 1000 spam messages a week.
- In Memoriam: Jeroen de Bruin (1972-2004), bye bro
From the SURBL site: "parse URIs in message bodies, extract their domains, and check those against a SURBL...."
I would rather extract the domain, look up the IP, and check the IP.
That way the server will have to move to a new IP - not just get a new bogus domain name.
Yes, I know that servers may host many domains:
This will only increase pressure on the spamheaven server admins to get rid of the people who use spam to spamvertize their sites.
-- From Denmark
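A rough sketch of that idea in Python: resolve the domain to an IPv4 address, then build the reversed-octet name that IP-based DNS blocklists conventionally use. The zone "dnsbl.example.org" is a placeholder, not a real list, and the function names are hypothetical.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the lookup name for an IPv4 address: octets reversed, then zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    octets = str(addr).split(".")
    return ".".join(reversed(octets)) + "." + zone

def query_name_for_domain(domain, zone, resolve=socket.gethostbyname):
    """Resolve the domain to its IP, then build the IP-based lookup name.

    resolve is injectable so the function can be tried without a network.
    """
    return dnsbl_query_name(resolve(domain), zone)
```

Looking up the resulting name is the same step as for a domain-based list: an answer means listed, NXDOMAIN means not listed.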
Major feature list:
- SpamAssassin is now part of the Apache Software Foundation and has an
improved software license, the 2.0 version of the Apache License.
- SpamAssassin now includes support for SPF (the Sender Policy
Framework, http://spf.pobox.com/).
- Web site links contained in the message are checked against SURBL and
SBL. SURBL and SBL track sites that advertise with spam, known spam
sources, and spam services.
- The new 3.0 architecture allows third-parties to easily add plugin
modules.
- There is now SQL database support for both the Bayes and
auto-whitelist modules, allowing more large sites to easily deploy
SpamAssassin.
- A more accurate simulation of email client handling of MIME and HTML
improves our accuracy. In addition, there is better detection and
handling of spammer techniques that try to trick anti-spam software.
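To give a feel for the SPF item above, here is a heavily simplified sketch in Python. It only understands "ip4:" and "all" terms in a record that has already been fetched; real SPF evaluation also involves DNS lookups and mechanisms such as "a", "mx" and "include", which this skips. It is not how SpamAssassin implements its SPF support, and the function name is made up.

```python
import ipaddress

# Qualifier prefix -> SPF result name ("+" is the default qualifier).
RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def spf_ip4_result(record, sender_ip):
    """Evaluate only the ip4/all terms of an SPF record for a sending IP."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        name = term.lower()
        if name.startswith("ip4:"):
            if ip in ipaddress.ip_network(term[4:], strict=False):
                return RESULTS[qualifier]
        elif name == "all":
            return RESULTS[qualifier]
        # Any other mechanism is ignored in this sketch.
    return "neutral"
```

A domain publishing "v=spf1 ip4:192.0.2.0/24 -all" is saying that mail claiming to be from it should only arrive from that network.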
Important installation notes:
- The SpamAssassin 2.6x release series was the last set of releases to
officially support perl versions earlier than perl 5.6.1. If you are
using an earlier version of perl, you will need to upgrade before you
can use the 3.0.0 version of SpamAssassin.
- SpamAssassin 3.0.0 has a significantly different API (Application
Program Interface) from the 2.x series of code. This means that if
you use SpamAssassin through a third-party utility (milter, etc,) you
need to make sure you have an updated version which supports 3.0.0.
- The --auto-whitelist and -a options for "spamd" and "spamassassin" to
turn on the auto-whitelist have been removed and replaced by the
"use_auto_whitelist" configuration option which is also now turned on
by default.
- The "rewrite_subject" and "subject_tag" configuration options were
deprecated and are now removed. Instead, use "rewrite_header Subject
[your desired setting]", e.g.
rewrite_subject 1
subject_tag ****SPAM(_SCORE_)****
becomes
rewrite_header Subject ****SPAM(_SCORE_)****
- The Bayesian storage modules have been completely re-written and now
include Berkeley DB (DBM) storage as well as SQL based storage (see
sql/README.bayes for more information). In addition, a new format has
been introduced for the bayes database that stores tokens in fixed
length hashes. All DBM databases should be automatically converted to
this new format the first time they are opened for write. You can
manually perform the upgrade by running "sa-learn --sync" from the
command line.
The "sa-learn --rebuild" command has been deprecated; please use
"sa-learn --sync" instead. The --rebuild option will remain
temporarily for backwards compatibility.
- "spamd" now has a default max-children setting of 5; no more than 5
child scanner processes will be run in parallel. Previously, there
was no default limit unless you specified the "-m" switch when
starting spamd.
- If you are using a UNIX machine with all database files on local
disks, and no sharing of those databases across NFS filesystems, you
can use a more efficient, but non-NFS-safe, locking mechanism. Do
this by adding the line "lock_method flock" to the
/etc/mail/spamassassin/local.cf file. This is strongly recommended if
you're not using NFS, as it is much faster than the NFS-safe locker.
- Please note that the use of the following command line parameters for
spamassassin and spamd have been deprecated and are now removed. If
you currently use these flags, please remove them:
in the 2.6x series: --add-from, --pipe, -F, -P, --stop-at-threshold, -S
in the 3.0.x series: --auto-whitelist, -a
- The following flags are de
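If you have a pile of old configs to convert, the rewrite_subject/subject_tag change described in the notes can be scripted. A minimal sketch in Python, assuming the config is read as a list of lines and that both old options appear at most once. The function name is hypothetical, and it handles only this one change.

```python
def migrate_subject_options(lines):
    """Replace 2.x rewrite_subject/subject_tag lines with rewrite_header.

    Returns a new list of config lines. The old options are dropped, and a
    'rewrite_header Subject <tag>' line is appended only when rewriting was
    switched on and a tag was given.
    """
    enabled = False
    tag = None
    kept = []
    for line in lines:
        parts = line.strip().split(None, 1)
        if parts and parts[0] == "rewrite_subject":
            enabled = len(parts) > 1 and parts[1].strip() == "1"
        elif parts and parts[0] == "subject_tag":
            tag = parts[1].strip() if len(parts) > 1 else None
        else:
            kept.append(line)  # comments and unrelated options pass through
    if enabled and tag:
        kept.append(f"rewrite_header Subject {tag}")
    return kept
```

Check the result by hand before installing it. If rewrite_subject was on without a subject_tag line, this sketch emits nothing and you need to pick a tag yourself.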
Artificial intelligence was born... Filtering spam.
In Greg Egan's _Permutation City_, spam filters and spam become ever more intelligent. Your spam filter runs the interactive video mail in a sandbox trying to detect whether it's spam, the spam tries to detect that it is in a sandbox or that it is talking to an AI construct, so that it can hide its commercial intent. Your filter tries to mimic you (and you review its reactions now and then, try to get its facial expressions ever more like yours, etc), the spammers try to get more information about you so they can try to fool your filter by making the spam look like one of your friends, etc.
This is an obvious arms race and in that book, AI and uploaded individuals etc exist - but the trick is to make your AI spam filters as good as possible without making them actually self-conscious, since using self-conscious AI software for spam filtering would be torture.
I rather liked that idea.
I believe posters are recognized by their sig. So I made one.
You can browse the version 3.0.0 Subversion repository. I'd suggest looking at the files UPGRADE and Changes.
I'm building the latest on all of my clients' mail exchangers and our primary boxen. ;)
:) Keep up the good work.
Here's the command to install/upgrade 3.0 via CPAN:
# perl -MCPAN -e shell;
cpan > install Mail::SpamAssassin
(many lines, type in the administrator's e-mail address, say no to network tests)
exit
#
Very difficult stuff.
Oh! Some link whoring as well:
SpamAssassin Milter for Sendmail - Filters everyone without procmail
SpamAssassin Milter Quarantine - Quarantines spam messages and sends digest summaries one or more times daily rather than simply delivering to the end user.
Karma: Chameleon (mostly due to the fact that you come and go).
[...] and doesn't hog my server's resources anymore.
Got any numbers on memory use? I would love to run SA on my home server, but it has "only" got 80MB of RAM. I tried running 2.x, but it seriously brought the system to its knees (swapping)
I must say, Python might be a nice language and all, but as it's making inroads everywhere it's also wreaking havoc on one's ability to convert older hardware into a competent server. YMMV (mailman + bittorrent + (apache + exim + samba) and you're pretty much down to the last few megabytes)
Belief is the currency of delusion.
Well, since it's capable of removing a certain caste of emails entirely how about SpamGenocide or SpamacialCleansing?
Perhaps we should identify it with (in)famous person(s) to drive up hits like SpamHitler, SpamNazi, or SpamlobodanMilosevic?
Maybe something that has an associated coolness factor, instead of being (almost) universally hated, like Dr. Spamibal Lecter?
Well, there's still the problem of overwhelming evil there. It's not really evil, just heartless and calculating. Hmm, heartless, calculating, killer... I got it! How about SpamAssassin? Oh, wait...
When will people learn that
what we need is not spam filters but spam stallers.
With spam filters you're just participating in an arms race.
The spammers will send more and more spam
and your spam filters will use more and more
of your processor time to filter the spam.
It is an uphill battle against the spammer.
With spam stallers like sa-exim and tarproxy
you are stalling the spammer's SMTP connection,
and the effect is that the spammer can't send
as much spam or that they drop your email from their email database.
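For those who haven't seen stalling in action, the basic trick is to answer a suspect SMTP client very slowly. A minimal sketch in Python, assuming you already have a function that writes to the client socket. This is only an illustration of the idea, not how sa-exim or tarproxy are implemented, and the names are hypothetical.

```python
import time

def multiline_reply(code, texts):
    """Format an SMTP multi-line reply.

    Every line but the last is 'code-text'; the last is 'code text'.
    """
    if not texts:
        raise ValueError("need at least one line of text")
    lines = [f"{code}-{t}" for t in texts[:-1]]
    lines.append(f"{code} {texts[-1]}")
    return lines

def drip(lines, send, delay_seconds=10.0, sleep=time.sleep):
    """Send reply lines one at a time, pausing before each one.

    send is whatever writes to the client connection; sleep is injectable
    so the pacing can be tried out without actually waiting.
    """
    for line in lines:
        sleep(delay_seconds)
        send(line + "\r\n")
```

Each stalled connection ties up one of the sender's slots for as long as it stays open, at very little CPU cost on your side.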
The new logo is nice, but I was kinda partial to the nunchaku wielding ninjas knocking the crap out of spam.
Disclaimer: I work for a company, but I don't speak for them.
A lot of closed source software has open source counterparts (i.e. MS and Open Office), but it's always interesting to see closed source commercial software based on an open source project.
McAfee has a product for Exchange servers that is based on Spam Assassin called Spam Killer. I found out about it from the Spam Assassin site when I was looking for a Windows version. Spam Killer isn't free, yet it's not as expensive as some of the other solutions out there.
The major problem I've been having with it is that it creates zero byte emails which cannot be downloaded via POP3. When a user gets 30 messages, and message 10 is a zero byte email, the client will constantly download the first 10 over and over, creating duplicates, until the user logs into Outlook Web Access (webmail) and deletes the zero byte message. This doesn't happen to the MAPI users, but we have quite a few POP3 users.
The support people are useless. I'm about to try out Microsoft Intelligent Message Filter for Exchange, and hopefully with some good RBLs it should be ok.
Im dreaming ofa big bndwdth, That can resist the
-----
Free P2P Backup, Windows & Linux
Someone in the place I used to work at had an e-mail of someone else which had a signature which scrolled in from the right of the page and flashed and stuff and from there in around 2 months more than 90% of everyone else in the office had the same thing. I believe this relied on Javascript and Outlook was more than happy to comply.
So I've heard good things about SpamAssassin and headed over the webpage to figure out what I needed to do to install, and I found this.
I'm probably going to be flamed for this, but that install process is ridiculous. I'm not even close to being a newbie, but there's no way I'd go through that much hassle to install a spamblocker compared to something like SpamBayes that does a standard Windows install and hooks right into Outlook. Does anyone think that these things are reasonable?
1. I'm supposed to extract it to the root of my drive. Sorry, my root is sacrosanct. If the /. crowd is going to complain about RealPlayer dumping shortcuts on my desktop, quickstart bar, and main start menu, how is SpamAssassin making directories in my root any better? At least I can delete the stuff RealPlayer litters around.
2. I've got to install Perl modules? And it doesn't work with certain versions of Perl? The install should include whatever it needs to run. Don't make me track down some particular version of outside software.
3. I've got to generate a batch file and run it to generate the documentation? Why not just include the generated documentation?
4. Step 10 of the install FAQ mentions a D drive. I don't have a D drive. Does SpamAssassin really require TWO drives to run/test properly?
5. The whole install process includes 13 steps, some of which are fairly complicated.
This is one of the reasons why the whole open-source initiative has such a bad, pointy-headed reputation. Where is the focus on usability and user-friendliness? I often get the impression that it's "not cool" to actually put time and energy into making your software anything other than esoteric in its usage. I really would like to try SpamAssassin, but dealing with the minor annoyances of SpamBayes for the next six months is clearly less work than installing SpamAssassin today. Why doesn't that bother anyone?
I'm probably going to get either flamed or ignored for this post, but I would appreciate a reasonable response if there is one. We'll see I guess.
I'm using it on a dual 1.6GHz Xeon box with Gentoo here in the office - the box processes over 70,000 emails per day (spamassassin, amavisd-new and clamav/f-prot) and the load average barely goes above 0.02.
:)
Your ISP just didn't want to take any time to actually learn about it.