Slashdot Mirror


SpamAssassin 3.0 Released

davemabe writes "At long last, SpamAssassin 3.0.0 has been released. I've been using the release candidates for a month or so, and the results have been far improved over previous versions. Its use of SURBL along with Bayes auto learning makes it seem like this solution is the one to beat. It looks like they've introduced a new logo as well. Snazzy!"

28 of 335 comments

  1. SURBL by alatesystems · · Score: 5, Informative

    For those not in the know, SURBL is really cool. It lets you scan the message (well, SA does that) and look for the URLs it links to. It compares these against a realtime BL built from the spam that other people like you are getting, and if a URL is a known spam TARGET, it blocks the message based on that.

    It makes it really hard for them unless they just register countless domains.

    Excellent technology, and I will be upgrading to the newest stable.

    Chris
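As a rough illustration of the lookup described above, here is a minimal sketch of how URLs in a message body could be turned into blocklist query names. It is not SpamAssassin's actual code: the function names are hypothetical, the default zone is an assumption about which SURBL list to query, and reducing a host to its last two labels is a deliberately naive stand-in for proper registered-domain handling.

```python
import re
from urllib.parse import urlsplit

# Very loose URL matcher; real mail parsing is far more involved.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_domains(body):
    """Return the set of (naively reduced) domains linked from a message body."""
    domains = set()
    for url in URL_RE.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, skip it
        if not host:
            continue
        labels = host.lower().rstrip(".").split(".")
        # Naive assumption: the registered domain is the last two labels.
        # This is wrong for suffixes such as .co.uk and for bare IP addresses.
        domains.add(".".join(labels[-2:]))
    return domains


def surbl_query_names(body, zone="multi.surbl.org"):
    """Build the DNS names one would look up; a listed domain resolves, an unlisted one does not."""
    return sorted(f"{domain}.{zone}" for domain in extract_domains(body))
```

Each returned name would then be resolved through DNS, and a successful answer treated as one more piece of evidence that the message is spam.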

    1. Re:SURBL by hey · · Score: 4, Interesting

      I suppose this will drive spam advertisers to obfuscate their URLs in the spam mails. E.g. use JavaScript to build the URL so the real URL can't be detected -- like we do with our mail addresses on webpages so they won't be harvested by spammers!

  2. Comment removed by account_deleted · · Score: 4, Interesting

    Comment removed based on user account deletion

  3. Plugin Architecture by CleverFox · · Score: 5, Interesting

    The real news here is not Bayes filtering or SURBL, but the totally rebuilt plug-in architecture of SA 3.0. Plug-ins for the 2.x version were quite a bit harder to write.

    Version 3.0 will result in a proliferation of good third party plug-ins that are going to put SA into more direct competition with some of the commercial vendors out there.

  4. actually i've always felt their name's not right by Build6 · · Score: 4, Funny

    You know, *assassins* are the type to take out single, lone, "high value" targets, right?

    Sneaking into fortresses/castles, creeping up and then offing the bad guy, or else maybe using some nice long-distance sniper rifle to take out the bad guy, or maybe choice application of poisons to the right bottle of wine, etc.

    This is not appropriate for *spam*, where we're talking about waves upon waves upon unending waves in what we would call a "target rich environment". Assassination? No, more like machine-gunning, or artillery, or, I dunno, nukes.

    Assassination would take too long.

  5. anti-spam by Outsider_99 · · Score: 4, Informative

    I've been playing with DSPAM, which seems very good. They claim 99.991% accuracy. Apparently this is 10 times more accurate than a human. But I've heard that most anti-spam solutions are very good.

    1. Re:anti-spam by Skuto · · Score: 5, Interesting

      There was a good scientific test linked on slashdot a while ago, comparing spamfilters and including DSPAM and SpamAssassin.

      Contrary to the DSPAM author's claims, both it and CRM-114 (another package which likes to self-hype) performed quite a bit worse than SpamAssassin.

      Then again, I've heard people being happy with DSPAM that were not happy with SA.

      Guess it depends on the mailfeed you get.

  6. Re:Artificial intelligence was born... by Duke+Thomas · · Score: 5, Funny

    "Here I am, brain the size of a planet and they ask me to filter your spam."

  7. A spam arms race? by zaxios · · Score: 4, Insightful

    And will SpamAssassin's effectiveness erode as spammers adopt smarter methods in response? Escalation is not a long-term solution to any arms race or conflict. We can continue to fight spam, but the only way we will decisively defeat it is by acknowledging it as a social problem and legislating against it, with a common-sense certainty and determination no one in Western governments seems to be providing.

  8. Re:Improved Performance? by xcomputer_man · · Score: 5, Interesting

    I've been using RC1 for over a month now, and I'll tell you confidently that

    -- Performance is MUCH better than it used to be. It scans messages much faster than I've ever seen SA 2.x do, and doesn't hog my server's resources anymore.

    -- THIS THING ROCKS. For almost two weeks after I installed it I kept instinctively sending myself test emails to make sure I hadn't broken my mail system, because my volume of incoming mail had reduced so drastically. I was used to getting at least a new spam every 2 minutes. After installing SA 3.0 I got one false negative in a 72 hour period. It is *that* good. To date I still have not recorded a single false positive. I really had to convince myself that this thing was real.

    This spamfilter rocks. I'd award it product of the year if I could.

  9. New logo ... by YetAnotherName · · Score: 4, Interesting

    Didja notice the Apache feathers on the arrow in the new logo? Nice touch!

  10. Performance by smooc · · Score: 4, Interesting

    What I would like to know is: how does SA scale? About a year ago I talked to my ISP about it and they said they could not use it as it did not scale well and could not handle big loads.

    It would be nice if it could be implemented now as I personally receive about 1000 spam messages a week.

    --
    - In Memoriam: Jeroen de Bruin (1972-2004), bye bro
  11. Does it use IP's or URI's ? by NKJensen · · Score: 4, Insightful

    From the SURBL site: "parse URIs in message bodies, extract their domains, and check those against a SURBL...."

    I would rather extract the domain, look up the IP, and check the IP.

    That way the server will have to move to a new IP - not just get a new bogus domain name.

    Yes, I know that servers may host many domains.

    This will only increase pressure on the spamheaven server admins to get rid of the people who use spam to spamvertize their sites.

    --
    -- From Denmark
    1. Re:Does it use IP's or URI's ? by NKJensen · · Score: 4, Informative

      Sorry, I've found the question and some pros and cons here:

      http://www.surbl.org/faq.html#numbered

      --
      -- From Denmark
    2. Re:Does it use IP's or URI's ? by platipusrc · · Score: 5, Interesting

      One of the problems with using IPs is the massive amount of Virtual Hosting being used. Say I'm a 1&1 customer, and there are 400 other domains going to the same IP as one of my domains, and I send you an email with a link to something on my site, but one spammer has managed to get an account with 1&1 for now. If they're on the same box as me, you just blacklisted 399 other domains that shouldn't have been blacklisted.
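To make the collateral-damage argument concrete, here is a small sketch that works out which innocent domains would be caught if a blocklist keyed on the hosting IP rather than the domain name. The function name, the domains, and the addresses (taken from the documentation range 192.0.2.0/24) are all made up for illustration.

```python
from collections import defaultdict


def domains_hit_by_ip_listing(hosting, listed_domains):
    """Return the innocent domains that share an IP with any listed domain.

    hosting maps domain -> IP address; listed_domains are the known spam domains.
    """
    by_ip = defaultdict(set)
    for domain, ip in hosting.items():
        by_ip[ip].add(domain)

    # Every IP that hosts at least one listed domain would end up on the list.
    listed_ips = {hosting[d] for d in listed_domains if d in hosting}

    hit = set()
    for ip in listed_ips:
        hit |= by_ip[ip]
    return hit - set(listed_domains)


# Hypothetical shared host: one spammer, two bystanders, one unrelated box.
hosting = {
    "spammer.example": "192.0.2.10",
    "me.example": "192.0.2.10",
    "club.example": "192.0.2.10",
    "elsewhere.example": "192.0.2.20",
}
collateral = domains_hit_by_ip_listing(hosting, {"spammer.example"})
```

With a domain-based list only `spammer.example` is affected; with an IP-based list the two bystanders on the same address are flagged as well.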

      --
      And the muscular cyborg German dudes dance with sexy French Canadians
    3. Re:Does it use IP's or URI's ? by pqdave · · Score: 4, Informative

      SpamAssassin is for email, and won't affect anyone trying to browse to your site. At worst, a properly configured SpamAssassin would see a mention of your URL in an email, resolve it to the same IP as a spammer, and give it a few more points towards the spam threshold. SpamAssassin (at least as used by my mail admin) scores messages based on various factors rather than giving pass/fail tests, so a suspicious URL in an otherwise non-spammy message wouldn't necessarily send it over the spam threshold.

    4. Re:Does it use IP's or URI's ? by ChaosDiscord · · Score: 4, Insightful

      "If they're on the same box as me, you just blacklisted 399 other domains that shouldn't have been blacklisted."

      You're not blacklisting; you're marking as "more likely spam". In practice the damage will be minimal. First, legit email from the other 399 domains will in general be non-spam-like. The positive hit on the IP address won't be enough to push them over the edge. The penalties for being found in the SURBL at the moment are all relatively small, all less than 1 (5 points are needed in the default configuration to mark a message as spam). The only exception is data from the SpamCop database, which is fairly small and more carefully vetted. If they broaden from hostnames to IPs, you might have to tweak the scores down, but that's it. Second, what's the realistic chance of your getting email containing a URL linking to that IP? There are millions of web sites. The Big Important Web Sites aren't on the sort of massive shared server you describe. The chances that you'll get an email mentioning one of those smaller sites are pretty small. There is a risk, but it's small enough that I won't lose any sleep over it.
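The additive scoring described above can be sketched in a few lines. This is only an illustration of the idea: the rule names and their scores are hypothetical, not SpamAssassin's real rule set, and the only figure taken from the discussion is the default threshold of 5 points.

```python
SPAM_THRESHOLD = 5.0  # default threshold mentioned in the comment above

# Hypothetical rules and scores, chosen only for illustration.
EXAMPLE_SCORES = {
    "URI_BLOCKLIST_HIT": 0.8,   # a linked URL was found on a blocklist
    "HTML_ONLY": 1.2,           # message has no plain-text part
    "FORGED_SENDER": 3.5,       # sender address looks forged
}


def total_score(hits, scores=EXAMPLE_SCORES):
    """Sum the scores of every rule that matched the message."""
    return sum(scores[name] for name in hits)


def is_spam(hits, scores=EXAMPLE_SCORES, threshold=SPAM_THRESHOLD):
    """A message is marked as spam only when the combined score reaches the threshold."""
    return total_score(hits, scores) >= threshold
```

Under these made-up numbers, a blocklist hit on its own leaves a message far below the threshold, which is the point being made: one suspicious URL in an otherwise clean message is not enough to mark it as spam.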

  12. 3.0 New Features by CleverFox · · Score: 5, Informative

    Major feature list:

    - SpamAssassin is now part of the Apache Software Foundation and has an
    improved software license, the 2.0 version of the Apache License.

    - SpamAssassin now includes support for SPF (the Sender Policy
    Framework, http://spf.pobox.com/).

    - Web site links contained in the message are checked against SURBL and
    SBL. SURBL and SBL track sites that advertise with spam, known spam
    sources, and spam services.

    - The new 3.0 architecture allows third-parties to easily add plugin
    modules.

    - There is now SQL database support for both the Bayes and
    auto-whitelist modules, allowing more large sites to easily deploy
    SpamAssassin.

    - A more accurate simulation of email client handling of MIME and HTML
    improves our accuracy. In addition, there is better detection and
    handling of spammer techniques that try to trick anti-spam software.

    Important installation notes:

    - The SpamAssassin 2.6x release series was the last set of releases to
    officially support perl versions earlier than perl 5.6.1. If you are
    using an earlier version of perl, you will need to upgrade before you
    can use the 3.0.0 version of SpamAssassin.

    - SpamAssassin 3.0.0 has a significantly different API (Application
    Program Interface) from the 2.x series of code. This means that if
    you use SpamAssassin through a third-party utility (milter, etc,) you
    need to make sure you have an updated version which supports 3.0.0.

    - The --auto-whitelist and -a options for "spamd" and "spamassassin" to
    turn on the auto-whitelist have been removed and replaced by the
    "use_auto_whitelist" configuration option which is also now turned on
    by default.

    - The "rewrite_subject" and "subject_tag" configuration options were
    deprecated and are now removed. Instead, use "rewrite_header Subject
    [your desired setting]". e.g.

    rewrite_subject 1
    subject_tag ****SPAM(_SCORE_)****

    becomes

    rewrite_header Subject ****SPAM(_SCORE_)****

    - The Bayesian storage modules have been completely re-written and now
    include Berkeley DB (DBM) storage as well as SQL based storage (see
    sql/README.bayes for more information). In addition, a new format has
    been introduced for the bayes database that stores tokens in fixed
    length hashes. All DBM databases should be automatically converted to
    this new format the first time they are opened for write. You can
    manually perform the upgrade by running "sa-learn --sync" from the
    command line.

    The "sa-learn --rebuild" command has been deprecated; please use
    "sa-learn --sync" instead. The --rebuild option will remain
    temporarily for backwards compatibility.

    - "spamd" now has a default max-children setting of 5; no more than 5
    child scanner processes will be run in parallel. Previously, there
    was no default limit unless you specified the "-m" switch when
    starting spamd.

    - If you are using a UNIX machine with all database files on local
    disks, and no sharing of those databases across NFS filesystems, you
    can use a more efficient, but non-NFS-safe, locking mechanism. Do
    this by adding the line "lock_method flock" to the
    /etc/mail/spamassassin/local.cf file. This is strongly recommended if
    you're not using NFS, as it is much faster than the NFS-safe locker.

    - Please note that the use of the following command line parameters for
    spamassassin and spamd have been deprecated and are now removed. If
    you currently use these flags, please remove them:

    in the 2.6x series: --add-from, --pipe, -F, -P, --stop-at-threshold, -S
    in the 3.0.x series: --auto-whitelist, -a

    - The following flags are de
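For anyone with many old config files, the "rewrite_subject" / "subject_tag" change in the notes above could be applied mechanically. The helper below is a hypothetical sketch, not a tool shipped with SpamAssassin: it assumes one option per line, ignores comments and per-user overrides, and only handles the two options named in the notes.

```python
def migrate_subject_options(lines):
    """Rewrite 2.x rewrite_subject/subject_tag lines as one 3.0 rewrite_header line."""
    enabled = False
    tag = None
    kept = []
    for line in lines:
        parts = line.split(None, 1)
        if parts and parts[0] == "rewrite_subject":
            # "rewrite_subject 1" switched subject rewriting on in 2.x.
            enabled = len(parts) > 1 and parts[1].strip() == "1"
        elif parts and parts[0] == "subject_tag":
            tag = parts[1].strip() if len(parts) > 1 else ""
        else:
            kept.append(line)  # every other option is passed through untouched
    if enabled and tag:
        kept.append(f"rewrite_header Subject {tag}")
    return kept


old_config = [
    "lock_method flock",
    "rewrite_subject 1",
    "subject_tag ****SPAM(_SCORE_)****",
]
new_config = migrate_subject_options(old_config)
```

If rewriting was switched off in the old file ("rewrite_subject 0"), the sketch simply drops both old lines and adds nothing.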

  13. Re:Artificial intelligence was born... by Scarblac · · Score: 4, Interesting

    Artificial intelligence was born... Filtering spam.

    In Greg Egan's _Permutation City_, spam filters and spam become ever more intelligent. Your spam filter runs the interactive video mail in a sandbox trying to detect whether it's spam, the spam tries to detect that it is in a sandbox or that it is talking to an AI construct, so that it can hide its commercial intent. Your filter tries to mimic you (and you review its reactions now and then, try to get its facial expressions ever more like yours, etc), the spammers try to get more information about you so they can try to fool your filter by making the spam look like one of your friends, etc.

    This is an obvious arms race and in that book, AI and uploaded individuals etc exist - but the trick is to make your AI spam filters as good as possible without making them actually self-conscious, since using self-conscious AI software for spam filtering would be torture.

    I rather liked that idea.

    --
    I believe posters are recognized by their sig. So I made one.
  14. Re:Release notes? by AnotherScratchMonkey · · Score: 5, Informative

    You can browse the version 3.0.0 Subversion repository. I'd suggest looking at the files UPGRADE and Changes.

  15. For those who may have forgotten by numbski · · Score: 5, Informative

    I'm building the latest on all of my clients' mail exchangers and our primary boxen. ;)

    Here's the command to install/upgrade 3.0 via CPAN:

    # perl -MCPAN -e shell;
    cpan > install Mail::SpamAssassin

    (many lines, type in the administrator's e-mail address, say no to network tests)

    exit

    #

    Very difficult stuff. :) Keep up the good work.

    Oh! Some link whoring as well:

    SpamAssassin Milter for Sendmail - Filters everyone without procmail

    SpamAssassin Milter Quarantine - Quarantines spam messages and sends summaries in digest form one or more times daily rather than simply delivering to the end user.

    --

    Karma: Chameleon (mostly due to the fact that you come and go).

  16. Better names? by Da+Twink+Daddy · · Score: 5, Funny

    Well, since it's capable of removing a certain caste of emails entirely how about SpamGenocide or SpamacialCleansing?

    Perhaps we should identify it with (in)famous person(s) to drive up hits like SpamHitler, SpamNazi, or SpamlobodanMilosevic?

    Maybe something that has an associated coolness factor, instead of being (almost) universally hated, like Dr. Spamibal Lecter?

    Well, there's still the problem of overwhelming evil there. It's not really evil, just heartless and calculating. Hmm, heartless, calculating, killer... I got it! How about SpamAssassin? Oh, wait...

  17. filters vs. stallers by Anonymous Coward · · Score: 4, Insightful

    When will people learn that
    what we need is not spam filters but spam stallers?

    With spam filters you're just participating in an arms race.

    The spammers will send more and more spam
    and your spam filters will use more and more
    of your processor time to filter the spam.
    It is an uphill battle against the spammer.

    With spam stallers like sa-exim and tarproxy
    you are stalling the spammer's SMTP connection
    and the effect is that the spammer can't send
    as much spam or that they drop your email address from their email database.

  18. still waiting for spammerassassin by Daniel+Ellard · · Score: 4, Funny
    This looks great, and I look forward to using it, but it doesn't address the root of the problem. Anyone working on spammerassassin yet?

    --
    Disclaimer: I work for a company, but I don't speak for them.
    1. Re:still waiting for spammerassassin by geeklawyer · · Score: 4, Funny

      Yes,
      The spammerassassin team is active, but on my legal advice they are not documenting their work: it could, technically, be argued to be murder.

      --
      -he who laughs last, is a bit slow.
  19. Re:Improved Performance? by Tim+Macinta · · Score: 4, Informative

    "I must say, Python might be a nice language and all, but as it's making inroads everywhere it's also wreaking havoc on one's ability to convert older hardware into a competent server."

    SpamAssassin is actually written in Perl, not Python. I'm not saying your point about certain languages making it difficult to maintain older machines isn't valid, I'm just clarifying what SpamAssassin uses.

  20. Installing on Windows....you're kidding, right? by Chris+Carollo · · Score: 4, Insightful

    So I've heard good things about SpamAssassin and headed over to the webpage to figure out what I needed to do to install, and I found this.

    I'm probably going to be flamed for this, but that install process is ridiculous. I'm not even close to being a newbie, but there's no way I'd go through that much hassle to install a spamblocker compared to something like SpamBayes that does a standard Windows install and hooks right into Outlook. Does anyone think that these things are reasonable?

    1. I'm supposed to extract it to the root of my drive. Sorry, my root is sacrosanct. If the /. crowd is going to complain about RealPlayer dumping shortcuts in my desktop, quickstart bar, and main start menu, how is SpamAssassin making directories in my root any better? At least I can delete the stuff RealPlayer litters around.

    2. I've got to install Perl modules? And it doesn't work with certain versions of Perl? The install should include whatever it needs to run. Don't make me track down some particular version of outside software.

    3. I've got to generate a batch file and run it to generate the documentation? Why not just include the generated documentation?

    4. Step 10 of the install FAQ mentions a D drive. I don't have a D drive. Does SpamAssassin really require TWO drives to run/test properly?

    5. The whole install process includes 13 steps, some of which are fairly complicated.

    This is one of the reasons why the whole open-source initiative has such a bad, pointy-headed reputation. Where is the focus on usability and user-friendliness? I often get the impression that it's "not cool" to actually put time and energy into making your software anything other than esoteric in its usage. I really would like to try SpamAssassin, but dealing with the minor annoyances of SpamBayes for the next six months is clearly less work than installing SpamAssassin today. Why doesn't that bother anyone?

    I'm probably going to get either flamed or ignored for this post, but I would appreciate a reasonable response if there is one. We'll see I guess.

    1. Re:Installing on Windows....you're kidding, right? by sidney · · Score: 4, Informative

      I did a lot of the work of getting SpamAssassin to build and run on Windows. My goal was to have SpamAssassin build and install on Windows using the unmodified sources before version 3.0 was released. It does that now.

      SpamAssassin was written in Perl on Unix and GNU/Linux, for use in high volume server environments. The installation for an ISP or for anyone running a *nix mail server is a piece of cake. Their users get their mail filtered without having to install anything on their own PCs.

      The fact that it works on Windows at all is a bonus. It is an open source project. Would anyone like to volunteer to help with the next steps of getting the server daemon, spamd, working properly in Windows as a service; writing or adapting an existing mail proxy that would integrate SpamAssassin with mail clients such as Thunderbird, Mozilla Mail, Eudora, Outlook Express; packaging it up in a standard Windows install package?

      Addressing the 5 points in the parent post:

      1. Nothing has to go in the root directory. The instructions show an example of Perl having been installed in C:\perl and configuration going in directories underneath a C:\etc\mail directory.

      2. Yes, you have to install Perl. And a recent enough version that doesn't have certain bugs. And the required modules. SpamAssassin was written in Perl, which makes it useful on systems that have Perl, such as most Unix and GNU/Linux systems. If you install Perl and the modules on your Windows system then you have a system that meets the minimum requirements. If you have a Palm Pilot or an Xbox or Windows without Perl then your system does not meet the minimum requirements and you are not going to even try to run SpamAssassin on it. In that case install SpamBayes, or get an ISP who uses SpamAssassin for your mail, or any of many other alternatives.

      3. Making the doc files is easier in *nix. I'll file a request for enhancement suggesting that generating the HTML be made part of the Makefile and that it be made to work under Windows. The doc files are generated from the sources as part of the build, so they are not included in a source distribution, which is what we are talking about here. If someone built a binary distribution they would include the doc files.

      4. That -D command line option stands for Debug, not D drive.

      5. The whole install process consists of 13 steps, some of which are things like "download SpamAssassin", some of which are "if you are installing the old version 2.6x do this extra step", and some of which have to do with getting the required Perl and Perl modules. The actual installation pretty much happens in three lines of step 7. It really is quite easy for a build and installation starting from source files. A binary installation package would be a lot easier. Does anyone know how to package perl plus modules plus a built SpamAssassin into a Windows install package? If you do, feel free to volunteer.

      The focus on usability and user friendliness is where it should be in this particular project, on the sysadmin who installs SpamAssassin on a server and on their end users who don't have to install anything at all.

      If you have the ideas and the expertise to also make SpamAssassin more useful and friendly to the end user owner of a PC running Windows, please volunteer to help.