DSPAM v3.2 Beta-1 Released

Posted by michael on Thursday September 23, 2004 @12:42AM from the spam-all-you-want dept.

Nuclear Elephant writes "After three months of development, the first public beta of DSPAM v3.2 has been released for testing. New features include SQLite support, a Win32 build supplement, an extensions API, and some advanced new processing functionality such as Bill Yerazunis' (CRM114) Sparse Binary Polynomial Hashing and v1.2 of the author's Bayesian Noise Reduction Logic. Accuracy in 3.x has reportedly peaked as high as 99.991% (2 errors in 22,786 messages). Grab the new copy and participate in the request for feedback."
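
For readers who want to check the arithmetic behind the quoted accuracy figure, here is a minimal Python sketch. The function name `accuracy_percent` is made up for this illustration and is not part of DSPAM; the only inputs are the error and message counts quoted in the summary and further down the thread.

```python
from fractions import Fraction


def accuracy_percent(errors: int, total: int) -> float:
    """Return the percentage of messages classified correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (total - errors) / total


# Figure from the summary: 2 errors in 22,786 messages.
print(f"{accuracy_percent(2, 22786):.3f}%")  # 99.991%

# Figure quoted later in the thread: 1 error in 2000 messages.
print(f"{accuracy_percent(1, 2000):.2f}%")   # 99.95%

# The same error rate in lowest terms.
print(Fraction(2, 22786))                    # 1/11393
```

Note that a percentage like this says nothing about how the errors split between missed spam and legitimate mail wrongly flagged, or about how varied the test messages were.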

20 comments

WHAT?!?! by Stoopid-Guy0 · 2004-09-23 00:51 · Score: 1

How can they release such an unfinished product? I missed 2 whole emails!!!!
1. Re:WHAT?!?! by DrMorris · 2004-09-23 00:53 · Score: 1
  
  You mean you missed 22,786 mails...
2. Re:WHAT?!?! by Stoopid-Guy0 · 2004-09-23 01:45 · Score: 1
  
  I mean IT missed 2 emails ;) I don't need no preview!
3. Re:WHAT?!?! by Anonymous Coward · 2004-09-23 01:50 · Score: 0
  
  Two emails each day.
  
  Well, not quite - but one a month today.
  
  And the spammers will continue to fight back, by increasing the volume and variety of spam, at least as fast as blocking technology improves.
why user infomercial.. by gl4ss · 2004-09-23 00:52 · Score: 1

why use infomercial type of speak?

"DSPAM users frequently see between 99.95% (1 error in 2000) all the way up to 99.991% (2 errors in 22,786)."

that could mean just about anything, "frequently see" could mean that they will see success rates like that if they get the same mail 20 000 times...

or are they trying to 'sell' the little phb in all of us?

--
world was created 5 seconds before this post as it is.
1. Re:why user infomercial.. by DrMorris · 2004-09-23 00:55 · Score: 1, Funny
  
  "frequently see" could mean that they will see success rates like that if they are using SpamAssassin... :-)
SpamAssassin vs. DSPAM by Anonymous Coward · 2004-09-23 02:08 · Score: 0

Interesting how this release follows closely on the heels of a SpamAssassin release. One would almost think that was done on purpose, since DSPAM seems to be more about comparing sizes with SpamAssassin than actually being a user-friendly product. Maybe if both DSPAM and SpamAssassin spent less time holding up rulers and getting distracted otherwise, we'd have better progress in the spam fighting frontier. Just my $0.02.
1. Re:SpamAssassin vs. DSPAM by jwbozzy · 2004-09-23 02:32 · Score: 3, Informative
  
  I totally agree. I used DSPAM for a while, gave it a fair shot, participated on the mailing list. I even, at times, got encouraging results. Ultimately, DSPAM required way too much nursemaid work to make it work for my installation and I scrapped it and went back to SA. The general feel I got from the DSPAM crowd was a big dick waving contest with other products, particularly, but not limited to, SA. A typical mailing list message looked like:
  
  "SA is able to do X accurately. I cannot seem to achieve this with DSPAM. Am I doing something wrong, is there something I need to configure further?"
  "SA is inferior. You don't want X. Besides, DSPAM has Y, which approximates X. No, I can't tell you how to do it specifically, but know that you need only DSPAM."
  
  Personally, I found that DSPAM is blatantly unable to train itself properly. You might have to train it 4 or 5 times with the same message to get it to classify that message as spam. It doesn't recursively train like SA does. This leads to users getting the EXACT SAME spam multiple times, despite their best efforts to train the filter. In addition, DSPAM's group features are sparsely documented and somewhat magical in their behavior. And of course, without these features, DSPAM is useless to an installation of people who really do NOT want to have to train their spam filter for months to get it to work right.
  
  I'm all for competing software/products, but both projects are OSS, there is no money involved here, and I can't see how bashing the other product while concurrently not being able to do better, or even match it can be viewed as a step forward...
  
  But hey, that's just me, your mileage may vary.
  
  --
  perl -e 'printf("mmm %x\n", 3735928559)'
2. Re:SpamAssassin vs. DSPAM by Anonymous Coward · 2004-09-23 03:41 · Score: 0
  
  DSPAM trains recursively, it has since version 2. You must be doing something wrong.
DSPAM. . . neat at fist, not for long. by Christopher Cashell · 2004-09-23 02:47 · Score: 4, Informative

I used DSPAM for a while. I started using it with the Berkeley DB backend, and that worked reasonably well. . . it was fairly fast, but database corruption was almost impossible to avoid. I don't think I ever managed more than 3-4 weeks without my DB getting killed.

So, then I started using an SQL database. That worked great for a while, except it was slow. Now, admittedly, I'm running my mail server on an old machine (Dual Pentium Pro 200's, with 450MB RAM), but DSPAM was horrible. With more than half a dozen e-mails to process at a time, it would just choke. And the space issue. . . my spam-data database got over 300MB within a couple of weeks! And, yeah, I was processing a lot of mail, but come on. That's just not right.

Finally, I decided it just wasn't worth it. So, I tried an alternative that the DSPAM author has spoken fairly highly of, CRM114. That thing rocks! Within a few days, it was catching most of the spam, it runs much faster than DSPAM or SA, and it has fixed-sized spam token databases, so unless you explicitly increase the size, they won't grow past what you set them up for.

I can't see myself bothering with any other spam filter anytime soon.

--
Topher
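
To illustrate why a fixed-size token database cannot grow, here is a minimal Python sketch of a hashed bucket table. It is an assumption-laden toy, not CRM114's actual file format or hashing scheme: the class name `FixedSizeCounts`, the bucket count, and the use of CRC32 are all choices made for this example.

```python
import zlib


class FixedSizeCounts:
    """Toy token-count store whose size is fixed at creation time."""

    def __init__(self, buckets: int = 1024):
        # Storage is allocated once and never resized.
        self.counts = [0] * buckets

    def _index(self, token: str) -> int:
        # Hash the token into one of the existing buckets.
        return zlib.crc32(token.encode("utf-8")) % len(self.counts)

    def add(self, token: str) -> None:
        self.counts[self._index(token)] += 1

    def get(self, token: str) -> int:
        # Colliding tokens share a bucket, so this is an upper bound.
        return self.counts[self._index(token)]


store = FixedSizeCounts(buckets=64)
for i in range(10_000):
    store.add(f"token{i}")
print(len(store.counts))  # still 64, no matter how many tokens were added
```

The trade-off is that different tokens can land in the same bucket, so a smaller table saves space at the cost of less precise counts.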
1. Re:DSPAM. . . neat at fist, not for long. by Anonymous Coward · 2004-09-23 04:10 · Score: 1, Informative
  
  Sounds like you weren't purging, and this was probably why you had severe space and processing issues. You can switch to TOE-mode training and make your databases as small and fast as CRM114.
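
The effect of train-on-error on database size can be sketched with a toy filter in Python. This is only an illustration of the general idea (learn from a message only when the filter got it wrong); the class `ToyFilter`, its scoring rule, and the sample messages are invented here and are not DSPAM's algorithm or configuration.

```python
from collections import Counter


class ToyFilter:
    """Toy token-count filter; not DSPAM's actual algorithm."""

    def __init__(self, train_on_error_only: bool):
        self.train_on_error_only = train_on_error_only
        self.spam = Counter()
        self.ham = Counter()

    def classify(self, tokens) -> bool:
        spam_score = sum(self.spam[t] for t in tokens)
        ham_score = sum(self.ham[t] for t in tokens)
        return spam_score > ham_score  # ties count as ham

    def process(self, text: str, is_spam: bool) -> bool:
        tokens = text.lower().split()
        guess = self.classify(tokens)
        # Train on every message, or only on misclassified ones.
        if not self.train_on_error_only or guess != is_spam:
            (self.spam if is_spam else self.ham).update(tokens)
        return guess

    def db_size(self) -> int:
        return len(self.spam.keys() | self.ham.keys())


MESSAGES = [
    ("cheap pills online", True),
    ("meeting agenda attached", False),
    ("cheap pills discount", True),
    ("lunch meeting tomorrow", False),
    ("online pills sale", True),
]


def run(train_on_error_only: bool) -> ToyFilter:
    f = ToyFilter(train_on_error_only)
    for text, is_spam in MESSAGES:
        f.process(text, is_spam)
    return f


print("train on everything:", run(False).db_size(), "tokens")
print("train on error:", run(True).db_size(), "tokens")
```

On this tiny sample the train-on-everything run stores 10 distinct tokens while the train-on-error run stores 3, because only the first message is misclassified and learned from.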
2. Re:DSPAM. . . neat at fist, not for long. by Christopher Cashell · 2004-09-23 19:58 · Score: 1
  
  Note again what I said: "my spam-data database got over 300MB within a couple of weeks".
  
  The default purge settings won't even touch the database until tokens and signatures have been in there for at least 14 days. Even after tightening up the purge settings, the database was way too big. In fact, even without purging, the database should not have been allowed to get that big. I mean, seriously, it processes some 20MB of e-mail, and the spam database from it is over 300MB? There's something wrong with that.
  
  --
  Topher
3. Re:DSPAM. . . neat at fist, not for long. by Anonymous Coward · 2004-09-24 03:00 · Score: 0
  
  You obviously get a shitload of email and should have considered using a different training mode, like train on error. My database is 44MB. I suspect you also turned on multiword tokens? Of course it's going to grow the way you configured it.
4. Re:DSPAM. . . neat at fist, not for long. by Christopher Cashell · 2004-09-25 02:04 · Score: 1
  
  I do get a lot of mail, yes. However, DSPAM is claimed to be capable of handling thousands of users, so it should be well prepared to handle that kind of e-mail load. I may get a lot of e-mail, but I sure as hell don't get more than a thousand average people do.
  
  As for its configuration, I used mostly default settings, although as I mentioned above, I tightened its purge settings to try to reduce the DB size.
  
  I've used a lot of different spam filters, including a bunch of Bayesian style ones, and none of them have ever come close to generating a database the size of DSPAM's, especially not within just a few weeks.
  
  *shrug*
  
  I'm sorry, but I don't think blaming me for DSPAM's faults is the correct way to go about things. I mentioned previously that it is a neat system, in many ways, and it does have some very cool features.
  
  But the issues I've run into are valid issues (a friend of mine experienced some similar, though not identical, problems when he tried out DSPAM (on my suggestion)), and until they're addressed, I don't plan to try DSPAM again.
  
  --
  Topher
5. Re:DSPAM. . . neat at fist, not for long. by Anonymous Coward · 2004-09-27 14:20 · Score: 0
  
  I guess you didn't read my previous message where I suggested you change your training mode. It seems the few people I run into who are unhappy with DSPAM are more bitter because they can't read the docs for themselves.
6. Re:DSPAM. . . neat at fist, not for long. by Anonymous Coward · 2004-09-28 00:26 · Score: 0
  
  ..and if you want filtering comparable to other spam filters, you most certainly need to disable chained tokens; chaining takes up a bunch of disk space but delivers levels of accuracy much higher than other tools.
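
Why chained (multiword) tokens cost disk space can be seen from a small Python sketch. It assumes chaining means storing each adjacent word pair as an extra token alongside the single words; the function names and the `+` separator are made up for this example and may not match how DSPAM builds its tokens.

```python
def tokens_single(text: str) -> list[str]:
    """Split a message into individual word tokens."""
    return text.lower().split()


def tokens_chained(text: str) -> list[str]:
    """Return single-word tokens plus one token per adjacent word pair."""
    words = text.lower().split()
    pairs = [f"{a}+{b}" for a, b in zip(words, words[1:])]
    return words + pairs


message = "buy cheap pills now"
print(len(tokens_single(message)))   # 4
print(len(tokens_chained(message)))  # 7
```

A message of n words yields n single tokens but 2n - 1 tokens with pairs included, and pairs repeat far less often across messages than single words, so the set of distinct tokens to store grows much faster.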
2 errors in 22,786 by Anonymous Coward · 2004-09-23 07:39 · Score: 1, Funny

"2 errors in 22,786"

1 in 11,393 was too easy?
And to anyone out there who might believe this. by Anonymous Coward · 2004-09-24 07:11 · Score: 0

Your mileage will definitely vary unless you intentionally go out of your way to configure DSPAM to behave poorly. By default it works amazingly well, and you don't have to do anything to it. Once it was installed (I wrote a C milter using libdspam) I have done nothing to it except upgrade to a new version of dspam a couple times.
1. Re:And to anyone out there who might believe this. by jwbozzy · 2004-09-27 03:02 · Score: 1
  
  I'd like to note the above posts as further proof of what I said in my initial post. Note that they are posted anonymously, and give no specific information other than to say that DSPAM is great and it works for them. I thank the posters, whoever they are, for ever so elegantly proving my point for me.
  
  --
  perl -e 'printf("mmm %x\n", 3735928559)'