DSPAM v3.2 Released

second post? by TheMysteriousFuture · 2004-10-16 21:05 · Score: 1, Interesting

Are most people using a bayesian filter (DSPAM, CRM114, or SpamBayes) along with SpamAssassin (rule based)? Or do you just use the bayesian filter?

I see that most of these bayesian filtering programs mention that they can be used with SpamAssassin. Is it usually best to run both for DoublePlusGood(TM) spam catching?

--
.sig

Re:second post? by Anonymous Coward · 2004-10-16 21:07 · Score: 0

TROLL? ALREADY? bastard mod! try contributing a comment instead of just weakly modding down (anonymously)

Oh wait...I forgot to add the oblig "go ahead, mod me down! I have karma to burn!"...damn
Re:second post? by Anonymous Coward · 2004-10-16 21:10 · Score: 0

Let's burn up this mod's points with stupid posts. weeeeeeeeeee!

Well make up your mind! Is it a troll or is it underrated

*sigh*
Re:second post? by Anonymous Coward · 2004-10-16 21:15 · Score: 0

DSPAM seems to have problems integrating with Postfix + amavisd-new + SpamAssassin so I gave up..
SpamAssassin 3 just came out and it seems to be doing pretty well. I like having the ability to store the bayes info in MySQL.
Re:second post? by hayds · 2004-10-16 21:18 · Score: 5, Interesting

I would have thought that running 2 bayesian filters would cause more trouble than good. The first filter would be ok as it would be trained like usual.
The second filter would probably have problems because it would only see a small subset of all your mail as the first filter would have removed most of the spam. The second filter's sample would therefore be skewed and it would have far less data to accurately classify spam.
Just my thoughts on the subject anyway...
Re:second post? by Anonymous Coward · 2004-10-16 21:25 · Score: 0

SpamAssassin is _not_ a bayesian filter afaik...
Re:second post? by fyonn · 2004-10-16 21:26 · Score: 1

I use dspam alone on my system and it does a very good job of pulling about 80 spams a day from my inbox. I can't say that nothing gets through, but little does. few enough that I don't mind too much, maybe one or two a week.

what does surprise me is that sometimes obvious spams seem to get through, ie every now and then a 419 comes through and I'd have thought it would be well trained on those by now. nevertheless, it works a lot better for me than spamassassin did, and it requires less (or easier) maintenance. I'm running on dspam 3.0 atm so it's time to upgrade obviously.

dave
Re:second post? by Anonymous Coward · 2004-10-16 21:29 · Score: 0

I seem to be wrong. According to this there IS a "Bayesian learner" in SpamAssassin now. I'll have to try it out
Re:second post? by atrus · 2004-10-16 21:31 · Score: 2, Interesting

I'm running Postfix with RBLs. Looking at SpamCop, SpamHaus, and SORBS. It auto-rejects all e-mail coming from banned IPs. This brings me down to 1 spam a day. If your IP is blocked, tough, find a new ISP (these lists tend to be more self-expiring and not 'permanent ban' types, which is good).
Re:second post? by Anonymous Coward · 2004-10-16 21:41 · Score: 0

Yup, sbl-xbl.spamhaus.org, safe.dnsbl.sorbs.net, and bl.spamcop.net. All the spam filtering most people will ever need, and none of this training nonsense.
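For anyone wondering what a mail server actually does with those three zones: a DNSBL check reverses the octets of the connecting IPv4 address, appends the list's zone, and does an ordinary DNS lookup on the result. A minimal Python sketch of just the name-building step follows. It makes no network calls, and the function name and the example address are made up for illustration.

```python
import ipaddress

# The three zones named in the post above.
ZONES = ["sbl-xbl.spamhaus.org", "safe.dnsbl.sorbs.net", "bl.spamcop.net"]


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name that would be looked up for `ip` in `zone`."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    # 192.0.2.25 is a documentation-only address, used here as a stand-in.
    for zone in ZONES:
        print(dnsbl_query_name("192.0.2.25", zone))
```

If the name resolves (conventionally to an address in 127.0.0.0/8) the client is listed; a "no such name" answer means it is not. In Postfix this lookup is what the `reject_rbl_client` restriction performs for each configured zone.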
Re:second post? by Anonymous Coward · 2004-10-16 22:18 · Score: 0

It's been in SA for a while now...
Re:second post? by endx7 · 2004-10-16 22:24 · Score: 1

It depends on when the actual filter step occurs. For example, SA (SpamAssassin) by default only marks the message. The actual deletion (by the second filter) -could- occur after SA gets through it. Basic example: SA runs, then the other filter runs, and only afterwards do the other filter's action and SA's action fire, in either order.

The funny part is if the second filter includes headers as part of its bayesian filtering, the second filter could become biased based on spamassassin's results :P
Re:second post? by kalman5 · 2004-10-16 22:33 · Score: 4, Interesting

What I dislike is the centralized Antispam. What is spam for me might not be spam for you. I was using the antispam filter in Thunderbird, but at least in previous versions it was not good, so I switched to K9 ( http://keir.net/k9.html ). Is there nothing like K9 around for Linux? K.
Re:second post? by The+Mgt · 2004-10-16 22:54 · Score: 1

What is spam for me might not be spam for you.
Care to provide an example of a spam that somebody might want to read ?
Re:second post? by flynn_nrg · 2004-10-16 22:58 · Score: 5, Informative

It's your server and hopefully you'll never have to suffer the 'collateral damage' of living near a spammer (network neighbourhood wise). It has happened to me a couple of times. The first time I actually spent time sending my reply from my gmail account, and told the guy about it. The second time I didn't even bother.

Netblock blacklisting is a really poor solution. In some cases a single spammer causes a /24 and then a /16 to be blocked. It doesn't make sense to me. OTOH, I discovered some time ago that blocking Windows boxes works wonderfully, and it's extremely easy to do with OpenBSD's pf :-)

Btw, do you understand that changing ISP may not be an option?
Re:second post? by Neophytus · 2004-10-16 23:31 · Score: 2, Informative

Spamcop and Spamhaus I agree with. SORBS demand payment for removal of clean servers (albeit not to them). That just doesn't chime when people spam through an isp's smtp server and get caught.
Re:second post? by BasilBrush · 2004-10-16 23:35 · Score: 1

Some people do want to buy V1agra online. ;-)
Re:second post? by Anonymous Coward · 2004-10-17 01:03 · Score: 0

That's why you tell the second filter to ignore the X-headers from the first filter. IMO, it doesn't make any sense to use a less accurate filter (spamassassin) with a more accurate one (dspam, crm114, etc).
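A rough Python sketch of the "ignore the first filter's headers" idea, for setups where the second filter has no such option: strip the headers before the message is handed on. It assumes the first filter is SpamAssassin with its default `X-Spam-*` header names; the function name is invented for this example.

```python
from email import message_from_string


def strip_filter_headers(raw: str, prefix: str = "x-spam-") -> str:
    """Return the message with every header starting with `prefix` removed."""
    msg = message_from_string(raw)
    for name in list(msg.keys()):
        if name.lower().startswith(prefix):
            # Deleting by name removes all occurrences of that header and
            # is a no-op if it is already gone.
            del msg[name]
    return msg.as_string()


if __name__ == "__main__":
    raw = (
        "From: someone@example.com\n"
        "X-Spam-Status: Yes, score=7.1\n"
        "X-Spam-Level: *******\n"
        "Subject: hello\n"
        "\n"
        "body text\n"
    )
    print(strip_filter_headers(raw))
```

The second filter then learns only from what the sender wrote, not from the first filter's verdict.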
Re:second post? by kalman5 · 2004-10-17 01:51 · Score: 1

Some people want to buy viagra online and some want to be told about new porno sites... K.
Re:second post? by sketerpot · 2004-10-17 01:54 · Score: 1

Some people should learn to use a search engine.
Re:second post? by ozzee · 2004-10-17 02:11 · Score: 1

I discovered some time ago that blocking Windows boxes works wonderfully, and it's extremely easy to do with OpenBSD's pf :-)
How can pf detect a Windows box?
Re:second post? by BasilBrush · 2004-10-17 02:14 · Score: 1

They haven't seen the "Learn to use Google in 30 days" spam yet.
Re:second post? by Anonymous Coward · 2004-10-17 02:17 · Score: 0

Fingerprinting with pf...
Re:second post? by tzanger · 2004-10-17 02:27 · Score: 1

I use CBL and RBLDNS -- neither blocks entire netblocks; the IPs in them are static IPs which either run open proxies or fail other checks. Very, very effective.
Re:second post? by homer_ca · 2004-10-17 03:58 · Score: 1

Bayes in Spamassassin doesn't seem to recognize 419 emails very well. What does work are the fraud rules from the Spamassassin Rules Emporium.
Re:second post? by sketerpot · 2004-10-17 04:54 · Score: 1

What sort of person needs 30 days to learn to use Google?
Re:second post? by Anonymous Coward · 2004-10-17 05:03 · Score: 0

I once asked Samsung a question through email, and they promptly added me to their mailing list. I get a monthly bit of marketing fluff (which I flag as spam to Yahoo), but obviously some people want it.
Re:second post? by BasilBrush · 2004-10-17 05:15 · Score: 1

The type of person that reads and responds to spam messages.
Re:second post? by atrus · 2004-10-17 06:04 · Score: 1

A valid point. I'm trying to stick toward more dynamic lists so I may reconsider SORBS. SpamCop accounts for 90% of my blocks, while SpamHaus picks up the other 9%.

DSpam with qmail / vpopmail by hayds · 2004-10-16 21:10 · Score: 4, Interesting

I am using D-Spam on a qmail/vpopmail server and I find that it's great in terms of accuracy. Most of my users have never had a false positive and many haven't seen a spam after a couple of weeks of training.

The problem that I have with DSpam is the integration side. I'm not sure how it goes with other mail systems but integrating it with vpopmail was a major pain. It seems easy, you just put the command in the dotfiles, but in practice getting it to work was quite a trial. Even now it doesn't integrate properly with the web administration, etc despite some scripting and minor code changes.

Because of this I've been thinking of switching to Spam Assassin simply because of its integration with qmail-scanner. Has anyone else had similar problems or been in a similar situation and found a good solution?

Re:DSpam with qmail / vpopmail by Anonymous Coward · 2004-10-16 21:17 · Score: 0, Troll

qmail?...I've read through that guy's entire site and I just can't stand his attitude....if you google many people agree. For example his way is the right way. period.

that said if you _MUST_ use qmail make sure to google for one of the many collections of qmail patches (sometimes with install scripts).

One of my problems with qmail is that it's released under a V E R Y vague license, and the author doesn't respond to repeated attempts to clarify what is and is not allowed.
So if he decides to start charging/suing people for commercial use all of a sudden one day don't say I didn't tell you so
Re:DSpam with qmail / vpopmail by Anonymous Coward · 2004-10-16 21:29 · Score: 1, Informative

Naah, meanwhile SpamAssassin supports bayesian filtering as well, besides its rule based filtering.
Re:DSpam with qmail / vpopmail by Anonymous Coward · 2004-10-16 21:35 · Score: 0

What are you smoking? spamassassin has had a bayesian filter for a long time. You might not have seen it because it kicks in after some time, when it has enough data, but it's there and it works fine, especially in spamassassin 3.
Re:DSpam with qmail / vpopmail by Anonymous Coward · 2004-10-16 21:43 · Score: 0

.sigs

So how well does Spam Assassin's bayesian filter work? Is it up to par with D-Spam/SpamBayes/CRM114? If so, then why do I hear people talking about how they are feeding mail into SpamAssassin, then into ${OTHERBAYESIANFILTER}?
Re:DSpam with qmail / vpopmail by Inda · 2004-10-16 22:02 · Score: 3, Interesting

I'll admit I don't really understand your post.
All these new spam removal programs are all very well and good but from an end user's point of view, all I would like to know is:
How long am I going to have to put up with emails like this?

Hi. This is the qmail-send program at somewhere.com.
I'm afraid I wasn't able to deliver your message to the following addresses.
This is a permanent error; I've given up. Sorry it didn't work out.

info@somewhere.com
This address no longer accepts mail.

--- Below this line is a copy of the message.

... COMPLETE COPY OF NETSKY VIRUS ...

------=_NextPart_000_001B_01C0CA80.6B015D10--

I've had well over a thousand of these types of email in the last 30 days.
DSPAM v3.2 is probably a rock solid application in the right hands.

--
This post contains benzene, nitrosamines, formaldehyde and hydrogen cyanide.
Re:DSpam with qmail / vpopmail by Anonymous Coward · 2004-10-16 22:19 · Score: 2, Informative

What you want is ClamAV:
http://clamav.sourceforge.net/
Re:DSpam with qmail / vpopmail by Anonymous Coward · 2004-10-16 22:22 · Score: 0

how do i reply to the top thread? google nice D-Spam is not very good
Re:DSpam with qmail / vpopmail by hayds · 2004-10-16 22:23 · Score: 4, Informative

This is a legit message from someone's mail system. You are receiving this because someone has been infected with a virus. Their computer is sending messages from your email address, and some of these messages are going to non-existent mail addresses. Because they are spoofing your mail address in the From: you are receiving all the bounces.
So technically, this isn't spam or junk mail. It's someone's email system doing what it's supposed to, returning 'your' email because the recipient didn't exist.
Unfortunately, there's probably not much you can do about this without blocking all such legit system messages.
Re:DSpam with qmail / vpopmail by Anonymous Coward · 2004-10-16 22:36 · Score: 0

Because they're stupid. And so are you.
Re:DSpam with qmail / vpopmail by Anonymous Coward · 2004-10-16 23:06 · Score: 3, Insightful

Administrators really shouldn't configure their systems to return mail that contains viruses. Most of these are sent from spoofed addresses anyway and don't make it to the system that is actually infected. They just annoy people who are not responsible for the original message. And on top of that it just generates an unnecessary amount of traffic, and I really just consider this to be spam.
Re:DSpam with qmail / vpopmail by Anonymous Coward · 2004-10-16 23:20 · Score: 0, Flamebait

you ignorant clod!

That message has absolutely nothing to do with DSpam. It's a failure notice from some legit mailserver responding to a faked From: address.

The spammer/virus/whatever has picked up your email address and is faking it.

That is all.

Try setting up SPF, and get a cluestick.
Re:DSpam with qmail / vpopmail by hayds · 2004-10-16 23:33 · Score: 1

This is very true but as an end user unfortunately there's not much you can do. For messages like this to stop, it's not only your mail administrator that needs to block bounces like this but every mail administrator on the net. And it's not like that's gonna happen anytime soon :(
Re:DSpam with qmail / vpopmail by BasilBrush · 2004-10-16 23:39 · Score: 1

If you've received 30 of those, then you aren't using any kind of trainable spam filter at all. Mark it spam once, and there's no need to see the other 29.
Re:DSpam with qmail / vpopmail by DaMeatGrinder · 2004-10-17 00:30 · Score: 3, Interesting

Unfortunately, probably not much you can do about this without blocking all such legit system messages.
Here's a crazy idea: if you crypto-sign all messages you send, it should be possible to check the signature in bounced messages and filter any unsigned bounced messages.
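A hedged sketch of that idea in Python. To stay short it swaps the public-key signature for a keyed hash (HMAC) tucked into the Message-ID of outgoing mail; a bounce that does not quote a Message-ID carrying a valid tag can then be filtered. The secret, the Message-ID layout, and all function names here are hypothetical, not any existing tool's format.

```python
import hashlib
import hmac
import re
import secrets

SECRET = b"change-me"  # hypothetical per-site key, never sent in mail


def _tag(token: str) -> str:
    """Keyed hash of the random token, truncated to 16 hex digits."""
    return hmac.new(SECRET, token.encode(), hashlib.sha256).hexdigest()[:16]


def make_message_id(domain: str) -> str:
    """Message-ID for outgoing mail: <random-token.tag@domain>."""
    token = secrets.token_hex(8)  # 16 hex digits
    return f"<{token}.{_tag(token)}@{domain}>"


MSGID_RE = re.compile(r"<([0-9a-f]{16})\.([0-9a-f]{16})@[^>\s]+>")


def bounce_is_ours(bounce_text: str) -> bool:
    """True if the bounce quotes a Message-ID whose tag verifies."""
    for token, tag in MSGID_RE.findall(bounce_text):
        if hmac.compare_digest(tag, _tag(token)):
            return True
    return False
```

This only helps when the bouncing server quotes the original headers, and it does nothing against a virus that replays a genuine message of yours, so treat it as a filter hint rather than proof.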
Re:DSpam with qmail / vpopmail by qbwiz · 2004-10-17 00:41 · Score: 1

Maybe, but then they have to distinguish, to around 100%, what is a virus and what isn't. Currently, they just have to know, i.e. if the mailbox doesn't exist anymore; to selectively bounce they would need to examine the message carefully for known virus signatures.

--
Ewige Blumenkraft.
Re:DSpam with qmail / vpopmail by TFGeditor · 2004-10-17 00:57 · Score: 1

High volume of improper bounces like this is reason for blacklisting by many of the BL maintainers. (SpamCop, et al.)

--
Ignorance is curable, stupid is forever.
Re:DSpam with qmail / vpopmail by Gadzinka · 2004-10-17 01:23 · Score: 1

Well, I have a clamav running on my mail server and it sorts out virus emails as well as bounces containing them.

Robert

--
Bastard Operator From 193.219.28.162
Re:DSpam with qmail / vpopmail by legirons · 2004-10-17 01:49 · Score: 1

"Unfortunately, probably not much you can do about this without blocking all such legit system messages."

Which many of us do routinely. So why bother sending faked "virus warning" messages at all, if the only effect is to worry some people with clean computers, and get the rest of us to block anything with "postmaster" in the header of the email?
Re:DSpam with qmail / vpopmail by tzanger · 2004-10-17 02:34 · Score: 1

COMPLETE COPY OF NETSKY VIRUS

Make the mail admin install the qmail-send.mimeheaders patch -- it causes bounces to bounce back only the headers of email with MIME attachments. As google provides, my qmail patchlist is quite long, actually. :-)

I'm moving over to Postfix these days -- it seems to do everything qmail does but without the need to recompile every time I want a change.
Re:DSpam with qmail / vpopmail by Anonymous Coward · 2004-10-17 04:29 · Score: 0

Very true, but I just don't understand why system administrators don't seem to get this.
Re:DSpam with qmail / vpopmail by hanulec · 2004-10-17 04:52 · Score: 1

qmail integration is easy w/ DSpam... just use Procmail. SpamAssassin shouldn't even be an option... its scoring algo in 2.6 put a bad taste in my mouth.

I compile DSpam w/ (works w/ v3.0... haven't tried 3.2 yet):

./configure --prefix=/usr/local/spam/dspam-3.0.0-sigheader \
--enable-signature-headers \
--enable-source-address-tracking \
--with-db4-includes=/usr/local/db/db-4.2.52/include \
--with-db4-libraries=/usr/local/db/db-4.2.52/lib \
2>&1 | tee H-configure-sigheader

#
# sigheader option requires you to 'bounce' spam or ham
# mails into your pre-defined aliases which reclassify emails
#

make 2>&1 |tee H-make-sigbody

Your /var/qmail/rc then looks like:

#!/bin/sh

exec env - PATH="/var/qmail/bin:$PATH" \
qmail-start '|preline procmail' multilog t s1000000 n20 /var/log/qmail

And your .procmailrc entry looks like:

:0 fw: dspam.lock
|/usr/local/spam/dspam-3.0.0-sigheader/bin/dspam --mode=teft --user=hanulec --feature=chained,bnr --deliver=innocent,spam --stdout --dspam -d hanulec

You then use procmail to capture spams into another folder... the syntax of this is based upon procmail's integration w/ your POP3 or IMAP server.

I reliably use qmail and Cyrus IMAPd on a slow dedicated server to deliver 100K+ emails/day.
Re:DSpam with qmail / vpopmail by sheriff_p · 2004-10-17 05:16 · Score: 1

http://cou.ch/bounce.txt

Perl script to handle spam bounces...

--
Score:-1, Funny
Re:DSpam with qmail / vpopmail by Anonymous Coward · 2004-10-17 06:58 · Score: 0

No, this is a bad message from an improperly set up mail system. You do not wait until local delivery to decide you can't deliver, and then send an error to the From: address. If the mail cannot be delivered, the sending MTA should be told this during the SMTP dialog. The vast majority of these come from qmail. Is it actually broken, or do incompetent people just like qmail a lot?
Re:DSpam with qmail / vpopmail by JuggleGeek · 2004-10-17 07:13 · Score: 1

So technically, this isn't spam or junk mail. It's someone's email system doing what it's supposed to, returning 'your' email because the recipient didn't exist.
Technically, if they are bouncing messages back to me when I didn't send the original message, it is unsolicited email.
Any mail that wasn't delivered because it was a virus shouldn't bounce - everyone *knows* that viruses spoof addresses. If it isn't delivered because a filter decided it was spam, it shouldn't bounce, IMO, as spam usually forges addresses.
At the very least, if they want to bounce it, they should run an SPF check. If SPF says "This isn't legitimate mail from our domain" then bouncing it is just attacking someone innocent. I get a bunch of bounces from forged spam and forged viruses. I didn't send the spam, or the viruses, and sending me a copy of the crap doesn't really help anyone.
Re:DSpam with qmail / vpopmail by Vellmont · 2004-10-17 10:16 · Score: 1

How long am I going to have to put up with emails like this?

If you install amavisd-new and clamav, you won't have to put up with it at all. amavisd-new is a generic mail proxy that calls both spamassassin for spam filtering, and clamav for virus scanning. If you really want you can get it to call dspam as well. It also can use a huge number of other virus scanners if you prefer them. I now get zero viruses using clamav and zero false positives.

--
AccountKiller

Is DSPAM... by DLR · 2004-10-16 21:26 · Score: 2, Funny

...any better than CSPAN?

--
"Like fire and fusion, government is a dangerous servant and a terrible master."~RAH

Re:Is DSPAM... by Anonymous Coward · 2004-10-16 21:27 · Score: 0

HAHAHAHAHAH...
no really, HAHAHAHHAHAH BWHHAHAHAHHAHAHAHAHHAHAHHAHA

Don't use so many caps.

hahahahhahahahhahahhah
Re:Is DSPAM... by Anonymous Coward · 2004-10-16 22:05 · Score: 0

Well, it's one better on the frontend, but it actually lags by one in the backend...

Four words. by Anonymous Coward · 2004-10-16 21:26 · Score: 0

Iron Port, fuck yeah.

3.2? by fyonn · 2004-10-16 21:29 · Score: 0, Redundant

it seems to me that we're on 3.2 preview release 1. not 3.2 release which is scheduled for the 20th to the 22nd. is this post a bit early?

dave

Re:3.2? by Anonymous Coward · 2004-10-16 21:36 · Score: 0

whoops, duped you.

DSPAM version 3.2 has _NOT_ been released by TheMysteriousFuture · 2004-10-16 21:34 · Score: 3, Informative

Check out the download page

Here's what it shows.

October 1, 2004 3.2 Release Candidate 1
October 8, 2004 3.2 Release Candidate 2
October 14, 2004 Devel Frozen - Critical Changes only
October 15, 2004 3.2 Preview Release 1
October 20, 2004 Devel Absolutely Frozen. Release to packagers.
October 22, 2004 3.2-STABLE Official Release

ONLY the 3.2 Preview Release 1 is currently out!

--
.sig

Re:DSPAM version 3.2 has _NOT_ been released by Anonymous Coward · 2004-10-16 22:50 · Score: 2, Informative

Oh.. is that why the article says, "DSPAM's official release is next week, but you can download the preview release now"? I never, ever would have guessed.
Re:DSPAM version 3.2 has _NOT_ been released by WinterpegCanuck · 2004-10-18 17:04 · Score: 1

But we need this story posted now if we are to dupe it on time.

What about false positives. by Anonymous Coward · 2004-10-16 21:35 · Score: 5, Insightful

From TFA, "around 99.95% (1 error in 2000)"

I'm sick of spam filters bragging about their overall error rate. All of them do OK at getting rid of the bulk of spams and saving the bulk of time.

The really important differentiating factor is how many false positives they mistakenly accuse of being spam.

The consequences of a spam message getting through are minimal - under a second of time, on average, to skip them.

The consequences of a non-spam getting blocked can be huge - loss of a customer - a mom not knowing her kid is in trouble.

I wish the spam filters focused entirely on reporting how few false positives they produce.

Re:What about false positives. by Anonymous Coward · 2004-10-16 22:18 · Score: 0

Well, since you asked:
TS: 20367 TI: 18781 SM: 78 IM: 152 SC: 0 IC: 0

That's 152 false positives since Mar 16 2004 out of 20,000-odd messages. SpamAssassin is letting heaps of messages through that dspam catches. That's approximately 1 FP per day on average, but there are hardly any now, maybe 1 or 2 per week.

(this is DSpam 2.1)
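To put those counters in proportion, here is a small Python sketch. It assumes TI counts all innocent mail seen and IM the innocent mail wrongly flagged (which matches the "152 false positives" reading above), and it assumes the counting window runs from Mar 16 2004 to the date of the post; both are assumptions, not something the stats line states.

```python
from datetime import date

# Counters quoted in the post above.
stats = {"TS": 20367, "TI": 18781, "SM": 78, "IM": 152}


def false_positive_rate(s: dict) -> float:
    """Innocent-misclassified as a fraction of all innocent mail."""
    return s["IM"] / s["TI"]


def per_day(count: int, start: date, end: date) -> float:
    """Average count per day over the window [start, end)."""
    return count / (end - start).days


if __name__ == "__main__":
    start, end = date(2004, 3, 16), date(2004, 10, 16)  # assumed window
    print(f"false positive rate: {false_positive_rate(stats):.2%}")
    print(f"false positives per day: {per_day(stats['IM'], start, end):.2f}")
```

Under those assumptions the window is 214 days, so the average works out a little under one per day, in line with the estimate in the post.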
Re:What about false positives. by Scaba · 2004-10-16 22:44 · Score: 4, Funny

The consequences of a non-spam getting blocked can be huge - loss of a customer - a mom not knowing her kid is in trouble.

Dear Mom,

I hope this email finds you well. All is fine here, out in your garage. As you know, I love working on my cars. I'm currently replacing the engine block in my '76 Trans Am. Well, wouldn't you know it, but just moments ago, this 550 lb engine block fell on my legs and I cannot stand up, and in fact, am probably bleeding to death. Luckily, I have my cell phone handy and so am able to send you this email - the marvels of technology!! Anyway, I know you only check your email about twice weekly, but when you do, please send help.

Your loving son,

Dexter
Re:What about false positives. by Anonymous Coward · 2004-10-16 23:07 · Score: 2, Interesting

even false positives are not important. if I get 1000 spams a day, but only 40 legal mails, then marking everything as spam is 96% correct. if 35 of my mails are easily identified as legal mail (a procmail rule could do it - closed and filtered mailing list) then marking those as good and everything else as bad is 99.5% correct. note that all 5 personal mails I would get are still marked as spam.

the big question for me is: how many mails do I need to check for false detection? and here is the dspam issue: it doesn't give you a grey marking, so you either check none of the mails marked as spam, and could possibly lose many important mails, or you need to check all of the spam messages, which lowers the advantage of a spam filter.

i get a huge volume of spam, some ML and a few other postings. if i get one word document from some windows-using relative every few weeks, it still must not be marked as spam. with dspam i had to check the mails marked as spam to find such false positives. because i had to check all spam mails, dspam was useless. with spam assassin i only check level 1..5 and can ignore thousands of mails in spam level 5..10, which gives me a very good middle way between a very unlikely case of overlooking a false positive and thus losing a mail (hasn't happened as far as i know) and looking at only a few spam mails for false positives.
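The percentages in the first paragraph of this post can be checked with a few lines of Python. This is just the post's own numbers (1000 spams, 40 legitimate mails, 35 of them mailing-list traffic) worked through; the variable names are illustrative.

```python
def accuracy(correct: int, total: int) -> float:
    """Fraction of messages classified correctly."""
    return correct / total


spam, legit = 1000, 40
total = spam + legit

# Case 1: mark everything as spam. All spam is right, all legit mail is wrong.
mark_all = accuracy(spam, total)

# Case 2: pass the 35 mailing-list mails, mark everything else as spam.
whitelisted = 35
mark_rest = accuracy(spam + whitelisted, total)

# What the headline accuracy hides: the share of legit mail that is lost.
missed_personal = legit - whitelisted
legit_lost = missed_personal / legit

print(f"mark everything: {mark_all:.1%}")
print(f"whitelist lists: {mark_rest:.1%}")
print(f"legit mail lost: {missed_personal} of {legit} ({legit_lost:.1%})")
```

The second case scores about 99.5% overall while still losing every one of the five personal mails, which is the point being made: overall accuracy says little when spam outnumbers real mail 25 to 1.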
Re:What about false positives. by -noefordeg- · 2004-10-16 23:30 · Score: 2, Informative

I'm running a mailserver with postfix, dspam, squirrelmail, courier pop/imap, amavis and Postfix Admin where I also integrated the DSPAM phpControlCenter.

DSPAM has so far given me 0 false positives.
The trick with dspam is to start with a clean database for each user and let them start to 'sort out their spam'. For imap it's stupidly simple. Everyone has two folders "spam" and "notspam", where you can drag&drop an email to the right folder. A script picks up any emails in each folder every hour and does the necessary add-spam/not-spam processing.
For pop it's just a matter of forwarding the email to the add-spam/not-spam addresses.

This works so very well because each user gets to decide which emails he thinks are spam and which emails he would like to receive.

Also, if they log on to their webmail they can control what emails are marked as spam from their DSPAM phpControlCenter, and also correct any false positives, if there are any, or choose to block sender addresses and more.
Re:What about false positives. by BasilBrush · 2004-10-16 23:44 · Score: 2, Insightful

Why isn't that relative in your address book? / Why don't you have whitelisting set up?
Re:What about false positives. by legirons · 2004-10-17 01:58 · Score: 1

"Why isn't that relative in your address book? / Why don't you have whitelisting set up?"

(a) Because it would whitelist any emails sent from a virus-infected computer that that person had previously sent an email to.

(b) Because people like that change their address all the time. "Hi! I'm on AOL now -- see my new address?"

(a+b) Because people like that never sign their emails, nor do they use different email-addresses for personal, public, shopping, and mailing-lists.

I think his point was that you need to be able to check the things which might be legit, without having to wade through reams of stuff which is clearly crap.

For example, I put any HTML email into a special folder, and check it every couple of weeks in case some friend has a new hotmail account.

But stuff in the "contains GREETINGS FROM THE DESK OF..." folder can be deleted without any such consideration. So putting both types of email into a generic "spam" folder isn't always as helpful as it sounds.
Re:What about false positives. by BasilBrush · 2004-10-17 02:27 · Score: 2, Insightful

With a 99.9% accuracy on spam filters, and better performance on false positives, it just isn't worth the time. On the occasional chance that you are sent something from an address that isn't in your address book, and also happens to be a false positive, the chance of it also being vitally important are slim. And if it is vitally important, the sender will in all probability chase you when you don't respond.
There's a story about a CEO that used to sweep his pile of memos into the waste bin every morning. The theory being that 99% of them were about things that were irrelevant, and for the 1% of important stuff, people would chase him. I can't remember which CEO it was supposed to be though, and it's probably apocryphal. But it does hint at a truth. People who manually go through any amount of spam to search for false positives are probably being too anally retentive. Life is too short.
Re:What about false positives. by GreyWolf3000 · 2004-10-17 05:11 · Score: 1

If you reject spam at SMTP time, then the person sending the mail will know right away. You could even send back a report as to why the mail was marked as spam.
If you look at how spamassassin works, for example, it's a lot of little things. You can actually send back what each of those little things were, by sending back SA's report.

--
Slashdot: Where people pretend to be twice as smart as they really are by behaving like children.
Re:What about false positives. by killjoe · 2004-10-17 06:20 · Score: 1

Don't tell anyone about the notspam email address, the spammers will send their spam to it and mess up your filters.

--
evil is as evil does
Re:What about false positives. by Anonymous Coward · 2004-10-17 07:12 · Score: 0

And when the person who sends the falsely identified as spam mail tries to send it, and the MTA on the other side reports "Sorry, your mail looks like spam", he knows it wasn't delivered. The mail doesn't have to get lost you know.
Re:What about false positives. by HermanAB · 2004-10-17 07:35 · Score: 1

The danger of false positives with modern filters is much overrated.
People are getting used to there being mail filters in the system and know that email is not perfectly reliable. This can be due to mechanical reasons - a mail filter discarding the message, or due to human reasons - the message got lost in a pile of 10,000 spams, since the user doesn't have a spam filter, or it may be an executive with email overload who gets 2000 legitimate messages every day.
Therefore, if someone sends an important message and doesn't get a response they pick up the telephone and call - yes, some people still use phones...
In my experience, SpamProbe has yet to create a false positive, in about 1.5 million messages received. OK, I'm not looking very hard for false positives, but so far, I haven't seen a single one and if there were, it doesn't matter.

--
Oh well, what the hell...
Re:What about false positives. by avida · 2004-10-17 08:44 · Score: 1

that is why you should use whitelists to make sure mommy knows you are all right. hell, call the woman who gave you birth to let her know you are all right and your email is perfectly fine.
Re:What about false positives. by Anonymous Coward · 2004-10-17 09:41 · Score: 0

dspam does too have a grey marking. Certain spams are deleted, maybe-spams are quarantined for you to check over, and non-spams are delivered to you. Don't complain about things you can't be bothered to read about.
Re:What about false positives. by neafevoc · 2004-10-17 12:24 · Score: 1

That's how I do it, except I create a trainer folder and an innocent-trainer folder.

If spam falls into your in box, throw it in the trainer folder. If an innocent email falls into your spam folder, throw it into the innocent-trainer folder.

It's always good to feed dspam innocent email. That's what the innocent-trainer folder is for. I fed about a hundred or so of my innocent emails into the innocent-trainer folder.

For the first hundred or so emails, dspam may misplace email, but after its training session, I had zero false positives. Here's my dspam_stats:

TS Total Spam: 10103
TI Total Innocent: 1063
SM Spam Misclassified: 807
IM Innocent Misclassified: 23
SC Spam Corpusfed: 285
IC Innocent Corpusfed: 191
TL Training Left: 1223

It says I still have over 1200 to go for training. Since my training is not complete, I still get the occasional spam in my in box, but as for false positives, I have not received any since the beginning of training.
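For anyone who wants to turn a dspam_stats listing like the one above into rates, here is a rough Python sketch. It parses the two-letter codes and the trailing numbers exactly as pasted, and it assumes SM is spam that slipped through and IM is innocent mail flagged as spam, each measured against its own total. The regular expression and names are made up for this example and are not part of dspam.

```python
import re

STATS_TEXT = """\
TS Total Spam: 10103
TI Total Innocent: 1063
SM Spam Misclassified: 807
IM Innocent Misclassified: 23
SC Spam Corpusfed: 285
IC Innocent Corpusfed: 191
TL Training Left: 1223
"""

# Two-letter code, free-text label, colon, integer value.
LINE_RE = re.compile(r"^([A-Z]{2})\s.*:\s*(\d+)\s*$")


def parse_stats(text: str) -> dict:
    """Map each two-letter counter code to its integer value."""
    out = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            out[m.group(1)] = int(m.group(2))
    return out


s = parse_stats(STATS_TEXT)
missed_spam_rate = s["SM"] / s["TS"]     # spam that reached the inbox
false_positive_rate = s["IM"] / s["TI"]  # innocent mail flagged as spam

print(f"missed spam: {missed_spam_rate:.2%}")
print(f"false positives: {false_positive_rate:.2%}")
print(f"training left: {s['TL']}")
```

These are cumulative figures, so they include the early training period; they say nothing about the current rate, which the poster reports as zero false positives.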

DSpam by Anonymous Coward · 2004-10-16 21:37 · Score: 0, Informative

I have been using DSpam for my network for quite some time now (~a month or two) and have since not received a complaint from any users; it seems to me it works better than CRM114.

Who else read this by Anonymous Coward · 2004-10-16 21:47 · Score: 0

as SPAM released?

Filters? by Anonymous Coward · 2004-10-16 22:22 · Score: 3, Funny

I've always found that the best filter still is the humble (and the not so humble) human :p

How is the weather in India these days? by Anonymous Coward · 2004-10-16 22:42 · Score: 0

... have ya gone through several delete keys yet?

Re:How is the weather in India these days? by Anonymous Coward · 2004-10-16 22:58 · Score: 0

I'm actually an Aussie :p

And there is such thing as the Ctrl and Shift keys which are used for highlighting multiple emails ...
Re:How is the weather in India these days? by hayds · 2004-10-16 23:15 · Score: 1

Well you should still know how the weather is in India. You should be watching the cricket ;)

did they fix the problems? by Anonymous Coward · 2004-10-16 22:54 · Score: 4, Interesting

a few months ago those features were available, too. while dspam is great at filtering mail, I faced two crucial problems, which forced me back to spamassassin. I haven't heard that they fixed any of those:
- the database did grow huge. when my single user server with 128 mb had to use a 512 mb spam token database, performance was terrible. even with the tools included I could not do anything to fix the issue.
- dspam knows only yes or no, there is no usable value that gives you some grey information. as a result, I had to check all those spam postings for false positives. Spamassassin on the other hand has that spam result 0 .. 10, so I can check 0..4 where 0 is ok (few false negatives) and 1..4 spam (few false positives), and I can directly delete thousands of mails in 5..10 without looking at them.

I won't go back to dspam unless someone can offer specific help for those issues. I believe everyone will face them sooner or later.
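
The score bands described above can be sketched in a few lines of Python. The thresholds mirror the poster's 0 / 1..4 / 5..10 split; the function name and defaults are hypothetical and not part of SpamAssassin or DSPAM.

```python
def triage(score, review_from=1.0, delete_from=5.0):
    """Sort a message into one of three bands by its spam score.

    Thresholds are illustrative: below review_from the mail is
    delivered, from delete_from up it is discarded unseen, and the
    band in between is held for a human to look at.
    """
    if score >= delete_from:
        return "delete"
    if score >= review_from:
        return "review"
    return "deliver"


for score in (0.0, 3.2, 7.5):
    print(score, triage(score))
```

Only the middle band needs manual checking, which is the whole appeal of having a graded score instead of a yes/no answer.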

Re:did they fix the problems? by Anonymous Coward · 2004-10-17 01:08 · Score: 2, Informative

There are a lot of things you can (and should) do to keep databases small in DSPAM when disk is an issue. The problem is some of this is in the FAQ rather than the docs...but you can change your training mode to TOE (which only trains on error), set up merged groups (which use a global db, with each user only storing corrections; almost as accurate), do some creative purging, and if you're really paranoid about disk, turn off some features like chained tokens (although I don't think it's necessary).

As for a gray area, DSPAM has a confidence level (has for many versions now) which you can use to greylist messages, or you can set up classification networks and neural networks to have DSPAM consult other users' dictionaries (neural networks is kind of cool because it seeks out the most reliable users for classifying your mail).

So yeah, it's done what you want for quite a while now. I've managed to get my system down to about 5MB per user using merged groups and TOE, and most of my users get 99.9% or better.
Re:did they fix the problems? by hacker · 2004-10-17 02:45 · Score: 2, Informative

"the database did grow huge. when my single user server with 128 mb had to use a 512 mb spam token database, performance was terrible. even with the tools included I could not do anything to fix the issue."

Did you run the nightly and weekly purge scripts, as documented? (purge.sql for your DBI driver)
Did you also change the model to TUM from the default? (MUCH more accurate results over TOE or TEFT in our case, and we get a lot of spam!)
"dspam knows only yes or no, there is no usable value that gives you some grey information. as a result, I had to check all those spam postings for false positives."

I'm not sure what this means, but I've never personally had this problem. dspam gives each spam a percentage, which I can sort on using the web interface. Those with a lower percentage "might be" spam, but need to be checked. Those with a higher percentage (confidence) ARE spam. After 6 months of running dspam, I hardly ever check the quarantine now, because they're all spam. It's learned what is and what is not spam, and delivers accordingly.
I, like you, used SA for a year or two, and had it trained down to a 2.0 threshold (from the default of 5.0). I also had over 300 custom rulesets that blocked based on incoming subject at the MTA side, before even accepting the mail message and sending it to SA. I also used 13 RBLs. We were getting over 5,000 incoming spam a day, and about a dozen would slip through to the users' mailboxes. After 2 years and all of that, we were only at about 90% effectiveness (and yes, my SA rulesets were kept updated all the time).
After 2 weeks of using dspam, we were already at 98%, and not a single spam had slipped through to any user's mailbox. Granted, in the early period of using it for us, some messages were marked as False Positives, but that hasn't happened for ANY user in several months now.
We also stopped using the custom MTA rulesets, and don't use any RBLs either.
dspam absolutely blows away SA (currently, until/unless SA changes) in our particular subset of the mail we receive.
Re:did they fix the problems? by SendBot · 2004-10-17 11:20 · Score: 2, Informative

the database did grow huge... ...performance was terrible.

Did you try TOE mode? Instead of analyzing everything, it just uses the errors. That means significantly less utilization of your data backend. From the FAQ:

Switch to TOE Mode. DSPAM v2.10 supports TOE (Train-On-Error) mode, which only performs writes to the database in the event that a misclassification has occurred (or if a user has fewer than 4000 innocent messages in corpus). Train-on-error mode should make a significant reduction in the number of writes (and therefore locks) being performed on your database, and may actually improve accuracy as TOE has been known to do so. The default mode of learning is TEFT (Train Everything). This performs a much more detailed training of incoming messages and can more easily adapt to new types of email behavior for users, but does use up a significant number of resources. This is a definite thing to try if you're bottlenecked!
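
To make the write savings concrete, here is a toy Python sketch of the two modes. It is not DSPAM's code: the classifier is a crude token-count comparison, and it ignores the "fewer than 4000 innocent messages" exception mentioned in the FAQ text. The class and message data are invented for illustration.

```python
from collections import Counter


class ToyFilter:
    """Crude token-count classifier that tracks database writes."""

    def __init__(self, mode):
        assert mode in ("TEFT", "TOE")
        self.mode = mode
        self.spam = Counter()
        self.ham = Counter()
        self.writes = 0

    def classify(self, tokens):
        """Return True (spam) if spam token hits outnumber ham hits."""
        spam_hits = sum(self.spam[t] for t in tokens)
        ham_hits = sum(self.ham[t] for t in tokens)
        return spam_hits > ham_hits

    def _train(self, tokens, is_spam):
        (self.spam if is_spam else self.ham).update(tokens)
        self.writes += 1  # one simulated database write per training

    def process(self, tokens, is_spam):
        """Classify, then train according to the selected mode."""
        guess = self.classify(tokens)
        # TEFT trains on every message; TOE only when the guess was wrong.
        if self.mode == "TEFT" or guess != is_spam:
            self._train(tokens, is_spam)
        return guess


# Hypothetical message stream: (tokens, is_spam)
MESSAGES = [
    (["cheap", "pills"], True),
    (["meeting", "notes"], False),
    (["cheap", "pills", "now"], True),
    (["lunch", "meeting"], False),
    (["pills", "offer"], True),
]


def run(mode):
    f = ToyFilter(mode)
    for tokens, is_spam in MESSAGES:
        f.process(tokens, is_spam)
    return f


teft = run("TEFT")
toe = run("TOE")
print("TEFT writes:", teft.writes)
print("TOE writes:", toe.writes)
```

On this tiny stream TEFT writes once per message while TOE writes only for the single early mistake. Real traffic will not be this tidy, but the direction of the effect is the same.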

Does DSPAM inform the sender? by Axoiv · 2004-10-16 23:02 · Score: 3, Funny

Does DSPAM inform the sender that his/her e-mail has been filtered out?

Re:Does DSPAM inform the sender? by hayds · 2004-10-16 23:08 · Score: 4, Interesting

No. Since spammers mostly use fake addresses, it's pretty pointless trying to send mail back to them. All that would achieve would be that you would receive all the bounces back and you'd get double the junk mail.
Re:Does DSPAM inform the sender? by Axoiv · 2004-10-16 23:44 · Score: 1

I suppose DSPAM can inform the receiver then, that an e-mail was filtered out? Somebody would need to know, if it was one of those filtered out by mistake.
Re:Does DSPAM inform the sender? by hayds · 2004-10-16 23:49 · Score: 1

Yes, there is a web-based CGI that comes with it which you can use to view your statistics. It also has a quarantine where any filtered messages are stored so you can view and retrieve any false positives.

Ask /. by Udo+Schmitz · 2004-10-16 23:05 · Score: 1

Asking slashdot:
Which provider do you think makes the best effort to filter/fight spam and uses the most state-of-the-art techniques for that? The German freemailer GMX I use now is good, but I wonder if others do better.
And I wouldn't mind paying for never receiving spam again. Is Apple .mac email service any good? I have a Mac and sure could make use of some of the other features they offer ...

Re:Ask /. by Lukey+Boy · 2004-10-17 00:25 · Score: 1

For me GMail's spam filter has beaten the rest (so far). YMMV.
Re:Ask /. by Vlad_the_Inhaler · 2004-10-17 22:35 · Score: 1

I had to turn off the GMX Spam filter because it was blocking messages from a mailing list I am on.

I tried marking the messages as 'not spam' based on the sender, but every single message has a different - unique - sender so that failed. To top it all, I could not even remove the 30-odd senders from the list again.

Now it is down to Mozilla's spam blocker again. It has virtually zero false-positives, but misses too many (30%) spam messages.

There are times when I'd love to have a baseball bat and a list of spammers - they are perfectly aware we do not want their garbage but they persist on finding new ways to bypass filters and dump their stuff on us. Anyone who actually wants this stuff will not have filters set anyway. Failing a baseball bat, I am about to try SpamVampire again. This sort of obnoxious antisocial behaviour demands some response.

--
Mielipiteet omiani - Opinions personal, facts suspect.

Platforms... by Anonymous Coward · 2004-10-16 23:06 · Score: 2, Interesting

The DSPAM site mentioned that it can be compiled on Mac OSX, but what about Winblows? I only have one box (go ahead and laugh) and it is an older Pentium III Winblows machine. I'd like to have a separate box to act as a mail server but it just isn't currently feasible (translation: I'm broke.) Is there any way they can compile DSPAM for Win9X?

Re:Platforms... by Anonymous Coward · 2004-10-16 23:55 · Score: 0

According to the release notes, windows support was added in this version.
Re:Platforms... by Anonymous Coward · 2004-10-19 17:12 · Score: 0

Have you tried Robin Keir's K9? Great program, extremely accurate, Bayesian filtering, etc. etc... & free. Been using since Feb/04, & have found a false positive rate (ie - wrongly classified as spam) of 0.06%, & a false negative rate (ie, wrongly classified as good) of 1.84%. This is based on approx 6500 emails, & the system gets smarter each day.

http://keir.net/k9.html

Enjoy!

HOWTO for idiots? by Anonymous Coward · 2004-10-16 23:14 · Score: 1

this is one heck of a product, and I think it would be used more if there were a very verbose install of the current version on various platforms (similar to obsd version on site).

think- spamassassin, clam, spammassassin howto or something similar but it has to be VERY verbose to bring in the crowds (newbies).

my 2c

AC

Spamex by RKBA · 2004-10-16 23:22 · Score: 1

I use Spamex and I never get more than one spam per disposable email address. :-)

--
9/11 Eyewitnesses to Explosive WTC Demolition 1 of 2

We're off to see the wizard... by Tajas · 2004-10-16 23:29 · Score: 0

Wow, yet another anti-spam solution out there. I wonder how this stacks up to the other ones out there, looks good so far. SPAM SPAM it lasts for years, either meat, mail, or canned.

A little Harsh! by andyfaeglasgow · 2004-10-16 23:33 · Score: 2, Insightful

Didn't your mother tell you that if you haven't anything nice to say, then don't say it at all!

MOD PARENT UP!!! by Futurepower(R) · 2004-10-16 23:43 · Score: 1

MOD PARENT UP!!! for a more friendly, sensible Slashdot.

From the website... by Anonymous Coward · 2004-10-17 00:06 · Score: 0

DSPAM's Focus

The DSPAM project attempts to set itself apart from "generic Bayesian filter" by focusing on the following areas:

* DSPAM has a strong drive for research. Many new algorithms and approaches to fighting spam have come out of the DSPAM project. Some of the approaches deployed in DSPAM include Chained Tokens, Neural Networking, Message Inoculation, advanced de-obfuscation techniques, and a new noise reduction algorithm called Bayesian Noise Reduction. We're always looking for new approaches to improving the accuracy of DSPAM.

* A strong focus on large-scale implementation support. The largest implementation of DSPAM we've heard about to-date involves 350,000 users, with the next largest being around 125,000, then 100,000. DSPAM has been designed to run with a very short execution time (between 0.01s - 0.03s real time for classification and between 0.03s - 0.10s real time for training, on average hardware), and has been equipped with a storage driver API allowing several different storage mechanisms to be used. Depending on disk space constraints, accuracy can be traded off for additional disk space or vice-versa.

* Usability. DSPAM was designed with "grandma" in mind. Users need only forward the spam they receive to an email address to train their filter. End-users don't need to know any commandline utilities or other complexities plaguing some other such tools. Functions such as whitelisting and keyword inventory are automatic (based on statistical functions) and therefore require no user intervention.

None of the above by igzat · 2004-10-17 00:06 · Score: 0

Not to change the topic, but I have a different method for curbing spam. I just change email addresses every 6-9 months. Works like a charm. When the ratio of spam versus real email starts shifting, I know it's time for a change. That and just don't post your email address all over the internet. Works for me.

--

Free Desk

Re:None of the above by iBod · 2004-10-17 00:35 · Score: 2, Insightful

Well that may work for you but it doesn't work for businesses. Change your name every 6-9 months? I don't think so.
Re:None of the above by terrab0t · 2004-10-17 03:39 · Score: 1

You're right. We'd like to disable the email address that the spammers have, but that means giving out a new email address to everyone we know. It also means (as he said) that we cannot post our address publicly because the spammers will quickly find it and we'll have to change it even sooner.

The solution to making this work is SpamGourmet. It's an email forwarding service. Basically, you don't give your real email address to anyone. When asked for an address you make one up for that person/organization and give them that instead. My addresses look like this: slashdotme.jmcclare@spamgourmet.com (jmcclare is my SpamGourmet username). Whenever you get an email from somebody you like, you add them to your whitelist. Spammers will only be able to send a specified number of emails to an address before it expires. You can set a default number, or put the number in the address, ie. slashdotme.[3].jmcclare@spamgourmet.com will accept three emails and then expire.

People on your whitelist can send you as many emails as they want. The only thing you have to do when an address (like a publicly posted one) gets taken in by spammers is post a new one. All of your whitelisted people can keep emailing you normally, new people will just have to get a new address from you or wherever you post your info. The old address will expire on its own.
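
The counting scheme described above can be modelled in a few lines of Python. This is only a toy model of the idea, not SpamGourmet's implementation: the address grammar (word, optional count, username), the class name, and the example addresses are all assumptions made for illustration.

```python
import re

# Assumed address shape: word.[count.]user@domain
ADDRESS_RE = re.compile(
    r"^(?P<word>[^.@]+)\.(?:(?P<count>\d+)\.)?(?P<user>[^.@]+)@(?P<domain>.+)$"
)


class DisposableForwarder:
    """Forward a limited number of mails per disposable address."""

    def __init__(self, default_count=3):
        self.default_count = default_count
        self.remaining = {}     # (word, user) -> mails still allowed
        self.whitelist = set()  # senders who are never counted

    def accept(self, sender, recipient):
        """Return True if the mail should be forwarded."""
        if sender in self.whitelist:
            return True
        match = ADDRESS_RE.match(recipient)
        if match is None:
            return False
        key = (match["word"], match["user"])
        if key not in self.remaining:
            # First mail to this address: take the count from the
            # address itself, or fall back to the default.
            count = match["count"]
            self.remaining[key] = int(count) if count else self.default_count
        if self.remaining[key] <= 0:
            return False  # address has expired
        self.remaining[key] -= 1
        return True


forwarder = DisposableForwarder()
address = "slashdotme.2.someuser@example.com"  # hypothetical address
results = [forwarder.accept("spammer@example.net", address) for _ in range(3)]
print(results)

forwarder.whitelist.add("friend@example.org")
print(forwarder.accept("friend@example.org", address))
```

The third mail from the unknown sender is refused because the address allowed only two, while the whitelisted sender still gets through after the address has expired.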

It's a little too complicated for the general populace to understand (although you could train a workplace to use this), but for me it's pretty much a cure for spam. Spammers can't reach me anymore. Anyone posting here should be able to use this, so I urge you to. Spam may still rule the internet around you, and your friends may keep losing your emails to their sloppy filters, but at least you won't get any crap in your own inbox.
Re:None of the above by HermanAB · 2004-10-17 07:39 · Score: 1

Hmmm, a self confessed Fly by Night operator?
:-)

--
Oh well, what the hell...

even spamassassin's heuristics are bayesian scoring by cedspam · 2004-10-17 00:07 · Score: 1

if you look at the spamassassin distribution you'll find a tool to fine-tune rule scores based on spam and non-spam mail. read "a plan for spam": it's tokens, not words, and sa rules can be tokens

Both inferior... by c0p0n · 2004-10-17 00:23 · Score: 1

... to CPAN!!

--

Your head a splode

Just use RBLs by Anonymous Coward · 2004-10-17 00:54 · Score: 0

RBLs deal with 90% of the shit I see (no, not SPEWS and the nasty 'let's damage the world' ones). I use relays.ordb.org, sbl-xbl.spamhaus.org, list.dsbl.org, opm.blitzed.org, dul.dnsbl.sorbs.net, cbl.abuseat.org, dynablock.njabl.org, dnsbl.njabl.org.

Why mess with only DSPAM and stuff, which is after the fact, and massively over-engineered? Sure, use spamassassin or something, kept up to date to clean up the other 10%, but don't rely on it.
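
For readers who haven't looked at how an RBL check works on the wire: the client reverses the octets of the connecting IP, appends the list's zone, and does an ordinary DNS lookup; any answer means "listed". A minimal Python sketch follows, with function names that are made up here. The lookup helper needs network access and says nothing about whether a given zone is still operating.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the zone answers for this IP (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except OSError:
        # No such name (or resolver failure): treat as not listed.
        return False
    return True


# 192.0.2.99 is a documentation-only address used as a placeholder.
print(dnsbl_query_name("192.0.2.99", "sbl-xbl.spamhaus.org"))
```

An MTA does this during the SMTP conversation, which is why the check can reject mail before the message body is ever transferred.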

Re:Just use RBLs by HermanAB · 2004-10-17 07:46 · Score: 2, Interesting

Yup - agreed - the best solution is a combination attack.
Having users sort their mail and train a statistical filter from scratch is just way too much to ask - you'll get inundated with support calls and executives just don't have time to sort out the crud - they hired YOU to do it - passing the buck back to them ain't gonna fly...
The system should get rid of 99.9% of the crud by default, then let the users who feel like doing it report the remaining 0.1% to a central mailbox where you can sort it and retrain the statistical filter if necessary.

--
Oh well, what the hell...

Alternative spam solution: Change the culture. by Futurepower(R) · 2004-10-17 00:58 · Score: 1, Interesting

Here's another spam solution:

If we had a respected national leader who could often talk to millions of people, that person could change the culture. The leader could tell everyone never to buy anything or even respond to unsolicited email advertising.

It might take years, but eventually it would not be economic for spammers to operate, particularly since spam filters would continue to improve.

The only person who could do this in the U.S. now would be Oprah Winfrey. She has an enormous following, and has a reputation for positive thinking (and, unfortunately, sometimes being ignorantly anti-male). She could tell her women viewers, and ask them to tell everyone in their family.

If we had a positively-minded president, he or she would be in an excellent position to change the email culture. A president could change the culture in a few months, possibly. It would simply become socially unacceptable to respond to unsolicited email.

Unfortunately, we don't have such a president. For example, see this article: Unprecedented Corruption: A guide to conflict of interest in the U.S. government.

If the spam culture change worked, the next thing I would like to see is an open source reference browser that set standards for how browsers should work. Unfortunately, Bill Gates is not a positive leader, either. I would like to see Mozilla become the U.S. national government standard. Anyone could continue to use any browser they wanted, but the government's power could be put behind web page rendering standards and browser quality.

--
Government data shows Democrat and Republican spending patterns.

qmail looks okay to me. by Anonymous Coward · 2004-10-17 01:06 · Score: 0

I didn't find anything negative about the pages you referenced.

Informative, yes... by warrax_666 · 2004-10-17 01:06 · Score: 2, Insightful

but somewhat besides the point.

I have to disagree with you on whether it's spam, however. Just making up statistics here, but I'd guesstimate that the sender address of >99.99% (probably even more) of all virus emails is forged and probably points at an innocent third party. That means that the message from the virus scanner is completely and utterly worthless to the recipient (i.e. the "sender" of the virus email). That makes it "junk" or "spam" in my book.

You're right that there isn't much you can do, but I usually check to see if the mailer-daemon/postmaster address in the message looks legit and send off a boilerplate message saying something to the effect of "what you're doing is stupid and counterproductive, please stop".

Hopefully SPF can stop some of this sender spoofing.

--
HAND.

Re:Informative, yes... by hayds · 2004-10-17 01:34 · Score: 1

Just clearing something up. The message that he was receiving wasn't a reply from a virus scanner, it was a bounce. I totally agree with you that virus scanners that reply to addresses that are 'sending' viruses are a total waste of time as the sender addresses are always forged.
In this case though, the receiving server is not replying to tell him that he has sent a virus, it's telling him that he's sent an email to a nonexistent user. Obviously a message like this can be very useful if you have mistyped an address or something.
Whether these messages are "junk" or "spam" depends, I guess, on your definitions and what you expect your spam filter to do. I have no argument that masses of postmaster messages like that are a total pain in the ass, but they are not spam as in the definition of "unsolicited bulk advertising email". Some of them are an annoying byproduct of viruses, but others are there to warn you if there is a problem with delivering your mail, and so I think it's worthwhile pointing out that there are disadvantages to just blindly filtering them out.
Re:Informative, yes... by _Sprocket_ · 2004-10-17 05:34 · Score: 1

That means that the message from the virus scanner is completely and utterly worthless to the recipient (i.e. the "sender" of the virus email). That makes it "junk" or "spam" in my book.

A good point. However, from what I understand, this message is generated by the MTA and not the virus scrubber. So exactly what are you suggesting?

Maybe MTAs shouldn't alert the sender that the address they used doesn't exist (user no longer has an account, mistyped address, etc.)? That works for this situation. But it hurts legitimate users who mistyped an address or are trying to contact someone who's moved to a different provider, etc.

Maybe all MTAs should scan all incoming email before acting on bounces, etc. That assumes that the MTA does have a virus (and perhaps spam) scrubber available. For one reason or another, not everyone does. Secondly, that would also mean spending additional cycles processing email that'll never be delivered anyway - resources spent on an activity that has no benefit to your network or users.
Re:Informative, yes... by irc.goatse.cx+troll · 2004-10-17 05:53 · Score: 1

Simple solution that works for both sides of the issue: Bring back FingerD.

Your client can finger the email address automagically before sending and show a nice warning if it doesn't think it exists, and then the MTA can finger the sender address to make sure it's valid. This way obviously spoofed spam gets dropped. Of course people could still spoof valid addresses, but it would prevent some spam.

--
Pain lasts, kid. Its how you know you're alive. Sometimes I think this growing up thing is just pain management-TheMaxx
Re:Informative, yes... by statusbar · 2004-10-17 07:02 · Score: 1

I get so many 'Bounced Messages' containing spam and viruses every day that I automatically delete all bounces anyways. There is, in effect, no real disadvantage to just filtering them all out because SMTP/POP3/IMAP is unreliable anyways - not by design, but in reality.

You are not guaranteed to get a delivery error message emailed back to you for each and every delivery error anyways, so you may as well not ever expect one.

--jeff++

--
ipv6 is my vpn

Call me bitter, but... by JohnGrahamCumming · 2004-10-17 01:52 · Score: 4, Informative

Why does DSPAM get front page treatment when the latest POPFile release (which now handles POP3, IMAP, SMTP and NNTP filtering, has an XML-RPC external interface, supports different databases, etc. etc.) gets rejected as a story?

Perhaps it's because I don't tend to make super-wild claims about POPFile's accuracy? Or come up with cool marketing names for the internal technology?

POPFile's the only Bayesian filter that can:

1. Do more than spam vs. anti-spam and
2. Filter POP3, IMAP, SMTP and NNTP (that's right Usenet news)

Do I have an axe to grind with Jonathan and DSPAM? No, it's a cool project. Does it annoy me that /. has recently turned into some combination of Freshmeat and PC Magazine? Yes.

John.

Re:Call me bitter, but... by killjoe · 2004-10-17 06:25 · Score: 1

Does POPFile work with Exchange's native format? The bastard Exchange admins (yes, there is more than one!) at my office refuse to turn on POP or IMAP because "it's too dangerous", even though Exchange supports IMAP over SSL.

--
evil is as evil does
Re:Call me bitter, but... by rsax · 2004-10-17 06:32 · Score: 2, Funny

Do I have an axe to grind with Jonathan and DSPAM? No, it's a cool project. Does it annoy me that /. has recently turned into some combination of Freshmeat and PC Magazine? Yes.
Do I like to ask questions aloud and then answer them myself? You bet.
;)
Re:Call me bitter, but... by Anonymous Coward · 2004-10-17 08:26 · Score: 0

Perhaps it's because I don't tend to make super-wild claims about POPFile's accuracy? Or come up with cool marketing names for the internal technology?

So shameless marketing on the front page of your website is better?

One of "the 10 most amazing, most indispensable applications and utilities of 2003"
Maximum PC Magazine
Re:Call me bitter, but... by Matt+Perry · 2004-10-17 09:00 · Score: 1

Does it annoy me that /. has recently turned into some combination of Freshmeat and PC Magazine? Yes.
Then why are you still a subscriber? Notice the asterisk:
JohnGrahamCumming (684871) *
Although I agree with your points maybe the first step would be to vote with your wallet until Slashdot starts to improve. Right now you're just rewarding them for heading in a direction of which you disapprove.

--
Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
Re:Call me bitter, but... by /dev/trash · 2004-10-17 09:16 · Score: 1

Recently? Are you new or something?

By the way I love PopFile.
Re:Call me bitter, but... by Vellmont · 2004-10-17 10:06 · Score: 1

I suspect different editors have different biases. DSpam has received a lot of hype for mostly the reasons you describe. I'd never heard of popfile before your post (though there's a ton of popular open source tools I've never heard of) so my guess is some editors are more willing than others to post stories about lesser-known tools. Re-submit the story in another week and hope you get lucky.

--
AccountKiller

Bayesian Filter Will Stop Working Soon by Anonymous Coward · 2004-10-17 02:10 · Score: 0

A lot of porn email now comes with legitimate text. Some are excerpts from books or words from the dictionary.

This confuses the hell out of Bayesian filters.

How do you fix that?

Re:Bayesian Filter Will Stop Working Soon by DavidTC · 2004-10-17 02:32 · Score: 1

That doesn't confuse the hell out of anything, although spammers apparently think it does. Bayesian filters don't work that way.
The only way spammers could slip under the radar of Bayesian filters is to start sending mail that is completely identical to legit mail you get. Which would be rather pointless, unless you're legitimately getting a lot of ads.

--
If corporations are people, aren't stockholders guilty of slavery?
Re:Bayesian Filter Will Stop Working Soon by paskie · 2004-10-17 05:49 · Score: 1

I.e. dspam just picks up a few tokens from the mail message - the most "interesting" ones, that is, those with the highest counts. So if all emails contain tokens like "V1AGRA", "buy", "price", "$$$" and "p3nis", and each one a few random dictionary words in addition, the V1AGRA-like tokens will get a high count because they will be in each email, while the random dictionary words will get very low counts (probably even compensated by their occurrences in proper emails). So the filter will pick up only the V1AGRA-like tokens when evaluating the email message and the dictionary words are harmless.
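
Here is a rough Python sketch of that selection step, written in the style of Paul Graham's "A Plan for Spam" rather than taken from DSPAM's source. The assumptions are labeled in the comments: "interesting" is measured as distance of a token's spam probability from 0.5, rarely seen tokens get a neutral-ish 0.4, and the corpus counts are invented.

```python
import math


def token_probability(spam_count, ham_count, total_spam, total_ham, min_count=5):
    """Estimate how spammy a token is (0.01 .. 0.99).

    Assumption: tokens seen fewer than min_count times are treated
    as unknown and given 0.4, as in Graham's essay.
    """
    if spam_count + ham_count < min_count:
        return 0.4
    spam_freq = spam_count / total_spam
    ham_freq = ham_count / total_ham
    p = spam_freq / (spam_freq + ham_freq)
    return min(0.99, max(0.01, p))


def most_interesting(probabilities, limit=15):
    """Keep the probabilities that lie furthest from the neutral 0.5."""
    return sorted(probabilities, key=lambda p: abs(p - 0.5), reverse=True)[:limit]


def combine(probabilities):
    """Combine token probabilities into one message score."""
    spam = math.prod(probabilities)
    ham = math.prod(1 - p for p in probabilities)
    return spam / (spam + ham)


# Invented corpus statistics: token -> (spam count, ham count),
# out of 100 spam and 100 innocent messages.
CORPUS = {
    "V1AGRA": (90, 0),
    "buy": (80, 10),
    "price": (70, 20),
    "aardvark": (1, 0),  # random dictionary padding, seen once
    "zephyr": (0, 1),    # random dictionary padding, seen once
}

message = ["V1AGRA", "buy", "price", "aardvark", "zephyr"]
probs = [token_probability(*CORPUS[t], 100, 100) for t in message]
picked = most_interesting(probs, limit=3)
score = combine(picked)
print(picked, score)
```

With these made-up numbers the padding words never make it into the selected set, so the message still scores as near-certain spam.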

--
It's not the fall that kills you. It's the sudden stop at the end. -Douglas Adams
Re:Bayesian Filter Will Stop Working Soon by HermanAB · 2004-10-17 07:53 · Score: 1

No, adding prose to mail just makes the Bayesian filters work better, since normal mail never sounds like that, unless maybe you are a publisher of drivel and receive submissions in your mail...
As for the rest of us, whatever schtuff the spammers add, just makes the spam easier to remove, since it increases the statistical distance between regular mail and spam. Since spammers started to do that, my systems went from 99.6% accuracy to practically 100% accuracy. I get 2000 messages per day and maybe see one or two spams per month - you do the math...

--
Oh well, what the hell...

Here's what i want in a spam filter... by Anonymous Coward · 2004-10-17 02:50 · Score: 1, Interesting

... I want a spam filter that automatically forwards all spam to the abuse@ mailbox for the domain from the spammer.

Once the admins start getting hundreds of thousands of spam complaints in their abuse boxes PER DAY. Then maybe they'll start to think of ways to fix this problem.

Re:Here's what i want in a spam filter... by HermanAB · 2004-10-17 07:57 · Score: 1

No, headers are forged - never send bounces to spam, unless maybe if the spam got a very low score and is in a grey area. Spam Assassin can do that. Otherwise you end up DOSing some little old lady with your bounces...
BTW, the mail admins already get hundreds of thousands of complaints per day, they don't need more.

--
Oh well, what the hell...
Re:Here's what i want in a spam filter... by cant_get_a_good_nick · 2004-10-17 10:45 · Score: 1

Hmm, with forged headers and tons of SPAM coming from zombie windows hosts, this seems to just add to the noise without any real chance of fixing anything.

Before filtering by Phatmanotoo · 2004-10-17 03:23 · Score: 2, Informative

I got nothing against content-filtering measures, as long as one is aware that this should be just the last layer of defense against spam. Think about it, if your SMTP has already swallowed the spammer's email content, you have already lost precious bandwidth.

Especially if you host your own SMTP, you should put up a layered system of defenses: RBL lists, maybe tarpitting, white/graylisting, and then content filtering.

Sigh... by Pig+Hogger · 2004-10-17 03:45 · Score: 0, Flamebait

No matter how many bells and whistles are deployed in an antispam "solution", anything other than pre-emptive blocking of spam-spewing networks is just an automated press-delete system.

Even though you don't SEE the spam, you STILL HAVE TO PAY FOR THE RESOURCES THE SPAMMERS ARE STEALING FROM YOU!!!!

Unless ALL the spammy networks are PUNISHED FOR HARBORING SPAMMERS, spammers will always find connectivity.

We are at war, at war against spammers and their spammy-network accomplices, and until the spammy networks are thoroughly eradicated, there will always be resource-stealing spammers.

Re:Sigh... by HermanAB · 2004-10-17 08:02 · Score: 1

Well, that is what RBLs and header checks are for. They allow your MTA to refuse the connection.
See this extract for Postfix:
smtpd_helo_required = yes
disable_vrfy_command = yes
maps_rbl_domains = relays.ordb.org, bl.spamcop.net
smtpd_recipient_restrictions =
    reject_invalid_hostname,
    reject_non_fqdn_sender,
    reject_unknown_recipient_domain,
    check_recipient_mx_access hash:/etc/postfix/mx_access,
    reject_unauth_pipelining,
    permit_mynetworks,
    reject_unauth_destination,
    reject_maps_rbl,
    permit
# These don't work well - rejects local mail from daemons to root/webmaster
#    reject_non_fqdn_recipient,
# These cause fetchmail to drop the connection
#    reject_unknown_sender_domain,
#    check_sender_mx_access hash:/etc/postfix/mx_access,

--
Oh well, what the hell...
Re:Sigh... by Pig+Hogger · 2004-10-17 11:08 · Score: 1

No matter how many bells and whistles are deployed in an antispam "solution", anything other than pre-emptive blocking of spam-spewing networks is just an automated press-delete system.

Even though you don't SEE the spam, you STILL HAVE TO PAY FOR THE RESOURCES THE SPAMMERS ARE STEALING FROM YOU!!!!

Unless ALL the spammy networks are PUNISHED FOR HARBORING SPAMMERS, spammers will always find connectivity.

We are at war, at war against spammers and their spammy-network accomplices, and until the spammy networks are thoroughly eradicated, there will always be resource-stealing spammers.

Moderated by a spammer shill.

Complaints come first. by khasim · 2004-10-17 05:56 · Score: 2, Interesting

Netblock blacklisting is a really poor solution.

It is the only solution when the ISP will do nothing to stop the spammer on their network.

In some cases a single spammer causes a /24 and then a /16 to be blocked.

That is rather difficult without the ISP's assistance (or them repeatedly ignoring the complaints).

Btw, do you understand that changing ISP may not be an option?

Sometimes that is true. In which case, you should get on the phone and make sure that your ISP understands that they have customers who will be upset if the ISP doesn't handle its spammer problem.

Those lists, by themselves, do not block any email at all. Those lists are used by people who are fed up with trying to get ISP's to deal with their spammers.

Re:Complaints come first. by flynn_nrg · 2004-10-17 06:59 · Score: 1

It is the only solution when the ISP will do nothing to stop the spammer on their network.

I agree with this, but I do think that it's no longer true. Why? In my experience, most of the spam is delivered via compromised Windows machines these days. These are the people who haven't been educated yet and are still victims of the worm du jour.

Maybe the ISPs could block outgoing port 25 (to any server other than the ISP's MX servers) by default and open it for people who explicitly request it. Whatever solution is adopted, somebody will complain about it.

Soon after I started posting to some mailing lists (mainly FreeBSD ones) my spam/ham ratio started to skyrocket. That forced me to install spamassassin. One day, out of curiosity, I started checking where spam was coming from. Surprise surprise, it was no longer from bullet proof hosting sites, but from dsl/cable Windows boxes. As an experiment I redirected connections to my SMTP server that were coming from Windows boxes to OpenBSD's spamd (like I've said in the previous comment it's very easy to do with pf's passive OS fingerprinting) and voila! The number of spam pieces dropped from 100 a day to 3. Most spam is trapped in spamd and never delivered. The SPF check catches 2-3 a day as well. I haven't had any problem with this solution so far (been using it for some months now).

Those lists, by themselves, do not block any email at all. Those lists are used by people who are fed up with trying to get ISP's to deal with their spammers.

Yes, you're right, but they can be a dangerous tool when used by admins who don't fully understand the consequences of doing so. There have also been cases of people who got assigned an IP previously used by a spammer that was in some of those black lists. And getting delisted can be pretty hard sometimes. About ISPs dealing with their spammers, well, I think that has changed, so perhaps we should say ISPs babysitting uneducated Windows users?

Why not use spammers' tactics against themselves? by World_Leader · 2004-10-17 06:49 · Score: 1

What if we all began responding to every spam we
could, go to every website and fill in nonsense, etc.

It seems that very quickly spam would become
useless. They send these out to millions and
millions of accounts, they're generally low
budget operations, they can't afford to sort
out the wheat from the chaff.

There are some types of spam this won't work
for (e.g., stock pump+dump), but maybe it'd
put enough of them out of business that all
of it would go away. But why make the best the
enemy of the good?

-Barry Shein
www.TheWorld.com

XMail by Synn · 2004-10-17 07:02 · Score: 1

Has anyone used DSPAM with xmail?

Wrong. You can filter them by lakeland · 2004-10-17 07:50 · Score: 1

I trained my spam filter on bounces as well as regular messages. It got a little confused at first but soon got the hang of distinguishing real bounces from spam/virus bounces.

The FAQ is wrong by bucketoftruth · 2004-10-17 08:04 · Score: 1

From the DSPAM FAQ: "SpamAssassin's primary detection facility has been designed to use a static set of rules to service all users of the system." That's not true at all. Each of my users maintains their own bayesian db's and custom rules if they choose. It's in $USER/.spamassassin.

Bayesian Noise Reduction not Bayesian by gguppi · 2004-10-17 08:13 · Score: 1

I read through the white paper describing the 'Bayesian Noise Reduction' and I just can not see how it is in any way Bayesian. It is a bunch of heuristics, which sound pretty reasonable and probably work great in practice. But why call it Bayesian? It is great to see that Bayesian techniques such as Naive Bayes Classifiers get applied with great success in the spam setting. But it is somewhat annoying if people use the word 'Bayesian' as just meaning 'sophisticated' or 'awesome'. It does actually have a meaning. http://en.wikipedia.org/wiki/Bayesian_inference
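
For contrast, this is roughly what the Bayesian part of a spam filter looks like: Bayes' rule applied to a single token. It is a minimal Python sketch with made-up probabilities, and it is not claimed to be what DSPAM or its noise reduction actually computes. p_spam_given_token and its parameters are names invented here.

```python
def p_spam_given_token(p_token_given_spam: float,
                       p_token_given_ham: float,
                       p_spam: float = 0.5) -> float:
    """Posterior P(spam | token) from Bayes' rule.

    P(spam | t) = P(t | spam) P(spam) / (P(t | spam) P(spam) + P(t | ham) P(ham))
    """
    p_ham = 1.0 - p_spam
    evidence = p_token_given_spam * p_spam + p_token_given_ham * p_ham
    if evidence == 0:
        # Token never seen in either corpus: fall back to the prior.
        return p_spam
    return p_token_given_spam * p_spam / evidence


if __name__ == "__main__":
    # Illustrative numbers only: token appears in 20% of spam, 1% of ham.
    print(round(p_spam_given_token(0.20, 0.01), 4))
```

The point is that a prior gets updated by a likelihood. A heuristic that decides which tokens to ignore may help such a filter, but it is not itself an application of this rule.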

Re:Bayesian Noise Reduction not Bayesian by Anonymous Coward · 2004-10-17 08:22 · Score: 0

I don't think the author means it's Bayesian, but meant FOR Bayesian filters.

Opting out from a boycott by Anonymous Coward · 2004-10-17 08:21 · Score: 0

True, it may be difficult or even impossible for some users to switch ISP. Does it mean everybody else should be forgiving and tolerate more junk mail from their ISP? That's like telling the ISP up front: Lock your customers in and we will avoid blacklisting you for the spammers you host.

To the inhabitants of certain countries, emigration isn't an option either. Does their lack of choice mean we shouldn't boycott products from such a country due to the misdeeds of a very tiny minority (the people in power) of said country? I'd rather not give specific examples of countries; I'm sure you can think of a few.

When someone begs "please don't attack those who keep me hostage", listening to that plea would be a disservice to everybody.

GPLware by hey · 2004-10-17 08:48 · Score: 1

The paper uses the term "GPLware". I haven't seen that before. I might use it. Of course, we remember "freeware", "shareware", etc.

Central vs individual choice by Anonymous Coward · 2004-10-17 09:04 · Score: 0

Problem is, if you don't want to trust the advice of others in any way, be it in the form of blacklists, filtering software, or even pressure against the spammer's ISP, you are effectively left with fighting your spam flood yourself. You would have to look at each and every message you get, or write a filter all by yourself. It will take you a lot of time, maybe even more time than it takes to hit delete, and you may still accidentally delete something you wanted without noticing it.

The whole idea of the Internet, on the other hand, is based on cooperation with others, to the point that you even trust complete strangers to write software used by you for business-critical activities. For a start, your e-mail correspondence with your customer depends not only on your ISP, but also on your customer's ISP, to get through intact and remain confidential. If you are willing to yield this much power over your business to your customer's ISP, then how is trusting someone else's advice (on what is and what isn't spam) fundamentally any different?

I agree with you however, that trusting someone else's list of "spam keywords" is in general a bad idea, and I avoid doing so myself, but not because I don't trust others. I prefer blacklists of IP addresses and domains instead, not because I maintain all of those lists myself (I don't), but because I can more easily predict the consequences of using them, given the listing criteria and the trustworthiness of the list maintainers. What are the "listing criteria" for a list of spammy keywords?

Why should a business donate money by Billly+Gates · 2004-10-17 13:42 · Score: 1

To prevent someone from doing something illegally while the spammers continue to do whatever they want?

Shouldn't they pay for the costs when they are caught?

--
http://saveie6.com/

OT: It's a Joke!! by DLR · 2004-10-17 16:21 · Score: 1

Ok, someone moderated the parent as a troll. I didn't think I had to say this but, IT'S A JOKE PEOPLE. It's even been modded Funny just in case the clue bus doesn't stop at your terminal. Sheesh!

Yes, the above counts as "humor" too. :) Have a nice day.

--
"Like fire and fusion, government is a dangerous servant and a terrible master."~RAH

No bounces! by lorcha · 2004-10-18 02:23 · Score: 1

Just say no to bounces! Bounces suck when some spammer decides to use your domain as a return address and you get all these stupid bounces. Ditto with all those stupid "your email was rejected 'cuz of a virus" when someone else was impersonating my email address.

What you should do, however, is reject the message in the SMTP session. My mail server issues a 554 during SMTP if you send me a spam or a virus. That way, legitimate senders will still get a notification of the delivery failure (generated by their own MTA, not mine!), and I am not sending misdirected bounces all over the place.

Of course, the 554 says why the email was rejected: "554 mail server permanently rejected message: message contained VIRUS (#5.3.0)" for a virus, similar message for spam. That way the sender knows what's up.
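
In sketch form, the decision looks something like the Python below. It is not my actual server config. The Verdict enum and smtp_reply function are invented for illustration, the spam wording is an assumption modelled on the virus message, and "250 ok" stands in for whatever your MTA says on success.

```python
from enum import Enum


class Verdict(Enum):
    """Hypothetical result of the content scan done during the SMTP session."""
    CLEAN = "clean"
    SPAM = "spam"
    VIRUS = "virus"


def smtp_reply(verdict: Verdict) -> str:
    """Pick the reply to send before the SMTP session ends.

    A 5xx reply makes the *sending* MTA generate the failure notice,
    so no bounce is ever sent to a forged return address.
    """
    if verdict is Verdict.VIRUS:
        return ("554 mail server permanently rejected message: "
                "message contained VIRUS (#5.3.0)")
    if verdict is Verdict.SPAM:
        # Assumed wording; the post only says "similar message for spam".
        return ("554 mail server permanently rejected message: "
                "message contained SPAM (#5.3.0)")
    return "250 ok"


if __name__ == "__main__":
    for v in Verdict:
        print(v.name, "->", smtp_reply(v))
```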

--
"Avoid employing unlucky people - throw half of the pile of CVs in the bin without reading them." -- David Brent

Occam's Razor as spam fighting software--mine. by iamcf13 · 2004-10-19 11:48 · Score: 1

My approach only uses 8 simple rules to score spam--the others use more complicated and computer-intensive methods.

My approach is fast, simple, and effective.

I use it to check my own email where it has filtered out my spam without fail.

The only 'spam' it won't detect currently is 'subject line' spam with email bodies with absolutely no content but I can easily fix that....

Maybe my approach is 'too good to be true' or 'not serious' to merit 'airtime' on Slashdot. You decide.

157 comments