DSPAM v3.2 Released
Nuclear Elephant writes "After four months of development DSPAM v3.2 has been released, bringing many new enhancements and filtering technologies. These include distributed computing support, implementation of Bill Yerazunis' Sparse Binary Polynomial Hashing algorithm (from CRM114), and v1.2 of Bayesian Noise Reduction. Other enhancements include SQLite support and many significant performance enhancements for PostgreSQL. DSPAM's official release is next week, but you can download the preview release now. Users of the project have also contributed towards creating a new logo for this release."
Are most people using a Bayesian filter (DSPAM, CRM114, or SpamBayes) along with the rule-based SpamAssassin? Or do you just use the Bayesian filter?
I see that most of these Bayesian filtering programs mention that they can be used with SpamAssassin. Is it usually best to run both for DoublePlusGood(TM) spam catching?
I am using DSPAM on a qmail/vpopmail server and I find that it's great in terms of accuracy. Most of my users have never had a false positive and many haven't seen a spam after a couple of weeks of training.
The problem that I have with DSPAM is the integration side. I'm not sure how it goes with other mail systems, but integrating it with vpopmail was a major pain. It seems easy, you just put the command in the dotfiles, but in practice getting it to work was quite a trial. Even now it doesn't integrate properly with the web administration, etc., despite some scripting and minor code changes.
Because of this I've been thinking of switching to SpamAssassin simply because of its integration with qmail-scanner. Has anyone else had similar problems or been in a similar situation and found a good solution?
...any better than CSPAN?
"Like fire and fusion, government is a dangerous servant and a terrible master."~RAH
Here's what it shows.
ONLY the 3.2 Preview Release 1 is currently out!
I'm sick of spam filters bragging about their overall error rate. All of them do OK at getting rid of the bulk of spam and saving the bulk of the time.
The really important differentiating factor is how many false positives they produce, i.e. how many legitimate messages they mistakenly accuse of being spam.
The consequences of a spam message getting through are minimal: under a second of time, on average, to skip it.
The consequences of a non-spam message getting blocked can be huge: the loss of a customer, or a mom not knowing her kid is in trouble.
I wish the spam filters focused entirely on reporting how few false positives they produce.
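To make the distinction concrete, here is a minimal Python sketch that reports the false positive rate separately from the overall error rate. The function name and all the counts are made up for illustration; they are not measurements of any real filter.

```python
def filter_rates(true_pos, false_pos, true_neg, false_neg):
    """Return (overall_error_rate, false_positive_rate, false_negative_rate).

    true_pos  = spam correctly flagged
    false_pos = legitimate mail wrongly flagged as spam
    true_neg  = legitimate mail correctly delivered
    false_neg = spam that slipped through
    """
    total = true_pos + false_pos + true_neg + false_neg
    overall_error = (false_pos + false_neg) / total
    # Share of legitimate mail that was wrongly blocked.
    fp_rate = false_pos / (false_pos + true_neg)
    # Share of spam that got through.
    fn_rate = false_neg / (false_neg + true_pos)
    return overall_error, fp_rate, fn_rate


# Two hypothetical filters with the same overall error rate.
filter_a = filter_rates(true_pos=9000, false_pos=10, true_neg=990, false_neg=0)
filter_b = filter_rates(true_pos=8990, false_pos=0, true_neg=1000, false_neg=10)
```

Both hypothetical filters make 10 mistakes in 10,000 messages, but filter A blocks 1% of the legitimate mail while filter B blocks none of it. The overall error rate alone cannot tell them apart.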
I've always found that the best filter still is the humble (and the not so humble) human :p
a few months ago those features were available, too. while dspam is great at filtering mail, I faced two crucial problems, which forced me back to spamassassin. I haven't heard that they fixed either of them:
- the database grew huge. when my single-user server with 128 MB had to use a 512 MB spam token database, performance was terrible. even with the tools included I could not do anything to fix the issue.
- dspam knows only yes or no; there is no usable value that gives you some grey information. as a result, I had to check all those spam postings for false positives. Spamassassin, on the other hand, has that spam result 0..10, so I can check 0..4, where 0 is ok (few false negatives) and 1..4 is spam (few false positives), and I can directly delete thousands of mails in 5..10 without looking at them.
I won't go back to dspam unless someone can offer specific help for those issues. I believe everyone will face them sooner or later.
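The score-band workflow described above could be sketched roughly like this in Python. The function name, the action labels and the exact thresholds are assumptions chosen to match the bands in the comment (0 is fine, 1..4 gets a human look, 5..10 is deleted unseen); how the numeric score is obtained from the filter is left out.

```python
def triage(score, review_from=1.0, delete_from=5.0):
    """Map a numeric spam score to one of three actions.

    Thresholds are hypothetical defaults mirroring the bands in the comment.
    """
    if score >= delete_from:
        return "delete"   # confident spam: discard without looking
    if score >= review_from:
        return "review"   # grey zone: check by hand for false positives
    return "deliver"      # looks clean


def triage_mailbox(scores):
    """Count how many messages fall into each band."""
    counts = {"deliver": 0, "review": 0, "delete": 0}
    for score in scores:
        counts[triage(score)] += 1
    return counts
```

The point of the grey band is that only the "review" pile needs human attention, which a plain yes/no verdict cannot offer.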
Does DSPAM inform the sender that his/her e-mail has been filtered out?
Asking slashdot: .mac email service any good? I have a Mac and sure could make use of some of the other features they offer ...
Which provider do you think makes the best effort to filter/fight spam and uses the most state-of-the-art techniques for that? The German freemailer GMX I use now is good, but I wonder if others do better.
And I wouldn't mind paying for never receiving spam again. Is Apple
The DSPAM site mentioned that it can be compiled on Mac OS X, but what about Winblows? I only have one box (go ahead and laugh) and it is an older Pentium III Winblows machine. I'd like to have a separate box to act as a mail server but it just isn't currently feasible (translation: I'm broke). Is there any way they can compile DSPAM for Win9X?
this is one heck of a product, and I think it would be used more if there were a very verbose install of the current version on various platforms (similar to obsd version on site).
think: spamassassin, clam, spamassassin howto or something similar, but it has to be VERY verbose to bring in the crowds (newbies).
my 2c
AC
Well you should still know how the weather is in India. You should be watching the cricket ;)
9/11 Eyewitnesses to Explosive WTC Demolition 1 of 2
Didn't your mother tell you that if you haven't anything nice to say, then don't say it at all!
MOD PARENT UP!!! for a more friendly, sensible Slashdot.
if you look in the spamassassin distribution you'll find a tool to fine-tune rule scores based on spam and non-spam mail. read "a plan for spam": it's tokens, not words, and sa rules can be tokens
... to CPAN!!
Your head a splode
Well that may work for you but it doesn't work for businesses. Change your name every 6-9 months? I don't think so.
Here's another spam solution:
If we had a respected national leader who could often talk to millions of people, that person could change the culture. The leader could tell everyone never to buy anything or even respond to unsolicited email advertising.
It might take years, but eventually it would not be economic for spammers to operate, particularly since spam filters would continue to improve.
The only person who could do this in the U.S. now would be Oprah Winfrey. She has an enormous following, and has a reputation for positive thinking (and, unfortunately, sometimes being ignorantly anti-male). She could tell her women viewers, and ask them to tell everyone in their family.
If we had a positively-minded president, he or she would be in an excellent position to change the email culture. A president could change the culture in a few months, possibly. It would simply become socially unacceptable to respond to unsolicited email.
Unfortunately, we don't have such a president. For example, see this article: Unprecedented Corruption: A guide to conflict of interest in the U.S. government.
If the spam culture change worked, the next thing I would like to see is an open source reference browser that set standards for how browsers should work. Unfortunately, Bill Gates is not a positive leader, either. I would like to see Mozilla become the U.S. national government standard. Anyone could continue to use any browser they wanted, but the government's power could be put behind web page rendering standards and browser quality.
--
Government data shows Democrat and Republican spending patterns.
but somewhat besides the point.
I have to disagree with you on whether it's spam, however. Just making up statistics here, but I'd guesstimate that the sender address of >99.99% (probably even more) of all virus emails is forged and probably points at an innocent third party. That means that the message from the virus scanner is completely and utterly worthless to the recipient (i.e. the "sender" of the virus email). That makes it "junk" or "spam" in my book.
You're right that there isn't much you can do, but I usually check to see if the mailer-daemon/postmaster address in the message looks legit and send off a boilerplate message saying something to the effect of "what you're doing is stupid and counterproductive, please stop".
Hopefully SPF can stop some of this sender spoofing.
HAND.
Why does DSPAM get front page treatment when the latest POPFile release (which now handles POP3, IMAP, SMTP and NNTP filtering, has an XML-RPC external interface, supports different databases, etc.) gets rejected as a story?
Perhaps it's because I don't tend to make super-wild claims about POPFile's accuracy? Or come up with cool marketing names for the internal technology?
POPFile's the only Bayesian filter that can:
1. Do more than spam vs. anti-spam and
2. Filter POP3, IMAP, SMTP and NNTP (that's right, Usenet news)
Do I have an axe to grind with Jonathan and DSPAM? No, it's a cool project. Does it annoy me that /. has recently turned into some combination of Freshmeat and PC Magazine? Yes.
John.
The only way spammers could slip under the radar of Bayesian filters is to start sending mail that is completely identical to legit mail you get. Which would be rather pointless, unless you're legitimately getting a lot of ads.
If corporations are people, aren't stockholders guilty of slavery?
... I want a spam filter that automatically forwards all spam to the abuse@ mailbox for the domain from the spammer.
Once the admins start getting hundreds of thousands of spam complaints in their abuse boxes PER DAY. Then maybe they'll start to think of ways to fix this problem.
I've got nothing against content-filtering measures, as long as one is aware that this should be just the last layer of defense against spam. Think about it: if your SMTP server has already swallowed the spammer's email content, you have already lost precious bandwidth.
Especially if you host your own SMTP, you should put up a layered system of defenses: RBL lists, maybe tarpitting, white/graylisting, and then content filtering.
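A rough Python sketch of that layering, cheapest check first and content filtering last, might look like the following. Everything here is hypothetical: the `Connection` type, the verdict strings, and the use of plain sets in place of real DNS blocklist lookups. Tarpitting is left out, and real greylisting also tracks a retry delay, which this toy version ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    client_ip: str
    sender: str
    recipient: str


def layered_verdict(conn, body, blocklist, whitelist, seen_triplets, content_filter):
    """Run the cheap connection-level checks before the expensive content check."""
    # Layer 1: blocklist lookup (a real setup would query an RBL over DNS).
    if conn.client_ip in blocklist:
        return "reject: listed"
    # Layer 2: known-good senders skip the remaining layers.
    if conn.sender in whitelist:
        return "accept"
    # Layer 3: greylisting. Temp-fail the first attempt from an unseen
    # (ip, sender, recipient) triplet; legitimate MTAs retry later.
    triplet = (conn.client_ip, conn.sender, conn.recipient)
    if triplet not in seen_triplets:
        seen_triplets.add(triplet)
        return "tempfail: greylisted"
    # Layer 4: only now spend effort on the message body.
    if content_filter(body):
        return "reject: content"
    return "accept"
```

The first three layers need nothing but the envelope, so a message stopped there never has its body examined by the content filter at all.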
You're right. We'd like to disable the email address that the spammers have, but that means giving out a new email address to everyone we know. It also means (as he said) that we cannot post our address publicly because the spammers will quickly find it and we'll have to change it even sooner.
The solution to making this work is SpamGourmet. It's an email forwarding service. Basically, you don't give your real email address to anyone. When asked for an address you make one up for that person/organization and give them that instead. My addresses look like this: slashdotme.jmcclare@spamgourmet.com (jmcclare is my SpamGourmet username). Whenever you get an email from somebody you like, you add them to your whitelist. Spammers will only be able to send a specified number of emails to an address before it expires. You can set a default number, or put the number in the address, i.e. slashdotme.[3].jmcclare@spamgourmet.com will accept three emails and then expire.
People on your whitelist can send you as many emails as they want. The only thing you have to do when an address (like a publicly posted one) gets taken in by spammers is post a new one. All of your whitelisted people can keep emailing you normally; new people will just have to get a new address from you or wherever you post your info. The old address will expire on its own.
It's a little too complicated for the general populace to understand (although you could train a workplace to use this), but for me it's pretty much a cure for spam. Spammers can't reach me anymore. Anyone posting here should be able to use this, so I urge you to. Spam may still rule the internet around you, and your friends may keep losing your emails to their sloppy filters, but at least you won't get any crap in your own inbox.
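For anyone curious how such a scheme can work, here is a toy Python model of the idea. It is not SpamGourmet's code; the class name, the regular expression and the assumption that the count may appear with or without brackets are all mine, based only on the address shapes quoted above.

```python
import re

# word.user@domain or word.N.user@domain (brackets around N tolerated).
ADDRESS_RE = re.compile(
    r"^(?P<word>[^.@]+)\.(?:\[?(?P<count>\d+)\]?\.)?(?P<user>[^.@]+)@(?P<domain>.+)$"
)


class DisposableForwarder:
    """Forward mail to a disposable address until its message budget runs out."""

    def __init__(self, default_limit=3):
        self.default_limit = default_limit
        self.whitelist = set()   # senders who are always forwarded
        self.remaining = {}      # disposable address -> messages still allowed

    def accept(self, sender, recipient):
        # Whitelisted senders never consume the address's budget.
        if sender in self.whitelist:
            return True
        match = ADDRESS_RE.match(recipient)
        if match is None:
            return False
        key = recipient.lower()
        if key not in self.remaining:
            # First sighting: the address is created on the fly.
            count = match.group("count")
            self.remaining[key] = int(count) if count else self.default_limit
        if self.remaining[key] <= 0:
            return False         # address has expired
        self.remaining[key] -= 1
        return True
```

With a limit of three, the first three messages from strangers are forwarded and the fourth is dropped, while a whitelisted friend keeps getting through on the same expired address.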
I.e. dspam just picks up a few tokens from the mail message: the most "interesting" ones, that is, those with the highest counts. So if all the spam emails contain tokens like "V1AGRA", "buy", "price", "$$$" and "p3nis", and each one has a few random dictionary words in addition, the V1AGRA-like tokens will get a high count because they will be in each email, while the random dictionary words will get very low counts (probably even compensated by their occurrences in proper emails). So the filter will pick up only the V1AGRA-like tokens when evaluating the email message, and the dictionary words are harmless.
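Here is a small Python sketch of that token-selection idea, written in the style of Paul Graham's "A Plan for Spam" rather than taken from DSPAM's source. As an assumption on my part, "interesting" is measured as how far a token's spam probability sits from the neutral 0.5, and the clamp values, the 0.4 default for unseen tokens and the function names are arbitrary illustration choices.

```python
import math


def token_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate P(spam | token) from per-corpus message counts."""
    spam_freq = spam_counts.get(token, 0) / max(n_spam, 1)
    ham_freq = ham_counts.get(token, 0) / max(n_ham, 1)
    if spam_freq + ham_freq == 0:
        return 0.4  # unseen token: treated as nearly neutral (assumed value)
    prob = spam_freq / (spam_freq + ham_freq)
    return min(0.99, max(0.01, prob))  # never let one token be fully decisive


def spam_score(tokens, spam_counts, ham_counts, n_spam, n_ham, keep=15):
    """Combine only the `keep` most interesting tokens into one probability."""
    probs = [
        token_probability(t, spam_counts, ham_counts, n_spam, n_ham)
        for t in set(tokens)
    ]
    # Most interesting = farthest from 0.5 in either direction.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:keep]
    spam_like = math.prod(top)
    ham_like = math.prod(1 - p for p in top)
    return spam_like / (spam_like + ham_like)
```

Random dictionary words end up near 0.5, so they rarely make the cut; the strongly spammy tokens dominate the score, which is the effect described above.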
It's not the fall that kills you. It's the sudden stop at the end. -Douglas Adams
Netblock blacklisting is a really poor solution.
It is the only solution when the ISP will do nothing to stop the spammer on their network.
In some cases a single spammer causes a /24 and then a /16 to be blocked.
That is rather difficult without the ISP's assistance (or them repeatedly ignoring the complaints).
Btw, do you understand that changing ISP may not be an option?
Sometimes that is true. In which case, you should get on the phone and make sure that your ISP understands that they have customers who will be upset if the ISP doesn't handle its spammer problem.
Those lists, by themselves, do not block any email at all. Those lists are used by people who are fed up with trying to get ISPs to deal with their spammers.
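To put numbers on the /24 and /16 escalation discussed in this exchange, here is a short Python sketch using the standard `ipaddress` module. The helper name is made up, and the addresses come from ranges reserved for documentation.

```python
import ipaddress


def collateral(spammer_ip, prefix_len):
    """Return the netblock containing spammer_ip and how many addresses it covers."""
    # strict=False lets us pass a host address and get its enclosing network.
    net = ipaddress.ip_network(f"{spammer_ip}/{prefix_len}", strict=False)
    return net, net.num_addresses


block24, size24 = collateral("198.51.100.77", 24)   # 256 addresses
block16, size16 = collateral("198.51.100.77", 16)   # 65,536 addresses

# An unrelated customer of the same ISP, caught only by the wider block.
neighbour = ipaddress.ip_address("198.51.7.9")
caught_by_24 = neighbour in block24   # False
caught_by_16 = neighbour in block16   # True
```

Going from a /24 to a /16 multiplies the blocked range by 256, which is why the argument over collateral damage gets heated.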
What if we all began responding to every spam we
could, go to every website and fill in nonsense, etc.
It seems that very quickly spam would become
useless. They send these out to millions and
millions of accounts, they're generally low
budget operations, they can't afford to sort
out the wheat from the chaff.
There are some types of spam this won't work
for (e.g., stock pump+dump), but maybe it'd
put enough of them out of business that all
of it would go away. But why make the best the
enemy of the good?
Has anyone used DSPAM with xmail?
Hmmm, a self confessed Fly by Night operator?
:-)
Oh well, what the hell...
Having users sort their mail and train a statistical filter from scratch is just way too much to ask - you'll get inundated with support calls and executives just don't have time to sort out the crud - they hired YOU to do it - passing the buck back to them ain't gonna fly...
The system should get rid of 99.9% of the crud by default, then let the users who feel like doing it report the remaining 0.1% to a central mailbox where you can sort it and retrain the statistical filter if necessary.
I trained my spam filter on bounces as well as regular messages. It got a little confused at first but soon got the hang of distinguishing real bounces from spam/virus bounces.
As for the rest of us, whatever schtuff the spammers add, just makes the spam easier to remove, since it increases the statistical distance between regular mail and spam. Since spammers started to do that, my systems went from 99.6% accuracy to practically 100% accuracy. I get 2000 messages per day and maybe see one or two spams per month - you do the math...
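Doing the math from those figures, as a quick Python sketch. It assumes a 30-day month and that every one of the 2000 daily messages counts toward the accuracy figure; both are my assumptions, not something stated above.

```python
def monthly_misses(messages_per_day, accuracy, days=30):
    """Expected number of misclassified messages per month at a given accuracy."""
    return messages_per_day * days * (1 - accuracy)


def accuracy_from_misses(messages_per_day, misses_per_month, days=30):
    """Accuracy implied by an observed number of misses per month."""
    return 1 - misses_per_month / (messages_per_day * days)


before = monthly_misses(2000, 0.996)     # roughly 240 mistakes a month
after = accuracy_from_misses(2000, 2)    # roughly 0.99997
```

So under these assumptions, 99.6% accuracy at this volume means about 240 mistakes a month, while two stray spams a month works out to roughly 99.997%.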
See this extract for Postfix:
From the DSPAM FAQ: "SpamAssassin's primary detection facility has been designed to use a static set of rules to service all users of the system." That's not true at all. Each of my users maintains their own Bayesian database and custom rules if they choose. It's in $USER/.spamassassin.
I read through the white paper describing the 'Bayesian Noise Reduction' and I just can not see how it is in any way Bayesian. It is a bunch of heuristics, which sound pretty reasonable and probably work great in practice. But why call it Bayesian? It is great to see that Bayesian techniques such as Naive Bayes Classifiers get applied with great success in the spam setting. But it is somewhat annoying if people use the word 'Bayesian' as just meaning 'sophisticated' or 'awesome'. It does actually have a meaning. http://en.wikipedia.org/wiki/Bayesian_inference
The paper uses the term "GPLware". I haven't seen that before. I might use it. Of course, we remember "freeware", "shareware", etc.
Moderated by a spammer shill.
To prevent someone from doing something illegally while the spammers continue to do whatever they want?
Shouldn't they pay for the costs when they are caught?
http://saveie6.com/
Yes, the above counts as "humor" too. :) Have a nice day.
What you should do, however, is reject the message in the SMTP session. My mail server issues a 554 during SMTP if you send me a spam or a virus. That way, legitimate senders will still get a notification of the delivery failure (generated by their own MTA, not mine!), and I am not sending misdirected bounces all over the place.
Of course, the 554 says why the email was rejected: "554 mail server permanently rejected message: message contained VIRUS (#5.3.0)" for a virus, similar message for spam. That way the sender knows what's up.
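The decision itself is tiny. A hedged Python sketch, where the verdict labels and function names are hypothetical and the wording of the spam reply is my guess at the "similar message" mentioned above; only the virus text is quoted from this comment:

```python
def smtp_reply(verdict):
    """Pick the SMTP reply to give at the end of DATA.

    `verdict` is a hypothetical scanner result: "clean", "virus" or "spam".
    """
    if verdict == "virus":
        return 554, "mail server permanently rejected message: message contained VIRUS (#5.3.0)"
    if verdict == "spam":
        # Assumed wording, modelled on the virus reply.
        return 554, "mail server permanently rejected message: message contained SPAM (#5.3.0)"
    return 250, "ok"


def format_reply(verdict):
    """Render the reply as the single line the connecting MTA would see."""
    code, text = smtp_reply(verdict)
    return f"{code} {text}"
```

Because a 5xx reply is a permanent failure issued while the connection is still open, any bounce is generated by the connecting MTA. A forged sender address therefore never gets a message from the receiving server.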
"Avoid employing unlucky people - throw half of the pile of CVs in the bin without reading them." -- David Brent
My approach only uses 8 simple rules to score spam--the others use more complicated and computer-intensive methods.
My approach is fast, simple, and effective.
I use it to check my own email where it has filtered out my spam without fail.
The only 'spam' it won't detect currently is 'subject line' spam, where the email body has absolutely no content, but I can easily fix that....
Maybe my approach is 'too good to be true' or 'not serious' to merit 'airtime' on Slashdot. You decide.