DSPAM v3.6 Released
Nuclear Elephant writes "After six months of development, DSPAM v3.6 has been released. The most notable change is the series of new features added to make an anti-spam gateway appliance possible (Knoppix anyone?). Version 3.6 also includes a highly accurate alternative to Bayesian filtering known as Markovian discrimination, based on Bill Yerazunis' research. Other significant enhancements include trusted sender whitelisting, integrated Clam Antivirus and LDAP support, a centralized spam training alias, and a new dependency-free storage driver. Much of the documentation has also been rewritten to make installation easier. A change log and release notes are also available. Slashdot has recently featured a review of the author's book, Ending Spam and an interview as well."
It would be interesting to compare this version to other spam filters and see how it measures up.
Finally, a decent anti-spam utility. There's been a lot of hype around this product, and it is not out of place. I like the way it's (at least partially) integrated with Clam (ClamWin?). I still feel it won't be long before spammers find ways around this tool... but for now, great, I'm definitely using it.
I know I'm going to get mauled over this question... but has anyone compiled it on Windows 2003 Server?
For practical reasons I don't have Linux in my test lab, and I'd like to have DSPAM on my web server, which is running IIS6 and Windows 2003 Server.
I can see I need to run it in SMTP mode with a relay to my Exchange box, but I don't want to waste my time trying to compile it (using Visual Studio) if someone already knows it won't work.
-Jar.
Together, We Can Make Slashdot Better. I Do NOT Mod ACs. - Check Me Out
Aren't there any trademark problems with DSPAM?
SPAM is a registered trademark of Hormel Foods Corporation, and DSPAM isn't Monty Python.
My city: Barcelona.
That was how earlier versions worked. I don't know of anyone who actually got them to work natively under Windows.
DSPAM is also noted for their trademark spat with Hormel, who tend to be nice about "spam" as a term until it's spelled in all-caps. (Previous Slashdot coverage.)
But the great news is this product is no longer needed. After all, the FBI has put a stop to all of that: http://www.detnews.com/2005/technology/0510/16/B01-349738.htm
(For those that are easily confused, the comment was tongue in cheek)
I'm a long-time proponent of and rare contributor to SpamAssassin, and I'll continue to be, but fighting spam is much like fighting disease: you have to diversify your defenses. DSPAM is a nice package, and is very well designed. I've spoken to the author in the past, and he has an excellent understanding of the complexities of the issue (as opposed to the legions of people who seem to think that spam filtering should be easy, given the right algorithm).
As far as I'm concerned there are two tools for spam filtering: DSPAM and SpamAssassin. Try them both. See what fits your needs. My impression is that SpamAssassin provides more knobs and buttons and is more easily extended by the casual user, but DSPAM can be lighter weight. Both are highly accurate, with very low false positive rates.
I use Gmail. :)
This is one of those things that makes me wonder...which "side" is pushing the technological envelope further and faster, the {spammers | malware slimers | virus breeders} or those who develop to defeat them?
Since it's generally agreed that history is written by the winners of a given conflict, I guess we won't have an answer to that until the war's over.
This comment generously brought to you by a severe lack of caffeine.
All the world's an analog stage, and digital circuits play only bit parts.
How about getting it compiled into Linksys WRT54G router firmware, e.g. the Sveasoft firmware?
What kind of fuckwittery is this? No, plenty of languages can code a simple contact form handler, the platform you run it on is pretty irrelevant, and PHP is by no means "the most important language to learn in the universe". It's a pretty typical scripting language, not the magic you make it out to be.
Bogtha Bogtha Bogtha
The best defense against spam is never to type your personal address anywhere on the internet.
You have to do more than that. You also have to not email anyone, and also not have an easy to guess username.
The problem is, you can avoid ever publishing your email address anywhere, and someone else will gladly do it for you. All it takes is one person you have emailed to come down with an email virus, which then propagates your address all over the net.
Email address synthesis also guarantees that, unless you have the most obscure email address, it will end up getting spam too.
> The best defense against spam is never to type your
> personal address anywhere on the internet.
Hiding your address does not work because some viruses collect addresses from your correspondents' address books. Your address will percolate to spam lists; it is only a matter of time. If, like me, you have kept your address for many years, you absolutely need some form of spam defense.
You still have to communicate with people, and many of them will have Windows boxes which will get rooted at one time or another. It is made worse by people who innocently spam whole lists of people with documents or joke emails. Your address can get spread around that way.
http://michaelsmith.id.au
And make damn sure that your code isn't vulnerable to "e-mail injection" exploits; these will result in spammers using your simple form to spam others AND you getting your hosting revoked.
See, e.g., here: http://www.nyphp.org/phundamentals/email_header_injection.php
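To make the injection point concrete, here is a minimal sketch (in Python rather than PHP) of the usual defence: refuse any form field that is headed for a mail header if it contains a line break, since that is how an extra "Bcc:" line gets smuggled in. The recipient address, function names, and field names are all made up for illustration; this is not taken from the linked article.

```python
import re
from email.message import EmailMessage

# Hypothetical fixed recipient: the visitor never gets to choose where mail goes.
SITE_OWNER = "owner@example.com"

# A carriage return or line feed inside a header value is what lets an
# attacker start a new header line such as "Bcc: ...".
_HEADER_BREAK = re.compile(r"[\r\n]")


def is_safe_header_value(value: str) -> bool:
    """Return True if the value contains no CR or LF characters."""
    return _HEADER_BREAK.search(value) is None


def build_contact_message(sender: str, subject: str, body: str) -> EmailMessage:
    """Build a message to the site owner, rejecting header-injection attempts."""
    for field in (sender, subject):
        if not is_safe_header_value(field):
            raise ValueError("line break in header field; possible injection")
    msg = EmailMessage()
    msg["To"] = SITE_OWNER
    msg["Reply-To"] = sender
    msg["Subject"] = subject
    msg.set_content(body)  # the body may legitimately contain newlines
    return msg
```

A field like "a@example.com\nBcc: victim@example.org" is rejected before it ever reaches a header, while the free-text body is left alone.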
The best defense against spam is never to type your personal address anywhere on the internet.
It's at least ten years too late for that for me, and I'll be damned if I'm going to give up my email address now just because of a few pesky spammers. Besides, the worst of the spam flood seems to be over. A year ago, I was getting hundreds of spam messages a day; now I might get ten, occasionally twenty a day. SpamAssassin + ClamAV identify the vast majority of those.
How well does "Markovian discrimination" work in practice? It sounds fascinating, but what is the false-positive rate that can be expected on average?? :)
Geez, from dealing with spammers to working with the crap DiamondTouch, Yerazunis is a real glutton for punishment.
~jennifer.k~
This isn't "bulletproofly" reliable either. My brothers and I run a small local ISP. Years ago I created an address for my youngest daughter. She never used it, it was never posted anywhere, and it wasn't an easy to guess address since it was a combination of her name and her nickname. However spammers are constantly trying to discover email addresses on our domain, we get about 2,000 invalid recipient attempts every hour of the day. So eventually they discovered her address and she now gets a small amount of spam. (6 to 12 a day) If you want something 100% effective, then cancel all of your email accounts. A more reasonable course of action is to use an excellent solution like DSPAM.
You can also use a "short-term" e-mail like the ones provided at SpamGourmet.com.
Never heard of dictionary attacks on domains have you?
Oh well, what the hell...
> or you can always do stuff like: foo AT gmail DOT com
> or you can always use the html encoding for the characters in the email
These are no protection against a number of more advanced bots, and that number will increase over time.
Also, in many situations, like signing up for stuff online, an encoded email address won't be seen as valid input and will be rejected out of hand.
> or you can always just put the words inside an image.
This might work on your personal website, but is useless in most situations.
> or you can always use a real email for friends, and a spam email for
> everything else.
By far the best method in my opinion, coupled with educating your friends so that they don't fall prey to malware.
Aye. It is pretty obvious the GP is something of a fuckwit. However, for its intended purpose PHP is practically magic. Personally I have always been something of a Perl addict, and then one day I was pondering some web work and decided to dive into PHP by recoding a couple of Perl scripts in PHP. I was simply amazed at how much more simply one can do web CGIs in PHP.
For just about everything else there is still Perl (which is definitely superior to PHP in every NON-web task), and when Perl fails there is C (or C++ for those who believe that it's OK to make programs eat more CPU than they have to simply because CPUs are faster than they used to be).
The OpenBSD port can be downloaded from ftp://ftp.00f.net/misc/port-dspam-3.6.0.tar.gz
Also, spammers steal addressbooks or buy them from unethical employees, others make partnership contracts where you've submitted a contact address and use those contacts to get spam addresses, some spammers use alphabetical or name-guess spam, and any unethical sysadmin with a clue can use the mail logs of his servers to generate a list of valid email addresses from other sites for sale.
I think this is the kind of thing known as, you know, "humour".
... because PHP is the best. (The same joke with Debian and Other Distros is left as an exercise to the reader...)
As you know, comments on PHP vs. Other Scripting Languages are totally useless...
What does it mean, "appended to the end of comments you post"?
Have you been living under a rock for the last ten years? Of course web programming in PHP is easier than CGI! Just about anything is easier than CGI, no matter what language the CGI script is programmed in. If you want a similar (but more powerful) PHP-like environment for Perl, I highly recommend HTML::Mason. Two other interesting mod_perl environments are AxKit (centred around XML and XSLT) and Catalyst (a tight MVC framework). But they both are rougher to develop on, requiring restarts of Apache to load new code. At least Catalyst provides its own mini server for testing/development purposes.
just set up a simple form and use simple php to make it convenient for them to reach you while keeping your email address safely tucked away
All you've done is swap vigilance in maintaining anti-spam on your inbox for vigilance in protecting your contact form against spammers abusing it as a spam gateway. My contact form page gets an attempted hit every couple of days (usually a combination of MIME attachments in the comments field and injecting a BCC field to forward to the recipient) and this is a low volume site. Anyway, your email only has to leak once for it to propagate and it may not necessarily be you that does it. You'll find the spam blocker built into Thunderbird does a good job if you don't want to bother installing SpamAssassin/DSPAM on your mail server (at the expense of extra bandwidth and download times).
Phillip.
Property for sale in Nice, France
> Other than annoying whitelists, there is no anti spam warez that is bulletproofly reliable.
...just set up a simple form and use simple php to make it convenient for them to...
Yeah yo, no bulletproofly reliable warez yo!
>
Make it convenient to root your server, yo! Yeah, yo! Bulletproofly warez, yo!
> Though this is only possibly with PHP...
Yeeeeaaaah, buddy! Warez, yo!
NOT!
Whatever TF this guy is smoking, you lemmings shouldn't mod it +4/Informative. It's a crap post.
Must-not-watch TV!
I thought that whitelisting had been a feature of every email reader/server since spam filtering began.
"A year ago, I was getting hundreds of spam messages a day; now I might get ten, occasionally twenty a day. SpamAssassin + ClamAV identify the vast majority of those."
For me, most spam (unwanted email not intended for me personally) I receive is either bounces or "confirmation" emails from other people's spam filters. Since spammers never send FROM their own address, they usually just pick a random address off their list and send from that (i.e. mine), so the bounces go to me.
These days, I've started clicking the "confirmation" URL on all of those "Please confirm you are a real person" emails just to make those people stop using their broken, idiotic anti-spam systems that just make life worse for the rest of us.
E pluribus unum
Also, if your email is a combination of firstname and/or surname, chances are the spammers will guess it.
Nice troll. PHP has nothing to do with spam, if anything it was your blatant stupidity that got you on a spam list.
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo!
shhh don't tell anyone but when you program in PHP you ARE still programming to the CGI ;) In fact everything you mentioned above still interacts with the client via CGI and html/xhtml just like it has for the last 10 years.
I did this because, in practice, my system never had a message scored 10 or higher that a user considered to be HAM, and indeed, no one has ever called and said, "WTF? My email didn't get through!" Also, in practice, the number of spams and hams that score between 5 and 10 is very low, so users do check their spamboxes. 99% of the messages delivered to the spam folder are flagged as spam. Every so often, a ham slips through, but never has a ham been rejected to my knowledge.
I like this solution because it keeps all obvious spam away from the users, keeps most non-obvious spam away from the users, yet never drops anything to /dev/null.
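Reading the two cutoffs out of the description above (10 and up is refused, 5 up to 10 goes to the spam folder), the routing could be sketched roughly like this. The function name and the returned labels are invented for illustration, and the setup itself is inferred from the post, not copied from any real SpamAssassin or MTA configuration.

```python
# Cutoffs as described in the post; everything else here is hypothetical.
REJECT_AT = 10.0       # refuse the message outright, so the sender gets an error
SPAM_FOLDER_AT = 5.0   # deliver, but into the user's spam folder


def route(score: float) -> str:
    """Decide what to do with a message given its spam score."""
    if score >= REJECT_AT:
        return "reject"
    if score >= SPAM_FOLDER_AT:
        return "spam-folder"
    return "inbox"
```

Nothing is silently discarded in this scheme: a message is either delivered somewhere the user can see it, or the sending side is told it was refused.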
"Avoid employing unlucky people - throw half of the pile of CVs in the bin without reading them." -- David Brent
... but it'll sound like one: I recently converted from a rather involved anti-spam defense utilizing SpamAssassin with Razor, Pyzor, and several RBL checks. I spent a fair amount of time selecting RBLs that worked the best and tweaking SA test scores whenever I got false positive/negative messages. I even had all sorts of validity checks turned on in the MTA to block out badly formed messages and the like.
I replaced all those defenses with: DSPAM. And I'm seeing better results out of the box than I ever did with a multi-layered SA-based solution, even after a lot of time tweaking.
A quick anecdote: When I converted, I opened up a bunch of previously blocked spamtrap addresses, just to get some good training material for the filter. I've long since passed my initial training threshold but haven't even bothered to block the spamtraps again because I never see the spam. At the risk of sounding like I'm bragging, I literally don't have a spam problem anymore, and DSPAM is entirely responsible for that.
Now, I'm not necessarily advocating that you give up all your custom defenses and switch to DSPAM. (I've turned off all my other filters, but I haven't removed them completely.) There's always a chance that an ingenious spammer will find a weakness in DSPAM setups, but I can testify to the fact that DSPAM is "scary good" as of right now. Training the filter is a simple matter of dropping misclassified messages (and there aren't many) into an IMAP folder.
If what you have is working for you, stick with it. But if you're looking for a low-maintenance, high accuracy filter, you should definitely give DSPAM a shot.
I must agree about PHP being un-magical. It's great for one or two specific purposes, but is pretty lacking for anything else. Want a simple web email form? It'd be hard to find an easier way to do it than PHP. But if you want a large web application, it's worth trying other languages. What's magical and amazing is that people have built incredible things with it despite its shortcomings -- projects like Drupal and Mediawiki are sheer wizardry.
I've been keeping a list of problems with PHP, if anyone wants details. I won't say it's not biased, but it's not terribly religious either. It just attempts to list some of the more important issues.
I've found that nearly all of my users actually prefer an interactive system like dspam over a fully-automatic system. Both systems make mistakes, but the interactive system gives the user a feeling of empowerment to fix mistakes and improve their accuracy over time.
It's better for the admin, too... When a non-interactive system makes a mistake, I find that the users complain -- either to the admin or to each other. But with dspam, they reclassify the missed message and continue working, happy to know they're part of the solution. A simple "mark as spam" button eliminated most of my email support requests.
I do get occasional users who still aren't happy... they expect 100% accuracy with 0 effort. But the only way to please those users is to hire them a personal spam secretary. And guess how often that happens?
Absolutely. It is cathartic to punish spam by reporting it to your spam filter. And, of course, fully automatic systems aren't nearly as good as claimed. (Neither are learning filters - 99.9...% accuracy? pshaw! - but they're better than non-learning ones.)
I get an incredible amount of spam bounces in my GMail account -- from somebody sending lots of spam using my GMail address as the From: or the Return-to: address.
I really, really want an option for GMail to record the message-id of all messages I ever send through their server, and bounce any which are returned to me but which they haven't got on record as being sent by me.
I requested this ages ago, and it should be relatively straightforward. Does anyone else have this problem?
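The idea is simple enough to sketch. This is purely hypothetical, it is not how GMail or any real mail server works, and it assumes something else has already pulled the original Message-ID values out of the bounce:

```python
class SentLog:
    """Remember the Message-ID of every outgoing message (hypothetical helper)."""

    def __init__(self) -> None:
        self._sent_ids: set[str] = set()

    def record_sent(self, message_id: str) -> None:
        """Call this for each message that really leaves through the server."""
        self._sent_ids.add(message_id.strip())

    def is_genuine_bounce(self, referenced_ids: list[str]) -> bool:
        """True if the bounce refers to at least one message we actually sent."""
        return any(ref.strip() in self._sent_ids for ref in referenced_ids)


log = SentLog()
log.record_sent("<20051018.1234@mail.example.com>")

# A bounce for something we sent is kept; a bounce for forged mail is not.
print(log.is_genuine_bounce(["<20051018.1234@mail.example.com>"]))  # True
print(log.is_genuine_bounce(["<forged.9999@spammer.example.net>"]))  # False
```

The hard part in practice is that bounce formats vary a lot, so reliably finding the original Message-ID inside a bounce is its own problem.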
"Wise men talk because they have something to say; fools, because they have to say something" - Plato
Why is it not included in Debian?
Spamassassin is.
Bogofilter is.
Popfile is.
I thought it was the license, but seems that DSPAM is GPL.
So, can anyone comment? I'm not installing it on my server if I cannot apt-get it and have Debian security support for it.
1990 called, they want their webserver back.
Why not use Apache + mod_perl/mod_php, like the vast majority of souls in the known universe?
"Wise men talk because they have something to say; fools, because they have to say something" - Plato
Since you may have been serious - CGI stands for "Common Gateway Interface". In other words, CGI defines the "common" "interface" between the browser and the webserver (aka "gateway"). Many early CGI programs were written with perl, and several still are. I've written several CGI programs in C, PHP, perl, and bash - among others (Cold Fusion is something I'd like to forget - what a POS!). Using mod_blah generally just moves the interpreter (or parts of it) into the web server so you save the launch time and can use nifty persistence stuff. Nonetheless, you're still technically using CGI if you at any time submit data to a webserver using GET or POST. Yes, it's still cgi even if it's not handled by a perl script with a .cgi extension.
:)
CGI's really a badly understood, oft misused term - and I've not explained it all that well - but hopefully the general idea's a little more clear.
And if you're running a mail system for 10,000 Real Estate agents..... 4x Barracuda 400 Spam Firewalls.
-- I have a private email server in my basement.
Huh? My understanding of CGI was that it defines the interface between the web server and the program/script. It defines how the URL, headers, and POST variables are passed to it, and how the program/script returns the page and status code. The Apache modules like mod_perl, mod_php, and mod_python put the interpreter into the web server, eschewing the overhead of launching a program (and parsing the perl/python) for each request. Thus the interface is an internal Apache API instead of the CGI. Now, mod_perl allows you to emulate the CGI environment and reuse your Perl CGI scripts with a speed/efficiency increase. But it's not the only (or the best) use of mod_perl.
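For anyone following along, the server-to-program interface described above looks roughly like this from the program's side: the server sets environment variables, feeds any POST body on stdin, and reads headers plus the page back from stdout. A minimal sketch in Python; the "name" parameter and the greeting are made up for illustration.

```python
import os
import sys
from urllib.parse import parse_qs


def handle_request(environ, stdin) -> str:
    """Produce a CGI response from the variables the web server provides."""
    method = environ.get("REQUEST_METHOD", "GET")
    if method == "POST":
        # The server says how many characters of form data are waiting on stdin
        # (assuming plain ASCII form data here).
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # For GET, the part of the URL after "?" arrives in QUERY_STRING.
        raw = environ.get("QUERY_STRING", "")
    params = parse_qs(raw)
    name = params.get("name", ["world"])[0]
    # Headers first, then a blank line, then the body.
    headers = "Status: 200 OK\r\nContent-Type: text/plain\r\n\r\n"
    return headers + f"Hello, {name}\n"


if __name__ == "__main__":
    sys.stdout.write(handle_request(os.environ, sys.stdin))
```

Under plain CGI the server starts a fresh process like this for every request, which is exactly the overhead that mod_perl, mod_php, and mod_python avoid.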
You're right - it's the interface between the app and the server. Doh. :) Though, isn't the mod_* API more of a superset of CGI rather than a replacement? Trying to save a little face here... ;)
I used SpamAssassin quite happily for many years but found its effectiveness started dropping. There are some messages that just can't be caught; usually these are the worst kinds of messages (i.e. a face full of spunk), almost always received by the people most likely to be offended (i.e. 55-year-old female administrative staff).
False positives seem to be more of a problem with mail written in languages other than English. Pretty much all of the Welsh-language e-mail we receive through AOL has been tagged by AOL as SPAM. You might say "AOL losers" etc., but SpamAssassin and MessageLabs also incorrectly tag e-mails, and training these systems doesn't really help, which pretty much ruled those options out. On top of that, if we don't respond to Freedom of Information requests within 20 days we can be fined, which is another good reason not to rely on any SPAM system that can be manipulated by the user: better to not receive than to misfile and forget.
I have measured our greylisting performance: I manually filtered over 8000 messages and found only 4 items (Nigerian / lottery frauds) that were undetected SPAM. That gives us 99.95%, and our users have had to take no action whatsoever to achieve this. Aside from the usually very short (usually less than 5 minutes) initial delay and the very occasional non-delivery (3 instances in 18 months) due to a broken downstream mailserver (easily rectified with a phone number & guaranteed-to-work contact e-mail in the bounce), it's very low maintenance.
Another great feature of greylisting is that it's a highly effective first line of defense against viruses. Prior to enabling greylisting I was getting around 10-20 messages a minute intercepted by our virus scanners; with greylisting the number is more like 8 a DAY, and all of those are thanks to either transparent SMTP proxying from some brain-dead ISPs or messages passed on through forwarding.
SPAM is not really a security risk as such, but the fact that greylisting has such strong anti-virus capabilities should, when balanced against its few potential shortcomings, make it very easy to justify switching on as a good e-mail security measure.
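For anyone who hasn't met greylisting, the general idea can be sketched in a few lines: the first time an unknown (client IP, sender, recipient) combination shows up, the server answers with a temporary failure, and only accepts once the same combination retries after a minimum wait. Real mail servers retry; most spam and virus senders don't. This is a sketch of the technique in general, not the configuration described above, and the 300 second wait and all names are assumptions.

```python
import time


class Greylist:
    """Temporarily refuse mail from sender/recipient triplets not seen before."""

    def __init__(self, min_delay: float = 300.0, clock=time.time) -> None:
        self.min_delay = min_delay   # assumed wait before a retry is accepted
        self.clock = clock           # injectable so the logic can be tested
        self.first_seen: dict[tuple[str, str, str], float] = {}

    def check(self, client_ip: str, sender: str, recipient: str) -> str:
        """Return "accept" or "tempfail" (an SMTP 4xx reply) for this attempt."""
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember when this triplet first appeared; keep the old time on retries.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"
        return "tempfail"
```

A production setup would also expire old entries and whitelist known-good senders; both are left out here to keep the sketch short.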
Oh, and I really get a laugh when people using SpamAssassin helpfully mark their own non-SPAM e-mail as SPAM. That's always a good one, and a sure sign that there is something seriously wrong with the SpamAssassin approach.
Jason.
Or any injections at all. I host a modest number of people's domains (a dozen people). One user had PHPBB. When I told him what trouble his buggy, old version of PHPBB had caused, he swore he'd deleted it - all he'd actually done is removed the links to the board, but the code was still there.
A Romanian phishing gang found it, and tried to send over 2 million phishing emails by uploading a PHP script via the exploit. Fortunately, the way I have the email relay configured (the firewall blocks port 25 egress from the web server, so the system has no choice but to relay it through my relay), it shut down after only a handful of phishing emails went out and I could contain it (and had all the evidence to find out who did it). It's prompted me to make the egress filtering tighter though - I had allowed port 80 outbound because it was convenient, now I've told all the users they have to tell me what addresses they need because the rule is now default deny.
Oolite: Elite-like game. For Mac, Linux and Windows
Don't worry, CGI is NOT just an interface between the webserver and the application. CGI also defines much of the information the browser is required to exchange with the webserver.