spamassassin.org · Domains · Slashdot Mirror

SPEWS - highest collateral damage of all lists? by some1somewhere · 2004-01-20 22:35 · Score: 2, Interesting · on SPEWS Adds DSL Reports to Block List

It seems SPEWS is one of the most hated block lists, not by spammers, but by regular folk that end up on their list. In fact, some speculate that some of the DOS attacks against SPEWS aren't actually done by spammers, but by enough innocent people pissed off by them and their attitude. Seems like SPEWS loves collateral damage against innocent people, doesn't update often (even though it claims to "automatically" remove old listings, a lie), hides behind a newsgroup and pretends to not exist so there is no responsibility, and other practices that go against the running of a good anti-spam list (eg. Spamcop, Visi, etc.). And if you're listed, you have to go beg in a newsgroup to get out, just to be told to switch ISPs by people who think they are holier than thou.

http://www.ifn.net/classic/rblstory.htm covers SPEWS in detail (i don't agree with all of it, but it is pretty spot on).

but you are sure to find lots more on http://www.google.com/search?q=spam+hate+spews.

Notice how it seems to be mostly innocent people complaining about SPEWS and the way it operates?

I hate spam just like the next guy, so I would recommend the wonderful Spamassassin and use it with Spamcop.

Re:Never likely to work by tubabeat · 2004-01-19 02:03 · Score: 1 · on Copyrighted Haiku Delivers Spam Through Filters

The latest SpamAssassin is 2.62, released in the last few days here. It gives the Habeas blacklist a score of 16 (which, with the default -8 for the headers, gives 8 in reality), so assuming Habeas can keep the blacklist up to date (doubtful, I'd have thought) it should be effective.

Re:Never likely to work by skojt · 2004-01-18 22:06 · Score: 1 · on Copyrighted Haiku Delivers Spam Through Filters

Version 2.62 of SpamAssassin was released yesterday (Sunday 18th). You can find the details here.

One of the fixes relates to Habeas:
- Modified HABEAS_SWE to function even if the Habeas headers were out of their normal order.

Re:No, you got it all wrong... by fjin · 2004-01-11 01:27 · Score: 5, Informative · on Spammers Not Complying With CAN-SPAM

You haven't heard before about:

Spamassassin
SpamAssassin(tm) is a mail filter to identify spam.
Using its rule base, it uses a wide range of heuristic tests on mail headers and body text to identify "spam", also known as unsolicited commercial email.

and Razor
What is Vipul's Razor?
Vipul's Razor is a distributed, collaborative, spam detection and filtering network. Through user contribution, Razor establishes a distributed and constantly updating catalogue of spam in propagation that is consulted by email clients to filter out known spam. Detection is done with statistical and randomized signatures that efficiently spot mutating spam content. User input is validated through reputation assignments based on consensus on report and revoke assertions which in turn is used for computing confidence values associated with individual signatures.

Spamassassin will support it in 2.70 by KjetilK · 2004-01-08 22:05 · Score: 3, Informative · on AOL Now Publishing SPF Records

Hm, I must have been living under a rock, because it is the first time I hear about it. However, it sounds like a good idea, I have to contact my upstream ISP to have them add a record for me.

Anyway, it seems SpamAssassin will be adding support for SPF in 2.70, at least according to bug 2143. That's cool!

Sounds like a newbie without a clue by Baloo+Ursidae · 2003-12-28 12:51 · Score: 1 · on Knock, Knock: Information Pollution Is Here

The Web is a junkyard.

To which he is contributing to the problem he's bitching about with this.

Do you want to keep track of your eBay auctions? Instead of five e-mails per auction, all scattered throughout your inbox, you would have a single flag in the control panel. Discussion groups? The control panel would show when hot topics of interest to you are being discussed and would call attention to discussions with contributions by writers you particularly respect.

It's almost 2004, and this guy still doesn't know about Procmail and what a kill file is?

E-mail? Restricted to truly personal communication. Newsletters, intranet status reports, and other nonletter communications would be summarized and available for perusal on request.

Isn't that why Procmail and SpamAssassin exist?

IM would have a small role, but your personal agent would be very strict at screening incoming requests.

Unless you're a complete moron completely lacking self-control, odds are you set yourself do-not-disturb when you're managing a lot of state.

Re:A note on Brightmail by balamw · 2003-12-20 20:16 · Score: 2, Insightful · on Brightmail Denies "White List" Deal With Spammer

Brightmail? Awesome? Not for me it ain't, at least not right now. My ISP (AT&T Worldnet) uses it and it is letting through sooo much obvious spam recently that I'm beginning to think the spammers must have figured out a way around Brightmail's rules.

FWIW, both Yahoo! and the new Hotmail filters are performing much better than Brightmail for me now.

Regardless, I download all my mail through a SpamAssassin POP3 proxy, which just plain knocks em dead.

Balam

Re:BBC Article by Noryungi · 2003-12-09 04:26 · Score: 1 · on Remail: IBM is Reinventing Email

Where 31 billion e-mails are sent every day, you think that systems might need to be updated to handle such volume

Since 29 billion out of these 31 billion emails are spam, feel free to install and operate SpamAssassin.

This program, in my experience, greatly reduces the level of email I receive each day...

(Yes, this is said in a very cheeky way!)

Re:Reinventing EMail CLIENT by mhifoe · 2003-12-09 04:18 · Score: 1 · on Remail: IBM is Reinventing Email

I receive over 100 spams a day.
A combination of Spamassassin and POPfile means that only 1 spam a day gets through.

Not Really by tookish · 2003-11-25 02:27 · Score: 4, Insightful · on Critical Eye on SpamAssassin

So his complaints are:

SpamAssassin is hard to install
it isn't very effective
nothing is filtered until you manually set up your own filters
it's hard to configure and poorly documented
non-commercial blacklists come with no guarantees
end users can't add to the whitelist
Bayesian filtering isn't included by default, and he couldn't make it work anyway
it doesn't catch words like Viagra and invisible HTML characters

I knew nothing about filtering spam until I installed SpamAssassin 2.6 in a multi-user environment last week. Here are my responses:

it took less than half an hour to install (from CPAN) and start
effectiveness out of the box was about 95%, with no false positives -- after a few minor tweaks, I'm at about 98% with no false positives
simply not true -- it runs right out of the box
maybe it's hard to configure if you're used to a GUI -- if you're not afraid of editing a text file, it's very easy to set up; and there's no shortage of documentation at spamassassin.org and elsewhere
do commercial blacklists come with guarantees? I don't know
with a very little bit of scripting, you could allow users to add to the whitelist
I haven't tried the Bayesian filtering because it's apparently not well suited to a multi-user environment
simply not true -- it flags this stuff out of the box

I wouldn't recommend that my grandmother install SpamAssassin, but if you have any admin skills whatsoever, it's quite easy to use it to set up effective and useful filters. Furthermore, there are enough factual errors in the article that I'm tempted to dismiss it outright.

Of course, it's possible that it got a lot better between 2.44 and 2.6, but that begs the question, why did he install 2.44?

They are losing by Kphrak · 2003-11-03 07:42 · Score: 1 · on Spammer DDoS-By-Virus On spamhaus.org

based on the number of spams that are getting through. It has jumped up again (doubled) in the last 1-2 months.

On which ISP? On one using proper blacklists, some good regexp rules (SpamAssassin) and some site-wide applications of the engine (MailScanner), spam is minimized. You'll get some false negatives, but it's a trickle, not a torrent.

Ever since installing the above at work (it's a .gov whose entire address list has been passed around the Internet like a trading card), spam has decreased to around 3-5 false negatives a day. Life is good.

And BTW, to the people who are moaning about the computing power needed to run SpamAssassin and MailScanner (MailScanner, especially, is a hog, no denying it) -- perhaps you need to think about replacing that 386 running RedHat 6.0 in your parents' basement. It's probably been 0wN3d a couple dozen times anyway.

Re:E-mail tax by The+Fanta+Menace · 2003-11-01 22:26 · Score: 4, Insightful · on Time-travel Spammer Strikes Back

Not going to work. I don't use my ISP to send mail, at least not in a way they can detect. I use my own server, instead.

Are you going to tax me to send email between the users on my machine? If so, how are you going to monitor the logs? Are you going to give government authorities permission to audit my machine whenever they see fit to? Looking kind of authoritarian, now, isn't it?

How about cron jobs sending me email? Do I get taxed for them, too?

Instant messaging? Tax for that? What about when people get fed up with your email tax and implement an email system over an IM service instead? Or just implement some other form of email over any other protocol to bypass your tax system?

Filters are an effective way of combatting spam. Much better - and less oppressive - than a tax. SpamAssassin catches 99% of the spam I receive. It, and other filters, are so effective that spammers are now changing the content of their text to attempt to bypass it. And when they do this, it reduces the effectiveness of their advertising, so in the end, they lose.

spamassassin.org by mirko · 2003-10-23 02:47 · Score: 5, Interesting · on Study on the Effects of Spam on End Users

My provider just installed it.
Now, the spam comes with a modified subject (beginning with *****SPAM*****) and a report such as :
SPAM: . : . . : . : . . Start SpamAssassin results . : . . : . : . .
SPAM: This mail is probably spam. The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM:
SPAM: Content analysis details: (6.4 hits, 3 required)
SPAM: Hit! (2.7 points) Subject contains lots of white space
SPAM: Hit! (3.7 points) BODY: Information on getting a larger penis
SPAM:
SPAM: . : . . : . : . . End of SpamAssassin results . : . . : . : . .
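To make the arithmetic in that report explicit: each rule that fires adds its points, and the total is compared with the "required" figure. The sketch below only reproduces the numbers quoted in the report. The function names are made up for illustration, and treating "total at or above the required value" as spam is an assumption about how the comparison is done.

```python
# Rule hits copied from the quoted report; the structure is illustrative only.
hits = [
    ("Subject contains lots of white space", 2.7),
    ("BODY: Information on getting a larger penis", 3.7),
]
required = 3.0  # the "3 required" figure from the report


def total_score(hits):
    """Sum the points of every rule that fired, rounded as the report shows them."""
    return round(sum(points for _, points in hits), 1)


def is_spam(hits, required):
    """Assumption: a message counts as spam once its total reaches the required value."""
    return total_score(hits) >= required


print(total_score(hits), is_spam(hits, required))
```

Raising the required value makes the filter more conservative; lowering it catches more spam at the risk of more false positives.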

Now, I'd suggest you ask your provider to install such a filter on his servers.

My (quite effective) approach by Baloo+Ursidae · 2003-10-10 23:22 · Score: 1 · on What's in Your Spam-Fighting Arsenal?

First off, realise that treating the symptoms doesn't work. This means that C/R is considered harmful, as is address munging. It is still possible in this day and age to stay sane with just one email address without spamtrapping.

Procmail is your friend. Use it. In conjunction with SpamAssassin, you can filter it off to a folder to go send to SpamCop at your earliest convenience. While SpamCop officially discourages doing so, setting your mail server to reject based on the RBL bl.spamcop.net will save you some work (and money if you're a SpamCop member) by prohibiting mail from sites already reported by several people.
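For anyone wondering what "reject based on the RBL" means mechanically: a DNS blocklist is queried by reversing the octets of the connecting IP address, appending the list's zone, and doing an ordinary DNS lookup. A rough Python sketch follows, not a mail server configuration. The function names are hypothetical, and treating every lookup failure as "not listed" is a simplifying assumption.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="bl.spamcop.net"):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone="bl.spamcop.net"):
    """Return True if the lookup resolves, i.e. the address appears on the list.

    Assumption: any resolution failure (including a network problem) is
    treated as "not listed", so a DNS outage never causes mail to be rejected.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


# 192.0.2.10 is a documentation-only address, used here just to show the name format.
print(dnsbl_query_name("192.0.2.10"))
```

In practice the mail server does this lookup itself at connection time; the sketch only shows the shape of the query.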

I use exim in conjunction with sa-exim to reject spam that scores high with Spamassassin, and to teergrube the luser. Since I'm the postmaster, I also have sa-exim give all the sa-exim rejected spam to my spam folder to report as well.

I have roughly 30 users. Almost all of them use my site for mail, since doing so is extremely spam hostile thanks to me, with very little inconvenience, if any, to legitimate mailers, which is the way it should be.

On an aside, I also use abuse.net's forwarding service to report hosts infected with viruses to their ISPs. I've been fairly successful, though it could be better. Roughly one third of the ISPs I contact suspend or terminate the user's account for it. I also maintain a net-searchable list of the last relay such infected messages go through before hitting my server. Feel free to use it for yourself, it's on my website.

Re:thanks for the info folks by sid+crimson · 2003-09-29 19:19 · Score: 2, Informative · on Changes in the Network Security Model?

I'm working on something similar... Exchange/OWA on the net.

There are a couple of people who just need to POP their email while away. Perdition POP3-proxy over SSL is a decent solution. Set up a POP3 proxy box on a separate network (i.e. DMZ) from the Exchange Server and you're set.

There are a few that must have OWA access. For them, set up a reverse proxy with Apache/Squid and get a certificate for this server to communicate with your Exchange/OWA/IIS box.

And forgoodnesssake relay all your email thru something before it hits your virus-protected Exchange box. I suggest a Postfix / SpamAssassin / ClamAV setup.

-sid

Re:Can we really enforce this? by IceFreak2000 · 2003-09-23 22:31 · Score: 1 · on California Tries Spam Ban

I couldn't agree with you more.

My beef is with the amount of time I've spent setting up a spam filtering solution for my family at home - with the nature of a lot of the spam that gets sent to me, it scares me that my daughter will one day have an email address of her own.

I currently have a fairly robust system - qmail, qmail-scanner, clamav, spamassassin - that seems to do the trick, and manages to drop 99.99% of the spam I receive.

Mail that has been identified as Spam gets dropped into an IMAP folder so I can do a cursory check once a day to see if any false positives have been caught (2 in the past 6 months - but in both cases it would have been fairly disastrous if I'd missed them).

But why in hell should I have to jump through so many hoops to get an email service that's workable?

Since this morning, my system has had to deal with over 300 spam emails and 500 instances of Worm.Gibe.F - if things carry on the way they are at the moment it won't be long before people start ditching their email accounts

Re:Other factors to consider by deadcasuals · 2003-09-10 13:17 · Score: 1 · on Recommendations for the Right IMAP Server?

While not necessarily IMAP related, you may want to look into MailScanner. It's a mail relay program that accepts all incoming mail for your domain, does some analysis on the email and then forwards it on to your internal mail system. It can use something like 14 different virus scanners (all at once!) to do signature-based virus detection. At my work, we just use the attachment blocking feature to strip out attachments that we don't want coming in via email. 95% of the attachments that get quarantined at the mail gateway are viruses! It also integrates with spamassassin to help stop spam. It can automatically remove hostile HTML/scripting tags if you want, too.
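The attachment-blocking idea is simple enough to sketch: walk the MIME parts of a message and flag any attachment whose filename ends in an unwanted extension. This is not MailScanner's code or configuration, just a minimal Python illustration using the standard library; the extension list and the function name are assumptions chosen for the example.

```python
import os
from email import message_from_string

# Illustrative list only; a real gateway would use its own, longer list.
BLOCKED_EXTENSIONS = {".exe", ".scr", ".pif", ".bat", ".com", ".vbs"}


def blocked_attachments(raw_message, blocked=BLOCKED_EXTENSIONS):
    """Return the filenames of attachments whose extension is on the block list."""
    msg = message_from_string(raw_message)
    names = []
    for part in msg.walk():  # visits the message itself and every MIME sub-part
        filename = part.get_filename()
        if filename and os.path.splitext(filename)[1].lower() in blocked:
            names.append(filename)
    return names
```

A gateway would then quarantine or strip the flagged parts before passing the rest of the message on.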

We're using a neat MRTG based tool called mailscanner-mrtg to monitor our Mailscanner system. It produces pretty graphs.

All in all, it's a really great first line defense tool for keeping corporate email secure!

Good luck!

ACK and you shall receive.

Re:Quick Workaround (SpamAssassin) by tugrul · 2003-08-26 17:14 · Score: 1 · on Osirusoft Blacklists The World

Old? According to the current stable list of tests, those in your parent are the proper values. Maybe you happen to be running a release candidate of SA, but I prefer to leave my mail in the hands of the latest stable release (2.55 at the time of this posting).

Re:Or try qmail - unbroken since v1.03 (1998) by Shaklee39 · 2003-08-25 15:56 · Score: 1 · on Postfix: A Secure and Easy-to-Use MTA

You do not need to be a programmer to set it up, unless the ability to follow directions makes you a programmer. As far as spam blocking goes, why would you expect a spam blocker out of an MTA? Most people would figure out that qmail is not for blocking spam and instead would use something like spamassassin and have the best spam blocker up and working within 5 minutes like I did.

Too late... by Brendan+Byrd · 2003-08-25 05:01 · Score: 1 · on NZ Spammer Shutdown Makes Big Difference

As I pointed out in an earlier message, SPEWS is probably going to be taken out (or devalued to 0.001) on the newest version of SpamAssassin anyway. More respectable lists like opm.blitzed.org, list.dsbl.org, and dnsbl.njabl.org get higher scores with SpamAssassin.

Re:Are we sure? by Brendan+Byrd · 2003-08-25 04:37 · Score: 2, Informative · on NZ Spammer Shutdown Makes Big Difference

You can be on SPEWS for giving the wrong look. Seriously, SPEWS is an incredibly bad blacklist. The notion of throwing out entire IP blocks, entire ISPs, even entire backbones that MIGHT support spam, is entirely insane. The list is such a joke that the RBL test may be taken out of SpamAssassin in the next version.

The only thing more inaccurate than SPEWS is URBL. (And yes, that is a subtle joke.)

When will you people learn.... by SlashChick · 2003-08-19 06:57 · Score: 5, Insightful · on Microsoft Virus Spam: SoBig.F

...that just because you're not using Outlook or Outlook Express, you still may be vulnerable to worms or email viruses?

All it takes is one user to click the attachment who has an LDAP-enabled address book of the entire company, and poof! you're screwed.

The only sensible way to kill these worms is to block them at the mail server. If you block them at the mail server, you don't have to try to train people or keep hundreds of anti-virus clients up-to-date. Do yourself a favor and set up XWall if you have Exchange (this is about the coolest spam-blocker/email filter program I have ever used, BTW) or SpamAssassin/MailScanner if you have Linux/UNIX. This will save you a ton of headaches in the future, and won't require you to worry about hundreds of clients being up-to-date as much as focusing on whether a few email servers are up-to-date. (Block the standard Microsoft "bad executable" list and you should be fine.)

Seriously, in the year 2003, there's no excuse for "But my 400 clients weren't up-to-date!" Block these things at the server, which is something you as the network administrator should have complete control over, and which is where the worms should have been blocked to begin with.

Re:I changed my mind. Simpler is better. by scj · 2003-08-10 22:38 · Score: 5, Interesting · on Comparison of Bayesian POP3 Spam Filters

I had thought of something similar for fighting spam. Here's how I'd handle each email:

If the email is from someone in my whitelist, allow the mail to go through and feed it as 'ham' to the Bayesian filter.
If the email is not in my whitelist, run it through spam filtering software (Spamassassin works well) to determine if it is likely to be spam.
If it seems like spam, then use a challenge-response system (like TMDA) to find out if a human sent the email.
If the mail doesn't seem like spam, just deliver it. If I get 3 non-spammy messages from the same person (separated by a day or more) then add them to my whitelist automatically.
If someone responds to the TMDA challenge, put them in the whitelist and deliver the original email.
If no one responds to the TMDA challenge after a week, feed the mail as 'spam' to the Bayesian filter.
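The steps above boil down to a small decision procedure. Here is a hedged Python sketch of it. The class, method names, and return strings are invented for illustration, the spam verdict is passed in as a flag rather than computed, and "separated by a day or more" is interpreted as "at least 24 hours after the previously counted clean message".

```python
from datetime import datetime, timedelta


class MailPolicy:
    """Hypothetical sketch of the whitelist / filter / challenge flow described above."""

    def __init__(self, whitelist=None):
        self.whitelist = set(whitelist or [])
        self.clean_seen = {}  # sender -> timestamps of counted non-spammy mails

    def handle(self, sender, looks_spammy, received):
        if sender in self.whitelist:
            return "deliver+train_ham"   # whitelisted: deliver and feed as ham
        if looks_spammy:
            return "challenge"           # let a challenge-response system decide
        # Non-spammy mail from an unknown sender: deliver and count it.
        times = self.clean_seen.setdefault(sender, [])
        if not times or received - times[-1] >= timedelta(days=1):
            times.append(received)
        if len(times) >= 3:              # three clean mails, each a day or more apart
            self.whitelist.add(sender)
        return "deliver"

    def challenge_result(self, sender, responded):
        if responded:
            self.whitelist.add(sender)
            return "deliver"             # deliver the original held message
        return "train_spam"              # no answer after the waiting period
```

The one-week timeout is left to the caller: `challenge_result(sender, responded=False)` would be invoked once that period has passed.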

In addition, I'd use a system like Sneakemail to generate random email addresses to give out to businesses I want to do business with and use to sign up to mailing lists. These email addresses would be added to my whitelist so they could send me mail without going through the challenge-response system. If they start spamming me, I put the random email I gave them on my blacklist.

This system has the following benefits:

Business mail I want (like receipts and newsletters from companies I do business with) always gets through, since the Sneakemail-type address is whitelisted. This solves the problem of businesses not responding to TMDA challenges.
My real email address is protected from businesses who are likely to sell it and from people farming addresses from mailing lists.
Personal email that the spam filter sees as non-spam gets delivered without bothering the sender with a challenge-response system.
Personal email that does seem spammy to the filter still has a second chance to make it through the system with the challenge-response system. This should reduce false positives to include only spammy emails from people who don't respond to the challenge.
The Bayesian filter is automatically trained based on mails from people in my whitelist and mails from people who never respond to the challenge-response.

You would still get spam with this system (personal email that your filter thinks is non-spam), but hopefully your false-positive rate would be zero. Also, you don't annoy other people much by only sending challenge-response messages to spam-like emails. Finally, this would be easy for end users to use. They don't have to train the spam filter, since it should train itself. The only complicated part would be generating and using the random emails that you give to businesses and mailing lists.

Re:Check out Internet Mail 2000 by dpotter · 2003-08-05 03:19 · Score: 1 · on Replacing SMTP?

But if everyone were to use Bayesian I swear we wouldn't even have to propose a new protocol, talk about new legislation, etc.

I love my Bayesian filter, but I am not nearly so optimistic as you. I suspect that Bayesian filtering won't protect us for very long. It's only recently been popularly deployed, and already I've started receiving spams that quite nicely bypass my Bayesian filters: seems that some of my spammers have been including a few hundred (pseudo?-)random English words from the dictionary. The quantity of random words actually exceeds the quantity of spam. They are rendered invisible in the HTML through a variety of techniques. My Bayesian filter, SpamAssassin, doesn't recognize these messages as very spammy, and they are starting to slip through.

I suspect that as spammers get smarter about Bayesian analysis that they will find tokens that register as non-spammy for a large percentage of the population. And as we implement measures to discriminate against those tokens, spammers will migrate to a new set, and so on and so on and we'll discover that Bayesian filtering is just another round in the fight between spammers and... well... everyone else.
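Why padding with innocent words works can be shown with a toy calculation. The sketch below uses the simple naive-Bayes combination popularized by Paul Graham's "A Plan for Spam", which is an assumption made for illustration and not necessarily how SpamAssassin combines token probabilities. The per-token probabilities are invented numbers.

```python
from math import prod  # Python 3.8+


def combined_spam_probability(token_probs):
    """Combine per-token spam probabilities with the naive-Bayes product formula."""
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


# Three strongly spammy tokens (made-up values).
spammy = [0.99, 0.95, 0.9]
# Ten dictionary words that look mildly innocent to this user's filter.
padding = [0.2] * 10

print(combined_spam_probability(spammy))            # close to 1
print(combined_spam_probability(spammy + padding))  # falls well below 0.5
```

Real filters blunt this by only scoring the most extreme tokens or by ignoring invisible text, which is exactly the kind of countermeasure round the comment predicts.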

Re:Distrustful of Network Level Censorship by Delta-9 · 2003-07-26 14:32 · Score: 3, Interesting · on O'Reilly Article on Spam Defense

"Your spam may be my correspondence"

That's why I would recommend SpamAssassin. All SpamAssassin does is label the mail with a "spam level"; it is then up to each individual user to filter out the spam at the user level, not at the server level.

A much better method for letting your 'correspondence' get through while other users' spam doesn't.
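A per-user filter on that label can be very small. SpamAssassin can add an X-Spam-Level header containing one asterisk per point, so each user only needs to count the stars and compare against a personal cutoff. A minimal Python sketch, with hypothetical function names and an arbitrary per-user threshold:

```python
from email import message_from_string


def spam_level(raw_message):
    """Count the asterisks in the X-Spam-Level header (0 if the header is absent)."""
    msg = message_from_string(raw_message)
    return msg.get("X-Spam-Level", "").count("*")


def user_wants(raw_message, user_threshold):
    """Keep the message if its level is below this user's personal cutoff."""
    return spam_level(raw_message) < user_threshold
```

Two users on the same server can then pick different cutoffs, so one person's junk can still be another's correspondence.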

Nice free advertising, Slashdot by oobar · 2003-07-23 06:52 · Score: 1 · on The Growing Field Guide To Spam Techniques

Okay, this article is a thinly veiled promo for ActiveState. This so-called field guide contains a handful of tricks that are mostly obvious to anyone that knows a little bit about HTML or MIME-Encoding. You would be much better off combing through SpamAssassin's extensive list of heuristics rather than reading a boring rehash of "Hey! you can hide stuff in HTML comments! Betcha didn't know that! (Subscribe to our newsletter, thanks.)"

Avoiding spam of all kinds by doodleboy · 2003-07-23 01:17 · Score: 4, Informative · on The Growing Field Guide To Spam Techniques

This will all be blindingly obvious to most readers of /., but just for the record:

Don't use your personal email address for anything online. Don't post to usenet with it, don't use it to register for anything, don't ever use it where there's any chance of it being sold to a third party or picked up by a web crawler. Use a free throwaway web-based account like hotmail or yahoo, that's what they're for. I have a verizon.net primary email address, and I've never received a single piece of spam from it.

However, I still have a forward-only email address from my university circa 1992. Back then, there was no spam and that address has to be on every spammer's list on the planet. I still get a legitimate email every year or two, but spam outnumbers these by at least 10,000 to 1. SpamAssassin does a surprisingly good job of identifying the garbage.

I also use a proxy to surf the web, as well as a large hosts file that reroutes requests to adservers to 127.0.0.1:80, combined with a utility that returns a transparent 1x1 gif to any request on port 80. And of course I use mozilla to block pop-ups and whatnot. I'm so used to surfing in this way that I always recoil in horror when I have to use IE on a naked, unprotected box. How on earth can anyone stand it?

As for more traditional types of spam such as telemarketers, there's the national do not call list. It's free, so there's nothing to lose. You'll also want to check out the many excellent resources at the Junkbusters website. One of the most useful features is a Junkbusters Declare page, which builds custom form letters for you that you can use to opt out of Direct Marketing Association junkmail, as well as telling your financial institutions, etc., not to sell your name to third parties. I used it, it's painless, and my privacy is protected.

Of course, it would be much better if we didn't have to jump through hoop after hoop just to get through the day without being pestered by morons.

Re:no spam filter? by Anonymous Coward · 2003-07-14 08:59 · Score: 0 · on Ximian Evolution's New Clothes

Here's a quarter kid, go buy yourself a real spam filter.

meh by SweetAndSourJesus · 2003-07-14 08:36 · Score: 0 · on Ximian Evolution's New Clothes

What happens if you want to check your email with something like SquirrelMail? No filters, so your spam gets in.

The answer is, as always, Procmail combined with SpamAssassin.

Client-side filtering is for sucks.

Re:Serious testing?? by laa · 2003-07-02 17:48 · Score: 1 · on Bayesian Filter Testing?

Abso-f**king-lutely! I get around 500 spam emails a week. I suppose it's not the world record, but it's enough to make my inbox unusable without filtering. SpamAssassin has so far had a hit ratio of about 99%, with no real mail being classified as spam. I don't know how "good" SpamAssassin's Bayesian filtering really is, but it's certainly good enough for me.

Re:Good reputation? by shokk · 2003-07-02 13:21 · Score: 1 · on Hormel Sues Over SpamArrest Name

At which point I have to ask, is it then actually possible to be a SpamAssassin?

SA Public Corpus by jmason · 2003-07-02 06:22 · Score: 1 · on Bayesian Filter Testing?

There is one, for exactly this reason -- the SpamAssassin public corpus. I made it available for developers of spam tools to compare effectiveness using a good, recent corpus from 1 person's mail feed (as much as that was possible).

Here's the pertinent part of the README :

This is a selection of mail messages, suitable for use in testing spam filtering systems. Pertinent points:

All headers are reproduced in full. Some address obfuscation has taken place, and hostnames in some cases have been replaced with "spamassassin.taint.org" (which has a valid MX record). In most cases though, the headers appear as they were received.

All of these messages were posted to public fora, were sent to me in the knowledge that they may be made public, were sent by me, or originated as newsletters from public news web sites.

relying on data from public networked blacklists like DNSBLs, Razor, DCC or Pyzor for identification of these messages is not recommended, as a previous downloader of this corpus might have reported them!

Copyright for the text in the messages remains with the original senders.

OK, now onto the corpus description. It's split into three parts, as follows:

spam: 500 spam messages, all received from non-spam-trap sources.

easy_ham: 2500 non-spam messages. These are typically quite easy to differentiate from spam, since they frequently do not contain any spammish signatures (like HTML etc).

hard_ham: 250 non-spam messages which are closer in many respects to typical spam: use of HTML, unusual HTML markup, coloured text, "spammish-sounding" phrases etc.

easy_ham_2: 1400 non-spam messages. A more recent addition to the set.

spam_2: 1397 spam messages. Again, more recent.

Total count: 6047 messages, with about a 31% spam ratio.
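The totals in the README can be checked directly from the per-directory counts it lists. A quick Python check, using only the numbers quoted above:

```python
# Message counts per directory, as listed in the README excerpt.
corpus = {
    "spam": 500,
    "easy_ham": 2500,
    "hard_ham": 250,
    "easy_ham_2": 1400,
    "spam_2": 1397,
}

total = sum(corpus.values())
spam_total = sum(n for name, n in corpus.items() if name.startswith("spam"))
spam_ratio = spam_total / total

print(total, spam_total, f"{spam_ratio:.0%}")
```

The five counts add up to the stated 6047, and the two spam directories together account for roughly 31% of it.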

Re:Hormel will probably lose. by AndroidCat · 2003-07-02 04:54 · Score: 1 · on Hormel Sues Over SpamArrest Name

I wonder how they feel about SpamAssassin then? "The body was found with a can of SPAM jammed down its throat. Not a pretty sight." :^P

One difference is SpamArrest is known for spamming its own product from time to time.

They are really going to have their hands full... by Delta-9 · 2003-07-02 03:12 · Score: 1 · on Hormel Sues Over SpamArrest Name

I'd imagine if they were troubled by "Spam Arrest" that "SpamAssassin" would bother Hormel as well. I know what you are thinking, SpamAssassin is open source, however... take a look here. They are a commercial site selling UCE blocking email software. They do have a link to the 'real' SpamAssassin we all know and love.

Re:You called the wrong people by Baloo+Ursidae · 2003-06-27 03:18 · Score: 1 · on Why Are We on E-mail Blacklists?

One word: SpamAssassin.

Re:openrbl.org is a useful tool by Gudlyf · 2003-06-26 05:23 · Score: 1 · on Why Are We on E-mail Blacklists?

Our mail server does not use any blacklists, which is a shame because we get quite a bit of spam. But we are a business and I cannot take the risk of a client email bouncing, especially if they are innocent and the blacklist is wrong.

Why not use SpamAssassin? I have the same situation here at work, and using SpamAssassin works like a champ. I use that along with Anomy. SpamAssassin scans and scores the mail as being possible spam.

I currently specify a score of 6+ as spam. Then that mail gets sent through an anomy script, which strips out any executable or virus-possible files (I tell people here to request zipped files if they want .exe attachments). It also checks the score of the message -- if it's 12+, it dumps the mail into a spam jail directory for three days, so no real person gets that mail unless it's a message they were expecting and never got.

Now all spam with a score of 12+ doesn't get to the recipient, but any with a score of 6-11 gets to the user with "***** SPAM *****" prepended to the subject, along with a body prefix stating what rules the mail "broke", then the original mail as an attachment. All of this is configurable, of course.
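The two-threshold setup described here could be sketched roughly as follows. This is not the poster's actual Anomy script: the function name `route`, the destination labels, and the constant names are assumptions made for illustration. Only the 6 and 12 cut-offs and the subject marker come from the comment.

```python
# Thresholds taken from the comment above; the names are hypothetical.
TAG_THRESHOLD = 6.0    # score at which mail is marked as spam
JAIL_THRESHOLD = 12.0  # score at which mail is held in the spam jail

SUBJECT_MARKER = "***** SPAM ***** "


def route(score, subject):
    """Return (destination, subject) for a message with the given score."""
    if score >= JAIL_THRESHOLD:
        # Held in the jail directory; the user never sees it unless they ask.
        return "jail", subject
    if score >= TAG_THRESHOLD:
        # Delivered, but with the marker prepended to the subject.
        return "inbox", SUBJECT_MARKER + subject
    # Below both thresholds: delivered untouched.
    return "inbox", subject
```

Keeping a middle band between the two thresholds means a borderline false positive still reaches the user, just with a warning attached.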

Re:I am not sure what the spam filter is by selfabuse · 2003-06-20 07:02 · Score: 1 · on The Next Step in Fighting Spam: Greylisting

Sounds like SpamAssassin. I work at an ISP, and we have it filtering incoming mail for several thousand people, and haven't hit any kind of problem that wasn't very easily fixable.

Re:Gorilla Against Spam!! (GAS) by pjrc · 2003-06-18 07:18 · Score: 1 · on Microsoft Files 15 Lawsuits Against Spammers

... when legitimate businesses are sending me unsolicited email, I can click the "remove" link with confidence that I will be removed, not sent more spam.

Would you consider Disney to be a legitimate business?

Let me tell you a little story. Last Christmas season, Robin (girlfriend) wanted a certain collectable Disney figure, so I went to disney.com to find and order it.

At the end of my checkout process, after I'd put in my credit card number, the last page informed me that they had automatically added me to their email marketing list. No checkbox to uncheck... they added me without permission, but they did have a little button I could click to opt-out.

So I clicked Disney's opt-out button: 404 Not Found Error. Damn thing didn't work and I was pissed.

I eventually dug up an email contact address (they obviously don't want to hear from end customers... try looking around on their site for it). Someone did reply eventually, asking for more info, and I replied but there was never a second response.

The SPAM started rolling in. Every message contained an address to supposedly unsubscribe. I replied to every single one to unsubscribe... but after a month (somewhere in February) their regular email promotions were still showing up every several days.

I use Spam Assassin to filter spam, but I set the threshold pretty high since a lot of random people contact me regarding my website. Disney's un-unsubscribable spam was coming in just under the threshold, so I added a custom rule for it to add a couple points and finally the problem was solved.
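To illustrate why "a couple points" is enough, here is a toy model of additive rule scoring in Python. It is not SpamAssassin's rule syntax or its real rule set: every rule name, pattern, score, the threshold, and the sample message are invented for the example.

```python
import re

# Hypothetical stock rules: (name, pattern matched against the raw message, points).
BASE_RULES = [
    ("HTML_ONLY", re.compile(r"Content-Type: text/html", re.I), 2.0),
    ("UNSUB_LINK", re.compile(r"unsubscribe", re.I), 1.5),
]

# Hypothetical custom rule of the kind described above, worth a couple of points.
CUSTOM_RULES = [
    ("LOCAL_DISNEY_PROMO", re.compile(r"^From:.*@.*disney", re.I | re.M), 2.0),
]


def score(message, rules):
    """Sum the points of every rule whose pattern matches the message."""
    return sum(points for _, pattern, points in rules if pattern.search(message))


THRESHOLD = 5.0  # assumed threshold, set "pretty high"

msg = (
    "From: promos@example-disney.invalid\n"
    "Content-Type: text/html\n"
    "\n"
    "Click to unsubscribe"
)

before = score(msg, BASE_RULES)                # stock rules only
after = score(msg, BASE_RULES + CUSTOM_RULES)  # stock rules plus the custom one
```

With these made-up numbers the message sits just under the threshold on the stock rules alone and crosses it once the custom rule fires.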

Maybe Disney is the only offender. Maybe my case was a fluke and normally that webpage button works and maybe that month their email unsubscribe was not working for some reason but has been fixed? But I doubt my experience is unique.

The truth is that they (think they) make money from building that spammy newsletter subscriber list, so they do everything they can to get you on it... including automatically adding you without even asking permission. They have little incentive to let you unsubscribe. So the unsubscribe mechanisms are poorly maintained and tend to not even work.

Legitimate business, Disney Corporation, and lightweight spammer with non-working unsubscribe.

Re:ClamAV! ClamAV! ClamAV! by ddkilzer · 2003-06-10 12:43 · Score: 3, Informative · on Microsoft Acquires RAV Antivirus

I've been using clamav for virus scanning since it appeared in Debian unstable. It is used by amavisd-new for virus scanning and with spamassassin for spam scanning of my incoming (and outgoing) email. Amavisd-new is then integrated with postfix and cyrus-imapd (2.1.x) for my mail server. Works like a champ on a Power Mac 8600/200 with 512MB RAM!

The only problem with using clamav is that it needs more mirrors to distribute the virus definitions. The main virus definition download site was down over this past weekend, I'm guessing because of the BugBear.B worm.

Re:Yes, hold them responsible by admbws · 2003-06-09 11:04 · Score: 1 · on Inappropriate Spam Reaching Children?

Fortunately SpamAssassin is very good at nuking the worst spams (loading a huge penalty on HTML mails with loads of inline images, etc., as well as detecting "porny" words and phrases).

The only spams that get through SpamAssassin seem to be reasonably intelligently written ones, mainly offering webhosting and search engine submission and the like. Highly recommended!

http://spamassassin.org/

When are people going to *SOLVE THEIR OWN PROBLEM* by johnynek · 2003-06-08 01:26 · Score: 2, Interesting · on Spammers Exploiting Hotmail Vulnerability

I have totally solved my spam problem. I get around 600-800 spam messages a week, and maybe one of those will find its way into my inbox. Here is how it is done:

Spamassassin scans all my incoming email. It has pretty good heuristics, which get better if you allow it to use Bayesian learning. If Spamassassin thinks it's spam, a header is added.
CRM114 uses a much more sophisticated Bayesian approach to check to see if the mail is spam. If it is spam, a header is added.
If the sender is on my whitelist (this is a good reference), I put the whitelisted mail in my inbox.
If the sender is not on the whitelist and the message does not have a spam header (from either Spamassassin or CRM114), I put the message in my inbox.
Otherwise, the message is spam, and I put it in my spam folder.

That is basically it. When one gets through, I put it into the false-negative folder, and a cron job has CRM114 learn it. If a good email winds up in the spam folder, I put it in the false-positive folder and CRM114 learns it as non-spam, and I add the sender to my whitelist.
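The delivery steps listed above boil down to a small decision function. A minimal sketch in Python, assuming the two filters have already run and their verdicts are available as booleans; the function and parameter names are hypothetical, and the real setup presumably lives in mail-filter rules rather than Python.

```python
def classify(sender, sa_flagged, crm_flagged, whitelist):
    """Return the folder for a message, following the steps listed above.

    sa_flagged / crm_flagged say whether Spamassassin or CRM114 added a
    spam header; whitelist is a set of trusted sender addresses.
    """
    if sender in whitelist:
        # Whitelisted senders always go to the inbox, whatever the filters say.
        return "inbox"
    if not sa_flagged and not crm_flagged:
        # Neither filter flagged the message.
        return "inbox"
    # At least one filter flagged it and the sender is not whitelisted.
    return "spam"


def handle_false_positive(sender, whitelist):
    """Add the sender of a misfiled good message to the whitelist."""
    whitelist.add(sender)
```

Note that a single spam header from either filter is enough to send non-whitelisted mail to the spam folder, which is why the whitelist check comes first.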

Fortunately, both types of errors are *VERY* rare. The system just works.

A lot of /.ers just dismiss the idea that the problem can be solved. It can be solved. There are even ways my approach can be made more accurate. If I find more than an error or two a month, I may work on it (think: Turing test confirmations for spammy email).

I put up a page describing my efforts. This is a problem which can be (and for many has been) solved!

Re:Attachments by Bob+Uhl · 2003-06-06 04:28 · Score: 1 · on Yet Another Windows Worm

If you are running a corporate meail[sic] server and are not filtering for known executable extensions, you are a fucking idiot. Period. There is just no excuse to EVER allow unfiltered mail through.

Really? What if one's corporation is running Unix only? Perhaps .pif stands for personnel information format at one's company. Perhaps one's corporation has a strict no-lusers policy.

I prefer my mail feed unfiltered. I'll accept SpamAssassin mangling, but that's about it.

Re:I don't receive spam by ptbarnett · 2003-05-27 09:49 · Score: 1 · on Bayesian Filtering For Dummies

Are there any filtering apps for Windows that don't automatically delete spam, but download it to a special spam folder?

Cloudmark does this. I don't use it directly, but my installation of SpamAssassin checks the Cloudmark/Razor servers for the message signature.

Since my email is hosted on a Linux server, I use procmail (with SpamAssassin) to filter spam into a Spam folder.

Re:Spam = /dev/null by GammaTau · 2003-05-26 09:45 · Score: 4, Informative · on Bayesian Filtering For Dummies

Bayesian filtering could stop all the spam that easily? This is great! Where can I download a filter like this?

You can try bogofilter, ifile, SpamBayes, or POPFile. The newer versions of SpamAssassin also implement some kind of Bayesian filtering.
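For anyone wondering what "Bayesian filtering" means in practice, here is a bare-bones naive Bayes sketch in Python. It is not how any of the tools named above are implemented; the function names, the add-one smoothing, and the tiny training sets are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter


def train(messages):
    """Count, per word, how many messages contain it."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts, len(messages)


def spam_probability(text, spam, ham):
    """Estimate P(spam | words in text) using naive Bayes with add-one smoothing."""
    spam_counts, n_spam = spam
    ham_counts, n_ham = ham
    # Start from the prior odds of spam versus ham.
    log_odds = math.log(n_spam / n_ham)
    for word in set(text.lower().split()):
        p_word_spam = (spam_counts[word] + 1) / (n_spam + 2)
        p_word_ham = (ham_counts[word] + 1) / (n_ham + 2)
        # Each word shifts the odds by its likelihood ratio.
        log_odds += math.log(p_word_spam / p_word_ham)
    # Convert log-odds back to a probability.
    return 1 / (1 + math.exp(-log_odds))


# Toy training data, invented for the example.
spam_model = train(["buy cheap pills now", "cheap pills online"])
ham_model = train(["meeting at noon", "lunch at noon tomorrow"])
```

Real filters train on thousands of messages per class and handle tokenization, headers, and rare words far more carefully.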

SpamAssassin Windows incarnations by aberson · 2003-05-20 02:19 · Score: 1 · on Anti-Spam Software for Mom?

at least 3 of em

Re:Ever wonder? by mccrew · 2003-05-15 04:48 · Score: 1 · on Spam Blackhole Lists Redux

Add a new protocol where mail stays on the *sending* server until you pop it off with your client. Instead of sending the entire email to your mail server, it just sends the headers.

I don't think that this would work.

I don't know if you use SpamAssassin or not, but in recent months it has become less and less effective, and more spam has been getting through. Why? It's because the spammers have gotten smarter about what they put in the payload - nowadays the spam that gets through to my inbox is usually a minimal HTML e-mail with no text component (i.e. it neutralizes SpamAssassin's ability to filter based on key spammer words and phrases). The "sales pitch" is just an <img> tag pointing to the spammer's website. On top of that, most e-mail clients will automatically retrieve the image from the website, causing your e-mail address to become validated as "live" as a side effect.
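The message shape described here (an HTML part carrying a remote image and no plain-text part) can at least be recognized mechanically. A rough sketch using Python's standard `email` package; the function name, the regular expression, and the decision to treat "no non-empty text/plain part" as "no text component" are assumptions, not something SpamAssassin is known to do this way.

```python
import re
from email import message_from_string

# Matches an <img> tag whose src points at an http(s) URL.
REMOTE_IMG = re.compile(r"<img\b[^>]*\bsrc\s*=\s*[\"']?https?://", re.I)


def is_image_only_html(raw):
    """True if the message has a remote <img> in HTML and no plain-text part."""
    msg = message_from_string(raw)
    has_text = False
    has_remote_img = False
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype == "text/plain":
            payload = part.get_payload(decode=True) or b""
            if payload.strip():
                has_text = True
        elif ctype == "text/html":
            payload = part.get_payload(decode=True) or b""
            if REMOTE_IMG.search(payload.decode("utf-8", "replace")):
                has_remote_img = True
    return has_remote_img and not has_text
```

A check like this says nothing about whether the mail is wanted, so it would only make sense as one scored signal among many.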

So in effect, we already have the situation where just the headers get sent, with only 3 short lines of HTML payload. If we can't filter it out now while we have the body content, how will we be able to filter it when we just have the headers?

Now having said all that, I agree that holding the e-mail on the sender's server is a good idea, but for other reasons. Because most spam nowadays is pretty small (i.e. the payload is smaller than the RFC822 headers even), there aren't really any spam-prevention benefits that can occur on the recipient's side. The only plus I see is that the originating ISP could watch its outbound queue and hopefully be able to detect and shut things down quickly.

Also, it would be nice to not be burdened when the marketing dufus sends out multi-megabyte PowerPoint attachments, but that's a different rant.

Re:Learning Spam Filters by SuiteSisterMary · 2003-05-14 06:28 · Score: 1 · on Are People Using TMDA to Kill Spam?

The hard part is priming the databases. Maybe it would be worth it to have a database that can be downloaded and used as an initial point for new users - combined with "Spam", "Not Spam", "Whitelist" buttons in their client to automatically tweak the db to their usage patterns.

http://www.spamassassin.org/publiccorpus/

What a pain by stevenbdjr · 2003-05-14 03:37 · Score: 2, Informative · on Are People Using TMDA to Kill Spam?

There are better methods. Message analysis (ala SpamAssassin), spam clearing houses (ala Razor), RBLs, Bayesian filters, and sender address verification. I use all five at my site, and my users are happy.

Plus, can you imagine a potential client of your company e-mailing for information, only to be sent a TMDA message? I'd bet money that person would either not know what to do, or just ignore the message and think you never got back to them.

Comments · 240