Spam Archive opening FTP service December 4
Saint Aardvark writes "The FTP archives for spamarchive.org will be opening on December 4, according to this Wired article. But there already appear to be some archives available." I tried saving my spam for a while just for giggles, but seeing that file grow to 100+ megs made me so angry I had to delete it. Currently getting ~200 spam messages every day, and now they often attach images, so each one is 100k+. Yay Internet!
SpamAssassin is rule-based and doesn't as yet use this new, dubious spamarchive. It can use Vipul's Razor, however, as well as SPEWS, SpamCop, etc.
dave
I also manage email for 10,000+ users. And I do a lot more than that; it simply does not take that much time if you handle things properly.
For corporate-wide spam blocking, sendmail has some great spam filtering features via DNS Black Lists (dnsbl). I use spamhaus.org and relays.osirusoft.com.
Add these lines to your sendmail.mc:
FEATURE(dnsbl, `sbl.spamhaus.org', `"550 Mail from " $&{client_addr} " rejected, see http://www.spamhaus.org/"')dnl
FEATURE(dnsbl, `relays.osirusoft.com', `"550 Mail from " $&{client_addr} " rejected, see http://relays.osirusoft.com"')dnl
There goes 90+% of the problem. After that, spamassassin handles the 10% that trickles through quite nicely.
If you don't use sendmail, all other modern mail relays can handle this problem in similar ways.
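For anyone wondering what a dnsbl check actually does on the wire: the relay reverses the octets of the connecting IP, appends the blacklist zone, and does an ordinary DNS A lookup. If the name resolves, the address is listed. Below is a minimal Python sketch of that idea, not what sendmail runs internally. The function names `dnsbl_query_name` and `is_listed` are made up for illustration, and the sketch only handles IPv4.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name a DNSBL lookup queries for an IPv4 address.

    Hypothetical helper: the octets are reversed and the zone is appended,
    e.g. 192.0.2.99 checked against example.zone -> 99.2.0.192.example.zone
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the DNSBL zone answers for this address.

    Caveat: any resolution failure is treated as "not listed", so a dead
    or unreachable blacklist silently lets everything through.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

Calling `is_listed(client_ip, "sbl.spamhaus.org")` mirrors the first FEATURE line above. The result depends on your resolver and on the list's current contents, so treat it as a way to see the mechanism rather than as a production check.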
If people are going to use this archive to automatically induce rules for recognising junk mail (e.g. using naive bayes or ripper), then they will also need at least as many examples of legitimate mail.
Of course it could be useful for evaluating classifiers built using smaller corpora.
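To make the point about needing legitimate mail concrete, here is a rough sketch of a naive Bayes classifier in Python. It is a toy under stated assumptions: whitespace tokenization, add-one smoothing, and a hypothetical `NaiveBayes` class of my own naming, not the method any particular filter uses. Note that it cannot classify anything until it has seen both classes, which is exactly why a spam-only archive is not enough on its own.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain lowercase whitespace tokens; real filters do far more.
    return text.lower().split()


class NaiveBayes:
    """Toy two-class naive Bayes with add-one (Laplace) smoothing."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def log_score(self, text, label):
        # Vocabulary is every word seen in either class.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        # Log prior from the share of training messages with this label.
        score = math.log(self.doc_counts[label] / total_docs)
        for word in tokenize(text):
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def classify(self, text):
        # Without examples of both classes there is nothing to compare against.
        if min(self.doc_counts.values()) == 0:
            raise ValueError("need examples of both spam and legitimate mail")
        return max(self.LABELS, key=lambda label: self.log_score(text, label))
```

Train it with `train(message, "spam")` on archive messages and `train(message, "ham")` on your own saved legitimate mail, then call `classify(message)`. The prior comes straight from the training proportions, so a corpus that is mostly spam will push borderline mail toward the spam label.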
Some of us have been on Usenet since long before that meant we were "asking for it". That damage can't be undone.
Intelligent Life on Earth
Please, please, please set up Vipul's Razor, so that we can all benefit from the spamminess of your account!
Are you sure you investigated exactly what osirusoft does?
I find it unfortunate that so many administrators seem to put in osirusoft as a blacklist without examining what it does. Osirusoft combines the listings of many other blackhole lists, one of which, unfortunately, is SPEWS. SPEWS, in my opinion, is overzealous with its blacklisting, and it is unfortunate that osirusoft includes it. To read more about the problem, read this posting here.
Here is a relevant quote...
ii. a grep on osirusoft - which yields about 1/2 the messages -
but.. when there's a false positive, there's a really good chance that
it's in this group - and of this class of false positives, there's a close
to 100% likelihood that it's SPEWS that's given the false positive
You can also check out antispews.