Spam Archive opening FTP service December 4
Saint Aardvark writes "The FTP archives for spamarchive.org will be opening on December 4, according to this Wired article. But there already appear to be some archives available." I tried saving my spam for a while just for giggles, but seeing that file grow to 100+ megs made me so angry I had to delete it. I'm currently getting ~200 spams every day, and now they often attach images, so they are 100k+ each. Yay Internet!
I get about 5-7 per day, on average. Not 200, but to have to get rid of 5-7 messages per day (and report them to spamcop) is very, very irritating.
No one should have to abandon an e-mail address because of unsolicited e-mail, especially (as in my case) if they've had their account for five years, and all of their friends and relatives know it...
evil adrian
If people put their email address all over the web, it's no wonder. I just use a service like spamgourmet.com if I need an email address to subscribe somewhere, and a web form if you want to contact me, like C14L.com/mail. I've got no problem with spam.
SpamAssassin is rule-based and doesn't as yet use this new, dubious spamarchive. It can use Vipul's Razor, however, as well as SPEWS, SpamCop, etc.
dave
Yeah, you need a whitelist that doesn't use their email addresses but instead has them configure YOUR email address, so it's like
"Your Name +keyword"(YourAddress@Whereever.com)
and make the keyword NoSpamPlease or something, and make NoSpamPlease the thing you filter on. Pick a word that will never be in a spam message; that's not too hard.
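The keyword check described above could be sketched roughly like this in Python. This is only an illustration: the function name is hypothetical, the example address is made up, and it assumes the keyword ends up in the display-name part of a conventional `"Name" <address>` To: header.

```python
from email.utils import parseaddr

# Hypothetical keyword; pick any word that spammers are unlikely to use.
KEYWORD = "NoSpamPlease"


def passes_keyword_filter(to_header: str, keyword: str = KEYWORD) -> bool:
    """Return True if the display name of the To: header contains the keyword."""
    display_name, _address = parseaddr(to_header)
    # Case-insensitive substring match on the display name only.
    return keyword.lower() in display_name.lower()
```

Mail that fails the check would go to a "probably spam" folder rather than being deleted, since a friend can always forget to set the name.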
I also manage email for 10,000+ users. And I do a lot more than that; it simply does not take that much time if you handle things properly.
For corporate-wide spam blocking, sendmail has some great spam filtering features via DNS Black Lists (dnsbl). I use spamhaus.org and relays.osirusoft.com.
Add these lines to your sendmail.mc:
FEATURE(dnsbl, `sbl.spamhaus.org', `"550 Mail from " $&{client_addr} " rejected, see http://www.spamhaus.org/"')dnl
FEATURE(dnsbl, `relays.osirusoft.com', `"550 Mail from " $&{client_addr} " rejected, see http://relays.osirusoft.com"')dnl
There goes 90+% of the problem. After that, spamassassin handles the 10% that trickles through quite nicely.
If you don't use sendmail, all other modern mail relays can handle this problem in similar ways.
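For anyone wondering what a dnsbl lookup actually does, here is a minimal Python sketch, assuming the usual DNSBL convention: reverse the octets of the client's IPv4 address, append the list's zone, and treat any successful A-record lookup as "listed". The function names are made up for illustration, and this is not how sendmail implements it internally.

```python
import ipaddress
import socket


def dnsbl_query_name(client_ip: str, zone: str) -> str:
    """Build the DNS name to query, e.g. 192.0.2.99 -> 99.2.0.192.<zone>."""
    # IPv4Address() validates the input and raises ValueError on garbage.
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(client_ip: str, zone: str) -> bool:
    """Return True if the blacklist zone has an A record for this client."""
    try:
        socket.gethostbyname(dnsbl_query_name(client_ip, zone))
        return True
    except OSError:
        # No such name (or the lookup failed): treat the client as not listed.
        return False
```

Note that treating a failed lookup as "not listed" means mail still flows if the blacklist itself is unreachable.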
If people are going to use this archive to automatically induce rules for recognising junk mail (e.g. using naive Bayes or RIPPER), then they will also need at least as many examples of legitimate mail.
Of course it could be useful for evaluating classifiers built using smaller corpora.
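To show why the legitimate examples matter, here is a toy naive Bayes sketch in Python. It is a simplified illustration with whitespace tokenising and add-one smoothing, not what any particular filter does, and all names in it are hypothetical. Every word's score is a ratio of its spam frequency to its legitimate-mail ("ham") frequency, so without a ham corpus there is nothing to compare against.

```python
import math
from collections import Counter


def train(spam_docs, ham_docs):
    """Count word occurrences in each class; both lists must be non-empty."""
    spam_counts = Counter(w for d in spam_docs for w in d.lower().split())
    ham_counts = Counter(w for d in ham_docs for w in d.lower().split())
    return {
        "spam": spam_counts,
        "ham": ham_counts,
        "vocab": set(spam_counts) | set(ham_counts),
        "n_spam": len(spam_docs),
        "n_ham": len(ham_docs),
    }


def spam_log_odds(model, text):
    """Return log P(spam|text) - log P(ham|text); positive means 'spammy'."""
    vocab_size = len(model["vocab"])
    spam_total = sum(model["spam"].values())
    ham_total = sum(model["ham"].values())
    # Prior: ratio of spam to ham training messages.
    score = math.log(model["n_spam"] / model["n_ham"])
    for word in text.lower().split():
        if word not in model["vocab"]:
            continue  # words never seen in training carry no evidence
        # Add-one (Laplace) smoothing avoids zero probabilities.
        p_spam = (model["spam"][word] + 1) / (spam_total + vocab_size)
        p_ham = (model["ham"][word] + 1) / (ham_total + vocab_size)
        score += math.log(p_spam / p_ham)
    return score
```

A classifier trained on a lopsided corpus gets a lopsided prior and poor ham word counts, which is exactly the problem with a spam-only archive.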
Now does this make EVERY email you receive spam?
Regardless, it works. I have never received spam through their service.
Try SpamNet; it does something like that. It only works with Outlook 2000/XP at the moment, but they say Outlook Express support will come soon. It generates some sort of hash from the email and compares it to its database of known spam. You can also block spam that it did not filter, so that it gets filtered next time.
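The general idea of hash-based matching can be sketched in a few lines of Python. This is a guess at the principle only: the class and function names are hypothetical, the database is a local set rather than a shared service, and real systems presumably use fuzzier signatures than a plain SHA-256 of the normalised body.

```python
import hashlib


def message_fingerprint(body: str) -> str:
    """Hash the body after lowercasing and collapsing whitespace."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SpamDatabase:
    """Stand-in for a shared database of reported spam fingerprints."""

    def __init__(self):
        self._known = set()

    def report(self, body: str) -> None:
        # A user flags a message that slipped through.
        self._known.add(message_fingerprint(body))

    def is_spam(self, body: str) -> bool:
        return message_fingerprint(body) in self._known
```

An exact hash like this is easy to defeat by adding a random word to each copy, which is why a fuzzier fingerprint would be needed in practice.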
Some of us have been on Usenet since long before that meant we were "asking for it". That damage can't be undone.
Intelligent Life on Earth
google your verio address.
Please please please set up Vipul's Razor, so that we can all benefit from the spamminess of your account!
Are you sure you investigated exactly what osirusoft does? I find it unfortunate that so many administrators seem to put in osirusoft as a blacklist without examining what it does. Osirusoft combines the listings of many, many other blackhole lists, one of which is, unfortunately, SPEWS. SPEWS, in my opinion, is overzealous with blacklisting, and it is unfortunate that osirusoft includes it in its list. To read more about the problem, read this posting here.
Here is a relevant quote...
ii. a grep on osirusoft - which yields about 1/2 the messages -
but.. when there's a false positive, there's a really good chance that
it's in this group - and of this class of false positives, there's a close
to 100% liklihood that it's SPEWS that's given the false positive
You can also check out antispews.
I haven't tried any of the Bayesian stuff (yet), but I imagine it'll have a similar hit-ratio.
Actually, I just switched from my shell hoster's systemwide spam filter (no idea what it was, but it puts X-Spam-Warning in the header) to the Bogofilter Bayesian spam filter running only in my shell account. I planned ahead and saved up over 250 spam emails (and 590 non-spam) for its first day of training. After three weeks of catching 35 and missing about 4 spams a day, it *just* marked its first legit one as spam today -- HiltonHonors assumed I wanted HTML mail and never referenced my name after the To: line. Not that HTML mail is necessarily a trigger for everyone, but it is for me.
If your mail goes through a shell account somewhere along the way, I would definitely recommend trying it out. After using pine for so many years, I can visually scan hundreds of emails in my spam folder for known senders in less than a minute. Under a minute every few days is okay by me.
Intelligent Life on Earth
In the long run, I think you're right, but thank the stars for spamassassin in the meantime! When I first installed it, about a year ago I think, it was blocking about 8000 messages/month just to me! I checked earlier today for other reasons, and found it's grown to 13,000 blocked messages in the last month, adding up to 116Meg. It's just f***ing insane. Unfortunately, the 4% it lets through adds up to over 500 messages in the last month, and it did manage to block 3 real messages, but it's still worth it...
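As a rough sanity check on those numbers, the arithmetic can be written out in Python. The helper names are made up, and the sketch assumes that the 3 real messages are included in the 13,000 blocked and that "over 500" is taken as exactly 500.

```python
def miss_rate(blocked: int, false_positives: int, missed: int) -> float:
    """Fraction of all spam that got past the filter."""
    spam_caught = blocked - false_positives
    return missed / (spam_caught + missed)


def false_positive_share(blocked: int, false_positives: int) -> float:
    """Fraction of blocked messages that were actually legitimate."""
    return false_positives / blocked


# Figures quoted above: 13,000 blocked, 3 of them real, about 500 missed.
missed_fraction = miss_rate(13_000, 3, 500)
wrongly_blocked = false_positive_share(13_000, 3)
```

With these inputs the missed fraction comes out a little under 4%, which is consistent with the figure quoted, and the wrongly blocked share is a few hundredths of a percent.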