Proving Which Spam Filters Work Best

Not at 400 by Anonymous Coward · 2006-08-02 16:19 · Score: 0, Informative

400 Megs that is.......

Easier? by Ec|ipse · 2006-08-02 16:22 · Score: 2, Insightful

Isn't there an easier way to display the results, like a chart or something? 400M per file download is a bit extreme.

Re:Easier? by eric76 · 2006-08-02 22:04 · Score: 1

They could have just e-mailed it to everyone with a gmail account.
Re:Easier? by guysmilee · 2006-08-03 01:25 · Score: 1

Clearly the prof understands compression techniques and not the students posting the video.
Re:Easier? by devnulljapan · 2006-08-03 03:00 · Score: 1

Note: We are sorry that these talks are not available through BitTorrent, however under present IST policy we are not allowed to run BitTorrent. We thank you for your understanding.
...and you don't think someone at uwaterloo is trying to make a point about this policy by posting a link to a 400MB file on /.?

In my experience... by vivin · 2006-08-02 16:24 · Score: 4, Informative

... the ones which have worked best (for me) are Bayesian Spam Filters (A Plan for Spam, SpamBayes - a free filter) and CRM114 The Controllable Regex Mutilator (Paul Graham mentions it here). I've always had a very high success rate with these.

--
Vivin Suresh Paliath
http://vivin.net

I like

Re:In my experience... by coffeeisclassy · 2006-08-02 16:29 · Score: 3, Insightful

What's surprising is, while Bayesian spam filters work well in his tests, the one that performs the best was never really heard of before... I wonder how long it will be before we see something using the methods available. Who wants to bet open source will beat closed source to implementing this?
Re:In my experience... by ozmanjusri · 2006-08-02 16:30 · Score: 5, Funny

I've always had a very high success rate with these.
I haven't tested this one myself (Barrett Filter), but I understand it is 100% effective at reducing spam from known sources. False positives may be a problem, however.

--
"I've got more toys than Teruhisa Kitahara."
Re:In my experience... by emag · 2006-08-02 16:51 · Score: 1

Just repeat after me... "They're comin' right for us!"

oh, wait, you can't use that anymore. Try "Aw, look, they're starvin' to death! We have to thin the herd!"

--
"The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
Re:In my experience... by TorKlingberg · 2006-08-02 17:19 · Score: 1

SpamAssassin uses Bayesian Filtering as well as other methods.
Re:In my experience... by Red+Alastor · 2006-08-02 18:00 · Score: 4, Informative

I like popfile because it's a bayesian filter that sorts into any arbitrary categories you want, not just spam and ham.
http://popfile.sourceforge.net/
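
For anyone wondering how a Bayesian filter can sort into arbitrary categories rather than just spam and ham, here is a minimal naive Bayes sketch. It is not POPFile's code: the class name `BucketClassifier`, the bucket names, the whitespace tokenizer, and the add-one smoothing are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


class BucketClassifier:
    """Toy multi-bucket naive Bayes classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.doc_counts = Counter()              # bucket -> messages seen
        self.vocab = set()                       # every word seen in training

    def train(self, bucket, text):
        words = text.lower().split()
        self.word_counts[bucket].update(words)
        self.doc_counts[bucket] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_bucket, best_score = None, -math.inf
        for bucket in self.doc_counts:
            counts = self.word_counts[bucket]
            total_words = sum(counts.values())
            # Log prior: how often this bucket was seen during training.
            score = math.log(self.doc_counts[bucket] / total_docs)
            for word in words:
                # Add-one smoothing so unseen words do not zero the score.
                score += math.log(
                    (counts[word] + 1) / (total_words + len(self.vocab))
                )
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket


classifier = BucketClassifier()
classifier.train("spam", "cheap pills buy now")
classifier.train("work", "meeting agenda for monday")
classifier.train("lists", "patch review for kernel module")
print(classifier.classify("buy cheap pills"))
```

Retraining a bucket after changing what it means would amount to clearing that bucket's counts and feeding it fresh examples.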

--
Slashdot anagrams to "Sad Sloth"
Re:In my experience... by Anonymous Coward · 2006-08-02 18:15 · Score: 0, Redundant

I have a simple, foolproof idea to help eliminate spam.

Email certification.

If you want to be able to send Certified Email (CE), you apply for Certification from the company that gives you internet connectivity. They check you out, and 'Certify' you as being a legitimate emailer (ie: not a spammer). Then, you generate a private/public key pair and give them the public one. In the headers of all your email is their certification, and an encrypted header line that's created using your private key.

When email arrives at the recipient's server (or this could be done at the client level, as well), the server sees the certification, and connects to the certifying server to get your public key. It attempts to decrypt the header line. If it succeeds, it marks the email as 'certified'; if it cannot, it marks the email as 'uncertified', and the email client can be programmed to filter messages based on that.

Due to the public/private key cryptography, there can be no certified email spoofing. (Assuming the private keys are secure, the keys are of decent length, etc.) All emails are traceable back to the originating server. CORRECTION- all CERTIFIED emails are traceable. Anonymous email is still possible. People can still set up email servers for mailing lists without "having" to get them certified. And people can still receive non-certified mail.

If an email server sends out spam, the complaints go to its certifier. They can drop the certification, deleting the public key from their server. When this happens, ALL the email from the spamming server is now 'uncertified', and gets handled accordingly by email clients. If nothing is done, complaints go to THEIR upstream, etc. Individuals and groups can keep their own blacklists, if they wish, and anyone can choose to filter emails according to those lists.

Now, I've looked over that 'form email' that people like to post to shoot down anti-spam ideas. And nothing applies to this idea. (If something seems to apply, it's because I either left out details, or explained something wrong.) This idea does NOT need to be universally adopted, nor does it need to be adopted by everyone all at once. It's primarily a way of reliably tracing (certified) emails back to their originating server. The anti-spam part comes later: if you receive certified spam, complain and get the server un-certified. If you receive un-certified spam... well, just have your email client dump all uncertified emails in the trash. (Not necessarily, you could just use its un-certifiedness as a factor in filtering your email.)

This idea does not require anything be changed with SMTP. It simply requires a second connection be made to the certifying server. Now, before you bitch about the extra bandwidth, I'd like to remind you that, once this idea catches on, spam will be greatly reduced. This reduction will MORE than make up for the slight increase in bandwidth created in querying the certifying servers. Also, the certifying servers can set time limits on when the certifications expire, and need to be re-downloaded (kind of like DHCP leases). A 'new' company that just applied for certification might have its certificate set to expire almost instantly. This way, every email they send requires a download of the certificate. This allows the certificate to be pulled rapidly if they start spamming. After a month or two, it could be set to expire weekly or monthly.

To sum up: Email Certification is a reliable way of tracing the certified emails back to their originating server. This allows spammers to be identified unequivocally, and have their certification pulled. Email servers are NOT required to be certified, and anonymous email is still possible. Email recipients can, if they choose, set up their client to send uncertified emails to the trash, or to handle them however they wish. White lists and black lists are still possible. 'Hobby mailing lists' are still possible, certified or not. The extra bandwidth is minimal, and easily overshadowed by the reduction in spam being sent once spammers realize no one is even seeing, much less reading or replying to their spam.
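
The flow described in this comment (sign with a private key, look the public key up at the certifier, revoke by deleting the key) can be sketched roughly as below. This is a toy under loud assumptions: the "encrypted header line" is treated as a signature over the message, the RSA primes are textbook-sized and offer no security, and the names `Certifier`, `sign`, `check`, and `mail.example.org` are hypothetical, not part of any real system.

```python
import hashlib

# Toy RSA key pair with textbook primes; far too small for real use.
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest(message, modulus):
    """Hash the message and reduce it into the toy key's range."""
    raw = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % modulus


def sign(message, private_exponent=D, modulus=N):
    """Sender side: produce the value that goes in the extra header."""
    return pow(digest(message, modulus), private_exponent, modulus)


class Certifier:
    """Stands in for the ISP's certifying server."""

    def __init__(self):
        self.public_keys = {}

    def certify(self, server, public_key):
        self.public_keys[server] = public_key

    def revoke(self, server):
        # Dropping the key makes all mail from this server 'uncertified'.
        self.public_keys.pop(server, None)

    def lookup(self, server):
        return self.public_keys.get(server)


def check(message, signature, server, certifier):
    """Recipient side: fetch the public key and verify the header value."""
    key = certifier.lookup(server)
    if key is None:
        return "uncertified"
    exponent, modulus = key
    valid = pow(signature, exponent, modulus) == digest(message, modulus)
    return "certified" if valid else "uncertified"


isp = Certifier()
isp.certify("mail.example.org", (E, N))
header_value = sign("Hello from a certified sender")
print(check("Hello from a certified sender", header_value,
            "mail.example.org", isp))
```

The expiry idea (certificates that behave like DHCP leases) would sit on top of `lookup`, with the recipient caching the key until its lease runs out.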
Re:In my experience... by I!heartU · 2006-08-02 19:34 · Score: 3, Insightful

Domain keys... now just get everyone to use it.
Re:In my experience... by Haeleth · 2006-08-02 19:59 · Score: 1

How does your email certification scheme prevent malicious false reports of spam from causing lazy certification providers to incorrectly revoke the certification of innocent users, leading either to false positives or to the usefulness of certification being largely lost?
Re:In my experience... by 1u3hr · 2006-08-02 21:33 · Score: 2, Insightful

What's surprising is, while Bayesian spam filters work well in his tests, the one that performs the best was never really heard of before
Well, the spammers have heard of the other methods too and try to subvert them. So give them time and see how it performs if and when it becomes more commonly used and the spammers are trying to beat it.
Re:In my experience... by KlaymenDK · 2006-08-02 21:47 · Score: 4, Insightful

"False positives may be a problem, however."

False positives are a HUGE problem compared to the occasional "true negative"(?).

I'd rather have a small trickle of spam emails (I can't believe I'm saying this, but hear me out) than I would risk missing out on that one truly important email.

--
"Good news, everyone!"
Re:In my experience... by marcello_dl · 2006-08-02 22:07 · Score: 1

yep but good luck trying to defeat different algorithms and still retaining some sense, let alone a convincing message. Unless people are going to trust a sender named "Honey bee furufuru", which unfortunately is still entirely possible.

--
---- MISSING MISCELLANEOUS DATA SEGMENT --- [sigdash] trolololol
Re:In my experience... by cerberusss · 2006-08-02 22:18 · Score: 1, Funny

Ummm, you didn't get the joke. Read it again, Sam.

--
8 of 13 people found this answer helpful. Did you?
Re:In my experience... by Jartan · 2006-08-02 22:30 · Score: 0, Offtopic

The best part about this post is that it's modded "Informative".
Re:In my experience... by jank1887 · 2006-08-03 00:27 · Score: 3, Insightful
Hello. welcome to the internet.
First, spam does not need to make sense to make money. Here's some of my latest received headlines:
- placing LEDhas
- pJapans mission
- capture Todays architect shared
- 6MZ
and the body text (with an attached image):
-----
malware
USDA databases crop
entente cordial: admission relation contract GB giveaway andd
studios another page:
... (etc.,etc.)
-------
AND IT STILL MAKES MONEY!!!
spam is funded by idiots. we will never run out of idiots on the net. Thus, spam will always be profitable under the current email system. No matter what filters are used. Filters don't fix the spam problem any more than Virus Scanners stop viruses from spreading. It's all reactionary, which translates to 'fighting a never-ending battle on the losing side'.
Re:In my experience... by twistedsymphony · 2006-08-03 00:39 · Score: 1

you mean "false negative"... a "true negative" would be a non-spam email _not_ filtered by the spam filter. (which is what you want)

For reference:
False Positive - non-spam marked as spam - BAD
False Negative - spam marked as not spam - BAD (but forgivable if it's a trickle)
True Positive - spam marked as spam - GOOD
True Negative - non-spam marked as not spam - GOOD
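
The four cases above reduce to two yes/no questions, which a short sketch makes explicit. The function name `outcome` and the sample data are made up for illustration.

```python
from collections import Counter


def outcome(is_spam, flagged):
    """Name the result of one filtering decision.

    is_spam: what the message really is.
    flagged: what the filter said ("positive" means flagged as spam).
    """
    if flagged:
        return "true positive" if is_spam else "false positive"
    return "false negative" if is_spam else "true negative"


# (really spam?, filter flagged it?) for a handful of made-up messages.
decisions = [(True, True), (True, True), (True, False),
             (False, False), (False, False), (False, True)]
tally = Counter(outcome(is_spam, flagged) for is_spam, flagged in decisions)
print(tally)
```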

--
Collector's Edition
Re:In my experience... by click2005 · 2006-08-03 01:07 · Score: 1

Maybe someone will write software to use Text Mining http://slashdot.org/article.pl?sid=06/08/02/221227 to find spam.

--
I am a free slashdotter. I will not be modded, blogged, DRM'd, patented, podcasted or RFID'd. My life is my own.
Re:In my experience... by DarkDragonVKQ · 2006-08-03 01:31 · Score: 1

While that may be the case, for me the answer was to set up several different email accounts. One of the 6 is used for the obvious sites where you'd get spam: sign-ups for forums, sign-ups for sites like this. It's pretty much a junk account where I couldn't care less if I get the emails or not. Ironically, combined with Yahoo's spam filter and the one built into Thunderbird, it's rare that I get an email that falls into the junk folder.

Then I got two for more casual use (random friends on the internet who will send you chain letters, etc..) One for people I know in real life who I'd shove my foot up their ass if they dared to send me chain letters. And lastly a university one for university and business purposes.

It seems to work pretty well: I pretty much never miss an important email, and the spam that makes it through the barriers is less than 1-2 per week.

Managing those email accounts can be hard, though, which is why I use Thunderbird and ComodoDragon's webmail extension, which basically scrapes emails from the Hotmail and Yahoo servers even though they no longer support free POP3 access.

--
"I thought what I'd do was I'd pretend I was one of those deaf-mutes" ~ Laughing Man - GITS:SAC
Re:In my experience... by KlaymenDK · 2006-08-03 02:11 · Score: 1

Duh, I feel dumb now, I thought it was a relevant link. Busted for not reading the f'ing link, hee hee ... !

--
"Good news, everyone!"
Re:In my experience... by TheOtherChimeraTwin · 2006-08-03 02:16 · Score: 1

spam does not need to make sense to make money
More importantly, spam doesn't have to make money for the people paying for the spam. The people paying for the spam just have to expect to make money -- even if they never see a penny of profit. Spam is amazingly cheap to send; the spammers can make a profit selling their services to clueless people who hope you'll buy "bybubVjjagra".
Re:In my experience... by Crayon+Kid · 2006-08-03 02:20 · Score: 1

Wonderful! I've long wanted to have categories such as "spam but about tits so it's ok" or "not spam but damn this chick can ramble".

--
i ate crayons when i was a kid and now i have two braincells and the blue ones taste nicer
Re:In my experience... by ajs · 2006-08-03 03:23 · Score: 2, Informative
In my experience, the commercial offerings (such as mail frontier) aren't too bad. As far as open source stuff, my personal setup of choice is:
- Spamhaus SBL/XBL filtering (hard SMTP-time DNSBLing) based on my experience with them and their consistent listing of VIOLATORS, not just anyone who shares a netblock with a spammer (i.e. they may not catch as much as some others, but they don't have the FP rate that others do)
- Greylisting. This is controversial because many people can't tolerate the delay it introduces. I found a radical decrease in spam when using it (because honeypots have already located a spammer by the time they try again), and only marginal headaches introduced by the delays of new senders. YMMV, and I wouldn't use it in a production environment.
- SpamAssassin. I tweak the RBL settings (I *never* want to even score SORBS, for example), and configure razor, but otherwise pretty much leave it in its default configuration, and it works great!
- Thunderbird mail filtering. I use Evolution and Thunderbird. I don't bother turning on mail filtering in Evolution, since it uses SpamAssassin, and there's no point using SA twice on the same message. I *do* use Thunderbird's filtering as yet-another layer of filtering when I'm using that, and it does a good job of classifying what little spam is left.
YMMV. Good luck.
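
For readers who have not met greylisting, the idea in the list above can be sketched as follows: the first delivery attempt from an unknown (client IP, sender, recipient) triplet gets a temporary failure, and a retry after some delay is accepted. The class name `Greylist`, the 300-second default, and the reply strings are illustrative assumptions; real implementations also expire old entries and whitelist known senders.

```python
class Greylist:
    """Toy greylisting policy keyed on (client IP, sender, recipient)."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay  # seconds a new triplet must wait (assumed)
        self.first_seen = {}        # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient, now):
        triplet = (client_ip, sender, recipient)
        # Record the first attempt; later attempts reuse that timestamp.
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.min_delay:
            return "250 OK"
        # 4xx is a temporary failure: a well-behaved MTA queues and retries.
        return "451 Greylisted, please try again later"


policy = Greylist()
attempt = ("192.0.2.10", "someone@example.com", "me@example.org")
print(policy.check(*attempt, now=0))    # first attempt is deferred
print(policy.check(*attempt, now=900))  # retry after the delay is accepted
```

The delay for new senders mentioned above is exactly the gap between those two calls.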
Re:In my experience... by Elektroschock · 2006-08-03 04:13 · Score: 1

I am looking for a way to sort messages by language. E.g. Italian mails and English mails. Do Bayes filters work?
Re:In my experience... by hackstraw · 2006-08-03 04:34 · Score: 1

spam is funded by idiots. we will never run out of idiots on the net.

Also, there are idiots that are independent of computers, email, etc.

I recently read about how 3 states in the US have loopholes in their credit laws, and how people with the same personality as a successful spammer (yes, many are multimillionaires) exploit them by loaning desperate people small amounts of money with up to a 25% initial fee and an APR over 300%.

There will always be profit in getting money from those who have it and are ignorant about the scam. It's amazing how almost all scams are pretty much unchanged from the beginning of scamming history, yet they still work. NIGERIAN scams are variations of the age old con of "I have come upon X, but can't collect on it because of Y, so if you help me, with little effort on your part, I'll give you some fraction of X, and we both win. Thanks for your help, and God bless".

Snake oil scams, work at home scams, FREE!!!! stuff scams, the list goes on and on. The best defense is always, "If it seems too good to be true, odds are it is".

However, I don't see scams stopping any time soon.
Re:In my experience... by Anonymous Coward · 2006-08-03 04:34 · Score: 0

I'd rather have a small trickle of spam emails (I can't believe I'm saying this, but hear me out) than I would risk missing out on that one truly important email.
If your important email looks like spam to the filters, then you have other problems.
Re:In my experience... by Red+Alastor · 2006-08-03 04:54 · Score: 1

Popfile does. It works extremely well as long as you don't change your mind about what each category means. If you do, reset the category and start training it again.

--
Slashdot anagrams to "Sad Sloth"
Re:In my experience... by Control-Z · 2006-08-03 04:55 · Score: 1

I think it probably could. It would take a day or two of training and I expect after that the accuracy would be 95%+.
Re:In my experience... by fatphil · 2006-08-03 22:59 · Score: 1

I read the _link_, but didn't follow it. Quite what Barret, and their Trifles, had to do with spam I really couldn't work out.

--
FatPhil

--
Also FatPhil on SoylentNews, id 863
Re:In my experience... by fatphil · 2006-08-03 23:06 · Score: 1

Nonsense. Much of the spam I get in my yahoomail inbox is from Domain Keys verified senders. And I can't block that particular sender, as it's yahoo.com itself, from which a fair proportion of my real mail comes.

--
FatPhil

--
Also FatPhil on SoylentNews, id 863
Re:In my experience... by I!heartU · 2006-08-04 04:28 · Score: 1

Hmm valid junk hadn't thought about that.

Why not just douse the server in gas... by shotgunefx · 2006-08-02 16:25 · Score: 3, Funny

400MB?

Why not just douse the server in gas if you want to see it melt.

--

-William Shatner can be neither created nor destroyed.

Re:Why not just douse the server in gas... by Tsiangkun · 2006-08-02 16:28 · Score: 5, Funny

I'm getting 8kb/s downloads from the site, it's just like the good old days !

I'll post more next week after I watch the video.
Re:Why not just douse the server in gas... by coffeeisclassy · 2006-08-02 16:42 · Score: 2, Informative

It's round-robin mirrored across a whole bunch of different servers, so if you're only getting 8kb/s you could try cancelling and downloading again and seeing if it goes faster.
Re:Why not just douse the server in gas... by Tsiangkun · 2006-08-02 16:56 · Score: 1

whoa, thanks,
I was happy to have a connection, so I was letting it run.
But now I'm downloading at a much more reasonable speed.
Re:Why not just douse the server in gas... by darkfish32 · 2006-08-02 17:07 · Score: 1

Funny, I don't know whether traffic has died down, they've increased bandwidth, or it's a matter of inter-university connections, but I'm getting well over 200KBps.

Combo of SpamAssassin and Spamhaus by hyperion454 · 2006-08-02 16:27 · Score: 2, Interesting

At work we've set up a combination of SpamAssassin and Spamhaus. Personally I've gone from about 10 spams per day to about 1 every two weeks.

Re:Combo of SpamAssassin and Spamhaus by b0r1s · 2006-08-02 16:35 · Score: 1, Insightful

Bah. We use Spamassassin, multiple DNSBLs, and I still get hundreds per day, most of them to addresses published on websites (unavoidable).

The key is still: don't give out your address. Once you've done that, you're going to be screwed eventually.

--
Mooniacs for iOS and Android
Re:Combo of SpamAssassin and Spamhaus by emag · 2006-08-02 16:45 · Score: 2, Informative

And turn off SMTP VRFY. Either that, or having Windows systems @ my ISP managed to get the address associated with my account on spam lists. This is an address that's *only* used internally by my ISP (I use pobox or my own domain whenever someone asks for an address). Even that wasn't enough to prevent it from getting harvested. :-(

--
"The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
Re:Combo of SpamAssassin and Spamhaus by antifoidulus · 2006-08-02 16:54 · Score: 2, Insightful

Heh, even if you are reasonably diligent in protecting your email address, 9 times out of 10 it will still get out (though maybe not as badly). All it takes is one recipient with a compromised Windows box and your address can be all over the spammers' lists in no time.
Or, as in my case, you could assume that a university you apply to will not send out a giant mass email to all the incoming graduate students inviting them to the graduate orientation. So now I have the email address of every grad student entering the University of Minnesota this year (and probably a few that aren't) and they have mine. All it takes is one infected box and my previously spam-free gmail account will no longer stay that way. The kicker is that I decided not to go to UMN because they didn't offer me funding...oy!

--
Monstar L
Re:Combo of SpamAssassin and Spamhaus by Etcetera · 2006-08-02 19:30 · Score: 1

And turn off SMTP VRFY.

SMTP VRFY (or recipient-checking at the SMTP level in general) being disabled is pointless. Given a choice between allowing people to not send mail to invalid addresses or having to deal with bounce-back scatter and getting your MX server blacklisted for third-party spam, I'll take the former any day.

And I'd wager anyone who's had to admin a qmail server and decide which (if any) recipient-checking patch to use would feel the same way.

It's far less load on the servers to have a more expensive spam identification process on the back end, than have to deal with the billions of messages generated by a dictionary attack on the front end.

--
Hire a Linux system administrator, systems engineer,
Re:Combo of SpamAssassin and Spamhaus by Anonymous Coward · 2006-08-02 19:40 · Score: 0

If your internet-facing mail hosts are capable of responding accurately to VRFY queries, then they're capable of rejecting mail for invalid recipients at the RCPT stage just as easily. Generating the bounce is then up to the client speaking to your server.

I don't see how enabling SMTP VRFY does anything to reduce backscatter etc.
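
To make the RCPT-versus-VRFY point concrete, here is a toy responder. The reply codes (250, 252, 550, 500) are standard SMTP codes, but the function `respond`, the user set, and the parsing are simplified assumptions and nothing like a complete server.

```python
VALID_USERS = {"alice", "bob"}  # hypothetical local mailboxes


def respond(command_line, valid_users=VALID_USERS, allow_vrfy=False):
    """Return an SMTP-style reply for a single VRFY or RCPT command."""
    verb, _, arg = command_line.strip().partition(" ")
    verb = verb.upper()
    if verb == "VRFY":
        if not allow_vrfy:
            # The usual 'VRFY disabled' answer: reveals nothing.
            return "252 Cannot VRFY user, but will accept message"
        local = arg.split("@")[0].lower()
        return "250 " + arg if local in valid_users else "550 No such user"
    if verb == "RCPT":
        # Argument looks like: TO:<alice@example.org>
        address = arg.partition(":")[2].strip().strip("<>")
        local = address.split("@")[0].lower()
        # Rejecting here pushes the bounce onto the sending client,
        # but it also answers the 'does this user exist?' question.
        return "250 OK" if local in valid_users else "550 No such user"
    return "500 Command not recognized"


print(respond("VRFY alice"))
print(respond("RCPT TO:<alice@example.org>"))
print(respond("RCPT TO:<mallory@example.org>"))
```

With VRFY off, the last two replies still tell a prober which addresses are valid, which is the trade-off being argued about here.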
Re:Combo of SpamAssassin and Spamhaus by xenobyte · 2006-08-02 23:43 · Score: 1

At work we've set up a combination of SpamAssassin and Spamhaus. Personally I've gone from about 10 spams per day to about 1 every two weeks.

Amazing! - We've been using that combo for a long time and I get about 5-10 spams AN HOUR coming through the filters (and about the same amount caught). This is all personalized spam sent to one specific email address. That address was used in the past for a few newsgroup postings, a few technical forums and it was listed on a webpage some time ago. No spam sent to it was ever opened with an unsafe email client, i.e. no phone-home webbug activity possible to verify it. The mailserver it lives on never supported VRFY.

My experience is that current spam has defeated both the Razor system and the Bayes system as well by simply making each spam extremely unique in every part of the content (and headers too). We get the best blockage using the old but still trusty RBLs (excluding SPEWS which lists way too much).

--
"For every complex problem, there is a solution that is simple, neat, and wrong." -- H.L. Mencken (1880-1956) --
Re:Combo of SpamAssassin and Spamhaus by jdowland · 2006-08-02 23:45 · Score: 3, Funny

The key is still: don't give out your address. Once you've done that, you're going to be screwed eventually.

Nah, that's such a half measure. The real solution is to not have an email address at all.
Re:Combo of SpamAssassin and Spamhaus by Etcetera · 2006-08-03 03:36 · Score: 1

That's exactly my point: it's the same data being made available... If you have recipient-checking turned on, then keeping SMTP VRFY off as an anti-spam matter-of-principle doesn't make much sense. Any probing software could just as easily send a RCPT TO: command as a VRFY one.

There may not be a benefit per se (since SMTP VRFY isn't all that widely used any more, it seems), but it offers little protection at the same time if you're doing recipient checking.

--
Hire a Linux system administrator, systems engineer,
Re:Combo of SpamAssassin and Spamhaus by Tony+Hoyle · 2006-08-03 04:07 · Score: 1

Yeah, bayes is pretty much beaten - either massive chunks of text appended to the email, or it all uses made-up words, or a combination of the two.

A favourite is to just send the email as a huge .gif so there's not enough to filter on, then fill the text portion with a dump of about 500 words of text from a legitimate site.

I've gone back to hard blocking with RBLs and using sender verification so that no email with a bad return address gets through (blocks some badly configured mailing lists but hey they need to fix their systems as I'm not interested in mail I can't reply to). Those two together block about 10,000 spams a day, with about another 1000 or so getting through to spamassassin and caught there - leaving about 20-30 a day hitting the inbox.
Re:Combo of SpamAssassin and Spamhaus by andersa · 2006-08-03 06:03 · Score: 1

Personally I've gone from about 10 spams per day to about 1 every two weeks.

Luxury!
Re:Combo of SpamAssassin and Spamhaus by wboelen · 2006-08-03 06:56 · Score: 1

The key is still: don't give out your address.
I've always been careful selecting the people who I give my "private" address. Never had a problem. Up till the point some trusted person included me in some lame "help me my daughter is dying!" chainmail :(

Fantastic Spam Filters Which Work Best Proving! by _vSyncBomb · 2006-08-02 16:30 · Score: 5, Funny

Hey Slashdot, what's up, man! Dude, I read your thing and like totally agree about Best Work Proving Spam Site Work! Dude, that's awesome!

Bro, in the same vein, I was totally checking out this dope ass site which you might wanna check out too man. Guys like us that dig Spam Which Proving and Best work Filters will be all over this before long...

OK, man take care until I see you this Friday at the dinner thing, Slashdot!

Cheers,
John

Under present IST policy... by patio11 · 2006-08-02 16:39 · Score: 3, Funny

... they are not allowed to douse the servers in gas.

--
Help poke pirates in the eyepatch, arr.

RTFA? by glowworm · 2006-08-02 16:39 · Score: 4, Insightful

So, how are we supposed to RTFA when the FA is over 470MB and a video file? Why not just a nice simple text summary, Mr Submitter? But nooooo, that would just be too easy!

--
Orationem pulchram non habens, scribo ista linea in lingua Latina

Re:RTFA? by emag · 2006-08-02 16:55 · Score: 5, Funny

"We are sorry that these talks are not available as plain HTML, PDF, or text, however under present IST policy we are not allowed to provide plain HTML, PDF, or text."

--
"The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
Re:RTFA? by Enderandrew · 2006-08-02 17:10 · Score: 1

Yes, but the person submitting the story to Slashdot when preparing their little blurb could have spilled the results.

--
http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
Re:RTFA? by cerberusss · 2006-08-02 18:05 · Score: 1

Yeah now the tubes are full again.

--
8 of 13 people found this answer helpful. Did you?
Re:RTFA? by alexhs · 2006-08-03 00:18 · Score: 1

the person submitting the story to Slashdot[...]could have spilled the results.

What makes you think the submitter did actually watch the video?

--
I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
Re:RTFA? by Crayon+Kid · 2006-08-03 02:29 · Score: 1

Yeah that's rich. This is Slashdot, it's a badge of honor to NOT RTFA.

--
i ate crayons when i was a kid and now i have two braincells and the blue ones taste nicer
Re:RTFA? by sootman · 2006-08-03 03:19 · Score: 1

Likely that the submitter didn't WTFV.

--
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
Re:RTFA? by shokk · 2006-08-03 04:33 · Score: 0

Wow, that's a great policy. Because a giant video will kill the server less than a few HTML files. And frankly most here will either quit watching after the first minute or the first few pages of an HTML presentation anyway, so HTML wins, where the file requires you to download the whole thing before making a decision to only watch a tiny piece.

--
"Beware of he who would deny you access to information, for in his heart, he dreams himself your master."

A good DUL helps by winkydink · 2006-08-02 16:42 · Score: 1

DUL = DialUp List... a bit of a misnomer as it commonly refers to all dynamic hosts. My spam went down dramatically after starting to use Trend's DUL (formerly MAPS). Alas, it's a pay service, but it all comes down to your pain threshold. Mine is low relative to my income.
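
For context, lists like this are normally consulted over DNS: the connecting IPv4 address is reversed, the list's zone is appended, and an answer means "listed". The sketch below shows that convention only. The zone `dul.example.org` is a placeholder, not the real zone of any service, and the function names are made up.

```python
import socket


def dnsbl_query_name(ipv4, zone):
    """Build the DNS name to look up: reversed octets plus the list's zone."""
    octets = ipv4.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ipv4, zone, resolve=socket.gethostbyname):
    """Return True if the list answers for this address.

    `resolve` is injectable so the logic can be tried without network access.
    """
    try:
        resolve(dnsbl_query_name(ipv4, zone))
    except OSError:  # socket.gaierror (lookup failed) is a subclass
        return False
    return True


print(dnsbl_query_name("192.0.2.99", "dul.example.org"))
```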

--

"I'd rather be a lightning rod than a seismometer." -Ken Kesey

Not surprising... by RealGrouchy · 2006-08-02 16:45 · Score: 4, Insightful

Although I haven't WTFV (watched the video), it doesn't seem surprising that spam filters which use techniques that aren't used widely would be most successful.

If they aren't used widely, it would either be because they don't work, or they do work but they haven't caught on [yet].

It's like any other fad. As an example, when the original Survivor series came out, it was really popular because it achieved its goal (attracting viewers) in a way that was original. Heck, even I watched the original one. Now that all the networks are doing the reality TV thing, it has become hackneyed, and each successive version of survivor does a worse job of achieving its goal. And I've given up watching TV.

With antispam, new techniques are effective, but as they become more popular and more widely used, spammers will find equally innovative ways of getting around them.

I've noticed that at any given time, there will be a particular style of (non-blank) spam that manages to get through Gmail's filters fairly consistently, but every now and then Gmail adapts its spam filters to block the successful spam type of the season, and eventually a new type will make its way through.

- RG>

--
Hey pal, this isn't a pleasantforest, so don't waste my time with pleasantries!

Re:Not surprising... by Tweekster · 2006-08-02 17:13 · Score: 1

Spam is easy to take care of, well 99% of it. The rest isn't a big deal so who cares.

My office went from 2000 spam mails a day to about 10. across 15 employees. Who gives a crap about the 10 emails remaining...

I only wish it could be taken care of further upstream to shut those pricks down. But for the end user, from an admin's perspective, most systems are pretty easy to deal with (particularly small offices)

--
The phrase "more better" is acceptable English. suck it grammar Nazis
Re:Not surprising... by laa · 2006-08-02 17:32 · Score: 1

Yupp I agree. My personal spams index (spams per day) has slowly risen the last few years and is now just below 400. Of the weekly 2000 spams around 4-5 pass SpamAssassin. So far this year I've had one false positive (that I know of), but browsing through the >80000 spams isn't that fun so I rarely do it.

It's a silly waste of bandwidth sending me all those viagra commercials, but at least they're easy to get rid of.

--
Why does the kernel go through stable and then unstable forks? Can't it always be a stable build, like with Windows?

Got to go with Brightmail by saha · 2006-08-02 16:46 · Score: 4, Informative

We use Brightmail on our campus and our users love it with its very low false-positive rate and pretty accurate flagging of SPAM. Another campus uses DSPAM and some people are up in arms at the prospect of losing their Brightmail to switch to DSPAM. Personally, DSPAM isn't nearly as good and has flagged many legitimate messages and sent them to the Junk folder.

I also echo a gripe of other posters. It's nice to have a video, but a 500MB video file is a bit much. A 50KB pie chart or bar graph would have been nice.

Re:Got to go with Brightmail by Anonymous Coward · 2006-08-03 02:32 · Score: 0

What he's said in regard to what Brightmail guarantee is incorrect. Brightmail does not claim to have a zero false positive rate, but a "99.9999 percent accuracy rate against false positives". That means one false positive in a million mails.
Re:Got to go with Brightmail by hacker · 2006-08-03 03:14 · Score: 2, Informative

Personally, DSPAM isn't nearly as good and has flagged many legitimate messages and sent them to the Junk folder.

And what happened when you retrained those false positives as ham? Did you see future mails of the same/similar type get caught again? I bet you didn't.
I've been using dspam for a very long time for my users, and they love it. They love having zero spam in their mailbox, they love the simplicity of the user interface. They love how it treats users on a per-user basis, not globally (i.e. some users WANT html emails, some do not. Each can mark them as they see fit.)
Here's an example of my own stats:

hacker:
TP True Positives: 122601
TN True Negatives: 124711
FP False Positives: 211
FN False Negatives: 1046
SC Spam Corpusfed: 3708
NC Nonspam Corpusfed: 456
TL Training Left: 0
SHR Spam Hit Rate: 99.15%
HSR Ham Strike Rate: 0.17%
OCA Overall Accuracy: 99.49%
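
For anyone wanting to sanity-check figures like these, the three percentages can be recomputed from the four raw counts. The formulas below are the usual confusion-matrix definitions and are an assumption about how dspam derives its summary lines; they do reproduce the numbers quoted above.

```python
# Counts copied from the stats quoted above.
tp, tn, fp, fn = 122601, 124711, 211, 1046


def rate(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Assumed definitions (standard confusion-matrix ratios, not taken from dspam's source):
spam_hit_rate = rate(tp, tp + fn)                     # share of spam that was caught
ham_strike_rate = rate(fp, fp + tn)                   # share of ham wrongly flagged
overall_accuracy = rate(tp + tn, tp + tn + fp + fn)   # share of all mail classified correctly

print(spam_hit_rate, ham_strike_rate, overall_accuracy)
```

The ham strike rate is the number to watch if lost legitimate mail is the main worry, since overall accuracy can look fine while false positives pile up.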
Re:Got to go with Brightmail by sholdowa · 2006-08-03 09:47 · Score: 1

That's one rich university you're at then!

Flaw in the test by lheal · 2006-08-02 16:48 · Score: 5, Informative

The spammers actively try to subvert the more popular filters. That gives a lesser-known one a decided advantage, one which will go away as it becomes more popular.

As with most choices like this, factors such as ease of use, speed, and resource efficiency can overshadow selectivity. No system is perfect, so it's perfectly reasonable to go with a system that's pretty good if you already are using it, rather than switching to the latest cool thing.

I have found that using two dissimilar systems in a chain is quite effective.
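
A rough sketch of what chaining two dissimilar systems can look like: each stage looks at a different property of the message, and the mail is treated as spam as soon as any stage objects. The stage names, the blocklist entry and the word list are all made up for illustration; a real setup would plug in an actual RBL lookup and a trained filter.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Stage = Callable[[Message], bool]   # a stage returns True when it thinks the mail is spam

BLOCKED_IPS = {"203.0.113.7"}               # hypothetical blocklist entry
SPAM_WORDS = {"viagra", "stock", "pills"}   # hypothetical toy word list


def blocklist_stage(msg: Message) -> bool:
    """Stage 1: judge the sender's IP only, ignoring the content."""
    return msg["ip"] in BLOCKED_IPS


def keyword_stage(msg: Message) -> bool:
    """Stage 2: judge the content only, ignoring the sender."""
    words = msg["body"].lower().split()
    hits = sum(1 for w in words if w in SPAM_WORDS)
    return hits >= 2


STAGES: List[Stage] = [blocklist_stage, keyword_stage]


def classify(msg: Message, stages: List[Stage]) -> str:
    """Run the stages in order; the first one that objects decides."""
    for stage in stages:
        if stage(msg):
            return "spam"
    return "ham"


print(classify({"ip": "198.51.100.1", "body": "cheap viagra pills"}, STAGES))
```

The point of keeping the stages dissimilar is that a spam crafted to slip past one kind of check still has to get past the other.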

--
Raise your children as if you were teaching them to raise your grandchildren, because you are.

Re:Flaw in the test by MadAhab · 2006-08-02 16:57 · Score: 1

Excellent point.

And that applies to spam filtering techniques as well - it's like anti-biotics. For serious stuff, a spread attack is a good idea.

I've found that using RBLs, SpamAssassin, and Bayesian filters prevents 99.5% of spam with essentially no false positives. And that means, by my day-to-day experience with addresses spammed for a full 10 years now, that instead of getting 100 spam and one real mail, I get 1 real mail, and once every couple of days a spam that gets through.

Except for earlier this year. The RBLs went a little nuts, probably in response to some spam onslaught, and generated a few false positives.

--
Expanding a vast wasteland since 1996.
Re:Flaw in the test by Anonymous Coward · 2006-08-02 17:21 · Score: 0

I love this comment. Put it in this anti-spam context and people nod their heads sagely. Put it in an OS thread and do a s/spammers/crackers and s/filters/OSes, and people start foaming at the mouth :)
Re:Flaw in the test by Jeffrey+Baker · 2006-08-02 17:26 · Score: 2, Insightful

The problem with the spam filters, which you have stated, is that eventually a spammer figures out how to craft a spam which avoids the feature detection systems. Right now there's some zombie network sending around a stock market scam, of which I am getting roughly 300 copies per hour, even though spamassassin correctly classifies virtually all other unwanted mail.

Lately, I've been thinking about this problem a lot. The classic methods of computer classification (Bayes, SVM, whatever) are all based on trying to detect features in a set of objects which separate the objects into two classes. But there is only one feature which is shared by all spam, and which is not shared by mail I wish to receive: all spam is sent by assholes. The problem is, you can't algorithmically detect the asshole coefficient solely from the contents of an SMTP transmission. Therefore I have recently come to the conclusion that we need to revert to a web of trust for accepting email. I have long avoided webs of trust because they seem difficult to manage, but I've come to believe that they are the only way to solve this spam problem.
Re:Flaw in the test by jcr · 2006-08-02 17:45 · Score: 1

Right now there's some zombie network sending around a stock market scam, of which I am getting roughly 300 copies per hour, even though spamassassin correctly classifies virtually all other unwanted mail.

If you're talking about spam with the pump & dump message in an image, and random-words text, I'm getting about a dozen of those a day. They're one of three types that's getting through my filters currently. 300 copies per hour would make me just about ready to kill somebody.

I have long avoided webs of trust because they seem difficult to manage, but I've come to believe that they are the only way to solve this spam problem.

Well, I'm also in favor of hiring goons to change the cost/benefit equation for the spammers.

-jcr

--
The only title of honor that a tyrant can grant is "Enemy of the State."
Re:Flaw in the test by prandal · 2006-08-02 20:14 · Score: 1

Use SA 3.1.4 and run-sa-update.

Theo van Dinter added a rule to catch these to the core rules on Tuesday.
Re:Flaw in the test by shawn.fox · 2006-08-02 23:59 · Score: 2, Interesting

Right now there's some zombie network sending around a stock market scam, of which I am getting roughly 300 copies per hour, even though spamassassin correctly classifies virtually all other unwanted mail.

Do you happen to use Ameritrade? I started receiving these emails this Sunday myself (July 30). Since I always use disposable email addresses, I immediately noticed that the email was being sent to the disposable address I had created for Ameritrade. I sent them an email complaining about it and accusing them of either giving away my email address to some third party who was spamming me or having had customer account information stolen from them. I have yet to hear any response back from them.
Re:Flaw in the test by perlchild · 2006-08-03 02:36 · Score: 2, Insightful

A web of trust will work only until the computer of someone you trust gets subverted. The zombie network you mentioned doesn't happen by itself. Now, the smaller and more technically proficient the web of trust, the less likely it is to be subverted, but it's still vulnerable to someone you trust having their computer hijacked.
Re:Flaw in the test by Tony+Hoyle · 2006-08-03 04:10 · Score: 1

I had one the other week posting various combinations of viagra, vlzagra, v1zAgra, etc. presumably from a zombie network since every one I reported came from a different ISP.

Was getting several hundred of them an hour for nearly a week before I worked out a rule to stop them - spamassassin still marks them 0.1 + 10.0 with my rule.

obscurity by TheSHAD0W · 2006-08-02 16:49 · Score: 1

It may not be coincidence that a little-known filter algorithm produces the best results; many spammers probably test their spew on the more popular filters to try and fool them. If this new filter becomes more popular you may see its reliability decay.

Re:obscurity by pe1chl · 2006-08-02 20:23 · Score: 1

This is very true.
I have a successful spamfilter deployed at work. It uses SpamAssassin for the backend filtering, but that part has to do very little.
The bulk of the rejecting is done in the dedicated SMTP engine that receives the mail. There is a lot of information to be deduced from the SMTP transaction itself, which is normally not used by spamfilters.
Close adherence to RFC standards is something that most SMTP servers have achieved quite well, and the tools the spammers use are very bad at it.
I know several "bugs" in those spamtools that make them easy to identify and make it simple to discard spam without even receiving the body.
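
To give a flavour of what an SMTP-level check can look like without giving anything away, here is a sketch of one well-known heuristic: looking at the name the client gives in HELO/EHLO before any message body is accepted. This is not one of the "bugs" mentioned above; the function name and the rules are illustrative assumptions, and a real engine would apply many more tests.

```python
import re

# One DNS label: letters, digits and hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$")


def helo_looks_valid(helo: str, own_hostname: str) -> bool:
    """Return False for HELO names a well-behaved mail server would not send.

    Hypothetical helper; the three rules are common heuristics, not a full RFC check.
    """
    # Address literals such as [192.0.2.1] are allowed; contents are not validated here.
    if helo.startswith("[") and helo.endswith("]"):
        return len(helo) > 2
    # A remote client claiming to be *our* server is lying.
    if helo.lower() == own_hostname.lower():
        return False
    # Expect a fully qualified name: at least two well-formed labels.
    labels = helo.split(".")
    if len(labels) < 2:
        return False
    return all(_LABEL.match(label) for label in labels)


print(helo_looks_valid("localhost", "mx.example.net"))
```

A failed check can be answered with a rejection right after HELO or RCPT, so the body never has to be transferred at all.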

But unfortunately, if such a filter were widely released, the spammers would of course fix the bugs, and the effectiveness of the filter would be gone.

whitelist by Anonymous Coward · 2006-08-02 16:49 · Score: 0

whitelist

Little known systems will often be most effective by thesleepylizard · 2006-08-02 16:52 · Score: 1

Against viruses and spam. For obvious reasons - hackers and spammers put their efforts into circumventing the major systems since this will maximize the impact of their work. That's why smaller anti-suckware products will often do a better job since the focus isn't on them.

It leads to a sad but inevitable cycle of products being improved, gaining popularity, then losing their effectiveness since they are now a bigger target.

At least, until a watertight (rather than guess-work) solution is found. I believe this is impossible without changing the way email works at a fundamental level. Even the much praised challenge-response is subject to email spoofing.

Reminds me of why I like living in Australia - globally speaking we're relatively irrelevant, making us a relatively small target. Hopefully we'll stay relatively irrelevant, lol :p

Harder! by Profane+MuthaFucka · 2006-08-02 16:53 · Score: 5, Funny

I uuencoded the video file, translated it into Sumerian cuneiform, and pressed it into a billion little clay tablets. They are cooking in my oven right now. Now, the Internet is NOT some kind of truck you can just dump stuff onto, so if you want to get the data you're going to have to come to my house.

--
Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!

Re:Harder! by rts008 · 2006-08-02 17:05 · Score: 4, Funny

I can't come to your house, you insensitive clod!, teh tubes are clogged with clay tablets!

I won't be able to download my internet until Friday now!

Turn that crap down, and get off of my lawn! Damn kids!

--
Down With Slashdot BETA!!! I've been around the corner and seen the oliphant; you can only abuse me from your perspecti
Re:Harder! by Cylix · 2006-08-02 17:35 · Score: 1, Funny

Excellent...

By chance, are you nearby?

I have a wonderful set of wikipedia tablets I made and I'm eager to offload them...er I mean... trade them.

It's the updates you see, I've been having a bit of a nightmare trying to keep them all in sync.

--
"You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
Re:Harder! by ciscoguy01 · 2006-08-02 18:26 · Score: 1

Now, the Internet is NOT some kind of truck you can just dump stuff onto, so if you want to get the data you're going to have to come to my house.

No, I understand the internet is actually a series of tubes, and there will be hell to pay if they get "full".

--
.
Re:Harder! by cruachan · 2006-08-02 21:43 · Score: 4, Insightful

Don't knock it, cuneiform on baked clay is the single most successful format for long-term storage ever invented - 3000 years and counting. Heck, most of our modern storage formats can't even manage 30 - tried to read an 8" floppy recently?
Re:Harder! by Jartan · 2006-08-02 22:28 · Score: 2, Insightful

I'm not going to knock it but your statement is very far from the truth. Determining the "most successful" long term storage method invented would require waiting till the year 5xxx something to see if something we've currently invented beats cuneiform. Even then it's pretty hard to prove one way or another, since a lot of the cuneiform we have today is being carefully taken care of to prolong its lifetime, I'd suspect (though I have no confirmation of that part).
Re:Harder! by ozmanjusri · 2006-08-02 23:02 · Score: 2, Insightful

I'm not going to knock it but your statement is very far from the truth.
Yep, you're right. The best long-term information storage media ever invented is poetry.

--
"I've got more toys than Teruhisa Kitahara."
Re:Harder! by cruachan · 2006-08-02 23:24 · Score: 1

Presumably the tablets in museums are, but I was listening to an archeologist recently who was saying that the advantage of clay tablets is that they are virtually indestructible unless you purposely take a hammer to them. Compare that to papyrus and parchment, which can be preserved a long time but need special circumstances, such as being buried in a bog, to stop decay.

Apparently because of this there are vast amounts of Sumerian and related texts awaiting translation (the language was only deciphered at the beginning of the 20thC and naturally there's not a great many people who can translate it) ranging from business accounts to high literature. Much of it is on tablets that were recycled as building material and the like - there's literally piles of it and at the current rate it will take several hundred years for scholars to get through.
Re:Harder! by the_xaqster · 2006-08-02 23:33 · Score: 1

Either you have made the tablets very, very small, or you have one big-ass oven!

--
I'm just here to regulate Funkyness
Re:Harder! by Squalish · 2006-08-03 00:30 · Score: 4, Insightful

Am I the only one that read the means of presentation as a hilarious attack on a university policy of blocking bittorrent? Given that adding 470MB doesn't really add any usable information to a discussion about spam filters over a piece of text, and all.

Your college doesn't like bandwidth-efficient delivery? Flood them with a Slashdot effect on a 500mb file, an extra $500 in bandwidth charges, and maybe they'll change their tune.

--
People in Soviet Russia, however, appear to be afflicted with amusing juxtapositions of the aforementioned situation
Re:Harder! by Anonymous Coward · 2006-08-03 01:03 · Score: 0

It's true, ask the man from Nantucket
Re:Harder! by Crayon+Kid · 2006-08-03 02:12 · Score: 2, Funny

Bwahaha, I'm moving my blog to clay tablets. They will undoubtedly survive the next Ice Age and the people of year 5000 will be forced to read about my cat, how I hate Emo's and that guy at work who doesn't wash. But first I'll change my blog nick to "Earth Imperial Overlord Supreme", just to fuck with them future dudes.

--
i ate crayons when i was a kid and now i have two braincells and the blue ones taste nicer
Re:Harder! by hesiod · 2006-08-03 03:26 · Score: 1

You are ignoring the many thousands of tablets that DIDN'T make it this long...
Re:Harder! by Morkano · 2006-08-03 03:57 · Score: 2, Interesting

You know, for a university with supposedly the best engineering and CS programs in Canada, their actual use of technology is pretty crazy. You'd think they'd understand it well enough to realize that bit torrent is a great delivery method.

I remember when I applied to go there, I didn't get the email stating my acceptance until weeks and weeks after I got the physical package. Ha!

--
Victory or awesome!
Re:Harder! by Anonymous Coward · 2006-08-03 03:57 · Score: 2, Informative

Hey, we can't help it if people decide to post our videos to ./ and Digg!
[/innocence]

Here are UW's traffic stats, in case anyone's interested:
http://noc.uwaterloo.ca/cgi-bin/14all.cgi?log=cn-rtext_gi2&cfg=cn-rtext.cfg

Also note the spikes on Monday and Tuesday from when we posted our last two talks.
Re:Harder! by cruachan · 2006-08-03 04:01 · Score: 2, Informative

True, but as I said in my other comment, there's literally mounds of baked clay tablets because they are so indestructible. Apparently they used to get shovelled into foundations and the like. The estimate I heard was that at current rates it will take scholars several hundred years to translate what we've found already. Compare that to parchment records where the discovery of even a few new scraps is a major event (http://news.bbc.co.uk/1/hi/sci/tech/5235894.stm and particularly http://news.bbc.co.uk/1/hi/world/europe/5216320.stm). Point is, in the race for the most successful long term storage mechanism, cuneiform on baked clay is way ahead of the field; nothing else comes close.

Excellent 'In Our Time' programme on Babylon and its Literature here - http://www.bbc.co.uk/radio4/history/inourtime/inourtime_20040603.shtml
Re:Harder! by Elektroschock · 2006-08-03 04:10 · Score: 1

How does a university "block" bittorrent?

Are there provisions which prohibit university content to get delivered via torrents?
Re:Harder! by Anonymous Coward · 2006-08-03 06:25 · Score: 0

They make a stupid "anti-P2P" policy not realizing that BitTorrent is just a protocol and nothing more. One that can be used to send both pirated files and non-pirated files, just like HTTP and FTP can. Honestly, I've seen a few such policies which prohibit downloading "copyrighted material" and neglect to take notice of the fact that that's only unlawful if you don't have the copyright holder's permission! Worse, every damn thing that's copyrightable IS copyrighted by default, unless perhaps they disclaim their copyright into the public domain! In other words, that policy forbade doing ANYTHING whatsoever. Idiots.

If the university wasn't a dumbass, they'd use traffic shaping to make sure that HTTP/telnet/SSH/etc. work fine, throttle all other traffic so that it doesn't interfere too much, and make their policy prohibit *copyright infringement* rather than prohibiting random protocols which happen to be every bit as useful for legitimate content as they are for piracy.

Now if you'll excuse me, I'd like to be gettin' back to me piracy. Arrrh!
Re:Harder! by turbidostato · 2006-08-03 10:57 · Score: 1

"the language was only deciphered at the beginning of the 20thC (...) ranging from business accounts to high literature"

Huh! I can figure that there must be some babylonian over there quite quite angry when somebody actually deciphers his home rent bill... with 3000 year overdue charges!
Re:Harder! by terrox · 2006-08-03 16:53 · Score: 0

The fact that someone did 'research into spam filters' is not news - the answer to the question is the news. It would have fit in the summary, so why not just.. bugger it, time to move to a real news site.
Re:Harder! by Anonymous Coward · 2006-08-03 17:18 · Score: 0

That's nothing I have a 10" floppy, which is damn near 15" when it gets hard.
Re:Harder! by ananamouse · 2006-08-06 14:36 · Score: 0

> can't even manage 30 - tried to read an 8" floppy recently?
I have a PDP 11 over there that booted off one last summer. >/

Re: Very Interesting And Generally Really Amusing by Anonymous Coward · 2006-08-02 16:55 · Score: 5, Funny

Hey _vSyncBomb,

Having trouble pleasing your woman? I've got something Very Interesting And Generally Really Amusing that you could try!!!

Your buddy,
_vAnoymousCoward

I got the 400M download! by Ossifer · 2006-08-02 16:55 · Score: 3, Funny

And I printed out every frame so I could scan them. I'll be posting the TIFFs on my website shortly...

Re:I got the 400M download! by Anonymous Coward · 2006-08-03 04:14 · Score: 0

Inkjet printer companies are loving you right now!
Re:I got the 400M download! by Anonymous Coward · 2006-08-03 04:23 · Score: 0

Cool, can you send me the frames by email too? Make sure they are printed in A3 and scanned at least at 2000dpi, otherwise they don't look good.

"SpamAssasian"? by Anonymous Coward · 2006-08-02 16:58 · Score: 0

Yes, I do get a lot of spam dealing with asses of the Asian variety. Luckily, most of it is tagged as such by Gmail's filter.

Torrents by shack420 · 2006-08-02 16:59 · Score: 1

I see that the organization is not authorised to host a torrent. Would it be possible for someone who has downloaded the video to put one up somewhere? I'd be interested to see what kind of speed we would get out of a /. torrent too...

Re:Torrents by Pantero+Blanco · 2006-08-02 17:09 · Score: 2, Interesting

I wonder how hard it would be for Slashdot/OSTG to host a tracker for large, article-related files like this. I don't think it would require a lot of funding to run, and it would certainly help with convention presentation videos.

I have one word: by get+quad · 2006-08-02 17:05 · Score: 1, Offtopic

Postini.

--
"To err is human, to mod Funny divine."

Re:I have one word: by bitserf · 2006-08-02 17:13 · Score: 1

IronPort works extremely well too. Be prepared to pay enterprise prices though.
Re:I have one word: by Anonymous Coward · 2006-08-02 17:27 · Score: 0

As an ISP I use them; they aren't that hot. And good luck trying to resolve issues regarding authentication, or anything for that matter.
Re:I have one word: by Jeffrey+Baker · 2006-08-02 17:38 · Score: 2, Informative

I hope you also have another word, because the Postini service is incredibly bad. I had it enabled on my account at acm.org, and the Postini system was generating roughly one false positive for every 10 true positives. I disabled the Postini filtering and started using Spamassassin. Both the false positive and false negative rates are much improved. Among the traffic that Postini was flagging as spam were the Wikipedia article of the day, my daily email from musicbrainz.org, all messages to the BATN mailing list, many replies to my items for sale on craigslist, and other kinds of completely legitimate traffic. Among the mail they chose to deliver were messages in Korean, Cyrillic, other scripts I can't read, and known viruses.

Their main problem is the system doesn't learn. Using their web interface, I look through the spam folder and request delivery of all the false positives. The next day, nearly-identical mails are still generating false positives. You'd think it would be easy these days to design a filter that learns from negative reinforcement.
Re:I have one word: by slugstone · 2006-08-02 19:15 · Score: 1

I hate the Postini interface. It is not very user friendly. At least with spamassassin you can build the interface you like.
Re:I have one word: by Richard+Steiner · 2006-08-03 02:57 · Score: 1

Postini is easily trainable, however. All you need to do is add the sender of each false positive to the pass list, and you're golden. I've been using both SpamAssassin and Postini for a few years now, and Postini does a better job for me over time once it's been properly trained.

--
Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
The Theorem Theorem: If If, Then Then.
Re:I have one word: by Jeffrey+Baker · 2006-08-03 04:21 · Score: 1

So I should add every person who sends mail to a busy email list to a white list, one by one? No thanks.
Re:I have one word: by get+quad · 2006-08-03 14:58 · Score: 1

no, because you clearly misunderstand.

--
"To err is human, to mod Funny divine."
Re:I have one word: by Richard+Steiner · 2006-08-04 04:53 · Score: 1

So I should add every person who sends mail to a busy email list to a white list, one by one? No thanks.

No. Postini has a feature specifically for mailing lists: you can specify the single TO or FROM address which is associated with that list, and Postini will pass everything which contains the address specified for that list.
Here's the screen that Postini uses.

--
Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
The Theorem Theorem: If If, Then Then.

I hate to say this but... by Crydee · 2006-08-02 17:09 · Score: 1

I never used email because of the spam problem and the rampant use of IMs, but since I started using G-Mail I never get spam in my inbox, and my instant message time has dropped by about 70%, I'd say. Whatever G-Mail uses is what I would use if I were using a client to download my emails.

Re:I hate to say this but... by PinkyDead · 2006-08-02 21:41 · Score: 1

I was interested myself in installing whatever Gmail uses, cos it does feel like the gmail filter works very well. I had an existing (old) spamassassin installation, but the following investigation convinced me to stay with SA and just upgrade.

http://taint.org/2004/04/15/033025a.html

Whether it holds statistical value or not, is debatable.

Still, Gmail impresses. (And can only get better).

--
Genesis 1:32 And God typed :wq!
Re:I hate to say this but... by commanderfoxtrot · 2006-08-03 02:08 · Score: 1

Gmail and others e.g. Hotmail should theoretically have the best rates of spam filtering. Millions of users manually clicking "Report Spam" on SPAM must beat even high-flying automatic code for categorising SPAM.
The more SPAM that comes through, the better their filters should get. Having said that, I get up to 10 spams per day through Gmail, although my spam folder is currently 4684-strong (holds 30 days of spam). Which is OK by me.

It would be interesting to know how they filter it, or for them to give the outside world access to their spam collection.

--
http://blog.grcm.net/

An alternative format by Anonymous Coward · 2006-08-02 17:12 · Score: 1, Funny

All someone needs to do is rig this video up to the wonderful Microsoft Voice Recognition software, and then post the resultant transcript. Surely it won't have that many errors...

text versions of the material by martin-boundary · 2006-08-02 17:13 · Score: 5, Informative

For those who don't relish downloading 400MB worth of video (why can't somebody cut out the audio as a standalone MP3?), the material of the talk is also available in text mode.

The official tests of spamfilters were done in last year's TREC conference, you can read the writeup here (or pdf overview).

You can duplicate those tests yourself if you download the evaluation toolkit (GPL). It's a modular system where you can add a mail corpus (either one of the public TREC ones, or you can make your own trivially), and add a spamfilter package (there are 10 or so to download from the web, or create your own as per documentation).

There's also a video talk given at Microsoft research which should cover pretty much the same ground, if text mode is slashdotted :).

There's a new scheduled test towards the end of the year at TREC 2006.

Re:text versions of the material by IMarvinTPA · 2006-08-02 23:40 · Score: 1

Which link is the write up?

IMarv

--
Trusting software vendors is no smarter than trus

just got this in my inbox by noneme · 2006-08-02 17:21 · Score: 1

from: gordy to: me

SPAM FILTERS WORK

important filters - SPAM

Download Spam Filters in a number of formats: ,XviD(473M) ,DiVX(473M) SEX ,MPG(472M) ,OGG/Theora(481M) ,Real Media(471M) ,WIN ,Windows Media(476M) ,FREE ,SEX ,WIN

BUY SPAM FILTERS

Gord Cormack talk about the science, logistics, and politics of Spam Filter Evaluation.

Re:fuck power went out! by lewp · 2006-08-02 17:25 · Score: 1, Funny

I think it's trying to communicate with us...

--
Game... blouses.

Only one question... by fm6 · 2006-08-02 17:26 · Score: 1

Is there any filter that doesn't give false positives? I don't mean "almost none", I mean zero . It isn't a matter of "holding out for perfect". Some of us simply can't afford to have a key email discarded as "spam".

Re:Only one question... by Jeffrey+Baker · 2006-08-02 17:33 · Score: 2, Insightful

There is no classification system with zero real risk, except for delivering all mail to the Inbox. Sorry.

If your mail is that important, you should be using couriers instead of email.
Re:Only one question... by Anonymous Coward · 2006-08-02 17:34 · Score: 0

Nonsense. Not even a human brain could accomplish that... and that's what Quarantines are for.
Re:Only one question... by Cylix · 2006-08-02 17:39 · Score: 1

Well,

You could have it only filtered completely if its suspect rating is high enough and then otherwise just tag it if the rating is below a certain point.

That said... white lists are your friends.

Funny thing though... someone forwarded me some "funny" e-mail and usually they are not that humorous. I was so damned pleased when it was filtered out.

That said, I haven't moved to deletion just yet. I just tag the mail and sort it later. As soon as I'm sufficiently happy with the system highly suspect mails can get purged auto-magically.
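
The tag-versus-purge idea, with a whitelist in front, boils down to something like the sketch below. The threshold values, the function name and the addresses are placeholders for illustration, not settings from any particular filter.

```python
def decide(sender, score, whitelist, tag_at=5.0, discard_at=15.0):
    """Pick an action for one message.

    sender     -- envelope or From address
    score      -- suspect rating from whatever filter is in use (higher = more spammy)
    whitelist  -- set of lower-case addresses that always get through
    tag_at / discard_at -- hypothetical thresholds; tune them to taste
    """
    if sender.lower() in whitelist:
        return "deliver"          # known-good senders skip the filter entirely
    if score >= discard_at:
        return "discard"          # only very high ratings are purged outright
    if score >= tag_at:
        return "tag"              # borderline mail is marked and sorted later
    return "deliver"


friends = {"boss@example.com"}
print(decide("stranger@example.org", 7.2, friends))
```

Until the system has earned some trust, setting discard_at very high (or to float("inf")) keeps everything in tag-only mode.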

--
"You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
Re:Only one question... by cruachan · 2006-08-02 19:30 · Score: 1

Cloudmark's safetybar product (http://www.cloudmark.com/ - lousy name, SpamNet which it was before was far better) is just about perfect for me. I get an average of about 20 spam emails a day and it has a false positive result of 0% and has had for months. In fact I've been using the product for several years now and I think the last time I saw a false positive was a couple of years back.

On the efficiency side it has a hit rate of nearly 100%. I would have said it was 100% a couple of months back, but just recently it's been having a bit of a problem with one stock-pushing spam.

Anyway, that aside it's the best spam filter I've ever seen by a very long way, and I'd highly recommend the service. It costs a few $ a month, but it's probably the best value subscription I have.

I have no connection with the company, just a very satisfied customer. The P2P nature of the product places it outside the usual spam filters so it's often missed from reviews.
Re:Only one question... by Anonymous Coward · 2006-08-02 19:33 · Score: 0

Is there any filter that doesn't give false positives? I don't mean "almost none", I mean zero . It isn't a matter of "holding out for perfect". Some of us simply can't afford to have a key email discarded as "spam".

Yes, indeed! The filter is called "pass-through". It only has one down side, all spam passes too!!

Now seriously, consider using a white list of known good senders (your clients?), and use the spam filter with those not in the list only.
Re:Only one question... by Eivind+Eklund · 2006-08-02 23:02 · Score: 1

Delivering all mail to the inbox has a real risk: human classification error, which AFAIR tends to run at about 0.1%. This is higher than some automated systems.
Eivind.

--
Doubting the existence of evolution is like doubting the existence of China: It just shows that you're uninformed.
Re:Only one question... by ratboy666 · 2006-08-03 04:18 · Score: 1

No

But I question your premise.

Ratboy

--
Just another "Cubible(sic) Joe" 2 17 3061
Re:Only one question... by fm6 · 2006-08-03 07:27 · Score: 1

Go patronize somebody else. I'm not looking for "zero risk". I just want to know that if some stranger sees my resume online and offers me a job they have the same probability of getting through as any other email.
My point being that geeks are in love with mail filters, but they don't fucking work. Which is why I'm careful about distributing my address, and am willing to put up with the odd spam that gets through anyway. And indeed, the odd one will get through, because spam filters also have false negatives. Indeed, spammers seem to be winning the arms race between obfuscation and filtering.
Re:Only one question... by fm6 · 2006-08-03 07:32 · Score: 1

You could have it only filtered completely if its suspect rating is high enough and then otherwise just tag it if the rating is below a certain point.
Same question: do non-spams ever get high suspect ratings? My guess is yes.
That said... white lists are your friends.
Only if you never trade email with strangers. As a freelancer, I often do. And not seeing a "cold" email from a stranger can cost me money.
Funny thing though... someone forwarded me some "funny" e-mail and usually they are not that humorous. I was so damned pleased when it was filtered out.
Technology is not a solution for your social problems.
Re:Only one question... by fm6 · 2006-08-03 07:36 · Score: 1

Sounds nice. Pity it only works with Microsoft email software.
Re:Only one question... by fm6 · 2006-08-03 07:39 · Score: 1

What's my premise? Or maybe the correct question is, What do you think my premise is?
Re:Only one question... by fatphil · 2006-08-03 23:23 · Score: 1

How do you 'see' a false positive? If it is positive, it's binned?
Are you saying that you have to go checking through the bins just in case it dumped a real mail there?

--
FatPhil

--
Also FatPhil on SoylentNews, id 863

from china with love by nihaopaul · 2006-08-02 17:30 · Score: 0, Redundant

44% [===============> ] 220,996,832 89.21K/s ETA 48:16

almost got it!

Re:from china with love by afaik_ianal · 2006-08-02 18:36 · Score: 1

Downloading in china:
"Progress: 220,996,832 of 400,000,000 bytes. Did we say 400,000,000? What we meant was 320,000,000... Yes, that's right!"
Re:from china with love by nihaopaul · 2006-08-02 18:45 · Score: 0, Offtopic

whooo, ok that slashdot lameness filter error is a `p155` me off

85% [=========> ] 424,250,976 30.16K/s ETA 18:54

almost have it!

Ask Slashdot ... by Anonymous Coward · 2006-08-02 17:34 · Score: 5, Funny

Dear Slashdot,
At the university where I work, they have recently adopted a pesky policy banning the use of bitTorrent.
What can I do to fix this ?
Yours faithfully,
Dr. Gord Cormack

Picking nits by Anonymous Coward · 2006-08-02 17:49 · Score: 0

Survivor didn't spawn the current reality TV craze -- Who Wants to be a Millionaire did. (Though Survivor was already in development).

Re:Picking nits by Anonymous Coward · 2006-08-03 01:06 · Score: 0

How do you figure that as "Reality TV"?
It's a "Game Show".
The only concept it (re*)introduced to terrible network TV was "Tedious Dramatic Time Wasting" that is also a mainstay of "Reality" shows.

* - I believe "$64,000 question" from the olden days used that same horrible concept which for some reason guarantees a huge flock of lowest common denominator idiots will "stay tuned".

Argh! Gratuitous Video! by abh · 2006-08-02 17:54 · Score: 1, Insightful

A 400MB video file? Is this a joke? WTF is everyone thinking, that everything on the web needs to be on video all of a sudden? I just blogged about this today: http://www.anotherblogger.com/2006/08/02/please-no-more-gratuitous-videoblogging/

Good job the I don't filter web content by slayer99 · 2006-08-02 18:00 · Score: 2, Funny

"In his study he looked at the major spam filters ( DSPAM, SpamAssasian"

Spam about asian donkeys is a new one on me, though.

--
Martin Brooks / Slayer99 #linux / UIN 2178117

Re:Good job the I don't filter web content by Anonymous Coward · 2006-08-03 00:53 · Score: 0

He's just citing his secret insider source whose identity he can't reveal.

Re:Argh! Gratuitous Video! by Kredal · 2006-08-02 18:02 · Score: 1

Thanks for blogging about it... but did it really have to be a video blog?

(just kidding)

--
Whoever stated that signature sizes should be limited to one hundred and twenty characters can just go ahead and kiss my

Which spam filter won? by LoneBoco · 2006-08-02 18:17 · Score: 1

So... um... I really don't want to wait 8 hours or more to find out which mysterious and generally "unheard of" spam filter performed the best. Does anybody know where a text version of the results can be found?

MS Anti Spam... by pookemon · 2006-08-02 18:22 · Score: 1

I use the built in Spam filter in Exchange 2k3 set to level 8. All "filtered" e-mails are archived. I get maybe 3 or 4 a day (on a "bad" day) that make it through. Once a week (or more if I can be bothered) I view the archive and send on any that aren't spam (<1%), and those that are spam get junked. I do this using a little tool I wrote that displays the From, To and Subject of all these e-mails. If I can't tell from these fields whether the e-mail is spam or not (and it generally is anyway) then I can view the contents of the .eml file.

P**s easy, effective and "Free".
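
For anyone wanting to try the same review workflow, here is a minimal Python sketch of that kind of tool. It is not the poster's actual tool: the function names and the folder layout (a directory of `.eml` files) are assumptions made for illustration. It parses headers only, so large attachments are never decoded.

```python
from email import policy
from email.parser import BytesParser
from pathlib import Path


def summarize_eml(raw: bytes) -> tuple[str, str, str]:
    """Return (From, To, Subject) from raw message bytes, parsing headers only."""
    msg = BytesParser(policy=policy.default).parsebytes(raw, headersonly=True)
    # Missing headers come back as empty strings rather than None.
    return tuple(str(msg.get(name, "")) for name in ("From", "To", "Subject"))


def summarize_archive(folder: str) -> list[tuple[str, str, str, str]]:
    """List (filename, From, To, Subject) for every .eml file in a folder."""
    rows = []
    for path in sorted(Path(folder).glob("*.eml")):
        rows.append((path.name, *summarize_eml(path.read_bytes())))
    return rows
```

Calling `summarize_archive` on the archive folder gives one row per quarantined message, which is enough to eyeball for false positives before junking the rest.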

--
dnuof eruc rof aixelsid

Re:MS Anti Spam... by KiloByte · 2006-08-02 18:49 · Score: 1

Er, what? A false positive rate of 1:100!?!?

Usually, anti-spam solutions which give more than 1:100000 are considered worthless. What you're quoting is beyond words.

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Re:MS Anti Spam... by pookemon · 2006-08-02 19:09 · Score: 1

A false positive rate of 1:100

No, better than 1:100 - that's what <1% means. It's actually around the 1:500

Usually, anti-spam solutions which give more than 1:100000 are considered worthless

Got links, or is that just your opinion?

--
dnuof eruc rof aixelsid
Re:MS Anti Spam... by KiloByte · 2006-08-02 20:37 · Score: 2, Informative
A false positive rate of 1:100
No, better than 1:100 - that's what <1% means. It's actually around the 1:500
And thus still 200 times worse than the acceptable rate.

Usually, anti-spam solutions which give more than 1:100000 are considered worthless
Got links, or is that just your opinion?
There was a massive flamefest on debian-devel about spam filtering recently, but false positive ratios in that range were something commonly used by most participants in the discussion. I don't have the time to find a bunch of such posts right now, but the most recent thread is "greylisting on debian.org". This particular thread deals mostly with acceptable delays, but it does include quite a bit of statistics.
However, note that we are talking about two separate scenarios:
- a home server for a user with no responsibilities
- a project/ISP-wide mail server
In the former, delaying mail for weeks may be acceptable -- but even then, I wouldn't touch something with a 1:500 false positive ratio with a long stick.
--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.

no torrent? by Anonymous Coward · 2006-08-02 18:31 · Score: 0

Does anyone else find it mildly amusing that U.W., one of the top tech schools in North America, due to their regressive policy disallowing the use of torrents, now has a server getting a proper slashdotting?

So what is the previously unheard of spam filter? by jefp · 2006-08-02 18:33 · Score: 1

Anyone care to post a link?

No bittorrent... No credibility by bgog · 2006-08-02 18:33 · Score: 4, Insightful

Why exactly should we give any weight to anything from an organization so ignorant as to disallow bittorrent? It takes someone pretty darn ignorant to disallow a protocol because some use it to transport illegal content. Why haven't they banned TCP? It is an evil technology used every day to violate copyright.

This guy should spend his time educating the fools at his institution.

Re:No bittorrent... No credibility by X_Bones · 2006-08-03 00:49 · Score: 1

Why exactly should we give any weight to anything from an organization so ignorant as to disallow bittorrent?

uh, maybe because the guy in question is doing actual research and has nothing to do with setting his school's IT policy?

Perhaps we should also discount anything that their biology or physics department has to say. After all, their organizations can't use BitTorrent either - maybe they too should spend more time educating fools at their institution.

--

the coolest club on /.
Re:No bittorrent... No credibility by Anonymous Coward · 2006-08-03 01:22 · Score: 0

Waterloo has banned BitTorrent and most everything else on campus so that people don't use the bandwidth for non-school purposes. Waterloo's too cheap to pay for a bigger pipe, so they just ban practically every protocol.
Re:No bittorrent... No credibility by Anonymous Coward · 2006-08-03 02:16 · Score: 0

This pretty much sums up why I stopped using the Resnet there, and am living off campus. Even Rogers is far better, with the traffic shaping they do, and oh how I despise Rogers so much as well.
Re:No bittorrent... No credibility by autophile · 2006-08-03 04:37 · Score: 1

Why exactly should we give any weight to anything from an organization so ignorant as to disallow bittorrent? ... This guy should spend his time educating the fools at his institution.

OK, if you've ever worked for a large organization, you soon realize that a few people in the organization set policy. The rest have no say. That's all there is to it.
And if this guy "spent his time educating the fools at his institution", he wouldn't actually have any time left over for his real job, which is research and educating students, for which the institution pays him.
--Rob

--
Towards the Singularity.
Re:No bittorrent... No credibility by KingEomer · 2006-08-03 06:53 · Score: 1

Resnet also had that annoying 500MB/week limit. (At least, that's what it was when I was there). I remember once downloading Cygwin to try it out. My bandwidth was throttled for quite a while after that, causing me much grief, since I had the "hard" assignment from CS251 to do, and it was too cold to walk to the MC. :P
Re:No bittorrent... No credibility by Krakhan · 2006-08-03 10:20 · Score: 1

Which one was the 'hard' CS 251 assignment? When I took that class, they all had to be submitted into the drop box. Though, maybe they've switched to having all of the written assignments submitted online, with the 'fishing' problem they've supposedly had with the drop boxes. Well, it gave me an excuse to use LaTeX anyways. And yes, I was mostly annoyed at that 500MB/week limit too when I was living in residence. It's a world of difference compared with my current local provider up here that doesn't give a shit (within reason) how much stuff you upload or download.
Re:No bittorrent... No credibility by KingEomer · 2006-08-04 02:15 · Score: 1

Did I say 251? lol. I meant 241. 251 was a joke. (The "hard" one being assignment 5, btw)
Re:No bittorrent... No credibility by Krakhan · 2006-08-04 17:53 · Score: 1

Ah, so I figured. Yes, the hardest assignment for CS 241 was the code generation for the compiler. Oh, how I spent many hours on that.. And I still didn't manage to finish all of it! Two weeks wasn't enough, damn it! :(

Oh well. It's a neato course, I'll admit.
Re:No bittorrent... No credibility by bgog · 2006-08-06 02:24 · Score: 1

See, but he's in the computer science department. I'd also have a problem with the bio dep if they disallowed students with AIDS because they might infect others. Professors are supposed to be rational and intelligent because they are responsible for educating others. They do have a say in the IT policy.

Possible Text Version by sciop101 · 2006-08-02 18:35 · Score: 4, Informative

On-line Supervised Spam Filter Evaluation
Gordon Cormack and Thomas Lynam

Full Text, May 29, 2006 - PDF Format

http://plg.uwaterloo.ca/~gvcormac/spamcormack.html/

--
The only thing new in this world is the history that you don't know.[Harry Truman]

Re:Possible Text Version by Rich+Klein · 2006-08-02 22:34 · Score: 1

I swear they're actively trying to obfuscate their results! I skimmed through the conclusions section of the paper and I couldn't figure out which filter won. It's clear that DSPAM and CRM-114 didn't perform as well as the rest. CRM-114 might approach the level of the winner(s) after a couple years of training, but it didn't get that much training in their test. I don't see anything about which package worked best, though. What a waste of my time...almost like spam!

--
-Rich
Re:Possible Text Version by gvc · 2006-08-03 01:45 · Score: 3, Informative

Bogofilter works great. Or SpamAssassin but only if you force-feed it its own judgements. In both cases you have to correct classification errors.
Fidelis Assis (who has now gone solo after having participated in the CRM114 project) shows great results for his recent solo effort: OSBF-lua. Bratko's PPM spam filter -- the one that did great at TREC -- is not yet packaged as a drop-in filter. Same for my DMC spam filter.
The actual TREC 2005 tests referred to in TFA are here.
Re:Possible Text Version by Rich+Klein · 2006-08-03 06:15 · Score: 1

Thank you for the follow-up! I'll have to take a look at what you wrote about SA when I get home. It might be fun to try out OSBF-lua, but I've already got SA set up and running, and time is fleeting. :)

--
-Rich

best ever by Anonymous Coward · 2006-08-02 18:36 · Score: 0

In my office, the IT department is so cool they implemented the best spam filter ever (when Email server is up): Manual Filtering. It's awesome. Some of us can trash all of our spam before we even read it by carefully reviewing the subject line and sender. We never have false positives, so we don't miss anything. Granted, most people spend 3+ hours a day Emailing, but it's OK. We filter out all spam, never miss anything. Some people even collect spam instead of junking it.

Funny though, we keep buying a particular brand of hard drives for Email storage in our servers. The IT guys keep talking about their sea gate retirement plans. Good to see they want to spend their late years on sunny Mexican beaches.

GMail Spam Filter by foxylad · 2006-08-02 18:48 · Score: 5, Interesting

I use greylisting (gld to be specific) which works wonderfully. A couple of customers wanted even better filtering...

First I tried DSPAM, but they refused to train it so the results weren't good. Then I tried SpamAssassin, which also let through a surprising amount of spam - a lot more than my personal account on Gmail.

So I set up accounts on Gmail for them, and forwarded their mail to those accounts (after greylisting - don't want to burden GMail too much!). Gmail lets you set up forwarding, so I simply forwarded all the filtered mail back to a second account on my mailserver for the customer to pick up. Finally I wrote a python script that logs in to Gmail once a week to prevent the account being closed due to non-use.

A tad involved, but it works like a dream. Yet again Google comes out on top, this time in a market it doesn't even know it's in!

--
Do as you would be done to.

Re:GMail Spam Filter by sd.fhasldff · 2006-08-02 20:35 · Score: 2, Interesting

This is actually something Google could sell. Access to their mail filter. I do realize that they have "corporate email", but that still smacks a lot of GMail and some businesses would rather avoid that. Instead, they could provide a simple access to their spam filter. Yes, requiring all email to be piped through a Google server if they don't want to make the filter available as a binary (presumably updated regularly).

To minimize bandwidth consumption and (partly, at least) allay privacy / corporate secrecy worries, the email piped through Google's servers could be limited to anything that didn't pass a white-list filter (e.g. removing all internal corporate email, as well as email from established business partners).
Re:GMail Spam Filter by Yonder+Way · 2006-08-02 23:50 · Score: 1

Yay. Just what I want. All of my email being read, indexed, analyzed, and archived by Google.
Re:GMail Spam Filter by sd.fhasldff · 2006-08-03 01:58 · Score: 1

Yay. Just what I want. All of my email being read, indexed, analyzed, and archived by Google.

That's exactly the problem with "GMail for Domains", but a problem that my suggestion would solve (or at least minimize). Instead of having a complete "GMail for Domains" solution, where everything works as per normal GMail (except for the email addresses not being @gmail.com), my suggestion would be to limit the solution to a pipe into Google's spam filter. While Google could theoretically create a copy of all emails being piped through the filter, it would almost certainly be illegal for them to do so (since there would be a contract without language to explicitly allow Google to do so - the TOU for GMail is the only thing allowing Google to do it for standard accounts). It should be noted that Google would not have any clear reasons for actually reading the email (unless they wanted to embark upon a scheme of corporate espionage), since there wouldn't be any ads to serve.

And, as I said in my previous post, if this level of trust is not good enough, the problem could be further mitigated by limiting the emails transferred to those not passing a simple white-list filter.
Re:GMail Spam Filter by hacker · 2006-08-03 03:09 · Score: 1

First I tried DSPAM, but they refused to train it so the results weren't good.

Which is precisely why you're supposed to train it under a global account with the dspam corpus. I trained mine with a corpus of ~3,000 ham and ~3,000 spam messages before I let it loose under user accounts.
In 5+ years of using it now, I haven't had a single user complain of spam in their mailbox, ever. Not a SINGLE spam. Sure, they get false positives from time to time, but training those back with the web interface is ridiculously simple.
dspam is, hands-down, the best anti-spam tool I have ever used, and I've used a lot of them, including blackholes.us, firewall blocks, SpamAssassin, and 13 RBLs in Sendmail. Nothing works as well as dspam, IME.
Re:GMail Spam Filter by autophile · 2006-08-03 04:41 · Score: 1

It's funny -- a few months ago, my Google account was slapping the heck out of SpamSieve on my local account. But very recently, I've gotten several spams in my Gmail account. So whatever Google is using, they're going to have to start figuring out what the spammers have already figured out.
--Rob

--
Towards the Singularity.

So Which One Won? by ryanisflyboy · 2006-08-02 19:39 · Score: 2, Interesting

So which one is the "unheard of spam filter?"

Wouldn't it make sense to put this in the /. submission (or at least a link).

Did I miss the obvious "and the winner is..." some place?

Cloudmark's SpamNet by cruachan · 2006-08-02 19:40 · Score: 2, Interesting

I have to push this as it usually gets missed from reviews as it's a hybrid P2P solution and not a straightforward filter, but Cloudmark's safetybar product (http://www.cloudmark.com/) is just about perfect for me. I get an average of about 20 spam emails a day and it has a false positive result of 0% and has had for months. In fact I've been using the product for several years now and I think the last time I saw a false positive was a couple of years back.

On the efficiency side it has a hit rate of nearly 100%. I would have said it was 100% a couple of months back, but just recently it's been having a bit of a problem with one stock-pushing spam.

Anyway, that aside it's the best spam filter I've ever seen by a very long way, and I'd highly recommend the service. It costs a few $ a month, but it's probably the best value subscription I have.

I have no connection with the company, just a very satisfied customer who's been using it since the beta some years ago. I have a publicly available email address which I've had for years and must be on many spam lists; without Cloudmark it would be unusable, with it it's no problem at all. I recently installed it for my wife who was starting to get a lot of spam - on that I noticed it took about two weeks to get it trained not to junk a few mailing list emails she was on, but after that it's been just as highly reliable as my installation.

Re:Cloudmark's SpamNet by maximthemagnificent · 2006-08-02 22:10 · Score: 1

My wife is the marketing manager of a ski resort and she gets uber spam. She swears by Cloudmark as well.

I personally get almost no spam...and I'm not quite sure why as I don't do anything to try and filter/avoid it. I suppose once it happens I'll have to set something up. At least those damn telemarketers are history. Now we need to nail the non-profit callers.

Maxim

A BitTorrent policy protest by Anonymous Coward · 2006-08-02 19:45 · Score: 1, Insightful

As you wonder how long it will take for 400MB file to come down at 1.5kB/s, a note from TFA:

We are sorry that these talks are not available through BitTorrent, however under present IST policy we are not allowed to run BitTorrent. We thank you for your understanding.

Erm.. This is more about a "take this policy and shove it" protest than content of the movie. I applaud their creativity.

Best spam filter. by Viceice · 2006-08-02 19:45 · Score: 1

IMHO, the criterion for best spam filter is very simple. It is the filter that is able to consistently maintain the highest spam to false positive ratio.

Feel free to add to it. :D

--
Sometimes I wish I was a plumber, then I'd know how to deal with other people's shit.

Re:Best spam filter. by A+Life+in+Hell · 2006-08-02 23:07 · Score: 1

I'm inclined to agree with you, however there is a minor flaw in your logic - the best spam filter will _always_ be no spam filter, since it has zero false positives - and therefore, by definition, a score of infinity - you know, anything divided by zero and all that.

--
Commodore 64, Loading up the dance floor!
Re:Best spam filter. by PigleT · 2006-08-03 01:20 · Score: 1

When you've had to roll out a filter across a multi-user network, you'll have more data to be going on with.

DSPAM might be all spangly and wonderful, but switch it to use a proper RDBMS backend on a separate box and its usage is *SO* inefficient that analysing *one* email went from about 1s to >10mins. ("I need a wordlist! So I'll just, erm, select * from a 10k-row table." Er yeah - I'm sure it's good programming to have multiple switchable backends, but they didn't have a clue about efficiency.)

Ditto spamassassin: it ain't cheap. Particularly so when you have multiple remote lookups against RBLs (hey, they're crap, but spamcop.net is worth something in the scores) and razor2 and pyzor et al, that time-out and add to the box's load.

Personal favourite, without having bothered wasting bandwidth on these videos where plain text would've sufficed, is a combination of strong sender-verification in exim, spamassassin (with score>10 => reject at source) and bogofilter. I cross-train bogofilter and SA regularly, and include my sent-mail box in bogofilter so it knows *exactly* what topics I like to talk about, thereby increasing term-value-separation.
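
The "score>10 => reject at source" part can be sketched as a two-threshold policy: reject outright above a high score, only tag in the middle band, deliver the rest. This is an illustrative Python sketch, not exim or spamassassin configuration; the function name and the lower `tag_at` value are assumptions.

```python
def route_message(score: float, tag_at: float = 5.0, reject_at: float = 10.0) -> str:
    """Map a spam score to 'deliver', 'tag', or 'reject'.

    tag_at is a hypothetical lower threshold; reject_at mirrors the
    "score > 10" rule described in the comment above.
    """
    if reject_at <= tag_at:
        raise ValueError("reject_at must be greater than tag_at")
    if score > reject_at:
        return "reject"   # refuse during the SMTP conversation
    if score >= tag_at:
        return "tag"      # deliver, but mark it so the client can file it
    return "deliver"
```

Keeping the reject threshold well above the tag threshold means a borderline legitimate mail still arrives, just tagged, rather than vanishing.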

--
~Tim
--
.|` Clouds cross the black moonlight,
Rushing on down to the circle of the turn

Give grey listing a try... by xt · 2006-08-02 19:47 · Score: 1, Insightful

The most effective way I have found to stop spam is grey listing. In the last two months, I have had zero spam messages go through to my mail server. I use GLST (http://www.xmailserver.org/glst-mod.html), which is mostly for the XMail mail server (http://www.xmailserver.org/) but will work anywhere.

You should also check this article http://www.freesoftwaremagazine.com/articles/focus_spam_postfix?page=0%2C0, lots and lots of good advice on spam filtering.

Re:Give grey listing a try... by Anonymous Coward · 2006-08-02 20:26 · Score: 0

If you prefer greylisting and dspam, take a look at the Asas project (http://asas.rpath.org) which aims to provide a greylisting/dspam appliance and much more.
Re:Give grey listing a try... by nblender · 2006-08-02 23:34 · Score: 1

Greylisting was predicted to work for only a short time and that's how it worked out. Greylisting works only against zombies who try to send mail directly to your server via port 25. As more and more ISPs get smart and start blocking outbound 25 from their dynamic pools, greylisting (and relying on rDNS pattern matching to filter for dynamic pools) is becoming less and less effective. I am a mailing list owner for a large free open source operating system project. This project uses greylisting on its mail server. I get a lot of spam from zombies relayed through their ISPs' mail relays, that has bypassed greylisting. In short, enjoy greylisting while it lasts. It will be almost completely ineffective inside of a year. Spammers are learning how to route through the mail relays they find configured in the users' mail client. They're also learning how to authenticate with those mail relays using your credentials. They're learning how to adapt to rate limiting enforced by your ISP's mail servers. I predict within a few years, those of us on broadband, will only be able to relay mail to our ISP's port 587, using authentication, be limited to 20 emails per day, no more than 2 per hour, and only from a set of 4 pre-configured sender addresses. I predict there will be an RBL to identify those ISPs who do not implement such a sane policy.
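
For readers who haven't met greylisting: the usual idea is to temporarily refuse the first delivery attempt from an unknown (client IP, sender, recipient) triplet and accept a retry after a short delay, on the theory that real mail servers retry and zombies mostly don't. A minimal Python sketch of that logic follows. The class name, the 300-second delay, and the in-memory dict are assumptions for illustration, not how any particular greylisting daemon is written.

```python
class Greylist:
    """Toy greylist keyed on (client IP, envelope sender, envelope recipient)."""

    def __init__(self, min_delay: float = 300.0):
        self.min_delay = min_delay  # seconds a new triplet must wait (assumed value)
        self.first_seen: dict[tuple[str, str, str], float] = {}

    def check(self, client_ip: str, sender: str, recipient: str, now: float) -> str:
        """Return 'defer' (answer with a temporary 4xx) or 'accept'."""
        key = (client_ip, sender.lower(), recipient.lower())
        # Record the first attempt; later attempts reuse that timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"
        return "defer"
```

A real deployment would also expire old triplets and whitelist known senders. Note that the sketch shows exactly the weakness described above: spam relayed through an ISP's own mail server retries properly and passes.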

There are lots of you with little pet techniques for filtering spam that you think are effective. Some of you claim to be able to rid yourself of 90% of all spam using one single technique. You should consider that any moron can easily get rid of 90% of spam. Probably even 95% for lesser morons. Especially on personal mailboxes where you can arbitrarily choose to cut out huge geographical regions and don't care about a few false positives. All the work is in that last 5% while still offering useful mail service to a large and diverse user community.

John Graham Cumming has been tracking anti-spam tool spam/ham strike/hit rates according to published studies that meet certain dataset criteria. You can find it here: http://www.jgc.org/astlt/
Re:Give grey listing a try... by ShOOf · 2006-08-03 01:27 · Score: 1

I second greylisting, I've had pretty good spam control but I installed postgrey a couple of weeks ago and spam has come to a near standstill.

Out of Date and Worthless by prandal · 2006-08-02 20:09 · Score: 4, Informative

This paper's a complete waste of time.

He tested spamassassin 2.3 - that's ancient! I'd imagine the other tools are similarly obsolete.

We currently use SA 3.1.4 with a well-trained Bayes database and Razor, Pyzor, and DCC.

Throw in a few custom rules and a selection of rules from http://www.rulesemporium.com/ and the results are outstanding.

With the new sa-update feature the core rules are updated between point releases, which came in useful this week dealing with the new image spams which seemed to be designed to avoid detection by spamassassin. Thanks Theo.

And the folk on the spamassassin-users mailing list really rock.

Re:Out of Date and Worthless by Anonymous Coward · 2006-08-02 21:25 · Score: 0

I have to completely agree with this...

SpamAssassin, Pyzor, Razor, and DCC do rock but put them all behind a greylist and you have a better solution.

I'm using spey (http://spey.sourceforge.net/) - a greylisting proxy, that I've "hardened" to drop connections that have an "ip/dynamic" looking rDNS (in regex) or give me the same ip/dynamic as a HELO/EHLO (or not a valid domain name). Since I started using this combination three months ago, only 1 spam made it past greylisting and that was a phishing attempt bounced via a legit mail server and then spamassassin caught it. Not a single spam has reached my inbox in months. 28000 blocked connections from dynamic IPs to date. I don't give a flying fuck if you can't email me from 4.5.1.10-dynamic.adsl.dhcp.client.yahooBB.comcast.net, either mail via your ISP's server or get your reverse dns sorted.
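
A rough idea of what such an "ip/dynamic looking" check might look like, as a Python sketch. These are not spey's actual rules: the keyword list, the regexes, and the function names are all assumptions, and any pattern like this will have false positives on legitimately odd hostnames.

```python
import re

# Hypothetical hints that a hostname belongs to a dynamic/consumer address pool:
# a telltale label, or four numbers joined by dots or dashes (an embedded IPv4).
DYNAMIC_HINT = re.compile(
    r"(?:^|[.-])(?:dynamic|dyn|dhcp|adsl|dsl|dialup|pool|ppp)(?:[.-]|$)"
    r"|(?:\d{1,3}[.-]){3}\d{1,3}",
    re.IGNORECASE,
)

# Loose syntactic check for a fully qualified domain name (at least one dot).
VALID_DOMAIN = re.compile(
    r"(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}",
    re.IGNORECASE,
)


def looks_dynamic(hostname: str) -> bool:
    """True if the name carries a dynamic-pool keyword or an embedded IP."""
    return DYNAMIC_HINT.search(hostname) is not None


def should_drop(rdns_name: str, helo: str) -> bool:
    """Drop when rDNS or HELO looks dynamic, or HELO is not a valid domain."""
    if looks_dynamic(rdns_name) or looks_dynamic(helo):
        return True
    return VALID_DOMAIN.fullmatch(helo) is None
```

With these patterns the example hostname in the comment above is flagged, while an ordinary name such as `mail.example.org` is not.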
Re:Out of Date and Worthless by Anonymous Coward · 2006-08-02 21:34 · Score: 0

http://www.okean.com/antispam/sinokorea.html

And that, IPtables block on China and Korea, that chops most of your spam out as well.
Re:Out of Date and Worthless by gvc · 2006-08-03 01:18 · Score: 3, Informative
I assume the paper that you are describing is the 2004 study. The paper described in the talk (which was given 6 months ago or so) described results of the TREC 2005 Spam Track which took place in November 2005. It included a test of SpamAssassin 3.x, not 2.3.
TREC 2006 evaluations are now underway.
While it is reasonable to conjecture that spam has changed so as to defeat spam filtering techniques, or will change so as to defeat the PPM technique that did well at TREC, the historical evidence does not support this conjecture. In particular:
- The spam filters tested in 2004 give pretty well exactly the same performance on 2005 and 2006 data.
- New versions of the filters are a little bit better, but not by leaps and bounds, and also get about the same results over the last 2.5 years of data.
- There is no evidence that "Bayesian poisoning" is a viable technique for defeating statistical spam filters in anything but a very artificial laboratory environment where the poisoner has access to the recipient's inbox
The subject of the paper -- and the talk -- is primarily about testing methodology and the need for controlled scientific investigation. So I hesitate to endorse the simplistic notion of a "winner" of the TREC evaluation. However the technique that did very well was indeed quite novel, so here's a characterization.
Andrej Bratko used PPM -- a well-known data compression technique -- to compress ham and spam separately. Well, actually he didn't compress them but just built the statistical model necessary to compress them. Then he simply (tentatively) added the unknown message to each model and chose the one that compressed it best. The general technique of using compression has been mentioned here and elsewhere but Bratko used a much stronger compression scheme and was somewhat clever about it.
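
To make the idea concrete, here is a toy Python sketch of classification by compression. It deliberately swaps in zlib (DEFLATE) for PPM, which is a much weaker model, and it recompresses the whole corpus each time instead of keeping an adaptive statistical model, so treat it as an illustration of the principle and not as Bratko's filter. The function names are assumptions.

```python
import zlib


def extra_bytes(corpus: bytes, message: bytes) -> int:
    """Bytes the compressed output grows by when message is appended to corpus."""
    return len(zlib.compress(corpus + message, 9)) - len(zlib.compress(corpus, 9))


def classify(message: bytes, ham_corpus: bytes, spam_corpus: bytes) -> str:
    """Pick the class whose corpus 'explains' the message in fewer extra bytes."""
    ham_cost = extra_bytes(ham_corpus, message)
    spam_cost = extra_bytes(spam_corpus, message)
    return "spam" if spam_cost < ham_cost else "ham"
```

One practical caveat with this stand-in: DEFLATE only looks back 32 KB, so with corpora larger than that most of the training text is invisible to the message, which is one reason a real filter wants a proper adaptive model.
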
I later reproduced Bratko's results using DMC -- a compression scheme that I invented 20 years ago -- and got some interesting results. We have a journal article in press describing it and also an evaluation paper at CEAS 2006.
Bratko A., Cormack G. V., Filipic B., Lynam T. R. and Zupan B., Spam Filtering Using Statistical Data Compression Models
Re:Out of Date and Worthless by hackstraw · 2006-08-03 05:06 · Score: 1

I'm a SpamAssassin fan, and it works VERY well for me.

Yes, I use custom rules and RulesEmporium rules, and I have private HAM rules that give negative points to things that are indicators that mail was sent by a human who knows my organization, which in turn trains the bayes filter to learn HAM automagically.

The thing is that spammers read the spamassassin mailing list and others, but they can NEVER learn the intricacies of every organization to send blanket mails that will escape all of these custom HAM rules.

Unfortunately, we cannot talk about these HAM rules specifically. It's up to every human mail admin to add these things in the privacy of their own computer system.

Amusingly, POPFile caught you by patio11 · 2006-08-02 20:15 · Score: 4, Interesting

I ran your message through a perl script to mail it to me for giggles (I do research on spam filtering at ye olde day job). Regretfully, you didn't make it through. Aside from header garbage, which was a mixed bag (half spam tokens, half "known-good automated email" tokens), you ran into problems with dope, ass, wanna, and... work*. Which is just as well, as I have no desire to speak to anyone who uses those words. * Last 15 occurrences in my mailbox are all of the "Make l0ads of $$$ work @ h0m3!" variety.
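
For the curious, token-based scoring of this sort can be sketched in a few lines of Python. This is a generic naive Bayes toy with add-one smoothing, assumed here purely for illustration; it is not POPFile's code, and the class and method names are made up.

```python
import math
from collections import Counter


class TokenFilter:
    """Toy naive Bayes filter over whitespace-separated, lowercased tokens."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Add one message's tokens to the 'spam' or 'ham' counts."""
        self.counts[label].update(text.lower().split())
        self.messages[label] += 1

    def spam_log_odds(self, text: str) -> float:
        """Positive means the tokens look more like spam, negative more like ham."""
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) + 1
        total_spam = sum(self.counts["spam"].values())
        total_ham = sum(self.counts["ham"].values())
        # Smoothed prior from how many messages of each kind were trained.
        score = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for tok in text.lower().split():
            p_spam = (self.counts["spam"][tok] + 1) / (total_spam + vocab)
            p_ham = (self.counts["ham"][tok] + 1) / (total_ham + vocab)
            score += math.log(p_spam / p_ham)
        return score
```

This is why a harmless word like "work" can count against a message: if it has only ever turned up in "work @ h0m3" spam in your own mail, its per-token odds lean toward spam.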

--
Help poke pirates in the eyepatch, arr.

Human classification is not zero risk by patio11 · 2006-08-02 20:22 · Score: 1

How much spam do you get a day? I get hundreds. Half of them are not in my native language (much like half the mail in my inbox), which means it takes more than a split-second glance to figure out what is going on. I'd guess my accuracy in split-second decisions is probably on the order of 95%, which if I were a spam filter would earn me a D-. Paul Graham, who probably has more typical email habits when compared with the average Slashdotter, says he misses about 3 per 2,000 (http://www.paulgraham.com/wsy.html). There are systems which are better than that.

In Soviet Spam Filter, the computer doesn't trust YOU to filter the email.

--
Help poke pirates in the eyepatch, arr.

It is a war by Alain+Williams · 2006-08-02 20:42 · Score: 2, Insightful

Spam is a war between the spammers and the system administrators/spam filters. The spam filters adopt a new technique; the spammers then work around it; the spam filters advance; ...

By the time that I have downloaded the video the war will have moved on a couple of iterations ...

Why do they try? by StoatBringer · 2006-08-02 20:58 · Score: 1

Why do spammers even bother to try to get around spam filters? If someone is actively blocking spam, it stands to reason that they are the least likely to buy any of your HerB0l Vi.aGra anyway, so what's the point in attempting to get into their inbox?

--
Cress, cress, lovely lovely cress

Re:Why do they try? by Anonymous Coward · 2006-08-02 22:05 · Score: 1, Informative

Because many clueless morons have email spam filters administered by the clued;
Not making any judgements but the "clued" category includes Gmail, Yahoo Mail,
AOL, corporate IT managers and university mail server admins.
Re:Why do they try? by maubp · 2006-08-02 23:41 · Score: 2, Insightful

If an end user is trying to block spam, then yes, they are probably not the sort of person likely to buy your product. At least until spam-blocking becomes more main stream in email clients (e.g Mozilla Thunderbird).

However, it's very often the end user's ISP doing the spam filtering - and this has no direct bearing on the gullibility of the email recipient.

Way to go compression ! by bytesex · 2006-08-02 21:14 · Score: 2, Interesting

It looks like another win for compression algorithms. Not only do they maximize entropy in your data while shortening it, they can also be used successfully to earmark pieces of text as being written in a certain language, or written by a certain author, and now they can be used for spam detection. The usefulness just keeps on coming. Colour me impressed.

--
Religion is what happens when nature strikes and groupthink goes wrong.

Re:Way to go compression ! by Dr.+Evil · 2006-08-03 01:20 · Score: 1

"Colour me impressed."
The 80's called. I hung up on them.

real men by nude-fox · 2006-08-02 21:23 · Score: 1, Funny

don't use spam filters and we reply to every damn email we get no matter what

Re:Spam Ass Asian? by Shaper_pmp · 2006-08-02 21:27 · Score: 1

Clearly that's the new fork of SpamAssassin that ensures only Vi4gra, penis-enlargement pills and "meet h0T n4k3d t33n s1uts" invitations get through...

--
Everything in moderation, including moderation itself

Best for how long? by kylehase · 2006-08-02 21:47 · Score: 1

I haven't watched the video (it's still downloading) but after reading some of the comments it seems that spammers try to circumvent the most popular spam blockers. So after watching this video, if the best spam blocker becomes the most popular, won't that then make it less effective? Dammit. If that happens I'll have to waste another day downloading the next video.

--
You want fun, go home and buy a monkey!

Re:Little known systems will often be most effecti by MichaelSmith · 2006-08-02 21:57 · Score: 1

Reminds me of why I like living in Australia - globally speaking we're relatively irrelevant, making us a relatively small target. Hopefully we'll stay relatively irrelevant, lol :p

And if it started getting worse you could move to tassie and get that feeling of irrelevancy back.

--
http://michaelsmith.id.au

Torrent by vivin · 2006-08-02 22:55 · Score: 3, Informative

Here is a torrent I made of the xvid file. It should work (I hope).

--
Vivin Suresh Paliath
http://vivin.net

I like

Re:Torrent by Anonymous Coward · 2006-08-02 23:26 · Score: 0

ERROR:
Problem connecting to tracker - HTTP Error 404: Not Found
Re:Torrent by jsharkey · 2006-08-02 23:54 · Score: 2, Informative

Go get VideoLAN client and you can stream download the OGG version. Just open the URL as a Network Stream:
http://www.csclub.uwaterloo.ca/media/files/cormack-spam.ogg
Very handy use of VLC! :)
Re:Torrent by vivin · 2006-08-03 01:54 · Score: 1

Cool! Thanks!

--
Vivin Suresh Paliath
http://vivin.net

I like
Re:Torrent by 42forty-two42 · 2006-08-03 02:24 · Score: 1

Your announce-url (http://www.vivin.net/announce) is 404'd.
Re:Torrent by wayne · 2006-08-03 03:27 · Score: 2, Informative

Your tracker is still 404'ing, so I have put up an alternative tracker. As I write this, I only have about 9% of the avi downloaded, so if someone else can seed the complete cormack-spam-xvid.avi file, I would greatly appreciate it.

--
SPF support for most open source mail servers can be found at libspf2.
Re:Torrent by vivin · 2006-08-03 05:56 · Score: 1

Sorry about that. Setting it up from half-way across the world through unreliable Iraqi internet... something's bound to go wrong! I'll check it out.

--
Vivin Suresh Paliath
http://vivin.net

I like

Paul Vixie on botnets and spam by dodobh · 2006-08-02 22:57 · Score: 2, Interesting

See here

The key paragraph:

If you'd like a more topical example, consider "spam". People began altering their e-mail "From:" lines in order to make their addresses harder to guess or aggregate; people began doing pattern matching in order to catch known-bad messages and either sideline or reject them. Many defenders used many small tricks to protect their inboxes. The result has not been that less spam is sent or even that less spam is received, on an aggregate basis. Things are worse now than they've ever been. (I say this as co-founder of MAPS LLC, by which I hope to establish my credentials in the spam field for those of you who do not know me.) Today a small number of highly advanced defenders is spam-immune only because they are a small number and their techniques are not widely effective against the attackers; and a small number of highly advanced attackers can "spam at will" a far larger population than ever before. And the trend is that things are getting worse, and getting worse faster than ever before.

--
I can throw myself at the ground, and miss.

Mod as Funny by vivin · 2006-08-02 22:59 · Score: 1

INFORMATIVE? Mod the parent FUNNY, please.

--
Vivin Suresh Paliath
http://vivin.net

I like

Dspam floats my boat by Zzeep · 2006-08-02 23:20 · Score: 3, Informative

I receive (no kidding) around 600 spam mails per day, versus approximately 30 real e-mails. I've been using dspam for over a year now (with very faithful training), and there is maybe 1 false positive every few weeks (less than 1 in 10.000), and every few days a few (usually "new") spam mails get through, which I of course immediately train on, so as never to see that kind again. So I am very, very positive about dspam. What I do miss, though, is something like a good and reliable service (better than the RBLs I know) that can block SMTP clients on the fly (like DSL home users and such) to reduce the immense load on our mailservers (I work for an ISP) caused by all the spam (which also has to go through a virus scanner, clamav).

Re:Dspam floats my boat by gvc · 2006-08-03 02:21 · Score: 1

600 spam mails per day, versus approximately 30 real e-mails [...] 1 false positive every few weeks (less than 1 in 10.000)
30 emails per day times a few weeks (say 3 weeks) is 630 real emails. That's a false positive rate of 1/630 or 0.16%, if you have really noticed all the false positives. Not 1 in 10,000.
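To spell out the arithmetic (a quick sketch, assuming "a few weeks" means exactly 3 weeks and exactly one false positive in that time; the variable names are just for illustration):

```python
REAL_PER_DAY = 30          # real e-mails per day, from the parent post
WEEKS_PER_FALSE_POS = 3    # assumption: "every few weeks" taken as 3 weeks

# Real e-mails received in the time it takes to see one false positive.
real_total = REAL_PER_DAY * 7 * WEEKS_PER_FALSE_POS

# Observed false positive rate: one misfiled real e-mail out of real_total.
fp_rate = 1 / real_total

# Days of real mail needed before a 1-in-10,000 rate could even be measured.
days_for_10000 = 10_000 / REAL_PER_DAY

if __name__ == "__main__":
    print(f"{real_total} real e-mails, false positive rate {fp_rate:.2%}")
    print(f"10,000 real e-mails take about {days_for_10000:.0f} days to arrive")
```

The spam volume does not enter into it: the false positive rate is measured against real mail only, so the 600 spams per day are irrelevant to this figure.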

Fond memories of Polarbar by smchris · 2006-08-02 23:36 · Score: 0

I have the impression the Java mail program has languished a bit and I haven't used it for years, but the best spam filter I've used I built up myself using their filtering capabilities. You didn't just "add" filter criteria. You could link them together in "AND", "OR" and "XOR".

Yahoo seem to have got it right... by kevingolding2001 · 2006-08-02 23:40 · Score: 1

I don't know what they use or how they do it, but my main email address is with Yahoo and they seem to have solved the problem.

Each month I get maybe 800 odd spam emails and a dozen or so real emails.

Once every month or so I get an email in my inbox which is spam and I click the 'this is spam' button. About September last year was the last time a real email ended up in the bulk (spam) email folder. I used to check and delete the spam emails every couple of days, but now just let it build up and be deleted by the 30 day time out.

Obviously a web-based account is not quite as convenient as a local account, but it seems to handle the spam onslaught and still be useful.

What about Greylisting? by IMarvinTPA · 2006-08-02 23:49 · Score: 1

Sadly, the way this was done, there is no way to test how well Greylisting would have helped.

IMarv

--
Trusting software vendors is no smarter than trus

Brightmail is good by beaverfever · 2006-08-02 23:56 · Score: 1

I would like to add my voice to that of the original poster. Brightmail is remarkably good at eliminating spam, and I do not know of any false positives in the years I have used it. (and yes, I have the habit of doing a quick eyeball scan of my spam folder before dumping it)

--
RTFM; please, I beg you.

Re:Argh! Gratuitous Video! by GoulDuck · 2006-08-03 00:14 · Score: 1

I don't want to read your blog ... could you make a video?

No Spam for the Women by chicklet427 · 2006-08-03 00:46 · Score: 1

There are only 15 or so people using the Exchange server I set up, but it still gets around 150 spams a day. I installed GFI MailEssentials; it works OK, but lately the amount of spam getting through is increasing - I would say at least 2 or 3 per user per day. I get especially angry at all the Cialis/Viagra e-mails... they could at least throw in some female-targeted spams once in a while! Do they assume if you use e-mail that you're a man?? Or that women are already perfect?? LOL

99% is nothing to crow about. by jamesh · 2006-08-03 00:53 · Score: 0

99% means 3-4 junk emails in my inbox every day. That ain't so good.

Scary by kjdames · 2006-08-03 01:08 · Score: 1

...major spam filters ( DSPAM, SpamAssasian , etc.) along with those...

I don't know about you, but these new anti-spam measures are starting to scare me...

--

Typos... that's just how I role.

Greylisting is intrusive by gvc · 2006-08-03 01:26 · Score: 1

Personally, I can't accept unpredictable delays in my email, so I have opted out of greylisting. Also greylisting has a non-zero and very hard-to-measure false positive rate.

nice and slow by infofc · 2006-08-03 01:42 · Score: 1

at least the download is quite slow. Who wouldn't want to wait 4 hours for the amazing revelation that you have to train your filter well or you will get false positives?

Conjecture on Gmail's spam classification system by Old+Man+Kensey · 2006-08-03 02:01 · Score: 1

I suspect Gmail uses a combination of automated classification and tweaks based on what users actually classify as spam. My Gmail account gets anywhere from 30-100 spam messages per day, and I've noticed a couple of patterns:

One of the mailing lists I'm on had a certain poster who would send things to the list via Bcc:. Gmail classified his mail as spam for a couple of days, but after I pulled his mail back to the Inbox a couple times during that initial period it never did again.
Spam goes in waves: for a few days I'll get none or one or two a day in my Inbox, then a new variant will come out (sort of like viruses) and I'll get 10 that day.
The phishing warning seems to be set pretty cautiously. I have many messages in my spam folder that are clearly phishing attempts (following the same pattern as earlier attempts), but are not marked as such -- more so than in years past. I suspect the assignment of the "phishing attempt" tag is based on user participation, and the userbase is getting complacent over time.

Gmail, I suspect, is taking a brute-force approach to classifying e-mail as spam: if a large number of hundreds of thousands (millions?) of users say it's spam using the Report Spam function, it probably is.

--
-- Old Man Kensey

Your model is flawed. Addresses escape! by KlaymenDK · 2006-08-03 02:25 · Score: 1

Your model is flawed. I used to do the same thing, but EVENTUALLY that one 'private' address WILL escape into the wild, and then you:
(a) are fscked, or
(b) must create a NEW address to keep private (and cross your fingers again).

For me, my 'private' address is "@.", so creating a new one is not a valid option (being dave2205 is okay on Hotmail, but not on the family domain...).

Add to that the fact that I frequently access my email from different computers (locations). Using IMAP and webmail is a must, and while our host does use some form of spam filter it's nowhere near as good as a well-trained Bayesian.

It's now so bad that I've all but given up on using 'alias addresses' and just give everyone my once-private address. That would rid me of the hassle of managing the aliases at the expense of presumably only slightly more spam.

Unless you have a better idea. :)

--
"Good news, everyone!"

Re:Your model is flawed. Addresses escape! by DarkDragonVKQ · 2006-08-03 02:42 · Score: 1

I agree, it's not foolproof. But it's worked for me for several years now. :) I also find it helps if you don't use name@domain.com (the bots will pick it up), so whenever possible I use name at domain.com, or some random thing in place of the @, so that mostly only humans would know what to do with it. (I even do that with my junk account.) It won't work for signups, though.

I also tend not to use any computer/laptop but my own to check email, for privacy reasons (though I have checked it at work before). I just use my USB portable Thunderbird, so it's not a huge problem checking it from wherever.

Setting up a filter with the words "sex, viagra, penis, problems, women, mortgage, sexy" (and so on) pretty much catches everything outside of the image spam (though I've got images turned off in Thunderbird anyway). And luckily I haven't gotten many of those.

I do have an idea to use with those private email addresses, though. If they're your friends, they're pretty much going to be using the same email (if they keep changing or losing it I'll usually direct them to send mail to my more public accounts). Set up an approved list so mail is automatically sent through if the address matches (won't help vs viruses and trojans, though). Anything else gets stuck at the filter. The only problem is you gotta be on top of updating it. So if you apply for a job and you expect an email response, then you'd better add "@name of business" to the permissions list or you'll be wondering why you never got a response.. lol.
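Something like this is what I have in mind for the approved list (a minimal sketch only: the function name, the example addresses and the idea of keeping a separate set for whole domains are assumptions, and a real mail client would do this with its own rules engine):

```python
from email.utils import parseaddr


def is_approved(from_header: str, approved_addresses: set, approved_domains: set) -> bool:
    """Return True if the sender's address or its domain is on the approved list."""
    _, address = parseaddr(from_header)   # strips any display name
    address = address.lower()
    if "@" not in address:
        return False                      # unparseable sender: leave it at the filter
    if address in approved_addresses:
        return True
    domain = address.rpartition("@")[2]
    return domain in approved_domains


# Hypothetical lists: friends by full address, a prospective employer by domain.
FRIENDS = {"alice@example.org"}
DOMAINS = {"bigcorp.example"}

if __name__ == "__main__":
    for sender in ("Alice <alice@example.org>",
                   "HR <jobs@bigcorp.example>",
                   "cheap-pills@spam.example"):
        print(sender, "->", "inbox" if is_approved(sender, FRIENDS, DOMAINS) else "held")
```

Bear in mind the From: line is trivially forged, so this only helps against spammers who don't know who your friends are.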

--
"I thought what I'd do was I'd pretend I was one of those deaf-mutes" ~ Laughing Man - GITS:SAC
Re:Your model is flawed. Addresses escape! by KlaymenDK · 2006-08-03 02:49 · Score: 1

You have some good mail rule tips there, thanks for sharing.

Ohh, how I miss The Bat! client now that I switched to BSD ... but at least those viruses are no longer a worry. And coincidentally the times when I did get email viruses, it was from friends... :/

BTW, webmail goes over https, otherwise I wouldn't do it either, and I only do https because I haven't grokked setting up my desktop at home for ssh access yet.

--
"Good news, everyone!"
Re:Your model is flawed. Addresses escape! by DarkDragonVKQ · 2006-08-03 03:10 · Score: 1

Yeah, viruses and trojans are annoying. They infect someone's computer and then access their address book to send it to their contacts. Annoying when it's your friends and family that are sending it to you. Not much you can do about it on your side besides ask them if they sent it and scan it before opening. Though you could educate them about computer security.

--
"I thought what I'd do was I'd pretend I was one of those deaf-mutes" ~ Laughing Man - GITS:SAC

Everyone is not out to get you by Old+Man+Kensey · 2006-08-03 02:27 · Score: 1

Disallowing P2P at institutions of higher learning is often nothing to do with copyright at all -- it's about saving the network bandwidth for true academic uses. There may be a smidge of academic use of BT, but even at a university I really don't think anyone's going to try and seriously assert all those kids downloading music in the dorms are doing it for research purposes. Bandwidth is neither infinite nor free -- in fact in the real world it's often pretty damn scarce.

I'm sure that if there were a pressing need to use BitTorrent for something academic that could not easily be done any other way, that an exception would be made (but the only thing I can think of that would fall into this category would be research on BitTorrent itself).

--
-- Old Man Kensey

Re:Everyone is not out to get you by ratboy666 · 2006-08-03 04:11 · Score: 1

And Bittorrent is about saving bandwidth in EXACTLY this circumstance. Now, a whole bunch of people are going to download a 400MB file. But, who cares? It's "legitimate" use of bandwidth. Bittorrent is not really a "P2P" protocol, ok? The only replacement for Bittorrent in this case is MASSIVE bandwidth. Bittorrent does not have "searching" provisions. It does not allow access to arbitrary files (everything needs seeding) -- at least the version I use. Bittorrent is sort of like "ftp" with a bandwidth magnifier attached, and less browsing.

It IS the tool for the job here.

Ratboy.

--
Just another "Cubible(sic) Joe" 2 17 3061
Re:Everyone is not out to get you by Old+Man+Kensey · 2006-08-03 07:46 · Score: 1

My point flew right by you. I'm not talking about the network from the viewpoint of this professor providing content -- I'm talking about the network from the viewpoint of the IT staff at his college, who have to worry about hundreds of dorm rats queueing up downloads and running BT 24/7 if they allow it. Since the academic use of BT is vanishingly small to nonexistent among college students, from their point of view rather than throttling it, it's easier just to block it and make the occasional rare exception for bona fide academic uses.
Also the prof is frankly either not that bright (unlikely for a CS prof at Waterloo) or has an agenda, because if he wanted a BitTorrent version, he could easily carry his 400 MB file home, put it up on a tracker, and voila. So, for that matter, could anyone on Slashdot and I saw several comments talking about doing so. Why he chose to whinge rather than do the minimal amount of work involved is beyond me, unless he has some reason for wishing to heap misdirected scorn on Waterloo's IT staff.

--
-- Old Man Kensey

Correction by KlaymenDK · 2006-08-03 02:27 · Score: 1

My 'private' address is "FirstName@LastName.TLD".

Forgot about using GT and LT signs...

--
"Good news, everyone!"

Slides from the presentation by gvc · 2006-08-03 02:36 · Score: 2, Informative

Here are the slides from the 400MB video presentation.

OG NOT AGREE by alienmole · 2006-08-03 03:32 · Score: 1

Feh, I scoff at your breakable clay tablets. If you want durability, you can't do better than spreading ochre on the walls of a cave. Cave paintings have lasted for tens of thousands of years!

Re:OG NOT AGREE by turbidostato · 2006-08-03 11:02 · Score: 1

"Cave paintings have lasted for tens of thousands of years!"

You really must have taken this wrong!

My archbishop swears the world is only 6010 years old!
Re:OG NOT AGREE by alienmole · 2006-08-03 13:53 · Score: 1

My archbishop swears the world is only 6010 years old!
Explain to your archbishop that he's thinking of God years, which are kind of like the inverse of dog years. According to leading physicists, 1 human year is 7 dog years (actually it's more complicated than that, but I'll keep it simple for you laypeople). OTOH, 1 God year is about 2 million human years. That number's not perfectly accurate, but they'll improve it once they've succeeded in capturing and interrogating God, using the Large Hadron Collider at CERN.

The only filter that doesn't degrade by Anonymous Coward · 2006-08-03 03:56 · Score: 0

The only spam filter I have ever used that doesn't seem to degrade significantly over time is Cloudmark SpamNet (they renamed it to Desktop or something). Every other filter I used got progressively worse. Don't know how they do it, but highly recommended.

Yep. Barracuda now is totally ineffective... by Anonymous Coward · 2006-08-03 05:21 · Score: 0

My company bought a Barracuda a year and a half ago, it worked great for almost a year. 99.9% of all spam got stopped by it, but in the last 6 months, more and more spam just cruises right thru it. Barracuda's vendor-supplied filter rules have become totally ineffective now, it's almost like they've lost all their talent and ability to create new rule sets in their periodic updates that are able to counter the spammers' latest tactics.

Solution: Combine technologies by Jeppe+Salvesen · 2006-08-03 05:27 · Score: 1

What I imagine would be a good spam filter, would be like this:

First, run through a spellcheck and grammarcheck with a fairly lax spellcheck (we all make mistakes, but not all the time). And filter out anything but Norwegian, Danish, Swedish, French and English text. That oughta kill a good 60% of my spam. Next, some technology to kill the image-only spams (checksums? content likeness to known spam?). Then, run through a bayesian filter (or some such technology).
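Roughly, the chaining could look like this (a sketch under heavy assumptions: the message is a plain dict with an "attachments" list of bytes, SHA-256 stands in for "checksums", and the spellcheck, language and Bayesian stages are left out because they need real libraries; every name here is made up for illustration):

```python
import hashlib
from typing import Callable, Iterable, Optional

# A stage inspects a message and returns "spam", "ham", or None for "no opinion".
Stage = Callable[[dict], Optional[str]]


def checksum_stage(known_bad_digests: set) -> Stage:
    """Build a stage that flags messages carrying an attachment already seen in spam."""
    def stage(msg: dict) -> Optional[str]:
        for blob in msg.get("attachments", []):
            if hashlib.sha256(blob).hexdigest() in known_bad_digests:
                return "spam"
        return None
    return stage


def run_pipeline(msg: dict, stages: Iterable[Stage], default: str = "ham") -> str:
    """Run the stages in order; the first one with an opinion decides."""
    for stage in stages:
        verdict = stage(msg)
        if verdict is not None:
            return verdict
    return default


# Hypothetical known-spam image, identified only by its digest.
SPAM_IMAGE = b"GIF89a-pretend-this-is-a-stock-tip"
KNOWN_BAD = {hashlib.sha256(SPAM_IMAGE).hexdigest()}
STAGES = [checksum_stage(KNOWN_BAD)]   # language/spelling/Bayesian stages would follow

if __name__ == "__main__":
    print(run_pipeline({"body": "", "attachments": [SPAM_IMAGE]}, STAGES))
    print(run_pipeline({"body": "see you at lunch", "attachments": []}, STAGES))
```

Exact checksums are easy to defeat by changing a single pixel, so the "content likeness" idea would need some kind of fuzzy hash instead; the early-exit structure stays the same either way.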

Now, I just need a good spellcheck/grammarcheck library etc, and then maybe I can beat the spammers for a good while.

What do you guys think? Should I spend some quality time with Perl?

--

Stop the brainwash

Greylisting by Vadim+Makarov · 2006-08-03 05:28 · Score: 1

Has greylisting been used more widely than just by pair.com?

The problem with greylisting is not so much the delay as the messages that never get delivered. I've missed two or three important emails over the last year (i.e. whose absence I later noticed: a renewal notification from a domain registrar, etc.) because of greylisting.

--
17779 eligible voters in a district, 17779 'vote' as one. This is Russia.

In Soviet Russia... by dandanio · 2006-08-03 05:40 · Score: 1

...Spam Filters test... you. :)

8" floppies by Anonymous Coward · 2006-08-03 06:02 · Score: 0

>Heck, most of our modern storage formats can't even manage 30 - tried to read a 8" floppy recently?

Actually, yes -- 8" floppies hold up really well and I hardly ever have trouble reading them. Those 3.5" things are another story though. And QIC-40/QIC-80 tapes didn't really work even when they were new. We may have already peaked when it comes to reliable data storage.

Re:8" floppies by Wonko+the+Sane · 2006-08-05 03:48 · Score: 1

I hated 3.5" floppies. I remember about a 25% success rate for "not having a bad block on a new disk as soon as I write something important on it."

Non-sense will get through by Anonymous Coward · 2006-08-03 08:58 · Score: 0

Most of the spam that does get through is the "Poetry Spam". Excerpts from The Bible, Harry Potter, Playboy Forum, etc. have been dumped into my mail server and some have gotten through because they had some "legitimate" text, but the gifs are (most often) the ones with viruses and/or "mal-links".

whitelist pop3 email /w bounceback for unknowns? by bluetigerbc · 2006-08-03 08:58 · Score: 1

hey all,

this spam really needs a perm solution. other than hacking people's white lists i thought having a system similar to earthlink's "prove yer a real person by typing in a random code" would be good. an even simpler idea would be click the correct pic (from 8-9 pics to choose from) and then it would finally send to their email.

i'd give up newsletters in order to have something like this.
analysing email is a waste as spammers find ways around them. if the pics could be changed up randomly or the code to enter changes then it makes their work much harder to get any spam in.

if i got 1-2/day i wouldn't mind but gmail drops 50 in my spam folder (which of course most skim through anyways).

send a mail, bounces back /w an html to click/type into, gets into their inbox. i'm surprised it doesn't yet exist.

if this does exist please drop me a line and let me know. i'd love to use it...

bluetigerbc
at gmail dot com

Or . . . by AncientPC · 2006-08-03 10:27 · Score: 1

you can use automatically created disposable e-mail addresses.

AND the Results? by Anonymous Coward · 2006-08-03 12:05 · Score: 0

Fast forward to 39:42 into the movie to see his rankings.

Here's what I saw (YMMV):
1) bogofilter
2) ijsSPAM2
3) spamprobe
4) spamasas-b (learning only)
5) crmSPAM3 (1:40 ham eaten)

Of course, he immediately showed other views of the data and had different rankings. Basically, you need to decide how much real email you are willing to lose to fight **any** spam getting in.

Ugh. presentation was awful by jkinney3 · 2006-08-03 15:14 · Score: 1

After sitting through the full 58 minutes I was truly disappointed that Dr. Cormack was still pacing back and forth. The style and delivery of the presentation was truly horrid. 58 minutes of his pontification was enough. I still need to sort through the "data" he presented in a very unclear manner to see if it makes sense. Too much "off the top of the head" and not enough deliberate, directed information.

CanIt by vandan · 2006-08-03 15:35 · Score: 1

CanIt works a charm for me. It's free ( beer free ) for 50 users, and uses open-source tools to get the job done. I used to get 30 - 50 spam messages per day ( and this was years back, before there was so much spam ). I might get 2 per week now, and the bayesian filter learns from experience, so whatever comes in at least helps you block more of the same stuff.

Slashdot Mirror

263 comments