Fighting Spam with DNA Sequencing Algorithms

Posted by ryuzaki0 on Sunday August 22, 2004 @01:05AM from the crushing-the-mouse-with-a-mallet dept.

Christopher Cashell writes "According to this article from NewScientist, IBM's Anti-Spam Filtering Research Project has started testing a new spam filtering algorithm, an algorithm originally designed for DNA sequence analysis. The algorithm has been named Chung-Kwei (after a feng-shui talisman that protects the home against evil spirits). Justin Mason, of SpamAssassin, is quoted as saying that it looks promising. A paper is available on the algorithm, too (PDF)."


High tech for what ? by Ozh · 2004-08-22 01:21 · Score: 3, Interesting

Funny how some people develop more and more sophisticated stuff to fight against something that is as simple as sending out emails to random addresses... and so simple that it will never stop :/
The biggest problem I see, at the moment.... by Rahga · 2004-08-22 01:46 · Score: 3, Interesting

It looks like much of the spam I'm receiving today consists of either nearly blank messages or e-mails containing news articles that seem to be designed to pass through content filters just so users can send them back to their admins as spam, essentially making it easier for Bayesian filters and such to mark legitimate e-mail as spam.... though honestly, it's more of an annoyance for me, as it makes it easier for users to say "The spam filter isn't working, what are you doing wrong?"
Wrong title, I guess by stm2 · 2004-08-22 01:47 · Score: 5, Interesting

According to the /. title, it seems they used an algorithm for DNA sequencing, when in fact they used an algorithm for DNA analysis (or DNA sequence analysis, which is the same thing), more specifically, gene finding techniques. As you may know, most DNA in a genome is not translated into protein (some people still call it junk, but most of it is not junk at all). So there are programs to sort genes out from the rest of the DNA.
I think we will see more and more applications like this with the growing cross-pollination between Biology and CS.
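
As a toy illustration of that cross-over (explicitly not IBM's algorithm, whose details are in the linked paper), one can treat each message as a character sequence and collect fixed-length substrings that recur across known spam but never show up in known legitimate mail. The function names, the substring length `k`, and the `min_support` cutoff are all assumptions made for this sketch.

```python
from collections import Counter

def kmers(text, k):
    """Return the set of all length-k substrings of text."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def spam_patterns(spam, ham, k=6, min_support=2):
    """Substrings seen in at least min_support spam messages and in no ham message."""
    support = Counter()
    for msg in spam:
        # kmers() returns a set, so each substring counts once per message.
        support.update(kmers(msg.lower(), k))
    ham_kmers = set()
    for msg in ham:
        ham_kmers |= kmers(msg.lower(), k)
    return {p for p, n in support.items()
            if n >= min_support and p not in ham_kmers}

def score(message, patterns, k=6):
    """Fraction of the message's substrings that match a learned spam pattern."""
    found = kmers(message.lower(), k)
    return len(found & patterns) / len(found) if found else 0.0
```

A message would then be flagged when `score` exceeds some cutoff chosen on held-out mail; substrings that also occur in legitimate mail are discarded up front, which is one simple way to keep false positives down.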

--
DNA in your Linux: DNALinux
Re:Mozilla Firefox by littlem · 2004-08-22 02:21 · Score: 3, Interesting

My experience with it has been rather disappointing. Why I need to tag as spam two messages from the same sender or with the exact same subject is a mystery to me. After the 10th "Make $/d+ in XX days" type message one has to wonder just how effective this thing is.

This shouldn't be all that surprising - Bayesian filtering is all based on probabilities. The reason "Outlook message rules" are so bad is that a friend of mine might send me a joke about Viagra, which I don't want to have deleted indiscriminately as spam. False positives are infinitely more annoying than false negatives, so I'd much rather have conservative filtering that lets a bit of spam through.

I'm not saying Bayesian algorithms are perfect yet (though they'll improve) - my personal experience has been SpamAssassin, which got 97% of spam, and I've been experimenting with Thunderbird for a week, which gets 85%-90% and will no doubt get much much better as I train it in the next couple of weeks - but ultimately Bayesian filtering is enough to beat enough spam to make spamming not worthwhile (if everyone did it...)
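
A minimal sketch of what "based on probabilities" means, assuming a plain naive Bayes word model with add-one smoothing. This is not SpamAssassin's or Thunderbird's actual implementation; the class name `TinyBayes` and the training messages in the usage note are made up for illustration.

```python
import math
from collections import Counter

class TinyBayes:
    """Toy naive Bayes filter; needs at least one spam and one ham message."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.msgs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.words[label].update(text.lower().split())
        self.msgs[label] += 1

    def spam_probability(self, text):
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total_msgs = sum(self.msgs.values())
        log_score = {}
        for label in ("spam", "ham"):
            total = sum(self.words[label].values())
            # Prior: share of training messages carrying this label.
            logp = math.log(self.msgs[label] / total_msgs)
            for w in text.lower().split():
                # Add-one smoothing so unseen words never give probability zero.
                logp += math.log((self.words[label][w] + 1) / (total + vocab))
            log_score[label] = logp
        # Turn the two log scores into P(spam | text); clamp to avoid overflow.
        diff = min(log_score["ham"] - log_score["spam"], 700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

Trained on "buy viagra now" and "cheap viagra offer" as spam, and "joke about viagra from a friend" and "lunch at noon" as ham, a message like "viagra joke" comes out close to even odds. With a conservative cutoff (say, only delete above 0.9, an arbitrary choice here) it would be let through, which is the trade-off described above: accept a few false negatives to avoid false positives.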
Giving birth to Artificial Intelligence... by mcrbids · 2004-08-22 04:16 · Score: 3, Interesting

It's my belief that the most likely source of the birth of Artificial Intelligence will be the SPAM filter.

Think about it - we now have software that "learns" what you like.

Sorry, but anything that "learns" fits a definition of intelligence - using past results to predict future outcomes. Note that I'm not saying "self aware" or "conscious", simply "intelligence".

As we move forward, we'll see more and more intelligence on the part of the spammers, and the warring factions of intelligence will likely provide massive financial and political impetus to build ever more intelligent solutions - thus AI is born.

The problem with other vehicles for developing AI is simply the budget. With SPAM, everybody has a direct, financial incentive to develop it, so development will definitely happen!

--
I have no problem with your religion until you decide it's reason to deprive others of the truth.