Slashdot Mirror


Filter-foiling Gibberish Becoming A Spam Staple

hcg50a writes "Wired has a story about the random words which have recently been appearing in spam. Antispam experts agreed that this isn't a brand-new technique, but said the addition of potentially filter-foiling gibberish is rapidly becoming a common component of spam."

4 of 606 comments

  1. Bayes filters deal with it fine by sidney · · Score: 5, Informative

    Paul Graham mentions the technique in this article, pointing out that Bayesian filters look for words that commonly appear only in spam or only in non-spam. The random words are common in neither, so the filters simply ignore them. As a technique, the random words would get past a filter that looks at some ratio of spammy to non-spammy words. But that's not how these filters work.
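    A minimal Python sketch of why the padding doesn't help, assuming the scheme described in Graham's essay: score only the 15 tokens whose probability is farthest from a neutral 0.5, and give never-seen words 0.4. The token probabilities and the name spam_score are made up for illustration, not taken from any real filter.

```python
from math import prod

UNKNOWN = 0.4  # assumed probability for words the filter has never seen

# Hypothetical per-token spam probabilities, as if learned from training mail.
TOKEN_PROBS = {"viagra": 0.99, "click": 0.95, "free": 0.90, "meeting": 0.05}


def spam_score(tokens, token_probs=TOKEN_PROBS, n_interesting=15):
    """Combine the most 'interesting' token probabilities into one score."""
    probs = [token_probs.get(t, UNKNOWN) for t in set(tokens)]
    # Interesting means far from 0.5, so near-neutral words sort last.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:n_interesting]
    spam = prod(top)
    ham = prod(1 - p for p in top)
    return spam / (spam + ham)


plain = ["viagra", "click", "free"]
gibberish = [f"xq{i}zt" for i in range(50)]  # 50 distinct never-seen words

score_plain = spam_score(plain)
score_padded = spam_score(plain + gibberish)
print(score_plain, score_padded)
```

    With only three known tokens the gibberish does fill the leftover slots, so the score dips a little, but it stays around 0.99. In a real message with 15 or more strongly spammy or hammy tokens, the filler words would not be scored at all.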

  2. Re:why not filter out 1337 sp3@k? by rgmoore · · Score: 5, Informative

    Why bother? A decently trained Bayesian filter will be able to recognize spam that contains a misspelled word or two, or that substitutes similar-looking characters. Then it will learn that those modified forms are a very strong indicator of spam. As Paul Graham (the main early advocate of Bayesian filters) has pointed out, there are legitimate reasons why you might see a mention of "Viagra" in your email, but no legitimate reason that you would see "V1agra", "\/iagra", "Vi@gra", or the like. Instead of slipping by my Bayesian filter, those variants actually stand out as particularly strong spam indicators.
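    To make that concrete, here is a small Python sketch of how per-token probabilities might be learned. It assumes a plain whitespace tokenizer, a tiny invented corpus, and clamping to the range 0.01 to 0.99; the function name train is hypothetical, and a real filter would use far more mail and a more careful tokenizer.

```python
from collections import Counter


def train(spam_msgs, ham_msgs):
    """Estimate P(spam | token) from the share of messages containing it."""
    def message_counts(msgs):
        # Count each token once per message it appears in.
        return Counter(w for m in msgs for w in set(m.lower().split()))

    spam_counts = message_counts(spam_msgs)
    ham_counts = message_counts(ham_msgs)
    probs = {}
    for word in spam_counts.keys() | ham_counts.keys():
        b = spam_counts[word] / len(spam_msgs)  # share of spam with the word
        g = ham_counts[word] / len(ham_msgs)    # share of non-spam with it
        probs[word] = min(0.99, max(0.01, b / (b + g)))
    return probs


# Invented training data, for illustration only.
spam = ["buy v1agra now", "cheap viagra online", "vi@gra cheap now"]
ham = ["study on viagra side effects", "lunch meeting now moved",
       "notes from the meeting"]

probs = train(spam, ham)
print(probs["viagra"], probs["v1agra"], probs["vi@gra"])
```

    In this toy corpus the correctly spelled word shows up in both kinds of mail and lands at a neutral 0.5, while each disguised spelling appears only in spam and is pinned at the 0.99 ceiling.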

    --

    There's no point in questioning authority if you aren't going to listen to the answers.

  3. Re:What I don't understand by he-sk · · Score: 4, Informative

    That's the text/plain part you see. The "advertisement" is in the text/html part.

    I was very irritated by that, too, until one day I was testing the HTML viewer of an e-mail client.

    --
    Free Manning, jail Obama.

  4. Re:What I don't understand by ElectricRook · · Score: 5, Informative

    I hope to hell they're fishing for non-bouncing addresses, because at the moment any email which SpamAssassin says is spam, I bounce.

    Don't ever do that. All spam has forged headers, so the bounce goes to whoever's address was forged. You're just making life hard on someone who had their address sold.

    I work for a big company, an icon of the computer business. Our mail servers get spammed a lot. We often have typical user names grafted onto the From or Reply lines. Since my user name is pretty damn common, and some of my work mail aliases are TLAs, I look at a lot of spam. When I read the headers (in a text file, not in easily spoofed mail software), the sender's domain is almost never even close to the domain of the spamming machine. Go put the IP addresses into dnsstuff.com, and compare that to the hostname. These turds hack the sendmail.cf file of the spamming machine. "SallySmith@aol.com" probably did not send spam-mail from a ".kr" ISP.
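    The same comparison can be roughed out in Python with the standard email and socket modules. This is only a sketch under assumptions: the sample message, the helper names (claimed_domain, relay_ips, looks_forged) and the pluggable resolve argument are all invented for illustration. It is also a crude heuristic, since legitimate mail is often relayed through other domains and every Received line except the ones your own server added can be forged too.

```python
import re
import socket
from email import message_from_string
from email.utils import parseaddr


def claimed_domain(msg):
    """Domain part of the From address, which the sender fully controls."""
    _, addr = parseaddr(msg.get("From", ""))
    return addr.rpartition("@")[2].lower()


def relay_ips(msg):
    """IPv4 addresses written in [brackets] in the Received headers."""
    ips = []
    for header in msg.get_all("Received", []):
        ips += re.findall(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]", header)
    return ips


def looks_forged(raw, resolve=lambda ip: socket.gethostbyaddr(ip)[0]):
    """True if no relay IP reverse-resolves into the claimed From domain."""
    msg = message_from_string(raw)
    domain = claimed_domain(msg)
    for ip in relay_ips(msg):
        try:
            host = resolve(ip).lower()
        except OSError:  # reverse lookup failed; skip this relay
            continue
        if host == domain or host.endswith("." + domain):
            return False
    return True


# Invented sample message; 203.0.113.7 is a documentation-only address.
RAW = (
    "Received: from mail.example.kr (unknown [203.0.113.7])\n"
    "\tby mx.example.com; Thu, 1 Jan 2004 00:00:00 +0000\n"
    "From: Sally Smith <SallySmith@aol.com>\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

# A stand-in resolver keeps the example offline; omit it to use real DNS.
print(looks_forged(RAW, resolve=lambda ip: "host7.example.kr"))
```

    Passing a fake resolver keeps the example runnable without network access. With the default argument it does a real reverse DNS lookup, which is roughly what pasting the IP into a lookup site does by hand.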

    --
    - High Tech workers, please say NO to Union Carpenters, their Union sees fit to control our compensation.