Slashdot Mirror


Gmail Spam Filter Testing

An anonymous reader writes "What can you do with 1000MB of e-mail space on your Gmail account? One guy, by the name of Aaron Pratt ( prattboy@gmail.com ), has decided to test the spam filters of Google's Gmail service by having his Gmail account blasted with every kind of spam imaginable. He is testing to see how well Gmail's spam filters can sort out the spam from legitimate email (yes, he does get personal emails from people). As of May 25th, he was at about 30% of his Gmail account's 1GB capacity. You can track his progress on his website, http://gmail.prattboy.net (Google cache of this site: cache: gmail.prattboy.net). Here is also an article talking about Aaron's efforts from webpronews.com"

16 of 285 comments

  1. One of the best things Google/GMail could do by Anonymous Coward · · Score: 5, Interesting

    Is use the GMail data to operate a checksum blacklist. Obviously, if thousands (or millions) of their users are getting the exact same email, it's probably spam.
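    A minimal sketch of the checksum-blacklist idea above, not a description of anything Gmail is known to do. The class name `ChecksumBlacklist`, the helper `body_checksum`, and the threshold of 3 recipients are all assumptions made for illustration; a real service would pick a far higher threshold.

```python
import hashlib
from collections import defaultdict

# Hypothetical threshold: how many distinct recipients must receive an
# identical body before it is treated as probable bulk mail.
BULK_THRESHOLD = 3


def body_checksum(body: str) -> str:
    """Return a SHA-256 digest of the body, lowercased with whitespace collapsed."""
    normalized = " ".join(body.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class ChecksumBlacklist:
    """Tracks which recipients have received each distinct message body."""

    def __init__(self, threshold: int = BULK_THRESHOLD):
        self.threshold = threshold
        self.recipients = defaultdict(set)  # checksum -> set of recipients

    def observe(self, recipient: str, body: str) -> None:
        """Record that `recipient` received a message with this body."""
        self.recipients[body_checksum(body)].add(recipient)

    def is_probable_spam(self, body: str) -> bool:
        """True once enough distinct recipients have seen the same body."""
        seen_by = self.recipients.get(body_checksum(body), ())
        return len(seen_by) >= self.threshold
```

    Counting distinct recipients rather than raw deliveries keeps one person's repeated mail from tripping the threshold. An exact checksum is also easy to defeat: appending a single random token (a "hash buster") to each copy changes the digest completely.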

    1. Re:One of the best things Google/GMail could do by Cruciform · · Score: 5, Interesting

      I've been getting them as well.
      The only reason I could think of for someone sending those around is to bog down Bayesian filters with random crap, possibly lowering their effectiveness.

      Any spammers/spam-experts feel like enlightening us? :)

    2. Re:One of the best things Google/GMail could do by Xzzy · · Score: 3, Interesting

      My server was set up to forward anything sent to one of my domains to get dumped into a common inbox. I noticed a ways back (before I changed my config to just bounce all this crap) that I'd get a lot of those dictionary emails to random email accounts.

      So either it's some kind of probe to find working addresses, or a filter clogger. Or maybe both.

      For a few of the random emails I would later start getting "real" spam. Not a majority though.

    3. Re:One of the best things Google/GMail could do by dragonman97 · · Score: 4, Interesting

      Indeed - while I was doing a lot of spam fighting at work, I reviewed a honeypot I'd set up, and was amazed. I used mutt to review the messages, and found a couple of messages where the text part was a page or two from "The Wizard of Oz" and the nasty offer for some kind of auto insurance or other crap was in the HTML section, replete with hidden hash busters behind color backgrounds. These guys are sharp - they must be paying some smart programmers a lot of money, and it's only sad that they've sunk to such levels.

  2. Should be interesting, what filters? by Clinoti · · Score: 4, Interesting

    Can anyone provide a link or source to the kind of filters google has working on gmail?

    --

    Let's keep in mind that patents are in place to keep lawyers employed and keep them litigating. -CatGrep

  3. Is this the AventureMail guy? by magefile · · Score: 5, Interesting

    The guy who got booted off AventureMail (2GB free) for trying to test their spam filters? The story is on Kuro5hin, if anyone wants to see it.

  4. About spam and blocking by AviLazar · · Score: 4, Interesting

    While we cannot block every domain name (e.g. if you get spam from $#(*$#sexphreak@yahoo.com) because it will alienate your legitimate contacts, there are many domain names that we can block (e.g. @spam-your-gmail.com). Yahoo provides email/domain name blocking, but limits this to 100 (unless you are paying). Do we know if gmail will have this limitation?
    -A
    *just for those who didn't know, the above domain names and email accounts are random, any resemblance to an actual domain or email account is purely coincidental, and if you choose to do so, you should sue /., not me :)

    --

    I mod down so you can mod up. You're welcome.
  5. 1gb Relieves Spam Concerns by osewa77 · · Score: 4, Interesting

    I have subjected my e-mail address, afriguru@gmail.com, to the same abuse by redirecting all e-mail addresses that receive lots of junk mail to this one and posting the address unprotected to lots of websites and newsgroups. At the initial stage, a lot of 419 scam mails got through, but now I hardly get any spam. No false positives for me so far.
    _____________________
    Seun Osewa, Abeokuta Nigeria

  6. 0% Spam by yuri · · Score: 5, Interesting

    Spam is unsolicited, so google should filter none of his mail.

    This guy solicited it.

  7. Lack of updates? by Xiadix · · Score: 5, Interesting

    Did anybody else notice that his site hasn't been updated in almost a month (May 25)? Seems his project is no longer working. I wonder if Google booted him.

    KevG

  8. It's going to get a lot better... by waytoomuchcoffee · · Score: 4, Interesting

    For those of you who don't have Gmail yet, there is a little "Report Spam" button you can use to, well, report spam. When Gmail gets a few million users, and even 1% use this little button, you are going to see the spam detection rate skyrocket.

  9. Re:whining? by Valluvan · · Score: 3, Interesting

    Not many are as gregarious as Pratt. I've been using gmail for some time now. I must say google has done a pretty good job with their spam filters. For not-high-volume users (which most people are), gmail works much better than other email providers (I have yahoo, ureach and hotmail accounts which I use regularly).

    Of course, google should improve and filter out the occasional crap I get too. And also offer 1 TB.

    --

    Science as a way of life.
  10. Re:He gave out his e-mail address... by Algan · · Score: 4, Interesting

    It's not as bad as you think. I posted a dedicated email address to slashdot two times already, just to see what volume of spam I get. Surprisingly, it's only 2-3 messages every other day or so.

    Well, I guess I need a booster shot, so here it is: slashdot@hates.ms. Spam away...

    --
    If con is the opposite of pro, is Congress the opposite of progress?
  11. Dumb question about SPAM filters.. by StressGuy · · Score: 3, Interesting

    I have Mozilla, it has a Bayes SPAM filter. Lately, it's been getting fooled more and more. The messages that make it through have one or more of the following features:

    1) Several intentionally mis-spelled words

    2) Lots of text in white (so it's invisible or nearly invisible)

    3) Message in .GIF form only - no plain text.

    Could you add filters that look for, say, more than 10% of the words mis-spelled, text font nearly equal to background color, or no actual text in message? These would take effect in addition to the existing Bayes filter.
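    The three extra checks proposed above could be sketched roughly as follows. This is an illustration under stated assumptions, not how Mozilla's filter works: `KNOWN_WORDS` is a tiny stand-in for a real dictionary, colors are assumed to be six-digit hex strings, and the function names and the tolerance of 30 are made up for the example. Only the 10% misspelling cutoff comes from the question.

```python
import re

# Assumption: a tiny stand-in word list; a real filter would load a full dictionary.
KNOWN_WORDS = {"hello", "for", "you", "cheap", "pills", "the", "a", "meeting"}


def misspelled_ratio(text, dictionary):
    """Fraction of alphabetic words in `text` that are not in `dictionary`."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    unknown = sum(1 for word in words if word not in dictionary)
    return unknown / len(words)


def parse_hex_color(value):
    """Convert '#rrggbb' into an (r, g, b) tuple. Assumes six hex digits."""
    value = value.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def colors_nearly_equal(fg, bg, tolerance=30):
    """True when every RGB channel of fg and bg differs by at most `tolerance`."""
    pairs = zip(parse_hex_color(fg), parse_hex_color(bg))
    return all(abs(a - b) <= tolerance for a, b in pairs)


def heuristic_flags(text, fg="#000000", bg="#ffffff", dictionary=KNOWN_WORDS):
    """Return the names of the heuristic checks that the message trips."""
    flags = []
    if not text.strip():
        flags.append("no_text")        # e.g. a message that is only an image
    elif misspelled_ratio(text, dictionary) > 0.10:
        flags.append("misspelled")     # more than 10% unknown words
    if colors_nearly_equal(fg, bg):
        flags.append("hidden_text")    # text color close to background color
    return flags
```

    Flags like these would be combined with the Bayes score rather than replace it, since proper names and technical jargon in legitimate mail can also push the unknown-word ratio above 10%.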

    --
    A goal is a dream with a deadline
  12. Aventuremail not as tolerant by dirvish · · Score: 3, Interesting

    I tried to do the same thing with my AventureMail account but AventureMail wasn't cool with it. They deleted my account! You can check out what little data I collected before the account suspension and read the emails to and from AventureMail about the merits of the account suspension at http://3fingersalute.net/aventuremail

  13. New spin on the "word salad" strategy by Scott+Richter · · Score: 5, Interesting

    "Except that won't work, as anyone that understands Bayesian filtering will tell you. In the case of every message with "random words" I've checked recently, the random words actually increased the spam score of that message. Why? Because it seems the random words aren't so random and either the same spammer is using the same "random words" over and over or various spammers are using sets of the same words. Over time most of the "random words" they use actually become great indicators of spam since my real email doesn't typically contain the random words they use."

    Right, and my Thunderbird Bayesian filter catches all of those word salad approaches. But they've come up with a new one - what I call the "encyclopedia attack."

    What they do is copy an encyclopedia entry and put it at the bottom of their spam. The thing is usually a few paragraphs long, so that textually it dominates the message. The subjects are fairly random, and are occasionally educational ;)

    The problem is that the text of this doesn't trip the "too many strange words" flag that's used for word salads. My Thunderbird filter is really having trouble with these. Anyone else having trouble with these spams?
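    A toy model of both effects discussed in this comment, assuming a plain naive Bayes scorer with add-one smoothing and equal priors. It is not Thunderbird's actual algorithm; the class name `TinyBayes` and the training messages in the usage note are invented for illustration.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes scorer: positive log-odds lean spam, negative lean ham."""

    def __init__(self):
        self.spam = Counter()  # token counts seen in spam
        self.ham = Counter()   # token counts seen in legitimate mail

    def train(self, text, is_spam):
        """Add the whitespace-separated tokens of `text` to one class."""
        target = self.spam if is_spam else self.ham
        target.update(text.lower().split())

    def log_odds(self, text):
        """Sum per-token log(P(token|spam) / P(token|ham)) with add-one smoothing."""
        vocab = len(set(self.spam) | set(self.ham)) + 1  # +1 slot for unseen tokens
        spam_total = sum(self.spam.values())
        ham_total = sum(self.ham.values())
        score = 0.0
        for token in text.lower().split():
            p_spam = (self.spam[token] + 1) / (spam_total + vocab)
            p_ham = (self.ham[token] + 1) / (ham_total + vocab)
            score += math.log(p_spam / p_ham)
        return score
```

    Train it on two copies of the spam "cheap pills zebra quorum lantern" and on the legitimate messages "meeting notes for the project" and "lunch on friday with the team". The message "cheap pills zebra quorum" then scores higher than "cheap pills" alone, because the reused "random" words were only ever seen in spam. Padding the pitch with ordinary prose, as in "cheap pills the project meeting notes", pulls the score below that of "cheap pills", which is why a long encyclopedia excerpt can drown out the spammy tokens.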