Spammers Choose GMail
EdwardLAN writes "A study by Roaring Penguin has discovered that during the past three weeks, the amount of spam originating from Gmail has risen sharply." My spam has been pretty ridiculously high for the last few weeks, although I have no idea if this is part of it. It really does seem like gmail's spam filters are declining these days.
Maybe they should have just kept the system invite-only, instead of opening it up to everyone -- that would help, the way I see it.
How does spammers creating gmail accounts to send spam from imply that gmail's spam filters for inbound mail are declining? (if that is indeed what the summary is supposed to say).
Half of the spam I get on my gmail account that actually gets past the filter is in some language other than English... in fact it's almost always in Cyrillic as well.
Give me a damn drop down that says "I speak English, anything not in English is not to me".
Won't solve their outgoing problem, but adding "this is my language" support would be a big help on the incoming, at least with my spam patterns.
I've got maybe 3 a week, which is up from the normal of 1 per month, but it's not really too big of a deal.
IIRC, marking an email as spam or moving the message to the spam folder (if you're using Gmail's IMAP function as I am) helps to train the filter.
Gmail used to be touted as the best spam filtering service. Certainly it's good, but apparently they only feel the need to filter incoming messages. Why not filter outgoing messages as well? It can't really be a CPU problem, because outgoing has to be just a small fraction of incoming, right?
Is it just tradition? People never expect anything they send to ever have anything done to it? Google could set another precedent in webmail by introducing outgoing filters which would block or slow down mail appearing to be 'spammy'.
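A rough sketch of what "block or slow down" could look like for outbound mail. Everything here is an assumption for illustration: the `spam_score` (0.0 to 1.0) would have to come from some classifier, and the thresholds and the `outbound_decision` name are made up, not anything Gmail is known to do.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    action: str          # "send", "delay", or "block"
    delay_seconds: int   # only meaningful when action == "delay"


def outbound_decision(spam_score: float,
                      slow_at: float = 0.5,
                      block_at: float = 0.9,
                      max_delay: int = 3600) -> Decision:
    """Decide what to do with one outgoing message (hypothetical policy)."""
    if spam_score >= block_at:
        # Very spammy: refuse to send at all.
        return Decision("block", 0)
    if spam_score >= slow_at:
        # Borderline: hold the message, longer the spammier it looks.
        fraction = (spam_score - slow_at) / (block_at - slow_at)
        return Decision("delay", int(fraction * max_delay))
    # Looks fine: send immediately.
    return Decision("send", 0)


if __name__ == "__main__":
    for score in (0.1, 0.7, 0.95):
        print(score, outbound_decision(score))
```

Delaying instead of blocking outright keeps false positives cheap: a legitimate message still goes out, just later, while a bulk run gets throttled.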
Yeah, I've thought the same thing, too. It wouldn't be that hard to filter. You could just select a charset (like Latin-1), and if less than 90% of the characters in a given message are representable in your chosen charset, automatically kill it. That wouldn't require figuring out the actual human language it was written in; it's a pretty trivial automatic test.
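A minimal sketch of that charset test, reading the rule as "kill the message when fewer than 90% of its characters can be encoded in the chosen charset". It assumes the message body has already been decoded to a Unicode string (MIME decoding is not shown), and the function names are made up for illustration.

```python
def representable_fraction(text: str, charset: str = "latin-1") -> float:
    """Fraction of non-whitespace characters that can be encoded in charset."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 1.0  # nothing to judge, so don't penalize an empty body
    ok = 0
    for c in chars:
        try:
            c.encode(charset)
            ok += 1
        except UnicodeEncodeError:
            pass  # character does not exist in the chosen charset
    return ok / len(chars)


def should_drop(text: str, charset: str = "latin-1",
                threshold: float = 0.90) -> bool:
    """True when less than `threshold` of the message fits the charset."""
    return representable_fraction(text, charset) < threshold


if __name__ == "__main__":
    print(should_drop("Hello, this is a normal message."))
    print(should_drop("Привет, это спам"))
```

Note the limit of this test: it separates scripts, not languages, so German or French spam would still pass a Latin-1 check.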
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
It's the outgoing spam from Gmail that's the problem, not the incoming spam, and there have been messages on the Gmail forums about Gmail servers being blocked for spam. If Google doesn't do something about it, then Gmail accounts will end up "read only".
And having Google themselves impose outgoing spam filtering is something else to worry about, if you're a Gmail user.
I haven't noticed any particular trouble with spam originating from Gmail, and Gmail has still been pretty good at filtering most of my spam.
But if you really want Google to do something about spam, go after them for their negligence on Google Groups. They've allowed the service to become almost unusable due to the amount of spam they let through. For actual Google Groups it's not a big problem, but for USENET groups it is. Most people on USENET are just dropping anything coming from Google Groups outright. Any legitimate posts from Google Groups are considered an "acceptable loss" given the amount of godawful spam that comes with them. It really cheeses me off that Google won't do something about it.
The summary implies that there's something wrong with the GMail spam filters. Actually, the problem is with the GMail spammer filters... the CAPTCHA.
Also, both Google and spammers are being overly complacent about people blocking GMail:
Actually, several sites have blocked Google SMTP hosts that show large spam outflow (it seems to be specific hosts, as if specific accounts are allocated to specific servers or clusters of servers). Including, and I know the irony is thick enough to cut with a knife, MSN Hotmail. There have even been a number of posts to Google's help forums complaining about mail not being sent because Google servers are being blacklisted.
Yeah, that's why I mentioned the Cyrillic thing.
In reality, doing it via language matching should be pretty trivial. I'd hazard a guess that if you had a list of 30 languages and pulled out the 50 most common words in each, you'd probably have near 100% success in detecting the primary language of an e-mail. I'm sure an algorithm, based either purely on that word set or on a larger dictionary chosen based on that matching, could determine with very high confidence what language an e-mail is in and whether there's more than one or two languages in it.
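A toy version of the common-words idea, just to show the shape of it. The three ten-word lists below are hand-picked placeholders, not real frequency data; the proposal above is for about 30 languages with 50 words each. All names are hypothetical.

```python
import re
from collections import Counter

# Placeholder word lists; a real filter would use proper frequency lists.
COMMON_WORDS = {
    "english": {"the", "and", "of", "to", "in", "is", "that", "it", "for", "you"},
    "german": {"der", "die", "und", "das", "ist", "nicht", "ich", "zu", "mit", "ein"},
    "russian": {"и", "в", "не", "на", "что", "я", "с", "это", "как", "он"},
}


def language_scores(text: str) -> Counter:
    """Count how many words of the message hit each language's word list."""
    scores = Counter()
    for word in re.findall(r"\w+", text.lower()):
        for lang, vocab in COMMON_WORDS.items():
            if word in vocab:
                scores[lang] += 1
    return scores


def primary_language(text: str):
    """Best-scoring language, or None when no listed word appears."""
    scores = language_scores(text)
    if not scores:
        return None
    return scores.most_common(1)[0][0]


def languages_present(text: str, min_hits: int = 2) -> list:
    """Languages with at least `min_hits` matches, to spot mixed messages."""
    return sorted(l for l, n in language_scores(text).items() if n >= min_hits)


if __name__ == "__main__":
    print(primary_language("The cat is in the house and it is happy"))
    print(primary_language("Я не знаю, что это и как он"))
```

Short messages with few common words will give weak or no signal, so a real filter would likely treat `None` as "unknown" instead of as a mismatch.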
They also know my white list of contacts. In my case I'd bet 90% of my e-mail comes from them so those can be immediately put in the inbox, reducing the number that need to be scanned at all.
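The whitelist shortcut could be sketched like this. The `is_spam` callable stands in for whatever content scanner sits behind it, and `route` and the folder names are made up for illustration.

```python
from typing import Callable, Set


def route(sender: str, body: str, contacts: Set[str],
          is_spam: Callable[[str], bool]) -> str:
    """Return "inbox" or "spam" for one incoming message."""
    # Known contacts go straight to the inbox without being scanned.
    if sender.strip().lower() in contacts:
        return "inbox"
    # Everyone else goes through the (hypothetical) content scanner.
    return "spam" if is_spam(body) else "inbox"


if __name__ == "__main__":
    contacts = {"alice@example.com"}
    scanned = []

    def toy_scanner(body: str) -> bool:
        scanned.append(body)          # record which messages were scanned
        return "viagra" in body.lower()

    print(route("Alice@Example.com", "lunch?", contacts, toy_scanner))
    print(route("stranger@example.net", "Cheap VIAGRA", contacts, toy_scanner))
    print(len(scanned))  # only the stranger's message reached the scanner
```

The obvious catch is forged sender addresses: trusting the From line alone only works if the sender has been authenticated somehow.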
Good I'm safe... It just asked for my credit card number.
CAPTCHA is broken: it's not just various implementations that are compromised, but the entire theory.
If you haven't been down-modded lately, you aren't trying.
Sacred cows make the best hamburger.
Google already does that for their ads. I'm an American living in Germany who also has friends in Japan that I correspond with in Japanese. I get ads in English, German, and Japanese (in fact, I get ads in Japanese offering to teach me English and/or German...), so if they can determine the language for the ads, they should be able to use it for spam. At the very least, an email in a language that isn't in your outbox should trigger something.
Well, I did this study and our results are here.
We in no way imply that Gmail's inbound spam filtering is bad. It's probably excellent. It's just difficult or impractical for Google to filter outbound mail without either human review or complaints because of false-positives.
What we're saying is that spammers are trying to evade IP reputation systems by hijacking organizations with good reputations or which would be impractical to block. There will be a CAPTCHA-cracking arms-race, but unfortunately I think the system will reach equilibrium with spammers quickly breaking CAPTCHAs and continuing to abuse free e-mail systems.
With most big-name email players like Gmail, Yahoo, etc. now using DomainKeys, the value of having an email address on any such system has skyrocketed. Gmail addresses are also usually considered even more respectable. So being on Gmail, and getting through because DomainKeys works, makes it a privileged domain.
What the proper response should be:
What should really happen is SenderKeys, which augments DomainKeys. You would get your own key once you become "verified", as at eBay and elsewhere. SenderKeys is implied by DomainKeys.
Slashdot's rate-of-post filter: Preventing you from posting too many great ideas at once.
Taking his mom was the first mistake.
Never argue with a man carrying a water buffalo
Well, as you increase the level of intelligence needed to get through the CAPTCHA, you start to leave humans out. And this only gets worse as CAPTCHA breakers get better and better, so in that sense the CAPTCHA is broken, and also in that sense we have artificial intelligence that is at least as good as the worst humans.
When his defense asked, "Which computer has Jon Johansen trespassed upon?" the answer was: "His own."
Spelling checkers is not the compete salutation.
Nerd rage is the funniest rage.
Blackwater would probably do it.
There's something to be said for this. Many of the major spammers have been identified (see ROKSO). The anti-spam community needs "boots on the ground" to do something about them. There are private companies in that business. Blackwater is one; Kroll is another. Spammers today are part of larger criminal enterprises, which makes them vulnerable to private investigators.