Ask Slashdot: Why Can't Google Block Spam In Gmail?
An anonymous reader writes Every day my gmail account receives 30-50 spam emails. Some of it is UCE, partially due to a couple dingbats with similar names who apparently think my gmail account belongs to them. The remainder looks to be spambot or Nigerian 419 email. I also run my own MX for my own domain, where I also receive a lot of spam. But with a combination of a couple DNSBL in my sendmail config, SpamAssassin, and procmail, almost none of it gets through to my inbox. In both cases there are rare false positives where a legit email ends up in my spam folder, or in the case of my MX, a spam email gets through to my Inbox, but these are rare occurrences. I'd think with all the Oompa Loompas at the Chocolate Factory that they could do a better job rejecting the obvious spam emails. If they did it would make checking for the occasional false positives in my spam folder a teeny bit easier. For anyone who's responsible for shunting Web-scale spam toward the fate it deserves, what factors go into the decision tree that might lead to so much spam getting through?
Spam folder in my Gmail catches 99.9% of all spam I receive.
As a bonus: it's also excellent about learning what I mark as spam, and dealing with false positives.
I realize that this is not a helpful response, but my Gmail account never gets spam, it's all properly filtered into the spam folder. Been years since I even gave spam a second though, actually. I imagine that most peoples' situations are similar.
This has not been my experience at all. I've found Google's email filters to be significantly better than anyone else's.
I can think of several other reasons not to use gmail - but spam filtering is not on that list.
#DeleteChrome
Google does an excellent job of catching spam. The submitter's problem isn't that, it's that he's got other numpties giving out his email address and then he's not using the Google-supplied tool (that little "mark as spam" button) to mark unwanted email so that Gmail learns his preferences. Instead, he's Dunning-Krugered together his own solution that barely works.
Submitter's problem is PEBKAC.
Hail Eris, full of mischief...
E pluribus sanguinem
Google can not do that because while for YOU an email in Chinese is a huge red flag, it means nothing to the chinese american student living in New York who still gets emails from her cousin in Hong Kong.
Most of the decisions you make are like this one. For you, country, language, etc. etc. are indications of spam, but they are not true for the general population.
So a spam filter designed for your personal use will always work a lot better than one designed for all users of google.
excitingthingstodo.blogspot.com
I'm not sure what this guy is doing, but when I ran my own mail server (which I did personally and professionally for well over a decade), spam was a huge problem for me. No combination of spamassassin, rbl's, heuristics signature checks, virus, etc... Nothing got me past 85-90% blockage. And I did everything right. And it was a constant unending fight.
When I switched to Google apps for my personal domain, my life changed. Google catches a HUGE amount of spam. Things still get through occasionally, and definitely get worse as black Friday and Christmas campaigns kick into high gear. But the majority of the spam I get is from legitimate business that decides to put me on their mailing lists without my permission.
The op either has on blinders, or is baiting.
There are lots of legitimate sites that send emails on behalf of someone not on the domain. A lot of 'email this content to someone' links work that way. Maybe Microsoft understands how email is used in the real world far better than you do.
switch over to Yahoo mail
I've seen a lot of recent spam campaigns that get through my basic scanning using the following tactics:
1. Careful design to not trigger Spamassassin content rules, including blocks of text to fool the bayes filter.
2, Careful omission of any identifying headers except for completely valid SPF and DKIM headers with appropriately configured DNS.
3. Real Linux mail servers dropped onto virtual hosting providers.
4. Fresh IP addresses and domains - never used domains that are not blacklisted yet and IP addresses blocks from the hosting providers that take 10-30 minutes to get blacklisted
Then they use snowshoe spam tactics to trickle them out until they're blacklisted and then move to the next domain and address.
If your address is on the lists that the perpetrators of these campaigns are using, it's really hard to avoid spam right now. Not impossible, there are some countermeasures, but vanilla Spamassassin and your standard appliances are going to have problems. I can imagine google is going to have an easier time with this because of its size and volume (=more information), but it's far from trivial.
-db
" It cannot just mark all advertisement as spam"
Advertisements in email are competition, not revenue. Google's incentives and your own are aligned.
because that alerts the spammer that they are detected and they need to change up their messsage/delivery
Snowden and Manning are heroes.
Disclosure: my name is Bruno Bowden and I managed the engineering team on Enterprise Gmail many years ago at Google before leaving to work in venture capital. My profile is www.linkedin.com/in/brunobowden. Though I didn't work on spam fighting directly, I interacted a great deal with the spam team while I worked there.
One of the main architects of the spam fighting system - Brad Taylor - published a scientific paper on "Sender Reputation in a Large Webmail Service" - http://www.ceas.cc/2006/19.pdf. This has a lot of detail about the system. We keep much of the internals secret as it reduces the chance that a spammer can reverse engineer and work around the system. If you'll allow me to be vague, the number of signals it uses was stunning to me. There's a mixture of hard wired tests (e.g. is the sender in someone's address book), reputation (domain and content), machine learning and anything else we can make work.
One of the principle improvements came when we switched to user classification through the "Report Spam" button. People have different opinions on what constitutes spam, so individual filtering is far more effective. It also avoids the politics of certain lists of domains and IPs from third parties which can be controversial. Even then it has challenges, as sometimes users will mistakenly pick out a phishing email and mark it "Report Not Spam". Because of that, Gmail now adds a red warning banner to indicate more strongly what is a likely a phishing attempt. In general, Google has tried to be very supportive of encryption, e.g. DKIM for authentication (and SPF) to STARTTLS for privacy. I would also like to mention the abuse team that works hard to prevent gmail being used as a source of spam, shutting down accounts as soon as possible after suspicious email is sent, then helping affected users to recover their account.
In general, the Gmail has received a lot of compliments on the spam filtering, I'm sure the team will be grateful for the positive comments here on Slashdot. There are still things that can confuse the system, e.g. receiving forwarded email (which might be missing source IPs) or genuine email that is sent to the wrong address. Though the system isn't perfect, I know the team will continue to work hard on it.