Ask Slashdot: Why Can't Google Block Spam In Gmail?
An anonymous reader writes: Every day my Gmail account receives 30-50 spam emails. Some of it is UCE, partially due to a couple of dingbats with similar names who apparently think my Gmail account belongs to them. The remainder looks to be spambot or Nigerian 419 email. I also run my own MX for my own domain, where I also receive a lot of spam. But with a combination of a couple of DNSBLs in my sendmail config, SpamAssassin, and procmail, almost none of it gets through to my inbox. In both cases there are rare false positives where a legit email ends up in my spam folder, or in the case of my MX, a spam email gets through to my Inbox, but these are rare occurrences. I'd think with all the Oompa Loompas at the Chocolate Factory that they could do a better job rejecting the obvious spam emails. If they did it would make checking for the occasional false positives in my spam folder a teeny bit easier. For anyone who's responsible for shunting Web-scale spam toward the fate it deserves, what factors go into the decision tree that might lead to so much spam getting through?
Oompa Loompas HA...
Spam folder in my Gmail catches 99.9% of all spam I receive.
As a bonus: it's also excellent about learning what I mark as spam, and dealing with false positives.
It's likely a machine learning algorithm that is being trained by your responses. Like Pandora or Netflix. Try training it.
Just mark them as spam. Getting that amount of spam almost certainly means you are doing, or have done, something wrong.
Spam is difficult, and Google is among the best (if not the best) at it, if nothing else because of the sheer amount of training data they have for their spam filters. The only more secure way is to allow only verified addresses and contacts (which you can set up if you really want to).
Stop leaking your mail address and you may have less of a problem
I realize that this is not a helpful response, but my Gmail account never gets spam; it's all properly filtered into the spam folder. It's been years since I even gave spam a second thought, actually. I imagine that most people's situations are similar.
This has not been my experience at all. I've found Google's email filters to be significantly better than anyone else's.
I can think of several other reasons not to use gmail - but spam filtering is not on that list.
#DeleteChrome
I get almost zero spam in my inbox. It all goes to the spam folder, where I look occasionally for things that might have been false positives, but even that happens almost never unless I've accidentally ID'd something as spam myself.
If you want news from today, you have to come back tomorrow.
I get no spam, have had the account from when it was invitation only, and have used it in countless purchases and sign ups.
"If any question why we died, Tell them because our fathers lied."
I think more likely what occurs is that they need to be extremely careful about false positives. So they push everything into a SPAM folder. But if you miss a critical email because Google accidentally thought something was spam when it wasn't, then Hello lawsuits. From a legal perspective, blocking anything going into their inboxen is a risk.
Agreed. I run both my company's network (mx, spf, all that jazz) and my personal mail through gmail, and I get maybe 1 spam message per month on each account, tops. I often open them, as it is usually an interesting trick that the spammer used (that google will pick up immediately and I'll never see again).
Google does an excellent job of catching spam. The submitter's problem isn't that, it's that he's got other numpties giving out his email address and then he's not using the Google-supplied tool (that little "mark as spam" button) to mark unwanted email so that Gmail learns his preferences. Instead, he's Dunning-Krugered together his own solution that barely works.
Submitter's problem is PEBKAC.
Hail Eris, full of mischief...
E pluribus sanguinem
Google cannot do that because while for YOU an email in Chinese is a huge red flag, it means nothing to the Chinese-American student living in New York who still gets emails from her cousin in Hong Kong.
Most of the decisions you make are like this one. For you, country, language, etc. etc. are indications of spam, but they are not true for the general population.
So a spam filter designed for your personal use will always work a lot better than one designed for all users of google.
I have a rather trivial gmail nickname. I have 3 mails inside the spam folder and of course none in my inbox. Either I've been extremely conservative in publicizing my address or you are especially bad at it.
I would have to think that spammers start every script with instructions creating as many combinations of addresses with @gmail.com as they can.
Then, there is exposure. How many lists have you included that gmail account in compared to the one you host?
Just a thought.
throw the baby out. The bathwater is cold
Track down and punish all the retards that actually buy the stuff advertised by spam.
I'm not sure what this guy is doing, but when I ran my own mail server (which I did personally and professionally for well over a decade), spam was a huge problem for me. No combination of spamassassin, rbl's, heuristics signature checks, virus, etc... Nothing got me past 85-90% blockage. And I did everything right. And it was a constant unending fight.
When I switched to Google Apps for my personal domain, my life changed. Google catches a HUGE amount of spam. Things still get through occasionally, and it definitely gets worse as Black Friday and Christmas campaigns kick into high gear. But the majority of the spam I get is from legitimate businesses that decide to put me on their mailing lists without my permission.
The op either has on blinders, or is baiting.
... I was working tech support for a small ISP, we got a call from a gentleman about his email, and how it was filled with spam, so at his request we took a look at it, it appeared he signed up for dozens of pornographic email lists, the messages weren't spam but were the daily photos of a girl fucking a horse, and a daily photo of a woman making love to a guy dressed like a panda bear...
About 2/3s of them were marked as read...
he proceeded to tell us that this isn't something he signed up for, and we should just delete it all. You could almost hear his wife (who a call search later would reveal called earlier in the week to determine how to access the email account) in the background...
Perhaps the problem isn't gmail so much as the fact that you signed up for the lady making sweet love to the stick shift in an american automobile mailing list?
I've had the exact opposite experience. GMAIL's filters are so much better than any service out there. I get less than 1 SPAM email a month into my actual inbox.
There are lots of legitimate sites that send emails on behalf of someone not on the domain. A lot of 'email this content to someone' links work that way. Maybe Microsoft understands how email is used in the real world far better than you do.
switch over to Yahoo mail
If false positives are a 100% no-no for you, then you get to enjoy reading every mail in your spam folder, or you get to switch to a different technology than email.
I've seen a lot of recent spam campaigns that get through my basic scanning using the following tactics:
1. Careful design to not trigger Spamassassin content rules, including blocks of text to fool the bayes filter.
2. Careful omission of any identifying headers except for completely valid SPF and DKIM headers with appropriately configured DNS.
3. Real Linux mail servers dropped onto virtual hosting providers.
4. Fresh IP addresses and domains: never-used domains that are not blacklisted yet, and IP address blocks from hosting providers that take 10-30 minutes to get blacklisted.
Then they use snowshoe spam tactics to trickle them out until they're blacklisted and then move to the next domain and address.
If your address is on the lists that the perpetrators of these campaigns are using, it's really hard to avoid spam right now. Not impossible, there are some countermeasures, but vanilla Spamassassin and your standard appliances are going to have problems. I can imagine google is going to have an easier time with this because of its size and volume (=more information), but it's far from trivial.
-db
One guy lives in Utah, another goes to Colorado University, another lives in Southern California, and there are a few more. I regularly get emails for these guys regarding classes, vehicles, rental properties, etc. I also get signed up for lots of spam and unwanted porn crud. I did create a label and look through it from time to time to make sure there isn't anything meant for me. I think the best solution is to create a new email address that is somewhat unique and forward the old one to it until the people you want email from know it. Also, I would never get rid of the old address. You never know what online account you barely use that you forgot to change over.
I want my! I want my! I want my Eee PC!
You have to be careful not to break mailing lists etc. there are plenty of systems which mess up the headers.
Simple techniques (such as comparing the origin of the email with the domain) were beyond him.
I'm not entirely clear on the method which you presented, but one of the very hard problems in spam blocking is that there's a vanishingly close to zero margin for false positives - especially for systems where the person installing the filter is not the person receiving the email. Sending legitimate email to the spam folder is simply unacceptable, particularly if the user hasn't opted-in for aggressive spam filtering. It doesn't matter if the message looks very "spammy", a user wants to receive messages from Aunt Edna, and isn't in a position to strongarm her about her ISP's funky email setup.
So yes, 99% of messages that have email origins that don't match the domain might be spam, but that remaining 1% still totals a *lot* of emails, and BOFHing with "Not my problem - tell Aunt Edna to get a different ISP" doesn't cut it. (Again, it doesn't matter if *your* legitimate email all conforms, Microsoft and Google are dealing with millions of people, many of whom don't know what "RFC" stands for, let alone base their ISP selection - or the selection of people they correspond with - on RFC compliance.)
"Simple" techniques tend to be "simplistic" techniques, and fall short when you get to the woolly world of reality.
How do you compare to the usual solutions checklist?
https://craphound.com/spamsolutions.txt
Catching spam and filtering it is the wrong way to deal with the spam problem. At that point the spam has already been sent, already taken up storage and CPU time somewhere, and already cost you money (yes, even with a "free" email account like gmail it still costs money somewhere). And if you add in the costs of filters, with the admin time and storage they consume, it is even worse.
As I have said many times before, the only effective way to deal with spam is to approach it from an economic angle, as spam is an economic problem. Spam isn't sent out to piss you off, it is sent to make money. The spammers don't need you personally to buy anything, they just need someone else to buy something. The ROI on spam is incredible as the cost is almost nothing to send to billions of addresses, and only a couple of suckers are required in order to make money off the venture.
If you want to actually help end the spam epidemic, stop talking about filters and other crappy "solutions" that only accelerate the arms race with the spammers. The way to stop spam is to remove the profit motive. This has been done successfully already; if you can prevent the spammers from getting paid they won't send spam because it won't be worth their time. Groups have succeeded in this and the effect has been dramatic. By contrast filters just encourage spammers to employ more creative measures to get their messages through - many of which result in reducing the S:N ratio of filters.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
I virtually NEVER see spam in Gmail; they do a great job.
The reason is that they bought Postini several years ago. That technology looks for the same body text being sent to many people in a short time interval; if the body is never customized, then they know it's spam. It's much, much more effective than looking into the content for key words or phrases, even though it slows down mail by a few seconds to get a decent sample of mails from those @@(%&^ spammers.
If you're still getting spam, go look at your email settings in Gmail to see if you have disabled spam filtering (at the site). If you receive your eMail via IMAP on your computer (e.g., Outlook or Thunderbird), make sure you don't have a "SPAM" folder locally, so Gmail doesn't try to sync it.
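To make the "same body sent to many people in a short interval" idea above concrete, here is a minimal sketch. It is not Postini's or Gmail's actual implementation, which is not public; the class name `BulkBodyDetector`, the 60-second window, and the threshold of 3 recipients are all assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict, deque


class BulkBodyDetector:
    """Flags a body once it has been seen for many distinct recipients
    inside a sliding time window. All thresholds are illustrative."""

    def __init__(self, window_seconds=60.0, recipient_threshold=3):
        self.window_seconds = window_seconds
        self.recipient_threshold = recipient_threshold
        # body fingerprint -> deque of (timestamp, recipient) sightings
        self._sightings = defaultdict(deque)

    @staticmethod
    def fingerprint(body):
        # Collapse case and whitespace so trivial reformatting hashes the same.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def observe(self, body, recipient, timestamp):
        """Record one delivery; return True if the body now looks like bulk mail."""
        seen = self._sightings[self.fingerprint(body)]
        seen.append((timestamp, recipient))
        # Drop sightings that have aged out of the window.
        while seen and timestamp - seen[0][0] > self.window_seconds:
            seen.popleft()
        distinct_recipients = {rcpt for _, rcpt in seen}
        return len(distinct_recipients) >= self.recipient_threshold


detector = BulkBodyDetector(window_seconds=60, recipient_threshold=3)
for second, rcpt in enumerate(["a@example.com", "b@example.com", "c@example.com"]):
    print(rcpt, detector.observe("Cheap pills, click here", rcpt, second))
```

An exact hash only catches bodies that are never customized, which is the case described above. A spammer who personalizes each message would defeat it, so a real system would presumably need fuzzier matching.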
Care to share your ideas?
Okay, I moved because I had someone spoof my email a couple times, and the IP range of the server farm I was based out of (The Planet) on my shared box got flagged in spamhaus twice in 3 years. For a business, that's death to have everyone in the world reject your (legitimate) incoming emails.
I might get 1-2 actual spam emails a month through Google's filter, with hundreds blocked every day - easily four 9s. Now, that doesn't include the friend whose email got hacked and now needs $1200 wired to him because he's been detained in [insert favorite European country here], or the Constant Contact emails I get from one or two of the vendors at the last business conference. As for false positives, I know of one in the past year that got accidentally flagged, so if there are/were more, the people sending the emails didn't care enough about it to follow up.
Is it just my observation, or are there way too many stupid people in the world?
I had this problem, and found that the Gmail documentation seemed a bit sparse on the subject (that I could find).
Recall that Google recently purchased (and presumably integrated) Postini.
Basically, you have to mark a message as "Junk" (or Spam, depending on your client), file it in the Spam folder, then "Empty Spam". What I believe happens at this stage is mail you've marked as Junk/Spam gets punted to an identification system so that it can later identify the pattern(s) as spam. Once I began doing this, I had much better luck with Gmail's spam filtering. Though I admit I wish they offered more fine-grained filtering -- for example, some /24's or domains I never want to receive email from.
Anyhow, I also believe the filters collect global data -- they must score it based on some algorithm, so that other users who receive the same spam get the benefit, too.
Anyone else want to chime in on what Gmail is doing in the background?
I use a spam filter which quarantines suspected spam. I then review the quarantine and white list or black list as appropriate. Not an ideal solution for large scale users, but for us it works.
Last week I black listed an email. The subject was "You've got to see this!" and the body was only a link. It turned out that it was a legitimate email so I turned around and white listed the sender. But that email would set off the spam flags on just about any filter, including human based filters. Sadly, there is no certain means of determining spam vs non spam.
My first language is not English and even for those mails gmail's spam filter works really, really well. I am starting to wonder whether running your own little email server has got something to do with it. I am assuming you are running a typical home server on a home connection with maybe a static IP. This is generally a very bad idea. Whatever IP you are connecting from, it is flagged as a "dial up pool" or "home connection pool", so emails coming from there will instantly look very suspicious to any spam filter.
Maybe you have been sending mails back and forth with your gmail account for so long that you've worn gmail's spam filter down? Maybe it thinks you actually want messages like that...
"Only one thing is impossible for God: To find any sense in any copyright law on the planet." - Mark Twain
I (and most gmail users I know) almost never get spam in my gmail inbox. Dunno why your experience is so different. The only complaint I have is one particular sender on one particular email list whose emails are consistently misclassified as spam. I always mark them as not spam, and have even sent them to google's spam team for analysis when offered that dialog, but gmail misclassifies his emails as spam every time. That is annoying and weird, but it's the only problem I have with gmail spam filtering.
The submitter does NOT complain about Google's ability to catch spam! He asks why Gmail does not REJECT obvious spam. Rejecting an email means that the server (in this case Gmail's) does not even accept it. In such cases the sender gets back a Delivery Status Notification from his own server, telling him that his email did not go through because of such and such error. An important point here is that the email is not lost without any notification. The sender can try to contact the recipient in another way. Actually this may be better than putting the email into a spam folder if that is not monitored regularly, or at all. Yes, this is a valid question, but almost no one has understood it.
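The difference between rejecting and filing in a spam folder can be sketched as a three-way decision made while the SMTP transaction is still open. This is only an illustration: the score scale, both thresholds, and the names `Verdict` and `triage` are assumptions, and nothing here describes how Gmail actually decides.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cut-offs on a 0..1 spam score; real systems tune these.
REJECT_THRESHOLD = 0.99
SPAM_FOLDER_THRESHOLD = 0.50


@dataclass
class Verdict:
    smtp_reply: str         # what the receiving server answers the sender
    folder: Optional[str]   # None means the message was never accepted


def triage(spam_score):
    """Map a spam score to an SMTP reply and a delivery folder."""
    if spam_score >= REJECT_THRESHOLD:
        # Refused during the SMTP session: the sending server keeps
        # responsibility and generates the bounce for its own user.
        return Verdict("550 5.7.1 Message rejected as spam", None)
    if spam_score >= SPAM_FOLDER_THRESHOLD:
        # Accepted but hidden: the sender sees success and is told nothing.
        return Verdict("250 2.0.0 OK", "Spam")
    return Verdict("250 2.0.0 OK", "Inbox")


for score in (0.995, 0.70, 0.10):
    print(score, triage(score))
```

Only the first branch gives a legitimate sender any signal that the message did not arrive, which is the trade-off the comment above describes.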
I personally like the idea of learning algorithms, through Mark as Spam or Add to Contacts. But as a sysop in a somewhat busy, mid-scale company MX, I find 2 big user-preference deterrents to its use:
My most used technique involves configuring amavis (spamassassin, amavis, etc) just like OP does, but then, and since I use ISPConfig with a plethora of configurable per-user Spam policies, I just tell everyone responsible for creating mailboxes to arbitrate between them, ad hoc. It works somewhat well: every month or so I get an unhappy camper, and I just accept the fact it happens.
... It's what makes HTML markup like bold and italics and stuff.
It little behooves the best of us to comment on the rest of us.
Not sure what nasty links you've been clicking on. But the reason I use Gmail and have dumped the Yahoos and Hotmails of the world is that I get ZERO spam... none, nada. Seriously, how did this become a post on Slashdot?
According to 'blame the victim' mentality, you shouldn't send your e-mail address around and it's your fault you're getting spam.
I was actually thinking of the opposite trend over the last couple of years: even people fully capable of running their own mail servers are all using gmail these days; I think we're easily at the breaking point where no one really knows how to run a mail server anymore.
" It cannot just mark all advertisement as spam"
Advertisements in email are competition, not revenue. Google's incentives and your own are aligned.
I think the submitter's intent was to troll slashdot. I can't remember the last time I've seen a Google related article on slashdot where the comments were almost unanimously pro-Google.
Your statement "Some will get legal on you." is bogus. Gmail is free and, if you bothered to read the ToS, Google is not liable for shit. In fact, YOU agree to indemnify THEM and, because it's FREE, what do you propose to list as "damages"?
It little behooves the best of us to comment on the rest of us.
We have Gmail at work, and the spam filtering seems to work reasonably well. We get an occasional spam message to come through, and for some reason, most of the ones that get through are written in Chinese.
Two features that drive me nuts are a lack of sorting by headers, and not being able to set a message priority. Yes, I know you can search by sender, keywords, etc., but that only gets you so far. Sometimes I only have a vague idea of what I'm searching for, and being able to sort by subject would be a big help. Also, I hate the fact that you can't set a message priority for messages like high or low. Gmail sets messages as being "important" based on what Google thinks is important, and that's almost never right.
Taking guns away from the 99% gives the 1% 100% of the power.
... and may chance you didn't read my post: (There was a LOT more to my presentation than just this; this single part presented here to convey the concept).
The trouble is - the single part that you presented is clearly broken (eg it doesn't work well with the way many mailing lists work), so if it conveys the concept of your whole presentation, people are naturally going to assume that the whole presentation was broken...
Did you throw in a perpetual motion machine for kicks?
Seriously, if you think you have a better spam filter than everybody else, patent it, make it a hosted service and offer for-pay filtering to people who want it. If it works, you'll have a bidding war for your startup within a few years.
I didn't want to copy all aspects of my 30-minute presentation in minute detail here.
and it will work with mailing lists - that was directly covered (along with sending email from a different domain, and sending email for someone else... )
This isn't spam; at worst, it's bacn with a case of mistaken identity.
As someone whose full-time job is preventing spam (I work on Akismet, which checks about 380MM Web comments per day for spam), my general response to these kinds of questions is this: Fighting spam is hard because what's spam for you is not always spam for someone else, and spammers are continually changing tactics -- what worked to prevent spam yesterday may not work as well tomorrow, so it's a constantly moving target.
In my experience, GMail's filter is just ok. I see about 50 spam per day end up in my spam folder, 3 or 4 that make it to my inbox, and maybe one false positive per month (when I bother checking). That's a 94% success rate with a 0.3% FP rate (based on my ham email activity), assuming that they're not instantly discarding blatant spam that wouldn't even merit ending up in the spam folder (which they very well might be doing). If Akismet had this same success rate filtering comments on my blog, I'd have to manually mark 230 comments as spam each day instead of Akismet's missed spam average of about one per day. I don't complain about it though, since fighting spam is hard (see above).
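The percentages above can be checked with a few lines of arithmetic. The counts come from the comment; the monthly ham volume and the daily blog-spam volume are not stated there, so the sketch backs them out from the quoted rates, and they should be read as implied estimates only. The function name `catch_rate` is just for illustration.

```python
def catch_rate(caught, missed):
    """Fraction of all spam that ended up in the spam folder."""
    return caught / (caught + missed)


caught_per_day = 50
for missed_per_day in (3, 4):
    rate = catch_rate(caught_per_day, missed_per_day)
    print(f"{missed_per_day} missed/day: {rate:.1%} caught")
# prints 94.3% for 3 missed and 92.6% for 4 missed, i.e. roughly 94%

# One false positive per month at a 0.3% FP rate implies this much ham
# per month (an inference, not a figure given in the comment).
implied_ham_per_month = 1 / 0.003
print(f"implied ham per month: {implied_ham_per_month:.0f}")

# 230 misses per day at a 6% miss rate implies this much blog spam per day
# (also an inference).
implied_blog_spam_per_day = 230 / (1 - 0.94)
print(f"implied blog spam per day: {implied_blog_spam_per_day:.0f}")

# With about one miss per day on that volume, the implied catch rate is:
akismet_rate = 1 - 1 / implied_blog_spam_per_day
print(f"implied Akismet catch rate: {akismet_rate:.2%}")
```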
And related ... there should be the ability for me to restrict where my email can be accessed from and where it can be sent from. I'm not going to Russia -- so why can't I block all access to my account from Russia?
Yeah, it's not quite a solution to spam, but I've had periods where I get a lot of spam in Cyrillic or Chinese/Japanese characters, and it would have been nice to be able to at least say, "If the email isn't using the Latin alphabet, treat it as suspect because I don't read any languages that use any other alphabets."
I've always thought part of the key to putting a dent in spam would be to make cryptographic email signatures ubiquitous. Then we could check the signature against a valid authority, and if an authority is vouching for too many spammers, then you yank its status as "a valid authority". Then it becomes the authority's job to self-police. Of course, getting people onboard with something like that is impossible.
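The "not the Latin alphabet, so treat it as suspect" rule suggested above is easy to prototype as a per-user filter. The sketch below is a rough heuristic, not a Gmail feature: the function names and the 0.5 threshold are assumptions, and it classifies letters by their Unicode character names via the standard `unicodedata` module.

```python
import unicodedata


def non_latin_fraction(text):
    """Fraction of alphabetic characters whose Unicode name is not LATIN."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    non_latin = sum(
        1 for ch in letters
        if not unicodedata.name(ch, "").startswith("LATIN")
    )
    return non_latin / len(letters)


def looks_suspect(subject, body, threshold=0.5):
    """True when most letters in the message are outside the Latin script."""
    return non_latin_fraction(subject + " " + body) > threshold


print(looks_suspect("Meeting notes", "See you at 10, café as usual"))
print(looks_suspect("Привет", "Лучшие цены только сегодня"))
```

As other comments in this thread point out, a rule like this is only safe as a personal preference. Applied to every user of a large service, it would misfile plenty of legitimate mail.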
Now how does your solution in checking "origin" compare with something like SPF? What is it checking the origin against?
And what if one of your friends goes to Russia on vacation and wants to send you an email?
The OP wrote, "I'd think with all the Oompa Loompas at the Chocolate Factory that they could do a better job rejecting the obvious spam emails. If they did it would make checking for the occasional false positives in my spam folder a teeny bit easier." In other words, he's saying that he wants Google to reject the mail before it gets to his spam folder. He's not complaining about the efficacy of their spam filters, but is instead suggesting that Google should find a way to reject it before it even hits his spam folder.
Disclosure: my name is Bruno Bowden and I managed the engineering team on Enterprise Gmail many years ago at Google before leaving to work in venture capital. My profile is www.linkedin.com/in/brunobowden. Though I didn't work on spam fighting directly, I interacted a great deal with the spam team while I worked there.
One of the main architects of the spam fighting system - Brad Taylor - published a scientific paper on "Sender Reputation in a Large Webmail Service" - http://www.ceas.cc/2006/19.pdf. This has a lot of detail about the system. We keep much of the internals secret as it reduces the chance that a spammer can reverse engineer and work around the system. If you'll allow me to be vague, the number of signals it uses was stunning to me. There's a mixture of hard wired tests (e.g. is the sender in someone's address book), reputation (domain and content), machine learning and anything else we can make work.
One of the principal improvements came when we switched to user classification through the "Report Spam" button. People have different opinions on what constitutes spam, so individual filtering is far more effective. It also avoids the politics of certain lists of domains and IPs from third parties which can be controversial. Even then it has challenges, as sometimes users will mistakenly pick out a phishing email and mark it "Report Not Spam". Because of that, Gmail now adds a red warning banner to indicate more strongly what is likely a phishing attempt. In general, Google has tried to be very supportive of encryption, e.g. DKIM for authentication (and SPF) to STARTTLS for privacy. I would also like to mention the abuse team that works hard to prevent gmail being used as a source of spam, shutting down accounts as soon as possible after suspicious email is sent, then helping affected users to recover their account.
In general, Gmail has received a lot of compliments on the spam filtering, and I'm sure the team will be grateful for the positive comments here on Slashdot. There are still things that can confuse the system, e.g. receiving forwarded email (which might be missing source IPs) or genuine email that is sent to the wrong address. Though the system isn't perfect, I know the team will continue to work hard on it.
Sorry that I didn't make it clear, I had a change of topic .... my bad.
My intent with the 2nd half was to be a security issue, not an email origin issue. Sorry for the confusion.
Google filters out ~100-200 spams a day from my email box (which I universally forward all my domain mail through) and leaves me with (usually) only one or two that I have to specifically mark as spam. I've never been able to do better running my own spam filter.
-Matt
What are you doing wrong? Gmail catches spam very well for me, and false positives are at a minimum (and usually they *are* spam, just spam that I willingly signed up for).
Nothing to see here. Move along.
I can whitelist the inbox. Only my contacts get in. So this way I can take care of old business first before moving on to new business (next quarter?). Google needs this option very badly. Then I can drop Hotmail, but really I don't need to. They're no worse than anybody else.
“He’s not deformed, he’s just drunk!”
I think not too long ago, folks were discussing how the spam war was won. Their spam filtering is so good, that, for the most part for users, incoming spam is no longer a huge issue.
If they did it would make checking for the occasional false positives in my spam folder a teeny bit easier.
If it's IN your friggin' spam folder, then they've blocked the spam. They decided it was spam and hid it from your inbox. No filter's gonna be perfect, and the Spam folder is to help you go back if you become aware you are missing an e-mail.
You remind me of e-mail users that complain if they get a spam message in a quarantine digest. Then you remind me of e-mail users that complain if they get a non-spam message in a quarantine digest.
GMail makes it ridiculously easy to set up and use multiple email accounts for different purposes. Here's my setup:
Initials.UniqueChars@gmail.com - this is the account I use to sign up for random internet things. It's fairly anonymous and disposable.
Initials.MoreUniqueChars@gmail.com - an account for "important" internet things, like online stores, internet banking, purchasing apps, and other stuff that is a bit more sensitive than what's good for the throwaway account. Also might use this for various mailing lists.
Real.Name@gmail.com - I only use this to talk to actual people. This is also tied with my Google+ account that I use to share photos. If you're well-disciplined (and have almost no friends like me), then this hardly gets any blather at all.
As a side bonus, I pretty much turn off notifications for everything except the Real.Name account, so my phone isn't "pinging" all of the time unless some human is really trying to reach me.
The second hardest part is just keeping up the discipline of using the right account for the right purpose.
The hardest part is dealing with all of the people who have completely abandoned email due to their inability to keep their email sorted, and figuring out whether to reach them via Skype/HipChat/IRC/Facebook/Twitter/SMS instead.
For me, gmail is superb at filtering spam. I have 167 emails in my "Spam" folder over the last 30 days. Maybe 1 or 2 have gotten through over that same period.
With your own spam filtering, you decide what is the acceptable false positive rates, which spam-high country domains you never get legit e-mails from and so on. With public services, same filter has to work for millions of users. If you are diligent about reporting rather than ignoring spam, you will probably get better results. But still not as optimized for you personally as filtering that you setup yourself.
Unbelievably only one poster appears to have read the article correctly: the poster is asking why does google bother letting all the spam through to spam folder rather than simply rejecting it.
It's a very valid point. Spam folders can be so crowded you can't spot a misfiled email easily anymore.
Dennis Onstenk
Same.
Some bloke in Ireland must have had a very awkward phone call from a local department store, and there's a guy with the same name who keeps trying to hire cars over there too.
He must think every car hire place is shit because I never confirm his bookings for him...
I've been on GMail nearly since day 1, and have a forwarding service that sends e-mail from my "permanent" address there. I have labels set up so I can see to which e-mail the messages are addressed. I use both addresses for various purposes.
At its peak, I was getting about 100 spams a day, about evenly split between my two addresses.
Virtually ALL the spam I get now is sent to me through the forwarding service (where GMail still catches it). The amount of spam I get sent to the actual GMail address has dropped to almost nothing. I suspect the major spammers simply have stopped sending spam to GMail addresses, as it isn't even worth the nearly zero cost to do so, as it will virtually never get through to inboxes.
If you think you can do better, please do.
Most spam is handled fairly well these days. When our spam filter on the email falls over, email just traverses and I get complaints from users that they got a SINGLE spam. That tells me how well it operates day-to-day... they just don't see any.
It's annoying though... "can't we stop that", "but it was a RUDE spam!", "how did they get my address", etc. You can explain any number of times but the only way to shut them up is to turn off the spam filter and show them what's happening day in, day out, against our servers. Or my inbox - which has a lot of heavily-advertised email addresses.
Literally, we get dozens or hundreds of thousands of spam emails a day. The fact that people barely notice we have even one is testament to anti-spam. GMail, in this regard, are fabulous and I've worked in schools where the email basically IS GMail (Google Apps for Education, or Google Apps for Business). It's basically a free alternative to Exchange for many schools.
And, damn, does it filter a load of the junk, even if you don't put on the options to limit the domains, etc.
And if you operate a mail server you'll find out how hard it is to send email to GMail. My personal domain has SPF, DKIM, reverse DNS, etc. and still it's a faff where sometimes GMail thinks I'm spamming my own GMail account from my own domain-forwarding. To be honest, 99% of the time, it's right: spam slips through my email filters, gets forwarded to my GMail, and GMail still makes a fuss even though it's certified, secured, etc. as from my domain by that point.
It's hard to do better than GMail. Think you can do it? Go try. You'll struggle to do it for yourself, let alone for millions of people whose idea of spam varies wildly.
Rejecting spam outright kind of defeats the purpose of having a spam folder. I don't see them implementing something like a variable-strictness 2nd level of filtering for the vanishingly-few people for which this is a problem.
In my experience, gmail is fairly good (the best?) about catching actual spam, but I still get both false positives and false negatives (a lot more of the former). That makes me believe that this is actually a very difficult problem to solve. The post above from someone who was a gmail engineer reinforces this impression.
However, how much spam you receive is largely under your control. I receive very little spam even in my spam folder - usually less than 5 a day. It basically boils down to keeping tight control over who gets your actual main personal email address. That should be reserved only for friends and family, and even then, I've thought about asking them to not enter my email address on any websites if I decide to change my main address some day.
Here's how I control the commercial emails (and consequently, spam):
1. You will need a domain name to use for receiving commercial emails (i.e. any website where you enter your email address), and domain hosting or at least an email forwarding service.
2. Configure the email forwarding/filtering to forward all emails or emails following a certain pattern for that domain to your real email address. I configured the option on my webhost to forward all email (a catch all, if you will), however, I've since learned that this is not the best way, because if your domain starts getting flooded with spam your domain could get blacklisted. Supposedly the best way is to configure a filter that has a "key" string. Let's say you use your initials: .jb (Joe Blow) - the filter would then only forward emails that contain .jb among the recipients' addresses.
3. Register with a unique address at each website, each store, any commercial use of your email. Ex: use spammer.com.jb@mydomain.com when you register at spammer.com. Same thing if you give your email address to any entity who is not a family member or personal friend. Now all the commercial emails will get forwarded to your real mailbox because they have the .jb key. I actually make an exception to this for banks and for things like webhosts, etc, but I'm reconsidering banks after the recent JPMorgan breach, in which the attackers obtained contact info for everyone. I would still make an exception for webhosts or anything where there could be a problem if your mydomain.com is not available for some reason.
4. ???
5. Profit. I.E. as soon as you start seeing real spam (not the stuff that a lot of people incorrectly mark as spam), you will know what address they're sending to and can block them at your webhost or email forwarding service. Here are some examples of entities that I had to block because they were breached or sold my email address to spammers:
adobe.com (breach)
dropbox.com (breach)
planusa.org (unknown)
cinegearexpo.com (unknown)
equifax.com (unknown)
zappos.com (breach)
whois (open database - I use a proper domain registrar that hides my info by default now)
Bonus: another major advantage of doing this is that it makes it much much easier for you to change your main email address. You can reroute all your commercial email with one reconfiguration of your forwarder instead of having to go to each individual website to change your address.
Extra bonus: makes it super easy to set up a filter at your client or webmail to send all commercial email to a separate folder. Just filter for mydomain.com in the "to:" line.
Doing this for a few years now has really opened my eyes to how many companies and other organizations either don't give a shit about your private contact info, have shitty security, or actually sell you out for money. I was frankly surprised at some of the organizations that I had to block. Unfortunately early on in my spam-fighting days I did use my main email address on websites, and sometimes also used google's floating period or + functionality to try to individualize email addresses so I get some spam where I don't know where they obtained my address. But those are few and far between, and I've been slowly untangling myself from it to the extent that I can.
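For anyone who wants to script the tagged-address scheme described above, here is a minimal sketch in Python. It assumes the key is a suffix of the local part (as in spammer.com.jb@mydomain.com); the names `source_of`, `should_forward` and the `BLOCKED` set are made up for illustration, and a real webhost or forwarding service will have its own filter syntax.

```python
from typing import Optional

KEY = ".jb"               # key string from the example above
DOMAIN = "mydomain.com"   # placeholder domain from the example above
BLOCKED = {"adobe.com", "dropbox.com"}  # hypothetical: sources blocked after a leak

def source_of(recipient: str) -> Optional[str]:
    """Return the site a tagged address was issued to, or None if untagged."""
    local, _, domain = recipient.lower().rpartition("@")
    if domain != DOMAIN or not local.endswith(KEY):
        return None
    # Strip the key; an empty remainder means there is no source name at all.
    return local[: -len(KEY)] or None

def should_forward(recipient: str) -> bool:
    """Forward only tagged addresses whose source has not been blocked."""
    source = source_of(recipient)
    return source is not None and source not in BLOCKED
```

With this, mail to spammer.com.jb@mydomain.com is forwarded, mail to an untagged address on the domain is dropped, and blocking a leaky source is a one-line change to the set.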
I like these new commercials, where the audience's talk of the commercial, is the commercial.
Politics; n. : A religion whereby man is god.
I just checked; my gmail caught 40 spams yesterday. I think the daily average is higher, especially since I also have a catch-all @domain.com that forwards to gmail.
About once every 6 weeks I see up to 5 false positives in 3 consecutive days, and I think this could be deliberate: to help train the spam filter. Oddly, these mostly have some tie to my past search/browse history, which is not creepy but logical under my hypothesis.
To me, the gmail spam filter is near perfect. I go as far as to advise clients to use a gmail account, if only as a pass-through spam filter.
Hivemind harvest in progress..
I have a real false positive about once a year. I think gmail's filters try to differentiate spam from your legit mails. If your usual legit mails look like spam, it has a harder time identifying spam ;-)
It does a really fantastic job for me. It even filters out these annoying emails I get from Google's Play Store. :D
One would think gmail's spam filter would whitelist *.google.com, but they apparently don't trust themselves.
In general, Google has tried to be very supportive of encryption, e.g. DKIM for authentication (and SPF) to STARTTLS for privacy.
Ugh - you managed to pick two of my pet peeves. I used to securely bounce all my mail from my domain to my gmail account using TLS so that all my email flowed to Gmail encrypted.
However, GMail started enforcing DKIM more strongly, which means that much of my bounced email started, well, bouncing. So, I switched to POP3 retrieval of email. Then I discovered that GMail won't support TLS/SSL unless the presented certificate is trusted by them. So, as a result I've moved from instant delivery of encrypted email to polled delivery of unencrypted email with my credentials probably sent in plaintext (I'm not quite sure whether Gmail at least supports something other than plain text authentication when not using SSL/TLS). I use disposable credentials to an account used only for POP3 with only a copy of my email, so that at least mitigates the damage if they leak.
Of course, I realize that my use case is the obvious 0.01% one, and part of why I like to use Gmail as my MUA if not my MTA is its effective spam removal.
IMAP uses TLS.
Care about electronic freedom? Consider donating to the EFF!
Whitelists used to be a pain to maintain because you would have to go into your mail settings and explicitly allow someone to email you every time someone new wanted to contact you. These days, with people mostly communicating with strangers and new people on social media, email whitelists are the smartest way to handle the issue, and they don't require any "learning" or spam-fighting at all. 100% effective. My postfix server receives a storm of garbage all day; nothing gets through except the stuff I want.
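A rough sketch of what such a whitelist check could look like, in Python rather than actual postfix configuration. The function name `is_allowed` and the convention that an entry starting with "@" allows a whole domain are assumptions for illustration. Note that the From: header is easy to forge, so a real setup would presumably check the envelope sender or combine this with SPF/DKIM.

```python
from email.utils import parseaddr

def is_allowed(from_header: str, allowlist: set) -> bool:
    """Accept a message only if its sender address or domain is whitelisted."""
    _, addr = parseaddr(from_header)   # "Alice <a@b.org>" -> "a@b.org"
    addr = addr.lower()
    if not addr:
        return False                   # unparseable sender: never accept
    domain = addr.rpartition("@")[2]
    # Entries are either full addresses or "@domain" for a whole domain.
    return addr in allowlist or ("@" + domain) in allowlist
```

Anything that returns False would be rejected or dropped; there is no scoring or training step involved.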
One of the biggest problems I faced with my old gmail account was that because I used it for everything, eventually everything was sending me emails. As it came from what looks like legitimate sources, gmail had a huge challenge sorting out the good from the bad. It did a great job, but eventually I had to consider that email compromised.
Initially I planned to set up my own mail server for my own domain and aggressively manage the spam, but the last time I did that was in 2000, and I was rustier than a garden gate. The amount of relearning and work I would have to do to set it up properly and securely was going to be more than I could handle. However, I stumbled upon a solution which works well for me:
I registered a domain, and let GOOGLE manage it for me. The only thing different to me is that my 'google' email uses my domain name. As it's my last name, I get the convenient form of Firstname@lastname.com for my personal email. But how does this solve the spam problem if google isn't already solving it for you? On its own it doesn't, but I decided to take what works with google and add some quirks (and let's face it, google knows a lot more about hosting email servers than I do).
1. Use a non-traditional extension. No .COM, .NET, .ORG. Spammers can catch 90% of all email addresses by bulk spamming incremental names. *@gmail.com is going to get spam no matter what, but *@obscuredomain.it is not likely worth the computational effort, even for a botnet.
2. Do NOT give out your primary email address. If you want to give ABCBusiness your email address, give them the address ABCBusiness@yourdomain.com. There is nothing to set up other than having unassigned email addresses redirect to a single mailbox. What does this do? Well, let's say you start getting spam. Take a look at the 'TO:' field and if it says plumberbob@yourdomain.com then you know it was Plumber Bob that was patient zero for your spam problem. Simply blacklist incoming mail sent to the plumberbob@yourdomain.com email address and your spam is GONE. Give a new email to Plumber Bob and tell him to be more careful with this one.
I've been using this system for over a year and there have been a total of 10-20 spam messages that google caught and sent directly to my spam folder, and one annoying company that kept sending me advertisements until I blacklisted the email 'thenoisycompany@mydomain.com'. There was also a period of time when a bunch of spam messages came through to an address belonging to the person I assume was the previous owner of the domain. Blacklisted that address and all was quiet again.
The basic premise is that I realized that my email address will eventually get compromised, but at least this way I can compartmentalize the damage.
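The "patient zero" lookup above can be sketched in a few lines of Python. This assumes you can export the To: addresses of the spam you received; the names `leaked_aliases` and `accept` are hypothetical, and the actual blacklisting would be done in whatever filter settings your mail host offers.

```python
from collections import Counter

def leaked_aliases(spam_recipients, domain="yourdomain.com"):
    """Count spam per alias on the catch-all domain, most-hit alias first."""
    counts = Counter()
    for addr in spam_recipients:
        local, _, host = addr.lower().rpartition("@")
        if host == domain and local:
            counts[local] += 1
    return counts.most_common()

def accept(recipient, blocked_aliases, domain="yourdomain.com"):
    """Catch-all delivery: accept any alias on the domain unless blacklisted."""
    local, _, host = recipient.lower().rpartition("@")
    return host == domain and local not in blocked_aliases
```

The alias at the top of the `leaked_aliases` result is the one to blacklist and reissue.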
Out of modpoints but really liked a post? 1BDkF6TtmmeZ3yqXbz9yhdYVqRYnwFoXDj
IMAP uses TLS.
Will it do so if the server presents an untrusted certificate? POP3 supports TLS as well, but Google has it configured to reject any connection presenting a certificate they don't trust. So, the alternative is unencrypted POP3, which also does not present a certificate that they trust but for whatever reason everybody always seems fine with that.
Either I have completely misunderstood the OP's question, or, it would appear, everyone else has.
The way I read it is as follows:
I get a lot of SPAM in my spam folder, and I also get the odd (very) occasional false-positive dumped in there along with it. My inbox is almost SPAM-free. Other mail providers can block SPAM from even being received, so not only does it not appear in my inbox, it doesn't even make it into my SPAM-folder. Why can't Google do this too, as it would make hunting through the SPAM-folder for false-positives much easier?
If this is the question that the OP meant to ask, the only reason I can think of, off the top of my head, is that if they did reject, rather than receive and sideline, suspected SPAM, and they hit a false-positive with that approach, they are worried that their user-base would be up in arms about it. Better to let everything through and sideline (i.e. Dump it into a separate folder) anything that they think is SPAM, than to completely prevent the receipt of any legitimate email that they misidentify.
Whether this approach is better or worse than the alternative is obviously somewhat of a subjective question.
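To make the trade-off concrete, here is a toy sketch in Python of the two policies. The score scale, the threshold values and the function name `disposition` are all made up; nothing here reflects how Gmail actually decides.

```python
from typing import Optional

def disposition(score: float, junk_at: float = 5.0,
                reject_at: Optional[float] = None) -> str:
    """Return 'reject', 'junk' or 'inbox' for a message with the given spam score."""
    if reject_at is not None and score >= reject_at:
        return "reject"   # refused outright; the user never sees a false positive
    if score >= junk_at:
        return "junk"     # delivered but sidelined; a false positive is recoverable
    return "inbox"
```

With `reject_at=None` everything suspicious lands in the spam folder, which is the behaviour described above; setting `reject_at` shrinks the folder at the cost of losing any legitimate mail that scores above it.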
This all being said, I may have completely misunderstood the OP's question, in which case, I would agree that Gmail is working as intended and the OP is simply holding it wrong!
Just my $0.03 (At current exchange rates, my £0.02 is worth more than your $0.02)
The 0.01% of many skilled professions keeps the world turning and we can be grateful for that. As an engineer in my mid-30s, I believe much more strongly now in pushing for simplification. I know from many bitter experiences that trying to do things outside the norm often invites unintended consequences. As a consequence I try and be very strategic about what complexity I take on to make sure it's worthwhile. For me, I use the web Gmail interface and a gmail address. Not the same as running your own domain but I've found it's worked pretty well for me.
You don't have to present a certificate to the server?
You can initiate SSL/TLS whereby the only party presenting a certificate is the server, to the client.
Do you think that all HTTPS clients present a certificate to the HTTPS server? This is not how HTTPS usually works; only rare systems that use client-side SSL certificates for authentication do that. Your standard credit card transaction or login portal does not present any certificate to the server.
When sending with STARTTLS, you start unencrypted, enable TLS via the STARTTLS command, then perform some kind of authentication inside the secure TLS channel (this can be plaintext authentication inside TLS). You then proceed to use SMTP, having both set up a secure channel and authenticated.
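The STARTTLS sequence described above looks roughly like this with Python's standard smtplib. This is only a sketch: the function name `submit_with_starttls` is made up, and the host, port and credentials in the usage comment are placeholders.

```python
import smtplib
import ssl

def submit_with_starttls(smtp: smtplib.SMTP, user, password,
                         sender, recipient, message, context=None):
    """Upgrade an open SMTP session to TLS, then authenticate and send."""
    context = context or ssl.create_default_context()
    smtp.ehlo()                      # start in plaintext, learn server extensions
    if not smtp.has_extn("starttls"):
        raise RuntimeError("server does not offer STARTTLS")
    smtp.starttls(context=context)   # switch to TLS; only the server presents a certificate
    smtp.ehlo()                      # re-identify over the encrypted channel
    smtp.login(user, password)       # credentials now travel inside TLS
    smtp.sendmail(sender, recipient, message)

# Usage (placeholder host and credentials):
# with smtplib.SMTP("mail.example.com", 587) as smtp:
#     submit_with_starttls(smtp, "user", "secret",
#                          "me@example.com", "you@example.com", "Subject: hi\r\n\r\nhello")
```

Note that the client never loads a certificate of its own here; it only verifies the one the server presents.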
You don't have to present a certificate to the server?
You can initiate SSL/TLS whereby the only party presenting a certificate is the server, to the client.
Read my post again. :)
I had to switch from delivering my mail to Gmail via SMTP to having Gmail poll my POP3 server. In the first model Gmail is presenting me with the certificate. In the new model, I'm presenting them with the certificate. They don't trust my certificate, so they refuse to use TLS. Thus, I end up having to have them retrieve my mail unencrypted.
Just another case where the SSL trust model results in less security.
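For illustration, this is roughly what a strict fetching client looks like in Python. It is a sketch of the general behaviour, not of how Gmail's POP3 fetcher is actually implemented; the function name `fetcher_context` and the `verify` switch are hypothetical.

```python
import ssl

def fetcher_context(verify: bool = True) -> ssl.SSLContext:
    """Build the TLS settings a POP3-fetching client would use."""
    ctx = ssl.create_default_context()   # requires a trusted cert and matching hostname
    if not verify:
        # Accept self-signed or otherwise untrusted certificates:
        # still encrypted, but the server's identity is not checked.
        ctx.check_hostname = False       # must be turned off before CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE
    return ctx

# Usage (placeholder host): poplib.POP3_SSL("pop.example.com", 995, context=fetcher_context())
```

A client that only ever offers the strict context will refuse a self-signed server certificate, which leaves plaintext POP3 as the only remaining option.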