I do not believe Google's mail servers reject email which fails an SPF check. If that is the case then implementing SPF for your domain will not stop the backscatter from Google. It also won't stop backscatter from any system which does not support SPF at all or any system which only uses SPF to score spam.
SPF was a good attempt at stopping email forgeries. It has its flaws and its usefulness is hampered by limited adoption by MX servers.
Whether or not people want to admit it, understand it, or whatever, Google is contributing to email abuse with what they are doing. So are other email systems which don't properly validate recipient addresses during the initial SMTP conversation. By allowing this backscatter to occur they are abusing innocent third parties. What makes this even worse is that email admins have the ability to stop backscatter without breaking anything and without violating any RFC.
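To make the difference concrete, here is a minimal Python sketch of the two behaviors. The function names, the sample addresses, and the reply text are assumptions made up for illustration; this is not how any particular server is implemented.

```python
# Hypothetical recipient table; a real MX would consult its user database.
VALID_RECIPIENTS = {"alice@example.org", "bob@example.org"}


def rcpt_reply(recipient: str) -> str:
    """Answer RCPT TO during the SMTP conversation itself."""
    if recipient.lower() in VALID_RECIPIENTS:
        return "250 2.1.5 Recipient OK"
    # Refusing here leaves the connecting client holding the message,
    # so this server never has to generate a bounce.
    return "550 5.1.1 No such user"


def accept_then_bounce(envelope_sender: str, recipient: str):
    """What a backscattering server does: accept first, check later.

    Returns the (destination, subject) of the bounce it would send,
    or None if the recipient exists and no bounce is needed.
    """
    if recipient.lower() in VALID_RECIPIENTS:
        return None
    # The bounce goes to whatever envelope sender was claimed,
    # which for spam is usually a forged third party.
    return (envelope_sender, "Undelivered Mail Returned to Sender")
```

In the first case the rejection stays inside the SMTP session with whoever actually connected. In the second, the bounce is mailed to the address in the envelope sender, whether or not that party sent anything.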
Some people have argued that by accepting all email for a domain they are stopping successful dictionary attacks. Not only is that false, it isn't a very valid point. There are still ways to figure out valid email addresses on a system: Spammers can use a valid envelope sender address so that they receive the bounces, and if no bounce is received the address is likely good; they can embed images, JavaScript, etc. in HTML emails to track which messages have been opened.
That's all assuming spammers even care about valid email addresses. For a large part they don't care. Rather it is fire and forget, and hope the messages find real recipients. That approach is inexpensive for them. If they throw enough darts eventually one will hit the dart board. This is backed up by keeping track of email sent to non-existent recipients on your mail server over time. Recipient addresses that were rejected years ago are still getting attempted deliveries. If spammers kept track of bad addresses this wouldn't happen.
If mail admins configure their mail servers to accept and bounce email for non-existent users, rather than reject, they are only further hurting innocent bystanders on the Internet. Even if this approach did make dictionary attacks impossible it most certainly doesn't stop anyone from trying one. If the envelope sender address of the spam was spoofed, which is usually the case, then someone else who had nothing to do with the sending of the email will now need to deal with the backscatter. Whether that is some end user who gets flooded with bounce messages reporting emails they never sent, or an organization's email system whose resources are being consumed by the backscatter, the result is someone else being abused. That abuse could have been prevented by the receiving mail server.
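For anyone who wants to check the "rejected years ago, still being tried" observation on their own server, a rough Python sketch follows. The records are made-up sample data and the function names are hypothetical; a real deployment would first extract (date, recipient) pairs from its own MTA logs, whose format varies by server.

```python
from collections import defaultdict
from datetime import date

# Made-up sample records: (date of rejected delivery attempt, recipient).
REJECTED = [
    (date(2005, 3, 1), "old.user@example.org"),
    (date(2006, 7, 9), "old.user@example.org"),
    (date(2008, 1, 15), "old.user@example.org"),
    (date(2008, 1, 16), "typo@example.org"),
]


def years_seen(records):
    """Map each rejected recipient to the sorted list of years it was tried in."""
    seen = defaultdict(set)
    for day, recipient in records:
        seen[recipient.lower()].add(day.year)
    return {rcpt: sorted(years) for rcpt, years in seen.items()}


def still_retried(records, min_span_years=2):
    """Recipients attempted again at least min_span_years after the first rejection."""
    return sorted(
        rcpt
        for rcpt, years in years_seen(records).items()
        if years[-1] - years[0] >= min_span_years
    )
```

Addresses that keep showing up in `still_retried` long after their first rejection suggest the senders are not pruning their lists.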
Other arguments have stated that mail systems that backscatter are simply complying with the RFCs. Some have taken it even further, stating that any other approach would actually violate some RFC. The "complying with RFCs" argument is a cop-out. It completely ignores the fact that all a mail system is doing by accepting and bouncing is forwarding the abusive traffic to someone else who had nothing to do with it in the first place. As for the claim that rejecting mail sent to non-existent users actually violates some RFC, I challenge anyone to find which RFC makes that statement.
Another argument has revolved around multi-MX systems and how it would be difficult or too resource intensive to maintain lists of valid recipients. If doing so is too difficult, or requires too many resources, for an organization, I suggest they either don't run multiple MX hosts or outsource their email to someone more competent. This has been done for a long time using RDBMS backends and LDAP. Those technologies aren't new. The fact is that Google *does* do recipient validation for the google.com and gmail.com domains. If it can be done for those domains then the argument that it is too difficult or too resource intensive for Google can't be valid. If Google can do it for google.com and gmail.com, certainly they can do it for their other domains. In fact a very large number of email providers of all sizes properly handle recipient validation and do not produce backscatter. Scale is not a valid argument against doing so. If an organization is large enough to need the redundancy of multiple MX servers then it should be large enough to properly implement whatever redundancy its backend requires to validate the users.
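As a rough illustration of the RDBMS approach, here is a Python sketch using the standard sqlite3 module as a stand-in for whatever database an organization actually runs. The table name, column name, and function names are assumptions for the example. The in-memory database is only for demonstration; in practice every MX host would query the same networked database or a replica of it.

```python
import sqlite3


def open_recipient_db(path=":memory:"):
    """Open the (hypothetical) recipient table, creating it if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recipients (address TEXT PRIMARY KEY)"
    )
    return conn


def add_recipient(conn, address):
    """Register a valid address; stored lowercased so lookups are case-insensitive."""
    conn.execute(
        "INSERT OR IGNORE INTO recipients (address) VALUES (?)",
        (address.lower(),),
    )
    conn.commit()


def recipient_exists(conn, address):
    """The lookup each MX host would run when it receives RCPT TO."""
    row = conn.execute(
        "SELECT 1 FROM recipients WHERE address = ?", (address.lower(),)
    ).fetchone()
    return row is not None
```

The point is only that the check is a single indexed lookup per recipient, which is the same kind of query an LDAP-backed setup would make.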
Not only does proper recipient validation on an MX
How is rejecting email to non-existent users in direct violation of standards?
Additionally, the RFC you linked to defines the DSN extension. There is no requirement for an MTA to support RFC 3461. In fact Google's own MXs do not support the DSN extension:
$ nc smtp2.google.com. 25
220 smtp.google.com ESMTP
EHLO ME
250-smtp.google.com Hello obfuscated hostname [obfuscated IP address], pleased to meet you
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-8BITMIME
250-SIZE 20000000
250-STARTTLS
250-DELIVERBY
250 HELP
quit
221 2.0.0 smtp.google.com closing connection
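The same check can be done programmatically. The Python sketch below parses the EHLO reply from the session above and looks for the DSN keyword that RFC 3461 defines. The function name and the REPLY list are my own for illustration; the reply lines are copied from the transcript.

```python
def ehlo_keywords(reply_lines):
    """Return the set of ESMTP keywords advertised in an EHLO reply.

    The first 250 line is the greeting; each later line carries one
    keyword followed by optional parameters.
    """
    keywords = set()
    for line in reply_lines[1:]:
        # Lines look like "250-KEYWORD params" or "250 KEYWORD" (last line).
        if not line.startswith("250") or len(line) < 5:
            continue
        body = line[4:].strip()
        if body:
            keywords.add(body.split()[0].upper())
    return keywords


# The EHLO reply from the transcript, starting at the greeting line.
REPLY = [
    "250-smtp.google.com Hello obfuscated hostname [obfuscated IP address], pleased to meet you",
    "250-ENHANCEDSTATUSCODES",
    "250-PIPELINING",
    "250-8BITMIME",
    "250-SIZE 20000000",
    "250-STARTTLS",
    "250-DELIVERBY",
    "250 HELP",
]

supports_dsn = "DSN" in ehlo_keywords(REPLY)
```

Against a live server, the standard library's smtplib.SMTP gives the same answer through has_extn("dsn") after calling ehlo().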
You tested the wrong domain. The backscatter behavior exists for blogger.com and googlegroups.com. I'm not sure why people are testing this with bogus gmail.com email addresses.
"Yes, but the contents of the message can't be controlled in any meaningful way, so as you said:"
They can if the email is sent to the blogger.com domain. Backscatter from that domain does contain the contents of the original message.
"It sounds like the submitter wants to blow this out of proportion by equating general backscatter (which nearly all mailing list managers on the Internet generate with their "confirmation" messages) with backscatter spam."
I am the submitter (I submitted the story before I signed up for an account here). Whether you want to believe it or not, this is a serious issue. Innocent third parties shouldn't have to deal with the backscatter generated by Google's mail servers no matter what the content.
You also didn't test this with the blogger.com domain. If you had you'd see that the contents of the original message are indeed sent back to the envelope sender in the bounce message generated by Google's servers.
"Parent is correct. Scale is the issue. GMail is probably receiving billions of emails per day, most of them spam to invalid accounts. They are in effect being DDOS'd and it is very difficult for them to check every destination address in real time. The solution that scales easiest is for them to queue incoming emails for processing by lots of generic MTAs."
It would be far more efficient for them to reject mail to non-existent users during the SMTP transaction. Doing so saves them bandwidth, disk space, system I/O, etc. Accepting and then bouncing is far more expensive.
"Postfix, on the other hand, absolutely must queue all mail before resolving addresses. For this reason it must accept email regardless."
That is false. By default Postfix does not accept mail for non-existent addresses; such mail is rejected during the SMTP transaction. A mail administrator must incorrectly configure Postfix for it to operate in the fashion you describe.