I am the Most Spammed Person in the World


Posted by timothy on Wednesday June 8, 2005 @06:27AM from the heart-goes-out-to-you dept.

jefp writes "In November 2004, Microsoft's second-in-command Steve Ballmer made some headlines by mentioning that Chairman Bill Gates was getting four million spams per day. At the time, I was dealing with a little spam problem of my own - I was getting around a million spams per day. I found it a little comforting that my problem wasn't quite as bad as Bill's. However, a couple of weeks later Ballmer corrected himself, saying he mis-remembered the stat and Gates actually gets four million per year. This means I was getting one hundred times as much spam as Bill Gates. I've written a tutorial explaining why I get so much crapmail and how I deal with it."

21 of 478 comments

Before it was Slashdotted.. by Peter+Cooper · 2005-06-08 06:32 · Score: 2, Informative

the server got as far as spluttering this part of the page out:

What am I trying to do here?

Keep my email service running and useful.
Keep my web service running too, since it's on the same machine.

I guess 1,000,000 spams a day isn't as bad as 1000 people simultaneously trying to access your Web server!
1. Re:Before it was Slashdotted.. by LuckyStarr · 2005-06-08 07:11 · Score: 3, Informative
  
  In fact his webserver still runs perfectly. How do I know? Because I am reading his article. Slashdottings occur when webservers use more RAM than the system has. Kernel swaps, webserver allocates some more memory, tilt. So the obvious solution is to configure your webserver not to. :) I guess this is what he did. All incoming connections get queued by the kernel and handed over to the webserver when a slot becomes available. It gets terribly slow (I can tell!), but if the user has a high timeout value (of a minute or two) then no error will occur at his end either.
  
  Very reliable tech I guess. :)
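
A rough sketch of the "fixed number of slots, everything else waits in a queue" idea described above. This is not how thttpd or any particular webserver does it; the thread pool's internal queue merely stands in for the kernel's connection backlog, and `serve`, `handle`, and the 10 ms sleep are made-up names and values for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def serve(requests: int, slots: int) -> int:
    """Handle `requests` jobs with at most `slots` running at once.

    Returns the peak number of jobs that were active simultaneously.
    """
    lock = threading.Lock()
    active = 0
    peak = 0

    def handle(_request_id: int) -> None:
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # stand-in for actually serving a page
        with lock:
            active -= 1

    # Jobs beyond `slots` wait in the pool's queue instead of
    # consuming more memory; they are slow, but they do not fail.
    with ThreadPoolExecutor(max_workers=slots) as pool:
        list(pool.map(handle, range(requests)))
    return peak
```

Calling `serve(20, 3)` finishes all 20 jobs while never running more than three at a time, which is the trade-off the comment describes: slower responses in exchange for bounded memory use.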
  
  --
  Meme of the day: I browse "Disable Sigs: Checked". So should you.
Greylisting by nocomment · 2005-06-08 06:32 · Score: 5, Informative

Just yesterday I enabled Greylisting in OpenBSD spamd, and today I got 6 spams, compared with my usual 150. (per day).

It's easy to set up and works with your existing mail server. Our mail server is qmail on Red Hat, but OpenBSD just happily redirects the legit mail (or rather, what it suspects might be legit) to the mail server. The load on the mail server has decreased dramatically.

--
/* oops I accidentally made a comment, sorry */
/* http://allyourbasearebelongto.us */
1. Re:Greylisting by appleprophet · 2005-06-08 07:07 · Score: 2, Informative
  
  I tried greylisting, but I was not very impressed. I am a shareware programmer, so I rely on receiving email from many unique people, using a wide variety of ISPs. I found that greylisting would often hold legitimate emails for many hours, sometimes days, depending on how the customer's ISP was set up. I even got complaints that I was slow providing support when several customers had their emails stuck in the queue, so I couldn't reply to them as fast as I usually do. That is unacceptable to me. I suppose greylisting is good if you just use email with a select group of people, but if you rely on emailing people you have never encountered before every day, I warn you about enabling greylisting.
2. Re:Greylisting by af_robot · 2005-06-08 07:33 · Score: 2, Informative
  
  Don't spread rumors! There are *no problems* with normal Lotus Domino (Notes) servers and greylisting - it is fully RFC compliant.
  
  There can be some misconfigured or ancient SMTP servers, but you can always whitelist it if you really need to get email from such servers.
3. Re:Greylisting by Just+Some+Guy · 2005-06-08 09:34 · Score: 2, Informative
  
  Greylisting is a much larger burden to spammers than legitimate mailers, though. Say your server is configured to greylist. I have a finite and rather stable set of people on my system that will want to send mail to your system. If each of my users sends 50 messages to your server (and assuming that the second and subsequent messages are sent after the greylist timeout so that they're not affected), then 2% of the traffic from me to you gets delayed.
  On the other hand, a spammer wants to deliver 10,000,000 messages to random users on your system. Depending on whether your greylist takes place before recipient verification, he has to delay 100% of his messages to you before even having the privilege of knowing which ones are potentially going to real users. Additionally, there's a fighting chance of the spammer being added to a DNSBL between the time they initially begin their transmission and when your server finally stops ignoring their requests.
  
  Even if all spammers upgrade their bots to full SMTP compliance, the result of greylisting is a huge spike in the resources required to transmit a given amount of UCE. The goal isn't to make it impossible for them to transmit their junk, but to make it more expensive than it's worth.
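
To make the 2% versus 100% argument above concrete, here is a toy greylist. It is only a sketch under assumptions: it keys on the usual (client IP, envelope sender, recipient) triplet, the 300-second delay is an arbitrary value, and `Greylist` and `delayed_fraction` are made-up names rather than anything from spamd or another real implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Greylist:
    """Toy greylist keyed on (client IP, envelope sender, recipient)."""
    delay: float = 300.0  # seconds a new triplet must wait (assumed value)
    first_seen: dict = field(default_factory=dict)

    def check(self, ip: str, sender: str, rcpt: str, now: float) -> str:
        key = (ip, sender, rcpt)
        seen = self.first_seen.setdefault(key, now)
        if now - seen < self.delay:
            # Temporary failure: a compliant mail server will retry later.
            return "451 try again later"
        return "250 ok"


def delayed_fraction(messages: int, gap: float, gl: Greylist) -> float:
    """Send `messages` mails on one triplet, `gap` seconds apart.

    Returns the share of messages whose first attempt was deferred.
    """
    deferred = 0
    for i in range(messages):
        reply = gl.check("192.0.2.10", "alice@example.org",
                         "bob@example.com", now=i * gap)
        if reply.startswith("451"):
            deferred += 1
    return deferred / messages
```

With 50 messages an hour apart, `delayed_fraction(50, 3600.0, Greylist())` comes out to 1 in 50, the 2% figure for a stable correspondent. A sender who hits a different recipient with every message and never retries gets a 451 on every single attempt.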
  
  --
  Dewey, what part of this looks like authorities should be involved?
coral cache by Anonymous Coward · 2005-06-08 06:32 · Score: 1, Informative

i cached it: http://www.acme.com.nyud.net:8090/mail_filtering/
Mirror by schnurble · 2005-06-08 06:33 · Score: 4, Informative

Just to alleviate some of his bandwidth, I have mirrored the mail_filtering pages. Looks like it's all there. Let me know if you want me to take it down.

--
"To err is human, to forgive is simply not my policy." --root
Coral Cache Link by Anonymous Coward · 2005-06-08 06:36 · Score: 1, Informative

http://www.acme.com.nyud.net:8090/mail_filtering/
Outlook Spam Filter by Langley · 2005-06-08 06:38 · Score: 2, Informative

If you work in a company like mine, where Outlook is de rigueur and the Boss is too worried about missing an email to even allow for simple spam filtering at the head end, I can't recommend enough that you give the SpamBayes Outlook plug-in a try. It operates nearly perfectly if you train it well (only about 600 spam messages needed).
Re:nowhere by abulafia · 2005-06-08 06:40 · Score: 4, Informative

I know the owner of that domain, and yes, she got so much mail that she ended up turning MX off for it.

--
I forget what 8 was for.
What hardware is your site running on, Jef? by CyricZ · 2005-06-08 06:42 · Score: 2, Informative

For those who do not know, Jef Poskanzer is the author of the thttpd webserver. I'm just wondering what sort of hardware you're running your site and email server on, Jef. I know that thttpd is extremely quick and efficient, so it wouldn't surprise me if you were running on an older 486 or early Pentium I machine.

--
Cyric Zndovzny at your service.
1. Re:What hardware is your site running on, Jef? by jefp · 2005-06-08 10:00 · Score: 5, Informative
  
  Hardware info here. It's a 3.2 GHz P4. I was struggling along on a 450 MHz box until only a year ago, but finally had to upgrade.
Coral cache by gregbaker · 2005-06-08 06:43 · Score: 3, Informative

The site seems to be slowing down, but the coral cache is going strong.
Full text - it's Slashdotted (minus img and tables) by Anonymous Coward · 2005-06-08 06:59 · Score: 5, Informative
Mail Filtering
Or, how to block a few million spams per day without breaking a sweat.
© 2005 by Jef Poskanzer.
Introduction
In November 2004, Microsoft's second-in-command Steve Ballmer made some headlines by mentioning that Chairman Bill Gates was getting four million spams per day. At the time, I was dealing with a little spam problem of my own - I was getting around a million spams per day. I found it a little comforting that my problem wasn't quite as bad as Bill's. However, a couple of weeks later Ballmer corrected himself, saying he mis-remembered the stat and Gates actually gets four million per year.
This means I was getting one hundred times as much spam as Bill Gates.
Nevertheless, after filtering we both get about the same amount: around ten spams per day in our inboxes. Ballmer says that Microsoft has an entire department dedicated to protecting their mailboxes from spam. At ACME Labs there's just one guy, one server, and a T1 line. And yet my filters are a hundred times as effective as Microsoft's. How do I do it?
These pages will show you how, and help you deploy similar filters on your own system.
Goals
What am I trying to do here?
- Keep my email service running and useful.
- Keep my web service running too, since it's on the same machine.
- Avoid losing real email by mistake.
- Delay growth in resource use, so I can delay spending money on hardware upgrades.
- Spend as little time as possible on the above, so I can get more important things done.
- Help other people do the same.
Results
For those who like to read the end of a novel first, here are some overall stats showing how the filters are performing.
Environment
This is all based on a Unix system running sendmail. If you're not using Unix, or you're using a different Unix-based mail system, most of the specific advice here will not help you. You may still find some value in the general ideas.
Sendmail Config
The first layer of spam defense is sendmail itself, because that's the first piece of software to touch each message. Sendmail has a number of different config options that can help you block spam and keep your machine stable.
greet_pause
As of version 8.13, sendmail added an anti-spam feature called "greet_pause". It is both simple and clever.
In a normal SMTP transaction, first the client connects, then the server sends back a "220" greeting message, then the client sends its HELO command. Some spam programs, however, don't wait for the greeting message. They just send their commands immediately without listening.
The greet_pause feature detects this misbehavior by pausing briefly before sending out the "220" greeting message. If any commands arrive during that pause, then the connection is marked bad and anything coming over it is ignored.
This one is interesting because it actually cuts down on the number of spam attempts, not just the spam deliveries. I figure when the spammers hit the pause they are somehow getting stuck. I'll have a graph of this later - before I enabled greet_pause, I was getting a couple million spam attempts per day; after, only 600,000.
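
The early-talker check described above boils down to a timing comparison. The following is an illustrative sketch only, not sendmail's code: `greet_pause_verdict` and its parameters are hypothetical names, and times are plain seconds rather than real socket events.

```python
def greet_pause_verdict(connect_time: float,
                        first_client_bytes: float,
                        pause_ms: int = 5000) -> str:
    """Classify a connection by whether the client spoke too early.

    connect_time       -- when the TCP connection was accepted (seconds)
    first_client_bytes -- when the client first sent any data (seconds)
    pause_ms           -- how long the server waits before its 220 greeting
    """
    greeting_time = connect_time + pause_ms / 1000.0
    if first_client_bytes < greeting_time:
        # The client sent commands without waiting for the greeting.
        return "bad"
    return "ok"
```

A client that blurts out HELO 0.1 seconds after connecting is marked "bad" under a five-second pause, while one that waits for the greeting is "ok". With a pause of 0 nothing can arrive early, so every connection passes.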
To enable the feature, you need to make two changes. First, in your sendmail.mc file:

FEATURE(access_db)dnl
FEATURE(`greet_pause',5000)

You probably already have access_db defined; it just needs to appear somewhere prior to greet_pause. The number is how many milliseconds to pause; 5000 = five seconds. Then in your access file you should add this:

GreetPause:localhost 0

The second change prevents the pause from applying
Re:nowhere by njcoder · 2005-06-08 07:52 · Score: 3, Informative

I used to use asdf.com all the time too.. Then one day I decided to see if it actually existed. This is a funny read. :)

--
Open Source Java DAO Generator
Re:qmail by spun · 2005-06-08 08:17 · Score: 5, Informative

Short Answer: No, but other people do.
Long Answer: The concern is the misdirected bounce. By default and in accordance with the RFC, qmail bounces messages it accepts then later decides it can't deliver back to the sender. Spammers use false return addresses, so you end up bouncing spam back to innocent third parties. When used with naive spam-filtering techniques, this can be a problem i.e. qmail accepts the message, but a spam filter rejects it, and it is bounced. Here's what SpamCop.net has to say about it:

Qmail: Qmail is one popular mail exchanger which suffers from this problem by default. If you use qmail, please apply a patch: spamcontrol or qmail-ldap.
There is also an experimental patch for qmail which allows you to send bounces, but isolate them on a different IP address (so that spamcop can block them without blocking other mail): Richard Lyons BOUNCEQUEUE patch

PZInternet.com reports chkuser is a very good qmail patch to avoid misdirected bounces - very easy to install too! http://www.interazioni.it/opensource/chkuser/

For users of qmail-toasters, check out the simscan patch

Everything anti-spam is done by people other than djb. I love qmail, but it really isn't the easiest server to set up for spam control. One needs about a dozen patches to get it working right.

--
- None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
The anatomy of successful spam filtering by Pfhorrest · 2005-06-08 08:38 · Score: 2, Informative

I've had my current email address for the past 13 or 14 years.

(In fact the ISP it's hosted with currently hosts ONLY that email address and a tiny hunk of web space for me; I get my actual connection and everything from Cox).

My address has been plastered all over the Internet since before there was a spam problem. Even if I were to take it off of all the sites I've made, or ask for it to be taken down from all the other sites, there's still hundreds of UseNet posts from before there was need to spam-proof my address, all cached on the various web-based UseNet caches.

At one point a few years back I was getting many hundred spam messages a day. Now, I get about two. And I've not had any problems with false positives that I'm aware of, at least not for quite a while.

I don't run my own mail server and I don't know how West.net (my mail provider) runs theirs, but I do know they run a nice spam filtering service called Postini, which catches a large majority of the spam. When it gets to my end, I've got extensive whitelists for all the discussion lists I'm on, as well as everyone in my address book (everyone I've sent mail to, basically). A lot of spam I'd get has my own address forged onto it, so any mail from me that doesn't contain my passphrase in the subject is blacklisted. I've also got a blacklist for serious repeat spammers (same exact spam every day). Past that, Mail's Bayesian filtering quarantines most of the remaining messages, and all that ends up in my inbox are legit messages from people I don't know, and maybe one or two spam messages.

I think the common thread between the article's successful spam filtering and my successful spam filtering is using multiple layers of whitelists, blacklists, and greylists. Keep the people you know on whitelists so you never need to worry about them not getting through; people doing evil things get blacklisted, preferably temporarily as he's done it; and everyone else takes the risk of being filtered (either because their mail server is dysfunctional, as some of his filters would risk, or because the message "looks like spam" as a Bayesian filter would risk). Implement this type of scheme on both the mail server (his way) and the client program (my way) for extra protection.
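
The layering described above can be written down as an ordered series of checks. This is a sketch of the general idea, not anyone's actual mail rules: `classify`, the 0.9 threshold, and `spam_score` (a stand-in for whatever a Bayesian filter would return) are all assumptions.

```python
def classify(sender: str, subject: str, spam_score: float, *,
             whitelist: set, blacklist: set,
             own_address: str, passphrase: str,
             threshold: float = 0.9) -> str:
    """Run a message through the layers in order; first match wins."""
    # Mail claiming to be from me must carry the passphrase,
    # since spammers often forge the recipient's own address.
    if sender == own_address:
        return "inbox" if passphrase in subject else "blocked"
    if sender in whitelist:      # known correspondents always get through
        return "inbox"
    if sender in blacklist:      # known repeat offenders never do
        return "blocked"
    if spam_score >= threshold:  # everyone else is judged on content
        return "quarantine"
    return "inbox"
```

Only strangers ever reach the scoring step, so a false positive there cannot affect mail from people already on the whitelist.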

I think that's about as successful as anyone can hope for a spam filter.

--
-Forrest Cameranesi, Geek of all Trades
"I am Sam. Sam I am. I do not like trolls, flames, or spam."
Re:What to do... by jefp · 2005-06-08 10:06 · Score: 2, Informative

Ooo, good point. PIPELINING is now disabled on acme.com. Thanks!
Greylisting blocks email from Slashdot by hadaso · 2005-06-08 20:55 · Score: 2, Informative

> Greylisting will prevent you from receiving email
> from a variety of non-complying SMTP hosts ...

such as slashdot.org?

I tried enabling greylisting on the sneakemail.com address I use to receive email from Slashdot, and it blocked all the email from Slashdot. The logs on sneakemail show many delivery attempts from Slashdot, so I guess there is some kind of incompatibility between the way Slashdot tries to resend the message and the way Sneakemail expects it to be resent. I don't know who is to blame for the incompatibility. Probably no one, since there is no specification on HOW redelivery should be attempted. Anyway, it shows that there can be problems with greylisting because the way a client resends the mail is not well defined.

On the other hand, greylisting is a very effective filter. I enabled greylisting on the address I have in the whois record of my domain, and I get practically no spam to that address (before greylisting I got quite a lot, and the sneakemail greylisting logs list lots of attempts that are easily recognizable as spam: lots of broadband connection IPs, and "from" address from domain not matching sending server.).

Publishing an address in Slashdot is the most effective way to receive spam, and receive spam fast. About 10 days ago I changed the address I use in Slashdot. The next day I already received spam on that address. The older address is now greylisted and doesn't receive any mail, but the logs show many messages blocked by greylisting (31 yesterday). What I do now is change the address I publish in Slashdot every once in a while, and enable greylisting the old address. It doesn't block all spam, but it takes a while for the volume of spam to the new address to build.
Preventing False Positives is a critical feature by billstewart · 2005-06-09 02:46 · Score: 2, Informative

If you RTFA, Poskanzer points out one of the critical features, which is that unlike RBLs, Greylisting is safe because it doesn't do false-positive rejections of email from legitimate senders - it just delays them. That's not 100% accurate - somebody running SMTP on a dialup could get repeated rejections until their mailer gives up, but that's pretty rare and they'd at least get a rejection message as opposed to a silent discard.
Without downloading and unzipping your code, I can't tell how your blacklisting features work, but an obvious extension to a greylisting system is to give RBLed sites a much longer greylist time than mail from unknown sites (e.g. 4-hour retries vs. 5-minute.) It's particularly useful because you can even use some of the more aggressive lists in spite of their enjoyment of collateral damage, and you can use whole-country blocklists for places you don't expect to get mail from, such as Korea and China, without actually rejecting much mail from real people.
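
A minimal sketch of the tiered-delay extension suggested above, using the 4-hour and 5-minute figures from the example. The function names are made up, and a plain set of IP addresses stands in for a real RBL lookup, which would normally be a DNS query.

```python
RBL_DELAY = 4 * 60 * 60   # seconds to hold off senders found on a blocklist
DEFAULT_DELAY = 5 * 60    # seconds to hold off ordinary unknown senders


def greylist_delay(ip: str, rbl_listed: set) -> int:
    """Return how long a first-time sender must wait before a retry counts."""
    return RBL_DELAY if ip in rbl_listed else DEFAULT_DELAY


def may_deliver(ip: str, first_seen: float, now: float, rbl_listed: set) -> bool:
    """True once the sender has waited out the delay for its tier."""
    return now - first_seen >= greylist_delay(ip, rbl_listed)
```

A listed host that retries after ten minutes is still deferred, while an unlisted host making the same retry is accepted. Nothing is rejected outright, so a legitimate sender on an aggressive list is delayed rather than lost.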

--

Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks