Spam Catchers Block Latest Crypto-Gram

Posted by timothy on Saturday February 15, 2003 @05:48PM from the unintentional dept.

An anonymous reader writes "Bruce Schneier sent out a note about SpamAssassin and possibly other spam filters blocking his excellent Crypto-Gram newsletter. Fortunately you can get it here (early no less!)." Schneier's email reads, in part "Tomorrow I will be sending out the February CRYPTO-GRAM, as I do on the 15th of every month. In the process of creating this month's Crypto-Gram, I discovered that SpamAssassin thinks that this issue is spam, probably because of certain links and descriptions of scams in the text. I have anecdotal evidence that other spam filters block Crypto-Gram as well. ... I'd apologize for the inconvenience, but I'm not sure what I could do to make it less so -- I don't intend to alter my content to accommodate spam filters."

15 of 238 comments (clear)

Can someone run it through SpamAssassin? by Leeji · 2003-02-15 18:57 · Score: 2, Informative

When you run SpamAssassin in test mode, it tells you what rules got hit. You can also look at the headers in "Spam-Tagged" email to see what rules got hit. I looked for "Spam Testing" pages on the 'net, but had no luck.
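For the header route, a minimal sketch in Python of pulling the rule names out of a tagged message. The sample message is made up, and the `X-Spam-Status` layout (`hits=`, `required=`, `tests=`) is an assumption modeled on SpamAssassin 2.x output; other versions and local configurations may format it differently. `rules_hit` is a hypothetical helper name.

```python
import email
import re

# Made-up sample; the header layout is an assumption and varies by version.
RAW = (
    "From: newsletter@example.com\n"
    "Subject: Example newsletter, February 15\n"
    "X-Spam-Status: Yes, hits=5.9 required=5.0\n"
    "\ttests=OPT_IN,NORMAL_HTTP_TO_IP,\n"
    "\tRAZOR2_CHECK version=2.44\n"
    "\n"
    "Body text.\n"
)

def rules_hit(raw_message):
    """Return the rule names listed in the X-Spam-Status header, if any."""
    msg = email.message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    # Undo header folding inside the comma-separated list of tests.
    status = re.sub(r",\s+", ",", status)
    match = re.search(r"tests=(\S+)", status)
    return match.group(1).split(",") if match else []

print(rules_hit(RAW))
```

A message with no such header simply yields an empty list, so the helper can be run over a whole mailbox without special-casing untagged mail.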

Could someone run the Crypto-Gram newsletter through SA to find out what caused its evaluation?

As an aside, Counterpane could have done this to find out what the problem was, too. Not that they should have to, but they could have.

--
It all goes downhill from first post ...
1. Re:Can someone run it through SpamAssassin? by MavEtJu · 2003-02-15 21:15 · Score: 2, Informative
  
  SPAM: FROM_MISSING (-0.0 points) Missing From: header
  SPAM: DATE_MISSING (0.8 points) Missing Date: header
  SPAM: SUBJ_MISSING (0.3 points) Subject: is empty or missing
  SPAM: MISSING_HEADERS (1.0 points) Missing To: header
  
  See this posting for one with the headers, which shows that SpamAssassin doesn't tag it as spam anyway.
  
  --
  bash$ :(){ :|:&};:
Re:Whitelist by SimplyCosmic · 2003-02-15 20:02 · Score: 2, Informative

Well, in terms of SpamAssassin, you could create rules that subtract a particular number of points from the spam score of a given message rather than letting it through automatically. That gives it a better chance of getting through as long as the content is otherwise pretty un-spam-like.
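A rough sketch of the difference, in Python rather than SpamAssassin's own rule syntax. The helper name, the sender address, and the 3.0-point adjustment are all made up for illustration, and the `>=` comparison against the 5.0 default threshold is an assumption about how the cutoff is applied.

```python
THRESHOLD = 5.0  # default "required" score

def is_flagged(score, sender, whitelist=(), adjustments=None):
    """Hypothetical helper: decide whether a message gets flagged as spam."""
    if sender in whitelist:
        # Hard whitelist: the message passes no matter what it contains.
        return False
    # Soft rule: subtract points for this sender, then compare as usual.
    adjusted = score - (adjustments or {}).get(sender, 0.0)
    return adjusted >= THRESHOLD

soft = {"newsletter@example.com": 3.0}
print(is_flagged(5.9, "newsletter@example.com", adjustments=soft))   # borderline mail gets through
print(is_flagged(12.0, "newsletter@example.com", adjustments=soft))  # very spammy mail is still flagged
```

The point of the soft rule is the second call: a forged message that looks overwhelmingly like spam still gets caught, which a plain whitelist would not do.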
Re:In principle, yes, in practice, no. by Mr.+X · 2003-02-15 20:04 · Score: 2, Informative

This is one of the key features of PGP/GPG. It's called signing a message, and there is an option to encode the entire message and not just its hash.
Crazy talk ... that's user error by blinq · 2003-02-15 20:18 · Score: 2, Informative
SpamAssassin works the way you tell it to work. If you feed it all your mail and don't bother to pre-filter or whitelist known good mail, it's your fault if SA flags things such as newsletters as SPAM.
I use procmail with SpamAssassin in this manner:
- add procmail filters to put messages from family members and close friends into my INBOX
- add procmail filters to sort out messages from mail lists and newsletters
- adjust individual scores for SpamAssassin rules if necessary (usually I adjust them so a matched rule's score is higher than the default score)
- whitelist addresses from family members and close friends in SA's user preferences (a redundant measure just for the heck of it)
- let any mail that isn't sorted by my procmail filters be checked by SpamAssassin
  - messages flagged as spam by SA are put aside into a spam folder
  - messages not flagged as spam by SA make it to my INBOX
It only takes a little bit of thought and minimal configuration to keep your mail from incorrectly being flagged as SPAM. For me, using this method has led to zero (0) false positives on messages from known sources, for two years. Every once in a while a SPAM message sneaks into my INBOX (a couple a year), but then I submit it to a SPAM database used in SA's checks (like Razor), or adjust any particularly annoying rules' scores, and it doesn't make a repeat appearance for me.
If you find that any particular newsletter is being treated as SPAM by your mail filters, there's probably a very simple way for you to make sure it isn't filtered out. Use the tools you have wisely, and you won't be disappointed.
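The ordering described above can be sketched in Python. This is not procmail syntax; `route`, the folder names, and the addresses are hypothetical, and `spam_score` is a callable standing in for a SpamAssassin run so that the filter is only consulted for mail nothing else has claimed.

```python
THRESHOLD = 5.0  # default "required" score

def route(sender, list_name, spam_score, trusted, lists):
    """Hypothetical router: pick a folder for one message."""
    if sender in trusted:
        # Family and close friends go straight to the inbox.
        return "INBOX"
    if list_name in lists:
        # Known mailing lists and newsletters get their own folders.
        return "lists/" + list_name
    # Only unsorted mail is scored at all.
    return "spam" if spam_score() >= THRESHOLD else "INBOX"

trusted = {"mom@example.com"}
lists = {"crypto-gram"}

print(route("mom@example.com", None, lambda: 9.9, trusted, lists))
print(route("list-owner@example.com", "crypto-gram", lambda: 5.9, trusted, lists))
print(route("stranger@example.com", None, lambda: 5.9, trusted, lists))
```

Because the newsletter is claimed by the list rule first, its 5.9 score never comes into play.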
--
~Chris
Re:SPEWS by GammaTau · 2003-02-15 20:53 · Score: 2, Informative

http://www.antispews.org/ the SPEWS fansite (not!)

Heh, this antispews.org money-making scam is a rather funny one. Strangely enough, the Hostway Corporation started hosting the site three days after t3marketing lost their lawsuit against Joe McNicol. The Hostway Corporation is behind t3marketing and many other "direct marketing" buggers. So it's no wonder that they are listed in SPEWS and are using every possible means - suing spamfighters, spreading FUD, etc. - to keep poisoning our mailboxes.

That being said, I'm not sure if the SPEWS way of doing things is such a good idea but the antispews.org site is still run by spammers and should be treated as such.
Re:The problem with filters by carsten · 2003-02-15 21:38 · Score: 2, Informative

Well, I guess I only get spam from non-professional spammers then. I run spamassassin on my server and almost never get any spam in my Inbox. I get maybe 5-10 spams a day and they all get tagged by spamassassin and filtered by procmail into a folder where I check them for false positives before deleting. The only false positive I get is a newsletter from the airline KLM, for which I am too lazy to set up a procmail filter since I never read it anyway.

I have filters for all my mailing lists and so forth in my .procmailrc and then the spamassassin filter at the end. Works like a charm for me.
initial analysis for Bruce by Daniel+Quinlan · 2003-02-15 21:59 · Score: 5, Informative
I'm one of the SpamAssassin (SA) developers and I asked Bruce to send me a copy of the newsletter after hearing about his note of warning a few days ago.
Aside from the spot-on comments that people have made regarding adding a whitelist entry for Crypto-Gram (an obvious candidate for whitelisting if there ever was one, given that it frequently discusses spam, scams, and probably even includes text straight out of some spams), here is my initial analysis and response to him.
Oh, first one other comment: SpamAssassin does not block content. SpamAssassin only flags probable spam. What the site or user does with that flag is their own business. Some mail administrators misuse SpamAssassin to block email, but we do not recommend blocking email. Really.
------
[...] One false positive (or a related set of false positives) is not really a statistically useful sample size. To get to a high rate of filtering, most filters do have some false positives. You can get fewer false positives with customization of one form or another (personalized Bayes training, whitelists, rules, automatic learning algorithms). Our goal (everyone's goal, I think) is to get the best ratio of false positives to false negatives. It's a difficult balance sometimes and some legitimate content has a harder time.
On to the data:
I checked your newsletter with two versions of SpamAssassin: the current stable version (2.44) and the very-soon-to-be-released development version (2.50).
A score of 5.0 is the default threshold to be flagged as spam.
In SA 2.44, your mail receives a score of 3.20 (2.40 as I received it, but I believe the score would be about 3.20 for most people). That's on the high side, but it has a bit to go before being flagged as spam. The score is the same with network tests (DNS blacklist tests and Razor).
In SA 2.50, your message would probably receive a score of 1.90 without network tests and 1.00 with network tests. Note that the test scores may change a bit before the final release of 2.50, but those are better scores, more what we like to see for non-spam content. They would be even lower when using Bayes (part of SA 2.50). Those lower scores are not unexpected because... well, 2.50 is better. :-)
Based on these results, it's not clear to me why yesterday's newsletter was flagged as spam. Some possibilities:
- your newsletter is routed through blacklisted hosts for some people
- some people are using old or misconfigured versions of SpamAssassin (extra rules, additional blacklists, many possibilities here)
- the newsletter as received by some subscribers is substantially different than what you sent me
- something else?
Can you give me more information about the false positive that you experienced or was reported to you?
Thanks.
Dan
------
If I find out more of interest before the thread is closed to comments, I'll try to post a follow-up to my post.
Ancient Procmail Secret... by WWWWolf · 2003-02-15 23:15 · Score: 2, Informative

..."Ancient Gurus srb and guenther say, 'Sort your mailing lists to the folders before you filter your spam.'"
Crypto-Gram isn't the only mailing list that gets hit by misunderstandings; automatic mail handling is routinely confused by automailers and mailing lists. Even for usability reasons alone, it makes sense to sort mailing lists into folders anyway, and to use a client that supports multiple specific folders.
Re:um, i could be terribly wrong here by Feztaa · 2003-02-15 23:25 · Score: 3, Informative

There would be a tremendously large problem with encrypting the message to all of its recipients...

See, when you PGP encrypt some text, it is only possible to encrypt it to one person (one public key). That's just how it works, it's inherent in the encryption methods used; however, PGP and GPG get around this by duplicating the entire message for each public key that it is encrypted to.

My point is that if you had a mailing list with 1000 subscribers, and you wanted to encrypt it, you'd basically be increasing the size of the encrypted message 1000-fold, because you need 1000 copies of the message, each encrypted to a given recipient. Obviously, this isn't feasible...

What they could do, though, is sign the messages. I know SpamAssassin, at least, reduces a message's spam score if there is a PGP signature attached to it.

However, if you were just trying to obscure the contents of the mail from the spam filter but not the user, you could just gzip the message and make it an attachment. I don't know how well that would go over with the spam filter, but at least it wouldn't find your m/blow.*job/s in the message ;)
please no by upper · 2003-02-16 00:22 · Score: 5, Informative

A "solution" like that would trash my outbound mail. I forge my From: addresses routinely.
My primary mailbox is with a small, local ISP. I can't buy broadband from them, so I get my connectivity via cablemodem. I do have a mailbox in the cablemodem company domain -- that's the one I give out when I expect abuse. (I do it this way because I expect to be dealing with that ISP long after the cable vendor has either ceased to exist or has treated me badly enough that I left.)
So I want my outbound mail to appear to have come from the ISP. Setting Reply-To is usually adequate, but not always -- when a human is looking for the address, they could easily grab the wrong one. And it creates potential confusion I don't want to create. So I set my from address to name@isp.com.
I can't relay through the ISP's relays, because I'm outside of their IP range. (If they did some form of authenticated SMTP, such as SMTP-after-POP, they could let me.) And the cable vendor's mail relays won't send mail out with some other domain name on it. So I send everything out directly, no relays.
If you look at many headers, I suspect you'll find that I'm not the only one forging my From: address for legit reasons. The presence of the X-Authentication-Warning header some MTAs add correlates fairly weakly with spam. (Some details of it -- e.g. no valid reverse DNS for the sending machine's IP -- could be useful indicators.)
DNSbl operations by Anonymous Coward · 2003-02-16 02:00 · Score: 1, Informative

Considering that most of the time it is Net blocks that are blocked, not just individual IP addresses.(sic)
But most of the time does not really matter, what matters is the DNSbls upon which your handling is based. After a brief foray into listing /24s, SpamCop has returned to its original practice of listing only the offending IP addresses.
If I had a grudge against an ISP, I could fake some SPAM headers and send it to any of the IP blockers.
And you could get your right to submit spam revoked when the ISP complained.
Bad news, it's in Razor by imroy · 2003-02-16 04:40 · Score: 3, Informative

I just got the email today and it failed. I'm running 2.44 from Debian and haven't yet looked at tweaking any of the rules.

Here's the verbose banner that SA put on my copy:

SPAM: Content analysis details: (5.90 hits, 5 required)
SPAM: SUBJECT_MONTH_2 (-0.5 points) Subject contains a month name - probable newsletter (2)
SPAM: SUBJECT_MONTH (-0.5 points) Subject contains a month name - probable newsletter
SPAM: OPT_IN (1.5 points) BODY: Talks about opting in
SPAM: US_DOLLARS_4 (0.4 points) BODY: Nigerian scam key phrase ($NNN.N m/USDNNN.N m/US$NN.N m)
SPAM: US_DOLLARS_2 (0.1 points) BODY: Nigerian scam key phrase ($NNN.N m/USDNNN.N m/US$NN.N m)
SPAM: BALANCE_FOR_LONG_20K (-0.7 points) BODY: Message text is over 20K in size
SPAM: BALANCE_FOR_LONG_40K (-0.1 points) BODY: Message text is over 40K in size
SPAM: SPAM_PHRASE_01_02 (0.5 points) BODY: Spam phrases score is 01 to 02 (low)
SPAM: [score: 1]
SPAM: NORMAL_HTTP_TO_IP (1.3 points) URI: Uses a dotted-decimal IP address in URL
SPAM: RAZOR2_CHECK (3.9 points) Listed in Razor2, see http://razor.sf.net/

It looks like some dumbass has entered it into Razor. Unfortunately, some people (and yes, I did this originally) had their procmail set up to enter an email into Razor if it is deemed "spam" by SA or something else. Those 3.9 points are what put it over the threshold.
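Adding up the rule scores from the report bears this out. A small check in Python, using `Decimal` to avoid floating-point noise; the dictionary is just the numbers copied from the banner, and the `>=` comparison against the required score is an assumption about how the cutoff is applied.

```python
from decimal import Decimal

# Rule scores copied from the SpamAssassin banner.
scores = {
    "SUBJECT_MONTH_2": "-0.5",
    "SUBJECT_MONTH": "-0.5",
    "OPT_IN": "1.5",
    "US_DOLLARS_4": "0.4",
    "US_DOLLARS_2": "0.1",
    "BALANCE_FOR_LONG_20K": "-0.7",
    "BALANCE_FOR_LONG_40K": "-0.1",
    "SPAM_PHRASE_01_02": "0.5",
    "NORMAL_HTTP_TO_IP": "1.3",
    "RAZOR2_CHECK": "3.9",
}
REQUIRED = Decimal("5")

total = sum(Decimal(points) for points in scores.values())
without_razor = total - Decimal(scores["RAZOR2_CHECK"])

print(total, total >= REQUIRED)                  # 5.9 True
print(without_razor, without_razor >= REQUIRED)  # 2.0 False
```

Without the Razor hit, the same message scores 2.0, well under the threshold of 5.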
Re:um, i could be terribly wrong here by Phil+Gregory · 2003-02-16 06:17 · Score: 2, Informative

See, when you PGP encrypt some text, it is only possible to encrypt it to one person (one public key). That's just how it works, it's inherent in the encryption methods used; however, PGP and GPG get around this by duplicating the entire message for each public key that it is encrypted to.

Incorrect. When PGP or GnuPG encrypts a message with a public key, they really just encrypt the message with a symmetric cypher and a sufficiently long, random key. Then they encrypt the key with the public key. (The reason for this is that public key cryptography is much, much slower than symmetric key stuff.) So for sending to multiple recipients, all that needs to be added is some additional header data for each recipient.
-rw-r--r-- 1 phil phil 212358 2003-02-16 13:01 original
-rw-r--r-- 1 phil phil 90343 2003-02-16 13:02 one-recipient.gpg
-rw-r--r-- 1 phil phil 90893 2003-02-16 13:04 three-recipients.gpg
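A back-of-the-envelope extrapolation from those file sizes, in Python. It assumes the overhead per extra recipient stays constant and that every subscriber's key adds about as much as these did, which is a guess; the 1000-subscriber figure is the one from the parent post.

```python
one_recipient = 90343     # bytes, one-recipient.gpg
three_recipients = 90893  # bytes, three-recipients.gpg

# Two extra recipients added 550 bytes in total.
per_extra_recipient = (three_recipients - one_recipient) // 2

subscribers = 1000
shared = one_recipient + (subscribers - 1) * per_extra_recipient
duplicated = subscribers * one_recipient  # if every recipient needed a full copy

print(per_extra_recipient)  # 275
print(shared)               # 365068
print(duplicated)           # 90343000
```

Under those assumptions the 1000-recipient message is roughly four times the single-recipient size, not a thousand times.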
A better solution would still be to encrypt the message with a particular public key for which the private key was widely available. Encrypting the message with Bruce Schneier's private key makes sense cryptographically, but I don't believe PGP and GnuPG support that sort of behavior.

--Phil (Far too much of a crypto geek)

--
355/113 -- Not the famous irrational number PI, but an incredible simulation!
Re:The problem with content filtering by rookkey · 2003-02-16 08:16 · Score: 2, Informative

The number of people using something like SpamAssassin are so small, it's not worth their time.

Not for long. Filtering software such as SpamAssassin is now being used at the server level to recognize junk email for thousands of clients.

For example, the University of Colorado at Boulder now uses SpamAssassin to scan all incoming student email. This means SpamAssassin handles the spam filtering needs of a student population of 30,000. There is no doubt that as the spam problem increases, filtering solutions will begin to appear at the ISP level.