Slashdot Mirror


Spam Catchers Block Latest Crypto-Gram

An anonymous reader writes "Bruce Schneier sent out a note about SpamAssassin and possibly other spam filters blocking his excellent Crypto-Gram newsletter. Fortunately you can get it here (early no less!)." Schneier's email reads, in part "Tomorrow I will be sending out the February CRYPTO-GRAM, as I do on the 15th of every month. In the process of creating this month's Crypto-Gram, I discovered that SpamAssassin thinks that this issue is spam, probably because of certain links and descriptions of scams in the text. I have anecdotal evidence that other spam filters block Crypto-Gram as well. ... I'd apologize for the inconvenience, but I'm not sure what I could do to make it less so -- I don't intend to alter my content to accommodate spam filters."

24 of 238 comments

  1. Hopefully SpamAssassin didn't by Chris_Stankowitz · · Score: 5, Funny

    block that important e-mail I was waiting for on enlarging my....never mind, I have to check my e-mail now.

  2. um, i could be terribly wrong here by Anonymous Coward · · Score: 4, Interesting

    but why not distribute the newsletter encrypted? Then the spam filters wouldn't have anything to trigger on, and I'd say the target audience has the knowledge to decrypt it when it gets there.

    1. Re:um, i could be terribly wrong here by Feztaa · · Score: 3, Informative

      There would be a tremendously large problem with encrypting the message to all of its recipients...

      See, when you PGP encrypt some text, it is only possible to encrypt it to one person (one public key). That's just how it works, it's inherent in the encryption methods used; however, PGP and GPG get around this by duplicating the entire message for each public key that it is encrypted to.

      My point is that if you had a mailing list with 1000 subscribers, and you wanted to encrypt it, you'd basically be increasing the size of the encrypted message 1000-fold, because you need 1000 copies of the message, each encrypted to a given recipient. Obviously, this isn't feasible...

      What they could do, though, is sign the messages. I know SpamAssassin, at least, reduces a message's spam score if there is a PGP signature attached to it.

      However, if you were just trying to obscure the contents of the mail from the spam filter but not the user, you could just gzip the message and make it an attachment. I don't know how well that would go over with the spam filter, but at least it wouldn't find your m/blow.*job/s in the message ;)
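      The gzip-as-attachment idea can be sketched roughly as follows. This is only an illustration using Python's standard library; the function names, the file name `newsletter.txt.gz`, and the example addresses are made up, and nothing here says how any particular spam filter treats compressed attachments.

```python
import gzip
from email.message import EmailMessage


def build_gzipped_newsletter(sender, recipient, subject, body_text):
    """Return an EmailMessage whose real content travels as a gzip attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # The visible body is a short note, so there is little text to scan.
    msg.set_content("This issue is attached as newsletter.txt.gz.")
    compressed = gzip.compress(body_text.encode("utf-8"))
    msg.add_attachment(
        compressed,
        maintype="application",
        subtype="gzip",
        filename="newsletter.txt.gz",  # hypothetical file name
    )
    return msg


def read_gzipped_newsletter(msg):
    """Recover the original text from the first attachment."""
    attachment = next(msg.iter_attachments())
    raw = attachment.get_payload(decode=True)  # undo the base64 transfer encoding
    return gzip.decompress(raw).decode("utf-8")
```

      The reader then needs one extra step (gunzip) to get at the text, which is the trade-off of this approach.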

  3. Seems like it worked fine.... by telstar · · Score: 4, Funny

    So he sends out the Crypto-Gram newsletter, then he sends out a note about the Crypto-Gram newsletter. 2 emails to cover what should've been sent as 1. Seems like the spam filter is doing just fine ...

  4. White List by SealBeater · · Score: 4, Insightful

    That's easy to fix: add the Crypto-Gram address to a whitelist. Every spam
    filtering program I've ever run, including SpamAssassin (which I like a great
    deal), has a whitelist option. If you're running some kind of filtering
    software, it behooves you to keep an eye on what it's blocking; hence, I am
    sure that people are aware of it and have adjusted their software accordingly.

    SealBeater

    --
    -- Its survival of the fittest...and we got the fucking guns!!!
  5. Whitelist by sean23007 · · Score: 5, Interesting

    That's why most good spam blockers (especially OS X's Mail.app) use their filters but compare the senders to a whitelist so that your friends can send you whatever they want to. If you've been receiving CRYPTO-GRAM for a while, it should be on your whitelist, and the blocker should just let it by.

    But you don't always want to get everything people send you (everybody has those people who send you things they think are funny but you just can't stand). So there should be levels of "friendship" in the whitelist, so that some senders can be considered dubious (their mail shouldn't be deleted like spam, but perhaps placed in a different "Uninteresting" folder).
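    A whitelist with levels of "friendship" might look something like the sketch below. The trust levels, the folder names, and the 5.0 cutoff are assumptions chosen for illustration; this is not how Mail.app or any specific filter implements it.

```python
from enum import Enum


class Trust(Enum):
    FRIEND = "friend"    # always delivered to the inbox
    DUBIOUS = "dubious"  # never treated as spam, but filed separately


SPAM_THRESHOLD = 5.0  # assumed cutoff for unknown senders


def choose_folder(sender, spam_score, whitelist):
    """Pick a folder from the sender's trust level, falling back to the filter score."""
    trust = whitelist.get(sender.lower())
    if trust is Trust.FRIEND:
        return "Inbox"
    if trust is Trust.DUBIOUS:
        return "Uninteresting"
    return "Spam" if spam_score >= SPAM_THRESHOLD else "Inbox"
```

    With this arrangement the content score only matters for senders who are not on the list at all.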

    --

    Lack of eloquence does not denote lack of intelligence, though they often coincide.
  6. SpamAssassinAssassin by Anonymous Coward · · Score: 5, Funny

    SpamAssassinAssassin could look at the folder where you put your filtered mail and learn what to pull back out, and flush the rest to /dev/null.

    I'm sure Paul Graham will be glad to write it in lisp.

    Or, of course, we could just go with the obvious solution: get a P.O. Box, send out spam for herbal viagra and penis enlargement, and when you get the checks in the mail HUNT THE CUSTOMERS DOWN AND KILL THEM.

    It's simple, really.

  7. This is a non-issue.... by MrByte420 · · Score: 4, Interesting

    False positives should be a non-issue. Either you choose to run a spam filtering software and live with those limitations or don't run a spam filtering program and deal with the extra emails about enlarging various organs that you will receive every day.
    I do tech support for a web hosting company, and people call us every day complaining about their spam, but as soon as we offer blocking software based on lists, etc., all we get are complaints that some more-valuable-than-gold email is going to get lost and ruin their entire business.

    This is a simple choice and people have to learn they can't have their cake and eat it too.

    --
    If religous zealots don't believe in Evolution, then why are they so worried about bird flu?
    1. Re:This is a non-issue.... by Elwood+P+Dowd · · Score: 4, Insightful

      Thank you. Also, if all the Bayesian filtering advocates are right, then the users should be able to mark the Crypto-Gram as non-spam, and the filter should adapt. More to your point, though, is that lack of spam-filtering software can cause false positives in your own personal, analog, spam filtering algorithm. Many of my users have manually deleted important, non-spam, automated emails because they thought they were spam. Sometimes, the machine might have fewer false positives than they would.

      Huh. It occurs to me that it seems like some spam filters might pass a turing test if the only output is their spam judgment. Wow. The future is now, dude.

      --

      There are no trails. There are no trees out here.
    2. Re:This is a non-issue.... by 1u3hr · · Score: 3, Insightful
      Either you choose to run a spam filtering software and live with those limitations or don't ...

      Except if it's done upstream from you, perhaps even without your knowledge (e.g. a few months ago it was found that Mac.com was aggressively filtering, with a lot of false positives).

  8. The problem with filters by markfletcher · · Score: 5, Insightful
    This illustrates one of the big problems with filters. They will never be perfect, spammers are always adjusting to them (even the Bayesian ones), and the way many are implemented, they make email unreliable (by deleting suspected spam messages and not bouncing them). Blocking untrusted servers by IP address avoids these issues.

    obPlug: This is why I created Trustic.

  9. The problem with content filtering by Leeji · · Score: 4, Insightful

    This is exactly the problem with most content filtering approaches.

    It is very hard to discern the difference between talk about sex, spam, viruses, etc and talk from sex, spam, viruses, etc. Newsletter authors go as far as writing "v*rus" and "sl*mmer" so that pitiful content filtering blocks don't trash them.

    It gets even worse for email lists that use inline text ads. The ads alone would constitute spam, but they're nestled within several paragraphs of high-quality discussion.

    The problem is that content filtering approaches usually only analyze the "spamminess" of a piece. They usually don't analyze the "goodness" of a piece. So if I put "hot teens go crazy for debt-free viagra while earning $$$ from home" in the middle of some fine Shakespeare, that will get flagged as spam.

    The new "bayesian" approaches are finally dealing with this problem -- something can look an awful lot like spam, but it will be saved if it looks even more like legitimate email.

    In this case, spam doesn't generally run for 21 pages with words like "cryptography," and "full disclosure."
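    To make the "spamminess" versus "goodness" point concrete, here is a deliberately tiny word-frequency scorer in the naive Bayes style. It is a toy under stated assumptions (made-up training text, add-one smoothing, no prior, whitespace tokenizing), not the algorithm used by SpamAssassin or any other real filter.

```python
import math
from collections import Counter


def train(messages):
    """Count word occurrences across a list of example messages."""
    counts = Counter()
    for text in messages:
        counts.update(text.lower().split())
    return counts


def spam_log_odds(text, spam_counts, ham_counts):
    """Sum per-word log-likelihood ratios; positive means it looks more like spam."""
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())
    vocab = len(set(spam_counts) | set(ham_counts))
    score = 0.0
    for word in text.lower().split():
        # Add-one smoothing so unseen words do not produce log(0).
        p_spam = (spam_counts[word] + 1) / (spam_total + vocab)
        p_ham = (ham_counts[word] + 1) / (ham_total + vocab)
        score += math.log(p_spam / p_ham)
    return score


# Hypothetical training data, for illustration only.
SPAM = train(["hot teens viagra debt-free", "earn $$$ from home viagra"])
HAM = train(["cryptography full disclosure of the vulnerability",
             "the cipher and the protocol analysis"])

spammy_alone = spam_log_odds("viagra from home", SPAM, HAM)
spammy_in_context = spam_log_odds(
    "viagra from home cryptography full disclosure of the cipher protocol analysis",
    SPAM, HAM)
```

    Here `spammy_alone` comes out positive while `spammy_in_context` comes out negative: the same spammy phrase is outweighed once enough legitimate-looking words surround it.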

    --
    It all goes downhill from first post ...
    1. Re:The problem with content filtering by Tricot · · Score: 5, Funny

      ...if I put "hot teens go crazy for debt-free viagra while earning $$$ from home" in the middle of some fine Shakespeare, that will get flagged as spam.

      eMerchant of Venice. Act I Scene IV, right?

    2. Re:The problem with content filtering by 1u3hr · · Score: 3, Insightful
      In this case, spam doesn't generally run for 21 pages with words like "cryptography," and "full disclosure."

      The problem with that is that if you score mail by the percentage of spam, rather than the absolute amount, the obvious response by spammers is to ADD 21 pages cribbed from a crypto newsletter to the end of their penis-enlarging spam. Maybe even fake the headers to make it look like it came from a respected source.

  10. SPEWS by some1somewhere · · Score: 3, Insightful

    At least he is only being caught by SpamAssassin, which tends to be run on the client side, so statistically fewer people would miss the newsletter. If he were on the SPEWS's blocklist, he'd never get out!

    http://www.antispews.org/ the SPEWS fansite (not!)

    Personally I see less of a problem with client-side blocking, as there is less chance that any two people would use exactly the same combination of blocklists, priorities, etc. Plus, programs like SpamAssassin use quite a lot of processing power, so large mail servers (e.g. for an ISP) would need significant additional resources to handle this. Thus it is best to move such individualized and resource-intensive applications to the client side anyway.

    YMMV.

    --
    **FREE** Track and view your phone's via CellID and/or WIFI and/or GPS :- http://tinyurl.com/la6fhd
    1. Re:SPEWS by Skapare · · Score: 3, Insightful
      If he were on the SPEWS's blocklist, he'd never get out!

      And this is why the SPEWS blocklist is so effective and so good. If he were on it, then that would mean that he and/or his network fell into one of the following categories:

      • Is a spammer
      • Is an ISP harboring a spammer (or an upstream ISP thereof)
      • Is a customer of an ISP harboring a spammer

      Because spam causes abuse to email servers, even when the mail is refused either for reasons of an IP based blocklist, or for content filtering ... abuse in the form of higher costs for the server operators and recipients ... the proper goal is to get the spammer not just blocked from being able to get mail into your mailbox, but fully disconnected from the internet to prevent these kinds of costly abuses in the future. And since only the ISP hosting them can actually disconnect them, it will be the job of that ISP to do so. Most ISPs will when they realize the situation. A few ISPs refuse to, and that's when it comes time to put pressure on the ISP by expanding the blocking of the ISP's network, forcing them to consider that their legitimate customers will be leaving if they do not disconnect the spammer. SPEWS gradually expands listings so that the point where the ISP finally understands this can be reached with the minimum of so-called collateral damage (which is not really collateral, because these are customers who are paying money to an ISP which harbors spammers, so they share in the guilt).

      Bruce Schneier's mail server happens to not be listed by SPEWS. So it can be said that he is not a spammer, is not running an ISP that harbors spammers, and is not using an ISP that harbors spammers. That is a good thing and shows that SPEWS not only works, but works better than content based filtering.

      Content based filtering also is a direct violation of the principles of the US First Amendment right to free speech (although the actual amendment only applies to restrictions imposed by the government and does not apply to private businesses in most cases, if not all). Infringement of free speech happens when the decision is based on what the content is. When restrictions are not affected by the content, then such restrictions are considered fair since any content can be passed when the behaviour that evoked the restrictions is not done. And the whole spam issue is about behaviour, not content. The bad behaviour is the act of inappropriately choosing multiple recipients for sending the message ... e.g. unsolicited bulk email (UBE).

      Of course on your own mail server you have a right to use whatever methods you deem appropriate based on how you want to balance your costs, the quality of your service to your customers, and how much cost you want to pass on to your customers. Obviously you have to be in contractual agreement (possibly implied) with your customers about what methods are chosen. If you only offer one kind of service and your customer does not want that kind, by being properly aware of what you do offer, they can go elsewhere. Or you can offer a diversity of services the customer can choose from (e.g. a customer control panel to control the methods of spam filtering for their email accounts). So the choice of what method to use to block spam is strictly a relationship between a provider and its own customer.

      In the case of a network owned by a business only to serve that business function, then it's simply the commercial version of "my server, my rules".

      --
      now we need to go OSS in diesel cars
  11. A possible solution to the spam problem... by kcbrown · · Score: 4, Interesting
    Right now everyone is forced to accept email connections from anyone who sends email because it's not possible to tell ahead of time whether or not the connection is coming from someone who is reliable, right? And spammers take advantage of this by sending millions of messages from open relays. Blocking that is a virtual impossibility because which relays are open changes over time.

    The first inclination one has would be to suggest that everyone close their open relays. But this depends on people doing the right thing all the time, and has proven ineffective.

    Fortunately, there's another way.

    Right now, everyone who receives mail has to listen to everyone who tries to connect. The problem is how do you separate the wheat from the chaff?

    The solution is to take advantage of the information SMTP and TCP/IP give you when a connection is established. The fact that you're receiving a connection gives you the IP address of the sender. And during an SMTP transaction, one of the SMTP commands (the MAIL FROM command) gives you the domain of the email's sender, e.g. "MAIL FROM:<slashdot@sysexperts.com>".

    When you're sending email to someone else, you do so by looking up the MX records for their domain, which tells you which systems are responsible for receiving email for that domain. This gives us a possible answer to the spam problem.

    Suppose instead of blindly accepting email from everyone, you were to take the domain given to you by the MAIL FROM command, look up the MXes for that domain, and reject the email connection if the IP address of the sender doesn't match one of the domain's MXes?

    Now, suddenly, you would end up rejecting email sent from every unauthorized relay, because the owner of the domain can make any system that is allowed to send email on behalf of his domain into an MX (and, if he doesn't want that system to be used for delivering email, then he simply makes such systems the lowest priority MXes in the list and blocks outside port 25 connections to them ... something he's probably doing anyway).

    Suddenly, the only systems that spammers can send email from are systems that they legitimately control and that are defined as MXes for a domain they control. Suddenly, spammers have to set up and maintain their own domains and their own boxes. The costs have just become a lot higher, which will get rid of most of the spammers.

    And suddenly, blocking spam becomes orders of magnitude easier -- you only have to deal with spammers who have decided to pay the (now much higher) price for sending spam and who cannot use someone else's system to do their dirty work without permission.
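    The check described above could be sketched like this. The DNS lookup is passed in as a function (`lookup_mx_ips` is a hypothetical name) because Python's standard library has no MX resolver; a real version would need a DNS library and would have to resolve each MX host name to its addresses. The IP addresses in the usage note are documentation-range placeholders.

```python
def sender_domain(mail_from):
    """Extract the domain from a MAIL FROM argument such as '<user@example.com>'."""
    address = mail_from.strip().strip("<>")
    _local, at, domain = address.rpartition("@")
    return domain.lower() if at else ""


def accept_connection(peer_ip, mail_from, lookup_mx_ips):
    """Accept only if the connecting IP is one of the MXes of the sender's domain.

    lookup_mx_ips is assumed to map a domain name to the set of IP
    addresses of that domain's MX hosts.
    """
    domain = sender_domain(mail_from)
    if not domain:
        # No usable domain in MAIL FROM, so there is nothing to check against.
        return False
    return peer_ip in lookup_mx_ips(domain)
```

    For example, with a stub table `{"sysexperts.com": {"192.0.2.10"}}`, a connection from 192.0.2.10 claiming `<slashdot@sysexperts.com>` is accepted and one from 203.0.113.5 is rejected. Note that this sketch also rejects the empty sender `<>`, which a real server would need to handle separately.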

    --
    Use 'slashdot stuff' in the subject line in any email you send me if you want to get past the spam filter.
  12. Re:Maximum size of spam by waynemcdougall · · Score: 3, Interesting

    My point remains valid. Because there is a direct cost to the spammer to adapt.

    If they bulk up their spam, that's going to slow them down and increase their costs (even if bandwidth costs aren't going to be passed back to them now, the more they use, the more visible they become).

    Or they continue on their way. The reality is that they concentrate on the easy targets - you and I will never purchase their services so people taking this approach aren't really in their target audience anyway. I know this is (surprisingly) less true than one might think. Spammers do work to overcome basic obstacles, but that adds more costs and time - they don't work hard to avoid tar pits, because there are so few of them.

    So I still see it as a win...large emails are very unlikely to be spam. If that changes, well so be it, but that will hurt the spammers. In the meantime I reap the benefit of fewer false positives and faster spam filtering.

    Final comment - over the last six months I've seen spam get slightly larger (from about 32k peak size to about 45k peak size). But I haven't been analysing for any trends - just the outliers.

    --
    Recycle PCs and build a wireless community network www.hillsborough.org.nz
  13. So let's send spam as Bruce Schneier by marcink1234 · · Score: 3, Interesting

    As a lot of people will probably whitelist cryptogram, if one wishes to spam technical people, he just needs to set From to Bruce.

  14. A Simple question... by Pathwalker · · Score: 3, Insightful

    Am I the only one that has all of the mailing lists I subscribe to bypass SpamAssassin?

    For each mailing list I subscribe to, I use a special address suffix just for that list, which bypasses all of my spam checks (including SpamAssassin) and goes right into the mailbox that I use for that mailing list.

    No problems with false positives, and it saves me the overhead of running SpamAssassin on every incoming message from a busy list.

    It just seems like common sense: no one should have a problem with SpamAssassin misclassifying incoming newsletters if they just think about how they organize their email.
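    A rough sketch of that routing, assuming the suffix is attached with a "+" (as in `me+listname@example.com`) and that the suffix-to-mailbox table is maintained by hand. The separator, mailbox names, and function name are all assumptions; the actual setup could just as well be a few procmail rules.

```python
def route_by_suffix(recipient, list_mailboxes):
    """Return (mailbox, run_spam_check) for an incoming recipient address."""
    local_part = recipient.partition("@")[0]
    _user, plus, suffix = local_part.partition("+")
    key = suffix.lower()
    if plus and key in list_mailboxes:
        # Known list suffix: deliver straight to its mailbox, skip the filter.
        return list_mailboxes[key], False
    # Anything else goes through the normal spam checks.
    return "INBOX", True
```

    An address with an unknown suffix still gets filtered, so a spammer guessing suffixes gains nothing unless the guess matches a real list.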

  15. initial analysis for Bruce by Daniel+Quinlan · · Score: 5, Informative
    I'm one of the SpamAssassin (SA) developers and I asked Bruce to send me a copy of the newsletter after hearing about his note of warning a few days ago.

    Aside from the spot-on comments that people have made regarding adding a whitelist entry for Crypto-Gram (an obvious candidate for whitelisting if there ever was one, given that it frequently discusses spam, scams, and probably even includes text straight out of some spams), here is my initial analysis and response to him.

    Oh, first one other comment: SpamAssassin does not block content. SpamAssassin only flags probable spam. What the site or user does with that flag is their own business. Some mail administrators misuse SpamAssassin to block email, but we do not recommend blocking email. Really.

    ------

    [...] One false positive (or a related set of false positives) is not really a statistically useful sample size. To get to a high rate of filtering, most filters do have some false positives. You can get fewer false positives with customization of one form or another (personalized Bayes training, whitelists, rules, automatic learning algorithms). Our goal (everyone's goal, I think) is to get the best ratio of false positives to false negatives. It's a difficult balance sometimes and some legitimate content has a harder time.

    On to the data:

    I checked your newsletter with two versions of SpamAssassin: the current stable version (2.44) and the very-soon-to-be-released development version (2.50).

    A score of 5.0 is the default threshold to be flagged as spam.

    In SA 2.44, your mail receives a score of 3.20 (2.40 as I received it, but I believe the score would be about 3.20 for most people). That's on the high side, but has a bit to go before being flagged as spam. The score is the same with network tests (DNS blacklist tests and Razor).

    In SA 2.50, your message would probably receive a score of 1.90 without network tests and 1.00 with network tests. Note that the test scores may change a bit before the final release of 2.50, but those are better scores, more what we like to see for non-spam content. They would be even lower when using Bayes (part of SA 2.50). Those lower scores are not unexpected because... well, 2.50 is better. :-)

    Based on these results, it's not clear to me why yesterday's newsletter was flagged as spam. Some possibilities:

    • your newsletter is routed through blacklisted hosts for some people
    • some people are using an old or misconfigured version of SpamAssassin (extra rules, additional blacklists, many possibilities here)
    • the newsletter as received by some subscribers is substantially different than what you sent me
    • something else?

    Can you give me more information about the false positive that you experienced or was reported to you?

    Thanks.

    Dan

    ------

    If I find out more of interest before the thread is closed to comments, I'll try to post a follow-up to my post.

  16. Re:In principle, yes, in practice, no. by BlueUnderwear · · Score: 3, Insightful
    Then it would not be an encryption but a signature.

    You are right that it would not be encryption in the sense that it doesn't protect privacy of the message (indeed, in order to read the message, you only need Bruce's public key, which is indeed, uhmm, public...).

    However, it would still fulfill the goal of evading SpamAssassin, because, as far as I know, SpamAssassin is not yet smart enough to figure out that the message has been "encrypted" with Bruce's private key, and to fetch the public key from Bruce's webserver to decrypt it.

    But then again, rot13 would probably be enough to evade SpamAssassin too... as long as you don't misspell inventive as ivntenive that is...
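    For reference, rot13 is a one-liner with Python's standard `codecs` module. The small wrapper name is just for illustration; applying it twice returns the original text, and the misspelling above happens to encode to a string containing a word filters tend to look for.

```python
import codecs


def rot13(text):
    """Rotate each ASCII letter 13 places; non-letters pass through unchanged."""
    return codecs.encode(text, "rot_13")


encoded = rot13("inventive")     # "vairagvir"
misspelled = rot13("ivntenive")  # "viagravir"
round_trip = rot13(rot13("Crypto-Gram"))
```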

    --
    Say no to software patents.
  17. please no by upper · · Score: 5, Informative
    A "solution" like that would trash my outbound mail. I forge my From: addresses routinely.

    My primary mailbox is with a small, local ISP. I can't buy broadband from them, so I get my connectivity via cablemodem. I do have a mailbox in the cablemodem company domain -- that's the one I give out when I expect abuse. (I do it this way because I expect to be dealing with that ISP long after the cable vendor has either ceased to exist or has treated me badly enough that I left.)

    So I want my outbound mail to appear to have come from the ISP. Setting Reply-To is usually adequate, but not always -- when a human is looking for the address, they could easily grab the wrong one. And it creates potential confusion I don't want to create. So I set my from address to name@isp.com.

    I can't relay through the ISP's relays, because I'm outside of their IP range. (If they did some form of authenticated SMTP, such as SMTP-after-POP, they could let me.) And the cable vendor's mail relays won't send mail out with some other domain name on it. So I send everything out directly, no relays.

    If you look at many headers, I suspect you'll find that I'm not the only one forging my From: address for legit reasons. The presence of the X-Authentication-Warning header some MTAs add correlates fairly weakly with spam. (Some details of it -- e.g. no valid reverse DNS for the sending machine's IP -- could be useful indicators.)

  18. Bad news, it's in Razor by imroy · · Score: 3, Informative

    I just got the email today and it failed. I'm running 2.44 from Debian and haven't yet looked at tweaking any of the rules.

    Here's the verbose banner that SA put on my copy:

    SPAM: Content analysis details: (5.90 hits, 5 required)
    SPAM: SUBJECT_MONTH_2 (-0.5 points) Subject contains a month name - probable newsletter (2)
    SPAM: SUBJECT_MONTH (-0.5 points) Subject contains a month name - probable newsletter
    SPAM: OPT_IN (1.5 points) BODY: Talks about opting in
    SPAM: US_DOLLARS_4 (0.4 points) BODY: Nigerian scam key phrase ($NNN.N m/USDNNN.N m/US$NN.N m)
    SPAM: US_DOLLARS_2 (0.1 points) BODY: Nigerian scam key phrase ($NNN.N m/USDNNN.N m/US$NN.N m)
    SPAM: BALANCE_FOR_LONG_20K (-0.7 points) BODY: Message text is over 20K in size
    SPAM: BALANCE_FOR_LONG_40K (-0.1 points) BODY: Message text is over 40K in size
    SPAM: SPAM_PHRASE_01_02 (0.5 points) BODY: Spam phrases score is 01 to 02 (low)
    SPAM: [score: 1]
    SPAM: NORMAL_HTTP_TO_IP (1.3 points) URI: Uses a dotted-decimal IP address in URL
    SPAM: RAZOR2_CHECK (3.9 points) Listed in Razor2, see http://razor.sf.net/

    It looks like some dumbass has entered it into Razor. Unfortunately, some people (and yes, I did this originally) had their procmail set up to enter an email into Razor if it is deemed "spam" by SA or something else. Those 3.9 points are what put it over the threshold.
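    The arithmetic in the banner can be checked with a few lines. The rule names and points are copied from the report above; the helper names and the "score at or above the required value means flagged" comparison are assumptions for illustration.

```python
# Rule hits as listed in the banner above.
HITS = {
    "SUBJECT_MONTH_2": -0.5,
    "SUBJECT_MONTH": -0.5,
    "OPT_IN": 1.5,
    "US_DOLLARS_4": 0.4,
    "US_DOLLARS_2": 0.1,
    "BALANCE_FOR_LONG_20K": -0.7,
    "BALANCE_FOR_LONG_40K": -0.1,
    "SPAM_PHRASE_01_02": 0.5,
    "NORMAL_HTTP_TO_IP": 1.3,
    "RAZOR2_CHECK": 3.9,
}
REQUIRED = 5.0  # "5 required" in the banner


def total_score(hits):
    """Sum the rule points, rounded to avoid floating-point noise."""
    return round(sum(hits.values()), 2)


def is_flagged(hits, required=REQUIRED):
    """Assumed rule: flagged when the total reaches the required score."""
    return total_score(hits) >= required


without_razor = {name: pts for name, pts in HITS.items() if name != "RAZOR2_CHECK"}
```

    `total_score(HITS)` gives 5.9, matching the banner, while `total_score(without_razor)` gives 2.0, well under the threshold. So the content rules alone would not have flagged this copy.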