Slashdot Mirror


Spam Solutions from an Expert

Mod N writes "SecurityFocus has posted a nice survey of anti-spam technologies by spam expert Neal Krawetz, in which he delves deeply into the specifics and pitfalls of the numerous proposed solutions. Krawetz makes it obvious that securing the email infrastructure is a very complex problem that many of the current (simple) solutions can't solve alone."

33 of 420 comments

  1. Proof? by monstroyer · · Score: 5, Interesting
    The marketing myth emphasizes two misconceptions: (1) a human must perform the challenge, and (2) these problems are too complex for automated solutions. In truth, most spam senders ignore these CR systems because they do not account for a large recipient base, not because the challenge is difficult. Many spam senders use valid email addresses for their scams or for validating mailing lists. When CR systems begin to interfere with spam operations, spammers will automate the responses to these challenges.

    Excuse me, what? Where's the proof? That's quite a brave statement to be making considering I've never seen this cracked, ever.

    I challenge someone to find an automated response to C/R.

    I did hear of a theory where C/R was being cracked by taking the C/R image, posting it to a porn site, and letting a seeing person do the work. However, I've yet to witness this in practice. Show me the automated response to C/R that exists beyond a blog theory, and I'll believe. Until then, I hardly consider it "marketing hype".

    1. Re:Proof? by LostCluster · · Score: 4, Insightful

      That's like saying a theoretical attack is not worth securing against until somebody's fallen victim to it. Sure, there are some way-out ideas that can be dismissed that way, but this one seems so simple I'm pretty sure somebody who runs both spam and a porn site could pull it off...

    2. Re:Proof? by ookabooka · · Score: 5, Insightful

      I can't even get my scanner to correctly identify a regular text document; it gets most of it, but it still misses a lot of letters. A computer program could do this, but you would need either a very large database of the letter pictures (most places use all different kinds of text pictures, and add in a degree of randomness) or a very developed algorithm to detect the letters (in which case you would be making oodles of money from the scanner industry. . . spam would be the least of your worries).
      In the end I think it is inevitable that software will eventually break this system, but as soon as it does, there will be another system in place. . . .

      --
      If you are about to mod me down, keep in mind that this post was most likely sarcastic.
    3. Re:Proof? by LostCluster · · Score: 4, Interesting

      Yes, but such a human-check is unlikely to be beaten by a computer 100% of the time. If a log of the failed challenge attempts is kept, the source of repeated failed challenges can be ruled out from getting any more challenge attempts. Even if there is just one failed challenge among hundreds of successful ones coming from the same IP space, the hacker source can be flagged and ruled out.

      The best defenses involve several lines so that when the first gets beaten, another one tightens up against whatever the first line learned from its defeat...
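
      A rough sketch of the failure log described above, in Python. The class name, the cutoff of three failures, and the example addresses are all assumptions for illustration; nothing here comes from a real C/R product.

```python
from collections import defaultdict


class ChallengeLog:
    """Counts challenge outcomes per source and blocks repeat failures."""

    def __init__(self, max_failures=3):
        # max_failures is an assumed cutoff, not a value from the thread.
        self.max_failures = max_failures
        self.failures = defaultdict(int)
        self.successes = defaultdict(int)

    def record(self, source, passed):
        """Log one challenge attempt from a source (for example an IP)."""
        if passed:
            self.successes[source] += 1
        else:
            self.failures[source] += 1

    def is_blocked(self, source):
        """A source gets no more challenges once it has failed too often."""
        return self.failures[source] >= self.max_failures


log = ChallengeLog(max_failures=3)
for _ in range(3):
    log.record("192.0.2.10", passed=False)
log.record("198.51.100.7", passed=True)
print(log.is_blocked("192.0.2.10"), log.is_blocked("198.51.100.7"))
```

      Keeping the success counts alongside the failures is what would let an operator spot the "one failure among hundreds of successes" pattern, though the sketch itself only acts on failures.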

    4. Re:Proof? by jazman_777 · · Score: 5, Funny
      The point I was making is that, while no one has done it yet, there's no theoretical reason why it shouldn't be possible.

      I think you have a future in marketing.

      --
      Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
    5. Re:Proof? by Elwood+P+Dowd · · Score: 4, Insightful

      Challenge / response systems are broken anyway, even if spammers can't break them.

      Why? Because From: is forgeable, and viruses use other people's real addresses constantly. Every day, one of my 40 spam emails is a C/R email from someone that I've never heard of. Am I going to click the link and authorize my email address? Fuck no. But I'll never be able to send email to that person. I realize that's a *tiny* incidental, but it's still broken by design.

      If your C/R system includes a solicitation to purchase said C/R system, you're a fucking spammer. Fuck you.

      --

      There are no trails. There are no trees out here.
    6. Re:Proof? by chrisbtoo · · Score: 4, Interesting

      Well, this is by no means a proof, but maybe a method.

      1) Get image. I followed your link and got given this image.

      2) Pre-process. I loaded it into the GIMP and did Image->Mode->Greyscale, which yielded this image. Then I did Layer->Colours->Threshold, which yielded this image.

      3) Match characters. At this point, you have a monochrome image, in what appears to be a known font. The chars don't even appear to overlap, so a simple 1-for-1 match is achievable. Scan left-right, top-bottom until you see a 10x10 (or whatever) section with a black pixel. Scan down and right from that pixel until you see a character.

      I don't have the time to code it up right now, but if someone wanted to pay me to do it, I'm pretty sure it's achievable - not least because a whole bunch of the more difficult code is available for me to use under the GPL.
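
      Steps 2 and 3 could be sketched roughly as below, in plain Python with no image library. It assumes what the post assumes: a known font and characters that never overlap. The tiny image and the two glyph templates are made up for the example.

```python
def threshold(gray, cutoff=128):
    """Turn a grayscale image (rows of 0-255 ints) into rows of 0/1; 1 is ink."""
    return [[1 if px < cutoff else 0 for px in row] for row in gray]


def split_columns(mono):
    """Return (start, end) column ranges holding ink, split on blank columns."""
    width = len(mono[0])
    has_ink = [any(row[x] for row in mono) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


def match(mono, span, templates):
    """Look up one glyph by exact pixel equality against known templates."""
    glyph = tuple(tuple(row[span[0]:span[1]]) for row in mono)
    return templates.get(glyph, "?")


# Hypothetical 3-pixel-high font with two letters.
TEMPLATES = {
    ((1,), (1,), (1,)): "I",
    ((1, 0), (1, 0), (1, 1)): "L",
}

gray = [
    [0, 255, 0, 255],
    [0, 255, 0, 255],
    [0, 255, 0, 0],
]
mono = threshold(gray)
decoded = "".join(match(mono, s, TEMPLATES) for s in split_columns(mono))
print(decoded)
```

      Exact pixel matching only works while the font is fixed and undistorted; any added noise or warping would need a fuzzier comparison than this.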

      --
      Registering accounts later than some other chrisb since 1997
  2. Oh Well by dirkdidit · · Score: 4, Funny

    With the way the Chinese government keeps making their own versions of everything, maybe they'll have their own version of the Internet. That should alleviate a good deal of the spam right there, given that their Internet will probably be incompatible with ours.

    1. Re:Oh Well by _Sharp'r_ · · Score: 4, Insightful


      The Chinese government will probably solve any internal spam problem pretty quickly.

      I mean, if you start by shooting all convicted spammers, the profession tends to stop attracting replacement members.

      --
      The party of stupid and the party of evil get together and do something both stupid and evil, then call it bipartisan.
  3. Re:Darth Vader by Anonymous Coward · · Score: 4, Funny

    pishhhhh *breathe*
    I find your lack of junk mail disturbing.

  4. Don't forget SMTP+AUTH by RT+Alec · · Score: 4, Informative

    Good overview, all things considered. I would like to add to one of his conclusions (from part 1):

    IMAP can be used with SSL and supports secure authentication, but not all servers support this. SMTP also supports SSL or TLS but again, many organizations' servers do not support this or use only server-side certificates.
    This conclusion is correct, but why is this considered a stopping point? Mail admins: get off your collective butts and add encryption and authentication to your mail servers! The author also forgot to mention that server side certificates are not necessary for SMTP; SMTP+AUTH addresses this quite nicely.

    Note that such measures are not necessary for most users. Home users that use their ISP's mail server don't have to implement any of this, since the ISP can already account for the user. Let us not forget that "most users" do not have the e-mail needs that many Slashdot readers do. For those needing roaming access and multiple addresses, use IMAPS and SMTP+SSL+AUTH.
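
    From the client side, SMTP with TLS plus AUTH might look like the sketch below, using Python's standard smtplib. The host, port 587, and the credentials are placeholders and assumptions, not settings from the article or from any particular server.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text message with the standard headers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_authenticated(host, port, user, password, msg):
    """Upgrade the connection with STARTTLS, then authenticate and send."""
    context = ssl.create_default_context()  # verifies the server certificate
    with smtplib.SMTP(host, port, timeout=30) as server:
        server.starttls(context=context)
        server.login(user, password)  # SMTP AUTH, only after TLS is active
        server.send_message(msg)


msg = build_message("me@example.org", "you@example.net", "test", "hello")
# send_authenticated("mail.example.org", 587, "me", "secret", msg)  # placeholder values
```

    Calling login() only after starttls() keeps the password off the wire in clear text, which is the point of combining the two.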

  5. Cut Your Junk Mail By 50% !!! by Snagle · · Score: 5, Funny

    Just buy porn in magazine format instead of registering for it online :)

    1. Re:Cut Your Junk Mail By 50% !!! by redJag · · Score: 5, Funny

      What is this buy? *squints suspiciously*

  6. Solution: Stop Spam at the Source by ElliotLee · · Score: 5, Insightful
    According to the article, there is no good lasting solution to spam. Indeed, there isn't, but we need to consider more the reason behind the spamming.

    Why has spam grown to what it is today? It is an undeniably effective means of cheap marketing. What we need to do is come up with a way to stop this not on our end, but by looking at it as a social problem or making it non-worthwhile to the spammers. If nobody ever responded to spam, spammers wouldn't bother.

  7. Open Relays by QuePasaCalabaza · · Score: 4, Interesting

    The truth is 90% of spam comes from open relays, that is, SMTP servers that can be tricked (a bit like lying to a 5 year old) into accepting and sending out massive amounts of mail. Simply blocking open relays using The Open Relay Database at http://www.ordb.org/ or another open relay checking utility will save you lots of time if you run your own mailserver. When we can basically negate the usefulness of open relays to spammers, they will then have to rely on their own bandwidth for the most part, provided they cannot compromise other "closed" relays.

  8. Let's use the Patriot Act for the benefit of good by mao+che+minh · · Score: 5, Interesting

    I am in full support of using the broad-powered, freedom crushing Patriot Act in apprehending and imprisoning spammers. We might as well get some good out of it.

  9. There's one billion people in India... by 3770 · · Score: 4, Funny

    that the challenge/response could be outsourced to.

    Only kidding (I think).

    --
    The Internet is full. Go Away!!!
  10. Good old fashioned riddles by KalvinB · · Score: 4, Interesting

    My free anonymous (as in they can only be traced back to a common e-mail account on my server) e-mailer uses a simple quiz to keep spammers out.

    The form page records the IP address of the visitor along with the question number they were given in a file named with the IP address. That number is never sent to the client. When they hit submit the file of their IP is opened, the question number is read in and the answer given by the user is compared to the stored answer. The file is then deleted and if the answer was correct the e-mail is sent. Otherwise it's not.

    This forces my custom form to be used to be able to send the e-mails. And it's not possible to simply keep refreshing the submit page to keep sending the message.
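
    The flow described above could be sketched like this in Python. The riddles, function names, and the state directory are hypothetical stand-ins; the real form's code and questions are not shown in the post.

```python
import os
import random
import tempfile

# Hypothetical question bank; the poster's actual riddles are not given.
QUESTIONS = [
    ("What has hands but cannot clap?", "a clock"),
    ("What gets wetter the more it dries?", "a towel"),
]


def _state_file(state_dir, ip):
    # One file per visitor IP; ':' is replaced so IPv6 addresses are valid names.
    return os.path.join(state_dir, ip.replace(":", "_"))


def issue_challenge(state_dir, ip):
    """Pick a question, store its number server-side, return only the text."""
    number = random.randrange(len(QUESTIONS))
    with open(_state_file(state_dir, ip), "w") as f:
        f.write(str(number))
    return QUESTIONS[number][0]


def check_answer(state_dir, ip, answer):
    """Compare the answer to the stored question, then delete the file."""
    path = _state_file(state_dir, ip)
    try:
        with open(path) as f:
            number = int(f.read())
    except FileNotFoundError:
        return False  # no challenge was issued, or it was already used
    os.remove(path)
    return answer.strip().lower() == QUESTIONS[number][1]


# Demo with a throwaway directory standing in for the server's state folder.
with tempfile.TemporaryDirectory() as state_dir:
    question = issue_challenge(state_dir, "203.0.113.5")
    answer = dict(QUESTIONS)[question]
    print(check_answer(state_dir, "203.0.113.5", answer))  # first submit
    print(check_answer(state_dir, "203.0.113.5", answer))  # resubmit, file is gone
```

    Deleting the file on every attempt, right or wrong, is what stops a refresh of the submit page from sending the message again.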

    And the challenge is in the form of old riddles and a couple new ones like "what's your favorite color?"

    Things a bot would never get but that anyone who knows how to use Google can. Someone would have to program a custom bot with the answers in order to even attempt to spam. And even then since everything goes through my mail server nobody is going to sneak garbage past me for long and I know who your ISP is.

    I also include a disclaimer with every e-mail. It'd be quite silly for me not to.

    Ben

  11. Fix SMTP! by schnarff · · Score: 4, Interesting

    Well, at the risk of sounding like a broken record, SMTP itself is the problem -- it's badly broken, security-wise, and needs to be fixed. It's going to be painful to move to a new mail standard, or to change SMTP so that it's not broken, but that's what needs to happen to stop spam. Thankfully, our friends the Russian Mafia and the ever-growing number of Windows zombie machines are making spam levels so great that, sometime soon, spam will represent such a large percentage of e-mail traffic that fixing SMTP will be necessary, not just something mail admins like myself wish for.

    BTW, does anybody have a good figure on what percentage of all e-mail spam represents these days? I'm talking about *all* traffic, too, not just what ends up in peoples' Inboxes after all the filtering going on out there has done its job.

  12. More details in Part 1 by fembots · · Score: 5, Informative

    The linked article is part 2, Part 1 is here.

  13. Having experience, I can answer 1.2.1 by snakecoder · · Score: 5, Interesting

    I am not recommending mailblocks; I believe there is a SourceForge project called TMDA which does the same thing. Having said that, my experience comes from using mailblocks:

    -cr deadlock: This does not exist because when you e-mail someone in a challenge and response system, it automatically assumes they are friendly. So if they have a challenge and response system, it will make it into your inbox, because you e-mailed them first.

    -automated systems: He is correct here. Personally I hate when friends submit my e-mail to third parties without my consent so I do not mind missing these e-mails. I have caught a few while searching my pending folder, and informed my friends I'd rather have them e-mail me directly.

    -interpretation challenge: I believe he is wrong here because of a fundamental issue. When dealing with spam filters, the onus of working out refinements is left to the spamee, to make sure they filter out all spam. If a spammer adds a new technique, they get around the filter. With challenge systems, you have a few methods waiting as backup. When a spammer finally figures out how to read your words through AI, you simply change the challenge system and they are back to square 1 in trying to figure out how to defeat it. As long as you have a few methods waiting in the wings, the spammers can easily be defeated, and have huge amounts of work to do.
    If you doubt this, write an AI system to defeat Hotmail's GIFs. Now what if the next day instead of showing a word, they show you a picture of 3 fire trucks and 2 police cars and ask you how many police cars are in the picture, etc ...

    --
    -Nuke the moon
  14. I managed to appall a colleague today... by Ungrounded+Lightning · · Score: 4, Interesting

    Was out to lunch with three colleagues today and the subject of anti-spam measures came up.

    I managed to appall the one from Berkeley by suggesting that the most practical solution was probably a moderate-size bomb.

    B-)

    But seriously:

    In an arms race, weapons eventually defeat armor. Spam will continue until two real-world things are BOTH brought to bear on spammers:

    - Economics
    - Muscle

    If a governmental solution applying both is not forthcoming soon, I predict that there WILL be vigilantism.

    In fact we're already seeing it.

    For instance: Subscribing the Detroit area spammer and his lawyer to enough real-world junkmail lists to bury his bills and other US Mail correspondence in several daily truckloads of catalogues and other solicitations.

    Soon to come: Retaliatory information-war software directed at DDoSer / spammer zombie-net machines. (As discussed in a recent Slashdot article.)

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
  15. Maintenance the problem by powerpuffgirls · · Score: 4, Interesting

    As stated in the article's summary, the main problem with most spam filters is the need for constant maintenance. We need a solution that requires ZERO maintenance by the joe-users, and is still cost-effective enough to implement.

    My ISP seems to have a so-called "Watch Dog" spam filter, where they actually hire people to read spams and filter them manually, that's probably the most effective way to filter spam, but I wonder if it is cost-effective though.

  16. Do not call ... by Ephboy · · Score: 5, Interesting

    Prior to this October, telemarketing calls were a national scourge. Amazingly, since we signed up for the Do-Not-Call list, we've only received 2 illegal calls. I'm rather surprised, in fact, at the relatively uniform acquiescence to this law. While a law against spam, which comes from all corners of the earth and is more anonymous, will be harder to enforce, some law with real teeth may be a good start.

  17. Reputations by grotgrot · · Score: 4, Interesting

    The only thing that will work in the end is some sort of distributed reputation management system. To a certain extent that is what RBLs do, except they are on or off. SpamAssassin does offer shades of grey to the RBLs (differing weights to each one).

    To a certain extent this is what we already do in real life. We 'judge a book by its cover' as a first pass (for example people will often walk past a beggar in the street completely ignoring them) and then include other factors: how polite they appear, where they are from, recommendations from friends, etc.

    All other mechanisms suffer from a determined spammer being able to get around them as the article pointed out. Any mechanism that prevents some spammers makes things more lucrative for the rest.

    1. Re:Reputations by leviramsey · · Score: 4, Interesting

      I just devised a setup that might be interesting:

      • Users (sysadmins) of the blacklist submit two lists of IPs, good (non-spammers) or bad (spammers).
      • When a server receives a mail, it checks with the list to see on which lists the IP appears as good and on which it appears as bad.
      • The user marks the mail as ham or as spam. A Bayesian algorithm then determines which lists are trustworthy for marking spam hosts.
      • Filters could then /dev/null mail based on this Bayesian score.

      The idea is essentially to allow a collaboratively developed decentralized blacklist and whitelist to develop. Spammers will either submit the IPs they use to this list or not submit them; if they do submit them, then a "good" report from them will eventually be taken as a strong sign of spamminess. If they don't, then nothing happens, but presumably "trustworthy" blacklists would list them.
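
      One way the trust step might look, as a minimal Python sketch. It is a simplification under stated assumptions: it only handles "bad" listings (the "good" lists in the proposal are left out), it uses Laplace smoothing and a naive Bayes combination with equal priors, and the list names and training counts are invented.

```python
from collections import defaultdict


class ListTrust:
    """Learns, per blacklist, how often its 'bad' listings match user verdicts."""

    def __init__(self):
        # counts[name] = [flagged mail the user called spam, flagged mail called ham]
        self.counts = defaultdict(lambda: [0, 0])

    def train(self, flagged_by, is_spam):
        """Record the user's ham/spam verdict against every list that flagged the IP."""
        for name in flagged_by:
            self.counts[name][0 if is_spam else 1] += 1

    def trust(self, name):
        """Estimated P(spam | this list flagged the sender), Laplace-smoothed."""
        spam, ham = self.counts[name]
        return (spam + 1) / (spam + ham + 2)

    def spam_probability(self, flagged_by):
        """Naive Bayes combination over all lists that flagged the sender."""
        p_spam, p_ham = 1.0, 1.0
        for name in flagged_by:
            t = self.trust(name)
            p_spam *= t
            p_ham *= 1 - t
        return p_spam / (p_spam + p_ham)


model = ListTrust()
for _ in range(9):
    model.train(["all_of_lacnic"], is_spam=False)   # mostly legit mail for this user
model.train(["all_of_lacnic"], is_spam=True)
for _ in range(10):
    model.train(["careful_list"], is_spam=True)
print(model.trust("all_of_lacnic"), model.trust("careful_list"))
```

      For the hypothetical user here, a listing on the over-broad list ends up counting as weak evidence, while the careful list stays a strong signal.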

      Thus, a user in Brazil, where they would be receiving lots of legit mail from Brazilian IPs would not find a blacklist that listed all of LACNIC to be a strong indicator of spamminess. The effects of blacklisters who maliciously put enemies into their blacklist would also be reduced, if not eliminated.

      A suggested implementation detail on the blocking would be to make it random; that is to say that 100% of the mail with a 100% probability of being spam gets dropped, 99% of mail with a 99% probability gets dropped, 97% of mail with a 98% probability gets dropped, 94% of mail with a 97% probability gets dropped, 90% of mail with a 96% probability gets dropped, etc. according to this function:

      d(p) = d(p+1)*p/100, where d(100) = 100, and 73<=p<=100

      This would allow for a degree of "retraining" in the event of false positives (since a /dev/null'd mail cannot be retrained from!).
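
      The recurrence can be tabulated directly. In this Python sketch the function names are mine, and treating scores below 73 as "never drop" is an assumption, since the post only defines d(p) for 73<=p<=100.

```python
import random


def drop_rates(low=73, high=100):
    """Build d(p) from d(p) = d(p+1) * p / 100 with d(high) = 100."""
    d = {high: 100.0}
    for p in range(high - 1, low - 1, -1):
        d[p] = d[p + 1] * p / 100
    return d


def should_drop(p, rates, rng=random):
    """Randomly drop a mail whose spam probability is p percent."""
    # Assumption: scores outside the table are never dropped.
    return rng.random() * 100 < rates.get(p, 0.0)


rates = drop_rates()
for p in (100, 99, 98, 97, 96):
    print(p, round(rates[p], 2))
```

      The first few values reproduce the percentages quoted above once rounded to whole numbers.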

  18. Of course there is by Sycraft-fu · · Score: 4, Informative

    There are plenty of tasks that you can do that computers find nearly impossible. Facial recognition is a good one. Humans do it easily all the time. Computers are trying, but still screw it up badly. Musical recognition is another one. A human can easily pick out individual instruments in a piece, and can tell that the song is the same even if it is a completely different orchestration and mix (like a remix for example). Computers are confounded by this, even when they break something into component sine waves. Pragmatic language interpretation is my favourite. Even when people speak non literally and indirectly, you still have no trouble with their meaning. You can also tell which level of meaning they want, and successfully decode the other levels if asked. Computers are lucky if they can get the literal direct meaning out of a sentence, never mind anything else.

    So, just because a human can do it, doesn't mean a computer can. I don't know about any of these image schemes, I've never played with them. However if you make it sufficiently hard for it to recognise characters from background, and one character from another, it's screwed. Computers have trouble with the fuzzy and incomplete information that humans are so good with.

    Also remember it needs to be feasible to do in a reasonable time. Maybe you develop some whiz-bang image recog program that can take amazingly distorted text and figure it out. If it takes 5 minutes to process a box, it does you no good anyways, too much time to be worth it for this use.

  19. most effective by mabu · · Score: 5, Insightful

    Make no mistake...

    The most effective spam solution at this time is RBL blacklisting. Bottom line.

    When you take into account that the biggest problem of spamming is bandwidth consumption and network resources, there is NO better way than blacklisting spam sources and refusing to communicate with them.

    Services like Spamcop's RBL really piss off the spammers. All client-side filtering is counterproductive and ultimately useless as you constantly have to update the systems to catch new efforts on the part of spammers to thwart the filters. At least with RBLs, the spammers' connections are immediately refused as soon as they're ID'd.
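
    For reference, a DNS-based blacklist is conventionally queried by reversing the IPv4 octets and looking the result up under the list's zone. A small Python sketch, where the zone name is a placeholder rather than any real service's hostname:

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the lookup name: octets reversed, then the blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """True if the blacklist returns an address record for this IP."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record. Note that a DNS outage also lands here.
        return False


print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

    A mail server would run this check on the connecting IP and refuse the session before any message data is transferred, which is where the bandwidth saving comes from.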

    If you want to identify the most effective solution, it's simple. Look at what pisses off the sleazebag spam community the most. That's relay blacklisting. They don't DDOS the moronic client-side filtering companies because the spammers know they're useless, and even if they're not, the spammers can't tell. What hurts them is when systems say, 'screw you spammer, (click)' and that's done via relay blacklisting.

    Why are spammers increasingly changing mail relays and pursuing open proxies? Because of RBLs. Even AOL uses RBLs (including Spamcop). All the major ISPs look at the RBLs because they are THE most effective way of stopping spam. And they're the only way to actually shut down the spammers.

    Forget client or server-side content-based filtering. They will NEVER work. RBLs are responsible for forcing spammers into corners of IP space, forcing them to deploy worms and viruses to infiltrate new IP space (which exposes them to more prosecution). RBLs ** WORK ** !

    1. Re:most effective by Ragica · · Score: 4, Informative
      Some would say RBLs work "too well". They have a fairly consistent history of accidentally abusing innocent parties. Is it the price to be paid for the overall protection? Depends on your point of view.

      We don't have that many clients using our mail server, but one noticed one day that mail from him to friends was bouncing. He reported this and we discovered that we were on SpamCop's RBL list.

      I did a quick audit of the mail server, fearing we'd been hijacked, but found no evidence anywhere of spam going out.

      Being generally sympathetic to RBLs I was eager to get to the bottom of this, and cooperate with whatever needed to be done to prove our innocence.

      But I found the SpamCop web site extremely frustrating when trying to find any information. I found some references stating that to refute being listed you must reply to the email that SpamCop sent you: I searched and searched but we received no mail from SpamCop.

      As I spent a precious day trying to figure out what to do, as mysteriously as we'd been listed, our IP disappeared from spamcop's list.

      To this day I don't know what happened; but have a somewhat more bitter taste in my mouth regarding the arbitrary power of RBLs.

      (Though I still tend more to blame the system which blindly obeys a single RBL: I think SpamAssassin is more democratic in that it only assigns a probability, and an IP has to be on multiple block lists before it goes over a threshold. This gives spammers more lead time before they are blocked, but also prevents any single RBL from wielding absolute power... a sort of check-and-balance.)
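
      The "several lists before blocking" idea reduces to a weighted sum and a cutoff. The Python sketch below shows only that principle; the list names, weights, and threshold are invented, and it is not SpamAssassin's actual rule set or scoring code.

```python
# Hypothetical per-list weights and cutoff, chosen only for illustration.
WEIGHTS = {"list_a": 2.0, "list_b": 1.5, "list_c": 3.0}
THRESHOLD = 5.0


def rbl_score(listed_on, weights=WEIGHTS):
    """Sum the weights of every blacklist that lists the sender."""
    return sum(weights.get(name, 0.0) for name in listed_on)


def should_block(listed_on, threshold=THRESHOLD):
    """Block only when the combined evidence reaches the threshold."""
    return rbl_score(listed_on) >= threshold


print(should_block({"list_a"}), should_block({"list_a", "list_c"}))
```

      With these numbers no single list can block a sender on its own, which is the check-and-balance described above.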

  20. Re:Not for all, but a good start.. by mabu · · Score: 4, Insightful

    From spoofing verification won't make a difference... it'll slow down mail services and won't make a dent in spam.

    Spammers are now rotating IP space all over the place... they're also beginning to NOT forge header information, so what are you left with?

    Recognizing rogue relays and blacklisting them, even if they have valid header information. Any improvement to SMTP protocol won't make a bit of difference.

    Most mail servers and large ISPs are already employing additional methods of header-verification. It hasn't stopped spam.

    RBLs ARE working. They're making spammers scramble for un-blacklisted IP space. That's why they're running overseas; that's why they're sending out worms and viruses. Lord help us if IPv6 gets introduced... we'll never be able to stop spam then.

  21. Re:Dueling Challenges by RollingThunder · · Score: 4, Insightful

    Not so much that it would come from Charlie, but that the C/R would have an In-Reply-To that referenced the unique Message-ID of Bill's mail.

    When the mail goes out, Bill's system would record the Message-ID (and probably the recipient, but that could screw up on forwarders if you try for a hard match on the two) and then allow Charlie's C/R because it matches the whitelist.
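
    A minimal Python sketch of that bookkeeping. The class and method names are hypothetical, headers are passed as a plain dict, and In-Reply-To is assumed to carry a single Message-ID.

```python
class OutboundLog:
    """Remembers Message-IDs of sent mail so replies to them skip the challenge."""

    def __init__(self):
        self.sent_ids = set()

    def record_sent(self, message_id):
        """Call this for every outgoing message."""
        self.sent_ids.add(message_id.strip())

    def allows(self, headers):
        """Accept incoming mail whose In-Reply-To matches something we sent."""
        reference = headers.get("In-Reply-To", "").strip()
        return reference in self.sent_ids


bills_log = OutboundLog()
bills_log.record_sent("<1234@bill.example.org>")

challenge_from_charlie = {"In-Reply-To": "<1234@bill.example.org>"}
forged_challenge = {"In-Reply-To": "<9999@spammer.example.com>"}
print(bills_log.allows(challenge_from_charlie), bills_log.allows(forged_challenge))
```

    Matching on the Message-ID alone, without the recipient, sidesteps the forwarder problem mentioned above.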

  22. The spammer's weak spot is the money he makes. by sbaker · · Score: 4, Insightful

    I think we are attacking Spam from the wrong direction. Attempting to stem the flood of incoming spam is tough - everything about the identity of the incoming spam can be faked. However, we could alternatively attempt to prevent the replies going back the other way.

    There are two inevitable facts:

    1) In order for spamming to be worth someone's effort, they have to somehow get money from people. If NOBODY replied to them, then spamming would stop overnight.

    2) Something in the content of the Spam must be real - a reply address - a web site, a phone number or something. Block traffic to that location and the spammer gets no money and dies.
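
    Finding the "real" part of a message is mostly pattern matching. A rough Python sketch that pulls web hosts and reply-address domains out of a message body; the regular expressions are deliberately simple assumptions and would miss obfuscated links and phone numbers.

```python
import re
from urllib.parse import urlsplit

# Simplified patterns, assumed for illustration only.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def reply_channels(body):
    """Return the set of hosts a reader would have to contact to respond."""
    hosts = {urlsplit(url).hostname for url in URL_RE.findall(body)}
    hosts.discard(None)
    domains = {addr.rsplit("@", 1)[1].lower() for addr in EMAIL_RE.findall(body)}
    return hosts | domains


body = "Buy now at http://pills.example.com/order or mail sales@offers.example.net today"
print(sorted(reply_channels(body)))
```

    The resulting set is what a block list for the return route would be built from.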

    Hence, I think they may be vulnerable. Educating people not to reply to SPAM would help - it only takes a mere handful of people to respond to a SPAM to make it profitable - but if education could drop that handful to a mere one or two - then we could succeed in putting more spammers out of business simply by cutting their margins to the point where it wasn't worth the hassle.

    Where are the TV adverts: "Replying to Spam is Bad!"....we know that the morons who reply to spam are suckers for advertising - they are as likely to believe a well targeted TV advert as a crappy email shot. If Spam is costing the ISPs as much as they say it does - then funding some TV ads might not be impossible.

    What if we made it illegal to respond to an emailed advertisement that was not clearly labelled as such? That would help to deter people from responding. Such a law would be next to impossible to enforce - but we are trying to deter the gullible here - so it might not have to be enforceable - just very well advertised.

    Since every SPAM has to either advertise a product that you can buy from somewhere - or direct you to a postal address, a phone number or a web site - then that route for getting money back to the spammer could be blocked.

    The return route has to be genuine. There is no point in them sending you a fake phone number or faked web address. If the phone companies (who are often also ISPs - or have at least some cause to want to kill spam) were to block calls to and from phone numbers that were seen in Spam - then the reverse route for the money would be curtailed. Whilst you can afford to change the apparent source of your spam and fake those addresses for each new mail shot, you can't change your phone number for every couple of dozen orders you take. Similar considerations apply to web sites and postal addresses.

    If it was required for credit card companies not to transfer money to businesses that employed spammers to push their goods - then that would also help some.

    It wouldn't take many people to deliberately reply to spammers - to lead them on into thinking you want their product - to send them fake cheques or bogus credit card numbers. If they only get a handful of positive responses per million spams - then it wouldn't take more than a few determined people per million (eg ISP employees) to clutter up the spammer's cash collection mechanism to the point where it's too much hassle for him to sort out the real orders from the bogus ones.

    I don't pretend to have all of the answers - but there seems to be far too little creative thinking along these lines.

    --
    www.sjbaker.org
  23. Yes, of course... by michaeltoe · · Score: 4, Insightful
    This is similar to the argument that a computer cannot determine when it's in an infinite loop. Humans, however, can... because they are impatient, and given time, will reexamine the code that is executing.

    Naturally we may be inclined to believe that this grants us superiority to the computer. That, while stating some arbitrary facts taken from some textbook somewhere, a computer can never accomplish X objective.

    Therein lies the fallacy. The computer does not identify that it is in an infinite loop, nor can it, because it is not given the benefit of looking at the actual code. If a compiler were designed to read into code for things like while(true) loops, which naturally could result in infinite loops, then already you would be cutting back on the instances of these problems.

    Determining if there is an infinite loop requires a conscious understanding of the code itself, which is no trivial matter. It is not, however, something that could be deemed impossible.
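
    The "read the code for while(true)" idea can be tried with Python's ast module. This sketch is a heuristic for the obvious case only: it flags `while True` loops with no break, return, or raise anywhere inside them. It will miss loops that never terminate for subtler reasons, and a fully general detector is ruled out by the halting problem.

```python
import ast


def obvious_infinite_loops(source):
    """Return line numbers of `while True` loops with no visible way out."""
    exits = (ast.Break, ast.Return, ast.Raise)
    flagged = []
    for node in ast.walk(ast.parse(source)):
        is_while_true = (
            isinstance(node, ast.While)
            and isinstance(node.test, ast.Constant)
            and node.test.value is True
        )
        if not is_while_true:
            continue
        # Look through everything nested in the loop body for an exit statement.
        has_exit = any(
            isinstance(inner, exits)
            for stmt in node.body
            for inner in ast.walk(stmt)
        )
        if not has_exit:
            flagged.append(node.lineno)
    return flagged


SAMPLE = """
while True:
    pass

while True:
    if done():
        break
"""
print(obvious_infinite_loops(SAMPLE))
```

    The source is only parsed, never executed, so the undefined done() in the sample is harmless.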

    As with all fields of science, there will be those who say "Well, I haven't seen it yet, so it will never happen"... but skeptics are everywhere, and the presence of skepticism is hardly a measure of credibility... rather, a measure of how pious certain people's assumptions are.

    Solutions are always found in math, and never in magic. Don't underestimate the computer, and more importantly, don't underestimate your own brain. You don't perceive things the way you do 'just because'... and that's what's so exciting.