Revolutionary Spam Firewall Developed

Posted by CmdrTaco on Tuesday August 24, 2004 @03:44AM from the well-that-said-so-anyway dept.

psy writes "physorg has a story on a new spam firewall developed at The University of Queensland. The new technology is the only true spam firewall in existence, according to co-developer Matthew Sullivan. "Existing anti-spam software filters out spam whereas ours puts up a firewall, stopping all email traffic and only allowing real mail through," said Mr Sullivan. "In addition, our technology is accurate and fast. We recently completed a successful trial of a key layer of the spam firewall and it processed the emails at 90 messages per second, misclassifying only one out of 25,000 emails." "It turned out that the software was even better than us, picking up spam we'd incorrectly classified as legitimate emails."

Spelling by swordboy · 2004-08-24 03:45 · Score: 5, Funny

I have a simple algorithm to reject spam: spelling.

If you can't spell correctly, then I don't want your v1agr4.

--

Life is the leading cause of death in America.
1. Re:Spelling by random_culchie · 2004-08-24 03:51 · Score: 5, Informative
  
  Yes, and apparently there are 600,426,974,379,824,381,951 different ways to spell viagra!
  
  Will your algorithm do it with polynomial complexity ;)
2. Re:Spelling by gowen · 2004-08-24 03:53 · Score: 5, Funny
  
  We should apply the "good spelling" rule to /. posts.
  
  ( Read More... | 2 of 1274 comments | it.slashdot.org )
  
  --
  Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
3. Re:Spelling by swordboy · 2004-08-24 04:02 · Score: 5, Insightful
  
  I honestly think that we need an RFC for this so that idiots who can't spell can get a real error message back when their legitimate email gets rejected. At this point, all spammers would be forced to spell correctly and it would be difficult for them to get their point across without using obvious spam keywords like 'viagra'.
  
  --
  
  Life is the leading cause of death in America.
4. Re:Spelling by CommanderData · 2004-08-24 04:29 · Score: 4, Informative
  
  His algorithm doesn't need to. All it needs to do is check against an existing dictionary of words. If the word is not on the list, it is assumed to be misspelled. (If the good spelling of Viagra is in the dictionary, simply remove it so that any correctly spelled reference to Viagra counts as a misspelling too). If there are greater than X% misspellings in the e-mail it gets trashed. X can be a smaller percentage if the e-mail has any hyperlinks in it, because it is virtually guaranteed that someone is trying to sell you something...
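
  The dictionary check described above can be sketched in a few lines. This is only an illustration: the word list, the function names, and both threshold values are made up for the example, and the tiny dictionary deliberately omits "viagra" as suggested.

```python
import re

# Hypothetical tiny word list; a real filter would load a full dictionary
# (with "viagra" removed, as suggested above).
DICTIONARY = {
    "please", "find", "the", "report", "attached",
    "buy", "now", "cheap", "click", "here",
}

URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def misspelled_fraction(text, dictionary=DICTIONARY):
    """Return the fraction of word-like tokens that are not in the dictionary."""
    # Drop URLs first so their fragments are not counted as misspellings.
    words = re.findall(r"[a-z0-9]+", URL_RE.sub(" ", text).lower())
    if not words:
        return 0.0
    unknown = sum(1 for word in words if word not in dictionary)
    return unknown / len(words)


def looks_like_spam(text, threshold=0.3, link_threshold=0.15):
    """Flag the message if too many words are unknown.

    The stricter (smaller) limit applies when the message contains a link.
    Both limits are assumed values, not tuned ones.
    """
    limit = link_threshold if URL_RE.search(text) else threshold
    return misspelled_fraction(text) > limit
```

  With these assumed limits, a message with one unknown word in five passes on its own but is flagged once a hyperlink is added.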
  
  --
  Urge to post... fading... fading... RISING!... fading... fading... gone.
5. Re:Spelling by rossz · 2004-08-24 04:40 · Score: 4, Interesting
  
  Spelling doesn't work. The average computer user either can't spell or can't type and doesn't bother to use a spellchecker in email. I did a small study on spell checking as an anti-spam tool and was somewhat disappointed by the results.
  
  --
  -- Will program for bandwidth
6. Re:Spelling by wheany · 2004-08-24 05:05 · Score: 5, Insightful
  
  Only if the bayesian filter sucks. Or rather: Only if the tokenizer of the filter sucks. Bayesian filters don't have to treat the message as a raw string. They are free to parse it to, for example, remove comments, use image urls, or the difference between the foreground and background color in html mails as words.
  
  You can make a tokenizer that treats a word written like this: 't.r.i.c.k.y', not only as the word 'tricky', but also as a "pseudoword" like 'trick:dottedword'. So the "bayesian part" of the filter would see these two words: 'tricky' and 'trick:dottedword'.
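
  A rough sketch of such a tokenizer follows. The regular expression, the set of separator characters, and the function name are assumptions made for the example; only the 'trick:dottedword' pseudoword comes from the description above.

```python
import re

# Single letters joined by one separator each, e.g. "t.r.i.c.k.y" or "v-i-a-g-r-a".
# At least three letters are required so short forms like "a.m" are left alone.
OBFUSCATED = re.compile(r"^[A-Za-z](?:[.\-_*][A-Za-z]){2,}$")


def tokenize(text):
    """Split text into lowercase tokens for a Bayesian classifier.

    An obfuscated word yields two tokens: the collapsed word and a
    pseudoword recording that the trick was used.
    """
    tokens = []
    for raw in text.split():
        if OBFUSCATED.match(raw):
            tokens.append(re.sub(r"[^A-Za-z]", "", raw).lower())
            tokens.append("trick:dottedword")
        else:
            word = raw.strip(".,!?;:\"'()").lower()
            if word:
                tokens.append(word)
    return tokens
```

  The classifier then learns the spam probability of 'trick:dottedword' like any other word, so the obfuscation itself becomes evidence.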
  
  And there is of course loads of information that can be extracted from the headers of the mail.
Not the first; not revolutionary by Anonymous Coward · 2004-08-24 03:46 · Score: 5, Informative

I think Barracuda Networks would rather disagree with the idea that this is the "only true spam firewall in existence," considering that Barracuda's entire product line consists of spam firewalls.

Damn fine spam firewalls, too, I might add. They handle around 115 messages per second, and can run up to eight filtering steps (including Bayesian analysis, which is similarly efficient to SVM, which the one in the article uses). Plus Barracuda's can do virus scanning.

I'm not sure how this is revolutionary.
1. Re:Not the first; not revolutionary by micromoog · 2004-08-24 03:49 · Score: 4, Insightful
  
  Isn't "spam firewall" just a marketing term for "filter"?
2. Re:Not the first; not revolutionary by Rikus · 2004-08-24 03:53 · Score: 5, Insightful
  
  Isn't "spam firewall" just a marketing term for "filter"?
  
  Isn't "revolutionary" just a marketing term for any stupid new product?
3. Re:Not the first; not revolutionary by Greyfox · 2004-08-24 03:57 · Score: 5, Informative
  
  I believe the distinction is when the filtering takes place. If you wait for the spam to be placed on your hard drive and filter it out when you start your mail client, then it's filtering. If you reject the spam before the remote MTA drops the connection, then it's a firewall.
  I'm using Postfix at home and it's got some nifty features to allow you to do this sort of thing. You can write a simple SMTP server that listens on some port of 127.0.0.1 and configure Postfix to send the mail through that. Your server scans the e-mail and sends a reject or accept message back to Postfix, which sends it on to the remote MTA. Your SMTP server then feeds the mail into another Postfix server which listens on an odd port of 127.0.0.1 and doesn't have the restrictions that your publicly accessible Postfix server does. There are packages available for all sorts of scanning based on this ability. Since you reject the message at MTA time, you don't have to bother with sending a bounce message, either.
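
  The decision the scanning server makes can be reduced to something like the sketch below. It leaves out all of the SMTP plumbing; the function names, the reply texts, and the placeholder scanner are invented for illustration, and only the 250/550 reply classes are standard SMTP.

```python
def smtp_verdict(message, is_spam):
    """Return the reply a scanning proxy might give once it has the message.

    A 5xx reply is sent while the remote MTA is still connected, so the
    sending side learns about the failure and no bounce has to be generated.
    """
    if is_spam(message):
        return "550 5.7.1 Message rejected as spam"
    return "250 2.0.0 Ok: passed to internal server"


def naive_scanner(message):
    """Placeholder check for the example, not a real classifier."""
    return "v1agr4" in message.lower()
```

  A real setup would plug SpamAssassin or a similar scanner in place of `naive_scanner`.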
  
  --
  I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
4. Re:Not the first; not revolutionary by Rei · 2004-08-24 04:24 · Score: 5, Insightful
  
  Isn't slashdot supposed to be more than just a conduit for corporate press releases?
  
  --
  No matter how kind you are, German children are kinder.
5. Re:Not the first; not revolutionary by LaCosaNostradamus · 2004-08-24 04:38 · Score: 5, Insightful
  
  Isn't "marketing" just a term for people who don't know, selling to other people who don't know?
  
  --
  [You have a stable society when some nut guns down a schoolyard and the law doesn't change.]
6. Re:Not the first; not revolutionary by naelurec · 2004-08-24 05:29 · Score: 4, Informative
  
  I do multi-layered protection. At the MTA level, I utilize some DNSRBL lists to block from known spam servers. In addition, I require HELO and reject people who are claiming to be my server. In addition, I will reject invalid recipient domains, etc.
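
  For anyone wondering what the DNSBL and HELO checks boil down to, here is a minimal sketch. The zone `dnsbl.example.org` and the host names are placeholders, not real lists, and in practice the MTA does these lookups itself.

```python
import socket


def dnsbl_query_name(ipv4, zone):
    """Build the DNS name to look up: reversed octets followed by the list's zone."""
    octets = ipv4.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ipv4, zone):
    """True if the blocklist returns any address record for the client IP.

    Note: this sketch also treats DNS failures as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ipv4, zone))
        return True
    except OSError:
        return False


def helo_claims_to_be_us(helo_name, our_names):
    """True if the client introduces itself with one of our own host names."""
    return helo_name.rstrip(".").lower() in our_names
```

  A client for which either check is true would be rejected before any message data is accepted.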
  
  From here I run accepted emails through AMaViS / SpamAssassin / ClamAV / Sophos Sweep (I have yet to have Sophos catch a virus that ClamAV did not detect, though ClamAV caught two that Sophos did not). Spam scoring over a set value (i.e. 8) is not delivered (but the postmaster is notified), spam scoring between 5 and 8 is delivered tagged, and items under a certain value get passed without tagging. Viruses are always blocked and reported.
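
  The score handling amounts to a small decision table, sketched below. The thresholds 5 and 8 are the ones mentioned above; how the exact boundary values are treated, and the label names, are assumptions for the example.

```python
def disposition(score, has_virus=False, tag_at=5.0, quarantine_at=8.0):
    """Map a spam score (and virus flag) to what happens to the message."""
    if has_virus:
        return "block"        # viruses are always blocked and reported
    if score >= quarantine_at:
        return "quarantine"   # not delivered, postmaster notified
    if score >= tag_at:
        return "tag"          # delivered with a spam tag
    return "deliver"          # passed through untouched
```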
  
  Overall this has reduced unwanted email significantly. On networks of 40-60 users, between 35-50% of email is rejected at the SMTP level, about another 10% or so is quarantined (either viruses/spam), another 10% or so is tagged but delivered and the rest is legit.
  
  I have yet to have any complaints of false positives (granted there is a risk that they do not know) but have had a lot of praise for the reduction in spam levels. I am not aware of any viruses penetrating.
  
  Check out http://jimsun.linxnet.com/misc/postfix-anti-UCE.txt for more info (this is postfix centric, but the ideas could be applied to other setups)
Not a firewall by BarryNorton · 2004-08-24 03:47 · Score: 4, Informative

This isn't a firewall as it doesn't filter based on addressing. Furthermore, the use of SVMs (support vector machines) to classify text is not new...
1. Re:Not a firewall by Tony Hoyle · 2004-08-24 03:58 · Score: 4, Funny
  
  the definition of a firewall is a device on a network that allows or denies access
  
  Ahh, so *that's* what our system administrator is called..
  
  I'll stick to 'Mordac' though.
Uh yeah, OK... by Tony Hoyle · 2004-08-24 03:48 · Score: 4, Insightful

It's easy to produce these kinds of results in trials - you just tune the spam filter to handle a certain set of emails, then you feed it those emails again and you get a near 100% success rate.

Heck, why not do it with a million emails? Makes better headlines that way.

I don't see how this is any different to SpamAssassin (the term 'Mail Firewall' is pure marketing bullshit. It's a spam filter. Get over it.) except I bet it costs a hell of a lot more...
What happens to the 1 mis-classified email? by Thrymm · 2004-08-24 03:48 · Score: 5, Interesting

1 out of 25k is impressive, but what happens to these spam mails? Are they bounced back as an error "no user account found"? Or done like a blackhole where the spammer doesn't know if it reached its intended recipient? I like my SpamBayes :)
Ciphertrust, too... by TrebleJunkie · 2004-08-24 03:49 · Score: 4, Informative

I know! Ciphertrust's Ironmail works the same way... It stops ALL mail inbound, runs it through about a dozen different detection queues, only letting legitimate stuff through. I'd really like to see how this new one is otherwise unique.

--
Ed R.Zahurak

You know, oblivion keeps looking better every day.
Re:1/25000 by Shakrai · 2004-08-24 03:50 · Score: 4, Insightful

Although this is a great new technology, for a business setting, I don't know if even missing one e-mail is acceptable...

That's what everybody says but what's the other option? Letting all the SPAM come in? Do you really think that fed-up employee who gets hundreds of SPAMs a day is really going to do a better job of just mashing down the delete key than a SPAM filter with a 1/25000 error rate?

Of course I doubt this technology would perform that well but the point still stands -- if you don't have a computer flagging them then chances are you have a human flagging them. Who do you trust more?

--
I want peace on earth and goodwill toward man.
We are the United States Government! We don't do that sort of thing.
Re:1/25000 by stienman · 2004-08-24 03:50 · Score: 5, Interesting

Most users of email are now treating it as a lossy messaging system, and the users themselves accept that some messages simply don't make it. Critical business is always followed up with a call.

-Adam
Re:1/25000 by Quarters · 2004-08-24 03:50 · Score: 4, Interesting

If you are sending something so critical then you shouldn't be using email. FedEx with signature required delivery and certified/return-receipt USPS mail exist for a reason.
My favorite line: by calypso15 · 2004-08-24 03:50 · Score: 5, Funny

"...companies losing valuable employee time to deleting spam..."

Maybe they should be working on a Slashdot-Firewall. Damn, I really should get back to work.

Oh, and since the linked article got /.ed, here:
http://www.uq.edu.au/news/index.phtml?article=5833
As a self-appointed representative of ... by burgburgburg · 2004-08-24 03:54 · Score: 4, Funny

Unconsciously Desired Email Industry (Our slogan: You opted in in your heart!), I'd like to strongly protest the continuing escalation of technology against us. We provide the opportunity for hundreds of thousands of people to spend freely on products unburdened by simple heuristics of "they work" or "they won't make you ill" or "we'll actually send them". Why are you so intent on interfering with the consumer ethos?
Re:1/25000 by cyngus · 2004-08-24 03:56 · Score: 5, Insightful

One of two conditions exists in this case.
1) The e-mail is vitally important and your business will be seriously damaged by its failed delivery.

2) The e-mail was somewhat important, but not something large enough to materially change your revenue/profits.

If the first is the case, you probably shouldn't be using e-mail in the first place and/or whoever sent it is probably going to follow up with a FedEx or phone call.

In the case of number 2 (ha ha, number two), you've saved so much time not having to wade through spam that the losses are negated.
Here's how it probably works by lokedhs · 2004-08-24 03:59 · Score: 5, Interesting

I heard about this new technique before. Apparently it works tremendously well.
The idea is that the mail server keeps a whitelist of "allowed" addresses which are always accepted. If a mail comes from an address which is not known, the mail server will reply with a "server unavailable, try later" error message. All real mail servers will try to send the message a little later (I don't know the exact time, but it's probably less than an hour. Someone else might know better).
The second time the remote mail server tries to connect, the server accepts the mail and adds the address to the whitelist.
However, mass mailers for spam don't do this but simply go on to the next address in the list if this happens. This way the spam message is filtered out.
Note that this method doesn't require any analysis of the actual content of the message, nor does it involve any manual actions from either the sender or the receiver. Currently it's probably the best spam blocking method that exists.
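
The retry trick described above could look roughly like this. It is a sketch under assumptions: the class name, the 300-second minimum delay, and the reply texts are invented for the example, and the whitelist is keyed on the sender address as in the description.

```python
class RetryWhitelist:
    """Temporarily reject unknown senders and accept them when they retry."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay   # seconds a sender must wait before retrying
        self.first_seen = {}         # sender -> time of first attempt
        self.whitelist = set()       # senders that have retried successfully

    def check(self, sender, now):
        """Return the SMTP reply for a delivery attempt at time `now` (seconds)."""
        if sender in self.whitelist:
            return "250 2.0.0 Ok"
        first = self.first_seen.get(sender)
        if first is None:
            self.first_seen[sender] = now
            return "451 4.7.1 Server unavailable, try later"
        if now - first < self.min_delay:
            return "451 4.7.1 Server unavailable, try later"
        # A genuine mail server has queued the message and come back.
        self.whitelist.add(sender)
        del self.first_seen[sender]
        return "250 2.0.0 Ok"
```

A sender that never retries stays in `first_seen` and never gets a 250, which is exactly what happens to a mass mailer that just moves on to the next address.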
1. Re:Here's how it probably works by Santana · 2004-08-24 04:15 · Score: 4, Informative
  
  That's how spamd works, and yes, it works tremendously well. I used to get 300 spam messages daily. I receive now one or two every week.
  
  --
  The best way to predict the future is to invent it
2. Re:Here's how it probably works by hedronist · 2004-08-24 04:38 · Score: 4, Informative
  
  I think you're trying to describe greylisting. Although greylisting is amazingly effective, I don't believe that's what is being discussed here (the site is slashdotted).
  Our experience with greylisting has been (1) a 90%+ reduction in passed-through email (with no complaints from users about lost mail (yet)), (2) a dramatic decrease in server load because SpamAssassin doesn't see the message until after it gets past greylisting, and (3) people rediscover how useful email is once you get all of the crap out of their inbox.
  Marketing Guy: What's the worst that could happen?
  Dilbert: Our beta product could turn into an evil robot that annihilates the galaxy.
I hope they don't reject my e-mail by koinu · 2004-08-24 04:00 · Score: 5, Funny

I'm a.l-wa-ys wr|?|-ng l|-ke ðißs 2 m.y f-iends
amidoacetic platymyoid granomerite nonacceptant dorsoposteriad uninclined unshocked zibet intercity lornness
Re:1/25000 by Alioth · 2004-08-24 04:03 · Score: 4, Insightful

1/25000 is significantly better than a human being. If you use no automatic spam filtering at all, and you get a typical geek's email load (about 100 spam a day with 10 legitimate emails a day), you will still delete mail as spam when it wasn't spam.

That's why I use SpamAssassin - it does a good job, and is no worse at making false positives than I am. If I'm just as liable to make a false positive as an automatic filter, I'm better off saving my time.

--
Oolite: Elite-like game. For Mac, Linux and Windows
Why filter at firewall layer? by sdxxx · 2004-08-24 04:05 · Score: 4, Insightful

Well, the site is slashdotted, so I can't read their claims. However, it doesn't seem like there is any benefit to doing spam filtering at the firewall layer.
For example, Mail Avenger allows you to filter spam based on network characteristics like SYN fingerprints and routes. It even integrates with the kernel firewall to filter out aggressive spammers and mail bombers. However, because it runs as an ordinary user-level process, it also has much more flexibility, for example allowing individual users to set different policies on different email addresses. What can a spam "firewall" do that you can't do with a system like Mail Avenger?
Re:One solution to spam by MurkyGoth · 2004-08-24 04:06 · Score: 4, Interesting

(Presuming that wasn't a troll) That's a horrible, horrible solution. Viruses fake sender addresses, which means the faked address gets *loads* of these 'Please confirm' emails, clogging up another innocent mail server. Get it wrong, and you'll have two servers sending 'Please confirm' messages to each other until one screws up into a little ball and dies. I'm all for the War Against Spam, but this isn't the way - it just doubles the amount of emails.
The what where now? by broothal · 2004-08-24 04:07 · Score: 4, Funny

This didn't make it through my bullshit filter. Oh - sorry, I mean bullshit firewall. It's like this new technology that rejects bullshit from the evil internet, so I never have to read it. Thank god, because if I'd read about this "revolutionary spam firewall" I would be forced to make a childish comment on slashdot and burn some karma.
Re:1/25000 by biglig2 · 2004-08-24 04:14 · Score: 4, Interesting

Then you're stuffed anyway, because internet e-mail is not guaranteed.

It is difficult. We're swatting away a million of the damn things a week and still our users complain. They also complain when we get false positives. And when, next week, we turn on the system that lets them see what we have blocked that was addressed to them, they'll complain too.

I think the one solution they would find acceptable is for me to personally read every one of those million messages and mark it as good or bad. I hope our VP doesn't read slashdot....

--
~~~~~ BigLig2? You mean there's another one of me?