Revolutionary Spam Firewall Developed

Posted by CmdrTaco on Tuesday August 24, 2004 @03:44AM from the well-that-said-so-anyway dept.

psy writes "physorg has a story on a new spam firewall developed at The University of Queensland. The new technology is the only true spam firewall in existence, according to co-developer Matthew Sullivan. "Existing anti-spam software filters out spam whereas ours puts up a firewall, stopping all email traffic and only allowing real mail through," said Mr Sullivan. "In addition, our technology is accurate and fast. We recently completed a successful trial of a key layer of the spam firewall and it processed the emails at 90 messages per second, misclassifying only one out of 25,000 emails." "It turned out that the software was even better than us, picking up spam we'd incorrectly classified as legitimate emails."

42 of 507 comments (clear)

Sourcecode? by peterprior · 2004-08-24 03:46 · Score: 2, Insightful

Source code would be nice....
Support Vector Machine (SVM) by doofusclam · 2004-08-24 03:46 · Score: 2, Insightful

What the hell is one of these? There seems no substance to this report, bar some TLAs as above and a load of hype. Where is the proof? How was it tested? Etc.
1/25000 by Laivincolmo · 2004-08-24 03:46 · Score: 2, Insightful

Although this is a great new technology, for a business setting, I don't know if even missing one e-mail is acceptable...
1. Re:1/25000 by Shakrai · 2004-08-24 03:50 · Score: 4, Insightful
  
  Although this is a great new technology, for a business setting, I don't know if even missing one e-mail is acceptable...
  
  That's what everybody says but what's the other option? Letting all the SPAM come in? Do you really think that fed-up employee who gets hundreds of SPAMs a day is really going to do a better job of just mashing down the delete key than a SPAM filter with a 1/25000 error rate?
  
  Of course I doubt this technology would perform that well but the point still stands -- if you don't have a computer flagging them then chances are you have a human flagging them. Who do you trust more?
  
  --
  I want peace on earth and goodwill toward man.
  We are the United States Government! We don't do that sort of thing.
2. Re:1/25000 by Mononoke · 2004-08-24 03:52 · Score: 2, Insightful
  
  Although this is a great new technology, for a business setting, I don't know if even missing one e-mail is acceptable...
  I would guess that's right in line with USPS, UPS, FedEx, or even faxing directly.
  
  --
  NetInfo connection failed for server 127.0.0.1/local
3. Re:1/25000 by rjstanford · 2004-08-24 03:53 · Score: 3, Insightful
  
  and if we missed 4 legit client emails a day... that would be lost business, and that's just unacceptable no matter how you look at it.
  
  Well... how much money would it take to have the staff necessary to do the filtering manually (at a better rate - even humans are fallible), and how much would the potential business loss cost you? Assuming that the business was very profitable, and that the senders wouldn't call or send a follow-up email of course.
  
  --
  You're special forces then? That's great! I just love your olympics!
4. Re:1/25000 by cyngus · 2004-08-24 03:56 · Score: 5, Insightful
  
  One of two conditions exists in this case.
  1) The e-mail is vitally important and your business will be seriously damaged by its failed delivery.
  
  2) The e-mail was somewhat important, but not something large enough to materially change your revenue/profits.
  
  If the first is the case, you probably shouldn't be using e-mail in the first place and/or whoever sent it is probably going to follow up with a FedEx or phone call.
  
  In the case of number 2 (ha ha, number two), you've saved so much time not having to wade through spam that the losses are negated.
5. Re:1/25000 by Alioth · 2004-08-24 04:03 · Score: 4, Insightful
  
  1/25000 is significantly better than a human being. If you use no automatic spam filtering at all, and you get a typical geek's email load (about 100 spam a day with 10 legitimate emails a day), you will still delete mail as spam when it wasn't spam.
  
  That's why I use SpamAssassin - it does a good job, and is no worse at making false positives than I am. If I'm just as liable to make a false positive as an automatic filter, I'm better off saving my time.
  
  --
  Oolite: Elite-like game. For Mac, Linux and Windows
6. Re:1/25000 by nkntr · 2004-08-24 04:26 · Score: 3, Insightful
  
  I support, among other people, a marketing staff. When people are interested in buying things, they may only send one email. That one email is all you are going to get, and not getting it is the same as not getting the sale. I know the marketing staff is extremely skeptical about any sort of spam filtering, as they are always concerned about missing important emails that may lead to sales, and ultimately, revenue. I don't know how this fits in with spam filtering, but suggesting that all important email is followed up with a call is not true. And ask any CEO--sales are the most important thing to a company. It doesn't matter if you have the best thing in the world, if you can't sell it, it isn't worth anything.
7. Re:1/25000 by Politburo · 2004-08-24 04:37 · Score: 2, Insightful
  
  When people are interested in buying things, they may only send one email.
  
  Assuming you give them multiple avenues to contact you, then they simply aren't that interested if they only send one email and drop it after that. Now, I can certainly see trying to make the email system as hardened as possible to prevent any missed email, but the idea that you're going to lose out on some huge sale because of one email being dropped is silly. The grandparent is correct. If you're at all serious in your business, important email is always followed up with a call or some other means.
  
  And ask any CEO--sales are the most important thing to a company.
  
  Close, but profit is the most important thing. You can sell a billion units, but if you're selling them at a loss, I don't think the CEO will be too pleased.
8. Re:1/25000 by That's+Unpossible! · 2004-08-24 04:42 · Score: 2, Insightful
  
  I don't mean to be a prick, but maybe those are all different users complaining? Maybe give them some options. It sounds like you have:
  
  - Some people that want no spam and can accept losing real email.
  
  - Some people that want as little spam as possible without losing any real email.
  
  This is what I like to call "normal."
  
  --
  Ironically, the word ironically is often used incorrectly.
Fetchmail? by TheLoneCabbage · 2004-08-24 03:48 · Score: 3, Insightful

Fetchmail + SpamAssassin?

What am I missing here?

Doesn't save B/W: you need to run it INSIDE your network.

Don't care how fast it is: It's a dedicated server.

1/25,000 failure rate with no false positives: OK, that's good. But still not amazing.

How are their servers? /.?

--
I would rather be ashes than dust!
Uh yeah, OK... by Tony+Hoyle · 2004-08-24 03:48 · Score: 4, Insightful

It's easy to produce these kind of results in trials - you just tune the spam filter to handle a certain set of emails, then you feed it those emails again and you get a near 100% success rate.

Heck, why not do it with a million emails? Makes better headlines that way.

I don't see how this is any different to SpamAssassin (the term 'Mail Firewall' is pure marketing bullshit. It's a spam filter. Get over it.) except I bet it costs a hell of a lot more...
1. Re:Uh yeah, OK... by Tony+Hoyle · 2004-08-24 04:17 · Score: 2, Insightful
  
  No real researcher would ever perform a test in such a way.
  
  Take off the rose-tinted spectacles.
  
  Have a look at some of the recent MS or SCO research. *real* researchers give the results they're paid to give, and don't give a damn about methods.
  
  This is a press release (presumably... it definitely reads like one). Most of the 'facts' in it were probably dreamed up on the spur of the moment because they sounded good. Assuming they really ran the 25,000 email test then it's almost certain they reached the conclusion they were told to reach. If they can repeat those results after a server has been up for 6 months filtering *real* email then I'll be interested.
  
  Not necessarily. I don't know how much configuration this system requires, but if it requires nothing more than simply plugging two network cables into a box and away you go, then I think it is very appropriate to call it a "firewall."
  
  No, it's still a spam filter.
  
  If you put it into a sealed self-powered black box with the words 'Firewall' emblazoned in large letters on the side it would *still* be a spam filter.
  
  The word 'Firewall' has a specific use in the IT world, and this ain't it.
2. Re:Uh yeah, OK... by Tony+Hoyle · 2004-08-24 04:40 · Score: 3, Insightful
  
  They're not trying to get published. They're trying to get paid.
  
  Someone posted a non-slashdotted link. They've formed a company and are after funding - hence this press release. TBH Slashdot should stop giving these people airspace.
  
  This is *not* science; it's a corporate press release. If they had the integrity you ascribe to them (which really doesn't exist - everyone has an agenda, whether it's to get published or, in this case, to get money) then they'd never have allowed it to go out with claims like this is 'new' and 'revolutionary' which are quite obviously total bullshit.
  
  And no, it's still not a firewall. I do exactly the same with postfix and spamassassin and that's not a firewall either. It's a mail filter.
Re:Not the first; not revolutionary by micromoog · 2004-08-24 03:49 · Score: 4, Insightful

Isn't "spam firewall" just a marketing term for "filter"?
Useless by trans_err · 2004-08-24 03:50 · Score: 2, Insightful

Until there is a 0% misclassification rate, such a method is useless. Filtering was one thing: if you misfiltered a message, you always had the opportunity to occasionally scan your SPAM box and make sure everything was about penis enlargement and not about the meeting you have next week. However, with this method email is stopped and never delivered, so your misclassified email is now gone, forever.

I'd rather get 5 extra spam if it meant I also received every real email.

--
transmission_err
1. Re:Useless by leperkuhn · 2004-08-24 03:54 · Score: 2, Insightful
  
  if it's just bounced back then how is that bad? there will never be a perfect system - even whitelisting involves a bounceback. I'd be more than happy with 1 out of 25,000 e-mails being incorrect. I bet more mail gets lost by the post office.
  
  --
  http://www.rustyrazorblade.com
Spin doctors by sean23007 · 2004-08-24 03:51 · Score: 3, Insightful

"It turned out that the software was even better than us, picking up spam we'd incorrectly classified as legitimate emails."

Heh. Does anyone else see that as a good way to downplay false positives?

"Oh, good point, Computer. That email from my boss actually was spam. I didn't realize that until you mentioned it."

--

Lack of eloquence does not denote lack of intelligence, though they often coincide.
1. Re:Spin doctors by JimDabell · 2004-08-24 04:37 · Score: 3, Insightful
  
  No, it's well-known that humans make mistakes. Human decisions, when faced with hundreds of spam emails, result in false positives and false negatives as well. The comment you mention merely points out that they consider it to make fewer false negatives than the average human.
Re:Not the first; not revolutionary by Rikus · 2004-08-24 03:53 · Score: 5, Insightful

Isn't "spam firewall" just a marketing term for "filter"?

Isn't "revolutionary" just a marketing term for any stupid new product?
Re:Spelling by Jeff+DeMaagd · 2004-08-24 03:59 · Score: 2, Insightful

Well, shoot, despite using the pre tag, it got hidden. Anyway, an invalid tag might be randomly inserted into parts of words to make scans fail. It throws off scanners and doesn't show up when rendered for the user.
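
For anyone wondering how a scanner could sidestep the inserted-tag trick described above, here is a rough sketch in Python. It is only an illustration: the class and function names (`TextOnly`, `visible_text`) are made up for this example, and it assumes the filter is willing to parse the HTML rather than scan the raw source. It keeps only the text a reader would see, so a word split by a bogus tag comes back together.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects character data and ignores every tag and comment.

    Hypothetical helper for illustration; not taken from any real spam filter.
    """

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for the text between tags; unknown tags are simply skipped.
        self.parts.append(data)


def visible_text(html_source: str) -> str:
    """Return the message text with all markup removed."""
    parser = TextOnly()
    parser.feed(html_source)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


if __name__ == "__main__":
    print(visible_text("buy vi<xyz>ag</xyz>ra now"))
```

With this in front of the word scanner, `vi<xyz>ag</xyz>ra` and `vi<!-- x -->agra` both come out as `viagra`. A real message would also need MIME decoding and entity handling, which this sketch leaves out.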
And human error is better? by metallicagoaltender · 2004-08-24 03:59 · Score: 2, Insightful

I'd guess that if you put the firewall up against your average email user, the average user would shitcan legitimate messages at a much higher rate than the firewall thanks to the fact that the user can get frustrated while the firewall can't. I know my boss accidentally deletes mail from me at least 3 times per week because he's careless while mass-deleting spam in the morning.

Since the firewall functions based upon code rather than emotion and intuition, the firewall's error rate is going to look better and better against human error as it handles more and more mail.
What is this selfimportance trip by Anonymous Coward · 2004-08-24 04:02 · Score: 1, Insightful

Why is it anytime a filter is discussed, everyone starts yammering about "1 is too many" when in reality 1,000 would still be fine?

email is an unreliable system, so don't expect it to deliver every message flawlessly to begin with.

i think people get all antsy about it, because they like to think their email is just soo damned important, arctic winds will freeze the entire planet if they don't get whatever lame useless email from their spouse/manager/cousin.

if it were that critical that the person absolutely must know that information, it's called a fucking telephone.

over inflated self importance.
Re:Spelling by swordboy · 2004-08-24 04:02 · Score: 5, Insightful

I honestly think that we need an RFC for this so that idiots who can't spell can get a real error message back when their legitimate email gets rejected. At this point, all spammers would be forced to spell correctly and it would be difficult for them to get their point across without using obvious spam keywords like 'viagra'.

--

Life is the leading cause of death in America.
Why filter at firewall layer? by sdxxx · 2004-08-24 04:05 · Score: 4, Insightful

Well, the site is slashdotted, so I can't read their claims. However, it doesn't seem like there is any benefit to doing spam filtering at the firewall layer.
For example, Mail Avenger allows you to filter spam based on network characteristics like SYN fingerprints and routes. It even integrates with the kernel firewall to filter out aggressive spammers and mail bombers. However, because it runs as an ordinary user-level process, it also has much more flexibility, for example allowing individual users to set different policies on different email addresses. What can a spam "firewall" do that you can't do with a system like Mail Avenger?
Re:Not the first; not revolutionary by isorox · 2004-08-24 04:09 · Score: 3, Insightful

I understand a "spam firewall" to close the connection as soon as it recognises spam, rather than let the whole email download. In the case of those "Windows service pack" emails, you can save a lot of bandwidth.
Re:Here's how it probably works by Frostalicious · 2004-08-24 04:15 · Score: 2, Insightful

The second time the remote mail server tries to connect, the server accepts the mail and adds the address to the whitelist. Currently it's probably the best spam blocking method that exists.

Until the spammers catch on and start to resend their requests. This seems like a stop-gap solution.
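
To make the scheme being debated here concrete, this is a minimal sketch of the retry-then-whitelist idea (usually called greylisting) in Python. It is a guess at the general technique, not the Queensland product. The `Greylist` class, the `min_delay` value, and the choice to key on the (client IP, sender, recipient) triplet are all assumptions for illustration; only the 451 and 250 reply codes are standard SMTP.

```python
from dataclasses import dataclass, field


@dataclass
class Greylist:
    """Toy greylist: defer the first attempt, accept a sufficiently late retry."""

    min_delay: float = 300.0                       # assumed minimum wait, in seconds
    first_seen: dict = field(default_factory=dict)  # triplet -> time of first attempt
    whitelist: set = field(default_factory=set)     # triplets that retried correctly

    def check(self, client_ip: str, sender: str, recipient: str, now: float) -> str:
        key = (client_ip, sender, recipient)
        if key in self.whitelist:
            return "250 OK"
        seen = self.first_seen.get(key)
        if seen is None:
            # First contact: remember it and ask the sender to come back later.
            self.first_seen[key] = now
            return "451 Temporary failure, try again later"
        if now - seen < self.min_delay:
            # Retried too quickly; keep deferring.
            return "451 Temporary failure, try again later"
        # A patient retry: accept and remember the triplet.
        self.whitelist.add(key)
        del self.first_seen[key]
        return "250 OK"


if __name__ == "__main__":
    g = Greylist()
    args = ("192.0.2.1", "alice@example.com", "bob@example.org")
    print(g.check(*args, now=0.0))
    print(g.check(*args, now=600.0))
```

The time is passed in explicitly so the behaviour is easy to check by hand. The objection above still applies: a spammer who queues and retries gets whitelisted just like a real mail server.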
False Positives by ewn · 2004-08-24 04:15 · Score: 2, Insightful

"It turned out that the software was even better than us, picking up spam we'd incorrectly classified as legitimate emails."

They are celebrating false positives?
Re:Spam firewall? I want a hard drive firewall by Kaa · 2004-08-24 04:16 · Score: 2, Insightful

That's not a firewall either - it's a sandbox (and not new, either)...

The guy is not asking for a sandbox. He is asking for the ability to give or deny individual processes write-access to the hard drive. That's something quite different from a sandbox.

I would also be interested is software that does this.

--

Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.
Re:Spelling by ncc74656 · 2004-08-24 04:21 · Score: 2, Insightful

Then there is the old trick of putting HTML in the middle of dodgy words. Like: viagra

Your typical Bayesian filter works on the message source, not the output of an HTML renderer. "viagra" gets dumped into the spammy-word list along with "v1agr4" and other annoyances, so after the first one sneaks through and is manually classified, the rest are blocked.

--
20 January 2017: the End of an Error.
Re:Not the first; not revolutionary by Rei · 2004-08-24 04:24 · Score: 5, Insightful

Isn't slashdot supposed to be more than just a conduit for corporate press releases?

--
No matter how kind you are, German children are kinder.
Article slashdotted, but skeptical of the blurb by gvc · 2004-08-24 04:28 · Score: 2, Insightful

The only true ... followed by some words with nebulous semantics. Successful trial of a key layer ... [as opposed to an actual demonstration]. 1 misclassification in 25,000 [a.k.a 99.996% accuracy].
All these phrasings automatically trigger my B.S. filter. Or should I say firewall.
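
For what it's worth, the arithmetic behind the blurb's numbers is easy to check. A small Python sketch follows; the three constants come from the story summary, the function names are made up, and the duration figure assumes (the summary does not say this) that all 25,000 messages were pushed through at the quoted 90 messages per second.

```python
# Figures quoted in the story summary; everything else is derived from them.
MISCLASSIFIED = 1
TOTAL_MESSAGES = 25_000
MESSAGES_PER_SECOND = 90


def error_rate(misclassified: int, total: int) -> float:
    """Fraction of messages that were misclassified."""
    return misclassified / total


def accuracy_percent(misclassified: int, total: int) -> float:
    """Percentage of messages classified correctly."""
    return 100.0 * (1.0 - error_rate(misclassified, total))


def trial_duration_seconds(total: int, rate: float) -> float:
    """Time to process `total` messages at `rate` messages per second."""
    return total / rate


if __name__ == "__main__":
    acc = accuracy_percent(MISCLASSIFIED, TOTAL_MESSAGES)
    secs = trial_duration_seconds(TOTAL_MESSAGES, MESSAGES_PER_SECOND)
    print(f"accuracy: {acc:.3f}%")
    print(f"trial duration: {secs / 60:.1f} minutes")
```

That works out to 99.996% and, under the stated assumption, under five minutes of mail. One error in a sample that size says very little about the true error rate.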
Re:Spelling by Anonymous Coward · 2004-08-24 04:29 · Score: 3, Insightful

One of the biggest problems with this proposal is that messages talking/warning about spam, such as this one, would get marked as spam.

It's already happened when I sent an email to a client warning about a porn dialer. The repeated mention of porn got my message spam-trapped.

What's needed is a filter that checks these words & spellings in context, but that's far more difficult than the simplistic spell checker that's proposed.
Re:Not the first; not revolutionary by LaCosaNostradamus · 2004-08-24 04:38 · Score: 5, Insightful

Isn't "marketing" just a term for people who don't know, selling to other people who don't know?

--
[You have a stable society when some nut guns down a schoolyard and the law doesn't change.]
Re:Spelling by Anonymous Coward · 2004-08-24 04:42 · Score: 1, Insightful

This is a very dangerous thing to do.
First, there are many languages to consider - and even if you've covered that, some people are writing using their dialect in emails (I've done this several times when writing in Swiss-German).
I think this only works for emails that are considered English and badly misspelled.
But what if... by Clown+Jizz · 2004-08-24 04:58 · Score: 2, Insightful

your name is Dick? My father, whose name is Dick, has had endless trouble with spam filters blocking all of the messages he sends where he uses his own name, or when clients send him email using his name. It seems most filters and firewalls don't distinguish between "Dick" and "dicks," and this is a problem for businesses, where context is so important.
Re:Spelling by wheany · 2004-08-24 05:05 · Score: 5, Insightful

Only if the Bayesian filter sucks. Or rather: Only if the tokenizer of the filter sucks. Bayesian filters don't have to treat the message as a raw string. They are free to parse it to, for example, remove comments, use image URLs, or the difference between the foreground and background color in HTML mails as words.

You can make a tokenizer that not only treats a word written like this: 't.r.i.c.k.y', as the word 'tricky', but also as a "pseudoword" like 'trick:dottedword.' So the "Bayesian part" of the filter would see these two words: 'tricky' and 'trick:dottedword.'

And there is of course loads of information that can be extracted from the headers of the mail.
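
A rough sketch of the tokenizer idea above, in Python. The regular expressions and the function name `tokenize` are assumptions made for illustration. Only the pseudoword `trick:dottedword` comes from the comment itself, and here it is treated as a fixed marker token meaning "a dotted-word trick was used".

```python
import re

# Single letters separated by dots, three or more letters long, e.g. "t.r.i.c.k.y".
DOTTED_WORD = re.compile(r"\b(?:[A-Za-z]\.){2,}[A-Za-z]\b")
# Ordinary word tokens.
WORD = re.compile(r"[A-Za-z0-9']+")
# Marker token handed to the Bayesian part whenever the trick is seen.
DOTTED_PSEUDOWORD = "trick:dottedword"


def tokenize(text: str) -> list[str]:
    """Return lowercased words, plus one pseudoword per dotted word found."""
    pseudowords = []

    def collapse(match: re.Match) -> str:
        # Record that the trick was used, then rejoin the letters.
        pseudowords.append(DOTTED_PSEUDOWORD)
        return match.group(0).replace(".", "")

    collapsed = DOTTED_WORD.sub(collapse, text)
    words = [w.lower() for w in WORD.findall(collapsed)]
    return words + pseudowords


if __name__ == "__main__":
    print(tokenize("A t.r.i.c.k.y offer"))
```

For the input `"A t.r.i.c.k.y offer"` the filter would see `a`, `tricky`, `offer` and `trick:dottedword`. The pattern is deliberately crude: an abbreviation such as `U.S.A` would also trip it, which is the sort of thing the Bayesian weights would then have to sort out.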
Re:Not the first; not revolutionary by GileadGreene · 2004-08-24 05:07 · Score: 2, Insightful

The academic literature search is pretty much dead these days - there's just so much stuff going on in the world that it's well nigh impossible to be completely up to date on your field. There're entire communities of researchers that have no idea what other, similar groups are up to.
Re:Spelling by daveashcroft · 2004-08-24 05:12 · Score: 2, Insightful

....and you must remember that chemists such as myself, will sometimes send an email to a colleague containing the systematic chemical name of a chemical which has just been synthesised for the first time. There is no way a dictionary based check would pass that, as we are effectively creating new "dictionary entries" each day.
Re:Here's how it probably works by Zak3056 · 2004-08-24 05:21 · Score: 2, Insightful

Until the spammers catch on and start to resend their requests. This seems like a stop-gap solution.

It is, but it's a GOOD stop-gap. In order to resend the bounced greylisted message, you'd have to be resending ALL soft bounced messages, the number of which, assuming you're sending millions of emails a day, is not insignificant.

It makes the cost of doing business higher for spammers, which ideally cuts down on their profits, making spamming less attractive.

--
What part of "shall not be infringed" is so hard to understand?
Not true, it less than doubles costs of spam by davidwr · 2004-08-24 05:46 · Score: 2, Insightful

If the spammer gets a "try later" response, he tries later ONE TIME. Worst-case this doubles their bandwidth costs and delays everything by 4 hours.

Today, MOST bad addresses will get SOME OTHER reply, so the cost increase is less than 2x.

I agree that it's a GOOD stopgap measure but it will fail as soon as the spammers catch on.

On the other hand, spammers might catch on to the idea that "these people are likely to complain, so I don't want to mail them anyways." That would be a Very Good Thing.
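
The "less than doubles" claim can be put as a one-line model, sketched here in Python. The function name and the assumption that a retry costs exactly as much as the first attempt are mine. In practice a deferred attempt may be rejected before the message body is sent, which would make the retry cheaper still.

```python
def retry_cost_multiplier(deferred_fraction: float, retries: int = 1) -> float:
    """Relative sending cost when a fraction of attempts gets "try later".

    deferred_fraction: share of first attempts answered with a temporary
        failure (0.0 to 1.0).
    retries: how many times the sender retries each deferred message.
    Assumes every retry costs the same as a first attempt.
    """
    if not 0.0 <= deferred_fraction <= 1.0:
        raise ValueError("deferred_fraction must be between 0 and 1")
    if retries < 0:
        raise ValueError("retries must not be negative")
    return 1.0 + deferred_fraction * retries


if __name__ == "__main__":
    for fraction in (0.0, 0.25, 1.0):
        print(fraction, retry_cost_multiplier(fraction))
```

With one retry the multiplier only reaches 2.0 if every single recipient defers the first attempt. Any address that answers with some other reply pulls it below that.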

--
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.