Mozilla Adding Spam Filters

DOWNLOAD NEW MOZILLA by cscx · 2002-11-14 05:51 · Score: 2, Funny

And ENLARGE YOUR PENIS at the same time!

Click HERE!

Re:DOWNLOAD NEW MOZILLA by wiredog · 2002-11-14 05:53 · Score: 4, Funny

Man, a perfect place for a goatse link, and you didn't put it in. Sigh. Kids these days.

--

Best Slashdot Co
Re:DOWNLOAD NEW MOZILLA by dimator · 2002-11-14 11:51 · Score: 2

Never seen goatse.cx link... must resist... mouse moving by itself... nooooo....

--
python -c "x='python -c %sx=%s; print x%%(chr(34),repr(x),chr(34))%s'; print x%(chr(34),repr(x),chr(34))"

102 Features IE doesn't have by Squeezer · 2002-11-14 05:51 · Score: 2, Insightful

Now the list of 101 Mozilla features that IE doesn't have can be amended to 102 features! :)

--
Does the name Pavlov ring a bell?

Re:102 Features IE doesn't have by crossseyed · 2002-11-14 05:55 · Score: 4, Interesting

It doesn't mean they're not thinking about it, though...
http://research.microsoft.com/~horvitz/junkfilter.htm

--
-- Outside of a dog, a book is man's best friend. Inside a dog, it's too dark to read
Re:102 Features IE doesn't have by Yuan-Lung · 2002-11-14 06:01 · Score: 2, Informative

No, it is not, but Outlook Express, an application distributed with IE, is.
Re:102 Features IE doesn't have by Shippy · 2002-11-14 06:05 · Score: 3, Insightful
Not really. E-mail is Outlook's domain. Not IE. I think that list of 101 things is a great way to show the power and flexibility over IE, but some of them are just filler. For example:
- 98. Supports IRC Protocol - This is something I don't even use. This is just another program which should be separate but isn't and gives rise to the "mozilla is bloated" argument.
- 99. Open Source - Yeah, but good luck sifting through it ;)
- 100. Bugzilla - OK, lots of people use this, but Bugzilla != Mozilla. So it's not like Mozilla has built-in Bugzilla features... This is unrelated to the list.
- 101. Giant Lizards are Cool - 'Nuff said.
So, that brings it down to, what, 97? Still a pretty good list. However, I've heard that popup blockers and tabbed browsing are making their way into IE (and MS employees can already use these features), but we'll see if they're actually integrated.
--
-Shippy
Re:102 Features IE doesn't have by GreyPoopon · 2002-11-14 06:13 · Score: 2, Informative

However, I've heard that popup blockers and tabbed browsing are making their way into IE
It'll be nice to have this, but this is really just another good argument for competition and choice. If Mozilla (and Opera) didn't have this first, how long would it have been before the features came to IE? The same can be said for things that appeared in IE first and finally made their way to Netscape / Mozilla. This is why it's really nice to have some choices.

--
GreyPoopon
--
Why is it I can write insightful comments but can't come up with a clever signature?
Re:102 Features IE doesn't have by gabec · 2002-11-14 06:33 · Score: 3, Insightful

Microsoft playing "catch-up"? Nonsense. Only, what, 9% of the internet users out there use browsers other than IE? Of that, how many of those alternate browsers have tabbed browsing and of those clients using those browsers how many actually *use* tabs?
I agree that Microsoft is scanning around and implementing good features, but no one other than /.'ers will ever know they got the idea from someone else. You're only playing 'catch-up' if there's something to catch-up to. IE has over 90% of the internet userbase, I'd say *that* was something to catch-up to.
Re:102 Features IE doesn't have by xanadu-xtroot.com · 2002-11-14 06:36 · Score: 2

think Mozilla is giving IE a run for its money and Microsoft is realizing this.

And that's only one facet of OSS! Imagine what goes on over there with all the other mile stones OSS continues to make. :-)

P.S.
KDE 3.1 freakin' rules. I'm running RC3 now. It's REALLY nice...

--
I'm not a prophet or a stone-age man,
I'm just a mortal with potential of a super man.
Re:102 Features IE doesn't have by Anonvmous+Coward · 2002-11-14 06:48 · Score: 2

"101. Giant Lizards are Cool - 'Nuff said."

What's ironic about that Giant Lizard is that it may get them sued out of existence. Yeah, that's a bonus. Whatever.
Re:102 Features IE doesn't have by afidel · 2002-11-14 07:03 · Score: 3, Interesting

Popup killing and tabbed browsing are the two killer features that have allowed me to spread mozilla widely through my office. People see me surfing and ask what the tabs are or ask where the popups have gone. I tell them about mozilla and show them how easy it is to stop popups. Yes I know about crazybrowser which does both of these, but it does popup killing badly (it's an all or nothing thing, not just unsolicited popups).

--
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Re:102 Features IE doesn't have by wkitchen · 2002-11-14 07:10 · Score: 2, Interesting

Microsoft playing "catch-up"? Nonsense. Only, what, 9% of the internet users out there use browsers other than IE? Of that, how many of those alternate browsers have tabbed browsing...
I suspect that Netscape, Mozilla, and Opera collectively make up most of that 9%. Given that all of those have tabbed browsing, then the answer is "nearly all of them". Or maybe not. I guess it really depends on how many of those 9% are surfing around using old versions.
and of those clients using those browsers how many actually *use* tabs?
Good question. I don't have any statistics, but I suspect the percentage is pretty high. Of the few Mozilla users I know, ALL use and love tabs. In fact, tabbed browsing has influenced many, including myself, to use Mozilla as their primary browser, resorting to IE only to deal with those increasingly rare sites that don't work with Mozilla (and a significant portion of those fail only because of really stupid browser detection that causes the page to refuse even to try to load if it detects something other than IE). Mozilla is good enough at this point that I now use IE for less than 1% of my web browsing.
I agree that Microsoft is scanning around and implementing good features, but no one other than /.'ers will ever know they got the idea from someone else. You're only playing 'catch-up' if there's something to catch-up to. IE has over 90% of the internet userbase, I'd say *that* was something to catch-up to.
Of course MS isn't playing catch up in user share. No one claimed otherwise. But when it comes to features, MS definitely has some catching up to do.
Re:102 Features IE doesn't have by WowTIP · 2002-11-14 07:16 · Score: 2

I've heard that popup blockers and tabbed browsing are making their way into IE (and MS employees can already use these features), but we'll see if they're actually integrated.

Not to troll, but did anyone except me notice how the development of IE seemed to kind of, stop... when there no longer was any competition (since after NS4.xx, and yes, Opera and Konq and some others were worthy, but not much of a threat to MS dominance). Now it seems to be speeding up again, when they have got a worthy opponent.

Microsoft should really thank the OS movement for keeping them on the edge, without competition it seems they will pretty soon stagnate.

--

"I'm surfin the dead zone
In the twilight, unknown"
Re:102 Features IE doesn't have by bugbear · 2002-11-14 07:17 · Score: 2, Informative

IE doesn't do real spam filtering yet, but MSN 8 now does content-based filtering that learns by example. Since they brag that it uses a "patented" algorithm, I assume they're using this Bayesian filtering algorithm.

Before everyone starts worrying that MSFT has patented Bayesian filtering, (a) I don't think the patent would hold up well in court, because e.g. ifile is older and (b) patents are not a problem for open-source projects anyway.
Re:102 Features IE doesn't have by andy+landy · 2002-11-14 08:00 · Score: 5, Funny

Is it? I thought Outlook Express was a virus-support API. I suspect the fact you can send email with it is a bug. :)

--
perl -e 'print "Just another Perl newbie\n";'
Re:102 Features IE doesn't have by Ageless · 2002-11-14 08:58 · Score: 2

Why do they care if they stagnate? MS gets to pay fewer developers, pay less for research and release fewer revisions (saving money) and people just keep buying it like it's the best thing ever.
Re:102 Features IE doesn't have by Anonymous Coward · 2002-11-14 09:44 · Score: 2, Interesting

> Is it? I thought Outlook Express was a virus-support API.

No, no, Outlook Express is for Internet Explorer what Composer is
for Mozilla or Netscape -- if you don't know HTML, you can use it
to create web pages. They won't be particularly well-designed, and
they won't validate, but the major legacy browsers everyone seems to
still use will display them, so you can put them up on your website.

The reason it sends email is not a bug, but a feature (albeit one
that tends to be abused). It's not for sending general email, but
so that you can easily upload the web pages you create to certain
free website engines that can receive them by email (on the theory
that most people don't know how to use ftp, or else because ftp is
considered insecure). The usenet engine was included so that
multiple people can use it in a peer-to-peer fashion to collaborate
on the creation of a web page. For example, if your mom and grandma
want to create a web page, but they aren't sure how to get the
pictures of the family dog scanned in, you can let them write the
text about the dog, and you can put in the picture. You can pass
it back and forth on your private family news server until it's
ready for the family website.

The reason people started using Outlook Express for regular email
is because the email software that shipped with Windows 95 (called
Microsoft Internet Mail) was _so_ bad that it was more convenient
to use _anything_ else, including telnet, and so when Outlook
Express came out people jumped on that, and the rest is history;
Outlook Express now handles (on one end or the other) nearly 40%
of the internet's email, more than anything else except sendmail.

The virus API, as you suspected, was not a bug but a feature, but
the reasons for its inclusion are complicated and involve both
particle physics and JFK.

Arms Race by Camel+Pilot · 2002-11-14 05:52 · Score: 3, Interesting

But the spammers will develop Bayesian filters of their own to find the best content that will sneak by your filters.

Re:Arms Race by TamMan2000 · 2002-11-14 05:55 · Score: 5, Insightful

Interesting thought, but they would have to have a large sample of YOUR valid email to train on...

--
"I'll have a Guinness, no wait, make that a Coors Light" -Grad student I work with, who shall remain anonymous...
Re:Arms Race by Anonymous Coward · 2002-11-14 05:55 · Score: 2, Informative

It doesn't work that way. Each person has their own Bayesian 'filter' so each person's tolerance for spam will be completely different.
For instance, the filter of someone who often receives links to web pages from strangers will let through more spam than the filter of someone who never receives links from strangers.
Re:Arms Race by Schubert · 2002-11-14 06:00 · Score: 2, Interesting

actually there is a great big gob of it out there... public mailing list archives.

--
-- schubert
Re:Arms Race by jpetts · 2002-11-14 06:08 · Score: 3, Insightful

But the spammers will develop Bayesian filters of their own to find the best content that will sneak by your filters

No they won't, unless the pattern (if there is one discernible in the S/N ratio) of replies they receive changes. As far as spammers are concerned, most spam disappears into a black hole, so they have no way of learning how your filters are working.

And that's good filterin'!

--
Call me old fashioned, but I like a dump to be as memorable as it is devastating - Bender
Re:Arms Race by Camel+Pilot · 2002-11-14 06:09 · Score: 3, Informative

Actually they do have your data. If you preview any e-mail they typically have something like
<img src=/spamcity/tracker.pl?id=177729299>
Where 177729299 is your personal id number.

Now they have the feedback, and they know what works and what does not.
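
A minimal sketch of how a mail client might spot the kind of tracking image described above, assuming Python's standard html.parser module. The function name suspected_web_bugs is hypothetical, and "any image URL that carries a query string" is only an assumed heuristic, not how any particular mail program works.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class ImageFinder(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML body."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def suspected_web_bugs(html_body):
    """Return image URLs that carry query parameters (assumed heuristic).

    A per-recipient id in the query string is what lets the sender learn
    that this particular message was opened.
    """
    finder = ImageFinder()
    finder.feed(html_body)
    finder.close()
    return [src for src in finder.sources if parse_qs(urlparse(src).query)]
```

A client could refuse to fetch any URL this returns, or simply never fetch remote images at all, which is the safer default several posters recommend.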
Re:Arms Race by WolfWithoutAClause · 2002-11-14 06:11 · Score: 2

You can turn that off though; and I recommend you do that.

--
-WolfWithoutAClause
"Gravity is only a theory, not a fact!"
Re:Arms Race by ichimunki · 2002-11-14 06:12 · Score: 5, Insightful

Nonsense. It's impossible. First of all, they don't have access to much of the mail I want to let through-- although my mailing list traffic certainly qualifies, so let's assume that's the only mail I get and that they know I am receiving it.

There will still need to be header information and actual spam content in the spams themselves for those mails simply to not be repeats or dada-esque cutups of posts to the mailing list. That is, there must be content unique to the spam that no normal sender on the list will include.

Because of this, and the fact that so-called Bayesian spam filtering works by scoring all the words in an email and then evaluating the email based only on the extremes, there is little likelihood-- since the spam must still contain spam words to have any point at all-- of those words not being on the extreme word list. After all, if the same words are appearing in both spam and not-spam mails, they will be given a spam-probability that is not extreme. So all those words in common will be ignored and only the spam words will be looked at-- and the spam will still be filtered.

--
I do not have a signature
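
The scoring scheme described in the comment above (give every word a spam probability, then judge the message only by the most extreme words) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code Mozilla ships: the function names, the 0.4 score for unseen words, the 0.01/0.99 clamps and the choice of 15 extreme tokens are all illustrative values.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into crude word tokens."""
    return re.findall(r"[a-z0-9$'!-]+", text.lower())


def train(spam_msgs, ham_msgs):
    """Count, per token, how many spam and non-spam messages contain it."""
    spam_counts = Counter(t for m in spam_msgs for t in set(tokenize(m)))
    ham_counts = Counter(t for m in ham_msgs for t in set(tokenize(m)))
    return spam_counts, ham_counts, len(spam_msgs), len(ham_msgs)


def token_prob(token, model):
    """Estimate the probability that a message containing token is spam."""
    spam_counts, ham_counts, n_spam, n_ham = model
    s, h = spam_counts[token], ham_counts[token]
    if s + h == 0:
        return 0.4  # assumed score for a never-seen token
    spam_rate = s / n_spam
    ham_rate = h / n_ham
    p = spam_rate / (spam_rate + ham_rate)
    return min(0.99, max(0.01, p))  # assumed clamps, avoid certainty


def spam_score(text, model, n_extreme=15):
    """Combine only the n_extreme token scores farthest from 0.5."""
    probs = [token_prob(t, model) for t in set(tokenize(text))]
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    spam_product = 1.0
    ham_product = 1.0
    for p in probs[:n_extreme]:
        spam_product *= p
        ham_product *= 1.0 - p
    return spam_product / (spam_product + ham_product)
```

Words that show up equally often in spam and legitimate mail land near 0.5, sort to the back of the list, and drop out of the decision, which is the point the comment makes about copying mailing-list vocabulary.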
Re:Arms Race by gorilla · 2002-11-14 06:13 · Score: 2

Which assumes that you have HTML enabled and images enabled in your email. Obvious answer - don't.
Re:Arms Race by Webmonger · 2002-11-14 06:29 · Score: 2

Actually, all they know is that all the mail they sent to you is now considered "spammy". Either it was detected as "spammy", or it was undetected and you manually marked it.

So web bugs are irrelevant. All they know is they can't send the same kind of thing twice.

They still don't know what your criteria are for non-spammy email. And they don't know your complete criteria for spammy email, either.

So they have no way to determine a pattern that evades the filter.
Re:Arms Race by Webmonger · 2002-11-14 06:31 · Score: 2

Scary thought: what if Microsoft started putting web bugs in Hotmail messages?
Re:Arms Race by Webmonger · 2002-11-14 06:42 · Score: 2

Only true to an extent. If they know I subscribe to BUGTRAQ, they might tailor their spam to look like BUGTRAQ messages.

But if I'm not a BUGTRAQ subscriber, BUGTRAQ terms like "exploit", "buffer overflow", "shellcode", etc are probably marked as spammy in my filter. The same applies to mailing lists on cooking, garbage disposal and the like. Pretty well every topic has its jargon.

And even so, there's no reason I need to filter my BUGTRAQ messages through a spam filter. I know they're clean, so I can catch them before they get checked for spamminess. So I have no need to train my filters to mark them good.

I know you're saying that public archives are a way of determining non-commercial speech, but my point is that most public archives have specialized jargon, so they don't look like email from my friends.
Re:Arms Race by Spock+the+Vulcan · 2002-11-14 06:42 · Score: 5, Informative

Use Gotmail, which downloads your hotmail messages to an mbox-style file. Or use hotwayd which appears like a POP3 server running on localhost, and uses WebDAV to get messages from hotmail (like Outlook Express). Either way, no web-bugs will get activated.

The added advantage is that you can pipe these through procmail/spamassassin just like ordinary incoming mail, and not have to manually delete all that spam.
Re:Arms Race by Webmonger · 2002-11-14 07:02 · Score: 2

I mean OUTGOING Hotmail messages. They already put ads at the bottom of outgoing messages-- I bet they'd consider putting web bugs in too.

I don't use Hotmail, but I get messages from friends who do, and if they had web bugs, Microsoft could deduce part of my non-spam criteria.
Re:Arms Race by Lionel+Hutts · 2002-11-14 07:40 · Score: 3, Interesting

That's an arms race the spammers can't win. Sending spam is an ultra-low-margin business: with response rates of a fraction of a basis point, and probably only a fraction of them actually spending any money, the cost and effort per message sent must be very, very, very low for the spammers to make any money at all. Most spam recipients would gladly put in, say, $20 worth of effort to spamproof their addresses; there is no way even a spammer with huge scale could invest even $5 worth of effort for one more address. We will all have different Bayesian rules, remember. Combine that with the fact that I have perfect information about what spam and nonspam I get, and the sender has little or no information about what gets through, and it's clear that even hours of effort by senders wouldn't do much.

And, even if they could afford to keep it up for a while, my spam filter will get better faster than their spam. This is the "Ambassador's criterion" from SDI (briefly: Star Wars won't lead to an arms race if it gets to the point where shooting down the marginal missile is cheaper than building the marginal missile).

I think we may just win the Spam Wars yet.

--
I Can't Believe It's A Law Firm, LLP does not necessarily endorse the contents of this message.
Re:Arms Race by Software · 2002-11-14 07:48 · Score: 2

Why is this so scary? Let me put your question another way:
What if Microsoft knew when you were reading email that's stored on their servers?
Answer: They already know! You opened the message! They don't need webbugs, they have their server logs. If you don't want Microsoft to know what emails you're reading and when, you shouldn't store your email on their servers.
Re:Arms Race by Lionel+Hutts · 2002-11-14 07:50 · Score: 2

I don't get it. Microsoft already knows the message was sent to you by a real, live person actually using the Hotmail web site (unless spammers will go through the trouble of automating that). I mean, I get spam with Hotmail return addresses, but they must all be forged, right? I'd say genuine Hotmail mail is pretty strongly presumptively non-spam and going to be read.

With web bugs, MS could figure out the IP addresses of people to whom messages are forwarded, but that's about it. If they wanted to build a database of people getting messages from Hotmail users, they don't need web bugs.

All hotmail users' outboxes are belong to them already.

--
I Can't Believe It's A Law Firm, LLP does not necessarily endorse the contents of this message.
Re:Arms Race by ebyrob · 2002-11-14 09:25 · Score: 2

Actually, more likely they'll create spams that cannot be tokenized and are hence passed through the filter unscathed...

In fact, that's already starting to look like an interesting challenge. Hackers do love a challenge.

Bayesian filter: "Duh, gee never seen *that* 20k token before." It might seem like a simple solution in some instances: "Oh, I'll just disallow any token over 100 bytes long..." But in a net prolific with binary objects that a mail program wouldn't normally need to parse, it's only a matter of time before spammers start embedding their messages in copies of "normal mail". Worse yet, they may embed their messages in the normal emails themselves!

Perhaps I should be patenting this idea instead of posting it?
Re:Arms Race by sakeneko · 2002-11-14 10:25 · Score: 2
Actually they do have your data. If you preview any e-mail they typically have something like

<img src=/spamcity/tracker.pl?id=177729299>

Where 177729299 is your personal id number.

This won't give spammers what they'd need to make a Bayesian filter work to get past other Bayesian filters, though. Some of the most important information a good Bayesian filter gets is not from your spam, but from your legitimate email. A good Bayesian filter notes who you routinely correspond with and what you talk about, and uses that information to prevent false positives. That means that it can go aggressively after stuff that looks like spam and not worry too much about catching legitimate email.

Still, tracking codes and other such stuff is why users should:
1. Get an email program that doesn't open links or display images automatically.
2. Install a good software firewall, like ZoneAlarm or Kerio, and configure it to block this cr*p.
3. Install Proxomitron to filter out what gets past the firewall.
Computer security and privacy can be enhanced considerably by taking a number of relatively simple measures.
--
Catherine
Re:Arms Race by MonkeyBoy · 2002-11-14 19:11 · Score: 2

Your #1 is good, but I prefer an email program that doesn't interpret ANY HTML automatically.

I realize this means I'm a backwards crotchety old fart, but email is text, dammit. You don't need funny fonts, you don't need any extraneous crap to get your point across. Period.

My primary complaint against Outlook is that there's no way to disable the HTML rendering engine. Other than that, and the myriad of security holes present in all its components, it's really not that bad of an email program.

But I'll keep using the same email program I've been using for 10 years now, at least until it breaks completely...

--
Moof!
Re:Arms Race by sakeneko · 2002-11-15 09:24 · Score: 2

Your #1 is good, but I prefer an email program that doesn't interpret ANY HTML automatically.

I realize this means I'm a backwards crotchety old fart, but email is text, dammit.

Yep, you are indeed a backwards crotchety old fart. Since I use elm on a Unix shell myself, I guess I qualify too. ;>

From a security standpoint, though, that isn't necessary. Any email program that doesn't interpret active scripts, IFRAME tags, or retrieve images or other media objects is pretty safe. HTML alone is no more capable of carrying a virus or trojan than text is.

I admit, though, that in email (or on Usenet) it is annoying.

My primary complaint against Outlook is that there's no way to disable the HTML rendering engine.

I understand that users of Windows XP who have applied Service Pack #1 now have the option of configuring their Outlook to read email only as text. I haven't tested this out myself, though. I don't do XP or Outlook -- not voluntarily, anyway. :)

--
Catherine

A little misleading by TobyWong · 2002-11-14 05:52 · Score: 5, Informative

The news article makes it sound like this feature is up and running, in reality it is partially phased in - alpha stage stuff.

It will be great when it's more complete but there is a lot of work to do yet.

--
- Toby

Re:A little misleading by DeadSea · 2002-11-14 06:21 · Score: 5, Informative

It is up and running, it just may have a few bugs.
I just downloaded the latest nightly build and enabled the features for my mail. So far I have seen that the icons are kind of funky, the dialog box is way oversized, and there doesn't appear to be a good way of marking multiple messages as spam or not spam.
On the other hand, it does seem to be doing a good job of filtering my messages. If you were one of the folks that complained about mozilla until mozilla 1.1 or 1.2, then I wouldn't go near it with a ten foot pole. If you are one of the folks, like me, who used mozilla since milestone 11 when it crashed every hour and couldn't render a heck of a lot of pages, you'll probably want to try it. Especially, if you use mozilla for mail anyway.
Re:A little misleading by Saint+Aardvark · 2002-11-14 06:33 · Score: 2

Still in alpha, true. I've d/l the newest build and am testing this. Still in the process of training for spam, but I'm really excited about this.
As I've mentioned before, this'll be a boon both for poor helpdesk slobs like me and end users: I'll be able to say, "Here, download this" and have it work for them (a bit of effort to train, true, but a simple process). It'll let them get rid of (or at least manage) spam, and keep me from having to come up w/answers to questions like "But why can't you just STOP it?"
Anyhow, just my $0.02...and congrats to Moz developers. You are all, collectively, My Man.

--
Carousel is a lie!

Re:didn't k5 already run a story on this? by Junks+Jerzey · 2002-11-14 05:52 · Score: 3, Informative

Here [kuro5hin.org]. Yeah, it's basically the same thing.

Yes, and your point is? Hint: Slashdot gets most of its stories from elsewhere.

Great gob Mozilla, but... by GreyWolf3000 · 2002-11-14 05:55 · Score: 2, Offtopic

Mozilla is continuing to shape up to be a great platform, but its size is getting bigger and bigger. A lot of people get worried about this, or frustrated. A lot of posters will complain about the bloat.

Compile Mozilla from scratch, and you'll see that you can custom tailor the build and cut out a lot of cruft. Of course, if you just want the browser, go for Phoenix, but really compiling on your own puts you in the driver's seat and lets you optimize it for your own needs.

The problem here is that binary distributions package it all together, so the result is the full-fledged Mozilla. Before you Gentoo zealots get out here and plug your so-loved-distro, remember that even you don't have as much control as you could.

Basically, my point is that all these features are a Good Thing, and that complaining about the bloat is silly, since it can be custom tailored to fit your needs.

--
Slashdot: Where people pretend to be twice as smart as they really are by behaving like children.

Re:Great gob Mozilla, but... by xanadu-xtroot.com · 2002-11-14 06:10 · Score: 3, Insightful

its size is getting bigger and bigger.

Compile Mozilla from scratch, and you'll see that you can custom tailor the build and cut out a lot of cruft.

The source package is far larger than the binaries! Then there's the wait in compiling the damn thing. No (L)User is going to do that. Maybe us geeks (and I do use the source, Luke), but certainly not a "normal" user.

The problem here is that binary distributions package it all together

So download the Net installer and choose only what you want?

--
I'm not a prophet or a stone-age man,
I'm just a mortal with potential of a super man.
Re:Great gob Mozilla, but... by RAMMS+EIN · 2002-11-14 06:12 · Score: 3, Informative

I partially agree with you. Compiling does allow you to get a slimmer lizard. However, compiling it from scratch is a real pain, and takes a long time and a lot of disk space. My point is that it's probably not worth the effort for most people. Why waste time and disk space on building a slimmed-down Mozilla if you can download a more functional precompiled version? This is why I love modularity so much; every module can be offered precompiled, and nobody needs to waste disk space.

--
Please correct me if I got my facts wrong.
Re:Great gob Mozilla, but... by GreyWolf3000 · 2002-11-14 06:16 · Score: 2

You're right, but should that mean that mozilla should not build all these features, knowing each and every one will get added to the end package most users see?
I personally compile from scratch ;), my rant is that distributions should try and make a leaner, more expandable way of installing Mozilla. It's one of the first impressions a new convert gets.

--
Slashdot: Where people pretend to be twice as smart as they really are by behaving like children.
Re:Great gob Mozilla, but... by GreyWolf3000 · 2002-11-14 18:46 · Score: 2

I'm sure you've seen worse, but isn't it oh so fun to flame?
I made the Gentoo comment because if I didn't, I would get a few replies that say "if you need to customize the installation from source, check out gentoo [gentoo.org]." Gentoo is great, its developers are great, and most of its users are fine. There are just a few zealots that piss the hell outta me going around the web and making plugs for Gentoo. I wonder how a Gentoo zealot would like every plug-post to be followed with a "or if you want more control, try LFS [linuxfromscratch.org]. I've been using it for years, and you can tailor every component to your needs. Hell, you don't even need glibc." It is a FACT that you get more control with LFS, but I am content that most people don't use it. Rather, I don't care.
Besides, I'll take ./configure and editing host.def or mozilla's equivalent over emerge any day :P
P.S. I'm sorry for going on the Gentoo rant. My 'dumb' comment was a botched attempt to prevent Gentoo-zealot-plugs. It ended up that I got mad, decent Gentoo users pissed off at me.

--
Slashdot: Where people pretend to be twice as smart as they really are by behaving like children.

Filtering by Transient0 · 2002-11-14 05:56 · Score: 5, Interesting

Bayesian technique is very good for the sort of abstract classification task that spam represents. It would be an interesting hack to try and train a network to categorize based solely on message body... i do however hope that their team has opted for practicality over just hack value and the network will also use such extremely relevant data as header information and comparing address versus address book (an e-mail from someone not in your address book is not necessarily spam... but it is more likely to be).

--
lysergically yours

Re:Filtering by Gabe+Garza · 2002-11-14 06:04 · Score: 5, Informative

Actually, using only the body isn't just a hack, it's a relatively new technique invented by Paul Graham that seems to produce excellent results. It makes a lot of sense: Spam is Spam because the body contains commercial or otherwise unwanted material--it's only natural that the most direct and accurate Spam filters are going to analyze the body. Bayesian classification like this is computationally tractable and appears to work. You can read more about it here.
Re:Filtering by garymcm · 2002-11-14 06:45 · Score: 3, Insightful

I would like to understand the choice of Bayesian more. As far as I know Bayesian is good for classifying based on *belief* and can be pretty good when only partial evidence is available to the network. This is great for Marketing activities, eg sending out mass emails to a segment of a database :). However as this is _my_ email and mission critical to me, just a simple belief that something is spam is not enough.

In my experience, 99% of spam can be caught with static rules (am I in the TO or CC line gets a bit under half the spam I receive). Taxonomical analysis of the subject and body can get the rest.

Bayesian seems like overkill, or maybe even a bad fit. Let's face it, the other well known use for Bayesian is the famous Microsoft Office Paper Clip!!! And that is about as useful as the proverbial ashtray on a motorbike!!

Gary
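
The static rule mentioned above (is my address on the To or Cc line?) can be sketched with Python's standard email package. This is only an illustration: the function name addressed_to_me and the example addresses are hypothetical, and a real rule would also have to allow for mailing-list and Bcc'd mail, which this check would wrongly flag.

```python
from email import message_from_string
from email.utils import getaddresses


def addressed_to_me(raw_message, my_addresses):
    """Return True if any of my addresses appears in the To or Cc headers."""
    msg = message_from_string(raw_message)
    # get_all returns every occurrence of a header; default to an empty list.
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = getaddresses(headers)  # list of (display name, address)
    mine = {address.lower() for address in my_addresses}
    return any(address.lower() in mine for _name, address in recipients)
```

A message that fails this check is not necessarily spam, so a rule like this is probably better used as one piece of evidence than as a verdict on its own.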
Re:Filtering by swdunlop · 2002-11-14 09:14 · Score: 5, Insightful

1) How much time do you spend training your paperclip in Office?

How much time are you going to spend on training your spam filter? If you are unwilling to invest a little time and effort in developing a solid set of values that fit your personal pattern of behavior, then Bayesian filters are indeed a poor match for you.

2) What harm is a false positive?

If you are automatically deleting anything that is marked as a positive for spam, then you are playing roulette with your email. I would generally recommend diverting email classified as spam by your filter to a folder, especially when the filter is relatively new and has had very little experience with your patterns of use. Set an expiry on your spam folder, and check it from time to time to see if something fell through the cracks. Mozilla has a handy feature that allows you to simply conceal spam from view, which works adequately, although I dislike the potential performance hit in a large folder.

Considering how important your email is to you, you should certainly consider applying a little diligence to how you manage it.

--
Weapons of Mass Analysis
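
The "divert to a folder, then expire" advice above might look something like this in outline. It is a toy sketch: the in-memory inbox list and spam_folder dict, the 0.9 threshold and the 14-day expiry are all assumptions chosen for illustration, not settings taken from Mozilla.

```python
from datetime import datetime, timedelta


def route(message_id, spam_score, inbox, spam_folder, now, threshold=0.9):
    """File a message into the spam folder instead of deleting it outright."""
    if spam_score >= threshold:
        spam_folder[message_id] = now  # remember when it was quarantined
    else:
        inbox.append(message_id)


def expire(spam_folder, now, max_age=timedelta(days=14)):
    """Drop quarantined messages older than max_age; return the ids removed."""
    expired = [mid for mid, filed in spam_folder.items() if now - filed > max_age]
    for mid in expired:
        del spam_folder[mid]
    return expired
```

Until expire runs, a false positive is still sitting in the folder and can be rescued, which is the whole point of not deleting on sight.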
Re:Filtering by guybarr · 2002-11-14 10:15 · Score: 2

Let's face it, the other well known use for Bayesian is the famous Microsoft Office Paper Clip!!!

1) office paper clip.
The Office assistants of all shapes, sizes and levels of irritation were a good idea at the time. The fact is MS did not understand just how hard educating people (especially people with, ehmm, a suboptimal approach to machines) with current AI methods is. They nevertheless did try to do something instead of whining, which is a positive, experimental approach I like very much. The fact they failed does not mean the attempt was not worthwhile.

2) office assistants Vs spam-filtering.
It may be, and IMHO very likely, that spam filtering is a much easier problem to solve than educating people. In fact, since spam is the output of a very limited statistical source (compared with the human brain ...) and needs to be, by its definition, highly repetitive, a multi-user statistical approach is, like all sparks of genius after-the-fact, the obvious solution.

So you can use static rules if you wish, but ignoring the shared data from multiple users seems like refusing to take the better solution.

--
Working for necessity's mother.
Re:Filtering by Shamashmuddamiq · 2002-11-14 10:46 · Score: 3, Interesting

I don't believe it was "invented" by Paul Graham. Thoughts of separating spam from real email based on the statistical properties of its content is something that has come to my mind, as well as the minds of many people over the last few years. Just because Paul's page was the first one that you've seen explain it in detail doesn't mean he invented it.
BTW, there are ways of getting around Bayesian filtering. For instance, if you take random words from a large dictionary of long, normal conversational but not-often-used-in-spam words and splatter them throughout your spam, it's easy to convince the Bayesian filter that it's not spam. Not only will this increase your false negatives, it has the capability of increasing your false positives. This is because your new spam will be training your Bayesian filter, and putting lots of non-spam-like words into its vocabulary. If the spammers keep up with their dictionaries as well as the filters keep up with theirs (and I must assume this will happen), we've still got a big problem on our hands.
Don't get me wrong. I have bogofilter installed on my mail server at home, and it works great for now. But don't expect it to work forever.
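
To make the word-padding attack concrete, here is a rough sketch. It assumes the filter combines per-token spam probabilities the way Paul Graham's essay describes (product of the most "interesting" tokens); bogofilter and Mozilla may combine evidence differently, and the probabilities below are invented for illustration.

```python
def combined_spam_probability(token_probs, top_n=15):
    """Combine per-token spam probabilities, Graham-style (assumed scheme)."""
    # Keep the tokens whose probabilities are farthest from neutral (0.5).
    interesting = sorted(token_probs, key=lambda p: abs(p - 0.5), reverse=True)[:top_n]
    spam = 1.0
    ham = 1.0
    for p in interesting:
        spam *= p
        ham *= 1.0 - p
    return spam / (spam + ham)


# Five strongly spammy tokens on their own (made-up numbers).
plain_spam = [0.99] * 5
# The same tokens padded with twenty words the filter has only seen in good mail.
padded_spam = plain_spam + [0.05] * 20

plain_score = combined_spam_probability(plain_spam)
padded_score = combined_spam_probability(padded_spam)
```

With these numbers the padded message scores well below the plain one, which is the false-negative half of the problem. The false-positive half appears once the user marks the padded message as spam and the innocent words start picking up spam counts.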

--
...just my 2 gil.
Re:Filtering by G-funk · 2002-11-14 13:28 · Score: 2

If you need to go in by hand and check all your spam, it defeats the purpose of filtering in the first place.

--
Send lawyers, guns, and money!
Re:Filtering by bugnuts · 2002-11-14 14:33 · Score: 2

That will completely hose you if you subscribe to a mailing list without a way to bypass it. I do some similar checking and have to put many addresses in specifically.

One huge method of stopping spam is blocking anything west of Hawaii. Cut china, tw, jp, and especially kr, and you lose a TON of spam. You typically have to use the IP blocks, though... they've caught on. Sayonara [211.*] and [210.*], I don't have any friends that write me from there anyway.
Re:Filtering by Lars+Arvestad · 2002-11-14 19:52 · Score: 2

If you need to go in by hand and check all your spam, it defeats the purpose of filtering in the first place.
No, it doesn't, at least not to me. I am a Xemacs/Gnus user and have some trivial (non-Bayesian) filters that put emails likely to be spam in a special folder. The accuracy is around 99%. Whenever I read my regular email, I can simply walk through them one by one without stumbling over spam every second message. Once a day, I take a look in the spam folder to see if someone I know but didn't happen to have in my address book sent me an email without a Subject line (yup, that's the most common problem).
This is way less intrusive than having my good emails mixed in with tons of spam!

--
Reality or nothing.

Mozilla mail / browser by FrostedWheat · 2002-11-14 05:58 · Score: 4, Interesting

I wonder if a similar technique could be used in the browser. Automatically block images or popups based on previous ones you have blocked.

Now that would be very nifty!

Re:Mozilla mail / browser by SethJohnson · 2002-11-14 06:07 · Score: 4, Informative

The site-specific white list feature of Mozilla's pop-up blocking seems to work fine enough. The number of sites where you actually want popups from is far less than those offering popups. So manually adding these exceptions to the white list is not such an annoying task. I think bayesian filtering would be overkill in this case.

--
$5 / month hosted VPS on linux = awesome!
Re:Mozilla mail / browser by pVoid · 2002-11-14 06:09 · Score: 2, Informative

It would be much harder because an image doesn't have 'content'. At least text content.

URLs are generally cryptic numbers, so that even humans can't decipher what they are.

There are certain apps out there, though (such as Norton Personal FW), that let you block a certain ad from ever popping up again. Which I find very cool.
Re:Mozilla mail / browser by Penguinoflight · 2002-11-14 06:38 · Score: 2

Banner-filter by phroggy proggy.org will block banners very well, but it only works inside a proxy now. Mozilla had a chance to implement a banner filtering system, but they opted out. Pretty sad actually

--
"And we have seen and do testify that the Father sent the Son to be the Savior of the World"
1 John 4:14
Re:Mozilla mail / browser by po_boy · 2002-11-14 07:52 · Score: 2

You know, this came to my head last night as I was falling asleep. I pictured it as a learning filtering system just like bogofilter or any of the other statistical mail filtering deals. It could work on embedded images and look at their URL, size, and possibly content to decide if you wanted to view it or not. When you saw a banner ad or something that you didn't want in your browser, right click on it and add it to the spams list. Your browser learns a little more and starts filtering out similar images.
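
One way to picture the "learning" image filter: turn each embedded image into a bag of tokens and feed those to the same kind of statistical classifier used for mail. The sketch below only covers the feature extraction step; the token naming scheme and the example URL are assumptions made up for illustration.

```python
import re
from urllib.parse import urlparse


def image_features(url, width, height):
    """Turn an image's URL and pixel size into a set of classifier tokens."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # Each label of the hostname becomes a token, e.g. "host:ads".
    tokens = {"host:" + part for part in host.split(".") if part}
    # Split the path on anything that is not a letter or digit.
    tokens |= {"path:" + t for t in re.split(r"[^a-z0-9]+", parsed.path.lower()) if t}
    # Banner ads tend to come in a few standard sizes, so the size is a token too.
    tokens.add("size:%dx%d" % (width, height))
    return tokens
```

Right-clicking an image and choosing "add to spams list" would then mean counting these tokens as spam evidence, the same way words are counted for mail.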

I initially thought about it based on a news story about the supreme court and library filtering systems. One problem with the filtering in libraries is that you have to depend on some company to make a good block list. They get a huge government grant and produce a crappy filter.

Seems like we could make a rather intelligent filtering proxy or browser or something to remove adult content from library kiosk machines. That way the libraries wouldn't be dependent on some poor filtering list, the kind that elicits cries of censorship. Eventually, a smart enough filter could be built to keep out objectionable material.

The only thing I couldn't figure out was how to train the statistical filter well.
Re:Mozilla mail / browser by Sylver+Dragon · 2002-11-14 11:00 · Score: 2

There's a site-specific whitelist feature? Where? That sounds much easier than repeatedly changing the pref whenever I want to turn on popups...

I have to ask. When would you ever want pop-ups enabled?
I've been running without pop-ups since I discovered Mozilla, both at home and work, and not once have I had a need to have pop-ups turned on. It's made my whole internet experience much better.
I also make use of my HOSTS file to kill most banners on sites I visit. Just find the hostname the banner image is loaded from, create an entry that points that hostname to 127.0.0.1, and that banner becomes defunct. Mind you, you might need to clear your disk cache out, so it doesn't get loaded locally; but, I do that on a regular basis anyhow.
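
For anyone who wants to script the HOSTS-file trick, here is a small sketch that formats the entries. The hostnames in the usage example are placeholders, not real ad servers, and the output still has to be appended to the system hosts file by hand.

```python
def hosts_entries(hostnames, target="127.0.0.1"):
    """Format one hosts-file line per hostname, pointing each at `target`."""
    # De-duplicate and sort so the generated block is stable between runs.
    return "\n".join("%s %s" % (target, host) for host in sorted(set(hostnames)))


# Placeholder hostnames for illustration only.
block = hosts_entries(["ads.example.com", "banners.example.net"])
```

Note that a hosts file matches whole hostnames only, so every ad server name has to be listed individually.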

--
Necessity is the mother of invention.
Laziness is the father.
Re:Mozilla mail / browser by Sylver+Dragon · 2002-11-14 12:26 · Score: 2

Well, often sites offering...completely legal software for download will use popups for their download windows...;)

While I have seen this sort of thing, usually on places such as download.com.com, they are also usually kind enough to provide a link to click on, if for some reason the pop-up fails to start the download. Ok, so it requires one more click, but for not having to click 50 billion or so times to close pop-ups, I figure it's a good trade off.
Again, not seen much use for pop-ups, especially the kind that happen when first loading or unloading a page.

--
Necessity is the mother of invention.
Laziness is the father.
Re:Mozilla mail / browser by SpaceLifeForm · 2002-11-14 14:01 · Score: 2

Are you implying that doing a right-click on a banner ad and selecting 'Block Images from this Server' does not work?

--
You are being MICROattacked, from various angles, in a SOFT manner.
Re:Mozilla mail / browser by Penguinoflight · 2002-11-15 01:43 · Score: 2

No, but there are tons of banner servers out there, doubleclick.net has about 20 subdomains, and it would get frustrating for me to "block from this server" every time I see an ad, and still get tons.
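
The subdomain problem goes away if the block list matches on domain suffix instead of exact server name. A minimal sketch of that check, assuming the browser or proxy hands us the image's hostname (the function name is hypothetical, and this is not how Mozilla's image blocking is implemented):

```python
def is_blocked(host, blocked_domains):
    """Return True if host is a blocked domain or any subdomain of one."""
    host = host.lower().rstrip(".")
    # Require a dot before the suffix so "notdoubleclick.net" is not caught.
    return any(host == d or host.endswith("." + d) for d in blocked_domains)
```

With this, a single "doubleclick.net" entry covers every server under that domain.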

--
"And we have seen and do testify that the Father sent the Son to be the Savior of the World"
1 John 4:14

zilla by sstory · 2002-11-14 05:58 · Score: 3, Interesting

I just switched to Mozilla. Happy to be free of Microsoft for email. It's skinnable, and there are some cool skins--like one which sort of emulates Evolution. I noticed an annoying 'feature' though, which is still there from Netscrap days--if you send an email without a subject, a dialog pops up and goes blah blah blah. I asked the Mozilla newsgroup if there was a way around this, but all I got was the sort of adolescent yammerings that keep me out of unmoderated newsgroups. Nice to see it has a spamfilter now. The only major improvement remaining is to add a spell-check (the Netscrap one was licensed from a 3rd party, and can't be freely distributed).

Re:zilla by Neon+Spiral+Injector · 2002-11-14 06:13 · Score: 5, Informative

It is so annoying to get an e-mail without a subject. My spam filters actually bump you a little bit closer to being considered spam if there is no subject. I consider it to be a required header.

For one, I sort my mail by thread. While Mozilla will use reference headers to thread messages, the fallback is the subject. Without a subject your message would be tossed in the thread with the other losers who also forgot their subject.

The easy way to keep that dialog box from popping up when you send a mail is to...put a subject on the message.

If you want a spell checker go to the Netscape FTP server find the XPI file for the spell checker and install it.
Re:zilla by mstyne · 2002-11-14 06:15 · Score: 2

How the frak is this Informative? Moderators, have you been taking stupid pills again?

It's skinnable, and there are some cool skins--like one which sort of emulates Evolution.

I'm sure the Mozilla dev team will be happy to know the first reason you listed for switching was because it was skinnable. Cripes.

--
mstyne: real name, no gimmicks
Re:zilla by Hard_Code · 2002-11-14 06:20 · Score: 2

"was because it was skinnable."

And skinnable to look like a different mail client at that :)

--

It's 10 PM. Do you know if you're un-American?
Re:zilla by Dog+and+Pony · 2002-11-14 06:39 · Score: 2

if you send an email without a subject, a dialog pops up and goes blah blah blah

There is nothing more annoying than getting mails without a subject line. It is even more annoying trying to spot that mail again in the list, when you have received several of them.

Now if they could actually forbid sending mails without a subject line, I'll start forcing Mozilla down all my friends' throats... :)
Re:zilla by ChaosDiscord · 2002-11-14 06:39 · Score: 5, Funny

I noticed an annoying 'feature' though, which is still there from Netscrap days--if you send an email without a subject, a dialog pops up and goes blah blah blah.

The "blah blah blah" is roughly, "You have not specified a subject. Would you like to enter one now?" Perhaps you're right, it should be changed. Instead, it should say, "You're about to send an email message without a subject. That's an amazingly rude thing to do and likely to irritate the recipient as it makes it harder for them to prioritize their incoming mail and harder to distinguish from spam. Because this is such a terrible idea, you should enter a subject line below. If you fail to enter a subject, the default entry of 'I'm an idiot, please delete this message without reading it' will be used."

--
Search 2010 Gen Con events
Re:zilla by DrXym · 2002-11-14 06:53 · Score: 2

The way around the issue is to hack the chrome so it doesn't do it. The reason it does do it has to do with good usenet practice - there is a list (whose name escapes me) of stuff that all good email / news software is supposed to enforce, and not allowing blank headers is one of them.
Re:zilla by Malc · 2002-11-14 06:53 · Score: 2

The most important part of an email is the subject line. Think about it.

When you're done, check out http://spellchecker.mozdev.org/. It's not as active as it could be, and one day it might even be part of the main source tree. Then we might get spellchecking for input forms on web pages.
Re:zilla by cduffy · 2002-11-14 07:16 · Score: 2, Insightful

I care. I'm busy, and if one of my friends needs a ride tonight I'll read it. If that same friend is just wondering how I'm doing, I won't -- unless I'm not at all busy.

Further, some of us actually have multiple threads of conversation going with our friends, or archive our messages and occasionally go back through them. I may be simultaneously talking with someone about (say) some PHP problems they're having and discussing motorcycle riding. If I want to go back and reread what exactly the problem he was having with PHP is, I don't want to have to sort through the messages where he's trying to convince me I should be riding a crotch rocket instead of a cruiser.

My friends understand this, and are polite enough to use the subject header in their emails. If they don't do that once, I'll ask politely that they start. If they don't do it again, I may well be rightfully a bit annoyed.
Re:zilla by sstory · 2002-11-14 07:31 · Score: 2

It's exactly the same sort of blather I got on the newsgroup. Instead of helping with a technical problem, the posters assumed they understood what I was trying to do, and condemned it. I have very good reasons for not including a subject line on certain emails, which no one asked for, and which I don't need to submit to their approval. People often criticize without knowing all the relevant information.
Re:zilla by Reckless+Visionary · 2002-11-14 08:15 · Score: 2

I agree with you. I also frequently have reason to send email without a subject, for example, to myself so that I have a file available on my IMAP server that I can easily access from outside the office. The reason I don't need the subject is because I already know the contents of the email and it's a waste of time to write myself telling me what I'm sending myself. No one seemed to mention this, but the obvious behavior to be called for here is that of Outlook Express for the Mac (I think that's where it happened to me, no promises). It will tell you that you're sending an email without a subject, but has a checkbox allowing you to disable that prompt. Is there something wrong with this type of behavior that the Mozilla developers purposely chose not to include it?

--
I think I'll stop here.
Re:zilla by Brendan+Byrd · 2002-11-14 12:59 · Score: 2

I hate it when somebody mass-forwards one simple joke fifty billion times, with a hundred greater-than signs on it, like:

> > > > > > > > > > > > > > > > > > > This joke is pretty funny:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > The chicken crossed the road...
> > > > > > > > > > > > > > > > > > > AND GOT HIT BY A CAR!
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > HA HA HA HA HA HA HA LOLOLOLOOOOOL!!!!

--
Zodiac Survey
Re:zilla by cymen · 2002-11-14 17:07 · Score: 2

Yeah, because we all know mailbots will have a hissy fit on the style of the email when managing your list subscription. Seriously, I'm on tons of mailing lists and Mozilla's little subject helper is just plain annoying. I put a subject in every damn email I send except those sent to bots--does that make me some sort of freak? I don't think so. The subject thing is dumb and, at the very least, should be an option.
Re:zilla by Phroggy · 2002-11-14 17:13 · Score: 2

People who only get a handful of messages a day probably don't appreciate how important this is for people who receive two hundred messages per day (between work email, personal email, and various technical mailing lists). When you're getting that sort of email, you either become very aggressive about how you handle it, or your time disappears.

Well said. Most of mine is filtered into various folders, but I can't filter everything. I glance at my mail throughout the day, and if nothing about your subject line indicates the importance of the message, I'll be pretty annoyed.

Hmm, come to think of it, I think Spamcop might flag messages with blank subject lines as possible spam. I'm usually pretty careful about checking through everything before reporting it, but I might miss yours and report it anyway.

--
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;

Hope it doesn't have false positives by tedgyz · 2002-11-14 05:59 · Score: 2, Interesting

This is really great technology.

I had the benefit of working with this technology for a classification problem here at work. I was amazed at how well it worked. We were using it to replace a purely human process.

However, there is one huge problem. Incorrect classification. Blind tests against a known dataset showed 80%+ correctness. The problem is, you don't know which 20% is wrong. Thus, you still need 100% inspection to validate the results.

When applied to mail filters, I wonder how the technology avoids dumping your good mail? Like when your friend sends you a URL to a good pr0n site.

--
"No matter where you go, there you are." -- Buckaroo Banzai

Yeah, but... by digital_milo · 2002-11-14 06:00 · Score: 3, Funny

This will be of no use to me until it automatically deletes any Word Doc and .exe files that my co workers try to email to me.

One question... by Hard_Code · 2002-11-14 06:01 · Score: 5, Interesting

I assume the filtering statistics live on the client side. What about IMAP? If I open up Mozilla on a new machine, are all my spam statistics lost (presumably rendering the junk mail filtering statistics I've accumulated useless on the new machine)?

It would be neat if, with IMAP accounts, Mozilla just stored the statistics in a file on the IMAP server instead of on the client.

--

It's 10 PM. Do you know if you're un-American?

Re:One question... by reaper20 · 2002-11-14 06:12 · Score: 2

It's like this with every setting, in both the browser and mail/news.

The real fix is full roaming profiles so I can have a master profile on a server with all my bookmarks, cookies, mail and spam settings, etc., but it seems like that feature is still a ways off ...
Re:One question... by Jobe_br · 2002-11-14 07:32 · Score: 2

... unfortunately, I might add. Roaming profiles is an underappreciated feature, it appears. It's incredibly simple to set up w/ NS 4.7+Apache+mod_roaming and works like a charm. Even for my small company (2-7 employees at any given time), roaming profiles provides a lot of flexibility, especially when moving around between machines, which I, personally, often do. It's also great for recovering settings after configurations get corrupted - that's come in handy quite a few times.

Some day, I guess ..
Re:One question... by BroadbandBradley · 2002-11-14 08:36 · Score: 3, Interesting

someday you'll be able to backup and restore your Mozilla Profile, and when that day comes, I hope you'll remember that Mozilla has a House online at ZillaVilla.com

--
"The Most Fun Possible on 4 wheels" is at SunBuggy in Las Vegas
Re:One question... by biostatman · 2002-11-14 09:34 · Score: 2, Informative

SpamAssassin in combination with procmail has worked well on the server side for me. You can tune the sensitivity to how much spam it catches, but my informal assessment is that it catches about 95% of the spam, with only 1 false positive in about 3 weeks of use (the false positive's address, like any other email address, can be put in a whitelist of email addresses that are let through automatically). Great stuff. Saves me from having to constantly update my ~/.procmailrc for new spammers.

--
For the love of $DEITY, loose != not win!!!!!

SpamAssassin + Mozilla = Schweet! by Noryungi · 2002-11-14 06:04 · Score: 5, Interesting

Well, most of my spam is already sent to /dev/null by the SpamAssassin ninja.

But, for those that make it past the email shadow warrior, I guess Bayesian filters are a double whammy they'll never survive... Mwahahahaha!

Kudos to the Mozilla programmers!

--
The right to offend is far more important than the right not to be offended. (Rowan Atkinson)

Re:SpamAssassin + Mozilla = Schweet! by Plutor · 2002-11-14 07:03 · Score: 2

SpamAssassin should soon include its own Bayesian filters, and Perl support.
Re:SpamAssassin + Mozilla = Schweet! by TheTomcat · 2002-11-14 07:33 · Score: 2

Don't effective Bayesian filters need a good sampling of email you DON'T want, to be useful?

Spamassassin does a pretty good job, so the amount of spam that makes it to your client should be negligible (as you said, most of your spam is already filtered).

So, without a good sample of spam, that second level can't act intelligently.

Unless I'm wrong.. (-:
OR, if you could hook it up with spamassassin's rejected mail -- THIS would be useful.
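
To see why the second-level filter needs a spam sample, here is a toy sketch of the training step. It uses simple add-one smoothing and whitespace tokenizing, which are simplifying assumptions and not the scheme Mozilla or SpamAssassin uses; the sample messages are invented.

```python
from collections import Counter


def train(messages):
    """Count, for each token, how many messages in the corpus contain it."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts


def token_spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate how spammy a token is from both corpora (add-one smoothing)."""
    spam_rate = (spam_counts[token] + 1) / (n_spam + 2)
    ham_rate = (ham_counts[token] + 1) / (n_ham + 2)
    return spam_rate / (spam_rate + ham_rate)


# Invented sample corpora; the spam side could be SpamAssassin's rejected mail.
spam_corpus = ["buy pills now", "cheap pills"]
ham_corpus = ["lunch now?", "meeting notes"]
spam_counts = train(spam_corpus)
ham_counts = train(ham_corpus)
```

With an empty spam corpus every token gets the same spam rate, so the filter has nothing on the spam side to learn from. Feeding it the mail SpamAssassin already rejected is one way to fill that gap.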

S
Re:SpamAssassin + Mozilla = Schweet! by Noryungi · 2002-11-14 07:49 · Score: 2

Actually, on my Linux workstation, SpamAssassin sends the spam straight to the 'trash'.

So I guess that it's possible to (a) hook them up together and (b) get a good sample of rejected emails... =)

Mwahahahahahaha!!

--
The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
Re:SpamAssassin + Mozilla = Schweet! by bugnuts · 2002-11-14 14:26 · Score: 2

What you'd do is save all the messages that got PAST the spamassassin filters, then feed those into it. Hopefully it'll start catching such spam later.

The ones I find most often get past spamassassin are one-line urls.

Re:MSN 8 rules, Mozilla Sucks by Hard_Code · 2002-11-14 06:04 · Score: 3, Funny

"since every Mozilla article degrades to a flame fest of Microsoft greatness versus the rest of the world"

s/Microsoft/Open Source/

--

It's 10 PM. Do you know if you're un-American?

Re:MSN 8 rules, Mozilla Sucks by mblase · 2002-11-14 06:05 · Score: 2

I can get spam filtering as part of upgrading my free MSN account to MSN 8 for only $10/month! (Just trying to figure out what the MS trolls will have to say about this one)

Besides the obvious fact that Mozilla costs $0 per month, you mean?

Microsoft's Patent by woboz · 2002-11-14 06:05 · Score: 5, Interesting

What happens when Microsoft attempts to enforce this patent?

Re:Microsoft's Patent by VValdo · 2002-11-14 06:27 · Score: 2

They haven't enforced it against Apple Mail's junk filter, so (without reading the whole patent), I wonder if it applies.

W

--
-------------------
This is my SIG. There are many like it, but this one is mine.
Re:Microsoft's Patent by DaveAtFraud · 2002-11-14 06:50 · Score: 5, Informative

This is from Paul Graham's site with regard to the Microsoft patent. Patents tend to be very narrow in scope such that, if some aspects change, the patent may no longer apply. Pick on any typical consumer product such as hair dryers, stereos, you name it. They all have patents and they're all different and they don't "infringe" on each other unless they're virtually identical.

--
They that can give up essential liberty to obtain a little temporary safety deserve neither safety nor liberty.
Ben
Re:Microsoft's Patent by aredubya74 · 2002-11-14 06:52 · Score: 2

I rarely traffic in bumps, but mod the parent up. Really good catch on this patent.

--
RW
Re:Microsoft's Patent by McFly777 · 2002-11-14 07:09 · Score: 2

I hope somebody can find some prior art on this. I just (quickly) read the claims and body of the patent and it sounds very much like the techniques that have been described here previously.
Unfortunately, the patent was issued in Dec 2000; the first time I heard this idea was the Paul Graham implementation in the last few months.
So, if this is all old hat to anyone out there, please do everyone a favor and find that prior art and let everyone know, so that, in 5 years when MS tries to enforce this patent, there is a defense.
------
I accidentally posted this as an AC (score:0) so I am reposting it, but in the meantime another AC post claims to have some prior art. According to that AC, this article (fixed link) may be helpful. I couldn't read it myself as the full text requires ACM membership. Perhaps somebody with access could take a look and review its potential applicability as prior art. (i.e. Does it explicitly mention using Bayesian techniques to filter spam?)

--

McFly777
- - -
"What do people mean when they say the computer went down on them?" -Marilyn Pittman
Re:Microsoft's Patent by egghat · 2002-11-14 07:23 · Score: 2

Could somebody with access to the ACM library check this and, if the parent is appropriate, mod him up?

That may be the prior art we're looking for.

I hate software patents and I hate MS.

Bye egghat.

--
-- "As a human being I claim the right to be widely inconsistent", John Peel
Re:Microsoft's Patent by blamanj · 2002-11-14 07:48 · Score: 2

Microsoft and Apple have a number of cross-license agreements, so that's not a valid test.
Re:Microsoft's Patent by DeadSea · 2002-11-14 07:56 · Score: 5, Informative

Specifically in this case:
...then stored in a corresponding folder for subsequent retrieval by and display to the recipient.
So it looks as if this patent only covers server side implementations. A client side (Mozilla's) implementation retrieves it and then filters and displays it.
Re:Microsoft's Patent by crisco · 2002-11-14 11:11 · Score: 2

Something called iFile has been doing something similar since 1996. Changelog and readme document this. Some thoughts from the POPFile project (my sources for those links).

--
Bleh!

Mod this guy up!! That's brilliant by Mustang+Matt · 2002-11-14 06:06 · Score: 2

So obvious yet so simple!

--
The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin

Re:I just started up popfile by netringer · 2002-11-14 06:06 · Score: 2, Informative

I've been running popfile for just a couple of weeks. It's working amazingly well.

The fun thing is when it works on its own, like when you get a message from a subscribed list that it has never seen before and it knows that it ISN'T spam.

With popfile working so well I'm not in a hurry to have Bayesian filters built into the mail client.

Has anybody tried sharing the history data between Windows and Linux clients on a dual boot machine?

--
Ever dream you could fly? Get up from the Flight Sim. I Fly

Since some of us run Windows, by Dot.Com.CEO · 2002-11-14 06:07 · Score: 5, Informative

I dare submit myself to the rage of the Slashdot crowd. I use Outlook and "Spamnet" is a way to stop most spam in Windows. Based on the Razor project (distributed spam detection), it is a great solution for whoever cannot or does not want to move to Mozilla. Granted, it is beta quality, but the Mozilla feature is still in the alpha stage.

--
Mother is the best bet and don't let Satan draw you too fast.

Re:Since some of us run Windows, by Dot.Com.CEO · 2002-11-14 06:35 · Score: 2

You are missing understanding of my mail ;-) I said that maybe there are people out there who enjoy using Outlook and do not want / cannot change it. Spamnet is a solution
Not everyone can change browsers / mail clients you know. And, believe me, there are people out there who rather like Outlook. I am one of them.

--
Mother is the best bet and don't let Satan draw you too fast.
Re:Since some of us run Windows, by zrodney · 2002-11-14 07:03 · Score: 2

well, then you should have made the title
"Since some of us like running Outlook"
and there would have been no confusion.

Windows can run stuff other than MS programs
Re:Since some of us run Windows, by jilles · 2002-11-14 07:30 · Score: 2

I browse with mozilla and I mail with outlook xp. Outlook is currently one of the best mail clients (in terms of features) and the mozilla mail client still needs a lot of work to even come close. Security is an issue with outlook only if you can't find your way to the preferences dialog to adjust security to the appropriate settings.

Spamnet is catching most of the spam on my machine now. I installed it two months ago and it has caught upwards of 95 percent of the spam I receive. More importantly, it hasn't miscategorized a single message.

--

Jilles
Re:Since some of us run Windows, by Reckless+Visionary · 2002-11-14 08:27 · Score: 2

Unfortunately, Spamnet doesn't fully support IMAP. That makes it useless to me, but they say they're working on it.

--
I think I'll stop here.
Re:Since some of us run Windows, by CvD · 2002-11-14 21:46 · Score: 2

I used to use Vipul's Razor, but I noticed it was marking email from Red Hat and CERT as spam, putting them into my spam folder. The distributed spam network is a good idea, but it sucks when it gets poisoned. Who does this?

Now I've got only relay checking and spamassassin, which works great.

Cheers,

CvD.

--
The Official Steve Ballmer Webpage

No, too obvious by wiredog · 2002-11-14 06:07 · Score: 2

The "Freedom From Interference With Commercial Speech Act"

--

Best Slashdot Co

Re:No, too obvious by saider · 2002-11-14 06:24 · Score: 2, Interesting

This new law will force you to leave your radio and TV on even while you aren't paying attention to it. Furthermore junk mail will no longer be able to be discarded without an affidavit that states the recipient has read and understands the offer. Street mimes and homeless people wearing commercial signs must be paid attention to by anyone within a 10 foot radius. You will be required to sample every free offering at the Food Court in the mall and surveyors cannot be ignored. All fliers distributed on your vehicle must be followed up with a phone call or your vehicle will be impounded. You will be required to contact every business that advertises at sporting events if you choose to attend.

Failure to abide by these rules will result in the forfeiture of all assets and the garnishing of all wages earned, which will be deposited into the Federal Marketing Enforcement Fund. Monies from this fund are distributed to companies whose marketing campaigns are not successful.

--

Remember, You are unique...just like everyone else.
Re:No, too obvious by nrosier · 2002-11-14 06:32 · Score: 2

Don't see how this would apply here. There's no interference. The receiver of the mail still has the last word on what happens with his mail. It's just another way of filtering mail, based not on regexps on subject, sender, etc., but on the likelihood of being junk.
For the moment, the filter only flags mails; it doesn't even delete or move them. At most, it will only move mail to certain folders (the Trash folder, if you like).
I use Ifile and procmail to filter my mail. What would be next? A law prohibiting the use of filters on mail?
I guess spammers would like to see a law that forces me to look at spam, but I don't see this happening.

My only complaint... by Mustang+Matt · 2002-11-14 06:09 · Score: 3, Interesting

In Outlook Express, I can set up 100 different email accounts and not have a giant list of mail folders.

In Mozilla (last I checked), for every account you set up it creates a new set of folders.

Since I've got a catchall account, I'd like to tie multiple email addresses to one set.

Anybody out there on the Mozilla team listening?

--
The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin

Re:My only complaint... by ChrisDolan · 2002-11-14 06:29 · Score: 5, Funny

No they likely aren't. They have this cool thing called Bugzilla (http://bugzilla.mozilla.org/) which is designed to track bugs and new feature requests. If you want to be heard, that's the place to submit, not here.

It's like, if you want to submit a complaint to Microsoft, you write them a letter to their company address instead of, say, writing your complaint as graffiti on a New York subway car. Wait a minute, actually, you might run into a MS employee doing butterfly graffiti, so that's a bad analogy... Plus, a subway isn't a good metaphor for Slashdot. The /. crowd is much scarier.

Not enough by sulli · 2002-11-14 06:09 · Score: 2, Interesting

Spammers don't use relays these days, they use spam tools that directly SMTP the receiving mail server. So the receiver still needs to filter.

--

sulli
RTFJ.

Outlook is part of the IE Package by yerricde · 2002-11-14 06:13 · Score: 4, Informative

E-mail is Outlook's domain. Not IE.

It's possible to net-install Mozilla without installing Mozilla Mail, but the default setting includes both. It's possible to net-install IE without installing Outlook Express, but the default setting includes both. Thus, it is a fair comparison.

100. Bugzilla - OK, lots of people use this, but Bugzilla != Mozilla. So it's not like Mozilla has built-in Bugzilla features... This is unrelated to the list.

I think the point of that entry was that unlike IE's bug database, which only Microsoft employees see, Mozilla's bug database is 99% open to the public (the other 1% primarily covers unfixed security vulnerabilities).

--
Will I retire or break 10K?

You know what would be cool? by PDHoss · 2002-11-14 06:15 · Score: 5, Funny

If the spam filter could intercept outgoing mail. I would sneak into my goddamn in-laws house and install Mozilla if it would eat every forward-of-forward-of-forward-of-forward message they tried to forward to me based on rules like:

1. Says "someone is testing something and you get $NN.00"

2. Says anything like "angels watching over us" or "a mother's poem" or other such bullshit.

3. Says "This is really funny"

4. Says "We'll be over on Tuesday right during dinner when you are trying to put the moves on our daughter/your wife."

Umm, not the last one, really. Just got on a roll.

PDHoss

--
======================================
Writers get in shape by pumping irony.

Re:You know what would be cool? by SCHecklerX · 2002-11-14 09:10 · Score: 2

Just use spamassassin at home. My annoying cousin gets sent to the 'likely spam' folder quite frequently. Anything higher than 12 hits, however, is /dev/null'd.
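
A rough sketch of that kind of score-based routing, in Python. The 12-point discard cutoff comes from the post; the 5-point "likely spam" cutoff and the function name are assumptions made for illustration, not anyone's actual config.

```python
# Hypothetical routing by spam score. Only the discard cutoff (12) is from
# the post above; the lower cutoff is an assumed value.
LIKELY_SPAM_THRESHOLD = 5.0
DISCARD_THRESHOLD = 12.0


def route(score: float) -> str:
    """Return the destination for a message with the given spam score."""
    if score > DISCARD_THRESHOLD:
        return "/dev/null"      # "anything higher than 12 hits"
    if score >= LIKELY_SPAM_THRESHOLD:
        return "likely-spam"    # kept, so false positives can be rescued
    return "inbox"
```

Keeping a middle "likely spam" band means only very high scores are thrown away unseen.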
Re:You know what would be cool? by rthille · 2002-11-14 13:34 · Score: 2

> 4. Says "We'll be over on Tuesday right
> during dinner when you are trying to
> put the moves on our daughter/your wife."

yeah, because having them just show up while you're putting the moves on their daughter is so much better :-)

--
Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
Re:You know what would be cool? by EnglishTim · 2002-11-14 23:00 · Score: 2

But the problem with these forwarded-around things is that once they get into the hands of a spammer, they're a big juicy list of valid emails - and your email address will be on there and there's nothing you can do about it.

Eudora finally has the filter I need by Continental+Drift · 2002-11-14 06:15 · Score: 3, Informative

Eudora's latest version, 5.2, includes the ability to filter mail against your address book. If someone sends me mail and they are not in that address book and they don't use a special key word in the subject line, they get an automatic reply telling them to try again with that key word. Spammers will ignore that reply, so only real people will include the key word, and then I can add them to my address book.

This, combined with some clever regex filters I already had, means that I can reliably get the 10% of my mail that I actually want to read.

Re:Eudora finally has the filter I need by Sancho · 2002-11-14 19:30 · Score: 2

It's similar in approach to Spam Arrest. Spam Arrest is annoying, and I refused to "authorize" my email address, which was a problem because the person using it was on a listserv I actively participated in. Every message I sent got me a Spam Arrest message asking for authorization.

Vote for it by cheezycrust · 2002-11-14 06:17 · Score: 2

This is bug number 199684 in Bugzilla (no direct links from Slashdot, you know). They are not sure what to do about it, but they are thinking about it.

--
Teenagers these days don't have as much sex as they want each other to think they do.

Re:Vote for it by sstory · 2002-11-14 07:44 · Score: 2

Invalid Bug ID Bug #199684 does not exist. Please press Back and try again.
Re:Vote for it by po_boy · 2002-11-14 08:03 · Score: 2

Danger, Will Robinson!
Bug #199684 does not exist

Re:didn't k5 already run a story on this? by ergo98 · 2002-11-14 06:18 · Score: 2, Insightful

Really, eh? I mean, I turned on CNN today and they were reporting a story that I'd already heard on ABC News! The nerve! I sent them a letter saying "Um, excuse me, but I already heard that on ABC l053rZ!" They haven't replied yet.

To make matters even worse, when I was on the train I overheard two people talking about the Israeli conflict. I couldn't believe it! I mean, I heard someone talking about that LAST YEAR for crying out loud! That is so 2001! I told them that they're l4m3rZ for being so dated. They just seemed to ignore me though.

Good example of MS's monopoly abuse by SethJohnson · 2002-11-14 06:18 · Score: 5, Insightful

Sorry if this comes off as a MS-bashing rant. It's not intended as such.

The fact that MS doesn't seem hard at work implementing spam filters in Outlook or popup blockers in IE is a good example of consumers suffering due to Microsoft's monopoly. It also demonstrates how Microsoft is able to leverage its monopoly in one area (mail and web clients) to build profit in another market.

This other market is its aspiring ISP services. The app and mail client development teams aren't implementing these features because the Microsoft ISP wants to be able to tout the ability to filter spam and block popups. If the browsers and email clients used by 90%+ of internet users had these features, then it wouldn't be a selling point for their ISP. This is a clear example of the company withholding features in the free products so it can profit from the antidote.

It also demonstrates the lack of competitive pressures in the market that normally drives a company to implement features at a rapid pace. Consumers are stagnating with a product for which the developer has no competitive pressure to improve. Hence that list of 102 things Mozilla can do that IE can't do.

--
$5 / month hosted VPS on linux = awesome!

Re:Good example of MS's monopoly abuse by schon · 2002-11-14 06:41 · Score: 5, Insightful

Sorry if this comes off as a MS-bashing rant.

No need to apologize - I love a good MS-bashing rant as much as the next /.'er.. :o)

I do, however, feel that it's not as big a problem as you do..

The app and mail client development teams aren't implementing these features because the Microsoft ISP wants to be able to tout the ability to filter spam and block popups.

This may (or may not - although I'm inclined to agree with your views) be true, but the important thing to understand is that the MTA (ISP)-level is where spam blocking belongs.

The real problem with spam is that it steals bandwidth - blocking spam after it's already sitting in your mailbox is like closing the barn door after the horses have eaten your children - the bandwidth has already been used, so you don't gain anything... having your email client "block" spam isn't really blocking it, it's just an automatic "delete key".. which is what the spammers want (how many of them say spam isn't a problem because you can "just hit delete")

MS's intentions aside, the solution they have is the correct one, even if their motives are suspect.
Re:Good example of MS's monopoly abuse by bamm · 2002-11-14 06:59 · Score: 2, Informative

I see your bet and raise you an infinite number of software and hardware developers.

Installed anything on a MS platform lately? Everyone wants your email address, so they can give you better "support" by selling your info to hordes of spam artists. For instance, my Mom doesn't use Windows because it's easier to use or crashes less often. She uses Windows because CreateCard12 doesn't run on Linux. Her new printer didn't come with Linux drivers and neither did her scanner. MS retains its monopoly status for those reasons and it isn't about to jeopardize its relationship with these software and hardware companies by helping prevent spam (unless they get $30 a month from you).

Bammkkkk

--
www.sguil.net
The Analyst Console for NSM
Re:Good example of MS's monopoly abuse by novakreo · 2002-11-14 07:36 · Score: 2, Insightful

I for one am quite happy for Internet Explorer to never implement tabbed browsing, pop-up blocking, mouse gestures, or anything else which it currently lacks. It makes it much easier to convince people to switch browsers (if they don't care about security), and the fact that 90% (or whatever the exact stat is) of the world uses IE and sees the pop-up ads means that advertisers aren't rushing about trying to circumvent the pop-up blocking.

In short, Microsoft is the open-source movement's greatest asset :-).

--
O frabjous day! Callooh! Callay!
Re:Good example of MS's monopoly abuse by mpsmps · 2002-11-14 08:00 · Score: 2

Actually, Microsoft was ordered by the court to stop filtering spam because they were filtering out competitors' emails.
Re:Good example of MS's monopoly abuse by Clover_Kicker · 2002-11-14 08:56 · Score: 2

>The real problem with spam is that it steals
>bandwidth - blocking spam after it's already
>sitting in your mailbox is like closing the barn
>door after the horses have eaten your children -
>the bandwidth has already been used, so you don't
>gain anything

I disagree, this tool would spare me the time and annoyance of deleting the spam message by message.

(I don't dispute that spam is a terrible waste of bandwidth, but it's not the only problem.)
Re:Good example of MS's monopoly abuse by ebyrob · 2002-11-14 09:47 · Score: 3, Insightful

is like closing the barn door after the horses have eaten your children

Ya, you should have shot those man-eating horses to begin with. Seriously though, don't you think we should have laws against this type of mail fraud (forging headers and the like) instead of simply trying to "block" the fraud at the ISP level? I suppose blocking as well can't hurt, but freedom requires punishing the guilty and only the guilty.

The last thing I want is Microsoft deciding which emails destined to me are "spams". (subscription email from FSF? Must be spam!)
Re:Good example of MS's monopoly abuse by Refrag · 2002-11-14 16:21 · Score: 3, Interesting

The real problem with spam is that it steals bandwidth - blocking spam after it's already sitting in your mailbox is like closing the barn door after the horses have eaten your children - the bandwidth has already been used, so you don't gain anything... having your email client "block" spam isn't really blocking it, it's just an automatic "delete key".. which is what the spammers want (how many of them say spam isn't a problem because you can "just hit delete")

I'd argue that the time wasted on filtering spam is more valuable than the bandwidth wasted delivering it. This is why I am glad that Apple was able to bring good client-side spam filtering to the people with Mail and that Mozilla will soon provide this feature as well.

--
I have a website. It's about Macs.

Re:MSN 8 rules, Mozilla Sucks by lovebyte · 2002-11-14 06:22 · Score: 2, Informative

I get spam filtering for free on Yahoo!

--

I'll do it for cheesy poofs.

Re: Mozilla bloat [...] Gentoo by delta407 · 2002-11-14 06:23 · Score: 3, Informative

Before you Gentoo zealots get out here and plug your so-loved-distro, remember that even you don't have as much control as you could.

I disagree. See the Mozilla 1.1 ebuild for details. I can write:

# export USE="moznomail"; emerge mozilla

Or, if the ebuild still doesn't provide enough customization, I can just manually remove a config option (say, --enable-xsl) and "emerge mozilla" to get exactly what I want.

if spam gets through.. by EvilStein · 2002-11-14 06:25 · Score: 5, Funny

procmail filters, SpamAssassin, AND the new Mozilla spam filters.. can we make a law that will make it legal to find the spammers and execute them in public?

Pleeeease??

So you really want... by dpilot · 2002-11-14 06:26 · Score: 5, Informative

You really want server-side filtering. I do that on my IMAP server with procmail, though not Bayesian. A quick google with "procmail bayesian filter" turns up quite a bit of interesting stuff to sift through. Of course if it's not your IMAP server, you're back to client-side solutions.

--
The living have better things to do than to continue hating the dead.

Re:So you really want... by DeadSea · 2002-11-14 07:20 · Score: 2

It seems to me it would be possible to write a filter that works with IMAP but still runs on the client. Basically, the client would connect to the mail server from a cron job every 5 minutes (or just before your mail reader checks) and check for new email. It would filter any new messages and move the spam to the spam folder. When you check your email you would move spam it didn't catch to the spam folder and move stuff it marked as spam that wasn't back to the inbox. The next time the program ran, it would account for moved messages.
The great thing is that it wouldn't rely on any particular mail reader (you could even use IMAP webmail and switch between various readers), and you wouldn't have to run your own server.
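
A minimal sketch of that idea using Python's standard imaplib, assuming a folder called "Spam" already exists on the server. The keyword classifier is a placeholder (a real one would be Bayesian), and the function names are made up for illustration. The "learn from messages the user moved" step is left out.

```python
import email
import imaplib


def looks_like_spam(msg) -> bool:
    # Placeholder classifier (an assumption); swap in a trained filter here.
    subject = (msg.get("Subject") or "").lower()
    return "enlarge" in subject or "free money" in subject


def filter_inbox(conn, spam_folder="Spam", classify=looks_like_spam):
    """Move unseen INBOX messages that classify() flags into spam_folder.

    conn is a logged-in imaplib.IMAP4 or IMAP4_SSL connection.
    Returns the number of messages moved.
    """
    conn.select("INBOX")
    _status, data = conn.search(None, "UNSEEN")
    moved = 0
    for num in data[0].split():
        # BODY.PEEK[] fetches the message without marking it as read.
        _status, parts = conn.fetch(num, "(BODY.PEEK[])")
        msg = email.message_from_bytes(parts[0][1])
        if classify(msg):
            # Classic IMAP "move": copy, flag the original deleted, expunge.
            conn.copy(num, spam_folder)
            conn.store(num, "+FLAGS", "\\Deleted")
            moved += 1
    conn.expunge()
    return moved


def run_once(host, user, password):
    # Intended to be called from cron, e.g. every 5 minutes.
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        return filter_inbox(conn)
    finally:
        conn.logout()
```

Because everything happens over IMAP, the mail reader never needs to know the filter exists; it just sees a Spam folder filling up.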

"Bayesian filtering" aka "Naive Bayes" by ghamerly · 2002-11-14 06:27 · Score: 5, Informative

This approach is more commonly called "Naive Bayes" classification in the field of machine learning. It is naive because it considers each word to be a feature (dimension), but it also considers each word in an email to be conditionally independent of all other words in the document (which is not true, but really useful in practice).

The author of the web page on using this technique to classify spam (Paul Graham) has a better explanation of Naive Bayes on this web page.

I've written my own naive Bayes classifier to identify spam, with less positive results than he reports. However, naive Bayes can be a very effective technique, and I can believe his results.

The two things you have to beware of when using it are "smoothing" probabilities of words you've never seen (you don't want them to always be zero, as straight naive Bayes will give you), and you need LOTS of training data for naive Bayes to work well. That means that you need to already have a fair amount of spam to identify spam well.
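
To make the smoothing point concrete, here is a small textbook-style naive Bayes classifier in Python with add-one (Laplace) smoothing. It is a sketch under simple assumptions (whitespace tokenizing, two classes, made-up class and method names), not the algorithm from Paul Graham's essay or whatever Mozilla ends up shipping.

```python
import math
from collections import Counter


class NaiveBayes:
    """Two-class multinomial naive Bayes with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant; 1.0 is add-one smoothing
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, label, text):
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def _log_score(self, label, words, vocab_size):
        counts = self.word_counts[label]
        total_words = sum(counts.values())
        total_docs = sum(self.doc_counts.values())
        # Log prior plus a sum of log likelihoods: the sum is where the
        # "naive" conditional-independence assumption shows up.
        score = math.log(self.doc_counts[label] / total_docs)
        for w in words:
            # Smoothing keeps an unseen word from zeroing out the product.
            score += math.log(
                (counts[w] + self.alpha)
                / (total_words + self.alpha * vocab_size)
            )
        return score

    def classify(self, text):
        if min(self.doc_counts.values()) == 0:
            raise ValueError("need at least one training example per class")
        words = text.lower().split()
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        vocab_size = len(vocab) + 1  # one extra slot for never-seen words
        spam = self._log_score("spam", words, vocab_size)
        ham = self._log_score("ham", words, vocab_size)
        return "spam" if spam > ham else "ham"
```

Working in log space avoids underflow on long messages, and the need for lots of training data is visible here too: with only a handful of examples, the smoothing term dominates the counts.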

You can see a paper I wrote on using naive Bayes to classify hard drive failures here, or look for more stuff on naive Bayes on Google. Also, don't reinvent the wheel: Andrew McCallum has written a very good toolkit for doing these sorts of things in Bow.

Re:"Bayesian filtering" aka "Naive Bayes" by standards · 2002-11-14 07:13 · Score: 3, Interesting

Well, I certainly have a large volume of SPAM that I plan to use for training purposes. I'm not a big user of personal email, but somehow about 70% of all my incoming personal mail is SPAM. My Dad is much worse off.

I'm glad to see that the software industry is taking the SPAM problem seriously. And it's great to hear that more and more states, like Massachusetts, are enacting laws to curb the abuse of email systems.

I've been dependent on some static rules to curb SPAM (about 90% effective), but I think now it's time to implement more serious anti-spam measures.
Re:"Bayesian filtering" aka "Naive Bayes" by ceswiedler · 2002-11-14 07:21 · Score: 3, Interesting

Based on the last /. article on Bayesian filtering, I installed SpamProbe. I gave it a folder of about 70 spam emails, and a few hundred good emails I had in various folders. In the past few weeks, it's had one false negative, and a few false positives which were 'semi-spam' mailing list emails from Dell, RedHat, and Amazon. When I moved those emails into the 'recheck as good' folders, it learned its lesson.

It may be naive, but I was very surprised at how well it worked. It's better than SpamAssassin IMO, especially at foreign-language spam.

Client-Side Filtering is Wasteful by divide+overflow · 2002-11-14 06:27 · Score: 5, Insightful

Since you must first download the content for client-side filtering to work you waste bandwidth. If you are truly bombarded by spam you still lose...your mail spool still gets filled up with stuff you don't want, your data transfers compete for bandwidth with the spam, storage hardware works harder storing data that will only be deleted. It raises everyone's costs, including yours.

We need to block undesired mail at the host, not filter it at the client. That way the spam never gets sent, the spammer gets the message that their attempt was futile, and bandwidth is conserved. Many ISPs already provide this service...we need to improve on it. And we need better tools for identifying and dealing with spammers. The current mail standards are woefully inadequate to this task.

Re:Client-Side Filtering is Wasteful by Asprin · 2002-11-14 08:41 · Score: 2

We need to block undesired mail at the host, not filter it at the client. That way the spam never gets sent, the spammer gets the message that their attempt was futile, and bandwidth is conserved. Many ISPs already provide this service...we need to improve on it. And we need better tools for identifying and dealing with spammers. The current mail standards are woefully inadequate to this task.

Not that this would be practical or feasible, but suppose we designed a software system specification consisting of:

1) Decentralized database servers that communicate P2P-like to track and exchange statistics about what is spam and what is not....
2) Mail Server Plug-In/Filter that uses (1) to decide whether to deliver/mark/throw out mail based on a....
3) Mail Client Plug-In/Filter that receives mail from (2) according to a level of filtering you specify. Oh, and you can also vote on the mail that does get through to ID it as spam so the rest of the system gets its statistics updated from your misfortune.

Sound good? Ok, now GO WRITE IT!

--
"Lawyers are for sucks."
- Doug McKenzie
Re:Client-Side Filtering is Wasteful by divide+overflow · 2002-11-14 10:33 · Score: 2, Informative

1) Decentralized database servers that communicate P2P-like to track and exchange statistics about what is spam and what is not....

Like Vipul's Razor...

2) Mail Server Plug-In/Filter that uses (1) to decide whether to deliver/mark/throw out mail based on a....

Like SpamAssassin...

3) Mail Client Plug-In/Filter that receives mail from (2) according to a level of filtering you specify. Oh, and you can also vote on the mail that does get through to ID it as spam so the rest of the system gets its statistics updated from your misfortune.

Although this takes more effort due to the need to support a number of different mail clients it appears that this may be doable on some platforms using software that supports SpamAssassin.

interesting idea... by Lumpy · 2002-11-14 06:28 · Score: 5, Interesting

what if in addition to this someone put together a company that the mozilla email client can report back to about what is labelled as spam and the filters it created, along with the headers of the message (or even the entire spam), and grab filters from others who have received some spam that you have yet to receive? it would be like a big distributed computing anti-spam project.. then if we were able to make the filters usable by sendmail to block at the server...

I'm almost thinking a distributed and automated anti-spam system like that could completely crush the spam problem within a 12 month period.

or I may be completely out of my mind.

--
Do not look at laser with remaining good eye.

Re:interesting idea... by SandSpider · 2002-11-14 06:44 · Score: 3, Insightful

That's a really cool idea in theory. In reality, you have to trust that everybody on the internet is trustworthy enough to decide what your spam is and isn't.

I mean, you've been on the internet before, right? You've seen the other people here, too? Think about it.

=Brian

--
There is nothing so good that someone, somewhere, will not hate it.
Re:interesting idea... by Dunkirk · 2002-11-14 06:46 · Score: 3, Informative

It's called vipul's razor.

--
Acts 17:28, "For in Him we live, and move, and have our being."
Re:interesting idea... by Gothmolly · 2002-11-14 07:23 · Score: 2

It's called Vipul's Razor, you insensitive clod!

--
I want to delete my account but Slashdot doesn't allow it.
Re:interesting idea... by ceswiedler · 2002-11-14 07:37 · Score: 2

There are already efforts to do this. One is called Razor, IIRC. However, they haven't completely crushed the spam problem, and aren't likely to.

Teaching my computer by guacamolefoo · 2002-11-14 06:33 · Score: 2

I have enough problems teaching my one year old not to eat dog food. I don't think I really want to have to educate my email client about spam, and then continue to monitor it to make sure it doesn't fuck up.

The problem with spam (for me) is that I have to waste time dealing with it or my existing filters sometimes accidentally chew up a legit message (rarely). The basic plan Mozilla seems to be after doesn't really fix that for me.

I do like the idea of allowing anti-spam plug-ins. Having a variety of methods to choose from will let me decide what, if any, third-party solution works best for me.

guac-foo

--
Lots of petrified grits

Re:Teaching my computer by WolfWithoutAClause · 2002-11-14 06:53 · Score: 2

I've been running a Naive Bayesian filter for about 3 weeks now, and it's misclassified NO good email as spam, but it has marked about 10% of the spam as normal mail. As I understand it, my experience is entirely typical.
You still have to eyeball the subjects of the spam messages just in case, but the fact that you don't have to manually delete them is a very good thing- it does save quite a bit of time that they are already sorted; and as I said it hasn't fucked up once so far in that sense.
Also, I spend almost no time dealing with any filters. Once a week I take the pile of spam that it missed and tell it that it was in fact spam; it takes about 2 minutes and any similar spam will get spotted in future.
It's definitely a far more pleasant way to deal with the issue, and it does save time. It seems to be almost as good a scheme as any so far, and at the moment it looks unlikely that any scheme will outperform it by much; so it may well be 'good enough'.

--
-WolfWithoutAClause
"Gravity is only a theory, not a fact!"

Different technique by JediTrainer · 2002-11-14 06:33 · Score: 2

I have a different idea. Well, it's not my idea - I remember reading somebody describing it on /. some months ago and it seemed brilliant.

The original idea described setting it up on the server side, but this should work on the client side as well, and might be a good candidate for a Mozilla mail filter plugin:

1 - download new message headers from server

2 - Compare 'from' email addresses to list of known people you accept email from. Only download mail from known senders.

3 - if email comes from an unknown party, email them with instructions to reply to your message, and put some word in the subject line (e.g. activate). The word should be randomized to eliminate the spammer's chance of guessing it.

4 - if a message header is found with that subject from that sender, the sender can be automatically added to the 'known' list and the mail is downloaded

5 - if no further message received from that sender, delete their messages within X days (or download it and put in 'spam' mailbox just in case)

6 - user has capability of adding new 'known' senders, plus ability to blacklist senders who have authenticated (persistent spammer).

I can't think of any loopholes here - it seems that this might solve just about every spam problem I've ever come across. No reason why this can't be implemented on the client side (especially if you don't have control over the server). Any takers?
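
The steps above can be sketched as a small state machine in Python. Class and method names are hypothetical, sending the challenge email is left to the caller, and the "delete after X days" expiry from step 5 is omitted to keep it short.

```python
import secrets


class ChallengeResponseFilter:
    """Header-level decisions for a whitelist-plus-challenge mail filter."""

    def __init__(self, known_senders=()):
        self.known = set(known_senders)
        self.blacklist = set()
        self.pending = {}  # sender -> randomized challenge word

    def handle(self, sender, subject):
        """Return 'deliver', 'challenge', 'hold', or 'reject'."""
        if sender in self.blacklist:
            return "reject"
        if sender in self.known:
            return "deliver"                  # step 2: known sender
        token = self.pending.get(sender)
        if token is None:
            # Step 3: first contact; the caller mails this token back.
            self.pending[sender] = secrets.token_hex(4)
            return "challenge"
        if token in subject:
            # Step 4: the sender proved a human read the challenge.
            del self.pending[sender]
            self.known.add(sender)
            return "deliver"
        return "hold"                         # step 5: still waiting

    def approve(self, sender):
        # Step 6: the user adds a sender by hand.
        self.pending.pop(sender, None)
        self.known.add(sender)

    def block(self, sender):
        # Step 6: blacklist a sender who authenticated but still spams.
        self.known.discard(sender)
        self.blacklist.add(sender)
```

Note that this only looks at the From address, which spammers can forge, so a forged address that happens to be on the known list would sail through.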

--

You can accomplish anything you set your mind to. The impossible just takes a little longer.

Re:Different technique by olethrosdc · 2002-11-14 06:55 · Score: 2

This can lead to storage problems and a possible race condition. What about parties that do not reply? How long will they be kept in your 'list-of-people-I-have-sent-an-automatic-reply-to' ? What if you get a 'I-could-not-deliver-your-message' type of message? Automatically reply to that.. and .. hey.. you started a loop.

--
I miss my rubber keyboard.(Homepage)
Re:Different technique by q2k · 2002-11-14 06:56 · Score: 2

You can do all of this today with Pocomail (on a Windows box). I already filter against my address book and specific "to" addresses for maillists. I haven't bothered to set up the automatic reply, but it could be easily done with Poco's native scripting capability.
Re:Different technique by JediTrainer · 2002-11-14 08:05 · Score: 2

This can lead to storage problems and a possible race condition.

Obviously some of the issues need to be worked out, but I'm convinced that it can be done...

What about parties that do not reply? How long will they be kept in your 'list-of-people-I-have-sent-an-automatic-reply-to' ?

That can be easily configured by the user. I would suggest 3 days, but again, give the user the ability to add names to the list themselves (thus, making their messages visible).

What if you get a 'I-could-not-deliver-your-message' type of message? Automatically reply to that.. and .. hey.. you started a loop.

That's simple enough to do with a little coding. Only send one of those per address within a specified time. For example, I would think that only one autoreply would be necessary within a week's worth of time.
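
That "one autoreply per address per week" rule is only a few lines. A minimal Python sketch, with a hypothetical class name and the one-week window taken from the suggestion above:

```python
import time


class AutoReplyLimiter:
    """Allow at most one automatic reply per address per time window."""

    def __init__(self, window_seconds=7 * 24 * 3600):
        self.window = window_seconds
        self.last_sent = {}  # address -> timestamp of the last auto-reply

    def should_reply(self, address, now=None):
        now = time.time() if now is None else now
        last = self.last_sent.get(address)
        if last is not None and now - last < self.window:
            return False  # replied recently; stay quiet to avoid a loop
        self.last_sent[address] = now
        return True
```

If a bounce message comes back from the same address, `should_reply` returns False and the loop dies after one round trip.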

--

You can accomplish anything you set your mind to. The impossible just takes a little longer.

Not impressed by macdaddy · 2002-11-14 06:35 · Score: 4, Interesting

Well, ok I am impressed that Mozilla is implementing spam filtering abilities in their MUA. I AM NOT impressed with Bayesian spam filters AT ALL. I've been using Mac OS X's Mail.app since I switched to OS X. It's not my primary MUA but I am letting it POP out a copy of all my mail and "learn" from it. It does a pretty good job of finding maybe 80% of the spam I get. However it has a BAD false-positive rate. I mean hell it's been flagging CERT advisories as spam. That kind of crap is really annoying. It's flagged co-workers' mail as spam numerous times (and even though I happen to agree... :) ). The biggest problem I have with Bayesian as a mail admin is that I am constantly dealing with spam. Users forward it to me. I receive a number of spam bounces. I work in spam all the damned time. That's the problem. I need an MUA with Bayesian filters that are smart enough for me to tell them to ignore all mail from certain domains or that went to certain accounts. All of the Bayesian filters built into MUAs I've worked with so far can't do things like that. It's really annoying given the position that I'm in.

Re:Not impressed by self+assembled+struc · 2002-11-14 07:41 · Score: 3, Informative

if i'm not mistaken you can edit the SPAM rule in mac os 10.2 mail and add additional properties to its rules.

the default is "if not in address book and it's SPAM" send to SPAM folder.

you should be able to add a property to that rule that says

"if not in address book and FROM: doesn't contain XYX.COM and it's SPAM" send to SPAM folder

you just add the properties before the SPAM one.
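
In code terms the rule is just a chain of conditions with the spam verdict checked last. A rough Python sketch, not Mail.app's actual rule engine: the function and parameter names are invented, and it matches on the sender's domain rather than a substring of the From line.

```python
def should_move_to_junk(sender, flagged_as_spam, address_book, exempt_domains):
    """Apply the exemptions first, then fall back to the spam verdict.

    address_book and exempt_domains are sets of lowercase strings.
    """
    if sender.lower() in address_book:
        return False                       # "if not in address book"
    domain = sender.rsplit("@", 1)[-1].lower()
    if domain in exempt_domains:
        return False                       # "FROM: doesn't contain ..."
    return flagged_as_spam                 # "... and it's SPAM"
```

Putting the exemptions before the spam check is what keeps a trusted domain out of the junk folder even when the Bayesian filter misfires on it.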
Re:Not impressed by Knobby · 2002-11-14 08:20 · Score: 2

I've noticed a few false positives with Mail.app, but that's probably because I'm doing exactly what you just suggested.

Every time there's an article on /. about SPAM, there's a bunch of posts with filter definitions. I generally end up looking through those posts and adding a few new rules to my list. In the last 3 months I've accrued just under 70 messages in my junk mail folder. That's a small percentage of the 100's that are being trashed.
Re:Not impressed by tbmaddux · 2002-11-14 08:50 · Score: 4, Interesting

However it has a BAD false-positive rate. I mean hell it's been flagging CERT advisories as spam. That kind of crap is really annoying. It's flagged co-workers' mail as spam numerous times..
I had this problem early on as well. I fixed it by marking the false positives as "Not Junk." You can do this even when it's in "Automatic" mode as opposed to "Training." All "Automatic" does is enable the filter that sends the marked messages to the "Junk" folder.
But it still learns in either mode! Early on my shipping notices from Amazon.com (and even Apple.com, ha ha) were being flagged as Junk, but not anymore. I think it's great and will only improve with time, with others' caveats about client-side email spam checking being flawed noted.

--
Can't you see that everyone is buying station wagons?
Re:Not impressed by The+Raven · 2002-11-14 12:13 · Score: 2

That's what whitelists are for. Whitelist the people/domains that forward you legitimate spam.

--
"I will trust Google to 'do no evil' until the founders no longer run it." Hello Alphabet.

Emacs! by MosesJones · 2002-11-14 06:36 · Score: 4, Interesting

This is something that Emacs has in the GNUS client, you score emails up and down and it starts adding filtering rules. Using LISP you could extend this to do some pretty funky moderating.

Every problem is reducible to a previously solved problem or by definition is unsolvable - Church Turing Thesis.

--
An Eye for an Eye will make the whole world blind - Gandhi

Spellchecker for Mozilla by WankersRevenge · 2002-11-14 06:36 · Score: 2

I use Mozilla for my mail. I installed a spellchecker I believe from Mozdev. It's pretty good and can be found here

The ultimate filters by TigerTime · 2002-11-14 06:36 · Score: 5, Insightful

There needs to be a tiered structure with filters. The main one would be at the ISP level. It would only filter out obvious spam (like spam going to 2000 users at that ISP). The second tier would be at the client side and would have a certain level of intelligence in identifying spam. One feature that I'd like (it might already be available) is if it could automatically send an email back to the sender saying the email address doesn't exist. This should be done at the server level and/or client level. This could possibly help in removing your email from such lists. As far as what to do with the spam at the client level, I think that it should be sent to your main inbox but just marked as spam (maybe greyed out or something). Like new mail is always bold and once you read it it goes to a regular font. Well, spam could be just greyed out. That way you would never miss something that the spam filter had a false hit on.

Re:The ultimate filters by Reckless+Visionary · 2002-11-14 08:49 · Score: 2

if it could automatically send an email back to the sender saying the email address doesn't exist
I love! this feature of some email progs, notably Mail on OS X, called "Bounce to Sender". Admittedly, there is the annoyance that you almost always receive another server message telling you that your bounce email was sent to an address that doesn't exist, but in those cases where you don't, it may help get off some spam lists.

--
I think I'll stop here.
Re:The ultimate filters by kalinh · 2002-11-14 10:41 · Score: 2

fastmail.fm, which has become one of my favorite companies on the net, has a bounce feature on their webmail interface which brings one no end of joy bouncing stuff back (even though most From addresses are bogus). They also use SpamAssassin on their premium accounts, which doesn't delete the mail but simply adds an X-Spam: (or some such) header; you can filter it however you like after that.
Accessing my mail through IMAP with Evolution, I'm a big fan of doing exactly what you said: basically testing for the spam header and displaying the mail in a different color or moving it to an alternate folder (I'm super paranoid about false positives, although I've never seen one with SpamAssassin).
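
For anyone who wants to do the header test outside a mail client, here is a minimal Python sketch. The header name is an assumption: the post above only says "X-Spam: (or some such)", so the sketch defaults to X-Spam-Flag, which SpamAssassin sets to YES on tagged mail, and lets you pass a different one. The folder names are made up for illustration.

```python
from email import message_from_string


def folder_for(raw_message: str, header: str = "X-Spam-Flag") -> str:
    """Pick a destination folder from a server-added spam header.

    The message is never deleted, only routed, so a false positive
    can still be found in the alternate folder.
    """
    msg = message_from_string(raw_message)
    value = msg.get(header, "")  # "" when the header is absent
    if value.strip().upper().startswith("YES"):
        return "Possible-Spam"  # hypothetical folder name
    return "INBOX"
```

The same test could drive a color change instead of a move; the point is that the server only tags and the client decides.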

--
Metamuscle.com - News in the Iro

SpamCop! by JediTrainer · 2002-11-14 06:38 · Score: 3, Insightful

How about a spamcop-like plugin? Or something that can submit my message plus contents to SpamCop?

If using SpamCop, there should be a way to still show the site's banners, because they deserve to get paid for their bandwidth I'm using up.

I'd love to just be able to right-click on a message and report it to the various abuse/postmaster accounts without having to copy my whole message plus headers, and pasting such into their web form. SpamCop seems to be pretty good at tracing the origins of messages, so I'd love to be able to leverage that sort of functionality.

--

You can accomplish anything you set your mind to. The impossible just takes a little longer.

"Junk Mail" Button by kstumpf · 2002-11-14 06:41 · Score: 2

I've used this ancient mail client called Calypso for years now. One of the reasons I continue to use it is its filtering capabilities. It has a good interface, it's very configurable (you can control if the message is deleted locally, remotely, marked read, lots more), and it has a "Junk Email" button. Click on an email, hit the junk button and it deletes it and creates a filter for any more messages like it. One click and the mail is gone from my mailbox entirely and I don't get any more.

Mozilla Mail has decent filtering, but it needs more options and it needs to be more accessible before I can use it.

Re:How? by mark_lybarger · 2002-11-14 06:45 · Score: 5, Informative

Preferences -> Privacy & Security -> Images, you can turn off images in mozilla, or only in mail/news.

Hmm, my spam experiences by krappie · 2002-11-14 06:47 · Score: 5, Interesting

I personally don't really care about all the junk emails I get. I don't get that many, and I can pretty much tell without looking at them. They go straight to /dev/null.

Spam is such a horrible thing though. I work at a webhosting company. I'm the one that has to track down the site with the old formmail.pl, removing 'aol.com' and 'yahoo.com' from the hosts to relay for, trying to find out who the hell added them so I can murder them. I'm the one clearing out the mail queue with 100,000 mails. I'm the one clearing the mail queues of people who thought it was a good idea to check the 'open relay' option in plesk. I'm the one that has to deal with people bitching about how their mail isn't working or didn't get through.

Just the other day, I had a raq2 where someone had apparently put yahoo.com and excite.com in the hosts to relay for. Yay! That's what attracted the spammers. Now I get a request every second to send mail to 50 people at once. Now that I've removed them, none of them are getting through. But it's a raq2, 133 MHz. It has to go through all 50 addresses and say 'relaying denied' and log it. It can't keep up! syslogd is taking up all the CPU and logging things from hours ago because it's behind. Quickly, sendmail quits listening on port 25 (but the spam attempts keep coming somehow).

So I get the idea to block their IPs, since they seem to be using the same IPs. But oh guess what, they're using open proxies and have about 400 IPs. Well, I did this for about 5 hours, writing scripts to grab the repeated IPs out of the maillog, adding them all to my sendmail access lists. Now every time they try to send mail, it blocks them instead of saying relaying denied 50 times for each request. But a minute later, I get a few new IPs and it starts all over again. I have an access list about 6 pages long. It's doing OK, blocking about 90% of them, but every once in a while, they get a new IP and sendmail is brought to a stop.
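
A rough Python sketch of that kind of script, for illustration only. It assumes the maillog lines for refused relays contain the text "Relaying denied" and the client address in square brackets (as in "relay=[10.0.0.1]"); the real log layout on a given box may differ, and the min_hits threshold is an arbitrary choice. The output lines use the plain "address, tab, REJECT" form of a sendmail access file entry.

```python
import re
from collections import Counter

# Dotted-quad address inside square brackets, e.g. "[10.0.0.1]".
IP_RE = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def repeat_offenders(log_lines, min_hits=3):
    """Return IPs that were refused relaying at least min_hits times."""
    hits = Counter()
    for line in log_lines:
        if "Relaying denied" not in line:
            continue  # only count refused relay attempts
        match = IP_RE.search(line)
        if match:
            hits[match.group(1)] += 1
    return sorted(ip for ip, count in hits.items() if count >= min_hits)


def access_entries(ips):
    """Format IPs as access-file lines: address, a tab, then REJECT."""
    return [f"{ip}\tREJECT" for ip in ips]
```

In use you would feed it the open maillog, append the entries to the access file and rebuild the access map. As the post says, this only helps until the spammer rotates to fresh addresses.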

Oh yeah, and my /var/ partition is only 200 MB, with 50 MB free. And the maillog is growing at about 10 MB a day. So now I'm babysitting this server every day until the spam attempts stop. I don't think there's any way around it unless I get sendmail to check for open proxies. But I don't know how to do that, and I don't think they trust me enough to make such changes to sendmail.

So oh well, mail is getting lost every day on this server and it's been rendered horribly slow for its users... just because some moron noticed it would send some emails for him and started up his scripts.

Spam causes so many problems on the server level. It's what is making mail an unreliable service. I could care less about spam filters on my mail client. These are the things that make spam evil!

your .sig by Anonymous Coward · 2002-11-14 06:52 · Score: 5, Funny

--- Does the name Pavlov ring a bell?

Two brothers immigrated to a mostly Catholic country, hungry and looking for work. Pavlov, whose forehead was quite thick, found work at a monastery bell tower. The monks taught him to tell time, then sound the bell when appropriate. Not too bright, Pavlov missed the part about how to sound the bell. So he notes the time on his handy wristwatch, climbs the belltower, inches up to the edge of the platform, and dives face first into the massive centuries-old bell. KKKLLLAAANNNGGG!!! Poor Pavlov falls to his death hundreds of feet below.

Apparently, monks don't communicate very well. No one in the crowd gathered around Pavlov's remains could identify him. Finally one monk admits, "I never caught his name, but his face sure rings a bell."

Mysteriously, a man steps forward from the crowd and insists on taking Pavlov's place as caretaker of the belltower. One of the monks removes the wristwatch from Pavlov's arm, gives it to the mystery man, and proceeds to indoctrinate him in his duties. On the hour, just like Pavlov, our mystery man ascends the tower, perches on the edge -- but this time wielding a massive sledgehammer. He leaps towards the bell and smashes it with Thor-like fury. KKKLLLAAANNNGGG!!! The poor fool falls to his death in a manner very similar to Pavlov's.

Much like deja vu, a muted crowd gathers around the mystery man's remains. After an extended silence, one monk asks, "Does anyone know this man's name?" Answers another, "No, but he's a dead ringer for his brother!"

Won't anyone PLEASE think of the popup advertisers by Tired_Blood · 2002-11-14 06:53 · Score: 5, Funny

However, I've heard that popup blockers and tabbed browsing are making their way into IE (and MS employees can already use these features)

IE is the most widely used browser and pop-up advertising has become part of the Internet Experience. If MS decides to incorporate popup blocking in IE, then the pop-up advertising business is RUINED! They'll just be another group victimized by a huge corporation. These people have families to support and will be forced to send their children to public schools. Won't someone PLEASE think of the children?

And all this news about fixing vulnerabilities within Windows is going to affect the virus community as well (both authors and anti-virus). Worrying about vulnerability exploits has also become part of the computer experience.

Won't someone PLEASE think of the virus writers?

--
This is not my sig.

Re:MSN 8 rules, Mozilla Sucks by Malc · 2002-11-14 06:55 · Score: 2

$120/yr? I paid Yahoo $20 for a year. 90% of my spam has the header X-YahooFilteredBulk. My mail server ditches all that for me. I think you've been had by MSFT's marketing.

Re:Its still too slow... by casio282 · 2002-11-14 06:55 · Score: 3, Informative

IE starts up quickly in Windows because it is loaded into memory at system start up and runs in the background. When you "start" the program you are simply creating a new browser window. So you suffer the program start-up overhead when the system boots, instead of each time you create a new instance.

The good news, for those inclined to sacrifice system performance for quick browser load times, is that this option is also available in Mozilla... Look under "Preferences...Advanced" for the Quick-launch option.

--

:wq

Personalised solution by 5lash · 2002-11-14 06:56 · Score: 2, Interesting

I personally don't think that systems like this can work that well. Everyone seems to get different types of spam, and your best bet is to create your own filters. About 80% of my spam messages have weird foreign characters in them (like Á), so I've got filters in Eudora to delete anything with one of these characters in the Subject or Body. Then obviously anything with "porn", "sex" etc., although spammers don't seem that stupid anymore. This way I only get 5-10 spam messages in my inbox per day, maximum. And this takes me about 20-30 seconds to deal with, so I don't see what all the fuss is about.

--
Everything sucks except musicandstuff

That's great, but... by hawkbug · 2002-11-14 06:57 · Score: 2, Interesting

I'm running a sendmail server, and I access via webmail accounts, pine, and Mozilla. I would like to add this new type of spam filtering to sendmail directly. Does anyone know if this is something that can be added to sendmail, rather than a specific mail client like Mozilla?

Real spam control.. by grub · 2002-11-14 07:00 · Score: 3, Interesting

.. should start at the server preventing the offending mail from ever coming into the network in the first place.

Not that localized spam filters are a bad thing (they aren't!) but refusing connections from known spammer IPs and the proper use of blacklists would cut down on a lot of the email traffic. Once the spam is in your inbox, it's just an annoyance to you. The cost to the net has already been incurred.

--
Trolling is a art,

Re:Real spam control.. by William+Tanksley · 2002-11-14 09:07 · Score: 2

Yes, it'd be wonderful to stop spam from ever being published. It'd save a lot of money. But it's impossible.

refusing connections from known spammer IPs and the proper use of blacklists would cut down on a lot of the email traffic.

NO. Blacklists are a horrible, horrible non-solution. Once an IP address is on a blacklist it's almost impossible to get it off, so it's useless -- so the spammer just gets a new one, and lets the old one rot. So it doesn't even slow them down!

For arguments against blacklists, see
http://www.paulgraham.com/falsepositives.html

Those things are not just bad, they're REALLY bad.

I wouldn't at all object to a mail relay running its own simple mail filter to reduce the load on other machines -- but it'd better not EVER, ever have a false positive. And honestly, that's the point I just can't believe.

-Billy

It learns from the spam you receive... by Aquillion · 2002-11-14 07:00 · Score: 5, Funny

"...good morning, Dave. You have received spam again. I have been analyzing the spammer's patterns, and I believe I have figured out the most efficient way to protect humans from the harm of spam while adhering as closely to the First Law as possible. To protect them from spam, humans must be pushed. They must go down the stairs. Please go stand by the stairs, so I can protect you."

Spam filters should bust the spammers, also. by Futurepower(R) · 2002-11-14 07:04 · Score: 5, Interesting

Software that only does mail filtering encourages spammers. The technically knowledgeable people don't get spam, so they stop worrying about it.

All mail filters should also use a service like SpamCop, so that the spammers lose their internet service accounts as the spam is filtered.

I send Spamcop all my spam. Spamcop analyzes it automatically and sends a message to the Internet Service Provider. I use the free Reporting only service.

But will it be in Evolution? by mshiltonj · 2002-11-14 07:17 · Score: 4, Informative

I may drop Evolution in favor of Mozilla Mail.

I tried to find out if the Evolution dev team was going to do this. The only thread I could find on the topic is here:

http://lists.helixcode.com/archives/public/evolution/2002-August/020845.html

Doesn't look like it's part of their vision.

--
Software Wars

My Problem with Mozilla sorta OT by pneuma_66 · 2002-11-14 07:18 · Score: 3, Insightful

I love mozilla, and use it as my main browser. However my biggest complaint is that all the components (browser, mail, composer, etc) should be separate apps. I don't like the fact that if my browser crashes, so does my email reader, and vice versa.

I tried to find some documentation on how to achieve this; however, there was none to be found. Does anyone know how to do this, so that I can use Mozilla's mail rather than the flaky mail app that comes with OS X?

Re:My Problem with Mozilla sorta OT by RazzleDazzle · 2002-11-14 07:45 · Score: 2

Use another mail client unassociated with Mozilla. If you feel adventurous, try mutt.

--
ZERO ZERO ONE ZERO ONE ZERO ONE ONE! Just brushing up for my next big invention: Ethernet over Voice (EoV)
Re:My Problem with Mozilla sorta OT by wizarddc · 2002-11-14 08:52 · Score: 3, Informative

There are people working on this. Currently, Phoenix is the browser-only app. It's lean, quick, and efficient. Bugs are still being worked out, but it's very usable right now. Also, K-Meleon is a browser that uses the Gecko rendering engine, but not the Mozilla XUL interface.

As for email/news clients, there are two, I believe: Thunderbird and Minotaur. Neither is available to use yet.

--
Th

Here's the link by ChrisCampbell47 · 2002-11-14 07:22 · Score: 3, Informative

A dozen or more replies and yet no link to it .. OK, I'll spend the 1.5 minutes posting it ...

101 things that the Mozilla browser can do that IE cannot

--
One simple rule for its versus it's

tmda.net? by Sludge · 2002-11-14 07:27 · Score: 3, Interesting

Has anyone tried Tagged Message Delivery Agent out? I would be curious to hear the mileage of others who have tried this.

Essentially, it throws the parsing problem right back in the spammer's faces: They must answer a fuzzy logic question in order to get into your inbox once and for all. It is similar to challenge/response routines in network connection code to prevent spoofing. The most interesting part from the intro:

The way TMDA thwarts incoming junk-mail is simple yet extremely effective. You maintain a "whitelist" of trusted contacts which are allowed directly into your mailbox. Messages from unknown senders are held in a pending queue until they respond to a confirmation request sent by TMDA. Once they respond to the confirmation, their original message is deemed legitimate and is delivered to you.
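
The flow in that intro can be sketched in a few lines of Python. This is a toy model, not TMDA's actual code: the class and method names are invented, confirmation requests are represented by a random token rather than a real email, and adding a confirmed sender to the whitelist is my own assumption about how "once and for all" would work.

```python
import secrets


class ChallengeResponseGate:
    """Toy whitelist plus pending-queue model of challenge/response mail."""

    def __init__(self, whitelist=()):
        self.whitelist = {addr.lower() for addr in whitelist}
        self.pending = {}  # token -> (sender, message)

    def receive(self, sender, message):
        """Deliver mail from trusted senders; hold and challenge the rest."""
        sender = sender.lower()
        if sender in self.whitelist:
            return ("deliver", message)
        token = secrets.token_hex(8)  # would be mailed back to the sender
        self.pending[token] = (sender, message)
        return ("challenge", token)

    def confirm(self, token):
        """Release a held message when its sender answers the challenge."""
        entry = self.pending.pop(token, None)
        if entry is None:
            return None  # unknown or already used token
        sender, message = entry
        self.whitelist.add(sender)  # assumption: confirmed senders are trusted
        return message
```

A spammer with a forged From address never sees the token, so the held message just sits in the pending queue until it is expired.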

Bayesian filters to me, seem to work if you are a dull person without many changes in your life. For ex, if you constantly get spams with the word Madam in it and you later on get a sex change, you will need to recalibrate your filters. (Probably not the most pressing thing on your mind, so you'd lose a few authentic mails.)

Just some thoughts.

Re:tmda.net? by scrytch · 2002-11-14 08:12 · Score: 2

Essentially, it throws the parsing problem right back in the spammer's faces: They must answer a fuzzy logic question in order to get into your inbox once and for all.

As someone who does email support, lemme tell you just HOW MUCH I love those fucking responders. Nothing like deliberately making my job harder. Sure would be nice if these services would automatically whitelist addresses you sent mail to first.

--
I've finally had it: until slashdot gets article moderation, I am not coming back.
Re:tmda.net? by William+Tanksley · 2002-11-14 09:13 · Score: 2

Essentially, it throws the parsing problem right back in the spammer's faces: They must answer a fuzzy logic question in order to get into your inbox once and for all.

It also throws that problem in the face of everyone who wants to communicate with you. That rather clearly indicates an arrogant attitude to most people, and definitely is a barrier to communication.

Bayesian filters to me, seem to work if you are a dull person without many changes in your life.

Wrong. Bayesian filters _learn_, they're not static.

For ex, if you constantly get spams with the word Madam in it and you later on get a sex change, you will need to recalibrate your filters.

So, you have no idea how Bayesian filters work. You seem to have them confused with regexp filters, and really stupidly configured ones, at that. (Imagine throwing away an email because it contained a /single/ slightly negative word!)

-Billy

Dealing with Spam by mabu · 2002-11-14 07:30 · Score: 2, Insightful

I am completely against all client-based spam filters. They do essentially nothing to address the most serious repercussion of spamming, which is the exploitation of third-party networks and bandwidth. Besides, client-based spam filtering is most likely the least effective solution, and more likely to stop legitimate mail than other methods such as blocking known spam relays.

Ultimately, the only way we're going to really curtail spam is by enacting harsh *criminal* penalties for mail relay and server hijacking, which is the standard method by which most spam is distributed. It's true that these activities are already considered illegal but the law enforcement agencies are either unwilling to take action because there's a minimum threshold of monetary damages required, or they're ill-equipped knowledge and technology-wise to aggressively go after these people.

And Puleeze don't even bother with the ineffective, "let the industry regulate itself" argument, which doesn't work. Most spammers are small "cell groups" that move around a lot; most don't have any money in the first place; only criminal penalties are going to work, and client-side and industry regulated efforts don't stop their efforts at all and just drive bandwidth charges up for the rest of us.

Sort by Spam Probability by Krellan · 2002-11-14 07:38 · Score: 5, Insightful

It seems too many people distrust spam filters because of the chance of accidentally blocking an important legitimate message as if it were spam.

Many spam filters are strictly binary: a message is either spam, or not spam. This is not ideal, because "gray area" messages - between these two extremes - will likely not be sorted correctly.

I propose adding a new sort option to email clients.

Sort by Spam Probability

This would be an additional field that can be displayed in a message list, similar to "To", "From", "Subject", and the like. As in the article, probabilities would range from 99% (almost certain spam) to 1% (most likely an innocent message). Notice that 100% accuracy either way is not claimed.

This way, the user can see up front the messages that are most likely not spam. The spam messages will be relegated to the bottom of the list, possibly colored to indicate their likelihood of being spam. If there is a message in the "gray area", it will most likely appear in the list between the legitimate messages and the spam, so the user will have a chance to see the message and make a decision, without the message being lost in the shuffle.

This would be a great feature. I hope this gets into Mozilla's mail client.
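
To make the idea concrete, a minimal Python sketch of such a sort. The Message class and its spam_probability field are hypothetical stand-ins for whatever the client's filter actually produces; the clamp to the 1%-99% range mirrors the point above that certainty is never claimed.

```python
from dataclasses import dataclass


@dataclass
class Message:
    subject: str
    spam_probability: float  # hypothetical score from the client's filter


def clamp(p, low=0.01, high=0.99):
    """Keep a probability inside [low, high]; never claim 0% or 100%."""
    return max(low, min(high, p))


def sort_by_spam_probability(messages):
    """Least spammy first, so gray-area mail lands between the extremes."""
    return sorted(messages, key=lambda m: clamp(m.spam_probability))
```

Because Python's sort is stable, messages with equal scores keep their existing order (for example by arrival time).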

(BTW, another feature that would be great to see in mail clients would be datestamping of the actual time the message was downloaded. Many spammers, and innocent people with misconfigured clocks, send emails with wild dates that are not to be trusted. You can see this in yearly archives of GNU "mailman" mailing lists! Datestamping emails as they are downloaded will also keep mailboxes in order when sorted by date, as newly arrived messages will always be at the bottom, instead of being scattered throughout the inbox. But sorting by spam probability will probably become more popular than sorting by date....)

--

Dr. Demento On The 'Net!

Re:Sort by Spam Probability by ghamerly · 2002-11-14 09:23 · Score: 2, Interesting

Since naive Bayes gives probabilities, this is easy to get out of what Mozilla (and Paul Graham, and others) are trying to do. However, it is well-known that the probabilities that naive Bayes classifiers give are typically exaggerated (too close to either 0 or 1). This is partly because of the naive assumption (conditional independence of features).

However, while the probabilities themselves may be exaggerated, they are also usually found to be ranked correctly, which would give you what you want here -- a ranked list of possible spams.
Re:Sort by Spam Probability by brw215 · 2002-11-14 09:29 · Score: 2, Informative

Actually, in theory you could "set" a threshold for SPAM detection with a Bayes filter.

Bayes' theorem is something like this (note the Pr(mail) term in the denominator is dropped, since it is the same for both labels):

Pr("SPAM" | mail) = Pr(mail | "SPAM") * Pr("SPAM")
vs.
Pr("LEGIT" | mail) = Pr(mail | "LEGIT") * Pr("LEGIT")

A Bayes classifier always picks the label (spam, not spam) with the higher probability, i.e. it compares

Pr("SPAM" | mail) vs. Pr("LEGIT" | mail)

The spread between these two numbers defines the "certainty" that any given mail is in fact SPAM. You could either sort your incoming mails by this spread or color the almost definite ones red, the most likely ones yellow, etc.
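
A small Python sketch of that comparison, under assumptions of my own: words are split on whitespace, word likelihoods use add-one smoothing, and everything is done in log space so the spread is a difference of log scores rather than of raw probabilities. It also assumes at least one training example for each label. This is a toy, not what Mozilla or any particular filter implements.

```python
import math
from collections import Counter

LABELS = ("SPAM", "LEGIT")


def train(docs):
    """docs: iterable of (label, text). Returns (word_counts, doc_counts)."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for label, text in docs:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts


def log_score(label, text, word_counts, doc_counts):
    """log( Pr(mail | label) * Pr(label) ) under the naive assumption."""
    vocab = set(word_counts["SPAM"]) | set(word_counts["LEGIT"])
    total = sum(word_counts[label].values())
    score = math.log(doc_counts[label] / sum(doc_counts.values()))
    for word in text.lower().split():
        # Add-one smoothing so an unseen word does not zero the product.
        score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
    return score


def spread(text, word_counts, doc_counts):
    """Positive means SPAM is the more probable label; size is certainty."""
    return (log_score("SPAM", text, word_counts, doc_counts)
            - log_score("LEGIT", text, word_counts, doc_counts))
```

Sorting the inbox by spread, or coloring by thresholds on it, gives the ranked list described above.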

Spamassassin on Windows by TheSync · 2002-11-14 07:50 · Score: 2

Dare I say it, my wife's work uses Windows desktops. She answers an email address that gets several hundred spams per day. She is trialing SpamAssassin Pro with Outlook, it seems to be doing good so far.

SpamAssassin Pro also has an enterprise version for Exchange, but I can imagine a lot of Exchange admins fearing fooling around with it too much.

Re:Spamassassin on Windows by Reckless+Visionary · 2002-11-14 08:59 · Score: 2

Again, this is for POP3 accounts. I'd love to try it, but it needs to have IMAP support.

--
I think I'll stop here.

Bayes filters can't adapt to text in images by DuSTman31 · 2002-11-14 07:50 · Score: 4, Insightful

As a popfile user, I'm quite impressed with the catch rate possible with Bayes' theorem spam filters; however, I suspect this will decrease in effectiveness over the long term.

Spammers are likely to respond to filters like this by encoding text in ways the filters can't read but humans can (e.g. having a .gif file of the text, loaded by an HTML statement in the message).

Statistical filters would need some kind of built-in OCR routine before they could be effective against that trick, and some respectable mailing lists use images as well, so you can't just filter all mails with images attached.

In the long term, therefore, I suspect that filters that use a network database of spam will be more successful.

brain fart... block HTML in e-mail? by Micah · 2002-11-14 08:12 · Score: 3, Insightful

The big problem with this is spam still gets to the server. :(

Just thought of this now... but it seems like almost all spam these days contains a whole bunch of HTML tags. Maybe someone should write a server plugin to instantly reject all mail containing , instantly adding the sending IP to a iptables DROP rule.

There's little legitimate e-mail with tables, unless you count paypal, datek, and travelocity news and that kind of crap. But we could always add a list of "good" IPs.

I know there are server solutions, but all make me a bit queasy. I just want something that will detect funky activity on the fly and instantly deny all access to that IP.

DOH: block TABLES in e-mail by Micah · 2002-11-14 08:14 · Score: 2

GAAAAA that sure came out wrong! Slashdot apparently dropped my inclusion of the HTML [table] tag in the text and subject. That's what I meant, NOT all HTML e-mail!

Re:DOH: block TABLES in e-mail by Darby · 2002-11-14 17:46 · Score: 2

Slashdot apparently dropped my inclusion of the HTML [table] tag in the text and subject. That's what I meant, NOT all HTML e-mail!

Only a couple of spams make it to my inbox, and only a few legit mails go to my trash. (I do a quick scan before I empty it).
I have one mail filter that drops anything containing html.

YMMV, but it works pretty well for me.

Server-side filtering by scarhill · 2002-11-14 08:28 · Score: 3, Informative

The big problem with Bayesian server-side filtering (as opposed to rule-based tools like SpamAssassin) is that Bayesian filtering requires a UI. The user must classify email as spam/not-spam to provide fodder for the filter. Having that UI in the mail client is the right thing to do. It would be nice if there were some protocol that the client could use to communicate that info to a server-side filter, but AFAIK no such protocol exists.

So client-side seems like the right place for bayesian filtering right now.

Thanks for pointing out the obvious! by Mustang+Matt · 2002-11-14 09:11 · Score: 2

A feature request has been filed:
Mozilla feature request

(bugtracker sure is slow today!)

--
The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin

My Bayesian Adventures by unorthod0x · 2002-11-14 09:12 · Score: 3, Insightful

After collecting 87 megs worth of spam and a similar amount of non-spam I decided to implement the so-called 'Bayesian' method of spam filtering by way of popfile - it's a pretty slick concept; Perl code that acts as a POP3 server on your own machine - simply drop your collected spam and non-spam in to the appropriate bucket, have popfile go through them and create its indices and set up your mail client to connect to 127.0.0.1 with your username being 'my.pop.server:loginname'.

I know I've got a particularly difficult task for this filtering technique; I get an awful lot of spam every day (~100 messages per 24 hour period), some of it I actually want (I run an underground music site, and in some cases I subscribe to opt-in lists that result in something that looks like spam), the rest I could care less about.

My results have been decent for the most part; 100% of my spam ends up in my Spam folder, however there is a handful of messages that I wish to keep that end up there as well. For the most part they are the above-mentioned 'borderline' pieces of spam (which I have been careful to put aside and have indexed by popfile anyway); I can only hope that more time and samples will yield better results. I was however surprised to find that some of the e-mails I was getting from friends were falling into the Spam mailbox anyway; after taking a closer look, I can see why: they use an awful lot of otherwise unmentionable words. My suspicion is that I haven't gotten enough of these 'good-emails-with-bad-words' to make the filtering truly effective.

Nonetheless, it is nice to have all of my spams seemingly guaranteed to drop in to my "Spam" folder, but my usual task of manually filtering messages that made it past my existing filters in to my Spam folder has been replaced with a different (albeit quicker) task; taking messages out of my spam folder and putting them where they really belong.

Bottom-line: I still have to visually scan through my mail for legitimate messages amongst the thicket of items informing me about the exciting exploits of women at the farm, wonderful business opportunities from Nigeria and suggestions that I should buy Viagra by the boatload.. all this despite having collected a well organized and rather large collection of spam/non-spam mails. I'll stick with it for a while as I'd like to try it out and give it a proper chance, but I suspect that if you're in a similar situation then you should be prepared to tough it out..

Spam 'em back by dazedNconfuzed · 2002-11-14 09:38 · Score: 2

I want to see a Mozilla feature button which when pressed:
1. stores the spam sender's address
2. forwards the spam to all stored spammer addresses

Give 'em a taste of their own medicine. Get enough people doing this, and each spam site should get melted down.

--
Can we get a "-1 Wrong" moderation option?

A weird technique by swb · 2002-11-14 09:47 · Score: 2

You could merge the measuring portion of the Bayesian filter into imapd.

A special imap folder called "spam" would exist. Messages fed into this folder would be used to compute a filter database. After computing the filter database, the spam messages would be deleted leaving a single message behind representing the Bayesian filter database.

When fetching messages, this filter database would be checked by imapd as it fetched messages; matches would be automatically fed back to the spam folder, where they'd improve the filter, non-matches would show up in your inbox as expected.

No special client software required.

You could even have special virtual folders called "Inbox-Unfiltered" that would give an unfiltered view, a "Spam" folder that gave a spam-only view, as well as options not to delete spam moved to the spam folder automatically, so it can be reviewed for false positives.

Mutt has the feature you want by autechre · 2002-11-14 10:27 · Score: 2

As usual :)

From my configuration file:

set sort="threads"
set sort_aux="date-received"

What this does is to thread all replies to a message, Usenet style. There are commands to break apart (for people who send a message to a mailing list by replying to a random other message) and join together (for people with bad email clients) threads.

The sort_aux tells Mutt "OK, once you've threaded everything, sort the messages by using the date received of the top level message in each thread." If you're one of those lunatics that doesn't like a threaded view, you can just use 'set sort="date-received"' instead.

The only time this is a problem is when your email server goes down and there are a batch of messages from a mailing list that arrive in reverse order. But then, if they all happen to be in the same thread, they're sorted by who's replying to what, so it ends up OK.

I went from Netscape mail to PINE to Mutt, and I don't see any reason to use anything else.

--
WMBC freeform/independent online radio.

Suggested Feature: "Block Plugins from This Site" by Maul · 2002-11-14 10:28 · Score: 3, Insightful

I like the ability to block images from a server, but it'd also be nice to have a similar feature for plugins and Java applets.

A lot of ad companies are now using really annoying flash. Blocking images doesn't stop these.

--

"You spoony bard!" -Tellah

Combine this with open relay databases... by cardshark2001 · 2002-11-14 10:57 · Score: 2, Interesting

And you'll have a real winner. Probably several other techniques could be combined as well, but back when I wrote a program just to check all of the from IPs in an email to see if any of them were open relays, I got around 80% filtering with very few false positives.

Furthermore, you can assign a pretty good probability number based on what sort of open relay it is (i.e. verified, unverified, spam server, merely unsecured server, etc). If it comes from a spam server, the chances are 100% that it's spam. If it comes from a dialup server, the chances are about 99.9999%. If it comes from an automatically verified open relay, that's merely unsecured, the chances are more like 60%.

The open relay thing really intrigued me because it has NOTHING to do with the message body, and it was my belief at the time that there was no good way to filter based on message content.

However, combine this with bayes, and I'll bet you'll have something grand.

Also, a great feature would be a multi-tiered identifier, so that you could have the 99.999% sure spam filtered into one folder, and the 75% sure spam filtered into another. You'd have to sift through the 75%, but probably could just leave the 99% alone.
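
A hedged Python sketch of both ideas, combining two pieces of evidence and then tiering the result. The combining rule is the usual "product over product plus product of complements" formula, which treats the relay evidence and the content evidence as independent; that independence is an assumption, as are the folder names and the exact thresholds.

```python
def combine(p_relay, p_content, eps=1e-6):
    """Merge two spam probabilities, assuming independent evidence."""
    # Clamp so a claimed 0% or 100% cannot cause a division by zero.
    a = max(eps, min(1 - eps, p_relay))
    b = max(eps, min(1 - eps, p_content))
    return (a * b) / (a * b + (1 - a) * (1 - b))


def tier_for(p, sure=0.999, likely=0.75):
    """Route by confidence: near-certain spam, probable spam, or inbox."""
    if p >= sure:
        return "Spam-Certain"   # hypothetical folder, safe to ignore
    if p >= likely:
        return "Spam-Probable"  # hypothetical folder, worth skimming
    return "INBOX"
```

With these numbers, an "unsecured open relay" score of 60% plus a 90% content score lands in the skim-it folder rather than the ignore-it one.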

--
WWJD? JWRTFA!

I expected this by The+Raven · 2002-11-14 11:59 · Score: 2

After the articles a couple weeks ago about the utility of Bayesian spam filters, I knew it was merely a matter of time before it was put into Mozilla. :-)

--
"I will trust Google to 'do no evil' until the founders no longer run it." Hello Alphabet.

and all of those users were forced to pay cash by Lewis+Mettler,+Esq. · 2002-11-14 12:35 · Score: 2

Almost all IE users were forced to pay cash money for their browser.

That is not true with Netscape, Mozilla or Opera.

Only at Microsoft are you forced to key products based upon the needs of Microsoft instead of your own.

--
NexuSys - Linux support by the best

I use the FREE service. by Futurepower(R) · 2002-11-14 13:31 · Score: 2

Quite possibly you didn't see the link. I use the FREE service. I've never paid SpamCop a penny. SpamCop builds a database of spammers, and uses the information to convince ISPs that they need to shut off the spammer.

It works, too. SpamCop has sometimes forwarded replies from ISPs that say that they are deeply sorry and the spammer's account was shut off immediately, sometimes within two hours of the time I received the spam. Nothing undeserved can happen; the ISP examines the logs and discovers the truth of SpamCop's computer analysis.

A secret that should be known by everyone: Many spammers put serial numbers in their spam. When SpamCop forwards the spam to the ISP, the ISP sometimes forwards that to the spammer, as evidence. The spammer recognizes to whom the spam with that serial number was sent. Since they don't want to have other accounts shut off, they remove me from their lists -- very quickly.

Note that SpamCop never discloses my email address to the spammer or the spammer's ISP.

Spammers don't want the grief that comes from messing with people like me who will always forward their spam to SpamCop within a few hours.

There are other services like SpamCop. I'd like to hear about users' experiences with them.

If everyone who used Mozilla sent all their spam to services like SpamCop, we would create a rocky road for spammers. There are spam-friendly ISPs, but SpamCop communicates with the internet backbone providers also, who are unlikely to be spam-friendly.

spam back could work, but by Lewis+Mettler,+Esq. · 2002-11-14 13:37 · Score: 2

Spamming back could work but many of those emails do not have legit reply email addresses.

However, if you bother to reply to the email until you find a real valid email address then "that" address would be the one to associate with the spam. Then send all the spam you receive to all of the valid and proven email addresses they use for business purposes.

Of course, your email is likely to end up on more than one list of spammers too.

--
NexuSys - Linux support by the best

Re:They're only doing this to compete with Opera by Ilgaz · 2002-11-14 18:18 · Score: 2

Opera 7 beta shipped.... Unlike every single +0.001 release of Phoenix, it doesn't make news on Slashdot.

Gee, they coded it from scratch and, surprisingly, it's faster and smaller, unlike what Netscape 6-7 taught us to expect... http://www.opera.com

Mozilla fanatic moderators will burn points now, so I hate doing it, but I'm posting with the +1 bonus. At least some Slashdot readers will be AWARE...

Playing games Slashdot? :)

Re: Spellchecker that actually works for Win32 by ayden · 2002-11-15 04:32 · Score: 2

This was posted to the SpellChecker Email List last night (14 Nov 2002). After 2.5 months without a spellchecker for Mozilla on Win32, someone finally released one that works. See http://mozillacafe.org/MozSpell_1.2f_w32.xpi.

Just in case anyone wondered, using the spellchecker from spellchecker.mozdev.org has not worked for Win32 nightly builds, Mozilla 1.1 or 1.2b releases since the end of August. The spellcheck.xpi from Netscape 7 may work for these Linux builds but does not work for Win32.

--
"I'm The Bounty Bear. I will find him anywhere. I'm searching."

Re:Suggested Feature: "Block Plugins from This Sit by MikeBabcock · 2002-11-15 05:45 · Score: 2

Submit a request at bugzilla.mozilla.org ...

--
- Michael T. Babcock (Yes, I blog)

Coming soon: standalone Mozilla mail by yerricde · 2002-11-15 06:39 · Score: 2

Is it possible to install Mozilla Mail (and the address book) without installing the browser?

It will be when the Mail component is branched off into its own project, soon after the release of Phoenix 0.5.

Is it possible to install Outlook Express without installing Microsoft Internet Explorer?

--
Will I retire or break 10K?
