Mozilla Adding Spam Filters
ksheka writes "Mozilla mail now has Spam Filters, using Bayesian filtering method, no less. This is a very good thing, because it learns from the spam you receive, and constantly modifies itself, based on new spammer techniques!"
And ENLARGE YOUR PENIS at the same time!
Click HERE!
Now the list of 101 Mozilla features that IE doesn't have can be amended to 102 features! :)
Does the name Pavlov ring a bell?
But the spammers will develop Bayesian filters of their own to find the best content that will sneak by your filters.
The news article makes it sound like this feature is up and running; in reality it is only partially phased in, alpha-stage stuff.
It will be great when it's more complete but there is a lot of work to do yet.
- Toby
Here [kuro5hin.org]. Yeah, it's basically the same thing.
Yes, and your point is? Hint: Slashdot gets most of its stories from elsewhere.
Compile Mozilla from scratch, and you'll see that you can custom tailor the build and cut out a lot of cruft. Of course, if you just want the browser, go for Phoenix, but really, compiling on your own puts you in the driver's seat and lets you optimize it to your own needs.
The problem here is that binary distributions package it all together, so the result is the full-fledged Mozilla. Before you Gentoo zealots get out here and plug your so-loved-distro, remember that even you don't have as much control as you could.
Basically, my point is that all these features are a Good Thing, and that complaining about the bloat is silly, since it can be custom tailored to fit your needs.
Slashdot: Where people pretend to be twice as smart as they really are by behaving like children.
Bayesian technique is very good for the sort of abstract classification task that spam represents. It would be an interesting hack to try and train a network to categorize based solely on message body... I do, however, hope that their team has opted for practicality over just hack value and that the network will also use such extremely relevant data as header information and a comparison of the sender's address against the address book (an e-mail from someone not in your address book is not necessarily spam... but it is more likely to be).
lysergically yours
I wonder if a similar technique could be used in the browser. Automatically block images or popups based on previous ones you have blocked.
Now that would be very nifty!
I just switched to Mozilla. Happy to be free of Microsoft for email. It's skinnable, and there are some cool skins--like one which sort of emulates Evolution. I noticed an annoying 'feature' though, which is still there from Netscrap days--if you send an email without a subject, a dialog pops up and goes blah blah blah. I asked the Mozilla newsgroup if there was a way around this, but all I got was the sort of adolescent yammerings that keep me out of unmoderated newsgroups. Nice to see it has a spamfilter now. The only major improvement remaining is to add a spell-check (the Netscrap one was licensed from a 3rd party, and can't be freely distributed).
This is really great technology.
I had the benefit of working with this technology for a classification problem here at work. I was amazed at how well it worked. We were using it to replace a purely human process.
However, there is one huge problem. Incorrect classification. Blind tests against a known dataset showed 80%+ correctness. The problem is, you don't know which 20% is wrong. Thus, you still need 100% inspection to validate the results.
When applied to mail filters, I wonder how the technology avoids dumping your good mail? Like when your friend sends you a URL to a good pr0n site.
"No matter where you go, there you are." -- Buckaroo Banzai
This will be of no use to me until it automatically deletes any Word Doc and .exe files that my co workers try to email to me.
I assume the filtering statistics live on the client side. What about IMAP? If I open up Mozilla on a new machine, are all my spam statistics lost (presumably rendering the junk mail filtering statistics I've accumulated useless on the new machine)?
It would be neat if, with IMAP accounts, Mozilla just stored the statistics in a file on the IMAP server instead of on the client.
It's 10 PM. Do you know if you're un-American?
Well, most of my spam is already sent to /dev/null by the SpamAssassin ninja.
But, for those that make it past the email shadow warrior, I guess Bayesian filters are a double whammy they'll never survive... Mwahahahaha!
Kudos to the Mozilla programmers!
The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
"since every Mozilla article degrades to a flame fest of Microsoft greatness versus the rest of the world"
s/Microsoft/Open Source/
It's 10 PM. Do you know if you're un-American?
I can get spam filtering as part of upgrading my free MSN account to MSN 8 for only $10/month! (Just trying to figure out what the MS trolls will have to say about this one)
Besides the obvious fact that Mozilla costs $0 per month, you mean?
What happens when Microsoft attempts to enforce this patent?
So obvious yet so simple!
The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin
I've been running popfile for just a couple of weeks. It's working amazingly well.
The fun thing is when it works on its own, like when you get a message from a subscribed list that it has never seen before and it knows that it ISN'T spam.
With popfile working so well I'm not in a hurry to have Bayesian filters built into the mail client.
Has anybody tried sharing the history data between Windows and Linux clients on a dual boot machine?
Ever dream you could fly? Get up from the Flight Sim. I Fly
Mother is the best bet and don't let Satan draw you too fast.
The "Freedom From Interference With Commercial Speech Act"
Best Slashdot Co
In Outlook Express, I can set up 100 different email accounts and not have a giant list of mail folders.
In Mozilla (last I checked), for every account you set up it creates a new set of folders.
Since I've got a catchall account, I'd like to tie multiple email addresses to one set.
Anybody out there on the Mozilla team listening?
The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin
Spammers don't use relays these days, they use spam tools that directly SMTP the receiving mail server. So the receiver still needs to filter.
sulli
RTFJ.
E-mail is Outlook's domain. Not IE.
It's possible to net-install Mozilla without installing Mozilla Mail, but the default setting includes both. It's possible to net-install IE without installing Outlook Express, but the default setting includes both. Thus, it is a fair comparison.
100. Bugzilla - OK, lots of people use this, but Bugzilla != Mozilla. So it's not like Mozilla has built-in Bugzilla features... This is unrelated to the list.
I think the point of that entry was that unlike IE's bug database, which only Microsoft employees see, Mozilla's bug database is 99% open to the public (the other 1% primarily covers unfixed security vulnerabilities).
Will I retire or break 10K?
1. Says "someone is testing something and you get $NN.00"
2. Says anything like "angels watching over us" or "a mother's poem" or other such bullshit.
3. Says "This is really funny"
4. Says "We'll be over on Tuesday right during dinner when you are trying to put the moves on our daughter/your wife."
Umm, not the last one, really. Just got on a roll.
PDHoss
======================================
Writers get in shape by pumping irony.
This, combined with some clever regex filters I already had, means that I can reliably get the 10% of my mail that I actually want to read.
This is bug number 199684 in Bugzilla (no direct links from Slashdot, you know). They are not sure what to do about it, but they are thinking about it.
Teenagers these days don't have as much sex as they want each other to think they do.
Really, eh? I mean, I turned on CNN today and they were reporting a story that I'd already heard on ABC News! The nerve! I sent them a letter saying "Um, excuse me, but I already heard that on ABC l053rZ!" They haven't replied yet.
To make matters even worse, when I was on the train I overheard two people talking about the Israeli conflict. I couldn't believe it! I mean, I heard someone talking about that LAST YEAR for crying out loud! That is so 2001! I told them that they're l4m3rZ for being so dated. They just seemed to ignore me though.
$5 / month hosted VPS on linux = awesome!
I get spam filtering for free on Yahoo!
I'll do it for cheesy poofs.
procmail filters, SpamAssassin, AND the new Mozilla spam filters.. can we make a law that will make it legal to find the spammers and execute them in public?
Pleeeease??
You really want server-side filtering. I do that on my IMAP server with procmail, though not Bayesian. A quick google with "procmail bayesian filter" turns up quite a bit of interesting stuff to sift through. Of course if it's not your IMAP server, you're back to client-side solutions.
The living have better things to do than to continue hating the dead.
This approach is more commonly called "Naive Bayes" classification in the field of machine learning. It is naive because it considers each word to be a feature (dimension), but it also considers each word in an email to be conditionally independent of all other words in the document (which is not true, but really useful in practice).
The author of the web page on using this technique to classify spam (Paul Graham) has a better explanation of Naive Bayes on this web page.
I've written my own naive Bayes classifier to identify spam, with less positive results than he reports. However, naive Bayes can be a very effective technique, and I can believe his results.
The two things you have to beware of when using it are "smoothing" probabilities of words you've never seen (you don't want them to always be zero, as straight naive Bayes will give you), and you need LOTS of training data for naive Bayes to work well. That means that you need to already have a fair amount of spam to identify spam well.
You can see a paper I wrote on using naive Bayes to classify hard drive failures here, or look for more stuff on naive Bayes on Google. Also, don't reinvent the wheel: Andrew McCallum has written a very good toolkit for doing these sorts of things in Bow.
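To make the smoothing point concrete, here is a minimal sketch of a word-count naive Bayes classifier with add-one (Laplace) smoothing. It is an illustration only, not Mozilla's filter and not Paul Graham's exact formula; the class name `NaiveBayesSpamFilter`, the whitespace tokenizer, and the `alpha` parameter are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting is good enough for a sketch.
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Hypothetical multinomial naive Bayes with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant; 1.0 is add-one smoothing
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def _log_score(self, words, label, vocab_size):
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)  # class prior
        total_words = sum(self.word_counts[label].values())
        for word in words:
            count = self.word_counts[label][word]
            # Smoothing keeps a never-seen word from forcing probability zero.
            score += math.log(
                (count + self.alpha) / (total_words + self.alpha * vocab_size)
            )
        return score

    def spam_probability(self, text):
        if 0 in self.doc_counts.values():
            raise ValueError("train on at least one spam and one ham message")
        words = tokenize(text)
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        vocab_size = len(vocab) + 1  # one extra slot stands in for unseen words
        log_spam = self._log_score(words, "spam", vocab_size)
        log_ham = self._log_score(words, "ham", vocab_size)
        # Normalize in log space to avoid underflow on long messages.
        top = max(log_spam, log_ham)
        p_spam = math.exp(log_spam - top)
        p_ham = math.exp(log_ham - top)
        return p_spam / (p_spam + p_ham)
```

With only a handful of training messages the scores swing wildly, which matches the remark above about needing lots of training data before the technique works well.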
Since you must first download the content for client-side filtering to work you waste bandwidth. If you are truly bombarded by spam you still lose...your mail spool still gets filled up with stuff you don't want, your data transfers compete for bandwidth with the spam, storage hardware works harder storing data that will only be deleted. It raises everyone's costs, including yours.
We need to block undesired mail at the host, not filter it at the client. That way the spam never gets sent, the spammer gets the message that their attempt was futile, and bandwidth is conserved. Many ISPs already provide this service...we need to improve on it. And we need better tools for identifying and dealing with spammers. The current mail standards are woefully inadequate to this task.
What if, in addition to this, someone put together a company that the Mozilla email client could report back to about what is labelled as spam and the filters it created, along with the headers of the message (or even the entire spam), and from which it could grab filters from others who have received spam that you have yet to receive? It would be like a big distributed computing anti-spam project. Then if we were able to make the filters usable by sendmail to block at the server...
I'm almost thinking a distributed and automated anti-spam system like that could completely crush the spam problem within a 12 month period.
or I may be completely out of my mind.
Do not look at laser with remaining good eye.
I have enough problems teaching my one year old not to eat dog food. I don't think I really want to have to educate my email client about spam, and then continue to monitor it to make sure it doesn't fuck up.
The problem with spam (for me) is that I have to waste time dealing with it or my existing filters sometimes accidentally chew up a legit message (rarely). The basic plan Mozilla seems to be after doesn't really fix that for me.
I do like the idea of allowing anti-spam plug-ins. Having a variety of methods to choose from will let me decide what, if any, third-party solution works best for me.
guac-foo
Lots of petrified grits
I have a different idea. Well, it's not my idea - I remember reading somebody describing it on /. some months ago and it seemed brilliant.
The original idea described setting it up on the server side, but this should work on the client side as well, and might be a good candidate for a Mozilla mail filter plugin:
1 - download new message headers from server
2 - Compare 'from' email addresses to list of known people you accept email from. Only download mail from known senders.
3 - if email comes from an unknown party, email them with instructions to reply to your message and put some word in the subject line (e.g. activate). The word should be randomized to eliminate the spammer's chance of guessing it.
4 - if a message header is found with that subject from that sender, the sender can be automatically added to the 'known' list and the mail is downloaded
5 - if no further message received from that sender, delete their messages within X days (or download it and put in 'spam' mailbox just in case)
6 - user has capability of adding new 'known' senders, plus ability to blacklist senders who have authenticated (persistent spammer).
I can't think of any loopholes here - it seems that this might solve just about every spam problem I've ever come across. No reason why this can't be implemented on the client side (especially if you don't have control over the server). Any takers?
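The six steps above can be sketched roughly as follows. This is a hypothetical in-memory model, not a Mozilla plugin API: the class `ChallengeResponseFilter`, its method names, and the `outbox` list (which stands in for actually sending the challenge email) are assumptions made for illustration, and time is passed in as a plain number so the expiry step is easy to follow.

```python
import secrets


class ChallengeResponseFilter:
    """Hypothetical sketch of the whitelist plus challenge/response steps."""

    def __init__(self, known_senders=()):
        self.known = {s.lower() for s in known_senders}
        self.blacklist = set()
        self.pending = {}   # sender -> {"token", "held", "first_seen"}
        self.outbox = []    # (sender, token) pairs; stands in for real email

    def handle_header(self, sender, subject, now):
        """Decide what to do with one message header.

        Returns 'download', 'hold', or 'reject'.
        """
        sender = sender.lower()
        if sender in self.blacklist:            # step 6: blacklisted sender
            return "reject"
        if sender in self.known:                # step 2: known sender
            return "download"
        entry = self.pending.get(sender)
        if entry is not None:
            if entry["token"] in subject:       # step 4: challenge answered
                self.known.add(sender)
                del self.pending[sender]
                return "download"
            entry["held"].append(subject)
            return "hold"
        # Step 3: unknown sender, so issue a randomized challenge word.
        token = secrets.token_hex(4)
        self.pending[sender] = {
            "token": token, "held": [subject], "first_seen": now,
        }
        self.outbox.append((sender, token))
        return "hold"

    def expire(self, now, max_age):
        """Step 5: drop senders who never answered within max_age."""
        stale = [s for s, e in self.pending.items()
                 if now - e["first_seen"] > max_age]
        for sender in stale:
            del self.pending[sender]
        return stale
```

One caveat worth stating: the sketch trusts the From header as given, and that header is easy to forge, so a real implementation would have to consider where its challenges actually end up.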
You can accomplish anything you set your mind to. The impossible just takes a little longer.
Well, OK, I am impressed that Mozilla is implementing spam filtering abilities in their MUA. I AM NOT impressed with Bayesian spam filters AT ALL. I've been using Mac OS X's Mail.app since I switched to OS X. It's not my primary MUA, but I am letting it POP out a copy of all my mail and "learn" from it. It does a pretty good job of finding maybe 80% of the spam I get. However, it has a BAD false-positive rate. I mean hell, it's been flagging CERT advisories as spam. That kind of crap is really annoying. It's flagged co-workers' mail as spam numerous times (and even though I happen to agree... :) ). The biggest problem I have with Bayesian as a mail admin is that I am constantly dealing with spam. Users forward it to me. I receive a number of spam bounces. I work in spam all the damned time. That's the problem. I need an MUA with Bayesian filters that are smart enough for me to tell them to ignore all mail from certain domains or that went to certain accounts. None of the Bayesian filters built into MUAs I've worked with so far can do things like that. It's really annoying given the position that I'm in.
This is something that Emacs has in the GNUS client, you score emails up and down and it starts adding filtering rules. Using LISP you could extend this to do some pretty funky moderating.
Every problem is reducible to a previously solved problem or by definition is unsolvable - Church Turing Thesis.
An Eye for an Eye will make the whole world blind - Gandhi
I use Mozilla for my mail. I installed a spellchecker I believe from Mozdev. It's pretty good and can be found here
There needs to be a tiered structure with filters. The main one would be at the ISP level. It would only filter out obvious spam (like spam going to 2000 users at that ISP). The second tier would be at the client side and would have a certain level of intelligence in identifying spam. One feature that I'd like (it might already be available) is if it could automatically send an email back to the sender saying the email address doesn't exist. This should be done at the server level and/or client level. This could possibly help in removing your email from such lists. As far as what to do with the spam at the client level, I think that it should be sent to your main inbox but just marked as spam (maybe greyed out or something). Like new mail is always bold and once you read it it goes to a regular font. Well, spam could be just greyed out. That way you would never miss something that the spam filter had a false hit on.
How about a spamcop-like plugin? Or something that can submit my message plus contents to SpamCop?
If using SpamCop, there should be a way to still show the site's banners, because they deserve to get paid for their bandwidth I'm using up.
I'd love to just be able to right-click on a message and report it to the various abuse/postmaster accounts without having to copy my whole message plus headers, and pasting such into their web form. SpamCop seems to be pretty good at tracing the origins of messages, so I'd love to be able to leverage that sort of functionality.
You can accomplish anything you set your mind to. The impossible just takes a little longer.
I've used this ancient mail client called Calypso for years now. One of the reasons I continue to use it is its filtering capabilities. It has a good interface, it's very configurable (you can control if the message is deleted locally, remotely, marked read, lots more), and it has a "Junk Email" button. Click on an email, hit the junk button and it deletes it and creates a filter for any more messages like it. One click and the mail is gone from my mailbox entirely and I don't get any more.
Mozilla Mail has decent filtering, but it needs more options and it needs to be more accessible before I can use it.
Preferences -> Privacy & Security -> Images, you can turn off images in mozilla, or only in mail/news.
I personally don't really care about all the junk emails I get. I don't get that many, and I can pretty much tell without looking at them. They go straight to /dev/null.
Spam is such a horrible thing though. I work at a webhosting company. I'm the one that has to track down the site with the old formmail.pl, removing 'aol.com' and 'yahoo.com' from the hosts to relay for, trying to find out who the hell added them so I can murder them. I'm the one clearing out the mail queue with 100,000 mails. I'm the one clearing the mail queues of people who thought it was a good idea to check the 'open relay' option in Plesk. I'm the one that has to deal with people bitching about how their mail isn't working or didn't get through.
Just the other day, I had a raq2 where someone had apparently put yahoo.com and excite.com in the hosts to relay for. Yay! That's what attracted the spammers. Now I get a request every second to send mail to 50 people at once. Now that I've removed them, none of them are getting through. But it's a raq2, 133 MHz. It has to go through all 50 addresses and say 'relaying denied' and log it. It can't keep up! syslogd is taking up all the CPU and logging things from hours ago because it's behind. Quickly, sendmail quits listening on port 25 (but the spam attempts keep coming somehow).
So I get the idea to block their IPs, since they seem to be using the same IPs. But oh guess what, they're using open proxies and have about 400 IPs. Well, I did this for about 5 hours, writing scripts to grab the repeated IPs out of the maillog, adding them all to my sendmail access lists. Now every time they try to send mail, it blocks them instead of saying relaying denied 50 times for each request. But a minute later, I get a few new IPs and it starts all over again. I have an access list about 6 pages long. It's doing OK, blocking about 90% of them, but every once in a while, they get a new IP and sendmail is brought to a stop.
Oh yeah, and my /var/ partition is only 200MB, 50MB free. And the maillog is growing at about 10MB a day. So now I'm babysitting this server every day until the spam attempts stop. I don't think there's any way around it unless I get sendmail to check for open proxies. But I don't know how to do that, and I don't think they trust me enough to make such changes to sendmail.
So oh well, mail is getting lost every day on this server and it's been rendered horribly slow for its users... just because some moron noticed it would send some emails for him and started up his scripts.
Spam causes so many problems on the server level. It's what is making mail an unreliable service. I could care less about spam filters on my mail client. These are the things that make spam evil!
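For anyone stuck doing the same "grab the repeated IPs out of the maillog" chore, a rough sketch of that kind of script follows. It is not the poster's script. It assumes rejected attempts appear as log lines containing the text "Relaying denied" with the client IP in square brackets, and that the access list takes one `IP<TAB>REJECT` entry per line; check both assumptions against your own sendmail setup. The function names and the `min_hits` threshold are made up for the example.

```python
import re
from collections import Counter

# Assumption: the offending client's IPv4 address appears in square brackets.
IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def repeated_relay_offenders(log_lines, min_hits=3):
    """Return IPs seen in at least min_hits 'Relaying denied' lines."""
    hits = Counter()
    for line in log_lines:
        if "Relaying denied" not in line:
            continue
        match = IP_IN_BRACKETS.search(line)
        if match:
            hits[match.group(1)] += 1
    # most_common() orders by hit count, highest first.
    return [ip for ip, count in hits.most_common() if count >= min_hits]


def access_entries(ips):
    """Format IPs as access-list lines (assumed 'IP<TAB>REJECT' layout)."""
    return [f"{ip}\tREJECT" for ip in ips]
```

As the story above shows, this only keeps up while the spammer reuses addresses; each new open proxy means another pass over the log.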
--- Does the name Pavlov ring a bell?
Two brothers immigrated to a mostly Catholic country, hungry and looking for work. Pavlov, whose forehead was quite thick, found work at a monastery bell tower. The monks taught him to tell time, then sound the bell when appropriate. Not too bright, Pavlov missed the part about how to sound the bell. So he notes the time on his handy wristwatch, climbs the belltower, inches up to the edge of the platform, and dives face first into the massive centuries-old bell. KKKLLLAAANNNGGG!!! Poor Pavlov falls to his death hundreds of feet below.
Apparently, monks don't communicate very well. No one in the crowd gathered around Pavlov's remains could identify him. Finally one monk admits, "I never caught his name, but his face sure rings a bell."
Mysteriously, a man steps forward from the crowd and insists on taking Pavlov's place as caretaker of the belltower. One of the monks removes the wristwatch from Pavlov's arm, gives it to the mystery man, and proceeds to indoctrinate him in his duties. On the hour, just like Pavlov, our mystery man ascends the tower, perches on the edge -- but this time wielding a massive sledgehammer. He leaps towards the bell and smashes it with Thor-like fury. KKKLLLAAANNNGGG!!! The poor fool falls to his death in a manner very similar to Pavlov's.
Much like deja vu, a muted crowd gathers around the mystery man's remains. After an extended silence, one monk asks, "Does anyone know this man's name?" Answers another, "No, but he's a dead ringer for his brother!"
However, I've heard that popup blockers and tabbed browsing are making their way into IE (and MS employees can already use these features)
IE is the most widely used browser and pop-up advertising has become part of the Internet Experience. If MS decides to incorporate popup blocking in IE, then the pop-up advertising business is RUINED! They'll just be another group victimized by a huge corporation. These people have families to support and will be forced to send their children to public schools. Won't someone PLEASE think of the children?
And all this news about fixing vulnerabilities within Windows is going to affect the virus community as well (both authors and anti-virus). Worrying about vulnerability exploits has also become part of the computer experience.
Won't someone PLEASE think of the virus writers?
This is not my sig.
$120/yr? I paid Yahoo $20 for a year. 90% of my spam has the header X-YahooFilteredBulk. My mail server ditches all that for me. I think you've been had by MSFT's marketing.
IE starts up quickly in Windows because it is loaded into memory at system start up and runs in the background. When you "start" the program you are simply creating a new browser window. So you suffer the program start-up overhead when the system boots, instead of each time you create a new instance.
The good news, for those inclined to sacrifice system performance for quick browser load times, is that this option is also available in Mozilla... Look under "Preferences...Advanced" for the Quick-launch option.
:wq
I personally don't think that systems like this can work that well. Everyone seems to get different types of spam, and your best bet is to create your own filters. About 80% of my spam messages have weird foreign characters in them (like Á), so I've got filters in Eudora to delete anything with one of these characters in the Subject or Body. Then obviously anything with "porn", "sex" etc., although spammers don't seem that stupid anymore. This way I only get 5-10 spam messages in my inbox per day, maximum. And this takes me about 20-30 seconds to deal with; I don't see what all the fuss is about.
Everything sucks except musicandstuff
I'm running a sendmail server, and I access via webmail accounts, pine, and Mozilla. I would like to add this new type of spam filtering to sendmail directly. Does anyone know if this is something that can be added to sendmail, rather than a specific mail client like Mozilla?
.. should start at the server preventing the offending mail from ever coming into the network in the first place.
Not that localized spam filters are a bad thing (they aren't!) but refusing connections from known spammer IPs and the proper use of blacklists would cut down on a lot of the email traffic. Once the spam is in your inbox, it's just an annoyance to you. The cost to the net has already been incurred.
Trolling is a art,
"...good morning, Dave. You have recieved spam again. I have been analyzing the spammer's patterns, and I believe I have figured out the most efficent way to protect humans from the harm of spam while adhering as closely to the First Law as possible. To protect them from spam, humans must be pushed. They must go down the stairs. Please go stand by the stairs, so I can protect you."
Software that only does mail filtering encourages spammers. The technically knowledgeable people don't get spam, so they stop worrying about it.
All mail filters should also use a service like SpamCop, so that the spammers lose their internet service accounts as the spam is filtered.
I send Spamcop all my spam. Spamcop analyzes it automatically and sends a message to the Internet Service Provider. I use the free Reporting only service.
I may drop Evolution in favor of Mozilla Mail.
I tried to find out if the Evolution dev team was going to do this. The only thread I could find on the topic is here:
http://lists.helixcode.com/archives/public/evolution/2002-August/020845.html
Doesn't look like it's part of their vision.
Software Wars
I love mozilla, and use it as my main browser. However my biggest complaint is that all the components (browser, mail, composer, etc) should be separate apps. I don't like the fact that if my browser crashes, so does my email reader, and vice versa.
I tried to find some documentation on how to achieve this; however, there was none to be found. Does anyone know how to do this, so I can use Mozilla's mail rather than the flaky mail app that comes with OS X?
101 things that the Mozilla browser can do that IE cannot
One simple rule for its versus it's
Essentially, it throws the parsing problem right back in the spammers' faces: they must answer a fuzzy logic question in order to get into your inbox once and for all. It is similar to challenge/response routines in network connection code to prevent spoofing. The most interesting part from the intro:
Bayesian filters to me, seem to work if you are a dull person without many changes in your life. For ex, if you constantly get spams with the word Madam in it and you later on get a sex change, you will need to recalibrate your filters. (Probably not the most pressing thing on your mind, so you'd lose a few authentic mails.)
Just some thoughts.
I am completely against all client-based spam filters. They do essentially nothing to address the most serious repercussion of spamming, which is exploitation of third-party networks and bandwidth. That is aside from the fact that client-based spam filtering is most likely the least effective solution and more likely to stop legitimate mail than other methods, such as blocking known spam relays.
Ultimately, the only way we're going to really curtail spam is by enacting harsh *criminal* penalties for mail relay and server hijacking, which is the standard method by which most spam is distributed. It's true that these activities are already considered illegal but the law enforcement agencies are either unwilling to take action because there's a minimum threshold of monetary damages required, or they're ill-equipped knowledge and technology-wise to aggressively go after these people.
And Puleeze don't even bother with the ineffective, "let the industry regulate itself" argument, which doesn't work. Most spammers are small "cell groups" that move around a lot; most don't have any money in the first place; only criminal penalties are going to work, and client-side and industry regulated efforts don't stop their efforts at all and just drive bandwidth charges up for the rest of us.
It seems too many people distrust spam filters because of the chance of accidentally blocking an important legitimate message as if it were spam.
Many spam filters are strictly binary: a message is either spam, or not spam. This is not ideal, because "gray area" messages - between these two extremes - will likely not be sorted correctly.
I propose adding a new sort option to email clients.
Sort by Spam Probability
This would be an additional field that can be displayed in a message list, similar to "To", "From", "Subject", and the like. As in the article, probabilities would range from 99% (almost certain spam) to 1% (most likely an innocent message). Notice that 100% accuracy either way is not claimed.
This way, the user can see up front the messages that are most likely not spam. The spam messages will be relegated to the bottom of the list, possibly colored to indicate their likelihood of being spam. If there is a message in the "gray area", it will most likely appear in the list between the legitimate messages and the spam, so the user will have a chance to see the message and make a decision, without the message being lost in the shuffle.
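As a rough illustration of the proposed sort option, the sketch below orders a message list by a per-message spam probability and labels the "gray area" in between. The `Message` type, the `band` thresholds (0.2 and 0.9), and the function names are assumptions invented for the example; only the 1% to 99% clamp comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    subject: str
    spam_probability: float  # assumed to come from some statistical filter


def clamp(p, low=0.01, high=0.99):
    # Never claim certainty either way: keep scores within 1%..99%.
    return max(low, min(high, p))


def sort_by_spam_probability(messages):
    """Least spam-like first, so likely spam sinks to the bottom."""
    return sorted(messages, key=lambda m: clamp(m.spam_probability))


def band(p, ham_below=0.2, spam_above=0.9):
    """Label a score so a client could color the row (thresholds assumed)."""
    p = clamp(p)
    if p < ham_below:
        return "legitimate"
    if p > spam_above:
        return "spam"
    return "gray area"
```

Because the list is only reordered and nothing is deleted, a false hit costs the reader a glance at the gray-area rows rather than a lost message.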
This would be a great feature. I hope this gets into Mozilla's mail client.
(BTW, another feature that would be great to see in mail clients would be datestamping of the actual time the message was downloaded. Many spammers, and innocent people with misconfigured clocks, send emails with wild dates that are not to be trusted. You can see this in yearly archives of GNU "mailman" mailing lists! Datestamping emails as they are downloaded will also keep mailboxes in order when sorted by date, as newly arrived messages will always be at the bottom, instead of being scattered throughout the inbox. But sorting by spam probability will probably become more popular than sorting by date....)
Dr. Demento On The 'Net!
Dare I say it, my wife's work uses Windows desktops. She answers an email address that gets several hundred spams per day. She is trialing SpamAssassin Pro with Outlook, and it seems to be doing well so far.
SpamAssassin Pro also has an enterprise version for Exchange, but I can imagine a lot of Exchange admins fearing fooling around with it too much.
As a popfile user, I'm quite impressed with the catch rate possible with Bayes' theorem spam filters; however, I suspect this will decrease in effectiveness over the long term.
Spammers are likely to respond to filters like this by encoding text in ways the filters can't read but humans can (eg having a .gif file of the text, loaded by a HTML statement in the message).
Statistical filters would need to have some kind of built-in OCR routine before they could be effective against that trick, and some respectable mailing lists are using images as well, so you can't just filter all mails with images attached.
In the long term, therefore, I suspect that filters that use a network database of spam will be more successful.
The big problem with this is spam still gets to the server. :(
Just thought of this now... but it seems like almost all spam these days contains a whole bunch of HTML tags. Maybe someone should write a server plugin to instantly reject all mail containing , instantly adding the sending IP to a iptables DROP rule.
There's little legitimate e-mail with tables, unless you count paypal, datek, and travelocity news and that kind of crap. But we could always add a list of "good" IPs.
I know there are server solutions, but all make me a bit queasy. I just want something that will detect funky activity on the fly and instantly deny all access to that IP.
GAAAAA that sure came out wrong! Slashdot apparently dropped my inclusion of the HTML [table] tag in the text and subject. That's what I meant, NOT all HTML e-mail!
The big problem with bayesian server-side filtering (as opposed to rule-based tools like SpamAssassin) is that bayesian filtering requires a UI. The user must classify email as spam/not-spam to provide fodder for the filter. Having that UI in the mail client is the right thing to do. It would be nice if there were some protocol that the client could use to communicate that info to a server-side filter, but AFAIK no such protocol exists.
So client-side seems like the right place for bayesian filtering right now.
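To make the "user must classify" point concrete, here is a rough sketch of a word-based naive Bayes filter that only learns from messages the user has labelled. The class and method names are invented for this example, and it is not the code used by Mozilla or popfile; it also assumes words are independent and uses add-one smoothing.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Toy naive Bayes filter trained only on user-labelled messages."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Each distinct word counts once per message.
        return set(re.findall(r"[a-z0-9$']+", text.lower()))

    def train(self, text, label):
        """Called when the user marks a message as 'spam' or 'ham'."""
        self.messages[label] += 1
        self.words[label].update(self.tokens(text))

    def spam_probability(self, text):
        # Work in log-odds; start from the smoothed prior.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for word in self.tokens(text):
            p_spam = (self.words["spam"][word] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.words["ham"][word] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))
```

Without calls to `train` there is nothing to go on: every message scores 0.5, which is why the classify button has to live somewhere the user can reach it.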
A feature request has been filed:
Mozilla feature request
(bugtracker sure is slow today!)
The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin
After collecting 87 megs worth of spam and a similar amount of non-spam I decided to implement the so-called 'Bayesian' method of spam filtering by way of popfile - it's a pretty slick concept; Perl code that acts as a POP3 server on your own machine - simply drop your collected spam and non-spam in to the appropriate bucket, have popfile go through them and create its indices and set up your mail client to connect to 127.0.0.1 with your username being 'my.pop.server:loginname'.
I know I've got a particularly difficult task for this filtering technique; I get an awful lot of spam that comes in every day (~100 messages per 24 hour period), some of it I actually want (I run an underground music site, and in some cases I subscribe to opt-in lists that result in something that looks like spam), the rest I could care less about.
My results have been decent for the most part; 100% of my spam ends up in my Spam folder, however there is a handful of messages that I wish to keep that end up there as well.. For the most part they are the above-mentioned 'borderline' pieces of spam (which I have been careful to put aside and have indexed by popfile anyway), I can only hope that more time and samples will yield better results. I was however surprised to find that some of the e-mails I was getting from friends were falling in to the Spam mailbox anyway; after taking a closer look, I can see why, they use an awful lot of otherwise unmentionable words - but my suspicion is that I haven't gotten enough of these 'good-emails-with-bad-words' to make the filtering truly effective.
Nonetheless, it is nice to have all of my spams seemingly guaranteed to drop in to my "Spam" folder, but my usual task of manually filtering messages that made it past my existing filters in to my Spam folder has been replaced with a different (albeit quicker) task; taking messages out of my spam folder and putting them where they really belong.
Bottom-line: I still have to visually scan through my mail for legitimate messages amongst the thicket of items informing me about the exciting exploits of women at the farm, wonderful business opportunities from Nigeria and suggestions that I should buy Viagra by the boatload.. all this despite having collected a well organized and rather large collection of spam/non-spam mails. I'll stick with it for a while as I'd like to try it out and give it a proper chance, but I suspect that if you're in a similar situation then you should be prepared to tough it out..
I want to see a Mozilla feature button which when pressed:
1. stores the spam sender's address
2. forwards the spam to all stored spammer addresses
Give 'em a taste of their own medicine. Get enough people doing this, and each spam site should get melted down.
Can we get a "-1 Wrong" moderation option?
You could merge the measuring portion of the Bayesian filter into imapd.
A special imap folder called "spam" would exist. Messages fed into this folder would be used to compute a filter database. After computing the filter database, the spam messages would be deleted leaving a single message behind representing the Bayesian filter database.
When fetching messages, this filter database would be checked by imapd as it fetched messages; matches would be automatically fed back to the spam folder, where they'd improve the filter, non-matches would show up in your inbox as expected.
No special client software required.
You could even have special virtual folders called "Inbox-Unfiltered" that would give an unfiltered view, a "Spam" folder that gave a spam-only view, as well as options not to delete spam moved to the spam folder automatically for review for false-positives.
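A toy model of the folder scheme described above, just to show how the views relate. Everything here is hypothetical: `FilteredMailbox` is not part of any imapd, and the word-overlap test is a crude stand-in for the Bayesian filter database, not an implementation of it. The automatic feedback of matches into the filter is left out.

```python
class FilteredMailbox:
    """Server-side store that exposes filtered and unfiltered views."""

    def __init__(self):
        self.messages = []       # everything delivered, in arrival order
        self.spam_words = set()  # stand-in for the stored filter database

    def file_as_spam(self, text):
        """The user drops a message into the 'spam' folder: learn from it."""
        self.spam_words.update(text.lower().split())

    def deliver(self, text):
        self.messages.append(text)

    def _looks_like_spam(self, text):
        # Crude rule: at least half the words were seen in filed spam.
        words = text.lower().split()
        hits = sum(1 for w in words if w in self.spam_words)
        return bool(words) and hits / len(words) >= 0.5

    def folder(self, name):
        """Return the messages visible in one of the virtual folders."""
        if name == "Inbox-Unfiltered":
            return list(self.messages)
        spam = [m for m in self.messages if self._looks_like_spam(m)]
        if name == "Spam":
            return spam
        if name == "Inbox":
            return [m for m in self.messages if m not in spam]
        raise KeyError(name)
```

Because the views are computed on the server, any ordinary IMAP client would see them as plain folders, which is the "no special client software" part of the idea.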
As usual
From my configuration file:
set sort="threads"
set sort_aux="date-received"
What this does is to thread all replies to a message, Usenet style. There are commands to break apart (for people who send a message to a mailing list by replying to a random other message) and join together (for people with bad email clients) threads.
The sort_aux tells Mutt "OK, once you've threaded everything, sort the messages by using the date received of the top level message in each thread." If you're one of those lunatics that doesn't like a threaded view, you can just use 'set sort="date-received"' instead.
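For anyone curious what that ordering amounts to, here is a small sketch of it outside Mutt. It is an illustration of the description above, not Mutt's actual code, and the `id`, `parent`, and `received` field names are assumptions made for the example.

```python
from collections import defaultdict


def threaded_order(messages):
    """Order message ids by thread, threads sorted by their top message's date.

    Each message is a dict with 'id', 'parent' (an id or None) and
    'received' (any sortable value such as a timestamp).
    """
    by_id = {m["id"]: m for m in messages}
    children = defaultdict(list)
    roots = []
    for m in messages:
        if m["parent"] in by_id:
            children[m["parent"]].append(m)
        else:
            roots.append(m)  # no known parent: this is the top of a thread

    ordered = []

    def walk(m):
        # A reply always follows the message it answers.
        ordered.append(m["id"])
        for child in sorted(children[m["id"]], key=lambda c: c["received"]):
            walk(child)

    for root in sorted(roots, key=lambda r: r["received"]):
        walk(root)
    return ordered
```

Replies are placed by who answers whom rather than by arrival time, so a batch that arrives in reverse order still reads correctly within its thread.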
The only time this is a problem is when your email server goes down and there are a batch of messages from a mailing list that arrive in reverse order. But then, if they all happen to be in the same thread, they're sorted by who's replying to what, so it ends up OK.
I went from Netscape mail to PINE to Mutt, and I don't see any reason to use anything else.
WMBC freeform/independent online radio.
I like the ability to block images from a server, but it'd also be nice to have a similar feature for plugins and Java applets.
A lot of ad companies are now using really annoying flash. Blocking images doesn't stop these.
"You spoony bard!" -Tellah
And you'll have a real winner. Probably several other techniques could be combined as well, but back when I wrote a program just to check all of the from IPs in an email to see if any of them were open relays, I got around 80% filtering with very few false positives.
Furthermore, you can assign a pretty good probability number based on what sort of open relay it is (i.e. verified, unverified, spam server, merely unsecured server, etc). If it comes from a spam server, the chances are 100% that it's spam. If it comes from a dialup server, the chances are about 99.9999%. If it comes from an automatically verified open relay, that's merely unsecured, the chances are more like 60%.
The open relay thing really intrigued me because it has NOTHING to do with the message body, and it was my belief at the time that there was no good way to filter based on message content.
However, combine this with bayes, and I'll bet you'll have something grand.
Also, a great feature would be a multi-tiered identifier, so that you could have the 99.999% sure spam filtered into one folder, and the 75% sure spam filtered into another. You'd have to sift through the 75%, but probably could just leave the 99% alone.
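Here is a rough sketch of how the relay check and a Bayesian content score might be combined and then split into tiers. The relay probabilities are the ones quoted above; the dictionary keys, folder names, and function names are invented for the example, and the combining rule assumes the two signals are independent, which is a simplification.

```python
# Spam probability by relay type, using the figures quoted above.
RELAY_SPAM_PROBABILITY = {
    "spam_server": 1.0,
    "dialup": 0.999999,
    "unsecured_open_relay": 0.60,
}
NEUTRAL = 0.5  # assumed value when the sending host is on no list


def combine(p_relay, p_content):
    """Combine two independent spam probabilities into one."""
    eps = 1e-9  # keep away from exactly 0 or 1 to avoid dividing by zero
    a = min(max(p_relay, eps), 1 - eps)
    b = min(max(p_content, eps), 1 - eps)
    return a * b / (a * b + (1 - a) * (1 - b))


def choose_folder(relay_kind, p_content, sure=0.99999, likely=0.75):
    """Route a message into one of three tiers (folder names are made up)."""
    p = combine(RELAY_SPAM_PROBABILITY.get(relay_kind, NEUTRAL), p_content)
    if p >= sure:
        return "Spam-Certain"   # safe to leave unread
    if p >= likely:
        return "Spam-Probable"  # worth sifting for false positives
    return "Inbox"
```

A message from an unsecured open relay with a content score of 0.8 combines to about 0.857 and lands in the "probable" tier, while a neutral relay leaves the content score unchanged.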
WWJD? JWRTFA!
After the articles a couple weeks ago about the utility of bayesian spam filters, I knew it was merely a matter of time before it was put into Mozilla. :-)
"I will trust Google to 'do no evil' until the founders no longer run it." Hello Alphabet.
Almost all IE users were forced to pay cash money for their browser.
That is not true with Netscape, Mozilla or Opera.
Only at Microsoft are you forced to key products based upon the needs of Microsoft instead of your own.
NexuSys - Linux support by the best
Quite possibly you didn't see the link. I use the FREE service. I've never paid SpamCop a penny. SpamCop builds a database of spammers, and uses the information to convince ISPs that they need to shut off the spammer.
It works, too. SpamCop has sometimes forwarded replies from ISPs that say that they are deeply sorry and the spammer's account was shut off immediately, sometimes within two hours of the time I received the spam. Nothing undeserved can happen; the ISP examines the logs and discovers the truth of SpamCop's computer analysis.
A secret that should be known by everyone: Many spammers put serial numbers in their spam. When SpamCop forwards the spam to the ISP, the ISP sometimes forwards that to the spammer, as evidence. The spammer recognizes to whom the spam with that serial number was sent. Since they don't want to have other accounts shut off, they remove me from their lists -- very quickly.
Note that SpamCop never discloses my email address to the spammer or the spammer's ISP.
Spammers don't want the grief that comes from messing with people like me who will always forward their spam to SpamCop within a few hours.
There are other services like SpamCop. I'd like to hear about users' experiences with them.
If everyone who used Mozilla sent all their spam to services like SpamCop, we would create a rocky road for spammers. There are spam-friendly ISPs, but SpamCop communicates with the internet backbone providers also, who are unlikely to be spam-friendly.
Spamming back could work but many of those emails do not have legit reply email addresses.
However, if you bother to reply to the email until you find a real valid email address then "that" address would be the one to associate with the spam. Then send all the spam you receive to all of the valid and proven email addresses they use for business purposes.
Of course, your email is likely to end up on more than one list of spammers too.
NexuSys - Linux support by the best
Opera 7 beta shipped.... Unlike every single +0.001 release of Phoenix, it doesn't make news on Slashdot.
:)
Gee, they coded it from the start and, surprisingly, it's faster and smaller, contrary to what Netscape 6-7 taught us... http://www.opera.com
Mozilla fanatic moderators will burn points now, so I hate doing it, but I'm posting with the +1 bonus. At least some Slashdot readers will be AWARE...
Playing games Slashdot?
This was posted to the SpellChecker Email List last night (14 Nov 2002). After 2.5 months without a spellchecker for Mozilla on Win32, someone finally released one that works. See http://mozillacafe.org/MozSpell_1.2f_w32.xpi.
Just in case anyone wondered, using the spellchecker from spellchecker.mozdev.org has not worked for Win32 nightly builds, Mozilla 1.1 or 1.2b releases since the end of August. The spellcheck.xpi from Netscape 7 may work for these Linux builds but does not work for Win32.
"I'm The Bounty Bear. I will find him anywhere. I'm searching."
Submit a request at bugzilla.mozilla.org ...
- Michael T. Babcock (Yes, I blog)
Is it possible to install Mozilla Mail (and the address book) without installing the browser?
It will be when the Mail component is branched off into its own project, soon after the release of Phoenix 0.5.
Is it possible to install Outlook Express without installing Microsoft Internet Explorer?
Will I retire or break 10K?