Comparison of Bayesian POP3 Spam Filters
kreide writes "Spam e-mail has become an ever increasing problem, and these days it is next to impossible to use e-mail without receiving it in large amounts. Although various techniques exist to combat the problem, spammers seemed to be winning the war - until a new, powerful weapon appeared on the scene: Bayesian filters, our last, best hope for spam-free inboxes. In this review I compare POP3-based Bayesian spam filters." We did an Ask Slashdot on this a few weeks ago.
It's harassment.
BOO! TERRO
I just sure as hell hope he meant "latest, best hope", because anyone who thinks bayesian is the LAST best hope doesn't understand CS technology at all. And such a person sure as all hell shouldn't be given an audience on /.
- I love animals. I try to eat at least one a day.
I think spam is overhyped: it is not convenient to get some, but with properly adjusted filters, very few of these will land anywhere other than in your trash can.
Personally, I get around 100 of these a day, but only 3 get in my inbox instead of one of my specific mail directories, this is not *that* disturbing.
I just wish these spams were better targeted: getting ads for penis enlargement, ultra-fast diets, university diplomas or cheap herbal alternatives to Viagra is somewhat repetitive and boring.
Trolling using another account since 2005.
I still believe that we should have a hunting season for spammers, just like we do for ducks...
Never underestimate the predictability of human stupidity...
Spam is effective because it reaches millions of people who are not installing these filters on their systems. Until ISP's start applying these filters to all spam by default, then the spam filters will have no effect at all, exactly the same number of marks will be reached and respond no matter if the people who know better than to respond to spam go ahead and filter their e-mail or not!
I'm an American. I love this country and the freedoms that we used to have.
I would have liked to see how my favorite Bayesian spam filter, K9, would have fared in your comparison, but it failed to meet your first requirement of being cross-platform. It's freeware written in C, is about a 60kb-100kb download, depending on whether you get it with the self installer, is easy to use, and has a very small memory footprint. Before today it had sorted my email with over 99.8% accuracy, excluding the first couple of days of training, and after only a couple of weeks of use, though now it's down to 99.7%.
I have used PopFile in the past on both Windows and Linux, but found K9 to be better suited for environments where Windows is an option. It's very easy to use, having a windowed interface, and it seemed to learn much faster than PopFile did.
I haven't used SpamBayes. I'll have to give it a shot.
The article didn't mention SpamProbe. It is what I use, and it has worked quite well for the past month or so that I've been using it. Perhaps this is just because the author didn't test this spam filter yet, but I like it quite a lot with my current mutt/procmail setup. Take this for what it's worth.
- I love animals. I try to eat at least one a day.
Somebody needs to move to the UK for a while...
I love spam protection programs. I've been using them for years, but have to switch every couple of months because of the friggen spammers. The people that make the spamming software don't just sit around cackling about how evil they are. They reverse engineer every anti-spam protection out there in an attempt to get around it. While this seems like a good idea (and I will be playing around with these two programs for a while), it's unfortunately only good up to the point when spammers figure a way around it.
I wish the government would somehow make the practice illegal, but I doubt they'll ever get anything to stick. The far better option at this point is to have a class action suit of server owners (who provide mail accounts) against developers of spamming software and spammers. I've gotten enough warnings from my university to know that bandwidth costs money. Sending millions of spams a year into any one e-mail server can account for a serious chunk of bandwidth, used at significant cost to the provider. It won't stop spam altogether, but it will bankrupt anybody that has been doing it.
It's not stupid. It's advanced.
Given that I get 100+ spams a day, I've found that it's a good idea to at least use tagging. For example, when posting on Usenet I use usenet@domain.com, with something in my sig saying my actual email is me at domain dot com. Anything sent to usenet is automatically deleted. It doesn't stop the flow by any means, but at least I can track where the spam came from.
If you are feeling clever you can even use addresses that expire after a week. So something like epochseconds@domain.com
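A rough sketch of the expiring-address idea in Python. The one-week lifetime and domain.com come from the post above; the function names are made up for illustration, and a real setup would do this check in the mail server or in procmail rather than in a standalone script.

```python
import time

WEEK = 7 * 24 * 60 * 60  # address lifetime in seconds (one week, as suggested above)

def make_address(domain="domain.com", now=None):
    """Build an address whose local part is the current epoch time in seconds."""
    now = int(time.time()) if now is None else int(now)
    return f"{now}@{domain}"

def is_still_valid(address, now=None, lifetime=WEEK):
    """Accept mail only if the address was minted within the last `lifetime` seconds."""
    now = int(time.time()) if now is None else int(now)
    local, _, _ = address.partition("@")
    if not local.isdigit():
        return False  # not one of our timestamped addresses
    minted = int(local)
    return 0 <= now - minted <= lifetime
```

Anything arriving for a timestamp older than a week gets dropped, and the timestamp itself tells you roughly when and where the address was harvested.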
Just my 0.02p
Rus
Cheap UK and US VPS
As someone who recently acquired a B.S. in mathematics several days ago, I understand how these filters work. They are an excellent way to fight spam over the older methods.
;)
However, I think that ultimately this sort of thing misses the point. Spam needs to be fought in the courts, not in the battlefield. I'm afraid that the success of these filters will cause spam NOT to become illegal, and thus lead to a world where we have a constant trickle of spam, albeit in small amounts.
I think we all agree that we want spam to be gone entirely, as is evidenced by the first post being labeled as "troll"
- I am a viral sig. Please copy me and help me spread. [strain #2] Thank you
Fighting spam as an individual will never work no matter how great filter algorithms you develop. Hell, even the blacklists won't work until the ISPs are forced, by guerilla action if necessary, to crack down on spammers and hard.
I have been using POPFile since January, and I know it uses pseudowords for all kinds of features spammers use, like comments, remote images etc. (html:comment, html:imgremotesrc).
Does SpamBayes do anything similar?
But that's still 3 pieces of shit you have to deal with. Sure, it's a simple click to delete, but the fact is WE SHOULD NOT FUCKING HAVE TO.
Some wanker spammer got my email address and within two days my spam volume went from zero (seriously) to 30+ a day. All for the same fucking thing. These shits should be legal to hunt and kill.
In response to the original troll, it's a bogus analogy. We PAY for our internet access. We get bombarded with ads on damn near every site... The revenue generated from these scumbags does NOT go towards funding your internet access, or the production of new content. It goes to their wallets. Ergo, you're an idiot.
Side note: "Last, best hope"... I can't be alone in expecting "for peace" to come after that.
i have been using popfile for a while, but its accuracy still isn't good enough to skip over just yet...
;\
maybe i should restart it, might've let one or two spam emails go into home. that's the problem with bayes filtering: if you make a mistake you have to restart
Classification Accuracy
Emails classified: 206
Classification errors: 45
Accuracy: 78.15%
Your server and its hard drives still end up being a storage bin for it, and the spammers will continue to send as long as your machine allows it to be received. Always remember that spam differs from postal junk mail, in that the -receiver- pays for it. Unsolicited postage due mail.
Spam must be -blocked- and the ISPs that allow/encourage its continued spread must be re-educated, or be put out of business. Only when spam becomes costly to send will it diminish.
The currently proposed laws on the subject focus on content rather than consent. They don't mind if you get spammed with hundreds of ads, provided what is being advertised isn't fraudulent. They overlook the fact that the claim that you 'opted in' for the spam is in itself the lie and the fraud.
--Teh
I have long been an advocate of Bayesian or keyword based spam filters, but have recently been forced to change my outlook, and to argue that MULTIPLE SIMULTANEOUS solutions are the answer.
I encountered a very simple but unique spam system which works entirely on the sender's address. Simply, you create a small database with the domains/addresses you want to whitelist. Then, a program screens your mail, and if the sender is not in your whitelist, it sends an e-mail BACK to the sender with a simple URL (or even an actual link for HTML e-mail clients) which states that they REALLY want to send the e-mail to its destination. When this is done, they are added to the whitelist. Therefore, mails from forged remote addresses are no longer a problem, and neither are mails from trusted sources. And, better than SPEWS or similar blacklists, the sender gets a SECOND CHANCE to send their mail to you.
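For anyone who wants to play with the idea, here is a minimal sketch of that challenge-response whitelist in Python. It is not the commercial product mentioned here; the class and method names are invented, and actually sending the challenge e-mail and serving the confirmation URL are left out.

```python
import secrets

class ChallengeResponseFilter:
    """Hold mail from unknown senders until they confirm via a one-time token."""

    def __init__(self, whitelist=()):
        self.whitelist = {addr.lower() for addr in whitelist}
        self.pending = {}  # token -> (sender, held message)

    def receive(self, sender, message):
        sender = sender.lower()
        if sender in self.whitelist:
            return ("deliver", message)
        # Unknown sender: hold the message and hand back a token that would
        # be embedded in the URL mailed to the sender.
        token = secrets.token_urlsafe(16)
        self.pending[token] = (sender, message)
        return ("challenge", token)

    def confirm(self, token):
        """Called when the sender follows the URL; releases the held message."""
        entry = self.pending.pop(token, None)
        if entry is None:
            return None  # unknown or already-used token
        sender, message = entry
        self.whitelist.add(sender)
        return message
```

Mail from a forged address never gets confirmed, so it just sits in `pending` until you expire it.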
There's a commercial solution using this system right now, although the URL escapes me.
Of course, one could encounter problems when ordering online, say. Droids at Amazon will not be clicking your links to make sure your order receipt got through. One could argue that you'd put things like Amazon.com in the whitelist, but what if someone used amazon.com as a spoofed e-mail domain/address? Ay, there's the rub. But if this system were tied in with a Bayesian system, it'd be pretty unbeatable. What's more the Bayesian system would have extra data for negative matches, in the form of e-mails that were never 'approved', and positive data in the form of those that were.
So, I'd be more interested in producing a homebrew system that used MULTIPLE weaker systems, than one supposed 'sure fire' method.. as I feel no one method is perfect, whereas multiple systems can approach this nirvana.
Which is an interesting statement.... Just think about this for a second (I do not claim this to be an end-all statement about spam): You pay for cable TV, and yet have to sit through over 12 minutes of advertisements when watching a 1 hour program. Now why is that advertising, yet getting spam email is not? You pay for both media per month, yet one is generally allowed "spam" but the other is not. Please, go right on ahead and point out why spam is not the same as a commercial. I simply wanted to bring this topic up for discussion.
- I love animals. I try to eat at least one a day.
ideally, i think the client should take care of the filtering. Pour your resources into improving context based filtering and let the individual clients do the dumping. Widespread usage of this kind of filtering could make spam even further unprofitable. Since spam is entirely business related, it would likely reduce the numbers of it passing through the network.
From a sysadmin's POV, this doesn't halt the issue of spam eating bandwidth or disk space. I'll address that next.
Disk space depends on what kind of e-mail your organization uses. For POP3, most people delete e-mail on the server after it's downloaded, so while the disk space may be consumed with spam, it would be temporary. That is, unless you have a lot of dead or rarely used accounts. In that case, you should have policies in place for when to wipe users' accounts out after a set period of time. Or set up some kind of forwarding policy. If you're using something like IMAP, then using a server-wide content filtering system as mentioned above would be effective.
For bandwidth, the only way to halt spam from consuming your bandwidth is by blocking packets at the router. If you use SPEWS to dump the e-mail at your e-mail server, it has still consumed your bandwidth. So you'd have to block the packets directly. I think this is draconian and should be avoided, for the net's sake. Unfortunately there really is no good solution to this, for as long as spam flows, it flows and consumes bandwidth. The only way to halt it is to halt the initial spamming to begin with. As mentioned above, when your spammer's audience never exists as a result of good content filtering, the spam will be unprofitable and lessen somewhat.
Attacking users and their ISP's won't do much good, aside from causing spammers to jump from isp to isp, something they're readily willing to do. Attacking regular users just makes you a big jerk.
I have more than enough things to worry about, including my shopping list, my housekeeping tasks, my garden... to lose time and nerves over that bit of junk: when I get an unexpected commercial in my snail-mailbox, this *is* annoying as, here in Switzerland, we pay for each garbage bag we throw away.
So, spam is junk, indeed, but i dispose of it almost instantaneously.
I won't make spamfighting my Holy War...
I have more interesting and valuable things to deal with IRL and I am naturally optimistic.
Let the spammers waste their time sending their hectobytes of off-topic (mostly american-centric) mail to my ever-improving filter.
Trolling using another account since 2005.
NEVER?....Try the BBC?
No ads, quality programming, small fee.
I'd be happy to.
I don't know about you but for me e-mail is an important part of my work - not something comparable to watching cable TV.
Spam clogs my mailbox and I have lost several important e-mails from clients when deleting the spam which, by the way, is often disguised as legitimate non-commercial mail and comes with forged headers. In addition to pushing fraudulent products, these facts make spam a completely different beast from the cable TV and its legitimate, controlled ads which eat up only my free time - not my emails or work efficiency.
BOO! TERRO
I don't mean to troll, but I hope it's not too late to put an end to the unfortunate term "Bayesian spam filtering". This is perhaps the worst abuse of the adjective "Bayesian" I've seen, because nothing crucially depends on the application of Bayes' Theorem and/or on the use of Bayesian methods (informative priors, model selection, etc.). Why not simply call it "data driven spam classification" (as opposed to "rule based") or "empirical spam filtering"?
If the spam disaster had struck fifteen years ago, we'd all be talking about "neural spam filtering" (using artificial neural networks, ANNs) and basking in the warm fuzzy feeling imparted by the term "neural". But ANNs and Bayesian classifiers have the same interface: both are trained on labeled data and can be used to classify unlabeled data. The implementation details are not of primary importance, and if you think they are, I'd encourage you to look into large margin classifiers instead of Naive Bayes or ANNs.
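To make the "same interface" point concrete, here is a toy Naive Bayes classifier in Python with exactly that train/classify shape. It is only a sketch: whitespace tokenizing and add-one smoothing are choices made here for brevity, not what POPFile, SpamBayes or K9 actually do.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training messages

    def train(self, text, label):
        """Learn from one labeled message."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def classify(self, text):
        """Return the label with the highest log-probability for this text."""
        vocab = {w for counts in self.word_counts.values() for w in counts}
        vocab_size = max(len(vocab), 1)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Start from the class prior, then add per-word log-likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                # Add-one smoothing so unseen words don't zero out the class.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Swap the body of `classify` for a neural net or a large margin classifier and the calling code would not change, which is the point being made above.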
Marklar: marklar
I'd personally go for the last option... Maybe the next-to-last if their suit takes place in a really democratic place (there are 278 million American citizens and 2.2 million of them are in jail, this is a *lot*).
Trolling using another account since 2005.
But you still do get spam. Exactly as much, if not more, because you use Bayesian filtering. Spam still wastes your bandwidth to download that spam before it can be filtered. Spam still wastes any inbox size limits your ISP might impose. Spam cuts into any quota a forwarding service might now or in the future impose on your account, or it could take you to a higher charge level if you pay for a forwarding service. It costs your ISP money, costs that one way or another are eventually paid by you. Even the processing power for that Bayesian filtering costs you CPU cycles, while having no negative effect on the spammers whatsoever.
While you might not think you care how much spam I get, you might care if dozens, hundreds or thousands of other users at your work also get tons of spam, particularly when all of that spam significantly cuts into your bandwidth. And you will care when overload from spam on your mail server is so bad that it causes failures, effectively causing a D.O.S. situation.
And as long as geeks happily play with their little Bayesian filters, they stop seeing spam and so stop complaining to the providers that are letting spam get through. They stop doing other things that might make spammers' lives difficult. Heck, I fully expect some spam haters with an attitude like yours to say within earshot of a Congressman or Senator something like "Oh, I never get any Spam. Spam can be filtered easily and nothing should be done about it". The spammers should love Bayesian filtering, it takes the pressure off them while allowing them to reach exactly the same number of marks with a mailing.
I'm an American. I love this country and the freedoms that we used to have.
When will 'the net community' finally get it?
filtering is no solution as long as there's no way to stop the spammers!
Or would you say that ignoring the corpses in the gutters would be a solution to the problem of violence on the streets?
bye
[L]
By this article, SpamBayes.
;)
Which only works out of the box with Outlook 2000/Express. Woopy doo.
Are there any recommendations for those of us who aren't forced to use Outlook? I use Eudora myself, and have for years, so I'm not looking for a new email client recommendation.
Computational Madness in a round package.
I think there's a very simple distinction that can be made between spam and television advertising, and it has to do with the amount of control that your service provider exercises over the advertising content.
When you watch cable TV, you know that for an hour of content, you are going to see up to 12 minutes of advertising. The advertising is controlled by the cable company, and no-one can advertise on the channel without going through that 'filter'.
Spam, on the other hand, is not restricted. If I receive 100 e-mails a day, anywhere from 0 to 100 of them could be spam. None of those spams are sanctioned (or controlled) by my service-provider, and they were not part of the package I signed up for.
I get hundreds of mails per day and it's pretty good at picking out the spam.
It's no good at more subtle classification though, but spam/not spam is highly useful.
If you make a mistake filtering you don't have to restart, you just keep training it, eventually your mistake will be drowned out as statistical noise.
I've since been moved to Notes so no more spam filtering.
Government of the people, by corporate executives, for corporate profits.
Putas chicas! Muy caliente!!
First of all, I'll never reply to such a mail. You filter your spam on your own. Don't make me filter your spam. I sent you a mail because I thought it contained valuable info. If you don't want to receive it, too bad for you, I don't care.
What about order confirmations and the like? Mailing lists? They won't reply either, and you won't get their mail.
What about if two people have this system? Will there be an infinite loop of confirmation requests?
How do Bayesian filters solve the problem of pure-image spams? I.e. HTML mails that contain nothing other than an IMG tag. I only see collaborative filters solving this problem - SPAMfighter would be an example of this.
- E-mail contains HTML tags of any sort, except for <A>
- E-mail contains attachments (unless solicited; whitelist)
- With all non-alphanumeric characters removed, certain case-insensitive keyword matches can detect spam
- E-mail is a forward or looks like chainmail / Nigerian scam
- E-mail contains junk strings in subject or sender
- E-mail comes from you, but header doesn't match your send name
- E-mail is excessively large (>20K) and unsolicited (whitelist)
- E-mail headers and/or text contain Mojibake, if unsolicited (whitelist) - this will block anything in Chinese or Russian, for example
- Badly formed headers
- Address doesn't match reverse lookup
If ANY of these apply, then, IMO, YOU FAIL IT!! I think this would be a perfect filter system, if it could be coded. I have a homemade POP3 client that I could stand to add some of this to, I guess...
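A few of these rules are easy enough to code. Below is a partial sketch in Python using the standard email module; it covers only the HTML-tag, attachment, size and keyword rules. The keyword list and function name are placeholders, the tag regex is deliberately crude, and "solicited" is approximated by matching the From header against a whitelist.

```python
import re
from email import message_from_string

KEYWORDS = ("viagra", "diploma")   # hypothetical keyword list
MAX_SIZE = 20 * 1024               # the ">20K" limit from the list above

TAG_RE = re.compile(r"<\s*/?\s*([a-zA-Z][a-zA-Z0-9]*)")

def fails(raw, whitelist=()):
    """Return the list of rules a raw message trips; an empty list means it passes."""
    msg = message_from_string(raw)
    sender = msg.get("From", "").lower()
    solicited = any(addr.lower() in sender for addr in whitelist)
    reasons = []

    # Collect the text of all text/* parts; note attachments as we go.
    body_parts, has_attachment = [], False
    for part in msg.walk():
        if part.is_multipart():
            continue
        if part.get_filename():
            has_attachment = True
        elif part.get_content_maintype() == "text":
            body_parts.append(str(part.get_payload()))
    body = "\n".join(body_parts)

    tags = {t.lower() for t in TAG_RE.findall(body)}
    if tags - {"a"}:
        reasons.append("html tags other than <a>")
    if has_attachment and not solicited:
        reasons.append("unsolicited attachment")
    if len(raw) > MAX_SIZE and not solicited:
        reasons.append("unsolicited and over 20K")
    # Strip everything non-alphanumeric so "V.I.A.G.R.A" still matches.
    squeezed = re.sub(r"[^a-z0-9]", "", (msg.get("Subject", "") + body).lower())
    if any(k in squeezed for k in KEYWORDS):
        reasons.append("keyword match")
    return reasons
```

The header, reverse-lookup and Mojibake checks would need more than this, and the whole thing will happily flag plenty of legitimate mail.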
-uso.
Dreams, dreams, don't doubt dreams, dreaming children's dreaming dreams. Sailor Moon SS
No ads??? No, it's stuffed to the brim with promos for their own stuff though... (Gardening magazine, History magazine, Nature magazine, Radio Times, TellyTubby toys, Fimbles stuff, trailers for upcoming programmes and series)
Quality programming??? it's gone really downmarket in the last few years..
Small fee??? That fee is your license for receiving _all_ television programs, even cable and satellite... not just the BBC. Although that license money goes to the BBC, really a goodly share of it should go to the other service providers as well.
Donald 'Duck' Dunn: We had a band powerful enough to turn goat piss into gasoline.
Sorry, but filters are not the final answer. Even when the filters can "learn", the user still has to expend a certain amount of effort to "teach" the software. And quite frankly, spammers (or the people who write automated spamming software) just need to study the filters and learn to get around them. And worse, you can never be sure that the filter is not deleting email that you actually want, unless you set it to never delete suspect mail, allowing you to examine and delete it manually. But at this point, you've gained absolutely nothing -- simply setting your email client to put all email that's from addresses not in your address book, or that doesn't contain your exact address in the "To:" line will achieve exactly the same effect.
The only thing that can truly save email is to switch to a service that requires authentication of senders.
The quality of the programming is also going downhill fast - perhaps viewers outside of the UK are spared the dross we have to wade through here?
I know this is slightly off topic, but can someone answer me a reasonably simple question thats been bugging me for a while?
Why, instead of hunting down the spammers, do we not hunt down the people who are selling and advertising their junk via the spammers?
The spammers purposely make themselves difficult to find, but surely it must be easier to track down a company that is collecting money and sending out products? Why not make using spammers' services illegal, and fine and punish those doing so?
I think I'm correct in saying, and please tell me if I'm wrong, that here in the UK a similar situation is people "fly-posting". In these cases, if advertising posters are put somewhere illegal or unwanted, it is not the person who put the poster up that is fined, but the club, record label, or whoever is being advertised that takes the rap.
Just my 0.02p
You probably ARE a scumbag spammer.
For people who have to pay for their online time (England for example), these scumbags are essentially stealing money from people. Filtering only works once you've downloaded the mail. You still have to download their worthless drivel. Sure, it may be pennies a week in costs for a user, but you tally that up over a year or two of dealing with these idiots, and you've got a sizeable chunk of change. Certainly enough for a nice pizza.
Let's not forget the TIME these shits waste as well. All this work invested in stopping spam. Who knows what cool stuff may have come from the minds who instead are working on ways of dealing with the email cancer.
As I said, these scumbags should be legal to hunt and kill.
My ISP provides me ipv6 natively. Yep, a full /48 for me. And it's on a plain vanilla home DSL line.
/dev/ttyp1
Aug 11 03:19:02 traminer pppoe[19276]: Sent PADT
Aug 11 03:19:02 traminer pppd[12690]: Serial connection established.
Aug 11 03:19:02 traminer pppd[12690]: Using interface ppp0
Aug 11 03:19:02 traminer pppd[12690]: Connect: ppp0 <-->
Aug 11 03:19:08 traminer pppoe[12694]: PADS: Service-Name: ''
Aug 11 03:19:08 traminer pppoe[12694]: PPP session is 4029
Aug 11 03:19:12 traminer pppd[12690]: local LL address fe80::c959:a698:ed94:0ec3
Aug 11 03:19:12 traminer pppd[12690]: remote LL address fe80::0208:e2ff:fe0a:d808
Aug 11 03:19:12 traminer pppd[12690]: Cannot determine ethernet address for proxy ARP
Aug 11 03:19:12 traminer pppd[12690]: local IP address 62.212.101.212
Aug 11 03:19:12 traminer pppd[12690]: remote IP address 62.4.16.244
Aug 11 03:19:12 traminer pppd[12690]: primary DNS address 62.4.16.70
Aug 11 03:19:12 traminer pppd[12690]: secondary DNS address 62.4.17.69
Because advertisers pay good money for airtime, which in turn provides the cable people with funds to give you more tv shows. But then again...I guess you think those wares peddled by spammers work. In which case you are part of the problem.
Never underestimate the predictability of human stupidity...
You used to get a free satellite viewing card for your licence fee giving access to all the "terrestrial" public channels on satellite, which was great if you had a spare decoder and crappy terrestrial reception like where I live. To save a few quid, the BBC no longer fund these cards and have gone unencrypted, which means I've lost the other terrestrial channels upstairs. Thanks guys.
When I am king, you will be first against the wall.
Spam is aimed at people who are in need of a miracle product or credulous enough to purchase what spammers have to offer. And I don't think those people will take the trouble to run a spam filter, let alone train it to recognize spam, because Bayesian spam filters need to learn and it can take a week or two before they are effective. This way spammers will always reach their goal, but you will have a way to filter the spam out of your mailbox. The only solution that would really annoy spammers would be Bayesian filtering on the server side. But ISPs will probably never do that, because the Bayesian algorithm is not 100 percent reliable and can filter out legitimate mail, the so-called "false positives". IMHO, Bayesian filtering will never be the solution to *reduce* the spam flow. It could be useful for people who want to sort spam into a junk directory, or for businesses that want to get rid of spam at the risk of losing some good emails. Oh, by the way, Spam is a trademark! ;p
Regards
Moz's Bayesian filtering works well, but its Achilles heel is that it doesn't work on the POP3 server, so you still have to download everything. As POP3 allows the header and the first part of the message body to be read without downloading it, surely there could be an option - once Moz has been trained and you're fairly sure the false positive rate is negligible - for filters to operate on the server and delete spam from there?
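Something like this can be sketched outside Mozilla with Python's poplib, which exposes the POP3 TOP and DELE commands. This is only an illustration of the idea, not a Mozilla feature: `looks_like_spam` is a stand-in for the trained filter, and TOP is an optional POP3 command, so not every server supports it.

```python
import poplib

def delete_spam_on_server(conn, looks_like_spam, body_lines=10):
    """Peek at each message with TOP and delete the ones the classifier flags.

    `conn` is a logged-in poplib.POP3 (or POP3_SSL) object; `looks_like_spam`
    is a hypothetical callable standing in for the trained filter.
    """
    deleted = []
    count, _size = conn.stat()
    for num in range(1, count + 1):
        # TOP returns the headers plus the first `body_lines` lines of the body.
        _resp, lines, _octets = conn.top(num, body_lines)
        preview = b"\n".join(lines).decode("utf-8", errors="replace")
        if looks_like_spam(preview):
            conn.dele(num)  # marked for deletion; applied when the session QUITs
            deleted.append(num)
    return deleted
```

You would log in first with `poplib.POP3_SSL(host)`, `user()` and `pass_()`, run this, then call `quit()` to commit the deletions. A false positive here is gone for good, so it only makes sense once you trust the filter.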
When I am king, you will be first against the wall.
Yup, I was also thinking "for peace" ;) Long live B5
... I've had the same problem you had - going from 0 to 25-30 a day, sometimes even more. I don't think we'll ever be able to stop the spammers, but I think that some of the blame has to be put on those people offering free mail services like Hotmail, Yahoo.com (and .ca) and AOL. 95% of my spam originates from accounts on their domains, and when I try to send bounce messages with Mailwasher, the accounts used to spam me don't exist anymore ... so if these mail services had made a system that couldn't be used to create accounts automatically with a script, we might see a little less spam out on the net, as I doubt that the spammers would bother spending lots of time creating accounts themselves ...
;)
But on the other side
I like the thought of an all-year hunting license for spammers though
Since you use the slang "wanker", I'm assuming you're UK based. However, it's a good thing you didn't do that here in the States. I would have pulled the trigger and the good-ol-boy coroner would have ruled it a hunting accident. Official cause of death: misadventure.
it is viruses you clot
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
Except that it doesn't. With any other service, an ISP, etc, you can take your business elsewhere if you don't like the service being - but the BBC still get paid the same if you watch ITV all the time. And the licence payers have no say in the programmes that are shown - the BBC have a pretty easy ride.
If you are going to delete everything that comes in via the Usenet address, why do you include a valid email as your return address?
you could reduce the flow to 0 by putting
From: not_real@naimod.moc
and to be honest if I was an email harvester I might have noticed "user at domain dot com" and be harvesting those too
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
If you have ever signed up with the Direct Marketing Association's Mail Preference Service (list of people not to send junk mail to), but continue to receive stacks of crap every day, here is what you can do about it: Prohibitory Order
Links to the PDFs you need to print and mail in are included.
"A little-known Federal law allows individuals to send a Prohibitory Order against companies that are sending unsolicited sexually provocative or erotically arousing mail. The Supreme Court went one step further, allowing individuals to decide what constitutes "erotically arousing" mail. The law makes it illegal for a company to send mail to an individual within thirty days of receiving the Order."
"Postmasters may not refuse to accept a Form 1500 because the advertisement in question does not appear to be sexually oriented. Only the addressee may make that determination."
Whenever the offence inspires less horror than the punishment, the rigour of penal law is obliged to give way...
"Support both Windows and Linux " ...
"The first requirement is because I wanted the results to be applicable to everyone"
My how the definition of everyone has changed. So it's bad luck Mac, Solaris, *BSD, HP-UX, VMS users...
-----
One of the things I love about popfile is it is not a Spam filter. It is a general mail filter. I have about ten categories of mail that it sorts out for me. This also helps cut out false positives. 'Work', 'Personal', 'Friends' are all much more similar to each other than to 'Spam'.
And advertising is regulated, at least in my country.
How many ads do you see on television that are pornographic and targeted at children? How many penis enlargement ads do you see on television every day?
That's right, in television there are standards to be upheld and when violated, there is usually a backlash against the individuals concerned, well at least in my country :)
POPfile really got shortchanged by this review. It serves as much more than a spam filter. I thought I'd give SpamBayes a try anyway, but the Outlook plugin won't install on my XP machine. Some problem with an unresolved dependency in shlwapi.dll... boring. The point is, the SpamBayes site doesn't have a tech support forum where I can ask for help with this kind of problem.
The power of Christ compiles you!
The real solution would be to stop Spam from being sent, instead of stopping Spam from being received.
Now, that's not very easy, I guess. So to stop it from being sent, we stop it from being received so that the spam serves no purpose. Now this won't work until the vast majority starts using good spam filters. What we need is Bayesian filters in Outlook Express, so that normal users use them.
An analysis of filtering methods against spam is kind of like a comparison of bullet-proof vests in that there's no incentive to stop someone from pointing a gun at you and firing it. In the past, spammers have been grossly affected by more sweeping changes, and I'm afraid filtering methods are only creating the mindset of, "Give up, use this software, it will do the deleting for you." It takes the attitude of, "just delete the stuff" and makes it automatic; sure it's convenient for a time, but in a year you're still going to get spam and your ISP will likely have fewer resources to deal with the complaints.
I'm saying, why not focus instead on technology which puts a bigger dent in spammers' ability to operate, like how to secure against proxy hijacking.
"The cup... the drop... it's a YES!"
Where do you think most of the spam out there comes from? Small business owners. Stop being unpatriotic and show your support for American small business owners.
"Orthodoxy means not thinking--not needing to think. Orthodoxy is unconsciousness." --Eric Blair
Yahoo uses captchas to prevent scripted sign-ups, so if you get anything from a Yahoo mail account, there was once a human (OK, a subhuman) at the other end.
When I am king, you will be first against the wall.
Yes the filters can safely be put on the server side.
simply let all the email through, but everything that is tagged as spam is prepended in the subject line as [spam] and now your users have to sort it out.
and by simply adding a harvesting email address pool to snag spam spam and only spam you can automate the addition of new spam rules.
works great.
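A minimal Python sketch of the tagging step described above, assuming the spam/not-spam decision has already been made elsewhere. The function name `tag_subject` and the sample message are made up for illustration; only the `[spam]` prefix comes from the comment.

```python
from email.message import EmailMessage

SPAM_TAG = "[spam]"  # prefix mentioned in the comment above


def tag_subject(msg, is_spam):
    """Prepend SPAM_TAG to the Subject header of a flagged message.

    The message is still delivered; the user sorts on the tag.
    """
    if not is_spam:
        return msg
    subject = msg.get("Subject", "")
    if not subject.startswith(SPAM_TAG):
        del msg["Subject"]  # headers are multi-valued, so drop the old one first
        msg["Subject"] = f"{SPAM_TAG} {subject}".strip()
    return msg


# Hypothetical message used only to show the effect.
msg = EmailMessage()
msg["Subject"] = "Cheap herbal alternative"
msg.set_content("...")
tag_subject(msg, is_spam=True)
print(msg["Subject"])
```

A client-side rule that matches on the prefix can then file tagged mail into its own folder.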
Do not look at laser with remaining good eye.
cripples all cripples.
... idiots.
like adding electronic fuel injection/management to a shitty engine.
nobody has the cash to fix the problem from the bottom up so just add crutches
i don't want to end up a borg because some much hyped about gen-manipulation went wrong and human-kind couldn't undo it!
how much processing time is it going to cost to filter all the emails sent around the world in 24 hours please? a lot!
that this SPAM filtering MIGHT be covert operation to monitor email traffic doesn't seem to bother anyone (distributed CARNIVORE system).
oh hell, if carnivore is really monitoring all the email traffic why not give it the job of cleaning out SPAM?
but this is all nonsense anyway because fbi.gov can't even manage their own domain (hosted by a33.g.akamai.net [202.181.171.73]). talk about home land security!
i think the data on the fbi.gov website isn't even stored in america...
%|
$0.04US charge for every Email SENT. College accounts can get refunded costs by delivering a sent mail list.
This will stop spamming quick... or at least make it slow way down.
1,000,000 spams = $40,000.00US, more than the entire net worth of the most successful spammer.
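The arithmetic behind that figure, as a small Python check. The 4-cent charge is the number proposed above; the function name is just for illustration.

```python
CHARGE_CENTS = 4  # $0.04 per message sent, as proposed above


def sending_cost_dollars(num_messages, cents_each=CHARGE_CENTS):
    """Total charge in dollars for sending num_messages e-mails."""
    return num_messages * cents_each / 100


print(sending_cost_dollars(1_000_000))  # one million messages
print(sending_cost_dollars(50))         # a day of ordinary personal mail
```

At this rate a million-message run costs $40,000, while fifty messages a day cost the ordinary user $2.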
Do not look at laser with remaining good eye.
hey - it pays for the radio too guys.. Just listen to Radios 4 and 3 if you want a bit of quality.
SpamBayes only takes a day or two to get up to speed. After a week it's about as good as it's going to get.
My problem with spam isn't that it exists.
My complaint is it shows up in my inbox.
My problem with violence is it could happen to me, not that it happens to others.
I think this is a typical and selfish attitude.
I believe the URL that you are looking for (there was just a story on it on the public radio station the other day) is:
:)
http://www.mailblocks.com
Have a nice day
Use the Z-modem protocol between Information Superhighway routers to compress the plaintext. ~LordOfYourPants
POPFile's utility does not lie just in managing the spam menace. To me, the real utility in POPFile is the ability to create x number of buckets and train it to sort your mail. SpamBayes looks great for spam but has no further utility. I like having POPFile sort my work from personal emails, and file all my mailing lists in another, and even jokes. Of course there is the spam folder that I check every now and then. I look forward to it being able to support IMAP servers as well.
how to deal with lqhwquczzuk lqhwquniounxqs lqhwqusgwthsgn lqhwqumkzhtd lqhwquxuwdmvgxr lqhwqucslcyqxki lqhwquzytbktnxhqlqhwqurstlyagn lqhwquzaloqzq lqhwqumlkohoxfq lqhwqurjsnbjvagp lqhwquyjnfo lqhwquwxaqgnvlox lqhwqudxnht lqhwqurqrlqhwquzspjarube lqhwquuvryuc lqhwqukisuinib lqhwqurqkxans lqhwqufbjxrcgbrl lqhwqugqagax lqhwquhf lqhwqucluiinadcylqhwquhr.
Can some US people sue guys who send real junk like this? Or maybe ask them for a license for sending themselves email ($699 seems a reasonable price for a single inbox?)
Actually, the rapid growth of endorsements, product placements, "documentaries" about products etc. means that you're really seeing far more than just 12 minutes of advertising, the only restriction is that you're limited to 12 minutes of OBVIOUS advertising.
SpamAssassin catches 99.99% of the SPAM i get.
i aggregate about a dozen email addresses through one Linux user (postmaster) and then filter and distribute the mail using procmail and SpamAssassin.
after tweaking it for the first month or so i have not had to mess with SpamAssassin's filtering.
i get the occasional false positive but my setup rarely lets a false negative slip through.
when i do get a false positive, i put the offender email address in a temporary SpamAssassin whitelist and send the message to a dummy email address (deSPAM) which de-spams the message before SpamAssassin passes it to the intended user.
email me if you want to learn more...
thor
I have finally had to resort to something that scans post download - so I finally have to resort to the pop3 scanner of sorts.
I currently have at least 5 or so e-mail addresses, all of which just funnel down into a single address at this point.
But I am starting up an online company and need to add at least 10 more addresses (info@companyname.com, sales@companyname.com, etc).
I currently get just over 100 spams a day, and I am fine with that - I set the filters to be pretty restrictive and if I miss mail, no big deal. I have a small enough list of people that contact me that I add them to the whitelist and then *most* new people contacting me get through assuming what they are talking about is sufficiently non-spammy.
I am using SpamAssassin 2.60 and it is working well for me. I have tweaked the settings for my uses.
But since my company will have these web facing e-mails, and I really can't miss any of them since they are existing or potential new clients, I have to lessen the strictness of my spam filter.
As a result, the 1-5 e-mails that sneak through each week are going to increase just with the less strict settings, as well as with the increase of new addresses available and coming in.
What I like so much about SpamAssassin is that it runs on the server and therefore it yanks out the spam and I don't need to download it over my connection. It was fine while I was in the States and had a cheap and fast connection to the net.
But now that I am on a variety of connections and speeds, having to download 100 messages that are spam and THEN have them filtered out to find out that I just downloaded something, taking up bandwidth and time for naught, is really annoying.
I would say that well over 90% of my mail right now is spam - so getting rid of that before I download it is key.
That said, I know now that things are going to get through, so I need a client side pop3 filter.
I liked the idea of Cloudmark's SpamNet and so I've been giving that a shot. It is free for a month and it is easy to install.
I have been using it now for a few days - maybe a week at most.
I can't say that I have been particularly impressed with it. Of the 10 or so e-mails that get through each week (the filter is less strict now), it grabs 5-8 of them.
I of course would love for it to pick up on all of them.
That said, it is integrated well with Outlook, is easy to use, and the service is cheap once I have to pay for it ($4 a month I think).
I know there are totally free options out there, and I will very likely look into them at some point soon before committing to paying for SpamNet, but the ease of installation and usage is key to me.
I used to love the "fun" of toying with something and getting it to work. I liked it if it was annoying or challenging - I had time to do it and it made me feel like part of a group that knew what they were doing, and we were better than the slobs that couldn't get it working.
But now I'm very busy and actually do things with my time that make money - and my free time is getting increasingly sparse.
As a result, I just want things that work straight out of the box and always work.
There are some odd things afoot now, in the Villa Straylight.
I did my own investigation of spam filters about a week ago. I didn't test the actual algorithms, just the features.
SpamPal with the add-on Bayesian filter (search Google for it) came out top. It works as a proxy and also provides blacklist/whitelist/known Spammer list checking.
What a twat.
OK, I'll bite on this troll just because it's still at zero, and the moderators need a reason to finish it off, placing it firmly in -1 hell where it belongs.
In the days before user-paid television service, it is true that advertising was the business impetus to put up huge powerful TV transmitters and undertake the other investments necessary to support land-based TV broadcasting. You are correct, therefore, in pointing out that TV content from 1977 derives from the business need to advertise.
But to suggest that the meager investments in bandwidth and hardware the average spammer makes are somehow otherwise useful to the world is absurd. When one considers that most of the infrastructure costs of spam are borne by the recipient rather than the sender, the idea of spammers contributing to the public good is asinine.
who are those slashdot people? they swept over like Mongol-Tartars.
No it's not.
I get spam at the rate of 1 spam mail per 6 months or so. Or maybe even less. I can't remember getting a single spam email on my actual email address for about a year.
If you have an account on a crapless domain (i.e. not hotmail.com, msn.com, aol.com and the likes),
it all comes down to this very simple rule:
Do not, under any circumstance, have your email address posted publicly accessible ANYWHERE on the web.
It WILL get trawled. And then it will be spammed relentlessly.
If you have an existing address you don't want to give up, or an address at hotmail.com or a similar place, dump it.
Then exercise a bit of common sense about where you use your actual address.
I have a domain which catches email to unknown addresses and put them in my regular mailbox.
Whenever I have to give an email address to some place on the web, I use *domain-i-am-currently-visiting*@mydomain.com. So if I am visiting foobar.com, I would put in foobar.com@mydomain.com.
I have been doing this for years. It enables me to see what was the source of the leak when I get spam on one of the addresses.
It has taught me one thing: I have never, ever, ever, in all my years of online shopping, forum posting etc, come across a single website that has ignored its own privacy statement. Ever. Even the slightly sketchy sites (like divx subtitle sites) don't leak addresses.
I was surprised to realize this.
The only addresses I ever get spam on are the ones I know to be publicly displayed on the web.
So it's that easy to avoid spam.
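The per-site address trick described above can be sketched in a few lines of Python. This assumes a catch-all domain is already configured on the mail server; `mydomain.com` is the placeholder from the comment, and both function names are hypothetical.

```python
MY_DOMAIN = "mydomain.com"  # placeholder catch-all domain from the comment


def alias_for(site):
    """Address to hand out when signing up at the given site."""
    return f"{site.lower()}@{MY_DOMAIN}"


def leak_source(recipient):
    """Return the site an alias was given to, or None if it is not one of ours."""
    local, _, domain = recipient.rpartition("@")
    if not local or domain.lower() != MY_DOMAIN:
        return None
    return local.lower()


print(alias_for("foobar.com"))
print(leak_source("foobar.com@mydomain.com"))
```

When spam arrives, the local part of the recipient address names the site where that address was used.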
Give me liberty or give me kill -s 9
I already pay for my email account. Why should I have to put up with unsolicited commercial email? Make email advertizing mandatory "opt-in" and I'll agree with you.
I Just installed SpamBayes !
It looks great, but does anyone have init.d startup scripts to start the pop3proxy?
This method of combating SPAM is amazing to me. Admittedly I'm a little behind the geek times so my interest in this method was piqued when Apple released Mail.app. But I still use Mac OS 9 and am in no rush to run X yet so I'm glad to see there are alternatives that I can use.
I think the only reasonable way to rid the world of SPAM is to get the foolish folk who respond to it to stop. The reason there is so much of it now is that it seems to work; there are people who actually respond to it. If these people stopped responding to it the use of SPAM would most likely diminish.
Sending SPAM costs money. No sense spending that money if no profit is made.
I'm looking for a spam filtering solution that will work with my a)desktop client Eudora, b)webmail client, and with a c)Palm client--and maybe d)a cell phone down the road.
SpamBayes seems to do the trick for (a), where I can filter on the client, but how can I accomplish b) c) and d)??
Can you recommend a good webmail client for b)webmail? I played around with Squirrelmail and liked it, but have moved back to POP mail for the most flexible approach with good clients everywhere. That leaves me with Neomail, or some other you recommend...
And how about c)Palm and d)cellphone? The problem with most of the mark-message-with-new-header approach is that you are still downloading the Spam with the Ham and you are getting bandwidth charges for both. I'd like a pop filter that only returns the known good stuff (when I wish).
In the absence of any helpful responses, I will probably hack up pop3proxy.py from SpamBayes to make it do what I'm looking for.
Thanks in advance.
slashsearch.org - slashdot search. powered by google.
The "unsure" feature directly combats the latest Spammer technique -- filter poisoning.
You've all seen it work; the Spammers don't just send you the same spam once, they send you it 5 to 20 times, and they include a clipping from the headlines or something under their pitch.
They're not doing it to get that one mail past to you. They're actually HOPING that you classify all 20 mails as spam.
Why?
Because every time you classify that mail as spam, EVERY SINGLE WORD of that news clipping is "poisoned" inside the filter, and becomes an indicator of a spam. Then you turn around, and get an email from someone legitimate using those common words... and it gets wrongly classified too.
Enough false positives, and the spammers win, because they'll get you to turn the filter back off.
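To make the poisoning effect concrete, here is a deliberately simplified Python sketch. It is a toy per-word frequency model, not SpamBayes' actual classifier, and the class name, training texts, and message counts are all invented for illustration.

```python
from collections import Counter


class WordStats:
    """Per-word message counts for a toy two-class filter."""

    def __init__(self):
        self.spam_msgs = 0
        self.ham_msgs = 0
        self.spam_words = Counter()
        self.ham_words = Counter()

    def train(self, text, is_spam):
        words = set(text.lower().split())  # count each word once per message
        if is_spam:
            self.spam_msgs += 1
            self.spam_words.update(words)
        else:
            self.ham_msgs += 1
            self.ham_words.update(words)

    def spamminess(self, word):
        """Share of the word's per-class frequency that comes from spam (0.5 if unseen)."""
        spam_rate = self.spam_words[word] / max(self.spam_msgs, 1)
        ham_rate = self.ham_words[word] / max(self.ham_msgs, 1)
        if spam_rate + ham_rate == 0:
            return 0.5
        return spam_rate / (spam_rate + ham_rate)


stats = WordStats()
for _ in range(10):
    stats.train("budget meeting tomorrow", is_spam=False)
    stats.train("cheap pills online", is_spam=True)
before = stats.spamminess("meeting")

# The same pitch sent 20 times with an innocent-looking clipping attached.
for _ in range(20):
    stats.train("cheap pills online budget meeting tomorrow", is_spam=True)
after = stats.spamminess("meeting")

print(before, after)
```

In this toy run the score for "meeting" climbs from 0 to about 0.4 once the 20 padded copies are classified as spam, even though every legitimate message still contains the word.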
Enough is enough -- time to establish open hunting season on Spammers.
Fraud and deceptive sales practices are already illegal, why not use those tools to diminish the spam problem?
Most spam that I get is for the sale of products that don't work (eg, penis enlargers), probably don't work (get rich quick), are part of an ongoing swindle operation (stock spam, which is likely pump-n-dump), or may violate other laws (cable descramblers, online pharmacies).
The people collecting the money for these products are the ones paying the spammers; if you can put a significant dent in these fraudulent enterprises, the spammers will lose business and some may move along to something else, and more power to you if you can implicate the spammers as accessories to the fraud; with a fraudulent businessman facing 5-10 in a Federal prison, they might easily roll over on their spammer friends for a reduction in jail time.
Focusing energy and legislative action to "ban" spam is fruitless if you don't eliminate the source of the spam. Deceptive selling is the problem, spam is just a tool for this.
The only problem with this that I can see is that deceptive selling is often considered a legitimate business practice in the US, and there's a lot of people that lie and cheat customers and only get rich. If we could have a little more stringent interpretation of fraud (ie, you have to tell the truth as the common person understands it), then we could easily go after these people and probably put a significant dent in fraud.
Maybe in the good ol' crappy US of A. Here in Britain, we pay a LICENSE so that we can actually pay for programmes without being bombarded by advertising nonsense. Thank heavens for the BBC.
The name refers to Bayesian networks.
A Bayesian approach to spam filtering
A simple mathematical introduction
As a network/web/computer manager, my email has been provided to dozens of companies and trade shows. I still remember the day (August, 3 years ago) when someone first sold my address to a spam list. I went from 2-3 spams per day to 15-20. This spring brought another explosion, this time into the 100+ range. I am currently receiving over 6,000 spam messages every month! Obviously my main email address was useless and needed to be burned on a pyre to purge the evil.
After a week or two of this, I installed SpamBayes in the form of its Outlook plugin. I showed it my email archive as my "good" messages, and a bunch of spam gleaned from my deleted folder as "bad". My mailbox is now perfectly clean. I have received at least 15,000 spam messages since installing SpamBayes, and I have probably had to hit the "Delete As Spam" button about 10 times for ones that it missed, most of those being variations on the Nigerian scheme. It has never grabbed a real message, and the "Unsure" feature localizes everything that I really need to look at in one place.
If you have a spam problem, get SpamBayes. It is that simple. There is no need to speculate about that better method that you thought up, or how it really won't work because of XYZ theory... it works almost perfectly, and it lets you know about anything that it is not sure about with the "Unsure" folder, so it never throws the baby out with the bathwater. In short, this is almost the perfect Spam filter. It even caught the emails that were using GIFs to avoid being filtered on content, placing them in unsure until I said "this is spam", after which I never saw another one. Pretty darned cool!
It is actually kind of fun to watch this thing work. I came in this morning to find 568 new messages in my spam folder, 3 in unsure, all of which were spam. No spam anywhere to be found in my inbox, just 15 unread messages that were correctly left alone by SpamBayes. Just imagine having to flip through 600 emails to find 15 real messages! Now I just hit "CTRL-A DEL" in my spam folder and it is all gone! 5 seconds a day to deal with spam, I can live with that....
Out of that 2.2 million people, somewhere near 700,000 are in jail from possession, use or distribution of marijuana. A law that was originally used to control migrant mexican workers has bogged down the american legal system to the breaking point. Imagine, 700,000 new cells open for child molesters, rapists, spammers, and SCO executives.
Wouldn't it be grand?
PS: Sorry about the OT, but things like this need to be said whenever the opportunity presents itself.
The chains are broken
Loki is free
Ragnarok is at hand...
Why kill the spammer, when you should be focusing on the idiot users that purchase shit from these guys. Kill 'em all! or was it Sue 'em all! I always forget these days!
I got the link and figures wrong on that last post.
As of 2001, # of Americans (only americans, this says nothing about the rest of the world) arrested for marijuana related charges since 1965: Over 11 Million
The chains are broken
Loki is free
Ragnarok is at hand...
No one appears to have mentioned Knowspam yet. 100% spam blocking. No filters. Just a simple "prove you're human" auto-reply sent to the sender and a "friends" list. http://knowspam.net/
Television model: advertisers via commercials (spam) pay for the programming I see, subsidizing my costs to watch. Snail Mail model: advertisers pay the post office to send their bulk mail (spam), subsidizing the cost for me to send and receive mail. Internet model: Spammers have a free ride. I pay to receive their crap. Want to get rid of spammers? Not likely - we have their equivalent in other media. Want to reduce it? Make them pay. Unfortunately there will always be some fool who thinks the herbal viagra will work, or that Munbumi from Nigeria is going to transfer him 20% of $22.5 from the Nigerian reserves. Spammers will always want to get to these people. Right now it costs them next to nothing.
my cube has a window...
SpamAssassin has Bayesian learning, which I have running but not for long enough to test. I recently set up MIMEDefang as a Sendmail milter calling SpamAssassin (which calls Razor). This setup allows Sendmail to reject e-mail beyond an arbitrary SpamAssassin score. The remote mail daemon is informed the mail cannot be delivered.
Setting that score at 8 has resulted in no false positives over a week (I log From and Subject information - it's all obvious spam). Then stuff that scores between 5 and 8 I divert to a separate mail box, which I comb through every day or two. There have been two false positives that ended up in that over the week. This is with hundreds of e-mails for a half-dozen users coming in a day. I also end up, with this setup, with 2-4 spams making it through to my own mailbox (the busiest on the system). These are, because of the filtering, the least obnoxious, and easily enough reported to Razor to spare others. Meanwhile, I like to keep a window open to the mail server running "tail -f mail.info | grep REJECT" and watch a dozen or so attempted spams an hour refused acceptance with a message like "554 5.7.1 SpamAssassin score of 15, rejected" back to the origin, which is enough that if it wasn't spam any good mail daemon will inform the sender, and they can find another way to get through.
Even if this gives spammers a clue about ducking SpamAssassin, the spams that can get by it are by far the least obnoxious. I look forward to seeing if the Bayesian feature helps (it feeds itself anything it scores at over 15 by default). But it's a pretty good system short of that. If it became standard for ISPs to reject all mail with a SpamAssassin score of 8 or higher, the loss of legitimate communications would be exceedingly rare, and politeness standards would be encouraged.
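The three-way split described above (reject at 8, divert between 5 and 8, deliver the rest) comes down to two thresholds. The Python below sketches only that decision logic. It is not MIMEDefang filter code, the function names are hypothetical, and whether the boundaries are inclusive is an assumption.

```python
REJECT_AT = 8.0      # at or above this score the SMTP transaction is refused
QUARANTINE_AT = 5.0  # between this and REJECT_AT, mail goes to a review mailbox


def route(score):
    """Decide what to do with a message given its spam score."""
    if score >= REJECT_AT:
        return "reject"
    if score >= QUARANTINE_AT:
        return "quarantine"
    return "deliver"


def rejection_reply(score):
    """SMTP reply text, modelled on the example message quoted above."""
    return f"554 5.7.1 SpamAssassin score of {score:g}, rejected"


for score in (2.1, 6.4, 15):
    print(score, route(score))
print(rejection_reply(15))
```

Rejecting during the SMTP conversation means a legitimate sender's own mail server reports the failure, rather than the message silently disappearing.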
"with their freedom lost all virtue lose" - Milton
I was more than a little disappointed to see that Apple's Mail.app was not included in the comparison. It wouldn't surprise me in the least if it were already the most widely used Bayesian spam filter. Unsurprisingly, it is also very easy to use.
Mail.app also combines Bayesian filtering with the Address book -- any mail from a known correspondent won't be tagged as Junk. This reduces the risk of false positives. This is an integration cheat not available to stand-alone spam filters, because Apple supplies the Address book app and provides other integration between the two applications. But, (as a self-centered end-user) I don't care that it is a cheat, I am merely happy that it all works well. (And I cross my fingers and hope that somehow, Apple's C/C++/Objective-C programmers are less prone to leaving buffer overflow holes than Microsoft's programmers clearly are.)
The author needs to read Edward Tufte's books on presenting information (e.g., The Visual Display of Quantitative Information).
In short, we are left with lies, damn lies, and what you said.
Nerd rage is the funniest rage.
Spam Sleuth also does all the other things like Whitelists, Blacklists, RBL, Challenge-Response (Turing), etc. It combines the results to determine "spaminess" and takes action.
Another advantage of Spam Sleuth is that it begins working without Bayesian, until it can build up a set of messages it can use for training. It also lets you correct any mistakes before training so you don't get a bad statistical data set.
It is naive (no pun intended) to think that Bayesian will be able to perform better than a multi-view solution.
The difference is very straightforward (which is why you are getting modded as a troll). Advertising on the TV is supposed to be covering (or, at least, defraying) the cost of production of the programming. IOW, the more ads there are, the cheaper it is to provide access to the programming. Spam, on the other hand, dramatically increases the load on networks, mail servers, storage arrays, and user mailboxes and the spammers do not have to cover that cost. IOW, the more spam there is, the more expensive it is to provide Internet access.
This difference flows out of the other big difference: spam exists because of a loophole (the trusting design of SMTP) and not because someone in the supplier and consumer chain took some sort of extra step to allow it to be there. As a result, there is no option of charging the spammers to cover the cost of their spamming. So, until the email delivery infrastructure is made less implicitly trusting, spammers will have no incentive to stop abusing it.
Micropayments would in no way stop spam, but they would cause cost and problems for honest users.
Where would those micropayments go? To the ISP? The one that is providing a haven for the spammer? To the spammer itself when it is acting as its own ISP? In either case the micropayment would be a farce; certainly any service provider who is working with a spammer could waive the micropayments if they are already letting a spammer on the Internet in the first place.
But more importantly, micropayments will ruin some current valid uses of e-mail and prevent some future ones. Think of the many mailing lists that are run by very low budget groups to communicate with hundreds or thousands of members. These would be destroyed if each time they were to send out a message to all members they invoked a micropayment on every single e-mail. Heck, Slashdot sends me an e-mail when someone responds to one of my posts; how long do you think that nice service would last when Slashdot started to have to pay a micropayment for every single response that was posted??? I would certainly see that this would completely kill any new uses of e-mail as well. Imagine a school that was considering using e-mail to keep the parents more current on their child's status; perhaps weekly or even daily report cards rather than end of the quarter surprises. It would be a system welcomed by many parents, and could be easily done if integrated into an electronic grading system, but would be killed instantly if there were a unit cost on sending e-mail, even a small micropayment. There are lots of other useful things that e-mail might do for us in the future; do you really think we should give them up and adopt a payment system that really isn't going to stop spam in the first place?
And I still maintain that it will not stop spam, even if the payments are not to the original ISP. Many spammers manage to inject their spam into the net through badly configured servers or other exploits. While I'm all in favor of making someone who improperly sets up a server pay for letting spammers have a doorway to the net, I'm not as likely to advocate a system that encourages more trojans and backdoors so that spammers can pass their costs on to unsuspecting in-duh-viduals. There are simply better ways to fix the problem (by changing the fundamental flaws in SMTP) than by approaches that will harm valid e-mail uses.
I'm an American. I love this country and the freedoms that we used to have.
another good solution is pop3proxy.pl
it works like pop3proxy.py from spambayes but uses spamassassin for checking.
so you benefit from all the spamassassin checks + its bayesian classifier.
works with windows too.
http://mcd.perlmonk.org/pop3proxy/
As someone tired of receiving hundreds of pieces of Spam per day, I was overjoyed to find www.123mail.net. This is a great (albeit paid) alternative to Yahoo or Hotmail and for $24 per year you get a POP3/Webmail account with 15MB storage with no spam or advertising. Their filtering method is based on an incrementing point system which includes: Bayesian Classification, white-listing, Heuristics, Automatic Fingerprinting (DCC), Black-lists (known spammer, IP blocks, experiential), and user White-lists. A combination system is much better than a single method IMHO and I have seen the results first hand. I have yet to have a false positive and they have done all the Bayes training already. They have a handy web-based interface for reporting false-positives and re-delivering the mail if it makes any mistakes. 123Mail is actually owned by The Electric Mail Company www.electricmail.com and this same spam filtering solution is also deployed in Fortune 1000 firms. Ok this wasn't meant to sound like a plug, Bayes has just made me a believer.
The Bayesian Project was our last, best hope for peace.
It failed...
But in the year of the Spammer War, it became something greater: Our last, best hope for spam-free inboxes.
The year is 2003, the place: Bayesian 5.
I've read Grocklaw. BoycottNovell, you're no Grocklaw
I use a very bizarre email address, it's like a 24 character hex number@myisp.com
NO auto-spammer will *EVER* guess it or accidentally stumble into it, ever. To this date I've never received a single spam message on my REAL email addy..
Many spammers just take dictionary names and add numbers to them, auto-incrementing the numbers.
Like bob333@aol.com then bob334@aol.com, etc..
Not too hard to figure that method out. If you use some dopey addy like that you are going to get spammed, sooner or later. It's just a matter of time before they hit your combo.
Now, being that I get ZERO spam, and that my dad and my friends get a virtual flood of crap everyday, I created a troll addy that is fully functional, see the addy I have listed on my post (it's real) and I splashed it all over usenet and forums everywhere. I've been training and tuning and learning to control and tweak the various spamassassin type filters so that I can be proficient in it.
Now that I've trolled up thousands and thousands of various crap spams, I can take care of my folks and my friends.
Once I have it down pat I'll delete the troll addy and enjoy a spam free life. With spamassassin installed and tweaked, and going back to using my hexadecimal addy I can live a spam free life.
If spam pisses you off, use bizarre addys and bayesian filters. That's the ticket, trust me..
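For anyone wanting to try the hex-address idea, a short Python sketch follows. It assumes your provider lets you pick an arbitrary local part; `myisp.com` is the placeholder from the comment and the function name is made up.

```python
import secrets


def random_local_part(hex_chars=24):
    """Random hexadecimal local part; 24 hex characters carry 96 bits."""
    if hex_chars <= 0 or hex_chars % 2:
        raise ValueError("hex_chars must be a positive even number")
    return secrets.token_hex(hex_chars // 2)  # token_hex(n) yields 2*n hex digits


address = f"{random_local_part()}@myisp.com"
print(address)
```

A dictionary attack that walks names plus incrementing numbers will not reach an address like this, though it offers no protection once the address is posted publicly.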
Not when I'm *paying* for my ISP and email, you fucking idiot!
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
No sense mentioning I'm already *paying* for my email account in the first place! Fucking idiot. (Not you, dnj, the parent.)
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
I know this sounds like a troll but consider the following:
1) Americans probably get more t.v. advertising than anybody else on the planet, yet slashdotters (apart from TiVo style conversations) rarely complain about t.v. ads
2) If you saw 10 advertisements per ad-break and had an ad-break per 10 minutes of viewing time, that's 60 ads per hour.
3) I read recently that the average american watches between 4-8 hours of t.v. a day. (that's incredible, where do they get time to crap and eat and work etc?) which translates to 8*60 which is 480 adverts per day.
thepaytons and msnbc
4) very few of us get 480 spams per day.
The solution isn't to complain or to sue or to punish any advertisers (whilst there are purchasers willing there is always an incentive to spam)
In fact, maybe in the immediate future, like the BAFTA awards for commercials
where TV showcases the best advertisements (funniest, sexiest, most shocking etc etc), perhaps spam will mature into stuff we tolerate and maybe even laugh at.
People say that the spammers should be made to pay, or that we should charge for email so that spammers would stop or slow down. Well last time i checked they do pay for email. Sure, a fat pipe with pre-paid gigs of bandwidth is v cheap, but it's still bought and paid for.
People say that everyone hates spammers, yet that simply isn't true. No matter how much geeks hate spammers, there are more customers out there willing to buy dick extenders, boob enhancers and the staying power of a donkey!
The correct thing to do with spam is to wait for the market to mature, and silently use technology to strip it out whilst we can.
Whilst we are on the subject of unsolicited advertising, consider junk-mail, bill boards, video previews, movie theater advertising, sports brand promotions, corporate sponsorships etc etc etc.
Advertising and spam is here to stay. Stop whining, and accept it. The vast majority of the world simply does not hate spam as much as the average slashdotter.
Jech.
IANAL, but can't I sue some of the spammers? Not the mortgage ones, but the penis size ones should be no problem. Isn't there some sexual harassment law I can apply? I get enough of it a day that I can't imagine anyone would not consider it harassment. In fact the only jurors I can imagine are those without computers, and I have this idea that many are "little old ladies who only drive their car to church on sunday" and would want to throw the book at anyone who is "degrading society" in that way, even if there is nothing illegal.
How can I track down who is sending me these things, and then where can I find a lawyer to take the case?
No, if you miss a few it doesn't matter at all. It only matters if you misclassify them. Using POPFile it only updates the corpus on a deliberate classification.
That's kind of a lame comparison with a lame set of requirements: has to run on both Linux and Windows.
My Bayesian of choice, SpamSieve, is directly hooked into my POP program, Mailsmith 2 on OS X. It sucks all the mail from all the accounts down, transparently adds an isSpam or isNotSpam property to each email, and then Mailsmith dumps it into the spam folder if it is spam, or deals with it accordingly if not.
After going through maybe 20k messages, I've had one false positive (good mail marked as spam, and that was very early in its training) and am running an overall rate of 97.5%, including when it was being trained.
Bah to "Windows or Linux only." Even Apple Mail seems to be a better solution than what was previewed. A web interface? Give me a break.
It's a strange language :P
... ...
1 virus
2 virii
3 viriii
4 viriv
10 virx
1001 virmi
I prefer this flavor/flavour
Any filter based strictly on message content is all but useless in the long run. Why? Three reasons: false positives, false positives and false positives!
If there is a reasonable chance of losing even one real message, then I have to comb the filtered messages anyway, no matter how they got segregated. So absolutely nothing is gained in the end.
For an example in the extreme, what if a good friend forwards you a particularly juicy piece of spam with a commentary to make some point? Bam! Any content-based filter will rate it high and filter it. Message lost... unless you comb the rejects anyway.
Is there a filtering method with no reasonable chance of false positives? Yes, actually: the bait-account, distribution-based signature filtering represented well by BrightMail (I have no affiliation with BrightMail). That approach actually uses the very definition of spam, namely ~unsolicited~ mail sent by strangers to large numbers of recipients, plus blacklisting.
BrightMail claims false positives are 1 in 100,000, but it's probably even smaller than that. Even 1/100,000 is small enough that I don't feel a need to scan the filtered messages for false positives.
And, if you are unhappy with the less-than-100% filtering of something like BrightMail, then you can apply other methods as well. At least you'll have less purported junk to scan for false positives.
Actually, the best combo would be an automatic whitelist acceptance (anybody you've ever mailed to or accepted mail from) followed by BrightMail (or equivalent) followed by a good content-based filter. Why nobody's done this yet is beyond me!
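A rough sketch of that three-stage combo might look like the following. Everything here is an assumption made for illustration: the function names, the hash-based `signature()` (a stand-in, since how BrightMail actually builds its signatures isn't described here), and the pluggable `content_filter` callable.

```python
import hashlib


def signature(body):
    # Hypothetical signature: hash of the lowercased, whitespace-normalized body.
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify(sender, body, whitelist, bulk_signatures, content_filter):
    """Return (verdict, stage), running the three stages in order."""
    # Stage 1: anybody you've mailed or accepted mail from gets straight in.
    # The whitelist is assumed to hold lowercased addresses.
    if sender.lower() in whitelist:
        return "accept", "whitelist"
    # Stage 2: message matches a signature collected from bait accounts.
    if signature(body) in bulk_signatures:
        return "reject", "signature"
    # Stage 3: fall back to a content-based filter (e.g. a Bayesian one).
    if content_filter(body):
        return "reject", "content"
    return "accept", "default"
```

Because the whitelist runs first, the friend who forwards a juicy piece of spam with commentary is accepted before either of the later stages ever sees the message.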
I sure wish there were a consumer version of BrightMail.
--David
I know that this will be buried since this story is already a day and a half old, but I figure this, too, needs to be said.
That's just wrong, even according to the site which the above poster quotes. Here are the relevant lines:
Okay, so to review:
That's all for now...
So if you don't have any users named Aaron and Alice and Viagra and Zebra, the only people trying to reach them are dictionary-attack spammers and people who found their email addresses by running harvesters on your web site (or alternatively, running harvesters on Google after searching for useful phrases, so make sure you've got a lot of attractive-nuisance words like bulk email and multi-level marketing and such and some meta comments that'll help attract the search engines.)
If you want to also hack your DNS so that it gives different answers depending on who's asking, you could set things so that any DNS requests coming from an address on the less aggressive RBLs get handed the address of your teergrube, or 127.0.0.1, or the address of some other open relay.
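The DNS trick could be sketched roughly like this. The query-name format (reversed IPv4 octets plus the list's zone) is the usual DNSBL convention; everything else is an assumption for illustration: the function names, the `rbl.example.org` zone, and the `is_listed` callable, which in a real setup would do the actual blacklist lookup.

```python
import ipaddress


def dnsbl_query_name(client_ip, zone):
    # DNSBL convention: reverse the IPv4 octets and append the list's zone.
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def mx_answer(client_ip, is_listed, real_mx, tarpit, zone="rbl.example.org"):
    """Pick which address to hand back for a mail-server lookup.

    is_listed is a hypothetical callable that takes the DNSBL query name
    and returns True if the requester is on the list.
    """
    if is_listed(dnsbl_query_name(client_ip, zone)):
        return tarpit  # teergrube, 127.0.0.1, or whatever you choose
    return real_mx
```

Passing the lookup in as a callable keeps the decision logic separate from the DNS query itself, so the sketch runs without touching the network.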
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks