DSPAM v2.10 Released

Cool! by Anonymous Coward · 2004-03-13 16:05 · Score: 5, Funny

I've always wanted a spam filter with 1000% accuracy!

Re:Cool! by Monx · 2004-03-13 16:31 · Score: 5, Informative

IIRC, the "10x better" means 10x lower failure rate. The wording almost seems meant to deceive. The idea is that if you misidentify 10 messages out of 100, the filter would only misidentify 1. Since you made 10x as many mistakes, the filter was 10x as accurate as you were.

The real problem by Anonymous Coward · 2004-03-13 16:08 · Score: 4, Insightful

The real problem is people who actually buy this stuff. If no one was buying things from spam, no one would send spam. We all know this.

I propose we start spamming. Anyone who responds gets a nice l'il pistol whipping and is returned to their computer. After the first news report, people will be afraid to respond to spam.

Re:The real problem by www.fuckingdie.com · 2004-03-13 16:20 · Score: 5, Funny

Is there somewhere that I can sign up to be a pistol whipper?

--
That really is my homepage, no kidding.
Re:The real problem by kramer · 2004-03-13 16:37 · Score: 5, Insightful

I think the best answer to the 'If nobody would buy this stuff...' argument was:

Spam works on the level of 1 in 10,000. The general population contains a far higher rate of mental illness, senility, and retardation.

You'll never cure spam by 'education' of any sort. There are some people who are just too crazy or too stupid to learn.
Re:The real problem by dillee1 · 2004-03-13 16:38 · Score: 2, Interesting

Nice idea.

Make another email worm like MyDoom (call it MyDick/MyAss etc.), with a misleading title/body that sounds like those spam mails that enlarge/shrink various parts of human anatomy.

People who reply to those mails will activate the virus, which makes their computer unusable. Soon nobody will have the guts to open spam mail anymore.
Re:The real problem by No.+24601 · 2004-03-13 17:09 · Score: 3, Funny

Is there somewhere that I can sign up to be a pistol whipper?
The German government is cracking down on people like you.
Re:The real problem by Anonymous Coward · 2004-03-13 17:36 · Score: 4, Insightful

All these suggestions make the naive assumption that people in general learn from past mistakes.
Re:The real problem by Anonymous Coward · 2004-03-13 17:41 · Score: 2, Interesting

Actually, there is another solution. Everyone could simply respond to the spam they get. That'd quickly ruin the economics of spammers.
Re:The real problem by r_glen · 2004-03-13 17:46 · Score: 5, Funny

But I thought they were the spammers.

Details. by Anonymous Coward · 2004-03-13 16:09 · Score: 5, Informative

Introduction

DSPAM (as in De-Spam) is an extremely scalable, open-source statistical-algorithmic hybrid anti-spam filter. A majority of users running v2.10+ achieve filtering rates ranging from 99.92% to 99.98+%. DSPAM is currently effective as both a server-side agent for UNIX email servers and a developer's library for mail clients, other anti-spam tools, and similar projects requiring drop-in spam filtering. DSPAM has been implemented on many large and small scale systems with the largest systems being reported at about 125,000 mailboxes.

What is a Statistical-Algorithmic Hybrid Filter?
Present-day language classifiers bear the responsibility of maintaining accuracy in the midst of ever-increasing sample complexity. In the setting of spam filtering, many types of intentional attacks have been introduced such as obfuscation, word list injection, sample flooding, and so on. As the complexity of classification text continues to multiply rapidly, many filter developers today are torn between increasing the complexity of their filter and wise teachings from CS class reminding them that computer science is about controlling complexity, not creating it. At the rate complexity is rising, filters will (and have already begun to) become so resource-intensive that they lose scalability, eventually leading to a second conflict of interests: where fighting spam becomes more expensive than managing it.

DSPAM is the first Statistical-Algorithmic Hybrid filter and in being such boldly suggests that there is a better alternative to increasing the feature set of filters to match the spams they are trying to fight. By employing algorithms designed to increase the quality of existing data rather than the quantity of data, with the goal of reducing the feature set rather than increasing it, DSPAM has managed to achieve nearly equal levels of accuracy with present-day Markovian-based filters and other types of filters that employ large feature sets, with the added benefit of using significantly fewer resources. DSPAM presently peaks at 99.984% accuracy, which is ten times more accurate than a human being [1] and is presently being used on implementations as large as 125,000+ mailboxes.

DSPAM's Focus
The DSPAM project attempts to go beyond "just another statistical filter" by focusing on the following areas:

* DSPAM has a strong focus on providing better data to already existing algorithms (Bayesian, Chi-Square, etcetera). Combination algorithms work inherently well, but depend on the quality of data. Some of the approaches deployed in DSPAM towards this goal include Chained Tokens, Inoculation Groups, Classification Groups, advanced de-obfuscation techniques, and a new noise reduction algorithm called Bayesian Noise Reduction. The goal is to incorporate processing algorithms that can withstand the long haul of ever increasing message complexity. So far we're doing a great job.
* A strong focus on large-scale implementation support. The largest implementation of DSPAM we've heard about to-date involves 125,000 users. DSPAM has been designed to experience a very short execution time (0.03s - 0.10s on average hardware), and has been equipped with a storage driver API allowing several different storage mechanisms to be used. Depending on disk space constraints, accuracy can be traded off for additional disk space or vice-versa.
* Empty Corpus Support and Global Dictionary Support. It is very important in a large-scale environment to allow users to build their own dictionaries starting from scratch. Why? Because system administrators haven't got the time to create 20,000 seeded dictionaries. On top of this, ISPs require out-of-the-box filtering, which DSPAM's global dictionary feature provides for end-users, with minimal centralized learning. DSPAM provides support for building corpuses from scratch without suffering many fatal training errors (false positives). When these two approaches are combined, we end up with instant-filtering for all u
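As a rough illustration of two ideas named in the list above, the sketch below pairs adjacent words into "chained" tokens and combines per-token spam probabilities with the naive Bayesian formula popularized by Paul Graham. This is not DSPAM's code: the function names are hypothetical, and reading Chained Tokens as adjacent word pairs is an assumption.

```python
from math import prod


def chained_tokens(text):
    """Return single words plus adjacent word pairs.

    Treating 'Chained Tokens' as adjacent word pairs is an assumption
    made for this sketch, not a statement about DSPAM's internals.
    """
    words = text.lower().split()
    pairs = [f"{a}+{b}" for a, b in zip(words, words[1:])]
    return words + pairs


def combine(probabilities):
    """Combine per-token spam probabilities, naive Bayesian style.

    Each value is assumed to lie strictly between 0 and 1; a real
    filter would clamp values to avoid a zero denominator.
    """
    spam = prod(probabilities)
    ham = prod(1.0 - p for p in probabilities)
    return spam / (spam + ham)
```

Pairs give the combination step more specific evidence to work with: "cheap+pills" can carry a very different probability than "cheap" or "pills" alone.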

Re:Details. by sirsnork · 2004-03-13 20:46 · Score: 2, Insightful

Fantastic.... Really I would love to try it.

I'm assuming you are linked to the project, forgive me for the rant if that's incorrect.

Might I suggest you get a webserver/ISP that is somewhat reliable. I've been trying to get a copy of this software since it was last mentioned on Slashdot. The site was slashdotted when I first tried, cool I thought, I'll check again tomorrow. Still down the next day, OK I think maybe there's still an effect. I wait a week and check again thinking maybe they went over their cap from their ISP and they shut them down, but the site was still down and stayed that way for weeks.

I finally get back to it today and the site was up, great I think, so I try to download the latest version (before this story hit v2.08 if I remember), and the file wasn't there, although it was probably getting 2.10 copied up and linked. Then when I hit Slashdot I see this and of course the site is now down again (imagine my surprise).
How long will anyone that actually wants a copy of this have to wait? Could you not actually host a copy on your sourceforge site too so that people who want to use this could actually get a copy to install?

On a slightly related note when I was there I noticed they are looking for someone to write some installation scripts to add installation with various MTA's, again kind of hard to do if no one can actually get a copy. OK I'm finished :-)

--

Normal people worry me!

cool by adamruck · 2004-03-13 16:10 · Score: 2, Interesting

now the question is.. how hard is it to get it to work with cpanel

--
Selling software wont make you money, selling a service will.

I wonder if this will catch what Mozilla misses by wmspringer · 2004-03-13 16:11 · Score: 4, Informative

Right now the only spam getting through my Mozilla filter is stuff that starts with one or two unrelated sentences, then goes into the advertising with any spam-type words (viagra, etc) horribly misspelled.

--
Twenties Retirement

Re:I wonder if this will catch what Mozilla misses by reaper20 · 2004-03-13 16:38 · Score: 4, Informative

Thunderbird's latest builds have an improved spam filter using some ideas from SpamBayes, it's substantially improved from the older filter.
Re:I wonder if this will catch what Mozilla misses by reaper20 · 2004-03-13 17:26 · Score: 2, Interesting

The last two weekly builds have had this turned on. Further information is in this thread.

The bugzilla number for this feature evades me at the moment. I've only used the windows builds provided, but it shouldn't be too difficult to make your own linux build with this stuff turned on.

Re:What's DSPAM? by wintahmoot · 2004-03-13 16:12 · Score: 4, Informative

From what I can tell, DSPAM plugs into your MTA as a local delivery agent, very much like SpamAssassin does.

I couldn't see any platform requirements on their site, but here's what they say about MTA compatibility:

DSPAM works great with Sendmail, Postfix, Qmail, Courier, and Exim, and should work well with any other MTA that supports an external local delivery agent.

Hope that answers your questions :P

--
Martin May

funny faq by adamruck · 2004-03-13 16:12 · Score: 4, Funny

this is from the faq...

In real-world scenarios, false positives have ranged anywhere from 0% (none) to 0.10% depending on both implementation and user's mail behavior. Users with relatively predictable mail behavior (such as geeks, dweebs, and freaks) have generally received very few false positives (less than 1 in 10,000 messages).

--
Selling software wont make you money, selling a service will.

Re:funny faq by Feztaa · 2004-03-13 19:30 · Score: 4, Funny

Users with relatively predictable mail behavior (such as geeks, dweebs, and freaks) have generally received very few false positives

What about losers, dorks, and morons? Are they cursed with a high rate of false positives?

I still prefer tougher email security by NanoGator · 2004-03-13 16:15 · Score: 4, Insightful

This may work for a little while, but the creative peeps will find a way around it.

I say forget the filtering shit and force email to evolve. Part of the reason that spam happens is that there is no real authentication going on. No requesting permission to be on your white list. No real strong way to block anybody you don't want to hear from. No real way to verify the sender is legit. etc.

I don't claim to have all the answers, but I do know that I've been using ICQ for years and haven't seen a Spam from there since I turned on the 'require authorization' feature.

--
"Derp de derp."

Re:I still prefer tougher email security by Paleomacus · 2004-03-13 16:24 · Score: 2, Interesting

Well I haven't used it in a year or two. But I had require authorization on from day one and still got request for authorization spam. Where some pr0n/webcam botperson requests authorization with a little ad in the request.

I don't have any clue what the solution to the spam email problem is but I believe it'd have to be a pretty major evolution.
Re:I still prefer tougher email security by Enahs · 2004-03-13 16:28 · Score: 3, Informative

A short overview of SPF + SMTP

--
Stating on Slashdot that I like cheese since 1997.
Re:I still prefer tougher email security by tftp · 2004-03-13 17:38 · Score: 4, Interesting

Evolution of email is difficult even in theory.
The authentication is useless even if implemented - you want to receive email from strangers, that's what all businesses are doing. If you are not one of them and only converse with your buddies, make a whitelist and be done - no spammer will guess your friends' emails.
Permissions to send email are also troublesome. If they are automated, then spam robots will be written to ask for permission first. If they are not automated... but how would you know if some random "John X. Frisby" <jfrisby@big.provider.net> is really who he is, and the matter he wants to discuss with you is not a bug in your Loafizer 0.99 script for your bread making machine, but a placebo enlargement pill. Additionally, permissions delay the mail exchange, which is bad for business.
There are ways to block anyone you don't want, and all other senders are legit (until they spam you, that is.)
So the problem is quite different, as you can see. There is a free channel of marketing, and spammers will be using it as long as it remains a) free and b) a channel. Remove either one of those two, and they will close up shop.
Re:I still prefer tougher email security by Orne · 2004-03-13 18:24 · Score: 2, Interesting

Ah, but every now and then I get a "User has requested to add you to their contact list..." in my ICQ and they just put the spam in the notification reason box. I see the same thing with automated request system; they'll use the request process to pass the advertisements in to you.

Call me a cynic, but I think we're dealing with an inherently insecure system. As long as you have one mail server out there forging message headers, you can't trust the path back to the sender. Like abstinence, whitelisting may be the only way to block 100% of what you don't want. But then you might be blocking an email from your third cousin someday who decides to email you out of the blue. The happy medium is the automated filter, like Yahoo's... but I'm noticing that this past week spammers have figured out how to slip messages through that one too ...

CRM114 Discriminator works better for me by Anonymous Coward · 2004-03-13 16:15 · Score: 5, Interesting

I tried several incarnations of dspam over a period of about 6 months. It was a pain in the butt to install, required a massive amount of training, and required you run a web server in order to have the point and click training capability.

I eventually gave up and tried the CRM114 Discriminator:

http://crm114.sourceforge.net/

It was MUCH easier to install, MUCH easier to maintain, and has the same or better level of accuracy. I used to get 100+ spam messages a day and now I'll get maybe 1 or 2 a week that sneak through (after only a few weeks of training on errors only).

Now, if there was an adaptation for Kmail by grmoc · 2004-03-13 16:21 · Score: 3, Insightful

That would be ideal.
(since then the 'casual' user could benefit from using it, without undue difficulty in configuration of mail delivery programs, which are notorious in general..)

now only if.. by crache · 2004-03-13 16:26 · Score: 2, Insightful

it could be used in html rendering

Re:now only if.. by hotchai · 2004-03-13 16:51 · Score: 2, Interesting

Exactly my thoughts! Can we include something in Slashcode that automatically filters the GNAA and goatse trolls? Perhaps as a user-configurable option.

Some Bayesian approach ought to do it ... I wouldn't want jokes based on the "$PROJECT is dying - Netcraft confirms it!" troll to be filtered out!

Preventing Victims of Spam by www.fuckingdie.com · 2004-03-13 16:30 · Score: 4, Funny

Computer manufacturers will begin including a Hammer type device into PCs beginning immediately. This device will, when its associated software detects a user attempting to sign up for free porn, hammer the user to death.

Computer manufacturers are also investigating whether this device will be able to deal with the so-called "Stupid User Problem" which plagues so many IT professionals world wide.

--
That really is my homepage, no kidding.

Bayesian Unsupervised Learning by VoidEngineer · 2004-03-13 16:31 · Score: 5, Interesting

FYI, modern MRI scanners use bayesian noise reduction during image processing. I used to work in a MRI research laboratory, and our director had pioneered the application of Bayesian noise-filtering algorithms in post-processing of image data.

Oddly enough, our director of research was a notoriously difficult person to schedule a meeting with. Makes me wonder about 'unsupervised learning'...

More accurate than a human? by Percent+Man · 2004-03-13 16:42 · Score: 4, Funny

accuracy levels as high as 10x that of a human...

So, let me get this straight - my spam filter will know better than I do which emails I want to read, and which ones I don't?
"No, trust me man, you really want a bigger johnson. Read it!"

Re:More accurate than a human? by asavage · 2004-03-13 17:10 · Score: 2, Informative

yes it can. A human can be 100% accurate when dealing with only a few emails, but when you are dealing with tens or hundreds you will sometimes make mistakes.
Re:More accurate than a human? by rudedog · 2004-03-13 19:05 · Score: 2, Insightful

So, let me get this straight - my spam filter will know better than I do which emails I want to read, and which ones I don't?

Yes, it will. When I'm faced with 100 new messages in my inbox and probably only one or two are legitimate, I often delete messages that look like spam without opening them, and other times, I have to open them just to double check that it really is spam. I have accidentally deleted more than one legitimate message this way, and have wasted more time than I care to contemplate opening up spam.

So I probably have an accuracy rate of around 97 or 98%, which is nowhere near as good as 99.9.

(And I use SpamAssassin as well; but it's clearly no longer the killer it once was :-(
Re:More accurate than a human? by jmv · 2004-03-13 19:11 · Score: 2, Informative

Most likely, it'll make fewer errors than you will, because you're flooded with spam. Given a mailbox with 1000 spam and 1000 ham, I'm pretty sure I'll mess up a couple of times while trying to delete only the spam.

--
Opus: the Swiss army knife of audio codec

Umm... what's the definition of spam? by michaelmalak · 2004-03-13 16:43 · Score: 4, Interesting

algorithm providing accuracy levels as high as 10x that of a human

Is this to say I can't tell when I'm being spammed? I thought the ultimate definition of spam was mail unwanted by a person. How can a computer decide a piece of mail is bad for a person if that person really wanted it? One could digress way off with this on Asimov's Laws and the politics of Socialism/Fascism vs. Libertarianism (that e-mail is just no good for you, you oughtn't read it).

Re:Umm... what's the definition of spam? by Rick+the+Red · 2004-03-13 17:05 · Score: 3, Insightful

You miss the point. You teach dspam what you do and don't want to see, so ultimately you decide.
Outlook is like what you fear; Microsoft decides what you will and won't see. I can add specific senders to the black and white lists (you click to add to the blacklist, but you have to type in an address to add it to the whitelist -- stupid MS shits), but Microsoft decides if I can see that attachment (if they think it's bad, it's gone and I can't recover it) or if this email's spam (it regularly discarded stuff from IBM Developer Works until I added them to my whitelist). With a tool like dspam I can regain control over what gets filtered (although I've found no way to turn off Outlook's attachment blocking).

--
If all this should have a reason, we would be the last to know.
Re:Umm... what's the definition of spam? by Snowmit · 2004-03-13 17:40 · Score: 3, Informative

Is this to say I can't tell when I'm being spammed?

Leaving aside the part where you barely avoid the paranoid rantings of a madman, yes, there are times when you can't tell if you're being spammed. Like, how many times have you accidentally deleted an email that you thought was spam but was really from a long-lost friend? Or how many times have you opened Spam because you weren't sure that it was Spam or something from your ISP (or whatever).

Say you've done it 10 times in 10 000 messages. If this program only did it once in 10 000 messages (false positive or missing negative) then it was 10x as accurate as you.

--
I have a lot of opinions about Cyborgs and Architects
Re:Umm... what's the definition of spam? by kryptkpr · 2004-03-13 17:58 · Score: 2, Informative

Didn't look very hard did you?

Tools, Options, Security, uncheck "Do not Allow attachments to be Opened that could potentially contain a virus".

--
DJ kRYPT's Free MP3s!

Take it one step further; share what you filter by bigberk · 2004-03-13 16:44 · Score: 5, Interesting

DSPAM is one of these statistical filters (like spamprobe and CRM114) that can perform virtually perfect filtering of spam/non-spam you receive.

Now that you are free of spam yourself, may I suggest that you take it one step further and share your data with the anti-spam community; the WPBL project lets many users report the IPs sending them spam and non-spam in realtime using a couple simple scripts installed in procmail.

Our central database then publishes a real-time list of spam sources (the IP blocklist). Unlike spamcop, WPBL is entirely based upon automatic decisions made by statistical filters, 24/7. The resulting blocklist is already used by many ISPs; and you can also use it to block spamming IPs at your own server.
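DNS blocklists are conventionally queried by reversing the IPv4 octets and looking the result up under the list's zone; a listed address resolves, an unlisted one does not. A minimal sketch of that lookup, assuming a placeholder zone name (blocklist.example.org is hypothetical, not WPBL's actual zone, and the function names are made up):

```python
import ipaddress
import socket

# Placeholder zone for illustration only; substitute the zone the
# blocklist operator actually publishes.
DEFAULT_ZONE = "blocklist.example.org"


def dnsbl_query_name(ip, zone=DEFAULT_ZONE):
    """Build the DNS name a blocklist client looks up for an IPv4 address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone=DEFAULT_ZONE):
    """Return True if the blocklist zone has an A record for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # Resolution failed, which is how an unlisted address normally looks.
        return False
```

A mail server would typically run a check like is_listed() against the connecting IP before accepting a message.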

Re:Take it one step further; share what you filter by gclef · 2004-03-14 00:38 · Score: 2, Interesting

And how exactly do you keep the spammers from submitting their own IPs as "good" or from submitting real ISPs as "bad"? I didn't see anything on that website to indicate how you're managing potential liars making submissions, which will kill this system pretty quickly if it ever becomes commonly used.
Re:Take it one step further; share what you filter by Anonymous Coward · 2004-03-14 07:13 · Score: 2, Informative

AFAIK, both the SBL and the WPBL only allow list writes from trusted users with accounts.

DSPAM sounds great... by DarkHelmet · 2004-03-13 16:44 · Score: 3, Funny

But will it keep all those GNAA posts out of slashdot? ;)

--
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i

Re:Filter at sender? by whmac33 · 2004-03-13 16:49 · Score: 2, Interesting

Cox was doing this when I was using their service.

I think that was more to prevent the SMTP virus stuff going on though more than spam.

The solution - seriously by ryanvm · 2004-03-13 16:52 · Score: 2, Interesting

The solution to the spam problem is simple yet elegant - gambling.

Every time you send an email you place a small wager on the line that the recipient wants to read your message. Something like 1 cent. If the recipient doesn't mind your message then they don't redeem your offer and it doesn't cost you a thing. However, if you're sending spam then the recipient cashes it in (or perhaps it is used to cover overhead costs of this system).

If you send a legitimate email and somebody decides to be a jerk and cash it in then you're only out 1 penny. However, if you just sent 2 million of those unwanted emails you're screwed.
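A back-of-the-envelope sketch of the arithmetic above, assuming the 1-cent stake from the post; the function name and the redemption rates are made up for illustration:

```python
def expected_loss_dollars(messages, redeem_rate, stake_cents=1):
    """Money a sender forfeits if a given fraction of recipients cash in.

    redeem_rate is the assumed fraction (0.0 to 1.0) of recipients who
    redeem the wager; stake_cents is the per-message stake.
    """
    return messages * redeem_rate * stake_cents / 100


# A spammer whose 2 million messages are all redeemed:
spammer_loss = expected_loss_dollars(2_000_000, 1.0)  # 20000.0 dollars

# A legitimate sender whose mail nobody redeems:
normal_loss = expected_loss_dollars(500, 0.0)  # 0.0 dollars
```

The scheme only bites if the redemption rate for spam stays high, which is the behavior the post is betting on.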

This is better than the "small price" schemes because it doesn't cost anything. Well, unless you're A) a spammer or B) sending email to dickheads.

This wouldn't replace SMTP, it would just be a layer on top. If you sent an email and you participated in this system then a third party would sign your messages and you'd get a special verifiable header that the recipient could then treat as "likely ham".

Anybody have a better idea? I didn't think so. :)

Re:The solution - seriously by Rick+the+Red · 2004-03-13 17:18 · Score: 2, Funny

My better idea: A network of pissed-off spam recipients. If I get a spam I contact someone on the network who lives near the spammer, and they go over and beat the shit out of them. Likewise, if there's a spammer in my area I'll go beat the shit out of them for you if you're on the network. Call us eMail Agents For Independent Action.

--
If all this should have a reason, we would be the last to know.

Re:Filter at sender? by Rick+the+Red · 2004-03-13 16:54 · Score: 3, Insightful

What about the ISPs who cater to spammers? AOL and MSN are not the only ISPs, you know.

--
If all this should have a reason, we would be the last to know.

Magic Bullet Idiots by SuperBanana · 2004-03-13 17:08 · Score: 3, Insightful

Then the spam wouldn't even be transported over the net, saving vast amounts of traffic on the internet backbones. This action could also potentially kill spam overnight.

Ever read the FAQs for the anti-spam lists/newsgroups? Virtually top of the list is "I have some magic bullet solution that'll end spam tomorrow!"

You are -truly- naive to think this kind of solution would even be possible to implement; there are literally dozens of reasons why this would be a horrifically stupid idea; how this post ever got to +5 is way beyond me. Time to start meta-moderating more, as apparently positive mod points are getting handed out a little too easily these days.

--
Please help metamoderate.

Here's where "10x as accurate as human" comes from by Gldm · 2004-03-13 17:13 · Score: 4, Informative

If you check the footnotes on the DSPAM page, it says "According to a study by Bill Yerazunis of CRM114."

If you then check the link to CRM114's project, you'll find this: "I measured my own accuracy to be around 99.84%, by classifying the same set of 3000ish messages twice over a period of about a week, reading each message from the top until I feel "confident" of the message status, (one message per screen unless I want more than one screen to decide on a message.) and doing the classification in small batches with plenty of breaks and other office tasks to avoid fatigue. Then I diff()ed the two passes to generate a result. Assuming I never duplicate the same mistake, I, as an unassisted human, under nearly optimal conditions, am 99.84% accurate.)."

Given the number of people who even read the article on slashdot I doubt anyone else is going to check the tiny [1] footnote and find this.
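For anyone who wants to check the "10x" figure, here is the arithmetic as a small sketch. It compares error rates rather than accuracies, using the 99.84% human figure quoted above and the 99.984% peak claimed for DSPAM; the function name is made up, and Decimal is used only to avoid floating-point rounding.

```python
from decimal import Decimal


def error_ratio(human_accuracy_pct, filter_accuracy_pct):
    """How many times more errors the human makes than the filter.

    Both arguments are accuracy percentages given as strings,
    for example "99.84".
    """
    human_errors = Decimal("100") - Decimal(human_accuracy_pct)
    filter_errors = Decimal("100") - Decimal(filter_accuracy_pct)
    return human_errors / filter_errors


ratio = error_ratio("99.84", "99.984")  # 0.16% vs 0.016% error rate
```

Here ratio compares equal to 10: the human misclassifies 16 messages in 10,000 and the filter 1.6, which is where "ten times more accurate" comes from.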

--

Introducing the new Occam Fusion! Now with sqrt(-1) fewer blades!

Impossible. by SiMac · 2004-03-13 17:15 · Score: 3, Insightful

If this happened, there would have to be about 10 SMTP servers handling all the mail, the ones belonging to the major backbone providers. Otherwise, a spammer could purchase a T1 from a backbone provider and send out as much spam as he wanted. Almost all ISPs catering to end users have to get their connections from other ISPs somewhere along the line.

It might be sort of difficult to have 10 companies handle the Internet's email supply.

Re:Heh, I had the same idea... by Yottabyte84 · 2004-03-13 17:19 · Score: 2, Funny

My friend and I had jokingly suggested starting a spam 'pharmacy' selling various things, that are, in reality, arsenic. Kill the morons that buy shit.

Put this into Slashcode? heh by dsanfte · 2004-03-13 17:22 · Score: 4, Insightful

By the looks of the Intel story below, Slashdot sure needs a good Bayesian spam filter. I recommend this. Or a baseball bat. Because you can go over to anti-slash and really pound some skulls with a baseball bat, and it would probably be more satisfying. But filters are good too, don't get me wrong.

--
occultae nullus est respectus musicae - originally a Greek proverb

Bah... by Pig+Hogger · 2004-03-13 17:24 · Score: 4, Interesting

It's STILL just an "automated press-deleter".

No matter what technology it uses, neural nets, b-trees, recursion, tinkertoy logic, smell-emitting diode, leaky junction zener transistor, steam-powered aeolipiles, it only automagically presses delete, which is a pretty lame way of fighting spam.

It's a lame way of fighting spam, because, we STILL have to pay for the fucking spam bandwidth; we STILL have to pay for the goddammed disk space used by the spam; we STILL have to pay for the bloody time lost transmitting the spam; we STILL have to pay for the extra ISP infrastructure to carry those spams.

Naaah. Spammers should be eradicated from the Internet, and the best way to do so is to completely BLOCK networks who host spammers (no matter what service), in order to force the collateral damage to whine to the ISP or simply vote with their feet.

It would be nice if.... by mark-t · 2004-03-13 17:28 · Score: 3, Interesting

... if there was some way to plug tools like this into Mozilla directly so that you could expand on its built in junk mail detection with something more powerful.

--
File under 'M' for 'Manic ranting'

Re:Works great with Qmail? Oh really now? by 7Ghent · 2004-03-13 17:32 · Score: 2, Informative

Easy, just set up a .qmail file in each virtual account's home dir that contains

|/usr/local/bin/dspam --user $EXT@HIDDEN$HOST -d $EXT@HIDDEN$HOST

Explained in the last DSPAM /. story by devphil · 2004-03-13 17:34 · Score: 4, Insightful

except that my article history is truncated in a futile attempt to get me to subscribe. So I can't point to the writeup I did.

The increased accuracy comes from the emails that will slip under your mental radar. You are a human, and you make mistakes. You wouldn't deliberately choose to read the email, but one day the subject line looks plausible, and so you bring it up. Three-quarters of a second later, you're glaring at the monitor and hitting "delete", but DSPAM wouldn't have let that slip by in the first place.

--
You cannot apply a technological solution to a sociological problem. (Edwards' Law)

Combating SPAM is easy, if you have the technology by Avlimator · 2004-03-13 17:46 · Score: 5, Interesting

I don't get SPAM. I don't have SPAM filters. How is this possible? Simple. I create a different e-mail address for any new untrusted entity that I have to provide one for. In the beginning I took advantage of being able to alias all e-mail for non-existent mailboxes (basically, *) at my domain to my primary account. It seemed to me an obvious and simple approach. Whenever I needed to provide an e-mail address, I just made one up, and it was forwarded to my regular Inbox. In my opinion, at that time my ISP was more "sophisticated" than most. Since then I have moved to hosting all of my domains on my own co-located server which runs Exchange 2000, thus complicating things. Now I have to actually add any new aliases that I want to use into my user account. I know of at least one product out there that can handle non-existent addresses and forward them to a specific account, but it is rather expensive for a feature that should have been built-in from the beginning (although I'm not aware if the new Exchange can do this out of the box). Not to mention that someone with the proper knowledge and skills could make a similar add-on in relatively short order, but who ever has the time? The point is that you have to consider when and where you give your e-mail address out, and the possible consequences therein. It's not altogether different from giving out your phone number (especially if you are unlisted) or even your SSN.

Daft, on many levels by Julian+Morrison · 2004-03-13 17:52 · Score: 3, Insightful

Everyone would fudge refusals and pocket the cash.

Scumbags would use billions of zombied PCs to send themselves mails, aggregate and pocket the cash. Or to spam you gratis.

There are transaction costs for generating, checking, and accumulating digital cash. Your paypal bills would be huge.

Everybody hates micropayments.

It's a dumb idea and it simply isn't gonna happen.

The trouble with per-user filtering by Animats · 2004-03-13 18:14 · Score: 3, Insightful

Spam filtering needs to be applied to multiple E-mail accounts to work really well. The fundamental characteristic of spam that can't be avoided is that large numbers of similar messages are sent to different people. That's recognizable.
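The "large numbers of similar messages to different people" signal can be sketched in a few lines. This is only an illustration of the idea, not how Spamcop or any other service works; the function names, the crude normalization, and the threshold of three recipients are all assumptions.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(body):
    """Hash a message body after crude normalization.

    Lowercasing and dropping everything except letters is an assumed
    normalization, meant to make trivially varied copies collide.
    """
    normalized = re.sub(r"[^a-z]+", " ", body.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def bulk_fingerprints(messages, min_recipients=3):
    """Return fingerprints seen by at least min_recipients distinct users.

    messages is an iterable of (recipient, body) pairs.
    """
    recipients = defaultdict(set)
    for recipient, body in messages:
        recipients[fingerprint(body)].add(recipient)
    return {fp for fp, who in recipients.items() if len(who) >= min_recipients}
```

A real system would need a fuzzier fingerprint than an exact hash, since spammers randomize their text precisely to defeat this kind of matching.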

Looking for spam by content analysis for a single user only works for some people. If, for example, your legitimate E-mail contains many messages about investments, mortgages, and similar financial subjects, it's going to be hard to separate out financial spam by word analysis.

Spamcop does multiple-user analysis. It works better than most of the single-user systems.

CRM114/P.O.E. by Jasn · 2004-03-13 18:32 · Score: 2, Interesting

Not to underestimate the effort, but with extensions this has got to be easier than I think it is. Ruven Gottlieb's Purity-of-Email project is out there to integrate Mozilla mail with CRM114.

Your spam solution could be abused by Quantum+Jim · 2004-03-13 18:39 · Score: 3, Insightful

There are several scenarios where your proposal would be bad for the Internet. Say I want to put my competitor out of business, or at least raise his costs. I simply use a bot to sign up for a couple hundred thousand email addresses, sign up for his newsletters, then ask for all those 1 cents back. The financial powers that be might also foresee too much liability and risk in ventures that depend on email (since it is, as you say, gambling). Thus the end of any free service that depends on e-mail for verifying accounts including newsletters, bulletin boards, online banking, and online auctions among others.

Furthermore, you'd have to have a foolproof system to pay for those cents. Fraud could be much more rampant: If you pay via credit card, the other guy (or gal) has your number and could overcharge a corporation by twenty or so dollars. Furthermore, micropayments aren't economical unless many many many people pay. If most people play by the rules, then the costs of credit companies or banks or other institutions would either put most of these services out-of-business or into subscription only domains. Not to mention some companies might have "you agree not to ask for those cents" in addition to "I can send you spam" legal clauses - negating your proposal!

--
It is impossible to enjoy idling thoroughly unless one has plenty of work to do.
- Jerome Klapka Jerome

Re:Combating SPAM is easy, if you have the technol by sparkeyjames · 2004-03-13 19:36 · Score: 2, Insightful

As a further note: the best technology is to use spaminator.com. When you encounter a website that asks for your email address, why give it one to send spam to that you then have to clean up or leave to rot? Try this..... whateverthehellnameyouwant@spaminator.com.
It dumps the email data and address database every 5 hours. Fun stuff.

Sparkeyjames

To mod or to post. Spam is the question. by krray · 2004-03-13 19:45 · Score: 3, Interesting

You *WILL* get spam my friend. I've been doing this for almost 20 years (admin) now -- and have specifically used aliased accounts for various reasons over the years as you are doing.

Wait... You'll be interested to know that the biggest problem with the spam coming in comes from virus infected Windows boxes. They send it. They harvest the user's Outlook address book. If you ever end up in somebody's Outlook box ... it's only a matter of time before you're screwed.

I chuckle at the whole Exchange thing. You pay for that?

I personally pay to have a fixed IP @ home and run an old Linux box. A lot of aliases I've used over the years (and some blatantly used to harvest) all go to some local account that processes the spam. Upon receipt -- mail the wrong account and sorry, but you're blocked (unless white-listed). White-listing can come from valid already received email -- but I work everything based off of IP. My hope is that the registered MX host(s) or any valid listed server by the authenticating DNS server will be the type of scheme that's re-implemented (or more to the point SHOE-horned in real soon :). Bill's idea of email stamps, well, hahahahaha...
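The "mail the wrong account and you're blocked" rule might look roughly like this in Python. This is only an illustrative sketch, not the actual setup described here: `TrapFilter` is a hypothetical name, the addresses and IPs are documentation placeholders, and a real deployment would sit inside the mail server rather than in a standalone class.

```python
from typing import Iterable, Set


class TrapFilter:
    """Block any sending IP that mails a harvesting (trap) address."""

    def __init__(self, trap_addresses: Iterable[str], whitelist: Iterable[str] = ()) -> None:
        self.traps: Set[str] = {a.lower() for a in trap_addresses}
        self.whitelist: Set[str] = set(whitelist)  # IPs that are never blocked
        self.blocked: Set[str] = set()             # IPs caught mailing a trap

    def accept(self, sender_ip: str, recipient: str) -> bool:
        """Return True if the message should be delivered."""
        if sender_ip in self.whitelist:
            return True
        if sender_ip in self.blocked:
            return False
        if recipient.lower() in self.traps:
            # First contact with a trap address: block this IP from now on.
            self.blocked.add(sender_ip)
            return False
        return True
```

Sharing `blocked` and `whitelist` between several servers would only need those two sets to be synchronized.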

Over the last decade I've now got 380 aliased harvesting spam addresses in use -- two valid email accounts @ home (my wife and myself) which is on my own IP with my own domain. I pay $5 extra a month above my broadband (10Mbit [yeah, solid] wireless) -- how much do you pay for that Exchange box?

I've run this type of setup through many offices scaled to dozens of email servers -- and the beauty is they also talk to each other sharing block/white-listed addresses as needed. Wait -- you will get spam. Filtered through my account, I'm seeing 80-something that got in -- 2,164 blocked IPs [today], 380 harvested addresses, and 48 for various other infractions (attempts to relay through me, from a country where I know nobody, etc :).

Statistically (yeah, they all get nmap'd back)? 96% Windows based.

I give my email to friends. I have a work email that anybody that knows how to call me can have it. I even print it on my business card. No, I wouldn't post it to USENET or even here -- but it's still "out there". My unlisted phone number, OTOH, anybody can have. 847.854.0048. It's always busy and one channel of my ISDN home line. The other channel routes to the house for two phone lines (or Internet backup if and as needed) and is automatically unlisted and unpublished (at no cost since it is a "data circuit") -- and no, I'd rather not post that either. :)

Exchange? Never!

Re:Combating SPAM is easy, if you have the technol by Avlimator · 2004-03-13 20:03 · Score: 2, Insightful

The post above is mine, my login must have been dropped.

Re:Problem is ... by Shisha · 2004-03-13 22:34 · Score: 3, Interesting

The bottom line is, "No software can ever be better than a human in defining Spam".

That is true if the human is looking at a single email. Now give the same human a mailbox with 2000 messages, 1000 of which are spam (by his standards). He won't be thinking twice about calling a message spam and getting rid of it, so he's bound to make a couple of mistakes (happened to me a while ago: one of my friends has her email @ladymail.com and the Subject was in Latin - random to me. I called it spam before even reading Hello,...).

The claim that is being made is that if this poor man overlooks 10 spam emails, dspam will only overlook one. Whether that's true or not is another thing, and would again depend on the circumstances, but I believe it would apply to me.
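To put numbers on that, here is a small Python calculation using the figures from this comment (10 misses versus 1 miss out of 1000 spam messages). The function names are just for illustration.

```python
from fractions import Fraction


def error_rate(missed: int, total: int) -> Fraction:
    """Fraction of messages that were misidentified."""
    return Fraction(missed, total)


def accuracy(missed: int, total: int) -> Fraction:
    """Fraction of messages that were identified correctly."""
    return 1 - error_rate(missed, total)


human_missed, filter_missed, spam_total = 10, 1, 1000

# The "10x" figure compares error rates: 1% versus 0.1%.
error_ratio = error_rate(human_missed, spam_total) / error_rate(filter_missed, spam_total)

# Accuracy itself only moves from 99% to 99.9%.
accuracy_ratio = accuracy(filter_missed, spam_total) / accuracy(human_missed, spam_total)
```

So "10x better" holds for the failure rate, while the accuracy figure changes by less than one percent.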

Control set = training set? by munch117 · 2004-03-14 00:39 · Score: 2, Insightful

The filter was tested on 6597 messages. So how many messages was it trained on? I sure hope it's not the same 6597 messages, because in that case any accuracy number is meaningless.
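For what it's worth, keeping the two sets apart is simple. The Python sketch below is a generic hold-out split, not anything taken from DSPAM; `split_corpus` and `accuracy` are hypothetical helper names, and messages are assumed to be (text, is_spam) pairs.

```python
import random
from typing import Callable, List, Sequence, Tuple

Message = Tuple[str, bool]  # (text, is_spam)


def split_corpus(messages: Sequence[Message], test_fraction: float = 0.2,
                 seed: int = 0) -> Tuple[List[Message], List[Message]]:
    """Shuffle the corpus and return disjoint (train, test) lists."""
    shuffled = list(messages)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(classify: Callable[[str], bool], test_set: Sequence[Message]) -> float:
    """Share of messages in test_set that classify labels correctly."""
    if not test_set:
        raise ValueError("test_set must not be empty")
    correct = sum(1 for text, label in test_set if classify(text) == label)
    return correct / len(test_set)
```

A filter that merely memorizes its training messages scores perfectly when measured on those same messages, which is exactly why a number measured that way says nothing.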

/A

It's Good That It's So Good At Filtering Spam.... by Necrotica · 2004-03-14 02:53 · Score: 2, Insightful

Now if they could only make it usable. After reading the last Slashdot article about it I decided to try and move my Amavis/ClamAV/SpamAssassin/Postfix/Courier-IMAP setup to use DSPAM. Good Lord what a configuration nightmare. I couldn't find a decent HOW-TO and no real working example configurations in order to test it out. Sure the README "has all the information I'll ever need" but some of the stuff that it talks about I don't understand and I don't have the patience to configure it through trial and error.

Developing good software is one thing. But it's a lot nicer when good software is actually usable. I'll be sticking with SpamAssassin until they can dumb it down a little.

Certified SMTP Hosts. by eluusive · 2004-03-14 06:04 · Score: 3, Informative

What would work well is SSL certified SMTP relays. If every valid SMTP relay needed an SSL certificate, then if spam was sent, their SSL certificate could easily be rejected. And hosts that didn't have one at all could just be dropped.

SSL certificates are costly, and that keeps many people from having one. However, there is no reason the Open Source community could not make up our own root certificate, and have an SMTP SSL certificate signing organization, where we verify the authenticity of someone before we give them a cert, for a small fee to cover costs. It wouldn't be like we'd have to convince Netscape, Microsoft, Apple and whoever else makes a browser to include the cert. It'd just need to be available for people hosting servers to download.

Yes, this would mean rejecting massive amounts of email to begin with. Maybe some interim solution could be thought of as people move over to it?

Ideas? Comments?

bogofilter by tacocat · 2004-03-14 12:30 · Score: 2, Interesting

I recently started using bogofilter as a replacement for spamassassin. The reason for doing this was curiosity and the fact that the spamassassin regex process will always be following the spammers, not preceding them. The result is that packages supplied by distros are quickly outdated and ineffective.

I have been using bogofilter for one month and have trained it to such a point that my weekly spam misidentification is well below 0.1% with proper training and configuration. And its processing time is well below 1 second per message on a VIA EPIA 533 cpu (slow, ok?)

The net outcome of this is that I have found something which is highly adaptive to new spam techniques, extremely effective, very fast and light on the resources, and is at the point now where it just works.

The idea that they, DSPAM, will provide you with a pre-defined training set? That's damaging. What if you are an oral surgeon? You'll never get any email!

I've been working intensively on spam and have come to a few conclusions about spam filtering and such that I just have to share.

It will never go away. Even if you can properly regulate and control it, spam will never go away. No matter what anyone does. If the US constitution is to remain intact you can't remove spam just as we haven't been able to remove advertisements from radio, telephone, or television. And just like you can't get rid of pornography. It's all Free Speech.

It's also carrying a lot of money.

What will happen is that corporations, in the name of reducing spam, will lock up mail servers such that you have to pay them a service fee to send email on top of your connection fees paid today. Microsoft's recent movement into the arena shows that there is a motivation to make money out of spam/email.

In a few years, we'll pay for our email and we'll still get spam.
