SendMail CTO Sounds Off On Spam and FTC
CowboyRobot writes "Eric Allman takes his well-deserved turn in commenting on the state of spam, the dark future, and the need for intervention.
He calls spam an "arms race" where "in the long run everyone loses (except the arms dealers)."
As you might imagine, he's on our side, and he does a good job of clearly describing the current state of spam, and the possible solutions."
Isn't he one of them?
Forget thrust, drag, lift and weight. Airplanes fly because of money.
of the do not spam registry that they mention in the article. But it seems like a real pipe dream considering how much trouble there has been getting the do-not-call registry up and running.
Also, most telemarketing is done from in-country because of LD charges. Not so with e-mail. It's pretty hard to enforce US laws on a Taiwan spamhaus.
Ah well, every little voice against spam warms me a little at least.
lysergically yours
....the more I realize that no amount of technology or legislation is ever going to completely eradicate spam from our lives. More and more it seems to me that the only way we can get rid of spam is through educating the next generation of Internet users to ignore it.
Spammers spam because they make money. Educate people to ignore spam, and the spammers don't make money. Bingo, no more spam!
I know it sounds like a pipe dream, but what other options are there?
SCREW THE ADS! http://adblock.mozdev.org/ Proud user of teh Fox of Fire - Registered Linux User #289618
>The seventh is opt-out with an unsubscribe link that actually confirms your address as belonging to a live account.
The author doesn't say whether he believes this happens, but he implies so by adding another similar case: "The unsubscribe link removes you from the list in question, but it also adds your address to another list."
I'm calling bullshit on both of them. I challenge anyone here to cite any quantitative evidence that replying to spam has resulted in them receiving so much as one extra message.
No, anecdotes don't cut it. Neither does common sense, or "Well, it stands to reason" arguments. Neither does the availability of "verified" address lists. I can create a billion pseudo-random addresses, call them "verified" and slap whatever price tag I like on them. It doesn't make it so, and remember what sort of people we're dealing with here. You don't think they'd screw each other over for a few bucks?
As far as I'm concerned, spam is so untargeted that replying to an unsubscribe cannot possibly make it worse. It's vanishingly unlikely to make it better, but how, exactly, does it make it worse?
Examples, statistics please. No more anecdotes, no more gut feelings.
If you were blocking sigs, you wouldn't have to read this.
I am sorry to tell you that you don't understand the average internet user at all. Installing any such spam filter or tool is well beyond the capability of at least 95% of users. Classifying mails as "spam" and "ham" and training the Bayes engine are fine for geeks, but not for the average user. Believe me on this. For him or her, these are just unacceptable solutions, and spammers exploit this weak point. As long as a substantial chunk of users are non-geeks, spammers can flourish. And anti-spam laws are relevant in this context.
http://www.nasirudheen.blogspot/
You have a point. In the early days of spam, I'm certain that replying to spam would definitely get your address marked as alive. Nowadays, though, spammers have so many addresses and are sending so much spam that I highly doubt that they could deal with any replies to the crap they send out. And even if they do get a reply, they have so many other addresses to cycle through that they probably at best ignore it, and at worst might actually mark it as valid.
I agree with you. Does anybody have linkage to a Web site that actually explores this?
SCREW THE ADS! http://adblock.mozdev.org/ Proud user of teh Fox of Fire - Registered Linux User #289618
Bogofilter may not be for everyone, but DSPAM is implemented server-side, which means it's the sysadmins at the ISPs who install it and allow their users to opt in or out of spam filtering. All the average user has to do is forward messages they deem 'spam' to an email address. Pretty brain-dead easy.
When 99% of the spam on the internet passes through your product at some time, I'd say you should have an opinion.
Sendmail, promiscuous relay for all, Sendmail, providing remote root access since Day 1 on the Internet, Sendmail, of the indecipherable rules file, is on "our side"? Are they even relevant except for inertia?
Let's talk to DJB, to Wietse Venema, to the MS Exchange developers first, before giving soapbox time to some suit.
I want to delete my account but Slashdot doesn't allow it.
Why can't certain specified mail servers act as lookouts? If a certain percentage of them receive the same email within a specified amount of time, they can designate it as spam and delete it from all the mail servers. Then ISPs could subscribe to the "lookout server" list and delete any messages that have been designated as spam.
http://Lenny.com
Your last paragraph, however, shows that nevertheless you completely don't get it, and, by completely, I mean that you really sound as clueless as can be on the topic of spam.
Let's see how many standard spam-thread replies are required for your two sentences of nonsense at the end.
- SPAM is an arms race - single tools don't work, because eventually they will be beaten, as has happened to ALL tools as yet, including bayesian filters.
- SPAM tools such as you suggest are basically for the 3l337. You are basically saying "spam is not my problem if *I* can avoid it." This is a) antisocial and b) bs, because
...
- your note does not in any way address those billions of dollars of bandwidth wasted before spam gets to your personal box.
- if you stop 99% of spam now, by a rough guesstimate of what the parent article alluded to, you can roughly expect to get 100 times more spam than you currently do in 2.5 years time. ergo, problem not solved.
- you still haven't worked on the issue of spam definition.
In short, any article, post, or message that claims that Product X is an acceptable solution to SPAM just doesn't get it.
All you can do is look at the spam industry itself, and ask, "why wouldn't they harvest opt-outs for future spamming?" By opting out, after all, you've just given proof that the email address in question is valuable to you. Why wouldn't they want to take advantage of that piece of information? Do you think spammers suddenly adopt scruples on this point? Given how unscrupulous spammers are in every other aspect of what they do, I think it's absurd to think they treat opt-out lists with any integrity.
That opt-out lists will be abused by spammers is common sense. I think the burden of proof is on you to show otherwise.
I'm generally "Interesting," "Insightful," and even "Funny" here. What the hell happens to me at parties?
"If everyone quit whining and installed one of these tools, nobody would get spam, and the spammers would be out of business."
Deary, deary.
You obviously aren't seeing the sharp end of the wedge and the people trying desperately to increase the false positive rate and thereby reduce the value of these tools. It is like an arms race, and anyone who has even approached the subject knows that arms races have no end. Better to simply slap a lawsuit on trading entities that use spam as a sales vector and drive the spammers out of business by cutting their food supply.
Oddly Draconis
Too cynical to live, too stubborn to die.
Spammer ahoy! Lock up your open relays! Ready your blocklists!
In case you didn't bother reading the article, it mentioned that the volume of spam was doubling every 10 weeks. This is nothing short of a threat to the viability of email itself. Would you even bother opening your inbox, if you knew that you would have to delete several thousand irrelevant, unwanted and (in many cases) fraudulent emails just to get to the 10 or 20 useful ones from friends and family? Spammers are intensely selfish - being quite happy to abuse the network infrastructure provided and paid for by others for their own gain.
Your statement about the meaninglessness of the internet shows that you haven't a clue (outside of those spam-rimmed spectacles) what the Internet is about. People do not wish to be deluged with unsolicited junk any more than the likes of Alan Ralsky likes receiving tons of junk snail mail.
Of course, you can try to prove me wrong - post your email and real address and let's see if you can swallow your own medicine.
Apparently you don't understand how [good] spam tools work, I think is the problem...but first let me suggest that you re-read my previous posts. I suggested that everyone quit whining about spam and install some software. I also made a comment that this was something that could be done at the ISP, leaving the 95% ignorant people on the Internet to have to not do much except forward spams. Now back to how spam tools work...your last couple of statements suggest you don't understand how any good spam filter works. It's not based on a filter list, or an IP list, but the tools actually have the capability of learning new types of spams. This means in 2.5 years' time, 100 times more spam will be sent, forwarded into DSPAM, and *learned* by DSPAM without any rules lists to maintain. Spam is always changing, and therefore the only truly effective spam tools must learn. So if you decide you don't want to install a spam filter - fine...enjoy your spam... but 2.5 years from now I still won't have seen much of any spam.
So then we must develop spam tools that do not subject themselves to high false positive training =)
Spam may be the most profitable, but far from the most successful. Considering the amount of capital needed to run a spam/scam campaign, it is virtually all profit. Analysts estimate Google has annual revenue of 60M to 100M, and I have never heard of Google spamming. Our 2002 annual revenue was just over 48M, and we have never spammed. Targeted advertising is far more successful than any spam campaign.
Most spam emails I see in my Inbox are scams, bogus prescription drugs, and Web site affiliates violating their related site TOS. Spammers would never be able to generate revenue comparable to the top Internet properties.
Pete Carr Owner Chatmag.com
...because the 'email' economy doesn't have to connect to the real economy, as long as you (or your ISP) sends roughly as many emails as you receive. Which is true of personal emails. Genuine mailing lists would need a free pass, which could be set up when you opt in. ISPs Of course, an ecash mechanism imposes a cost in CPU cycles. But spam prevention doesn't need as strong a mechanism as the real economy: even if the spammer manages to spend each incoming email 100 or even 1000 times, they still can't send enough to make money. Maybe an ecash algorithm can be devised to take advantage of that. The real problem is adoption. Unlike filtering, the above has to be applied to all or most of the email system; people can't adopt it on their own and expect to get any benefit.
He doesn't provide material directly to the combatants (spammers and spam fighters), but is more interested in helping the people on the ground. Think of it as support for NGOs like the Red Cross or Doctors without Borders. His software is used by both sides, but in real wars aid convoys get ambushed routinely.
At worst he'd be a medical or pharmaceutical company selling to the victims.
I think it is clear which side he wants to win, but his efforts are more dedicated to keeping email functioning than fighting spam.
Spam may be the most profitable, but far from the most successful.
Huh? What are you talking about?
It sounds like a good idea on the surface, but it won't work.
I got hit by a spammer last week who was changing his host names every couple of messages. And not just on the envelope - he was changing 'em in DNS because he had his own nameserver! He got shut down by the mid-level carrier after about 12 hours, during which my servers received thousands of messages that I had to block by IP. Today, though, I am getting the same stuff, now coming from a cracked cable-modem user.
Hundreds of the spams that hit here every day are sent from cracked systems connected to Comcast, RoadRunner, and Verizon DSL.
If you allow anyone to send mail, regardless of how that mail is encrypted or secured, the spammers will find a way to illegally take advantage of that legitimate mailserver and send their trash.
This is because they are criminals. Not "legitimate businessmen" and not "entrepreneurs exercising their freedom of speech". Criminals who purchase accounts with stolen credit card numbers and move on as soon as an ISP shuts them down.
"So then we must develop spam tools that do not subject themselves to high false positive training =)"
I'll pencil it in for after over-unity power generation, Microsoft secure computing and my night of passion with Christine Aguilera.
Oddly Draconis
Too cynical to live, too stubborn to die.
That already exists.
It's called the Distributed Checksum Clearinghouse (http://www.rhyolite.com/dcc). I use the DCC as part of my SpamAssassin configuration (sitewide, called by Exim) and around 85% of spam I receive is already listed in the DCC. The latest version (2.60) of SpamAssassin, plus the SBL plus the DCC works as a very effective shield. My JE (link in the sig) describes my recent experience with SA 2.60.
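For anyone wondering what the "lookout server" idea boils down to, here is a minimal Python sketch of a checksum clearinghouse. It is not the DCC's actual protocol or API (as far as I know the real thing uses fuzzy checksums and its own client/server protocol); the class name `Clearinghouse`, the `bulk_threshold` parameter, and the whitespace/case normalization are all assumptions made up for illustration.

```python
import hashlib
from collections import defaultdict


class Clearinghouse:
    """Toy bulk-mail detector: counts how many distinct servers saw a body."""

    def __init__(self, bulk_threshold=3):
        # Hypothetical cutoff: bodies reported by this many servers are "bulk".
        self.bulk_threshold = bulk_threshold
        self._reporters = defaultdict(set)  # checksum -> set of server ids

    @staticmethod
    def checksum(body):
        # Crude normalization (lowercase, collapse whitespace) so trivial
        # variations hash the same. Real systems use fuzzier checksums.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, server_id, body):
        """Record that server_id received this body; return the reporter count."""
        digest = self.checksum(body)
        self._reporters[digest].add(server_id)
        return len(self._reporters[digest])

    def is_bulk(self, body):
        """True once enough distinct servers have reported the same body."""
        seen_by = self._reporters.get(self.checksum(body), ())
        return len(seen_by) >= self.bulk_threshold
```

Note that "bulk" is not the same as "spam": a legitimate mailing list would trip this too, which is why a count like this is normally only one input to a scorer such as SpamAssassin rather than a verdict on its own.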
Oolite: Elite-like game. For Mac, Linux and Windows
I had to kill one of our employee accounts a few weeks ago because she had clicked on an unsubscribe. I do all I can [spamassassin on webhost, mercury32, popfile, and my eyes] but that one got through.
A while back I ran across a site that had been putting together who owns/sells/buys what. The jpg prints at 40" X 105", which is bigger than our HP755C [36"], and guess what: the center blocks are comprised of only about 5-6 people.
The current regs/laws say that if I "unsubscribe" that business cannot send mail, but they say nothing about giving the "validated" info to all its child orgs and then passing it on.
You're another one here suffering from TWHUA [talking with head up ass].
You put Microsoft Secure Computing before your night of passion with Christine Aguilera? Your priorities are whacked, man!
If the government would enforce the laws against fraud, deceptive advertising and some of the outwardly criminal schemes advertised via spam by following the money trail, it should put a big dent in the spamming business, perhaps enough that the trailer-court spam king seen on Slashdot lately would have to figure out something else to do.
I do not believe that a "do not spam" law would work; at worst, the law of unintended consequences guarantees we'll end up having to give John Ashcroft a sperm sample to get a license to run a mail server due to the slippery slope of regulation. At best, we'll have an empty law that punishes no one.
Instead we've got Ashcroft forming an American Schutzstaffel to protect us from ourselves, and his big anti-crime initiative is to go after people that make bongs. Gee, I feel safer already.
As long as people willing to commit fraud or other "entrepreneurs" feel they can lie, cheat and steal via email with no consequences they will, and someone will be willing to deliver the message for them. Get the seller via the money trail and you stop the spam, and can probably nail the spammer as an accessory as well.
The first question was, "What is spam?" This is much harder to answer than it at first sounds. For example, some people define spam as "any e-mail I don't want to get," even if the mail is for a list that they really did sign up for. As one panelist pointed out, some people really do want to receive pornography. Most people agreed that getting a newsletter that the recipient has actually requested is not spam. My personal take on the only "reasonable" definition comes down to consent: If you request that you receive something, it's by definition not spam. However, reselling such a list may or may not result in spam, and not everything unsolicited is spam.
It occurs to me that spam is better defined by the sender's intent rather than by the victim's lack of interest or want of it. I'd define spam to be randomly targeted bulk e-mail, similar to junk snail-mail. A blanket coverage message. The sender intends to sell the reader something, be it a product, idea, etc. I get bills in the mail all the time that I don't want, but they're different than junk mail in that they require attention, and are specifically targeted.
The spam problem has to do with the whole future of person to person communication, as well as the whole future of advertisement. Whichever way it will be solved, a very likely outcome is that in 10 years it will no longer be possible in any way to get in touch with someone you don't already know from outside the Internet, and the first decade of the Internet will be looked back upon with nostalgia as the only decade of totally free communication. This is because the real problem lies in the initial contact.
You might argue that we can still communicate via boards, chat channels and similar things, where you can give out crypt-keys to those you wish to continue communicating with, but remember that these will be the next target for advertising after open email collapses. I'm sure advertisers will even write AIs to simulate people so that they can lure the crypt-keys from innocents.
I think such a product already exists. Lemme remember the name of the company that makes it... soft-something? Ah, there I remember: Softmicro!
"You put Microsoft Secure Computing before your night of passion with Christine Aguilera? Your priorities are whacked, man!"
It's sorted by the likelihood of it happening, mon frere, rather than by my desire for it to happen. That's a completely separate list that I would produce, but it was subpoenaed by the courts over some 'injunction' or another.
Kylie has absolutely no sense of humour, despite her elfin perfection.
Oddly Draconis
Too cynical to live, too stubborn to die.
So, using an unsubscribe link could work with those. Not sure however, whether typing ' or ''=' into the unsubscribe box would work: even the dumbest spammers have backups, unfortunately.
Your responses really do make you look foolish. I'm ever the more amazed that you were able to make a good comment about slashdot spam articles given how little you apparently actually know about spam.
Whine and insult me all you like... and you can throw all the papers you want my way, but the proof is in the fact that I DON'T GET SPAM (except for the mindless responses such as yours posted to slashdot).
You guys can moan and groan all you want about how [insert tool] won't work, or you can shut up and install the thing. I personally don't care if you wanna whine for the rest of your life - some of us are whiners and some of us are born to a higher purpose.
True. If spam doubles every 10 (or even 100) weeks, we only have a short time left before SMTP email is rendered unusable and port 25 itself needs to be blocked upstream (spam rates of multiple megabytes per second are really a DoS attack, no matter what they claim).
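To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes clean exponential growth with a fixed doubling period, which is a simplification; the function names are made up for illustration.

```python
import math


def growth_factor(weeks_elapsed, doubling_weeks):
    """How many times larger the volume is after weeks_elapsed weeks."""
    return 2.0 ** (weeks_elapsed / doubling_weeks)


def weeks_to_reach(factor, doubling_weeks):
    """Weeks needed for the volume to grow by the given factor."""
    return doubling_weeks * math.log2(factor)


if __name__ == "__main__":
    for period in (10, 100):
        print(period, "week doubling, after 130 weeks:",
              round(growth_factor(130, period), 2))
    print("weeks to reach 100x at 10-week doubling:",
          round(weeks_to_reach(100, 10), 1))
```

With a 10-week doubling period, 2.5 years (about 130 weeks) is 13 doublings, or 8192 times today's volume; with a 100-week period it is only about 2.5 times. Under the 10-week assumption, a 100-fold increase takes roughly 66 weeks.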
There are two solutions:
1) A new protocol to replace SMTP, that _somehow_ provides non-mobile authentication (i.e. a credential that is tied to an identifiable person, not something as malleable as an IP address or even as cloneable as a MAC address)
or
2) A protocol on top of SMTP (e.g. CAMRAM, TMDA, etc) that severely limits the ability of any two previously unconnected persons to send each other email, and preferably does so as close to the originator as possible.
Personally, #1 sounds way harsh (you'd have to fingerprint (or worse) every ISP subscriber). Therefore, #2 is the only way left.
That's why I see the future as something like CAMRAM (one of whose layers uses CRM114 as a backstop Bayesian filter before it decides whether to invoke the "Prove You Love Me" protocol. This layering provides some advantages over other protocols).
Perhaps it's time to ask ICANN for a new SMTP port that is only used with CAMRAM or other authenticated email protocols. Then users can shut off port 25 upstream and that will end the DoS issue. Port 465 (smtps) is just SMTP over SSL; a good start, but not what we want here.
I just installed a spam filter for the first time, SpamPal. However, of the 50-70 spam messages I get per day (and perhaps 10-15 non-spam), it flags non-spam around 1% of the time, and lets spam through about the same percent. I can handle a few spams a week.
So my question really is, is the state of spam-filtering still improving, or have we reached a plateau where the spammers will just find more and more ways of defeating the filters? Much of the spam I receive contains characters like: Viagra so the filtering is a bit harder.
Whoever fights monsters should see to it that in the process he does not become a monster. --Nietzsche
Then they deserve all the spam they get. I'm sorry, but I have no sympathy for people that are unwilling to learn how to use anti-spam tools. Mozilla Mail and Thunderbird both have excellent junk mail controls that are simple to use; there is no excuse not to use them.
Why doesn't Slashdot mirror articles? The slashdot effect, while being somewhat charming, is frustrating. As long as slashdot would respect the "Disallow: /archives" robots.txt tag this should be ok, no?
I assume I am not the first person suggesting this, but anyway...
"Targeted advertising is far more successful than any spam campaign." Conversion ratios of targeted banner advertising versus spam show that targeted advertising far outdistances any spam campaign.
Spam is profitable only due to the fact that there is little or no investment to operate a spam campaign. Any other advertising campaign requires a capital investment in web servers, bandwidth, tracking, product or service fulfillment, etc.
Pete Carr Owner Chatmag.com
White listing may be the only way to go. Have a list of people that are allowed to send you messages in your mail client, which would drop mail from them straight to your inbox. Anybody not on the list gets dropped to the Junk folder, which you could sort through and add the people you wanted.
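A minimal sketch of that routing rule in Python, assuming the client only looks at the From: header. The function names and folder names here are hypothetical, and since From: is trivially forged, a real client would want something stronger than this.

```python
from email.utils import parseaddr


def route(from_header, whitelist):
    """Return the folder for a message: known senders go to Inbox, the rest to Junk."""
    _, address = parseaddr(from_header)  # "Alice <a@b.c>" -> "a@b.c"
    return "Inbox" if address.lower() in whitelist else "Junk"


def approve(from_header, whitelist):
    """Add a sender found while sorting through Junk to the whitelist."""
    _, address = parseaddr(from_header)
    if address:
        whitelist.add(address.lower())


if __name__ == "__main__":
    allowed = {"alice@example.com"}
    print(route("Alice <Alice@Example.com>", allowed))
    print(route("Cheap Pills <deals@example.net>", allowed))
    approve("Bob <bob@example.org>", allowed)
    print(route("bob@example.org", allowed))
```

The cost of this scheme is the sorting step: someone still has to look through Junk to find first-time correspondents.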
I honestly don't understand the logic of spammers. I've been contacted by a spamming service before (they spammed me offering their services), and it just blew my mind.
At this point, I think there is a mass-marketing laziness about the entire thing. In order to get the spam email through all the filters, you have to have a fake email address. You also have to keep changing email addresses, as the filters will pick up on your email address and you'll only get to use it once or twice at best.
And yet, with all this in mind, I still have received more than one spam talking about how wonderful spamming is as a marketing tool. Reach hundreds of millions, they advertise!
And in reality, get ignored by them.
Robert B. Marks
Author, Demonsbane in Diablo Archive
Personally, I don't buy that that is true, but it's completely irrelevant to my point. Even if most spam does currently originate in America, if the U.S. somehow passes and enforces an effective anti-spam law, there is effectively zero cost involved in these spammers moving their business out of the States and still spamming Americans.
;-)
This is only half of it. Apparently much of the spam received outside the US originates from Florida. I can't see this changing, even if the US passes an anti-spam bill since it will presumably only apply to spamming Americans.
What it needs is a multi-lateral agreement. Perhaps it could be done through the UN
Laws are only effective if the punishment is a strong enough deterrent. It is what keeps the chaotic neutral in check (I being one of them). A do not spam list will only give the disreputable a list of good targets, hoping to catch that 1 in a million, drunk, at the pc, with a Visa card. I believe legislation only works when it has teeth.
/.end pipe dream./
And as for educating lUsers, don't waste your time. Unless it is with a spam campaign?? Or perhaps threatening lUsers with hostile military action??
Die spammers Die
One of the things mumblestheclown is pointing out is that the fact that you personally are currently managing to filter out your spam is *not* sufficient evidence to prove that the software you are using will be an effective long-term solution.
The software you're using (however clever it is, however hard it tries to "learn" new types of spam), has easily exploitable flaws. The spammers haven't gotten around to exploiting them because it probably hasn't seemed worth their while--probably not enough people are using the same type of filter yet. But they will, eventually. At which point filters that take a fundamentally new approach will be required. Which the spammers will eventually figure out a way around. Etcetera.
Most spam filters are designed with the goal of filtering out spam that is similar to currently circulating spam; they make no attempt to resist an intelligent person who has spent some time thinking about how to circumvent the filter.
Bayesian filters are no exception here.
--Bruce Fields
An email address should be registered. Older email addresses could be more trusted than brand-new ones.
"If everyone would just ..."
I hear those words about spam and proposed solutions all the time. But the fact is, and will always remain so, that you cannot get absolutely everyone to do so (whatever that might be).
Consider the first possibility: "if everyone would just stop sending spam". Most of the spam comes from about 200 or so different spam gangs. Most of the rest comes from a few thousand naive victims that try it once or twice, get cut off, and never do it again (thus losing their investment in the spamware and "list of millions" they paid some spam gang for). Already, 99.999% of internet users do not send spam. A solution that requires getting so close to a perfect 100% just isn't possible.
Now for the second possibility: "if everyone would just stop reading the spam and buying from spammers". Spam works because the cost to spam senders is so utterly low that even sending to every internet user costs less than trying to trim the list down to those few people that really want what the spammers are peddling. This goes along with "just press delete". But it doesn't take much in response for the spammers to actually make a profit from their spam runs. And spammers for hire are making money even if their clients lose money, so as long as there is a supply of naive vendors who are willing to part with their money to get a spam run in their name, spammers profit. Again, this is a case where closing the gap between the 99.99% of people who don't even read the spam and the 100% needed to make spammers and their clients go away is just not going to happen.
But there is a third possibility: "if everyone would stop using ISPs that permit spam". If even so much as 50% of users who are using ISPs that permit spamming were to cancel and switch to a better ISP that doesn't, that would definitely have a substantial effect on that ISP. I bet even 10% would get noticed, although I think a bit more, like 25%, might be needed to get some of the worst ISPs to act. Of course many people do whine about things like "there is only one ISP here" (not anywhere near 50% face this problem) and "it costs me money to switch" (it costs the victims of spammers even more money for you to continue to support an ISP that is able to give you a discount by accepting pink money from spammers). If we were to simply identify the top 10 worst ISPs for permitting spam to come from or through their network, and get a whopping 25% to 50% of their customers to leave (preferring to go to the top 10 best ISPs for not permitting any spam in or out), this would make a substantial impact and cause some CFOs to panic. And this doesn't require anywhere near 99% to be a successful anti-spam campaign.
The above campaign can also be pushed harder if many of us refuse to accept email from those ISPs (and thus anyone in their network) as a sort of boycott against spam support. Of course there will be whiners here, too, saying "You have no right to block my email since I don't send spam" (but if they are supporting a spammer anyway, guess what).
My whole point is that we need to avoid any "solutions" that make it necessary for absolutely everyone to do something. There will be plenty of people that won't. Instead, the solutions we need are the ones which only require a practical number of people to take that action. If you don't like the ones I propose, then propose your own and say how many people would have to act to make it work.
now we need to go OSS in diesel cars
I doubt there will ever be an effective defense against spam. Just as with its predecessors, we really haven't solved the overall issue of identifying it or making it unattractive to the sender.
Some random points to ponder:
1) What is spam? One man's spam is another man's ham, so there is NO universal measure (although there are some good approximations).
2) We've never managed to shut down the telemarketers' cold calling. They're not too much of a nuisance (depending on your definition of nuisance - why do they always call at meal times?) as they have to pay a significant cost per call, and automation is largely unsuccessful.
3) Junk mail is also costly to send, compared to email, and I still get lots of that.
I suspect the real answer, much like with junk mail, is to move house occasionally. It feels rather like giving in to me though.
Luckily this is easier with email than real life, but still a royal pain. Meanwhile bayesian filtering is the best I've found so far.
I think the thing that will kill spam is the success of email marketing. I work at a company that does email marketing, i.e. VERY targeted campaigns (usually under 1,000 recipients, most of whom have some sort of business relationship with the client), easy ways to unsubscribe, always a valid reply-to address, etc. The results are great - we usually get about 80% opens and 10-30% click-throughs. We have one list/service that has 1,000 emails and gets 500 click-throughs when we send to it!
I get frustrated when I hear about ClickZ calling an email campaign to 800,000 people, where many people got the email up to six times, and they got a 4% open rate with a 4% click-through rate OF THE OPENS (i.e. - a 0.16% click-through rate), and called it a great success. Email marketing is a great tool, but spam really hurts it.
For example, I _love_ getting my email from half.com telling me that a book I want is available at the price I was looking for. It doesn't even seem like marketing. It's cheap, trackable, targeted, and they can load it with whatever other marketing message they want, too.
Anyway, one thing that annoys me about slashdot is that everyone seems to think that all email-marketing is spam, when there are at least some of us that are trying to do the right thing.
We actually have customers that we tell _not_ to use our service because they don't have a legitimate list. We tell them to start right now and collect every email address they can - have places on every form for people to give their email address, have a "newsletter sign-up" link on their website, etc., and then call us in a year with the list they put together and we'll help them with a campaign.
Engineering and the Ultimate
I know I'm not the only one who has deployed DSPAM on my system, and judging by the number of people reporting to the lists I'd say it's a success for everyone else running it too. In response to your comments about an intelligent person who can think about circumventing the filter...this really isn't accurate. If you look at what spammers are doing today to _try_ and circumvent spam filters, they seem to only be succeeding with static tools like spamassassin. Although the term 'Bayesian' filtering is a very loose term, they all usually have the following traits in common:
1. Unknown tokens are assigned a moderately neutral value.
2. Only the most interesting tokens are used in the actual calculation
3. Statistics are stored on a per-user basis
With the above 3 mechanisms, it is very difficult to craft a spam that will make it through a majority of filters, and here's why: since each user has different email behavior, the innocent tokens that exist in their system are going to be very different, meaning that a spammer can't simply "run their spam through a filter" like they can with spamassassin. With a tool like dspam, where chained tokens are used, it is even more difficult to determine what the most commonly innocent tokens are. Since only the _most interesting_ tokens are used (and not the most common), most of the common words a spammer might choose are never used in the calculation. Many spammers will flood emails with junk words that may or may not hit...such as "tomato" or what have you. These tokens, when they don't have any significant hits in the user's database, are given a fairly neutral value, which causes them to be ignored in the calculation. When it all hits the fan, ultimately a good spam filter will detect whatever spammy words a spammer has embedded (or even tried to hide) in the email and ignore any of the junk words that were unknown to the user's dictionary (or didn't have enough hits). The only way to get a spam through is to provide more tokens that are not only innocent, but more innocent than spammy tokens (e.g. 0.01 in value), and these types of tokens are very different for each user. Like I said, since DSPAM uses case-sensitive chained tokens, the spammer would need to come up with two adjacent tokens, case sensitive, that a majority of users are likely to have as very innocent in their dictionary...not a very easy feat.
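For anyone who wants to see the three traits in action, here is a minimal sketch. It is not DSPAM's code: the constants (NEUTRAL, MIN_HITS, the clamp to 0.01..0.99), the function names, and the naive-Bayes combining step are illustrative assumptions loosely in the spirit of Paul Graham's essay, and it uses single tokens rather than chained ones.

```python
from math import prod

NEUTRAL = 0.4    # assumed score for tokens with too few hits in this user's data
MIN_HITS = 5     # assumed minimum sightings before a token gets its own score

def token_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Per-user spamminess of one token, clamped to 0.01..0.99."""
    s = spam_counts.get(token, 0)
    h = ham_counts.get(token, 0)
    if s + h < MIN_HITS:
        return NEUTRAL                      # trait 1: unknown tokens are near-neutral
    spam_freq = min(1.0, s / n_spam)
    ham_freq = min(1.0, h / n_ham)
    p = spam_freq / (spam_freq + ham_freq)
    return min(0.99, max(0.01, p))

def spam_score(tokens, spam_counts, ham_counts, n_spam, n_ham, limit=15):
    """Combine only the `limit` tokens whose scores are farthest from 0.5."""
    probs = [token_probability(t, spam_counts, ham_counts, n_spam, n_ham)
             for t in set(tokens)]          # each token counts once per message
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:limit]                     # trait 2: most interesting tokens only
    spam_like = prod(top)
    ham_like = prod(1 - p for p in top)
    return spam_like / (spam_like + ham_like)

# Trait 3: these dictionaries stand in for ONE user's statistics.
spam_counts = {"viagra": 50, "free": 40, "click": 30}
ham_counts = {"free": 2, "meeting": 40}

plain = spam_score(["viagra", "free", "click"],
                   spam_counts, ham_counts, 100, 100, limit=3)
padded = spam_score(["viagra", "free", "click", "tomato", "zebra", "kettle"],
                    spam_counts, ham_counts, 100, 100, limit=3)
print(plain, padded)
```

With `limit=3` the junk words never make the cut, so padding the message leaves the score unchanged. If a message has fewer interesting tokens than the limit, the neutral ones do get pulled in and nudge the result slightly, so the effect depends on how the limit is chosen.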
I'm not blind enough to say it's impossible to do, just very difficult...and should some spams get through that are crafted to hit these tokens, the spam filter should quickly learn and adjust these tokens to a slightly more neutral value - meaning the NEXT time they spam, they'll have to find another set of very-innocent tokens.
While it may be somewhat feasible to craft an email that targets a small group of people, spammers don't make any money off of that - they only make money when a large mass of their emails can get through, so even though I could find some way of getting around YOUR bayesian filter, it's extremely difficult to find a way to get around a hundred thousand people's.
While I do realize that there are potential exploits involved, and have read several papers on such, I think many of them are overrated. Even in my own testing many of the exploits haven't significantly impacted filtering. Should a spammer find a way that really does beat the system, it's only a matter of a little time before whatever development "tweaks" are made to fix the problem.
This is a bit like pointing out that exploiting some buffer overflow is difficult, and concluding that buffer exploits will never happen. The problem of course is that it only takes one person to figure out the exploit and automate it.
I haven't read the papers about bayesian filters (recommendations? I'd be interested), but I'd think the first attack would be on the tokenizer. What does a bayesian filter do, for example, with a message consisting of nothing but tokens it's never seen before? (It should be possible to convert an arbitrary message to a message with unique tokens using unicode tricks and misspellings and such.)
Also my understanding is that bayesian filters only capture the frequency of tokens, with little or no information about their ordering. So tricks like appending ham-like messages to spam might be effective. As you point out, the notion of "ham-like" may vary significantly from user to user:
We'd need to do experiments to determine if this is an easy feat or not; it could be that analysis of a few popular mailing lists and such would yield enough data about what "ham" looks like to be useful to a spammer. I'd think that spam filters that really depend heavily on a small number of user-specific "good" tokens to identify ham would have unacceptably high false-positive rates. A great deal of the legitimate mail that I receive (e.g., mail from the linux kernel mailing list) is not directed specifically at me.
It seems to me that the frequency of tokens in a message captures much too little information about the message, and it should be relatively easy to find ways to automatically munge spam messages to make those frequencies look innocent, without greatly degrading the spam signal.
--Bruce Fields
Paul Graham's paper on Bayesian filtering, although incomplete, is a great start to understanding how it all works. http://www.paulgraham.org.
Several attempts have been made to attack the tokenizer, which is one area where DSPAM has a considerable lead on other tools. DSPAM performs several different deobfuscation techniques prior to tokenizing a message. From simple things such as removing embedded html comments to more complex issues such as j/u-n,k t,e*x$t, DSPAM makes every attempt to deobfuscate such messages - and is very successful. Mis-spellings are actually ideal ways to identify spam because they show up much more frequently in spams than in innocent mail - DSPAM treats them just like any other token.
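To make the two cases above concrete, here is a rough sketch of that kind of pre-tokenizer cleanup. This is not DSPAM's actual code; the regexes, the function name, and the "two or more embedded marks" threshold are my own assumptions.

```python
import re

HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# A single non-alphanumeric, non-space character sitting between two letters.
EMBEDDED_PUNCT = re.compile(r"(?<=[A-Za-z])[^A-Za-z0-9\s](?=[A-Za-z])")

def deobfuscate(text: str) -> str:
    """Strip HTML comments, then collapse words split up by stray punctuation."""
    text = HTML_COMMENT.sub("", text)
    words = []
    for word in text.split():
        # Assumed heuristic: a single mark ("well-known", "don't") is left alone;
        # two or more marks inside one word are treated as obfuscation.
        if len(EMBEDDED_PUNCT.findall(word)) >= 2:
            word = EMBEDDED_PUNCT.sub("", word)
        words.append(word)
    return " ".join(words)

print(deobfuscate("j/u-n,k t,e*x$t"))      # junk text
print(deobfuscate("vi<!-- x -->agra"))     # viagra
```

A heuristic this crude will also mangle things like email addresses and collapses whitespace, so a real filter would need to be pickier about what it rewrites.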
DSPAM tracks ordering to some degree - if a token shows up in a particular header, or a URL, etc., it makes note of that (for example URL*[Email Address] is a LOT more guilty than just your email address). Even attaching ham messages doesn't quite do the trick, for the reasons I mentioned in my previous email.
Frequency isn't measured on a per-message basis but just totals. E.g. if the word 'offer' appears once or 20 times in a message it makes no difference to most filters...for obvious reasons.
The easy solution to spam is to make the identity of the spammer known to all.
Do their neighbors know that they live next door to a spammer?
When a customer walks into your store, do you know if they are a spammer?
When someone hits on you at a bar, do you know if it's a spammer who is hitting on you?
When you're on highway patrol and catch someone speeding, do you know if it is the spammer that is speeding?
When you walk down the sidewalk and pass by a car parked on the street, do you know if it is the spammer's car?
When your kids go to school, do they know the spammer's kids?
When you are delivering (paper) mail, do you know if it is the spammer's mail?
When you are serving food to someone, do you know if you're serving food to a spammer?
When you receive a call to 911/poison control, do you know if this is a spammer calling 911/poison control?
Spam is a community problem, and the community is the one best able to deal with it.
All the community needs is information.
The problem will solve itself.
In their configuration management department - until they laid off 40% of the work force. It was a nice place to work. That was my last permanent position. Nothing but short term contract jobs since then.
Eric, if you're reading this, I could sure use a job.
-- Will program for bandwidth
Try www.paulgraham.com instead. The .org address is a photographer in Glasgow :-)
No, at best, we'll rather have a law that means jail time at least for recidivist spammers.
They need some drastic illustration of the harm their "business" can do.
The proverbial one night with Bubba in Cell Block 3 should finally teach them to never ever try and sell penis enlargements again. Oh, and by the way, please webcast close-up video account of their experience to that lovely town of Spam Haven (somewhere in Florida IIRC).
Make your lawmakers make laws... Call your congresscritter now!
You mean this one?
In other words, spammers have already started to attack bayesian filters (or at least filters that identify keywords) and DSPAM is using techniques to deal with those particular attacks. The bayesian filter didn't automatically learn to defend against the tokenizer attacks--humans had to intervene and write code. And the code they wrote doesn't deal in general with attacks against the tokenizer--it deals with the particular attacks that have been tried so far.
We can both imagine further attacks on the tokenizer, and we can both imagine defenses against those attacks. This is an arms race. It's not a very satisfactory long-term solution.
I believe the reason you gave was that you thought the "ham"-identifying tokens would be too particular to the individual receiver? Again, I'm not so sure this is true--for example, any filter that I use has to (at a minimum) identify as "ham" almost all email from the linux-kernel mailing list and a dozen other lists on various topics. Any spammer can download the archives of a few big mailing lists and test out their spam against a bayesian filter that passes mail on those lists.
I doubt the ham each of us receives is *that* unique. And if even only 10% of the mail we receive is significantly generic, then this is enough---a spam filter that wrongly identifies 10% of my mail as spam is close to useless to me.
--Bruce Fields
Bruce,
Bottom line is you can complain about it all you want or you can actually try it and see that it works. I've got better things to do today - cheers.
The solution there is fairly simple. Spammers have a product they want to sell. That product will usually originate in the country where the spam recipient lives (ie: U.S.A.), so even if the spammer hides behind foreign remailers you can still identify one of the parties that are within U.S. jurisdiction. The government can therefore lay a charge of "conspiracy to deliver spam" against John Doe and the U.S.-based company that contracted the spammer.
The key is not to whitelist, blacklist, etc. The key is to make mass emails impossible.
The answer should be obvious. What do you care if your email to your Aunt Millie takes 20 seconds to send?
All sendmail or other mailers should demand a pain-toll before allowing you to pass. The toll should be plug-in, so that there's always the first (common) one to fall back on, while new ways to get approval (such as $-based, blacklists, whitelists, etc.) can be added.
But at core, the common one should be a painful calculation -- a large public/private key handshake, for example. If the spammer has to buy a Cray to send out 10000 emails, then WE WIN.
The problem with this is that it demands a sendmail replacement. Everybody needs to have the sending component to get email to those with a pain-toll-based receive version.
But the advantage is huge. Imagine a world where you can decide to allow all emails in for either:
a. A 10 cent donation to UNICEF
b. Those with a public key in your database (known friends/whitelist)
c. Those willing to do a 10000 byte key encrypt/decrypt function (one which goes fast on YOUR end).
SPAM as we know it simply GOES AWAY.
I would hasten to add that actual $-based systems can be added but are entirely optional.
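Here is a minimal sketch of what the common "painful calculation" toll could look like. Note that I have swapped in a hash-based puzzle (in the style of HashCash) instead of the public/private key handshake suggested above, because it is simple to show; the stamp layout, the function names, and the 12-bit difficulty are all assumptions for illustration.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the front of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def mint(recipient: str, date: str, bits: int) -> str:
    """Sender side: brute-force a nonce until the stamp's hash is rare enough."""
    for nonce in count():
        stamp = f"{recipient}:{date}:{nonce}"
        if leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits:
            return stamp

def verify(stamp: str, recipient: str, bits: int) -> bool:
    """Receiver side: one hash, so checking is cheap on YOUR end."""
    if not stamp.startswith(recipient + ":"):
        return False   # stamp was minted for somebody else
    return leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits

stamp = mint("aunt.millie@example.com", "2003-12-01", bits=12)
print(stamp, verify(stamp, "aunt.millie@example.com", bits=12))
```

Each extra bit doubles the expected work for the sender while the check stays at a single hash. Because the stamp names the recipient, a spammer cannot mint one stamp and reuse it across a whole list.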
The fact that so many spammers don't have to invest in any actual product has a positive effect on their bottom line as well. That's why they're so evasive about their address/phone/contact information. Most, if not all, of the 100+ spam emails I receive every day are obviously completely fraudulent. I find it amazing that anyone actually responds to any of it.
/.'ed the poor website and I won't get to read it for another 12 hours or so... *sigh*
The spammers are outlaws, but it would be good if the few actual identifiable vendors who profit from Spam could have their feet held to the fire. They can't ALL be overseas. For the rest, I say block them all if that's the only way.
As for the article, I'm afraid we've
Everything I've ever learned the hard way was based on a statistically invalid sample.
I guess this is the best this spammer could do since the sendmail patch.
The contents of this message have been doubly encrypted by ROT13
I've never doubted you on that. I use spam filters myself, and find that they work; that's not the point. Your original claim was that spam filters were now good enough that we no longer have to worry about the problem of spam. What the rest of us would like to point out is that the fact that some spam filters currently work reasonably well is *not* sufficient evidence to establish that they will work on their own as a long-term anti-spam solution.
--Bruce Fields
As you might imagine, he's on our side, and he does a good job of clearly describing the current state of spam, and the possible solutions."
I'm a spammer, you inconsiderate clod!
Anti-spam tools also do not prevent one of the most annoying things with spam, especially when on a narrow line: You have to spend time and money downloading the spam before it can be identified as spam.
My original point was that spam filters are good enough and therefore we no longer need to worry about legislation, do-not-email lists, and other less effective forms of filtering. If everyone who complained on slashdot about spam would install a filter at their ISP, I think you'd find there would hardly be any spam left in the world. Obviously, additional resources are going to be given to improving the effectiveness and learning capabilities of spam filters...but so far the effectiveness of even the most basic filters hasn't changed over the past few years that Bayesian has been hot. We should always be working on improving our software, but my point was that there are a million other "solutions" people are wasting their time with on this board.
Then I can't become rich by helping out the family of a deceased Nigerian warlord? WHY are u people SO selfish??
Come on Taco, help him out with a direct link to the FAQ!
I must say I am frustrated this morning at not being able to read the article. Acmqueue seems to be complete toast.
Read Epic the first RPG novel.
All I get is:
Fatal error: Call to undefined function: message_die() in db/db.php on line 88
When I try to access the link. I really want to read this, can anyone help?
HashCash has some limitations that make it unworkable in the wild. The one I noted is that it is necessary for the recipient (e.g. the one who is trying to cut back on the costs imposed by spammers) to keep track of the stamps that have been spent, up to the expiration period. Further, the costs imposed by spammers are still imposed anyway, if the server is not the one verifying the stamps (and thus also keeping a database of spent stamps for every user it serves).
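To show the bookkeeping being described, here is a rough sketch of a receiver-side store of spent stamps. The class and method names are made up, and the issue time is passed in separately as a simplification; a real stamp would carry its own date.

```python
class SpentStamps:
    """Remembers accepted stamps until they expire, so none can be used twice."""

    def __init__(self, lifetime_seconds: float):
        self.lifetime = lifetime_seconds
        self._issued_at = {}   # stamp -> issue time, kept only while unexpired

    def __len__(self) -> int:
        return len(self._issued_at)

    def accept(self, stamp: str, issued_at: float, now: float) -> bool:
        # Expired stamps can be forgotten: they are rejected on age alone.
        self._issued_at = {s: t for s, t in self._issued_at.items()
                           if now - t <= self.lifetime}
        if now - issued_at > self.lifetime:
            return False       # too old
        if stamp in self._issued_at:
            return False       # already spent
        self._issued_at[stamp] = issued_at
        return True

store = SpentStamps(lifetime_seconds=100)
print(store.accept("stamp-1", issued_at=0, now=10))    # first use
print(store.accept("stamp-1", issued_at=0, now=20))    # replay
print(store.accept("stamp-1", issued_at=0, now=500))   # expired
```

The table only has to cover the expiration window, but somebody has to hold it for every mailbox being protected, which is the cost being pointed out.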
HashCash would also be a burden on legitimate mailing lists. Of course, to solve that problem, whitelisting of the mailing list would be used. But it tends to be inconvenient to whitelist during subscription. This could be solved by using the HashCash only on the initial signup confirmation, and whitelist thereafter for the bulk mailings. But this still has a problem. I get lots of spam already that mimics mailing lists I am on, using the mailing list itself as the sender, and my tagged email which I signed up with as the recipient. So having whitelisted it lets the spam in, and spammers will make more use of this technique by including such details in their spam lists.
If HashCash could be modified to also include information only the real sender can prove she has, without revealing it in the ability to verify it (e.g. PKC), that might help.
now we need to go OSS in diesel cars
Here's a spam-fighting idea - I haven't read of ideas similar to this one.
Not all spam wants you to spend money using a credit card (CC). But for those that do, allow a CC transaction to be labeled as "Spam".
This CC transaction is essentially contested by the customer contacting the CC company, providing a copy of the e-mail and details about the transaction. The CC tells the vendor that the customer really didn't want the item, instead the customer wanted to "tell" on the vendor -- that the vendor is sending spam.
Vendors with too many transactions labeled as "spam" have their accounts terminated.
Yes, there are holes in this: people angry at a company could tag transactions with that company as "Spam". Spammers could advertise for vendors that have no idea that customers are being led there via spam. It can be a pain to go through the entire buying process. Most sites these days require the CC's matching billing address be provided. The item could have been delivered by the time the vendor is notified.
(hmm... maybe it needs some work)
State and federal laws will not eliminate spam. It is nice to have these guys on our side but spam is bigger than the federal or state governments. The bad guys will just move off shore to avoid the laws if they are enacted.
Like it or not, the internet is anarchistic in nature and it allows both good and bad things to happen because of that nature. Spam to me is like pollution: it will take the cooperation of many nations to bring it under control, and it is doubtful that even if that cooperation happens it will be eliminated.
I don't think that the internet is ready to have a real but virtual government. A set of virtual laws regulating spam and other criminal behavior that could be enforced across international boundaries would be nice, but it would also be restrictive. The politics would ruin the potential of the internet and it would be a nightmare to make fair for everyone.
For the time being, yes we should have local, state, and federal laws passed that regulate spam but some of the responsibility should be put on the user's end. The laws could require ISPs to filter UCE and they could require tools be built into email clients that would allow recipients to submit (report) the UCE that they receive to a central repository that the ISPs could draw their filter info from. This would be analogous to the requirements put on automakers to prevent pollution. As motorists, we are required to purchase unleaded gas and to have catalytic converters.
----- The following addresses had permanent fatal errors -----
... while talking to localhost.ftc.gov.:
uce@lhasa.ftc.gov
(reason: 554 Transaction failed, No space left on device)
(expanded from: <uce@ftc.gov>)
----- Transcript of session follows -----
>>> DATA
554 5.0.0 Service unavailable
ms
That's a rather cogent observation... sorry I don't have mod points today.
I'm coming to the conclusion that what is necessary is to attack the "making money" part of spam. One way that might work is similar to the "release gadzillions of sterile loathsome parasites" method that eradicated the screwworm fly in the U.S.
Or, spam them back.
If the spammers get hundreds of thousands of bogus requests for more information or signups on their web page (signing up other spammers, of course) for every legitimate one, they could never find the dollar bills buried in all the crap.
What it would take would be an Eliza-like program to convert a spam into a request for more information, and (more complicated) a program to download a web page, find the form, and fill it in with data that looks legit enough that it will take a human followup attempt to determine its bogosity.
Yes, this would result in more network traffic wasted in the short run. In the long run, if it were to make spam uneconomical, it might be a net gain.
If people would be willing to fundamentally change the protocol used for email, there would be a pretty simple solution for Spam, and untraceable email in general - sender-hosted email.
The fundamental problem is that email is sent to a receiving server immediately, which receives it without much in the way of caring where it comes from. The sender might be illegitimate, or even gone by the time the receiver checks the email. The receiver pays for the storage resources - this is receiver-hosted email.
The solution is a protocol that doesn't send email - rather, only a header is sent, and the message itself is stored for retrieval on a host that the sender runs, or pays for. The header contains the reference to the waiting message which is retrieved when the receiver wants to read it (and marked as read so the sender can automatically delete it).
What this means for spam - the spammers pay for their own email servers - no free rides. The mail is absolutely traceable - it must be on the specified server to retrieve it. And if the spammer account goes away for abuse, so does the email - spammers can no longer shotgun a million messages from a sacrificial account.
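Here is a toy model of the idea, just to pin down the moving parts. Everything in it (class names, fields, deleting the body on first fetch) is an assumption for illustration, not a worked-out protocol.

```python
import uuid
from dataclasses import dataclass

@dataclass(frozen=True)
class Header:
    """The only thing that travels to the receiver up front."""
    sender: str
    subject: str
    host: str
    message_id: str

class SenderHost:
    """Storage run (or paid for) by the sender; bodies live here until fetched."""

    def __init__(self, name: str):
        self.name = name
        self._bodies = {}   # message_id -> (sender, body)

    def submit(self, sender: str, subject: str, body: str) -> Header:
        message_id = uuid.uuid4().hex
        self._bodies[message_id] = (sender, body)
        return Header(sender, subject, self.name, message_id)

    def fetch(self, message_id: str):
        # Simplification: "marked as read" is modelled as deleting on first fetch.
        entry = self._bodies.pop(message_id, None)
        return entry[1] if entry else None

    def close_account(self, sender: str) -> None:
        # Account removed for abuse: its waiting mail disappears with it.
        self._bodies = {mid: e for mid, e in self._bodies.items() if e[0] != sender}

host = SenderHost("mail.example.net")
note = host.submit("alice@example.net", "Lunch?", "Noon at the usual place.")
print(host.fetch(note.message_id))

junk = host.submit("spammer@example.net", "Buy now", "...")
host.close_account("spammer@example.net")
print(host.fetch(junk.message_id))   # None: the body went away with the account
```

A message to several recipients would need per-recipient read tracking rather than delete-on-first-fetch, and the fetch step is where the security questions would have to be answered.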
Security issues would be more of a problem, but are fairly easily solvable.
Alas, I have no time to pursue this idea. Too bad, 'cause I'm on the verge of just giving up email entirely.
Spammers will exist as long as somebody pays them to send unwanted messages. Any legal or economic remedy has to allow for the punishment of companies that use spam for advertising, in addition to the delivery service. Kill off the customers, and the business of spamming will become much more difficult.
Focus on opt-in vs opt-out solutions is also half-baked. You probably don't know whether you've ever opted in to an agreement containing fine print that says "this agreement establishes a transferable, ongoing business relationship". If you have, then you're toast regardless of any existing or proposed law, since there's no real control over what the European Union's privacy framework calls "onward transfer" of information unless what's given once can later be taken away.
Do-not-spam lists will not work effectively unless they contain provisions to retroactively revoke any previous permissions. Requiring annual renewal of any opt-in permissions is probably going to be necessary.
Taxation without representation is tyranny! Statehood for DC, Puerto Rico, Virgin Islands & Pacific Territories!
Where are you pulling your numbers? Revenue is *trivial* for fraudsters to pull in, given the many thousands of rip-off artists in the world, especially those who do identity theft and dig deep into your savings (such as the Nigerian bank deposit scam spammers).
It's *profit* that they rarely make. Most spammers are suckers who bought into wildly advertised pyramid schemes, but there are enough suckers born every minute to make the spam a never-ending deluge. Like people heading to the gold rush, it only takes a few (fraudulent!) cases of people making a big profit to keep all the suckers lining up, digging through the trashpiles and finding fool's gold to encourage or maybe swindle the next round of suckers.
"The first, double opt-in, requires that a subscriber e-mail two messages to get on a list. The first message requests addition of thus-and-such address (this first message can be done via a Web form, e-mail, or even scanned badges at a conference). The list owner then sends a confirmation ("challenge") message saying, "If you really want to subscribe, reply to this message"--usually with some random number in the subject to prevent guessing. Only when that reply is received is the address added to the list."
This is not "double opt-in", this is "confirmed opt-in". Accept no substitutes.
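Whatever you call it, the flow in the quoted paragraph is easy to sketch. The class and method names below are hypothetical, and actually sending the challenge message is left out.

```python
import secrets

class MailingList:
    def __init__(self):
        self.subscribers = set()
        self._pending = {}   # token -> address awaiting confirmation

    def request(self, address: str) -> str:
        """Step 1: someone (maybe not the owner) asks to add this address."""
        token = secrets.token_hex(8)   # the "random number in the subject"
        self._pending[token] = address
        return token                   # would be mailed to `address`, not returned

    def confirm(self, address: str, token: str) -> bool:
        """Step 2: only a reply carrying the right token adds the address."""
        if self._pending.get(token) != address:
            return False
        del self._pending[token]
        self.subscribers.add(address)
        return True

ml = MailingList()
token = ml.request("reader@example.com")
print(ml.confirm("reader@example.com", "guess"))   # False
print(ml.confirm("reader@example.com", token))     # True
```

The point of the token is that only someone who can read mail sent to the address can finish the subscription, so a forged signup goes nowhere.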
The second is confirmed opt-in. It works exactly like double opt-in, except that the confirmation message says, "You have been added; do this if you want to unsubscribe."
A more accurate name for this would be "confirmed opt-out".
After all, the one main reason we get spam is that spamming is profitable. If people stop ordering Viagra and cable descramblers from strangers who email them, there will be no point in keeping it up. Maybe we should make it easier for ppl to have anonymous access to sleaze, which seems to be the major selling point of spam.
In an ideal world they wouldn't need to, I agree.
But the reality is that spam exists, and users need to learn how to deal with it.