Analysis of Spam, and a Proposed Solution
2bot_or_not_2bot writes "Spam: The Phenomenon is a detailed analysis of spam: products, scams, viruses, obfuscation methods, etc. Failed, and doomed-to-fail, methods of blocking spam are described. A general solution is proposed that does not: invade privacy, perform wide censorship or blacklisting, or involve payment and cooperation with corporations (beyond the transport and storage of data)." Hmmm.
We apply Islamic law.
They steal our time, money, and bandwidth.
We take their hands.
Striking fear in the authors of godawful fanfiction, I am here, appearing in darkness, Tuxedo Jack!
I'm glad the author included so many examples of actual spam messages. I was beginning to wonder what spam looked like.
John.
The best way to stop SPAM is to find the person(s) sending it and post their personal information on the web. Everything: email addresses, phone numbers, cell phone numbers, home address, business address, dog's name... everything there is... and let vigilante justice take over from there...
I mean, come on: if only .5% of the people (s)he sent spam to called his cell phone and left a nice voicemail, every day, all day, he would start to know what it is like to be harassed, for it to cost him money out of his pocket, and the grief that he caused so many...
"The word "genius" isn't applicable in football. A genius is a guy like Norman Einstein," - Joe Theisman
There's a boycott occurring for Microsoft's Caller ID for E-mail. They're asking for anyone developing a mail client, spam filter or mail transport agent to use a more open protocol, rather than a patented one.
The web page contains lots of images of SPAM that the author has received.
Here is the text of his proposal:
Test 1 2 3 4
John.
Here is another way of looking at it: Spammers exist because there are idiots out there who fall for "vicod1n" or "pen1s enl@rgement" or what have you. We should have users who are purchasing these products pay an additional "spam tax" on it, to compensate for the wasted bandwidth and so on. Sort of like "shipping and handling fee". Actually, it comes close to the Internet tax idea that Congress is punting about, but applied to spams.
Spammers are not very hard to track down. The companies that use their 'services' are even easier to track down. Many if not most are in the US or EU.
I've done it myself a couple of times, and have explained the relevant legal code from spamlaws. I have yet to hear back from either the spammers or the authorities I have explained this to.
I would think if law enforcement would do what it is SUPPOSED to do, spamming would be vastly reduced.
Counter Spam Measure: Negative Feedback.
Imagine if all or some very large contingent of email clients allowed you to
"retaliate" against spam messages. Highlight message, select "negative feedback"
option, a daemon is spun that traces back as far as possible the route of the
message and barrages it in some fashion. By pings maybe? By directed replies? Imagine
it does this in some scheduled fashion so as to minimize the impact on your local
network. As 1 million disparate sources converge upon the last traceable source of
the route of the offending spammer, some network somewhere will start to feel the
load. Like the spokes of a wheel converging on the hub, the retaliation traffic will
thicken as it closes in on the source. The pain increases. ISPs inundated by
individuals expressing their right to freedom of speech, will feel suddenly inclined
to exercise their right to refuse service to someone.
The "negative feedback" could be dosed in a coordinated fashion if there were some
P2P means of establishing how many individuals had received a particular spam. If a
spammer hits only a hundred people, the dose of retaliatory traffic would have to be
increased to be felt. If the spam hit a million, it would require only a modest
retaliation to utterly swamp the source.
Just thinking out loud. Could this be made to work? No one's free speech is
curtailed, spam is dealt a serious blow.
fight fire with fire.
It should be self-evident that this solution is not workable. Anything that requires this massive type of retooling of the whole method of using e-mail is doomed to failure.
Any proposed solution cannot cause this type of massive interruption of normal e-mail usage.
Someone is WRONG on the Internet!
Next!
Personally I really liked D. J. Bernstein's (qmail, djbdns, daemontools) idea for a new mail protocol. The big difference between it and the mail we have now is that only the notification of mail is sent, not the mail itself. The mail sits on the sender's mail server, waiting to be picked up, and if you want to retrieve it, your mail client does so from his server. Think about it: no more anonymous spam, since you KNOW where messages are coming from if you have to retrieve them. Therefore, if spam is illegal, we can punish them... and there is no more faking of where it's coming from.
The other cool concept in that is mailing lists vs bandwidth. In the old mailing list style, a message would go out to the whole list and bounce back from everyone whose box is gone or full, with a lot of traffic. In DJ's new way, only the notification of the message is sent, and then only those who really want the message download it.
The more you think about it, the better an idea it becomes. In the world of terrifying ideas like "postage for emails" or "really super-mega-expensive domain names for mail only", Bernstein's has an elegance and practicality I haven't seen elsewhere.
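To make the "notification only" idea concrete, here is a minimal in-memory sketch. It is not Bernstein's actual protocol design; the class names (`SenderServer`, `Recipient`), the notification format, and the `servers` lookup table standing in for a network connection are all assumptions made for illustration.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class SenderServer:
    """Hypothetical sender-side mail store: bodies stay here until fetched."""
    host: str
    _store: dict = field(default_factory=dict)

    def submit(self, body: str) -> dict:
        # Keep the body locally; hand out only a small notification.
        msg_id = secrets.token_hex(8)
        self._store[msg_id] = body
        return {"host": self.host, "id": msg_id}

    def fetch(self, msg_id: str):
        # Returns None if this server never stored such a message.
        return self._store.get(msg_id)


@dataclass
class Recipient:
    """Hypothetical recipient that holds notifications, not message bodies."""
    notifications: list = field(default_factory=list)

    def notify(self, note: dict) -> None:
        self.notifications.append(note)

    def retrieve(self, note: dict, servers: dict):
        # The recipient contacts the host named in the notification, so a
        # notification naming a forged host yields nothing to download.
        server = servers.get(note["host"])
        return server.fetch(note["id"]) if server else None
```

In this toy model the storage cost sits with the sender, and a recipient who never calls `retrieve` transfers nothing beyond the notification.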
I administer a mail server for a small ISP. The problem with filtering on the user's end is that my costs have already been incurred by the time the user deals with the spam. I don't think, as the article suggests, that spammers will slow down if their message is not being read; in fact they will just spew out ever more spam. If a hit rate of 1/10 of 1% does not deter them, a smaller hit rate won't either.
I have to put some upper limit on the amount of storage I can give each person (right now I allow 100M, which I think is quite reasonable). But if a user goes on vacation and does not check their e-mail for a month, they could have their inbox filled with spam and viruses (not much difference these days, from a server admin point of view). This will prevent legitimate messages from coming through. Therefore, I use the following technical measures to help reduce spam:
- RBLs: dnsbl.njabl.org, sbl.spamhaus.org, xbl.spamhaus.org, and dul.dnsbl.sorbs.net
- SPF:Sender (not adopted widely yet, but it does block a few messages a day even now)
- Blocking specific subject lines (during virus outbreaks this can help)
- Blocking mail "from" non-existent domains
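For readers who have not seen how an RBL check works, the usual DNSBL convention is to reverse the octets of the connecting IPv4 address, append the list's zone, and treat any successful A-record lookup as "listed". The sketch below assumes that convention; the function names are made up for illustration, and the zones listed above may no longer be operating.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers for this address (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name: the address is not on this list.
        return False
```

For example, `dnsbl_query_name("192.0.2.25", "sbl.spamhaus.org")` gives `25.2.0.192.sbl.spamhaus.org`.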
I really have no choice; I cannot afford not to take these measures. I explain all of them to my clients, and nobody has had a problem yet. These measures catch roughly 75% of spam and viruses, with, as far as I know, no false positives.
I have 1 email address that I have used for many many years, since far before spam was a problem. The problem is, my email address has passed beyond my control. You can still find it on the 'net in usenet archives, mailing list archives, and who knows what else. The point is, 10 years ago, we didn't think to conceal our addresses... we wanted to make them easy to find so that people could find *us*!
Even better, somehow, there's a database that matches names to email addresses. People other than me map to my email address, so I get "legitimate" spam.
Furthermore, not loading the images and not clicking on the links doesn't fix the problem entirely. I've checked, depending on which address they've spidered. Contact addresses for my web-design business that I shut down 3 years ago are still getting spam.
That I have to change an email address that I've had for nearly a decade... well.. it makes my blood boil.
Gentoo Sucks
My spam folder is full of mail with all sorts of crap random words.
The one or two which have gotten through look like they could have been written by a Perl guru.
Government of the people, by corporate executives, for corporate profits.
Post your email address and I'll forward my spam messages to you. That'll train your bayesian filter.
Two months after we moved out, we went for dinner there, I had to look up something quick in google and *OMFG* the computer is barely crawling, it has half the system tray filled with icons, and it has so much malware that adaware crashes :o
Self-installing and opt-out add-ons suck. Hard.
Seriously? Go to a syn-syn/ack-ack system.
The sending SMTP box says to the receiver, "I've got a message for you." The receiver caches the message, hands the source box a 32-digit random number, and says "I'll call back in 30 seconds at your FQDN." It does so. The receiver asks, "Did you send me a message with the serial 'x'?" If yes, then the source in the header wasn't spoofed and the message goes through; if not, the message gets dropped.
Almost all spam these days comes from spoofed sources. But if in this case it's still spam, it's a lot easier to track the source immediately and deal with it. Take away the ability to hide, and like mold in the sunlight, most of it will vanish without further effort.
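The handshake described above can be sketched in a few lines. This is only an in-memory illustration: the `directory` dictionary stands in for DNS resolution plus the callback connection, the 30-second wait is left out, and the class and method names are assumptions rather than any real SMTP extension.

```python
import secrets


class SendingHost:
    """Hypothetical sending server that remembers serials it was handed."""

    def __init__(self, fqdn):
        self.fqdn = fqdn
        self.outstanding = set()

    def record_serial(self, serial):
        self.outstanding.add(serial)

    def confirm(self, serial):
        # Answer the receiver's callback: "did you send serial x?"
        return serial in self.outstanding


class Receiver:
    """Hypothetical receiving server that verifies the claimed source."""

    def __init__(self, directory):
        self.directory = directory  # FQDN -> SendingHost (stand-in for DNS)
        self.pending = {}           # serial -> (claimed FQDN, cached message)

    def offer(self, claimed_fqdn, message):
        # Cache the message and issue a 32-digit random serial.
        serial = "".join(secrets.choice("0123456789") for _ in range(32))
        self.pending[serial] = (claimed_fqdn, message)
        return serial

    def call_back(self, serial):
        # Contact the host the message *claims* to come from.
        claimed_fqdn, message = self.pending.pop(serial)
        host = self.directory.get(claimed_fqdn)
        if host is not None and host.confirm(serial):
            return message  # source was not spoofed: deliver
        return None         # source denies sending it: drop
```

A spoofing sender receives the serial but cannot make the real host at the claimed FQDN confirm it, so `call_back` drops the message.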
Each item in the following list was suggested by the words or actions of people who presented themselves to the IETF or elsewhere as having discovered the FUSSP. Some of the items may seem obscure to those who have not dealt with the IETF.
Prevent email address forgery. Publish SPF records for y
That way you can use different addresses for mailing lists, orkut, random recipients, each Slashdot posting, etc., and blacklist addresses that get abused and/or only whitelist addresses you've sent people. There are some risks - the subdomain version occasionally gets hit by dictionary attacks, so you might receive 10 million messages on an occasional really bad day (this mainly happens if your subdomain doesn't run its own SMTP server that can milter it.)
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
It's dangerously bad. If email messages accurately identified where they came from, and if spammers didn't maliciously forge addresses of people they want to harass, and if spammers didn't usually abuse free email systems and free web pages or forge purely bogus sender addresses (usually also at free email systems), then that would be a fine idea. Many spammers also frequently put other people's valid URLs in their mail to fake legitimacy, e.g. URLs from E-Bay's news site or the Better Business Bureau or various anti-virus companies, in addition to having their own URL for the suckers to click.
Bill Stewart
This is simple and requires no changes to a mail client to function, but one small change would make things easier. The solution does not need to happen all at once to be effective, and does not change any of the current protocols for email (POP, IMAP, SMTP).
The idea: multiple, sender/use-specific addresses on the client side. Basically, instead of having one address with your ISP, you would have the ability to create up to 50 aliases to your account. Note that these are not 50 accounts; all of your mail still winds up in the main mail account at your ISP.
Let's say you have bob.smith@myisp.com as your email address. The goal here is that you would NEVER give out that address. Instead, you log in to your ISP's web site and create addresses that you then give out. These addresses can be set to expire after a set date, or only be removed manually.
If you like to pay your bills on-line, create an address bobsbilling@myisp.com and use that on all the registration forms for your utilities, credit cards, etc.
bobs-shopping: use it to register for any on-line shopping sites
bobs-long-ebay-address, sendmailtobob, tossaway32341, etc....
You create an address that you give only to your family/friends, you create an address for each mailing list, create an address that you put in the public LDAP systems and other person-search sites, create an address for sweepstakes/contests, etc.
If you start to get spam on an address (you can easily check the headers to see which address the spam was sent to), you simply change the address and tell the few people/sites that used that address about the new one. The more addresses you have, the fewer places you need to notify of any changes.
The only disadvantage is the initial changeover does take some time/effort. Once created, the addresses mostly just sit there and don't require any maintenance or routine changing.
The advantages: little to no spam; ability to easily identify WHERE the spammer acquired your address when you do get any; spam does not take up any bandwidth or storage space on the receiving mail server once an address is deleted after getting spammed; no resource-intensive and complicated filter software required on the server.
How well does it work? With about 35 addresses out there (many are web site specific), I receive only about 6 spam messages a month. Each and every one of those is sent to a public administrator address like webmaster, hostmaster or the like. Not too bad, considering I receive such email for about 10 domains.
In the last year or so since I've started doing this I have only had to disable a single address due to spam, and since it was for a single web site, it took less than five minutes to effect the changeover to a new address.
To those who say that this is too much of a hassle or takes too much effort, I ask this: would you rather have to spend 30 minutes a year maintaining and changing email addresses and informing senders of the new address, or spend 5 minutes a day updating your spam filters and double-checking the positive results for false hits?
As I stated, this does not require any changes to the mail clients, but if there were one change it would be nice: when you reply to a message, the client should automatically use the address that the initial message was sent to instead of attempting to use the actual account address.
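As a rough illustration of the bookkeeping an ISP would need for this, here is a small alias table. The `AliasBook` class and its method names are hypothetical; only the 50-alias limit, the optional expiry, and the example addresses come from the description above.

```python
from datetime import date


class AliasBook:
    """Hypothetical per-account alias table; every alias feeds one mailbox."""

    MAX_ALIASES = 50  # the limit suggested in the description above

    def __init__(self, mailbox):
        self.mailbox = mailbox
        self.aliases = {}  # alias address -> (purpose, expiry date or None)

    def create(self, alias, purpose, expires=None):
        if len(self.aliases) >= self.MAX_ALIASES:
            raise ValueError("alias limit reached")
        self.aliases[alias] = (purpose, expires)

    def revoke(self, alias):
        # Mail to a revoked alias is refused before it costs any storage.
        self.aliases.pop(alias, None)

    def route(self, rcpt, today=None):
        """Return the real mailbox for an active alias, else None."""
        today = today or date.today()
        entry = self.aliases.get(rcpt)
        if entry is None:
            return None
        _, expires = entry
        if expires is not None and today > expires:
            return None
        return self.mailbox

    def source_of(self, rcpt):
        """Say where an address was handed out, to trace a leak."""
        entry = self.aliases.get(rcpt)
        return entry[0] if entry else None
```

When spam arrives for an alias, `source_of` shows which site leaked it, and `revoke` followed by `create` with a new name is the whole changeover.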
Article X: The powers not delegated... by the Constitution...are reserved...to the people
Step 1: Salt the spammer's email databases with guaranteed bogus email addresses that no legitimate email sender has ever seen. This is currently trivially implemented as follows. In your website's robots.txt file, list several files that robots must not examine -- these are your honeypot. Then, fill those files with HTML that contains your bogus email addresses. Spammers will, quite reliably, disobey the robots.txt file, use it to discover HTML files that are not linked to from anywhere else in the world, and add your bogus mail addresses to their database.
Step 2: Implement greylisting + honeypot-based RBL. When email arrives that is not whitelisted, see if it comes from an IP address that is "temporarily" blacklisted in your RBL. If it is, you can reject it right now. Otherwise, see if the target address is in your honeypot database. If it is, add the sender's IP address to your RBL and fail immediately. Otherwise, engage the now-classic greylisting algorithm (see http://www.greylisting.org/) to "tempfail" the email. The point of the temporary failure is to give the spammer time to use the same IP address to send the same spam to an address that *is* in your honeypot database, so that you can then reject the retry of the spam to a legitimate email address.
- requires no per-user work, such as "training" of filters.
- requires no changes to any software, except MTAs (and only a handful of them handle most of the world's email). no new laws.
- no false positives. to get blacklisted you *must* have transmitted email to an address that could only have been obtained by illegally harvesting a website.
- even compromised home systems are not terribly harmed. if a spammer takes over your home computer and uses it, well, the IP blacklist need not be permanent, just long enough to cover a single spam run -- a few days is probably plenty. if the spammer is blasting out runs from your home computer continuously, well then you have worse problems than finding yourself unable to send email to GrandMa.
- not easy to defeat. right now, anti-spammers must work very hard to locate the "real" email amidst all that spam -- and never, ever mistakenly reject a "real" email. greylisting plus honeypot RBL inverts the equation. the spammer must make sure that not a single "bogus" email address is anywhere in his database! spammers are ingenious, but developing absolutely perfect lists of legitimate email databases is something they have no experience with so far.
- no restriction of free speech. total whacko strangers who aren't spammers can still send you email -- it may just get delayed for an hour or so (a fact which is totally true already).
- nobody makes any money off it. you don't have to pay anybody, except for the effort involved in setup and maintenance (a fraction of the total time wasted on spam currently).
- computationally cheap. most MTAs are already looking up IP addresses and target addresses in databases. cost of this scheme should not greatly slow down most MTAs. especially compared to content-examination schemes such as Bayesian filters.
- no judgement calls in blacklisting. no third party has to decide what is spam and what is not. the rbl in this scheme is totally generated from absolutely bogus email addresses -- the only way you can get in the rbl is to flat-out declare yourself a scumbag by sending to one of those illegally obtained addresses.
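The decision order in Step 2 can be sketched as a single function. This is a simplified illustration, not a drop-in MTA filter: the class name, the five-minute greylist delay, the three-day ban, and whitelisting by IP address are all assumptions chosen for the example.

```python
import time


class HoneypotGreylist:
    """Sketch of greylisting combined with a honeypot-fed temporary RBL."""

    def __init__(self, honeypot, whitelist=(), delay=300, ban=3 * 86400):
        self.honeypot = set(honeypot)    # bogus addresses seeded in Step 1
        self.whitelist = set(whitelist)  # trusted sender IPs (assumption)
        self.delay = delay               # seconds a new triplet must wait
        self.ban = ban                   # seconds an IP stays in the RBL
        self.banned = {}                 # ip -> time the ban expires
        self.seen = {}                   # (ip, sender, rcpt) -> first seen

    def check(self, ip, sender, rcpt, now=None):
        now = time.time() if now is None else now
        if ip in self.whitelist:
            return "accept"
        if self.banned.get(ip, 0) > now:
            return "reject"              # already in the temporary RBL
        if rcpt in self.honeypot:
            self.banned[ip] = now + self.ban
            return "reject"              # sender just declared itself
        first = self.seen.setdefault((ip, sender, rcpt), now)
        if now - first < self.delay:
            return "tempfail"            # classic greylisting: retry later
        return "accept"
```

A sender that hits a honeypot address during the greylisting window finds its retry to the real address rejected, while an ordinary sender is only delayed until its first retry after `delay` seconds.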
No scheme is perfect, but greylisting combined with an RBL that is derived solely from bogus email addresses is pretty damn good.
It did? Apple's Mail.app uses a Bayesian filter, right? Salting messages with random words hasn't thwarted its filter at all. I might see a couple or three spam every week, but considering that's out of hundreds filtered per week with no false positives, I can live with that.
He also makes the following curious claim:
Is this really a problem? I'd say this is one of Bayesian filtering's advantages.
So far, Bayesian filtering has worked wonderfully for me. I don't see that it's been defeated -- or will ever likely be truly defeated -- at all.