Beat Spam Using Hashcash
Shell writes "If they want to send spam, make them pay a price. Built on the widely available SHA-1 algorithm, hashcash is a clever system that requires a parameterizable amount of work on the part of a requester while staying "cheap" for an evaluator to check. In other words, the sender has to do real work to put something into your inbox. You can certainly use hashcash in preventing spam, but it has other applications as well, including keeping spam off of Wikis and speeding the work of distributed parallel applications." If you're specifically interested in hashcash for your mail server, Camram has some interesting ideas -- their Frequently Raised Objections page may be illuminating.
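For anyone who wants to see the "expensive to make, cheap to check" asymmetry in code, here is a minimal sketch in Python. It is not the real hashcash stamp format: the `resource:counter` layout, the function names and the default of 12 bits are all assumptions made for illustration. The only thing taken from the summary is the use of SHA-1 and a tunable amount of work.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # 8 - bit_length() is the number of leading zero bits in this byte
        bits += 8 - byte.bit_length()
        break
    return bits

def mint(resource: str, bits: int = 12) -> str:
    """Sender side: try counters until SHA-1(stamp) starts with `bits` zero bits.

    Expected cost is about 2**bits hashes. The "resource:counter" layout is a
    simplification, not the actual hashcash stamp format.
    """
    for counter in count():
        stamp = f"{resource}:{counter}"
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp

def check(stamp: str, resource: str, bits: int = 12) -> bool:
    """Receiver side: a single hash verifies the stamp, however long minting took."""
    if not stamp.startswith(resource + ":"):
        return False
    return leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits
```

Each extra bit doubles the sender's expected work while the receiver still computes one hash, which is the "parameterizable" part.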
Your post advocates a
(*) technical ( ) legislative ( ) market-based ( ) vigilante
approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)
( ) Spammers can easily use it to harvest email addresses
(*) Mailing lists and other legitimate email uses would be affected
( ) No one will be able to find the guy or collect the money
( ) It is defenseless against brute force attacks
( ) It will stop spam for two weeks and then we'll be stuck with it
(*) Users of email will not put up with it
( ) Microsoft will not put up with it
( ) The police will not put up with it
( ) Requires too much cooperation from spammers
( ) Requires immediate total cooperation from everybody at once
( ) Many email users cannot afford to lose business or alienate potential employers
( ) Spammers don't care about invalid addresses in their lists
( ) Anyone could anonymously destroy anyone else's career or business
Specifically, your plan fails to account for
( ) Laws expressly prohibiting it
( ) Lack of centrally controlling authority for email
( ) Open relays in foreign countries
( ) Ease of searching tiny alphanumeric address space of all email addresses
( ) Asshats
( ) Jurisdictional problems
( ) Unpopularity of weird new taxes
( ) Public reluctance to accept weird new forms of money
( ) Huge existing software investment in SMTP
( ) Susceptibility of protocols other than SMTP to attack
( ) Willingness of users to install OS patches received by email
(*) Armies of worm riddled broadband-connected Windows boxes
( ) Eternal arms race involved in all filtering approaches
( ) Extreme profitability of spam
( ) Joe jobs and/or identity theft
( ) Technically illiterate politicians
( ) Extreme stupidity on the part of people who do business with spammers
( ) Dishonesty on the part of spammers themselves
( ) Bandwidth costs that are unaffected by client filtering
( ) Outlook
and the following philosophical objections may also apply:
(*) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
( ) Any scheme based on opt-out is unacceptable
( ) SMTP headers should not be the subject of legislation
( ) Blacklists suck
( ) Whitelists suck
( ) We should be able to talk about Viagra without being censored
( ) Countermeasures should not involve wire fraud or credit card fraud
( ) Countermeasures should not involve sabotage of public networks
( ) Countermeasures must work if phased in gradually
( ) Sending email should be free
( ) Why should we have to trust you and your servers?
( ) Incompatibility with open source or open source licenses
( ) Feel-good measures do nothing to solve the problem
( ) Temporary/one-time email addresses are cumbersome
( ) I don't want the government reading my email
( ) Killing them that way is not slow and painful enough
Furthermore, this is what I think about you:
(*) Sorry dude, but I don't think it would work.
( ) This is a stupid idea, and you're a stupid person for suggesting it.
( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!
Today, spammers buy and sell large lists of email addresses on CDs or other media. Each of these addresses took some mining to find and put on the CD.
In the future (if this takes off), these lists will simply contain the hashes along with the addresses. This temporarily makes the spammers' lives a bit difficult, but doesn't have a long-term impact.
Spammers share information. The cost of all those hashes amortized over a few years to a large number of spammers is nothing.
There are plenty of available solutions; however, they're all designed/implemented/pushed by large companies - viva la open source.
The end effect of this is eventually bad, or utterly worthless.
Joe Sixpack wants to send a mail. If it takes him an hour to parse a key, he's not going to mail his mother anymore.
If a spammer has to spend an hour processing the key, he's just going to invest more of his time getting zombie PCs to get the work done for him.
Who wins here? Certainly no one.
Disclaimer: the hour was used as an example. I've no clue how long it takes, but the point should still hold.
The moral being, don't make the end users pay for the actions of spammers. We have laws for spammers now; it's time to start using them.
For example, Sourceforge sends site-wide update messages about once a month or so. They have tens, if not hundreds of thousands of users. If every one of those users used HashCash, Sourceforge would practically need a dedicated server farm computing hashes simply in order to send out its update notices.
This is a really, really stupid idea.
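To put rough numbers on the Sourceforge case above, here is a back-of-envelope sketch. Everything fed into it is an assumption: the 100,000 recipients is just the low end of "tens, if not hundreds of thousands", and the 20-bit stamp and one million hashes per second are made-up illustrative values, not measurements. The only real fact used is that a stamp with `bits` leading zero bits takes about 2**bits hash attempts on average.

```python
def mailing_cost_hours(recipients: int, bits: int, hashes_per_second: float) -> float:
    """Expected CPU-hours to mint one stamp per recipient.

    All three inputs are hypothetical knobs; plug in your own numbers.
    """
    expected_hashes = recipients * 2 ** bits   # ~2**bits attempts per stamp
    return expected_hashes / hashes_per_second / 3600

# Illustrative inputs only (assumed, not measured).
hours = mailing_cost_hours(recipients=100_000, bits=20, hashes_per_second=1e6)
print(f"about {hours:.1f} CPU-hours per mailing")
```

Under those assumed inputs a monthly mailing costs on the order of a day of one CPU, and the figure doubles for every extra bit of stamp difficulty.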
An awful lot of spam has been generated from machines infected by worms. If the spammer controls a thousand zombie machines, he'll have all the CPU power he needs...
Sort of like burning your harvest to keep grain prices high. Just send me a completed work unit of Seti-At-Home or Folding-At-Home in an email header. I am sure, given the incentive of every e-mail message advancing their goal, some of these projects can come up with work units that are difficult to calculate but easy to verify.
Maybe for once zombied Windows boxes will be more productive than they would be under their users' control.
the fact that not everyone is sending legitimate email with a powerful computing device. Something that could cause an inconvenience to a spammer with a boatload of cheap commodity 2GHz desktop systems (either their own or a zombie army) will bring more modest systems to their knees. Handhelds, phones, old 486 systems recycled for use in the 3rd world, set top boxes, embedded systems, etc. will no longer be viable systems with which to send mail. And what about web mail providers?
There's simply no reason to resort to kludge solutions that depend on penalizing those who cannot afford top-of-the-line systems.
To me greylisting seems like the best thing to do. See:
http://slett.net/spam-filtering-for-mx/greylisting.html
and/or:
http://projects.puremagic.com/greylisting/
In a nutshell, it simply uses a standard 451 SMTP response that says "Hey, I'm busy now, can you call back in a minute or so?" To my knowledge, all standard SMTP servers respect this request, and few if any of the mass mailers do. And if they do, their bandwidth will triple.
Here's a log example:
Oct 15 15:18:17 example1.example.com sendmail[6955]: [ID 801593 mail.info] i9FJIGH06953: to=, ctladdr= (168/601), delay=00:00:01, xdelay=00:00:01, mailer=esmtp, pri=121994, relay=example2.example.com. [123.390.141.456], dsn=4.3.0, stat=Deferred: 451 4.7.1 Greylisting in action, please come back in 00:01:00
If the mail never comes back, then the sender is now blacklisted. If the mail does come back, the sender is whitelisted.
Simplest and most standards compliant thing that I've heard of, and it seems to work.
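For the curious, the defer-then-accept logic described above can be sketched in a few lines of Python. This is not how any particular greylisting package does it. The `Greylist` class, keying on an (ip, sender, recipient) triplet and the in-memory dictionaries are assumptions for illustration, and the blacklisting of senders that never retry is left out.

```python
import time

class Greylist:
    """Toy greylist: defer the first attempt, accept a retry after `delay_seconds`."""

    def __init__(self, delay_seconds=60, clock=time.time):
        self.delay = delay_seconds
        self.clock = clock            # injectable so the logic can be tested
        self.first_seen = {}          # (ip, sender, recipient) -> time of first attempt
        self.whitelist = set()        # triplets that have retried properly

    def check(self, ip, sender, recipient):
        """Return the SMTP reply the server would give for this delivery attempt."""
        key = (ip, sender, recipient)  # the triplet key is an assumption of this sketch
        if key in self.whitelist:
            return "250 OK"
        now = self.clock()
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay:
            # The sender queued the message and came back: treat it as a real MTA.
            self.whitelist.add(key)
            return "250 OK"
        return "451 4.7.1 Greylisting in action, please come back later"
```

A real deployment would also expire old entries and persist the tables across restarts.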
Hashcash predates the MS Research project.
This article is about the first correct (supposedly) Python implementation of hashcash.
Only unsolicited mail needs a hashcash field.
Spammers never use RFC compatible SMTP servers
And spammer tactics remain static, so the same techniques that worked five hours or five years ago will continue to work indefinitely. Not.
It's that in order for this to be useful, it has to be widely implemented. Anybody who sends a lot of legitimate email (e.g., hotmail) is going to need to buy a lot more CPU. So it's not going to get widely implemented. So it won't help. Sorry. :'(
I consider mailing lists a cute throwback to a much earlier time. Don't get me wrong, I subscribe to three or four myself. But every single one of them, I could just read on-line (and no, not all Yahoo lists, only one in fact).
To effectively eliminate spam, I would gladly visit a web page rather than have the same info appear in my mailbox.
Er... How does that differ from actual spam? I don't give two shakes of a rat's ass whether or not UCE comes from a "legitimate" source. I don't want it. Any of it. So, it really doesn't bother me that, for the benefit of no more "Free v1@6ra" email, I also lose out on "buy our totally legit ink cartridges" at the same time. I consider it a perk, not a problem.
The problem with technologies like this is that they need to gather widespread acceptance to become useful.
A quick grep on my mail archive (which is HUGE) failed to find a single message with an X-HashCash header. That means even if I enabled it now, it would be practically useless.
Of course wide acceptance could be achieved by means of a widespread grassroots campaign, but this is the hard way. If somebody big like GMail, Yahoo Mail, MS Outlook or Apple Mail started to use it, that would have a snowball effect.
Why not have it compute the stamp before you send the mail? You start a new mail window, that least intensive of applications. In the background it calculates the stamp while you type.
Under that system, you could make the stamps take as much as a minute to compute. Very few e-mails are written in less than twenty seconds, most take a few minutes. Really short messages go via IM. You still queue it to go after the stamp is ready to deal with the short e-mails, of course.
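A rough sketch of the mint-while-you-type idea, in Python. The `Draft` class is a hypothetical stand-in for a mail client's compose window, and `mint` is a toy proof-of-work in a simplified `resource:counter` layout rather than the real hashcash stamp format. A real client would probably mint in a separate process rather than a thread.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from itertools import count

def mint(resource: str, bits: int) -> str:
    """Toy proof-of-work: find a counter whose SHA-1 has `bits` leading zero bits."""
    target = 1 << (160 - bits)   # SHA-1 digests below this have enough zero bits
    for counter in count():
        stamp = f"{resource}:{counter}"
        if int.from_bytes(hashlib.sha1(stamp.encode()).digest(), "big") < target:
            return stamp

class Draft:
    """Hypothetical compose window that starts minting as soon as it is opened."""

    def __init__(self, recipient: str, bits: int = 12):
        self.recipient = recipient
        self.body = ""
        self._pool = ThreadPoolExecutor(max_workers=1)
        # Minting runs in the background while the user types the message.
        self._stamp = self._pool.submit(mint, recipient, bits)

    def send(self) -> dict:
        # Waits only if the user finished typing before the stamp was ready,
        # which is the "queue it until the stamp is done" case.
        stamp = self._stamp.result()
        self._pool.shutdown()
        return {"to": self.recipient, "X-Hashcash": stamp, "body": self.body}
```

Usage would look like `d = Draft("mom@example.com"); d.body = "Hi Mom"; d.send()`, with the wait in `send()` shrinking the longer the user spends typing.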
Just add some javascript that would hash the message, some part of the URL or page, or a salt and that would be a required part of sending.
Unfortunately this means that each installation would need its own javascript function. Otherwise you just take a look at the wiki package, see what sort of computations it does, write a program to perform the same computation in C, do a google search for the wiki engine and compute 1000 hashes in the same time the javascript has calculated one.