Microsoft Researching Anti-Spam Technique
Tim C writes "Microsoft's Research group are working on a technique to combat spam. Dubbed the 'Penny Black project', it involves making email senders perform a computation taking around 10 seconds, which their recipients can then check for. This delay would limit bulk emailing speeds to around 8000 a day, meaning that to spam all of those 'fresh, guaranteed 25 million addresses' would take approximately 8.5 years." We've reported on this before.
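For what it's worth, the arithmetic in the summary can be checked with a quick sketch. It assumes a single sending machine running around the clock and exactly 10 seconds of work per message; both are simplifications for illustration, not details from the project. At exactly 10 seconds each, the ceiling is 8640 messages a day, which the summary rounds down to "around 8000".

```python
SECONDS_PER_DAY = 24 * 60 * 60
COST_SECONDS = 10          # assumed per-message computation time
ADDRESSES = 25_000_000     # the "25 million addresses" list from the summary


def max_mails_per_day(cost_seconds: float) -> float:
    """Upper bound on messages one machine can stamp in a day."""
    return SECONDS_PER_DAY / cost_seconds


def years_to_mail(addresses: int, mails_per_day: float) -> float:
    """Years needed to reach every address at the given daily rate."""
    return addresses / mails_per_day / 365.0


per_day = max_mails_per_day(COST_SECONDS)
print(f"{per_day:.0f} mails/day at {COST_SECONDS}s each")
print(f"{years_to_mail(ADDRESSES, per_day):.1f} years at the exact rate")
print(f"{years_to_mail(ADDRESSES, 8000):.1f} years at the rounded 8000/day")
```

The estimate scales down linearly with the number of machines a sender controls.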
Is it something that will require using Outlook on Windows to work? Alternatively, will I be forced to use some MS software just to send mail to people who are using MS based web/mail/etc client/server programs?
The law of excluded middle : Either I'm foo or I'm foobar
We studied this in a computer security course I took. This technique has been proposed for TCP connection establishment as well. It involves the server calculating a hash of a particular nonce (random value). The server then provides the hash and a certain number of bits of the nonce. It becomes the client's job to complete the nonce such that the value hashes out correctly. The server can vary the number of bits it provides to vary the difficulty of the puzzle...
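A minimal sketch of that kind of client puzzle, as I understand the description. SHA-256, the 32-bit nonce, and the function names are assumptions made for illustration; the comment above does not name a hash function or sizes.

```python
import hashlib
import secrets


def _hash_nonce(nonce: int, nonce_bits: int) -> str:
    return hashlib.sha256(nonce.to_bytes(nonce_bits // 8, "big")).hexdigest()


def make_puzzle(nonce_bits: int = 32, hidden_bits: int = 16):
    """Server side: pick a random nonce, publish its hash and its high bits."""
    nonce = secrets.randbits(nonce_bits)
    revealed = nonce >> hidden_bits  # the bits handed to the client
    return _hash_nonce(nonce, nonce_bits), revealed, nonce_bits, hidden_bits


def solve_puzzle(digest: str, revealed: int, nonce_bits: int, hidden_bits: int):
    """Client side: brute-force the hidden low bits until the hash matches."""
    base = revealed << hidden_bits
    for low in range(1 << hidden_bits):
        candidate = base | low
        if _hash_nonce(candidate, nonce_bits) == digest:
            return candidate
    return None


def verify(digest: str, nonce: int, nonce_bits: int) -> bool:
    """Server side: checking an answer costs a single hash."""
    return _hash_nonce(nonce, nonce_bits) == digest


if __name__ == "__main__":
    puzzle = make_puzzle(hidden_bits=16)
    answer = solve_puzzle(*puzzle)
    print("solved:", verify(puzzle[0], answer, puzzle[2]))
```

Each extra hidden bit doubles the client's expected work, while the server's check stays at one hash.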
By rejecting their emails otherwise. D'uh.
You really want to email me [or get priority over other emails] you will do as I say.
Of course you can get to the point where it's too much hassle. I think MSFT is seeking to have this built into OE [e.g. integrated]
Tom
Someday, I'll have a real sig.
I know, I think Microsoft should charge the customer for each and every message that is routed through an Exchange server. Just think of the money they could make and help curb spam.
Got Code?
Your server can do the calculations for you. That's the point. You pay for email right? [if you don't run your own server]. Then why not expect your ISP to actually provide service.
The idea though is that you can automate the process. E.g. unless the email has a tag on it that's valid you delete/filter the message.
Tom
Someday, I'll have a real sig.
The idea was originally formulated to use CPU memory cycles by team member Cynthia Dwork in 1992.
;)
But they soon realised it was better to use memory latency - the time it takes for the computer's processor to get information from its memory chip - than CPU power.
Don't GPUs have a lot smaller memory latency?
Hmm, what's this?
BrookGPU: General Purpose Programming on GPUs
That's just it, reductions. HC is based on the difficulty of finding collisions in a hash. If you break HC you break the hash.
This memory-bound one doesn't have such a nice reduction but it's conjectured to be similar.
So you can't "fake the method". Sure they could put a fake header in there, e.g.
X-MBHC: BLAH
But the verifier could trivially see it was faked.
Tom
Someday, I'll have a real sig.
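To illustrate why a made-up header fails, here is a toy hashcash-style stamp. It is deliberately simplified: it is neither the real Hashcash stamp format nor the memory-bound scheme from the paper, and the resource string and the 16-bit difficulty are assumptions picked for the example.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the front of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mint(resource: str, bits: int = 16) -> str:
    """Sender: try counters until the stamp's SHA-1 has `bits` leading zeros."""
    for counter in count():
        stamp = f"{resource}:{counter}"
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp


def verify(stamp: str, resource: str, bits: int = 16) -> bool:
    """Recipient: one hash is enough to accept or reject a stamp."""
    if not stamp.startswith(resource + ":"):
        return False
    return leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits


if __name__ == "__main__":
    stamp = mint("alice@example.com")
    print(stamp, verify(stamp, "alice@example.com"))
    print("BLAH", verify("BLAH", "alice@example.com"))
```

Minting takes about 2^bits hashes on average, and a stamp minted for one recipient does not verify for another.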
I have two points. First, I think you're wrong about that. They speak in terms of the sender and the recipient taking actions, but I think they're referring to software on the sender and recipient computers taking these actions, and not humans. The only action that was clearly intended to be taken by a human was the part about aggressively whitelisting good recipients, which is definitely something that I anticipate users will need to be willing to do.
The second point that I have is that the whining is interesting, and this is a big part of the problem. We, the lazy users, will absolutely have to get used to taking some sort of action ourselves as part of whatever the SPAM solution turns out to be. Right now we like the very low barrier to entry into the e-mail community, but that is exactly what makes SPAM possible.
I have taken a couple of very small steps in the direction of participation in the solution. I decided to start signing all of my e-mail with my PGP signature. It is ignored by many and it confuses many, and it probably makes some roll their eyes (it's quite a geek fashion statement). But it damn sure identifies the message as one that I wrote, and it (sort of, except without a CA) identifies me as a person and not a spammer. I feel that PGP signatures might very well be a part of the SPAM solution. Everybody could sign all of their e-mail, which is getting easier for non-geeks every day, and we could all start rejecting e-mail that is not signed. We could even all get real keys from real CAs and reject all mail from users that have not been independently verified. Send whatever you want in your e-mail, even Viagra ads, but make sure I can trace it back to YOU.
The second step I have taken is to install and use SpamAssassin on my mail server. It's something that is making the situation more tolerable, although it's still costing me a little in terms of bandwidth of the messages I never see and don't want to see being sent to my server. It also minimizes the impact of SPAM on me, which could be a bad thing because my SPAM problem is actually bigger than I regularly realize. But my point is that it required some effort on my part. It wasn't enough for me to bitch about SPAM. I had to take an action.
SPAM is more like terrorism (bear with me) than is initially obvious. Do you check under your car for a bomb before you get in? Neither do I. But I did when I lived in a place where car bombs against my demographic were a reality. I altered my behavior to counter the threat. I could have said, "I shouldn't have to check under my car," but instead I got down on the ground and took a look. I could also say, "Airport security is an inconvenience," or "Do I look like a terrorist?" or "SPAM should just go away or be 'fixed' by the government or somebody like Microsoft, but not in a way that I have to participate." But the problem is here and it's staring us in the face. We must change our behavior in order to fix the problem. Once we're all on board with the fact that we are all a part of the solution, we can be free of it.
This MS Research stuff is all very interesting, and all ideas are welcome at the table of solutions, but the neat thing is that the technology to remove SPAM from our lives already exists. But it's a little strange and uncomfortable. It would be great if we could all pull together on some sort of e-mail signing solution and work together to get the word out to the world that we can take our e-mail system back.
First, though, we have to get over the fact that we MUST change our assumptions and we must raise the barrier to entry -- not much, but some.
Finally, I'm sure I probably misunderstood the spirit of your reply. It got me started on a vent, and that's not a bad thing.
RP
On the other hand, IBM Research has done pretty well, though it too has gone through hard times. Its contributions to open-source are substantial, and at the same time, it's much more in touch with the demands of the company.
Now, if someone had beaten me to it and moderated my parent as flamebait perhaps I'd have kept quiet....
Mencken had it right. So glad that's old news.
If you have zero-waitstate memory you could essentially own the system [well it's still a slowdown but you will win overall].
However, 8MB of what essentially amounts to cache is expensive. This means that for a spammer to spam in volume, they now have to buy a $20,000 CPU.
The catch, though, is that in the original HC, slowing spammers down means slowing down the lower-end users too.
MSFT research realized that if you make the memory bus the major limitation you can level most desktops. E.g. a P4-3000 is only 4 times faster than a P2-233 in terms of tag generation.
RAM is relatively cheap [even in older desktops] so you can step this up to [say] 32MB buffers. They will only be required to send an email but will totally prevent "zero-wait state 32MB cells" since they would cost a shit load of money.
Of course this makes the system useless for portables since they often have little memory to spare. At the conference the speakers suggested that the ISP would then generate tags [at a cost] for the users.
Tom
Someday, I'll have a real sig.
Ok, I'll bite - why not just insert a "sleep (10);" line into the connection response of sendmail (or qmail, or whatever MTA you are using)? By making the sender wait 10 seconds before delivery can begin, you get the same effect as a tar-pit...
Ron Gage - Westland, MI
I believe you 100%, only Microsoft would come up with a solution that artificially induces inefficiency.
I'm no fan of Microsoft, but this is silly. Lots of security tools "artificially induce" inefficiency. One relatively early example that comes to mind is Unix crypt, the function originally used to hash passwords. It runs a DES-like algorithm many times to produce its results, not because that improves the quality of the hashing, but because it takes longer, which makes brute force attacks harder. The Unix login program also deliberately introduces an artificial delay after every failed login attempt, and it's not to give you time to remember your password.
There are many instances in which slowing down legitimate users a little is an effective mechanism for deterring abuse.
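A modern analogue of the crypt trick, as a rough sketch: it uses PBKDF2 from Python's standard library rather than the DES-based crypt algorithm itself, and the iteration count and salt size are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = 100_000) -> Tuple[bytes, int, bytes]:
    """Deliberately slow hash: the iteration count is the 'artificial' cost."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, iterations, digest


def check_password(password: str, salt: bytes, iterations: int,
                   expected: bytes) -> bool:
    """Recompute with the stored salt and count, then compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    salt, iterations, digest = hash_password("correct horse")
    print(check_password("correct horse", salt, iterations, digest))
    print(check_password("wrong guess", salt, iterations, digest))
```

A legitimate login pays the cost once; someone brute-forcing guesses pays it on every attempt.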
That said, I still think this particular idea is stupid, since there are plenty of people who have a legitimate reason to send large volumes of e-mail, and this would cause them more pain than it would cause spammers.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
So this would have the effect of making legitimate high-volume, high-subscribership mailing lists expensive to operate
Well, maybe. There still could be a white list for cases like this.
I think that high volume mailing lists should probably actually be newsgroups anyway. But what it does do is put a crimp in people who host a lot of low volume mailing lists.
The point is they did produce a result, it was published in a first tier crypto journal and the results are acknowledged as correct.
And my point is that your comment is both insulting to MSR and misses the point.
Your comment is insulting to MSR because anybody who knows anything about CS research knows that MSR has top people. They have produced hundreds of first tier journal publications over the years. This is just a minor publication among many good things MSR has done.
It's meaningless because you are missing the main problem that all industrial research labs share: making the connection between research and products. MSR has been as unsuccessful at that as any other of the big industrial computer research labs before. Microsoft's problem is the quality and lack of innovation in their products, not their research labs.
mod parent offtopic.
I suppose when your points are weak, you have to fall back on calling on moderators. Why don't you engage your brain instead of falling back on such underhanded tactics?
So why bother with all the computation and hashing, and just refuse to accept connections from a given IP except every 10 seconds? So if an email was sent from AAA.BBB.CCC.DDD at 00:00.00, don't accept another from that IP until 00:00.10.
This makes it happen entirely at the recipient server side, so you're not breaking SMTP, and it's backwards compatible with everyone else.
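A sketch of that recipient-side throttle, assuming an in-memory table keyed by source IP. The class name and the injectable clock are made up for the example; hooking this into a real MTA is not shown.

```python
import time


class PerIPThrottle:
    """Accept at most one message per source IP every `min_interval` seconds."""

    def __init__(self, min_interval: float = 10.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the logic can be tested
        self.last_accepted = {}     # ip -> time of the last accepted message

    def allow(self, ip: str) -> bool:
        now = self.clock()
        last = self.last_accepted.get(ip)
        if last is not None and now - last < self.min_interval:
            return False            # too soon: refuse or defer this connection
        self.last_accepted[ip] = now
        return True


if __name__ == "__main__":
    throttle = PerIPThrottle()
    print(throttle.allow("192.0.2.1"))   # first message from this address
    print(throttle.allow("192.0.2.1"))   # immediate retry from the same address
```

Note that this only limits each individual address: a sender spread across many machines is throttled per machine, not overall.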
On the other hand, if it's 10 seconds per email it doesn't sound like this would be feasible to implement:
First, let us note that the S in SMTP stands for simple. What may look like a "flaw" today was indeed an attempt to make a standard that is usable with no regard for OS, system, bandwidth, transmission medium, or any of the other factors which complicate computers today now that everyone and their grandma has one.
Micro$oft's proposal has several issues. First, the proposal itself:
"If I don't know you, I have to prove to you that I have spent a little bit of time in resources to send you that e-mail."
This changes the effort to convincing the system that I know you and we can bypass all of this. Microsoft's track record tells me that this will be accomplished quickly (likely before the software even reaches final release.)
"...use memory latency ... that way, it does not matter how old or new a computer is because the system does not rely on processor chip speeds..."
No, it relies on bus speeds and memory speeds, not to mention caching schemes. These change almost as rapidly as processor speeds these days.
All of that is meaningless when you look at the greater problem:
"For this scheme to work, it would want to be something all mail agents would want to do,"
There are 2 ways to implement such a solution; on the server side and on the client. As for the server:
Not just want to do but be able to do. Since SMTP servers began requiring authentication (several years ago), most spammers have turned to using old servers still alive on the net. These would not have new schemes implemented. Refusing to let them play if they don't update would kill several servers (including several universities).
As for the client:
Anyone who can say "HELO" can send a mail (see RFC 821, RFC 1123, RFC 2821). This means that any decent coder can write a mail SMTP client in about 30 minutes. We will never be able to assume all spammers are using any e-mail client.
"It is certainly not going to stop all spam for good"
And in the aftermath, we will all have slowed our systems with no effect on spam levels.
You don't understand: once the sender does this there will be some type of key. If the client doesn't see this key in the headers or wherever then it will be seen as spam by the receiving client.
How do you know if the key is valid?
Why can't a spammer just make up a false key? Does the client check it mathematically? How long does that take? Why not just delete the spam manually (like we all do now) if it's still going to take time to filter it out?
LK
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
The only thing I could argue against that would be that if this did go through it would make the trojans and viruses not only more noticeable, but it would make infected machines almost impossible to work on, thus resulting in more of them being fixed (cause you can't use a broken computer) and fewer relays! This does seem to be a fairly good solution. Though I do have to agree that if MS decides to create the method it better be an open standard that everyone else can adopt or it'll go the way of BetaMax & OS/2.
Kleedrac
Sure we wang, can.
Well, maybe. There still could be a white list for cases like this.
I think that high volume mailing lists should probably actually be newsgroups anyway. But what it does do is put a crimp in people who host a lot of low volume mailing lists.
As somebody who hosts low-volume mailing lists, I have to agree.
Whitelists are nifty (we use them extensively), but what worries me on that score is that if they become frequent, I suspect we'll just see spammers hijacking address books along with machines, and forging "trusted" From lines.
Slashdot's token middle-aged housewife
You are missing the point. Nobody is saying that this is going to be required for all machines. Essentially it is an extra header attached to emails so email recipients can filter messages that don't have this tag. As I see it this is how it would work for most end users.
First set up a whitelist, make this your first spam check. On the whitelist? Email goes through never checking for any other spam criteria. (Mailing lists should be accepted here).
For mail that doesn't pass the white list check we can check for the header created by the MS program. We verify that the computationally intense header is correct and maybe we can let that through if we want, maybe I let emails with this tag pass through my spam checker with a higher spam score.
If we decided to accept mails with the header, we now check the remaining email with a very thorough spam checker and use a very low score.
No matter how many computers they have, it will lower the number of emails that are able to be sent, if people filter on this criterion.
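The flow described above could be sketched roughly like this. The header name `X-Stamp`, the two thresholds, and the `stamp_is_valid` callback are all hypothetical stand-ins, and the spam score is assumed to come from whatever content filter you already run.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class Message:
    sender: str
    headers: Dict[str, str] = field(default_factory=dict)
    spam_score: float = 0.0   # assumed output of an existing content filter


def classify(msg: Message, whitelist: Set[str],
             stamp_is_valid: Callable[[str], bool],
             stamped_threshold: float = 8.0,
             unstamped_threshold: float = 3.0) -> str:
    # 1. Whitelisted senders (friends, mailing lists) skip every other check.
    if msg.sender in whitelist:
        return "accept"
    # 2. A valid proof-of-work stamp earns a more lenient spam threshold.
    stamp = msg.headers.get("X-Stamp")   # hypothetical header name
    if stamp is not None and stamp_is_valid(stamp):
        threshold = stamped_threshold
    else:
        # 3. Everything else faces the strict, low threshold.
        threshold = unstamped_threshold
    return "accept" if msg.spam_score < threshold else "reject"


if __name__ == "__main__":
    friends = {"list@example.org"}
    check = lambda s: s == "valid-stamp"   # placeholder verifier
    print(classify(Message("list@example.org", spam_score=9.0), friends, check))
    print(classify(Message("x@example.com", {"X-Stamp": "valid-stamp"}, 5.0), friends, check))
    print(classify(Message("x@example.com", spam_score=5.0), friends, check))
```

Nothing here requires every sender to stamp their mail; unstamped mail is simply judged more harshly.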
As a matter of policy, I do not respond to whitelisting requests because the sender of the whitelisting request has already accused me, with zero basis in fact, of being a spammer...
If you got a whitelisting request from him, it would have been because your message looks like spam. That is not a zero basis in fact from his perspective.
In fact it would be because you did something in your email to total a high Bayesian filtering score.
As the sender *I* would not be insulted if that were to happen. In fact, it would be great to know that the mail I send is not being silently trashed. How unimportant is your message that the perceived insult is of greater importance?
I always wonder these days whether a mail got through, when it is not answered. I find I end up on the phone more often than not, because mail is no longer a reliable method of communication due to spam.
If you continue to get a lot of whitelist requests after such a system is implemented, it would behoove you to make your mail look less like spam. For instance, not using Base-64 encoding, or sending purely HTML mail, or including trademarked names of pharmaceuticals, or including random strings of characters, linking to spam domains, putting lookalike accented characters or too much punctuation in the subject line, or cc'ing or bcc'ing everyone in your mail.
M$ should consider out-sourcing it since well....my hotmail account still gets spam even though I set it to exclusive (meaning only email from ppl in your address book will get through); spam with obvious fake addresses. And the spam that goes through this "exclusive filter" also seems to fly past my custom filters that have the words that the spam has ("financial", "viagra", "herbal", etc.)
Yahoo works better with regards to spam though I wish it would empty the bulk mail folder more often.
And my pop3 acct has something called greylisting and that alone cuts 95% of spam. Plus black and white listing IPs and domains helps too (for instance, only allowing email from hotmail.com if it originates from one of hotmail's servers, etc.) and blocking known spam-haven Class C ranges (eg x.x.x.*).
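For anyone who hasn't met greylisting: the server temporarily rejects the first delivery attempt from an unknown (IP, sender, recipient) triplet and accepts a retry after a delay, since real mail servers retry and much spamware doesn't. A minimal sketch, with the 5-minute delay and the class name as assumptions:

```python
import time


class Greylist:
    """Temp-fail unknown (ip, sender, recipient) triplets until they retry."""

    def __init__(self, min_delay: float = 300.0, clock=time.monotonic):
        self.min_delay = min_delay
        self.clock = clock        # injectable so the logic can be tested
        self.first_seen = {}      # triplet -> time of the first attempt

    def check(self, ip: str, sender: str, recipient: str) -> str:
        key = (ip, sender.lower(), recipient.lower())
        now = self.clock()
        first = self.first_seen.get(key)
        if first is None:
            self.first_seen[key] = now
            return "tempfail"     # an MTA would answer with a 4xx code here
        if now - first < self.min_delay:
            return "tempfail"     # retried too quickly
        return "accept"


if __name__ == "__main__":
    grey = Greylist()
    print(grey.check("192.0.2.7", "a@example.com", "me@example.net"))
```

A production version would also expire old entries and remember triplets that have already passed.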
First, the protocol is overly complex. The receiver sets the puzzle. How does the receiver do this? By sending the puzzle before receiving the email? That is complex, perhaps involving connections that must remain open for tens of seconds, or lists that correlate puzzles to particular senders, and the sender must match the answer. How will the puzzle be generated? Will it be pseudorandom or pad? How will we gauge the strength of the puzzle? I do not see how this is superior to current filtering.
Second, alternate filtering methods will still be needed. Whitelists will have to be kept so that friends, interoffice mail, and current customers will not be challenged. Email that does not meet the challenge will still have to be accepted and filtered. The only advantage is that certain email will be tagged as 'safe' because the sender solved your puzzle. This 'safe' email will still often have to be filtered to meet the specific needs of the receiver. For instance, a 'safe' email may still contain graphic sexual content unsuitable for the office.
Third, there may be no way to know whether the calculation was done. If the puzzle is pseudo-random, the sender may exploit some weakness. If the puzzle is off a standard one-time pad, and the number of puzzles is finite, or can be cataloged into a finite number of sets, the sender may have a database that already contains complete or partial answers. So, even if the spammer is not using owned hardware, there is no way to know that each email is in fact generating any specific liability.
Again, this is a ploy for MS to sell servers to advertisers. The number of machines, and related number of MS licenses, is going to be non-trivial. The client will be built into Outlook and the marketing will convince consumers that anything marked safe is legitimate advertising and not spam. This does nothing to solve the spam problem.
"She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
This is just hashcash.
Hashcash is wasteful... it just runs processes at full blast for tens of seconds to tens of minutes at a time, which is a small energy waste but overall a loss.
Hashcash is impotent... any hashcash scheme cheap enough to let someone with an older computer send mail in less than minutes won't slow down a P4-3GHz at all.
Hashcash is harmful, because it makes no distinction between solicited and unsolicited mail. How would you subscribe to Slashdot without whitelisting it?
And once you're whitelisting senders, you might as well just whitelist everyone you get mail from, and now you only need to discourage unknown senders. And hashcash is still a silly solution there, how about real cash?
Here's one way to do that. Whitelist not a sender, but a server. A server at a company that simply charges a few pennies to a few dollars to forward mail (you pick the level of unsolicited mail you want), or one that requires other hoops...
Much simpler, doesn't require new proprietary Microsoft technology, and allows all kinds of alternatives...