Microsoft Researching Anti-Spam Technique
Tim C writes "Microsoft's Research group are working on a technique to combat spam. Dubbed the 'Penny Black project', it involves making email senders perform a computation taking around 10 seconds, which their recipients can then check for. This delay would limit bulk emailing speeds to around 8000 a day, meaning that to spam all of those 'fresh, guaranteed 25 million addresses' would take approximately 8.5 years." We've reported on this before.
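The figures in the summary can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes a single machine computing stamps non-stop at exactly 10 seconds each, and the helper names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def mails_per_day(seconds_per_stamp):
    """Stamps one machine can compute in a day, working non-stop."""
    return SECONDS_PER_DAY // seconds_per_stamp

def years_to_send(recipients, per_day):
    """Years needed to reach every recipient at a fixed daily rate."""
    return recipients / per_day / 365

print(mails_per_day(10))                          # 8640
print(round(years_to_send(25_000_000, 8000), 2))  # 8.56
```

So "around 8000 a day" is the 8640 upper bound rounded down, and 25 million addresses at 8000 a day works out to roughly eight and a half years.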
How do you "make" senders do anything?
"Backups are for wimps. Real men upload their data to an FTP site and have everyone else mirror it." -- Linus Torvalds
Well actually yeah they did. At Crypto'03 a method for memory bound HC was presented.
So while MSFT didn't invent the original HashCash concept MSFT did improve upon it. So before anyone gets the bright idea of flaming MSFT ignorantly.... know your facts!
Tom
Someday, I'll have a real sig.
Is it something that will require using Outlook on Windows to work? Alternatively, will I be forced to use some MS software just to send mail to people who are using MS based web/mail/etc client/server programs?
The law of excluded middle : Either I'm foo or I'm foobar
We studied this in a computer security course I took. This technique has been proposed for TCP connection establishment as well. It involves the server calculating a hash of a particular nonce (random value). The server then provides the hash and a certain number of bits of the nonce. It becomes the client's job to complete the nonce such that the value hashes out correctly. The server can vary the number of bits it provides to vary the difficulty of the puzzle...
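The nonce puzzle described above might look roughly like the sketch below. It is an illustration only, not the Penny Black scheme or any real protocol: SHA-256, the 32-bit nonce size, and the function names are all assumptions chosen for the example.

```python
import hashlib
import secrets

NONCE_BITS = 32  # assumption: small enough to brute-force in a demo

def _hash(nonce):
    """Hash a nonce given as an integer."""
    return hashlib.sha256(nonce.to_bytes(NONCE_BITS // 8, "big")).hexdigest()

def make_puzzle(hidden_bits):
    """Server side: pick a nonce, publish its hash and all but the low hidden_bits."""
    nonce = secrets.randbits(NONCE_BITS)
    revealed = nonce >> hidden_bits  # the bits handed to the client
    return _hash(nonce), revealed, nonce  # nonce is returned only so the demo can check

def solve_puzzle(digest, revealed, hidden_bits):
    """Client side: try every completion of the missing low bits."""
    base = revealed << hidden_bits
    for guess in range(1 << hidden_bits):
        if _hash(base | guess) == digest:
            return base | guess
    return None

def verify(digest, candidate):
    """Server side: checking an answer costs a single hash."""
    return _hash(candidate) == digest
```

Each extra hidden bit doubles the client's expected work, while the server's check stays at one hash, which is the asymmetry the comment is pointing at.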
"The payment is not made in the currency of money, but in the memory and the computer power required to work out cryptographic puzzles. "
Phew!!! For a second there I thought I was going to have to do a math problem for each email I was going to send. I woulda been fucked!
Buy Steampunk Clothing Online!
This is not a solution... as *I* still have to check for something on my end, and then discard if that condition is not met... my bandwidth and time are still wasted.
Whine!
It may not be the end-all, be-all solution, but obviously we haven't found that yet. This seems like a pretty good solution for the moment. A better one may come along and make this one null and void, but we keep finding ideas that are a little better than the last.
How can that be a bad thing?
http://use.perl.org
Even today, the most annoying spammers are not using their own computers, but instead they are bouncing e-mail off virus infected and trojaned PCs.
So 8,000 emails / day is fine, if you have a couple thousand relays to pick from.
---- join dshield.org Distributed Intrusion Detec
No, it *is* a solution...
Some of your bandwidth and time is being wasted in the short term, because spam is still being circulated.
But in the long term, spam ceases to be an effective business model.
Count on Microsoft's "cure" to be worse than the disease itself. You would think for $40 billion they could buy just a little more intelligence than that.
SMTP needs to be redesigned. Not by Microsoft, who will use any change in the protocol to tighten their monopoly grip, locking in their customers (and locking out the non-Microsoft world), but by the IETF.
Spammers having to do a computation before delivering email isn't going to limit them to 8000 pieces of mail a day, it simply means they're going to cluster all of those Windoze boxes their custom worms have infected, and let those millions of PCs do the work for them in parallel. SPAM won't decrease one bit, but the load and toll it places on those who use the net will go up significantly.
The solution isn't to increase the cost of email (computationally, bandwidth-wise, or financial), the solution is to repair the design flaws in SMTP (and, for that matter, USENET, something that remains the most useful medium on the 'net despite its widespread abuse) that make SPAM a viable methodology.
The Future of Human Evolution: Autonomy
Mod parent down [-1,unsightful]
The research this is based on [presented at crypto'03] is designed to level the difference between a P4-3000 and a P2-233. They use problems where cache hits will be lower [e.g. use a 8MB buffer or something] so you end up computing at the speed of your memory bus.
If you had done some research before posting your crap you'd know this.
Tom
Someday, I'll have a real sig.
How is my older hardware (or even pretty recent hardware on a huge ISP, with lots of SMTP activity) supposed to be able to handle this? Bah. It seems to me that adding computational difficulty is not such a great way to combat spam. Do you have any idea how effective IP blocklists and statistical filters alone are? (Or, you could combine them, as this project is doing.)
If this works as stated, then I can see issues.. For instance, large mailing lists. Would they have to be white-listed? 3000 seconds of computation is a heavy tax on a community based program like the Linux Kernel Mailing List, which averages 300 messages to my inbox a day. Also, there's the issue of viral spammers.. Those that send out viruses to do the spamming for them. If you infect enough, 8000 mails per day per computer can still be quite a bit.
Personally, my whole take on spam is that everything needs to be done on the user end. Laws have loopholes in every situation (foreign spammers being a large one), and server restrictions are either too restrictive on small servers or can be defeated with distributed computing... I say we stick with Bayesian filtering. It works _wonders_ for me, and I'd love to see more people use it.
This statement is false.
Um, maybe you don't realize what spammers have been doing lately. They use huge networks of compromised machines to spam FOR them (thank you MS and your wonderful security model). There is plenty of horsepower out there to handle any kind of HC type system. The bottom line is that spammers ALREADY have the resources to make a HC system useless.
Microsoft Research is no different from other industrial research labs: IBM, Bell Labs, etc. They hire the same kinds of people and get the same kinds of inventions out of them. One can't expect any more or less from any big company with a lot of money to spend. However, so far, MSR has not had much positive impact when it comes to driving innovation into the marketplace.
If Penny Black is all there is, it doesn't look like that's going to change. It will probably be decades before we know whether MSR will have had lasting impact. By that time, Microsoft will probably be a benign, lumbering giant, just like its monopolistic predecessors, AT&T and IBM.
If it takes a long time to send out bulk email, what about all the mailinglists people subscribe to? How would lkml or sourceforge lists continue to operate?
I am a viral sig. Please help me spread.
While this seems useful at first glance (at least open relays would stop working), how does your technique address these issues:
1. Clueless admins (of windows or *nix servers) who refuse to use SA or similar? These are the same people who leave their mail servers as open relays in the first place.
2. People who use their own SMTP server
Sure, go ahead and say that you can add reverse domain lookups. But registering a domain is quite cheap these days ($4.95 a year): point the NS to your machine, set up MX records, and you're on your way.
Your solution is useful, but not comprehensive. I doubt there is a comprehensive solution short of making the spammers incapable of accessing the internet.
--
Clueless People? Everywhere I look, I see them. And some of them, they WORK here!
US is now divided as the "Red" and "blue" states. Red States = communist countries. Coincidence? I think not
This seems to be a "let's fix this by limiting what technology can do" case.
Instead, they should focus on adding more functionality to the smtp protocol. For instance, they could add sender e-mail address verification. You can't check the actual e-mail address, but you can make a "dial-back" TCP connection to check, if the e-mail is known by the mail-server that belongs to the sender e-mail address.
Combined with law enforcement, blacklists etc., this is extremely effective.
So this would have the effect of making legitimate high-volume, high-subscribership mailing lists expensive to operate (unless subscribers configured their MTAs to accept "unstamped" messages from the list, which is annoying and error-prone -- and has an obvious "workaround" for the spammers).
<tinfoilhat mode="on">Ha! Now we see Microsoft's *real* goal... to slow Linux development by shutting down the kernel mailing list!</tinfoilhat>
Seriously, though, any attempt to make e-mail expensive hampers those who have a legitimate need to send lots of e-mail.
Plus, there are obvious workarounds that will be developed in short order. A hardware stamp-generator could probably cut the stamp generation time to practically nothing, particularly since their approach somehow depends on memory/CPU latencies rather than processing time. You might be able to make a much faster stamp generator by running it on your graphics card, and custom-built hardware could certainly do it.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Ok, I'll bite - why not just insert a "sleep (10);" line into the connection response of sendmail (or qmail, or whatever MTA you are using)? By making the sender wait 10 seconds before delivery can begin, you get the same effect as a tar-pit...
Ron Gage - Westland, MI
The programmer who works next to me used to be a construction worker. Every so often, I come up with an idea for some kind of home project, explain it to him, and he tells me a way to accomplish it that is much simpler and more reliable.
This MS solution is almost a caricature of one of my own over-done home improvement ideas. Why bother with some elaborate cryptographic system to delay inbound emails? Why not just have the receiving SMTP process call sleep(10) at the beginning of the SMTP session? You get the same desired slowdown, and all you have to change is the SMTP server software. There's no need to modify MTAs, promulgate new standards, or fit yourself more tightly into the MS monopoly noose.
Proud member of the Weirdo-American community.
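The sleep(10) idea from the two comments above can be sketched in a few lines. This is a toy, not a patch for sendmail or qmail: the function name and the greeting hostname are made up, and it handles a single connection.

```python
import socket
import threading
import time

GREETING_DELAY = 10  # seconds, the figure used in the comments above

def serve_once(listener, delay=GREETING_DELAY):
    """Accept one connection, stall, then send an SMTP-style 220 greeting."""
    conn, _addr = listener.accept()
    with conn:
        time.sleep(delay)  # the whole "tar-pit": make the sender wait
        conn.sendall(b"220 mail.example.invalid ESMTP ready\r\n")
```

One difference from the stamp scheme is worth keeping in mind: the sleeping server ties up one of its own connection slots for the full delay, and the sender's CPU does nothing in the meantime, so a sender that opens many connections in parallel pays very little.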
Something that the Redmond Empire conveniently neglects to mention is that an awful lot of the spam is due to virus-compromised systems running -- you guessed it -- Microsoft Windows! I've lost count of the number of broadband IP ranges, notably from Shaw Cable and Comcast, that I've had to dump into our domain's local 'Reject' list thanks to their endless attempts to propagate Swen, SoBig, or whatever the latest spammer-zombie trojan is.
Perhaps, if Steve 'Uncle Fester' Ballmer and his cronies had paid more attention to basic security to begin with, or had taken the trouble to actually try and educate their customers about the most basic computing security steps, there wouldn't be such a huge problem now.
This 'Penny Black' nonsense looks like nothing more than a means for them to make money off a mess that they created in the first place.
Bruce Lane, KC7GR,
Blue Feather Technologies
I actively subscribe to a lot of tech sites that have tens of thousands of subscribers. Slashdot is one of those sites. How many people have Slashdot e-mail their mail to them? How are legitimate bulk mailers (of their own content, not ads) supposed to send out newsletters, etc.? If a retail outlet with a legitimate opt-in newsletter needs to send it to 50,000 or 100,000 people, what kind of hardware upgrades are they going to be looking at? I mean, I can add them to a trusted senders list on my side, but that doesn't tell them that they no longer have to run the computations. "If I don't know you, I have to prove to you that I have spent a little bit of time in resources to send you that e-mail." How do you know whether you "know" me or not? Does the user's mail client alert the sending server that it approves of mail from that SMTP server? "Once senders have proved they have solved the required 'puzzle', they can be added to a 'safe list' of senders." Whose list? My personal list that is part of my mail client? My mail service's white list? Microsoft's special white list?
If you mod me down, I shall become less powerful than you could possibly imagine.
> No, it *is* a solution...
No, it isn't. Three years ago it might have been a solution, but right now, it's just a colossal waste of time.
The problem with this is that it operates on the assumption that spammers work within the same boundaries as everyone else. Anyone who has spent even a tiny fraction of their time fighting spam knows this is simply not true.
The days of spammers sending spam from a single server are long gone - nowadays, they use thousands of trojaned machines to do their work. How many machines do spammers control? Enough to launch effective DDoS'es on some of the largest pipes out there.
The effectiveness of this 'solution' would be marginal at best.
Now compare the effect it would have on legitimate users - an individual sending mail wouldn't notice 10 seconds.. but email is not only used by individuals.
Something to keep in mind when assessing any anti-spam 'solution' such as this is the following:
From a receiver's standpoint, the only difference between a legitimate mailing list and a spammer is that the user asked to be part of a mailing list.
Now think about how this would affect legitimate mailing lists: How many mail servers do most mailing lists have? One? Two? Six? Some large mailing lists might have a dozen.
So how does this affect those mailing lists?
It would shut them down, is how. They would cease to be useful, as it would take days for their mails to get through.
So the 'obvious' solution to this problem would be to whitelist legitimate mailing lists, right? Wrong. That's not a solution either (and we'll ignore the point that any 'solution' that requires exceptions is probably not very well thought out.)
I maintain a mail server for a few thousand people. I have no idea which mailing lists they would subscribe to. It would probably become a full-time job to keep such a whitelist up to date. (And most users wouldn't have any idea to notify me in the first place - so the end effect is that they would subscribe, and then bitch about how they're not getting the stuff they signed up for.)
This 'solution' does not solve anything, and will create more and worse problems than it attempts to solve.
The idea is not to save you fifty seconds of time deleting your spam. That's a fringe benefit. The idea is to stop spam by making it harder and more expensive to send. If we can raise the price and difficulty to a certain point, spam will no longer be a viable marketing technique.
You're missing no voodoo magic whatsoever, I think you've simply failed to think this through in its entirety. You claim you're sending 50 emails a day. In all likelihood most of these emails are not first-contact emails which would require a crypto challenge, but are in fact addressed to an established-contact which doesn't challenge you.
But for the sake of argument let's say all 50 of these emails are first contact. Dandy. Let's look at how this goes. You write the first letter, and proofread it, and click send. Your system does not immediately lock for ten seconds. Instead your message goes into your outgoing message queue. While you are writing and proofreading your next message the system is busily computing the hash for the previous message.
Let's suppose even further that you type uncommonly fast, require no proofreading, and get all 50 of the messages into your outbox. You take a deep breath, run to the bathroom or for a refill on your coffee, or whatever -- guess what's happening while you're afk?
I want a new world. I think this one is broken.
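The outgoing-queue argument above can be sketched with a worker thread. This is a rough illustration only: `compute_stamp` is a hypothetical stand-in that just sleeps instead of doing real proof-of-work, and all the names are made up.

```python
import queue
import threading
import time

def compute_stamp(message, cost_seconds):
    """Hypothetical stand-in for the proof-of-work; it only sleeps."""
    time.sleep(cost_seconds)
    return f"stamp-for-{len(message)}-bytes"

def outbox_worker(outbox, sent, cost_seconds):
    """Background thread: stamp queued messages one at a time, in order."""
    while True:
        message = outbox.get()
        if message is None:  # sentinel: nothing more to send
            break
        sent.append((message, compute_stamp(message, cost_seconds)))

def send_all(messages, cost_seconds):
    """Queue every message at once, then wait for the worker to finish."""
    outbox = queue.Queue()
    sent = []
    worker = threading.Thread(target=outbox_worker, args=(outbox, sent, cost_seconds))
    worker.start()
    for message in messages:
        outbox.put(message)  # returns immediately, so "Send" never blocks the user
    outbox.put(None)
    worker.join()
    return sent
```

The person at the keyboard only ever waits on `outbox.put`, which is instant; the stamping cost is paid in the background while they write the next message or go for coffee.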
> The email is sent and the server runs it through
> the scoring process. If the message scores more
> than 6/10 the server sends the sender an
> authentication message, asking to validate the
> email.
So you are one of those responsible for bombarding me with those damn things.
> This would require spammers to manually
> intervene and waste tons of their time. if they
> forged the sender email...
They always do. My domain is a favorite.
> ...their email would go to someone else's
> email...
Yes. Mine.
> ...and they would just trash it...
Isn't that what the spammers say? "If you don't want it, just delete it. What's the big deal?"
The big deal is that about a quarter of my email is bogus bounces and useless "confirmation" messages from systems such as yours.
_NEVER_ _REPLY_ _TO_ _SPAM_
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
You mean this paper? In that case, the Pentium IV 3066 (533 MHz DDR) was 2.66 times faster than the Pentium II 266 (PC66), and just as fast as a 1.2 GHz Pentium III (PC133).
I'd love to see the Itanium 2 results. The entire program could fit in cache... Yes, the array size could be increased, but that would further penalize users of PDAs, which already suffer quite a bit.
The real question is whether this program is a sufficiently unique case that further advances in memory technology (short of the Itanium's rather expensive brute force solution) will not make it obsolete.
You are missing the point. Nobody is saying that this is going to be required for all machines. Essentially it is an extra header attached to emails so email recipients can filter messages that don't have this tag. As I see it this is how it would work for most end users.
First set up a whitelist and make this your first spam check. On the whitelist? Email goes through without being checked against any other spam criteria. (Mailing lists should be accepted here.)
For mail that doesn't pass the white list check we can check for the header created by the MS program. We verify that the computationally intense header is correct and maybe we can let that through if we want, maybe I let emails with this tag pass through my spam checker with a higher spam score.
If we decided to accept mails with the header, we now check the remaining email with a very thorough spam checker and use a very low score.
No matter how many computers they have, it will lower the number of emails that are able to be sent, if people filter on this criterion.
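The filtering order described in this comment (whitelist first, then the stamp header, then a content check) could be sketched like this. It is one possible reading of the comment, not how any real mail filter works: `stamp_is_valid` and `spam_score` are hypothetical callables supplied by the caller, and both thresholds are invented numbers.

```python
def classify(message, whitelist, stamp_is_valid, spam_score,
             stamped_threshold=8.0, unstamped_threshold=3.0):
    """Return "accept" or "reject" for a message given as a dict."""
    # 1. Whitelisted senders (mailing lists included) skip every other check.
    if message["sender"] in whitelist:
        return "accept"
    # 2. A valid stamp earns a more lenient content threshold;
    #    unstamped mail faces the strict one.
    if stamp_is_valid(message):
        threshold = stamped_threshold
    else:
        threshold = unstamped_threshold
    # 3. The ordinary spam checker still has the last word.
    return "accept" if spam_score(message) < threshold else "reject"
```

Under this reading the stamp is never mandatory: it is just one more signal that shifts how suspicious the rest of the filter is.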
I don't think this is a good idea.
First, it would kill legitimate mailing lists. Imagine what the perl5-porters list or the Linux kernel list or any of the other high traffic mailing lists would have to do to keep operational. Large mailing lists already have problems with lag. This would just add to that.
Also, there does not seem to be anything that would stop them from doing these operations in background and just contact multiple sites while working on the problem. They would just multi-thread the mail spammer or just hijack more machines to use as their slaves.
This technique requires replacing every mail program out there to support the protocol. Of course, they will just make it a condition to connect to exchange. Might be a way of getting people away from having to talk to compromised Windows mail servers.
This is a bad solution for a big problem.
"Something must be done! This is something, therefore we must do it!"
"Trademarks are the heraldry of the new feudalism."
M$ should consider out-sourcing it since, well... my hotmail account still gets spam even though I set it to exclusive (meaning only email from ppl in your address book will get through); spam with obvious fake addresses. And the spam that goes through this "exclusive filter" also seems to fly past my custom filters that have the words that the spam has ("financial", "viagra", "herbal", etc.).
Yahoo works better with regards to spam though I wish it would empty the bulk mail folder more often.
And my pop3 acct has something called greylisting and that alone cuts 95% of spam. Plus black and white listing IPs and domains helps too (for instance, only allowing email from hotmail.com if it originates from one of hotmail's servers, etc.) and blocking known spam-haven Class C ranges (eg x.x.x.*).
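For anyone who hasn't met greylisting: the usual idea is to temporarily refuse the first delivery attempt from an unfamiliar (client IP, sender, recipient) triplet and accept it once the sending server retries after a delay, which real mail servers do and much spamware doesn't. Here is a minimal sketch, assuming an in-memory table and a made-up 300-second delay; the class and its names are illustrative, not any particular server's implementation.

```python
import time

class Greylist:
    """Remember when each (ip, sender, recipient) triplet was first seen."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # seconds a new triplet must wait (assumed value)
        self.clock = clock          # injectable so the logic can be tested
        self.first_seen = {}

    def check(self, client_ip, sender, recipient):
        """Return "accept", or "tempfail" (a server would answer with a 4xx code)."""
        key = (client_ip, sender, recipient)
        now = self.clock()
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"
        return "tempfail"
```

A real deployment would also expire old entries and persist the table; both are left out here to keep the sketch short.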
My question is.. what happens with mailing lists that have subscribers in the middle 6 figures? I'm on a couple that have over 200,000 subs. Exactly how stale would they be by the time they all got sent, under any sort of delay-per-post tactic?
~REZ~ #43301. Who'd fake being me anyway?
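The staleness question above has a quick worst-case answer. This sketch assumes every subscriber needs a separate 10-second stamp for each post and that no whitelisting applies, which is the pessimistic reading; the function name is made up.

```python
def days_to_stamp(subscribers, seconds_per_stamp=10, machines=1):
    """Days of computation to stamp one post for every subscriber."""
    return subscribers * seconds_per_stamp / machines / 86_400

print(round(days_to_stamp(200_000), 1))               # 23.1
print(round(days_to_stamp(200_000, machines=12), 1))  # 1.9
```

So a single list server would need over three weeks per post, and even a dozen machines working in parallel would take about two days.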