Microsoft Researchers on Stopping Spam

Posted by timothy on Monday April 11, 2005 @01:49PM from the slow-treacle dept.

TheBackBencher writes "Scientific American today has a very interesting article, "Stopping Spam," by Joshua Goodman, David Heckerman and Robert Rounthwaite of Microsoft Research. They discuss different types of spam -- email spam, IM spam, spam links on web pages and image-based spam. In a very lucid style, they cover different spam-filtering techniques, mainly fingerprint matching, n-gram models, the naive Bayesian approach, optical character recognition, challenge/response systems and Human Interactive Proofs (HIPs). They do not, however, mention the fingerprinting approach of using the Nilsimsa hash to counter spammers' addition of random words to emails, or the "hypertextus interruptus" technique, in which spammers split words using HTML comments, pairs of zero-width tags, or bogus tags. Also, Spam-Research is reporting on the SplitFit technique that spammers are using to fool Yahoo! Mail SpamGuard."
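
The word-splitting trick mentioned in the summary works because a filter that looks at the raw HTML never sees the whole word, while the reader does. A minimal sketch of one possible countermeasure, assuming the filter simply drops comments and then tags before matching words (the function name `visible_text` is made up for illustration, and a real filter would need to treat block-level tags as word breaks):

```python
import re

# Comments may span lines, so let "." match newlines too.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# Very rough tag matcher; enough to show the idea, not a real HTML parser.
TAG = re.compile(r"<[^>]*>")


def visible_text(html):
    """Approximate what the reader sees by dropping comments, then tags."""
    without_comments = COMMENT.sub("", html)
    return TAG.sub("", without_comments)


# A word split by an HTML comment, an empty tag pair, and a bogus tag.
sample = "V<!-- x -->ia<b></b>gr<zz>a"
cleaned = visible_text(sample)
```

Here `cleaned` is the rejoined word, which a keyword or Bayesian filter can then score normally.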

I don't know... by Anonymous Coward · 2005-04-11 13:55 · Score: 4, Insightful

If it was developed, it can be reverse engineered. Sorry to say, but spam is here to stay, unless of course the internet someday becomes regulated somehow.
Re:The way to stop spam... by Sonar · 2005-04-11 14:01 · Score: 5, Insightful

Of course, one 200MB update from Microsoft would kill this idea. Or how about a 500MB game demo download? That's legitimately free. Or better yet, what if I need to download a Linux distro or a television episode?

I would hate to have to explain all my actions to my ISP, especially with the way media is driving the internet nowadays. 200MB is way too small a limit.

Now, you could monitor how many emails are sent by a host. That would be a better way. At least there could be a filter on the "To:" line. If that list consistently includes over, say, 1000 users, then at least some flags could be raised.
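
The recipient-count check suggested above could look roughly like the sketch below. It assumes the mail server can hand over (host, recipient count) pairs, and it reads "consistently" as "at least three offending messages"; both the function name and that cutoff are illustrative guesses, not anything from the thread.

```python
from collections import defaultdict

RECIPIENT_THRESHOLD = 1000   # the "1000+" figure from the comment above
MIN_OFFENDING_MESSAGES = 3   # assumed reading of "consistently"


def flag_bulk_senders(messages, threshold=RECIPIENT_THRESHOLD,
                      min_offences=MIN_OFFENDING_MESSAGES):
    """Return the hosts that repeatedly exceed the recipient threshold.

    `messages` is an iterable of (host, recipient_count) pairs.
    """
    offences = defaultdict(int)
    for host, recipient_count in messages:
        if recipient_count > threshold:
            offences[host] += 1
    return {host for host, count in offences.items() if count >= min_offences}


log = [
    ("10.0.0.5", 1500), ("10.0.0.5", 2000), ("10.0.0.5", 1200),
    ("10.0.0.9", 3), ("10.0.0.9", 5000),  # one large mailing is not "consistent"
]
flagged = flag_bulk_senders(log)
```

Only the host that crosses the threshold three times ends up in `flagged`; a single large mailing from the other host raises no flag.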
to stop spam, by havaloc · 2005-04-11 14:02 · Score: 5, Insightful

give spammers a 9 year prison sentence.
1. Re:to stop spam, by Anonymous Coward · 2005-04-11 16:21 · Score: 1, Insightful
  
  This is you missing the point.
  
  He didn't say that drug dealers should be put back on the streets. He pointed out the fact that prison sentences have not stopped the crime in question. For that matter, they haven't stopped any type of crime. Therefore, there's no reason to believe they would stop spam as the original poster suggested. Reduce it, maybe. There's a big difference between reduced, and reduced to zero.
How about a secure OS to get rid of zombies? by Anonymous Coward · 2005-04-11 14:03 · Score: 3, Insightful

That'd probably be the best thing M$ could do to help reduce spam.
Re:Take a lesson by AndroidCat · 2005-04-11 14:05 · Score: 4, Insightful

I shudder to think what you mean by a "bounce" feature. Most likely sending a "bounce" reply to the forged sender address? That's part of the problem, not the solution.

--
One line blog. I hear that they're called Twitters now.
Re:I have an idea by Aeiri · 2005-04-11 14:07 · Score: 3, Insightful

Interesting idea; however, invalid-address responses are sent within 5 minutes of the original mail. If the response is sent over a day after the original mail is sent, the spammer could just discard it.

If we respond instantly to all email with invalid-address messages, the trick will be overused and spammers will ignore ALL of them. This is much like antibiotics: we use them too much, the bacteria adapt, and the antibiotics become obsolete.

So far, Bayesian filtering and/or whitelisting are the only good systems for blocking spam.
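
For readers who have not seen one, a naive Bayesian filter of the kind mentioned above can be sketched in a few lines. This is a toy version under stated assumptions: whitespace tokenization, add-one smoothing, and a four-message training set invented for illustration. Real filters use far more data and better tokenizers.

```python
import math
from collections import Counter


def train(messages):
    """Count word occurrences per label. `messages` is a list of (text, is_spam)."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(text.lower().split())
        totals[is_spam] += 1
    return counts, totals


def spam_log_odds(text, counts, totals):
    """Log-odds that `text` is spam; a positive value means 'more likely spam'."""
    vocab = set(counts[True]) | set(counts[False])
    spam_words = sum(counts[True].values())
    ham_words = sum(counts[False].values())
    score = math.log(totals[True] / totals[False])  # prior odds
    for word in text.lower().split():
        # Add-one smoothing so an unseen word does not zero out a class.
        p_spam = (counts[True][word] + 1) / (spam_words + len(vocab))
        p_ham = (counts[False][word] + 1) / (ham_words + len(vocab))
        score += math.log(p_spam / p_ham)
    return score


training = [
    ("cheap pills buy now", True),
    ("buy cheap watches now", True),
    ("meeting notes attached", False),
    ("lunch meeting tomorrow", False),
]
counts, totals = train(training)
spammy = spam_log_odds("buy cheap pills", counts, totals)
hammy = spam_log_odds("meeting tomorrow", counts, totals)
```

With this training set `spammy` comes out positive and `hammy` negative. The random-word padding discussed elsewhere in the thread is an attempt to drag that score back toward zero.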
Re:Spam is easy to define. by ch-chuck · 2005-04-11 14:12 · Score: 5, Insightful

1) It is a form of communication

all email is communication

2) The communication is unwanted

"wanted" is a subjective property of the recipient - the computer has no programmable decision procedure for wantedness.

3) The source of the communication is hidden

There may someday be some system of authenticating sender ID, and it will be about as easy as getting people to use public-key encryption.

4) In receiving the communication, you use your bandwidth or incur a cost

again a property of all email.

--
try { do() || do_not(); } catch (JediException err) { yoda(err); }
validating email addresses for more spam by John+Seminal · 2005-04-11 14:13 · Score: 3, Insightful

Also, if you reply, the spammer will know your address is active and send more crap.
I don't understand this. On one hand, you have the police saying they can't track spammers. Spammers use drones, they remain hidden, they hide their tracks. On the other hand, if you unsubscribe, they know your email is a real one, and you get more spam. That tells me whoever runs the unsubscribe service is in cahoots with the spammer and is just as guilty. They have to know where to send their lists? Just track them as part of the war on spam.

--
Rosco: "If brains were gunpowder, Enos couldn't blow his nose."
Re:I have an idea by xQx · 2005-04-11 14:15 · Score: 5, Insightful

Here's a more interesting idea...

Authenticate SMTP with public key signing. -- Then use a trust network to only accept email from trusted companies.

Why it won't work:
It involves effort and cost.

Baah, the internet should be unregulated. If they can get rid of SPAM, then what's to stop them getting rid of porn, anti-government information, etc.? There's a road we all want to go down.

Don't buy it and Get over it(tm).
The easiest way to stop spam. by Anonymous Coward · 2005-04-11 14:17 · Score: 0, Insightful

ISPs just have to restrict the number of emails that each account can send in a day. After all, does John or Jane Doe need to send thousands of emails a day?

Robert
1. Re:The easiest way to stop spam. by FooAtWFU · 2005-04-11 15:09 · Score: 1, Insightful
  
  People don't send spam from their ISP's account. They send it straight through their computer. Now, you could put outbound filtering on port 25 and require everyone to send mail through the ISP's servers (with authenticated SMTP of some sort), though some legitimate traffic will be suppressed if that happens...
  That'll trim spamming more than any 'message count'.
  
  --
  The World Wide Web is dying. Soon, we shall have only the Internet.
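
The per-account daily cap proposed at the top of this thread would only bite once mail is forced through the ISP's own authenticated servers, as the reply points out. Assuming that is in place, the cap itself is little more than a counter keyed by account and day. The class name and the limit below are hypothetical; the thread does not name a number.

```python
from collections import defaultdict

DAILY_LIMIT = 200  # assumed cap, purely illustrative


class DailyQuota:
    """Track messages sent per (account, day) and refuse sends past the cap."""

    def __init__(self, limit=DAILY_LIMIT):
        self.limit = limit
        self.sent = defaultdict(int)

    def allow(self, account, day):
        """Return True and record the send if `account` is under quota for `day`."""
        key = (account, day)
        if self.sent[key] >= self.limit:
            return False
        self.sent[key] += 1
        return True


quota = DailyQuota(limit=3)
results = [quota.allow("jane@example.net", "2005-04-11") for _ in range(5)]
next_day = quota.allow("jane@example.net", "2005-04-12")
```

The first three sends are accepted, the next two are refused, and the count starts over on the following day.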
Economic problem--NOT technical by shanen · 2005-04-11 14:31 · Score: 3, Insightful

SMTP is working exactly as designed--but the design is broken. You can't fix a fundamentally economic problem with any number of technical tools. It's like adding more epicycles to the earth-centered "perfect spheres" models of the universe.
The article barely mentions economics, and only in terms of the real costs of email--which only shows how much room there is for a real economic model with real business, real email, and *NO* spam.
I really wish one of the major email players would offer an option for prepaid email. That would be an absolutely spam-proof system. It doesn't matter if the postage is two cents, the spammers can't afford it. Two cents against 50,000,000 spams turns out to be *REAL* money. Any email via that address would be at least some kind of real thing.

--
Freedom = (Meaningful - Coerced) Choice != (Speech | Beer^2), and sad sock puppets' bad mods avail them naught.
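
The postage arithmetic in the comment above is easy to check. The sketch below uses the two figures from the comment (two cents, 50,000,000 messages); the revenue-per-sale number is a made-up assumption, included only to show what response rate a spammer would need to break even.

```python
POSTAGE_CENTS = 2        # per-message postage from the comment above
MESSAGES = 50_000_000    # size of the spam run from the comment above

# Work in integer cents to avoid floating-point rounding.
total_cents = POSTAGE_CENTS * MESSAGES
total_dollars = total_cents // 100


def break_even_response_rate(postage_cents, revenue_per_sale_cents):
    """Fraction of recipients who must buy for a mailing to cover its postage."""
    return postage_cents / revenue_per_sale_cents


# Hypothetical: $20 of revenue per sale (this figure is not from the thread).
rate = break_even_response_rate(POSTAGE_CENTS, 2000)
```

Under these numbers the run costs one million dollars in postage, and one recipient in a thousand would have to buy just to cover it.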
Ugh by Mr.+Underbridge · 2005-04-11 14:46 · Score: 3, Insightful

Legislate against spam. As long as spam is legal, or the penalties against it are too low, or it is too easy to do, people will continue to try and make a quick buck.
First, I guess you didn't see the guy in VA who just got something like 9 years in jail.
That said, spam doesn't obey jurisdictional boundaries. Any single country can only solve a small part of the problem, and any spam incident often involves more than 3 jurisdictions that may be in separate countries (sender, spambot, recipient, etc). That's a logistical nightmare that isn't soluble outside of a dream world.
Also, force all ISPs to monitor how much bandwidth a source has. If you get too much usage per day, say 200 megabytes or more, then that person has to explain why they need that much bandwidth. If someone gets the RIAA on board, with their lobbyists, that should pass very quickly.
That's fantastic. Trade a bad problem for one that's much worse. Get the RIAA to legitimize their practices under the guise of stopping spam? Let's not.
Also, force all email to have some element which identifies the source. Not just a header that can be forged, but something that can't be hacked.
Now, by "force": what do you do if they don't? Enforcement issues again here.
Ultimately, legislative solutions for spam DO NOT and CAN NOT work for much but a small part of the problem. It's satisfying when some moron is clumsy enough to get caught (as with the guy in VA), but mostly these days the spammers aren't that stupid. Technological solutions work far better.
1. Re:Ugh by Mr.+Underbridge · 2005-04-11 15:11 · Score: 2, Insightful
  
  No, all routers in the USA can be forced to reject all email, unless it comes on a specific port, with specific identifiers. For example, maybe have an ISP program that you must install on your machine that identifies your email. If the hash made by that program is wrong, they drop the email. Like what Microsoft does when you try to install software: you have to validate that you own the software and that it is running on one machine.
  None of those "let's redefine the SMTP standard" crackpot schemes are going to work. Remember, any solution has to be implementable for a billion internet users, and none of those hash schemes are. When the cost of implementation is higher than the cost of spam, the solution becomes a problem.
Solution or complication? by Gary+Destruction · 2005-04-11 14:52 · Score: 3, Insightful

Is there really such a thing as a solution to spam? For every new technique that is developed, the spammers will find a way to circumvent it. Spam is a multi-million dollar business. I'd go so far as to say that it's a science. At least, the spammers seem to have it down to a science.

Trying to find a solution to spam is an idea in the eyes of experts and analysts. But to spammers, it's a road block that they must work around to stay in business.
Spamming techniques will no doubt end up as signatures in spam filters, not unlike the signatures used by IDS and virus scanners. The experts don't seem to understand that if there's a will, there's a way. And the spam will just keep coming in another form or by some other technique. All that can be done is to keep up with changing techniques and patterns and treat spam for what it truly is -- an attack vector.
Re:The way to stop spam... by globalar · 2005-04-11 14:59 · Score: 2, Insightful

Legislation ultimately runs into international borders and places where U.S. law cannot go. It can help, but honestly I am not sure how to craft a good law that will keep up with the pace of technology. Also, a law does not guarantee effective enforcement.

A better strategy, IMO, is to work on the commercial level. It has been said here on /. many times that if there were no money for spammers, there would be no spam. When spam becomes an issue which decides where money goes (who wins and who loses), the economics will take over. We need to convince people and businesses of simple ways to stop spam.

Forcing monitoring is counter-productive. ISPs need to voluntarily enact monitoring schemes for their own benefit and that of other parties. When an ISP is convinced that it can contribute to stopping spam and that this is in its best interests, its efforts are more likely to be aimed at succeeding, not simply complying.

Also, consumers need to get involved, but not by lobbying Congress (on this particular matter). ISPs and webhosts need to believe that consumers will factor spam tolerance into their decision-making. Consumers (and other buyers) need to follow up and practice this a little - at least a vocal plurality.

On the community side, blacklists need to be scrapped in favor of informative lists of known, proven spam havens and spammers. Which hosts are the real problem? That is what buyers need to know. Block them if you want, but that is counter to how the Internet works and will not ultimately succeed. Instead, inform buyers who is responsible for letting spam through. Who should you not do business with? Do not be condescending or militant - be simple and clear. "So-and-so sends spam to your inbox."

I agree, technical work needs to be done. But beyond protocols, formats, and other standards, this is a problem which can be solved through small changes in behavior across many groups. It cannot be centralized and squashed.
Re:Hidden Markov Model by mcc · 2005-04-11 15:12 · Score: 3, Insightful

Right now spam is usually filtered using a Bayesian model. As a result, spammers have begun structuring their emails so as to target Bayesian models. How many spams have you gotten lately with the subject line ending in confiscate ok wallop yls oblivion?

If we move to filtering spam using Markov models, spammers will begin structuring their emails so as to target Markov models. Look forward to all your spams ending in 500-word blocks of text from a copy of MegaHAL trained on old grandmothers' email boxes.

--
Irritable, left-wing and possibly humorous bumper stickers and t-shirts
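
To make the arms race above concrete: a first-order Markov (bigram) model scores how plausible each word is given the one before it, so a random word salad scores badly while chained, generated text would not. The sketch below is a toy under stated assumptions (three invented training sentences, add-one smoothing, made-up function names); it is not how any particular filter works.

```python
import math
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count word-to-next-word transitions in a list of sentences."""
    transitions = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            transitions[current][following] += 1
    return transitions


def avg_log_prob(text, transitions, vocab_size):
    """Average add-one-smoothed log-probability per word transition in `text`."""
    words = text.lower().split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    total = 0.0
    for current, following in pairs:
        # Look up without inserting new keys into the defaultdict.
        seen = transitions[current] if current in transitions else Counter()
        total += math.log((seen[following] + 1) / (sum(seen.values()) + vocab_size))
    return total / len(pairs)


corpus = [
    "please send me the report",
    "send me the notes please",
    "the report is attached",
]
vocab = {word for sentence in corpus for word in sentence.lower().split()}
model = train_bigrams(corpus)
natural = avg_log_prob("please send me the report", model, len(vocab))
salad = avg_log_prob("confiscate ok wallop yls oblivion", model, len(vocab))
```

The natural sentence scores higher than the word salad, which is exactly why a spammer facing such a filter would switch to generated text that follows plausible word transitions.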
Re:To stop spam, stop the money laundering by killjoe · 2005-04-11 15:22 · Score: 3, Insightful

In the past the FBI has already caught people by simply buying what they were selling and then finding the person who cashed the check.

Of course, the FBI can't arrest people in the lawless places of the world like Croatia and Hungary, so those governments will need to shed their corruption.

In other words I don't think your scheme will work because so much of the world is out of the reach of law enforcement.

--
evil is as evil does
Microsoft already has a "spamming division" by Anonymous Coward · 2005-04-11 16:13 · Score: 1, Insightful

... it's called "HotMail"

How else do you explain getting 50 pieces of junk mail a day even when you never use your account?
Re:I have an idea by sfe_software · 2005-04-11 16:22 · Score: 5, Insightful

Interesting idea; however, invalid-address responses are sent within 5 minutes of the original mail. If the response is sent over a day after the original mail is sent, the spammer could just discard it.

The thing is, I don't believe spammers ever remove an address due to an error. I had a domain that received a ton of spam, and that domain expired. Two years later (after fighting with Network Solutions) I got the domain back, and immediately started receiving a ton of spam. Two years of spammers sending spam to invalid addresses (no DNS on the domain) and they still continued.

Why?

Simple: the spammers don't receive bounce messages, and the spam-servers (which could be static servers, or compromised zombie machines) don't provide accurate return information. Much like how telemarketers often show invalid or "Unknown" caller-ID info. It costs nearly nothing to send a spam message to an address, whether that address is valid or not. It costs much more to weed out invalid or unreachable addresses from your list by intercepting bounce messages etc.

And spammers don't give a shit. Most of the time, they are using someone else's machine (a zombie'd Windows box, or an open relay) so they don't need to care. So this trick simply doesn't work. It's cheaper to just continue sending to invalid addresses. Not to mention, many newbie spammers get their lists from less-than-legit sources who are selling large lists; they don't care (and are usually fully aware) that many of the addresses they are selling are bogus or no longer valid...

In short, simple tricks like this don't work, when dealing with an "industry" that doesn't give a shit...

--
NGWave - Fast Sound Editor for Windows
Re:"SplitFit" by Spy+der+Mann · 2005-04-11 16:48 · Score: 3, Insightful

You haven't RTFA it seems.

That garbled text is ungarbled by certain software (e.g. Outlook). That's because there are invisible chars in there that activate the "right to left" mode.

Example:
De*ra* B*lcra*ays M*bme*er
translates to:
Dear Barclays Member

(I tried to copy the text I got in Yahoo, and paste it in MSN messenger. Amazingly, the text was "ungarbled". That's when I realized how tricky spammers were)

Anti-spam software could simply detect right-to-left control characters in such text, and ipso facto label it as spam. Unless, of course, you're reading Hebrew. Which is obviously NOT the case.
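
The check proposed above can be sketched as follows, taking the commenter's description of the trick at face value. The code looks for Unicode right-to-left control characters and only flags the message when the letters are otherwise almost all basic Latin, so genuine Hebrew or Arabic mail is left alone. The 90% cutoff and the function names are assumptions made for illustration.

```python
# Unicode bidirectional controls that switch text to right-to-left display.
RTL_CONTROLS = {
    "\u200f",  # RIGHT-TO-LEFT MARK
    "\u202b",  # RIGHT-TO-LEFT EMBEDDING
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE
    "\u2067",  # RIGHT-TO-LEFT ISOLATE
}


def has_rtl_controls(text):
    """True if the text contains any right-to-left control character."""
    return any(ch in RTL_CONTROLS for ch in text)


def is_mostly_latin(text, cutoff=0.9):
    """True if at least `cutoff` of the letters are ASCII letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    ascii_letters = sum(1 for ch in letters if ch.isascii())
    return ascii_letters / len(letters) >= cutoff


def looks_like_rtl_trick(text):
    """Flag Latin-script text that nevertheless uses right-to-left controls."""
    return has_rtl_controls(text) and is_mostly_latin(text)


# U+202E starts an override; U+202C (POP DIRECTIONAL FORMATTING) ends it.
garbled = "De\u202era\u202c Barclays Member"
hebrew = "\u05e9\u05dc\u05d5\u05dd\u200f"
```

`looks_like_rtl_trick(garbled)` is true, while plain English and the Hebrew sample are both left unflagged.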
Why Do People Use HTML for Email ??? by mamladm · 2005-04-11 19:59 · Score: 2, Insightful

The overwhelming majority of spam-filter-deceiving techniques rely on HTML. If you block messages containing HTML on the mail server, the spam that gets through is near 100% identifiable as spam using Bayesian filters.

So why on earth do people still use HTML in their email? Email should be plain text only anyway.

--
the macintosh asterisk mailing list http://www.astm
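
Blocking HTML mail at the server, as suggested above, comes down to checking the MIME type of every part of a message. A minimal sketch using Python's standard `email` package; the function name `contains_html` and the sample messages are invented for illustration, and a real server would also have to decide what to do with multipart messages that carry both plain-text and HTML versions.

```python
from email import message_from_string


def contains_html(raw_message):
    """Return True if any MIME part of the message is text/html."""
    message = message_from_string(raw_message)
    # walk() yields the message itself and then every nested part.
    return any(part.get_content_type() == "text/html" for part in message.walk())


plain = "From: a@example.com\nSubject: hi\n\nJust text.\n"
html = "From: a@example.com\nSubject: hi\nContent-Type: text/html\n\n<p>Hi</p>\n"
multipart = (
    "From: a@example.com\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XX"\n'
    "\n"
    "--XX\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hi\n"
    "--XX\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>Hi</p>\n"
    "--XX--\n"
)
```

The plain message passes, while both the HTML-only message and the multipart message with an HTML alternative would be rejected under this policy.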
Actually, it's not that complicated by Moraelin · 2005-04-11 21:36 · Score: 2, Insightful

1. Most civilized countries are sick and tired of SPAM too. E.g., most European countries. So there is enough scope for a spam free zone, if the USA does want to get its act together and cooperate. It's not like you're alone against the world on the SPAM issue, except for the fact that:

2. It's mostly your spam that's dumped upon the rest of the world. USA is currently _the_ biggest source of spam, followed by... offshored operations paid for by someone from the USA.

So on one hand, the USA could halve the SPAM traffic on its own, without even needing much international cooperation, if it actually got its act together. And on the other hand, hey, there's a lot of incentive for a lot of other countries to cooperate. Just show us where to sign, if it means we'll stop getting your crap in our inboxes.

3. Once you have secured an EU+North America treaty on that issue, the rest of the world should IMHO be actually pretty easy.

We're talking some major combined economic power there. Any country who doesn't want to play ball with that kind of a behemoth can be whacked into submission in a variety of ways, ranging from economic sanctions to just disconnecting them from the Internet.

Makes that country unattractive to spammers too in the process. See, I don't think spammers want to target the local citizens of Elbonia with their operations. You disconnect them from the rich targets, you've killed that operation. So any country which thought it'll get rich by sheltering spammers, will quickly lose that investment too and be left with just the other disadvantages.

So I think they'll play ball.

4. But I don't think the spammers want to move to Elbonia or East Bumfuckistan and run their operation from there anyway. They might pay some local 5 bucks to run a server for them there, but they don't want to go live in a third world country. Those countries aren't that much fun.

You may see that even for legitimate operations, IBM might offshore their tech support to India or China, but you won't see the CEO of IBM moving there. (And those are already developing countries, not third world ones.)

--
A polar bear is a cartesian bear after a coordinate transform.