Microsoft Researchers on Stopping Spam


Posted by timothy on Monday April 11, 2005 @01:49PM from the slow-treacle dept.

TheBackBencher writes "Scientific American today has a very interesting article, "Stopping Spam," by Joshua Goodman, David Heckerman and Robert Rounthwaite of Microsoft Research. They discuss different types of spam: email spam, IM spam, spam links on web pages and image-based spam. In a very lucid style, they cover several filtering techniques, mainly fingerprint matching, n-gram models, the naive Bayesian approach, optical character recognition, challenge/response systems and Human Interactive Proofs (HIPs). However, they do not mention the fingerprinting approach of using the Nilsimsa hash to counter spammers who add random words to emails, or the "hypertextus interruptus" technique, in which spammers split words using HTML comments, pairs of zero-width tags, or bogus tags. Also, Spam-Research is reporting on the SplitFit technique that spammers are using to fool Yahoo! Mail SpamGuard."
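
For readers unfamiliar with the "hypertextus interruptus" trick mentioned in the summary, here is a minimal sketch of how a filter might undo it before looking at the words. The helper name `visible_text` and the two regular expressions are assumptions made for illustration; they are not taken from the article, and a real filter would use a proper HTML parser.

```python
import re

# Hypothetical patterns: HTML comments, then any remaining tag.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
TAG = re.compile(r"<[^>]*>")

def visible_text(html: str) -> str:
    """Approximate what the reader sees, rejoining words split by markup."""
    without_comments = COMMENT.sub("", html)
    # Deleting tags outright rejoins "ga<b></b>ge" into "gage".
    return TAG.sub("", without_comments)

print(visible_text("mort<!-- zq -->ga<b></b>ge rates"))
```

Because every tag is deleted rather than replaced by a space, text separated only by block-level tags would run together; that trade-off is deliberate in this sketch, since the goal is to rejoin words the spammer split.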

I have an idea by Sonar · 2005-04-11 13:54 · Score: 1, Interesting

I have an idea for stopping spam. How about sent mail is not flagged as received until it is viewed by the user? Once it is viewed, it can be flagged as good or flagged as spam. If the mail is flagged as spam, the mail could be sent back to the original host as unreturned mail.

So every mail that is sent out, once flagged as spam, could be sent back. Of course, normal spam filters would also flag the data as spam. If the e-mail is returned to the host, maybe they will remove the e-mail from their list. Otherwise they will just receive a bunch of spam as well as sending it. It could clog up bandwidth pretty good.

However, this probably wouldn't work if they somehow spoofed the sent address.
1. Re:I have an idea by ConceptJunkie · 2005-04-11 14:30 · Score: 2, Interesting
  
  I assume you're being facetious. Why does stopping someone from sending you garbage equate to not being able to find something? That argument sounds like those nitwit "censorship" whiners who think Freedom of Speech means Freedom to be Heard and that all content must be available at all times in all places.
  
  Anarchy never worked in the real world, how could it work in the electronic world?
  
  --
  You are in a maze of twisty little passages, all alike.
Take a lesson by TrIp0d · 2005-04-11 14:00 · Score: 1, Interesting

If Microsoft email clients had a "bounce" feature spam mail wouldn't be such a problem. Microsoft should take a lesson from KMail. Ha!
Spam is easy to define. by John+Seminal · 2005-04-11 14:02 · Score: 1, Interesting

Spam is like porn: hard to define but you know it when you see it. That can be hard to program I would think. But, who knows.
No, spam is easy to define: it is any unwanted email. The elements that make something spam:
1) It is a form of communication
2) The communication is unwanted
3) The source of the communication is hidden
4) In receiving the communication, you use your bandwidth or incur a cost

--
Rosco: "If brains were gunpowder, Enos couldn't blow his nose."
"SplitFit" by 1000101 · 2005-04-11 14:21 · Score: 2, Interesting

From the SplitFit link...
Dera Blcraays Mbmeer, Thsi eamil was stne by the Barclays serevr to vreify yuor emial adsserd. You mtsu competel thsi pssecor by ccilking on the likn bewol and entireng in the smlal wiodnw yoru Braclays Membership nrebmu, passcedo and meelbarom word. Tsih is doen for yruo proteoitcn - buacese semo of our mrebmes no lonegr haev assecc to theri emlia adserdses and we muts virefy it. To vyfire yruo eiaml arddess and accses yruo bnak anuocct , cilck on the lnik bolew:"
That email is extremely difficult to filter out because the only 'real' words are no, of, our, and, etc. Simple words that occur so many times in legitimate emails that most spam filters practically ignore them. But I have to wonder... who would actually 'cilck on the lnik bolew' anyway? I hate to use the term 'you get what you deserve', but if you are naive enough to click the link, then the problem isn't your spam filter, it's you.
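
One way a filter might see through this kind of scrambling, sketched under assumptions: since each scrambled word in the sample keeps the same letters as the original, comparing words by their sorted letters maps "cilck" back to "click". The function names and the tiny vocabulary below are hypothetical, and nothing here is claimed to be how Yahoo! or anyone else actually handles it.

```python
def signature(word: str) -> str:
    """Letters of the word in sorted order; identical for any permutation."""
    return "".join(sorted(word.lower()))

def unscramble(text: str, vocabulary: list[str]) -> list[str]:
    """Replace each token with the vocabulary word sharing its letters, if any."""
    index = {signature(word): word for word in vocabulary}
    result = []
    for token in text.split():
        core = token.strip(".,:;!?'\"")  # drop surrounding punctuation
        result.append(index.get(signature(core), core.lower()))
    return result

# Hypothetical watch list of words common in phishing mail.
print(unscramble("cilck on the lnik bolew:", ["click", "link", "below"]))
# prints ['click', 'on', 'the', 'link', 'below']
```

The obvious weakness is that genuine anagrams collide (a vocabulary containing "no" would also match "on"), so this would only make sense as a preprocessing step in front of a statistical filter, not as a verdict on its own.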
Re:The way to stop spam... by John+Seminal · 2005-04-11 14:23 · Score: 3, Interesting

And SPAM is different from junk snail-mail how? (BTW, anyone have any idea as to why bulk E-mail postage costs less than regular snail mail postage?) The main difference is if I want to send you something through the mail, I have to put a stamp on it and pay money to ship it. If I want to spam you, I can write a virus and get 1000 machines to pump out the spam. I can do it so it does not cost me anything but my time.
Plus, with the postal service, there are thousands of laws in place. If I send you an offer through the mail designed to rip you off, that is a federal offense. You can't use the US Postal Service for illegal activities; if you do, you get caught.
Remember the movie The Firm? They did not convict the lawyers for tax evasion or any other crime. They convicted them for mail fraud. And if you let the worst spammers know that each and every time they send a message that is spam, each instance will incur a penalty, that might stop them.

--
Rosco: "If brains were gunpowder, Enos couldn't blow his nose."
my solution by ricochet81 · 2005-04-11 14:35 · Score: 2, Interesting

Here's my solution to the greater unwanted-communication problem: an anti-spam paper submitted to the Conference on Email and Anti-Spam.

--
Error: Id10t detected
To stop spam, stop the money laundering by Animats · 2005-04-11 14:38 · Score: 5, Interesting

A spammer needs certain resources to survive. Most spam control efforts focus on cutting off the spammer's ability to send spam. Much has been done in that direction. Now more effort needs to be applied to the other direction - cutting off the spammer's payment stream.
Legally, this is promising. First, there's no free speech issue. Second, in most jurisdictions, it's illegal to operate an anonymous business. So most spammers are criminals. Third, laundering transactions through intermediaries is usually a crime, too.
The problem for law enforcement is that following the money is difficult. Additional technical support for that would be a big help.
A good starting point would be to get a credit card issuing bank to cooperate in a scheme where, when one of their credit cards is used, full transaction details, including the payee's full identity, are immediately returned to the cardholder, using encrypted E-mail or some other secure means. That would make "following the money" much easier. This only requires one cooperating bank. That bank's credit cards might become popular with heavy Internet users. Especially if this works for prepaid credit cards, so you can find out who's behind a web site by using some disposable credit card.
The next step is to crack down on "credit card intermediaries". Non-bank credit card intermediaries that handle spammer transactions should be stuck with the legal liability of the spammer. Legally, they're the "merchant". They shouldn't be allowed to pass the buck to some other party. This will make "cheap merchant accounts" harder to get, which is probably a good thing.
Hidden Markov Model by icejai · 2005-04-11 14:43 · Score: 2, Interesting

Why not use a hidden Markov model to filter spam that uses random digits as filler?

A very basic filter will work this way:

Train a network of, say, 30 to 40 units on any English text. The training text doesn't have to be limited to letters and numbers; it can include other ASCII characters as well, because the hidden Markov model will create distributions for them too.

Now, for each new email that comes in, grab random chunks of text (maybe random 30-character strings) and see how probable the text would be under this hidden Markov model. If it turns out not very likely, then scrap it.

Any thoughts?
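
A rough sketch of the scoring step, with one substitution stated plainly: instead of a hidden Markov model, this uses a plain character-bigram Markov chain (no hidden states), which is simpler to show in a few lines. The class name, the add-one smoothing, and the alphabet size of 256 are all assumptions for illustration.

```python
import math
from collections import Counter

class CharBigramModel:
    """Character-bigram chain; a stand-in for the HMM proposed above."""

    def __init__(self, training_text: str, alphabet_size: int = 256):
        self.alphabet_size = alphabet_size
        # Count each adjacent character pair and each pair's first character.
        self.pair_counts = Counter(zip(training_text, training_text[1:]))
        self.first_counts = Counter(training_text[:-1])

    def avg_log_prob(self, chunk: str) -> float:
        """Mean log-probability per transition, with add-one smoothing."""
        pairs = list(zip(chunk, chunk[1:]))
        total = 0.0
        for first, second in pairs:
            numerator = self.pair_counts[(first, second)] + 1
            denominator = self.first_counts[first] + self.alphabet_size
            total += math.log(numerator / denominator)
        return total / max(len(pairs), 1)

model = CharBigramModel("the quick brown fox jumps over the lazy dog " * 50)
print(model.avg_log_prob("the quick brown fox"))  # familiar text scores higher
print(model.avg_log_prob("xq7zk#vw9jp"))          # random filler scores lower
```

Chunks whose score falls below some cutoff would be treated as filler. Where to put that cutoff is not something this sketch can answer; it would have to be tuned on real mail.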
Re:The Arms Race Goes On by lakeland · 2005-04-11 14:48 · Score: 2, Interesting

Yeah, that won't cause much of a problem for Bayesian filters. Essentially your filter will learn that news goes from good to neutral, and that javascript mouseovers go from bad to terrible.

However, this isn't what Joshua and the rest of MS are working on. His stuff is much more in the area of modifying SMTP so that untrusted clients have to perform some calculations before their email is accepted, or pay a few cents. My guess is it will fail since it doesn't account for zombie PCs but I'm sure he has something planned for them.
briefly worked on contract for a spam company by Fox_1 · 2005-04-11 14:59 · Score: 3, Interesting

Well, they weren't really a spam company; they sold software that allowed you to generate spam messages. I was going to do some telephone sales for them, cold-calling their market (I know, it's evil, but I was calling corporations, not individuals, and I needed some cash), but after I got a copy of their software and became familiar with its capabilities I felt icky, like I had stepped in something. I couldn't in good conscience work for them. It had been presented to me as a customer contact software package, but it had too many sneaky little features that marked it to me as spam software (built-in SMTP server, throttle control on SMTP activity so your ISP didn't get mad at you, and a bunch of message generation/tracking options), or at least there was nothing stopping customers from using it that way, no matter how the company described their product.

--
The rock, the vulture, and the chain
Commitment by Deliveranc3 · 2005-04-11 16:11 · Score: 2, Interesting

They blocked the block function for Microsoft messages in Hotmail.