Microsoft Researchers on Stopping Spam
TheBackBencher writes "Scientific American today has a very interesting article, "Stopping Spam," by Joshua Goodman, David Heckerman and Robert Rounthwaite from Microsoft Research. They talk about different types of spam: email spam, spam on IMs, spam links on web pages and image-based spam. In a very lucid style, they describe different spam-filtering techniques, mainly fingerprint matching, n-gram models, the naive Bayesian approach, optical character recognition, challenge/response systems and Human Interactive Proofs (HIPs). However, they do not mention the fingerprinting approach of using the Nilsimsa hash to tackle spammers' addition of random words to emails, or the "hypertextus interruptus" technique in which spammers split words using HTML comments, pairs of zero-width tags, or bogus tags. Also, Spam-Research is reporting the SplitFit technique that spammers are using to fool Yahoo! Mail SpamGuard."
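For anyone wondering what undoing the "hypertextus interruptus" trick might look like, here is a rough Python sketch. It is not from the article or from any real filter: the function name, the regular expressions and the list of block-level tags are all my own assumptions, and a real filter would want a proper HTML parser. The idea is just to drop comments, empty tag pairs and bogus tags so the split words rejoin before the text reaches a word-based filter.

```python
import re

# Hypothetical helper for illustration only; patterns are not exhaustive.
_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
_TAG = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^>]*>")

# Assumed set of tags that normally separate words when rendered.
_BLOCK_TAGS = {"p", "br", "div", "li", "ul", "ol", "tr", "td", "table",
               "h1", "h2", "h3", "h4", "h5", "h6"}


def _replace_tag(match):
    # Block-level tags become a space; inline or bogus tags vanish,
    # which lets "Vi<b></b>agra" collapse back into one token.
    return " " if match.group(1).lower() in _BLOCK_TAGS else ""


def strip_interruptions(html):
    """Return visible text with comments and tags removed."""
    without_comments = _COMMENT.sub("", html)
    without_tags = _TAG.sub(_replace_tag, without_comments)
    # Normalize runs of whitespace left behind by block-level tags.
    return " ".join(without_tags.split())


if __name__ == "__main__":
    print(strip_interruptions("Vi<!-- random -->ag<b></b>ra"))
    print(strip_interruptions("mort<zzz>gage"))
    print(strip_interruptions("<p>Hello</p><p>world</p>"))
```

The output of a pass like this would then go to whatever tokenizer feeds the Bayesian or fingerprint filter, so the obfuscated word is scored as the word the reader actually sees.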
Create your own spamming division, use illegal tactics to undercut your spamming competition, put them out of business, then stop spamming.
Spam is like porn: hard to define, but you know it when you see it. That can be hard to program, I would think. But who knows.
http://www.busyweather.com/
So, does Microsoft Research plan on combating Spam with a Bob-like approach, or the more refined Clippy approach?
:).
Or are they going to come up with an entirely new file system to combat it, hype it up for every Windows release, but then delay its release a few more years?
Oops, pardon me while I reminisce about all the great advances Microsoft Research has given me.
my blog
Don't you mean, Microsoft Mergers & Acquisitions?
"Who says nothing is impossible? Some people do it every day!" - Alfred E. Neuman
Each year they will announce that This is the Year of No More Spam on the Desktop (of course this never happens).
Or they will invent a brilliant new way to stop spam but as it requires the user to recompile all their OS and apps every 3 days it never gets used.
Or they just tell the end users "Why don't YOU code some anti-spam software?"
Or they produce an anti-spam system, but it makes the user install 3 desktops and window managers, requires a 10,000-line config file that must be written by hand, comes with either missing or misleading documentation depending on the version you download, and randomly purges any non-free software from the hard drive.
I see all this time and money being invested into research to block spam. But we need to rethink our premises: does spam even need to be blocked? Is it actually a problem?
What you call "spam", I call "emails that help me learn about the latest products, websites, and business models". You want less of it? I want MORE of it. "Spam" keeps me informed about the world. And the fact is, consumers LIKE spam. Why do you think spam is profitable? Because people buy the products advertised! Studies show that 3 in 5 people who dislike "spam" have actually bought something online. So frankly, you need to be real careful about how you define "spam" because you could be targeting something you LIKE.