MS Research Automates Search Engine Spam Hunt
Barbie Dollar writes "Researchers at Microsoft are working on an ambitious new project to hunt down and neutralize large-scale search engine spammers. The project, called Strider Search Defender, automates the discovery of search spammers through non-content analysis. The project integrates technology from two previous Microsoft Research prototypes (Strider HoneyMonkey and Strider URL Tracer) and promises a new approach to removing junk results from search engine queries."
Every anti-Microsoft blog and article in existence has been flagged as search engine spam.
More at 11.
"You will pay for your lack of vision..." - Emperor Palpatine to Ray Charles
..that Strider HoneyMonkey was Arwen's pet name for Aragorn?
Sure, preventing search engines from indexing blogspam posts is great. Maybe that's the first step, but it's not going for the root cause - the botnets that run the apps that post/email in the first place, and the compromised webservers hosting order sites.
These are not mutually exclusive goals. If you take away any incentive for spamalizing content (meaning, not only does it not boost your search placement, it penalizes you), then much of the pressure to run botnets and crack servers goes away.
Don't disappoint your bird dog. Go to the range.
First, do not be so skeptical. Have you noticed how well Outlook 2003 spam filtering works? I realize the algorithm is different, but based on results, I have to say that it is probable that Microsoft will succeed with reasonable effectiveness.
Second, what business rationale is there to give away a competitive advantage (after spending millions to get it) in the very competitive search market, where, by the way, Microsoft is not the market leader?
All major search engines have been doing this for quite some time. Google is probably the best hunter of them all, and its most recent update, which occurred on June 27, banned a large number of spammers who had billions of sites indexed. Unfortunately, the war on spam is quite difficult. The spammers are working with non-content pages, but it is only a matter of time before they start generating non-gibberish content to spam with, too.
Hopefully, Microsoft's approach will have some effect and push other operators to work harder on preventing web spam.
Amusingly, you're most likely to be affected only if you're searching for penis pumps, pornographic content, or gambling.
Full Tilt
Seems to me that a group of 10 people could easily flag a large number of spam websites. Is this currently being done by any major engine?
"Thanks for all the money you paid to us. We've used it to buy off ISO among other things" -Microsoft
Microsoft forgot to mention my non-content-based method of blocking comment spam entirely, known as Bad Behavior. And now that they seem to have swiped a few of my ideas, I'm going to have to go see what they're up to...
How am I supposed to fit a pithy, relevant quote into 120 characters?
This *must* be one of the next battle lines in the so-called search wars.
I remember the first time I saw Google - I was blown away: "Wow. These results are exactly the web pages I was looking for!" But that's no longer the case when you search in Google. They've really fallen behind in being able to separate out (or, as they say, "search for") the pages I want from the junk.
I hope Google will win this war, but maybe Microsoft chucking some money at the problem will help light a fire under Google to get this fixed before someone else does it better. If searching at Google no longer brings me relevant results better than any other source, I'm gonna start looking for somewhere else to search. Just like I did when I switched to Google from Yahoo back in the twentieth century.