Microsoft Tracks Down Mass Fake Web Pages
An anonymous reader writes "According to an article in The New York Times, Microsoft researchers have discovered tens of thousands of junk Web pages, created only to lure search-engine users to advertisements. While most of us have run across them from time to time, the company researchers have found the pages are deliberately generated in vast numbers by a small group of shadowy operators. By following the money trail, Microsoft researchers were able to track the flow from big-name advertisers to search engine spammers. Many use Google's blogspot.com to set up spam doorway pages. 'The practice has proved to be a vexing problem for the major search companies, which struggle to prevent both spammers and companies specializing in improving legitimate clients' Web traffic -- a field known as search-engine optimization -- from undermining their page-ranking systems. Surprisingly, the researchers noted that the vast bulk of the junk listings was created from just two Web hosting companies and that as many as 68 percent of the advertisements sampled were placed by just three advertising syndicators.' The report is available at the Microsoft Strider Search Ranger project page."
They could have saved a lot of time and money by just visiting forums like DigitalPoint. These doorways and other spammy sites are for sale every day. It's no secret.
Developers: We can use your help.
I was actually surprised to find their "what to do" points so simple and to the point.
Is it really cheaper to use Page Ranking companies instead of just, well, PAYING for an advertisement on Google or MSN or something?
Ok. Forgive me if MS just discovering this makes me think they just entered 2002. That crap is _not_ new, folks.
On the other hand, what idiot spouts off about two hosting companies being responsible without naming them? Seriously. This isn't Fark, you can't get kicked off for calling some asshole out.
Er, that sounds like the old saw "we lose a penny on each one sold, but we make it up in volume".
If there's only so much karma going into your pages, there's only so much karma they have to give, no matter how huge it is. A trillion pages pointing at my page won't increase its karma, if those trillion have no karma to give.
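To make the "no karma to give" point concrete, here is a rough sketch. It is not Google's actual ranking code; it assumes a simplified, seed-based variant of PageRank in which karma originates only from a hand-picked set of trusted pages and flows along links. The function name `seeded_rank`, the page names, and the damping value of 0.85 are all made up for illustration.

```python
def seeded_rank(links, seeds, damping=0.85, iterations=50):
    """Toy seed-based rank: karma starts at trusted seeds and flows along links.

    links: dict mapping each page to the list of pages it links to.
    seeds: set of trusted pages (an assumption of this sketch).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    # Only seed pages receive baseline karma; every other page starts at zero.
    base = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    rank = dict(base)
    for _ in range(iterations):
        new = {p: (1 - damping) * base[p] for p in pages}
        for src, targets in links.items():
            if not targets:
                continue
            # A page splits the karma it currently holds evenly among its links.
            share = damping * rank[src] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank


# A link farm: 1000 pages that all point at "spam" and at each other.
farm = [f"farm{i}" for i in range(1000)]
farm_links = {p: ["spam", farm[(i + 1) % len(farm)]] for i, p in enumerate(farm)}
farm_links["spam"] = ["farm0"]

# The trusted part of the web: "hub" and "honest" link to each other.
isolated = {"hub": ["honest"], "honest": ["hub"], **farm_links}
rank_isolated = seeded_rank(isolated, seeds={"hub"})

# Same web, but "honest" now also links to "spam".
linked = {**isolated, "honest": ["hub", "spam"]}
rank_linked = seeded_rank(linked, seeds={"hub"})

print(rank_isolated["spam"])    # 0.0: a thousand inbound links, none with karma
print(rank_linked["spam"] > 0)  # True: one link from a page with karma counts
```

Note that under the classic PageRank formulation every page gets a small baseline score, so a big enough farm does leak a little karma to its target. That is one reason link farms worked at all, and why seeding karma from trusted pages is one proposed countermeasure.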
Xenu loves you!
Google is already developing methods to deal with clusters of these fakes. Usually they're scraping web directories and databases. I've seen a lot of this lately, searching for dental hygiene schools for my girlfriend. Usually they're linking to each other, even when they're huge clusters. Legit SEO guys (yes, there are consultants who actually try to get your site linked legitimately and by hand) call these areas "bad neighborhoods". Whatever Google's doing, though, clearly isn't enough, and a lot of these guys are using AdSense to make money. Martinibuster's got a few good links on the subject.
...a friend of mine figured he could get great Google listings by autogenerating trashy link farm pages. He had the top 1000 porn search terms all cunningly misspelled, e.g. "Brittney Spares", and hundreds of thousands of static pages all linking into each other across a bunch of subdomains. For about a year we reckoned he had some stupid percentage of all porn listings in Google, and in that time he made around $1,000,000 from banner clicks. Eventually Google caught on to it and blocked his sites en masse, but he'd made enough to buy some property by then.
Thanks for an informative post. Beats the typical whiny M$ iz S4T4|\| crap.
Google does keep up, but quietly. Anecdotally, last week I was searching for a certain spec ARM9 dev board (the VULCAN-Lite) with USD also as a search term, and all kinds of fake keyword sites and Eastern Bloc bride services were in the top 20 results.
I sent Google feedback with my search terms (VULCAN-Lite +USD), explained what spam was popping up, and as I write this comment a few days later, the Google search comes back clean (empty for +USD, no spam in first 30 results for VULCAN-Lite). They apparently listen and respond to random user feedback pretty quickly.
You are 100% correct that Google does help clean up its searches. I do about 100 web searches a day to learn stuff, and every time I come across spammy results I send Google a note. I think it's working, because the next week when I want to learn more on a topic it's much improved.
if you see me, smile and say hello.
I agree. Whatever else you say about MS, and there's lots to say, they seem to have given their security researchers a lot of freedom, and because of their size and power they have the resources and brainpower to tackle these problems in pretty cool ways. The sad thing, as with much of what comes out of MS, is that you see these really smart, awesome people doing great work, but when it comes to taking their own advice, you can see quite directly the way that the vast bureaucracy and Microsoft's avaricious corporate culture corrupt the good work.
Case in point is IE7. If you look at the IE7 development blogs, you see some good ideas from people who by and large wanted to do good by the web development community. Yet the IE7 that was delivered to consumers can be charitably described as "disappointing", and less charitably described as a "watered-down piece of shit."
Online citizen journalism from the inner city: The View From The Ground