Microsoft Tracks Down Mass Fake Web Pages
An anonymous reader writes "According to an article in The New York Times, Microsoft researchers have discovered tens of thousands of junk Web pages, created only to lure search-engine users to advertisements. While most of us have run across them from time to time, the company researchers have found the pages are deliberately generated in vast numbers by a small group of shadowy operators. By following the money trail, Microsoft researchers were able to track the flow from big-name advertisers to search engine spammers. Many use Google's blogspot.com to set up spam doorway pages. 'The practice has proved to be a vexing problem for the major search companies, which struggle to prevent both spammers and companies specializing in improving legitimate clients' Web traffic -- a field known as search-engine optimization -- from undermining their page-ranking systems. Surprisingly, the researchers noted that the vast bulk of the junk listings was created from just two Web hosting companies and that as many as 68 percent of the advertisements sampled were placed by just three advertising syndicators.' The report is available at the Microsoft Strider Search Ranger project page."
They could have saved a lot of time and money by just visiting forums like DigitalPoint. These doorways and other spammy sites are for sale every day. It's no secret.
Developers: We can use your help.
Man. This Microsoft project is just a ripoff of Google's Gandalf Search Wizard project...
This guy's the limit!
Is it really cheaper to use Page Ranking companies instead of just, well, PAYING for an advertisement on Google or MSN or something?
Time to time? For me it seems like more than 50% when I scan the search results. Maybe less, maybe more, but certainly more than "time to time". For many of my searches, I may not find anything truly relevant until the second or third page. People have learned how to play Google to the point where, more and more, Windows Live is starting to give better results (scary!).
If you want news from today, you have to come back tomorrow.
There's actually some pretty decent research here. The site cloning report is a good read.
http://research.microsoft.com/SearchRanger/Spam_Attack_by_Website_Clones.htm
The cloning of popular blogs has been a scourge for a while now, both for manipulating search engines and for good old-fashioned advertising: using someone else's content to draw visitors in.
-- Using the preview button since 2005
It's coming from inside the building!!!
The original generic sig.
Quick, somebody make a few thousand clones of this report.
Microsoft researchers have discovered tens of thousands of junk Web pages, created only to lure search-engine users to advertisements.
In other news, Microsoft researchers have discovered that the sky is blue and that water is wet.
General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
It works because you don't realize the size of this thing. They're talking about millions of fake pages here, lots of them pointing at other fake pages to raise their PageRank so they can in turn point at yet more pages. You would think Google would have someone seeking these kinds of sites out and applying a discount on their domain though (although when that happens the spammers just move on anyway).
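For anyone wondering why pointing fake pages at each other helps at all, here is a rough sketch. It uses the textbook power-iteration form of PageRank, not whatever Google actually runs, and every page name in it (news, blog, shop, doorway, fake0...) is made up for illustration. The with_farm helper is hypothetical too: it just bolts a ring of fake pages onto a tiny web, each one linking to the doorway page and to the next fake page.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over {page: [outgoing links]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


def with_farm(base, target, size):
    """Hypothetical helper: add `size` fake pages that all link to `target`."""
    web = {page: list(outs) for page, outs in base.items()}
    farm = [f"fake{i}" for i in range(size)]
    for i, page in enumerate(farm):
        # Each fake page links to the target and to the next fake page.
        web[page] = [target, farm[(i + 1) % size]]
    return web


# A tiny invented web; nobody links to the doorway page.
honest = {
    "news": ["blog", "shop"],
    "blog": ["news"],
    "shop": ["news"],
    "doorway": ["shop"],
}

before = pagerank(honest)
after = pagerank(with_farm(honest, "doorway", 20))

print("doorway rank without farm:", round(before["doorway"], 4))
print("doorway rank with farm:   ", round(after["doorway"], 4))
```

In this toy setup the doorway page's score goes up even though the total rank is now shared among six times as many pages, because the fake pages keep feeding their baseline share into it. Real engines add plenty of defenses on top, so take this as the idea only.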
I read the internet for the articles.
... but you can get sued for libel if you're wrong.
We Build Beautiful Websites
Question everything
...a friend of mine figured he could get great Google listings by autogenerating trashy link farm pages. He had the top 1000 porn search terms all cunningly misspelled, i.e. "Brittney Spares", and hundreds of thousands of static pages all linking into each other across a bunch of subdomains. For about a year we reckoned he had some stupid percentage of all porn listings in Google, and in that time he made around $1,000,000 from banner clicks. Eventually Google caught onto it and blocked his sites en masse, but he'd made enough to buy some property by then.
I just finished reading how much the Strider group at M$ has accomplished and how, and it is rather amazing. They lifted the covers off typo-domain squatters exploiting Google's programs, set up a progressive honeypot that detects which levels of XP are attackable by different malware attacks (up to and including reporting zero-day exploits if the latest "patch hardened" machine is exploited), and now this project. Even better, they are publishing the "how", and any OS (such as Mac OS or any of the Linux distros) could benefit by using similar approaches on even more machines.
So -- from an admitted open source advocate -- here's rare kudos to the giant in Redmond for keeping a "white hat" and his group -- and letting them work.
...Open Source isn't the only answer -- but it's almost always a better value than the alternatives...
Thanks for an informative post. Beats the typical whiny M$ iz S4T4|\| crap.
Google does keep up, but quietly. Anecdotally, last week I was searching for a certain spec ARM9 dev board (the VULCAN-Lite) with USD also as a search term, and all kinds of fake keyword sites and Eastern Bloc bride services were in the top 20 results.
I sent Google feedback with my search terms (VULCAN-Lite +USD) and explained what spam was popping up, and as I write this comment a few days later, the Google search comes back clean (empty for +USD, no spam in first 30 results for VULCAN-Lite). They apparently listen and respond to random user feedback pretty quickly.