Slashdot Mirror


Microsoft Tracks Down Mass Fake Web Pages

An anonymous reader writes "According to an article in the New York Times, Microsoft researchers have discovered tens of thousands of junk Web pages, created only to lure search-engine users to advertisements. While most of us have run across them from time to time, the company researchers have found the pages are deliberately generated in vast numbers by a small group of shadowy operators. By following the money trail, Microsoft researchers were able to track the flow from big-name advertisers to search engine spammers. Many use Google's blogspot.com to set up spam doorway pages. 'The practice has proved to be a vexing problem for the major search companies, which struggle to prevent both spammers and companies specializing in improving legitimate clients' Web traffic -- a field known as search-engine optimization -- from undermining their page-ranking systems. Surprisingly, the researchers noted that the vast bulk of the junk listings was created from just two Web hosting companies and that as many as 68 percent of the advertisements sampled were placed by just three advertising syndicators.' The report is available at Microsoft Strider Search Ranger project page."

33 of 135 comments

  1. The easy way by truthsearch · · Score: 4, Interesting

    They could have saved a lot of time and money by just visiting forums like DigitalPoint. These doorways and other spammy sites are for sale every day. It's no secret.

    1. Re:The easy way by insanemime · · Score: 5, Funny

      Well it's about time someone tracked down these spammers. I can't count how many times I was searching for porn on the internet and got an ad page. The nerve of some companies.

  2. another ripoff by gEvil+(beta) · · Score: 3, Funny

    Man. This Microsoft project is just a ripoff of Google's Gandalf Search Wizard project...

    --
    This guy's the limit!
    1. Re:another ripoff by eldavojohn · · Score: 4, Funny

      The report is available at Microsoft Strider Search Ranger project page.

      Man. This Microsoft project is just a ripoff of Google's Gandalf Search Wizard project...
      Yeah, but let's not forget that even before that was AOL's Smeagol Browser Gollum project ...
      --
      My work here is dung.
    2. Re:another ripoff by lostboy2 · · Score: 2, Funny

      The report is available at Microsoft Strider Search Ranger project page.

      Man. This Microsoft project is just a ripoff of Google's Gandalf Search Wizard project...

      Yeah, but let's not forget that even before that was AOL's Smeagol Browser Gollum project
      When I was a kid, all we had was the U of Minnesota's Sauron Gopher Overlord project...
    3. Re:another ripoff by CurtisAutery · · Score: 2, Funny

      The report is available at Microsoft Strider Search Ranger project page.

      Man. This Microsoft project is just a ripoff of Google's Gandalf Search Wizard project...

      Yeah, but let's not forget that even before that was AOL's Smeagol Browser Gollum project

      When I was a kid, all we had was the U of Minnesota's Sauron Gopher Overlord project...
      You had Gopher as a kid!? Man, we were stuck with local BBS Sam & Frodo's ASCII Express Second Breakfast project.
    4. Re:another ripoff by The_mad_linguist · · Score: 2, Funny

      You had the Second Breakfast? We only had Bilbo's Punchcard Breakfast.

  3. Why? by Herkum01 · · Score: 5, Interesting

    Is it really cheaper to use Page Ranking companies instead of just, well, PAYING for an advertisement on Google or MSN or something?

    1. Re:Why? by Frosty+Piss · · Score: 5, Insightful

      Is it really cheaper to use Page Ranking companies instead of just, well, PAYING for an advertisement on Google or MSN or something?
      Yes, or they wouldn't do it.
      --
      If you want news from today, you have to come back tomorrow.
    2. Re:Why? by fruey · · Score: 5, Informative

      The average return on investment on Search Engine Optimisation (generally: increasing your search position on specific keywords relevant to your business) can be about 10x more than the return on keyword purchasing, which can cost from $0.30 to several dollars per click. Every click costs money.

      Once you've optimised for your keywords in "natural search", i.e. *free* results, your investment keeps paying (you need to maintain positions of course, but this is lower cost, especially if you're in a niche), whereas in paid advertising you have to keep giving money to Google and, in competitive industries, your cost per click will be subject to significant inflation...
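The trade-off described above can be put into rough numbers. The sketch below is only an illustration: it reads the lower click price as 30 cents, and the traffic volume, upfront optimisation cost, and monthly maintenance cost are hypothetical figures chosen for the example, not data from the comment. Amounts are kept in whole cents to avoid floating-point rounding.

```python
def paid_cost_cents(clicks_per_month, cents_per_click, months):
    """Total spend on paid clicks: every click is billed, every month."""
    return clicks_per_month * cents_per_click * months


def seo_cost_cents(upfront_cents, maintenance_cents_per_month, months):
    """Total spend on optimisation: one upfront cost plus ongoing maintenance."""
    return upfront_cents + maintenance_cents_per_month * months


def break_even_month(clicks_per_month, cents_per_click,
                     upfront_cents, maintenance_cents_per_month, horizon=120):
    """First month in which cumulative SEO spend is no higher than paid spend.

    Returns None if that never happens within `horizon` months.
    Assumes (hypothetically) that both channels deliver the same traffic.
    """
    for month in range(1, horizon + 1):
        seo = seo_cost_cents(upfront_cents, maintenance_cents_per_month, month)
        paid = paid_cost_cents(clicks_per_month, cents_per_click, month)
        if seo <= paid:
            return month
    return None


# Hypothetical scenario: 2,000 visitors a month at 30 cents a click,
# versus $3,000 of upfront optimisation and $100 a month of upkeep.
month = break_even_month(2000, 30, 300_000, 10_000)
print("Break-even month:", month)
```

With low traffic the optimisation never pays for itself in this model, which matches the caveat that the advantage depends on the niche and on how competitive the keywords are.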

      --
      Conversion Rate Optimisation French / English consultant
    3. Re:Why? by terraformer · · Score: 3, Insightful

      It is also more effective. How many times do you click on ads? Now how many times do you click on search results? 'nough said...

      --
      Who are you? The new #2 Who is #1? You are #617565. I am not a number, I am a free man! Muhahaha.
  4. "time to time"? by Frosty+Piss · · Score: 4, Insightful

    While most of us have run across them from time to time...

    Time to time? For me it seems like more than 50% when I scan the search results. Maybe less, maybe more, but certainly more than "time to time". For many of my searches, I may not find anything truly relevant until the second and third page. People have learned how to play Google to the point where more and more Windows Live is starting to give better results (scary!).

    --
    If you want news from today, you have to come back tomorrow.
    1. Re:"time to time"? by hobo+sapiens · · Score: 5, Funny

      I agree. I run a small business out of Nigeria that helps people in unfortunate situations recover lost money, and we rely on upfront investments from Americans. We always promise a good cut of the money to our American investors. This search engine spam really puts the hurt on my business, too.

      --
      blah blah blah
    2. Re:"time to time"? by Frosty+Piss · · Score: 2, Insightful

      I have never seen results that bad. You must be searching for porn, where spam is to be expected.

      I beg your pardon... "Erotica" is a perfectly legitimate subject.

      --
      If you want news from today, you have to come back tomorrow.
  5. Nice work by MysteriousPreacher · · Score: 4, Informative

    There's actually some pretty decent research here. The site cloning report is a good read.

    http://research.microsoft.com/SearchRanger/Spam_Attack_by_Website_Clones.htm

    The cloning of popular blogs has been a scourge for a while now, both for manipulating search engines and good old-fashioned advertising - using someone else's content to draw visitors in.

    --
    -- Using the preview button since 2005
    1. Re:Nice work by onepoint · · Score: 2, Interesting

      You are 100% correct that Google does help clean up its searches. I do about 100 web searches a day to learn stuff, and every time I come across spammy results I send Google a note. I think it's working, because the next week when I want to learn more on a topic it's much improved.

      --
      if you see me, smile and say hello.
  6. Then Microsoft realized... by physicsboy500 · · Score: 5, Funny

    It's coming from inside the building!!!

    --
    The original generic sig.
  7. And? by jafiwam · · Score: 2, Interesting

    Ok. Forgive me if MS just discovering this makes me think they just entered 2002. That crap is _not_ new folks.

    On the other hand, what idiot spouts off about two hosting companies being responsible without naming them? Seriously. This isn't Fark, you can't get kicked off for calling some asshole out.

    1. Re:And? by Sirch · · Score: 3, Insightful

      ... but you can get sued for libel if you're wrong.

  8. Re:The easiest way by TheMeuge · · Score: 3, Funny

    Quick, somebody make a few thousand clones of this report.

  9. And in other news... by sconeu · · Score: 3, Funny

    Microsoft researchers have discovered tens of thousands of junk Web pages, created only to lure search-engine users to advertisements.

    In other news, Microsoft researchers have discovered that the sky is blue and that water is wet.

    --
    General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
    1. Re:And in other news... by Smuffe · · Score: 3, Funny

      Microsoft researchers have discovered that the sky is blue

      I live in London, you insensitive clod!

  10. Re:How does this help them? by jandrese · · Score: 3, Insightful

    It works because you don't realize the size of this thing. They're talking about millions of fake pages here, lots of them pointing at other fake pages to raise their PageRank so they can in turn point at yet more pages. You would think Google would have someone seeking these kinds of sites out and applying a discount on their domain though (although when that happens the spammers just move on anyway).

    --

    I read the internet for the articles.
  11. Re:How does this help them? by Paul+Crowley · · Score: 2, Interesting

    Er, that sounds like the old saw "we lose a penny on each one sold, but we make it up in volume".

    If there's only so much karma going into your pages, there's only so much karma they have to give, no matter how huge it is. A trillion pages pointing at my page won't increase its karma, if those trillion have no karma to give.
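One way to test this argument is a toy PageRank calculation. The sketch below assumes the textbook damped formulation (damping factor 0.85, uniform teleportation, rank from dead-end pages spread evenly); that is an assumption about how the ranking works, not something the thread confirms about Google's production system, and the page names and graph are hypothetical. Under that formulation every page receives a small baseline score, so even farm pages that nobody links to have a little "karma" to pass on.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dict {page: [pages it links to]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outlinks is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links[p])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        for p in pages:
            for target in links[p]:
                # Each page splits its damped rank equally among its outlinks.
                new_rank[target] += damping * rank[p] / len(links[p])
        rank = new_rank
    return rank


def build_web(farm_size):
    """Hypothetical web: two honest pages linking to each other,
    one target page, and `farm_size` farm pages that all link to the target.
    Nothing links to the farm pages themselves."""
    links = {"honest_a": ["honest_b"], "honest_b": ["honest_a"], "target": []}
    for i in range(farm_size):
        links[f"farm_{i}"] = ["target"]
    return links


no_farm = pagerank(build_web(0))["target"]
with_farm = pagerank(build_web(50))["target"]
print("target share without farm:", round(no_farm, 3))
print("target share with farm:   ", round(with_farm, 3))
```

In this toy graph the target's share of the total rank goes up once the farm is added, even though no outside page links to the farm. Real search engines apply additional spam defences that this model ignores.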

  12. Obligatory Bill Hicks by Thaelon · · Score: 4, Funny
    Obligatory Bill Hicks...

    If you work in advertising, kill yourself.
    --Bill Hicks - Another Dead Hero
    --

    Question everything

  13. Bad neighborhoods by condour75 · · Score: 2, Interesting

    Google is already developing methods to deal with clusters of these fakes. Usually they're scraping web directories and databases. I've seen a lot of this lately, searching for dental hygiene schools for my girlfriend. Usually they're linking to each other, even if they're huge clusters. Legit SEO guys (yes, there are consultants who actually try to get your site linked legitimately and by hand) call these areas "bad neighborhoods". Whatever Google's doing, though, clearly isn't enough, and a lot of these guys are using AdSense to make money. Martinibuster's got a few good links on the subject.

  14. Re:How does this help them? by FooAtWFU · · Score: 2, Insightful
    Presumably some of these trillion pages have a karma greater than or equal to epsilon.

    The scummiest part of it all is that some of the pages in question will be on domains that someone let expire and someone else immediately snatched up. They get their PageRank from the sites that linked to the formerly legitimate domain. And if that was your domain name, and you only let it expire accidentally, well, sucks to be you. :(

    --
    The World Wide Web is dying. Soon, we shall have only the Internet.
  15. A few years ago... by AliasTheRoot · · Score: 3, Interesting

    ...a friend of mine figured he could get great Google listings by autogenerating trashy link farm pages. He had the top 1000 porn search terms all cunningly misspelled, e.g. "Brittney Spares", and hundreds of thousands of static pages all linking into each other across a bunch of subdomains. For about a year we reckoned he had some stupid percentage of all porn listings in Google, and in that time he made around $1,000,000 from banner clicks. Eventually Google caught on to it and blocked his sites en masse, but he'd made enough to buy some property by then.

  16. Microsoftie wearing a white hat? by CodeShark · · Score: 5, Insightful

    I just finished reading how much the Strider group at M$ has accomplished and how, and it is rather amazing. They lifted the covers off typo-domain squatters exploiting Google's programs, set up a progressive honeypot that detects which levels of XP are attackable by different malware attacks (up to and including reporting zero-day exploits if the latest "patch hardened" machine is exploited), and now this project. Even better, they are publishing the "how", and any OS (e.g. Mac OS or any of the Linux distros) could benefit by using similar approaches on even more machines.

    So -- from an admitted open source advocate -- here's rare kudos to the giant in Redmond for keeping a "white hat" and his group -- and letting them work.

    --
    ...Open Source isn't the only answer -- but it's almost always a better value than the alternatives...
    1. Re:Microsoftie wearing a white hat? by TheViewFromTheGround · · Score: 2, Interesting

      I agree. Whatever else you say about MS, and there's lots to say, they seem to have given their security researchers a lot of freedom and, because of their size and power, have the resources and brainpower to tackle these problems in pretty cool ways. The sad thing, as with much of what comes out from MS, is that you see these really smart, awesome people doing great work, but when it comes to taking their own advice, you can see quite directly the way that the vast bureaucracy and Microsoft's avaricious corporate culture corrupt the good work.

      Case in point is IE7. If you look at the IE7 development blogs, you see some good ideas from people who by and large wanted to do good by the web development community. Yet the IE7 that was delivered to consumers can be charitably described as "disappointing", and less charitably described as a "watered-down piece of shit."

      --
      Online citizen journalism from the inner city: The View From The Ground
  17. Nice work by kad77 · · Score: 3, Interesting

    Thanks for an informative post. Beats the typical whiny M$ iz S4T4|\| crap.

    Google does keep up, but quietly. Anecdotally, last week I was searching for a certain spec ARM9 dev board (the VULCAN-Lite) with USD also as a search term, and all kinds of fake keyword sites and Eastern Bloc bride services were in the top 20 results.

    I sent Google feedback with my search terms (VULCAN-Lite +USD), explained what spam was popping up, and as I write this comment a few days later-- the Google search comes back clean (empty for +USD, no spam in first 30 results for VULCAN-Lite). They apparently listen and respond to random user feedback pretty quickly.

  18. Firefox is good. by wetelectric · · Score: 2, Informative

    Firefox has an extension called CustomizeGoogle. It adds a 'filter' option to a Google results page, which lets you filter out the sneaky pages that hijack your search query.

    --
    Most people have no idea what they are doing, and are silently panicking on the inside.
  19. Timing by jeichels · · Score: 2, Insightful

    Funny timing: just last night we turned down $73k/month in advertising from one of the top three spam-supporting syndicators. They were seeking an average of $1.16 per click-through.

    I am very glad I read the detailed report from end to end. We seek value in advertising, not spam, but it is very difficult for well-meaning companies to figure out which is which. You shouldn't have to be a rocket scientist to differentiate the deceptive tactics/companies from the valid ones. I guess most forms of fraud end up being abstractly similar to this scheme in the end though.

    If something smells fishy don't eat it.

    --

    JohnE
    jobbank.com - Search jobs, post resume,