Slashdot Mirror


Google Goes After Content Farms

RedEaredSlider writes "Aimed at stripping search results of pages from 'low-quality' sites, a new Google Chrome extension allows users to block specified websites from appearing in search results. The names of these sites are then sent to Google, which will study the collected results and use them to determine future page ranking systems. Google principal engineer Matt Cutts wrote in a post on the Google blog that the company hopes the extension will improve the quality of search results. The company has been the target of criticism in recent months, much of which centered around the effect that content farms were having on searches."
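
The blocking behavior described in the summary can be sketched in a few lines. This is only an illustration of per-user domain blocking, not the extension's actual code; the function names `is_blocked` and `filter_results` are hypothetical, and the step that reports blocked sites back to Google is left out.

```python
from urllib.parse import urlsplit


def is_blocked(url, blocked_domains):
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked_domains)


def filter_results(result_urls, blocked_domains):
    """Split results into (kept, hidden) lists, preserving the original order."""
    blocked = {d.lower() for d in blocked_domains}
    kept, hidden = [], []
    for url in result_urls:
        (hidden if is_blocked(url, blocked) else kept).append(url)
    return kept, hidden
```

Matching on the host name plus a leading dot keeps a block on one domain from also hiding an unrelated domain that merely ends with the same characters.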

59 of 345 comments

  1. Firefox Extension Needed! by dch24 · · Score: 5, Insightful

    Dear Google,

    Please port this to Firefox.

    Sincerely,
    The rest of the browser market

    1. Re:Firefox Extension Needed! by nicedream · · Score: 3, Informative

      Dear dch24,

      Try this script for Greasemonkey.

      Sincerely,
      nicedream

    2. Re:Firefox Extension Needed! by Anonymous Coward · · Score: 5, Insightful

      perhaps the description for that script is lacking...BUT it doesn't report the sites you block back to google--which is the best frickin point of this extension!!!

    3. Re:Firefox Extension Needed! by multisync · · Score: 2, Insightful

      Dear Slashdot

      Please give us a plug-in we can use to report moderation abuse.

      Missing the old meta-mod system,
      A concerned Slashdotter

      --
      I don't care why you're posting AC
    4. Re:Firefox Extension Needed! by Fluffeh · · Score: 3, Insightful

      You mean this one that is still there and happily waiting for you to metamod in?

      --
      Moved to http://soylentnews.org/. You are invited to join us too!
    5. Re:Firefox Extension Needed! by Flipao · · Score: 2, Insightful

      They will, that's why they are collecting relevance feedback from users.

    6. Re:Firefox Extension Needed! by multisync · · Score: 5, Informative

      That is not a meta-mod system. It is a comment popularity system.

      It is useful as well. It is comment-centric, and gives site administrators a very high level snapshot of what users think about the current state of the user generated content.

      The old meta-moderation system, OTOH, tasked the meta-moderator with judging whether a specific moderation a comment received was fair. It wasn't a perfect system, but it provided just the smallest possibility that there may be consequences for abuse of moderation privileges.

      --
      I don't care why you're posting AC
    7. Re:Firefox Extension Needed! by Dhalka226 · · Score: 2

      That's not the old metamod system. The old system was essentially an agree/disagree with the moderation that was given to a post. This system basically asks you to moderate the posts yourself and, presumably, tries to see how many metamodders arrive at different conclusions than the moderators did.

    8. Re:Firefox Extension Needed! by pavon · · Score: 2

      Only to be replaced with thousands of other ones. If you want to fix this problem you have to remove the profit, which means you need to get them off the google index.

    9. Re:Firefox Extension Needed! by hedwards · · Score: 2

      The old version was better. I haven't bothered to meta mod since they took away the up or down vote on it.

    10. Re:Firefox Extension Needed! by Shadow+of+Eternity · · Score: 4, Insightful

      Dear Google,

      Please stop fucking with my search results. When I type something in the search box I want you to search for exactly that and suggest possible typos. I don't want you to search for what I DIDN'T type, I don't want you to combine it with my previous results, I don't want you to assume I must have meant something else and search for some other word entirely because you THINK it's the same thing.

      Sincerely,
      Everyone who's sick of searching for one thing and having something totally different returned.

      --
      A bullet may have your name on it but splash damage is addressed "To whom it may concern."
    11. Re:Firefox Extension Needed! by Rabenblut · · Score: 2
  2. Here's to hoping Expert's Exchange is among them by Zilvreen · · Score: 5, Insightful

    I can't begin to express how aggravating it is to google a programming issue, and have the top five results all link to the same page with the same paywalled answers.

  3. An incremental improvement, I suppose... by fuzzyfuzzyfungus · · Score: 3, Funny

    Frankly, no browser extension will be suitable to the task of going after link farmers until Lethal Force over IP is developed and widely adopted; but, in the absence of robust LF/IP implementations, I suppose hitting them in the wallet will have to do....

    1. Re:An incremental improvement, I suppose... by Dachannien · · Score: 3, Funny

      Frankly, no browser extension will be suitable to the task of going after link farmers until Lethal Force over IP is developed and widely adopted; but, in the absence of robust LF/IP implementations, I suppose hitting them in the wallet will have to do....

      As I understand it, there are concerns of collateral damage because of all the hosts behind Network Assassination Translation firewalls.

  4. Search Wiki by kabloom · · Score: 2

    Isn't this similar to the "Search Wiki" feature of Google that's available in every browser? Why didn't they just use that instead?

    1. Re:Search Wiki by game+kid · · Score: 2

      Pretty much; I can't blame them for seeking marketshare, though.

      For a somewhat short while, Google search results each had squares with "X"es in them that would take them off the list when clicked (explained further in this post on BlogsDNA). I kinda wish they stayed so I could nuke the spammier results I find, but we all know downvotes are just as exploitable as raves and I have a feeling this Chrome thing will get used more nefariously than not.

      --
      You can hold down the "B" button for continuous firing.
  5. Paywall sites are going to be hit pretty hard by ZackSchil · · Score: 5, Insightful

    Users who run into paywalls are going to pretty quickly add these sites to the filters, since the results are technically useless even if the content locked away is high-quality. This does not bode well for sites like Experts-Exchange or America's Test Kitchen.

    1. Re:Paywall sites are going to be hit pretty hard by Mouldy · · Score: 3, Interesting

      A trick I learnt with experts exchange is that the posts are actually accessible. You just have to scroll past the "GIMMER ALL YER MONEY" messages and you get to the original text. Experts Exchange's paywall is a simple example, but if Google's indexer can read past the paywall, there's no reason why you can't. Sometimes, if a site serves different content to people than to spiders, you can just click on the "cached" link in Google's results page to see the version that Google indexed.

    2. Re:Paywall sites are going to be hit pretty hard by dch24 · · Score: 2

      Because they don't produce original information, they just link-farm it.

      There are plenty of good sites out there who aren't gaming the system.

    3. Re:Paywall sites are going to be hit pretty hard by Intrepid+imaginaut · · Score: 5, Insightful

      if a site serves different content to people than to spiders

      If a site does that, why should it be listed at all? That's straight down the line spammery, as far as I can see.

    4. Re:Paywall sites are going to be hit pretty hard by dgatwood · · Score: 2

      There are actually valid reasons for doing that.

      The classic example is JavaScript-rendered dynamic content. This tends not to work so well when you're dealing with search engines. However, if you can serve them a static page that contains the text of the page minus all the rendering, then it can index the content without choking on the JavaScript. I'm not sure how important this is these days, but it certainly was a problem at one time.

      It's also useful to serve modified versions for search engines so that searches for content within your site can return more relevant results. For example, you might insert certain keywords that describe the content of the page using terms that don't actually appear. Case in point, your page talks about Airport, but you serve a copy to Google that inserts the terms 802.11 and Wi-Fi.

      Finally, there's the question of bandwidth and CPU overhead. If your site changes a lot, Google beats on your servers rather frequently. You can reduce the bandwidth hit by stripping JavaScript, CSS, images, etc. from your content before serving it to Google. This won't significantly change the searchability of the content, but will reduce the bandwidth overhead. And, of course, if there are static versions of content that you can serve instead of a server-side-dynamic version, this also saves on CPU overhead.

      For example, when you're writing a blog, you might decide that you don't care if the comments are searchable in Google. Thus, instead of wasting your server's CPU to compute the HTML for the comments, you can serve up a web page containing only the actual blog content when queried by a search engine.

      Paywalls, of course, are a dubious reason.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    5. Re:Paywall sites are going to be hit pretty hard by Intrepid+imaginaut · · Score: 2

      Isn't serving different content to spiders and to people, for whatever reason, explicitly against Google's rules though? For the first point, I think Google can digest JavaScript alright these days, even some flash. For the second, that's dodgy, if you want those terms indexed include them in your article. The third, it would seem you're still serving the same content just in a slightly different format.

    6. Re:Paywall sites are going to be hit pretty hard by trawg · · Score: 2

      It's actually part of Google's webmaster guidelines that you don't do this. I am not sure if it is grounds for removal though:

      # Make pages primarily for users, not for search engines. Don't deceive your users or present different content to search engines than you display to users, which is commonly referred to as "cloaking."

    7. Re:Paywall sites are going to be hit pretty hard by Kalriath · · Score: 2

      They don't link farm. Like StackOverflow, they actually have contributors who answer questions submitted by real people. The problem they have is their UI sucks and their advertising is abominable.

      --
      For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
    8. Re:Paywall sites are going to be hit pretty hard by kasperd · · Score: 3, Insightful

      why is everyone here against EE? Is it because they attempt to charge you for the answer?

      I'll tell you what I don't like about it. I don't mind them charging for an answer when both the person who asked the question and the person who gave the answer are ok with that. On the other hand, I am not an ExpertsExchange user, and I do not think that it is ok that they charge for access to answers that I wrote.

      I know there are ways to get to see the answer without paying, and that is why I know that some of the answers are nothing but a link to a webpage where I provided the answer to the question (before it was even asked on ExpertSexchange). So far I haven't decided what to do about this. I could direct the users who access my site by using a link from ExpertsExchange through an interstitial page, but that would seem like punishing the users instead. But maybe if I used the page not just to point out my opinion about that site, but to also mention free alternatives, then it may be ok. I have also considered telling ExpertsExchange to make all pages with links to my pages freely available. (If newspapers can claim it is a copyright violation to link to their news, then I should be able to make similar requirements to ExpertsExchange. But it does feel like going a bit against my principles because I think linking directly to pages with relevant information is what the web is all about).

      But what I dislike even more than sites charging for access to answers, that are little more than a link to my site, is those fake forums that pretend I am a user of their site. But in reality the entire content of that forum site is a ripoff of a selection of usenet groups. I'd feel much better about claiming copyright violation against such sites because they actually have copied content copyrighted by me. On the other hand, it seems a bit futile to try to go against all of those sites that way. And it may be difficult to draw a line between a legitimate webinterface for usenet, and a blatant ripoff. However, one distinguishing feature is whether the site makes it clear that it is a webinterface for usenet, or whether it pretends to be an amazingly popular webforum. Another distinguishing feature is whether it focuses on a (small) group of users that use it as their way to access usenet, or if the site simply tries to attract all kinds of users from every searchengine out there, and just throws tons of ads at the users (with a little bit of copied content in between).

      --

      Do you care about the security of your wireless mouse?
    9. Re:Paywall sites are going to be hit pretty hard by Compaqt · · Score: 2

      Well, I have mod points, and I also browse Slashdot without Javascript (and moderate, too, without problem). Just a data point.

      --
      I'm not a lawyer, but I play one on the Internet. Blog
    10. Re:Paywall sites are going to be hit pretty hard by wolrahnaes · · Score: 2

      There are actually valid reasons for doing that.

      No there aren't. In fact Google explicitly states not to, and that those found to be doing so may have their pagerank reduced or be entirely eliminated from the index.

      The classic example is JavaScript-rendered dynamic content. This tends not to work so well when you're dealing with search engines. However, if you can serve them a static page that contains the text of the page minus all the rendering, then it can index the content without choking on the JavaScript. I'm not sure how important this is these days, but it certainly was a problem at one time.

      If this happens, you did it wrong to begin with. If you're publishing any content where you'd ever care about whether it's searchable, it should always degrade cleanly when features are missing or disabled in a user's browser. The blind, users on less powerful mobile platforms, and those who just disable JS for security or privacy reasons won't get your content either if a search engine wouldn't.

      It's also useful to serve modified versions for search engines so that searches for content within your site can return more relevant results. For example, you might insert certain keywords that describe the content of the page using terms that don't actually appear. Case in point, your page talks about Airport, but you serve a copy to Google that inserts the terms 802.11 and Wi-Fi.

      There are meta tags for this or you can make up for it by writing in a more "SEO" way. Refer to the "Airport 802.11 (Wi-Fi) access point" rather than just "Airport" in the first paragraph and it's all good.

      Finally, there's the question of bandwidth and CPU overhead. If your site changes a lot, Google beats on your servers rather frequently. You can reduce the bandwidth hit by stripping JavaScript, CSS, images, etc. from your content before serving it to Google. This won't significantly change the searchability of the content, but will reduce the bandwidth overhead. And, of course, if there are static versions of content that you can serve instead of a server-side-dynamic version, this also saves on CPU overhead.

      The plain Googlebot, as well as most search crawlers, will not even request CSS or Javascript. The Instant Preview bot does, but it's generating a screen capture of your site so you want it to. If you send sane headers this isn't a problem anyways, as the spider will only hit updated content.
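
      The "sane headers" point can be sketched as a conditional GET. This assumes the headers meant are Last-Modified and If-Modified-Since; the function name `conditional_get` is hypothetical, and a real server would normally handle ETag and If-None-Match as well.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(last_modified, if_modified_since=None):
    """Return (status, headers) for a GET on a resource changed at last_modified.

    last_modified must be a timezone-aware datetime. if_modified_since is the
    raw If-Modified-Since header value sent by the client, or None.
    """
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through to a full response
        if since is not None and since.tzinfo is not None and last_modified <= since:
            return 304, headers  # unchanged since the last crawl: send no body
    return 200, headers
```

      A crawler that echoes the Last-Modified value back gets a 304 with no body until the page actually changes, which is where the bandwidth saving comes from.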

      For example, when you're writing a blog, you might decide that you don't care if the comments are searchable in Google. Thus, instead of wasting your server's CPU to compute the HTML for the comments, you can serve up a web page containing only the actual blog content when queried by a search engine.

      That's really stretching for a justification. If for some reason your comment system is so bad that displaying comments adds a notable load to each page view, you should probably go like Ars Technica and some other sites and simply put comments on a deeper linked page behind a rel=nofollow.

      Paywalls, of course, are a dubious reason.

      Well, we agree here at least.

      --
      I used to get high on life, but I developed a tolerance. Now I need something stronger.
  6. Re:Here's to hoping Expert's Exchange is among the by Frosty+Piss · · Score: 3, Insightful

    You're talking about "Expert Exchange".

    I've never used them, paying for Internet based programming help defeats the purpose of the Internet. If that's what I wanted, I'd hire a contractor.

    --
    If you want news from today, you have to come back tomorrow.
  7. Re:Here's to hoping Expert's Exchange is among the by Melibeus · · Score: 2

    That was one of the first sites I thought of when I saw this post.
    It looks like you can set your own block list up. So I'm going to be happy never to see Experts Exchange again.

  8. Death to experts-exchange.com by Uloi · · Score: 5, Informative

    Here is the solution to your coding problem.. Oh wait no, give us money first.

    1. Re:Death to experts-exchange.com by idiot900 · · Score: 4, Informative

      If you reach an experts-exchange.com page via Google, just scroll down to the very bottom for the solution.

    2. Re:Death to experts-exchange.com by eddy+the+lip · · Score: 5, Funny

      Me, too. Now I'm just annoyed because I discovered the quality of the answers.

      --

      This is the voice of World Control. I bring you Peace.

  9. Re:Here's to hoping Expert's Exchange is among the by WoodstockJeff · · Score: 4, Funny

    Well, more correctly, AN answer is there... May not be correct or even relevant to the question, but there will be an answer. I used to have my Google preferences set to exclude Expert Sex Change from results, but that setting keeps getting reset...

  10. What took them so long? by Anonymous Coward · · Score: 2, Insightful

    I'm surprised it took them this long to do this. It seems like a pretty good way to leverage the fact that they've got their own software running on the client side too.

  11. Fuck off, squidoo.com by Fear+the+Clam · · Score: 3, Insightful

    I hope that site and its squads of web-shitting bastards all get kicked off google's search results.

    Then, if they could boot the fake review sites and the domain squatters ("AnalRape.com: What you want, when you want it.") the web might be worthwhile again.

  12. Re:Changes Nothing by dch24 · · Score: 3, Insightful

    It may help that Google reviews the results.

    They are pretty good at spotting trends (especially spam), because spammers go for the easiest target.

  13. Re:Here's to hoping Expert's Exchange is among the by Korin43 · · Score: 4, Informative

    Solution: Add "stackoverflow" to the end of every programming-related question. It saves a lot of time.

  14. This is such a good idea! by 93+Escort+Wagon · · Score: 2

    I wish it was available for Firefox. I really get tired of having to look at the domain name of each returned search result before clicking on it. The so-called "experts exchange" would be first on my blocked list.

    --
    #DeleteChrome
  15. Could also be used for evil... by Anonymous Coward · · Score: 2, Interesting

    Google already "charges" for increased search "relevancy" and gives massive discounts to large bulk buyers (think Amazon, Ebay, etc)... What happens when my legit sites start getting pruned for lack of payment... er... relevancy? Google already sticks it to small businesses with Adwords rates that are uncompetitive when compared to huge advertisers, so what would stop them from doing the same in this realm? Don't be evil? right...

  16. Re:Here's to hoping Expert's Exchange is among the by SkyDude · · Score: 2

    I can't begin to express how aggravating it is to google a programming issue, and have the top five results all link to the same page with the same paywalled answers.

    Amen brother, amen.

    --
    == First cross river, then insult alligator.
  17. Google already had this feature by skomes · · Score: 5, Interesting

    What really pisses me off is that google already had this feature. Personalized search results used to let you relegate some websites to the bottom and mark some results and sites as being more important. It was incredibly useful when filtering out garbage spam sites. Google also said we would be able to share these in some way to improve search results. Then for no reason they removed that feature and replaced it with the ability to put a gold star on some results. Of course the benefit of the feature was in relegating spam to the bottom of the page and you could no longer do that. When they removed the feature I stopped using it entirely. Now google is backtracking by introducing this extension. What was the entire point of removing the original feature which worked on all browsers?

    1. Re:Google already had this feature by ogl_codemonkey · · Score: 3, Insightful

      I suspect this is related to some overall plan for adding value to the Chrome platform.

  18. Why a browser extension? by Anonymous Coward · · Score: 2, Insightful

    Why not make this a part of Google search itself, like the report spam buttons in Gmail?

  19. Re:Here's to hoping Expert's Exchange is among the by h4rr4r · · Score: 3, Funny

    My name is not Jesus, most people stopped getting confused about that when I cut my hair.

  20. Re:Here's to hoping Expert's Exchange is among the by Evildonald · · Score: 2

    or Expert Sexchange as they are really known

  21. Re:Here's to hoping Expert's Exchange is among the by EdIII · · Score: 2

    LOL!

    Yes. I knew I could not be the first person to post Expert Exchange. It was the *very* first thing I thought of, and then some of the more annoying driver sites that popup when you do searches for various printer and hardware drivers.

    I love this idea too, but honestly wonder just what Google will do with the results. I can see abuses like astroturfing to influence a competitor's ranking in the search results.

    That being said, just being able to block Expert Exchange is priceless to me. I hate those bastards.

  22. My strategy by sirrunsalot · · Score: 2

    For my own sake, I just throw the junk into /etc/hosts so I don't make the mistake twice. Still shows up in search results and Google doesn't hear about it, but at least I get the satisfaction of not sending any traffic their way. answers.yahoo.com, telegraph.co.uk, pcmag.com, foxnews.com... The list goes on and on.
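
    A minimal sketch of that approach, assuming the usual hosts-file layout of an address followed by a host name. The helper name `hosts_entries` and the choice of 127.0.0.1 as the sink address are assumptions for illustration.

```python
def hosts_entries(domains, sink="127.0.0.1"):
    """Build hosts-file lines that point each domain at a local sink address."""
    lines = []
    # Normalize and de-duplicate so repeated entries produce a single line.
    for domain in sorted({d.strip().lower() for d in domains if d.strip()}):
        lines.append(f"{sink}\t{domain}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the lines rather than writing /etc/hosts, which needs root privileges.
    print(hosts_entries(["answers.yahoo.com", "pcmag.com"]), end="")
```

    The hosts file matches exact host names only, so a "www." variant needs its own line.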

  23. Re:Here's to hoping Expert's Exchange is among the by BillGod · · Score: 3, Insightful

    YES the freakin driver sites are getting ridiculous. Every time I look for a driver the first 2 pages are some crap site that just pops you around from page to page only to try and install their crappy software to "give" you the driver for a small fee! If their site was not there the driver would be easy to find. With all these sites it makes the driver impossible to find.. therefore you need these sites to find them.. AHHH gonna throw up now.. then download this bad boy.

    --
    MISSING - Sig file. 2 years old black and white and very funny. If found please email me.
  24. Re:Here's to hoping Expert's Exchange is among the by EdIII · · Score: 2

    Heh

    I know about the hosts file :)

    How does that block it from appearing in the search results for Google? I know Google ain't pure as the driven snow, but they ain't checking my hosts file on disk either before they return search results :)

  25. No plugin, just extend what you have already. by clintp · · Score: 5, Insightful

    Dear Google,

    Screw the plugin.

        1. Give me a "search preference" where I can say "never this site in my results." You track my "safe search" and other preferences, just add this one.
        2. Along with the star, preview, cached, etc... buttons in the results, give me a "this site's results are shit" button. A turd icon would do nicely.
        3. Extend your search keywords to add "nosite". i.e. nosite:experts-exchange.com

    All of these you could track and adjust your algorithms based on trends of "real life" searchers who utilize these features.

    Sincerely,
    Me

    --
    Get off my lawn.
  26. Local link sites / business directories by 19061969 · · Score: 2

    When looking for a local business, I often search for the name and town of the company. All well and good. But I often find that the first few links (sometimes even pages) are crammed with business directory sites. I really would prefer to use the proper company's website.

    1) The company's website will have up to date information and more info like opening hours etc
    2) The business directories often have /NO LINK WHATSOEVER/ to the company's site - just a sodding phone number. It's almost like they feel it's against the law to display a link to a company's website.
    3) The business directory sites sometimes have the name and town and nothing else - wow, way to go. Tell me what I just searched for and nothing else. Thanks. Really useful there Einstein.
    4) Using these directories, I would be using a search engine to go to - a search engine! (or as near as). Yeah - maybe if these directories could chain up and I could spend all f***ing day going around in circles (note the no hyperlinks point above)
    5) These directories are often full of crap - when the page loads, I'll see my own query loaded up in the directory's search box and then a really helpful and informative message below "Sorry, we can't find anything that matches your query. Did you mean blah-dee-blah instead?" (poor recall)
    6) Precision of returns is also poor. Lots of irrelevant companies are shown when something does come up for a query. I want local pizza delivery and they recommend car / auto parts. Wow, I was hungry but maybe what I really need are some brake pads?
    7) These bloody things often appear *above* the website of the very company I'm looking for.

    Although they're trying to be useful, they have a crap business model that does nothing but get in my way.

    Another one is review sites. Say you want to buy a camera and you want to read reviews. Yeah, there are pro reviews, but you want reviews by real users. So you type in the word 'review' as well. Up come a ton of returns from the search engine... ...most of which say, "Be the first one to write a review". I have honestly used that phrase as a Boolean NOT just to try and get some useful content.
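
    The Boolean NOT trick can be sketched as a small query builder. The helper names `build_query` and `search_url` are hypothetical; the minus prefix on a quoted phrase and the `-site:` form are the standard Google exclusion operators, and the base URL is an assumption about where the query is sent.

```python
from urllib.parse import urlencode


def build_query(terms, exclude_phrases=(), exclude_sites=()):
    """Compose a search string with negated exact phrases and negated sites."""
    parts = list(terms)
    parts += [f'-"{phrase}"' for phrase in exclude_phrases]
    parts += [f"-site:{site}" for site in exclude_sites]
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """URL-encode the query string into a search URL."""
    return base + "?" + urlencode({"q": query})
```

    For the camera case that would be something like build_query(["camera", "review"], exclude_phrases=["Be the first one to write a review"]).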

    --
    bang goes my karma... again...
  27. Re:Your sig by pegdhcp · · Score: 2

    I can't speak to the financial end of it, but I can tell you that despite my karma being "bad" for almost my entire stay, here, I have the checkbox. So, yeah, that's weird.

    Can it be caused by the fact that contribution is giving comments, making posts etc. and karma is the evaluation of some random moderator about your "contribution". Take a discussion about apple vs. linux. Depending on the day and the phase of the moon, you would end up either at -1 or +5 if you have anything of value in your post.

  28. Re:Here's to hoping Expert's Exchange is among the by hairyfeet · · Score: 2

    Actually you better check more than one source for the information anyway because as others have pointed out just because they give AN answer doesn't mean you get the RIGHT answer.

    I had a customer ask advice on one of the eAnswers style sites before he brought it to me and the advice he was given was like some sort of WinXP urban legends handbook or something. They had him throw out all the Windows prefetch files "to speed things up", set prefetch to 4, make a separate partition for the page file, just total BS.

    So I'd agree with you but I don't even think the helpful part is real big on their todo list, just more of a nice side effect but no hard and fast prerequisite. The only thing they really really REALLY care about is cranking up those page views to max, everything else be damned. It reminds me of those SEOs where they fill the comments section of a blog with the same set of keywords over and over and over.

    But if this gets content farms off the top 10 I'm all for it, but sadly it will probably be more like spam and SEOs where the little twerps just keep figuring ways around it.

    --
    ACs don't waste your time replying, your posts are never seen by me.
  29. Re:Here's to hoping Expert's Exchange is among the by IICV · · Score: 5, Informative

    They occasionally have actual answers. The thing is, Google won't give you any credit for answers browsers can't see - which would mean the paywall would knock your page rank to shit.

    How does Expert Sex Change get around this? They pretend that the answer is behind a paywall, when in fact the answer is actually all the way at the bottom of the page. The Google search bot is much more patient than you are, and will not care about the pretend-paywall.

    So yeah. Whenever it looks like Expert Sex Change has your answer, just follow the link and scroll all the way down.

  30. Wow, my competitor just got 500 complaints? by Eightbitgnosis · · Score: 2

    Oh gee, I never paid people $0.20 a pop to write in phony complaints on Amazon Mechanical Turk

  31. Re:Why use Experts Exchange? Use Stack Overflow! by mcvos · · Score: 2

    Most people use Stackoverflow for that.

  32. Spammers will `spam' this system by Magnus+Pym · · Score: 2

    The obvious thing for spammers to do is hire lots of third world labor to start marking legitimate web sites as spam. This will mess up Google's data collection and render this useless.

  33. It's not just Link Farms by Phoenix666 · · Score: 2

    It's the indiscriminate use of Adwords and the Search-Based Keyword Tool (SBKT) to siphon lots of traffic that really isn't relevant to your goods or services.

    For example, yesterday I was searching for stamp-sized LCD screens to incorporate into some hobby projects of mine. I. could. not. get. anything. but. Amazon.com. They wanted to sell me watches or personal DVD players or anything but what I was looking for. This has been happening with every search for information for the past month.

    Google needs to really tighten up their advertising policies, because their search engine is teetering on the event horizon of uselessness.

    --
    Do what you can, with what you have, where you are.