Slashdot Mirror


Digging Holes in Google

Kurt LoVerde writes "Though Google has become synonymous with searching, the folks over at MSN have written up an interesting article on our favorite search engine's pitfalls. Included among these are a tendency to skew results toward shopping, a lack of diversity for searches containing synonyms, and its impact on research."

33 of 644 comments

  1. Maybe check your search results again... by Anonymous Coward · · Score: 0, Interesting

    A search for Apple brings up a full page of Apple computer related links (The article says not one apple link appears on the first page)

  2. Pretty weak by jandrese · · Score: 5, Interesting

    Those are some pretty weak allegations.

    The gist of the article is that if you give Google a one-word (common) search term, the results may not be as precise as you want. For instance, if you want the nutritional content of an apple, and you put "apple" into Google, you're going to get a bunch of hits for things that don't have what you're looking for.

    I'm sure a lot of you are saying "duh" right now.

    --

    I read the internet for the articles.
  3. A stupid article by Mwongozi · · Score: 2, Interesting
    As an example, they claim that doing a search for apple doesn't reveal much about the fruit.

    Well, of course it doesn't! A search for apples, however, is much more useful.

    This is just a case of user error, nothing more.

  4. Try better search terms maybe? by Dielectric · · Score: 4, Interesting

    With the size and complexity of the Internet as we know it, single word search terms like "apple" are completely stupid. I think the reporter was just screwing around with Google and noticed that the publishing deadline was approaching. Sure, there are some unique words that make sense to use as a single term search, but anyone who has used a search engine for more than 3 seconds knows to qualify the search somehow.

    As far as shopping results, that's the character of the web today. Lots of commercial interests. It takes money to maintain a web presence, no matter what Geocities tells you. Google is just presenting you with what it's got, really.

    Finally, a lot more papers are published than books. It's not surprising that you don't get a lot more hits on book-printed resources.

    This is more interesting as a statement on what the Internet has become, rather than what Google might be showing you while filtering other things out.

  5. Bah by Anonymous Coward · · Score: 1, Interesting

    I don't find Google has a tendency to direct me towards shopping at all.

    What I do hate, though, is when I search for a band's lyrics and Google gives me a few dozen fan pages before the official (and I'd say a lot more relevant) page ever shows up. This is really annoying when you don't know if the band (or anything else for that matter) has an official page (sometimes it's some idiotic combination like www.band-something-something.pick a tld).

  6. The author is a moron...Or a 2 year old by akiaki007 · · Score: 1, Interesting

    Point 1: Flowers. Answer: Instead of typing "tulips" when you really want gardening tips, try "tulips tips" and you will get what you want.

    Point 2: Apple. Answer: Instead of typing "apple," which is a very common word and product and name, try typing what you actually want, "Apple Computer."

    Point 3: If you're looking for a book. Try a library. That is where people put books. That is how people make money on books.

    Now, given that your points are completely stupid, I feel as if your article was meant for a 2 year old that doesn't know the difference between "apple" and "aldkfja." That's because they can't type or read yet.

    So, Google is brilliant if you *actually* supply it with what you are looking for. I guarantee you that when I want information about a company which has the same name as a common fruit, I will be a little more specific.

    This was the most useless article I've wasted my time reading.

    --
    "Time is long and life is short, so begin to live while you still can." -EV
  7. Re:Convenient Timing by ChannelX · · Score: 2, Interesting

    It's quite possible you're correct. However, the problems they discuss in the linked article are quite real and frustrating. I've run into each of the issues they talk about multiple times and it can be highly annoying. It just goes to prove that their system isn't necessarily the best. Sure, it's great to give the highest rank to the sites that are linked to the most, but as we've seen, that isn't always the best way to compute results.

    --
    My blog: http://jkratz.dyndns.org/~jason/blog/
  8. Depends on search scheme by piecewise · · Score: 4, Interesting
    I'm sure confident in the validity of this article. Yes, they're right: I entered "flowers" and most of what I saw would lead me to buying flowers.

    Yet their claim is weakened by the fact that if I enter "flower research," suddenly I see very, very little related to shopping, but instead to the research I'm seeking.

    It all depends on the search scheme. If the claim that Google is so heavily weighted towards marketing and shopping were true, then "flower research" would have led me to buy flowers.

    I would also note that "flowers" on MSN.com returns:

    • Proflowers.com (shopping)
    • 1-800-Flowers.com (shopping)
    • Flower.com (shopping)
    • Bulb and Flower Gardening
    • Find a Florist Near You (shopping)
    • Flowers Delivered Worldwide
    • 1-800-FLOWERS.COM (Shopping)
    • FTD Flowers (shopping)
    • FTD Flowers & Gifts (shopping)
    --
    The next comment I write will be ready soon, but subscribers can beat the rush and see it early!
  9. Before we start bashing by CodeShark · · Score: 2, Interesting
    The article is at Slate on MSN, and points out not that Google is weak, but that there is so much data online that in essence the index needs more indexing, aka more keywords with which to eliminate pages you don't really want to see.

    Think about it, folks. If you walked into something like a WalMart supercenter, went to the service desk, and said "tell me where I can find some nuts", the answer would be different if you are looking for peanuts, automotive hardware, home hardware, or ??.

    For example, instead of using Apple, or even Apple Newton, searching with "Apple Newton PBS" still comes up with a couple paid links back to Apple Computer but most links point to the right place, referring to the PBS series.

    And in terms of only indexing PDF articles well, I have news for the folks at Slate... there aren't that many complete books out there in PDF that would be as useful to researchers as the PDFs of the articles that make up the scholarly journals themselves. So again, this perceived weakness in Google is a problem in the broad-brush arena, not in reality.

    The bigger problem for most small websites and Google is building up credibility among a wider network of links in the first place, which is where the quality of the information and its presentation are key. Repeat after me -- there is no shortcut to success on the WWW. Build something worthless, remain in obscurity. Build something good that has value, and we -- via Google, Yahoo, or whatever search engine you like -- will eventually come.

    --
    ...Open Source isn't the only answer -- but it's almost always a better value than the alternatives...
  10. Re:Convenient Timing by jazman_777 · · Score: 2, Interesting
    How nice of an impartial journalistic source to pick holes in Google which are almost certainly specific areas which Microsoft has chosen to optimise.

    This is insightful? It's rather naive, the idea that there is an impartial journalistic source out there somewhere. The ones you certainly _can't_ rely on are the ones that claim impartiality. With the ones that own up to their biases, you can easily apply filters.

    There's a certain honesty you get when someone is not in the "I'm impartial" delusion, even when they're wrong about something, or you disagree with them.

    --
    Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
  11. lousy examples by asv108 · · Score: 4, Interesting
    Google's top results skew very heavily toward stores, and away from general information. Search for "flowers," and more than 90 percent of the top results are online florists.

    I don't think that's a flaw; it just makes good sense for their example. Most of the people searching for flowers are looking for emergency flowers to send to their GF or mother. If someone wants to research flowers, they should probably search for Botany?

    Googlehole No. 2: Skewed Synonyms. Search for "apple" on Google, and you have to troll through a couple pages of results before you get anything not directly related to Apple Computer--and it's a page promoting a public TV show called Newton's Apple. After that it's all Mac-related links until Fiona Apple's home page.

    Again, I think this is more a result of what people tend to be looking for when searching for Apple. I would imagine that most people querying Google using the single keyword "Apple" would be looking for the company. The average user wouldn't have a reason to search Google for fruit. Using a one-keyword query is not good enough if you want to criticize a search engine; a search for Apple and Fruit will get you everything you need to know about the non-computer apples. If you want to buy fresh apples, perhaps you should search for Fruit Store?

    So, when you're doing research online, Google is implicitly pushing you toward information stored in articles and away from information stored in books.

    Hasn't the web been doing that for years? Is this somehow google's fault? If publishers want to have the full text of their books available on the web for free, I'm sure the folks at Google would be happy to spider them.

  12. Stupid article by elviscious · · Score: 2, Interesting

    It seems like most of the author's complaints revolve around Google's search not reading his mind for exactly what he is searching for. He searches for 'apple' and it returns a bunch of articles about apples, apple growers, etc... but it's a long time before it returns something about 'Apple computers', or 'Fiona Apple'. Well, maybe this jackass should have typed in 'Apple computers', or 'Fiona Apple' if he was interested in finding websites about them. What an idiot.

  13. How about neither? by sacremon · · Score: 2, Interesting

    I still use Google quite a bit, but when Google gives me a mess that's hard to parse with subsearches, I go to turbo10.com. It's a metasearch engine with clustering of topics, much like Northern Light had. It often gives me relevant links faster than Google does.

    --
    If you can't beat them, embrace and extend them.
  14. No search engine is the end all / competition is good by acomj · · Score: 2, Interesting

    Google is a very good search engine. But it still doesn't always get exactly what I want. (I have a URL from a coworker that is a GREAT description of UDP multicast that doesn't seem to be in Google's top 50, and the page is significantly better than any of Google's top hits.)

    He found it with a different search engine. (teoma.com?/ about.com?). He uses more than one engine depending on what he's doing (He does use google too)

    What I'm getting at is that competition is good. It forces companies to make better products because they know if they don't, others are going to try too.

    Other companies are working really hard at getting a better search engine. Don't expect Google to be on top forever, because although Slashdot readers love Google, they'll leave it quickly if something better comes along (remember altavista/hotbot/webcrawler etc.).

    In the end everyone wins.

  15. Re:MSN hates shopping by TheRoachMan · · Score: 4, Interesting

    He[Steven Johnson]'s kind of right, if you try looking for information about motherboards, you'll first have to wade through all the sites that try to sell you one instead of offering a review of the specific motherboard you asked about. Google does that if you don't use it the right way. I always add "-buy" to my query, which helps sometimes. Read the comments below the article, they're interesting too.
    And by the way, Steven Johnson who writes the Slate column was right most of the time when he was criticising George W. Bush and the war in Iraq, so cut him some slack, he deserves it big time.
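
    The "-buy" trick above can be sketched as a toy filter. This is a minimal illustration only, assuming a simple whitespace tokenizer and an in-memory list of pages; the names `parse_query`, `matches`, `search`, and `PAGES` are hypothetical and say nothing about how Google actually processes queries.

```python
def parse_query(query):
    """Split a query into required terms and excluded terms (prefixed with '-')."""
    required, excluded = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])  # drop the leading '-'
        else:
            required.append(token)
    return required, excluded


def matches(page_text, required, excluded):
    """A page matches if it has every required word and no excluded word."""
    words = set(page_text.lower().split())
    return all(t in words for t in required) and not any(t in words for t in excluded)


def search(pages, query):
    """Return the titles of matching pages, in their original order."""
    required, excluded = parse_query(query)
    return [title for title, text in pages if matches(text, required, excluded)]


# Hypothetical two-page "index" for illustration.
PAGES = [
    ("Shop A", "buy the best motherboard online today"),
    ("Review B", "motherboard review with benchmarks"),
]

print(search(PAGES, "motherboard"))
print(search(PAGES, "motherboard -buy"))
```

    With the excluded term added, the page containing "buy" drops out and only the review remains, which is the effect the comment describes.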

  16. This is FUD. by TrebleJunkie · · Score: 4, Interesting

    Plain and simple FUD.

    Given that, as many people here have already pointed out, Microsoft is readying/improving its own search offering, I think it's pretty plain that this is just an attempt by Slate/MSN/Microsoft to smear Google, using journalism or op/ed to do so.

    Google isn't biased, as the article tries to make the case; the _web_ is biased, toward the technical (and unfortunately, towards blogs). So those will, of course, show up first. People don't publish complete books online, but they publish papers and articles in droves. So, of course you're going to be pointed to that stuff first.

    And frankly, anyone who types in "apple" into a search engine should know that they're going to get MANY very BROAD results. You need to be specific in your search. The more specific you are, the better results you're going to get.

    --

    Ed R.Zahurak

    You know, oblivion keeps looking better every day.

  17. Re:Write doesn't know how to search by JUSTONEMORELATTE · · Score: 3, Interesting

    Strangely, The Same Damn Search on search.msn.com returns much the same results. Mostly online florists, with about.msn.com and encarta.msn.com links thrown in for good measure.
    Before you publish the report bashing your competition, at least try to see how your own product compares.

  18. what's interesting to me ... by cascadingstylesheet · · Score: 2, Interesting

    ... is how some people, even smart people, don't or can't get the hang of search engines.

    You know, the ones on newsgroups and mailing lists who say "anyone know of a good BLAH?". Then someone whaps them with a cluestick (or rather, a Google link).

    There just seem to be brain types or personality types that don't get it. Here are the rules I try to impart:

    1. The total number of results means *nothing*, unless you like statistics. Nobody is asking you to look at 56,000,000 results, so stop complaining about it.
    2. You may actually need to look over a page or two of results, actually look at the titles and page summaries, and decide whether they are relevant. It's no harder than looking at the spines and covers of books. Really.
    3. If the first page or two of results aren't relevant, you may need to qualify your search. It is usually as easy as adding a word.

    But there are still some, like the author of the article, on whom any of this is lost.

  19. Too many shopping hits? Have you *seen* the web? by cgc · · Score: 2, Interesting

    Honestly, think about it. A very significant portion of websites out there are trying to sell you something. (Just check out that banner at the top of this page.) So if you estimate (low, probably) that 50% of all websites are shopping sites, then there's a good chance that for a search on any given topic, around 50% of the results will be trying to sell you what you're looking for.

    This isn't Google's fault; it's the nature of the web today.

  20. Shopping Results by DeadBugs · · Score: 2, Interesting

    Actually I will use google to find places to buy obscure items. If it did not return shopping sites I would lose this valuable search feature. If none of the major online retailers have what I am looking for, I just type it into google.

    --
    http://www.kubuntu.org/
  21. subtle effect on research? by tcyun · · Score: 4, Interesting

    So... yes, articles published in PDF format will be indexed, but if one is doing real research, one is probably conducting a comprehensive literature search (e.g., if one is a PhD). If one is a PhD, a growing volume of new data will be published online, but there are still important corpora of offline literature, both old and new.

    If one is doing "research" on how to buy a new car, or "research" for one's fifth grade homework project, I suspect that PDF files are probably just fine as a source and that comprehensive literature searches are not necessary (but might still be useful).

    The article states "Google is implicitly pushing you toward information stored in articles and away from information stored in books." More relevantly and accurately (and obviously), Google is pushing you towards information that is stored online. If one uses Google for research, one should understand that it is not the only tool available. If one uses Google as the only tool, well...

    I think this is a vaguely interesting point that might have a lasting impression on the way online content is indexed/stored/made searchable. However, the more relevant issue here is that individuals need to learn how to search (as many have already pointed out in comments), search tools must be understood in the context of available tools and a sense of the data to be found must be developed (it does not need to be known in advance).

    I also assume that the Amazon text searching of books story might put another spin on this.

  22. It's an article about skew, not a search How-To by arrogance · · Score: 2, Interesting
    What a surprise, the Slashdot crowd looking with disdain at something that got posted to MSN. There seem to be about 15 comments already about what an idiot the author is.

    While the point that more refined searches give you better results is true, that's not what the author's talking about. He's trying to tell you that Google is an aggregation of zeitgeist and how many links things have (link interdependence, which is Google's strength, also adds its own bias), and not necessarily their relevance to the 'real' world. An understanding of how Google might skew results is useful.

    Here's his site: read the July 16th article. "You can make things less than equal by doing more refined searches, but that doesn't mean the skew isn't important."

  23. Using tools by uohcicds · · Score: 2, Interesting

    I think this article is more than a little dumb. Effectively it's like saying that if you drive a car for 20 years without topping up the oil and the car fails, then it's the car's fault.

    No. It's a tool, just like Google is a tool. To gain the greatest utility from a tool, you must learn how to use the tool properly. In this case it means not being utterly stupid and having at least some idea what you're going to search for.

    Let's take point 2 of the article as an example. Who is going to type apple into Google in the expectation of not getting 13.6 million hits? If you're going to search, what kind of apple are you looking for? Golden Delicious, Cox, whatever. This is the user's problem. If you're going to use a search engine, at least have some vague idea of what you want to look for before you start. However, in using the web as a reference, one pitfall is the principle of provenance. PageRank is not enough in itself. I'm not entirely sure how the PageRank algorithm works (and I'm fairly certain we're not going to get someone from Google telling us either ;-) ) but it would be nice to allow for authorities to be defined, so that in the case of academic content, if an item appears in a certain source (like ACM journals, for example) then it is given higher weighting by the engine.
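
    The "authorities" idea can be sketched roughly as a multiplier applied to whatever base score the engine already computes. This is not PageRank and not a description of any real engine; the boost table, the base scores, and the `rerank` function are all hypothetical values chosen for illustration.

```python
# Hypothetical table of trusted sources and how much to boost them.
AUTHORITY_BOOST = {"acm.org": 2.0}


def rerank(results, boosts=AUTHORITY_BOOST):
    """Sort (title, domain, base_score) results by base_score times the source boost."""

    def boosted_score(result):
        _title, domain, base_score = result
        # Sources not listed as authorities keep their base score unchanged.
        return base_score * boosts.get(domain, 1.0)

    ranked = sorted(results, key=boosted_score, reverse=True)
    return [title for title, _domain, _score in ranked]


# Made-up base scores: the blog post starts out ahead of the journal paper.
RESULTS = [
    ("Blog post", "example.com", 0.6),
    ("Journal paper", "acm.org", 0.4),
]

print(rerank(RESULTS))
print(rerank(RESULTS, boosts={}))
```

    With the boost, the journal paper (0.4 × 2.0) overtakes the blog post (0.6); with an empty boost table the original order is kept. Who gets to define the authority list is, of course, the hard part.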

    In addition, point 3 in the slate article is flawed. Google doesn't divert people from books. It is not the tool's fault if people do not choose to publish in that way. Also, if the New York Times choose to fence off their content, it's hardly Google's fault that it can't spider the stuff. Talk to the NY Times and ask them why they use the registration system in that way.

    Let's just make this clear. Google spiders what is generally available. It is not the fault of Google if content providers wish to publish in ways that may limit the scope of viewing. Google isn't perfect by any means, but this article doesn't really say anything useful. And hey, don't MSN run an engine of their own too? What possible gain could they have from rubbishing Google? [Cynical? Me? How could you possibly think that...]

    --
    It's not you: I'm just this horrifically socially awkward with everybody.
  24. Top 3 Search Results for "Linux" at search.msn.com by Meniconi,Nando · · Score: 2, Interesting

    1. Amazon.com: Buy Linux software at the Amazon.com software store. www.amazon.com
    2. Introducing Linux: Find the latest news and information on this operating system. tech.msn.com
    3. Alternatives to Linux-Apache-MySQL-PHP: Learn about the Microsoft alternatives and how to move to them from open source products. www.microsoft.com/serviceproviders/migration

    'nough said.

  25. Has anyone searched for Linux? by TD_3G · · Score: 3, Interesting

    Ever tried searching for Linux on MSN? Oddly enough, the first link you get is to amazon.com and mentions "buying linux" -- the second seems to be alright, and the third is funny altogether: 3. Alternatives to Linux-Apache-MySQL-PHP Learn about the Microsoft alternatives and how to move to them from open source products. www.microsoft.com/serviceproviders/migration

    --
    ...
  26. Missed the real lesson by babbage · · Score: 4, Interesting
    As an army of astute Slashdot users has already chimed in, of course the conclusion is bogus: [a] if you enter generic terms in a search engine, you shouldn't be surprised to get back generic results, and [b] seeing as MSN is setting themselves up to be a competitor to Google, their analysis can hardly be considered unbiased.

    Let's look at a more subtle aspect though:

    Google's top results skew very heavily toward stores, and away from general information. Search for "flowers," and more than 90 percent of the top results are online florists.

    Is this verification that Google is vulnerable to astroturfing? If you assume that half of all web pages with the term "apple" are talking about the computer company and the other half are referring to the fruit, then it seems like a search for the term "apple" should bring up about equal numbers of computer & fruit hits. The fact that most top hits are about the company instead of the fruit probably suggests that at least some of the "ballot stuffing" tricks that companies try to bring up their ranking are effective, even against Google's famed efforts to avoid being astroturfed.

    This example is probably bogus -- the computer company seems to be more popular than the fruit, or at least there's more for internet users to say about it, so PageRank is probably doing its job well here. But in other cases, where the commercial alternative isn't as famous as Apple Computer but it still ranks higher in Google searches than non-commercial alternatives, that probably says something about astroturfing.

    That or it just reiterates that the web went commercial a long time ago. Take your pick...

  27. The sad part of this: by Jerk+City+Troll · · Score: 2, Interesting

    Unfortunately, most of MSN's readers are very unlikely to understand why this article is nonsense. They will never read the reasoned counterpoints expressed in this article thread. Nor will they ever question it.

    This article is exactly what the layperson craves. It's controversial and it makes some sense if you fail to do any deep thinking. The masses are going to gobble it all up (even when MSN is a demonstration of what this article complains about: example) and look to other sources to save them from the newly created Google menace. To them, Google is now not only a bad search engine, it is also damaging the future of our species by negatively impacting research (*gasp!*).

    Of course, this is how all Microsoft FUD plays out. It doesn't fool any of us, but it certainly fools most of them.

  28. Why is this a surprise? by 192939495969798999 · · Score: 2, Interesting

    Isn't like 98% of what people are actually looking for "shopping-related"? I would venture "yes". The simple fact is that while Google delivers lots of shopping-related stuff, it also delivers the real meat to anyone willing to think of the "right" words to enter. It's not that hard really, and anyone who thought that entering "apple" in a search engine would bring up their momma's apple orchard home page before Apple Computer has a lot to learn about the internet.

    --
    stuff |
  29. MSN can KMA by paiute · · Score: 2, Interesting

    Boo freaking hoo. Google isn't perfect. Whyever would MSN be interested in making sure we know it?

    This reminds me of creationists pointing out gaps in our knowledge of evolutionary biology and concluding that lack of perfection in science proves that they are right.

    --
    If Slashdot were chemistry it would look like this:Cadaverine
  30. Re:MSN hates shopping by The_K4 · · Score: 2, Interesting

    Maybe it's because people use the web for shopping. I would bet that the average Joe user uses the net for:
    #1: E-mail
    #2: Shopping
    Is it wrong that google's results would reflect this?

  31. Re:MSN hates shopping by Farley+Mullet · · Score: 2, Interesting

    So, the key is to understand the nature of the tool you're using, including its drawbacks. If you know what you're doing, you can manipulate the keywords you search on to reduce the number of shopping and weblog links you get. And hey, if the drawbacks are too severe for you, nobody's forcing you to use google, so use something else!

    .
  32. Re:Better yet... by circusnews · · Score: 2, Interesting

    Now try asking your grandma what terms she would use when looking for the information on an apple.

    While you are correct that google is only an index, would it really be that hard for it to ask if they were looking for apple computers, apple records, the fruit or other, and refine the search for the user?

  33. Re:What is wrong with this picture... by pjp6259 · · Score: 2, Interesting

    A lot of people have been complaining that this article is biased against Google, because Microsoft is going to be launching their own search engine soon.

    The main complaint seems to be that: "This guy is an idiot, he only uses 1 word search terms and doesn't know enough to use the negation mark (-)"

    Look, I know all of us are smart enough to add a couple terms or use the negation mark to narrow our search results, but I used to be in the search engine business, and at the time (about two years ago) I remember reading a study saying the average query length at a search engine was something like 1.4 words long. This sounded kind of absurd to me, because I almost never have a query less than 3 words long, but you have to remember how non-technical people think. They want something they can just punch in a word or couple of words and get the answer to their question.

    As for a better solution, the company I was working for did automatic categorization of any query you typed in, so if you typed in "apple" it might come back with these categories:

    home & garden
    computers
    music.

    Then you could choose a category you were interested in, and it would only return results that were relevant to that category. Personally I think that was a great solution, and I'm surprised more search engines don't have something like that.
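
    The category step described above might look something like the sketch below. It assumes each result already carries a category label; how that label gets assigned is the hard part and is not shown. The function name `group_by_category` and the sample `RESULTS` are hypothetical, not the actual product.

```python
from collections import defaultdict


def group_by_category(results):
    """Group (title, category) results so the user can pick one category."""
    groups = defaultdict(list)
    for title, category in results:
        groups[category].append(title)
    return dict(groups)


# Made-up results for the query "apple", each tagged with a category.
RESULTS = [
    ("Apple Computer", "computers"),
    ("Growing apple trees", "home & garden"),
    ("Fiona Apple", "music"),
    ("Mac rumors", "computers"),
]

groups = group_by_category(RESULTS)
print(list(groups))          # the categories offered to the user
print(groups["computers"])   # results shown after the user picks "computers"
```

    The user first sees only the category names, then the results for the one they pick, so a one-word query still ends up narrowed without the user having to think of extra search terms.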

    Stop thinking that just because you can use a product that it is fit for mass consumption. I love google, it's by far the best search engine out there, but that doesn't mean it can't be improved upon. And in the computer world, improved upon often times means made easier to use, so that your average proletarian can make it work. This is the same attitude that will see Linux used by only the 5% of the population that's geeky enough to figure it out.

    --
    Computers don't make mistakes. What they do, they do on purpose.