Digging Holes in Google
Kurt LoVerde writes "Though Google has become synonymous with searching, the folks over at MSN have written up an interesting article on our favorite search engine's pitfalls. Included among these are a tendency to skew results toward shopping, a lack of diversity for searches containing synonyms, and its impact on research."
Those are some pretty weak allegations.
The gist of the article is that if you give Google a single, common word as a search term, the results may not be as precise as you want. For instance, if you want the nutritional content of an apple, and you put "apple" into Google, you're going to get a bunch of hits for things that don't have what you're looking for.
I'm sure a lot of you are saying "duh" right now.
I read the internet for the articles.
With the size and complexity of the Internet as we know it, single word search terms like "apple" are completely stupid. I think the reporter was just screwing around with Google and noticed that the publishing deadline was approaching. Sure, there are some unique words that make sense to use as a single term search, but anyone who has used a search engine for more than 3 seconds knows to qualify the search somehow.
As far as shopping results, that's the character of the web today. Lots of commercial interests. It takes money to maintain a web presence, no matter what Geocities tells you. Google is just presenting you with what it's got, really.
Finally, a lot more papers are published than books. It's not surprising that you don't get a lot more hits on book-printed resources.
This is more interesting as a statement on what the Internet has become, rather than what Google might be showing you while filtering other things out.
Yet their claim is weakened by the fact that if I enter "flower research," suddenly I see very, very little related to shopping and instead get the research I'm seeking.
It all depends on the search scheme. If the claim that Google is so heavily weighted towards marketing and shopping were true, then "flower research" would have led me to buy flowers.
I would also note that "flowers" on MSN.com returns:
The next comment I write will be ready soon, but subscribers can beat the rush and see it early!
I don't think that's a flaw; it just makes good sense. For their example, most of the people searching for flowers are looking for emergency flowers to send to their GF or mother. If someone wants to research flowers, they should probably search for botany instead.
Googlehole No. 2: Skewed Synonyms. Search for "apple" on Google, and you have to troll through a couple pages of results before you get anything not directly related to Apple Computer--and it's a page promoting a public TV show called Newton's Apple. After that it's all Mac-related links until Fiona Apple's home page.
Again, I think this is more a result of what people tend to be looking for when searching for Apple. I would imagine that most people querying Google with the single keyword "Apple" are looking for the company. The average user wouldn't have a reason to search Google for fruit. A one-keyword query is not good enough if you want to criticize a search engine; a search for Apple and Fruit will get you everything you need to know about the non-computer apples. If you want to buy fresh apples, perhaps you should search for Fruit Store?
So, when you're doing research online, Google is implicitly pushing you toward information stored in articles and away from information stored in books.
Hasn't the web been doing that for years? Is this somehow Google's fault? If publishers want to have the full text of their books available on the web for free, I'm sure the folks at Google would be happy to spider them.
He [Steven Johnson] is kind of right: if you try looking for information about motherboards, you'll first have to wade through all the sites that try to sell you one instead of offering a review of the specific motherboard you asked about. Google does that if you don't use it the right way. I always add "-buy" to my query, which helps sometimes. Read the comments below the article; they're interesting too.
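For anyone who wants to script the "-buy" trick, here is a minimal sketch in Python. The helper names `build_query` and `search_url` are made up for illustration, and the base URL is an assumption you can swap for whatever engine you use. It only builds the query string; it does not fetch results.

```python
from urllib.parse import urlencode


def build_query(terms, exclude=()):
    """Join search terms and prefix each excluded word with '-'."""
    parts = list(terms) + ["-" + word for word in exclude]
    return " ".join(parts)


def search_url(terms, exclude=(), base="https://www.google.com/search"):
    """Build a search URL; `base` is an assumed default, not a requirement."""
    return base + "?" + urlencode({"q": build_query(terms, exclude)})


# Ask for motherboard reviews while excluding pages that mention "buy".
print(build_query(["motherboard", "review"], exclude=["buy"]))

# A qualified query instead of the bare word "apple".
print(search_url(["apple", "fruit", "nutrition"]))
```

The same idea covers the "apple" complaint: adding qualifying words narrows the results, and the minus prefix drops the ones you don't want.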
And by the way, Steven Johnson who writes the Slate column was right most of the time when he was criticising George W. Bush and the war in Iraq, so cut him some slack, he deserves it big time.
Plain and simple FUD.
Given that, as many people here have already pointed out, Microsoft is readying/improving its own search offering, I think it's pretty plain that this is just an attempt by Slate/MSN/Microsoft to smear Google, using journalism or op/ed to do so.
Google isn't biased, as the article tries to make the case; the _web_ is biased, toward the technical (and unfortunately, towards blogs). So those will, of course, show up first. People don't publish complete books online, but they publish papers and articles in droves. So, of course you're going to be pointed to that stuff first.
And frankly, anyone who types "apple" into a search engine should know that they're going to get MANY very BROAD results. You need to be specific in your search. The more specific you are, the better results you're going to get.
Ed R.Zahurak
You know, oblivion keeps looking better every day.
Strangely, The Same Damn Search on search.msn.com returns much the same results. Mostly online florists, with about.msn.com and encarta.msn.com links thrown in for good measure.
Before you publish the report bashing your competition, at least try to see how your own product compares.
--
So... yes, articles published in PDF format will be indexed, but if one is doing real research, one is probably conducting a comprehensive literature search (e.g., if one is a PhD). For a PhD, a growing volume of new data is being published online, but there are still important corpora of offline literature, both old and new.
If one is doing "research" on how to buy a new car, or "research" for one's fifth grade homework project, I suspect that PDF files are probably just fine as a source and that comprehensive literature searches are not necessary (but might still be useful).
The article states "Google is implicitly pushing you toward information stored in articles and away from information stored in books." More relevantly and accurately (and obviously), Google is pushing you towards information that is stored online. If one uses Google for research, one should understand that it is not the only tool available. If one uses Google as the only tool, well...
I think this is a vaguely interesting point that might have a lasting impression on the way online content is indexed/stored/made searchable. However, the more relevant issue here is that individuals need to learn how to search (as many have already pointed out in comments). Search tools must be understood in the context of the other tools available, and a sense of the data to be found must be developed (it does not need to be known in advance).
I also assume that the Amazon text searching of books story might put another spin on this.
Ever tried searching for Linux on MSN? Oddly enough, the first link you get is to amazon.com and mentions "buying linux" -- the second seems to be alright, and the third is funny altogether: 3. Alternatives to Linux-Apache-MySQL-PHP Learn about the Microsoft alternatives and how to move to them from open source products. www.microsoft.com/serviceproviders/migration
...
Let's look at a more subtle aspect through:
Is this verification that Google is vulnerable to astroturfing? If you assume that half of all web pages with the term "apple" are talking about the computer company and the other half are referring to the fruit, then it seems like a search for the term "apple" should bring up about equal numbers of computer & fruit hits. The fact that most top hits are about the company instead of the fruit probably suggests that at least some of the "ballot stuffing" tricks that companies try to bring up their ranking are effective, even against Google's famed efforts to avoid being astroturfed.
This example is probably bogus -- the computer company seems to be more popular than the fruit, or at least there's more for internet users to say about it, so PageRank is probably doing its job well here. But in other cases, where the commercial alternative isn't as famous as Apple Computer but it still ranks higher in Google searches than non-commercial alternatives, that probably says something about astroturfing.
That or it just reiterates that the web went commercial a long time ago. Take your pick...
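To make the "equal page counts should mean equal hits" assumption concrete, here is a toy sketch of the textbook simplified PageRank (power iteration with a damping factor). It is not Google's production ranking, and every page name and link below is hypothetical. There are two company pages and two fruit pages, but the two blogs link mostly to the company.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new = {page: (1.0 - damping) / n for page in pages}
        for page, outs in links.items():
            if outs:
                # Split this page's rank evenly among its outbound links.
                share = damping * rank[page] / len(outs)
                for target in outs:
                    new[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new[target] += damping * rank[page] / n
        rank = new
    return rank


# Hypothetical toy web: equal numbers of company and fruit pages,
# but the inbound links from the blogs favor the company.
links = {
    "company_home": ["company_store"],
    "company_store": ["company_home"],
    "fruit_facts": ["fruit_recipes"],
    "fruit_recipes": ["fruit_facts"],
    "blog_a": ["company_home"],
    "blog_b": ["company_home", "fruit_facts"],
}

ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the company pages come out on top even though the page counts are equal, simply because more links point at them. Whether those links come from genuine interest or from ballot stuffing is something the arithmetic alone can't tell you.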