The Un-Google - The Search Competition
WinEveryGame writes "The Economist is running an article on the state of the competition for Internet Search. While Google clearly dominates, and continues to have positive momentum, its leadership is still vulnerable. The search-engine battle is not over yet." From the article: "In terms of momentum — mass times velocity — Google's lead indeed looks daunting. It has by far the most mass, with an American market share of 43% as of April, which reaches 50% counting AOL, an internet property that uses Google's search technology. This compares with 28% for Yahoo!; 13% for MSN, which belongs to Microsoft; and 6% for Ask, which is owned by IAC/Interactive Corp, a conglomerate of about 60 online media brands. Google also has velocity: its market share grew by 17% in the four quarters to this spring, whereas Yahoo! and MSN both lost share. Only Ask has more velocity — its share grew by 35% — but then again it has little mass."
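To put the "mass times velocity" figures side by side, here is a rough back-of-the-envelope sketch. It assumes that "grew by 17%" means relative growth over the year rather than percentage points, which the summary does not spell out; the helper name `points_gained` is made up for illustration.

```python
# Shares (%) and year-over-year growth as quoted in the summary.
# Assumption: growth is relative (17% more share than a year earlier),
# not a gain of 17 percentage points.
engines = {
    "Google": {"share": 43.0, "growth": 0.17},
    "Ask": {"share": 6.0, "growth": 0.35},
}


def points_gained(share, growth):
    """Share points gained over the year, given today's share and relative growth."""
    previous_share = share / (1.0 + growth)
    return share - previous_share


gains = {name: points_gained(e["share"], e["growth"]) for name, e in engines.items()}

for name, gained in gains.items():
    print(f"{name}: gained about {gained:.2f} share points")
```

Under that reading, Ask's higher growth rate still translates into far fewer share points than Google's, which is the "little mass" point the article makes.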
There are some customers (government/military included) that are aware of the two concepts of precision and recall. Before you groan and skip this post because you recall those words from all classifying algorithms, you should take note that there are two stages we have yet to meet in this respect.
One is simply improving precision without sacrificing recall. When I search for 'horn' in Google, how many of the results are relevant? I was thinking about a French horn (instrument) and the first link brings me to a society about them. The next three links, however, do not. You might say, "Well, gee, you should have put 'French' in your search" but is this really necessary? So there is some money to be made in "learning" search engines that tailor themselves to the user, or perhaps the results could be displayed intuitively in domains of knowledge (a la Clusty), so that I can select the node that matches the sense of the term I meant and see all results returned below that. Have you ever wished to view your search results in a format other than a linear display of ranked results? The documents are related in more than one dimension, you know. As computing power increases, I suspect there will be room to display them in two dimensions (heat/area mapping, nodes & edges on a plane) and three dimensions (spatial 3D engines with nodes & edges in space).
The second stage is giving the user the power to adjust precision versus recall. Even a graphical interface that shows the F-measure relationship between precision and recall would be helpful to consider in the search engine wars. Say you give the user some control through an AJAX slider for a threshold ß. But the threshold isn't simply the "Google score cut off" or even a term frequency cutoff. Instead, it's applied as a "relevance" threshold. You would score relevance by fingerprinting frequency, specificity, clustering and other useful signals, using a domain ontology or taxonomy.
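For anyone who hasn't met the F-measure, a minimal sketch of the trade-off follows. The formula is the standard F-beta score; everything else is an assumption for illustration: the sample relevance scores and labels are invented, `threshold` stands in for the slider value, and `beta` is the usual F-measure weight (below 1 favors precision, above 1 favors recall).

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def precision_recall(results, threshold):
    """results: list of (relevance_score, is_relevant) pairs.

    Everything scoring at or above the threshold counts as returned.
    """
    returned = [is_rel for score, is_rel in results if score >= threshold]
    total_relevant = sum(1 for _, is_rel in results if is_rel)
    hits = sum(1 for is_rel in returned if is_rel)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / total_relevant if total_relevant else 0.0
    return precision, recall


def best_threshold(results, beta):
    """Pick the score cutoff that maximizes F-beta for the given weight."""
    candidates = sorted({score for score, _ in results})
    return max(candidates, key=lambda t: f_beta(*precision_recall(results, t), beta))


# Made-up scored results: (relevance score, whether the user found it relevant).
results = [(0.9, True), (0.8, True), (0.6, False),
           (0.5, True), (0.3, False), (0.2, True)]

for beta in (0.5, 1.0, 2.0):
    t = best_threshold(results, beta)
    p, r = precision_recall(results, t)
    print(f"beta={beta}: threshold={t}, precision={p:.2f}, recall={r:.2f}")
```

A user who cares about precision ends up with a high cutoff and a short, clean list; a user who cares about recall ends up with a low cutoff and everything that might matter.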
Another big thing that is missing is identifying what kind of data you are searching. Social data? Scientific data? Historical data? etc. Perhaps I'm only interested in who's who to Stephen Hawking. I'd search for him and flip through nodes of separation from him to other people.
The current search sites also only tend to favor key-word regular expressions. What about searching with raw text or entire paragraphs? If you want to see an interesting demo of this, visit Collexis' Demo Site which alludes to a whole new kind of searching.
The key to entering the market as a competitor with Google is to pick up Google's slack and to try to position yourself as a complementary service to Google. Google is terrible at closed domain searches but amazingly efficient at open domain searches. You don't want to compete with them head-on, so fill a different part of the market. Google benefits from simple design, so go for an advanced, flashy, complex design. Most people aren't looking for that, but the people who are have nowhere to go.
The Economist is alluding to potential leadership problems inside Google. Who cares? That's not going to be Google's downfall. Google's downfall will be a new intuitive way to search, and the only thing that will prevent their downfall is if they buy out the company or bone up on the technology.
The search-engine battle hasn't even hit its stride.
My work here is dung.
It seems more and more that when I try to find something on Google, all I get is a bunch of link farms. This morning I was trying to find a bike jersey for a friend of mine, and the first page of results was nothing but link farms; it took getting to the second page to find any actual results. I did much better using Yahoo and found what I was looking for on the first page of results.
This is just one example, but it happens constantly...
Well, actually, almost all current search engines suck. There is waaaaaay too much noise in the results they return. Let's say I'm doing a search for "product X". Search for it in Google and what do you get? Several links to eBay (which may or may not be current), tons of links to various "rate it" sites such as epinions/nextag/msn/etc, and maybe a smattering of other sites mixed in. Typically the manufacturer's own site won't even appear in the first couple of dozen results!
So basically, I agree with the general position of the article, that there is still a TON (actually several tons) of work to be done and room for someone else to move in with a truly superior solution. While it's great that Google is tinkering with lots of other technologies, I wish they'd actually make some real advances in their core business (and actually, I'm slowly starting to come to grips with the fact that their core competency may not be searching, but really it's in creating low latency widely distributed computing infrastructures). For all the years and the massive sums of money, my search experience is not significantly better than it was 5 years ago.
I was on the phone with some engineers at MS the other day and even they admitted that they use Google. It's just better... for now.
http://religiousfreaks.com/
Does anyone know how they calculate these market share values? AFAIK they don't all publish traffic statistics.
Developers: We can use your help.
So Ask--which used to be called Ask Jeeves but dropped Jeeves, a knowing butler, from its logo in February--is taking a different tack. It has come up with ExpertRank, an algorithm that also ranks web pages by incoming links, but is different from Google's PageRank in that it first groups, or "clusters", pages and links by theme. So instead of using a web page's overall popularity to calculate its ranking, it finds the pages that are most popular among experts on a particular subject, a method that often returns better results than Google's. Ask also uses these thematic clusters to suggest the best ways to narrow or expand a search, a feature called "zoom" that is very popular.
Which is the trouble I have with Google; their search results are like a shotgun blast too many times, getting far too wide a spread of sites having anything at all to do with the subject I type in, instead of being more narrowly focused. The problem I see with Ask's method is just how do you define who the experts are and what field they are experts in? Web sites can contain all sorts of content and people will reproduce links at a whim, just because they like what they see. Would they use a system similar to Amazon, where people are ranked by how many people use their recommendation?
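To make the difference concrete, here is a toy sketch of counting inbound links overall versus only from pages in the same theme. This is not Ask's actual ExpertRank algorithm, whose details the article does not give; the page names, the link graph, and the `theme` labels are all invented, and deciding who belongs in a theme is exactly the open question raised above.

```python
from collections import Counter

# Hypothetical link graph: (source page, target page).
links = [
    ("orchestra-blog", "french-horn-guide"),
    ("brass-forum", "french-horn-guide"),
    ("car-parts", "car-horn-shop"),
    ("auto-news", "car-horn-shop"),
    ("link-directory", "car-horn-shop"),
]

# Hypothetical theme label for each linking page.
theme = {
    "orchestra-blog": "music",
    "brass-forum": "music",
    "car-parts": "autos",
    "auto-news": "autos",
    "link-directory": "misc",
}


def inbound_counts(links, allowed_sources=None):
    """Count inbound links per target, optionally only from allowed source pages."""
    counts = Counter()
    for source, target in links:
        if allowed_sources is None or source in allowed_sources:
            counts[target] += 1
    return counts


# Overall popularity: every link counts the same.
global_top = inbound_counts(links).most_common(1)[0][0]

# Themed popularity: only links from pages labeled "music" count.
music_sources = {page for page, label in theme.items() if label == "music"}
themed_top = inbound_counts(links, music_sources).most_common(1)[0][0]

print("Top result by overall links:", global_top)
print("Top result by music-themed links:", themed_top)
```

With overall counts the car horn shop wins a search for 'horn'; restricting the vote to music pages puts the French horn guide on top. The quality of the result depends entirely on how well the theme labels are assigned.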
GetOuttaMySpace - The Anti-Social Network
"This compares with 28% for Yahoo!; 13% for MSN, which belongs to Microsoft; and 6% for Ask, which is owned by IAC/Interactive Corp, a conglomerate of about 60 online media brands"
This isn't over, simply due to the lack of certainty about net neutrality. If media companies get leverage to control bandwidth to the big search companies (Google), it goes without saying that these figures will change significantly. For Google, it could be death by a thousand cuts...
I think your numbers are less representative than most, but even so, I find Google "only" having 50% to be strange. On our site and for June only: Google 75.5%, MSN 11.8%, Yahoo 4%, Kvasir 3.1%, Google (Images) 2%, Altavista 2%, everyone else 0.2% or less.
Since we are based in northern Europe, Kvasir (a Norwegian search engine) obviously has a much higher share here than on most other sites, but Google at 75% seems reasonable and matches my gut feeling.
The reason everyone still uses Google isn't because they have the best ranking of results anymore, which is usually encrusted with spam sites designed to beat Google, but because they have the most COMPLETE results. When you search for something rare, Google most likely will return results no other basic engine has. So people have gotten accustomed to checking Google first out of habit more than anything else.
To me, the future of search isn't necessarily a better Google, but something different. The problem with Google is the same as its strength: its simplicity. There is very little control on Google for more complicated searches, such as searching only company websites, or searching only encyclopedia content. It's just a big kludge for them to add stuff like travel info or weather or movie info without knowing the intent of the searcher beforehand. Searchers have to get savvier, not just the algorithms. I think search aggregating sites like Seaurch.com, which has 200 engines but still uses a simple interface, are a great idea. Sites like Clusty.com also take an interesting approach towards understanding the searcher's intentions.