In Some Places, Local Search Beating Google
babooo404 points out Newsweek coverage of Google focusing on areas in which the search giant may be vulnerable. In some countries outside the US, local competition is handing Google its head. In South Korea a company called Naver dominates. And in Russia, portal site Yandex leads in both search and advertising. In the Cyrillic language market Google is a distant third in search, and Yandex is trouncing Google in the advertising arena by 70% to 2%.
Funny how some people treat everything "Google" as if it were special. It would be newsworthy *if* Google were beating local search engines in foreign markets.
"Thanks for all the money you paid to us. We've used it to buy off ISO among other things" -Microsoft
Still, Yandex is unbelievable crap, results-quality wise. I'd say the top three rank in reverse order on that measure. But the problem, I think, apart from advertising (Y had a rather big ad campaign some time ago), is that Google seriously dropped the ball and showed huge negligence and ignorance by entering the local market unprepared. For example, their engine did not even search for different word forms, and Russian of course has an ultra-developed system of word endings. So, at first, Google was 99% useless. Plus, Y had been around the longest and most people simply don't care about switching.
Perhaps in the West, we often assume that Google is the only player in town worth using.
It would be interesting to get the view of someone in South Korea, for instance, as to how useful Google is to them when compared with local/regional alternatives?
It's more than likely that Google is far too orientated around the West, both culturally and in terms of results.
A slashdotting - you get the stick first and then the carrot !
Google searches you! Oh wait...
How does Google handle all the various extended character sets out there? Can you search in Cyrillic, Chinese or even French?
I have excellent Karma and I am not afraid to Troll it.
Same reason I walk to work every day... because all damn car companies are controlled by the damn greedy stockholders.
It's 20 miles but I make it work because I'm so self-righteous.
The most would-be-shocking fact is that more than half of non-technical people don't even know what Google is (for example, my mom). In contrast, I find most of my non-technical friends have naver.com as their start page in IE. In Korea, it's quite common to see TV commercials say "search XYZ in Naver" instead of displaying a URL.
The biggest reason is that Naver actually hosts content, rather than just indexing it. Not only is Naver a strong search engine company, it also hosts a vast number of blogs, forums, an online game site (Hangame), a user-provided knowledge base, plus licensed third-party content (such as dictionaries, public transportation routes, news provided by other media, etc.). All of this content is off-limits to robots (via robots.txt), which means Google can't even index it. Thus, no matter how great Google's search algorithm is, it will be almost impossible to match Naver's quality.
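To make the robots.txt point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `portal.example` and `open.example` hosts, and the `crawler_may_fetch` helper are assumptions made up for illustration; they are not Naver's actual rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a portal that shuts out every crawler.
CLOSED_PORTAL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt for a site that lets every crawler in.
OPEN_SITE = """\
User-agent: *
Disallow:
"""


def crawler_may_fetch(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(crawler_may_fetch(CLOSED_PORTAL, "Googlebot", "http://portal.example/blog/post-1"))
    print(crawler_may_fetch(OPEN_SITE, "Googlebot", "http://open.example/blog/post-1"))
```

A well-behaved crawler runs a check like this before fetching a page, so content behind a blanket `Disallow: /` never reaches its index, however good the ranking is.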
Plus, running a homepage *that looks cool* is a very complicated job for a person who isn't tech-savvy. So they don't get web hosting; they upload content to the big portals. I've even seen many small businesses skip having a homepage and instead keep a blog, user-created forum, or whatever on every major player. That makes it much easier for normal users to reach them (since memorizing a URL written in a non-native language would be painful), and cheaper (near zero) to maintain.
Another downside of Google is that it DISPLAYS English search results, which would be useless to them. Yes, people are too lazy to select the 'Search for Korean content only' option.
In terms of actual users, I believe Google would fall even further behind (far behind 10th place), since there is another big portal, cyworld (http://cyworld.com/), which provides personal blogging services and web-based communities.
I use many different searching methods
- Naver or Yahoo for local information (public transport route, looking for a place for a nice dinner, etc.)
- Wikipedia for something that's expected to exist in an encyclopedia
- danawa.com and enuri.com for searching best deals (equivalent to PriceGrabber or whatsoever)
- Naver for anything else in Korean
- Google for everything else, or if all the methods above don't give a good enough result.
As a result, I get to use Google less and Wikipedia more, while Naver and everything else remain somewhat constant.
I am a student from Korea, so I know Korean websites very well. Naver gained popularity by providing a human-generated search engine and user-generated content, such as its imitation of Yahoo's answers page. But there is no good search engine that supports Korean on the face of this planet. At least European languages share a common alphabet, which is why Google holds a significant share in Europe. But Korean is just different from English. When I search the internet in Korean, neither Google nor Naver returns reliable results. There is no search engine that supports basic functions like spelling correction, either. (Say you type "Koreea" into Google, and it will ask whether you meant "Korea".) Web portals and search engines in Korea are more like very well organized catalogs with useful advertisements. There is a long way to go in developing a web search engine for Korean. In fact, some progress has been made. Until the new technology is finally built into their websites, they are just going to be good yellow pages with lots of ads. The funny thing is that when I use Google I do my best to ignore all the ads, but when I use Naver, I only look at the ads. Funnier still, most scholars in Korea use Google when searching in Korean, because it has a simpler interface.
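For readers who haven't seen how a "did you mean" suggestion can work, here is a minimal sketch using Python's standard `difflib`. The vocabulary list, the 0.8 similarity cutoff, and the `did_you_mean` name are assumptions for illustration; real engines rely on query logs and language-specific models, which is exactly the part the comment says is missing for Korean.

```python
import difflib

# Hypothetical vocabulary of known search terms, chosen for illustration.
KNOWN_TERMS = ["korea", "google", "naver", "yandex"]


def did_you_mean(query, vocabulary=KNOWN_TERMS, cutoff=0.8):
    """Suggest the closest known term, or None if the query is already
    known or nothing in the vocabulary is similar enough."""
    word = query.lower()
    if word in vocabulary:
        return None  # spelled correctly, nothing to suggest
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    print(did_you_mean("Koreea"))
    print(did_you_mean("Korea"))
```

Character-level similarity like this is a reasonable first guess for alphabetic text. Whether it carries over to Korean depends on how the text is segmented and compared, which this sketch does not attempt.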
Transferring links around isn't the hard part. The hard part is to actually get something that's relevant for that search string.
Just simple lists of keywords associated with that link won't do. We already had that kind of search engine long before Google, and there's a reason why Google handed their arse to them.
And then there are the people gaming the system for a quick profit... even if it means ruining a valuable resource for everyone else. There was a near epidemic of link spam on all possible forums and blogs, for example, just to raise the Google rank of a couple of pages.
Most of Google's uphill battle so far has been tweaking the algorithm to defend against such "attacks".
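To illustrate why link spam pays off, here is a minimal sketch of a link-based ranking along the lines of the published PageRank idea. It is not Google's actual production algorithm; the page names, the damping factor of 0.85, and the fixed iteration count are all assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical web: nobody links to "shop".
    honest = {"news": ["blog"], "blog": ["news"], "shop": ["news"]}
    # Same web after five spammed forum pages each link to "shop".
    spammed = dict(honest)
    for i in range(5):
        spammed[f"forum{i}"] = ["shop"]
    print(pagerank(honest)["shop"], pagerank(spammed)["shop"])
```

In this toy graph the spammed links raise the shop's score even though the forum pages themselves have almost no standing, which is the weakness a ranking algorithm has to keep defending against.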
(And now that I mention it, it dawns on me that maybe that's why smaller national engines can do better locally. With everyone trying to game Google and the larger English-reading world generally, it could be that no one bothered polluting the smaller national searches.)
So just being able to swap links around won't do much.
The second and third problems I see with your idea are, well:
1. timing. When I search for something, I'd rather not depend on the right people being online at that exact time. I also want the answer in half a second. Google does that with in-RAM indexes. I wouldn't bet a fortune on someone doing that equally fast via several hops over the net, P2P style.
2. reliability. P2P traffic has been poisoned repeatedly by interested parties, like, say, the RIAA and MPAA. And it's entirely trivial to do so. So what's to keep other interested parties from poisoning P2P search with falsely tagged links?
Even on Google, it's not entirely rare for someone to buy AdWords keywords on their competitors' trademarks or such. E.g., if you have a company called, say, "Houndwire", I could buy that keyword for an ad for my company. Now everyone who searches for your company will have my ad served to them. Then I keep my fingers crossed that, if I'm in roughly the same market, some people will just go ahead and buy from me. There have even been laws proposed against that kind of impersonation.
Now for adwords it's one thing, but the same could just as well be applied to poisoning a P2P search. Which could ruin its usefulness pretty fast.
A polar bear is a cartesian bear after a coordinate transform.
Nope. It has been fairly easy to work with foreign currency in Russia since the early '90s. Yandex was simply MUCH better than Google because Google did not support Russian morphology until very recently.
...
For example, if I'm searching for information about, say, the name of Putin's dog, I can use the following search query:
"Imja sobaki Putina" - (the name of Putin's dog) and Yandex can find documents with the words
"Imena sobak Putina" - (the names of Putin's dogs - note the plural) or documents with the words
"Imen sobak Putina" - ([about] the names of Putin's dogs)
"Imena sobakam Putina" - another grammar case.
Russian morphology is MUCH MUCH more complex than English morphology. Yandex started working on morphological search in 1996, so it's not surprising that it's still much better at it than Google.
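The difference between exact matching and morphological matching can be sketched in a few lines. This is only a toy: the lemma table is hand-written from the transliterated forms in the example above (with "sobaka" and "putin" assumed as base forms), the unrelated fourth document is made up, and a real engine such as Yandex would use a full morphological analyzer rather than a lookup table.

```python
# Toy lemma table: each transliterated word form maps to an assumed base form.
LEMMAS = {
    "imja": "imja", "imena": "imja", "imen": "imja",
    "sobaki": "sobaka", "sobak": "sobaka", "sobakam": "sobaka",
    "putina": "putin",
}


def exact_terms(text):
    """Lowercased words exactly as written."""
    return set(text.lower().split())


def lemma_terms(text):
    """Lowercased words reduced to base forms; unknown words stay as-is."""
    return {LEMMAS.get(word, word) for word in text.lower().split()}


def search(query, documents, terms=lemma_terms):
    """Return the documents that contain every query term."""
    wanted = terms(query)
    return [doc for doc in documents if wanted <= terms(doc)]


if __name__ == "__main__":
    docs = [
        "Imena sobak Putina",
        "Imen sobak Putina",
        "Imena sobakam Putina",
        "Pogoda v Moskve",  # hypothetical unrelated document
    ]
    print(search("Imja sobaki Putina", docs))                     # morphological
    print(search("Imja sobaki Putina", docs, terms=exact_terms))  # exact words
```

With exact word matching the query finds none of the three relevant documents, because not one of them contains "imja" or "sobaki" verbatim; after normalizing to base forms it finds all three.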
Interesting point... Never thought about that but it makes a lot of sense.
It is a matter of approach to morphology actually.
IIRC Google's approach to morphology as a whole is to throw brute-force statistical analysis at it. They use statistical models and loads of data for translation. This works wonders with languages like English, which have more exceptions than grammar rules while having fairly rigid sentence ordering and a relatively limited common vocabulary.
Russian is very difficult to subject to this approach. Because it underwent a forced language reform at the turn of the 20th century, Russian grammar can be expressed in less than 10 pages of strict rules with around 30-40 exceptions. This grammar used to be drilled with a vengeance in Russian schools, so it has not changed a bit since it was formulated 100 years ago.
While the rules are strict (and relatively easy), the meaning of many key grammar elements is position-dependent. To add insult to injury, it has one of the largest working day-to-day vocabularies, and there are probably more ways to say the same thing than in any other language (I mean proper Russian, not "Na huja zhe tebe eto nado blad'").
So no wonder an analytical model is more successful than a statistical one. Thanks for pointing it out.
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/