Objectively Comparing Competing Search Engines?
aendeuryu asks: "My default search engine of choice is, like most of you I assume, Google. That said, some complaints about Google over the years do seem to have some merit -- basically, that sometimes the indices aren't always updated, that it's too easy to manipulate via googlebombing or legislation, and that maybe too many of its featured services never get out of beta stage. Maybe the fact that Google has gone so long without significant competition is enough to make one at least begin to ask questions about it possibly becoming stagnant. Personally, I'm so used to doing things the Google way (and achieving acceptable results quickly) that I'm not really interested in switching -- case in point, all the above links referenced were quickly found via Google. However, what am I missing out on by not giving (for example) Yahoo search a shot? Or, more to the point, how would one go about trying to effectively and objectively compare competing search engines? In what areas have people found Google to have become obsolete for their purposes? Have less ignorant people than myself figured out ways to test a competing search engine's efficacy for themselves?"
If you know how to use Google to achieve your results, what's the issue? If a better search comes along, I'm sure it will be posted on Slashdot (twice), so you don't need to worry about missing out.
"Something's wrong with you...and I hope we never do meet again." - Deftones When Girls Telephone Boys
Personally I prefer Dogpile. I like the organization of results much better.
""I don't see an obvious biosynthetic pathway from allicin (CH2=CHCH2SS(=O)CH2CH=CH2)to isothiocyanates (R-N=C=S) ""
Yahoo search is okay, not as nice as google, but a good second.
Alltheweb.com has found things google hasn't, but in general I rarely use it.
I rarely use MSN because it was awful all the times I tried it. Same for Altavista.
In general, if I'm searching for something I'll use google first and then Yahoo and Alltheweb to catch anything that google may have missed.
God Bless America. Why? Did it sneeze?
...but I have to admit the AltaVista search engine for pictures is pretty nice. I use that when I want to search for pictures of a particular size for wallpaper.
This tagline is copyrighted material. Please send $10 for an affordable replacement.
Now sure enough Google has its faults, but I do still use it as my primary search engine. I do dislike Google's never-ending cookies, so I've blocked them, and my Google bookmark contains all my preferences. I've not really noticed any problems with Google's indices not being updated (except in the silly image search, and I don't really use that for any serious purposes). Having said that, I also do find Yahoo to be a very acceptable alternative. I should probably try it out more so as to see how they compare in greater detail.
Santa's suicide mission go!
I think you have said it already, Google is good for returning acceptable results quickly, but acceptability is something very subjective.
Even by comparing keyword search side by side, one can still consider a worse result better, but who's to judge except the user?
I kept using Yahoo until it stopped giving me results that I thought were good enough, then I switched to Google, and I'll keep using Google until it stops returning good enough results.
Rock that crushes, Paper & Scissors that don't matter.
Hello.
I have been browsing your internet site for several hours and am generally impressed with your coverage of IT related issues. However, when I saw an article on Google I just had to voice my opinion. I would just like to say how incredibly appalled I am with the Google internet search engine. My main concern with Google is how easy it makes it for malicious people to find information on the now illegal BitTorrent computer software.
Some background information on BitTorrent and what makes it so dangerous:
1. The BitTorrent computer software allows distribution of copyrighted material.
2. In doing so it inadvertently causes excessive use of bandwidth. Now you might say that this is fairly harmless, but is it really? The effects of electromagnetic radiation pollution caused by this cannot be underestimated. Just think of the millions of wired and wireless connections lighting up and emitting those deadly electromagnetic rays and all the innocent men, women and children being exposed to them.
Every BitTorrent user has blood on his (or her) hands. From this point on, I am boycotting Google and advise any person with a shred of decency to do so too.
Personally, I'm so used to doing things the Google way (and achieving acceptable results quickly) that I'm not really interested in switching -- case in point, all the above links referenced were quickly found via Google. However, what am I missing out on by not giving (for example) Yahoo search a shot?
I ask my wife the same thing. Honey, I'm used to doing things your way.. and I always get acceptable results from you.. but what am I missing out on by not giving (for example) Veronica a shot?
At least Google will never make you sleep on the couch, or give them half of all your assets. Hopefully.
https://www.eff.org/https-everywhere
I open my browser, and see the Google page up and running. I started with Yahoo, I tried meta search engines, altavista, a9, and many others, but I never change my home page to be the other ones. I know Google, I know how to use the results and to view pages all in HTML and to get the cache and to search sites that link to me, or search a specific site. It's easy in the other sites, but I already figured Google out. Google works for me, when I find the wrong thing, I just add "-wrongword" to the end and I find what I need. I see all the blogs and misindexed pages, but I've never really suffered from Google Bombing or any of the other problems that are mentioned.
Make your computer faster: rm -rf
Alternative search engines
-- Knowing too much can get you killed, but knowing who knows too much can make you rich.
Go to google and type in "better search engines"
This should give you an answer
When I am not getting satisfactory results using Google (about 30% of the time), I try Yahoo, and I usually find what I am looking for. If this keeps up, I might start my searches using Yahoo.
University of Washington
Student
Unfortunately, comparing search engines is a nearly impossible task, since they probably aren't indexing the same data.
When you measure a search technology, the values you typically look for are precision and recall. Precision says "of the X results you gave me, how many of them are relevant". Recall says "in the world, there were Y relevant pages you could have found; how many of them did you give me".
You can't measure recall for a public search engine, but you can measure precision. Take a set of sample queries, and some users. Have them perform the queries, go through the first ~100 results, and give each one a "thumbs up" (relevant) or "thumbs down" (not relevant).
Your overall score will measure precision: if at N=100, all 100 were relevant, that's 1.0. If only 50 were judged relevant, precision is 0.5.
You can estimate recall by judging, say, 1,000 documents (phew). Then sample precision at N=10, 100, 500, etc., assuming that is an "exhaustive" list of documents in the world.
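To make the thumbs-up/thumbs-down scoring concrete, here is a minimal Python sketch. The function names and the sample judgments are made up for illustration (they don't come from any real engine), and the recall figure relies on the same assumption as above: that the judged list is the whole "world" of documents.

```python
def precision_at_k(judgments, k):
    """Fraction of the first k results that were judged relevant.

    judgments: list of booleans in rank order (True = thumbs up).
    If fewer than k results were judged, only those are counted.
    """
    top = judgments[:k]
    if not top:
        return 0.0
    return sum(top) / len(top)


def estimated_recall_at_k(judgments, k):
    """Share of all relevant judged documents that appear in the first k.

    Assumes the judged list is an "exhaustive" list of documents.
    """
    total_relevant = sum(judgments)
    if total_relevant == 0:
        return 0.0
    return sum(judgments[:k]) / total_relevant


# Hypothetical judgments for one query, best-ranked result first.
sample = [True, True, False, True, False, False, True, False, False, False]

p5 = precision_at_k(sample, 5)          # 3 of the first 5 are relevant
p10 = precision_at_k(sample, 10)        # 4 of the first 10 are relevant
r5 = estimated_recall_at_k(sample, 5)   # 3 of the 4 relevant docs are in the top 5
```

Averaging these numbers over many queries and several judges would give a per-engine score, though it stays only as objective as the judges are.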
Try a metasearch and let the server figure it out.
It will cost ya.
Google Answers
Teoma has this great feature called Related search which is very useful. Basically if you look for a particular topic, the search engine identifies all related topics and offers you a one click access to all of them. Makes the search equally usable for both a rookie and a domain expert using the same search term.
One thing I like about askjeeves and a9.com is the way they present the search results. I think the next step is to improve on the presentation of the results (data) to make it more usable/accessible. Hit up askjeeves and run a search. The preview feature is pretty nice. And check out a9.com searches with their Site Info mouse-over.
Yeah! In fact, people shouldn't go to school, either. They should go out into the real world and find out everything for themselves, like how to read and write. Never give anyone advice, and if someone ever asks for advice, you should punch them in the face. Information should never be shared between people. Ever.
I hate to say it, but I think your quest to directly compare search engines "objectively" is pretty problematic.
Frankly, I think you're on the right track when you ask, "What am I missing out on by not giving Yahoo search a shot?"
Likewise, I think you're on the wrong track when you go on, "Or, more to the point, how would one go about trying to effectively and objectively compare competing search engines?"
Comparing the results of searches is necessarily subjective. Only that first question has a real answer.
RD
These types of issues are discussed ad infinitum at SEW.. particularly in the forums.
This is the dilemma for any centralized algorithm, as soon as you are number one you are exploited, thus relatively increasing the utility of as-of-yet unexploited competitors.
I got this from a friend who works at yahoo...
http://www.langreiter.com/exec/yahoo-vs-google.html
Sorry if it gets slashdotted.
You don't have to bother evaluating better web based technologies. When they are worth using others will tell you about them. It's the nature of the web.
For example, a professor of the university department in which I worked came back from Digital Research Labs, enthusing about a great new search algorithm the designers of Digital's Computer Aided Design software had come up with. A short time later Altavista was 'it'.
The same happened a few years later. The buzz from colleagues and those on the web was about a new search engine called Google.
The short answer is, "Don't go looking for the 'next search engine'. It will find you."
I love Wikipedia. I basically use it as my default search, unless I think that the question I have is non-encyclopedic. Acronymfinder for acronyms, Babelfish for translations, IMDb for movies, and well, for everything else I use Google. It has integrated everything else I need. Yes, it is subject to googlebombing and the like (I should know, I work for an SEO company), but it's *way* easier to "hack" Yahoo, MSN, AltaVista and others. Googlebombing is much harder (and Google therefore more reliable) than manipulating the others.
I _used_ to go to AltaVista every time I had a search that involved specific punctuation, usually some kind of coding question. Now I just get frustrated with Google while trying to find some related term I can add in that will give me the results I want.
This Space Intentionally Left Blank
- Simple interface, quickly loads.
- No graphical Ads
- Paid results are clearly ads and separated from real results.
That's it, that's why Google is king. Until Yahoo, MSN search, Ask Jeeves and the like get those three points, they will continue to be second fiddle.
I usually test search engines by typing in popular keywords that spammers generally go after, e.g.:
phentermine
home loans
poker
mesothelioma
viagra
miserable failure
Then look at the sites that rank at the top. It's very easy to tell which search engines are more susceptible to manipulation. A quick look at the backlinks for sites favorably ranking in those competitive keywords tells you how that SE is doing.
Here's my opinion on the race between Google, Yahoo & MSN. Google has more sites that are authorities in the top results, and Google penalizes over-optimization; however, extreme examples of over-optimization continue to show up in Google. Yahoo is a moderate success and does a fair job of filtering out spammy sites as well as authorities like Wikipedia - Wikipedia will always rise to the top in G but not in Y - and this is good for Y because you get more variety. MSN does an average job of filtering out blog spam, but new sites are too favorably ranked, and this is because MSN is new and has no recorded history of URLs. My personal preference is to use G simply because it loads the fastest in my browser... Maybe it's also worth pointing out that my company has several URLs ranked favorably in the terms listed above - looking at the change in rankings over time certainly helps give insight into which SE is better. MSN & Y are by far easier to manipulate than G, but G gives the most traffic.
Many people don't realize that Yahoo! has a scaled down (Google like) search interface which is actually pretty sweet: http://search.yahoo.com
Lately my Google results have been so Google bombed that I've been going back and forth between the two. I can't say for sure yet, but I may be in the middle of a bit of a personal transition.
Depending on what you're searching for, Google is often so front-loaded with dead-end advertiser links that its results aren't really worth much. Although it has to be said, it depends what type of a search user you are, and what types of things you're looking for.
Google is still the king of advanced search.
------ The best brain training is now totally free : )
I was looking through my website's logs and noticed a ton of MSN bot hits. Then I noticed one coming from their search page. The search term was "UTC+flash" and my site was listed third in the search results.
My site has nothing to do with UTC or Flash. Turns out, it indexed my lame little archive page that displays article dates in UTC format. One of the article titles was something like "Flash Storm," so it indexed the "UTC" portion of the previous article's date and the word "Flash" that began the next article's headline below it.
It was cool that I got a free hit for it, but my site was hardly a relevant search result for that query.
Nothing screams objective like this article displaying the Google logo.
I personally think Microsoft's sandbox search engine front-end is pretty nifty.
Too bad the search results aren't nearly as up to par as google's results (in my opinion)
http://start.com/1
WTPOUAWYHTTOTWPA
What's the point of using acronyms when you have to type out the whole phrase anyways?
I know how to use Windows to achieve the necessary results better than the Mac or Linux. Does that mean I should never try to use the Mac or Linux? Does that mean that I won't achieve better results if I learn to properly use the Mac or Linux?
-Daniel
I've stuck with Google for a while, but I used to do surveys pretty often. My approach was to start preparing a couple of days in advance, by keeping notes about things I was searching for. Then I'd take three or four of them, usually the ones that I'd had the most trouble refining, and try them out on a bunch of search engines. For each, I'd keep track of how many searches I had to do and how many junk pages I had to get through before I could get to something useful on that subject. It usually became clear pretty quickly which search engines were allowing me to make efficient use of my time and which were wasting my time.
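For anyone who wants to keep that kind of tally in something better than a notepad, here is a rough Python sketch of the bookkeeping. The engine names and numbers are placeholders, not real measurements, and the record layout (engine, searches needed, junk pages waded through) is just one assumed way of writing down the notes described above.

```python
from collections import defaultdict


def effort_by_engine(trials):
    """Average searches and junk pages per topic, grouped by engine.

    trials: iterable of (engine, searches_needed, junk_pages_seen) tuples,
    one tuple per topic tried on that engine.
    Returns {engine: (avg_searches, avg_junk_pages)}.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # searches, junk pages, topic count
    for engine, searches, junk_pages in trials:
        entry = totals[engine]
        entry[0] += searches
        entry[1] += junk_pages
        entry[2] += 1
    return {
        engine: (searches / count, junk / count)
        for engine, (searches, junk, count) in totals.items()
    }


# Placeholder notes: two topics tried on each of two hypothetical engines.
notes = [
    ("engine_a", 2, 5),
    ("engine_a", 4, 9),
    ("engine_b", 1, 3),
    ("engine_b", 3, 1),
]

averages = effort_by_engine(notes)
```

Lower is better on both numbers. With only three or four topics the averages are noisy, so this is more a way of organizing impressions than a benchmark.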
Another thing you might want to do is check out some of the newer "clustering" or "concept map" search engines such as Vivisimo or Kartoo, to see whether they suit your searching style better. They're really quite different from the search engines we've gotten used to, so the metrics I just described don't quite work for them. That doesn't mean they're better or worse - just different.
Slashdot - News for Herds. Stuff that Splatters.
I remember having to walk uphill in the snow both ways to the mailbox to mail my google queries in!
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff
Since I tend to have to do some SEO for sites, I tend to keep an eye on how search results are returned... one thing I've noticed is that Yahoo seems far more easily manipulated by URLs - ie, it seems to weight something like, "www.goats.com/goats" high for the term "goats" even when the site has little or nothing to do with goats.
Also, Yahoo and MSN both seem extremely poor about figuring out the "right" url to link to. It's almost as if they index the first thing on any domain they come across, instead of trying to figure out where on the site most people link to, so you'll often find yourself deep-linked into a site where you'd prefer to be looking at a higher-level page to start. Google deeplinks too, but it seems to be only when it's really more relevant to the content.
I don't use a9 much, but it seems like google with a different skin. I swear sometimes they're snarfing google's results and storing them. Not that this is all bad, since Google's results tend to be some of the best, but it's still eerie.
Apparently Ask Jeeves recommends MSN search.
Try this: yahoo! vs. google
Surprisingly, I still use Ask Jeeves (www.ask.com) for things - and find it finds things that Google has completely missed!
I guess you have to use a combination of several to really find everything you want - though Google by far is the best one.
Looking through the logs for my website, I see Googlebot visiting nearly every day, followed (recently) by MSNBot. (Actually, in raw count, I'm seeing that MSNBot has just recently surpassed Googlebot in number of requests. Would need to do some in-depth analysis to see if those are requests for the same thing over & over, but in raw requests...) I pretty much never see anything from Yahoo cataloging my site.
What's weird I'm noticing is that I don't see anything from something like a Yahoo bot at http://klomdark.servebeer.com:443/analog/report.html#browsum, but Yahoo is giving more traffic (http://klomdark.servebeer.com:443/analog/report.html#refsite) than MSN.
Google still leads, however. I wonder where Yahoo is getting its data, unless it's from a crawl previous to fall 2003, as I'm not tracking logs from that far back. Strange.
Because relevance is more complex. There are a number of additional considerations:
1) HOW relevant is a page, and is that page more highly ranked? It doesn't do me any good to have 99 slightly relevant results and 1 highly relevant result, if that one is at the end. So you have to measure how relevant the page is, and how high it appears in the search, and weight that.
2) The ability to find the correct page. Sometimes it's not that you are looking for general information on a topic, there's a specific page you want. However you don't know the URL or how to get there. Maybe you saw it once and have a vague memory, maybe you just heard about it, whatever. In this case, it's a question of how quickly the engine gets you the correct answer, both in terms of how high it's ranked, and how many search variations you have to try.
3) Along those lines, the ability to deal with degraded input. Sometimes it's as simple as a spelling error, but sometimes it's the searcher misunderstanding their own question. They don't know precisely what they want. Maybe because they only have a vague idea, maybe because the term they remember for it isn't quite right, whatever. So how well can the search engine figure out what they really want and find that?
So there's lots of things like that to consider as well when you are using a general purpose web search engine. Really only personal experience can tell you if one works well for you at finding what you want.
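Points 1 and 2 can at least be put into numbers. The sketch below uses two standard information-retrieval measures that the comment above doesn't name: normalized discounted cumulative gain (nDCG), which weights each result's relevance grade by how high it ranks, and reciprocal rank, which scores how high one specific wanted page shows up. The grades and URLs are invented for the example; assigning the grades is still a personal judgment.

```python
import math


def dcg(grades):
    """Discounted cumulative gain: each grade divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(grades, start=1))


def ndcg(grades):
    """DCG divided by the DCG of the best possible ordering of the same grades."""
    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0


def reciprocal_rank(results, target):
    """1/rank of the one page you were looking for, or 0.0 if it never appears."""
    for rank, url in enumerate(results, start=1):
        if url == target:
            return 1.0 / rank
    return 0.0


# Invented grades (0 = irrelevant, 3 = highly relevant), in rank order.
buried = [1, 1, 3]      # the highly relevant page comes last
on_top = [3, 1, 1]      # same pages, best one first

# Invented result list for a "find that one page" search.
hits = ["example.org/a", "example.org/b", "example.org/wanted"]
rr = reciprocal_rank(hits, "example.org/wanted")
```

Here `ndcg(on_top)` comes out higher than `ndcg(buried)`, which is the "one highly relevant result at the end doesn't help me" complaint expressed as a score. Point 3, coping with degraded input, is harder to score this way, since it depends on what the searcher actually meant.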
The other day I needed to know, for obscure reasons, the number of heroin addicts in Dublin. This is the kind of info that you know is probably on the web, but is going to be hard to find with Google.
I used BrainBoost - "How many heroin addicts are there in Dublin?", and, bam, first line of the result - "There are 13,000 heroin addicts in Dublin."
That's damn impressive. Out of curiosity I tried to see if I could find the same info with Google - it was fairly tough. Took three or four searches, eventually resorting to
which is a fairly specialized search that average users probably wouldn't be able to construct. The BrainBoost search, on the other hand, was completely natural; my grandma could have done it. So, thumbs up for BrainBoost for question answering.
Still, it's not the kind of thing you'll want every day. For day-to-day search, Google is the tool, but BB is worth a look.
For some time now, Search Engine Watch has provided a good editorial and comparison on various search engines. They focus on marketing topics, but also tend to talk a lot about the underlying technology, etc.
A recent roundup of engines is at http://searchenginewatch.com/links/article.php/2156221.
Google DOES NOT spider dynamically created webpages. If you have, say for example, forums... it will spider only the first page. Yahoo, however, will spider the dynamic content [though with a limit to ensure it doesn't get caught in a bot trap].
The Peanut Gallery, Ubergeek, Biblically Sober
NCAAbbs.com: Thousands of fans, Hundreds of teams, Just one place
BB gave me several good locations to score some China White, but Google's beta Junkie Search performed remarkably well. Thanks again googlasdlghoaeu...
Risking karma for a little laugh.
I'm so glad someone raised this. I was thinking just yesterday that the internet was *seeming* to have become smaller. The linking pagerank system google uses is strange IMHO because not all pages are massively linked.. or have reason to be linked.. it turns the net into some kind of boys club...
With that said, I have found that my more obscure and better quality sites have been found on the last pages of google, with the first few pages being generally filled with amazon and other *for sale* sites...
There are sites I've seen that have been around for years and don't even get a mention on google. Word of mouth was the only way I found out about them. I also remember how much the internet opened up when I first used the "stumble to" firefox extension. Who knew these sites even existed?!
I'd say the google solution would be to somehow incorporate a similar "word of mouth" type ranking system as "stumble to" (or slashdot for that matter), so individual users can rank results "useful/not useful" to modify page ranks... Also their "similar page" section would also benefit from a "useful/not useful" to help google learn similarities...
my 2c
'plex
Rich Gentlemen Hide - The Existential Comic
No single search engine had won out so you had a bank of search engines that you always scrolled through. What one engine didn't have another would.
Well a hell of a lot of those "old" search engines are still around! And they have become better over time. Google at one time was so much nicer than the others that people sort of got "lazy" and stopped browsing around the engines. But everyone else didn't just curl up and die.
So just start engine hopping again. Try Google first if you must, but then try Yahoo, search.msn, alltheweb or search.com or other meta search engines that search all the real search engines for you.
Multiple sources of info have always been and always will be better than one giant conglomerate of info such as Google is becoming.
Contrary to popular belief, coding is not all free blow-jobs and beer. Those things cost MONEY!
And the winner is Yahoo.
While I don't doubt that BrainBoost works, typing heroin addict dublin into Google gets me "There are 13,000 heroin addicts in Dublin" on the first page of results.
Engines of that kind are indeed nice. Still, they have their own oddities. For laughs, I tried to ask the system whether the moon is made of cheese.
It turns out that the moon is indeed made of cheese!
"is moon made of cheese?"
"The Moon is Made of Cheese"
I guess it will still take some time before that kind of search engine becomes more popular than the traditional ones.
Interestingly, you also get the same result if you actually do a Google search for your original question: "How many heroin addicts are there in Dublin?" In the summary of the first result: "... There are an estimated 13000 heroin users in the Dublin area. ..." I'll give you the benefit of the doubt here and assume that, as you have supposed, these results have changed since you did your searches.
I have actually found searching for a plain English question to work in a number of other instances, as well. Not always, but sometimes.
You probably shouldn't click this.
keywords: microsoft sucks
:)
Google == 658,000 hits
Hotbot == 136,000 hits
AltaVista == 1,350,000 hits
search.msn.com == 1,957,101 hits
keywords:apple sucks
Google == 750,000 hits
Hotbot == 139,000 hits
Altavista == 1,540,000 hits
search.msn.com == 2,415,023 hits
keywords:linux sucks
Google == 620,000 hits
Hotbot == 117,000 hits
Altavista == 1,110,000 hits
search.msn.com == 1,828,755 hits
So there you have it. To break it down:
- msn HATES apple but would use linux before windows.
- Altavista prefers Linux but would use windows before using a mac
- Hotbot was afraid to take a stance.
- Google clearly thinks apple sucks the worst and linux the least.
This is about as objective as you can get
Join the Slashcott! Feb 10 thru Feb 17!
Bad name, good search results:
Clusty, aka, Vivisimo: http://clusty.com/
This one has succeeded when Google has failed.
WWW
(Slightly premature announcement coming up.. but hey - it's Open Source so that's okay, right?)
I've just started a (Java) project to interface to a number of search engines. It might be a good place to start if you feel like doing some coding. See https://argos.dev.java.net/ - there is no release yet but the code is in CVS.
It currently supports Blogdigger, Feedster, Del.icio.us, Google, MSN and Yahoo (and Google Desktop search). I'd like to include Ask.com, too, but they don't provide a programmatic interface and I refuse to screen-scrape.
In my opinion none of the other search engines are close to Google in quality of results. I've found (to my surprise) that Ask.com gives me the second best results (they bought the old Teoma search engine, which was always okay. It had an index almost the size of Google's, which neither MSN nor Yahoo can match yet.)