Searching For Google's Successor
weink writes "A new generation of scrappy search engines is emerging to challenge the dominance of mighty Google. An article at Wired News lists up-and-coming search engines: WiseNut, Teoma, Lasoo, CURE, and Vivisimo. Take a look, and give them a try. But I still say that nothing is better than the almighty Google."
I hit google specifically LOOKING FOR NEWSGROUP discussions on the topic. Granted, I don't need 50 mirrored copies, but I definitely do want to see newsgroup archives indexed.
See, there's this thing called groups.google.com, and...
"And like that
No idea, but I also like their conversion of PDFs to text, and caching of the text.
I also love the cache because I can read sites that are 404'd. Great for digging up old specs on hardware.
The thing that has most impressed me about Google isn't its technology, but the restraint and good sense they've shown in the Internet community. While every other search engine has tried a go at the portal route, Google has focused on simply being a search engine. They've continued to add features that improve the user's experience at the same time other engines sell their results to the highest bidder.
Some of the most annoying companies in existence came about because they pulled a massive bait and switch: they adopted a consumer-friendly strategy for the short term but changed when they got big enough to destroy the competition. Google has done remarkably little of that despite their impressive potential marketing position. Companies like this are where our business should go; it is our power as consumers to make decisions like this.
My point is that if/when something better than Google comes along, you should think twice before changing your homepage. When choosing a company, it's not just about who provides the best product in the short term; you have to take the long term into account as well.
And in the screw users over for short-term gain and long term harm department... I mean, where are all the damn pop-ups/pop-unders and rich media ads that will crash my browser, make the page jerk around like a fat football center, and offer me a "new and improved" experience while they show the same damn ad so many times that I twitch when I see it.
it doesn't have a cache (something that I use almost all the time) and also happens to run on Windows.
---
How about a search engine that doesn't index 'rpmfind' mirrors and newsgroups so searches for linux related info turn up something more useful than 50 pages of rpmfind entries...
Ok, I agree with the rpmfind mirrors, but I have to disagree on the newsgroup issue. Usually when I'm really stuck on something (ie: Linux SMP box hanging under high network load (which makes backups a real bitch), forcing me to power cycle : flawed APIC handling for the 3c905 ethernet card), I hit google specifically LOOKING FOR NEWSGROUP discussions on the topic. Granted, I don't need 50 mirrored copies, but I definitely do want to see newsgroup archives indexed.
Mooniacs for iOS and Android
Sure, if I'm searching for something like 'how to set up my dvd drive on linux' I want a HOWTO (and I go to yahoo for that), but for more obscure things (like maybe 'how to set up my mpeg decoder card on linux') the newsgroup and mailing list archives are very useful.
That's one of the main features of google for me.
--
Stay tuned for some shock and awe coming right up after this messages!
Wisenut - seems to work as well as Google. I like it. Doesn't offer alternative spellings, though, and I can't ever spell Skylarov correctly first time :-) The results are harder to parse visually than Google.
:-) Hopefully, that's just a beta feature...
Teoma - needs to crawl a lot more before it becomes a viable alternative. Obviously it can find the easy stuff, but most people (I hope) don't use search engines to find the easy stuff. Results are easy to read, and categories meaningful and well placed. Phrase match is kinda cool, because you get to put back in your common words that Google disallows ("and", "the", etc).
Lasoo - lousy spelling looks terrible, even if it was intentional. Aside from that, what makes this different to Mapquest.com plus a Yellow Pages? I know which I'd rather use.
CURE - this search engine has reached its user limit so I'm not allowed to search. Boy, is that going to be popular
Vivisimo - is a metasearch engine, whatever the FAQ begs you to believe. If you like em, then sure, but speaking personally, they are of no particular use to me.
Google still rocks my world, with caching, fast fast oh so fast searching, and relevance that beats the crap out of everything ever. Rock on.
google still rules my world.
Lasoo doesn't load
Vivisimo plain sucks. Nasty interface. Long load times.
Wisenut isn't bad, but it certainly isn't good.
Teoma has promise, but the searches tend to take a long time on arcane subjects. No easily accessible advanced search functions.
I won't even begin going into CURE. How dare they slander the 80s dark pop/goth/electronic group with an interface that cheesy. Nix the graphics and bring up the friggin' search box without the glitz.
Thanks, but no thanks, guys.
Recent results. Google only seems to be getting updated once every couple of months. I know they must be pulling down a lot of data, but every other search engine seems to have more recent information than Google does. Anybody have any actual stats on Google's refresh?
- Cache: Means that we are able to visit a site after it's been slashdotted.
- Relevance: Google's "relevance technology" is great. Find related sites, and find only pages related to your query.
:-)
- Not only web pages: Google doesn't only search for web pages, but also PDF files and images. More search engines should have had features like that.
So what's bad about Google? AFAIK, nothing an ordinary user would know of. But their hardware is "wrong". Fast has developed a search engine called AllTheWeb. Their search engine is the best search engine after Google, but could easily (?) have been the best. Why? Well. They have developed special hardware to do their search. And it's damn fast (that's where they got the name, I guess). However, the software running on their hardware isn't as good as Google's, and I really wonder why...
My conclusion: The software Google is using should have run on AllTheWeb's hardware. That would have been one hell of a search engine.
No I don't like it, either...
"Search Engine" is no longer politically correct. We prefer "Exploratory Native American."
A man without a God is like a fish without a bicycle.
The first time someone told me about the great new web search "google" I immediately went to my computer, and spelled it correctly, or incorrectly, depending on how you look at it. Because www.googol.com is completely different from www.google.com
At least you didn't sit there and type in the hundred zeroes.
I would do it, but the lameness filter doesn't like it.
Promote proofreading. Don't mod up sloppy posts.
They need the AltaVista NEAR operator: foo NEAR bar.
'Intellectual Properties' are uncontrollable in the wild. To base an economy on them is just stupid.
is Citeseer. It's popular among researchers since you can directly peek into papers...
--
Error 500: Internal sig error
A similar problem I've found is that when I'm searching on using 'foo' and 'bar' together (for example, if they're two popular options in a software package), I'll tend to get a lot of hits from mailing list indexes. The page'll generally contain a link to a message about 'foo' and a separate link to a message about 'bar', and will be highly rated from all the other things linking to the index.
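The index-page problem described above is what a proximity operator like AltaVista's NEAR is meant to address. Here is a rough Python sketch of the idea, not how any engine actually implements it; the function name `near` and the default window of 10 words are my own assumptions.

```python
def near(text: str, a: str, b: str, window: int = 10) -> bool:
    """Return True if words a and b occur within `window` words of each other.

    Hypothetical helper: splits on whitespace only, so punctuation
    attached to a word will prevent a match.
    """
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    # Any pair of positions close enough counts as a hit.
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)
```

A mailing list index that links to one message about 'foo' at the top and another about 'bar' far down the page would fail this check, while a page that actually discusses the two together would pass.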
Like Christmas, Independence Day, etc. So cool :).
Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
what's up wit dat? A technology company that actually "gets it".
who would have thought?
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
You know that Wired is now owned by Lycos too, right? They were a package deal.
Ahem! We use more credible sources here, sonny! By its own definition, Google is spelt correctly ;)
Site slashdotted? Hit the cache
Want to see a dmoz.org directory? See it page ranked.
Doing science research? Find the answers in indexed PDF files.
And the list goes on...
Not to mention they do the right thing advertising wise, run on linux. Bring on the upstarts, but they'd better be prepared for a good bit of starting to knock down google.
(ie: Linux SMP box hanging under high network load (which makes backups a real bitch), forcing me to power cycle : flawed APIC handling for the 3c905 ethernet card),
Out of curiosity, did you find a fix for this? I think that may explain the odd lockup I get on my system that I haven't been able to pin down...
Well yeah wisenut could contend... if it wasn't running on IIS!! We all know how sensitive to worms IIS is, don't we. It would be neat to send a Code Red at wisenut and then reverse some of the search results through the root shell. I can think of ALL sorts of funny reversals/replacements.
...because, for once, a company made their way to the top by simply _having a stellar product_. When I first began using it I was shocked by how many orders of magnitude better than any other search engine it was. But to my surprise, everyone else realized it too, to the point that Google now completely dominates the search engine industry.
I do hope these other engines (many of which I've tried, and they ain't bad) offer up some competition, because a monopoly is bad even when the monopoly provider is so good. But in the meantime it's great to finally see a product succeeding so well based entirely upon its merit.
Yet another example of, "The difference between theory and practice is that there is no difference, in theory."
Yesterday, I found a new feature that I enjoy. Try typing 'link:' into the Google search. It tells you all the sites that link to that site.
I know if you own the site, you can check it out with an HTTP_REFERER, but that isn't always the case.
http://search.yahoo.com/bin/search?p=Apple+Assembly+Line
Compare the results to this search submitted to Google:
http://www.google.com/search?sourceid=navclient&q=Apple+Assembly+Line
(The first result is one of my pages. I made the rounds of several search engines a little while ago to check the page ranking. Yahoo is using Google's search results more or less unmodified.)
20 January 2017: the End of an Error.
Just ask jeeves... Duh.
-----
"The only difference between me and a madman is that I'm not mad." - Salvador Dali (1904-1989)
Anyone try http://www.alltheweb.com??
I do not know how it stacks up to google but I know that it is pretty darn fast.
you can actually use google to do site searches: just add site:yourdomain to the query. Or use the google toolbar and use the search site button.
You can also include a form on your site that does this for you and google can customize the search results page to match the layout of your site.
Jilles
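For anyone who wants to script the site: trick, here is a small Python sketch that builds such a query URL. The function name `site_search_url` is made up for illustration, and the URL shape is assumed from the www.google.com/search?q= links posted elsewhere in this thread; it is not an official API.

```python
from urllib.parse import urlencode


def site_search_url(terms: str, domain: str) -> str:
    """Build a Google query URL limited to one domain via the site: operator."""
    # Append the site: restriction to the user's search terms.
    query = f"{terms} site:{domain}"
    # urlencode takes care of spaces and the colon in the query string.
    return "http://www.google.com/search?" + urlencode({"q": query})
```

The form on your own site would do the same thing: take the visitor's terms, tack on site:yourdomain, and send them to the engine.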
It's a good start, but since we still don't get any hits for...
http://images.google.com/images?q=natalie+portman+grits
So it sounds like theoretically the NEAR operator should be unnecessary.
I wish that my inferiority complex were as good as yours.
-RenderHead
When another company plucks away Wired's pride and joy, they advertise the competition.
"Anonymous Coward" is for whistleblowers, not unpopular opinions.
http://images.google.com/images?q=natalie+portman
'nuff said.
-- "Complacency is a far more dangerous attitude than outrage." -Naomi Littlebear
Have any of these managed to solve the keyword problem yet? That is, if you can't think of the keyword that everyone else uses to describe the topic you are looking for, then you will have a very difficult time finding that information.
Even with Google, I find that my keywords don't always match what the indexed sites use. Often it takes three or four tries to get the right keywords that will get me useful information.
Teoma sounds promising, since getting one site in a topic group can get you more in that topic group.
And don't forget the infamous "I'm feeling lucky" button. All the fun and odds of a Vegas slot machine without the cost.
hell, i remember when "webcrawler" was the shiznat. anyone remember when "smarties.com" was the first porn site? now you can get your candies there
It would really help if everyone set the ID3 tags in their MP3s correctly. I don't think a single MP3 that I got off of Napster had all of the information correctly set (many didn't have any correctly set).
My only political goal is to see to it that no political party achieves its goals.
Would any of the new search engines be controlled by a different government ?
Since the search-engines are becoming our pointers to information, they do have a lot of control over what information we see. It doesn't matter that some web-server in malaysia has a web page describing the complete meaning of life, the universe and everything, if it's not in the search engines.
If all search engines are controlled by the same government (and yes oh yes, they are controlled) the web suddenly becomes biased.
Try searching for "marlboro" on google. What would you expect ? The marlboro home-page ? Oh, no; we have the Marlboro College, poems, but no tobacco company home page. Coincidence? Well, a search for IRIX gives me the SGI home page, so I think the search engine works as designed - what do you think?
I believe I read recently that only something like 30% of google's income came from advertising. The rest came from selling its searching capabilities to other search engines. I know I've read that Yahoo works to maintain their own categories while using Google for its web page matches.
Google does kickass, and I'm sure the guys that run it will continue to fine tune things so that it improves. But the truth is, we're already approaching the limit of what a search engine can do, and any gains will simply be the last 1/100 of that last percent.
Should we stop trying? No, the need for relevant results hasn't been fulfilled, except in the most minimal ways. But we need to look for new answers. I think that to take this any further, it will mean going client-side. To make results more relevant requires too much cpu power, to aggregate it at the engine website. A client side agent, using google as a starting point, and sifting through the results, spidering through them, makes sense. Don't start whining about traffic increase, the same thing happens now, only it's the person himself doing the spidering.
Also, the entire keyword paradigm is at odds with all but the most simplistic searches. Sometimes I'm looking for a diagram, or I'm looking to buy a hard to find part. Some engines, like Lycos, allow you to search for audio or stills, but it borders on lameness. This needs to be expanded. You need to be able to tell the engine, "hey I'm just looking for general info" or "hey I want to buy something with these parameters". For instance, the diagrams I look for can be either gif/jpeg or ascii art. A decent engine/agent should have no trouble returning results that reflect these requirements. Same with the "buying" type search: the electronic parts I'm looking for are not common items, and adding a keyword of "shopping cart" doesn't always cut it. As I see it, there are at least a few different types of searches that a person might make.
I want to buy this item (or a similar one)
I want to find info (of an encyclopedic nature)
I want to find leads about (I don't quite know what I'm looking for yet)
I want to hear news about...
I want to find this file/software (or a similar one)
I want to be entertained about/with...
These things all lend themselves perfectly to a client-side agent. Those websites that don't bother to tag images properly, and yet the image is just stylized text? An agent has the power to OCR it back to normal, something an engine could never hope to do. Get rid of all the mirrors? Google is better at this than any other engine, but can it compete with an agent that can recognize a text mirror of an html page, or vice versa? Or any of the other nifty little optimizations that aren't even obvious to me at the moment? Sure, there will be problems. I'm not sure Joe AOL will be able to accept that a proper search will take longer than it takes for a web page to load, but it still seems like the next killer app to me.
Hence, I refuse to use wisenut.
I didn't have this problem. Maybe you clicked twice?
I haven't compared it to google yet, but I'd say Lasoo has its place in my utility belt. After typing in my address, I was able to click on "Bars" and now I know exactly how far my house is from each of the nearest local pubs! The distances are in meters, so I'll have to only drink imported beer and crawl metrically -- which kind of makes sense since I won't be on my feet anyway.
I'm curious to see if any of these new search engines suffer from the /. effect.
A feeling of having made the same mistake before: Deja Foobar
I would prefer that the newsgroup messages not be indexed because it can clutter your results list if that is not what you are looking for. If you know that what you are looking for should be in newsgroups (e.g. it is a question you are looking for the answer to) you could look it up at Google Groups
All I wanted was a rock to wind a piece of string around, and I ended up with the biggest ball of twine in Minnesota
Of course, Google is now the only player in town for Usenet Searches since they bought Deja (and if they're reading this, I want them to bring back Deja's hierarchical nesting features...)
Lawrence Person (lawrencepersonh@gmailh.com (remove all "h"s to mail)
http://www.lawrenceperson.com/
WiseNut, Teoma and Vivisimo all are similar in that they put up pretty relevant categories for some searches. In fact, I'll say they do a better job than Google, from what I saw.
As an example, I did a search on "lisinopril", the generic name for a blood pressure medication I take.
Whereas google provided one "category" besides the search results, WiseNut provided 10 relevant categories to further break down my search (ACE Inhibitors, brand names, blood pressure, heart attacks, drug information, etc.).
Teoma provided 8 different categories, and vivisimo provided 11 categories and a "more" option for more categories.
Personally, I find this to be a nice feature of all three of these engines. As for relevancy of the information, that's really a hard thing to quantify.
Given the choice, though, I'm going to add WiseNut and Teoma to my list of search engines that I use. Beyond the features mentioned above, they took one good idea from Google and that's to keep the search screen sparse and uncluttered.
Just my humble opinion...
31337 H4x0r g00g13
Google in the language of "Bork, bork bork!"
Igpay Atinlay Ooglegay
Put quotes around the phrase, and prefix noise words with a plus sign, e.g. "number +of +the beast".
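If you build queries in a script, the quoting rule above is easy to automate. A minimal Python sketch follows; the noise-word list here is only a guess for illustration, not Google's actual list, and `phrase_query` is a made-up name.

```python
# Hypothetical noise-word list; the engine's real list is not public here.
NOISE_WORDS = {"a", "an", "and", "of", "or", "the"}


def phrase_query(phrase: str) -> str:
    """Quote a phrase and prefix each noise word with a plus sign."""
    marked = [
        "+" + word if word.lower() in NOISE_WORDS else word
        for word in phrase.split()
    ]
    # Wrap the whole thing in double quotes to request a phrase match.
    return '"' + " ".join(marked) + '"'
```

Calling phrase_query("number of the beast") gives back the same string as the example above.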
CEE5210S The signal SIGHUP was received.
What make Google so great is the fact that "google" just rolls off the tongue. Say it with me ... "goooogle".
Vivisimo is a bit hard to pronounce (and I almost spelled Visio).
[accent=British] "Teoma". That's a tinny word, don't you think?[accent off]
In all seriousness, naming choice is very important as you all know. If you can't remember the address, you won't go there. And don't say anything about bookmarks. I usually type in the URL of the sites I visit often.
"You like Chinese food." -Fortune Cookie
Most people don't use Google anyways, they just go straight to Ask Slashdot. :(
+5:offtopic,but anti-American
The College of New Jersey and Villanova University are working on a search engine called W.H.A.T. which uses AI to apply contexts to search results. The idea is that the user can somehow express the meaning of the target more fully than words alone do. Pretty interesting stuff. :-)
I'm biased as I worked on it for a year, though.
Sam
No viable successor yet.
SpammerQuery - The home addresses and personal phone numbers of spammers.
EinsteinExpress - When you absolutely, positively have to have next month's kernel patch yesterday...
SlashBot - The home addresses and personal phone numbers of FP'ers and goatse.cx linkers.
BootyCall - All porn all the ti... wait a second. We've got images.google.com for that! Sorry, my bad.
Yes, the cache can be invaluable at times. Anyone got any ideas as to how much space Google's cache takes up?
Very simple process, actually. I was able to quickly figure out that when you click, it zooms in, so click the "out" button to zoom out, then click on another region and zoom back in, or simply type in your street address.
Things just haven't been the same since they started taking advertisers' money. They've been shamelessly manipulating search results instead of keeping the engine honest.
Trolls throughout history:
Jonathan Swift
Teoma was discussed earlier on /.. The article featured in that posting was quite interesting in its own right and worth a close read, even if you don't go through the comments of the earlier post.
--CTH
--Got Lists? | Top 95 Star Wars Line
According to Wisenut's front page, it has more pages indexed than Google. Can this be true?
I just tried them all out, here are my 2 cents.
1) They all try to distinguish themselves by stating "we're not just another search engine...". Basically, they are.
2) Wisenut is by far the least bloated, and it shows in terms of speed.
3) Lasoo combines "white pages" with a web directory. Clever, but putting it all on one page is a bit overkill IMHO.
4) None of them is as configurable as google.
However, it will be nice to see how they develop. They all need an innovative feature though, something to make the switch from google worthwhile.
Oh yeah well... I remember when all we had was a stone carving.
Yep, we bought new network cards.
Another workaround (that hits performance, but fixes the problem) is to use the "NOAPIC" option at the boot prompt. Supposedly it's fixed in the alan cox kernel, but it doesn't seem to appear in the linus version's changelogs. It may be fixed in 2.4.8.
put quotes around the phrase.
This style of naming has the disadvantage that some OSes have path length limits. You also run into such limits when you burn the files to a CD. Consequently I prefer to have a directory for each artist and then one for each album of that artist. This way I can strip that information from the filename. The track number is essential for sorting the files in a playlist, though, so I leave that in (and even add it if it is missing).
Jilles
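The artist/album layout described above could be sketched in Python like this. The function name `track_path`, the two-digit track prefix, and the .mp3 extension are assumptions for illustration; adjust to taste.

```python
from pathlib import PurePosixPath


def track_path(artist: str, album: str, track_no: int, title: str) -> PurePosixPath:
    """Build artist/album/NN title.mp3, keeping only the track number in the filename."""
    # Zero-pad the track number so files sort correctly in a playlist.
    filename = f"{track_no:02d} {title}.mp3"
    # Artist and album live in the directory names, not the filename.
    return PurePosixPath(artist) / album / filename
```

Since artist and album are not repeated in every filename, each individual name stays short, which helps with the CD filename limits mentioned above.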
How about a search engine that doesn't index 'rpmfind' mirrors and newsgroups so searches for linux related info turn up something more useful than 50 pages of rpmfind entries...
Ok, yeah, I know how to use '-', but it's still annoying...
http://www.masturbateforpeace.com/
The only god I can find is GOOGLE. Who else will actually answer your prayers?
WiseNut looks like it can be a contender, but until it meets or surpasses Google's index AND adds a cache feature... well, I'll just stick with what works.
If I had bothered to read the article before posting, I'd have seen that 'Vivisimo' (bad name, guys. bad name) does some metasearch stuff. But not across all the other search engines mentioned.
-- Ed Avis ed@membled.com
Is there a decent meta-search tool that can run a query through all (or most) of these and collect together the results?
Maybe not on the web (where it might get threatened) but at least a command-line tool or CGI script.
-- Ed Avis ed@membled.com
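The merging half of such a meta-search tool is the easy part. Here is a rough Python sketch, assuming you already have a ranked list of URLs from each engine (fetching and parsing each engine's results is left out on purpose, and `merge_results` is a made-up name). It interleaves the lists round-robin and drops duplicates.

```python
from itertools import zip_longest


def merge_results(*ranked_lists):
    """Interleave several ranked URL lists, keeping the first copy of each URL."""
    seen = set()
    merged = []
    # Take the 1st hit from every engine, then the 2nd, and so on.
    for tier in zip_longest(*ranked_lists):
        for url in tier:
            # zip_longest pads shorter lists with None; skip those.
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```

Round-robin is just the simplest choice; a real tool might rank a URL higher when several engines return it.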
The cache is a nice gimmick which I've found useful quite a few times; however, the main reason I keep returning to google is that I actually find what I need fast. Yesterday I needed some background on C++ templates. I entered the terms "C++ templates tutorial" in the ie google toolbar (that is a great feature IMHO) and found what I needed at the top of the returned results. 15 seconds later the stuff I needed was on its way to the printer.
That kind of convenience is hard to beat for a general purpose search engine. The story changes if you start using meta information to narrow the search. Google does not do that as far as I know. However, using meta information inevitably narrows the scope of a search engine. Efficient distributed search engines for multimedia are currently emerging. E.g. morpheus actually uses meta information attached to an mp3, allowing for searches for tracks of a particular album, more albums of the same artist and so on.
Jilles
Wisenut
Looks like google without cache, wiseguide provides a nifty preview of categories with matches.
Teoma
Match phrase button handy, no cache
Lasoo
Nice maps, but not a search engine for finding general topics, more geared to finding locations
CURE
Is this a search engine? Hit the user limit so got nowhere.
Vivisimo
The best of the lot. Nice frame layout, organization by category, but lacks ability to jump to page.
Google catalogs open Administrator websites, and some of those websites have no or weak passwords. I reference google, since it does a good job of treeing websites. Search engines seem to be a good tool for looking for websites with weaknesses.
Example..
If you search on google for "myPHPAdmin" you can find databases without password protection. You can do simple things like SQL queries for Credit Card information or even Drop tables.
Lucky nobody has written a trojan that searches google for unprotected databases and drops all tables. Oh wait, maybe they have....
Is the cache. Especially for readers of Slashdot, because it allows them to see a site after it has been Slashdotted. From my quick glance at the other sites, none of them had that technology. That is why I will continue to use Google!
Payola killed the search engine. Yahoo used payola from the get-go (or shortly after) and it was quickly discovered that they sucked. Now all the other search engines have admitted to it, and Google came out a true winner. Now companies want a piece of Google, just as they wanted a piece of Altavista. I just hope Google doesn't become what Altavista is now.
There is no reasonable defense against an idiot with an agenda
:wq
I work for an ISP and consistently use google to probe error messages and the like. I've tried Vivisimo and Teoma but I find they gave me poor results. I could usually find the answer to a problem within the first page of results on google. I have yet to see another search engine match that.
Is the interface! You don't have to spend 5 minutes searching (no pun intended) for the Edit box to type your search into!
Visit the Arcade Restoration Workshop @ http://www.arcaderestoration.com
Ok, ok, ok
So maybe I don't have hard evidence that google is indeed biased already.
But my initial point stands - are the search engines independent? It's pretty much indisputable (hmm.. indisputable on /.?) that it could indeed be a problem to the credibility of the web if say 99% of the information being returned by search engines is returned from engines controlled by one government.
Centralized control over information (or, pointers to information in this case) is a potential problem.
Am I wrong ?
So, how do we deal with this ? As a regular joe-user there's pretty darn little one can do to prevent this centralization from happening - or ?
But my initial point stands - are the search engines independent?
Uhhh... WHICH search engines? There are many. Independent from WHAT? The government? Uhhh, yeah I would say there's a pretty good chance that the American search engines are not in cahoots with the government. Call it a hunch.
it could indeed be a problem to the credibility of the web if say 99% of the information being returned by search engines is returned from engines controlled by one government.
First, the "web" has no credibility, it is not a person or even a single entity like a company.
Second, there would only be a problem if the dominant search engines were in countries without free speech rights. I'll go out on a limb and say the U.S. has one of the better standards of free speech in the world. The dominant search engines like Yahoo, Google, Altavista, etc. are all in the U.S. I don't see any problem with "credibility."
Centralized control over information (or, pointers to information in this case) is a potential problem.
Please explain how there is any centralized control over the search engines? They are all separate entities.
Am I wrong ?
About what? I can't figure out your argument.
So, how do we deal with this ? As a regular joe-user there's pretty darn little one can do to prevent this centralization from happening - or ?
It's pretty simple. The internet is enjoying a free-market economy. You use the search engines that give the best results. The search engine with most users wins. The search engine that returns illegitimate results, if there was such a search engine, would not be popular.
These things can work themselves out in a free market.
"And like that
Yes, my pages change very often, and Google cache has versions that are between 7 and 8 weeks old. So the refresh rate of 28 days (as mentioned in that interview someone linked to) doesn't really work out.
I also noticed that one of my pages didn't make it into Google and I'd really like to know why. It's linked from the top page and there is nothing different from the other pages. I linked to a PDF file on that page (also on my site) which also didn't get included. Unfortunately I don't have access_logs, so I cannot tell for sure whether the page got spidered at all. I'd really like to know what I'm doing wrong.
It's nice that Google includes PDF files, but why don't they read PostScript, Word DOC and all the other document file formats? It seems to be easy to add a couple of import filters...
They could also easily support compressed documents, e. g. pdf.gz or pdf.bz2.
If the import filter really "understands" the file format (if it knows where things are emphasized or in bold, or larger font, not just the result of pdftotext given to the indexer) the quality of the query results could be improved as well. Words in headings or larger font could be regarded as more relevant for a page (in a similar way that words in h1 or h2 are considered more relevant with HTML).
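To make the weighting idea concrete, here is a toy Python sketch. It assumes the import filter hands the indexer a list of (style, text) spans; the style names, the weights, and the function name `term_scores` are all invented for illustration and have nothing to do with how Google actually scores pages.

```python
from collections import Counter

# Hypothetical weights: words in headings count more than body text.
WEIGHTS = {"heading": 3.0, "bold": 2.0, "body": 1.0}


def term_scores(spans):
    """Sum a per-word score over (style, text) spans, weighted by style."""
    scores = Counter()
    for style, text in spans:
        # Unknown styles fall back to the plain body weight.
        weight = WEIGHTS.get(style, 1.0)
        for word in text.lower().split():
            scores[word] += weight
    return scores
```

A filter that only runs pdftotext would label everything "body", so every word would get the same weight, which is the loss of information described above.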
...when the library card catalog was the shiznit! Those were the days! You could actually find a desk to work at, because they weren't all filled with those pesky computers.