Slashdot Mirror


Searching For Google's Successor

weink writes "A new generation of scrappy search engines is emerging to challenge the dominance of mighty Google. An article at Wired News lists up-and-coming search engines: WiseNut, Teoma, Lasoo, CURE, and Vivisimo. Take a look, and give them a try. But I still say that nothing is better than the almighty Google."

12 of 282 comments

  1. wisenut? no thanks by arielb · · Score: 2, Interesting

    it doesn't have a cache (something that I use almost all the time) and also happens to run on Windows.

    --
    ---
  2. All I want from a search engine... by scott1853 · · Score: 3, Interesting

    Recent results. Google only seems to be getting updated once every couple of months. I know they must be pulling down a lot of data, but every other search engine seems to have more recent information than Google does. Anybody have any actual stats on Google's refresh?

    1. Re:All I want from a search engine... by bleeeeck · · Score: 3, Interesting

      Most of my sites get spidered by Google monthly (or more often). It seems like it takes about 3-4 weeks for Google to get the new content into its database. Here's an example of it on one of my pages that has the date, from Google's cache. The date it was spidered is in the right-hand news column.

  3. Better hardware than Google by Evil+Attraction · · Score: 2, Interesting
    I really like Google. Their search engine is fast, and it covers a lot:
    • Cache: Means that we are able to visit a site after it's been slashdotted.
    • Relevance: Google's "relevance technology" is great. Find related sites, and find only pages related to your query. :-)
    • Not only web pages: Google doesn't only search for web pages, but also PDF files and images. More search engines should have had features like that.
    So what's bad about Google? AFAIK, nothing an ordinary user would know of. But their hardware is "wrong". Fast has developed a search engine called AllTheWeb. Their search engine is the best search engine after Google, but could easily (?) have been the best.

    Why? Well. They have developed special hardware to do their search. And it's damn fast (that's where they got the name, I guess). However, the software running on their hardware isn't as good as Google, and I really wonder why...

    My conclusion: The software Google is using should have run on AllTheWeb's hardware. That would have been one hell of a search engine.


    No I don't like it, either...
  4. They miss the whole point. by Anonymous Coward · · Score: 4, Interesting

    Google does kick ass, and I'm sure the guys that run it will continue to fine-tune things so that it improves. But the truth is, we're already approaching the limit of what a search engine can do, and any gains will simply be the last 1/100 of that last percent.

    Should we stop trying? No, the need for relevant results hasn't been fulfilled, except in the most minimal ways. But we need to look for new answers. I think that to take this any further, it will mean going client-side. To make results more relevant requires too much cpu power, to aggregate it at the engine website. A client side agent, using google as a starting point, and sifting through the results, spidering through them, makes sense. Don't start whining about traffic increase, the same thing happens now, only it's the person himself doing the spidering.

    Also, the entire keyword paradigm is at odds with all but the most simplistic search. Sometimes I'm looking for a diagram, or I'm looking to buy a hard-to-find part. Some engines, like Lycos, allow you to search for audio or stills, but it borders on lameness. This needs to be expanded. You need to be able to tell the engine, "hey I'm just looking for general info" or "hey I want to buy something with these parameters". For instance, the diagrams I look for can be either gif/jpeg or ascii art. A decent engine/agent should have no trouble returning results that reflect these requirements. Same with the "buying" type search: the electronic parts I'm looking for are not common items, and adding a keyword of "shopping cart" doesn't always cut it. As I see it, there are at least a few different types of searches that a person might make.

    I want to buy this item (or a similar one)
    I want to find info (of an encyclopedic nature)
    I want to find leads about (I don't quite know what I'm looking for yet)
    I want to hear news about...
    I want to find this file/software (or a similar one)
    I want to be entertained about/with...

    These things all lend themselves perfectly to a client-side agent. Those websites that don't bother to tag images properly, and yet the image is just stylized text? An agent has the power to OCR it back to normal, something an engine could never hope to do. Get rid of all the mirrors? Google is better at this than any other engine, but can it compete with an agent that can recognize a text mirror of an HTML page, or vice versa? Or any of the other nifty little optimizations that aren't even obvious to me at the moment? Sure, there will be problems. I'm not sure Joe AOL will be able to accept that a proper search will take longer than it takes for a web page to load, but it still seems like the next killer app to me.
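
    To make the mirror idea concrete, here is a rough sketch of how a client-side agent might spot a plain-text mirror of an HTML page. It strips the markup, normalizes whitespace and case, and hashes what is left, so two copies with the same visible text get the same fingerprint. The function names (visible_text, fingerprint, drop_mirrors) and the example URLs are made up for illustration; this is one simple approach under those assumptions, not how Google or any real agent does it.

```python
import hashlib
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(document):
    """Return the document's text with markup removed, lowercased,
    and with runs of whitespace collapsed to single spaces.

    Note: text inside <script> or <style> is not filtered out here;
    a real agent would want to skip it.
    """
    parser = _TextExtractor()
    parser.feed(document)
    parser.close()
    text = " ".join(parser.chunks)
    return re.sub(r"\s+", " ", text).strip().lower()


def fingerprint(document):
    """Hash of the normalized text; equal text gives an equal fingerprint."""
    return hashlib.sha256(visible_text(document).encode("utf-8")).hexdigest()


def drop_mirrors(results):
    """Given (url, body) pairs in ranked order, keep only the first URL
    seen for each distinct fingerprint."""
    seen = set()
    unique = []
    for url, body in results:
        fp = fingerprint(body)
        if fp not in seen:
            seen.add(fp)
            unique.append(url)
    return unique


# Hypothetical result list: an HTML page, a text mirror of it, and another page.
results = [
    ("http://a.example/faq.html",
     "<html><body><h1>FAQ</h1><p>Hello   world</p></body></html>"),
    ("http://b.example/faq.txt", "FAQ\nHello world"),
    ("http://c.example/other.html", "<p>Different</p>"),
]
unique_urls = drop_mirrors(results)
```

    An exact hash only catches mirrors whose visible text is identical. Near-duplicates (an extra footer, a changed date) would need a fuzzier comparison, which is exactly the kind of CPU-heavy work that is easier to justify on the client.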

  5. I'm curious to see... by ackthpt · · Score: 2, Interesting

    I'm curious to see if any of these new search engines suffer from the /. effect.

    --

    A feeling of having made the same mistake before: Deja Foobar
  6. Re:One of the great features of Google by al_d · · Score: 2, Interesting

    Yes, the cache can be invaluable at times. Anyone got any ideas as to how much space Google's cache takes up?

  7. Index Size by Itrebax · · Score: 2, Interesting

    According to Wisenut's front page, it has more pages indexed than Google. Can this be true?

  8. Re:Wisenut ignored my robots.txt by beme · · Score: 3, Interesting

    Have you told them? Not trying to be a smartass or anything... A couple of years ago I shot off an email to the owner of a spider that was ignoring my robots.txt, and lo and behold a bit later the spider started checking and honoring my robots.txt file. YMMV.
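
    For anyone who wants to check what a well-behaved spider should be doing with their rules, Python's standard library has a robots.txt parser. The sketch below feeds it an example robots.txt and asks whether a given user agent may fetch a URL. The rules, the "ExampleBot" user agent, and the example.com URLs are assumptions for illustration, not anything taken from a real site or crawler.

```python
from urllib.robotparser import RobotFileParser

# Example rules: keep every crawler out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is a hypothetical crawler name.
private_ok = allowed(ROBOTS_TXT, "ExampleBot", "http://example.com/private/book.txt")
public_ok = allowed(ROBOTS_TXT, "ExampleBot", "http://example.com/public/index.html")
```

    Comparing this against your access logs shows which requests a spider should never have made, which is handy evidence to include when you email the owner.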

    --

    -beme
    1971
  9. Re:One of the great features of Google by jilles · · Score: 4, Interesting

    The cache is a nice gimmick which I've found useful quite a few times; however, the main reason I keep returning to Google is that I actually find what I need fast. Yesterday I needed some background on C++ templates. I entered the terms "C++ templates tutorial" in the IE Google toolbar (that is a great feature IMHO) and found what I needed at the top of the returned results. 15 seconds later the stuff I needed was on its way to the printer.

    That kind of convenience is hard to beat by a general purpose search engine. The story changes if you start using meta information to narrow the search. Google does not do that as far as I know. However, using meta information inevitably narrows the scope of a search engine. Efficient distributed search engines for multimedia are currently emerging. E.g. Morpheus actually uses meta information attached to an MP3, allowing for searches for tracks of a particular album, more albums of the same artist and so on.

    --

    Jilles
  10. Likes, Dislikes, Pros, Cons by ackthpt · · Score: 2, Interesting

    Wisenut
    Looks like Google without cache; wiseguide provides a nifty preview of categories with matches.

    Teoma
    Match phrase button handy, no cache

    Lasoo
    Nice maps, but not a search engine for finding general topics, more geared to finding locations

    CURE
    Is this a search engine? Hit the user limit so got nowhere.

    Vivisimo
    The best of the lot. Nice frame layout, organization by category, but lacks ability to jump to page.

    --

    A feeling of having made the same mistake before: Deja Foobar
  11. same here by Anonymous Coward · · Score: 1, Interesting
    One of my small servers, used as a testbed largely, consists of 90% .htpasswd/.htaccess'd data and a bunch of random documents and books.

    Wisenut started showing up in my server logs two or three months ago. They ignored my robots.txt, so I moved all of my data into new directories and passworded the main parent directory so that even if it ignored robots.txt, it couldn't list anything on its site.

    I'm still getting searches from their crawler. They send a request at least every 30 seconds and have been doing so for at least ten weeks now and filled my logs with a little over 250,000 requests.

    This is just an example of the new "make money at anyone's expense however you have to" mentality. Welcome to the new internet.