Slashdot Mirror


Better Search Results Than Google?

Mechanik writes "CNN has an AP article about the next generation of up and coming search tools, which try to cope with the glut of hits that result from 'conventional' search engines such as Google. One tool, Vivisimo, "is like a superfast librarian who can instantly arrange the titles on shelves in a way that makes sense. [...] But unlike libraries, Vivisimo doesn't use predefined categories. Its software determines them on the fly, depending on the search results. The filing is done through a combination of linguistic and statistical analysis." Grokker, another, downloadable program, "not only sorts search results into categories but also "maps" the results in a holistic way, showing each category as a colorful circle. Within each circle, subcategories appear as more circles that can be clicked on and zoomed in on." You have to love the author's use of trying to look for a hotel in France with the terms 'Paris Hilton' as an example of searching gone awry."

46 of 487 comments

  1. I still won't be happy... by Kickstart70 · · Score: 5, Interesting

    ...until I can regexp my searches. It would make a whole lot of difference.

    1. Re:I still won't be happy... by eyeye · · Score: 5, Interesting

      That would be good.

      One way to improve google would be to filter out any domain that has more than one hyphen in it.

      You know those results from - "buy-mobile-phones-cheap-now-online.com" that you get when you searched for "linux patch".
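
      A minimal sketch of that filter, assuming the results arrive as a plain list of URLs. The function names and the example URLs other than the one quoted above are hypothetical, and this is not anything Google is known to do.

```python
from urllib.parse import urlsplit


def hyphen_count(url: str) -> int:
    """Count the hyphens in the host part of a URL (the path is ignored)."""
    # urlsplit only recognises the host when the string contains "//".
    host = urlsplit(url if "//" in url else "//" + url).hostname or ""
    return host.count("-")


def filter_results(urls, max_hyphens=1):
    """Keep only results whose host has at most max_hyphens hyphens."""
    return [u for u in urls if hyphen_count(u) <= max_hyphens]


# Hypothetical result list for a "linux patch" search.
results = [
    "http://buy-mobile-phones-cheap-now-online.com/",
    "http://www.kernel.org/",
    "http://linux-patches.example.org/howto.html",
]
kept = filter_results(results)
```

      The obvious cost is false positives: a legitimate site with two hyphens in its name gets dropped too.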

      --
      Bush and Blair ate my sig!
    2. Re:I still won't be happy... by interiot · · Score: 4, Insightful

      From what I understand, the reason that google can do many many searches at once and still complete each in 0.5 seconds (besides having a huge linux farm) is that they take a lot of algorithmic shortcuts and precompute data structures as much as possible. There really aren't any such precomputed shortcuts to take with regular expressions, so searches would either be much, much slower, or google would need to buy a vastly larger linux farm, all for a feature that would be used by less than 1% of the population.

    3. Re:I still won't be happy... by costas · · Score: 4, Interesting

      Search engines index the words in web pages and at most keep word order (word number X out of N in the page). In addition, they throw out common or garbage words (like 'the', 'and', etc.), so the actual text is not represented in the index. That right there is a major reason not to do regexes... Secondly, regexes are awfully expensive CPU-wise and understood by only a tiny portion of your users, so it's quite unlikely that this would happen.
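
      To make the point concrete, here is a toy positional inverted index. It is only a sketch under stated assumptions: the stopword list, the tokenizer and the two documents are made up for illustration, and real engines are far more elaborate.

```python
import re
from collections import defaultdict

# Illustrative stopword list; a real engine would use its own.
STOPWORDS = {"the", "and", "a", "of", "to", "in"}


def build_index(docs):
    """Map each non-stopword to {doc_id: [word positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
            if word not in STOPWORDS:
                index[word][doc_id].append(pos)
    return index


def search_all(index, *words):
    """Return ids of documents containing every given word (AND search)."""
    postings = [set(index[w]) if w in index else set() for w in words]
    return set.intersection(*postings) if postings else set()


docs = {
    1: "The patch for the Linux kernel",
    2: "Buy cheap phones and patch cables",
}
index = build_index(docs)
hits_for_query = search_all(index, "linux", "patch")
```

      A keyword lookup is a couple of dictionary reads and a set intersection. A regex such as `pat.h the` has nothing to look up: the stopwords, punctuation and spacing it would need to match were discarded at indexing time, so it could only be answered by scanning the original pages.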

    4. Re:I still won't be happy... by kcornia · · Score: 3, Insightful

      I can't believe how quickly these sites have almost ruined Google.

      I got a bunch of games for Xmas, and when I've gone looking for strategy sites or even walkthroughs (for Morrowind for example), it's practically impossible to separate the real sites from those we-sell-u-stuff-cheap-online-from-hungary.morrowind-strategy-walkthrough-cheat-whatever-else-might-be-in-a-search-string.html sites.

      VERY AGGRAVATING.

    5. Re:I still won't be happy... by glesga_kiss · · Score: 3, Interesting
      I don't understand why Google isn't doing that sort of filtering already.

      They could get sued. It's an interesting thing legally, and it's not really been tested yet. If Google deliberately block a site from appearing in its results based on a matter of taste (i.e. they think it's poor content), then they leave themselves open to legal action.

      And that is the curse of Google. Its downfall started about six months ago. It's still great for solving technical problems, but trying to get product reviews or searching on any brand name for info is a waste of time. You get just the official page and a hundred links to "portal" sites that have wormed their way up PageRank, each trying to sell you something.

      It was inevitable I suppose. Once the lay public got their hands on it en masse, the search-spammers targeted it. Once the google users hit a certain critical mass, it all went downhill.

      Perhaps we should just keep the next best thing to ourselves...? ;-)

  2. Vivisimo by dreamchaser · · Score: 5, Funny

    They aren't off to a very good start:

    Problem occurred while using Vivisimo::

    Currently under heavy load. Please try again shortly

    Please go back to the Vivisimo home page and try your query again

    1. Re:Vivisimo by JWW · · Score: 4, Funny

      Well, this is one strike against them. I can't seem to remember google ever being slashdotted.

  3. Better search results than Google? It will happen! by soluzar22 · · Score: 4, Interesting

    Well, Google made a huge leap forward from the old guard of AltaVista & Yahoo, who were in their own way a huge leap beyond what had gone before. We had to expect this to happen sooner or later, but two things spring irresistibly to mind.

    1) Will it gain the enormous foothold in the collective consciousness that Google has acquired? To Google is now a verb... and it gets mentioned on Buffy, which is as good a cultural barometer as we are ever likely to have. :-)

    2) Will the UI and secondary services (such as the ODP, and Google Groups) be as good as Google itself?

    Also, while I'm sure that it will happen one day, I'll believe it when I use it and not before... Oh, and the Paris Hilton thing? LOL! That sort of anti-result comes back from search engines *a lot*. I was just talking to my mom about searches of that ambiguous nature the other day.

  4. I tried this earlier... by FroMan · · Score: 5, Informative

    I tried this earlier (around noon) when I saw the article. One of my big complaints is that the searches seem to take too long. Google searches are usually sub-second; this seemed to take about 3-5 seconds (this was well before slashdot posted the article, so it wasn't the slashdot effect either).

    Also, I already do not like the search results showing up in the sidebar with search engines (with mozilla), as that is one of the features I kill as soon as I install mozilla. So, I guess, this search engine has a ways to go before I prefer it.

    The searches didn't seem too bad overall. I tried looking for "linux kodak 4530" and its results were not any better or worse than Google's. I tried a couple of other searches and they seemed to be on target about as well as Google.

    --
    Norris/Palin 2012
    Fact: We deserve leaders who can kick your ass and field dress your carcass.
  5. Grokker reminds me... by Kickstart70 · · Score: 5, Interesting

    of Antarctica, an old and very clunky Java Yahoo-like engine (sorta). It used a map of Antarctica to drill down into categories and subcategories before putting the user in a 3D world interface at the lowest level. When I interviewed with them, the interviewer did an excellent job of turning me off the technology, explaining that the 3D interface would allow 'billboard and other advertisements' along with the search results formatted in a 'mall or street' of entries.

    Gah.

  6. Every so often... by clifgriffin · · Score: 5, Interesting

    A new search engine comes along that touts its uber-intelligent way of searching. It is hyped by the press but ends up by the wayside. (See Teoma)

    I don't get excited about "Google alternatives". Google satisfies my searching needs as it is. Sometimes "knowing what to search for" is better than a super intelligent search engine.

    As far as I'm concerned anyone with a clue can produce the results they need with a little bit of practice and common sense. They don't need new search engines.

    Clif

    1. Re:Every so often... by radish · · Score: 4, Informative

      I want it to behave as some sort of web-service so I can use a perl module to manage my results. google should have a programming API to extend their service in some way

      You mean, like this - Google WebServices?

      --

      ---- Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"

    2. Re:Every so often... by Cruciform · · Score: 4, Funny

      There was this guy who really wanted to get into my pants. So every time I had a major assignment I'd hang out with the other girls until the day before it was due, then just have him search for me.
      A few cut and pastes and I got at least a B+ every time.

      Who needs a search engine when you've got a push up bra? :)

  7. AP strikes again by 97cobra · · Score: 4, Funny

    Glad to see AP covering a site that's been operational for 2 years. Nothing like cutting-edge reporting.

  8. What if... by GeckoFood · · Score: 4, Interesting

    What if you want that glut of hits? Sometimes you have to dig through some pretty obscure hits on a search to get what you want, and categorizing them or putting them in funny circles just complicates the process and can make the search take longer. I'll hang with Google and Teoma, thank you very much.

    And I certainly don't want a downloadable search app running, that's just another possible inroad for spyware. I've been burned enough times by apps I thought were "clean" that went off and chewed up enough bandwidth to choke a horse.

    --
    Be excellent to each other. And... PARTY ON, DUDES!
    1. Re:What if... by fiendo · · Score: 4, Insightful

      The bandwidth theft may be something to keep an eye on; something else to think about is the load Grokker's going to put on your box's resources:

      "System Requirements
      Windows 2000 or Windows XP
      Pentium III at 400MHZ or higher
      128MB RAM (we recommend 256MB or more, if you're going to use the file indexing service for the My Files keyword search)
      100MB of free disk space (or 20MB only if Java 2 is already installed)"

      Myself I kind of like the idea of the graphical results, but not if my box is doing the grunt work. I think Google has them beat on that point.

      Not to mention that Grokker "Contains a fully functional Web browser based on Internet Explorer". How would one go about updating the various patches for this browser?

      http://www.groxis.com/service/grok/g_products.html

      --
      I went to the city because I wished to live without deliberation.
  9. Many search results now overly commercial by sdo1 · · Score: 5, Insightful
    The problem with Google (and in fact a lot of the internet and in particular search engines) now is that it has almost entirely been taken over by commercial entities. When I was recently shopping for a digital camera, I did the usual internet searches. A few years back, similar searches would have found lots and lots of sites ABOUT the product in question (fan sites, discussion forums, reviews). Now I have to sort through page upon page of sites wanting to sell me said item, most of which aren't even actual store-fronts but instead just referral pages which have manipulated the Google ranking system to get on top. I recently hit the same problem when doing vacation planning. It used to be that I could easily find hundreds of pages ABOUT the destination; now I just find sites wanting to sell me airfare, book me into a hotel, and rent me a car. It's become extremely frustrating and has made Google far less useful than it once was. In fact, most of the big search engines are far less useful than they once were. Yahoo! used to be THE place to get organized info on any subject. The directory is almost entirely commercial now. DMOZ is extremely hit and miss and has started to get fairly out of date. I messed around with vivisimo a bit as well and found that to be hit and miss too.

    Despite the problems with Google, it's still the best place I've found to get good info. The trick is to be very careful about how you search for something by adding in search modifiers such as "-sale" or "-bargain" or "review" to weed out the overtly commercial results. But even then, things have changed and not for the better.

    -S

    --
    --- What parts of "shall make no law", "shall not be infringed", and "shall not be violated" don't you understand?
    1. Re:Many search results now overly commercial by frodo from middle ea · · Score: 5, Informative
      Tell me about it.

      Searching for info about electronic products is the worst on google.
      I use the following along with anything I want to search, and it usually does the trick:

      -shop -shopping -price -buy -order -shipping

      This no doubt subtracts one or two sites which are good, but at least it filters out most of the shopping sites.
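
      If you type that list often, it is easy to wrap in a helper. The sketch below builds the query string and also emulates the minus operator on local text so the effect is visible; the function names and sample titles are hypothetical, and the matching is a simplification of whatever Google actually does.

```python
SHOPPING_WORDS = ["shop", "shopping", "price", "buy", "order", "shipping"]


def with_exclusions(query, excluded=SHOPPING_WORDS):
    """Append a -word term for every excluded word."""
    return " ".join([query] + ["-" + word for word in excluded])


def matches(text, query):
    """Crude emulation: every plain term must appear, no -term may appear."""
    words = text.lower().split()
    for term in query.lower().split():
        if term.startswith("-"):
            if term[1:] in words:
                return False
        elif term not in words:
            return False
    return True


q = with_exclusions("digital camera")
keeps_notes = matches("Digital camera field notes", q)
keeps_store = matches("Buy a digital camera today", q)
```

      The emulation also shows the downside mentioned above: a good review page that happens to mention "price" is thrown out along with the stores.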

      --
      for the last time people, I am "frodo from middle eaRTH", not "middle eaST".
    2. Re:Many search results now overly commercial by jacoplane · · Score: 3, Insightful

      Well, I don't think this will ever change. Commercial entities simply have the most to gain by being at the top of the Google PageRank. So even if Google doesn't make any distinction as to who gets the highest PR, commercial entities will simply make the biggest effort and eventually take the top spot.

      These days I always include other search terms like "epinions" (for reviews) or "wikipedia" for information to get the most out of google. Someday there will be a search engine where you can specify "no commercial s$*t", but till that day...

  10. Re:Nice work, people by npistentis · · Score: 3, Funny

    you have to admit, this may be the first time we've managed to Slashdot a search engine. Yet another /. milestone!

    --
    Gentlemen, you can't fight in here! This is the War Room!
  11. The same idea for images by TomDes · · Score: 3, Interesting

    We realized the same idea for images. Take the results from Google Image Search and rearrange them using methods from computer vision.
    An article about this is available here: Clustering visually similar images to improve image search engines.

  12. Re:Better search results than Google? It will happ by asdhwesd · · Score: 5, Interesting

    Is there a search engine that can filter out all of those annoying placeholder sites that grab unsuspecting visitors by simply putting every word about a certain subject on a page and then having links to other useless websites? This is 'webspam' as far as I am concerned and the next step in search engine design should be 'placeholder' site aware.

    A search engine that ignores specifically commercial sites would also be helpful.

    Any ideas on either of these type features in current or upcoming search engines?

  13. Even if.... by ghettoboy22 · · Score: 5, Insightful

    "Vivisimo" can *somehow* come up with a better engine than google, will people use it? Google is getting bigger and bigger not necessarily by their search results (or lack thereof) but also because of how the phrase "google" has caught on in mainstream culture. Face it - when your competitor makes it into the dictionary, it's going to be EXTREMELY hard to get people to change the way they search. If you ask many non-techs how they find information on the web, they don't say "I search for it" they say "I google it".

    Now, that being said, one thing the CNN article doesn't talk about in great detail is the technology behind this company - Google started out at a major university - what's the background of this company? While I agree something should be done with all the advertising that occurs with PageRank, I find it highly doubtful that it's going to be another company (rather than Google itself) that will fix it.

  14. Re:Better search results than Google? It will happ by CrayzyJ · · Score: 5, Funny

    " and it gets mentioned on Buffy, which is as good a cultural barometer as we are ever likely to have"

    Gawd help us. Society now sucks if that is our barometer.

    Google, the verb, has been mentioned on Law & Order. _THAT_ tells me it has entered the mainstream.

    --
    Holy s-, it's Jesus!
  15. Querying slashdot effect... by alexatrit · · Score: 3, Funny
    A query on "slashdot effect" returned the following groupings, before the engine died under load.
    slashdot effect (111)
    o Technology (18)
    o Definition (11)
    o Story (9)
    o Also spelled (9)
    o Analysis, Three Internet Publications (5)
    o Source (7)
    o Sarcasta.net (3)
    o Spy (2)
    o Downloads (3)
    o Surviving The Slashdot Effect (3)
    Their cluster groups are interesting, but their top X results behave a lot like Google. Most of the results are the same as well. I do like how it lists where the result was sourced, however.
    --

    Nothing but the finest in meaningless drivel
  16. Grokker's kinda cool by Daikiki · · Score: 4, Informative

    I'm actually posting this from the browser window of Grokker. Been playing with it for just a few minutes now, but I can see how something like this can make obscure or broad searches a lot easier. When you enter a search term, Grokker generates a series of circles, each of them representing a subcategory of results for your search term, and each of them in turn filled with subcategories of their own. Searching for "west coast museums", for example, gives me subcategories such as 'travel', 'west coast attractions', and 'history museums'. Once you find your desired subcategory you're presented with a smallish list of matching sites, represented as squares. The categorization seems to make sense most of the time, even if the overall visual effect is reminiscent of 70's disco lighting.

    --
    I want the fire back.
  17. That's all we need... by garethwi · · Score: 3, Funny

    ...a search engine which can't handle a slashdotting.

  18. You can already get better results by JoshuaDFranklin · · Score: 4, Insightful
    Google has never been about getting the "best results"--you can already get much better results for your topic by using a specialized search engine (i.e., IMDB for movies, Lexis-Nexis for newspapers, etc.).

    Google is about having good quality results with a very simple interface, one that anyone can use. Go to an academic library and look at the various journal search engines like "America: History and Life" or PsycINFO, or better yet just try out MedLine. See anything wrong? Busy page, weird syntax, a huge instruction page about "how to search".

    Engines like Vivisimo may make it if they can keep Google's simplicity and ease of use and only add value with categorizations. And personally, I think they better get out of 1996 with the frames. Yech!

  19. Huh? by TheTick · · Score: 3, Insightful

    Man, I must have been sleeping...

    When did google become a conventional search engine...?

    --
    bachiatari na torisetsu o yome!

  20. Not quite by bigjocker · · Score: 5, Interesting

    I tried a few searches on Vivisimo before it went live on slashdot and I must say I'm impressed. It addresses one of the main faults of search technology today: context. When you perform a search a tree is shown showing the different contexts (not categories) where the terms were found. Excellent for ambiguous concepts.

    But, and here is the beef, it should be obvious to anyone that there must be an interface change in the short-term future of search. A textbox is a very limited input to express a complex search. Using regexps and regexp-like operators is not enough. This Vivisimo is a step in the right direction, but there's a long way to go.

    For example, try to make this search using any engine (Vivisimo, Google, Yahoo, Altavista, etc): who was the red-haired singer that recorded a song with Tom Morello a few years back? At least I can't find an answer, because one of the main aspects I'm using (the red hair) may not be as important as the other aspects anyone else would use to describe the situation.

    There must be an interface revolution in the years to come. Come to think of it, are we still using a textfield to express every possible combination in a google search? Gross!!!

    --
    Life isn't like a box of chocolates. It's more like a jar of jalapenos. What you do today, might burn your ass tomorrow.
    1. Re:Not quite by RetroGeek · · Score: 5, Informative

      Try this search then.

      The search phrase was:
      "red hair" singer "tom morello"

      --

      - - - - - - - - - - -
      I am a programmer. I am paid to produce syntax not grammar. Deal with it.
    2. Re:Not quite by Carnildo · · Score: 4, Informative

      You can get a similar effect in Google by adding a word or two of context to your search. Searching for "paris hilton" gets millions of links to sites claiming to sell the tapes, but searching for "paris hilton hotel" gets hotels in France.

      --
      "They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
    3. Re:Not quite by mike_mgo · · Score: 3, Insightful

      In this case yes, but this is a very simple example. Sometimes, especially on topics that you are unfamiliar with, it can be difficult to figure out what additional words are going to help to refine your search.

    4. Re:Not quite by dspyder · · Score: 5, Informative

      You can get a similar effect in Google by adding a word or two of context to your search. Searching for "paris hilton" gets millions of links to sites claiming to sell the tapes, but searching for "paris hilton hotel" gets hotels in France.

      The most under-utilized feature of Google I think has to be excluding keywords. For this query, I would use:
      +"paris hilton" +hotel -tape -porn
      and probably get much better results. If the word "naked" is never ever going to appear in a legitimate result page, you might as well exclude it.

      Same goes for other things. I was looking for information on Microwaves and WiFi the other day... not the ovens, so -oven -food and I got infinitely better results.

      --Darren

  21. Re:Better search results than Google? It will happ by nealfunkbass · · Score: 5, Funny

    I think you misspelled barfometer

    --
    - Donny was a good bowler, and a good man.
  22. Knowing how to search by nolife · · Score: 5, Insightful

    His example of searching for Paris Hilton is nothing more than a glorified example to try to prove his point.
    You do not need to completely redesign a search engine to get your desired results. You need to refine your search. Search google for Paris Hilton Hotel and the first three results are directly related to a Hilton Hotel in Paris. I would not find this hotel any faster using his circle method with Grokker2. I use a search engine to find exactly what I am looking for. Displaying all the results on some chart, graph, or 3d display still requires me to browse around to narrow my search.

    --
    Bad boys rape our young girls but Violet gives willingly.
  23. Re: Infinite loops by A55M0NKEY · · Score: 3, Informative

    You can write an infinite loop in a lot of regexp packages. They would have to have a way of detecting that (or a very inefficiently written regexp).

    --

    Eat at Joe's.

  24. Re:Better search results than Google? It will happ by Kobayashi Maru · · Score: 5, Insightful

    What you ask is more difficult than one may originally think. As soon as a novel approach to counter-acting one of these annoyances becomes popular, it lands itself in the cross-hairs of those who would exploit "the system" in the first place. Witness the current arms race that is SPAM. Witness Microsoft security. Hell, witness Slashdot moderation.

    There are a number of bright people on both sides of the aisle. When one side discovers a new technique, the other will work hard to neutralize said technique. This continues until either: it is too expensive for one side to continue, or too complicated for the consumer to bother with anymore.

  25. Filtering e-stores by vurg · · Score: 3, Interesting

    It would be nice if there is a feature that filters e-store entries. For example, I was looking for a solution to my Logitech RumblePad left analog stick problem. And no matter how refined my search is, I still get thousands of pages to stores selling that gamepad. I don't want to buy a gamepad. But I guess search engines and e-commerce would never be separated. Sadly this is how the Internet works now.

  26. Re:Vivisimo Categorization is language independent by deadbadger · · Score: 3, Informative

    If you're up for some maths and some fairly dry reading, check out the paper "Authoritative Sources in a Hyperlinked Environment" by Jon Kleinberg. He describes a search method which takes regular text-based search results and then examines the link structure around those pages. The idea is that pages of comparable content exhibit heavy interlinking. Clusters of such pages can be identified with a recursive algorithm a little like Google's PageRank, and then distinguished with some nifty eigenvector mathematics. This gives you your basic categories, based solely on the link structure.

    While the paper doesn't detail how one might label the categories identified, I don't imagine that it's all that difficult to do with some simple correlation algorithms, which wouldn't be language-dependent.

    Disclaimer: since vivisimo is down and I've not used it, I could well be talking out of my arse here; this is just one categorisation method with which I'm familiar, and would produce the results mentioned. It may not be how vivisimo actually do it.
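
    For the curious, the core iteration from Kleinberg's paper (usually called HITS) is short enough to sketch. This is a simplified illustration on a made-up link graph with hypothetical page names: it computes only the hub and authority scores, and leaves out the step of separating clusters with further eigenvectors.

```python
def hits(links, iterations=50):
    """Iteratively compute hub and authority scores for a link graph.

    links maps each page to the list of pages it links to.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        auth = {n: sum(hub[s] for s, ts in links.items() if n in ts) for n in nodes}
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # A page's hub score is the sum of the authority scores it links to.
        hub = {n: sum(auth[t] for t in links.get(n, ())) for n in nodes}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth


# Made-up graph: three link pages pointing at two content pages.
links = {
    "hub1": ["a", "b"],
    "hub2": ["a", "b"],
    "hub3": ["a"],
}
hub, auth = hits(links)
top_authority = max(auth, key=auth.get)
```

    Repeating the two update steps is a power iteration, which is where the eigenvector mathematics comes in: the scores converge towards the principal eigenvectors of matrices built from the link structure.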

  27. The search tools are really not the problem by IBitOBear · · Score: 3, Insightful

    Yes, there is glut and yes there are blog-holes.

    The thing I have noticed to be the greatest single limit on web searching is the operator. I can regularly find things on the net that my co-workers cannot. This is because I understand keyword boolean searching at a deeper level than most people.

    I blame this on the level of education of the common population, as opposed to being evidence of my own superiority. 8-)

    In a world where most people have never actually met or "dealt with" a librarian (archivist, whatever 8-) it should surprise nobody that these self-same people have no idea what it means to take personal responsibility for organizing their own approach to knowing things.

    Having grown up near and actually talked to librarians all my life I actually understand how to group information. Applying that knowledge to a search for some words and against others isn't that far a stretch.

    It is a personal pet peeve of mine to have to listen to people bemoan Google (etc.) when these self-same people have never even *noticed* the advanced search link, nor even learned the power of the minus ("-") in the standard search bar.

    There is no technology that can "fix" bad user inquiries that won't in turn "ruin" good ones.

    --
    Innocent people shouldn't be forced to pay for inferior software development.
    --"Code Complete" Microsoft Press
  28. Very specific to your search, but... by BigJimSlade · · Score: 4, Informative

    Google has, among others, a very nice linux filter already.

  29. What Makes a Search Engine Better? by fupeg · · Score: 3, Insightful

    From my own experience with developing search technologies for an e-content site, these guys are on the right track. Compared to a lot of search technologies out there, Google is dumb. But it is blazing fast, general purpose, and smarter than most of its (former) competitors. Part of why it is dumb is that it is so general purpose. To make a search engine smarter, you have to add context. Specialized search engines can do this by standardizing their inputs. Google could do this too, but it would require complex parsing of everything that it spiders.

    Another thing that Google really lacks is detection of duplicates. Google tries to do this, but does it poorly. I remember recently doing a search on Google for an obscure DB2 error code, and getting the same page out of the IBM manual over and over again, all on different college websites.
    This is another area where linguistic/statistical analysis could really help. Most knowledge-base products offer a "More Like This" feature that is an index of linguistic similarities between items. An easy way to detect duplicates with such a system is to have a fine scale and place an upper limit on similarities, i.e. any two items with a similarity > N are likely to be duplicates.
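
    A rough sketch of that threshold test. The similarity measure here is my own stand-in (Jaccard overlap of three-word shingles), not what any particular knowledge-base product uses, and the 0.9 cutoff and sample texts are made up.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences in the text, ignoring case."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def similarity(a, b):
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


def likely_duplicates(a, b, threshold=0.9):
    """Flag two items as duplicates when their similarity exceeds N."""
    return similarity(a, b) > threshold


page = "error code reference the operation failed because the table space is full"
mirror = "Error Code Reference  the operation failed because the table space is full"
other = "how to install a linux kernel patch on a new machine"

same_page = likely_duplicates(page, mirror)
different_page = likely_duplicates(page, other)
```

    Copies of a manual page on different hosts score at or near 1.0 even when case or spacing differs, so only one of them needs to be shown.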

    All of this being said, I would be surprised if Google does not address these issues in the very near future. I do not think they have gone down the path that many large companies go down where they stop trying to innovate and instead just try to protect their turf.

  30. Oh, I have to do it..... by Lxy · · Score: 4, Funny

    Here's the Google Cache of Vivisimo.

    --

    There is no reasonable defense against an idiot with an agenda
    :wq
  31. What I want... by K'tohg · · Score: 4, Interesting

    What I would prefer as an alternative to regexp (since that obviously would be way too much power and too many exploits) is simple logic operators.

    Most search engines now have AND and OR, but none have nested logic or shorthand.

    for example I would love to do this in google: (linux && modems) || ("AT commands" && !windows)
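
    Evaluating nested logic like that is not hard once the query is parsed. The sketch below skips the parser and writes the query above directly as a nested tuple; the tuple format, the substring matching and the sample documents are all assumptions for illustration.

```python
def evaluate(expr, text):
    """Evaluate a nested boolean query against one document's text.

    expr is either a phrase (str) or a tuple: ("and", ...), ("or", ...),
    or ("not", sub_expression). Phrases match as case-insensitive substrings.
    """
    if isinstance(expr, str):
        return expr.lower() in text.lower()
    op, *args = expr
    if op == "and":
        return all(evaluate(a, text) for a in args)
    if op == "or":
        return any(evaluate(a, text) for a in args)
    if op == "not":
        return not evaluate(args[0], text)
    raise ValueError("unknown operator: " + str(op))


# (linux && modems) || ("AT commands" && !windows)
query = ("or",
         ("and", "linux", "modems"),
         ("and", "AT commands", ("not", "windows")))

r1 = evaluate(query, "Configuring modems under Linux")
r2 = evaluate(query, "AT commands reference for Windows")
r3 = evaluate(query, "Hayes AT commands reference")
```

    The hard part for a real engine is presumably not the evaluation but doing it against an index of billions of pages without scanning each one.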

    --
    > SELECT * FROM brain_cells WHERE synaptic_rate > 0
    0 row returned