Better Search Results Than Google?
Mechanik writes "CNN has an AP article about the next generation of up-and-coming search tools, which try to cope with the glut of hits that result from 'conventional' search engines such as Google. One tool, Vivisimo, "is like a superfast librarian who can instantly arrange the titles on shelves in a way that makes sense. [...] But unlike libraries, Vivisimo doesn't use predefined categories. Its software determines them on the fly, depending on the search results. The filing is done through a combination of linguistic and statistical analysis." Grokker, another, downloadable program, "not only sorts search results into categories but also "maps" the results in a holistic way, showing each category as a colorful circle. Within each circle, subcategories appear as more circles that can be clicked on and zoomed in on." You have to love the author's example of searching gone awry: trying to look for a hotel in France with the terms 'Paris Hilton'."
Despite the problems with Google, it's still the best place I've found to get good info. The trick is to be very careful about how you search for something by adding in search modifiers such as "-sale" or "-bargain" or "review" to weed out the overtly commercial results. But even then, things have changed and not for the better.
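To make the modifier trick concrete, here is a minimal sketch of assembling such a query. The helper names (build_query, search_url) and the sample terms "digital camera" are made up for illustration; only the "-word" exclusion syntax and the "review" keyword come from the comment above.

```python
from urllib.parse import urlencode


def build_query(terms, exclude=(), require=()):
    """Join search terms, extra required words, and '-word' exclusions."""
    parts = list(terms)
    parts += list(require)
    # A leading minus asks the engine to drop pages containing that word.
    parts += [f"-{word}" for word in exclude]
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Build a search URL; the query travels in the 'q' parameter."""
    return f"{base}?{urlencode({'q': query})}"


# Hypothetical example: look for reviews, filter out storefront pages.
q = build_query(["digital", "camera"],
                exclude=["sale", "bargain"],
                require=["review"])
print(q)
print(search_url(q))
```

The exclusions only filter on words that actually appear on the page, so they thin out the commercial results rather than remove them.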
-S
--- What parts of "shall make no law", "shall not be infringed", and "shall not be violated" don't you understand?
"Vivisimo" can *somehow* come up with a better engine than google, will people use it? Google is getting bigger and bigger not necessarily by their search results (or lack thereof) but also because of how the phrase "google" has caught on in mainstream culture. Face it - when your competitor makes it into the dictionary, it's going to be EXTREMELY hard to get people to change the way they search. If you ask many non-techs how they find information on the web, they don't say "I search for it" they say "I google it".
Now, that being said, one thing the CNN article doesn't talk about in great detail is the technology behind this company - Google started out at a major university - what's the background of this company? While I agree something should be done with all the advertising that occurs with PageRank, I find it highly doubtful that it's going to be another company (rather than Google itself) that will fix it.
Google is about having good quality results with a very simple interface, one that anyone can use. Go to an academic library and look at the various journal search engines like "America: History and Life" or PsycINFO, or better yet just try out MedLine. See anything wrong? Busy page, weird syntax, a huge instruction page about "how to search".
Engines like Vivisimo may make it if they can keep Google's simplicity and ease of use and only add value with categorizations. And personally, I think they had better get out of 1996 with the frames. Yech!
From what I understand, the reason that Google can do a great many searches at once and still complete each in 0.5 seconds (besides having a huge Linux farm) is that they take a lot of algorithmic shortcuts and precompute data structures as much as possible. There really aren't any such precomputed shortcuts to take with regular expressions, so regex searches would either be much, much slower, or Google would need to buy a vastly larger Linux farm, all for a feature that's used by less than 1% of the population.
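A toy sketch of the contrast being described, not a claim about how Google actually works: a keyword lookup against a precomputed inverted index only touches the posting sets for the query words, while a naive regex search has to scan the text of every document. The three documents and all function names here are invented for illustration.

```python
import re
from collections import defaultdict

# Made-up mini corpus: document id -> text.
DOCS = {
    1: "paris hilton hotel rooms",
    2: "hilton hotels worldwide",
    3: "paris travel guide",
}


def build_index(docs):
    """Precompute word -> set of document ids (an inverted index)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[word].add(doc_id)
    return index


def keyword_search(index, words):
    """Intersect the precomputed posting sets; no document text is read."""
    postings = [index.get(w, set()) for w in words]
    return sorted(set.intersection(*postings)) if postings else []


def regex_search(docs, pattern):
    """Naive regex search: every document must be scanned at query time."""
    rx = re.compile(pattern)
    return sorted(doc_id for doc_id, text in docs.items() if rx.search(text))


INDEX = build_index(DOCS)
print(keyword_search(INDEX, ["paris", "hilton"]))  # [1]
print(regex_search(DOCS, r"hil.on hotels?"))       # [1, 2]
```

The index is built once, ahead of time, and reused for every query; the regex scan repeats its full pass over the corpus on each search, which is the cost the comment is worried about.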
The bandwidth theft may be something to keep an eye on; something else to think about is the taxing Grokker's going to put on your box's resources:
"System Requirements
Windows 2000 or Windows XP
Pentium III at 400MHZ or higher
128MB RAM (we recommend 256MB or more, if you're going to use the file indexing service for the My Files keyword search)
100MB of free disk space (or 20MB only if Java 2 is already installed)"
Myself, I kind of like the idea of the graphical results, but not if my box is doing the grunt work. I think Google has them beat on that point.
Not to mention that Grokker "Contains a fully functional Web browser based on Internet Explorer". How would one go about updating the various patches for this browser?
http://www.groxis.com/service/grok/g_products.h
I went to the city because I wished to live without deliberation.
His example of searching for Paris Hilton is nothing more than a glorified example to try to prove his point.
You do not need to completely redesign a search engine to get your desired results. You need to refine your search. Search Google for Paris Hilton Hotel and the first three results are directly related to a Hilton Hotel in Paris. I would not find this hotel any faster using his circle method with Grokker2. I use a search engine to find exactly what I am looking for. Displaying all the results on some chart, graph, or 3D display still requires me to browse around to narrow my search.
Bad boys rape our young girls but Violet gives willingly.
What you ask is more difficult than one may originally think. As soon as a novel approach to counteracting one of these annoyances becomes popular, it lands in the cross-hairs of those who would exploit "the system" in the first place. Witness the current arms race that is SPAM. Witness Microsoft security. Hell, witness Slashdot moderation.
There are a number of bright people on both sides of the aisle. When one side discovers a new technique, the other will work hard to neutralize it. This continues until it is either too expensive for one side to continue or too complicated for the consumer to bother with anymore.