Better Search Results Than Google?
Mechanik writes "CNN has an AP article about the next generation of up and coming search tools, which try to cope with the glut of hits that result from 'conventional' search engines such as Google. One tool, Vivisimo, "is like a superfast librarian who can instantly arrange the titles on shelves in a way that makes sense. [...] But unlike libraries, Vivisimo doesn't use predefined categories. Its software determines them on the fly, depending on the search results. The filing is done through a combination of linguistic and statistical analysis." Grokker, another, downloadable program, "not only sorts search results into categories but also "maps" the results in a holistic way, showing each category as a colorful circle. Within each circle, subcategories appear as more circles that can be clicked on and zoomed in on." You have to love the author's use of trying to look for a hotel in France with the terms 'Paris Hilton' as an example of searching gone awry."
...until I can regexp my searches. It would make a whole lot of difference.
They aren't off to a very good start:
Problem occurred while using Vivisimo::
Currently under heavy load. Please try again shortly
Please go back to the Vivisimo home page and try your query again
Well, Google made a huge leap forward from the old guard of AltaVista & Yahoo, who were in their own way a huge leap beyond what had gone before. We had to expect this to happen sooner or later, but two things spring irresistibly to mind.
:-)
1) Will it gain the enormous foothold in the collective consciousness that Google has acquired? To Google is now a verb... and it gets mentioned on Buffy, which is as good a cultural barometer as we are ever likely to have.
2) Will the UI and secondary services (such as the ODP, and Google Groups) be as good as Google itself?
Also, while I'm sure that it will happen one day, I'll believe it when I use it and not before... Oh, and the Paris Hilton thing? LOL! That sort of anti-result comes back from search engines *a lot*. I was just talking to my mom about that kind of ambiguous search the other day.
Sign the FSF's Anti-DMCA petit
I tried this earlier (around noon) when I saw the article. One of my big complaints is that the searches seem to take too long. Google's searches are usually sub-second; these seemed to take about 3-5 seconds (this was well before Slashdot posted the article, so it wasn't the Slashdot effect either).
Also, I already dislike search results showing up in the sidebar (in Mozilla); that is one of the features I kill as soon as I install Mozilla. So I guess this search engine has a ways to go before I prefer it.
The searches didn't seem too bad overall. I tried looking for "linux kodak 4530" and its results were not any better or worse than Google's. I tried a couple of other searches and they seemed to be on target about as well as Google's.
Norris/Palin 2012
Fact: We deserve leaders who can kick your ass and field dress your carcass.
of Antarctica, an old and very clunky Java Yahoo-like engine (sorta). It used a map of Antarctica to drill down into categories and subcategories before putting the user in a 3D world interface at the lowest level. When I interviewed with them, the interviewer did an excellent job of turning me off the technology, explaining that the 3D interface would allow 'billboard and other advertisements' along with the search results formatted in a 'mall or street' of entries.
Gah.
A new search engine comes along that touts its uber-intelligent way of searching. It is hyped by the press but ends up by the wayside. (See Teoma)
I don't get excited about "Google alternatives". Google satisfies my searching needs as it is. Sometimes "knowing what to search for" is better than a super intelligent search engine.
As far as I'm concerned anyone with a clue can produce the results they need with a little bit of practice and common sense. They don't need new search engines.
Clif
clifgriffin > blog
Glad to see AP covering a site that's been operational for 2 years. Nothing like cutting-edge reporting.
What if you want that glut of hits? Sometimes you have to dig through some pretty obscure hits on a search to get what you want, and categorizing them or putting them in funny circles just complicates the process and can make the search take longer. I'll hang with Google and Teoma, thank you very much.
And I certainly don't want a downloadable search app running, that's just another possible inroad for spyware. I've been burned enough times by apps I thought were "clean" that went off and chewed up enough bandwidth to choke a horse.
Be excellent to each other. And... PARTY ON, DUDES!
Despite the problems with Google, it's still the best place I've found to get good info. The trick is to be very careful about how you search for something by adding in search modifiers such as "-sale" or "-bargain" or "review" to weed out the overtly commercial results. But even then, things have changed and not for the better.
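For anyone who hasn't used the modifiers described above, here is a minimal sketch of what a leading "-" does to a result list. It only illustrates the idea; the function names and sample pages are made up for this example, and it is not how Google implements exclusion.

```python
def parse_query(query):
    """Split a query into required terms and excluded ("-term") terms."""
    required, excluded = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])   # "-sale" excludes the word "sale"
        else:
            required.append(token)
    return required, excluded


def matches(text, query):
    """True if text contains every required term and no excluded term."""
    words = set(text.lower().split())
    required, excluded = parse_query(query)
    has_required = all(term in words for term in required)
    has_excluded = any(term in words for term in excluded)
    return has_required and not has_excluded


# Hypothetical result snippets, invented for this example.
pages = [
    "kodak camera on sale now at bargain prices",
    "kodak camera review with sample photos",
]
kept = [p for p in pages if matches(p, "kodak camera -sale -bargain")]
print(kept)
```

Only the review page survives the filter, which is the whole trick: the commercial pages tend to give themselves away with a handful of words.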
-S
--- What parts of "shall make no law", "shall not be infringed", and "shall not be violated" don't you understand?
you have to admit, this may be the first time we've managed to Slashdot a search engine. Yet another /. milestone!
Gentlemen, you can't fight in here! This is the War Room!
We realized the same idea for images. Take the results from Google Image Search and rearrange them using methods from computer vision.
An article about this is available here: Clustering visually similar images to improve image search engines.
Is there a search engine that can filter out all of those annoying placeholder sites that grab unsuspecting visitors by simply putting every word about a certain subject on a page and then having links to other useless websites? This is 'webspam' as far as I am concerned and the next step in search engine design should be 'placeholder' site aware.
A search engine that ignores specifically commercial sites would also be helpful.
Any ideas on either of these types of features in current or upcoming search engines?
"Vivisimo" can *somehow* come up with a better engine than google, will people use it? Google is getting bigger and bigger not necessarily by their search results (or lack thereof) but also because of how the phrase "google" has caught on in mainstream culture. Face it - when your competitor makes it into the dictionary, it's going to be EXTREMELY hard to get people to change the way they search. If you ask many non-techs how they find information on the web, they don't say "I search for it" they say "I google it".
Now, that being said, one thing the CNN article doesn't talk about in great detail is the technology behind this company - Google started out at a major university - what's the background of this company? While I agree something should be done with all the advertising that occurs with PageRank, I find it highly doubtful that it's going to be another company (rather than Google itself) that will fix it.
" and it gets mentioned on Buffy, which is as good a cultural barometer as we are ever likely to have"
Gawd help us. Society now sucks if that is our barometer.
Google, the verb, has been mentioned on Law & Order. _THAT_ tells me it has entered the mainstream.
Holy s-, it's Jesus!
Nothing but the finest in meaningless drivel
I'm actually posting this from the browser window of Grokker. Been playing with it for just a few minutes now, but I can see how something like this can make obscure or broad searches a lot easier. When you enter a search term, Grokker generates a series of circles, each of them representing a subcategory of results for your search term, and each of them in turn filled with subcategories of their own. Searching for "west coast museums", for example, gives me subcategories such as 'travel', 'west coast attractions', and 'history museums'. Once you find your desired subcategory you're presented with a smallish list of matching sites, represented as squares. The categorization seems to make sense most of the time, even if the overall visual effect is reminiscent of 70's disco lighting.
I want the fire back.
...a search engine which can't handle a slashdotting.
Find funky gifts
Google is about having good quality results with a very simple interface, one that anyone can use. Go to an academic library and look at the various journal search engines like "America: History and Life" or PsycINFO, or better yet just try out MedLine. See anything wrong? Busy page, weird syntax, a huge instruction page about "how to search".
Engines like Vivisimo may make it if they can keep Google's simplicity and ease of use and only add value with categorizations. And personally, I think they'd better get out of 1996 with the frames. Yech!
Man, I must have been sleeping...
When did google become a conventional search engine...?
--
bachiatari na torisetsu o yome!
I tried a few searches on Vivisimo before it went live on Slashdot and I must say I'm impressed. It addresses one of the main faults of search technology today: context. When you perform a search, a tree shows the different contexts (not categories) where the terms were found. Excellent for ambiguous concepts.
But, and here is the beef, it should be obvious to anyone that there must be an interface change in the near future of search. A textbox is a very limited input for expressing a complex search. Using regexps and regexp-like operators is not enough. Vivisimo is a step in the right direction, but there's a long way to go.
For example, try to make this search using any engine (Vivisimo, Google, Yahoo, AltaVista, etc.): who was the red-haired singer that recorded a song with Tom Morello a few years back? At least I can't find an answer, because one of the main aspects I'm using (the red hair) may not be as important as the other aspects anyone else would use to describe the situation.
There must be an interface revolution in the years to come. Come to think of it, are we still using a text field to express every possible combination in a Google search? Gross!!!
Life isn't like a box of chocolates. It's more like a jar of jalapenos. What you do today, might burn your ass tomorrow.
I think you misspelled barfometer
- Donny was a good bowler, and a good man.
His example of searching for Paris Hilton is nothing more than a glorified example to try to prove his point.
You do not need to completely redesign a search engine to get your desired results. You need to refine your search. Search Google for Paris Hilton Hotel and the first three results are directly related to a Hilton Hotel in Paris. I would not find this hotel any faster using his circle method with Grokker2. I use a search engine to find exactly what I am looking for. Displaying all the results on some chart, graph, or 3D display still requires me to browse around to narrow my search.
Bad boys rape our young girls but Violet gives willingly.
You can write an infinite loop in a lot of regexp packages. They would have to have a way of detecting that (or a very inefficiently written regexp).
Eat at Joe's.
What you ask is more difficult than one may originally think. As soon as a novel approach to counter-acting one of these annoyances becomes popular, it lands itself in the cross-hairs of those who would exploit "the system" in the first place. Witness the current arms race that is SPAM. Witness Microsoft security. Hell, witness Slashdot moderation.
There are a number of bright people on both sides of the aisle. When one side discovers a new technique, the other will work hard to neutralize said technique. This continues until either: it is too expensive for one side to continue, or too complicated for the consumer to bother with anymore.
It would be nice if there were a feature that filters out e-store entries. For example, I was looking for a solution to my Logitech RumblePad left analog stick problem. And no matter how refined my search is, I still get thousands of pages from stores selling that gamepad. I don't want to buy a gamepad. But I guess search engines and e-commerce will never be separated. Sadly this is how the Internet works now.
If you're up for some maths and some fairly dry reading, check out the paper "Authoritative Sources in a Hyperlinked Environment" by Jon Kleinberg. He describes a search method which takes regular text-based search results and then examines the link structure around those pages. The idea is that pages of comparable content exhibit heavy interlinking. Clusters of such pages can be identified with a recursive algorithm a little like Google's PageRank, and then distinguished with some nifty eigenvector mathematics. This gives you your basic categories, based solely on the link structure.
While the paper doesn't detail how one might label the categories identified, I don't imagine that it's all that difficult to do with some simple correlation algorithms, which wouldn't be language-dependent.
Disclaimer: since Vivisimo is down and I've not used it, I could well be talking out of my arse here; this is just one categorisation method with which I'm familiar, and it would produce the results mentioned. It may not be how Vivisimo actually do it.
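To make the link-structure idea a bit more concrete, here is a rough sketch of the hub/authority iteration from Kleinberg's paper. It only computes the principal scores by repeated updating and normalising; separating several clusters needs the further eigenvectors, which this sketch does not attempt. The tiny link graph and all names are invented for illustration, and none of this is claimed to be what Vivisimo does.

```python
def hits(links, iterations=50):
    """Hub/authority scores for a graph given as {page: [pages it links to]}."""
    pages = set(links) | {dst for targets in links.values() for dst in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        # A page's hub score is the sum of the authority scores it links to.
        hub = {p: sum(auth[dst] for dst in links.get(p, [])) for p in pages}
        # Normalise both score vectors to unit length so values stay bounded.
        for scores in (auth, hub):
            norm = sum(v * v for v in scores.values()) ** 0.5 or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Hypothetical mini-web: a, b and c are link pages; x and y are content pages.
links = {"a": ["x", "y"], "b": ["x", "y"], "c": ["x"]}
hub, auth = hits(links)
print(max(auth, key=auth.get))  # the page most pointed-to by good hubs
```

In this toy graph "x" comes out as the top authority because all three link pages point at it, and "a" and "b" score higher as hubs than "c" because they point at both content pages.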
Yes, there is glut and yes there are blog-holes.
The thing I have noticed to be the greatest single limit on web searching is the operator. I can regularly find things on the net that my co-workers cannot. This is because I understand keyword boolean searching at a deeper level than most people.
I blame this on the level of education of the common population, as opposed to being evidence of my own superiority. 8-)
In a world where most people have never actually met or "dealt with" a librarian (archivist, whatever 8-) it should surprise nobody that these self-same people have no idea what it means to take personal responsibility for organizing their own approach to knowing things.
Having grown up near and actually talked to librarians all my life I actually understand how to group information. Applying that knowledge to a search for some words and against others isn't that far a stretch.
It is a personal pet peeve of mine to have to listen to people bemoan Google (etc.) when these self-same people have never even *noticed* the advanced search link, nor even learned the power of the minus ("-") in the standard search bar.
There is no technology that can "fix" bad user inquiries that won't in turn "ruin" good ones.
Innocent people shouldn't be forced to pay for inferior software development.
--"Code Complete" Microsoft Press
Google has, among others, a very nice Linux filter already.
From my own experience with developing search technologies for an e-content site, these guys are on the right track. Compared to a lot of search technologies out there, Google is dumb. But it is blazing fast, general purpose, and smarter than most of its (former) competitors. Part of why it is dumb is that it is so general purpose. To make a search engine smarter, you have to add context. Specialized search engines can do this by standardizing their inputs. Google could do this too, but it would require complex parsing of everything that it spiders.
Another thing that Google really lacks is detection of duplicates. Google tries to do this, but does it poorly. I remember recently doing a search on Google for an obscure DB2 error code, and getting the same page out of the IBM manual over and over again, all on different college websites.
This is another area where linguistic/statistical analysis could really help. Most knowledge-base products offer a "More Like This" feature that is an index of linguistic similarities between items. An easy way to detect duplicates with such a system is to use a fine-grained similarity scale and set an upper limit on similarity, i.e. any two items with a similarity > N are likely to be duplicates.
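A minimal sketch of that threshold idea, assuming Jaccard similarity over three-word shingles as the similarity measure (my choice for illustration; a real "More Like This" index would use something more sophisticated). The sample documents, function names, and the threshold value are all made up.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a, b):
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


def likely_duplicates(docs, threshold=0.8):
    """Index pairs (i, j) whose similarity exceeds the threshold N."""
    pairs = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if similarity(docs[i], docs[j]) > threshold:
                pairs.append((i, j))
    return pairs


# Invented sample texts: two mirrors of the same manual page, one unrelated page.
docs = [
    "the current transaction has been rolled back because of a deadlock or timeout",
    "the current transaction has been rolled back because of a deadlock or timeout error",
    "how to configure a modem under linux using AT commands",
]
print(likely_duplicates(docs))
```

Comparing every pair is quadratic, so this only works on a page of results at a time, not on a whole index; that is probably fine for collapsing mirrored manual pages in a result list.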
All of this being said, I would be surprised if Google does not address these issues in the very near future. I do not think they have gone down the path that many large companies go down where they stop trying to innovate and instead just try to protect their turf.
Here's the Google Cache of Vivisimo.
There is no reasonable defense against an idiot with an agenda
:wq
As an alternative to regexp (which would obviously be far too much power and open up too many exploits), I would prefer simple logic operators.
Most search engines now have AND and OR, but none have nested logic or shorthand.
For example, I would love to do this in Google: (linux && modems) || ("AT commands" && !windows)
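Nested logic like that is not hard to evaluate, for what it's worth. Below is a small sketch of a parser and matcher for the &&, ||, ! and parenthesis syntax used above. The grammar, the function names, and the naive substring matching are all assumptions made for this example; no real search engine is claimed to accept this syntax.

```python
import re

# One token: a parenthesis, an operator, a quoted phrase, or a bare word.
TOKEN = re.compile(r'\s*(\(|\)|&&|\|\||!|"[^"]*"|[^\s()!&|"]+)')


def tokenize(query):
    tokens, pos = [], 0
    query = query.strip()
    while pos < len(query):
        m = TOKEN.match(query, pos)
        if not m:
            raise ValueError(f"cannot parse query at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(tokens):
    """Build a tree; precedence is ! (highest), then &&, then ||."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def or_expr():
        node = and_expr()
        while peek() == "||":
            take()
            node = ("or", node, and_expr())
        return node

    def and_expr():
        node = unary()
        while peek() == "&&":
            take()
            node = ("and", node, unary())
        return node

    def unary():
        tok = take()
        if tok == "!":
            return ("not", unary())
        if tok == "(":
            node = or_expr()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        if tok is None or tok in (")", "&&", "||"):
            raise ValueError("expected a search term")
        return ("term", tok.strip('"').lower())

    tree = or_expr()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return tree


def matches(tree, text):
    """Evaluate the tree against one document (naive substring matching)."""
    text = text.lower()
    op = tree[0]
    if op == "term":
        return tree[1] in text
    if op == "not":
        return not matches(tree[1], text)
    if op == "and":
        return matches(tree[1], text) and matches(tree[2], text)
    return matches(tree[1], text) or matches(tree[2], text)


# Hypothetical documents, invented for this example.
tree = parse(tokenize('(linux && modems) || ("AT commands" && !windows)'))
docs = ["Configuring modems under Linux",
        "AT commands reference for Windows",
        "AT commands cheat sheet"]
results = [d for d in docs if matches(tree, d)]
print(results)
```

The first and third documents match; the second is dropped by the !windows clause. Unlike a regexp, a query like this always terminates, since evaluation just walks a finite tree once per document.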
> SELECT * FROM brain_cells WHERE synaptic_rate > 0
0 row returned