Better Search Results Than Google?
Mechanik writes "CNN has an AP article about the next generation of up and coming search tools, which try to cope with the glut of hits that result from 'conventional' search engines such as Google. One tool, Vivisimo, "is like a superfast librarian who can instantly arrange the titles on shelves in a way that makes sense. [...] But unlike libraries, Vivisimo doesn't use predefined categories. Its software determines them on the fly, depending on the search results. The filing is done through a combination of linguistic and statistical analysis." Grokker, another, downloadable program, "not only sorts search results into categories but also "maps" the results in a holistic way, showing each category as a colorful circle. Within each circle, subcategories appear as more circles that can be clicked on and zoomed in on." You have to love the author's use of trying to look for a hotel in France with the terms 'Paris Hilton' as an example of searching gone awry."
Business experts at the Wharton School are now predicting the ultimate demise of Google. According to them, once the SCO intellectual property lawsuit kicks into high gear, one of their main targets will be Google. Google, as you may already know, is one of the largest users of GNU/Linux, with some 10,000 separate machines, all running the Linux operating system. Once SCO is successful with their lawsuit, Google will most likely be forced to declare bankruptcy and shut down their services because of the expenses of trial costs, lawyers and of course Linux licensing fees. It will be a sad day indeed, since everyone I know uses and loves Google. But it is good to see a new player in the search engine game that looks like it can fill the gap left by Google.
...until I can regexp my searches. It would make a whole lot of difference.
They aren't off to a very good start:
Problem occurred while using Vivisimo::
Currently under heavy load. Please try again shortly
Please go back to the Vivisimo home page and try your query again
It's not as catchy or cute as google. I won't give wizimodo another thought.
Problem occurred while using Vivisimo::
Currently under heavy load. Please try again shortly
Please go back to the Vivisimo home page and try your query again.
- Donny was a good bowler, and a good man.
Well, Google made a huge leap forward from the old guard of AltaVista & Yahoo, who were in their own way a huge leap beyond what had gone before. We had to expect this to happen sooner or later, but two things spring irresistibly to mind.
:-)
1) Will it gain the enormous foothold in the collective consciousness that Google has acquired? To Google is now a verb... and it gets mentioned on Buffy, which is as good a cultural barometer as we are ever likely to have.
2) Will the UI and secondary services (such as the ODP, and Google Groups) be as good as Google itself?
Also, while I'm sure that it will happen one day, I'll believe it when I use it and not before... Oh, and the Paris Hilton thing? LOL! That sort of anti-result comes back from search engines *a lot*. I was just talking to my mom about searches of that type of ambiguous nature the other day.
Sign the FSF's Anti-DMCA petit
You have to love the author's use of trying to look for a hotel in France with the terms 'Paris Hilton' as an example of searching gone awry."
So what you're saying is the search went awry because the author decided that a hotel in Paris was more interesting than the other Paris Hilton entries?
I tried this earlier (around noon) when I saw the article. One of my big complaints is that the searches seem to take too long. Google searches are usually sub-second; this seemed to take about 3-5 seconds (this was well before Slashdot posted the article, so it wasn't the Slashdot effect either).
Also, I already do not like search results showing up in the sidebar with search engines (with Mozilla), as that is one of the features I kill as soon as I install Mozilla. So, I guess, this search engine has a ways to go before I prefer it.
The searches didn't seem too bad overall. I tried looking for "linux kodak 4530" and its results were not any better or worse than Google's. I tried a couple of other searches and they seem to be on target about as well as Google's.
Norris/Palin 2012
Fact: We deserve leaders who can kick your ass and field dress your carcass.
of Antarctica, an old and very clunky Java Yahoo-like engine (sorta). It used a map of Antarctica to drill down into categories and subcategories before putting the user in a 3D world interface at the lowest level. When I interviewed with them, the interviewer did an excellent job of turning me off the technology, explaining that the 3D interface would allow 'billboard and other advertisements' along with the search results formatted in a 'mall or street' of entries.
Gah.
A new search engine comes along that touts its uber-intelligent way of searching. It is hyped by the press but ends up by the wayside. (See Teoma)
I don't get excited about "Google alternatives". Google satisfies my searching needs as it is. Sometimes "knowing what to search for" is better than a super intelligent search engine.
As far as I'm concerned anyone with a clue can produce the results they need with a little bit of practice and common sense. They don't need new search engines.
Clif
clifgriffin > blog
Too bad their server overload isn't...
how the hell are you supposed to remember a syntax that resembles modemline noise on a good day?
one of the profs at my university was a perl buff. regexps convinced me that my future lies in physics...
Glad to see AP covering a site that's been operational for 2 years. Nothing like cutting-edge reporting.
What if you want that glut of hits? Sometimes you have to dig through some pretty obscure hits on a search to get what you want, and categorizing them or putting them in funny circles just complicates the process and can make the search take longer. I'll hang with Google and Teoma, thank you very much.
And I certainly don't want a downloadable search app running, that's just another possible inroad for spyware. I've been burned enough times by apps I thought were "clean" that went off and chewed up enough bandwidth to choke a horse.
Be excellent to each other. And... PARTY ON, DUDES!
On the other, a cottage industry has sprung up around keywording your site into the top ten. More often than not, most of the results I get are worthless shell sites that redirect you to an online store. Or worse.
If yahoo or MSN can do a better job of preventing this "search spam", they'll get a nice chunk of the search pie.
Do you even lift?
These aren't the 'roids you're looking for.
Tried it... too many ads, so I don't quite trust it to give me the kind of pure results I seem to get from Google. I'll wait for Google to implement the same kind of categorization system, or at least let other people who have the time test out Vivisimo.
---Technology will liberate us if it doesn't enslave us first.
Despite the problems with Google, it's still the best place I've found to get good info. The trick is to be very careful about how you search for something by adding in search modifiers such as "-sale" or "-bargain" or "review" to weed out the overtly commercial results. But even then, things have changed and not for the better.
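The modifier trick described above can be sketched in a few lines. This is only an illustration, assuming the common `-term` exclusion syntax; `build_query` is a hypothetical helper that assembles a query string and does not contact any search engine.

```python
from urllib.parse import quote_plus


def build_query(terms, exclude=(), require=()):
    """Assemble a search string from plain terms, extra terms, and -excluded words.

    build_query is a hypothetical helper, made up for this illustration.
    """
    parts = list(terms)
    parts += list(require)
    # Prefix each unwanted word with "-" so pages containing it are dropped.
    parts += ["-" + word for word in exclude]
    return " ".join(parts)


query = build_query(
    ["digital", "camera"],
    exclude=["sale", "bargain"],
    require=["review"],
)
print(query)
# URL-encoded form, suitable for use as a query-string parameter.
print(quote_plus(query))
```

Keeping the exclusions in a list makes it easy to reuse the same set of "overtly commercial" words across different searches.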
-S
--- What parts of "shall make no law", "shall not be infringed", and "shall not be violated" don't you understand?
I'm thinking so ...
(SNL Sketch)
I wonder what happens if you use Grokker to perform a search for images? It would be cool if the colorful circles could be re-patterned to resemble the image you are looking for.
In this way, searches for something like goatse.cx would be especially topical.
We realized the same idea for images. Take the results from Google Image Search and rearrange them using methods from computer vision.
An article about this is available here: Clustering visually similar images to improve image search engines.
The biggest benefit of any search engine is it actually being available for me to search on :). Hopefully the slashdotting won't take out the company so I can try it later.
At any rate, Google is an excellent search engine that is constantly being refined. It's good at what it does. I may consider using something complementary that does something differently, but I think I'd stick with Google for now for straight-out searching. Anything seeking to supplant it would have to handle more than just raw link feedback to different sites. It would have to take advantage of new forms of information and incorporate more than a huge link database into results.
You're reading Slashdot. Of course you like Linux and pc hardware
It's the same thing as with speeding: you don't do anything with those extra seconds/minutes except endanger the lives of yourself and other people.
Google's search results seem to be disintegrating. If you search for almost anything, you are bombarded with dozens of sites that have nothing at all to do with your search, but everything to do with installing trojans and popup-producers. It's depressing to see what used to be a useful tool totally swamped by noise....I just hope they can bring it back from the brink.
Is there a search engine that can filter out all of those annoying placeholder sites that grab unsuspecting visitors by simply putting every word about a certain subject on a page and then having links to other useless websites? This is 'webspam' as far as I am concerned and the next step in search engine design should be 'placeholder' site aware.
A search engine that ignores specifically commercial sites would also be helpful.
Any ideas on either of these types of features in current or upcoming search engines?
wow, we didn't have the www in the 70s and 80s!
"Vivisimo" can *somehow* come up with a better engine than google, will people use it? Google is getting bigger and bigger not necessarily by their search results (or lack thereof) but also because of how the phrase "google" has caught on in mainstream culture. Face it - when your competitor makes it into the dictionary, it's going to be EXTREMELY hard to get people to change the way they search. If you ask many non-techs how they find information on the web, they don't say "I search for it" they say "I google it".
Now, that being said, one thing the CNN article doesn't talk about in great detail is the technology behind this company - Google started out at a major university - what's the background of this company? While I agree something should be done with all the advertising that occurs with PageRank, I find it highly doubtful that it's going to be another company (rather than Google itself) that will fix it.
...it never says "Server is under heavy load. Try again later."
Canadian Cynic, canadian politics is less boring than you
Someone's thinking is screwy.
KS
You have to love the author's use of trying to look for a hotel in France with the terms 'Paris Hilton' as an example of searching gone awry.
A few weeks ago, there was a funny part on SNL. On Weekend Update, Jimmy Fallon was interviewing Paris Hilton (it was really her). He was asking her about The Paris Hilton, i.e. a Hilton Hotel located in Paris.
"Is there back door access at the Paris Hilton?"
"Is there double occupancy at the Paris Hilton?"
She took it pretty well, and it was pretty good.
"and it gets mentioned on Buffy, which is as good a cultural barometer as we are ever likely to have"
Gawd help us. Society now sucks if that is our barometer.
Google, the verb, has been mentioned on Law & Order. _THAT_ tells me it has entered the mainstream.
Holy s-, it's Jesus!
I searched for "shakespeare".
Take a look at the categories it determines..
Awesome!
-- jaf
Nothing but the finest in meaningless drivel
Pittsburgh-based Vivisimo sells its technology to companies and intelligence agencies, and offers free Web searches at Vivisimo.com.
Oh boy! Where do I sign up for my free registration! Here's my name, age, address...
Sigh.
You can't take the sky from me...
I'm actually posting this from the browser window of Grokker. Been playing with it for just a few minutes now, but I can see how something like this can make obscure or broad searches a lot easier. When you enter a search term, Grokker generates a series of circles, each of them representing a subcategory of results for your search term, and each of them in turn filled with subcategories of their own. Searching for "west coast museums", for example, gives me subcategories such as 'travel', 'west coast attractions', and 'history museums'. Once you find your desired subcategory you're presented with a smallish list of matching sites, represented as squares. The categorization seems to make sense most of the time, even if the overall visual effect is reminiscent of 70's disco lighting.
I want the fire back.
Until google allows you to permanently exclude all those annoying shopping/review pages that are nothing but click throughs to amazon etc then it won't be perfect - things like www.dealtime.com and www.shopping.com - why google even includes the damn things beats me.
I've been doing a lot of thinking lately about better ways to interface with data, generally with searches but it applies to most anything. Naturally this was inspired by reading some Sci-Fi (Saturn's Race by Niven and someone...the book is in the other room.) I got to thinking, the perfect interface I can imagine is much like an actual room, things laid out visually where you would expect them. The normal 2D GUI has always seemed a bit unnatural to me.
When this is applied to searches, I'd like to see information grouping, like the circles that were mentioned, although I want it more organic: tree structures, bookshelves, whatever is most appropriate to the current search, and I want them interchangeable so I can format my view however I think works best. In a web search, I like the idea of a street. The major sites, amazon.com, ibm.com, etc. are all represented by nice-looking storefronts, but there are also dark alleys I can go down to find less reputable places. So in this case, information is arranged by reputation of the source.
I haven't quite figured out how to approach this from a coding viewpoint, but surely there are projects out there that try this. WilmaScope, for example, is a good way to look at certain types of data. Why can't more things have this kind of intuitive interface? 3dDesktop is another attempt at this, but it is a mapping of 2D desktops to a 3D shape. I want more of a visual representation than just a bunch of desktops attached to a sphere. I know there are others out there, but how about some leads? What have you seen/used for intuitive data representation? Why hasn't this taken off?
...a search engine which can't handle a slashdotting.
Find funky gifts
Google is about having good quality results with a very simple interface, one that anyone can use. Go to an academic library and look at the various journal search engines like "America: History and Life" or PsycINFO, or better yet just try out MedLine. See anything wrong? Busy page, weird syntax, a huge instruction page about "how to search".
Engines like Vivisimo may make it if they can keep Google's simplicity and ease of use and only add value with categorizations. And personally, I think they better get out of 1996 with the frames. Yech!
Man, I must have been sleeping...
When did google become a conventional search engine...?
--
bachiatari na torisetsu o yome!
Google is simple, fast and uncluttered, as opposed to some of the ad-ridden monstrosities like Yahoo. These services would do well to follow this example.
If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
I suppose it stems from impatience and an unwillingness to learn such a basic thing needed to find information. Google is very simple, and they have a simple tutorial on how to find something specific. I think that is an excellent system, and to expect it to work properly with just two words is too much.
In the stated example, a simple change in the request should give far better hits. A Google search with these keywords would do the trick: "Paris France" "Hilton Hotel"
"I'm feeling lucky" would actually return the correct result: www.hilton-paris.com/
That link actually goes to the web site for Hilton Hotels in Paris, France.
Not bad for only four key words.
I've been told it's cool, but I've got 50 spacebucks for anyone who can explain how Kartoo works and why it is more useful than a search engine that returns "normal" text results.
I've read the FAQ, I even ran a few searches through it and fiddled with the results, but I still don't get it. Near as I can tell, it's just a way of making spaced-out pictures of words with circles and arrows around them - you know, like PowerPoint, but with fewer distractions.
Is it because I don't do drugs?
"Lawyers are for sucks."
- Doug McKenzie
I tried a few searches on Vivisimo before it went live on slashdot and I must say I'm impressed. It addresses one of the main faults of search technology today: context. When you perform a search a tree is shown showing the different contexts (not categories) where the terms were found. Excellent for ambiguous concepts.
But, and here is the beef, it should be obvious to anyone that there must be an interface change in the short-term future of search. A textbox is a very limited input to express a complex search. Using regexps and regexp-like operators is not enough. This Vivisimo is a step in the right direction, but there's a long way to go.
For example, try to make this search using any engine (Vivisimo, Google, Yahoo, Altavista, etc.): who was the red-haired singer that recorded a song with Tom Morello a few years back? At least I can't find an answer, because one of the main aspects I'm using (the red hair) is maybe not as important as other aspects used by anyone else to describe the situation.
There must be an interface revolution in the years to come. Come to think of it, are we still using a textfield to express every possible combination in a Google search? Gross!!!
Life isn't like a box of chocolates. It's more like a jar of jalapenos. What you do today, might burn your ass tomorrow.
I have yet to see a visualization tool that was truly useful. Do people really want to see their results laid out using Cartesian coordinates as result metadata? I don't think so. It's cute, but the reality is that people will prefer a list and, more specifically, look at the first five entries. Getting the right links into that top five is all that matters.
hahaha, the reason you get the person 'Paris Hilton' and not the place is the new personalized search algorithms.. they know you are a prev and have no money to travel so they give you the video.. lol you fool!
...google leaves so much to be desired. Too many paid and crafted links...too many stealth redirects...too many commercial links forced ranked...no AI.
google reminds me of that old pizza commercial with the new employee 'big dummy'. When he finally gets something to do, he runs off exclaiming "I am HELPING!!!" - not
Then again, that's what they said about the GNAA. And now they're famous!
And this guy is at least changing ONE word in every post.
Problem occurred while using Vivisimo:: Currently under heavy load. Please try again shortly Whoops. -James
I think you misspelled barfometer
- Donny was a good bowler, and a good man.
... not a search engine. In fact it submits the search to MSN, Netscape, Lycos and Looksmart and clusters the result into categories.
My question is: if it really is interested in providing the future of web searches -- why not cluster on top of google results which are accepted to be *the* best? Them not using google underneath and instead using Lycos etc shakes my confidence in them.
I think Google's search results are worse after their "Florida" update.
alltheweb.com has pretty decent search results.
John Kerry is a Joke!
Given Google's IPO situation, and someone I've never heard of touting themselves as being better than Google, maybe this someone is looking for a bit of capitalization themselves?
Well, for an attempt at a better newsbot than Google news, you can check out newsbot here. It does a few things that GN leaves out (XML feeds, PDA version, peer recommendations, etc, etc) and I believe it has a better S/N ratio. End of shameless plug.
Out of curiosity - are there any open source tools? I wish I could get something like citeseer.org (but served locally) connected with some of these other tools... but I can't seem to find anything.
All this article brings to the table is a bunch of BS. "Better search results" means you get what you want more often, not that the indexing makes "sense". Indexing already makes sense so long as you know how to use the index.
Right now you take a webpage, look for words on it, relate the words, then go to the most popular page for a given search. This works most of the time, but when someone types in a term they can mean very different things. For example, if I type in porn, I may be looking for freebie galleries, not porn.com. I may want women with 42 inch in diameter b00bs, not some thin anorexic teen. This is where an intelligent algorithm that helps individual users filter the results helps. Take the results the search engine gives you, and filter them for the information you specified based on your personal taste and the data it's designed to collect on you.
But then you've got the obvious privacy violation. I'll stick to the current system, it works well for what I need.
Candy-Coated Knowledge
I bet it costs too much to stay inside either.
You can still find old mirrors of the reverse engineering site, but the only active one I know of is at www.woodmann.com/fravia. The message board is at www.woodmann.net/forum, no crackz, serialz, or warez allowed. Just techniques, tools, etc.
Google does have a programming API for their search engine. As I recall, it's based on SOAP, and you get a 'developer ID' that allows your programs [X] searches per day. X being a 1 with some zeroes. (10000? )
I have no idea how it works, but that lil' genie dude really weirds me out. He looks like the genie from Aladdin after visiting the methadone clinic.
Nothing but the finest in meaningless drivel
The front page states "Information Overload was the past" yet you can't even use their search engine because it's been /.
His example of searching for Paris Hilton is nothing more than a glorified example to try to prove his point.
You do not need to completely redesign a search engine to get your desired results. You need to refine your search. Search Google for Paris Hilton Hotel and the first three results are directly related to a Hilton Hotel in Paris. I would not find this hotel any faster using his circle method with Grokker2. I use a search engine to find exactly what I am looking for. Displaying all the results on some chart, graph, or 3D display still requires me to browse around to narrow my search.
Bad boys rape our young girls but Violet gives willingly.
I get the message "Problem occurred while using Vivisimo::
Currently under heavy load. Please try again shortly"
Sure it's cool and everything but I'm not gonna use something that only works half the time.
www.kartoo.com - does what the article states without some other applications having to be installed.
How come all my searches return:
Problem occurred while using Vivisimo::
Currently under heavy load. Please try again shortly
Hmmmmm...
A search on "to be or not to be" on Google produces 3 erroneous results out of the first 10. Visi-whatever produces 10 out of 10. An improvement.
So far Vivisimo is proving to be 100% successful at removing the glut of results. All of them, in fact.
--- What?
Google was a leap forward from Yahoo! and AltaVista in terms of searching, but both have expanded to include other services that people may find handy. AltaVista has the Babelfish translation service, which I have found useful on numerous occasions. And Yahoo! expanded to a rather competitive web portal, with so many extras beyond web searching that if I try to list them, I'm sure I'll leave even more out. Google has also expanded a lot in its own right. Google News is very nice, as is their newsgroup searching, and let's not forget their option to display a cached version of the hits it returns. So, even if some of these new sites take Google's crown in the web searching arena, Google will most likely be around for quite a while.
Search engines are not a public service. They have to satisfy users and advertisers. That's the balance. You could try to start a subscription-based, totally commercial-free alternative, but I suspect there is little interest in the larger internet audience.
Google's main invention was not just that they had the largest search index, but that PageRank was so good that you could hit "I'm Feeling Lucky" and skip the results set altogether and go straight to the #1 result. The other 450,000 results can go to hell, Google knows the #1 is good enough.
But wait a second, Google's main revenue model is AdWords, which doesn't get a chance to show up if you jump past the result screen. Turns out people aren't feeling lucky that much anymore, that Google's #1 answer isn't the be-all end-all it used to be.
So, it's possible to match Google's indexing scope; AllTheWeb.com has already gotten that one done and then some. But the game now is, once you return a large volume of sites, ranking them properly. To whoever comes up with a better formula than PageRank, or a better presentation scheme than Google's trademark text-heavy interface, goes the prize...
I hope what I am writing is not too off-topic. I have found a tendency among people (mostly those in non-technical/non-scientific jobs) to associate top search results with a high level of "authenticity". It is totally overlooked that top results are "popular" but might not be of high quality or authenticity. Of course, a great deal of association can be made between "popularity" and "quality": better things are more popular. However, most often popularity (like power) feeds on itself, i.e. popular links become more and more popular (of course other scenarios exist). There should be some way out, some way to recognize the quality of information (Slashdot-like moderation of all webpages by a search engine is not a bad idea in theory!). So, until we have search engines that come up not only with popular sites but with more relevant content of high quality, there is a lot of scope for improvement. (For instance, how does an essay written by a college student through online research compare with one written through library research?)
Another area where search engines can make great improvement is the search of dynamic pages. "PageRank"-like algorithms suit static data well. For instance, a highly relevant post on some newsgroup posted *recently* might not show up on your search page! I hope Google isn't another future Microsoft (oh! did I mention power/popularity feeding on itself before? :) ) stifling innovation.
Search engines can be a lot, lot, lot better... hope they will be soon!
Why do I need another search engine again?
You both watch too much television.
Note:
I DID put "" around the phrase
Putting a + in front of the search still produces bad results
The desired phrase is not present at all in the page, not even in hidden keywords.
The bad results are not relevant: I asked for "to be or not to be", not the variations that do not match.
It does not matter that "pages with the phrase linked to these pages". That is not how Google works. If you read their own "Help", they say that a search produces results containing the phrase or word asked for. The "pages linking" part is only used for ranking. (so this is not a problem with me not understanding how Google is supposed to work)
This problem is symptomatic of Google searching: single-word searches work great, but phrase searches often come up with a lot of irrelevant results.
This is not rocket science. What is so hard about the basic `find where a = b` kind of search?
I asked Google about this. They said it was a bug they might fix someday. (so this is, again, not my misperception of how Google works)
As for Vivisimo: I tried it. The results on the "to be or not to be" search were 100% accurate, unlike Google.
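The `find where a = b` idea in the notes above amounts to an exact-phrase filter. Here is a minimal sketch, assuming the results are available as plain text strings; `contains_phrase` and the sample pages are made up for illustration, and none of this says anything about how Google actually matches phrases.

```python
import re


def normalize(text):
    """Lowercase the text and reduce it to words separated by single spaces."""
    return " ".join(re.findall(r"\w+", text.lower()))


def contains_phrase(page_text, phrase):
    """Return True only if the words of `phrase` appear consecutively in `page_text`."""
    # Pad both sides with spaces so matches fall on whole-word boundaries.
    return f" {normalize(phrase)} " in f" {normalize(page_text)} "


# Made-up sample pages; only the first contains the exact phrase.
pages = [
    "To be, or not to be: that is the question.",
    "To be or to not be, a parody of Hamlet.",
    "Whether to be a tourist or not in Denmark.",
]
hits = [p for p in pages if contains_phrase(p, "to be or not to be")]
print(hits)
```

Punctuation and case are ignored here, so "To be, or not to be:" still counts as a match; whether a real engine should be that lenient is a separate design question.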
FYI "Googled" was in the Sex and the City last night as well.
Carrie "I googled him... he's been around...."
It may be a sustainable argument to group geeks as information retrievalists who fare best with, and favour, alphanumeric representations. If geeks tend to excel at programming tasks, it suggests their skill set is honed to alphanumeric scripting. Perhaps the greater number of new internet users are more graphically orientated and prefer to manipulate data as images. It may be that the internet is undergoing the early stages of a transformation that will see information predominantly represented graphically. The PowerPoint users may be in the ascendancy, and the internet as geeks developed it will become an antedated layer wherein alphanumeric data suffers from being used by a minority.
"Academicians are more likely to share each other's toothbrush than each other's nomenclature."
Cohen
There is a DEFINITE central structure.
Atoms, modifiers, and conjunctions.
Atoms are character classes (letters, ranges, or bracket expressions), conjunctions of said classes, or a parenthesized expression (like in maths).
You have two conjunctions. The first is concatenation, which is what you get when you put one atom right after another (they both have to appear in that order). The other is alternation (pipe), where either the left atom or the right atom must appear.
Finally, modifiers are an optional number of repetitions for each atom to match. The default is from 1 to 1 (exactly one). * means from 0 to infinity, ? means 0 to 1, + means from 1 to infinity, and {x,y} means from x to y.
That's it.
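The atoms, conjunctions, and modifiers described above can be tried out directly. A small sketch using Python's `re` module; the `matches` helper, the patterns, and the sample strings are all made up for illustration.

```python
import re


def matches(pattern, text):
    """True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


# (pattern, text, expected result)
examples = [
    # Atoms: a literal letter, a bracket expression, a parenthesized group.
    (r"a", "a", True),
    (r"[0-9]", "7", True),
    (r"(ab)", "ab", True),
    # Conjunctions: concatenation, then alternation with the pipe.
    (r"[a-c][0-9]", "b4", True),
    (r"cat|dog", "dog", True),
    (r"cat|dog", "cow", False),
    # Modifiers: repetition counts for the preceding atom.
    (r"ab*", "a", True),         # * is 0 or more
    (r"ab?", "abb", False),      # ? is 0 or 1
    (r"ab+", "a", False),        # + is 1 or more
    (r"a{2,3}", "aaa", True),    # {x,y} is from x to y
    (r"a{2,3}", "aaaa", False),
    # Everything combined: a repeated group followed by an alternation.
    (r"(ha)+(!|\?)", "hahaha!", True),
]

for pattern, text, expected in examples:
    assert matches(pattern, text) == expected
    print(f"{pattern!r:>16} vs {text!r:<10} -> {expected}")
```

Using `re.fullmatch` rather than `re.search` keeps each example honest: the pattern has to account for the entire string, not just some part of it.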
Black holes are where the Matrix raised SIGFPE
You'd better hit Godaddy now! Even worse than Google, this one is easy to misspell. If this thing takes off, all the vivivimo, visisimo, vivimo, visisomo, etc domain names are going to be hot virtual real estate.
is a fine tool for certain types of jobs, but it's unable to understand context from content. Google works great because humans provide the context through linking. Statistical analysis has been shown to be much faster at text analysis than grammar-based approaches, but it is hardly the end-all of search technology. Tech writers need to pull their butts out of their heads.
Anybody has a google cache link for it? ;)
You can write an infinite loop in a lot of regexp packages. They would have to have a way of detecting that (or a very inefficiently written regexp).
Eat at Joe's.
While I can't seem to find any page that describes their technology thoroughly (only glossy uses of the word 'clustering'), my guess would be they use the well-known K-Nearest Neighbour algorithm to cluster results, which clusters based on the presence of certain words. They do say that they use the words in the search results returned by the underlying search engines (MSN et al.) for clustering.
So unlike google that brings the most linked (in some sense) result to the top reusing the research of the webmaster, Vivisimo takes the quasi-ranked results from other engines and divides them into smaller bins so that people who browse the results can skip uninteresting bins quickly and just look into the bin of interest.
That is clearly a step up from the results (read sponsored results) these underlying search engines offer, but if you ask me the power of google comes from harnessing the research and conscious linkage performed by website authors as opposed to allowing people to skip irrelevant results with ease.
My $0.02
Most famous google result? Now while I really don't care for the man or the policies of his administration a search engine that can be manipulated into giving these bogus results is not what we really want. Good luck to these new efforts.
If I can't find it on the first page of Google results, I don't look at the second page, but try to narrow the Google search. I don't see how squeezing 100 or so results graphically into a single screen is going to give a better search.
"digital camera" -retail
works for most things.
A blog about stuff.
I have used Vivisimo a few times but never realized that their method of categorization was quite language independent.
If it really is, then DMOZ, the Human Edited Directory, ought to incorporate dynamic categorizations like this, in fact to the point that someday each user should have his/her own unique categorization of all the websites in the world ...
Meanwhile, are they using the words in the headings to determine categories ? Or is it words that have in some way been emphasized ? And to do this in a way that transcends language ...
I am really curious as to how the words that determine "categories" in a sentence/para/section/page can be identified and sifted away from less important words. And how to determine the "keywords" that are not as important as "categories" but still more important than the "filler words" on the page. Keyword for Google is what you are searching for. That is easy. But how does Vivisimo take it further and establish it as a category?
To see a world in a grain of sand, and then to step back and see the beach where the sand lies
When did google become a conventional search engine...?
Yeah, that was my reaction, too. I remember Google being the next generation search tool that coped with the glut of hits that resulted from conventional search engines such as Yahoo and Alta Vista.
Internet time sure is fast.
"100,000 hits? That's nothing! In my day, we used to get over ten million hits, and we had to follow every link before we found what we wanted! And we liked it!"
Accountability on the heads of the powerful.
Power in the hands of the accountable.
Searching the Web is so important to most of us now that the download factor won't easily be overcome. Whether I'm at work, at home, at the in-laws, or at a friend's computer, I can jump online and conduct a Google search. The same is not true with Grokker.
My guess is that the Windows-only, downloaded app structure of Grokker will keep it from catching up to Google. Google is a search tool that works on any Web-accessible device and doesn't muzzle me the way a downloaded app does. I expect that whether it's Google or a competitor that makes the next leap in online search, it won't be a downloaded app.
Read the EFF's Fair Use FAQ
Looks like they should have invested some time in clustering their servers too! Fear the slashdot.
that's one of the most graceful slashdottings I've ever witnessed.
"You worthless post!"
-Shakespeare, 2 Gentlemen of Verona, 1. 1. 147
A good point. I switched to Altavista back in the days, because they had a relatively clean layout of the search results, which came up on the screen really fast. Later I switched to Google because of their even cleaner and more functional UI, not because I was getting better search results from them (there wasn't much difference that I noticed).
"Back in the day" for you must have been fairly recent. When I switched from Altavista to Google, doing a search for anything obscure (say, accidental death rates in national parks) meant putting your search terms into Altavista, spending the next ten minutes refining the search to get the number of results down to manageable levels, then checking each of the remaining thousand hits individually. Google, with its ability to put the page you're looking for in the top ten (and usually in the top spot!) was amazing.
Of course, for me "back in the day" was when Altavista was at altavista.digital.com
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
Mark Byland
PO Box 1577
Waitsfield, VT 05673-1577
1-802-496-5068
mbyland@iocus.com
Just a link to Vivisimo's background page, FYI.
If the author had used the query, "paris hilton hotel" on Google, what he wanted would have been the first link shown. If you are ambiguous, you get ambiguous results.
In my Vivisimo search, the ads were above the results. The results started below the 2 ads beginning with the number "1"...
Google rules for caching the documents, and highlighting your search terms. Try searching for something like 'linux man mount hpfs' and get not only the manpage, but the HPFS specific stuff highlighted for you. A few mouse-wheels later you can scroll right to the stuff you're looking for.
I want to delete my account but Slashdot doesn't allow it.
I Google'd 'Grokker' and nothing came back. How very strange...
Vivisimo is obviously better than google. I searched for "slashdot effect" and "slashdotted" and both times it gave me the result of: "Currently Under Heavy Load."
It's like a freakin dictionary!
Every time someone calls their product "holistic" I just want to take it and have at it with an awl and hammer. Then I say to them "Now it's DEFINITELY hole-tastic!"
While there is a lot of talk about what is better, why don't we think about putting the vivisimo auto-categorization on top of google? Best of both the worlds.
The article raises a good example of searching for a Hilton in Paris. I tried this search before going to France and ended up staying at a different hotel chain during my visit. From what I found off of Google the rooms at the Paris Hilton come with some very nice extras, but the lighting is just an awful shade of green.
To use this search engine, you have to load an obscure plug-in which slows down the browsing experience and greatly increases the presence of obnoxious ads? No thanks. It's a big mistake for them not to make something that sticks to standards.
What you ask is more difficult than one may originally think. As soon as a novel approach to counter-acting one of these annoyances becomes popular, it lands itself in the cross-hairs of those who would exploit "the system" in the first place. Witness the current arms race that is SPAM. Witness Microsoft security. Hell, witness Slashdot moderation.
There are a number of bright people on both sides of the aisle. When one side discovers a new technique, the other will work hard to neutralize said technique. This continues until either: it is too expensive for one side to continue, or too complicated for the consumer to bother with anymore.
"Currently under heavy load. Please try again shortly
Please go back to the Vivisimo home page and try your query again."
I bet.
Disclaimer: Just kidding, not work safe.
"Have you ever thought about just turning off the TV, sitting down with your kids, and hitting them?"
I honestly don't understand why people complain about "search spam". I've never had trouble using Google to find the results I'm looking for, unless I'm searching for something so obscure it might not even be on the web, or when I'm searching for something that would be better served using a specialized search engine (such as looking up FCC identification numbers).
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
I prefer to believe in the visionary Tim Berners-Lee and his Semantic Web ideas. There is a lot of work in that direction. When we can do searches with semantics, the results will be exponentially better. Until then, my bet is on Google. How much longer? Ten years?
Wow, how did they know my name is Shortly?
Powered by onion juice.
It seems to me that Vivisimo gets its search results from various other search engines and then categorizes them based on how many results they get from each site. While I haven't been able to test it to my satisfaction, I believe the results will be very similar to the search engines they use. However they present them in a different manner.
It would be nice if there is a feature that filters e-store entries. For example, I was looking for a solution to my Logitech RumblePad left analog stick problem. And no matter how refined my search is, I still get thousands of pages to stores selling that gamepad. I don't want to buy a gamepad. But I guess search engines and e-commerce would never be separated. Sadly this is how the Internet works now.
...THE PRESENT
It was still funny, since you described the situation perfectly.
Just tried a search for my website, and got the message:
Server under heavy load, please try again later.
But it did return the error message very quickly!
Guess we got 'em.....
I agree with you completely. However, IMHO, sometimes it's very helpful to get search results that are not exactly what you are looking for. Although I can't give a specific example right now, several times I have found new and helpful information by getting the WRONG search result. Nevertheless, it does help to weed out a lot of the garbage most of the time.
Anything 3d will immediately slow down your interaction to a snail's pace as you manipulate your environment. Even if it was a virtual mind-meld into a Matrix-like environment... "walking" to your search result and activating it would take longer than a quick scroll down a result list with text blurbs.
Intuitive does not mean good.
It should be efficient, and become good through acclimation. Just like riding a bicycle. It seems garish at first, but it makes perfect sense later on.
Just look at the interface from Minority Report. We should all be so lucky to have UIs like that. The answer is big screens, "front page snippets" representation of documents/results for at a glance viewing, and multidimensional arrangement where dimension (and tagging) is based on attributes (relevance, date, accuracy). Dimension could mean position in space or in a hierarchy, etc.
"CNN has an AP article about the next generation of up and coming search tools...One tool, Vivisimo, "is like a superfast librarian who can instantly arrange the titles on shelves in a way that makes sense.
Hmm...faster search engine...leads to...
More and faster Pr0n!!!
Why else would anybody bother to make it better?
They're still a software firm. Did you interview with Tim Bray of XML fame, perhaps? The web demo I saw way back when used ODP data and a lot of Java.
Who do you get to be an expert to tell you something's not obvious? The least insightful person you can find? -J Roberts
It seems to me that what searching REALLY needs next is a way of distinguishing the content of what you are looking for from its own identity-- are you looking for a definition, a short article, a map....... the "Paris Hilton" example is good, but a better one would be the search "teaching history": Do you want hits relating to the teaching of history, or to the history of teaching? Unless the Bayesian attempt to deduce what text is "about" makes rapid progress, we are probably going to have to have some standards for META tags to make this work right.
My non-tech mother uses it as a verb, thus I know it has entered the mainstream vocabulary, though I think she picked up from me and sis.
Argh!
Too . . . many . . . comebacks.
One thing though, she does have a sense of humour. Here is a summary of her appearance on SNL's Weekend Update right after the scandal broke. You can even download it if you want.
Well, there's spam egg sausage and spam, that's not got much spam in it.
I searched for "welding control" and got back a list with no places trying to sell me welders or welding supplies. Mostly, I got back useful papers that are not available through my school's library. This is where this idea could shine. Good Stuff. The system does need an overhaul, though. Just my two cents.
"That's not ironic, it's just mean!" - Bender
The Vivisimo toolbar looks like an exact copy of the google toolbar...
Well, Google made a huge leap forward from the old-guard, of AltaVista & Yahoo...
Google search for the term search engine.
If you're up for some maths and some fairly dry reading, check out the paper "Authoritative Sources in a Hyperlinked Environment" by Jon Kleinberg. He describes a search method which takes regular text-based search results and then examines the link structure around those pages. The idea is that pages of comparable content exhibit heavy interlinking. Clusters of such pages can be identified with a recursive algorithm a little like Google's PageRank, and then distinguished with some nifty eigenvector mathematics. This gives you your basic categories, based solely on the link structure.
While the paper doesn't detail how one might label the categories identified, I don't imagine that it's all that difficult to do with some simple correlation algorithms, which wouldn't be language-dependent.
Disclaimer: since vivisimo is down and I've not used it, I could well be talking out of my arse here; this is just one categorisation method with which I'm familiar, and would produce the results mentioned. It may not be how vivisimo actually do it.
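For anyone who wants to poke at the idea, here is a minimal sketch of the hub/authority scoring step from Kleinberg's paper, written as a plain power iteration. It only computes the principal scores; the category-splitting step with further eigenvectors is not shown. The function names, the iteration count, and the toy link graph are all made up for illustration.

```python
import math

def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}

def hits(links, iterations=50):
    """links maps each page to the list of pages it links to.

    Returns (hub, authority) score dicts after a fixed number of rounds.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # A page's authority is the sum of the hub scores pointing at it.
        new_auth = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # A page's hub score is the sum of the authorities it points to.
        new_hub = {page: sum(new_auth[dst] for dst in links.get(page, []))
                   for page in pages}
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth

# Hypothetical link graph: three pages linking to one or two targets.
toy_links = {
    "fan1": ["official", "wiki"],
    "fan2": ["official", "wiki"],
    "blog": ["official"],
    "official": [],
}
hub_scores, auth_scores = hits(toy_links)
```

In this toy graph the page everyone links to ends up with the top authority score, and the pages linking to both targets end up as the best hubs.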
and it gets mentioned on Buffy, which is as good a cultural barometer as we are ever likely to have"
Gawd help us. Society now sucks if that is our barometer
Correct.
Google, the verb, has been mentioned on Law & Order. _THAT_ tells me it has entered the mainstream.
Close. If it was on "Queer eye for the straight guy" then it hit the mark coz those guys sure do know everything [/sarcasm]
i read the headline, go to the site, and
/.
search for my site
and i get this.
Problem occurred while using Vivisimo::
Currently under heavy load. Please try again shortly
Please go back to the Vivisimo home page and try your query again.
#@$@#$ you
To err is human, to really screw things up, you need a robot.
Not too bad, maybe a bit overly graphical by default, did not produce satisfactory results on a side by side comparison with Google. Maybe in a few months?
Hello Kettle,
You, my friend are as black as pitch.
With love, Pot.
Google, the verb, has been mentioned on Law & Order. _THAT_ tells me it has entered the mainstream.
So has "UpYourButt.net." Several times in the course of a one-hour show, in fact.
If you ignore the TLD and go to upyourbutt.com (as Joe Sixpack is likely to do) then you will notice that NBC has advertised hard-core pornography on its evening programmes. I'd like to publicly shun NBC for its poor judgment.
Sincerely,
Seth Finklestein
Television Watchdog, not Television Watcher
I'm not Seth Finkelstein. I still speak the truth.
easy.
Detecting infinite loops?!?! If you can figure out a way to do that, then you'll have solved the halting problem and disproved the incompleteness theorem.
Of course you'll have also unraveled the fabric of the universe...
Kartoo is a French search engine that is kinda different. You sorta need high bandwidth cuz it uses Flash.
http://www.kartoo.com/
link to it here
I still miss the original Alta Vista.
1) Will it gain the enormous foothold in the collective consciousness that Google has acquired?
You'll never get me to say "vivisimo" in a complete sentence.
--
oh crap, you've done it !
Google as a verb has been mentioned by my mother. THAT is my barometer!
Life is like a web application. Sometime you need cookies just to get by.
The only one that got processed was "Slashdot Effect".
Grokker reminds me of a similar web search engine called Kartoo.
You may want to try it at: http://www.kartoo.com/
Look at the URL After a (failed) search :
http://vivisimo.com/search?num=150&query=linux&v%3Asources=MSN%2CNetscape%2CLycos%2CLooksmart%2COverture&x=0&y=0
Hmmm... Overture....
Take a look at a company called Copernic, they have been doing this type of so called better searches for years now.
google will just buy the company. then they get access to their technology and can implement it to make their service that much better.
I am the Alpha and the Omega-3
The Internet has been around since 1969.
Just with that word the search "should be" easier. Unless he meant "Paris Hilton real" or download or boyfriend.
as i just realized, alltheweb.com does clustering too, but not so prominently. it's on the bottom of the page.
and it does it in almost google-like time..
what i realized too is, that i should have spared myself some 5 minutes when i searched a recipe for the mayr-diet this morning for my mother (i did the search in german, results in english may vary). google was completely filled up with pseudo-sites.. alltheweb has much more sensible results..
fortunately i use opera, so next time it's just a switch from "F2 g mayr-diet" to "F2 a mayr-diet".. won't be so hard..
PAT
SEO Test: TIGI und SEBASTIAN - Online Shop - V
google (v.) is nearly 3 years old (at the very least), for christ's sake.
where did the last few years go?
"The number of Unix installations has grown to ten, with more expected." (Unix Programmer's Manual, 2nd ed.; june 1972)
Equifax has a service where you can get a copy of your credit report. This service accepts a "coupon" string if you have one.
Challenge: Create a google search phrase that returns a valid result in the first (non-sponsored) 100 results.
3 months ago, when I tried search for Google's CEO, Eric Schmidt, there were links to articles discussing insider trading charges that were filed against him. If you do a search today, there are no such results.
I was getting confused, you know.
A google search for "hilton hotel paris" worked just fine. What's the problem exactly?
Yes, there is glut and yes there are blog-holes.
The thing I have noticed to be the greatest single limit on web searching is the operator. I can regularly find things on the net that my co-workers cannot. This is because I understand keyword boolean searching at a deeper level than most people.
I blame this on the level of education of the common population, as opposed to being evidence of my own superiority. 8-)
In a world where most people have never actually met or "dealt with" a librarian (archivist, whatever 8-) it should surprise nobody that these self-same people have no idea what it means to take personal responsibility for organizing their own approach to knowing things.
Having grown up near and actually talked to librarians all my life I actually understand how to group information. Applying that knowledge to a search for some words and against others isn't that far a stretch.
It is a personal pet peeve of mine to have to listen to people bemoan Google (etc.) when these self-same people have never even *noticed* the advanced search link, nor even learned the power of the minus ("-") in the standard search bar.
There is no technology that can "fix" bad user inquiries that won't in turn "ruin" good ones.
Innocent people shouldn't be forced to pay for inferior software development.
--"Code Complete" Microsoft Press
searching for Paris Hilton the socialite and her "exploits".... that darn thing keeps showing the hilton hotel in Paris where Paris Hilton never partied....
"If you ask many non-techs how they find information on the web, they don't say "I search for it" they say "I google it"."
Most of the non-tech people I see start up IE and type what they're looking for in the MSN search that's still hanging around as their homepage.
Grokker has several major downsides as compared with Google right now:
* It's a program, not a web page.
* It only runs on Windows and MacOS X. (More generally: it cares what kind of system it runs on, which Google doesn't.)
* It uses Java.
Basically, it's a step in the wrong direction from Google. Google's homepage is the model of simplicity: no ads, no extraneous information, nothing that isn't specifically focused on getting you the search results you want. Google's search results are clear, unbiased, separate out and clearly label the advertisements, and have just the right amount of Do What I Mean. If Google came out with a version of this, it would just be a set of unobtrusive text links at the top and/or bottom of the page saying "Did you mean: 'Paris Hilton person' or 'Paris Hilton hotel'."
The other reason you don't want a separate client is that when you get the results, you will want to open them in a web browser. So why not use a web browser to find them in the first place? The only thing that might make sense is a browser plugin. Grokker also has a plugin, but it is proprietary, requires Java (which is also proprietary), and only works on Internet Explorer for Windows or Safari for MacOS X.
If Grokker wants to succeed, they need to realize two simple things:
* They should provide a service, not a piece of proprietary software. Provide a Free Software plugin or provide the information to someone who can.
* Text, text, text. Other than the Google logo, there are no images on Google's front page (which I rarely visit thanks to Mozilla's ability to search from the address bar) or their search results. Grokker's results are entirely image-based.
Here's a specific example for you: If I hadn't been planning that holiday to France, I wouldn't have stumbled on all those new and helpful porn sites.
As for your questions:
I doubt it. If anything, Google will eat better technologies when it comes to presentation. I think the secondary services are superb. They are not on the level of Google vs Old-School Altavista, but they impress none-the-less. And, your Mom sounds tehHOTT!!1blarg.
They're doing it.
$4 billion IPO.
The owls are not what they seem
One of our VPs repeatedly compared our site to Google the other day. That's the only barometer I need to worry about...
-- dR.fuZZo
I just want to start this thread to see if other people are starting to use Teoma. I find myself using it more than Google lately.
If you use Teoma, holler, reply to this thread and let the Slashdot readers know that Google will be usurped.
This is very difficult to do. Most keyword-based search systems use an inverted index for searches.
The first step involves a hash table that converts keywords into term-ids. Then the inverted index is used: it is a table that holds, for each term-id, a sorted list of document-ids that contain the term. The search process is almost instantaneous, as it only involves operations on sorted lists.
To use regexps, the search engine must convert the regexp into a series of words that match the regexp (a very large set - potentially infinite) and then look them all up in the inverted index. This is very slow and, as most users never use the advanced search function, very unlikely to be added to popular search engines until some competitive data structure is discovered.
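To make the cost difference concrete, here is a toy sketch of the structure described above. It is a simplification: a Python dict keyed by the term itself stands in for the hash table plus term-ids, and the regexp search simply tests every term in the vocabulary, which is only one possible way an engine might expand a regexp. All function names and the sample documents are invented for the example.

```python
import re
from collections import defaultdict

def build_index(docs):
    """docs maps doc_id -> text. Returns term -> sorted list of doc_ids."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

def intersect(a, b):
    """Merge two sorted doc_id lists, keeping ids present in both."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def search_all(index, query):
    """Keyword search: one dict lookup per term, then sorted-list merges."""
    terms = query.lower().split()
    if not terms:
        return []
    result = index.get(terms[0], [])
    for term in terms[1:]:
        result = intersect(result, index.get(term, []))
    return result

def search_regex(index, pattern):
    """Regexp search: every term in the vocabulary has to be tested."""
    compiled = re.compile(pattern)
    ids = set()
    for term, postings in index.items():
        if compiled.fullmatch(term):
            ids.update(postings)
    return sorted(ids)

# Made-up sample documents.
docs = {
    1: "linux modem driver",
    2: "windows modem driver",
    3: "linux kernel",
}
index = build_index(docs)
```

`search_all` touches only the posting lists for the words in the query, while `search_regex` walks the whole vocabulary, which is where the slowdown on a web-sized index would come from.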
Google has, among others, a very nice linux filter already.
My only issue with Google is that it has a mildly annoying problem with linking to other search engines. Say, for instance, you search for n. Sometimes, instead of being presented with a list of sites carrying information about n, you're presented with links to other (mostly horrible) search engines. It's just as bad as being served a list of pages that are nothing more than "Google magnets," filled with a bunch of terms close to the topic you searched for, but missing any real content.
That's Google's largest flaw, IMHO.
(1) Subscription Model - Make submissions for website links only accepted after review by human beings. You could then charge the 'searcher' a monthly or yearly subscription fee to access this service. I would definitely pay $5 a month to get a 'filtered' search engine.
(2) Community Ranked and Moderated Model - An open-source, community driven and moderated search engine that relied on the massive amount of visitors to comment and rank pages they have received via the search engine result page. A simple plug-in for IE or Netscape, etc., could allow the user to simply click on a scale of 1-5 how useful the site was. Obviously this would be biased against brand-new data, but this is a problem with a subscription service as well. With such a large number of users, this free, community moderation model would be hard to defeat, especially with IP tracking and the ability to constantly change the code in the moderation code.
The term "Slashdot Effect" is now one of only three clickable (i.e. searchable) links under their search box, suggesting that these folks at least have a sense of humour. Brownie points for that.
Last time I checked, Google used FreeBSD with internal mods, not Linux.
Lacoste
Vidi Vici Veni
Thanks for the sig
From my own experience with developing search technologies for an e-content site, these guys are on the right track. Compared to a lot of search technologies out there, Google is dumb. But it is blazing fast, general purpose, and smarter than most of its (former) competitors. Part of why it is dumb is that it is so general purpose. To make a search engine smarter, you have to add context. Specialized search engines can do this by standardizing their inputs. Google could do this too, but it would require complex parsing of everything that it spiders.
Another thing that Google really lacks is detection of duplicates. Google tries to do this, but does it poorly. I remember recently doing a search on Google for an obscure DB2 error code, and getting the same page out of the IBM manual over and over again, all on different college websites.
This is another area where linguistic/statistical analysis could really help. Most knowledge-base products offer a "More Like This" feature that is an index of linguistic similarities between items. An easy way to detect duplicates with such a system is to have a fine scale and place an upper limit on similarities, i.e. any two items with a similarity > N are likely to be duplicates.
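A rough sketch of that thresholding idea, for illustration only. The similarity measure here is Jaccard overlap of three-word shingles, which is a stand-in I picked because it is easy to show; it is not the linguistic similarity index any particular knowledge-base product uses. The threshold of 0.8, the function names, and the sample pages are all assumptions.

```python
from itertools import combinations

def shingles(text, k=3):
    """Return the set of k-word windows in the text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def similarity(a, b):
    """Jaccard overlap of the two shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)

def likely_duplicates(docs, threshold=0.8):
    """Return pairs of doc ids whose similarity exceeds the threshold N."""
    return [(x, y) for x, y in combinations(sorted(docs), 2)
            if similarity(docs[x], docs[y]) > threshold]

# Made-up pages: the same manual page mirrored on two sites, plus one other page.
manual_page = "error code 42 means the widget failed to start because the config file is missing"
pages = {
    "site-a": manual_page,
    "site-b": manual_page,
    "site-c": "a completely different page about welding control systems and papers",
}
duplicates = likely_duplicates(pages)
```

Comparing every pair is quadratic in the number of results, so a real engine would need something cheaper, but for the top few hundred hits of a single query the pairwise check is a fair illustration.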
All of this being said, I would be surprised if Google does not address these issues in the very near future. I do not think they have gone down the path that many large companies go down where they stop trying to innovate and instead just try to protect their turf.
Of course, for me "back in the day" was when Altavista was at altavista.digital.com
yup they were the only real search engine on the block, of any worth, back then.
What I want to know is when did they stop using altavista.net and start using altavista.com? I used altavista for the first time in years the other day and entered altavista.net and was surprised when it forwarded to altavista.com. I am sure it used to be the other way round.
I wonder how many Alpha systems were sold because of altavista?
blog and junk
Here's the Google Cache of Vivisimo.
There is no reasonable defense against an idiot with an agenda
:wq
When I'm feeling peckish, I like to use Kartoo. It searches for items in an interesting way.
__
Thou hast besquirted me, O leotarded one.
I like your sig. I think what makes it funny is that someone went to all the trouble to register that domain just for that.
I'm fed up with people saying Google is the best search engine. Anyone tried ALLTHEWEB? I did. And for me it's better than Google.
It was mentioned on a ton of shows during the same 2 week period when both of those references were made, it's called paid placement, and everyone pretends it couldn't be because it's GOOGLE.
sig:
See the "..for smart people" banners Wired runs here? Look elsewhere guys.
He did not say the internet. He said the world wide web. The world wide web is only a part of the internet. I thought everybody knew that...
Was that really all that hard? Google is around and will stay around because it's simple and it works. If you can't figure out how to combine search terms then maybe you should just stick with Yahoo and AOL.
...because it's too damned hard to type!
On a more serious note, however, what is the chance that a new search engine would become more popular than Google? I recently visited my relatives, and my 4 year old cousin knew what Google was. I think that the whole world is so entrenched with Google (cmon: how many people around the world do you think have google set as their homepage), that it would take somewhere near a miracle to get people to use something new. Either that, or a few pots of coffee.
anyone remember northernlight?? Vivisimo isn't much of a revolution, unless you consider their ability to get around northernlight's patent 'revolutionary.'
I would assume that's only due to bugs. Inefficient regular expressions would be a real problem and if not allowed, that would remove most of the advantage of having them.
To quote my own post to which you replied:"Fortunately most sites clearly tell you what results are paid."
RTFP
Are you "that Finklestein shit kid" that Chong's father rants about at the beginning of "Up In Smoke"?
Here http://www.google.com/corporate/tech.html Google says
... Google employed thousands of linked PCs - one of the world's largest Linux clusters...
although they use past tense verbs. I couldn't find BSD after looking for a few minutes.
Rrrr
What I would prefer as an alternative to regexp (since that obviously would be way too much power and too many exploits) is simple logic operators.
Most search engines now have AND and OR but none have nested logic or short hand
for example I would love to do this in google: (linux && modems) || ("AT commands" && !windows)
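Just to show the syntax is not hard to support, here is a small sketch of a parser and evaluator for queries like the one above. It checks a single piece of text rather than a whole index, and term matching is a naive case-insensitive substring test. The grammar, the operator precedence (`!` over `&&` over `||`), and the function names are my own assumptions, not anything a search engine offers.

```python
import re

# Tokens: parentheses, &&, ||, !, a quoted phrase, or a bare word.
TOKEN = re.compile(r'\s*(\(|\)|&&|\|\||!|"[^"]*"|[^\s()&|!"]+)')

def tokenize(query):
    """Split the query string into a list of operator and term tokens."""
    query = query.strip()
    tokens, pos = [], 0
    while pos < len(query):
        match = TOKEN.match(query, pos)
        if not match:
            raise ValueError(f"cannot parse query at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def matches(query, text):
    """Return True if the text satisfies the nested boolean query."""
    tokens = tokenize(query)
    text = text.lower()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        value = parse_and()
        while peek() == "||":
            take()
            rhs = parse_and()  # always parse the right side, then combine
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() is None:
            raise ValueError("unexpected end of query")
        token = take()
        if token == "!":
            return not parse_not()
        if token == "(":
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return value
        if token in (")", "&&", "||"):
            raise ValueError(f"unexpected token {token}")
        # Plain word or quoted phrase: naive substring test.
        return token.strip('"').lower() in text

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result

QUERY = '(linux && modems) || ("AT commands" && !windows)'
```

For example, `matches(QUERY, "Configuring modems on Linux")` is true, while a page about AT commands in Windows is rejected by the `!windows` clause.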
> SELECT * FROM brain_cells WHERE synaptic_rate > 0
0 row returned
so that they can precompute the infinite loops.
-- It only takes 20 minutes for a liberal to become a conservative thanks to our new outpatient surgical procedure!
looked up "miserable failure"...still had George Bush as the first result. In addition, the top category was "The president is a miserable failure." Coincidence? I think not. :)
Once upon a time, Altavista was king of searches. But Google quickly overcame that site.
However, Google seemed to do that through an even simpler interface, and very good results. I am skeptical that any of these other engines can be so much easier to use AND have better search results than Google. And it's not like Google is sitting still, either... they are adding some impressive features also that are things I could see a lot of people using. Vivisimo sounds a little too off the beaten path in terms of what people want from a search engine.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
First off, re: vivisimo, how is their folder system different, conceptually, from Northernlight? Granted, Northernlight doesn't have a public search any more, but I remember drilling through folders of search results 3 years ago.
Second, one of the things about google that's so refreshing is their sense of humor, which doesn't intrude at ALL on their usability.
Anyway, I don't see saying "Hang on, lemme vivisimo that..." any time soon.
But unlike Google, you can't say something like "vivisimo/grokker for it" ... it just doesn't sound right .. and unlike grokker, google is free
What would happen if this did go through?
Let's see, Given:
So I'll stick with Google. Loyalty. (And Linux/Mac friendly)
> SELECT * FROM brain_cells WHERE synaptic_rate > 0
0 row returned
Clustering algorithms are well known in the information retrieval field (try searching for "clustering" on CiteSeer for example).
Google has more than enough expertise to roll out clustering if they want it.
It does not correct my spelling mistakes. I have become verry used to that on Google.
Anyone else notice this? You contradict yourself: first you say '3d is bad', then you say '3d with big screens is good'...anyone doing physical/chemical etc. simulation will tell you that 3d output is (in many cases) good for comprehension.
-- Waht? Tehr's a preveiw buottn?
Don't look now, but you just did!
You're thinking like a linux user, and not the average user.
Honestly, you *must* have had some time in your life you're trying to find out something on the web and Google hasn't been able to easily find it. Another post used the example of the "red haired singer" which is a good one. If a search engine sorted all the websites into "CD Sales" "Performances" and "Tom Malone the Construction Worker" for that person, it'd certainly point them in the right direction.
Your response doesn't apply to most people. People don't want to learn how to work technology, they want technology to "just work" for them.
Everyone says that the big reason Google got big was the simple interface and the lack of ads. That is true to a degree, but another big reason is that Altavista was falling under an attack by those who would load pages with keywords. Which is the same thing that is happening with Google. Advertisers are loading links with keywords and setting up multiple incestuous domains. The attack and effects are largely the same. The first page of google is generally link farms with a few corporate web pages and almost no private content. The freedom of the press is limited to those that can get on the first few pages of google.
Since google seems unable to fix this problem, the time is ripe for another technology. Such technology can succeed if it is properly advertised.
"She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
i think he's confusing google with yahoo
www.dogpile.com and www.metcrawler.com, run by Infospace, have been doing for a while what Vivisimo does in its dynamic categorization. The hierarchical folder-like display is virtually identical. And Dogpile and Metacrawler are meta search engines, so you get the best of several search engines, including Google.
Oh, sorry, I missed the "in their right mind" part.
It's interesting that this kind of reminiscence is coming from the same group that generally prefers the command line over a GUI because it's more powerful.
--- 11 meters/second, or 24 miles per hour - the airspeed velocity of an unladen European swallow. Really.
Look again. Results 7, 9, and 10 do not contain "to be or not to be". One contains "to be or net to be", easy to overlook if you are not careful.
Altavista has always had the capability to specify not just that separate search items exist together in a document ("AND"), but that they occur in close proximity ("NEAR").
I can say:
"Knoppix distro" review
to Google, and I get results related to Knoppix, some of them indeed reviews OF Knoppix. I also, however, get useless hits that may mention Knoppix, but review something else further down in the document. I do not get hits restricted to Knoppix reviews.
If I do this with Altavista, I get hits much closer to what I want:
"Knoppix distro" NEAR review
as a law student, I've been doing a lot of searches on westlaw and lexis. Some of the handiest search improvements over basic google:
word1 /s word2 - search for word1 and word2 in the same sentence
word1 /p word2 - search for word1 and word2 in the same paragraph
word1 /4 word2 - search for word1 and word2 within 4 words of each other
word can be replaced with quoted strings. It's amazing how this will enable one to focus a complex search. Moreover, it's simple, easy to understand, and relatively simple computationally.
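For anyone wondering how cheap these operators really are, here is a minimal Python sketch of the same-sentence, same-paragraph and within-n-words tests. It is not how Westlaw, Lexis or Altavista actually implement them; the function names, the tokenizer and the sentence splitter are all made up for illustration, and a real engine would answer this from a positional index rather than rescanning the text.

```python
import re

def _tokens(text):
    # Crude tokenizer (an assumption): lowercase runs of letters, digits, apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

def _positions(tokens, term):
    # Token indexes where a (possibly multi-word, i.e. quoted) term starts.
    words = term.lower().split()
    n = len(words)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]

def _contains(chunk, term):
    return bool(_positions(_tokens(chunk), term))

def within_words(text, term1, term2, n):
    # word1 /n word2 (or NEAR): the terms start within n words of each other.
    toks = _tokens(text)
    p1, p2 = _positions(toks, term1), _positions(toks, term2)
    return any(abs(a - b) <= n for a in p1 for b in p2)

def same_sentence(text, term1, term2):
    # word1 /s word2: naive sentence split after ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return any(_contains(s, term1) and _contains(s, term2) for s in sentences)

def same_paragraph(text, term1, term2):
    # word1 /p word2: paragraphs are separated by blank lines.
    paragraphs = re.split(r"\n\s*\n", text)
    return any(_contains(p, term1) and _contains(p, term2) for p in paragraphs)
```

With this, a page that mentions Knoppix once and reviews something else much further down fails the /4 and /s tests even though a plain AND would match it.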
LOL, going waay offtopic here, but I clicked on your name and saw your journal entry. I mis-read it as "Spooning Yourself". Oh well, I got a laugh out of it anyway.
It's a clustering engine. It's written on their main page right about under the logo. And in the title of the browser window.
As such, it does not compete with google. They're different beasts. It is one thing to have to dig through a few billion web pages, and quite another to cluster about one thousand. The feats google accomplishes are impressive. What vivisimo does is group the results of google (or some other search engine, btw). It's interesting, of course, but not the same.
Will it replace google? Of course not: they don't even do the same thing. Because without google (or a decent search engine) vivisimo's toy would work on the principle: garbage in, garbage out.
Will vivisimo work with google, teoma, etc. and, perhaps, be incorporated by most major search engines? Perhaps. Or not. The thing is that the idea is good, but not very difficult to implement. I wouldn't even be surprised to see it soon in in-house versions of all major search engines. All it takes is a few computer scientists who understand document clustering (and both Teoma and Google have loads of people with high degrees working for them).
Of course I am. Don't you know how to read? The credits?
Sincerely,
Seth Finklestein
Famous Actor and Compensated Endorser
I'm not Seth Finkelstein. I still speak the truth.
A colleague just asked me a technical question. He said he'd normally look it up on google, but figured it would be faster to ask me.
There's probably a moral there, somewhere.
...laura
You just need a system that doesn't rely so much on heuristics, and relies more on humans.
I ate my sig.
that most search engines have a difficult time with is "Microwave dish". It's a perfectly valid search term and not especially generic. Let's assume you're searching for basic information regarding microwave antennas (the search engine doesn't know this of course) but don't know a whole lot about them.
But...
Am I talking about the dish antennas used with microwave radiation?
Or possibly cookware that is microwave safe...
Or just recipes for food that can be cooked in a microwave.
Most engines return a combination of all the above in no decent order. Google even returns some obscure clause in some apartment's lease as its #5 hit.
Better engines organize them into categories, or offer suggestions to clarify your search. Teoma does this though I see its closest approximation to microwave antennas is "Microwave Antenna Broadband Home".
-
I occasionally use http://www.teoma.com when Google fails me. It's not exactly "better", but it does provide a different set of results which can be handy when you're trying to find something rare. The "Refine" option is pretty neat too.
Recently, I've noticed a trend in 'landing' pages dominating the results, the kind that the search engine optimizers have been saying get you to the top of the engines. Experts have been saying that those don't work on Google, but over the last couple of months they *have* been working apparently. For instance, do a search for "80/20 mortgage". The first 6 results are all clearly the same search engine "bait" and Google appears to have taken it, hook, line and sinker. None of those pages are real content and none of them are either explanations of what an 80/20 mortgage is or even companies offering 80/20 mortgages.
I used this as an example, both because I already was looking for one and because it's a pretty non-geeky kind of thing to search for, rather than looking at results for Linux and complaining about MS entries.
The Glass is Too Big: My Take on Things
What makes you think for a second they'll let SCO push them around?
There's a lot of very very smart people there and a LOT of really good lawyers. And don't think for a second they will be caught unaware.
Need Mercedes parts ?
How was Military Academy then?
BEFORE Alta Vista? You mean Archie? (Trivia: AV's original name was gotcha.com)
I posit that the folks at Google are among the smartest around and unless they all come down with some incapacitating brain disorder nothing on the planet will render them useless. And yes, I'd put money on that.
Need Mercedes parts ?
Haven't you heard? There's a ".INFO" domain for non commercial, well, information. Just limit your searches to that TLD.
I know the TLD must work because I'm getting p3n1s p1ll spam pointing to domains in that zone.
Don't you love it when a plan comes together?
cf. http://swiss.frog.museum
Need Mercedes parts ?
Google didn't get where it is by having the best search results. They got that way because they weren't a 'portal' full of ads, and the junk sites were targeting other search algorithms.
Interesting, both of those services use Google results, and dice them up further. Grokker adds a few more on
And I can't find it either.
I'm still plugging away. Check out my article in Dr. Dobb's in December '03. It's mostly about the core indexing technique, and I haven't gotten around to putting a regex module on top of it. Right now it'll find any substring in the source file, and it's fairly straightforward to expand that to regexps - I just need a module that expands a regexp into a bunch of strings, more or less.
There are a number of applications for the technology; I'm just too broke to work on them much.
---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
Search on "paris site:hilton.com"
m or eDesc=true&ctyhocn=PARHITW
http://www.hilton.com/en/hi/hotels/index.jhtml?
Looks beautiful, anyone would be lucky to visit, even if you had to sneak in the back door.
I have used Vivisimo a few times but never realized that their method of categorization was quite language independent.
...
...
I can confirm that the categorization of Vivisimo works well independently of the language, and there was not really a noticeable difference in quality, whether I was searching for English, French, German, Russian or Polish terms.
It seems that the categorizing system is basically language-independent, but that they use language-specific lists of function words that should never be used as category names. At least, I noticed that with searches in Russian, Polish and Swiss-German, sometimes function words (e.g. pronouns, prepositions) were used as category names, which does not happen with searches in English, French and German, so I suppose they have a list of stop words for some languages, but not for others. However, it mostly happened when I was entering rather unusual search terms (mostly conjugated verb forms), so that it's not really a problem. Apart from excluding certain words, it seems to be language-independent.
If it really is then DMOZ, the Human Edited Directory, ought to incorporate dynamic categorizations like this, in fact to the point that someday each user should have his/her own unique categorization of all the websites in the world
I think human edited categories are quite different from automatic ones. Human edited categories are usually more exact, on the other hand, automatic categorization makes it possible to have much broader coverage. Both systems should be there and used when appropriate, but they should be kept distinct.
Meanwhile, are they using the words in the headings to determine categories ? Or is it words that have in some way been emphasized ? And to do this in a way that transcends language
I don't know how Vivisimo works internally, but from what I have seen there and what I know generally about automatic categorization, I suppose that basically those words are taken as category names that are significantly more frequent on pages that contain the search term than on the other pages (or in a general corpus). Then, I suppose there is an optimization, so that the sets of documents covered by the first-level keywords overlap as little as possible (e.g. when most of documents covered by category B are also covered by category A, category B is made a subcategory of A). It is possible that headings and emphasis are used for some kind of weighting, but that's not a central issue for such statistical methods. In principle, that works all quite well in a language-independent way (it's, of course, quite a challenge to implement such a system efficiently).
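To make that guess concrete, here is a rough Python sketch of the two steps: pick label words that are much more frequent in the result pages than in a general corpus, then nest one label under another when most of its documents are already covered. To be clear, this is only an illustration of the supposition above, not Vivisimo's algorithm; the stop list, the thresholds and the function names are all assumptions.

```python
import re
from collections import Counter

# Hypothetical stop list; a real system would need one per language.
STOP_WORDS = {"the", "a", "of", "and", "in", "to", "for", "is"}

def tokenize(text):
    # Set of distinct lowercase words on a page (document frequency only).
    return set(re.findall(r"[a-z]+", text.lower()))

def candidate_labels(result_docs, background_docs, query_terms,
                     min_ratio=2.0, min_docs=2):
    # Keep words whose share of result pages is at least min_ratio times
    # their (add-one smoothed) share of pages in the background corpus.
    res_df = Counter(w for d in result_docs for w in tokenize(d))
    bg_df = Counter(w for d in background_docs for w in tokenize(d))
    labels = {}
    for word, count in res_df.items():
        if word in STOP_WORDS or word in query_terms or count < min_docs:
            continue
        res_rate = count / len(result_docs)
        bg_rate = (bg_df[word] + 1) / (len(background_docs) + 1)
        if res_rate / bg_rate >= min_ratio:
            labels[word] = {i for i, d in enumerate(result_docs)
                            if word in tokenize(d)}
    return labels

def nest(labels, threshold=0.8):
    # Make B a subcategory of a larger A when most of B's pages are also in A.
    parents = {}
    for b, docs_b in labels.items():
        for a, docs_a in labels.items():
            if (a != b and len(docs_a) > len(docs_b)
                    and len(docs_a & docs_b) / len(docs_b) >= threshold):
                parents[b] = a
                break
    return parents
```

Nothing here depends on the language of the pages except the stop list, which fits the observation that function words only leak into category names for some languages.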
It's been used in Sex & the City as well. You don't get more mainstream than HBO.
I think it's Linux
Interesting ports on 66.102.9.104:
...
TCP/IP fingerprint:
SInfo(V=3.48%P=i686-pc-linux-gnu%D=1/6%Time=3FFA27B4%O=80%C=179)
"These people looked deep within my soul and assigned me a number based on the order in which I joined." --- Homer
nt
Jason Faulkner
Old Os Administrator
jason@oldos.org
oldos.
It took me 5 days to move my homepage from altavista to google. It took me about a day to go from google to vivisimo.
I have had a couple really good web sites that I simply couldn't find off google for months. On my first shot with vivisimo, the item I was looking for came up.
While this sounds pretty vague, I cannot tell you how impressed I am. The only tricky part to that site is the two advertisements that show up looking like a result from a search. If they get rid of that... sky's the limit.
I've been doing a certain amount of writing about the WTC attack and have been appalled by the amount of information that is unavailable because there is no way to find it. I can go back and review the old threads in the Alternet forums, follow the links, and the articles are still there, but Google won't pull them up. When Google gets corporatized after its IPO a lot more info will disappear. If Union Carbide becomes a major stakeholder you'll never find anything about the Bhopal disaster through Google.
"Think about how stupid the average person is. Now, realise that half of them are dumber than that." - George Carlin
First, I want to start by agreeing with the people who say if you know how to search well, google can be very useful.
Often, though, you'd much prefer to put in "digital camera", rather than "digital camera review -buy -shop -shopping -purchase -order"
I tried out vivisimo and I found it much easier and simpler than google. I put in digital camera and click reviews, voilà.
If you actually try the digital camera search on google, you'll probably find some useful sites, but I just used this as an example, because I'm _sure_ everyone here has punched in a simple query into google and found 20 spam sites, 5 unrelated blogs, 30 porn sites and a handful of warez sites.
webspam is an active area of work. I actually find Altavista to be remarkably impressive in this regard, and Google seems to be one of the worst these days. AllTheWeb and Teoma also do a fairly decent job.
sigs are a waste of space
Hmmm, that would be true only if the regex language is Turing-complete, which it is not (because it is a regular language, not context-free, not context-sensitive, not Turing).
You need to install an RTFM interface.
Digital? Oh, you must mean Compaq.
(Eagerly awaiting altavista.digital.compaq.hp.com.)
That's your machine's info, not google's.
Yahoogle!
The name.
It's not memorable and it's difficult to spell. "Google" is catchy and easy to spell.
Other than that it's pretty slick.
Ben
Work Safe Porn
funny & true - a lethal combination.
What I see in the replies is that the majority says that Google is good enough and that they do not need anything else/better/different. To me these are exactly the 'arguments' I hear when I tell people to switch over to Linux.
I do like that there are still different possibilities. I would hate to see Google become the one and only search engine, just as I hate Windows becoming the one and only OS or RedHat becoming the one and only Linux.
Don't fight for your country, if your country does not fight for you.
- commerce/money:
- politics:
- habits:
- religions:
- war/manipulation:
- social institutions:
a very nice post, smart and funny, i'm with you, everyone mod that post up!
How can a search engine know what is in your mind? When someone types "perl" and expects programmers, he/she needs to type "perl programmers or php programmers". But then again, does he/she want to find freelance php programmers for doing projects, or is he/she looking for marketplaces or tutorials? Artificial intelligence? Maybe!
Chris ,
Php Programmers.
This is probably particularly hard to find because the term "credit card" pulls up so much spammy commercial crap, obviously.
for beer money whilst at University, and the main thing I tried to get across was how to find stuff. We'd usually start by asking people their hobbies, sticking them in Altavista (back in the day!) and letting them browse what came up.
I can still remember the day I realised the net had changed a bit when one enthusiastic student of amateur photography tried it...
Excite White Paper
Check the copyright date.
BugBear
Ignorance is curable. Stupid is forever.
There is a book/standup comedy routine about googlewhacking... www.davegorman.com if you are curious.
;-)
I suspect Buffy could be considered one of many popular cultural barometers...
Right I'm off to watch TNG reruns...
At least try to think before posting pointless "me too" messages for karma....
Clever signature text goes here.
When it comes to phrase searches this search engine is absolutely horrible in comparison to google. When it comes to 'phrase' searches of domain names/hosts, it's just disgusting.
ummm, just did a search for Paris Hilton on google. sure, the screen is filled with useless junk...but please just see the top line of the output.
Category: Regional > Europe > ... > Travel and Tourism > Lodging > Hotels
click on that and the site you want and are looking for is right at the top.
i don't see what these useless whiners are complaining about!
3d output is good for comprehension. (DUH!)
All of which means that 3d interfaces are suitable when you need to see things that are 3-dimensional (double DUH!).
Note that I said big screens, but I didn't necessarily say 3D. A vertical dimension might be implied by stacking or scaling (ala OSX) if needed, but it doesn't need to be full blown 3D (this could become detrimental).
And you wouldn't want to have "surfaces" presented at any oblique angle, especially when they can be occluded or if transparency is involved. (We are talking about search results, right?) It prevents you from reading any text or identifying icons at different parts of it with equal resolving power.
And forget about identifying physical objects in 3d. Having a space filling characteristic just limits your ability to present things because there's too many additional opportunities for "clutter" (you should see my desk!)
Black holes are where the Matrix raised SIGFPE
You can read the whole study here: http://www.cs.umd.edu/local-cgi-bin/hcil/sr.pl?number=HCIL-2003-36
Abstract follows:
There have been several studies that compare sequential search results versus clustered search results, and graphical presentations versus textual presentations. These studies have resulted in confirmed efficiency and preference of clustering over sequential lists. The studies between graphical and textual presentations have usually shown to be task dependant. This study shows a systematic evaluation of zoomable versus textual clustered search results. A controlled experiment with repeated measures design and within-subjects differences was performed with fifteen subjects, comparing Groxis, Inc.'s Grokker - their clustering product - a zoomable user interface, their textual clustering product and Vivisimo's textual clustering product. No significant differences were found for objective measures. However, there were significant differences for subjective measures. The textual clustering interfaces was preferred and elicited major satisfaction among the users. Results are summarized in both a quantitative and qualitative format.
- Ben Bederson
Professor, Computer Science, Human-Computer Interaction Lab
University of Maryland
Are you sure you actually bothered to look? #1 regular result for 'scotland map' is a link to an excellent Scottish map. And anyway, this is what Google Images is for. Search for 'scotland map' then hit the 'Images' tab. Tons of maps at your fingertips. Can't see where your problem was. I found many maps in seconds, and didn't see any ads.
details...
You don't have to block it, you just rank them lower based on them not being relevant -- it's called an algorithm. People might sue, they just won't win.
I am not a number! I am a man! And don't you
Regular expressions (using the mathematical definition) are not Turing complete. They can only be used to specify regular languages. But the 'Regex' packages of the world are far more powerful. I do not know if perl's is Turing complete but I wouldn't be surprised.
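A quick way to see the gap between textbook regular expressions and a typical regex package is a backreference. The Python sketch below (the helper name is made up) matches "some a's, one b, then exactly the same number of a's", which is not a regular language, so no textbook regular expression can describe it. It says nothing either way about Turing completeness; it only shows that the packages go beyond regular languages.

```python
import re

# (a+) captures a run of a's; \1 demands the identical run again after the b.
# The language { a^n b a^n : n >= 1 } is not regular.
SAME_RUN = re.compile(r"(a+)b\1")

def same_run_on_both_sides(s):
    # True only when the whole string has the form a^n b a^n.
    return SAME_RUN.fullmatch(s) is not None
```

For example, same_run_on_both_sides("aaabaaa") holds, while "aaabaa" and "aabaaa" are both rejected.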
Eat at Joe's.
Is you know someone who is a music trivia nut. Best guess from a friend of mine is either Bonnie Raitt or Patty Griffin. There aren't too many red-headed females in the rock music genre.
They that can give up essential liberty to obtain a little temporary safety deserve neither safety nor liberty.
Ben
Do you mean like Dmoz.org?
Dmoz is more or less a moderated, categorized directory which also includes a search function.
Actually, the challenge of devising good labels is the key to a text clustering algorithm. Usually this is left as an afterthought, after the mathematical manipulations are done, which doesn't work well. A good algorithm should interleave grouping with describing, just as people would do if doing this task (very slowly, of course).
- Raul (from Vivisimo)
o/~ Join us now and share the software
It seems to me that search could be improved by paying attention to what you've done in the past. Current search engines assume each search is completely independent, but there's valuable information in what you did earlier, especially if you keep refining the query trying to find something. One attempt at this is demoed at Findory Search. It particularly makes a difference if you're refining your search query to try to find something. But it's still a work in progress.
Thanks, that's interesting. I don't suppose your algorithms (or just rough descriptions) are published publicly just yet, are they? We're going over the paper I mentioned at my department's seminar meeting soonish, and it'd be nifty to know of alternatives. No worries if it's all a trade secret, though ;-).
Mars rocks!