Grokker Search Engine Provides Visual Search Results
KeatonMill writes "The New York Times (as always, free registration) ran this article about a new search engine, called Grokker, created by a company called Groxis. Grokker builds a map of content categories using metadata. So far, it is used in the Amazon.com online catalog and the Northern Light search engine. Groxis is also developing a version to use to search your own computer."
(Yet another search engine) Do we really need another one. Google rocks, nuff said.
"Groxis is also developing a version to use to search your own computer."
You mean they have a FreeBSD version? yah, I thought so.
I've used Grokker at my school's library to do equity research. It works pretty well, though it's not as specialized as purpose-built programs.
Read jack phelps dot net
I saw some of the similar kinds of sorting of metadata with stuff from YellowBrix and LingoMotors.
I guess, given my background, I'd be interested to see how this works in the bigger arena and if they'll be doing widescale support of the PRISM and SCORM standards.
Anybody out there get to really play with this on the back end?
The power of accurate observation is commonly called cynicism by those who have not got it. - G.B. Shaw
Kartoo.com
Note: Flash required.
System Req's:
Windows 2000, XP (for non-English locale Windows 2000 or XP)
Pentium III at 400 MHz or higher.
128MB RAM (256MB preferred)
100MB of free disk space without preloaded Java (with preloaded Java 20MB of free disk space).
How do they ever expect this to truly catch on everywhere? Google is all you need, and it works great on a 386 with Lynx.
This is just another useless attempt at an eye-candy program from another dot-bomb company; move along, folks.
Searching using metadata? Are they just name-dropping terms? What metadata is it? If it's just gathered from a page then it's meta description and keywords. I mean, Northern Light doesn't search for Dublin Core metadata does it?
--Giving to trolls for the benefit of us all
I grok it.....
Kartoo (previously mentioned here) has been doing visual search results for quite a while already; I'd even say that's the most useful application of Flash that I've ever seen.
I'm a firm believer in Google and all, but I think that there are always new things that another search engine could provide that would make me switch. I think some of the things that made Google so big are that their pages load fast, the search results are so simple, and commercialism doesn't affect the search results themselves. If another engine can match this, I'm sure that they'll go far, even if not as far as Google has. Besides, if anything it will give Google something to worry about so that they'll work even harder at securing their position as the best. ;)
In my humble experience, I've noted that preview versions of things tend to come out on one platform, and usually the most widely used one.
Although they're in California, I would have my doubts as to them using solely American programmers. I don't know how various OS support is all around the world, but I do know that when I worked with Israeli companies, they tended to focus heavily on Windows due to the strong Hebrew support on that platform, and a noticeable lack of Hebrew support elsewhere.
Again, it could simply be because building the front-end for the widest range of users was simplest with Windows.
The power of accurate observation is commonly called cynicism by those who have not got it. - G.B. Shaw
That would be heaven!
I use search engines alot (for research) and it causes me to type the same words again an again. Right after breastfast I started reviewing my paper and cunted about 10 typos per page! Using search engines can affect your writing.
Mr. Decombe argues that Grokker is a more universal approach to the problem of visualizing textual information than what has been found in previous tools, which focus more on navigation than on categorization.
"The difference is that we have no single paradigm"
A search engine should be impartial- if you search for something, it should give you the site that best matches what you're searching for, not the site that best matches what the owner's opinions are.
Just recently, they removed several thousand websites from their index for unclear reasons- I first noticed this when a search for "somethingawful" failed to bring up anything on the http://www.somethingawful.com/ domain, as one would expect it to. I'm sure we all remember a few months ago when Google removed anti-Scientology websites from their index and refused to sell advertising to anti-Scientology sites and services. Something Awful, which I'm sure most people here are at least aware of (if not avid readers like myself), has in the past published several anti-Scientology articles.
A quick glance at the Google public support newsgroup shows that SA might not be the only site that's recently been removed. Some people are claiming that Google has recently removed dozens of Christian websites. It could be a fluke, but it seems to me like perhaps Google has fallen to outside political influence. I for one will welcome new search engines, if for no other reason than to loosen Google's monopoly on internet searches.
Username taken, please choose another one.
Saw TouchGraph on thescreensavers a while ago; this article just reminded me of it. Basically it's a Java applet that allows you to search Google and look at the relationships graphically, pretty cool.
Check out http://www.namesys.com/whitepaper.html - it's the future vision whitepaper for ReiserFS. I think that it would be neat for more people to rethink data indexing and metadata strategies at the operating system level. Let's think more about breaking our data into chunks and associating metadata with those chunks. For example, consider a database stored in a file. Why shouldn't the operating system know about those chunks so that more apps can see them? Why can't I use the same or similar access control mechanisms for those chunks, etc?
:)
Sorry for this kind of off topic post, but the word 'metadata' triggered the memory of that paper.
I don't know about anyone else here, but Amazon.com searches frustrate the hell out of me. I can type in the exact title or exact author's name into Amazon's search boxes, check the "Exact spelling" box or whatever it is, and still the thing I'm looking for comes up 8th, behind things that make no sense whatsoever, if it comes up on the first page of searches at all.
Typical example: I want to search for a specific book called Moon Madness, say. So I type it in, get 28,000 results and it's nowhere on the first page. Re-sort, right? Wrong. Amazon only allows you to sort by featured items (?!), A-Z order, Z-A order, or most popular order. And there's no way to skip ahead to halfway through the list on alphabetical order. So you're stuck clicking "next" thousands of times. I've just switched to BN which manages searches MUCH better. If Moon Madness is unpopular or non-featured, which is fairly easy to be, dust off that auto-clicker.
So if this is what Grokker has in store for us, leave me out. If it's just Amazon which limits the search engine's functionality, then Grokker definitely shouldn't be using Amazon as a reference.
I'm sure a lot of us know that this WILL NOT catch on. Maybe it's all that bad dot-bomb experience that induces negative thinking - but what the hell is this thing going to give me that a text-based system won't? I used it and found it very tiring to use. Turns out that nicey grafx is not always the best choice to present information quickly and precisely.
There are of course other reasons for that:
- NO WAY that many users will install this monster on their machines (if it doesn't come with Windows or a Linux distro most people won't bother even if they got broadband).
- Often you just want to do a simple keyword search, that's how the brain works most of the time, so the graphical relationship explorer thingy is not needed in most instances - and yet on occasions when it is needed it takes far too much (human) processing power to work with in a quick manner.
Besides, I found the tool to be somewhat of a context/paradigm breach that isn't well suited for ease of use or for "professional" search work.
When will they learn?
I am not about to spend $100 on software that I don't get to try beforehand. Not giving demos out is so... 19-something-or-other. Hell, I don't even buy CD's that I don't get to sample first. I ain't wasting money on cruft, let alone $100 cruft.
Sorry guys. No demo == no sale.
People have been presenting graphic search engines since 1995 (look up the WWW conference, for example). To this date, none of them have succeeded.
For all graph visualizers out there: no one cares that you can draw a nifty little graph with arrows as links (duh!). The question is: is the information associated with those links best grasped visually?
The page ranking algorithm from Google uses link information to compute the ranking of the result set. It is unclear how a collection of lines in a blank page will enhance the fact that the top reference is, ahem, the top reference...
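For anyone who hasn't seen how link information turns into a ranking, here is a minimal sketch of the textbook power-iteration form of PageRank in Python. It is an illustration only, not Google's production system: the toy link graph, the function name `pagerank`, and the 0.85 damping factor are assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    `links` maps each page to the list of pages it links to.
    Returns a dict mapping page -> rank, with ranks summing to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no outgoing links): spread its rank over everyone.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b and d all link to c; c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
ordered = sorted(ranks, key=ranks.get, reverse=True)
print(ordered)
```

The output is just an ordered list, which is the point being made above: once the link structure has been boiled down to a score per page, drawing the links again adds nothing to the ordering itself.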
I find it amusing that so often, technologically-inclined folks have this sort of "religious" appreciation for Google, as if it is the only search engine technology worthy of regular use or mass consumption. Nothing could be further from the truth. While I will admit that Google is my first (but not only) search engine of choice (and only for certain types of searches), I would like to state my general feelings on the current state of search engine technologies, which I would like to characterize as "brilliant yet balkanized".
Before I do that, and in all fairness to Google, I would like to say that Google's PageRank technology is, for the most part, decent, although certainly not universally-superior.
Google however, has a lot of room for improvement. Some suggestions?
(1) Search Result(s) Clustering
Take a look at Vivisimo.com... They are a company widely recognized for having the most intelligent results-clustering technology.
I find it bizarre that Google often reports to me that there are hundreds of thousands -- sometimes millions -- of results...and yet, gives me the "option" of browsing those results in a top-down linear fashion.
I feel that this is unreasonable. If Google were to CLUSTER results (especially in cases where there are >5000 results), I feel the end user would be much better served.
Google could certainly license Vivisimo.com's clustering technology...or implement their own proprietary form. Either way, I am amazed that Google's results are still not clustered -- even if that clustering was a checkbox option to do so.
It's just not sensible to often report thousands and sometimes millions of results, and then give the user an oversimplified top-down (linear) interface to browse through those results.
(2) "Visual" Search technologies.
I've been a regular user of Kartoo.com for some time now. I find it to be the most well-implemented keyword-connectedness research tool I've ever encountered. SEO-enthusiasts are blessed to have (for now) free access to Kartoo.com. It is also a spectacular implementation of Flash for a purpose other than just "looking slick" and being flashy.
Kartoo allows a person to easily (and very neatly) diagram the keyword commonalities that connect and relate documents on the web.
Unfortunately, the Kartoo interface seems to apply to a limited database. Kartoo functionality on top of Google's database would be ideal, in my opinion.
What am I suggesting? Personally, I would love to see Google buy out or license the Kartoo technology and let users apply it to the Google database. Kartoo really is a very intelligent and useful keyword relevancy diagram tool.
(3) Recursive searching of previous-generation results.
For the longest time, I never knew Google had this ability. Why? Well... after your initial Google search, you only see the "Search within results" link at the BOTTOM of the results list. I feel this option should be available at both the top and bottom of the results list.
(4) Memetic Histography
Take a quick look at HitBrain.com. While far from "perfect", they seem to be doing the best job thus far at keyword frequency tracking. While perhaps "novelty", I think there is real demographics-research value in the following sort of functionality:
Allowing registered users to track relative frequencies of keyword/keyphrase data sets. By this, I mean that a person could, for example, keep an ongoing (daily/weekly/monthly/etc) table of the number of instances of certain search tokens. For example, "john lennon" vs "paul mccartney"...or "microsoft" vs "macintosh"...etc. I think anyone who qualifies as an information age demographer has a use for tracking (over time) relative frequencies of keywords and keyphrases. There is also some entertainment value in seeing how many instances of "good" there are relative to "evil", etc.
(5) ISO search engine syntax standards. I think it would be nice if there was an ISO standard for search engine syntax. I personally prefer Boolean searches to non-Boolean searches (especially when clustering of results is not available). I think that all search engines should accept an ISO syntax standard for searching that, at the very least, allows for advanced Boolean queries, and also, string-proximity-specifying (that is, results for "A" within X number of characters/words from "B", etc) capabilities. Wildcard-capable advanced Boolean and string-proximity-specifying are very useful functions, and would be nice to see ISOfied on all major search engines.
It troubles me that the +" " and -" " syntax doesn't work on all my favourite search engines.
(6) DNS search capabilities.
Take a look at WhoisReport.com (now Whois.sc) and see what it can do for you. I have yet to find a better resource for searching the actual DNS itself.
Some may frown on the searching of the DNS itself but, the truth is, to a respectable degree, the DNS itself has evolved into being a useful directory of sorts. Name Intelligence (the people behind Whois.sc) make their technology available as an API, and Google would be wise to add DNS search capabilities to their WWW search capabilities.
Just my $0.02
Michael Fischer
Well I guess if they spelled it with 3 it would be too obvious
I guess it'll only be a matter of seconds before the notorious "Search/*couch*Spam*couch*-King" or his minions will find a way to abuse this device as well..
A horse can't be sick, you know, even if he wants to.
In the 1961 book Stranger in a Strange Land. Quite an achievement to add a word to the English language. It means "to understand profoundly through intuition or empathy; to comprehend."
"God fights on the side with the best artillery." - Napoleon, Marshal of France - speaking truth to power
Looks like a neat tool for navigating data. To those who say Google is enough: do you use Google to navigate your hard drive? Do you ever follow links on a web page or do you always Google to the linked page? Even Google has multiple types of searches.
Those screenshots look a lot like the Pad demos on the web page of Ken Perlin (my former advisor). Compare, for example, the Grokker web browser with the Pad site tour (which has been online since 1998).
[offtopic]: it uses an interesting algorithm. although it does seem to use a spring dynamic system, it seems to be critically damped. the nodes never oscillate, and seem to pull very smoothly. i have never seen graph layout algorithms with such smooth characteristics.
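A rough illustration of what "critically damped" means for a spring layout, as a Python sketch. Nothing here is taken from the tool being described: the one-dimensional setup, the function name `simulate_spring`, and all the constants are assumptions. A spring of stiffness k on a node of mass m is critically damped when the damping coefficient is c = 2*sqrt(k*m); a smaller coefficient lets the node overshoot and oscillate.

```python
import math


def simulate_spring(x0, target, stiffness=20.0, mass=1.0,
                    damping_ratio=1.0, dt=0.001, steps=5000):
    """Pull a single node from x0 toward target with a damped spring.

    damping_ratio = 1.0 is critical damping; below 1.0 is underdamped.
    Returns the list of positions visited (semi-implicit Euler steps).
    """
    # Critical damping coefficient is 2*sqrt(k*m); scale it by the ratio.
    damping = damping_ratio * 2.0 * math.sqrt(stiffness * mass)
    x, v = x0, 0.0
    path = [x]
    for _ in range(steps):
        # Spring force toward the target plus a drag force against velocity.
        force = -stiffness * (x - target) - damping * v
        v += (force / mass) * dt   # update velocity first...
        x += v * dt                # ...then position (semi-implicit Euler)
        path.append(x)
    return path


critical = simulate_spring(0.0, 1.0, damping_ratio=1.0)
underdamped = simulate_spring(0.0, 1.0, damping_ratio=0.2)

print("critical max position:   ", max(critical))
print("underdamped max position:", max(underdamped))
```

With the critical setting the node slides up to the target and never passes it; with the lower damping ratio it shoots well past before settling, which is the wobble most spring layouts show.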
BSD is for people who love UNIX. Linux is for those who hate Microsoft.
I've got to agree with you that Google has some room to grow, but our religious reverence comes from the fact that Google simply kills its competition. Before Google, I was a huge fan of Altavista. I could write some complex searches, and generally tease good results out of them. However, even as a programmer, I too often felt the bite of an insufficient query, or an insufficient database -- you never know. Now imagine the frustration of those not as technically adept as us.
Since I've started using Google, I've _never_ failed at a search and had another "generic" engine succeed. This obviously doesn't include results that sit behind specialized search engines like whois information.
Regarding some of your frustrations, you might be glad to know that Google does indeed support them. Recursive searching? Just add your words to the end of your original query. Memetic histography? Check out the Google Zeitgeist. ISO Syntax? Google's strength is that you don't have to program your searches. A simple query, combined with their ranking algorithms really sets you free.
Don't mistake this for blind adoration. As a search engine, nothing even approaches Google. As I've enjoyed their recent innovations (news and others), I continue to look forward to more.
It all goes downhill from first post
Since kartoo was mentioned here I've started using it and it has rapidly moved to being an important tool.
I don't use it to find specific answers to specific questions (google still does that better), but I do use it as a tool to find related topics that I might not have found otherwise. Sometimes it works very well indeed, sometimes not so well, but when it works it's great.
The other day I was just browsing a topic area of minor interest and discovered a tool to do something that I've wanted for a while (built my own, i did, so it will be interesting to compare results). Even though the tool was available back when I was looking for it, it was described in terms that were slightly different than the ones I was searching on, so I got no direct hits, and the more open-ended searches often resulted in way too many matches - enough to make effective narrowing tough.
I built something a while back that did something similar - using a graph layout tool, some ad hoc similarity measurements and a few other oddities - but it was a pain and didn't use the kind of interface to the search engines kartoo uses. Lacking effective spidering and a large enough database it was usable, but not always pleasant.
If only it were not flash! And if only there were a "open in other tab" thing so I could more easily keep search context.
grok /grok/, var. /grohk/ vt.
[common; from the novel "Stranger in a Strange Land", by Robert A. Heinlein, where it is a Martian word meaning literally `to drink' and metaphorically `to be one with']
The emphatic form is `grok in fullness'.
1. To understand. Connotes intimate and exhaustive knowledge. When you claim to `grok' some knowledge or technique, you are asserting that you have not merely learned it in a detached instrumental way but that it has become part of you, part of your identity. For example, to say that you "know" LISP is simply to assert that you can code in it if necessary - but to say you "grok" LISP is to claim that you have deeply entered the world-view and spirit of the language, with the implication that it has transformed your view of programming. Contrast zen, which is similar supernal understanding experienced as a single brief flash. See also glark.
2. Used of programs, may connote merely sufficient understanding. "Almost all C compilers grok the void type these days."
--
What can ya say? Im a Karma whore...
Allow me to rephrase it (I mean, after all, this is /.! :-D):-
It is now official - SearchEngineWatch has confirmed: Google is dying.
Yet another crippling bombshell hit the beleaguered Google community when recently IDC confirmed that the Google accounts for less than a fraction of 1 percent of all search engine usage. Coming on the heels of the latest SearchEngineWatch survey which plainly states that Google has lost more market share, this news serves to reinforce what we've known all along. Google is collapsing in complete disarray, as further exemplified by failing dead last in the recent Sys Admin comprehensive search engine usage test.
You don't need to be a Kreskin to predict Google's future. The hand writing is on the wall: Google faces a bleak future. In fact there won't be any future at all for Google because Google is dying. Things are looking very bad for Google. As many of us are already aware, Google continues to lose market share. Red ink flows like a river of blood. Google Groups is the most endangered of them all, having lost 93% of its core posters.
Let's keep to the facts and look at the numbers.
The Google CEO Eric Schmidt states that there are 7000 users of Google. How many users of other protocols are there? Let's see. The number of Google versus other search engine hits when you search "search engine" on Google is roughly in ratio of 5 to 1. Therefore there are about 7000/5 = 1400 other search engine users. Google posts on Usenet are about half of the volume of other protocols posts. Therefore there are about 700 users of Google. A recent article put Usenet at about 80 percent of the Internet market. Therefore there are (7000+1400+700)*4 = 36400 Google users. This is consistent with the number of Usenet posts about Google on Google.
Due to the troubles of Mountain View, abysmal sales and so on, Google went out of business and was taken over by Slashdot who sell another troubled web service. Now Slashdot is also dead, its corpse turned over to yet another charnel house.
All major surveys show that Google has steadily declined in market share. Google is very sick and its long term survival prospects are very dim. If Google is to survive at all it will be among search engine hobbyists. Google continues to decay. Nothing short of a miracle could save it at this point in time. For all practical purposes, Google is dead.
Fact: Google is dead.
(Credits: This is a revisionist post-modernist /.-aimed humour inspired by an earlier Web is dying troll, which in turn was inspired by earlier "BSD is dying" trolls. And oh, I got the post by googling slashdot for "Kreskin".)
More than mere navel gazing.
Besides Kartoo, the NYT article mentions Vivísimo as a search engine providing an alternative way of viewing results. Vivísimo displays the results in a more classical text-oriented list, but with a tree of hierarchical folders alongside it. These folders provide refinements of your search with additional keywords.
It's definitely worth checking.
I did the research: Start with Salton's or Ellen Voorhees's work and go forward.
Bottom line: boolean matching works "unreasonably well", clustering is more expensive for only marginally better results. Page ranks (big-ass SVDs) are done periodically, cost is cheap amortized over boolean queries. With clustering, you gotta do more work for every query.
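To make the cost argument concrete, here is a minimal Python sketch of boolean matching over an inverted index. The documents and the names `build_index` and `boolean_and` are made up for the example. The index is built once, and each AND query is then just a set intersection, whereas clustering has to do fresh work on every result set.

```python
from collections import defaultdict


def build_index(docs):
    """Build an inverted index: word -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Return the ids of documents that contain every one of the given terms."""
    # .get() avoids creating empty entries in the defaultdict for unknown terms.
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


# Hypothetical three-document collection.
docs = {
    1: "visual search engine results",
    2: "search engine ranking",
    3: "visual graph layout",
}
index = build_index(docs)

print(boolean_and(index, "visual", "search"))
print(boolean_and(index, "search", "engine"))
print(boolean_and(index, "grokker"))
```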
For a small document collection (single pc), and no measurable queries (i.e., single, infrequent user), this might be nice. Then again, if it is a small enough collection to be viable, it's probably overkill.
Some of these other techniques look interesting though. Might have to fire up google and investigate.
For groxis though, it doesn't really matter anyway. This entire space is covered by multiple overlapping patents. So even if they win, they lose...
in the treacherous seas of search is going to be this company's biggest hurdle. Assuming they don't founder in the sinking economy (unless they get bailed out), they had better run a tight ship. Because even if they win, they will likely lose, when the patent holders come to collect license fees.
Internet search space is seriously covered up with multiple overlapping patents on a few relatively simple notions about how to find stuff. It's a lawyer's game now, not a technology game.
Did anyone else notice the ad.doubleclick link which puts you through to the NY Times article? It's been changed now to a direct link, but I wonder how much the original poster made with that trick. tsk tsk.
I.O.U One Sig.
I was privileged enough to have this demo'd for me at my house back in 2000. I talked to the main programmer a lot and gave him some suggestions on how he should further develop his program.
At the time, he was not using XML in any respect, he had developed his own markup language for displaying his data results. Also, he had written his own version of AWT (back then that's all there was) because he didn't like the API.
I commend a guy who rolls his own, but he seemed to take it a little too far.
Anyways, more important, the product struck me to be pretty cool back then, but I really wondered about its uses. I asked him straight out, how do you plan on making money with this, because I really see no possible way. Google has now become the dominant search engine on the net, and people really don't feel the need for yet another way to present data (we've seen ask jeeves, we've had our fill).
I said to the programmer that I thought it was great what he had made, but I didn't think that it would have much appeal in the consumer market. People just aren't willing to pay for some interface that's just a little better; Grokker is definitely _not_ a silver bullet.
The programmer's business partner, the one who was in it for the money, thought that people would definitely go for it, and that people would pay to use the grokker engine. That age is long over, especially with the economy we have now. Good luck.
I see Grokker only existing in one space, small search interfaces for specific websites, like amazon, or other types of online stores or repositories (maybe one for the RFC editor?). But the point is, there's no money to be made here. The programmer has done a lot of work, and made a great product, but he'd be better off just making it open source and letting people use it than trying to make money off of it.
I see this company dying quickly once the VC's give up on it..and to the programmer, I'm sorry friend, but you're going to have to go back to France.
Good luck.
It troubles me that the +" " and -" " syntax doesn't work on all my favourite search engines.
It bothers me that the + syntax had to be invented in the first place. It arose from AltaVista, which would imply "OR" between words instead of "AND" by default, which practically nobody wanted. So users got in the habit of putting + before absolutely everything.
Google did a smart thing by getting rid of +, even though it conflicted with what people had grown to expect.
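To spell out the difference, here is a small Python sketch of the two defaults. It is a simplification for illustration, not AltaVista's or Google's actual query parser: the function name `matches`, the `default` parameter, and the sample document are all assumptions.

```python
def matches(query, text, default="AND"):
    """Decide whether `text` matches `query`.

    A leading '+' makes a term required and a leading '-' excludes it.
    Bare terms are required when default == "AND"; when default == "OR"
    they are optional, and at least one must match if nothing is required.
    """
    words = set(text.lower().split())
    required, excluded, optional = [], [], []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        elif default == "AND":
            required.append(token)
        else:
            optional.append(token)

    if any(term in words for term in excluded):
        return False
    if not all(term in words for term in required):
        return False
    # With only optional terms (OR default), any single hit is enough.
    if optional and not required:
        return any(term in words for term in optional)
    return True


doc = "grokker visual search"
print(matches("grokker google", doc, default="OR"))    # implied OR
print(matches("grokker google", doc, default="AND"))   # implied AND
print(matches("+grokker +google", doc, default="OR"))  # the '+' habit
```

Under the OR default the bare query matches a page that mentions only one of the words, which is why people ended up prefixing everything with +. Under the AND default the bare query already behaves the way the +'d query does.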
And then there's:
(2) "Visual" Search technologies.
(4) Memetic Histography
Google has always excelled at not wasting space and bandwidth with useless crap. This is an extension of that. I have never encountered a "visual" search engine which was actually pleasant to use. It's like a 3D desktop - it sounds like it should be better, but it just gets in the way of what you want to see.
Win dain a lotica, en vai tu ri silota
html version is here:
http://www.kartoo.com/en/kartoo.html
I really truly don't understand what is so special about this Grokker. I went to their website, and to the sites that use it and don't see the relation... oh well...
THE STORY OF CREATION
...
or
THE MYTH OF URK
In the beginning there was data. The data was without form and null, and
darkness was upon the face of the console; and the Spirit of IBM was moving
over the face of the market. And DEC said, "Let there be registers;" and
there were registers. And DEC saw that they carried; and DEC separated the
data from the instructions. DEC called the data Stack, and the instructions
they called Code. And there was evening and there was morning, one interrupt
-- Rico Tudor
- this post brought to you by the Automated Last Post Generator...