Post-Googleism At IBM With Piquant
kamesh writes "James Fallows of the New York Times reports an interesting search technology that IBM is developing. IBM demonstrated a system called Piquant, which analyzed the semantic structure of a passage and therefore exposed 'knowledge' that wasn't explicitly there. After scanning a news article about Canadian politics, the system responded correctly to the question, 'Who is Canada's prime minister?' even though those exact words didn't appear in the article. What do you think?"
That's pretty impressive. It takes quite a clever AI to read between the lines and connect concepts, but I have to wonder how much of its 'understanding' was hard-coded rather than purely abstract. Would it be trivial to just stick in another language database and have it read translations of the article the same way?
Nevertheless it makes me feel like all the programming and design I've ever done is pathetic and I will never amount to anything. That's how it is in the software industry - always someone out there who makes you look bad.
Sam ty sig.
One example is meaningless. To get a realistic idea of how useful this system is, we'd like to see what it says if you ask several dozen questions. For all we know this was the one question out of 100 that it answered correctly.
Feed it the news about Iraq. Then ask it what the war was about.
Good bye, new system, too dangerous for "national security".
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
The genius behind Google's success was paying *less* attention to the content of a page when categorizing it, and relying on links *to* the page instead. Why? Because of spammers.
"Genius" would imply some sort of brand new insight, but citation analysis has had a long tradition before Google appeared on the scene as a search engine. Google's biggest achievement is probably in implementing citation analysis on a very large scale, but they didn't break completely new ground in how people search.
And, in the long run, semantics-based analysis, like IBM's Piquant, is probably going to be the better technology: citation analysis for determining relevance to a query is really just a limited substitute for understanding of the content.
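For anyone who wants to see what "relying on links *to* the page" looks like in practice, here is a rough sketch of link-based scoring, assuming a simplified PageRank-style iteration. It is not Google's actual implementation; the function name, the damping value, and the toy page names are all made up for illustration. The point it shows is the one made above: a page that only talks about itself, with nobody linking to it, ends up at the bottom no matter what its content says.

```python
def rank_by_links(links, damping=0.85, iterations=50):
    """Score pages by who links to them (simplified PageRank-style sketch).

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping page -> score; the scores sum to 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with equal scores everywhere.
    score = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split among its outgoing links.
                share = damping * score[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * score[page] / n
                for t in pages:
                    new[t] += share
        score = new
    return score


# Hypothetical toy web: nobody links to "keyword_stuffed".
toy_web = {
    "news": ["reference"],
    "blog": ["reference", "news"],
    "reference": ["news"],
    "keyword_stuffed": ["reference"],
}

scores = rank_by_links(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(f"{page:16s} {scores[page]:.3f}")
```

Note that the page text never enters the calculation at all, which is exactly why stuffing a page with keywords doesn't help it here, and also why this kind of scoring says nothing about what a page actually means.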
From the article:
MR. CICCOLO, the search strategist, said that in a way his team was trying to match - and reverse - what Google has achieved. "As Google use became widespread, people began asking why it was so much easier to find material on the external Web than it was on their own computers or in their company's Web sites," he said. "Google sets a very high standard for that Web. We would like to set the next standard, so that people will find it so easy to do things at work that they'll wonder why they can't do them on the Internet."
They seem to be explicitly targeting intranets or known good databases, so the spammer issue might be moot.
This raises another issue, however. Will this technology become so useful as to lead to the bad old days of proprietary information dbs a la Lexis/Nexis? I'm assuming the indexing will have to take place on company-owned servers.