Semantic Search Points To Better Relevancy
ReadWriteWeb writes in to tell us about an article by Dr. Riza C. Berkan, founder and CEO of hakia.com, describing the promise of and potential for semantic search. This approach to providing more on-target search results contrasts with the dream of the semantic Web. Semantic search doesn't require all the Web page authors in the world to begin adding metadata; but it's not a sure thing that the researchers now developing the idea will get it right.
Hear the outlandish claims, ladies and gentlemen, of how the brave doctor just wants us to have better searches.
There are places where the networks are not touching, and there are places where they are. -- Boeing's Lori Gunter
IMHO semantics doesn't work on a global scale; it only works if you restrict yourself to trusted sources. If everyone can create data and attach semantics to it, the semantics become useless. You can't trust everyone to attach correct semantics: either they lack the knowledge to do it properly, or they maliciously attach the wrong ones.
09 f9 11 02 9d 74 e3 5b d8 41 56 c5 63
Honestly, if some Marxist state from the 60s produced propaganda like that, everyone would laugh:
"The People's Revolution is about more than nationalism! New communal agricultural techniques will enable a standard of living of a completely different nature than today! Manufacturing and distributing goods for the Workers could be taken to a whole new level!"
It's the same fallacy: "If only everyone spontaneously got together and did what I think they should, all problems would go away!"
Yet just because the fictional utopia in question is the 'Semantic Web' rather than the 'Workers' Paradise', everybody takes it really seriously. And nobody mocks it at all. Nope, nobody ever laughs at the Semantic Web.
Ok, ok, I'm just being mean, I should go and do something useful.
Whence? Hence. Whither? Thither.
But he doesn't say what the right way is, or how it could be, or even if he thinks his company is on the right track. There is no information at all.
Why, what did you expect, a link to their full source code? The article is about the direction the engines are taking and how that shows up in userland. If you asked Google about the specifics of their algorithm, they'd go quite silent all of a sudden too.
There is a huge problem with the argument made in the article - one which is plainly visible in the "Palladium" example. The meaning of "Palladium" is related to an internal state (i.e. my internal state). What am *I* thinking about when I write "Palladium"? Am I referring to the element Palladium? Am I referring to the DRM technologies from Microsoft? This is dependent on three things primarily:
1: my "role". What am I? Am I a journalist at a newspaper? Am I a private citizen with a large collection of ill-gotten mp3s?
2: my "context". Am I discussing something? Is this a query related to a conversation I am having with someone else? God only knows how many Google queries actually stem from ongoing IM conversations in which someone brings up a term or subject the other party has never heard of.
3: my "personality". What am I primarily interested in? What is my preferred format of consumption? If I am 7 years old - what the hell does "Palladium" really mean?
To me it is obvious that the idea of a semantic web, the promise if you will, can never be delivered upon without a framework that is user-centric rather than centralized in the current Google fashion. Desktop search is interesting to some extent as a way of tying our personal space to the dataspace outside of our local control, but that is still a very limited tool. Much of what is very simplistically covered in 1 and 2 above is related to interpersonal communication, so what is necessary is data structures that learn from ongoing conversation. For example, the intersection of Person A and Person B could be described in a way that gives us guidance as to the appropriate (or most likely) interpretation of the term being used.
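To make that a bit more concrete, here is a rough Python sketch of the kind of user-centric structure I mean. Everything in it is made up for illustration: the `Sense` and `UserProfile` classes, the cue words, and the weights are assumptions, and plain word overlap is a crude stand-in for whatever a real system would learn from conversations. It only shows the shape of the idea: the same query term gets ranked differently depending on role, ongoing conversation, and interests.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sense:
    """One candidate meaning of an ambiguous term (hypothetical structure)."""
    name: str
    cues: frozenset  # words assumed to co-occur with this meaning


@dataclass
class UserProfile:
    """What we know about the person asking: points 1-3 above."""
    role_terms: set = field(default_factory=set)          # 1: role
    conversation_terms: set = field(default_factory=set)  # 2: context
    interest_terms: set = field(default_factory=set)      # 3: personality

    def observe(self, message: str) -> None:
        """Learn from an ongoing conversation by remembering its words."""
        self.conversation_terms.update(message.lower().split())


def rank_senses(senses, profile, weights=(1.0, 2.0, 1.0)):
    """Order senses by weighted word overlap with the user's profile.

    The weights (role, conversation, interests) are arbitrary assumptions;
    conversation is weighted highest only because it is the freshest signal.
    """
    w_role, w_conv, w_interest = weights
    scored = []
    for sense in senses:
        score = (
            w_role * len(sense.cues & profile.role_terms)
            + w_conv * len(sense.cues & profile.conversation_terms)
            + w_interest * len(sense.cues & profile.interest_terms)
        )
        scored.append((score, sense.name))
    # sorted() is stable, so tied senses keep their original order
    return [name for _, name in sorted(scored, key=lambda pair: -pair[0])]


PALLADIUM = [
    Sense("chemical element", frozenset({"metal", "catalyst", "chemistry", "element"})),
    Sense("Microsoft DRM", frozenset({"drm", "microsoft", "mp3s", "windows"})),
]

profile = UserProfile(role_terms={"journalist"}, interest_terms={"mp3s"})
profile.observe("did you hear what Microsoft is doing with DRM")
ranking = rank_senses(PALLADIUM, profile)
print(ranking)  # ['Microsoft DRM', 'chemical element']
```

Feed the same function a profile whose conversation is about catalysts and the element comes out on top instead. The point is only that the ranking lives with the user, not with the index.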
There is much that can be said about this, but suffice it to say that the semantic web people are ignoring the real needs that have to be met in order to create something that is truly semantic and carries a knowledge of what the end user actually intends. Because if we don't understand the intent, we don't really understand anything.
I've had a wonderful time, but this wasn't it -- Groucho Marx
If it weren't for my wife, my media consumption would consist entirely of science fiction and WWI/II movies; thanks to my wife, I've been exposed to a much broader swath of media genres -- some of which have been painful, and some of which I've regretted... but on balance, I think I'm a better person for it. But, then, I possess an abundance of room for improvement.
Actually, this issue is something that bothers me. This increasing ability to narrow our exposure to data which we find unpleasant, to filter out the world so that we only see what we want to see, is vaguely disturbing. I see what I think are consequences of this increasingly in my own country, and evidence of it in the form of rising fundamentalism around the world. I'm afraid that I do it, too. It is limiting and dangerous, and increasingly easy to do.
I don't have a solution, and maybe there isn't one. Perhaps, someday, we'll all live in virtual realities where all of the facts are shaped to what we want to believe, and we'll never have to interact with anybody who disagrees with us, and we'll find that this is the utopia that humans have been searching for.
Maybe.
--- SER
No. Actually, you're being accurate. Unless folks can solve the multiple taxonomy problem (and, no, deciding on a common taxonomy and taxonomy translation approaches have not worked in the past) and the metadata cheating problem, the "Semantic Web" is BS promulgated by someone who probably doesn't know the history of epistemology, taxology, or why hard AI problems really are hard, even if he has been knighted. And the people who think that this is worthwhile are the same techno-utopians who probably don't know much about the problem either. When you have a robot that can actually return a Dewey Decimal System classification to four digits to the right of the decimal for a set of randomly selected web pages (and, no, just returning the word "pr0n" doesn't count, although it would probably have the best score of most algorithms you can think of) then you can come and talk about having a start. Otherwise, it's all just BS.
That is all.