Semantic Search Points To Better Relevancy
ReadWriteWeb writes in to tell us about an article by Dr. Riza C. Berkan, founder and CEO of hakia.com, describing the promise of and potential for semantic search. This approach to providing more on-target search results contrasts with the dream of the semantic Web. Semantic search doesn't require all the Web page authors in the world to begin adding metadata; but it's not a sure thing that the researchers now developing the idea will get it right.
From TFA:
"There are so many ways of doing it improperly, and only one way of doing it right."
But he doesn't say what the right way is, or how it could be, or even if he thinks his company is on the right track. There is no information at all.
When his defense asked, "Which computer has Jon Johansen trespassed upon?" the answer was: "His own."
The semantic web is about more than search. Rich semantics will enable applications of a completely different nature than today. Aggregating and mashing up data could be taken to a whole new level. Just because someone comes up with better indexing we shouldn't give up on the semantic web.
Just my 2 cents, anyway.
.: Max Romantschuk
While this is not strictly a PR piece for Hakia.com, it mentions the site (and some others) and I just had to try it. I gotta be honest, it does produce more interesting (i.e. more accurate) results than Google in some cases, while in others it produces worse results. But the company's young.
Overall, this is the direction we should be taking. The semantic web is indeed just that: a shiny dream.
Today, we're talking about anyone having the ability to create a web page, using pre-made online page/blog tools, or easy to use WYSIWYG desktop apps.
You can't ask people who can't tell the difference between typing a query into a search engine and typing a URL into the address bar to add proper meta information to their blogs. Not to mention the abuse potential.
I can already hear someone saying "If you don't know the XHTML/CSS specs by heart you shouldn't be making pages" but that's just arrogant. Technology should destroy barriers, not create them; the technology that implements this idea better will succeed. Look at Google: it will parse even the most horrendous code and extract proper information from it. This is why they are number 1.
BTW, Google already extracts semantic information from both the site and the query, but this is quite primitive compared to the potential mentioned in the article. Google looks for term context, meaning context, synonyms, related words etc. I hope Hakia.com and businesses like them take this idea further, so there's finally some innovation happening in search (something that has only enjoyed gradual and minuscule improvements for the last 9 years, since Google introduced PageRank).
Yes, people will abuse it in any way they can, mostly to try and get higher up in the search engines. But this does not mean it is by definition useless. It is useless for ranking, but once you (the search engine) have decided to list a site, you could use the metadata for semantic web-stuff. How about allowing for a physical address, phone number, opening hours (for brick & mortar)... This would e.g. allow for a "copy address to contacts" button. Make an easy (web based) program to generate the HTML so mom&pop shops can include it in their website, and refrain from using it for ranking purposes, and you should be ok.
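For what it's worth, the "easy program to generate the HTML" could be very small. Here is a rough sketch in Python, assuming the shop's details get marked up with hCard-style class names (vcard, fn, org, adr, street-address, locality, tel). The function name and its parameters are made up for illustration, and "opening-hours" is a hypothetical class of my own, not part of hCard.

```python
from html import escape


def business_to_hcard(name, street, city, phone, hours=None):
    """Return an HTML fragment describing a shop with hCard-style class names.

    All values are HTML-escaped so a shop name like "Mom & Pop" stays valid.
    """
    parts = [
        '<div class="vcard">',
        f'  <span class="fn org">{escape(name)}</span>',
        '  <div class="adr">',
        f'    <span class="street-address">{escape(street)}</span>',
        f'    <span class="locality">{escape(city)}</span>',
        '  </div>',
        f'  <span class="tel">{escape(phone)}</span>',
    ]
    if hours:
        # "opening-hours" is a hypothetical class name used only in this sketch.
        parts.append(f'  <span class="opening-hours">{escape(hours)}</span>')
    parts.append('</div>')
    return "\n".join(parts)


if __name__ == "__main__":
    print(business_to_hcard("Mom & Pop Bakery", "1 Main St", "Springfield",
                            "555-0100", hours="Mon-Fri 9-5"))
```

A search engine or browser could read those fields back out for the "copy address to contacts" button without ever letting them influence ranking.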
10 ?"Hello World" life was simple then
MovieLens is perhaps kind of similar-but-different. You go there and rate movies. Based on similarities to how other people rated movies, it then suggests movies for you and your likely rating of them. It's pretty neat actually -- my wife and I both have accounts there, and you can cross-reference with other people. So now when we go to the video store, instead of each of us picking one movie we like and potentially forcing the other person to suffer through it, we can find a movie that (in theory) we will both like. Seems fairly accurate so far.
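For the curious, here is a minimal sketch of that general idea (user-based collaborative filtering). I have no idea what MovieLens actually runs, so treat this as an assumption-laden toy: the cosine similarity measure, the weighted-average prediction, and every function name here are my own choices for illustration.

```python
from math import sqrt


def similarity(a, b):
    """Cosine similarity over the movies both users rated (0.0 if none shared)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(ratings, user, movie):
    """Similarity-weighted average of other users' ratings for `movie`.

    Returns None when nobody comparable has rated the movie.
    """
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or movie not in theirs:
            continue
        weight = similarity(ratings[user], theirs)
        num += weight * theirs[movie]
        den += weight
    return num / den if den else None


def pick_for_both(ratings, user_a, user_b, candidates):
    """Pick the candidate whose lower predicted rating (of the two users) is highest."""
    def worst_case(movie):
        preds = (predict(ratings, u, movie) for u in (user_a, user_b))
        return min(p if p is not None else 0.0 for p in preds)
    return max(candidates, key=worst_case)
```

The last function is the video-store trick: rather than maximizing one person's predicted rating, it maximizes the worse of the two, so nobody has to suffer through the other's pick.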
Due to circumstances beyond my control, I am master of my fate and captain of my soul.
Semantic Web = the promise that never quite delivers
Such a good idea in theory, but where does trust come from? Who can we trust to mark anything?
And by the time any of this is solved Google will have evolved so it can understand plain text better than markup. How do you mark up something as ambiguous? Unsure? Rumor? It's pretty easy in plain English:
"I hear Joe is living in Cornwall". There you go, easy to use and no angle brackets.
monk.e.boy
Open source, flash charts
I got my master's at one of the schools getting the bulk of the research money, and we made that same argument there, to deaf ears. Namely that students and professors were solving the easy "peripheral" problems related to semantic web, and just ignoring the 13,125,732-lb gorillas in the room.
Yeah, pretty much. I set out to make a data assistant program in high school (c. 1996-1999) and was thinking about how to get a correspondence between what I was thinking and how data would be retrieved, and figured it would have to be so generic as to be worthless. And then I read Hilary Putnam's Representation and Reality and felt sick about the entire thing. But now that I think back on it I did have a lot of fun testing out different kinds of data retrieval on structured and unstructured data (and thinking up weird semantic hypertext languages).
9 86906 -- lol
http://slashdot.org/comments.pl?sid=142985&cid=11
"Susan saw the dog in the window. She pressed her nose against it. She wanted to buy it."
The SW project exists *because* machines are too dumb to read English. Or Chinese. And will probably stay that way for the foreseeable future.
So W3C's RDF is positioned half-way between the world of dumb computers and smart people. It structures data in terms of classes and properties, and allows different groups to define sets of class and property names that can be freely mixed together without the need for heavyweight standardisation. And it gives us an SQL-ish querying framework, SPARQL, for asking questions of this data, and getting back tables of results. Despite the myths, RDF doesn't oblige people to put metadata "inside every Web page". It just defines a common data model that information from various sources and formats can be mapped to, so that what they say can be processed with less regard for fiddly detail of file formats and encodings. And RDF certainly doesn't require that you believe everything you read: the SPARQL spec, unlike SQL, provides built-in machinery for querying properties of the data source, inline in your query, so you can filter the data down to the bits you decide to trust in some specific app.
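To make the data model concrete, here is a toy sketch in plain Python. It is not RDF or SPARQL, just an illustration of the shape: statements are subject/predicate/object triples, each tagged with the source it came from, and a query is a pattern with wildcards plus an optional list of sources you choose to trust (roughly the job SPARQL's GRAPH keyword does for named graphs). The `Quad` type, the `match` function, the `ex:` identifiers and both source URLs are made up for this example; `foaf:name` and `foaf:based_near` are borrowed from the FOAF vocabulary.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str  # where the statement came from (akin to a named graph)


def match(quads, subject=None, predicate=None, obj=None, trusted=None):
    """Return the quads that fit the pattern; None acts as a wildcard.

    If `trusted` is given, only statements from those sources are considered.
    """
    results = []
    for q in quads:
        if subject is not None and q.subject != subject:
            continue
        if predicate is not None and q.predicate != predicate:
            continue
        if obj is not None and q.obj != obj:
            continue
        if trusted is not None and q.source not in trusted:
            continue
        results.append(q)
    return results


# Hypothetical data: one self-published profile, one rumour page.
DATA = [
    Quad("ex:joe", "foaf:name", "Joe", "http://example.org/joe.rdf"),
    Quad("ex:joe", "foaf:based_near", "ex:London", "http://example.org/joe.rdf"),
    Quad("ex:joe", "foaf:based_near", "ex:Cornwall", "http://example.net/rumours"),
]

if __name__ == "__main__":
    # Everything anyone says about where Joe lives:
    print(match(DATA, subject="ex:joe", predicate="foaf:based_near"))
    # Only what Joe's own file says:
    print(match(DATA, subject="ex:joe", predicate="foaf:based_near",
                trusted={"http://example.org/joe.rdf"}))
```

Note that the rumour about Cornwall is stored happily alongside the first-hand claim; the application decides at query time which sources count.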
If you are interested in a real solution to semantic web markup that works (and is being used) right now, you might want to check out the Microformats website. There is a growing following that is working on getting the semantic web working properly. The Firefox and Songbird guys are looking at using Microformats to make browsing the web a much richer experience - NOW, not 10 years from now.
There are currently Microformats for marking up people, places, events, geographic locations, music, and many other widely used data items on the web. For more information on what Microformats are, check out the info page on Microformats.
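As a rough illustration of how little machinery a consumer needs, here is a sketch that pulls people/places data out of a page using only Python's standard html.parser module. It assumes hCard-style markup (a root element with class "vcard" and child elements with classes such as fn, org, street-address, locality, tel) and handles only a handful of properties; the `HCardParser` class and `parse_hcards` function are names invented for this example, not part of any Microformats tooling.

```python
from html.parser import HTMLParser


class HCardParser(HTMLParser):
    """Collect a few hCard-style properties from elements inside class="vcard"."""

    PROPERTIES = {"fn", "org", "street-address", "locality", "tel"}
    VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.cards = []          # one dict per vcard found
        self._open = []          # property names carried by each open element
        self._card_depth = None  # stack depth at which the current vcard opened

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return  # void elements never get a matching end tag
        classes = set((dict(attrs).get("class") or "").split())
        if "vcard" in classes and self._card_depth is None:
            self._card_depth = len(self._open)
            self.cards.append({})
        self._open.append(classes & self.PROPERTIES)

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS or not self._open:
            return
        self._open.pop()
        if self._card_depth is not None and len(self._open) == self._card_depth:
            self._card_depth = None  # we just left the vcard element

    def handle_data(self, data):
        text = data.strip()
        if self._card_depth is None or not text:
            return
        # Record the text under every property named on the innermost element.
        for prop in self._open[-1]:
            self.cards[-1][prop] = text


def parse_hcards(html):
    """Return a list of dicts, one per vcard found in the HTML string."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.cards
```

This sketch assumes reasonably well-formed HTML and ignores nested vcards; a real consumer would want a proper Microformats parser.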
-- manu
Manu Sporny (skype: msporny, twitter: manusporny, G+: +Manu Sporny)
Founder/CEO - Digital Bazaar, Inc.