Using the Semantic Web to Enhance Search
RobMcCool writes "At Stanford KSL, we really like the Semantic Web. So we've taken many of our favorite web sites, scraped them, and put together a huge pile of RDF, which we'll let you download. We've used that RDF to create a search application, in the spirit of Google Q & A or Microsoft's recently announced MSN Search extensions. Our search can answer simple factual queries like the previously discussed population of Portugal, but it can also answer some more complex ones. We also have a smart autocomplete system: type "tom hanks birth" slowly to see it in action (best with Firefox). We're looking for people to be a part of this search system by running their own search sites, and by putting their data on the Semantic Web. Come check it out!"
Semantic-driven search engines have awesome potential. However, they do place a lot of demand on the content provider to provide metadata-rich content - or to be able to provide intelligent mining tools to create metadata from existing sites.
This is definitely one to watch...
Autocomplete is a useless feature that nobody wants to see when they type "a"...and see it load everything that begins with "a". The user is not interested in items starting with "a". Perhaps they're interested in terms beginning with "anon" or something, which has many fewer items to load, therefore making the load time much faster and not annoying the user in the process.
Or, even better, never have any autocomplete turned on automatically. Do a VB-like idea, where if you want to see possibilities at a certain point, hit a specific key that will register for the list to pop down.
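To make the "wait for a longer prefix" idea concrete, here is a minimal sketch in Python. It is not how the Stanford autocomplete works; the `MIN_PREFIX` threshold, the `suggest` function, and the sample terms are all made up for illustration.

```python
from bisect import bisect_left

MIN_PREFIX = 3  # assumed threshold, not taken from the site being discussed


def suggest(sorted_terms, prefix, limit=10):
    """Return up to `limit` terms that start with `prefix`.

    Returns nothing at all until the user has typed MIN_PREFIX characters,
    so typing "a" never triggers a huge list.
    `sorted_terms` must be a sorted list of lowercase strings.
    """
    prefix = prefix.lower()
    if len(prefix) < MIN_PREFIX:
        return []
    # Binary search for the first term that is >= prefix.
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:start + limit]:
        if not term.startswith(prefix):
            break  # sorted order: once one term fails, the rest fail too
        matches.append(term)
    return matches


# Hypothetical term list, for illustration only.
terms = sorted(["anon", "anonymous", "anode", "apple",
                "tom hanks", "tom hanks birth date"])

print(suggest(terms, "a"))     # too short, so no suggestions
print(suggest(terms, "anon"))  # only the terms beginning with "anon"
```

The VB-style variant would simply call `suggest` when a hotkey is pressed instead of on every keystroke.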
No, 'works best with Firefox' is just as bad as 'works best with IE'. What would be nice would be to see 'works best with any standards-compliant browser'.
Do not try to read the dupe, that's impossible. Instead, only try to realize the truth.
What truth?
There is no dupe
While the idea of the semantic web has been legitimately lambasted, I think it's a bit far from DOA. While I agree that it's not exactly practical, I think that if you get enough sites displaying their content in such a manner, you'll eventually reach a point at which others will do the same.
I mean, think about it this way - while laziness or inertia might initially win out, once someone's competitors start to explore the idea of the semantic web, interest will start to be shown in it, especially once it becomes profitable to do so.
This looks like it will broaden the volume of useful searches. Right now, there are at least two limits that show up when searching:
1. For really popular subjects, the useful links are swamped in the noise of sites trying to make a buck off of getting you to look at their ads before directing you to somewhere else, that might have the actual content or might not.
2. For many less popular subjects, there is some oddity, like an unusual term being borrowed by some other field, so that it is something most people have never heard of, but people in two or more specialties use it frequently, in very different ways, resulting in strangeness. (E.g. the search engine throws up 23,003 links for a search on "Sator Resartus". 30% are esoteric literary criticism, 20% relate to apoptosis (cell biology), 20% relate to building moral inhibitions into A.I., 10% to Keith Laumer novels, and the rest are probably noise.)
(I'm sure there are more than these two limits. Someone else may want to comment on some others).
This is likely to help with the second case, oddities in the data set grouping. (It could sort links into the larger sub-categories, ask the user which one(s) seem most applicable, and maybe even sort out a small set of links that explain, for the previous example, how a highbrow literary term got borrowed by the other fields.)
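A rough sketch of that sorting step, in Python. It assumes each result already carries a topic label (which is exactly the metadata the Semantic Web is supposed to supply); the `group_by_topic` function, the labels, and the example.org URLs are invented for illustration.

```python
from collections import defaultdict


def group_by_topic(results):
    """Bucket (url, topic) pairs by topic, largest bucket first.

    The topic label is assumed to come from metadata attached to each
    result; nothing here tries to infer it from the page text.
    """
    groups = defaultdict(list)
    for url, topic in results:
        groups[topic].append(url)
    # Largest sub-category first, so the user can pick the sense they meant.
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)


# Hypothetical results for an ambiguous query.
results = [
    ("http://example.org/lit/1", "literary criticism"),
    ("http://example.org/bio/1", "cell biology"),
    ("http://example.org/lit/2", "literary criticism"),
    ("http://example.org/ai/1", "A.I."),
    ("http://example.org/lit/3", "literary criticism"),
    ("http://example.org/bio/2", "cell biology"),
]

for topic, urls in group_by_topic(results):
    print(f"{topic}: {len(urls)} links")
```

The search page could then show one line per group and let the user choose, rather than interleaving every sense of the term in one list.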
It's not as likely it would help with the first case, though, as sites that don't have actual content are actively duplicitous. Something that is actively trying to fool humans is still likely to be very successful at fooling our tools.
Who is John Cabal?
Nice straw man argument. How many people making their own personal site are going to dedicate 2/3 of their time to tagging their content? The only people that are going to tag their content are those looking to abuse the system. No sane individual is going to spend 3 months going back and editing all their pages with tags. Even then, you still have the problem of conflicting categories (aka ontologies). There will never be a globally accepted set of ontologies. It's all a pipe dream. Why should users spend hours and hours tagging their site when Google is already doing a good job of indexing pages?
The Semantic Web is about describing resources, not tagging pages.
Indeed, you might output RDF from your processing of Web pages.
Extracting information from semi-structured text is very different to making logical assertions about resources.
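To illustrate the difference, here is a toy sketch in Python of what "logical assertions about resources" look like: plain subject/predicate/object triples, queried by pattern. It is not a real RDF library, and the `ex:` names and the `match` helper are invented for the example.

```python
# Each assertion is a (subject, predicate, object) triple about a resource,
# not a keyword stuck on a page. The "ex:" identifiers are made up.
TRIPLES = [
    ("ex:Tom_Hanks", "rdf:type", "ex:Person"),
    ("ex:Tom_Hanks", "ex:birthDate", "1956-07-09"),
    ("ex:Portugal", "rdf:type", "ex:Country"),
]


def match(triples, s=None, p=None, o=None):
    """Return every triple matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "tom hanks birth" becomes a pattern query instead of a text search.
print(match(TRIPLES, s="ex:Tom_Hanks", p="ex:birthDate"))

# "Which resources are countries?"
print(match(TRIPLES, p="rdf:type", o="ex:Country"))
```

A scraper could well emit triples like these as its output, but the triples themselves say something about Tom Hanks, not about the page they came from.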
If it would mean that their sites would rank higher in the search results, I'd say that they all would...
Isn't this basically what HTML is supposed to do, kind of?