The Web of Data, Beyond What Google and Yahoo Show
jccq writes "Both Google and Yahoo have been supporting Semantic Web markup (RDFa, RDF and Microformats) for weeks and months respectively. What they do, at the moment, is use the markup only for visual feedback by returning better looking, more functional 'page snippets.' But how would it look if you could get all these bits and compose them automatically to form a single structured information page about what you're searching for? The folks at the DERI institute have just released Sig.ma, a visual browser and mashup generator that will go all over the web of data and find dozens of sources to combine together when answering a user query. It also comes in API mode to reuse the information Sig.ma finds inside applications. Here are a screencast and a blog post, with semantic-web-geek details."
I for one wouldn't want to. I'd much rather search and find a good site on the topic myself.
and studied at nearby uni,
DERI is a money black hole. Most of the people there know that the semantic web has many, many issues and will probably never bear fruit, but they choose not to speak up so as not to damage their academic careers and to keep their cushy "research" positions.
Facts aren't copyrighted. At what point does a result of combining facts become a copyrighted document?
The folks at the DERI institute used to have Sig.ma, a visual browser and mashup generator that will go all over the web of data and find dozens of sources to combine together when answering a user query.
Computer Science is all about trying to find the right wrench to bang in the right screw. -T.Cumbo?
RDF is nice: there are various syntaxes for it (including several triples formats), and it promises, if it can be built, deployed and trusted(!!!), to make the web ever so much more searchable. That will depend, though, on people writing good ontologies (not easy) and using them correctly (even less easy).
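For anyone who hasn't seen one of those triples formats, here is a rough sketch of two statements written out as N-Triples, one of the plainer RDF syntaxes. The example.org URIs and the helper names (`uri`, `literal`, `to_ntriples`) are made up for illustration; the two predicates come from the FOAF vocabulary.

```python
# Illustrative sketch: serialize (subject, predicate, object) triples as N-Triples.
# The example.org URIs and helper names are placeholders, not anyone's real data or API.

FOAF = "http://xmlns.com/foaf/0.1/"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples expects."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslashes, quotes and newlines."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triples):
    """One statement per line, each terminated by ' .'"""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


alice = uri("http://example.org/people/alice")
bob = uri("http://example.org/people/bob")

triples = [
    (alice, uri(FOAF + "name"), literal("Alice")),
    (alice, uri(FOAF + "knows"), bob),
]

if __name__ == "__main__":
    print(to_ntriples(triples))
```

Every statement is just subject, predicate, object; the hard part is agreeing on what the predicates mean, which is the ontology problem.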
RDFa and microformats look, on the surface at least, to be nice ways to manage RDF-type information in HTML. But I'm a bit more dubious. In many cases they don't have careful ontologies built around them; when they do (RDFa, mostly), they seem to be very resource intensive (a heavily RDFa-annotated HTML page is likely to balloon to several times the size of the same page without RDFa), and the uses of them I've seen have been less than convincingly correct. This doesn't mean that they're useless, just that they're not doing the job at the moment, or they're doing it poorly.
The solution that seems to be favored by the semantic web types is to present RDF pages as an alternative to HTML pages when RDF is requested. This looks to be by far the best way to do it, but it does require site builders (and CMSs and web frameworks) and content authors to build correct RDF pages that represent the information presented, often at the same time as they present HTML pages to human readers (and non-RDF search engines). This is going to be a major problem.
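Serving RDF "when RDF is requested" is ordinary HTTP content negotiation on the Accept header. A minimal sketch of the server-side decision, assuming a site that can produce HTML, RDF/XML and Turtle for the same resource (the `AVAILABLE` list and both function names are hypothetical, not taken from any particular framework):

```python
# Hypothetical sketch: choose between HTML and RDF representations from an Accept header.
# AVAILABLE and the function names are illustrative assumptions.

AVAILABLE = ["text/html", "application/rdf+xml", "text/turtle"]


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when the client does not give one
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_representation(header, available=AVAILABLE):
    """Pick the best available type the client accepts; fall back to HTML."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return available[0]
    return available[0]


if __name__ == "__main__":
    print(choose_representation("application/rdf+xml"))
    print(choose_representation("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
```

This ignores partial wildcards such as `text/*`, and a stricter server might answer 406 instead of falling back to HTML. Picking the format is the easy bit; the major problem is still producing RDF that actually matches what the HTML says.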
It was the future in 2001; inspired the masses with its vision of the glorious future in 2003; and of course we are presumably right on the cusp of this golden future today.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
I don't know why but their presentation pisses me off beyond reason.
Probably because it's the n-th time somebody is trying to impose some silly standard.
And pretends it's the best invention since you-know-what.
In real life I have a fairly common name; there are at least 10 of me worldwide. I noticed that they deliberately picked a unique name to show how well it works.
Ach we'll see.
Now that you've been slashdotted, I'm wondering how it would look if you could get all these bits and compose them automatically to form your home page?
Say hello to my little sig.
It was fun while it lasted.
My name brought the sig.ma server to its knees.
I will not be pushed, filed, stamped, indexed, briefed, debriefed or numbered. My life is my own.
Some people have already suggested that common names will cause problems with this system. The next big thing should be searching by context. I hate searching for "supernova" only to get a long list of songs by some band. The keyword "space" or "star" helps, but that usually results in other false hits, too. Don't even get me started on acronyms, or things that don't have anything to do with computer technology.
Would there be any way for a search engine to examine a whole bunch of keywords and content in a page, and learn the difference between the context of music and astronomy? That would be a big help.
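As a very rough sketch of the idea, a page can be scored against a small vocabulary per context, and the context with the most hits wins. The word lists below are hand-picked assumptions purely for illustration; a real engine would have to learn them from labelled pages, and the names `CONTEXTS` and `guess_context` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical seed vocabularies, hand-picked for this example only.
CONTEXTS = {
    "astronomy": {"star", "stars", "galaxy", "telescope", "explosion",
                  "luminosity", "remnant", "space"},
    "music": {"band", "album", "song", "songs", "lyrics", "tour",
              "single", "guitar"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def guess_context(text, contexts=CONTEXTS):
    """Return the context whose vocabulary appears most often, or None if nothing matches."""
    counts = Counter(tokenize(text))
    scores = {
        name: sum(counts[word] for word in vocab)
        for name, vocab in contexts.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(guess_context("A supernova is the explosion of a star, leaving a remnant"))
    print(guess_context("Supernova released a new album and a tour with songs"))
```

A search engine could then filter or rank "supernova" results by the guessed context of each page instead of relying on the user to add "space" or "star" to the query.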
I worked on a semantic engine out of Redmond, WA (not for MS) last year, and I'll say this: it's only marginally more difficult to put together than a keyword-based search engine for the base components. However, it is exceptionally time consuming to make usable (think of a 3D engine: for every remaining 10% it's just as much work as everything before it), and there's one fatal flaw that killed our project and will likely continue to kill any level of semantic search: with every page indexed, computation time doubles when measured per page. This is unavoidable, because to have good results you must cross-index every last thing. You can do some neat things on grouped sets of indices by keyword to cross-reference in a more logical manner, but given the way people write there's no way you're going to get 100% of the data glitch free, because you still have to interpret things well enough to pick out the keywords before you can even cross-reference. The most advancement you get toward one of those 50% jumps is by applying feedback off a known good data source that is at least 90% accurate, which you can then key your indices off of (not in terms of keywords, but semantic footprints). There are so many better ways to search if people just sort their data, at least until we have some sort of AI engine that can actually interpret it (yes, I am sure when it happens it will be quite complex and processor intensive, though likely FAR FAR FAR less so than semantic search on the web in the sense the phrase semantic search is defined, though you would still get the same result).
I gave it my first and last name, and it came up with nothing. Then I gave it my complete name. It took my middle name (David), added "Baltimore" as a random last name, and gave me facts about somebody named David Baltimore. Absolute, utter, meaningless gibberish. I am not impressed.
Combined together is better than combined apart I suppose.
I have a slightly outdated buzzword bingo card, but I think I have a winner even still. So, hold your cards.
-B
Ash and Hickory, straight-grained and true, make excellent bludgeons, dandy for the cudgeling of vegetarians.
The frackin' thing doesn't even work. And when you hit the 'Contact' button to yell at the creators of the site for sucking, all you get is an Apache error. Smoooooooooooth.