Going from a 'Web of links' to a 'Web of meaning'
neutron_p writes "Computer scientists from Lehigh University are building the Semantic Web, which will handle more data, resolve contradictions and draw inferences from users' queries. The new improved Web will also combine pieces of information from multiple sites in order to find answers to questions."
When will we be dropping HTTP and HTML in favor of more metadata-friendly protocols and file formats? I can see huge potential in a system built specifically for getting data out there and linking it all together.
Disconnect and self-destruct, one bullet at a time.
Sounds like a recipe for disaster to me.
Does everything include nothing?
Covered not long ago - an interview with Berners-Lee regarding the Semantic Web.
People at DERI in Ireland's Galway are also working on the Semantic Web (see http://www.deri.ie/). I thought lots of people are...
I'll have to RTFA to see what they propose, but just the principle of resolving contradictions is a really difficult one. Most theories of knowledge (which are essentially networks of facts) aren't terribly robust, and contradiction repair, which involves running the entire network to find invalid assumptions and then propagating the changes, is NP-complete :| I'm not positive that contradiction resolution is a reasonable thing to expect out of a massive distributed network.
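To make the cost concrete, here is a minimal sketch of brute-force contradiction repair. It assumes facts are plain (subject, property, value) assertions and that a contradiction just means two different values for the same property; real knowledge bases are far messier. The function names `find_conflicts` and `minimal_repairs` are made up for illustration. The repair step tries every subset of facts to retract, which is exponential in the number of facts.

```python
from itertools import combinations

def find_conflicts(facts):
    """Return pairs of facts giving different values for the same (subject, property)."""
    conflicts = []
    for a, b in combinations(facts, 2):
        if a[0] == b[0] and a[1] == b[1] and a[2] != b[2]:
            conflicts.append((a, b))
    return conflicts

def minimal_repairs(facts):
    """Smallest sets of facts whose retraction leaves no conflicts (brute force)."""
    facts = list(facts)
    # Try retracting 0 facts, then 1, then 2, ... until something works.
    for size in range(len(facts) + 1):
        repairs = [
            set(removed)
            for removed in combinations(facts, size)
            if not find_conflicts([f for f in facts if f not in removed])
        ]
        if repairs:
            return repairs
    return []

# Illustrative data only.
facts = [
    ("milk", "price", "1.25"),
    ("milk", "price", "1.40"),
    ("milk", "unit", "gallon"),
]
repairs = minimal_repairs(facts)
```

With three facts this is instant. With a few hundred interdependent facts spread over many sites, the subset search is hopeless, which is the worry above.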
There are lives at stake here!
with their favourite mode of publication being the press release.
Could you please provide a link to this Sinus Apache module? Neither Google nor SF's own search are of any help when searching for it. Thanks!
...everyone assume that the Interweb is intended for information? Yes, I understand it is the best source of collaborative information the world over, but I like finding porn the old fashioned way - not having a search engine, or intelligent nodes, telling me which hottie is the hottest.
You gotta understand that "meaning" has no meaning at all to machines, at least not yet.
And even for humans, the "meaning" of a certain thing can be a different thing to different people!
Although I applaud the job they are doing for Semantic Web, I wonder how they can inject "meaning" into the whole thing.
My biggest fear is the 1984-like "my meaning is THE meaning and you canna have any other meaning" thing.
Muchas Gracias, Señor Edward Snowden !
..from user's queries.
Clippy..? Is that you?
There's a Starman, waiting in the sky / He'd like to come and meet us, but he hasn't got the time.
The same can be said about any semantic web technology - whether it's FOAF (an RDF vocab for describing people and their interests) or a vocabulary for reviews. As soon as major authoring tools (i.e. both web editors and content management systems) start integrating these technologies, people will use them if they are useful. Do not expect web designers or bloggers to have a clue about all the great things that the semantic web can do - give them one useful thing which they understand, package it in a pretty UI, and they will start using it.
I guess that the Semantic Web would need HTML documents to meet strict requirements when it comes to validation, use of logical instead of physical markup and so on. This could be an incentive for people to use HTML the way it was intended, instead of the crapload of pages that don't close tags, use hundreds of redundant FONT tags, use the H1..H6 elements to control font size instead of using them to indicate headings, and so on. Strangely enough, all "beginner's" HTML books still teach people to code this way.
The semantic web is a pretty popular area of research right now and it's far from being "built by computer scientists at Lehigh University". In fact I could have done an undergrad dissertation on the semantic web, and there were numerous PhD positions being advertised at unis around the world for research on the semantic web.
Whichever Lehigh professor submitted this is stooping pretty low trying to raise publicity (and hence finance), I would think!
I have discovered a truly remarkable sig which this post is too small to contain.
Am I the only one who recognized the main graphic for the story as a lifted screencap from the movie Hackers? That movie's SOLE redeeming quality was Angelina Jolie...
Well, ok, that and the laugh factor. Not quite as much fun as MST3K'ing The Mummy with about a half dozen friends though.
Please help metamoderate.
grad school, next to: a.i. using digital computers, relational databases in the business world, using C++ as an "object-oriented" language, and how to build a universal translator.
I have had RDF on my web site for years, but last year as an experiment, I started a web spider running that specifically looked for RDF - I found very little.
I even cheated and specified the 'seed' starting web sites as sites that I knew to use RDF.
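For anyone curious what "looking for RDF" might involve, here is a rough sketch of the detection step only, not the poster's actual spider. It assumes a page advertises RDF either through a `<link>` tag with type `application/rdf+xml` or by embedding an `<rdf:RDF>` element inline; the class and function names are invented for this example, and fetching pages is left out.

```python
from html.parser import HTMLParser

class RDFLinkFinder(HTMLParser):
    """Collect href values of <link> tags that advertise RDF/XML."""

    def __init__(self):
        super().__init__()
        self.rdf_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("type") == "application/rdf+xml":
            href = attrs.get("href")
            if href:
                self.rdf_links.append(href)

def find_rdf(html_text):
    """Return (linked RDF documents, whether RDF/XML appears inline)."""
    finder = RDFLinkFinder()
    finder.feed(html_text)
    has_inline = "<rdf:rdf" in html_text.lower()
    return finder.rdf_links, has_inline

# Illustrative page; the href is a placeholder.
page = """<html><head>
<link rel="meta" type="application/rdf+xml" href="/foaf.rdf">
</head><body>No inline metadata here.</body></html>"""

links, inline = find_rdf(page)
```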
Yeah, this is really easy. Just look next to the title and see what score the moderators have assigned and you get a sense of whether there be contradictions! Generally if the score is lower than 1, there could be contradictions so: Yeah it's really difficult.
The dangers of knowledge trigger emotional distress in human beings.
- We need an ontology that will cover many if not all aspects of human experience. And this experience has been evolving dramatically and will continue to evolve. This ontology is probably a moving target. This task alone of creating the ontology has been, and still is, the holy grail of AI and Knowledge Management.
- The amount of time we will have to invest in adding metadata to the data will dramatically increase over time. We will need a way to automate the filling of the metadata layer. This is where automatic image recognition and classification, speech to text, text summarizers and meaning extractors kick in (here, Copernic is in the right direction). Maybe the librarian profession will be the next hot job...
- Almost every application will have to adapt and inter-communicate. No big deal, RDF will probably become the new data bus anyway.
That will be interesting!!! I still don't know why this feature isn't used to make the web more powerful by offering more links on the same web page:
On the same page, a level of links should be increasable/decreasable. The default one would be the one we see currently on all the web sites.
When going to the next level, the page would not reload at all but the browser would just show the links at different places on the page. These links would have been set by the webmaster on ideas that require linking a sentence or a part of it, not just a word. This way you can include as many levels of ideas/concepts in the written text as you like. A website like Wikipedia would find this feature really useful, I think.
...handle more data, resolve contradictions and draw inferences from users' queries. The new improved Web will also combine pieces of information from multiple sites in order to find answers to questions.
It will essentially be a librarian?
The problem with this is that users first need to know what the heck they're actually looking for. You can draw as many inferences as you like, but so long as people search for "art" when they're interested in "tattoos" you aren't going to get much that's relevant. And THAT is the biggest problem with your average user--and that's what a librarian is good at. Asking questions until people verbalize what they really need.
Hasn't everyone heard of this already?
W3C semantic web activity from 2001.
Heflin's Thesis from 2001.
I'm rather skeptical of the whole thing, it seems to me to be like "Wouldn't it be nice if people documented their web page content better? Then we could do all these neat things." The second statement is right, but I fear the first statement is intractable.
This could create huge problems for people to stay on the right side of copyright law. A medium that pulls information from several different sources could potentially make it much harder to avoid copyright infringement. For example, you pull from a Wikipedia entry, a NY Times entry and a Reason editorial. You better keep track of where you got each part if you use them in any of your own research, commentary, etc.
How does it combine information from different sources in a way that keeps the user knowledgeable about where the data came from? How do you know who to cite, or whether something you're excerpting can be used in the context you want, when your "semantic web browser" pulled the data and combined it coherently or incoherently into a mish mosh of data sources?
Am I the only one who thinks that this could be an IP trial lawyer's wet dream?
Click here or a puppy gets stomped!
It seems to be a common mistake for computer scientists to think that it's possible to make systems that "understand" the world (both real and abstract knowledge), with all its complexity and ambiguity, in the same way that humans do. I feel that there is a fundamental difference between using computers to enable humans to organize stuff, and having computers automatically do it. Every single attempt at getting computers to be "smart" about inferring human intentions has ended up as an irritating impediment to using the system - look at clippy, Bob, "intelligent" voice systems that try to "help" you by stopping you from talking to a real person... what computers are very, very good at is amplifying and enabling human intelligence. Computers are not themselves intelligent, and (my personal opinion) I don't think they ever will be - unless we manage to "grow" them using processes that we probably won't fully understand. You can't construct something that is as complex as the human mind through deterministic (i.e. consciously designed architectural) means - all you'll end up with, at best, is a very complex rule inference engine that is limited by the rules you gave it. Every "holy grail" of intelligent programming that has come along - neural nets, genetic programming etc - has turned out to be very limited (though very useful in special situations).
I also feel that talking about automatically organizing the world's knowledge in a semantic web is just more of the same hot air that we've been hearing from AI departments for the last few decades. You can't automatically allocate meaning to something unless you have the capability for "common sense" reasoning, and the world knowledge at your fingertips to be able to interpret the data intelligently, like a human would. And even then, different humans would interpret it differently... so there are multiple meanings, and anyway, how to allocate "meaning" to something abstract such as a poem or piece of art?
And if we require real people to add metadata to everything... well, it just ain't going to happen, in my humble opinion. Adding meta data is a pain in the ass, since you have to define the categories of object, agree on meanings for all the different taxonomies that will have to be used to describe the world... then there's the potential for abuse, as spammers will inevitably seed their documents with inappropriate metadata. So, the "honest" people can't be bothered, and the dishonest people will wreck anything that does get built. So, it ain't gonna happen.
The beauty of google (not that I love google, but they did hit a nail on the head) is that it requires no effort or "machine intelligence", beyond a very simple algorithm that depends not on AI but rather real, tangible relationships between words and documents (proximity and links). This is something that computers can be really good at.
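The link half of that idea can be sketched in a few lines. This is the textbook simplification of PageRank-style link analysis, not Google's production algorithm; the damping factor of 0.85 and the tiny three-page graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by incoming links using simple power iteration."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / len(pages)
        rank = new_rank
    return rank

# Pages "a" and "b" both link to "c"; "c" links back to "a".
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
rank = pagerank(links)
```

No metadata and no understanding of the pages is needed: "c" ends up on top simply because two pages point at it, and "b" ends up last because nothing does.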
Just my opinion... obviously there will be others out there who will vehemently disagree, and that's fine! Go ahead and try, you'll learn a lot in the process and you will probably come out with some tangential technology that you never thought of initially but is useful nonetheless.
"No formal theory," Heflin wrote in his proposal to NSF, "has considered how ontologies can be integrated and how they may change, or the role of trust in integration."
:)
Like hell. Computation theorists and cryptographers have been applying game theory to their models for upward of a decade. This guy is just catching on.
Of course the formal theories aren't complete, but this fellow is not onto anything new. It just sounds like exactly the naive optimism you'd put on an NSF grant req.
As far as integrating ontologies, it just sounds like graph isomorphism to me.
That's a language that can be parsed by computer! Rippin'. But I figure we'll just wait for the singularity before anything really changes, after which we'll use binary code or something. In the meantime it's English, which will be looked at as civilization's greatest joke at some point in the future. It will make it quite hard to make this semantic web.
-I am an elective eunuch.
Meaning is always "in context". Human communication always requires a "transmitter -> medium -> receiver" structure. Some say the universe is fundamentally structured on that model. When these semantic systems are overlaid on content, there are always these slippery, unresolvable mismatches of "intent" and "understanding", those "semantic arguments" that drive likeminded people crazy. Content searching is extremely powerful, without creating the "cracks" into which meanings can irretrievably fall. As long as there are alternative semantic indices to content still available "raw", semantics will just help. When we move to wrap all content entirely in semantics, we'll live in the "map is not the territory" problem forever. Ask CORBA programmers and EU language translators about the death of meaning by means of the dictionary. If we need to add semantics as a tool, we still get under the hood at the actual content.
--
make install -not war
Crispin
A message has "meaning" if you can make special use of it.
Normal web pages have meaning for browsers, it's just that that meaning is limited to "how to draw words for the user."
What we're doing, is making it so that your computer can make special use of messages on the web, to do smarter things.
It would be scary if the Semantic Web were about "my meaning is THE meaning." But it is explicitly not like that. In fact, one of the main things about it is that anyone can make up their own languages, their own way of modelling the world.
There are tools that make it so you can say, "My word X is sort of like their word Y," but it's acknowledged that such translations will be imperfect. Likely, fuzzy logic, and systems that are able to ask for clarification (and remember responses), will be used to mediate that sort of thing.
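A toy version of "my word X is sort of like their word Y" might look like the sketch below. Everything in it is hypothetical: the `shop:` and `store:` vocabularies, the match scores and the 0.7 threshold are invented for illustration, and real ontology-mapping tools are more involved. Statements whose mapping is too weak are set aside so a person can be asked about them.

```python
# Hypothetical mappings: (my term, their term) -> how closely they match (0..1).
MAPPINGS = {
    ("shop:cost", "store:hasPrice"): 0.9,
    ("shop:item", "store:Product"): 0.8,
    ("shop:maker", "store:brand"): 0.5,
}

def translate(triples, mappings, threshold=0.7):
    """Rewrite their predicates into my vocabulary; set aside the uncertain ones."""
    reverse = {theirs: (mine, score) for (mine, theirs), score in mappings.items()}
    translated, needs_clarification = [], []
    for subject, predicate, obj in triples:
        mine, score = reverse.get(predicate, (None, 0.0))
        if mine is not None and score >= threshold:
            translated.append((subject, mine, obj, score))
        else:
            # Too fuzzy to trust automatically: ask the user and remember the answer.
            needs_clarification.append((subject, predicate, obj))
    return translated, needs_clarification

their_data = [
    ("milk-001", "store:hasPrice", "1.25"),
    ("milk-001", "store:brand", "Acme"),
]
ok, unsure = translate(their_data, MAPPINGS)
```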
You may also be interested in my favorite page on AI by Open Mind. The Semantic Web isn't explicitly about AI, but it opens the door for a lot of AI work.
A semantic web is only as useful as the metadata, and people go to great lengths to mislead and disguise.
People who think they know everything really piss off those of us that actually do.
I can't wait for the spam people and porn sites to get a hold of semantic web technology.
The meaning of is V1.agra and C011.3G3 GIRLZ!
So does Anonymous Coward have good karma?
Surely this just piles on more work for us poor poor developers....
it's the taking apart that counts
Semantic Web is the most ridiculous idea I've ever heard. The problem with meaning isn't representation -- English represents meaning just fine. The problem is meaning itself -- it doesn't matter if you figure out a way to encode it in some XML language, for every bit that it's easier for computers to use, it will carry that much less meaning.
Another way of putting it is, any program capable of extracting the same meaning from XML that humans can, should be able to understand English without much trouble. It's the whole "Intelligence-complete" thing. Like NP-complete, there seems to be a class of problems which can only be solved by real intelligence, and they're all pretty much equivalent in that with real intelligence, you can solve them all.
"If you look 'round the table and can't tell who the sucker is, it's you." -- Quiz Show
Great. An Expert System to do your google searches based on what it thinks you meant. The giant Semantic 'Clippy' knows what's best when it pops up to say:
''Here are the results to the question you should have asked.''
Maybe next they'll have the Semantic Web manage the way electronic voting is counted. Semantic Clippy will count your 'intent' instead of your actual vote.
The meaning of the Internet, eh?
That should yield some interesting answers.
"42. The Answer that you are looking for is 42."
"You searched for "space ship one", but what you really want to search for is "natalie portman hot grits"."
Isn't the whole point of the Internet a database of information which we can access using tools - not to create a "web of knowledge"?
Google works because it is largely a statistical tool that uses some meta-information.
I could see frameworks being used for very specific purposes, like searching a homogeneous web site (e.g., slashdot, pubmed, nytimes) where all content is controlled. But extending these ideas to a heterogeneous web, where people would no doubt take advantage of such a volunteer system, is ludicrous.
I also take issue with the top-down mind-state that they will be able to predict what is useful to the user. This is why statistical importance and quantity is the only realistic method for such a massive undertaking (which google is still actively researching).
I think that the only useful research to come out of such an endeavor would be to have news sites, as mentioned above, implement and be scanned using an ontological browser. Of course, I am not sure how this would be different from LexisNexis.
Apple OSX aficionados.
I am the musician
with a pocket calculator in my hand
Kraftwerk
I can't believe this hasn't made it to +5 Insightful yet.
who sit on the floor with their legs crossed with their eyes rolled back and fingers forming circles are taking the Internet over with their "holistic", "whole-language", "non-judgemental", "authentic", "culturally-sensitive", "non-anglo-saxon centric", "green" politically correct approach to communication...
OR
Google is about to get better...
=8-)
This thing could help in finding incorrect information. All we have to do is put Microsoft in charge of finding and getting rid of the faulty information.
to George B: No please don't, it was a joke man.
There is not 1 truth.
Anybody remember the demise of META keywords?
I think we could run into the same problem with the Semantic Web, as it too allows web developers to attach arbitrary metadata to their pages. The only way to prevent unscrupulous web developers from embedding inaccurate RDF in their pages in hopes of attracting more hits is by establishing a web-of-trust framework.
Google implements a very crude version of web-of-trust that assumes "incoming hyperlinks==trust". I think that in order for the Semantic Web to be something that is usable by web-wide search engines like Google, we will need a much more robust and fine-grained system of trust. The user should be able to specify some of the entities that they trust and the search engine will deduce the rest.
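As a rough sketch of "the user specifies some trusted entities and the engine deduces the rest", here is one simple way trust could be propagated along endorsement links. The halving of trust at each hop, the three-round limit and the example domains are all assumptions for illustration; this is not how any real search engine computes trust.

```python
def propagate_trust(direct_trust, endorsements, decay=0.5, rounds=3):
    """Spread the user's trust along endorsement links, weakening at each hop."""
    trust = dict(direct_trust)
    frontier = dict(direct_trust)
    for _ in range(rounds):
        next_frontier = {}
        for source, score in frontier.items():
            for target in endorsements.get(source, []):
                inherited = score * decay
                # Keep only the strongest trust path found so far.
                if inherited > trust.get(target, 0.0):
                    trust[target] = inherited
                    next_frontier[target] = inherited
        frontier = next_frontier
    return trust

# The user explicitly trusts one site; everything else is derived.
direct = {"alice.example": 1.0}
endorsements = {
    "alice.example": ["bob.example"],
    "bob.example": ["carol.example"],
    "spam.example": ["spam2.example"],
}
scores = propagate_trust(direct, endorsements)
```

Sites that only vouch for each other, like the two spam domains here, never receive any trust because no path leads to them from something the user trusts.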
However, without an adequate trust framework, the Semantic Web will just be a new fertile ground for keyword spam and search engine "optimization".
pi = 3.141592653589793helpimtrappedinauniversefactory7
Well said, there are indeed numerous areas of investigation of this sort of work. It's not as empty an area as the article tells us.
But what is the SIGnificance?
thats just what the computers want you to think...
This strikes me as eerily similar to Daniel Waterhouse trying to write down lists of everything for the Royal Society in Stephenson's Quicksilver.
The whole reason the web is popular is because it's trivially simple to create content for it. Maybe the web would be more useful if it was like a giant encyclopedia but it's just an exercise in futility unless everyone gets on board.
The World Wide Web cannot "at its core handle inconsistent information" yet it seems to lurch along okay.
The Semantic Web is not some attempt at global knowledge, perfect knowledge, perfect reasoning, or anything of the sort, regardless of what many posters, including yourself, seem to have construed it as.
It is intended to be an analogue of the World Wide Web, which is primarily consumed by humans, that is instead primarily consumed by computers.
Can it know everything? Of course not! But it can make it so computers "understand" a heck of a lot more than they do today.
For instance: right now an everyday computer (or more accurately, the web browser) "understands" that (absent styling) a
tag is presented in a certain way. The Semantic Web wants to make it so the triple GallonOfMilk hasPrice $1.25 (this would actually be expressed in several triples about a product with a certain id, probably, but you get the idea) can be "understood" by a program in the same way across multiple sources.
The same as a person does not automatically assume a site is an absolute authority on the price of milk, semantic web enabled programs would not assume that this information was absolute (nor would it likely be presented as such). However, imagine how powerful it would be if one could give your browser the address of the RDF interfaces for local grocery stores (or it might autodiscover them at least in part), and then it would find out what the price of milk (and other groceries) is at each one of them.
That sort of thing is already possible today without the Semantic Web (or other semantic frameworks), but only with methods that either require heavy lifting on the part of the client system (such as web scraping every grocery store site, killing extensibility and easy implementation) or aren't cross-domain (perhaps I want to chart the price of milk (from some milk-price-archive) vs real dollar value -- now my client has to understand two possibly very different ways of presenting information, not just one integrated way).
The Semantic Web (and associated technologies) is an enabling framework that frees programmers from doing a lot of the heavy lifting involved in discovering meaning and relating meaning, just as SQL is an enabling framework that frees programmers from doing a lot of the heavy lifting involved in storing data and relating data.
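The grocery example can be sketched with plain tuples standing in for RDF triples. The store URLs, the `hasPrice` predicate and the prices are illustrative placeholders, and a real client would fetch and parse the RDF rather than read it from a dictionary. The point is that one small function works for every source because they all state facts in the same shape, and each claim stays attached to the site that made it.

```python
# Each source publishes (subject, predicate, object) triples.
# Store URLs and the "hasPrice" predicate are illustrative.
SOURCES = {
    "http://store-a.example/rdf": [("GallonOfMilk", "hasPrice", 1.25)],
    "http://store-b.example/rdf": [("GallonOfMilk", "hasPrice", 1.10),
                                   ("DozenEggs", "hasPrice", 2.00)],
}

def prices_for(item, sources):
    """Collect every claimed price for an item, keeping track of who claimed it."""
    claims = []
    for url, triples in sources.items():
        for subject, predicate, obj in triples:
            if subject == item and predicate == "hasPrice":
                claims.append((obj, url))
    return sorted(claims)

claims = prices_for("GallonOfMilk", SOURCES)
cheapest_price, cheapest_store = claims[0]
```

Adding a third store means adding one more entry to `SOURCES`, with no new scraper to write.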
For to end yet again.
Here's an analogy that doesn't prove anything but reframes the problem. As far as I understand it, the Pentagon cannot be audited, because the time it would take to properly count everything extends beyond the period for which the count would be relevant (the current fiscal year). Do those who tout the "Semantic Web" have a response to this kind of question about feasibility?
Shop as usual. And avoid panic buying.
Yes it has.
See Relation Arithmetic Revived and Structure Theory. These two papers were written as a result of Hewlett-Packard's E-Speak project's support of a continuation of work begun at Paul Allen's thinktank, Interval Research. These then led to an understanding of the importance of identity theory in performing logic with what we were calling "attributed assertions" aka digitally signed speech acts. After the E-Speak project terminated we continued work on identity theory with partial support from the Boundary Institute leading to a reformulation of the foundation of mathematical logic with The Expressive Power of Equality.
Seastead this.
It's analogous to C and Smalltalk. C++ and Java evolved, but are not as purely object-oriented as Smalltalk is.
Either it is not a good model in its entirety or time is not right for it. (though I believe it's the former)
It's like saying: we can draw implications, if the information is marked up and in a canonical form. However, getting to such a point requires marking up and putting all of our information in a canonical form.
The more interesting problem is turning data into information. Id est: How to (and what) index. Google has the Semantic Web beat in all regards.
Whenever I see something about the semantic web, I go back to Clay Shirky's critique of it.
A useful antidote to the hype.
"And the meaning of words; when they cease to function; when will it start worrying you?"
But okay. I don't think we're in disagreement here: I totally agree with you that the "low level tools" are the ones that are of interest here, I'm just sceptical about the scalability of the grand objective: crisp inference, based on this information. But, once the groundwork is done, machine learning will probably be the technology that will be used. Simple, fuzzy inference of the kind you describe and which made Google big. Combine these 'simple' queries (find me cheap milk!) with likeliness of a match (is GallonOfMilk talking about dairy?), trustworthiness of sites (Is the rest of the content about milk or milk-like products or products at all? Does the content match other milk-selling sites?), and you've got something useful.
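One way to picture that combination is a score that multiplies the three signals together. This is only a sketch under made-up assumptions: the `match` and `trust` numbers would come from some learned model in practice, and multiplying them with a simple cheapness term is one arbitrary choice among many.

```python
def rank_offers(offers, max_price):
    """Score = match likelihood x site trust x price attractiveness (each in 0..1)."""
    ranked = []
    for offer in offers:
        if offer["price"] > max_price:
            continue  # over budget, not worth ranking
        cheapness = 1.0 - offer["price"] / max_price
        score = offer["match"] * offer["trust"] * cheapness
        ranked.append((score, offer["site"]))
    return sorted(ranked, reverse=True)

# Hypothetical offers: a real dairy, a spammy site and a paint shop selling "milk white".
offers = [
    {"site": "dairy.example", "price": 1.25, "match": 0.95, "trust": 0.9},
    {"site": "spam.example", "price": 0.10, "match": 0.30, "trust": 0.1},
    {"site": "paint.example", "price": 1.00, "match": 0.05, "trust": 0.8},
]
ranked = rank_offers(offers, max_price=2.0)
best_site = ranked[0][1]
```

The spam site has by far the lowest price but still loses, because a cheap offer from an untrusted source that probably isn't about milk scores poorly on the other two factors.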
... as pointed out by the incomparable Paul Ford.
Don't believe everything you read - this Slashdot story is a great example.
Clay Shirky has written an excellent article about why the Semantic Web ain't gonna work. I don't agree with everything he says, but it's a thought-provoking read nevertheless.
I agree with you completely on this point. The most important advances that have been made in the knowledge engineering community over the last decade have been those that have tried to fuse non-symbolic and machine learning techniques with the good old-fashioned AI of expert systems.
Given that the semantic web has been in development for years, and that the opinions(*) have long ago finished forming, I'm a little confused as to what this is doing on a news site.
(*) Said opinions break down roughly thus:
1% -- This is an amazing new way of perceiving and connecting data that will revolutionize computing in the future.
9% -- This is a waste of time, a clearly impossible task that would seem of interest only to a certain breed of dysfunctional academics.
90% -- Huh?
Whence? Hence. Whither? Thither.
Needs to re-read my post... 1. Google...not the only to note link... 2. Bringing "meaning" implies that someone decides what is "meaningful" and too often that someone has an agenda. =8-)
Making a semantic web that is based on English or even Latin languages would be a useless addition to the chaotic structure of the current web. The problem is not the way computers use our language, rather it is the language that we are trying to use.
The assumptions that are beneath English are difficult to work with, and in reality wrong. When I say "I am a baseball player" the meaning is quite different from what that sentence portrays.
As mentioned by another commenter, context is the most important element of language. That doesn't mean we can't record meaning through computers, it just means we have to change the language we use. A better attempt at the above sentence would be:
"I am part of a system in which I play a game with others. This game occurs 2-4 times a month."
In English hypertext each word in this statement would need an explanation:
I=author
part=player(pitcher, catcher, etc...)
system=logic(baseball rules, game length, etc...)
play=act
game=system
with others=people participating in the same instance of this system
Clearly this method of transcribing context is very difficult, not only because of the language's assumptions but also because of the linear thought process. Contexts are nonlinear overlapping structures and a linear language does not do them justice.
I propose that instead of trying to make a web of meaning based on current day languages, a new visual nonlinear language should be created.
Who's down?
Semantic web...makes me think of the scene in WoO when the professor is "uncurtained" and the Great and Powerful Oz shouts something about "pay no attention to the PERSON behind the curtain" (don't remember exact wording). I mean, whose "meaning" is going to drive it? Could be cool tho'..."computer, search for meaning of prOn." WoooHooo!
Down With Slashdot BETA!!! I've been around the corner and seen the oliphant; you can only abuse me from your perspecti
You're right, English doesn't work, but that doesn't mean meaning cannot be encoded. The English language lacks context. Provide a language based on context and you can encode meaning. Context is nonlinear and so should the language be.
I have a good friend doing graduate work in machine learning, she recently presented at that robot conference in Japan, I believe about task recognition. I myself am a lowly undergraduate in Informatics, and am thus perhaps a bit more concerned with the practical than some in the Semantic Web community ;) .
The dairy part would be taken care of because the milk (and its size, and such) would be discussed using common low level vocabularies (which reminds me, someone needs to come up with a units vocabulary for discussing units of measurement. note to self: do so, and gain fame and fortune ;) ), but I wholeheartedly agree that matching semantic web information to loose queries and judging trustworthiness are prime areas where machine learning techniques will come to the fore.
Also, it is important to understand that while the direction of the Semantic Web as envisioned by Tim Berners-Lee and others closely involved in its creation does point towards a lot of knowledge synthesis and extension through inference, it still leaves doors open for machine learning-style inference (TBL somewhat seems to shy away from this, but I think in part it's his background in classical logic that creates that impression. He does make it very clear the methods used to do things like infer trust are not set or even partially determined, but that he's just speculating on possible ways).
Many of the possible applications of the Semantic Web that he points to are very possible (and closer to what I characterize the Semantic Web as than many university researchers ;) ), here's a good article (the seminar example is what I'm specifically referring to): linky
You know, if you wanted to cash in on the big research bucks, you could always come up with a way to combine machine learning and Semantic Web technologies (*hint hint* ;) ).
For to end yet again.
Now, here's what all the semantic web seems to boil down to, to me:
1) Common interface and data structures for describing all objects (however that'll be done, I think that's a pipe dream in itself).
2) Methods of combining and cross-referencing information from the aforementioned objects, whether they be user-defined or otherwise.
3) Making sure everyone complies with enough regularity that the system actually does something useful.
Now, I want to talk about a related and similarly interesting problem that I am personally more familiar with (in fact it is perfectly analogous).
I'm a computational linguist (computational psycho-linguist to be precise). Several of my professors are interested in corpus-based research, which in some cases requires annotated corpora. The interesting thing about natural language corpora is that the goal of annotation is exactly what the semantic web people want: metadata that coherently describes all the attributes of the object they're discussing.
The problem is that there isn't consensus as to what attributes human language actually has, so there are corpora which were annotated with metadata that's similar to syntactic theory A and corpora which have tags which favor theory B. So in the past couple years there have been pushes for "theory neutral annotation".
Here's the problem. If it were possible to generate an annotation scheme that could be used by adherents of theory A and adherents of theory B (that is, if a theory neutral annotation scheme existed), it would mean that fundamentally theory A and theory B were, in fact, compatible. Since the entire reason that the two camps have different annotation schemes is that their theories are incompatible, theory neutral metadata seems like a futile or contradictory endeavor.
The problem is the same for any system which wishes to create universal object descriptors. There is no way to do pre-theoretic object description. There are of course qualities of objects which will be uncontroversial, but ultimately some people are going to want to pay attention to certain qualities of objects, and other people aren't going to want the same attributes. While you may say "well that's fine, let's just toss in all the attributes," you'll still have the problem that some people will want to divide up the qualities that an object has one way, and someone else will want to divide up the attributes of the same object in another, fundamentally incompatible manner.
So what do you do then? You're fucked. You can't enforce consistency, and so your object descriptors are dead, which means you don't have your uniform api, and ontologies won't work.
There are lives at stake here!
The Semantic Web (or the part of it described in the first point :) ) isn't about an absolutely generalized interface for describing all objects at all. It's about community-developed vocabularies at a high enough level to be useful, but at a low enough level to be used as good building blocks for lots of structures. They are standards not by imposition but by adoption and general consensus. Existing vocabularies not good enough for your project? Invent one to fill the gaps and tell people about it!
Take your annotation example. The Semantic Web approach wouldn't be to try to resolve the differences at all, or create some sort of neutral annotation. Instead, RDF vocabularies would be written for each form of annotation. If there were any points of commonality between the two sorts of annotation, those would be noted in ontologies, but if there weren't, no big loss.
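To make that a bit more concrete, here's a rough sketch in plain Python (no RDF library) of two annotation vocabularies living side by side, with a single declared point of commonality. Everything under example.org is a made-up name for illustration, as are the property names like "lemma" and "baseForm"; the only real identifier is owl:equivalentProperty. For brevity, URIs and literals are both plain strings here, which real RDF would not allow.

```python
# Hypothetical namespaces, invented purely for illustration.
THEORY_A = "http://example.org/annotation/theory-a#"
THEORY_B = "http://example.org/annotation/theory-b#"
CORPUS = "http://example.org/corpus/"

# Real OWL property used to say that two predicates mean the same thing.
OWL_EQUIVALENT = "http://www.w3.org/2002/07/owl#equivalentProperty"

# Each camp annotates the corpus in its own vocabulary.
triples = {
    (CORPUS + "token/17", THEORY_A + "category", THEORY_A + "NounPhrase"),
    (CORPUS + "token/17", THEORY_B + "label", THEORY_B + "DP"),
    (CORPUS + "token/17", THEORY_A + "lemma", "milk"),
    (CORPUS + "token/18", THEORY_B + "baseForm", "cow"),
}

# The one place the two theories happen to agree is noted in an ontology.
# The incompatible parts (category vs. label) are simply left unmapped.
mappings = {
    (THEORY_A + "lemma", OWL_EQUIVALENT, THEORY_B + "baseForm"),
}


def equivalents(predicate, mappings):
    """Return the predicate plus anything directly declared equivalent to it.

    Only one step is followed; a real reasoner would also chain mappings.
    """
    same = {predicate}
    for s, p, o in mappings:
        if p != OWL_EQUIVALENT:
            continue
        if s == predicate:
            same.add(o)
        elif o == predicate:
            same.add(s)
    return same


def values(subject, predicate, triples, mappings):
    """Look up objects for a subject under a predicate or its equivalents."""
    preds = equivalents(predicate, mappings)
    return sorted(o for s, p, o in triples if s == subject and p in preds)
```

Asking for theory B's "baseForm" of token 17 finds the value that was recorded under theory A's "lemma", while the category and label annotations stay separate, and nothing breaks because of that.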
The important thing isn't that the data is part of some universal consistent structure, but that the data is made available in ways that can be easily related (in many different ways) to other data.
You're misunderstanding the basic technologies of the Semantic Web as some big behemoth, when really the most basic technology is very simple. The only really important parts are these, in fact:
RDF data is made up of triples.
Triples have a subject, a predicate, and an object.
The subject must be a URI or blank node (really an anonymous URI, so to speak), the predicate must be a URI, and the object must be a URI, blank node, or literal.
It is both the simplicity of this structure (almost a graph) and the dependence on URIs that make it powerful. Because URIs are used, which are already used in practice as globally unique identifiers and may be easily extended and discovered, there is a certain universality to RDF statements.
Not that they are universally meaningful or anything of the sort, but that in most cases it will be possible for two RDF statements to agree on whether they're talking about the same thing or different things -- just check the URIs (ah, the wonders of personification).
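If it helps, the three rules above fit in a few lines of Python. This is only a sketch of the triple structure as I described it, not any particular RDF library's API; the class names (URI, BlankNode, Literal, Triple) and the helper same_subject are my own inventions, and the example.org URIs are placeholders.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str  # only meaningful inside one graph


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: Union[URI, BlankNode]
    predicate: URI
    object: Union[URI, BlankNode, Literal]

    def __post_init__(self):
        # Enforce the position rules: no literal subjects, URI-only predicates.
        if not isinstance(self.subject, (URI, BlankNode)):
            raise TypeError("subject must be a URI or blank node")
        if not isinstance(self.predicate, URI):
            raise TypeError("predicate must be a URI")
        if not isinstance(self.object, (URI, BlankNode, Literal)):
            raise TypeError("object must be a URI, blank node, or literal")


def same_subject(a: Triple, b: Triple) -> bool:
    """True only when both subjects are the same URI.

    Blank nodes are treated as never matching, since their labels carry
    no global identity in this sketch.
    """
    return isinstance(a.subject, URI) and a.subject == b.subject
```

So two statements from completely different sources can be lined up just by comparing subject URIs, with no agreement needed on anything else they say.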
For to end yet again.
Ah, so it's more of that FOAF stuff you've been talking about on grenme. I posted my comment in duplicate there, why don't we continue the conversation back at grenme?
There are lives at stake here!
Because this way other people get to see us chatter :-P ?
For to end yet again.
That's a good link. It addresses the problems when trying to relay meaning through any medium, and the impossibility of recording it. What I propose is recording context rather than meaning. The closer you get to the context, the closer you get to the meaning. Too much of language is focused on trying to explain what happened rather than in what context it happened.
these guys http://pespmc1.vub.ac.be/ are using self adapting and darwinian technology to evolve the structure of the web, strengthening or weakening links, automatically, or even creating new links, based on usage patterns. they are comparing it to biological neural networks. they are trying to make the net actually work like a "global brain" ... their project also has the best name: Principia Cybernetica
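for the curious, here's a toy sketch of what "strengthening or weakening links based on usage" could look like. to be clear, this is NOT the Principia Cybernetica algorithm, just my own guess at the general idea: the class name AdaptiveLinks, the reward/decay/threshold numbers and the "shortcut" rule are all assumptions made up for illustration.

```python
from collections import defaultdict


class AdaptiveLinks:
    """Toy usage-driven link weights (hypothetical, for illustration only)."""

    def __init__(self, reward=1.0, decay=0.9, threshold=0.5):
        self.reward = reward        # added to a link each time it is followed
        self.decay = decay          # every link fades a little per recorded visit
        self.threshold = threshold  # links weaker than this are not shown
        self.weights = defaultdict(float)

    def record_path(self, pages):
        """Update weights from one user's sequence of visited pages."""
        # Weaken everything slightly, so unused links fade over time.
        for link in list(self.weights):
            self.weights[link] *= self.decay
        # Strengthen each link the user actually followed.
        for src, dst in zip(pages, pages[1:]):
            self.weights[(src, dst)] += self.reward
        # Propose a new shortcut link a -> c whenever someone went a -> b -> c.
        for src, dst in zip(pages, pages[2:]):
            if src != dst:
                self.weights[(src, dst)] += self.reward / 2

    def links_from(self, page):
        """Links currently shown on a page, strongest first."""
        shown = [
            dst
            for (src, dst), w in self.weights.items()
            if src == page and w >= self.threshold
        ]
        return sorted(shown, key=lambda dst: -self.weights[(page, dst)])
```

a path a, b, c creates a weak shortcut from a to c, and the shortcut drops out of links_from again if nobody keeps using it.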
i disable sigs