Berners-Lee On The Semantic Web

Parsing natural language into semantics by streetmentioner · 2001-04-10 16:36 · Score: 4

Some systems exist to extract facts from language into semantic knowledge representations, and they're surprisingly good.

SNOWY is a system that "reads" the World Book Encyclopaedia and stores each fact about a concept into a hierarchic memory based on that concept. It's sufficiently sophisticated to be able to realise that "The bear digs up the nut" implies that the bear eats the nut, while "The miner digs up the coal" doesn't imply that. You can then ask it "what eats nuts" and it will reply correctly. (At least, this is my impression - I haven't used it, sadly.) As I remember, it can fully understand 50-60% of the sentences in the bits of the encyclopaedia that it has been commanded to parse.
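To make the idea concrete, here is a toy sketch in Python of a concept-indexed fact store that draws the bear/nut inference but not the miner/coal one. It is only an illustration, not SNOWY's actual design: the class `ConceptMemory`, its methods, the concept hierarchy and the single "an animal that digs up food eats it" rule are all assumptions made up for the example.

```python
from collections import defaultdict


class ConceptMemory:
    """Toy concept-indexed fact store (hypothetical; not how SNOWY works)."""

    def __init__(self):
        self.parents = {}              # concept -> parent concept
        self.facts = defaultdict(set)  # subject -> {(verb, object), ...}
        self.rules = []                # (subject_kind, verb, object_kind, implied_verb)

    def add_isa(self, child, parent):
        """Record that `child` is a kind of `parent`."""
        self.parents[child] = parent

    def is_a(self, concept, kind):
        """Walk up the hierarchy to see whether `concept` is a `kind`."""
        while concept is not None:
            if concept == kind:
                return True
            concept = self.parents.get(concept)
        return False

    def add_rule(self, subject_kind, verb, object_kind, implied_verb):
        """If a subject_kind does `verb` to an object_kind, it also does `implied_verb`."""
        self.rules.append((subject_kind, verb, object_kind, implied_verb))

    def tell(self, subject, verb, obj):
        """Store a stated fact plus anything the rules imply from it."""
        self.facts[subject].add((verb, obj))
        for s_kind, v, o_kind, implied in self.rules:
            if v == verb and self.is_a(subject, s_kind) and self.is_a(obj, o_kind):
                self.facts[subject].add((implied, obj))

    def who(self, verb, obj):
        """Answer questions of the form 'what <verb>s <obj>?'."""
        return sorted(s for s, fs in self.facts.items() if (verb, obj) in fs)


memory = ConceptMemory()
memory.add_isa("bear", "animal")
memory.add_isa("miner", "person")
memory.add_isa("nut", "food")
memory.add_isa("coal", "mineral")
memory.add_rule("animal", "digs up", "food", "eats")

memory.tell("bear", "digs up", "nut")
memory.tell("miner", "digs up", "coal")

print(memory.who("eats", "nut"))   # what eats nuts?
print(memory.who("eats", "coal"))  # nothing was inferred to eat coal
```

The hard part, of course, is getting from encyclopaedia sentences to the `tell` calls; this sketch assumes that parsing has already been done.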

The language it works on is fairly simple, but is nevertheless text designed for humans as opposed to computers. Systems like this could be a good bridge between language and semantics-based representations.

This is the best link I can find, unfortunately.

There are also, of course, dozens of systems designed to work on English text that has been specifically created to be computer-parsable, but still readable by humans.

I'm incredibly sceptical about all this sort of technology, but if the systems continue to evolve, the agents might be able to glean much of their knowledge from existing web pages.

Rational Programming vs Semantic Web by Baldrson · 2001-04-10 16:58 · Score: 4

As I posted to Slashdot a year ago on the topic:

The future of the Internet is in what I call "rational programming" derived from a revival of Bertrand Russell's Relation Arithmetic. Rational programming is a classically applicable branch of relation arithmetic's sub-theory of quantum software (as opposed to the hardware-oriented technology of quantum computing). By classically applicable I mean it applies to conventional computing systems -- not just quantum information systems. Rational programming will subsume what Tim Berners-Lee calls the semantic web. The basic problem Tim (and just about everyone back through Bertrand Russell) fails to perceive is that logic is irrational. John McCarthy's signature line says it all about this kind of approach: "He who refuses to do arithmetic is doomed to talk nonsense." More on this a bit later, but first some history, because he who fails to learn from history is doomed to repeat its nonsense:

When I invented the precursor to Postscript (an audacious claim that I can back up -- it started as a replacement for NAPLPS which I proposed while Manager of Interactive Architectures for Viewdata Corp of America back in November of 1981 -- the Xerox PARC guys found my approach of what they called a "tokenized Forth" communication protocol to be an intriguing way to encode text and graphics), I was interested in having a Forth virtual machine migrate into silicon (ala Novix) so it could evolve from mere graphics rendering into a distributed Smalltalk VM environment (ala Squeak) as videotex terminal/personal computer capacities increased. But I was _not_ interested in object-oriented programming as the long-term semantics of distributed programming environments. (I still have some of the hardcopy of the communiques with Xerox PARC and others from this period.)

Rather, relational semantics were what I saw as the ultimate direction for distributed programming. I had a bit of a go at Tony Hoare's "communicating sequential processes" paradigm and its Transputer realization because he was, at least, starting with the hard problem of parallelism rather than making like the drunk looking for his keys under the light post the way everyone else seemed to be doing (and still are, save for Mozart, since threads, etc. are always an afterthought). But, because there were other hard problems like abstraction, transactions and persistence that he ignored, I christened his approach "Occam's Chainsaw Massacre" in my communiques (in honor of his distributed programming language "Occam") and dropped it in favor of relational programming, which has inherent parallelism resulting from both dependency and indeterminacy. (BTW: Dr. Hoare seems to have finally come to his senses about this issue.)

Unfortunately, the only researcher doing hardcore work on relational programming (meaning, getting to the root of relational semantics in a way that Codd had failed to do) at the time was Bruce MacLennan, then, of The Naval Postgraduate School, and he just didn't have the glamour of Alan Kay at places like Xerox PARC to attract the attention of guys like Steve Jobs. Bruce had a bit of a blind-spot, too, when it came to transactions and persistence, which I attempted to remedy by bringing David P. Reed's work on distributed transactions for the ARPAnet to him, but although he wrote a white paper on a predicate calculus (close to a relational) implementation of Reed's thesis (MIT/LCS/TR-205), he didn't really "get it", IMHO. Reed and MacLennan abandoned their work for other pursuits (ironically, Reed was chief scientist at Lotus while Notes was being developed but did not contribute his ideas on distributed synchronization to that development despite the fact that we had a mutual acquaintance from my Plato days by the name of Ray Ozzie -- so, I share some of the blame for this failure) even as Steve Jobs botched the embryonic object oriented world by abandoning Smalltalk and giving us, instead, a lineage consisting of Object Pascal on the Lisa/Mac which begat Objective C on Jobs's NeXT which begat Java at Sun via Naughton and Gosling's experience with NeXT.

This brings us to the present -- a world in which Javascript-based technologies like Tibet promise to not only salvage the object oriented aspect of the Internet from the birth defects of Jobs's spawn, but actually provide an advance over Smalltalk in the same lineage as CLOS and Self. But it is also a world in which there is growing confusion over the proper role of "metadata" in the form of XML -- particularly when it comes to speech acts and distributed inference. I would call Tibet "the next major Internet advance" except for the fact that the basic idea for a Tibet-like system has been around and well understood since the early 1980's. When it is finally released, Tibet (or a system like it) will put the Internet back on track. I call that a "recovery", not an "advance".

We are now poised to move forward with type inference based on full blown inference engines, thereby dispensing with the nonterminating arguments over statically vs dynamically typed languages that allowed Steve Jobs's spawn to get its nose in the tent. If you want to declare a "type" in a declarative language, just make another declaration and let the inference engine figure out what it can do with that information prior to run time. See how easy that was? Well, there is more to it than that, but not that much: Assertions have implications and assertions made prior to run time have implications prior to run time. Live with it and don't repeat the mistakes of the past.
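A minimal sketch of what "assertions made prior to run time have implications prior to run time" could look like, assuming a plain propositional forward-chaining engine. The function `forward_chain`, the declarations and the rules are hypothetical, chosen only for illustration; a real inference engine would work over relations with variables, not fixed strings.

```python
def forward_chain(facts, rules):
    """Apply rules until nothing new can be derived (a fixpoint).

    `rules` is a list of (premises, conclusion) pairs, where `premises`
    is a frozenset of assertions that must all be known already.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


# "Type declarations" are just more assertions (hypothetical examples).
declarations = {"x is an integer", "y is an integer"}

rules = [
    (frozenset({"x is an integer", "y is an integer"}), "x + y is an integer"),
    (frozenset({"x + y is an integer"}), "x + y needs no runtime type check"),
]

# Everything below is worked out before the program itself ever runs.
derived = forward_chain(declarations, rules)
for assertion in sorted(derived - declarations):
    print(assertion)
```

Leave out one of the two declarations and neither conclusion is derived, which is the dynamically typed case: the check simply stays until run time.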

The confusion over semantic webs, and the reason Berners-Lee et al. will fail, is essentially the same as the confusion that has beleaguered all inferential systems such as logic programming and "artificial intelligence" over the years: logic is irrational and the real world demands rationality -- otherwise nothing makes sense. By "rationality" I mean that reasoning must literally incorporate "ratios" -- or, as John McCarthy would put it, doing arithmetic so things make sense. By making sense, I mean there is a sense in which one interprets the sea of assertions that clearly dominates for a particular purpose. With logic not only are you limited to 0 and 1 as effective quantities; you have no adequate theoretic basis from which to derive more accurate quantities with which to make sense by taking ratios and determining which inferences are dominant.

Fuzzy logic and expert systems incorporating probabilities have typically failed because they are not based in the first principles of probability and statistics. As Gauss, the premier probability theorist, put it, "Mathematics is the study of relations." He didn't say, "Mathematics is the study of multisets." There are good reasons that relational databases, and not set manipulation languages, have come to dominate business applications -- and Gauss was aware of these differences when he began to derive his laws of probability. Subsequent axiomatizations of mathematics based on set theory were similarly misguided and have led to the idea that "fuzzy sets" are the way to introduce rationality into programming. Rather than sets, relations are the foundation, not just of mathematics but of rationality in the same sense that Gauss realized when he derived his theory of probability from the study of relations.

Rationality allows for judgment which is recognized as inherently fallible -- but which allows one to proceed without exponentiating all possible paths of inference. Judgment also allows various identities to limit sharing of information to that needed -- thereby creating speech acts and a basis for rational measures of credibility associated with those identities. Since credit-rating is a degeneration of credibility, it should come as no shock that the invention of negative numbers, originating as they did with the Arabic invention of double entry account keeping, has its analog in something that might be called "logical debt" with which negative probabilities are associated.

And now we have come to the "quantum" aspect of rational programming. It is precisely the "credibility debt" aspect of rational programming that corresponds, in mathematical detail, to the various equations of quantum mechanics and their negative probability amplitudes. (Von Neumann's quantum logic failed to properly incorporate logical debt which has led to much confusion.) Logical debt is important to distributed programming for the same reason debt is important to financial networks. Logical debt is a way of handling poor synchronization of information flow in the same way that financial debt is a way of handling poor synchronization of cash flow. As in any rational system, there are both limits to credit and limits to credibility that influence one's judgments and actions, including speech acts.

The object oriented folks may, in a sense, have the last laugh here because when we divide up inference into identities that engage in speech acts, we are reintroducing the notion of objects that hide information via exchange of speech act messages that can be thought of as "setters" (assertions) and "getters" (queries). However, I believe it is only fair to recognize that the excellent intuitions of Ole-Johan Dahl and Kristen Nygaard did need the added insights and rigor of philosophers like J. L. Austin and T. Etter.

--
Seastead this.

Meaningful Web Content by MattGWU · 2001-04-10 16:15 · Score: 4

>>A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities

Good ideas, but I think we first need to make Web content that is meaningful to Humans before we start worrying about our Computers.

(Yeah, I know...not *that* kind of meaningful, but it had to be said, what with all the worthless drivel on the Internet and all)

--
"These people look deep within my soul and assign me a number based on the order in which I joined" --Homer re:

Behold the FUTURE of WEB typoGRAPHY by table+and+chair · 2001-04-10 17:00 · Score: 4

In the FUTURE random words in BLOCKS of text displayed ON THE web WILL be inexplicably highlighted IN A stylish PINK-ORANGE several point SIZES LARGER THAN the rest of the body text. This will come to be known as bernersing, and will BECOME a standard control in GUI web-design APPS, WITH options for frequency, DENSITY, and with the advent of the Semantic Web, relevance TO content (default for the latter = 0).

THOUGH this destroys the FLOW of the TEXT by wrenching the READER'S eye about and causing IT to pause, rather than travel naturally FROM WORD to word, this typographic treatment WILL BE hailed as a BREAKTHROUGH in internet desigN and will unleash a revolution OF NEW possibilities.

Too much technology.... by Peridriga · 2001-04-10 16:25 · Score: 5

I will be the first to say it.... I love technology... But, reading this makes me wonder what is enough...

Alas, voice activated and personalized networks are going to aid in everyday life (especially with those physically handicapped) but, it removes the most developed and complex form of communication... Human Interaction..

This is becoming less and less a factor in the average human's life.. With business going paperless and friends going wireless when does someone really have to talk to someone... If you telecommute and email your family, do you really have to talk to anyone, besides maybe your coffee maker when you get up in the morning..

I don't want to be an anti-technology advocate but, merely express an idea that we are excluding the most needed facet of human life... Interaction...

Prisoners are isolated for punishment... We are isolating ourselves for convenience?..

Well... my two cents.. y'all can make as much change out of it as possible...

--- My Karma is bigger than your...
------ This sentence no verb

What the hell, let's just merge them. by Flying+Headless+Goku · 2001-04-10 16:26 · Score: 4

The human-readable and computer-readable stuff, that is.

How? Lojban, a constructed language designed to be absolutely consistent and logical. You might know it in its earlier incarnation of Loglan, which was mentioned in passing as a language used for conversing with computers in Heinlein's "The Moon Is a Harsh Mistress."

Certainly, you could structure a valid Lojban statement to be unreadable to computers, but it isn't that way by default. If you state things directly, the computer can extract useful information.

This is why I'm absolutely 100% certain that we'll all learn Lojban soon. Yup, there is no doubt in my mind. None at all...
[rolls eyes, whistles a little tune]

What's the internet for? A more realistic example: by Flying+Headless+Goku · 2001-04-10 17:44 · Score: 5

The entertainment system was belting out "Put 'Em on the Glass" when the phone rang. When Pete answered, his phone turned the sound down by sending a message to all the other local devices that had a volume control. His mistress, Lucy, was on the line from the office: "I think we need to see a specialist and then have a series of physical sessions. Bi or something. I'm going to have my agent set up the appointments." Pete immediately agreed to pay the fees, after confirming that she meant a chick.

At her "advisor"'s office, Lucy instructed her Semantic Web agent through her vibrowser. The agent promptly retrieved information about the "treatment" from her advisor's agent, looked up several lists of providers, and checked for the ones within budget and a 20-mile radius of her home and with a rating of triple-H (Hot, Horny, and Healthy) on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules.

In a few minutes the agent presented them with a plan. Pete didn't like it. The university student housing was all the way across town from Lucy's place, and he'd be driving back in the middle of rush hour. He set his own agent to redo the search with stricter preferences about location and time. Lucy's agent, having complete trust in Pete's agent in the context of the present task, automatically assisted by supplying access certificates and shortcuts to the data it had already sorted through.

Almost instantly the new plan was presented: a much closer brothel and earlier times--but there were two warning notes. First, Pete would have to reschedule a couple of his less important appointments. He checked what they were--not a problem. The other was something about his STD checker's list failing to include this provider: "Non-contagiousness securely verified by other means," the agent reassured him. "(Details?)"

Lucy registered her assent at about the same moment Pete was muttering, "Spare me the details," and it was all set. (Of course, Pete couldn't resist the kinky details and later that night had his agent explain how it had found that provider even though it wasn't on the proper list.)

Slashdot Mirror
