Tim Berners-Lee and the Semantic Web

What Does 42 Mean for Privacy? by Allen+Zadr · 2004-09-27 05:51 · Score: 3, Interesting

'a single Web of meaning, about everything and for everyone.'

So, once this is off the ground, who wants to bet that the answer really is, 42?

Seriously though, this could be really cool, but I imagine that this could have some very adverse effects on privacy given the amount of information that finds itself on the web. Items that are linked by obscurity in disparate places would be easily linked into a single profile (if the stuff he's talking about isn't primarily smoke and mirrors). Either way, like any powerful technology, it will have both good and bad consequences. Here's hoping for the good...

--
Kinetic stupidity has a new brand leader: Allen Zadr.

Re:What Does 42 Mean for Privacy? by cynic10508 · 2004-09-27 06:44 · Score: 2, Insightful

Seriously though, this could be really cool, but I imagine that this could have some very adverse effects on privacy given the amount of information that finds itself on the web. Items that are linked by obscurity in disparate places would be easily linked into a single profile (if the stuff he's talking about isn't primarily smoke and mirrors). Either way, like any powerful technology, it will have both good and bad consequences. Here's hoping for the good...

People would do well to note the principle: Security by obscurity isn't.
Re:What Does 42 Mean for Privacy? by Allen+Zadr · 2004-09-27 06:57 · Score: 4, Insightful

Ah, but what constitutes privacy but an obscurity of your own behaviors in certain circles.
That is to say, I may be an item scammer in online gaming realms, or in Diablo, but not in EverQuest. However, I may be one of the most honest people I know in the real world. Perhaps I have a second account that I use to Troll on Slashdot, but otherwise have this account where I try to post insightful information. You have the right to link these things, you may even have the right to link these to real world data like where I work and where I park my car. However, if I jilted someone in Diablo, do I want them to so easily find me and take it out on my car (as some people would)?
Do I want my employer having instant access to all of my online transactions, regardless if I'm on shift or off shift at the time? Individually, these are not things that have been considered something you would even want to 'secure', yet they may be valuable to someone.

--
Kinetic stupidity has a new brand leader: Allen Zadr.
Re:What Does 42 Mean for Privacy? by Z4rd0Z · 2004-09-27 06:59 · Score: 1

Right...because you certainly wouldn't want to do anything like obscure your data through encryption. That wouldn't be secure. That's why I insist my bank lets me send my password in the clear.

--
You had me at "dicks fuck assholes".
Re:What Does 42 Mean for Privacy? by cynic10508 · 2004-09-27 07:04 · Score: 1

Right...because you certainly wouldn't want to do anything like obscure your data through encryption. That wouldn't be secure. That's why I insist my bank lets me send my password in the clear.

No. Obscurity is putting something like your source code in a pantry and merely hoping that no one ever looks in that pantry. Encryption, on the other hand, intentionally alters the data in such a way that the number of entities who can read it is controlled. You're not obscuring that bank data, because you're still sending it over insecure, public networks; it's just encrypted.
Re:What Does 42 Mean for Privacy? by Haydn+Fenton · 2004-09-27 07:06 · Score: 1

As much as I want there to be a semantic web at some point, hopefully sooner rather than later, I can't see it being developed to the point of mass usefulness anytime soon.

There are many, many problems which stopped me from trying to develop my own things for the semantic web.

First of all, sorry for my ignorance if many of these problems have a solution, I haven't followed the development of the semantic web for a while now..
Let's think about this.. let's say there's some user, Ben Smith, who enters his perfectly valid information about himself into a database about people. Due to the massive number of people named Ben Smith, when someone tries to get information about Ben Smith (assuming there's no data protection stuff and people are free to access info on him), it would yield results of people from Canada, England, various states in the US, as well as several other locations, all of which are correct.. yet not much use to us.
Considering that correct information will be pretty useless a lot of the time without being very very pedantic in the query (which we probably won't be able to do, since we are searching for info about it in the first place), just imagine the problems that would arise if somebody somehow enters false information into the semweb. Would *all* information have to be verified? That could and would take far, far, far too long and would put a serious damper on expansion.
What about how to collect the information? Would programs be made to scan the web for information? If so, these programs would have to be extremely clever.. anybody who has tried to make AI programs will know that making sense of English is difficult at the best of times, let alone when people use misleading (even if not intentional) grammar or spelling. If not, again, it would take a heck of a long time to get lots of useful information into the semantic web..
How would it all be updated? Think about how many new news stories appear on the web every day: how can it all be added without verification or AI-type programs? The list of problems that spring to mind is long...

That said, I also hope for the good, just don't be expecting much anytime soon.
Re:What Does 42 Mean for Privacy? by cynic10508 · 2004-09-27 07:10 · Score: 2, Insightful

Ah, but what constitutes privacy but an obscurity of your own behaviors in certain circles.

I would disagree. I would say privacy is more like cryptography in that privacy is the ability to control who knows certain information. So privacy is confidentiality.

That is to say, I may be an item scammer in online gaming realms, or in Diablo, but not in EverQuest. However, I may be one of the most honest people I know in the real world. Perhaps I have a second account that I use to Troll on Slashdot, but otherwise have this account where I try to post insightful information. You have the right to link these things, you may even have the right to link these to real world data like where I work and where I park my car. However, if I jilted someone in Diablo, do I want them to so easily find me and take it out on my car (as some people would)?

Well, this goes off on a tangent. I would argue that you're making an incorrect metaphysical and/or epistemological distinction in dividing your "virtual" and "real" personas. What is ethical in one is ethical in another and vice-versa.

Do I want my employer having instant access to all of my online transactions, regardless if I'm on shift or off shift at the time? Individually, these are not things that have been considered something you would even want to 'secure', yet they may be valuable to someone.

Kind of another tangent. If you're using your employer's network then legally you've pretty much given up the right to privacy. My suggestion would be not to use company computers to do anything that you wouldn't want them looking at.
Re:What Does 42 Mean for Privacy? by Anonymous Coward · 2004-09-27 07:47 · Score: 1, Funny

+1 Pretentious for using the words metaphysical and epistemological in a Slashdot post

+1 Pedantic for managing to spell these words correctly in a Slashdot post
Re:What Does 42 Mean for Privacy? by Jahf · 2004-09-27 07:57 · Score: 1

Security through obscurity when used alone isn't.

However anything that is well secured can have that security enhanced by obscurity as well.

--
It is more productive to voice thoughtful opinions (reply) than to judge (moderate) others.
Re:What Does 42 Mean for Privacy? by Allen+Zadr · 2004-09-27 08:03 · Score: 1

Many of the issues of 'cleverness' are dealt with by the definition itself. First off, the Semantic Web relies on XML and/or RDF. Both are ways of describing disparate data-sets syntactically. This way the 'searching' programs do not have to be clever to glean useful information from the data.
The important part is assigning levels of trust to each data-set ( a score perhaps ), and in some cases, even a negative score to some sources ( an RDF feed from HoaxBusters for example, where most of the subject matter is a negative truth [[ i.e. not true ]] ).
Of course, trust is always a shady proposition. Do you trust a Slashdot RDF feed to make assertions about your relationship to common trolls? If you post in a story that also has posts with the Goat Sex pic, what does that say about you, especially to a programmer in Elbonia who's never heard of Slashdot (one of the few places where such an image is relatively common)? Unsettling, no?
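
A minimal sketch of the trust-score idea above, in Python. The source names and the scores are made up for illustration, and summing scores is only one naive way to combine them; none of this is part of RDF itself.

```python
from collections import defaultdict

# Hypothetical trust scores per source, in the range -1.0 to 1.0.
# A negative score marks a source whose listed claims are mostly false
# (for example a hoax-debunking feed).
SOURCE_TRUST = {
    "encyclopedia.example": 0.9,
    "random-homepage.example": 0.2,
    "hoax-list.example": -0.8,
}


def score_claims(assertions, source_trust):
    """Sum the trust of every source that repeats a claim.

    `assertions` is an iterable of (source, claim) pairs.
    Sources without a known score contribute nothing.
    """
    totals = defaultdict(float)
    for source, claim in assertions:
        totals[claim] += source_trust.get(source, 0.0)
    return dict(totals)


assertions = [
    ("encyclopedia.example", "water boils at 100 C at sea level"),
    ("random-homepage.example", "water boils at 100 C at sea level"),
    ("random-homepage.example", "chain letter X is real"),
    ("hoax-list.example", "chain letter X is real"),
]

scores = score_claims(assertions, SOURCE_TRUST)
for claim, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{score:+.1f}  {claim}")
```

A claim that appears in the negatively scored feed ends up below zero even though another source repeats it, which is roughly the behaviour described above.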

--
Kinetic stupidity has a new brand leader: Allen Zadr.
Re:What Does 42 Mean for Privacy? by Allen+Zadr · 2004-09-27 08:26 · Score: 2, Interesting

The Semantic Web is for chasing tangents. Sorry if this seems marginal to you.
My point in the virtual vs. real persona is that you cannot expect the same behavior patterns from the same people given totally different situations. My killing your character in an online death-match does not mean I would be unethical enough to kill you. Likewise, if I pick up trinkets from the monsters you have slain (clearly, they are not my spoils to take), this does not mean that I will take tips off of tables at a restaurant.
Similarly, most of my 'online' activity is done from home. That does not mean that a semantic web is designed to tell the difference. In fact, just the opposite. It's designed to merge all data that's available on me into a single profile. Again, this could be misleading. If I spend 3 hours (average) per day gaming, does this make me less capable of doing my job? Maybe, maybe not. Would this change the way my employer perceives my performance? Probably, yes.
The other point which I think you are trying to make, is that if the data is out there, then it can already be searched out from other means already. This may be the case, but not necessarily.
Given a much more personal example: If my cross-identity is posted by a friend on an obscure site, Google may pick that up. If you then trace my cross-identity into the online world, you will find many, many postings - as well as political views (mostly by the name you see me posting under now). My politics definitely don't agree with those who pay my salary. Would they hold these politics against me if they were easily traced? I don't know. I honestly don't want to find out. Point being, the semantic web (if working) would quickly link me with my politics.
My greater fear, it would be just as easy for an advertiser to do this (not that they don't already to some extent), it would just be even easier. The only benefit? I may stop getting ads for things I don't need.
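
To make the "single profile" worry concrete, here is a rough Python sketch of how scattered records could be merged once any two of them share an identifier. The records and identifier strings are invented, and real semantic web tools may well do this differently; union-find is simply one convenient way to show the effect.

```python
def merge_profiles(records):
    """Group identifiers into profiles.

    `records` is a list of sets of identifiers that some source claims
    belong to the same person. Records that share any identifier are
    merged transitively (union-find).
    """
    parent = {}

    def find(item):
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for record in records:
        ids = sorted(record)
        if not ids:
            continue
        find(ids[0])
        for other in ids[1:]:
            union(ids[0], other)

    groups = {}
    for ident in list(parent):
        groups.setdefault(find(ident), set()).add(ident)
    return sorted(groups.values(), key=sorted)


# Made-up records from three unrelated sites.
records = [
    {"nick:gamer42", "mail:a@home.example"},
    {"mail:a@home.example", "name:A. Person"},
    {"nick:troll99", "mail:b@other.example"},
]

profiles = merge_profiles(records)
for profile in profiles:
    print(sorted(profile))
```

One shared e-mail address is enough to tie the gaming nickname to the real name, even though no single site published both.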

--
Kinetic stupidity has a new brand leader: Allen Zadr.
Re:What Does 42 Mean for Privacy? by Z4rd0Z · 2004-09-27 09:50 · Score: 1

Putting your source code in a pantry alone of course does not equal security. All other things being equal, however, the obscured system will probably be more secure. Say you programmed a closed source web server with security in mind. Don't you think the closed system might be harder to crack than Apache? As an example, look at MS Word format. It's locked up, and all the open source hackers in the world can't come up with a completely compatible implementation.

--
You had me at "dicks fuck assholes".
Re:What Does 42 Mean for Privacy? by crschmidt · 2004-09-27 12:04 · Score: 2, Informative
There are several solutions to the problems you describe. I'll address the few I'm most comfortable with responding to - not because the others are unsolvable, simply because I don't want to provide inadequate information.

All information on the web should be taken with, as they say, a grain of salt. Depending on what you are looking at, it has more or less value. For example, something on Wikipedia can probably be assumed to be relatively accurate, whereas something on Joe Schmo's website on Geocities will probably be considered to be less accurate in general. The semantic web allows for you to see who is saying something in a number of ways, and to verify this information:
- URI Source - If the source of data about Chevy Trucks is at chevy.com/trucks.rdf, you'll probably have a pretty good reason to trust it.
- dc:creator - a self-assigned name for the creator of the document
- Most importantly, wot:assurance: a signature, using standard public/private key encryption, of a document, assuring that the signer indeed did create the information
Each of these methods of determining where information is coming from has its own special place in assigning credence to the document in question. Thus, if a document signed by crschmidt@crschmidt.net says that the person "Christopher Schmidt" owns the email address crschmidt@crschmidt.net - it's probably safe to trust that claim.
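
As a rough illustration of the three signals listed above, this Python sketch pulls them out of a small RDF/XML document using only the standard library. The document, the host name and the signature file name are invented for the example, and the code only locates the wot:assurance link; actually checking the signature would need a separate tool and is not shown.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "wot": "http://xmlns.com/wot/0.1/",
}

# Made-up document, for illustration only.
DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:wot="http://xmlns.com/wot/0.1/">
  <rdf:Description rdf:about="">
    <dc:creator>Example Author</dc:creator>
    <wot:assurance rdf:resource="trucks.rdf.asc"/>
  </rdf:Description>
</rdf:RDF>"""


def provenance_hints(source_url, rdf_xml):
    """Collect the host, the dc:creator text and the wot:assurance link."""
    root = ET.fromstring(rdf_xml)
    creator = root.find(".//dc:creator", NS)
    assurance = root.find(".//wot:assurance", NS)
    resource_attr = "{%s}resource" % NS["rdf"]
    return {
        "host": urlparse(source_url).hostname,
        "creator": creator.text if creator is not None else None,
        "signature_url": (
            assurance.get(resource_attr) if assurance is not None else None
        ),
    }


hints = provenance_hints("http://chevy.example/trucks.rdf", DOC)
print(hints)
```

How much weight each hint deserves is a policy question for whoever consumes the data; the sketch only gathers them.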

Once the data is available on the web, it is easy to find other data: one of the basic terms is "seeAlso" - a way of providing other URLs to look for data at. Once the web starts, it is easy to link it, and to do so is to increase the data. You don't need something smart or intelligent - simply wander around, collect all the rdfs:seeAlso links, and download those - and continue from there. This process, known as "scuttering", is an easy way to start creating a relatively large data store.
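
The scuttering loop described above can be sketched in a few lines of Python. To keep it runnable without a network, the "web" here is a hypothetical dictionary mapping each URL to its rdfs:seeAlso targets; a real scutter would download and parse each document instead, and all URLs below are made up.

```python
from collections import deque

# Hypothetical stand-in for the web: URL -> rdfs:seeAlso targets.
SEE_ALSO = {
    "http://a.example/foaf.rdf": [
        "http://b.example/foaf.rdf",
        "http://c.example/foaf.rdf",
    ],
    "http://b.example/foaf.rdf": [
        "http://a.example/foaf.rdf",
        "http://d.example/foaf.rdf",
    ],
    "http://c.example/foaf.rdf": [],
    "http://d.example/foaf.rdf": ["http://missing.example/foaf.rdf"],
}


def fetch_see_also(url):
    """Return the seeAlso links of a document, or None if it is unavailable."""
    return SEE_ALSO.get(url)


def scutter(start_url, max_docs=100):
    """Breadth-first walk over seeAlso links, visiting each URL once."""
    seen = {start_url}
    queue = deque([start_url])
    collected = []
    while queue and len(collected) < max_docs:
        url = queue.popleft()
        links = fetch_see_also(url)
        if links is None:
            continue  # dead link: skip it and carry on
        collected.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return collected


visited = scutter("http://a.example/foaf.rdf")
print(visited)
```

The `seen` set is what keeps the walk from looping when two documents point at each other, and `max_docs` is an arbitrary safety limit.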

Using descriptions of when information is updated allows tools to understand when they should check back for more information. Similar to the way RSS feeds (which are a part of the Semantic Web) can inform tools that they will be updated in 2, 4, 6, 24 hours, general RDF documents can do the same thing - saying "check me again in a week" or more.
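
A small Python sketch of how a tool might turn such a hint into a next-check time. The parameter names are modelled loosely on the updatePeriod and updateFrequency fields of the RSS 1.0 syndication module; the function itself and the choice of supported periods are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Fixed-length periods only; monthly and yearly are left out of this sketch.
PERIODS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def next_check(last_fetch, update_period="daily", update_frequency=1):
    """Return when to fetch again.

    `update_frequency` is the number of updates expected per period,
    so the wait between checks is period / frequency.
    """
    if update_frequency < 1:
        raise ValueError("update_frequency must be at least 1")
    return last_fetch + PERIODS[update_period] / update_frequency


last = datetime(2004, 9, 27, 12, 0)
print(next_check(last, "daily", 4))   # a feed updated four times a day
print(next_check(last, "weekly", 1))  # "check me again in a week"
```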

There are currently tools for working with the semantic web in a small scale. Although this is nothing like the big dream - having almost everything described, so that computers can really understand the world around them - these tools do have their usefulness. I can now ask "What is the name of the person whose aim name is cr5chmidt", and be told the answer. Although it's not perfect - very little about the semantic web is perfect yet - it doesn't need to be. For more information, see my post on the bot I created to spider semweb data in my blog.
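
The kind of question mentioned above boils down to matching patterns against (subject, predicate, object) triples. Here is a toy Python version: the property names follow FOAF (foaf:aimChatID, foaf:name), the first person is taken from this post, the second is made up, and the matching code is a sketch rather than how any particular bot or RDF store works.

```python
# Tiny in-memory triple store: (subject, predicate, object).
TRIPLES = [
    ("_:p1", "foaf:aimChatID", "cr5chmidt"),
    ("_:p1", "foaf:name", "Christopher Schmidt"),
    ("_:p2", "foaf:aimChatID", "someoneelse"),  # made-up second person
    ("_:p2", "foaf:name", "Someone Else"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every field that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def names_for_aim(triples, aim_name):
    """Answer: what is the name of the person whose AIM name is `aim_name`?"""
    people = {s for (s, _, _) in match(triples, predicate="foaf:aimChatID", obj=aim_name)}
    return sorted(
        o
        for person in people
        for (_, _, o) in match(triples, subject=person, predicate="foaf:name")
    )


print(names_for_aim(TRIPLES, "cr5chmidt"))
```

The query is two pattern matches joined on the subject, which is also the basic shape of queries in RDF query languages.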

As you said, it won't be easy. However, it is possible, and it seems to me more and more likely each day that working on these tools and increasing the amount of semantic data in every little way can help.
--
-- Christopher Schmidt YouTube Quality of Experience
Re:What Does 42 Mean for Privacy? by blue+trane · 2004-09-27 13:45 · Score: 2, Insightful

Didn't they come up with a few viruses for it though?
Re:What Does 42 Mean for Privacy? by Z4rd0Z · 2004-09-27 14:31 · Score: 1

For the Word format? No.

--
You had me at "dicks fuck assholes".
Re:What Does 42 Mean for Privacy? by blue+trane · 2004-09-27 14:41 · Score: 1

wasn't your point that "the obscured system will be more secure"? ms word is obscured, but it's not more secure. i suppose you could argue that it's the macro language that's not secure but since that's included in the system...
Re:What Does 42 Mean for Privacy? by Z4rd0Z · 2004-09-27 17:08 · Score: 1

I'm only trying to make the point that the Word format is closed, and because of that no one has been able to completely duplicate it. In other words, it's harder to see into the closed system than an open one. But I'm also not saying all closed systems are secure or that closed systems are the way to attain security. I said all other things being equal, if you pit two security minded programs against each other, the closed one should naturally have less chance of being exploited because it's harder to see what's going on inside. Please note that I'm not arguing against open source or open standards, it's just that I hear that mantra "security through obscurity == bad". I don't see how that's 100% true.

--
You had me at "dicks fuck assholes".
Re:What Does 42 Mean for Privacy? by cynic10508 · 2004-09-27 18:44 · Score: 1

The Semantic Web is for chasing tangents. Sorry if this seems marginal to you.

My work in semantics has left me with the idea of semantics as a tool to identify related concepts, not necessarily concepts I'd regard as tangential. For instance, using semantics to recognize that an umpire and a baseball field are directly related, while perhaps Crackerjacks are tangential to the topic at hand (i.e. baseball).

My point in the virtual vs. real persona is that you cannot expect the same behavior patterns from the same people given totally different situations. My killing your character in an online death-match does not mean I would be unethical enough to kill you. Likewise, if I pick up trinkets from the monsters you have slain (clearly, they are not my spoils to take), this does not mean that I will take tips off of tables at a restaurant.

My interpretation of "virtual" versus "real" in your original post was akin to sending threatening e-mails online but always being passive in real life. So this was my misunderstanding.

Similarly, most of my 'online' activity is done from home. That does not mean that a semantic web is designed to tell the difference. In fact, just the opposite. It's designed to merge all data that's available on me into a single profile. Again, this could be misleading. If I spend 3 hours (average) per day gaming, does this make me less capable of doing my job? Maybe, maybe not. Would this change the way my employer perceives my performance? Probably, yes.

I was addressing this from a legal standpoint. The employer is legally liable for the actions of its employees on the network and therefore has a keen interest in precluding "improper" activities.

The other point which I think you are trying to make, is that if the data is out there, then it can already be searched out from other means already. This may be the case, but not necessarily.

Not sure where I say anything like that.

Given a much more personal example: If my cross-identity is posted by a friend on an obscure site, Google may pick that up. If you then trace my cross-identity into the online world, you will find many, many postings - as well as political views (mostly by the name you see me posting under now). My politics definitely don't agree with those who pay my salary. Would they hold these politics against me if they were easily traced? I don't know. I honestly don't want to find out. Point being, the semantic web (if working) would quickly link me with my politics.

The argument here being: don't post personally identifiable information on the Internet. Such semantic systems can and do work. This is just another reason to not put your information out there.

A second argument seems to be against employers using this technology to surveil their employees. This is illegal. Does it mean that employers will never do it? No. But when it happens there will be a legal precedent set against it. Granted, this is assuming you don't work for the DoD, DoJ or DoE where you willingly sign away the rights to such privacy.

My greater fear, it would be just as easy for an advertiser to do this (not that they don't already to some extent), it would just be even easier. The only benefit? I may stop getting ads for things I don't need.

Oh they do do this. The question is: do they realize they're using semantics? Most likely not. Their approaches in identifying interests etc. are similar but far too coarse and inflexible to be compared to "proper" semantics, the area of linguistics.

The Semantic Web is the next big thing by Anonymous Coward · 2004-09-27 05:52 · Score: 2, Funny

and has been for over a decade (or more).

Re:The Semantic Web is the next big thing by the+MaD+HuNGaRIaN · 2004-09-27 05:53 · Score: 1

No, you're thinking of Aspect Oriented Programming.
Re:The Semantic Web is the next big thing by Alien54 · 2004-09-27 05:57 · Score: 0, Redundant

Quick! patent it before Microsoft does.....
ooops, too late
SCO says they own that too....

--
"It is a greater offense to steal men's labor, than their clothes"
Re:The Semantic Web is the next big thing by Anonymous Coward · 2004-09-27 09:26 · Score: 0

Wasn't it actually Harry Morris that originally implemented a big part of this idea with the WAIS protocol?

'Twas a happy day on SemWebCentral... by tcopeland · 2004-09-27 05:52 · Score: 3, Interesting

...when the man himself signed up for a user account. w00t!

--
The Army reading list

What is the semantic web? by Anonymous Coward · 2004-09-27 05:52 · Score: 5, Informative

Well, beyond the "knowledge management"-type mumbo jumbo, anyway. Some basic definitions are here, here, and .

Re:What is the semantic web? by Anonymous Coward · 2004-09-27 06:27 · Score: 0

Oops, that last link should be this. I didn't add any text between the anchor tags, but if you have "Display Link Domains" on, you will see the above as:

"here [w3.org], here [wikipedia.org], and [google.com]."

Which is accidentally interesting to me because it could be argued that google.com is rather like the semantic web.

This will happen... by Anonymous Coward · 2004-09-27 05:53 · Score: 0

...as soon as web services are up and running.

You don't want a "single" web... by Pig+Hogger · 2004-09-27 05:53 · Score: 3, Insightful

You don't want a "single" web... You want a multitude of them, and carefully isolate them (beyond normal information reading and referencing).

This is to insure against a monoculture that is so disastrous in computer circles as demonstrated by the numerous security failings of Windows...

Re:You don't want a "single" web... by JimDabell · 2004-09-27 06:03 · Score: 3, Insightful

This is to insure against a monoculture that is so disastrous in computer circles as demonstrated by the numerous security failings of Windows...

Windows executes stuff. The semantic web is just data. Your warnings about a monoculture apply to the semantic web about as much as they apply to text files.
Re:You don't want a "single" web... by escher · 2004-09-27 06:13 · Score: 0, Offtopic

The semantic web is just data. Your warnings about a monoculture apply to the semantic web about as much as they apply to text files.

Remember when you couldn't get a virus just by reading an e-mail? That's Microsoft for you... making the impossible possible!
Re:You don't want a "single" web... by JimDabell · 2004-09-27 06:16 · Score: 3, Insightful

Remember when you couldn't get a virus just by reading an e-mail?

Yes, and again, the problem is when the stuff that executes has a monoculture. It's not like you see Pine users or KMail users infected by emails with Outlook viruses in.
Re:You don't want a "single" web... by sik0fewl · 2004-09-27 18:09 · Score: 1

Windows executes stuff. The semantic web is just data. Your warnings about a monoculture apply to the semantic web about as much as they apply to text files.

Oh no! If this is as bad as you say it is we need to burn all txts!

--
I remember when legal used to mean lawful, now it means some kind of loophole. - Leo Kessler

Duplicate Posting by Anonymous Coward · 2004-09-27 05:53 · Score: 5, Funny

See the original here.

Actually Slashdot posts this article over and over again every few months, with basically the same headline (sometimes "and" sometimes "on" sometimes "Tim" sometimes not). Kinda bizarre really. :-) I've never read any of them, I only know this Berners-Lee fellow from the headlines.

Re:Duplicate Posting by shmigget · 2004-09-27 16:44 · Score: 1

Thank you. We've been reading about Berners-Lee's Semantic Web since at least 2001. How do the /. editors get to thinking this is news?

Dang CERNopeans! by Anonymous Coward · 2004-09-27 05:53 · Score: 4, Funny

As we all know, Al Gore is the hero of the Web's creation story.

Re:Dang CERNopeans! by Anonymous Coward · 2004-09-27 07:16 · Score: 0

" As we all know, Al Gore is the hero of the Web's creation story.

And George W. Bush is just stupid enough to take credit for the semantic web. ;)

Can you name 3 policies that he, George W. Bush has implemented that have been positive?
Re:Dang CERNopeans! by G.+W.+Bush+Junior · 2004-09-27 09:39 · Score: 1

I never got why Gore's small anecdotes attracted so much negative attention in the US... sure he didn't "create" the web, but he was in a small committee that had enough foresight to fund the project.
It was obviously not an attempt to convince the voters that he was an engineer - but simply an anecdote with no malintent.

However Bush blatantly lied about his support for healthcare in Texas... but for some reason Gore was stamped as untrustworthy while Bush's lie was explained as incompetence on Bush's part (he didn't remember what he himself had voted for).

For some reason incompetence is not enough to disqualify you from becoming president of the United States of America.. well obviously everything will be okay as long as his administration is competent... That was the reasoning whenever Bush said something dumb during the campaign...

Funny how that election could have told you a lot about things to come...

--
"I don't know that Atheists should be considered as citizens, nor should they be considered patriots." -George H.W. Bush

"Where's some semantic web software?" by tcopeland · 2004-09-27 05:55 · Score: 4, Informative

This always gets asked - and a partial answer is right here.

Eclipse plugins, visualization tools... there's some good stuff there.

--
The Army reading list

Re:"Where's some semantic web software?" by Schwarzchild · 2004-09-27 13:05 · Score: 2, Interesting

Yeah but is it anything that you'd want to use?
The God Emperor of XML, Tim Bray, doesn't seem to know of any such software so he posted a challenge.

--
"sweet dreams are made of this..."
Re:"Where's some semantic web software?" by J1 · 2004-09-27 20:42 · Score: 1

Slashdot ran an article on Bibster some time ago, which uses Semantic Web technology under the hood.

Then there are the entries for the Semantic Web Challenge, organized by the ISWC, which have some interesting and useful applications.

These are just a few pointers to semantic web software. There's more, I'm sure.
Re:"Where's some semantic web software?" by anomalous+cohort · 2004-10-01 06:15 · Score: 1
Here's my vote for visualization and editing tools.
- Isaviz is the best for visualization and navigation
- Protege is cool for editing ontologies

The rest of us call this... by Amiga+Lover · 2004-09-27 05:55 · Score: 1, Insightful

The rest of us call this... GOOGLE.

works for me.

Re:The rest of us call this... by aLe-ph-1(sh) · 2004-09-27 06:00 · Score: 0

seems to me that GMail would do a really good job of doing this. When you want all the info on someone, and you have all of their emails, sent to a whole bunch of their friends, and so on and so forth, well, it's been talked about before here on /.

--
sig!wind down the juuice, let the tubes roar with the glow of alternative powers, not they that be." me, today...
Re:The rest of us call this... by BigGerman · 2004-09-27 06:01 · Score: 2, Insightful

Exactly.
And here is the problem: what are "the rest of us" going to do when Google goes south? Either it collapses under its own weight or it is finally broken by its corporate overlords?
Can't put all the eggs in one basket. The only sane future is the one with unified, object-driven search and retrieval methods distributed amongst information consumers and producers.
Re:The rest of us call this... by mr_majestyk · 2004-09-27 06:03 · Score: 3, Informative

The rest of us call this... GOOGLE.

Google identifies relationships between data using only the links between pages containing the data.

The Semantic web represents relationships between data based on metadata (i.e. data about data). This is a far more powerful way to describe the meaning of data.

works for me.

Maybe, but that doesn't mean it's the best way to accomplish what you are trying to do.
Re:The rest of us call this... by NoInfo · 2004-09-27 06:04 · Score: 1

Will Google go south?

--
bug.gd: error search engine. Humanity working together to solve all errors.
Re:The rest of us call this... by bongoras · 2004-09-27 06:21 · Score: 4, Insightful

The Semantic web represents relationships between data based on metadata (i.e. data about data). This is a far more powerful way to describe the meaning of data.

And this is what makes me wonder if this will amount to much more than an interesting research project for grad students. In order for the SemWeb to amount to anything useful, everyone is going to have to include the metadata necessary to integrate their data into the Semantic Web. How's that going to work? Who's going to make it work?
Re:The rest of us call this... by 0racle · 2004-09-27 06:24 · Score: 1

Google will go down the same way every ubersearch engine did before: someone will create something better and everyone will begin to use that instead, and Google will go the way of WebCrawler and AltaVista and everyone else into an obscure corner of the web. It just might take a little longer since they seem to have struck a chord with Gmail.

--
"I use a Mac because I'm just better than you are."
Re:The rest of us call this... by mr_majestyk · 2004-09-27 06:28 · Score: 1

everyone is going to have to include the metadata necessary to integrate their data into the Semantic Web. How's that going to work? Who's going to make it work?

It's already happening...check out sites like flickr (photo blogging) and del.icio.us (collaborative bookmarks).
Re:The rest of us call this... by j1m+5n0w · 2004-09-27 06:33 · Score: 3, Interesting

Google identifies relationships between data using only the links between pages containing the data.
The Semantic web represents relationships between data based on metadata (i.e. data about data). This is a far more powerful way to describe the meaning of data.

This is an important point. Google computes the PageRank of a page based on the principal eigenvector of the web link matrix, which is a clever and usually effective approach. Unfortunately, each link only conveys a little bit of information. A link from page A to page B is assumed to be an endorsement of page B's relevance by page A. But what if you could add extra metadata to the links? Not just a URL and a human readable text label, but a machine readable label as well, like this:
<a href="http://slashdot.org" relevance="0.3" novelty="0.8" accuracy="-0.2" funny="0.2"> slashdot </a>

If you could apply arbitrary attributes to web pages, google would have much better information to go on, and a user could specify the importance of certain attributes depending on what he/she is looking for.
-jim
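
For anyone curious what "eigenvector of the link matrix" means in practice, here is a small Python sketch using power iteration, with per-link weights standing in for something like the relevance attribute above. It is a textbook-style approximation written for illustration, not Google's actual algorithm; the damping value, the page names and the rule that non-positive weights are ignored are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each page to a dict of {target: weight}. A page shares
    its rank among its targets in proportion to the weights. Links with
    a weight of zero or less are ignored in this sketch.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = {t: w for t, w in links.get(page, {}).items() if w > 0}
            total = sum(targets.values())
            if total == 0:
                # Page with no usable links: spread its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
            else:
                for target, weight in targets.items():
                    new_rank[target] += damping * rank[page] * weight / total
        rank = new_rank
    return rank


# Plain links: A endorses B and C equally.
plain = pagerank({"A": {"B": 1, "C": 1}, "B": {"A": 1}, "C": {"A": 1}})

# Annotated links: A marks B as far more relevant than C.
weighted = pagerank({"A": {"B": 0.9, "C": 0.1}, "B": {"A": 1}, "C": {"A": 1}})

print(plain)
print(weighted)
```

With plain links B and C come out equal; once A's links carry weights, B pulls ahead of C, which is the extra information the annotated link would give a search engine.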
Re:The rest of us call this... by the+chao+goes+mu · 2004-09-27 06:37 · Score: 1

" unified, object-driven search and retrieval methods distributed amongst information consumers and producers"
Nice marketing-speak! Will it be object-oriented, three-tiered, scalable, interactive and java-based too?

--
Boys from the City. Not yet caught by the Whirlwind of Progress. Feed soda pop to the thirsty pigs.
Re:The rest of us call this... by JimDabell · 2004-09-27 06:37 · Score: 3, Interesting

Google's a hack. No, really, it tries to extract meaning from web pages that really aren't engineered to store that kind of information.

Google is also an application. The Semantic Web is all about building the infrastructure so applications like Google don't have to chase the holy grail of AI to become more than a hack. Think of the Semantic Web as the layer underneath Google.
Re:The rest of us call this... by Anonymous Coward · 2004-09-27 06:43 · Score: 0

google brings you rss feeds? yes? oh well.

seriously, the semantic web brings VERY different applications, built on INTERACTION.

i personally don't think it will improve searching much, but there's at least one project which is nicely aimed at that... foafspace.com, which looks up people. it indexes a lot of LiveJournal data, so it knows pretty much, i guess :)
Re:The rest of us call this... by timeOday · 2004-09-27 07:16 · Score: 1

Google's a hack. No, really, it tries to extract meaning from web pages that really aren't engineered to store that kind of information.
Granted. Unfortunately, I think we're stuck with it, and the Semantic Web will never catch on.
Having interchangeable data with semantic information is actually a much bigger idea than the WWW itself, in fact. For instance, it would be nice to transfer information about purchases from any cash register to your PDA, then later to Quicken. Technically, it's not *that* hard, but it hasn't happened.
In fact, the Web itself has grown less semantic over time! The original idea for HTML was that you'd mark up content with a description of what it was, rather than how to display it. Then any device could use its "understanding" of the document (conveyed for instance by paragraph tags) to render the page appropriately. This is an example of a (somewhat) semantic Web.
That idea has been rejected. In the end most Web creators preferred convenience and/or exact control over appearance to the ability to use the data more flexibly. The popularity of MS Word over TeX is another example. So is the anemic uptake of XML, and more importantly the even slower adoption of standardized XML schemata.
Ultimately, semantic information about data has to come from somebody, so it's not free. You have to write extra tags, or put the data into a structured database, or some analogous process, and at this point people invariably do whatever is easiest and cheapest to accomplish the goal at hand.
And that's why we're stuck with (something like) Google - because it processes data structured and annotated only about as much as necessary for a human being to use it.
Re:The rest of us call this... by Anonymous Coward · 2004-09-27 07:20 · Score: 0

Ha! Google won't disappear quite that easily, they're not just a search engine anymore, they offer a wide array of services. Even if something better did come along, who would know, or care? Firefox is by far better than IE but how many people know about Firefox? IE still has the biggest marketshare by FAR, and it's always been one of the worst browsers, and it's always been the most widely used, ever since it appeared in Windows.
You can't just think straight forward, the average user is an ignorant dumbass technoweenie.
Re:The rest of us call this... by Anonymous Coward · 2004-09-27 07:33 · Score: 0

I think it will actually be N-tiered and implemented using an extreme, aspect-oriented methodology.
Re:The rest of us call this... by Anonymous Coward · 2004-09-27 07:40 · Score: 0

And what happens when people start misusing the metadata like the current meta tags?
Re:The rest of us call this... by mr_majestyk · 2004-09-27 07:53 · Score: 3, Informative

And what happens when people start misusing the metadata like the current meta tags?

The Semantic Web just provides a method for expressing metadata. Maintaining the integrity of those expressions involves a different set of problems. Some of the solutions include trust metrics like Slashdot's own distributed moderation (PDF) or Advogato.
Re:The rest of us call this... by ioslipstream · 2004-09-27 07:59 · Score: 1

Of course this will happen. It's a good idea, but that isn't why it will happen.

It will happen because there is a LOT of money to be made from this.

It probably won't happen on a wide scale until the applications we use generate and publish this data automatically, but rest assured, it will happen. It is in corporations' best interest to begin phasing this into their applications at some point in time.
Re:The rest of us call this... by maxpublic · 2004-09-27 08:04 · Score: 0, Flamebait

they seem to have hit a chord with gmail.

You mean a buggy, still-beta, feature-deficient webmail service that's distinguished only by the fact that it offers 1 GB of space? Space that 99% of its userbase will never come close to using more than a fraction of?

Seems to me that gmail primarily appeals to a) zealot geeks who enshrine Google as something holy, and b) geeks who define the size of their own manhood by the diskspace available in their gmail account.

So far I have *not* been impressed with gmail.

Max

--
My god carries a hammer. Your god died nailed to a tree. Any questions?
Re:The rest of us call this... by doom · 2004-09-27 08:07 · Score: 1

Unfortunately, each link only conveys a little bit of information. A link from page A to page B is assumed to be an endorsement of page B's relevance by page A. But what if you could add extra metadata to the links?
Yes indeed. It's particularly annoying that if you want to talk something down it's best not to use a live link to them, else you may inadvertently give them some Google juice because you think they suck.
Re:The rest of us call this... by 0racle · 2004-09-27 08:41 · Score: 1

Well I can't really say either way, I don't have a gmail account and I don't intend on ever getting one. All I can say is that you're about the only person I've seen say anything really bad about it, so they must be doing something right.

--
"I use a Mac because I'm just better than you are."
Re:The rest of us call this... by BigGerman · 2004-09-27 08:58 · Score: 1

kinda have to be multi-tiered and interactive in order to be scalable, right? ;-)
Re:The rest of us call this... by Marcus+Green · 2004-09-27 09:03 · Score: 1

You mean "the rest of us who don't understand the concept", like those who thought TV would be radio with pictures and that the web would be TV with a mouse.
Re:The rest of us call this... by maxwell+demon · 2004-09-28 01:25 · Score: 1

Indeed, I like this idea of "semantic links" better than the idea of a separated semantic web. Note that the annotation could be a hyperlink itself, to a document describing more about the relationship (using semantic links themselves). There would be a set of "base URIs" at W3C (in order to have a set of standardized, well-known meanings). Those base URIs would link only to other base URIs, and human-readable text about their meaning (the machines would get the meaning of those from the standardized location).

Say the W3C had the pages (besides others):

http://w3c.org/semantic_base/superclass.html

This link targets a superclass of the semantics described in the linking page.

http://w3c.org/semantic_base/opposite.html

The link target describes the opposite semantics of the semantics described by the linking page

http://w3c.org/semantic_base/same_document.html

The target of this link is considered to be part of the same document. It is therefore an <a semantics="superclass.html" href="internal.html">internal link</a>.

http://w3c.org/semantic_base/subpage.html

The target of this link is considered to be a subpage of the linking page. This implies it's <a semantics="superclass.html" href="same_document.html">part of the same document</a>

http://w3c.org/semantic_base/external.html

The target of this link <a semantics="opposite.html" href="internal.html">is not considered to be part of</a> the same web presence as the linking document.

http://w3c.org/semantic_base/positive.html

The target of this link gets a positive judgement from the linking page. Opposite of <a semantics="opposite.html" href="negative.html">negative</a>

http://w3c.org/semantic_base/negative.html

The target of this link gets a negative judgement from the linking page. Opposite of <a semantics="opposite.html" href="positive.html">positive</a>

Then slashdot could set up pages like

http://slashdot.org/semantics/funny.html

The content this link points to is considered funny. This is a <a semantics="http://w3c.org/superclass.html" href="http://w3c.org/positive.html">positive</a> judgement.

and link to comments like

<a semantics="http://w3c.org/subpage.html" semantics="/semantics/funny.html" href=(URL of the comment)>(number of comment)</a>

Of course all the pages would be better thought out. But I guess the idea is clear, and the bonus is that this would integrate neatly with current web pages, instead of adding a whole new language.

Maybe one would also allow the semantics attribute on other tags than links, like (assuming there's a site named proglang.org, which has info about programming languages on pages with URL "http://proglang.org/language/(name of language).html"):
<pre semantics="http://proglang.org/language/cplusplus.html">
#include <iostream>
int main() { std::cout << "Hello world\n"; }
</pre>
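As a rough illustration of how a crawler might read such annotations, here is a small Python sketch using the standard library's html.parser. The `semantics` attribute is the one proposed in this comment, not part of any HTML standard, and the class and function names are made up for the example.

```python
from html.parser import HTMLParser


class SemanticLinkParser(HTMLParser):
    """Collect (href, [semantics, ...]) pairs from <a> tags that carry the
    hypothetical `semantics` attribute proposed above."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = None
        semantics = []
        # attrs is a list of (name, value) pairs, so a repeated attribute
        # shows up once per occurrence.
        for name, value in attrs:
            if name == "href":
                href = value
            elif name == "semantics":
                semantics.append(value)
        if semantics:
            self.links.append((href, semantics))


def extract_semantic_links(html):
    """Return every annotated link found in the given HTML text."""
    parser = SemanticLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


sample = (
    '<p>The target of this link '
    '<a semantics="opposite.html" href="internal.html">is not part of</a> '
    'the same site; see also <a href="plain.html">this page</a>.</p>'
)
found = extract_semantic_links(sample)
```

Plain links without the attribute are simply skipped, which matches the claim that the scheme would sit alongside existing pages rather than replace them.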

--
The Tao of math: The numbers you can count are not the real numbers.

about everything and for everyone... by over_exposed · 2004-09-27 05:56 · Score: 4, Funny

Except for China, they get their own semantic web with special semantic filters in place that semantically keep their citizens under semantic control.

--
"The object of war is not to die for your country, but to make the other bastard die for his." - Patton

Re:about everything and for everyone... by Anonymous Coward · 2004-09-27 06:19 · Score: 0

And Slashdot will turn into a "pedantic web" to stay ahead of the trend curve ;-)
Re:about everything and for everyone... by Anonymous Coward · 2004-09-27 06:24 · Score: 5, Funny

I hope you're not anti-semantic?
Re:about everything and for everyone... by hondo77 · 2004-09-27 06:43 · Score: 0

In Soviet Russia, the Semantic Web categorizes you?

--
I live ze unknown. I love ze unknown. I am ze unknown.

Opposing view by Psychic+Burrito · 2004-09-27 05:56 · Score: 5, Informative

If you'd like an opposing view, make sure to read Clay Shirky's take on the semantic web.

Re:Opposing view by mr_majestyk · 2004-09-27 06:07 · Score: 1

Web pundits like Clay Shirky live in the present. Their entire relevance is based on the way the web looks today. They have no interest in anything being any different than exactly the way it is now.

For a more forward-looking view of this issue, see this essay on the real potential of the Semantic web.
Re:Opposing view by david.given · 2004-09-27 06:24 · Score: 2, Interesting
If you'd like an opposing view, make sure to read Clay Shirky's take on the semantic web.
Having just read quite a lot of his article before becoming far too annoyed to go any further, I really wouldn't take him very seriously. The bulk of his complaint is that although the Semantic Web is about drawing conclusions from widely disparate pieces of data, people don't think like that. I have no complaint with this.
However, he attempts to illustrate his point with lots of syllogisms. Unfortunately, he doesn't seem to understand them. For example, he uses this one:
1. Count Dracula is a Vampire
2. Count Dracula lives in Transylvania
3. Transylvania is a region of Romania
4. Vampires are not real
...to illustrate that despite the fact that all the above statements are correct, the only conclusion you can draw is that Romania is not real.
Huh?
The only way you can come to that conclusion is if you assume that statement 2 implies that, if X lives in Y and X is not real, then Y is not real. Which is an invalid assumption. Therefore his conclusion is not valid.
The entire essay is full of things like this. When he's talking in generalities, he makes a small amount of sense, but as soon as he starts using specifics, he stops making sense. There may be something to his basic point, but I'm not inclined to trust someone's opinions on a fundamentally logic-based concept who seems to be so inept at using logic. Treat with caution.
Re:Opposing view by Anonymous Coward · 2004-09-27 06:26 · Score: 0

People who oppose the idea of a "semantic web" are assholes. I can come up with a thousand reasons why the concept won't work perfectly, and why some people won't want to use it, and a billion things that work today without it.

But instead, I'm looking at the things that currently run like shit because people won't even use basic semantic tags because they only care about presentation, and the people with disabilities that can't even read a freaking webpage because of abominable use of and lack of ALT attributes.

This is not paper we are dealing with, it's computers. Computers need meaning to handle data, not globs of raw ascii text. Any other method of trying to sort, parse or classify data is stupid, when every single step between creation and receipt is handled by computers.
Re:Opposing view by nomadic · 2004-09-27 06:27 · Score: 1

I was going to criticize his lack of formal training, then found out he had taught at my alma mater. So if I criticize his lack of education I inadvertently denigrate the quality of my education.
Re:Opposing view by TuringTest · 2004-09-27 06:28 · Score: 1

That article has interesting points, but it fails in its criticism that the semantic web will never take off because it requires that everybody agree on the same ontology. The semantic web allows people to publish their own ontologies, and the best tools should be those that learn to extract interesting info from various sources. This is what S.W. is all about, not yet-another-universal-standard.

--
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
Re:Opposing view by mr_majestyk · 2004-09-27 06:32 · Score: 2, Insightful

semantic web allows people to publish their own ontologies, and the best tools should be those that learn to extract interesting info from various sources.

That's right. More to the point, the system supports many ontologies, and allows the best ontologies to rise to the top.
Re:Opposing view by Allen+Zadr · 2004-09-27 06:36 · Score: 2, Insightful

Having read both of your articles, I do not see them as opposites, but rather as complementary.
All information that is subjective is a poor candidate for the semantic web. All information that is quickly subject to change is a poor candidate for the semantic web. When mixing subjective (verb) pointers to a given truth on a large scale, modified by objective pointers, where even one of many thousands is false (or mis-keyed), the overall meaning can become quickly subverted.
In other words, if I get enough people to post somewhere that Allen Zadr lives in New Mexico, the multiple verbs that would otherwise point to the actual fact -- there is no Allen Zadr -- would be subverted. That is, unless you could syntactically link Allen Zadr to an actual human being.
Even more simply, the semantic web is only as good as the data. It's not very difficult to get a well trusted source to make an assertion of a truth while avoiding the linking details - thus presenting the users with a subverted view of reality. It has many flaws, and many promises. It won't fail, but it will never be better or worse than the existing systems, just different.

--
Kinetic stupidity has a new brand leader: Allen Zadr.
Re:Opposing view by Anonymous Coward · 2004-09-27 06:42 · Score: 0

The only way you can come to that conclusion is if you assume that statement 2 implies that, if X lives in Y and X is not real, then Y is not real. Which is an invalid assumption. Therefore his conclusion is not valid.

The conclusion is invalid because YOU happen to know that it's invalid. It certainly could be valid given only the rules presented. As an example, if you used Superman and Metropolis in the above example, it would work fine. Now given rule 3A that states "Transylvania is real", then it would be interesting to see what types of conclusions the software would derive (if Transylvania is real, Romania must be real, and the whole Dracula thing is a red herring).

The whole point of course being that it's most important to have enough "correct" data points to allow more correct meaning to be derived. Which theoretically would happen as the nodes became more prevalent.
Re:Opposing view by Sique · 2004-09-27 06:48 · Score: 3, Insightful

No, computers don't need meaning to handle data. Computers need syntax and rules for how to act on syntactic structures. The semantic web is founded on the hope that enough syntax thrown at huge amounts of data turns magically into semantics.

It's based on the assumption that all semantics can be explained by syntax. So far this has not been proven, and all attempts to get there got stuck somewhere and turned into something different, sometimes useful (Chomsky's grammars), sometimes not so useful.

The semantic web would have to deal with the laziness of people who can't be bothered to write meaningful ALT attributes to tags. It can try to guess at some of the semantics, but it can also easily be fooled. Everyone who ever tried to use content filters for an internet connection knows what I am talking about. Many legitimate sites are rejected as false positives and hundreds of questionable sites get through, because the syntax of a site alone doesn't help with evaluating the semantics (the meaning) of the site.

--
.sig: Sique *sigh*
Re:Opposing view by inKubus · 2004-09-27 06:48 · Score: 1

Yeah, I can understand these arguments. But what if you applied "fuzzy" techniques?

"ALL X are Y" will only get you so far. Then you could add additional (numeric) fuzzy logic based on samples of other data. For instance, in the "People who live in France speak French" syllogism, the computer could attempt to validate it by pulling a language census of France. After pulling this data, it would know that approximately 95% of people living there speak French. Thus a "fuzzy" "all" could be made. Like "MOST people in France speak French" and even give it a decent probability.

What this does is make relevance easier to find and actually creates new knowledge and facts out of thin air. Yes, some of them will not be true, but it's entirely possible that the computer could attempt to establish a probability that it is true.
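A toy Python sketch of the fuzzy "all" described above. The 95% figure is the one quoted in this comment; the cut-off points between ALL, MOST, SOME and NO are arbitrary assumptions, and the function name is invented for the example.

```python
def fuzzy_quantifier(matching, total):
    """Turn an observed proportion into a rough quantifier word plus the
    proportion itself. The thresholds are assumptions, not a standard."""
    if total <= 0:
        raise ValueError("total must be positive")
    share = matching / total
    if share == 1.0:
        word = "ALL"
    elif share > 0.5:
        word = "MOST"
    elif share > 0.0:
        word = "SOME"
    else:
        word = "NO"
    return word, share


# Using the census figure quoted above: roughly 95 out of every 100 people.
word, share = fuzzy_quantifier(95, 100)
statement = f"{word} people in France speak French (p = {share:.2f})"
```

The strict "ALL X are Y" rule fails on a single counterexample, while the fuzzy version degrades to "MOST" and keeps the proportion around for later reasoning.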

Then you could have a "root" server of "truths" that everyone knows and then based on the data in the system more truths are formed as well as close matches.

I think this is a fascinating idea. Although the current work is merely creating the structure and standards, things like what I've just described are possible. Imagine going to your Google toolbar and asking a question and having it be answered.

Now, yeah yeah, ASK.com etc. had something going for a while. The problem is, as TBL mentioned, there's no descriptive information about the individual ELEMENTS in a regular HTML page. What if your search pulls up a 120MB html page with 2000 pictures and only one particular photo is of interest or whatever. And the page is dynamically generated so it's difficult to create a good list of keywords and stuff to search it with.

I mean, can you imagine, you're a graphic designer. You want a great picture of a sailboat on the front of a brochure. You could just insert a picture box, type a short description of the image you're thinking of, and then boom, one fills the box. Then you can just use your arrow key or something to move thru the pictures until you find the one you want, right within your page setup app, not in web browser, not in a book, etc.

Just one possible example. I have to end this post otherwise I might think of something else.

--
Cool! Amazing Toys.
Re:Opposing view by null+etc. · 2004-09-27 06:54 · Score: 2, Interesting
I don't really find value in Clay Shirky's arguments against syllogisms, which serve as the basis of value within a semantic web.
In order to prove that syllogisms are flawed, Clay presents examples of common English statements, and attempts to arrive at flawed deductions. Such flaws only work for Shirky due to the ambiguity of the English language.
In reality, a semantic web would neither store nor organize data according to the loose ambiguities of English. Rather, such information would need to be highly structured, using a formal system, in order for the accuracy of syllogisms to work.
As an example, let me examine a sentence that appears within a technical specification of a project I'm working on:

A financial institution may offer customers the ability to download account statements from its web site.

If this sentence were to be placed on the semantic web, it would be useless, given the ambiguity of several words and contexts. Instead, the meaning of each phrase, clause, and word would need to be made fully explicit using a formal semantic representation. Such a representation might be based on a hierarchical data structure such as XML.
If the above sentence were to be fully clarified, it would appear as:

From amongst the entire set of financial institutions actual or theoretical, a set of one or more such financial institutions may exist that offers each customer, from a set of one or more of the financial institution's actual customers if the financial institution is actual, or theoretical customers if the financial institution is theoretical or the financial institution is actual and may theoretically have customers, the ability for the customer to download each account statement from a set of one or more of the customer's actual account statements if the customer is actual, or theoretical account statements if the customer is theoretical or the customer is actual and may theoretically have account statements, from the web site owned and administered by the financial institution.

Obviously this structure is much larger, but it contains all of the information necessary to resolve the sentence's ambiguities.
The above structure could also be expressed simply in XML. To examine a fragment of the above structure:

From amongst the entire set of financial institutions actual or theoretical

This would most likely appear using a structured representation such as:
(target)
(set)
(scope)entire(/scope)
(members)
(membertype)financial institution(/membertype)
(instancetypes)
(type)actual(/type)
(type)theoretical(/type)
(/instancetypes)
(/members)
(/set)
(/target)
The Slashdot "comments" field is extremely broken, so I've been forced to use parentheses and omit indentation.
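For readers who want to see the fragment as real XML, here is a Python sketch that builds the same structure with the standard library's xml.etree.ElementTree. The element names are copied from the parenthesized listing above; the nesting is inferred from the order of the closing tags, and the helper name is made up.

```python
import xml.etree.ElementTree as ET


def build_target():
    """Build the <target> fragment from the listing above, with the nesting
    inferred from where each closing tag appears."""
    target = ET.Element("target")
    set_el = ET.SubElement(target, "set")
    ET.SubElement(set_el, "scope").text = "entire"
    members = ET.SubElement(set_el, "members")
    ET.SubElement(members, "membertype").text = "financial institution"
    instancetypes = ET.SubElement(members, "instancetypes")
    for kind in ("actual", "theoretical"):
        ET.SubElement(instancetypes, "type").text = kind
    return target


target = build_target()
xml_text = ET.tostring(target, encoding="unicode")

# A machine can now query the structure directly instead of parsing English.
instance_types = [element.text for element in target.iter("type")]
```

Once the fragment is a tree, a question like "which kinds of institution does this statement cover?" becomes a one-line lookup.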
Isn't it funny how the English sentence fragment is so much easier for humans to understand, even though both representations contain the same information? It's amazing what our brains do "automatically" by operating under certain contexts. Similarly, a machine will have much greater ease in understanding and processing the formalized structure, in cases where it wouldn't even be able to guess at the corresponding English fragment (Well, it would be able to guess, but with hilarious results. What's that, a piece of toast rules over Utah?)
No doubt, translating normal human English sentences into a semantic web will be a lengthy and complicated process. But some mitigating factors:
- As "prefab" semantic units are constructed, such units could be reused without reconstructing them.
- The reuse of units will allow the full value of the unit to be achieved, without introducing unnecessary and confusing variance between instances of identical units.
- Units may be constructed in such a way to semantically avoid the ambiguities of every language, not just english. Such a conversion p
Re:Opposing view by cynic10508 · 2004-09-27 06:56 · Score: 1

That's right. More to the point, the system supports many ontologies, and allows the best ontologies to rise to the top.

The problem I see with this is it can still allow improper associations within the ontology to exist. Ideally, one should have an expert or team of experts describe concepts within their own area of specialization. Herpetologists describe snakes, astrophysicists describe quasars, etc.
Re:Opposing view by TuringTest · 2004-09-27 07:05 · Score: 1

That's how it's done, and that's why it's called "semantic web" instead of "semantic centralized project". You can use an ontology defined by others, the same way that you hyperlink to web pages published by others.

--
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
Re:Opposing view by cynic10508 · 2004-09-27 07:14 · Score: 1

That's how it's done, and that's why it's called "semantic web" instead of "semantic centralized project". You can use an ontology defined by others, the same way that you hyperlink to web pages published by others.

I guess what I'm having trouble with is the implications of trust. In distributed public key cryptography systems the idea of trust is very binary because you can directly trust or not trust someone. You can extend that to say, "I trust people that people I trust trust." However, with ontologies the implication is that I trust every bit of information in that ontology. But I doubt that very many people would be willing to check each ontology they use. So I'm concerned with assumptions of trust where they may not be earned.
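The "I trust people that people I trust trust" rule can be sketched as a bounded walk over a trust graph. This is only an illustration in Python: the names and the graph are invented, and real web-of-trust systems add signatures and per-link trust levels that are left out here.

```python
from collections import deque


def trusted_within(trust, me, max_hops):
    """Return everyone reachable from `me` by following at most `max_hops`
    trust links (breadth-first search over a hypothetical trust graph)."""
    seen = {me}
    queue = deque([(me, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not extend trust any further from this person
        for friend in trust.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    seen.discard(me)
    return seen


# Invented example: each key directly trusts the people in its list.
trust = {"me": ["alice"], "alice": ["bob"], "bob": ["carol"]}

direct = trusted_within(trust, "me", 1)
friends_of_friends = trusted_within(trust, "me", 2)
```

With one hop only direct trust counts; with two hops trust extends to people my trusted contacts trust. The concern raised above is that adopting an ontology is closer to an unbounded version of this, where every statement inside it is accepted without being checked.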
Re:Opposing view by TuringTest · 2004-09-27 07:19 · Score: 1

Do you trust the information results provided by Google?

Agreed, the problem of trust is not directly solved by the semantic web, the same way that TCP/IP doesn't solve trust. The semantic web is a communication tool. Trust management should be programmed on top of this tool, as an application.

--
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
Re:Opposing view by Fnkmaster · 2004-09-27 07:22 · Score: 2, Insightful

While I understand where you are coming from, let me present the parts of his arguments that do seem to hold water to me.

1. The Semantic Web (or rather, ontology construction and construction of relationships between your local ontology and other ontologies) is complicated and time consuming, and requires you to decipher lots of other people's stuff to connect your stuff to it. Ultimately any new technology, especially one that requires widespread adoption to be useful, must be easy enough to adopt that people actually adopt it. RSS, HTML and other successful technologies allow you to focus your effort on the local endeavour and don't require tons of formalized, structured organization of data, which runs somewhat counter to human nature. They are thus substantially less labor intensive to implement, and have therefore been taken up quite rapidly. This argument I consider to be perfectly valid and fairly strong.

2. Trust of ontological data is a critical issue because lots of false assertions and mediocre data will inevitably creep into a large, distributed "semantic web". This is a problem with the web currently, and you definitely have to take everything you read with a grain of salt, trust certain sources more than others, and so on. I think this argument holds some water, but I think this problem is addressable.

Personally, I think it will ultimately be easier to implement something like Cyc to build structured knowledge networks from information in human grokkable form. The internal representation of a Cyc-like machine will probably look quite similar to the semantic web, including the ability to adjust world view, evaluate source material reliability, etc. Getting a machine to build this knowledge representation, despite all the ambiguities of human expression, is more likely to succeed and be useful to humanity (IMHO) than getting lots of humans to interact with computers and technology in a structured, logical fashion. This is not to say that there aren't applications where structured ontological data would work well.

I particularly like the idea of auto-translation between different structured data formats, but I do agree with Clay that it's more likely that businesses will construct isolated "island" ontologies (such as a specific XML schema for describing formatted data) and deal with translation to other formats on an ad-hoc basis, for simple resource allocation and cost reasons.

Your argument (pro) seems to rely on the idea that tools will make things easier. I can't help but think of 4GL programming, SQL and attempts to make programming accessible to "average" people. The fact is good tools make things easier, but only certain people or people trained to do so can really think in a structured, logical fashion and express that in a way that a computer understands. No effort to handwave that issue away to "tools" has ever succeeded. Tools can help, but they are not a panacea. HTML is so successful and widespread because it's simple to edit, as it only requires basic visual thinking to understand - and tools let you skip the intermediate step and edit the visual representation directly.

The concept of editing semantic information is fundamentally not so simple, because humans don't formalize their thinking about relationships on a day-to-day basis. Like visual mapping tools for XML, they may make things slightly easier, but I wouldn't expect any magic. Like I said, I think that we will ultimately end up there, but I believe it will be approached from the other direction.
Re:Opposing view by OreoCookie · 2004-09-27 07:34 · Score: 1

Thanks for the link. That is one of the most interesting and well thought out articles I have read in a long time.
Re:Opposing view by Anonymous Coward · 2004-09-27 07:59 · Score: 0

Unfortunately, applied to the entire base of knowledge, in the end all you will end up with are statements where everything is a little bit of something.
Re:Opposing view by Thuktun · 2004-09-27 10:12 · Score: 4, Insightful
If you'd like an opposing view, make sure to read Clay Shirky's take on the semantic web.

His writings appear to have some uncorrected logical fallacies.
Consider the following assertions:
- Count Dracula is a Vampire
- Count Dracula lives in Transylvania
- Transylvania is a region of Romania
- Vampires are not real
You can draw only one non-clashing conclusion from such a set of assertions -- Romania isn't real.
You can conclude the following from those statements:
- Count Dracula is not real
- Count Dracula lives in a region of Romania
I'd like to see the mystery step that combines these to conclude that Romania isn't real; at most, you could say that Romania houses something that isn't real. The conclusion he makes isn't supported by any logic.

More importantly, these are dumbed-down semantics. The assertion that a fictional character lives somewhere real needs to be qualified that this occurs in a certain set of fictional stories, not real life. The fact that these unqualified statements are represented in this example ontology means that the ontology is insufficient, not that this method isn't useful.

Another example in that article:
- US citizens are people
- The First Amendment covers the rights of US citizens
- Nike is protected by the First Amendment
You could conclude from this that Nike is a person, and of course you would be right.
This is even factually incorrect. The First Amendment doesn't actually say anything about US citizens; it restricts the US Congress from certain actions, period, not for certain people.

Ignoring this, you can make one conclusion and reduce this to the following:
- the First Amendment covers the rights of people
- Nike is protected by the First Amendment
Concluding that Nike is a person from this is a logical fallacy. (Nothing in these logical statements says the First Amendment might not also cover the disposition of small peanut butter sandwiches with blueberry jam, which set Nike might then be an element of.)
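One way to show that the argument is invalid is to build a small counter-model where both premises hold and the conclusion fails. A Python sketch with sets follows; the individual names and the function name are made up, and "covered" is just a set standing in for whatever the First Amendment protects.

```python
def is_counterexample(people, covered, subject):
    """True when both premises hold but the conclusion does not:
    premise 1: every person is covered,
    premise 2: the subject is covered,
    conclusion: the subject is a person."""
    premise_1 = people <= covered        # subset test
    premise_2 = subject in covered
    conclusion = subject in people
    return premise_1 and premise_2 and not conclusion


# An invented world: two people, plus one non-person that is also covered.
people = {"alice", "bob"}
covered = people | {"nike"}

invalid_argument = is_counterexample(people, covered, "nike")
```

Because such a world can exist, the premises alone cannot establish that Nike is a person. The inference only goes through if the first premise is strengthened to "the First Amendment covers ONLY people".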

I find it hard to treat this article with much weight, given its fast-and-loose treatment of logic and ontological assertions.
Re:Opposing view by drew · 2004-09-27 10:27 · Score: 1

The conclusion is invalid because YOU happen to know that it's invalid. It certainly could be valid given only the rules presented

Not so. Look at the following statements:

a implies b
a implies c
c implies d
b is false

what can we conclude about the relationships between b and d? NOTHING!!! There is no correlation between the two, but that is what the author of the article tried to argue.
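The claim is easy to check by brute force over all sixteen truth assignments. A short Python sketch, with function names chosen for the example:

```python
from itertools import product


def implies(p, q):
    """Material implication: p implies q."""
    return (not p) or q


def possible_values(variable):
    """Collect every value `variable` takes in an assignment that satisfies
    the four premises: a->b, a->c, c->d, and b is false."""
    names = ("a", "b", "c", "d")
    values = set()
    for assignment in product([False, True], repeat=4):
        env = dict(zip(names, assignment))
        premises_hold = (
            implies(env["a"], env["b"])
            and implies(env["a"], env["c"])
            and implies(env["c"], env["d"])
            and not env["b"]
        )
        if premises_hold:
            values.add(env[variable])
    return values


d_values = possible_values("d")  # d is left undetermined by the premises
a_values = possible_values("a")  # a is forced to be false (modus tollens)
```

The premises do pin down a (it must be false, since a implies b and b is false), but d can be either true or false, so nothing about d follows from b being false.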

Personally, I think this was a better example of the author's logical shortfalls:
- US citizens are people
- The First Amendment covers the rights of US citizens
- Nike is protected by the First Amendment
You could conclude from this that Nike is a person, and of course you would be right.

Bzzzt. Wrong. Game over man. The only way you could conclude that Nike is a person is if the second statement said "The First Amendment ONLY covers the rights of US citizens". As they are written, the second and third statements taken together do nothing to imply that Nike is a U.S. citizen, which is what would be required to "prove" that Nike is a person.

It's actually a shame, because the author made a lot of good points. For example, if he hadn't ruined it with his poor reasoning skills, the point of Nike being a person in a "First Amendment Law" context but not a medical context is a good one. He should have left all of the nonsense about syllogisms out entirely, and focused on his two good points: the problems surrounding context and generalizations. The only incorrect deduction in the whole article that could still be made by following sound logical reasoning was the one about speaking in a Brooklyn accent, which he promptly pointed out was due to "People who live in Brooklyn speak with a Brooklyn accent" being a generalization that is not completely true.

--
If I don't put anything here, will anyone recognize me anymore?
Re:Opposing view by Artifakt · 2004-09-27 10:35 · Score: 1

And when someone starts assigning ontologies, will it be "Constitutional scholars describe the constitution.", "Law professors describe the constitution.", or "the U.S. Supreme Court describes the constitution." (or maybe even "Congress describes the constitution.")?
Which is the winning ontology re. natural selection, the cladistic taxonomist's, or the molecular micro-biologist's?
It will be interesting to see if the semantic web can allow the best ones of such clashing ontologies to rise to the top. Hell, it will be interesting to see if any one viewpoint ever rises to the top, or if they just continue to bump against each other perpetually.

--
Who is John Cabal?
Re:Opposing view by drew · 2004-09-27 10:44 · Score: 1

This guy must have failed his high school geometry course (possibly algebra and calculus as well). Basically every syllogism longer than two lines in the entire article contains some variation of one of the two following very basic logic flaws:

a implies b
a implies c
therefore b implies c

or

a implies c
b implies c
therefore a implies b
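A minimal sketch, in Python, of how to confirm that both forms are invalid: search all truth assignments for one where every premise holds and the conclusion fails. The function name counterexamples is only an illustrative choice.

```python
from itertools import product

def implies(p, q):
    """Material implication: p implies q."""
    return (not p) or q

def counterexamples(premises, conclusion):
    """Return every (a, b, c) where all premises hold but the conclusion fails."""
    return [
        (a, b, c)
        for a, b, c in product([False, True], repeat=3)
        if all(p(a, b, c) for p in premises) and not conclusion(a, b, c)
    ]

# Form 1: a implies b, a implies c, therefore b implies c
form1 = counterexamples(
    [lambda a, b, c: implies(a, b), lambda a, b, c: implies(a, c)],
    lambda a, b, c: implies(b, c),
)

# Form 2: a implies c, b implies c, therefore a implies b
form2 = counterexamples(
    [lambda a, b, c: implies(a, c), lambda a, b, c: implies(b, c)],
    lambda a, b, c: implies(a, b),
)

print("form 1 fails at:", form1)
print("form 2 fails at:", form2)
```

A single counterexample is enough to show an argument form is invalid, and each of these has one.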

--
If I don't put anything here, will anyone recognize me anymore?
Re:Opposing view by Anonymous Coward · 2004-09-27 11:16 · Score: 0

I'm afraid you're misunderstanding.

the reason the examples you give lead to absurdities is that even though they are expressed in formal language, they vary in the degree of formality that's applicable.

his whole point is that without endless qualifications by the use of 'only', 'only if', 'if and only if' etc etc, the whole exercise is pointless.

what you Bzz as wrong, he knows is wrong, which is why a) he used it as an example, and b) he said an entire sentence later (so you can be excused for not noticing) that it was wrong and why.

his reasoning skills are totally up to scratch, he just expected a little more from his readership than you were prepared to give.
Re:Opposing view by Anonymous Coward · 2004-09-27 11:19 · Score: 0

I liked this so much, I'm going to use it too.

whoosh!

. ----- his point

o ----- your head
Re:Opposing view by david.given · 2004-09-27 11:32 · Score: 2, Insightful
The conclusion is invalid because YOU happen to know that it's invalid. It certainly could be valid given only the rules presented. As an example, if you used Superman and Metropolis in the above example, it would work fine.
Rule 2 does not provide any information about the reality of its parameters. Stating things a bit more formally:
1. isA(dracula, vampire)
2. locatedIn(dracula, transylvania)
3. locatedIn(transylvania, romania)
4. ~isReal(vampire)
These aren't rules, they're statements providing one-way inferences. You may only create forward logic chains. There aren't really any interesting conclusions you can come up with from this, apart from being able to state that some unreal things live in Romania.
Shirky gives examples of some of Dodgson's syllogisms (and Dodgson is a master among logicians). Dodgson's syllogisms are interesting because they're based around rules. Take the one about poems:
1. No interesting poems are unpopular among people of real taste.
2. No modern poetry is free from affectation.
3. All your poems are on the subject of soap-bubbles.
4. No affected poetry is popular among people of real taste.
5. No ancient poetry is on the subject of soap-bubbles.
He uses generic statements, rather than absolute statements. You can see this if I restate it:
1. isInteresting(X) IMPLIES isPopular(X)
2. isModern(X) IMPLIES isAffected(X)
3. isYours(X) IMPLIES isAboutBubbles(X)
4. isAffected(X) IMPLIES ~isPopular(X)
5. ~isModern(X) IMPLIES ~isAboutBubbles(X)
Notice that all these rules have to be specified in generic terms. We have equations we can manipulate. This means we can use them. There's a rule (contraposition) that A IMPLIES B == ~B IMPLIES ~A, which lets us restate rules 1 and 5 as follows:
1. ~isPopular(X) IMPLIES ~isInteresting(X)
2. isModern(X) IMPLIES isAffected(X)
3. isYours(X) IMPLIES isAboutBubbles(X)
4. isAffected(X) IMPLIES ~isPopular(X)
5. isAboutBubbles(X) IMPLIES isModern(X)
And from here it's just a matter of substituting in, since (A IMPLIES B) AND (B IMPLIES C) gives (A IMPLIES C). This means that we can prove that your poems are modern, affected, unpopular and uninteresting.
You need the statements to provide the fundamental information, and the rules to let you manipulate that information. (Dodgson avoids needing a statement by using rule 2 instead; it would work just as well had rule 2 been ~isInteresting(yourPoem), but that would only let you prove that yourPoem was uninteresting, not that all your poems are uninteresting.)
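To make the "statements plus rules" idea concrete, here is a rough forward-chaining sketch in Python. It is only an illustration: the rule encoding is my own reading of Dodgson's five English sentences (with rules 1 and 5 taken in contrapositive form), the predicate names are made up, and a leading "~" simply marks a negated predicate.

```python
# Each rule is (premise, conclusion); a leading "~" marks negation.
RULES = [
    ("~isPopular", "~isInteresting"),  # 1: contrapositive of interesting -> popular
    ("isModern", "isAffected"),        # 2: no modern poetry is free from affectation
    ("isYours", "isAboutBubbles"),     # 3: all your poems are about soap-bubbles
    ("isAffected", "~isPopular"),      # 4: no affected poetry is popular
    ("isAboutBubbles", "isModern"),    # 5: contrapositive of ancient -> not about bubbles
]

def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

# The single starting statement: "X is one of your poems".
result = forward_chain({"isYours"}, RULES)
print(sorted(result))
```

Starting from nothing but "isYours", the chain runs bubbles, modern, affected, not popular, not interesting. Take the starting fact away and the rules derive nothing at all, which is the point about needing both halves.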
Shirky's trying to discredit the Semantic Web by using a syllogism of his own, that goes like this:
1. Syllogisms that don't contain rules are useless.
2. The Semantic Web is constructed out of syllogisms.
From this he's trying to draw the erroneous conclusion that the Semantic Web is useless. I leave the problem with this as an exercise to the reader.
Seeing as he is apparently trained in this stuff, which I am not, this makes me think that he is either (a) incompetent or (b) deliberately trying to mislead people. Either way, I don't trust his logic.
Re:Opposing view by Anonymous Coward · 2004-09-27 11:40 · Score: 0

er, which is exactly his point. unless the entire web is rewritten in formal language, it can't work.

what the Shirkster argues is exactly that. *obviously* a new, formally defined web can have a semblance of semantic searching, but it will be weak close to the point of uselessness, impossible to create, and impossible to maintain even if it were, in some minute way, created.

what the semantic web people argue is for a bit of computational linguistics plus a few fancy meta tags. it just won't work.
Re:Opposing view by Nurgled · 2004-09-27 12:15 · Score: 1

More importantly, these are dumbed-down semantics. The assertion that a fictional character lives somewhere real needs to be qualified that this occurs in a certain set of fictional stories, not real life. The fact that these unqualified statements are represented in this example ontology means that the ontology is insufficient, not that this method isn't useful.

I've only really skimmed the article, but I think what the author is trying to say is precisely that the kinds of relationships that will be expressed in the Semantic web will be too furry and hazy to draw any useful conclusions with. I'm not sure why he wrote an illogical conclusion afterwards, but his main point is that we (as humans) don't tend to think about relationships in enough detail to express them in sufficient depth to draw useful conclusions from in practice. In reality, many of the relationships we deal with are so complicated that we simply cannot express every detail of them. The semantics will always be dumbed down because there's always one more layer of relationships that have to be expressed for a complete "graph" of the situation.
Re:Opposing view by drew · 2004-09-27 12:51 · Score: 1

He knows the statement is wrong, and he explains why the statement is wrong. However, he doesn't explain or even hint at the fact that the incorrect inference is a result of a simple logical mistake. Rather, he attempts to attribute the mistake to conflicting meta-data.

While I agree with (some of) his points, he could have chosen a much better way of arguing them. The syllogism was a good way of illustrating his point about how people make generalities ('People who live in Brooklyn speak with a Brooklyn accent'). However, when it comes to Meta-data, the section the two examples I cited came from, it is ineffective. I would presume that if someone were ever able to write a computer program complex enough to derive meaning from a series of statements like this, making it follow the rules of logic and deduction would be a trivial task in comparison. However, regardless of the author's poorly chosen example, you are still left with the much more difficult task of inferring context from the statements: in some senses Nike is a person (regardless of whether it is a US citizen), and in others it is not. How is a computer program to know the difference, unless it is told on a case by case basis in which circumstances a applies and in which b applies....

At any rate my point was not that I disagree with what he was saying. I just think that he could have illustrated some of his points much better either without using syllogisms at all, or at least using ones that don't violate some of the more simple rules of logic. The Nike example could have been written using any number of syllogisms that were actually valid and still illustrated the same point.

--
If I don't put anything here, will anyone recognize me anymore?
Re:Opposing view by null+etc. · 2004-09-27 14:13 · Score: 1

er, which is exactly his point. unless the entire web is rewritten in formal language, it can't work.
I doubt that anyone would be able to make a semantic web out of the current WWW, unless the translation were limited to pages that change fairly infrequently.
but it will be weak close to the point of uselessness
Tell that to the scientists who could benefit from it. My friend does DNA genome analysis using Perl and HUGE samples of data. You wouldn't believe some of the tools that he and other scientists use to collaborate.
impossible to create, and impossible to maintain even if it were, in some minute way, created.
You give our technology leaders too little credit. Who thought that a geek could write an operating system to compete with Microsoft?
Re:Opposing view by blue+trane · 2004-09-27 14:38 · Score: 1

And fuzzy logic is a tool to specify just how much anything is of anything...
Re:Opposing view by jsebrech · 2004-09-27 23:10 · Score: 1

"ALL X are Y" will only get you so far. Then you could add additional (numeric) fuzzy logic based on samples of other data. For instance, in the "People who live in France speak French" syllogism, the computer could attempt to validate it by pulling a language census of France. After pulling this data, it would know that approximately 95% of people living there speak French. Thus a "fuzzy" "all" could be made. Like "MOST people in France speak French" and even give it a decent probability.

But the semantic web, even employing fuzzy logic, would have no intelligence about why some parts of a group have a property and the others don't, unless it was specifically encoded. So, to be able to say if an individual in France is likely to speak French it is not enough to say 95 percent of people in France speak French, since you could easily find an individual who has properties that would tell a human they are not likely to speak French, but would still be deemed to be 95 percent likely to speak French by the computer.

So, the semantic web would have to have perfect knowledge of why things are the way they are, and that would require people not only knowing what implicit knowledge they have about the real world, but also why they have that knowledge, and how to encode it into a semantic system. I don't have to tell you how unlikely this would be. And since people are likely to be wrong or leave omissions even when they do have perfect self-awareness, this would poison the data well sufficiently that you wouldn't be able to make guarantees about the accuracy of computer-derived conclusions.

Now, it's good people are trying to increase the usefulness of computers through semantics, and I'm curious to see what the semantic web will result in, but I just have low expectations about the actual abilities of the semantic web.
Re:Opposing view by maxwell+demon · 2004-09-28 01:40 · Score: 1

Besides the failed logic in this example, Count Dracula is an interesting case to look at, because AFAIK there existed a real Count Dracula (who certainly wasn't a Vampire).

So, suppose you have one web site speaking about the fictional Count Dracula (not knowing that there existed a real one), stating

Count Dracula is a Vampire.

And suppose there's another web site speaking about the real Count Dracula and stating

Count Dracula existed.

Then a program which finds both statements would conclude

A Vampire existed

which is clearly wrong, but follows from both statements which are individually right in their context.
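A toy sketch in Python of the problem, and of what tagging each statement with its context would change. Everything here is hypothetical: the tuple layout, the site labels and the function name are invented for the illustration and are not how any real Semantic Web tool represents data.

```python
# Hypothetical statements: (subject, predicate, object, context)
STATEMENTS = [
    ("Count Dracula", "isA", "Vampire", "fiction-site"),
    ("Count Dracula", "existed", True, "history-site"),
]

def existing_kinds(statements, respect_context):
    """Return the kinds of thing the program would claim to have existed."""
    def key(subject, context):
        # Either merge on the bare name, or keep each context separate.
        return (subject, context) if respect_context else subject

    kinds = {}
    existed = set()
    for subject, predicate, obj, context in statements:
        k = key(subject, context)
        if predicate == "isA":
            kinds.setdefault(k, set()).add(obj)
        elif predicate == "existed":
            existed.add(k)
    return {kind for k in existed for kind in kinds.get(k, set())}

naive = existing_kinds(STATEMENTS, respect_context=False)
scoped = existing_kinds(STATEMENTS, respect_context=True)
print("merging on name:", naive)
print("keeping contexts apart:", scoped)
```

Merging on the bare name yields "a Vampire existed"; keeping the contexts apart yields nothing. Of course that only moves the problem to deciding when two contexts really are talking about the same thing.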

--
The Tao of math: The numbers you can count are not the real numbers.

Semantic Web by null+etc. · 2004-09-27 05:58 · Score: 3, Informative

A topic I posted a few years ago is perfectly relevant to this submission: http://slashdot.org/comments.pl?sid=92504&cid=7953441

Re:Semantic Web by Anonymous Coward · 2004-09-27 06:06 · Score: 0

You know you're a hard core /. 'er when you say "We talked about this a few years ago, here's the exact /. link"
Re:Semantic Web by Anonymous Coward · 2004-09-27 06:26 · Score: 0

Except that the link was to 01:03 PM January 12th, 2004, not a few years ago.
Re:Semantic Web by null+etc. · 2004-09-27 07:11 · Score: 1

You know, I suspected it might be this year, but for some reason I can't find the year anywhere within the post information.
All I see is this:
One Net to Rule Them All (Score:5, Insightful)
by null etc. (524767) on Monday January 12, @01:03PM (#7953441)

Tim who? by Anonymous Coward · 2004-09-27 05:58 · Score: 0

Bah! What does this bozo know about the web and where it should go??? It's not like he's devoted any of his time or effort into creating something as important as the WWW...

/sarcasm

Re:In Soviet Russia... by Anonymous Coward · 2004-09-27 06:00 · Score: 0

In Soviet Russia you were never funny!

interesting technology... by LiquidMind · 2004-09-27 06:00 · Score: 2, Interesting

"...enabling computers to extract meaning from far-flung information as easily as today's Internet simply links individual documents."

I wonder if this could be used for a computer's local file system as well. I know Microsoft is working on this (WinFS or OFS or whatever it's supposed to be called), but it would be damn awesome to apply this not just to the internet.

--
This sig contains repetition and redundancy.

Re:interesting technology... by mr_majestyk · 2004-09-27 06:42 · Score: 1

It is already being talked about.
Re:interesting technology... by Anonymous Coward · 2004-09-27 06:59 · Score: 0

I think it should be. I fail to see why the distinction of the data being local or remote even matters. Integrate it, from the start.
Re:interesting technology... by KjetilK · 2004-09-27 08:20 · Score: 2, Interesting

The RDF geeks are already discussing a marriage of Reiser4 and RDF.

--
Employee of Inrupt, Project Release Manager and Community Manager for Solid

Two major problems to a semantic web by levram2 · 2004-09-27 06:00 · Score: 5, Insightful

The extra work required to put data into a standard data format won't be done. People can't be bothered to make their pages W3C compliant (even slashdot). The second problem is that data formats can rarely be agreed upon by a large community. Look at how many calendar event and news feed formats there are.

Re:Two major problems to a semantic web by Anonymous Coward · 2004-09-27 06:10 · Score: 0

Look at how many calendar event and news feed formats there are.
No problem. I'll just invent a new one, to rule them all!
Re:Two major problems to a semantic web by Anonymous Coward · 2004-09-27 06:15 · Score: 0

I disagree. If there's a payoff, the effort is made. Case in point: RSS feeds. They need some programming on your server, but people love them, so site owners bow to the requests.

If some really cool app comes along that uses the semantic web, people will update their pages.
Re:Two major problems to a semantic web by dubious9 · 2004-09-27 07:17 · Score: 1

ARP/ETHERNET
TCP/IP
DNS
HTTP
HTML
JavaScript
CSS
XML/XSL
even, ugh, RTF

Just examples of standards based technology that people said the same thing about. If the semantic-web stuff becomes sufficiently powerful and popular it will become widespread. It certainly has the potential to become a killer app.

There have been many instances of formats becoming ubiquitous, it's just that there have been many more instances of them not.

Baby steps, one page at a time and semantic web will grow. Even better, it doesn't need to be ubiquitous to become useful.

--
Why, o why must the sky fall when I've learned to fly?
Re:Two major problems to a semantic web by jilles · 2004-09-27 07:29 · Score: 2, Insightful

The reason people don't bother with w3c compliant webpages is that there is no obvious advantage. Slashdot works fine in all modern browsers and aside from some bandwidth that could be saved by going fully XHTML/CSS there is little to be gained (well there are a number of advantages but they're obviously lost on the editors).
With data it is different; just look at how quickly RSS & ATOM are being adopted. There's an obvious advantage because having a feed on your site makes it easier for readers to learn about new content on your site. It doesn't matter that there are multiple competing standards because the tools that matter are standards neutral (most feed readers can handle most RSS and ATOM variants). If there is a sufficiently large group of people using a particular (open) format, it is worthwhile to program functionality to do stuff with this data.

The RSS world is also spawning some interesting semantic things such as track back links and perma links. Not all of these things will survive but there already are these mini semantic webs emerging. These networks are growing in size and scope. People write tools to search and navigate them in various and sometimes unexpected ways. Whenever one tool involves multiple networks, effectively a larger one emerges.

IMHO the semantic web is not something that will be released by some big software company or standards body like the w3c but rather something that will emerge out of the chaos of different standards and formats that are out there today. There will not be some monolithic ontology that explains everything but rather there will be many domain specific, simple ontologies that may be abstracted from by tools so that relations between datasets may be established and explored without requiring many changes to the data. Where meaningful relations exist, tools and standards will emerge to exploit these relations.

--

Jilles

This burns me up!!! by octaene · 2004-09-27 06:05 · Score: 5, Funny

I'm so tired of Semantic trying to take over all the security tools. Are they now trying to take over the Internet? I mean really, Semantic Antivirus totally sucks ass big-time!!! And don't get me started on Semantic's SystemWorks tool and how bad it blows!

Oh, wait a minute...

Meanwhile... by genixia · 2004-09-27 06:08 · Score: 2, Funny

...a team in Redmond is tasked to make sure that Microsoft own the "single Web of meaning, about everything and for everyone."

Obvious candidate for massive abuse by gammelby · 2004-09-27 06:08 · Score: 2, Insightful

How is the semantic web going to handle abuse like pr0nn g_annotation>...? I mean, anybody can put up bogus annotations to promote their filthy business, as we saw in the days before Google and PageRank.

Ulrik

Re:Obvious candidate for massive abuse by Anonymous Coward · 2004-09-27 07:02 · Score: 0

As much as I think semantic web is a red herring, and next to useless (I mean, current "visions", at least; not necessarily the core idea of adding semantic information to web pages), it's no better or worse than any other markup. Plain old meta-tags have same problems as author-added (semantic) metadata. It's no better or worse.
It really has nothing to do with specific encapsulation method (RDF, meta tags, whatever), and everything to do with who adds metadata, who do you trust, and so on.
Another way to put it is that your concern, and immediate goals of things like semantic web are pretty much orthogonal. It'd be silly to try to solve all problems with just one tool; separation of concerns should work fine here.
Re:Obvious candidate for massive abuse by KjetilK · 2004-09-27 08:46 · Score: 2, Insightful

I suspect the answer to that one is immense social networks, user participation and webs of trust.
The WWW also has Annotea, to allow for people to submit annotations. Now, you can imagine lots of people having a simple way to rate pages, a rating option could for example be "Supplied metadata are bad/fraudulent", or something like that.
You would first and foremost make decisions based on ratings from people you trust. That is, people who are close to you in your FOAF-based social network.
When every Internet user becomes a reviewer, and people are well connected in a social network, so that there is a review available of most pages, there is going to be a very strong incentive for authors to supply accurate metadata. Think of it as moderation.
Face it, although it happens that you stumble upon pr0n involuntarily, the vast majority of pr0n surfers do it on purpose. Pr0n0graphers (this is getting a bit too leet for me...) then will have strong incentive to refrain from such tactics: they will be modded into oblivion anyway, and accurate metadata is going to bring them traffic, since they are modded up by those who actually surf pr0n.
So, unless the goatse guy is a friend of yours, I don't think it is a big worry.
Provided SW becomes a reality that is.
FOAF is a really good start, though, go create it now!

--
Employee of Inrupt, Project Release Manager and Community Manager for Solid

Why is a hero? by Gothmolly · 2004-09-27 06:09 · Score: 3, Interesting

Because he chose not to capitalize commercially on the Web? How is the measure of your altruism the measure of your heroism? I understand that many people DO feel that way, but nobody has ever really explained WHY heroism is a necessary consequence of altruism. Why is someone who makes a profit necessarily evil? The man who invented a corrugated-cardboard coffee-cup holder holds a patent on it; every Starbucks coffee sold puts a penny in his pocket. Why is that wrong?

--
I want to delete my account but Slashdot doesn't allow it.

Re:Why is a hero? by Anonymous Coward · 2004-09-27 06:20 · Score: 0

I understand that many people DO feel that way, but nobody has ever really explained WHY heroism is a necessary consequence of altruism.

If someone does something for me, I say thank you in some way. It's that simple.

Why is someone who makes a profit necessarily evil?

They are not. Altruism is not a prerequisite for heroism. No one said it was.

Your Score: -2, Objectivist Troll
Re:Why is a hero? by Anonymous Coward · 2004-09-27 06:22 · Score: 0

Not wrong at all, but how well do you think the web would work if every time you used a URL you put even a nano-penny in Tim's pocket?

If Sir Tim is a hero, it's because he (accidentally or otherwise) designed a system that was much better, and more universal, than it needed to be. It takes a fair bit of courage, and luck, to do that.
Re:Why is a hero? by Anonymous Coward · 2004-09-27 06:28 · Score: 0

nobody has ever really explained WHY heroism is a necessary consequence of altruism. Why is someone who makes a profit necessarily evil?

Just because people respect those that give freely to others, it doesn't mean that they think that any profit is evil. Just because !a -> b, it doesn't follow that a -> !b. Your logic is broken... usually this exact type of logical brokenness is used to discredit "dirty GNU hippies" when FUDding, so please don't promote it.
Re:Why is a hero? by Jhan · 2004-09-27 06:34 · Score: 1

... [I never understood] WHY heroism is a necessary consequence of altruism.

It isn't. There are plenty of ways to be altruistic without being a hero, and several ways of being heroic without being altruistic. On the whole there's a confluence, though.

Why is someone who makes a profit necessarily evil?

No, he isn't, and no-one believes so. He's just normal.

If, however, you have just come up with a billion-dollar idea (like "The Web"), and decided to give it to humanity rather than extract personal gain from it, this is tantamount to donating $1.000.000.000 to the world.

In my mind, and many others', this makes you a good man.

See the difference? It's not about Good Vs Evil, it's about Normal Vs Good.

--
I choose to remain celibate, like my father and his father before him.
Re:Why is a hero? by Red+Alastor · 2004-09-27 06:39 · Score: 1

They are not. Altruism is not a prerequisite for heroism. No one said it was.
In Hollywood movies it is. ;)

--
Slashdot anagrams to "Sad Sloth"
Re:Why is a hero? by C.+Mattix · 2004-09-27 06:40 · Score: 1

If altruism is a measure of heroism then everyone here should be worshipping Bill and Melinda Gates.

They are consistently the top philanthropists in the US. In 1999-2003 they pledged or gave away $23 Billion, or 54% of their wealth.

But since he is Gatus of Borg, everyone on here will call me a MS Apologist and say that he is evil.
Re:Why is a hero? by timeOday · 2004-09-27 07:23 · Score: 1

He's a hero because he created something great, that benefits almost everybody.
Had he tried to capitalize on it, he would have failed to create something as great as the web. He would have created "Online Magazines for GEnie" (or AOL) or some other useless thing. The freeness and openness are precisely what make it great.
Re:Why is a hero? by ChaosDiscord · 2004-09-27 08:14 · Score: 1

Why is [Tim] a hero? ... Because he chose not to capitalize commercially on the Web?

Well, it's hard to know, only the original poster could say for sure. But here is my guess: Tim could have made the web proprietary and closed off. The result might have made Tim wealthy, but almost certainly would have dramatically limited the impact of the web. We'd be in a world where Tim was significantly better off, but humanity as a whole was only marginally better off. Because it was free and open, Tim is marginally better off (he still got fame), but humanity as a whole is notably better off. Or something like that.
To be fair, I'm not sure I'd define it as heroism, but "hero" means different things to different people. Given that one definition is "a man admired for his achievements and noble qualities" (source), it's a reasonable enough use of the word. I expect the poster admires Tim's achievements and noble qualities.

I understand that many people DO feel that way, but nobody has ever really explained WHY heroism is a necessary consequence of altruism.

Some people would view altruism as a noble quality and a notable achievement. These people might admire that quality and achievement, and thus view someone engaging in that altruism as a hero.
How is the measure of your altruism the measure of your heroism?
...
Why is someone who makes a profit necessarily evil?

False assumptions. Reread the article; the poster never suggested that.

--
Search 2010 Gen Con events
Re:Why is a hero? by Marcus+Green · 2004-09-27 09:19 · Score: 1

Around the time when the web began to take off commercially I was attempting to use Microsofts "web killer application" called Blackbird. Anyone who saved me from that product is a hero in my eyes (and being a dead ringer for Douglas Adams helps as well)
Re:Why is a hero? by droleary · 2004-09-27 09:57 · Score: 1

They are consistently the top philanthropists in the US. In 1999-2003 they pledged or gave away $23 Billion, or 54% of their wealth.

No, they're giving away my "wealth"; the ill-gotten gain of a convicted monopolist. If you don't understand that, please allow me to heroically, altruistically drag you into an alley so that I can roll you to give another $100 to my charity of choice.
Re:Why is a hero? by Anonymous Coward · 2004-09-27 11:10 · Score: 0

The difference is that while Sir Tim Berners-Lee truly gave away something that the world could only benefit from, the Bill and Melinda Gates Foundation is all about having strings attached. Considering what they demand back for their "donations", it's even surprising it's considered philanthropy.

They donate millions worth of Microsoft operating systems and software (and count the donations at retail prices, mind you, so a million dollars of MS software only costs them a few thousand) and then demand that, after four or five years, the products be upgraded at regular market prices.

Here's one link: http://aroundcny.com/technofile/texts/massa_bill.html

And then there's the AIDS research funds for Africa and whatnot. They can't pay for that with boxes of Windows XP, so they donate in cash. But since that would be too altruistic, the research money requires pledges to advance enforcement of drug patents in already bankrupt countries. That's why the benevolent donations from B&MGF are partnered with more donations from companies like Bristol-Myers Squibb and Merck and Company.

So there's your difference between true altruism and vested business strategies. Hope you had fun.
Re:Why is a hero? by NetSettler · 2004-09-27 11:29 · Score: 1

Why is someone who makes a profit necessarily evil?

I'll go a step farther, just to underscore this already excellent point:

By not first assuring that there was a way to make money on the net, one could argue that it is he who condemned all of us content-providers on the net to a life of never being reimbursed for our efforts.

It is certainly the case that some people make money on the net, but mostly it is not the myriad people who have contributed the value it provides. If you want information on the mating habits of certain bark beetles, you can be sure there's some uncompensated bark beetle expert who has done you the favor of typing in this info, and you can be equally sure that when AOL or NetZero says it can offer you "all the value of the internet for a very low price" (times millions of people subscribing, it's still a tidy profit for them ) they are not making an effort to pay the many content providers.

So let's all "thank" Sir Tim for that, too.

In fact, I think the net just ran out of control and was co-opted by those with power in an attempt to continue to hold power.
Further compounded by the exchange of free software, it has assured that the net is not a very friendly place to try to make a buck. Real success already means, after a very short time, the ability to supply globally in order to make the sales volume that offsets the incredibly low margins the net has produced. And only a few people are capitalized to do that.

I have started to wonder if the world economy and overall happiness is actually improved by the presence of the net. Business seems to thrive on inefficiency, and the net is all about eliminating inefficiency. Something is awry.

--
Kent M Pitman
Philosopher, Technologist, Writer

Two major problems to a semantic web-binaries. by Anonymous Coward · 2004-09-27 06:10 · Score: 1, Interesting

iCal and RSS. The present problem is that people hate everything XML. Kind of hard to do semantic anything if people don't like the way you're doing it.

So scarry, man... I love different meanings by Anonymous Coward · 2004-09-27 06:11 · Score: 0

Indeed, anything that offers a "single Web of meaning, about everything and for everyone" is truly a scary concept to me, philosophically, of course.

I really do love all the very different meanings that the human mind can create from all the very same things.

Re:So scarry, man... I love different meanings by Misinformed · 2004-09-27 07:53 · Score: 1

A post-singularity web! Awesome! Who knows what it'll bring!

--
-- Slashdot: Racism against Indians OK. China bad, USA good. Blue pill in water supply.

Statistical text analysis killed semweb by Ars-Fartsica · 2004-09-27 06:12 · Score: 5, Insightful

As has been stated many times, content producers will spoof semantic data just like they used to with the META tag...which is why no one uses the META tag anymore. Relevance algorithms take into account link analysis and statistical text analysis to provide a much more truthful representation of what data is there. Sorry Tim.

Re:Statistical text analysis killed semweb by TuringTest · 2004-09-27 06:33 · Score: 1

And who's to say that the Semantic Web metadata will not be populated with statistical text analysis and hyper-text analysis?

--
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
Re:Statistical text analysis killed semweb by Anonymous Coward · 2004-09-27 06:53 · Score: 0

content producers will spoof semantic data just like they used to with the META tag...which is why no one uses the META tag anymore. Relevance algorithms take into account link analysis and statistical text analysis to provide a much more truthful representation of what data is there

Lots of people use the META tag. More use it than don't. And your comment about "relevance algorithms using link analysis....blahblahblah". I think what you meant was "Google uses ".

I can promise you, there is NO statistical text analysis that can give you any idea of what data is there. The BEST method is the use of META tags. It is a shame that every porn site on earth felt the need to abuse the hell out of the tags. It sucks that a whole industry peddling "supplements" to prey on Americans' penis size issues decided to abuse the META tags. It's too bad that so many people are fixated on Brittany's tits.

In the long run, there is just no better way to express your content than to make proper use of meta tags. Here is a great example that usually kills the statistical text algo.

"To be or not to be"

What is this phrase about? Some search technology will index it as NOTHING AT ALL because all words are STOP words. How do you indicate to an algorithm that this is in fact a famous line from a famous play by Bill S? You cannot. The only way to properly describe this data is through Meta tags. You can pick speech apart word by word, and phrase by phrase, but it is very difficult to systematically describe what something is ABOUT. In most cases, it is at best a very vague, although somewhat educated guess, at what the content could be about.
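To make the stop-word point concrete, here is a minimal sketch of a naive keyword indexer. The stop-word list and the index_terms helper are hypothetical, made up for illustration; real engines use longer lists and often index phrases as well, so treat this as the failure mode being described rather than as how any particular engine behaves.

```python
# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "be", "is", "not", "of", "or", "the", "to"}


def index_terms(text):
    """Return the terms a naive keyword indexer would keep for this text."""
    # Lowercase each word and trim surrounding punctuation.
    words = [w.strip('.,?!"').lower() for w in text.split()]
    # Drop empty strings and anything on the stop-word list.
    return [w for w in words if w and w not in STOP_WORDS]


print(index_terms("To be or not to be"))     # []
print(index_terms("The tragedy of Hamlet"))  # ['tragedy', 'hamlet']
```

Every word of the quote is filtered out, so nothing is left to index, while a plain descriptive title survives.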

To the little people on the web, who are selling garbage products, showing their home photos, clothed or otherwise, and spouting pointless opinions **cough**Slashdot**cough**, you may not use Metadata. Then again, your content IS NOT IMPORTANT. To those of us who are in the information industry, who are publishing hundreds of thousands of pieces of authoritative, relevant, and often free, data a week, we ALL use META tags (think Federal government agencies, international agencies, not "paid for" think tanks). We have to, because it may not be apparent that analysis A relates to topic B, unless the author of the analysis makes it perfectly clear there is a relationship.

Sorry Ars-Fartsica, your suggestion that link analysis, and statistical text analysis, gives a more truthful representation of what data is present is at best a funny comment, and at worst downright ignorant. Remember, when you find our data on the web, it's because WE pay google and the rest to make sure it comes out on top, not because Google magically understands the data we publish. In actual fact, it has been proven over and over, that textual based indexing is THE WORST possible method of indexing content. I think "google bombing" has shown the weakness of the link analysis method.

Tim is on the right path. At some point down the road, we will be using something much closer to his vision than the current version of the web. There may be two versions, one the porn filled commercial whorehouse that currently exists, and the other, actual information of value and use to the world. Not that I have anything against porn!
Re:Statistical text analysis killed semweb by Ars-Fartsica · 2004-09-27 07:10 · Score: 1

The BEST method is the use of META tags.
Unless you think META strings like "pussy sex sex hentai bukkakae virgin tight" are useful, you are wrong and have been wrong for at least eight years. Of course if people could be trusted to reliably define their own content that would be great, but they can't, and this is already well known and established, which is why no relevance engine weights META tag data in any significant way.
Re:Statistical text analysis killed semweb by Ars-Fartsica · 2004-09-27 07:13 · Score: 2, Interesting

And who's to say that the Semantic Web metadata will not be populated with statistical text analysis and hyper-text analysis?
Statistical methods excel at query relevance, not ontological interpretation. If the latter were the case, Google would be auto-constructing DMOZ instead of seeding page rank with it.
Re:Statistical text analysis killed semweb by dubious9 · 2004-09-27 07:33 · Score: 1

There's a difference between describing stuff and finding relevant information. The semantic web (as I understand) can co-exist with statistical relevance information. If you lie about what you are describing, we can still use trust-relationship and google-like relevance stuff to find what we want to find.

Once we *do* find something worthwhile, the semantic web will enable our machine to do much more with the information than just display it.

In conclusion the semantic web isn't a replacement for META tags or google-like search engines, it's about adding content and functionality so agents can provide a richer experience. Yes, that sounds like marketing talk, but that's what he has in mind, as far as I can tell.

--
Why, o why must the sky fall when I've learned to fly?
Re:Statistical text analysis killed semweb by Ars-Fartsica · 2004-09-27 08:49 · Score: 1

If you lie about what you are describing, we can still use trust-relationship and google-like relevance stuff to find what we want to find.
If we can use statistical methods to infer the truthfulness of semantic data, then by definition it is useless as you are saying all the important data lies in the text and the semantic markup can inherently be verified using that test.
As an analogy, I may wear a name tag claiming my height. You are telling me you can use your ruler to verify the veracity of this data. Then what is the point of the name tag? Just use your ruler. The name tag at that point is worthless.
Re:Statistical text analysis killed semweb by Anonymous Coward · 2004-09-27 09:14 · Score: 1, Insightful

Except if you read the rest of the post.

BTW, Most relevance engines give weight to META tags. Only Google, and those who take their results from google don't use them.

You don't have to believe me, I mean I only spend all day listening to the leading experts in search, and search technology discussing the best methods for finding relevant information. The consensus around the world is, text based indexing, and link analysis is the weakest method of all. Although it does help to cut through the "noise" created by the dot com boom style webmaster, and seo tricks driven by the pursuit of $.

However, in your meta tag example, what is the problem? If my site is full of "pussy sex tight virgin", then what is wrong? If it describes the content, then yes, I do find it useful. Just because you tend to search for that shit doesn't make the tags irrelevant. Or are you suggesting that tags like that are often used by, say the library of congress?

You like harping on about abused metatags, and true enough, they are abused when commercial interests are at hand. However, you cannot discount the fact that there is no better way to describe the content. Sir Tim is bang on right. You may think that this opinion has been wrong "for at least eight years", but trust me, it is the direction that everyone is going. If you make meaningful content that has value as information, you use meta tags.

Now, for your garage ecommerce site, ya, you probably spam the living hell out of your meta tags, but you have nothing meaningful or useful. Now go and search around for national statistics on something, loaded with metadata. And if you don't see it, then the search engine is actually spidering the metadata repository behind the scenes.

Meta data is still the best method to use to describe your data. Plus, I notice you chose a very limited portion of my post to respond to. Why not try out

"to be or not to be"

A little harder to dispute, maybe?
Re:Statistical text analysis killed semweb by Anonymous Coward · 2004-09-27 09:45 · Score: 0

I do not follow your logic. If we use a format that is easier to parse by a machine than ordinary english (or some context-sensitive ambiguous language), how does that show it is useless compared to english since it can still be statistically analyzed for truthfulness?

If I wear five hundred name-tags of numbers and none of them have labels how is that any better (ruler or no)? If one has a height label associated, I can at least try to verify the person's height.

The next "web"? by daveschroeder · 2004-09-27 06:15 · Score: 2, Informative

Croquet

...from the minds of Alan Kay, David Smith, David Reed, and others...

Re:The next "web"? by Anonymous Coward · 2004-09-27 07:01 · Score: 0

There's only one thing I hate about this...

We intend to make a developer's release of Croquet available by September 2004.

Well, time's almost up. Last time they were going to release it in April, and they missed that. I really want to play with this! Come on, already!

Ontology by dodongo · 2004-09-27 06:16 · Score: 5, Interesting

I want to offer an alternative, as proposed by Victor Raskin at Purdue. I speak for neither Sergei Nirenburg nor Victor (who does enough talking for himself).

While this idea for more thorough, concise, and accurate searches is a good one, I would question whether embedding semantic tags into web pages is the way to go.

As outlined in Ontological Semantics, there is an automated system of semantic processing already underway. Basically, it takes a text, then runs it through a parser, which looks up meanings in a lexicon, then reduces whatever translation it comes up with to a text-meaning representation (TMR), by pushing the concepts from the lexicon through an ontology / onomasticon / world-knowledge library. The TMR is basically the "pulp" of the semantics of the article, web page, book, or whatever it's been fed. It just contains the ideas, the things involved, and other relevant concepts, stripped of all other linguistic information.

TMR is great, because the TMR can be used then, by reversing the process and using the lexicon of another language, to translate a text from one language to another.

However, it seems to me that with the bits and pieces of the TMR stored in a search engine's index, this could be a huge boon for the search engine.

Instead of just trying to match keywords, by parsing the TMR of web pages and by parsing TMR of search strings, you no longer search for keywords, but keyconcepts.
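As a rough illustration of keyword versus keyconcept matching, here is a toy sketch. It is emphatically not the Ontological Semantics system: the LEXICON table, the concept labels, and both helper functions are invented for this example, and a flat word-to-concept lookup stands in for the whole parser / lexicon / ontology pipeline.

```python
# Hypothetical toy lexicon mapping surface words to concept labels.
# A real TMR pipeline involves parsing and an ontology; this is only a stand-in.
LEXICON = {
    "car": "VEHICLE", "automobile": "VEHICLE", "auto": "VEHICLE",
    "buy": "PURCHASE", "purchase": "PURCHASE",
    "cheap": "LOW-COST", "inexpensive": "LOW-COST",
}


def concepts(text):
    """Reduce a text to the set of concepts its known words map to."""
    return {LEXICON[w] for w in text.lower().split() if w in LEXICON}


def concept_match(query, page):
    """True when the query has concepts and all of them appear in the page."""
    q = concepts(query)
    return bool(q) and q <= concepts(page)


page = "Where to purchase an inexpensive automobile"
query = "buy cheap car"

print(concept_match(query, page))                        # True
print(set(query.split()) & set(page.lower().split()))    # set()
```

The query and the page share no keywords at all, yet they match once both are reduced to concepts, which is the gain being claimed for indexing TMR fragments.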

The advantage to semantic searches / indexes by this implementation is manifold:

-Searches (and the web as a whole) will gain the richness Mr. Berners-Lee is advocating.

-Web authors will not be able to lie in their semantic tags, or otherwise misinform spiders what the page is about (remember tags?)

-No extra work is required in the actual construct of the web or *ML standards. The TMR is only generated and stored by the sites / processes that need it.

-Others?

Just an alternative solution, for fun :)

Re:Ontology by dodongo · 2004-09-27 06:18 · Score: 1

It's supposed to say "Remember META tags" but I wrote it as HTML and the comment parser screwed me :)

d'oh!
Re:Ontology by mr_majestyk · 2004-09-27 06:36 · Score: 1

Basically, it takes a text, then runs it through a parser, which looks up meanings in a lexicon, then reduces whatever translation it comes up with to a text-meaning representation (TMR), by pushing the concepts from the lexicon through an ontology / onomasticon / world-knowledge library.

OK. How does it do with this sentence: "Time flies like an arrow?"
Re:Ontology by Feynman · 2004-09-27 06:49 · Score: 2, Funny

OK. How does it do with this sentence: "Time flies like an arrow?"
It returns: "Fruit flies like a banana."
Re:Ontology by Raindance · 2004-09-27 09:04 · Score: 1

Under your set of assumptions, this is quite literally the ideal solution. Unfortunately, this doesn't mean it's the ideal solution. Then again, it might be worth trying.

In short, words mean different things to different people. Furthermore, from any text you can construct *infinitely many plausible yet contradictory interpretations* (this is standard literary theory and accepted throughout academia). Therefore, speaking of "The text-meaning representation (TMR)" is very suspect. You may (given that Wittgenstein's later work doesn't spell doom for *any* rigorous universal meaning system, which the concept of a rigorous TMR depends upon, though I believe it's likely that it does) speak of "A TMR" or "The most useful TMR given X". Defining X in the case of building a TMR-based semantic web would be a PhD Dissertation-level exercise.

So, yes, I agree that 'Text-meaning representation' is a viable path to trying to make sense of this messed-up web that we live in. But it has deep theoretical problems, some of which can be mitigated, some of which cannot be.

Mike
Re:Ontology by dodongo · 2004-09-27 12:18 · Score: 2, Insightful

Well... I actually wrote a paper lambasting the ontology for precisely what you bring up here. Specifically, I wrote working from a draft of Adele Goldberg & Ray Jackendoff's paper "The English resultative as a family of constructions" (_Language_ vol. 80 no.3, September 2004). It deals with strange things like

"The trolley rumbled through the city"

and led me to believe Victor's ontological approach would have some serious problems encoding this if it didn't have a more attuned syntax processor. It wasn't a good paper, but I made my point, and you bring up a similar idea on a more basic (and thus, even more problematic) level.

Anything remotely "idiomatic" (specifically, where the combinatoriality of semantics fails, as it does in your example, where time does not "fly" in the sense that it does not move through the air held aloft by differences in air pressure) starts to generate serious problems.

Your problem could be solved if the lexicon had in it information about common idioms, which it presumably would, to be functional on any level more colloquial than academic writing. Most linguists would tell you the lexicon really does encode idioms in some fashion too, so this wouldn't be some sort of computational stop-gap.

So the lexicon has in it "time flies" or something. The parser (or some sublevel of it) would then identify "like" as a metaphorical comparison to the following predicate "an arrow."

Thus, the TMR would have something to do with time moving briskly towards a target, perhaps.

I'm not saying this is an entirely feasible option, but read what Tim Berners-Lee is proposing, and see if you find it much more plausible. The amount of information out there people would have to manually encode would preclude the system from having any real functionality beyond keyword search. While I'm not a huge fan of the current implementation of the ontology, I do think future generations could start to sort things out. Its advantage is that once the concept database, the onomasticon, is complete, it should be mostly self-trainable, which is what Berners-Lee's solution lacks.

Think of it as The Web: 1.O by Anonymous Coward · 2004-09-27 06:16 · Score: 0

I remember reading about this obscure thing called the interback in 1990. I was a kid back then, and I equated AOL to the interback. Berners-Lee was mentioned in that article, and I thought, "so what? Who needs such a system?" It wasn't until much later, around 1995, that I connected to the Internet in earnest. Even then, World Wide Web browsers at the time were mostly centered around NCSA Mosaic. Netscape had just started being disseminated, and Microsoft didn't even have the Internet on its radar.

Re:Think of it as The Web: 1.O by Anonymous Coward · 2004-09-27 06:30 · Score: 0

Interesting anecdote... with ZERO point. Are you trying to say that even back in 1990, you believed that Berners-Lee was a nutcase, with no ability to make his plan happen? That's what it sounds like you're implying, but who am I to judge. I just work at the most prestigious university in the world, MIT. Go back to grade school, kiddy.

Not doing it right by vigyanik · 2004-09-27 06:17 · Score: 4, Insightful

The fact that Tim has been trying for 15 years to sell this idea with little success indicates that his approach is insufficient. He is pitching the idea just like a startup would, giving cool examples and everything. But in practice, all he is doing is proposing and overseeing standards. Developing standards for an idea is not what is required to prove that an idea works. Standards should follow successful technology, not vice versa. You need to have companies that make products professionally and offer complete solutions (i.e. make it work in real-life situations). Doing it for a very simple example that he quotes ("find pictures taken on sunny days") itself is a big, big deal. Perhaps Tim should get involved with companies in this field as an advisor/consultant. You know, there are enough smart people out there who could develop the standards. But there are very few people with his name and recognition to truly ignite commercial interest in his ideas.

Re:Not doing it right by Anonymous Coward · 2004-09-27 06:28 · Score: 0

See the comment directly above yours ("The Web: 1.O"). Basically true, and right on the money in regards to Tim. I knew him personally from grad school days... He's one of those guys with lots of ideas, but no real insight into how to make them happen. So he spends a lot of time expanding on the ideas, but not getting anything accomplished. It's always amazed me how he managed to get this much press.
Re:Not doing it right by dubious9 · 2004-09-27 08:04 · Score: 4, Insightful

Perhaps Tim should get involved with companies in this field as an advisor/consultant.

Um... he invented www and started the W3C. I'd say he's had some experience with companies as an advisor. Take a look at some of the W3C recommendations and look for corporate involvement.

But in practice, all he is doing is proposing and overseeing standards.

That's kinda what the W3C *does*.

Standards should follow successful technology, not vice versa.

XHTML, XML, XSLT and a lot of other recommendations started as standards that *later* had robust implementations. Technology that starts without standards is often not fully thought out and awkward, and at worst, proprietary. Waiting for technology before standards will only inhibit interoperability and adoption of the standard.

The fact that Tim has been trying for 15 years to sell this idea with little success indicates that his approach is insufficient.

I suppose that it has nothing to do with the fact that it's a tremendously difficult and ambitious project. You're right. Anything that takes 15 years to develop should be scrapped.

--
Why, o why must the sky fall when I've learned to fly?
Re:Not doing it right by Alomex · 2004-09-27 08:12 · Score: 1

[Berners-Lee] started the W3C

You say that as if it was a good thing. The W3C is much inferior to IETF (the internet's analogue), and both are much worse than when they were driven only by academics. The W3C was, from the start, corporate members only.
Re:Not doing it right by vigyanik · 2004-09-27 09:05 · Score: 1

XHTML, XML, XSLT and a lot of other recommendations started as standards that *later* had robust implementations. Technology that starts without standards is often not fully thought out and awkward, and at worst, proprietary. Waiting for technology before standards will only inhibit interoperability and adoption of the standard.
This is factually incorrect. XML was developed because many implementors came up with their own version of a dumbed down SGML. XHTML was developed to enforce "clean" implementations of HTML for which guidelines were already developed and followed in the community. By definition, standardisation implies existing implementation. Otherwise, what are you standardizing???
I suppose that it has nothing to do with the fact that it's a tremendously difficult and ambitious project. You're right. Anything that takes 15 years to develop should be scrapped.
Did you even try to understand the post??? I never said his efforts should be scrapped. I said he's got some great ideas, but developing standards is not the right way to sell them.
Re:Not doing it right by dubious9 · 2004-09-27 09:54 · Score: 1

Notice I said robust implementations (perhaps I should have emphasised that point). Yes, there was a lot of work going on in SGML dialects, but none of it was terribly popular or consistent, IIRC (and please correct me if I'm wrong). The XML standard took the concept of structured markup to describe and facilitate machine reading of data from its infancy to ubiquity. Yes the XML keyword is thrown around too much, but the standard really helped it.

I don't totally disagree with you, but what I'm trying to say is that standards should be developed in parallel with early implementations and it seems like he's doing that.

He is pitching the idea just like a startup would, giving cool examples and everything. But in practice, all he is doing is proposing and overseeing standards.

He doesn't just write and talk about the semantic-web, he's trying to put a lot of it in practice. However you can't really show an incomplete implementation (though he did mention one at the end of the article). Also, notice there is no recommendation out, he's just working on it. In short I don't believe he's *just* using standards to promote the semantic-web, it's merely a tool in his toolbox, and a useful one at that.

--
Why, o why must the sky fall when I've learned to fly?

Google can leverage its search by PineHall · 2004-09-27 06:18 · Score: 4, Informative

Here is an account that predicts that Google will leverage its search results to create a Semantic Web. I see this as a distinct possibility. Especially Google leveraging its search results to help people buy and sell stuff.

Re:Google can leverage its search by NoOneInParticular · 2004-09-27 08:36 · Score: 2, Interesting

The people at Google are probably too smart to buy into yet another failure-to-be from GOFAI (Good Old Fashioned Artificial Intelligence). Next to automatic translation (60s) and expert systems (80s), the semantic web (00s) will soon be found on the garbage heap of technology. Whenever the real world kicks in, crisp logic and deductive reasoning fail simply because they cannot account for uncertainty in the basis of their reasoning: their assumptions. There is no formal way to assert the truthfulness of assumptions (or if you want, ontologies), they are either true or false. That's it, there is no 'maybe' or 'could be' or 'pretty likely': true or false.
Any form of information found on the web, from whatever trusted source, needs to be evaluated on the likeliness that it is true. From this likeliness, you can start reasoning and finally come up with a conclusion plus a degree of belief in that conclusion, but you will not be able to state that an assumption is absolutely true or false. As crisp logic only leads to valid conclusions assuming absolute truth or falsehood of its assumptions, any conclusion drawn from that meta-assumption is invalid, or at best unqualified.
No, the aberration called fuzzy logic is no solution.
Enter the world of Bayesian reasoning. Here the truth of a proposition is never absolutely true or false, there are only degrees of belief and a system of systematic and consistent calculations to derive the likeliness of conclusions in the presence of uncertainty, plus a method to add new evidence to the calculations. Take a simple crisp assumption 'The sun always comes up in the morning'. For a semantic webber this statement is either true or false, and whenever two trusted sites claim two opposing views on the matter, the human operator needs to fix the inconsistency. A Bayesian webber might start to reason first: okay, first in the absence of any information I will assign an observation of true to the assertion, and an observation of false. This is my informationless prior and makes the likelihood 50%. Then I'm going to count: every time the sun has come up in the morning I count one for truth of the assertion. If it didn't I count one for falsehood. As I don't remember it ever not happening (and I would have noticed!) I can claim about the number of days I have lived to the truth of the assertion. That's about 4 9's of truths. Now I can ask someone else if they ever saw the sun not coming up. Assuming that I trust them for 95% to give me the correct answer, I can easily add a few extra nines to my belief in the assertion. Also reading some physics books adds to my belief, up to the point that it will take quite a lot of conflicting evidence to make me doubt that particular assertion.
It might be interesting to note that from this strong belief in the assertion I can actually deduce that somebody that tells me otherwise is very likely lying to me, and I should watch whatever the person says. A semantic web will fall flat on its face when there are conflicting pieces of information or outright lies on 'trusted' webpages.
Note that the two approaches are completely at odds: for the crisp logic approach everything is either true or false, for the Bayesian logic approach nothing is purely true or false(*). The Bayesian approach is well-known, but can easily lead to computational explosions. However, it seems to be the only way to reason in a world where evidence can (will) be contradictory and assumptions cannot be trusted. Without a consistent framework of reasoning with uncertainty (and the Bayesian framework is provably the only consistent one), the semantic web will be yet another failure of AI.
(*) Bayesian probabilities can be completely true or false (1 or 0), but no-one in his right mind would do that because from that there is no mathematical way to change your belief. 20 9's should be enough for anybody.
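The sunrise counting above can be sketched in a few lines. This is one reading of the argument, not a definitive model: the function names are made up, the figure of 10,000 days lived is an assumption chosen to land near "4 9's", and treating the 95%-trusted witness as a likelihood ratio of 0.95/0.05 is a simplification of my own.

```python
import math


def belief(true_count, false_count):
    """Degree of belief after counting, starting from one pseudo-observation each way."""
    return (true_count + 1) / (true_count + false_count + 2)


def update_with_witness(p, reliability):
    """Fold in one confirming report from a source trusted at `reliability` (assumed model)."""
    odds = p / (1 - p) * (reliability / (1 - reliability))
    return odds / (1 + odds)


def nines(p):
    """Express a probability as a number of nines, e.g. 0.9999 -> 4.0."""
    return -math.log10(1 - p)


print(belief(0, 0))                 # 0.5, the informationless prior

days_lived = 10_000                 # assumed figure, roughly 27 years
p = belief(days_lived, 0)
print(round(nines(p), 1))           # 4.0

p = update_with_witness(p, 0.95)
print(round(nines(p), 1))           # about 5.3
```

Note that the belief never reaches exactly 1, so it can always be revised by later evidence, which is the point of the footnote.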

Why is a hero?-Whitney Houston. by Anonymous Coward · 2004-09-27 06:21 · Score: 0

Maybe because we've sunk so low as a society (for various reasons, some obvious, most not). We need heroes to admire and aspire to. Unfortunately a lot of "heroes" get left on the cutting-room floor.

Let's hope it fails by Anonymous Coward · 2004-09-27 06:21 · Score: 0

Let's hope this fails.

History tells us that anything that has a
"single Web of meaning, about everything and for everyone" is really bad, no matter how tempting it is.

Freedom of different meanings is so much more sexy.

Will the "splash screen"... by jbarr · 2004-09-27 06:22 · Score: 2, Funny

...have the words "Don't Panic" prominently displayed?

--
My mom always said, "Jim, you're 1 in a million." Given the current population, there are 7000 of me. God help us all!

Your score by Gothmolly · 2004-09-27 06:24 · Score: 1

-3, Anonymous Coward.

--
I want to delete my account but Slashdot doesn't allow it.

Re:Your score by Perianwyr+Stormcrow · 2004-09-27 07:07 · Score: 1

Your score:

Not with me. You're missing out! Get with the program! Get on the bandwagon! Taste the rainbow! Rock the casbah! Paint the melon! Oversee the expedition! Break the mold! Eat the menu!

--
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey

Tim didn't "invent" anything new with the web. by Anonymous Coward · 2004-09-27 06:25 · Score: 1, Interesting

It's just unrooted gopherspace, with a near-total lack of control over the download process (in most gopher clients, you had to choose to get an image - making banner ads impossible, and low-speed connections more useable).

Yawn yawn.

Incidentally, if you weren't on the net before NCSA mosaic, or have never used GOPHER, you don't need to bother replying to this post. Trust me, "surfing" gopherspace was trivially different from "surfing" the web, until the web went commercial.

This is like how Darwin constantly gets credited with "inventing" the theory of evolution (when actually Matthew published it in 1831, 30 years previously, as acknowledged by Darwin himself) or the way Uda's name always gets left off his invention (Uda was the principal inventor of the Yagi-Uda antenna).

Give Tim credit for helping develop the first web browser, he deserves that recognition. But calling him the "inventor" of the web is like calling Sir Isaac Newton the "inventor" of gravity!

Re:Tim didn't "invent" anything new with the web. by KjetilK · 2004-09-27 08:15 · Score: 1

I remember gopher, and I really can't agree... There are many good reasons why gopher didn't make it, and the self-FUDing they did is a good one. Another was that it was just painful to find things, despite Veronica. I know a few gopher nostalgics, but I for one am glad the world moved on, and the improvements TimBL made over gopher were huge. YMMV.

--
Employee of Inrupt, Project Release Manager and Community Manager for Solid
Re:Tim didn't "invent" anything new with the web. by Anonymous Coward · 2004-09-27 09:33 · Score: 0

Matthew's work, whilst clearly anticipating evolution by natural selection, did not have anything like the range of application and clarity of Darwin's work.

In fact, most of Matthew's writings are so woolly that they could easily be seen as Lamarckian, as when he says: "This circumstance-adaptive law operating upon the slight but continued natural disposition to sport in the progeny, does not preclude the supposed influence which volition or sensation may have had over the configuration of the body." It's tough to work out exactly what he means. He has some idea, but he didn't have the intellectual rigour to follow it through that Darwin had.

I'm sorry, but Darwin's work (which of course was published decades after Darwin had fully formulated it) is incomparably superior to Matthew's. Nearly always the person credited with an idea, or who has something named after him, was not the one responsible (I think there's a law named after somebody to this effect, the somebody not being the person who invented it of course). But in Darwin's case, it isn't so.

Close is not good enough with evolution: without all the pieces exactly in place, it isn't the same theory at all. All the pieces can be found in writings going back millennia. Only Darwin coherently formed a testable complete theory.
Re:Tim didn't "invent" anything new with the web. by Anonymous Coward · 2004-09-27 10:00 · Score: 0

Actually, I always felt like I was just ftp'ing somewhere in gopher. Until they really kludged gopher with gopher+, I did not see the point. HTTP and HTML on the other hand, I could at least see the potential in it. Unlike gopher, the web at least was based on something (markup).

Though you could always yawn since TBL and friends just climbed on the backs of the SGML community...versus...making it up over beers in a few hours.
Re:Tim didn't "invent" anything new with the web. by wombatmobile · 2004-09-27 12:50 · Score: 1

calling him the "inventor" of the web is like calling Sir Isaac Newton the "inventor" of gravity!

Are you sure about that? Gravity was around before Newton and has no inventor. Contrast this with the www, which was transported to Earth by alien craft.

Tagging vs. Understanding Context by saddino · 2004-09-27 06:27 · Score: 2, Interesting

The common thread to the Semantic Web is that there's lots of information out there--financial information, weather information, corporate information--on databases, spreadsheets, and websites that you can read but you can't manipulate. The key thing is that this data exists, but the computers don't know what it is and how it interrelates. You can't write programs to use it.

IMHO, the problem with the Semantic Web is the same problem that evolved the Web from a linked knowledge store to a commercial-driven directory.

Yes, it would be nice if all data were tagged and understandable, but let's be honest: the commercialization (and its result: exploitation by marketers) of the web would certainly spill into the Semantic Web, and so Berners-Lee's vision would be once again ruined by 1) incorrect/misleading tagging, 2) competing standards and 3) out and out fraud.

I assume what Berners-Lee really wants is for a machine to truly understand, using his example, that something is a calendar, that you are interested in it, and that you should add the event to your schedule and then book a flight for it.

But the chances are -- one day -- machines will be able to understand how data is typed by understanding the context around it (just as a human would go through the aforementioned process manually).

Obviously, this type of reading "comprehension" is a long way off, but the "search engine wars" are resulting in a lot of mind power thrown at the problem of understanding context. And I'm guessing it'll be a reality before anything as pure as the vision for the Semantic Web is realized.

(and to throw in a plug for my own company's attempt at understanding web context: theConcept).

Bush Invented the internet! by rumblin'rabbit · 2004-09-27 06:29 · Score: 1

It said so in the brief history...

1945
In the Atlantic Monthly, director of the U.S. Office of Scientific Research and Development Vannevar Bush describes the Memex, a hypothetical device for linking microfiche documents.

It's just like Al Gore to try to take credit for the rightful president's inventions. Thank God Bush swept Florida.

Re:Bush Invented the internet! by Anonymous Coward · 2004-09-27 15:03 · Score: 0

On the heels of the parent post, people could go read about Ted Nelson. Xanadu has been in the making for 35 years. This is a fun read, but also see the rebuttal here.

There's also this article over at Kuro5hin.

Second System Effect by xleeko · 2004-09-27 06:31 · Score: 4, Insightful

I've been hearing noise about the semantic web, RDF, and what not for years now, and every time I do, the first thing that pops into my head is "Second System Effect".

He got lucky once, because he put together some tools that were simple and straightforward enough for people to pick it up quickly, thereby avoiding the fate of the dozens of other hypertext systems going back to the late 1980's.

Now, like all second systems, he wants to "do it right", over-engineering away all of the things that made the first one take off ...

Just my opinionated rant ...

Re:Second System Effect by Anonymous Coward · 2004-09-27 06:39 · Score: 1, Informative

I've been hearing noise about the semantic web, RDF, and what not for years now, and every time I do, the first thing that pops into my head is "Second System Effect".

He got lucky once, because he put together some tools that were simple and straightforward enough for people to pick it up quickly, thereby avoiding the fate of the dozens of other hypertext systems going back to the late 1980's.

Now, like all second systems, he wants to "do it right", over-engineering away all of the things that made the first one take off ...

No, you are correct. Tim is a physicist who has no background in CS and it shows. The entire AI community thinks the Semantic Web is a joke which is why nobody with any real credibility in the field is studying it. Other semantic approaches are being researched, but they aren't anything like the Semantic Web because you can't trust/expect content authors to tag their pages correctly. Also, there is no one single correct ontology, what is needed is a generic framework for reasoning about and merging information from different ontologies/viewpoints.

Already adressed by TuringTest · 2004-09-27 06:38 · Score: 1

The Semantic Web is a publishing medium; the creation of content is left to the will of the publishers (ideally the creation of metadata should be computer-assisted, but there are other possibilities). Your "second problem" is precisely what the S.W. is intended to solve; it doesn't require people to agree on the data format, everybody can define their own.

--
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.

Re:Already adressed by aussie_a · 2004-09-27 18:04 · Score: 1

I get asked to save a file when I try to click that link. What is it?
Re:Already adressed by TuringTest · 2004-09-27 23:12 · Score: 1

del.icio.us is a "social bookmarking" site; people save their surfing findings, and the links are automatically categorized.

I don't know why it doesn't open in your browser; .us is a valid URL domain. Search for it in Google.

--
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
Re:Already adressed by aussie_a · 2004-09-27 23:21 · Score: 1

I don't know why it doesn't open in your browser

I do. I was using IE (please don't hate me, I was at university, which incidentally does have Firefox, I just forget that it has Firefox).

Semantic Web Tags for Clearplastic.com by mrs+clear+plastic · 2004-09-27 06:38 · Score: 1

So, after reading the main article for this story as well as the one for a previous slashdot story on this subject, I guess that I can add the following meta tags for some of the items in my website, www.clearplastic.com. I don't yet know the syntax for these semantic meta tags; I am but taking a guess:

<semantic "Clear Plastic" = "waterproof, transparent, see-through, air-tight, shows-beauty, protective">

. . . . .

And so forth. Can this lead to 'semantic spamming?' I have only just begun for one of my two sites. I can see where this can get way out of control. Someone goes to clearplastic.com who lives in a rainy climate area. One of the semantics could say that a clear plastic raincoat is a required item. If someone's computer is set up so that it automatically purchases something that is required, I consider this scamming.

--
Cleara

Re:Semantic Web Tags for Clearplastic.com by cynic10508 · 2004-09-27 07:00 · Score: 1

And so forth. Can this lead to 'semantic spamming?' I have only just begun for one of my two sites. I can see where this can get way out of control. Someone goes to clearplastic.com who lives in a rainy climate area. One of the semantics could say that a clear plastic raincoat is a required item. If someone's computer is set up so that it automatically purchases something that is required, I consider this scamming.

Well, this approach:

<semantic "Clear Plastic" = "waterproof, transparent, see-through, air-tight, shows-beauty, protective">

Seems to involve something known as semantic features. Basically, we're describing an object in terms of the features it does or does not have. If we in turn build an ontology out of this, I think it will lack the structural rigidity and integrity it would have had if we had built the ontology from the ground up. That is to say, forget all about semantic features because they lack scalability, and build an ontology by adding concepts one at a time and making sure they're properly described.

MOD PARENT UP!!!! by Anonymous Coward · 2004-09-27 06:40 · Score: 0

it's both informative and insightful.

Being Done Already by cynic10508 · 2004-09-27 06:41 · Score: 1

The "Semantic Web" is already being done in a quite sophisticated manner by computational linguists. The major stumbling block: money. It takes a lot of time (and hence, money) to build these systems and no one seems to appreciate the possible impact.

Re:Being Done Already by Anonymous Coward · 2004-09-27 11:25 · Score: 0

well, there are two problems, money, and the fact that computational linguistics is just a fun game for academics to play and write endless papers on. it is totally useless outside the lab.

DANGER! by tunabomber · 2004-09-27 06:42 · Score: 1

You don't want a "single" web... You want a multitude of them, and carefully isolate them (beyond normal information reading and referencing).

This horrible monoculture is what's happening to the web right now! A new web browser called FireFox is conspiring with the evil W3C to propagate its agenda of paving over the current safely incompatible WWW with the data duopoly of XHTML and CSS. If they succeed in their nefarious motives, all the markup on the web will adhere to ONE draconian standard!

Seriously, man. Monocultures are a GOOD thing for standards, but a bad thing for implementations of those standards. I expect that once the standards that make up the Semantic Web become solidified, we'll see multiple implementations popping up.

--

pi = 3.141592653589793helpimtrappedinauniversefactory71 ...

Please re-read the submission - by Gothmolly · 2004-09-27 06:46 · Score: 1

The AC submitter chose to smuggle the little "altruism=good" gem into the article, and michael let him get away with it. Clearly not many people noticed, but undeniably the seed was planted, if you are equating "donating $1.000.000.000 to the world" with "a good man".

See, the difference?

--
I want to delete my account but Slashdot doesn't allow it.

Actually, Google is a search engine by wombatmobile · 2004-09-27 06:46 · Score: 4, Informative

The rest of us call this... GOOGLE.

Google searches undifferentiated text. In contrast, the semantic web is all about differentiating text by adding meta tags.

For example, the word "Hilton" on a web page is ambiguous. It could be a hotel, or a celebrity. Which is it? With the semantic web we'd know:

<motel> Hilton </motel> <celebrity> Hilton </celebrity>

Of course, this is a fairly trivial example. A more meaningful example:

<partnumber> LHMJ67523119900012 </partnumber>
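
To make the contrast with plain text search a bit more concrete, here is a minimal sketch in Python. The tag names are the ones from the example above; the `index_tagged` helper and the sample `page` string are made up for illustration, and real Semantic Web markup (RDF) looks quite different from these ad hoc tags.

```python
import re
from collections import defaultdict

# Matches <tag> text </tag> pairs like the ones in the example above.
TAGGED = re.compile(r"<(\w+)>\s*(.*?)\s*</\1>")

def index_tagged(text):
    """Map each marked-up string to the set of tags it appears under."""
    index = defaultdict(set)
    for tag, value in TAGGED.findall(text):
        index[value].add(tag)
    return index

page = (
    "<motel> Hilton </motel> <celebrity> Hilton </celebrity> "
    "<partnumber> LHMJ67523119900012 </partnumber>"
)
index = index_tagged(page)

# A plain text search only knows that "Hilton" occurs on the page;
# the tagged index can tell the two occurrences apart.
print(page.count("Hilton"))
print(sorted(index["Hilton"]))
```

The index is only as good as the tags: nothing here checks that whoever wrote the page picked the right tag.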

Re:Actually, Google is a search engine by Greyfox · 2004-09-27 07:43 · Score: 1

Google can still give me the right pages when I search it on "paris hilton hotel". This despite the fact that two out of my three keywords are very vague in the way you describe.
Don't ever search it for just "LaTeX" without qualifying your search, though...

--
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Re:Actually, Google is a search engine by wdavies · 2004-09-27 08:48 · Score: 1

and in a second you have proved the flaw in the whole idea.

You managed to use the token MOTEL in place of HOTEL. See, that's all it is: a *TOKEN*. There is some association for you as a human between the two lexical tokens and a building one sleeps in for money, but that's not encoded. Sure, you can start building symbolic rules, but several winters of AI research should be enough to convince anyone that the Top Down Approach is not going to fly anytime soon, IF EVER.

The *world* and humans in it are the arbiters of varying degrees of truth and linkage. Any generic attempt to encode the world is probably going to fail.

As John Von Neumann put it so elegantly, "Perhaps the simplest model of the world, is its self".

And I'm from the AI school of Comp Sci. It's useful in its place, but I haven't seen a single killer app of WordNet, let alone Cyc.
Re:Actually, Google is a search engine by wombatmobile · 2004-09-27 12:23 · Score: 1

The *world* and humans in it are the arbiters of varying degrees of truth and linkage. Any generic attempt to encode the world is probably going to fail.
I think when you say "generic" you mean "explicit", but you don't give a reason for predicting "failure".
As John Von Neumann put it so elegantly, "Perhaps the simplest model of the world, is its self".

That isn't very portable or manageable. Von Neumann invented a practical architecture for representing discrete subsets of the world, including tax returns, plane tickets and celebrity gossip. For those kinds of things, simple text search is useful for humans but imperfect for computers; semantic enrichment addresses the imperfection precisely to make text accessible to computerized logic.
Re:Actually, Google is a search engine by wdavies · 2004-09-27 12:36 · Score: 2, Interesting

Perhaps I meant generalized.

JVN quote. That was his exact point. You can model very restricted subsets successfully, but the whole thing is too much to encode. I've no problem with designing data structures. It's just when someone says data structures solve the grand AI problem that I have an issue.

Sure, you want to do an XML schema for books - go ahead. For CDs, sure. In fact, for any domain. Although bear in mind the documentation of the API is going to get bigger and bigger until it is unmanageable (or you end up with natural language, and we are back where we started, using IR techniques...).
Re:Actually, Google is a search engine by greggman · 2004-09-27 15:29 · Score: 1

No, more like this

Hilton

Hilton

LHMJ67523119900012

No, there's something there by Allen+Zadr · 2004-09-27 06:48 · Score: 1

While this stuff doesn't "execute", the assertions made by semantic logic could be far reaching. Consider the following (monoculture) type example:

Jim is 24601
Jim is 15931
Account 24601 is online gaming
24601 cheats and scams
Account 15931 is online banking
15931 has made 450 transactions this month
15931 has a positive balance

Thus, when Jim is passing me a check...

Jim has enough to cover this check, has made 450 transactions, but is known to cheat and scam

Incomplete, but technically correct picture of Jim. The bad part has no relevance to me, unless I'm selling him an item in an online game. The semantic web has no way of telling what's relevant to me in a given situation.
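
A rough sketch of that kind of merge, in Python. This is a deliberately naive join written for illustration (the `naive_profile` helper is hypothetical), not how an actual RDF reasoner works; the assertions are the ones listed above.

```python
from collections import defaultdict

# Each assertion is (subject, statement); the data is the example above.
assertions = [
    ("Jim", "is 24601"),
    ("Jim", "is 15931"),
    ("24601", "is online gaming"),
    ("24601", "cheats and scams"),
    ("15931", "is online banking"),
    ("15931", "has made 450 transactions this month"),
    ("15931", "has a positive balance"),
]

def naive_profile(person, assertions):
    """Collect every statement reachable through the person's accounts,
    throwing away which account (context) each statement came from."""
    by_subject = defaultdict(list)
    for subject, statement in assertions:
        by_subject[subject].append(statement)

    profile = []
    for statement in by_subject[person]:
        account = statement.split(" ", 1)[1]  # "is 24601" -> "24601"
        profile.extend(by_subject.get(account, []))
    return profile

for line in naive_profile("Jim", assertions):
    print("Jim:", line)
```

Everything ends up attached to Jim himself, so the gaming assertion sits right next to the bank balance with nothing to say which one matters for a check.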

--
Kinetic stupidity has a new brand leader: Allen Zadr.

Re:No, there's something there by JimDabell · 2004-09-27 06:54 · Score: 1

The semantic web has no way of telling what's relevant to me in a given situation.

Yes, it does. To take your example, the jump in logic you are making that the Semantic Web doesn't is assuming that the property "cheats and scams" attached to the relationship between "account 24601" and "online gaming" is identical to the property "cheats and scams" that might be attached to the relationship between "account 15931" and "online banking". That's an unjustified leap of logic that only software that is broken to the point of being useless would make.
Re:No, there's something there by Allen+Zadr · 2004-09-27 07:52 · Score: 2, Insightful

Your faith in computational logic is astounding. Not to say that you may not be right, but you can't simply dismiss the possibility that 'shady' logic relationships such as this one will occur, especially when there are billions of similar relationships.
Your declaring such functionality to be an error of logic does not (in my view) make it less likely.
Back to my very example... the 'scams and cheats' property assertion of an online gamer against my account number is, by definition, a semantic inference, unless a human follows the various links that make up the conclusion. Couple this with the very fact that my fictional search would be along the lines of 'transaction trust', and the property does apply to the query.
Basically that is the point. It is broken beyond usable functionality. It cannot make the conclusions advertised. It can link to points to help a human create valid conclusions.

--
Kinetic stupidity has a new brand leader: Allen Zadr.
Re:No, there's something there by JimDabell · 2004-09-28 04:10 · Score: 1

Your faith in computational logic is astounding.

Not really. Why would anybody go to the trouble of writing extra code to link together disparate properties in an unsafe way that will almost certainly break things? And if anybody did, what's the chance anybody would use such broken software?

the 'scams and cheats' property assertion of an online gamer against my account number is, by definition, a semantic inference.

If anything in your example is a semantic inference, it is when you infer that a cheater in a game is a cheater in financial transactions. And it's an inference that has no basis in logic and cannot be reached by an inference engine that is using the data you have described.

Basically that is the point. It is broken beyond usable functionality. It cannot make the conclusions advertised.

No, you've described a conclusion that shouldn't be reached, and I am pointing out that it won't. That doesn't mean that there aren't useful conclusions that can be reached.
Re:No, there's something there by Allen+Zadr · 2004-09-28 07:28 · Score: 1

Semantic logic dictates that the algorithm itself does not necessarily know the subject matter at hand, only the context available to it from various RDF/XML feeds.
So, if one RDF feed from MSN says that I have a 'cheat and scam' property, and another RDF feed from Citibank says that I have an 'enough money' property, and yet another RDF feed from Yahoo links the two, perhaps the original MSN data does not describe strictly enough (within its RDF definition) that 'cheat and scam' is only a property of MSN Gaming Zone. As is likely, MSN may be planning on implementing the same property in relation to other services as well, so why standardize it now? [ Lots of room for unintentional human error messing with the system. ]
Basically, because the data comes from multiple sources, even a decent standard (such as RDF) as a backdrop does not guarantee correct digital interpretation. This stuff isn't magic, especially when you look at the data feeds that are currently available. There's a fair chance that a ChefMoz feed has both incorrect and misleading information about the restaurant you're looking for.
Maybe a given implementation puts a low trust score on ChefMoz, maybe it doesn't. Maybe it knows how to put different trust levels on individual data-points within a single RDF feed, maybe it doesn't. Either way, there will be a lot of room for mistakes.
That is to say, you are describing a bottom-up approach where every feed is already known and properly documented before you unleash the beast. Except, that isn't what is being envisioned. The vision is that through RDF feeds across the entire web a 'total' awareness picture can be made of anything. That can only be accomplished, as envisioned, by using adaptive software that doesn't have 'built in' conclusions of every possible subject (RDF feed).
What I'm saying is I think that this is infinitely more complex than you think, although I'm certainly willing to concede that some specialized implementations may be slightly less complex than I am suggesting.
I am genuinely enjoying this exchange, so I really hope I'm not pissing you off, that is not my intent.

--
Kinetic stupidity has a new brand leader: Allen Zadr.
Re:No, there's something there by JimDabell · 2004-09-28 08:29 · Score: 1

So, if one RDF feed from MSN says that I have a 'cheat and scam' property
I don't think I was quite getting your point before. This is the mistake I think you are making now: if you cheat and scam in a game, then the "cheating and scam" property should be describing your relationship with online gaming. Attaching it as a direct property of an individual is a mistake. The two properties are fundamentally different in nature; it's only the ambiguity of English that leads you to equate them. Clearly "cheat" in a gaming context is a different matter to "cheat" in a financial context.
With your example, yes, if an online gaming service is attaching that property to individuals (in my opinion this is broken - bad policy and thus the source would not be trustable), and if an online payment processor is trusting this property to be valid in describing your attitude to paying bills (arguably correct), then it would result in the situation you describe.
On the other hand, if the online gaming service merely describes you as cheating at online games (this is what I expect), or if payment processors don't trust online gaming services to provide accurate information in this respect, then there is no problem and everything happens as I describe.
Obviously the difference lies not in the technology per se, but in the policies that define how it is applied. My contention is that policies that allow these sorts of situations to arise are fundamentally broken and cannot reach widespread use because they will result in massive amounts of mistakes. However, I do not think that because those types of policies are broken, the fundamental technology is broken or that good policies don't exist. As I see it, a policy that limits organisations to making assertions only within areas they are qualified to judge solves the problem you are describing.
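
As a rough illustration of that last kind of policy, here is a small Python sketch. The source names, domain labels and the `accepted` helper are all hypothetical; the point is only that a consumer can drop assertions made outside the area a source is qualified to judge.

```python
# Hypothetical sources and the domains each is trusted to make assertions about.
QUALIFIED = {
    "gaming-service": {"online gaming"},
    "bank": {"finance"},
}

# (source, subject, domain, claim) tuples; all names here are illustrative.
assertions = [
    ("gaming-service", "Jim", "online gaming", "cheats and scams"),
    ("gaming-service", "Jim", "finance", "cheats and scams"),
    ("bank", "Jim", "finance", "has a positive balance"),
]

def accepted(assertions, domain):
    """Keep claims about `domain` only from sources qualified to judge it."""
    return [
        (subject, claim)
        for source, subject, claim_domain, claim in assertions
        if claim_domain == domain and domain in QUALIFIED.get(source, set())
    ]

# The gaming service's opinion about Jim's finances is ignored here.
print(accepted(assertions, "finance"))
print(accepted(assertions, "online gaming"))
```

Whether a source labels the domain of its claims honestly is, of course, exactly the policy question being argued about.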

That is to say, you are describing a bottom-up approach where every feed is already known and properly documented before you unleash the beast.
Yes and no. For things like you describe, I'd certainly expect user-agents to use a finite list of data sources or have some mechanism for deciding how trustworthy a certain source is. For example, a book finder might base trust of reviewers on how closely their rankings match your own past rankings.
On the other hand, things like Google would want to make all information available no matter if it is untrusted, but be able to categorise it or only trust limited information.
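
The book finder idea could be sketched roughly like this in Python. The scoring formula, the 1 to 5 star scale and all the names are assumptions made up for the example; it is one simple way to turn "how closely their rankings match yours" into a number, not a description of any real system.

```python
def trust(my_ratings, reviewer_ratings, scale=4):
    """Return a trust score between 0 and 1: one minus the mean rating gap on
    shared books, normalised by the largest possible gap (`scale`, here the
    distance between 1 star and 5 stars)."""
    shared = my_ratings.keys() & reviewer_ratings.keys()
    if not shared:
        return 0.0  # no overlap, so no evidence either way
    gap = sum(abs(my_ratings[b] - reviewer_ratings[b]) for b in shared) / len(shared)
    return 1.0 - gap / scale

# Hypothetical ratings on a 1 to 5 star scale.
mine = {"book-a": 5, "book-b": 2, "book-c": 4}
alice = {"book-a": 5, "book-b": 1, "book-d": 3}
bob = {"book-a": 1, "book-b": 5}

print(trust(mine, alice))  # close agreement on the shared books
print(trust(mine, bob))    # near-opposite ratings on the shared books
```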

Except, that isn't what is being envisioned. The vision is that through RDF feeds across the entire web a 'total' awareness picture can be made of anything.
Maybe that's the eventual goal, but I don't think anybody is claiming that it's a simple case of connecting the dots. I see people working towards solving small, tightly defined problems before taking the next step. In many ways "the Semantic Web" is a term like "artificial intelligence" - as soon as people take the next step and start building distributed databases of knowledge using RDF etc, it will just be considered simple stuff and the real Semantic Web is the stage after that. There's no need to jump from present-day to the eventual goal immediately; that's impossible. Right now people are working on, e.g. distributed Friendsters where they just tie together people using FOAF files etc, so you can query "who do I need to talk to in order to get introduced to [x] person?" etc.
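
The "who do I need to talk to" query is essentially a shortest-path search over "knows" links. Here is a minimal sketch in Python, assuming the links have already been collected into a dictionary; the people and the `introduction_chain` helper are invented for the example, and no actual FOAF parsing is shown.

```python
from collections import deque

# Hypothetical "knows" links, as might be gathered from people's FOAF files.
knows = {
    "me": ["alice", "bob"],
    "alice": ["carol"],
    "bob": ["carol", "dave"],
    "carol": ["erin"],
    "dave": [],
    "erin": [],
}

def introduction_chain(graph, start, target):
    """Breadth-first search: shortest chain of acquaintances from start to target."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for friend in graph.get(path[-1], []):
            if friend not in seen:
                seen.add(friend)
                queue.append(path + [friend])
    return None  # no chain of introductions exists

print(introduction_chain(knows, "me", "erin"))
```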

What I'm saying is I think that this is infinitely more complex than you think, although I'm certainly willing to concede that some specialized implementations may be slightly less complex than I am suggesting.
No, that's pretty much my view on it, except I think there's plenty of useful things to be done with the "specialised implement

Complaint? Don't worry be happy by wombatmobile · 2004-09-27 06:58 · Score: 1

People can't bother making their pages w3c complaint (even slashdot)

You can complain all you like to W3C, they won't make Slashdot compliant. For Slashdot to become compliant, first of all it has to want to become compliant. Well, before that, there have to be standards to comply with, and W3C has given us those.

But did you know that Slashdot isn't the only web site? Tens of millions of web sites are W3C compliant or close enough that the web functions.

That is a great achievement by W3C.

For the semantic web to gain adoption, there have to be benefits and then the infrastructure has to be built. We have a clear view of the benefits already, but only part of the infrastructure.

The part that exists - TCP/IP - is going great.

The part that doesn't exist yet - a fully standards compliant web browser - isn't being funded by any of the companies that can afford to implement it because they already have cash cow franchises.

But the semantic web creates new opportunities, and when the old men who capitalized on earlier opportunities are dead or retired or superseded, the semantic web will emerge.

Introducing... by Sophrosyne · 2004-09-27 07:05 · Score: 1

...The Semantic Web, where everyone speaks esperanto and the Java is free!

A different second system effect. by Anonymous Coward · 2004-09-27 07:12 · Score: 0

Har!

"The entire sciences community thinks that Artificial Intelligence is a joke, which is why nobody (except for Minsky) with any real credibility in the field is studying it. Other computer approaches are being researched, but they aren't anything like AI because you can't expect mind-bogglingly complex causal models to automatically form reasoning systems correctly."

Your last sentence is spot-on in either context.

....strikes again by Southpaw018 · 2004-09-27 07:27 · Score: 1

Sir Tim (he was knighted by Queen Elizabeth II in July)

There are some who call me....Tim?

--
ACs are modded -6. I don't read you, I don't mod you, I don't see you. Don't like it? Don't be a coward.

Re:....strikes again by Anonymous Coward · 2004-09-27 09:37 · Score: 0

ridiculous, isn't it?

when are we going to stop this idiotic honours rubbish?

the only honorable thing to do when offered a knighthood is to politely and privately refuse.

DataLibre by comforteagle · 2004-09-27 07:32 · Score: 1

Check out DataLibre: "Own Your Data, Write Once - Read Everywhere"

Some gripes with their time-line by doom · 2004-09-27 07:34 · Score: 1

Vannevar Bush, check, Douglas Engelbart, check, Ted Nelson... oops, they missed Ted Nelson. That's a surprise. You either need to (a) get rich or (b) be an establishment-approved "visionary" like Vannevar Bush to get your name in the history books.

Next problem: Marc Andreessen releases "his" Mosaic web browser? He was hardly the sole author of that code.

Alternatively... by Numen · 2004-09-27 07:35 · Score: 1

Why we're going to reinvent Prolog and take 20 years doing it.

Why this is a bad idea - it's a taxonomy by Animats · 2004-09-27 07:42 · Score: 4, Insightful

The big problem with the so-called "semantic web" is that trying to taxonomize ideas doesn't work very well. Full-text search works much better.

In the beginning, we had library card catalogs, with their painful attempts to index and cross-reference books. That works well in some areas, typically ones where names of people are significant. Attempts to apply the same approaches to technical papers worked less well.

There's a very elaborate classification system for patents. When you had to look through patents on paper or microfilm, it was essential. Now that we have full text search, it's used less and less.

A modern example of this approach is the ACM Taxonomy, a structure into which all computer science can be fitted. (As an exercise, try to put the current Slashdot stories into that taxonomy.) Nobody actually uses that taxonomy to find anything.

As to data interchangeability, that's a separate issue, and more of a standards one. The big problem for publicly available data is that the cost of encoding the data is borne by different people than those who benefit from the encoding. Many companies don't like having all their product and pricing information easily searchable by price. (Froogle may change this, because Google has so much clout.)

I've spent some time dealing with public financial reporting. There's opposition to detailed disclosure in a standardized format. Many companies don't want their detailed information to be too easily analyzed. Embarrassing results show up.

The future is better search engines, not user-created indexing data. As we've painfully learned, a search engine must look at the same data a human reader would, or it will be lied to. Lied to to the point of uselessness.

Re:Why this is a bad idea - it's a taxonomy by Nurgled · 2004-09-27 12:42 · Score: 1

That's all very well for searching, but searching is only one thing you can do with data. I like the ideas behind the semantic web not for search but instead for processing. If I'm given raw data, I can write tools to do things to that data that the original publisher may not have intended, such as cross-referencing completely different data sources using some linking criteria external to the information given. For example, I might know that the URL of an article on slashdot contains its primary identifier in the database and make use of this. Of course, Berners-Lee is pushing the automated processing angle, but I see it more as a chance to publish more atomic relationships so that humans can write tools for specific processing jobs. It's like the reason why I prefer "real protocols" over HTML interfaces to data such as email.
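
As a small example of that kind of external linking criterion, here is a Python sketch that pulls an identifier out of a comments URL. That the `sid` query parameter is the story's primary identifier is an assumption on my part, and `story_id` is a made-up helper.

```python
from urllib.parse import urlparse, parse_qs

def story_id(url):
    """Return the value of the `sid` query parameter, or None if it is absent.
    Treating `sid` as the story's database identifier is an assumption."""
    values = parse_qs(urlparse(url).query).get("sid")
    return values[0] if values else None

print(story_id("http://slashdot.org/comments.pl?sid=123076"))
print(story_id("http://slashdot.org/"))
```

With an identifier like that in hand, records from otherwise unrelated sources that mention the same URL can be joined on it.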

Of course, there are some people who like to try to obscure things through presentation, such as the product and pricing example you gave. These entities will doubtless continue to publish opaque information that resists analysis, but all it takes is for one person to, possibly manually, collate the data into a more useful form and the efforts are foiled. If people want your data enough, they're going to find a way to get at it.
Re:Why this is a bad idea - it's a taxonomy by Animats · 2004-09-27 13:13 · Score: 1

Automated processing requires much cleaner data than you are likely to get from a multitude of sources that don't have to go through validation.
This is a well-explored area. Look into the history of SGML, or of EDI. Look at how hard it's been simply to get invoices and purchase orders into formats that work between companies. It not only takes standards bodies. It usually takes one dominant player who forces their suppliers to do it their way.
Re:Why this is a bad idea - it's a taxonomy by Nurgled · 2004-09-27 13:44 · Score: 1

I wasn't talking about automated processing, really. The kind of applications I have in mind involve simply taking one or two specific data sources and performing a very specific operation on that data source. Inter-operability is a bonus, but it's not required. If someone will give me machine-readable data on something I can do all sorts of things with it, but I can't do the same with human-readable data.

To take a trivial example, slashdot's front page isn't especially machine-readable, but the old slashdot.xml file gives a subset of the data there in a predictable, machine-readable format. This means that I am able to, for example, keep an archive of article titles, or whatever. The code I write to do this will be slashdot-specific, but it'll be a lot more reliable in the long term than trying to find that same data in the front page HTML. All sites using the same data format (such as RSS) is a bonus because it makes it easier to perform similar tasks on other sites, but not a requirement for the specific task of archiving slashdot article titles.
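
A minimal sketch of such an archiving script in Python, using only the standard library. The feed layout (a root element holding `<story>` elements, each with a `<title>` and a `<url>`) is assumed for the example and the sample document is invented; check the real file before relying on any of these element names.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the feed; the sample stories below are invented.
SAMPLE = """<backslash>
  <story>
    <title>Tim Berners-Lee and the Semantic Web</title>
    <url>http://example.org/story/1</url>
  </story>
  <story>
    <title>Another Story</title>
    <url>http://example.org/story/2</url>
  </story>
</backslash>"""

def story_titles(xml_text):
    """Return (title, url) pairs for every <story> element in the feed."""
    root = ET.fromstring(xml_text)
    return [
        (story.findtext("title", default=""), story.findtext("url", default=""))
        for story in root.iter("story")
    ]

# Keyed by URL so that running the script twice does not duplicate entries.
archive = {}
for title, url in story_titles(SAMPLE):
    archive.setdefault(url, title)

print(archive)
```

Doing the same against the front page HTML would mean guessing at layout markup that can change at any time, which is the point being made above.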

Of course, most people aren't the type to just hack up a one-off script to perform a specific task, but for people who do such things having machine-readable data, even if specific to the application or task at hand, makes things much easier.

And the bigger problem: Trust by SoTuA · 2004-09-27 07:43 · Score: 2, Interesting

Standards for metadata have been implemented, and it's true that people can't be bothered to mark up their pages, but the bigger problem is trust: how do you know that the metadata is true? It is the same as on the web right now: with no other references you can't know whether the data is right, although, being a human being, you can judge the quality of the data (i.e. a properly-written study that states that X is better than Y will garner more trust/respect than a document written in "OMFG X is tEh r0x Y is the Zux0rz!!!!111!1111!!1one and onety-one" style). A computer reading the metadata is another matter entirely.

Trust is one of the major stumbling blocks of semantic applications and automatic knowledge management issues.

The need for information management pops up again. by master_p · 2004-09-27 08:15 · Score: 4, Interesting

If you have followed this little crazy guy that is me, you may have seen that most of today's computer problems are because modern operating systems offer nothing in the information management department.

Remember the CVS story a couple of days ago? It's information management: http://slashdot.org/comments.pl?sid=123076&cid=10347565

WinFS is also about information management: http://slashdot.org/comments.pl?sid=121101&cid=10199083

The story that the Evolution e-mail client offers the e-mail data as a data model separate from the application? another information management issue.

The web? information management issue.

Distributed databases? information management issue.

Web search engines? information management issue.

Windows search tool? information management issue.

The Windows registry? information management issue.

The unix etc directory? information management issue.

Enterprise workflows? again, an information management issue. That's why there is no general workflow solution accepted and used worldwide.

Dynamic web site contents? information management issue.

The semantic web? another information management issue!

As you can see from the numerous examples given above, the one thing an operating system should do, but none does, is manage information instead of files. If that is coupled with a distributed networked environment, 90% of the world's software would be considered obsolete overnight, and the productivity and fun of using computers would increase tenfold.

If any open source developer is reading this, you may contact me for a private discussion on the idea. THIS IS OPEN SOURCE'S BIGGEST CHANCE TO LEAD THE TECHNOLOGICAL RACE!

Nice Try, Tim by Master+of+Transhuman · 2004-09-27 08:16 · Score: 2, Insightful

As you do note in your comments, however, it's not really doable without a good simulation of conceptual processing.

Still, every little bit helps. Certainly a "Semantic Web" would be more useful than the current one.

--
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!

XML?? by No.+24601 · 2004-09-27 08:29 · Score: 1

Hmm what he was describing sounds a lot like XML/XSLT. I wonder if his work is taking that into account??

Re:XML?? by Anonymous Coward · 2004-09-27 09:40 · Score: 0

er yeah. it's just like that. put everything in name-value tags and you have solved everything.

SemWeb == Huge Prolog program by calambrac · 2004-09-27 08:41 · Score: 2, Interesting

The semantic web sounds a little like a massively distributed Prolog program, with each separate semweb component defining a rule or relation, and each semweb-aware program just being a query into the environment... Other questions: how do you avoid redundancies, or pulling data you don't want, or keeping data confined to specific locales or interpretations, or keeping labels synced with the actual data? What prevents someone from declaring something foo when it's actually bar?
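
To make the analogy concrete, here is a rough Python sketch of the "facts plus queries" half of it. This is only pattern matching over a local list of made-up facts, not real Prolog (no rules, no backtracking over rule bodies) and not any actual semweb API; all names are invented.

```python
# Facts are (subject, predicate, object) triples; all names are invented.
FACTS = [
    ("john", "type", "person"),
    ("john", "email", "john@doe.net"),
    ("jane", "type", "person"),
    ("acme", "type", "company"),
]


def is_variable(term):
    """Terms starting with '?' are treated as variables."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact, bindings):
    """Try to unify one pattern with one fact; return new bindings or None."""
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_variable(p):
            if p in result and result[p] != f:
                return None  # variable already bound to something else
            result[p] = f
        elif p != f:
            return None  # constant does not match
    return result


def query(patterns, facts, bindings=None):
    """Yield every set of bindings that satisfies all patterns together."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from query(rest, facts, extended)


# "Which persons have an email address?"
results = list(query([("?who", "type", "person"),
                      ("?who", "email", "?addr")], FACTS))
print(results)
```

Note that nothing in the sketch checks whether a fact is true; whoever adds ("acme", "type", "person") to the list simply gets believed, which is exactly the foo-versus-bar worry.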

In one word, the problem is "incentives" by doom · 2004-09-27 08:42 · Score: 1

He mentions in passing that there are "challenges" about privacy and "intellectual property", but then skips away from the subject.

You can get semantic markup to work: (1) within a group of dedicated volunteers who understand and care about it; (2) within a large organization, where the ontology can be forcibly standardized, and its use can be dictated.

Getting it to work out on the Web as We Know It has so many problems, it's kind of crazy... Even if you can skip past the problems of deceptiveness (let's say by authentication and strong laws against fraud), much of the information that's published is advertising supported. Where's the incentive to mark that information up with semantic tags so that people can skip past the ads? It's hard to see how you can get to semantic web heaven without some kind of automated micro-payment system.

Re:In one word, the problem is "incentives" by fikx · 2004-09-27 11:12 · Score: 1

Just as a comparison, the web was doing just fine before it became overrun with ads. Before the companies made billboards of it, there was plenty of content that was put up just so people could get to the information. And that is still there under the pile of crap... err, ads that are all over the web now. Lots of sites are out there that just give information because it's useful. Even the incentive-driven companies put out plenty of these web pages: tech-support pages, company home pages, product comparison pages, etc.
This sounds like just a new format of web page to me, with machine readable info in it. Yeah, there will be spamming, but there will also be useful stuff too...just like now.

--
AB HOC POSSUM VIDERE DOMUM TUUM
Re:In one word, the problem is "incentives" by doom · 2004-09-29 09:04 · Score: 1

fikx (704101) wrote:
just as a comparison, the web was doing just fine before it became overrun with ads. Before the companies made billboards of it, there was plenty of content that was put up just so people could get to the information.
You have a point of course, but then, that's the reason I mentioned that a semantic web can work (1) within a group of dedicated volunteers who understand and care about it.
But let's noodle this around a little... my guess is that the ad hoc volunteers who are willing to put up web sites about things would not be a very good match for the "dedicated volunteers" I was referring to.
The reason being that the web is unstructured: you do whatever you want with it until it looks good. Putting up a list of all Saint novels put out under the name Leslie Charteris is a comparatively easy task, compared to marking up that list with appropriate meta information.
So what you need is something like, say, Musicbrainz, which runs on an actual database with fill-in-the-blanks forms to get the right information in the right places. But if you've gone that route, then what you have is a centralized database, and any semantic web XML mark-up jazz you add to that is just going to be an add-on.
I'm afraid that the future may belong to the imdb.com's of the world, rather than to the de-centralized semantic web dream... much in the same way we've signed over the job of identification to credit card companies, and the PGP web-of-trust dream seems to be fairly stagnant.

Opposing the Opposing View by RAMMS+EIN · 2004-09-27 08:44 · Score: 1

Clay Shirky argues that the semantic web is all about syllogisms, and then goes on about how syllogisms are not very useful, as they can lead to wrong conclusions. I don't think the semantic web is, or even should be, about syllogisms. I think it is merely about storing information in a way that is meaningful.

As an example of how the semantic web could work, consider the following snippet of code:

(person
(name "John" "Doe")
(gender male)
(email "john@doe.net"))

This is clearly meaningful to humans, but not immediately meaningful to machines. If, however, one writes programs that interpret and process the data here represented, the data becomes "meaningful" to machines. Such programs become more useful the larger the data set on which they operate becomes.
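
As a rough illustration of "programs that interpret and process the data", here is a small Python sketch that reads the snippet above into a dictionary. The parser and the helper names are made up for this example, and it treats quoted strings and bare symbols alike (both become plain strings), which a real format would probably want to distinguish.

```python
import re

RECORD = '''(person
  (name "John" "Doe")
  (gender male)
  (email "john@doe.net"))'''

# A token is a parenthesis, a quoted string, or a bare symbol.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def parse(text):
    """Parse one s-expression into nested lists of strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        if token.startswith('"'):
            return token[1:-1]  # drop the surrounding quotes
        return token

    return read()


def to_record(tree):
    """Turn (kind (field value...) ...) into a kind plus a field dictionary."""
    kind, fields = tree[0], tree[1:]
    return kind, {field[0]: field[1:] for field in fields}


kind, person = to_record(parse(RECORD))
print(kind, person["email"][0])
```

Once the data is in this form, the same few lines work on any number of such records, which is the sense in which the programs get more useful as the data set grows.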

The semantic web, then, will be realized by standardizing the way data is represented (like HTML does for documents, but more aggressively), so that data from various sources can be processed by standard software. This is immensely useful; think about search engines that really let you find documents that discuss certain issues, rather than just matching words you type in against everything in the document.

Whether all of this takes off remains to be seen, but I disagree with the notion that the semantic web would not be useful.

--
Please correct me if I got my facts wrong.

Basic Problems by Kwil · 2004-09-27 10:10 · Score: 1

Either the meta-data is encoded by hand, in which case it's faulty, prone to error, and susceptible to fraud, or the meta-data is encoded by a machine, in which case there's no need for a "semantic web" just an automated "semantic interpreter" that will interpret pages on the fly.

I mean really, Berners-Lee is just picking up John Wilkins' old saw-horse of a Philosophical Language from about 400 years ago. Philosophers of various stature have been picking up and dropping pieces of it ever since, and it's never caught on in all that time. Perhaps there's a reason for that.

--

That Jesus Christ guy is getting some terrible lag... it took him 3 days to respawn! -NJ CoolBreeze

need standardization? by yonyonson · 2004-09-27 10:14 · Score: 2, Insightful

for data to be shared and recognized as distinct fields of information, won't there need to be standardization across all hosts in order to use the data in any comprehensible way?

ie.

<product> Acme(tm) xxxxx </product>

on host #1
while on host #2 the same item is recognized as:

<saleitem> Acme(tm) xxxxx </saleitem>

how will the semantic web describe and relate items which are recognized as an item for sale but under different labels?
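
One possible answer, sketched in Python: somebody publishes a mapping from each host's local tag to a shared term, and consumers normalize before comparing. The mapping table, the host names and the shared term `Product` below are all hypothetical; this is not a real vocabulary, and it leaves open who maintains the mapping and why anyone should trust it.

```python
import xml.etree.ElementTree as ET

# Hypothetical published mapping: (host, local tag) -> shared term.
TAG_TO_SHARED = {
    ("host1", "product"): "Product",
    ("host2", "saleitem"): "Product",
}


def normalize(host, fragment):
    """Return (shared term, text) for a fragment, or None if the tag is unmapped."""
    element = ET.fromstring(fragment)
    shared = TAG_TO_SHARED.get((host, element.tag))
    if shared is None:
        return None
    return shared, (element.text or "").strip()


a = normalize("host1", "<product> Acme(tm) xxxxx </product>")
b = normalize("host2", "<saleitem> Acme(tm) xxxxx </saleitem>")
print(a == b)
```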

Re:need standardization? by ElDuderino44137 · 2004-09-27 15:17 · Score: 1

Exactly ...

There is both evil and stupidity in the world.
Either ppl will not define things correctly.
Or ppl will intentionally define things incorrectly.

And don't forget about evil stupidity,
-- The Dude

Are you Ted Nelson? by Anonymous Coward · 2004-09-27 10:33 · Score: 0

The self-promoting, ever-insistent, look-at-me, look-at-me inventor of hypertext, the world, and everything?

Bleah by musterion · 2004-09-27 10:55 · Score: 1

The Semantic Web is nothing more or less than the "Big Plan", as Daniel Burnham (an architect) put it: "Make no small plans, they have not the power to stir men's blood" (maybe not quite an exact quote). The Semantic Web is where all of those unemployed AI weenies from the 80's have migrated, and have foisted off scheme/lisp and rule-based systems as well as all of the baggage from symbolic AI onto the unsuspecting new generation of programmers. It is sort of like nuclear fusion, "We'll have it in 20 years" --said 40 years ago and still repeated.

How to Build a Semantic Web by dekashizl · 2004-09-27 11:22 · Score: 1

Here's how to do it properly.

How to Build a Semantic Web

Terms:
A "DOCUMENT" is a piece of information/site/etc.
A "SOCIETY" is an arbitrary group of people and their DOCUMENTs.
"QUALITY" is a value (computed similar to Google's PageRank) for a particular DOCUMENT, according to a particular SOCIETY.
"SEMANTIC NODES" link together to constitute a "SEMANTIC GRAPH", composing a global neural network of concepts. Each SOCIETY can emphasize and de-emphasize various features and connections of this GRAPH.
"SEMANTICS" are weighted (by the author) bindings between a DOCUMENT and a SEMANTIC NODE.

The key concepts that complete the picture are:
1. VALUE is contextually dependent on the SOCIETY(s) you are currently in. E.g. Doctors will emphasize medical aspects of breasts differently than pornographers.
2. SEMANTIC VALUE is a derivative property found by taking the full VALUE of a DOCUMENT and spreading it among all of its weighted SEMANTIC BINDINGS (that's the key right there that prevents authorial abuse/semantic SPAM).
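
A minimal Python sketch of point 2, assuming the spreading is simply proportional to the author's weights (the description above does not pin down the exact formula, so that part is a guess, and the function name is made up):

```python
def semantic_values(document_value, bindings):
    """Split a document's VALUE across its bindings in proportion to weight.

    bindings maps a semantic node name to the author's weight for it.
    """
    total = sum(bindings.values())
    if total <= 0:
        return {node: 0.0 for node in bindings}
    return {node: document_value * weight / total
            for node, weight in bindings.items()}


# Same document VALUE, one focused binding versus five scattered ones.
focused = semantic_values(10.0, {"squash": 1.0})
spammy = semantic_values(10.0, {"squash": 1.0, "gym": 1.0, "hotels": 1.0,
                                "music": 1.0, "drinks": 1.0})
print(focused["squash"], spammy["squash"])
```

Because the total handed out never exceeds the document's VALUE, tagging a page with every keyword under the sun only dilutes each binding instead of boosting the page everywhere.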

And then that's it. It just works. I plug into some set of societies (again, weighted), and I inherit their combined SEMANTIC GRAPH and VALUE assessments of documents. Then each document has SEMANTIC VALUE relevant to me. And a big powerful search engine pulls it all together. Whoever builds that search engine will create and own the emergent global consciousness.

Re:How to Build a Semantic Web by Anonymous Coward · 2004-09-27 11:30 · Score: 0

Whoever builds that search engine will create and own the emergent global consciousness.
One would hope that such an important network would be "open" in some form. Curious to see how it would be possible to fund it without commercial/corporate involvement and control.

I can imagine being forced to weight the "Tricon Global Food Corp Society" with a minimum of 10% weight whenever I access information through the "free" WiFi network in my local Taco Bell!
Re:How to Build a Semantic Web by Anonymous Coward · 2004-09-27 11:45 · Score: 0

I like the idea of how there is no forced hierarchy on the semantics. The limitation of physical organization (card catalogs, mail folders, etc) does not need to exist in this network. And at the same time, there is a control on what you call "authorial abuse/semantic SPAM" by spreading out the value over the semantic bindings, so you get the freedom of infinite hierarchy without the ability for cancers to grow uncontrolled (like SPAM in email has grown because of the lack of constraints on sending large amounts of mail).

Re:How to Build a Semantic Web (P2P) by dekashizl · 2004-09-27 11:40 · Score: 1

I can imagine being forced to weight the "Tricon Global Food Corp Society" with a minimum of 10% weight whenever I access information through the "free" WiFi network in my local Taco Bell!

That addresses the *access* side of such a global network (and it's a scary thought), but what about who actually owns the data? Preserving the network in a serverless P2P format could make it truly free AND *extremely* fault-tolerant.

In this way, various subgroups could actually charge for access to their SOCIETY subnet. Charging either money, or again, forced weighting. For example, to subscribe to the "KPWR R&B Music Appreciators" SOCIETY subnet (which would allow searches for "usher" to return the musician rather than the guy who finds your seat at a show), you might be required to accept a minimum weight of 5% in the "Seagram's Party People" (their sponsor) SOCIETY subnet. So now you can find "usher", but if you search for "mixed drink recipes", you'll get a lot of pages recommending Seagram's brand alcohol as a base for your drinks.
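
A rough Python sketch of how such forced weighting could be computed, under assumptions that are mine rather than part of the scheme as described: sponsor societies get exactly their minimum share, the societies you chose split the remainder in proportion to your own weights, and a document's combined score is the weighted sum of its per-society QUALITY. All function names and numbers are invented.

```python
def effective_weights(chosen, required):
    """Combine the user's society weights with sponsor-imposed minimum shares.

    chosen:   society -> weight picked by the user (any positive scale)
    required: sponsor society -> minimum share of the total (0..1)
    """
    reserved = sum(required.values())
    if reserved >= 1.0:
        raise ValueError("required shares leave no room for chosen societies")
    total = sum(chosen.values())
    weights = {s: (1.0 - reserved) * w / total for s, w in chosen.items()}
    for society, share in required.items():
        weights[society] = weights.get(society, 0.0) + share
    return weights


def combined_quality(doc_quality, weights):
    """Weighted sum of one document's QUALITY across the user's societies."""
    return sum(w * doc_quality.get(s, 0.0) for s, w in weights.items())


weights = effective_weights(
    {"KPWR R&B Music Appreciators": 1.0},
    {"Seagram's Party People": 0.05},
)
# Invented QUALITY numbers for two "mixed drink recipes" pages.
branded = combined_quality({"Seagram's Party People": 0.9}, weights)
neutral = combined_quality({"Seagram's Party People": 0.1}, weights)
print(weights, branded > neutral)
```

Printing the weights dictionary is the transparency part: you can see exactly how much of your ranking has been sold to the sponsor.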

As long as this is *transparent* (i.e. I can see how my weights are being set and the dependencies, and I can remove ones I don't like), this doesn't strike me as that horrible.

Half way there by TheCrunch · 2004-09-27 11:47 · Score: 1

Marking up Hilton as <motel> or <celebrity> is all very well. This is what XML is for.

One of the key points behind the semantic web is to define meanings to your meta tags. My system has a <partnumber> tag and so does yours, but that doesn't mean they're the same. I can publish my definition of <partnumber> so that other apps can know how to interpret my partnumbers. Complex definitions can be provided in computer-readable format, which can then be looked up, referenced, shared etc. with other systems.

Take Dublin Core, for example. A standard set of tags to describe document attributes, such as title and author. Why should I write my own <author> tag when I can simply pull-in part of Dublin Core's vocabulary. Not only does that save me (the developer) time, but it means any app that knows about Dublin Core will know what I mean when I say "author". Or, if an app doesn't know about a particular term it can simply go look it up.
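
A small Python sketch of what "knowing about Dublin Core" can look like in practice. The record below is made up; the namespace URI is the Dublin Core element set's, and note that the Dublin Core element for an author is actually called `creator`.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace

# A made-up record; only the dc: elements come from the shared vocabulary.
RECORD = f"""<record xmlns:dc="{DC}">
  <dc:title>Example Document</dc:title>
  <dc:creator>John Doe</dc:creator>
  <internal-id>42</internal-id>
</record>"""


def dublin_core_fields(xml_text):
    """Collect every Dublin Core element in the record as name -> list of values."""
    prefix = "{" + DC + "}"
    fields = {}
    for element in ET.fromstring(xml_text).iter():
        if element.tag.startswith(prefix):
            name = element.tag[len(prefix):]
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


print(dublin_core_fields(RECORD))
```

The app never needs to understand the rest of the record; it recognizes the terms by their namespace and ignores the tags it doesn't know.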

Sharing vocabularies is time-saving, but also helps computers process information automatically. Mr Berners-Lee and some colleagues had a good article published in Scientific American a while ago which explains their vision of intelligent software agents doing the sorts of things computers should be doing with the information the web has to offer. Such as automatically adjusting your schedule if your gym's online timetable has changed and your squash game needs to be moved. OK, that's a very basic example, but the point is that although the information needed to do this sort of stuff is already on the web, it is currently only readable by humans.

If anyone is interested in learning more about this stuff then have a look at the Resource Description Framework (RDF) which is a foundation technology of the Semantic Web (There's more to it than HTML META tags!). There's a lot of activity involving RDF-based technologies such as OWL, FOAF and the popular RSS.

--
My life is one big siesta in which I'm dreaming I wished my life was one big siesta.

What was that? Accountability?! by finelinebob · 2004-09-27 11:48 · Score: 1

...However, if I jilted someone in Diablo, do I want them to so easily find me and take it out on my car (as some people would)?

You mean, I'd be able to track down the snotty little uber 1337 haxxor d00d who thinks KS'ing in the Geo Caves is the essence of being an Imp rather than being a punk? Who equates poor spelling with role-playing? Who thinks he knows something about computers just because he's had one for all of his 13 years of life?

I'm all for it! WOOT!!

You just don't get it. by brlewis · 2004-09-27 12:35 · Score: 1

This is why everybody likes to retell the "invented the Internet" joke:

Bush lying about healthcare isn't funny.
Bush pointing to legislation passed over his veto as reason why you could trust him to be "The Education President" isn't funny.
Bush's campaign manager saying that the reason he got pulled over while driving drunk was "he was probably going too slow" when he actually hit a tree is not funny. What if it had been a person?
Bush putting forward as justification for going to war intel that his own administration didn't believe was true isn't funny.
Gore saying he invented the Internet, that's funny. Even though that's not really what he said.

Re:You just don't get it. by Anonymous Coward · 2004-09-27 17:51 · Score: 0

Bush getting elected because of a strange sense of humour... not funny.

Proprietary web would have gone nowhere by brlewis · 2004-09-27 12:41 · Score: 1

I'll take what you're saying a step farther. If Berners-Lee had tried to make the web proprietary, the NCSA never would have written Mosaic, the IMG tag would be on hold for years, and something else would have been developed and grabbed everybody's interest. Nobody outside of CERN would ever hear about the WWW.

Re:The need for information management pops up aga by marcosdumay · 2004-09-27 12:43 · Score: 1

How do I "contact you for a private discussion"? I once tried to design information management down at the disk and memory management level of an OS. It has lots of problems. The worst are like this NAME and PERSON_NAME stuff, which no system can link with certainty but which must be linked in order for different (not necessarily relational) databases to communicate. I could start designing the OS, but a computer running it was unable to communicate.

So, if you have some ideas, I am very interested in listening. But don't think this will "lead the technology race". An OS like this is surely very new and interesting, but wide adoption is very unlikely (think about all the X substitutes).

--
Rethinking email

Re:And the bigger problem: Trust MOD PARENT UP! by mewphobia · 2004-09-27 12:45 · Score: 1

I couldn't agree with the parent more.

I'm genuinely surprised that there haven't been more inroads into trust chain architecture. The current web works because it requires people to get the semantics - and people are good at working out whether or not they trust a source.

There is a logical separation between authors. With the semantic web, everything is presented together. How do you know if you can trust it? How do you know that all authors are presenting data the way it is?

What if they just interpret the data differently?

Re:And the bigger problem: Trust MOD PARENT UP! by DrEasy · 2004-09-27 14:20 · Score: 1

Well, you could try addressing the trust issue using semantic web notions as well... You could create a semantic link between your slashdot profile and a document that you recommend, or the profile of the person who wrote it (hey you could even call it "moderation"). And then somebody else could create a "recommendation" link between him/herself and your profile, etc... You could obtain a rating for each document, using an inference engine that would follow a FOAF chain.
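
A toy Python version of such an inference engine, just to show the shape of it. The graph, the weights and the scoring rule (multiply the weights along a chain and keep the best chain) are assumptions made up for this sketch; they are not part of FOAF or of how Slashdot moderation works.

```python
from collections import deque

# Hypothetical "recommends" links: who vouches for whom or what, and how strongly.
RECOMMENDS = {
    "me": {"alice": 0.9, "bob": 0.5},
    "alice": {"doc1": 0.8, "carol": 0.7},
    "bob": {"doc1": 0.2},
    "carol": {"doc2": 1.0},
}


def trust_scores(source, links, max_hops=3):
    """Best multiplicative trust from source to each node within max_hops links."""
    best = {source: 1.0}
    queue = deque([(source, 1.0, 0)])
    while queue:
        node, trust, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not follow chains longer than max_hops
        for target, weight in links.get(node, {}).items():
            score = trust * weight
            if score > best.get(target, 0.0):
                best[target] = score
                queue.append((target, score, hops + 1))
    return best


scores = trust_scores("me", RECOMMENDS)
print(scores)
```

Taking the best chain is only one choice; averaging over all chains, or requiring several independent recommenders, would give different ratings and different ways to game them.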

Just like TBL said, it's the network effect that makes the semantic web powerful. Trust evaluation can also leverage such an effect, as Slashdot has been able to demonstrate.

--
"In our tactical decisions, we are operating contrary to our strategic interest."

Re:The need for information management pops up aga by master_p · 2004-09-27 21:10 · Score: 1

My e-mail is axilmar@in.gr. Just send me an e-mail there, then we can have a discussion.

Re:The need for information management pops up aga by DJ_CEO · 2004-09-30 06:20 · Score: 1

YO MASTER P Not sure how to get in touch with you for an offline discussion. Feel free to email me at gregdeocampo@gmail.com , perhaps we can have an interesting conversation! best, greg

--
/* http://www.gregdeocampo.com */

250 comments