Slashdot Mirror


Going from a 'Web of links' to a 'Web of meaning'

neutron_p writes "Computer scientists from Lehigh University are building the Semantic Web, which will handle more data, resolve contradictions and draw inferences from users' queries. The new improved Web will also combine pieces of information from multiple sites in order to find answers to questions."

142 comments

  1. When by cbrocious · · Score: 2, Insightful

    When will we be dropping HTTP and HTML in favor of more metadata-friendly protocols and file formats? I can see huge potential in a system built specifically for getting data out there and linking it all together.

    --
    Disconnect and self-destruct, one bullet at a time.
    1. Re:When by SenatorOrrinHatch · · Score: 1, Insightful

      What on earth makes anyone think a computer has anything to do with "meaning"? That would require understanding, which would require thought, which would require consciousness. Which is exactly what a machine explicitly does NOT have.

      --
      The Christian in me says it's wrong, but the corrections officer in me says, 'I love to make a grown man piss himself.'
    2. Re: When by vasubhat · · Score: 2, Funny

      ... And when the computer does indeed possess understanding, thought, consciousness and the likes, and goes about doing something with it, the Vogons go and destroy it before the job is done.

    3. Re:When by noselasd · · Score: 1

      >When will we be dropping HTTP and HTML in favor of more
      >metadata-friendly protocols and file formats?
      When IPv6 is fully deployed, and the US got a black female president that
      just invented cold fusion.

    4. Re:When by legirons · · Score: 1

      "When will we be dropping HTTP and HTML in favor of more metadata-friendly protocols and file formats?"

      When google-spammers stop putting 8 million irrelevant words in their
      <meta name="" content="">
      tags?

    5. Re:When by Anonymous Coward · · Score: 0

      Probably never. Given the sheer number of pages out there coded in HTML, some backwards compatibility will always be built into an Internet browsing/searching device.

    6. Re:When by Anonymous Coward · · Score: 0

      This has nothing to do with AI in the traditional sense. RTFA. Oh, and nice job posting a totally unrelated comment to the first post, karmawhore.

  2. Ummm by bo0ork · · Score: 3, Insightful
    The new improved Web will also combine pieces of information from multiple sites in order to find answers to questions

    Sounds like a recipe for disaster to me.

    --
    Does everything include nothing?
    1. Re:Ummm by amalcon · · Score: 2, Funny

      Yeah, hopefully there aren't many easily-offended cat enthusiasts out there. They might not appreciate some of the more, er, "exotic" sites they find...

      --
      -Amalcon
    2. Re:Ummm by bobbis.u · · Score: 4, Insightful
      Yeah, I would tend to agree.


      One of the reasons the internet has become so popular is because everyone can have their say. Unfortunately, this has the side effect that there is a lot of incorrect and misleading information out there. Everything is also self-reinforcing, because one person often copies their "facts" from another website without first checking the veracity. Even major news outlets and scientific publications have been caught out by this in the past.

    3. Re:Ummm by ezzzD55J · · Score: 4, Insightful
      'Everything is also self-reinforcing, because one person often copies their "facts" from another website without first checking the veracity'

      There is another way in which it's self-reinforcing. People look for sites and pages and people that reflect their own opinions.

    4. Re:Ummm by LionKimbro · · Score: 2, Interesting
      Two things:

      1. "Webs of trust." People will make pages telling what pages they believe have a good reputation, and generally tells the truth. If someone fills the web with a ton of random statements, they will have a low reputation.
      2. Computers will have "beliefs" reflecting their owner's own. You will tell the computer, "I believe this is true," and the computer will absorb the package of information. You can say, "I believe this is false," and the computer will absorb the package of information, and put it into the "bogus" bin.
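The two ideas above can be sketched in a few lines of Python. This is only an illustration: the names `trust`, `endorsements`, `score_source` and `sort_statements`, the 0.0 to 1.0 scale, and the discount for second-hand trust are all assumptions invented here, not part of any Semantic Web specification.

```python
# Hypothetical sketch: a personal "web of trust" plus a "bogus" bin.
# All names, weights, and the 0.5 discount are illustrative assumptions.

# Direct trust the user has stated, on a 0.0 to 1.0 scale.
trust = {
    "alice.example": 0.9,
    "spamfarm.example": 0.0,
}

# Trust statements published by sources the user already trusts.
endorsements = {
    "alice.example": {"bob.example": 0.8, "spamfarm.example": 0.1},
}

DISCOUNT = 0.5  # second-hand trust counts for less than first-hand trust


def score_source(source):
    """Return direct trust if stated, else the best discounted one-hop trust."""
    if source in trust:
        return trust[source]
    best = 0.0
    for endorser, rated in endorsements.items():
        if source in rated:
            best = max(best, trust.get(endorser, 0.0) * rated[source] * DISCOUNT)
    return best


def sort_statements(statements, threshold=0.3):
    """Split (source, claim) pairs into accepted claims and the 'bogus' bin."""
    accepted, bogus = [], []
    for source, claim in statements:
        if score_source(source) >= threshold:
            accepted.append(claim)
        else:
            bogus.append(claim)
    return accepted, bogus
```

Note that the user's own rating always wins over an endorsement, which is the "beliefs reflecting their owner's own" part; whether that is a feature or a filter bubble is exactly what the replies argue about.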
    5. Re:Ummm by jsebrech · · Score: 2, Insightful

      "Webs of trust." People will make pages telling what pages they believe have a good reputation, and generally tells the truth.

      That won't work for stuff that's politically sensitive, since people will mod sites down just because they dislike what the site says, even if it is accurate. It also gets really complicated with sites that are accurate on one subject but don't know jack about another.

      Computers will have "beliefs" reflecting their owner's own.

      In that case, what's the point? If your computer only accepts data that fits in with your predetermined conclusions, it will provide valueless results.

    6. Re:Ummm by Jerf · · Score: 1

      Unfortunately, this has the side effect that there is a lot of incorrect and misleading information out there. Everything is also self-reinforcing, because one person often copies their "facts" from another website without first checking the veracity.

      I am interested in where you have found a source of information that does not match this description.

      I was not aware extra-terrestrials were running libraries I could get at. Certainly no known human data source has ever risen to this standard.

      (Criticizing the internet for not being perfect and implying it is a waste of time to improve it is stupid on a number of levels. How is it supposed to improve if nobody tries? Are you seriously suggesting we should give up on all forms of collected knowledge because none of them are perfect? If not, why are you only holding the Internet to your absurd standard and giving everything else a pass? I assure you, whatever source you are comparing the 'net to is also full of errors and inaccuracies. Simple example: Look at books on how to train animals. You'll find a book espousing every conceivable technique. They can't all be right... Also, look up "urban legend" sometime and watch unchecked knowledge spread far and wide, even into academic scholarship sometimes.)

    7. Re:Ummm by Anonymous Coward · · Score: 0
      People look for sites and pages and people that reflect their own opinions.

      Where did you hear that?

    8. Re:Ummm by Anonymous Coward · · Score: 0

      Look for websites and people that reflect their own opinion? Where would you get a crazy idea like that?

      Here at Slashdot, we don't need to follow anybody. We all think for ourselves! We are all individuals!

      We're all different.

      (Well, ok, I'm not.)

      We've worked it all out for ourselves.

    9. Re:Ummm by SoTuA · · Score: 1
      Exactly.

      One of the biggest stumbling blocks of the semantic web (semantic anything, in fact) is Trust: how do you know the other guy is telling the truth? Human beings are very good at evaluating trustworthiness from a website, but when we switch to a web made for understanding by machines, we lose that ability. We need some kind of trust infrastructure which assigns credibility to sources and so on...

      Another stumbling block is common ontologies, i.e. how do we know we are talking about the same thing, this time when the conversation is between machines. For example, bit means a different thing when you are talking about carpentry than what it means when you are talking about computers...

      Those two issues remain big problems when it comes to making machines search for meaning.

    10. Re:Ummm by smartdreamer · · Score: 1
      "Webs of trust." People will make pages telling what pages they believe have a good reputation, and generally tells the truth.

      That won't work for stuff that's politically sensitive, since people will mod sites down just because they dislike what the site says, even if it is accurate. It also gets really complicated with sites that are accurate on one subject but don't know jack about another.

      The Semantic Web is an extension of the web as we know it now. Nothing less, but a lot more! It keeps the idea that everyone can say anything. As for the web of trust (which sits on top of all the lower layers), it's about assigning trust to information. It is what YOU trust more. It is personalised to every user of the Internet. You can say you trust your friend but disagree with Bush. That is perfectly possible and sensible.
      In fact you could sum up the whole thing as being like Slashdot, where you have friends, fans, foes, etc. On /. there are always people from both sides, which makes for interesting debates.
      Don't see this as censorship, but as a mechanism to reveal content that means something to you.

      Computers will have "beliefs" reflecting their owner's own. In that case, what's the point? If your computer only accepts data that fits in with your predetermined conclusions, it will provide valueless results.

      What the computer "believes" is what its user tells it. So the point is that it presents content you would care about. Consider that everybody does this; it's psychological. Would you prefer a solution that presents content of value to your needs, or the current one where you get everything and have to filter out the meaningless content yourself? In fact, what the computer does is what you would do... without the hassle of actually doing it.

      For me, having to search through a search result shows one thing: search engines can't do what I want. Something is missing, and that's what the Semantic Web is all about.

      Still, we are far from there...

    11. Re:Ummm by smartdreamer · · Score: 1
      Jerf is perfectly right.

      I would add that even though the Internet has far more faulty claims and wrong answers, it has a special merit: everyone can participate and add to it. It is the first medium to achieve this, and that is very important. Where before you could be heard nowhere, on the web you can.

      Furthermore, you have to question the quality of the other information resources you compare the web to. I mean, newspapers are pretty much crap (some are very good though). Do you think journalists check their sources? Maybe most of the time, but forget it when there is a scoop nearby. There is a real lack of professionalism, and don't get me started on the censorship that goes on. When more than 90% of the USA reads newspapers from a single company, you have to ask about information control.

      I wanted to bring a little perspective. Is it better to have plenty of information with the risk of it being wrong or to have only one source shredding it for you?

      There is a similar debate around Wikipedia. Interesting results about the quality of the information you can get with a good community. :)

    12. Re:Ummm by Anonymous Coward · · Score: 0

      Or the simple fact that sharing knowledge is rarely rewarded.

      Especially if the sharer has to do 3x the amount of work as before in order to publish/share their knowledge.

      Human nature dictates that the semantic web will never happen. Businesses thrive on artificial shortages.

  3. Something similar. by modifried · · Score: 4, Informative

    Covered not long ago - an interview with Berners-Lee regarding the Semantic Web.

    1. Re:Something similar. by Anonymous Coward · · Score: 1, Informative

      And, mentioned at the time, was Clay Shirky's dissenting view, which makes for much better reading than TFA in this case.

  4. Why is this news? by multipart · · Score: 4, Informative

    People at DERI in Ireland's Galway are also working on the Semantic Web (see http://www.deri.ie/). I thought lots of people are...

    1. Re:Why is this news? by BarryNorton · · Score: 2, Informative

      They are - there are several major European consortia, many involving the University of Sheffield where I work on Semantic Web Services, as well as lots of US work especially deriving from DARPA and CMU work on agents...

    2. Re:Why is this news? by ngibbins · · Score: 1

      Ditto the University of Southampton. I've been working on a SW-related project, AKT, for the last four years; as part of this work, I was a member of the W3C working group (along with Jeff Heflin) that wrote the OWL Web Ontology Language.

      Other places to look at are Jim Hendler's MIND group in Maryland, which has been doing some sterling work over the last few years (as an aside, Jeff used to be Jim's PhD student).

    3. Re:Why is this news? by Anonymous Coward · · Score: 0

      >>as an aside, Jeff used to be Jim's PhD student

      Which is probably the only reason this thing has been posted, since the article basically says nothing new...

  5. Resolve Contradictions? by NoTheory · · Score: 3, Interesting

    I'll have to rtfa to see what they propose, but just the principle of resolving contradictions is a really difficult one. Most theories of knowledge (which are essentially networks of facts) aren't terribly robust, and contradiction repair, which involves running the entire network to find invalid assumptions and then propagating the changes, is NP-complete :| I'm not positive that contradiction resolution is a reasonable thing to expect out of a massive distributed network.

    --
    There are lives at stake here!
    1. Re:Resolve Contradictions? by fugu13 · · Score: 3, Informative

      The Semantic Web's use to resolve contradictions is probably least applied, at least in these early stages. Also, it is not meant to be a global information store (in which all contradictions may be resolved). It is meant to be large numbers of globally connected information stores, and between small numbers of these contradictions may be resolved.

      Also, the ontology of the semantic web comes in 3 flavors, OWL Lite, OWL DL, and OWL Full. The first two are limited enough that they are decidable (I'm not sure if this is guaranteed or just true for most use cases). OWL Lite in particular is light weight enough that processing of it is in reach for data stores, but powerful enough far more information can be inferred than what is directly stated in the RDF.
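To make "more can be inferred than what is directly stated" concrete, here is a toy sketch in plain Python. It is not an OWL reasoner and does not use any RDF library; the triples, the predicate names `subClassOf` and `type`, and the function `infer` are assumptions chosen for illustration, loosely modelled on two RDFS-style rules.

```python
# Hypothetical sketch: derive facts that are implied but never stated.
# The triples and the two rules below are illustrative, not an OWL reasoner.

stated = {
    ("Siamese", "subClassOf", "Cat"),
    ("Cat", "subClassOf", "Mammal"),
    ("Tom", "type", "Siamese"),
}


def infer(triples):
    """Apply two simple rules repeatedly until no new triples appear."""
    facts = set(triples)
    while True:
        new = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if b != c:
                    continue
                # subClassOf is transitive: A sub B, B sub C gives A sub C
                if p == "subClassOf" and q == "subClassOf":
                    new.add((a, "subClassOf", d))
                # membership is inherited: x in A, A sub B gives x in B
                if p == "type" and q == "subClassOf":
                    new.add((a, "type", d))
        if new <= facts:
            return facts
        facts |= new


# Triples that follow from the stated ones without being written down.
inferred = infer(stated) - stated
```

Three triples were stated, and three more fall out: Siamese is a kind of Mammal, and Tom is both a Cat and a Mammal. The naive nested loop is fine for a handful of triples; a real data store would need indexing.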

      --
      For to end yet again.
    2. Re:Resolve Contradictions? by NoTheory · · Score: 2, Insightful

      alright, having read the friggin' article, all i have to say is that they have their work cut out for them.

      the problem with searching currently is that only librarians, who've had at least a year or two of graduate studies really know the ontology that libraries use. Common users bring their own concepts and ontologies to bear when they're searching for information. But if you move away from the monolithic single ontologies that libraries use, you have the problem that you have to be open to the fact that ontologies change, not just between individuals but over time, as cultures change the ontologies need to change as well. I guess the concept must be that there are a set of descriptors which are invariant, and can thus be interpreted based on the features of those objects by different ontologies.

      The crazy part about trying something like that is that you have to make people define their own ontologies. Furthermore you have the problem that you need to make sure that people are describing their data in an ontologically neutral manner.

      And that's the hidden third problem (the technology review article posted above has the dude citing 2 problems), getting people to behave in a sensible way when dealing with information organization. Unfortunately in so far as we know now, it's really difficult to get computers to automatically create meta-data (that doesn't mean we're not trying), but primarily humans have to be included in the decision process if you want to define what things are.

      the ironic thought that pops to mind is that if you've got a set of universal descriptors, then don't you already have an ontology? And if you don't have a set of universal descriptors, how would you ever create a coherent ontology?

      anyway, enough rambling for now

      -notheory

      --
      There are lives at stake here!
    3. Re:Resolve Contradictions? by NoOneInParticular · · Score: 1

      Any system that is used in the real world and cannot at its core handle inconsistent information (and no deductive system can) is fundamentally flawed for this use. Expect from the semantic web the same as for automatic translation (60s) and expert systems (80s).

    4. Re:Resolve Contradictions? by ngibbins · · Score: 1

      The Semantic Web as envisaged by the W3C is based on the RDF and OWL languages; the latter has a Description Logic as its underlying formalism, which is a subset of first order predicate logic with computationally attractive properties that lead to tractable decision procedures for satisfiability.

      Distribution is a separate issue. While assembling the parts of a distributed ontology may be expensive, it doesn't affect the algorithmic complexity of determining whether a set of axioms contain a contradiction.

    5. Re:Resolve Contradictions? by ngibbins · · Score: 1

      The knowledge engineering community has moved on since the expert systems of the 1980s, and techniques for handling uncertainty and inconsistency are now commonplace. The SW draws heavily on this experience.

    6. Re:Resolve Contradictions? by Anonymous Coward · · Score: 0

      ontology

      You keep using that word. I do not think it means what you think it means.

    7. Re:Resolve Contradictions? by grcumb · · Score: 1

      "the ironic thought that pops to mind is that if you've got a set of universal descriptors, then don't you already have an ontology? And if you don't have a set of universal descriptors, how would you ever create a coherent ontology?"

      There's nothing particularly ironic about it. The question you're asking exposes a fairly common misunderstanding of what the Semantic Web's all about. Several years ago, I attended the talk by Tim Berners-Lee in which he announced the principles of the Semantic Web. As I recall, the fundamentals are remarkably simple:

      If a == b and b == c then a == c

      Now, this can lead to the question: if we have ontologies that are mappable, why do we need mappable ontologies?

      But that's the wrong question. Try asking this one: If we can map at least part of one ontology to another, why shouldn't we be able to map between all of them using the same means?

      People who have worked in data mapping and transformation, Information Retrieval and other disciplines that work with large volumes of 'unstructured' data realise this kind of goal is a Holy Grail. It's not easily, nor ever wholly achievable. But if we work on the incremental basis that Tim B-L suggests, mapping data atomically rather than holistically, the amount of machine-driven contextualisation that becomes possible is incredible.
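The "if a == b and b == c then a == c" idea, applied to atomic term mappings, can be sketched with a union-find (disjoint-set) structure. This is one possible way to do it, not something the talk prescribed; the vocabulary names such as `shopA:price` and the helper names `find`, `declare_same` and `same` are made up for the example.

```python
# Hypothetical sketch: pairwise term mappings merged with a union-find
# (disjoint-set) structure. The vocabulary names are made up.

parent = {}


def find(term):
    """Return the representative of the equivalence class containing term."""
    parent.setdefault(term, term)
    while parent[term] != term:
        parent[term] = parent[parent[term]]  # path halving keeps chains short
        term = parent[term]
    return term


def declare_same(a, b):
    """Record one atomic mapping: a and b mean the same thing."""
    parent[find(a)] = find(b)


def same(a, b):
    """True if a and b are connected by any chain of declared mappings."""
    return find(a) == find(b)


# Two independent, local mappings...
declare_same("shopA:price", "shopB:cost")
declare_same("shopB:cost", "shopC:amount")
# ...imply a third that nobody had to state: shopA:price == shopC:amount.
```

Nobody has to agree on one global ontology up front; each mapping is local, and the equivalences accumulate as more of them are published.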

      It's true that data spoofing is a real threat to its widespread applicability. There are, however, any number of places where the disincentives for spoofing far outweigh the incentives to do so. Likewise, not everything on the web is a blog entry; there are huge volumes of information that are not at all ambiguous.

      Price data is a good example. Imagine being able to comparison shop from sites anywhere in the world, and always viewing the total cost (including shipping and applicable taxes) in your own currency. The questions concerning trust and reliability of the data don't disappear, nor are they meant to. What does happen, though, is that the data itself is presented in a format that's useful and consistent to you. No more scurrying around between sites, running currency converters and freight calculators while trying to shave a few pennies off a price. That, in and of itself, is worth something.
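As a rough sketch of the comparison-shopping case: once every offer arrives in the same structure, the "total cost in your own currency" is a small computation. Everything here is invented for illustration, including the site names, exchange rates, tax rates, and the field names `price`, `shipping`, `tax_rate` and `currency`.

```python
# Hypothetical sketch: once offers share one structure, comparing them is
# a small computation. Exchange rates, tax rates and offers are invented.
from decimal import Decimal

# Units of the buyer's home currency per one unit of each listed currency.
rates_to_home = {
    "USD": Decimal("1.00"),
    "EUR": Decimal("1.25"),
    "GBP": Decimal("1.80"),
}

offers = [
    {"site": "shop-us.example", "price": Decimal("100.00"),
     "shipping": Decimal("5.00"), "tax_rate": Decimal("0.08"), "currency": "USD"},
    {"site": "shop-eu.example", "price": Decimal("80.00"),
     "shipping": Decimal("10.00"), "tax_rate": Decimal("0.20"), "currency": "EUR"},
    {"site": "shop-uk.example", "price": Decimal("50.00"),
     "shipping": Decimal("8.00"), "tax_rate": Decimal("0.00"), "currency": "GBP"},
]


def total_in_home_currency(offer):
    """Price plus tax plus shipping, converted to the buyer's currency."""
    taxed = offer["price"] * (1 + offer["tax_rate"])
    total = (taxed + offer["shipping"]) * rates_to_home[offer["currency"]]
    return total.quantize(Decimal("0.01"))


# The cheapest offer after tax, shipping and conversion are all included.
best = min(offers, key=total_in_home_currency)
```

The arithmetic is trivial; the hard part, as the comment says, is getting sites to publish the inputs in a consistent format in the first place.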

      I believe that we'll see the Semantic Web insinuate itself into our lives, rather than appear with the same kind of bang the WWW did. As more people begin to rely on the transformative/translational power of the Semantic Web, more people will invest in making it trustworthy and reliable.

      --
      Crumb's Corollary: Never bring a knife to a bun fight.
    8. Re:Resolve Contradictions? by ioslipstream · · Score: 1

      With this many nay-sayers in the world, I'm surprised anything new is ever achieved.

  6. Snake oil... by Alomex · · Score: 2, Insightful



    with their favourite mode of publication being the press release.

  7. Re:Really quite Trivial. by Anonymous Coward · · Score: 0

    Could you please provide a link to this Sinus Apache module? Neither Google nor SF's own search are of any help when searching for it. Thanks!

  8. Why does... by seanvaandering · · Score: 0

    ...everyone assume that the Interweb is intended for information? Yes, I understand it is the best source of collaborative information the world over, but I like finding porn the old fashioned way - not having a search engine, or intelligent nodes, telling me which hottie is the hottest.

  9. Unless by Taco+Cowboy · · Score: 3, Interesting

    You gotta understand that "meaning" has no meaning at all to machines, at least not yet.

    And even for humans, the "meaning" of a certain thing can be a different thing to different people!

    Although I applaud the job they are doing for Semantic Web, I wonder how they can inject "meaning" into the whole thing.

    My biggest fear is the 1984-like "my meaning is THE meaning and you canna have any other meaning" thing.

    --
    Muchas Gracias, Señor Edward Snowden !
  10. and draw inferences.. by murderlegendre · · Score: 5, Funny

    ..from user's queries.

    Clippy..? Is that you?

    --
    There's a Starman, waiting in the sky / He'd like to come and meet us, but he hasn't got the time.
    1. Re:and draw inferences.. by flacco · · Score: 1
      Clippy..? Is that you?

      "it looks like you are searching for pr0n. would you like me to lock the door and dim the lights via X10? how about some romantic music? maybe add to your search results a froogle side-bar with the best astro-glide prices on the net?"

      --
      pr0n - keeping monitor glass spotless since 1981.
    2. Re:and draw inferences.. by Anonymous Coward · · Score: 0

      Combine it with Amazon. People who have browsed this page have also browsed The Karma Sheeptra and Farmer John's Zoolube pages.

  11. It's the authoring tools, stupid by Eloquence · · Score: 4, Insightful
    Who is "building the semantic web"? Academics or web authors? The only semantic web technology that has actually gained wide usage in the sphere of user-generated content is RSS, a syndication format (or rather, a bunch of competing syndication formats). The reason for this is that weblog engines like Slash and Movable Type support syndication. This then allowed programmers to create news aggregators and filters.

    The same can be said about any semantic web technology - whether it's FOAF (an RDF vocab for describing people and their interests) or a vocabulary for reviews. As soon as major authoring tools (i.e. both web editors and content management systems) start integrating these technologies, people will use them if they are useful. Do not expect web designers or bloggers to have a clue about all the great things that the semantic web can do - give them one useful thing which they understand, package it in a pretty UI, and they will start using it.

    1. Re:It's the authoring tools, stupid by Glass+of+Water · · Score: 1

      That is the truest post in this whole thread.

      --
      There are no trolls. There are no trees out here.
  12. The semantic Web and valid HTML by ToreTS · · Score: 2, Insightful

    I guess that the Semantic Web would need HTML documents to meet strict requirements when it comes to validation, use of logical instead of physical markup and so on. This could be an incentive for people to use HTML the way it was intended, instead of the crapload of pages that don't close tags, use hundreds of redundant FONT tags, use the H1..H6 elements to control font size instead of using them to indicate headings, and so on. Strangely enough, all "beginner's" HTML books still teach people to code this way.

    1. Re:The semantic Web and valid HTML by Anonymous Coward · · Score: 0

      the books you mention are correctfully teaching m$-html.

    2. Re:The semantic Web and valid HTML by Just+Some+Guy · · Score: 1
      I'm not sure how I feel about that. On the one hand, moving to a pure data-in-HTML, presentation-in-CSS model works wonders for helping machines decipher the mess. On the other hand, when I see things like
      <font size=+1><font color=red><font face=verdana><font color=blue><font face=arial><font size=+2><font size=+1>Welcome to Example.com! (best viewed in Internet Explorer at 800x600 with at least 256 colors)</font></font></font></font></font></font></font>
      I can follow the exact chain of thought that the webmonkey worked through to get the end result. It's kind of like a built-in RCS; you can peel back the tags one at a time to see what the site looked like two months ago.
      --
      Dewey, what part of this looks like authorities should be involved?
    3. Re:The semantic Web and valid HTML by endofoctober · · Score: 1

      I agree completely - most of the beginner HTML books I've read seemed bent on teaching that content and layout go together, exactly the opposite of what the W3C advocates. Luckily there are a few beginner books that teach HTML and CSS side-by-side, but, as an instructor, I'd like to see this approach adopted by all instead of a few.

      The Semantic Web sounds great, but I really don't trust people creating websites to include pertinent and accurate metadata about their site. If someone creates a site and simply wants to drive traffic to it, they're going to include whatever will facilitate that in their metadata.

      --
      - Jack
    4. Re:The semantic Web and valid HTML by Anonymous Coward · · Score: 0

      Oh goodie! Countless interesting new forms of error! Seriously, the system had better be robust enough to deal with the people who can't be bothered to write decent html, because nobody is really going to do things any better. Otherwise, the current proliferation of broken links and poorly designed pages will pale in comparison. It *should* provide an incentive to make better pages, but the people who write good* pages will continue to do so, and those who write bad ones won't get better.

      *By which I mean functional and useful, with broken links generally cleaned up.

    5. Re:The semantic Web and valid HTML by KjetilK · · Score: 1
      Well, surely good HTML could have helped in providing more semantics. For example, a table that had "price" in a TH could make it easier to guess that the numbers in the associated TDs were, well, prices for a product.
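The TH/TD guess can be sketched with Python's standard `html.parser` module. The class name `PriceColumnParser` and the sample markup are invented for the example, and the approach assumes a clean table with one header row, which real pages rarely offer.

```python
# Hypothetical sketch: use a TH header to guess which TD cells hold prices.
# The sample markup is invented; real pages are rarely this tidy.
from html.parser import HTMLParser


class PriceColumnParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.headers = []   # lowercased text of each TH, in column order
        self.prices = []    # TD text found under the "price" column
        self._cell = None   # "th" or "td" while inside a cell, else None
        self._col = 0       # index of the current TD within its row
        self._text = []     # text fragments collected for the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._col = 0
        elif tag in ("th", "td"):
            self._cell, self._text = tag, []

    def handle_data(self, data):
        if self._cell:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != self._cell:
            return
        text = "".join(self._text).strip()
        if tag == "th":
            self.headers.append(text.lower())
        else:
            in_range = self._col < len(self.headers)
            if in_range and self.headers[self._col] == "price":
                self.prices.append(text)
            self._col += 1
        self._cell = None


page = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""

parser = PriceColumnParser()
parser.feed(page)
# parser.prices now holds the cell text from the "Price" column.
```

This only works when the author actually used TH for headers, which is the point being made: the guess is exactly as good as the markup.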

      However, HTML is not so relevant in the Semantic Web. There are many reasons for this, but I guess one is that it is expected to never get beyond tagsoup... Well, I dunno...

      It is RDF that is at the core of the Semantic Web. Funny, I have been interested in RDF for six years, still I haven't had time to really sit down and read the specs, and so, I often bump into rather fundamental things I haven't grokked.

      BTW, a quick, funny and interesting way to get started with the Semantic Web is FOAF: Go and generate your FOAF profile here.

      --
      Employee of Inrupt, Project Release Manager and Community Manager for Solid
    6. Re:The semantic Web and valid HTML by yerfatma · · Score: 1

      Except the point of posting the document wasn't to inform you of their awful design tastes. It was to deliver information. So why not just mark that info up in semantic elements and put any presentational information in stylesheets where the relevant user agent (visual, aural, print, whatever) can apply it if the user wants that to happen? What value does a font tag have to the 99.9999999999999999% of people who came to the page to read the text?

    7. Re:The semantic Web and valid HTML by grcumb · · Score: 1

      "The Semantic Web sounds great, but I really don't trust people creating websites to include pertinent and accurate metadata about their site."

      You're quite right to suspect that the Semantic Web won't start in the blogs of the world. It doesn't scratch any particular itch for individual web authors.

      But consider its value for a business that works with dozens (or hundreds, or thousands) of large clients, all of whom submit their data in more or less arbitrary formats. There is huge value for them in standardising the means by which they translate the data they work with.

      Now that is an itch worth scratching. 8^)

      --
      Crumb's Corollary: Never bring a knife to a bun fight.
    8. Re:The semantic Web and valid HTML by Just+Some+Guy · · Score: 1

      Should I start appending smileys to my posts? :-)

      --
      Dewey, what part of this looks like authorities should be involved?
    9. Re:The semantic Web and valid HTML by smartdreamer · · Score: 1
      The Semantic Web is an extension of the current web! So it takes what we already have without changing anything.

      Moreover, it is about content, not presentation. So don't worry about whether your page is HTML Strict, Final or Transitional; that has nothing to do with it.

      But IMHO, good design don't hurt! ;)

    10. Re:The semantic Web and valid HTML by smartdreamer · · Score: 1
      You couldn't be more wrong! ;)

      Go see the Friend of a Friend (FOAF) project; it is a live implementation of what we can think of as the first layer of the Semantic Web.

      You have to consider that we will not see a "final" version soon. And the Trust layer is on top of the pyramid. So wait for a couple of years... :)

      I predict Semantic Web will have a greater impact than the Internet has now. From individuals to business to software and AI, let the revolution begin...

    11. Re:The semantic Web and valid HTML by yerfatma · · Score: 1

      Oh wow. Sorry. End of the week and I must have used up my allotment of perceptiveness. Not that I start with a lot.

    12. Re:The semantic Web and valid HTML by Just+Some+Guy · · Score: 1

      Not a problem. I was "there" yesterday.

      --
      Dewey, what part of this looks like authorities should be involved?
  13. Being built by Lehigh university eh? by The_reformant · · Score: 5, Informative

    The semantic web is a pretty popular area of research right now and it's far from being "built by computer scientists at Lehigh University". In fact I could have done an undergrad dissertation on the semantic web, and there were numerous PhD positions being advertised at unis around the world for semantic web research.
    Whichever Lehigh professor submitted this is stooping pretty low trying to raise publicity (and hence finance), I would think!

    --
    I have discovered a truly remarkable sig which this post is too small to contain.
    1. Re:Being built by Lehigh university eh? by Paladine97 · · Score: 1

      A lot of people from a lot of universities are probably working on the same idea. There just happens to be an article about the professors at Lehigh.

      I took some classes from Professor Heflin; he's a very bright guy. As for the semantic web, I don't think it will catch on. When you write your web pages you have to follow a strict schema and add all this metadata to each page for it to be 'correct'. Most users couldn't give two shits about this metadata and you'll still have chaos in the web.

  14. it's the Gibson! by SuperBanana · · Score: 2, Funny

    Am I the only one who recognized the main graphic for the story as a lifted screencap from the movie Hackers? That movie's SOLE redeeming quality was Angelina Jolie...

    Well, ok, that and the laugh factor. Not quite as much fun as MST3K'ing The Mummy with about a half dozen friends though.

  15. Add "Semantic Web" as an elective for CS in by Anonymous Coward · · Score: 0

    grad school, next to: a.i. using digital computers, relational databases in the business world, using C++ as an "object-oriented" language, and how to build a universal translator.

  16. too little RDF by MarkWatson · · Score: 1

    I have had RDF on my web site for years, but last year as an experiment, I started a web spider running that specifically looked for RDF - I found very little.

    I even cheated and specified the 'seed' starting web sites as sites that I knew to use RDF.
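
    A spider like this needs some test for "is this document RDF?". Here is a minimal sketch in Python, assuming the test is simply "well-formed XML whose root element is rdf:RDF". The function name looks_like_rdf is made up for illustration, and a real spider would also need fetching, link extraction, and support for other RDF serializations.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the W3C defines for RDF/XML documents.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def looks_like_rdf(document: str) -> bool:
    """Return True if the text parses as XML and its root element is rdf:RDF."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        # Not well-formed XML (for example, ordinary tag-soup HTML).
        return False
    # ElementTree spells qualified names as "{namespace-uri}localname".
    return root.tag == "{%s}RDF" % RDF_NS
```

    A check this strict misses RDF embedded inside other documents, which may be part of why a spider finds so little.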

    1. Re:too little RDF by Tony+Hoyle · · Score: 2, Informative

      There's absolutely loads of it around... especially as people are starting to use more generated websites (like slashdot for example).

      If you search for *.rdf maybe you won't find as much... a lot of it is *.rss, *.xml and other things.

      Also, google doesn't index them.

  17. It's Easy! by mfh · · Score: 1
    I'll have to rtfa to see what they propose, but just the principle of resolving contradictions is a really difficult one

    Yeah, this is really easy. Just look next to the title and see what score the moderators have assigned and you get a sense of whether there be contradictions! Generally if the score is lower than 1, there could be contradictions so:
    if($score < 1){$contradiction_level++;}elseif($score >=3){$contradiction_level--;}
    Yeah it's really difficult.
    --
    The dangers of knowledge trigger emotional distress in human beings.
    1. Re:It's Easy! by NoTheory · · Score: 1

      the principle of resolving contradictions *AUTOMATICALLY is a really difficult one

      >:P

      --
      There are lives at stake here!
  18. A lot of work to be done by Mazzaroth · · Score: 2, Insightful
    Semantic web is an amazing idea that will profoundly transform the way we interact with information. But I can see a huge amount of work remaining to be done:
    • We need an ontology that will cover many if not all aspects of human experience. And this experience has been evolving dramatically and will continue to evolve. This ontology is probably a moving target. This task alone of creating the ontology has been, and still is, the holy grail of AI and Knowledge Management.
    • The amount of time we will have to invest in adding metadata to the data will dramatically increase over time. We will need a way to automate the filling of the metadata layer. This is where automatic image recognition and classification, speech to text, text summarizers and meaning extractors kick in (here, Copernic is in the right direction). Maybe the librarian profession will be the next hot job...
    • Almost every application will have to adapt and inter-communicate. No big deal, RDF will probably become the new data bus anyway.
    That will be interesting!!!
    1. Re:A lot of work to be done by ubera · · Score: 1

      We need an ontology that will cover many if not all aspects of human experience.

      One of the advantages of the Ontology as a model is that we can avoid needing a 'global' one, instead we can compose ontologies and translate between them to create the semantic viewpoint.

      The amount of time we will have to invest in adding metadata to the data will dramatically increase over time

      There are additional issues, such as 'faithless' annotation (liars and miscreants) as well as genuine errors (human or other). Tagging data for the semantic web is a very big challenge.

      It remains to be seen what the usage model will be, from agents to something new...

      --
      But what is the SIGnificance?
  19. Multiple level of links by eille-la · · Score: 1

    I still don't know why this feature isn't used to make the web more powerful by offering more links on the same web page:
    On the same page, the level of links should be increasable/decreasable. The default one would be the one we see currently on all the web sites.
    When going to the next level, the page would not reload at all but the browser would just show the links at different places on the page. These links would have been set by the webmaster on ideas that require linking a sentence or a part of it, not just a word. This way you can include as many levels of idea/concept in the written text as you like. A website like wikipedia would find this feature really useful, I think.

  20. in other words... by Anonymous Coward · · Score: 1, Interesting

    ...handle more data, resolve contradictions and draw inferences from users' queries. The new improved Web will also combine pieces of information from multiple sites in order to find answers to questions.

    It will essentially be a librarian?

    The problem with this is that users first need to know what the heck they're actually looking for. You can draw as many inferences as you like, but so long as people search for "art" when they're interested in "tattoos" you aren't going to get much that's relevant. And THAT is the biggest problem with your average user--and that's what a librarian is good at. Asking questions until people verbalize what they really need.

  21. Welcome to 2001 by the_demiurge · · Score: 2, Insightful

    Hasn't everyone heard of this already?
    W3C semantic web activity from 2001.
    Heflin's Thesis from 2001.

    I'm rather skeptical of the whole thing. It seems to me to be like "Wouldn't it be nice if people documented their web page content better? Then we could do all these neat things." The second statement is right, but I fear the first statement is intractable.

  22. If for no other reason than IP law by ShatteredDream · · Score: 2, Interesting

    This could create huge problems for people to stay on the right side of copyright law. A medium that pulls information from several different sources could potentially make it much harder to avoid copyright infringement. For example, you pull from a Wikipedia entry, a NY Times entry and a Reason editorial. You better keep track of where you got each part if you use them in any of your own research, commentary, etc.

    How does it combine information from different sources in a way that keeps the user knowledgeable about where the data came from? How do you know who to cite, or whether something you're excerpting can be used in the context you want, when your "semantic web browser" pulled the data and combined it coherently or incoherently into a mish mosh of data sources?

    Am I the only one who thinks that this could be an IP trial lawyer's wet dream?

    1. Re:If for no other reason than IP law by Taladar · · Score: 1

      I think if and when this were deployed on a huge scale, copyright laws as we know them would be unenforceable and thus void.

    2. Re:If for no other reason than IP law by Anonymous Coward · · Score: 0

      This only reveals how anachronistic copyright law is getting as technology advances.

    3. Re:If for no other reason than IP law by WaterBreath · · Score: 1

      I think part of the issue here is the nature of copyrights and how they apply to web sites (and to a point this can be extended to software). I've often wondered, what is the point of copyrighting a mass of information (i.e. a webpage) that is very likely going to change, either slightly or drastically, sometime in the near future? Most of the time the old version of the site is tossed away, but it's still "copyrighted" and technically illegal to reproduce without permission.

      Personally, I think that the U.S. copyright system is in a grave state. It was created when 100% of information that it covered was physically recorded, and generally very static. Now, the vast majority of information that is viewed by an individual is digital, and highly dynamic.

      And in a society where the amount of information being passed around is so gargantuan (I've heard estimates of the Internet "containing" approximately 1 yottabyte, or 10^24 bytes, of data, not to mention private databases), how much of it is truly and verifiably original?

      I'm not saying I have a better solution, but I think it's high time we stop trying to shove all these square pegs into one round hole.

  23. I have my doubts... by ngunton · · Score: 4, Insightful

    It seems to be a common mistake for computer scientists to think that it's possible to make systems that "understand" the world (both real and abstract knowledge), with all its complexity and ambiguity, in the same way that humans do. I feel that there is a fundamental difference between using computers to enable humans to organize stuff, and having computers automatically do it. Every single attempt at getting computers to be "smart" about inferring human intentions has ended up as an irritating impediment to using the system - look at Clippy, Bob, "intelligent" voice systems that try to "help" you by stopping you from talking to a real person... what computers are very, very good at is amplifying and enabling human intelligence. Computers are not themselves intelligent, and (my personal opinion) I don't think they ever will be - unless we manage to "grow" them using processes that we probably won't fully understand. You can't construct something that is as complex as the human mind through deterministic (i.e. consciously designed architectural) means - all you'll end up with, at best, is a very complex rule inference engine that is limited by the rules you gave it. Every "holy grail" of intelligent programming that has come along - neural nets, genetic programming etc - has turned out to be very limited (though very useful in special situations).

    I also feel that talking about automatically organizing the world's knowledge in a semantic web is just more of the same hot air that we've been hearing from AI departments for the last few decades. You can't automatically allocate meaning to something unless you have the capability for "common sense" reasoning, and the world knowledge at your fingertips to be able to interpret the data intelligently, like a human would. And even then, different humans would interpret it differently... so there are multiple meanings, and anyway, how to allocate "meaning" to something abstract such as a poem or piece of art?

    And if we require real people to add metadata to everything... well, it just ain't going to happen, in my humble opinion. Adding meta data is a pain in the ass, since you have to define the categories of object, agree on meanings for all the different taxonomies that will have to be used to describe the world... then there's the potential for abuse, as spammers will inevitably seed their documents with inappropriate metadata. So, the "honest" people can't be bothered, and the dishonest people will wreck anything that does get built. So, it ain't gonna happen.

    The beauty of google (not that I love google, but they did hit a nail on the head) is that it requires no effort or "machine intelligence", beyond a very simple algorithm that depends not on AI but rather real, tangible relationships between words and documents (proximity and links). This is something that computers can be really good at.
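
    To make the "links" half of that concrete, here is a rough sketch in Python of the textbook PageRank idea. It is not Google's actual implementation, and the page names, damping factor and iteration count are assumptions chosen for illustration. A page's score is just a weighted share of the scores of the pages linking to it, with no attempt at understanding content.

```python
from typing import Dict, List


def link_rank(links: Dict[str, List[str]], damping: float = 0.85,
              iterations: int = 50) -> Dict[str, float]:
    """Score pages purely from who links to whom (simplified PageRank)."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page in pages:
            # A page with no outgoing links spreads its score over all pages.
            targets = links.get(page) or list(pages)
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: both "a" and "b" link to "c", and "c" links to "a".
scores = link_rank({"a": ["c"], "b": ["c"], "c": ["a"]})
best_page = max(scores, key=scores.get)
```

    In this toy web the page with the most incoming links comes out on top, and nobody had to write a line of metadata.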

    Just my opinion... obviously there will be others out there who will vehemently disagree, and that's fine! Go ahead and try, you'll learn a lot in the process and you will probably come out with some tangential technology that you never thought of initially but is useful nonetheless.

    1. Re:I have my doubts... by DrEasy · · Score: 2, Insightful
      The beauty of google (not that I love google, but they did hit a nail on the head) is that it requires no effort or "machine intelligence", beyond a very simple algorithm that depends not on AI but rather real, tangible relationships between words and documents (proximity and links). This is something that computers can be really good at.

      And that's the curse of AI right there. Because you happen to know the algorithm underneath Google, you don't think of it as "intelligent". But to the average Joe it can certainly seem that way.

      We used to say that the day a chess program could beat a human, it'd be proof that machines can be intelligent. But now that we know how to build such a system it has lost its magic, and therefore shouldn't count as AI?

      --
      "In our tactical decisions, we are operating contrary to our strategic interest."
    2. Re:I have my doubts... by ngunton · · Score: 1

      I guess it depends on how you define "intelligence". Chess is a very closed system that can be defined very precisely by rules - a great application for a powerful computer that can simply go down all the game paths (possibly using some predefined heuristics) and find the best solutions. Also, remember that the latest chess supercomputers have been "trained" with the best games from the past (human) grandmasters. So I don't really see a computer playing chess as being intelligent, unless you define different kinds of intelligences, such as analytical, emotional, and so on. Deep Blue has no clue about birds or poetry or crossing the street. Yes, there are different kinds of intelligence amongst people - some are better at math, some better at art, and so on... I guess I just think that there is some indefinable quality to humans that cannot easily be captured by the kind of logic that we use currently. Human brains are not logical circuits; they are a product of genetics over millennia, training from birth, and perhaps (who knows) other, more "spiritual" influences. I am not religious, but I have seen evidence that there is more to people than just the chemical soup that constitutes the brain. Of course, evidence is in the eye of the beholder! The stuff that Edgar Cayce did back in the 1940s was pretty amazing, and as far as I know he has never been debunked as an outright charlatan. I'm not claiming he is for real or otherwise, but simply making the point that there are things that we can't explain about the way we work, that seem to go beyond the standard everyday perception of consciousness and mind.

      All conjecture, of course. But I still maintain that having computers allocate meaning to world knowledge is a step beyond what they are actually good at. AI researchers were constantly constructing "blocks world" AI and then proclaiming that "real world" extensions to their research was just a few years away. Here we are, a few years down the road, and somehow the AI still isn't here... we knew this was hard back in the 1980's when I was at university, and it doesn't appear to have gotten any easier since.

      Computers will get faster and more powerful, so that they may appear to give the illusion of intelligence in some limited situations. The chess computer is a good example. But what does it mean to program in past games from grandmasters and have the computer simply use that in order to spot patterns and pick good heuristics? Is that intelligence? I posit that it is very different from the kind of processes that go on in the human mind. But again, that's just my opinion...

    3. Re:I have my doubts... by Trinition · · Score: 1

      Some time ago, on TV, I was watching a show on intelligent computers. They highlighted one project in which a program designed to learn word associations was left alone for a long time with huge quantities of texts. When the researchers came back, the computer had made associations like "father" and "president". They looked into its associations to see how it came to this conclusion, and it made sense. I don't remember the exact connections it used, but in many ways, the president is like a transient father of our country.

      So, I do not doubt computers will begin to understand meaning. Of course, that program took a long time, but it was also years ago that I saw this. Given the constant increase in computing power, it may one day be a possibility.

      However, I don't think that day is today. Probably not tomorrow, either.

    4. Re:I have my doubts... by foobsr · · Score: 1

      ... unless we manage to "grow" them using processes that we probably won't fully understand ...

      My hypothesis there is that the time needed for data/information/knowledge to settle into a kind of state of integration with retrieval/inference engines/mechanisms is a crucial factor that mostly is not taken enough care of (IMHO). Support is given by the fact that the socialization period of humans is so much longer than for all the rest of the mammals.

      I also feel that talking about automatically organizing the world's knowledge in a semantic web is just more of the same hot air that we've been hearing from AI departments for the last few decades.

      Totally agreed - understandable though given the correlation of hype and funding. But still, I think that the problem is tractable, maybe with a more holistic and less CS heavy approach - though not soon.

      CC.

      --
      TaijiQuan (Huang, 5 loosenings)
    5. Re:I have my doubts... by BorgCopyeditor · · Score: 1
      Let me summarize:

      the computer had made associations like "father" and "president" [...] So, I do not doubt computers will begin to understand meaning

      A few questions about this inference: How is what the computer did anything more than a simple correlation? Could the computer tell you what the scope and limits of the association are (i.e., in what ways a president is like and unlike a father)? Could it parse/create a sentence in which it had to determine whether to use either of these words in a metaphorical rather than a literal sense (or vice versa)?

      Those are among the sorts of things we do when we understand language. I don't think that a computer's automatically drawing a correlation between "father" and "president" goes very far in the direction of this kind of understanding.

      --
      Shop as usual. And avoid panic buying.
    6. Re:I have my doubts... by DrEasy · · Score: 1

      So I guess our difference in opinion comes from the fact that to me, a human being can also be thought of as a complicated machine. Intelligence is always an "illusion", as long as the machinery behind it is undiscovered.

      As computers solve more and more difficult problems (beating humans at chess, learning to filter spam, semi-autonomously exploring space...) we become blase with our achievements and our ambitions rise. But let's not forget that when those problems were defined, we said that if a computer could solve them it would exhibit a sign of intelligence. It shouldn't matter that the way the problem was solved was different from the way a human being would have done it. Different illusions/heuristics, that's all.

      I agree though that we are far far away from a general purpose, all-sensing, self-programming intelligent agent. Maybe, in fact, it is just better this way.

      --
      "In our tactical decisions, we are operating contrary to our strategic interest."
    7. Re:I have my doubts... by nml · · Score: 1

      As computers solve more and more difficult problems (beating humans at chess, learning to filter spam, semi-autonomously exploring space...) we become blase with our achievements and our ambitions rise.

      The problem with our progress on all of those problems (chess, etc.) that were once considered solvable only by intelligent beings is not that we become blase about them. It's that solving them hasn't brought us (much) closer to truly general intelligence. We have the algorithms to create the best chess-playing entity in the world, but the chess-playing machine can't do anything but play chess. Try and get it to filter spam, and you're back at square one. We don't consider solutions to these problems to be intelligent because they merely mechanically execute a single algorithm that someone intelligent thought of. The intelligence is all in the design, and none in the execution.

      So I guess our difference in opinion comes from the fact that to me, a human being can also be thought of as a complicated machine. Intelligence is always an "illusion", as long as the machinery behind it is undiscovered.

      I'd be happy with a convincing illusion of intelligence, regardless of how it worked. But it's not going to be very convincing unless it can solve problems that it hasn't been explicitly designed to handle.

    8. Re:I have my doubts... by smartdreamer · · Score: 1
      Most of your point is right but take it from another angle.

      It all depends on what you call "intelligence" and how you define "understand". Let me ask a question: "how do you really understand the meaning of a word?". I would say that it is when you can use this word in the sense it is intended. Now what if a program can use this word with the right meaning? We could say that it understands it. And in fact, it does so, at least as well as we do. There is no intrinsic concept in a word, only arbitrary significance.

      And it's all what ontologies are about. Give a formal definition to a concept to be uniformly used.

  24. no formal theory? get real. by Anonymous Coward · · Score: 0

    "No formal theory," Heflin wrote in his proposal to NSF, "has considered how ontologies can be integrated and how they may change, or the role of trust in integration."

    Like hell. Computation theorists and cryptographers have been applying game theory to their models for upward of a decade. This guy is just catching on.

    Of course the formal theories aren't complete, but this fellow is not onto anything new. It just sounds like exactly the naive optimism you'd put on an NSF grant req.

    As far as integrating ontologies, it just sounds like graph isomorphism to me. :)

    1. Re:no formal theory? get real. by ngibbins · · Score: 2, Informative

      There has been a considerable amount of work on ontology mapping within the knowledge engineering community, but the evolutionary aspects of ontologies have been largely overlooked. Ontology mapping is a harder problem than graph isomorphism, since classes from different ontologies may have extensions that overlap rather than cover each other. It's a difficult problem, certainly, but it's worth noting that game theory isn't applied here.

      Game theory tends to appear more within the multi-agent systems community than the semantic web community; they've been looking at the social models for trust for some years now.
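
      To illustrate the point about overlapping extensions, here is a small sketch in Python. It assumes, purely for illustration, that we can list the instances of each class as a set; real ontology mapping works from class definitions and rarely has complete instance lists. The function name and the labels it returns are invented.

```python
def extension_relation(a: set, b: set) -> str:
    """Classify how the instance sets (extensions) of two classes relate."""
    if a == b:
        return "equivalent"
    if a < b:  # proper subset
        return "a is narrower than b"
    if a > b:  # proper superset
        return "a is broader than b"
    if a & b:  # some shared instances, but neither contains the other
        return "partial overlap"
    return "disjoint"


# Hypothetical classes from two different ontologies.
students = {"ann", "bob", "cho"}
employees = {"bob", "cho", "dee"}
relation = extension_relation(students, employees)
```

      Graph isomorphism only ever asks about the first case. The "partial overlap" case is the awkward one, because no single class in one ontology can stand in for the class in the other.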

  25. Lojban? by Thinkit4 · · Score: 1

    That's a language that can be parsed by computer! Rippin'. But I figure we'll just wait for the singularity before anything really changes, after which we'll use binary code or something. In the meantime it's English, which will be looked at as civilization's greatest joke at some point in the future. It will make it quite hard to make this semantic web.

    --
    -I am an elective eunuch.
  26. obscured by the cloud by Doc+Ruby · · Score: 2, Insightful

    Meaning is always "in context". Human communication always requires a "transmitter -> medium -> receiver" structure. Some say the universe is fundamentally structured on that model. When these semantic systems are overlaid on content, there's always these slippery, unresolvable mismatches of "intent" and "understanding", those "semantic arguments" that drive likeminded people crazy. Content searching is extremely powerful, without creating the "cracks" into which meanings can irretrievably fall. As long as there are alternative semantic indices to content still available "raw", semantics will just help. When we move to wrap all content entirely in semantics, we'll live in the "map is not the territory" problem forever. Ask CORBA programmers and EU language translators about the death of meaning by means of the dictionary. If we need to add semantics as a tool, we still get under the hood at the actual content.

    --
    make install -not war

  27. Meaning??? by Crispin+Cowan · · Score: 1
    I thought it was a web of money :-)

    Crispin

  28. Meaning = ability to Intelligently Handle by LionKimbro · · Score: 4, Informative

    A message has "meaning" if you can make special use of it.

    Normal web pages have meaning for browsers, it's just that that meaning is limited to "how to draw words for the user."

    What we're doing, is making it so that your computer can make special use of messages on the web, to do smarter things.

    It would be scary if the Semantic Web were about "my meaning is THE meaning." But it is explicitly not like that. In fact, one of the main things about it is that anyone can make up their own languages, their own way of modelling the world.

    There are tools that make it so you can say, "My word X is sort of like their word Y," but it's acknowledged that such translations will be imperfect. Likely, fuzzy logic, and systems that are able to ask for clarification (and remember responses), will be used to mediate that sort of thing.
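
    As a rough picture of "my word X is sort of like their word Y", here is a sketch in Python. The vocabulary terms, confidence numbers and threshold are all invented for illustration, and this is a plain lookup table rather than any particular Semantic Web tool.

```python
from typing import Dict, Optional, Tuple

# my term -> (their term, confidence between 0 and 1). All entries are invented.
MAPPINGS: Dict[str, Tuple[str, float]] = {
    "author": ("creator", 0.9),
    "price": ("cost", 0.6),
}


def translate(term: str, threshold: float = 0.8) -> Optional[str]:
    """Return the other vocabulary's term, or None if the match is too weak."""
    match = MAPPINGS.get(term)
    if match is None:
        return None
    their_term, confidence = match
    return their_term if confidence >= threshold else None
```

    A None result is the point where a system would have to ask for clarification, and could then store the answer as a new mapping.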

    You may also be interested in my favorite page on AI by Open Mind. The Semantic Web isn't explicitly about AI, but it opens the door for a lot of AI work.

  29. Like when I type "Unicycle Jousting" by briancnorton · · Score: 2, Insightful
    And I get 200 ads for herbal viagra, 300 Nigerians that have inherited 15 MILLON USDOLLARS, and deviant pornography.

    A semantic web is only as useful as the metadata, and people go to great lengths to mislead and disguise.

    --

    People who think they know everything really piss off those of us that actually do.

    1. Re:Like when I type "Unicycle Jousting" by Fnkmaster · · Score: 1
      Right, we need some sort of trust mechanism much better than "how many people are linking to this page". It's pretty easy to see how gameable Google is - there's no reason people won't try to game semantic content to push their products and services as well.


      In fact, it seems like the trust problem isn't that different at all, perhaps the only real difference is that with the WWW, you get to look at every page yourself and make the judgment call, "does this look like a scammer, are there lots of blink tags or mismatched colors and large text, or fake 'search engine' results?" If so, you ignore it and move on to the next result. With the semantic web, you are relying on an agent to make inferences for you, so it needs to be able to make those assessments in at least a semi-automated fashion.

  30. I can't wait! by wiresquire · · Score: 1

    I can't wait for the spam people and porn sites to get a hold of semantic web technology.

    The meaning of is V1.agra and C011.3G3 GIRLZ!

    --

    So does Anonymous Coward have good karma?

  31. Work Load by gazz · · Score: 1

    Surely this just piles on more work for us poor poor developers....

    --
    it's the taking apart that counts
  32. Representation of meaning is not the problem by kubalaa · · Score: 3, Insightful

    Semantic Web is the most ridiculous idea I've ever heard. The problem with meaning isn't representation -- English represents meaning just fine. The problem is meaning itself -- it doesn't matter if you figure out a way to encode it in some XML language, for every bit that it's easier for computers to use, it will carry that much less meaning.

    Another way of putting it is, any program capable of extracting the same meaning from XML that humans can, should be able to understand English without much trouble. It's the whole "Intelligence-complete" thing. Like NP-complete, there seems to be a class of problems which can only be solved by real intelligence, and they're all pretty much equivalent in that with real intelligence, you can solve them all.

    --

    "If you look 'round the table and can't tell who the sucker is, it's you." -- Quiz Show

    1. Re:Representation of meaning is not the problem by ubera · · Score: 1

      That's the whole point though, English is extremely poor at representing meaning, and semantic annotation is intended to give keywords for more sensible reasoning.

      English is very poor, it's somewhat possible to get effective searching from something like google from the structure of the document and its content, but a better annotation will permit more accurate and more complete retrieval, as well as retrieval based on non-obvious features.

      --
      But what is the SIGnificance?
    2. Re:Representation of meaning is not the problem by fugu13 · · Score: 2, Insightful

      This got insightful?!

      Let's take a look at English, shall we?

      "Milk costs five dollars."

      "Milk always costs five dollars."

      "Milk's price is five dollars."

      "Isn't it cool that milk costs that low, low price of five dollars?"

      "I am so gosh-darn happy that I can obtain the glorious bounty of milk for a mere five (count 'em, one-two-three-four-five) bills featuring our esteemed former president, George Washington."

      Now, let's take a look at some possible semantic web statements.

      Milk hasPrice $5

      anonymousItem hasType Milk
      anonymousItem hasPrice $5

      KrogersItem54728 hasType Milk
      KrogersItem54728 hasPrice $5

      Now, the above are slight simplifications for the purposes of conveying the essential ideas (we're not getting into the idea of common vocabularies, though relating information is far simpler if they are used; it's a bit too much to explain), but it is amazing that anyone could think that programs which can parse the latter sets of information can parse the former!
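
      To show how mechanical the second kind of statement is to work with, here is a minimal sketch in Python. It uses plain tuples rather than any real RDF library, and the helper names match and price_of are made up for illustration.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# The Kroger statements above, stored as (subject, predicate, object).
TRIPLES: List[Triple] = [
    ("KrogersItem54728", "hasType", "Milk"),
    ("KrogersItem54728", "hasPrice", "$5"),
]


def match(subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def price_of(item_type: str) -> List[str]:
    """Find the prices of every item whose type is item_type."""
    items = [s for (s, _, _) in match(predicate="hasType", obj=item_type)]
    return [o for item in items for (_, _, o) in match(item, "hasPrice")]
```

      Answering "what does milk cost?" here takes two pattern lookups. Getting the same answer out of the five English sentences needs a parser that copes with rhetorical questions and jokes about George Washington.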

      --
      For to end yet again.
  33. Another Clippy by Odd+John · · Score: 2, Funny

    Great. An Expert System to do your google searches based on what it thinks you meant. The giant Semantic 'Clippy' knows what's best when it pops up to say:

    ''Here are the results to the question you should have asked.''

    Maybe next they'll have the Semantic Web manage the way electronic voting is counted. Semantic Clippy will count your 'intent' instead of your actual vote.

  34. Hmmm ... by ggvaidya · · Score: 1

    The meaning of the Internet, eh?

    That should yield some interesting answers.

    "42. The Answer that you are looking for is 42."
    "You searched for "space ship one", but what you really want to search for is "natalie portman hot grits"."

    Isn't the whole point of the Internet a database of information which we can access using tools - not to create a "web of knowledge"?

  35. why this will fail by ndunn · · Score: 3, Informative


    Google works because it is largely a statistical tool that uses some meta-information.

    I could see frameworks being used for very specific purposes, like searching a homogeneous web-site (e.g., slashdot, pubmed, nytimes) where all content is controlled. But extending these ideas to a heterogeneous web, whose users would no doubt take advantage of such a volunteer system, is ludicrous.

    I also take issue with the top-down mind-state that they will be able to predict what is useful to the user. This is why statistical importance and quantity is the only realistic method for such a massive undertaking (which google is still actively researching).

    I think that the only useful research to come out of such an endeavor would be to have news-sites, as mentioned above, implement and be scanned using an ontological browser. Of course, I am not sure how this would be different from LexisNexis.

  36. Guess he forgot by CaptainZapp · · Score: 1
    To sports car enthusiasts, football fans and wildlife specialists, the word jaguar connotes highly discrete entities.

    Apple OSX aficionados.

    --
    ich bin der musikant

    mit taschenrechner in der hand

    kraftwerk

  37. Spot on by pjt33 · · Score: 0

    I can't believe this hasn't made it to +5 Insightful yet.

  38. Either hippies... by The_Real_MrRabbit · · Score: 0, Offtopic

    who sit on the floor with their legs crossed with their eyes rolled back and fingers forming circles are taking the Internet over with their "holistic", "whole-language", "non-judgemental", "authentic", "culturally-sensitive", "non-anglo-saxon centric", "green" politically correct approach to communication...

    OR

    Google is about to get better...

    =8-)

  39. what is right? who really knows? by Anonymous Coward · · Score: 0

    This thing could help in finding incorrect information. All we have to do is put Microsoft in charge of finding and getting rid of the faulty information.

    to George B: No please don't, it was a joke man.

    There is not 1 truth.

  40. Dependency: web of trust by tunabomber · · Score: 2, Insightful

    Anybody remember the demise of META keywords?

    I think we could run into the same problem with the Semantic Web, as it too allows web developers to attach arbitrary metadata to their pages. The only way to prevent unscrupulous web developers from embedding inaccurate RDF in their pages in hopes of attracting more hits is by establishing a web-of-trust framework.
    Google implements a very crude version of web-of-trust that assumes "incoming hyperlinks==trust". I think that in order for the Semantic Web to be something that is usable by web-wide search engines like Google, we will need a much more robust and fine-grained system of trust. The user should be able to specify some of the entities that they trust and the search engine will deduce the rest.
    However, without an adequate trust framework, the Semantic Web will just be a new fertile ground for keyword spam and search engine "optimization".

    --

    pi = 3.141592653589793helpimtrappedinauniversefactory71 ...
  41. Mod Parent Up by ubera · · Score: 1

    Well said, there are indeed numerous areas of investigation of this sort of work. It's not as empty an area as the article tells us.

    --
    But what is the SIGnificance?
  42. ofcourse... by arcite · · Score: 1

    that's just what the computers want you to think...

  43. Didn't we see this in Quicksilver by faust2097 · · Score: 1

    This strikes me as eerily similar to Daniel Waterhouse trying to write down lists of everything for the Royal Society in Stephenson's Quicksilver.

    The whole reason the web is popular is because it's trivially simple to create content for it. Maybe the web would be more useful if it was like a giant encyclopedia but it's just an exercise in futility unless everyone gets on board.

    1. Re:Didn't we see this in Quicksilver by BorgCopyeditor · · Score: 1

      You might read Flaubert's last novel, Bouvard and Pécuchet, if you want to see the same idea much more thoroughly explored.

      --
      Shop as usual. And avoid panic buying.
  44. You misunderstand what it does. by fugu13 · · Score: 1
    The World Wide Web cannot "at its core handle inconsistent information" yet it seems to lurch along okay. The Semantic Web is not some attempt at global knowledge, perfect knowledge, perfect reasoning, or anything of the sort, regardless of what many posters, including yourself, seem to have construed it as. It is intended to be an analogue of the World Wide Web, which is primarily consumed by humans, that is instead primarily consumed by computers. Can it know everything? Of course not! But it can make it so computers "understand" a heck of a lot more than they do today. For instance: right now an everyday computer (or more accurately, the web browser) "understands" that (absent styling) a

    tag is presented in a certain way. The Semantic Web wants to make it so the triple GallonOfMilk hasPrice $1.25 (this would actually be expressed in several triples about a product with a certain id, probably, but you get the idea) can be "understood" by a program in the same way across multiple sources. The same as a person does not automatically assume a site is an absolute authority on the price of milk, semantic web enabled programs would not assume that this information was absolute (nor would it likely be presented as such). However, imagine how powerful it would be if one could give your browser the address of the RDF interfaces for local grocery stores (or it might autodiscover them at least in part), and then it would find out what the price of milk (and other groceries) is at each one of them. That sort of thing is already possible today without the Semantic Web (or other semantic frameworks), but only with methods that either require heavy lifting on the part of the client system (such as web scraping every grocery store site, killing extensibility and easy implementation) or aren't cross-domain (perhaps I want to chart the price of milk (from some milk-price-archive) vs real dollar value -- now my client has to understand two possibly very different ways of presenting information, not just one integrated way). The Semantic Web (and associated technologies) is an enabling framework that frees programmers from doing a lot of the heavy lifting involved in discovering meaning and relating meaning, just as SQL is an enabling framework that frees programmers from doing a lot of the heavy lifting involved in storing data and relating data.

    --
    For to end yet again.
  45. BETTER FORMATTED THAN PARENT by fugu13 · · Score: 2, Insightful
    Silly me, not previewing.

    The World Wide Web cannot "at its core handle inconsistent information" yet it seems to lurch along okay.

    The Semantic Web is not some attempt at global knowledge, perfect knowledge, perfect reasoning, or anything of the sort, regardless of what many posters, including yourself, seem to have construed it as.

    It is intended to be an analogue of the World Wide Web, which is primarily consumed by humans, that is instead primarily consumed by computers.

    Can it know everything? Of course not! But it can make it so computers "understand" a heck of a lot more than they do today.

    For instance: right now an everyday computer (or more accurately, the web browser) "understands" that (absent styling) a tag is presented in a certain way. The Semantic Web wants to make it so the triple GallonOfMilk hasPrice $1.25 (this would actually be expressed in several triples about a product with a certain id, probably, but you get the idea) can be "understood" by a program in the same way across multiple sources.

    Just as a person does not automatically assume a site is an absolute authority on the price of milk, semantic-web-enabled programs would not assume that this information was absolute (nor would it likely be presented as such). However, imagine how powerful it would be if you could give your browser the address of the RDF interfaces for local grocery stores (or it might autodiscover them, at least in part), and it would then find out what the price of milk (and other groceries) is at each one of them.

    That sort of thing is already possible today without the Semantic Web (or other semantic frameworks), but only with methods that either require heavy lifting on the part of the client system (such as web scraping every grocery store site, killing extensibility and easy implementation) or aren't cross-domain (perhaps I want to chart the price of milk (from some milk-price-archive) vs real dollar value -- now my client has to understand two possibly very different ways of presenting information, not just one integrated way).

    The Semantic Web (and associated technologies) is an enabling framework that frees programmers from doing a lot of the heavy lifting involved in discovering meaning and relating meaning, just as SQL is an enabling framework that frees programmers from doing a lot of the heavy lifting involved in storing data and relating data.
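    A rough sketch of the grocery example, assuming each store exposes its data as plain (subject, predicate, object) triples. The feed addresses, the ex: names and the prices are all made up for illustration, and a real client would fetch and parse RDF rather than read a dictionary:

```python
# Hypothetical feeds: every address, ex: name and price here is invented.
STORE_FEEDS = {
    "http://store-a.example/rdf": [
        ("ex:GallonOfMilk", "ex:hasPrice", 1.25),
        ("ex:LoafOfBread", "ex:hasPrice", 2.10),
    ],
    "http://store-b.example/rdf": [
        ("ex:GallonOfMilk", "ex:hasPrice", 1.19),
    ],
}


def prices_for(product, feeds):
    """Collect every price claim for a product, remembering which source made it."""
    claims = []
    for source, triples in feeds.items():
        for subject, predicate, obj in triples:
            if subject == product and predicate == "ex:hasPrice":
                claims.append((source, obj))
    return claims


def cheapest(product, feeds):
    """Return the (source, price) claim with the lowest price, or None if nobody lists it."""
    claims = prices_for(product, feeds)
    return min(claims, key=lambda claim: claim[1]) if claims else None


milk_claims = prices_for("ex:GallonOfMilk", STORE_FEEDS)
best_milk = cheapest("ex:GallonOfMilk", STORE_FEEDS)
print(best_milk)
```

    Each result stays a claim attached to its source, not an absolute fact, and the same loop works for any store that uses the same vocabulary.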

    --
    For to end yet again.
    1. Re:BETTER FORMATTED THAN PARENT by NoOneInParticular · · Score: 1
      Thanks for the explanation, your story makes a bit more sense than the article(s) I read about ontologies, distributed ontologies, merging distributed ontologies and research into merging mutually contradicting distributed ontologies, which seem to be linked together with a brittle inference process that will fall flat on its face when inconsistent information is entered. You guessed it: I remain sceptical.

      About the web, it's perfectly capable of handling inconsistent information, as all the information it handles is valid HTML and links that resolve. The rest is simply data, strings of bits. It simply enforces consistency in its narrow domain of valid information. Not so with the Semantic Web.

      I don't have a problem with rdf, as it's just a way to state stuff, and it seems genuinely useful. I do however think that the use of inference engines and formal proofs is a conceptual dead-end in a dynamic and inconsistent world. In contrast with your statement that all there is to the Semantic Web is rdf, the Semantic Web research is in the inference based upon these facts, not in the statements themselves (other than their (in)consistency). This is flawed for the same reasons expert systems never worked. An inference engine that cannot handle inconsistencies, cannot learn, and cannot revise its results in the presence of new information is of very limited use.

      I am however sure that some good use of the rdf information can be made using statistical (Bayesian) inference, which can handle inconsistency, revision and extra evidence gracefully, so maybe regardless of the doomed research that is going on, we will end up with a semantic web after all.

    2. Re:BETTER FORMATTED THAN PARENT by fugu13 · · Score: 1

      I think you'll find that most of the sensationalized research that gets put out on the Semantic Web is not in fact most of the research that goes on :) (for instance, we hear all about the big expert systems research still going on, even, but remarkably little about the often very successful, if less grand in scope, machine learning research).

      Most of the practical ontology research focuses on internal ontologies in data stores, so that an RDF store can return "more" information than was put in. Only information trusted by the controller of the data store is added to it, but queries to the data store (likely used to generate public views) are able to take advantage of the considerable inferential power.
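      To make "returns more than was put in" concrete, here is a toy sketch, assuming a store that knows only two made-up predicates, "type" and "subClassOf". Real RDF stores use much richer rule sets; the names and data below are purely illustrative:

```python
def infer_types(triples):
    """Add the 'type' triples implied by 'subClassOf' until nothing new appears."""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(known):
            if p != "type":
                continue
            for s2, p2, o2 in list(known):
                # If s is a member of class o, and o is a subclass of o2,
                # then s is also a member of o2.
                if p2 == "subClassOf" and s2 == o:
                    new = (s, "type", o2)
                    if new not in known:
                        known.add(new)
                        changed = True
    return known


# Hypothetical trusted data entered by the store's owner.
stored = {
    ("milk42", "type", "DairyProduct"),
    ("DairyProduct", "subClassOf", "Grocery"),
    ("Grocery", "subClassOf", "Product"),
}
expanded = infer_types(stored)
print(sorted(expanded - stored))
```

      A query for everything of type Product now finds milk42, even though nobody stated that directly.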

      Most of the practical work (and work in general) being done on the Semantic Web realizes that "low level" RDF tools, frameworks, and components are what will lead to the success of the Semantic Web, just as low level XML tools led to the success of XML.

      As RDF spreads more and more, global application of ontologies will become more useful in limited ways, too. For instance, statements associated with, say, news articles (via URL) will be quite useful -- I'd love to be able to construct *likely* statements about the beliefs of people by finding first what articles they've authored and then connecting those to what positions those articles have supported. Still not a global knowledge store, but a limited application of ontologies across globally available data that is useful despite the specific untrustworthiness of the data (will it be possible to "Ontology-bomb" things? certainly, but Google's still useful despite Google-bombs).

      --
      For to end yet again.
  46. Too much work at a badly defined task by BorgCopyeditor · · Score: 1
    One should always be suspicious of a technology that is promoted more on the basis of the desirability of its imagined future consequences than in light of practical, present-day successes. An unanswered question for me in all this is how many man-hours it would take even to retroactively tag all the data currently available on the web with semantically rich metadata.

    Here's an analogy that doesn't prove anything but reframes the problem. As far as I understand it, the Pentagon cannot be audited, because the time it would take to properly count everything extends beyond the period for which the count would be relevant (the current fiscal year). Do those who tout the "Semantic Web" have a response to this kind of question about feasibility?

    --
    Shop as usual. And avoid panic buying.
  47. Heflin is wrong. by Baldrson · · Score: 1
    "No formal theory," Heflin wrote in his proposal to NSF, "has considered how ontologies can be integrated and how they may change, or the role of trust in integration."

    Yes it has.

    See Relation Arithmetic Revived and Structure Theory. These two papers were written as a result of Hewlett-Packard's E-Speak project's support of a continuation of work begun at Paul Allen's thinktank, Interval Research. These then led to an understanding of the importance of identity theory in performing logic with what we were calling "attributed assertions" aka digitally signed speech acts. After the E-Speak project terminated we continued work on identity theory with partial support from the Boundary Institute leading to a reformulation of the foundation of mathematical logic with The Expressive Power of Equality.

  48. Far, if not impossible. by akar_naveen · · Score: 1
    Making the web a 'web of meaning' looks very, very far off, if not impossible to realize. But some of the work will probably be used in the meantime to enhance the current state of the web, adding some meaning to it (but not enough to call it a 'web of meaning').

    It's analogous to C and Smalltalk. C++ and Java evolved, but are not as purely object-oriented as Smalltalk is.

    Either it is not a good model in its entirety or the time is not right for it (though I believe it's the former).

  49. Semantic Web $\eq$ Bull Shit. by Anonymous Coward · · Score: 0

    It's like saying: we can draw implications, if the information is marked up and in a canonical form. However, getting to such a point requires marking up and putting all of our information in a canonical form.

    The more interesting problem is turning data into information. Id est: how (and what) to index. Google has the Semantic Web beat in all regards.

  50. The Now Ritual Shirky Link by gilgongo · · Score: 1

    Whenever I see something about the semantic web, I go back to Clay Shirky's critique of it.

    A useful antidote to the hype.

    --
    "And the meaning of words; when they cease to function; when will it start worrying you?"
  51. Just as a bit of context, I'm a researcher in machine learning that's extremely jealous of the amounts of funding my colleagues just received for their good-old-fashioned inference technology on the semantic web (They did a great job getting the money, but I think they're on the wrong path).

    But okay. I don't think we're in disagreement here: I totally agree with you that the "low level tools" are the ones that are of interest here, I'm just sceptical about the scalability of the grand objective: crisp inference, based on this information. But, once the groundwork is done, machine learning will probably be the technology that will be used. Simple, fuzzy inference of the kind you describe and which made google big. Combine these 'simple' queries (find me cheap milk!) with likelihood of a match (is GallonOfMilk talking about dairy?), trustworthiness of sites (Is the rest of the content about milk or milk-like products or products at all? Does the content match other milkselling sites?), and you've got something useful.
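    One very simple way to picture that combination, as a sketch only: the sites, the numbers and the scoring rule (match likelihood times trust, cheaper price breaking ties) are all assumptions made up for illustration, and in practice both estimates would come from learned models:

```python
def rank_offers(offers):
    """Sort offers by match likelihood times site trust, best first; ties go to the cheaper offer."""
    return sorted(offers, key=lambda o: (-(o["match"] * o["trust"]), o["price"]))


# Hypothetical answers to "find me cheap milk!".
offers = [
    {"site": "dairy.example", "price": 1.25, "match": 0.95, "trust": 0.90},
    {"site": "spam.example", "price": 0.01, "match": 0.90, "trust": 0.05},
    {"site": "paint.example", "price": 1.10, "match": 0.10, "trust": 0.80},
]
ranked = rank_offers(offers)
print([o["site"] for o in ranked])
```

    The suspiciously cheap offer sinks because nothing else vouches for the site, and the trusted site that is not really about dairy sinks because the match is weak.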

  52. Except he's terribly wrong... by holygoat · · Score: 1

    ... as pointed out by the incomparable Paul Ford.

    Don't believe everything you read - this Slashdot story is a great example.

  53. Not so fast there... by brundlefly · · Score: 1

    Clay Shirky has written an excellent article about why the Semantic Web ain't gonna work. I don't agree with everything he says, but it's a thought-provoking read nevertheless.

  54. Re: by ngibbins · · Score: 1

    I agree with you completely on this point. The most important advances that have been made in the knowledge engineering community over the last decade have been those that have tried to fuse non-symbolic and machine learning techniques with the good old-fashioned AI of expert systems.

  55. This is news? by kahei · · Score: 1


    Given that the semantic web has been in development for years, and that the opinions(*) have long ago finished forming, I'm a little confused as to what this is doing on a news site.

    (*) Said opinions break down roughly thus:

    1% -- This is an amazing new way of perceiving and connecting data that will revolutionize computing in the future.
    9% -- This is a waste of time, a clearly impossible task that would seem of interest only to a certain breed of dysfunctional academics.
    90% -- Huh?

    --
    Whence? Hence. Whither? Thither.
  56. Whoever modded my topic as "Off Topic"... by The_Real_MrRabbit · · Score: 1

    Needs to re-read my post... 1. Google...not the only to note link... 2. Bringing "meaning" implies that someone decides what is "meaningful" and too often that someone has an agenda. =8-)

  57. Visual Web by aperevolution · · Score: 1

    Making a semantic web that is based on English or even Latin languages would be a useless addition to the chaotic structure of the current web. The problem is not the way computers use our language, rather it is the language that we are trying to use.
    The assumptions that are beneath English are difficult to work with, and in reality wrong. When I say "I am a baseball player" the meaning is quite different from what that sentence portrays.
    As mentioned by another commenter, context is the most important element of language. That doesn't mean we can't record meaning through computers, it just means we have to change the language we use. A better attempt at the above sentence would be:

    "I am part of a system in which I play a game with others. This game occurs 2-4 times a month."
    In English hypertext each word in this statement would need an explanation:
    I=author
    part=player(pitcher, catcher, etc...)
    system=logic(baseball rules, game length, etc...)
    play=act
    game=system
    with others=people participating in the same instance of this system

    Clearly this method of transcribing context is very difficult, not only because of the language's assumptions but also because of the linear thought process. Contexts are nonlinear overlapping structures and a linear language does not do them justice.
    I propose that instead of trying to make a web of meaning based on current day languages, a new visual nonlinear language should be created.

    Who's down?

    1. Re:Visual Web by nikolag · · Score: 1

      I think that this almost hits the point.
      The Semantic Web is just an attempt, and it really is just that. Most likely "a step in the wrong direction".

      How many times can one change the meaning of a conversation by using one's face, hands, body posture? Writing that down is just too hard for the simple syllogistic logic used in today's technology, even in the semantic web.

      There is nice link that describes that even better:

      http://www.shirky.com/writings/semantic_syllogism.html

      --
      Doing a good job is like spilling coffee on a dark suit, you feel warm all over, but nobody notices.
  58. Wizard of OZ by rts008 · · Score: 1

    Semantic web...makes me think of the scene in WoO when professor is "uncurtained" and the Great and Powerful Oz shouts something about "pay no attention to the PERSON behind the curtain" (don't remember exact wording). I mean, whose "meaning" is going to drive it? Could be cool tho'..."computer, search for meaning of prOn." WoooHooo!

    --
    Down With Slashdot BETA!!! I've been around the corner and seen the oliphant; you can only abuse me from your perspecti
  59. New Language by aperevolution · · Score: 1

    You're right, English doesn't work, but that doesn't mean meaning cannot be encoded. The English language lacks context. Provide a language based on context and you can encode meaning. Context is nonlinear and so should the language be.

  60. Re: by fugu13 · · Score: 1

    I have a good friend doing graduate work in machine learning, she recently presented at that robot conference in Japan, I believe about task recognition. I myself am a lowly undergraduate in Informatics, and am thus perhaps a bit more concerned with the practical than some in the Semantic Web community ;) .

    The dairy part would be taken care of because the milk (and its size, and such) would be discussed using common low level vocabularies (which reminds me, someone needs to come up with a units vocabulary for discussing units of measurement. note to self: do so, and gain fame and fortune ;) ), but I wholeheartedly agree that matching semantic web information to loose queries and judging trustworthiness are prime areas where machine learning techniques will come to the fore.

    Also, it is important to understand that while the direction of the Semantic Web as envisioned by Tim Berners-Lee and others closely involved in its creation does point towards a lot of knowledge synthesis and extension through inference, it still leaves doors open for machine learning-style inference (TBL somewhat seems to shy away from this, but I think in part it's his background in classical logic that creates that impression. He does make it very clear that the methods used to do things like infer trust are not set or even partially determined, and that he's just speculating on possible ways).

    Many of the possible applications of the Semantic Web that he points to are very possible (and closer to what I characterize the Semantic Web as than many university researchers ;) ), here's a good article (the seminar example is what I'm specifically referring to): linky

    You know, if you wanted to cash in on the big research bucks, you could always come up with a way to combine machine learning and Semantic Web technologies (*hint hint* ;) ).

    --
    For to end yet again.
  61. Re: by NoTheory · · Score: 1

    Now, here's what all the semantic web seems to boil down to, to me:

    1) Common interface and data structures for describing all objects (however that'll be done, i think that's a pipe dream in itself).

    2) methods of combining and cross-referencing information from the aforementioned objects, whether they be user-defined or otherwise.

    3) Making sure everyone complies with enough regularity that the system actually does something useful.

    Now, i want to talk about a related and similarly interesting problem that i am personally more familiar with (in fact it is perfectly analogous).

    I'm a computational linguist (computational psycho-linguist to be precise). Several of my professors are interested in corpus based research, which requires in some cases annotated corpora. The interesting thing about natural language corpora, is that the goal of annotation is exactly what the semantic web people want, metadata that coherently describes all the attributes of the object they're discussing.

    The problem is that there isn't consensus as to what attributes human language actually has, so there are corpora which were annotated with metadata that's similar to syntactic theory A and corpora which have tags which favor theory B. So in the past couple years there have been pushes for "theory neutral annotation".

    Here's the problem. If it were possible to generate an annotation scheme that could be used by adherents of theory A and adherents of theory B (that is a theory neutral annotation scheme existed), it would mean that fundamentally theory A and theory B were, in fact, compatible. Since the entire reason that the two camps have different annotation schemes is because their theories are incompatible, it seems like a futile or contradictory endeavor to undertake theory neutral metadata.

    The problem is the same for any system which wishes to create universal object descriptors. There is no way to do pre-theoretic object description. There are of course qualities of objects which will be uncontroversial, but ultimately some people are going to want to pay attention to certain qualities of objects, and other people aren't going to want the same attributes. While you may say "well that's fine, let's just toss in all the attributes," you'll still have the problem that some people will want to divide up the qualities that an object has one way, and someone else will want to divide up the attributes of the same object in another fundamentally incompatible manner.

    So what do you do then? You're fucked. You can't enforce consistency, and so your object descriptors are dead, which means you don't have your uniform api, and ontologies won't work.

    --
    There are lives at stake here!
  62. Re: by fugu13 · · Score: 1

    The Semantic Web (or the part of it described in the first point :) ) isn't about an absolutely generalized interface for describing all objects at all. It's about community developed vocabularies at a high enough level to be useful, but at a low enough level to be used as good building blocks for lots of structures. They are standards not by imposition but by adoption and general consensus. Existing vocabularies not good enough for your project? Invent one to fill the gaps and tell people about it!

    Take your annotation example. The Semantic Web approach wouldn't be to try to resolve the differences at all, or create some sort of neutral annotation. Instead, RDF vocabularies would be written for each form of annotation. If there were any points of commonality between the two sorts of annotation, those would be noted in ontologies, but if there weren't, no big loss.

    The important thing isn't that the data is part of some universal consistent structure, but that the data is made available in ways that can be easily related (in many different ways) to other data.

    You're misunderstanding the basic technologies of the Semantic Web as some big behemoth, when really the most basic technology is very simple. The only really important parts are these, in fact:

    RDF data is made up of triples.

    Triples have a subject, a predicate, and an object.

    The subject must be a URI or blank node (really an anonymous URI, so to speak), the predicate must be a URI, and the object must be a URI, blank node, or literal.

    It is both the simplicity of this structure (almost a graph) and the dependence on URIs that make it powerful. Because URIs are used, which are already used in practice as globally unique identifiers and may be easily extended and discovered, there is a certain universality to RDF statements.

    Not that they are universally meaningful or anything of the sort, but that in most cases it will be possible for two RDF statements to agree on whether they're talking about the same thing or different things -- just check the URIs (ah, the wonders of personification).
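    Those three rules fit in a few lines of code. This is a minimal sketch with invented class names and example.org URIs, not any real RDF library's API; it only enforces which kind of term may appear in which position and compares subjects by URI:

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: Union[URI, BlankNode]
    predicate: URI
    obj: Union[URI, BlankNode, Literal]

    def __post_init__(self):
        # Enforce the positional rules described above.
        if not isinstance(self.subject, (URI, BlankNode)):
            raise TypeError("subject must be a URI or blank node")
        if not isinstance(self.predicate, URI):
            raise TypeError("predicate must be a URI")
        if not isinstance(self.obj, (URI, BlankNode, Literal)):
            raise TypeError("object must be a URI, blank node, or literal")


def same_subject(a, b):
    """Two statements are about the same thing when their subject URIs match."""
    return isinstance(a.subject, URI) and a.subject == b.subject


# Hypothetical identifiers, for illustration only.
milk = URI("http://example.org/products/milk-1gal")
has_price = URI("http://example.org/vocab/hasPrice")
t1 = Triple(milk, has_price, Literal("1.25"))
t2 = Triple(milk, URI("http://example.org/vocab/soldBy"), URI("http://store-a.example/"))
t3 = Triple(BlankNode("b0"), has_price, Literal("2.10"))
print(same_subject(t1, t2), same_subject(t1, t3))
```

    Blank nodes deliberately never count as "the same thing" here, since they carry no globally unique name to compare.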

    --
    For to end yet again.
  63. Re: by NoTheory · · Score: 1

    Ah, so it's more of that FOAF stuff you've been talking about on grenme. I posted my comment in duplicate there, why don't we continue the conversation back at grenme?

    --
    There are lives at stake here!
  64. Re: by fugu13 · · Score: 1

    Because this way other people get to see us chatter :-P ?

    --
    For to end yet again.
  65. No Universal Meaning by aperevolution · · Score: 1

    That's a good link. It addresses the problems when trying to relay meaning through any medium, and the impossibility of recording it. What I propose is recording context rather than meaning. The closer you get to the context, the closer you get to the meaning. Too much of language is focused on trying to explain what happened rather than in what context it happened.

  66. self organizing web (Principia Cybernetica) by 80+85+83+83+89+33 · · Score: 1

    these guys http://pespmc1.vub.ac.be/ are using self adapting and darwinian technology to evolve the structure of the web, strengthening or weakening links, automatically, or even creating new links, based on usage patterns. they are comparing it to biological neural networks. they are trying to make the net actually work like a "global brain" ... their project also has the best name: Principia Cybernetica

    --
    i disable sigs
  67. my 2 cents by Anonymous Coward · · Score: 0
    (assertion)
    a semantic web, by definition will have to "handle more data, resolve contradictions and draw inferences from users' queries"; that just goes with the territory. The rest of the post is pretty general in definition

    terabytes of semantic information already exist in a structured form
    exposed as a directory tree (files) and only require the
    discipline of its creator(s) to become better organized to be useful.
    If files|dirs employed the richness of our language (no whitespaces plz)
    and were hierarchically organized and cross-linked in some meaningful way
    by the author|compiler then the framework is already in place on every computer in use
    and if the schema can be adopted by users who only had to 're-think' the
    organization of their own libraries then i could foresee getting semantic value from
    spidering a site and extracting info out of the parsed directory, each file's name,
    its place in the dir hierarchy in addition to its meta statements, headers, etc.
    Obviously (2me) this works best with static files (docs) more than dynamic content
    but this would be no obstacle if the dir contained an informative .dir file and the
    server-parsed files (the cgi files) were properly named and required/included files
    used different extensions.

    I could foresee being able to create relationships between info from different trees
    combine them in incredibly complex ways and render the result sets in a variety
    of meaningful contexts.
    (/assertion)
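    a rough sketch of what a spider could pull out of a single well-named path, assuming names use '_' or '-' instead of whitespace. the predicate names (narrowerThan, filedUnder, hasKeyword) and the example path are invented here, not an existing schema:

```python
from pathlib import PurePosixPath


def path_to_triples(path):
    """Turn one file path into (subject, predicate, object) statements."""
    p = PurePosixPath(path)
    parts = p.parts
    triples = []
    # Each directory is treated as a narrower topic than the one containing it.
    for parent, child in zip(parts, parts[1:-1]):
        triples.append((child, "narrowerThan", parent))
    # The file itself is filed under its immediate directory, if it has one.
    if len(parts) > 1:
        triples.append((p.name, "filedUnder", parts[-2]))
    # Words in the file name (split on '_' and '-') become keywords.
    for word in p.stem.replace("-", "_").split("_"):
        if word:
            triples.append((p.name, "hasKeyword", word.lower()))
    return triples


example = path_to_triples("Library/Cooking/Dairy/homemade_yogurt-recipe.txt")
for triple in example:
    print(triple)
```

    statements collected this way from different trees could then be combined like any other triples.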
    my continuing thought:
    http://dluz.tzo.com:8080/Rion/Blogs/WebDevelopment/