Slashdot Mirror


Going from a 'Web of links' to a 'Web of meaning'

neutron_p writes "Computer scientists from Lehigh University are building the Semantic Web, which will handle more data, resolve contradictions and draw inferences from users' queries. The new improved Web will also combine pieces of information from multiple sites in order to find answers to questions."

20 of 142 comments

  1. When by cbrocious · · Score: 2, Insightful

    When will we be dropping HTTP and HTML in favor of more metadata-friendly protocols and file formats? I can see huge potential in a system built specifically for getting data out there and linking it all together.

    --
    Disconnect and self-destruct, one bullet at a time.
    1. Re:When by SenatorOrrinHatch · · Score: 1, Insightful

      What on earth makes anyone think a computer has anything to do with "meaning"? That would require understanding, which would require thought, which would require consciousness. Which is exactly what a machine explicitly does NOT have.

      --
      The Christian in me says it's wrong, but the corrections officer in me says, 'I love to make a grown man piss himself.'
  2. Ummm by bo0ork · · Score: 3, Insightful
    The new improved Web will also combine pieces of information from multiple sites in order to find answers to questions

    Sounds like a recipe for disaster to me.

    --
    Does everything include nothing?
    1. Re:Ummm by bobbis.u · · Score: 4, Insightful
      Yeah, I would tend to agree.


      One of the reasons the internet has become so popular is because everyone can have their say. Unfortunately, this has the side effect that there is a lot of incorrect and misleading information out there. Everything is also self-reinforcing, because one person often copies their "facts" from another website without first checking the veracity. Even major news outlets and scientific publications have been caught out by this in the past.

    2. Re:Ummm by ezzzD55J · · Score: 4, Insightful
      'Everything is also self-reinforcing, because one person often copies their "facts" from another website without first checking the veracity'

      There is another way in which it's self-reinforcing. People look for sites and pages and people that reflect their own opinions.

    3. Re:Ummm by jsebrech · · Score: 2, Insightful

      "Webs of trust." People will make pages telling what pages they believe have a good reputation and generally tell the truth.

      That won't work for stuff that's politically sensitive, since people will mod sites down just because they dislike what the site says, even if it is accurate. It also gets really complicated with sites that are accurate on one subject but don't know jack about another.

      Computers will have "beliefs" reflecting their owner's own.

      In that case, what's the point? If your computer only accepts data that fits in with your predetermined conclusions, it will provide valueless results.

  3. Snake oil... by Alomex · · Score: 2, Insightful



    with their favourite mode of publication being the press release.

  4. It's the authoring tools, stupid by Eloquence · · Score: 4, Insightful
    Who is "building the semantic web"? Academics or web authors? The only semantic web technology that has actually gained wide usage in the sphere of user-generated content is RSS, a syndication format (or rather, a bunch of competing syndication formats). The reason for this is that weblog engines like Slash and Movable Type support syndication. This then allowed programmers to create news aggregators and filters.

    The same can be said about any semantic web technology - whether it's FOAF (an RDF vocab for describing people and their interests) or a vocabulary for reviews. As soon as major authoring tools (i.e. both web editors and content management systems) start integrating these technologies, people will use them if they are useful. Do not expect web designers or bloggers to have a clue about all the great things that the semantic web can do - give them one useful thing which they understand, package it in a pretty UI, and they will start using it.

  5. The semantic Web and valid HTML by ToreTS · · Score: 2, Insightful

    I guess that the Semantic Web would need HTML documents to meet strict requirements when it comes to validation, use of logical instead of physical markup and so on. This could be an incentive for people to use HTML the way it was intended, instead of the crapload of pages that don't close tags, use hundreds of redundant FONT tags, use the H1..H6 elements to control font size instead of using them to indicate headings, and so on. Strangely enough, all "beginner's" HTML books still teach people to code this way.

  6. A lot of work to be done by Mazzaroth · · Score: 2, Insightful
    Semantic web is an amazing idea that will profoundly transform the way we interact with information. But I can see a huge amount of work remaining to be done:
    • We need an ontology that will cover many if not all aspects of human experience. And this experience has been evolving dramatically and will continue to evolve. This ontology is probably a moving target. The task of creating the ontology alone has been, and still is, the holy grail of AI and Knowledge Management.
    • The amount of time we will have to invest in adding metadata to the data will dramatically increase over time. We will need a way to automate the filling of the metadata layer. This is where automatic image recognition and classification, speech to text, text summarizers and meaning extractors kick in (here, Copernic is heading in the right direction). Maybe the librarian profession will be the next hot job...
    • Almost every application will have to adapt and inter-communicate. No big deal, RDF will probably become the new data bus anyway.
    That will be interesting!!!
  7. Re:Resolve Contradictions? by NoTheory · · Score: 2, Insightful

    alright, having read the friggin' article, all i have to say is that they have their work cut out for them.

    the problem with searching currently is that only librarians, who've had at least a year or two of graduate studies really know the ontology that libraries use. Common users bring their own concepts and ontologies to bear when they're searching for information. But if you move away from the monolithic single ontologies that libraries use, you have the problem that you have to be open to the fact that ontologies change, not just between individuals but over time, as cultures change the ontologies need to change as well. I guess the concept must be that there are a set of descriptors which are invariant, and can thus be interpreted based on the features of those objects by different ontologies.

    The crazy part about trying something like that is that you have to make people define their own ontologies. Furthermore you have the problem that you need to make sure that people are describing their data in an ontologically neutral manner.

    And that's the hidden third problem (the technology review article posted above has the dude citing 2 problems), getting people to behave in a sensible way when dealing with information organization. Unfortunately in so far as we know now, it's really difficult to get computers to automatically create meta-data (that doesn't mean we're not trying), but primarily humans have to be included in the decision process if you want to define what things are.

    the ironic thought that pops to mind is that if you've got a set of universal descriptors, then don't you already have an ontology? And if you don't have a set of universal descriptors, how would you ever create a coherent ontology?

    anyway, enough rambling for now

    -notheory

    --
    There are lives at stake here!
  8. Welcome to 2001 by the_demiurge · · Score: 2, Insightful

    Hasn't everyone heard of this already?
    W3C semantic web activity from 2001.
    Heflin's Thesis from 2001.

    I'm rather skeptical of the whole thing, it seems to me to be like "Wouldn't it be nice if people documented their web page content better? Then we could do all these neat things." The second statement is right, but I fear the first statement is intractable.

  9. I have my doubts... by ngunton · · Score: 4, Insightful

    It seems to be a common mistake for computer scientists to think that it's possible to make systems that "understand" the world (both real and abstract knowledge), with all its complexity and ambiguity, in the same way that humans do. I feel that there is a fundamental difference between using computers to enable humans to organize stuff, and having computers automatically do it. Every single attempt at getting computers to be "smart" about inferring human intentions has ended up as an irritating impediment to using the system - look at clippy, Bob, "intelligent" voice systems that try to "help" you by stopping you from talking to a real person... what computers are very, very good at is amplifying and enabling human intelligence. Computers are not themselves intelligent, and (my personal opinion) I don't think they ever will be - unless we manage to "grow" them using processes that we probably won't fully understand. You can't construct something that is as complex as the human mind through deterministic (i.e. consciously designed architectural) means - all you'll end up with, at best, is a very complex rule inference engine that is limited by the rules you gave it. Every "holy grail" of intelligent programming that has come along - neural nets, genetic programming etc - has turned out to be very limited (though very useful in special situations).

    I also feel that talking about automatically organizing the world's knowledge in a semantic web is just more of the same hot air that we've been hearing from AI departments for the last few decades. You can't automatically allocate meaning to something unless you have the capability for "common sense" reasoning, and the world knowledge at your fingertips to be able to interpret the data intelligently, like a human would. And even then, different humans would interpret it differently... so there are multiple meanings, and anyway, how to allocate "meaning" to something abstract such as a poem or piece of art?

    And if we require real people to add metadata to everything... well, it just ain't going to happen, in my humble opinion. Adding meta data is a pain in the ass, since you have to define the categories of object, agree on meanings for all the different taxonomies that will have to be used to describe the world... then there's the potential for abuse, as spammers will inevitably seed their documents with inappropriate metadata. So, the "honest" people can't be bothered, and the dishonest people will wreck anything that does get built. So, it ain't gonna happen.

    The beauty of google (not that I love google, but they did hit a nail on the head) is that it requires no effort or "machine intelligence", beyond a very simple algorithm that depends not on AI but rather real, tangible relationships between words and documents (proximity and links). This is something that computers can be really good at.
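    To make "links, not AI" concrete, here is a minimal sketch of a link-based score computed by power iteration, in the spirit of PageRank. It is a textbook simplification and not Google's actual algorithm; the tiny link graph, the function name link_rank and the damping value are all assumptions made up for illustration.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages purely from who links to whom (simplified PageRank-style iteration).

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b and d all link to c; c links back to a.
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = link_rank(LINKS)
best = max(ranks, key=ranks.get)
print(best)
```

    Nothing in there knows what any page is about. The most linked-to page simply ends up with the highest score.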

    Just my opinion... obviously there will be others out there who will vehemently disagree, and that's fine! Go ahead and try, you'll learn a lot in the process and you will probably come out with some tangential technology that you never thought of initially but is useful nonetheless.

    1. Re:I have my doubts... by DrEasy · · Score: 2, Insightful
      The beauty of google (not that I love google, but they did hit a nail on the head) is that it requires no effort or "machine intelligence", beyond a very simple algorithm that depends not on AI but rather real, tangible relationships between words and documents (proximity and links). This is something that computers can be really good at.

      And that's the curse of AI right there. Because you happen to know the algorithm underneath Google, you don't think of it as "intelligent". But to the average Joe it can certainly seem that way.

      We used to say that the day a chess program could beat a human, it'd be proof that machines can be intelligent. But now that we know how to build such a system it has lost its magic, and therefore shouldn't count as AI?

      --
      "In our tactical decisions, we are operating contrary to our strategic interest."
  10. obscured by the cloud by Doc+Ruby · · Score: 2, Insightful

    Meaning is always "in context". Human communication always requires a "transmitter -> medium -> receiver" structure. Some say the universe is fundamentally structured on that model. When these semantic systems are overlaid on content, there are always these slippery, unresolvable mismatches of "intent" and "understanding", those "semantic arguments" that drive likeminded people crazy. Content searching is extremely powerful, without creating the "cracks" into which meanings can irretrievably fall. As long as there are alternative semantic indices to content still available "raw", semantics will just help. When we move to wrap all content entirely in semantics, we'll live in the "map is not the territory" problem forever. Ask CORBA programmers and EU language translators about the death of meaning by means of the dictionary. If we need to add semantics as a tool, we still get under the hood at the actual content.

    --
    make install -not war

  11. Like when I type "Unicycle Jousting" by briancnorton · · Score: 2, Insightful
    And I get 200 ads for herbal viagra, 300 Nigerians that have inherited 15 MILLON USDOLLARS, and deviant pornography.

    A semantic web is only as useful as the metadata, and people go to great lengths to mislead and disguise.

    --

    People who think they know everything really piss off those of us that actually do.

  12. Representation of meaning is not the problem by kubalaa · · Score: 3, Insightful

    Semantic Web is the most ridiculous idea I've ever heard. The problem with meaning isn't representation -- English represents meaning just fine. The problem is meaning itself -- it doesn't matter if you figure out a way to encode it in some XML language, for every bit that it's easier for computers to use, it will carry that much less meaning.

    Another way of putting it is, any program capable of extracting the same meaning from XML that humans can, should be able to understand English without much trouble. It's the whole "Intelligence-complete" thing. Like NP-complete, there seems to be a class of problems which can only be solved by real intelligence, and they're all pretty much equivalent in that with real intelligence, you can solve them all.

    --

    "If you look 'round the table and can't tell who the sucker is, it's you." -- Quiz Show

    1. Re:Representation of meaning is not the problem by fugu13 · · Score: 2, Insightful

      This got insightful?!

      Let's take a look at English, shall we?

      "Milk costs five dollars."

      "Milk always costs five dollars."

      "Milk's price is five dollars."

      "Isn't it cool that milk costs that low, low price of five dollars?"

      "I am so gosh-darn happy that I can obtain the glorious bounty of milk for a mere five (count 'em, one-two-three-four-five) bills featuring our esteemed former president, George Washington."

      Now, let's take a look at some possible semantic web statements.

      Milk hasPrice $5

      anonymousItem hasType Milk
      anonymousItem hasPrice $5

      KrogersItem54728 hasType Milk
      KrogersItem54728 hasPrice $5

      Now, the above are slight simplifications for the purposes of conveying the essential ideas (we're not getting into the idea of common vocabularies, though they make relating information far simpler if used. It's a bit too much to explain), but it is amazing that anyone could think that programs which can parse the latter sets of information can parse the former!
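      To show how little machinery the triple form needs, here is a rough sketch that stores the statements above as plain tuples and answers "what do things of type Milk cost?" with wildcard matching. The second item and the helper names match and prices_of are made up for the example, and nothing here is real RDF tooling.

```python
# Hypothetical in-memory triple store; the first item mirrors the simplified statements in the post.
TRIPLES = [
    ("KrogersItem54728", "hasType", "Milk"),
    ("KrogersItem54728", "hasPrice", "$5"),
    ("KrogersItem99999", "hasType", "Bread"),  # made-up second item for contrast
    ("KrogersItem99999", "hasPrice", "$3"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def prices_of(triples, type_name):
    """Map each item of the given type to its stated price."""
    result = {}
    for item, _, _ in match(triples, predicate="hasType", obj=type_name):
        for _, _, price in match(triples, subject=item, predicate="hasPrice"):
            result[item] = price
    return result


print(prices_of(TRIPLES, "Milk"))  # {'KrogersItem54728': '$5'}
```

      Getting the same answer out of the five English sentences would take a natural-language parser, which is the whole point.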

      --
      For to end yet again.
  13. Dependency: web of trust by tunabomber · · Score: 2, Insightful

    Anybody remember the demise of META keywords?

    I think we could run into the same problem with the Semantic Web, as it too allows web developers to attach arbitrary metadata to their pages. The only way to prevent unscrupulous web developers from embedding inaccurate RDF in their pages in hopes of attracting more hits is by establishing a web-of-trust framework.
    Google implements a very crude version of web-of-trust that assumes "incoming hyperlinks==trust". I think that in order for the Semantic Web to be something that is usable by web-wide search engines like Google, we will need a much more robust and fine-grained system of trust. The user should be able to specify some of the entities that they trust and the search engine will deduce the rest.
    However, without an adequate trust framework, the Semantic Web will just be a new fertile ground for keyword spam and search engine "optimization".
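    One way the "user names a few trusted entities and the engine deduces the rest" idea might look, as a sketch only: trust starts at the sites the user picked and decays with every endorsement hop. The site names, the decay factor, the hop limit and the function deduce_trust are all assumptions of mine, not part of any existing trust framework.

```python
def deduce_trust(direct_trust, endorsements, decay=0.5, max_hops=3):
    """Spread trust outward from user-chosen sites along endorsement links.

    `direct_trust` maps sites the user explicitly trusts to a score.
    `endorsements` maps a site to the sites it vouches for.
    Each hop multiplies the score by `decay`; a site keeps its best score.
    """
    trust = dict(direct_trust)
    frontier = dict(direct_trust)
    for _ in range(max_hops):
        next_frontier = {}
        for source, score in frontier.items():
            for target in endorsements.get(source, []):
                candidate = score * decay
                if candidate > trust.get(target, 0.0):
                    trust[target] = candidate
                    next_frontier[target] = candidate
        if not next_frontier:
            break
        frontier = next_frontier
    return trust


# Hypothetical data: the user trusts only alice.example directly.
DIRECT = {"alice.example": 1.0}
ENDORSEMENTS = {
    "alice.example": ["bob.example"],
    "bob.example": ["carol.example"],
    "spam.example": ["spam.example", "carol.example"],  # self-promotion earns nothing
}

trust = deduce_trust(DIRECT, ENDORSEMENTS)
print(trust)
```

    The spam site vouches loudly for itself but nobody the user trusts vouches for it, so it never gets a score. That is the property incoming-link counting alone does not give you.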

    --

    pi = 3.141592653589793helpimtrappedinauniversefactory71 ...
  14. BETTER FORMATTED THAN PARENT by fugu13 · · Score: 2, Insightful
    Silly me, not previewing.

    The World Wide Web cannot "at its core handle inconsistent information" yet it seems to lurch along okay.

    The Semantic Web is not some attempt at global knowledge, perfect knowledge, perfect reasoning, or anything of the sort, regardless of what many posters, including yourself, seem to have construed it as.

    It is intended to be an analogue of the World Wide Web, which is primarily consumed by humans, that is instead primarily consumed by computers.

    Can it know everything? Of course not! But it can make it so computers "understand" a heck of a lot more than they do today.

    For instance: right now an everyday computer (or more accurately, the web browser) "understands" that (absent styling) a tag is presented in a certain way. The Semantic Web wants to make it so the triple GallonOfMilk hasPrice $1.25 (this would actually be expressed in several triples about a product with a certain id, probably, but you get the idea) can be "understood" by a program in the same way across multiple sources.

    Just as a person does not automatically assume a site is an absolute authority on the price of milk, semantic-web-enabled programs would not assume that this information was absolute (nor would it likely be presented as such). However, imagine how powerful it would be if you could give your browser the address of the RDF interfaces for local grocery stores (or it might autodiscover them at least in part), and then it would find out what the price of milk (and other groceries) is at each one of them.
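    A rough sketch of that grocery scenario, assuming every store publishes triples in one shared vocabulary. The store addresses, the prices other than $1.25, and the helpers collect_prices and cheapest are invented for illustration. Each price stays attached to the source that claimed it, so nothing is treated as absolute.

```python
from decimal import Decimal

# Hypothetical feeds: each store address maps to the triples it publishes.
SOURCES = {
    "store-a.example": [
        ("GallonOfMilk", "hasPrice", "$1.25"),
        ("LoafOfBread", "hasPrice", "$2.10"),
    ],
    "store-b.example": [("GallonOfMilk", "hasPrice", "$1.10")],
    "store-c.example": [("GallonOfMilk", "hasPrice", "$1.40")],
}


def collect_prices(sources, product):
    """Gather each source's claimed price for a product, keyed by source."""
    prices = {}
    for source, triples in sources.items():
        for subject, predicate, obj in triples:
            if subject == product and predicate == "hasPrice":
                prices[source] = Decimal(obj.lstrip("$"))
    return prices


def cheapest(sources, product):
    """Return (source, price) for the lowest claimed price, or None if nobody lists it."""
    prices = collect_prices(sources, product)
    if not prices:
        return None
    source = min(prices, key=prices.get)
    return source, prices[source]


print(cheapest(SOURCES, "GallonOfMilk"))
```

    Adding a fourth store means adding one more entry to SOURCES, with no new scraper to write.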

    That sort of thing is already possible today without the Semantic Web (or other semantic frameworks), but only with methods that either require heavy lifting on the part of the client system (such as web scraping every grocery store site, killing extensibility and easy implementation) or aren't cross-domain (perhaps I want to chart the price of milk (from some milk-price-archive) vs real dollar value -- now my client has to understand two possibly very different ways of presenting information, not just one integrated way).

    The Semantic Web (and associated technologies) is an enabling framework that frees programmers from doing a lot of the heavy lifting involved in discovering meaning and relating meaning, just as SQL is an enabling framework that frees programmers from doing a lot of the heavy lifting involved in storing data and relating data.

    --
    For to end yet again.