Slashdot Mirror


Tim Berners-Lee and the Semantic Web

An anonymous reader writes "As we all know, Tim Berners-Lee is the hero of the Web's creation story--he conjured up this system and chose not to capitalize on it commercially. It turns out that Sir Tim (he was knighted by Queen Elizabeth II in July) had a much grander plan in mind all along--a little something he calls the Semantic Web that would enable computers to extract meaning from far-flung information as easily as today's Internet links individual documents. In an interview with Technology Review, the Web-maestro explains his vision of 'a single Web of meaning, about everything and for everyone.'"

13 of 250 comments

  1. You don't want a "single" web... by Pig Hogger · · Score: 3, Insightful
    You don't want a "single" web... You want a multitude of them, and carefully isolate them (beyond normal information reading and referencing).

    This is to insure against a monoculture that is so disastrous in computer circles as demonstrated by the numerous security failings of Windows...

    1. Re:You don't want a "single" web... by JimDabell · · Score: 3, Insightful

      This is to insure against a monoculture that is so disastrous in computer circles as demonstrated by the numerous security failings of Windows...

      Windows executes stuff. The semantic web is just data. Your warnings about a monoculture apply to the semantic web about as much as they apply to text files.

    2. Re:You don't want a "single" web... by JimDabell · · Score: 3, Insightful

      Remember when you couldn't get a virus just by reading an e-mail?

      Yes, and again, the problem is when the stuff that executes has a monoculture. It's not like you see Pine users or KMail users infected by emails carrying Outlook viruses.

  2. Two major problems to a semantic web by levram2 · · Score: 5, Insightful

    The extra work required to put data into a standard data format won't be done. People can't be bothered to make their pages W3C compliant (even Slashdot). The second problem is that a large community can rarely agree on data formats. Look at how many calendar event and news feed formats there are.

  3. Statistical text analysis killed semweb by Ars-Fartsica · · Score: 5, Insightful

    As has been stated many times, content producers will spoof semantic data just like they used to with the META tag...which is why no one uses the META tag anymore. Relevance algorithms take into account link analysis and statistical text analysis to provide a much more truthful representation of what data is there. Sorry Tim.

  4. Not doing it right by vigyanik · · Score: 4, Insightful

    The fact that Tim has been trying for 15 years to sell this idea with little success indicates that his approach is insufficient. He is pitching the idea just like a startup would, giving cool examples and everything. But in practice, all he is doing is proposing and overseeing standards. Developing standards for an idea is not what is required to prove that an idea works. Standards should follow successful technology, not vice versa. You need to have companies that make products professionally and offer complete solutions (i.e. make it work in real-life situations). Doing it even for the very simple example he quotes ("find pictures taken on sunny days") is itself a big, big deal. Perhaps Tim should get involved with companies in this field as an advisor/consultant. You know, there are enough smart people out there who could develop the standards. But there are very few people with his name and recognition who could truly ignite commercial interest in his ideas.

    1. Re:Not doing it right by dubious9 · · Score: 4, Insightful

      Perhaps Tim should get involved with companies in this field as an advisor/consultant.

      Um... he invented the WWW and started the W3C. I'd say he's had some experience with companies as an advisor. Take a look at some of the W3C recommendations and look for corporate involvement.

      But in practice, all he is doing is proposing and overseeing standards.

      That's kinda what the W3C *does*.

      Standards should follow successful technology, not vice versa.

      XHTML, XML, XSLT and a lot of other recommendations started as standards that *later* had robust implementations. Technology that starts without standards is often not fully thought out and awkward, and at worst, proprietary. Waiting for technology before standards will only inhibit interoperability and adoption of the standard.

      The fact that Tim has been trying for 15 years to sell this idea with little success indicates that his approach is insufficient.

      I suppose that it has nothing to do with the fact that it's a tremendously difficult and ambitious project. You're right. Anything that takes 15 years to develop should be scrapped.

      --
      Why, o why must the sky fall when I've learned to fly?
  5. Re:The rest of us call this... by bongoras · · Score: 4, Insightful
    The Semantic web represents relationships between data based on metadata (i.e. data about data). This is a far more powerful way to describe the meaning of data.

    And this is what makes me wonder if this will amount to much more than an interesting research project for grad students. In order for the SemWeb to amount to anything useful, everyone is going to have to include the metadata necessary to integrate their data into the Semantic Web. How's that going to work? Who's going to make it work?

  6. Second System Effect by xleeko · · Score: 4, Insightful

    I've been hearing noise about the semantic web, RDF, and what not for years now, and every time I do, the first thing that pops into my head is "Second System Effect".

    He got lucky once, because he put together some tools that were simple and straightforward enough for people to pick them up quickly, thereby avoiding the fate of the dozens of other hypertext systems going back to the late 1980's.

    Now, like all second systems, he wants to "do it right", over-engineering away all of the things that made the first one take off ...

    Just my opinionated rant ...

  7. Re:Opposing view by Sique · · Score: 3, Insightful

    No, computers don't need meaning to handle data. Computers need syntax and rules for how to act on syntactic structures. The semantic web is founded on the hope that enough syntax thrown at huge amounts of data turns magically into semantics.

    It's based on the assumption that all semantics can be explained by syntax. So far this has not been proven, and all attempts to get there got stuck somewhere and turned into something different, sometimes useful (Chomsky's grammars), sometimes not so useful.

    The semantic web would have to deal with the laziness of people who can't be bothered to write meaningful ALT attributes for tags. It can try to guess at some of the semantics, but it can also easily be fooled. Everyone who has ever tried to use content filters on an internet connection knows what I am talking about. Lots of false positives get rejected and hundreds of questionable sites get let through, because the syntax of a site alone doesn't help with evaluating the semantics (the meaning) of this site.

    --
    .sig: Sique *sigh*
  8. Re:What Does 42 Mean for Privacy? by Allen Zadr · · Score: 4, Insightful
    Ah, but what constitutes privacy but an obscurity of your own behaviors in certain circles.

    That is to say, I may be an item scammer in online gaming realms, or in Diablo, but not in EverQuest. However, I may be one of the most honest people I know in the real world. Perhaps I have a second account that I use to Troll on Slashdot, but otherwise have this account where I try to post insightful information. You have the right to link these things, you may even have the right to link these to real world data like where I work and where I park my car. However, if I jilted someone in Diablo, do I want them to so easily find me and take it out on my car (as some people would)?

    Do I want my employer having instant access to all of my online transactions, regardless of whether I'm on shift or off shift at the time? Individually, these are not things that have been considered something you would even want to 'secure', yet they may be valuable to someone.

    --
    Kinetic stupidity has a new brand leader: Allen Zadr.
  9. Why this is a bad idea - it's a taxonomy by Animats · · Score: 4, Insightful
    The big problem with the so-called "semantic web" is that trying to taxonomize ideas doesn't work very well. Full-text search works much better.

    In the beginning, we had library card catalogs, with their painful attempts to index and cross-reference books. That works well in some areas, typically ones where names of people are significant. Attempts to apply the same approaches to technical papers worked less well.

    There's a very elaborate classification system for patents. When you had to look through patents on paper or microfilm, it was essential. Now that we have full text search, it's used less and less.

    A modern example of this approach is the ACM Taxonomy, a structure into which all computer science can be fitted. (As an exercise, try to put the current Slashdot stories into that taxonomy.) Nobody actually uses that taxonomy to find anything.

    As to data interchangeability, that's a separate issue, and more of a standards one. The big problem for publicly available data is that the cost of encoding the data is borne by different people than those who benefit from the encoding. Many companies don't like having all their product and pricing information easily searchable by price. (Froogle may change this, because Google has so much clout.)

    I've spent some time dealing with public financial reporting. There's opposition to detailed disclosure in a standardized format. Many companies don't want their detailed information to be too easily analyzed. Embarrassing results show up.

    The future is better search engines, not user-created indexing data. As we've painfully learned, a search engine must look at the same data a human reader would, or it will be lied to. Lied to, to the point of uselessness.

  10. Re:Opposing view by Thuktun · · Score: 4, Insightful
    If you'd like an opposing view, make sure to read Clay Shirky's take on the semantic web.

    His writings appear to have some uncorrected logical fallacies.
    Consider the following assertions:
    • Count Dracula is a Vampire
    • Count Dracula lives in Transylvania
    • Transylvania is a region of Romania
    • Vampires are not real
    You can draw only one non-clashing conclusion from such a set of assertions -- Romania isn't real.
    You can conclude the following from those statements:
    • Count Dracula is not real
    • Count Dracula lives in a region of Romania
    I'd like to see the mystery step that combines these to conclude that Romania isn't real; at most, you could say that Romania houses something that isn't real. The conclusion he makes isn't supported by any logic.

    More importantly, these are dumbed-down semantics. The assertion that a fictional character lives somewhere real needs to be qualified that this occurs in a certain set of fictional stories, not real life. The fact that these unqualified statements are represented in this example ontology means that the ontology is insufficient, not that this method isn't useful.
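
    As a rough check of the Dracula example, here is a minimal Python sketch that forward-chains over the four assertions. The triple encoding, the predicate names (`is_a`, `is`, `lives_in`, `region_of`, `lives_in_region_of`) and the two inference rules are assumptions made for illustration; they are not taken from the article or from any RDF/OWL tooling.

```python
# Hypothetical encoding of the four assertions as (subject, predicate, object).
FACTS = {
    ("Count Dracula", "is_a", "Vampire"),
    ("Count Dracula", "lives_in", "Transylvania"),
    ("Transylvania", "region_of", "Romania"),
    ("Vampire", "is", "not real"),
}


def forward_chain(facts):
    """Apply two illustrative rules until no new triples appear.

    Rule 1: (x is_a C) and (C is P)             => (x is P)
    Rule 2: (x lives_in R) and (R region_of S)  => (x lives_in_region_of S)
    """
    derived = set(facts)
    while True:
        new = set()
        for s, p, o in derived:
            for s2, p2, o2 in derived:
                if o != s2:
                    continue
                if p == "is_a" and p2 == "is":
                    new.add((s, "is", o2))
                if p == "lives_in" and p2 == "region_of":
                    new.add((s, "lives_in_region_of", o2))
        if new <= derived:
            return derived
        derived |= new


closure = forward_chain(FACTS)
dracula_unreal = ("Count Dracula", "is", "not real") in closure
dracula_in_romania = ("Count Dracula", "lives_in_region_of", "Romania") in closure
romania_unreal = ("Romania", "is", "not real") in closure
```

    Under these assumed rules the closure contains both conclusions listed above, and no rule can produce a triple saying that Romania is not real; getting there would need an extra rule that none of the assertions supplies.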

    Another example in that article:
    • US citizens are people
    • The First Amendment covers the rights of US citizens
    • Nike is protected by the First Amendment
    You could conclude from this that Nike is a person, and of course you would be right.
    This is even factually incorrect. The First Amendment doesn't actually say anything about US citizens; it restricts the US Congress from certain actions, period, not for certain people.

    Ignoring this, you can make one conclusion and reduce this to the following:
    • the First Amendment covers the rights of people
    • Nike is protected by the First Amendment
    Concluding that Nike is a person from this is a logical fallacy. (Nothing in these logical statements says the First Amendment might not also cover the disposition of small peanut butter sandwiches with blueberry jam, which set Nike might then be an element of.)
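
    The fallacy can be shown with a counterexample model, sketched here in Python with plain sets. The member names and the reading of "covers" as set inclusion are assumptions for illustration only.

```python
# Hypothetical model: everything in `covered` is protected by the First Amendment.
people = {"Alice", "Bob"}      # made-up individuals
covered = people | {"Nike"}    # covers all people, plus something else

# Premise 1: the First Amendment covers the rights of people.
premise_covers_people = people <= covered
# Premise 2: Nike is protected by the First Amendment.
premise_nike_covered = "Nike" in covered
# Claimed conclusion: Nike is a person.
conclusion_nike_is_person = "Nike" in people

# An argument is valid only if no model makes the premises true and the
# conclusion false; this model does exactly that.
argument_refuted = (
    premise_covers_people
    and premise_nike_covered
    and not conclusion_nike_is_person
)
```

    Both premises hold in this model while the conclusion fails, so the inference from "covered" to "person" runs the implication backwards.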

    I find it hard to treat this article with much weight, given its fast-and-loose treatment of logic and ontological assertions.