Slashdot Mirror


On Finding Semantic Web Documents

Anonymous Coward writes "A research group at the University of Maryland has published a blog post describing the latest approach for finding and indexing Semantic Web Documents. They published it in reaction to the view of Peter Norvig (director of search quality at Google) on the Semantic Web (Semantic Web Ontologies: What Works and What Doesn't): 'A friend of mine [from UMBC] just asked can I send him all the URLs on the web that have dot-RDF, dot-OWL, and a couple other extensions on them; he couldn't find them all. I looked, and it turns out there's only around 200,000 of them. That's about 0.005% of the web. We've got a ways to go.'"

3 of 67 comments

  1. LiveJournal and other weblogging services by crschmidt · Score: 3, Informative

    Every user of a LiveJournal-based website running recent code has a FOAF file. Let's look at how many users that is:

    * LiveJournal.com: 5751567
    * GreatestJournal.com: 717406
    * DeadJournal.com: 474435
    * Weedweb.net: 22650
    * InsaneJournal.com: 12970
    * JournalFen.net: 7629
    * Plogs.net: 7086
    * journal.bad.lv: 4530

    (This list is most likely incomplete.)

    In addition to this, every Typepad user has an account: according to the 6A merger stories, that's another million users. Add in the RDF from all the Typepad RSS files, and that's another 1 million.

    All Wordpress blogs have a feed, located at /feed/rdf or /wp-rdf.php, which is in RDF. Movable Type comes preinstalled with an RSS 1.0 feed. Each of these has at least a couple thousand users.

    So, we've got, just as a guess, about 9 million RDF files out there in the blogging world alone. Throw in a hell of a lot of scientific data, and everything on RDFdata.org, and you start to get an idea that the world is a lot more Semantic Web enabled than you seem to think it is.
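As a rough check on the guess above, here is a minimal Python sketch that adds up the numbers. The per-site counts are the ones listed in the comment; the two one-million figures for Typepad are the poster's own estimates, and the variable names are made up for illustration. Wordpress and Movable Type are left out because the comment only says "a couple thousand" for each.

```python
# User counts exactly as listed in the comment
# (one FOAF file per user is the poster's claim, not verified here).
livejournal_sites = {
    "LiveJournal.com": 5751567,
    "GreatestJournal.com": 717406,
    "DeadJournal.com": 474435,
    "Weedweb.net": 22650,
    "InsaneJournal.com": 12970,
    "JournalFen.net": 7629,
    "Plogs.net": 7086,
    "journal.bad.lv": 4530,
}

# Poster's estimates for Typepad: FOAF files and RSS (RDF) feeds.
typepad_foaf = 1_000_000
typepad_rss = 1_000_000

lj_total = sum(livejournal_sites.values())
blog_total = lj_total + typepad_foaf + typepad_rss

print(f"LiveJournal-based sites: {lj_total:,}")
print(f"With Typepad estimates:  {blog_total:,}")
```

The listed sites come to just under 7 million, so the "about 9 million" figure holds only if both Typepad estimates are counted in full.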

    --
    -- Christopher Schmidt YouTube Quality of Experience
    1. Re:LiveJournal and other weblogging services by Da_Weasel · Score: 2, Informative

      About 75% of those who signed up for those various blogging services have never actually posted a single entry in their blog. So the actual number is more like 2.2 million or so. Even with a devastating hit like that, it's still 10 times more than the number stated in the article though....lol...and it's still just the bloggers alone.
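The adjustment above can be sketched in a few lines of Python. All three inputs are taken from the thread: the parent's 9 million guess, the 75% inactive share claimed here (no source is given for it), and the 200,000 figure quoted in the story. The variable names are illustrative only.

```python
estimated_rdf_files = 9_000_000  # parent comment's guess
inactive_share = 0.75            # this comment's claim, unsourced
norvig_count = 200_000           # figure quoted in the story

# Keep only the share of accounts assumed to have posted at least once.
active = estimated_rdf_files * (1 - inactive_share)
ratio = active / norvig_count

print(f"{active:,.0f} files from active blogs, {ratio:.2f}x the quoted count")
```

Under those assumptions the result is 2.25 million files, a little over 11 times the quoted count, which is in line with the "10 times" remark.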

      --
      If you must!
    2. Re:LiveJournal and other weblogging services by zangdesign · Score: 2, Informative

      So, we've got, just as a guess, about 9 million RDF files out there in the blogging world alone.

      Care to venture a guess as to how many of those actually contain useful information? Really, who cares if Melanie in Oshkosh really, really loves Justin Timberlake, or Winthorpe in Des Moines really, really wants people to sign up so he can get an iPod?

      Furthermore, once you start tying all this information together, doesn't that make the work of corporate data miners that much easier?

      Of course, you could salt in a bunch of useless, random data, which, of course, means that the whole shooting match is useless.

      --
      To celebrate the occasion of my 1000th post, I will post no more forever on Slashdot. Goodbye.