Slashdot Mirror


The Semantic Web Going Mainstream

Jamie found a story about a new web tool that is trying to break ground in the semantic web. It's called Twine, and it supposedly will intelligently aggregate your data, be it YouTube videos, emails, or whatever you accumulate in your travels. Not the first, not the last, but here's hoping something comes out of the ideas someday.

110 comments

  1. Sorry, but it's not for me. by AltGrendel · · Score: 4, Insightful
    Twine is a website where people can dump information that's important to them, from strings of e-mails to YouTube videos.

    I really don't like this idea. One good hack from the Russian MAFIA and the game would be over. All your eggs are belong to us, as it were.

    --
    The simple truth is that interstellar distances will not fit into the human imagination

    - Douglas Adams

    1. Re:Sorry, but it's not for me. by cHiphead · · Score: 0, Flamebait

      Anyone that uses the word 'semantic' with 'web' should be pointed at and laughed at, then perhaps hit in the face with a brick. Keep trying, marketeers, you'll find a new way to game your hits and get another easy payday sooner or later. Nubs.

      Cheers.

      Maybe I'm just getting too old for this 'internets' thing.

      --

      This is my sig. There are many like it, but this one is mine.
    2. Re:Sorry, but it's not for me. by butterwise · · Score: 5, Funny

      Sorry, but it's not for me.
      Anti-semantic...
      --
      If a baby duck is a "duckling," why would anyone want to eat "dumplings?"
    3. Re:Sorry, but it's not for me. by UbuntuDupe · · Score: 5, Funny

      Okay, that is *not* fair. The Semantic Web would have untold benefits for humanity. For example: if you wanted to find out which Major League batter had the most RBIs in 1997, you would have to spend three -- perhaps four -- minutes learning how to use an internet search engine.

      With the Berners-Lee Semantic Web(tm), however, you would just type in "which Major League batter had the most RBIs in 1997?"

      (Of course, most search engines will already pick out the relevant terms even if you typed that question in, but that doesn't count because they don't do it *intelligently*.)

    4. Re:Sorry, but it's not for me. by Colin+Smith · · Score: 3, Insightful

      Mmm. Nah. The Semantic Web is most useful to remove humans from the loop completely. i.e. When Skynet wants to know which batter had the most RBIs in 1997 it will be able to understand from the XML DTD what a batter is and how that relates to RBIs...

      So... What's an RBI then when it's at home?

      --
      Deleted
    5. Re:Sorry, but it's not for me. by CRCulver · · Score: 2, Insightful

      Anyone that uses the word 'semantic' with 'web' should be pointed at and laughed at, then perhaps hit in the face with a brick. Keep trying, marketeers, you'll find a new way to game your hits and get another easy payday sooner or later.

      The driving force behind semantic web research is Sir Tim Berners-Lee, hardly a marketer just trying to get rich from buzzwords. He's an academic, and it's precisely in academic circles that the semantic web is already a reality. Just see Visualizing the Semantic Web ed. Geroimenko & Chen (Springer-Verlag, 2nd ed. 2005). It shows several projects where the concepts of the semantic web were used to great effect.

      If anything, marketers have been staying away from the semantic web, since developing semantic web technologies requires the sort of learning and practising that costs money, when employees could just trot out the same FUBARed markup and table-based layouts that they are already used to.

    6. Re:Sorry, but it's not for me. by DocDJ · · Score: 1

      it's precisely in academic circles that the semantic web is already a reality

      But this is precisely the problem with the semantic web - it's not grounded in reality (apologies for the poor logic pun). I'm not going to indulge in academic-bashing, having spent a fair amount of time wandering up and down ivory towers in my time, but the semantic web does not measure up to even a cursory cost-benefit analysis. It provides very little qualitative benefit over current search - especially for content providers, who unfortunately are the people it requires a huge commitment from.
    7. Re:Sorry, but it's not for me. by CRCulver · · Score: 1

      So the academic world isn't a reality? When my work requires I pull together several sources of information and transform them into a visualization, that's not reality?

    8. Re:Sorry, but it's not for me. by vertinox · · Score: 1

      One good hack from the Russian MAFIA and the game would be over.

      What is more likely?

      A large company doesn't keep backups.
      A person like you or me doesn't keep backups.

      Now this varies from person to person, but I would usually bet a company would put far greater resources into backing up data. Of course, I have been unpleasantly surprised before on this matter.

      Or for that matter... Wouldn't it be easier for the Russian Mafia to hack your average unsecured Windows computer and blow away your data that way?

      --
      "I am the king of the Romans, and am superior to rules of grammar!"
      -Sigismund, Holy Roman Emperor (1368-1437)
    9. Re:Sorry, but it's not for me. by Tim+C · · Score: 2, Insightful

      Or for that matter... Wouldn't it be easier for the Russian Mafia to hack your average unsecured windows computer and blow away your data that way?
      But why would they?

      I'm insignificant, a nobody, and no target at all. A site used by hundreds of thousands or even millions, on the other hand - now that's a target. A well-defended target you'd hope, but a target nonetheless.
    10. Re:Sorry, but it's not for me. by Eli+Gottlieb · · Score: 1

      What's he going to do? Make web-pages wear yellow semantic markup?

    11. Re:Sorry, but it's not for me. by darthflo · · Score: 1

      Question: Where does the Mafia come into this, and why would they want to have, destroy or hold hostage bits of random people's online memories? I may not be one of them (maybe I am but can't say so 'cause they'd kill me if I did), but I'd suspect their interests are more along the lines of smuggling and distributing things, ending unfriendly people's lives and so on.

    12. Re:Sorry, but it's not for me. by UbuntuDupe · · Score: 0

      I don't think he was being anti-semantic, he was just saying that newish tag systems should be segregated off into their own neighborhoods, separate from the rest of the internet, and possibly flagged with a special star.

    13. Re:Sorry, but it's not for me. by Anonymous Coward · · Score: 0

      For visualizations there is PowerPoint.

      For everything else there's Visa.

    14. Re:Sorry, but it's not for me. by Marin3 · · Score: 0

      I'd rather use Google Notebook for my notes than Twine

    15. Re:Sorry, but it's not for me. by stuuf · · Score: 1

      Why? Because anything made by Google is automatically better than any potential competitor?

      --

      Everyone is born right-handed; only the greatest overcome it

    16. Re:Sorry, but it's not for me. by yoder · · Score: 1

      "So the academic world isn't a reality?"

      Apparently not. I missed that memo as well.

      --
      "In a time of universal deceit, telling the truth is a revolutionary act!" -- George Orwell (Eric Arthur Blair)
    17. Re:Sorry, but it's not for me. by Marin3 · · Score: 0

      I believe Google is doing a good job in the data mining field and in the area the story concerns.

    18. Re:Sorry, but it's not for me. by Anonymous Coward · · Score: 0

      Actually, it's more like all your base are belong to russia :)

    19. Re:Sorry, but it's not for me. by Anonymous Coward · · Score: 0

      There is certainly a clear and significant flaw in current semantic web concepts. I'm trying to get some patents before going public on this, but I can say there are things the current approaches cannot do because they're missing something very important, but achievable with a different philosophy. XML, RDF, OWL, all have a fundamental flaw in their approaches. A lot of VC money currently being thrown at the semantic web will probably be wasted because people are going down the wrong path. But there is a right one, and it can enable things you just wouldn't believe are possible. Stay tuned.

    20. Re:Sorry, but it's not for me. by DocDJ · · Score: 1

      Hmmm, either I meant (a) "the academic world isn't a reality" (which would clearly be a ludicrous claim), or (b) something else. Guess what? It's (b). The semantic web is not grounded in the real world in that it does not take into account that real people (as opposed to the kind of people that exist in research funding proposals) generally make decisions about how much time or money they invest in something based on the benefit they expect to obtain from that effort. It just seems hugely unlikely that people are going to decide to use "semantic" mark-up (by which I mean mark-up that is grounded in some formally-defined ontology which has been committed to by some community) when there seems very little benefit in them doing so. Will it improve their page-rank? Not at the moment. Some people are trying to bootstrap the semantic web by automatically annotating existing bodies of knowledge (e.g. Wikipedia). The obvious problem with this is that there is no way it's going to be of sufficient quality (current NLP just isn't good enough) and this lack of precision completely wipes out the claimed advantage of the semantic web, which is that semantic annotations will bring increased quality of search results for more specific kinds of search queries.

    21. Re:Sorry, but it's not for me. by BenoitRen · · Score: 1

      The web is HTML, not XML.

    22. Re:Sorry, but it's not for me. by serge587 · · Score: 1

      define "intelligently"... it could be argued that "[picking] out the relevant terms" is intelligence... of sorts.

    23. Re:Sorry, but it's not for me. by YouMakeMeSoANGRY · · Score: 1

      Actually, if it is anything, the web is HTTP.

  2. no ads please by randuev · · Score: 4, Informative

    http://www.technologyreview.com/printer_friendly_article.aspx?id=19627 for those who don't want the ads
    but even without ads the article is very shallow. how is it "semantic" web exactly?

    1. Re:no ads please by Intron · · Score: 2, Informative

      The W3C part is to add semantic information to web pages and other data so that you can use it in multiple applications (like twine, I guess). Right now, data I get from a web page is only good for me to look at, but with semantic markup I could automatically import it into other uses.

      An example would be going over my finances at the end of the month. Right now I get either a paper statement, or log into each account, and then copy numbers over to Quicken. This would allow me to set up Quicken to automatically log in to all my accounts, balance my checkbook and generate a report of income and expenses for the month. It sounds good in principle, but I think the devil is in the details.
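
A minimal sketch of the month-end idea above, assuming each bank published statements in one agreed machine-readable format. The JSON schema, the field names, and the `monthly_report` helper are all hypothetical, made up for illustration; no real bank or Quicken interface is implied.

```python
import json
from collections import defaultdict

# Hypothetical machine-readable statements, as two institutions might publish
# them if they agreed on field names. The schema is an assumption.
STATEMENTS = """
[
  {"account": "checking", "transactions": [
    {"date": "2007-10-01", "amount": 2500.00, "category": "salary"},
    {"date": "2007-10-03", "amount": -800.00, "category": "rent"}
  ]},
  {"account": "credit-card", "transactions": [
    {"date": "2007-10-05", "amount": -42.50, "category": "groceries"}
  ]}
]
"""


def monthly_report(raw, month):
    """Total income and expenses for one month (YYYY-MM) across all accounts."""
    totals = defaultdict(float)
    for statement in json.loads(raw):
        for tx in statement["transactions"]:
            if tx["date"].startswith(month):
                # Positive amounts count as income, negative ones as expenses.
                kind = "income" if tx["amount"] > 0 else "expenses"
                totals[kind] += tx["amount"]
    return dict(totals)


print(monthly_report(STATEMENTS, "2007-10"))
```

The arithmetic is trivial once the fields mean the same thing everywhere; getting every institution to agree on them is where the devil in the details lives.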

      --
      Intron: the portion of DNA which expresses nothing useful.
    2. Re:no ads please by foobsr · · Score: 1

      how is it "semantic" web exactly?

      Presumably they want to research how to trick people into providing markup/classification (that current 'AI' with a lack of reading comprehension/natural language competence (ever come across a decent 'automagic' translation?) fails to deliver) and sell the results for corporations to use. Seen this way, it is a cornerstone to the advancement of wealth creation, adding an exciting new semantic dimension to 'Rich Web Clients' (of Spivack).

      CC.

      --
      TaijiQuan (Huang, 5 loosenings)
    3. Re:no ads please by Anonymous Coward · · Score: 0

      1) The reason why people laugh when they hear "semantic web" is because no one has really pulled it off. Most of the junk I see is barely Web 2.0!
      2) This is a pretty weak example of a "semantic web" app. More Web 2.0.
      3) IMHO, everyone is still too browser-centric. The "semantic web" is meant to be much more.

    4. Re:no ads please by Anonymous Coward · · Score: 0

      1. Your financial data.
      2. Key logger (or other means to discover our passwords).
      3. ??!
      4. Your loss is my PROFIT.

      This is like putting MS Office online with a "Hack Me" sign hanging from Clippy's neck.

    5. Re:no ads please by Anonymous Coward · · Score: 0

      So how did semantic web add any new problem? I already have online banking and stock account.

  3. Terms and Conditions by Mateo_LeFou · · Score: 2, Insightful

    "access and use the Site and electronically copy, (except where prohibited without a license) and print to hard copy portions of the Site Materials for your informational, non-commercial and personal use only"

    Can't use their service for commercial purposes; how mainstream can it be?

    --
    My turnips listen for the soft cry of your love
    1. Re:Terms and Conditions by Anonymous Coward · · Score: 0

      In fact, the new bubble is about getting the users to give the companies their entire information.

      But anyway, I disagree that this thing is novel. Google is already doing this kind of data aggregation with much less noise, and probably with much more success. Just take a look at what kinds of things this tool aggregates and you'll see that Google is a mainstream provider of services in all of that.

  4. news? by Starturtle · · Score: 0, Troll

    Not the first, not the last, not ground breaking...not news worthy.

  5. Two formats for all data! by Anonymous Coward · · Score: 0
    From Wikipedia:


    "Another criticism of the semantic web is that it would be much more time-consuming to create and publish content because there would need to be two formats for one piece of data: one for human viewing and one for machines. With this being the case, it would be much less likely for companies to adopt these practices, as it would only slow down their progress."

    Overall though I like the idea!

  6. not strong enough by Anonymous Coward · · Score: 3, Funny

    Sorry folks, but twine just isn't gonna cut it. We need something sturdier. Someone needs to start a similar project called 'ducttape'.

  7. Yes, like in... by ioshhdflwuegfh · · Score: 1
    From the site:

    Twine is a website where people can dump information that's important to them, from strings of e-mails to YouTube videos.

    And then the web site does the semantic part... Revolutionary or what?
  8. Attack of the Misunderstood Acronyms! by the+linux+geek · · Score: 4, Informative

    "Written with the Semantic Web Standards, called W3C, in mind."

    Yikes. That's horrible.

    1. Re:Attack of the Misunderstood Acronyms! by southpolesammy · · Score: 1

      The write-up may be a bit confusing, but I'm pretty sure they're referring to the Semantic Web as an extension of the web as proposed by W3C director Sir Tim Berners-Lee, and not trying to hijack the acronym for themselves.

      --
      Rule #1 -- Politics always trumps technology.
    2. Re:Attack of the Misunderstood Acronyms! by MrMunkey · · Score: 2, Insightful

      Even better... On their site they say

      Twine is one of the first mainstream applications of the Semantic Web, or what is sometimes referred to as Web 3.0.

      http://www.twine.com/about and there's a great section about Web 3.0 here

      It's great for a laugh... until you realize that by this time next year we'll probably be on Web 10.0

    3. Re:Attack of the Misunderstood Acronyms! by Workaphobia · · Score: 1

      There's also a Web Ontology Language whose acronym is OWL. I propose coining the word "Anagrym" for such instances.

      --
      Evidently, the key to understanding recursion is to begin by understanding recursion. The rest is easy.
    4. Re:Attack of the Misunderstood Acronyms! by nuzak · · Score: 1

      Anagrym. I love it. OWL and ISO (International Organization for Standardization) would be examples of TALs: Three Letter Anagryms.

      --
      Done with slashdot, done with nerds, getting a life.
    5. Re:Attack of the Misunderstood Acronyms! by dedalus2000 · · Score: 1

      don't be so critical it's not bad for machine generated English language text.

      --
      My keyboads not woking popely.
  9. Relevant by Gothmolly · · Score: 2, Interesting
    I know we're all supposed to be lubed up over "Web 2.0" and "blogging" and "social networking", but for the Internet users out there who AREN'T 15 year old emo kids, how is "the semantic web" relevant?

    /yes, there's a whiff of irony about posting this to Slashdot
    //Slashdot = old Usenet
    ///rn FTW
    --
    I want to delete my account but Slashdot doesn't allow it.
    1. Re:Relevant by Anonymous Coward · · Score: 1

      As an internet user you may find the idea useless, but as a computer scientist you can get a master's degree working with the semantic web, RDF, etc. Since the idea is fuzzy and not practical, you can publish, go to conferences, and become an associate professor.

    2. Re:Relevant by Anonymous Coward · · Score: 2, Insightful
      u mean matrix lovin kids:

      Here's a futuristic tailored semantic search example!

      Semantic search:#> sem.search s1 = Kung_Fu +online_course +display=practical Layout +embedded=video +moves=kicking +armchair=relaxed_position + muscles.linked(search(0)) + 3d.harness=on +holographic.image_projector=on +environment=O2;
      s1;
    3. Re:Relevant by Arthur+B. · · Score: 2, Insightful

      The search engines are currently still mostly syntactic. Look for a word, see pages matching that word, in a more or less relevant order... This means you have to play trick games with search engines in order to find what you want...

      Imagine you could simply query things like: Find me an appointment with a dentist that takes my insurance, has good ratings and is located near where I live. From your personal information (your calendar, where you live) and public information (consumer ratings on the dentists, maps, information from the dentist's office, from your insurance etc.) a semantic web search engine could provide you with an answer.

      All it takes is for the data published on the internet to be *structured*
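
A rough sketch of why structure matters for the dentist query above: once the records share fields, the "search" is just a filter. The `Dentist` record, its field names, and the sample data are invented for illustration and do not come from any real directory or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dentist:
    # Hypothetical structured record; every field name is an assumption.
    name: str
    insurers: frozenset
    rating: float        # consumer rating, 0 to 5
    distance_km: float   # distance from the user's home


DENTISTS = [
    Dentist("Dr. A", frozenset({"Acme Health"}), 4.6, 2.0),
    Dentist("Dr. B", frozenset({"Other Insurer"}), 4.9, 1.0),
    Dentist("Dr. C", frozenset({"Acme Health"}), 3.1, 0.5),
    Dentist("Dr. D", frozenset({"Acme Health", "Other Insurer"}), 4.2, 12.0),
]


def find_dentists(dentists, insurer, min_rating, max_km):
    """Return dentists that accept the insurer, rate well and are nearby."""
    matches = [
        d for d in dentists
        if insurer in d.insurers
        and d.rating >= min_rating
        and d.distance_km <= max_km
    ]
    # Closest first.
    return sorted(matches, key=lambda d: d.distance_km)


print([d.name for d in find_dentists(DENTISTS, "Acme Health", 4.0, 5.0)])
```

The same question asked of free-text pages would need a program to guess which number is a rating and which is a street address; here the fields say so.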

      --
      \u262D = \u5350
    4. Re:Relevant by psykocrime · · Score: 1

      Imagine you could simply query things like: Find me an appointment with a dentist that takes my insurance, has good ratings and is located near where I live. From your personal information (your calendar, where you live) and public information (consumer ratings on the dentists, maps, information from the dentist's office, from your insurance etc.) a semantic web search engine could provide you with an answer.

      Glad to see somebody here "get it" when it comes to the Semantic Web.

      --
      // TODO: Insert Cool Sig
    5. Re:Relevant by ColdWetDog · · Score: 1

      All it takes is for the data published on the internet to be *structured*

      All it takes is for the data published on the internet to be ** STOLEN **

      There, I fixed it for you. In your specific example, you have:

      - your demographic information
      - your insurance information
      - the dentist's schedule

      and other bits and pieces exposed. Ask yourself. Do you want to go there?

      --
      Faster! Faster! Faster would be better!
    6. Re:Relevant by Workaphobia · · Score: 1

      I'm relatively certain that most fifteen year old "emo kids" don't hold computer science masters degrees, running from conference to conference to give talks on changing the nature of computing using the web, in-between friending classmates on facebook.

      --
      Evidently, the key to understanding recursion is to begin by understanding recursion. The rest is easy.
    7. Re:Relevant by Workaphobia · · Score: 1

      >> "All it takes is for the data published on the internet to be *structured*"
      > "All it takes is for the data published on the internet to be ** STOLEN **"

      Careful, I heard somewhere that if you publish information to this thing called the web, other people can see it too!

      Jeez, it's relatively straightforward to only make available information that you *want* the world to see. If you don't want your mother's maiden name to be public information, take it off your homepage/blog/profile. The only difference here is that information which was *already available* is now more readily accessible.

      --
      Evidently, the key to understanding recursion is to begin by understanding recursion. The rest is easy.
    8. Re:Relevant by Arthur+B. · · Score: 1

      You don't need to expose that information to the whole world, only to the search engine or the application doing the matching. Of course, the search engine could be compromised or it could itself abuse your information. Still, you remain free not to use it: you can ask to see only public information and match it with your agenda yourself, or you could use local software. Eventually, strong encryption and reputation mechanisms should enforce better private information security. For example, you could put your info in a very secure, trusted information vault and provide third-party applications with temporary keys to access limited parts of this information.

      --
      \u262D = \u5350
    9. Re:Relevant by maharg · · Score: 1

      psykocrime, I totally agree. The possibilities of semweb are really quite stunning. A few months back I posted a semweb 'use case' at http://slashdot.org/comments.pl?sid=233445&cid=18993283

      What's exciting about twine is that it appears to be based on W3C standards (RDF/OWL et al), but doesn't require knowledge to be engineered. Can't wait to see where this goes.. :o)

      --

      $ strings FTP.EXE | grep Copyright
      @(#) Copyright (c) 1983 The Regents of the University of California.
    10. Re:Relevant by heinousjay · · Score: 1

      This is a good topic for exposing the people who simply disparage everything they don't understand.

      --
      Slashdot - where whining about luck is the new way to make the world you want.
    11. Re:Relevant by Gothmolly · · Score: 1

      And all it takes for World Peace is for people to be "rational".

      The kind of order you're seeking to impose... it's impossible.

      --
      I want to delete my account but Slashdot doesn't allow it.
    12. Re:Relevant by Marillion · · Score: 1

      I'm consulting at the biomedical informatics department of a major midwestern pediatric hospital. We're in the chase trying to make semantic web work. In short, we're focused on the data. There are at least six different well-known formats for representing Subject-Predicate-Object, and the temptation is to get hung up on the markup and forget "It's the data, stupid."

      There's an old saying: Astronomy isn't about telescopes. Of course astronomy would be severely crippled without telescopes; the goal of astronomy is to study celestial objects rationally and scientifically.

      I was trying to explain semantic web (or at least how we are trying to use it) to a group of college professors where I did my undergraduate studies. We're constructing huge lists of subject-predicate-object "phrases." These lists come from multiple sources with poor connectivity across different sources. We then run algorithms against those lists to synthetically and logically derive new phrases. This allows us to connect knowledge from animal studies with disease knowledge.
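
To make the subject-predicate-object idea above concrete, here is a toy sketch of deriving a new phrase from two sources. The triples, the predicate names, and the single rule are hypothetical and chosen only for illustration; they are not the hospital's actual data or algorithms.

```python
# Phrases from two hypothetical sources, as (subject, predicate, object).
TRIPLES = {
    # From a cross-species gene mapping.
    ("human_gene_X", "ortholog_of", "mouse_gene_X"),
    # From an animal study.
    ("mouse_gene_X", "associated_with", "heart_defect"),
}


def derive(triples):
    """Apply one hand-written rule until no new triples appear.

    Rule (an assumption for this sketch): if A is an ortholog of B and
    B is associated with C, then A is a candidate for C.
    """
    known = set(triples)
    while True:
        new = {
            (a, "candidate_for", c)
            for (a, p1, b) in known if p1 == "ortholog_of"
            for (b2, p2, c) in known if p2 == "associated_with" and b2 == b
        } - known
        if not new:
            return known
        known |= new


for triple in sorted(derive(TRIPLES)):
    print(triple)
```

Neither source states the derived phrase on its own. It only appears once both lists use the same identifier for the mouse gene, which is exactly where mismatched definitions across sources get in the way.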

      I say we're in the race to make semantic web work because classic problems of IT always impede progress. It's amazing how the same thing can have different meanings depending upon context. As an example, I used to work at an airline. They had at least three different definitions of where and what the city of Cincinnati was. These included: 1) where the airport was (flight operations), 2) where the city center was (frequent flyers), 3) where the population center was (marketing) ... you get the idea. If you've ever had to do data warehouse ETL (Extract, Transform, and Load), you know these problems well. These "impedance mismatches" affect the ability to connect information.

      --
      This is a boring sig
  10. In Soviet Russia... by Facetious · · Score: 0, Troll

    ... Russian Mafia eggs you!

    --
    Let us not become the evil that we deplore.
  11. "Fighting the hype problem"? by Red+Flayer · · Score: 4, Insightful
    FTA (second page):

    It's still too early to know if Twine will be successful with consumers, says Tony Shaw, president of Semantic Universe, an organization committed to raising awareness of semantic technologies in business and consumer settings. Success will not simply depend on making the technology work, but also on managing people's expectations of the technology, he says. "It's about fighting the hype problem."
    Hmm. Let's fight the hype problem by publishing more hype. And maybe if we include a statement saying we're fighting hype, people will assume this reformatted press release isn't hype.

    Sure, I understand that managing expectations is important, but let's not lose sight of what this article really is.
    --
    "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
    1. Re:"Fighting the hype problem"? by UbuntuDupe · · Score: 1

      Why don't they just give their own content the metatag "non-hype"? ;-)

    2. Re:"Fighting the hype problem"? by lhorn · · Score: 1

      Sounded like a good idea, organizing content thru metainformation across different formats. But I began to be wary at "leveraging" and they lost me totally with "consumer". I will not voluntarily have anything to do with anybody who regards people as consumers. Seems like another "Intelligent agent" to me. I'll stick with Google for a while; they do use metainformation and seem to find information when I need it.

      --
      accept no limits but time
    3. Re:"Fighting the hype problem"? by Anonymous Coward · · Score: 0

      I think you're missing the point a bit. A lot of people have the idea that the Semantic Web is like a gigantic AI engine that crawls the web, figuring stuff out, when in actual fact, it's a whole bunch of much smaller proven technologies working together to make software a bit more useful and able to do more things automatically. Anybody doing anything in this space gets a bunch of people turning their noses up at the "stupid ivory-tower nonsense that will never work". That's the hype that needs to be fought, and lumping normal marketing in with that is silly.

    4. Re:"Fighting the hype problem"? by YGingras · · Score: 2, Insightful

      Let's fight the hype problem by publishing more hype.
      Of course, it builds hype resistance. Historical evidence shows that it worked for IPv6 and Duke Nukem Forever.
  12. Flashback by houstonbofh · · Score: 4, Interesting

    While reading TFA I had a flashback to reading a 90's era ASP press release. "Ohhh... Shiny and pointless!"

  13. Clearly some new meaning of semantic Web here by clickclickdrone · · Score: 5, Insightful

    Unless I've missed some whole new sub-branch, semantic web to me means marking it up properly to give meaning to the various page elements via correct tags and microformats. This is just an overgrown aggregator.

    --
    I want a list of atrocities done in your name - Recoil
    1. Re:Clearly some new meaning of semantic Web here by xENoLocO · · Score: 1

      Deadly accurate... mod parent up!

      --
      "The need to build the internet comes from something inside us, something programmed... something we can't resist."
    2. Re:Clearly some new meaning of semantic Web here by darthflo · · Score: 1

      Actually you did miss the sub-branch of what is now referred to as the Semantic Web. Semantic (X)HTML tags like <strong> or <em> provide better readability targeted directly at human interpreters, while this new Semantic Web targets humans through aggregators of knowledge (and thus needs to be machine-readable and -interpretable).

  14. Hype alert by jandersen · · Score: 1, Interesting

    This article is crap - however, the idea is not entirely hot air, even though it is being touted as 'the next, big thing', which I very much doubt it will be. I think the 'semantic web' is trying to solve a non-existent problem; we're not suffering from 'information overload' - the net has just been filled up with useless rubbish, like adverts, SPAM, entertainment and adverts. And did I mention adverts? Fortunately it is not necessary to 'manage' any of that - all you need is to be able to avoid it, which existing SPAM filters and ad-blockers already do reasonably well.

    Apart from that, I think using a tool like the one proposed (however vapidly) in the article presents its own dangers. Letting a machine manage and 'understand' information that is important to you is not wise. Think of the spellchecker deathtrap: You misspell words in such a way that they become correctly spelled words with another meaning - like 'them' vs 'then', or 'than', or 'there' vs 'their'. Sometimes you stumble over texts where the author has clearly relied on the spellchecker without proofreading it afterwards, and the meaning has become garbled, or even worse, it has changed to something the author didn't intend, but which seems plausible enough. Just imagine if you were an amateur ornithologist who collects some articles mentioning 'cock pheasants' and 'blue tits' - and suddenly your collection of articles is tagged 'pornography'. Perhaps not the most catastrophic of scenarios, but certainly an example of the kind of surprises you can expect from the 'semantic web'.

  15. I call BS by abes · · Score: 2, Insightful

    While I am a fan of the "esoteric field of machine learning", as the article mentions, I am also well aware of the countless disappointments so far (thus no AI..). There have been many designs that can tackle toy problems, but nothing yet that has been able to handle large corpora of text. The big problem is that to really be able to do proper categorization the program must understand what it's reading. Which, again, requires some type of intelligence.

    While methods are available to do categorization based on either static or learned heuristics, they are less than perfect (think about Safe Search in Google images -- it works decently, but definitely not perfectly). In fact, just parsing a single English sentence can be a difficult task for computers (if the sentence doesn't fall into a context-free grammar). So the best we can probably hope for Twine to do is categorize based on word frequency (okay, they probably use some higher order stats).

    Whenever I read about a new semantic technology, I always think of WordNet (developed by Miller, who is the same guy responsible for the study showing we can hold about seven, plus or minus two, items in short-term memory). WordNet was developed as a database for the hierarchy of all words. Words are defined by their relationship to other words.

    While it's a great idea, and useful for some projects, it is also far from perfect, as words do not in the end have a static relationship to each other. The semantic web in the end relies on a static relationship between words (either through common usage or through a relationship through words).

    1. Re:I call BS by Gilmoure · · Score: 1

      Why not just have a large cube farm of women sitting there reading search engine requests, doing a look up of the information and then directing customer to information? They could wear glasses and have their hair in a bun, that only comes down when they're overwhelmed by desire for a games play, Star Trek watching geek?

      --
      I drank what? -- Socrates
    2. Re:I call BS by UbuntuDupe · · Score: 1

      Why not just have a large cube farm of women sitting there reading search engine requests, doing a look up of the information and then directing customer to information?

      They do: that's ChaCha.

      They could wear glasses and have their hair in a bun, that only comes down when they're overwhelmed by desire for a game-playing, Star Trek-watching geek?

      Okay, maybe not that part.

  16. God damn that phrase!! by corifornia2 · · Score: 1

    I always misread it as the "Sementic Web" and I get really excited that more interesting ways to look at pr0n are on their way to the internets.

    1. Re:God damn that phrase!! by Whiteox · · Score: 1

      You're lucky! At least you thought "PORN!"
      I'm so unlucky as to have misread it as "Symantec Web"
      OMG! Another Norton product!!!!!!

      --
      Don't be apathetic. Procrastinate!
  17. I'm already using the Semantic Web by progprog · · Score: 5, Insightful

    Let's see if it works on Slashdot.

    1. Re:I'm already using the Semantic Web by UbuntuDupe · · Score: 1
      Your tags are formatted incorrectly. It should be

      *please mod insightful, please mod insightful*
      and it goes only at the end of the text.

      (Hey, it works for me quite a bit!)
    2. Re:I'm already using the Semantic Web by BjornStabell · · Score: 1
      How about an hComment or hReview microformat:

      <div class="hReview" moderation_status="insightful" rating="5"> Yeah, semantic web rocks. </div>
    3. Re:I'm already using the Semantic Web by Anonymous Coward · · Score: 0

      Damn the Semantic Web. That should have been modded funny.

    4. Re:I'm already using the Semantic Web by Anonymous Coward · · Score: 0

      Hehe, I'm going to consume all modpoints.

  18. Goofy project by wytcld · · Score: 4, Insightful

    It's well-known in linguistics and philosophy that "You don't get semantics from syntax." It's well-known in computer science that computers are syntactical. It's well-known in recent business history that all startups claiming they'd produce "expert systems" or "artificial intelligence" in which computer systems would, despite these accepted truths, perform semantic feats have miserably failed to live up to their claims.

    So why don't we give PR puff pieces like this the same warm reception we give to the latest announcement of a perpetual motion machine? It's the kind of project only plausible to those who know very little of the basic background well-accepted by experts in the pertinent adjacent fields. That one or two big names from the success of the syntactical www either aren't familiar with or don't accept core knowledge from linguistics and philosophy of language is finally no different than Thomas Edison working for years on a machine to talk to ghosts: brilliance in one area most often doesn't translate into other areas in which you have no background - and even more rarely into areas where nobody knows how it would be done.

    --
    "with their freedom lost all virtue lose" - Milton
    1. Re:Goofy project by realmolo · · Score: 1

      Exactly.

      The simple fact is, computers can't do "natural language recognition". They can't READ. And they definitely can't glean meaning from contextual clues. All of which are necessary for the so-called "semantic web" to work well.

      Essentially, these guys are pretending that they have a working artificial intelligence. Which they don't. No one does.

    2. Re:Goofy project by Lodragandraoidh · · Score: 2, Insightful

      Humans don't even get semantics right consistently. In many cases there is no one 'right' meaning for any given collection of symbols. It all resides within the human skull, and is ever changing over time - and is reflected in how languages and symbology morph through the centuries.

      There have been various attempts to tame the semantic beast - formalized hierarchies being the most successful in conjunction with the advancement of scientific thought, and more recently less formalized meta-tagging systems. In these systems that seem to work best the human is involved in providing the meaning in terms a computer (or other humans) can understand: lists of keys/pointers to other lists ad infinitum. Of course there is always that undefinable exception that breaks such simple systems (e.g. the Platypus).

      Reality is an ever changing and evolving continuum - and quantum physicists would probably take issue with that.

      --

      Lodragan Draoidh
      The more you explain it, the more I don't understand it. - Mark Twain
    3. Re:Goofy project by jpfed · · Score: 3, Insightful

      It's well-known in linguistics and philosophy that "You don't get semantics from syntax."

      That's right- we get semantics from interpretation.

      So why don't we give PR puff pieces like this the same warm reception we give to the latest announcement of a perpetual motion machine?

      Because the right syntax can give to a computer very helpful clues towards productive interpretations. Data- which is just "syntax"- helps to drive computers to more effectively interpret other, related data all the freaking time. That's not what kills the semantic web idea.

      What kills the semantic web idea is that all the millions of individual producers of data don't have any immediate incentive to mark their own data up for the benefit of others.
    4. Re:Goofy project by MikeFM · · Score: 1

      I won't quite agree with you. While I do agree that every attempt I've seen so far for artificial intelligence to classify random data has been sort of lame, I don't think it's an impossible task. It's merely a task that requires more memory and processor power than we've yet got available. I've seen some pretty decent AI stuff for classifying smaller groups of data so I think it's only a matter of time before a computer can classify most data that a human could classify. I think that's a point too - often we expect computers to do a better job than we ourselves could do. If it's ambiguous to us then it probably will be to a computer also. As long as we make mistakes computers probably will too. My choice method is to have a computer auto-sort data, have humans double-check the sorting and correct it if needed, and have any changes passed back to the computer to improve its training. Over time such systems do usually get better, although sometimes developers have to take a look at the corrections also and figure out if there are some factors they're not letting their program properly recognize and respond to.
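
      A rough sketch of that auto-sort, human-check, feed-back loop, assuming a deliberately dumb keyword-count sorter. The class and function names are hypothetical and the "training" is just counting words under confirmed labels.

```python
from collections import Counter, defaultdict

class FeedbackSorter:
    """Keyword-count sorter that learns from human-confirmed labels."""

    def __init__(self):
        # label -> Counter of words seen under that label
        self.counts = defaultdict(Counter)

    def learn(self, text, label):
        """Record a human-confirmed label for a piece of text."""
        self.counts[label].update(text.lower().split())

    def guess(self, text):
        """Pick the label whose recorded words overlap the text most."""
        words = text.lower().split()
        scores = {label: sum(c[w] for w in words) for label, c in self.counts.items()}
        if not scores or max(scores.values()) == 0:
            return None  # nothing to go on; leave it for a human
        return max(scores, key=scores.get)

def review(sorter, text, human_label):
    """Auto-sort, let a human check, and feed the final answer back in."""
    machine_label = sorter.guess(text)
    sorter.learn(text, human_label)
    return machine_label, machine_label == human_label
```

      Logging the second value returned by `review` over time gives the developers the list of misses to look through when the sorter keeps getting the same kind of item wrong.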

      Of course instead of going with fancy AI stuff it's always easier to just mark up our own data in a way that is meaningful. Maybe create a wrapper that can be embedded directly into the system we already use. Instead of linking directly to a document or file why not link to a meta-data file that then provides the URL for the actual file desired? You could then make available all kinds of meta-data for any file without having to create a bunch of new file formats.
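
      Something like the following is what I have in mind for the wrapper. It is only a sketch: the JSON layout, the field names, and the sample values are all assumptions, not an existing format.

```python
import json

# Hypothetical metadata document that a link would point at
# instead of the file itself. Field names are illustrative only.
METADATA_JSON = """
{
  "url": "http://media.example.com/audio/guide.ra",
  "title": "A Guide to Growing Roses",
  "creator": "Rose Bush",
  "date": "2001-01-20"
}
"""

def resolve(metadata_text):
    """Split a metadata record into the real file URL and its descriptive fields."""
    record = json.loads(metadata_text)
    url = record.pop("url")  # the one field every record must carry
    return url, record

url, meta = resolve(METADATA_JSON)
```

      A client that doesn't care about the meta-data just follows `url`; one that does gets everything else for free, and the underlying file format never has to change.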

      --
      At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
    5. Re:Goofy project by BalkanBoy · · Score: 1

      Your post reminded me of a ST:TNG episode ("Ship in a Bottle"), where Prof. Moriarty, a holodeck creation, somehow 'automagically' attains self-awareness and becomes a sentient human being, wishing to get off the holodeck and into the 3D world that Picard and his crew inhabit, but finds it to be an impossible task, even though he shakes down the Enterprise after he gets a little annoyed with how little progress the crew has made with helping him get off the ship's holodeck.

      I would gladly trade places with Prof Moriarty right now, even though I'm of human form, if I knew I could live inside a machine, as an energy form, be self-aware and in some sense 'immortal' (until you pull the power plug I guess :), rather than wait until my body degrades and rots around age 75-100 (on average).

      It seems any 'meaning' a computer is aware of or capable of processing can only be created by humans (i.e. a computer has no inherent creative abilities as humans do). Can one possibly add the 'ultimate' algorithm to mimic _all_ human behavior, 100%, e.g. recreate Mr. Data and his 'positronic matrix' (along with the 'emotion chip')? Sounds great, though I think we're to become Borg (human-machine interface) before we become (or create, rather) Mr. Data (a pure computer application w/out any human component)...

      Becoming your own God is cool... but seems hardly attainable so far :).

      --
      'A lie if repeated often enough, becomes the truth.' - Goebbels
    6. Re:Goofy project by Anonymous Coward · · Score: 0

      It's well-known in linguistics and philosophy that "You don't get semantics from syntax."

      No it's not. It's well known that the philosopher John Searle advanced that thesis and that his "Chinese room" thought experiment is supposed to support it. But the thesis itself has been contentious from the outset. See, for instance, http://blindimpress.blogspot.com/2007/06/uplift-bytecode.html.

    7. Re:Goofy project by maharg · · Score: 2, Insightful

      .. and 640k ought to be enough for anyone ! lol

      Artificial Intelligence is a very different field from Semantic Web. The technology for SemWeb is here now, AI is still a ways off, I will admit.

      --

      $ strings FTP.EXE | grep Copyright
      @(#) Copyright (c) 1983 The Regents of the University of California.
    8. Re:Goofy project by Anonymous Coward · · Score: 0

      I'm not sure what you mean when you refer to this notion of computers being syntactical and the distinction between syntax and semantics. If this is a reference to something like Searle's "Chinese room" then this is far from "well known".

      Whilst I agree that the semantic web has not yet satisfactorily dealt with the symbol grounding problem, the problem of connecting symbolic representations with what they are representative of, I do not agree that these problems are theoretically impossible.

    9. Re:Goofy project by Jeff+DeMaagd · · Score: 1

      What kills the semantic web idea is that all the millions of individual producers of data don't have any immediate incentive to mark their own data up for the benefit of others.

      There may be other issues as well, because the whole idea seems to hinge on correct and honest mark-up. It doesn't sound very resilient anyway. So really, it sounds like a project whose main aim is to eliminate hard work that is eventually going to have to be done anyway.

    10. Re:Goofy project by Mode_Locrian · · Score: 1

      I mostly agree with your comment--the sentiment is definitely right. I'll just add that there is a field called formal semantics, which consists in building models of and, importantly for the discussion here, making rules that describe the semantics of languages. Now, this is most easily accomplished with artificial languages, where the semantics are clear and precise (and usually quite a bit more orderly than in natural languages) but there has been work in using this with natural languages as well. Long story short, you're right that "You can't get semantics from syntax" but, and I take it this is the project of the semantic web, "You can *model* semantics with syntax." (Consider (e.g.) proofs of soundness and completeness for derivation systems in first-order logic--the whole point of these proofs is to show that the syntax of the formal derivation system is, in fact, a good model of the semantics of first-order (informal) logic.) That said, I agree with the sentiment that the project is overblown and that they're nowhere near as close to coming up with a good model as their press releases might indicate.

    11. Re:Goofy project by smallpaul · · Score: 1

      In what sense are computers "syntactical" and the human brain "semantic"?

    12. Re:Goofy project by porter235 · · Score: 2, Informative
      What you seem to fail to recognize is that the semantic web is not about teaching computers how to analyze our language (syntax) to extract semantics, but rather about us agreeing on how to add syntax to our data so that the computer can understand the semantics.

      For example, the following chunk of code explicitly defines the creator, title, description, and date of an audio file. Because it has been specifically marked up, and IF we can all agree to use the Dublin core namespace for describing that type of data, then we can write programs that can gather, correlate, and make deductions about that info from multiple sources.

      <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:dc="http://purl.org/dc/elements/1.1/">
       
        <rdf:Description rdf:about="http://media.example.com/audio/guide.ra">
       
            <dc:creator>Rose Bush</dc:creator>
            <dc:title>A Guide to Growing Roses</dc:title>
            <dc:description>Describes process for planting and nurturing different kinds of rose bushes.</dc:description>
            <dc:date>2001-01-20</dc:date>
       
        </rdf:Description>
      </rdf:RDF>
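
      As a rough illustration of the "write programs that can gather" part, here is a sketch that pulls the Dublin Core fields out of markup like the above using only Python's standard XML parser. It treats the document as plain XML rather than doing real RDF processing (a dedicated RDF library would be the more robust choice), and the function name is made up.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the example record, embedded so the sketch is self-contained.
RDF_XML = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://media.example.com/audio/guide.ra">
    <dc:creator>Rose Bush</dc:creator>
    <dc:title>A Guide to Growing Roses</dc:title>
    <dc:date>2001-01-20</dc:date>
  </rdf:Description>
</rdf:RDF>"""

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

def dublin_core(xml_text):
    """Map each described resource to a dict of its Dublin Core fields."""
    root = ET.fromstring(xml_text)
    resources = {}
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        about = desc.get(f"{{{RDF_NS}}}about")
        fields = {}
        for child in desc:
            # ElementTree spells qualified names as "{namespace}local".
            if child.tag.startswith(f"{{{DC_NS}}}"):
                fields[child.tag.split("}", 1)[1]] = child.text
        resources[about] = fields
    return resources
```

      Because the lookup keys on the namespace URI rather than the `dc:` prefix, it keeps working if another site declares the same namespace under a different prefix, which is the whole point of agreeing on the namespace.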
  19. No! Load More On! by Greyfox · · Score: 1
    Every technology that is going to be used in the industry has to go through a hype phase first. I first noticed this with C++ and Object Oriented programming. The hype was that OOP would revolutionize the industry. A bunch of PhBs across the land saw this blurb and said to their engineers "From now on we use C++ and Object Oriented programming!" The engineers went off and did it (Generally badly because the PhBs spent the training budget on a summer cottage in Italy) until they started to get the hang of it. That was about the time Java was coming out. The PhBs heard "Java will revolutionize the industry! You don't need to compile on every platform anymore!" So they went to their engineers and said "From now on we use Java everywhere!" yadda yadda training budget 3 week management retreat in Barcelona yadda yadda. Same thing happened with XML.

    So if you want to have it used in the industry you just have to say "The semantic web will revolutionize the industry." Maybe they can integrate it into Web 2.0...

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    1. Re:No! Load More On! by darthflo · · Score: 2, Insightful

      Maybe they can integrate it into Web 2.0...
      You got it all wrong. Web 2.0 is "You make all the content, they get all the revenue". Web 3.0 is going to be The Semantic Web. Web 4.0 will, like Winamp 4.0, be skipped in favour of Web 5.0 where users provide the content, search engines look at the ads while grabbing the content and returning it, processed and summarized, to said users. This will also perfectly integrate with GWEI and similar projects for other search engines!
  20. Irony by Anonymous Coward · · Score: 0

    It's funny that the posters bash semantic web, saying it's useless and will never materialize... all while adding semantic annotations to the slashdot posting (tagging beta)!! ROFL.

    I do agree that the article is crap though.

  21. Could use Lojban by Besna · · Score: 1

    Lojban is unambiguous and much better suited to this than Spanish or English. Sure, you can make English as precise as possible--but most of written English is not. I don't see how you can have a semantic web when the semantics aren't clearly defined. Lojban, parsable like Java, makes it truly possible.

    1. Re:Could use Lojban by nuzak · · Score: 1

      Lojban has a ridiculously prescriptive grammar, and an inability to cope with ambiguity. Instead of adding qualifiers in a nondestructive way, it simply forbids ambiguity. Really expressive, yeah. Hell, many people can't even legally say their own NAME in Lojban.

      --
      Done with slashdot, done with nerds, getting a life.
  22. What's it all about? by naich · · Score: 1

    I went to the Twine site to find out what it was all about but I just got bombarded with meaningless buzzwords and technodrivel. This is what you tend to get from people who want to sound cutting edge but haven't got a clue, so I concluded that they didn't really know what it was all about either.

  23. I for one by Crazy+Taco · · Score: 0, Redundant

    I for one wait with Clay Shirky to welcome our "devastatingly intelligent" machine searching overlords!

    --
    Beware of bugs in the above code; I have only proved it correct, not tried it.
  24. Too many buzzwords. too little content by Animats · · Score: 1

    I looked at the Twine web site, and I can't figure out what they're actually doing. It's all buzzwords. There's a video of the Twine guy speaking at the "Web 2.0 Summit". The video is useless; the guy is doing a demo, but the video only shows the face of the speaker, not the demo.

    Apparently the "natural language recognition" seems to consist of recognizing names of people, products, and companies. The examples were "Tim Berners-Lee" and "Google", which are so unique that they're easy. But would it work for "Robert Smith" and "Joe's Plumbing"? There was no indication that it uses context to disambiguate the non-trivial cases. It still requires manual tagging for most data.

    There's a scheme for tracking document changes. There's a system that builds up a profile of the user based on what they store, which sounds like a targeted advertising engine. There's a personalized search engine. There are "collaboration features". There are contact lists.

    But from the available information, it's not yet possible to tell if this is useful.

    1. Re:Too many buzzwords. too little content by oliderid · · Score: 4, Funny

      The video is useless; the guy is doing a demo, but the video only shows the face of the speaker, not the demo.

      let's do some semantics here: useless, demo, speaker. Answer:
      http://en.wikipedia.org/wiki/Internet_bubble

      Cool, the good old days are back, time to make some easy money :-).

  25. Don't Underestimate this by Anonymous Coward · · Score: 0

    The major difficulty the semantic web faces with adoption is that it is very hard to get people to tag things consistently and thoroughly, and it is also hard to have machines do the tagging. It looks like Radar Networks has made some progress in getting machines to do the tagging: you give them information, they or their machines tag it.

    If this is true, it could really help get the semantic web off the ground. The guys at Radar Networks are not clueless amateurs as some commentators above have suggested; you might say they've been around the block a few times.

  26. There's just no way to make it work every time by jpfed · · Score: 1

    Even then, if we somehow put in measures to detect ornithology pages, my cock pheasant pornography site could be misclassified, too.

  27. Build your own Semantic Web Apps using a free API by InsurgentGeek · · Score: 1
    All might be interested in taking a look at http://sws.clearforest.com/. These guys offer the high-end natural language processing that Twine claims as a simple API available to all.

    Some very cool apps have already been built on top of it like http://newsatseven.com/, http://www.squadinfo.com/, http://www.optevi.net/newstracker and many others.

    It's not the "real" semantic web - but it's an open-access starting point.

    They also have a Firefox plugin at http://gnosis.clearforest.com/ that does semantic analysis real time as you browse. I use this constantly while reading business news or browsing Wikipedia.

    What they clearly don't have is Twine's marketing budget.

  28. Vagueness by Besna · · Score: 1

    You can express vagueness all you want (zo'e).

  29. semantic web without the buzzwords by Anonymous Coward · · Score: 0

    1. On my site www.mortality.com, I write, "All men are mortal". On site www.men.com, someone writes, "Socrates is a man".

    2. The semantic spider finds both these pages, and rather than indexing the words "All men are...", it adds to its knowledgebase:
      mortal(x) :- man(x).
      man(socrates).
    noting the sources of information, of course.

    3. In the search engine, I write: Is Socrates mortal?

    4. The engine translates the query to:
      ?- mortal(socrates).

    5. The Prolog inference engine finds this to be true, by the rules in (2).

    6. The UI answers that yes, using www.mortality.com and www.men.com, one can conclude that Socrates is mortal.
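
    Steps 2 through 6 can be sketched in a few lines of Python instead of Prolog. This assumes the hard part (turning English pages into facts and rules) has already been done by hand, handles only one-argument predicates, and uses made-up function names.

```python
# Facts and rules, each tagged with the site it was extracted from.
FACTS = {("man", "socrates"): "www.men.com"}
# (head, body, source) reads as: head(x) holds if body(x) holds.
RULES = [("mortal", "man", "www.mortality.com")]

def prove(predicate, subject):
    """Return the set of sources that support predicate(subject), or None.

    Toy backward chaining; rules that refer to each other in a cycle
    would recurse forever, which a real engine has to guard against.
    """
    if (predicate, subject) in FACTS:
        return {FACTS[(predicate, subject)]}
    for head, body, source in RULES:
        if head == predicate:
            support = prove(body, subject)
            if support is not None:
                return support | {source}
    return None

def ask(predicate, subject):
    """Format an answer that cites the sources used, as in step 6."""
    sources = prove(predicate, subject)
    if sources is None:
        return "unknown"
    return "yes, using " + " and ".join(sorted(sources))
```

    Asking `ask("mortal", "socrates")` walks the rule from www.mortality.com down to the fact from www.men.com and reports both; anything not derivable comes back as "unknown" rather than "no".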

    The horribly naive proximity-based methods of today's search engines are the result of computer engineers throwing up their arms with a cry of "this problem is too hard, let's design something that looks clever but really has no clue". Straight statistics should be relegated to judging conflicting information - for example, a Bible thumper will want to weight answers in favour of religious sites, while an evolutionary biologist would want to filter out knowledge extracted from same. Since Wikipedia is a secondary source, one would want this weighted a lot lower than, say, knowledge inferred directly from an archive of academic papers. Sensible, reconfigurable defaults.
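
    A minimal sketch of "sensible, reconfigurable defaults" for judging conflicting answers. The source kinds and the weight values are invented for illustration only.

```python
from collections import defaultdict

# Hypothetical default weights per kind of source; a user can pass their own.
DEFAULT_WEIGHTS = {"academic": 1.0, "wikipedia": 0.4, "blog": 0.2}

def resolve_conflict(claims, weights=DEFAULT_WEIGHTS):
    """Pick the answer with the highest total source weight.

    claims is a list of (answer, source_kind) pairs; kinds missing
    from the weight table count as 0, which filters them out.
    """
    totals = defaultdict(float)
    for answer, kind in claims:
        totals[answer] += weights.get(kind, 0.0)
    return max(totals, key=totals.get)
```

    With the defaults, one academic source outvotes a Wikipedia page plus a blog; hand in a different weight table and the same claims can resolve the other way, which is exactly the per-user tuning described above.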

  30. So Gmail is better? by techbiz108 · · Score: 2, Insightful

    Hold on so you are saying that any hosted service is unsafe then? What about all the people who use hosted email, or hosted collaboration, or hosted file servers? Sure if a hacker gets into anything it's unsafe. Heck even enterprise software that is locally hosted is at risk. Geez, if we're that terrified, let's not even use computers or the Internet at all then. Twine is no more at risk than Gmail, Facebook, Salesforce or any other online service that holds information that is not all public. Get real.

  31. Here's Hoping It Is As Stillborn As Rest of SW by littlewink · · Score: 1

    Not the first, not the last, but here's hoping something comes out of the ideas someday.
    Why? It never had a chance; just let it die, please.
  32. does anyone else hear a buzz? by Sczi · · Score: 0

    Now I'm going to take a dog turd, a cat turd, and a goat turd and mash them together into a vaguely upright shape. Look, the leaning tower of pisa!

  33. the anti-google by Anonymous Coward · · Score: 0

    Is this just a tool to undo all the damage / barriers to entry that Google has placed on the internet via its AdWords and AdSense programs?

  34. My tin foil hat says... by lawn.ninja · · Score: 1

    This is a way for the gov'ment to get all the info in one place and to get you to put it there voluntarily. I bet there is a clause somewhere in the user agreement that gives them permission to access and use your content. You know, in order to properly categorize it.

  35. Mod up (was: semantic web without the buzzwords) by j-tull · · Score: 1

    Good overview of semantic web for novices. Why do I never have mod points when I need them?