Slashdot Mirror


CMU Web-Scraping Learns English, One Word At a Time

blee37 writes "Researchers at Carnegie Mellon have developed a web-scraping AI program that never dies. It runs continuously, extracting information from the web and using that information to learn more about the English language. The idea is for a never-ending learner like this to one day become conversant in the English language." It's not that the program couldn't stop running; the idea is that there's no fixed end-point. Rather, its progress in categorizing complex word relationships is the object of the research. See also CMU's "Read the Web" research project site.

3 of 148 comments

  1. Finally, people are getting AI right. by Umuri · · Score: 4, Interesting

    I've always been amazed that until recently, most work on AI has focused on preconstructed systems that fit data into pathways, with some variation in thought abilities to let them expand their model slightly.
    They'd write the rules for the system and try to build most of the work into it, and then see how well it does, with limited learning capabilities and still based on the original model.

    I'm glad a lot of research is finally gearing more towards the path of having a small initial program, then feeding it data and letting it grow into its own intelligence.
    If you give it the ability to learn, then it'll teach itself the rest, rather than giving it functions that let it pretend to learn while fitting into a model.

    And I know there has been research into this in the past, but it didn't really take off till the last decade or so, and I'm glad it has.
    True, or at least somewhat competent, AI, here we come.

    --
    You never realize how much manually made unmanaged "linked" lists suck, till you have src.link.link.link.link...
    1. Re:Finally, people are getting AI right. by phantomfive · · Score: 3, Interesting

      AI history has gone back and forth between pre-constructed systems and models that expand. One of the earliest successful AI experiments was a checkers program that taught itself to play by playing against itself, and quickly got very strong.

      Building a giant database of knowledge hasn't been possible for very long, because computers didn't have very much memory. When systems first had the capacity for it, the database had to be constructed by hand because there was no online repository of information to extract data from: the internet just wasn't very big. That particular project was known as Cyc, and it cost a lot of money.

      Since that time, the internet has grown and there are massive amounts of information available. It will be interesting to see the resultant quality of this database, to see if the information on the internet is good enough to make it usable.

      --
      Qxe4
  2. Re:Uh oh... by javaman235 · · Score: 4, Interesting

    The quality of the teachers is important when learning.

    That's seriously kind of interesting, actually: it makes me wonder if, decades from now, software developers will be few and far between, designing the AI algorithms for modern programs while the rest of us find work as software tutors, training those programs to do their business functions.

    --
    -The art of programming is the pursuit of absolute simplicity.