Slashdot Mirror


Translation Software That Learns by Reading

redcone writes "New Scientist is reporting that U.S. researchers have released translation software that develops an understanding of languages by scanning through thousands of previously translated documents. According to the article, 'The translated documents used to teach the translation algorithms can be electronic, on paper, or even audio files. The system is not only faster than other methods, but also better suited to tackling less common languages and the unusual vocabulary found in specialised or technical texts.'"

4 of 308 comments

  1. Harry Potter and the Bible by MikeFM · · Score: 4, Interesting

    I remember hearing about this a couple years ago. They were using translations of Harry Potter and the Bible to teach this software to translate. It seems to work well. I wonder what it'd make of different translations of technical documentation. That'd probably be even more interesting than what it'd make out of 'quidditch'.

    This could be great if it were open-sourced. It'd be nice to translate email, instant messages, websites, technical docs, and lots of other stuff we're currently using the fish for. The fish is nice, but it's not that efficient to add to other programs and its translations aren't usually that great.

    --
    At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.

  2. Google definitely would buy into this... by egyber · · Score: 5, Interesting

    Don't remember exactly where I read this, but Google apparently has long believed that there is enough data on the internet alone to be able to intelligently translate... What these guys claim to have done is, it would seem, the missing piece of the puzzle for Google. I wouldn't be surprised if Google gets in on this.

  3. Arabic to English by Caseyscrib · · Score: 4, Interesting

    I'd like to see an Arabic-to-English translator. I was interested in reading news from the Middle East, because I don't particularly trust our media to translate it properly. A good example of this is Bin Laden's transcript.

    After a quick web search, all I was able to find was this site, which has a pretty sketchy TOS agreement.

  4. How is that news? Research was done 10 years ago. by Anonymous Coward · · Score: 4, Interesting

    The basic approach was developed over 10 years ago by IBM: "The Mathematics of Statistical Machine Translation". And free software has even been available for a while; see http://www.fjoch.com/GIZA++.html.
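
    For anyone curious what "learning by reading translated documents" looks like in code, here is a rough sketch of the simplest word-alignment model from that line of work (usually called IBM Model 1), trained with expectation-maximization on sentence pairs. This is only an illustration under stated assumptions: the function names and the three-sentence toy corpus are made up for this example, the NULL source word is left out for brevity, and none of this is GIZA++'s actual code or interface.

```python
from collections import defaultdict


def train_ibm_model1(sentence_pairs, iterations=10):
    """Estimate word-translation probabilities t(f | e) with EM.

    sentence_pairs: list of (source_sentence, target_sentence) strings.
    Returns a dict mapping (target_word, source_word) -> probability.
    Hypothetical helper for illustration; the NULL word is omitted.
    """
    pairs = [(e.split(), f.split()) for e, f in sentence_pairs]
    target_vocab = {f for _, fs in pairs for f in fs}

    # Start from a uniform distribution over target words.
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) alignments
        total = defaultdict(float)  # expected count of e being aligned

        # E-step: split each target word's mass across the source words
        # in the same sentence, in proportion to the current t(f | e).
        for es, fs in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac

        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return dict(t)


def best_translation(t, source_word):
    """Return the target word with the highest t(f | source_word)."""
    candidates = {f: p for (f, e), p in t.items() if e == source_word}
    return max(candidates, key=candidates.get)


# Toy parallel corpus (invented for this example).
corpus = [
    ("the house", "das Haus"),
    ("the book", "das Buch"),
    ("a book", "ein Buch"),
]
table = train_ibm_model1(corpus)
```

    Nothing here tells the program that "book" means "Buch"; it only sees that the two keep turning up in the same sentence pairs, and each round of EM sharpens that guess. Real systems add word-order and phrase models on top, plus far more data.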