Slashdot Mirror


Paraphrasing Sentences With Software

prostoalex writes "Cornell University researchers are making progress in paraphrasing and "understanding" complete sentences in a software application. Analyzing sentences on the semantic level allows the software application to treat two sentences, expressing similar thoughts and ideas, but written in a different manner, as a single semantic unit. Significant achievements in this area could revolutionize the information searching field."

2 of 203 comments

  1. The problem is... by Anonymous Coward · · Score: 4, Insightful

    That there's absolutely nothing formulaic about idioms, which make up 80% or so of English conversation. A human learns them through years of experience; a computer has to be given programming for every idiom there is.

  2. Re:Fascinating read by Jugalator · · Score: 5, Insightful

    I wonder what its application could be, other than to detect duplicates... Perhaps a tool to suggest ways of rewriting sentences? Or maybe part of a more advanced grammar check?

    My first thought was translation tools. GOOD translation tools that understand the grammar of the source language and use the grammar of the destination language to form the resulting sentence.

    There has been some work on solving this problem, where a phrase in language A is translated to some special "universal" code, and then finally to language B. The developers would then need to make the translator convert every language to the universal code, and vice versa. The universal code could be whatever is necessary to let the software preserve the "meaning" of the sentence as easily as possible.

    However, if this is done, the problem could change from this:

    Source: I love hot dogs.
    Destination: Ich liebe heiße Hunde. (i.e. a literal translation, from Altavista Babelfish)

    ... to this:

    Source: I love hot dogs.
    Destination: Ich liebe Nahrung. ("I love food")

    That is, in case the universal language wasn't advanced enough and the English -> universal conversion was "lossy". So we might exchange our current problem of mangled grammar for one of lost information.
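
    To make the trade-off concrete, here is a minimal Python sketch of the "universal code" idea. Everything in it is made up for illustration: the concept tags (SELF, LOVE, HOT_DOG, ...), the three tiny English lexicons, and the function names are assumptions of mine, not taken from any real translation system. It also ignores grammar entirely and just maps phrases to concepts and back, which only works here because the word order happens to match.

```python
# Toy pivot translation: English -> "universal" concept tags -> German.
# All lexicons and tag names below are hypothetical illustrations.

# Word-by-word lexicon: reproduces the literal "hot" + "dogs" mistake.
EN_WORD_LEVEL = {"i": "SELF", "love": "LOVE", "hot": "HOT", "dogs": "DOGS"}

# Phrase-aware lexicon with a dedicated concept for "hot dogs".
EN_RICH = {"i": "SELF", "love": "LOVE", "hot dogs": "HOT_DOG"}

# Phrase-aware lexicon whose universal code is too coarse ("lossy").
EN_LOSSY = {"i": "SELF", "love": "LOVE", "hot dogs": "FOOD"}

# Universal code -> German surface words.
DE_FROM_CONCEPTS = {
    "SELF": "Ich",
    "LOVE": "liebe",
    "HOT": "heiße",
    "DOGS": "Hunde",
    "HOT_DOG": "Hotdogs",
    "FOOD": "Nahrung",
}


def to_universal(sentence, lexicon):
    """Map a sentence to a list of concept tags, longest phrase first."""
    words = sentence.lower().rstrip(".").split()
    concepts = []
    i = 0
    while i < len(words):
        # Try the longest remaining phrase first so that "hot dogs"
        # can be matched as one unit when the lexicon knows it.
        for length in range(len(words) - i, 0, -1):
            phrase = " ".join(words[i:i + length])
            if phrase in lexicon:
                concepts.append(lexicon[phrase])
                i += length
                break
        else:
            raise KeyError(f"no concept for {words[i]!r}")
    return concepts


def from_universal(concepts, lexicon):
    """Render concept tags as a sentence in the target language."""
    return " ".join(lexicon[c] for c in concepts) + "."


def translate(sentence, source_lexicon, target_lexicon):
    """Translate via the universal code as an intermediate step."""
    return from_universal(to_universal(sentence, source_lexicon), target_lexicon)


if __name__ == "__main__":
    for name, lexicon in [("word-level", EN_WORD_LEVEL),
                          ("rich", EN_RICH),
                          ("lossy", EN_LOSSY)]:
        print(name, "->", translate("I love hot dogs.", lexicon, DE_FROM_CONCEPTS))
```

    With the word-level lexicon you get the literal translation, with the coarse one you get grammatical German that has thrown away what kind of food it was, and only the lexicon with a dedicated HOT_DOG concept keeps the meaning. The quality of the whole thing is capped by how fine-grained the universal code is.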

    Here's a web site about it, and I'm sure there are many more.

    --
    Beware: In C++, your friends can see your privates!