Slashdot Mirror


Automatic Translation Without Dictionaries

New submitter physicsphairy writes "Tomas Mikolov and others at Google have developed a simple means of translating between languages using a large corpus of sample texts. Rather than being defined by humans, words are characterized based on their relation to other words. For example, in any language, a word like 'cat' will have a particular relationship to words like 'small,' 'furry,' 'pet,' etc. The set of relationships of words in a language can be described as a vector space, and words from one language can be translated into words in another language by identifying the mapping between their two vector spaces. The technique works even for very dissimilar languages, and is presently being used to refine and identify mistakes in existing translation dictionaries."
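To make the "mapping between two vector spaces" idea concrete, here is a minimal sketch. It assumes the mapping is linear and is fitted by ordinary least squares from a handful of known word pairs. The summary itself does not say how the mapping is found. The word lists, the 4-dimensional random vectors, and the helper names `fit_translation_matrix` and `translate` are all made up for illustration; real word vectors would be learned from a large corpus for each language.

```python
import numpy as np

# Toy data (hypothetical): in practice each language's vectors are learned
# separately from a large monolingual corpus. Here we fake them.
rng = np.random.default_rng(0)
dim = 4

src_vocab = ["cat", "dog", "house", "water", "book", "sun"]
tgt_vocab = ["gato", "perro", "casa", "agua", "libro", "sol"]

# Source-language vectors: one row per word.
src_vectors = rng.normal(size=(len(src_vocab), dim))

# Assumption for the toy: the target space is an exact linear image of the
# source space. Real embedding spaces are only approximately related this way.
true_map = rng.normal(size=(dim, dim))
tgt_vectors = src_vectors @ true_map


def fit_translation_matrix(src, tgt):
    """Least-squares W such that src @ W is as close as possible to tgt."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W


def translate(vec, W, vocab, vectors):
    """Map a source vector into the target space, return the nearest word."""
    mapped = vec @ W
    # Cosine similarity between the mapped vector and every target vector.
    sims = (vectors @ mapped) / (
        np.linalg.norm(vectors, axis=1) * np.linalg.norm(mapped)
    )
    return vocab[int(np.argmax(sims))]


# Fit on the first five word pairs only (the small "seed" list of known pairs).
W = fit_translation_matrix(src_vectors[:5], tgt_vectors[:5])

# Translate the held-out word, which was never part of the seed list.
print(translate(src_vectors[5], W, tgt_vocab, tgt_vectors))
```

Note that the sketch still needs a short list of known word pairs to fit the mapping; after that, words outside the list are translated purely by their position in the vector space.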

1 of 115 comments

  1. Summary wrong (again) by icebike · Score: 1, Flamebait

    Simply because you embed your dictionary in something you choose to call a vector doesn't make it any less of a dictionary.

    It's still a dictionary, and also a thesaurus. Come to think of it, a thesaurus is simply a meaning-vectored dictionary.
    What's old is new again.
    Mathematicians, late to the party, still trying to drink all the punch.

    --
    Sig Battery depleted. Reverting to safe mode.