Slashdot Mirror


A Universal Networking Language for the Internet?

Anonymous Coward writes: "The United Nations University is developing a Universal Networking Language for the Internet, which is designed to allow effective communication between people writing in their native languages, with automatic conversion through an intermediate Meta-language (perhaps a precursor to Star Trek's Universal Translator.) They will be holding a symposium on the technology on 18 November in Brussels, Belgium, where they will publicly announce their achievement. They claim that the initial stage of UNL will support 16 languages: Arabic, Chinese, English, French, Russian, Spanish, German, Hindi, Italian, Indonesian, Japanese, Latvian, Mongol, Portuguese, Swahili and Thai." An interesting idea, but this is one of those "the devil is in the details" things. It'll be interesting to see how/if this can work.

291 comments

  1. Re:Au contrare, mon frere by Anonymous Coward · · Score: 0

    Of course both of you are forgetting an important thing. The example you are using is from a song, and the lyrics aren't exactly the best to use for examples. In this case the English translation is: My hat, it has 3 corners. because that's the way the song goes.

  2. Re:sounds difficult - not as you say by HiThere · · Score: 1

    Actually, as long as the final interpretation is done by a human rather than by a computer, some parts of the understanding can be let fly.

    OTOH, a transformational grammar has not yet been shown to be powerful enough (at least I haven't heard that it has). I think that one would require a complete ATN network with recursion. Bounded recursion would probably be sufficient, as I don't feel that folk understand more than about three layers. Certainly it only goes deeper as a stylistic perversion of normal syntax (but fashion can do strange things).

    A worse problem is divergent mappings. No language uses an atomic view of the world, so each concept in each language is a set of items selected from the universe of possible concepts. This can be noticed even within a single language when moving from one dialect to another. It is most easily noticed when discussing things that map readily onto sensory images, e.g., "What is your name for the color of the object?", but it exists in all aspects of language. (What is the difference between "dog" and "hound"?) When one translates the term "black hole" into Russian, I am told, one must use a different term, because in Russian a "black hole" is something specific which is not astronomical (not sure what, but it was taboo).

    Now this is mainly something that can be handled by a lot of detail work. But I mean a lot of detail work. To get a very mild idea of a part of what I am talking about, pull out an unabridged dictionary and open it to a random page of definitions. Each meaning listed will probably need to be a separate term in the meta language. And that's just the distinctions that an English speaker would notice.

    --

    I think we've pushed this "anyone can grow up to be president" thing too far.
  3. Re:Language support - Esperanto? by Anonymous Coward · · Score: 0

    Let me add to this.

    The vodka is strong, but the meat is raw.

  4. Re:Who needs it? by PrinceOfChaos · · Score: 1

    Ok. In Russia, Russian is used for:

    1) Financial Markets
    2) Aviation
    3) Scientific Publication
    4) Popular Culture
    5) The computer industry ( along with English)
    6) Everything else that matters

    :)

  5. I hate to be critical... by vlax · · Score: 1

    There's no real technical information on the website, and no evidence at all that a linguist is actually participating in this project. It sounds like a bunch of computer scientists who think they understand language.

    Actually, the only real data they offer suggests that they are recreating the work Anna Wierzbicka was doing in the 80's with her ad-hoc theory of semantics. She ultimately showed why it wouldn't work, and now criticises the idea of using controlled language at all for machine understanding.

    No, these people don't seem to have any idea what they've gotten themselves into. This kind of thing was what I did graduate work on. Controlled language is a useful idea, but a very limited one, and using pivot languages for translation will only take you about as far as Systran's system (the one used in Babelfish).

    There are much more sophisticated efforts going on elsewhere, and even those are getting bogged down in the ugly reality of natural language. This will languish and go nowhere. With some luck, some more realistic project, like some of the automatic text summary projects and natural language to knowledge base projects, will eventually produce a usable product, but this UN university effort sounds like a waste of time.

  6. DLT project did this with Esperanto by limako · · Score: 2
    The Distributed Language Translator (DLT) was a project in the Netherlands that made a first pass at this 10 years ago. They started with Esperanto and then made some changes to disambiguate words (even more than Esperanto already does). It worked pretty well, but suffered from the same kinds of problems anyone who's used translating software has seen before. Here's a nice article about it -- in Esperanto, of course.

    What's evil about these projects, of course, is that they don't let people just talk to one another. It would be neat to be able to have access to the literature of other countries, but that pales in comparison to having access to the people in other countries. If you just learn Esperanto you can really converse with people without needing technology or anything. It just works.

  7. Re:this is great by Joe_NoOne · · Score: 1

    Uh, does that mean the end of the world? Didn't the creators of the tower of babel get smitten or something? I remember something about god not being happy so he did something and destroyed the tower...

  8. Babelfish crap by SpinyNorman · · Score: 2

    Why would anyone want their web page to read as if it's been run through a Babelfish? A translation from netspeak into, say, English is always going to suffer some mangling, and most likely is not going to allow idiom, metaphor, etc.

    Machine translation will improve, but the best organization is still going to be browser or proxy based translation. If that translation package internally uses an intermediate semantic representation, then fine, but the day /. reads like Babelfish crap is the day I find myself an English web site.

    You have to admire the democratic thinking though (NOT!) - rather than just foreigners seeing your web page as crap, you can (must) see it that way too! Designed by politicians, no doubt.

  9. How to get the best of both worlds by Anonymous Coward · · Score: 0

    Ideally the system would allow finely crafted (hopefully even poetic, which often *requires* ambiguity) sentences in the author's native tongue. The UNL meta language could have drop down lists under each ambiguous word/phrase prompting the author to further clarify exactly what they meant, so it could be translated into all the other languages with the meaning intact.
    Obviously all subtlety, poetry, alliteration, etc. will be lost.

  10. Scary by Joe_NoOne · · Score: 1

    While I don't believe myths and such, it is rather scary how it matches the story of babel. Since we can't be scattered to the corners of the earth, what'll happen?

    http://logos.uoregon.edu/polyphonia/babel.html

  11. Arabic is semitic by Anonymous Coward · · Score: 0

    Don't you know who the real semites are?

  12. Universal Languages in SF by Anonymous Coward · · Score: 0

    Jack Vance wrote a book, The Languages of Pao, which explores the idea of using created languages for social engineering-i.e., create a language with few words for compassion and many for violence, take some children, put them in an isolated environment, encourage certain behaviors like competitiveness, and have them grow up to be warriors. It's an interesting concept. I wonder how the UN's universal language will address cultural nuances (don't include capability to translate violent concepts).

  13. Re:Language support by Anonymous Coward · · Score: 0

    I don't know German, but just for grins I ran "Gemütlichkeit" through Babelfish and came up with "cosiness".

    Is this a cozy approximation?

  14. enlessly [sic] by A+Big+Gnu+Thrush · · Score: 0

    Hey, moron! You misspelled endlessly. If you can't type a word as simple as that, head back to segfault or alt.moron.mindless.rambling!

    BTW, whatever window manager you use blows, unless of course you're not on Linux, in which case you blow!

    1. Re:enlessly [sic] by BitS · · Score: 1

      how old are you? 3, 4? this kind of comment is sad, you must have a lot of personal problems to be so lame as to take a normal conversation and just start insulting it... lemme guess... your first /. post?

      --
      http://www.schizo.com/
    2. Re:enlessly [sic] by Anonymous Coward · · Score: 0

      either this went over his head, or he's kidding. the scary thing is that i can't tell which one it is.

    3. Re:enlessly [sic] by Progman · · Score: 1

      ... and you're a human! Just imagine what the universal translator machine would have made of it!

  15. Re:Esperanto ?? by Anonymous Coward · · Score: 0

    Esperanto wasn't aborted, it's alive & well today. The estimated number of speakers ranges widely from around 100,000 to 20 million; however, the most realistic estimate is around 2 million speakers today. Most people think a common & neutral second language is a good idea, it's just that most haven't heard of Esperanto. I hadn't until about a year ago, and I suspect many many more will hear about it because of the internet. It's a very easy language to learn, much easier than Russian, which I'm also learning at present.

  16. Idns.org Internationalized Domain System by Hydrophobe · · Score: 1

    IDNS.org has a spec for non-ASCII domain names. They have a modified version of Bind available for download.

    Getting this adopted universally is nontrivial.

  17. Re:this is great by jra · · Score: 1

    > Didn't the creators of the tower of babel
    > get smitten or something?

    Well, "smited", perhaps.

    I think "smitten" has a _slightly_ different meaning there...


    Cheers,

  18. This is necessary and good. by Anonymous Coward · · Score: 0
    I doubt it will produce anything really useful just now. I've studied enough different languages to fully appreciate the problems of culture, context, slang, etc. Babelfish has earned the jibes it regularly gets, but nonetheless it does let me (mostly) understand many foreign-language texts. As bad as it is, it's not all bad. And I anticipate great improvement in natural language processing systems.

    If the nth-generation of babelfish can get it mostly right, then an intermediate language of some sort is a must. They want to support all 185 member languages of the UN and allow others to be supported as well. That's a pretty large matrix! It's unlikely that resources would be dedicated to a straight y Gymraeg-to-Euskara translator, much less Tagalog-to-Inuktitut.

    But that intermediate step means the process had better be good, as can be seen by using Babelfish to translate from language A to B and back to A again. I presume that UNL, to fill the role of language B, will be designed to facilitate getting it right.
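
    To put rough numbers on that matrix, here is a minimal Python sketch. It assumes one translator per ordered pair of languages in the direct case, and one "enconverter" plus one "deconverter" per language in the pivot case; the function names are made up for illustration and are not from the UNL material.

```python
def direct_translators(n: int) -> int:
    """Translators needed if every language maps straight to every other.

    Counts ordered pairs (A -> B and B -> A are separate programs).
    """
    return n * (n - 1)


def pivot_converters(n: int) -> int:
    """Converters needed with a single intermediate language.

    Each language needs one converter into the pivot and one out of it.
    """
    return 2 * n


if __name__ == "__main__":
    # 16 is the initial UNL language count; 185 is the UN member figure above.
    for n in (16, 185):
        print(f"{n} languages: {direct_translators(n)} direct, "
              f"{pivot_converters(n)} via pivot")
```

    The count alone says nothing about quality: every pivot translation passes through two conversions, which is exactly the A-to-B-and-back problem described above.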

  19. Re:Language support by Our+Man+In+Redmond · · Score: 1

    Forget support for Esperanto -- just use Esperanto as the intermediary language it was designed to be. Somehow I don't think encouraging people to include support for ISO 8859-3 in operating systems, browsers, etc. is going to be any less difficult than making allowances for bi-directional text in any of a number of character sets, to say nothing of language nuances (quick, how would you translate "Gemütlichkeit" into anything but German?). Esperanto is not that hard to learn, even for non-Indo-European-language speakers (there have been, and presumably still are, significant Esperanto movements in Japan and China, for example). The grammar can be grasped in about 30 minutes and you can carry the essential vocabulary around in your wallet.

    I know, I know, people are going to come up with reasons not to use Esperanto. But it seems like if a solution that will work exists, why not use it?

    (Note: Even though I like and occasionally use Esperanto, I would welcome use of a similar language like Interlingua or Latino sine Flexione that would be equally easy to learn and do the job just as well.)
    --
    Iun vi konfidas, kun ni li alig^as.
    --

    --
    Someone you trust is one of us.
  20. Re:Noam Chomsky and the Universal Grammar by vlax · · Score: 1

    Chomsky revises everything he thinks every 10 years or so. The existence of a universal grammar of the type Chomsky currently advocates (and it is by no means clear that this is true) still doesn't necessarily mean that we can construct a common, useable language for everyone. Remember, every language used in the world is one of those "special cases."

    Chomsky claims (despite evidence to the contrary) that syntax can be analysed apart from semantics, implying that if we could agree to a universal word list and definitions, it might be possible to devise an equally neutral grammar to use for machine translation. However, it is quite clear that words, even pseudosynonyms, don't mean the same thing in different languages.

    My inclination is that Chomsky is just plain wrong about it in the first place: that there is no universal underlying order of constituents, but rather that human language structures are constrained to a subset of all valid ways of organising information linearly, and that those constraints are biological.

    This means that any real machine translation requires us first to make real progress in understanding how humans process and store linguistic information. This field is in its infancy.

  21. NewSpeak by Fly_Boy · · Score: 1

    This is all well and good until someone in the UN declares that isn't a word anymore...

  22. Re:It won't work (not as you may expect) by Another+MacHack · · Score: 1

    I'm surprised nobody's mentioned Hofstadter yet; he had a pretty good translation of Jabberwocky into German and French. Should you translate "Campbell's Soup" to "Borscht"? "Jakobstrasse" to "Jacob Street"? Why bother translating Dickens; just read Dostoyevsky!

  23. No way by Björn+Stenberg · · Score: 1
    Quote from their document:
    UNL is designed with the following aims:
    (1) UNL is to be capable of exactly representing all the information expressed in any language.
    (2) UNL expression must be defined not only as rigorous but also as general as possible in order to be understood by any people who are engaged in the development of "enconverters" and "deconverters" in each language.

    Not only is point one completely and utterly impossible for reasons well discussed here already (slang, local expressions, evolution of languages etc.), point two actually contradicts point one! They want UNL to be an exact representation of the meaning expressed in the native language, while simultaneously being generic enough that everybody (or at least all "enconverter" developers) can understand what is being said. Assuming the average "enconverter" developer will be as technically (il)literate as the authors of this document, there's no way he is going to understand what technical people are talking about even in his native language. No way is UNL going to help with that. So how, then, is he going to understand that very same conversation translated from a language he doesn't understand in the first place? Forget it!

    Nice idea. Store it in the bin with all the other equally nice ideas: "Health and food for all" and "Can't we all just get along?".

  24. Re:lost in the translation by Windigo+The+Feral+(N · · Score: 1

    Akatosh dun said:

    The concept is nice, but you're still stuck with the problem that most languages are based on anecdotal references as well as actual words. You can translate the words, but the concepts will still frequently be lost.

    It's very interesting that you bring that up. Idioms can be a bear to translate at times, much less cultural references (even from English to Spanish and back--in many fansubbed animes, the fansubbers have to include a section at the beginning for cultural references and idioms that Americans wouldn't necessarily get but Japanese audiences would). Not only that, some concepts do not translate clearly across languages (I actually find it easier to think of the Japanese concept of honour in terms of the Tao or the Dine' {Navaho} concept of the Path of Beauty than in English!).

    A really good shot of how translation can require translating idioms and noting cultural reference is the discussion of the upcoming American release of "Mononoke Hime"/"Princess Mononoke" (click here for the gory details :). Neil Gaiman is translating for the dub, and apparently there were multiple major issues in translating it including:

    The fact the entire dialogue in the movie is not in modern Japanese but in an archaic form (roughly akin to Middle English or the old form of English used in the King James Bible)

    A mess of cultural references that Americans would not be aware of (such as one of the main characters cutting his hair--in Japan this is recognised as meaning that a warrior is leaving forever and is to be counted among the dead)

    A number of idiomatic phrases that had to be translated into American idioms (such as a comment that a character's soup tasted like water--which is about as low as one can go to insult one's cooking...this ended up being retranslated into "Your soup tastes like piss" which is more understandable to silly gaijin :).

    Needless to say, it was quite illuminating...especially since some cultural references were noted that I didn't pick up on the first time I saw it (I've seen the fansubbed version) and I'm an otaku. Apparently Gaiman has rewritten the script explaining some stuff that American audiences wouldn't catch, either...and to be honest (IMHO) Gaiman is probably one of the few people who could've pulled it off.

    Another really good example of this is the first tape of the anime "Compiler"--which was dubbed, but they STILL had to explain at the end why a giant Colonel Sanders turned into a Japanese baseball player and defeated a mad statue :) (Basically...Randy Bass won the Japanese equivalent of the World Series for the Tigers...the celebrating fans grabbed a statue of Colonel Sanders from a KFC, it being the only Anglo-looking statue that could be found, and threw it into the sea...they have not won the pennant since, and legend has it that the town will not win the pennant until the statue of Colonel Sanders is retrieved because the sea gods are pissed. :) Neat story, but not one most Americans would get...then again, the Japanese wouldn't get why octopi are often thrown at Detroit games if they get in the Stanley Cup :)

    --
    -Windigo The Feral (NYAR!)
  25. Re:lost in the translation by Anonymous Coward · · Score: 0

    Really? Bummer. Shaka, when the walls fell.

  26. Re:Interesting, but... by DuaneGriffin · · Score: 1

    Of all the languages of the world there are three that clearly have great bodies of literature - Sanskrit, Greek, and yes, English.

    Hmm, I assume that there was an implied 'only' in there. I have read a few Chinese authors and poets who would very strongly disagree with you, my Eurocentric friend. In fact I would venture to suggest that the body of literature in Chinese is substantially greater than in either Sanskrit or Greek, although I freely admit that I have absolutely no facts whatsoever.

    Anyone got any figures?

    --
    - "I never could learn to drink that blood and call it wine" - Bob Dylan (Tight Connection to my Heart)
  27. Re:Which Chinese language? What about Concepts? by Anonymous Coward · · Score: 0

    The first phase is to support the handful of official languages of the U.N., so it would be whichever Chinese is in that group.

  28. Hey, something hard! by Szoup · · Score: 1

    And it will probably take the UN 42 years to provide the first draft specs.

    In any case, sounds like a worthy effort.

  29. Re:Interesting, but... by James+Lanfear · · Score: 1

    It's definitely complex and inconsistent, but the point was that it's explicit--eg, instead of inflection you use prepositions and auxiliary verbs--and to that extent superior to many languages for scientific purposes. (Incidentally, Whitehead was a logician/mathematician, arguably the greatest of this century; I feel inclined to believe him when he says English offers an advantage in his own field.) You're absolutely correct, but I don't think that invalidates his statement.

    (And many people would argue that the body of English literature is no greater than that of, for example, German or Japanese.)

  30. Re:The "meta-language" by seanb · · Score: 1

    (I'm just thinking online here. I don't even know many spoken languages, but many of my Asian friends have spent long hours telling me how terrible English is.)
    I'm not sure English is the proper starting point for this type of a machine-read hyper-language. English is primarily a spoken language, with all the fuzziness that implies.
    What may be more appropriate would be to start with written Chinese. From what I understand, "Chinese" is already something of a hyper-language, with one written language expressing several spoken languages. Modify the set of ideograms to include some phonetic symbols (to properly represent the many names that are best represented as sounds). Ideally the syntax would allow for defining custom linguistic symbols, much like XML's ability to define custom tags. Tweak the hell out of this until you have a machine readable language (do fewer than 2^16 standard "words" seem adequate? Should this blow Unicode out of the water and use 32-bit "words"?)
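
    For a sense of scale on the 16-bit versus 32-bit question, here is a small Python sketch. It assumes every "word" gets a fixed-width integer code, which is only one possible reading of the idea above; the helper names are hypothetical.

```python
def code_space(bits: int) -> int:
    """Number of distinct fixed-width codes available with `bits` bits."""
    return 2 ** bits


def bits_needed(vocabulary_size: int) -> int:
    """Smallest fixed width (in bits) that can number every word uniquely."""
    if vocabulary_size < 1:
        raise ValueError("vocabulary_size must be at least 1")
    # Codes run from 0 to vocabulary_size - 1, so size the largest code.
    return max(1, (vocabulary_size - 1).bit_length())


if __name__ == "__main__":
    print(code_space(16))        # distinct 16-bit "words"
    print(code_space(32))        # distinct 32-bit "words"
    print(bits_needed(65537))    # one word past the 16-bit limit
```

    Whether 2^16 is enough depends on how many separate senses the meta-language has to distinguish, which the thread leaves open.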

  31. What shape will it be? by stx23 · · Score: 3

    I'm hedging my bets it will be fish shaped, and will fit into the inner ear.

    1. Re:What shape will it be? by spectecjr · · Score: 1
      I'm hedging my bets it will be fish shaped, and will fit into the inner ear.

      Well, if it is, I think it died at some point while it was stuck in someone's lughole...

      Reason being? Well... reading the English info about the project (which I can only assume was run through their "enconverter" and "deconverter"):

      Multi-lingual network aims to enable people to communicate in their mother language with peoples of different language. UNL is a common language shared by people over the world in multi-lingual network. UNL system basically consists of network and conversion program between UNL and native languages.
      A conversion system from native languages into UNL is called "enconverter", and that from UNL into native languages is called "deconverter". Information in each language, being "enconverted", is exchanged via network in the form of UNL. Information represented in UNL is "deconverted" into each native language on the terminal of network.
      In transmission of information, the information which is expressed in a mother language is enconverted into UNL. Preciseness of conversion can be verified by deconverting the UNL representation into the language from which the UNL is obtained.


      1. Why not call it "encoding" and "decoding" like the rest of the world?
      2. Internet is spelt "Internet", not "Inter-Net"
      3. The grammar is terrible. "Information represented in UNL is "deconverted" into each native language on the terminal of network."... er... anyone else see words missing from that sentence?


      I'm just hoping that they get people who can actually read and write their target languages fluently to do the testing...

      Simon
      --
      Coming soon - pyrogyra
    2. Re:What shape will it be? by Anonymous Coward · · Score: 0

      Someone completely missed the joke

    3. Re:What shape will it be? by spectecjr · · Score: 1

      Babelfish...right?

      Unless the joke was the whole UNL site? Which I can see :)

      Si

      --
      Coming soon - pyrogyra
    4. Re:What shape will it be? by greenrd · · Score: 1
      I don't think he missed the joke, I think he just posted there to get near to the top of a "sort by score" view like I have. If one person does that it's okay in theory but if everyone does it, that's abuse.

  32. Language support by Ledge+Kindred · · Score: 2
    What, no support for Esperanto?!

    -=-=-=-=-
    My mom's going to kick you in the face!

    1. Re:Language support by Our+Man+In+Redmond · · Score: 1

      Close, but not exactly. About the closest you might get in English would be "the coziness you feel when you're together with family and friends, or at a pub you're quite fond of." And even that isn't quite on the mark.
      --

      --
      Someone you trust is one of us.
    2. Re:Language support by Anonymous Coward · · Score: 0

      Although i hate the G-Word, i would translate it as Comfortability.

    3. Re:Language support by piotrr · · Score: 1

      That is cosiness. Just like how us Swedes think we have something with the word "lagom" (='about just right', 'close enough'). It, and any other word as well can be approximated within most other languages. The only thing I can really see a problem coming with, just because of the scope of it, is the Inuit mass of words relating to "snow".

      --
      / Per
    4. Re:Language support by orabidoo · · Score: 1
      Esperanto is great as a hobby, and as a way to make international friends who share your hobby. it's not particularly good as an intermediary language for computer translation; in fact it's arguably quite bad at that.

      point is, what this project is trying to do has nothing to do with Esperanto or its approach. As someone stated, this intermediary language doesn't even have to be human-readable, human-pronounceable, nor human-easily-learnable. OTOH, this language has to keep as many semantic features of the translated sentences as possible, including which pronouns have the same reference in a sentence (e.g. the fact that "he" and "him" can't be the same in "he told him to bugger off", but the two "he"'s can in "he thinks he's great", in English). languages in the world have come up with remarkably diverging ways to mark up what refers to what, including gender (or category) agreement, and all kinds of syntax-based rules. then there's the problem of semantic space: what do you do with the fact that languages (to use a well known example) don't always put the borders between colors in the same places, so that for one language these two colors will be variations of the same, and for others they'll be different?

      all together, this means that an intermediary language, to be useful, has to strive for completeness (keep as much information as possible), rather than for simplicity. it'll be hard to learn, but that's fine, you don't actually need people to learn it, so you don't even need to construct a phonetic/phonological system for it. think of it as an internal representation in computer memory. Esperanto, being a rather simple language with a strong Indo-European bias, is not even close to matching the requirements, and Interlingua even less. but that's fine, this is not the problem these languages were trying to solve anyway. OTOH, E-o and Interlingua's relatively simple grammars should make it fairly easy to build translators to and from these languages and the UNL. maybe the UNL people could do a small scale test to see how well their system can translate (for example) Esperanto to Interlingua, before attempting to translate German into Japanese. or maybe not.

    5. Re:Language support by sciuro · · Score: 1

      surprising, since one would expect that esperanto, with its more regular and simplified grammar, would give them an extra language on their list with less effort than the others there...

      -duncan

    6. Re:Language support by chuckT · · Score: 1

      Without wishing to seem negative, I think you're all being far too nice about this. Assuming this is not a hoax, then this is a terrible, badly-presented, poorly-researched idea.

      It is clear from the above posters that there are a number of artificial languages (Lojban, Esperanto, etc) that have been developed as an aid to communication that could be used instead of this ill-defined 'Universal Language': why not just use these?

      There is virtually no attempt to approach the problems that people mention of cultural specificity, grammatical incompatibility etc; and it appears to me that all they have done is come up with a couple of idiot buzzwords, and got a shitload of funding.

      I'm sure that this kind of intermediate layering system (which has been mooted for quite a while) is a way to go for a universal translator, but this project sounds like crap.

      sorry.

      --
      - These are small, *those* are _far away_
    7. Re:Language support by PHroD · · Score: 0

      I was gonna ask that myself! There are enough people that speak it (it's cool to learn languages, even if they're not for programming ^_^)


      "There is no spoon" - Neo, The Matrix

  33. Esperanto Info by Anonymous Coward · · Score: 0

    Esperanto was not invented for that purpose. Esperanto's purpose is to be the one foreign language everybody in the world would study. That way any two people would have a spoken and written language they could use to communicate.

    Esperanto is alive and well on the Net. Use your favorite search engine to find links. Here are some:

    Marko

  34. Who needs it? by jcr · · Score: 1

    We've already got English.

    Now, the Académie française may not like it, but English is already the language of:

    1) Financial Markets
    2) Aviation
    3) Scientific Publication
    4) Popular culture
    5) The computer industry
    6) Everything else that matters.

    English is the new Latin: Deal with it.

    -jcr

    --
    The only title of honor that a tyrant can grant is "Enemy of the State."
  35. Re:Lojban, not Esperanto by Our+Man+In+Redmond · · Score: 1

    Well, part of the problem is that lojban really doesn't resemble anything so much as an explosion in a type factory. Esperanto and Interlingua at least have the occasional Latin or Greek root that's worked itself into worldwide usage.

    I guess the problem is that it's difficult to adapt a computer-friendly language to humans, or a human-friendly language to computers. But like teaching a computer to play chess, that doesn't mean it isn't worthwhile.
    --

    --
    Someone you trust is one of us.
  36. Sounds possible.. by Kukester · · Score: 1
    Seems to me that it will be easier to write a translator from your native language to a very well defined and documented intermediate language than trying to understand the fine details of a non-native language.


    Is this what happens inside the head of a bi-lingual person? (This is posed as a question to any readers who might be)

    1. Re:Sounds possible.. by Cuthalion · · Score: 2

      Is this what happens inside the head of a bi-lingual person?

      Uh, no. Typically, I believe bi-lingual people internally switch back and forth, or represent some concepts in one language and others in the other. That is to say, when they actually are using internal linguistic representations of things at all.

      --
      Trees can't go dancing
      So do them a big favor
      Pretend dancing stinks!
    2. Re:Sounds possible.. by Traser · · Score: 1

      I am bilingual, speaking English and French. When speaking French I think in French. This continues until I come upon an idea/concept I don't know in French. I then switch to English thinking and try to translate another way of saying the idea/word into French. If I used a kind of non-linguistic referencing system it would be harder for just two languages. Any multi(4+)-linguists out there? Do you use a non-linguistic referencing system or do you reference everything back to your mother tongue?

      --
      Insanity is contagious. - Yossarian
    3. Re:Sounds possible.. by Anonymous Coward · · Score: 0

      I was raised in French and Portuguese, or rather I learned to read/write in both languages at an early age and used both until 18. I find myself thinking in one or the other, depending on the context, the country, the conversation. I have some recollections of dreaming in French and Portuguese, and even though I have used English more than those two languages for the last gazillion years and absorb it from TV every day, I have never caught myself thinking to myself or dreaming in English. If I hit my thumb with a hammer, I curse in Portuguese and French, never English!

      I feel the "usage" of a look-up table that goes from English, German, and Spanish to the internal representation. In French and Portuguese, I don't have that feeling. But, unlike the other comment above about pair representations, I rather multitask. I think in French or I think in Portuguese, but never both at the same time. When looking at a table, I don't see "table" and "mesa", I see "table" OR "mesa".

      Translating from French to Portuguese (and vice versa) is instantaneous, with the occasional mistake from living abroad for too long and for reading too much English. It's just there. I wish it would happen with English, but I think it is too late now. Those brain wirings were burned-in when I was a kid to the French-Portuguese specs and now any additional languages go on separate circuit boards... If only I had studied C++ in kindergarten...

    4. Re:Sounds possible.. by Anonymous Coward · · Score: 0

      I have two native languages, English and Filipino, having heard, spoken, written and studied both since birth. I've also taken basic courses in Japanese and German. There's a big difference between my native tongues and those that I learned later on.

      I can think and verbalize in my native languages without effort. Switching between the two is like switching between formal and conversational forms of the same language - the brain automatically uses what is appropriate. In contrast, my non-native languages have to be "translated" consciously and switching to them is an active effort. It's like having to tell myself "hey, this sentence is in German".

      There are some things that no longer need "translation". Numbers are easiest - whether it's spoken in any of the languages I know or written down as "1, 2, 3", "one, two, three", "isa, dalawa, tatlo", or as Japanese numerals. I suppose this means that once the brain is "at ease" with a representation of a concept there is no longer any translation effort required.

      Spoken/written languages are simply representations of concepts. A universal translator would have to be a completely language independent database of human concepts. Using it would be like taking C code, abstracting it to pseudo-code, then coding that into BASIC, or Pascal, or Perl.
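
      A minimal sketch of that "database of concepts" idea, using only the numbers mentioned above. The concept IDs (NUM_1 and so on), the table layout and the function names are made up for illustration; nothing here is taken from UNL itself.

```python
# Toy "language-independent database of concepts" for the numbers example.
# The concept IDs (NUM_1, ...) and the table layout are illustrative
# assumptions, not part of UNL or any real translation system.
CONCEPTS = {
    "NUM_1": {"digits": "1", "english": "one", "filipino": "isa"},
    "NUM_2": {"digits": "2", "english": "two", "filipino": "dalawa"},
    "NUM_3": {"digits": "3", "english": "three", "filipino": "tatlo"},
}


def to_concept(word, source):
    """Map a surface word in `source` to its language-independent concept ID."""
    for concept_id, forms in CONCEPTS.items():
        if forms.get(source) == word:
            return concept_id
    raise KeyError(f"no concept for {word!r} in {source}")


def from_concept(concept_id, target):
    """Render a concept ID as a surface word in `target`."""
    return CONCEPTS[concept_id][target]


def translate(words, source, target):
    """Translate word by word, always passing through the concept layer."""
    return [from_concept(to_concept(w, source), target) for w in words]


print(translate(["isa", "dalawa", "tatlo"], "filipino", "english"))
print(translate(["one", "two", "three"], "english", "digits"))
```

      Numbers are the easy case because each one maps to exactly one concept; the table gets much harder to build once a word can point at several concepts.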

    5. Re:Sounds possible.. by Jonavin · · Score: 1

      I usually switch back and forth, and sometimes even in the middle of a sentence without thinking about it. It's not a big deal when your friends understand all 3 languages, but if you only spoke one of the 3 or 4 languages you'd probably be lost.

      This brings up a problem with translations and learning new languages. IMO you don't really KNOW the language unless you THINK in that language. Doing a BabelFish in your head is not the same. This is exactly the problem I (and probably most adults) have learning new languages; we babelfish a new language instead of learning it.

    6. Re:Sounds possible.. by twinpot · · Score: 1

      >Is this what happens inside the head of a
      >bi-lingual person? (This is posed as a question
      >to any readers who might be)

      For me it depends on how fluent I am in that language, and how I learnt it - I can switch backwards and forwards directly between three languages, but a couple of others are only one way - to go the other way I have to go through English or Italian.

    7. Re:Sounds possible.. by xyz · · Score: 1
      Let's see. My native(?) language is french, but I went to german-speaking schools. So what happens in my head?

      World War II :)

    8. Re:Sounds possible.. by Anonymous Coward · · Score: 0

      If you learn to use a language properly and have to use it every day, you switch back and forth mentally. I never translate to my native language when I write, but sometimes I do when I talk. The two vocabularies (in my case English and Norwegian) are beginning to fill each other out rather than being two separate 'lumps' of words with a 'this word equals that word' pattern. I never picture the = anymore, but rather get a brief sensation that a Norwegian word belongs in a geometrical position in relation to the English one, or vice versa, like you get a sensation that this word is straight forward and a bit to the left of another word...

    9. Re:Sounds possible.. by orabidoo · · Score: 1

      yep, multi-lingual people switch back and forth. I find myself thinking in English, Catalan, Spanish or French, more or less randomly, depending on who I've last talked to, on what I'm thinking about, or whatever.

    10. Re:Sounds possible.. by hvoss · · Score: 1

      (Seems like the GCC compiler idea to me).

      I don't know if I qualify for bi-lingual, but still I feel free to comment.
      Whether I use English or Dutch (or for that matter German, which I hardly do at all), I think in that Language.
      But on the other hand, I know that in my head/memory I use a more symbolic representation; this is indicated by the fact that I usually cannot recall word by word what was said, but I recall the contents (=symbolic?), both intellectual and emotional.
      But then again, this is just me. Maybe there's a mind-specialist among us who knows how this works in general.
      Hans Voss
      ---

      --
      Hans Voss
      ---
      "I have no special talents, I am just passionately curious" -- Albert Einstein
    11. Re:Sounds possible.. by Stephen+Williams · · Score: 2

      Seems to me that it will be easier to write a translator from your native language to a very well defined and documented intermediate language than trying to understand the fine details of a non-native language.

      Though I know nothing about natural language parsing and translation, it seems to me that, from a software engineering point of view, translating from a spoken language into the metalanguage will be the hardest part of the exercise. This will be especially true with languages such as English that have inconsistent grammar and more than one way to do everything (makes you wonder if English is the spoken equivalent of Perl :-)

      Once you've got your text translated into the regular, simple metalanguage, it should be an easier task to convert it into a natural language than conversion to the metalanguage was.
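
      One engineering argument for the metalanguage is simply the number of translators that have to be written. A back-of-the-envelope sketch, assuming one translator per direction and the 16 languages announced for the initial stage (the function names are mine):

```python
def direct_translators(n_languages):
    """Translators needed if every language is translated directly into every other."""
    return n_languages * (n_languages - 1)


def pivot_translators(n_languages):
    """Translators needed with a metalanguage: one into it and one out of it per language."""
    return 2 * n_languages


n = 16  # languages named for the initial stage of UNL
print(direct_translators(n), pivot_translators(n))
```

      The saving grows with every language added, but it says nothing about quality: each of the "into the metalanguage" translators still has to do the hard part.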

      (Incidentally, the parallel with Star Trek's universal translator was a good one. In Star Trek, outgoing communications from Starfleet vessels are translated into a metalanguage called Linguacode which is supposedly easier for the alien's translation computer to process.)

      -Stephen

    12. Re:Sounds possible.. by fReNeTiK · · Score: 1

      ROTFL! That one was really funny...

      --
      I strongly believe that trying to be clever is detrimental to your health. -- Linus Torvalds
    13. Re:Sounds possible.. by EisPick · · Score: 1

      Whether I use English or Dutch (or for that matter German, which I hardly do at all), I think in that Language.

      I think this gets to the root of the problem with the metalanguage method for translation.

      It seems to me that the attempt to build a metalanguage is based on the theoretical assumption that words are a representation of some more basic element that the brain uses. Which is to say the translation to a metalanguage would be an approximation of a process the brain conducts when interpreting language.

      I don't think the brain works that way. Words are the building blocks of thought, and "thinking" in another language means that the brain trades one set of building blocks for another.

      This explains why multilingual people often intermingle words and phrases from different languages in the same sentence: While two words from different languages may be similar in meaning, they often don't mean precisely the same thing, so the speaker will substitute a more appropriate word from another language to communicate the precise nuance intended.

      This also explains the often heard phrase, "Well, that doesn't translate well." One can almost always translate the words, but that doesn't necessarily convey the same meaning.

      No matter what language a person uses, the grammar, vocabulary and syntax of that language are directly wired into the brain of the speaker. The brain has no metalanguage of its own. Which, of course, is why the use of language by children is such an important component of brain development. And why, although it might sound snobbish, people with small vocabularies don't think great thoughts.

    14. Re:Sounds possible.. by 12dec0de · · Score: 1

      But on the other hand, I know that in my head/memory I use a more symbolic representation; this is indicated by the fact that I usually cannot recall word by word what was said, but I recall the contents (=symbolic?), both intellectual and emotional.

      It is not just you!
      I have the same impression. This leads to funny things like citing a dialog in a language that you never heard it in, since you only remember the information that was transmitted on a higher level, which you can translate (or rather re-encode) on the fly.

      But if you are listening to one language it is very hard to translate because you continue to think in that language.

      One other rather funny effect I (native German) had when in high school while staying in the US: In American history we were shown a film about typical Americana; the 4th of July parade, with some major-dude holding a speech. Suddenly I noticed some commotion around me and people asking 'what f* is he talking about'. It took me quite some time to notice that the whole film was in German. (in order to show that only a single vote would have turned the US into a German-speaking nation) Very surreal that.

      mfg

    15. Re:Sounds possible.. by Yohahn · · Score: 1

      Actually, I believe this does, to a certain extent. We actually filter our thoughts through language. For example, people might have 13 words for snow, while we just call it snow. What they express in 1 word, we express in many. For example (not actually a real language), they might say gicka for what we would call "snow that has melted a bit on top and re-frozen".

      The meta-language is unknown to us, because we don't really know how we store concepts, but indeed those concepts are the metadata (thus this UNL better have good resolution when it comes to concepts).

      Language can be limiting in this respect, as well. In one native american language there was no differentiation between orange and red (I believe, I'm working on memory here, it might be 2 other colours). When native speakers were tested, they had a hard time differentiating between colours.
      This is a case, where the filter has shaped the meta-data.

      Quite an interesting topic area

    16. Re:Sounds possible.. by fReNeTiK · · Score: 1
      >Is this what happens inside the head of a bi-lingual person? (This is posed as a question to any readers who might be)


      Let's see. My native(?) language is french, but I went to german-speaking schools. So what happens in my head? Difficult question actually, because I switch between german and french without any conscious effort. I guess that's the definition of bilingual, when using the "other" language doesn't require a conscious mental effort. And of course, when I speak/write/read/listen to german, I don't translate it to french internally.

      As a comparison, when I use English, I translate it to German "internally" (why German and not French? well, German is closer to English). But I notice that with time and as I use English more often, this operation is slowly fading from consciousness.


      I wonder: Do I really change modes completely when I switch between German and French, or is the German-French translation still happening, but sort of in the background?


      Sounds confusing doesn't it ;)




      --
      I strongly believe that trying to be clever is detrimental to your health. -- Linus Torvalds
    17. Re:Sounds possible.. by orabidoo · · Score: 1
      whether words or some internal mental representation are the ultimate building blocks of thought, is a very long-running controversy that has been analyzed from many points of view. the case for pure "words ARE your thought" has been made repeatedly (and extrapolated into things like the Sapir-Whorf hypothesis, and ideas like "clean up your mind by learning to use language logically", and so on), but it doesn't appear very credible at this point. there is a good case for some sort of "mentalese" that gets expressed as words.

      anecdotal introspective evidence on each side: 1) sometimes you can't find a word, which suggests that you can think in concepts. 2) sometimes the language you're thinking in will make you turn things one way or another, use idioms, etc, subtly changing what you're saying, compared to what you'd have said in another language.

    18. Re:Sounds possible.. by Grey · · Score: 1
      Alas, parsing a language is very difficult, bordering on the Turing test for most languages, since almost every language (national and planned) is ambiguous (e.g. "beautiful little girls school" or "Time flies like an arrow. Fruit flies like a banana."). Plus, for most national languages (e.g. English and French) we don't know all the rules of grammar: 6000+ for English, with maybe 6000 more unknown.

      It is not likely that we will get good automatic translators until we have machines that pass the Turing test with ease.

      The UN is also reinventing the wheel here; there are many candidates for their purposes.

      • Esperanto has a simple grammar and is easy to learn, since it was intended as an international language. Esperanto is ambiguous, but people can handle ambiguity most of the time.
      • For well-defined, unambiguous languages there are a few; Loglan and its descendant Lojban are two. Unfortunately, Loglan was not designed to be easy to learn. Actually, since it was meant to test the Whorf hypothesis, the opposite is closer to being true.
      The bottom line is that no planned human language other than Esperanto has truly outlived its creator. Creating a planned international language is a hard task, and not likely one to be carried out by committee.
      --
      Grey (Chris Lusena)
  37. Esperanto just works by Scurrilous+Knave · · Score: 1

    Excellent post. Wish I had moderator points today, so I could move it up! Esperanto just works.

    I have come to believe that, in the human brain, the language center is tied somehow to the emotions, because people start acting irrationally whenever you start suggesting language alternatives. It's like asking them to change sexual orientation or something--their language is too strongly tied into their concept of personal identity to permit approach. So in an open forum, I seldom see anyone who is not already an Esperantist discuss the language objectively. Sad, really.

    But hope springs eternal. I post this URL every time, in hopes that it may someday be of interest to someone: If you are interested in Esperanto, the world's most popular constructed language, try the Esperanto.net web site for starters.

    As for the UNL, most Esperantists have been aware of it for some time. We wish them well, most of us, really we do. But most people who know more than one human language hold limited hope for such a project's success.

  38. Re:It won't work. by vlax · · Score: 1

    Trust me, no linguist will use this. It would be like getting a perl user to switch to TCL - they would carp for years about all the things they can't do the way they want to, assuming they can even do all the things they want.

    Other types of tech will probably steer just as clear of it when they realise how frustrating it is to compose for an artificial semantically unambiguous language.

  39. We already have a universal Net language by Anonymous Coward · · Score: 0

    It's called English! It has been for a while now. Why are we pretending that English isn't quickly becoming the world's universal language?

  40. Re:The "meta-language" by hobbit · · Score: 1

    I think I would be more inclined to agree with you if you were arguing that we should use English because other languages have a large number of words borrowed from it.

    Hamish

    --
    "Wise men talk because they have something to say; fools, because they have to say something" - Plato
  41. this is great by bmabray · · Score: 1

    Now we can start work on that Tower of Babel again. :-)

    human://billy.j.mabray/

    --
    human://billy.j.mabray/
    "Every good system has a backup." -- Dale Hanchey
    1. Re:this is great by Anonymous Coward · · Score: 0

      That was just what I was thinking as I read this.

      - Justin

  42. I have by Anonymous Coward · · Score: 0

    In fact, I speak it myself.

    BTW, your linux distribution probably contains an Esperanto-HOWTO. And the GNU translation project has an Esperanto team. Plus KDE has Esperanto as one of the out-of-the-box languages.

    Marko

  43. And by eliminating all communication restraints... by handorf · · Score: 2

    it will allow more and longer flamewars than anything else since the invention of SNTP!
    (with a nod to Douglas Adams)

    But still a very cool idea!

    --
    -- IANAEG - I am not an elder god.
  44. Re:Interesting, but... by James+Lanfear · · Score: 1

    I agree, but I'd like to toss this goody in: A. N. Whitehead's Science and the Modern World included a section about language (as an analogy for mathematics, IIRC). One of the points he made was that while English is a shallow language, even compared to other Germanic languages, it makes up for it, in some ways, by being utterly explicit. Nothing is implied or masked by, e.g., inflection; the entire language is open, simple, and, to some extent, precise. (That's a bit of an exaggeration, of course.) I believe his argument was that that made English a superior language for science, where ambiguity is a Bad Thing, but I can see how it could be extended to this, in the form of using English as the lowest level of the metalanguage, then building protocols for the other languages on top, in a hierarchy of language features.

    Of course, this would probably ruin the entire project, but I'm not very confident that it will succeed anyway.

  45. lost in the translation by Akatosh · · Score: 1

    The concept is nice, but you're still stuck with the problem that most languages are based on anecdotal references as well as actual words. You can translate the words, but the concepts will still frequently be lost.

    1. Re:lost in the translation by lomion · · Score: 1

      Good point, especially when you try and deal with dialects or regional meanings. I'm Sicilian and the Italian spoken in Sicily is a different beast from the one spoken in mainland Italy.

      They'd have to go with the "norm" of the language like what is used in spelling dictionaries and school texts I guess.

      I can see it now some really silly looking english sentences translated from spanish or italian.

      --
      this space for rent
  46. Re:sounds difficult - not as you say by Chalst · · Score: 1

    Two mistakes in the above:

    (1) Not every language has every tense. German has fewer tenses than english, and another poster said that Chinese has none.

    (2) Language can't be described in a BNF grammar: it isn't sophisticated enough to capture singular vs. plural, gender, case, verb conjugations etc. Phrase structure grammars extend BNF grammars with parameters to capture these, and Chomsky showed that these are sufficient to capture all of natural language.

    I would guess that the meta-language design is based upon transformational grammar, which exposes the essential similarities between sentences like `The door is closed' and `Close the door!'. This would allow it to express subtleties like different ways of representing the same sentence.

  47. Re:Interlac? by Anonymous Coward · · Score: 0
    Klingon, esperanto, etc just seem silly. All they can do is borrow words from each language.
    This is true of neither Klingon nor Esperanto. Words in Esperanto are largely made by sticking short roots and affixes together. Example: "samdomano" (pron. /sAm doUm 'A noU/, in IPA ASCII notation). You might be able to recognize the roots "sam" = same or "dom" = house, but I'd be very surprised to learn that the word as a whole resembles the word for "housemate" in any natural language.

    As for Klingon, the whole idea is that it's supposed to be unrelated to Earth languages! If they're borrowing words from existing natural languages, they've done a very poor job of the whole thing.

  48. But... by Anonymous Coward · · Score: 0

    We already have a universal networking language:


    Gimme warez d00d, I am 31337!

  49. The real barrier to Universal Translation by Anonymous Coward · · Score: 0

    At one level the real barrier to universal language translation is machine recognition of human languages. By this I mean the comprehension of what is being said. In order for linguistic comprehension to take place, general comprehension must first take place (thus the earlier post about the creation of a HAL-9000 like computer is not too far off base). Otherwise, your universal translator will choke on statements like: "He saw that gas can explode". This could mean that gas has the ability to explode, or that an object (a can of gas) exploded. In other languages, this double meaning doesn't apply and you have to use one of two possible sentences, depending upon your meaning. Since the translator wouldn't "intuitively" know your meaning, it would have to figure it out from context, which would ultimately require general comprehension.
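
    The "gas can" sentence can be used to show where the two readings come from. The following is only a toy sketch: the part-of-speech tags, the lexicon and the table of accepted tag sequences are assumptions written for this one sentence, not a real parser or grammar.

```python
from itertools import product

# Toy lexicon: each word lists the parts of speech it could take (assumed tags).
LEXICON = {
    "he": ["PRON"],
    "saw": ["VERB"],
    "that": ["COMP", "DET"],   # clause marker ("saw that ...") or determiner ("that can")
    "gas": ["NOUN"],
    "can": ["MODAL", "NOUN"],  # "is able to" or a container
    "explode": ["VERB"],
}

# The two tag sequences a hypothetical grammar would accept, with a plain gloss.
READINGS = {
    ("PRON", "VERB", "COMP", "NOUN", "MODAL", "VERB"): "gas is able to explode",
    ("PRON", "VERB", "DET", "NOUN", "NOUN", "VERB"): "that particular gas can exploded",
}


def readings(sentence):
    """Try every combination of tags and keep the ones the toy grammar accepts."""
    words = sentence.lower().rstrip(".").split()
    options = [LEXICON[w] for w in words]
    return [READINGS[tags] for tags in product(*options) if tags in READINGS]


print(readings("He saw that gas can explode"))
```

    Both readings survive, and nothing inside the sentence says which one was meant. Choosing between them is exactly the step that needs context.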

    I suppose the UN could construct a meta-language that is free from all such idiosyncrasies (maybe based off of Esperanto?). If you assume that is possible, then the translation from the universal language to any of the local languages would be a direct map. This would require the universal language to restrict all words to a single meaning; otherwise it's possible you'll end up with context-based problems again. Also, such a restriction could resolve issues such as inflection: for example, the universal language would treat the Japanese syllable 'ka', which changes meaning depending on inflection, as two or more separate "words". While such a language would be extremely large, the translation from the universal language to the local languages would always work correctly. But, no matter how easy it is to translate from, I still think the translation to will require significant advances in artificial intelligence. It would be easier for us to all learn the universal language, but if we all did, then there wouldn't be a need to translate to the local ones, now would there?? :^)

  50. Re:sounds difficult - not as you say by Anonymous Coward · · Score: 0

    BTW: In Russian "black hole" - "chernaya dyra" means astronomical "black hole".

  51. Re:Allow unicode in email addrs and domain names! by |n$ane · · Score: 1

    Yeah, but then how are we supposed to access these sites with a 'western' keyboard? It's not that I've got anything against the idea. If it cost no extra to, say, register a domain using roman/asian/russian characters, then sure, no problem.

    --
    I don't suffer from insanity. I *enjoy* it!
  52. Au contrare, mon frere by Our+Man+In+Redmond · · Score: 1

    to quote George Carlin :)

    If I remember my German, you're talking about something like

    Mein Hut, der hat drei Ecken

    where "der" refers to "Hut". That's something that will have to be covered in the rules both for translation into and translation out of German, no matter what language you're using to go into or out of German. Otherwise you end up with the English translation being

    My hat, the has three corners

    where a proper English translation would of course be

    My hat has three corners

    Of course this is a very simplified example, but I think you get the idea.
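
    To make the failure concrete, here is a small sketch of a purely word-for-word translator. The word list is an assumption built for this one sentence, and the naive rule (always render "der" as the article "the") is the mistake being illustrated, not a recommendation.

```python
import re

# Toy German-to-English word list, made up for this one sentence.
DE_EN = {
    "mein": "my",
    "hut": "hat",
    "der": "the",   # naive choice: "der" is always rendered as the article
    "hat": "has",
    "drei": "three",
    "ecken": "corners",
}


def word_for_word(sentence):
    """Translate each word in isolation, keeping punctuation where it was."""
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    words = [DE_EN.get(tok.lower(), tok) for tok in tokens]
    text = " ".join(words)
    text = re.sub(r"\s+([,.!?])", r"\1", text)  # no space before punctuation
    return text[0].upper() + text[1:]


print(word_for_word("Mein Hut, der hat drei Ecken"))
```

    The lookup has no way of knowing that "der" points back to "Hut", so that knowledge has to live in a rule about the sentence rather than in the dictionary.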

    I just think that for the foreseeable future (and, since this is computing, that could be, oh, say, six months) the best computers on the planet are the ones we carry around in our skulls. To me it would make more sense to have a single language that everyone would agree on, but then the problem is to agree on the language. All of the "evolved" languages carry their own cultural baggage, and few people seem to think that a "constructed" language is up to the task, even though certainly Esperanto and possibly Interlingua and a couple of others have proven that hypothesis wrong.

    Of course just outside the foreseeable future everybody will be speaking Bocci anyway, so what the heck. :)

    --
    Someone you trust is one of us.
    1. Re:Au contrare, mon frere by Chalst · · Score: 1

      One would probably translate it by `My hat, which has three corners'. The point is that this policy of directly translating `der' by `which' doesn't work, since the noun referred to by which may be the wrong one, since English lacks the gender distinctions of German.

      I concluded that the metalanguage must have a phrase structure that generalises any particular language one wants to embed in it, which of course means that the metalanguage contains expressions that can't be directly represented in the language one wants to translate into.

      The solution to this difficulty is to have semantics-preserving transformation rules in the metalanguage that allow one to manipulate the expression into something that does correspond to something in the target language. Hard work, but I think it could be made automatic.

      With respect to your other point, I think the `cultural baggage' is a matter of semantics and not syntax. The problem these machines are expected to solve doesn't require human intelligence, and the nice thing is that a single machine could in principle be able to translate into a hundred different languages, something even the very cleverest humans find difficult...

      Lastly, it is probably easier to get people to agree on a common metalanguage, than to get them to learn a common language...

  53. 'includes blah blah blah languages' -- wont work by Anonymous Coward · · Score: 0

    ok god knows why im typing this, there are 300 comments already. thanks to 'selling out' slashdot is a meaningless cacophony of garbage. well im very happy to add my little shit ball to the heaping steamy pile. they have included only certain languages. this is crap. it should be designed generically so that it can support any language. languages change. anyways, they probably left out braille. who cares about those stupid blind people anyways. euler was blind but he wasnt too important. when was the last time u used e? gimme a break! they should all be killed, when they are babies, viva la eugenics, social darwinism is your pal.

  54. Belgium, man, Belgium! by XNormal · · Score: 1

    Belgium is fast becoming the Mecca for speech and language technology with players like Lernout & Hauspie and projects like Flanders Language Valley.

    All of Europe really needs these kinds of technologies, but Belgium is one of the more multilingual countries within Europe.

    --
    Stop worrying about the risks of nuclear power and start worrying about the risks of not using nuclear power.
    1. Re:Belgium, man, Belgium! by twinpot · · Score: 1

      I read somewhere recently that Belgium has more quadrilingual people than any other country: English, French, Dutch and one of German/Spanish/Italian.

    2. Re:Belgium, man, Belgium! by Harri · · Score: 1

      Hmm, Luxembourg would also be in the running for this, with people speaking Luxembourgish at home, having their lessons held in German and French and also learning English. I guess lots of them know bits of Dutch/Flemish/..... too.

  55. Open source Babelfish anyone? by dos+equis · · Score: 1

    Since it seems related, I've had a dream open source project in mind for some time. Not so much along the lines of UNL as Babelfish. I think this is the perfect project for the open source model because people from around the world could contribute work relevant to their own languages. A proprietary project would have to employ many specialists.

    If anybody is interested in starting such a project, please reply in this thread.

    dos/tres equis

  56. Artificial Intelligence by Anonymous Coward · · Score: 0

    Douglas Hofstadter has suggested that the problem of language translation is the only real difficulty in producing an AI. =) For a number of years I have been working on a universal language project. It can be found at users.erols.com/alangrimes/ =) I hope you find that link interesting if not particularly useful. =\ Please flame me by mail for taking up your time. I am Alonzo The Great ( alangrimes@starpower.net ) my old account got hosed somehow and I'm far too lazy to bother to fix it.

  57. Just in case anyone's curious... by zantispam · · Score: 1

    "First of all, I do not really believe the UN can produce anything remotely interesting, technically speaking. I like the IETF motto: "we believe in rough consensus, and working code". Show me the money^H^H^H^H^Hcode first, please. What's so special about UNL? Theoretical translation of language A into a universal language and from there to language B is almost as old as "machine" translation itself. As far as I remember, early EU research into machine translation were based on a similar idea -- and they were dismissed as a failure. For a good example of the total and dismal failure of machine translation, try translating this text into"

    I did.
    English -> French
    French -> English
    English -> German
    German -> English
    English -> Italian
    Italian -> English
    English -> Spanish
    Spanish -> English
    English -> Portuguese
    Portuguese -> English

    The end result???

    In the first place of all, crío to really distant distant of interest who the O.N.U all can produce and not point out technician. I have the taste of the modernity of the IETF:" we create in the agreement approached and the operation bases it ". the champions money^H^H^H^H^Hcode in the beginning, please. Which is therefore extreme special UNL? The theoretical translation of the language to the inside to a universal language and with of the language B is nearly therefore old here how much the translation " of the machine ". Like the memory, IT CREDITS the first jambs of capturing in the automatic translation it has been based on a similar idea -- and has been isolated like the landslide. For entire a landslide and it good slaughter houses of the example that the automatic translation, manages, in

    --

    censorship is a form of noise, which actively seeks to drown out content with silence - Crash Culligan
  58. Re:it will fail by dillon_rinker · · Score: 2

    You're right, but any decent "universal translator" will not stop at translating individual words. Its dictionary would extend to phrases of the sort you mention. Perhaps it would define "stand at window" as "stand .1m - 1m away from window, while normal vector from plane of body intersects window." Regardless, it would be quite a chore to accomplish this. Context is everything. The more subtle the meaning, the more context you need. For example, if I said "That's really smart", you don't know if I'm being complimentary, self-deprecating, ironic, or insulting.
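
    A minimal sketch of what a phrase-aware dictionary could look like, assuming a greedy longest-match lookup. The glossary entries and the concept tags on the right-hand side are invented for illustration; they are not taken from UNL or any real translator.

```python
# Toy phrase-aware lookup: try the longest known phrase first,
# then fall back to shorter phrases and finally single words.
# GLOSSARY and its concept tags are hypothetical.
GLOSSARY = {
    ("stand", "at", "window"): "STAND-NEAR-FACING(window)",
    ("stand",): "STAND",
    ("at",): "AT",
    ("window",): "WINDOW",
}
MAX_PHRASE_LEN = max(len(key) for key in GLOSSARY)


def translate(words):
    """Map a list of words to concept tags, preferring the longest phrase."""
    out = []
    i = 0
    while i < len(words):
        # Try the longest window that still fits, shrinking down to one word.
        for n in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + n])
            if chunk in GLOSSARY:
                out.append(GLOSSARY[chunk])
                i += n
                break
        else:
            # Unknown word: mark it instead of guessing.
            out.append("<" + words[i] + "?>")
            i += 1
    return out
```

    Note that this only moves the problem up a level: the phrase table still knows nothing about tone or context, so "That's really smart" would get exactly one reading.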

  59. Use FRENCH by coreybrenner · · Score: 1

    French is a dead language, one whose speakers stubbornly refuse to admit it, and one whose primary nation tightly controls it.

    Plus, all those frogs like those nifty blue berets.

    So... ribbit?

    --Corey

    --
    Not only will they not deserve liberty or safety, Mr. Franklin, they will be DENIED both!
  60. An example by Mop · · Score: 1

    Here is an example so you can have a better feeling of what it's like:


    [unl-t]
    [unl-p]
    [unl-s]
    agt(win(icl>event).@past.@entry.@73,team(icl>collective).@def.@topic)
    obj(win(icl>event).@past.@entry.@73,match(icl>entity).@def)
    [/unl-s]
    [unl-s]
    agt(break(icl>event).@past,player(icl>male).@def.@140)
    obj(break(icl>event).@past,leg(icl>body).@372)
    mod(leg(icl>body).@372,player(icl>male).@def.@140)
    mod(leg(icl>body).@372,left(icl>state).@140)
    [/unl-s]
    [/unl-p]
    [/unl-t]

    So, this is a two-sentence, one-paragraph text.

    The first sentence has an agent (the team) who won something in the past, and an object (the match) which was won: "The team won the match".

    The second sentence has an agent (the player, who is male) who broke something, an object (the leg) which was broken, and modifiers which specify that this leg is that player's own left leg: "The player broke his left leg."
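
    To make the structure concrete, here is a rough sketch of a parser for lines shaped like the ones above. The grammar is inferred only from this example (label, two nodes, an optional constraint in parentheses, and `.@` attributes), not from the UNL specification, and the names `Node`, `Relation` and `parse_relation` are made up for the sketch.

```python
import re
from typing import NamedTuple


class Node(NamedTuple):
    word: str         # e.g. "win"
    constraint: str   # e.g. "icl>event", or "" if absent
    attrs: tuple      # e.g. ("past", "entry", "73")


class Relation(NamedTuple):
    label: str        # e.g. "agt", "obj", "mod"
    head: Node
    tail: Node


# word, optional (constraint), then zero or more .@attribute suffixes
NODE_RE = re.compile(
    r"^(?P<word>[^(.]+)(?:\((?P<constraint>[^)]*)\))?(?P<attrs>(?:\.@\w+)*)$"
)


def split_top_level(text):
    """Split on the one comma that is not nested inside parentheses."""
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return text[:i], text[i + 1:]
    raise ValueError("no top-level comma in %r" % text)


def parse_node(text):
    match = NODE_RE.match(text.strip())
    if match is None:
        raise ValueError("cannot parse node %r" % text)
    attrs = tuple(a for a in match.group("attrs").split(".@") if a)
    return Node(match.group("word"), match.group("constraint") or "", attrs)


def parse_relation(line):
    label, _, rest = line.strip().partition("(")
    if not rest.endswith(")"):
        raise ValueError("unbalanced relation %r" % line)
    head, tail = split_top_level(rest[:-1])
    return Relation(label, parse_node(head), parse_node(tail))
```

    Feeding it the first line of the example gives the label `agt`, a head node `win` with attributes `past`, `entry`, `73`, and a tail node `team` with attributes `def`, `topic`. What numeric attributes like `@73` mean is not something the example tells us, so the sketch just carries them along.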

    --
    "Show me the code" -- Linus.

  61. Compiler construction by garver · · Score: 1

    This problem would map into a modern compiler architecture. The compiler architecture has multiple front-ends (languages) and multiple back-ends (machine architectures), bound in the middle by an intermediate, but heavily simplified, language. The idea is that a front-end parses and type checks the input and then outputs intermediate language. This can then be fed into any back-end built for a particular architecture.

    For example, if you have front ends for C and fortran and backends for PPC and i386, then you can compile fortran programs for PPC or i386 and also C programs for PPC or i386. Any combination. Add another backend, say MIPS and with no extra work, C and fortran compiling are possible.

    When dealing with natural languages, you would need a front-end and a back-end for each language.
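
    A toy sketch of that architecture, assuming the intermediate language is nothing more than a list of concept tags and each natural language is a tiny word-to-concept lexicon. The lexicons, tags and function names are all invented for illustration; real front-ends would need actual parsing.

```python
# Each language contributes one front-end (text -> IR) and one
# back-end (IR -> text). The IR here is just a list of concept tags.
LEXICONS = {
    "en": {"dog": "DOG", "eats": "EAT"},
    "fr": {"chien": "DOG", "mange": "EAT"},
}

FRONT_ENDS = {}
BACK_ENDS = {}


def add_language(code, lexicon):
    """Register a front-end and a back-end for one language."""
    reverse = {concept: word for word, concept in lexicon.items()}

    def to_ir(text):
        # A KeyError here plays the role of a compile error:
        # input outside the known vocabulary is rejected.
        return [lexicon[word] for word in text.split()]

    def from_ir(ir):
        return " ".join(reverse[concept] for concept in ir)

    FRONT_ENDS[code] = to_ir
    BACK_ENDS[code] = from_ir


for _code, _lexicon in LEXICONS.items():
    add_language(_code, _lexicon)


def translate(text, src, dst):
    """Any registered front-end can feed any registered back-end."""
    return BACK_ENDS[dst](FRONT_ENDS[src](text))
```

    Adding German is one `add_language("de", {"Hund": "DOG", "frisst": "EAT"})` call, after which French to German works without anyone writing a French-German translator. The catch is visible too: anything outside the lexicon is simply refused.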

    There are a number of catches, here are a few:

    • Finding the intermediate language. It should be possible, but a pain in the ass. After all, it has happened for computer languages and they vary widely.
    • Computer languages are confined to a certain syntax. To make this work, the input would have to be checked for valid syntax and type checked. In other words, poor grammar, incorrect use of words, etc. would simply not be allowed to get past first base. After tons of research, some AI might be introduced here to make the rules more flexible.
    • There will be a learning curve to using the system. Users will have to figure out what is valid. I think this goes for every system. Slang is going to send almost any solution guessing.

    Bottom line, of course a universal translator is possible, but until we discover BabbleFish or the brainwave reading equivalent (would reading brainwaves be enough, would all species "think" alike?), there will be plenty of input restrictions. After all, some things just don't translate. Because of these restrictions, it will be infuriating and impractical to use.

  62. Re:Bahasa Indonesia by mochaone · · Score: 1

    Well, how do they connive to impress their girlfriends then? Or should I say girlfriend-girlfriend?

    --
    Hates people who have stupid little sigs
  63. Re:Noam Chomsky and the Universal Grammar by Seenhere · · Score: 1
    But note that Chomsky's UG hypothesis is about grammar, i.e. syntax. Semantics (which is what you're trying to preserve in translation) is another thing entirely, and something that Chomsky has always been pessimistic about being able to formalize.

    --Seen

    --
    "I used to be a dilettante. Then I thought I'd try something else for a while."
  64. Re:sounds difficult by Zarf · · Score: 1

    It would be most likely that the Meta-Language would only be able to handle a small subset of the meanings available in any of its natural language counterparts. The subset of "Meta-meanings" would be the set of all common meanings between all the languages.

    Ideas like run, walk, buy, sell, etc. would easily translate... however things like "glark", "glob", "grep" may not translate accurately. That is, the 6 Russian verb forms you mentioned may all be mapped directly to only 3 verbal meanings.

    ie: grep to look, glob to list, and glark to understand... and so on.

    The resulting word elements could then be arranged by a simple pattern-matching AI into an acceptable form. The result is a valid natural-language sentence which has some shadow of the original meaning. In practice this could allow for useful business communication but prevent discussions of abstract ideas.
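
    A small sketch of why that mapping is lossy, assuming the grep/glob/glark collapse described above. The two tables are hypothetical; the point is only that a many-to-one map cannot be undone.

```python
# Source words mapped onto a smaller set of shared "meta-meanings".
TO_META = {
    "look": "look", "grep": "look",
    "list": "list", "glob": "list",
    "understand": "understand", "glark": "understand",
}

# The way back can only pick one word per meta-meaning.
FROM_META = {"look": "look", "list": "list", "understand": "understand"}


def round_trip(word):
    """Send a word through the meta-language and back again."""
    return FROM_META[TO_META[word]]


def lost_distinctions():
    """Group the source words that become indistinguishable."""
    groups = {}
    for word, meta in TO_META.items():
        groups.setdefault(meta, []).append(word)
    return {meta: sorted(words) for meta, words in groups.items() if len(words) > 1}
```

    Here `round_trip("grep")` comes back as `"look"`: readable, but only a shadow of the original.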

    Yet another fine example of how problem-domain-"scoping" affects over-all software functionality.




    - // Zarf //

    --
    [signature]
  65. Re:Confused by mochaone · · Score: 1

    If you run Universal Networking Language through their coder-decoder thing, it comes out as "Colossal waste of resources and money". In other words, Microsoft.

    --
    Hates people who have stupid little sigs
  66. Re:a pepsi generation??? by Anonymous Coward · · Score: 0

    A friend of mine worked on this project. He told me that the system has a lot of limits. But creating good, relatively simple docs (like tech docs or business docs) is so useful, isn't it?

  67. Reinventing the wheel by Anonymous Coward · · Score: 0

    Who needs UNL when we have babelfish? ;)

    -Dave

  68. Re:Interesting, but... by Anonymous Coward · · Score: 1

    Dear friend,

    Chinese is derived from Sanskrit.

    Thank you

  69. Your Tax Dollars @ Work! by Anonymous Coward · · Score: 0

    And people wonder why the US doesn't like to pay its dues to the UN.... Also, UNL is probably not in a similar vein to Esperanto; in fact, it might not even be a 'real' language at all. It could just be a series of symbols with no phonetic representation. It will only exist as an intermediary language that never actually gets surfaced to the user. And one other thing: didn't that wingnut Noam Chomsky try and fail to come up with a universal language?

  70. Could work in the following context: by ChrisDolan · · Score: 1
    Details aside, I think an idea like this could work well in the right context. Namely for a document which you don't want to contain nuance, for example, international law or multilingual web pages, etc.

    Here's how it might work:
    • Write your document in your native language
    • Translate it out to "Universal"
    • Translate back to native
    • Look for mistranslations and change the original to avoid this
    • Repeat until the out-and-back translation conveys the same meaning as the original
    • When you are happy, post the Universal version on your web site (and maybe ask a friend who speaks another language to read it once in her language)
    • Hope that the other deconverters are as good as yours

    This has the disadvantage that you lose some flexibility, subtlety and art in your writing, but you decided to give that up when you decided to go multilingual, right?

    The point is that if you write text specifically so that can go to one foreign language and back smoothly, it's probably pretty translatable to many languages, I'm guessing.

    You can try this now with Babelfish. Take a passage of text you wrote in English, convert it to something (e.g. French) and back. Then edit the original until the English that comes back is decent. This will force you to remove colloquialisms and force you to work around deficiencies in the translation program, but isn't this worth it for a good translateable piece of text?
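
    The out-and-back check can be sketched as a small loop, assuming you can plug in two translator functions. `to_foreign` and `to_native` are placeholders for whatever service you use (nothing here calls Babelfish), and the 0.8 threshold is an arbitrary guess.

```python
from difflib import SequenceMatcher


def round_trip(text, to_foreign, to_native):
    """Translate out and back using the two supplied functions."""
    return to_native(to_foreign(text))


def similarity(a, b):
    """Word-level similarity between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def flag_unstable_sentences(sentences, to_foreign, to_native, threshold=0.8):
    """Return (original, round-tripped, score) for sentences that drift."""
    flagged = []
    for sentence in sentences:
        back = round_trip(sentence, to_foreign, to_native)
        score = similarity(sentence, back)
        if score < threshold:
            flagged.append((sentence, back, score))
    return flagged
```

    You would rewrite whatever gets flagged and run it again until the list comes back empty. A word-overlap score is a crude stand-in for "conveys the same meaning", so a human still has to read the result.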

    Final note: We have all seen Babelfish make funny translations. There will always be some words/phrases that software cannot translate perfectly without AI. But certainly, we are all smart enough to craft text that software can translate well! As the software gets better, we can put less and less effort into this.
  71. Re:Esperanto ?? by Rational · · Score: 1

    I'm not sure the internet is going to do Esperanto any favours, considering that anyone who uses the Net on a regular basis is likely to already speak English (which, to all intents and purposes, is the Lingua Franca for at least the next century).

    I'm all for the idea of composite metalanguages, for computers, but I don't see why anyone should cripple a metalanguage so people can use it too.

    --
    "Be nice, veer left, and never stop thinking" Iain Banks - Walking On Glass
  72. Re:The "meta-language" by swirlyhead · · Score: 1

    OK, according to the web site, the UNL aka "the meta language" will be based off of English, with a means for defining new words, as long as you can provide a word in your original language AND locate it in the conceptual hierarchy. The most effective step they could take at this point to increase the propagation would be to come up with an XML DTD for UNL dictionary entries and conversion/deconversion mappings. BTW it looks like the web site was produced using UNL technology, and it's not too bad: not as good as a native speaker with strong rhetorical skills, but sufficient to carry technical and commercial traffic. The one thing it probably won't be very good at is translating persuasive text meant to convince people. Not such a great loss.

  73. Re:sounds difficult - not as you say by Rational · · Score: 1

    Is that the same tribe whose numbering system consists of "one", "two" and "many"? :)


    --
    "Be nice, veer left, and never stop thinking" Iain Banks - Walking On Glass
  74. Mongol? by davey_bee · · Score: 1

    Since when did Mongol become one of the world's major languages? Half the people in Mongolia are nomads, besides! That's like Al Gore's suggestion to bring the internet to Africa, to help the people who don't have electricity & running water. weird

  75. Doesn't REBOL do this already? by Anonymous Coward · · Score: 0

    Check them out: http://www.rebol.com

  76. Re:Sounds like Esperanto - take 2 by Rational · · Score: 1

    Nah, English is piss-easy for Spanish speakers too, if they put half a brain cell to work on it.

    All it takes is learning a lot of words and working on your pronunciation, but English grammar is absurdly easy compared to that of most Romance languages, and can be learnt in an afternoon.

    I'd say French is harder for Spanish speakers than English, precisely because it has a complex Romance grammar, and a fucked up pronunciation to boot.

    --
    "Be nice, veer left, and never stop thinking" Iain Banks - Walking On Glass
  77. This Won't Work by mochaone · · Score: 1

    This project is doomed to hell. I will tell you why if you listen intently.

    The Boob Factor.

    That's right. The Boob Factor hasn't been addressed. None of these meta-languages or intermediate languages have addressed this important topic.

    What is the Boob Factor, you ask? Quite simply, the Boob Factor is you, it is us. We are the Boobs.

    Meta-languages or intermediate languages, we will assume, work on well known grammatical and linguistic rules. In order to function correctly, these rules must be adhered to flawlessly.

    Let us examine the following statement:

    I like red meet.

    You the reader have been blessed to have a couple of ounces of grey matter resting on your more than likely underdeveloped shoulders. You have the ability to infer the offending Boob's meaning in this sentence. Do you place faith in a meta-language or intermediate language to do the same? I don't think so. The Boob Factor has reared its ugly little head.

    We should all sit back and wait until God has reversed his Babel of Confusion mayhem that he inflicted upon us in a drunken stupor. We can then all go back to speaking tongues in the master language of Sumerian. Oh, the joy for that day.

    --
    Hates people who have stupid little sigs
  78. Is this necessary? by kaphka · · Score: 1

    My first reaction on seeing this story was, "Wow, what a cool idea!" I'd love to work on designing this language. (And it is possible. All of the objections that I have seen in these threads can be resolved if you understand modern linguistics. Take a course, it's worth it.)

    However... Do we really need a new metalanguage? Couldn't we just as easily use an existing language as the intermediate form? It would be just as easy to translate, and you wouldn't have to learn a new language to understand the system.

    There's an idea in linguistics which is similar to the Church-Turing thesis in philosophy, although it's not as well established: Every modern language is assumed to have equivalent expressivity. If you wanted, you could translate from English to Chinese using an Aborigine language as your intermediate without any problems. (Except deficiencies in vocabulary, but it's easy to make up new words.)

    I suspect the real need for this meta-language has to do with this project's association with the UN: They don't want to offend any ethnic group by choosing an existing language as the standard.

    --

    MSK

  79. Well, here's how it could work. by Anonymous Coward · · Score: 0

    They should translate from the source language, through a mediator, into a middle language representation, and prove that anything in the middle language representation can be translated into the destination language.

    The middle representation language would have temporal, spatial, and objective information, as well as references, and changes in these "dimensions". I'm sure there's a more formal spec for language in general.. I have no experience in this area.

    The mediator would detect concepts that could be vague (i.e., they have multiple reference levels), and provide alternatives that the client would pick from (that are already represented in the middle language). The mediator could then learn context cues from this, but it probably wouldn't be a good idea to use them until after a ton of experience is accrued in this respect.

    The provided alternatives would be created by "programmers" in this metalanguage who also speak the source language themselves. This is a HUGE programming project..

    When converting from the metalanguage into the destination language, the metalanguage would be compacted into phrases in the dest. language that semantically match what the metalanguage is saying. The annoying thing (and the reason it works) here is that if a figure of speech in the source language is really a large branch of references to other things that are completely foreign to a speaker in the destination language, the simple source phrase could blow up to a large dissertation on the concept being described in the destination language.

    source->mediator->metalanguage->compactor->dest.
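
    A very rough sketch of that pipeline, assuming the mediator works from a hand-built sense inventory and asks the writer whenever a word has more than one sense. The inventory, the concept tags and the destination phrases are all made up; the `choose` callback stands in for the interactive prompt.

```python
# Hypothetical sense inventory written by the "programmers" of the metalanguage.
SENSES = {
    "bank": ["BANK/institution", "BANK/riverside"],
    "closed": ["CLOSED"],
}

# Hypothetical destination-language phrases, one per concept.
# A single concept may expand into a much longer phrase.
DEST_PHRASES = {
    "BANK/institution": "the financial institution",
    "BANK/riverside": "the sloping ground beside the river",
    "CLOSED": "is closed",
}


def mediate(words, choose):
    """Turn source words into concepts, asking `choose` about ambiguous ones."""
    concepts = []
    for word in words:
        options = SENSES[word]  # KeyError means the word is not covered yet
        if len(options) == 1:
            concepts.append(options[0])
        else:
            concepts.append(choose(word, options))
    return concepts


def compact(concepts):
    """Render the metalanguage concepts as destination-language text."""
    return " ".join(DEST_PHRASES[concept] for concept in concepts)


def translate(words, choose):
    return compact(mediate(words, choose))
```

    With a `choose` that always picks the second option, `translate(["bank", "closed"], choose)` produces the riverside reading, and you can already see one short word blowing up into a longer phrase on the way out.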

    Would it work? I don't know. But it sounds like it could..

    --
    sean_dunn1@yahoo.com

  80. why esperanto sucks. by abfackeln · · Score: 1

    i have wanted to make a "universal language" for as long as i can remember, but i still have not found the time ...

    anyway, the real reason that esperanto is not successful is that it still has stupid rules -- for example, nouns still have gender which means that there are still too many pronouns and you still can not complete a sentence without knowing the gender of the subject.

    not to say that esperanto is bad, but we all know that esperanto is just spanish V2.0 and no one will admit it.

    a truly universal language must be written from scratch with all of the "fluff" removed. people say that you will lose the poetic qualities and you will lose the innuendo and colloquialism -- i say, that these people are pathetic whiners who are trained to be cynical of anything which could be considered progress. "poetic quality" and innuendo has _nothing_ to do with the language which it is written in. you might prefer german opera to italian, but it does not make one any "better" than the other. either way, you could still write your poetry in the language of your choice -- and, thanks to the UNL, people will still know what you're talking about.

    the fact of the matter is, effective worldwide communication is a much more serious matter than an old-fashioned idea of what is "good" poetry. poetry will persist so long as there are good poets; we do not need to accommodate them with prissy, "romantic" languages. this just makes it easier for unskilled drunks to make more sappy, bad poetry.

    as they say, you have to crack some eggs to make an omelette. i say thanks to UNU for cracking some eggs, and to everyone that thinks they can improve upon any of the current languages, please stop picking eggs out of the trash.

    -abf.

    --
    -abf.
    1. Re:why esperanto sucks. by Anonymous Coward · · Score: 0

      No Esperanto user will admit that nouns have gender in Esperanto because they DON'T! Nouns have number and case, and adjectives must agree with the nouns they modify. It took me about three years of Spanish classes to achieve the same level of fluency as two months of studying Esperanto on my own. Looks like "Spanish v2.0" is quite an improvement over the original! For a totally logical language written from scratch, try Lojban (www.lojban.org). It looks like a computer language (it even has a full BNF grammar), but it's meant for humans. Some of the designers can even converse in Lojban, with great difficulty.

  81. Bad example. by riboflavin · · Score: 1

    Babelfish isn't great. We all know this. However babelfish doesn't use an intermediate language, and no, French is not an intermediate language. The idea behind an intermediate language is that you can have groups working very hard to get their language to translate into the intermediate language. Then, by doing that, their language can be translated into every other language that the intermediate language supports. If you worked for as long as it would take to translate your language into 10 different languages on only one language, you'd come out with a pretty good translation. An intermediate language would also have the advantage of being able to be optimized to be translated into and especially translated from.
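
    The arithmetic behind that argument, as a quick sketch. It assumes one translator per direction for the direct approach, and one converter in plus one out per language for the intermediate approach; the function names are just for this example.

```python
def direct_translators(n_languages):
    """One translator per ordered pair of distinct languages."""
    return n_languages * (n_languages - 1)


def pivot_converters(n_languages):
    """One converter into the intermediate language and one out, per language."""
    return 2 * n_languages


def effort_ratio(n_languages):
    """How many times more components the direct approach needs."""
    return direct_translators(n_languages) / pivot_converters(n_languages)
```

    For the 16 languages in the announcement that is 240 direct translators against 32 converters, a factor of 7.5, and each group only ever has to work on its own two.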

  82. Re:Bahasa Indonesia by TheDoc · · Score: 1

    1. No articles? You gotta be kidding!
    2. One tense is not necessarily true. We have other ways to explain the tense in our sentences.

    I don't want to explain much about Bahasa Indonesia, since this is not a linguistics site.
    But, as a person who has been involved in several computational linguistics projects in Bahasa Indonesia.. we do have our own difficulties to deal with.
    One of them is the verb formation, which is very flexible. This makes the stemming algorithm work harder for Bahasa Indonesia than for other languages.
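
    To give a feeling for it, here is a deliberately naive stemmer sketch. It is not any published algorithm: the root list is a four-word toy lexicon, and the affix table covers only a handful of prefixes and suffixes, including the cases where the meN- prefix swallows the first letter of the root (tulis becomes menulis).

```python
# Toy root lexicon; a real stemmer needs a full dictionary.
ROOTS = {"baca", "tulis", "main", "makan"}

SUFFIXES = ("kan", "an", "i")

# prefix -> letters that may have been dropped from the start of the root
PREFIXES = {
    "mem": ("", "p"),
    "men": ("", "t"),
    "meng": ("", "k"),
    "meny": ("s",),
    "ber": ("",),
    "di": ("",),
    "ter": ("",),
}


def stem(word):
    """Return a known root for `word`, or None if no candidate matches."""
    # Start with the word itself, then every suffix-stripped variant.
    candidates = [word]
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            candidates.append(word[:-len(suffix)])

    # For each of those, try stripping a prefix and restoring a dropped letter.
    for candidate in list(candidates):
        for prefix, restored in PREFIXES.items():
            if candidate.startswith(prefix):
                for letter in restored:
                    candidates.append(letter + candidate[len(prefix):])

    # The lexicon decides which candidate is a real root.
    for candidate in candidates:
        if candidate in ROOTS:
            return candidate
    return None
```

    Even this toy has to generate several candidates per word and lean on a dictionary to choose between them, which is the extra work I mean.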

    regards,
    The Doc

  83. Re:Um..is Enconverter even a word? by dos+equis · · Score: 1

    As a person who speaks only English, but resides in a country where the official languages are French, Flemish (Dutch by any other name), and German, I can tell you from personal experience that it is *much* easier to understand someone speaking another language than it is to try to make yourself understood in another language.

    In my experience, this is universal.

    I, for one, find I can get my idea across in Spanish a lot more often than I can understand what a native Spanish speaker is trying to tell me.

    dos/tres equis
  84. Re:Sankrit as the meta language? by Anonymous Coward · · Score: 0
    No, no, no!

    This is a bizarre urban myth in some circles. Sanskrit is absolutely not a good computer language. It is wildly irregular, has a great many verb forms and numerous noun declensions.

    Further, in English, when you say "and Bob" quickly you often get something like "am Bob." Sanskrit did the same. The awful part is that they wrote these word boundary changes (called sandhi). This is very hard for people... I can't imagine computers will find it any less ambiguous.

    Sanskrit is about as logical/regular as Latin or Greek. Which is to say, as much so as any natural language.

  85. UNU Appears to be lacking in linguistic knowledge by Fnordulicious · · Score: 1

    It appears that the people at UNU who discussed this idea of UNL didn't bother to talk to anyone trained in modern linguistic theory. (I don't claim to be trained, but I have more than a passing familiarity with linguistics, *and* I read /., so...)

    The first major problem that they will have is defining a syntax for the language. That's not so tough if you just define an arbitrary syntax and leave it at that. But I suspect that they will try hard to design a syntax that distills the most popular aspects of each of the languages that they're translating from, thus getting stuck in a linguistic tar-pit from which they will never escape. I hope.

    You see, there have been many attempts at discussing the "universal syntax", that is the base syntax for the language that the brain uses. In most flavors of Chomskyan syntax theory this is termed something like "deep structure" (lately it's been "D-Structure" to avoid any implied but inaccurate meanings of the word 'deep'). DStruct is in essence the most general syntax needed for accurate expression of any sentence structure in any human language. It's supposed to be general, not differentiating between different languages on the syntax level (notice that I haven't mentioned meaning yet -- that's something completely different, the /semantics/ of a language).

    This unfortunately isn't usually the case, since many languages like to put different sentence constituents in different locations within a typical sentence structure. English, f'rinstance, is said to have an SVO (Subject Verb Object) word order. That is, the Subject of a sentence will be the first major constituent in a sentence, followed by the Verb, then the Object constituent. This isn't general, however. In the case of a question, the word order of an English sentence often (but not always!) changes to a VSO (Verb Subject Object) form. Other languages use completely different word orders. Japanese, IIRC, has a word order approximating SOV (could be wrong).

    That doesn't even consider the lower levels of syntax, where one discusses what's known as "X-bar theory". X-bar theory uses representations of constituent phrases connected in various manners to develop a phrase structure tree that represents the syntax of a particular sentence or phrase. Thus a noun phrase (NP) has two branches from it, one being a specifier (Spec), the other being the intermediate projection of the NP (N', read "n-bar" for hysterical raisins). N' in turn projects a complement (Comp) and a noun (N). Syntax in one language, say English, will project the Spec to the left of NP, and the Comp to the right of N'. Thus the noun N is in the middle of the NP. This isn't true for all languages; they are free to choose whatever branching order they wish to have (dependent on certain Parameters which define particular instances of Principles, which I won't get into).
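
    A small sketch of the branching-order idea, assuming the simplified NP structure described above (NP branches into Spec and N', N' branches into N and Comp). The two boolean switches are a crude stand-in for Parameters, and the class and function names are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class NounPhrase:
    spec: str   # specifier, e.g. a determiner
    head: str   # the noun N
    comp: str   # complement of the noun


def linearize(np, spec_first=True, comp_last=True):
    """Flatten the NP tree into a string under a given branching order."""
    # N' -> N Comp, or Comp N, depending on the complement setting.
    n_bar = [np.head, np.comp] if comp_last else [np.comp, np.head]
    # NP -> Spec N', or N' Spec, depending on the specifier setting.
    parts = [np.spec] + n_bar if spec_first else n_bar + [np.spec]
    return " ".join(part for part in parts if part)
```

    With the English-like settings, `NounPhrase("the", "destruction", "of the city")` comes out as "the destruction of the city"; flip `comp_last` and the same tree comes out as "the of the city destruction". Same structure, different surface order, and that is only one phrase type.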

    Another theory, Head-driven Phrase Structure Grammar says that every word projects its own dependent structure, and that the structure projected from a word in the lexicon must adjoin properly to other words projected from the lexicon to form grammatical sentences. This theory also takes into account some semantics issues as well, and is very popular amongst the Computational Linguistics and Natural Language Programming crowds, but isn't too popular amongst the older ranks of theoretical linguists. It too is language dependent in its structure of syntax, although very comprehensive syntaxes of certain languages have been developed with some success.

    That's just syntax. It's not easy. It's not very regular. It's very context sensitive. If anyone has written a compiler for any programming language they know how complex a language will get if you allow it to be context sensitive (instead of context free).

    Semantics, the meaning behind a particular word or phrase, is a ridiculously complicated problem in linguistic research. People have spent their entire lives researching it with little success, and at various times in the history of linguistics certain well-known demagogues have denounced the study of semantics in its entirety because it appeared to them to be too unfounded or scientifically unreasonable. Chomsky to this day makes nasty comments about semanticians and is well-known for denouncing research into semantics because most work is not provably consistent in even restricted domains.

    Semantics is gnarly. It's weird. Researchers who work in semantics are said to get their more successful ideas from hallucinogenic chemicals. Semantics is a subdiscipline in which any random researcher can overturn the field with one paper, tossing out all of the research done previously -- and get away with it successfully. I don't mean to degrade the work of semanticians, and I'd love to join their ranks some day, but it must be admitted that much of semantic research and theory has a hard time standing up because it's in its infancy.

    Look carefully at the construction of a programming language compiler. It deals with what's known as a 'formal language'. This is a language that is known to follow certain rules consistently, and all special cases are well-defined (for most languages anyway ;-). The lexical side of the language (the part you wrote in lex) is usually a bit simpler than the grammar and the semantic actions hanging off it (the part you wrote in yacc), if you examine the respective sources for complexity. Now consider the fact that for *any* human language both of these tasks are exponentially (perhaps even factorially) more difficult. Since semantics is at least an order of magnitude more complex than syntax with respect to computer languages, one could imagine how bloody awful complex this is for a human language. Now consider that to make a translation requires *complete* semantic comprehension of both the source and target languages -- translation is not a simple word-for-word lookup table (and I'm glad it isn't -- we wouldn't have much expressibility and I wouldn't be able to write this if it were).
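
    For anyone who hasn't written one, here is a hand-rolled sketch of the three layers for a toy arithmetic language: tokenizing (roughly lex's job), parsing (roughly yacc's job), and a meaning step that evaluates the tree. It is a stand-in written for this comment, not output of lex or yacc, and the grammar covers only integers, `+`, `*` and parentheses.

```python
import re


def tokenize(text):
    """Lexical layer: split text into (kind, value) tokens."""
    tokens = []
    for lexeme in re.findall(r"\d+|\S", text):
        if lexeme.isdigit():
            tokens.append(("NUM", int(lexeme)))
        elif lexeme in "+*()":
            tokens.append((lexeme, None))
        else:
            raise SyntaxError("unexpected character %r" % lexeme)
    return tokens


def parse(tokens):
    """Syntax layer: build a tree, with * binding tighter than +."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():
        kind = peek()
        if kind == "NUM":
            return advance()[1]
        if kind == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError("unexpected token %r" % (kind,))

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree


def evaluate(tree):
    """Semantic layer: give the tree a meaning (here, a number)."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) * evaluate(right)
```

    Every rule here is fixed and unambiguous, which is exactly what a human language refuses to be.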

    To put all of this into perspective, consider a universal translator for computer languages -- what's it called? It's called a computer. So what do we call a universal translator for human languages? Surprise -- a human.

  86. Re:UNL? Yeah, right! by Artie+FM · · Score: 1
    What's so special about UNL? Theoretical translation of language A into a universal language and from there to language B is almost as old as "machine" translation itself.
    The fundamental argument that it hasn't worked before so it isn't going to work now is stupid. It has been demonstrated how difficult it is to do this, but not that it is impossible.
    For a good example of the total and dismal failure of machine translation, try translating this text into French (or Spanish, or Italian, or whatever) with Babelfish and back to English. Then do it a few times. Then try English to Chinese and back a few times. Case closed.
    Hardly. Here's why that is not a valid test:
    1. Babelfish doesn't use an intermediate language.
    2. Babelfish doesn't even achieve lossless translation from language A to B and back to A. This is the simplest case and one which can be improved the most with a good definition for UNL.
    It is, in fact, an even better AI test than the Turing test.
    They do not claim perfect translation, but yes, a computer which could translate between languages and do it perfectly would pass the test. Do you really argue that it is impossible for computer programs to ever pass the Turing test? It is only a matter of time till this happens. The only way to stop it is to stop making computers.
    Frankly, would you trust something as big, bureaucratic and inefficient as the UN to determine the next standard in machine translation?
    This could be a real concern. You have to hope that once the UNL is defined you could extend it for your own purposes and still have every thing work.
    Finally, I have some friends who work at the UN as official translators, and they are doing perfectly fine, thank you very much (and, I should add, making some serious money). Why? Because, AFAIK, no machine has ever been able to translate perfectly.
    Here we are reading /. At the very heart of the cutting edge. Isn't it obvious to all us that the only thing we know we can expect in the next few decades are massive amounts of change? I wouldn't expect your friends to be out of work any time soon. But isn't the job of a professional translator radically different now than it would have been 100 yrs ago? Political change was not the only thing that caused this change... communication technology has had a big role.
    Machine translation has its place, but only on documents of a very limited scope/vocabulary and of a very repetitive and technical nature. Even then, a human translator is needed to correct the multiple mistakes made by the machine.
    Do you honestly believe this is the best possible solution? That machines can't get better?
    --
    Be insightful. If you can't be insightful, be informative.
    If you can't be informative, use my name
  87. Isn't the time protocol NTP, or xntp3? by Anonymous Coward · · Score: 0

    Or maybe SNMP or SMTP? I think there is a protocol for every 4 letter acronym ending in 'P'

  88. The Problems with Meta Languages by Sorklin · · Score: 1
    Original

    When you look at existing technology, like Babelfish at Altavista, you see that the 'devil in the details' might be more of a 'great satan' than one might think. I'm not sure you can have any kind of accurate translation without a human acting as a filter for meaning. Its easy to apply some rules to a metta language interpreter, but using it in discourse would probably create quite a bit of ambiguity. Just look at this translation if you don't believe me.

    English to German and Back

    If you regard available technology, like Babelfish with Alta Vista, you see you that the ' devil in the power of the details could think much more from a large satan than one. I am not safe you can type exact translation without human serve as a filter for meaning to have. Its easy, some guidelines to more mettasprachinterpreter to apply but at using it in the statement would probably create much ambiguity. Straight lines view of this translation, if you do not believe me.

    English to French and Back

    When you look at existing technology, like Babelfish at Altavista, you see that the ' devil in the force of the details much more than one great Satan which one A could think. I am not sure you then not to have any kind of precise translation without acting human as a filter for the significance. Its easy to apply some rules to an interpreter of language of metta, but to use it in the speech would probably create ambiguity much. Glance right with this translation if you do not believe me.

    and my personal favorite....

    English to Portuguese and Back

    When you look at it existing technology, as Babelfish in Altavista, sees that ' the devil in the power of the one details much more satan great of that one could think. I am not certain you I can have no type of the accurate translation without acting human as a filter for meaning. Its easy one to apply some rulers to an interpreter of the language of metta, but to use it in the speech would create probably the ambiguity sufficient. To look at just in this translation if you not to believe me.

    Need I say more?

  89. Inconsistency + BS by QianKun123 · · Score: 1

    The UNL will be inconsistent, as a few messages have already pointed out.

    Moreover, is this supposed to be the project of some freshman? The web page is messed up; there are lots of errors. One of the lines says "How to joint the UNL Community" on page http://www.unl.ias.unu.edu/eng/unlhp-e.html. I found a few just by looking at it. I think the people who are responsible for this do not even care. The pages are poorly coded (made by some win9x program) and the pictures look distorted. They did not even give an explanation of how it will be done.

    <!--#include virtual="disclaimer"-->
  90. Re:It won't work. by hab136 · · Score: 1

    So what? They're not trying to translate television. They're *trying* to translate "Legal papers [and] UN treaties".

    So it only works for boring documents. They're plenty happy with that.

  91. Re:Do they have a clue? by Glytch · · Score: 1

    What's the difference?

    My apologies to all you fifth-graders out there, sorry.

  92. Better than babel? by nowan · · Score: 1

    The question is simple -- will it work better than babel? Babel sucks, sure, but it's sure useful when I know something has the information I want, but don't speak the language. For that sort of thing, I suspect that this would be very useful.

    And if/when its use becomes widespread people might start writing to the meta-language. Not writing in it, necessarily, but, for example, being explicit on things that would confuse it. If that happens, then it really would work.

  93. Re:As a (formerly) bilingual person... by Cuthalion · · Score: 1

    On the overall topic of a metalanguage: I'm familiar with Chomsky's theory of a universal grammar and if we could reconstruct that grammar, a metalanguage would be possible to create and implement. However, without a good understanding of the universal grammar, it would seem to be nearly impossible. We're too set in our individual grammatical tracks to fully understand those of other languages. That's one reason why teaching small children multiple languages is so successful: their internal grammar is not yet completely rigid.

    Incidentally, all the Lojban (and Loglan) aficionados cite this as the reason why people whose native language is a more expressive, less biased one (such as Lojban) would have freer and more powerful minds. (I'm simplifying this a bit). I don't recall the name of this hypothesis, but it's attributed to someone or other.

    For those who don't know, Loglan is a conversational language, the grammar for which is based upon predicate calculus. It's nothing like any spoken language, and has some fairly rigid rules for everything from generating/importing new words to constructing unambiguous sentences. Lojban is an updated version of loglan. I've looked at the primary Lojban book, and it seems pretty awkward, though that could just be that I'm not used to it.

    --
    Trees can't go dancing
    So do them a big favor
    Pretend dancing stinks!
  94. Re:Oopss.... by smoondog · · Score: 1

    /. took out my g. I meant that English thing as a joke....

    Seriously though, having a universal translator might help curb the americanization of the world. Even though I strongly suspect it won't work very well.....

    -- Moondog

  95. Brussels ? Why Brussels ? by Kajakske · · Score: 1

    Hey,
    Why are they presenting their work in Brussels ?
    Let me ask you one question : What language(s) do people speak in Belgium ?
    Answer : I'm from Belgium and I speak Dutch, and that language is not included in their (little) list.
    Too bad :)
    The other part of Belgium will be pleased (they speak French, bah!), but why in Brussels ?
    Maybe because Belgium is in the centre of Europe, but why in Europe ? Anybody got an explanation ?


    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    Belgium HyperBanner
    http://belgium.hyperbanner.net

  96. Re:The "meta-language" by Anonymous Coward · · Score: 0

    The beautiful thing about English is that it's shameless. If it finds itself lacking the ability to express something, it is more than willing to steal a word from someone else, and make it its own.

    Would you rather use French, which, last I checked, is still 18th century?
    (and 10% the size of English)

  97. Devil's in the Details Indeed! by Anonymous Coward · · Score: 0
    "United Nations: Rebuilding the Tower of Babylon!"

    Hope they can do better than BabelFish.

  98. sounds difficult by TheCodeMaster · · Score: 4

    I'd think it would be difficult to make an abstracted meta-language out of human languages. There's lots of grammatical issues which would be particularly difficult to deal with well.

    For example, in the case of inflected languages, how do you get the declensional case information into the metalanguage? In many languages, there are grammatical cases that have overlapping declensions, so there's ambiguity about the intended meaning. And mapping between languages would be really tough.

    Verbs would be really tough. Like in Russian, you have three tenses (past, present, and future) as well as two verb aspects. So you have pairs of verbs, one expressing action that occurs once, the other expressing habitual activity.

    Sounds like the project would be lots of fun to work on, though. It's a really neat idea, linguistically.


    1. Re:sounds difficult by piotrr · · Score: 1

      Given that the Finnish language was not in the listing of proposed supported languages, this leads me to believe that the people behind the UNL project have been on the same track as you are here.

      I just don't know.. This whole thing reeks of pointy-haired suitness. For one, we already HAVE an abstracted meta-language unifying people of all tongues and it's called English.

      I for one will probably just take the time to simply learn the UNL metalanguage itself to avoid the trouble of people not understanding my grammar.

      Do we not have enough misunderstandings in mediums of this sort as it is? People who counter-argue any reasoning with "You probably do not mean what you are saying" or "I do not think I understand what you mean (because I'm trying really hard not to)" will have an easier time than ever evading the issue, saying "It seems we are losing something in the conversion here.. (he he he)"

      To wrap this up, I actually think the initiative is good. The outcome is uncertain, but even if translations will be far from perfect, hopefully they will be helpful as long as you aren't picky about your input, and picky about your output, as the saying goes.

      / per

      --
      / Per
    2. Re:sounds difficult by Chalst · · Score: 1

      I think the metalanguage would need to carry all of the specific case/gender/tense information for each of the languages that can be embedded in it.

      It sounds possible to me: phrase structure grammars can express the syntactic structure of all natural languages (Noam Chomsky's result). The meta language will be very complex though...

    3. Re:sounds difficult by EisPick · · Score: 1

      I think the metalanguage would need to carry all of the specific case/gender/tense information for each of the languages that can be embedded in it.

      This sounds fine if everything is authored in the metalanguage. But what happens when you begin with a language like Cantonese, which has no verb tenses, or an inconsistent language like English, where "read" can be past or present tense?

    4. Re:sounds difficult by Chalst · · Score: 1

      Two points:

      (1) Translating into the metalanguage is in general ambiguous. However language is equipped with lots of hints to disambiguate just these cases, and so for the cases where language is put to use, it should be possible to apply heuristics.

      (2) German also has much less tense structure than English. The passage of time is indicated using modal particles, and the metalanguage will need to possess transformation rules for switching between tensed and modal representations.

      I gather from a linguist friend that systems such as these exist already. The exciting thing about this proposal is that it covers such a wide variety of languages and carries the stamp of the UN.

  99. Argh! Esperanto? Loglan? by xyzzy · · Score: 1

    This thing should have a Monty-Python-Foot icon for it... Real or not!

  100. Re:sounds difficult - not as you say by piotrr · · Score: 1

    A language may not "have" a specific tense as a part of its grammar, but the tense can be expressed. If not by word permutation, then by context or additional words.

    / per

    --
    / Per
  101. Privacy?? by kannen · · Score: 1

    Hmm... Hope you'll be in charge of the "translator", otherwise Echelon is no longer an issue. The UN (and therefore, our own governments) will automatically be monitoring anything which is translated...

  102. Lossy compression by Anomie-ous+Cow-ard · · Score: 1
    Lossy compression is more or less the reason most efforts to translate languages will have rather poor results. First, we take the UNL as analogous to the uncompressed version of the data (i'll use image analogies). Every natural language then represents a lossily compressed version of the data, with various aspects eliminated as unimportant to the particular user group that speaks it (one language uses 600dpi but eliminates color, one reduces the resolution to only 72 dpi but uses 24-bit color, and so on). The problem comes in when you try to go from one form of compression to another, each of which eliminates different aspects of the data (you end up with a greyscale 72dpi version, which is considered poor quality by both parties). While the image analogies i've given are a bit extreme, language actually is that bad in some cases.
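
    The analogy above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the feature names and the two "languages" (`lang_a`, `lang_b`) are made up for illustration, and real languages obviously do not reduce to a set of keys.

```python
# Hypothetical "uncompressed" message: every aspect of the meaning is present.
FULL_MESSAGE = {
    "resolution": "600dpi",
    "color": "24-bit",
    "tense": "past",
    "politeness": "formal",
}

# Hypothetical languages: which aspects each one is able to keep.
LANGUAGE_KEEPS = {
    "lang_a": {"resolution", "tense"},           # fine detail, no color
    "lang_b": {"color", "tense", "politeness"},  # color, no fine detail
}

def encode(message, language):
    """Lossy step: keep only the aspects this language can express."""
    kept = LANGUAGE_KEEPS[language]
    return {key: value for key, value in message.items() if key in kept}

# Author writes in lang_a, reader receives a translation into lang_b.
in_a = encode(FULL_MESSAGE, "lang_a")
a_to_b = encode(in_a, "lang_b")
print(a_to_b)  # only what both languages keep survives
```

    Whatever the first encoding dropped cannot be restored by the second, so the result holds only the aspects both sides keep.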

    Add to this typos and word choice errors (transmission noise?), slang, jargon, and all the other ways we distort language. How can any reliable translation be made?

    Disclaimer time: i am not any sort of language lawyer, so if my post contains inaccuracies don't flame, just correct.

    --
    perl -e'$_=shift;die eval' '"$^X $0\047\$_=shift;die eval\047 \047$_\047"' at -e line 1.

  103. Computer translations by Glytch · · Score: 1

    A fine idea in theory, but the big question is "How?" Sounds to me like some folks who don't really understand what computers are capable of are trying to sound important. Those who do not understand Babelfish are condemned to repeat it.

  104. The "meta-language" by Keelor · · Score: 2
    It seems to me that the strength of the meta-language will be the entire strength of this system. The question is, will the meta-language be skewed towards one language (*cough* English *cough*) or will they manage to create a language that does not impose biases toward one language.

    Overall, I agree strongly with the idea. From a testing standpoint, with the development of an effective meta-language, all one would need to do to test the translation, for the most part, is go from language x->meta language->language x. If that works, then presumably the meta language did not slaughter language x.

    One question I have is how the language engine will handle words it does not know--or, more likely, abbreviations, misspellings, and slang. From what I've gathered, this is where other translators fail. If the translator doesn't understand half the sentence, then it generally has too much trouble finding context for the rest for anything to make sense. Just a thought.
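
    A minimal sketch of that round-trip test, in Python. Everything here is an assumption for illustration: the concept IDs (`C_RED` and so on), the tiny word lists, and the function names are invented, and a real engine would have to handle grammar, not just word lookup.

```python
# Toy lexicons mapping words to hypothetical meta-language concept IDs.
TO_META = {
    "en": {"red": "C_RED", "ball": "C_BALL", "book": "C_BOOK"},
    "fr": {"rouge": "C_RED", "balle": "C_BALL", "livre": "C_BOOK"},
}

def to_meta(text, lang):
    """Map each word to a concept ID; unknown words become None."""
    return [TO_META[lang].get(word) for word in text.lower().split()]

def from_meta(concepts, lang):
    """Map concept IDs back to words; unknown concepts render as '???'."""
    reverse = {concept: word for word, concept in TO_META[lang].items()}
    return " ".join(reverse.get(concept, "???") for concept in concepts)

def round_trip_ok(text, lang):
    """True if language x -> meta language -> language x gives the input back."""
    return from_meta(to_meta(text, lang), lang) == text.lower()

print(round_trip_ok("red ball", "en"))      # every word is known
print(round_trip_ok("crimson ball", "en"))  # "crimson" is not in the lexicon
```

    Note that a passing round trip only shows the meta language kept what language x put in; it says nothing yet about how well language y comes out the other end.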

    -Keelor

    1. Re:The "meta-language" by Anonymous Coward · · Score: 0

      Why not use Japanese kana characters, which are already a conversion from ideograms to letters, of a sort, complete with rough equivalency to English, which is the most widely spoken language, and 2nd in numbers?
      (assuming you lump all the forms of Chinese under 1 name)

    2. Re:The "meta-language" by Anonymous Coward · · Score: 0

      English has already accounted for many words from other languages.
      If you wanted a frequently stolen language, convert to Sanskrit or Latin. Hell, any dead language. You'll lose a majority of new terms, but it would roughly accommodate many. English is a good compromise because it's a mutt language. German grammar, Latin words, and then fragments of everything else.

    3. Re:The "meta-language" by Master_Ruthless · · Score: 1

      Is there any reason that it *shouldn't* be skewed towards English, or for that matter, even *be* some kind of modified English? Remember that nobody reads the meta language, and there's no reason to make up another whole system of symbols and words since we have one already. Therefore, we should pick as the metalanguage whatever language is able to hold the most concepts from other languages. I don't know the linguistics of it, but I do know that English has a large number of words, which have been borrowed from other languages- seems like a logical choice to me.

    4. Re:The "meta-language" by 12dec0de · · Score: 1

      The very simple reason for not using a modified English is the fact that a language expresses how you think, and there are some things that are inaccurately or incompletely expressed in English. Unfortunately I cannot give an example, as a) we are holding this discussion in English and b) none comes to mind right now.

      I wonder how this translator is supposed to handle this issue. It would hardly be used in diplomatic circles, as every nuance is needed there.

      Anybody ever read 'When HARLIE was one 2.0' by David Gerrold? It talks a lot about the talking-thinking paradigm in an easily digested novel form.

      mfg

    5. Re:The "meta-language" by Keelor · · Score: 1
      In one respect, I agree with you whole-heartedly. It doesn't matter if nearly every word in the meta-language is English--though it would help to have multiple forms of words like "read."

      What it comes down to is that there will really have to be two "engines"--one for words (which anyone can do, frankly), and one for grammar. I think that most translators up to this point have failed to translate well primarily because grammar is so language specific. As such, if you base your meta-language grammar on English, all other languages become bastardized because of the inconsistencies in English grammar.

      I don't expect any translator to pick up on every nuance of speech (such as puns) but it would at least be nice to be able to read the translations without having to reorganize the sentence due to grammar translation failures.

      As a side note, does anyone know of any computer-automated translators that actually do translate grammar well?

      -Keelor

    6. Re:The "meta-language" by Andrej+Marjan · · Score: 1
      That depends on what you're trying to translate. For nontechnical materials, English is an extremely poor language: the culture just isn't rich enough to express so many concepts and feelings in life and art. Especially since "passion" seems to be a dirty word in English. And you can forget about humour.

      This might work for bureaucratese, since the underlying subculture seems to transcend human cultures. Same for other subcultures. But you'll never make it fly for general communications, no matter what intermediate representation you use, because so much of what's worth saying can't be translated from one cultural context directly into another, much less via an extra step.
      --
      Change is inevitable.
      Progress is not.
  105. Re:Sounds like Esperanto - take 2 by twinpot · · Score: 1

    I don't think so - at least French (and Italian/Portuguese and possibly Romanian) all have similar structure and rules - it's more a question of learning the vocab.

    Learning Italian for me was relatively easy because I already knew French. English structure and word order is different, more akin to the Germanic languages.

    One advantage English does have over some other languages is that you can really bastardise it, but still make yourself understood. Plus, there are so many variants.

  106. Re:Interesting, but maybe off the mark by redfoxtail · · Score: 1
    On the other hand, let me assure you that there is also a significant movement in linguistics away from Chomsky's theories. The cognitive linguistics community is pretty virulently opposed to generative grammar and "deep linguistics."

    There are some difficulties in Chomsky's work that arise from extrapolating general rules from "basic" cases, rather than unusual, aberrant cases, and research suggests that other cognitive models shape the way we speak. Many linguists are still very into generative grammar, but it is far from a unanimous conclusion.

  107. Good Luck by smoondog · · Score: 1

    There is more to a language than just translating words, as any Babelfish user will tell you. My first problem is that it is so hard to get anybody to actually use it as a standard. It is very easy to come up with a standard. Even M$ with all their power have trouble trying to execute standards.

    Teach'em all English. That's my solution.


    -- Moondog

    1. Re:Good Luck by Betcour · · Score: 1

      Do you mean English english or American English ? Or maybe some variation of them ?

    2. Re:Good Luck by Glytch · · Score: 1

      Ah, take off ya hoser. It's Canadian English, eh?

  108. There can be a level of success... by Dirtside · · Score: 1
    ...in this project. The main problem stated, and valid it is, is that different languages have nuances, cultural references, double entendres, yadda yadda yadda that other languages do not have.

    So why not establish a universal SUBSET of languages that can express as much of as many languages as possible? For example, I'd be willing to wager that there's a way of expressing the phrase "The red ball is resting atop the green book" in every language in the world. At least, every language used on the net. Relatively simple subsets of communication would at least allow for SOME measure of success for this project, but I don't know how useful it could be. But it's certainly worth a try.

    To recap, obviously you aren't going to be able to translate haiku with this thing, but you could translate Linux installation instructions.

    --- Dirtside

    --
    "Destroy science and religion. Science would re-emerge exactly the same; but not religion." - Penn Jillette, paraphrased
  109. At least they're not calling it a new protocol... by tobyl · · Score: 1

    Sounds like someone's finally found a practical, widespread use for the OSI Presentation layer...

  110. Seems they're consciously dumping the nuances by DiningPhilosopher · · Score: 1

    It seems to me the approach is to dump the nuances of individual languages. The website describes a process whereby you type a message and watch a realtime native->universal->native translation. If the result doesn't match the input, you've used something that doesn't translate and you need to replace it with something that does.

    It's far from a true universal translation system, but I think it could be very useful in conveying simple information. I wouldn't attempt to distribute poetry this way but for lots of documents it would probably be understandable.

    --
    /* The beatings will continue until morale improves. */
  111. a pepsi generation??? by wrenling · · Score: 1

    What will be interesting is how they will handle the changing and sliding concepts/phraseology between languages. Language is more than a verbal construct; it's a representation of a culture.

    Classic cases of well intentioned translations that lost a little something:
    Come alive with the Pepsi Generation was released in China as 'Pepsi Brings Back Your Dead Ancestors!'

    Chevy Novas didn't do too well in Mexico either..
    no va... no go

    --
    Check out Magic Firesheep!
    1. Re:a pepsi generation??? by Anonymous Coward · · Score: 0

      have you ever read a business document?
      they are anything *but* simple, unfortunately.

      a company called sentius is doing some interesting work with context-based translation for the japanese-english market, and amusingly enough the legalese to standard english/japanese and medical lingo to english/japanese markets.

    2. Re:a pepsi generation??? by wynlyndd · · Score: 1

      And coke was represented by characters in China which stated, "Bite the Wax Tadpole" Sounds a bit kinky to me...

      --
      "Dogs and cats, living together...it's mass hysteria!"
  112. Remember the huge success the UN made of EDI ? by dingbat_hp · · Score: 1

    In a pre-XML world, some of us encountered the UN's efforts to make EDI (Electronic Data Interchange) a world standard.

    It was not a pretty sight. I doubt very much if a UN-sponsored human-readable language effort will fare any better.

  113. As a (formerly) bilingual person... by Katydid · · Score: 1

    I don't know what others do, but I used to switch back and forth quite easily depending on to whom I was speaking and the situation. I was in a French Immersion elementary school in B.C. (before moving to the US, where no such program exists). In the classroom we all spoke French constantly but as soon as recess started we would speak English with the exact same people (except the teacher). However, if one of the English speaking administrators came into the classroom, we would be in English "mode" again.

    Sometimes, even now nine years later, I still can think of a word in French before I can think of it in English.

    On the overall topic of a metalanguage: I'm familiar with Chomsky's theory of a universal grammar and if we could reconstruct that grammar, a metalanguage would be possible to create and implement. However, without a good understanding of the universal grammar, it would seem to be nearly impossible. We're too set in our individual grammatical tracks to fully understand those of other languages. That's one reason why teaching small children multiple languages is so successful: their internal grammar is not yet completely rigid.

    Of course, the obvious extension of this argument is that only a small child could invent a true metalanguage. Imagine the possibilities... :P

  114. Re:Language support - Esperanto? by fluffhead · · Score: 1

    Maybe they can use Esperanto as the basis for the intermediate translation language. IIRC that's what Esperanto was invented for in the first place (not to mention it's supposed to be logically consistent in grammar and pronunciation, unlike "natural" languages - surely a great boon to implementation).
    #include "disclaim.h"
    "All the best people in life seem to like LINUX." - Steve Wozniak

  115. Re:Who is gonna patent this first? by Spacey845 · · Score: 1
    Nobody. Prior art is all over the place, as is reasonably capable source code.

    NLT via a "generic" metalanguage is one of the Big Goals of computer linguistics after Speech recognition and production.

    It's a fascinating field, and any Joe Q Programmer can write a "reasonable" interpretation of the idea. Nobody has written a really good version just yet, and I don't think we're likely to for a while.

  116. The biggest problem that I see... by zantispam · · Score: 1

    ...is dialect and `slang' support. If I'm in the southern US and I say `yeah, I'm fixing to go do that', how will that be interpreted?

    Worse still will be Chinese support. They have, what, 2000 dialects of Mandarin alone? Will the UN force everyone to use the same language in a particular region? Or will this Meta-language understand that?

    Doable, but not very well. Wait until they start including slang and jargon. And we all thought Babelfish was bad...

    --

    censorship is a form of noise, which actively seeks to drown out content with silence - Crash Culligan
    1. Re:The biggest problem that I see... by Erasmus · · Score: 1

      The problem with dialects is helped a bit by the fact they intend to translate text rather than spoken language. Most people learned how to write in a 'standard' version of their native language in school; that cleans things up a bit right there. Plus, if they know that what they are writing is eventually going to be translated, they can exercise a bit of self-control to keep their grammar literal.

      Speaking of Chinese, it'll probably be the easiest language to translate. It has a fairly simple grammar and it is written more or less the same way for every dialect. *SPOKEN* Chinese, now that would be a nightmare to translate...

      Erasmus

  117. We already have a "Universal Language" by Blackheart · · Score: 2

    We already have a "Universal Language." It's called English.

    I'm not trying to be facetious; I'm not saying English is better than other languages; and I'm not saying that English will serve you best, or even tolerably well in all places; but it is an inevitable conclusion you must come to after spending any reasonable length of time abroad: if there is anything resembling a universal language in this world, it's English.

    English is already a lingua franca in technical and many academic fields. Many universities in non-English-speaking countries actually demand that graduate students write their theses in English, because that is the best way to ensure its diffusion. Some such schools even conduct their classes themselves in English.

    The Hollywood movie industry has also no doubt played a large part in helping to make English (not to mention Western culture) palatable and popular the world over. Dubbed versions of films are hardly ever as popular as subtitled ones (exception: kiddie films).

    Is English the best choice for a universal language? Definitely not from the point of being easy to learn. Esperanto would be much better. But realistically Esperanto doesn't have a chance. If English ever encounters a contender, it will probably be Chinese, if only because 1/5 of the planet speaks the language.

    BH

    1. Re:We already have a "Universal Language" by Pseudonymus+Bosch · · Score: 1
      if there is anything resembling a universal language in this world, it's English.

      You wouldn't say that a century ago. You won't say that in a century.

      Dubbed versions of films are hardly ever as popular as subtitled ones

      Goes with the country. Try to find a subtitled film in a medium-sized Spanish town. Try to find a non-English nor Spanish dubbed film.

      it will probably be Chinese, if only because 1/5 of the planet speaks the language.

      The languages. But it's quite probable we'll switch to some form of Chinese.
      --
      __
      Men with no respect for life must never be allowed to control the ultimate instruments of death.
      GW Bu
    2. Re:We already have a "Universal Language" by Blackheart · · Score: 1
      if there is anything resembling a universal language in this world, it's English.
      You wouldn't say that a century ago. You won't say that in a century.

      A century ago, there was no global communications system. In a century matters might well change, I agree. But probably as long as America is the world's most powerful economic, technological and military force, English will remain the de facto lingua franca. Also, it seems unlikely that all the scientific results which have accumulated in English form will be translated wholesale into another language any time soon. (OTOH, programming and specification languages are approaching a state where mathematics can be codified to a reasonable degree, so this may turn out to be unimportant.)

      Dubbed versions of films are hardly ever as popular as subtitled ones
      Goes with the country. Try to find a subtitled film in a medium-sized Spanish town. Try to find a non-English nor Spanish dubbed film.

      I suppose this is probably true. I am speaking from my experience in Japan, where if you go to a video rental shop you will find that 85% of the copies of a new release are subtitled, and all of the old releases are.

      it will probably be Chinese, if only because 1/5 of the planet speaks the language.
      The languages. But it's quite probable we'll switch to some form of Chinese.

      I was actually discounting Cantonese. Far, far more people speak some form of Mandarin.

      BH

  118. Confused by Urmane · · Score: 3

    I'm a little confused ... does "Universal Networking Language" mean Esperanto or TCP/IP?

    --
    "I find your lack of faith disturbing." -- Darth Vader
  119. Re:Who is gonna patent this first? by sciuro · · Score: 1

    prior art should stop this - it's been done (or at least attempted) before, using a de-ambiguised esperanto as the bridge language, by (among others) klaus schubert in the netherlands. searches for "distributed language translation" or "DLT" (possibly intersecting with "esperanto") should turn up several references (some probably *in* esperanto) from the mid-late eighties. -duncan

  120. Should use Java bytecodes as intermediate language by Anonymous Coward · · Score: 1

    It's so crazy - it just might work!

  121. previous similar project by sciuro · · Score: 1

    a previous project, the Distributed Language Translation (DLT) project, based in the netherlands if i remember right, used a similar idea, using a de-ambiguised esperanto as the bridge language. as i remember, as a text was typed in a source language, it was translated into and stored as this bridge language. when ambiguities arose in the interpretation of the source, a query was sent to the typer as to which meaning was meant (eg: "i love her more than you"), and this distinction was preserved in the bridge language. when the text was required in a different language, this bridge language was translated into the destination language. since the bridge language was intentionally chosen and further designed to be more easily machine parseable and less ambiguous than the original, the translation work was made easier. searches for DLT and esperanto should turn up some references to the project, although a brief summary may clarify further. as far as i remember this was in the mid/late eighties. -duncan
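
    as a rough illustration of the query-the-typer step described above (a toy sketch in python, not how DLT itself was built; the table of readings and all the names are made up):

```python
# Hypothetical table of ambiguous source sentences and their readings.
AMBIGUOUS = {
    "i love her more than you": [
        "i love her more than i love you",
        "i love her more than you love her",
    ],
}

def to_bridge(sentence, ask):
    """Return an unambiguous bridge form, querying the typer when needed."""
    key = sentence.lower()
    readings = AMBIGUOUS.get(key)
    if readings is None:
        return key                    # treated as unambiguous in this toy
    choice = ask(sentence, readings)  # index of the reading the typer meant
    return readings[choice]

def always_first(sentence, readings):
    """Stand-in for the interactive query: always pick the first reading."""
    return 0

bridge = to_bridge("I love her more than you", always_first)
print(bridge)
```

    the point is that the chosen reading, not the ambiguous source text, is what gets stored, so the later step into the destination language never has to guess.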

  122. Re:Sounds like Esperanto - take 2 by Anonymous Coward · · Score: 0

    It's a big world, pal. Just because your map has your country in the centre/center of it doesn't mean that it's actually in the center/centre. =)

  123. Re:UNL? Yeah, right! by Noryungi · · Score: 3

    OK, here are some more answers.

    Watch out, this is very, very long...

    Don't think about it as "automatic" translation, it's much more likely to work out as semi-automatic. I expect that the process would be something like this:

    1.Run automatic converter from natural language to intermediate.
    2.Have an expert in the intermediate language review the translation.
    3.Run automatic converters to the target natural languages.
    4.Have linguists review the output.

    Compare and contrast with a "traditional" translation process:

    1. Ask a translator to translate from language "A" to target "B". Ideally, the person in charge of the translation should be fluent in language A, a native speaker of B and have at least basic knowledge of the subject at hand (for instance: Open Source).

    2. Ask a linguist, (ideally fluent in language A, native speaker of B, etc.) to review the translation produced at step 1.

    The point is that the intermediate language should be designed to be free of the ambiguities that plague language translation.

    And how exactly can you do this? Either your intermediate language is "limited" (that is to say: misses many of the subtleties of the original language), which eases step #1 but certainly introduces many errors down the line. Or it is an "advanced" language that is able to carry many of the finer points of your "start" language -- but then the interesting thing is the translation engine itself, not the intermediate language. If your translation engine is good enough to translate, say, Spanish into UNL with little/no loss of meaning, it is also good enough to translate Spanish to English with no intermediate step!! If this is true, what's the point of UNL?

    Another point is, how can you be an expert in an "intermediate language"? Either the language is "human-readable", but probably produces an output comparable to sludge, and correcting this sludge may introduce additional errors. Not to mention the pain it represents to check on something that borders on the unreadable. Or it is machine readable -- but in that case, who is going to read it?

    Final point is productivity: using UNL, computers and machine translation may take longer than a simple translation "by hand" with human grey matter. A Windoze95 machine with MS Word and some good "paper" or digital dictionaries is, in many cases, more efficient and cheaper than going through the pain of machine translation.

    The hope is to minimize or eliminate step (4).

    Good luck! Frankly, this has been the "Holy Grail" of machine translation ever since it started. And I do not think we are any closer. So far, every large international institution that I am aware of (UN, UNESCO, EU Commission, EU Parliament, NATO, IMF, etc) either uses tons of translators or has standardized on a couple of languages at most (English being, of course, the "Lingua Franca"). All the large international institutions mentioned above that use machine translation have discovered that, even on simple subjects, the 4th step you describe above is the one that consumes the most time.

    It would be a big win if you could get to the point where all the hard stuff is done just *once* instead of repeated over and over again for all of your target languages.

    Again, this is the "Holy Grail" of machine translation. I don't believe that we are any closer to it than we were, say, 30 years ago. At least not judging from the output of some of the software available out there...
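
    For what it's worth, the "done just once" argument is easy to put in numbers. With N languages, direct translation needs one translator per ordered pair, N*(N-1), while a pivot language needs one converter into it and one out of it per language, 2*N. The snippet below is just that arithmetic; the function names are made up for the illustration, and it says nothing about whether either kind of converter can actually be built well.

```python
def direct_translators(n_languages: int) -> int:
    """One translator per ordered (source, target) pair of distinct languages."""
    return n_languages * (n_languages - 1)


def pivot_converters(n_languages: int) -> int:
    """One converter into the pivot and one out of it for each language."""
    return 2 * n_languages


# The 16 languages announced for the initial stage of UNL:
print(direct_translators(16))  # 240
print(pivot_converters(16))    # 32
```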

    And no, this will not work for poetry or humor, but there's no good way to translate poetry and humor in any case. The idea would be to get it to work with technical, legal, and business language.

    Sorry to say this, but this does not work very well either for legal or technical language. It may work with Business, since PHBs are so limited intellectually =). Legal translation can be horrendous: I have translated many legal documents in the past and I can tell you there is nothing worse than that, because legal terms are incredibly complicated and old-fashioned and also since legal trivia has to be rendered in a very exact manner. Legal terminology (in almost every language) is among the most confusing and complicated there is. Plus, lawyers and legal people are a major pain in the neck when it comes to [...] Once you get the terminology right, I agree the rest of a legal document is usually a matter of "filling the blanks". But getting the legal terms right is enough to drive you nuts.

    Technical translation is another problem: I think some technical areas may be the best bet for machine translation yet. The problem, as far as the technical field is concerned, is that in fast-moving areas (computer science is one) the technical vocabulary is changing and evolving so fast it's hard to keep pace. I read up to 5 computer magazines a week (not to mention a daily dose of Slashdot =) just to keep up-to-date with the latest evolution in language and technology. Keeping a UNL database of terms and translations could prove to be a daunting task...

    >What's so special about UNL? Theoretical translation of language A into a universal language and from there to language B is almost as old as "machine" translation itself.

    The fundamental argument that it hasn't worked before so it isn't going to work now is stupid. It has been demonstrated how difficult it is to do this, but not that it is impossible.

    Please note that I never said (in the sentence you quote above) that this is not going to work. I just said that, as far as I am concerned, using an "intermediate" language is old news. This may be a new and interesting idea to you, but, frankly, for someone who has worked in translation, you could very well trace this concept all the way back to Volapük and Esperanto. And these two were invented in the 19th century.

    As far as I am concerned, I think you could prove that correct translation is impossible. All you would have to prove is that a "human" language is a chaotic complex system, which usually follows unpredictable rules and has several strange attractors, inducing a runaway complexity.

    Case in point: English. Roots: Saxon dialects, Norman dialects, Old English and Old French. Latin. A little bit of Greek. Maybe German and Old Dutch. Evolution influenced by French and a myriad of other languages. Now divided into several branches (US English, British English, Irish English, Australian English, Indian English, International English), all of them influencing each other and countless other languages. Reducing the English language to a set of neat little equations and computer routines is left as an exercise to the reader... =)

    Please understand me: computer translation of "basic" English into UNL and from there into Chinese, French, Spanish, Italian, Japanese, etc... is no big deal. Computer translation of highly technical/scientific papers may be achieved. But even then, due to the inherent complexity of English (or any other human language), a human will have to review the machine translation and correct it.

    I therefore suppose that perfect translation does not exist (or is impossible). Translation (like programming) is an art, not a science. You can have a certain number of "artistic" rules, but you cannot have a "perfect", scientifically proven, solution.

    Example: give a problem to be solved to two good programmers, and they'll probably come up with two different and equally valid solutions. Which solution you pick has to be determined by other factors (speed of implementation, maintenance and evolution of the system, optimization, resources used, etc).

    Give a translation to be done to two good translators and they will probably come up with two rather different and equally valid translations. Which one you pick is then determined by other factors (length of translation, speed of said translators, price of translation, style, etc). Complex systems, like languages, cannot be reduced or predicted. They can be analyzed and more or less "solved" -- the quality of the solution being dependent on many factors, such as the experience of the specialist, his choice of tools, etc. This is true even in reductive or limited systems, where, for instance, the vocabulary to be used is small (see technical translation above).

    Remember the butterfly in Brazil that creates a storm at the other end of the world? I suspect translation (especially multiple language translation) may well be the kind of complex system that is so hard to solve using computers.

    Perfect translation, like perfect programming, is only possible in a very limited scope. A "DO ... UNTIL" loop is the perfect solution for certain problems, and "dinero" is the perfect translation of "money" into Spanish. A TCP/IP stack, no matter which OS it is running on, will always have some sort of acknowledgement (ACK) handling. But these are all very limited examples.

    >For a good example of the total and dismal failure of machine translation,
    >try translating this text into French (or Spanish, or Italian, or whatever)
    >with Babelfish and back to English. Then do it a few times. Then try
    >English to Chinese and back a few times. Case closed.

    Hardly. Here's why that is not a valid test:

    1. Babelfish doesn't use an intermediate language.
    2. Babelfish doesn't even achieve lossless translation from language A to B and back to A. This is the simplest case and one which can be improved the most with a good definition for UNL.


    Answers:

    1. An intermediate language should introduce even more bugs into Babelfish translation. See above.
    2. "Lossless" translation is impossible. See above. Complex systems, such as human languages, cannot be reduced easily to a set of equations.
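
    The back-and-forth test being argued about here can be written down in a few lines. The sketch below uses two made-up word tables standing in for real translators (they are assumptions for illustration only and have nothing to do with how Babelfish works), just to show how a round trip loses a distinction as soon as two source words share one target word.

```python
from typing import Callable, Dict, List

# Hypothetical toy dictionaries: "big" and "large" both become "grand",
# so the distinction cannot be recovered on the way back.
EN_FR: Dict[str, str] = {"big": "grand", "large": "grand", "dog": "chien"}
FR_EN: Dict[str, str] = {"grand": "large", "chien": "dog"}


def word_by_word(table: Dict[str, str]) -> Callable[[str], str]:
    """Build a naive translator that maps each word through a table."""
    return lambda text: " ".join(table.get(w, w) for w in text.split())


def round_trip(text: str,
               forward: Callable[[str], str],
               backward: Callable[[str], str],
               rounds: int) -> List[str]:
    """Translate there and back `rounds` times, keeping every result."""
    history = [text]
    for _ in range(rounds):
        text = backward(forward(text))
        history.append(text)
    return history


history = round_trip("big dog", word_by_word(EN_FR), word_by_word(FR_EN), 2)
print(history)  # ['big dog', 'large dog', 'large dog']
```

    Note that the toy version settles after one round; real systems can keep drifting because each pass makes fresh choices.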

    >It is, in fact, an even better AI test than the Turing test.

    They do not claim perfect translation, but yes, a computer which could translate between languages and do it perfectly would pass the test. Do you really argue that it is impossible for computer programs to ever pass the Turing test? It is only a matter of time till this happens. The only way to stop it is to stop making computers.

    Actually, I thought a computer had recently managed to pass the Turing Test, or some limited version of it. Could anyone out there supply information on this one?

    But: I don't think the Turing test is actually a very good AI test. There is a huge difference between a program that is able to "talk" to you (parrot back what you said) and one which is able to understand you. A computer able to understand human language would probably be the first real AI on this planet. Most Turing test software is based on some variation of Eliza, and this has been around for ages.

    Here we are reading /. At the very heart of the cutting edge. (some text removed) I wouldn't expect your friends to be out of work any time soon. But isn't the job of a professional translator radically different now than it would have been 100 yrs ago? Political change was not the only thing that caused this change... communication technology has had a big role.

    Well, this may be surprising to you, but the work of a professional translator has not evolved very much. Computer and communication technologies have eased their task a lot. Like many other professions, translators are now able to work from home, access the Internet and its wealth of information, send documents to clients by e-mail, and even use some very clever software that eases the translation process (TM/2, Trados, etc).

    Word processing, in particular, certainly is the best thing to happen to translators since sliced bread =). Also, I agree that many new translation fields have been added in the past century: biology, computer science, aerospace, etc.

    But the central fact remains this: to be a translator you have to be fluent in (at least) one language, a native speaker of another, and have good expertise in one or more fields of human activity. That's it. Oh, and you have to have a certain "talent" with languages, just like you need to have a certain "talent" for programming. It's an art, remember? Even the best-trained translator is worth 0 if he/she does not have that special "talent". Exactly like a lot of people work on Linux -- but there is only one Linus Torvalds. =)

    We may translate faster, have more tools and information at our disposal, and produce better-looking documents -- but the core skills remain the same and the work process is exactly the same. You could train a translator today in the exact same way they were trained 100 years ago: with a pen and a piece of paper. Sorry to disappoint you, but Computer technology is not always the perfect solution it prides itself to be...

    That's All Folks!

    --
    The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
  124. Design by large, non-technology committees... by Anonymous Coward · · Score: 0

    I mean absolutely no offense here, but the end result is gonna suck. First of all, by the time the first draft comes along, there will probably be ten thousand other proposed standards, written by people with knowledge of the subject, that whip the hell out of the UN standard.

    Second of all, too many cooks spoil the soup. Take a look at Ada. It was designed to be an everything-in-one, do-everything, and do-it-well type of language. It's mediocre at some tasks, horrible with most, and good at nothing. It's shit.

    This is what generally happens with stuff like this. Don't get me wrong-- ANSI C is a good standard, but it was based on an already existing de facto standard. Same with SCSI.

    I get the feeling that the end result isn't going to work. It'll translate real well between all languages until you use a pesky word like "the" or "and".

  125. When Languages Express themselves the Same Way by David+Jensen · · Score: 1

    Maybe we could all map our written languages to written Chinese. This already works for a number of languages in eastern Asia. Europeans might have to adjust their grammar to meet the needs of written Chinese, but the lesson I take from Babelfish and other automatic translators is that grammar is so trivial that translators can just ignore it.

    1. Re:When Languages Express themselves the Same Way by radja · · Score: 1

      Or change Chinese to suit the European languages...

      --

      No one can understand the truth until he drinks of coffee's frothy goodness.
      --Sheikh Abd-Al-Kadir, 1587
  126. Swahili is already a universal language by Porag_Spliffing · · Score: 1

    Jambo !

    IIRC Swahili started as a universal (trading) language made up from various East African tribal languages and Arabic from traders coming to East Africa, with an Indian influence and probably many others too. The word Swahili is from the Arabic word for the coast, as the language sprang up first along the coast. It is now used as an intermediate by many peoples with different mother tongues far into Africa. In the short time I was in East Africa I found it incredibly easy to learn functional Swahili, and I am a language dummy. Interesting, then, how this original 'universal' language has now moved downstream and is one to translate to.

    I suppose many other modern languages have grown from a mix of others (e.g. English, from Low German, Frisian, French and Celtic, now including Indian (e.g. juggernaut), Polynesian (e.g. taboo, tattoo) and many other words).

    Perhaps in the end we will just end up with another new language that started by the need of people to trade.

    Perhaps they should base it on Swahili or even Esperanto, please no more new languages !

    --
    Maybe you live in interesting times
  127. I doubt it will be pretty by Space+Cow · · Score: 1
    My experience in learning a language different from my own (Japanese - I am a native English speaker) is that a lot of things can be translated nicely, but a lot of things require something called "human judgement based on context". I have used many handheld JE electronic dictionaries, and there were countless entries that gave translations that made very little sense. I am talking about a simple little dictionary here, not a grand translating scheme. The only source that I trust completely when I come across something new is a combination of sources:
    Looking in a Japanese-Japanese dictionary (Kokugojiten for all you Japanese literate folks out there), speaking to someone in Japanese about the word or phrase I don't get, speaking to another Japanese fluent English speaker in a combo of Japanese/English, and finally waiting a while for the meaning to sink in from use.

    Based on the list of culturally incompatible languages on the UN site (that is where a lot of the translation problems stem from), I would wager that this is going to produce at best a very, very demented version of Babelfish. Any linguistics folks wanna tell me why I am wrong?

  128. Re:Sounds like Esperanto - take 2 by Glytch · · Score: 1

    English, for those who did not learn it as kids, is a notoriously difficult language to learn. When you think of the fact that every single alleged "rule" in English has at least one (and usually several) exceptions, you begin to see the problem.

    At least, this is what many French-Acadian friends have told me. English is my mother tongue, so I wouldn't know personally.

  129. Re:Interlac? by angelo · · Score: 1

    The statement made by me above was sorta cut and pasted after hitting preview and it not coming out right. You are right that Klingon doesn't borrow, but any real artificial language does. Trek borrows a lot of names from other languages. "Odo", "Nerys" and "Jadzia" are all real words from other languages. I can't remember at the moment what Odo means, but Nerys=nose in Spanish, and Jadzia="old man" in Polish.

  130. Esperanto ?? by Anonymous Coward · · Score: 0

    I remember something about an aborted project of 'universal language', was it Esperanto ? Never heard about it since ...

  131. UNL Website produced by UNL? by Spacey845 · · Score: 1
    Does anyone else think that the linked website shows definite signs of having been created by the UNL computer program? If it is, then it's actually a darned fine piece of work.

    New slant on Turing: What about a prize for the machine that produces a translation that you can't decide is machine-translated or human-translated?

  132. Poor website by Tet · · Score: 3

    For a project that's supposed to allow effective communication, they could at least have designed a web site that works well in all browsers. No alt attributes for images... Sigh. Those of us using lynx just have to guess, based on the image names :-(

    --
    "The invisible and the non-existent look very much alike." -- Delos B. McKown
    1. Re:Poor website by Frodo · · Score: 1

      This site isn't just poor. It's an abomination, it's a shame. The suckers could at least make a white background, if they paint images on white. The site looks so pathetic that I won't even read it - people that can't afford a web developer with 1/10 of a clue obviously can't do anything worth seeing.

      --
      -- Si hoc legere scis nimium eruditionis habes.
  133. Read Schank & Riesbeck. It's nothing new. by crovira · · Score: 1

    The concept is called "deep contextual dependency."

    It works all right for extracting meaning and for translating documents without nuance or unidiomatically (See National Weather Service of Canada automatic weather advisory translation.)

    We can translate WORDS without difficulty. We can parse and deconstruct sometimes surprisingly complex sentences. The problems come when we try to deal with the fact that we don't speak in words.

    We USE words, sometimes the wrong ones, though malapropisms are less of a problem nowadays with the spread of mass media spreading too few languages like manure. There is pretty good consensual agreement on the meaning of any word.

    But, unless we're really anal-retentive or WASP, but I repeat myself, we usually back up or demonstrate our meaning with gestural cues.

    We SPEAK in sentence fragments and often, like cats purring, we aren't really communicating a damn thing, except "Hello I'm here. Don't kill me." Most of what passes for communication is just interpersonal noise.

    The problems of automatic translation arise because we don't speak or even think in words. We think in sentence fragments.

    To complicate things, the words we use in constructing those fragments are often not words that bear any relationship with the thought being expressed by the sentence fragment.

    "The spirit is willing but the flesh is weak" is a great example of a consensually determined, historically derived idiomatic fragment sequence used to express a single thought. It is directly untranslatable.

    Don't know what it means? Not sure of what it means? Then you don't come from a Judeo-Christian, Anglo-Saxon family that at least paid lip-service to the church and to Shakespeare. There IS a concept and a context expressed with the phrase but most of us would dance around trying to express it to a foreigner (or a computer.)

    To understand it, you need to be a part of the consensus that was the socio-cultural cauldron that cooked up the expression in the first place. (Did you spot the idiomatic expression in that last sentence?)

    Not to disagree with Suzanne Vega, but Language IS liquid. It may not be able to rush in but it is constantly deforming to fill the ill-defined vessels that speak it.

    Machine translation will require that our machines not only become more like us but join us in constructing language.

    --
    MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
  134. Interesting, but maybe off the mark by Shin+Dig · · Score: 2

    I will admit to not having read all of the UN documentation, but from what I have read, they are attempting to create an abstraction of language in general.

    Although this is an interesting idea, it makes an assumption that all language is based off of one abstract "map". IMHO different languages have different maps. Having spent a fair amount of time learning Ancient Greek in high school and college, I can say that the map for that language is quite different from English, and those are both Indo-European languages.

    The concepts that exist in one language may not exist in many other languages, which is often very problematic. Eventually, to learn any language, you must actually just start thinking in it, rather than translating to your native language. Contemplating the 3 voices in Greek (active, passive, and middle) is something I rather enjoy doing, as it is very foreign to English.

    I am just afraid that they will have to produce a Least Common Denominator language which won't be useful for anything beyond technical specifications and instructions. I will have to admit that that would be useful on many fronts, but may not be the dream that we were all hoping for.

    --
    There is no silver bullet. Plus, werewolves make better neighbors than zombies or vampires anyway.
    1. Re:Interesting, but maybe off the mark by YellowBook · · Score: 2
      Although this is an interesting idea, it makes an assumption that all language is based off of one abstract "map". IMHO different languages have different maps.

      Actually, the dominant paradigm in formal linguistics, generative grammar, implies that all languages are generated from one abstract "map", the so-called Universal Grammar. Now, actual grammars vary a lot, but the idea is that they can be generated from the Universal Grammar by tweaking various parameters. The main evidence for this is the specialized language-learning ability of human children, and particular evidence about how that ability works and doesn't.

      Now, as to whether this will make universal automated translation via a metalanguage possible, that depends a lot on the metalanguage. I envision the metalanguage looking a lot like "glosses" in syntax papers, rather than an actual language, so that you preserve all of the language features of the original in the metalanguage. The more languages the metalanguage is supposed to accommodate, the larger it will be.

      Even if the metalanguage is perfect for all the supported languages, there will be problems with idioms, probably with slang, and certainly with cultural concepts. But in general, how important those failings are will probably vary depending on the conversation. On the whole, I think that both the most enthusiastic and most critical posts I've seen in the comments to this article are underinformed.


      --
      The scalloped tatters of the King in Yellow must cover
      Yhtill forever. (R. W. Chambers, the King in Yellow)
  135. SNTP, what's that? You must mean NNTP. by tap · · Score: 1

    This would make the grammar and spelling flame obsolete, since the grammar would be a product of the translator. Without the most basic of flames, how could a flame war ever compare to those we have now?

    Of course, errors in translation would let people who completely agree argue endlessly without knowing it!

    1. Re: SNTP, what's that? You must mean NNTP. by handorf · · Score: 1

      Of course, I was simply testing the veracity of the entire concept of translating languages. You KNEW what I meant and were able to mentally correct my error, but if the words were FNORD and GLEEBLOP, you may not have made the connection and an automatic translator certainly wouldn't.

      Esp since SNTP is a time protocol and NNTP is the news protocol.

      You are correct, of course. :-)

      --
      -- IANAEG - I am not an elder god.
  136. What's between the lines by 1600 · · Score: 1

    It certainly seems that this would be possible at an elementary level. It shouldn't be a problem developing a language that would enable users to communicate basic messages to each other. However, communicating the subtleties inherent in each language would prove to be difficult. For instance, certain concepts for which there are terms in Chinese or Japanese are almost impossible to represent in the English language. Tackling these abstract and subtle differences would prove to be the biggest challenge for any 'Universal Language'.

  137. Re:Intermediate language choice by Bazzargh · · Score: 1

    Latin's not as dead as you think. It's still used for papal encyclicals and the like, and as a consequence they maintain a Latin dictionary which every so often is updated with neologisms.

    This made the news about 6 months back - they included new words for things like 'lapdancer' in the latest edition.

  138. Um..is Enconverter even a word? by sprag · · Score: 1
    Maybe I'm being a little too harsh, but if this is supposed to be a language to language converter, why are they using words that don't make sense, like enconverter and deconverter? Conversion is conversion is conversion...

    Of course not to mention the typos and poor spelling in the site.

    1. Re:Um..is Enconverter even a word? by loki7 · · Score: 1

      I was a bit concerned about the grammar/spelling on that page, too. I just hope that it's actually an example of an alpha version of the translator and not a flesh and blood author.

      /peter

    2. Re:Um..is Enconverter even a word? by shub · · Score: 1

      As a person who speaks only English, but resides in a country where the official languages are French, Flemish (Dutch by any other name), and German, I can tell you from personal experience that it is *much* easier to understand someone speaking another language than it is to try to make yourself understood in another language.

      In my experience, this is universal.


      Therefore, it makes a lot of sense having two separate converters (one that attempts the richest possible understanding of the language to be converted from, so that the maximum amount of semantics can be preserved in the conversion to the Universal language), and a simpler one that just converts from the Universal language into the "canonical" form of the local language.


      In the computer world, we understand that "write" operations are typically much more expensive than "read" operations, and depending on our application mix, we may optimize one at the expense of the other.

      It makes just as much sense to do this with regard to processing of natural languages as it does with other computer programs.
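
      To make the asymmetry concrete, here is a rough sketch of the two-converter idea. Everything in it is invented for illustration (the `GREETING` concept label, the tables, the function names), and the real UNL enconverters and deconverters presumably do far more than a dictionary lookup. The only point is that the "in" side has to recognise many surface forms, while the "out" side only needs one canonical form per language.

```python
from typing import Dict

# Hypothetical enconverter table: many English surface forms, one concept.
ENCONVERT_EN: Dict[str, str] = {
    "hello": "GREETING",
    "hi": "GREETING",
    "howdy": "GREETING",
    "good day": "GREETING",
}

# Hypothetical deconverter tables: one canonical output per concept.
DECONVERT_EN: Dict[str, str] = {"GREETING": "hello"}
DECONVERT_FR: Dict[str, str] = {"GREETING": "bonjour"}


def enconvert(text: str, table: Dict[str, str]) -> str:
    """Map any recognised surface form onto its universal-language concept."""
    return table[text.strip().lower()]


def deconvert(concept: str, table: Dict[str, str]) -> str:
    """Emit the single canonical form of a concept in the target language."""
    return table[concept]


print(deconvert(enconvert("Howdy", ENCONVERT_EN), DECONVERT_FR))  # bonjour
```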

      --
      Brad Knowles
      http://daily.daemonnews.org/ -- if you're not
    3. Re:Um..is Enconverter even a word? by EisPick · · Score: 1

      Sounds like "franglais" to me. Makes you wonder how well the translation will work when the people writing the translators can't even write in one language (English) without leaving traces of their native language (presumably French).

  139. Other Languages by Jon+Duffee · · Score: 1

    Makes you wonder if they're going to include Klingon?? Or perhaps Sanskrit?? They can have a "choose the next translated language" contest! That would make for an interesting program.

  140. Sanskrit as the meta language? by dodobh · · Score: 1

    Sanskrit would be the ideal meta language.
    Everything has to be clearly specified. No irregular verbs, tenses, etc.
    The grammar (or syntax, take your pick) has been very well defined.
    Simple! No scope for error *and* no scope for any proprietary extensions.
    My $0.02

    --
    I can throw myself at the ground, and miss.
    1. Re:Sanskrit as the meta language? by Anonymous Coward · · Score: 0

      You should, at the very least, give a friendly nod to Neal Stephenson...

  141. Re:so crazy it just might ... not work by Harri · · Score: 1
    As far as I can see, it should be fairly easy for someone with a knowledge of the grammar of their language to translate into the intermediate language. Perhaps businesses could pay people to do this for them. Then they only need to employ one expert instead of one per foreign language.

    On the other end it would be simple to computer-translate the result into the next language.

    Maybe we can't fully automate the process yet, but we can go a fair way.

  142. Who is gonna patent this first? by dieman · · Score: 2

    I think this is a very neat idea. My worry is, who will patent the technology first and screw the world.

    Amazon does it with ecommerce 1-click. Microsoft does it with style sheets. Hell, if it's a good, interesting technology, why not? Let's take it and patent it to death! Then we can charge everyone for it and make a zillion-and-one dollars. Perhaps I should send in my application today!

    There needs to be limits on patents. Yes, I believe they do foster invention, but they also can stop community work on a really-good-thing.

    Perhaps we need a community patent agency and an easy, low-cost way to set up patents that are held by some sort of group for the explicit purpose of keeping some basic ideas *free* for us geeks and the rest of the world.

    Really, it shouldn't have to come down to this tho. But someone will patent the implementation of this and we will all be screwed.


    My $0.02

    --
    -- dieman - Scott Dier
  143. Allow unicode in email addrs and domain names! by Anonymous Coward · · Score: 0

    All unicode characters should be allowed in domain names, email addresses, URLs, etc. The net attitude that 26 Roman letters can somehow serve all Earth residents is PATHETIC. And what better way to open up the namespace too?

  144. It won't work. by jd · · Score: 2
    At least, not in general. Regional expressions, local terminology, written accents, cultural mannerisms, and all sorts of other fiddly details, might not HAVE a direct translation, into the meta-language OR into any other language.

    Yorkshire, UK, for example, still uses "thee" and "thou". If you translate this into some kind of meta-language, it's either going to barf, or lose details. Those details may be important to meaning. God only knows how it'd cope with Cockney slang, or even common phrases (eg: "from the horse's mouth", "a sticky wicket")

    As I see it, this can ONLY work for formalised documents, using a formalised subset of the various languages. eg: Legal papers, UN treaties, etc. It'll NEVER work with informal, written language.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    1. Re:It won't work. by blowdart · · Score: 1

      Not just English regionalisations to consider, German differs between Germany and Austria, Spanish between Spain and America.

    2. Re:It won't work. by Anonymous Coward · · Score: 0

      I think there's a fundamental problem here, something that was pointed up in the later Wittgenstein. Languages are sort of like games, and are bound by rules. Rules allow a structure, and thus intelligibility. However, like games, rules don't govern the entire process - how high can you hit the ball in tennis, for instance? To develop a working system for this kind of translation, you will probably need a formalized metalanguage for processing. You will also need sophisticated heuristic structures for parsing the source and target languages. You need to develop a system capable of playing the language game, in places where the rules don't explain how to act.

    3. Re:It won't work. by Chalst · · Score: 1

      Each dialect has its own distinct syntax: naturally the project will pick just one for each language.

      Linguists are pretty good at providing models of informal language use. People are quite sloppy in observing formal linguistic rules, but even these deviations tend to follow their own rules.

    4. Re:It won't work. by PigleT · · Score: 1

      German differs between German and German! :8)
      (Yup, I've seen it happen where folks from Munich and near Hannover have slowed down so much that *I* could understand most of what they were saying, where certainly the more northern stuff at full speed left me well behind). It was quite fun :)

      As an aside: I've just acquired the soundtrack from 'The Matrix', which has Rammstein's 'Du Hast', including the (Bavarian) abbreviation of 'habe' into 'hab''. Babelfish can't cope with this at all, and even confuses 'nichts' into 'anything'...

      As far as regionalisations go, I don't know that many Yorkshire natives who use 'thee' and 'thou', at least of the current generation. (And that's where I come from ;)
      I'll also go one further, and suggest that regionalisations *must* remain as intact as possible. (Although I won't be asking version one to translate 'there'll be trubble up at' t'mill' ;)

      Am I the only one so far to get bad vibes of Tower of Babel meets 1984? ;)

      --
      ~Tim
      --
      .|` Clouds cross the black moonlight,
      Rushing on down to the circle of the turn
    5. Re:It won't work. by shawnhargreaves · · Score: 1

      > Yorkshire, UK, for example, still uses
      > "thee" and "thou".

      In bad TV dramas, it is indeed true that the rural Yorkshire farmer might say things like this, but they are no longer a part of even a very broad Yorkshire dialect (based on my experience of living in York for a couple of years, and having relatives from other parts of the county).

      > God only knows how it'd cope with Cockney
      > slang, or even common phrases (eg: "from
      > the horse's mouth", "a sticky wicket")

      But surely this is just a question of how much language is programmed into the converter? Of course a program that only groks standard dictionary words won't know about this kind of dialect, but that just means that it has to be programmed with basic English plus all commonly used slang, if you want it to handle such things in translation. The English that we speak on a daily basis is much larger than the strict language core, but this is only a superset that can be handled in the same ways, not any fundamental shift in concept or grammar, so all you need is a big enough database...

      > As I see it, this can ONLY work for formalised
      > documents, using a formalised subset of the
      > various languages.

      It is certainly easier to automate things if you can restrict the input in some way, but I think this system would be more use for casual conversation. In formal documents you really do need everything to be 100% correct, so you'd still have to pay human translators to check the results unless the software came with some really solid guarantees (which seems unlikely). In a more informal situation, though, it doesn't matter so much if a few things get mangled, as long as the basic sense comes through. Hell, even babelfish is useful at times, and there is plenty of room to do a much better job than this that would make a really valuable, even if still flawed, translation tool...

    6. Re:It won't work. by bungalow · · Score: 1

      As I see it, this can ONLY work for formalised documents,...


      You're probably right, at least about ver 1.0, but the translation of subtle cultural / idiosyncratic linguistic tendencies is not what the project seems to be aiming for.

      This will not help you translate girlspeak.

      The use of this tool will probably be most common in more technically and linguistically savvy circles (again, speaking of ver 1.0). Like any other tool, flaws will be found and exposed, and accommodations will be made in the tool itself, the tool's UI, and undoubtedly in the user's understanding and implementation.



    7. Re:It won't work. by jd · · Score: 2

      I shall continue to believe in Compo, no matter what thee says, lad!

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  145. Which Chinese language? What about Concepts? by Saraphale · · Score: 1

    Which Chinese language will they be using? The website doesn't say. It'll most likely be Mandarin or Cantonese, I think.

    The website also doesn't address conceptual issues with translation - a language isn't just about syntax, it's about semantics. As a simple example, a German asking "Wie spaet ist es?" would have their sentence translated as "How late is it?" - it'll take a few seconds for an English speaker to work out they're asking the time. I'm sure there are a vast number of such colloquialisms in each language, and finding a common way of representing all the concepts people wish to communicate is a very hard task. Would it not be simpler for the people themselves to learn a language, so that everybody knows what colloquialisms and concepts the listener will be using?

    I know, Esperanto. Or Loglan. *sigh* These projects haven't worked before, so past experience predicts a poor performance of this one. It's good to see they're making the effort, though.

    S.
  146. Re:UNL? Yeah, right! by ToastyKen · · Score: 1

    Actually, it seems the problem with machine translation isn't so much that human brains are more intelligent, but that humans have more environmental context from which to get the subtleties of our evolving languages.

  147. Reduce the scope of the problem. by grimdenizen · · Score: 1
    What about the potential for defining a subset of each language which *can* be reliably translated? I think this is similar in spirit to the trade languages which popped up in places where differently-speaking peoples came together.

    Universal Network Pidgin?

    The in-converter would make an effort to identify idioms which are ambiguous or don't translate well and either have the author remove them or encode them in such a way that the core meaning isn't lost when they cannot be present.

    The out-converter would convert what it could, then if more information remained present it in another form like footnotes.

    Still not an easy problem by any means. But I think if we could find a way to write which could be reliably translated by machines it would be worth the investment.
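
    A rough sketch of the in-converter idea above, in Python. The idiom table, the glosses, and the names `flag_idioms` and `to_pidgin` are all made up for illustration; none of it comes from the actual UNL software. It swaps known idioms for plain wording and keeps the originals so an out-converter could show them as footnotes.

```python
# Hypothetical idiom table: phrase -> plain-language gloss.
# A real system would need a far larger, per-language database.
IDIOMS = {
    "from the horse's mouth": "directly from the original source",
    "a sticky wicket": "a difficult situation",
    "fat chance": "it has a low probability",
}


def flag_idioms(text):
    """Return (phrase, gloss) pairs for every known idiom found in text."""
    lowered = text.lower()
    return [(phrase, gloss) for phrase, gloss in IDIOMS.items() if phrase in lowered]


def to_pidgin(text):
    """Replace known idioms with their glosses; return (new_text, footnotes)."""
    footnotes = []
    result = text
    for phrase, gloss in flag_idioms(text):
        # Case-insensitive search; assumes lower() keeps string length (true for ASCII).
        start = result.lower().find(phrase)
        while start != -1:
            result = result[:start] + gloss + result[start + len(phrase):]
            start = result.lower().find(phrase, start + len(gloss))
        footnotes.append(phrase)  # keep the original wording for a footnote
    return result, footnotes


print(to_pidgin("I heard it from the horse's mouth."))
```

    The matching here is plain substring search, which is exactly the part that would not scale to real text; it is only meant to show where the "flag it or re-encode it" step would sit.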

  148. I wonder how they plan to do it by DanaL · · Score: 1

    From what I understand, computers still aren't terribly good at translating tasks, and it seems as though this could be even worse. Wouldn't you lose even more context/subtle-language-aspect by translating from, say, English to UNL and then UNL to, say, Japanese? UNL will probably be rigidly defined (context-free grammar?), so you will have to twist and bend English to get it into UNL.

    It would be neat to see some amazing developments in natural language processing, but I won't hold my breath.

    BTW - if you ask Babelfish to translate 'I won't hold my breath' to German and back, you get:

    I do not hold mean breath on

    :)

    Dana

  149. Translate what I mean, not what I say... by ripler · · Score: 1

    This could possibly be a great tool for communications, but what about cultural differences between nations? Are there going to be PC (Politically Correct) filter plugins to make us aware of when we are about to say something that would be offensive to another culture?

    Before you post your web page, you hit the World PC button on your UNL Translator Plugin.

    *WARNING* you are about to offend people in the following countries...

    China, Saudi Arabia, etc..

    Without it, you could offend billions of people, and not even know it. What an opportunity!

    It also raises the question of "What am I saying to people in Zimbabwe?" You have to have a great deal of trust in the algorithms to put anything with delicate subject matter through the system. This becomes even more important if there was some kind of filter like I described above.

    I wonder if this UNL is something that is human readable on its own. I think it would be much safer if we created a UNL that was human readable/speakable. Then you could go to any part of the world, and stand a chance of communicating without having to use your UNL enabled cell phone or palm device.

  150. Re:sounds difficult - not as you say by MobyDisk · · Score: 2

    It will be difficult, but I don't think for those reasons:

    Verb tenses are not the problem. Every language can express every tense, just in a different way. Hard yes, impossible no.

    Additionally, approximations work well enough. Ex. Most English readers couldn't tell you the difference between past tense and preterite(sp?) tense.

    Grammar is easily defined. 90% of language could be described in a BNF. adv-adj-noun in one, noun-adj-adv in another. So what. That is probably the simplest part.


    My interest would be in the meta-language design. Words by number? string? Grammar by parsing into a std format, or classifying each word? Are there multiple ways to organize a statement? What about this "word hierarchy" they talk about. Quite cool there.
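
    To make the word-order point concrete, here is a tiny Python sketch. The language names `lang_a` and `lang_b` and the `ORDER` table are placeholders I invented, not real language profiles; the sketch only covers reordering, which is the part claimed above to be simple, and says nothing about vocabulary or meaning.

```python
# Hypothetical per-language ordering rules for a simple noun phrase.
ORDER = {
    "lang_a": ("adv", "adj", "noun"),
    "lang_b": ("noun", "adj", "adv"),
}


def linearize(phrase, lang):
    """Lay out the parts of a noun phrase in the order the language expects."""
    return [phrase[role] for role in ORDER[lang] if role in phrase]


phrase = {"adv": "very", "adj": "red", "noun": "car"}
print(linearize(phrase, "lang_a"))  # ['very', 'red', 'car']
print(linearize(phrase, "lang_b"))  # ['car', 'red', 'very']
```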

  151. Their logo sucks by Nicolas+MONNET · · Score: 1

    It's horrible. What I don't understand, or rather what people don't seem to understand about graphic design is that ... if you don't know how to do it right, don't do it!
    I'd rather see a plain TEXT HTML2.0 page on a gray background than this kind of ugly logo. At least it would be lynx-friendly.

  152. This is a very good idea by Cuthalion · · Score: 2

    This allows the semantic extraction to be MUCH more computationally intensive than systems like babelfish can afford. When you make a document, it's okay to spend an extra 15 seconds to extract a pretty good representation of the gist of it, so long as it doesn't need to happen every time the page is viewed. (babelfish doesn't even cache translations, does it?)

    Okay, so some of the idioms and convoluted sentences will be improperly converted, and will need some manual tweaking. Hopefully this system will allow this tweaking to take place. By providing multiple different conversions back into the author's native tongue, they may be able to see some of the translational oversights, and fix them.

    This won't be good for poetry, but will allow people who only know one language (English speakers seem more likely to fit this category than other people) to publish documents readable by people who do not speak English - that's a substantial breakthrough.

    It would be nice if this standard would allow segments to be set to fixed translations, so that if I really wanted the English to read a particular way, I could enforce that particular idiom, without loss of generality. ("Normally translate 'it has a low probability' but if you ARE translating to english, substitute the literal string 'fat chance'")
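
    Something like the following Python sketch is what I have in mind for fixed translations. The `Segment` class, the `fixed` field, and `toy_deconvert` are hypothetical names, and the stand-in deconverter just tags the gloss with a language code rather than translating anything.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One unit of meaning plus optional hand-written renderings per language."""
    meaning: str                               # language-neutral gloss (made-up format)
    fixed: dict = field(default_factory=dict)  # language code -> literal string


def render(segment, lang, deconvert):
    """Use the author's fixed string if one exists, else fall back to the deconverter."""
    if lang in segment.fixed:
        return segment.fixed[lang]
    return deconvert(segment.meaning, lang)


def toy_deconvert(meaning, lang):
    # Stand-in for a real deconverter: it only labels the gloss with the language.
    return f"[{lang}] {meaning}"


seg = Segment("it has a low probability", fixed={"en": "fat chance"})
print(render(seg, "en", toy_deconvert))  # fat chance
print(render(seg, "fr", toy_deconvert))  # [fr] it has a low probability
```

    The override only applies to the one language the author pinned, so every other target still gets the general meaning.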

    --
    Trees can't go dancing
    So do them a big favor
    Pretend dancing stinks!
  153. A Templated Web by Wolfier · · Score: 1

    The UNL is simpler than the natural languages.

    During the translation from the natural languages to UNL, some unimportant details must be dropped.
    During the translation from the UNL back to natural languages, the meaning of the sentences remains, however the atmosphere or the mood might be lost - it is like a lossy compression...

  154. It sounds quite interesting. by cemerson · · Score: 1

    It's obvious they've put at least some thought into it. Translating to some universal middle-language is clearly the only sensible way to handle translating between many different languages - you only need N two-way translators instead of N^2.
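
    The counting argument can be checked with a few lines of Python. This is only arithmetic, under the assumption that each direction of translation needs its own converter; the function names are mine.

```python
def direct_translators(n):
    """One-way translators needed to cover every ordered pair of n languages."""
    return n * (n - 1)


def hub_converters(n):
    """One-way converters needed with an intermediate language: one in and one out per language."""
    return 2 * n


for n in (2, 3, 16):
    print(n, direct_translators(n), hub_converters(n))
```

    For the 16 announced languages that is 240 direct translators against 32 converters. With only two or three languages the hub does not pay off (2 vs 4, and 6 vs 6), so the saving only appears as the list of languages grows.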

    They describe a UNL editor which translates your text into UNL and back again, so that you can check that the translation will be ok as you write it. This sounds like an excellent way to check the results and minimise the errors/inaccuracy.

    Actually coming up with a representation which copes with the meaning in all of the other languages is surely a massively difficult task. *If* they manage to solve that, then I'd think that computer (written) natural language recognition is all but solved. (I don't know what the current state-of-the-art is like)

    The one thing which annoys me is their insistence on using the terms "enconverter" and "deconverter". I mean the latter sounds almost ok, but "enconverter"? From a group who need to be expert in languages? Yuck!

  155. it will fail by jilles · · Score: 3

    Though at first sight the idea of translating to an intermediate language seems interesting, I can't help but note that similar projects in europe have all failed so far.

    Automatic translation between languages in the EU is something that could save a lot of money. So there have been a lot of research projects funded with loads of EU money to accomplish this. All of these projects have failed (as far as I know).

    This seems to be a similar effort, this time by the UN, which is an equally bureaucratic organization. I think the goal of this project is probably too ambitious to work. Even translations between two related languages (english and german) are troublesome (babelfish for example is not exactly perfect), so I can't see why translations to an intermediate language would change things (ever tried to do that in babelfish? the result is not pretty).

    So, it will probably fail and loads of money will be wasted on it.

    --

    Jilles
    1. Re:it will fail by kevin805 · · Score: 1

      I have to agree. Reading the website, it's all very nice how they can say what it's going to do, but I can write up a spec for a magic chip that breaks all encryption systems in under 25 seconds, but that doesn't mean it's going to come to pass.

      I sort of like the idea of a super-language that can easily be translated into natural languages, but I don't see how machine translation into this super-language would be possible when machine translation into a simpler (natural) language isn't. The problem with machine translation isn't converting the internal representation back into a natural language, it's figuring out what the sentence means.

      For example, find a rule that can be used to translate "at" in the following sentences:

      Bob stands at the window.
      Bob stands at the watercooler.

      In the first, it's implied that Bob is actually facing the window, but in the second, Bob probably isn't facing it. You need to actually know the real world meaning of the sentences to see the difference -- there isn't any syntactic difference at all. How is the UNL "enconverter" going to figure out the difference?

    2. Re:it will fail by Nathaniel · · Score: 1
      For example, find a rule that can be used to translate "at" in the following sentences:
      Bob stands at the window.
      Bob stands at the watercooler.

      I'm not sure that it's important to find such a rule. It seems to me that the high order bits are the idea of Bob standing near the window|watercooler. If you intend to convey that he then looks through the window, or that he hangs out around the watercooler talking to someone, that's a separate sentence, and his facing wasn't all that important.

      It is unreasonable to expect any sort of translation system to work perfectly without some attention from the people using it. Knowing that you are trying to convey information, you can make your statements more clear, and use modes of speech which are more likely to convey the information you want to send.

  156. Spelling errors by scumdamn · · Score: 2
    Imagine the translation errors when people spell words like "you're" "lose" "its" and "too" wrong.
    A holy war could be started because of the sentence:
    Your wife is going to lose, but you will win.

    If it's spelled incorrectly.
    Use your imagination.
  157. More Comments from last time by wangi · · Score: 1
    You might want to check the comments from the last time this story was run:
    http://slashdot.org/articles/98/11/24/101217.shtml
  158. Interesting, albeit old news by luminiferous · · Score: 1

    A very similar story was posted on slashdot last year ---> Here

  159. Nice idea, but... by neophase · · Score: 1
    ... it's unlikely that this will amount to anything significant, for several reasons:

    • First, this sounds suspiciously like Esperanto, the supposedly "neutral and universal" language that everyone was supposed to learn. Real languages evolve; an artificial language is static and has to be forcibly "upgraded" to include new concepts.
    • As anyone who has read the output of automated "translators" can attest, the results are far from stellar, unless the text is VERY simple structurally and semantically. Colloquialisms and figures of speech are particularly difficult.
    • Any translation introduces a certain amount of error. Languages express things differently, and secondary meanings are often lost in the translations. This system is adding an extra layer of potential errors to the whole mess.
    • The text would have to be completely free from grammatical and spelling errors for an automated program to stand a chance. This is more of a problem than it first sounds, since the grammar and spelling checkers that I've seen appear to have trouble differentiating between similar words with very different meanings. I don't want to start a spelling / grammar flame war, but much of the stuff on the web is abysmal.

    While this system would reduce the number of translators significantly, with the UN's record of fast action (NOT!) and bureaucracy I think this is headed down to the Great Bit Bucket. I'd be much more interested in what some of the major research centres in computational linguistics and language recognition are up to. (Links, anyone?)

    --
    ==================================
    neophase
    1. Re:Nice idea, but... by PHroD · · Score: 0

      would it be : 1010 1010 1010

      or : 1010011010

      ? :P


      "There is no spoon" - Neo, The Matrix

  160. How can they do this if by PHroD · · Score: 0

    they can't even write web pages correctly (the english pages are using the Shift_JIS - Japanese - char encoding...fscks up 1/2 the text on the page)

    I think the UN should just push esperanto as a 2nd language (as it's intended) so ppl can communicate that way (i mean come on, it's NOT hard to learn)


    "There is no spoon" - Neo, The Matrix

  161. Perkele! by euroderf · · Score: 1
    If it has non- Indo-European languages, then why not FINNISH ?

    Ilmoita se Ahtisaarelle heti !!

  162. All Written Chinese by David+Jensen · · Score: 1

    The various Chinese languages are all mapped (with a reasonably good degree of confidence) to the same written language. Translating written languages is easier than translating spoken languages.

  163. Use w3m! by Anonymous Coward · · Score: 0

    w3m has pretty much obsoleted Lynx. It's a text mode browser that supports tables, frames, and generally renders much better than Lynx does. Akinori Ito is the man.

  164. Yay! I'm not the only person bugged by that! by Anonymous Coward · · Score: 0
    At the very least they could have taken a clue from the engineering worlds and used words like "encode" and "decode" rather than make up their own freaky phrasings.

    After all, encode and decode have been used for years and years and years, so all of the translations of the UNL web pages would be correct.

    Frankly, the web site reads like it was originally written in another language, and then translated to English. Which, to be blunt, makes me doubt that they have the skill to actually implement UNL. I mean, if they can't do a decent job translating a lousy "brochureware" web site, how can they do a creditable job with UNL?

    Having said that, I'd like UNL to work. It might simplify my travels somewhat.

  165. Re:Sounds like Esperanto - take 2 by Anonymous Coward · · Score: 0

    It doesn't help with the yanks misspelling words, e.g. color, sulfur ( and driving on the wrong side of the road for that matter :-) )

  166. Re:Language support - Esperanto? by Chalst · · Score: 1

    Esperanto wouldn't work: the point about Esperanto is that it is a pared down language without the anomalies that most natural languages have.

    Unfortunately one *needs* these anomalies in order to translate back into the natural languages: one needs to know all about the 3 genders and 4 cases to translate automatically into German, for example.

  167. UNL? Yeah, right! by Noryungi · · Score: 3
    All right, all right, all right...

    Several points -- for full disclosure, let me just state that I am a localization engineer, with 5+ years of experience in software localization (read: adaptation into different languages) and 7+ years of experience in translation. If that does not make me qualified to comment on this, I don't know what does.

    • First of all, I do not really believe the UN can produce anything remotely interesting, technically speaking. I like the IETF motto: "we believe in rough consensus, and working code". Show me the money^H^H^H^H^Hcode first, please.
    • What's so special about UNL? Theoretical translation of language A into a universal language and from there to language B is almost as old as "machine" translation itself. As far as I remember, early EU research into machine translation was based on a similar idea -- and it was dismissed as a failure.
    • For a good example of the total and dismal failure of machine translation, try translating this text into French (or Spanish, or Italian, or whatever) with Babelfish and back to English. Then do it a few times. Then try English to Chinese and back a few times. Case closed.
    • People, Star Trek is nothing but TV! Don't misunderstand me: I love spending an evening with Capt. Kirk and Mr Spock, but this is not reality! The Universal Translator is, in my opinion, a perfect (read: impossible) dream. It is, in fact, an even better AI test than the Turing test. The day a computer can perfectly translate a text from language A to language B is (a) the day I'll be out of a job and (b) the day I'll begin to seriously worry about that glowing red camera and calm voice saying: "Would you like a nice game of chess, Dave?".
    • Frankly, would you trust something as big, bureaucratic and inefficient as the UN to determine the next standard in machine translation?
    • Finally, I have some friends who work at the UN as official translators, and they are doing perfectly fine, thank you very much (and, I should add, making some serious money). Why? Because, AFAIK, no machine has ever been able to translate perfectly the multiple meanings, subtle changes in context, double-entendre, puns, cultural and historical framework, regionalisms, etc. that exist in every language on this Earth. Call it the "Curse of Babel" if you will, but a human brain is, and will remain for a long time, the best translation machine there is. Machine translation has its place, but only on documents of a very limited scope/vocabulary and of a very repetitive and technical nature. Even then, a human translator is needed to correct the multiple mistakes made by the machine.


    Of course, I may be completely wrong and UNL may be the next best thing since sliced bread. But I doubt it.

    --
    The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
    1. Re:UNL? Yeah, right! by doom · · Score: 1
      I *was* a localization engineer for 5+ years (thankfully I'm out of that now), and I'm much more positive on this idea than you are... (though I suspect you're right that the UN is the wrong agency to bring off something like this).

      Don't think about it as "automatic" translation, it's much more likely to work out as semi-automatic. I expect that the process would be something like this:

      1. Run automatic converter from natural language to intermediate.
      2. Have an expert in the intermediate language review the translation.
      3. Run automatic converters to the target natural languages.
      4. Have linguists review the output.

      The point is that the intermediate language should be designed to be free of the ambiguities that plague language translation. The hope is to minimize or eliminate step (4). A typical localization job is to take software written in English and translate it for a few dozen other countries. It would be a big win if you could get to the point where all the hard stuff is done just *once* instead of repeated over and over again for all of your target languages.

      And no, this will not work for poetry or humor, but there's no good way to translate poetry and humor in any case. The idea would be to get it to work with technical, legal, and business language.

      By the way, when I was thinking about doing something like this, I figured I would try and use Loglan:
      Loglan welcome page

  168. Re:sounds difficult - not as you say by Anonymous Coward · · Score: 0

    No, every language cannot express every tense. Read Cassirer (his philosophy of symbolic something or other); he has lots of source information about how things are expressed/experienced in different languages.

    In particular, there's an indian tribe in south america he cites which has a single verb tense, and expressions of the past and present are identical, and are apparently understood in a temporally undistinguished way by the tribe.

  169. WTF? by Anonymous Coward · · Score: 0

    MONGOL? surely there are many many languages with more users than Mongol? Does Genghis Khan use the internet? Your tax dollars at work.

  170. Spoken meta-language would help, markup content? by jncook · · Score: 1

    A few thoughts on "automatic translation" of web pages.

    1. Web pages could store more than one translation of their text. You could store the default version in your native language and a translated version for someone else. Envision BODY LANGUAGE=ENGLISH-AMERICAN or LANGUAGE=ESPANOL. This would be even easier if you were generating pages out of a back-end database.

    2. You could embed a "meta-language" version of each of your web pages. This would allow you to tune the meta-language page for better translation. BODY LANGUAGE=UN-META-LANGUAGE

    3. This process would be facilitated if the meta-language was something that people actually spoke, and was easy to learn (eg, Esperanto, or some other designed language).

    4. Where the meta-language was not sufficiently specific (implied meaning, context), you could add markup tags around words to indicate meaning. This could extend Esperanto to have useful features for computer translation.

    5. You could even mark up English text to indicate meaning.

    6. Failing all the above, if web pages consistently had "summary" or "abstract" sections you could at least focus your translation efforts on that chunk of the page.
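
    A minimal Python sketch of points 1 and 2, assuming a page keeps its versions in a simple mapping. The keys reuse the LANGUAGE values from the list above; `pick_version`, the sample texts, and the `greet()` placeholder for the meta-language body are invented for illustration.

```python
def pick_version(versions, preferred, meta_key="un-meta-language"):
    """Return (language, text) for the first preferred language that has a stored version.

    Falls back to the meta-language version, which a reader-side tool could
    then try to convert, and finally to None if nothing usable is stored.
    """
    for lang in preferred:
        if lang in versions:
            return lang, versions[lang]
    if meta_key in versions:
        return meta_key, versions[meta_key]
    return None


page = {
    "english-american": "Hello",
    "espanol": "Hola",
    "un-meta-language": "greet()",  # placeholder, not real UNL
}
print(pick_version(page, ["francais", "espanol"]))  # ('espanol', 'Hola')
print(pick_version(page, ["francais"]))             # ('un-meta-language', 'greet()')
```

    Hand-made translations win when they exist, and the meta-language copy is only the fallback, which matches the ordering of the list.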

    Ah, for the return of HTML to content markup and not display.

    James Cook
    James@CookMD.com
    See http://www.useit.com/ about the bolding.

  171. translating. by thal · · Score: 1

    this is one of those things my wonderful senior year high school teacher used to talk about. we would talk about things like if all words are really "images" in the brain, like as if there's a mental stamp we have inside that we "think" of every time we say "run", that could really be translated into any language... very interesting, but it's really hard to out-think your own brain and try to figure out what it's doing.

    there was an exchange student from germany in that class and she said that when she first came here, she would "think in german", but eventually she found it easier to "think in english" when she was going to speak english.

    i'm a very poor spanish student (which i have to take at college due to general requirements blahblahblah) and i'm not at the point where i can really "think in spanish", so i "think in english" and of course i speak about ten times slower than good spanish speakers (those people they put on the listening tapes are so damn fast!).

    if there is some kind of "universal" language that your brain thinks in, it must be really hidden from your conscious self, because i find it very difficult to think without words. but, of course, as my teacher would say, if you can't think without words, how could the first languages have been developed?

    so of course we must be _able_ to think without words. i guess we just make up our own internal mental representations for things or concepts, but once we learn language, this is probably not used.

    as far as the project itself is concerned... this is going to be _necessary_ at some point for language translation (though of course i think everyone should speak english, like almost "everyone" does already). if you just have italian-english and german-french, etc, etc. translation algorithms, it's going to get ridiculous. n^2, where n is the number of languages you want to translate between.

    i really just hate languages.

  172. Why? English already de facto standard by Anonymous Coward · · Score: 0
    That sounds like an imperialist thing to say, but hey, if you're reading this, you're an English speaker (or reader). Ask yourself if you're willing to learn a new language.

    Thought not.

    By the way, this was already tried once - it was called "Esperanto". Ever meet anyone that speaks it?

    Thought not.

    1. Re:Why? English already de facto standard by PHroD · · Score: 0

      hey i'm LEARNING esperanto (learned a lot of spanish, so many words and roots are familiar, and a damn-easy, regular syntax and grammar makes it a breeze)


      "There is no spoon" - Neo, The Matrix

  173. I don't see how this can work! by cs668 · · Score: 1
    I am not a linguist, but I do speak both English and German fluently.

    My mother is German my father American. My German family speaks no English and my American family speaks no German.

    When I have had to translate I used to try to do a word for word sentence for sentence translation. It never worked. I can not explain why, it just sounded very wrong and sometimes even gave the wrong meaning. Then when I got over having to do sentence for sentence translation and began paraphrasing everyone would understand. I don't see a computer doing any accurate paraphrasing anytime soon.

  174. re: universal language by Anonymous Coward · · Score: 0

    Why is the UN reinventing the wheel? There is a universal language on the internet: it's called English (or american, as the brits would call it).

    Only half in jest on this one...

  175. Universal Translators by Hard_Code · · Score: 1

    We already have realtime audio-textual language translators that are surprisingly accurate.

    I think what they are talking about is a textual intermediate language to bridge other languages. I think this is how the audio translators work anyway so we may be half way there. I think an XML language would do perfectly, like:

    (sentence)
    (clause)
    (subject name="boy" /)
    (verb name="walk" tense="past simple" adverbs="slowly lazily" /)
    (predicate)
    (preposition name="to" /)
    (noun name="school" /)
    (/predicate)
    (/clause)
    (/sentence)

    Something like that (hopefully it wasn't munged). Then I'd say "The boy walked slowly and lazily to school", and it would be converted to the meta language, then into the destination language.
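
    A minimal sketch of the second half of that pipeline (meta language to English), just to show the shape of it. Everything here is an assumption for illustration: real angle brackets instead of parentheses, a single `adverbs` attribute (XML does not allow the same attribute name twice on one element), a one-word toy lexicon, and the made-up function name `to_english`.

```python
import xml.etree.ElementTree as ET

# The example sentence in the hypothetical intermediate markup.
DOC = """
<sentence>
  <clause>
    <subject name="boy"/>
    <verb name="walk" tense="past simple" adverbs="slowly lazily"/>
    <predicate>
      <preposition name="to"/>
      <noun name="school"/>
    </predicate>
  </clause>
</sentence>
"""

# Toy lexicon: past-simple forms for the verbs this sketch knows about.
PAST_SIMPLE = {"walk": "walked"}


def to_english(xml_text: str) -> str:
    """Render one single-clause sentence of the toy markup as English."""
    clause = ET.fromstring(xml_text.strip()).find("clause")
    subject = clause.find("subject").get("name")
    verb = clause.find("verb")
    form = verb.get("name")
    if verb.get("tense") == "past simple":
        form = PAST_SIMPLE[form]
    # Join the space-separated adverbs with "and".
    adverbs = " and ".join(verb.get("adverbs", "").split())
    predicate = clause.find("predicate")
    preposition = predicate.find("preposition").get("name")
    noun = predicate.find("noun").get("name")
    words = ["The", subject, form, adverbs, preposition, noun]
    return " ".join(w for w in words if w) + "."


print(to_english(DOC))
```

    Going the other way, from free English text into the markup, is the hard part and is not attempted here.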

    --

    It's 10 PM. Do you know if you're un-American?
  176. Another Esperanto? by mOdQuArK! · · Score: 1

    Just imagine what the poor linguists who have to find the "common constructs" in the various languages will have to go through...

    I wonder if eventually, people will just "skip" the language translations & learn the meta-language directly - will they teach it in school?

    Of course, this could all go the way of Esperanto...

  177. And here's an antecedent... by SEE · · Score: 2

    This discusses a similar project...

    Wow. I've been on /. longer than I thought, to have remembered to look for that...

  178. Do they have a clue? by Jered · · Score: 2

    "enconverter software"? "Inter-Net"?

    Are there any actual computer scientists or linguists involved with this project? Their web site looks like it's either a team of bureaucrats or fifth-graders.

  179. not a new idea by hugg · · Score: 1

    The idea of using an intermediate language (often called an "interlingua") to translate text is not new. PANGLOSS is (was?) one such project, there is also at least one Japanese interlingua project (http://www.cicc.or.jp/homepage/english/about/act/mt/mt.htm). I don't think these projects led directly to any practical application. It's a tough problem!

    I don't expect UNL to succeed -- I am skeptical of any organization that has only a flowchart and a tag on their home page to show for a year of work.

    What I'd like to see is an open-source project to develop an interlingua for a very specific domain (say, computer user interfaces?) that will be immediately useful. Start with just the interlingua -> human language engine, since it's a bit easier. Use it to make text for dialog boxes, menus, and help files. It won't translate Shakespeare, but it'll be useful!
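
    A toy sketch of what such an interlingua -> human language engine might look like for menu labels. The concept names, the two-language lexicon and the `render` function are all invented for this example; the only linguistic fact it relies on is that English menu labels put the verb first ("Save file") while German ones put the infinitive last ("Datei speichern").

```python
# Interlingua messages are (ACTION, OBJECT) concept pairs, not words of any language.
LEXICON = {
    "en": {"SAVE": "save", "OPEN": "open", "CLOSE": "close",
           "FILE": "file", "WINDOW": "window"},
    "de": {"SAVE": "speichern", "OPEN": "öffnen", "CLOSE": "schließen",
           "FILE": "Datei", "WINDOW": "Fenster"},
}


def render(message, lang):
    """Turn one (ACTION, OBJECT) pair into a menu label in the given language."""
    action, obj = message
    verb = LEXICON[lang][action]
    noun = LEXICON[lang][obj]
    if lang == "en":
        text = f"{verb} {noun}"   # verb first
    elif lang == "de":
        text = f"{noun} {verb}"   # infinitive last
    else:
        raise ValueError(f"no generation rule for language {lang!r}")
    return text[0].upper() + text[1:]


for message in [("SAVE", "FILE"), ("CLOSE", "WINDOW")]:
    print(render(message, "en"), "/", render(message, "de"))
```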

  180. Re:It won't work (not as you may expect) by ianezz · · Score: 1

    Also consider this example (oversimplified):

    A dog is a pet in western culture, but it is more like a chicken (i.e. something to eat) in some eastern cultures (China?).

    So, how should an eastern writer's "He went out to kill the dog" be interpreted? Was he crazy, or just a hungry man? Probably, he was a hungry man.

    The intermediate language may be as precise as possible, but when translating, either you have to ignore the meaning ("He went out to kill the dog"), or you have to ignore the fact ("He went out to kill the chicken").

    Well, let's choose the former: a Babelfish translation does almost this today, so what's the point in doing that? Improving. Well. It's a good point.

    So, let's choose the latter: suddenly, in the text, you find something like "...and still there were some tracks of a four-legged animal on the snow...". Obviously, it can't be a chicken, so the translator should also change all the indirect references to the dog to indirect references to a chicken. "A tricky job", as Deep Thought would say.

    That said, I hope they improve the quality of automated translations. But I don't expect too much soon.

    My 0.02 Euro.

  181. Re:sounds difficult - not as you say by TheCodeMaster · · Score: 1

    Philosophy of Symbolic Forms. I think you mean volume 1. Is this where I remark on my dislike for the deification of Chomsky and his theories?

  182. Intermediate language choice by Anonymous Coward · · Score: 0

    Ages ago I thought of the problem of automatic language translation. It is best done by translating to an intermediate language then translating to the language of choice. DO NOT use a language that is still in use. The reason is "living" (still being used) languages change too rapidly (the same words change meaning with time). Use a "dead" language (no longer used). LATIN is probably best since it is no longer used and there are Latin to "every other language" translations available.

  183. Re:Sounds possible.. not really. by Feral+Wylde+I · · Score: 1

    I am a true bi-lingual and my wife also studied this in college. Here is the gist of it: bi-lingual brains are wired differently (makes me a mutant). The brain forms pair-words of equivalent words. So if you say "house", in my brain house-haus or haus-house is accessed. Based on the aural input I then strip off the other language construct. Since I parse English/German it is actually easier since they are linguistically close. I don't envy people that have, say, English/Japanese or Chinese. And yes, I can generally think in both languages, but mostly when I am alone. The best way to describe that is that when I am alone I can hear the words of the language(s) in my head, but not when I am around other people. I suspect that I use the internal symbolic language when around other people in order for it to be my translator. Oh yeah, until I met my wife I didn't know any of this, I just did it. Until now I didn't realize what a fun toy I had between my ears.

  184. Anonymous Coward meets UNL by Anonymous Coward · · Score: 1

    First Pole! Initial posting! Top Rank! Station #1!

  185. Re:Language support - Esperanto? by Chalst · · Score: 2

    Let me just add something, since I haven't made myself clear in what I said above.

    In German it is possible to use the definite article to refer back to something used in the previous sentence, rather like `it' in English: but with the crucial distinction that what we refer back to must be of matching gender. So if a masculine, feminine and neuter word occur in the sentence it is possible to refer to any of them with the `it'. This ability to refer on the basis of gender must be captured in our syntactic model. Similarly the case system allows one to have multiple objects (one accusative, one dative and one genitive, for example) directly attached to a verb, where in English one would use a preposition.

  186. one big bland language coming right up by tuffy · · Score: 2
    Assuming this scheme can work and they can map some subset of all languages to one another, the result won't be terribly pretty to use. The whole idea behind having different languages is to express cultural diversity - like the old adage that Eskimos have 11 different words for snow. There's no way a universal language could capture that level of subtle differentiation.

    On the other hand, they just might be able to come up with a way to map a small subset of natural language, computer-speak for example, for the purpose of easing the creation of internationalized apps and making web sites more navigable. But I don't see how this could be successful in a general case.

    --

    Ita erat quando hic adveni.

  187. I'm trying to imagine by mudnux · · Score: 1
    how a meta language would have to evolve as each and every language evolved. Trying to keep up with one language is not easy as any dictionary publisher will tell you. Imagine how difficult it will be to track changes in every language.

    I recall when I had to write a graphics format conversion utility for a little known application. This was not a simple task, having to learn each file format and then how to translate it to this new format. Luckily, I had no need to translate between any other formats or translate to any other formats (I had to avoid copylefted code so pbm was out :( ). But it would have been impossible to maintain if the file formats changed constantly. It is bad enough that I had to account for multiple versions of a particular proprietary graphics format.

    Now, suppose they overcome the problem of keeping up with language changes. What about time? I am supposing this is talking about written language. I write something today using current local language terminology that changes over the next few years, or I use some terminology that went out in the 60's, or any combination (what a gnarly bad chuck of freakin' hackish). 10 years later it is translated into the meta-language. How the heck would it translate this and maintain integrity?

    --
    NT is based on the premise that anyone who can manipulate a mouse can administer a system. Huh?!?
  188. Yes, but! by Anonymous Coward · · Score: 0

    That's why your South American language isn't on the list of languages to be supported. =) Guess these people are out of luck, which is really a shame, but that's what happens when the industrialized world gets to run things. The weirdos.

    Yeah, this guy shouldn't have generalized like he did, but I've done work in MT as well, and he's right for the most part. Babelfish is not exactly the epitome of Machine Translators, either. Most good Machine Translators cost money to get to use, and they're used primarily as an aid to human translators, who look over the result. Cuts down on costs. =)

    Anyway. This post is buried so deep, it makes my eyes water.

  189. Noam Chomsky and the Universal Grammar by Hasdi+Hashim · · Score: 3

    For those of you who think this is impossible because of the variations between languages, Noam Chomsky has something to say to you. I was exposed to his idea back in formal languages and automata class. Basically, his argument is that we have a universal grammar (UG) parser built within us when we are born. We 'harden' the parameters of the UG to conform to our preferred language. Sort of like guile and perl, where guile is a very expressive language but perl, while expressing less, can express the same thing in a more concise manner.

    Universal grammar is defined by Chomsky as ``the system of principles, conditions, and rules that are elements or properties of all human languages... the essence of human language'' [Chomsky, 1978].

    Thus, all languages that we are accustomed to, English, Arabic, Malay, Japanese, and Chinese, are special cases of a universal grammar. Chomsky and subsequent linguists are looking for those common elements of all languages.

    Universal grammar and the innateness hypothesis

    Universal Grammar in Prolog

    There is lots of discussion about this... see google.

    Hasdi

    1. Re:Noam Chomsky and the Universal Grammar by the+eric+conspiracy · · Score: 2

      Perhaps there is a universal grammar that is innate to humans. But what makes you think that you can implement it in a Turing Machine?

  190. what about slang? by 78spb89 · · Score: 1

    I will be curious to see/hear how they intend to handle the use of slang, or Ebonics, which is so commonly (over?) used, particularly in the US. Any ideas on this? Maybe Americans will have to learn English now so this will work. Wouldn't that just make things simple?

  191. Key differences from Esperanto by ToastyKen · · Score: 2

    As far as I can tell (just guessing, though) there are 2 key differences between UNL and Esperanto:

    1) It's not Romance-based and thus won't be as Euro-centric and will thus probably translate Eastern languages better.

    2) It's designed as an intermediate language and not as a final end-user language. As far as I can tell, it could even be machine-readable and not speakable. In any case, it will not have as many constraints as a language like Esperanto that is designed for human speech.

    These are just my guesses. I don't know what kind of language they're actually trying to implement. (The website is skimpy on those details.)

  192. UN being anti-Semitic? by Anonymous Coward · · Score: 0
    They claim that the initial stage of UNL will support 16 languages: Arabic, Chinese, English, French, Russian, Spanish, German, Hindi, Italian, Indonesian, Japanese, Latvian, Mongol, Portuguese, Swahili and Thai.

    Latvian and Mongol but not Hebrew? Does this make sense to anybody? Israel is practically a second Silicon Valley (home of ICQ, Gooey, Ricotche, and more; the entire country is covered by a digital cellular network; cell phones outnumber ground lines, etc.)

  193. My Hovercraft is Full of Eels by Anonymous Coward · · Score: 0

    I will not buy this record -- it is scratched!

    I want to fondle your bum!

    Mein Luftkissenfahrzeug ist voll von den Aalen.

    Je veux au fondle votre sans valeur!

    (courtesy of babelfish, obviously)

  194. Interesting, but... by Millennium · · Score: 4

    It's not going to work very well. The problem is that each language has its own nuances, and in many cases these don't translate very well into other languages. I'll use Japanese honorifics as an example. The list of them is relatively long ( -san, -sama, -kun, -chan, -sensei, -wa, and others). Simply by attaching one to the end of a person's name, I can make the same sentence express immoderate flattery or extreme derision. This can be translated in an extremely limited fashion to Romance languages such as Spanish or French (by using familiar vs. formal form of address, but it's still limited). It doesn't translate into English at all (this is why I prefer subtitled anime; get the general meaning from the subtitles, and actually listen to the Japanese for the nuances). And, of course, you still have the problem of inflection not translating very well into written words. This makes English particularly unsuitable for network communications, actually, since so much meaning is left to inflection. What's the solution? I don't know. There probably isn't one. Even Esperanto isn't immune to this problem of losing meanings in translation. I don't think a "universal meta-language" is going to work, though.

    1. Re:Interesting, but... by the+eric+conspiracy · · Score: 2

      A. N. Whitehead was obviously not a linguist at all.

      In actuality English is perhaps the most complex language in modern use, for a number of reasons. It has by far the largest vocabulary; it takes root words from many, many other languages; its rules of grammar are highly irregular; because printing was introduced before the Great Vowel Shift, the spoken form of English does not agree at all with the written form; and the geographic spread is so large that several dialects, pidgins and patois forms of English now exist. If you propose that we adopt English as the base language, you are going to have to be very specific about WHAT English, using what local idioms and rules.

      The richness and complexity of English is perhaps best exemplified by the richness of its body of great literature and poetry, where expression and level of meaning are best brought to form by a language that has a great richness of vocabulary and ability to express multiple levels of ideas in a single word. Of all the languages of the world there are three that clearly have great bodies of literature - Sanskrit, Greek, and yes, English.

  195. Re:Sounds like Esperanto - take 2 by pp · · Score: 1

    Nope, the idea of this project, if I understood correctly, is not to create a new language people around the world can use to talk to each other; it's meant more for machines. English doesn't really qualify, as it's a real language and thus doesn't have the required properties: it is ambiguous in many ways. I'm not familiar with Esperanto, but I would assume it doesn't contain all the necessary information either to be useful as an intermediate language, as it was designed to be a human language. Take for example the word "uncle". Is that on your mother's or father's side? In English it's impossible to say, but in some languages the difference is very important. If your translator program is smart it might guess correctly from the context (or at least state it doesn't know, so maybe a human operator can then choose one). Ok, so maybe things would be much easier if everyone spoke English, but that is not the case and won't be in the foreseeable future. In the meanwhile automatic translation is needed badly and this sounds like a good approach to the problem.
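
    To make the "uncle" point concrete, here is a small sketch. The `Uncle` class and both functions are hypothetical, and Swedish (morbror for a mother's brother, farbror for a father's brother) is picked here only as one example of a language that needs the extra information; it is not taken from the UNL material.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Uncle:
    """Intermediate-language concept; side is 'maternal', 'paternal' or None (unknown)."""
    side: Optional[str] = None


SWEDISH = {"maternal": "morbror", "paternal": "farbror"}


def to_english(concept: Uncle) -> str:
    # English does not mark the side, so the information is simply dropped.
    return "uncle"


def to_swedish(concept: Uncle) -> str:
    # Swedish must pick a side; if the source never said, flag it for a human.
    if concept.side is None:
        return "<morbror|farbror?>"
    return SWEDISH[concept.side]


print(to_english(Uncle("maternal")), "/", to_swedish(Uncle("maternal")))
print(to_english(Uncle()), "/", to_swedish(Uncle()))
```

    The intermediate form has room for the distinction, but a converter starting from plain English text can only fill it in by guessing from context or by asking.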

  196. How good could it be? by EisPick · · Score: 1

    This may sound snide, but if their translation tools work so well, why haven't they been able to translate their Web site into all 16 supported languages? The site only has English and Japanese versions at the moment.

  197. A good book... by curril · · Score: 1

    Pinker's "The Language Instinct" is an outstanding introduction to some of the problems of linguistics. It also presents some theories as to the biological basis for language. One of his most interesting points explained the incredible difficulty in learning a new language after the age of nine or so.

    Pinker feels that there is a portion of the brain that instinctively understands the possibility of grammar, and when learning a language (when young) the brain adapts to the particular grammar of a language. Children who have not been exposed to a language and yet must interact with others in the same situation will develop their own grammar and invent a vocabulary.

    In other words, there isn't a "universal" grammar as such, but more of a "meta" grammar--rules for creating grammatical rules. Of course, the human mind doesn't follow these rules verbatim. Any language has a main set of grammatical rules, but there are numerous exceptions within the language that follow different rules. A universal translator might know the "meta" grammar, but it would still have to figure out the grammar for each particular language plus recognize the exceptions.

    Something of a daunting task in and of itself, and we haven't even begun to talk about the problems of non-corresponding vocabulary, idioms, slang, jargon, language drift, etc., etc. I am not optimistic about the chances of a good universal translator coming around until we get a good AI that mimics the processes of the human brain. Unfortunately, such an AI would probably want coffee breaks.

    Pinker has several other books out that I haven't had a chance to read yet, but they have gotten good reviews and I am sure that they would make excellent reads as well.

  198. First get the interlingua design right by WillWare · · Score: 1
    This is an excellent and very worthwhile project, but I have concerns about the approach. Here is the project schedule, which suggests that the order to do things is: first write UNL-to-native translators (1997), then write Native-to-UNL translators (1998), then test and deploy. The current UNL organization has three parts, one assigned to translator development, and two assigned to the design of UNL (semantics, linguistics, how it connects to HTML).

    If the project organizers try to write good translators first, they will be putting the cart before the horse, and the project is likely to go badly. They should put their effort into the design of UNL, coming up with a good extensible machine-readable language that conveys human semantics, and write only prototype translators. UNL must be an open standard, like TCP/IP or HTML, and once publicly released, it should not drift too much. The writing of the real translators should be left to enthusiastic open-source developers, who will have the time and the motivation to do a much better job on translators.

    Inevitably there will be trade-offs. In most languages, translations from other languages will seem like a pidgin. Fine linguistic nuances will not survive the translation process, and regular users will learn not to depend on them. If it's mostly comprehensible, it will still facilitate communication where none would have been possible previously.

    The first design for UNL should probably be considered provisional, and ultimately a throw-away to be replaced in the future. But we can't replace it until we've learned its lessons. This still seems to me to be a very worthwhile thing to attempt.

    --
    WWJD for a Klondike Bar?
  199. so crazy it just might ... not work by Anonymous Coward · · Score: 0
    I'm not a genius in computational linguistics, but I do know a little.

    Current understanding does not allow us to translate arbitrary subject matter between languages very well. (translations of language coming from a subdomain where meanings are non-arbitrary is currently possible).

    Given UNL, it might be possible to generate natural language from it, but not vice versa. UNL may provide for language meaning to go to natural language, but does not provide a way to get from natural language to meaning, something computational linguistics has been struggling with for decades, and precisely the reason why translation is currently impossible.

    In short, the problem isn't a universal representation of meaning, it is getting natural language to automagically convert to such a structure.

    So, UNL will only be useful once the problem it "solves" is already solved. -k

  200. that's not what "meta" means by ToastyKen · · Score: 1

    Just to be a nitpick, UNL would not be a "meta-language" because it would not be a language about languages. It would just be an intermediate language.
    The enconverters and deconverters would be more like the "meta-languages", sort of...

  201. Esperanto Redux by Anonymous Coward · · Score: 0

    Universal languages for any medium will never be completely accepted. To have the UN try and do this is a waste of time & money. The US should get out of the UN, and kick them out of the country, before they try and impose their new language as part of their New World Order. Oceania had newspeak, the UN has this----

  202. Interlac? by angelo · · Score: 1

    Isn't this the same as interlac?

    Sorry, couldn't resist, but it sounds like the codex of language starting with numbers and working up. Maybe it should be taken that way.

    Un-natural languages don't sit with me well. Klingon, Esperanto, etc. just seem silly. All they can do is borrow words from each language. This would at least guarantee certain words translate exactly to certain languages.

    I think it would be interesting to see everything translated through Chinese, since that is one of the more stateless languages. Of course it would be difficult to understand when translated to English (just go to Chinatown in cisco), but you'd get the gist of the conversation.

    Japanese would be interesting as well. Like Chinese, it conveys thought and emotion more efficiently than English. That and its grammar rules don't swing violently on whim.

    Plurality and conjugation could simply be translated by computer. We just need a universal rule set, not a universal language.

  203. Re:Sounds like Esperanto - take 2 by Anonymous Coward · · Score: 0

    Depends on your starting point. If you are from a Nordic country for example, or Germany or The Netherlands for that matter, English is a lot easier to learn than French is. For Spanish people on the other hand, the opposite is true.

  204. Re:Lojban, not Esperanto by Anonymous Coward · · Score: 0

    A lot of these goals have already been achieved by Lojban. It has precise unambiguous grammar (Yacc-able, in fact), is speakable by humans, and can even be parsed from speech unambiguously, and has none of the cultural baggage that clutters most natural languages (and even Esperanto). Of course, for the project it would be necessary to codify a much bigger vocabulary, but that's not too hard.

  205. Initial=Not finished by Anonymous Coward · · Score: 0

    I agree that Mongol isn't as widely used as, say, Swedish? but, then again, how many Swedes don't speak/read/write another language? I believe they chose this because Mongolia is rapidly adopting "free trade" and how many American companies have business there. BTW, Napoleon didn't use the Internet either but people still speak French.

  206. Mathematics by Anonymous Coward · · Score: 0

    What about using mathematics as an International Language? People would read and write using numbers. We could use numbers to represent the sounds of words. Or we could use them just like alphabets. So instead of "Hello", we would have 28766. And that would be readable by every person.

  207. Linkages by the_tsi · · Score: 2

    Linkwa, pink dama, arf muzheek. Rintintinambulation. Alla da peepholes enda voold, enda looniverse, cargo a schlong ender hertz. Epp, dat schlog arf Unamunda.

    -Chris

  208. Openness of information - Skeptical about progress by ToastyKen · · Score: 1

    Speaking of being more open, given that this project is supposed to help international communication, I'm surprised it gives so incredibly few details about their language. If you look at their project info page, this has been in development for a FEW YEARS already. Yet, their website only contains information on the software, not on the language itself, which would be the hard part.

    I wish they'd give us more information on what UNL itself is like.

  209. Babelfish? by pol-pot · · Score: 1

    Is babelfish gonna be a plugin for my Netscape or what?

  210. not impossible, quite difficult by Anonymous Coward · · Score: 0

    Problems with Chomsky aside, it's still quite tough even with innate logical structures. Humans have a hard time using their own native grammars correctly; making a machine do something that complex is going to be super-tough.

  211. There already is a universal language... by Tim+Macinta · · Score: 1
  212. How do you translate encrypted text? by Anonymous Coward · · Score: 0

    I thought not. We now return you to your regularly scheduled brain-washing.

  213. Re:Language support - Esperanto? by Anonymous Coward · · Score: 0
    Let me add to this. Once upon a time, I was working for a game company during a time when one of their games was being translated into Japanese. At one point, the designers of the game had to meet to decide a point of plot that hadn't been addressed in the English version: Which of two brothers (NPC's in the game) was older? Apparently the translator needed this information as it would affect how they address each other.

    So anyway, these problems are hardly unique to European languages.

  214. Bahasa Indonesia by Hobbex · · Score: 2


    has, to my knowledge, only one tense. And no articles. And plural noted by saying the noun twice ("orang" is person, "orang-orang" is people).

    Needless to say, there isn't much poetry in Indonesian...

    -
    /. is like a steer's horns, a point here, a point there and a lot of bull in between.