Speaking in Tongues
Desert1 writes "Carnegie Mellon's renowned computer science department has developed a system called Tongues which allows for conversation between two different languages. Currently this has been used between Croatian and English; perhaps one day they will be able to develop one that will allow politicians to talk to normal folks and be understood." It's been in development for a while.
...is handwriting recognition that can handle Doctor's handwriting.
until it can allow h@x0r5 and non-"l33t"s to communicate?
I never spellcheck and I freely admit it. Save your karma for more worthwhile "lol erorrs" replies
Then again, you need to understand Holyspiritish before you can write the translator.
http://pcblues.com - Digits and Wood
Looks like a fascinating project --- I wonder if their Vision and Robotics boys are working on recognizing sign language which, for all intents and purposes, seems to be a very much more difficult problem (don't believe me --- see how well the facial recognition packages do in production environments :-P). I wonder if this is at the stage where it could be attached to something like a virtual {insert sign language of your choice here} "translator"... hrm, sounds like a summer project ;-)
I've long wondered why someone doesn't just brute force translation.
Create a human translated database of damn near EVERYTHING in two languages, like English and Spanish. Then, just do fast lookups.
Computing power is such that this would be possible.
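A rough sketch of what that kind of brute-force lookup might look like, in Python. The table entries and the names `TRANSLATIONS`, `normalize` and `lookup_translate` are made up for illustration; a real database would be vastly larger.

```python
# Toy "brute force" translation: look whole sentences up in a table of
# human-made translations. The entries below are illustrative assumptions.
TRANSLATIONS = {
    "good morning": "buenos días",
    "where is the train station?": "¿dónde está la estación de tren?",
    "thank you": "gracias",
}

def normalize(sentence):
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(sentence.lower().split())

def lookup_translate(sentence, table=TRANSLATIONS):
    """Return the stored translation, or None if the sentence was never seen."""
    return table.get(normalize(sentence))

print(lookup_translate("Good   morning"))
print(lookup_translate("Good morning, Dave"))
```

The second call is the catch: any sentence that nobody has translated before simply misses, however close it is to a stored one.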
Learning HOW to think is more important than learning WHAT to think.
Only when the program can quickly and seamlessly convert a hitherto unknown language into everyday English will I be satisfied. The differences between Star Trek and reality must continue to dwindle.
But hey, they could at -least- program Klingon into it.
In Soviet Russia, Beowulf cluster imagines you!
if you arent satisfied with the pc magazine summary, you can read this
In the pre-computer days, some folks noticed that a neophyte (basic idea, needs dictionary) translation into Esperanto was much better comprehended at the other end than a neophyte translation to the destination language or a neophyte translation by the recipient.
The reasoning was that the process of translating into a more formal mechanical language clarified and codified ideas.
Once again, it's the dividing line between human and machine that's the problem. Millions of people train themselves to C or the shells. Fewer to assembly. But it takes some wetware work to push the human/computer boundary closer to the computer.
Like most programming has a learning curve, usually less than ASM, leaving language translation completely to the machine will be fraught and ambiguous. Good translation requires some push from normal speech, but maybe not so far as mastering every other possible language...
So, basically, it's a lookup function, translating the incoming speech and then comparing in a database... So, while they could have a huge dictionary that could cover most situations, they aren't really doing a 'translation' per se...
Although, then again, for anyone who has taken language classes, but are not fluent in the second language, isn't that what we do? I know that while I was taking French and Latin, to come up with phrases I would do phrase translations because I was still thinking in English. I wasn't fluent enough to think in those other languages, so I couldn't formulate phrases directly properly.
I suppose, in essence, this will work as a translator, but it is neither a babel-fish type universal translator nor is it any replacement for fluency.
Still cool, though. Now, can they get it to run on a Palm?
-T
"Drop your panties, Sir William; I cannot wait 'til lunchtime"
'Tis 'bout time that 'twas translated into Croatin, aye!
I have a fish in my ear.
"It's too bad that stupidity isn't painful." - Anton LaVey
Oh great, just what we need: a machine/program that makes it easier for us to snow crash. I'd like to play with this a bit, and find out where its rough edges are--especially running translated output back through, a la the Babelfish.
404 Error:
I would not, could not, in a boat
I would not, could not, with a goat
Sam I Am you let me be!
...perhaps one day they will be able to develop one that will allow politicians to talk to normal folks and be understood.
I don't think there's that much problem understanding what politicians say; it's just a lot of times they aren't very "accurate" with what they say.
I dunno if computer translation is going to be up to par for a long time.
I speak both Spanish and English. English is native and Spanish is due to 3 years in South America. And my grandparents are from Spain. I did not really know anything until I lived in Colombia and my granny who has a PhD in her own language was a pretty harsh mistress. I was 21 years old when I learned. Of course living with a Colombian sysadmin girl for two years was a big help. She liked the Penguin.
Languages differ too much from location to location. Just like English in regions in the US. I am from New Orleans and the English changes from neighborhood to neighborhood.
Word meanings and expressions might be exactly the same in spelling and sound but mean different things to different people.
To build these variables into software would be a *HUGE* task.
I think the best we could hope for is software that does a decent brute translation and then a human does the final edit.
The problem is one word might be ok to use in Puerto Rico (well they are confused about which language they speak) but socially unacceptable in Colombia. Software cannot know the difference.
People will always do the translation gig better.
Puto
Course my handle is pretty bad in any Latin country.
The Revolution Will Not Be Televised
I think it is very interesting that it works by using phrases rather than individual words. Most translators in the past have used words and that leaves room for error with idiomatic phrases such as "window shopping" (the french equivalent translates as "window licking").
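To see why matching phrases first helps with idioms, here is a small Python sketch of greedy longest-match lookup. The phrase table is a toy assumption (not data from the actual system), and `translate` is a hypothetical helper; setting `max_len=1` forces the old word-by-word behaviour for comparison.

```python
# Toy phrase table; entries are illustrative assumptions, not real system data.
PHRASES = {
    ("window", "shopping"): "lèche-vitrines",
    ("window",): "fenêtre",
    ("shopping",): "achats",
}
MAX_LEN = max(len(key) for key in PHRASES)

def translate(words, max_len=MAX_LEN):
    """Greedy longest-match: try the longest phrase first, then shorter ones."""
    out, i = [], 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            chunk = tuple(w.lower() for w in words[i:i + n])
            if chunk in PHRASES:
                out.append(PHRASES[chunk])
                i += n
                break
        else:
            out.append(words[i])  # unknown word: pass it through unchanged
            i += 1
    return " ".join(out)

sentence = "window shopping".split()
print(translate(sentence))             # phrase-aware lookup
print(translate(sentence, max_len=1))  # word-by-word lookup
```

The phrase-aware call finds the idiom as a unit, while the word-by-word call glues together two unrelated dictionary entries.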
Maybe it would be a good idea to put something on the web and let us test it, at least without the speech components.
You call me a pedant? I prefer the term "correct"
One of the most useful ones, now with all the scrutiny in the business world, will be the translation from any kind of management speak/weaselese into English.
Corp officer: We are committed to stringent compliance with accounting rules and will not tolerate anything less than the pure truth.
Translation: We're covering our rears as fast as we can.
Or to steal one from Dilbert...
Management: Employees are our most valuable resource.
Translation: (nothing)
They have them in English-Russian and English-German at present, but apparently plan to add more languages all the time. Their unidirectional models ("UT-103") handle about eight languages currently.
Nothing like technology undoing God's old testament work!
Heck, I'd be happy if it would just let me understand my girlfriend.
My
Limekiller
Another repost. This one's from yesterday, people. http://slashdot.org/articles/02/08/10/033231.shtml?tid=126
it's better than californication and expecting the world to speak english.
Surprisingly, the editor is quite correct.
it's Pronunciation Key (ts)
1. Contraction of it is.
2. Contraction of it has. See Usage Note at its.
Source: The American Heritage® Dictionary of the English Language, Fourth Edition
Copyright © 2000 by Houghton Mifflin Company.
Published by Houghton Mifflin Company. All rights reserved.
Withdrawal before climax is very ineffective and those who try this are usually called "parents."
For example,
becomes:
"All art is quite useless." -- Oscar Wilde
From what I have seen (spoken to?) speech recog pretty much sucks now days, unless you are one of the lucky ones to have one of those 'special voices' that computer speech recog likes. . . .
As I have stated before in these types of articles, until speech recog can get over 95% or so recog on untrained voices, (or heck, I would like it if it could get 90% recog on my voice /trained/. . . .) these sorts of applications of technology are going to be very limited in scope.
Need help treating your acne? Come here!
"I am looking for the tobacconist."
"I need some matches."
"How much do I own you?"
The entire dictionary can be found here.
Your reality is lies and balderdash and I'm delighted to say that I have no grasp of it whatsoever. - Baron Munchausen
... consider my ass presented for biting.
that was seriously fucking lame man.
No matter how you slice it, you'll never be able to make a machine do what a translator does. Why is it that these things are always made by people who aren't multilingual?
My name is Carlos Montoya. You share files of my music. Prepare to die.
Funny, I've spoken in tongues, and didn't need any computer assistance whatsoever...
Creation threads, eternal life threads (ala cryo), now tongues...
Seriously, Michael: do you want to know that you're going to heaven when you die?
Jake
#! /bin/sh
echo "All I'm saying is that I keep my options open."
A "good politician" (good as in "successful," not "responsible") is intentionally vague whenever possible because that allows him to keep his options open. A vague statement that is commonly interpreted one way can later be interpreted a different way. The more details the politician provides, the greater the number of people who will disagree with him.
A smaller-than-PDA version of this was used by that goofy roommate on the TV show Undeclared to speak to his Japanese girlfriend a number of months ago.
Do you remember what the Guide says? I quote: "Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation".
Do we really want Pierre Parisian to be able communicate his exact feelings to Lin Chinese?
Relevant Google Search
This is neat and all, but why did they choose croatian? Why not start with something more useful?
Find your friends!
And this time, I think I spelled his name right, dammit.
The Mongrel Dogs Who Teach
"... perhaps one day they will be able to develop one that will allow politicians to talk to normal folks and be understood."
we already can. it's just no one can believe what they're saying could be that stupid, so they insert a delusion. to help out there, Carnegie Mellon will have to develop a system which enables belief. Once again, Douglas Adams leads the way.
sense to realize something like that (besides me of course)... but then I believe that if speaking in tongues is of God it must be able to be translated like other languages.
Tongue? TONGUE! Tongue tongue Beowulf cluster tongue.
Tongue, tongue belong to us. Tongue.
Tongue tongue #$^@ DMCA tongue RIAA tongue double plus bad tongue.
It's the strangest word after typing it a few times.
Two exist already...grand juries and impeachment hearings. Obviously, they don't work very well yet. Too much politician involvement in the translation matrix.
So long as this stuff stays on the receiving end, this is all a step in the right direction. You don't want to deprive the people you're sending information to of any information. Let them decide whether to use a human or a computer. Sending a computer based translation that you can't understand only increases the chance of offending someone/misrepresenting something.
Giving it to soldiers in the field so they can "speak" the foreign language is bad. Instead give one-way devices to both sides and let them use those to translate what's told to them. That way if they need a human translator to clarify that's still an option.
It would be terrible if information started flowing between countries that had been passed through a computer translator first. Please, let me use babelfish to translate that spanish document, don't use it for me (heck, I have friends from south america who can help me clarify it if I need to but that's *no good* without the original spanish)...
Translation through tongues is a lossy process. Not translating it at least prevents compromising the information. It's all still there...just a wee bit harder to get at.
Brian
perhaps one day they will be able to develop one that will allow politicians to talk to normal folks and be understood
I made one of these, it loops and plays a wav file explaining that it doesnt help because it made claims solely for the reason of getting elected...
No, Beowulf clusters can't imagine in Soviet Russia.
Of course it does. It's a simple contraction representing the way some people say it. Just because it can also mean "it is" means nothing. Many words are spelled the same as others.
Better yet, how about one that let normal folks talk to politicians and be understood?
Right...
Dialect output. Soon, you won't have to listen to some Croatian nun discussing free will translated into Bostonian English, you'll be able to listen to a Croatian nun discussing free will translated into Jive.
This reminds me of a story told to me long ago by a friend of the family. She was of Dutch descent, and the story is about a well bred Englishman who went on a working holiday to Holland. He got work on the docks, and that is where he learned to speak Dutch. The result was that in a refined English accent he spoke obscenity-laden gutter Dutch, apparently unaware that he was doing so.
Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
.. is that the article on PCmag.com is dated September 3, 2002. Slashdot is going all Minority Report -- we know about news BEFORE it happens.
The prob I have with a dictionary translation is the whole is often greater than the sum of its parts. For instance, how do you say "How are you"? Que Tal? What's up? How's it hanging? Come stai? A dictionary can, literally, translate any of these into any language imaginable, but would the listener understand?
I'd rather you do it wrong, than for me to have to do it at all.
Your comment reminds me of one of the all time funniest moments from that classic of american TV, Married with Children. The scene: Al Bundy is at the DMV. He asks (in english) to take the written exam. The clerk asks him what language.
Al: I speak the language that everyone in this country speaks
Clerk: Ah, spanish it is
Al: No. This is america. I speak american.
Clerk: American, eh? (Looks in the file cabinet)
Aha, here it is. American. Wow, I hope you know a lot about trucking.
To make laws that man cannot, and will not obey, serves to bring all law into contempt.
--E.C. Stanton
It was meant to be used by field chaplains serving in Croatia, so they had it translate Croatian. They figured it would be a good test, and the chaplains weren't doing anything where a mistranslation would kill someone.
this was on the discovery channel about a year ago, already being used by the US military in the UN peace keeping mission.
Wow... just look at this guy. Strongly indicates someone trying to be outside the box.
Expanded picture from the article.
Another picture is on his homepage that even has some information on "running Unix on IBM PC110 palmtop computer" and "a Casio E-105 Palm-sized PC"
[news for me, stuff that doesn't matter]
Mellon like the Sindarin (elvish) for friend, as seen in Lord of the Rings?
Cos if so, that's either very cool or very geeky.
RoseColor red={0, 0xffff, 0x0000, 0x0000};VioletColour blue={0, 0x0000, 0x0000, 0xffff};find / -name *mybase*|chown you
- perhaps one day they will be able to develop one that will allow politicians to talk to normal folks and be understood.
A translator would be useless; there's no way to understand someone who doesn't want to be understood. A vulcan mind meld could do the job much better.
How are we /ever/ going to get it into a package that is small enough to fit in your ear and watertight enough to let it swim around in a bowl of water when you're not using it?
Is this how they came up with all of the bad english puns in Serious Sam?
I'm majoring in computer linguistics, and currently we're examining different computer translation models; the one you're suggesting is called the interlingua approach.
The idea is, basically, that you need an "in-betweener" language that can carry all the meaning and connotations of both source and target language. Then you only need translation rules for both sets and then let it run.
The main drawback is that you always have some loss in both translation steps, which sometimes adds up to quite a difference in meaning. The main advantage is that you can modularize - once you have a working English-to-Interlingua module, you can use Interlingua-to-French, Interlingua-to-German, what have you. For further information, google for interlingua "machine translation"...
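To put rough numbers on the modularization advantage, here is a back-of-the-envelope Python sketch. It assumes one directed translator per ordered language pair in the direct case, and one module into plus one module out of the interlingua per language otherwise; the function names are just illustrative.

```python
def direct_modules(n_languages):
    """Directed translators needed when every ordered language pair gets its own."""
    return n_languages * (n_languages - 1)

def interlingua_modules(n_languages):
    """Each language needs one module into the interlingua and one out of it."""
    return 2 * n_languages

for n in (2, 3, 5, 10):
    print(n, direct_modules(n), interlingua_modules(n))
```

For two languages the interlingua route is actually more work (4 modules against 2), but the pairwise count grows quadratically, so by ten languages it is 90 against 20.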
-- Language is a virus from outer space.
Here's the reference.
... and not a single soul karma whoring with references to Noam Chomsky? I'm a little worried... of course, I once had a professor explain that if you're ever at a party with all sorts of academics and you want to look smart, wait for someone to say something that sounds reasonably intelligent and say, "but doesn't that follow from Chomsky?" Apparently nobody will stop to question you ;-)
Like Esperanto, lang2lang translators threaten to be rendered useless by the real world. After all, most people in the world speak some level of English now. In the next 25 years almost everybody in the world will speak fluent English, and the people who can't, well you really wouldn't want to talk to them. What use are 'universal translators' then? But as much use as Esperanto ever was.
then Cobol -to - Pascal
"Two people can exchange a 10-second sentence in about a minute and a half,"
I don't think this project is quite ready, yet.
... involving extremely cunning linguists creating *brand new* languages, completely from scratch, for corporate clients who need to communicate freely and yet still keep something relatively secure.
A per-transaction language, in other words, with a complete new lexicon for each speaker. Of course, the individuals would have to learn the language quite quickly, so this would also be another service realm in this plan.
Sort of like Kings of old, who used to use language differences to obfuscate and control various parts of court, only in this case it would be a commercial service, and available to all.
Something like this would be a good tool in the modern corporate environment, I think.
Well, I'm off to register Babylon, Inc...
Oh, D'oh!
; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
... if only everyone learned to speak Klingon.
True warriors use the Klingon Google
Would it be OK if the soldier and the Croatian gave the two-way translator to the other guy before saying something? This way it would always be the person at the receiving end who's using a translator. As you so elaborately explained, this would be a big improvement over the soldier keeping the two-way translator all the time.
this is not a flame This is a reasonable post
"It includes a speech recognizer, which turns spoken words into text (aka dragon naturally speaking); a machine translator, which converts the text from one language to another (aka babelfish); and a speech synthesizer, which turns the text back into audible words (um... it exists, but forget the name...)."
Sounds to me like all they did was get the programs to communicate with each other ("the speech recognizer, known as Sphinx, and the speech synthesizer, known as Festival"), but other than that it doesn't sound like a major breakthrough to me.
Actually I'm shocked this doesn't already exist considering we already had software to do each of those for at least the past 5 years, it's only now that someone thought of putting them together?
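The three-stage chain described in the quote can be sketched in a few lines of Python. Everything here is a stand-in: `recognize`, `translate` and `synthesize` are hypothetical stubs that do not call Sphinx, Festival or any real translator, and the one-entry phrase table is made up for illustration.

```python
from functools import reduce

def recognize(audio):
    """Stand-in for a speech recognizer: pretend the 'audio' is already a transcript."""
    return audio["transcript"]

def translate(text):
    """Stand-in for a machine translator: a tiny, made-up phrase table."""
    table = {"hello": "bok"}
    return table.get(text.lower(), text)

def synthesize(text):
    """Stand-in for a speech synthesizer: just tag the text as 'spoken' output."""
    return {"spoken": text}

def pipeline(*stages):
    """Compose stages left to right into a single callable."""
    return lambda value: reduce(lambda acc, stage: stage(acc), stages, value)

speech_to_speech = pipeline(recognize, translate, synthesize)
print(speech_to_speech({"transcript": "Hello"}))
```

The wiring really is the easy part. Each stage passes its errors straight on to the next one, which is where a real system would presumably need most of the work.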
They lie, cheat, take bribes, and do everything to save their hides. I already know how to translate 99% of the politicians, just do "> /dev/null".
Sorry for being so pessimistic, but we are closing in on an election over here, and it sickens me how stupid people are who can't see the reason behind what the politicians say and can't remember from past elections how little (almost nothing) they keep of their promises...
*Phew*, for a second there I thought it was the other type of 'speaking in tongues', also referred to as glossolalia, which is totally bogus. If someone comes up to you asking you to attend their church group to learn how to speak in tongues tell them to get lost.
Analytic & algebraic topology of locally Euclidean meterization of infinitely differentiable Riemmanian manifold
Hell if we need ta hear from 'em we'll jus kick thier asses and make 'em learn ta talk American instead of all that gibberish!
Quemadmodum gladius neminem occidit, occidentis telum est
But can it beat Kramnik in chess? Ah, now *there* is the question!
www.HearMySoulSpeak.com
...allowing normal folk to talk to politicians and be understood...
+1 Insightful, -1 Troll. What can I say, I'm an Insightful Troll.
Yet another funny moderator strikes! +1 Informative :-)
"... perhaps one day they will be able to develop one that will allow politicians to talk to normal folks and be understood."
If it would allow 'normal' people to understand what the slashdot crowd is saying, THAT would be an accomplishment
sphinx and festival? dear lord, these have been around for a while, i use both daily. festival used with the mbrola phoneme databases is actually pretty decent, even in multiple languages. sphinx is speech recognition, which is obviously more complicated, so naturally it doesn't work quite as well. it is actually tailored very well for this sort of thing, because it is built around recognizing phrases, which is exactly what they need since they have a huge lookup table. so, if anything in the table is recognized, it makes things that much easier. i don't really see why carnegie mellon deserves so much credit for this? i could have done the EXACT same thing. in fact, i even have MORE. i have a natural language processor, known by some as CHAT BOTS, although i beg to differ, because mine is not meant to fool anyone, instead i am actually trying to teach it to say something intelligible (not from a database of phrases)..with mixed results.
a lookup table for a translator + sphinx + festival is not exactly worthy of an overblown article. sorry, that's just it. no aim to flaim.
QED
BSD is for people who love UNIX. Linux is for those who hate Microsoft.
I still picture _Snow Crash_ whenever the "speaking in tongues" phrase comes about... someone heal me!
-Douglas Adams
Some would say that people enter into more conflict than ever when the barriers separating them are removed (distance, language, etc.) Is a "universal translator" a good thing?
I wonder if this is the nefarious work of Ashera....
I love it when somebody invents something that isn't new.
The first thing you learn about in psycholinguistics is the concept of the pidgin -- a common language which develops between two or more peoples who must interact but share no lingua franca. These simple languages, which sound like baby talk bastardizations of both languages, eventually turn into what's called a creole, such as that sexy patois spoken by fortune tellers on cable.
All these chaps have done is build their own version, and as the case of esperanto shows, it is very difficult for a manufactured language to gain acceptance and adoption. They'd have been better off locking a Croat and a Brit in a large office building with big gulps and no marked bathrooms. These guys would develop a pidgin pretty quick.
Hey freaks: now you're ju
The classic example in language translation without language understanding is Jesus's admonishment to his sleeping disciples (Jesus can't sleep because he knows he is about to be executed): "The spirit is willing but the flesh is weak." Jesus is essentially calling his disciples wusses for nodding off. A round trip to Russian and back supposedly put it as "The meat is rotten but the liquor is holding up." A phrasebook as comprehensive as Bartlett's would recognize this as a highly-quoted Bible passage and use the correct quote from the Bible in Croatian.
The correct translation of "The spirit is willing but the flesh is weak" requires a great deal of context -- one needs to know that people in the Bible talk in symbols and metaphors all the time. The only way I even know what it means is that I remember that it is from the Bible, so for the computer to use a brute-force phrasebook without trying to understand what something means is not far off the mark -- I doubt most of us in the pews understand half of what is in the Bible anyway.
If people don't understand me, rather than learning their native language, I simply shout loudly and aggressively in my own.
I'd rather have one for lawyers. I don't know anyone that can speak legalese.
perhaps one day they will be able to develop one that will allow politicians to talk to normal folks and be understood
a simple lex scanner will do...
%%
.*"$".* {
printf("I'll draft the legislation myself!");
}
.*"can you".* {
printf("I'll see what I can do.");
}
.* {
printf("Certainly. Vote for me.");
}
The article fails to mention why they chose to build a Croatian-to-English translator first. Croatian is a complicated language... Imho a lot more complicated than English. Trust me, I live in Zagreb. When I see a foreigner on TV that knows Croatian, I think to myself, 'Wow...'. And they never know it perfectly. I don't mean the pronunciation, but the grammar.
Not that I'm complaining (hey, it's free publicity for Croatia, visit for your holidays :p), but wouldn't a language like German be more suitable for the first prototypes? I know a bit of German too, it seems closer to English than Croatian to me.
I will know this tech is mature when they are able to translate a legal document into English.
It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
Nope sorry. It is wrong grammatically.
But are we so sure the challenges underlying the thousands of spoken languages are totally sorted then? English and Croatian are both in the same family [Indo-European] that Chomsky mistakenly generalises all languages from.
Croat and English structures are pretty similar.
Whereas English in and out of Sioux or English in and out of a West African Bantu language like, for example, Yoruba would be a bit more of a serious test.
I'd like to see the developers tackle English-Yoruba translation, and then come back a little more modest!
Not to be unkind, but I think the optimism of some people in computing about automating things like translation is due to them never having learned a second language and never having translated anything.
There is a massive problem with missing context between languages, for example, that is quite hard to explain to anyone monolingual.
I'm British, and speak only poor schoolboy French. However since hooking up with my half-Russian, half-Serbian girlfriend, I've found that by learning a dozen or so basic words and phrases by rote, then trying to use them conversationally, I've been able to pick up a surprising amount. Serbo-croat was always supposed to be a nightmare to learn, but it's waaay easier than English... for instance, a phoneme(?), a group of three or four letters, will always be pronounced the same way (cf eg "ain" in English.) I'm rather hoping the Babel Fish is never released; by learning the language you start to subconsciously pick up something of the target language's cognitive assumptions, and (in a small way) to "think like" a native speaker. Now /Russian/, there's a tricky language... but we both play chess which is a good middle-man ;)
The problem I think most posters have missed so far in this discussion [though I have overlooked lots, I'm sure] is that a lot of computer developers assume translation is about different ways of saying the same thing -- and that basically we are all talking about the same core topics. [Hence the discussion about resolving the ambiguities around 'river bank' versus 'money bank']
The real problem is that in fact translation is about different ways of saying different things. This problem is fairly trivial between any two European languages because they have such similar structures [eg languages as close as Croatian and English], and most of us in Europe and North America only get taught one or two of those languages, so most of us have no idea just how different different languages can be. But bridging between Japanese and English, achurch will have a much better idea than most of us of the problem of translating something there is not yet a way of saying in the other language.
Almost every language has a feature no-one else bothers with, and the key question is what do you do when you try to translate a type of statement into a language where it doesn't exist?
Example: non-Bantu-language-speakers' bewilderment at information like whether you are speaking to women, to men, among family, about something you saw with your own eyes, something you saw alone and more being encoded into the choice of verb. Bantu-language into English, OK - you can explain those features in the English translation. But the other direction? How? Going from an English text, how do you choose the right verb form in the Bantu language based on information the English words simply don't provide? Of course you choose a neutral or weak verb form to cover yourself, but then you [or the machine system] isn't really translating, and you really see the difference. A group of West African speakers of one of these languages [for example Yoruba] will rightly regard a machine translation as lame, because it leaves so much stuff out because that stuff simply isn't in the English text as presented. They will choose a Bantu-speaking human to explain what really happened, properly, putting back in all that missing context that a person can know but, until we have seriously intelligent machines, a machine does not know about a live, real-time situation [or even a situation presented in a written text].
That's the real problem. Not finding ways to match up different ways of saying something, but deciding what to do when information one language expects to be in any text or conversation simply isn't in the other language. Sophisticated look-up tables can work well between two languages in the same family [like English and Croatian] because you're looking up the same kind of thing. But go outside the same family and you have a real obstacle - asymmetry. One language wants you to look up something that's simply not there to look up in the other language.
OK, fair enough. Then if Croatian and English are very different, then Yoruba and English [or Sioux and English] are very very very different.
The point about sentence order is good, but it is even worse. In plenty of human language pairs, one language as given [in text or speech] simply lacks information the other language regards as crucial. There are basic asymmetries.
Different language families are context-dependent in different ways, and creating an intelligent lexicon that can work between two languages that are context-dependent in the same way is far from proof of feasibility.
Idioms are pretty minor compared to structures that are regarded as important in one language but are totally missing from the other language.
You're loony. "It has" is perfectly fine. Your quarrel is with the spelling.
But with two languages not in the same language family there's a pretty insurmountable asymmetry to do with context.
Putting an English text into one of the Bantu languages, for example, will just lead to enormous amounts of information important to the Bantu-language speaker being missed out because it simply isn't in the English. Stuff like the social context of the event, how many people saw it, whether they were men or women, young or old -- if that kind of linguistic data is not mentioned in the English text it simply can't be decided on in the Bantu side of the lookup table, leaving nothing like a real translation.
eom
Gender and case agreement in languages like Croatian [most Indo-European languages have gender and case agreement - Latin is an Indo-European language] cause problems for English speakers, but however large these differences seem, the aspect system of, for example, the Bantu languages, is much more different. With West African languages it is not simply a case of modifying a noun or adjective to make it agree, as is the case with most Indo-European languages, but choosing from a battery of distinct verbs to express context. So there can be eight totally different [ie. no letters or sounds in common] verbs to express 'steal' or 'give' or 'kiss'. It's not a question of inflecting a wordstem, it's another thing altogether.
The language families are described that way by linguists because they have major structural differences between families bigger than any differences inside the family. So for example Croatian, English, Norwegian, and Persian/Farsi are classed as being closer to each other than any one of them is to Turkish or Hungarian, for example.
lawyers talking with humans.
...
well, then again, maybe that's too sci-fi... ;)
reps talking with democrats.
women talking with nerds!
Karma
I don't disagree with your idea that actually learning foreign languages has good uses, but you can't use "Serbo-Croat" here, because for the purposes of a translator like this, I'm fairly sure Serbian and Croatian are indeed different enough not to be lumped together.
If you write a phrase both in Croatian and in Serbian (with the latter using the Latin alphabet instead of Cyrillic) on a piece of paper, they might be very similar or in some cases even identical. But if you get two native speakers to pronounce these phrases, no matter how similar they are on paper, the machine will get two fairly different recordings to deal with.
Actually, (believe it or not, I didn't read the story), CMU's had a system that sounds exactly like this (speech->computer metarepresentation->speech) that gets 95% accuracy.
Plus, research speech recognition is well ahead of most consumer-available speech recognition...but also requires custom hardware or more resources.
May we never see th
I guess I ought to mention that I have a project
on SourceForge called Linguaphile. It handles
about 50 languages currently but only about 4 of
them are remotely useful. The Spanish and
Swedish are probably worth playing with. It's
early days and needs lots of work but it does
actually do something now. I'm really interested
in finding people who would like to work on it.
You can try it online or download it if you have
Perl. Apologies in advance that there are no
docs at all since I've had little interest:
Linguaphile online
I envy you knowing an African language - is it Xhosa (Winnie Mandela's mother tongue I believe) which has that clicking sound in the throat?
Last year I met a German restaurant manager in Budapest who had picked up some Zulu in SA. Did your time in the country leave you optimistic/pessimistic about their future?