Computer Program Learns Baby Talk in Any Language

Posted by samzenpus on Wednesday July 25, 2007 @11:13AM from the what-was-your-machines-first-word dept.

athloi writes "Researchers have made a computer program that learns to decode sounds from different languages in the same way that a baby does. The program will help to shed new light on how people learn to talk. It has already raised questions as to how much specific information about language is hard-wired into the brain."

not all languages by blackcoot · 2007-07-25 11:23 · Score: 4, Informative

they have only tested with japanese and english. (see ars technica's coverage here). while they do present some intriguing results, the authors themselves admit that their methodology is flawed. btw, when did slashdot become ars redux?
Re:Baby talk? I swear at my computer! by buswolley · 2007-07-25 11:57 · Score: 2, Informative

Don't be an idiot. Since when is the news story going to tell you what the researchers really think?
I'm busying myself reading the actual research journal article, and forwarding it to my laboratory colleagues.
It looks interesting. Sorry I can't post the journal article text.. copyright blah blah
Vallabha, GK, & McClelland, JL. (2007). Success and failure of new speech category learning in adulthood: consequences of learned Hebbian attractors in topographic maps. Cognitive, affective & behavioral neuroscience, 7(1), 53-73.

--
A Good Troll is better than a Bad Human.
Re:Baby talk? I swear at my computer! by Anonymous Coward · 2007-07-25 12:10 · Score: 3, Informative

"we don't come with linguistic firmware." Noam Chomsky and a few generations of linguists disagree. Not saying they're right, but I'm guessing you lack the qualification to argue.
Re:Skeptical by potpie · 2007-07-25 13:03 · Score: 4, Informative

IAAL (I am a linguist), and I believe you are correct. Language is a colligation of sound and meaning, but this technology merely distinguishes sounds: it is a vastly simplified model, not of how children acquire language, but of how children pick up phones. The phone is the most basic unit of the physical (sound) aspect of language, so if this technology is to have any use at all, it has a very long way to go.

From TFA:
Expanding on some existing ideas, he and a team of international researchers developed a computer model that resembles the brain processes a baby uses when learning about speech.

This sentence means nothing. How do they know their computer model resembles the brain processes? Because they got the same outcome? Is that enough to verify what goes on in the mind of a child?

How about this: as soon as their program can distinguish allophones, I will be impressed. Allophones are different sounds in a language that native speakers do not distinguish, but which nevertheless occur in certain environments. For instance, in English we do not distinguish the voiced th sound and the voiceless th sound, but we do distinguish f and v, even though the only difference in both pairs is voicing. The difference is that exchanging f and v can change the meaning of a word, but changing voiced th and voiceless th only makes the word sound funny.

--
Esoteric reference.
Re:Two speed bumps by Taxman415a · 2007-07-25 13:44 · Score: 2, Informative

You're right, it doesn't seem that McClelland et al.'s paper makes the claims that the Reuters article does. Scientific American's article did a much better job explaining the realities, and the SA author appears to have actually understood what McClelland et al. were getting at.
Re:Skeptical by Anonymous Coward · 2007-07-25 14:28 · Score: 3, Informative

Actually, in English, we do distinguish voiced and unvoiced /th/. They aren't allophones at all - unless you think "thigh" and "thy" are the same word, of course. While "thy" is somewhat archaic it's still part of the language. Voiced and unvoiced is an area where English distinguishes heavily; we're very light on aspiration, mind you.
Re:Baby talk? I swear at my computer! by tilde_e · 2007-07-25 15:49 · Score: 2, Informative

I don't know if you need that much exposure to your father (it sounds like you have had some). I personally tend to pick up the mannerisms of anyone I'm around that I have some kind of affinity for. I begin to gesture like them, I know what they would say in certain situations, I begin to respond to certain situations the same way they would. This can happen even if I only met someone once. This includes: facial expressions (squinting, raising eyebrows), voice inflections, laughing, pauses when speaking. I notice it in written text as well.
Re:Genetics IS a form of memory. by PurpleBob · 2007-07-25 16:08 · Score: 2, Informative

Because humans are adapted to be good at learning language. That doesn't mean they have to be born having already learned it in their genes somehow.

Ad hominem attacks are a really great way to make a scientific point, by the way.

--
Win dain a lotica, en vai tu ri silota
NetTalk by sgml4kids · 2007-07-25 16:13 · Score: 3, Informative

Wasn't this demonstrated about 20 years ago? In that experiment, they showed how a neural network learning to "speak" (i.e. drive a speech synthesizer) would first discover that normal speech has pauses and breaks, then it learned vowels, then consonants. It learned this, if I recall correctly, by comparing (in a backprop sort of way) its output (a transcription of the sounds that came out of the speech synth) against a human reading the same speech.
Here's an audio clip of its learning progression.
And I recall seeing a TV broadcast showing an experiment where infants were incapable of even hearing certain sounds from one language (e.g. an Inuit language with subtle throat-clicking sounds) if they were primarily exposed to another language (say French or English). A baby had to be repeatedly exposed to certain sounds before they could perceive them.
Re:Baby talk? I swear at my computer! by Magada · 2007-07-25 21:37 · Score: 2, Informative

Noam Chomsky will be overjoyed if this thing proves to be a success - because if it does, it will provide no less than a working black-box model of the very firmware in question :).

--
Something bad is coming when people are suddenly anxious to tell the truth.
Re:Baby talk? I swear at my computer! by IndieKid · 2007-07-25 23:48 · Score: 2, Informative

Steven Pinker's book, The Language Instinct, is a good read for anyone interested in the theory of Universal Grammar. It's written in a fairly accessible style, but there are some tough ideas to get your head around if you're new to the subject. Those who have a Computer Science background and learnt about grammars etc. in their compiler design courses might appreciate reading about the subject from a different angle; I know I did.
Re:IAAL, too by kris_lang · 2007-07-26 01:16 · Score: 3, Informative

I'm not at a place where I can access the research article, so let me comment about what I know about McClelland's previous work with neural networks.

Rumelhart and McClelland worked on the groundbreaking "can a neural network learn how to pronounce words based on their spelling?" paper, which used back-propagation to train a neural net to do just that. That was in the 1980s. (Sejnowski at the Salk Institute followed up with a lot of neural net training studies too.)

Their little cheat was that there was no temporal component to the data. Words were represented as sets of triplet-letters: catalog is represented as "-ca", "cat", "ata", "tal", "alo", and "log". (Actually, I don't remember if they used special sequences to represent start and stop, so --c -ca og- and g-- may not have been part of the sets.)
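
A minimal sketch of the triplet-letter representation described above, in Python. The function name `letter_triplets`, the `pad` option, and the "-" boundary marker are assumptions made for illustration, not details taken from the original model.

```python
def letter_triplets(word, pad=False, boundary="-"):
    """Return the overlapping three-letter windows of a word.

    With pad=True, two boundary markers are added at each end so the
    start and end of the word get their own triplets. Whether the
    original model did this is unknown; it is shown as an option only.
    """
    if pad:
        word = boundary * 2 + word + boundary * 2
    # Slide a window of width 3 across the (possibly padded) word.
    return [word[i:i + 3] for i in range(len(word) - 2)]


print(letter_triplets("catalog"))
print(letter_triplets("catalog", pad=True))
```

Note that the result is just a collection of windows: nothing in it records when a sound occurs, which is the "no temporal component" point.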

And of course the neural net didn't really have audio output, though of course the rejoinder is that this would be trivial.

My key question is how they deal with the issue of time in this study, and if there is any actual audio output which would act as feed-back for the training system or whether the output is representational only, as an output set of phonemes.

Having real audio output and real audio input would let it correlate its output with real language examples. Having representational blobs would only mean that: given inputs of the hash that represents "hard TH" vs the hash that represents "soft TH" the system could yield a result of different outputs.

And you're saying that the key result would be if the system learned to conflate or ignore the two sounds of "TH", hard or soft, in trying to interpret words. Remember that the initial Rumelhart-McClelland model was "content/meaning free", and I suspect that this one is too. Learning to conflate "x" and "y" in a neural net would be trivially implemented and trainable: the links for "x" and "y" into the model would have similar weights in the right contexts (the context being the set of predecessor and successor phonemes).

It sounds like an agglomerator: given a large dataset of valid words in a given language, this system learns the rule for "predecessor" and "successor" probabilities of a particular phoneme vs another phoneme and then produces random output with the same Bayesian probability, producing gibberish nonsensical sounds which follow the probability distribution of the input training language.
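
A rough sketch of that kind of "agglomerator", in Python. This illustrates the guess above and is not the method from the paper. Letters stand in for phonemes, and the names `train_successors` and `babble` and the "^"/"$" start and end markers are all hypothetical.

```python
import random
from collections import defaultdict


def train_successors(words):
    """Count how often each symbol is followed by each other symbol."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        # "^" and "$" are assumed start/end markers, not real phonemes.
        symbols = ["^"] + list(word) + ["$"]
        for prev, nxt in zip(symbols, symbols[1:]):
            counts[prev][nxt] += 1
    return counts


def babble(counts, rng, max_len=20):
    """Sample gibberish whose successor statistics follow the training words."""
    out = []
    current = "^"
    while len(out) < max_len:
        successors = counts[current]
        if not successors:  # nothing was ever seen after this symbol
            break
        symbols = list(successors)
        weights = [successors[s] for s in symbols]
        # Pick the next symbol in proportion to its observed count.
        current = rng.choices(symbols, weights=weights)[0]
        if current == "$":
            break
        out.append(current)
    return "".join(out)


counts = train_successors(["mama", "dada", "papa", "abba"])
print(babble(counts, random.Random()))
```

Every adjacent pair in the output was seen somewhere in the training words, but the output itself carries no meaning, which is the objection being raised here.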

Or that's my guess at least, trying to be the typical slashdotter commenting without reading the article.

I'll try to get at the article from the Uni with journal access tomorrow.

Kris
Baby words aren't words by alexhmit01 · 2007-07-26 05:04 · Score: 2, Informative

Going a step further, those "words" aren't words in any language.

The formal words are mother and father, though mommy and daddy seem a reasonable informal way of saying my mother and my father. Mom and Dad are derived from the informal. However, kids master the ma and da syllables quickly, so doubling it up and calling it a word makes it easy.

A friend relayed a story to me... someone asked him why his child called him Abba, which he said was the Hebrew word for daddy. The person protested, "but that's the first noise children make." He smiled back, "I know, and that's why we made it the word for daddy." Evolutionarily, this makes sense, mastering dada before mama makes sense as well... mothers are MUCH more wired for unconditional love than fathers, because of the hormonal bonding from delivery and nursing (those that don't do those steps don't get the hormone dump helping them, doesn't affect their being good mothers, but probably makes it rougher on them)...

Each language has a "simplified" informal and a baby equivalent. Hebrew: Father = Av, Mother = Em, My Father = Abi, My Mother = Imi, yet the informal is Abba and Ima, which officially are tied to Aramaic, but probably evolved as simplified forms for children. Like mama and dada, papa, etc.

It would be a TREMENDOUS biological edge to quickly master words for parents, and it would therefore be a selected characteristic. It's amazing how not upset you get with a terror of a child when they call out your "name."