IBM Strives For 'Superhuman' Speech Tech

Which ... by spiny · 2006-01-24 21:36 · Score: 3, Interesting

Which witch blew the blue candle out ?

--

Fry: heh, Yakov Smirnoff said it
Leela: No he didn't.

Re:Which ... by jakeweston · 2006-01-24 21:46 · Score: 3, Funny

To wreck a nice beach...
Re:Which ... by jcupitt65 · 2006-01-24 22:41 · Score: 5, Interesting

Or I can wreck a nice beach versus I can recognise speech.
Sometimes you need rather a large context to disambiguate: is this sentence part of a discussion on shore-front management, or spoken language understanding?
Re:Which ... by mwood · 2006-01-25 01:59 · Score: 2, Insightful

Just remember that *you* have a truly enormous and well-filled content-addressable memory, a huge and richly-connected semantic network, and untold numbers of self-adapting heuristics that have been trained all day every day for decades, with more coming into production constantly. It's hard for a machine to match that. Feeding 100,000 distinct pattern matchers in parallel is something most computers just aren't architected to do well. That a machine can do even a passable job of speaker-independent continuous speech recognition is an amazing achievement.

BTW what Teletext is like in the U.S. is that we don't have it. :-( We do have titling on some shows, but to compare that to Teletext is like comparing a single couplet to the poetry section of a library.
Re:Which ... by Squalish · 2006-01-25 04:18 · Score: 2, Interesting

The computer is being programmed with the goal of understanding the user, not some arbitrarily defined 'perfect speech' dialect/accent.

--
People in Soviet Russia, however, appear to be afflicted with amusing juxtapositions of the aforementioned situation
Re:Which ... by kryonD · 2006-01-25 05:34 · Score: 2, Insightful

Don't hold your breath on that. After spending seven years studying Japanese just to speak it conversationally, I can tell you flat out that there will never be on the fly translations between Japanese and English. Why you ask? Because the languages and cultures behind the languages are so drastically different, you often have to listen to several sentences before you can organize the correct context for words in the other language. Not to mention occasionally having to add material in the translated output to explain why a certain sequence of words means something.

For example, go watch Memoirs of a Geisha and note that Chiyo keeps calling Mameha "oneesan" (Oh-Nay-San) which literally and figuratively translates to big sister. They are not related, and it is not an affectionate reference that someone might make in English to an older woman who provides protection and guidance. The term actually holds a special meaning in the Japanese world of Hostessing (both Geisha and less formal such as snack bars) that I would find difficult to even explain in English. Good luck IBM.

--
I've dirtied my hands writing poetry, for the sake of seduction; that is, for the sake of a useful cause. --Dostoevsky

Coherency? by PrinceAshitaka · 2006-01-24 21:38 · Score: 4, Insightful

From the article: "For now, all video processed through Tales is delayed by about four minutes, with an accuracy rate of between 60 and 70 percent" and "The accuracy rate could be increased to 80 percent, Roukos added".

Still, even at 80 percent, how good is this translation? If that 20% is the important parts of speech, you could still be left clueless. Even the best machine translations of text I have seen always leave the text a bit garbled and confusticated.

I don't know how much delay is implied in the phrase "on the fly", but I personally don't think there could ever be real time translation for the following reason. Sentences in different languages have different sentence structures. While in English the verb is usually the second part, in other languages the verb often comes last (German). For the translator to get the second word of a sentence, it would have to wait till the end of what could be a long sentence. This necessarily adds delay.

--
quis custodiet ipsos custodes

Re:Coherency? by Yahweh+Doesn't+Exist · 2006-01-24 21:48 · Score: 3, Interesting

yes, there will always be delay for the reason you state. but that's true even with human translators, yet no-one claims real-time meetings between people via translators is a waste of time.

since even "live" broadcasts are usually delayed several minutes for technical and legal reasons anyway, if this technology can get to the state where you're just one or two sentences behind real-life it will be effectively real-time anyway for almost all practical purposes.
Re:Coherency? by sumdumass · 2006-01-24 22:29 · Score: 2, Funny

I'm wondering if this was used during the lead up on Iraq? "I'm unclear if there are bombs here" and end up getting translated into "there are nuclear bombs here".
Re:Coherency? by wizrd_nml · 2006-01-24 22:37 · Score: 2, Informative

For the translator to get the second word of a sentence, it would have to wait till the end, of what could be a long sentence. This necessarily adds delay.
Not necessarily. An on-the-fly translator could translate words as it hears them filling in the translated words in the correct location in the sentence. In other words, the sentence doesn't have to be completed in order. It can dynamically expand to fit in new words.
If you listen to human translators doing on-the-fly translation you'll see this is how they work.
Re:Coherency? by dancallaghan · 2006-01-24 22:40 · Score: 3, Interesting

but I personally don't think there could ever be real time translation for the following reason. [German]

You are going to have that problem whether it's a machine doing the translating or a human. As I understand it, interpreters of German get around this by some quick-thinking restructuring of the translated sentence, or they simply lag a half-sentence or so behind.

The real problem for machine translation is, and always has been, determining the sense of a word from context (indeed I recall a recent Slashdot article about some guy who suggests this is the separating factor between computers and animal intelligence). Most languages have a great many homonyms whose meaning a listener can determine only from the surrounding context and, often, general background knowledge of the language or topic at hand.

first? by Anonymous Coward · 2006-01-24 21:39 · Score: 5, Funny

however the researchers stated "We still can't figure out what Bob Dylan is saying"

Nuances by AnonymousYellowBelly · 2006-01-24 21:43 · Score: 4, Funny

GB on TV: "We have prevailed"
Subtitle: "All your base are belong to us"

--
Disclosure: I'm stupid

NSA Babelfish by Elixon · 2006-01-24 21:44 · Score: 2, Funny

I cannot wait until I buy the first eBabelfish gadget that I will put in my ear so I can understand the spoken language of my Russian colleagues... ;-) :-) I hope that somebody will not consider it as "important technology for the national security" and will not restrict it by any means...

(I'm sure that this eBabelfish is already installed - not in my ear - but on the telecommunication centers...)

--
Well, I've got to get back to work. When I stop rowing, the slave ship just goes in circles.

Foreign languages are complex... by pubjames · 2006-01-24 21:52 · Score: 5, Insightful

I'm afraid this type of technology will be used as an excuse for people not to learn foreign languages, which is a shame.

It's not until you learn another foreign language that you realise how complex languages are, and how subtle. Learning another language can literally change the way you think about things.

This type of technology will make people think they completely understand a foreign language, but they won't. Their understanding will be crude, without the subtleties and cultural understanding.

I can speak English and Spanish fluently, and if I watch an English film with Spanish subtitles I'm always thinking - damn, they missed a good joke there, they got that wrong, etc. (Equally so with a Spanish film with English subtitles). And film subtitles are done by professional translators. God only knows what a terrible job a computer would make of film translation.

Re:Foreign languages are complex... by Viol8 · 2006-01-24 22:00 · Score: 2, Insightful

"It's not until you learn another foreign language that you realise how complex languages are, and how subtle."

And how weird sometimes. English for example loves to use the word "up" in all
sorts of unsuitable places:

give up
shut up
fed up
wash up
fuck up
laid up
muck up
turn up
free up
look up
make up
put up
screw up
hang up
wrap up
hold up
grow up

Wtf?

And how come we say "didn't he.." but in longhand it's "did he not..."? Shouldn't
it be "did not he"? Why does the "not" shift to the other side of the pronoun?
But then all languages have similar weird, illogical syntax.
Re:Foreign languages are complex... by MPHellwig · 2006-01-24 22:09 · Score: 4, Funny

And of course: "Up yours!" ;-)
Re:Foreign languages are complex... by Mushdot · 2006-01-24 22:16 · Score: 3, Interesting

I have a friend who works in Japan and he tells me the same. He often goes to watch English films that are subtitled in Japanese and tells me that they completely mistranslate most of the jokes and miss subtle nuances of speech. One example he gave was a scene from 'The Full Monty' (I'm doing this from distant memory so it might not be quite right - in fact, a bad translation :-)

One of the characters is shouting up to someone in their bedroom window. They don't respond to the shouting and the character says "He obviously can't hear me because of his triple glazing".

This is a sarcastic comment relating to the house owner's supposed wealth but in Japanese it was translated as:

"He has thick windows"

Perhaps in this case there was no easy way to translate - but I suspect films are probably translated in one pass and there is no time to understand the context of each sentence spoken so it's left to literal translation only.
Re:Foreign languages are complex... by virtualsid · 2006-01-24 22:45 · Score: 2, Insightful

I'm afraid this type of technology will be used as an excuse for people not to learn foreign languages, which is a shame.

I'm not quite sure what you mean here. That people will not bother because of this technology?

I can't see anyone not wanting to bother learning a language because of this technology. Not unless it was a babelfish/universal translator type technology - i.e. basically invisible. In which case, what's the issue? ;-)

What are you going to do:
a) Walk around with a little device which translates with 60-80% accuracy when you're in a country where people speak a language you do not understand.
b) Try to learn the language so you don't have to rely on a gadget?

I think I know which one I'd choose - not that I can speak anything other than English, but I do try.

Once devices get to 100% accuracy, my argument disappears. I'd love for that to happen too :-)

Sid
Re:Foreign languages are complex... by anum · 2006-01-24 22:59 · Score: 2, Insightful

Learning a foreign language is a net good and the only way to really understand another culture is to experience it. That said, there are a large number of languages and an even larger number of cultures. Do you intend to learn/experience them all?

Can you see no good in a rough translation for some purposes?

Calculators have largely eliminated the need (and in some cases the ability) for people to do basic math. Therefore we should eliminate calculators before these people start believing that they completely understand cube roots when they just know how to push buttons.

Oh yeah, that reminds me...Cartoons aren't real.

Good luck IBM and I hope this stuff becomes viable soon.

--
I don't think, Therefore I'm not.
Re:Foreign languages are complex... by Splab · 2006-01-24 23:30 · Score: 5, Funny

From boondock saints:
Rocco: Fucking... What the fuck. Who the fuck fucked this fucking... How did you two fucking fucks...
[shouts]
Rocco: fuck!
Connor: Well, that certainly illustrates the diversity of the word.

Think that just about covers it...
Re:Foreign languages are complex... by anum · 2006-01-24 23:33 · Score: 2, Interesting

Ya, I got ya'.

I almost added "I just hope GWB doesn't decide to fire all his intell linguists based on this post" but it seemed kind of like bashing the Prez and i would never do that...

Cheers

--
I don't think, Therefore I'm not.

Ghee... by Anonymous Coward · 2006-01-24 21:54 · Score: 4, Insightful

Hmm, instantaneous translation from Arabic, wonder who "cough cough echelon cough!" they are marketing this to.. ?

Re:Ghee... by SchwarzeReiter · 2006-01-25 04:22 · Score: 2

Man, if IBM markets this in 2006, the NSA has had it working since 2000

Re:Just what we need... by pubjames · 2006-01-24 21:55 · Score: 4, Insightful

More opportunities for Arabic speaking people to misinterpret western media.

I think you've got it the wrong way round haven't you? Did you mean to say "More opportunities for English speaking people to misinterpret Arabic media."?

If they REALLY want to test it properly... by Viol8 · 2006-01-24 21:56 · Score: 4, Funny

...they should send it to Glasgow on a saturday night just after the pubs
have closed.

"Ye loooiii ahhh me jimmeh??! *belch* C'mere ya wee electrahnich bastid, I'll
shoo ye!"

It isn't worth it by YearOfTheDragon · 2006-01-24 22:00 · Score: 5, Funny

Maybe IBM is going to make speech recognition true, but Bill Gates said that this was possible a long time ago. Simply genius.

--
-= If you fight Dragons long enough, you will become a Dragon =-

On-The-Fly by Trurl's+Machine · 2006-01-24 22:02 · Score: 4, Informative

They really do it on the fly? You mean, [on the surface of] [a particular] [insect of a Musca domestica species]?

I have read a lot of auto-translated documents and it is always a good laugh in terms of "crapslation cabaret". So far, there is no technology that could auto-translate a text document successfully. The "80% success" is a myth - they just count how many words were found in the vocabulary, not how many of them were put into a good context. A "fly" translated as an insect would be counted as a success!
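
To illustrate the distinction being drawn here (this is not how IBM scores its system, which the article does not say), a minimal Python sketch can contrast "every word was in the vocabulary" with "every word was the right one". The function names and the toy data are hypothetical, and the position-by-position comparison is a deliberate simplification.

```python
def vocabulary_coverage(output_words, vocabulary):
    """Fraction of output words that appear in the vocabulary at all."""
    if not output_words:
        return 0.0
    known = sum(1 for w in output_words if w in vocabulary)
    return known / len(output_words)


def positional_accuracy(output_words, reference_words):
    """Fraction of reference positions where the output has the same word.

    Deliberately crude: a real evaluation would align the two sequences
    first (for example with edit distance) rather than compare by position.
    """
    if not reference_words:
        return 0.0
    matches = sum(1 for o, r in zip(output_words, reference_words) if o == r)
    return matches / len(reference_words)


# Hypothetical toy data: the output uses only known words,
# but picks the insect sense of "fly".
vocabulary = {"translated", "on", "the", "insect", "spot"}
reference = ["translated", "on", "the", "spot"]
output = ["translated", "on", "the", "insect"]

print(vocabulary_coverage(output, vocabulary))  # 1.0
print(positional_accuracy(output, reference))   # 0.75
```

A coverage-style count reports a perfect score for the insect reading, while a comparison against a reference translation does not.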

Even if you are not a bot but a human being with some knowledge of the other language and culture, it's very easy to involuntarily offend someone or just to make a ridiculous faux-pas. Polish and Czech languages, for example, are very much alike and use common roots for many words, but because of the way both languages evolved, some neutral terms on one side of the border have become offensive on the other side. Czechs evolved a euphemism for sexual intercourse based on the verb "to look for". Poles still use this word when they look for something, which leads to constant crapslation cabaret gags when a Polish tourist appears in a Czech town "looking for a parking lot". Now, auto-translate this...

Re:On-The-Fly by Red+Alastor · 2006-01-24 23:48 · Score: 2, Insightful

However, add in "domain knowledge" and you're in some interesting territory. I think this is essentially what Google did - they fed in oodles of texts in the various languages so that the system could statistically match phrases. At a simple level, you could have a lookup table of common colloquialisms (eg. 'he's kicked the bucket'(English/UK) == 'he broke his pipe' (French/FR)).
The problem is that while French/FR people will understand the expression, others like French/CA won't. And even if they did special lookup tables, you'll still miss subtlety. For instance, if I want to use the expression you gave as an example as a warning to someone in French/CA, I could say "You'll break your neck." which would carry the same meaning. But if I say that someone broke his neck, then it should be understood literally.
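
A rough sketch of the lookup-table idea, with the locale made part of the key so that a French/FR idiom is not silently reused for French/CA. The table contents, the locale codes and the function name are assumptions for illustration only; this says nothing about how Google or IBM actually store phrases.

```python
# Hypothetical idiom table keyed by (source phrase, target locale).
IDIOMS = {
    ("he's kicked the bucket", "fr-FR"): "il a cassé sa pipe",
}


def translate_idiom(phrase, target_locale):
    """Return a locale-specific idiom, or None if the table has no entry."""
    return IDIOMS.get((phrase.lower(), target_locale))


print(translate_idiom("He's kicked the bucket", "fr-FR"))  # il a cassé sa pipe
print(translate_idiom("He's kicked the bucket", "fr-CA"))  # None
```

Even with per-locale entries, a table like this cannot tell a figurative "broke his neck" from a literal one; that still needs context.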

--
Slashdot anagrams to "Sad Sloth"

IBM and Google cooperation to come? by Mostly+a+lurker · 2006-01-24 22:13 · Score: 2, Interesting

IBM has been one of the pioneers in speech recognition for a long time. However, indications are that Google (in the lab) has been making tremendous progress in translation. While the two companies are bound to be fierce competitors, it would seem they would both have much to gain from cooperation in the area of language recognition and translation.

This won't make speech recognition mainstream by thbb · 2006-01-24 22:16 · Score: 4, Interesting

As has been the case for the past thirty years, descriptions of the system's prowess are still written in the conditional form: "...IBM technology can be used to control computers and devices..." rather than the active form: "is being used"...

Ben Shneiderman is the person who, in my opinion, best articulates the limits of speech recognition.

One of my favorite phrases to explain this issue is: "You don't want to speak to a computer, because you can't speak and think at the same time". More precisely, speech utterance makes use of some modules in our brain which are required for planning too. Hence, you can't plan as well what to do next when you speak, which is a big hurdle in the type of intellectual activities one carries out with a computer.

Awful default TTS by Council · 2006-01-24 22:19 · Score: 3, Insightful

Speech-to-text is cool, but for 30 years they've been predicting it's the next new thing in interfaces, and it's remained a niche thing as it gets better and better. Maybe it'll hit the point where it's flawless and suddenly find new markets, but we'll see.

What really bothers me is the state of Windows text-to-speech. The TTS that ships with the most popular operating system on Earth is easily trumped in understandability by a small third-party program I downloaded literally TWELVE YEARS AGO. I really wonder if M$ made some pact to give out crappy TTS so as not to stifle sales of some business partner's application.

This seems pretty ridiculous, but I'm at a loss as to why their text-to-speech programs are of 12-year-old quality.

I'm glad people are doing good speech research, (I know I've seen a demo of good IBM TTS somewhere) but I hope it finds its way into Windows someday.

--
xkcd.com - a webcomic of mathematics, love, and language.

Re:Awful default TTS by wfWebber · 2006-01-24 22:48 · Score: 2, Informative

Then again, if they supplied a version that produced awesome quality voices, they'd be accused of trying to kill their TTS competition.

That said, in Microsoft Windows Vista (ETA 2019), the default TTS engine will be replaced by a new one sporting Anna. Have heard her in the preview and I have to say, it's one hell of an improvement.

--
Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway. -- Andrew S. Tanenbaum

Re:Opensource? by omeg · 2006-01-24 22:22 · Score: 3, Insightful

Of course it won't be open source. They achieved what they dub a "breakthrough in speech recognition". They plan on making a lot of money with this.

American or English? by squoozer · 2006-01-24 22:30 · Score: 2, Interesting

I realize that Americans and British (English at least ;o)) speak essentially the same language but I have yet to find any speech recognition software that can get more than roughly 85% of what I say correct. I have a fairly soft neutral English accent with pretty good enunciation so I would have expected to be getting a recognition rate in the high 90%s. I'm wondering if, as most of this software is developed in the US, it is tuned specifically to pick up on English with a US accent? I realize that you train the software for your voice but AIUI all you are doing is tuning a basic speech model. Has anyone else had this problem or is it just me?

--
I used to have a better sig but it broke.

Re:American or English? by Vengeance · 2006-01-24 23:23 · Score: 2, Funny

I'm sorry, what?!?!?

I cannot understand a word you're saying. What's with that accent?

--
It was a joke! When you give me that look it was a joke.

Oh oh oh. by Anonymous Coward · 2006-01-24 22:33 · Score: 3, Funny

I think it was about 1996 or maybe 1997 when I attended an IBM demonstration (for retailers) for its speech recognition software. Anyway, the lady who was narrating the text and. talking. like. a. robot. to. do. it. was half-way through when, for no apparent reason, the word uterus appeared in the text.

So I'm sitting here thinking of how funny it was to the juvenile me back then, and how unfunny it seems right now. Oh well.

Not _that_ amazing by johndoe42 · 2006-01-24 22:42 · Score: 2, Interesting

It's been well-known among language researchers that both speech recognition and parsing/comprehension are much easier when applied to a small problem domain. SRI in Palo Alto and CSLI at Stanford, for example, have a number of very impressive speech recognition packages that understand, for example, medicine-related sentences. The dashboard controls just sound like a logical progression of this to faster computers and an even smaller problem domain. They're cool nonetheless.

The translation, on the other hand, sounds damned impressive. For unrestricted content, especially with an untrained voice (I imagine that IBM isn't individually training to each Al Jazeera talking head), 70% recognition sounds quite good. 70% accuracy post-translation ought to be quite a bit better than what's currently out there. The description of MASTOR, however, is useless -- it could easily describe anything that isn't word-for-word translation.

And German is an easy one by Ogemaniac · 2006-01-24 22:44 · Score: 4, Informative

It is as close to English as any other language. In general, European languages have the same basics as English (such as "the") and are fairly easy to learn and translate. Right now I live in Japan, where the language and its underlying way of thinking basically run in the reverse direction of English. To translate, you are essentially running the whole thing backwards. Worse yet, the fundamental parts of the language are quite different. For example, Japanese does not have articles or prepositions, though it has post-positions that roughly correspond. However, there are fewer of them, so they have "lots of meanings" when translated into English. Translation can be a "#$#, even for a human who understands both languages very well (which is why anime comes off so corny sometimes). There are countless times where there is just no simple way to express a thought in one language that is trivial in the other.

Japanese and English are quite different by Ogemaniac · 2006-01-24 22:48 · Score: 2, Insightful

and it is usually extremely difficult to translate jokes. The senses of humor are quite different as well. I think this is part of the charm of anime, actually - we are laughing at things Japanese aren't always intended to find funny, while missing half of the jokes that are supposed to be there.

Buyer beware by 99luftballon · 2006-01-24 23:04 · Score: 4, Insightful

Speech recognition has long been the land of inflated promises and little returns. Anyone remember Lernout & Hauspie and its supposed 15 minutes learning time?

Speech recognition is riddled with problems. From a computing side it's enormously processor intensive and memory hungry. From a computer side it's very complex code and the 'learning' process is fraught with problems - surnames, company names and locations are all very poorly recognised.

So don't rush to buy. Let the labs check it out first.

Re:Just what we need... by user9918277462 · 2006-01-24 23:13 · Score: 4, Insightful

There's a very good reason they're testing this tech on Arabic speech primarily. Although they won't say it, I'd be very surprised if the DOD isn't sponsoring this. NSA would absolutely love to be able to translate and transcribe monitored Arabic speech (ie, phone calls) in real time. No backlog of untranslated intercepts, no staff shortages.

funny this subject should come up... by dafragsta · 2006-01-25 00:04 · Score: 2, Interesting

I've actually never used any speech recognition software before today. That said, today just happens to be the day. That said, I tried out Dragon NaturallySpeaking for the first time, and it is a complete coincidence that this topic should come up. I'm actually dictating this post with Dragon, as we speak. ha ha

the training process definitely has its ups and downs. The more you work with it however, the more it becomes attenuated to your own speech patterns and moreover, the quirky words we use every day. If you can get past the first two or three hours, you'll see that it is totally worth the effort, especially if this IBM tech isn't available to end-users for some time. There is also an aspect of the software training you, while you train the software. At the present time, I can dictate to slightly slower than I can probably type.

In the end, I can see where this would make a writing e-mails and other such time-consuming tasks, which involve spellchecking, grammar, and other proof reading significantly quicker. When you really hit your stride, it's easy to write at the speed of thought, which is really appealing. There are caveats, however. it's very easy to dictate several sentences worth of tax and taken for granted that it to everything down the way you attendedselect tax select select tax undo

Real-time eavesdropping by 0xC2 · 2006-01-25 00:30 · Score: 2, Interesting

Although most of the discussion so far has focused on foreign language translation, this technology is about *real-time-audio-to-text* conversion. The feds will be able to monitor, analyze, and record our conversations in real time:

Monitor all conversation.
Apply real-time text filters.
Assign live agents to priority eavesdropping.
Profit!

If you could apply a filter to listen in to any call what would it be?

--
Be heard || Be herd

Translating Arab TV by Perl-Pusher · 2006-01-25 00:56 · Score: 2, Informative

I imagine it is easier to translate repetitive phrases such as "The zionist oppressor shall be eliminated", "The great Satan America will be destroyed" and "Our martyrs have struck fear in the hearts of the infidels".

I was in Kuwait and watched Arab TV with English subtitles, it was enlightening to say the least. One long tribute to racism paid for by the Amir of Qatar. Only on Arab TV will you see such trash as "the jews are descended from pigs".

Excellent Product, Confused Reviewers by MarsGov · 2006-01-25 01:44 · Score: 2, Informative

ViaVoice Embedded, the product that they're releasing, works on limited-domain problems: for example, tasks related to control of your car's peripherals. When the vocabulary and grammars are constrained it's possible to achieve very decent accuracy.

Dictation, however, is a completely different problem. There are far fewer constraints on what can be said, and the system makes errors as it picks through the possible choices. As a result, most dictation software requires training: the system will use your voice to train its recognition models to improve its word selection. Dictation systems also ask for samples of your documents to train their language models on how you put words together; that also helps determine the probability of proper word choice. (Example of how you put words together: "Peanut butter sandwich" is a much more likely choice than "peanut butter sand," and will get a higher score.)
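
As a toy illustration of that scoring (a sketch only: real dictation products use far larger models, and nothing here reflects ViaVoice internals), the following assumes a bigram language model with add-one smoothing trained on a made-up three-sentence corpus standing in for the user's sample documents.

```python
import math
from collections import Counter

# Hypothetical stand-in for the user's sample documents.
corpus = [
    "i ate a peanut butter sandwich",
    "she made a peanut butter sandwich",
    "the sand was hot",
]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    words = sentence.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

vocab_size = len(unigrams)


def log_score(phrase):
    """Sum of add-one-smoothed bigram log-probabilities for the phrase."""
    words = phrase.split()
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        total += math.log(p)
    return total


# Two acoustically similar candidates; the language model breaks the tie.
candidates = ["peanut butter sandwich", "peanut butter sand"]
best = max(candidates, key=log_score)
print(best)  # peanut butter sandwich
```

Because "butter sandwich" was seen in the sample text and "butter sand" was not, the first candidate gets the higher score, which is the effect described above.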

The IBM announcement is about embedded, task-oriented speech recognition. It's not "superhuman," according to the article's text and ignoring its headline. I'll have an opportunity to see it in action next week at SpeechTek West. Expect to see other product announcements about speech technology in the next few days as the conference approaches.

As for the TV translation software, it's still in the research stage according to the article. I've seen BBN's version of this software, and frankly it's amazing how good real-time translation can be.

Bell Canada deployed Emily a few years back, and the results to date have been excellent. A top-level question of "How can I help you?" replaces several layers of DTMF auto-attendant complexity.

If you're interested in trying speech recognition and text-to-speech out for yourself, you can use Voxeo's servers, program in VoiceXML, and my Voice Conference Manager app as a starting point (yeah, VCM needs a new release, and it's getting one soon).

Let's see it translate poems by roman_mir · 2006-01-25 01:57 · Score: 2, Interesting

When and if it can translate poems from language to language, while keeping the style, the nuances, the rhythm, the cultural references, the general idea and the details, then we will know - it is done. Until then, don't hold your breath.

--
You can't handle the truth.

Re:Let's see it translate poems by hunterx11 · 2006-01-25 02:23 · Score: 3, Interesting

I'd be happy enough if humans could do this.

--
English is easier said than done.

Re:Just what we need... by mwood · 2006-01-25 02:03 · Score: 2, Insightful

Patriotic. What part of "*International* Business Machines" did you not understand? More likely it's to show that they really understand the problem and not just the English-only subset.

S-to-T in hospitals by stardancer · 2006-01-25 02:20 · Score: 2, Interesting

I know that one hospital in Norway has been experimenting with/testing speech-to-text software for a while, and reports say it's been very successful! (this supports what was said about speech recognition within a tight context in an earlier comment). I believe the plan is to, at some point, eliminate the need of secretaries transcribing what the doctors dictate, so that ideally the doctors can just speak into a mic and the text automagically appears in the patient's (electronic/digital) journal!

This of course worries secretaries, since they might eventually lose their job/"career". On the other hand it would improve efficiency *a lot*.

--
There's nothing too profound behind this sig.

Live experiment with Dragon 8 by bdwoolman · 2006-01-25 02:47 · Score: 4, Funny

Here we go:

I can wreck a nice beach. I can recognize speech.

Well, Dragon Systems eight passed the beach test first try. Knowing the program, however, I did use pretty clear diction.

I use Dragon Systems and find it absolutely great. There are a few persistent errors. For example, It frequently fails to get "there" and " there" right on the first try. But the fly down menu system enables me to quickly correct the problem on the run. Certainly I pick it up on an edit. If IBM has something better than this -- and it sounds like they do -- then it must be pretty darn good. Of course, you have to insert the punctuation verbally. But that comes with a little practice -- provided that you know what to do in the first place.

It does take a little bit of investment in time. But not nearly as much as learning to type at seventy words a minute, which I can now do in dictation. I have added very little by way of customized commands etc. The program has done a lot of learning on its own.

Let's try once again: I can't recognize beach. I can recognize speech. Oops. Okay, it failed that time. Let's try one more time: I can wreck a nice beach. I can recognize speech. Well, the phrases have to be enunciated pretty clearly or the program has trouble.

Which which blew the blue candle. Failed on the second "which" the b*tch.

Okay, okay. I'll put the laundry in the dryer. No I am not just screwing around on Slashdot again I'm getting some work done down here. Just a minute. Just a MINUTE.

One trouble. You do have to put the mike to sleep during family discussions.

--
"No fear. No envy. No meanness." Liam Clancy

Unlikely by rcbarnes · 2006-01-25 06:31 · Score: 2, Insightful

Transcription? Not too hard. Translation? I highly doubt it.

Recent studies of the efficacy of machine translation found that modern engines have made only marginal progress over those of the *70s* (in fact, one of them, SYSTRAN, is the most used translation engine online), and there was *no* discernible difference between engines of the eighties and current engines. I hope that they're not trying to claim that they suddenly overcame the vast problems of translation wholly independent of the linguistic community. That's just ludicrous.

I'd love to see this engine handle a parasitic sentence like this between two largely different languages and catch the nuance in the parens: "Which report did she file (that report) without (her) reading (that same report)?" Sure, some engines will hit by chance, but only because of similar structure; the engine is lucky, not actually parsing the "meaning."

--
"Fight for lost causes. You may discover they weren't."

Slashdot Mirror
