Speech Recognition in Silicon

Posted by CmdrTaco on Tuesday September 14, 2004 @02:55AM from the spell-my-naughty-words dept.

Ben Sullivan writes "NSF-funded researchers are working to develop a silicon-based approach to speech recognition. 'The goal is to create a radically new and efficient silicon chip architecture that only does speech recognition, but does this 100 to 1,000 times more efficiently than a conventional computer.' Good use of $1 million?"

Funny... by leonmergen · 2004-09-14 02:56 · Score: 5, Interesting

Funny, I work on a speech recognition research project, and well, I have to say, think about all the possibilities... automated speech-to-text recording of meetings, on-the-fly subtitling of live TV shows, but it gets better: think about searching multimedia files in a Google kind of way based on audio, which automatically directs you to the part of the file where you want to be...
If what they're saying really is true, and knowing how much money is invested in speech recognition research on a yearly basis, yeah, I would definitely say that this is one million dollars well invested...
... but then again, maybe they're just throwing numbers around to make sure they get their money. :)

--
- Leon Mergen
http://www.solatis.com
1... million... DOLLARS!!! by AKAImBatman · 2004-09-14 02:56 · Score: 5, Interesting

Good use of $1 million?

Let me think for a moment... Hell yeah! If we had low power speech processors, the possibilities would be endless. For one, we'd finally have a Star Trek(TM) interface for our homes!

"Computer, lights!"
"Computer, make coffee!"
"Computer, Earl Grey, hot!"

As silly as it may sound, such an interface would be far more efficient than mashing buttons.

In addition, blind people could be significantly helped by this. Many of them already use speech recognition and synthesis to assist in computer usage. Imagine if their computers could suddenly understand them a thousand times better? They could talk to their computers a bit more naturally, thus saving their vocal cords from undue stress.

Other applications (off the top of my head) are:

- Voice notes on embedded devices (store only text!)
- Helpful Kiosks that can give you directions
- A new use for natural language database queries (e.g. ask the computer what last quarter's net sales were)
- Voice controlled robots ("You missed a corner, vacuum cleaner")
- Data search by voice ("Find me a channel that plays Star Trek")

Any other cool ideas out there?

--
Javascript + Nintendo DSi = DSiCade
1. Re:1... million... DOLLARS!!! by theparanoidcynic · 2004-09-14 03:05 · Score: 5, Interesting
  
  Any other cool ideas out there?
  
  Universal language translators. Imagine headphones that let you understand any known language.
  
  --
  Only in a Slashdot fantasy can a Slackware install turn into several hours of sex . . . . .
History.. by SillyNickName4me · 2004-09-14 03:07 · Score: 4, Interesting

From 1994 to 1998 I did marketing and technical support for IBM's VoiceType Dictation products.

Initially, doing anything beyond understanding a few words took special hardware, but after a bit of 'training', highly accurate and fast speech-to-text was quite possible with a specially developed DSP.

Then the Pentium-class CPUs came about, and a P90 could just do the whole thing without the DSP.

So, now someone is developing a new dedicated piece of silicon for this. Let's see how long it takes for general-purpose computers to catch up.

The issue is not that this is not useful, but that it either has to keep developing, or offer a longer-lasting price/performance advantage or much better features for a long time to come.
Yay! Boo! Uh... Oh bugger.... by MooseByte · 2004-09-14 03:08 · Score: 4, Interesting

From the blog: "Homeland security applications are the big reason we were chosen for this award," says Rutenbar. "Imagine if an emergency responder could query a critical online database with voice alone, without returning to a vehicle, in a noisy and dangerous environment. The possibilities are endless."
Like some slight tweaking in order to deploy massive voiceprint-recognition silicon arrays for amazingly efficient automatic realtime conversation transcription and identity determination, attached to Echelon.

So cool... so potentially evil... head begins to hurt... tinfoil hat burning....
Pretty Ambitious, Harder than it sounds by Anonymous Coward · 2004-09-14 03:12 · Score: 5, Interesting

Although $1 million can speed things up significantly, this is a pretty ambitious undertaking.

My Master's research was on implementing machine learning in hardware, specifically support vector machines.

Now, they have much more money than I did, and probably this will be a collaboration involving many graduate students, but converting complex algorithms from software to hardware is no easy task.

It is just easier to do things in software; that's why it has evolved. The modular layers of abstraction allow a computer scientist working in machine learning or speech recognition to not have to worry about how the underlying hardware works.

Working in hardware, you come face to face with a lot of these issues, particularly since you want an architecture on a chip, whereas in a conventional desktop/server system resources such as lots of RAM, hard drive space, etc. are available and their interconnections have been built and refined over decades.

Throw in concerns about small form factor and low power consumption, and quite fast a lot of unexpected hurdles pop up.

My master's research goal was to produce a data mining/machine learning machine, or at the very least a data mining/machine learning co-processor. In retrospect, that was a very ambitious goal that would require many years of work, probably in collaboration with other graduate students.

What I ended up doing was just Support Vector Machines in digital hardware. Now granted, there is another aspect to my research that I'm not mentioning here, namely that I didn't use normal floating-point mathematical architectures, but a different, innovative logarithm-based mathematical architecture. That in itself was a significant undertaking.
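
For anyone who hasn't seen logarithm-based arithmetic before, here is a rough Python sketch of the general idea, assuming a textbook-style logarithmic number system: each value is stored as a sign plus log2 of its magnitude, so a multiply becomes an add, while an add needs a correction term (usually a lookup table in hardware). The function names are made up for illustration, zero handling is left out, and this is not a description of the actual design mentioned above. The dot product at the end is the kind of multiply-accumulate that an SVM's linear kernel is built from.

```python
import math

def to_lns(x):
    """Encode a nonzero real as (sign, log2|x|)."""
    if x == 0:
        raise ValueError("zero needs a special flag in a real LNS; omitted in this sketch")
    return (1 if x > 0 else -1, math.log2(abs(x)))

def from_lns(v):
    """Decode (sign, exponent) back to an ordinary float."""
    sign, e = v
    return sign * 2.0 ** e

def lns_mul(a, b):
    # Multiplication is cheap: multiply the signs, add the log magnitudes.
    return (a[0] * b[0], a[1] + b[1])

def lns_add(a, b):
    # Addition is the expensive part:
    # log2(2^x +/- 2^y) = x + log2(1 +/- 2^(y - x)), with x the larger exponent.
    (sa, ea), (sb, eb) = a, b
    if ea < eb:
        (sa, ea), (sb, eb) = (sb, eb), (sa, ea)
    d = eb - ea  # always <= 0
    if sa == sb:
        return (sa, ea + math.log2(1.0 + 2.0 ** d))
    if d == 0:
        raise ValueError("exact cancellation gives zero, which this sketch cannot encode")
    # Opposite signs: the result takes the sign of the larger magnitude.
    return (sa, ea + math.log2(1.0 - 2.0 ** d))

def lns_dot(xs, ys):
    """Dot product computed entirely in the log domain, decoded at the end."""
    terms = [lns_mul(to_lns(x), to_lns(y)) for x, y in zip(xs, ys)]
    acc = terms[0]
    for t in terms[1:]:
        acc = lns_add(acc, t)
    return from_lns(acc)

if __name__ == "__main__":
    print(lns_dot([1.5, -2.0, 4.0], [2.0, 0.5, 0.25]))
```

The trade-off this illustrates: multipliers, which are large in silicon, turn into adders, but every addition now needs the log2(1 +/- 2^d) correction, which is where much of the design effort in such an architecture would presumably go.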

In any case, this sounds like a great project. I just wonder how much they can do in their (in an academic sense) very small time frame of 2-3 years, even though a lot of preliminary work has probably already been done just to apply for the grant.

In any case, it is great to see something like this, something to keep in mind in case I ever go back for a Ph.D.