Speech Recognition in Silicon
Ben Sullivan writes "NSF-funded researchers are working to develop a silicon-based approach to speech recognition. "The goal is to create a radically new and efficient silicon chip architecture that only does speech recognition, but does this 100 to 1,000 times more efficiently than a conventional computer." Good use of $1 million?"
My friend and I were talking about this. In more totalitarian countries, it could be used to root out "dangerous people."
www.geocities.com/James_Sager_PA
God spoke to me.
100 to 1000 times more efficient worth $1M? meh. maybe.
100 to 1000 times more accurate worth $1M? definitely.
Damned straight it is! In government terms, that's a pittance. In government-funded science terms, it's downright INFINITESIMAL. It isn't even couch change, it's more like the stale pretzel under the couch cushion.
But, of course, cue the armchair blogging fanatics without a formal science education, waxing poetic about the infinite power and glory of x86 hardware running clever open source software. Maybe we could do it in perl!
I'm curious to see if their research will improve Natural Language Queries, as opposed to just improving speech recognition. There is an important difference between having to say: SELECT name FROM users WHERE id=12345 and saying: Pull up the name of employee number 12345.
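To make that difference concrete, here is a minimal sketch of the gap between the two phrasings, assuming the speech has already been turned into text. The regular expression, the column whitelist, the `to_query` helper, and the sample row are all hypothetical; a real natural language front end would need far more than one pattern.

```python
import re
import sqlite3

# Hypothetical pattern: it only understands requests shaped like
# "Pull up the name of employee number 12345".
REQUEST = re.compile(
    r"pull up the (?P<column>\w+) of employee number (?P<id>\d+)",
    re.IGNORECASE,
)

# Whitelist of columns, so recognized text is never pasted into SQL directly.
ALLOWED_COLUMNS = {"name"}


def to_query(utterance):
    """Translate one recognized sentence into (sql, params), or None."""
    match = REQUEST.search(utterance)
    if match is None:
        return None
    column = match.group("column").lower()
    if column not in ALLOWED_COLUMNS:
        return None
    return f"SELECT {column} FROM users WHERE id = ?", (int(match.group("id")),)


def demo():
    """Run the translated query against an in-memory table of made-up data."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("INSERT INTO users VALUES (12345, 'Example Employee')")
    sql, params = to_query("Pull up the name of employee number 12345.")
    return db.execute(sql, params).fetchone()[0]
```

Anything that does not match the one pattern returns None, which is roughly the point: getting the words right is only half the job.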
-dave
http://millionnumbers.com/ - own the number of your dreams
I once did a lot of work with speech recognition software, having a former significant other who was disabled. I tested a number of programs, and found the biggest problem to be the wide variance in users' dialects. The programs all have to be trained initially to recognize a single user's voice. This means that a program trained for a Bostonian may not work for someone from Arkansas, Texas, or Louisiana. Also, the programs' effectiveness decreased over time if you did not use them regularly.
I don't know how possible it will be to make a program that can recognize all English users. Will someone who speaks Oxford English be recognized as well as a surfer from California? I doubt it.
Never look down your nose at others. Someday, someone is bound to see your boogers.
This seems like a situation where a hardware-accelerated approach is pretty sensible. I'm guessing there is a large amount of signal processing involved in speech recognition. It probably helps greatly to offload some of that onto a dedicated chip like this, in the same way that GPUs are used on graphics cards. The only problem I can see is that there might not be much of a market for it. GPUs have an obvious market (games), but there is less demand for speech processing. Star Trek-style interfaces are nice to dream of, but for most common tasks a keyboard and mouse will probably give you a faster and more accurate interface.
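For a rough feel of the kind of signal processing being guessed at here, this is a toy sketch of per-frame frequency analysis, a step that speech front ends commonly build on. It is not what the researchers' chip does; the function names, frame size, and step size are all assumptions picked for illustration, and the DFT is the naive quadratic version rather than an FFT.

```python
import cmath
import math


def frames(samples, size, step):
    """Split samples into overlapping frames of `size`, advancing by `step`."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, step)]


def magnitude_spectrum(frame):
    """Naive O(n^2) DFT magnitudes for the lower half of the frequency bins."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def dominant_bins(samples, size=64, step=32):
    """Return the index of the strongest frequency bin in each frame."""
    result = []
    for frame in frames(samples, size, step):
        spectrum = magnitude_spectrum(frame)
        result.append(spectrum.index(max(spectrum)))
    return result
```

Even this toy version does thousands of multiply-adds per frame, and it has to keep up with the audio as it arrives, which is the sort of repetitive arithmetic that dedicated silicon tends to be good at.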
gmail invite
- Voice controlled robots ("You missed a corner, vacuum cleaner")
- Data search by voice ("Find me a channel that plays Star Trek")
Kinda jumping ahead of yourself, aren't you? There are two steps to operations like these: speech to text, and understanding the text you get out. Speech recognition gives you the first part, but you still have to be able to pull apart the sentence and figure out what it means.
Also, the article didn't say more accurate than software, it said more efficient. You know, uses less power and stuff like that? If the applications you mention (like search via voice) were possible/usable, you could run them today on an upper-end PC no problem.
I want to sing the general tone of a song I heard on the radio into a microphone and have Google direct me to that album on Froogle.
THAT would be awesome!
I work on product X and think of all the possibilities (list slightly feasible but most likely never going to happen features).
If this is really true what they're saying then people should put tons more money into product X!
Actually, use of speech recognition technology to index video clips for search engines _is_ both a very desirable technology, and something that can be done fairly easily (most professionally produced video, at least, takes great pains to have one speaker at a time and keep noise to a minimum). There's a fair bit of video content accessible via the web right now, and this will only increase (most new digital cameras can take video clips now - remember how quickly still pictures flooded the web when digicams first became available?).
Speech recognition technology has trouble when it's trying to sort out a noisy environment or a degraded communications channel, and has trouble holding useful open-ended conversations (as opposed to task-driven), but it's very capable in most other contexts. After all, the field has been under study for decades.
In summary, your mocking of the parent post is premature.
Making quantum leaps in speech recognition has tremendous potential for the deaf and hard of hearing (I am the latter).
Imagine being in a meeting (almost always a problem for hearing impaired people) and having real-time subtitles.
$1 million is a TINY price considering upwards of 20% of the nation has some hearing loss and hearing aids cost on the order of $4000 a pair.
A year spent in artificial intelligence is enough to make one believe in God.