Slashdot Mirror


Speech Recognition in Silicon

Ben Sullivan writes "NSF-funded researchers are working to develop a silicon-based approach to speech recognition. 'The goal is to create a radically new and efficient silicon chip architecture that only does speech recognition, but does this 100 to 1,000 times more efficiently than a conventional computer.' Good use of $1 million?"

328 comments

  1. Funny... by leonmergen · · Score: 5, Interesting
    Funny, I work on a speech recognition research project, and well, I have to say, think about all the possibilities... automated speech2text recording of meetings, on-the-fly subtitling of live TV shows, but it can get better: think about searching multimedia files in a google-kind of way based on audio, that automatically directs you to that part of the file where you want to be...

    If this really is true what they're saying, and knowing how much money is invested in speech recognition research on a yearly basis, yeah, I would definitely say that this is one million dollars of great investment...

    ... but then again, maybe they're just throwing around with numbers to make sure they get their money. :)

    --
    - Leon Mergen
    http://www.solatis.com
    1. Re:Funny... by strictfoo · · Score: 2, Funny

      I work on product X and think of all the possibilities (list slightly feasible but most likely never going to happen features).

      If this is really true what they're saying then people should put tons more money into product X!

      But then again maybe I'm just talking up product X to make sure I get my money :)

      --
      I've just signed legislation that'll outlaw Russia forever. We'll begin bombing in five minutes.
    2. Re:Funny... by Anonymous Coward · · Score: 0

      I'll finally be able to yell back at my TV and be heard.

    3. Re:Funny... by Anonymous Coward · · Score: 0

      "searching multimedia files in a google-kind of way" How would that be google-kind? The only thing Google does over most search engines is page rank. How would this speech recognition search engine do anything that was google-kind, given that the only thing that marks out Google, and therefore the definition of google-kind, is the page rank? Go on. I would like an actual answer. This isn't a rhetorical question.

    4. Re:Funny... by leonmergen · · Score: 1
      Ah sorry, I should've said "internet search engine"-kind of way, instead of for example the windows file search...

      So you will need to index the files prior to being able to search them.
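
      A rough sketch of that indexing step, assuming a recognizer has already turned each file into (word, offset-in-seconds) pairs. The file names, transcripts and function names below are made up for illustration; this is not how any particular search engine does it.

```python
from collections import defaultdict

def build_index(transcripts):
    """Map each lowercased word to a list of (file, seconds) positions."""
    index = defaultdict(list)
    for filename, words in transcripts.items():
        for word, seconds in words:
            index[word.lower()].append((filename, seconds))
    return index

def search(index, query):
    """Return the positions where the query word was spoken."""
    return index.get(query.lower(), [])

# Hypothetical recognizer output: (word, offset in seconds) per file.
transcripts = {
    "meeting.wav": [("budget", 12.5), ("review", 13.1), ("deadline", 47.0)],
    "lecture.wav": [("fourier", 5.0), ("transform", 5.4), ("budget", 90.2)],
}

index = build_index(transcripts)
print(search(index, "Budget"))
```

      The index is built once up front; each search is then a dictionary lookup that returns the file and the offset to jump to.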

      --
      - Leon Mergen
      http://www.solatis.com
    5. Re:Funny... by loginx · · Score: 3, Insightful

      I want to sing the general tone of a song I heard on the radio in a microphone and have google direct me to that album on froogle.

      THAT would be awesome!

    6. Re:Funny... by Chess_the_cat · · Score: 1
      Ah sorry, I should've said "internet search engine"-kind of way, instead of for example the windows file search...
      So you will need to index the files prior to being able to search them.

      What? Search engines index webpages prior to being able to search them. Did you think that Google was reading billions of pages in real time before returning a result?

      --
      Support the First Amendment. Read at -1
    7. Re:Funny... by TFGeditor · · Score: 0

      Funnier still...

      A few weeks ago, I had a conversation with my son-in-law (IBM engineer) about the next quantum leap in computer technology. I said it would be in the area of speech recognition.

      I love it when I am right.

      --
      Ignorance is curable, stupid is forever.
    8. Re:Funny... by tubbtubb · · Score: 2, Interesting

      My understanding of speech recognition is minimal, but from what I understand the meat of this chip would probably just be a floating point SIMD engine to do FFTs, and some comparison and control logic.

      I'm wondering if you could just do this with your average ATI or Nvidia 3D chip and an FPGA wrapper?
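
      For what the FFT part of such a front end might look like, here is a minimal pure-Python sketch: one Hann-windowed frame goes through a radix-2 FFT and the strongest frequency bin is reported. The sample rate, frame size and test tone are assumptions chosen for illustration, and the comparison and control logic is not shown.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def frame_spectrum(frame):
    """Magnitude spectrum of one Hann-windowed frame (first half of the bins)."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(frame)]
    return [abs(c) for c in fft(windowed)[: n // 2]]

# Assumed test signal: a 1 kHz tone sampled at 8 kHz, one 256-sample frame.
SAMPLE_RATE = 8000
FRAME_SIZE = 256
frame = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(FRAME_SIZE)]

spectrum = frame_spectrum(frame)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(peak_bin, peak_hz)
```

      The butterfly loop is the same arithmetic for every bin, which is why it maps naturally onto SIMD or GPU-style hardware.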

    9. Re:Funny... by syukton · · Score: 3, Interesting

      From what you describe, it isn't so much a speech recognition thing as it is a sound recognition thing; essentially, a way for a computer to logically distinguish between many millions of different sounds.

      How far away are we from having a machine that could identify all of the instruments in a piece of music by "listening" to the music? I say "listening" because there need not physically be a playback-and-listen, the playback could be mathematically modeled by the computer.

      --
      Reinvent the wheel only at either a lower cost, greater effectiveness, or your own personal enrichment and satisfaction.
    10. Re:Funny... by Anonymous Coward · · Score: 1, Interesting

      In the UK there is something similar, called Shazam, which works surprisingly well.

    11. Re:Funny... by richy freeway · · Score: 3, Interesting

      We have something like that in the UK called Shazam.

      Just dial a number on your mobile phone, hold it up to the speaker while the tune you want ID'd is playing and it'll SMS you back shortly with the track name and artist. You can then log onto the Shazam website, enter in your mobile number and you get a list of all the tracks you've searched for along with links to an Amazon search so you can purchase the track.

      Pretty good for ID'ing tracks when you're in a club and can't get to the DJ to hassle him. :P

    12. Re:Funny... by ChefInnocent · · Score: 1

      Perhaps, but it will still be the case that nobody is listening.

    13. Re:Funny... by Christopher Thomas · · Score: 4, Insightful

      I work on product X and think of all the possibilities (list slightly feasible but most likely never going to happen features).

      If this is really true what they're saying then people should put tons more money into product X!


      Actually, use of speech recognition technology to index video clips for search engines _is_ both a very desirable technology, and something that can be done fairly easily (most professionally produced video, at least, takes great pains to have one speaker at a time and keep noise to a minimum). There's a fair bit of video content accessible via the web right now, and this will only increase (most new digital cameras can take video clips now - remember how quickly still pictures flooded the web when digicams first became available?).

      Speech recognition technology has trouble when it's trying to sort out a noisy environment or a degraded communications channel, and has trouble holding useful open-ended conversations (as opposed to task-driven), but it's very capable in most other contexts. After all, the field has been under study for decades.

      In summary, your mocking of the parent post is premature.

    14. Re:Funny... by Anonymous Coward · · Score: 0

      But then again maybe I'm just talking up product X to make sure I get my money :)

      Yes. That's why investors don't write cheques for everyone who has a project, but do write cheques for some. The original comment indicated exactly how beneficial this particular project is and showed that the investment was worth it:

      Speech recognition is a generic input device and can have a widespread effect on computing. This project isn't much different from advances made in video card technologies, like when the first specialized 3D rendering chips were developed. The development of speech recognition chips might well produce a multi-billion dollar industry.

      If you're having funding-envy, you too can try and show the investors the benefits of product X and if they feel it's worth it, they might give you some seed money too. One thing's for sure, acting like a wise-ass has never been endearing to investors...

    15. Re:Funny... by bestguruever · · Score: 1

      This may happen sooner than you think. My girlfriend is participating in a limited beta of a service that does just that. I was watching her yesterday and it kept returning just one result - a sound clip of a cat being tortured. Seems pretty accurate to me.

      --
      if you think this is bad, you should have seen my last sig
    16. Re:Funny... by strictfoo · · Score: 1

      I was just trying to poke fun at all the people who post on slashdot now in slight astroturfing mode. "The company I work for make product X! It'll save the world!"

      Personally I think this type of hardware is something sound card manufacturers should be working on. Seems like the logical place to put it, since all sound must flow through there, typically in both digital and analog forms.

      --
      I've just signed legislation that'll outlaw Russia forever. We'll begin bombing in five minutes.
    17. Re:Funny... by syrinx · · Score: 1

      I've been wanting that for years. That *would* be awesome.

      --
      Quidquid latine dictum sit, altum sonatur.
    18. Re:Funny... by MrScience · · Score: 1

      There exists a service that provides searchable TV shows... based on closed caption information.

      --

      You quitting proves that the karma kap worked. The most annoying of the whores shut up. --CmdrTaco

    19. Re:Funny... by Anonymous Coward · · Score: 0

      Actually one of the founders of Shazam is now working with Google... funny that.

    20. Re:Funny... by Anonymous Coward · · Score: 0

      Well aren't you just Mr. Prophecy.

    21. Re:Funny... by Bender_ · · Score: 1


      Erm... they're just spending a lot of money; it won't necessarily have the expected results.

      Besides, I have already seen dedicated speech recognition chips at a company presentation years ago.

    22. Re:Funny... by override11 · · Score: 1

      wow, your spelling and grammer looks like it was done with this year's model of speech recognition...

      --
      No I didnt spell check this post...
    23. Re:Funny... by Shotgun · · Score: 1

      Yes, but the professional videos you talk about emphasize the "YES! YES! DEEPER!" because that is the only voice that anyone really wants to hear. It's already easy to find those videos using the 'adult' keyword. Now trying to generalize past the geek view of the world...

      8*)

      --
      Aah, change is good. -- Rafiki
      Yeah, but it ain't easy. -- Simba
    24. Re:Funny... by Anonymous Coward · · Score: 1, Interesting

      It is speech recognition because they are trying to recognize a smaller subset of spoken syllables. Actually, I think it is half-syllables. There are apparently several hundred of these (complicated a little by dialects/Bush obviously).

    25. Re:Funny... by cybpunks3 · · Score: 1

      You index video quite easily by parsing closed captioning. No speech recognition necessary.

    26. Re:Funny... by Christopher Thomas · · Score: 1

      You index video quite easily by parsing closed captioning. No speech recognition necessary.

      Only for video clips with closed captioning, that were saved in a form that retained that captioning.

      The flood of user-produced video clips that the digital photo flood tells me to expect certainly won't have captioning. The various university-tutorial lectures that various profs have put online don't have captioning. Similar video clips produced in-house for whatever purpose (training, demonstration, PR, etc) are unlikely to have captioning.

      In summary, I do not expect this feature to be available for the majority of online video clips needing indexing.

    27. Re:Funny... by TFGeditor · · Score: 2, Funny

      Yeah, well, when you are an outdated nerd you have to get your kicks somewhere.

      --
      Ignorance is curable, stupid is forever.
    28. Re:Funny... by ReelOddeeo · · Score: 1

      I want to sing the general tone of a song I heard on the radio in a microphone and have google direct me to that album on froogle.

      THAT would be awesome!



      I want to sing the general tone of a song that I heard on the radio in a microphone and have Kazaa direct me to that album on their download network.

      That would be awesome!

      --

      Those who would give up liberty in exchange for security and DRM should switch to Microsoft Palladium!
    29. Re:Funny... by SAPHRguru · · Score: 1

      They're looking for phonemes... I remember a crappy little chip that we programmed in college to do 'speech generation' -- it had a digitized phoneme table, and you made it speak by sending it the code for the phoneme and the duration (in ticks) L/AW2/ON/NG/ or S/SH/AW/UR/T Best fun was making it try to read the project report that we had to write all about the impressive project!
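
      As a rough illustration of that code-plus-duration scheme: the phoneme names below come from the examples above, but the numeric codes, the "NAME:ticks" notation and the default duration are all invented here and match no real chip.

```python
# Hypothetical phoneme table: names are from the comment above, the numeric
# codes are invented for illustration only.
PHONEME_CODES = {"L": 0x01, "AW2": 0x02, "ON": 0x03, "NG": 0x04,
                 "S": 0x05, "SH": 0x06, "AW": 0x07, "UR": 0x08, "T": 0x09}

def encode_utterance(phonemes, default_ticks=4):
    """Turn 'L/AW2:8/ON/NG' into a byte string of (code, duration) pairs."""
    out = bytearray()
    for item in phonemes.split("/"):
        # An optional ':ticks' suffix overrides the default duration.
        name, _, ticks = item.partition(":")
        out.append(PHONEME_CODES[name])
        out.append(int(ticks) if ticks else default_ticks)
    return bytes(out)

command = encode_utterance("L/AW2:8/ON/NG")
print(list(command))
```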

  2. 1... million... DOLLARS!!! by AKAImBatman · · Score: 5, Interesting

    Good use of $1 million?

    Let me think for a moment... Hell yeah! If we had low power speech processors, the possibilities would be endless. For one, we'd finally have a Star Trek(TM) interface for our homes!

    "Computer, lights!"
    "Computer, make coffee!"
    "Computer, Earl Grey, hot!"

    As silly as it may sound, such an interface would be far more efficient than mashing buttons.

    In addition, blind people could be significantly helped by this. Many of them already use speech recognition and synthesis to assist in computer usage. Imagine if their computers could suddenly understand them a thousand times better? They could talk to their computers a bit more naturally, thus saving their vocal cords from undue stress.

    Other applications (off the top of my head) are:

    - Voice notes on embedded devices (store only text!)
    - Helpful Kiosks that can give you directions
    - A new use for natural language database queries (i.e. Ask the computer what last quarter's net sales were.)
    - Voice controlled robots ("You missed a corner, vacuum cleaner")
    - Data search by voice ("Find me a channel that plays Star Trek")

    Any other cool ideas out there?

    1. Re:1... million... DOLLARS!!! by savagedome · · Score: 2, Funny

      Any other cool ideas out there?

      Yes.

      Peter Gibbons : What would you do if you had a million dollars?
      Lawrence : I'll tell you what I'd do, man, two chicks at the same time, man.
      Peter Gibbons : That's it? If you had a million dollars, you'd do two chicks at the same time?
      Lawrence : Damn straight. I always wanted to do that, man. And I think if I had a million dollars I could hook that up, cause chicks dig a dude with money.
      Peter Gibbons : Well, not all chicks.
      Lawrence : Well the kind of chicks that'd double up on a dude like me do.
      Peter Gibbons : Good point.

    2. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      > "Computer, Earl Grey, hot!"

      Hmmm.... Somehow this made me think about "It's not 'something-something a Space Odyssey' - it's 'two-thousand...AAAAAAAAAAAAAAAAARRRRRGH'"...

      (hint: very first episode of the Dilbert TV series; shower scene)

    3. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      I'd be happy if right now I could type "Computer, Earl Grey, no lemon, 10 sugarcubes" and make it happen. Anyone have linkage to some FOSS that does that?

    4. Re:1... million... DOLLARS!!! by theparanoidcynic · · Score: 5, Interesting

      Any other cool ideas out there?

      Universal language translators. Imagine headphones that let you understand any known language.

      --
      Only in a Slashdot fantasy can a Slackware install turn into several hours of sex . . . . .
    5. Re:1... million... DOLLARS!!! by AKAImBatman · · Score: 1

      Ooo! That's a good one. Even if it sounded like Babelfish, it would still be better than keying words into those handheld translators.

    6. Re:1... million... DOLLARS!!! by koa · · Score: 1

      Make yer own:

      http://www.phidgets.com :)

      --
      ....move along....nothing to see here....
    7. Re:1... million... DOLLARS!!! by randombit · · Score: 3, Insightful


      - Voice controlled robots ("You missed a corner, vacuum cleaner")
      - Data search by voice ("Find me a channel that plays Star Trek")


      Kinda jumping ahead of yourself, aren't you? There are two steps to an operation like these, speech to text, and understanding the text you get out. Speech recognition gives you the first part, but you still have to be able to pull apart the sentence and figure out what it means.

      Also, the article didn't say more accurate than software, it said more efficient. You know, uses less power and stuff like that? If the applications you mention (like search via voice) were possible/usable, you could run them today on an upper-end PC no problem.

    8. Re:1... million... DOLLARS!!! by superstick58 · · Score: 1
      I think one of the best places to use speech recognition is in the car. There are already many devices that use this, like Onstar, but the interface is slow and buggy. If you could say "Climate Control, 70 degrees", and other commands, it would free up your hands for actually driving and lower distraction.

      In addition, you would get less dash clutter and not have to rely on complicated menu navigation in things like the iDrive. Voice recognition is a great way to centralize the operation of many functions into one controller, the human voice.

    9. Re:1... million... DOLLARS!!! by Khomar · · Score: 1
      - Helpful Kiosks that can give you directions
      - A new use for natural language database queries (i.e. Ask the computer what last quarter's net sales were.)
      - Voice controlled robots ("You missed a corner, vacuum cleaner")
      - Data search by voice ("Find me a channel that plays Star Trek")

      While I agree that this is a great investment, voice recognition does not equal artificial intelligence. Even if the computer is able to tell that you spoke the words "what were last quarter's net sales", it would not know what that meant without some configuration (create a "last quarter's net sales" report). Your other ideas were far closer to reality (helping the blind, turning on/off lights, etc).

      That said, this technology would bring us closer to a Star Trek world, but a lot of work needs to be done on language parsing and artificial intelligence for that gap to be closed.

      --

      I believe in de-evolution. God made the world perfect, man fell, and its been going downhill ever since!

    10. Re:1... million... DOLLARS!!! by bytesmythe · · Score: 1

      The article mentions speech recognition, but not comprehension. You cannot take pure recognition and immediately make a superhelpful information kiosk or natural language query system out of it.

      Such an informational kiosk could be made just as easily with current speech recognition technology considering how limited the interface would have to be. (A handful of phrases, such as "I'm lost", then replying to a voice prompt with the location you're looking for, at which point the computer can do a quick lookup on mapquest and read you the directions. Nothing a good couple of developers couldn't hammer out in a few weeks.)

      The new tech research seems to simply be a way of taking current capabilities and moving them from software into hardware, which provides some speed and mobility gains, but no new functionality.

      Considering the size such a recognition device might be, perhaps they could drop a chip in my remote key fob that understands the phrase "Where the F*** are my ****** keys?!?!?", at which point the device will chirp, or possibly play an insulting message questioning my heritage or legitimacy.

      --
      bytesmythe
      Hypocrisy is the resin that holds the plywood of society together.
      -- Scott Meyer
    11. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      Now, combine the first set of messages with the last two.
      You: Find me a channel that plays Star Trek.
      *tv turns on an episode of Star Trek*
      TV: Computer, coffee, hot.
      *vacuum cleaner pours coffee on the carpet*
      Vacuum Cleaner: Beep.
      You: Vacuum Cleaner, you don't make coffee. Clean it up.
      Vacuum Cleaner: Beep.
      *vacuum cleaner tries to clean the carpet, sucks in some hot water, and shorts out*
      Vacuum Cleaner: Mein Leben!

    12. Re:1... million... DOLLARS!!! by AKAImBatman · · Score: 2, Interesting

      It's not that hard. Have you ever seen those automatic coffee machines? i.e. Put a few quarters in, then punch a bunch of "options" buttons. A cup drops down, and fills with coffee, cream, sugar, and any other options offered by the machine.

      The same could be done with tea. Just keep a reservoir of hot water, a stack of tea bags, cubes of sugar, and refrigerated lemons. When you order tea, the machine would inject the bag into the hot water stream, then drop the sugar and lemon into the tea.

      Voila, Earl Grey, hot! ;-)

    13. Re:1... million... DOLLARS!!! by ViolentGreen · · Score: 1

      I think you hit it on the head.

      Any other cool ideas out there?

      Some specific ideas off the top of my head:
      - Navigation systems in cars
      - Decent automated phone system
      - Microwave Ovens (tell it to cook two baked potatoes)
      - PDA calendar entries.

      --
      Not everything is analogous to cars. Car analogies rarely work.
    14. Re:1... million... DOLLARS!!! by AKAImBatman · · Score: 2, Interesting

      Ah hah! Found one!

    15. Re:1... million... DOLLARS!!! by AKAImBatman · · Score: 1

      I don't think you understand. Natural Language Interfaces already exist for SQL databases. Their biggest limitation is that they need quite a bit of meta data about your data structure in order to properly parse the queries. But once the meta data has been added, the computer should be capable of answering most questions about your data.

      It's not really useful for development work, but it can come in handy for allowing data requests from executives..

    16. Re:1... million... DOLLARS!!! by maxwell demon · · Score: 1, Funny

      It could relieve me from having to type my password ... oh, wait ...

      --
      The Tao of math: The numbers you can count are not the real numbers.
    17. Re:1... million... DOLLARS!!! by iabervon · · Score: 1

      Actually, voice is terrible for controlling anything that doesn't talk back, and pretty bad for anything without a large amount of common sense (i.e., unsolved AI problem). There just isn't enough information in speech to react at all appropriately to it without a very good understanding of context, and you generally can't express unscripted ideas without dialogue.

      On the other hand, there's a lot of information currently available as speech which could be managed more usefully if transcribed automatically. I think the best use is a system which transcribes voice notes, which you can then clean up later (or just treat as rough notes anyway).

    18. Re:1... million... DOLLARS!!! by frank_adrian314159 · · Score: 3, Informative
      There are two steps to an operation like these, speech to text, and understanding the text you get out. Speech recognition gives you the first part, but you still have to be able to pull apart the sentence and figure out what it means.

      In fact, converting the speech to text and then trying to analyze the text without sound-level annotations might give bad results, as tonal or emotional content would be lost. You need both simultaneously to really understand what's being said.

      --
      That is all.
    19. Re:1... million... DOLLARS!!! by D-Cypell · · Score: 1

      It surely wouldn't be a huge leap to store speech using conventions to retain the intonation.

      I'm sure there are already detailed standards on how this could be done.

    20. Re:1... million... DOLLARS!!! by waleg · · Score: 1

      What about the R&D work by IBM, Lotus and a few other companies on voice (talking-to-the-computer) projects? Why aren't they mentioned here?

      WALEG
      http://www.waleg.com/

    21. Re:1... million... DOLLARS!!! by richieb · · Score: 2, Funny
      Any other cool ideas out there?

      Walk into someone's office: "Computer! Format C:"

      --
      ...richie - It is a good day to code.
    22. Re:1... million... DOLLARS!!! by Lumpy · · Score: 1

      For one, we'd finally have a Star Trek(TM) interface for our homes!


      what you are looking for is called misterhouse

      It interfaces with the IBM ViaVoice apps as well as other items to give you what you want.

      it's not perfect as voice recognition is only slightly better than it was in the late 80's but it's what you are asking for.

      and AMX/Panja has a turn-key system that you can buy and have installed for around $50,000.00 that also can do what you want. I saw a demo of that system last weekend in a home around here. It would only recognize the owners reliably (same as most VR systems) but did recognize my voice on some basic commands.

      --
      Do not look at laser with remaining good eye.
    23. Re:1... million... DOLLARS!!! by Paladine97 · · Score: 1

      Ah yes this is Professor Frink, floygan.

      As any self-respecting Star Trek nerd will know, the Captain orders his tea like this: "Tea, Earl Grey, hot!". Alas you have forgotten the initial "Tea!" floygan smoygan.

    24. Re:1... million... DOLLARS!!! by salec · · Score: 1

      Every TV commercial starts with: "TV, max volume!" :-)

      I am sure that in the long run:
      1) All the "little languages" (mine too) will die, as everyone will be speaking to household devices in English (well, sorta...), starting at a very young age.
      2) All people of this planet's technosphere will speak English, BUT they will also develop funny accents and diction, especially when "value" products get developed in some of the "cheap white-collar paradises". I can already hear today, from talking toys, what our future generations will sound like tomorrow.

    25. Re:1... million... DOLLARS!!! by soltarusprime · · Score: 1

      Obligatory HHGTTG paraphrase: No doubt it will dispense something just about but not completely like tea.

    26. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      You say even if as though that would be a bad thing.

    27. Re:1... million... DOLLARS!!! by Sir dies alot · · Score: 2, Informative

      Actually they are one and the same; it is possible to determine what something means using today's voice recog. (I've got a setup that controls my entertainment center and lights in my apartment through voice recog.) However, it is wildly inefficient and difficult to set up. The reason is that the English language is just about the most illogical system on the planet, and computers only understand logic.

      Due to the limited scope of my setup, I only had to record about 20-40 words/phrases and reference them differently in a database. Then you speak, and it gets each word and follows a tree-like structure, jumping from each word to the next until it gets to the end. Any word not understood is simply filtered out as useless. When it reaches the "leaf" in the tree, it has a command which it sends out the preconfigured port. Not a beautiful system, but it works fairly well.

      If they make the ability to recognize speech much more efficient, that means all the processing power that was being used to simultaneously decode and translate speech can be used to understand the speech. This is an immediate boost in power, and then it just takes some good algorithms to be made in order for these inventions to become a plausible reality.

      Also, the reference about using a high-end PC to do this is true if that's not all it is doing. If you use a mid-range PC solely for voice processing, it should work just as well. (Mine is running using spare processing time on my Athlon 64 3400+ with 2GB RAM, but I would assume that you could use a slower system if you weren't doing anything else on it.)
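
      The word tree described above could be sketched roughly like this. The vocabulary, the port command strings and the function names are hypothetical; the actual setup is not shown in the comment, so treat this as one possible reading of it.

```python
def build_tree(commands):
    """Build a nested-dict word tree; each leaf stores the command to send."""
    root = {}
    for phrase, action in commands.items():
        node = root
        for word in phrase.split():
            node = node.setdefault(word, {})
        node["_action"] = action
    return root

def interpret(tree, utterance):
    """Walk the tree word by word, ignoring words that do not fit."""
    node = tree
    for word in utterance.lower().split():
        if word in node:
            node = node[word]
    # Only a node that ends a known phrase carries an action.
    return node.get("_action")

# Hypothetical vocabulary and port commands.
COMMANDS = {
    "lights on": "PORT1:ON",
    "lights off": "PORT1:OFF",
    "tv volume up": "PORT2:VOL+",
}

tree = build_tree(COMMANDS)
print(interpret(tree, "please turn the lights off now"))
```

      Unknown words such as "please" and "now" fall through without moving the walk, which mirrors the "filtered out as useless" behaviour; an utterance that stops short of a leaf yields no command.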

      --
      The stupidity of your average American is just about the same as the average European, we simply show it off better.
    28. Re:1... million... DOLLARS!!! by Have Blue · · Score: 1

      Increased efficiency can easily lead to increased accuracy as it allows more expensive techniques to be used for the same cost.

    29. Re:1... million... DOLLARS!!! by Christopher Thomas · · Score: 1

      - Voice controlled robots ("You missed a corner, vacuum cleaner")

      - Data search by voice ("Find me a channel that plays Star Trek")


      Kinda jumping ahead of yourself, aren't you? There are two steps to an operation like these, speech to text, and understanding the text you get out. Speech recognition gives you the first part, but you still have to be able to pull apart the sentence and figure out what it means.

      While extracting full meaning is extremely difficult, extracting enough to get the job done when given context is much easier. Both of the tasks listed above are in the second category.

      If a vacuum cleaner is listening, all it has to hear is its name (to confirm that the sentence is directed at it), and hear "corner". Depending on whether it's actively vacuuming or on standby, it would have to decide on its own whether you wanted it to re-vacuum the corners, or just sit in one.

      If a search query program is listening, all it has to hear is "find" (to confirm that the sentence is directed at it), "channel", and "Star Trek". "channel" tells it to look at TV listings and radio channels. If its indexes of either pull up "Star Trek", it tells you that TV channels X, Y, and Z carry Star Trek, and prompts you for a follow-up query (e.g. "when does channel Z carry Star Trek", or "do any of the channels carry Star Trek between 6pm and 10pm?").
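
      A minimal sketch of the two keyword-spotting cases just described, assuming the recognizer hands over plain text. The state names, the listings table and the function names are invented for illustration.

```python
def parse_vacuum(utterance, state):
    """React only if addressed by name and the word 'corner' is heard."""
    words = set(utterance.lower().replace(",", " ").split())
    if not {"vacuum", "cleaner"} <= words or "corner" not in words:
        return None
    # Context decides the meaning: busy means redo corners, idle means park.
    return "revacuum_corners" if state == "vacuuming" else "park_in_corner"

def parse_channel_query(utterance, listings):
    """Spot 'find' and 'channel', then match known show titles in the text."""
    text = utterance.lower()
    if "find" not in text or "channel" not in text:
        return None
    return {show: channels for show, channels in listings.items()
            if show.lower() in text}

# Hypothetical TV listings index.
LISTINGS = {"Star Trek": ["X", "Y", "Z"], "Dilbert": ["Q"]}

print(parse_vacuum("You missed a corner, vacuum cleaner", "vacuuming"))
print(parse_channel_query("Find me a channel that plays Star Trek", LISTINGS))
```

      Neither function attempts to parse the sentence; each one only checks for a few trigger words and lets the device state or the listings table supply the rest.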

      This kind of technology is already in use today (airline reservation systems are one of the more established applications). User-directed queries of databases and directories are one of the big emerging applications.

      The only reason vacuum cleaner control isn't a big voice application yet, is that robot vacuum cleaner technology is still immature :) (though that's pretty close to being solved, with several interesting solution attempts already on the market).

      In summary, these tasks are easier than you think, as long as the problem domain is properly constrained.

    30. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      You can do that with technology that exists right now. I worked for a company 5 years ago that developed speech recognition products that would do exactly that. There were already several chips on the market that could do good speech recognition without training. I remember one of them was created by Motorola.

      The biggest problem we ran into with speech recognition was one of expectation. Users like speech recognition, but once they get it - they automatically tend to want the next step (AI).

    31. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      Nope, already been done, and it's not as bulky as a headphone: it's a little yellow fish, ya stick it in your ear. It works better than voice recognition too; it actually looks at the brain waves in the section of the brain where language comes from and deciphers it from there... Very clever little thing, enough to make you think there's a God after all...

    32. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      Also, the article didn't say more accurate than software, it said more efficient.

      But it takes efficiency to let developers create more sophisticated algorithms. Today only people who can afford a high-end PC can develop better speech recognition, and even then it might be too slow going to experiment a lot. But if the burden of the basic work was transferred to an efficient chip, then the developers can use the freed-up power of their PC to create and experiment with better algorithms, which will be transferred later to better chips, etc.

      This is how you are able to play Doom 3 on your home PC. 15 years ago, not only was that impossible, most of the rendering algorithms hadn't even been developed. The creation of 3D rendering chips made development of better graphics programs much much easier, and here we are today.

    33. Re:1... million... DOLLARS!!! by floki · · Score: 1

      Let me think for a moment... Hell yeah! If we had low power speech processors, the possibilities would be endless. For one, we'd finally have a Star Trek(TM) interface for our homes!

      "Computer, lights!"
      "Computer, make coffee!"
      "Computer, Earl Grey, hot!"


      A friend of mine is working at the research department of Audi. And guess what? They have fully functional speech recognition like the one you described.

      Okay, nothing like boiling coffee, but a/c and lights control, entering destinations for the navigation system, and controlling the radio and CD player are working perfectly. You even have to start a command with a phrase like "Computer" to attract the computer's attention.

      The future has already started :-)

      --
      from the to-stupid-for-words dept.
    34. Re:1... million... DOLLARS!!! by centauri · · Score: 2, Funny

      But by removing all barriers to communication between different races and cultures, such a device would cause more and bloodier wars than anything else in the history of creation.

      --
      Don't blame me, I voted for Durga.
    35. Re:1... million... DOLLARS!!! by mOdQuArK! · · Score: 1

      Actually, I would prefer the translated text to appear as subtitles in my field of vision (attached to the source of sound). Translation should also include OCR for any signs that I happen to be looking at.

      Combine that with a your-language-to-other speech-to-audio function (type in your sentence, either try & repeat funny sounds from headphones or let audience listen to speaker), you'd get tricked-out geeks wandering fearlessly in odd places all over the world, providing vast amounts of amusement for the locals.

    36. Re:1... million... DOLLARS!!! by aardvarkjoe · · Score: 1
      it is possible to determine what something means using today's voice recog.
      What you are describing is just a simple lookup. That's a long way from determining what an arbitrary sentence means, which is what the original poster is describing.
      --

      How can we continue to believe in a just universe and freedom to eat crackers if we have no ale?
    37. Re:1... million... DOLLARS!!! by brucmack · · Score: 1

      Note that an increase in efficiency doesn't automatically imply an increase in accuracy... Just because this might do speech recognition 100 to 1000 times faster than software doesn't mean it will be more accurate.

      Of course, the increased efficiency should allow for development of more complex algorithms that do increase accuracy, but that could take more time/money.

    38. Re:1... million... DOLLARS!!! by Simonetta · · Score: 1

      But by removing all barriers to communication between different races and cultures, such a device would cause more and bloodier wars than anything else in the history of creation.

      I believe that the statement above assumes that conflict is often minimized because the sides don't understand the verbal provocations from the other side. If you don't understand the 'fighting words' then it's not fighting words, it's just gibberish.

      A universal communicator would force people to put a lot more emphasis on analysing the words spoken before taking action. Plus most wars are fought for economic or population reasons; the insults are just triggers. In theory, universal language translators would increase financial interaction, which should work to decrease warfare.

      The current war between Islam and the USA is founded on two facts: one, the USA needs the Islamic oil more than anything, and two, the population of Islam is growing four times faster than the economies of Islam. Millions of Islamic young people enter the work force each year with no prospects ever for gainful employment. It's easy for the mullahs to convince them that the USA is responsible for the situation, and that killing Americans will improve the situation. Americans spend billions of dollars keeping backward corrupt Islamic regimes in place militarily.
      The Islamic war will end when the Americans develop a new low-cost energy source that doesn't consume large amounts of Islamic oil, and when Islam controls their population growth and reforms their economies. The only people who are doing anything in these directions are the Iranians, who have a massive and successful birth-control program in place, and the Malaysians, who are the only people in Islam to restructure their economy to modern times.

    39. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 1, Insightful

      Whoa! Not so fast! Voice RECOGNITION is one thing, UNDERSTANDING and translating is something different...

    40. Re:1... million... DOLLARS!!! by renderhead · · Score: 2, Funny

      That is good thought! The thing software which is the simple problem where existing translation that it is developed applies algorithm to speech of real time very is healthy! Gorgeousness!

      P.S. I used Babelfish for translating this post.

      --
      I wish that my inferiority complex were as good as yours.

      -RenderHead

    41. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      - A new use for natural language database queries (i.e. Ask the computer what last quarter's net sales were.)

      Get in the 21st century: query by example is long dead and I, for one, welcome our dba overlords.

    42. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      Not that those wouldn't be cool, but they'd require more effective machine translation than we have today (think Babelfish...)

    43. Re:1... million... DOLLARS!!! by SilkBD · · Score: 1

      Don't mistake illogical for complex and redundant. I'm sure you could program all the rules and exceptions to the rules for understanding English into a program. I'm willing to bet it's already been done...

      --
      00101010
    44. Re:1... million... DOLLARS!!! by aldousd666 · · Score: 1

      If you're going for adding things to your field of vision, why not overwrite the Japanese version of the sign with the text written in plain English? Surely they can project an image onto your retina to take care of that. Then there is no need for subtitles. Top this off with the audio translation playing the sound back of the translated words of someone speaking to you -- and in their own voice, not some quirky computer voice. Both of these things are doable with technology that already exists -- OCR, retinal projection, speech to text, translators like babel fish, and, well, speakers. The trouble is words have meaning -- not just syntax -- and often have different meanings (even vernacularized meanings) in different contexts, and converting them to text doesn't exactly allow for accurate translation -- like in babel fish. How would a translator program handle something like, "That's a lot of liquor mate, you'll tumble down the apples on the way out and bruise up your eggs." There needs to be a little progress in the natural language recognition process for this to be totally cool, but babel fish style should be enough to let us survive on the concept.

      --
      Speak for yourself.
    45. Re:1... million... DOLLARS!!! by Snarph · · Score: 1

      Yep. Without that, the computer might think your guest "Earl Grey" is too hot, and turn on the air conditioner.

      Or something.

    46. Re:1... million... DOLLARS!!! by aldousd666 · · Score: 1
      Yes, it does exist, I already know what you're thinking.

      retinal projection (PDF). So there..

      Figures that the military gets this first
      --
      Speak for yourself.
    47. Re:1... million... DOLLARS!!! by bhima · · Score: 1
      Yes!

      From the woman who I learned English from: "You can mash potatoes but not buttons"

      Ain't EASL great!

      --
      Nothing in the world is more dangerous than sincere ignorance and conscientious stupidity.
    48. Re:1... million... DOLLARS!!! by glass_window · · Score: 1

      Don't forget:
      -Make keyboards obsolete

    49. Re:1... million... DOLLARS!!! by ikkonoishi · · Score: 1

      It's all possible with today's technology, yes...

      But I don't want to have to carry Deep Blue around on my back when I go on vacation.

    50. Re:1... million... DOLLARS!!! by aldousd666 · · Score: 1

      touche

      --
      Speak for yourself.
    51. Re:1... million... DOLLARS!!! by grouchomarxist · · Score: 1

      It is not just blind people who could be helped out by this. Other people with disabilities (e.g. quadriplegics) could be assisted through this technology, as long as they are able to speak.

    52. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      I'd say something like this would be almost impossible.
      Many languages make use of unspoken contextual and relational hints. For instance Japanese:
      If I say "hello, Mr. Tanaka"
      should it be translated:
      こにちは　たなかーさま (good afternoon Mr. Tanaka [respectful])
      こんばんは たなかーさま (good evening Mr. Tanaka [respectful])
      おひよございます たなかーさま (good morning Mr. Tanaka [respectful])
      こにちは　たなかーさん (good afternoon Mr. Tanaka)

      If I say "excuse me" how should that be translated?
      すみません (I'm sorry)
      しつれえします (I'm leaving)
      ごめなさい (I'm sorry)

    53. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      Sure, right after tricorders, universal magical surgical wands (wave wave--- you're healed!), and FTL travel. Meanwhile, back on earth, getting decent machine translation is generally recognized as an unworkable task.

      Babblefish: Certainly straight after tricorders the universal magic surgical partition (vag of wave you are healed!), and the ftl journey. In the meantime on earth, which is regarded the discrete machine translation generally as a not-carry outable task, receives.

    54. Re:1... million... DOLLARS!!! by mOdQuArK! · · Score: 2, Insightful
      If you're going for adding things to your field of vision, why not overwrite the Japanese version of the sign with the text written in plain English?

      Well, for those of us who actually like seeing the thing which is being translated, covering everything up would make the experience a little less rich. Also, over time, if you always see the two things together, you might be able to recognize patterns (hey, that set of ideograms always means Tokyo!), so if your batteries go dead, you still have a chance of navigation.

      Top this off with the audio translation playing the sound back of the translated words of someone speaking to you

      I prefer subtitles for similar reasons as for the signs, plus there is the added issue of cognitive modality - it is harder for you to concentrate on an audio translation if you can hear the person speaking to you at the same time (brain has to filter out similar sensory information), whereas I find it fairly easy to follow subtitles for meaning even while using the audio from the person only as an emotional "channel" (brain can use complementary sensory info).

      The other stuff you mention (colloquialisms, vernacular, etc) I agree with, except that I actually like to see the Babelfish-like (straight) translations in some of those instances, perhaps with a background notation of the slang's translation, its probable meanings & maybe its origin (although I doubt you would look at all that stuff while in the middle of conversation :-).

    55. Re:1... million... DOLLARS!!! by rtaylor · · Score: 1

      In fact, converting the speech to text and then trying to analyze the text without sound-level annotations might give bad results, as tonal or emotional content would be lost. You need both simultaneously to really understand what's being said.

      Not to worry, v2 will be speech to text and emoticon.

      --
      Rod Taylor
    56. Re:1... million... DOLLARS!!! by cakefool · · Score: 1

      Spoken - "computer, format see colon"

      Ouch...

    57. Re:1... million... DOLLARS!!! by Anonymous Coward · · Score: 0

      dude, you need to get out more, like learn to appreciate literature or something ...

    58. Re:1... million... DOLLARS!!! by IWannaBeAnAC · · Score: 1
      The biggest problem we ran into with speech recognition was one of expectation. Users like speech recognition, but once they get it - they automatically tend to want the next step (AI).

      Naturally. Human speech evolved at the same time as other human social interactions. Using speech with a device that has no understanding of context is not something that could possibly come naturally to an adult human. A child perhaps, could learn at an early age to treat machine-speech and human-speech as different media, it would be an interesting experiment.

  3. Text of article by Anonymous Coward · · Score: 4, Informative

    From Carnegie Mellon University:

    Carnegie Mellon engineering researchers to create speech recognition in silicon

    Team to develop new silicon chip

    Carnegie Mellon University's Rob A. Rutenbar is leading a national research team to develop a new, efficient silicon chip that may revolutionize the way humans communicate and have a significant impact on America's homeland security.

    Rutenbar, a professor of electrical and computer engineering at Carnegie Mellon, working jointly with researchers at the University of California at Berkeley, received a $1 million grant from the National Science Foundation to move automatic speech recognition from software into hardware.

    ''I can ask my cell phone to 'Call Mom,''' says Rutenbar, ''but I can't dictate a detailed email complaint to my travel agent or navigate a complicated Internet database by voice alone.''

    The problem is power--or rather, the lack of it. It takes a very powerful desktop computer to recognize arbitrary speech. ''But we can't put a Pentium(TM) in my cell phone, or in a soldier's helmet, or under a rock in a desert,'' explains Rutenbar, ''the batteries wouldn't last 10 minutes.''

    Thus, the goal is to create a radically new and efficient silicon chip architecture that only does speech recognition, but does this 100 to 1,000 times more efficiently than a conventional computer.

    The research team is uniquely poised to deliver on this ambitious project. Carnegie Mellon researchers pioneered much of today's successful speech recognition technology. This includes the influential 'Sphinx' project, the basis for many of today's commercial speech recognizers.

    ''We're still not even close to having a voice interface that will let you throw away your keyboard and mouse, but this current research could help us see speech as the primary modality on cell phones and PDAs,'' said Richard Stern, a professor in electrical and computer engineering and the team's senior speech recognition expert. ''To really throw away the keyboard, we have to go to silicon.'' But enhanced conversations between people and consumer products is not the main goal. ''Homeland security applications are the big reason we were chosen for this award,'' says Rutenbar. ''Imagine if an emergency responder could query a critical online database with voice alone, without returning to a vehicle, in a noisy and dangerous environment. The possibilities are endless.''

    Researchers plan to unveil speech-recognition chip architecture in two to three years.

    1. Re:Text of article by Anonymous Coward · · Score: 0

      how the fuck did this get modded to 4??? it's the same fucking article repeated!!! slashdot is seriously on the downhill....moderators of a better calibre are urgently needed

    2. Re:Text of article by Kurayamino-X · · Score: 1

      hmm, 100 to 1000 times more efficiently eh?
      you know, as in sucks less batteries as opposed to "100 to 1000 times better" or "more accurate."

      they're basically saying they can do what a desktop does only with less power, and last time i checked, a desktop wasn't very accurate when it came to voice recognition.
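
      Back-of-the-envelope version of that, taking the article's "wouldn't last 10 minutes" quote at face value and assuming battery life scales linearly with efficiency. Both of those are assumptions, and `battery_hours` is just an illustrative helper:

```python
# Baseline from the article's quote: a desktop-class CPU drains the battery in about 10 minutes.
BASELINE_MINUTES = 10

def battery_hours(efficiency_gain, baseline_minutes=BASELINE_MINUTES):
    """Battery life in hours, assuming it scales linearly with the efficiency gain."""
    return baseline_minutes * efficiency_gain / 60

for gain in (100, 1000):
    print(f"{gain}x more efficient: about {battery_hours(gain):.1f} hours")
```

      So under those assumptions the claimed range is the difference between roughly two thirds of a day and roughly a week on the same battery, with no change in how often it mishears you.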

      --
      ...I got nothing.
    3. Re:Text of article by goneutt · · Score: 1

      I only had to read the first paragraph to see that this project is one of the many that have sprouted since 9-11 because they can work a "homeland security" angle into it.

      --
      Bacardi + slashdot = negative karma.
  4. First Post by JohnHegarty · · Score: 5, Funny

    I can just see the anonymous cowards shouting first post at their pcs now

    1. Re:First Post by Anonymous Coward · · Score: 2, Funny

      and their PCs talking back to them "I'm sorry Dave, I'm afraid I can't do that"

    2. Re:First Post by Anonymous Coward · · Score: 0

      and i can just see johnhegarty screaming, "mommy, i peepeed in my pants. whaaaaaaa".

    3. Re:First Post by aurb · · Score: 2

      Or karma whores reading the articles out loud at their pc's.

    4. Re:First Post by JohnHegarty · · Score: 1, Funny

      "reading the articles"

      sorry, this is slashdot, you must mean a different site....

    5. Re:First Post by aurb · · Score: 1

      Oh, wait... then using text2speech software.

    6. Re:First Post by tajmorton · · Score: 1

      New Mozilla Extension: First Post Blocker!

      --
      Tell the truth and you won't have so much to remember.
    7. Re:First Post by System.out.println() · · Score: 2, Funny

      In Soviet Russia, the PC yells at you!

      Or was that Redmond....

    8. Re:First Post by freqres · · Score: 1

      Does this mean the 'next big thing' in computers is to interface with them using speech in some Zork adventure like fashion?

      --
      Rampant Ninja related crimes these days...Whitehouse is not the exception
    9. Re:First Post by freqres · · Score: 1

      All your speech are belong to us!

      --
      Rampant Ninja related crimes these days...Whitehouse is not the exception
    10. Re:First Post by Anonymous Coward · · Score: 0

      first pots first post damn delete delete damn you delete jerk | | | already

  5. Carnivore on telephones by CrazyJim1 · · Score: 5, Insightful

    My friend and I were talking about this. In countries that are more totalitarian, it could be used to root out "dangerous people." www.geocities.com/James_Sager_PA

    1. Re:Carnivore on telephones by strictfoo · · Score: 1

      congrats

      You and your friend are only about 20+ years behind the times. People have been talking about that since the eighties and probably even before that.

      --
      I've just signed legislation that'll outlaw Russia forever. We'll begin bombing in five minutes.
    2. Re:Carnivore on telephones by Anonymous Coward · · Score: 0

      That's NORTH KOREA why I frequently AL-QAEDA sprinkle NUCLEAR my regular ANTHRAX conversations with helpful BIN LADEN key words. =)

    3. Re:Carnivore on telephones by ChefInnocent · · Score: 2, Informative

      Hello? Have you heard of Echelon?

    4. Re:Carnivore on telephones by protohiro1 · · Score: 1

      Our government already does this. They do it right here in Denver in fact, at Buckley "Air National Guard Base".

      --
      Sig removed because it was obnoxious
    5. Re:Carnivore on telephones by ultrasound · · Score: 1

      Have a look at his web site, wow! He's not 20 years behind, he's 20 years ahead, and possibly on a different planet, or even in a different galaxy.

    6. Re:Carnivore on telephones by Anonymous Coward · · Score: 1, Interesting

      At least they admit it in the article:

      ... may revolutionize the way humans communicate and have a significant impact on America's homeland security.

      Why doesn't this kind of thing bother more people?

    7. Re:Carnivore on telephones by statusbar · · Score: 1

      Check out: http://www.spectrumsignal.com/

      --jeff++

      --
      ipv6 is my vpn
    8. Re:Carnivore on telephones by Kokuyo · · Score: 1

      Now don't tell me our countries are any better. The US and Europe are trying to spy on us every day. This is a common problem. What I don't quite understand is why nobody has invented a telephone receiver that digitizes, encrypts and sends the data as pulse tones over the telephone network. Just hit a button and Big Brother receives just gibberish. I'd say companies as well as any average Joe would be potential customers the more things like Echelon are discussed.

  6. accuracy by tubbtubb · · Score: 5, Insightful


    100 to 1000 times more efficient worth $1M? meh. maybe.
    100 to 1000 times more accurate worth $1M? definitely.

    1. Re:accuracy by SillyNickName4me · · Score: 2, Insightful

      > 100 to 1000 times more efficient worth $1M? meh. maybe.
      > 100 to 1000 times more accurate worth $1M? definitely.

      Accuracy does not have to be a problem with modern speech to text systems, but the need to 'train' them to get that accuracy, and the need to talk to it in a somewhat distinctive way, make them far less efficient.

      I'd rather say that the time it takes to get used to a speech recognition system (and to get it used to you where applicable), together with the somewhat heavy CPU requirements, are what currently stops use. To me that means that the first thing that is required is efficiency; the accuracy is already there.

      (I have been using speech to text for over a decade now, starting out with another hardware solution in the first half of the 90s: IBM's VoiceType Dictation, back then called Personal Dictation System if I'm not mistaken. Even that system already had almost as good an accuracy as I manage myself.)

    2. Re:accuracy by glorf · · Score: 1

      Is dictation really the most common or most useful application of speech to text? If you are using it for dictation then a long training period is acceptable. But I find voice recognition in use much more often in phone IVR systems. When I call up my power company it asks me to say my account number. When I call a major hotel chain it asks which property I want info about and connects me appropriately. I saw a demo from IBM several years ago where they were doing money management and the system was recognizing mutual fund names. Movie theatre listings, 411, and the list could go on. Not only are these things that are probably a lot more commonly used than dictation, but they are also the things that would save companies the most money. For the price of something like Dragon Dictate most people would save money hiring some starving college kid to type up whatever it is for the frequency they need it. Being able to shut down call center locations instead of just outsourcing them to India or the prison system saves a lot more money.

    3. Re:accuracy by SillyNickName4me · · Score: 1

      > Is dictation really the most common or most useful application of speech to text?

      Definitely not..

      > If you are using it for dictation then a long training period is acceptable.

      Well, depends on 'long'. For most practical uses, people seem to find 3+ hours of training to be too much, unless they have a very direct purpose for entering text without needing their hands.

      > But I find voice recognition in use much more often in phone IVR systems. When I call up my power company it asks me to say my account number. When I call a major hotel chain it asks which property I want info about and connects me appropriately. I saw a demo from IBM several years ago where they were doing money management and the system was recognizing mutual fund names. Movie theatre listings, 411, and the list could go on.

      The complexity of speech recognition depends A LOT (to put it very mildly) on how big the vocabulary is that you are using, and how many similar sounding words it contains.

      All the things you list would be relatively simple, and won't require much if any training because they can rely on distinctive words from a small vocabulary.

      This could be done with general purpose computing hardware like a decade ago, and was first demonstrated by IBM in the late 60s.

      It is seriously different from what is required for generic speech to text for situations where you have no clue whatsoever what a user might be going to say.

      To give you some examples of the problems generic speech to text translation runs into:

      Did you just say to? too? two?

      It is only possible to tell if you have the context for it. In generic speech to text, that is a lot more difficult than when you are listening for very specific words and have determined the context beforehand.
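
      To make the "context" point concrete, here is a toy sketch that picks between to/too/two by looking only at the previous word. The counts are invented for illustration and `pick_homophone` is a hypothetical helper; a real recognizer would use a far larger language model and more context than one word.

```python
# Toy counts of (previous word, candidate) pairs. The numbers are made up.
BIGRAM_COUNTS = {
    ("want", "to"): 50, ("want", "too"): 1, ("want", "two"): 3,
    ("me", "to"): 5, ("me", "too"): 40, ("me", "two"): 2,
    ("have", "to"): 25, ("have", "too"): 4, ("have", "two"): 30,
}
HOMOPHONES = ("to", "too", "two")

def pick_homophone(previous_word, candidates=HOMOPHONES):
    """Pick the candidate seen most often after previous_word.

    With no data for previous_word every count is 0, and max() keeps the
    first candidate, so the guess falls back to "to".
    """
    prev = previous_word.lower()
    return max(candidates, key=lambda word: BIGRAM_COUNTS.get((prev, word), 0))

for prev in ("want", "me", "have"):
    print(prev, "->", pick_homophone(prev))
```

      Note how close "have to" and "have two" are even in this toy table: one word of context is often not enough, which is exactly why open-ended dictation is so much harder than listening for a fixed list of words.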

      > Not only are these things that are probably a lot more commonly used than dictation, but they are also the things that would save companies the most money. For the price of something like Dragon Dictate most people would save money hiring some starving college kid to type up whatever it is for the frequency they need it. Being able to shut down call center locations instead of just outsourcing them to India or the prison system saves a lot more money.

      No doubt, but the question is how useful this technology will prove for that.
      As mentioned before, there is a huge difference between listening for a list of words and doing generic speech to text.

  7. Good use of $1 million? by Anonymous Coward · · Score: 3, Insightful

    Damned straight it is! In government terms, that's a pittance. In government-funded science terms, it's downright INFINITESIMAL. It isn't even couch change, it's more like the stale pretzel under the couch cushion.

    But, of course, cue the armchair blogging fanatics without a formal science education, waxing poetic about the infinite power and glory of x86 hardware running clever open source software. Maybe we could do it in perl!

    1. Re:Good use of $1 million? by dr_labrat · · Score: 1

      MMmmmm. Stale pretzels.

      --
      The secret of success is honesty and fair dealing. If you can fake those, you've got it made. (Marx)
    2. Re:Good use of $1 million? by Anonymous Coward · · Score: 1, Insightful

      so we should never try to advance society until what you feel as basic problems (that WILL NEVER BE SOLVED) are fixed?

      bravo
      let's go back to living in mud huts too, because there was energy spent on making better walls while some people were starving.

      not to mention: 10,000 people, what is $10 going to do for them?

      wow they can have half a dozen ultra cheap meals.
      that really helps a lot

    3. Re:Good use of $1 million? by Anonymous Coward · · Score: 0

      > so we should never try to advance society until what you feel as basic problems
      > (that WILL NEVER BE SOLVED) are fixed?

      In what way is preventing people from starving not advancing society?

      > 10,000 people, what is $10 going to do for them?

      $10 each? Instead of asking questions which make you look like a fucktard, perhaps you can check out the websites of charities such as Oxfam and Unicef and see precisely what $10 can do.

    4. Re:Good use of $1 million? by PythonCodr · · Score: 1

      Yeah ... not much use in having a small, portable device that would allow the hearing impaired to understand people trying to communicate with them using something much smaller than a desktop...

      (Of course, I'm going to leave out how this is the logical first step in on-the-fly language translation, since that's a ways off. But it's clear that this is a logical first-step in a lot of useful and helpful products once you look past the geeky "Look, I can talk to my computer..." stuff.)

    5. Re:Good use of $1 million? by Threni · · Score: 1

      > so we should never try to advance society until what you feel as basic problems
      > (that WILL NEVER BE SOLVED) are fixed?

      Why do you feel the problem of people dying from easily treatable diseases such as diarrhea or malaria or sleeping sickness is unsolvable?

    6. Re:Good use of $1 million? by Threni · · Score: 1

      > Yeah ... not much use in having a small, portable device that would allow the
      > hearing impaired be able to understand people trying to communicate with them
      > using something much smaller than a desktop...

      The hearing impaired aren't about to die, so in that sense there's less use in such a device.

    7. Re:Good use of $1 million? by Lumpy · · Score: 1

      Maybe we could do it in perl!

      ask and Ye shall Receive!

      BEHOLD!

      Perl rears its ugly head almost everywhere!

      --
      Do not look at laser with remaining good eye.
    8. Re:Good use of $1 million? by wildwood · · Score: 1

      Think of it like this:

      Researchers put the entire speech recognition process in hardware, on a chip. Once you've got a process on a chip, you can refine it, make it cheaper to produce, less power-consumptive, and smaller.

      Eventually, you can have a speech recognition chip that fits in a solar-powered credit-card-sized form factor, like all those free calculators. If you can re-target it for different languages (different chip-sets, maybe), and design it so the LCD shows whatever was just said...

      Sounds to me like literacy-in-a-box. Which ranks up pretty close to shelter and clean water.

      --
      normal(adj)- people who don't sit on slashdot all day wondering why everyone else isn't building robots [DECS]
    9. Re:Good use of $1 million? by Threni · · Score: 1

      > Sounds to me like literacy-in-a-box. Which ranks up pretty close to shelter and
      > clean water.

      Not really. In a world where over one billion people are starving, the idea that "literacy-in-a-box" (even if it were somehow available for $1,000,000, which it isn't) is somehow equivalent to preventing starvation is irrational.

    10. Re:Good use of $1 million? by Taladar · · Score: 1

      You would probably end up typing your order anyway. Speech recognition is slow and error-prone and will be that way for a very long time. $1 million won't change that.

    11. Re:Good use of $1 million? by ryanvm · · Score: 1

      People are starving RIGHT NOW! Why are you wasting earning potential (that could be redirected to their welfare) arguing with an anonymous poster on a recreational forum?!?

      The AC is right. Your original post suggesting that any money not spent on solving basic world problems is wasted was just plain silly.

    12. Re:Good use of $1 million? by Threni · · Score: 1

      > Your original post suggesting that any money not spent on solving basic world
      > problems is wasted was just plain silly.

      It's silly to try and stop people starving.
      It's not silly to instead spend the money on making a box which is able to understand speech in a different way to that currently available.

      I see - it's so simple now you've spelled it out to me. Presumably you'd feel differently if it were members of your family who were starving? And the difference is that you don't know these people, or that they're a long way away?

    13. Re:Good use of $1 million? by WillWare · · Score: 1
      It would be a good thing for this particular million dollars to be redirected toward humanitarian aid. The economic reality is that people don't all share the same priorities, and different people have their hands on different millions. You can't stop people doing stuff you disagree with, but you can make it easier or more enticing for them to do the things you want.

      The post I found really interesting in this thread was the one saying, go to the Oxfam site and see what ten bucks can do. I scoped around the site for a few minutes and couldn't find a "Here's what ten bucks can do" page. I'm sure they could create one rather easily. They also don't have a "Make an anonymous donation via Paypal" page. If there were a way to give ten bucks right now without putting my name and address in their database, I'd do it.

      --
      WWJD for a Klondike Bar?
    14. Re:Good use of $1 million? by PythonCodr · · Score: 1

      The hearing impaired aren't about to die, so in that sense there's less use in such a device.

      It's about quality of life. $1 million isn't a lot of money if it ultimately leads to better quality of life for millions of people. Sure, in the long run it's going to cost a lot more than $1 million to bring such a device to the masses, but you have to start somewhere, and it has practical uses.

    15. Re:Good use of $1 million? by wildwood · · Score: 1

      Not really. In a world where over one billion people are starving, the idea that "literacy-in-a-box" (even if it were somehow available for $1,000,000, which it isn't) is somehow equivalent to preventing starvation is irrational.

      How, exactly, does money prevent starvation? If you ship in emergency relief, are you "preventing" starvation, or just postponing it? Do you just keep shipping in food, month after month?

      Or do you spend that money to teach people better ways to farm and manage their resources? Do you teach them how to innovate new techniques, and to communicate those techniques to people in other areas?

      Literacy gives a massive boost to any education program. And, over the long term, education is, as far as I know, the only way to really end poverty and starvation.

      I'm willing to be proven wrong, if you'd care to present an argument. At this point, I stand by what I've said.

      --
      normal(adj)- people who don't sit on slashdot all day wondering why everyone else isn't building robots [DECS]
    16. Re:Good use of $1 million? by Armchair+Dissident · · Score: 5, Insightful

      Every time a dollar value is placed on a piece of research, some idiot comes along and says "Hey! This could be spent providing clean drinking water, and food and shelter", as if only research that directly provides clean drinking water or food or shelter is worth funding. Quite frequently the idiot making this statement is in a perfect position to provide money to ensure that more people have access to these facilities, and just as frequently that idiot isn't doing so.

      I'm sure that when America and Russia were engaged in the space race there were people saying "Hey! This money could be better spent on disaster relief!". And where are we now? Only a few short decades later we have satellites that tell us where hurricanes are going so that we can evacuate areas, and people who would otherwise die survive. We have global, reliable telecommunications satellites so that disaster relief agencies in third world countries can inform people of what supplies are required, and people who would otherwise die survive.

      Without the massive investment in jet airline technology that could otherwise have been spent "saving the starving", we would not be able to travel to disaster areas within hours of an incident. And so the list goes on.

      If you personally want to see more money invested in agencies that provide disaster relief, or reliable shelter or clean water then you only have to donate to the right charities, and encourage others to do the same. It doesn't take many people to donate out of their pockets to provide $1 million. You can start here.

      --

      The ways of gods are mysteriously indistinguishable from chance.
    17. Re:Good use of $1 million? by Threni · · Score: 1

      > Literacy gives a massive boost to any education program. And, over the long
      > term, education is, as far as I know, the only way to really end poverty and
      > starvation.

      It's not just poverty, it's the problems poverty causes, such as lack of food, medicine and shelter. Literacy does not help with these problems. It's something good to aim for once people aren't dying at the rate of one person every few seconds but you're fooling yourself if you think that it's the best use of a million US dollars.

      > I'm willing to be proven wrong, if you'd care to present an argument. At this
      > point, I stand by what I've said.

      My argument is that the things that kill millions of people are easily solved problems like dirty water and easily preventable diseases. I don't feel obliged to present any more of an argument than that. If you think, even for an instant, that I'm not trolling (I'm not - the question was `is this a good use of a million dollars`, and i'm arguing that it's not the best use, which perhaps isn't the same question but still a relevant one), then you're more than capable of using Google and checking out some of the major charities such as Oxfam, Unicef, Christian Aid, etc and seeing just how much you can change things for just a few dollars. (Those charities also help with education and other literacy-related projects, by the way.)

      Really, do it now.

      http://www.unicef.org/
      http://www.oxfam.org/eng/
      http://www.christian-aid.org.uk/world/index.htm

      Whether you act on it or not is up to you - perhaps you value that new Star Wars DVD boxed set, for instance, or a $400 graphics card more than a clear conscience that you're not standing by while people starve slowly to death. I'd be interested to hear a moral, rational argument for that, however.

    18. Re:Good use of $1 million? by Threni · · Score: 1

      Perhaps you aren't paying attention. The more than one billion people who are currently starving to death, or dying from easily treatable illnesses don't require a satellite to inform them as to the weather in other parts of the world.

      The rest of your argument appears to consist of casting me as some sort of luddite - odd, given that Slashdot is read pretty much solely by people familiar with technology and its advantages. I'm simply saying that in this instance, where the question is `is this (yet another speech recognition system) a good use of a million dollars`? I say no - let's stick with the existing ones, which appear to work pretty well whenever I use subtitles when watching live tv programs (news, debates), and spend this million dollars on aid programs instead. The west gives a pretty pitiful amount of money in aid, and I'm puzzled as to why. I get the feeling people think it's some sort of physical problem in literally getting the supplies over there.

      > If you personally want to see more money invested in agencies that provide
      > disaster relief, or reliable shelter or clean water then you only have to
      > donate to the right charities, and encourage others to do the same. It doesn't
      > take many people to donate out of their pockets to provide $1 million.

      What do you think I'm doing here? Now who's the idiot? Oh, and the other implication in your first paragraph is wrong - I give plenty of money to charity.

    19. Re:Good use of $1 million? by Anonymous Coward · · Score: 0

      so why are you posting to slashdot. isnt that a waste too, so shouldnt you be out saving everyone?

    20. Re:Good use of $1 million? by Anonymous Coward · · Score: 0

      Who are you to dictate what is moral and rational to me?

      I think we need to start actively practicing eugenics on a world-wide scale. To me this is perfectly rational and moral, but I'm pretty sure it isn't to you.

    21. Re:Good use of $1 million? by avida · · Score: 1

      If you spend $10 million helping feed the world it wouldn't make a damn difference. They would be hungry after the money ran out or the local warlords sold the food.

    22. Re:Good use of $1 million? by Armchair+Dissident · · Score: 1

      Perhaps you aren't paying attention.

      I think I was. I usually try.

      The more than one billion people who are currently starving to death, or dying from easily treatable illnesses don't require a satellite to inform them as to the weather in other parts of the world.

      I can assure you that I am most certainly familiar with the issue of starving people. I am not a supporter of Save the Children for my own health. But you have missed the point. Around the world every year people are surviving incidents that they would otherwise have died in because of the advance in technology.

      I say no - lets stick with the existing ones, which appear to work pretty well whenever I use subtitles when watching live tv programs

      Yes, they are good enough. But to paraphrase a recent Honda advert in the UK: They're OK. They're not bad. They're adequate. But why invent the transistor if the valve is OK. Why invent the jet engine if propellers are OK. Why invent the car if the horse and trap is OK.

      The west gives a pretty pitiful amount of money in aid, and I'm puzzled as to why. I get the feeling people think it's some sort of physical problem in literally getting the supplies over there.

      I don't disagree. But the reason is not because some of the money in western countries is spent on computer research projects.

      Part of the problem, actually, is that there is a physical problem with getting supplies there. Food supplies to many severely deprived areas need to go through war zones. The food gets hijacked, and aid workers get killed. This is assuming you're able to land supplies in the country in the first place.

      Part of the problem is because of social and political issues in the countries worst affected. During the Ethiopian crisis, for example, the government in power at the time was a brutal dictatorship that didn't want anyone hearing about the problem, then didn't want people sending aid.

      Part of the problem is religious. Many deprived areas are over-populated because various religions proscribe contraception. All these problems need to be tackled. Simply throwing more money at the problem will not suddenly solve it.

      Now, you may argue that the money could be given to charities. But think about it for a moment. How much money do you really want a charity to receive from a government? How independent should a charity be? What if a charity received $x million from a government on the proviso that they didn't provide aid to a hostile country? Charities should be able to remain as independent as possible.

      What do you think I'm doing here? Now who's the idiot? Oh, and the other implication in your first paragraph is wrong - I give plenty of money to charity.

      Then I retract the comment in your instance, but I stand by my statement. By and large people stating that research money could be better spent on aid do not donate to charities - at least in my experience.

      --

      The ways of gods are mysteriously indistinguishable from chance.
    23. Re:Good use of $1 million? by Anonymous Coward · · Score: 0
      "They also don't have a "Make an anonymous donation via Paypal" page."

      They probably did until PayPal cancelled their account and stole their money without warning, explanation or recourse.

      Not that I'm bitter :/
    24. Re:Good use of $1 million? by ryanvm · · Score: 1

      Given your logic, I must assume that you have no possessions which do not serve a fundamental human need? You have no luxury items, no games, no entertainment items, etc. After all, money spent on those things could have been sent to starving children in third world countries.

      I applaud your charity.

    25. Re:Good use of $1 million? by Armchair+Dissident · · Score: 1

      Literacy does not help with these problems

      How the hell can you link to UNICEF and Oxfam, and not recognise that literacy is absolutely central to the problem? If you have an illiterate population you cannot teach them to fix a pump, or rely upon them to teach their children to fix a pump. Without literacy you cannot teach a farmer to sow a field, and teach others to sow a field. With an illiterate population you will never make progress!

      UNICEF and Oxfam are both deeply concerned about the literacy rate in the third world for a simple reason: The problem cannot be solved without raising literacy rates!

      Whether you act on it or not is up to you - perhaps you value that new Star Wars DVD boxed set, for instance, or a $400 graphics card more than a clear conscience that you're not standing by while people starve slowly to death

      You arrogant troll. How much did your computer cost you? How much is the electricity bill to run your computer, and the lights in your house, and the heating? How much did you spend on your car? How much do you spend on fuel? How much do you spend on clothes in a year? Do you own a DVD? A graphics card? A Hi-Fi? How much do you give your ISP each year?

      I said in my first post that you should encourage people to support charities. I would like to retract that - I would like to ask that you never mention the name of a charity in your posts ever again. Your ignorance does more harm than good.

      --

      The ways of gods are mysteriously indistinguishable from chance.
    26. Re:Good use of $1 million? by Anonymous Coward · · Score: 0
      Depends. It's not as good as using it to prevent the deaths of thousands - possibly tens of thousands - of people by ensuring they have clean drinking water and shelter from the elements. But hey - you can't put a price on being able to speak to a computer rather than type when you're ordering a pizza.

      However it might be as good as using it to prevent the deaths of thousands - possibly tens of thousands - of people by one day providing them, and those listening to them, with cheap voice translators to allow their culture to interact more easily with other cultures. Almost all of these thousands - possibly tens of thousands - of people are in the situation they're in because of their closed cultures and the corrupt governments that confine them.

    27. Re:Good use of $1 million? by Kehvarl · · Score: 1

      Exactly. That individual isn't the only person alive. Take me for example, I don't donate to charities, and I probably never will. "Why" you ask? because I'm a greedy, self serving bastard and I know it.
      Plus, if I donate to helping people, what have I done? possibly given to whichever criminal (sure you could use the term "terrorist" but why bother when a much more simple and accurate term exists?) group happens to intercept my money/food/medicines/etc. or -maybe- helped a small percentage of those who actually need assistance.
      Now, I don't fit exactly into the category you mentioned, as I don't think research money could be better spent aiding humanity. Rather, I feel that research money can be better spent on research that improves or may improve my life. and if, as a side effect, it improves the lives of others I can live with that.

    28. Re:Good use of $1 million? by Kehvarl · · Score: 1

      Actually the indication was that they would not be solved, not that they were unsolvable. There will always be people who are (choose any number of them). Not because there are no solutions, but because there are no solutions which will simultaneously please a large enough percentage of those capable of providing the solution to get it done.

    29. Re:Good use of $1 million? by Armchair+Dissident · · Score: 1

      Exactly. That individual isn't the only person alive. Take me for example, I don't donate to charities, and I probably never will. "Why" you ask? because I'm a greedy, self serving bastard and I know it.

      Erm... I'm not quite sure how you can say "exactly" when you appear to have disagreed with everything in my post.

      Plus, if I donate to helping people, what have I done? possibly given to whichever criminal (sure you could use the term "terrorist" but why bother when a much more simple and accurate term exists?) group happens to intercept my money/food/medicines/etc. or -maybe- helped a small percentage of those who actually need assistance.

      I didn't say that...

      Now, I don't fit exactly into the category you mentioned, as I don't think research money could be better spent aiding humanity. Rather, I feel that research money can be better spent on research that improves or may improve my life. and if, as a side effect, it improves the lives of others I can live with that.

      And I most certainly did not say that.

      How did you reach the conclusion that you agreed with me, when you appear to completely disagree with me?

      --

      The ways of gods are mysteriously indistinguishable from chance.
    30. Re:Good use of $1 million? by figa · · Score: 1
      "And where are we now?"

      Two space shuttle disasters later we're reverting to Soviet-era technology to launch satellites, if not paying the Russians and Chinese outright to launch them.

      I'd rather not have to use a Brita filter than give the NSA the ability to do

      cat /dev/speech | grep -i bomb
      The NSA should be forced to hold bake sales along with the Pentagon. A fat lot of good their intelligence gathering did here in New York three years ago.
    31. Re:Good use of $1 million? by Kehvarl · · Score: 1

      no you didn't, I did. I made it all up as I went along too, but that's mostly because I figured everyone would ignore me.
      and when I said exactly I was agreeing with you excluding the other individual from your final statement without excluding everyone.

    32. Re:Good use of $1 million? by Threni · · Score: 1

      > UNICEF and Oxfam are both deeply concerned about the literacy rate in the third
      > world for a simple reason: The problem cannot be solved without raising literacy
      > rates!

      I'm not sure they've stated anywhere that one million dollars is better spent on creating another speech recognition device, however.

      > You arrogant troll.

      Calling someone a troll doesn't really address the points raised, does it? It's hard to see how someone would be all that bothered with such an insult. If I were trolling then that would presumably be the reaction I would be after, right? And if I'm not a troll then it's just a waste of time.

      > I would like to ask that you never mention the name of a charity in your posts
      > ever again. Your ignorance does more harm than good.

      You can ask what you like, but it won't help you to prove that money is better spent on yet another speech recognition device than on preventing the more than 1 billion people who are currently starving from dying. Or are you keeping some sort of killer argument up your sleeve? Frankly I doubt it.

  8. Sarcasm? by Anonymous Coward · · Score: 2, Insightful

    Good use of $1 million?
    For something that would be worth hundreds of times that in the form of a finished product, I would hope so. The only dispute might be that the researchers' efforts would be better spent on other things.

  9. Mixed feelings on this one... by Oxy+the+moron · · Score: 2, Insightful

    On the one hand, it is obvious how much more efficient this would make our day-to-day tasks. Being able to "jot" notes with speech instead of writing, schedule tasks in seconds, the list goes on and on...

    This is certainly beneficial... but think about the impact on the economy! Imagine all the "Administrative Professionals" who could, almost instantly, be out of work. I for one would rather pay even $5,000 for a good piece of software to take all my notes than pay a secretary $28,000/year or so.

    Then again, when I posed this situation at my wife's office (she's a paralegal) one of the attorneys responded, "Until they come up with software that can find my lost keys and bring me coffee, the secretary's job is secure."

    --

    Proudly supporting the Libertarian Party.

    1. Re:Mixed feelings on this one... by Jeff+DeMaagd · · Score: 2, Insightful

      Agreed. Secretaries are needed to do paper handling, take calls and filing too. A business that prides itself on professionalism and service would IMO not rely on short cuts like the voice mail maze. So they aren't just a personal refreshment gopher. Any business should still need that sort of thing.

      So what if dictation is taken away from secretaries, they still need to check the grammar and arrangement as dictation is almost always free-form without the same structure as a good written letter.

    2. Re:Mixed feelings on this one... by Taladar · · Score: 1
      Being able to "jot" notes with speech instead of writing,
      I don't know about you but I can type a lot faster than I can speak when I speak in a way speech recognition can understand.
    3. Re:Mixed feelings on this one... by Lumpy · · Score: 1

      you're not looking at the big picture.

      this will be the next thing to cause a wave of "they're stealing my Intellectual Property!" screaming .

      Imagine, this Enabling technology to let a student sit in a class and get a perfect transcript of the lecture, then he posts it to the P2P app of that day and that professor has to start eating Ramen in his cardboard box because he is poor now.

      Ok sarcasm aside, you will hear of people screaming about this, we will have a new type of blog filled with nothing but text of what was said at certain public places, and other stuff to make the whiners whine.

      --
      Do not look at laser with remaining good eye.
    4. Re:Mixed feelings on this one... by the_twisted_pair · · Score: 1
      Actually I think you're sailing close to the Broken Window fallacy.

      I'd rather look at it this way: all the people you identify - and more - will have a lot of the drudge taken out of their work and more time to do the things they are indispensable for: editing, collating material/resources/ presentations and so on - the 'added-value' bit. Oh, and even if the voice recognition works very well, there will always be a role for touching up grmamar and structure, just as one has to with OCR.

      OTOH, my secret fear is that offices will be full of people just babbling at their machines all day long. That would drive me insane...

    5. Re:Mixed feelings on this one... by the_twisted_pair · · Score: 1
      *grmamar* Oh, and spelling too....

      Doh.

  10. Save a few kilobytes... by tcopeland · · Score: 2, Informative

    ...and view the printable version.

  11. Natural Language Interpreter by MankyD · · Score: 4, Insightful

    I'm curious to see if their research will improve Natural Language Queries, as opposed to just improving speech recognition. There is an important difference between having to say: SELECT name FROM users WHERE id=12345 and saying: Pull up the name of employee number 12345.

    --
    -dave
    http://millionnumbers.com/ - own the number of your dreams
    1. Re:Natural Language Interpreter by selfsealingstembolt · · Score: 1

      It will only improve speech recognition, to make it faster and less power consuming.
      Converting human speech to text is already possible with very few mistakes. A high-end consumer PC has the power to do that already.

      For the chip to understand "Pull up the name of employee number 12345" it would need some very strong AI. It would need a conceptual model of the world or at least of the area in question to allow arbitrary syntax for its queries. That is far beyond our current level of AI and computing power, even if you use a cluster of supercomputers. To see such intelligence on a single, efficient chip will take several decades of research in the areas of AI and miniaturization.

      --
      Keep open minded - but not that open your brain falls out...
    2. Re:Natural Language Interpreter by MankyD · · Score: 1

      This is pretty much what I figured. I guess we can only hope that their research furthers Natural Language research.

      --
      -dave
      http://millionnumbers.com/ - own the number of your dreams
    3. Re:Natural Language Interpreter by Anonymous Coward · · Score: 0

      Good question. At its basic, speech recognition is just another input device, although more sophisticated than a keyboard or mouse. But to do it really right, it must also have some language recognition capabilities, so it can figure out the difference between "two" "too" and "to" or "there" "their" and "they're". So full-blown natural language recognition should be just a few short steps away.

    4. Re:Natural Language Interpreter by freqres · · Score: 1

      Or we can just find the crushed remains of a T-800 Model 101. Just beware when your computer starts talking to you in broken English with an Austrian accent.

      --
      Rampant Ninja related crimes these days...Whitehouse is not the exception
    5. Re:Natural Language Interpreter by Anonymous Coward · · Score: 0

      A very low end consumer PC has the power to do that already. It is not processor power that is the problem in speech to text but ensuring there are good enough acoustic and grammatical models for the language being spoken. This requires huge amounts of data more than processor power.

      For understanding the language again you don't need masses of processor power. You just need the correct grammatical models / training based on sample inputs. Obviously you limit these models to the subject area that the queries are for, but it is not beyond current power or AI at the moment. The problem is always the data, speed while nice is very much a secondary issue.

    6. Re:Natural Language Interpreter by Masker · · Score: 3, Insightful

      Natural language processing and speech recognition are two entirely separate problem spaces.

      Natural language processing tasks involve parsing strings of tokens and mapping them to commands to be executed. So, from your example, "Pull up the name of employee number 12345", the natural language system must map "Pull up" to "SELECT", "the name" to "name", "of employee number 12345" to "FROM users where id = 12345". Really, it's largely a problem of context, and your example shows an excellent problem: the "of employee number 12345" to "FROM..." map requires the contextual information of where to pull this information from. Surely multiple tables of a database could have an "employee number" field in them. Do you want all of the tuples which match, or just from a certain table? Now, in the context of looking up a bunch of other employees, maybe I know what table you've been hitting a lot, and can determine what you're asking, but without that context, I have no idea.
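
      A minimal sketch of the mapping described above, in Python. Everything here is an assumption made for illustration: the SCHEMA dictionary, the "payroll" table and the to_sql helper are hypothetical, and only one fixed phrasing is handled. It shows the ambiguity problem: without a context table there is no way to choose where "employee number" lives.

```python
import re

# Hypothetical schema: table name -> column holding the employee number.
SCHEMA = {
    "users": "id",
    "payroll": "employee_id",
}

def to_sql(utterance, context_table=None):
    """Map one fixed phrasing onto a SELECT statement.

    Raises ValueError if the phrasing is not recognised or if the
    target table cannot be decided without more context.
    """
    m = re.fullmatch(
        r"pull up the (\w+) of employee number (\d+)\.?",
        utterance.strip().lower(),
    )
    if m is None:
        raise ValueError("unrecognised phrasing")
    column, number = m.group(1), int(m.group(2))

    if context_table is None:
        candidates = sorted(SCHEMA)
        if len(candidates) != 1:
            # Several tables have an employee number; we cannot guess.
            raise ValueError(f"ambiguous table, candidates: {candidates}")
        context_table = candidates[0]

    key = SCHEMA[context_table]
    return f"SELECT {column} FROM {context_table} WHERE {key}={number}"
```

      With context_table="users" the example sentence maps onto the query from the parent post; with no context the call fails, which is the "I have no idea" case.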

      In fact, everyday speech has a lot more ambiguity in it than could be handled without keeping large amounts of state, be it contextual or experiential/situational. For example, if I overhear two people in a conversation, and the first thing I hear is: "Yeah, but he's been lying all through his campaign, and I for one don't support him," I have no idea which political candidate they might be speaking of. However, if I saw that person wearing a shirt for a political campaign last week, then I have enough context to make a reasonable guess that he's talking about that person's opponent.

      Speech recognition is a "lower level" than that: it's about matching acoustic information into speech sounds and then using the speech sounds to determine the word that was said. This is a hugely complex task that has a number of unsolved problems (of which these are the 3 that I can think of off the top of my head):

      1) "speech sounds" are fuzzy categories, and are not canonical targets.
      2) salient "features" of phonemes are disputed, contradictory and large amounts of redundancy/conflicting info are built into the speech signal
      3) idiosyncratic speaker-to-speaker differences make the phoneme categories even fuzzier and can complicate the task even for the one speech recognition system that we know works: the human brain.

      At any rate, the problems that need to be solved for speech recognition are not the same problems in natural language processing. While there may be some cross over in pattern-matching, the specifics of the problem spaces make it unlikely that you will get much benefit for NLS (natural language systems) from just making the algorithms faster.

      Which, in fact, is my main criticism of this article: the algorithms that we have now are piss-poor, and making them faster doesn't intrinsically make them better. Unless there's been some huge advance in the field that I'm unaware of, you'd still have to train a SRS (speech recognition system) on your idiolect, by reading some pre-selected passages to it. This model has lots of problems, most specially that it's tailored to an individual. Imagine if you had to have each person that you spoke with read some canned paragraphs to you the first time you met so that you could interact....

      [sorry I don't have sources for all of this; I'm AFB, and I don't have time to dredge up info right now. But, apparently, I have time to write one long-ass entry...]

      --

      ---------The early bird gets the worm, but the second mouse gets the cheese.

    7. Re:Natural Language Interpreter by MankyD · · Score: 1

      I understand entirely, and I suppose that's why I posed the question in the first place. Perhaps this new research will provide insight into both fields.

      I do remember reading in an article once that, while speech recognition does not help in natural language interpretation, natural language interpretation can play into speech recognition. If a computer can't figure out, phonetically, the difference between "their" and "there" - the context and thus the natural language interpretation helps out. There are much more complicated (and thus applicable) examples that can't be solved by simple if-then comparisons, but I can't come up with one at the moment.

      --
      -dave
      http://millionnumbers.com/ - own the number of your dreams
    8. Re:Natural Language Interpreter by cft_128 · · Score: 2, Funny
      There is an important difference between having to say: SELECT name FROM users WHERE id=12345 and saying: Pull up the name of employee number 12345.

      Yeah, there is a difference, I find the first query much more natural. I think I need to get out more.

      --

      Underloved Movies and Pub Quiz: donotquestionme.org

    9. Re:Natural Language Interpreter by Masker · · Score: 1

      I do remember reading in an article once that, while speech recognition does not help in natural language interpretation, natural language interpretation can play into speech recognition.

      Well, what you'd be doing is adding natural language processing to the utterance that was created via speech recognition as an additional step to help interpret the utterance, perhaps by using context. So, it's not really "playing into" speech recognition, but is being used as part of a larger system (say, speech-to-text) to increase the accuracy of the entire utterance. Phonemically, "their" and "there" are exactly the same, so you wouldn't be increasing the accuracy of the speech recognition. You would be if you used context to distinguish "their/there" from "where" or something like that. But, again, it would be a bolt-on of NLS post-processing as feedback to the speech recognition algorithms.
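
      A rough sketch of that kind of bolt-on post-processing, in Python. The bigram counts and the fix_homophones helper are made up for illustration; a real system would estimate the counts from a large text corpus. The recogniser's own guess is kept unless the context clearly favours the other spelling.

```python
# Hypothetical bigram counts: (previous word, candidate) -> times seen.
BIGRAM_COUNTS = {
    ("over", "there"): 40,
    ("over", "their"): 1,
    ("lost", "their"): 25,
    ("lost", "there"): 1,
}

# Words that sound identical, so the acoustic stage cannot separate them.
HOMOPHONES = {
    "their": ("their", "there"),
    "there": ("their", "there"),
}

def fix_homophones(words):
    """Re-spell homophones using the previous word as context."""
    out = []
    for word in words:
        options = HOMOPHONES.get(word)
        if options and out:
            prev = out[-1]

            def score(candidate):
                return BIGRAM_COUNTS.get((prev, candidate), 0)

            best = max(options, key=score)
            # Only override the recogniser when the evidence is stronger.
            if score(best) > score(word):
                word = best
        out.append(word)
    return out
```

      Note that this only changes the spelling of the output text. The phonemes recognised are the same either way, which is the point made above.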

      --

      ---------The early bird gets the worm, but the second mouse gets the cheese.

    10. Re:Natural Language Interpreter by Tony-A · · Score: 1

      Natural language processing and speech recognition are two entirely separate problem spaces

      Counterexample. I can mumble and be perfectly understood by people who know me. It's a case of very high-level processing to determine which of a few low-level phrases I might have uttered.

      mumble mumble nmmmm mumble 12345.
      What table has key field with value 12345?
      What primary column sounds something like nnmmm?

      Keep a stack (vertical pile) of who asked what and some idea of relevance if you can get any feedback from users.

      13758? then becomes a perfectly valid query.

      "Pull up the name of employee number 12345"
      Since this is a query system, "SELECT foo FROM bar WHERE baz" can be taken for granted. The length of the query indicates this should be a "simple" query. "The" of "the name" is significant in that any meaningful query is expected to return only one answer. The hard part is translating employee to user.

      You are very right that the context matters. The relevant context is pretty much everything known and everything that everybody has ever asked.
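
      A rough sketch of the "simple query" case, assuming the utterance has already been transcribed. The synonym table, the column whitelist and the pattern are hypothetical; only the employee-to-users mapping and the resulting SELECT come from the example in this thread.

```python
import re

# Hypothetical vocabulary: spoken nouns mapped to table names. Only the
# employee -> users pairing comes from the thread's example.
TABLE_SYNONYMS = {"employee": "users", "user": "users"}
KNOWN_COLUMNS = {"name"}

# Matches phrases shaped like "the <column> of <noun> number <digits>".
PATTERN = re.compile(r"the (\w+) of (\w+) number (\d+)", re.IGNORECASE)


def to_query(utterance):
    """Return (sql, params) for a simple lookup request, or None if it does not fit."""
    match = PATTERN.search(utterance)
    if match is None:
        return None
    column = match.group(1).lower()
    noun = match.group(2).lower()
    table = TABLE_SYNONYMS.get(noun)
    if table is None or column not in KNOWN_COLUMNS:
        return None
    # Table and column come from fixed whitelists; the key is passed as a parameter.
    return f"SELECT {column} FROM {table} WHERE id = ?", (int(match.group(3)),)
```

      The synonym table is where the hard part lives: everything else is pattern matching, while mapping "employee" to the users table needs knowledge of the schema and of how people actually ask.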

  12. Virtual Valerie by rgf71 · · Score: 1

    Oh the possibilities of handless command of Virtual Valerie:)

  13. Silicon == buzzword by handy_vandal · · Score: 1, Insightful

    Speech recognition on a chip, yes.

    But only "silicon" in the sense that every other silicon chip is silicon.

    No magical "silicon" breakthroughs to see here, keep moving.

    -kgj

    --
    -kgj
    1. Re:Silicon == buzzword by LiquidCoooled · · Score: 1

      Nothing wrong with speech recognition on silicon.

      My missus has been telling me to stop talking to her boobies for years, now finally I will have a valid reason.

      --
      liqbase :: faster than paper
    2. Re:Silicon == buzzword by hackwrench · · Score: 1

      No, silicon in the sense of a processor designed for speech, the way a graphics processor is designed for vectors, without having to follow a program uploaded from outside the silicon. Also, it will quickly get used for non-speech processing, just as graphics cards are getting used for non-graphics processing.

    3. Re:Silicon == buzzword by SunPin · · Score: 1

      Lighten up, dude. It doesn't matter that "silicon" is a buzzword. The people putting up the money need these annoying buzzwords to understand what they are financing. Considering how much voice dictation sucks (I use it for 99% of my input), it's in dire need of improvement and any buzzword that leads to some scientist getting the money he/she needs to improve it is ok with me.

      --
      Laws are for people with no friends.
  14. Oblig simp quote by tubbtubb · · Score: 1, Funny


    Note to self: Eat up Martha.

  15. Obligitary Star Trek quote by MBAFK · · Score: 1, Funny

    Computer. Computer? Hello, Computer. Just use the keyboard. Keyboard. How quaint.

  16. Only 1million? by Gyorg_Lavode · · Score: 2, Insightful
    That's impressive for just 1 million, working in defense and knowing our contractors. 1 million dollars is barely enough to get them to tell you how much it would cost for them to do the initial research to tell you if they can actually build what you want.

    (I did not read the article as it is slashdotted so I am relying on the summary's statement of 1 million dollars.)

    --
    I do security
    1. Re:Only 1million? by Christopher+Thomas · · Score: 1

      That's impressive for just 1 million, working in defense and knowing our contractors. 1 million dollars is barely enough to get them to tell you how much it would cost for them to do the initial research to tell you if they can actually build what you want.

      This is being done through CMU. $1 million funds a lab full of grad students, with a couple of chip spins per year, for several years.

      I'm sure a few papers can be squeezed out of a speech-algorithm accelerator chip project.

  17. First Rule of Government Spending... by Fortress · · Score: 1, Insightful

    ...is always underestimate your costs and run over budget later. That $1 million will turn into $1 billion before anything comes of this. Hell, it'll take over a million to get the development organization up and running.

  18. A measily $1 million? by Aggrazel · · Score: 2, Interesting

    Imagine how much money could be saved if you could *perfect* speach recognition.

    Heck, the hospital I used to work at by itself spent over a million dollars a year on medical transcriptionists ...

    1. Re:A measily $1 million? by Aggrazel · · Score: 3, Funny

      And imagine how much embarassment could be saved alone by correcting idiotic mispellings of simple words like "speech".

    2. Re:A measily $1 million? by wjsteele · · Score: 1

      Just imagine how much money could be saved if you could *perfect* spelling, too!!! :-)

      Bill

      --
      It's my Sig and you can't have it. Mine! All Mine!
    3. Re:A measily $1 million? by bytesmythe · · Score: 1
      Heck, the hospital I used to work at by itself spent over a million dollars a year on medical transcriptionists ...

      The company I used to work at is out to fix that...

      --
      bytesmythe
      Hypocrisy is the resin that holds the plywood of society together.
      -- Scott Meyer
    4. Re:A measily $1 million? by JohnFluxx · · Score: 1

      I used to work on that too. http://www.opengalen.org/ -- that used to run on my machine.

    5. Re:A measily $1 million? by rk · · Score: 1

      I'm sure it was typo. Just ignore the s. There's a lot of cutting edge research being done in peach recognition. Although not difficult by human standards, getting a computer to recognize the difference between a peach and say, an apricot is not easy.

  19. Interesting, but do we really need this? by hackronym0 · · Score: 2, Funny

    It is an interesting concept, but do we really need this?

    We already have voice recognition, this tech will just bring it to everything. You can talk to your keys, your toaster, your watch. But will they have anything interesting to say back?

    What would you do if you had 1 million dollars?

    You mean besides 2 chicks at the same time...

    Refer your friends, get a free ipod
    --
    This is completely false. This is not a sig.
    1. Re:Interesting, but do we really need this? by mborohovski · · Score: 1

      Well let's see..."toaster, light brown...watch, set time zone to rome and start chronograph..." Yeah, I could see the uses.

      --
      -Tang, it's a kick in the ass.
  20. The difficulties of dialect... by L0neW0lf · · Score: 5, Insightful

    I once did a lot of work with speech recognition software, having a former significant other who was disabled. I tested a number of programs, and found the biggest problem to be the wide variance in users' dialects. The programs all have to be trained initially to recognize a single user's voice. This means that a program trained for a Bostonian may not work for someone from Arkansas, Texas, or Louisiana. Also, the programs' effectiveness decreased over time if you did not use them regularly.

    I don't know how possible it will be to make a program that can recognize all English users. Will someone who speaks Oxford English be recognized as well as a surfer from California? I doubt it.

    --

    Never look down your nose at others. Someday, someone is bound to see your boogers.
    1. Re:The difficulties of dialect... by drinkypoo · · Score: 1
      I dunno bra, the California surfer accent can get kind of gnarly, there's like, a lot of drawn-out sounds and unnecessary pronunciation in there.

      I say this not as a surfer, but as someone born and raised in Santa Cruz.

      Now it is true that Californians are not known for having accents (surfers are definitely known for having an accent - seen Fast Times? Spicoli's a pretty accurate representation thereof, actually), and the Californians who don't have a particularly strong accent are the people in the USA whose pronunciation is closest to what's in the dictionary, but surfers are about the worst possible Californians to use as an example, except maybe some types of immigrants :)

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    2. Re:The difficulties of dialect... by Paulrothrock · · Score: 1

      Forget surfer dudes, imagine the problems with an Aussie using rhyming slang! Or a Cockney accent! Most people can't even understand them if they don't speak it themselves.

      --
      I'm in the hole of the broadband donut.
    3. Re:The difficulties of dialect... by Anonymous Coward · · Score: 0

      Very easily, so long as the software models have been trained with speakers of both. The problem with consumer software is that the vendors leave it to the user to train it to their own voice. With good commercial-grade software, the vendor provides models that have been trained on a huge range of accents.

    4. Re:The difficulties of dialect... by Anonymous Coward · · Score: 0

      Or they could learn to speak the language correctly... I got IBM's ViaVoice almost 5 years ago, and it was over 90% accurate right out of the box since I *gasp* can speak English correctly! And the last 10% was just it getting used to my word usage.

    5. Re:The difficulties of dialect... by Viceice · · Score: 1

      Sure you can. All it entails is more work. I tried IBM ViaVoice once, did the training and all, but it didn't quite work unless I spoke "American".

      Then a year later, I saw an English speech2text program being promoted that was specially tuned to Asian accents (oriental Asian); it worked very well.

      So basically all you need are many profiles and a way for the computer to work out which profile to use.

      --
      Sometimes I wish I was a plumber, then I'd know how to deal with other people's shit.
    6. Re:The difficulties of dialect... by CustomDesigned · · Score: 1

      Speech recognition has to be trained for each speaker type. Cheaper systems have to be trained for each speaker. Even human beings have to be trained for each speaker type. When I took my wife to Barbados on our honeymoon, she asked me what language they were speaking (I had already been there virtually via tapes, pictures and records from my Dad). I told her it was English. She didn't believe me! It took about 2 days before she could understand the native speakers.

    7. Re:The difficulties of dialect... by A+non+moose+cow · · Score: 1

      I feel like we will get to the point where everyone carries around a digital speech profile/filter of themselves in all of their electronic devices. It would alter the input (and output?) of speech between the user and the app to account for the user's nuances.

    8. Re:The difficulties of dialect... by jesup · · Score: 1

      I was part of an NIST program in '92/93 or so where they paid people to make calls to each other and talk about a particular subject. I believe they took the data, made transcripts, and it was fed to companies as a database of US English speakers (of varying accents) to use to tune speech recognition. (I think I got involved with it via Bell Labs, who we were working with on using their 3210 DSP for speech recognition in Amigas.)

      So there may be multi-accent databases out there to develop base profiles for speech recognition.

  21. hardware accelerated by GMail+Troll · · Score: 3, Insightful
    "People who are serious about software should make their own hardware" - Alan Kay

    This seems like a situation where a hardware accelerated approach is pretty sensible. I'm guessing there are large amounts of signal processing involved in speech recognition. With a custom chip like this it probably helps greatly to offload some of that onto a dedicated chip in the same way as GPUs are used on graphics cards. The only problem I can see is that there might not be much market for it. GPUs have an obvious market (games), but there is less demand for speech processing. Star-Trek style interfaces are nice to dream of but for most common tasks a keyboard and mouse will probably give you a faster and more accurate interface.

    gmail invite

    1. Re:hardware accelerated by Anonymous Coward · · Score: 0

      Hardware is not always better, it is more complex
      and difficult to work with. The only area I see
      the h/w speech processing might help is to offload
      certain portions of a complex speech processing
      algorithm(software). It might help those who
      want the ability to screen millions of phone
      calls as close to realtime as possible(esp.
      Homeland security). I really doubt that the
      hardware could be more accurate than the more
      complex software available today. A better use
      of money would be to use the powerful GPUs in
      the next-generation video cards to assist
      speech recognition. Why reinvent something, if
      something that already exists could be tailored
      to meet the needs?

      That said, I also totally disagree that hardware
      is not better than software, if you could make
      the hardware unconventional(pure analog(no dsp's
      or conventional digital logic) circuit design
      that has the ability to learn! and self correct!).

    2. Re:hardware accelerated by MP3Chuck · · Score: 1

      "there is less demand for speech processing."

      But perhaps if speech processing were more accurate and less resource-heavy to use (thanks to said hardware acceleration), there would be more of a demand for it... I wouldn't mind adding a speech recognition card to my box if it meant clear and quick voice recognition and voice-to-text.

    3. Re:hardware accelerated by Ratface · · Score: 1

      Thank you! I have been looking for a gmail invite for ages :-D

      --

      A little planning goes a long way...
  22. May I suggest a name for this chip? by scotay · · Score: 0

    The Adama.

  23. I'll get excited when... by Darkon06 · · Score: 2, Insightful

    I see some results. So far there have been quite a few attempts at speech recognition. Generally they all fall short: they don't like accents, and they often misinterpret. I know because a while back we looked at something for my grandfather, he can't keep his hand steady enough to write anymore... *shrug*

  24. One million is a pretty small investment by samberdoo · · Score: 1

    The social, commercial and political usefulness of this technology is worth billions. Will this lead to the end of word processing by keyboard? Dr. Evil: "Here's the plan. We get the warhead, and we hold the world ransomed for.....One MILLION DOLLARS!!" No.2: "Ahem...Well, don't you think we should maybe ask for *more* than a million dollars? A million dollars isn't exactly a lot of money these days. Virtucon alone makes over nine billion dollars a year!"

  25. Good use of $1 million? by Threni · · Score: 3, Interesting

    Depends. It's not as good as using it to prevent the deaths of thousands - possibly tens of thousands - of people by ensuring they have clean drinking water and shelter from the elements. But hey - you can't put a price on being able to speak to a computer rather than type when you're ordering a pizza.

  26. History.. by SillyNickName4me · · Score: 4, Interesting

    From 1994 up to 1998 I did marketing and technical support for IBM's VoiceType Dictation products.

    Initially, doing anything beyond understanding a few words would take special hardware, but after a bit of 'training', highly accurate and fast speech to text was quite a possibility with a specially developed DSP.

    Then the Pentium class CPUs came about, and a P90 could just do the whole thing without the DSP.

    So, now someone is developing a new dedicated piece of silicon for this... let's see how long it takes for general purpose computers to catch up.

    The issue is not that this is not useful, but that it either has to keep developing, or offer a somewhat longer lasting price/performance ratio or much better features for a long time to come.

    1. Re:History.. by geordie_loz · · Score: 1, Interesting

      I considered this too.. the article does address this however.

      Small low-power units are useful for say a soldier's helmet, or in a PDA.

      I'd also say that the same thing happened with 3D cards, and they keep making them faster with more features, but you could play Half-Life with software 3D on a 2.x GHz PC looking pretty much the same as it did on a Voodoo card back in the day.

      The question is rather, would there be much future speed advances in hardware, or once it's built, would later software recognition do as well - a little like DVD hardware cards. I have an encore card, but software decoding beats it now, and my DVD decoding doesn't need to be any faster.

      I think the thing they're looking for is building some cheap (as) chips for embedded systems, like mobile phones and PDAs.

    2. Re:History.. by Anonymous Coward · · Score: 0

      you forget how much power a regular processor uses.....

    3. Re:History.. by giblfiz · · Score: 2, Insightful

      An excellent point. However, if one were to make something along the lines of a PDA or phone with voice recognition, the dedicated hardware would stay useful for much longer, because you not only need to wait for the CPUs to catch up, but they need to pull so far ahead that they can compete in power consumption as well. (Which may be entirely impossible.)

      Task-specific silicon becomes very useful when you don't have as much space/power/heat-dissipation as you want.

    4. Re:History.. by Jeff+DeMaagd · · Score: 2, Insightful

      These chips wouldn't go into a computer, there are numerous non-computer devices that could use good, low power speech recognition.

      Will a general purpose CPU fit or operate in a phone that can be on for a week? I almost never shut off the phone and it still lasts a week, and I don't want to sacrifice that run time for speech recognition.

      Granted, ARM chips are getting more powerful but the power consumption is still a limiting factor for their designs.

    5. Re:History.. by SillyNickName4me · · Score: 1

      > Task-specific silicon becomes very useful when you don't have as much space/power/heat-dissipation as you want.

      Definitely, and I see quite some uses for that...
      People use portable recorders for dictation now.. it would be cool to have such a small handheld device that does speech to text translation instead.

      10 years ago, a 486dx33 together with a specialized dsp could do it, so I bet with this new piece of silicon and some low class cpu it is also possible.

      Hmm. so that is where the possible niche for this hardware is I guess.

      A small sidenote regarding 3D graphics accelerators and general purpose CPUs: while you are absolutely right regarding the huge increase of CPU power offsetting the special hardware of such cards over time, we also get a new generation of 3D graphics hardware every so often, and there is a lot of room for improvement there still. Somehow development of the old DSP hardware was stopped as soon as CPUs became capable of doing the job.

    6. Re:History.. by SillyNickName4me · · Score: 1

      > Will a general purpose CPU fit or operate in a phone that can be on for a week? I almost never shut off the phone and it still lasts a week, and I don't want to sacrifice that run time for speech recognition.

      Good point, but with modern CPUs and no need to have the speech recognition software running all the time, I think you can indeed have a phone with a long-lasting battery, and still have a fairly powerful CPU in there for the few moments where you really need it. For doing relatively accurate speech to text translation you need something faster than a P90, and I think there are mobile phones out there that manage that already.

      Most of the time, a phone is left in 'standby' mode, which doesn't require much, given that you can tell the cpu to go sleep/slow down.

      Don't get me wrong btw, I do see a use for the speech recognition hardware, but I'd think that use is for rather specialized devices.

  27. Better approach by Lord+Kano · · Score: 3, Interesting

    Using specialised DSPs makes more sense to me than burning up generic CPU cycles. There have been many examples over the years of how a specialized DSP is more efficient and effective for a narrow task than a regular CPU. Look at portable MP3 players. They use tiny specialized DSPs to decode the files in a manner that is much more efficient than using a regular CPU.

    We'll still need to do traditional development to interpret the data from the DSPs. We'll need to parse the output so that we can use natural commands to control devices.

    "Coffee maker, brew 10 cups, strong."
    "Bathroom lights, on."

    Without some manner of AI to interpret them, these phrases will be useless.

    LK

    --
    "Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
    1. Re:Better approach by drinkypoo · · Score: 1
      You don't even need an AI, just a dialogue tree, a "this word leads to these words" kind of thing, and at the end a command is issued that does something. This is well-suited to your coffeepot example.
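
      A minimal sketch of such a "this word leads to these words" tree, using the coffee maker phrase from the parent post. The tree layout, the `<number>` placeholder and the command names are all hypothetical.

```python
# Hypothetical dialogue tree: each key is a word that may follow the words above it.
# "<number>" is a placeholder slot; a string leaf is the command to issue.
TREE = {
    "coffee": {"maker": {"brew": {"<number>": {"cups": {
        "strong": "BREW_STRONG",
        "weak": "BREW_WEAK",
    }}}}},
}


def walk(words, tree=TREE):
    """Follow the words down the tree; return (command, number) or None."""
    node, number = tree, None
    for word in words:
        if not isinstance(node, dict):
            return None  # extra words after a complete command
        if word in node:
            node = node[word]
        elif word.isdigit() and "<number>" in node:
            number = int(word)
            node = node["<number>"]
        else:
            return None  # this word cannot follow the previous ones
    # Only a string leaf is a finished command; a dict means the phrase is incomplete.
    return (node, number) if isinstance(node, str) else None
```

      At every step the recognizer only has to tell apart the handful of words that can come next, which is a much easier problem than open dictation.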

      Another approach, perhaps complementary, would be to accept a list of words and do something when you have enough to match one of the stored patterns. Light control is a good example. The microphone your voice is picked up on would provide one keyword, the location. You could override it by speaking the name of another location. Hence if you're in the bedroom, "lights on" (lights, on) is enough to turn the bedroom lights on, but "bathroom lights on" (lights, on, bathroom) before the timeout period between words (a second or two) would turn on the bathroom lights. This isn't rocket science, or even AI research...
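
      The keyword approach could look roughly like this. The pattern table, location names and command names are made up, and the timeout between words is not modelled: the function just receives whatever words arrived before it expired.

```python
# Hypothetical pattern table: a command fires once all its keywords are present.
PATTERNS = {
    frozenset({"lights", "on"}): "LIGHTS_ON",
    frozenset({"lights", "off"}): "LIGHTS_OFF",
}
LOCATIONS = {"bedroom", "bathroom", "kitchen"}


def interpret(words, mic_location):
    """Return (command, location), or None if no stored pattern is matched."""
    heard = set(words)
    # A spoken location overrides the one implied by the microphone.
    # If several locations are spoken, the smallest name is used to stay deterministic.
    spoken = heard & LOCATIONS
    location = min(spoken) if spoken else mic_location
    for keywords, command in PATTERNS.items():
        if keywords <= heard:
            return command, location
    return None
```

      Because the match is on a set of words, "bathroom lights on" and "lights on bathroom" behave the same.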

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    2. Re:Better approach by Anonymous Coward · · Score: 0

      Yeah, but wait until the day you're working on a project on your computer, and you make a stupid mistake, and you utter "Oh, fuck me!"

      ...And the computer takes that literally.

  28. Yay! Boo! Uh... Oh bugger.... by MooseByte · · Score: 4, Interesting

    From the blog: ''Homeland security applications are the big reason we were chosen for this award,'' says Rutenbar. ''Imagine if an emergency responder could query a critical online database with voice alone, without returning to a vehicle, in a noisy and dangerous environment. The possibilities are endless.''

    Like some slight tweaking in order to deploy massive voiceprint-recognition silicon arrays for amazingly efficient automatic realtime conversation transcription and identity determination, attached to Echelon.

    So cool... so potentially evil... head begins to hurt... tinfoil hat burning....

    1. Re:Yay! Boo! Uh... Oh bugger.... by Euphonious+Coward · · Score: 1
      This way the NSA could retire all those aging Alpha machines in the Echelon bunkers (the ones that scan every single long-distance or cell call made anywhere for naughty words) and replace them with many, many fewer specialized boxes, instead of buying an equal number of Itaniums.

      If your head wasn't hurting before, it should be now. Retire Alphas? Aaaugh! Forestall purchases of Itaniums? Mmmmm. (Thought: when those tens of thousands of Alphas are finally retired, will they show up on Ebay?)

    2. Re:Yay! Boo! Uh... Oh bugger.... by grasshoppa · · Score: 1

      So cool... so potentially evil...

      I know, I get all warm and fuzzy just thinking about it.

      Because, you know the government, I'd be seriously shocked if they went with anyone other than MS, and that provides endless amounts of possibilities for chaos.

      Gives me chills.

      --
      Mod me down with all of your hatred and your journey towards the dark side will be complete!
    3. Re:Yay! Boo! Uh... Oh bugger.... by Deliveranc3 · · Score: 1

      But potential for good is amazing too.

      Attach them to government officials and the real story can come out in a comprehensive and easy format.

      Concerns about privacy really mean that our society has huge multiple personality disorders.

    4. Re:Yay! Boo! Uh... Oh bugger.... by MP3Chuck · · Score: 1

      "so potentially evil"

      What isn't potentially evil? All your Internet packets are going through your ISP ... maybe they're sending your encrypted packets to government supercomputers and they're cracking the encryption.

      What if Google is just a front for Echelon as the government refines its data storage and indexing techniques?

      You can only worry about so many "What If's" before it goes from "Tinfoil Hat" to "Tinfoil Fallout Shelter" ...

    5. Re:Yay! Boo! Uh... Oh bugger.... by shfted! · · Score: 1

      I thought it was shown that only about 5% of calls went through echelon?

      --
      He who laughs last is stuck in a time dilation bubble.
  29. Pretty Ambitious, Harder than it sounds by Anonymous Coward · · Score: 5, Interesting

    Although $1 million can significantly speed things up, this is a pretty ambitious undertaking.

    My Master's research was on implementing machine learning in hardware, specifically support vector machines.

    Now, they have much more money than I did, and probably this will be a collaboration involving many graduate students, but converting complex algorithms from software to hardware is no easy task.

    It is just easier to do things in software, that's why it has evolved. The modular layers of abstraction allow a Computer Scientist working in machine learning or speech recognition to not have to worry about how the underlying hardware works.

    Working in hardware, you come face to face with a lot of these issues, particularly since you want an architecture on a chip, whereas in a conventional desktop/server system resources such as lots of RAM, hard drive space, etc. are available, and their interconnections have been built and refined over decades.

    Throw in concerns about small form factor and low power consumption, and quite fast a lot of unexpected hurdles pop up.

    My master's research goal was to produce a data mining/machine learning machine, or at the very least a data mining/machine learning co-processor. In retrospect, that was a very ambitious goal that would require many years of work, probably in collaboration with other graduate students.

    What I ended up doing was just Support Vector Machines in digital hardware. Now granted, there is another aspect to my research that I'm not mentioning here, mainly that I didn't use normal floating point mathematical architectures, but a different innovative logarithmic based mathematical architecture. That in itself was a significant undertaking.
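
    For readers unfamiliar with logarithmic arithmetic, here is a generic sketch of the idea, not the design described above: a positive value is stored as its base-2 logarithm, so multiplication becomes addition, while addition needs a correction term. The function names are made up, and sign and zero handling are left out.

```python
import math


def to_lns(x):
    """Encode a positive real as its base-2 logarithm."""
    return math.log2(x)


def from_lns(e):
    """Decode a logarithmic value back to an ordinary real."""
    return 2.0 ** e


def lns_mul(a, b):
    # Multiplying two values is just adding their logarithms.
    return a + b


def lns_add(a, b):
    # log2(2^a + 2^b) = hi + log2(1 + 2^(lo - hi)), with hi the larger exponent.
    # In hardware the second term is often approximated, e.g. with a lookup table.
    hi, lo = max(a, b), min(a, b)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))
```

    The appeal for a chip is that a multiplier is replaced by an adder; the cost is that plain addition becomes the awkward operation.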

    In any case, this sounds like a great project, I just wonder how much they can do in their (in an academic sense) very small time frame of 2-3 years. Even though a lot of preliminary work has probably already been done just to apply for the grant.

    In any case, it is great to see something like this, something to keep in mind in case I ever go back for a Ph.D.

    1. Re:Pretty Ambitious, Harder than it sounds by Anonymous Coward · · Score: 0

      "My Master's research was on implementing machine learning in hardware, specifically support vector machines."

      (* hunched shoulder *)
      Rrreallly? My Mathter's re-theach involved the electrical re-animation of lifeleth flesh.

      We are planning thome follow-up invethigathion into automated cath-el defenth and pitchfork wielding mob deterenth. We altho intend to improve our control mechanithemth.

      Mua-hahaha!

  30. Speech - text and text - speech by tod_miller · · Score: 1, Troll

    There isn't much overlap, but there is some. Signal processing, the breaking down of the nuances of speech.

    I figure a hardware speech processor and hardware speech synthesis (very very accurate and believable) would have a great use for mankind.

    Imagine how much cheaper sex chat lines would be for instance!

    They would only need a limited vocabulary, so perhaps the OS IBM stuff would work for now?

    Of course, I bet a patent will come out of this... voice technology that is very reliable and very easy will remove a whole interface. Talk back to your sat nav...

    "turn left"

    "I can't, it's bloody road works"

    "Turn left"

    "Damn you!"

    "turn left, turn left, you will be assimilated"

    "what did you say?"

    "erm, nothing, I mean, turn left"

    --
    #hostfile 0.0.0.0 primidi.com 0.0.0.0 www.primidi.com 0.0.0.0 radio.weblogs.com
    1. Re:Speech - text and text - speech by maxwell+demon · · Score: 1
      Imagine how much cheaper sex chat lines would be for instance!

      I think for that speech recognition/generation per se would not be enough. The speech must also come with the right tone. I don't think a sex chat line with a monotonous computer voice would be very successful. You'd at least need some simulation of an emotional state in the voice.
      Ah, and don't forget the non-verbal noises ...
      --
      The Tao of math: The numbers you can count are not the real numbers.
    2. Re:Speech - text and text - speech by tod_miller · · Score: 1

      I think we found our expert! :-) I once called a phone sex line, out of the back of FHM.

      My phone provider fsksked up an install (for someone else!) and I lost my line, I complained, and it was reconnected, with the wrong line... so I call, and I say casually to the guy (no this bit isn't the phone sex bit!) so am I liable for any calls I make on this line? he says, well no, because our billing system won't, oh I see.

      So I had a chat with this Chilean bird who was studying some shit, we spoke about everything except sex. *feels the whole of slashdot looking at me with a concerned expression* yeah, that is it. She had a brother.

      What I meant was, a hardware solution could provide very high fidelity vocals, including breathing rates, minor deviations, emotion, etc.

      If speech recognition goes well.

      Oh, the line was a 1 quid an hour line, I called it 3 times, each time I spoke for 20 minutes before the auto cut off. :-) that is 60 quid those bastards won't see again! muahah *cough*

      --
      #hostfile 0.0.0.0 primidi.com 0.0.0.0 www.primidi.com 0.0.0.0 radio.weblogs.com
  31. Who will own the IP? by Anonymous Coward · · Score: 0

    So.. Who owns the patents, etc, on this if they do it?

  32. $[ANYTHING] in Silicon .. by torpor · · Score: 1

    .. is better.

    Bring on the silicon, yeah baby, yeah!

    {oh, except %ONE thing, that is... right...}

    --
    ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
  33. Cellphone voice dialling by Anonymous Coward · · Score: 0


    Cellphones have had voice dialling for ages (3+ years). I simply say "call home" or "dial pizza" and my phone dials the number automatically. Presumably the DSP for this is on a chip, so I don't get what's new?

    1. Re:Cellphone voice dialling by Anonymous Coward · · Score: 0

      I think that's just matching a prerecorded voice, matching a pattern. This is recognition. Could you say to your phone, Dial 1234567890 and have it recognize each number?

  34. You bet it's worth it by Tairnyn · · Score: 3, Interesting

    Once this technology has matured and some more headway can be made in Natural Language Processing, (uncertainty for teh win) we'll be on the cusp of some really excellent improvements in human-computer interfaces. It's becoming more common to see 'intelligent' systems being built to mirror the architecture of the human nervous system. This will be a necessary step to forming a generally proficient AI system. The day a computer can readily recognize you're being sarcastic, it's time to be paranoid.

    --
    "Don't waste your time or time will waste you" -MUSE
  35. Blind users by melandy · · Score: 1

    You make an excellent point about blind users.

    My dad lost his vision a few years back, and we haven't really found
    anything terribly useful in the realm of speech recognition.

    He's tried out the little electronic phone/address book gizmo, but it took
    forever to train to his voice, a process that was a PITA to start with since
    you had to _READ_ what you were supposed to say to it off the screen,
    then whisper it to dad, loud enough for him to hear you, but not loud
    enough that your whisper would be picked up.

    So that went in the trash, and he's been using a microcassette recorder
    ever since. Not really the coolest way to do things, but it gets the job done.
    (It has an interface that a blind person can actually use)

    This sounds like a great idea, as long as they make it useable.

  36. brains are and probably should be modular by deathcloset · · Score: 2, Interesting

    This sounds like a great idea. Sometimes a Hammer works better than a screwdriver at a certain task. Not all Jobs can be performed as well by a single tool or method.

    After all, the human brain has different areas for processing different types of stimuli.

    In fact, some parts of our brain are so radically different they are almost considered brains of their own.

    Like the cerebellum; it's often referred to as "the small brain". This controls motor coordination - and in humans allows us to do amazing things like flips, kung-fu, and cup-stacking.

    And forgive me for forgetting the exact names, but the brain has layers as well, the outermost layer being the cortex (where most of the higher-level mammalian processing takes place - correct me if I'm wrong, the frontal lobe is pretty much purely cortical tissue). As you delve deeper you get into the hippocampus and medulla whatever (sorry, IANAN: I am not a neurologist), which is where emotion rules - and, if I again remember correctly, is sometimes referred to as the "reptilian" brain.

    Even the eyes themselves can almost be considered little 'brains' of their own - considering the amount of pre-processing they do (maybe a co-processor would be more accurate).

    make

    1. Re:brains are and probably should be modular by Anonymous Coward · · Score: 0

      "This sounds like a great idea. Sometimes a Hammer works better than a screwdriver at a certain task. Not all Jobs can be preformed as well by a single tool or method."

      Not true, at least when it comes to mathematics. A single, sufficiently fast CPU can do anything: solve any solvable equation, execute any algorithm, run any program and emulate any other (deterministic) system (including the chip in question).
      The only problems are speed, amount of storage, power consumption (which is why they are doing this), and the difficulty involved in programming the operating system or emulator for the program to run in - which is why you can't run Windows programs on a Mac.

      For more info look up Turing Machines.

      Doing this in a dedicated chip will probably be faster and more efficient, but it's not the only way - it'll probably never catch on for desktops since by the time it's completed regular CPUs will be able to perform the same calculations equally fast and won't require a dedicated piece of silicon. Doing everything you can on one general purpose CPU is cheaper.

    2. Re:brains are and probably should be modular by Anonymous Coward · · Score: 0

      A single, sufficiently fast CPU can do anything

      which explains why we have video cards!

    3. Re:brains are and probably should be modular by Anonymous Coward · · Score: 0

      Yeah, but think about it. Most graphics systems in PCs today are crappy on-motherboard ones. They do very little in hardware, and run essentially in software on the CPU. The only thing you truly need any extra hardware for is the digital-analog conversion required to get a VGA output.
      Everything else could be done in software, and in the future everything current GPUs do probably will be: e.g. Quake 2 could use hardware anti-aliasing, Halo only does it in software. UT2004 has a software renderer.
      If CPUs were "sufficiently fast", as I said, then graphics programmers would only need to do software rendering. Sadly, for power hungry gamers (like me), they aren't sufficiently fast - but for word-processor type users they are, which is why they only need a software rendered desktop.

    4. Re:brains are and probably should be modular by Anonymous Coward · · Score: 0

      hmm, point taken.

  37. Depends, how would it integrate with... by 192939495969798999 · · Score: 1

    pr0n? We all know that if there's a pr0n application, then the technology will be developed & shipped 100-1000x faster. Speech recognition + pr0n...
    of course, the obvious control of the system by speech (first steps towards a holodeck), but also you could identify who's in that video by their ... voice!

    --
    stuff |
    1. Re:Depends, how would it integrate with... by Kiryat+Malachi · · Score: 1

      The obvious answer is that typing one-handed just became typing no-handed.

      --

      ---
      Mod me down, you fucking twits. Go ahead. I dare you.
      (I read with sigs off.)
  38. Re:1... million... DOLLARS!!! No by Retric · · Score: 0

    I smell BS.

    Good speech-to-text does not take a really fast CPU; it takes a fast CPU + a good database + a fair amount of RAM. Your cell phone's CPU can handle "Call MOM" because it only needs to know MOM, DAD, SALLY, and maybe 20-30 other names. There are 40,000+ words in English. If you want to have a low-cost CPU, great, but you will need a lot of memory and permanent storage to get this to work.

  39. The UN would probably use this heavily by ARRRLovin · · Score: 2, Interesting

    With the advent of hardware speech recognition, hardware speech translation is just the next evolution. Imagine being able to go to any country in the world and have just an iPod-sized device and a Bluetooth hearing aid as a translator.

    --
    -Randy
  40. Then when I'm playing UT by Prince+Vegeta+SSJ4 · · Score: 0

    L33t D00d: I ownz j00

    Me: No you don't, eat sniper rifle

    *HEADSHOT*

    Me: Dammit

    *HEADSHOT*

    *Double Kill*

    Me: Sh!t

    [toilet flushes]

    *M..M..M..Monster Kill...Kill...killl*

    Me: F*ck

    [bed folds down]

    *L33t d00d is unstoppable*

    Me: Sh!t

    [toilet flushes]

    *L33t d00d is godlike*

    Me: gawd dammit

    [house explodes]

    L33t D00d: told ya, I ownz j00

    L33t D00d: hey, you still there?

  41. Spelling degradation by bchernicoff · · Score: 1

    The decline in legibility of handwriting due to the widespread use of keyboards has been discussed on Slashdot before, but taking it a step further, what effect do you think prevalent voice recognition will have on our ability to spell?

    On a side note:
    "I don't have lip fungus!"
    "Let it go."

  42. Your scale is too small. by Anonymous Coward · · Score: 0

    If you're looking at an embedded chip to interpret information, think about something large-scale: languages.
    If you had the processing power to interpret and understand language, tack that on to something like Babelfish as a translator program. Now you have something that fits on a chip that can translate from any number of languages into your own. Now you can stick a little hearing aid into your ear, and it will translate anything you hear to English, for example. This would revolutionize international communication. This would reduce the number of barriers between diplomats, making them more effective communicators. Also, it would save governments millions of dollars, euros, or any other form of currency in translator salaries, reduce miscommunication, prevent problems with misunderstanding criminals they are charging with crimes, and increase the quality of education among international/foreign exchange students.

    Drawbacks: Keeping up with changing language and slang will be quite difficult to include in older models without the capability of a firmware upgrade. Chip size and speed are a factor as well.
    This is, of course, assuming that the chip is smaller than the user's head.

  43. Re:1... million... DOLLARS!!! No by AKAImBatman · · Score: 1

    According to this link, the average length of an English word is 6 characters. At one byte per character (two if you use Unicode), we find that a database of 40,000 words would be anywhere from (40,000*6) = 240,000 bytes, or about 235 KB, up to about 470 KB in size. That's NOT much memory at all.
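
    For anyone who wants to check the arithmetic, here is a rough sketch of the same estimate. The word count and average word length are the figures quoted above; the 1024-byte kilobyte and the function name are my own assumptions, and this counts only the spelled-out text, not any sound data.

```python
# Back-of-the-envelope size of a plain-text word list.
# WORDS and AVG_CHARS_PER_WORD are the figures quoted in the comment above;
# treating a kilobyte as 1024 bytes is an assumption.
WORDS = 40_000
AVG_CHARS_PER_WORD = 6
BYTES_PER_KB = 1024


def word_list_size_kb(bytes_per_char: int) -> float:
    """Return the size in KB of WORDS words at the given character width."""
    total_bytes = WORDS * AVG_CHARS_PER_WORD * bytes_per_char
    return total_bytes / BYTES_PER_KB


if __name__ == "__main__":
    # One byte per character versus two bytes per character.
    for label, width in (("1 byte/char", 1), ("2 bytes/char", 2)):
        print(f"{label}: {word_list_size_kb(width):.1f} KB")
```

    That lands on roughly 234 KB and 469 KB, which matches the rounded figures above.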

  44. Very good use unless... by WindBourne · · Score: 1

    you are working a job as customer support. I suspect that this will be used to help replace customer support, or possibly to change somebody's accent so that they appear to be from Boston rather than from India.

    --
    I prefer the "u" in honour as it seems to be missing these days.
  45. To, two, too by CyberLord+Seven · · Score: 1
    Our, hour.

    This is very good but English is not a good language for voice to text translations. There are far too many homonyms (sp).

    --
    We have always been at war with Eurasia!
    1. Re:To, two, too by j_cavera · · Score: 2, Insightful

      Speech recognition is a two-part process. The silicon is to speed up part one: word recognition. The first thing to do is to figure out that the person is saying:

      Computer, set timer for (to|too|two) (ours|hours).

      Step two changes that into: ... two hours.

      based on context. That's where the AI programmers get their turn at the problem.
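
      A toy sketch of what step two might look like, purely for illustration. The candidate lists stand in for whatever step one would emit, and the bigram scores are made-up numbers standing in for a real language model; a real recognizer would use statistical models and something smarter than brute force.

```python
from itertools import product

# Assumed output of step one: for each spoken word, every spelling
# that sounds the same. These lists are hypothetical.
CANDIDATES = [
    ["computer"], ["set"], ["timer"], ["for"],
    ["to", "too", "two"],
    ["ours", "hours"],
]

# Made-up scores for adjacent word pairs; pairs not listed score 0.
BIGRAM_SCORE = {
    ("for", "two"): 2,
    ("two", "hours"): 3,
    ("to", "ours"): 1,
}


def score(words):
    """Sum the scores of every adjacent word pair in the sequence."""
    return sum(BIGRAM_SCORE.get(pair, 0) for pair in zip(words, words[1:]))


def disambiguate(candidates):
    """Try every combination of spellings and keep the best-scoring one."""
    return max(product(*candidates), key=score)


if __name__ == "__main__":
    print(" ".join(disambiguate(CANDIDATES)))
```

      With these scores the winner is "computer set timer for two hours". Brute force is fine for six words but blows up quickly, which is part of why the context step is the hard one.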

      --
      #include "humorous_pop_culture_reference.h"
    2. Re:To, two, too by CyberLord+Seven · · Score: 2, Insightful

      Exactly, and that's where the real problem lies. If people think it's going to be difficult to identify the same word spoken by people from different regions then they probably have not given much thought to the fact that many words with different meanings sound the same in English and also that there are phrases such as "fat-chance" and "slim-chance" that mean exactly the same thing.

      --
      We have always been at war with Eurasia!
  46. Speech recognition needs a radical UI change by DrSkwid · · Score: 1


    Layering a speech-to-commands layer over the current systems is very problematic.
    The Star Trek nonsense of 'computer! get me all the data on ship X'
    [and why does Data talk to the computer, surely he's Wi-Fi enabled ? ] is plainly wrong.

    I found using ViaVoice and friends physically tiring; talking all day instead of typing is quite draining.

    Now sit yourself in an office with 20 or so colleagues all trying to work - talking out loud all day.

    It's pretty much like touch screens - they sound great until you actually get one and you find out all that investment was pretty much a waste of time except for niche markets.

    --
    There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
  47. The Future is.... maybe not now by Mr.Senator · · Score: 1
    This is quite amazing in the idea not just of speech2text, but of a different mode: the Universal Translator. It would be very nice to have dictation with the accuracy of a good secretary, or an army of them, at the board meeting, but what if those at the board meeting are from different countries? Set up the computer, have it fed through a blah-to-blah dictionary and print it out the other side in a completely different language! This has more uses than just dictation. Offhand, this is invaluable to things like the UN, NATO, the EU, international diplomacy, and we can even understand what the hell those operators in India are actually saying!

    The other thing that this spells to me (haha I made a funny) is the specialization of computer components. Rather than having 1 main processor and a sorta second and third in the North Bridge and GPU, we have dedicated processors for the different functions! One for the graphics, sound, integer math, floating point math; the possibilities are endless. This may be a futuristic idea, but the practical uses will be more general advancements in how computers are used and thought of.

  48. Kramer speech-enabled movie listing by vurg · · Score: 1

    KRAMER: Hewwo and welcome to Movie phone. If you know the name of the movie you'd like to see, press one.

    GEORGE: Come on. Come on.

    KRAMER: Using your touch-tone keypad, please enter the first three letters of the movie title, now.

    (George presses 3 keys)

    KRAMER: You've selected ... Agent Zero? If that's correct, press one.

    GEORGE: What?

    KRAMER: Ah, you've selected ... Brown-Eyed Girl? If this is correct, press one.

    (George looks baffled)

    KRAMER: Why don't you just tell me the name of the movie you've selected.

    GEORGE: Chunnel?

    KRAMER: To find the theater nearest you, please enter your five digit zip-code, now.

    (George enters his zip-code)

    KRAMER: Why don't you just tell me where you want to see the movie?

    GEORGE: Lowes Paragon, 84th and Broadway.

    KRAMER: (picks up paper) Chunnel, is playing at the Paragon 84th Street cinema in the main theater at 9:30 PM.

    GEORGE: Yeah, now I gotcha! (hangs up the phone and rushes out the door)

    KRAMER: It's also playing in theater number two at 9:00.

  49. Disgruntled Employees by Rufus88 · · Score: 2, Funny

    Now, disgruntled ex-employees won't return to the office to "go postal", so to speak. They'll just run up and down the hallway yelling "File! Exit! No!".

    1. Re:Disgruntled Employees by BLKMGK · · Score: 1

      Umm, I've actually seen video of a demo where that was done. A guy was attempting to demo some software and shouting from the crowd, from different people no less, managed to hose the guy's computer. I believe they were formatting the drive but cannot clearly recall - I'd love to find that clip again though! (lol) What was funny was that it was obviously not planned and I seem to recall each participant sounding as if they didn't quite believe that their particular command would be taken - and it was!

      --
      Build it, Drive it, Improve it! Hybridz.org
  50. Re:1... million... DOLLARS!!! No by Anonymous Coward · · Score: 0

    Um. Storing the text part of the word isn't the issue. It's the sound part, which requires more than 6-12 bytes per word.

  51. Re:1... million... DOLLARS!!! No by soltarusprime · · Score: 2, Informative

    You are forgetting the coded phonetic context of a word and distillations for "known dialects". Besides dialects, English is rife with words that sound the same yet mean different things, or even sound (slightly) different depending on the surrounding contextual words and whether it is a statement, question or exclamation (different intonations). Feel free to multiply that K figure by up to 1000 times.

  52. Compare this with graphic cards ... by YeeHaW_Jelte · · Score: 1

    ... maybe it'll do the same thing for speech recognition as separate processors have done for graphics, notably 3D graphics. When I was mucking around with computers as a youngster, I could only dream of the likes of Quake 3 & Doom 3. Most computers had a crappy CGA or _whew_ maybe even an EGA adapter on board. GPUs have made things not so much possible as feasible that weren't so before ... maybe a separate chip for speech processing will have the same effect. I mean, we've been talking about speech recognition for decades now, only it's been going totally nowhere.

    --

    ---
    "The chances of a demonic possession spreading are remote -- relax."
  53. Re:1... million... DOLLARS!!! No by AKAImBatman · · Score: 1

    Phonetics. It's quite uncommon to store the complete sound of the word. Not only would it be redundant, but it would be of no practical value to the computer.

  54. Perception & Conception split by jazmataz23 · · Score: 1
    Check into Douglas Hofstadter's work in cognitive science. He argues quite effectively that how we conceive of the signals we receive is inextricably tied to the perceptive act.

    Human beings are very adept at making quick judgements on the stream of information we receive from the senses. We then follow along logical paths from those judgements, but we also quickly discern if we're headed down a wrong track and will "re-view" the evidence we've been given. His philosophy is that if you segment AI into perception and cognition, you're missing a fundamental feature of human intelligence.

    Go to his page at UI, check his wiki, or better yet read his books.

    jaz

    --
    Death to Argument by Slogan!! (This post twice-encrypted with ROT-13. Replies not using same will be ignored)
  55. Well done please! by Roger+W+Moore · · Score: 1
    If you could say "Climate Control, 70 degrees", and other commands...

    That might be a somewhat dangerous command to have as it would probably lead to many cooked American visitors who rented cars in Canada or Europe (or in fact almost anywhere outside the US!). In fact I can see that the headlines of tomorrow might be subtly different from those of today:

    "NASA looses Engineer after spacecraft gets units wrong"

    1. Re:Well done please! by Anonymous Coward · · Score: 0

      Main Entry: loose
      Pronunciation: 'lüs
      Function: adjective
      Inflected Form(s): looser; loosest
      Etymology: Middle English lous, from Old Norse lauss; akin to Old High German lOs loose -- more at -LESS
      1 a : not rigidly fastened or securely attached b (1) : having worked partly free from attachments (2) : having relative freedom of movement c : produced freely and accompanied by raising of mucus d : not tight-fitting
      2 a : free from a state of confinement, restraint, or obligation b : not brought together in a bundle, container, or binding c archaic : DISCONNECTED, DETACHED
      3 : not dense, close, or compact in structure or arrangement
      4 a : lacking in restraint or power of restraint b : lacking moral restraint : UNCHASTE
      5 a : not tightly drawn or stretched : SLACK b : being flexible or relaxed
      6 a : lacking in precision, exactness, or care b : permitting freedom of interpretation
      7 : not in the possession of either of two competing teams

      Main Entry: lose
      Pronunciation: 'lüz
      Function: verb
      Inflected Form(s): lost /'lost/; losing /'lü-zi[ng]/
      Etymology: Middle English, from Old English losian to perish, lose, from los destruction; akin to Old English lEosan to lose; akin to Old Norse losa to loosen, Latin luere to atone for, Greek lyein to loosen, dissolve, destroy
      transitive senses
      1 a : to bring to destruction -- used chiefly in passive construction b : DAMN
      2 : to miss from one's possession or from a customary or supposed place
      3 : to suffer deprivation of : part with especially in an unforeseen or accidental manner
      4 a : to suffer loss through the death or removal of or final separation from (a person) b : to fail to keep control of or allegiance of
      5 a : to fail to use : let slip by : WASTE b (1) : to fail to win, gain, or obtain (2) : to undergo defeat in c : to fail to catch with the senses or the mind
      6 : to cause the loss of
      7 : to fail to keep, sustain, or maintain
      8 a : to cause to miss one's way or bearings b : to make (oneself) withdrawn from immediate reality
      9 a : to wander or go astray from b : to draw away from : OUTSTRIP
      10 : to fail to keep in sight or in mind
      11 : to free oneself from : get rid of

      GET IT RIGHT, DAMNIT!

    2. Re:Well done please! by Roger+W+Moore · · Score: 1

      Main entry: loser
      Pronunciation: 'lüz-er
      Function: noun
      1. Someone who pedantically points out minor typos in other people's posts.

  56. Beats doing it in software by JUSTONEMORELATTE · · Score: 1

    Carver Mead (at Caltech last I heard) was pioneering work to take neural processes such as vision and hearing, and model them in silicon via custom-fab VLSI circuits. This is a MUCH better approach to modelling these processes, since your neurons process the information in massively-parallel, simple-circuit networks.
    The traditional approach was to take a (completely) serial CPU and have it iterate over sampled data using a complex model of the naturally occurring network.
    It seems like a no-brainer to me, but I doubt that $1million will be much when all is said and done.

    But I freely confess that I haven't RTFA. :)

    1. Re:Beats doing it in software by TheSync · · Score: 2, Insightful

      So far, analog neuromorphic VLSI has hit a dead-end in terms of real applications. Also digital signal processing has been speeding up to the point where it can go almost as fast as a lot of the parallel analog models.

      The one exception is that the work on analog retina models led to the development of the Foveon X3 technology, which is just packing R, G, and B CMOS sensors into a single vertical column on a chip. But again, the neuromorphic part of the retina model is not the X3 technology; the X3 technology is stacking CMOS sensors.

      Analog neuromorphic VLSI did have one big result: the electrical engineers managed to teach the biologists a lot about signal processing, and the cross-pollination of this knowledge has led to discoveries such as ripple analysis in auditory cortex.

    2. Re:Beats doing it in software by JUSTONEMORELATTE · · Score: 1

      I realize that Dr Mead's Foveon has narrowed down to digital imaging products, but one of his PhDs, Rahul Sarpeshkar, has gone on to a professorship at MIT, and has taken the cochlear implant one step beyond what was done at Caltech.
      That article is from May of 2003, and I don't know if they have reached human trials yet, but I'd hardly call it a dead-end.
      Digital SP has sped up a lot in the past decade, of course, but the process that we're seeking to model is still on the order of tens of thousands (for auditory) to tens of millions (for visual) of signals being processed in parallel. It's just plain nuts to try to model this with a general-purpose, serial processor. Particularly when there is a reasonably mature science of building specific parallel-input hardware to mimic the natural processes.

      But then again, I'm just a hobbyist in the field. I'm neither a researcher nor a professional in the area, so take my thoughts with the appropriate grain of salt.

    3. Re:Beats doing it in software by TheSync · · Score: 1

      Well, the article says "Dr. Sarpeshkar expects the chip to be available commercially within two years." So I guess we will see in 2006 if some real useful product is built with analog neuromorphic chips. I'd be happy to see such a thing. But I doubt it...

      I am co-inventor on an analog VLSI cochlear model patent (that has since run out...) We talked with hearing aid and cochlear implant companies, but at the time they said they were happy to use standard analog filter technology (op-amps and R/C) in their devices, and didn't see a need to change it.

  57. Re:1... million... DOLLARS!!! No by soltarusprime · · Score: 1

    I would prefer that my computer be able to differentiate between there;their;they're and eight;ate to name a few. For simple commands ("Computer Lights On") it would be fairly useless, but what I mentioned would be a drop in the bucket compared to the basic AI needed to do a decent speech-to-text-completely-handsoff solution.

  58. Hardware solution to software problem by Anonymous Coward · · Score: 0

    This is not a good use of $1 mil. This is an attempt to throw hardware at a problem that software should solve.
    Speech recognition has been stuck in its use of neural nets for far too long. It is very possible to vastly reduce the hardware requirements by making the software smarter. In speech recognition, the most logical way to do this is to RECOGNIZE that the signal is SPEECH. Speech is unique from many other signals, and there are volumes of linguistic research that show how speech is unique. Application of linguistic knowledge can make speech recognition vastly more efficient.
    Neural nets are a highly inefficient way to attempt to recognize speech. I defy anyone to really be able to demonstrate in a detailed way what a neural net is doing when it attempts to recognize speech. This is a blind alley that will keep requiring more muscular hardware. Instead, take a first pass at speech data using linguistic knowledge. This will GREATLY reduce processing overhead.

  59. Just Remember.... by mdielmann · · Score: 1

    ...it's only $1 Million for the first chip. All the other chips cost about 35 cents. Assuming it works, of course.

    --
    Sure I'm paranoid, but am I paranoid enough?
  60. Re:1... million... DOLLARS!!! No by AKAImBatman · · Score: 1

    Fair enough. We can assume that the phonetic representation is similar to Unicode (i.e. up to 65,000 unique phonemes), so that would double the storage. Then assume we need data about each phoneme. Now, English has about 45 phonemes, which is actually above average for a language. If we assume that the computer stores about 4-8 times that many (different samples used as ranges for interpolation), you still don't have that many samples. A few megs at most.

  61. Good use of $1 million? I don't know... by ThatsNotFunny · · Score: 1

    I don't know, let's ask the chip...

    --
    "Was it a millionaire who said 'Imagine No Posessions?'" -- Elvis Costello
  62. Let me ask my computer by Anonymous Coward · · Score: 0

    Good use of a million dollars? Let me ask my computer...

    "Hello, computer." - Scotty

  63. Re:1... million... DOLLARS!!! No by AKAImBatman · · Score: 1

    That's already handled by speech-to-text programs. They handle this issue by making a contextual "guess" of which word to use. This is especially important as most English speakers fail to properly enunciate their words. E.g. affect and effect are pronounced slightly differently, but most people incorrectly pronounce them with a short 'a', i.e. 'u-ffect' instead of 'a-ffect' and 'ee-ffect'.

  64. backwards: do software first by peter303 · · Score: 1

    I'd hesitate "siliconizing" an algorithm before I knew what the best algorithm was. People have been working on this problem for 50 years. They have some reasonable solutions for slow speech. But there are still clever things to be discovered. You can always test it on a supercomputer, or slowed down speech.

  65. NSA: Imagine a Beowulf cluster of these ... by peter303 · · Score: 3, Funny

    National Security Agency: "We did, and they are hooked to the national phone system."

  66. Live Chat & Search by LionKimbro · · Score: 2, Interesting

    With voice software, you can already speak in real-time, conference style. I think Skype supports 5 people.

    With speech-to-text, you could log all conversation to IRC.

    Then you could have search engines that search *all conversation within the last 5 minutes, world-wide.*

    Well, at least all conversation that was okay with being public.

    So you could say, "Show me all conversations that are going on right now about Python," and immediately find the people talking about Python, wherever they were.

    One step towards the HiveMind.

  67. Automated Listening? by NotQuiteReal · · Score: 1

    Heck, most people hear just fine, and lots of us don't listen. Just ask any wife, mother, professor, boss...

    --
    This issue is a bit more complicated than you think.
  68. NSF Funded? by jonathanhowell · · Score: 2, Funny

    NSF, to me, translates to "Non-Sufficient Funds" or a bounced check.

    I can tell you from personal experience that this method of "funding" only works for the short term.

    Jonathan

  69. Getting OT... by Anonymous Coward · · Score: 0

    I was just trying to poke fun at all the people who post on slashdot now in slight astroturfing mode. "The company I work for make product X! It'll save the world!"

    I don't see anything wrong with that. In the old days, Slashdot was all about new products and projects (and MS-bashing, Linux-loving), no yro, no politics. A lot of us still want to hear about new products and as long as the submitter correctly identifies themself as the developer, there is no problem. That's the way it's been from day 1. It's the true astroturfers who pose as just an interested third party who are doing it wrong.

    If you don't like product announcements on Slashdot, ignore them and stick to the other areas. Myself, I don't like all the YRO articles, but I don't add comments to them poking fun of YRO discussions.

  70. Good Use by DingoBueno · · Score: 1
    Good use of $1 million?
    Absolutely. As long as it's spent in the US. I hope they look to native sources for materials, manufacturing, etc.
    --
    ascii art
  71. Um..not exactly by Strange_Attractor · · Score: 2, Funny

    something just about but not completely unlike tea

    --

    ----
    WWJD...For a Klondike Bar?
    1. Re:Um..not exactly by soltarusprime · · Score: 1

      Again, paraphrasing not quoting was the point of it all. While I haven't read the book in a while - I have my 7 year old reading it and reading it to him as one of his introductions to "chapter-books".

    2. Re:Um..not exactly by aardvarkjoe · · Score: 1
      Again, paraphrasing not quoting was the point of it all.
      What strange planet do you come from where "paraphrase" means "to say the opposite?"
      --

      How can we continue to believe in a just universe and freedom to eat crackers if we have no ale?
  72. Efficiency vs. accuracy? by gr8_phk · · Score: 1
    I don't really care if they can make it 1000 times more efficient. My PC can do speech recognition "fast enough" but it needs more accuracy. Can 100 to 1000 times the computation actually improve the accuracy of today's algorithms? If so, this is great. If not, I would say the goal is simply to monitor 1000 phone calls at once with a single chip with today's accuracy.

    So is it like chess where faster translates to better?

  73. A different thing is language recognition by Anonymous Coward · · Score: 0

    One thing is "speech recognition" or speech-to-text. Another different beast is "language recognition" or text-to-meaning.

    If you say "open the second drawer from the right and close it after ten seconds", it could be relatively easy to recognize the sound and convert it to text, to take written notes from whatever people talk about. But it's much, much harder to design an "intelligence" capable of really understanding what you *mean*; human language is not easy, even for humans...

  74. speech recognition and deaf/hard-of-hearing by CrudPuppy · · Score: 4, Insightful


    Making quantum leaps in speech recognition has tremendous potential for deaf and hard-of-hearing people (I am the latter).

    Imagine being in a meeting (almost always a problem for hearing impaired people) and having real-time subtitles.

    $1 million is a TINY price considering upwards of 20% of the nation has some hearing loss and hearing aids cost on the order of $4000 a pair.

    --
    A year spent in artificial intelligence is enough to make one believe in God.
  75. Details? by karlandtanya · · Score: 1

    Is this really something radically new?

    Or is it just a PGA.

    Gate arrays are very fast and very limited. So, prototyping one would take lotsa bux. But it wouldn't really be anything to brag about as a technical achievement.

    Article seems kinda short on technical details.

    And no, saying "We have a really cool algorithm that's ready to commit to silicon. So, we're going to make a PGA. Then make a billion more just like it." would NOT give away any trade secrets.

    --
    "Reality is that which, when you stop believing in it, it doesn't go away." - Philip K. Dick
  76. Read My Lisp by Anonymous Coward · · Score: 0

    How long until they can get the computer to read lips?

    1. Re:Read My Lisp by CyberLord+Seven · · Score: 1

      Good Morning, Dave. I am an HAL 9000 computer.

      --
      We have always been at war with Eurasia!
  77. Hallo by wan-fu · · Score: 1

    Eye am yousing dis tex knowledge E all ready

  78. just try analogical cnn by l3v1 · · Score: 1

    This chip that recognizes voice patterns fast... It seems like reinventing the wheel. Why? Because analogical algorithms implemented on CNN (meaning cellular neural network) chips, on real hardware, could already do that (and can do much more, as I know from some researchers who work on real projects in the field).

    If somebody asked me about this (why would they :P ), I'd say invest in something that could be more beneficial in the long term, which in this case would be CNN research.

    Don't get me wrong, I really appreciate hearing about these kinds of investments, because they will serve a very good purpose. Hell, I am one of those guys who have been dreaming about real voice-controlled computers since my first contact with Star Trek :D And of course my computer would have a female voice (I'd even know who to sample :D )

    --
    I am putting myself to the fullest possible use, which is all I can think that any conscious entity can ever hope to do.
    1. Re:just try analogical cnn by nusratt · · Score: 1

      l3v1, what country are you in?

  79. $1 mil? by mmmmmhotpants · · Score: 1

    $1 mil doesn't really go all that far in the research world. That could fund maybe 5 graduate students for 3 years, and would leave maybe just enough money for the purchasing. I'm not sure if in the end a chip will come of it, but it's definitely a worthy start.

    --

    can't sleep. clowns will eat me.
  80. "Fetch me a Beer!" by Anonymous Coward · · Score: 0

    Hmm,

    Speech Recognition Chip ~ $1,000,000
    Household Robot ~ $2,900
    Not missing a single play of the game - Priceless.

  81. Tech doesn't kill people, people do. by Anonymous Coward · · Score: 0

    "My friend and I were talking about this. In countries that are more totalitarian, it could be used to root out "dangerous people" www.geocities.com/James_Sager_PA"

    But of course. Remember, however, that we're a geek site and hence pro-tech even if it can be used to enslave people (Just wait till that forehead chip comes out. 2 GB of RAM. Whoo Hoo!). To paraphrase: "It's not the tech that enslaves people. It's the people who enslave people." It's just that tech makes it soo much easier.

  82. Good use? Sure. by AndyChrist · · Score: 2, Funny

    As it is, it's a tossup whether I prefer speaking with a machine or a customer service rep in India. Won't take much for a machine to surpass most of them in English speech recognition. (Alright, to be fair, there are some Indians I've gotten on the phone who have been at LEAST as good as the typical US-based rep. But that's a minority.) Anything to advance the technology.

  83. Eye use it! by stfvon007 · · Score: 2, Funny

    Eye use peach recon ingition proton now. Sea how wood it works? Eye love his sea check ignition pro gram. don't ewe tank hugh should met won?

    --
    All misspellings and grammatical errors in the above post are intentional and part of my artistic expression.
  84. Not what it seems .... by Bruchpilot · · Score: 1

    Read closely. The target of this research is not to improve speech technology itself, but rather to make current speech recognition technology run on devices with limited capacity (e.g. cell phones and PDAs).

    To improve speech recognition technology itself you do not need more computing power, you need a totally new approach. The current approach, based on probabilities of word combinations and hidden Markov models, can work fine for narrow applications, like dictating medical reports that follow a certain model, but quickly reaches its limits in a totally open environment.
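
    For readers unfamiliar with the "probabilities of word combinations" idea, here is a minimal sketch, assuming a toy bigram model with add-one smoothing. The corpus, the function names, and the candidate sentences are all made up for illustration and are not taken from any real recognizer; the acoustic (hidden Markov model) side is left out entirely.

```python
from collections import Counter
import math

# Toy training text (hypothetical): a narrow domain, like the
# medical reports mentioned above.
CORPUS = [
    "patient reports chest pain",
    "patient reports no pain",
    "patient denies chest pain",
]

def train_bigrams(sentences):
    """Count unigrams and bigrams, with <s> marking the sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed log-score of a word sequence under the bigram counts."""
    vocab_size = len(unigrams)
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        # Counter returns 0 for unseen words and pairs, so no KeyError here.
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total

def pick_transcript(candidates, unigrams, bigrams):
    """Choose the candidate word sequence the language model scores highest."""
    return max(candidates, key=lambda s: log_prob(s, unigrams, bigrams))

UNIGRAMS, BIGRAMS = train_bigrams(CORPUS)
```

    Inside the narrow domain the model can separate "chest pain" from the similar-sounding "chess pain", because it has seen the former. For an open-domain sentence every word pair is unseen, so all candidates get the same uninformative smoothed score, which is roughly the limit described above.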

    And we are talking here just about the speech to text aspect. Forget anything about the computer trying to understand the meaning, apart from simple command and control.

  85. A good goal, but.. by siveys · · Score: 1

    I think this research does not seem very promising in the long run.

    I think that applying only more computing power to audio recognition does not solve the underlying problem of the complexity of speech recognition.

    For some people it is of course a nice-to-have gizmo that they can command their cell phone with short sentences, but this is really far from speech-to-text dictation and natural language interfaces.

    I personally think that to achieve the "bigger goals" we need to concentrate more on context- and sentence-aware solutions. Speech recognition can at its best be highly educated guessing of arbitrary human tone sequences - that's what we humans do too.

    For these more desirable goals, the audio analysis which this project concentrates on is only a very small (although vital) subset of the process, and thus optimizing it with hardware seems fairly insignificant to me.

  86. Not only deaf. by Thomas+Shaddack · · Score: 1
    Imagine being in a meeting and having real-time subtitles.

    That would be a big advantage even if you hear well but just have trouble concentrating for a prolonged time. If you let your thoughts wander off for a moment, you just read the last couple of lines of the log.

    For teleconferences, this would also make it easy to participate in more conferences at once. Like having several IRC windows open.

    With an automatic translating system, it would help even with multi-language meetings (and, given the inherent features of machine translation, lead to many funny situations - maybe the translators should be aware of ambiguities and show all the possible meanings).

  87. A million is not alot in academia by obiquity · · Score: 2, Insightful

    I am an assistant prof at a major research institution, and $1,000,000 is not as much as you would imagine. Firstly, most universities take ~50% of grants immediately as overhead. You're down to 500K. Second, this is spread out over 4 - 5 years, so now you're down to about 125K a year. Third, if we have grants, we profs are required to pay our own summer salaries. On average this could be 25K, so you're down to 100K/year. In science and engineering we are expected to pay our grad students if we have grants. Yearly salary with additional overhead (in the US; Canada is a bit less) comes to almost 50K/year. A post-doctoral researcher would be hard to find for less than 50K/year with overhead. So really it supports a grad student and a post-doc and maybe some equipment for four years. Compared to the resources of industry it sometimes seems kind of puny. But the freedom is worth it. Just some info, OBQT
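
    The back-of-envelope budget above can be written out as a few lines of arithmetic. This is only a sketch using the rough figures quoted in the post; the 4-year duration is an assumption picked because it matches the 125K figure (the post says 4 - 5 years), and the constant and function names are made up for illustration.

```python
# All figures are the rough ones quoted in the post above, in US dollars.
GRANT_TOTAL = 1_000_000
OVERHEAD_RATE = 0.50     # the university's cut, "~50%"
YEARS = 4                # assumption: the post says 4 - 5; 4 gives 125K/year
SUMMER_SALARY = 25_000   # per year
GRAD_STUDENT = 50_000    # per year, with overhead ("almost 50K")
POSTDOC = 50_000         # per year, with overhead

def yearly_budget(total, overhead_rate, years, summer_salary):
    """Money left per year after overhead and the professor's summer salary."""
    after_overhead = total * (1 - overhead_rate)
    return after_overhead / years - summer_salary

def left_for_equipment(per_year, *salaries):
    """Whatever remains each year once the listed salaries are paid."""
    return per_year - sum(salaries)

per_year = yearly_budget(GRANT_TOTAL, OVERHEAD_RATE, YEARS, SUMMER_SALARY)
remaining = left_for_equipment(per_year, GRAD_STUDENT, POSTDOC)
print(f"available per year: ${per_year:,.0f}")
print(f"left for equipment: ${remaining:,.0f}")
```

    With these round numbers the grad student and the post-doc use up the whole 100K per year, so "maybe some equipment" only happens if the salaries come in under 50K each. Setting YEARS to 5 makes the picture tighter still.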

  88. Where are the details? by hokiecomputerenginee · · Score: 1

    What I want to know are the architectural details! What are they doing in the silicon -- a basic artificial neural network? What DSP algorithms are they implementing? Are they recognizing at the phoneme level or at the word level? If these questions have been answered and they're just not publicized, that's great. If they have not been answered, we will not have much after 2 years of research. Maybe I should just contact the prof directly...

  89. Good use of a million dollars? by TheLoneGundam · · Score: 1

    I dunno, I'll have to ask my computer.

  90. SR + Google + OpenCyc by Em+Adespoton · · Score: 1
    One of the things I'm really anticipating is such a speech-to-text processor that can work with OpenCyc's Natural Language Processor, so that we could interact with a truly intelligent system. Imagine, you say, "Computer, Earl Grey, Hot," and the computer responds with, "There are a number of meanings for what you said. Based on your previous queries, I expect you want the Tea Machine to steep you a cup of Earl Grey Tea. Is this correct?"

    Of course, using such a processor, OpenCyc would also be able to use the video camera at your front door to ID you as you approach, open the door for you, and say "You have 5 new voicemail messages, one from 555-6789, from someone who sounded like your mother. Her tone was urgent. Would you like to listen to this message first?"

    I haven't even got to Google integration yet, but that was mainly added as a way to get people to read this ;) OpenCyc can already do independent Google searches and collate the results.

  91. New architecture alone won't do it by slobber · · Score: 2, Insightful

    This should be about algorithms, not architecture. Anything they can do in silicon can and should be implemented and perfected in PC software first. I don't care if it takes a PC 10 minutes to recognize a 10-second sentence as long as it does it accurately. As soon as that happens, then by all means cut its power consumption and speed it up x1000 by doing it in silicon. If all they are doing is speeding up existing, relatively low-accuracy algorithms, then their effort is of limited use.

    To be honest, I doubt that putting a few clever algorithms together will ever achieve any respectable accuracy, no matter how fast those algorithms are. Sure, it might accurately recognize words from a limited vocabulary when spoken clearly and/or in simple sentences. If this is their goal, then it is quite achievable. It sounds to me, though, that they are aiming much higher, as in "dictating a detailed email". I think that many things have to happen for that, from effective noise filtering to proper phonetic model representation to parsing to content-based correction. The latter step is especially problematic since it requires a huge knowledge database, which takes humans years to accumulate. I am not saying that these difficulties are insurmountable, but simply that their goals are too ambitious for the current state of our technology and knowledge. I'd love to be proven wrong on that account though.

    --
    "You mortals are so obtuse." -Q
  92. national security? by bob_jenkins · · Score: 2, Interesting

    Why are they talking about querying online databases for 911 calls as the national security app? It's obvious the national security app is to translate every single phone call to text and store them (indexed) in a classified database. I've attempted to believe the US wouldn't do this because it's illegal, but I can't manage to suspend disbelief. The only way to avoid this is if phone calls are encrypted and the US doesn't have the keys.

    1. Re:national security? by bob_jenkins · · Score: 1

      I take it back. The database of phone calls probably already exists. The obvious application for these chips is to record every conversation in every public place in the US and store it, indexed, in a classified database.

      I'm not so much against the database, as against access to it being classified. I think I'd be happy if the people could do what the government could do.

    2. Re:national security? by burns210 · · Score: 1

      And you just KNOW Uncle Sam will need one of these to search against all that information.

  93. free speech recognition by museumpeace · · Score: 1

    Just a million? Pfft! I went down the tubes with one S.R. startup back in '92 that ate far more of some VC's money than that. Now NSF is not in it to get rich and I hope I am right in assuming that a successful chip design, if a mere $1000000 gets that far, would then be available at no fee to any foundry, or at least US foundry. OK, any foundry that wants to sell S.R. chips to the DOD:( This lines up pretty well with IBM's recent give-away of its S.R. code: it is an admission that Speech Recognition is a commodity and nobody knows how to make any money with it so govt must fund further development. BTW, automated recognition of music [as in "what is this tune I keep humming?"] has been on the drawing board at Philips over in the Netherlands for over a year. Philips isn't saying much. But it appears you have to have a pretty accurate sample to get recognition since they want to arrest your piracy based on this recognition...no S.R. software worth its $1000000 is that fussy about sound quality.

    --
    SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
  94. Re:1... million... DOLLARS!!! BOGUS NERD ALERT! by ChaoticLimbs · · Score: 1

    This is "News for Nerds". It has been discovered that you are not, in fact, a bona fide nerd. This has been shown in your above post, where you quoted, as examples of speech recognition use from Star Trek, the following text:

    "Computer, lights!"
    "Computer, make coffee!"
    "Computer, Earl Grey, hot!"
    The actual text should read "Tea, Earl Grey, Hot." With no "Computer" and with the added description of "tea". To my knowledge no one in any episode has requested "Computer, make coffee!" either.
    Your Slashdot UID has been suspended until such time as you demonstrate competency with Star Trek references, and some minor Warcraft or Nethack experience.

    The Management

  95. Tamagotchi by Barryke · · Score: 1

    But enhanced conversation between people and consumer products is not the main goal. Just imagine all those notebooks/PDAs/Tamagotchis etc. That'd be fun. As long as my mom doesn't ask them why I was home late, I'm safe.

    --
    Hivemind harvest in progress..
  96. Silicone by Lordcheez · · Score: 1

    "the wine was finished, and the romantic meal was finished. Nick sensually kissed Candice as they slowly went upstairs. As he slowly pushed her back on the bed she moaned, 'arouse!' and the voice activated silicone...."

  97. Devices by xombo · · Score: 1

    This will be great for adding speech recognition support to embedded devices and low-power computers. We'll have palm tops that allow us to speak into our date book like a secretary.

  98. Yes it's worth it by llZENll · · Score: 1

    Considering that, if it were successful, it would change the life of every person on the face of the earth who interfaces with any electronics, for the rest of time, I would say $1M is worth it.

  99. Oh that's just GREAT! by thewickedmystic · · Score: 1

    Now everything I say on a long distance call will be transcribed 1000 times better and/or faster?

    That's it! I'm making a tin-foil hat for my phone!

    (did I just forget to post AC?? oh damn!)

    --
    "Logic merely enables one to be wrong with authority." - Dr. Who
  100. Dumb idea. Too soon for silicon by Animats · · Score: 1
    Unless it works well on a general purpose computer but won't go fast enough, there's no reason for custom silicon.

    Besides, making a workable technology cheaper is a job for the private sector. Nokia, etc. should be funding this. If it was likely to work, they would be.

  101. Power efficient speech recognition by Anonymous Coward · · Score: 1, Informative

    While we are on the topic of speech recognition hardware, here is a shameless plug for the Perception Processor that people might find interesting. The Perception Processor OR The Perception Processor

  102. Announcement was content-free so hard to tell by billstewart · · Score: 1
    Unfortunately, all the announcement said was "1. Buzzwords 2... 3. GRANT MONEY!", which is the academic version of "3. PROFIT!", so it's really hard to tell what they're talking about, or whether their ranting about "Silicon" is like otaku ranting about "Steam!"

    The last time I worked with vision recognition people was in the early 90s, but the two basic approaches to the problem were

    • Conventional Algorithms using conventional CPUs or DSPs, so if you did any special chips they were just for running standard calculations faster, e.g. DSPs in parallel to implement your FFTs.
    • Neural Networks, which really are a lot easier to wire up in chips than to emulate in software. You can build the things out of FPGAs and burn them into ASICs once your design is done, and then of course you've got to interface them to more conventional I/O channels.
    Does their call for silicon imply they're doing neural nets? Or just that they want to do dumb SIMD number-crunching? Hard to say.
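
    To make the "conventional algorithms" option concrete, here is a rough sketch of the kind of standard calculation meant by "FFTs": a textbook recursive radix-2 transform run on one frame of samples. It is plain Python for illustration only, not anything from the announcement; the function names and the synthetic 64-sample frame are assumptions.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; the length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])   # transform of even-indexed samples
    odd = fft(samples[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def dominant_bin(frame):
    """Index of the strongest frequency bin in the lower half of the spectrum."""
    spectrum = fft(frame)
    half = spectrum[: len(frame) // 2]
    return max(range(len(half)), key=lambda k: abs(half[k]))

# Hypothetical 64-sample frame containing a pure tone at bin 5.
FRAME = [math.sin(2 * math.pi * 5 * i / 64) for i in range(64)]
```

    Each frame is transformed independently of the others, which is why this sort of work spreads easily across parallel DSPs or dumb SIMD hardware without needing any new algorithm.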
    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
  103. You talkin' Jive, turkey? by Todd+H.+Sals · · Score: 1

    Buy one of those computers and pipe in the audio from a blaxsploitation movie.

    The Jive alone would cause it to go into meltdown.

  104. YES! by TLouden · · Score: 1

    Speech Recognition is worth so much. It provides the potential for a much faster (can you really type faster than you can talk?), less damaging (carpal tunnel anyone?) form of data input. Now, mind reading would be nice, but then the virus potential there is way too dangerous.

    --
    -Tim Louden
  105. existing ones used for subtitles? by Anonymous Coward · · Score: 0
    I say no - lets stick with the existing ones, which appear to work pretty well whenever I use subtitles when watching live tv programs (news, debates), and spend this million dollars on aid programs instead...

    Who with the what now?!?

    Last time I checked, those subtitles were the result of a person typing like crazy (and on the local news, with terrible accuracy) on a keyboard while watching the show, not some speech recognition system. I've even seen what looks like the person typing the subtitles hit the backspace key a few times to correct a mistake. Sometimes they even give up and skip a sentence or two.

    If $1M could create an automated system that can do significantly better, then it's definitely worth it. Remember, that's a one-time investment to create a technology, not a single device. How can you not see the potential for good in that?

    Next you'll tell us Alexander Graham Bell should have stuck to teaching sign language instead of monkeying around with inventing the telephone.

  106. Don't forget the blind by buck_wild · · Score: 1

    The blind have the moving line of braille, but this technology could be used to translate, or even help clarify what the person actually said.

    --
    If all you have is a hammer, everything looks like a nail.
  107. Good use of $1 million? by slapout · · Score: 1

    >Good use of $1 million?

    Well, I could name some worse...

    --
    Coder's Stone: The programming language quick ref for iPad
  108. Telephone pr0n will just get better by syousef · · Score: 1

    Instead of talking to a hairy man named Bubba with a put on voice, dirty old men will be talking to a computer. Yay! No need to pay Bubba any more!

    --
    These posts express my own personal views, not those of my employer
  109. An Idea but not Cool by rtb61 · · Score: 1

    Big brother is listening (and understanding). Nowadays it can be a worry when universities do the research, rather than a technology and entertainment company. I would not have thought that 1 million dollars would do much of anything with regards to silicon chip research, so is that the limit of the funding, or is there more, undeclared funding going on in the background? Perhaps the general public might not like the source of the additional funding. The professionally paranoid love this kind of technology.

    --
    Chaos - everything, everywhere, everywhen