Slashdot Mirror


Cell Phones for the Deaf

nitzan writes "Quoting from the article: 'the software translates the voice on the other side of the line into a three dimensional animated face on the computer, whose lips move in real time synch with the voice allowing the receiver to lip read.' Unfortunately this only works with laptops, but a pda version is in the works." The company website has a demonstration.

37 of 267 comments

  1. Can you hear me now? by Cap'n+Canuck · · Score: 5, Funny

    Still no?

    Ok, can you hear me now? Still no?

    Ok....

  2. Oh, great... by Anonymous Coward · · Score: 5, Funny

    ...so now we'll all have to learn how to sign "Turn off your fucking phone, asshole!"

    1. Re:Oh, great... by sczimme · · Score: 5, Insightful


      Yes, because the deaf person is bound to have the ringer turned way up...

      Oy.

      --
      I want to drag this out as long as possible. Bring me my protractor.
    2. Re:Oh, great... by Waab · · Score: 5, Funny

      I believe this particular sign has already been standardized and is currently in use by 99% of the American driving population.

      .!..

      !!.. if you're from the other side of the pond.

  3. Technology overkill by tyler_larson · · Score: 5, Insightful

    What was wrong with speech to text?

    --
    "With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea...."
    RFC 1925
    1. Re:Technology overkill by p4ul13 · · Score: 4, Informative

      Rather than have a computer interpret a person's speech, the software basically gives a representation of what the speaker's mouth is doing. This will allow the deaf person watching the device to do their own interpretation of what they see, which I'd imagine is much more reliable than speech-to-text could hope to be.

      --
      Paul Lenhart writes words!
    2. Re:Technology overkill by Ted_Green · · Score: 5, Interesting

      Actually speech to text is much more reliable.

      Speech to text:

      1. person speaks
      2. software interprets phonetics converts it into words
      3. deaf person reads the words

      versus

      1. person speaks
      2. software interprets phonetics into picture based lip movements
      3. deaf person interprets picture based lip movements

      Point of fact this is unbelievably dumb and is right up there with converting Russian to German for an English speaker to read.

    3. Re:Technology overkill by Kintanon · · Score: 3, Insightful

      How the hell do you draw that conclusion? How could speech to text be MORE processor intensive than converting speech to MOVEMENT on a face?! It's orders of magnitude harder to translate a sound into a muscle group movement on a computer generated face than it is to turn it into a group of characters representing that sound.

      Kintanon

      --
      Check out JoshJitsu.info for Brazilian Ji
    4. Re:Technology overkill by Dephex+Twin · · Score: 3, Interesting

      Think about it like this.

      If you have to say the sounds for the word "ow", what does that look like? There is a way for a computer to display this that would be pretty clear, and figuring this out would more or less require grabbing the "ow" picture group.

      Now, what if you have to write something with an "ow" sound in it, but this "ow" sound might be in the middle, beginning or end of any word? The sounds all around it have an effect on it. It might be spelled "au", "ow", "ao", "ough", god-knows-what-else. There are dozens and dozens of situations where this sound might arise. Including ways you might not even consciously think about. Figuring all this out is really hard for a computer to do, because it has no AI. It doesn't know what is being talked about.

      Probably mapping mouth movements isn't dead-easy, but I'd wager it is much easier than speech-to-text.

      --

      If you want to make an apple pie from scratch, you must first create the universe. -- Carl Sagan
    5. Re:Technology overkill by fishbowl · · Score: 5, Insightful


      "2. software interprets phonetics converts it into words"

      Is a very different, much more complex problem than:

      "2. software interprets phonetics into picture based lip movements"

      Consider that for the first example, we need the computer to understand the language,
      whereas in the second example, all the computer needs is a fourier transform and
      Max Headroom anatomy.

      Personally, I think it would be simpler and more effective to put a
      camera on the phone and transmit an image of the speaker's face.

      --
      -fb Everything not expressly forbidden is now mandatory.
    6. Re:Technology overkill by sakeneko · · Score: 5, Interesting
      Point of fact this is unbelievably dumb and is right up there with converting Russian to German for an English speaker to read.

      Very well put!

      I've had deaf friends, one of whom attended Gallaudet University. (Famous liberal arts college for the deaf.) In addition, I lost most of my hearing for some years as a child -- fortunately, I got it back after surgery. I've thought about deafness, and dealt with it.

      Lip-reading works best for people who were hearing at one point and lost some or all of their hearing. I went deaf after I learned to talk, and went deaf slowly, which means I relied heavily upon it. People who have always been deaf often find lip-reading very difficult, or even impossible. When you have no concept of hearing or sound, trying to figure out what meaning is associated with specific lip movements is tough.

      This is true of learning to read, as well. A person who was already speaking, or could read, before going deaf has no real problem with reading. If you can't hear and never have heard, though, the concept of an alphabet and "sounding it out" makes no sense. A congenitally deaf person who wants to learn to read must learn each word as a whole, much as a Chinese or Japanese person who learns to read his/her language must learn each character separately.

      Since a congenitally deaf person faces a humongous task regardless of whether he/she is learning to read lips, or read and write, just which one do you think he/she would rather have to learn? In most cases, learning to read and write is going to be a lot more useful.

      From where I sit, speech to text would work better for most deaf people, congenitally deaf or not.

    7. Re:Technology overkill by Dephex+Twin · · Score: 3, Interesting

      Those are some good points.

      For my interpretation of how the lipreading worked, I was looking at the sample on the website. It appeared to be that certain sounds had certain pictures. Anyway, it also had something that touched on what you were talking about. It had a little bit of extra information you can't normally see, like red dots on the cheeks for g and k type sounds, and a red nose for nasal sounds, etc. Extras like these might make up for some of the lacking facial cues.

      The other thing I was thinking about was that the lipreading could be part of the understanding process. A good number of people (most?) who are legally deaf are not truly 100% deaf. If a person is able to get a bit of auditory information, plus this lipsynching information, it might be enough to make things a lot easier for these people, even if the lip-reading by itself were too simplistic.

      --

      If you want to make an apple pie from scratch, you must first create the universe. -- Carl Sagan
  4. What about TTY? by genka · · Score: 5, Interesting

    I worked with deaf people for a while and they were (and I am sure still are) disappointed that cell phones are not compatible with TTY devices. How difficult is this to do?

    1. Re:What about TTY? by mystik · · Score: 4, Funny

      The new Motorola phone I purchased this weekend mentions something about a TTY in its menus. I imagine I'd need data service from Verizon though.

      --
      Why aren't you encrypting your e-mail?
    2. Re:What about TTY? by mgrochmal · · Score: 3, Interesting
      Speaking as someone who works with trying to get lots of accessibility devices to communicate (for the blind and visually impaired, but similar principles apply), one of the main problems is deciding on a standard, followed by making sure it works with those that won't adhere to said standard.

      Case in point: I recently got a cellphone, so that someone else could have their phone back. I shopped around for a while and settled for one that would be free after rebate. I had it for a few days, and returned it for a more expensive one. First, the phone had an odd number layout, so I had to relearn the key mapping (the keys were part of a curve, instead of straight across). Second, I use a laptop to connect to the Internet, and I occasionally use a cell phone adapter to do it. The phone I bought was incompatible with the connector, and the phone's manufacturer had no immediate plans to make one. Those two reasons, as well as several other factors, prompted a return.

      If the cell phone companies would agree on a single interface, it would make the compatibility much easier to implement. Not only that, but the TTY devices need the information to implement all the various brands and models of cell phones. The possibility's there, but there's not much of a chance it'll happen anytime soon.

      --
      This .sig Intentionally Left Blank.
  5. One flaw ... by Greedo · · Score: 3, Funny

    No downloadable ring-tones.

    --
    Tuus crepidae innexilis sunt.
  6. Complicated by batboy78 · · Score: 4, Insightful

    This just seems complicated. Why can't they just improve the speech to text capability? It seems like drawing a face with life-like facial movements to enable lip reading is a little beyond the scope of power for a PDA.

  7. Good idea and good start but.... by BWJones · · Score: 5, Informative

    This is a fantastic idea which will enable communication for the vast numbers of hearing impaired. However, if the web-site is any indication, the technology needs improvement. I'm pretty good at reading lips and I was working pretty hard to figure out what was being said with the sound off.

    --
    Visit Jonesblog and say hello.
  8. Uhhh... by NilObject · · Score: 5, Funny

    Being a severely hearing impaired person, I do find the virtual person's "O"'s to be highly disturbing if not graphic. Yikes.

  9. Speaking from experience by FunkyELF · · Score: 5, Insightful

    I lived with a deaf room-mate last year. It took me about 2 months to understand what he was saying, and took him about the same to get used to my lips. Anytime he meets someone new, it's very hard for him to read their lips (i.e. every time a new telemarketer tries to prey on the deaf user). Also, it's not just the lips, it's the tongue also. It'd probably be easier to use speech-to-text software than this stuff....and what about background noise? I doubt this thing works well, if at all.

  10. Yeah, sure by Wind_Walker · · Score: 3, Interesting
    I was one of the fortunate ones who got to the company's website before it got Slashdotted, and was able to view the "demonstration" of their software. The demo consists of a mouth saying "Thank You" in various languages. I looked at English and Spanish, the two I know best.

    I sure as hell couldn't tell you what they were saying, even when I knew what words were coming out of their mouth. And this is not to mention cell phone static, distractions, contractions, mumbling, and lots of "ummm" and "uhhhh" that occurs during normal speech. I really don't see how this is a viable communication method.

    Maybe it's because I'm not experienced with lip reading. Maybe people who are deaf are better at it than I am, but I can usually tell what Football coaches are saying on the sidelines of games (of course, that's limited to "Bull****" and "You've gotta be ****ing kidding me!", but still...)

  11. Why should they. by Unknown+Poltroon · · Score: 4, Insightful

    I still can't get coverage, or hear the other person clearly, why should the deaf be different? But I can play 3 different games and send a fucking picture of a duck. Stupid phone companies. It's a fucking phone!! First, fix it so I can hear someone, THEN gimme the damn bowling games.

    OK, this might be a troll. I'm not sure myself. It's definitely a vent. Fucking Sprint. Oh well.

    --
    All Troll + "offtopic" mods are meta moderated as "Unfair", because you abused the system.
  12. Ugh by NilObject · · Score: 3, Insightful

    I just cannot picture myself on a bus looking at this wildly articulate mouth while yelling back: "Can yoo reepeeet dat agaannn???" Yes, I am hearing impaired. I would NEVER touch this thing. I'll stick with 2 way messaging.

  13. lip reading.. by pretzel_logic · · Score: 3, Funny

    look at a lip reader and say:

    'I want a fig newton'

    IMHO:
    too many flaws, the investors will back out

    --

    pretzel_logic
    1. Re:lip reading.. by Dephex+Twin · · Score: 3, Insightful

      Well, for comparison, see how well a speech recognition program does with the same sentence.

      And unless you just randomly blurted out the sentence, you probably have context in the surrounding sentences (e.g. you are talking about fig newtons, food in general, newton's law, whatever).

      I'd definitely put my money on the lip-reader, frankly.

      --

      If you want to make an apple pie from scratch, you must first create the universe. -- Carl Sagan
  14. This doesn't really do much... by Cyclopedian · · Score: 5, Informative
    Lip reading is only half the whole "info-stream" that comes out of people's mouths. I know this. I'm deaf (severe to profound sensori-neural hearing loss, since birth) and I'll tell you one thing: lip-reading can give ambiguous results.

    Someone can say "Pot" and yet with the same lip movement, can also say "My". Men with bushy mustaches are a lip-reading disaster.

    For me, I've adapted in my own way: I rely heavily on my hearing aids. That combination of both lip-reading and hearing the audio stream from your mouth enables me to achieve at least a 70% success rate (under ideal conditions, if it's a party atmosphere, fudgeddaboutit). I've had hearing aids since I was 1 1/2, and only with extensive speech therapy can I speak well. I'm one of the few deaf-from-birth people that can do it this well. So, from that perspective, I can speak on a phone (as long as I can understand that mangled audio coming out the receiver, which is 0%).

    Why don't they just focus on speech recognition? A great speech recognition phone would enable deaf people that speak to use phones for near real-time conversations. In addition, such technology can also be (easily?) adapted to foreign language translators for tourists.

    However, until such technology is available at the consumer level, I'm stuck with two-way text messaging devices like the T-Mobile SideKick.

    -Cyc

  15. Read My Lips by bytesmythe · · Score: 5, Interesting

    I thought it seemed a little weird at first, but then I checked out the other demos. When I knew what the words were ("Thank you" in English, German, French, Spanish, and Japanese), I could easily tell what was being said.

    I notice a lot of people arguing for improving speech-to-text instead, which is a far more advanced problem than this technology. Speech sounds come out in a continuous flow. Getting a computer to recognize the breaks between words, properly spell them reliably, etc. is hard enough on a desktop system, much less a PDA. That is especially true in languages like English, where most vowels in unstressed syllables are rendered vocally as "uh".

    This system simply has to hear a sound, and immediately display an associated... well, not "grapheme", since this isn't writing... maybe "pixeme". It is the graphical equivalent of attempting to spell perfectly phonetically.

    Also, if you didn't notice it, "invisible" sounds that occur on the back of the tongue are indicated by circles on the cheeks (like hard 'g' and 'k'), and nasal sounds are indicated by a darkening of the nose.
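
    As a rough illustration of the scheme described above (one picture per sound, plus overlays for sounds that cannot be seen on the lips), here is a minimal Python sketch. It is not the product's code: the phoneme labels, mouth-shape names, and overlay names are all hypothetical, chosen only to show the lookup.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical phoneme -> mouth-shape table; every label here is illustrative.
MOUTH_SHAPE = {
    "p": "closed", "b": "closed", "m": "closed",
    "g": "open", "k": "open", "n": "open",
    "ow": "rounded", "ee": "spread",
}

# Sounds the demo is described as marking with extra cues.
BACK_OF_TONGUE = {"g", "k"}   # shown as circles on the cheeks
NASAL = {"m", "n"}            # shown as a darkened nose


@dataclass(frozen=True)
class Frame:
    mouth: str               # which mouth picture to draw
    overlay: Optional[str]   # extra cue drawn on top, if any


def render(phoneme: str) -> Frame:
    """Pick the picture (and optional overlay) for a single phoneme."""
    mouth = MOUTH_SHAPE.get(phoneme, "neutral")
    if phoneme in BACK_OF_TONGUE:
        overlay = "cheek_circles"
    elif phoneme in NASAL:
        overlay = "dark_nose"
    else:
        overlay = None
    return Frame(mouth, overlay)
```

    In this toy table "m" and "b" share a mouth picture, and only the nose overlay tells them apart, which is the kind of gap the extra cues seem meant to fill.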

    All in all, I think this is an interesting idea. It will be even cooler when they can render different faces so the "avatar" resembles the person to whom you're speaking.

    --
    bytesmythe
    Hypocrisy is the resin that holds the plywood of society together.
    -- Scott Meyer
  16. The reasons this is better than speech-to-text by zipwow · · Score: 5, Informative

    Partly, because speech to text isn't very good.

    Speech to text isn't very good because it's very hard to turn phonetics into words. Our ability to understand people is very reliant on context. Knowing what's been said helps you understand what's being said.

    Some will say that speech to text is getting fairly good in English, which is somewhat true. Obviously, though, there are bigger markets in other languages.

    So how does this thing work, if it doesn't do speech to text? It does speech to phonetics, and phonetics to lips.

    For example, it's relatively easy to understand when someone has said "h -ee- r", but knowing if that's supposed to be "here" or "hear" is quite difficult.

    This is why the same software works across languages. "Th" is "Th" in any language, and your single algorithm doesn't have to care.
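
    The "here"/"hear" problem can be sketched in a few lines of Python. This is only a toy under stated assumptions: the pronunciation table and its phoneme spellings are made up for illustration, and a real recognizer would use a full pronunciation dictionary plus context to choose among the candidates.

```python
from collections import defaultdict

# Toy pronunciation table: word -> phoneme tuple (spellings are made up).
PRONUNCIATIONS = {
    "here": ("h", "ee", "r"),
    "hear": ("h", "ee", "r"),
    "hair": ("h", "ai", "r"),
    "hare": ("h", "ai", "r"),
    "hat": ("h", "a", "t"),
}


def build_index(pronunciations):
    """Invert the table: phoneme tuple -> sorted list of matching words."""
    index = defaultdict(list)
    for word, phonemes in pronunciations.items():
        index[phonemes].append(word)
    return {phonemes: sorted(words) for phonemes, words in index.items()}


INDEX = build_index(PRONUNCIATIONS)


def candidates(phonemes):
    """Return every word in the toy table that matches the phoneme sequence."""
    return INDEX.get(tuple(phonemes), [])
```

    Here candidates(["h", "ee", "r"]) comes back with both "hear" and "here". Drawing a mouth shape never has to make that choice; writing text does.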

    -Zipwow

    --
    I don't know which is more depressing, that 2/3 didn't care enough to vote, or that 1/2 of those that did are crazy.
  17. Re:And why isnt it just realtime text??? by Dephex+Twin · · Score: 3, Informative

    Seems like it's not over-engineering. This is fewer steps than speech-to-text as far as I can see.

    You have to record the speech and convert those sounds into phonemes. Now all you do is use the picture(s) that go with that phoneme, which is going to be more or less consistent.

    With speech-to-text you have to use probability and word banks to figure out what the heck words those phonemes are supposed to go with, which is the hardest part by far, because spelling and grammar are so inconsistent. That requires a lot more time and computing power, and you are prone to a bunch more mistakes of course.

    --

    If you want to make an apple pie from scratch, you must first create the universe. -- Carl Sagan
  18. Finally a solution for illiterate deaf people! by techstar25 · · Score: 5, Funny

    This is clearly a solution for the large population of completely illiterate deaf people, for whom speech-to-text is not an option.

  19. I can see one advantage to this... by jhines0042 · · Score: 5, Interesting

    ... if you have this software running on a phone then if you are hearing impaired you could get real time conversation with the other party without having to go through a human being.

    I've spoken with a hearing impaired person on a phone before through a TTY system and it is painfully slow. First you have to say your sentence and then they send it. Then the other end needs to read it, type in a response, and then send it at which point it is read back to you. Imagine having a conversation over an Instant Messenger except your secretary was reading the screen and typing for you. (IM for the blind for example)

    I agree that we need better voice to text and text to voice translation. That technology would give us better access for everyone. You could have "hearing" for the hearing impaired (speech to text), "reading" for the vision impaired (text to speech), and you could even have "writing" for those with fine muscle control impairment or who are lacking the necessary limbs for various reasons.

    But this is an interesting approach to solve one of the three problems.

    --
    42 - So long and thanks for all the fish.
  20. combine lip-reading with speech2text by peter303 · · Score: 3, Interesting

    David Stork has a chapter on computer lip reading in the book "Hal's Legacy" on A.I. methods. The combination is much more reliable than either audio or visual alone.

  21. Re:Office space... by thelinuxking · · Score: 4, Funny

    I'm thinking about taking that new chick from Logistics. If things go right I might be showing her my O-face. You know: Oh! Oh!

  22. Lightbulbs for the Blind by egg+troll · · Score: 5, Funny

    Reading this makes me realize that my Lightbulbs for the Blind scheme was not crazy! Bundles of cash, here I come!

    --

    C - A language that combines the speed of assembly with the ease of use of assembly.
  23. Phonemes, visemes, TTS, and lip synching by TekkonKinkreet · · Score: 3, Informative

    Posting late, but wtf.

    By way of introduction: I developed the core coarticulation and other algorithms for lip synching when I worked at a now-defunct company called...wait for it...LIPSinc. We thought the resulting lip synching was pretty damn convincing, so on my own I tested out our stuff with a hearing-impaired friend, with mixed results. Anyway, I don't know a little about this stuff, I know a *lot* about it.

    What these guys have done is map phonemes onto exaggerated visemes (the pictures of the mouth). Not a bad idea at all! Bunch of problems, though. First, there's a data reduction of about 3x in going from sound to video--there are 40-50 distinguishable phonemes, and 9-16 distinguishable visemes, depending on how you count each. This is because the visible part of the face only makes up the end of the vocal tract. A lot of distinctions between letters occur without the involvement of the lips, like the difference between F and V, while others, like K, can be pronounced with the face in virtually any position. This is part of what makes lip reading so hard with a real person, and why they need a lot of context to pull it off. They also seem to be slowing down the timing, as if they recognized the phonemes and then synthesized each at the same length. This gives you longer to recognize each one, but wrecks the visual prosody (rhythm) of the speech, which is a good cue for where the parts of speech are. Then there's the rest of the face. The eyebrows and head positions help you figure out key words, ends of clauses, tell if something is a question, etc.
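
    The many-to-one collapse from phonemes to visemes can be shown with a small Python sketch. The grouping below is a hypothetical nine-phoneme subset invented for illustration, not a real viseme inventory and not anything taken from the product or from LIPSinc.

```python
# Hypothetical grouping of a few phonemes into viseme classes.
VISEME_OF = {
    "p": "LIPS_CLOSED", "b": "LIPS_CLOSED", "m": "LIPS_CLOSED",
    "f": "LIP_TO_TEETH", "v": "LIP_TO_TEETH",
    "t": "TONGUE_UP", "d": "TONGUE_UP", "n": "TONGUE_UP",
    "a": "OPEN",
}


def to_visemes(phonemes):
    """Map each phoneme to the mouth picture a viewer would actually see."""
    return [VISEME_OF[p] for p in phonemes]


def reduction_ratio(table):
    """Distinct phonemes divided by distinct visemes for a given table."""
    return len(table) / len(set(table.values()))
```

    With this table, "fan" and "van" (f-a-n and v-a-n) produce identical viseme sequences, as do "pat" and "bad", so the viewer has to lean on context to separate them. The ratio for this toy table is 9 phonemes over 4 visemes; the roughly 3x figure above comes from the full inventories, which are not reproduced here.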

    Those who say that speech-to-text is superior to lip reading have a point. Good speech-to-text contains *more* accurate information than an uninterpreted stream of phonemes (itself 3x richer than a stream of visemes, as I said above), because the machine can do a Viterbi search to find the most likely sequence of words from a continuous stream of phonemes. Words also open up higher NLP functions, so you can do constraint relaxation to test whether "wreck a nice beach" or "recognize speech" fits better in the context.
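
    For readers who have not seen the idea, here is a minimal Viterbi-style dynamic program in Python that picks the most likely word sequence for an unsegmented phoneme stream. It is a sketch under loud assumptions: the lexicon, phoneme spellings, and probabilities are invented, words are scored independently (unigram) rather than with the HMM and language model a real recognizer would use, and the example swaps in "ice cream" versus "I scream" because those are exact homophones in this toy phoneme set.

```python
import math

# Made-up lexicon: phoneme tuple -> list of (word, probability).
LEXICON = {
    ("ai",): [("I", 0.20)],
    ("ai", "s"): [("ice", 0.02)],
    ("k", "r", "ee", "m"): [("cream", 0.03)],
    ("s", "k", "r", "ee", "m"): [("scream", 0.001)],
}
MAX_WORD_LEN = max(len(key) for key in LEXICON)


def best_words(phonemes):
    """Return the highest-probability split of the phonemes into words.

    Returns an empty list when no split into lexicon words exists.
    """
    n = len(phonemes)
    # best[i] holds (log-probability, words) for the best split of phonemes[:i].
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            prev_score, prev_words = best[start]
            if prev_score == -math.inf:
                continue  # no valid split reaches this start position
            chunk = tuple(phonemes[start:end])
            for word, prob in LEXICON.get(chunk, []):
                score = prev_score + math.log(prob)
                if score > best[end][0]:
                    best[end] = (score, prev_words + [word])
    return best[n][1]
```

    With these made-up numbers, the stream ai-s-k-r-ee-m comes out as "ice cream" because 0.02 x 0.03 beats 0.20 x 0.001. Change the probabilities, or add context-dependent scores, and the other reading can win, which is the point about words opening up higher-level checks.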

    Still, I'd like to see an experiment where the raw phonemes are fed, as text, to the recipient. I think with practice, your brain would start to decode the string (it manages with the sound, right?), despite the lack of word boundaries and the errors in phoneme detection (accuracy is not all that high without text; I think seventy-something percent). Seems like an easier pattern recognition problem than lip reading. Who wants to go get funding?

  24. Better yet: by Misch · · Score: 3, Insightful

    We have tools like Sprint Relay On-Line that will do text-to-speech... and every state provides confidential relay services to begin with. Many states are moving towards making 711 a standard relay number.

    If a deaf person wanted a "cell phone", they'll probably have one from Wynd Communications, a two-way pager with text/e-mail and other services built right into the damn thing. They're all the rage here. Screw lip reading over the phone. This technology is pure eye-candy. Nice, but how useful will it really be?

    --

    --You will rephrase your request for me to go to hell. Goto statements are not acceptable programming constructs
    1. Re:Better yet: by Misch · · Score: 3, Insightful

      Services like TTY and phone relays have long since been made redundant by the advent of universally available and reasonably inexpensive e-mail. But taxpayers still spend hundreds of millions a year subsidizing these obsolete services.

      May you never go deaf then. May you never buy a product that breaks and have to call a phone number for customer services. May you never have to call an emergency services number. May you never have to call the pizza place to order a pizza.

      The difference between e-mail and TTY is the difference between push and pull technology. With e-mail, there's no guarantee that your e-mail is ever received, much less opened, read and processed.

      Because of this, e-mail alone cannot (and does not!) qualify as a reasonable accommodation under the ADA.

      I've worked as a secretary (in a school for the deaf, no less.) I know e-mails can take a long time to get delivered. There's still time between when it gets delivered and when it actually got read and processed by me. (Usually not long, but on some crazy hectic days, it could take some time.)

      I hate feeding the trolls, but this one needed to be thwacked over the head.

      --

      --You will rephrase your request for me to go to hell. Goto statements are not acceptable programming constructs