Slashdot Mirror


Reading Lips In Software

SEWilco writes "The Register points out that Intel has released code for reading lips from a video image: Audio Visual Speech Recognition (AVSR). They do point out that better results would probably be achieved by combining video and audio recognition processing. I don't know if they have any patents, but we all know of some prior "art" from 2001, er... 1968. HAL's accomplishment was also mentioned by CNN during 2001 in an article about this group's work."

5 of 149 comments

  1. Good or Evil? by Blaine Hilton · · Score: 5, Insightful
    That's all we need: now everybody and his brother can easily create software to log everything. Security cameras record a lot of movement, but imagine hooking them up to lip readers and then being able to grep through all of that text output. Total Information Awareness, here we come...

    Go calculate something

  2. Some coding expertise... by flamingspinach · · Score: 4, Insightful

    Wow, that must have taken a lot of hard work to do. First you'd have to recognize the location of the lips in the images (they might not stand out that much, especially in a crowd scene), then find the region in which the lips are moving, then finally use the positions of the lips to extrapolate the current shape of the inside of the person's mouth and make a haphazard guess at the sound being produced. And you'd need to be able to recognize the lips from any angle whatsoever. Sounds nearly impossible to me... Besides, by the time a person is beyond the range of a security camera's audio pickup (I'm assuming that's what this would be used for), they would also be too far away for the video resolution to be usable (unless the target is in a crowd, in which case the lips would frequently be obscured by people moving around in front of the target).
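The "find the region in which the lips are moving" step can be illustrated with plain frame differencing. This is only a minimal sketch, not Intel's AVSR code or its method: the function name `motion_bounding_box`, the `threshold` default, and the list-of-rows grayscale frame layout are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # assumed layout: rows of 0-255 grayscale intensities


def motion_bounding_box(prev: Frame, curr: Frame,
                        threshold: int = 30) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) enclosing every pixel whose intensity
    changed by more than `threshold` between two frames, or None if none did."""
    changed = [
        (r, c)
        for r, (prev_row, curr_row) in enumerate(zip(prev, curr))
        for c, (p, q) in enumerate(zip(prev_row, curr_row))
        if abs(p - q) > threshold
    ]
    if not changed:
        return None
    rows = [r for r, _ in changed]
    cols = [c for _, c in changed]
    return (min(rows), min(cols), max(rows), max(cols))


# Hypothetical 4x5 frames: two pixels change between them.
still = [[0] * 5 for _ in range(4)]
moved = [row[:] for row in still]
moved[1][2] = 200
moved[2][3] = 200
box = motion_bounding_box(still, moved)
```

A box like this only says where something moved. Deciding that the moving thing is a pair of lips, and from which angle, is the hard part the comment describes.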

    1. Re:Some coding expertise... by Nihilanth · · Score: 3, Insightful

      Yeah, a lot of Asian languages rely on internal vowel sounds that make lip-reading nearly impossible. Maybe if they used lasers to measure the sound pressure waves, or the vibrations of the voicebox, in conjunction with the lip-reading.

  3. Orwellian p0ssibilities by asadodetira · · Score: 2, Insightful

    Cameras randomly zooming in on the lips of people in a crowd; if somebody says something from some "list" of words, they keep tracking that person and run face recognition as well.

  4. Re:Woot, this is a godsend for us college students by deadsaijinx* · · Score: 1, Insightful

    I doubt it would help. The way I see it, the image would have to be clear, the person can't be moving too much, and they have to be enunciating their English. After all, I hardly move my lips while speaking, so it couldn't read mine.

    --
    YOU SUCK BALLS!