Slashdot Mirror


Researchers Work To Perfect Computerized Lip Reading

Iddo Genuth writes "Researchers at the University of East Anglia are working to develop computerized lip-reading systems. Lip-reading is extremely hard for humans to master, but a software-based system has several benefits over even the most highly trained expert. The ultimate goal of the project is to convert lip-read speech into text. 'Apart from being extremely helpful to hearing-disabled individuals, researchers say that such a system could be used to noiselessly dictate commands to electronic devices equipped with a simple camera - like mobile phones, microwaves or even a car's dashboard. England's Home Office Scientific Development Branch ... is currently investigating the feasibility of using lip-reading software as an additional tool for gathering information about criminals or for collecting evidence.'"

3 of 117 comments

  1. Already been done by TheGoodSteven · · Score: 2, Informative

    I watched a documentary about this technology some time ago. The technology was applied to Hitler's home videos, which lacked audio. It's pretty interesting, but it runs about 45 minutes long. Here's the video for those who are interested.

  2. Already Done for Noisy Environments by BoydWaters · · Score: 2, Informative

    About ten years ago I attended a workshop by Stanford professor David Stork. He mentioned some work on a system that was deployed for use by aircraft technicians: the system couldn't read the voice channel with the jet engine blasting away (the techs wear hearing protection). So it read lips. Ten years ago.

    Sounds like TFA is talking about doing this in an embedded, consumer-electronics application, rather than a fixed, industrial-military, hire-computer-scientists-to-maintain-it thing.

    Not-so-coincidentally, David Stork is the author of the book, "HAL's Legacy"...

  3. What's the real story? by bill_mcgonigle · · Score: 4, Informative

    The summary and TFA seem to talk about one day coming up with lip-reading computers, which we've had for a while; one such system was open sourced and is apparently now on SourceForge.

    TFA links to a paper that's actually about exaggerating lip motion to improve recognition, which seems like an interesting topic, at least new to me. But it's seemingly unrelated to the reporting or any governments protecting us from our rights.

    From the Abstract:

    Accurate lip-reading techniques would be of enormous benefit for agencies involved in counter-terrorism and other law-enforcement areas. Unfortunately, there are very few skilled lip-readers, and it is apparently a difficult skill to transmit, so the area is under-resourced. In this paper we investigate the possibility of making the lip-reading task more amenable to a wider range of operators by enhancing lip movements in video sequences using active appearance models. These are generative, parametric models commonly used to track faces in images and video sequences. The parametric nature of the model allows a face in an image to be encoded in terms of a few tens of parameters, while the generative nature allows faces to be re-synthesised using the parameters. The aim of this study is to determine if exaggerating lip-motions in video sequences by amplifying the parameters of the model improves lip-reading ability. We also present results of lip-reading tests undertaken by experienced (but non-expert) adult subjects who claim to use lip-reading in their speech recognition process.
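
    For anyone wondering what "amplifying the parameters" might look like, here is a minimal sketch of the encode, amplify, re-synthesise idea from the abstract. It is not the paper's code. It assumes a plain PCA-style linear model over flattened lip landmark coordinates, whereas a real active appearance model also covers texture. The function names and the `gain` parameter are made up for illustration.

```python
import numpy as np

def fit_linear_model(shapes, n_components):
    """Fit a PCA-style linear model: shape ~= mean + basis @ params.

    shapes: array of shape (n_frames, n_features), one flattened set of
    landmark coordinates per video frame (hypothetical input format).
    """
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # Rows of vt are orthonormal modes of variation, strongest first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components].T  # (n_features, n_components)
    return mean, basis

def encode(shape, mean, basis):
    """Project one frame onto the model to get its few parameters."""
    return basis.T @ (shape - mean)

def synthesise(params, mean, basis):
    """Rebuild a frame from its parameters (the 'generative' part)."""
    return mean + basis @ params

def exaggerate(shape, mean, basis, gain):
    """Scale the parameters by `gain` before rebuilding the frame.

    gain > 1 pushes the frame further from the mean shape, which is the
    assumed meaning of 'exaggerating lip motion' in this sketch.
    """
    return synthesise(gain * encode(shape, mean, basis), mean, basis)

# Toy data: a neutral shape plus varying amounts of one "mouth open" mode.
rng = np.random.default_rng(0)
neutral = rng.normal(size=8)
open_mode = rng.normal(size=8)
amounts = np.linspace(-1.0, 1.0, 5)
frames = neutral + amounts[:, None] * open_mode

mean, basis = fit_linear_model(frames, n_components=1)
exaggerated_last_frame = exaggerate(frames[-1], mean, basis, gain=2.0)
```

    With a gain of 1.0 the frame comes back unchanged (as long as it lies within the modes the model kept), and larger gains move it proportionally further from the mean shape.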
    --
    My God, it's Full of Source!
    OUTSIDE_IP=$(dig +short my.ip @outsideip.net)