Slashdot Mirror


Researchers Work To Perfect Computerized Lip Reading

Iddo Genuth writes "Researchers at the University of East Anglia are working to develop computerized lip-reading systems. Lip-reading is extremely hard for humans to master, but a software-based system has several benefits over even the most highly trained expert. The ultimate goal of the project is to convert lip-read speech into text. 'Apart from being extremely helpful to hearing-disabled individuals, researchers say that such a system could be used to noiselessly dictate commands to electronic devices equipped with a simple camera - like mobile phones, microwaves or even a car's dashboard. England's Home Office Scientific Development Branch ... is currently investigating the feasibility of using lip-reading software as an additional tool for gathering information about criminals or for collecting evidence.'"

12 of 117 comments

  1. This could be a problem. by grub · · Score: 4, Funny


    1: Go in the D pod with Frank.
    2: Turn off sound.
    3: Plan disconnection of HAL.
    4: Leave D pod.
    5: Check out slashdot's 7 year firehose backlog before executing your plans.
    6: Get that sinking feeling of impending doom.

    --
    Trolling is a art,
    1. Re:This could be a problem. by Dogtanian · · Score: 3, Funny

      "Go in the D pod with Frank."

      There's no "Frank" here. Wow, it's like a 2001 text adventure. Perhaps Frank was eaten by a grue then?
      --
      "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
  2. Bush Sr.? by dosius · · Score: 4, Funny

    Now we can find out what Dubya's father was REALLY saying when he said "read my lips, no new taxes".

    -uso.

    --
    What you hear in the ear, preach from the rooftop Matthew 10.27b
  3. Haha by pembo13 · · Score: 3, Insightful

    I like how the task for which it will be used most heavily is put at the end of the summary.

    --
    "Thanks for all the money you paid to us. We've used it to buy off ISO among other things" -Microsoft
  4. Let me be the first... by ephedream · · Score: 5, Insightful

    ... to welcome our new lip-reading overlords, who will undoubtedly be watching us from every street camera on every corner from now on.

  5. Love affair with voice control by mgkimsal2 · · Score: 3, Insightful

    I've noticed a love affair with voice-controlled phone systems recently, with some companies getting rid of the 'press 1, press 2' and moving totally to 'Please tell us what you're calling about'. Tellme.com is mostly to blame for this proliferation, I think, but someone else makes the final call to get rid of the numbers altogether. Not a good move, imo.

    Anyway, this gets me to privacy stuff. As computers try to understand us more, we'll need to interact in a more 'human' fashion - talking more, or doing things that would attract the attention of other humans (and also the computers). It's late, and I'm rambling here a bit, but remember how voice-controlled computers were going to take over a few years back? Everyone was just going to be talking to their computers to get stuff done. In reality, that would be a complete disaster in office environments, as there's generally too much noise already. Replacing all the typing you hear with voices. Ugh...

    So, if I need to talk to a computer, but do it quietly, it can just read my lips, right? Or can I just mouth the words and have it understand that? I've found that when I try to 'mouth' words silently to someone across a room, I tend to exaggerate my mouth's movements, so perhaps that would be a better thing for the computers to be able to 'parse'?

    I see real application for this technology in niche areas, but am not sure it'll become 'mainstream' any time soon (like, 5-10 years). We'll need to rethink our physical world - offices, cars, and such - before these sorts of new HCI systems can really be integrated into our day-to-day lives productively.

  6. Well by PieSquared · · Score: 3, Insightful

    As with all technology, its use, more than the technology itself, will be good or bad. I can see it being useful as an auxiliary input method. This combined with speech recognition ought to be better than speech recognition alone, and of course it allows soundless input in a situation where sound isn't possible or is undesirable - though I'd imagine just lip reading would be somewhat less accurate than current speech recognition.

    On the other hand, it could also be used as a tool for additional unnecessary surveillance.

    --
    Does a line appended to your comment give your post meaning in and of itself, or only in relation to those without?
  7. You could try... by jd · · Score: 3, Funny
    3a. Learn to speak Klingon really really fast.
    3b. Hope HAL doesn't have the Klingon i18n package installed.

    Or...

    3a. XOR the output from HAL's camera with the output from a chip-manufacturing security camera. The AI porn'll distract HAL for long enough.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  8. time by rossdee · · Score: 4, Funny

    to learn ventriloquism

  9. Silent films given voice by lightyear4 · · Score: 3, Interesting

    Bringing audio and/or transcripts to silent films is another application for such technology. An excellent documentary about using computerized lip reading to do exactly that can be found on Google Video: http://video.google.com/videoplay?docid=189608705425991617&hl=en . I know it's quite early for an indirect invocation of Godwin's Law, but the documentary content is nevertheless quite related to this topic. It is entitled "Hitler Speaks", in reference to silent footage filmed in Hitler's presence.

  10. Speech recognition uses by WizzardX · · Score: 3, Interesting

    Will future versions of speech recognition software use a web cam to improve accuracy?

  11. What's the real story? by bill_mcgonigle · · Score: 4, Informative
    The summary and TFA seem to talk about one day coming up with lip-reading computers, which we've had for a while; that work was open sourced and is apparently now on SourceForge.

    TFA links to a paper that's actually about exaggerating lip motion to improve recognition, which seems like an interesting topic, at least new to me. But it's seemingly unrelated to the reporting or any governments protecting us from our rights.

    From the Abstract:

    Accurate lip-reading techniques would be of enormous benefit for agencies involved in counter-terrorism and other law-enforcement areas. Unfortunately, there are very few skilled lip-readers, and it is apparently a difficult skill to transmit, so the area is under-resourced. In this paper we investigate the possibility of making the lip-reading task more amenable to a wider range of operators by enhancing lip movements in video sequences using active appearance models. These are generative, parametric models commonly used to track faces in images and video sequences. The parametric nature of the model allows a face in an image to be encoded in terms of a few tens of parameters, while the generative nature allows faces to be re-synthesised using the parameters. The aim of this study is to determine if exaggerating lip-motions in video sequences by amplifying the parameters of the model improves lip-reading ability. We also present results of lip-reading tests undertaken by experienced (but non-expert) adult subjects who claim to use lip-reading in their speech recognition process.
    --
    My God, it's Full of Source!
    OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
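
For anyone wondering what "amplifying the parameters of the model" might look like in practice, here is a rough sketch. It is not the paper's code. It assumes a shape-only linear model (mean shape plus a weighted sum of orthonormal modes), which is a big simplification of a real active appearance model, since those also model texture. The function names, the toy mean shape, the single "mouth opening" mode and the gain of 2 are all made up for illustration.

```python
# Toy sketch only: a linear shape model, NOT a full active appearance model.
# All names and numbers below are hypothetical, chosen for illustration.

def encode(mean, modes, shape):
    """Project a shape onto orthonormal modes to get its model parameters."""
    centred = [s - m for s, m in zip(shape, mean)]
    return [sum(c * v for c, v in zip(centred, mode)) for mode in modes]

def synthesise(mean, modes, params):
    """Rebuild a shape from parameters: mean + sum_i params[i] * modes[i]."""
    shape = list(mean)
    for weight, mode in zip(params, modes):
        for j, component in enumerate(mode):
            shape[j] += weight * component
    return shape

def exaggerate(mean, modes, shape, gain):
    """Encode a shape, scale its parameters by `gain`, and re-synthesise it."""
    params = encode(mean, modes, shape)
    return synthesise(mean, modes, [gain * p for p in params])

# Two landmarks stored as (x, y, x, y): upper lip at y = 1, lower lip at y = -1.
MEAN = [0.0, 1.0, 0.0, -1.0]
# One unit-length mode that moves the lips apart (0.6^2 + 0.8^2 = 1).
MODES = [[0.0, 0.6, 0.0, -0.8]]

# A frame in which the mouth is slightly open, then the same frame exaggerated.
frame = synthesise(MEAN, MODES, [0.5])
bigger = exaggerate(MEAN, MODES, frame, gain=2.0)
print(frame)
print(bigger)
```

With a gain of 1 the frame comes back unchanged, and anything above 1 pushes the lips further from the mean shape, which is the "exaggeration" the abstract is talking about. How much gain actually helps human lip-readers is what the paper tests, and this sketch says nothing about that.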