Slashdot Mirror


Reading Lips In Software

SEWilco writes "The Register points out that Intel has released code for reading lips from a video image, Audio Visual Speech Recognition (AVSR). They do point out that better results would probably be achieved by combining video and audio recognition processing. I don't know if they have any patents, we all know some prior "art" from 2001, er.. 1968. HAL's accomplishment was also mentioned by CNN during 2001 in an article about this group's work."

149 comments

  1. The only hope for privacy: by burgburgburg · · Score: 5, Funny
    Thick mustaches.

    Men and women, boys and girls. All with really thick, dirty, obscuring mustaches.

    What is this world coming to?

    1. Re:The only hope for privacy: by deadsaijinx* · · Score: 2, Funny

      No, this is the era of ventriloquism, as the ventriloquists will rise up against those who mocked them; their plans for unholy vengeance will go unnoticed as our only safety net collapses.

      --
      YOU SUCK BALLS!
    2. Re:The only hope for privacy: by Ugmo · · Score: 0, Offtopic
      Thick mustaches.

      Men and women, boys and girls. All with really thick, dirty, obscuring mustaches.

      What is this world coming to?

      Well, if we give the girls male hormones to grow mustaches, the world will look like the East German Women's Swim Team.

      If we forgo the hormones and just have the men grow mustaches and the women wear veils to cover their faces, the world will look like 1990s Iraq, with all the guys being Saddam doubles and all the women in burqas, covered head to toe.

      Thank Goodness we're living in Free America not Iraq.

    3. Re:The only hope for privacy: by DeadScreenSky · · Score: 1

      Well, on a positive spin, maybe some of us could then get new jobs.

      --
      There is no excellent beauty that hath not some strangeness in the proportion. -- Francis Bacon
    4. Re:The only hope for privacy: by vano2001 · · Score: 0, Offtopic

      Hmmm... maybe Saddam was on the right track... makes you think about all those mustaches in Iraq! Oh, and guess what: the Iraqi Information Minister was prolly the only one without a mustache... coincidence? I think not!

    5. Re:The only hope for privacy: by Ores · · Score: 1

      Completely off topic, and I realise this was a joke, but women were not at all oppressed in the way you are thinking; there was no enforcement at all of such traditions. This is why people such as Osama bin Laden called Saddam an infidel. Ironically, any new (non-American) government in Iraq is far more likely to hold these traditional values. If you're gonna bomb a country, at least learn a little about it first before destroying it.

  2. Good or Evil? by Blaine+Hilton · · Score: 5, Insightful
    That's all we need, now everybody and his brother can easily create software applications to log everything. Security cameras record a lot of movements, but imagine hooking that up to lip readers and then being able to grep through all of that text output? Total Information Awareness here we come......

    Go calculate something

    1. Re:Good or Evil? by phrogeeb · · Score: 1
      Of course, it seems that the camera would need a near head-on, uninterrupted view of the person whose lips are being read, which is not an ideal surveillance situation and makes it pretty much impossible to track actual conversations (with people facing one another).

      I'd be scared of speaking to my computer now, tho - can you imagine a virus that uses your own webcam or whatever to see what you're saying when you're sitting in front of the screen?

      I guess no more webcam sex for me. =(

      --

      ------

      "Will the highways on the Internet become more few?" --George W. Bush, in Jan. 2000

    2. Re:Good or Evil? by geekoid · · Score: 1

      If this is able to reach accurate reading in less than ideal circumstances, the only way to ensure the government does not act outside its mandated bounds would be by having a completely open society.

      --
      The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
    3. Re:Good or Evil? by cpeterso · · Score: 1


      This reminds me of the Seinfeld episode where George wants to "borrow" Jerry's deaf girlfriend to read people's lips. George and Jerry try to hide their lips from her as they discuss her lip-reading abilities, then she can read their lips anyway.

  3. No new taxes! by Shadow+Wrought · · Score: 5, Funny

    Oh wait, that was a different lip reading session...

    --
    If brevity is the soul of wit, then how does one explain Twitter?
    1. Re:No new taxes! by Anonymous Coward · · Score: 0

      Recent work by Intel on the video from the "No new taxes" pledge reveals that Bush actually promised, "No Newt axes." A pledge he was able to keep.

    2. Re:No new taxes! by graveyhead · · Score: 1
      Oh wait, that was a different lip reading session...
      From a different bush :P

      [ducks]
      --
      std::disclaimer<std::legalese> sig=new std::disclaimer; sig->dump(); delete sig;
  4. What about changing what people say? by KPU · · Score: 2, Interesting

    Anybody else reminded of the Read My Lips videos that fit clips to songs?

    1. Re:What about changing what people say? by Anonymous Coward · · Score: 0

      this is the reason i'm sceptical about lip reading software being better than speech recognition software. lip movements can be anything.

      i'm not deaf so i'm not sure how a deaf person may read lips but i have a feeling that they kind of guess by extrapolating what the subject is at hand. for example: a cbc commentator reading the lips of mark crawford during a timeout period of yesterdays vancouver/minnesota hockey game ("GO TO THE NET!"), i don't think he could've done this without the hockey environment in place.

    2. Re:What about changing what people say? by PeteEMT · · Score: 3, Informative

      I am deaf, and you're pretty much right, at least some of the time. Without context, I find lipreading very hard to impossible; with context I can get maybe 80% of the words and can fill in the blanks.
      I know others can lipread better than I can, but even in lipreading class they said that you won't be able to catch everything and have to fill in the blanks.

      Just to note, All Deaf people can't lipread and not all people can be lipread. Bushy Mustaches, not moving your mouth when you talk are two big obstacles. (a personal peeve when someone expects me to lipread them)

      --
      Pete
    3. Re:What about changing what people say? by PeteEMT · · Score: 1

      That should be: NOT All deaf people CAN lipread.

      --
      Pete
  5. Woot, this is a godsend for us college students. by yeoua · · Score: 5, Funny

    Maybe now with a cluster at our finger tips and this sound visual lip analyser thing, we may be able to (finally) understand what all those foreign heavy accented professors are actually mumbling about...

    And well, beats manual note taking if the computer can read the board and his mouth and his voice.

  6. Re:Woot, this is a godsend for us college students by Anonymous Coward · · Score: 0

    there should be a place near the college that sells notes.

  7. Prior Art? by Anonymous Coward · · Score: 0

    I don't know if they have any patents, we all know some prior "art" from 2001, er.. 1968. Did Clarke ever file a patent for the geosynchronous satellites?

    1. Re:Prior Art? by meringuoid · · Score: 4, Interesting
      Did Clarke ever file a patent for the geosynchronous satellites?

      No, he never did. If he had, he would almost certainly by now be far and away the richest man on the planet. Now, imagine if you will what Arthur Clarke might have done with a fortune that would make Gates green with envy... He'd have been on Mars twenty years ago.

      --
      Real Daleks don't climb stairs - they level the building.
    2. Re:Prior Art? by Anonymous Coward · · Score: 1, Funny
      For some reason I keep picturing Gates green from trying to breathe on Mars after Arthur Clarke sent him there. Something about your wording put that in my head.

      --
      I'm not a cowboy, just let me submit this comment and get back to work.

    3. Re:Prior Art? by RatBastard · · Score: 1

      Think about the money Robert Heinlein would have made if he had patented half of the things he devised for his stories!

      --
      Boobies never hurt anyone. - Sherry Glaser.
    4. Re:Prior Art? by Anonymous Coward · · Score: 0

      No, he wouldn't be. The US government would have repossessed his patent 'for matters of national security'.

    5. Re:Prior Art? by Anonymous Coward · · Score: 0

      The real question is this: could he have done so?

      Could he have patented the very idea of putting a satellite at a certain altitude in a certain orbit?

      Could he have patented the mathematical calculations required to define the orbit? Geosynchronous orbits are explained by simple Newtonian physics in your average junior high science class.

      He couldn't have patented the calculations for the amount of thrust, the orbital mechanics of the geosynchronous transfer orbit, or any of the hairy math, because he never did it. He certainly couldn't have patented the means to get a satellite to geosync orbit, because that's rocket design, and he didn't do that.

      The man had an idea: an object placed at a certain altitude will stay above the same point on the planet, and this could be used for the purposes of communication.

      An original concept, but does this qualify for patent protection?
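
      For what it's worth, the "simple Newtonian physics" part can be sketched in a few lines. This is only an illustration of Kepler's third law for a circular orbit; the constants are standard published values rather than anything from Clarke's paper, and the function name is made up for the example.

```python
import math

# Standard published constants (assumptions for this sketch, not from the thread).
GM_EARTH = 3.986004418e14   # Earth's gravitational parameter, m^3/s^2
SIDEREAL_DAY = 86164.0905   # one rotation of the Earth, in seconds
EARTH_RADIUS = 6378.137e3   # equatorial radius, in metres


def geosync_radius(gm=GM_EARTH, period=SIDEREAL_DAY):
    """Radius (from Earth's centre) of a circular orbit with the given period.

    Kepler's third law for a circular orbit: T^2 = 4 * pi^2 * r^3 / GM,
    solved here for r.
    """
    return (gm * period ** 2 / (4 * math.pi ** 2)) ** (1 / 3)


radius_km = geosync_radius() / 1000
altitude_km = radius_km - EARTH_RADIUS / 1000
print(f"orbital radius ~ {radius_km:,.0f} km")
print(f"altitude       ~ {altitude_km:,.0f} km above the equator")
```

      The result lands near the familiar figure of roughly 35,786 km of altitude. Getting a satellite there is, as noted, a different problem entirely.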

    6. Re:Prior Art? by Rich0 · · Score: 1

      I thought that all patents had to be reduced to practice. You have to have a working prototype, or a detailed enough design to demonstrate that you can in fact make the thing.

    7. Re:Prior Art? by SEWilco · · Score: 1
      No, he never did. If he had, he would almost certainly by now be far and away the richest man on the planet. ... He'd have been on Mars twenty years ago.

      I see you also had the same thought I did. Clarke would have been the richest man, but he wouldn't be on this planet. If not on Mars, he'd at least be...in Clarke Orbit.

      (I like his paper's discussion about whether radio frequencies might pass through the atmosphere: "..we have visual evidence that frequencies at the optical end of the spectrum pass through [the atmosphere] ...")

    8. Re:Prior Art? by Anonymous Coward · · Score: 0

      A method for the insertion of a metallic spheroid object such that it remains in orbit at a constant position over a fixed land mass.
      Yes, he could have patented it, if the same people granting patents today were granting them in the late '60s.
      Could he have made gobs and gobs of money off it? Maybe, maybe not, but he probably coulda worked out a deal to get put on a mission to the moon...
      Oh, and patents aren't international. If the US had had a patent on geosync satellites, the Russians or someone else who didn't have patents on them could have put them up for others for money. So making gobs of money is questionable.
      Having the power to prevent satellite communications from taking off, yes, that could happen from a patent like that.
      Remember, you can't just mint money without producing value... otherwise you're draining a finite resource from others.
      Patents and copyright are about as anti-progress as any law can be. They make an intangible worth money, without having had to produce any real value to anyone... you just had to think up the idea before anyone else.
      Sure, they do allow someone who has an idea to set a price which can recoup any costs associated with thinking it up, which to a reasonable limit may promote thinking of new ideas, but ultimately, unless there is an expiration date, you've gone past promoting development and moved into making a money printing press for someone.

    9. Re:Prior Art? by sysjkb · · Score: 1

      If he [Clarke] had [filed a patent for geosynchronous satellites], he would almost certainly by now be far and away the richest man on the planet.

      Clarke published his idea in 1945. The first geosynchronous satellite, Syncom I, wasn't launched until 1963. Clarke's patents would have run out by the time geosynchronous communications satellites began filling the skies.

      Sincerely yours,
      Jeffrey Boulier

  8. finally! by sweatyboatman · · Score: 1

    a reason to really hunker down and learn an obscure Chinese dialect.

    I've been putting it off for far too long.

    --
    It breaks my pluginses, my precious!
    1. Re:finally! by MisterFancypants · · Score: 3, Funny
      a reason to really hunker down and learn an obscure Chinese dialect.

      Good plan... Oh wait, who would you talk to? Bad plan.

    2. Re:finally! by Anonymous Coward · · Score: 0

      blah bleh blah super bleh blah bleh blah super bleh blah bleh blah super bleh blah bleh blah super bleh blah bleh blah super bleh blah bleh blah super bleh blah bleh blah super bleh and then the monkey gods decended on your ass. blah bleh blah super bleh blah bleh blah super bleh and then hue heffner rapes your grandma blah bleh blah super bleh blah bleh blah super bleh and then buddah eats some ice cream blah bleh blah super bleh and then the chinese sue me blah bleh blah super bleh ....

      uhhh, ok, no snow for you mister. i sense no danger here. i sense no need to learn an obscure chines dialect. I sense only peace. but, hey, maybe it's the opiates...... grr, baby, i'm the berry best smoothy you ever had. APPLE!

  9. So computers can now talk to themselves (Re /.) by skermit · · Score: 5, Interesting

    A couple months ago, a very fine article was posted to /. about work at MIT regarding speech-->video synthesis using pre-recorded syllables. This means in the near future we'll be able to have avatars which can communicate to other people by videophone and/or other computers should we wish to do so. I'm reposting the old link because it got /.'ed for about 2 months (the professor took down the link) before putting the vids back up. So check out the amazing work that's on the flip-side of this article.

    http://cerboli.mit.edu:8000/research/mary101/results/results.html

    --
    -Christopher Wu
    http://www.christopherwu.net/
    1. Re:So computers can now talk to themselves (Re /.) by Anonymous Coward · · Score: 1, Funny

      "So computers can now talk to themselves"

      Yep, it is called networking.

    2. Re:So computers can now talk to themselves (Re /.) by Anonymous Coward · · Score: 0

      computers dont talk! well, some do, but its real choppy. have yourself an ass-fuck-alifically good time

    3. Re:So computers can now talk to themselves (Re /.) by haroldhunt · · Score: 2, Interesting

      Great! So computers can talk to themselves but they still haven't got anything to say.

    4. Re:So computers can now talk to themselves (Re /.) by Entropy248 · · Score: 1

      Figuring that after the first 2 months we /.ed them, they'd never expect a sneak attack. Onwards /.ers!
      Why not combine the whole works with some Animatronics (a la Disney) and make some robots?

  10. Body language by Smallpond · · Score: 4, Funny


    Body language should be even easier than lip reading. I want to know if I'm wasting my time or whether I should invite her back to my place.

    1. Re:Body language by Suchetha · · Score: 5, Funny

      simple.. you're posting on /. .. face it.. you're wasting your time

      Suchetha

      --

      learn from yesterday, plan for tomorrow, party tonight
      or one out of three ain't bad
    2. Re:Body language by thulldud · · Score: 1

      "Gesture recognition" is even easier. Want me to interpret that last one for you, Dave?

    3. Re:Body language by Lizard_King · · Score: 1

      The interesting thing about this is that body language is an *important* part of lip reading. Facial expressions and gestures can add a lot of meaning to communication... I wonder what type of gesture recognition this system claims to have.

      --
      "My mother never saw the irony in calling me a son-of-a-bitch." - Jack Nicholson
    4. Re:Body language by Anonymous Coward · · Score: 1, Funny
      LOL ROTFLMAO!!! You so funny witty slashdot poster!! You make old joke seem fresh and not like old stinky turd!!! Please share more surprising-delivery of classic-style joke!!

      You: Did you know chickenz cross road to not be on old side?
      Us: LOL!! You so funny!!

      You: Black peoples are funny with watermelon in their cadillac!
      Us: OMG - its funny cause its funny cause itr true!!!

      You: Please, take my wife!
      Us: Ha Ha ha ah ah aH

      You: Nerd get no sex!!!! Ha ha ha
      Us: Funny but true but sad but funny too

      You: Q: How many X to screw in light bulb? A: Solve for X!!!
      Us: Please keep up the hilarity!!!

      You: Knock Knock!
      Us: PLEASE DIE

  11. Some coding expertise... by flamingspinach · · Score: 4, Insightful

    Wow, that must have taken a lot of hard work to do. First you'd have to recognize the location of the lips in the images (they might not stand out that much, especially in a crowd scene), then find the region in which the lips are moving, then finally use the positions of the lips to extrapolate for the current shape of the inside of the person's mouth, and make a haphazard guess at the sound being produced. And you'd need to be able to recognize the lips from any angle whatsoever. Sounds near impossible to me... and besides, by the point at which the person is beyond the range of the audio pickup of a security camera (I'm assuming that's what this would be used for), it would also be beyond the point of bad resolution. (unless the target is in a crowd, in which case the lips would be obscured frequently by people moving around in front of the target).
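
    To make the "find the region in which the lips are moving" step a bit more concrete, here is a toy sketch using plain frame differencing on two tiny grayscale frames. It is an assumption-laden illustration only: the function name and threshold are invented for the example, and nothing here reflects how Intel's AVSR code actually locates or tracks the mouth.

```python
def motion_bounding_box(prev_frame, next_frame, threshold=30):
    """Return (top, left, bottom, right) of the pixels that changed between
    two equal-sized grayscale frames, or None if nothing changed enough.

    Frames are lists of rows of 0-255 intensity values.
    """
    rows = []
    cols = []
    for y, (row_a, row_b) in enumerate(zip(prev_frame, next_frame)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if abs(a - b) > threshold:
                rows.append(y)
                cols.append(x)
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# Two tiny 5x6 "frames": only a 2x3 patch (the pretend mouth) changes.
closed = [[100] * 6 for _ in range(5)]
opened = [row[:] for row in closed]
for y in (2, 3):
    for x in (1, 2, 3):
        opened[y][x] = 20

box = motion_bounding_box(closed, opened)
print(box)
```

    Anything else that moves in the shot (a passer-by, a flickering light) would land in the same box, which is roughly the crowd-scene objection in code form.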

    1. Re:Some coding expertise... by flamingspinach · · Score: 2, Interesting

      Hey, and what about Chinese? Reading inflection would be near impossible, even if you looked at the person's voicebox (assuming it's visible).

    2. Re:Some coding expertise... by Nihilanth · · Score: 3, Insightful

      yeah, a lot of asian languages rely on internal vowel sounds that make lip-reading nearly impossible. Maybe if they used lasers to measure the sound pressure waves, or vibrations of the voicebox in conjunction with the lipreading.

    3. Re:Some coding expertise... by flamingspinach · · Score: 2, Interesting

      That second one could work, but can lasers measure pressure fluctuations? I would think that air wouldn't reflect a laser, and if one measures the pressure by the speed of light through the medium (high pressure will slow it down slightly), you'd need a reflector of some sort...

    4. Re:Some coding expertise... by Flakeloaf · · Score: 1

      All I wanted was a lip-reading computer with a frickin pressure-sensing laser beam attached to its USB port. Was that too much to ask?

      --

      Am I the only one who heard Roxette to sing "I'm gonna get blitzed for some sex"?

    5. Re:Some coding expertise... by awilden · · Score: 1

      A lot of these issues were taken care of a long time ago. In 1996 several of my colleagues published a simple system for doing this in realtime (including integrating sound and video together for speech recognition) at the European Conference on Computer Vision (the paper is on CiteSeer), and there are several other papers from that same era describing similar systems. Clearly Intel has a more complete system than these papers (as you would expect given 7 years), but it's not as hard as you're making it seem.

    6. Re:Some coding expertise... by flamingspinach · · Score: 1

      Impressive... ;:o (I'm an amateur coder and am fazed easily by complex-sounding projects)

    7. Re:Some coding expertise... by Anonymous Coward · · Score: 0

      are you sure you don't want this lip-reading laser grafted into your skin --- modded from a futurama quote

    8. Re:Some coding expertise... by RobPiano · · Score: 1

      There is actually a lot of research in video analysis right now. Yes it is tough, but there are lots of tricks and a growing and fun community!

      If you are a Windows user I suggest trying eyesweb (www.eyesweb.org). Sorry, it works a little better if you use PAL and I don't like Windows, but it's open source and flexible.

      The only problem is it's a KILLER on the CPU. The more analysis you do the more research you need. For face info, check out blob analysis and a head set camera.

      If you want to know more you can send me a message.

      Kind Regards,
      Rob

    9. Re:Some coding expertise... by deadsaijinx* · · Score: 1

      i'm beginning to wonder if a microphone might be the simpler solution. too bad mics don't have that Gee-Whiz factor we love so much

      --
      YOU SUCK BALLS!
    10. Re:Some coding expertise... by Entropy248 · · Score: 1

      Why the hell would they need a laser?? I think if it's gone that far they might just consider a MICROPHONE! WTF guys?! I know it's news for nerds, but... I can only think of a few applications for this that a microphone would be useless and that a laser with a mirror setup couldn't possibly help with sound pickup. Spying from tall buildings, maybe, but I can't imagine how much zoom those cameras would have to have and how stable they'd have to be to lipread something from any ridiculous distance. The manpower involved in going through even grepped transcripts would be insane!

    11. Re:Some coding expertise... by Nihilanth · · Score: 1

      laser microphones have been widely used in james-bondesque espionage situations for years (those of us who've played Splinter Cell were forced to use one more than once), basically a laser microphone measures the vibrations on a plate of glass and tunes into conversations by measuring the vibrations.

      Sound pressure waves cause the density of air to fluctuate, which would bend the path of a beam of light travelling through it.

      basically, you'd need more than one laser in this situation, i think, you'd need like a 3d array of them.

      actually, forget lasers. Focused radio waves (which are just like light but with a lower frequency) would react to sound pressure waves too.

    12. Re:Some coding expertise... by Nihilanth · · Score: 1

      Relying on sound pressure waves has its limitations, especially in noisy or crowded areas. The number of microphones needed to reliably process everyone's conversations would have to grow exponentially with the amount of noise in the area. Relying on a visual information source might be trickier given the current state of computational power, but from a pure signal standpoint, it isn't affected by the ambient noise level or interference from other conversations. It basically lets you sidestep the problem of sorting out who is saying what. In the process, of course, other problems are created, but hey, that's what happens.

      I'm not too worried about the whole thing; the more resources they dump into this kind of foolishness, the more shit people will get away with as The Powers That Be get complacent in their technological superiority. Biometrics, for example, is silly and doesn't work. Once we waste enough money on this sort of thing, it'll be abandoned for surveillance purposes and integrated into consumer electronics in flashy ways.

  12. Planet Express Delivery Ship by luzrek · · Score: 2, Funny
    Unlike HAL the Planet Express Delivery Ship cannot read lips.

    Fry, Leela, and Bender are hiding out in the shower discussing how to turn off the Planet Express Delivery Ship. The little red light is on, the screen is scrolling back and forth between the lips as Leela gives orders and Bender objects. Then the ship says, "Oh, if only I could read lips!"

    --

    Galium Arsenide is the material of the future, and always will be.

  13. Hmm.... by Anonymous Coward · · Score: 0

    Open the pod bay doors HAL!

  14. Orwellian p0ssibilities by asadodetira · · Score: 2, Insightful

    Cameras randomly zooming in on the lips of the crowd; if somebody says something from some "list" of words, they keep tracking that person and run some face recognition as well.

    1. Re:Orwellian p0ssibilities by flamingspinach · · Score: 1

      Maybe it's time to develop a neo-Cockney...

    2. Re:Orwellian p0ssibilities by spumoni_fettuccini · · Score: 0

      Nawww...Welsh! That way only one word in thirty can be made out.

      --
      -- Some days you're the dog; some days you're the hydrant.
  15. Not that 2001 ended up being very accurate... by DeadScreenSky · · Score: 5, Interesting

    ... but I think it is interesting that Arthur C. Clarke thought HAL reading lips was the only implausible scene in the film. You know, as opposed to the whole aliens thing. :P Just goes to show you the perils of trying to predict the future...

    --
    There is no excellent beauty that hath not some strangeness in the proportion. -- Francis Bacon
  16. Sigh... by ScoLgo · · Score: 3, Interesting

    Sigh... the signal to noise ratio alone is enough to lend you reasonable anonymity. There's just way too much information that would need to be grepped through in order to listen in on your dinner conversation. No one, (or their Big Brother), is going to bother unless they have a really good reason to be investigating you in the first place.

    I'm thinking that the 'good' will outweigh the 'evil' here...

    --
    "Michael, I did nothing. I did absolutely nothing - and it was everything that I thought it could be."
    1. Re:Sigh... by shaitand · · Score: 3, Interesting

      How about having it record everything it picks up and time-code it, so that you can grep for the words "revolution", "bomb", or "nuts itch" and then cross-reference the hits to the time sequence in the video? This is then passed on to the FBI as routine policy for "the war on terror".
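
      A rough sketch of what that grep-and-cross-reference step might look like, assuming the lip reader already emits (timestamp, text) pairs. The transcript format, the function name, and the sample lines are all made up for illustration; no real system's output is being described.

```python
import re


def find_hits(transcript, watch_words):
    """Yield (timestamp, word, text) for every watch word found in a line.

    `transcript` is an iterable of (timestamp_seconds, text) pairs.
    Matching is case-insensitive and on whole words only.
    """
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in watch_words) + r")\b",
        re.IGNORECASE,
    )
    for timestamp, text in transcript:
        for match in pattern.finditer(text):
            yield timestamp, match.group(1).lower(), text


# Hypothetical time-coded output from a lip-reading pass over one camera feed.
transcript = [
    (12.0, "pass the salt please"),
    (47.5, "The industrial Revolution started in England"),
    (93.2, "this dip is the bomb"),
]

hits = list(find_hits(transcript, ["revolution", "bomb"]))
for timestamp, word, text in hits:
    print(f"{timestamp:7.1f}s  {word!r}  in  {text!r}")
```

      Both hits in this sample are dinner-table chatter, which is the signal-to-noise problem from the parent post: the search is cheap, but someone still has to read the results.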

    2. Re:Sigh... by ScoLgo · · Score: 2, Interesting

      Well, it's possible that my tinfoil hat is on crooked today...

      From the Reg article... "Intel's announcement implies that the system works better when coupled with facial recognition to identify 'known' speakers."

      Doesn't this imply that, at least for the foreseeable future, this technology won't be easily used as some general Orwellian tool? It sounds as though it needs to 'learn' each speaker - much like voice recognition software has to be trained to your voice before it can be used accurately.

      From the Intel link... "The speaker independent audio-visual continuous speech recognition system relies on a robust set of visual features obtained from the accurate detection and tracking of the mouth region."

      As mentioned by someone else in another thread, this system relies on a relatively uninterrupted view of the speaker's face. There are billions of people on this planet, all moving around willy-nilly and not worrying about holding still long enough for this technology to track their mouth movements. It's therefore just not feasible to apply this to public video 'eavesdropping'.

      It's more likely to be used in educational situations and for people with special needs, (automatic translation of seminar presentations for the deaf, perhaps?).

      As I already said, I can see this being used by government spooks to track certain individuals that are already under investigation - hopefully after getting a warrant.

      Of course, I could be wrong...

      --
      "Michael, I did nothing. I did absolutely nothing - and it was everything that I thought it could be."
    3. Re:Sigh... by LordMyren · · Score: 1

      Maybe that's true now, but as computing power continues its exponential growth, Big Brother will have the computing power necessary to analyze more and more data.

      Twenty years and why not hear everything?

    4. Re:Sigh... by Anonymous Coward · · Score: 0

      It just means that Wal-Mart will have to keep a data warehouse of all the video so each time you enter a store they can find all your previous appearances and see if they have enough samples yet to "learn" your voice and decode all of your past conversations.

  17. Copyrighted Prior Art by Anonymous Coward · · Score: 0, Informative

    I don't know if they have any patents, we all know some prior "art" from 2001

    Just in case anyone gets the wrong idea here, copyrighted works cannot be used to contravene a patent.

    1. Re:Copyrighted Prior Art by Anonymous Coward · · Score: 3, Interesting


      Just in case anyone gets the wrong idea here, copyrighted works cannot be used to contravene a patent.

      erm, yes they can. In fact, the firm I work for specializes in that very thing.

    2. Re:Copyrighted Prior Art by Anonymous Coward · · Score: 0

      A patent has to be novel and nonobvious. You may use copyrighted material along with any other prior art to try and prove that an idea is either not novel or completely obvious. Simply saying "one sci-fi author once thought this might be possible, somehow, using.. you know, computers" is not sufficient to prove that the patent is neither novel nor nonobvious.

  18. I would like to volunteer my time to this project by Anonymous Coward · · Score: 0

    during the judging phase of the competition.

  19. But will it work in this situation by Anonymous Coward · · Score: 0

    You know in Ace Ventura, when Ace was talking out of his asshole? Can it translate that?

  20. Re:Woot, this is a godsend for us college students by deadsaijinx* · · Score: 1, Insightful

    i doubt it would help. The way i see it, the image would have to be clear, and the person can't be moving too much, and they have to be enunciating their English. after all, i hardly move my lips while speaking, so it couldn't read mine

    --
    YOU SUCK BALLS!
  21. Reason for this being released as open source by gilesjuk · · Score: 1

    Call me cynical but has this been released as open source so it will be rapidly improved before being used in an Intel product?

    1. Re:Reason for this being released as open source by Anonymous Coward · · Score: 0

      Not really. Intel has had an open-source computer vision library for a while. It's not aimed at being a product itself, it's aimed at helping people develop products for intel chips.

    2. Re:Reason for this being released as open source by ceejayoz · · Score: 1

      How's that cynical? Isn't that a good thing?

    3. Re:Reason for this being released as open source by JohnFluxx · · Score: 1

      no, the reason it is released is so that people integrate it into more programs, and so more people require faster processors to process this.

  22. This just in... by Jippy_ · · Score: 1

    Usage of IRC across the globe suddenly drops as users are dismayed by the number of people asking to sweep with them.

  23. Read my lips... by Anonymous Coward · · Score: 0

    Read my lips, "No New Taxes!" Can the software identify a liar?

    1. Re:Read my lips... by Anonymous Coward · · Score: 0

      I think he was saying "No New Texas!"

  24. Too late for me... by raehl · · Score: 3, Funny

    I might have done better in my AI class if I had been able to read lisp. All those damned parentheses made life very difficult.

  25. Oh yeah? Lip Read this! by Metallic+Matty · · Score: 2, Funny



    1. Re:Oh yeah? Lip Read this! by Anonymous Coward · · Score: 0



      I says "I want you to cum on my face."

    2. Re:Oh yeah? Lip Read this! by Anonymous Coward · · Score: 0

      I says "I want you to cum on my face."
      of course you do.

  26. Can it read this? by ektor · · Score: 2, Funny

    No... more... taxes.

  27. Anybody played with other languages? by Xerithane · · Score: 1

    I know that English is one of the more easy languages to "lip read." It goes back to the Latin roots, and such. I'm sure that using slang will make it much harder, but I'd be curious how it works with other languages. I think that Japanese (when spoken clearly, and not using dialects) would be incredibly easy where Chinese could be very difficult. If anybody has time and a desire to hack on it, keep me posted if you do multi-lingual work. I'm really curious how it goes.

    --
    Dacels Jewelers can't be trusted.
    1. Re:Anybody played with other languages? by geekoid · · Score: 1

      I would imagine any language that involves clicking sounds would be difficult, as well as bird calls.

      --
      The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
    2. Re:Anybody played with other languages? by deadsaijinx* · · Score: 2, Funny

      yes, since there are so many people concerned with reading the lips of birds, especially since they don't have lips, or talk. yeah, consider this my karma burn for the evening.

      --
      YOU SUCK BALLS!
    3. Re:Anybody played with other languages? by Anonymous Coward · · Score: 0

      IANALinguist, but...

      English is one of the more easy languages to "lip read."
      I'd imagine that this is because English tends to be spoken relatively slowly and precisely, compared with other languages. This probably balances out the numerous sounds of the language. The problem, though, is converting the speech into text. As English essentially has random spelling rules, viz. "ghoti", it would be difficult to come up with a good speech-to-text algorithm.

      "I think that Japanese...would be incredibly easy".
      Japanese would be an excellent choice, simply because it has very clear, simple rules for making words out of sounds; a syllable follows the basic consonant-vowel pattern. As Japanese has relatively few sounds compared with English, I'd imagine it would be a good language with which to start.

      "Chinese could be very difficult."
      The tonality alone would make things very hard indeed; I don't think it is possible to distinguish tone by sight alone. In addition, since Chinese has many homonyms, finding the right character for a spoken syllable would provide a great challenge. Some syllables can be written using up to a dozen characters, which must be inferred from context.

      Freedom^H^H^H^Hnch (the war's over), incidentally, would probably be one of the most difficult I can think of, for some of the same reasons as Chinese. Although it is not tonal, French has many, many homonyms. Also, spoken French tends to be fast-paced, with swallowed vowels and final consonants, and with slurred-together words.

    4. Re:Anybody played with other languages? by Max+Romantschuk · · Score: 1

      If I've understood things correctly, Chinese uses both formants and pitch to signify meaning. A formant is a distinct sound, like A or O, which is recognizable at any pitch, and that part can be lip-read.

      But can pitch be lip-read? If not, would a system like this work at all for languages that apply pitch as well as formants to distinguish between words?

      --
      .: Max Romantschuk :: http://max.romantschuk.fi/
    5. Re:Anybody played with other languages? by Xerithane · · Score: 1

      Japanese would be an excellent choice, simply because it has very clear, simple rules for making words out of sounds; a syllable follows the basic consonant-vowel pattern. As Japanese has relatively few sounds compared with English, I'd imagine it would be a good language with which to start.

      Japanese lip reading is very hard though. For example, you can't tell if I'm saying, "tsu", "zu", "su" just by my lips. You can also go through "ra", "ri", "ru", "re", "ro" without moving your lips (and if you do, you sound like a major foreigner)

      Japanese would be hard to lip read, but easy to recognize phonetically (without slang.) For example, a common word is "sumimasen" but it's pronounced (bear with me, switching to English since it's slang) "see-mah-sen"

      It seems that this would work best in a language that balances between lip pronunciation and phonetic pronunciation. English, perhaps German as well. Spanish would probably not be difficult, nor would French, as long as you realize you are talking to a computer...

      --
      Dacels Jewelers can't be trusted.
  28. Ha I've fooled them by geekoid · · Score: 1

    I only use sign language!

    fools...
    ummm wait.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  29. I will put you on my prayer list. by Anonymous Coward · · Score: 0

    Along with the DNS protocol and France.

  30. OpenCV under Linux? by 26199 · · Score: 1

    I've investigated Intel's vision library, OpenCV, before... and it does appear to be available for Linux if you look hard enough... but I couldn't find any Linux applications using it to actually *do* something.

    Has anyone had any success with OpenCV/Video4Linux?...

  31. How do you think the court system would handle... by djoham · · Score: 2, Interesting

    ...someone recording to video a person *speaking* the source code of DeCSS and then using this tool in combination with gcc to generate libDVDCSS?

    Would this tool then be declared a "circumvention device" under the DMCA, or would the courts finally realize that code can be considered protected speech? The code was, after all, spoken in its original form in this case.

    This same question could also be applied to audio-to-text converters as well. Maybe there's hope the DMCA will be declared unconstitutional after all.

    Interesting food for thought...

    David

  32. ig-pay atin-lay by bryanthompson · · Score: 4, Funny

    ersonally-pay, i-ay(?) erfer-pay o-tay use pig latin.

    geeze, that really wasn't worth the effort...

    1. Re:ig-pay atin-lay by Echelon309 · · Score: 0

      Whenever you have a word that begins with a vowel, such as "I", you simply add "yay" to the end. So "I" becomes "I-yay". Kind of like if you want to speak Latin, you just add "us" to the end of every word ;)
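
      The vowel rule above can be sketched in a few lines of Python. This is only an illustration: the function name `pig_latin`, the hyphenated output format, and the choice to move the whole leading consonant cluster (rather than just the first letter) are assumptions, not anything specified in the thread.

```python
VOWELS = "aeiou"

def pig_latin(word: str) -> str:
    """Translate one non-empty lowercase word (hypothetical helper)."""
    if word[0] in VOWELS:
        # Vowel-initial words just get "yay" appended, as described above.
        return word + "-yay"
    # Otherwise move the leading consonant cluster to the end and add "ay".
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[i:] + "-" + word[:i] + "ay"
    # No vowels at all: treat the whole word as the consonant cluster.
    return word + "-ay"

print(" ".join(pig_latin(w) for w in "i prefer to use pig latin".split()))
```

      Moving only the first consonant instead would be a one-line change to the loop.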

    2. Re:ig-pay atin-lay by UnknownQ · · Score: 1

      I prefer to use google for all my pig latin needs.

      --
      Wherever you go, there you are!
    3. Re:ig-pay atin-lay by SEWilco · · Score: 1

      I-yay-us am-yay-us ultilingual-may-us?

  33. Prior Art by cperciva · · Score: 3, Informative

    Software and business model patents have evidently affected comprehension of what a patent entails.

    "A computer, examining a set of video images, to perform lip reading" is not patentable. HAL would be prior art for this; but it doesn't matter because there isn't any inventive step here anyway.

    "A computer, processing a set of video images by locating what appears to be a set of lips, selecting recognizable points, using the movement of those points to track the deformation against a 3D model, comparing against a table of syllables to compute the probability of each particular syllable, and using knowledge about a language to determine which syllables are most likely to follow each other" could be patented. HAL would not be prior art for this, because there is no indication of how HAL performed the lip reading.

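    The last two steps of that hypothetical claim (per-syllable probabilities from the lip shapes, plus knowledge of which syllables tend to follow each other) look a lot like standard Viterbi decoding over a hidden Markov model. Here is a minimal sketch of that idea; it is not taken from Intel's code, and every syllable, probability, and name below is made up for illustration.

```python
import math

def viterbi(frames, syllables, transition, start):
    """Return the most likely syllable sequence.

    frames:     frames[t][s] = P(lip shape observed at step t | syllable s)
    transition: transition[a][b] = P(next syllable is b | current is a)
    start:      start[s] = P(first syllable is s)
    All probabilities are assumed to be non-zero.
    """
    # Work in log space to avoid underflow on long sequences.
    score = {s: math.log(start[s]) + math.log(frames[0][s]) for s in syllables}
    back = []
    for obs in frames[1:]:
        prev, score, pointers = score, {}, {}
        for s in syllables:
            # Best predecessor for syllable s at this step.
            best = max(syllables, key=lambda p: prev[p] + math.log(transition[p][s]))
            score[s] = prev[best] + math.log(transition[best][s]) + math.log(obs[s])
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final syllable.
    path = [max(syllables, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Toy data: "ba" and "pa" look identical on the lips, so only the
# language model (start and transition tables) can separate them.
SYLLABLES = ["ba", "pa", "na"]
FRAMES = [
    {"ba": 0.45, "pa": 0.45, "na": 0.10},
    {"ba": 0.10, "pa": 0.10, "na": 0.80},
    {"ba": 0.10, "pa": 0.10, "na": 0.80},
]
START = {"ba": 0.5, "pa": 0.3, "na": 0.2}
TRANSITION = {
    "ba": {"ba": 0.1, "pa": 0.1, "na": 0.8},
    "pa": {"ba": 0.3, "pa": 0.4, "na": 0.3},
    "na": {"ba": 0.2, "pa": 0.2, "na": 0.6},
}

print(viterbi(FRAMES, SYLLABLES, TRANSITION, START))
```

    With these toy numbers the first frame is a visual tie between "ba" and "pa", and the decoder settles on "ba" only because the language model favours it.
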
  34. Re:The only lips I read by Anonymous Coward · · Score: 0

    of course, you wouldn't know about them from experience, nore would most of the slashdot crowd. No, the only experience you have with that set of lips is while masterbating to the illustrations in your high school anatomy class. all though, i am quite familiar the lips of your mom softly groping my cock. peace out and happy trolling.

  35. Fox News by Jru+Hym · · Score: 2, Funny

    It probably wouldn't work for Greta "Lips" Van Susteren

    --
    This lobster was alive when it hit the frothy, boiling water.
    1. Re:Fox News by Jru+Hym · · Score: 2, Funny

      Next test subject: The Vagina Monologues

      --
      This lobster was alive when it hit the frothy, boiling water.
  36. It's cool with me... by Bendebecker · · Score: 1

    Read my lips: "No new invasions of privacy... hey, wait a minute!"

    --
    There's a growing sense that even if The Future comes,
    most of us won't be able to afford it.
    -- Lemmy
  37. SF movies typically don't count as prior art... by GoBears · · Score: 4, Informative
    I don't know if they have any patents, we all know some prior "art" from 2001, er.. 1968.

    patents are supposed to be on inventions, not ideas. (very) generally speaking, you have to demonstrate you know how to do something for it to count as prior art. actually building something counts, as does a patent application (since the patent application has to explain how the invention works at a reasonable level of detail, for an admittedly arguable legal definition of reasonable).

    ianal, but the last i heard, a mention in a science fiction book or movie wouldn't typically be considered prior art. a person skilled in the art can't tell from 2001 how to make a computer read lips.

  38. Stupid, offtopic and not funny. by RatBastard · · Score: 1

    The evil trolls inside my head keep trying to make a joke about women, scanners and a lack of pants, but it's just not coming together.

    --
    Boobies never hurt anyone. - Sherry Glaser.
  39. Lisp ?? by Nyktos · · Score: 1

    Oh oh!! L-I-P-S!!

    First I thought, Jeeze... I can already read Lisp, emacs style...

    Then... ohhhh... they mean Lisps... like a speech impediment... That would be cool, to read lisps.

    But reading lips makes much more sense.

  40. Actually, this could be a major breakthrough by RhettLivingston · · Score: 3, Interesting

    in speech recognition if it does no more than allow input from a camera to aid in separating out which sounds came from which speakers. Simply fixing the background noise problem would be a huge advance.

  41. silence is golden by mashie · · Score: 1


    This could solve my fundamental beef with speech as an interface - privacy! Dictating email and documents would be great, if I didn't have to broadcast to everyone around me. Not to mention the annoyance of hearing the guy in the next cube complain to his girlfriend over IM...

    Mouthing words silently takes some getting used to, but it has advantages. No more trying to type on a tiny PDA keyboard - etc. Obviously this is a ways off, but it seems doable.

  42. Re:How do you think the court system would handle. by Anonymous Coward · · Score: 0

    That is only interesting if you are a sweaty, pear-shaped, socialist whiner nerd.

    If you started to worry more about personal hygene and less about the DMCA, maybe you could actually convince a female to talk to you.

  43. Re:How do you think the court system would handle. by SamBeckett · · Score: 1

    If you beat a dead horse, will it die some more?

  44. cool by tadheckaman · · Score: 1

    but when can I get this on my desktop? it would be really neat to chat through IRC without making a sound. oh wait...

    --
    My potato gun was confiscated by the United Nations. They said I wasn't allowed to have weapons of mash destruction.
  45. Still patentable? by AmoebafromSweden · · Score: 1

    The article submitter says:
    "I don't know if they have any patents, we all know some prior "art" from 2001, er.. 1968. HAL's accomplishment was also mentioned by CNN during 2001 in an article about this group's work."

    Is there not a difference between the idea and the way to implement the technical solution? Meaning they cannot patent the idea, but they can still patent the code itself for the way the code works.

    Just curious. What does everyone think?

  46. I am unreadable by Lips · · Score: 1

    :-b

  47. I don't like this at all! by pair-a-noyd · · Score: 1

    HAL, as in "2001" for one thing. You all know about that already.

    The REAL THREAT of this is "them" using cameras to look at people from afar (or by whatever means) and eavesdropping on people when they can't get a microphone in..

    You can be sure that H.L.S. will jump on this like white on rice...

  48. They certainly aren't the first by Omegalomaniac · · Score: 2, Informative

    It's been done at Carnegie Mellon as well.

  49. Maybe SARS masks for everyone now! by DeathoP · · Score: 0, Offtopic

    And if I'm approached to remove it, I'll know that someone is trying to monitor me. Could this be why the CDC is concerned with SARS? Can't read lips with SARS masks blocking the "flow of information". :)

    1. stop SARS.
    2. collect information.
    3. profit.

  50. video quality problems by jolyonr · · Score: 1

    I can imagine the source video material quality may be quite critical to this. It would be much easier to process a signal from a DVD, for example, than a composite video camera.

    But then on a DVD you'd just hit the subtitle button and problem sorted :)

    --


    Please read my Canon EOS tech blog at http://www.everyothershot.com
  51. Oh yeah THAT'll work by graveyhead · · Score: 3, Funny

    Lindsey Naegle: Do I detect a note of sarcasm?
    Frink: (With sarcasm detector) Are you kidding? This baby is off the charts mm-hai.
    CBG: A sarcasm detector, that's a real useful invention.
    (Sarcasm detector explodes)

    --
    std::disclaimer<std::legalese> sig=new std::disclaimer; sig->dump(); delete sig;
  52. Bah humbug... your brain already does this. by rice_burners_suck · · Score: 1

    The human mind parses speech by using both senses of sight and sound. They demonstrated this on the news one time by repeating a word over and over. They instructed the viewers to look at the screen while listening, then at some random time, to close their eyes and then open them again after waiting an interval, all while continuing to listen. Sure enough, when I closed my eyes, the word I heard was a completely different word, even though when I looked at the screen, I wasn't necessarily looking at the person's lips. In other words, your cabeza does this automatically. Obviously, the two words were similar enough in sound that this worked, but it demonstrated that in addition to using context to provide meaning, your brain uses other information as well.

  53. Sports by Dynastar454 · · Score: 2, Funny

    I know what I want this for- I want to read the lips of all the coaches and players during basketball/baseball/whatever broadcasts. Maybe ESPN could offer this as a feature, censoring as needed. :-)

    --


    Laugh at stupidity: mod idiots +1 Funny.
  54. Re:Woot, this is a godsend for us college students by wmspringer · · Score: 1

    While it probably wouldn't work - the prof moving back and forth, not pronouncing words clearly - it'd be a great help if it actually did... Last semester I had an interpreter for one of my classes. (I'm deaf and the teacher had a heavy accent) My interpreter couldn't understand him either!

  55. Soviet Russia by ripleymj · · Score: 1

    In Soviet Union .... computer reads you.

  56. Re:Woot, this is a godsend for us college students by scotch · · Score: 1
    Do they sell e's, too? If so, you may want to pick up a carton.

    --
    XML causes global warming.
  57. Voice recognition that doesn't require training? by sowellfan · · Score: 1

    I'm trying to transcribe some tapes of lectures right now, and I'm looking for an easy way out. I know speech recognition programs are out there, but from what I know, they need significant training of the user with the program in order to work.

    Unfortunately, my voice is not the one giving the lectures, and there are actually two or three different lecturers. Since training is impossible (AFAIK, at least), I'm wondering how far speech-to-text technology has come, especially in the open source community. Can I just pipe the output from the wav file into the input of a speech-to-text program, and if so, what sort of signal-to-noise ratio can I expect on the output without training? (graphical interface would be nice) Right now it's taking me about four hours to transcribe 50 minutes worth of lecture.

  58. DMCA by michaelhood · · Score: 0, Troll

    This whole, "maybe we could apply 'blah' to rule the DMCA unconstitutional!" thing is turning into the next "wow, imagine a beowulf cluster of [...]".

    STOP IT WHILE YOU CAN. PLEASE! I BEG OF YOU.

  59. Re:How do you think the court system would handle. by Niten · · Score: 1

    Very good point... for that matter, how would the courts handle it even without this new technology? Even without programs that can read words from video, it is still theoretically possible (though maybe not practically possible) that someone could read the source code to DeCSS aloud onto a video tape, such that someone else at the receiving end could manually record that code into a source file and compile it.

    (And if you wanted to be really ironic about it, you could always store the video on a DVD :-) )

    Do you think the almost invulnerable association that we make between the video and audio recording of somebody speaking and the term "free speech" would give this medium any better legal footing than the traditional source-code-on-magnetic-disks? If I remember correctly, at one point the PGP developers were in the business of exporting their strong encryption to non-US territories by means of publishing their source code in a book... correct me if that is wrong.

  60. Lipreading is a myth, as is this code working. by nloop · · Score: 4, Informative

    I have taken many years of ASL classes and am pretty involved with Deaf culture; one of the biggest myths about it is people's ability to read lips.

    The idea most people have of lipreaders, like in the movie See No Evil, Hear No Evil (the Richard Pryor and Gene Wilder comedy) or the Seinfeld lipreader episode, just really isn't possible. Many sounds such as "t" and "d" look exactly the same, and many such as "k" and "g" are not visible at all. The best lipreaders really can only get 2/3 of what is being said (if they are entirely Deaf, which many Deaf people are not; if your hearing loss is not total it can be far more effective), and that is with the person speaking slowly, facing them, and human intuition (context). Throw in facial contortions (like yelling... "they can't hear me so if I yell it will help"), low light, bad angle, fast talking, etc. and the accuracy drops dramatically.
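
    To make the "t" and "d" point concrete, here is a small Python sketch that groups words by how they would look on the lips. The viseme classes, the toy lexicon, and the function names are all hypothetical and heavily simplified; real viseme inventories are larger and messier.

```python
from collections import defaultdict

# Hypothetical viseme classes: sounds in the same class are assumed to
# look alike on the lips. "k" and "g" share a "hidden" class because
# they are made too far back in the mouth to see.
VISEME = {
    "p": "lips-closed", "b": "lips-closed", "m": "lips-closed",
    "t": "tongue-tip", "d": "tongue-tip", "n": "tongue-tip",
    "k": "hidden", "g": "hidden",
    "f": "lip-teeth", "v": "lip-teeth",
    "i": "spread", "a": "open", "o": "round",
}

def viseme_sequence(phonemes):
    """Map a list of sounds to the sequence of lip shapes a viewer sees."""
    return tuple(VISEME[p] for p in phonemes)

def confusable_groups(lexicon):
    """Return groups of words whose lip-shape sequences are identical."""
    groups = defaultdict(list)
    for word, phonemes in lexicon.items():
        groups[viseme_sequence(phonemes)].append(word)
    return [sorted(words) for words in groups.values() if len(words) > 1]

# Toy lexicon with one letter per sound, purely for illustration.
LEXICON = {
    "pat": ["p", "a", "t"], "bad": ["b", "a", "d"], "man": ["m", "a", "n"],
    "tip": ["t", "i", "p"], "dip": ["d", "i", "p"], "fig": ["f", "i", "g"],
}

for group in confusable_groups(LEXICON):
    print(" / ".join(group))
```

    Under these assumed classes, "pat", "bad" and "man" collapse into one visual pattern, which is why context has to do so much of the work.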

    Computers lack the ability to figure out what word is being said based on context when the lips don't provide adequate information. They are also historically terribly poor at things like complex image recognition. Registration script busting is based on what? Image recognition with noise in the image (i.e. type the word that appears in the next form box). And no one has even come close to a functional computer ASL interpreter, and ASL is far easier to distinguish visually than speech.

    I don't see that 40% word error rate it currently has improving much at all, and I'm guessing the video feed that figure comes from isn't anything like full-speed, non-exaggerated human speech.

    Your fears of the video cameras on the streets logging your conversations are pretty unfounded ;)

    1. Re:Lipreading is a myth, as is this code working. by Anonymous Coward · · Score: 0

      Yep, that's exactly what I thought. I have a deaf son with some residual hearing (he wears two hearing aids), and when he can see my lips, his comprehension level goes way up. If I just mouth the words, his ability to understand goes *way* down. Lipreading is an art, and an exhausting one at that (imagine trying to decide on the fly what possible words could fit the context of the conversation, and which one of those the other person is actually saying. Then watch yourself saying "I love you" followed by "olive juice" in a mirror). Most of the deaf people I know, even the very talented lipreaders, prefer sign.

      It's for this reason I wasn't real excited about the cellphone for lipreaders stuff that was posted here a while back as well...

    2. Re:Lipreading is a myth, as is this code working. by CompVisGuy · · Score: 1

      I agree. I'm a computer vision researcher, and even the very best methods will only perform well under some quite limited circumstances.

      Vision systems need to be composed of many modules (e.g. face detector, lip segmentation, lip shape modelling, model interpretation...), and each module will have many assumptions made about the data it works with, some of which the researcher will make accidentally. The whole will only work if the sets of assumptions are compatible, and the assumptions will limit the situations in which the vision system can be used.

      Usually, this kind of research is assessed by using what we would consider very 'easy' movies, just containing the lips and no other confounding features.

      As far as using computer vision to interpret American Sign Language, I doubt this is any easier. Hands are much more complex than faces -- they have very many degrees of freedom -- and when signing, occlusion will occur that will make the computer vision task very difficult. I would imagine that there is significant variation in how people sign, how fast they do it, how explicitly they do it, and they may well use other means of communication to add 'intonation' to what they are saying (for example using facial expressions or making non-standard gestures).

      Although there is some research going on to do automatic interpretation of sign language, I'm not sure that it is any easier than lip reading.

      --


      "The noble art of losing face will one day save the human race"---Hans Blix
  61. a big difference by led_belly · · Score: 0

    There's a big difference between the HAL computer and software that can read lips. I am surprised though not shocked to see that the analogy was used. After all, HAL was supposedly a thinking entity that taught itself to read lips (we presume in the movie that 'it' was programmed to do so). Where, in this instance, we have a computer being explicitly programmed to read lips.

  62. Reading lithpth? by floydigus · · Score: 1

    My computer hath been able to read lithpth for yearth.

    --

    All things in moderation; including moderation

  63. The Conversation by BigBadBri · · Score: 1
    Puts me more in mind of Coppola's The Conversation.

    The idea of combining it with speech recognition in an adaptive fashion, using one source to cross-check the other, could open up a whole new area of privacy invasion.
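
    One simple way to picture "using one source to cross-check the other" is late fusion: each recognizer produces a probability for every candidate word, and the two tables are combined with a weight. The sketch below is an assumption about how that might look, not a description of the AVSR code; the function name `fuse`, the weight of 0.7, and the numbers are all invented.

```python
def fuse(audio, video, audio_weight=0.7):
    """Combine two probability tables over the same candidate words.

    Uses a weighted geometric mean (log-linear fusion), then renormalizes
    so the result sums to 1. audio_weight is a hypothetical tuning knob.
    """
    combined = {
        word: (audio[word] ** audio_weight) * (video[word] ** (1.0 - audio_weight))
        for word in audio
    }
    total = sum(combined.values())
    return {word: p / total for word, p in combined.items()}

# In a noisy room "met" and "net" sound nearly alike, but the lips close
# for "m" and not for "n", so the video stream can break the tie.
audio_scores = {"met": 0.45, "net": 0.55}
video_scores = {"met": 0.90, "net": 0.10}

fused = fuse(audio_scores, video_scores)
print(max(fused, key=fused.get))
```

    A real system would probably adjust the weight with the noise level, trusting the camera more when the microphone is drowned out.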

    Imagine this stuff running on all the CCTVs in the town where you live...

    --
    oh brave new world, that has such people in it!
  64. doesn't work well by j1r3 · · Score: 1

    In French the word "benjamin" and the French translation for "eat shit" have exactly the same lip movement pattern. Just a thought.

  65. Where there's life there's hope / Please pass the by tiltowait · · Score: 1

    ... lavender soap. I've heard those two phrases can be difficult to tell apart by a lip reader.

    As for computer lip reading, there's a chapter in Hal's Legacy about this very topic.

  66. on the issue of prior art... by LifesABeach · · Score: 0

    i believe the science fiction story 'colossus, the forbin project' makes use of the computer not only reading lips, but also body movement.

  67. Re:How do you think the court system would handle. by stonecypher · · Score: 1

    I think it'd be about the same as if you were to try to sell an audiobook you recorded of someone else's text. Free speech only covers things you created in the first place. It's not free usage of speech.

    --
    StoneCypher is Full of BS