Microsoft Research Uses Kinect To Translate Between Spoken and Sign Languages
An anonymous reader writes in with a neat project Microsoft is working on to translate sign language with a Kinect. "Microsoft Research is now using the Kinect to bridge the gap between folks who don't speak the same language, whether they can hear or not. The Kinect Sign Language Translator is a research prototype that can translate sign language into spoken language and vice versa. The best part? It does it all in real time."
Kinect translation with new autocorrect-for-ASL: "I really like your tits"
Thanks Microsoft!
Sent from my ENIAC
How do you sign: Dear aunt, let's set so double the killer delete select all?
Required reading for internet skeptics
... flipping the bird?
some signs need no translation.
... if I flip the bird, does it say FUCK YOU?
Sign language isn't actually very relevant these days.
Almost no one understands sign language, and it is quicker and more convenient for the disabled to send a text message.
Mobile phone technology is to thank for this: it is making the world far fairer every day.
Priest: "Universe from nothing, no laws of physics, sped up time"+ huge discrepancies. Creationism? No. Big Bang Theory
There are a lot of contextual clues necessary to understand sign language. Most conversations would seem "faux caveman"-like to the outsider - a lot of Noun Verb Noun going on...
I'm going to have to watch the video from another machine, but I'm more interested in the bumper at the bottom that has realtime English/Chinese translation in your own voice...
We were doing real-time ASL translation to text using the webcam on the Indy2 workstation back in 1997. The success rate was about 85%, and most of the misses were from hidden-object problems, which the Kinect does nothing to help with.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
As far as ways to communicate online go, I'm not sure how useful a tool this would be. I can definitely see how this could easily become the best way to learn sign language, though, if paired with Rosetta Stone-like tutoring software. My wife has been planning to learn sign language soon; I'm sure she'd love to have something like this as a learning aid.
I tried to tackle the same thing a couple of months back using OpenCV and a smartphone. Before starting I consulted with people who knew sign language, and the problem is not so much recognizing hand gestures as recognizing facial expressions. They say that most of the conversation happens with a given look, a frown, or the movement of the lips.
Needless to say, I thought it was too hard to solve the problem on a smartphone, so I postponed the project. I don't think Kinect can do a much better job at picking up the small details of facial expressions, but let's wait and see.
Goodbye Apple and Google. This technology is set to put you OUT. OF. BUSINESS.
If the Kinect is tracking your arms/legs/fingers/head etc., I'm fairly certain it can be programmed to start using facial expressions to fill in the missing parts of the conversation.
Looks like about the quality of verbal speech recognition...
I have never seen anyone sign that clearly or slowly except to very new beginners.
Yup. Many signed phrases are just noun, verb, point, noun.
There is a lot of research being done with Kinect by BS and master's students, mostly around physiotherapy. This is one of those creative applications where everybody says, after hearing about it, "damn, why didn't I think of that". Very creative use of the Kinect.
Looking at the video in the article, it seems that "in real time" means "at about 1/4 of the speed of regular signing".
Imagine. Having. To. Speak. Like. Kirk.
Esli epei etot cumprenan, shris soa Sfaha.
Not sure how ASL compares to BSL (I'm a BSL n00b), but facial expressions are one thing that has bothered me about sign-to-text/speech systems. Without facial expressions: I can't express the magnitude of certain things; I can't ask questions; I can't answer some questions; I can't distinguish between things like uncle, aunt, nephew, niece or battery; yadda yadda. Non-manual features are really important grammatically and expressively.
There are also signs that don't really have an equivalent in spoken/written language unless you start incorporating diagrams into the translation.
Don't get me wrong, I think work like this is a great idea but the projects I've seen so far either ignore the problem or grossly simplify it - losing the power and expressiveness of sign language.
Disclaimer: yes I work for Microsoft. No not on these projects.
This was demo'd live in front of 30K MSFT employees at our annual company meeting. It nearly brought me to tears. Yes, I can see through demoware, and yes, it's highly imperfect, but honestly it was the single most impressive use of technology I've ever seen. It was both novel and simple. It combined hardware, algorithms, user experience, and cloud scale. I don't know if it will ever go anywhere, though I expect that it will. The key point here is that these are off-the-shelf components: Kinect and gesture APIs combined with machine translation and text to speech. It's important that these are all, or nearly all, public production APIs. Such a system 10 years ago, even if possible, would never have made it to market because of the tiny user base. Today we can build such apps for the 0.01% of the population that need Mandarin Sign Language translated to English. And it can be done cost-effectively. That is the point: technology being used to address real problems for underserved communities. So yes, maybe people researched automated sign language recognition years ago, but bringing it to market and enabling a scenario for real people is a wholly different beast.
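For anyone wondering what "off the shelf components" chained together might look like, here is a minimal sketch in Python. It is not the actual Microsoft system: every function below is a hypothetical stand-in (no real Kinect, translation, or text-to-speech API is called), and the only point is that each stage is swappable behind a simple interface.

```python
from typing import Callable, List


def build_translator(
    recognize: Callable[[List[str]], List[str]],
    translate: Callable[[List[str]], str],
    speak: Callable[[str], str],
) -> Callable[[List[str]], str]:
    """Chain three independent stages: sign recognition, translation, speech."""
    def run(frames: List[str]) -> str:
        glosses = recognize(frames)   # sensor frames -> sign glosses
        text = translate(glosses)     # glosses -> target-language sentence
        return speak(text)            # sentence -> audio (a string here)
    return run


# Hypothetical stand-ins for the real services (assumptions, not real APIs).
def fake_recognize(frames: List[str]) -> List[str]:
    return [f.upper() for f in frames]


def fake_translate(glosses: List[str]) -> str:
    return " ".join(glosses).capitalize() + "."


def fake_speak(text: str) -> str:
    return f"<audio:{text}>"


pipeline = build_translator(fake_recognize, fake_translate, fake_speak)
print(pipeline(["hello", "friend"]))  # <audio:Hello friend.>
```

Swapping any one stage for a production service would leave the other two untouched, which is roughly the argument about public APIs.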
I knew it! The Federation runs on Microsoft technology. Guess that explains all the exploding consoles and Transporter accidents.
I've got better things to do tonight than die.
The ASL interpreters I know do a lot of on-call work for medical, mental health and educational purposes. One thing they mentioned is ASL has regional dialects.
They (I assume the same group) demoed the real-time English/Chinese translation in your own voice last year. It's really impressive, and the results were surprisingly good.
I do wonder how it deals with phonemes that are present in one language but not in another. Maybe there's a "training process" you have to do initially to make sure it has enough recorded samples to get full coverage of the target language.
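Purely as a guess at what such a coverage check could look like (this is not how the demoed system works, and the phoneme symbols below are made up for illustration), a sketch in Python:

```python
from typing import Iterable, List, Set


def missing_phonemes(
    target_inventory: Set[str],
    recorded_samples: Iterable[Iterable[str]],
) -> List[str]:
    """Return target-language phonemes the speaker has no recorded sample of."""
    covered = {p for sample in recorded_samples for p in sample}
    return sorted(target_inventory - covered)


# Hypothetical data: phonemes needed by the target language, and the
# phonemes found in each utterance the speaker recorded during training.
target = {"a", "i", "u", "sh", "zh", "x"}
samples = [["a", "sh"], ["i", "u", "a"]]

print(missing_phonemes(target, samples))  # ['x', 'zh']
```

Anything in the returned list would need extra recordings, or some kind of approximation from the nearest phoneme the speaker did record.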
Comment of the year
nods and looks sad, grabs cap brim with left hand and runs right thumb along chin
Unlikely.
Much subtler changes for facial work.