Robotic Hand Translates Speech into Sign Language
usermilk writes "Robot educators Keita Matsuo and Hirotsugu Sakai have created a robot hand that translates the spoken word into sign language for the deaf. From the article: 'A microchip in the robot recognizes the 50-character hiragana syllabary and about 10 simple phrases such as "ohayo" (good morning) and sends the information to a central computer, which sends commands to 18 micromotors in the joints of the robotic hand, translating the sound it hears into sign language.'"
Good lord! I imagine the Japanese language with its 1945+ character alphabet is hard enough to learn; learning Japanese sign language must really suck.
You know what would really spoil those deaf kids is, instead of a robot doing sign language, a robot that shows images or words based on what a speaker says. I know, I know; creating a robot to do this is a feat in itself and impressive in its own right, but perhaps there are better ways of communicating with a robot if it can already perform more than adequate speech recognition.
Falun Dafa is good!
After reading this letter, you will never again be able to trust Mr. Scuttle Monkey, and you will see with crystal clarity the way that reason, not make-believe, is the best way to deal with the real evils of our world. Before I begin, let me point out that his flock appears to be growing in number. I indeed pray that this is analogous to the flare-up of a candle just before extinction yet I keep reminding myself that many people think of his prudish insults as a joke, as something only half-serious. In fact, they're deadly serious. They're the tool by which myopic devotees of conspiracy theories will dress up Mr. Monkey's profit motive in the cloak of selfless altruism by next weekend. A second all-too-serious item is that perhaps one day we will live in a world where good people are not troubled by fear of semi-intelligible, crass half-wits. Until that day arrives, however, we must spread the word that Mr. Monkey has a knack for convincing self-serving gadflies that advertising is the most veridical form of human communication. That's called marketing. The underlying trick is to use sesquipedalian terms like "internationalization" and "roentgenographically" to keep his sales pitch from sounding inaniloquent. That's why you really have to look hard to see that Mr. Monkey's overweening dream is starting to come true. Liberties are being killed by attrition. Vandalism is being installed by accretion. The only way that we can reverse these satanic trends is to speak up and speak out against Mr. Monkey. To be precise, I would be grateful if he would take a little time from his rigorous schedule to compile readers' remarks and suggestions and use them to shout back at his propaganda. Of course, pigs will grow wings and fly before that ever happens. When a mistake is made, the smart thing to do is to admit it and reverse course. That takes real courage. The way that Mr. 
Monkey stubbornly refuses to own up to his mistakes serves only to convince me that he either is or elects to be ignorant of scientific principles and methods. Mr. Monkey even intentionally misuses scientific terminology to replace intellectual integrity with ridiculous sloganeering.
Mr. Monkey acts as if he were King of the World. This hauteur is astonishing, staggering, and mind-boggling. Make special note of that point, because if we are to expose his double standards for what they really are, then we must be guided by a healthy and progressive ideology, not by the mindless and macabre ideologies that Mr. Monkey promotes.
According to the latest scientific evidence, Mr. Monkey's attempts to let ruthless nincompoops run rampant through the streets are much worse than mere lexiphanicism. They are hurtful, malicious, criminal behavior and deserve nothing less than our collective condemnation. It is similarly noteworthy that we must understand that Mr. Monkey works from the false assumption that most people actually want meretricious agitators to sully a profession that's already held in low esteem. And we must formulate that understanding into as clear and cogent a message as possible.
Mr. Monkey is addicted to the feeling of power, to the idea of controlling people. Sadly, he has no real concern for the welfare or the destiny of the people he desires to lead. All I can tell you is what matters to me: I like to speak of him as "crapulous". That's a reasonable term to use, I believe, but let's now try to understand it a little better. For starters, it's undoubtedly a tragedy that Mr. Monkey's goal in life is apparently to condemn children to a life of drugs, gangs, drinking, rape, incest, verbal abuse, physical abuse, and a number of other horrors. Here, I use the word "tragedy" as the philosopher Whitehead used it. Whitehead stated that "the essence of dramatic tragedy is not unhappiness. It resides in the solemnity of the remorseless working of things," which I interpret as saying that in a recent essay, Mr. Monkey stated that he has achieved sainthood. Since the arguments he made in the rest of his essay are based in part on that assumpt
They only need to put it on wheels and it can become a scutter.
Additional warning:
Do not let this robot pat you on the back whilst near the top of the stairs.
liqbase
Call me culturally insensitive but, why not simply translate speech to text?
... Does it also distinguish between different 'dialects' in sign language?
I seem to recall that sign languages differ between countries, same as 'natural' language.
However this is really great for deaf people.
80 CC D8 AF AE D3 AB 54 B7 2E CE 67 C7
signing "I'll be back"
I'm also blind, you insensitive clod!
Can it translate:
"FUCK YOU"
Into a finger sign?
ha...ha
Ubuntu is an African word meaning 'I can't configure Debian'
Would this not be more useful as software, able to render simple 3d hands with low microprocessor overheads, and preferably available for mobile phones and PDAs?
Deaf people could carry a PDA, and when they need to find out what someone is saying, they can hold the PDA up like a microphone and watch the screen, assuming the translation is at least reasonably accurate...
Of course they could lipread too but some find that harder than others, and this could also be used eventually to cross language barriers?
I imagine it's extremely hard to lipread a foreign language.
Tickle... Amy... Tickle
http://www.imdb.com/title/tt0112715/
The possibilities for this are endless - converting 'wanker' into an off the wrist gesture, raising one or two fingers for the US or UK symbol for f*** off, the list goes on... On a practical level, however, this is surely of limited use. Conversion to text would be far more useful and allow deaf people to talk to non-deaf folk. Cute application, mind you, and it could have some spin-offs for better robotic hands in the future.
So it's not just a hand, but a hand with two legs!
To them, it's not a foreign language.
Anyone else thinking there might be some other 'alternate' use for this device?
This gives a whole new meaning to the phrase "talk to the hand"
Yes but not nearly as intimidating. Who's going to get their lunch money taken -- deaf kid with a PDA, or deaf kid with a giant robot hand?
This might as well be from the "doing-it-just-because-we-can" dept. As many slashdotters have already pointed out, this is pretty impractical.
Mentifex AI might benefit from this technology of sign language as a form of speech output.
Microsoft will probably find a means of censorship for users in China.
Since sign languages are different all over the world, I don't know if there is the same problem in Japan, but:
American Sign Language is not English (American or other).
Thus, translating speech to ASL would reach people that understand ASL but don't read English.
Exam 4/C again. Maybe I'll do better this time.
Finally deaf people can use a computer too.. err.. wait a second..
Without a camera that translates sign language into spoken language.. This is kinda useless isn't it?
...and yes.. I know some deaf people can talk sorta.. So I guess it helps there.
You can talk to the hand, sure, but that doesn't help you read the deaf person's hands..
--
In retrospect:
Awesome! This could be very cost saving for when my son starts school. Just train the robot on his teachers voice to sign what they are saying... It looks like the only thing I'll have to figure out how to do is to translate SEE - Signed Exact English to ASL - American Sign Language...
Hope they make a good SDK...
Can you flip off someone in Japanese? Give them the OK sign? Give them a stop with the full palm?
It would be interesting to know how these motions translate, if at all.
He who knows best knows how little he knows. - Thomas Jefferson
if i tell someone to f-off will it flip them off for me?
A good sign language interpreter can read signs from a fair distance, well across a board room at least. How far away can the PDA be before you stop being able to read the text on the screen?
This sig has absolutely no significance and serves only to take up screen space and waste the time of the reader.
Unfortunately, no ;-)
Yup, much the same as how, unfortunately, no one's come up with a universal spoken or written language. Gosh, let alone trying to get a universal programming language...
for all practical moans and groans FreeBSD showed one common goal - legitimise doing Contributed code achievemEnts that = 1400 NetBSD Cycle; take a say I'm packing
For the sake of being informative, here's a good page on Japanese Sign Language. It's not the same as American Sign Language, which isn't the same as British Sign Language as someone's sure to post eventually. *sigh* Short of Gestuno, there is no universal sign language, no more than there is a universal spoken or written language. *rolls eyes* Except, of course, Esperanto, which everyone speaks by now, right?
After reading a couple of replies to my parent post, I was thinking about people that might understand signing, but not read or hear.
It is my understanding that children can learn to sign before they can learn to read. (In fact hearing children can learn to sign before learning to speak.)
Similarly, developmentally challenged people, such as certain people with Down's Syndrome, never learn to read, but can sign just fine.
Reading takes certain specific brain functions, and it is not inconceivable that there are people who have had head trauma that damaged the reading part of their brain, and the hearing part, but can still understand sign language.
These are just quick thoughts and may have lots of holes, and little sense. Please feel free to expand/correct/flame/whatever.
There's a researcher at Gallaudet working on the other side of this equation with a system designed to recognize sign language, which seems like a much harder problem.
ASL isn't like English, where there are always specific words - a lot of it has to do with spatial context (where in the signing space the sign was made) and a whole class of signs that don't translate directly into words (they're hand shapes which can translate into an event or a description of an object or set of objects).
And, as the research page shows, facial expressions and even facial movements can be part of a sign.
Of course, this is American Sign Language; Japanese Sign Language may be very different.
So now I have to learn Japanese to communicate with the deaf?
I remember reading about a Stanford grad student project doing this ten years ago, and a winner of the National Science Fair doing this three or four years ago.
... will a version be demonstrated at future Ann Summers parties...
Smile, it confuses people
The article states that the hand is 80 cm large (doesn't specify, but I'm guessing that's height). 80 cm is almost three feet for non-metric types. My own hand is only about 12 cm long. Is this the largest communicating hand on the planet? Or, as is more likely, the 80 cm takes into account the massive box of micromotors and computing. Pay no attention to the man behind the curtain.
There is an article Evolution of Mechanical Fingerspelling Hands for People who are Deaf-Blind that talks about the development of this technology since 1977.
There are a couple of challenges with this type of technology. Sign language does not depend only on finger movements but also on gestures and facial expressions to convey emotion and context. Finger-spelling hands, being mechanical, can only accept data so fast before they start "choking" and seizing up/breaking (we tried hooking one up to a teleprompter application, and its middle finger got stuck - go figure).
This technology can be exciting on a small scale, but is not meant (not able) to act as a replacement for sign language or even closed captioning.
There have been other efforts to develop speech-to-sign robots. I recall one being featured on the Discovery channel many years ago that was able to fingerspell a variant of ASL that is used by persons that are both deaf and blind. That was nearly 10 years ago. In that case, the person "listens" by placing their hand over the signer's hand, and feels the different handshapes.
On another note, this sort of translation is actually more difficult than a voice-to-text, text-to-sign translation. As someone who studied sign language for several years, I can categorically state that such a direct translation is not the same as ASL, which has a different grammar and word order. The way things are described and conveyed in ASL is often not just a matter of stringing different signs that represent different words together into a sentence. Oftentimes it uses spatial and directional relations as well. Humans are able to understand these meanings quite naturally, but it is difficult to program that kind of style into a computer. The translation of voice-to-sign is just as difficult as translating between, say, spoken English and Chinese. A more difficult task than voice-to-sign would be to go the other way around: a camera or motion-based system that extrapolates what a signer is saying, and then translates that to normal English.
The next question which I have is the significance of body positioning of signs in JSL. Most ASL signs have migrated to the face and upper-chest region, but I know some sign languages have a great amount of significance in the body positioning and it may range all over the place.
In the early 90's I worked with the robotic finger spelling hands called "Dexter" & "Ralph". Those devices were intended for individuals who are both deaf and blind. An individual with this kind of disability must rest their hand on the back of someone's hand (or on the back of the robotic hand) and feel letters as they are signed by the hand/fingers.
Now I can get a freaking ROBOT HAND attached to my HEAD so I can WRITE GOOD MORNING and have aforementioned ROBOT HAND sign it to a deaf guy. That's a LOT EASIER than just writing good morning and SHOWING IT TO THE DEAF GUY!
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Bah, Joel invented this on MST3K years ago. Where's my edible sneakers?
I've had enough abrasive sigs. Kittens are cute and fuzzy.
Someone could be blind and deaf. But then why not use braille? The situation I can imagine is maybe a person knew sign language but then became blind later in life. So that would be one of the only ways to communicate. From what I understand a lot of older people have eyesight problems, so for the deaf this is even worse.
The other use could be for teaching sign language. There's a lot of people that know a little sign language, but perhaps not enough to teach someone. Seeing a robotic hand do it in three dimensions might help.
Also if you develop the technology for a sign language hand there's probably other uses for it. Imagine a robotic hand on the end of a stick that you could use to grab fragile things from high shelves. There's thousands of things it could be used for if you use your imagination
The sending of this message pretty much inconveniences everyone involved.
It's not like that robotic hand actually has to manipulate anything. That way, the program would be actually usable by any deaf person with a notebook that has a microphone.
In the time it takes to program the robot to do a bad simulacrum of someone doing each sign, they could have just video'd someone doing all the signs. Then it's more visible to a bigger audience, too.
I'm not saying it's not an interesting project, but it's not a practical solution to the problem.
Let's not stir that bag of worms...
Can I get one of these for the back of my car?
guess this is more for the sake of doing rather than being practical
my karma will be here long after I'm gone
wonO't be shou7ing
Why a robotic hand? Why not simply text on a screen?
http://ablegray.com
So when someone tells a deaf person 'f--- you,' does the robotic hand give the deaf person the finger?
then slashdot readers would really get excited.
So, the steps are these: Recognize language, use translator (of the babelfish kind) to translate to sign language, render signing hand.
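A rough sketch of those three steps, for illustration only. Everything here is hypothetical: the phrase table, the `FS:` fingerspelling fallback, and the function names are assumptions, and the "recognizer" is just a stand-in that takes text that has already been transcribed. It also maps word for word, which ignores the grammar and spatial issues other posters bring up.

```python
from typing import Dict, List

# Hypothetical phrase table: recognized word -> sequence of sign names.
PHRASE_TO_SIGNS: Dict[str, List[str]] = {
    "ohayo": ["GOOD-MORNING"],
}


def recognize(transcript: str) -> List[str]:
    """Stand-in for speech recognition: split transcribed text into lowercase words."""
    return transcript.lower().split()


def translate(words: List[str]) -> List[str]:
    """Map each word to signs; fingerspell (assumed fallback) anything unknown."""
    signs: List[str] = []
    for word in words:
        if word in PHRASE_TO_SIGNS:
            signs.extend(PHRASE_TO_SIGNS[word])
        else:
            signs.extend(f"FS:{ch}" for ch in word)
    return signs


def render(signs: List[str]) -> List[str]:
    """Stand-in for the hand (robotic or drawn): emit one display command per sign."""
    return [f"show {sign}" for sign in signs]


def speech_to_sign(transcript: str) -> List[str]:
    """Chain the three steps: recognize, translate, render."""
    return render(translate(recognize(transcript)))
```

Calling `speech_to_sign("Ohayo")` goes through the phrase table, while an unknown word such as "hi" falls back to one fingerspelled command per letter. Whether the last step drives motors or pixels makes no difference to the first two.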
:(
Why not just type it out to the screen?
Send email from the afterlife! Write your e-will at Dead Man's Switch.
Why on earth use something as complex as a robot? What's wrong with using ultra-cheap computer graphics instead? Surely the effect must be identical for the viewer. Anything with that many motors has got to be expensive and unreliable.
www.sjbaker.org
The grammar is, of course, different from English, but many children learn multiple languages growing up. So long as you're exposed to fluent speakers and forced to use the languages, anyone can pick up a language. The only reason very little children are seen to pick up languages easily is a) we excuse a lot of grammar and spelling on their part due to their age and b) lacking a language to fall back on, they have to learn quickly. There's a small segment of tonal languages which are easier to pick up as a child, but that has more to do with perfect (or near-perfect) pitch getting fixed at an early age.
Sign languages tend to be more than just words. The positioning and motion of a sign conveys location and tense of nouns and verbs. It would be like speaking English without being able to conjugate any of the nouns or verbs.
They could, perhaps, dynamically generate pictures of the signs to convey more information, but that means you have to get more information out of the original language (often the much trickier part) and even then, you still have an issue of the signing only being visible from one direction and a fixed distance (which I suspect would be even worse for a field unit which would probably have an LCD screen, which tend to be pretty unviewable from any direction but straight on.)
Don't you guys ever consider the fact that some of these breakthroughs are not built for commercial applications?
Instead of trying to analyze these achievements in the rather constricted mould of "Why not 3D graphics" or "Why not text on a screen", consider the use of this technology in the future - when say, the robots to help disabled people finally get off the assembly lines. By then, this process would've been refined to the point of being able to do an excellent job in communications.
As a researcher in the field of robotics, I find that a lot of the work which I do, or which goes on around me, has definite implications - if not now, at least in the next decade or so. And don't we owe it to ourselves to look at developments such as this just for the *sake of the development itself*?
If Bill Gates had a dime for every time a Windows box crashed...oh, wait a minute - he already does.
the positioning and motion of a sign conveys location and tense of nouns and verbs
You're just being silly. All this robot does is take words and map them to gestures. It doesn't convey all this crap you're imagining. And if you're going to do signs which require relation to body parts - as many do - you're going to need a big f'ing robot body to make it visible to lots of people - and you're back to viewing from one direction.
In the very worst case scenario, they could have a 3d representation of a hand on the screen (or on a giant screen, or on ten screens - either way still many times cheaper than a robot hand with this kind of articulation). It could then carry out exactly the same motions as the robot hand - only way, way cheaper, easier and more easily replicable. No maintenance, easier viewing, everything.
Again, this thing is cool - but 100% not practical.
The device in the Congo movie used a sign language => speech converter. The Japanese story is about a speech => sign language converter.
If we consider the gestures as a series of movements produced by predetermined actuators (junctions), they can be quantized and stored in a vector; it's just numerical input, and could be classified as a different kind of speech.
Training a gesture reader would be equivalent to searching inside a soundwave database (find the closest match, reject if there's any significant difference).
However, speech is more than soundwaves; it has to be interpreted through various phases, i.e. recognizing the phonemes, generating a soundex, then searching against predetermined words to find a match, AND determining the exact word based on the previous textual context.
In other words, the Japanese speech recognition (whether it has a robotic arm or not) is technologically superior to the (fictitious?) gesture reader in the Congo movie.
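For what it's worth, the "find the closest match, reject if there's any significant difference" idea for a gesture reader can be sketched in a few lines. This is only an illustration under assumptions: the templates, the five-joint vectors, the Euclidean distance, and the rejection threshold are all made up here, not taken from the article or the movie.

```python
import math
from typing import Dict, List, Optional

# Hypothetical templates: one joint-angle vector (degrees) per known gesture.
TEMPLATES: Dict[str, List[float]] = {
    "a": [0.0, 90.0, 90.0, 90.0, 90.0],
    "b": [45.0, 0.0, 0.0, 0.0, 0.0],
    "c": [30.0, 45.0, 45.0, 45.0, 45.0],
}


def distance(u: List[float], v: List[float]) -> float:
    """Euclidean distance between two equal-length joint-angle vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify(
    sample: List[float],
    templates: Dict[str, List[float]] = TEMPLATES,
    max_distance: float = 20.0,  # assumed rejection threshold
) -> Optional[str]:
    """Return the label of the closest template, or None if nothing is close enough."""
    best_label: Optional[str] = None
    best_dist = float("inf")
    for label, template in templates.items():
        d = distance(sample, template)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label if best_dist <= max_distance else None
```

A slightly noisy reading like `[2, 88, 91, 90, 89]` lands on "a", while a hand shape far from every template is rejected with `None`. As other posters note, real signing also depends on movement, placement, and facial expression, which a static vector like this does not capture.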
From my observation, much of the "color" or entertainment value in signed conversations comes not from the movement of the hands but the expressions on the face etc. combined with the movements. Clearly, this robot is still nowhere near being capable of the same range of expression as a human being. As a simple test, I'd like to see the robot tell a joke and get the same laughs as a proficient human signer telling the same joke...
I've abandoned my search for truth; now I'm just looking for some useful delusions.
oh come on, in romaji good morning is spelled "Ohaiyo" NOT "ohayo".. :/
plato they say could stick away half a crate of whisky every day.....
Japanese has only about 2,000 "daily use" kanji, those required to read the newspaper, or graduate from high school. Calling it an "alphabet" as one post did is rather inaccurate, as "alphabet" denotes characters that make up parts of sounds, while Japanese is based on syllables (only 40; or 80 if you include minor variations).
While in Japan I was able to learn some JSL at the hands of some of the top translators in the country (my failure to learn more was my fault, not theirs). I was surprised to find JSL actually shares many of the same gestures as ASL, with minor fingering differences depending on the characters in the word. Other signs in Japanese are actually based on their kanji representations.
Hmm...isn't the middle finger signifying "Fuck You" pretty much universal?
I know it's a joke you're making, but actually, I don't believe it is universal although it's rapidly spreading coupled with English. I want to say that most cultures have an "up yours" gesture of some sort involving a hand punching up with some kind of finger gesture, but that's probably my ethnocentrism speaking.
If this robot gives the one-finger salute when cussed at, I'll be getting one.
"Banking establishments are more dangerous than standing armies." -Thomas Jefferson
Original Article of Yomiuri Shinbun(Japanese)
:-)
(Babelfish translation)
It is a very large hand.
No way in hell a computer can actually interpret in sign language within the next 10 years - at least. I appreciate the difficulty in doing what they have done, but there are intense subtleties to sign that will never be attainable by a machine without a human-like form. Puffing the cheeks, eyebrows, and other expressions are semantic modifiers that can change "I drove" into "I was driving forever!" or "I had to drive those kids ALL day!". ... I've always been confounded by trying to translate the intent of the statement. It was easy to interpret in general use, but always gave me a sort of stumble when I tried to interpret it accurately. I think my best was "I (honorific) proof because I (hon.) think"
I remember once telling a story, having given place names in space and subjects at the start, that used no further nouns or "pronouns" for the rest of the story. Imagine writing a paragraph with no further nouns/pronouns after the first sentence.
Heck.. "I think therefore I am" doesn't really translate to ASL, as there are no forms of be. Typically, such a quote would be done in SEE, using English terms... but literally it translates to ("I" is implied) "Think then live" or "Here because think"
meh
I may be breaking a cliché about Japanese photography but ...
where are the pictures, hundreds of photos and the videos?
Any luck, anyone? I want proof!
Sorry, now I have to nitpick. People who use facial expressions incorrectly *are* unintelligible, depending on what function the expression has, because facial expressions function in roles other than paralinguistic.
...
For instance:
- negation: in ASL the manual marker for "NOT" is optional. The associated facial expression is mandatory. If the manual sign is omitted, the facial expression must extend over the clause being negated.
- temporal: eye aperture can indicate whether an action took place in the past. There is at least one documented instance where an interpreter misrepresented what a client said in court, because the interpreter was not aware of this usage.
- adverbial: a facial expression modifies the meaning of the verb
- topic marker: essential to denote the subject of the sentence if the word order is any other than SVO.
And so on