Talk ... Without Speaking
mjm7 writes "Finally, we might be able to get rid of all those annoying people yelling over the static on their cell phones! CNN has an article about a new technology that senses muscle movements in your face and then translates them into sound. This way all you have to do is mouth words into the phone...not actually speak!" Somehow I suspect that we'd lose a lot of the subtleties of communication, but it sure would be nice every time hemos calls me from the discotheque.
"Slashdot is about legos and staplers." -Cmdr. Taco
We are right to be skeptical of outrageous claims like "my cell phone gave me cancer" and I applaud the many geeks who, in this story and others, have stood up to suspected pseudo-science and brought to bear a modicum of scientific knowledge.
However, there are significant reasons to believe the claim is true in this case. For instance, consider electric fields. You may not be aware of this or have thought of it this way, but a microwave oven is basically just a big, unmodulated radio station broadcasting in the microwave band instead of the radio band. And what do we use microwave ovens for? Cooking things.
And microwaves, like all electromagnetic radiation, are caused by what? Electric fields. And a major source of electric fields and broadcast power is what? Cell phones. And we put cell phones where? Next to our genitals and next to our brains[1].
So, while I love my personal computer, SUV, air-conditioning and other marvels of modern life I Just Say No to cancer-causing cell phones.
[1] For me this is two separate locations, YMMV.
The Anderson partner called his secretary on his cell phone and said:
Ship the Enron documents to the Feds
But she heard:
Rip the Enron documents to shreds
It turns out that this was all just a case of bad cellular...
Life is the leading cause of death in America.
We'll finally be able to understand what the hell mimes are doing! Rejoice!
"I only speak the truth"
Karma: null(Mostly affected by an unassigned variable)
Not good news for those that like to mutter curses to the morons on the other end of the phone.
Finally, now I can finish my swat team leader costume.
Imagine what a world of difference this would make to the mute or to people who had lost the use of their voice due to throat cancer. It seems weird they didn't mention the applications this would have for people who have lost or have never had the use of their voice.
yes i run a goth/punk/emo porn site.
Words like this may cause some minor misunderstandings.
Lord, bless my users that they may stop being such fucking idiots!!
I'd also have to say this should be made mandatory for all people that would otherwise force me to listen to their loud cell phone conversations.
With keyboards we successfully took away people's need to physically write something... with this we won't need people to verbally speak... next it'll be visual impulses shot right into your head so you really don't need your eyes anymore... sheesh...
This might help voice recognition catch on as a means of PC input, too. I'd feel slightly less stupid sitting in my office mouthing words at my computer than I would actually talking to it.
Please donate your spare CPU cycles to help fight cancer and other diseases
Iaamoac
As a professional mime for fifteen years, I can finally use a cell phone like everyone else!
Ok, so not really. But it's kinda fun to think about . . .
Fnord.
...think of the possibilities here. I bet you've always wondered what those people sitting around on corners and talking to themselves are saying. Now we have the technology to find out.
...
...
... and this is A GOOD THING (tm) ?
-- We live in a world where lemonade is artificial and soap has real lemon.
I'd love to use this while playing paintball. The enemy wouldn't be able to overhear our communications.
Yeah, this sounds like just the thing for people who want voice dictation, but work in a "noisy" environment. :-)
Alternatively, you could even have a microphone attached so that when you actually did speak, it would automatically disable the recognition - no more accidentally transcribing your half of a phone conversation for example. Wait a minute, I have to patent that idea!
You either believe in rational thought or you don't
What kind of a sound would it make if I held my middle finger up to it?
I mean really, if the static is so bad that you can't get a good enough signal to hear the person, how is the "face recognition" signal going to get transmitted?
"History doesn't repeat itself, but it does rhyme." Mark Twain
Think about it: don't most people move the muscles in their mouths slightly differently when they are mouthing words, as opposed to actually speaking them? I would venture that the technology wouldn't be able to discern the subtleties in the way we speak.
Other than that, it sounds like an interesting technology.
Attention all planets of the Solar Federation! We have assumed control! - Neil Peart
It also seems to me that sounds are not necessarily made due to the movement of the jaw. I'd imagine that non-vowel sounds emanate from the vocal cords and tongue. And, what could this end up looking like? Think Nintendo Power Glove...
sarchasm: The gulf between the author of sarcastic wit and the person who doesn't get it.
you are on your way home from the dentist? You know, big cheeks, numbed jaw, cotton stuffed in there...
Your end of the conversation would sound like: "argummm blmmmaa goooo"
This would be/will be great. Now we just need cellphones that come in vibrate-only mode and I can finally have a peaceful meal in a restaurant without some moron ten tables away disturbing the whole restaurant with an incoming call (and the subsequent conversation).
Question: if this can eventually recognize what sounds the person is meaning to make with 100% accuracy, does that mean that voice recognition has arrived? Instead of spitting out an audio signal, it could output text instead. THAT would be AWESOME.
I'm so excited. :)
Mr. Ska
"and scream without raising your voice."
-jc
these days? A "discotheque". Back in my day, we called it what it was, a good ol' sawdust-on-the-floor whorehouse, AND WE LIKED IT. We had to walk uphill both ways in the snow AND WE LIKED IT.
Talking on my phone
I twitch, about to sneeze hard.
Phone thinks I said "F*CK."
Sure, we might lose some subtleties of conversation..
But reality would gain some..
Think about all the people walking around moving their mouths.. and no sound..
Eerie..
I really need a system to capture all my meaningless muttering.
Hmm. like 'Sir or Madam,
I am writing regarding your super-glue substitute and, oh shit what's this in my hair? I need a shower. I love that your product ow that coffee's hot! will not stick my finger to my eyelid.
Sincerely, will you turn down that fucking polka music,
teasea
I can hear it now...
"Domo Arigato, Mister Roboto"
Neat idea...(I didn't anyway) it looks like all they can detect right now are vowels.
I wonder how they will work out the consonant issues. The way an S is produced is pretty similar to a Z. At least they are pretty similar in my mouth anyway.
I suspect everyone produces consonants in a slightly different manner. I mean, when you are learning to speak, you don't stick your hand in someone else's mouth to figure out what their tongue is doing... You just maneuver your own until you make a similar sound.
So there are probably several different tongue configurations that work to produce a sound. Not to mention the shape of one's mouth may require a specific and unique tongue configuration to produce a particular sound as compared to someone else.
Sounds (hehe) like they have their work cut out for them in this area.
--Scott
Now doing a raspberry would put this on your IRC ;P
Kinda takes all the fun out of yelling, wouldn't it?
make Linux, not Microsoft. sin(beast) = -0.809016994374947424102293417182819
it sure would be nice every time hemos calls me from the discotheque
:)
Still, this is just a one-way solution. You will be able to hear the person talking in the crowd, but how will the person on the other end be able to hear anything? Will the phone be able to display the message in the form of text or something similar? Or will it just make funny faces at you?
does anyone remember the "my teacher is an alien!!" series? plot synopsis: 4th grader finds out teacher is an alien (surprise, surprise), teacher/alien sees him seeing him, and keeping galactic security safe, takes him up into the New Jersey (mega-big spaceship), and they cruise about, saving the universe.
anywho, i read (and probably own) the whole series in probably 4th grade, i'm 18 1/2 now. on one of their missions, they had special devices like this; except it attached to your throat muscles, which is probably a whole lot easier and less conspicuous. the funny part was that they had to whisper, otherwise they'd "yell" right into the other people's earsets. good to know this stuff is coming to fruition
my teacher is an alien on amazon.com
the interesting thing about the series, is that it explains in amazingly simple terminology, using a large noodle, how hyperspace works. i'd explain more, but i don't want to get modded offtopic TOO much. and i have to go to work.
moox. for a new generation.
Speaker for the Dead and Xenocide, had Ender and later Miro subvocalizing to Jane, the sentient entity that "lived" in the network of ansibles. It might continue past that, I have only read up to Xenocide.
"Rotate the pod please, Hal..."
... I could see your lips moving ...
Dave
-Ev
I stole this sig from someone cleverer than me.
Actually, bibliothèque is the French word for library. Discothèque is the French word for discotheque, or "place where the records (discs) are."
This way all you have to do is mouth words into the phone...not actually speak!" ...it sure would be nice every time hemos calls me from the discotheque.
Maybe you could use the mouth movement to determine which sounds should be transmitted. That way you could actually hear the other person but filter out the extra sounds.
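The gating idea in this comment can be sketched roughly as follows. This is only a toy illustration: `gate_audio`, the `mouth_activity` readings, and the 0.5 threshold are all hypothetical, and nothing in the article says the actual sensor works this way.

```python
def gate_audio(samples, mouth_activity, threshold=0.5):
    """Keep an audio sample only while the (hypothetical) mouth sensor is active.

    samples        -- audio sample values, one per time step
    mouth_activity -- sensor readings in [0, 1], one per time step (assumed)
    threshold      -- activity level above which the speaker counts as talking
    """
    if len(samples) != len(mouth_activity):
        raise ValueError("samples and mouth_activity must have the same length")
    # Anything picked up while the mouth is still is treated as background noise.
    return [s if a >= threshold else 0.0
            for s, a in zip(samples, mouth_activity)]


# The first and last samples arrive while the mouth is not moving.
print(gate_audio([0.2, -0.4, 0.9, 0.1], [0.0, 0.8, 0.9, 0.1]))
```

A hard on/off gate like this would clip the start and end of words; a real system would presumably fade in and out instead.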
The French word for 'library' is 'bibliothèque'
Discotheque means the same thing in both French and English - a place where records are played and people dance.
-Vercingetorix
"Necessitas non habet legem." -St. Augustine
Part of what makes phone conversations work is implied security through voice recognition.
If someone with one of these gizmos calls me, and I hear Mr. Roboto on the line, then there really is no way for me to know who is on the other end.
Bottom line, if you sound like a robot from Berzerk then I'm hangin' up. Intruder Alert indeed.
What happens if, while walking down the street, you see a hot teenage girl walking the other direction and your jaw drops? How would it vocalize that?
Is that this would be great for people who for one reason or another no longer have voiceboxes.
I had a great-aunt who lost a decent portion of her lungs to cancer and cigarettes, and up until her death a few years ago she had to use one of those darth-vader vibration-amplifier things like the "Ned" character does on south park. I was terrified of her when i was six.. (Give me a break, i was six years old and stupid.)
Anyway, i can imagine that technology like this would be just about perfect for people disabled in a similar manner through tobacco, cigarettes or who knows what. No? At least it would keep such people from having to deal with their idiot six-year-old nephews' reactions to the harsh sounds of the vibration amplifier box..
and really, even beyond that, tech like this would be just about the only option for people who are going through whatever that intensive vocal-node-therapy thing is where you're banned from speaking for six months. and i know a number of theatrical singers who would be intensely happy to have one of these so that they could rest their voices between performances without cutting themselves off from the world...
I hope that once this is complete, they'll sell a unit where the voice-synth thing outputs into speakers rather than a phone.. I'm sure they would have looked into this possibility by now, right?
(P.S.: While we're on the subject, sort of.. just in case anyone reading knows: This came up as an argument the other night when we were watching the Oscars and examining how much pain Enya appeared to be in from having to exert her voice. What's the difference between a vocal node and a vocal nodule?)
Irritable, left-wing and possibly humorous bumper stickers and t-shirts
What does this mean for Big Mouths like us? Instead of "can you speak a little softer" it'd be "can you shut your mouth"... oh wait, nothing new.
to those with Tourette Syndrome.
Has Hemos discovered time travel and not shared it with us?
No sig for you!!
...in a 70s leisure suit... a la "Saturday Night Fever"...
It's rather a scary vision... sorry for sharing it with you.
BlackNova Traders
Was anybody else immediately reminded of the old Simon and Garfunkel tune "Sounds of Silence", in particular the line about "people talking without speaking"? (The link is a poor transcription.)
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
Will pointing the phone in someone else's direction enable you to eavesdrop on their conversation?
How about all those times you get a phone call and you realize you don't want to talk to them, and as they drone and drone and drone you mouth to anyone around you "SHUT THE F-CK UP!!!" Now they will hear that.
RonB
It is human nature to take shortcuts in thinking.
While the technology may work well for some languages, like Japanese, which only has 5 vowel sounds, I wonder if they've done any testing with speakers of other languages that have a larger number of sounds.
At any rate, I agree it really sounds like something from a William Gibson novel.
This technology, assuming it works, might initially fail to gain popularity if it's not priced right. I doubt many people would pay, say, $100 extra for a phone with this feature. And that's because many people simply don't care if they're irritating the people around them.
But I'd love to see such places as restaurants and bus lines require their customers, who insist on using cellphones on their premises, to use this product. I bet the bulk of customers would support such a rule, and everyone would benefit.
I'm generally "Interesting," "Insightful," and even "Funny" here. What the hell happens to me at parties?
Just like there is now a vibrate feature so the phone can ring without annoying everyone in a room, now there is another feature to try to stop those jerks who answer their telephone and talk into it in an inappropriate situation. When this technology is released I think we should have the legal right to smack anyone whose cell phone goes off in an inappropriate situation (because they should have it on vibrate) and then kick the person who starts talking into the cell phone where we can hear them. I don't know about anyone else, but what is more annoying than having a cell phone go off in an inappropriate place is when they start talking loudly without leaving the room.
This technology may not stop this from happening, but it would give us a reason to force them to stop. The answer "oh, this is an important call" will become complete BS.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
As long as the cell phone makes real noise, rather than inserting a probe into your ear canal and manually manipulating your eardrum so that you hear the conversation without sound...
"It's Stephen Hawking on the line again for you..."
husband: "honey, I am working late again tonight." (whilst making suggestive eye contact with his secretary)
wife: "ok.... and you want to have phone sex?!?"
Just what everyone wants: to sound like Ned from South Park.
Democrats and Republicans only disagree about how to enslave you
(It's just a JOKE! I know I'm not the first to think of it.)
NetInfo connection failed for server 127.0.0.1/local
When I mouth a word into my phone it usually means I DON'T want the other person to hear it. I'm not sure what the learning curve would be on a device like this but chances are that until a person hits it they are going to have a lot of explaining to do!!
I stole this Sig
a new technology to enslave us.
From the article: Engineers are developing a sensor, which detects signals coming from the muscle movements in the cheek and jaw made when people are speaking. Signals from the sensor are interpreted and the sound being made by the speaker can be determined, but because the system measures such impulses, the user needs to just mouth the words and no actual sound has to be made. "This technology is still at a basic level," says Mariko Wada, a spokesperson for NTT DoCoMo in Tokyo. During experiments, engineers have been able to get the system to discern vowel sounds with 100 percent accuracy -- a world first according to Wada.
Now, is it just me, or are people jumping to conclusions? There is _no_ data concerning how accurate the measurements are ... there is no way to "tune" the system ... teaching a speech recognition system is hard enough although we can _hear_ the sounds ... can you imagine trying to debug this system? How many people here can lip read anyway? How can you expect to debug it then?
For me, plain typing is the way to go.
I'm just gonna work on vim and try the dvorak keyboard.
Later,
LLNL has been researching micropower impulse radar to 'image' the vocal cords, mainly for speech recognition. The main site seems down, but you can get to it with google cache. Also check out ucdavis
HIV Crosses Species Barrier... into Muppets
Problem is most speech synthesis still sounds robotic. Imagine if she sounded like WOPR from war games.
MCC would you like to play a game?
Now that would scare me as an adult even!
Fool's gold. Especially considering this post was copied verbatim from a joke email that was going around not long ago.
I have a woman and money. Life is good.
will come if they can get it to be incredibly sensitive. Sensitive enough to pick up on the slightest facial twitch.
I always envied Paul's ability (DUNE) to read people's expressions, even when they were subconscious. Basically an ability to 'read minds'
Put that in a box and sell it. Oooh-- put it in court rooms.
"I did not have sex with that girl!"
FacialRecognitionAmplifier2000- "BEEEEP!"
Judge- "Are you sure you want to stick w/that lie?"
Very interesting possibilities down the road.
.
It's hard to believe that's how Micronians are made. Why don't we see it right now by having you both kiss one another?
Couldn't someone use the movements in addition to the sound to filter out the actual speaker's voice from the background noise? This seems almost like a nonlinear Kalman filter application (though I am by no means an expert on such things), if you had a (presumably nonlinear) model for speech as a function of the movements of the mouth. The article didn't give too much detail. Oh well, it sounds interesting in the very least.
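The filtering idea in this comment can be illustrated with a much simpler stand-in than a nonlinear Kalman filter: a variance-weighted blend of two estimates, which is the measurement-update step of a scalar Kalman filter. Everything here is an assumption made for illustration, including the function names, the variances, and the premise that mouth movement can predict a loudness envelope at all; the article describes none of this.

```python
def fuse(predicted, measured, pred_var, meas_var):
    """Blend two estimates, trusting whichever has the smaller variance more."""
    if pred_var < 0 or meas_var < 0 or pred_var + meas_var == 0:
        raise ValueError("variances must be non-negative and not both zero")
    gain = pred_var / (pred_var + meas_var)  # Kalman gain for a scalar state
    return predicted + gain * (measured - predicted)


def clean_envelope(from_movement, from_microphone, pred_var=1.0, meas_var=3.0):
    """Fuse a movement-based loudness guess with the noisy microphone reading.

    from_movement   -- loudness predicted from mouth movement (hypothetical model)
    from_microphone -- loudness measured by the microphone, noise included
    A large meas_var says the microphone is noisy, so the result leans on
    the movement-based prediction.
    """
    return [fuse(p, z, pred_var, meas_var)
            for p, z in zip(from_movement, from_microphone)]


# The mouth is still at the second time step, so the loud microphone reading
# there is pulled toward zero.
print(clean_envelope([1.0, 0.0], [1.2, 0.8]))
```

A full Kalman filter would also carry its uncertainty forward from one time step to the next; this sketch treats each step independently.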
<OT>
Looks like there's going to be an Ender's Game movie!
</OT>
Fight for your right to read books!
The last thing we need is technology that allows our wives to be able to figure out what we are muttering at them under our breath...
Oh, I forgot. Most geeks aren't married...in fact, most probably have no clue how to perform even the most basic interactions with one (except of course when their mothers call down the basement stairs to them that dinner is ready)!
You're using her as bait, Master!
nt
I don't want knowledge. I want certainty. - Law, David Bowie
A system like this would either need to incorporate some amount of voice recognition, or use a vibration sensing mechanism.
You know what?
The static is probably from crossing the spacetime barrier to the 1970's. That begs the question: Is it better to dress up in your disco clothes before entering the time machine, or after? To do so first, you might risk disturbing your present day friends. To do so after, you risk looking like a square to all your disco buddies.
we all heard ex-president bush say no new taxes. with this we would have known what he really said..
cuz we know you guys don't ever leave...
Of being able to mute some of my coworkers when they think they are gods. Problem is that this won't disable their vocal cords... On the other hand, I wouldn't be able to mouth "Fuck ya you loser!" at them anymore when they turn around. I imagine them hearing the little feminine robotic voice saying "Your buddy "Dan" just said: Fuck ya loser! To respond to this message press 1 now. To forward it to someone else press 2..."
I'd rather be sailing...
used it too! They were considered a novelty in Earth that were only used by those who were willing to expend the energy to become proficient at it. One of the characters who was a world class biologist used it to communicate more effectively with her computer.
Nice, I'll be able to tell what you are whispering to your friend from a spy satellite. No more secrets.
Personally, I think static is a good thing... hello, hello, I think I'm losing you...I can't hear anything you're saying...
IANAL, but I've seen actors play them on TV
What concerns me are celebrity magazines. Television with 500 channels. Some guy's name on my underwear. Rogaine. Viagra. - Tyler
Somehow I suspect that we'd lose a lot of the subtleties of communication, but it sure would be nice every time hemos calls me from the discotheque
Regardless of the technology that this phone would include, Hemos is still fscked if he calls Taco for dating tips.
Where does the school board find them and why do they keep sending them to ME?
That also means you cannot have a private talk. Once again our lives are under attack by technology.
It's a narrow mind that can only spell a word one way!
Special concrete to prevent EM from CRTs from being detected by the enemy.
Walls to keep the enemy from seeing your CRT screen reflected in your face.
Enclosures to keep the enemy from seeing your twitching facial muscles as you talk.
The only place I know that's decent to get a beer is the New Holland Brewery....Then you're looking at GR or God Forbid Saugatuck.
Pretty nice tech for interrogating people...
-- Apparently, some people are calling me 'Maurice' merely because I said something about the pompitus of love.
'Vacuum' ~ 'F**k you'
I'd hate to get those mixed up. But really, how much do people really talk about vacuums?
Am I the only one who says nice things aloud into the phone while muttering "fscking azzhool" under my breath? How refreshingly honest our communications will become!
"with their freedom lost all virtue lose" - Milton
Typing without moving your hands?
Just a pair of gloves with biometric sensors, and you just motion the keypresses with your hands.....
...but it sure would be nice every time hemos calls me from the discotheque.
I don't know what's more scary -- a new cellular electrode attachment or Hemos heating up (literally) the floor under a giant mirrored ball.
"I'll just chip in a bit for RedHat: I actually have that installed on my university machine." - Linus, '95
This article fails to mention that you would wind up looking like a moron doing this though. Tell me that seeing somebody walking down the streets of NYC moving his mouth without sound coming out wouldn't strike you as kinda weird (OOoops, that would actually be normal in NYC (I am from NYC so don't be insulted) it will probably be best if we changed the state to Wisconsin)
Seems like kool technology, however we need to read brainwaves (Now that would be a little too kool).
. . . keep trying guys!
Seriously, though, the promised "killer application" for over a decade now has been voice recognition, and we're STILL at a point where the inaccuracy rate leads to it being generally useless in anything other than "ooh, isn't that neat" kinds of demos (for instance it was a laugh to see voice recognition as a hyped feature of Office XP: now tell me how many people on the planet are actually using it? While I applaud them for adding it for the handicapped, for the general public it seems neat, but when you have to babysit every word it dictates you relegate it to the unused feature list).
So we've barely gotten voice recognition down, despite being "just a wee bit more" type of promise for so long now, and someone is claiming that they'll read your lips? Fat chance in hell, is all I can say. Unless we concatenate our language to about 4 words, there isn't a chance.
cell phones have caught up to deaf people.
go get it
By no means, do I claim to be a master of the art of Trolling, but here are my 2 cents.
Exploring the troll within one's self can be a confusing experience. First, one must recognize the term troll often is used to describe several different styles of postings. It can be used in a general sense to describe crapflooding posts, flame-baiting posts, page-tweaking posts (PLP's and PWP's), and true trolls. After one makes the distinction that a more specific definition of "troll" exists, the curious mind must find its meaning. Is a troll simply a post that contradicts popular opinion? Is it a post that gathers many replies? Is it a post that argues a point that you don't really believe? Is it a post that argues for an opinion you don't even understand? Is it a post that is only meant to annoy a large number of people? Or is it a post that is purely intended to encourage real debate? There may be many other possibilities that I haven't considered yet, but these are the ones I can quantify.
These all appear to be valid motives of the many veteran trolls. So, how do you become proficient at this art? Well the following is a quote I found etched within the inner chambers of a 1000 year old temple in Nepal. I share these publicly for the first time, here on Trollaxor.com.
It seems one must have a well defined perception of their own beliefs, and then be able to dis-prove them all... in one's mind, build an undeniable proof of a fact, and then be able to prove it fiction... understand one's audience, their weaknesses, their strengths, what irritates them, what they believe as holy truth... one must be resourceful with their research on issues they don't completely understand, and be able to project confidence in the knowledge they find...
One must become Tuan-chi Muho. Find your beliefs and remove your mind's limitation to only believing that single thought path, then understand all other possible thought paths... understanding the individual has the ability to choose any of these paths, but chooses not to limit one's thoughts by only exploring a single path. One must be able to take these insights and use them against their foes.
Go forth now... and troll.
Color flashing, thunder crashing, dynamite machines.
Wouldn't everybody start sounding like the vietnam vet on south park??
So, I wonder how the system works with inflection and stressed syllables. Would be a disaster for those domestic husband/wife disputes (not to mention Japanese which is almost *entirely* inflective):
*I* put the dishes away.
I *put* the dishes away.
I put the *dishes* away.
I put the dishes *away*.
Looks like we will still have that Sprint guy hovering around for a while....
"Just because you're a genius doesn't make you a smart guy!" -- Narrator, Powerpuff Girls
All I told her was, "I wanna fig newton!"
"And like that
When I call someone often I like to hear their voice. Not some mechanical or recorded voice. Also I think other people would like the same on the other end.
But at the end of the article they say it could apply to help send text messages which would make it quite useful.
this tech would be wonderful for voice recognition programs. Or imagine using one of those voice command programs in the middle of the night without having to worry about waking your roommate. Or typing out a quick text message without taking your hands off your steering wheel while driving or while reading a paper and then the program sending a text message to someone.
-THIS SPACE FOR RENT!
You smoke poles every day
i think it's called lash, but if memory serves me, there are ways to pickup sound by just wrapping a mic around someone's throat (a thin band) and have it sense them whispering. I think swat uses this.
Photos.
Not new, old technology. They even have a guy who uses a handheld one on South Park!
"Live Free or Die." Don't like it? Then keep out of the USA
Anybody ever pay attention to the sounds that the handlink makes on Quantum Leap? For example, it kind of goes 'waaaaahhhh' when he smacks it. That's the most obvious one, but if you listen a little more carefully, the sounds that little device makes start to emote. You can get an idea what he's reading on the screen before he actually states it.
Tom and Jerry is similar, to a degree. I ran across a cartoon of Tom and Jerry on the web a few days ago and watched it. I noticed something very interesting. The music in the cartoon responded to every little movement that the characters made. You can listen to the music, for example, and tell if Jerry was tiptoeing or running. That was a very interesting dimension to Tom and Jerry. That is the type of element that would allow you to watch a slideshow of the show with the sound track and still keep track of what's going on.
This article was very interesting because I think it may be the start of making computer interfaces take advantage of audio responses that don't even require words. I've spent a great deal of time assigning different sounds in Windows to different events. For example, I have a very distinctive sound that ICQ makes when I receive a message. I even went as far as to provide different people with different sounds. I noticed something very interesting: when I went to use ICQ on another machine, I ached to hear the sounds again. It was so strange not hearing them!
I hope one day Windows (or whatever OS I use in the future...) puts more effort into providing a sound-enhanced interface. That would truly provide a better multi-tasking experience. It'd be cool if, for example, the sound from a window was played through the right or left speaker based on where the window is on the screen. Maybe muffle it if a window is under another one.
Anybody know of any products for Windows that do this today?
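The positional-sound idea in this comment (left or right speaker depending on where the window sits, quieter when the window is covered) can be sketched as below. This is a guess at one way to do it, using standard constant-power panning; `pan_gains`, the `occluded` flag, and the 0.5 muffle factor are made up for illustration and do not correspond to any Windows API.

```python
import math


def pan_gains(window_center_x, screen_width, occluded=False, muffle=0.5):
    """Return (left_gain, right_gain) for a sound coming from a window.

    window_center_x -- horizontal center of the window, in pixels
    screen_width    -- total screen width, in pixels
    occluded        -- True if another window covers this one (assumed input)
    muffle          -- volume multiplier applied to covered windows (assumed)
    """
    if screen_width <= 0:
        raise ValueError("screen_width must be positive")
    # Clamp to the screen so off-screen windows pan fully left or right.
    position = min(max(window_center_x / screen_width, 0.0), 1.0)
    # Constant-power panning: left^2 + right^2 stays at 1 across the screen.
    angle = position * math.pi / 2
    left, right = math.cos(angle), math.sin(angle)
    if occluded:
        left *= muffle
        right *= muffle
    return left, right


# A window at the far left of a 1920-pixel screen plays only in the left speaker.
print(pan_gains(0, 1920))
```

Scaling the volume is the crudest possible "muffle"; a low-pass filter would sound more like something behind a wall, but needs an audio library.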
"Derp de derp."
*ring*
Guy: Hello?
Guy's Girlfriend: Hey honey. What time are you getting home tonight?
Guy: I'm not sure babe. Looks like I might have to stay in and work overtime again.
Guy's Girlfriend: Awwwww.
Secretary: *whispers* who that?
Guy: *whispers* It's my stupid girlfriend. Don't worry, we'll still go out, drop by your place, and make out like animals tonight.
Guy's Girlfriend: WHAT????
my blog
The real trick would be to implant the pickup leads to the cranial nerve that drives the muscle like the cochlear implants. Put a LED/phototransistor pair under the skin somewhere for an optical output/optoisolator.
With both, cops, soldiers, secret service agents and other people that need to work without giving away their position could communicate. Note taking in classrooms or libraries could use the to-text functions and I could get rid of this keyboard and have a much more compact portable computer/pda without a noisy microphone in the way or wires plastered to my face. Speeches and lectures could be recorded for posterity in text format and TV shows could have closed captioning of the actors' actual words. Listen to music without bothering your neighbors or blocking your environmental sounds with earphones. It would be interesting to imagine how language would morph over time.
Move the leads farther up the brain's processing and voilà, the equivalent to schizophrenia. Yep, needs an optoisolator instead of wireless link, and faraday shields on the leads.
Matrix here we come.
I have a minor speech impairment (not very clear) so it would probably be useful to me. :)
Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
This way all you have to do is mouth words into the phone...not actually speak!
My daughter talks without saying anything. Maybe she could get a job testing these things.
Disconnect your television. Do your own research. Draw your own conclusions. They're probably lying. Don't be a sheep.
Researchers later admitted that the technology was developed in response to those idiots who insist on talking on their cell phones *during* the movie. I thought I was going to see "Lord of the Rings"; instead I ended up being treated to "Lord of the Ring Tones". It happens at about every third movie I attend.
Now I can't mouth obscenities about the person I'm talking with without them hearing!!! You also can't hold a "quiet" conversation with the person beside you while "politely" listening to the person on the phone...
Oh well... my boss probably needs to know about what I call him behind his back anyways. q:]
MadCow.
I used to have a sig, but I set it free and it never came back.
I'm somewhat skeptical. How are they determining vowel sounds? Can it see well enough to tell whether I'm making a high back vowel or a low back vowel?
Also, how would you distinguish between voiced and unvoiced pairs, such as /p/ and /b/, or /d/ and /t/? Or perhaps aspiration?
There is more to speaking than just "face muscles."
Ben
Eloi are stupid, throw morlocks at them!
you're one of those people who talks on the phone in restaurants?
now i can say 'olive juice' to my wife and still earn major points. OTOH, everyone in Italy is gonna think that the entire US population is gay!
Another way to help speechless persons to communicate is the recognition and translation of sign language. If you're interested in that you might want to look here.
someone needs to find a way to mix this with porn so it gets popular. eric.
adventure-today.com
I can finally find out if the girl across the room is whispering to her friend about the hot guy across the room, or the creepy guy that is pointing a cell phone at her.
The problem as I see it is that I have no personality of my own.
Progress.
=brian
Does anyone know how this will differentiate between voiced and unvoiced sounds? (eg. 'k' and 'g')
Are they going to call it "subvocalizing" like in Ender's Game?
Travis
Why? What's the big deal? I can understand if somebody talks too loud, but that's true of anybody, not just a phone user. I was at a McDonald's once grabbing a bite, and I called my dad and talked to him for a bit. The woman in front of me got irritated and muttered that I should get off the phone.
I never did find out what sparked that. If I was talking too loud, for example, she could have just touched her mouth in the 'sssh' symbol and been polite about it. I don't think I was talking that loud. Nobody else even looked up at me. I think she just had a conception that people with cell phones are rude. Well my response to that is 'IT'S NONE OF YOUR FUCKING BUSINESS.'
There's 0 difference between me talking on the phone or me talking to somebody in person. If it's okay for me to talk with somebody in person, but not on a phone, then there are some serious social issues that will arise down the road. I bet she'd be tickled to death if any of her kids called her out of the blue just to say hi, but I call my dad (who lives 2000 miles away) and I'm a rude jerk. If it's distracting to her to watch a guy talk to somebody that isn't there, then she can watch Quantum Leap until she gets used to it. I certainly am not turning off my phone for the simple reason that it displeases her.
If anybody is going to discriminate against people with cell phones, make damn sure the reason is unique to the cellphone itself. No phone ringing in a theater: Acceptable. No cell phones in a hospital because they interfere with equipment: Acceptable. No talking on a cell phone in a restaurant: Unacceptable.
"Derp de derp."
Taco, you're such a fag.
One of the big advantages of the keyboard as an input device is that it's quiet - you can have a room full of people typing all at once without distraction. Try that with voice recognition software.
Who knows, with good enough "voice" recognition, maybe this technology could finally replace the keyboard - a totally compact, quiet, easy to use input device.
It's interesting to see you bring up this topic: what is a feature to the majority of people could be nearly a necessity for others. I find it ironic, however, that current mobile phone accessibility for the deaf and hard of hearing is sorely lacking. While many phones are now SMS-enabled, this is not always sufficient. TTY functionality would be needed for relay calls.
It is certainly nice to see this development, but before this is actually implemented I would like to see TTY functionality on some mobile phones first.
The "100% vowel detection" claim sets off alarm bells for me. Sure, pure vowels tend to show up on the face, but there are lots of characteristics of speech which occur down in the throat, or back of the tongue...how do they plan to distinguish between sh, ch, and j? S and Z? F and V? For now, they don't.
I also just don't see how the claim can be accurate. I can say the "ih" phoneme with my jaw in any position, and I can say "a e i o" without moving my cheeks or jaw at all. What gives?
Human lip readers need *context*, and lots of it. This one I'll believe when I can use the demo myself.
As for losing subtleties of communication, I think the real problem is in synthesis. I work on the opposite side of this problem, generating lip movements from audio (i.e. lip synching). A lot of the subtleties you might think you'd lose are actually there in both signals, the audio and the muscles. For example, you can tell when somebody's smiling over the phone, the change in the shape of the mouth makes the phonemes sound different. Shouting invokes different muscles from normal speech, emphasis might be picked up from the eyebrows, and so on. But even if you detect such things on the face, no voice synthesis engine is capable of rendering the accompanying vocal effects.
"...but it sure would be nice every time hemos calls me from the discotheque. "
Like This?
To think this will end up as mass market technology is plain wrong. Can anyone really think that people will stop "speaking" into their phones? And for what, to evade cell phone static? Come on. Usage of this technology will only really take off in very niche markets where there's an actual need, like those whose speech is affected in one form or another. Those are the people who will really benefit from this. The implications there are incredible. Now where's my Crystal Pepsi?
We need this in my office now. All the fscking idiots, yakking on their phones all day, no clue that they're disturbing anyone...
As much as this would be really cool, there are lots of things that would make this go haywire. First and foremost is the ability of the program to distinguish the end of a word. When you listen to someone talk, your brain automatically divides up what it perceives as full words and then reports them to you. But your brain has had many many years of experience with this. A technology like this would be confused by anyone who speaks fast. Even speaking normally it could be confused. Don't believe me? Listen to a person who is fluent in a foreign language speak. To him (or her) the words are all very distinct and separate, but to you, it sounds a lot like a string of sounds and you can distinguish maybe 3 or 4 words.
Second, this is all well and good for words or parts of words that are formed by the movement of the mouth, but try mouthing the words 'this', 'that', 'there' and 'the'. Most of the word is made up by the tongue and its position, and the only difference between 'there' and 'the' is what you do with the inside of your throat at the end of the word.
As much as this would be a great technology to have, we have a hard enough time getting speech recognition programs to work, let alone mouth recognition.
T Money
World Domination with a plastic spoon since 1984
This quote gets thrown around all the time without anybody being cited as the one who coined it, but it was definitely not Larry Flynt, who used it in entirely the wrong context (it's a quote about taxes, not censorship) in 1999.
"Democracy is two wolves and a sheep voting to decide on what to have for dinner," is sometimes credited to Robert Heinlein, although I have had little luck finding the actual source where he said it.
In 1994, James Bovard referred to the quote, by saying: "Democracy must be more than two wolves and a sheep voting on what to have for dinner."
Larry Flynt obviously just overheard it somewhere, or saw it on an e-mail signature, half-remembered it but failed to recall what it was about, and used it to get a laugh in one of his speeches.
Information wants to be anthropomorphized.
Flawless speech recognition in HARDWARE no less.
There is a phenomenon called "Covert Oral Behavior" which is similar to this. Basically, when you think words (but do not speak them), the nerve signals which would normally move your mouth and vocal cords are not completely dampened. They are suppressed, but not completely eliminated. See the work of this research group, based on the research of F. J. McGuigan.
Sounds like it would help someone with laryngitis.
Slow down, cowboy! It has been 4 hours since you last posted. You must wait another few hours.
Doesn't this mean that I won't be able to make out the actual voice of my caller? I mean, it's cool and all, but I don't want to feel like I'm in a cheap horror movie. Or talking to Stephen Hawking.
On an oddly related point, I saw a thing on TV recently about new inventions and some contest run every year to give an award to the most promising new invention. This one wasn't voted most promising, but it was cool:
One guy invented a speech feedback system that picked up your voice in a headset microphone, changed it slightly, then fed it back into earphones you had on. He previously had a serious stuttering problem, but this device allowed him to speak almost perfectly after having used it repeatedly to practice speaking. He demonstrated that the device could also be used to make a person speak authoritatively, cheerfully, fast, slow, and various other ways as well by changing the feedback that they were hearing.
Pretty amazing, how the brain would change the voice to match what it thought was normal almost immediately. People wouldn't have any problem adjusting voice inflection to match the device assuming it provided feedback.
// harborpirate
// Slashbots off the starboard bow!
Inappropriate capitalization of Apparently randomly selected Words.
This next song is very sad. Please clap along. -- Robin Zander
"Read my lips. No new taxes (today)."
Seems to me that the best answer to the problem that many folks have faced is to burn all cell phones. Sounds good to me...
How can you make a phone call if you can't even spea-k.
Maybe speech recognition w/o speaking? Now all you have to worry about is a repetitive strain injury of your facial muscles. ;)
The biggest trick the devil pulled was letting lawyers become politicians so they can write the laws.
That reminds me of those psychic people in...I think it was "Return to the Planet of the Apes". They sort of threw their face at you. That was how they communicated. I'm pissing myself laughing at the thought of people doing that on the street...creepy but funny.
But no one's picking it up, or, there's a fear that no one will 'catch-on' to this technology, or some other lame reason. Or so I'm told.
I disagree. I can sign. I use American Sign Language, and on top of that, I can read _most_ people's lips. The Deaf would have so much LESS of a barrier if this came into widespread adoption.
This technology only benefits hearing people. UNLESS one considers that the deaf MIGHT be willing to have a voice synthesizer perform the auditory acrobatics FOR them. I think you'd have an easier time getting visual phones adopted and much more quickly.
(diversion: although part of me flinches at the comedic potential of this muscle-recognition technology. If both come into effect, how hard a time am I going to have reading someone's lips when their mouths are contorting themselves by HABIT to be understood, if this technology gets widespread adoption FIRST?)
Why am I posting here? I dunno. I don't think it's completely off-topic if one considers minority groups being overlooked (again). I'm sure this technology is focused on English speakers.
spam, spam, spam, spam, e-mail, news and spam.
I don't know Japanese myself, but I'm in the middle of reading The Japanese Language (no link; not carried by Amazon). One of the things that's discussed is how little mouth movement is required in Japanese, in contrast to other languages. So it's somewhat ironic that DoCoMo, a Japanese company, is leading the charge in this field.
Even in non-Japanese languages, guttural sounds like 'g', 'k', and German 'ch' cause very little muscular change--just watch yourself in a mirror some time. The article didn't go into much detail, but it may be infinitely more useful if the sensors paid attention to tongue movements instead of cheek ones.
:wq
1 - the lips and face
2 - the tongue
3 - the larynx
4 - the diaphragm
This approach only has access to part 1 above and will not be able to capture the other elements of sound. Lip reading works only partially. Try watching TV muted without closed captioning to see how many things that sound different will look alike.
The diaphragm is controlled by the phrenic nerve (cervical roots C3 to C5). The diaphragm starts the ball rolling and controls the expulsion of air from your lungs. Sighs and the amplitude of your voice are controlled by controlling the strength of the exhalation.
The larynx is controlled by the laryngeal branch of cranial nerve X, the vagus. Controlled constriction of the larynx provides a vibrating membrane which controls the resonance and dominant frequency output of the voice. Intonation is controlled by changing the frequency. This can change across a sentence, or in Cantonese and many languages even across the length of a single word or syllable.
The tongue is controlled by the hypoglossal nerve (Cranial Nerve XII). The soft palate is controlled by the glossopharyngeal nerve. These two components along with the lips and the cheeks act to change the size and internal shape of the resonance cavity that is your mouth. This filters the sound produced by the lungs and the larynx. Placing your tongue in different places changes a lot of filter characteristics.
Your lips and cheeks are controlled by the facial nerve and by the upper cervical nerves. They help modulate the resonance cavity. The lips also provide an abrupt or gradual start and stop to sounds.
Saying the letter 'B' or 'P' requires that the lips close first, then the lungs start to exhale and pressure builds up in the mouth, then the lips are opened and the larynx resonates as air is quickly allowed to be expelled. You do all of this automatically all of the time. The only parts that are visible externally are the lips and the slight facial nuances associated with this. The rest go out of the brainstem via the cranial nerves and are not accessible. (Well, you could get to the laryngeal component of the vagus surgically, but that is invasive.)
This won't work easily and will not sound as fluid without all of the other characteristics of sound and voice production being taken into account.
Think Nintendo Power Glove...
Nah...They'd probably just put small electric sensors around the center of each muscle that controls the jaw. I suppose these could either be implanted or placed on the surface.
In a way, you're right, though. They could use either a frame (a la Borg), a mask, or adhesive dots.
I think the adhesive dots would work best, since you could make the adhesive conductive to increase sensitivity. Or they could have tiny accelerometers in them just to sense movement.
In any case, the tongue is definitely a problem. As I understand it (though IANAD), the tongue is all one muscle. I suppose you could make the adhesive dots double as ultrasound transceivers, but the phone would have to have awfully quick pattern recognition in order to understand the (coarse) virtual image of the tongue in real time.
In any case, I suspect commercial application of this is a ways away.
I can easily see colored adhesive dots becoming the "in" thing. (Please, nobody say "Flower Power...") Anybody remember back when fake car phone antennas were all the rage? Be wary of the person who has colored dots on her face, but uses a pay phone.
What's this Submit thingy do?
I seem to recall a project like this designed for special forces that was funded by CIA and DARPA (Defense Advanced Research Projects Agency). There was also a "silent sound" device that could transmit acoustic information directly into the skull.
Of course there could be many applications for this type of thing, but one of the applications that the CIA was interested in was subliminal presentation of messages in people's sleep, while the silent transmission of information would obviously be useful to special forces teams that need to communicate without revealing themselves.
Visit Jonesblog and say hello.
-Sam
1. this is for japanese diction, so applications to english could be even further away than u think.
2. digital (cdma, tdma from sprint, att, verizon etc) mobiles use speech prediction algorithms to encode voice b4 sending it over the air, and this comes with a very beneficial side effect: human speech comes out much clearer than u'd think.
this is because the algorithms used (qcelp etc) to compress/encode speech r very good when used to compress human speech, but perform badly at everything else. thus they will "pick out" human speech in a rock concert much better than regular analog phones. think of it as a SNR enhancer. thus there really isnt a reason to shout afterall, with or without this new technology.
lots and lots of prank calls?
I guess I can't eat anything while I'm talking on the phone anymore...
How will this differentiate between voiced and unvoiced consonants? "Pat" and "bat" sound different but the two initial consonants are extremely similar outside of vocalization. Yes, the articulation of the "b" is longer than the "p", but it's really minuscule and probably differs from person to person. I wonder if this will take the tack of making the phone "learn" how to discern such, or will it make the person learn how to "speak" in a way that the phone "understands" (kind of like handwriting recognition versus using Graffiti)...
Although the article talks about getting 100% accuracy in discerning vowel sounds, the Japanese language is pretty simple in its vowels -- a, i, u, e, and o, and that's about it. What about vowel sounds like umlauted vowels that occur in European languages? Heck, what about African languages that incorporate clicks and creaky vowels?
This sounds like promising technology, but the article leaves a lot of questions that need to be answered. I guess five more years of research will help, though.
--
http://www.aikiweb.com - AikiWeb Aikido Information
If this uses some sort of network of electrodes on your face, wouldn't you look strange walking down the street? If enough people started using this, it would become commonplace, but the appearance of the device could seriously be a marketing flaw until then...
Did this remind anyone else about Neuromancer? When Case entered that one girl's mind and could feel what she was saying when she just made the movements with her mouth, wouldn't it be kind of like how the thing is supposed to 'read' your words by your muscle movements?
I might be a little off-topic, I'll admit. But practically speaking, it could be really nice to hold conversations without disturbing the silence of a library... but I'm getting ahead of the article.
There's a 68.71% chance you're right.
David Brin discussed this technology in his book _Earth_ as a substitute for speech recognition.
- Serge Wroclawski
So given the fact that it typically takes about 15-30 years for really cool technology to get to us...how long do you suppose governments have been using this to try to read people's minds? The next time somebody walks up to you and makes an outrageous accusation, then holds up a pen with a lens on it, recite your favorite poem to yourself.
For it is written in the great songs of Simon and Garfunkel:
And in the naked light I saw
Ten thousand people, maybe more.
People talking without speaking,
People hearing without listening,
People writing songs that voices never share
And no one dared
Disturb the sound of silence.
"Oppression and harassment is a small price to pay to live in the land of the free." -- Montgomery Burns.
think of the future possibilities, like some have said already--hearing and visually impaired people will find a new world. then with sufficient programming we could point it at animals' faces and get a synthesised human voice of what they're 'saying'. cusswords and all.
i could have fun "pointing " such a system at cartoon animation movies and listening to the resulting voices. new ways of communication using parts of our bodies not thought of in that sense...
what do it say.
all your base
"I didn't ask you to dance, I said you look fat in those pants"
I can see how this might work with vowels, but I don't know how it is going to distinguish voiced sounds from unvoiced sounds. Sounds such as S and Z, CH and J, and K and G become the same without voicing. For example "I'm joking" sounds like "I'm choking"
I don't know if I fully understand this technology but from a Linguist's point of view this is what came to mind.
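The collapse described above is easy to show in a toy sketch. Everything here is an assumption for illustration: the phoneme spellings are ARPAbet-style symbols typed in by hand, and the `DEVOICE` table and `without_voicing` function are made-up names, not anything from the article or the DoCoMo system.

```python
# Hypothetical table pairing each voiced English consonant with its
# unvoiced partner, using ARPAbet-style symbols.
DEVOICE = {
    "B": "P", "D": "T", "G": "K",
    "Z": "S", "V": "F", "ZH": "SH",
    "JH": "CH", "DH": "TH",
}

def without_voicing(phonemes):
    """Return the phoneme sequence as a sensor that cannot detect
    vocal-cord vibration would see it: voiced consonants fall back
    to their unvoiced partners, everything else is unchanged."""
    return [DEVOICE.get(p, p) for p in phonemes]

# Hand-written transcriptions (assumed, not taken from a dictionary).
joking = ["JH", "OW", "K", "IH", "NG"]
choking = ["CH", "OW", "K", "IH", "NG"]

if __name__ == "__main__":
    print(without_voicing(joking) == without_voicing(choking))
```

Two words that differ only in voicing end up as the same sequence, so a recognizer working from mouth position alone would need context (or a throat sensor) to tell them apart.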
Now we can make Whiskey!
Been wait'n for this for hundreds of years!
-- 'The' Lord and Master Bitman On High, Master Of All
...since the research is being done in Japan.
Japanese has very few diphthongs.
A word that might be spelled 'Ao' using Latin characters would be pronounced as 'Ah-ow' (sort of).
Some words do change the vowels, but usually just by extending them. The word Tokyo isn't pronounced 'toe-key-o' as much as it is 'to-u-key-o-u'. The audible differences can be very slight, though. Possibly by sensing the muscle movements, it would be easier to discern the differences.
Another interesting capability would be the ability to discern mood. Consider the following:
'Yes dear, I'd <rolls_eyes>love</rolls_eyes> to have your mother visit this weekend...'
I'm not sure that I'd want my phone telling my girlfriend when I'm being sarcastic. You could have a new group of 'tags' kind of like those you see on IRC:
roll_eyes
clench_jaw
check_watch
sneer
cringe
shake_head_in_disbelief_at_the_stupidity_of_what_is_being_said
You get the idea...
Cheers,
Jim in Tokyo
-- My Weblog.
TYPING!
It sounds like a very interesting technology. I don't necessarily see it as a tool for cell phone use, but as a tool for the hearing impaired or mute.
For mutes, it would be nice because they can already hear and this device would allow them aural communication again.
For the hearing impaired, it wouldn't be so easy. This technology would be a few years off, but you could have a combination of augmented reality and the lip recognition technology. As the user speaks, there could either be a slight delay or it could be in real time. The device already has text capability, and could display in a heads-up display for the user and out a speaker. A very nice way to restore aural communication for the hearing impaired.
13 year old white supremacists are shitty web designers.
The reason people shout into cell phones isn't that the phones don't pick up sound well enough. They do. It's that people don't *THINK* they pick up sound well enough because the phones don't give you any feedback in your own ear. Normal phones do give feedback and people are used to that. When you hear no feedback, you think "hey this phone must not be picking me up very well".
It may be a neat bit of technology they've come up with, but people won't stop shouting into their phones until they get feedback.
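The feedback in question is usually called sidetone: a little of your own microphone signal is mixed back into your earpiece. A minimal sketch of that mixing, assuming plain lists of float samples; the function name and the 0.1 gain are made up for illustration and are not taken from any real handset.

```python
def earpiece_signal(incoming, microphone, sidetone_gain=0.1):
    """Mix an attenuated copy of the talker's own microphone signal
    (the sidetone) into what they hear from the far end.

    incoming and microphone are equal-length lists of float samples;
    sidetone_gain is an assumed value, with 0.0 meaning no feedback.
    """
    return [far + sidetone_gain * near
            for far, near in zip(incoming, microphone)]

if __name__ == "__main__":
    far_end = [0.0, 0.5, -0.5]
    own_voice = [1.0, 1.0, 1.0]
    print(earpiece_signal(far_end, own_voice))
```

With the gain at zero the talker hears nothing of themselves, which is the situation the comment blames for the shouting.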
I doubt the technology will work for all languages. Take Chinese, for instance: Chinese is a tonal language, so you have inflections upon the word. I believe there are in some cases 9 different words you can say with the same syllable, just inflected differently. I highly doubt the technology could pick up on inflection.
Or take Korean: Hangul characters are actually to a certain extent patterned after the position of your throat and mouth muscles, and a lot of the sounds in Korean come from your throat, not how you move your lips.
In Japanese, you can pretty much talk while maintaining a smile.
I am sure many other languages are the same way.
Yes, the English phonemes 'g' and 'c' are articulated in the same position, both dorso-velar (dorsum of the tongue contacting the velum, the flap of skin behind the palate). They're both also 'stops' (the passage is momentarily completely blocked). But discerning sounds of identical position is actually somewhat less problematic in English than it might be in certain other languages. You hit upon a really important point when you mentioned 'the air' which accompanies 'c' word-initially in English (called 'aspiration'). Khmer, spoken in Cambodia, distinguishes between aspirated and unaspirated stops (e.g., the first 'k' in 'kook' is an aspirated stop, the second is unaspirated, but English speakers don't distinguish between them). How could this system possibly tell the difference? The only difference between the first 'k' and second 'k' in 'kook', as you point out, is the quick expulsion of air which accompanies the first. Even more confusingly, the first 'k' in 'keel' is not even articulated in the same position at all as the first 'k' in 'kook'. 'k' in 'keel' is palatal (further forward), where 'k' in 'kook' is velar (further back). But, for some reason, in English, we consider them the same phoneme (the subjective perception of what constitutes a unique sound in a given language. 'Keel' and 'kook' start with the same English phoneme, because we can't tell the difference). This is just impressing the point that where a phone is articulated is only a tiny piece of the puzzle. Making a system which understands language on the basis of position alone is ludicrous. That's impossible.
As you point out a workable system would have to detect 'voicing' (the vibration of the vocal cords), as voicing, AFAIK, differentiates at least some phonemes in every language on earth.
What about nasalisation (where the nasal passage is opened in pronouncing a vowel)? The only difference between the French words 'main' (hand) and 'mais' (but) is that the first is pronounced with resonance in the nasal cavity. How is this system to divine that one has opened a tiny passage to one's nasal cavity for the duration of the vowel?
Speaking of point of articulation, how about glottals (articulated in the larynx) and pharyngeals (articulated in the pharynx. We have none in English, but they exist in Semitic languages)? Without a camera rammed down the subject's throat, sensing articulation in there is going to be hard.
If we have some way of determining the position of the tongue, vowels will be comparatively easy to distinguish, as they're distinguished by 'rounding' (i.e., of the lips), position of the tongue and nasalisation alone (a caveat: Japanese has a 'voiceless vowel', but it's a total phonetic red-herring, really). And detecting nasalisation still seems a difficulty.
At any rate, the idea of recognising language mechanically would seem to at least necessitate detection of 1) position and character of vibration in the nasal cavity, pharynx and mouth and 2) exact position of the tongue at all times. At any rate, I'll leave the last word on this 'invention' to others:
Turn to the person next to you and silently mouth the words "elephant shoe" ...ask them what they think u said...
that's what everyone does from nightclubs. Well except that the club I go to has a bloody faraday cage around the main room (perhaps this improves the acoustics - there is less high-frequency hiss there than in other clubs).
tourette syndrome
First thing they teach you is that emotions are communicated by phone, so when you smile your correspondent feels it, and the same thing when you are stressed. Where will those emotions go when an artificial voice speaks for me???
Phones that can understand your mouth movements will probably have to translate these movements into some sequence of sounds that correspond to words in the speaker's language. What I mean is that the phone will have to KNOW what language you are speaking in order to be able to translate your mouth movements into sounds meaningful for your language. I made a joke a couple of days ago about phones understanding their owners. I was with my girlfriend and told her that it is amazing what my cell phone has heard from me so far, for example when I talk to her. Anyway, cell phone companies would love to sell these phones since it will mean more upgrade capabilities - do you want your phone to speak English? Russian? German? Japanese? Ha! Another $99.99!
You can't handle the truth.