Microsoft Creates Kinect-Like System Using Laptop Speaker & Microphone
MrSeb writes "Microsoft Research, working with the University of Washington, has developed a Kinect-like system that uses your computer's built-in microphone and speakers to provide object detection and gesture recognition, much in the same way that a submarine uses sonar. Called SoundWave, the new technology uses the Doppler effect to detect any movements and gestures in the proximity of a computer. In the case of SoundWave, your computer's built-in speaker is used to emit ultrasonic (18-22KHz) sound waves, which change frequency depending on where your hand (or body) is in relation to the computer. This change in frequency is measured by your computer's built-in microphone, and then some fairly complex software works out your motion/gesture. The obvious advantage of SoundWave over a product like Kinect is that it uses existing, commodity hardware; it could effectively equip every modern laptop with a gesture-sensing interface. The Microsoft Research team is reporting a 90-100% accuracy rate for SoundWave, even in noisy environments."
It sounds interesting, as long as there is no background noise, you are alone in the room with the system and the system itself isn't generating any noises (fans? DVD access? music or sound effects?).
How is this Ultrasonic? Humans can hear up to 20KHz. So only the upper end of this is going to be above human hearing. Neat idea but I don't think I could tolerate the high pitch whine all day. Sounds like MS needs to hire some younger blood.
In my youth I could hear 18kHz. So is this only for older / deaf users?
I don't have one, but I thought the Kinect did 2D very accurately plus a crude 3rd D based on image size, so let's call it 2.5 D
I don't see how one mic and two speakers does more than 1 D of data. Then again I haven't read the article, maybe they place the whole laptop on an oscillating fan or something as a gimmick. Or is it really using the built in cam and the ultrasound is the gimmick that doesn't really do anything?
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
Can they patent it? This seems to be pretty much what bats have been doing for centuries
There was some earlier research along these lines; this seems to be a much more precise version (and btw, why aren't they also using the built-in camera, which is very common in today's laptops?)
http://hardware.slashdot.org/story/09/10/15/2121214/sonar-software-detects-laptop-user-presence
http://empathicsystems.org/
I wonder how accurate it is if two people are using it at the same time in the same area, e.g. me and my next-seat neighbor on an airliner...
http://www.extremetech.com/computing/128735-microsoft-creates-kinect-like-system-using-your-laptops-built-in-speaker-microphone
Microsoft creates Kinect-like system using your laptop’s built-in speaker & microphone
By Sebastian Anthony on May 7, 2012 at 9:02 am
SoundWave: Sound-based motion detection from Microsoft Research
Not one to be outdone by Disney’s any-surface touch interface, Microsoft Research, working with the University of Washington, has developed a Kinect-like system that uses your computer’s built-in microphone and speakers to provide object detection and gesture recognition, much in the same way that a submarine uses sonar.
Called SoundWave, the new technology uses the Doppler effect to detect any movements and gestures in the proximity of a computer. The Doppler effect, if you remember high school physics, is where the frequency of a sound alters depending on your distance from it — the Doppler effect describes the change of a police siren’s pitch as it comes towards you and then recedes into the distance. In the case of SoundWave, your computer’s built-in speaker is used to emit ultrasonic (18-22KHz) sound waves, which change frequency depending on where your hand (or body) is in relation to the computer. This change in frequency is measured by your computer’s built-in microphone, and then some fairly complex software works out your motion/gesture.
Now, the obvious advantage of SoundWave over a product like Kinect is that it uses existing, commodity hardware; it could effectively equip every modern laptop with a gesture-sensing interface. The flip side, though, is that SoundWave, with a single sound source and microphone, isn’t going to allow for the same kind of accurate, 3D sensing that Kinect, Sony Move, or Wii Motion can provide with cameras and stereo IR sensors.
Microsoft SoundWave, measuring the Doppler effect of a moving hand
Watching the SoundWave video though (embedded below), I am surprised at what has already been achieved with a very simple hardware setup. The most obvious example is a laptop that automatically locks when you move away from it, and unlocks when you return, but it seems that the software is already advanced enough to detect up/down and left/right swipes of the hand. The system’s accuracy, according to the research paper, is between 90 and 100%, even in noisy environments. In one example, some fairly complex hand gestures are used to control the rotation and descending of Tetris blocks. If you added another ultrasonic sound source, and a few more microphones (many laptops already have microphone arrays anyway), SoundWave could probably replicate Kinect very well.
The video also makes clear, however, that waving your hands around — when the keyboard is right there — is a little bit foolish. Still, SoundWave is a freebie — it doesn’t interfere with any other sounds played by the computer (you can listen to music while SoundWave is active), and there’s no reason why laptops shouldn’t come with SoundWave preinstalled. I doubt it will ever reach the accuracy or resolution of camera-based solutions, though, and in all likelihood it won’t be long until we see laptops and smartphones with Kinect built in, anyway. Still, who knows — maybe SoundWave could provide a cheaper option for developing countries, or perhaps it could simply augment Kinect to provide greater accuracy over a wider range of motions/gestures.
Is it a good thing or a bad thing that the first thought I had was of the cell phone sonar from The Dark Knight film?
I'm in the beginning of my 30s and I can still hear 18 kHz (probably due to not listening to loud music, and wearing musicians' ear plugs in loud clubs); younger folks can often hear to around 20 kHz. Calling this ultrasonic is silly. Though the high frequency sensitivity of the ear is lower and these sounds would not be loud, they can easily be annoying, in the same way the old CRT TVs had that annoying 15.7 kHz buzz you can hear when you mute the sound.
Some here may wonder why, in the day of sound cards with 96 ksamples/s they didn't use a higher output frequency. The problem is the sound card DAC's reconstruction filter starts attenuation significantly below that, and most speakers drop in sensitivity much beyond 20 kHz as well. I would imagine the recording side has similar limitations.
"Politicians and diapers must be changed often, and for the same reason."
So, we got a Kinect, and the biggest downside we noticed is the sheer amount of space it requires to function properly.
I do not have a small house, but it's a bit tight in our living room. I can't imagine how badly it works in a typical dorm room.
Does this sound-based mechanism work better with smaller spaces? Has it been tested in dorm rooms and cube farms?
Why is everybody trying to make me wave my hands in the air or lift my forearms off the desk to drag my fingers across a screen?
If Slashdot were chemistry it would look like this:Cadaverine
Further proof that Microsoft has the best code-names and the worst product names.
Not sure who, but I've heard someone did something like this (>5?) years ago.
Now if only it could transform your PC into a giant robot interested only in the consumption of all energy in the universe.
summary, then article: "the frequency changes when the distance changes". wrong.
the frequency changes when the velocity of the hand/head/whatever changes.
the article even goes further to describe the train approaching vs train leaving example of Doppler effect, and still the author didn't understand that it's not the distance that matters.
PS: 18kHz-22kHz is much too low.
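To put rough numbers on the velocity point, here is a minimal sketch. It assumes a stationary speaker and microphone, a hand acting as a moving reflector, and a speed of sound of about 343 m/s. The function name and the example speeds are made up for illustration; none of this comes from the SoundWave paper.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed)

def reflected_shift_hz(tone_hz, velocity_mps, c=SPEED_OF_SOUND):
    """Doppler shift of a tone bounced off a reflector moving at velocity_mps.

    Positive velocity means the reflector approaches the (stationary)
    speaker and microphone. The shift depends on velocity, not distance.
    """
    received_hz = tone_hz * (c + velocity_mps) / (c - velocity_mps)
    return received_hz - tone_hz

# Hypothetical hand speeds: still, slow approach, fast approach, fast retreat.
for v in (0.0, 0.25, 1.0, -1.0):
    print(f"{v:+.2f} m/s -> {reflected_shift_hz(18000.0, v):+.1f} Hz")
```

Under these assumptions a hand moving at 1 m/s shifts an 18 kHz tone by roughly 105 Hz, and a hand held still produces no shift no matter where it is.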
Kinect detects the position of objects, while this system can only detect movement.
All you need to do is combine specific gestures with spoken keywords, and you've got yourself a magically controlled laptop. Required equipment for Hogwarts comp-sci 101 course. If this had come a few years earlier, they could have used it for spell casting in the Harry Potter PC games.
This could be pretty cool for when you have your hands dirty and don't need your keyboard to be too. Scrolling recipes, for example.
PS. Cue the porn jokes...
.: Max Romantschuk
Can I have my tinfoil hat?
Doc: No, it won't help.
For every benefit you receive a tax is levied. - Ralph Waldo Emerson
much in the same way that a bat uses echolocation.
The bats didn't patent it, but you acknowledge their work.
Why is everybody trying to make me wave my hands in the air or lift my forearms off the desk to drag my fingers across a screen?
Because that's what the actors do in all of the futuristic movies.
I agree that there's little point in using this for everyday computer use, but it would be really cool for standing in front of a classroom giving presentations, and for some other not-so-everyday uses.
"SoundWave has detected that you are trying to masturbate. Shall I redirect your browser to a porn site appropriate for your sexual orientation?"
Clippy
The Microsoft Research team is reporting a 90-100% accuracy rate for SoundWave, even in noisy environments.
I wonder if they tested the system when multiple of these computers were in the same room.
This change in frequency is measured by your computer's built-in microphone, and then some fairly complex software works out your motion/gesture.
Complex software my ass. Take a FFT, find the peak in the 18-20kHz range and add it to the list. Check what the pattern in the list was over the last X seconds, see if that pattern matches one of the stored patterns. Initiate gesture action.
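For what it's worth, the recipe above can be sketched in a few lines of plain Python. This is only an illustration of the steps as described, not what SoundWave actually does: the sample rate, frame size, tolerance, and gesture table are all assumptions, and the synthetic frames contain only the reflected tone, whereas a real recording would also contain the much stronger direct tone from the speaker.

```python
import cmath
import math

RATE = 44100   # samples per second (assumed sound-card rate)
FRAME = 1024   # samples per analysis frame (assumed)

def band_peak_hz(samples, lo_hz=18000.0, hi_hz=20000.0, rate=RATE):
    """Frequency of the strongest DFT bin between lo_hz and hi_hz."""
    n = len(samples)
    lo_bin = math.ceil(lo_hz * n / rate)
    hi_bin = math.floor(hi_hz * n / rate)
    best_bin, best_mag = lo_bin, -1.0
    for k in range(lo_bin, hi_bin + 1):
        # One DFT bin computed directly; an FFT would give all bins at once.
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * rate / n

def to_pattern(peaks_hz, carrier_hz, tol_hz=60.0):
    """Collapse a list of peaks into runs of '+', '-' and '0'."""
    pattern = ""
    for peak in peaks_hz:
        if peak > carrier_hz + tol_hz:
            symbol = "+"      # shifted up: something moving toward the mic
        elif peak < carrier_hz - tol_hz:
            symbol = "-"      # shifted down: something moving away
        else:
            symbol = "0"      # no noticeable shift
        if not pattern.endswith(symbol):
            pattern += symbol
    return pattern

# Hypothetical stored patterns; the names are invented for this example.
GESTURES = {"0+0": "push", "0-0": "pull", "0+-0": "tap"}

def tone(freq_hz, n=FRAME, rate=RATE):
    """Synthetic test frame: a pure sine at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

# Simulated frames: still, toward, away, still (around a 19 kHz carrier).
frames_hz = [19000, 19000, 19200, 19200, 18800, 19000]
peaks = [band_peak_hz(tone(f)) for f in frames_hz]
gesture = GESTURES.get(to_pattern(peaks, carrier_hz=19000.0), "unknown")
print(gesture)
```

With 1024-sample frames the bins are about 43 Hz apart, which is why the tolerance is set a little wider than one bin. Separating a weak reflection from the direct tone is the part this sketch skips entirely.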
This could also be used to see if you are sitting at your laptop... very sneaky.
Help I am stuck in a signature factory!
That is well within the normal hearing range of a teenage human.
Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
Now, let's turn on a room fan, or have the HVAC system start blowing the air around...
I'd like to quote TFS and GP here "The Microsoft Research team is reporting a 90-100% accuracy rate for SoundWave, even in noisy environments."
Bzzzt! Physics knowledge failure detected!
The type of "sound" that a typical room-fan generates that will screw with this isn't the audible "whoosh" sound, but rather the SUB-sonic "warble" (frequency wobble) "vibrato" that is generated by the speed of the fan blades "beating" the air. This "vibrato" might be tracked as "motion" by the doppler-tracking s/w. At best, it would introduce an annoying "uncertainty" in the position information, and at worst, might cause the system to just "give up" due to crappy position data.
To generate the sound I'm talking about, walk up to a window or room-fan and "sing" into it. That "vibrato" is happening to EVERY sound in the room. We are just used to ignoring it. But, anyone who has done any musical practicing, or worse yet, audio recording, in a room with a fan knows EXACTLY what I'm talking about...
I think it's kinda funny that almost every single comment on this article so far has been bitching about the frequency and how people can hear it, and not how amazing this is.
All the world's a CPU, and all the men and women merely AI agents
http://en.wikipedia.org/wiki/Flyback_transformer
b.t.w. I can hear that and also mosquito buzz like the ones in shopping malls.
Oh yes, I am heading towards 50 years of age, not all old people have hearing problems.
Nowadays more young than old people have hearing problems...
There is an Android app that does (or tries to do) just that.
http://www.appbrain.com/app/sonar/com.dicon.sonar
Sig? Heil
Someone developed this capability about 4 years ago (estimate) with the idea of using it to lock a PC or laptop when the user walks away from it in an open environment. Detecting presence of a user within "keyboard range" of the device was almost a trivial matter, and detecting motion near the system was only slightly more complex.
Didn't Morgan Freeman create this circa 2008?
Now you'll be able to fap, fap, fap away until you beat level 32. And don't try and tell me someone won't try this.
Vote monkeys into Congress. They are cheaper and more trustworthy.
Everything old is new again :-) Admittedly, the mechanism is somewhat more advanced going by TFA (the MS version uses doppler shift rather than triangulation per se, so it can use a single mic) :
From TFA:
"In the case of SoundWave, your computer’s built-in speaker is used to emit ultrasonic (18-22KHz) sound waves, which change frequency depending on where your hand (or body) is in relation to the computer. This change in frequency is measured by your computer’s built-in microphone, and then some fairly complex software works out your motion/gesture."
From http://en.wikipedia.org/wiki/Power_Glove
"There are two ultrasonic speakers (transmitters) in the glove and three ultrasonic microphones (receivers) around the TV monitor. The ultrasonic speakers take turns transmitting a short burst (a few pulses) of 40 kHz sound and the system measures the time it takes for the sound to reach the microphones. A triangulation calculation is performed to determine the X, Y, Z location of each of the two speakers, which specifies the yaw and roll of the hand."
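For contrast with the Doppler approach, the time-of-flight calculation in that quote can be sketched as a trilateration. This is a minimal sketch under stated assumptions, not the Power Glove's actual firmware: the receiver layout, the function name, and the test position are hypothetical, and the speed of sound is taken as 343 m/s.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed)

def locate(times_s, d, i, j, c=SPEED_OF_SOUND):
    """Position of a transmitter from three time-of-flight readings.

    Receivers are assumed to sit at (0, 0, 0), (d, 0, 0) and (i, j, 0),
    with the transmitter somewhere in front of them (z >= 0).
    """
    r1, r2, r3 = (c * t for t in times_s)   # times to distances
    x = (r1**2 - r2**2 + d**2) / (2 * d)
    y = (r1**2 - r3**2 + i**2 + j**2) / (2 * j) - (i / j) * x
    z = math.sqrt(max(r1**2 - x**2 - y**2, 0.0))
    return x, y, z

# Hypothetical layout in metres: three receivers around a screen.
receivers = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.6, 0.0)]
glove = (0.3, 0.2, 1.0)
times = [math.dist(glove, r) / SPEED_OF_SOUND for r in receivers]
print(locate(times, d=1.0, i=0.5, j=0.6))
```

Three receivers in one plane leave a mirror ambiguity in z, which the sketch resolves by assuming the transmitter is in front of the screen. A single microphone measuring Doppler shift gets none of this position information, only motion toward or away.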
Caveat Emptor is not a business model.