Microsoft Creates Kinect-Like System Using Laptop Speaker & Microphone
MrSeb writes "Microsoft Research, working with the University of Washington, has developed a Kinect-like system that uses your computer's built-in microphone and speakers to provide object detection and gesture recognition, much in the same way that a submarine uses sonar. Called SoundWave, the new technology uses the Doppler effect to detect any movements and gestures in the proximity of a computer. In the case of SoundWave, your computer's built-in speaker is used to emit ultrasonic (18-22 kHz) sound waves, which change frequency depending on where your hand (or body) is in relation to the computer. This change in frequency is measured by your computer's built-in microphone, and then some fairly complex software works out your motion/gesture. The obvious advantage of SoundWave over a product like Kinect is that it uses existing, commodity hardware; it could effectively equip every modern laptop with a gesture-sensing interface. The Microsoft Research team is reporting a 90-100% accuracy rate for SoundWave, even in noisy environments."
Why is my dog barking at my laptop?
Faster! Faster! Faster would be better!
It all depends on the frequency used for the "sonar" system. The fans, HDD, and background noise shouldn't contain a significant amount of energy at 20 kHz, so it shouldn't be a problem.
How is this ultrasonic? Humans can hear up to 20 kHz, so only the upper end of this range is going to be above human hearing. Neat idea, but I don't think I could tolerate the high-pitched whine all day. Sounds like MS needs to hire some younger blood.
From the article "The Microsoft Research team is reporting a 90-100% accuracy rate for SoundWave, even in noisy environments."
Good job reading the summary:
"The Microsoft Research team is reporting a 90-100% accuracy rate for SoundWave, even in noisy environments."
Can they patent it? This seems to be pretty much what bats have been doing for centuries
The Kinect had a bit more going on than that: it had both an ordinary webcam and a projected IR dot field with an IR camera for depth calculations (along with an array mic, for noise cancellation and some degree of audio location)...
In this case, my impression is that the 'sonar' data are intended to be combined with a webcam image, with the 'sonar' providing a cue about what is foreground and what is background, and the webcam providing the detail.
There was some research on this in the past; this seems to be a much more precise version. (And btw, why aren't they also using the built-in camera, which is very common in today's laptops?)
http://hardware.slashdot.org/story/09/10/15/2121214/sonar-software-detects-laptop-user-presence
http://empathicsystems.org/
Why is my dog barking at my laptop?
Because that is not really your laptop, you moron! It is a polymimetic-type Terminator! Your dog is trying to warn you! Run for your life!
And it won't merely affect dogs. Who says that this might not subconsciously affect humans too? Even if you do not consciously hear the near-ultrasound, it might still affect you in indirect ways....
I'm in the beginning of my 30s and I can still hear 18 kHz (probably due to not listening to loud music, and wearing musicians' ear plugs in loud clubs); younger folks can often hear to around 20 kHz. Calling this ultrasonic is silly. Though the high frequency sensitivity of the ear is lower and these sounds would not be loud, they can easily be annoying, in the same way the old CRT TVs had that annoying 15.7 kHz buzz you can hear when you mute the sound.
Some here may wonder why, in the day of sound cards with 96 ksamples/s, they didn't use a higher output frequency. The problem is that the sound card DAC's reconstruction filter starts attenuating significantly below that, and most speakers drop in sensitivity much beyond 20 kHz as well. I would imagine the recording side has similar limitations.
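To put rough numbers on the sampling-rate point: a converter running at a given sample rate can represent frequencies only up to half that rate (the Nyquist limit), and the usable band stops somewhat short of it because of the filters. The sketch below just does that arithmetic for a few common rates. The 18-22 kHz band comes from the summary; the function names and the idea of a configurable safety margin are my own illustration, not anything from the SoundWave work.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at the given sample rate."""
    return sample_rate_hz / 2.0


def band_fits(low_hz, high_hz, sample_rate_hz, margin_hz=0.0):
    """True if [low_hz, high_hz] lies below the Nyquist limit.

    margin_hz is a hypothetical allowance for filter roll-off near Nyquist;
    real hardware differs, so treat it as a knob to play with.
    """
    return 0 <= low_hz <= high_hz <= nyquist_hz(sample_rate_hz) - margin_hz


if __name__ == "__main__":
    for rate in (44_100, 48_000, 96_000):
        print(
            f"{rate} samples/s: Nyquist {nyquist_hz(rate):.0f} Hz, "
            f"18-22 kHz fits: {band_fits(18_000, 22_000, rate)}, "
            f"with 2 kHz margin: {band_fits(18_000, 22_000, rate, 2_000)}"
        )
```

At 44.1 ksamples/s the band only just squeezes under the limit, which fits the point that the real ceiling is set by the filters and speakers rather than by the nominal sample rate.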
"Politicians and diapers must be changed often, and for the same reason."
Getting it wrong one time in ten doesn't sound terribly good to me.
Actually, given it's called "SoundWave", more likely a Transformer (a Decepticon, to be specific). Terminators cannot replicate advanced machine functions such as a computer display, while a Transformer can.
"None can love freedom heartily, but good men; the rest love not freedom, but license." --John Milton
Computer: Hi there, I see you are giving me the middle finger salute. Would you like help with:
1. filling out your Windows registration
2. sending us money to unlock exciting new features of Windows
3. allowing all your warnings and alerts to use the voice chip
Perhaps Microsoft could combine this as a double check for Kinect, to make Kinect actually work.
Fear is the mind killer.
Why is everybody trying to make me wave my hands in the air or lift my forearms off the desk to drag my fingers across a screen?
If Slashdot were chemistry it would look like this: Cadaverine
Further proof that Microsoft has the best code-names and the worst product names.
Some people also hear sounds in the 18-22 kHz range. Especially 18-20 kHz, which is inside the "normal" hearing range for young people.
Most PC speakers and many sound cards are unable to produce reliable sound in those ranges anyhow, so it might be moot - it likely won't annoy you because it won't work.
This could be pretty cool for when you have your hands dirty and don't need your keyboard to be too. Scrolling recipes, for example.
PS. Cue the porn jokes...
.: Max Romantschuk
It sounds interesting, as long as there is no background noise, you are alone in the room with the system and the system itself isn't generating any noises (fans? DVD access? music or sound effects?).
And you don't have a fan operating in the room, and you aren't less than 25 years old (or 40 if female), since most males can hear 18-22 kHz up to about that age and females until about age 40-50. Otherwise you won't be able to stand being in the same room with it.
Exactly! Just like radiation!
Well.. maybe. Or Maybe not. But Definitely not sort of.
Just wait till the reports regarding how prolonged exposure to this frequency causes earlobe cancer.
-The wise argue that there are few absolutes, the fool argues that there are no probabilities.
Now, let's turn on a room fan, or have the HVAC system start blowing the air around...
I'd like to quote TFS and GP here "The Microsoft Research team is reporting a 90-100% accuracy rate for SoundWave, even in noisy environments."
Bzzzt! Physics knowledge failure detected!
The type of "sound" that a typical room-fan generates that will screw with this isn't the audible "whoosh" sound, but rather the SUB-sonic "warble" (frequency wobble) "vibrato" that is generated by the speed of the fan blades "beating" the air. This "vibrato" might be tracked as "motion" by the doppler-tracking s/w. At best, it would introduce an annoying "uncertainty" in the position information, and at worst, might cause the system to just "give up" due to crappy position data.
To generate the sound I'm talking about, walk up to a window or room-fan and "sing" into it. That "vibrato" is happening to EVERY sound in the room. We are just used to ignoring it. But, anyone who has done any musical practicing, or worse yet, audio recording, in a room with a fan knows EXACTLY what I'm talking about...
I think it's kinda funny that almost every single comment on this article so far has been bitching about the frequency and how people can hear it, and not how amazing this is.
All the world's a CPU, and all the men and women merely AI agents
http://en.wikipedia.org/wiki/Flyback_transformer
b.t.w. I can hear that and also mosquito buzz like the ones in shopping malls.
Oh yes, I am heading towards 50 years of age, not all old people have hearing problems.
Nowadays more young than old people have hearing problems...
Kinect detects the position of objects, while this system can only detect movement.
Not necessarily. If two slightly different frequencies are used (one from each stereo speaker), then with some complex math and comparisons against previous frames a simulated environment can be built with only one microphone. It may need to be calibrated each use (as different laptops have speakers/microphones in different physical locations across different models), but it can be approximated.
I think you missed GP's legitimate complaint... Contrary to the article, the Doppler effect has nothing to do with position; it depends on relative velocity. There's no change in frequency in the reflected ultrasonic tone if your hands are 1 inch, 1 foot, or 10 feet from the microphone... if they stay there. Only when you move can it detect the gesture, because that's the only time the reflection would be Doppler shifted.
Now, that's just according to the article's description of how the system works, but since the journalist got the Doppler effect wrong, it's highly likely he also got SoundWave wrong. If the system uses pulses, then it could use time-domain reflectometry to measure distance to stationary objects.
If it's just the Doppler effect, however, you don't need different frequencies or a pair of mics, as you said, because it's not simulating the environment... it's just looking for a change in a detected frequency from a known baseline, thus indicating something approaching or receding.
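For anyone who wants to see how big that shift from the baseline would be, here is a back-of-the-envelope sketch. It assumes the speaker and microphone sit in the same place, the hand moves straight toward or away from them, and sound travels at about 343 m/s; the 20 kHz tone is picked from the middle of the 18-22 kHz range in the summary. The function name and the sample hand speeds are mine, and none of this is taken from the actual SoundWave software.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 C (assumed)


def reflected_doppler_shift(tone_hz, velocity_ms, c=SPEED_OF_SOUND):
    """Shift in Hz of a tone reflected off a moving target.

    velocity_ms > 0 means the target approaches the co-located speaker and
    microphone; velocity_ms < 0 means it recedes. The reflection is shifted
    twice (once on the way out, once on the way back), giving
    f_received = f * (c + v) / (c - v).
    """
    return tone_hz * (c + velocity_ms) / (c - velocity_ms) - tone_hz


if __name__ == "__main__":
    tone = 20_000.0  # Hz
    for v in (0.0, 0.25, 1.0, -1.0):  # hypothetical hand speeds in m/s
        shift = reflected_doppler_shift(tone, v)
        print(f"v = {v:+.2f} m/s -> shift = {shift:+.1f} Hz")
```

A stationary hand gives zero shift no matter how far away it is, which is the point above: with a continuous tone you learn about motion, not position.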
So you're saying that it's a device that will make the wife and kids leave me the f** alone so that I can get some work done?!!
Does it run on Linux?
Aah, change is good. -- Rafiki
Yeah, but it ain't easy. -- Simba