Hands-On With Microsoft's Touchless SDK
snydeq writes "Fatal Exception's Neil McAllister takes Microsoft's recently released Touchless SDK for a test spin, controlling his Asus Eee PC 901 with a Roma tomato. The Touchless SDK is a set of .NET components that can be used to simulate the gestural interfaces of devices like the iPhone in thin air, using an ordinary USB webcam. Although McAllister was able to draw, scroll, and play a rudimentary game with his tomato, the SDK still has some kinks to work out. 'For starters, its marker-location algorithm is very much keyed to color,' he writes. 'That's probably an efficient way to identify contrasting shapes, but color response varies by camera and is heavily influenced by ambient light conditions.' Moreover, the detection routine soaked up 64 percent of McAllister's 1.6 GHz Atom CPU, with the video from the webcam soon developing a few seconds' lag that made controlling onscreen cursors challenging. Project developer Mike Wasserman offers a video demo of the technology."
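For readers wondering what "keyed to color" might look like in practice, here is a minimal sketch in Python. It is not the Touchless SDK's code. The function name `locate_marker`, the per-channel tolerance test, and the tiny hand-built frames are all assumptions made for illustration. The idea is to collect every pixel whose color is close to a target color and report the centroid of those pixels as the marker position.

```python
def locate_marker(frame, target, tolerance):
    """Return the (x, y) centroid of pixels close to `target`, or None.

    frame: list of rows, each row a list of (r, g, b) tuples.
    target: the (r, g, b) color the marker is expected to have.
    tolerance: maximum allowed difference per channel (hypothetical rule).
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(frame):
        for x, (r, g, b) in enumerate(row):
            # A pixel matches when every channel is within the tolerance.
            if (abs(r - target[0]) <= tolerance
                    and abs(g - target[1]) <= tolerance
                    and abs(b - target[2]) <= tolerance):
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None  # marker not visible under this color key
    return (sum_x / count, sum_y / count)


# A 4x4 grey frame with a 2x2 tomato-red patch in the lower right.
GREY, RED = (128, 128, 128), (200, 40, 30)
frame = [[GREY] * 4 for _ in range(4)]
for y in (2, 3):
    for x in (2, 3):
        frame[y][x] = RED

marker = locate_marker(frame, target=(210, 50, 40), tolerance=30)

# The same scene at half brightness: the red patch no longer matches.
dim_frame = [[(r // 2, g // 2, b // 2) for (r, g, b) in row] for row in frame]
dim_marker = locate_marker(dim_frame, target=(210, 50, 40), tolerance=30)
```

The dimmed frame shows the weakness McAllister describes: a fixed color key stops matching as soon as the lighting shifts the marker's apparent color outside the tolerance.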
Can it recognise that someone's about to pick up a chair?
While it's very voguish to make comparisons with Apple products lately, Sony's Cambridge studio are the group that spring to mind when it comes to gestural webcam-based interfaces. On a related note, their original EyeToy tech demos were similarly "keyed to color", using large foam props, although the end product worked on skin tones and therefore was heavily dependent on good lighting and contrast. They patented a "wand" with coloured LEDs back in 2005 which provided a reasonable compromise between the two (a month or two before the Wii Controller popped up, and made it all look passé).
No kidding!!! What do you say at this point?
Maybe he should try testing it on a real computer next time.... 64% of an underpowered device is not much to complain about.
See my sig, I'm no MS apologist
A fool throws a stone into a well and a thousand sages can not remove it.
You know, someone really should have told these guys about this thing called a low-pass filter. It's very easily implemented in hardware (heck, most DSPs can do it rather handily), and uses very little power. A TI DSP would have no problem handling this kind of load.
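For anyone who hasn't met one, the simplest software low-pass filter is a single-pole IIR, also known as an exponential moving average. The sketch below is only an illustration: applying it to a stream of marker x-coordinates is an assumption about where such a filter would go, and the function name `low_pass` and the sample values are made up.

```python
def low_pass(samples, alpha):
    """Single-pole IIR low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    alpha near 0 smooths heavily (more lag); alpha = 1 passes input through.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    y = None
    for x in samples:
        # Seed the filter with the first sample, then track the input slowly.
        y = x if y is None else y + alpha * (x - y)
        out.append(y)
    return out


# Hypothetical marker x-coordinates with jitter and one outlier (140).
noisy_x = [100, 104, 96, 103, 97, 140, 99, 101]
smooth_x = low_pass(noisy_x, alpha=0.25)
```

Each output costs one multiply and two additions per sample, which is why this kind of filter maps so cheaply onto a DSP. The trade-off is that a smaller `alpha` adds lag of its own.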
As for mediocre hardware, yes, the Eee is a little underpowered compared to a desktop. But when you consider the fact that a 200 MHz DSP can encode NTSC video in real time, chewing up 60% of the CPU is just poor implementation. That's ~1 GHz on a fully pipelined, superscalar processor, with a heatsink, to do what an embedded DSP can do with, oh, say, about 50-100 MHz of processing power, without a heatsink, using a RISC processor, running on AA batteries.
And this is yet another of the reasons I believe programmers should have to learn hardware. They wouldn't write code so inefficiently if they only understood the typical hardware engineer's approach to these problems.
The society for a thought-free internet welcomes you.
Running Linux. And the voice commands actually work!
I'm not sure why I'd bother to chew up my battery with the webcam when I can just talk to the thing. If anything, it seems to me like the voice recognition would be far more promising than using the webcam.
Okay, I know how this is going to sound, and I'm really not trying to troll, so please bear with me. I suppose there's a contingent of people who like the thought of waving their hands in the air to control their computer (Wii users?!), but I just don't see this going anywhere, especially because Microsoft is involved. If you look at their history, they typically get things wrong the first few times. Whatever promise this technology holds, I expect that:
The society for a thought-free internet welcomes you.
You're a bit late; it's been done already, but with a Wii Remote. http://www.cs.cmu.edu/~johnny/projects/wii/
If you're interested in a truly Open Source version of this, Pygame has camera and computer vision functions in the SVN that let you do exactly this. I could track two different colored objects in real time (30 fps) with no lag, on a 433 MHz OLPC XO.
It is Linux only at the moment, but Windows and OS X support is likely to be finished before the next release.
eclecti.cc
openFrameworks wraps C++ like Processing wraps Java, and also has OpenCV bindings.
MS appears to basically be doing optical flow & color tracking. The above libs can do those, and more, and are great for programmers and nonprogrammers alike. Tho if you really hate code, you may rather use Max/MSP/Jitter or GEM/Pd.
In a financial crisis the prize goes to the last man standing
Microsoft is the first U.S. industrial corporation in ten years to earn a AAA bond rating from S&P and Moody's.
More than 70 percent of S&P ratings for U.S. nonfinancial companies are currently below investment grade and classified as "junk", or speculative-grade bonds. That's up from 32 percent in 1980. (Source: "Microsoft wins top credit ratings from S&P, Moody's")
Okay, I know it's a little late to post this, but these are the numbers I'm getting from my EEE 900. I'm running a 3-tap FIR filter to average all the pixels in a dummy frame. This doesn't include the time it would take to pull the frame from the CMOS/CCD sensor.
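To make that kind of measurement concrete, here is a rough sketch of a 3-tap averaging FIR run over a dummy frame and timed. This is not the poster's code, which isn't shown in the thread. The function names `fir3` and `time_frame`, the equal 1/3 taps, the synthetic greyscale frame, and the 320 x 240 size are all assumptions, and pure Python will be far slower than whatever was actually benchmarked.

```python
import time


def fir3(row, taps=(1 / 3, 1 / 3, 1 / 3)):
    """Apply a 3-tap FIR filter along one row (valid region only).

    With equal taps this is a moving average of each pixel and its
    two right-hand neighbours; the output is 2 samples shorter.
    """
    t0, t1, t2 = taps
    return [t0 * row[i] + t1 * row[i + 1] + t2 * row[i + 2]
            for i in range(len(row) - 2)]


def time_frame(width, height):
    """Filter a synthetic greyscale frame; return (result, seconds)."""
    # Dummy data only: no time is spent reading a real sensor.
    frame = [[(x + y) % 256 for x in range(width)] for y in range(height)]
    start = time.perf_counter()
    filtered = [fir3(row) for row in frame]
    elapsed = time.perf_counter() - start
    return filtered, elapsed


filtered, elapsed = time_frame(320, 240)
```

As in the post, the timing excludes pulling the frame from the sensor, so it only bounds the cost of the filtering step itself.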
On battery alone:
On AC it's a little better
Given the sensor resolution is 1280 x 1024, it appears their algorithm uses the full resolution. They could probably get much better results if they used 320 x 240. A little speed binning goes a long way.
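The arithmetic behind that suggestion is easy to check. The sketch below counts pixels at both resolutions and shows one naive way to shrink a frame by keeping every fourth pixel. The `decimate` helper is hypothetical, and plain decimation without averaging is an assumption chosen for brevity, since it can alias.

```python
def pixel_count(width, height):
    """Number of pixels a per-pixel algorithm has to visit."""
    return width * height


def decimate(frame, step):
    """Keep every `step`-th pixel in both directions (no averaging)."""
    return [row[::step] for row in frame[::step]]


full = pixel_count(1280, 1024)   # 1,310,720 pixels
small = pixel_count(320, 240)    # 76,800 pixels
ratio = full / small             # about 17 times fewer pixels to scan

# Small demonstration frame: 8x8 decimated by 4 leaves 2x2.
demo = [[x + 8 * y for x in range(8)] for y in range(8)]
reduced = decimate(demo, 4)
```

Note that a straight 4x decimation of 1280 x 1024 gives 320 x 256 rather than 320 x 240, because the two resolutions have different aspect ratios (5:4 versus 4:3), so some cropping or non-uniform scaling would be needed to hit 320 x 240 exactly.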
Respond to this post if you're interested in the code.
The society for a thought-free internet welcomes you.