Slashdot Mirror


Hands-On With Microsoft's Touchless SDK

snydeq writes "Fatal Exception's Neil McAllister takes Microsoft's recently released Touchless SDK for a test spin, controlling his Asus Eee PC 901 with a Roma tomato. The Touchless SDK is a set of .Net components that can be used to simulate the gestural interfaces of devices like the iPhone in thin air — using an ordinary USB Webcam. Although McAllister was able to draw, scroll, and play a rudimentary game with his tomato, the SDK still has some kinks to work out. 'For starters, its marker-location algorithm is very much keyed to color,' he writes. 'That's probably an efficient way to identify contrasting shapes, but color response varies by camera and is heavily influenced by ambient light conditions.' Moreover, the detection routine soaked up 64 percent of McAllister's 1.6GHz Atom CPU, with the video from the Webcam soon developing a few seconds' lag that made controlling onscreen cursors challenging. Project developer Mike Wasserman offers a video demo of the technology."

13 of 84 comments

  1. Gesture interfaces by Anonymous Coward · · Score: 5, Funny

    Can it recognise that someone's about to pick up a chair?

    1. Re:Gesture interfaces by Firehed · · Score: 3, Funny

      BSD is what caused the chair to be picked up in the first place. I think the reaction is a BSOD.

      --
      How are sites slashdotted when nobody reads TFAs?
  2. iPhone? More like Eyetoy by Sockatume · · Score: 2, Interesting

    While it's very vogueish to make comparisons with Apple products lately, Sony's Cambridge studio are the group that spring to mind when it comes to gestural webcam-based interfaces. On a related note, their original Eyetoy tech demos were similarly "keyed to color", using large foam props, although the end product worked on skintones and therefore was heavily dependent on good lighting and contrast. They patented a "wand" with coloured LEDs back in 2005 which provided a reasonable compromise between the two (a month or two before the Wii Controller popped up, and made it all look passe).

    --
    No kidding!!! What do you say at this point?
  3. Re:Code efficiency by foniksonik · · Score: 2, Insightful

    Maybe he should try testing it on a real computer next time.... 64% of an underpowered device is not much to complain about.

      See my sig, I'm no MS apologist

    --
    A fool throws a stone into a well and a thousand sages can not remove it.
  4. LPF? by gillbates · · Score: 4, Insightful

    You know, someone should have really told these guys about this thing called a low-pass filter. It's very easily implemented in hardware (heck, most DSPs can do it rather handily), and uses very little power. A TI DSP would have no problem handling this kind of load.

    As for mediocre hardware, yes, the EEE is a little underpowered compared to a desktop. But, when you consider the fact that a 200 MHz DSP can encode NTSC video in realtime, chewing up 60% of the CPU is just poor implementation. That's ~1 GHz on a fully pipelined, superscalar processor, with a heatsink, to do what an embedded DSP can do with oh, say about 50-100 MHz of processing power, without a heatsink, using a RISC processor, running on AA batteries.

    And this is yet another reason I believe programmers should have to learn hardware. They wouldn't write code so inefficiently if they only understood the typical hardware engineer's approach to these problems.
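
For reference, a low-pass filter takes only a few lines in software as well. The sketch below is a first-order (exponential) low-pass filter in Python. It illustrates the general technique only; it is not the poster's code, nothing from the Touchless SDK, and says nothing about DSP performance. The names `low_pass` and `alpha` are chosen here for illustration.

```python
from typing import Iterable, List


def low_pass(samples: Iterable[float], alpha: float = 0.25) -> List[float]:
    """First-order IIR low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    Smaller alpha means heavier smoothing; alpha = 1.0 passes the input through.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    output: List[float] = []
    previous = None
    for x in samples:
        # Seed the filter with the first sample to avoid a start-up ramp.
        previous = x if previous is None else previous + alpha * (x - previous)
        output.append(previous)
    return output


# A single spike (50.0) in an otherwise flat signal is spread out and damped.
noisy = [10.0, 10.0, 50.0, 10.0, 10.0]
print(low_pass(noisy, alpha=0.5))  # [10.0, 10.0, 30.0, 20.0, 15.0]
```

The same recurrence could be applied per pixel across frames or to a stream of marker coordinates; which of those the comment has in mind is not stated.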

    --
    The society for a thought-free internet welcomes you.
  5. I have an EEE PC by gillbates · · Score: 3, Insightful

    Running Linux. And the voice commands actually work!

    I'm not sure why I'd bother to chew up my battery with the webcam when I can just talk to the thing. If anything, it seems to me like the voice recognition would be far more promising than using the webcam.

    Okay, I know how this is going to sound, and I'm really not trying to troll, so please bear with me. I suppose there's a contingent of people who like the thought of waving their hands in the air to control their computer (Wii users?!), but I just don't see this going anywhere, especially because Microsoft is involved. If you look at their history, they typically get things wrong the first few times. Whatever promise this technology holds, I expect that:

    1. Any really cool technique will be patented by Microsoft and doomed to obscurity by their poor implementation of same; and
    2. It really is easier for most people to talk to their computer, or use the mouse/keyboard to control their computer, than it is to wave.
    --
    The society for a thought-free internet welcomes you.
    1. Re:I have an EEE PC by Issildur03 · · Score: 2, Informative

      I tried it out, and the drawing demo seems to be the most promising application. In the absence of a touch-screen monitor, this could be a lot better than an external touchpad. And there's definitely something neat about using a tomato to play snake. Still a long way to go, though...

    2. Re:I have an EEE PC by sp332 · · Score: 2, Informative

      1. Any really cool technique will be patented by Microsoft and doomed to obscurity by their poor implementation of same; and

      From the license:
      "(B) Patent Grant- Subject to the terms of this license, including the license conditions and limitations in section 3, each contributor grants you a non-exclusive, worldwide, royalty-free license under its licensed patents to make, have made, use, sell, offer for sale, import, and/or otherwise dispose of its contribution in the software or derivative works of the contribution in the software."

  6. Re:Didn't Toshiba do something similar to this onc by cowlobster · · Score: 2, Interesting

    you're a bit late, it's been done already, but with a wii remote. http://www.cs.cmu.edu/~johnny/projects/wii/

  7. Re:It's Open Source by PaintyThePirate · · Score: 5, Informative

    If you're interested in a truly Open Source version of this, Pygame has camera and computer vision functions in the SVN that let you do exactly this. I could track two different colored objects in realtime (30fps) with no lag, on a 433 MHz OLPC XO.

    It is Linux only at the moment, but Windows and OS X support is likely to be finished before the next release.

  8. better cross platform alternatives by nan0 · · Score: 5, Informative
    opencv has nice python bindings, runs on mac, win & nix.
    openframeworks wraps c++ like processing wraps java, also has opencv bindings.

    MS appears to basically be doing optical flow & color tracking. the above libs can do those, and more, and are great for programmers and nonprogrammers alike. tho if you really hate code, you may rather use max/msp/jitter or gem/pd.
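
To make "color tracking" concrete, here is a minimal sketch in plain Python: it finds the centroid of all pixels whose color is close to a target color. This is an assumption about the general approach, not the Touchless SDK's actual algorithm and not OpenCV's API. The helper `find_marker`, the `tolerance` parameter, and the synthetic frame are all hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0-255
Frame = List[List[Pixel]]      # rows of pixels


def find_marker(frame: Frame, target: Pixel,
                tolerance: int = 60) -> Optional[Tuple[float, float]]:
    """Return the (x, y) centroid of pixels near the target color, or None."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(frame):
        for x, (r, g, b) in enumerate(row):
            # Manhattan distance in RGB space: crude, and sensitive to lighting.
            distance = abs(r - target[0]) + abs(g - target[1]) + abs(b - target[2])
            if distance <= tolerance:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


# A 6x4 black frame with a 2x2 tomato-red patch at columns 3-4, rows 1-2.
RED = (200, 30, 30)
frame = [[(0, 0, 0)] * 6 for _ in range(4)]
for y in (1, 2):
    for x in (3, 4):
        frame[y][x] = RED

print(find_marker(frame, RED))  # (3.5, 1.5)
```

Matching on raw RGB distance also shows why such a tracker depends on the camera and the ambient light: a change in exposure shifts every pixel's color and can push the marker outside the tolerance.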

  9. The last man standing by westlake · · Score: 2, Informative
    Can it recognise that someone's about to pick up a chair?
    .

    In a financial crisis the prize goes to the last man standing.

    Microsoft is the first U.S. industrial corporation in ten years to earn a AAA bond rating from S&P and Moody's.

    More than 70 percent of S&P ratings for U.S. nonfinancial companies are currently below investment grade and classified as "junk", or speculative-grade bonds. That's up from 32 percent in 1980. [Microsoft wins top credit ratings from S&P, Moody's]

  10. Actual numbers by gillbates · · Score: 2, Interesting

    Okay, I know it's a little late to post this, but these are the numbers I'm getting from my EEE 900. I'm running a 3-tap FIR filter to average all the pixels in a dummy frame. This doesn't include the time it would take to pull the frame from the CMOS/CCD sensor.

    On battery alone:

    Resolution: 160 x 120 : 4223 frames, (422.300000 per second)
    Resolution: 320 x 240 : 849 frames, (84.900000 per second)
    Resolution: 640 x 480 : 303 frames, (30.300000 per second)
    Resolution: 720 x 480 : 269 frames, (26.900000 per second)
    Resolution: 800 x 600 : 171 frames, (17.100000 per second)
    Resolution: 1024 x 768 : 118 frames, (11.800000 per second)
    Resolution: 1280 x 1024 : 71 frames, (7.100000 per second)
    Resolution: 1600 x 1200 : 30 frames, (3.000000 per second)

    On AC it's a little better:

    Resolution: 160 x 120 : 5758 frames, (575.800000 per second)
    Resolution: 320 x 240 : 1675 frames, (167.500000 per second)
    Resolution: 640 x 480 : 321 frames, (32.100000 per second)
    Resolution: 720 x 480 : 353 frames, (35.300000 per second)
    Resolution: 800 x 600 : 276 frames, (27.600000 per second)
    Resolution: 1024 x 768 : 169 frames, (16.900000 per second)
    Resolution: 1280 x 1024 : 101 frames, (10.100000 per second)
    Resolution: 1600 x 1200 : 60 frames, (6.000000 per second)
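
The poster's code is not shown, so the following is only a guess at the shape of such a measurement: a 3-tap moving-average FIR filter run along each row of a synthetic grayscale frame, timed to give frames per second. The function names, the equal tap weights, and the edge handling are assumptions. Pure Python will be far slower than the figures above, which presumably came from compiled code.

```python
import time
from typing import List

Frame = List[List[int]]   # grayscale rows, values 0-255


def fir3_rows(frame: Frame) -> Frame:
    """Apply a 3-tap moving average (taps 1/3, 1/3, 1/3) along each row.

    Edge pixels reuse their own value for the missing neighbour, so the
    output has the same size as the input.
    """
    out: Frame = []
    for row in frame:
        width = len(row)
        new_row = []
        for x in range(width):
            left = row[x - 1] if x > 0 else row[x]
            right = row[x + 1] if x < width - 1 else row[x]
            new_row.append((left + row[x] + right) // 3)
        out.append(new_row)
    return out


def frames_per_second(width: int, height: int, frames: int = 5) -> float:
    """Time the filter on a synthetic frame; excludes any camera capture cost."""
    dummy = [[(x + y) % 256 for x in range(width)] for y in range(height)]
    start = time.perf_counter()
    for _ in range(frames):
        fir3_rows(dummy)
    elapsed = time.perf_counter() - start
    return frames / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    for width, height in [(160, 120), (320, 240)]:
        rate = frames_per_second(width, height)
        print(f"Resolution: {width} x {height} : {rate:.1f} per second")
```

The work grows with the pixel count, which is consistent with the frame rate falling as the resolution rises in the tables above.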

    Given the sensor resolution is 1280 x 1024, it appears their algorithm uses the full resolution. They could probably get much better results if they used 320 x 240. A little speed binning goes a long way.
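
One way to get from the full sensor resolution to something near 320 x 240 is to average blocks of pixels before running any detection. The sketch below assumes a grayscale frame stored as a list of rows; `bin_pixels` and `factor` are hypothetical names, and nothing here is taken from the Touchless SDK. A factor of 4 turns 1280 x 1024 into 320 x 256, which is close to, but not exactly, 320 x 240.

```python
from typing import List

Frame = List[List[int]]   # grayscale rows, values 0-255


def bin_pixels(frame: Frame, factor: int) -> Frame:
    """Average each factor x factor block into one output pixel.

    Partial blocks on the right and bottom edges are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    height = len(frame) // factor
    width = (len(frame[0]) // factor) if frame else 0
    out: Frame = []
    for by in range(height):
        row = []
        for bx in range(width):
            total = 0
            for dy in range(factor):
                for dx in range(factor):
                    total += frame[by * factor + dy][bx * factor + dx]
            row.append(total // (factor * factor))
        out.append(row)
    return out


small = [
    [0, 4, 10, 10],
    [4, 8, 10, 10],
]
print(bin_pixels(small, 2))  # [[4, 10]]
```

Going from 1280 x 1024 to 320 x 240 cuts the pixel count by a factor of about 17, so any per-pixel detection step afterwards has correspondingly less work to do, at the cost of the binning pass itself.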

    Respond to this post if you're interested in the code.

    --
    The society for a thought-free internet welcomes you.