Best Device For Gesture Based Input?


Posted by timothy on Thursday April 26, 2001 @08:51AM from the move-your-eyes dept.

jotaeleemeese writes: "A few days ago there was a discussion about gesture navigation in the Opera browser, which prompted me to buy Black & White, download Opera and get the evaluation version of Sensiva. Being a trackball user, I found gesture navigation too cumbersome, and a mouse was not much better. Then I thought a pen-based device or a touchpad could be ideal for this kind of input, but before investing my hard cash in something, I would like opinions from /.ers who have already tried these or other devices with programs using gesture recognition, and what the results have been."


Is your computer slow perhaps? by jandrese · 2001-04-26 05:03 · Score: 5

One thing I noticed about Black and White is that if your input device has very low resolution (say your computer is overtaxed and only servicing interrupts every 100ms or so), then the gesture-based input can be a real pain, but when I'm on a fast enough machine (with a good precision mouse) the gestures are easy to perform. The problem with slow input is that when you go around a curve, the mouse may only register at two or three points along the curve, and your software will interpolate that into a straight line between those points. If what you are trying to draw is curved, then there is a good chance the recognition software will get it wrong.
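To put a rough number on the effect, here is a minimal sketch (not anything taken from Black and White) that samples a circular gesture at a few evenly spaced points and measures how far the straight segments between samples fall from the true curve. The function names and the circle test shape are assumptions chosen for illustration.

```python
import math

def sample_circle(radius, n_samples):
    """Return n_samples evenly spaced points on a full circle."""
    return [
        (radius * math.cos(2 * math.pi * i / n_samples),
         radius * math.sin(2 * math.pi * i / n_samples))
        for i in range(n_samples)
    ]

def max_chord_error(radius, n_samples):
    """Largest gap between the true arc and the straight segment that
    joins two neighbouring samples (the sagitta of the chord)."""
    half_angle = math.pi / n_samples
    return radius * (1 - math.cos(half_angle))

def polyline_length(points):
    """Length of the closed polygon drawn through the sampled points."""
    return sum(
        math.dist(points[i], points[(i + 1) % len(points)])
        for i in range(len(points))
    )

if __name__ == "__main__":
    radius = 100.0
    for n in (3, 10, 50):
        pts = sample_circle(radius, n)
        print(n, max_chord_error(radius, n), polyline_length(pts))
```

With only three samples on a circle of radius 100, the segments cut as much as 50 units inside the curve, so the recognizer effectively sees a triangle rather than a circle.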

Down that path lies madness. On the other hand, the road to hell is paved with melting snowballs.

--

I read the internet for the articles.
Gloves by Ravenscall · 2001-04-26 04:56 · Score: 5

I think something like the old Nintendo power glove would be great, hit a button, gesture, hit a button to confirm, all of it fingertip controlled. Either that, or a touchpad/screen. What would be more natural for gesture based control than touching the screen and making the gesture?

--
You say you want a revolution....
Alias|Wavefront's work in the field. by jamesneal · 2001-04-26 05:12 · Score: 5

I believe the whole concept of gesture-based menus was pioneered (and put into production) by Alias|Wavefront, whose software is designed to be used with a Wacom tablet; pens work much better than mice for gesturing.
The idea is that the human brain isn't good at discerning differences between short distances, such as "Move the mouse pointer to the menu bar, click within a .5 inch box, scroll down 2.5 inches to the appropriate menu item and release"; however, it's quite good at producing and remembering changes in direction. So, for instance, File|Save would be "Up, Left".
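As a rough illustration of how a pointer path could be boiled down to a sequence like "Up, Left", here is a minimal sketch. It is not Alias|Wavefront's method; the four direction labels, the `min_step` jitter threshold and the function names are all assumptions, and coordinates are taken to be screen coordinates with y growing downward.

```python
import math

def classify(dx, dy):
    """Label one movement in screen coordinates (y grows downward)."""
    if abs(dx) >= abs(dy):
        return "Right" if dx > 0 else "Left"
    return "Down" if dy > 0 else "Up"

def stroke_directions(points, min_step=10.0):
    """Reduce a list of (x, y) pointer positions to coarse directions.

    Movements shorter than min_step are ignored as jitter, and repeated
    directions are merged so a long upward drag counts as a single "Up".
    """
    if not points:
        return []
    directions = []
    anchor = points[0]
    for x, y in points[1:]:
        dx, dy = x - anchor[0], y - anchor[1]
        if math.hypot(dx, dy) < min_step:
            continue  # too small to count as a deliberate movement
        label = classify(dx, dy)
        if not directions or directions[-1] != label:
            directions.append(label)
        anchor = (x, y)
    return directions

if __name__ == "__main__":
    path = [(100, 100), (100, 80), (100, 60), (100, 40),
            (80, 40), (60, 40), (40, 40)]
    print(stroke_directions(path))
```

Only the order of direction changes matters here, not how far the pointer travelled, which is the property the comment describes.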
With just two gestures, it's possible to represent over 48 different actions. Add a third gesture, and that number goes to 288. Their research showed that their average subject had no problem remembering four levels deep!
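The comment doesn't say how those figures are derived. One reading that reproduces them, and this is only a guess, is eight possible directions for the first stroke and six for each stroke after it. A minimal sketch under that assumption:

```python
def gesture_count(levels, first_choices=8, later_choices=6):
    """Number of distinct stroke sequences of the given depth.

    The defaults (8 directions for the first stroke, 6 for each later
    one) are an assumption picked because they reproduce the 48 and 288
    quoted in the comment; the real menus may be organised differently.
    """
    if levels < 1:
        return 0
    return first_choices * later_choices ** (levels - 1)

if __name__ == "__main__":
    for depth in (1, 2, 3, 4):
        print(depth, gesture_count(depth))
```

Under the same assumption, four levels deep would give 1,728 distinct sequences.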
Gesture interfaces are especially useful as a user-interface for blind people, where it's just not possible to choose items from a menu visually.
The cool thing is that gesture-based menus have been part of the Alias|Wavefront products since 1996.
-James
Unproductive gestures by bobdehnhardt · 2001-04-26 05:26 · Score: 5

I gesture at my computer constantly. It doesn't increase my productivity or improve my computing, but it does make me feel a whole lot better....
Application specific? by Black+Parrot · 2001-04-26 06:01 · Score: 5

For surfing p()orn sites and playing Tomb Raider, I have found that a life-size inflatable doll makes the best gesture-based "input" device.

--
Sheesh, evil *and* a jerk. -- Jade
Force feedback mouse by Brento · 2001-04-26 04:59 · Score: 5

As a guy who plays Black & White with a Logitech iFeel mouse, I've gotta say your initial take on mice needs to be revisited. Having the mouse kick back when you do something right, wrong, powerful, whatever, that means a lot, and it helps you get used to doing things the right way.

The only drawback is that it's too tiring for day-to-day use. I usually leave the feedback turned off when surfing the web, for example, because it just beats your wrists to death as you glide over a zillion links. I've got carpal tunnel, and the buzz that it makes when jumping over hyperlinks makes my wrists feel like they've been typing for hours.

It's remarkably cheap, too - it was $45 on the shelf the last time I looked.

--
What's your damage, Heather?
Touchpads work, if using correct drivers by Animgif · 2001-04-26 05:00 · Score: 5

I am currently using a Dell CPxJ for browsing with Opera. I had the regular drivers installed for the PS/2 mouse... they blew for this. I downloaded the Alps drivers. This allowed me to click and drag, right click, everything could be set, so I just placed my finger and dragged... all's done.

I think the key to anything is find something you are comfortable with, and then just make it work. Don't spend a lot of money on something you aren't going to be happy with. And when you do get it, don't half ass it!

--
------ This has been provided as a public service! ------
Wacom Graphire by Perdo · 2001-04-26 11:19 · Score: 5

For drawing pictures freehand on a 'puter nothing beats it. Pressure sensitive and integrates with Adobe and Corel. Darker, fatter lines when you press hard, lighter thinner lines when you ease up. You can actually sketch with this thing. Has a similar feel to a soft pencil or the spongy tipped ink pens. Put a piece of soft plastic over the tablet to provide a better feel of resistance to pen strokes. Nothing rough though. Anything rough will actually give you the effect of gravestone rubbings. It transfers the grain of the paper you are using to provide resistance directly to the screen. Yes, it is that sensitive.

--
If voting were effective, it would be illegal by now.
Mouse position refresh rate - that's what matters by andyh1978 · 2001-04-26 05:51 · Score: 5

I've only got experience of a mouse with gesture recognition, so I can't speak for any other device.

What I have seen is how much the 'refresh rate' of the mouse's position (temporal frequency?) affects the usability of gestures.

I've bought Black and White, and it has serious issues on Windows 2000. As in it doesn't run at all. Fantastic.

I've got a triple-boot machine (Slackware/Win98/Win2k), so I'm forced to run B&W in Windows 98 where the update rate of the mouse is pretty appalling.

Getting B&W to recognise some of the more complex gestures is a pain because the time between updates of mouse position gives the gesture considerably more 'jaggy' edges, making it look less like what you actually did with the mouse.

Windows 2000 has the refresh rate pretty high, so I'd have thought it's far easier to use gestures successfully on there.

I've not used the mouse much under Linux; my dedicated Linux box doesn't have a monitor, let alone a mouse, I just use it over ssh or X-Win32, so I don't know if the PS/2 refresh rate has been increased (or is configurable); the last I saw was that it wasn't particularly fast.

Opera's gestures are fairly simple (so far), not nearly as complex as some of B&W's gestures, so the rate isn't as critical. But, add more complex ones and you will see the difference.

It's not a new technology by any stretch of the imagination (emacs strokes mode anyone?) but it's very useful; even something as simple as Opera's 'back' gesture is so convenient, I wonder 'why didn't they put this in earlier!'.

Nice one Mr. Molyneux; he was always the king of games back in the good old days of Atari STs, and now something from his latest game seems to have started a bit of a trend elsewhere in the software business.
REAL gesturing. by 3-State+Bit · 2001-04-26 06:38 · Score: 5
Our company is developing software for true "gesture recognition". Basically, it takes a number of arbitrary points of view (from higher-quality [not "web"] cameras) and calculates the location of three-space objects from them. The only hardware "set-up" is holding up a calibrator (a scepter-like device) by its handle and pressing a button to mechanically turn it 360 degrees a couple of times. (The mechanics so far are just toy-like; the important aspect of the calibrator is its gradations, a proprietary system serving the purpose of interlocking rulers.) It doesn't even matter if you move it while you do it, as long as you don't move it too fast to have distinct, clear frames. As long as there is a line of sight between the cameras and the calibrator, the software will be able to calculate their positions relative to the calibrator.
Afterward, our software is able to keep a running matrix of all three-space that is visible to at least two cameras. Using five cameras, it's possible to have more or less a total view (well, total opaque view) of the three-space in front of your monitor, for instance, and one of the five cameras is only necessary when you happen to be blocking one of the others.
All this is very processor-intensive, but so far it's very straightforward. Basically, simple trigonometry. We haven't been working on optimizing tricks, since our 800 quad Xeon test server already does 30 frames per second with five cameras at 800 by 600. So our process looks like this:
1. Synchronize a "frame" from the point of view of every camera. You must already know their "absolute" positions, which is relative to some zero-point. (Determined by the original location of the calibrator).
2. For each pixel that a given camera sees:
  - Assume that you are seeing a pixel at the nearest point that the second camera in your stereo set could also see. To draw a human comparison, bring your finger closer and closer to your eye, until with your other eye it passes the line of your nose and you can't see it anymore. This is the "closest point".
  - Calculate where this point would appear in the other camera, as well as the surrounding blocks of pixels, and see whether it matches what the other camera in the stereo pair actually sees.
  - If it doesn't match, assume that it must be farther than you initially assumed. Repeat the process.
  - Repeat until you "converge", i.e., get images where many pixels in the area "line up" as calculated by the assumption that they are at absolute point x,y,z. This process is actually very similar to what your eye does, if you ever notice it, when it's scanning for how far away something is. At first it assumes it's close, then keeps looking farther and farther away until the two images are brought together. Your brain is the only thing bringing the two images together! Your eyes are still an inch point five apart, silly. :) In the same way, for each pixel (or rather, group of pixels large enough to identify a small area on an object), our software's "brain" converges the image for various distances until it finds a match.
  - If you cannot find a match, assume that the other camera in the pair is not seeing that particular pixel, either because something near you is blocking the nearest area that the other camera is seeing, or because something near the other camera is blocking the line of sight that goes to what you're seeing, or because it's outside the line of sight of another camera entirely. This last is easiest because you don't even need to scan the pixels you know only one camera sees.
3. Repeat this process for each stereo pair.
4. Assemble every pixel you have an absolute coordinate for (one that a stereo pair can see) into a three-space.
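The near-to-far search in step 2 can be sketched in a heavily simplified form. This is not the poster's software: it assumes a rectified pair of cameras (so a point at column x in the left image shows up at column x - d in the right image), works on a single scan line of grayscale integers, and uses a plain sum-of-absolute-differences window match. All names, the window size and the tolerance are hypothetical.

```python
def sad(left, right, x_left, x_right, half_window):
    """Sum of absolute differences between two windows on one scan line."""
    return sum(
        abs(left[x_left + k] - right[x_right + k])
        for k in range(-half_window, half_window + 1)
    )

def find_disparity(left, right, x, max_disparity, half_window=2, tolerance=0):
    """Search for the pixel at column x of `left` in `right`.

    Candidates are tried from the nearest point (largest disparity) to
    the farthest (disparity 0). Returns the first disparity whose windows
    match within `tolerance`, or None if nothing matches, which is
    treated as "the other camera cannot see this pixel".
    """
    if x - half_window < 0 or x + half_window >= len(left):
        return None  # window would fall off the edge of the image
    for d in range(max_disparity, -1, -1):
        x_right = x - d
        if x_right - half_window < 0:
            continue  # candidate window falls off the right image
        if sad(left, right, x, x_right, half_window) <= tolerance:
            return d
    return None

def depth_from_disparity(d, focal_px, baseline):
    """Pinhole model for a rectified pair: depth = focal * baseline / d."""
    return focal_px * baseline / d

if __name__ == "__main__":
    right = [3, 9, 5, 7, 2, 8, 1, 6, 4, 0, 0, 0, 0, 0, 0, 0]
    left = [0, 0, 0] + right[:-3]  # same scene shifted by 3 pixels
    print(find_disparity(left, right, x=7, max_disparity=5))
```

Taking the first acceptable match from the near side mirrors the description above, but it can lock onto a false match in flat or repetitive regions; that is presumably where the fuzzier matching mentioned after the list comes in.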
Note that I've left out such things as massaging the image from different cameras in various ways (color, brightness, etc) to get them near, using more or less fuzzy "matches" depending on how much you might expect an object to differ at different angles, and calculating lighting sources based on the calibrator. While these are serious issues, they're really basic math stuff that's well-explored in the field of optical recognition, and it's basically a cut-and-paste of components, and, like I said, a $5,000 server can do thirty frames per second without having any graphics hardware specifically enabled for this stuff. The number of three-space "pixels" it ends up getting varies with conditions, but you can always do well enough to read standard braille that's reasonably close in proximity (1.5 feet) to a stereo pair of cameras. Needless to say, there are more useful applications to these kinds of technology than reading braille on your computer screen :). This leads me to the real area we're flinging resources at:
Developing a gesture recognition system. I did not mean to outline everything I did above, but it really is not that involved, and a lot more viable than some people think. Anyway, the interesting thing about the three-space that you develop from the process above is that it is very easily analyzable. Not only do you have a solid "block" of where pixels are, but it's easy to tell lines that separate, for instance, individual fingers that overlap. In fact, the human brain uses more picture analysis than stereoscopic analysis, and our system is actually more precise than the human brain at finding the exact location of a point two or three feet away relative to a point near it, if you are given no color clues! When looking at a hand, therefore, we can pretty much take the basic shape of a hand and (here is where we get tricky) apply a very fuzzy algorithm for fitting it to the hand that we actually see. It is "fuzzy" almost to the extent of being neural-netty (although we control it very much), since it not only needs to choose between an infinite number of ways that two hands can contort themselves, but also learn the size of individual aspects of the hand (which changes slightly), and their shape. For this purpose it also takes into account where the hand "used" to be in the previous frame, how fast it was moving over the previous few frames, and how likely it is to move in a certain way, with respect to speed and with respect to what positions are unnatural. All this is necessary to get 30 frames per second, because we aren't just interested in the "position" of the hand, but its important aspects (the relative bend in each joint). To test, we have another application that is ONLY given the absolute position of hands and the relative joints we are measuring, and then reconstructs the hands visually.
You can therefore have all three programs running, the stereoscopic analyzer feeding the hand-position recognizer data, and the hand-position recognizer feeding the renderer data, so that your screen shows how the renderer is getting the info about where your hands are. Mostly, however you move your hands will be reflected on the screen, but if you move them very quickly and unusually you can still confuse the hand-position analyzer and get an image that's out of sync with what your hand is actually doing. This is independent of the stereoscopic analyzer, which comes up with the correct data; if you feed that directly to the renderer, you see that it always matches what your hand is doing, at 30 fps.
So now I've outlined how we get the position of joints, which includes quite a bit of fuzziness. But by far the most fuzziness is not in this, but in the actual "recognition" of a GESTURE. We've already gotten the first-generation information about what a gesture is by spending several hours each in front of a test server set up for it, already equipped with a popular voice command system, and agreeing to surf the web and do various other tasks the voice command system is equipped for (we didn't make that, it's just purchased off the floor somewhere) while also doing the gesture we have set up for each command. So we end up with "sample" gestures to analyze, and have already manually looked at the major indicators and drawn them up and programmed them. The way we did it the first time is very crude, however, since we eyed each sample ourselves, but we are now in the process of collecting second-generation information: when a user successfully uses a gesture and doesn't complain that it wasn't what he wanted, that particular instance of gesturing gets put into the database of gesturing instances associated with a gesture, and we are developing fuzzy logic to link these gestures more closely and reliably. The gestures make sense for the most part, such as having your right thumb open to the left with your other fingers closed, in a quick leftward motion to go back, or up and with a quick rightward motion to be right. Stopping is pushing your palm forward toward the screen, closing a window is putting your finger and thumb together and drawing your hand back, as if you're flicking the window away, and refresh is a sweeping gesture with your palm toward you, from bottom left toward top-right (only a small part of the way). The software recognizes a "gesture" because you perform it particularly fast and deliberately, so if you are playing with your hands slowly, it doesn't misrecognize any of these.
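The last point, treating only fast and deliberate movement as a candidate gesture, can be sketched with a simple speed threshold. This is an illustrative guess at the idea, not the poster's recognizer; the sample format `(t, x, y)`, the `min_speed` parameter and the function name are assumptions.

```python
import math

def fast_segments(samples, min_speed):
    """Find runs of consecutive samples where the hand moves quickly.

    samples: list of (t, x, y) tuples ordered by time.
    Returns a list of (start_index, end_index) pairs covering each run
    whose sample-to-sample speed stays at or above min_speed.
    """
    segments = []
    start = None
    for i in range(1, len(samples)):
        t0, x0, y0 = samples[i - 1]
        t1, x1, y1 = samples[i]
        dt = t1 - t0
        speed = math.hypot(x1 - x0, y1 - y0) / dt if dt > 0 else 0.0
        if speed >= min_speed:
            if start is None:
                start = i - 1  # the run begins at the previous sample
        elif start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(samples) - 1))
    return segments

if __name__ == "__main__":
    track = [(0, 0, 0), (1, 1, 0), (2, 2, 0), (3, 22, 0),
             (4, 42, 0), (5, 62, 0), (6, 63, 0), (7, 64, 0)]
    print(fast_segments(track, min_speed=10))
```

Only the samples inside a returned segment would then be handed to the actual gesture matcher, so slow fidgeting never reaches it.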
Anyway I'm getting really tired of typing all this, and even though there is much, much, more, I'm just kidding. Wouldn't all this be cool though?

~
IBM's new laptop... by Liquid-Gecka · 2001-04-26 05:14 · Score: 5

IBM has a new laptop that is awesome for gesture navigation. It is large and heavy, but it opens up with a notebook on one side, and the laptop/monitor on the other. It has both a normal laptop mouse and a pen mouse. The pen mouse can be used on the screen, or on the pad beside the laptop. It comes with a documentation program that allows you to write/draw into the software itself =) It's _REALLY_ cool... the pen allows you to do gesture-type actions just like you were writing them down!
Re:pen based and glidepads by Magumbo · 2001-04-26 05:23 · Score: 5

It seems people either love or hate these pen devices. Personally I use a USB 6x8 Wacom tablet with the Intuos pen, and totally love it. It works well under linux, macos, and win2k.
You can customize how you want it to behave (map the screen to the tablet, or use a mouse-like interface), the pressure sensitivity thresholds, macros for the two buttons, angle behavior, and eraser behavior/sensitivity. On win and mac you can easily set these independently for different programs. Another cool feature is that you can buy multiple pens (which I find pretty comfortable, btw) and have independent settings for each one.
I'll be the first to admit it does take a while to get used to using one. But after playing around with it for a while I fell in love with it.
They are a bit costly, but well worth it. Last I heard, Wacom was selling refurbished ones at nice discounts.

Mice 1 Everything Else 0 by 8934tioegkldxf · 2001-04-26 05:20 · Score: 5

I find the mouse is excellent for this sort of thing. However, I have a Logitech Mouseman which fits my hand perfectly and I have very high sensitivity set and acceleration turned on. A gesture for me means moving my mouse within an area no bigger than about 1/4" x 1/4". Most people have their mouse sensitivity set way too low.

The only better device would be a 3D glove since you could do 3D motions, which gives a much larger domain for your gestures to be in, probably making it both easier to remember them and less likely you'll mess them up. But don't sneeze or you may delete your root directory.

BTW, Black and White sucks. A whole 5 levels, and WAY too much wood required to do anything. If I wanted to do the same task over and over and over again for hours on end I'd get a job in a factory and get paid for it. And how do you become evil? I taught my creature to eat people, I destroy entire villages, I set people on fire, fling them into mountains, sacrifice 'em all over the place, starve them to death and I'm a GOOD God? They got some good weed down at Lionhead, uh-huh.