Cheap 3D Computer Vision?
InspectorPraline writes "According to this article at the New York Times [free reg req'd], a tech firm known as Tyzx is developing optics technology that will have three-dimensional capability -- using two cameras attached by a high-bandwidth connection to a custom processing card inside a PC. The article makes one believe that the system would have a top speed of as much as 132 stereo frames per second, which could be very useful in security systems. Of course, the real question is who's behind the cameras, but we can all drool over the other possibilities, right?"
No more taping the red and blue filters from my Mag-Lite to my eyelids any more! :-)
See the company's website for more details on the technology used. Here are some interesting publications; this one (PDF) is the core: Real-time Stereo Vision for Real-world Object Tracking.
Can real 3D be obtained with just two cameras?
Or is it merely 2.5D?
Regardless of where the cameras are, is there not still a plane which the cameras/software can't determine the "height" of?
Don't you need 3 cameras minimum for proper 3D?
Forget security, we all know it'll be used by the porn industry first!
Why, we'll make Rock Ridge think it was a chicken that got caught in a tractor's nuts!
It will be virtually impossible to palm chips or pull any of the other sleight (spelling?) of hand tricks that people do at card tables. I'm sure there are millions of other more interesting possibilities, but that, and stopping pickpockets, are the ones that came immediately to mind.
He tried to kill me with a forklift!
This is taken from the document Real-time Stereo Vision for Real-world Object Tracking:
.... the DeepSea chip may not be able to find a valid match for every pixel in the image. Large uniformly lit areas of a scene may have pixels of identical intensity; for pixels in such areas, no single match can be found. Pixels that correspond to an object that is invisible to one imager but visible to the other also do not have matching pixels.
... Once the matching process is complete, the range of each pixel can be calculated using the horizontal disparity of the matching pixels, the focal lengths of the lenses and the distance between them. The DeepSea chip designates the range of anomalous pixels as invalid.
:)) See also an HP document that partly covers the same material.
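For anyone wondering what that range calculation looks like, here is a minimal sketch in Python. It assumes the usual rectified pinhole stereo model, where range = focal length x baseline / disparity. The function name and the numbers (800 pixel focal length, 10 cm baseline) are made up for illustration and are not taken from the Tyzx paper.

```python
def range_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Return the range in meters, or None when the pixel has no valid match.

    Assumes rectified cameras with identical focal lengths (in pixels)
    separated horizontally by baseline_m meters.
    """
    if disparity_px is None or disparity_px <= 0:
        # No match found (or a zero disparity, i.e. a point at infinity):
        # treat the range as invalid, much as the quoted text describes.
        return None
    return focal_length_px * baseline_m / disparity_px


# Hypothetical setup: 800 px focal length, cameras 10 cm apart.
FOCAL_LENGTH_PX = 800.0
BASELINE_M = 0.10

for disparity in (40, 8, 1, None):
    print(disparity, range_from_disparity(disparity, FOCAL_LENGTH_PX, BASELINE_M))
```

Note how quickly precision falls off: with these assumed numbers, the step from a disparity of 2 pixels to 1 pixel covers the whole span from 40 m to 80 m, so distant objects are measured very coarsely.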
<clip>
The DeepSea chip is a hardware implementation of the census correspondence algorithm invented by Tyzx staff... The algorithm's key concept is transforming a pixel's numeric absolute intensity value into a bit string that represents the pixel's brightness relative to its neighboring pixels. For each pixel, the DeepSea chip examines the pixel's surrounding area, called a neighborhood. A typical neighborhood is 7x7 pixels centered on the subject pixel. Comparing a subject pixel's intensity to its neighbours, the chip produces a relative intensity map (shown in the document, page 8).
</clip>
(typos are mine)
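Here is a rough Python sketch of the census idea described in the clip, for the curious. It is a software illustration only, not the DeepSea implementation: the bit ordering, the "darker than the center" comparison, and the function names are my own assumptions. Census bit strings are commonly compared by Hamming distance (the number of differing bits), which is what the second function does.

```python
def census_transform(image, row, col, radius=3):
    """Encode a pixel as a bit string (stored in an int).

    Each bit says whether one neighbor is darker than the center pixel.
    radius=3 gives the 7x7 neighborhood mentioned in the clip.
    The caller must make sure the window fits inside the image.
    """
    center = image[row][col]
    bits = 0
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            if dr == 0 and dc == 0:
                continue  # the center is not compared with itself
            darker = image[row + dr][col + dc] < center
            bits = (bits << 1) | (1 if darker else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two census strings differ."""
    return bin(a ^ b).count("1")


# Tiny 3x3 example (radius=1), plus the same patch seen by a brighter camera.
patch = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
brighter = [[value + 100 for value in line] for line in patch]

code_a = census_transform(patch, 1, 1, radius=1)
code_b = census_transform(brighter, 1, 1, radius=1)
print(format(code_a, "08b"), format(code_b, "08b"), hamming_distance(code_a, code_b))
```

Because only the ordering of intensities matters, the two patches get the same bit string even though one camera reads 100 levels brighter. That is the appeal of the approach when the two imagers are not perfectly matched.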
The technology employed (both hardware and software) is limited. CMOS sensors of the type described suffer from poor signal-to-noise ratios as well as interlacing artifacts. Pixel jitter is of major importance in machine vision, and I doubt these sensors offer much clock control over and above the 1 pixel mark (if any).
The matching algorithm described is very primitive, assuming rotation in depth between views doesn't affect the scene's projection into the image (ooh, but it does). The census matching algorithm is very simple, and whilst it does recognise the problems of illumination variation, it fails to solve them in a manner you could describe as robust. Also, contrary to popular belief, you cannot robustly recover depth from every pixel in the image! There is no evidence that the human vision system does it (without knowledge of the object), so why are people trying it? Even if you attempt it, you are going to need some way of telling which data is more accurate in order to start using the results. Edges are your best bet, and I didn't see any evidence of preprocessing described in their system (although, to be fair, I only read it briefly).
I appreciate that this is supposed to be a cheap system and thus its limitations are probably to be expected. Might be fun to play with for a hundred Euros or so.
For a more state-of-the-art look at what is possible, you could do worse than take a look at TINA, an open source machine vision system with a very sophisticated stereo depth estimation algorithm (we even built a chip to accelerate it!)
-- "Can't sleep, clowns will eat me!"
I don't know what "inexpensive" means. It's all relative, and no real point of reference is given. If it truly is inexpensive, this could open up a market for lots of new products which track objects in 3D (real) environments where it just never made economic sense before.
Product ideas anyone?
-Pete
Soccer Goal Plans
But it is no longer in production and it is patented.
It works with any software, as it attaches to the front of the screen: surface mirrors and the idea of doing the View-Master 'on screen'.
I'll keep mine for a long time.
A description and pictures of it here
Patent here with description.
Of course, the real question is who's behind the cameras
When are slashdotters going to stop adding these kinds of remarks at the end of their news posts? I'm getting tired of all the paranoia and propaganda that is around on Slashdot, even if it is justified sometimes.
Can anyone think of more interesting apps for this? How about this one: a computer system that is able to referee a sports game.
(Score:5, Not Funny)
The two-camera approach requires relatively high performance. Is there a reason why a combination of a digital camera and a laser-based distance meter (accuracy is measured in millimeters) would not be more accurate, more reliable and less computationally demanding?
:) To me at least, this approach is also easier to comprehend than some magic algorithm.
Take an image and feed it to the laser distance-o-meter, which scans the distances and embeds the results with the image data. We could even have a matrix of lasers to measure the distances in a single shot. For example, 8x8 (64) beams would already be good for scanning an area of a few square meters, if the objects we are looking for are bigger than insects, of course.
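To put rough numbers on the 8x8 idea, here is a back-of-the-envelope sketch in Python. It assumes a hypothetical flat 2 m by 2 m target and evenly spaced beams, one per grid cell; none of these figures come from a real product.

```python
def beam_spacing_m(side_m, beams_per_side):
    """Distance between neighboring beams on a square target, in meters.

    Assumes the beams are spread evenly, one at the center of each cell.
    """
    return side_m / beams_per_side


def beam_positions_m(side_m, beams_per_side):
    """Coordinates of the beam centers along one side of the target."""
    spacing = beam_spacing_m(side_m, beams_per_side)
    return [spacing * (i + 0.5) for i in range(beams_per_side)]


SIDE_M = 2.0         # hypothetical target: 2 m x 2 m = 4 square meters
BEAMS_PER_SIDE = 8   # the 8x8 matrix suggested above

print("total beams:", BEAMS_PER_SIDE ** 2)
print("spacing (m):", beam_spacing_m(SIDE_M, BEAMS_PER_SIDE))
print("positions (m):", beam_positions_m(SIDE_M, BEAMS_PER_SIDE))
```

Under these assumptions the beams land 25 cm apart, so anything much narrower than that can slip between samples. A fixed 8x8 grid gives a very sparse depth map compared with per-pixel stereo, unless the beams are also scanned.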
Now nerds like me will finally have a chance to simulate reality even better without actually having to face it.
...
"In event of an emergency in 'real life sim 3000' press [enter] to pause and scroll up the history window to see what went wrong"
I wonder if cheat codes are applicable
my dvd discs won't fit in a:
Expect a whole new onslaught of X10 ads as soon as this technology becomes popular :(
"We must destroy X10! We must destroy all Internet ad!" - KOMPRESSOR
Today's weirdness is tomorrow's reason why. -- Hunter S. Thompson
The problem I can see with such technology is that it should be adjustable to your vision.
/. with red-blue glasses.
The article is about computers/robots seeing in 3d, not us. It will enable much more precise handling of objects in realtime, whatever the application might be. (insert ref to porn here)
3D glasses have already been with us for generations.
Destoo - reading
Game and technology news in French. TC
Close one eye. Can you still estimate the distances of objects around you? Of course you can. This demonstrates that there's much more to depth perception than stereo vision.
Stereo vision is inherently limited. It requires that the objects have sufficient texture so that points on the two stereo images can be correlated. Our depth perception relies on much more than stereo, e.g. common sense knowledge about the world, intuition about shading and lighting, etc.
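The texture requirement is easy to see with a toy matcher. The Python sketch below slides a small window along one scanline and scores each candidate disparity with a sum of absolute differences (SAD). SAD is a stand-in chosen for brevity, not what any particular product uses, and the pixel values are invented.

```python
def match_costs(left_row, right_row, x, half_window, max_disparity):
    """SAD cost of matching the left window centered at x against the
    right window centered at x - d, for every disparity d in 0..max_disparity.
    The caller must keep all windows inside the rows."""
    costs = []
    for d in range(max_disparity + 1):
        cost = 0
        for k in range(-half_window, half_window + 1):
            cost += abs(left_row[x + k] - right_row[x - d + k])
        costs.append(cost)
    return costs


def best_disparities(costs):
    """All disparities that tie for the lowest cost."""
    lowest = min(costs)
    return [d for d, cost in enumerate(costs) if cost == lowest]


# Textured scanline: the left view is the right view shifted by 2 pixels.
right_textured = [3, 9, 1, 7, 4, 8, 2, 6, 5, 0, 9, 3]
left_textured = [0, 0] + right_textured[:-2]

# Uniformly lit wall: every pixel has the same intensity in both views.
right_flat = [5] * 12
left_flat = [5] * 12

print(best_disparities(match_costs(left_textured, right_textured, 6, 1, 4)))
print(best_disparities(match_costs(left_flat, right_flat, 6, 1, 4)))
```

On the textured row exactly one disparity wins. On the flat row every candidate ties at zero cost, so the matcher has no basis for choosing a depth.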
This is similar to some work I did on Eye Gaze Tracking in my senior year at the University of Connecticut. The project page can be viewed here.
I wish I had done more with it; there are more applications for this than just tracking people in public. It can be used to keep the laser in the correct position if a person moves their eye during LASIK eye surgery. It can be used by a paraplegic to use a computer. And most importantly, it can be used to target in Quake3.
http://github.com/gbook/nidb
but we can all drool over the other possibilities, right?
You mean 3d pr0n?
Riiiight, and those X10 cameras are for surveillance too.
EDISON is a free C++ toolkit that performs edge detection and image segmentation. The image segmentation portion is based on mean-shift analysis.
A colleague and I are currently in the process of porting portions of EDISON to Java.
There is a company called Point Grey Research (http://www.ptgrey.com/) that has external binocular and trinocular stereo units for sale that use FireWire. They don't do the processing on the unit, but have algorithms that run on standard PCs to process the data for you. Pretty interesting little guys; the computer vision lab where I got my degree (http://cvrr.ucsd.edu) had 3 of the Triclops camera systems. They have a new one called the Bumblebee that looks to be cheaper and maybe does processing onboard?
There are Linux SDKs available also. Note that my version of Mozilla (version 1.0) doesn't load their page correctly; maybe some messy IE code?
__ No registration required to read this message. They did it in the Matrix.
Everything you've just described can be done with a single camera.
Reality is the ultimate Rorschach.
Not meaning to offend anyone here, but wasn't this all invented in the early 1900s?
History of Cameras
If so, then why is taking a picture with two cameras and then displaying them to people so they have stereoscopic vision so "computationally intensive"? It seems not too difficult to me. (What's really computationally intensive, though, would be rendering the two pics, but even then it only requires the "camera" to be shifted and two images to be rendered for each frame. So it requires O(f(x)) computation time, where f(x) is the time to render one picture, and I am guessing it's roughly double the computation time.)
Maybe I am missing something, though.
~ kjrose
No, we are not related.
Kryzx
"I don't know half of you half as well as I should like, and I like less than half of you half as well as you deserve."
No more radar guns for police (now you'll need an invisible car)
Fighter planes that don't need radar (but will need scads of cameras all over it -- both visible, infrared, and tetrawave)
Computerized athletic officiating (which may finally kill the politics of skating and gymnastics)
Better identity recognition software (now you don't have to face the camera)
Custom-tailored clothing (no more scanning mechanisms)
Automated grocery checkout (the machine identifies the fruits & veggies so that the clerk doesn't have to type in a 4-digit produce code)
Another reason for George Lucas to go back and re-film all 6 episodes into digital 3-D.
we can all drool over the other possibilities
It's always the same on Slashdot - somebody will eventually end up talking about p0rn...
I'm disgusted!!!
As others have commented, this technology seems no better than what is already out there. Who really cares? Why is this relevant news? Show me some serious attempt at real 3D vision any day.
I want one of those implants like in the movie Johnny Mnemonic. That way I can put on the 3D glasses, attach a processor to my spine and play Quake all day. No, Boss, these are prescription glasses, honest.
"On a long enough timeline, the survival rate for everyone drops to zero."
This does not mean that you did not learn depth cues such as perspective and relative size from other experiences, such as 3D perception. Simply because you have learned that certain shading patterns imply depth does not mean that you did not initially gather that information via stereo vision.
It's relatively easy to test your argument. A person blind in one eye from childhood would never be able to learn stereo vision. Yet it's VERY likely that he is still able to estimate distances.
The argument that he gathered distance information through moving and seeing an object from different angles and constructing 3D (or 2.5 D as some argue) in his head could be a good one. However, if you provide a photo of some scenery never seen before, this one eye viewer should still be able to estimate object boundaries and relative distances.
Simple rules like "object A covering object B is in front of it" play a much more important role than SV. SV is rather an addition to an already existing machinery, not its primary tool.
132 stereo frames per second, which could be very useful in security systems
The local video store puts all its p0rn under the "documentary" category. Has the codename been changed, and no one told me?
I'm struggling to think of any other products the technology could be useful for.
:)
There's always the universal correct answer to that kind of question - porn
Triple-D cups in 3-D
-
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
http://www.zipworld.com.au/~surturz/threed/3dindex . tm