Domain: ptgrey.com
Stories and comments across the archive that link to ptgrey.com.
Comments · 23
-
Re:Wow, an array of photovoltaic cells.
3. I assert that any camera will have a housing with more surface area than its lens.
This and several other models. The lenses used for these usually have much more surface area than the camera itself, and weigh significantly more to boot.
-
This is not new
-
old tech
Point Grey Research has been doing this for years and years with their Ladybug spherical cameras (e.g. http://www.ptgrey.com/products/ladybug3/Ladybug3_360_video_camera.asp). They have a much more innovative many-camera unit, too: it is a PCI-E device (weird, but high throughput!) with a square array: http://www.ptgrey.com/products/profusion25/index.asp
-
Point Grey Research?
Depending on your needs, there's the company that made the first spherical vision camera, Point Grey Research [ptgrey.com]. They had one of their cameras go up on a shuttle mission. Dunno how much they run, but their current "cheap" camera weighs a kilo, and runs at 30fps at 1024x768 x 6 cameras.
-
55 pounds?
That's heavy for what's essentially a laptop with wheels.
Apparently its main sensors are just little IR ranging devices. Those things are basically non-contact bumpers. Not too impressive. It really is a rehash of 1980s technology. I don't see much use for a 55 pound dumbbot. Robotics is way beyond that point.
This thing ought to have at least two cameras, stereo vision, and SLAM software. Wouldn't add that much to the cost, and they have the needed CPU power onboard. A pair of webcam chips mounted rigidly to the same frame, so that they stay aligned within a pixel, would make stereo vision work. You can buy stereo camera pairs for robotics, but they cost too much because they're made in tiny quantities. Made by a toy manufacturer, they'd be no more expensive than two standard webcams.
-
Re:This story is useless without pictures
Maybe something similar to this: http://www.ptgrey.com/products/spherical.asp
-
PointGrey
We developed a motion capture system for a bit of a different application (6-DOF joint movement tracking for biomed research). Getting the cameras working is trivial compared to the processing required to actually get 3D motion capture working reliably. Of course, we were going for something that probably has to be much more accurate than what you need, but it's not a trivial matter to write the software for something like this. Of course, there may be stuff out there you can use. Anyway, here's a brief rundown of what we were using and what might help.
We used 3 Point Grey cameras (the Flea). They might be a bit expensive for what you want, but all their cameras are top-quality, and they are intended for use in computer vision applications, and thus come with some source code pre-written that you can use for the interface. http://www.ptgrey.com/
We actually used Windows machines at the request of the lab, but we did originally look into Linux. The cameras are FireWire, and the best Linux drivers we've found for these types of cameras are the libdc1394 drivers on SourceForge: http://sourceforge.net/projects/libdc1394/
The Open Computer Vision library (OpenCV) is also invaluable. It has a lot of pre-written functions to deal with the more basic processing problems. It's got most of the major filters and algorithms in there that you'll need to extract the info from the camera pictures. Here: http://sourceforge.net/projects/opencvlibrary
I'm not sure how you're planning on combining or calibrating the system, but we used a static set of known coordinates and used DLTs to actually give the real 3D coordinates. A good tutorial is here: http://www.kwon3d.com/theory/dlt/dlt.html
Lastly, good trackers can really help the processing a good deal. Our trackers used a 4-ball system because we needed the accuracy and references for the angular rotations, but even a 1-ball tracker can be well designed. If the ball is a significant bright point on the image, simple thresholding is all that is needed in terms of preprocessing before you extract location in the image. Reflective paint or another bright source is KEY. If you're going colour, a distinct colour is also a good option.
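To make the thresholding step concrete, here is a minimal sketch in plain Python. It is not the lab's code: the image is assumed to be a list of rows of 0-255 grey values, and the name `bright_centroid` and the default threshold of 200 are hypothetical choices for illustration.

```python
def bright_centroid(image, threshold=200):
    """Return the (row, col) centroid of all pixels brighter than threshold.

    image is a list of rows of grey values (0-255).
    Returns None when no pixel passes the threshold.
    """
    count = 0
    row_sum = 0.0
    col_sum = 0.0
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value > threshold:
                count += 1
                row_sum += r
                col_sum += c
    if count == 0:
        return None
    return (row_sum / count, col_sum / count)


# Hypothetical 5x5 frame with one bright 2x2 marker blob.
frame = [
    [10, 12, 11, 10, 9],
    [11, 250, 252, 10, 9],
    [12, 251, 255, 11, 10],
    [10, 11, 12, 10, 9],
    [9, 10, 11, 10, 9],
]
marker = bright_centroid(frame)  # centre of the blob, in pixel coordinates
```

With several markers in view you would label connected blobs first and take one centroid per blob; the single-blob case is only the simplest version.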
good luck.
-
Re:cheap homebrew infra-red mocap rig
Point Grey makes FireWire cameras plus an SDK for computer vision research, and I believe the cameras automatically sync if put on the same FireWire connection.
-
See 360 degree video in action
My current work project deals with real-time presentation of a 360 degree video feed. We used the Ladybug camera to record 30min of footage in a contemporary glass studio. The video feed is later projected inside a hemisphere. You use a trackball to change your viewing direction at a rock solid 60Hz update rate.
Note that having a good 3D sound system is essential for this type of installation. Since you only see half of the world, a full 3D sound field can give you important clues about things which happen behind you.
We'll show this starting October 6 at the Sydney Powerhouse Museum (Australia). Enjoy.
-
Ladybug video camera
A shameless plug for another project like this one, the work done by Point Grey:
http://www.ptgrey.com/products/spherical.html
-
You can buy two existing similar systems
At Siggraph this year, there were two similar systems on display. They are unbelievably cool.
1) Point Grey's Ladybug2 has five cameras mounted in a box about the size of, say, a stack of three decks of cards.
2) Immersive Media's system has 11 (!) cameras in a sphere about 2 inches on a side.
Both systems do real-time stitching of the multiple images into a panorama.
We're looking into them for the obvious motion-picture visual effects applications. The resolution (both spatial and dynamic) is not ideal for motion-picture work, but the ability to have an extremely small, lightweight, panoramic capture is a tradeoff that is worthy of pursuit. In the past (say, on The Fast and The Furious) we used six ARRI 435 cameras mounted to the side of a motorcycle, to the tune of several thousand dollars a day rental, hundreds of pounds of weight, and fairly compromised images in other ways (bad lens flare, extremely bouncy images.)
Thad Beier
Hammerhead Productions
-
Re:A few questions...
I worked on Virginia Tech's entries into the Grand Challenge, specifically on the vision system. We use a Bumblebee stereoscopic camera for depth perception and image processing. (http://www.ptgrey.com/products/bumblebee/)
Our entries (Cliff and Rocky, both of which are among the 40 finalists) use scanning laser rangefinders (LRFs) and the aforementioned Bumblebee stereoscopic camera. The laser rangefinder technology is nearly foolproof in many ways. The results are real (except sometimes when interacting with puddles) and the I/O interface is simple. The problems with LRFs are that they can be spoofed by dust and rain, cannot tell a piece of shrubbery from a boulder, and will miss important information like a chain link fence.
The problem with a stereoscopic camera used solely for stereo processing is the sheer intensiveness of the process. Depth perception (both range and accuracy) depends entirely on the number of pixels used in the processing. The amount of computational power a team of 40 or so engineering students can fit on a club car is limited. Better depth perception means longer refresh times, which means that boulder that was 30 meters away when you took the pictures is probably lodged in your engine by the time you send the processed data to whatever program is running your path planning. (We used a behavioral-based approach of obstacle avoidance on Cliff, and the A* algorithm (http://upe.acm.jhu.edu/websites/Benny_Tsai/Introduction%20to%20AStar.htm) for Rocky.)
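For readers who haven't met A* before, a minimal grid version is sketched below. This is not the planner that ran on Rocky; the 4-connected grid, unit step cost, Manhattan heuristic, and the names `astar` and `grid` are all assumptions made for illustration.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) / 1 (blocked) cells.

    Returns the list of (row, col) cells from start to goal, or None.
    """
    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {start: None}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = cell
                    priority = new_cost + heuristic((nr, nc))
                    heapq.heappush(open_heap, (priority, new_cost, (nr, nc)))
    return None


# Hypothetical map: a wall across the middle row with a gap on the right.
grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
path = astar(grid, (0, 0), (2, 0))  # has to detour through the gap
```

On a real vehicle the grid cells would come from the obstacle map built by the sensors, and the step cost would usually reflect terrain rather than being a constant 1.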
The way that our genius programmer - not me - approached the problem was to use a whole bunch of other algorithms using a single image to identify likely road areas, and then only process that area in stereo. It worked very well on a noisy test course at 10 mph, though that's still too slow to complete the challenge. Anyway, go Hokies!
-
A Camera that could work
I agree with the bulk of the comments, that there is little to no market for HD video conferencing, but if you want to give it a go check out this camera. It's not terribly expensive (about $1000). The output is FireWire. The resolution is 1024x768, so you may have to do some cropping or scaling if you want it to match TV formats. Other industrial cameras may be just as good. You will just have to do some coding to turn the data from the camera into something useful.
-
Re:Most of them will never work
Why not a combination of stereovision, range finding, and a digital horizon to enable real time mapping based off a visual system?
Stereo vision has two fundamental limitations. First, it doesn't work very well unless the scene has clean, sharp edges to match up. Second, the accuracy decreases rapidly with range, because you're measuring a narrow triangle from angles at the base.
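To put rough numbers on the second limitation: for a rectified camera pair, range is Z = f * B / d (focal length in pixels, baseline, disparity in pixels), so a fixed matching error in d produces a range error that grows with Z squared. The sketch below assumes that standard pinhole model; the 800-pixel focal length and 12 cm baseline are made-up rig values, not any particular product.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Range in meters for a rectified stereo pair: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, range_m, disparity_error_px=1.0):
    """First-order range error for a given disparity error.

    Differentiating Z = f * B / d gives |dZ| = Z**2 / (f * B) * |dd|.
    """
    return range_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical rig: 800 px focal length, 0.12 m baseline, 1 px matching error.
errors = {z: depth_error(800.0, 0.12, z) for z in (5.0, 20.0, 50.0)}
```

With those assumed numbers a one-pixel error is about 0.26 m at 5 m but about 4.2 m at 20 m: four times the range, sixteen times the error.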
The algorithms for stereo vision aren't all that forgiving. There are basically two flavors. One finds and matches "features", usually corners. This works nicely for indoor scenes and badly on dirt roads. The other does a straightforward correlation between matching scan lines from two cameras, sliding them back and forth looking for the best match. This has a high false alarm rate on surfaces with high-frequency detail, like gravel roads.
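The scan-line correlation flavor can be sketched in a few lines. This is a toy version, assuming rectified images so that matching rows line up, and using sum of absolute differences (SAD) as the match score; the name `best_disparity`, the window size, and the sample rows are all hypothetical.

```python
def best_disparity(left_row, right_row, x, half_window=2, max_disparity=8):
    """Find the disparity at column x of left_row by sliding along right_row.

    Compares a window centred on left_row[x] with windows centred on
    right_row[x - d] for d = 0..max_disparity, scoring each with the sum
    of absolute differences (lower is better). The caller must keep
    x + half_window inside the row. Returns (disparity, cost).
    """
    best_d, best_cost = None, None
    for d in range(max_disparity + 1):
        if x - d - half_window < 0:
            break  # window would run off the left edge of the right image
        cost = 0
        for k in range(-half_window, half_window + 1):
            cost += abs(left_row[x + k] - right_row[x - d + k])
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost


# Hypothetical scan lines: one bright feature, shifted 3 pixels between views.
right = [10, 10, 10, 10, 50, 90, 200, 90, 50, 10, 10, 10, 10, 10, 10, 10]
left = [10, 10, 10] + right[:-3]
match = best_disparity(left, right, 9)
```

The gravel-road problem shows up directly in this scheme: on a repeating texture several offsets give nearly the same cost, and the minimum picks one of them more or less arbitrarily.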
Practical problems include the fact that correlation algorithms are sensitive to high-frequency noise, so any thermal noise from the camera is a major problem. Also, keeping two cameras aligned to within a pixel while jouncing along on an off-road vehicle requires a very rigid mounting with the cameras near the center of gravity along the inter-camera axis. (For an example of a good one, see the Bumblebee from Point Grey. They have the most successful stereo vision products.)
To date, the most successful outdoor stereo vision system used on a mobile robot was on the NASA Hyperion robot. They were able to achieve a range of about 7 meters on rocky terrain with hard edges. This is about a third the range that theory predicts. A DARPA Grand Challenge vehicle needs at least 20 meters of range, and if you want to go fast, 50 meters. You need 1.5 to 2x your stopping distance.
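The stopping-distance rule of thumb is easy to turn into a quick calculation. The sketch below assumes constant deceleration plus a fixed sensing-and-processing latency; the 0.5 s latency and 5 m/s^2 deceleration are illustrative guesses for an off-road vehicle, not measured values.

```python
def stopping_distance(speed_mps, latency_s=0.5, decel_mps2=5.0):
    """Distance covered before the vehicle stops.

    Travel during the processing latency plus braking distance v**2 / (2a).
    """
    return speed_mps * latency_s + speed_mps ** 2 / (2.0 * decel_mps2)


def required_range(speed_mps, margin=1.5, latency_s=0.5, decel_mps2=5.0):
    """Sensor range needed: a safety margin times the stopping distance."""
    return margin * stopping_distance(speed_mps, latency_s, decel_mps2)


# At 10 m/s (about 22 mph) with the assumed latency and deceleration:
low = required_range(10.0, margin=1.5)
high = required_range(10.0, margin=2.0)
```

Under those assumptions 10 m/s needs roughly 22 to 30 meters of usable range, which is already well past the 7 meters quoted above.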
We have a stereo camera setup working on my desk here, and we've had it for over a year. We've tried that.
Stereo from motion, where you work with successive frames from a single camera, has potential. The baseline is the distance you move between frames, which can be much bigger than the distance between two cameras. But people have been trying to make that work for years without much success. If you want to work on vision, that's a good problem. Especially since you can just take data from a camcorder and crunch on it - no special hardware required for development.
-
Many erroneous replies
The software exists in Windows, Mac, and Linux, the issue is that the camera has to support it. Some cameras send nothing over the firewire port unless it is in playback mode. It doesn't matter if you are using a Mac if the hardware will not send data. I have found very little information easily accessible about which cameras will support this. Some manufacturers will answer your questions.
For machine vision (which is why I have looked at this before), check out the firewire cameras at Point Grey Research. They have some really nice stuff and a great support staff.
-
Re:Nothing useful?
Almost all the work of robotics today (including the work of Rodney Brooks) is reactive and behavior-based.
Not really. Most of the good robotics work today involves some mapmaking and planning. Mapping and planning are less rigid than they used to be. But they're definitely in there. See, for example, the CMU automated forklift project.
Would you prefer the traditional deliberative, abstract-model forming robot design that was forwarded by Minsky in the 50s? These sort of robots would map the world and then move accordingly, but it took them *forever* to do so.
Yes, mapping and vision processing were really slow on a 0.3 MIPS DEC-10. Back then, it took minutes. When I did sonar mapping on a 6 MIPS PC/AT, it took seconds. Now that we have a few more orders of magnitude in compute power, that's not a problem.
-
Re:What's so difficult?
The concepts behind it aren't too difficult, a google search for epipolar geometry is a good place to start.
The biggest problems are computational; it's hard to do a good job of stereo reconstruction at high frame rates in real time. It's by no means impossible, and there are commercial systems out there that do it, like this one.
Two cameras aren't really necessary, either, if your camera is moving in the scene. It's possible to recover both the movement of a camera and 3-d information about a scene just by moving a camera through it. Googling for structure from motion is a good place to start looking into those techniques, and there's a pretty cool page about one group's application here.
In short, this company may have an interesting product (depending on cost and more details on the error characteristics) but this isn't something that couldn't be done with existing methods.
Also, as an aside, I find it interesting that they take a swipe at laser rangefinders as requiring a spinning mirror, when just about all IR cameras have a spinning "chopper" as an integral part of the exposure system...:)
-
Point Grey Research has something similar
Point Grey Research has been offering something similar for a few years now. Even runs on Linux. It's not free, but it has a track record. There's a downloadable demo.
Point Grey likes to use three-camera systems, with the cameras arranged in a triangle. This eliminates most ambiguities found with two-camera systems.
Algorithms to do this have been around for years, but only in the last few years has it become possible to do it in real time on commodity processors. Hans Moravec was the first, almost 30 years ago, back when it took him 20 minutes of mainframe time to process a stereo image. Point Grey was selling a DSP-based solution a few years ago. Now you can do it on consumer hardware.
Mobile robots should be getting much better shortly. Systems based on Polaroid sonars have the resolution of probing the world with the big end of a broom. Laser rangefinders cost way too much and have moving parts. Millimeter wave radar is complicated to use as an imager (although it opens most supermarket doors in the developed world.) Affordable, fast vision is finally here.
-
Re:Classic example of SMPA
Sigh.
Early thinking (60s-70s) really was to build a detailed model of the world, grind it down to simple primitives, and run a logic-based planner on it. That had a terrible time dealing with uncertainty and required a very regular world.
Moravec introduced the idea of "certainty grids", which are probabilistic occupancy maps. Originally, he used this as a means of getting useful data from ultrasonic rangefinders, which are very low resolution devices with slow data rates. (I've built a robot that works that way myself, and you really can get maps with more resolution than the sonar beam by taking enough samples as the robot moves.) As enough compute power became available, he moved to laser rangefinders (better resolution, but clunky rotating mirrors) and finally to passive stereo imaging.
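The way many weak readings add up to a confident map can be shown with a single grid cell. The sketch below uses the log-odds Bayesian update that is common in later occupancy-grid work, which is not necessarily the formula Moravec originally used; the function names and the 0.7 per-reading confidence are assumptions for illustration.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def update_cell(log_odds, p_occupied_given_reading):
    """Fold one sensor reading into a cell's log-odds of being occupied.

    Independent readings simply add in log-odds form.
    """
    return log_odds + logit(p_occupied_given_reading)


def probability(log_odds):
    """Convert a cell's log-odds back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))


# One cell, starting from "no idea" (p = 0.5, log-odds 0), hit by three
# noisy sonar returns that each say "occupied" with only 70% confidence.
cell = 0.0
for _ in range(3):
    cell = update_cell(cell, 0.7)
occupied = probability(cell)
```

Three 70% readings push the cell to roughly 93% occupied, which is the effect described above: individually poor samples, taken from different positions as the robot moves, sharpen into a usable map.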
What you get out of systems like this is a map of the neighborhood showing what's open space and what isn't. This is a good input to a repulsive-field type path planner. There's no need to extract a "primal sketch" or do any object recognition just to accomplish navigation using this approach. It works quite well; the CMU Navlab vehicles have been cruising around offroad on this technology for years now. The Denning guard robots used this technology with sonars.
Extracting range data from stereo imagery was Moravec's thesis topic in the 1970s. It took a mainframe computer 20 minutes per frame back then. Now it can be done in real time. There's commercial software for this. Two cameras are good; three cameras are better. It's actually not that hard; it's basically done by convolution. It's not done by edge recognition any more. Convolution is computationally expensive, but simple. We finally have enough compute power to do this stuff.
I've commented on Brooks' work previously, so I won't say any more about that now.
-
Integral photography
Integral photography dates from the early 20th century. Here's an explanation. It trades resolution for depth inefficiently, although there's been some work in the UK on compressing integral TV images.
This group is using the technique to extract depth using a single HDTV camera. That makes sense, although the approach is somewhat low-res. Depth extraction from stereo images is commercially available, and is an alternative to this approach.
-
Walking robots
That Honda robot is a few years old, but it's still a great piece of work.
Honda has the advantage of being an industrial company with a good mechanical engineering R&D operation. To build something like that, you need more technicians and machinists than researchers. Most robotics labs in the US are in computer science departments, and with the exception of the Field Robotics Center at CMU, aren't organized to build good machinery on a reasonable schedule. DoD isn't throwing money at this problem any more; they did in the 1970s and 1980s, and didn't see much for their money.
The hardware component state of the art is actually pretty good, which wasn't true a decade ago. Early robotics researchers wasted too much time on building radio links, motor controllers, encoders, and similar parts. Now you can buy all that stuff. Getting enough compute power onboard is now the easy part. Rate gyros and accelerometers are now stock, low-cost items. CCD TV cameras are easily available. Laser rangefinders are still big, clunky, and overpriced, but depth from stereo vision, after thirty years of work, now works well in real time.
Controlling a legged robot is a tough problem, but there's been a fair amount of work on balance. I have a patent in that area myself. Most of the work on legged locomotion is now going into animation and games, but the results will be useful in the real world.
Nobody makes money from mobile robots, though. A few companies have tried, notably Denning Robotics and HelpMate, but not with success. The basic problem is that robots compete with cheap people, and aren't much faster.
-
3D extraction from video
3D depth extraction can be done in real time. See Point Grey. [Their site's down today; I hope they didn't go out of business.] They build a nice hardware/software system with three cameras arranged in a triangle. Three-camera stereo works much better than two-camera; most of the ambiguous cases go away. Their hardware is overpriced and their software is closed-source, but maybe somebody will deal with that. The algorithm isn't that complicated, but it's really expensive computationally. Their first implementation used a DSP, a hardware convolver chip, and a Transputer, but they've since moved to more standard hardware.
Canoma is a re-implementation of some work done at U.C. Berkeley in the mid-90s. The Berkeley group liked to do big things like buildings, and modelled the central part of the Berkeley campus. They got their aerial photographs using a camera on a kite; there's an architecture prof at Berkeley who's developed good techniques for doing this. Much cheaper than a helicopter.
Both Canoma and Metaflash are semi-automatic systems. The user has to manually identify corresponding points and edges between multiple images. This can be a lot of work. One more generation and somebody will have this fully automated.