Capturing 3D Surfaces Simply With a Flash Camera
MojoKid writes with this excerpt from Hot Hardware (linking to a video demonstration): "Creating 3D maps and worlds can be extremely labor intensive and time consuming. Also, the final result might not be all that accurate or realistic.
A new technique developed by scientists at The University of Manchester's School of Computer Science and Dolby Canada, however, might make capturing depth and textures for 3D surfaces as simple as shooting two pictures with a digital camera — one with flash and one without. First an image of a surface is captured without flash. The problem is that the different colors of a surface also reflect light differently, making it difficult to determine if the brightness difference is a function of depth or color.
By taking a second photo with flash, however, the accurate colors of all visible portions of the surface can be captured. The two captured images essentially become a reflectance map (albedo) and a depth map (height field)."
Bah! I completed my last project in exactly 6 days and used nothing but voice commands. It turned out so well I sat on my couch and ate Cheetos the entire next day. Today, there are over 6 billion users and we're only now starting to run into scalability issues.
-God
I'm a big tall mofo.
...all sorts of problems become simple. I'd love to take a picture with some mirrors, some windows, maybe a reflective sign or two in the background, and see the funhouse effects that result. Oh, and don't forget emissive elements (lamps), which will appear to recede to infinity.
Slashdot (can't be bothered to find it) had a story several years ago about the (then old!) technique of capturing complicated 3D objects, such as car engines, by using two flash images, each taken with the flash in a slightly different location. Thresholding the difference between the images gives very nice edge detection, along with very accurate depth information.
A project I'm working on uses the technique to capture information about arrowheads/spearheads.
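For anyone curious what "thresholding the difference" looks like in practice, here is a minimal sketch in plain Python. The toy pixel values, the function name `threshold_difference`, and the threshold of 50 are all made up for illustration; a real pipeline would work on aligned camera images and would need noise handling.

```python
def threshold_difference(img_a, img_b, threshold):
    """Return a binary mask: 1 where the two images differ by more than `threshold`."""
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]

# Toy grayscale images (0-255) of the same scene, assumed already aligned.
# With the flash on one side, the shadow falls in column 2;
# with the flash on the other side, it falls in column 1.
flash_left = [
    [200, 200, 40, 200],
    [200, 200, 40, 200],
    [200, 200, 40, 200],
]
flash_right = [
    [200, 40, 200, 200],
    [200, 40, 200, 200],
    [200, 40, 200, 200],
]

edge_mask = threshold_difference(flash_left, flash_right, threshold=50)
for row in edge_mask:
    print(row)
```

Only the pixels where the shadow moved survive the threshold, which is why the result hugs the depth edges.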
They make a version of Flash for digital cameras? Is it secure?
now we need to go OSS in diesel cars
This is quite unusual for a university. Many schools have a department of computer science or a school of computer science. But combining that with a school of Dolby Canada is quite unusual. What kind of degrees in Dolby Canada do they offer? :-)
TFA requires Flash.
Why didn't you just link to the more informative New Scientist article that the blog you linked quoted?
mcgrew's razor: Never attribute to stupidity that which can be explained by greedy self-interest
Homemades.
http://www.ghouse.com/daniel/stereoscopy/equipment/index.html
http://www.teamdroid.com/how-to-make-a-cheap-digital-camera/
Store
http://www.3dstereo.com/viewmaster/cam-kal.html
That's really freakin cool. How long before there's a GIMP plugin for this? I'd like it by 3pm Pacific please.
8 years ago a manager in my lab thought that you could use a digital camera to get a 3D mesh model of whatever you photographed. It's a digital camera right? It took months for us to explain what a digital camera really was. Maybe he should have been teaching us!
-516
This is just a way to automatically generate surface bump maps. It does not really capture depth information (like a Z-buffer).
Conceptually it seems simple enough: take a photo with shadows from a light source not in line with the camera, take another where all the shadows are in line with the camera (making them virtually invisible), tell the software which direction the light is coming from in the first photo, and let it figure out the relative height of each pixel by analysing the difference between it and the uniform (flash-lit) version, after averaging the brightness of the two. It's similar to the technique some film scanners use to automatically remove scratches.
I can think of a lot of cases where it won't work at all (shiny objects, detached layers, photos with multiple "natural" light sources, photos with long shadows), but still, for stuff like rock or tree bark textures it should save a lot of time. As the video suggests, this should be very useful for archaeologists.
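A rough sketch of the comparison step, under assumptions of my own rather than anything confirmed by the article: the surface is matte, the ambient photo is roughly colour times shading, and the flash photo is roughly colour alone. Dividing one by the other then cancels the colour and leaves the shading. The function name `shading_ratio` and the numbers are hypothetical.

```python
def shading_ratio(ambient, flash, eps=1e-6):
    """Divide the ambient image by the flash image, pixel by pixel.

    Under the assumption ambient = colour * shading and flash = colour,
    the colour cancels and only the shading term remains.
    """
    return [
        [a / max(f, eps) for a, f in zip(row_a, row_f)]
        for row_a, row_f in zip(ambient, flash)
    ]

# Toy 2x2 scene: a checkerboard of dark and light paint (colour),
# with half of the pixels in partial shadow (shading).
colour = [[0.2, 0.8], [0.8, 0.2]]
shading = [[1.0, 0.5], [0.5, 1.0]]

ambient = [[c * s for c, s in zip(rc, rs)] for rc, rs in zip(colour, shading)]
flash = colour  # assumed: the flash lights everything evenly

recovered = shading_ratio(ambient, flash)
print(recovered)
```

The paint pattern drops out of `recovered`, which is the point: brightness differences caused by colour no longer get mistaken for relief.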
Probably has significant potential in the pr0n industry.
First an image of a surface is captured with flash. The problem is that the different colors of a surface also reflect light differently, making it difficult to determine if the brightness difference is a function of depth or color. By taking a second photo without flash, however, the accurate colors of all visible portions of the surface can be captured.
This is reversed: the flash-lit image will show you the reflectance (and possibly some depth) information, whereas the non-flash-lit image will show you the bare color map for the scene (provided the scene is properly lit to begin with). FTFY!
Much like the printing press, I can only assume this technology will find its first commercial success in pornography. Some angles are worth hiding.
First it's barnacles on the web and now cameras. Is there anything that annoying Flash crap won't infect? ... (mutters to self about the good old days ...)
Why not cameras that use different wavelengths of light, etc? For example, one that works in visible light, and one that works in infrared?
How about the use of different polarized lenses to block certain wavelengths of light?
I noticed this in Photoshop: if you pick a circular brush and paint in white on a black background, you can "paint" quasi-3D landscapes, because of the way perspective works. You can then turn the result into a height map; Supreme Commander uses a similar (or the same) method.
It sounds like they just figured out how to use photographic techniques to make a height map.
Can anyone elucidate why this is so whizbang neato when we've had 3D photography ever since someone with a camera figured out about parallax? Why is this different from stereoscopy?
Bemused,
"What in the name of Fats Waller is that?"
"A four-foot prune."
I wonder how well this works with faces. If it works well, it could be an easy way to create 3D head busts to use as "icons" in your contact list.
Caution: Do not use camera, flash or not, around minors, some asians, some tribes of africa and south america, or anyone in the protection of the united states federal government. Use of camera in any of these situations can result in physical harm or jail time.
3d goatse! Awesome!
Also, that would be quite a depth calculation!
Hi!
I know they're not as conspicuous as they could be, but there are frequently stories included near the body of the new story. It took me a while to dig this one up (I remembered posting it, but that was several thousand posts ago, and a few years, too), so I hope people notice it.
https://science.slashdot.org/article.pl?sid=04/12/01/0238222
Cheers,
timothy
jrnl: http://tinyurl.com/c2l8yr / foes: http://tinyurl.com/ckjno5
"shooting two pictures with a digital camera -- one with flash and one without. "
This difference has already been well-expressed across the internet for years.
Unfortunately unlocking the minigame can be nearly impossible if you have the wrong arbitrarily-assigned game character. Of course you could modify your character and change your character's gear to make it a little easier, but that's even more work and expense and doesn't make a big difference. There's also a way to pay your way into one minigame session but you'll have to be discreet about it unless you want to start another minigame that involves a lot of not-fun stuff like carefully balancing a slippery bar of soap.
"When information is power, privacy is freedom" - Jah-Wren Ryel
Obligatory XKCD
Rule of Slashdot #0: You and people like you are not representative of the larger population. - A.C.
Both approaches require taking two photographs, so I confess I don't see too much difference that way. Part of what I'm confused about, I guess, is why it's easier to reconstruct 3D-ness from flash+nonflash rather than from parallax. Per your point, yes, stereoscopy has no depth per se, but then neither does flash+nonflash, really, which appears to be suggested by this bit:
Reading through it again, I think what's important about this approach has much more to do with lighting and surface textures than with 3D:
Cheers,
But still could be good for quick and dirty bumpmaps.
---- Booth was a patriot ----
This actually isn't all that different from some methods I've seen to generate 3D geometry of a subject using cameras and lighting. One method in particular uses cameras mounted in strategic locations around the subject while a DLP projector rapidly displays a series of light and dark line patterns across the subject's surface, and the cameras photograph those lines.
Not quite as cool as a 3D scanner using lasers, but it seems to be easier on subjects like humans or animals that tend to move a lot.
8==8 Bones 8==8
You're right that this still requires two pictures, but they are taken from the same point of view. You don't have to move the camera, re-focus, etc. To get stereoscopy to look right for human eyes, the cameras need to be just the right distance apart, otherwise things look weird or out of scale. I'd imagine you'd have a similar issue with computer processing. To get much depth with parallax I think you need to have the camera shots a good distance apart as well, especially if you are trying to photograph something mostly planar (like Mayan carvings on stone temples). This should be able to pick up those finer things easily.
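To put rough numbers on the baseline point, here is a small sketch using the standard pinhole stereo relation, disparity = focal length * baseline / depth. The focal length, baselines, and distances are invented for illustration and do not come from the article.

```python
def disparity_px(focal_px, baseline_m, depth_m):
    """Pixel disparity of a point at `depth_m` for two parallel pinhole cameras."""
    return focal_px * baseline_m / depth_m

FOCAL_PX = 1000.0   # assumed focal length, in pixels
WALL_M = 2.00       # assumed distance to a flat carved wall
RELIEF_M = 0.01     # assumed carving depth: 1 cm of relief

def relief_shift(baseline_m):
    """How many pixels of disparity the relief produces at a given baseline."""
    near = disparity_px(FOCAL_PX, baseline_m, WALL_M - RELIEF_M)
    far = disparity_px(FOCAL_PX, baseline_m, WALL_M)
    return near - far

shift_wide = relief_shift(0.10)    # cameras 10 cm apart
shift_narrow = relief_shift(0.01)  # cameras 1 cm apart

print(f"10 cm baseline: {shift_wide:.3f} px")
print(f" 1 cm baseline: {shift_narrow:.3f} px")
```

With these made-up numbers, a centimetre of relief moves a feature by about a quarter of a pixel even with the wide baseline, and ten times less with the narrow one, which is why shallow carvings are hard to recover from parallax alone.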
Your bit about the lighting and surface textures, that's the sense I got from the video as well. What they seem to be doing is using the flash to get the correct color of the object. By using that, they can determine how far back an object is set (based on how much darker it is) and that is where the depth comes from (at least at a very basic level).
Still, it's a very neat idea and very approachable. As one of the project people mentions near the end of the video bump maps for games are created by hand. I'd imagine if I could just take two pictures (one with flash, one without) and get some depth information I could play around with that idea very easily on my computer and come up with something neat. Compare that to taking two (or more) shots from different parts, trying to match everything up, etc.
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
All you need to do is rig your cellphone to emit a high frequency pulse and then post-process the sonar to get a 3D map of the environment. I saw Morgan Freeman do it...
Here's my take on it:
Parallax and stereoscopy won't give you 3d information. There is no depth field. You've got two 2d images from which 3d information can be speculated, but no 3d information.
This technique sounds like it would give you two arrays: the first array would be a colour map, the second array would be a height map. This would be done basically by taking the image without any flash, which would have no distance cues based on distance from the flash lens, and comparing it to the image with flash, which would be lit in such a way that the brightness falls off with the square of the distance. Having both pictures, and having physics, will let you create a distance map. Then use the first picture, and apply it as a skin over the distance map created by the second, and you've got something you didn't have before: a true 3D image, which you should be able to rotate and look at. You won't be able to see behind the object or anything insane like that, but you could conceivably take two pictures of someone's face, and get a 3D snapshot of the face which would require only small changes to look normal.
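As a sketch of that inverse-square idea (this is the model described in this comment, not necessarily what the researchers do): treat the no-flash photo as the colour, subtract it from the flash photo to get the light added by the flash alone, and assume that added light equals colour / distance^2. The function name and all values are hypothetical, and the result is only a relative distance.

```python
import math

def relative_distance(no_flash, with_flash, eps=1e-6):
    """Estimate relative distance per pixel from a no-flash / flash pair.

    Assumed model: flash_only = colour / distance**2, with the no-flash
    image standing in for colour. Solving gives
    distance = sqrt(colour / flash_only).
    """
    result = []
    for row_n, row_f in zip(no_flash, with_flash):
        row = []
        for n, f in zip(row_n, row_f):
            flash_only = max(f - n, eps)  # light contributed by the flash
            row.append(math.sqrt(n / flash_only))
        result.append(row)
    return result

# Toy one-row scene: same colour (0.4) everywhere, but the second pixel
# is twice as far away, so the flash adds only a quarter of the light.
no_flash = [[0.4, 0.4]]
with_flash = [[0.4 + 0.4 / 1.0**2, 0.4 + 0.4 / 2.0**2]]

distance_map = relative_distance(no_flash, with_flash)
print(distance_map)
```

Shiny surfaces, shadows in the no-flash shot, and uneven ambient light all break this model, so treat it as a toy.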
It's been a long time.
One, I wonder what the results would be if this was implemented with a standard emulsion film camera and a double exposure of the same film, one exposure being with a flash, the other without. I no longer own an emulsion film camera, so I cannot test it and evaluate the results.
Two, this also might explain something odd I experienced in the desert of California, at a ghost town called Rhyolite. My wife and I were approaching the town late at night and we noticed bright flashes coming from the direction of the ghost town. Very bright. As we approached closer, we saw that someone was aiming what appeared to be a modified, hand-held aircraft landing light (with a momentary trigger) at the old bank building. They had a single camera set up and then proceeded to light the outside of the building, from different angles, repeatedly. They did this for quite some time. I am not sure if they had a single, running exposure going, or multiple exposures. I am not sure what their goal was, but this might be an explanation. They could quite possibly have been trying to achieve a 3D effect with emulsion film (this was 20+ years ago, so I doubt they were doing digital photography).
Just a couple thoughts.
"You won't be able to see behind the object or anything insane like that, but you could conceivably take two pictures of someone's face, and get a 3d snapshot of the face which would require only small changes to look normal."
I seem to recall a short story somewhere (can't remember where, or by whom) where the protagonist was working with the same kind of technology but found that he COULD see the back of objects. If I remember correctly, when he went and actually looked at the back of the objects he had photographed, what he observed was entirely different than what his camera recorded. Essentially, he had stumbled onto a sort of alternate reality.
Fiction, obviously, but it made for an interesting read.
My camera already has a feature for this. It has a mode that takes two consecutive pictures (one with flash and one without). All I need now is a little software and I have a 3D camera. :-)
Seriously, how was such accuracy determined, and to what precision can depth "measurements" be made?
My project isn't *extremely* concerned with precision, but for a monochromatic light source and a nice background, one can easily obtain depths to ~1/50 mm from shadow-shifts. This is about one part in 500 of the object height. For two monochromatic sources, the precision increases to about 1/70 mm. More sources increase the precision a bit, but due to specularity and diffraction effects, white light decreases the precision a little bit.
The article says "The two captured images essentially become a reflectance map (albedo) and a depth map (height field)." That statement is completely absurd! The depth map is _exactly_ what we want to obtain... It's the final output, and not one of the inputs. If we have the depth map, it means we have the 3D model of the surface we are looking at.
All limitations of these techniques were already highlighted above (needs Lambertian surfaces, etc). I would just like to point out that one important step is surface reconstruction. These methods often end up estimating just the direction of the gradient of the surface, and it is by "integrating" this gradient-direction that we finally get the height of the pixels.
This "integration" is not easy to do, by the way. It's not easy to get a depth map. It's what computer vision is all about. How could it be that a normal picture could be directly used as such???
What makes me angry is that the "essentially becomes" in the text gives the impression that the person who wrote that knew what he/she/it was talking about... :P
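To make the "integration" step concrete, here is a deliberately naive sketch: given estimated slopes along one scanline, a running sum turns them into heights. The function name, step size, and slope values are made up. Real reconstructions work in 2D, and because noisy gradients from different directions disagree, they usually solve a least-squares problem rather than summing along rows like this.

```python
def integrate_row(slopes, start_height=0.0, step=1.0):
    """Turn per-pixel slopes along one scanline into heights by running sum."""
    heights = [start_height]
    for slope in slopes:
        # Each slope is the height change per unit step to the next pixel.
        heights.append(heights[-1] + slope * step)
    return heights

# Toy scanline: climb for two pixels, stay flat, then drop back down.
slopes = [1.0, 1.0, 0.0, -2.0]
profile = integrate_row(slopes)
print(profile)
```

Even in this toy version an error in any one slope shifts every height after it, which hints at why getting a clean depth map out of gradients is hard.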
Parallax and stereoscopy won't give you 3d information. There is no depth field. You've got two 2d images from which 3d information can be speculated, but no 3d information.
What does it even mean "3D information can be speculated but no 3D information"? The information is either there or it's not. Of course it only gives you information on what both cameras see, which in some cases might even be all the 3D information you can get from the scene (picture shooting a "bumpy" wall, or really anything else which you can see in its entirety from some point), but it's a bunch of information you can extract "3D information" from.
You just got troll'd!
You mean like what happens to some people during near death experiences?
They could have been making a panoramic high dynamic range image. From the wiki: "Probably the first practical application of HDRI was by the movie industry in late 1980s and, in 1985"
It's still neat, though.
this up! Funny shit, man.
Not sure what you mean by that.
Another story that came to mind was "The Sun Dog" by Stephen King.
I'm talking about people having NDEs who report seeing themselves out of their bodies and being able to fly across the room and even read some inscription under the operating table.
Actually, it sounds more like a near-Depth experience.