Disney Algorithm Builds High-Res 3D Models From Ordinary Photos
Zothecula writes "Disney Research has developed an algorithm which can generate 3D computer models from 2D images in great detail, sufficient, it says, to meet the needs of video game and film makers. The technology requires multiple images to capture the scene from a variety of vantage points. The 3D model is somewhat limited in that it is only coherent within the field of view encompassed by the original images. It does not appear to fill in data"
This is great for scenery. It is amazing how much effort goes into background scenery that no one really pays attention to, but if you get it wrong, everyone suddenly notices.
The 3D model is somewhat limited in that it is only coherent within the field of view encompassed by the original images. It does not appear to fill in data
Just have the CSI boys zoom and enhance. C'mon guys, they've been doing this for years.
The technology requires multiple images to capture the scene from a variety of vantage points.
That's cheating.
The algorithm is called 'affine reconstruction' and is fairly well studied in computer vision. It is great that Disney and co. are releasing software to semi-automate the data input and reconstruction.
http://www.123dapp.com/catch
Autodesk already has a service available that does what the Disney algorithm does; it's called Recap.
http://usa.autodesk.com/adsk/servlet/pc/index?id=21350337&siteID=123112
They have a cloud service that can make full 3D models from photos.
I have a program from the mid-'90s that I got from a book about VRML http://www.amazon.com/Teach-Yourself-Vrml-Days-Sams/dp/1575211939 which would turn, say, buildings in photos into 3D objects. I think it was only a demo, so I never really tried it out to see if it worked.
Uncaring Linux user here. I have nothing to add to this but please continue. *munches popcorn*
Hey gramps, what the hell is a "Polaroid"? r * sin (Hemorrhoid) ?
Next you'll go blathering on about irrational things like "phone books", when everybody all knows they're called Kindles.
http://www.robots.ox.ac.uk/~gk/PTAM/
did it 5 years ago
Easy? Not at all. IIRC, to be able to theoretically get the model, no... let me try again: to even determine where your cameras are and how they are oriented, you need to be able to define something like 11 points in 7 photos.
Even then, that just gets you to the point of having N equations in N unknowns. It doesn't give you the answer, nor does it account for lens distortion. Throw in lens distortion and you have that many more unknowns, and therefore that many more points you'll need to define.
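To make the equations-versus-unknowns bookkeeping concrete, here is a rough counting sketch. It is not from the comment above: it assumes the textbook setup where every point is seen in every photo, each camera has 6 pose unknowns plus an optional number of distortion terms, and 7 unknowns are dropped because the overall position, orientation and scale of the scene can never be recovered from photos alone. The function names are made up for illustration, and the count is only a necessary condition, so it makes no attempt to reproduce the "11 points in 7 photos" figure recalled from memory.

```python
def sfm_counts(num_photos, num_points, distortion_params=0):
    """Count equations and unknowns, assuming every point is seen in every photo."""
    # Each sighting of a point in a photo gives 2 equations (its x and y position).
    equations = 2 * num_photos * num_points
    # Each camera: 3 position + 3 rotation unknowns, plus any distortion terms.
    camera_unknowns = (6 + distortion_params) * num_photos
    # Each scene point: 3 unknown coordinates.
    point_unknowns = 3 * num_points
    # Overall position, rotation and scale are arbitrary: 7 unknowns drop out.
    unknowns = camera_unknowns + point_unknowns - 7
    return equations, unknowns


def min_points(num_photos, distortion_params=0):
    """Smallest number of shared points with at least as many equations as unknowns."""
    if num_photos < 2:
        raise ValueError("need at least two photos")
    n = 1
    while True:
        equations, unknowns = sfm_counts(num_photos, n, distortion_params)
        if equations >= unknowns:
            return n
        n += 1


if __name__ == "__main__":
    for photos in (2, 3, 7):
        print(photos, "photos:", min_points(photos), "points without distortion,",
              min_points(photos, distortion_params=2), "with 2 distortion terms per camera")
```

Adding distortion terms raises the unknown count per camera, which is why the required number of points goes up, as the comment says. Reaching the break-even count still only means the system is not under-determined; it says nothing about how to solve it.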
Having thought about it more, since then, I have decided that that isn't the way to do it. The proper way to do it is something more akin to relaxation... but you still need sufficient points. You also have to be able to define what the "same point" is. That's not easy.
That said, there are ways to make it easier. One is to first find which photos are closest to each other. To do that, overlay the photos and subtract the RGB values of each pixel. Then run an FFT on parts of the photos. The main frequency in the FFT output will tell you the probable shift error in that part of the image. Try shifting the photos by that many pixels left, right, up and down (4 directions) until you find the best match, then rinse and repeat. Do this for all parts of the photo and you will start to identify point alignments. Now work the other photos together in a similar way, until you have a single network.
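As a rough illustration of the "shift until you find the best match" step, here is a small sketch. It deliberately swaps in a simpler technique than the FFT approach described above: it tries every shift in a small window and keeps the one with the lowest mean absolute pixel difference. The images are assumed to be equal-sized grayscale grids stored as lists of lists, and the names `best_shift` and `shifted` are made up for this example.

```python
def shifted(img, dy, dx, fill=0):
    """Return a copy of img moved down by dy rows and right by dx columns."""
    h, w = len(img), len(img[0])
    return [[img[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else fill
             for x in range(w)]
            for y in range(h)]


def best_shift(a, b, max_shift=3):
    """Find (dy, dx) so that b[y + dy][x + dx] best matches a[y][x]."""
    h, w = len(a), len(a[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0, 0
            for y in range(h):
                for x in range(w):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        # Compare only where the two images overlap.
                        total += abs(a[y][x] - b[yy][xx])
                        count += 1
            if count == 0:
                continue
            score = total / count  # mean absolute difference over the overlap
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]


if __name__ == "__main__":
    # A small synthetic test pattern, then the same pattern moved by (1, 2).
    img = [[(7 * y * y + 13 * x * x * x + 5 * x * y) % 97 for x in range(8)]
           for y in range(8)]
    print(best_shift(img, shifted(img, 1, 2)))
```

Exhaustive search like this is slow on real photos, which is the usual reason to reach for frequency-domain methods instead. Running it on small tiles rather than whole images gives the per-region alignments the comment talks about.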
THEN you can use relaxation to try to find your camera positions.
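For a feel of what relaxation means here, the sketch below works on a heavily simplified, one-dimensional version of the problem. It assumes each pair of overlapping photos has already given you a measured offset, pins the first photo at zero, and repeatedly moves every other photo to the average of where its neighbours say it should be. Real camera recovery also has to handle rotation and perspective, so treat this as a toy; the name `relax_positions` and the data layout are invented for the example.

```python
def relax_positions(num_photos, offsets, iterations=200):
    """Estimate 1D photo positions from pairwise offsets by relaxation.

    offsets is a list of (i, j, d): photo j appears to sit d units after photo i.
    """
    pos = [0.0] * num_photos
    for _ in range(iterations):
        for k in range(1, num_photos):  # photo 0 stays pinned at the origin
            estimates = []
            for i, j, d in offsets:
                if j == k:
                    estimates.append(pos[i] + d)  # where photo i says k should be
                elif i == k:
                    estimates.append(pos[j] - d)  # where photo j says k should be
            if estimates:
                # Move photo k to the average of its neighbours' suggestions.
                pos[k] = sum(estimates) / len(estimates)
    return pos


if __name__ == "__main__":
    # A small network of four photos with mutually consistent offsets.
    network = [(0, 1, 2.0), (1, 2, 3.0), (0, 2, 5.0), (2, 3, -1.0)]
    print(relax_positions(4, network))
```

When the measured offsets disagree slightly, as they will with real photos, the same loop settles on a compromise that spreads the error across the network instead of failing outright.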
THEN you can back-ray-trace, using I^4 correlation to get probable "glow spots", and then use that to generate your wireframe.
And somehow, you have to account for objects that moved, or people who were walking. Yes, it can be done by identifying different objects, but...
As I say, nothing easy about it.
I wonder if all those film frames that captured, say, John Wayne could be used to create a fairly good 3D likeness. If not now, maybe soon. Also, who would own the rights to those performances?
Overlap the photos you're taking by 60% & look at them through a Stereoscope... you get 3D.
http://en.wikipedia.org/wiki/Stereoscopy
In other news, the Sun is shining. I mean, seriously, light-field-based 3D reconstruction has been around for many years. Hell, one of my colleagues has even built a rotating-table camera setup to capture images and create a full 3D model. Just google "light fields 3D reconstruction" or "structure from motion" and smell the coffee.
Yeah, great news.