Disney Algorithm Builds High-Res 3D Models From Ordinary Photos

Posted by samzenpus on Monday July 22, 2013 @04:48AM from the adding-a-little-depth dept.

Zothecula writes "Disney Research has developed an algorithm which can generate 3D computer models from 2D images in great detail, sufficient, it says, to meet the needs of video game and film makers. The technology requires multiple images to capture the scene from a variety of vantage points. The 3D model is somewhat limited in that it is only coherent within the field of view encompassed by the original images. It does not appear to fill in data"

18 of 80 comments

Time Saver by Anonymous Coward · 2013-07-22 04:53 · Score: 2, Insightful

This is great for scenery. It is amazing how much effort goes into background scenery that no one really pays attention to, but if you get it wrong, everyone suddenly pays attention.
Primitive, useless tech by Anonymous Coward · 2013-07-22 04:59 · Score: 3, Funny

The 3D model is somewhat limited in that it is only coherent within the field of view encompassed by the original images. It does not appear to fill in data
Just have the CSI boys zoom and enhance. C'mon guys, they've been doing this for years.
1. Re:Primitive, useless tech by ArcadeX · 2013-07-22 05:13 · Score: 2
  
  Just have the CSI boys zoom and enhance. C'mon guys, they've been doing this for years.
  Darkman did this in the early 90s, long before CSI was a glimmer in CBS's pocket book.
  
  --
  An I.T. motto in the hands of an idiot is a dangerous thing...
2. Re:Primitive, useless tech by tom17 · 2013-07-22 05:29 · Score: 5, Informative
  
  Excuse me?
  http://www.youtube.com/watch?v=qHepKd38pr0 (Bladerunner)
  Bitch, please.
3. Re:Primitive, useless tech by NatasRevol · 2013-07-22 05:47 · Score: 2
  
  What am I supposed to bitch about? The video quality? The audio quality? The lighting? The blue blinking?
  They all stink!
  
  --
  There are two types of people in the world: Those who crave closure
4. Re:Primitive, useless tech by JDevers · 2013-07-22 06:20 · Score: 2
  
  To be fair, Blade Runner is set in a future world where technology is both far ahead of ours and seemingly behind ours in many ways (almost steampunk-like, but forward-thinking for a 1980s movie). If the camera that took that picture were more advanced than today's, this would be quite possible. Imagine a small snapshot taken with an 800-megapixel camera, especially if one assumes that the actual "photo" might also contain an embedded memory fragment with the full-resolution image.
Cheating by the+eric+conspiracy · 2013-07-22 05:01 · Score: 3, Informative

The technology requires multiple images to capture the scene from a variety of vantage points.
That's cheating.
Affine by tmarthal · 2013-07-22 05:25 · Score: 5, Informative

The algorithm is called 'affine reconstruction' and is fairly well studied in computer vision. It is great that Disney and co. are releasing software to semi-automate the data input and reconstruction.
1. Re:Affine by Anonymous Coward · 2013-07-22 06:57 · Score: 4, Informative
  
  Possibly not for those particular use cases, but there certainly is already freely available software to do the "structure from motion" reconstruction trick; e.g., vSFM -- an easy(FSVO)-to-use frontend for a couple of tools from different research projects.
Autodesk has a service for this by bradgoodman · 2013-07-22 05:28 · Score: 3, Informative

It's called 123D Catch. They have an iPhone app and everything...
http://www.123dapp.com/catch
Re:I think this has been done for some time now by MatthiasF · 2013-07-22 05:28 · Score: 4, Informative

Autodesk already has a service that does what the Disney algorithm does; it's called ReCap.

http://usa.autodesk.com/adsk/servlet/pc/index?id=21350337&siteID=123112

They have a cloud service that can make full 3D models from photos.
Not new by future+assassin · 2013-07-22 05:30 · Score: 2

I have a program from the mid-'90s that I got from a book about VRML http://www.amazon.com/Teach-Yourself-Vrml-Days-Sams/dp/1575211939 which would turn, say, buildings in photos into 3D objects. I think it was only a demo, so I never really tried it out to see if it worked.

--
by TheSpoom (715771) Uncaring Linux user here. I have nothing to add to this but please continue. *munches popcorn*
the sky was blue as a TV with no input by Thud457 · 2013-07-22 06:30 · Score: 2

Hey gramps, what the hell is a "Polaroid"? r * sin (Hemorrhoid) ?
Next you'll go blathering on about irrational things like "phone books", when everybody all knows they're called Kindles.

--
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff
NOT Ordinary Photos, parallel moving video by citizenr · 2013-07-22 06:50 · Score: 2

http://www.robots.ox.ac.uk/~gk/PTAM/
did it 5 years ago

--
Who logs in to gdm? Not I, said the duck.
Re:Kinect by MickLinux · 2013-07-22 06:56 · Score: 2

Easy? Not at all. IIRC, to be able to theoretically get the model, no... let me try again: to even determine where your cameras are and how they are oriented, you need to be able to define something like 11 points in 7 photos.
At that, that just gets you to the point of having N equations, N unknowns. It doesn't give you the answer. Nor does it account for lens distortion. Throw in lens distortion, and you have that many more unknowns, therefore that many more points you'll need to define.
Having thought about it more, since then, I have decided that that isn't the way to do it. The proper way to do it is something more akin to relaxation... but you still need sufficient points. You also have to be able to define what the "same point" is. That's not easy.
That said, there are ways to make it easier. One is to first find which photos are closest to each other. To do that, you have to overlay the photos, and subtract the RGB values of each pixel. Then, run an FFT on the parts of the photos. The main frequency output of the FFT will tell you the probable shift-error in that part of the image. Try adjusting the photos that many pixels left/right/up/down (4 directions) until you find the best match, then rinse and repeat. Do this for all parts of the photo, and you will start to identify point alignments. Now work other photos together in a similar way, until you have a single network.
THEN you can use relaxation to try to find your camera positions.
THEN you can back-ray-trace, using I^4 correlation to get probable "glow spots", and then use that to generate your wireframe.
And somehow, you have to account for objects that moved, or people who were walking. Yes, it can be done by identifying different objects, but...
As I say, nothing easy about it.
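
A minimal sketch of the shift-estimation step, assuming the two patches are the same size, grayscale, and differ only by a whole-pixel translation. Note that it swaps the subtract-then-FFT recipe above for phase correlation, a standard FFT-based way to find the offset between two images. The function name `estimate_shift` and the synthetic test patch are made up for illustration and are not taken from any tool mentioned in this thread.

```python
import numpy as np

def estimate_shift(a, b):
    """Return integer (dy, dx) such that b is approximately np.roll(a, (dy, dx), axis=(0, 1)).

    Uses phase correlation: the normalized cross-power spectrum of two
    translated images inverse-transforms to a sharp peak at the offset.
    """
    fa = np.fft.fft2(a)
    fb = np.fft.fft2(b)
    cross = fb * np.conj(fa)
    # Keep only the phase; the small floor avoids dividing by zero.
    cross /= np.maximum(np.abs(cross), 1e-12)
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative (wrapped-around) shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))

# Synthetic check: shift a random patch by a known amount and recover it.
rng = np.random.default_rng(0)
patch = rng.random((64, 64))
shifted = np.roll(patch, (5, -3), axis=(0, 1))
print(estimate_shift(patch, shifted))
```

Real photos would also need windowing, sub-pixel refinement, and handling for rotation, scale, and lens distortion, which is where the "nothing easy" part comes in.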

--
Correct Horse Battery Staple: 72 bits of entropy. Enter "Correct H" into google. When it generates the phrase, that's
Dead actors, new movies by chuckugly · 2013-07-22 07:07 · Score: 2

I wonder if all those frames that captured, say, John Wayne could be used to create a fairly good 3D likeness. If not now, maybe soon. Also, who would own the rights to those performances?
Revisiting Stereoscopy by aklinux · 2013-07-22 09:29 · Score: 2

Overlap the photos you're taking by 60% & look at them through a Stereoscope... you get 3D.

http://en.wikipedia.org/wiki/Stereoscopy
disney algorithm by l3v1 · 2013-07-22 17:21 · Score: 2

In other news, the Sun is shining. I mean seriously, light-field-based 3D reconstruction has been around for many years. Hell, even one of my colleagues has built a rotating-table camera setup to capture images and create a full 3D model. Just google "light fields 3D reconstruction" or "structure from motion" and smell the coffee.

Yeah, great news.

--
I am putting myself to the fullest possible use, which is all I can think that any conscious entity can ever hope to do.