Making 3D Models from Video Clips
BoingBoing is covering an interesting piece of software called VideoTrace that allows you to easily create 3D models from the images in video clips. "The user interacts with VideoTrace by tracing the shape of the object to be modeled over one or more frames of the video. By interpreting the sketch drawn by the user in light of 3D information obtained from computer vision techniques, a small number of simple 2D interactions can be used to generate a realistic 3D model."
wow, what a terrible link.
A quick search turns up the project homepage http://www.acvt.com.au/research/videotrace/
I've never seen a first post from Tim O'Reilly.
Don't link to blogs.
http://www.acvt.com.au/research/videotrace/
AI needs a way of interpreting video input into 3d objects and environment. Once a computer can represent objects in a 3d environment, it can then perform operations on them. Technically you could make AI without this tool, but you'd have to do extremely precise and patient CAD inputs that would take most of your life. With a tool to convert video into 3d objects, you can just start cataloging all the objects out there. Add in a 3d physics simulator, and you're halfway to true AI. I have a quick overview on how to do AI, and as you'll note on the very beginning of the page: the reason I haven't worked on AI myself is that I can't code a video->3d object converter myself.
God spoke to me.
Hasn't this been a mainstay of movies forever?
The cost of that cleanup, of course, will be borne by taxpayers, not industry.
Software like Canoma from the now-defunct Metacreations would let you create 3D models from 2D images in the mid-to-late 90s. I also remember reading about people using Viz ImageModeler to convert images from video to models even though the software is also designed for still images - the users would just capture those frames they needed to create the 3D model.
The only thing "new" about this is using video as the input without having to grab the individual frames yourself.
Never let reality temper imagination
Isn't this old news? I remember seeing a demo of similar technology over 5 years ago. It must have been Japanese, because they traced boobs with it.
I like the fact he's been marked 'redundant'
for finding the one boingboing post that's not about Doctorow's Disney fetish, or Xeni's insistence that she is in fact, not a he.
Remember back in the day when we were told that computers would never be able to learn how to understand human speech because it's too complicated? The arguments were compelling but now we've got voice recognition working over crappy telephone connections and dictation software is getting better all the time. As bad as the voice recognition problem was, computer vision seemed like an even harder nut to crack given how impossible it seemed to get a machine to go from a two-dimensional image to 3D. All of this stuff seems like impossibly difficult "we'll never get there" AI impossibilities and then we see a technology demonstration that nails it. I'm still astounded that DARPA is not only asking for robot-driven cars, they're actually getting teams producing working results. That's another problem I always thought would be impossible.
My prediction for the future: the 21st century will be for robotics what the 20th was for aviation. We've been thinking about it for centuries but now the technology is maturing to the point that we can really do something with it. The stuff we're amazed by today is going to seem like wood and canvas biplanes.
Kwisatz Haderach
Sell the spice to CHOAM
This Mahdi took Shaddam's Throne
I've always thought converting various images (or in this case a video) to a 3D image wouldn't be too hard!
So why doesn't Google use this on Google Maps to make a 3D world?
I'd like to see how it holds up against Calista Flockhart footage and not go Division By Zero.
What they did is not that new. Voxel coloring has been around for a decade. However, the main problem has been that it only works well for perfectly diffuse reflective surfaces, since otherwise the same point viewed from a different camera angle will look different in the real world. Not having enough camera angles of the same point (filling in the gaps) to determine the 3D position via correlation is also a problem. It seems those researchers have found answers to these problems.
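For anyone who hasn't seen voxel coloring, the consistency test at its core can be sketched in a few lines of Python. This is only an illustration of the general idea, not the VideoTrace method: the function name `is_photo_consistent`, the threshold of 10 and the rule that a voxel needs at least two observations are all assumptions made for the example.

```python
from statistics import pstdev


def is_photo_consistent(samples, threshold=10.0):
    """Decide whether one voxel looks the same colour from every view.

    samples   -- list of (r, g, b) tuples, one per camera that sees the voxel
    threshold -- hypothetical limit on the per-channel standard deviation
    """
    # Assumption for this sketch: with fewer than two views there is
    # nothing to correlate, so the voxel cannot be confirmed.
    if len(samples) < 2:
        return False
    # Compare each colour channel across the views. A diffuse surface
    # point should give nearly the same value in every camera.
    for channel in zip(*samples):
        if pstdev(channel) > threshold:
            return False
    return True


# A matte red point seen by three cameras passes the test.
print(is_photo_consistent([(200, 10, 10), (205, 12, 9), (198, 11, 10)]))
# A shiny point whose highlight moves between two cameras fails it.
print(is_photo_consistent([(200, 10, 10), (90, 60, 50)]))
```

The second call shows the diffuse-surface problem: a real surface point can still be rejected when a highlight changes its color from one camera angle to the next.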
http://www.youtube.com/watch?v=vda2RAEuW_g
I'm a Ph.D. student at UC Santa Cruz. I finished my masters a few years ago working on enhancements to a project with similar goals. My advisor, Jane Wilhelms (who unfortunately died shortly after I finished my masters) was working on computer vision techniques for several years. Her work focused on extracting motion for animals (often children or horses) out of videos. My Masters contribution was to look at how the accuracy and usability of the software could be improved if we assume that the general motion of a walk is the same for all instances of a particular species (the knees all bend the same way, and the legs move in the same order, etc). I didn't have a high quality capture to start with, so the results were a bit fuzzy in terms of accuracy, but it did make the process easier for the user. The user had only to make the "original" motion match the video at key frames (maybe 4 per "walk cycle"), and the computer could easily interpret the rest; I don't recall off the top of my head, but I think the number of key frames the user had to specify was reduced by half or more over the former process (without the canonical motion as a starting point). I didn't publish any papers based on my work, but my masters thesis (with example filmstrips) is available.
Hook up google maps api with polar navigated flight path, some edge/point detection algorithms and start mapping. That'd be an interesting video.
I've never heard of "true AI" -- do you mean strong AI?
And no, computer vision plus physics simulation does not make half of strong AI, either. Russell and Norvig, the classic AI text, lists 9 abilities generally required for strong AI. 2 is not half of 9.
I don't know what your dead geocities page has, but not working on AI because you can't write a video->3d object converter is like not working on video compression because you can't act.
http://www.youtube.com/watch?v=vda2RAEuW_g
Imagine the porn!
So it's kinda like MS's Photosynth, except it gathers the photos itself from a video.
...I can make a perfectly accurate 3-D character model by just feeding the program a bit of video and pointing out the character. Then, all we need is the same with voice and I can make my own animes! Man, that would be sweet, but I think we're still a ways off from that.
I work as a video professional for a very large stock video provider. I could see software like this being an amazing tool for a company such as mine. Not only can we offer you footage of (for example) a horse running through a field; we might be able to sell the elements themselves, instead of or in addition to that. Need some more horses? How about we just sell you the background and you pick what animals you want? A lot of the time the video industry is dictated by extremely tight deadlines and budgets, so any tool offered to a producer or editor that makes it cheaper or faster to get to a desired outcome will get snatched up. I could see this as a real labor saver/enabler.
Apply that to the 2D sprites in Doom? I like the new engines out there created to play Doom II WADs and the fancy new polygonal objects, but it would be nicer if the monsters were 3D as well.
When I was in grad school, I knew a fellow who was working on similar technology. I don't think he got anywhere near as advanced as this, but he did get good enough that given 10 to 15 still images, his software could create a primitive 3D model.
Unfortunately for him, he tried to make a 3D model of his erect penis. I'm not sure if he realized it or not, but he wasn't very well hung (he's Korean). Well, at one of the presentations he had to make regarding his work, he accidentally opened up the model of his penis. He couldn't even deny that it was his, since his name was in the filename. And his supervisor, an older woman, just couldn't stop laughing. He did go on to get his degree, but I think his pride took a real beating.
In my thesis I'm also creating a 3D model from a video stream, only I'm using stereoscopy and pattern recognition to find matching objects in each frame and triangulating the depth to said objects. By the end I'm hoping to reduce the objects to small pixel clusters; the tricky part is that all this is happening in real time. By mounting the cameras on a device where the point of view is known, it could be used to map out any static terrain by just navigating through it. Adding more cameras from different perspectives increases the completeness of the generated model. The article has definitely got the right idea. With sufficient object detection and tracking algorithms, you could minimise or eliminate the need to draw the template.
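For readers wondering what "triangulating the depth" involves, here is a minimal Python sketch of the textbook case: two identical, parallel, rectified cameras. It is not the poster's thesis code. The function names, the pixel-unit focal length `focal_px`, the baseline `baseline_m` and the principal point `(cx, cy)` are all assumptions for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres of a point matched in both images.

    Standard rectified-stereo relation: Z = f * B / d, where d is the
    horizontal shift (in pixels) of the point between the two images.
    """
    if disparity_px <= 0:
        # Zero disparity means the point is at infinity; negative
        # disparity means the match is wrong for this camera layout.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def triangulate(focal_px, baseline_m, cx, cy, x_left, y_left, x_right):
    """Return (X, Y, Z) in the left camera's frame for one matched point."""
    z = depth_from_disparity(focal_px, baseline_m, x_left - x_right)
    # Back-project the left-image pixel through the pinhole model.
    x = (x_left - cx) * z / focal_px
    y = (y_left - cy) * z / focal_px
    return (x, y, z)


# Hypothetical rig: 800 px focal length, cameras 0.5 m apart, 640x480 images.
print(triangulate(800.0, 0.5, 320.0, 240.0, 420.0, 240.0, 400.0))
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant points than for near ones, which is one reason extra cameras and viewpoints help.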
...and no one is going to make a porn joke?
This is very interesting. Unfortunately, it is going to be closed-source and patented.
Does anyone know any open-source projects to do object reconstruction from video or still photographs? I'm asking because my group is building a 3D printer.
http://www.reprap.org/
(Self-link pimpage, etc. etc.)
and I think it would be cool and useful to be able to capture a 3D model from photos or video of a sculpted maquette, pet cat, broken part, human, or so on.
(I just stumbled across this by googling "gpl object reconstruction", which may be relevant):
https://ezra.dev.java.net/
People may be interested in
http://splinescan.co.uk/
which is a gpl laser scanner hardware (pen laser, prism*, webcam, and turntable) + software project to do 3D object scanning.
I'll follow comment responses to this thread, but I also welcome emails:
penguin at supermeta dot ihatespamtoo dot com
create a 3D model of my favorite p0rn movie
David is another free DIY-laserline-scanner-based implementation which doesn't need a turntable (merging multiple scans doesn't seem to be included with the free version, though).
If you think it was unfortunate she died after you finished your thesis, imagine where you'd be if she had died before. (Hint: still in grad school)
I'll be making my source available once I've finished my thesis, though that code won't be available until the end of 2008.
If you could combine the techniques that create the models automatically, with techniques like this where a skilled artist is involved, you could produce some high quality output indeed.
You can easily make 3D images viewable with lcd shutter glasses and an nvidia card if you find some shots where the camera is panning across the scene, and it's pretty static, using software like 3D Combine. Just take two frames so many frames apart and use one for each eye. I did this with some old Betty Boop cartoons (which were made by rotoscoping, that is, based on actual photographic images) and they worked great.
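The "two frames so many frames apart" trick can be sketched in Python as below. This is not how 3D Combine works internally; the function name `stereo_pairs`, the `offset` parameter and the `camera_moves_right` flag are assumptions for the example, and frames are stand-ins for whatever image objects you have decoded.

```python
def stereo_pairs(frames, offset, camera_moves_right=True):
    """Pair each frame with the one `offset` frames later.

    Returns a list of (left_eye, right_eye) tuples. When the camera moves
    to the right, the earlier frame was shot from further left, so it is
    used for the left eye; for leftward motion the roles are swapped.
    """
    if offset < 1:
        raise ValueError("offset must be at least 1 frame")
    pairs = []
    for i in range(len(frames) - offset):
        earlier, later = frames[i], frames[i + offset]
        if camera_moves_right:
            pairs.append((earlier, later))
        else:
            pairs.append((later, earlier))
    return pairs


# Frame numbers stand in for decoded images here.
print(stereo_pairs([0, 1, 2, 3, 4], offset=2))
```

A larger `offset` behaves like a wider gap between the eyes, so the depth effect gets stronger, but anything that moves in the scene between the two frames will not line up.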
try { do() || do_not(); } catch (JediException err) { yoda(err); }
Now I can finally have a 3-D model of the starship swordbreaker, finally.
Tsukasa: All I really want, is to be left alone...
If Microsoft made this announcement it would be condemned as "vaporware". The main site claims it is in beta and they are looking for commercial partners, so it apparently is not open source and no use to us at this time.
I appreciate the links and information in the discussion prompted by this article. Although I'm underwhelmed by the actual announcement, I've learned a lot from the links you folks have provided.
"The mind works quicker than you think!"
There's this hot little Night Elf Paladin chick I have my eyes on...
Do not mock my vision of impractical footwear
Most of the comments about this (both here and on BoingBoing) are clueless to say the least. You people must think the guys at Pixar and ILM have been asleep on the job for the last 20 years.
This has been done for ages. It's called photogrammetry, and it has been used in several movies (ex., Fight Club). Maybe their approach is simpler or maybe it works faster than current techniques, but until they post a video showing the workflow, there's simply nothing new here.
Even more impressive is the Campanile movie, where an entire 3D model of the UC Berkeley campus and a fly-by shot was generated from just 15 still pictures. This was done a whole decade ago, for SIGGRAPH 97.
Actually, Microsoft has made a number of presentations at SIGGRAPH over the years without any condemnation or other unpleasantness. Why would you think otherwise? This kind of thing is what SIGGRAPH is for.
People should really give up on that and start using D
http://forums.reprap.org/read.php?1,4474,6338#msg-6338
It surely mitigates the slashdot effect.
Patents Drive Free Software as Hurricanes Drive Construction Industry
The University of Kiel (Germany) presented quite exactly the same stuff (without the need of manually marking objects or object boundaries) at CeBiT 2003.
check this video (scroll page to "Movie for presentation on CeBIT 2003").
http://www.mip.informatik.uni-kiel.de/tiki-index.php?page=3D+reconstruction+from+images
Nice link.
Also, 3D active contours can be used to track the shape and reconstruct the model.
Weird Science wasn't a movie...it was a prophecy!
I only have one response to that.
The Internet is for porn.
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
I think the description is a little bit wrong, because it makes people think this software is very automatic, when in fact it just does what Blender and other software does, but with videos instead of images, which should not be difficult to add to Blender as well. You can check the video here to see that it is very manual: http://www.acvt.com.au/research/videotrace/ The only advance, to me, is the automatic UV mapping.