Automatic 3D Reconstruction of Scenes
Neil Halelamien writes "New Scientist reports on a piece of software by MDRobotics called instant Scene modeler (iSM), which automatically generates 3D reconstructions of scenes, using a few hundred frames from a pair of ordinary video cameras. The software uses David Lowe's SIFT vision algorithm to quickly locate common features between sequential images, for use in the reconstruction; SIFT has also been useful for generating panoramas and object recognition. MDRobotics has a demo page showing the software being used for crime scene reconstruction, along with animated GIFs of input video and the resulting 3D model."
CSI: Crime Scene Reconstruction
In C++, friends can touch each other's private parts.
Whoopdedoo Basil but what does it mean?
Finally we can do what Harrison Ford can do... :-)
Kind of reminds me of the Mars Rovers' Autonomous Rover Navigation (QT video)...
Here's what I do: Bitty Browser & Andromeda
Wow I bet it's just like in CSI they will be able to zoom up 10000% on the digital image, 'sharpen' it and read all sorts of interesting things off the back of things after rotating them virtually.
I love taking panoramic shots with my camera, but the stinkiest part has always been stitching all the images back together. I haven't seen any package like this before...too bad it's not open source :)
This could be applied to real estate and used to give "virtual tours" of homes on the market.
However, I plan on figuring out how I can embed this into one of my Mindstorms...
Someone you trust is one of us.
Since slashdot will probably burn out the web server hosting images: http://www.mdrobotics.ca.nyud.net:8090/ism/behind.htm
--
Free iPod? Try a free Mac Mini
Or a free Nintendo DS, GC, PS2, Xbox
Wired article as proof
Yes, I would love to see an open source implementation of this program. Does anyone know of anything like this, or similar software?
The sending of this message pretty much inconveniences everyone involved.
Educationally, people could truly "walk around" in a virtual museum. This is lightyears ahead of QuickTime VR(?) where one simply can rotate about one point and zoom in or out.
It's only going to work on stationary scenes, as that sleeping fellow showed us. Basically, anything from the Real World you want to "import" into VR will be much easier to do.
If anyone likes FPS, you could model a map based on real scenery.
Most inventions and technology came into being before people found a use for it. It just seems pretty darn cool if nothing else.
"The first rule of intelligent tinkering is to save all the pieces." --Aldo Leopold (Paraphrased)
This reminds me of the bit in Enemy of the State where the government operatives take the lingerie store security footage, and then use their computer to "rotate the camera 90 degrees." And on top of that, they then see something in the bag that wouldn't have been at all detectable from the actual camera's angle.
It's pretty silly to suppose that this thing will be able to generate a 3-D representation of a scene without getting highly-detailed footage of everything from every angle. Otherwise, it would just be a completely bogus modelling system to pull a fast one on people who don't know any better. "Ladies and gentlemen of the jury, although the original crime scene evidence photos don't show it, when you rotate the angle and look at the far side of the desk, the defendant's fingerprints are clearly visible."
until they start using this technology for porn
The problem with slashdot is that most of its users were bullied and stuffed into lockers as kids!
That would be cool if someone could add something like that to Blender
http://www.blender3d.org/
I'm pretty sure videoconferencing of the future will involve automatically creating 3d models and detailed textures of the participants and their surroundings.
Even though increase of bandwidth will have made the old method of sending bitmap diffs smoother than today, one doesn't have to stretch the imagination very far to see the amazing advantages of going realtime 3d...
A 3D reconstruction of a 3D animation.
Bleh
Someone set us up the bomb, so shine we are!
From one of the links: The SIFT algorithm is restricted by patents in the United States and hence this software is not completely free to use. For details see the LICENSE file included in the distribution, before you start to use this software.
Hopefully, they're liberal about the patent and will let noncommercial nonresearch applications use the algorithm. Otherwise, we would have to wait for the really interesting software to come out.
A C# implementation with support for Mono is available to play with for anyone who is interested: http://user.cs.tu-berlin.de/~nowozin/libsift/
I have been waiting for the results for quite some time and they surely look impressive. I might add that the underlying concept is not very hard to understand and one could even make a simple 3-D model of distant objects (e.g. buildings in your city) using only two eyes, paper, pencil and some basic trigonometry.
Look at this model:
A---B
|\ /|
| C |
|/ \|
D---E
Where D and E are your two eyes, two cameras, or two positions from which you look at the object C that appears to be eclipsing A and B respectively. The distance between any of those points and their relative 3-D positions can be calculated when you know some of the distances (e.g. DE and AD) with very high precision.
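For anyone who wants to try the arithmetic, here is a minimal Python sketch of that triangulation. It rests on assumptions that are not in the post above: D sits at the origin, E sits on the x axis at a known baseline distance, and you have measured the angle at each viewpoint between the baseline and the line of sight to C. The function name `triangulate` and its parameters are made up for illustration.

```python
import math


def triangulate(baseline, alpha, beta):
    """Locate C as seen from D = (0, 0) and E = (baseline, 0).

    alpha: angle at D between the baseline DE and the sight line DC (radians).
    beta:  angle at E between the baseline ED and the sight line EC (radians).
    Returns the (x, y) position of C. All names here are illustrative.
    """
    gamma = math.pi - alpha - beta  # angle at C; the triangle's angles sum to pi
    if alpha <= 0 or beta <= 0 or gamma <= 0:
        raise ValueError("sight lines do not converge in front of the baseline")
    # Law of sines: DC / sin(beta) = DE / sin(gamma)
    dc = baseline * math.sin(beta) / math.sin(gamma)
    return dc * math.cos(alpha), dc * math.sin(alpha)


if __name__ == "__main__":
    # Eyes 2 units apart, both looking inward at 45 degrees.
    print(triangulate(2.0, math.radians(45), math.radians(45)))
```

The closer the two sight lines are to parallel (a distant object or a short baseline), the smaller the angle at C becomes, so small errors in the measured angles turn into large errors in the computed distance.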
Recommended Wikipædia reading for anyone interested: Parallax, Triangulation, Stationary point, Pythagorean theorem, Euclidean geometry, Astrometry, Binocular vision, Stereoscopy. Have fun.
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
I can see this sort of thing being useful for space exploration.
I might know what I'm talkin' about, but then again, this is Slashdot...
A long time ago I thought about this (the general idea, not detailed algorithmics) as a nice way to build very realistic scenarios for racing games.
:-)
Just think about a street circuit you could imagine nicely put into a game. I already did with our riverside avenues
Got Pike?
caught on 3D!
looks like they caught him in the act of spanking the monkey in the resulting 3d model picture
http://www.mdrobotics.ca.nyud.net:8090/ism/crimescenemodel.gif
Though the technology would need some additional improvements, it might be interesting to apply it to tracking shots in old movies (like Casablanca) and in addition to reconstructing the sets one could also replay a scene from a slightly different angle.
The other slight modification would be to combine the possible modification (getting a slightly different angle from an existing tracking shot) and build a stereo 3D image of the shot or film segment.
I recall watching the special features for the original Matrix and seeing how a 3d subway environment for the final fight was created from the set by taking a panoramic set of pictures. Likely that process wasn't realtime and could have required a lot of hand tweaking, but that is probably true of this software as well (slashdotted servers means I can't RTFA).
MD Robotics are the makers of the Canada Arm I & II.
For those that didn't know.
Buy Steampunk Clothing Online!
This private company just created a useful product (this is something that even you acknowledge) and justly wants to profit from the cost of risking development of such a tool. Then the first thing you ask is how somebody can clone it and steal the idea by rewriting the code as open source, since you cannot seriously expect them to just open source their code? This is capitalism at its worst.
If this practice becomes more commonplace, all that will happen is that the tiny companies get screwed, while the big monopolies will grow. The small ones struggling to get their foot in the door will simply be cloned and bankrupted while the rich ones with enough clout and monopoly to maintain their position will continue getting rich. Your thoughts about open-sourcing new, innovative software ultimately contribute to the problem of today's marketplace and really just keep the big guys in power.
The best 3D reconstruction method known so far is described in: http://www.wisdom.weizmann.ac.il/~wexler/papers/iccv03.html
Cool. Heh, maybe it's just me, but this reminded me a lot of the episode of TNG when Geordi got infected and became an invisible man. There was a scene where they reconstructed something on the holodeck, much like this.
I suppose using zoom on the camera could cause trouble, but maybe specialized cameras could be used where the zoom is encoded within the video (wouldn't be hard with modern digital stuff -- I'm sure some do it already).
Neat to see this automated, though certain aspects of this have been possible for a long time (as others have pointed out). I mostly think of topographical maps as an example, where aerial photographs have often been used to determine geography. However, such maps normally don't incorporate the actual colors like this does.
This is not limited to static scenes as one comment says. It could be used to reconstruct moving objects just as well, with a bit more software.
You could very accurately construct physical models of criminals from security tapes.
You could also construct an accurate model of how they walk. Since every person has a unique walk, that would be more difficult to disguise than physical appearance alone.
You could discover identifying details of the cars they drive, like a small dent in the fender.
This would be perfect for eBay, you could send them a short film of the object you're selling and they would post a 3D model of the item.
This heralds the end of both motion capture and the existing hours long '3D scanning' of clay models used in films like LoTR. Instead of requiring a mechanical stylus to touch every point of a model, you just film it.
Once the software has the ability to turn multiple 2D viewpoints into a single 3D image, this will be the perfect replacement for VR gloves as well. You could have a camera on either shoulder and your hands would be your 3D mice. That sounds like a nicely intuitive interface.
Moving companies could find this useful. They could film the objects, the moving truck, and in return get an optimal packing order. You could also film the stairway up to an apartment and the software could figure out how to get through any of the particularly tight spots, if it's possible.
This would be good for the sort of augmented reality that washington.edu has researched. When the software can recognize the separate parts in a machine it can display directions for disassembly on a heads up display.
Oh, I can think of lots more uses, but better to get hold of the code and try to implement some of the random ideas above.
Shae Erisson - ScannedInAvian.com
I just hope they patent it before anyone else can use it. I'd hate to see them give up their ability to innovate.
And the first use will be for porn, as always.
With your reasoning, computers couldn't work at all because 70s SF movies had ridiculous things done by computers.
Software like this, only in more basic form, has been around for at least 10 years. There was a retail product aimed at VRML designers 5 years or so ago that did the same, but needed user input (the user had to mark corresponding edges in the different frames).
Now this software can use the fact that the input frames are not separate still photos but video frames, so it can use normal motion search algorithms, like the ones used in video compression, to find the place of an edge/corner again in the next frame.
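To make "motion search" concrete, here is a rough Python sketch of the block matching that video encoders do: take a small patch from one frame and look for the best-matching patch near the same spot in the next frame, scoring candidates by sum of absolute differences (SAD). This only illustrates the general idea. The story summary says iSM matches features with SIFT, and nothing here is taken from their code. The names `sad` and `find_patch` are made up, and frames are assumed to be plain lists of rows of grayscale integers.

```python
def sad(frame_a, frame_b, ay, ax, by, bx, size):
    """Sum of absolute differences between two size x size patches."""
    return sum(
        abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
        for dy in range(size)
        for dx in range(size)
    )


def find_patch(prev, cur, y, x, size=3, radius=2):
    """Find where the patch at (y, x) in prev moved to in cur.

    Only positions within +/- radius pixels of (y, x) are tried, which is
    reasonable for consecutive video frames where motion is small.
    Returns the (row, col) of the best match's top-left corner.
    """
    height, width = len(cur), len(cur[0])
    best = None
    for ny in range(max(0, y - radius), min(height - size, y + radius) + 1):
        for nx in range(max(0, x - radius), min(width - size, x + radius) + 1):
            cost = sad(prev, cur, y, x, ny, nx, size)
            if best is None or cost < best[0]:
                best = (cost, ny, nx)
    return best[1], best[2]
```

The small search radius is the whole trick: with unordered still photos you would have to search the entire image for every point, while with video you only look a few pixels away.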
After that, it creates a plausible 3d model whose projections correspond with the screen coordinates of the points in the different frames (lots of boring matrix math). Then, for every triangle created, it takes a cut from the video and UV maps it as a texture (a lot of room for detail improvement, like averaging, or trying to pick "the best" view of that object face in the picture stream).
So it seems the main improvement of this new software is an automatic keypoint search algorithm that works and gives good output. Nothing bogus, just a good piece of software. And no, it can't show things that never were shown, but if a guy with a camera films for a minute while walking around in a room, not much will remain out of view.
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
You will still need their stereoscopic camera to be able to make anything 3D. As someone else said, their technology only produces 3D from an already 3D/stereo source!
Following that, some detail work could be done to capture areas that might have been missed (for example, the interior of a lead lined safe).
The amount of data collected to be processed would be huge, probably not practical today. But for anyone in the future doing a prior art search for a patent on such a system, there was a posting on /. describing such a system on March 13th, 2005.
Loose lips lose spit.
you work in machine vision? I was too busy shoveling snow and my mod points evaporated...this rates an insightful or interesting.
SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
Two time-separated frames in a video and sufficiently intelligent software is a stereoscopic camera.
Shae Erisson - ScannedInAvian.com
On slashdot about three months ago.
Can't wait til i can stroll thru my old high school with a pair of cameras, take that info, load it into the computer, algorithmize it, and load that data into Splinter Cell / Halo / Unreal Tournament. Ah man, think of all the cool new maps that'd be popping up every day if this system ever gets widespread!
I have to agree with that; opensource "clones" are opensource at its worst. Where has the creativity gone in opensource software? Instead, we get another (usually bad) clone of photoshop, windows 95, Aqua etc etc.
OpenSource at its greatest is stuff like linux, the binutils, tcc etc. Unfortunately it's just too rare.
Not that I wouldn't LOVE to be able to use this particular toy, but heck, I'd be able to pay $100 for it like I can pay $100 for my other toys.
Does this remind anyone else of the ESPER system from the movie Blade Runner?
Speaking is NOT communication
3D scene reconstruction from images has been the subject of research for decades, with some impressive successes even in the 1980s.
With newer hardware, much higher resolutions, and more training data, these systems were bound to get better, and they are going to get better still.
No, that is capitalism AT ITS HEART. Competition is capitalism. Capitalism in its purest form grants no guaranteed return. What, do you think governmental enforcement of a monopoly based on the premise of earliest discovery would be a better idea? If I start up a business I have no guarantees it will succeed, and that's the way it should be. Why should discoveries be an exception?
Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
I don't think I would accept a CG cartoon talking about National Policy or giving briefings because heads of state are elected officials who are responsible for their actions, no matter what newspeak tells you.
That is the third time an article on Slashdot has described something I had hoped I could do my masters thesis on.
First was the decentralized bittorrent network.
Then was the method by which angle changes in camera shots can be deduced by comparing the images (the jigsaw puzzle solver). That was probably an infantile pursuit anyway.
And now this.
I will go ahead and spill the next idea along the same lines as the last two projects:
What if you were to put a light collector that could detect angle and intensity of light on top of a camera? Then, when your camera is filming your scene you are also recording the manner in which ambient light is reaching your scene. Later, using your 3D scene reconstruction, you can throw in new objects such as creatures and whatnot and use the data from your light collector to apply correct lighting to the new objects you introduce.
I would imagine someone is about to release this technology and we will see an article about it on Slashdot in a couple weeks. Look forward to it.
Direct away from face when opening.
I'm just a programmer who enjoys cool code and cool ideas. I'd likely start with FVision and add SIFT. I haven't yet found any details on the algorithm they're using to do 'reverse raytracing'. Does anyone else have some pointers?
Shae Erisson - ScannedInAvian.com
Yup, Linux was a whole new thing, with no functional relationship with any previous body of work, no sir! No work-alikes here, no indeed!
This next song is very sad. Please clap along. -- Robin Zander
What, do you think governmental enforcement of a monopoly based on the premise of earliest discovery would be a better idea?
And what about copyright and patent law? That is exactly what the capitalism-based American government tries to do in such matters. While I agree that it is absurd to forbid cloning software, it is even more ridiculous to pass off a clone of expensive, research-intensive software as a virtue of capitalism. It is a question of development versus application. If these people developed novel techniques of generating 3d images from video, by all means they should be entitled to be the exclusive producers of this software, since the technique would have never existed had they not created it. Had they, however, simply combined a whole host of pre-existing technologies and simply made it usable, then competition is completely warranted.
I know many people on slashdot don't like to hear the truth, but some aspects of the legal system that seem so oppressive were created for a reason, a reason that is meaningful even now. If we lacked copyright law and somehow forced all software and ideas to be open source, it would be great in the short term: we'd have all kinds of nice software like this openly available. But in the long term, society would be screwed, since only a few hobbyists would bother to do research, while everybody else would be forced into a field of work where they could earn a living.
Actually, these steps are highly interesting and it's the future of representing our world "as we see it". Photography claims to be a more objective way of representing the world around us (let's say in comparison to drawings/ paintings). The photographer still has abundant power over the framing of a scene and the timing of his capture.
Why not capture an entire scene over a period of time in 3d? The viewer can then choose his objective standpoint. This is certainly the objective for crime reconstruction.
That said, I don't think we're there yet: how would this system work in scenes where surfaces (such as walls) are a uniform colour, with no features for tracking?
At the present state you still have to project visual markers into the scene (laser, structured light or whatever) in order to extract the 3d-structure of the scene. Otherwise it's a pretty poor "we're here" visual representation of our surroundings.
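The featureless-wall problem is easy to see in numbers. Below is a small Python sketch of a classic "is this patch trackable?" score: the smaller eigenvalue of the patch's gradient structure tensor (the Shi-Tomasi criterion). Note this is a stand-in chosen for illustration. SIFT, which the story says iSM uses, picks its keypoints differently (extrema in a difference-of-Gaussians pyramid), but it starves on blank surfaces for the same underlying reason. The function name `min_eigenvalue` is made up, and patches are assumed to be lists of rows of grayscale values.

```python
def min_eigenvalue(patch):
    """Smaller eigenvalue of the 2x2 gradient structure tensor of a patch.

    Near zero means the patch looks the same when shifted in at least one
    direction, so its position cannot be pinned down between frames.
    """
    height, width = len(patch), len(patch[0])
    sxx = syy = sxy = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0  # horizontal gradient
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0  # vertical gradient
            sxx += gx * gx
            syy += gy * gy
            sxy += gx * gy
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] via trace and determinant.
    half_trace = (sxx + syy) / 2.0
    det = sxx * syy - sxy * sxy
    disc = max(half_trace * half_trace - det, 0.0)
    return half_trace - disc ** 0.5
```

A uniformly painted wall scores zero everywhere. A straight edge also scores zero, because the patch can slide along the edge without changing. Only corner-like texture gets a positive score, which is why projecting a pattern onto a blank surface helps.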
that's going to be nice when it can accurately recreate things. I always wanted to do something similar with sound waves in little devices except I don't have the resources (or the knowhow) - It'll be cool to see where this develops anyway
While these two concepts may share some technology, stitching is clearly not 3D modeling. If you've ever tried to make a 360 panoramic of a room, you'll get some funky distortion, usually resulting in a fish-eye projection. 3D Modeling gives you a NASA-like model to change perspective with little or no distortion.
As stated by others, these models have limits - mostly in the "Nothing-In, Nothing-Out" vein. The QuickTime fly-throughs show areas in the closet, for example, that are black. No data in the original stills, hence the 3D model doesn't know what's there.
-MrLogic
There are 2 kinds of people: those who would gladly give you the shirt off your back, and those who would gladly take it.
The NFL was playing around with something similar 2-3 years ago. They would take a freeze frame of a great catch or similar play from a couple different angles, then use the frames to construct a rough 3-D model of the instant. John Madden had some fun showing the replays. They would start from one camera angle, then "fly" around to the other camera angle, moving quickly enough that the roughness of the model wasn't really apparent. It was kind of cool and I'm sure they must have spent some dough developing it, so I'm a little surprised they don't seem to use it at all anymore. Of course, I don't watch much football, so maybe I just miss seeing the effect.
They don't discuss it in the article, but if you look at the sample model, there are "shadows" that the camera can't see that appear as black regions. You're right. It would be totally bogus to claim they could read a fingerprint on a doorknob or something like that. It might be helpful, however, for an investigator to be able to load a model of the crime scene on his computer and walk around with his mouse to visualize what might have gone down. Guaranteed this shows up in a future CSI scene regardless.
With graphics cards (and soon Physics Processing Units) constantly improving, game developers are faced with a problem: creating environments with appropriate levels of detail is becoming increasingly expensive. Creating the models, texturing, lighting, etc takes time.
A technology like this could bring a movie set approach to developing games, allowing minuscule details to be included in scenes without the prohibitive cost of a human modelling every item in a room.
This imagery to 3d is a primary component to artificial intelligence
Also if you could video tape a building and port it to 3d, you could make some quick FPS levels.
God spoke to me.
This could be neat for construction virtual walkthroughs for different house models.
Oh well there goes the last year and half of my life.
I / we have been working on a very similar system to generate 3D models of underwater organisms using only 2 stereo video cameras.
Doh!
Still, it is good to see that other people's work in this area is coming along nicely.
(which btw, was done with a laser based scanner called Polhemus, not a touch system)
This type of image based capturing to recreate 3D models is nowhere near accurate enough to compete with laser based scanning systems. Just because some quicktimes look pretty, doesn't mean that this can generate accurate, quality, usable surfaces.
It would appear that there is significant webbing
The system I use has an ISO accuracy of 0.02mm. Image based systems (excepting ATOSS) rarely approach 1 or 2mm in accuracy. Most can barely do 10mm accuracy.
The high accuracy I get allows me to 3D scan a human face with enough detail to see the individual folds of the human iris.
The best this image based thing would do for that is a blurry colored dot where the eye is meant to be.
Also, it will only augment existing motion capture systems, not replace them.
The amount of position samples per second is far below what Ascension or Vicon are able to do. It would need to interpolate an incredible amount if it were to be used in motion analysis.
Forget trying to use this for crime scene analysis. I have been working with the police for years trying to develop a system that will stand up to the very intense scrutiny of the law and the courts.
These sorts of things usually are only used for illustrative purposes, never as evidence or theory proving devices.
This sort of technology (or similar) crops up every few years and everyone gets excited about its potential in everyday life.
Rarely does it leave the lab for the real world of non-techie consumers.
Often it stays in the domain of specialists who work in the field of 3D scanning.
And even more rarely is it ever used for practical, genuinely useful things.
I use all of the types of 3D scanning available today (image based, laser, CT, MRI and arm/mechanical).
Each technique is intended for specific uses.
No one system at the moment can be a "be all and end all" of 3D scanning.
It is unlikely that such a system will be made available to consumers for many years to come.
I'm not trying to put down the work done by these guys. In the context of what they intend to do, it really is an amazing piece of code. But don't try to apply it to things it wasn't intended to do; it's unlikely to produce usable results outside the areas it's intended for.
Well, copyright I understand, and copyright applies to this as well. A written work is a large well defined entity. Patents, at least as far as software is concerned, make little sense as they usually cover fairly abstract building blocks necessary to define many larger concepts. Granted, this particular patent would seem to be very specific, and benign by nature but....
Let's postulate for a second that software patents were never enacted as a recognized "invention", and software was still under the domain of copyright law as was the case until the late 1990s (which I might add was an era with a whole hell of a lot of innovation). OK, so there are no software patents, companies develop software for themselves, and keep their implementation methods as trade secrets. They develop ideas in complete secrecy. They implement the ideas in complete secrecy. They then release their software products without giving details as to how they do what they are doing. Their software is still covered by copyright law, so they have a pretty comfortable amount of time before the competition comes up with the same or functionally equivalent implementation (because most of these patented "brilliant ideas" are just an obvious next step not a revolution). Unless they happen to be working on the same thing at the same time (which could easily happen as the competing companies are usually looking at the same problem set that needs to be rectified), in which case they would be at the mercy of the market and the better implementation will win. This is true for any other market. Truly innovative ideas get backing by customers who see that those companies are innovative. So the net effect is: it forces companies to be innovative to stay alive, and therefore the consumer and society in general benefits.
So what about reverse engineering? Well, reverse engineering is not free. If the creating company used some kind of code obfuscation (which is another area that would really be innovated upon if everyone knows that their software is going to be reverse engineered) then the reverse engineering efforts will come at their expense of time, effort, and lots of money. And then the competition would still have to take the reverse engineered implementation and implement in the context of their software.
I don't know. I only see damage when I see software patents. I see threats, lawyers, and little guys getting squashed by the big guys' patent portfolios.
Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
At least my good ideas aren't 10 years old anymore. :)
Okay, how do they get more than a 2.5-dimensional reconstruction? And why do they have to target police reports/analysis? Does this require the scene to be still before they can acquire measurements and texture information? Also, do they capture anisotropic detail as well as gases?
Just say no to license servers!!
Wow, I actually tried out that autostitch program. It works extremely well. For carefully shot pictures, it will stitch more or less perfectly. For recklessly shot ones, it's less than perfect but much better than what I can do with Canon photostitch. Considering it was automatically stitching stuff better than what I was doing with Canon photostitch with a lot of manual tweaking, it's impressive. I hope this guy's development work becomes a commercial product.
mda.ca will eat Coral Cache for lunch .. before the main course