3D Virtual Reconstructions From Microsoft
Lord Satri writes "Coming around the corner, Microsoft Live Labs' Photosynth will 'take a large collection of photos of a place or object, analyzes them for similarities, and displays them in a reconstructed 3-Dimensional space.' There's a demonstration video and a 'smart photos' example page. From the site Very Spatial: 'The word is that Photosynth will be available for free, at least at first, but no word yet on an exact release date.' I must admit, it seems like Photosynth may offer interesting features with a clean interface. This tool will directly compete with Stitcher and, to some extent, Google SketchUp. The virtual world reconstruction tools market is getting crowded, and competition is good. Microsoft doesn't yet have software to tie a photo library to Windows Live Local (Google does), but don't be surprised if it comes to life."
Does anyone know of any open source photo stitchers? And by the way, what does NASA use to generate those awesome collages that they produce?
This software could revolutionize buying real estate remotely. Imagine, an agent goes in with a cheap digicam and takes a bunch of shots of the house they're selling. They load them into this software which creates a 3D, navigable model of the house, which someone can browse via a browser plugin.
Sure, this has been around for a while with VRML, but it was complicated and costly for an agent to do. From the looks of this software you can use normal photos as a base. Anyone could create 3D tours with this.
Looks like panorama software on crack. Lots of legal implications, I would think, depending on how the photos are shared or linked, since it is taking photos that you may or may not have shot and combining them all together. The question might be "who owns the final composite?".
Looks amazing though - can't wait to see it come out.
www.wildpad.com
Their website shows wikipedia, not MSN Encarta :)
From what little I can make of everything I read, LiveLabs is more of a think tank that is funded by Microsoft. I don't believe they are even under much if any creative control by MS. I would think of this more like a small startup with an idea and an enormous budget... memories of the dotcom era.
So because of this affiliation, MS comes out looking innovative and creative when it's merely a small team of apparently very creative developers who have probably never touched any code of any of MS's major income generators (Office, Windows, etc).
Sometimes the best solution is to stop wasting time looking for an easy solution.
Microsoft's tradition of little R&D apps predates the existence of Google. Once upon a time, sandbox.research.microsoft.com was just chock full of little goodies. Microsoft turned maybe 2% of them into products, and licensed the others to third party companies (bit of trivia - iPod scroll wheel developed by Microsoft Research as a volume control for VoIP phones, they didn't use it and licensed it out).
The big difference is that Google started adding a limited level of release support to their betas, and it became a very popular program. Microsoft is definitely leeching the idea that these apps should be consumed by the public at large, but Microsoft Research has been and continues to be the single largest, best-funded CS-related research group on the planet, and they come up with some truly amazing stuff.
I can see where this would be a big help in investigations, journalistic, scientific, criminal, etc. Reconstructing a 3-D scene would help understand where people and things were when something happened.
Today there are mics placed in some high crime areas that identify a gunshot and where it happened. Cameras placed at strategic locations would complete the "picture".
The next logical step (as the algorithms improve, hardware gets faster, and demand grows) will be to do the same with video. See http://www.bigfootencounters.com/files/mk_davis_pgf.gif to see a cursory example of how motion picture data can be used to build a persistent environment.
Another poster earlier in the thread speculated that a real estate agent could photograph a house to make a virtual tour. Even better, maybe, would be to just carry a high def video camera of some sort through the house, waving it around to get at least a little bit of footage of everything. With that data, an intelligent program could composite a 3D representation with even fewer blackout spots. Combine this with an accelerometer/gyro feed that gives a non-software correlation to the video stream, and it's essentially bulletproof.
In the form demonstrated, this is a fantastic heavy duty software solution, but physical tracking data would both make this job easier and improve the quality.
I suspect that in the near future we will see the following technologies made ubiquitous in cameras:
1. GPS
2. Tilt/Compass
3. Accelerometer/motion tracking for video.
Items 1 and 2 would enable any camera to provide very accurate geo-located data. #3 with video gives you tracking where GPS fails plus the super accurate tracking data needed to take this to the next level.
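To make the sensor idea above concrete, here is a rough Python sketch of what a program could do with that extra data. It is only an illustration under stated assumptions, not anything Photosynth is known to do: the function names, the east/north/up axis convention, and the idea that gravity has already been subtracted from the accelerometer samples are all made up for the example.

```python
import math

def view_direction(heading_deg, tilt_deg):
    """Unit vector (east, north, up) the camera is pointing along.

    Assumed conventions (not from the thread): heading is degrees
    clockwise from north, tilt is degrees above the horizon.
    """
    h = math.radians(heading_deg)
    t = math.radians(tilt_deg)
    return (math.cos(t) * math.sin(h),  # east component
            math.cos(t) * math.cos(h),  # north component
            math.sin(t))                # up component

def dead_reckon(accels, dt, start=(0.0, 0.0, 0.0)):
    """Estimate position by integrating acceleration samples twice.

    Assumes the camera starts at rest, samples are in m/s^2 in a fixed
    world frame, and gravity has already been removed.
    """
    pos = list(start)
    vel = [0.0, 0.0, 0.0]
    for a in accels:
        for i in range(3):
            vel[i] += a[i] * dt    # acceleration -> velocity
            pos[i] += vel[i] * dt  # velocity -> position
    return tuple(pos)

# A GPS fix plus view_direction() gives a ray for each photo; dead_reckon()
# fills in movement between fixes, e.g. indoors where GPS drops out.
print(view_direction(90.0, 0.0))
print(dead_reckon([(1.0, 0.0, 0.0)] * 10, dt=0.1))
```

Double integration like this drifts quickly with real, noisy sensors, so in practice it would only bridge short gaps between GPS fixes or serve as a starting guess for the image matching.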
"But Chairboy, you tool, why would the camera companies go to the expense?"
The features listed have become incredibly cheap (both in cost and power consumption) over the past few years. Within a couple years, it'll probably be hard to NOT have them in one of the shared chipsets the camera manufacturers use, and at that point, why fight it?
I'm rather curious to see how well their approach scales. For example, what if you just dumped all the 1,853 photos of Times Square from Flickr into their interface? Scaling even more, in the future could one use this to aggregate all the photos in a particular city, or even have a Google Earth-like interface aggregating photos from all over the globe and integrating it with satellite data? There are some interesting computational problems which arise in trying to find correspondence between that many visual features.
I'd also like to see if they can deal with pictures taken at different times of day. I'm guessing it's still too difficult to actually adapt a day image to a night image, so it'd probably just end up treating photos taken at different times of day as different scenes.
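For a feel of the scaling problem, here is a small Python sketch. It assumes a naive design in which every photo is compared against every other photo, and it matches feature descriptors by brute-force nearest neighbour with a ratio test, a common technique in the computer vision literature. Nothing in the thread says Photosynth works this way; the function names and the 0.8 threshold are placeholders for illustration.

```python
import math

def pair_count(n_photos):
    """Number of photo pairs if every photo is compared with every other."""
    return n_photos * (n_photos - 1) // 2

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Brute-force nearest-neighbour matching with a ratio test.

    desc_a and desc_b are lists of equal-length numeric tuples (feature
    descriptors). A match (i, j) is kept only when the best candidate in
    desc_b is clearly closer than the second best. The 0.8 threshold is
    an assumed value, not something taken from the thread.
    """
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(ranked) < 2:
            continue  # need a runner-up to apply the ratio test
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches

# 1,853 photos means roughly 1.7 million pairs under the all-pairs assumption.
print(pair_count(1853))
print(match_descriptors([(0.0, 0.0), (10.0, 10.0)],
                        [(0.1, 0.0), (10.0, 10.1), (50.0, 50.0)]))
```

Each pair also costs a descriptor-by-descriptor comparison, so the total work grows much faster than the photo count. Anything city-sized would need some way to avoid comparing every photo with every other, such as only pairing photos whose geotags are close together.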