Minolta 3D Camera
Bookwyrm writes "This was just an interesting technology toy/tidbit I ran across.
Metacreations and Minolta have teamed together to develop
what appears to be a modified digital camera that allows you to take
'3D' images. The camera stores/digitizes the image data in such
a way that Metacreations' software can (re)construct a 3D model
of objects in the picture along with their textures. While mildly neat in itself, it would be interesting to consider
how far you could develop this technology. Could you do real-time
3D capture using a video camera with these techniques (and sufficient
computer power)?"
Let me tell you: This is some awesome shit.
DISCLAIMER: I no longer work for the company, and the following information is gathered from what I have found over the web and from asking people at the company.
History
The technology was initially pondered by a Russian physicist - Alexander "Sasha" Migdal. He came to the United States a long time ago and did work at Princeton University in various fields (mostly in physics, I believe). After a while, he formed a company with his friends from Russia called "Real Time Geometry." Sasha is an insanely smart man. A little eccentric, but smart.
RTG pioneered the technique of being able to dynamically set the number of polygons you want to render a model with. For instance, you could have a massive model of a helicopter render with full detail when it's close to the camera, and have it render with less detail when it's far away from the camera. This technology is now part of MetaCreations' MetaStream.
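To make the distance-based polygon budget concrete, here is a minimal sketch. It is not how MetaStream actually works (the post doesn't say); the function name, the inverse-square falloff, and the default thresholds are all assumptions picked for illustration.

```python
def polygon_budget(full_count, distance, full_detail_distance=10.0, min_count=200):
    """Return how many polygons to render at a given camera distance.

    Hypothetical rule: keep full detail up close, then shrink the budget
    with the inverse square of the distance (roughly how the model's
    on-screen area shrinks), never dropping below min_count.
    """
    if distance <= full_detail_distance:
        return full_count
    scale = (full_detail_distance / distance) ** 2
    return max(min_count, int(full_count * scale))


# A 100,000-polygon helicopter at three distances.
for d in (5.0, 20.0, 1000.0):
    print(d, polygon_budget(100000, d))
```

A real system would also need a way to actually produce the reduced mesh (some form of mesh simplification); this only picks the target count.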
The company was bought out by "MetaCreations" in (I think) 1997 (or thereabouts). MetaCreations was the merger of MetaTools and Fractal Design.
It was after this that the technology we're discussing now began to be implemented.
Process
Although I have not performed the procedure myself, I have seen it done on many types of objects, from pottery to toys to PEOPLE'S FACES.
The object is placed in front of a black background with several lights around it with the aim at neutral lighting. The black background prevents a shadow from being interpreted as part of the object. The camera is usually placed about 3 meters away (not precise, just average or so...). For "in studio" objects, a laser was used to accurately calculate the distance to the subject. The technology has been refined a lot (obviously) and just when I was about to leave, they introduced this deal with Minolta in an All-Hands meeting.
Now, I see a post of 5, Interesting that states that one of the shots is a piece of pottery - how simple is that!
Well, it's not. The reason?? TEXTURES. The 3D imaging RECREATES the model so as to preserve not *only* the size/shape of the object, but ALSO the *look* of the object under certain circumstances - for instance, certain lighting environments.
That's why a pot ain't so easy. While the shape might be "easy" (you try extrapolating 3D data from 2D data), the texturing is even more difficult. I can remember seeing models where everything was great, except maybe when you look into the pot and you see a hole at the bottom and you think "Hurm, we hadn't thought of that, had we?"
In any case, that's the process. Now how does dynamic resolution and 3D imaging come together? Simple: The fact is that many objects (people for instance) have *curved surfaces*. Within the realm of polygonal 3D modelling, you *have* to throw out data, it's just not gonna all fit. While the camera/software figures out the 3D models, it is very difficult to render them in real time... MetaStream does a wonderful job of rendering huge objects in real time, even on a shitty computer.
Now, in this wonderful time of the web and stuff, MetaCreations (I think) is positioning this software/hardware for two things:
Of course, that means you need small files - full 3D models and textures the size of a GIF or two? Yep. It's pretty cool stuff. From what I know, it's a wavelet compression technique that compresses both the textures and the model data. Most models (of people's faces, toys, pots, whatever) are in between 50 and 200 K, which is pretty remarkable for the quality that you get from MetaStream.
Several web sites have already implemented this technology, and make quite good use of it. Here's a sampling:
Sorry for the long post, but I hope I cleared up some information.
PS - Hi to Sasha, Victoria, Dmitry, Victor, Baga and everyone else!
You should never take life too seriously - You'll never get out of it alive.
I have worked with this package as well as other competing products and am quite impressed with the level of quality vs. effort required for this form of 3D imaging. The funny thing is I am sitting in a meeting as I type presenting the various options for this form of image capture as my company's technology lead, and I just was doing my rounds (on my new WaveLAN 11MB Silver wireless card!) on the Web when I saw this article!
If you want Linux compatibility, plus a much better, less restrictive product overall, check out Eyetronics... in fact, last time I spoke to a developer there he said that Linux was his PREFERRED platform for his software. The main benefits of Eyetronics' technology are the following:
If you want more info feel free to ask.. I've demoed and used most of the available 3D capture technologies, and for non-critical work (engineering, etc.) this new breed of photographic solutions seems the best. And there aren't as many kinks or hitches as you might think, you'd be surprised what these guys have done with image- and contour-analysis and a lot of intelligence on their part.
I may be wrong, but I believe Eyetronics started as a university project in Sweden or Denmark... probably Denmark.
Btw, before I get flamed for being a fraud, I work for a market-leader in ecommerce-oriented 3D imaging, but this is as close to my real identity as I can post under. If you can figure out who I work for, bully on you, but it isn't "0110".
:)
Link 3D photo technology with this and those Natalie Portman statues could become a reality.
Add a trouser-full of grits (don't forget to tie-off the cuffs first) and you've got yourself a party!
; )
**>>BELCH
Also, I read about 3DBuilder a long time ago; it looks semi-automated.
Some more random digging uncovered an index of VR research. A month or two ago I was looking for information on panoramic photography and I read a summary of someone's thesis (IIRC); he automated the compilation of the best affine transformations on frame-to-frame video, then statistically analyzed those transformations to yield great detail. I can't find that right now. :(
I dunno, probably interferometry techniques applied to diffuse reflection. Unlikely to ever be as accurate as in the movie, however. (Accuracy in "unseen" regions would also degrade exponentially with distance.)
Probably be a matter of iteratively refining a volumetric model from initial heuristic "guesses", I suppose. Most wrong guesses would be detectable as the results would visibly lose self-consistency after the first few iterations.
(also a good way to detect doctored images, although I daresay there are easier and more efficient ways of detecting those)
DNA just wants to be free...
I've heard of software that works much like they describe on their site. The software I've heard of has been around for a while, and the way it works is you have to specify "tacks" on the same point of the object on each photograph. The software can then solve for a 3D coordinate for each of those tacks.
Essentially, you have a vector go from the "eye point" in each photo through each tack in that photo. You then solve for where the vectors for each tack come as close as possible to intersecting at the same point, by finding a least squares solution to a system of linear equations. This is a bit of an oversimplification, because the position of the "eye" in each photo is a variable as well.
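A minimal sketch of that least-squares step, under the simplifying assumption that the eye positions are already known (which, as noted, they usually aren't). Each ray is an (origin, direction) pair; the function names are made up for this example and don't come from any particular product.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(A, b):
    """Solve the 3x3 linear system A x = b with Cramer's rule."""
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("rays are parallel; no unique closest point")
    result = []
    for col in range(3):
        m = [row[:] for row in A]
        for r in range(3):
            m[r][col] = b[r]
        result.append(det3(m) / d)
    return result


def triangulate(rays):
    """Point minimizing the summed squared distance to all rays.

    For a ray with origin o and unit direction d, the squared distance
    from p is |(I - d d^T)(p - o)|^2. Setting the gradient of the sum
    to zero gives the linear system  sum(I - d d^T) p = sum(I - d d^T) o.
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for origin, direction in rays:
        norm = sum(c * c for c in direction) ** 0.5
        d = [c / norm for c in direction]
        for i in range(3):
            for j in range(3):
                m = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += m
                b[i] += m * origin[j]
    return solve3(A, b)


# Three "eye points" all looking at a tack placed at (1, 2, 3).
rays = [
    ((0.0, 0.0, 0.0), (1.0, 2.0, 3.0)),
    ((4.0, 0.0, 0.0), (-3.0, 2.0, 3.0)),
    ((0.0, 5.0, 0.0), (1.0, -3.0, 3.0)),
]
point = triangulate(rays)
```

With noisy tacks the rays won't meet exactly, and the same solve returns the point closest to all of them. Solving for the camera positions at the same time is a much harder nonlinear problem that this sketch doesn't attempt.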
Textures are generated by actually taking pieces of each photo between the tacks, scaling and stretching them appropriately, and then blending them together.
It's all a pretty neat process, but to use in a real-time setting you'd need multiple cameras, and some sort of AI that would place the tacks. As it is, the process has a fairly large manual component. Doing that with every frame of a video would be extremely tedious. (But it could probably be simplified by the fact that each "tack" probably doesn't move very much from frame to frame.)
I can't figure out what that extra piece of hardware is for though. This type of software normally works with ordinary photos. Even scanned polaroids or hand-drawn artwork (if reasonably accurate) would work. Does anyone know what that hardware does? Does it actually somehow scan "depth" information? If so, how?
Marsokhod is also the name of the rover we bought from the Russians, which I worked on two summers ago at NASA/AMES. this one hasn't gone anywhere except for field testing -- except for in the nicely realistic 3d models of Mars on the nice SGI boxen.
in Russian, Marsokhod just means "Mars rover," just like their lunar rovers, which were named "lunokhod" 1 and 2, so I'm not all that surprised it's not unique.
Lea
very simple -- more pictures... there are systems that are pretty good at inferring what they can't see, but for a high level of detail in spots you can't see from one angle, you really need to get them from another...
it's something you don't complain about in 2d... can't really complain about it in 3d either!
:)
Lea
There are a lot of applications that are orders of magnitude more expensive than the hardware and OS that they run on.
It appears that you need to take multiple pictures for the effect. That seems to kill any "action" images right there.
my GAF viewmaster, gosh, with the Grand Canyon and Aircraft carrier reels I love it!
Agent 32
try { do() || do_not(); } catch (JediException err) { yoda(err); }
There is also a wide range of active techniques. In those techniques, you don't just use a camera, but you also use some kind of light source. Structured light-based 3D recovery can be done in real time and there are lots of approaches to that as well. You can think of active autofocus systems, found on many point-and-shoot cameras, as structured light systems.
Both software-only and software/hardware combinations for 3D shape recovery from images are commercially available, and some are also available for free as research code. Still, don't expect this to be easy or completely automated.
is a neat toy but is basically a rework of the engine inside Canoma (produced by MetaCreations). It takes what would be a 2D image and interpolates the dimensions by calculating shadows and lightpaths and such. You can use Canoma to make a 3D rendering of your living room from a photograph. I think they made it a little cooler by taking multiple pictures of something and combining all of them to do real good modeling. I'm sure a printer could be developed to print images holographically but I'm not sure if this is really viable for 3D apps until a good 3D display is developed.
I'm a loner Dottie, a Rebel.
Check out www.stereovision.net for more info.
Better yet, check out www.stereovision.net for more info.
This technology doesn't sound applicable to video techniques. Minolta's FAQ indicates that "at least" six shots of an object are typically necessary to build a 3D image from it. It sounds like the camera takes 2D photographs from different angles, and Metastream's software interpolates from those photographs to determine the object's solid structure.
It sounds unlikely to be a useful technique to apply to video; you'd have to have six videocameras recording the same scene from different angles. I'm not even sure that the state of the art begins to touch the problems of recording video in three dimensions, storing the data, and playing it back.
I wouldn't hold my breath waiting for Quake environments built from this technology either. They're building a 3D model of an object based on external photographs; doing the same thing with internal photographs is a very different ballgame.
The method usually used to generate 3D models from multiple photos is called photogrammetry - and is used in aerial imaging to extract elevation from multiple shots of terrain. It is explained in just about any good cartography textbook.
Essentially, for consumer use, the camera is flipped horizontal with the subject in front of it. Everything proceeds according to Zagadka's description, pretty much.
Incidentally, there is an old copy of Byte magazine from the late 1970s describing how to extract the 3D information from multiple shots, with included BASIC code to calculate the 3D vertices from the 2D inputs. Pretty cool - crazy though that only NOW are we actually using this at the consumer level, even though an article in a well known computer magazine has languished for nigh 20 years!
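For the simplest version of the photogrammetry idea above, take two identical cameras side by side, pointing the same way. A point's depth then follows from how far it shifts between the two images (its disparity): Z = f * B / disparity. The sketch below is not the Byte code, which isn't reproduced here; the function name and the pinhole-camera assumptions (pixel coordinates measured from the image center, focal length in pixels) are mine.

```python
def point_from_stereo(x_left, x_right, y, focal_px, baseline):
    """Recover a 3D point from one matched pixel pair.

    x_left, x_right: horizontal pixel offsets from each image's center.
    y:               vertical pixel offset (same in both images).
    focal_px:        focal length expressed in pixels.
    baseline:        distance between the two cameras; the result is in
                     the same unit.
    Returns (X, Y, Z) relative to the left camera.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("point must be in front of both cameras")
    z = focal_px * baseline / disparity
    return (x_left * z / focal_px, y * z / focal_px, z)


# Cameras 0.1 m apart, focal length 800 px, feature shifted by 20 px.
vertex = point_from_stereo(40.0, 20.0, 10.0, focal_px=800.0, baseline=0.1)
```

Because depth is inversely proportional to disparity, a one-pixel matching error costs far more depth accuracy for distant points than for near ones.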
Reason is the Path to God - Anon
Rupert.
Um... what the hell?
First there's that post that gets moderated up to 6... now this one's at -5. What's going on?
--
Win dain a lotica, en vai tu ri silota
I guess this camera and software combo are the commercialization of this research. Go there if you want to know how it all works and how cool it can be.
Dave
--------
WWGD? (What Would Goku Do?)
Imagine the potential for videogames. If this is capable of producing a textured 3d model, just think how realistic the caves in Tomb Raider MCMLI could look. Or how easy it would be to create a 3d model of a room, complete with textured furniture, walls, etc.. Take a few shots from a couple of different angles and you have a photorealistic model. I wonder.. Could I virtually paste myself into the video feed from my office and look present?
.sig: Now legally binding!
It would take some big schpense to develop and manufacture, but it seems possible...
Take a camera, and give it a bit of sonar-like ability to determine the distance between the camera and whatever scene you have it aimed at. The camera builds a wire mesh of the objects in front of it, then breaks the image data down into textures, which it then wraps the mesh with.
Sonar is of course out of the question, but I'm sure there's better technology out there. I mean, I just come up with 'em, I don't implement 'em.
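The mesh-building half of that idea is easy to sketch if you assume the camera already hands you a grid of distances, one per sample point. Everything here is hypothetical (the function name, the grid layout, and the use of grid indices as x/y coordinates); it only shows how a depth grid turns into vertices and triangles.

```python
def depth_grid_to_mesh(depth):
    """Turn a rectangular grid of depth samples into a triangle mesh.

    depth: list of rows, each a list of distances from the camera.
    Returns (vertices, triangles): vertices are (x, y, z) tuples using
    the grid column and row as x and y; triangles are index triples
    into the vertex list, two per grid cell.
    """
    rows, cols = len(depth), len(depth[0])
    vertices = [(float(c), float(r), float(depth[r][c]))
                for r in range(rows) for c in range(cols)]
    triangles = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            i = r * cols + c  # top-left corner of this cell
            triangles.append((i, i + 1, i + cols))
            triangles.append((i + 1, i + cols + 1, i + cols))
    return vertices, triangles


vertices, triangles = depth_grid_to_mesh([
    [2.0, 2.1, 2.3],
    [2.0, 1.9, 2.2],
])
```

Texturing would then map each vertex back to its pixel in the photo. Like any single-viewpoint scan, this only covers the side of the object facing the camera.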
We reported about this a long time ago on geeknews (http://geeknews.net/cgi-bin/fooboard.pl?944436957 ), but it's still cool nonetheless. The bigger version of this is really cool. I can see this camera really being used in new Quake maps. The only problem would be that you would need to tone down the poly count.. Just read the link above that I added and you'll see what we had to say.
I seriously doubt that the quake market could sustain a company's entire line of digital cameras. I would like to know a few things.
1. The cost? I don't want to have to mortgage my house just to pay for one.
2. Interface? I would like this to just plug into a standard serial or parallel port. Failing that perhaps something like just taking the film and allowing for floppy film based things.
3. Linux compatibility? I could always use a machine that actually worked with linux and that worked with linux apps. I don't want to buy either an expensive commercial 3d app or to have to upgrade my pc just to use this.
Slashdot social engineering at its finest
Yeah, more useless tech. Like when the first PCs came out. They were useless then for the average user. (Some would argue they still are useless to the average user...) This is simply technology which needs to mature. There are many areas where this could have a big impact once they have it up to speed.
Except this appears to allow you to create a digitized 3-D surface, rather than just a stereoscopic image.
Theoretically, the output of this camera would allow you to use the image in a rendering application to produce an "actor" in a setting. You can't do that with a disposable camera stereoscopic image without additional work, information, and calculation.
The little guy just ain't getting it, is he?
System Requirements: Windows 95 OSR2 (Ver. 4.00.950b) or later, Windows 98, Windows NT 4.0.
I don't use windows much, and not at all at home. So this new "technology" isn't of much use to me.
--- Grow a pair, liberals... stop letting the Republicans bully you!
I think perhaps you're being too critical. The camera is obviously not meant to be the end-all be-all of 3d modeling. It's meant to provide a relatively cheap, very simple method of creating 3d models of real world objects. If someone really wants high quality, there are plenty of other (far more expensive) options. But for a small business, this is a great way of setting their products apart from the rest.
I worked for a company that made sensors and parts for many research and engineering corporations. They wanted to be able to put 3d models of their products on the CD version of their catalog. With hundreds of thousands of items to be modeled, however, they couldn't afford the cost of either having it professionally scanned or hiring a computer modeler. They would love a camera like this.
Well, the site says 95, 98, or NT. Windows, that is.
Not linux...
Why? Because MetaCreations does not develop Linux software yet. And the camera works with MetaCreations software.
So, let's get some of you people out there telling MetaCreations that you would buy their software if they released a Linux version.
BUY?
Yep. MetaCreations is not going to open source their software, because they make lots of money off of it. But, they might just make a linux version. MetaCreations' stuff is already very stable and useable, and a linux port would probably inherit those features. A Linux version would be great...
In other related news, somewhere on the BeOS site it says that MetaCreations is porting some of their stuff (no specifics and this may just be rumour) to BeOS. Once that hurdle is jumped, a port to Linux shouldn't be too hard at all.
--
Talon Karrde
Last month's Stereo3D-newsletter mentions a digital 3D camera that has been around since 1997. There's a cool picture too.
But what you would get would be a 3d view of one side of the object. For example, if you took a picture of a soccer ball, it would tell you that the particular angle you were looking at it from was rounded, but it could well stretch off behind like a cylinder or pipe.
:o)
The camera could well be used from different sides (eg take a pic from the top and bottom, then the front, back and left sides), but since software exists that lets you do this already (including Metacreations' own Canoma software, but it's not too good) it seems a bit gimmicky. Then again, I'm sure it would help novice 3D modellers get decent-looking objects if the detection mechanism was of a high enough resolution to capture a lot of detail.
Another example: it might well be excellent for capturing human faces, which is a tricky £$%£$^%$ to make by hand. So all in all, could be useful for some people some of the time. Bit like most things.
Game dev and music blog
NASA does it too. the mars rover Marsokhod has stereoscopic vision, and some very nice SGIs make 3d models out of it that you can move Sojourner and other rovers around in...
Lea
Canoma is a re-implementation of some work done at U.C. Berkeley in the mid-90s. The Berkeley group liked to do big things like buildings, and modelled the central part of the Berkeley campus. They got their aerial photographs using a camera on a kite; there's an architecture prof at Berkeley who's developed good techniques for doing this. Much cheaper than a helicopter.
Both Canoma and Metaflash are semi-automatic systems. The user has to manually identify corresponding points and edges between multiple images. This can be a lot of work. One more generation and somebody will have this fully automated.
It has a range of 90cm. At 20cm it has an accuracy discrepancy of 1mm. At 90cm it is probably close to 1cm. It can't take pictures of areas larger than 90cm distance.
:P
The screenshots neatly show reconstruction of a simple piece of pottery. Jesus, but if that isn't the simplest 3d object then I must be smoking something.
You'll get better stereoscopic results taping two $14 disposable cameras together! (I've done it, it works, just get the focal distance right).
Another example of useless technology. And I cringe at all the thousands of useless vertices this solution will create in 3d models. No thanks!
Oh, and note the accuracy discrepancy of 1mm is from a photo of a ping pong ball. Like we all need pictures of perfect round circles.