Ask Slashdot: Tips On 2D To Stereo 3D Conversion?
An anonymous reader writes "I'm interested in converting 2D video to Stereoscopic 3D video, the Red/Cyan Anaglyph type in particular (to ensure compatibility with cardboard Anaglyph glasses). Here are my questions: Which software or algorithms can currently do this, and do it well? Also, are there any 3D TVs on the market that have a high quality 2D-to-3D realtime conversion function in them? And finally, if I were to try and roll my own 2D-to-3D conversion algorithm, where should I start? Which books, websites, blogs or papers should I look at?" I'd never even thought about this as a possibility; now I see there are some tutorials available; if you've done it, though, what sort of results did you get? And any tips for those using Linux?
Don't do it.
Give me Classic Slashdot or give me death!
Another example of information simply not being there...
Looks pretty bad IMO...
we all were suckered. we tried it, hated it and moved on.
each time they try to re-invent this, it's still just an effects gimmick.
you'll soon grow bored.
don't invest anything in this. it's a recurring cash grab due to industry boredom.
and as a fulltime glasses wearer, I'd never be caught dead with cardboard glasses over my regular ones. an absurd concept if there ever was one.
--
"It is now safe to switch off your computer."
Don't do it.
Really, just don't do it, please!
... seems unlikely. You need to solve the correspondence problem for every frame, which is time consuming.
http://xkcd.com/865/
I'm interested in converting 2D video to Stereoscopic 3D video
George Lucas, is that you?
A friend of mine used to work for a French special effects company and he had to work on this. He told me that this is basically a world of pain and it produces great piles of smoking shit. It just sucks, even when done properly by highly trained people. Can you imagine making 3D out of a 2D tree? Making every background 3D, or properly cutting out the character to get the desired effect?
It sucks, it's mostly manual, get over it.
Stupidity is the root of all evil.
From a single 2D image it's impossible to get a 3D image. You need at least two 2D images, plus some work to calculate distances from them.
Well, it's possible to do Photoshop-style work and add third-dimension information to an image by dividing it into layers and giving a distance to each (truly Hollywood hi-tech for selling poor 3D films at low cost), but the 3D effect is unreal.
That's you, isn't it George Lucas?
Dammit, leave the original trilogy alone! The digital "remaster" was insulting enough!
An enigma, wrapped in a riddle, shrouded in bacon and cheese
This is still an area of active research. For estimating 3d structure from video, "Shape and motion from image streams under orthography: a factorization method" is a good place to start. When you have a still camera and still scene, the best you can do is something like "Automatic photo pop-up." Unless you are thinking of some kind of semi-supervised approach?
My 2 cents as a researcher in computer vision:
The problem is underconstrained. There is just no stable, proper way to extract depth from a single static image. That means still or almost still scenes can never be converted properly.
What can be done (as in 'it might work sometimes, but will still give crappy results in general') is to extract depth between two frames from the movement of the camera, called structure from motion: find the epipolar configuration, i.e. the relative position, of the camera between the two frames using many feature points; then use corresponding points between both frames and triangulate them. The triangulation gives you 3D coordinates, which can be converted to a stereo image pair.
However, for fast movements, the scene will not be static between two frames, which messes up the triangulation. You need to additionally guess the movement to get rid of this, which is extremely tricky for non-linear camera and object movements, or for deformations. Also, your depth map will not be dense due to occlusion. All this leads to heavy, ugly artifacts. And again, remember, for no or very small camera movements, all this will fail - you cannot triangulate a point from one viewpoint.
An automatic conversion is thus destined to give bad results. It might, however, be a starting point for manual post-processing.
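To make the triangulation step above a bit more concrete, here is a minimal NumPy sketch of the standard linear (DLT) triangulation of one point from two views. It assumes the two 3x4 camera projection matrices are already known (in practice they would come out of the epipolar estimation); the names `triangulate_point` and `project` and the toy camera setup are made up for this example.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen in two views.

    P1, P2: 3x4 projection matrices (assumed known for this sketch).
    x1, x2: (u, v) image coordinates of the same scene point in each view.
    Returns the estimated 3D point as a length-3 array.
    """
    u1, v1 = x1
    u2, v2 = x2
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.array([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with a 3x4 matrix and return (u, v)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Toy setup (hypothetical): identity intrinsics, second camera moved 0.1 units along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

X_true = np.array([0.3, -0.2, 4.0])
X_est = triangulate_point(P1, P2, project(P1, X_true), project(P2, X_true))
```

If both projection matrices are identical (no camera movement), the four constraints collapse to two and the depth is undetermined, which is the "one viewpoint" failure case mentioned above.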
You will want to avoid the old paper red/cyan glasses and go with the slightly more expensive plastic ones that are designed for LCD monitors and TVs. Otherwise be prepared for a LOT of ghosting. Also, NVIDIA makes plastic red/cyan glasses that are designed to fit over regular glasses. You may also need to calibrate your monitor to make sure that red is really red and cyan is really cyan.
I was personally very surprised at how well red/cyan works. Of course the colors get a little muddled, but not as much as I had expected.
I bought these, btw http://www.amazon.com/Glasses-Pro-Ana-movies-Computers/dp/B0036NP3CS/ref=sr_1_1?ie=UTF8&qid=1327421067&sr=8-1
I'm not familiar with the results of the AviSynth filter in the tutorial link above, but the process Hollywood uses is very labor intensive, involving manually rotoscoping objects and projection mapping. Basically you're cutting out objects and repositioning them in 3D space frame by frame. Here's a demo in Nuke (which is a costly piece of software) to give you a better idea of what the process is like.
http://lesterbanks.com/2011/05/using-nuke-for-2d-3d-stereo-conversion/
And even after all that work, films reconverted for 3D with this method tend to look pretty bad. The cardboard cutout effect is jarring.
You can't turn a two-dimensional photograph into 3D because the original has lost all the phase information that conveys needed info (e.g., "depth"). Similarly, you can't restore 2D sound to 3D, because the essential information isn't in the source recording that you'd need to "position" all the sound sources in 3D. In general, you can go from (N+1) to (N) dimensions, but you lose information. That means you can not automatically go from (N-1) dimensions to (N) without restoring that lost information...which wasn't recorded. Therefore, you'd have to synthesize every frame of video/sound to add the missing stuff, and you can't get it automagically, because the (N-1) version simply doesn't have the information you need to make the transformation.
Example. Set up an orchestra with a flutist positioned 20 feet above the main orchestra. 2D mics have picked up all sounds, but they have no sense of where, vertically, each musical instrument is located, because the two (or more) horizontally-dispersed stereo microphones are laterally displaced. You'd have to add microphones that are positioned vertically to gather the phase information for 3D, but your recording has no such information.
Clarification -- Arduino doesn't suck, just paraphrasing the unfortunate mentality of a bunch of posters on this article. It is bewildering to me that on a "news for nerds" site, people are discouraging somebody from undertaking what could turn out to be a cool tech project, even if it is known in advance that the end result isn't going to be "Avatar". And even if the best of 3D is a bomb in the theater, that doesn't mean it isn't a lot of fun to play with, as a school project, etc. I enjoyed messing with this stuff in physics lab in college.
Contra my provocative subject, Arduino is an excellent choice for serious hobbyists. And similarly, there is nothing wrong with playing around with 3D video techniques and even being willing to try rolling one's own algorithm.
Get a (homebrew friendly) life, slashdotters!
(If the OP clarifies that he's working on a big Hollywood title, I'll take this back. Until then...)
I can assure you that personal head mounted 3D displays are not a gimmick. Technology just hasn't caught up yet. Unfortunately, markets have to make the technology viable in order to progress, which often results in markets stagnating the progress a little more than we'd like (in order to maximize profits).
This can be done easily with ffmpeg and ImageMagick, provided you have two video sources. Use ffmpeg in a script to extract a picture sequence from each video, one sequence from the left camera and another from the right. Then, in a bash script using ImageMagick, separate the colour channels of each frame: red from one camera, and green/blue from the other. With the separation done, use ImageMagick again to join the red channel frame from one with the green/blue frame from the other into a new picture sequence. When that sequence is ready, convert it back into video with ffmpeg. Try googling for the ffmpeg and ImageMagick command-line arguments when writing this bash script.
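For what it's worth, the per-frame channel merge described above is only a couple of lines once the frames are loaded as arrays. This is a NumPy sketch rather than the ffmpeg/ImageMagick script itself; `simple_anaglyph` is a made-up name, the frames are assumed to be RGB arrays of equal size, and taking red from the left camera is an assumption (it matches the usual red-lens-over-left-eye glasses).

```python
import numpy as np

def simple_anaglyph(left, right):
    """Merge two HxWx3 uint8 RGB frames into one red/cyan anaglyph frame."""
    if left.shape != right.shape:
        raise ValueError("left and right frames must have the same shape")
    out = right.copy()          # green and blue come from the right view
    out[..., 0] = left[..., 0]  # red comes from the left view
    return out

# Tiny synthetic frames standing in for one extracted frame per camera.
left = np.full((2, 3, 3), 10, dtype=np.uint8)
right = np.full((2, 3, 3), 200, dtype=np.uint8)
frame = simple_anaglyph(left, right)
```

Loading and saving the picture sequence, and turning it back into a video, is left to whatever tools you already use.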
THIS. Somebody mod parent up please.
...gives me a headache, especially the flickering kind.
My Blu-ray player can simulate 3D from any 2D source (Panasonic DMP-BDT210) although I'm not exactly sure how it does it, or how good it looks. (no 3D TV) You might be able to talk someone into connecting one up at your favorite big-box store for you if you acted interested in buying the Blu-ray player, and wanted a demo of its conversion capabilities. This would at least give you a firsthand idea of how it will look to see if YOU think it's worth it.
There was a recent NOVA episode about aerial photo reconnaissance during WWII. To make stereoscopic images, they'd fly the plane straight and level over the target. If they could take multiple pictures with 60% overlap, they could use two adjacent images to make one stereoscopic image that was good enough to tell a ship from a decoy.
Any motion picture where the camera pans side to side gives an opportunity to create a "3d" image. If an object moves across a still camera, you can also derive 3d information. (Also if it spins)
An interesting exercise would be to process a film, and make stereoscopic only what can be done properly, and leave the rest flat. A scene would start out flat, then people and things would begin to jump out at you.
All ideas^H^H^H^H^Hprocesses in this post are Patent Pending. (as well as the process of patenting all postings)
The convert utility in the ImageMagick package does a good job of it with still images. I'd consider dumping your frames out as a series of images, running the convert utility on them, and then re-creating your movie.
I've also thought about taking that code in convert, merging it into VLC, and setting up VLC to grab from 2 cameras at once... with enough CPU and RAM, it could come very close to a real-time 3D movie.
Don't blame me, I voted for Kodos
You might want to look into luminosity based research. The brightness at each pixel may contain some information about the angle of the surface with respect to the camera and a light source. At some point that looked potentially promising. But of course the technique can fail pretty easily. Much of the work I've seen is based on trying to figure out how our brains do this all the time. Try closing one eye, see how 3D the world still looks (better than most 3D movies). You are going to have a tough challenge to beat that. But that doesn't mean it's not worth trying.
1. Display 2D images on a flat panel TV facing you
2. Spin the display 45 degrees so that one edge is nearer to you than the other edge
3. That's it. Notice how the pixels on one side are closer to you while the ones on the opposite edge are further away from you spatially: your display is in 3D now.
There is no way to do this and have it be "good". The big studios with their gobs of money have been insisting on doing this, and it always looks like crap. You're not going to get any better result with cheap gear and software.
2d->3d converted media is much more likely to make people feel sick or get headaches from the video than media recorded directly in 3d. There are two reasons for this. Firstly, because you lack some information. For instance if you look at a box that is obscuring your vision of the objects behind it in the real world, each eye has different information based on its perspective. (Try looking at something with one eye, then the other, and look at what changes behind the object). 2d media will only have the information for one eye, and you'll have to make up/fake out that second eye. Secondly, you're trying to fake out the depth cues and it's very hard to do right because you often don't have the depth buffer necessary to do it right.
Standard Template:
I want to do _something_, but I do not know how to do _something_, how do I do _something_, provided I don't know how to or do not want to waste my time using Google.
Then a barrage of responses by people that don't really know how to do _something_, but surprisingly have a lot of opinion about _something_.
And then, of course, a smart *ss like myself pointing this out.
I haven't thought of anything clever to put here, but then again most of you haven't either.
I assumed he was converting a single video source into 3D - no mean feat.
If he has two, left and right, video sources he can place the video side by side and many 3D TVs will convert it on the fly, as will YouTube.
In a few words: if you only have a 2D video, then it is a very hard computer vision problem, that has not been solved on the research side.
There is an active benchmark of disparity estimation algorithms (full bibliography at the end of the page). Those algorithms take two pictures and estimate a depth image. From this depth image, it is possible to reconstruct the scene in 3D (but you cannot see what's behind objects). From my experience, this class of algorithms does quite a bad job with real-life images, and has not been applied to video at all.
I've been using optical flows (see a related benchmark) for the development of an Android app (3D Camera) that converts pictures from 2D to 3D, without glasses (check it out!). The optical flow is a more general version of depth estimation (i.e. in any direction, not just left to right motion). It has been applied to 3D conversion of videos with relative success; I can search for references if you are interested.
From my knowledge & experience, optical flows are the state of the art algorithms to convert 2D pictures/videos to 3D, but they are quite computationally intensive.
Despite what some PR hustling excitables might claim, stereoscopic conversion cannot be effectively automated at this time. Do people try it? Yes. Does it generate watchable results? Sometimes by accident, yes.
The thing is, a stereoscopic conversion done painstakingly frame by frame by a highly skilled compositing artist looks pretty bad. Any automated conversion process will be orders of magnitude worse.
What you need is a ton of really excellent rotoscoping (I send my jobs out to work farms in Russia) to separate all of the elements, and then a compositing application like After Effects or Nuke to offset the various layers along the Z (while scaling to retain size coherency). Now the fun part: fill in all the missing pixels your offset has made visible! A combination of displacement maps, cloning and hand-painted details should do the trick (this is the part that separates the men from the boys).
Your mileage may vary, but in ideal circumstances this is still a pretty hard trick to pull off without inducing headaches or making everything on screen look like cardboard flats.
I'm waiting for the 1D to 2D algorithms to be perfected. I have this 1D sketch of the battle of Bull Run that I'd really like to get converted. Here's the 1D version: __________________________ ________________ ___ ____________ Hopefully Slashdot doesn't get a takedown notice. What will be really awesome is when all of these work together, so I can convert that 1D drawing into a 4D movie!
Monoprice sells a 2D to 3D HDTV/DLP Converter (Frame Sequential, Side by Side, and Red/Cyan) w/ Remote for $95
Here are the steps:
With a normal 2D camera
you need to
1 Look up camera calibration (intrinsic and extrinsic)
2 Get some images
3 Calibrate using objects whose size and plane you know.
With the calibration you can estimate 3D.
Get OpenCV (it's the best free thing around); it has some tools for this.
With kinect:
Get nestk and play with it.
Advice: it's a pain. 2D is a reduced dimension, so you cannot recover the actual 3D, only estimate it, and you will have to make assumptions.
I won't give more details because I hate writing from my phone.
Hope it helps
Leo
The above comments assume that the IMAGE the OP wants to convert is 2d. What if he means that he's got a 2d DISPLAY, but a 3d image he wants on it?
That is possible. Sure he can't upgrade a single picture to a 3-d scene. Anyone with a brain can see that, and they wouldn't be posting here. So, assume he's got the data he needs already, but nothing to display it on. GO!
My recommendation is Adobe's gamma adjustment tool. It's packaged with all their products, and usually it's meant to color correct a monitor to give an accurate picture of what an artist's work will look like across the various media they use (paper, glossy, screen, etc.). This is useful to you, the 3D guy, because the anaglyph glasses work through the use of colors.
The red side of the glasses filters out all green. The cyan side filters out all red. Either eye will get plenty of blue, so the first thing is to make sure you compensate for that by turning off the blue channel completely. That will ensure that only the colors your glasses can filter out are going to be displayed.
The next thing is to take both video streams (right and left) and convert them both to mono-chrome. There are a number of ways to do that, so do your research. Then, copy the red channel from one stream and use it to replace the red channel on the other. Delete the blue channel entirely.
Now you have a 3D anaglyph video playing on a 2d screen. PRESTO!
You can't just run a 2D video through an algorithm and magically get a 3D video.
You have to run the video through a compositing program (think Photoshop for video) and use that to chop and mask each scene and introduce parallax effects. Then (if your compositing program supports 3D space) you output the streams from two different virtual cameras so that you have 2 final videos that are synced and are from two different angles (one for each eye). At that point, it's trivial to encode them to whichever 3D video container format you want to deliver as your final output.
If you really want to learn how to do this, try it first using stills with Photoshop or the Gimp. Once you understand what's involved for creating a believable 3D scene out of a 2D image, you're ready to start learning how to use a video compositing app to do the same thing.
Be prepared to spend a lot of time on this.
I'm out of my mind right now, but feel free to leave a message.....
This is basically impossible, or will have horrible artifacts.
The current crop of movies with 2D-to-3D conversions still took significant human and artistic effort to achieve, even though the results are mediocre. For a given frame, for every pixel in 2D, SOMETHING has to decide how far away the subject depicted must be. That is, it has to INVENT the third dimensional value. Then this value is used to calculate two new 2D frames with parallax involved.
There's no computational way to achieve this INVENTION of the depth value with an arbitrary photograph, though. Any computational model will have big gaps in its ability. With enough computing power, you can perhaps identify visual markers in neighboring frames (say, the corner of a lampshade), solve for where the camera position must be relative to the markers, then use the depth of the solved markers to base all the other pixels (say, the lampshade versus the drapes). But that (1) takes significant solver time now, (2) requires a lot of hand-adjustments to discard inappropriate markers that upset the solver process with bad results, and (3) won't find anywhere near enough quality markers across the whole frame in fast-moving action scenes to fill in the rest of the data.
Some people get ill with the best 3D out there, others get ill as the quality of the 3D information degrades. The inconsistent results of any realtime method would likely be epilepsy- and nausea-inducing in a matter of seconds.
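The last step mentioned above (using an invented per-pixel depth to calculate new frames with parallax) can be sketched in a few lines. This is a toy forward-warp under my own assumptions: the depth has already been turned into an integer horizontal disparity per pixel, larger disparity means nearer, and `shift_view` is a made-up name. It also shows where the holes come from.

```python
import numpy as np

def shift_view(image, disparity):
    """Forward-warp an HxWx3 image horizontally by a per-pixel integer disparity.

    Returns (warped, filled). filled is an HxW bool mask; False marks holes,
    i.e. pixels of the new view that the source frame has no data for.
    """
    h, w = disparity.shape
    warped = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)
    best = np.full((h, w), -np.inf)  # largest disparity written so far
    for y in range(h):
        for x in range(w):
            d = int(disparity[y, x])
            nx = x + d
            # When two source pixels land on the same target, the nearer one wins.
            if 0 <= nx < w and d > best[y, nx]:
                warped[y, nx] = image[y, x]
                best[y, nx] = d
                filled[y, nx] = True
    return warped, filled

# One-row example: a single "foreground" pixel (value 3) with disparity 2.
image = np.arange(1, 7, dtype=np.uint8).reshape(1, 6, 1).repeat(3, axis=2)
disparity = np.array([[0, 0, 2, 0, 0, 0]])
warped, filled = shift_view(image, disparity)
```

In the example the foreground pixel moves two columns to the right and covers the background there, and the column it left behind stays unfilled. Those unfilled pixels are exactly the part that has to be painted in or guessed.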
If you're creating a 3D master, it's better to render out to separate left and right streams, then use post production to convert it to whatever presentation format you need, anaglyph, R/L side by side or over/under, Field Sequential etc... If you master straight to anaglyph you're stuck with anaglyph. As for the actual 3D conversion itself, welcome to a whole world of rotoscope hurt (not impossible, but close).. unless you start from the beginning with a stereo camera rig (difficult) or do everything in a 3D rendering app and set up a standard stereo camera rig in that (easy).
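A small sketch of why separate left/right masters are the flexible choice: the packed presentation formats are trivial to derive from them afterwards. This assumes frames as NumPy arrays of equal size; the function names are made up, the left-first ordering is an assumption, and no half-resolution squeeze is applied.

```python
import numpy as np

def side_by_side(left, right):
    """Pack two HxWx3 frames into one Hx(2W)x3 frame, left view first."""
    return np.concatenate([left, right], axis=1)

def over_under(left, right):
    """Pack two HxWx3 frames into one (2H)xWx3 frame, left view on top."""
    return np.concatenate([left, right], axis=0)

# Stand-in frames: 4 rows by 6 columns, RGB.
left = np.zeros((4, 6, 3), dtype=np.uint8)
right = np.full((4, 6, 3), 255, dtype=np.uint8)

sbs = side_by_side(left, right)
ou = over_under(left, right)
```

Going the other way, from a finished anaglyph back to clean left and right views, is not possible in general, because each view has already lost some of its colour channels.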
Yes, 3D is a bad idea, but that's not the question. No, there is no affordable 3D conversion software. If you are after converting home movies to 3D, there are consumer grade cameras available. Realize that most systems involve halving frames, so if the camera boasts 1080P, read the fine print, because the effective resolution is half that. As far as converting old movies goes, there's nothing out there. I'm fairly sure I could produce a 3D shot with After Effects given time, but to do a whole movie that way would be insanity.
As far as a passive solution goes, it's impossible if the camera is locked off. You'd be asking the software to make creative judgements. Even the old colorizing software required an operator to make judgement calls on how to color different parts. If they spent proper time on it the results were impressive, but more often than not they rushed it through and the results were appalling.
Until someone comes up with a 48 frame or 60 frame system, 3D will continue to be obnoxious. The worst of the lot are action films that try to use narrow shutters to reduce motion blur. In a normal film it causes strobing. In a 3D film it's agonizing to watch. The action scenes strobed so much in the new Underworld movie that half the time I couldn't tell what was going on. Why did I go to the 3D version? The local theater was only playing the standard version two times a day. Often there is only the 3D version these days, which artificially creates demand.
They'll be lucky if 3D lasts another 5 years before people stop going because a film is in 3D. In 10 years few if any films will be made in 3D. The only thing keeping it alive now is the fantasy that the studios can stump the pirates who are filming screens. Personally I'd prefer walking through a full body scanner before being forced to watch most 3D movies.
I have slashdot as an rss feed and when i saw 'convert 2D to 3D', my otaku mind just....
*sigh*
A friend of mine (former CEO of a startup I founded) asked me to write one.
He called and kept offering more each time. I actually spent some time investigating this and decided that it was a good way to give myself a stroke.
It's hard enough implementing and getting things right when you know what to do; with 2D to 3D there isn't even a clear algorithmic method to use, there are few papers, and there are no examples of a good automated conversion. DDD seems about the best.
I must admit I've seen some decent human-with-software-assist work do a surprisingly good job, but even that isn't nearly as good as a 3D camera or rendering CGI directly in 3D.
John L. Sokol
videotechnology.com
I am always doing that which I can not do, in order that I may learn how to do it. - Pablo Picasso
You don't simply want to filter the channels red vs green/blue. That creates terrible ghosting. Instead look up the Dubois algorithm: it's a linear projection from a 6D colorspace to a 3D colorspace, optimized for minimal ghosting using MSE. Finished matrices exist for red/cyan, green/magenta and amber/blue, available from Dubois' homepage. Recently used this for a project, works great.
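To show what "a linear projection from 6D to 3D" means in code: each output pixel is one 3x3 matrix times the left RGB plus another 3x3 matrix times the right RGB. The sketch below does not include the actual Dubois coefficients (take those from his published matrices); the placeholder matrices just reproduce the naive channel filter, and `linear_anaglyph` is a made-up name.

```python
import numpy as np

def linear_anaglyph(left, right, m_left, m_right):
    """Combine two views with a pair of 3x3 colour matrices.

    left, right: HxWx3 float RGB frames with values in [0, 1].
    m_left, m_right: 3x3 matrices; together they form the 3x6 projection.
    """
    out = left @ m_left.T + right @ m_right.T
    # Optimised matrices can produce values slightly out of range, so clip.
    return np.clip(out, 0.0, 1.0)

# Placeholder matrices (NOT the Dubois values): red from left, green/blue from right.
NAIVE_LEFT = np.diag([1.0, 0.0, 0.0])
NAIVE_RIGHT = np.diag([0.0, 1.0, 1.0])

left = np.full((2, 2, 3), 0.2)
right = np.full((2, 2, 3), 0.7)
frame = linear_anaglyph(left, right, NAIVE_LEFT, NAIVE_RIGHT)
```

Swapping in optimised matrices changes only the two constants; the per-pixel work stays the same, which is why this is cheap enough to run on every frame.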
Found some info here:
http://www.3dcombine.com/conversion.html
2D to 3D conversion is the process whereby existing 2D content is converted to 3D. There are a number of different methods that can be employed, though none will produce the same effect as recording in 3D in the first place. There are a number of reasons for this. A key issue is that some information is missing. Try looking at an object in the distance that is partially obscured by one nearby. Close each eye in turn. You'll see that more of the background is visible in one eye than the other. If you only had the view from the more obscured eye, the extra background is missing and would have to be extrapolated (invented) when creating the missing eye view...
You can try DVDFab from Fentao and see if that works for you: http://www.dvdfab.com/
I work in post-production, and while some of the stereo-handling algorithms are impressive from a technical point of view (like the stuff in Eyeon Dimension and The Foundry's Ocula), and while I think stereo 3D is here to stay for video games (at least after consoles add some improvements to head tracking), I doubt it will be more than a passing fad for movies. It's simply not compatible enough with human vision, even when done properly (head movements spoil the effect, the difference between convergence point and focus plane puts stress on your eyes, etc.; it's as if someone nailed your head to the cameras). When I'm watching a movie, I'm a spectator, I don't feel any need to be "in" the movie; I'm fine with being an infinite distance away. Anything that makes watching the movie less comfortable is going to detract from the experience.
Anyway, although there are ways to extract 3D information from 2D image sequences (not from individual images), as done by camera trackers such as SynthEyes, PFTrack, etc., the result is a very low resolution point cloud, which is really only useful to calculate the camera position and / or track some scene features, not to create a usable stereoscopic image pair.
The only vaguely acceptable way to get stereo is to project the frames onto a (simplified) hand-made 3D model of the shot (typically a grid deformed by a displacement map), and then render it from two virtual cameras. This can take ages (to set up; rendering is quick) and is generally the kind of work you offload to some intern you don't like much. Even then, the results are generally less pleasant to watch than the original (mono) footage. If you're interested in seeing how this is done, search for "Stereo Conversion NAB" on YouTube, and you should find a few examples.
There is no way to convert individual frames from 2D to 3D in real time for the same reason that "digital zoom" can't show you text that was smaller than the sensor's pixels; the information is simply not there. You can, obviously, write an algorithm that adds made-up depth information to any image, just as you can write an algorithm that adds random text to zoomed images, but I doubt that would improve your movies in any way.
Indeed.
We need another saying: Those who don't understand computers are doomed to post questions on Slashdot. Poorly.
It's shocking how many people today avoid the CLI and think that "having" to use it is something bad or uncool. (Quote "It's not 1989, you know?". My answer: Then why have you still not understood the point of having a computer??)
A computer's sole purpose is to automate your work away. You do the big parts with programming... or let somebody do them for you. And you do the glue with scripts and the CLI.
Important note: Doesn't mean it can't be graphical and visual though.
But still, that's why the GUIs and desktop environments of today are so deeply wrong. Especially on a Unix-like OS.
If you want to see how it's done right, look at Maya (the 3D tool). It has a CLI, which you might not know. Similar to the Quake series console. Everything you do graphically, is a script command too. So you can just do stuff, open the console, mark the last n lines, and drag them to the shelf. Then you can edit your new script, add loops, variables, dialogs. All in a bash/TCL-like language or in Python.
And did I mention that the entire frontend you've been using to do this is made that way? Only the core/engine is hard-coded.
So it can be fully adapted to all needs and in big companies, it usually is.
Let's have *that* as a Linux UI. Let's make *everything*, be it UI-wise or whatever, also be a file. No exceptions.
How about that?
Right now, making 3D out of 2D is mostly manual work, but it need not be so in the future. Your brain can effortlessly extrapolate hidden sides of objects on a photo or film and reconstruct the depth field (it can also be confused easily by the "Devil's fork" or other such specially crafted pictures). There are lots of cues for this: lighting (shadows, glare), perspective features, sizes of known things etc.
In the future, this will be doable on an artificial brain. You can already do this with single photos, using an artificial neural network. Just submit your photo to http://make3d.cs.cornell.edu/ and in a few seconds you'll get back a textured 3D model of the scene or a fly-through video, whichever you prefer. It is awesome, I love AI (and welcome the overlords, of course)!
The problem isn't too hard if you are moving your camera sideways at an even speed, since you could just use one frame for the left view and a frame a short amount of time later for the right view. However, if the video camera is taking some unknown path, then no two frames from the original video will in general create the correct parallax. Therefore, you would need to do a bundle adjustment on the camera movement (computationally quite painful and not always reliable for arbitrary camera motion). Then comes the hard part of producing a close to 100% coverage dense 3D model at regular enough intervals to render new image frames with camera spacing and orientation to match human vision. Not impossible, but I think the reliability of the currently available algorithms and the computation time are the big problems.
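The easy case at the start of that comment (steady sideways camera, one frame per eye) is simple enough to sketch. The function name, the `delay` parameter and the `camera_moves_right` flag are made up for illustration; choosing a delay that gives a comfortable amount of parallax is left to trial and error.

```python
def delayed_stereo_pairs(frames, delay, camera_moves_right=True):
    """Yield (left, right) frame pairs from a single sequence of frames.

    frames: list of frames in display order.
    delay: how many frames apart the two eye views are taken (at least 1).
    camera_moves_right: if True, the earlier frame was shot from further
        left and becomes the left-eye view; otherwise the roles swap.
    """
    if delay < 1:
        raise ValueError("delay must be at least 1 frame")
    for i in range(len(frames) - delay):
        earlier, later = frames[i], frames[i + delay]
        if camera_moves_right:
            yield earlier, later
        else:
            yield later, earlier

# Letters stand in for frames here.
pairs = list(delayed_stereo_pairs(list("abcde"), delay=2))
```

As soon as the camera speeds up, slows down, or anything in the scene moves on its own, the fixed delay no longer corresponds to a fixed eye separation, which is where the harder approach described above takes over.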
I write post-production software used to do this (and it runs on Linux!). The best results I've seen involve manually breaking each shot into dozens of layers, using rotoscoping. Each set of layers is exported as masks and imported into a compositing application where the images for the layers are projected onto the masks in 3D space. In some cases they build rough 3D models and project the layers onto the respective models. Now they can add a virtual camera and render the scene from both views. Then they bring the footage into a paint system and manually paint in the "missing" parts that now show up because of the change in camera angle. This has to be done for both the left and the right eye.
They have a room of 300 guys in India doing this for Titanic. But the results are INCREDIBLE.
Some automatic techniques involve rotoscoping a depth map by hand (or with a combination of some automated depth map generation, but this almost always has to be tweaked for good results), then using that to synthesize two new views from the 2D footage. Then to fill in the gaps they can use either an automated warping (which looks almost, but not quite, entirely not all right) or hand-painting again.
The upshot is it is a very very manual, labor-intensive process, with somewhat specialized tools. But when done well it looks amazing.
I don't know how the hell they do it, but my brand new LG TV does just that. And it works.
Fuck off. Seriously, he's asking for help. You're not being helpful. Responses like yours are pointless and are unlikely to change his mind
There are TV sets available now that convert normal 2D programming to "3D" in real-time. The results vary greatly depending on the programming type, from passable 3D to blindness-inducing messes. How a viewer interprets this "3D" image also varies, from "oohs and ahhhs" (most people watching the demo I was attending), to "how can they call this 3D, now my head hurts!" (me).
I, however, am a 3D snob, able to see that most theatrical releases touted as "3D" are simply shot in 2D and processed with rotoscoping and depth maps, leaving most of the 3D effect visible as cardboard cutouts on a flat background, and feel cheated when I see these.
But, people seem to like 3D, even if it is bad 3D, and the realtime conversion feature of TVs seems to be a selling point.
Have you tried http://www.youtube.com/editor_3d ? It's quite basic, and requires dual video input. I gave it a crack and got horrible results (mainly due to bad camera setup on my end and a lack of patience, oh and I didn't have the r/b glasses, did I mention that I wasn't trying very hard either). With a decent dual camera setup you could probably produce them quite painlessly.
Can a person program a new solution to a problem? Why should anyone be able to stop such a thing? -Richard Stallman
Use the Gimp and then this excellent 3-d tutorial. http://goldomega.deviantart.com/art/Photoshop-3d-Anaglyph-Tutorial-149857792 I have been converting my paintings to 3-d for use with red blue glasses.
Download the Bundler + PMVS 3D reconstruction packages and feed video into them. Those packages are fairly stable and reliable, and they give you a 3D point cloud. After that, convert the 3D point cloud to a surface. There are several packages which can do it, but I can't give any advice here: I don't know any *stable* package; all of them are research software, with memory leaks, random crashes, difficult parameter settings, compilation problems etc. If you want to learn the algorithms yourself, that's at least a year's worth of math and computer vision (if you are not a math/phys major). "Multiple View Geometry in Computer Vision" is usually recommended for starters, but this book is thoroughly obsolete now. All the modern stuff is in the papers.
I worked on most of the major releases including Phantom Menace. It takes several thousand slave laborers to get it done for 80 million...
If you look on Google under "OpenCV stereo vision" you will find links showing how the code runs. There are video examples using two webcams that run in real time at around 5 frames per second. If you record and run offline you can get reasonable playback frame rates.
This code generates a depth map for the scene, so each pixel is assigned a distance from the camera. These techniques are derived from robotic vision research, so it is an image processing solution, not a 3D computer graphics solution. It does not generate 3D surfaces. What you do with the depth map is up to you.
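For reference, turning a stereo matcher's disparity into a distance from the camera uses the standard pinhole relation for a rectified pair, Z = f * B / d. The sketch below is plain Python, not OpenCV's API; the function name and the example numbers (focal length, baseline, disparity) are made up.

```python
import math

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Distance to a point from its disparity in a rectified stereo pair.

    disparity_px: horizontal shift of the point between the two images, in pixels.
    focal_px: focal length expressed in pixels.
    baseline_m: distance between the two camera centres, in metres.
    Returns the depth in metres; zero or negative disparity maps to infinity.
    """
    if disparity_px <= 0:
        return math.inf
    return focal_px * baseline_m / disparity_px

# Hypothetical rig: 700 px focal length, webcams 6 cm apart, 21 px disparity.
z = depth_from_disparity(21, focal_px=700, baseline_m=0.06)  # about 2 metres
```

Because depth goes as 1/d, a one-pixel matching error matters little for near objects and a lot for distant ones, so the far end of such a depth map is always the noisy part.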
Why is Snark Required?
This article is total bullshit.
I work as a visual effects artist, and this sort of "quick fix" to post-conversion is a lot of rubbish.
Stereo is for chumps. Move on. Gimmick over.