Slashdot Mirror


Video with Depth

Lifewolf writes: "A new technology from 3DV Systems uses pulsed infrared illumination to capture depth information for every pixel of a video stream. This allows for neat tricks like realtime keying without need for color backgrounds. JVC is already selling a product based on this, the ZCAM."

37 of 110 comments

  1. Modeling applications? by Wavicle · · Score: 2

    This opens up some great possibilities for
    digitizing 3D models. Anybody heard of this
    technology already being used for that?

    --
    Education is a better safeguard of liberty than a standing army.
    Edward Everett (1794 - 1865)
    1. Re:Modeling applications? by Comrade+Pikachu · · Score: 2, Interesting

      There are already some optically based 3d scanners on the market. The first ones used a scanning laser beam to trace out a line that described an object's surface texture. More recent versions use a purely optical method (I think).

      This system could probably be used for modeling by placing a physical model on a turntable and recording its changing z-depth over time. I wonder how accurate it is at close range. This could be really useful for architects who want to develop a 3D site plan. Simply snap a few shots at the building site, construct a DXF file based on the depth information, and import it into your CAD software.

      The camera is probably intended for use with compositing applications like Shake, which can process z-depth information, as well as RGB, and alpha. Great for seamlessly integrating live action with computer generated 3D, particularly realtime 3D.

      This also poses the question: what other types of useful information can a digital camera acquire, if we are not limited to the visual spectrum? Would it be possible to extract diffuse color, reflected color, transparency, or other "ray depth" information from real life subjects?

  2. What's so difficult? by evilviper · · Score: 2, Interesting

    I've never really seen what makes 3D video (or 4D to get particular) so difficult to record.

    Humans have 2 eyes in the front of their heads, inches apart. All that is needed in a camera is for two synchronized tapes to run simultaneously, with the lenses just a few inches apart.

    Play back the left half on the left eye, the right half on the right eye, and our own built-in systems have no problem building those two images into a single 3D image.

    I think the difficulty is not in the recording of 3D information, but of building a display to play it back to multiple people.

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    1. Re:What's so difficult? by molekyl · · Score: 2, Informative

      You're mixing things up.

      This is not an attempt at 3D-video. This is video with depth information.

      Its primary application is to select parts of the image that you want to replace ('keying'), nothing else.

    2. Re:What's so difficult? by evilviper · · Score: 2

      I admit it's not solidly on-topic, but I am not confused.

      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    3. Re:What's so difficult? by evilviper · · Score: 2

      Could you possibly be more vague?

      There are many ways to generate holographic images. The question is in the details. Will we see the same thing from any angle? Will a series of mirrors be used or just several lasers? How big will the picture really be?

      It's just as possible that in the future we'll all just strap on something similar to the I-Glasses and individualize the experience.

      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    4. Re:What's so difficult? by Pemdas · · Score: 4, Informative
      The concepts behind it aren't too difficult, a google search for epipolar geometry is a good place to start.

      The biggest problems are computational; it's hard to do a good job of stereo reconstruction at high frame rates in real time. It's by no means impossible, and there are commercial products out there that do it, like this one.
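
      As a rough illustration of the stereo approach described above: once the two images are rectified and a pixel has been matched between them, depth follows from the disparity by similar triangles. This is only a sketch; the function name and the example numbers (focal length in pixels, a 6.5 cm baseline) are assumptions, and the expensive part, finding the matches, is not shown.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth (metres) of a point seen by a rectified stereo pair.

    focal_px     -- focal length expressed in pixels (assumed known)
    baseline_m   -- distance between the two lenses, in metres
    disparity_px -- horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    # Similar triangles: Z = f * B / d
    return focal_px * baseline_m / disparity_px


# Hypothetical numbers: eye-like 6.5 cm baseline, 700 px focal length.
near = depth_from_disparity(700, 0.065, 10)  # large shift: close object
far = depth_from_disparity(700, 0.065, 2)    # small shift: distant object
```

      Note how quickly precision falls off: a one-pixel matching error matters far more for distant objects than for near ones.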

      Two cameras aren't really necessary, either, if your camera is moving in the scene. It's possible to recover both the movement of a camera and 3-d information about a scene just by moving a camera through it. Googling for structure from motion is a good place to start looking into those techniques, and there's a pretty cool page about one group's application here.

      In short, this company may have an interesting product (depending on cost and more details on the error characteristics) but this isn't something that couldn't be done with existing methods.

      Also, as an aside, I find it interesting that they take a swipe at laser rangefinders as requiring a spinning mirror, when just about all IR cameras have a spinning "chopper" as an integral part of the exposure system...:)

    5. Re:What's so difficult? by Forkenhoppen · · Score: 2

      I have an idea, but I'm not sure if it applies to this article or not.

      Why are there no holographic cameras? How about a personal photographic system that could take a 2D picture, along with depth information. Couldn't the lab then use that information to extract some semi-3D models as a basis for a hologram? (You know; one of those thin colour-banded holograms they put on CDs and credit cards..?) Or is the cost of making those holograms prohibitively high..?

    6. Re:What's so difficult? by Com2Kid · · Score: 2

      Nah man, that just gives you minimal depth perspective, the data format is still 2d.

      It is the difference between having a 3d object and taking a front on picture of it and importing that picture into Adobe Illustrator (or Kilistrator, take your pick, now Kdraw or KVector isn't it? ) and using a "convert to paths" tool, which will get you a very nice 3d -looking- image but it will only store two dimensions for you, VS taking multiple shots of that object and importing them into an Application that calculated the 3d space of that object.

      Of course the advantage of what THIS camera does is that you get some 3d information without having to do a lot of REALLY nasty interpolation between multiple images. Granted modern techniques to do such have gotten better, but artificially creating 3D data from 2D pictures of 3D objects, well. . . . heh. Even worse if those objects are "4D" (aka moving).

      This new camera seems to deal with moving objects just fine. Yah.

      The MAIN thing that I am thinking of is that you could possibly translate objects around in your 3D space that was created by this camera.

      Your point of view would remain fixed and none of the objects could rotate (more on this later) but you could still do some REALLY nice stuff in regards to Object Based Encoding.

      In fact the integration of 3D data into Object Based Video Encoding technologies could work to create for some VERY nice bit rates, or at least the removal of gobs of artifacts.

      Imagine if the Video Encoding KNEW that such and such person was going BEHIND that plant.

      Now of course one other use for this is to combine it with the pre-existing methods of using multiple cameras to capture a 3d space. With this method you could, maybe even after just creating an object outline in one viewpoint (I will have to think over this particular facet of this new technology more in order to prove or disprove that idea), rotate all the separate OBJECTS within the scene, and not just move your view around the scene. (This is of course excluding any partially obscured objects, which would likely have some strange things happen to them. :) )

      Because you have each object's X, Y, and Z coordinates, and your camera could have almost complete X, Y, and Z plane movements (remember, interpolated between multiple sources and your image quality when zooming in would be dependent upon your original image capture quality) you have yourself what is basically a fully workable 3d workspace.

      Imagine importing your video some day not into Adobe Premiere but rather into Maya or 3D Studio Max.

      Kick Ass.

    7. Re:What's so difficult? by Forkenhoppen · · Score: 2

      I realize this, which is why I listed a method for taking the picture a more "normal" way, and then converting it to a hologram in the lab.

      Btw, what's the deal with the monochromatic light? I realize you have to use one colour or else you get a blurred resultant image, but is there any way to sort of do a component hologram and then put the parts back together? Sort of like how video has RGB..?

  3. Fun to abuse... by fleeb_fantastique · · Score: 4, Insightful

    Can you imagine using this technology to insert your favorite politician in a porn video? George Bush Does Dallas.

    Used within a surveillance camera, it could detect motion without getting tricked by that tree near the air vent.

    It could also be used in surgical situations where a specialist located in another state can more easily study facets of the video being provided to him (cutting out noise, if you will).

    You could do some really weird video editing where you could create a scene of a person standing in a verdant field in the middle of summer with snow falling within his 'mask'.

    Items recorded in this way (presuming the mask is also recorded) could perhaps be admissible evidence that helps the court focus on a specific action that might otherwise get missed.

    It might also provide a less-expensive way to make 3-D videos. Precursor to holographic movies?

    --
    And so it goes.
    1. Re:Fun to abuse... by andycat · · Score: 2, Insightful

      It might also provide a less-expensive way to make 3-D videos. Precursor to holographic movies?

      It's a step along the way, but it's got one major drawback: it only captures a scene from one viewpoint. As soon as you move away from that viewpoint you're going to see holes in the scene where the camera didn't capture any information. To fix this, you must either (a) keep the viewpoint fixed at the camera's center of projection or (b) capture multiple views of the environment to fill in the missing bits.

      Cameras like this have another potential benefit: better video compression. There's a section of the MPEG-4 standard that provides for segmenting your scene into objects so you could, say, encode the weatherman separately from the backdrop he's waving his hands at. If you shoot with a camera like this that can give you a rough silhouette of major objects in the scene, you could spend more of your time doing high-quality encoding of the people running around in the foreground and less of your time on the background that doesn't change for the length of the shot.

      That said, I'm awfully skeptical about their claims of precision. As another poster has mentioned, there's a reason why laser range scanners cost so much: building an accurate rangefinder with lots of dynamic range is hard. As for object segmentation... I personally don't believe the image they provide as an example. Take a look at the depth map of the people at the conference table. In particular, look at the tabletop. It's nearly parallel to the camera axis, which means that its depth should be increasing fairly rapidly, which means you should see a gradient from light (near) to dark (far) in that part of the image -- but no, it's all one color.

      I suppose you can explain that as treating everything between depths D1 and D2 as a single object, but that doesn't work all that well in practice. What's far more likely in my opinion is that that object mask is a hand-created example rather than the actual output of the device.

  4. I used to key images... by Joe+'Nova' · · Score: 2, Informative

    I didn't have a depth thingy to tell me how to replace the image, we had blue backgrounds which had to be equally lit, and pray nobody came with blue on.
    The real reason blue was used is that, if you look at a video signal, blue makes up only about 11% of the luminance signal at most, and it is also a very rare color (saturation wise) in a picture. Most people don't wear blue tarp mascara, and it was acceptable.
    The other type of keying was on an Amiga with a Gen Lock, using background color as the transparency, a static image over a live background. You could also set the transparency, so you could get ghost-like effects.
    But with one of these, you can probably make a scrolling background with the occasional tree popping to front. If you were to do the same with an editing suite, you're looking at at least a good hour, and when you rent out facilities, you look for all the helpies you can. Just printing out a still from video can cost more if you're using a "video printer".
    I wonder if you can set the depth manually, or if it's hard coded. It might be fun to see something pass "through" something else.
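
    For anyone wondering where the 11% figure for blue comes from: it matches the standard-definition luma weights (ITU-R BT.601), where brightness is a weighted sum of red, green and blue. A tiny sketch, with a function name of my own choosing:

```python
def luma(r, g, b):
    """BT.601 luma from R, G, B values given on a common scale (e.g. 0..1)."""
    # Green dominates perceived brightness; blue contributes the least.
    return 0.299 * r + 0.587 * g + 0.114 * b


blue_share = luma(0.0, 0.0, 1.0)   # contribution of a fully saturated blue
green_share = luma(0.0, 1.0, 0.0)  # contribution of a fully saturated green
```

    So a pure blue backdrop carries little of the brightness detail, which is part of why keying it out costs so little of the picture.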

    --
    This mind intentionally left blank.
    The KKK a bunch of sheetheads? You decide!
  5. This will revolutionize color keying. by arsaspe · · Score: 5, Interesting

    Normally, when you want to key in a false background in a scene, you need to have a constant color in the background (Hence the use of blue and green screens). If the background isn't flat, then you either have to go at it with photoshop frame by frame, or use expensive border tracking software which is less than perfect. You could spend hours setting up a scene just right, with screens placed in all the right places, making sure that there is nothing else that is the same color as the key, and planning camera angles for an action sequence, not to mention the struggle of getting the keying to work just right.

    With this new technology, however, you could film an actor just about anywhere with very little preparation, and key him/her out based on depth AND color (some situations may need both), and easily pop new things both in front and behind the actor. It could save movie studios a lot of time, effort, and money for doing special effects, especially after you consider how easy it would be to generate a virtual stunt double from the 3d mesh (film the actor from a few angles, and merge the resulting 3d wireframe. Voila, perfect model down to the wrinkles in the skin)

    1. Re:This will revolutionize color keying. by edo-01 · · Score: 3, Informative

      especially after you consider how easy it would be to generate a virtual stunt double from the 3d mesh (film the actor from a few angles, and merge the resulting 3d wireframe. Voila, perfect model down to the wrinkles in the skin)

      Uh, no... I wish it were that easy - but scanned 3D meshes of that quality are still in the domain of laser scanning. There's just so much detail that even the best scanners can't pick up, major wrinkles and folds yes but pores and fine lines have to be simulated with displacement/bump and colour maps derived from the scan data (basically as it scans, the device takes a big long photo of the object to wrap around it later). Once you have the point-cloud from the scan (raw data) there is a LOT of cleaning up to do to get a parametric mesh with correct UVs (texture mapping co-ordinates) for use in production.

      For more info, check these guys out - we've used em recently on a couple of film and tv projects and their output is damn nice, but the price tag reflects the complexity and difficulty of the task.

  6. used to do this with 3d studio by DrSkwid · · Score: 2

    3dstudio 4 has a plugin to render z buffer depth too, to get scenes like the ones from this camera

    it's great for doing depth based effects such as artificial depth of field (3ds4 didn't have that)

    I'd love to have one of the cameras available for making live video stuff, I'm looking forward to getting my hands on one, I hope my local video facilities unit gets one (I'm going to mail them a link).

    Coming soon to an MTV near you. Sadly probably not from my studio any more. I gave that up when 3dsMax came out, Seemed like there was no room left for a two man outfit (one gfx, one coder).

    --
    There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
  7. The downside: by TheFlu · · Score: 5, Funny

    "Once you capture live action footage in object video format, you can not only make it more visually engaging, but also sell advertising right in context of the live event."

    Great, now you won't be able to distinguish between the show you're watching and the advertisement. Now when I'm watching TechTV, I can look forward to Britney Spears bouncing thru with a Pepsi at 30 second intervals.

  8. Re:Twofold problem by evilviper · · Score: 3, Interesting
    Like I said, there is no problem recording the image in 3D. The problem seems to be playing it back to an audience easily.

    The brain superposes a correction on what we see. Objects it recognizes, it doesn't see as "flat" even if seen with only one eye. It automatically adds depth.


    True, but what most people don't realize is that we see just as much depth in a TV screen, as we would in real life if we covered one eye.

    Speaking of complex problems... There are certain devices that, when placed over your eyes, will essentially trick your eyes into seeing the depth on a flat screen, so there is quite a lot of information saved on a 2D image. The strange thing is that computer generated images are still seen as flat, while the rest has depth. What is different in the two is a mystery, but it just goes to show that our minds are privy to much more information than we are consciously aware of. (Have you ever seen a movie which used special effects and it just didn't seem right, even though you couldn't point out any real problem?)
    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  9. The new reaches of miniaturization by PhotoGuy · · Score: 2

    Just amazing how DV cameras just keep getting smaller and smaller. I think I'll pick up that ZCAM, and get the optional belt case, so it's with me everywhere I go :-)

    I guess this thing is targeted more for reporters and the media, than the consumer.

    I assume "keying" is what we dumb consumers typically know as "blue screening" or "green screening", but this lets you do the same without a solid background, since it can separate out the people in the foreground using a depth cutoff instead.
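
    To make the depth-cutoff idea concrete, here is a minimal sketch of keying on a per-pixel depth map instead of a color. Everything in it is an assumption for illustration (the function names, the toy 3x2 "frame", depths in metres); it is not how the ZCAM itself is implemented.

```python
def depth_key(depth, near, far):
    """Build a matte: 1 where near <= depth <= far, otherwise 0."""
    return [[1 if near <= z <= far else 0 for z in row] for row in depth]


def composite(foreground, background, matte):
    """Keep foreground pixels where the matte is 1, background elsewhere."""
    return [
        [f if m else b for f, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(foreground, background, matte)
    ]


# Toy frame: "P" = person about 1 m away, "w" = wall about 4 m away.
depth = [[1.2, 1.3, 4.0],
         [1.1, 3.9, 4.1]]
live = [["P", "P", "w"],
        ["P", "w", "w"]]
beach = [["s", "s", "s"],
         ["s", "s", "s"]]

matte = depth_key(depth, near=0.5, far=2.0)
result = composite(live, beach, matte)
```

    The matte here is hard-edged (0 or 1), which is exactly why edge alignment between the depth and color pixels matters so much.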

    Neat technology. I think there'll be more practical uses for this than you might think at first.

    I wonder how accurately the z layer aligns with the pixels. Since it's a different infrared source, bounced off the subjects, I wonder if there's some fancy alignment that has to be done, or if the same pixels on the camera pick up the depth information. It'd be the difference between perfect alignment, and having sloppy edges around objects, which is pretty significant for a lot of uses.

    -me

    --
    Love many, trust a few, do harm to none.
  10. Forger's wonder tool by BlueUnderwear · · Score: 5, Interesting
    ...admissible evidence that helps the court...

    IMHO, this technology would rather do the contrary. It makes photo forgeries so damn easy: no afternoon-long sessions with the gimp to get exact contours of people to delete from or insert into pictures: just use the ZCAM's distance keying and you get instant masks. The example given was scary: a business meeting, from which they could edit out people at will. The ideal tool for anybody that wants to rewrite history. So, forget about photos staying admissible as evidence in court.

    --
    Say no to software patents.
  11. Still? by BCoates · · Score: 2, Interesting

    Would it be possible to economically do this with still cameras (preferably film vs. digital)? Are there already products that do that? It would be cool to be able to record a depth 'image' with my photographs for later editing...

    --
    Benjamin Coates

  12. Re:What this gives you by StarBar · · Score: 2, Interesting

    Depth information and movement can give a chance to triangulate objects targeted.
    From there you probably can move on to the more sophisticated compression techniques
    (soon to be) introduced by MPEG-4.

    Ever seen the movie "Enemy of the state" where they triangulate 3D shapes with satellites
    and movements? Great techniques in that movie, but scary scenario.

  13. the technique is pretty old by markj02 · · Score: 2

    Getting real-time depth information from the amount of IR reflected from a pulsed IR light is a pretty old technique. It's used in some input devices to detect where people are in front of the computer. The use of this information for video keying may be new, though.
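
    A rough sketch of how depth can fall out of the amount of pulsed IR collected, under a deliberately simple model that I am assuming here (rectangular pulse, shutter open exactly while the pulse is being emitted, a second ungated exposure to cancel out surface reflectivity). I don't know the details of any actual product, so treat the names and the model as hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def gated_fraction(distance_m, pulse_s):
    """Fraction of the returning pulse that arrives before the shutter closes."""
    delay = 2.0 * distance_m / C  # round trip: out and back
    return max(0.0, 1.0 - delay / pulse_s)


def depth_from_gate(gated, total, pulse_s):
    """Recover distance from a gated and an ungated intensity reading.

    Dividing by the ungated reading removes the dependence on how
    reflective the surface is, leaving only the timing information.
    """
    if total <= 0:
        raise ValueError("no returned light to measure")
    fraction = gated / total
    return 0.5 * C * pulse_s * (1.0 - fraction)


pulse = 50e-9            # 50 ns pulse: usable range of roughly 7.5 m
reflectivity = 0.3       # arbitrary surface brightness
seen = gated_fraction(3.0, pulse) * reflectivity
estimate = depth_from_gate(seen, reflectivity, pulse)
```

    The timing is brutal: light covers about 30 cm per nanosecond, so centimetre-level depth needs shutter timing good to tens of picoseconds.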

    1. Re:the technique is pretty old by i_am_nitrogen · · Score: 2

      The technique is old, but doing it per-pixel is very cool. Now all that needs to be done is to write a 5 channel video format (RGBAZ) and I can start writing software that uses this for unrealistic things. Ohh, the possibilities...
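
      A toy version of what such a 5 channel frame might look like on disk, purely as a sketch: the "RGAZ" magic, the header layout and the 16-bit depth channel are all made up for illustration and don't correspond to any existing format.

```python
import struct

# Hypothetical pixel layout: R, G, B, A as unsigned bytes, Z as unsigned 16-bit.
PIXEL = struct.Struct("<4BH")
HEADER = struct.Struct("<4sHH")  # magic, width, height
MAGIC = b"RGAZ"


def pack_frame(width, height, pixels):
    """Serialize a list of (r, g, b, a, z) tuples in row-major order."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match frame size")
    body = b"".join(PIXEL.pack(*p) for p in pixels)
    return HEADER.pack(MAGIC, width, height) + body


def unpack_frame(data):
    """Inverse of pack_frame: returns (width, height, pixels)."""
    magic, width, height = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("not an RGBAZ frame")
    pixels = [
        PIXEL.unpack_from(data, HEADER.size + i * PIXEL.size)
        for i in range(width * height)
    ]
    return width, height, pixels


frame = pack_frame(2, 1, [(255, 0, 0, 255, 1200), (0, 0, 255, 255, 65535)])
decoded = unpack_frame(frame)
```

      Six bytes per pixel uncompressed, so any real format would want to compress the Z channel along with the color.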

  14. I've already got realtime keying w/o color bg... by undertoad · · Score: 3, Funny

    it's called a dumb terminal.

    Thank you.

  15. Remote sensing? by Boiling_point_ · · Score: 2

    The use of this camera technology for video composition is great, but if you bundle a panoramic (360 degree) camera with it, you solve the reason that accurate 3D visual reconstructions are expensive. I'm thinking: export a 3D map of every object in range, then feed that into CAD.

    Now take your CAD file, recompile and render with a Quake3 engine, apply sampled textures, and you've got a very cheap, fast, good 3D walkthrough - architects will enjoy this too, as will tourism sites.

    It's also going to mean some great first-person-shooter maps :P

    --
    "If you create user accounts, by default, they will have an account type of Administrator with no password." KB Q293834
  16. Re:Twofold problem by Elwood+P+Dowd · · Score: 2
    True, but what most people don't realize is that we see just as much depth in a TV screen, as we would in real life if we covered one eye.


    Remember, a strong cue for 3D perception does not require two eyes: Moving your head just slightly gives you stereo vision over time. Sometimes you can't get the same thing from a steadicam shot.
    --

    There are no trails. There are no trees out here.
  17. Re:What this gives you by buckrogers · · Score: 2

    OK, I find compression interesting... I think that you could use this to compress a 3D video stream, by essentially "seeing" each object as a separate stream of data in the image and compressing each separately.

    You might be able to actually generate a 360 degree view of the background and encode the distance and angle of the view in each scene, then place the separate actors into the scene.

    The really cool thing about this technique is that it would make it easy to delete or replace any one object in a scene in a video.

    --
    -- Never make a general statement.
  18. Visual Effects work by edo-01 · · Score: 2, Interesting

    I posted a comment a while ago that explained the uses in visual effects work for depth-cameras, and some of the problems with existing methods of pulling a matte off of live action plates...

    We were actually talking about this at work the other day; mainly wondering how well it would deal with things like fine hair, smoke, transparent objects and stuff like film grain/video artifacts/lens artifacts etc...

    Would love to try one and find out...

  19. Hair? Glass? by Anonymous Coward · · Score: 2, Informative

    The biggest problems in color keying are Hair and glass (as in eyeglasses).

    If this system, as it claims, is simply making a z-buffer (depth buffer) of the image, then it's going to see hair and glass as an opaque lump, not the semi-transparent reality.

    Blue and Green screening (not chroma keying) can do a very good job of pulling out variable opacity and thin items like hair. Especially with the newer LED screen illumination camera rings.

    This technology has some nifty tricks and will allow more poor quality keying to continue, but it won't replace blue and green screens.

  20. Re:Twofold problem by mskfisher · · Score: 2

    Yep. I'm monocular (due to surgery to correct crossed eyes), though I retain use of both of my eyes. (actually, I can even control which is my dominant/active eye, which allows me to perform rudimentary stereo checks, if only to amuse myself.)
    I do gain a lot of information from motion.
    At the same time, starfield simulations and the like (if done properly, refresh rate, etc) can really draw me in.

    --
    0x0D 0x0A
  21. This is huge for MPEG4 by William+Tanksley · · Score: 5, Informative

    I can't believe nobody has posted about MPEG4. This is very interesting for that -- film using this, and you can encode into MPEG4 format with /huge/ compression almost automatically. The hard part about MPEG4 is object detection; this makes that almost free.

    -Billy

  22. Re:Twofold problem by Supa+Mentat · · Score: 2

    We really don't completely understand what you're talking about. It is true that with most people if you cover one eye you lose depth perception. But it doesn't have to be that way. A friend of mine is legally blind in one eye, he shouldn't have any depth perception, most people with his particular condition don't. Many years back he switched to a new optometrist. When he went in for preliminary testing with this guy he got his depth perception tested, they had never done this at his first optometrist's office, they assumed he didn't have any. The boy has perfect depth perception; he's one of the best tennis players in state. No one knows why and no one can offer any explanation other than his ONE working eye can do depth perception _by itself_. So it's a bit more complicated than anyone really knows. As a neurologist all I can tell you is that it's just another one of the many mysteries the brain presents us.

    --
    "A witty saying proves nothing." - Voltaire
  23. Re:Twofold problem by arakis · · Score: 2, Interesting

    May I correct some common misconceptions about 3-dimensional optics vs. stereoscopic. 3-Dimensional light is based on a wave of photons traveling through a volume of space. Outside of holography this wavefront of light is only achievable in the real world. Stereoscopic images consist of separate left and right images that when combined give the *illusion* of depth due to various parts of your brain that gauge distance, but not depth since they are based on a 2-dimensional sampling.

    It may seem that I am splitting hairs here, but I get very frustrated when people think that having one eye covered eliminates all depth perception. That is a categorically wrong assertion since the retina in each eye occupies a three-dimensional space. People who have lost an eye encounter problems with depth perception, but do not lack the *ability* to perceive depth.

    If you pay close attention to any stereoscopic image, whether it is a "magic eye" or a viewmaster you will notice that things are collected into two-dimensional sheets that appear to have depth relative to each other. A similar situation in real life would be if everything was either a backdrop or a cardboard cutout.

    By contrast the image displayed in a hologram presents an integral depth of the surface that is perceptible by a single human eye. It looks *real* because it is exactly the same 3-dimensional wavefront that existed when light was bouncing off the object to record the hologram.

    It is all a little confusing, but a little thought and casual observation will reveal these things to you. In my case I spent three months interning in a holography studio in NYC, so I got to hear many interesting discussions on this and various other strange concepts of reality.

    So please people, parallax does not mean the same thing as depth. If anything, please take that away from this thread.

  24. Goes Beyond MPEG4 Codec by cryptochrome · · Score: 2

    You're absolutely right - this will make a huge difference for compressed video by separating out the layers of the image. Motion prediction (or rather background prediction) will become trivial. The potential for this goes well beyond the existing MPEG4 codecs - indeed I expect it to spawn a whole new generation of codecs based on RGBD colorspace. Not only that, it will allow you to easily build up a detailed 3 dimensional representation of the static objects in your video, which is a whole new technological potential.

    --

    ---If you can't trust a nerd, who can you trust?

  25. Stereoscopic video? by Bowie+J.+Poag · · Score: 2



    Why bother. A vertical split-screen image for left and right eye is all you need. There's nothing stopping conventional television from broadcasting stereoscopic images. Get two camcorders, tape em together at the sides and videotape stuff in your house. Edit the video so that the left camera's image displays on the right-hand side of the screen, and vice versa. Bingo, 3D video.

    See what I mean?

    Cheers,

    --
    Bowie J. Poag

    1. Re:Stereoscopic video? by PhotoGuy · · Score: 2
      Why bother. A vertical split-screen image for left and right eye is all you need. There's nothing stopping conventional television from broadcasting stereoscopic images. Get two camcorders, tape em together at the sides and videotape stuff in your house. Edit the video so that the left camera's image displays on the right-hand side of the screen, and vice versa. Bingo, 3D video.
      But this isn't just about presenting the 3d effect of video (in fact, they don't talk about that at all, from what I can see). It's more useful for clipping your live video images in real time to different depths (only keep the people up close, ignore the background).

      In *theory* you could do this with two cameras, and some amazing processing that compares the two images, extracting the depth information for each pixel. But if such software even exists (and I think it might, for leading edge 3D scanning techniques), there's no way it could be done in real time, like the ZCAM does.

      -me
      --
      Love many, trust a few, do harm to none.