CMU Video Conference System Gets 3D From Cheap Webcams
Hesham writes "Carnegie Mellon University's HCI Institute just released details on their "why-didn't-I-think-of-that-style" 3D video conferencing application. Considering how stale development has been in this field, this research seems like a nice solid step towards immersive telepresence. I was really disappointed with the "state-of-the-art" systems demoed at CES this year; they are all still just a flat, square video stream. Hardly anything new. What is really cool about this project is that the researchers avoided building custom hardware no one is ever going to buy, and explored what could be done with just the generic webcams everyone already has. The result is a software-only solution, meaning all the big players (AIM, Skype, MSN, etc.) can release this as a simple software update. 'Enable 3D' checkbox anyone? YouTube video here. Behind the scenes, it relies on a clever illusory trick (motion parallax) and head-tracking (a la Johnny Lee's Wiimote stuff; same lab, HCII). It was just presented at the IEEE International Symposium on Multimedia in December."
The post title/summary is misleading -- this is actually 2.5D and not 3D at all. It works on the premise that the background is static: it obtains a matte of the background, uses subtraction to dynamically key/mask the participant out of the image, and then adds the participant back as a second, foreground layer. On the viewer's side, head tracking is used to gently shift the participant layer to reveal the background hidden behind it.
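To make the layering concrete, here is a minimal pure-Python sketch of the idea as I understand it from the description above, not the CMU code. Frames are tiny grayscale grids, and the function names, the threshold, the gain and the sign of the shift are all my own assumptions.

```python
# Hypothetical sketch of the 2.5D trick: key the person out of a static
# background, then redraw them shifted according to the viewer's head position.

def foreground_mask(frame, background, threshold=30):
    """Mark pixels that differ from the static background matte."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def composite_with_parallax(frame, background, mask, shift):
    """Redraw the person layer `shift` pixels sideways over the clean background.

    A positive shift moves the person right, revealing background on the left.
    """
    width = len(background[0])
    out = [row[:] for row in background]  # start from the clean background plate
    for y, mask_row in enumerate(mask):
        for x, is_person in enumerate(mask_row):
            new_x = x + shift
            if is_person and 0 <= new_x < width:
                out[y][new_x] = frame[y][x]
    return out


def shift_from_head(head_x, screen_center_x, gain=0.1):
    """Map the viewer's tracked head offset (pixels) to a layer shift (pixels).

    The gain, and whether it should be positive or negative, is an assumption.
    """
    return round((head_x - screen_center_x) * gain)


background = [[10, 10, 10, 10, 10]]
frame = [[10, 200, 200, 10, 10]]  # the "person" occupies columns 1-2
mask = foreground_mask(frame, background)
shifted = composite_with_parallax(frame, background, mask, shift_from_head(330, 320))
```

Note that only the person layer moves; the background plate is whatever was captured before the person sat down, which is why the whole thing depends on a static background.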
For what it's worth, I really don't care for this effect at all. I am not denigrating its inventors in the slightest; this is a novel (read: low cost) approach, and I am sure some people would enjoy having this in their iChat/AIM/Skype. To me, it's the equivalent of Apple's Photobooth filters (fisheye, inverted colors, etc) -- a cheap parlor trick that seems nifty for about 5 seconds, and then quickly becomes distracting. True 3D has its own issues with distraction and visual anomalies (leading to headaches, etc). Even the best 3D cinematographers around have to be very careful to avoid these issues (for instance, Vince Pace, who shoots 3D for James Cameron (Titanic, Terminator, etc), has plenty of headache-inducing scenes in his demo reel, and this is a guy with state-of-the-art facilities who has as much knowledge as anyone about how to do stereoscopic cinematography). Frankly, I think video conferencing is best left 2D, and any efforts toward improving it should be spent increasing framerate/resolution (and reducing lag + dropped frames).
I am Jack's complete lack of surprise.
...but that sample conversation at the end of the video may as well have been between two drunken epilepsy sufferers on boats in the North Atlantic. Who moves around like that while they are talking?
This does tons for immersion! It has to be implemented wherever there is a stationary camera (it obviously doesn't work with a camera phone). IIRC, Johnny Lee's work was free to use, so get to it and add that "Enable 3D" checkbox, developers! If only they'd cropped the resulting image to get rid of the black background, but that was probably just to show how it works.
John Carmack prototyped this a few years back. His conclusion at the time was that there was too much lag in the system to make it really useful.
I thought that failed if you have two eyes?
IranAir Flight 655 never forget!
Yeah, FPSs that wish to implement leaning have established a convention by now of using Q and E for that purpose. Call of Duty was one of the first games I played to implement it.
I think they already have something along those lines. http://www.naturalpoint.com/trackir/
5 years of applying Moore's law should have overcome this by now. ;-)
Much better/clever implementation than for video conferencing.
Come on... be honest, everyone has done that unconsciously in Counter-Strike... even without a webcam
There's even an open source work-a-like: http://www.free-track.net/english/
Rules of Conduct:
#1 - The DM is always right.
#2 - If the DM is wrong, see rule #1
I wonder if a more practical use would be to use the technique for video bandwidth reduction. If you know where the person is, you could concentrate video bandwidth on the face region, while keeping the rest of the "video" relatively static. No point in continuously compressing and sending boring background. Of course many codecs already do temporal compression that gives a similar effect, but this might increase the efficiency for video chat.
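A rough sketch of what I mean, assuming a block-based encoder that accepts a per-block quality setting. The 0-100 quality scale, the function names and the face box are all made up for illustration; no particular codec's API is implied.

```python
# Hypothetical region-of-interest allocation: spend encoder quality on the
# blocks covering the tracked face and starve the (mostly static) background.

def block_quality(block_x, block_y, face_box, face_q=90, background_q=30):
    """Return a quality setting for one block (assumed 0-100 scale)."""
    left, top, right, bottom = face_box  # block units, right/bottom exclusive
    inside = left <= block_x < right and top <= block_y < bottom
    return face_q if inside else background_q


def quality_map(blocks_wide, blocks_high, face_box):
    """Build a per-block quality grid for one frame."""
    return [
        [block_quality(x, y, face_box) for x in range(blocks_wide)]
        for y in range(blocks_high)
    ]


# A 4x3 grid of blocks with the face covering the top-middle 2x2 blocks.
qmap = quality_map(4, 3, face_box=(1, 0, 3, 2))
```

The head tracker would update `face_box` every frame, so the high-quality region follows the speaker around.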
The lag wasn't due to CPU speed - it was due to cumulative delays in the webcam itself, the USB bus, and only a tiny bit of image processing. I think his analysis was done on his .plan proto-blog, way back. I have no idea where it might be archived these days.
I know that even today, when capturing video from a USB camera, I can see a noticeable delay between when I move an object and when I see it moving on the screen, so I don't think that much has changed since then. The only video capture setup I'm aware of that doesn't suffer from this problem is when you capture video via a PCI capture card at a high frame rate, and most people don't bother to set something like this up.
Now I can see everyone's zits in 3d.
There's no -1 for "I don't get it."
[citation needed] or else corps will try to patent it.
Inspired by Johnny Lee's stuff, I pulled some old code out over a year ago and turned it into a decent engine that handles multiple screens and head tracking (TrackIR) to achieve the motion parallax effect. Like with all 3D effects, it needs to be seen but the following videos give you a good idea.
Have a look at these demo videos and you can even download a demo:
My first test
http://nz.youtube.com/watch?v=X8PevTuEWlg
More accurate tracking
http://nz.youtube.com/watch?v=yf1hu6GLmf0
Multi screen study
http://nz.youtube.com/watch?v=ZBdtPz2V_vY
Engine complete
http://nz.youtube.com/watch?v=ku76aHq3pps
Download Demo
http://vandinther.googlepages.com/virtualwindow
1. Cite? 2. Because as everyone knows, as time goes on, CPU doesn't get faster and RAM doesn't get bigger.
Towards the Singularity.
My setup isn't that cheap, I would guess that most people's setups would be cheaper, and I don't see how you can speculate on the amount of lag I'm experiencing when all I said was that it was noticeable, which it is.
Anyway - if you want to implement it, go right ahead. Don't let some guy on Slashdot stop you.
Sorry, can't find the place where I saw this. The closest I came was this:
http://doom-ed.com/blog/1999/11
This is an archive of his old .plan updates in blog form. I know that the actual .plan updates are archived somewhere on www.bluesnews.com, but I can't figure out where they are. That post just mentions that he started working on it, but there's no followup there. I do remember reading a followup somewhere else some time later, and he mentioned the latency issue.
The latency had nothing to do with the CPU speed, and everything to do with the camera buffering a couple of frames before sending them on the bus. Granted, cameras are better today, but I can still see the latency. At this point, it's probably acceptable for most applications, but people tend to notice latency in games more than they would in other applications.
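Back-of-the-envelope version of the argument, as a small sketch. The transfer and processing figures below are placeholders I picked for illustration, not measurements.

```python
# Rough latency budget: frames held in the camera's buffer dominate, and a
# faster CPU only shrinks the (already small) processing term.

def buffering_latency_ms(frames_buffered, fps):
    """Delay added by holding `frames_buffered` frames at `fps` frames/second."""
    return frames_buffered * 1000.0 / fps


def total_latency_ms(frames_buffered, fps, transfer_ms=0.0, processing_ms=0.0):
    """Sum of buffering, bus transfer and image processing delays."""
    return buffering_latency_ms(frames_buffered, fps) + transfer_ms + processing_ms


# Two buffered frames at 30 fps, with made-up transfer/processing costs.
slow_cpu = total_latency_ms(2, 30, transfer_ms=5.0, processing_ms=2.0)
fast_cpu = total_latency_ms(2, 30, transfer_ms=5.0, processing_ms=1.0)
```

Two frames at 30 fps is already about 67 ms before anything touches the CPU, so halving the processing time barely moves the total.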
It's not gonna give you a true 3D sensation since the image will appear identical to both eyes.
It's basically using your cam to track your head, then using software to munge up the incoming image from the other person's 2 cameras, as if your head was at that spot between the two, kind of like setting the fade and balance for audio, but for video.
But it'll still be a flat image.
But "it's very neat!"
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
It looks like the application for this is chatting when you are drunk, standing up, and swaying about. I don't know anybody who constantly moves their head around when videochatting. They tend to look straight into the camera. And wouldn't you be rather concerned if the person on the other end of your chat did start moving around and looking at you from weird angles?
... and then they built the supercollider.
Duh, of course he didn't invent anything. He just hooked a Wiimote up to a PC and used it to provide positioning for a camera in a virtual scene. Nothing special there.
The reason it wowed everyone is A) because nobody at Nintendo thought to demo it first and B) it let everyone at home do the same thing for way cheaper than before.
The floating square of background with a floating talking bust reminds me of Max Headroom.
and I can't believe no one else has mentioned it
http://www.minoru3d.com/
it comes with red-blue glasses for the purchaser to send to people that they intend to use it with.
Agreed. The next step is to make a virtual mannequin head and map the face onto that. (with a very small number of knobs for fitting size and orientation) Like that Disney ride with the ghosts.
And after that, a few tricks to change the virtual viewpoint so it looks like you're looking at the camera and not the picture of the other person.
Can you be Even More Awesome?!
No. The next step is to move the technology into a FPS. Imagine actually being able to look around the corner by...looking around the corner.
Plus, my wife will no longer be able to laugh at me for leaning around in my chair when I play.
Aah, change is good. -- Rafiki
Yeah, but it ain't easy. -- Simba
While this is pretty neat, I'm not sure it 'enhances interpersonal communication' since everyone using it will be bobbing back and forth like a Stevie Wonder impersonator convention.
Not to mention some schmuck in the US will soon sue because it made them puke from motion sickness.
USB webcams are pretty cheap these days. Why not use two, one on each side of the monitor?
In fact I've seen web cam kits with 2 in the package.
That would let you have true parallax, AND would have the benefit of making it appear that you are looking at the viewer.
Solves the two main problems I see being discussed here for an extra $29.95 or so.
Plus, it would make cool things like 3D position tracking possible (think Minority Report).
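For the position-tracking part, the textbook relation for two parallel, rectified pinhole cameras is depth = focal length x baseline / disparity. A minimal sketch under those assumptions (real webcams would need calibration first, and the numbers below are invented):

```python
# Depth of a point seen by two side-by-side cameras, assuming an idealized
# rectified stereo pair. All example values are hypothetical.

def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Return distance in metres to a point matched in both images.

    focal_length_px: focal length expressed in pixels
    baseline_m:      distance between the two cameras in metres
    disparity_px:    horizontal offset of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


# Cameras 0.4 m apart (one on each side of a monitor), focal length 800 px,
# and a face that appears 400 px apart between the two images.
face_distance_m = depth_from_disparity(800, 0.4, 400)
```

A wider baseline makes depth estimates more precise, which is a nice side effect of putting the cameras on opposite sides of the screen.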