Video Chat Via Transparent Desktop Overlay
Jason0x21 writes "Wired News has an article about UNC Comp. Sci. researchers developing a transparent desktop overlay for video conferencing, allowing remote coworkers to literally point and interact with things on your screen. The researchers say that Apple's Quartz graphics engine let them go from idea to prototype in 'about 45 minutes'. Windows versions predicted in the future."
One of my favorite pieces of technology I've ever gotten a chance to play with is the SmartBoard Interactive Whiteboard. It's a touch-sensitive whiteboard. Basically, combine it with your favorite projection monitor and you've got a 60 inch touchscreen monitor. Just like any other touch screen, anywhere you tap the board is treated like a mouseclick in whatever application you're using. As an added bonus, "magic crayons" (really nothing more than plastic styluses) are at the bottom of the board. When the board detects one of the pens removed from its holder, it treats all touches as requests to draw on the screen.
:)
It's a great presentation tool to liven up a PowerPoint and avoid having to walk across the room to get the next frame. Furthermore, playing solitaire with foot-high cards is quite fun.
RTFA
Windows has had the ability to draw transparent windows since 2000. However, there's a limit to how far they can go.
Particularly, you can't do any blending against windows that are being drawn with DirectX/DirectDraw, which is the way that any program that wants to approach full-motion video or 3D graphics has to do things. And that's what prevents Windows from handling this application.
Mac OS X is a lot cleaner in this department because in its universe there are no exceptions to the rules... everything passes through Quartz, so there's a chance to capture and play with anything on the screen. In Windows, DirectX and DirectDraw output is painted onto the screen after all the mortal windows are drawn, and that's why there's no chance to add an overlay on top of it.
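For anyone wondering what "blending" amounts to once the compositor has both layers in hand, here is a minimal sketch in Python. It assumes frames are plain lists of rows of (R, G, B) tuples and a single opacity value for the whole overlay; the function names are made up for illustration and have nothing to do with Quartz's or Windows' actual APIs.

```python
def blend_pixel(top, bottom, alpha):
    """Blend one RGB pixel of an overlay onto the pixel underneath it.

    alpha is the overlay's opacity: 0.0 is fully transparent, 1.0 is opaque.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    # Weighted average of the two layers, channel by channel.
    return tuple(round(alpha * t + (1.0 - alpha) * b) for t, b in zip(top, bottom))


def blend_frame(overlay, background, alpha):
    """Blend two equally sized frames (lists of rows of RGB tuples)."""
    return [
        [blend_pixel(t, b, alpha) for t, b in zip(top_row, bottom_row)]
        for top_row, bottom_row in zip(overlay, background)
    ]
```

The catch described above is that this only works if the compositor can actually read both layers; anything painted to the screen behind its back never makes it into `background`.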
Is Cocoa more responsible for the fast prototyping? Quartz is just an API. And I'm pretty sure there isn't an NSVideoConferenceWindow() in it... or is there?
Collaborating with co-workers in the same office is painful enough, but it's nigh impossible over a network.
For a couple of decades, researchers have tried to blend shared workspaces -- systems that allow two or more people to work on the same document -- with Internet video-conferencing systems, with little success.
Now researchers at the University of North Carolina at Chapel Hill have designed a new system that cleverly blends a video-conference feed with a transparent image of a computer desktop into one full-screen window.
Called Facetop, the system simultaneously transmits a video feed of users along with a shared, transparent image of the desktop. It allows two colleagues to work on the same document, Web page or graphic, while communicating face to face.
The system also tracks the position of the users' fingertips, which can control a cursor. As well as operating the shared desktop -- opening and closing files or selecting text, for instance -- the collaborators can use natural pointing gestures to communicate ideas about the document.
Developed by David Stotts, an associate professor of computer science, and graduate student Jason Smith, Facetop was conceived for collaborative tasks like programming or editing text. But the researchers say it has obvious uses in other areas such as medical imaging or remote teaching.
"So far, from the feedback we've received, it works fantastically," said Smith. "It's a very natural interaction. You can see the facial expressions and all the nuances of face-to-face communication."
"It's spectacular technology," said Robert Gotwals, associate director of Chapel Hill's Morehead Planetarium and Science Center, who saw a demonstration of an early version. "I've done lots of video-conferencing work. This is pretty cutting edge. It's a fast-moving field and the stuff David (Stotts) is doing is pretty cool."
The system can also be used for delivering lectures or PowerPoint presentations: The speaker is projected in the background of the document allowing her to point out bullet points or important passages. According to Smith, users easily switch attention between the subject and the desktop.
"The brain is really good at picking out what part of the screen the person is interested in," said Smith. "It's like being in a room full of conversations but having no trouble paying attention to only one.... People adapt to the system really naturally."
Facetop may also be used as an alternative to the mouse, for controlling a machine simply by pointing with a finger.
The system is implemented in Mac OS X and is made possible largely by the system's Quartz rendering engine, which can make any part of the interface transparent. Thanks to Quartz, a quick prototype was whipped up in about 45 minutes, Smith said.
A PC version will likely be delayed until the release of Longhorn, the next major version of Windows, due in 2006, which will include a similar graphics subsystem.
The system is fairly inexpensive; it has been implemented on a pair of Apple PowerBooks and two $100 FireWire cameras. So far it has been tested only on Ethernet networks and not the Internet, though the researchers say there's no reason it shouldn't work just fine. They are also trying to hook it to Apple's iChat instant-message/video-conferencing software and other similar systems.
Facetop was initially developed for "pair programming," an increasingly popular form of collaborative coding that pairs programmers in teams of two: one to program, the other to suggest and correct. Stotts said programmers normally sit next to each other, and he has been interested for some time to see whether they could collaborate over the Internet.
According to Stotts, pair programming -- sometimes called extreme programming -- is fast and effective and is becoming increasingly popular for small projects.
The idea for Facetop occurred to Stotts and Smith accidentally. Instead of a computer monitor, Stotts projects his
Where are the screenshots? - seems logical to post considering it only took 45mins for a prototype.
Why boast about how easy it was to get it happening and then not show it happening?
odd
I had an internship at a DoE lab outside of Chicago, Argonne National Laboratory, where we worked on a project similar to this. The system allowed multiple users (at various geographic or digital distances) to connect to a Desktop Server, on which all users could interact with icons, windows and programs in tandem, as if they were interfacing with a local desktop in Windows. We used BeOS as our platform, though, because it had a small footprint. Interesting that three years later private companies have outdone the DoE's work. Sad.
What I'd like to see is a voice-controlled program: instead of hurriedly bending down to click the mouse at a conference, you simply say 'back', 'forward', 'pause', or even program in new words through a macro system built into the program. Oh, and don't try to steal it; that's my damn intellectual property now, hah!
I know freedesktop has (incomplete?) support for full RGBA windows, making a version (or opensource clone?) of this on Linux theoretically possible. Is there any work on such a thing?
here's an Endeavors article about the project at UNC
FaceTop
Did he say inconvenience or insecurity? Score it as off topic, but why, when I install programs as the almighty administrator in Windows XP, does it take an extra hour to work around the severe lack of multi-user support? Just why in the hell do programs made for Windows (some by Microsoft) store the preferences in the same folder as the program? Does it not seem obvious to you Windows developers that maybe two users might have different preferences? I guess choice is something to which Windows users are not accustomed.
That's my two cents as someone who has administered both environments.
What if your background isn't completely black or a solid color? I know my office background isn't a flat color; it's a bunch of books, papers and folders, not so neatly organized, which overlaid on a coworker's desktop would really add to the confusion.
This must be Thursday, I never could get the hang of Thursdays.
!siht ekil skool gnihtyreve weiv fo tniop rieht morf ,yletanutrofnU
Show me on the doll where his noodly appendage touched you.
So, let's say those clever folks over at whatever-Gator-calls-themselves-now get the brilliant idea that they could download one of them thar transparent-overlay-thingies whenever you browse to random-evil-webpage. Then, whoosh! They can sell remote access to your desktop so that advertisers can move all the annoying icons out of the way so that you can see the advertising more clearly. Or whatever. And since the overlay is transparent, the user can't figure out what is happening and simply thinks their system is possessed by the devil.
Assuming you have a touch screen, Windows has been able to do this since the release of Windows XP, using the remote assistance feature. Also, for the record, I hate getting into "I can pee this far, how far can you pee" debates. I just felt the need to reassert my "Windows shill" status by posting this ;-).
iRooster, the Mac OS X a
Heaven forbid that people should actually have to talk to each other face to face!
This concept was extensively researched by Hiroshi Ishii and his team between 1991 and 1994 while he was at NTT.
I saw the concept videos in my HCI class at the time. They went through all the various issues of pointing alignment, video flipping and the like.
You just posted the same thing over here a few minutes ago. Can you at least come up with something new when you troll?
Lost Cluster,
I played Frustration. I find the DirecTV questions particularly asinine. Do they pay you to include them? And what exactly is the purpose of the "which of these numbers is prime?" questions? Obviously if only one of them is odd, that's the answer... otherwise...
Now if I can only convince my boss to get a set of these for pair programming. That would rock. I can imagine myself now, Xtreme programming on Windows XP, working with DirectX, making a game for the Xbox about the X-Men with my X-Girlfriend. Then I'll release it for free and register a website for it, how about X.org! Oh... uh, Wait...
Back to reality, this thing won't be out for Windows until Longhorn (2006,7,25), and by then it will be called Windows XXL. I think I better stick to my Mac OS X, making wallpapers of CowboyNealiX from ST:Voyager.
Im dreaming ofa big bndwdth, That can resist the
Fine, i'll do Slashdot and Wired's jobs for them:
Screenshots
--
Mod up a post Rob doesn't like and you'll never mod again
-Ian
The whole trivia thing is part of his "intellectual" motif. He wants everyone to regard him as some type of genius through his karma whoring and faked knowledge of everything (which, a lot of the time, consists of summarizing the story for people who don't RTFA).
I went to his little studioqb website; frankly, it's just a novelty. Just because you statistically rank how many people get a question right or wrong does not make it any more special; it just measures how much you resemble everyone else in terms of your knowledge, thinking, and how you were brought up.
Assuming a similar level of difficulty for each set of ranked questions, you could do a study of people, take different personality types, and you would see similar trends in each study group. Find someone with a personality type that is not of the norm and he/she may very well find many of the "difficult" questions easy and vice versa.
If there are varying levels of difficulty, then the statistical records of how well people answered are irrelevant in terms of how you compare to others... unless your ego really does get a boost out of that sort of thing.
Business to business relationships have already become so depersonalised. This is just the next logical step - advancing technology that allows people to sit on their chairs to help other people. Heaven forbid that you would have to get up from your desk to help somebody!
Perhaps this kind of überchatting software is THE place where they can use those 3D desktop environments / window managers.
I don't really know if it would be useful, but perhaps it is cool to lean a window so you can see your partner while keeping an eye on the app content.
Anyway, being as far from the world as *I* am (yep, there are places in the south of the globe), where bandwidth is kinda expensive, I can tell that I'll not be using this kind of technology for a while...
--krahd
mod me up scottie!
One possible feature for them to implement: one party can "flip the bird" to restart the whole session (as opposed to ALT-F4 or CTRL-C or whatever), thus giving new meaning to "Giving your co-worker the finger" for bad suggestions.
The Wknd Sessions - Malaysian and South East Asia independent music
Unless your camera sits exactly behind your monitor (i.e. your monitor screen is transparent/one-way), the image of your hand (despite the touch screen) on your collaborator's screen has to be computer generated. Or am I missing something here?
If the hand is CG, then all we have is a glorified cursor (but this too would be a pretty good hack if they got it done in 45 mins).
But wait, those pics don't seem to show the hand pointing in the right direction either!
The idea opens up lots of interesting questions.
If "Dashboard" is an overlay of useful information, would Apple call this sort of thing a "heads up display"?
If both of you are looking at the same text window, does one person have to read the text backwards?
Does this mean all the tech support people will mount the camera under the screen, so they always seem like they're looking down their nose at you?
"Call us when the New age is old enough to drink" Beck
Comment removed based on user account deletion
Doesn't this mean that what you see is actually a mirror image of yourself, and that in order to guide the cursor to an icon, you have to manipulate your on-screen finger to the right place and then flick it? (As opposed to a touch screen, where your REAL, not virtual, finger actually does the clicking).
You do realise that the camera is digital and merely passes a stream of numbers into the computer? It is relatively trivial to reverse the order of the numbers in software. "Walla", as they say, you have a non-mirror image.
Maybe I missed the ironic content in your post however.
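For what it's worth, "reversing the order of the numbers" really is about this small. A minimal sketch in Python, assuming a frame is just a list of pixel rows (a real capture API would hand you a buffer in some other layout, so treat the representation as an assumption):

```python
def mirror_frame(frame):
    """Return a left-right flipped copy of a frame given as a list of pixel rows."""
    # Reversing each row swaps left and right; row order (top to bottom) is kept.
    return [list(reversed(row)) for row in frame]
```

Flipping twice gives back the original frame, so either end of the connection can apply it without the other needing to know.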
Stick Men
Interesting? WTF? Since when should someone who didn't even read the article, and so didn't see that THERE WERE indeed screenshots in it, be modded up?
Moderators, teach that guy to RTFA please, thanks.
It's first thing on a Monday morning, and unluckily, I chose you upon whom to vent my pent-up domestic strife from the weekend.
Stick Men
Unless you have already filed your patent, you just put your so-called intellectual property into the public domain... :P
> "Walla", as they say, you have a non-mirror image.
They might say that, although it would be quite unusual.
They would probably be more likely to say "Voila!", if they were not a retard.
HTH
> "Walla", as they say
Well, you might if you were a moron, while the rest of us say voila.
Seeing some misconceptions, tossed up a quick FAQ at http://www.cs.unc.edu/~smithja/facetop/index.html for your perusal.
I'll be adding material to it through the morning as issues pop up, but these are the ones we've seen the most of this weekend.
and someone thought it was a new transparent 3D uber chat window
heh shame
The reason I put it in quotes is because I deliberately spelt it wrongly in a crude attempt at anti-slashbot irony. I'm not quite that stupid. I note that you post as AC.
Stick Men
Actually, the camera can be anywhere, as long as you're in the field of view.
As for ease of use, it literally takes people about two seconds to calibrate their hand motions to the cursor movement, and they're off and running. It's exactly like you're standing in front of a mirror (assuming the camera is in front of you), and gesturing... the visual feedback you get from your own image is the key. The transparency lets you see both your 'reflection' and the document content simultaneously.
Don't worry, we're seeing a lot of people confusing the single-user mode (one head on screen) with the video-conferencing mode (two heads on screen), simply because they're not used to video conferencing including themselves.
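To make the mirror analogy concrete, here is a rough sketch in Python of how a tracked fingertip position could be turned into a cursor position. This is not Facetop's code: it assumes a tracker already supplies the fingertip's pixel coordinates in the camera frame, and `fingertip_to_cursor` and its parameters are hypothetical names for illustration.

```python
def fingertip_to_cursor(tip_x, tip_y, cam_size, screen_size, mirrored=True):
    """Map a fingertip position in the camera frame to a screen position.

    cam_size and screen_size are (width, height) in pixels. With mirrored=True
    the x axis is flipped, so moving your hand to your right moves the cursor
    to the right of your on-screen reflection, as it would in a mirror.
    """
    cam_w, cam_h = cam_size
    screen_w, screen_h = screen_size
    if cam_w < 2 or cam_h < 2:
        raise ValueError("camera frame must be at least 2x2 pixels")
    if mirrored:
        tip_x = (cam_w - 1) - tip_x
    # Scale linearly so the camera's corners land on the screen's corners.
    x = round(tip_x * (screen_w - 1) / (cam_w - 1))
    y = round(tip_y * (screen_h - 1) / (cam_h - 1))
    return x, y
```

Note that nothing here calibrates anything. In this sketch, as in the description above, the user does the calibration by watching their own image and adjusting their hand.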
Even more cool than just interacting with one desktop is the possibility of interacting with an entire universe full of 2D desktops and 3D applications.
http://www.opencroquet.org/About_Croquet/screenshots.html
Once again, the inventor of the mouse leads the way.
I am always amazed at how unpopular video-teleconferencing is. The first time I saw a desktop VTC system (running on a 486, I think) - back in the late-eighties some time - I thought, wow, it can only be a few short years, and every home will have one of these instead of a telephone.
Almost twenty years later and there is still seemingly little interest.
I've tried it and it's pretty cool. It's great to see the expression on your opponent's face when you roll your army of tanks into his left flank when he's least expecting it.
> Windows has had the ability to draw transparent windows since 2000. However, there's a limit to how far they can go.
Transparency in window systems is an idea that goes back almost as long as window systems have been around. People were even asking for it in the earliest versions of X11.
The only reason it hasn't been implemented more widely is because hardware hasn't really been up to it and applications didn't need it. Those applications that really did need it just used special graphics and visualization libraries.
Apple has this feature not because they had some great new insight, but because they got it essentially for free with the PostScript-based window system they acquired from NeXT, which was designed in the 1980s and is based on stuff from Adobe. And now that hardware is up to handling it, it will just be a standard component of desktop window systems.
Before people go all gaga over this rash of new features from Apple and Sun, do some research. Transparency, shadows, overlays, desktops on 3D virtual presences, portals, 3D environments, etc. go back years and years, and many researchers have contributed to them. (You can find many of these features demonstrated in Squeak.)
This stuff is being shipped commercially now because you can now run it on PC hardware costing $1000-2000, instead of requiring high-end SGI workstations, as it did just a decade ago.
Apple does have a short-term advantage in this area: as part of their deal with NeXT, they got a PostScript/PDF-based window system, which happens to do these sorts of things already. But they don't have any long-term technical advantage: Microsoft has added similar APIs to Windows, and X11 is also moving towards full support of transparency and blending throughout the entire window system. Whose system will end up being the best long-term choice remains for the market to decide; personally, I think Apple's PDF-based system is the most cumbersome of the bunch.
The whole point of this is that a live video feed of the person you are working with is super-imposed on the screen while you work on the shared desktop.
Sounds like what you were working on was just a shared workspace?
Nice to see one of the developers participating in the forum, Jason. But more importantly, is there going to be a release so that we can get our grubby hands on it?
I would get a skull mask and have the desktop fade in and out.
> The reason I put it in quotes is because I deliberately spelt it wrongly in a crude attempt at anti-slashbot irony.
Surely you would have been better to use inverted commas in that unlikely case, rather than quotation marks?
HTH
I was at a friend's and noticed him watching TV maximized to fullscreen but blended with his desktop, so he could continue to work and watch TV without playing 'move the window'.
Is there any way to do that in Linux today?
Combine this with the 3D technology that the latest LCD displays have, and video conferencing would be possible Neon Genesis Evangelion style.
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
> Surely you would have been better to use inverted commas in that unlikely case, rather than quotation marks?
My Higher English teacher told me I could use either " or ' as long as I was consistent. That much I learned from school...
Stick Men
Hmm... I wonder how many of those 6 billion have access to Macs and FireWire cameras.
> My Higher English teacher told me I could use either " or ' as long as I was consistent. That much I learned from school...
Strange, after googling, this does indeed seem to be the case.
My apologies. To me, quotation marks and inverted commas have never been interchangeable, but it appears I am in fact wrong on that. That may explain MY Higher English result! Sorry.
As with everything these days, the IP is... interesting. As UNC employees, we're not allowed to just start tossing the tech around willy-nilly, UNC owns the IP. They have applied for a patent on this, and are looking at a licensing scenario for those who would like to commercialize it. (Or include it in their OS *cough*.)
It would be nice to just disseminate the thing, but I don't legally own it to do so.
Your FAQ page says that they (should be you, but undoubtedly you signed away all those rights, what use would an academic have for the fruits of his own labour... *sigh*) _have_ a patent (and doesn't say on what)
It also says that because of the patent you can't discuss technical details. Which as you should know makes no sense - a patent exchanges public knowledge of the implementation details for a monopoly production right. The fact that your patent lawyer may have used weasel words to dodge this doesn't make it less true (though it may make it easier for us to get the patent invalidated later under a more favourable administration).
David Stotts' website. It looks like they are going to be presenting at Hypertext 2004 as well.
OK, aside from the "cool" factor, does this really buy you anything? I mean, you can accomplish the same thing with a VNC session and a telephone (IM client, VoIP client, etc), no?
It seems like some of those are trolls and maybe don't qualify fully as "jokes".
That aside, please publish the ids for:
1. x
2. y
3. Profit!
4. z
and
I, for one, welcome our new x overlords.
It would be great if we could have a web page or email address where people could simply submit a comment id to have it reviewed for inclusion in the db.
Thanks.
Yeah, he has the unfair advantage of knowing what he's talking about, but the "stupidly simple FAQ" is worth seeing.
I think there is already something like this for Windows. It is called WebEx; I recall seeing a demo at my place of business. They want outrageous prices for something like this.
I am not an Open Source/Free Software zealot, but it does irritate me somewhat when publicly funded institutions patent research projects =)
I also mean no disrespect to the developers, as I am aware how little control they have over this fact.
.technomancer
This looks similar to the eyetoy for PS2. It works the same way.
This is a better use of the idea in my opinion. I'd like to use this to replace a mouse, plus the collaboration use looks great. Kudos to the ones who put it together!
Can we do this with X11?
AB HOC POSSUM VIDERE DOMUM TUUM
It turned out to be called AccessGrid and well, it's not the greatest thing since sliced bread - far from it in fact.
The first round was a limping and cantankerous Java crap/mess that was scrapped when MS fucked Java up the ass.
The latest round, AG2.2, is Python and only a little better. The functionality provided is massively weak compared to the amount of work and hardware/bandwidth expense, but plans are moving ahead.
It's nice, but looking at this story and the shots (if they are indeed real), it's clear that AG is getting its motherfucking ass SMOKED by this. Period.
Nice try though.
You mean I'll have to get dressed now for my online business meetings?
If you don't have stickers, make a few small circles on the mirror using your girlfriend's lipstick.
Now step about 3 feet away from the mirror.
Move your finger so that when you look in the mirror, it looks like you are touching the stickers but you don't physically do so, it just looks like you do to your eye.
Notice that you can do this regardless of your angle to the mirror, you just have to adjust your finger.
Now imagine that the stickers are icons on your desktop
and VIOLA!
If I get this correctly--and it appears most everyone else does not--I see your alpha-blended image as if you were sitting in my place; similarly, you see mine. Not flipped, or virtualized, but 'reflected' as if we were sitting in each other's place. Anyone who has worked on developing, using and testing collaborative solutions will recognize that there is, potentially, a real AHA! here.
VNC, NetMeeting and WebEx and all that clever crap is limited or useless, except in the hands of sophisticated people doing support or in very narrowly defined lecture settings, because these applications all abstract the notion of the absent party. You know I am moving the cursor on your machine because on your screen it has a red box around it, while I see an unboxed cursor. I may be talking to you over a voice channel, and we learn to abstract a collaborative session in our heads from voice and visual cues. If you think ordinary mortals can learn and map that cue-based interaction to natural behavior so they can just work together on a document, then you labor in ignorance.
The AHA! here is that the absent party is not abstracted; they are substituted via alpha-blending. Hard to say without seeing and feeling this in action, but my hunch is this is literally a big step forward from the user perspective. People will get who is doing what without confusion. Contemporary marketing practice would argue UNC make the code open source but patent and license the cute little red finger-tracker-dealies for this use.
Ever heard of VNC? It's cross-platform, open source, and it's been around for years, and it's exactly what you just described.
But it's not like this technology at all. Sorry.
I've got more mod points and GMail invi
You're correct, I will clarify the language on that page.
We are in the end stages of the patent *process* (applications formally filed, etc), and until I get the all-clear from the suits, I am not going to say a blasted thing outside of an NDA that hasn't already been cleared through channels such as academic publications and interviews. I like my grad student posterior unkicked by wingtips.
Dead spot on.
Bump that puppy up, mods, if you don't mind.
She has no trouble recognizing a printed word when she sees it somewhere else, so I usually end up pointing my iSight at my screen. But it would save so much time if I could show her what to look at on her screen.
I'm not a huge software patent fan myself
Does this mean you'd like for it to end up, say, on sourceforge?
I first heard about this on appleinsider. One of the moderators there, Kickaha, is one of the developers.
http://forums.appleinsider.com/showthread.php?s=&threadid=41910
Historians such as myself often long for a means to consult a distant colleague about some text, which in my case is often a centuries-old Korean manuscript written in classical Chinese, in which only three people on earth may have interest. This technology holds the promise of facilitating intensive collaboration between scholars on different continents, which is a luxury currently next to impossible with only voice communication.
Adding to the university coffers is a worthy pursuit, but I hope in the rush to wealth UNC won't forget the basic goal of the university to create communities that further knowledge.
The tao I can speak of is
Your use of the brain's hand-eye coordination circuitry to solve the calibration problem is brilliant. Employing the brain/body to solve the computer interface problem is exactly what the Mac's "Spatial Finder" was all about. Use the brain's visual powers as well as "logical" powers to more completely understand data organization, muscle memory for efficient command issuance, etc. Some never understood that was what the Mac did well. It wasn't the desktop space metaphor so much as the interfacing with the brain/body modus operandi. That the Mac's technologies allowed your leap in UI is a nice break for Apple and quite fitting considering their heritage. Sadly, Apple has discarded the spatial finder with OSX. Let's hope they can still recognize the value of your take on the spatial UI.
I am glad you mentioned the ClearBoard concepts in the FAQ. The Wired story, I believe, included a quote from your team along the lines of "we can't believe no one had thought of this." ClearBoard seems to contradict that sentiment, except of course for your UI breakthrough. ClearBoard, in 1992 at least, seemed to be a shared drawing app with some touchscreen capabilities. It combines video of the user with data captured via touchscreen. Facetop employs no touchscreen. It combines video of the user with data, but that data is captured in a breakthrough way. Manipulation is done via the user image, not via touchscreen as in ClearBoard. I don't think ClearBoard gets to where you are from where it was in 1992.
However, what do you think of the pane of glass metaphor? Is gaze-awareness maintained effectively through your interface? Is it valuable in your eyes? ClearBoard captures the image of the user directly inputting the data, which is a powerful perspective, it seems. Your interface somewhat breaks the link between data entry and user image from the point of view of the other party (I don't see you drawing, or could I?). How does Facetop compare in drawing, whiteboard-type applications? In any case, gaze-awareness may be less critical with text.
Long post, but maybe you could share your thoughts on some of these questions. I'm sure I would find them fascinating.
-Alex Caro
You should definitely explain your system in terms like these to journalists. It took me a couple of minutes before I realised what the real breakthrough was. (Then my metaphorical jaw dropped. Hats off to you both.)
I'm surprised not to find this in the FAQ, next to "Think of this as a replacement for the mouse": how do you click?