You're correct, I will clarify the language on that page.
We are in the end stages of the patent *process* (applications formally filed, etc.), and until I get the all-clear from the suits, I am not going to say a blasted thing outside of an NDA that hasn't already been cleared through channels such as academic publications and interviews. I like my grad student posterior unkicked by wingtips.
Thank you for understanding that our hands are tied, no matter what our personal preferences and beliefs about open source, and software patents, may be. :}
It may turn out that way. We, and UNC, don't believe so, but it will be up to the patent office to determine that.
I can't go into details on the actual patent application until that process is complete, but the seemingly small twist that ClearBoard's conceptual model is 'two people on either side of a pane of glass' and ours is 'two people sitting side by side' made an amazing amount of difference as to how it was designed and therefore what it can do.
The FAQ also illustrates that the number of people interacting is the number of people shown - you also see yourself, something ClearBoard did not do. The FAQ explains why this is critical.
*I* would consider it. Unfortunately, as we are employees of UNC, they own the IP, lock, stock and barrel. I am working with them to find a nice happy middle ground.
As for what is patentable, I am unable to comment on that until the patent process is complete. (Yeah, it's not my first choice either, but I'm bound by the legalities of my situation.)
You're right, there is a lot of good work out there that we've uncovered since starting this project. To be honest, I'm not a CHI guy; I'm a software engineering researcher (my dissertation is on the automated detection of instances of design patterns in OO source code), and David Stotts concentrates on pair programming methodologies, with a heavy background in hypermedia traversal theory. This is just a side gig. We came up with the idea and were absolutely *stunned* that no one had done it exactly this way before; it seemed so obvious after that magic moment of eureka.
But, that eureka is what they give patents for, for better or for worse. It's not my IP to determine its fate.
You're close... but it's actually better than that.
Check the FAQ here for a rundown of what's actually going on.
You see the other person *and yourself*. It's as if you're sitting side by side, working at the same keyboard. If either of you lifts your tracked fingertip into the camera view, the cursor is controlled by it. Either user can control the cursor, and edit the shared document(s).
And you're right, there is very little confusion as to what's going on - most people take to it immediately.
D'oh, sorry.... no sleep makes me slack.
Without video conferencing, you lose a lot of the nuances of human communication.
Stotts' work in distributed pair programming almost invariably resulted in "We really wish we could *see* them..." which led to video conferencing, which led to "We still have to verbally describe things we could just point to."
Also, VNC clones the same desktop to both machines - which makes traditional video conferencing... difficult. ;)
With our approach we can clone the entire screen if you really want, but our current system lets you choose which *documents* you're sharing. The other user doesn't see everything on your screen, only the work you are sharing.
I'm sure you can appreciate the utility of that. ;)
Try editing a large coding project between two people in different locations over the phone.
"Okay, now open up the header... now go down to line 304... what do you mean it's not that long? No, the *other* header. Okay, now line 304. Right. The for loop halting condition is... what do you mean what for loop? The one right there... you're viewing it in *what*? Oh forget it, send me the file."
vs....
"Yeah, right here..." *point*
Which would you rather do?
As with everything these days, the IP is... interesting. As UNC employees, we're not allowed to just start tossing the tech around willy-nilly; UNC owns the IP. They have applied for a patent on this, and are looking at a licensing scenario for those who would like to commercialize it. (Or include it in their OS *cough*.)
It would be nice to just disseminate the thing, but I don't legally own it to do so.
I'll take that as a compliment on our making it look like you're standing in front of a mirror.
;)
That's rather the *point*.
Trust me, it would have been much easier to take a picture of a reflection on the screen surface than develop the bloody thing.
Very cool.
But you can't just reach up your hand into the camera field of view to smack his battalion into submission, can you? Think about how satisfying *that* would be. ;)
I'll take that as a compliment. ;)
Yes, that's precisely why it works - it's like you're looking into a mirror. Raise your hand to move the cursor up. Move it left, cursor goes left. It's just that simple.
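For anyone who wants to see the mirror idea in concrete terms, here is a minimal sketch of mapping a fingertip position in a camera frame to a cursor position. It is not FaceTop's code; the function name, the pixel-coordinate convention (origin at top-left), and the assumption that the camera faces the user are all mine, purely for illustration.

```python
def fingertip_to_cursor(fx, fy, frame_size, screen_size):
    """Map a fingertip position in the raw camera frame to screen coordinates.

    Hypothetical helper, assuming a camera facing the user and pixel
    coordinates with the origin at the top-left of both frame and screen.
    """
    frame_w, frame_h = frame_size
    screen_w, screen_h = screen_size

    # A camera facing the user sees the user's left hand on the right side
    # of the image, so flip horizontally to get a mirror-like view.
    mirrored_x = (frame_w - 1) - fx

    # Scale frame coordinates up (or down) to the screen resolution.
    cursor_x = round(mirrored_x * (screen_w - 1) / (frame_w - 1))
    cursor_y = round(fy * (screen_h - 1) / (frame_h - 1))
    return cursor_x, cursor_y
```

The vertical axis is left alone, so raising your hand moves the cursor up; only the horizontal axis is flipped, which is what makes it feel like a mirror.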
Sadly, no. Forty-five minutes was about the time it took me to write the initial proof of concept, not the full application. (That included reading the documentation on various APIs.)
But yes, Cocoa made it much easier to do so.
Yes, it is similar, but with three critical differences:
1) ClearBoard's conceptual model was two people standing on either side of a pane of glass. Ours is a much simpler view... two people sitting side by side. We have no issues requiring us to flip document content, for instance. It is a small but important difference in how it drives the implementation.
2) ClearBoard required expensive and cumbersome hardware. FaceTop requires a $100 FireWire camera. Well, and a Mac. ;)
3) ClearBoard was designed to be integrated with specific applications. FaceTop becomes an input device, much like a mouse replacement, and thereby can work with any application on your system. We generalized it out, and it became much more powerful.
Actually, our experiments have found that it really doesn't matter.
First off, the translucency is adjustable. Looks too cluttered? Make it more faint. Secondly, it's much like being in a room full of conversations at a party - you select particular conversations to pay attention to, and the rest just 'fade away'. In this case, when the user turns their attention to the document content, they don't notice the video, and when they concentrate on the video (either for hand motions or interaction with a remote user), the document content is ignored. The brain is much better at this sort of thing than most people realize.
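To make "adjustable translucency" concrete, here is a small sketch of ordinary per-pixel alpha blending between a document pixel and a video pixel. This is the textbook formula, not a description of how FaceTop actually composites; the function name and the choice to put the video "on top" with opacity `alpha` are assumptions for illustration.

```python
def blend_pixel(doc_rgb, video_rgb, alpha):
    """Blend one video pixel over one document pixel.

    alpha is the video's opacity: 0.0 shows only the document,
    1.0 shows only the video. Hypothetical helper for illustration.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    # Standard "over" blend, applied channel by channel.
    return tuple(
        round(alpha * v + (1.0 - alpha) * d)
        for d, v in zip(doc_rgb, video_rgb)
    )
```

Turning `alpha` down is the "make it more faint" knob: the video contributes less to every pixel and the document content dominates.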
Yes, you can make individual windows translucent in Windows.
Just don't expect 2D and 3D pipeline windows to intermix with translucency. While this system would be *possible* under Windows, we feel that, at this point in time, it is not *practical*. Three months were spent attempting a proof of concept under Windows. The prototype on the Mac took 45 minutes.
Longhorn's new graphics system will bring it to parity with Quartz, and it should then be equally feasible there as well.
Actually, the camera can be anywhere, as long as you're in the field of view.
As for ease of use, it literally takes people about two seconds to calibrate their hand motions to the cursor movement, and they're off and running. It's exactly like you're standing in front of a mirror (assuming the camera is in front of you), and gesturing... the visual feedback you get from your own image is the key. The transparency lets you see both your 'reflection' and the document content simultaneously.
Don't worry, we're seeing a lot of people confusing the single-user mode (one head on screen) with the video-conferencing mode (two heads on screen), simply because they're not used to video conferencing including themselves.
Seeing some misconceptions, tossed up a quick FAQ at http://www.cs.unc.edu/~smithja/facetop/index.html for your perusal.
I'll be adding material to it through the morning as issues pop up, but these are the ones we've seen the most of this weekend.
Nope, that's a misconception we're seeing pop up from time to time. This is not like standing on either side of a pane of glass... this is like sitting side by side with the other user.
I just put up a stupidly simple FAQ of sorts at http://www.cs.unc.edu/~smithja/facetop/index.html and will be updating it this morning.