RSI, WIMPs and Pipes; What Next?
Tetard asks: "Long live the pipe! Since the `|' was invented by Doug McIlroy in 1973, has there ever been a more effective way of reusing tools and connecting data? The mouse is a device of the Beatles era. Rather than try and provoke nostalgia in the older ones among us, I'm asking myself, as are others: when we don't try to reinvent the wheel, or at least improve it, why must we try and copy it every time? Xerox PARC exposed us to WIMPs and we haven't done better: some innovation, some plastic surgery -- but no "paradigm shift" -- where's the creative destruction that will take us further? Graphical component programming is turning us into click-happy bonobos^H^H^Hchimpanzees, as we fail to find new ways to manage and connect richer data streams. My web designer friends are damaged for life because of mice, and yet we persist... Where do we go from here? If we ever invent the graphical pipe, let it have keyboard shortcuts." Yes, you've probably seen a similar question to this run by Ask Slashdot before, but this time I'm wondering if maybe we need new input devices before the WIMP paradigm is replaced with something better. Might any of you have ideas on what form these input devices might take?
For those interested, here are the previous stories that have handled this type of question:
So what will it take to break us out of the WIMP box (or prison, depending on your bias)? Maybe new input devices would do it, but quite frankly, I wouldn't be surprised if a 3D interface were another route (it would possibly spark interest in designing a new input device that would work better with 3D interfaces, or maybe data-gloves could serve this purpose?). Going out on a limb, maybe this guy might just be the ticket.
With a sub-$100 webcam watching you, look at the point of the screen where you would click, and blink.
Are there lots of problems to doing this? Yes. Should that stop me from throwing out the idea? No.
So far, we're pointing at things on a screen, moving them around, and typing messages. Datagloves and other visual manipulations will be important for all sorts of specialized tasks, but the way we tend to communicate is through speech and body language.
Speech recognition is only useful for very limited functionality, mainly because computers haven't had the speed or the large databases needed to really make use of syntax and context. Continuous speech recognition today typically uses waveform profiles with no contextual or grammatical analysis.
But with faster processors and larger memories, I expect speech recognition to go to the next quantum level within 5-10 years. Once we add contextual and grammatical constructs to speech recognition, computers will start to be able to really understand what we're saying. To go from that to understanding what we *mean* is another step, but that's coming too.
I also expect computers to have video cameras and to be responsive to our body language and facial expressions. They will be able to judge whether what they're doing is interesting or useful, and will ask for guidance or attempt to correct based on that feedback.
In other words, I expect interaction with computers to become more like interaction with people!
This seems a bit like asking what it would take to replace the current way of driving a car (steering wheel, gas and brake pedals, etc.) with something better. But the interface between humans and automobiles is pretty much a solved problem, and nobody seems to spend much time speculating on what a paradigm change in automobile control would be like.
There's a curious assumption which I've seen repeatedly-- namely, that a paradigm shift in human/computer interaction would be a good thing. Why, exactly? I see no reason to pursue a paradigm change for its own sake; I view it as a problem which has basically been solved for now, much as the problem of steering cars is a solved problem.
I know it's not relevant to the question, but I would like to ask Tetard how he manages his bookmarks. I mean that I might have had all the links that he throws in, but either
a) I would have forgotten that I have them (too many, too scattered), or
b) I would have 10-level subdirectories in my bookmarks
and they would be a mess. (Unfortunately they already are :-) )
That sort of thing will be the wave of the future, and it will mean that apps will have to be smarter and communicate a lot more than they do today. My personal agent should reside on my local machine, not the network, and should watch out for my personal privacy. It should divulge only what is necessary to others in order to perform the commands that I give it. It should be flexible and configurable, but I should never have to configure it; it should learn what I like by how I interact with it.
Several large companies have been working toward this holy grail for years, but thus far not even commonplace voice recognition, much less NLP, has emerged from their research. Sure, there are some voice recognition packages out there, but there's very little integration, and AFAIK nothing at all in the NLP arena. We could start working toward the level of integration that would be a necessary foundation for a lot of this stuff, but I don't know that you could get the necessary level of cooperation in ANY software development community.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
The mouse and keyboard work well together: the keyboard is very versatile (eg for games and typing), and you can switch very quickly between the keyboard and mouse without looking. The pen, OTOH, works poorly with the keyboard because it takes too long to switch: you have to locate the pen and pick it up in the right way before you can use it. You also have to hold it in the air rather than rest it on a surface, so it's more fatiguing.
I propose a development of the mouse I call the wand: it's shaped kind of like a mouse but can also be picked up and used in space, or stood on one end on the desk and used sort of like a joystick. It's sensitive to its orientation and motion in space, and can give tactile feedback as clicks and buzzes and things. It has buttons and levers and whatnot in suitably devious spots for all your fingers so you can work all sorts of games with it.
So: you play Quake with it by sliding it around on a desk, rotating and tilting it. You use it just like a mouse with your word processor. You do 3D modelling with it by just waving it around in the air. And so on.
It can have a little thumb joystick for even wilder input combinations.
It's portable. You can carry it in your pocket to use with a wearable computer (or it can be the computer). It's potentially reasonably cheap. It expands on an existing paradigm and hence is compatible with existing software and interfaces.
It slices! It dices! When will someone make me one?
The trouble with most GUI desktops is that they are designed for manipulating items on a GUI desktop and customizing the GUI desktop rather than making the interface transparent. As a counterexample consider the Palm, where the idea is to make the UI as lightweight and unobtrusive as possible, because people want to just take notes and view their schedules. A WIMP desktop is overkill, so they went with something an order of magnitude simpler.
A 3D desktop is a step in the opposite direction, placing more emphasis on the desktop itself than what people want to do.
A speech interface would be nice, but only if it was supplemented with a standard mouse and keyboard (and maybe a touch screen) and would accept natural language commands. As far as the user interface goes it should have a complete abstraction from applications and the file system leaving the user to only be concerned with documents.
The reason it should also have a mouse and keyboard is security, so passwords etc. wouldn't have to be spoken (see the recent User Friendly strip series for a humorous take on that), and so things you're doing could be kept somewhat private. Imagine starting up a long build or whatever on your machine, figuring you'd take a short break while everything compiles, and telling your computer 'open mozilla. go to hot asian chicks dot com. click hot and horney'. You might get more than a few head turns from local cube dwellers, unless you bookmarked it and renamed it to something like 'intranet', but the renaming process would also have to be vocalized.
It should also accept natural language commands for text that is complicated to speak. The main example of this is programming. If I wanted to do:
for (int i = 1; i <= 10; i++)
    cout << i << endl;
I would like to just say 'for loop. local integer i from one to ten step one begin. print i and end line. end loop' instead of having to articulate each punctuation symbol as 'for open parenthesis int i equals one semicolon i less than or equal ten semicolon i plus plus close parenthesis. enter. c out less than less than i less than less than end l', not to mention if I had to put spaces in there too.
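To make the idea a bit more concrete, here is a rough sketch of the step after the recognizer: taking the dictated words as a plain string and turning them into source text. Everything in it is made up for illustration (the phrase pattern, the number table, the name spoken_loop_to_cpp); it handles exactly one kind of loop and is nowhere near real natural language understanding.

```python
import re

# Hypothetical vocabulary: spoken number words -> values.
NUMBERS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
           "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}

# Hypothetical phrase pattern covering a single counting loop that prints
# one value per line. A real system would need a proper grammar.
LOOP = re.compile(
    r"for loop\. local integer (?P<var>\w+) from (?P<lo>\w+) to (?P<hi>\w+) "
    r"step (?P<step>\w+) begin\. print (?P<what>\w+) and end line\. end loop"
)


def spoken_loop_to_cpp(phrase: str) -> str:
    """Turn one dictated loop phrase into C++ text, or raise ValueError."""
    m = LOOP.fullmatch(phrase.strip().lower())
    if m is None:
        raise ValueError("phrase does not match the loop pattern")
    try:
        lo, hi, step = (NUMBERS[m[k]] for k in ("lo", "hi", "step"))
    except KeyError as err:
        raise ValueError(f"unknown number word: {err}") from None
    var, what = m["var"], m["what"]
    # A step of one is written the way a programmer would type it.
    inc = f"{var}++" if step == 1 else f"{var} += {step}"
    return (f"for (int {var} = {lo}; {var} <= {hi}; {inc})\n"
            f"    cout << {what} << endl;")


if __name__ == "__main__":
    print(spoken_loop_to_cpp(
        "for loop. local integer i from one to ten step one begin. "
        "print i and end line. end loop"))
```

The hard part is of course everything this skips: hearing the words correctly in the first place, and coping with the hundred other ways a person might phrase the same loop.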
The next thing we would need is an abstraction from the use of applications and the file system, which would fit in very well with a speech interface. The user would only be concerned with documents and data. The user would just ask the computer to start a new report on photosynthesis, the computer could ask the user what to call it, and they could just respond with a natural name like 'biology 101 mid-term'. Later the user would just ask the computer to open the biology 101 mid-term without having to care if it was opened with Word or StarWriter or KWord, etc. It would just be there and they could work on it.
The abstraction from the file system would be a natural extension of this because the user doesn't need to know where anything is because the computer takes care of it for them. The user just needs to remember documents/files as he would anything else 'I was writing that letter to Bob', 'I was working on the bio mid-term', etc. This also furthers the use of a computer as a tool, because it would actually help you get things done and be easy to use by anyone because speech is a natural interface for us, but keyboards and mice are not.
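A minimal sketch of what that could look like underneath, assuming the speech side already hands over the name as text. The DocumentIndex class, the /store/ layout and the "-editor" application names are all hypothetical, and plain fuzzy string matching from Python's difflib stands in for whatever smarter matching a real system would use.

```python
import difflib
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str  # the natural name the user gave it
    path: str  # where the system decided to store it; never shown to the user
    app: str   # whichever application the system picked; also hidden


@dataclass
class DocumentIndex:
    docs: dict = field(default_factory=dict)

    def create(self, name: str, kind: str = "report") -> Document:
        # Hypothetical storage layout and application choice, made by the
        # system rather than the user.
        doc = Document(name, f"/store/{len(self.docs):06d}.{kind}", f"{kind}-editor")
        self.docs[name.lower()] = doc
        return doc

    def open_by_name(self, spoken: str) -> Document:
        """Return the document whose name is closest to what the user said."""
        hits = difflib.get_close_matches(
            spoken.lower(), list(self.docs), n=1, cutoff=0.5)
        if not hits:
            raise KeyError(f"no document resembling {spoken!r}")
        return self.docs[hits[0]]


if __name__ == "__main__":
    index = DocumentIndex()
    index.create("biology 101 mid-term")
    index.create("letter to Bob", kind="letter")
    doc = index.open_by_name("biology 101 midterm")
    print(doc.name, "is opened with", doc.app)
```

The user only ever deals with the name; the path and the application are bookkeeping the machine keeps to itself. Whether string similarity is good enough for 'that letter to Bob' style requests is an open question.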
The best use I can think of for something like a touch screen is web browsing or editing documents/preparing presentations, drawing (but maybe a graphics tablet would be better for that), etc. So instead of telling the computer to open the 'Read more' link, I could point and it would open whatever I pointed to.
Microsoft is trying to do this with things like the My Documents folder and automatically naming documents with the first line of the document, but it's still somewhat kludgy because it relies on keyboard and mouse interaction. They are kind of on the right track in terms of abstraction from applications and the file system, but still have a ways to go. This is why they have the Documents folder in the start menu and New Office Document and Open Office Document on the start menu instead of the programs menu. This is also why they have extension associations with applications so the user can just click on a document and it will spawn the right application (or maybe they just stole it from Macs).
These ideas are nothing new, I've seen them all somewhere else before, but I just thought I'd post them here for discussion because I think they're good ideas. It should also be noted that this type of interface is for the 'average' user, not the average Slashdot reader, since we all like our keyboards and CLIs.
Things you think are in the Constitution, but are not.
This would be good, but it would require a 3d interface. I think the only way to truly do a 3d environment is for it to exist in the physical world. Any 3d interface on a 2d screen will become kludgy pretty quick. The best way to do a fully 3d interface is to put it in the real world. Imagine if your desk *WAS* the computer. The desktop *WAS* your actual desktop. You open your drawer to see 'real' manila folders with names on the tabs for your documents, thumb through them to find the financial report you were working on, 'grab' it and pull it out, and it appears on your desktop for you to work with. You open up another drawer and see pens, pencils, markers, highlighters, etc. that you then 'grab' to select what you want to start writing with. You could just slide your hand across the desktop to move documents out of the way, and tap or 'grab' a document that was 'under' the one you were working on so that it comes to the top and you can begin working on that one.
This would require a lot of holography and motion tracking, touch sensors, etc, but it would be the ultimate in 3d interfaces. It would avoid klunky HUDs and gloves, etc that just detract from the actual work. You could even bring up a keyboard on the desktop and use that instead of a virtual pen or pencil.
3d interfaces would be nice, but on a 2d display I think it's best to stick with a 2d interface.
But I think you're right; what I really want to see is a 3D device, not everybody trying to improve on the 2D paradigm. Of course, that means existing drivers and existing operating systems would need to be abandoned.
Can you name a single improvement to the concept of the wheel in those years? AFAIK, it is still a round thing that revolves around an axle. Sure, the machining precision of that roundness and that axle is many orders of magnitude better than 5000 years ago, but the concept is still the same.