Opera Promises Voice-Operated Web Browser
unassimilatible writes "Opera's latest browser talks and listens, according to AP.
The new browser incorporates IBM's ViaVoice technology, enabling the computer to ask what the user wants and "listen" to the request. "Hi. I am your browser. What can I do for you?" asked a laptop with the demonstration versions of the browser. The message can be personalized, such as greeting users by name. The computer learns to recognize users' voices, accents and inflections by having them read a list of words into a microphone. Opera plans to first launch an English version of the voice browser for computers running the Windows operating system. Versions for other systems, including handhelds, will follow. Opera's press release has more details, including Opera's hopes that people will adopt this technology for presentations - and to replace PowerPoint."
"Computer...Take me to the pr0n!!"
This sounds like a fun thing to play around with, but I certainly don't see myself using it as a normal web browser. I'll most likely stick with my keyboard.
As for their statement about it being a replacement for powerpoint, I don't think that this will fly either unless they: a) find a company to make a powerpoint alternative which saves to html files, or b) make the aforementioned software themselves. Even if they accomplished that, people's stupidity and ignorance have proven time and time again that whether microsoft's software is better, worse, or just as good as its competitors', people will buy microsoft's software instead of others. Look at openoffice.org, mozilla (most people use ie)/opera/konqueror/galeon/netscape/etc, linux, and a bunch of other superior software. Maybe a couple could be explained (linux often involves use of the command line interface, netscape is slower to load (even though ie cheats by loading some of the program at startup time)), but most of it is due to a problem which exists somewhere between the keyboard and the chair. Besides, I would find a remote control a better option than speech, since a remote control wouldn't force me to scream "NEXT SLIDE" across the room like an idiot before it recognizes what I'm saying. It would also be much smoother to just press a button on a remote control.
Now the jerk in the cubicle next to me will talk both with himself, "the fairies" and his browser.
Though I can certainly understand the need to market something unique, and the logic behind "Voice is the most natural and effective way we communicate.....", I cannot ever see myself talking to my web browser like another human being.
I've worked with and supported both ViaVoice and DragonNaturallySpeaking solutions for voice-based typing in word processors, and neither of them felt natural. Perhaps because I'm a geek, or just because I've been doing it so long, I'd rather manually key in exactly what I want and let myself make the mistakes, not the interpretation.
With corrections, it always took longer to do the alleged "easier way" than manually keying in. Even with 99% accuracy, Word Processing was always clunky at best.
That, and every time I scream out "litigious bastards", I don't need it pulling up litigious bastards.
*speak it* h t t p : / / slash dot . org
Voice input and output.. that'll make it a lot harder to discreetly search for pr0n whilst at work.
Computer: "Hi. I am your browser. What can I do for you?"
User: [whispering]Find me "porn"...
Computer: "The band KoRn was formed in 1993 by Jonathan Davis and..."
User: NO! [whispering] Not "KoRn"; "porn".
Computer: "Clogged pores are the major cause of adolescent acne. Starting at puber..."
User: NOT "PORE", DAMMIT!!! [coughs, lowers voice] find me "porn"..
Computer: "Iron Ore is the primary ingredient in steel. Metallurgists will add other elements and compounds to give the steel certain proper..."
User: NOT "ORE", YOU PIECE OF SHIT! [office mates look over cubes] [whispers] Look.. I want to look at naked people..
Computer: "The goatse.cx lawyer has informed us that we need a warning! So.. if you are under the age of 18 or find this photograph offensive, please don't look at it. Thank you!"
Trolling is a art,
"Is this the real life, is this just fantasy..."
A feeling of having made the same mistake before: Deja Foobar
I'm sorry Dave, I'm afraid I can't load that.
" I'm sorry, Dave. I'm afraid I can't do that"
The original generic sig.
You can do the same with just about any other browser on Mac OS X. With the speech module you can connect a voice command to any keyboard sequence. I have it set up to switch tabs, create tabs, and with the 'Make this page speakable' voice command, you can navigate to any page, making it work like a bookmark system.
What would be nice is if 'Speech' could recognize the commands for a particular application without switching focus. So I could be coding on one screen while browsing on another.
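A rough sketch of that wish, for what it's worth: if every spoken command starts with the name of the application it is meant for, a dispatcher can route it without caring which window has focus. Everything here is made up for illustration (the app names, the phrases, the key sequences), and it only looks up what should be sent; actually delivering keystrokes to a background application is left out.

```python
# Hypothetical per-application command tables: spoken phrase -> key sequence.
# None of these names or key bindings come from a real speech product.
APP_COMMANDS = {
    "browser": {
        "new tab": ("cmd", "t"),
        "next tab": ("ctrl", "tab"),
        "back": ("cmd", "["),
    },
    "editor": {
        "save": ("cmd", "s"),
    },
}


def route(utterance):
    """Split 'browser next tab' into (app, keys); return None if unknown."""
    words = utterance.lower().split()
    if not words:
        return None
    app, phrase = words[0], " ".join(words[1:])
    keys = APP_COMMANDS.get(app, {}).get(phrase)
    return (app, keys) if keys else None
```

So saying "browser next tab" while typing in the editor would come back as ("browser", ("ctrl", "tab")), and anything unrecognized is simply ignored.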
"It takes many nails to build a crib, but one screw to fill it."
There are many words in the English language that have homophones. Google, being a text-based search interface, is smart enough to not mix up "four" and "for", "too" and "two", or "plane" and "plain". There's no way for voice recognition technology to tell the difference between those words in a search query; there simply isn't enough context...
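To put a number on the homophone problem, here is a small sketch that expands one heard query into every spelling it could stand for. The homophone table is a hand-picked assumption built from the pairs above; a real recognizer would work from a pronunciation lexicon and a language model, not a lookup like this.

```python
from itertools import product

# Hypothetical, hand-picked homophone groups (just the pairs mentioned above).
HOMOPHONES = [
    {"four", "for"},
    {"too", "two"},
    {"plane", "plain"},
]


def spellings(word):
    """Return every spelling that sounds like `word`, in sorted order."""
    for group in HOMOPHONES:
        if word in group:
            return sorted(group)
    return [word]


def candidate_queries(heard):
    """List every text query that one spoken query could correspond to."""
    options = [spellings(w) for w in heard.lower().split()]
    return [" ".join(choice) for choice in product(*options)]
```

A four-word query like "plane tickets for two" already fans out into eight candidate text queries, and only one of them is what the user meant.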
How complicated can you make a browser?
I mean, tabbed browsing is cool, I've gotten used to it. But stuff like mouse gestures, voice recognition, etc, all just seems like fluff.
One could have mapped spoken keywords to mouse/keyboard actions already if this is what they wanted.
It's a hard arena to innovate in. This just seems kind of silly.
What's next, support for force feedback chairs that scroll the browser based on which ass cheek I'm clenching?
I don't need no instructions to know how to rock!!!!
I'm sure you've all done this at one point or another -- you stand over the shoulder of a friend or co-worker, and tell him or her to go to a website that you are familiar with, and they are not. Then you say "Ok, click on 'specs' up in the corner.... no, the other corner... yes, that button... no, don't click below it - that's something else..." Same deal with e.g. getting someone to change an option in a program somewhere -- you gotta walk them through a series of mouse clicks or things to look for, and it's frustrating when they don't do it right away. (maybe i'm just an impatient jerk?)
The point here is when it's hard to instruct intelligent people how to browse the web, how well can a computer do it? I have my doubts.
-S
...it's all well and good. but can the speech recognition module parse bork? if so, it will be the ultimate presentation tool:
"Now gentlemen, pleese-a turn your ettenteeon to-a sleede-a twelve-a. bork!bork!bork!"
For a while my wife was a physical therapist at a nursing facility that specialized in head trauma and paralysis. I installed Dragon NaturallySpeaking for several patients there and several of them became extremely proficient in using it. I'm not sure how having built-in support would be more advantageous, though.
I can't see this having wide acceptance in the corporate world. Cube farms are noisy enough. I can't imagine what it must sound like for everyone to be browsing by voice.
I also can't imagine some of my co-workers saying the addresses of what they browse out loud. *shudder*
My sigs always suck.
I installed some of the first off-the-shelf voice recognition software a number of years ago for my sister's cousin, who has cerebral palsy, and it made a huge difference in her being able to use the computer for her education. I sent the Opera link to her mom to look at, since this might be something that would suit her too.
" My next house will have no kitchen - just vending machines and a large trash can. "
"Dubya Dubya Dubya period white house period gov" ;-)
(note to dems, i'm not a troll, i'm canadian)
-
I got a free copy of dragon dictate once so I trained it as much as possible.
I got mozilla working quite happily, 'down' 'up' 'slow' (that was a good one, it slowly scrolled down), 'back' etc.etc.
The thing I found after weeks of training was that it was just so tiring talking all the time.
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
Also as the article notes one can buy more extensive add-on products like IBM's Mac/PC ViaVoice & Dragon's family of products as well as numerous other lesser-known and more specialized ones.
That's today, already on millions of desktops, ready and capable of driving web browsers, sitting there ignored.
Why?
- Few folks are even aware that speech recognition and speech generation are either already installed on their computers or trivial to install
- When general users do use these capabilities they're usually disappointed they're not more like the ones on TV, where a simple ambiguous command is immediately interpreted and plot-appropriate material magically recited out
- Most folks don't have microphones plugged into their computers, or they're ones unsuitable for speech recognition
- Few folks bother to put the time and energy into fine-tuning their microphones and training the speech recognition for their particular speech pattern and vocabulary
- Reading text is faster than hearing it, even at faster-than-typical-human-speech recitation speeds. The same goes for typing being faster than dictation
- Screens and keyboards afford a minimal level of privacy. With them eavesdropping generally requires line of sight, not just sitting in the next cubicle over and unavoidably hearing everything
So, where will this be useful? Anywhere keyboards aren't. Web phones. Industrial environments (well, quiet ones). For physically challenged folks with visual or manual problems. But sitting in the typical office workspace? Not gonna (still) revolutionize the world.
I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I won't bother either.
And his favorite browser is Opera. I bet this will just make him love opera even more! It's tedious for him to type, as he has limited control of his hands, so this will really help him out. I'm really glad Opera is doing this.
Are you sure it's a good idea to have presentation software that actually responds to comments shouted out by hecklers in the audience?
"Freedom means freedom for everybody" -- Dick Cheney
Imagine a PDA that you can actually talk to instead of having to struggle with "Graffiti" or the little thumb keyboards. Hell, if it's good enough, you could even get rid of the need for a screen and just interact entirely through voice - here's how we could finally get a useable web browser/email client/schedule program in a watch!
One step closer to some of the concepts explored in Snow Crash, maybe?
The first ever Ultimate Frisbee video game: here (now
God, I hope something like this replaces PowerPoint. As we all know, PowerPoint makes you stupid. It forces you either to dumb down your presentation to the intellectual complexity (and entertainment value) of an infomercial, or cram so much text onto your slides, most of which you will recite anyway, that you might as well just pass out reports in 3-ring binders.
That said, I think the most crippling thing about PowerPoint is its linearity. Not all presentations "want" to be laid out into a preset order of points. If a college professor or a businessperson gets asked a question during a presentation, all too often it is diverted by saying "well, that's coming up in a few slides", or the presentation is interrupted as tangential data is introduced.
Using voice recognition instead of click-through navigation opens up some great possibilities for non-linear presentations, though. Imagine that, instead of organizing your presentation into a linear timeline, you group slides and other media into "points", each of which represents a different idea relevant to your talk. You can arrange these points into a web, indicating what information depends on prior knowledge from other slides, etc. You then associate each point with an audio "cue", say a phrase like "projected profit margins" or "the three kingdoms period". You'll note that these phrases are things you're likely to naturally utter in your presentation anyway. This has the advantage of enabling you to speak totally naturally without interrupting your presentation. To avoid accidental jumping, we would have, say, a little translucent blue arrow fade into being every time a cue is recognized, disappearing a few seconds later. If you actually want to jump to a new point, it's just a quick click of a button when you see the blue arrow.
So, imagine you're giving a sales presentation to a group of executives. You notice this particular group is getting bored with your standard sales pitch. No problem, as you just drop a key phrase into your speech, and instantly change your presentation to include information you think will appeal to the business interests of your audience, or simply to their personality. Or, imagine a professor is giving a lecture on a piece of literature. A student asks a question about the author's background, and the professor can easily insert some information on their country, their historical circumstances, and their life.
Of course, organizing this type of presentation requires a greater investment in planning, and certainly requires a little more cognitive ability than your standard PowerPoint fare. However, those who work with these new presentation systems will be giving themselves an undeniable competitive advantage over presenters using linear methods. And those in the audience will be grateful, I'm sure.
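For the curious, the cue-and-confirm idea above is simple enough to sketch. This is only a toy model under my own assumptions: the class and field names are invented, the "speech" is already-recognized text, cue matching is a plain substring check, and the web of dependencies is recorded but not enforced.

```python
from dataclasses import dataclass, field


@dataclass
class Point:
    name: str
    cue: str                                     # phrase the speaker would say anyway
    slides: list = field(default_factory=list)   # slides and other media
    requires: set = field(default_factory=set)   # points this one builds on (recorded only)


class Presentation:
    def __init__(self, points, start):
        self.points = {p.name: p for p in points}
        self.current = start
        self.pending = None  # armed jump, i.e. the "blue arrow" is showing

    def hear(self, speech):
        """Arm a jump if another point's cue occurs in the recognized text."""
        text = speech.lower()
        for p in self.points.values():
            if p.name != self.current and p.cue.lower() in text:
                self.pending = p.name
                return p.name
        return None

    def confirm(self):
        """Button clicked while the arrow is showing: take the armed jump."""
        if self.pending is not None:
            self.current, self.pending = self.pending, None
        return self.current

    def dismiss(self):
        """Arrow faded without a click: forget the armed jump."""
        self.pending = None
```

Hearing "projected profit margins" only arms the jump; nothing moves until confirm() is called, which is the whole point of the blue arrow. The timer that fades the arrow would just call dismiss().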
Anonymous Luddite: "What do you think of the dehumanizing effects of the Internet?"
Andy Grove: "Not Much."