Slashdot Mirror


Opera Promises Voice-Operated Web Browser

unassimilatible writes "Opera's latest browser talks and listens, according to AP. The new browser incorporates IBM's ViaVoice technology, enabling the computer to ask what the user wants and "listen" to the request. "Hi. I am your browser. What can I do for you?" asked a laptop with the demonstration versions of the browser. The message can be personalized, such as greeting users by name. The computer learns to recognize users' voices, accents and inflections by having them read a list of words into a microphone. Opera plans to first launch an English version of the voice browser for computers running the Windows operating system. Versions for other systems, including handhelds, will follow. Opera's press release has more details, including Opera's hopes that people will adopt this technology for presentations - and to replace PowerPoint."

86 of 352 comments

  1. i can hear see it now by rabbot · · Score: 5, Funny

    "Computer...Take me to the pr0n!!"

    1. Re:i can hear see it now by freshman_a · · Score: 3, Funny

      actually, since the first version is for windows, i was thinking more of hearing "Where do you want to go today?" when you start it up...

    2. Re:i can hear see it now by saforrest · · Score: 4, Funny

      "Computer...Take me to the pr0n!!"

      And, since pr0n is not an English word, you'll get this.

    3. Re:i can hear see it now by jeffkjo1 · · Score: 4, Funny

      I've seen Star Trek IV one too many times recently. All I can picture is Scotty holding up the mouse to an Apple II and asking for pr0n. I feel so wrong.

    4. Re:i can hear see it now by Anonymous Coward · · Score: 5, Funny
    5. Re:i can hear see it now by sempf · · Score: 2, Funny

      Bah! I wrote a voice operated BBS in Assembler for the Apple IIe in 1985!

      --
      /usr/bin/grep -i -E meaning life.txt
    6. Re:i can hear see it now by sik0fewl · · Score: 3, Funny

      Hmm.. hands free surfing for pr0n. Imagine the possibilities.

      --
      I remember when legal used to mean lawful, now it means some kind of loophole. - Leo Kessler
    7. Re:i can hear see it now by Spetiam · · Score: 4, Funny

      page loads automated audio file

      browser: "close all other tabs"

      user: "what the hell!"
    8. Re:i can hear see it now by Idarubicin · · Score: 3, Informative
      page loads automated audio file

      It's okay. Opera lets you suppress those annoying automated audio clips. Hit F12 (opens the Quick Preferences menu) and uncheck 'Enable embedded audio'.

      The same menu also contains all the popup-killing settings ('Open requested popups only' works quite well) and lets you cripple some other annoyances of the web (uncheck 'Enable plugins' and possibly 'Enable Javascript').

      Cheers.

      --
      ~Idarubicin
  2. a few things to say... by frazzydee · · Score: 5, Interesting

    This sounds like a fun thing to play around with, but I certainly don't see myself using it as a normal web browser. I'll most likely stick with my keyboard.
    As for their statement about it being a replacement for PowerPoint, I don't think that will fly either unless they a) find a company to make a PowerPoint alternative which saves to HTML files, or b) make the aforementioned software themselves. Even if they accomplished that, people's stupidity and ignorance have proven time and time again that whether Microsoft's software is better, worse, or just as good as its competitors', people will buy Microsoft's software instead of others. Look at OpenOffice.org, Mozilla/Opera/Konqueror/Galeon/Netscape/etc. (most people use IE), Linux, and a bunch of other superior software. Maybe a couple could be explained (Linux often involves use of the command line interface, Netscape is slower to load (even though IE cheats by loading some of the program at startup time)), but most of it is due to a problem which exists somewhere between the keyboard and the chair. Besides, I would find a remote control a better option than speech, since a remote control wouldn't force me to scream "NEXT SLIDE" across the room like an idiot before it recognizes what I'm saying. It would also be much smoother to just press a button on a remote control.

    1. Re:a few things to say... by Anonymous Coward · · Score: 5, Insightful

      Don't think of it as a replacement for your current browser on your current desktop. This seems as if it would be a nice start to bettering the functionality of a web browser on a computer too small for a standard keyboard... i.e. pda and smart phones.

    2. Re:a few things to say... by mahler3 · · Score: 5, Insightful
      I don't see myself using a voice-commanded browser that much, either... heck, I haven't even programmed the voice dialing capabilities on my new cell phone.

      That being said, this will likely make life better for people with severe spinal injuries or others with limited use of their hands. Kudos to Opera.

    3. Re:a few things to say... by gusmao · · Score: 2, Insightful
      "This sounds like a fun thing to play around with, but I certainly don't see myself using it as a normal web browser. I'll most likely stick with my keyboard"

      Well, while you probably have the option to pick your keyboard, there are many handicapped people in the world who would find it amazing just to surf the Web all by themselves. This will be much more than a toy for them.

    4. Re:a few things to say... by GreenCrackBaby · · Score: 4, Interesting

      Think of how this will change the life of a disabled person who may be unable to type.

      And as for presentations, who says you have to stop your speech to scream "NEXT SLIDE"? Imagine a presentation package capable of picking up from your presentation exactly when you'd like the next "slide" (a useless word, since you could now do much more than PowerPoint constrains you to).

      Imagine, during a presentation, being able to say "If you look at the sales figures for the year..." and have your presentation automatically display those figures.

      --

      "The market alone cannot provide sufficient constraints on corporation's penchant to cause harm." -- Joel Bakan
    5. Re:a few things to say... by Krondor · · Score: 4, Interesting

      a) find a company to make a powerpoint alternative which saves to html files

      OpenOffice can save to HTML and Flash files from Presentations.

      Even if they accomplished that, people's stupidity and ignorance have proven time and time again that whether Microsoft's software is better, worse, or just as good as its competitors', people will buy Microsoft's software instead of others. Look at OpenOffice.org, Mozilla/Opera/Konqueror/Galeon/Netscape/etc. (most people use IE), Linux, and a bunch of other superior software.

      People buy Microsoft software because they are
      a.) not familiar with the competitors
      b.) worried about compatibility with the rest of their microsoft software
      c.) do not want to retrain staff
      d.) need feature X which competition lacks
      e.) work for Microsoft or are otherwise affiliated with them.
      f.) do not trust an unproven product (in their eyes) and don't want to be the guinea pigs

      Point being, as other software matures it will be harder and harder for Microsoft to release subpar software and expect a solid buy-in. If you look at Mozilla, it's growing very fast now; I know a number of Windows users who aren't even very technical that use Firefox and/or Mozilla. Look at OpenOffice: Microsoft is killing themselves with their own .doc standard. They can't move future iterations of Office to abandon or morph the compatibility of .doc too much or they break compatibility with themselves, and this allows the competition to reverse engineer and support those standards.

      As far as Opera's voice-operated browser goes, I think this is great, especially for disabled and handicapped people. I also think there's a certain appeal to standing in front of a board, saying "Next slide" to your OpenOffice HTML/Flash presentation, and having it progress. I mean, what a way to impress.

    6. Re:a few things to say... by Strange+Ranger · · Score: 2, Interesting

      I say heck with PDA's and cell phones. I want this for wall mounted flat panels. So I can holler at it for goodeats.com while in the kitchen with my hands messy. Or in the basement in dire need of plumbing.com or whatever when I'm trying to prove (erroneously) that I can fix ANYTHING.

      So many times while putzing around the house or driving I've wanted to bark out a command a la Star Trek and have Google answer me. Very cool.

      Although if it chimes in with - "It sounds like you are trying to browse the internet, would you like me to help you?", then someone will surely have to die.

      --

      Operator, give me the number for 911!
    7. Re:a few things to say... by jimshep · · Score: 2, Interesting

      IBM released a similar capability with its OS/2 Warp 4 product back in '95 or '96. The boxed set even included a head set/microphone. Though not extremely useful, it was nice to be able to sit back and browse without having to use a mouse or keyboard. If I remember correctly, the browser created a list of all of the links from the current page and all you had to do was say the name of the link and it opened up. It's amazing to think that was almost a decade ago.

      -Jim

    8. Re:a few things to say... by jimshep · · Score: 3, Informative

      Exactly what was a lie? In 1996, IBM released OS/2 Warp 4 which included a voice enabled version of Netscape Navigator. Here is the press release.

      http://wp.netscape.com/newsref/pr/newsrelease224.html

      The voice recognition was OK, and it was quite easy to navigate from website to website using bookmarks and links in the page.

      -Jim

  3. But I don't wanna talk to my computer by Control-Z · · Score: 3, Funny


    What could I possibly have to say to my browser?

    1. Re:But I don't wanna talk to my computer by andyrut · · Score: 2, Insightful

      What could I possibly have to say to my browser?

      Agreed. While there are some cases where voice-activated technology has its uses (I very much doubt people would be thrilled with typing into their onboard navigation systems while driving) a web browser or other common features on your computer simply don't need speech recognition.

      For Joe User, I doubt we'll ever see widespread use of speech recognition technology. Who wants to go hoarse telling a computer what to do when it only takes a flick of the wrist as it is? And man, an office could get noisy if everyone was dictating documents and telling their machines to "download Natalie Portman pictures."

  4. Voice activated by Anonymous Coward · · Score: 5, Funny
    Great.

    Now the jerk in the cubicle next to me will talk both with himself, "the fairies" and his browser.

    1. Re:Voice activated by taernim · · Score: 5, Funny

      Well, if Bob can listen to his radio, then I should be able to talk to my browser at a reasonable volume...

      --
      "PC Load Letter? What the $@#% does that mean?!"
    2. Re:Voice activated by Seraphim_72 · · Score: 2, Funny


      ....Step into his cube...
      hey bob, HOT GRITS...how's that spreadsheet coming? I've got some time off coming, going to head down to my brother's GOAT farm in middleSEX, could you water the plant in my cube? thanks, you're the BREAST.
      You know, now that I think about this - I am going to love it.

      --
      Slashdot, where armchair scientists get shouted down and armchair theologians get modded up.
  5. Word Processing is clunky, will this be better? by michael+path · · Score: 5, Insightful

    Though I can certainly understand the need to market something unique, and the logic behind "Voice is the most natural and effective way we communicate.....", I cannot ever see myself talking to my web browser like another human being.

    I've worked with and supported both ViaVoice and Dragon NaturallySpeaking solutions for voice-based typing in word processors, and neither of them felt natural. Perhaps because I'm a geek, or just because I've been doing it so long, I'd rather manually key in exactly what I want and let myself make the mistakes, not the interpretation.

    With corrections, it always took longer to do the alleged "easier way" than manually keying in. Even with 99% accuracy, Word Processing was always clunky at best.

    That, and every time I scream out "litigious bastards", I don't need it pulling up litigious bastards.

    1. Re:Word Processing is clunky, will this be better? by bitflip · · Score: 2, Funny

      Voice is the most natural and effective way we communicate...

      Psht. And wrong, too. The most natural and effective way we communicate is through body language.

      Give me a ring when they invent a web browser that scrolls down when it sees my eyes get to a certain part of the page, or clicks "back" when it sees my jaw slack in boredom.

      Or, better yet, automatically browses to another, non-porn, page while the girlfriend/boss is still walking down the hall...

    2. Re:Word Processing is clunky, will this be better? by j-jahnke · · Score: 2, Interesting

      It does surprise me that folks don't take a step back and consider what Opera is doing here. While I was still at Motorola they were working with us and IBM on Multimodal interfaces, which is what this thing is.

      And I and many others think it makes a lot of sense. Presentations are a good example of helping people understand the problems Multimodal is meant to solve. Obviously we were interested in the fact that devices got smaller with each passing year and no matter how we tried there were still 26 chars in the alphabet.

      Multimodal is still a very new technique and a lot of work has to be done to define how it should work. Just like on phones, where you expect the other person to stop when you start speaking, these interface conventions evolved over a period of time. They are in many ways so subtle you won't notice them until you do them wrong and say... Hmmm, that isn't right, let's try this.

      I know some of the earliest Multimodal interfaces we had were tied to the broadband TV stuff that Motorola's recently purchased General Instrument group did. So the idea was: pick up your Nextel phone and, using PTT, tell the TV to list all the shows currently playing with Cary Grant in them. These kinds of queries are easy to write for voice and are quite powerful.

      Obviously the Nextel phone was the wrong input for it, but it shows the strength of Multimodal. I could fill out voice dialogs using email or SMS pages if I wanted.

      The first version of the Motorola Multimodal Fusion Server worked on the Nextel network and not only was able to combine modalities on different machines but was also the first example of Distributed Speech Recognition on a public network. I am positive a lot of the stuff we did 2 years ago in our labs will find its way onto your PDA and cell phone soon. Opera is giving you a first crack at it.

      Jer,

  6. Slash dot by moberry · · Score: 5, Funny

    *speak it* h t t p : / / slash dot . org

    1. Re:Slash dot by baryon351 · · Score: 5, Funny

      Opera responds with...

      "Cannot connect to http:///..org"

    2. Re:Slash dot by WormholeFiend · · Score: 4, Funny

      heytch tee tee pee colon slash slash dot dot org

      wow. sounds almost obscene.

    3. Re:Slash dot by tjmsquared · · Score: 2, Funny

      You mean: h-t-t-p colon slash slash slash dot dot org
      It will sound like you are stuttering.

    4. Re:Slash dot by Jhon · · Score: 2, Funny

      God... am I going to need to say "colon" every time I want to browse? A constant reminder that I've a scheduled colonoscopy in the near future? Ugh!

  7. voice operated? by goosebane · · Score: 2, Insightful

    I have tried a lot of voice-operated software, but have never had any luck getting it to work. Has anybody else had better luck with voice-activated software? What do you think the chances of this actually working for most people are? Until I've seen a product that works well, I unfortunately have to remain skeptical.

    1. Re:voice operated? by AndroidCat · · Score: 2, Interesting

      I don't think Microsoft has had much luck. Otherwise they would have made use of the text-to-speech and voice command capabilities built into Clippy's agent software. (Or it was just even more a pain in the ass than now. Wow .. imagine, a Clippy even more annoying than it is now! Who says Microsoft doesn't advance the state of the art?)

      --
      One line blog. I hear that they're called Twitters now.
    2. Re:voice operated? by gryphokk · · Score: 2, Insightful

      I've used the Mac's "Speakable Items" system in both OS 9 and OS X, which uses a folder full of icons/scripts. Speak the name of the icon, and it acts as a double-click. The biggest problem is using it in any kind of noisy environment. Hallway traffic interferes with it, and forget having the radio or stereo on.

      It was pretty effective as far as it went, but not a total solution. I used commands to launch all my common programs, and common File: and Edit: commands.

      (Photoshop. Select all. Copy This. Quark Express. Paste Here. Print it now. Close this Window. Quit this Program.)

      And don't forget to turn it off when people come to talk to you -- one sentence misinterpreted as a command could do -- well anything.

      (Select all. Delete. Save. Close Window)

      Worse than not saving, you can accidentally blow away weeks of work and not know it 'til you reopen the document.

      Of course there are safeguards, like requiring a keyword before accepting voice commands (Gilbert: Print it now).
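
      The keyword safeguard described above can be sketched roughly as follows. This is only an illustration, assuming the recognizer hands over each utterance as plain text; the function name `extract_command` and the default keyword are hypothetical and are not part of Speakable Items or any real speech API.

```python
from typing import Optional


def extract_command(utterance: str, keyword: str = "gilbert") -> Optional[str]:
    """Return the command only if the utterance starts with the keyword.

    Hypothetical helper: anything not prefixed by the keyword is treated
    as ordinary conversation and ignored.
    """
    parts = utterance.strip().split(None, 1)
    if len(parts) < 2:
        # Either silence or the keyword alone: nothing to execute.
        return None
    first, rest = parts
    # Tolerate punctuation a transcriber might attach to the keyword.
    if first.rstrip(":,").lower() != keyword.lower():
        return None
    return rest.strip()
```

      With this gate in place, "Gilbert: Print it now" yields the command "Print it now", while a visitor saying "Select all. Delete." is ignored.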

      Lots of fun, but it's off right now -- great wow factor but interferes more than it helps.

      --
      And you, madam, are very ugly. In the morning, I shall be sober.
    3. Re:voice operated? by Unoti · · Score: 3, Funny
      I'm with goosebane on this: I have yet to see voice software that is truly helpful rather than just gimmicky.

      I have had some success with "hardware", though. The other night I called home and asked my daughter to tell me the address of a shopping mall I was looking for. She googled it, clicked around, and a few seconds later I had the address. That's the kind of thing I wish voice recognition apps could do!

  8. Hard to manage tech by grub · · Score: 5, Funny


    Voice input and output.. that'll make it a lot harder to discreetly search for pr0n whilst at work.

    Computer: "Hi. I am your browser. What can I do for you?"

    User: [whispering]Find me "porn"...

    Computer: "The band KoRn was formed in 1993 by Jonathan Davis and..."

    User: NO! [whispering] Not "KoRn"; "porn".

    Computer: "Clogged pores are the major cause of adolescent acne. Starting at puber..."

    User: NOT "PORE", DAMMIT!!! [coughs, lowers voice] find me "porn"..

    Computer: "Iron Ore is the primary ingredient in steel. Metallurgists will add other elements and compounds to give the steel certain proper..."

    User: NOT "ORE", YOU PIECE OF SHIT! [office mates look over cubes] [whispers] Look.. I want to look at naked people..

    Computer: "The goatse.cx lawyer has informed us that we need a warning! So.. if you are under the age of 18 or find this photograph offensive, please don't look at it. Thank you!"

    --
    Trolling is a art,
  9. The catch... by ackthpt · · Score: 5, Funny
    As it's Opera, you have to sing to it.

    "Is this the real life, is this just fantasy..."

    --

    A feeling of having made the same mistake before: Deja Foobar
    1. Re:The catch... by Cyclopedian · · Score: 3, Funny

      Caught in a landslide,
      no escape from reality. (Ms Windows)

      Open your eyes,
      look up to the skies and see.... (Mozilla)

      I'm just a fool boy,
      I don't need sympathy (Linux user)

      Cause I'm easy come, easy go
      Little high, little low (Mac OSX User)

      Any way the wind blows,
      doesn't really matter to me... (Windows BSOD)

      Now I've got this song stuck in my head. =)
      -Cyc

  10. Voice activated Powerpoint? Uhm, no... by LostCluster · · Score: 2, Insightful

    The key thing about a PowerPoint presentation is that it's supposed to be a visual backdrop that you can control without disrupting your talk. What a PowerPoint presenter really wants is a simple wireless device to advance to the next slide, and maybe a back button in case of a mis-click. Any additional buttons beyond two are annoying.

    Come on, this technology has existed for the TV weatherman for years. Why hasn't anybody gotten it right for PowerPoint users yet?

    1. Re:Voice activated Powerpoint? Uhm, no... by tbase · · Score: 2, Interesting

      Egad - there are a million devices of which you speak. The simplest of which might be any old programmable multi-button mouse.

      But personally, I think this has great potential for presentations, without disrupting them - especially if you could control the commands used to advance each slide. For example, if you could program the transition to a sales figures slide to be triggered by the words "sales figures for 2002", then it would automatically pull up the right slide when you say "Now let's look at the sales figures for 2002". Properly scripted, it could be pretty slick.
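
      A rough sketch of the phrase-triggered transitions described above, assuming the speech engine supplies a running text transcript. The function name `slide_for_speech`, the trigger table, and the slide numbers are all made up for illustration and do not reflect how Opera or PowerPoint actually work.

```python
from typing import Dict, Optional


def slide_for_speech(transcript: str, triggers: Dict[str, int]) -> Optional[int]:
    """Return the slide number whose trigger phrase occurs in the transcript.

    Hypothetical helper: matching is a simple case-insensitive substring
    test on whitespace-normalized text.
    """
    spoken = " ".join(transcript.lower().split())
    # Check longer phrases first so a specific trigger beats a generic one.
    for phrase in sorted(triggers, key=len, reverse=True):
        if phrase.lower() in spoken:
            return triggers[phrase]
    return None


# Example trigger table (assumed slide numbers).
triggers = {"sales figures for 2002": 7, "sales figures": 3}
```

      Saying "Now let's look at the sales figures for 2002" would select slide 7 here, while a sentence that only mentions "sales figures" would fall back to slide 3.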

      I once got paid good money just to launch PowerPoint presentations and click the "next" button all day. These people might have been ok with running the presentations by voice - but a two button device connected to a computer (wired or otherwise) was too intimidating.

      --

      666-607: 6th floor apartment of the beast
  11. Will there be a Majel Barrett plug-in? by Anonymous Coward · · Score: 2, Funny

    You know damn well this is the first obvious add-on.

  12. http:///..org by modder · · Score: 5, Funny

    I'm sorry Dave, I'm afraid I can't load that.

  13. Let's push the sedimentary lifestyle more.. by AgtSmith · · Score: 3, Funny

    Well, for some of us the major workout of the day is mouse gestures and keyboard pecks. I guess now I'll have to actually get up to burn off that Big Mac with extra value fries.

    --
    Sig removed by order of FBI Patriot ACT
    1. Re:Let's push the sedimentary lifestyle more.. by HellKnite · · Score: 2, Funny

      I do believe you meant SEDENTARY. Unless you're some form of mineral/rock, I don't believe you're living a sedimentary lifestyle.

    2. Re:Let's push the sedimentary lifestyle more.. by peacefinder · · Score: 3, Funny

      I do believe you meant SEDENTARY. Unless you're some form of mineral/rock, I don't believe you're living a sedimentary lifestyle.

      Depends. How long has he been sitting there?

      --
      With reasonable men I will reason; with humane men I will plead; but to tyrants I will give no quarter. -- William Lloyd
  14. And then the browser said: by physicsboy500 · · Score: 4, Funny

    "I'm sorry, Dave. I'm afraid I can't do that."

    --
    The original generic sig.
  15. Through some careful configuring... by DA_MAN_DA_MYTH · · Score: 4, Informative

    You can do the same with just about any other browser on Mac OS X. With the speech module you can connect a voice command to any keyboard sequence. I have it set up to switch tabs, create tabs, and with the 'Make this page speakable' voice command, you can navigate to any page, making it work like a bookmark system.

    What would be nice is if 'Speech' could recognize the commands for a particular application without switching focus. So I could be coding on one screen while browsing on another.

    --
    "It takes many nails to build a crib, but one screw to fill it."
  16. Homophones... by LostCluster · · Score: 5, Insightful

    There are many words in the English language that have homophones. Google being a text-based search interface is smart enough to not mix up "four" and "for", "too" and "two", or "plane" and "plain". There's no way for voice recognition technology to tell the difference between those words in a search query, there simply isn't enough context...

    1. Re:Homophones... by DavidpFitz · · Score: 4, Funny
      There are many words in the English language that have homophones

      Absolutely - using Dragon Dictate I once asked my browser to go to hotmail.com... I ended up at hotmale.com and that phrase has now become my test for dictation software!!

    2. Re:Homophones... by n8willis · · Score: 4, Insightful

      Well, but text-based search cannot distinguish between homographs, like bow (as in tie a ribbon into a...) and bow (as in one end of a ship). So there are trade-offs either way.

      --
      -- Watch the REAL Jon Katz.
    3. Re:Homophones... by Kunta+Kinte · · Score: 2, Interesting

      There are ways around that. We can do it the same way humans taking dictation do it.

      One potential workaround is to have a short period of 'sensitivity' after common homophones.

      For example, the speaker says 'Final 4' but the browser types 'Final for'. The software recognizes that 'for' is a common homophone and waits a *very* short time (a second or two) after the utterance of 'for' for *another* occurrence of 'for', which would imply a correction. Likewise, a special word such as 'no' followed by 'for' within that short period would imply that the alternative to 'for', i.e. '4', is correct.

      To override the 'quick correction' the person speaking can simply pause after homophones that are to be repeated in dictation or followed by control phrases.
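
      The correction window described above might look something like this in code. It is a minimal sketch under several assumptions: the recognizer delivers (word, timestamp) pairs, each homophone has exactly one alternative, and the window is two seconds. The name `apply_corrections` and the data layout are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple


def apply_corrections(
    events: List[Tuple[str, float]],
    alternatives: Dict[str, str],
    window: float = 2.0,
) -> List[str]:
    """Turn timed words into text, swapping homophones on quick repeats.

    A homophone repeated (or preceded by 'no' and repeated) within
    `window` seconds is read as a correction, not as new dictation.
    """
    out: List[str] = []
    # (index in out, time spoken, lowercase word) of the last homophone.
    pending: Optional[Tuple[int, float, str]] = None
    i = 0
    while i < len(events):
        word, t = events[i]
        w = word.lower()
        if pending is not None:
            idx, t0, hw = pending
            pending = None
            if t - t0 <= window:
                if w == hw:
                    out[idx] = alternatives[hw]  # plain repeat: swap it
                    i += 1
                    continue
                nxt = events[i + 1] if i + 1 < len(events) else None
                if w == "no" and nxt and nxt[0].lower() == hw and nxt[1] - t0 <= window:
                    out[idx] = alternatives[hw]  # "no, for": swap it
                    i += 2
                    continue
        out.append(word)
        if w in alternatives:
            pending = (len(out) - 1, t, w)
        i += 1
    return out
```

      Pausing longer than the window before repeating the word leaves both occurrences in the text, which matches the override described above.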

      --
      Based on upvotes, Ageism is the only "-ism" Slashdotters care about and think isn't SJW
    4. Re:Homophones... by mopslik · · Score: 3, Funny

      There are many words in the English language that have homophones.

      Eye for won due knot sea awl aught of miss takes re: salting from hoe mow phones. Inn fact, eye am you sing the pro gram write now.

    5. Re:Homophones... by Rallion · · Score: 2, Insightful

      Makes no difference, as the speech software will run into the same problems anyway. All it does is convert it to text, after all.

  17. Gimmicky blah blah by stratjakt · · Score: 4, Funny

    How complicated can you make a browser?

    I mean, tabbed browsing is cool, I've gotten used to it. But stuff like mouse gestures, voice recognition, etc, all just seems like fluff.

    One could have mapped spoken keywords to mouse/keyboard actions already if this is what they wanted.

    It's a hard arena to innovate in. This just seems kind of silly.

    What's next, support for force feedback chairs that scroll the browser based on which ass cheek I'm clenching?

    --
    I don't need no instructions to know how to rock!!!!
    1. Re:Gimmicky blah blah by Lattitude · · Score: 2, Funny

      Well, stratjakt, say you were in a straight jacket...

    2. Re:Gimmicky blah blah by sysopd · · Score: 2, Insightful
      Yeah, mouse gestures are 'fluff' just like Palm's graffiti was 'fluff'.

      You are correct, it is a hard arena to innovate in, and Opera is the only company I know that is actively innovating-- and at the same time making their product faster and less resource intensive. Voice recognition will be an optional feature, and will be quite useful especially for those who rely on non-standard accessibility features.

      Many of the features Opera has increase productivity and are downright addictive on the desktop, but gestures on mobile devices where you have no keyboard (such as a cellphone (with 'intelligent type' etc.) or PDA) are almost mandatory. Not to mention Opera's Small Screen Rendering (press Shift-F11 in Opera to test it out), which makes browsing the web (i.e., not WAP) actually possible.

      You have to realize that Opera as a product is used on at least 7 different desktop OSs, several brands of Smartphones, PDAs, internet terminals/STBs, etc. Much of the so-called 'gimmicks' are a necessity for one of these other markets. The benefit to the Opera user is getting all of these features regardless of platform, and homogeneity of the product line (meaning Opera on Mac should have all of the features and a similar interface (barring OS/GUI differences) as Opera for Linux).

  18. Browsing with people is a pain by sklib · · Score: 4, Interesting

    I'm sure you've all done this at one point or another -- you stand over the shoulder of a friend or co-worker, and tell him or her to go to a website that you are familiar with, and they are not. Then you say "Ok, click on 'specs' up in the corner.... no, the other corner... yes, that button... no, don't click below it - that's something else..." Same deal with e.g. getting someone to change an option in a program somewhere -- you gotta walk them through a series of mouse clicks or things to look for, and it's frustrating when they don't do it right away. (Maybe I'm just an impatient jerk?)

    The point here is when it's hard to instruct intelligent people how to browse the web, how well can a computer do it? I have my doubts.

    --
    -S
  19. An important step in computer interaction by Fluidic+Binary · · Score: 2, Insightful

    I personally think having alternative means of interacting with our software is important.

    For a user such as myself a keyboard and mouse is presently more intuitive, but eventually this sort of software should prove very useful, especially as computers become more fully integrated into our lives.

    With a couple of modifications, this technology might also be useful for the blind, if it were coupled with one of those applications that read the text from the screen for you.

    I hope the next step would be interfacing more easily with computers through gestures or non-standard spoken communication for those who are speech impaired and for some reason can't use a keyboard or mouse.

    I suppose this is just my personal agenda shining through, but I think diverse means of interfacing with our information are essential to enriching the lives of those who are different as well as making the majority's life easier.

  20. speech recognition... by wwest4 · · Score: 4, Funny

    ...it's all well and good. but can the speech recognition module parse bork? if so, it will be the ultimate presentation tool:

    "Now gentlemen, pleese-a turn your ettenteeon to-a sleede-a twelve-a. bork!bork!bork!"

  21. It may have a niche. by mystery_bowler · · Score: 4, Interesting

    For a while my wife was a physical therapist at a nursing facility that specialized in head trauma and paralysis. I installed Dragon NaturallySpeaking for several patients there and several of them became extremely proficient in using it. I'm not sure how having built-in support would be more advantageous, though.

    I can't see this having wide acceptance in the corporate world. Cube farms are noisy enough. I can't imagine what it must sound like for everyone to be browsing by voice.

    I also can't imagine some of my co-workers saying the addresses of what they browse out loud. *shudder*

    --

    My sigs always suck.
  22. Great concept for people with Disabilities by Frailty · · Score: 5, Insightful

    I installed some of the first off-the-shelf voice recognition software a number of years ago for my sister's cousin, who has cerebral palsy, and it made a huge difference in her being able to use the computer for her education. I sent the Opera link to her mom to look at, since this might be something that would suit her also.

    --
    " My next house will have no kitchen - just vending machines and a large trash can. "
  23. the Prez is gonna love this... by WormholeFiend · · Score: 4, Funny

    "Dubya Dubya Dubya period white house period gov" ;-)
    (note to dems, i'm not a troll, i'm canadian)
    -

  24. English is the problem. by Thinkit4 · · Score: 2, Interesting

    Check out stuff like lojban that really seek to take languages to the next level. Lojban is built so voice and text can be converted. Lojban is even computer parsable.

    --
    -I am an elective eunuch.
  25. Maybe when AI is done by ashultz · · Score: 3, Interesting

    The only reason that voice is a good interface to other humans is that humans are very very good at filling in the missing pieces, making inferences, and generally making up for things that are unheard, misheard, or unsaid. And even so we have misunderstandings.

    Once we have a computer that can do this, we'll have great interfaces - it will be like robo-butler. But we're not there yet, and robo-idiot-child - "I thought you said Quick Bananas, so I googled and we're at the Dole website" - is only going to make things annoying.

    It will be a boon to those who can't use point and click for whatever reason, and ignored by everyone else.

  26. In Oslo, Norway by Snork+Asaurus · · Score: 2, Funny

    browser plugin listens to you.

    I'm sorry, but I had to do it just once.

    --
    Sigs are bad for your health.
  27. Re: Thermonuclear War by OC_Wanderer · · Score: 3, Funny

    Computer: Would you like to play a game?

    User: I want to play thermonuclear war.

    Computer: Wouldn't you rather play a nice game of chess?

    User: No goddammit, I want to nuke, not puke!

    --
    -- There is no spoon. Only fork.
  28. oh god i hope it recognizes hotmail.com and not by cyrax777 · · Score: 2, Funny

    I hope it knows it's hotmail.com and not some gay porn site.

  29. I've tried this with Dragon by DrSkwid · · Score: 4, Interesting


    I got a free copy of dragon dictate once so I trained it as much as possible.

    I got mozilla working quite happily, 'down' 'up' 'slow' (that was a good one, it slowly scrolled down), 'back' etc.etc.

    the thing I found after weeks of training was that it was just so tiring talking all the time

    --
    There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
  30. Accessibility, accessibility, accessibility... by hkmwbz · · Score: 3, Insightful
    Opera is known to care about accessibility. This technology probably has many uses, and it will be especially welcome to people with certain disabilities.

    To you, it might be a gimmick. To someone with a disability, this could make life a lot easier.

    --
    Clever signature text goes here.
  31. Already got it by maggard · · Score: 4, Informative
    Recent versions of MS Windows get speech recognition installed along with recent versions of MS Office, or added as a free download from MS. Mac OS X also comes with speech recognition, and Apple just announced they're going to screen-reader enable the entire GUI.

    Also as the article notes one can buy more extensive add-on products like IBM's Mac/PC ViaVoice & Dragon's family of products as well as numerous other lesser-known and more specialized ones.

    That's today, already on millions of desktops, ready and capable of driving web browsers, sitting there ignored.

    Why?

    • Few folks are even aware that speech recognition and speech generation are already installed, or trivially installable, on their computers
    • When general users do use these capabilities they're usually disappointed they're not more like the ones on TV, where a simple ambiguous command is immediately interpreted and plot-appropriate material magically recited out
    • Most folks don't have microphones plugged into their computers, or they're ones unsuitable for speech recognition
    • Few folks bother to put the time and energy into fine-tuning their microphones and training the speech recognition for their particular speech pattern and vocabulary
    • Reading text is faster than hearing it, even at faster-than-typical-human-speech recitation speeds. The same goes for typing being faster than dictation
    • Screens and keyboards afford a minimal level of privacy. With them eavesdropping generally requires line of sight, not just sitting in the next cubicle over and unavoidably hearing everything
    So, where will this be useful? Anywhere keyboards aren't. Web phones. Industrial environments (well, quiet ones). For physically challenged folks with visual or manual problems. But sitting in the typical office workspace? Not gonna (still) revolutionize the world.
    --
    I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I wont bother either.
  32. I had a paraplegic teacher in college.. by Azureflare · · Score: 4, Insightful

    And his favorite browser is Opera. I bet this will just make him love Opera even more! It's tedious for him to type, as he has limited control of his hands, so this will really help him out. I'm really glad Opera is doing this.

  33. OS/2 v4 sort of had this... by dtjohnson · · Score: 2, Informative

    You could talk to it but it could not talk to you. OS/2 v4 would let you navigate through applications by voice commands. You could access any menu in an app using voice macros that you could record and add to the app in its settings notebook. The design and the implementation of the voice-navigation macros were brilliant and far ahead of their time but the system never worked very well because it was simply too slow on the Pentium CPUs available when it was released in 1996. Also, IBM hard-coded some CPU limitations into the implementation such that it still ran at exactly the same speed years later on a 2.0 GHz CPU as it did on a 133 MHz CPU. I used to use the voice navigation to do simple things, though, like enter numbers into spreadsheets and it was fine for that. IBM took the voice navigation out of OS/2 beginning with v4.51.

    1. Re:OS/2 v4 sort of had this... by Locutus · · Score: 2, Interesting

      What was also cool was if you installed a 2nd sound card, you could have cause/effect actions based on speech input. With only one sound card, there was no playback when you had voice navigation active. For the fun of it, I had a bunch of A.C. Clarke's 2001/HAL responses set for system sounds. It was kinda eerie when I'd tell the computer to do something and when it didn't understand, it'd say, "Sorry Dave, I can't do that."

      The WorkplaceShell was, and still is, the most incredible desktop I've ever seen or developed for.

      LoB

      --
      "Anyone who stands out in the middle of a road looks like roadkill to me." --Linus
  34. I've been doing this for some time by zeno53 · · Score: 2, Informative

    In fact, quite a few of us have been doing this for some time. What you are reading was dictated using NaturallySpeaking, the speech recognition software used by the majority of those who must (or prefer to) do some or all functions on the computer by voice. Well, "Put in the CD!", doesn't quite work but I can dictate very quickly and I can control everything within Mozilla (and Windows) by voice commands.

    I applaud the folks at Opera for their efforts. While a truly user-friendly speaker-independent voice interface for all computers is still a while off, it is the future and in the meantime providing the basic functionality of being able to control things like your Web browser by voice commands benefits many. Some will just find it fun to play with, of course, but others will find it truly useful and for some, like me, it is indispensable; I'm a quadriplegic and have used voice/speech recognition since the 486 days.

    Opera and Mozilla are excellent choices and both provide different approaches to accessibility, making one the better choice for some than the other (having choice is great!) but here's a bit of irony -- Internet Explorer is the one directly supported by NaturallySpeaking but while I would prefer Mozilla in any case, Mozilla actually works better for me using speech recognition.

    Now, if only we could get speech recognition working well natively in Linux...

    [Dictated using speech recognition technology. There may be air oars]

  35. Voice internet... by jasno · · Score: 3, Interesting

    Reminds me of something I've been thinking of putting in my house for a while.

    Imagine a simple voice interface for limited internet functionality. Place microphones and speakers around the house. Now, when I'm sitting on the couch reading a book and I come across a word I haven't seen before, I can say "Hey Frank, look up the word '...'." Need the weather? "Hey Frank, what's the weather report?".. Etc, etc..

    It should be fairly simple to tie a speech recognition engine to some python scripts to perform simple queries and return a parsed result ready for text-to-speech conversion. One big problem the dictionary feature brings out is how the speech recognition would handle unfamiliar words. Even leaving that feature out, it would be nice to have a limited set of features I could use anywhere in the house.

    Use some sort of unique gating phrase('Hey Frank!') and look for the nouns and verbs to give it some flexibility.
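    A minimal sketch of the gating-phrase idea, assuming the speech recognition engine has already turned the audio into a text string. Only the 'Hey Frank' phrase comes from the post above; the handler names, the keyword table and the reply strings are all hypothetical placeholders, and nothing here actually talks to a dictionary or weather service.

```python
import re

GATE = "hey frank"  # gating phrase from the post; everything else is hypothetical


def lookup_word(arg):
    # Placeholder: a real handler would query a dictionary service.
    word = arg.replace("the word", "").strip(" '")
    return f"Looking up '{word}'."


def weather_report(arg):
    # Placeholder: a real handler would fetch and parse a forecast.
    return "Fetching the weather report."


# Trigger keywords mapped to the handler that should run.
COMMANDS = [
    (("look up", "lookup", "define"), lookup_word),
    (("weather",), weather_report),
]


def handle_utterance(text):
    """Return a reply for text-to-speech, or None if the gate phrase is absent."""
    normalized = re.sub(r"[^a-z' ]+", " ", text.lower())
    normalized = " ".join(normalized.split())
    if not normalized.startswith(GATE):
        return None  # not addressed to us, so stay silent
    request = normalized[len(GATE):].strip()
    for keywords, handler in COMMANDS:
        for keyword in keywords:
            index = request.find(keyword)
            if index != -1:
                # Whatever follows the keyword is passed on as the argument.
                return handler(request[index + len(keyword):].strip())
    return "Sorry, I did not understand that."
```

    Anything that does not start with the gate phrase is ignored, so ordinary conversation in the room would not trigger a query. The unfamiliar-word problem stays unsolved here: the recognizer still has to produce the word as text first.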

    --

    http://www.masturbateforpeace.com/
    1. Re:Voice internet... by cr0sh · · Score: 2, Interesting

      Use something like the z-machine (zork) parser. You could start with simple verb-noun parsing, like the old text adventures. One thing about the "gating phrase" - as in ST:TNG, have the computer make a sound, signaling that it is ready for the command - that is a good UI feature, I think, for the voice interface...
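      For what the simple verb-noun parsing could look like, here is a rough sketch in the spirit of the old two-word adventure parsers. It is not the actual z-machine parser; the vocabulary tables and the function name are made up for illustration, and the "ready" sound is not modeled.

```python
# Hypothetical vocabularies: each spoken word maps to a canonical action or object.
VERBS = {"open": "open", "go": "open", "read": "read", "close": "close"}
NOUNS = {"mail": "mail", "email": "mail", "news": "news", "headlines": "news"}
FILLER = {"the", "a", "an", "to", "my", "please"}


def parse_command(utterance):
    """Return (verb, noun), or None when either part is missing."""
    words = [w.strip(".,!?").lower() for w in utterance.split()]
    words = [w for w in words if w and w not in FILLER]
    # Take the first known verb and the first known noun, ignoring everything else.
    verb = next((VERBS[w] for w in words if w in VERBS), None)
    noun = next((NOUNS[w] for w in words if w in NOUNS), None)
    if verb is None or noun is None:
        return None
    return verb, noun
```

      Mapping synonyms onto one canonical word keeps the command table small, which is roughly why the text adventures got away with such tiny parsers.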

      --
      Reason is the Path to God - Anon
  36. Re:Mouse gestures... by Kaimelar · · Score: 2, Informative
    They're probably the only thing keeping me from switching to Firefox.

    Ah, but I use mouse gestures with Firefox every day! There are extensions that add this functionality. Go to http://texturizer.net/firefox/extensions/ and look at the "Mouse gestures" section. I personally use Radial Context -- it's basically mouse gestures w/ a GUI that helps you remember little-used commands.

  37. An idea looking for a market... by Mr.+Cancelled · · Score: 2, Interesting

    There's been a lot of work put into making the average PC understand its user over the past few years, but I've yet to see one that can convince the average surfer to sit in his or her office/den/bedroom and talk to their screen. It doesn't feel natural, and most people feel that talking to one's PC is rather an awkward, embarrassing thing. And embarrassing isn't really the word I want to use, but those who I know who've tried it, and those who I've talked to about it have said that they're a little too self-conscious to talk to their PC alone in a room.

    I'm kind of in that boat myself too. While I think that anyone would readily play with such technology, there haven't been a lot of people willing to stick with it, and I think that's largely due to the "Who am I talking to? It's just a piece of furniture" mentality.

    Someday, when we're all oil for some future earth mining civilization, people will talk to their PCs and be able to hold conversations with them, I envision.

    Something like:

    "PC, Can you tell me when my next meeting with Mr. SoAndSo is? Oh! And bring up CNN for me would you? I want to check the headlines"

    And the computer would respond with something like "Your next meeting with Mr. SoAndSo is currently scheduled for May 18. Would you like me to change that?"

    And the user would say "No, just go on with the headlines please", to which the computer would start telling the user about the headlines of the day. It would interject little things like "CNN is reporting that 30 people died in a plane crash in Switzerland, but MSNBC's saying that only 24 died, so I'm not really sure which is accurate right now."

    It'd be much more a conversation than you and I currently saying "PC, Go to CNN", "PC, Open Word", and so on. I would imagine that eventually productivity usage of the computer could be entirely verbally driven, from dictation to simply helping a user through his day... Something you could "chat with" while getting dressed, working on something else, exercising and so on. The PC would be our informer, figuring out what we want, and offering opinions and information based on discussions we would have with it, as well as prior conversations, and expressed interests. In short, it would do what a computer's always been designed to do: It'd make our lives easier, but in ways which simply are not possible today.

    Right now such technology is very clunky when compared to what I've described... Kind of a silky smooth "invisible friend" of the future. I understand that there's obviously going to be a lot of "in-between" stages for such technology, but I'd rather see today's developers focusing on making my PC more productive as opposed to sticking an auditory interface over a point-and-click technology. When my computer can surprise me with its knowledge and vocabulary, as opposed to repeating phrases I've programmed into it, and translating text into speech, I'll be impressed.

    Simply converting the on-screen text and reading it to me in a monotone voice is not what I want. I want my PC to know the types of news I frequently look for, and I want it to be able to paraphrase, and provide it to me in a meaningful, well-articulated manner. And I want it to feel like someone's there personally telling me of the day's events. I want to be able to interrupt and request greater detail on a specific bit of news. In short, I want my computer to work for me, and I want it to grow with me as my needs and interests change.

    But that's so far down the line... 8(

    For now this is a neat technology, but I'd imagine it will only appeal to the true geeks out there. Most will play with it and then go back to the more "private" methods of interfacing, such as mouse and keyboard.

  38. As seen in Futurama! by HedonismBot · · Score: 3, Funny

    Farnsworth: "Shut up friends! My internet browser heard us saying the word Fry and it found a movie about Philip J. Fry for us. It also opened my calendar to Friday and ordered me some french fries."

    3ACV04 - The Luck of the Fryish

    --
    Sailors. Oh man!
  39. security risk by linoleo · · Score: 3, Funny

    I remember back when the Mac first got voice-activated menus (over 10 years ago), our secretary liked them... so whenever we were passing by her office, we'd stick our head in and say "select - all files - move - trash - yes" (or whatever the magic sequence was) by way of greeting. :-)

    --
    Be faithful to your obsessions. Identify them and be faithful to them, let them guide you like a sleepwalker. JG Ballard
  40. Presentations? by El · · Score: 4, Funny

    Are you sure it's a good idea to have presentation software that actually responds to comments shouted out by hecklers in the audience?

    --

    "Freedom means freedom for everybody" -- Dick Cheney

  41. Next generation by shamino0 · · Score: 2, Interesting
    This reminds me of the voice-enabled version of Netscape that IBM bundled with OS/2 version 4.

    That system was simpler, since it couldn't rely on special voice-HTML markup tags. It took advantage of the fact that any UI element (menu item, button, etc.) in the system can be activated by speaking its text. So they added a quick hack to Netscape so that every link's text (or ALT text) visible on a screen would be present on a "Links" menu - thus turning the links into speakable keywords.
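    The "Links" menu trick is easy to sketch: walk the page, collect each link's text (or the ALT text of an image inside it), and use that as the spoken keyword for the link's address. This is only an illustration using Python's standard html.parser module, not how the OS/2 Netscape hack was actually written; the class and function names are invented.

```python
from html.parser import HTMLParser


class SpeakableLinks(HTMLParser):
    """Collect a mapping of spoken phrase -> link target from a page."""

    def __init__(self):
        super().__init__()
        self.links = {}
        self._href = None  # target of the link currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self._href = attrs["href"]
            self._text = []
        elif tag == "img" and self._href is not None and attrs.get("alt"):
            # An image link is spoken by its ALT text.
            self._text.append(attrs["alt"])

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            phrase = " ".join(" ".join(self._text).split()).lower()
            if phrase:
                self.links[phrase] = self._href
            self._href = None


def build_links_menu(html):
    parser = SpeakableLinks()
    parser.feed(html)
    parser.close()
    return parser.links
```

    Recognizing one phrase out of a short, known list is a much easier job than open dictation, which fits the experience that browsing worked well while entering new URLs did not.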

    It worked very well for browsing. Much less well when you want to enter new URLs. The dictation mode left a bit to be desired. But that was to be expected from the hardware of the time. Voice recognition on OS/2 required a minimum of a 150MHz Pentium, IIRC. (It would work - with much latency - on my 80MHz 486, however.)

  42. This and NASA subvocalization. by burtonator · · Score: 2, Interesting

    I've been pretty down on the whole concept of voice recognition for a while now.

    After NASA announced their subvocalization project (I'm too lazy to find the slashdot URL... someone earn karma for it!) I became excited again.

    The problem is if you're in an office you can't just start talking. Right now there are 10 people around me and most people are silently working on their computers. If they all started barking commands it would be loud as hell in here. It just doesn't scale.

    If you add the subvocalization work this totally changes the equation. Now I can silently tell my computer to do things while my hands type away.

    This is going to ROCK. Talk about multitasking... I can be typing out this slashdot post and without stopping I could launch gaim, ymessenger, make sure I'm on IRC... startup Emacs in the background , etc.

    w00t!

    Gimme gimme! $100 says the Mac has this next year and Linux has it sometime around 2015. :)

  43. PDAs? by asteinberg · · Score: 4, Insightful
    While the accessibility benefits you mention are nice, I think the key to this that most people seem to be missing is the usefulness on PDAs. I seem to recall Opera being most successful with the embedded version of their browser, and I'd say that is probably where voice interaction has the most usefulness.

    Imagine a PDA that you can actually talk to instead of having to struggle with "Graffiti" or the little thumb keyboards. Hell, if it's good enough, you could even get rid of the need for a screen and just interact entirely through voice - here's how we could finally get a useable web browser/email client/schedule program in a watch!

    One step closer to some of the concepts explored in Snow Crash, maybe?

    --
    The first ever Ultimate Frisbee video game: here (now
  44. Here's Hoping... by spoonboy42 · · Score: 4, Interesting

    God, I hope something like this replaces PowerPoint. As we all know, PowerPoint makes you stupid. It forces you either to dumb down your presentation to the intellectual complexity (and entertainment value) of an infomercial, or cram so much text onto your slides, most of which you will recite anyway, that you might as well just pass out reports in 3-ring binders.

    That said, I think the most crippling thing about PowerPoint is its linearity. Not all presentations "want" to be laid out into a preset order of points. If a college professor or a businessperson gets asked a question during a presentation, all too often it is diverted by saying "well, that's coming up in a few slides", or the presentation is interrupted as tangential data is introduced.

    Using voice recognition instead of click-through navigation opens up some great possibilities for non-linear presentations, though. Imagine that, instead of organizing your presentation into a linear timeline, you group slides and other media into "points", each of which represents a different idea relevant to your talk. You can arrange these points into a web, indicating what information depends on prior knowledge from other slides, etc. You then associate each point with an audio "cue", say a phrase like "projected profit margins" or "the three kingdoms period". You'll note that these phrases are things you're likely to naturally utter in your presentation anyway. This has the advantage of enabling you to speak totally naturally without interrupting your presentation. To avoid accidental jumping, we would have, say, a little translucent blue arrow fade into being every time a cue is recognized, disappearing a few seconds later. If you actually want to jump to a new point, it's just a quick click of a button when you see the blue arrow.
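    To make the cue mechanism concrete, here is a small sketch of the bookkeeping, assuming the recognizer hands over what was said as text. The class, its method names and the rule that only points linked from the current one can be offered are my own assumptions; the two-step "offer, then click to confirm" flow is the blue-arrow idea described above.

```python
class CuedPresentation:
    def __init__(self, points, start):
        # points: name -> {"cue": spoken phrase, "links": names reachable from here}
        self.points = points
        self.current = start
        self.pending = None  # the point the "blue arrow" is currently offering

    def hear(self, speech):
        """Offer a jump if the speech contains the cue of a linked point."""
        spoken = speech.lower()
        for name in self.points[self.current]["links"]:
            if self.points[name]["cue"] in spoken:
                self.pending = name
                return name
        return None

    def confirm(self):
        """The presenter's button click: take the offered jump, if there is one."""
        if self.pending is not None:
            self.current = self.pending
            self.pending = None
        return self.current

    def dismiss(self):
        """Stands in for the arrow fading out after a few seconds."""
        self.pending = None
```

    Because hearing a cue only sets a pending offer, saying a key phrase in passing never moves the slides by itself; the presentation changes only on the explicit click.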

    So, imagine you're giving a sales presentation to a group of executives. You notice this particular group is getting bored with your standard sales pitch. No problem, as you just drop a key phrase into your speech, and instantly change your presentation to include information you think will appeal to the business interests of your audience, or simply to their personality. Or, imagine a professor is giving a lecture on a piece of literature. A student asks a question about the author's background, and the professor can easily insert some information on their country, their historical circumstances, and their life.

    Of course, organizing this type of presentation requires a greater investment in planning, and certainly requires a little more cognitive ability than your standard PowerPoint fare. However, those who work with these new presentation systems will be giving themselves an undeniable competitive advantage over presenters using linear methods. And those in the audience will be grateful, I'm sure.

    --
    Anonymous Luddite: "What do you think of the dehumanizing effects of the Internet?"
    Andy Grove: "Not Much."
  45. Try mouse gestures. You might be surprised. by SmallFurryCreature · · Score: 2, Interesting
    Switch off that toolbar with back buttons and print out the sheet with mouse gestures. Try it for a day of web browsing. Then just see if you are also starting to use mouse gestures in other apps. (Important: really try to do all the gestures and not "skip" to alt-f4 and such.)

    Personally I did it because I didn't like how much space the icon toolbar was taking. My use of opera also opens most pages in other windows.

    So for instance to reply to you I right-clicked the link and moved the mouse down a bit, opening this reply window in a new tab. Why? Well, when I am finished with this reply I will hold the right mouse button down and do a down and to the right movement (another move is also available), and close it and be instantly back where I was reading. I notice that this seems faster, as some pages seem to insist on reloading if you go back. Also my move is one close and not two backs.

    I am not saying it is for everyone but once I was determined to use it I was amazed how easy it was to pick up and get totally used to it. Of course it means that when I am on an IE box I am totally out of my depth.

    Am I working faster or better with mouse gestures? It certainly seems more relaxed to me. Will I like voice commands? Well I got music on constantly in the background so perhaps not unless they got that sorted out.

    --

    MMO Quests are like orgasms:

    You may solo them, I prefer them in a group.