Are You Talking to Your PC Yet?
An anonymous reader writes "If you have ever asked 'Do those speech-to-text apps like Dragon NaturallySpeaking and IBM ViaVoice really work?', Pocket PC Addict has posted a detailed review of Dragon NaturallySpeaking for Pocket PC and desktop machines. It is written from the perspective of someone who has been burned by speech-to-text software in the past and had vowed never to try one of these apps again. It is encouraging for slow typists who would like to use their voice to write. Plus it details some valuable tips for using it with Pocket PCs."
So if I ask Clippy to STFU, will he?
Where's my transparent aluminum?
OS X has really good system-wide integration for voice commands, and the voice interpreter is pretty good for one that comes with the OS, but I could not get it to work consistently....
:-) (when it worked)
other than that I thought it was cool to say "computer give me brad's number" and it would display my buddy brad's phone number on the screen
I am the Alpha and the Omega-3
Any recommended ones? http://sourceforge.net/projects/cmusphinx/
I know what's on your hard dr
All though Text two speach is a grape gnu technology it is not red E for the main stream yet.
Two comments and the link is dead! lol
'He was a dreamer, a thinker, a speculative philosopher... or, as his wife would have it, an idiot.' - Douglas Adams
My Accent screws up everything. I hate my Accent.
The problem I always found with uuhhhhh voice writing was mmmmm filtering out unwanted noises and shhhhh distractions from my posts period return But I uhh guess they've fixed most of those burp problems by now right question mark
-Teiresias
If so, I've been talking to my machine for more than a decade.
Is anyone out there giving any thought to how a programming language should be structured to make it easy to code using a speech recognition engine?
If not, why not?
It walks just fin four mee!
apterous.org
In Japan, talking PCs are for schoolgirls.
Play Command HQ online
I've been talking to my PC for years:
You god damned son of a bitch! F'n Piece of shit!
My brother and I work at a company making efficiency programs... for a while we toyed with the idea of having all of the programs activated by voice... we tested it out for a while with an open source cantation originally used for games, that would execute a command, or type text based on what you said... for a while, it was awesome, every time we said something, it'd find the word from our list, and activate the program... problem was, when it listened to your voice, it only compared it to the words you had programs assigned to... so if you had four words, no problem, but if you had 60, it started choosing horribly... we eventually had to scrap the program altogether... though it was funny watching what programs it would have to run through when I started cursing in frustration... I'm pretty sure the annoyance of people talking to their computers all over the building would have caused problems as well.
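A toy sketch of why a longer word list hurts. This is not the program described above: it stands in for acoustic matching with plain string similarity from Python's `difflib`, and the function names, the 0.6 threshold, and the command words are all made up for illustration. The point is only that a matcher which must always pick the nearest entry has more near-collisions to trip over as the list grows.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a crude stand-in for acoustic similarity."""
    return SequenceMatcher(None, a, b).ratio()


def confusable_pairs(commands, threshold=0.6):
    """List every pair of commands that looks 'close enough' to be mixed up."""
    return [
        (a, b)
        for a, b in combinations(commands, 2)
        if similarity(a, b) >= threshold
    ]


def best_match(heard: str, commands):
    """Always return the nearest command, even if nothing is a good fit."""
    return max(commands, key=lambda c: similarity(heard, c))


if __name__ == "__main__":
    small = ["mail", "browser", "editor", "terminal"]
    large = small + ["email"]
    print(len(confusable_pairs(small)), "confusable pairs in the small list")
    print(len(confusable_pairs(large)), "confusable pairs in the larger list")
```

A real recognizer would also want a rejection threshold, so that cursing at it matches nothing instead of launching whatever happens to be nearest.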
WANNAWIKI Wannawiki WannaWiki WANNAWIKI!
Fatal error: Call to undefined function: message_die() in /home/httpd/vhosts/pocketpcaddict.com/httpdocs/db/ db.php on line 88
Several campus administrators at the high school I work for barked for Dragon Naturally Speaking. I have yet to see/hear any of them use it.
I remember way back when, I could talk to my Star Trek Encyclopedia. I was actually real disappointed that I didn't have to say "Computer." before every command though. If I find an old Apple mouse, I think I'll wire up a mouse into the bottom of it. "Hello, Computer...." "Just use the keyboard!"
Only 3 comments, and the article is already hosed.
Fatal error: Call to undefined function: message_die() in /home/httpd/vhosts/pocketpcaddict.com/httpdocs/db/ db.php on line 88
Mirror anyone?
Take off every Sig. For great justice.
karma whore mother fucker. use ac next time. my next mod points are going towards makin ur comments' scores below 0. have a nice day now.
Do not run webservers on PocketPCs even if you are an addict
... for attempting to dictate message board posts for humorous effect. Gave me many hours of amusement. Plus I got a free mic which I now use with Skype :)
"The dew has clearly fallen with a particularly sickening thud this morning"
... to be able to sit and talk to my computer instead of typing, since my typing is so bad.
Does anyone know if there is this type of app for Linux? I would even be willing to pay a reasonable price.
Registered Linux User
Registered KDE User
I talk to my PC all the time...if you consider swearing at it and yelling profanity at it talking to it.
Evolution or ID?
And mine works just fine. Submit. Submit. I said submit. Why isn't this expletivedeleted thing triggering the submit button. Submit! Submit! Damn it, I have to move the mouse.
___ In the words of Gen. Douglas McArthur: "I'll be right back."
Voice Recognition Software Yelled At
NEW YORK--Fidelity Financial Services' Gwen Watson, 33, shouted angrily at her IBM ViaVoice Pro USB voice-recognition software, sources close to the human-resources administrator reported Monday. "No, not Gary Friedman! Barry Friedman, you stupid computer. BARRY!" Watson was heard to scream from her cubicle. "Jesus Christ, I could've typed it in a hundredth of the time." After another minute of yelling, Watson was further incensed upon looking at her screen, which read, "Barely Freedman you God ram plucking pizza ship."
My computer would be asking me to repeat anything I tried to communicate by yelling over its own leafblower noise levels.
500GB of disk, 5TB of transfer, $5.95/mo
I recall back in the early 80s I was in a Singer shop (as in sewing machines) and they sold IBM PCs as well.... ...including speech to text recognition software.
I tried it out, and surprise! it didn't work very well.
I see nothing has changed.
So rise up, all ye lost ones, as one, we'll claw the clouds.
http://weblog.infoworld.com/udell/2004/11/04.html
You get to watch/listen to him use it, which really gives you a sense of how far the software has come.
If you want to read TFA, I was lucky I could grab it early enough. It reads as follows:
Fatal error: Call to undefined function: message_die() in /home/httpd/vhosts/pocketpcaddict.com/httpdocs/db/ db.php on line 88
Only morons moderate based on a sig.
.. though.
What would you do if your enter key actually worked? Imagine the possibilities!
The human body can be drained of blood in 8.6 seconds given adequate vacuuming systems.
The question is whether you talking to YOUR PC yet, not whether you are shouting at THEIR server now ;-)
Trolling using another account since 2005.
...until I noticed that the PocketPC version is just a delayed dictation device - it records, then you transfer it to your desktop computer and it's the host computer that actually does all the speech recognition.
No wireless. Less space than a Nomad. Lame.
My English teacher once told me that two positives don't make a negative. Two words for her: Yeah, right.
bracket en slash tee close bracket
"Enter" key is your best friend.
The highest rated commercial program for the Mac is iListen (not produced by Apple :-)) A visually handicapped friend is very interested in obtaining speech-to-text software - do any Slashdotters have experience with iListen or ViaVoice for the Mac?
I tried Dragon several years ago. It worked, but you really need accuracy to the nines (99.99%) to be productive with it. One mistake in 100 syllables means constant corrections. I did make a little flying demo that took English commands (right, left, up, down, slower, faster) and it was cool to control it via voice commands. There was no distinction between typing commands and speaking them though. I would recommend (if they don't have it already) that the Gnome and KDE folks provide a separate input stream for voice commands to all applications - or something. If it's there, people will code for it even if the free recognition software isn't that great yet. If there are apps that support it, people will improve the recognition software.
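To put rough numbers on the accuracy point, here is a back-of-the-envelope sketch. The 250-units-per-page figure and the function names are assumptions chosen for illustration, not measurements of Dragon or any other product; "unit" can be read as a word or a syllable, whichever the accuracy figure is quoted in.

```python
def corrections_per_page(accuracy: float, units_per_page: int = 250) -> float:
    """Expected number of misrecognized units on one page.

    `units_per_page` is an assumed page length, not a measured value.
    """
    return (1.0 - accuracy) * units_per_page


def units_between_errors(accuracy: float) -> float:
    """Average number of units dictated before the next mistake."""
    return 1.0 / (1.0 - accuracy)


if __name__ == "__main__":
    for accuracy in (0.95, 0.99, 0.999, 0.9999):
        print(
            f"{accuracy:.2%} accurate: "
            f"{corrections_per_page(accuracy):.2f} fixes per page, "
            f"one error every {units_between_errors(accuracy):.0f} units"
        )
```

Under these assumptions, 99% accuracy still means stopping to fix something a couple of times per page, which matches the "constant corrections" experience.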
I have a relative that has severe dyslexia. This program has been instrumental in getting him through college. He's going through a prestigious private business school and is doing well, 3.0+, and his mom says that without that program it probably wouldn't have happened.
I tried to use a version about 5 years ago and found it somewhat frustrating, but the best of breed. I heard that it had improved quite a bit and recommended it.
It was her fault though...crapping out on me like that when I was just past Level 5 in digdug.exe.
And just when I was going to get her a shiny new Windows 3.11 for Christmas too. It sure is a pity. It'll be a while before I'm ready for another relationship.
An Indian-American Hindu committed to non-violent thought/speech/action alarmed by the global explosion of radical Islam
Maybe you should fix the computer, first by removing the leafblower and fix the reason that it was put there in the first place?
his enter key does work, he probably forgot to specify the correct formatting option for the post.
-mkb
As much as I love progress, I'd much rather type than have speech to text capabilities. Why? For the same reason I would rather type than speak on the phone: Someone down the hall can't hear you.
Unless speech recognition tech has become perfect, then there is a distinct possibility that someone else is going to hear you repeating the same word over and over again with increasing volume and anger, in a futile attempt to get the software to recognize a troublesome word.
1. It's awkward to talk when you're trying to compose something that requires a lot of thought first. I usually like to talk to myself (either out-loud or in my head) and type out what I'm thinking in a more formal fashion.
2. It is very tedious to go back and edit or make corrections. If I make an error while typing, I'm cognizant of the error very soon after it happens. With voice recognition, technically "someone else" is typing and it takes more time to see where the mistakes were made.
3. I deal with lots of boilerplate text with original content intermingled. A lot of times working on such a text becomes an editing process where using the keyboard & mouse is more efficient.
4. My voice doesn't last for much longer than 30 minutes for non-stop speaking...and that's with short breaks for water.
Conclusion: Just hire a hot secretary that can type.
Bill Clinton: Pimp we can believe in. - The Shirt!!!
I'll be happy when someone codes a DWIM method (Do What I mean):)
*** Sigs are a stupid waste of bandwidth.
where he's working with speech recog and someone maliciously says over his shoulder " format hard drive "
I would take the speech to text over those darn laptop keyboards with the mouse pad perfectly placed so you brush over it all the time when you're typing and erase the last paragraph you were writing.
Is I can never start the application:
microphone on
Microphone On
MICROPHONE ON GOD DAMMIT IT !!!
but do they talk them, or us
I actually find it annoying that the focus of NaturallySpeaking and ViaVoice is to allow users to speak to their computers. Witness how they require users to "train" the software for their particular speech patterns.
As someone who is hearing impaired I've been fervently awaiting the day when I can use speech recognition to transcribe meetings, big or small, or to help me answer the phone.
I'm worried that these companies have essentially given up on writing an app that doesn't require "training" the software, can identify different speakers in a room, can identify and adjust for accents, etc.
Though I've never used this personally, I had a co-worker who was stricken by carpal tunnel. We both worked in tech support at the time. In order to accommodate her, she was allowed to use DNS. To be honest, after training it (which is the most tedious part) it worked quite well. To an author or someone who types for a living this is a great tool. The only other concern I've heard is that it does require a reasonable amount of computing power.
I have installed and re-installed Dragon('s) NaturallySpeaking, and yet every time, my mating with the software doesn't last much beyond the obligatory training session.... might be my dad howling behind me to go to bed (the software responds -- too much background noise), but the logic coded into the piece of Dragon droppings (literally....) always has failed to impress.... might just be my Asian accent.
~~bada bing, bada bang, bada bong and voila~~
> Are You Talking to Your PC Yet?
:)
You don't? I've been doing so for 25 years, and all my disk drives have hair glued around the holes
...it worked OK as long as you trained it properly and you had a nice quiet room and a good mic. However, there are issues with "voice typing" that can't be overlooked. Primary is security. If you want to type a document or e-mail that contains sensitive data, make damn sure that no one can hear you. My bank recently moved to a voice activated system. I'm surprised they haven't gotten a ton of complaints from people since it REQUIRES you to say your SS# and PIN out loud. This means I can no longer check my account from my cell phone or at work. If you sit down and think about how many things you type that you would never want to say out loud, you can see why voice typing hasn't taken off. Imagine this emanating from your cubicle in a monotone:
;P
"http://www.goat.cx/ Take that you bukkake loving lunixtards."
Your co-workers would think you were a nutjob if they saw half of what you posted as AC to Slashdot.
-"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
I don't have enough experience with iListen or ViaVoice for my opinion on them to matter, but I found the built-in speech recognition for 10.2 and 10.3 to be more annoying than anything else. It would work just often enough for me to think it was doing its job, but fail just often enough to be a consistent pain in the ass. I'm looking forward to the complete rewrite in 10.4, aka VoiceOver. Here's the apple hype page for it: http://www.apple.com/macosx/tiger/voiceover.html.
If you get nervous, just remember that there are a few billion other people who don't really give a damn.
Looking backward, depending on my mood my voice should be like mmm remember those distorted graphics where people can say what text is in there but not OCR used for confirmations in web sites? well, the same :)
Anyways, it seems that Dragon NaturallySpeaking scans your documents to adapt to your writing style. My question is... how does it know if those docs were written by me? I mean: imagine what would happen if it finds a copy of the Bible on your HD
Make It Secret . Free JavaScript implementation of AES for your browser
I don't know if they have improved, but I am not too keen to try again.
Dragon comes in accent variants for particular countries but unfortunately, I have a slightly mixed accent.
This means I once spent an entire evening "training" the software, and repeating specific words over and over. It drives a person over the edge very quickly.
I am sure if I ever need to type something that is all obscenities, it will be really well trained. If I need to use any words with an oo sound in them, then I think I will have to type.
For the last goddamn time I said SLEEP - not delete!
I find it distinctly difficult to speak, or dictate, with the same level of concision I can achieve with single-pass typing. The different cognitive processes involved can make STT, in my experience, quite difficult to master, rendering it less useful for long-form composition than one might think. I'm just guessing, but I suspect that the difference is due to the increased processing impedance required to get words out with a keyboard -- the longer time constant gives us more time to think about what we're saying.
Talk to these!
I use Dragon NaturallySpeaking version 4. It was released about 6 years ago. I have never had any problems with it. Initially training it, I spent about four hours reading the text passages, much more than it suggests that you need to train. I have long been able to dictate entire paragraphs and need only make a single correction. This is still much faster than typing for me, usually around 100wpm (I know, I was amazed too).
I modify my speech when I dictate, exaggerating the phonemes, speaking as clearly as possible while maintaining a reasonable and steady pace. Remember -- this version was from 6+ years ago, and I initially started on a 400MHz PIII. With advances in hardware and software, I'm sure newer versions are even better these days.
The only drawback is when people make fun of me for speaking so oddly in a room with only a computer.
Hope this helps.
-C-
I can't stand trying to use speech to text, but I do like the ability to give commands, if nothing more than for fun.
The link died so fast that even the Coral cache is the error message.
donch u worry, those dragon people would be shouting all sort of perl insults at /. right about now. but isnt it lunch hour their?? i mean, even the webadmins got to sleep.
i gotta thought tho: can /. take down http://www.ibm.com/ or the likes of http://www.microsoft.com/
~~bada bing, bada bang, bada bong and voila~~
Even if this were perfect, it would still be stupid. Have you ever listened to yourself talk? Do you really want that recorded?
More to the point: have you ever learned a foreign language? Remember how obscenely different the written language is than the spoken one? The same is true of English, we just don't notice it as much. There are more stringent requirements for written speech -- that's why giving dictation is so hard. Complete sentences, no body language or appreciable emphases, paragraph structure, and correct grammar are all expected in writing but almost frowned on in speech (after all, do you look with admiration at someone who corrects your grammar, even in private)?
I suppose if you for some reason were vastly quicker at revising stream of consciousness drivel than you were at writing coherent prose then this might be useful. So maybe for journalists who do a lot of interviews?
Or, I suppose you could be writing the next Sound and the Fury, in which case please let me know so I can come shoot you before you inflict your Emperor's New Clothes on a whole new crop of high school English classes looking for meaning in a book which, ironically, signifies nothing.
adam b.
An important read on this topic is The Unfinished Revolution by the late Michael Dertouzos. In the book he describes the core technologies and approaches of human-centric computing, and speech interface is included as an essential ingredient. It's not just for "slow typists who would like to use their voice to write", it's for the future of computing.
So I think we won't really have satisfactory speech recognition until we manage to achieve a higher level of artificial intelligence. And this, sadly, is a long way off...
Until then, you'll have to learn to speak to your computer the same way you program: by specifying precisely what you want, with no ambiguity. For a computer to resolve ambiguity, it must understand "context", and that won't really happen until it is at least as intelligent as a teenage human. Will we ever get there?
In Soviet Russia, computer listens to YOU!
Oh, wait...
Nat Friedman (of Ximian fame) just blogged about his experiences with an alternative voice recognition application.
Check it out here
Voice recognition and translations would be more powerful if the computer could imagine what you're talking about in context. While I bet you'll see grammar rules before AI though, I just like to see how many problems would be easier with AI.
God spoke to me.
If by "talking" you mean verbally abusing and threatening Windows with a loaded gun.... then: YES, YES I DO TALK TO MY PC.
/dev/random
Just go to System Preferences, click on Speech, choose the Recognition tab, and away you go. How well does it perform? "Naught 2 wheel." Cancel; "Knot 2 veil." Cancel; "Not to L." Cancel; oh forget it, gimme my keyboard!
I have used Dragon NaturallySpeaking Professional for many years, and ViaVoice before that. ViaVoice's recognition was not so great, and the program crashed constantly. Dragon works very well. I can do around 140-150 wpm with it. I seem to have to make 1-2 corrections per sentence, sometimes less. I am using Dragon 7, but there is a new version (8) out now. I highly recommend this program if you have repetitive stress injuries, or would like to avoid developing them.
I use the text-to-speech on several crontab entries. Chip (yes, that's the computer's name) will announce basic daily schedule items, such as the date in the morning, kid's bedtime, and a final signoff at 11pm. I added some checks so it wouldn't talk whenever iDVD or iTunes was running. I used to have it monitor news headlines too, but it would talk too often and we would tune it out.
I also tried some "Speakable Items" for basic tasks. Essentially, there is a special folder with a number of AppleScript files. The filenames are their voice triggers. If the computer hears you say one of those filenames, it runs the AppleScript. There are nested directories with items for specific applications, so you can speak the global commands or the active app's specific commands. Well thought-out.
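For anyone who wants to play with the same idea outside of Speakable Items, the folder-of-scripts lookup can be sketched roughly like this. It is an illustration, not Apple's implementation: the function name, the directory layout, and the "active app's folder wins over the global one" rule are all assumptions, and actually running the matched script is left out.

```python
from pathlib import Path
from typing import Optional


def find_script(
    heard: str, items_dir: Path, active_app: Optional[str] = None
) -> Optional[Path]:
    """Return the script whose filename (minus extension) matches the phrase.

    Assumed lookup order: the active application's subfolder first,
    then the global folder. Matching is case-insensitive.
    """
    phrase = heard.strip().lower()
    search_dirs = []
    if active_app:
        search_dirs.append(items_dir / active_app)
    search_dirs.append(items_dir)

    for folder in search_dirs:
        if not folder.is_dir():
            continue
        # Sort for a stable result if two names differ only by case.
        for entry in sorted(folder.iterdir()):
            if entry.is_file() and entry.stem.lower() == phrase:
                return entry
    return None
```

Because the result is just a path, the thing being launched could as easily be a Perl or Bash script as an AppleScript file, which is the flexibility missed in the limitations below.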
Some Speakable Items could come in handy, but the eMac microphone is too limited to be able to command the machine from across the room. You also cannot have a set of Speakable Items somewhere which are still active when nobody's logged in. Thus, I need to have a user logged in (and then turned away with user switch). Lastly, for most of the automation tasks I'd like to run, Perl or Bash is a better choice than AppleScript, but Speakable Items must be special text-command files or AppleScript, and I can't imagine making a bunch of AppleScript stubs for each Unix-style script I would write. These each limit the usefulness of the voice-commandable appliance I was hoping for.
On the utility side, speech command would be great for specific queries, "Chip, what day is it?" and generic countdowns: "Chip, give me ten!" and he'll tell you when ten minutes have elapsed.
Speech to text recognition for dictation is great and all, but it seems that few people can actually speak coherently while they are in deep thought. I would much rather have a program that allows me to control such things as music on my PC, like skipping tracks, volume, etc., or opening new windows and URLs. Is there anything out there that is capable of doing this efficiently?
"Alcohol, cause of, and solution to, all of life's problems" -Homer Simpson
Mirror: http://www.mirrordot.com/stories/a5cbfb6dcb7b3c27c77b59ea6919a624/index.html
Shouting to yer neighbour: cee dee slash enter! are em space dash are ef! enter!
Read the EFF's Fair Use FAQ
IMHO, the problem with these kinds of engines is that they don't make a separation between speech to phoneme / phoneme to text.
:-/ )
If someone designs a good open source speech to phoneme architecture, I'm sure people would start working on phoneme to text AI algorithms.
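A minimal sketch of what that separation could look like, assuming a toy setup: the phoneme symbols, the two-word lexicon, and every function name here are hypothetical, the acoustic front end is just a callable you plug in, and the decoder is a simple greedy longest-match lookup, nowhere near a real language model.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Phonemes = Tuple[str, ...]

# Hypothetical toy lexicon mapping phoneme sequences to words.
LEXICON: Dict[Phonemes, str] = {
    ("HH", "AH", "L", "OW"): "hello",
    ("W", "ER", "L", "D"): "world",
}


def phonemes_to_text(
    phonemes: Sequence[str], lexicon: Dict[Phonemes, str]
) -> List[str]:
    """Stage 2: greedy longest-match decoding of a phoneme stream into words."""
    longest = max((len(key) for key in lexicon), default=0)
    words: List[str] = []
    i = 0
    while i < len(phonemes):
        # Try the longest possible chunk first, then shorter ones.
        for n in range(min(longest, len(phonemes) - i), 0, -1):
            chunk = tuple(phonemes[i:i + n])
            if chunk in lexicon:
                words.append(lexicon[chunk])
                i += n
                break
        else:
            # Nothing matched: mark one phoneme as unknown and move on.
            words.append("<unk>")
            i += 1
    return words


def recognize(
    audio: bytes,
    speech_to_phonemes: Callable[[bytes], Sequence[str]],
    lexicon: Dict[Phonemes, str] = LEXICON,
) -> str:
    """Stage 1 (pluggable front end) feeding stage 2 (text decoder)."""
    return " ".join(phonemes_to_text(speech_to_phonemes(audio), lexicon))
```

With the two stages behind separate interfaces, someone could swap in a better phoneme-to-text algorithm without touching the audio side, or the other way around.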
They say: "Open source? Death!!! Where will our revenues for research go?"
But... what use is patenting/selling something that doesn't work in the first place?
Again, this is only my personal opinion. (I couldn't RTFA because... *slashdotted*)
I'm entering this comment right now using the voice software and a cheap mic. It's very useful when you're doing a couple things at once. Oh, hey honey, how are you this afternoon? Up to anything exciting? Whoa, aren't we frisky this morning? Yeah, just pull those off. We're going to do this until you submit
That's exactly what happened, thanks for the support mmkkbb.
With the first link, the chain is forged.
Which is weird, since it doesn't have a mic or speakers.
Right now it's telling me that it's time to go home and clean the guns.
Best Slashdot Co
I played around with Dragon and Kurzweil back in the day and man were they horrible. You practically had to read the thing an entire novel to get it to 95%.
Sometime in the post Pentium revolution, algorithms got a shot in the arm and dictation software started getting significantly better even before training.
The biggest problem I've had is that reading a predetermined text to a computer doesn't sound anything like the casual style of speech I'm going to use for voice input. Anything I read over turned out reasonably well (an error a page or so, which is much better than my typing).
Back in the days of Windows 98 and all, Dragon Naturally speaking was a very nice product. I got a copy for my mother who has arthritis in her hands and she loved it for when she did stuff on the computer. Once it was trained to her she loved how much typing it saved her.
~~ Behold the flying cow with a rail gun! ~~
I've played with the speech recognition that came with my tablet PC. Works OK if I'm by myself in a quiet room where I can non-self-consciously talk unusually loud-n-clear. Every time I've demo'ed it to people, in an office environment talking normally, the results are laughable.
:-)
The good news is, you can play "Telephone" all by yourself! Remember that game where you sat in a circle, and one person says a sentence to the person next to him, and he tells the next person, and so on all around the circle, and then you hear the final version? Just talk to your computer, then when your words are shown (incorrectly) on the screen, read those words back, and so on. Easier and more fun than going from German to French to English to Spanish to French to German to English in Babelfish.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
We've now moved to 'chip+pin' in the UK for credit card transactions.
This means that you type your PIN into a little box in full view of everyone in the queue.
Whatever happened to never telling anyone your PIN?
I should imagine that you could use the length of the keyboard loops in the PIN entry box to work out the PIN remotely.
Then all you have to do is mug the person; when they pretend they don't know their PIN it's OK, because you do, so not even a GBH charge for beating it out of them.
I shouldn't imagine they've used anything better than 56-bit encryption on the chips either, so you should be able to brute force it in a couple of days or so.
thank God the internet isn't a human right.
...to Opera. The Voice feature lets you control the browser easily, including zooming in and out, which is very interesting for those of us that are handicapped.
I've been using voice for about 6 months. I had a big issue with right hand pain this year so I talked to a fellow developer who helped me get set up. We've done some custom grammars in Python for our dev environment. It's been helpful. It's a long way to go if you want to reduce mouse usage. The mouse has to be the worst peripheral for the PC. I'm considering buying the SmartNav http://www.naturalpoint.com/smartnav/ to get rid of my mouse. I've messed with one on a PC with 2 screens. It was nice.
If God had meant us to talk to computers he would have spared this guy's site!
I wish I could figure out how to embed a url without printing out the entire url.
Also, how would you say:Keyboard for me....
I have no problem with your religion until you decide it's reason to deprive others of the truth.
In Korea, only old people talk to computers.
Conformity is the jailer of freedom and enemy of growth. -JFK
Anyone ever had any experience with Sphinx 4?
It's Carnegie Mellon University's Java "speech recognizer". I plan on incorporating it into a project I'm devising. And along with that, Java's speech API seems to be circling the drain...at least as far as keeping it current and such. Visited the web-site a few weeks back and it doesn't seem to be updated very often.
I want both speech and noise recognition. I want to put a speaker/microphone in every room of my house. I want the computer to talk to me all the time. I basically want it to keep track of me all the time. When I am awake it should ask me if I am all right several times a day. Whenever I start something like taking a shower it should know how long it takes me and ask me if I take too long. If I start cooking something it should know how long it will take and remind me if I did not tell it I turned it off. When I leave the house it should ask me how long I expect to be gone. It should be able to recognise all noises around the house and then determine if the noise is a problem such as a leaky water pipe or gas line or the noise of a fire or someone unexpectedly being in my house (burglar). It should be able to respond to a problem by notifying someone (me, neighbor, relative, police, hospital, fire department). When I exercise it should know exactly how much I accomplished and encourage me to do as much or more the next time I need to exercise. It should be both a mentor and a protector.
Nat Friedman wrote about speech recognition in his blog today. Some interesting thoughts there.
In Korea, voice recognition is only for old people
but I often swear at it
I can't remember the name of the song I was trying to play, but I had just installed RealVoice on my PC, and I said, "Open ????" where ???? was the name of the song I wanted to hear. Well, the computer repeated my every command, so it was pretty embarrassing when it said back to me, "Open Paris Hilton Blow Job?"
Hopefully the speech recognition software for PC is better than my phone. I have an LG phone, and tried and tried to get the voice dialing to work. It worked great when there were two numbers, but was completely unable to tell the difference between "dad" and "dan", and that was with me training the phone. I know it will work one day, but I don't see my keyboard going away any time soon.
Try to use the us know what the law now once you know when to go on from the home page ½ years in the bay view of the the guy is that we're all the evidence in the navy has the one you have one of the year is the home in the movie , a once and one over the CEO is the one I know of a home is the insert its own>one of one San Juan's over the CEO is the one I know of a home. Ors in all know exorcise. It looks like the theory is visited this program. This program is the socks or. Actually the state status as a guest who would happen to somebody else tries to use this disappointed. If they're under standard thinking will.
Of, the ferry. 250 left the ROI. No tax tax wars. The more it's been a textile and yet so we were so we got so we suite its own this is not very sweet area Kennedy Kennedy nine points in the low as the stars write letters are for the new young will visit the move "a great deal of data these will you for the one that all from the on the line and we know there are only are the various the law will live in or near the loan. He saw on the one on the war e-mail address these who I real have only one home the parents who are the way I see no, I one woman will be in jail in this, you do you know that all try (to get one for $1.00 . Of course a public one merely in the other is working again with the one in the one that merely a united Ireland has.
It's everything you could wanted to be in more. $1 billion. It's getting closer and Jill mama. The period. The sentencing said.. It put two periods. But if you. But in monkey. But in button lucky to needs help, and what is that smell. The new and more of the greatest thing since the end of the prison sentence and a half its Germany and we're going on in the state of Nevada site for the new unit are in the training is over Iraq not to spend the night before you as a one. Don't feed the walkie. And the navy in Mexico is creeping 21 lead whatever you can just send it to the lowest reading 111 three and one of the last minute for the lights you a little bit of money goes to all of this is the only way we're still in the eighth and one can only is the only is the one we were of the one on the status quo has over for trial for this year thing we're a long it will of the one that's been an online or the country are ones in the reader's still no thing that if you were a lot of which ones are sooner know better The
Assertive and a year in. business footing on some business footing of your. Who is your daddy. And. That was your daddy? The Sony will get a brand names and the my diet ran so it a further run overnight the topic of Spanish like from the on the Clinton people between the two have all of a lot of the these were the ones you know, for the all are all on one wall will be one for the press has been the staff room one one or more of those who are all over the matter was, as you like or see on one day I'll know, one of the Vietnamese home the new at 101 and you have one of the one and one of the one or the vision like Percy on one day all know one of the enemy's home at the news a lot of work here, I have great but with the island way you live in fear of what got us out of the sweetie to the end of the north Norfolk and a half hour milk and milk killed for the long-awaited on the one thing I can say you were to its translating everything anybody says the middle of the shoe can't find a new.. Nowhere
These,
If you say. It means no on the planet. In the line. If (the city can run the no 11 one citizen. If the for the one at a
I'm wanting to create a hands-free environment on my desktop due to RSI, but the Preferred version doesn't have the features (macros) needed to manipulate menus and windows. The problem? The Pro version costs nearly ten times as much as the Preferred version. Apparently they gouge companies looking to avoid workers' comp situations, but people like me are left out in the cold. Yeah, I'll probably end up paying it, but it feels like extortion.
My problem is that the way I speak and the way I write encompass two completely different mindsets. I think most people take on a more formal tone when writing, and dictation doesn't seem to lend itself to that clarity of thought. I much prefer the good old fashioned "come up with a draft, revise" approach to writing.
dude, that application is a spoken interface. meaning it will speak to the user what is on the screen. it might include some enhancements for taking commands, but it did not mention any.
I am the Alpha and the Omega-3
I must say this stuff is very forgiving, and for the kinds of actions like open and close it works great. It does not take dictation very well, however, and honestly I don't expect/want it to. What I would like to be able to do is say "computer record until monkey... say my bit monkey.. computer send last message to email friends name" and then that thing just sends that wav or mp3 or aac or whatever to my pal... And you know what, with a (reasonably) short AppleScript I can...
I tried iListen when my wife was having difficulty typing. I ran through the training and played with it a bit to get familiar with it prior to having her use it. The accuracy rate was very high for me.
Then she tried to use it. Even the training procedure was difficult for her. She grew up in the midwest and had no discernable accent, so that wasn't the problem. Near as I could determine, she didn't always have the same inflection when saying many words. Without the consistency of pronunciation, the software couldn't learn correctly. She became very frustrated, which led to her over-enunciating the word in question, which just confused the software even more. It became shelfware.
I dragged it out a month ago and started using it again. I've gotta say, the response time on a Dual-G5 is pretty impressive. And for the smartasses out there; no, I'm not using it for this post, I'm at work. Isn't that where everyone reads slashdot?
I always assumed that speech recognition was for cases where typing was not possible (e.g. in a car, if you are handicapped, etc).
I get more throughput typing my own letters than dictating to a stenographer, for example.
Since keyboard commands can be controlled through speech, and it's adding a whole bunch more for universal access, I'd assume that the new stuff could be controlled by voice commands. So I guess you're right, it's not as cool as I thought it was at first, but it will still be an improvement.
If you get nervous, just remember that there are a few billion other people who don't really give a damn.
Discrete speech works fine for programming. I don't see how natural speech could work, since it depends on complex recognition models. Discrete allows for the auditory nonsense of coding.
Laws are for people with no friends.
I've been working recently on a language I call 'verbal'. My goal initially was a language I could use in the car, while driving. (I love to code.)
I realized that such a language would be useful for blind people and anyone who couldn't type.
The target is a language that will mimic a subset of English, so that a program might be:
I've written a compiler that translates that kind of thing into C, but I'm not releasing it just yet. It only has the type int, and no functions or objects. As soon as it can handle objects, I'll post it quietly.
(I got stuck for a day doing an elegant itoa.c, but that's done now. All I needed it for was to generate good labels for constants on the symbol table, and sprintf didn't fit right. Of course I found a slightly simpler one after I got it done.)
sigs, as if you care.
"Hello? Hello Computer?" "Just use the keyboard."
I got a PowerBook with the speech recognition. Only thing is, you need an American accent to get it to understand you!!!!!
I recently had to write about a thousand word public address and thought I would try the speech to text that comes with Office. I loved it. It was not perfect and I still had to use the keyboard for clean up, but for that application it was great.
I used some freeware on my XP machine and it worked ok. Granted, I did not dictate documents - I used it for commands. For example, when I wanted to skip to the next song in WMP I assigned the phrase "next song" to CTRL-F. Handy when you are sitting reading something and don't want to get up. You also had to say "computer" to get it to wake up and listen.
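The setup described above (a wake word, then a phrase bound to a keystroke) can be sketched roughly like this. It is only an illustration: it assumes the wake word prefixes each command, the recognizer hands over plain text, and actually sending the keystroke is left out. Only the "next song" to CTRL-F binding and the wake word "computer" come from the post; the function and table names are made up.

```python
# Phrase-to-shortcut table. Only this one binding is from the post.
COMMANDS = {
    "next song": "CTRL-F",
}

# The post says the software had to hear "computer" before it listened.
WAKE_WORD = "computer"


def dispatch(utterance, commands=COMMANDS, wake_word=WAKE_WORD):
    """Return the shortcut bound to a recognized utterance, or None to ignore it."""
    words = utterance.lower().split()
    if not words or words[0] != wake_word:
        return None  # not addressed to the machine, so stay asleep
    phrase = " ".join(words[1:])
    return commands.get(phrase)  # None when the phrase is not bound
```

Keeping the vocabulary this small is probably why command use worked OK where dictation did not: the recognizer only has to tell a handful of phrases apart.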
One of my friends uses Dragon Naturally Speaking on a regular basis and loves it. He's a family practice physician, and says the software saves him a lot of time, since he can use it to dictate his "reports" (don't know the technical term) between patients. He's used it for several years now.
Most stutterers don't stutter when they're alone, though. But, then again, even when I'm alone, recording a message on an answering machine can be a challenge since I know that someone is going to hear it. Stuttering is a big mindfuck, so I wonder whether I'd experience the same sort of self-awareness when talking to a computer.
As far as I know, software like this doesn't deal well with speech disorders, and it probably shouldn't be expected to.
Now that it is unavoidable - for example, booking a seat at a cinema - I really feel that a significant portion of society is being discriminated against. There are 550,000 people in the UK alone who have speech dysfluency problems, and yet speech recognition cannot deal with the multitude of different manifestations of this; repetition of sounds, blocking, or laboured or breathless speaking. What am I, and everyone else who has these problems, supposed to do? Even if they did put effort into making the system smarter, I can think of nothing more intimidating than having my stammer scrutinized by a piece of code.
And nothing can replace the feeling of acceptance when you can tell the operator about your stammer, and what s/he can do to make it easier for you. I place the human virtues of patience and understanding over all software (even Firefox)
Comments on Dragon NaturallySpeaking (and voice-recognition)
I have used and developed software for Dragon NaturallySpeaking Professional version 7.3 for about one year. Here is a summary of my experiences:
With a good microphone system, I have been able to achieve an accuracy of about 95% with Dragon NaturallySpeaking. This means that about 5% of words are misrecognized, missed, or spurious words. It seems to be acceptable for some command-and-control application on computers. For example, I browse the World Wide Web using Internet Explorer, send and receive electronic mail, play DVDs, and some other tasks fairly comfortably using voice commands. With this error rate, dictating lengthy documents proves tedious because of the frequent need to correct errors -- although it can be done. For example, this posting was dictated using Dragon NaturallySpeaking.
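For anyone who wants to check a figure like "95%" on their own dictation, the usual way to count misrecognized, missed, and spurious words together is word error rate: the word-level edit distance between what was said and what was transcribed, divided by the number of words said. A minimal sketch follows; this is the generic metric, not necessarily how the poster or ScanSoft measured it, and the function name is made up.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two strings, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # a spoken word was missed
            insertion = cur[j - 1] + 1   # a spurious word was added
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)
```

An accuracy of about 95% then corresponds to a word error rate of about 0.05, or roughly one correction every twenty words.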
ScanSoft claims a significant improvement in accuracy in Dragon NaturallySpeaking version 8, recently released. I have begun testing Dragon NaturallySpeaking version 8 and cannot yet confirm or deny this claim.
I use Dragon NaturallySpeaking for a voice operated wall display computer system. The computer system consists of a laptop, computer projector, wireless microphone system, powered speakers, and Dragon NaturallySpeaking Professional version 7.3. For greater comfort, I use a directional microphone that clips on my shirt -- I do not need to wear a headset. I use the system as a Home Theater System and a general computer. Some common computer tasks such as browsing the World Wide Web and reading and responding to typical e-mails, which tend to be short, are quite easy and enjoyable. Other tasks, such as dictating lengthy documents, are rather tedious.
I can see the computer display from anywhere in my room. I can move around freely, stand, even walk while operating the computer hands-free. The wall display seems to encourage better posture than sitting at a desktop computer.
In my experience, it is necessary to retrain Dragon NaturallySpeaking frequently during the first few months of use. This seems to be due to natural variations in our voice. Over time, Dragon NaturallySpeaking can learn some of the variations in the speaker's voice and the need to retrain the program drops. Dragon NaturallySpeaking works better for long multisyllable words and complex phrases. Over time, its ability to recognize long words improves. However, Dragon NaturallySpeaking version 7 has problems with homonyms, short one-syllable words, singular versus plural forms of nouns, and tenses of verbs. These problems do not go away over time.
Dragon NaturallySpeaking version 7 is sensitive to the rate at which you speak. If you speak faster than the rate that it has learned, it will fail. It is important to learn to speak evenly at the rate that Dragon NaturallySpeaking understands.
Over time, Dragon NaturallySpeaking can adapt to some background sounds -- even some kinds of music. However, it has consistent problems when another speaker is present, especially the same gender as the user.
Regarding security, there are many pieces of information that we do not want to speak aloud, for example our Social Security number or a computer password. In Dragon NaturallySpeaking Professional, one can create commands or new vocabulary entries with a secure spoken form such as "my supersecret password" which the program will transcribe as the actual sensitive value such as "XRILCYREWVP". In this way, an eavesdropper cannot determine the social security number, password, credit card number, or other sensitive information.
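The idea of a secure spoken form amounts to a substitution table applied to the transcript. The sketch below shows only that idea and is not Dragon's actual command or vocabulary API; the table and function names are hypothetical, and the one entry is the example from the post.

```python
import re

# Spoken form -> written form. The single entry is the post's own example.
SECURE_FORMS = {
    "my supersecret password": "XRILCYREWVP",
}


def expand_secure_forms(transcript, forms=SECURE_FORMS):
    """Replace each spoken form in the transcript with its written value."""
    # Longest spoken forms first, so a long phrase wins over a shorter one it contains.
    for spoken in sorted(forms, key=len, reverse=True):
        pattern = r"\b" + re.escape(spoken) + r"\b"
        written = forms[spoken]
        # A function replacement keeps backslashes in the written value literal.
        transcript = re.sub(pattern, lambda m, w=written: w, transcript,
                            flags=re.IGNORECASE)
    return transcript
```

Note that the table itself now holds the secret in plain text, so this only protects against someone overhearing you, not against someone reading the profile on disk.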
Dragon NaturallySpeaking is a heavy user of computer resources. Dragon NaturallySpeaking running on Windows XP seems to slow down significantly after a few days of continuous use. I frequently find it necessary to reboot the computer and to defragment the hard drive.
Dragon NaturallySpeaking version 7.3 has a learning curve. It is like any computer tool. Users need to learn
Many of the new releases of Dragon NaturallySpeaking (versions 7 and 8) do a very good job of recognizing speech and filtering out non-speech. The main problem is adequate training...
My company has been working tightly with Scansoft engineering for the past year+ and on most of their recent developer beta releases we can easily achieve 98.5-99.9+ recognition rates (We just got copies of v8 desktop, and haven't messed with it much). However to achieve those rates we use their backend server engine (essentially the same core technology as the desktop, just better server-oriented APIs) to process, at minimum, 4-6 hours of pre-recorded speech from each user.
The big problem with the desktop version is that not many people want to sit down for 6 hours reading random chapters from a book out loud. The easier way is to use pre-recorded audio along with transcribed text. Then the engine can just chug along and create a pretty accurate model with 8-14 hours of audio files.
Another big problem is punctuation...not many people feel like saying 'period', 'semi-colon', 'newline', etc. as they go along. Personally I find speaking formatting clues extremely frustrating, and end up giving up. So you really need learning engines to analyze the inflections and pause in people's speech patterns, and infer this missing information automatically (of course that's what we're working on right now).
All in all, if you spend all day typing large volumes of text (writing novels, instruction manuals, etc.) the current tools could get the job done (if you have the patience to stick with the training). But IMO a keyboard is much more efficient for average users.
How perceptive!
This post written under Gentoo-linux with an SCO IP license.
Basic literacy is not being distroyed, it is being improved. Notice that you are now able to read better. Why fight it?
This post written under Gentoo-linux with an SCO IP license.
Is more toward the portable than my desktop.
My desktop has a nice keyboard that I'm quite accustomed to using. As a general rule I can type almost as fast as I can talk, and generally faster than I can think of what to say. Definitely faster than any speech recognition software I've used.
However, my iPod and palm *don't* have nice keyboards. If I could tell my palm to take a note, write down someone's phone number and email, or just start dictating my thoughts, and have it end up in my contacts ready to be synced... well, that'd be pretty nice.
Until then, it's just a novelty to me.
IT'S ALIVE! Run!
Does frenzied swearing count?
You're using her as bait, Master!
What about an office full of people talking to their computers?
I guess I just never thought of it. OK, let's name it SACHET, the olfactory-based programming environment. Can anybody out there with synaesthesia give some suggestions on (for example) what a while-loop should smell like?
>;k
Kearny: "Jimbo, take a note on your Newton: Beat Up Martin!"
[Jimbo writes the note on his Newton and reads it back]
Jimbo: "Eat Up Martha? Bah!"
[Newton is thrown at Martin]
...that I happen to have worked on a little bit is the following:
http://www-306.ibm.com/software/pervasive/multimodal/
It's basically an attempt to bring voice I/O into the Web application framework, by integrating voice I/O components into the XHTML. The desktop browser version is available for free, while the main target market (portable devices) isn't. It uses a version of the ViaVoice engine tailored for embedded systems that requires no training; it's designed for small sets of vocabulary rather than the large set that dictation requires, so it doesn't work all that poorly.
Granted, a programming language structured through XML is a bit of a hassle, but...
I'm not necessarily interested in speech recognition for dictation. As other posters have pointed out, speaking code, or thinking while speaking a large document, can be frustrating and unnatural.
Instead, I'd like to use my voice as a supplemental input device. I seem to always have one hand on the mouse, and one on the keyboard, ready to hit ALT-TAB, CTRL+S, or whatever. If I could use my voice for that, it would free my hands for typing (not anything else, you sickos!)
If I could give voice commands like "Save", "Next Window", "Previous Window"... it would be nice.
And perhaps some things could even be done in the background by recognizing speech, rather than having to bring an application into the foreground to interact with it visually. Perhaps you could say "dialup" (for those without broadband) instead of having to click on an icon or type a command -- keeping your visual context in the application you were already in.
bah... the site has been down all day.
can you lame-Os stop surfing at work so i can visit the main review?
Nass and Moon discovered that: "...Individuals mindlessly apply social rules and expectations to computers. (...) individuals overuse human social categories, applying gender stereotypes to computers and ethnically identifying with computer agents.(...) people exhibit overlearned social behaviors such as politeness and reciprocity toward computers".
See: "Machines and Mindlessness: Social Responses to Computers", Journal of Social Issues, Volume 56 Issue
No.
Free Mac Mini Yeah, it's
Nat (the Ximian dude) recently hurt himself and has been reduced to being a one-handed typist. In order to stay connected, he's hired someone to take dictation for him. In today's blog entry he talks about the experience, what it's like for a very competent typist to use a dictation system, and thinks aloud about future intelligent speech-to-text applications.
I have a lot of audio files from my personal digital recorder that i need converted to greppable textfiles. I have absolutely no requirement for realtime function - i'd be very happy to load up a directory of files and leave my computer to think about it for hours/days/weeks for an optimal recognition/transcription.
Has anybody got a tool for something like that? I've been searching for months and found nothing suitable.
Demonstrant's Open Source Tools
I have actually installed a few Speech to Text systems in some major hospitals to reduce the workload of the transcriptionist staff. This technology actually works and works very well. I could not read the slashdotted article, so I am not sure of the direction that it is taking, but I will tell you it works well. The engines we used were from the Dragon software and are wrapped in a complex learning system and a SQL database backend. The doctors have to spend a few hours up front 'talking' to the system so that it can 'learn' their speech patterns. After that it has an accuracy rate of 95%. The last 5% of the errors are caught by a reduced transcriptionist staff. The savings are huge to the hospital(s) and there are no longer delays in getting the medical records filed in a timely fashion. Win Win situation.
as one of the biggest 2001 crash victims. The original owners sold their company to Lernout-Hauspie for what was then several hundred million dollars worth of stock, and is now about ten cents worth of stock. No exaggeration here.
At least, though, they're better off than Messrs. Lernout and Hauspie, who are in jail.
The last I heard of Dragon was that the technology was sold off in the L&H bankruptcy proceedings to some software discounter.
The software is very expensive. Is there a trial version I can try?
This guy writes a useful article about specific, real-world usability and it gets one'd? Whereas some putz makes a stoopid and obvious 'joke' about the mistakes a sloppy dictator can get from voice recognition software and it gets a Funny:5? Feh
I really feel that a significant portion of society is being discriminated against. There are 550,000 people in the UK alone who have speech dysfluency problems, and yet speech recognition cannot deal with the multitude of different manifestations of this; repetition of sounds, blocking, or laboured or breathless speaking.
How, exactly, is your very small minority being discriminated against? That is, quite possibly, the most absurd statement I've heard in some time!
A few things:
1) Learn what discrimination is. You obviously don't know.
2) You can't expect speech recognition, which is just now becoming a viable technology, to be perfected to the point that it can cope with your particular disability.
3) It looks to me like all you're after is sympathy! I find that disgusting. Learn to live in society like the rest of us. Don't expect society to learn to live with you. You're not that important.
Required reading for internet skeptics
Voice recognition is an often overlooked aspect of the Tablet PC. On the single mic systems it is often confused by background noise. Some systems have an array of three microphones that help compensate. The difference between the single mic and array mic setup is astounding.
CDs have replaced floppies, but mice haven't replaced keyboards. There seem to be plenty of good reasons (privacy, for starters) why the Linguistic User Interface won't replace the GUI.
This is not to say LUIs are pointless. There are great advantages to the LUI as well; I'm thinking that voice recognition might allow you to ditch the keyboard, but not the mouse (or other pointing device) - which could be used to designate the context of voice input (iow, "who you're talking to/about").
I had to set up Dragon NaturallySpeaking for a client about 4 years ago. It worked amazingly well (much better than the experiences that I had heard about or that I had with it personally). The big however was that the client purchased a mic & amp set that cost about $450. Also, I think he just had the right voice for it, because when my co-workers and I tried to use it, the software worked better than with a cheaper mic, but still not as well as was touted by the Dragon developers.
:(){
Yes, #$%^&* it! But I'm ordering an iMac, so expect to be doing less of that soon. :-)
My friend, a screenwriting student at USC, developed an awful case of carpal tunnel a couple years back and has been forced to rely on Dragon Naturally Speaking to communicate online and produce his writing... The problem is, Dragon Naturally Speaking, even the latest version, really has a lot of trouble understanding even common words and phrases.
Needless to say, I get lots of weird, garbled messages and eMails from him.
"How about Saturday for the flowerpots?"
I'm supporting users in a medical environment with Dragon 7. Getting the right microphones made all the difference. We went from an average 96% to 98-99% efficiency just by spending ~$200 on Sennheiser headset microphones and Andrea USB sound pods from these folks.
Not affiliated, just a happy user - thanks for getting the MDs off my back!
Regardless of your view, I think that it is important that adequate help is given to people who can't use speech recognition methods; no different to braille on public buildings, or whatever. If a person with a stammer tries to book a theatre ticket, but they can't because the speech recognition picks up on every hesitation in the voice, what else can they do? You can't just expect them to speak better, just as you can't expect a blind person to just "make an effort" and read the signs.
... oh, i misread the topic... thought you meant 'PP,' not 'PC'!
[fumbling to keep fingers from going crazy]
A certain GPS-enabled mapping program I have been using for years has been re-written to eliminate the drop-down menus and substitute badly-done little whatever-they-are at the bottom of the screen. Navigation through this mess is suddenly rendered horrid and just about impossible while driving. The idea, apparently, is to force the user to use the goofy speech recognition business built in to the program. It reeks, folks.
The thing is that even if the computer did identify the request properly you'd have the problem of turning it into a well-formed query. Many of the requests they get are things like "There's a hairdressing shop on Shelton, or is it Howe? Anyway, it's Southy's or Surly's, I think." Combine that with the vast number of immigrants in Canada, particularly Vancouver, and there's almost no point in trying speech recognition. They do have a primary speech recognition system attacking two parts of the query, though. When it asks you what city you're interested in and whether you want a residence or a business, that's a limited enough range that the computer can just translate it and attach it to the box that appears on the operator's terminal. Keep in mind that the operators are not listening to the whole call or sticking with your call in any way. They're in front of a scheduling computer with a headset on. Your request plays in the operator's ear, the operator types it in if she can understand it and asks you for more detail if she can't. Then she's on to the next one while the computer plays it for you. She can easily take four requests in a minute. Many people are charged for this service. Depends on your plan.
Fatal error: Call to undefined function: message_die() in /home/httpd/vhosts/pocketpcaddict.com/httpdocs/db/ db.php on line 88
Yeah...great piece of information....
"It is our choices, Harry, that show what we truly are, far more than our abilities." -- Prof. Dumbledore
I've used both, because we used to have one at work and I had the other at home.
If ur not American, Dragon NaturallySpeaking is the way to go. I get good accuracy most of the time, tho there are some words it seems to NEVER learn. It doesn't hang other programs tho.
IBM ViaVoice really sux. First, no matter how hard I try to speak American, I get maybe 70%. It's also much more CPU intensive and slow, plus there are bugs in dictating numerals/dates I have reported for 3 versions and they haven't fixed! Also, it uses this lame copy/paste thing to test whether a program supports it. Which means every time you move the cursor with the mouse, it types an 'x' then cuts it out. On a slow PC (600MHz at the time), goodbye text if you made a range select just before it cuts....
But my PC talks to me.
For some reason though, it keeps calling me "Gordon".
paintball
If you're writing a novel with lots of editing while you're writing, speech to text is not for you. If you're a physician who sees patients with just about the same conditions day in and day out, then it's a godsend if you have it working right. Have any of you heard an experienced physician rattle off a patient history into the hospital medical transcription service? Sounds a lot like the Hot Wheels guy.
For critical fields such as healthcare, speech to text will not be implemented until it's perfect. But imagine all the money saved (to help the current healthcare quagmire that we're in) if medical transcriptionists could be supplanted by software.
Linux at home
Proposed Experiment for Bored Hackers:
Pipe the text output from a chatbot to a text-to-speech program.
Then, use a speech recognition software package to listen to the audio and see how well it picks up the words.
If you are good, you can have two computers, both with talking and listening capability. See what kind of conversation they have. Perhaps it will mimic the conversation of two people with poor hearing.
Inspiration for the above experiment: What Happens When Chat Bots Talk to Each Other
>[deal with punctuation]
I glossed that. I assume a speech module would have macros, so you'd say 'dot' to mean the end of a sentence and 'semmy' for semicolon. Or, the language could use keywords like 'stop', 'dot', 'semmy', etc., in place of punctuation, and suggest formatting rules to keep things readable.
It's no big deal to alter the grammar to use a word in place of a punctuation mark. That's how I started out, replacing '+' with the word 'plus', and so on.
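The word-for-symbol swap described above is close to a one-liner at the token level. Here is a rough sketch, not the actual 'verbal' compiler (which isn't released): only 'plus', 'dot' and 'semmy' come from the posts, the symbol chosen for 'dot' is a guess, and the function name is made up.

```python
# Spoken keyword -> symbol. Extend the table as the grammar grows.
SPOKEN_TOKENS = {
    "plus": "+",
    "semmy": ";",
    "dot": ".",  # the post uses 'dot' for end of sentence; '.' is an assumption
}


def to_symbols(spoken):
    """Swap spoken keywords for their symbols, leaving every other word alone."""
    return " ".join(SPOKEN_TOKENS.get(word, word) for word in spoken.split())
```

The catch is that every keyword becomes a reserved word, so a variable named `dot` or `plus` would need some escape convention.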
sigs, as if you care.
Now if it was only equipped with some sort of speech recognition product, I'd really have something'...
"Oh wait, it's not my monitor after all!" :)
http://shit.slashdot.org/article.pl?sid=04/12/09/1650234
As Duke Nukem would say in the john, "Ahh! That's better!" :)
Call me an Old Fart, but IBM had this feature integrated in OS/2 back in the mid 90's. M$ said it was a feature no one would ever want. Is it possible they were right?
OS/2 Warp v4 included IBM's full, very expensive voice recognition software, only lacking the very expensive dictionaries that the full version supported. It could still learn quite well on its own, though.
The only problem I had with it was that a very speedy for the time 80MHz machine with 96MB RAM was quite bogged down by it. I played with it shortly after upping to 128MB on a 166MHz, and it was quite nice.
Unfortunately, Windows/Linux/most other OS's are virtually impossible to manipulate via voice.
Ahh.. those were the days. Sit thirty feet away from the computer... say "Menu. File. Open. Down. Down. OK." sure, we now have wireless mice that will allow me to achieve the same thing much faster.. but.. the only wireless mice in that era were really awful IR based ones.
And you can't dictate with a mouse.
"Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
I know they are...Trying to convince the system not to get viruses.
If the software translates exactly what it hears (which is exactly how people talk), then we're in for an even bigger problem. Sometimes I wish I was a teacher - I guarantee that if I saw one instance of slang in a written assignment I'd mark it down a grade, and make them correct it if they wanted any grade at all. Any mention of "u" instead of "you", and it's an instant F.
One word: headphones.
Girl: "Thanks, Mike. I had fun last night."
Phone: "Thanks, Mike. I have lip fungus."
Once I tied the Naturally Speaking software together with AT&T's Natural Voices and A.L.I.C.E. bot with some interesting results. Basically, I could just speak to my machine and have it answer me in a real voice. In conversation, I found myself being more forgiving of the shortcomings of the A.I. just because it was speaking to me. When it erred, it was easier to believe that the "person" I was speaking to had just misunderstood me, or I had just misunderstood what was said.
Actually, does anyone know of any research in this area? If adding speech recognition and synthesis to a computer makes the machine seem "smarter"?
A few months back, I had an accident which made it temporarily impossible for me to type with my dominant hand (left, in my case). ... typing then becomes hunt-and-peck. ... but was pleasantly surprised that it was quick. HOWEVER ... it did not seem to improve much over the next week or so, being particularly bad at really common words. AND correcting it was slow. (To be fair to M$FT: it acknowledges that the embedded speech-rec software is NOT intended for extensive use ... I actually think it's mainly intended for Chinese users to avoid the oddity of entering Han characters using a QWERTY keyboard.)
... I bought the professional version and its microphone and settled in to teaching my computer who I am. Again, you can get going pretty quickly. And it gets better and better ...
... more words flow, and what ends up is, in its first draft, stream-of-consciousness writing. When I use Dragon, I can concentrate on what I *really* want to say. I can -- and do -- close my eyes, and try hard to create a logical flow. I have found, to my surprise, that what I create by dictation is actually better in grammatical and logical terms than what I get when I type.
... about half of the time I get text instead of the command. And, it was more complex than it should have been to get Dragon to ignore the tiny, terrible microphone in my laptop. I did something that turned out to be a Bad Idea. I bought another, better microphone -- one for home, one for the office. Bad Idea, because the machine recognizes the Dragon microphone (which plugs into the a/v sockets) as being a different profile from the nifty USB microphone I bought. Different profile = different person.
... and recommended that he look into the Professional version for medics, since -- at least from my experience -- Dragon really can enable much more fluid, lucid, accurate 'writing'. It can be faster than typing. And without that dreadful problem of illegible handwriting, as well.
Amazing
Unfortunately, I have to write for my work. I have to write LOTS for my work. It's 2004, so there's no 'steno pool'. So: voice recognition, part 1. I used the speech recognition embedded in M$FTXP. I was quite prepared for a slow learning process
So, Speech Rec, part 2. It was clearly time to spend $$ and buy Dragon. I was particularly intrigued by the assertion in the ads that one can enter text using Dragon faster than in normal typing. Hmmm
Some weeks later, my hand is (pretty much) back to normal. I'm typing this using both hands, largely because it does remain a little time-consuming (a minute or two) to get Dragon going, and I just didn't. HOWEVER: when I have to write long documents, I now still fire up Dragon and dictate. It actually CAN BE faster than typing.
And, another thing: when I type, I look at the screen, see what's there,
Dragon is NOT flawless. Partly, I'm sure, this is my fault (yeah, have I read the manual???) I haven't mastered the art of commands. When I tell it to 'move (the cursor) to the end of the line'
On the final visit to the hand surgeon to check out my hand's recovery, I talked about this
(www666) this is so cool I'm typing with Dragon NaturallySpeaking in mIrc
(www666) no more typing
(LameLLama) www: try "thlash exit"
*** www666 has quit IRC (Leaving)
*** www666 (baroca@spc-isp-ham-uas-05-11.sprint.ca) has joined #visualbasic
(www666) Hugh Masters
(www666) you basterdes
there are *plenty* of enterprise-grade voice recognition systems out there: both real time (dictation) and offline (digital typing). all are in constant use all over the world in legal practices, large corporations, etc, all working very nicely indeed, thank you.
dragon + decent headset and sound card = very effective.
With a stereo [or more] mike, the math of implementing directionality is simple. Then you can theoretically identify the direction of your source, and track it, and attenuate all other sources.
Whether anybody is actually implementing that kind of project, in an integratable GPLed package, is another question. Seems like that would be great fun... Any takers? Any google-able word combinations, so we can identify who is already working on this?
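For anyone who wants to play with the directionality idea, here is a minimal sketch in Python with NumPy. It is not taken from any existing package: the function names, the 343 m/s speed of sound, and the two-microphone far-field geometry are all assumptions made for illustration. It estimates the inter-microphone delay by cross-correlation, converts that delay to an arrival angle, and does a simple delay-and-sum to favour the chosen direction.

```python
import numpy as np


def estimate_delay(left, right, max_lag):
    """Return the lag in samples by which `right` trails `left`.

    Hypothetical helper: brute-force cross-correlation over lags in
    [-max_lag, max_lag], ignoring the edges where np.roll wraps around.
    """
    lags = list(range(-max_lag, max_lag + 1))
    core = slice(max_lag, len(left) - max_lag)
    scores = [np.dot(left[core], np.roll(right, -lag)[core]) for lag in lags]
    return lags[int(np.argmax(scores))]


def delay_to_angle(delay_samples, sample_rate, mic_spacing_m, speed_of_sound=343.0):
    """Convert a delay to an arrival angle in degrees (0 = straight ahead).

    Assumes a far-field source and two microphones mic_spacing_m apart.
    """
    s = speed_of_sound * delay_samples / (sample_rate * mic_spacing_m)
    return float(np.degrees(np.arcsin(np.clip(s, -1.0, 1.0))))


def steer(left, right, delay_samples):
    """Delay-and-sum: align `right` to `left` for the chosen delay and average.

    Sound arriving with that delay adds coherently; sound from other
    directions adds out of step and is attenuated. np.roll wraps at the
    edges, so the first and last few samples are not meaningful on real audio.
    """
    return 0.5 * (left + np.roll(right, -delay_samples))
```

With only two microphones the attenuation of off-axis sources is modest; more microphones sharpen the beam, at the cost of more alignment bookkeeping.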
This is not that hard. The main problem is that usually there is no easy way to get exactly what is coming out of the speakers, due to varying drivers and latency. Once you have the speaker signal, you need to implement an echo canceller which subtracts the signal including reflections. This is usually done by using an LMS or FDAF algorithm to find the impulse response of the speakers relative to the microphone. And you can easily design microphones that are sensitive to (mostly) one direction to get rid of more background noise. This is implemented in every phone that has a speaker-phone function. Without it, the person on the other side would constantly hear themselves talking back with a delay, which is very annoying.
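A rough sketch of the echo canceller described above, in Python with NumPy. Assumptions: the speaker signal is already available and sample-aligned with the microphone (which, as noted, is the hard part in practice), the function name and parameters are made up for illustration, and the update uses the normalized variant of LMS (NLMS) rather than plain LMS, because its step size is easier to keep stable.

```python
import numpy as np


def nlms_echo_cancel(speaker, mic, taps=32, mu=0.5, eps=1e-8):
    """Subtract the speaker's echo from the microphone signal.

    Hypothetical helper. `w` adapts toward the impulse response from
    speaker to microphone (direct path plus reflections); the returned
    signal is the microphone input with the estimated echo removed.
    """
    w = np.zeros(taps)        # current impulse-response estimate
    buf = np.zeros(taps)      # recent speaker samples, newest first
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = speaker[n]
        echo_estimate = w @ buf
        err = mic[n] - echo_estimate          # what is left after cancelling
        out[n] = err
        # NLMS update: step scaled by the energy of the recent speaker samples
        w += mu * err * buf / (buf @ buf + eps)
    return out, w
```

A real canceller also has to stop adapting while the local talker is speaking (double-talk), otherwise the filter tries to cancel the voice you want to keep. That part is left out here.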
So what I said was the most absurd thing you've ever heard for some time? Wow.
... just as you can't expect a blind person to just "make an effort" and read the signs.
:)
Yes, it is. Read my original comment again to gain a clear understanding.
I think that it is important that adequate help is given to people who can't use speech recognition methods
You can't compare a "stammer" to blindness -- no, a blind person can't "make an effort" (sic) to see. Nor should I expect such from a blind person. A person with a "stammer", on the other hand, CAN make an effort. As another poster pointed out, often times a person with a "stammer" can speak very clearly when they're alone. I don't know of any blind people who can see just fine when no one else is around.
Yeah, a "stammer" is certainly NOT A DISABILITY. Nor should it be so considered. Some therapy -- both speech and psychiatric -- could do wonders for you, as it has for many others with similar problems.
Required reading for internet skeptics
You seem to have very strong views on this. Stammering has basically no effect on my life, apart from speech recognition stuff, so I thought I would like a little article. I go to speech therapy regularly, and it's helped, but in the end it actually doesn't matter. I can still speak, as you point out.
At the same time, I wanted to highlight the difficulties that a significant group (yes, it is significant: 550,000 people is 1% of the UK population) may have when trying to use this new technology. Read what I said as a plea for sympathy, a pathetic, ignorant comment, whatever.
Hi Raefer. We're still watching you.
I was given a copy of ViaVoice for my birthday one year after complaining about wrist pain. Fortunately, the wrist pain turned out to be due to the poor setup of my chair vs. my table, but I still had a copy of ViaVoice around to toy with.
I was writing small articles and comments with it with little problem; it handled English quite well. However, quite a lot of my typing is to produce code in Perl since that is what I do for a living. Perl has so many funny little symbols that I just spent the entire time trying to figure out what ViaVoice calls different symbols, and once that was sorted having to say long descriptions of what I wanted to type rather than just bashing a button on my keyboard. I gave up in the end.
I don't recommend speech-powered programming to anyone without a specially targeted language. It would be interesting to try it with COBOL or AppleScript! :)