Voice-Op Linux PDA
Anonymous Coward writes "At http://www.the-times.co.uk/interface/dailyextra5.html is
news of a voice-operated Linux handheld computer to be announced at CeBit next week. Sounds cool!" Oh yeah. Until someone shouts out, "ARRR-EMMM ARRR-EFFF STAR!" Then we'll see who's laughing.
will it die if you tell it to? or hit people?
will it follow you around and sniff other voice op'ed pdas in funny ways?
Thank You,
Troll King
until someone shouts out, "ARRR-EMMM ARRR-EFFF STAR!"
Argh!!! I forgot I had one of those set up across the room, and was browsing on this Windows computer using text-to-voice. It must have interpreted the capitals (correctly) as shouting, because my other computer definitely heard it.
I remember this as an actual occurrence with a voice-operated system. Does anyone have verification?
-- Ender, Duke_of_URL
"Penguin: where are you?"
..."
"Here! Here! he stole me! nasty brute!
I think I remember a demo of L&H's text to speech, and it wasn't much of an improvement over the ancient typical monotone voice. I don't remember if it was them or someone else. Does anybody have info on them?
The big issue is security: does it recognize only you as a legitimate person to give commands and enter data?
The article is fluffy, and doesn't even say whether the PDA requires time to learn your voice, or for you to learn standard enunciation...
-- Ender, Duke_of_URL
rm /..org
Thank You,
Troll King
OK, so it's kinda funny, but it is also an old Dilbert strip.
Intolerant people should be shot.
What makes the Palm or a Newton Useful?
The user space apps.
Things like the names/dates/call logging application.
And, face it, most of the apps like that under the modern Unixes need to go on a resource diet if they want to fit on a handheld.
Who's been writing the lo-resource version of Xcalendar? OR a database?
If it was said on slashdot, it MUST be true!
cell phones are bad enough, now are we going to have people walking around talking to their computers? they would deserve a good whuppin' too.
Thank You,
Troll King
I recall reading once (in Risks perhaps?) about a workplace where they were testing voice recognition. All was well until a disgruntled employee walked down the corridor, shouting "FILE! EXIT! NO!", with predictable results.
rm: cannot remove `rf': No such file or directory - :P
I have L&H's Voice Express for my Windows machine, and have found its text-to-speech features to be rather adequate. Granted, it's not exactly the same as having your own personal secretary dictate the on-screen text to you, but then how many of us have a personal secretary? As for the speech-to-text: well, the enrollment process seemed rather lengthy, but I was able to use the program to do a fairly good job at dictating emails and such. But isn't this just a step away from those IBM commercials with the guy in Russia wearing his PC? Seems rather similar to me. This is my first post, so be sure to moderate me down! :)
If Murphy's Law can go wrong, it will.
I see that the concept of voice-operated devices is viewed largely as a laughable matter. Personally, I do not view the potential for "voice-cracking" as the most important aspect of these gadgets...
However, I think the linux (and /.) community should welcome the prospect of an expanding platform base in this field with enthusiasm. With the focus in the PDA/handheld field on WinCE, it would be a shame if this carried over to the field of voice-operated devices.
Personally, I think that much of the "voice" functionality will reside in the mobile phone networks, and as such be independent of the operating system of the handheld device. But it never hurts to give developers a choice of platform technology.
I am looking forward to this device, and the voice-enabled applications. Although it would be "nifty", I do not think the VoiceShell (vsh) will be the most useful application...
Vidar Larsen
bash: rmrf*: command not found
--
Infuriate left and right
Woefdram, l'apprenti sorcier
They've got it for the PDA, there must be one out there for the desktop linux user. Anybody know where I could find it?
Sorry if this seems off topic.
- AZ
Yep. You can see it at Macromedia's Shockwave site. Kinda cute. ;-)
you pause after the spaces, so it's rm (pause which'd add a space) rf (another space) *
nonetheless, it wouldn't get you anywhere, w/o the dash.
Zaphod Beeblebrox is trying to listen to the radio. I say trying because the damn silly thing is a highly sophisticated computer that interprets body movement as a request to change channels. Having to remain rock still once you get the channel is hardly going to improve your listening enjoyment. ;)
Things like voice control are great for text file production, but this kind of thing is often hyped just way too far.
Just as importantly, there is the issue of training the voice recognition system. Once it's calibrated, it might be fine most of the time, but what about when your voice temporarily changes when you get the flu?
Finally, there is the noise pollution factor. Modern open floor plan offices are noisy and distracting enough with telephones and what not. I tend to suspect that the introduction of voice controlled computers is going to be a no go unless people are allowed to work at home in a relative state of quiet.
That's all very nice and stuff, but surely when you are in a public place there is far too much background noise, and announcements, to give commands/dictate a letter... and besides, that's a hell of a lot worse than being on your mobile in the train for other passengers.
The future of OSs *is* voice recognition though... I want to see the end of keyboards. And mice.
Set up the IBM voice recognition SDK to control channel changing on kWinTV(with a Hauppauge WinTV card), plug in a sensitive microphone, and turn on the speaker system.
The damn thing started flipping between channels and window/full screen every time it picked a recognised "command" out of the current program.
I eventually managed to shut it down by unplugging the microphone.
Ah well, you live and learn.....
(Voice operated medical equipment, anyone??)
The future of OSs *is* voice recognition though... I want to see the end of keyboards. And mice.
I agree that keyboards and (especially) mice are becoming a drag. There are other possible input devices as well.
I'm thinking here of the computer seeing when the navigator window has to scroll just by looking at your eyes, knowing to stop when your pupils change size, etc.
And of course: combinations of all the new input devices. Interaction through question-and-answer with your computer (computer: "give me a smile for Gnome, cry if you want to start KDE").
nosig today
I wish all you hot grits 'people' would inject about 10 gallons of hot grits/ground glass mix in your ass and beg for a decent person to drop a petrified Natale Portlywoman statue on your cakehole encompassing head.
The voice in the demo is pretty dang warble-y. The Festival speech system does much better, IMHO.
I/O Error G-17: Aborting Installation
Now consider this: the gimp explain voice control in that? "draw the mona lisa"?
No, not quite. Voice control won't replace any 2-dimensional manipulator interfaces any time soon (at least not for non-disabled users). No one is claiming that the mouse will be rendered useless. After all, "a picture is worth..." Well, ya know.
BUT. How much do you really enjoy clicking around the gimp toolbox? Or, worse yet, searching for a filter you don't normally use in 3- or 4- deep menu system while losing that exact pixel you were over in the image. Right there, a secondary interface via voice would be ideal. No need to lift hand off mouse or move the pointer at all. Just "Use filter A, settings 50%, 3, no." I'm generally against voice recognition, but this would be one of the few spots I'd definitely want to see it.
// zyqqh
You're right, typing is/can be faster than speaking, BUT speaking is faster than a mouse. But why was the slow to use GUI invented? Answer: to provide ease of use and a more natural working environment than a command line.
Dictation is a much more natural integration of person to computer. Slower, but better. A newbie can do it just as well as a nerd. Why learn to touchtype???
End of command lines. End of keyboard. End of mouse - use touch-sensitive screens. Sorted.
Would *that* qualify as "free speech"?
Seriously, Voice interfaces probably have a very limited usage. Some disabled would benefit (much). Hands free applications are very useful in cars and such, but typing is generally less tiresome.
Sure many people type faster than they speak (at least if it is to be interpretable by a machine) but the main problem is that speaking for an hour is very tiresome (and irritating for those around), and commands by voice are difficult compared to mouse and keyboard. ("Swap those two words,... three sentences back" as opposed to drag and drop or the arrow key dance.).
Still cool is always cool...
All opinions are my own - until criticized
As long as I can root Diego Garcia by just walking down the street and listening to him speak his login and password into his PDA it's all good....
http://www.developer.ibm.com/library/articles/nielsen1.html
Have a read of what Jakob Nielsen (one of the greats of user interface design) says. He presents one of the better arguments as to why voice recognition just isn't that good a way of interacting with a machine. Most of the things that voice recognition is pushed forward for can be done better and with greater accuracy with your hands and a well thought out display. There are certain cases where it is the best option, and possibly a PDA is one of them (although I use a Psion and don't have any problem with it at all and I wouldn't want voice recognition), but for the most part it's a gimmick that doesn't stand up to the demands of the user.
An Eye for an Eye will make the whole world blind - Gandhi
It seems it will delete everything (at least if you are root at the time), no matter what directory you are in, as it includes /.. . I actually did a chown -rf .* in a moment of idiocy, trying to change some dot files from one user to another once. The dot files were assigned to the correct user, but it made the system as unusable as if I had used rm -rf .*
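A rough way to see why .* is dangerous, sketched in Python rather than a real shell. This only models the wildcard matching with the standard fnmatch module; the file names are made up for illustration, and some newer shells skip "." and ".." on their own, so treat it as an illustration of the pattern and not of any particular shell.

```python
import fnmatch

# Names a shell might see in a home directory, including the two
# special entries "." (this directory) and ".." (the parent).
# The file names are invented for this example.
names = [".", "..", ".bashrc", ".profile", "notes.txt"]

def matches(pattern, candidates):
    """Return the candidates that match a shell-style wildcard pattern."""
    return [n for n in candidates if fnmatch.fnmatchcase(n, pattern)]

# ".*" also matches "." and "..", which is how a recursive command
# can climb out of the current directory.
dot_star = matches(".*", names)

# ".[!.]*" requires a second character that is not a dot, so it skips
# "." and ".." (it also skips rarer names such as "..foo").
safer = matches(".[!.]*", names)

print(dot_star)
print(safer)
```

The second pattern is the usual workaround when you really do want to touch only the dot files in one directory.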
_this is not a signature_
The killer applications for a PDA are the contact info, schedule, and memos - in general, maintaining a database made of records with a small amount of data in each field. Short messaging (integrated with E-mail) too, I guess - still small amount of data. Everything else is bells and whistles. People do not write long texts on a PDA - they use laptops, or at least buy one of the nifty folding keyboards for their PDA. People do not run GIMP on a PDA.
For these killer apps, a voice API is great: "show today's schedule". "new meeting, March 14th, at 10, with L&H". "new memo: buy milk for santa". "new expense: the L&H account, 112$, business lunch". "show contact Joe". "Message to Jane: Lunch at 2?".
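Commands like these are regular enough that a small pattern table could handle them once the recognizer has produced text. A minimal sketch in Python, assuming the recognizer hands over a plain string with the colon included; the action names and the patterns are hypothetical and not taken from any L&H or Compaq software.

```python
import re

# Hypothetical command grammar, modelled on the example phrases above.
# Each entry is (action name, pattern with named fields).
PATTERNS = [
    ("show_schedule", re.compile(r"^show (?P<day>\w+)'s schedule$")),
    ("new_memo", re.compile(r"^new memo: (?P<text>.+)$")),
    ("show_contact", re.compile(r"^show contact (?P<name>.+)$")),
    ("message", re.compile(r"^message to (?P<to>\w+): (?P<text>.+)$")),
]

def parse(utterance):
    """Return (action, fields) for a recognized command, else None."""
    # Lowercasing keeps the patterns simple but discards capitalization.
    text = utterance.strip().lower()
    for action, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return action, match.groupdict()
    return None

print(parse("show today's schedule"))
print(parse("new memo: buy milk for santa"))
print(parse("Message to Jane: Lunch at 2?"))
print(parse("format the disk"))
```

Anything that does not fit a pattern comes back as None, so the device can ask again instead of guessing.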
I'd expect you'll need to push a button to make the PDA listen - I wouldn't like one which listens all the time (it might make sense for a desktop system but not for a PDA). I also expect you'd still have a touch-sensitive display, and be able to use a stylus for menu navigation and writing. Just like desktop systems did not give up the keyboard when they got the mouse!
Something like the "Itsy" would be perfect for the above. Take my REX-PRO and add handwriting recognition like the Palm's and voice recognition like the above and you end up with the perfect PDA. The only possible improvement would be integrating it with a cellular phone, or maybe with a holographic projector
Obviously working on the voice UI would take a lot of effort to get right. I predict the initial offering - by L&H or whoever - will flop like the Newton, to be followed by a Palm-like successor which would get it right.
And both L&H and Compaq know this. That's why they are both using Linux; writing a voice UI that works is a classical open source "itch to scratch". They'll be able to obsolete the first generation software and replace it with a second open-sourced generation - while maintaining the same hardware platform, escaping the Newton's fate. Good move for them, good move for us, bad news for Microsoft.
that would simply remove all of your dot files. the real killer, I think (but I hope not to find out I'm right the hard way, like I did with chmod), would be rm -rf .* *
_this is not a signature_
The concept of a desktop computer is so unnatural! Especially with a tube firing electrons down it, producing a flickering and radiation-emitting output. Bah! I want to sit/lie in my bed, with my PDA, and read it or talk to it like a book. Not that I talk to books... tell me, how easy is it to type when lying on your back in bed?
It's about time we stopped adapting to computers with keyboards and CRT tubes and adapted them to us.
The other thing is gaming. I just don't think Quake would be as cool if you just touched your opponents to splatter them.
Although I would agree with you, I think that to play games you would probably want a joystick? Or just stick to chess? :o
I haven't checked in a while (may be a bit outdated), but here's some linux speech apps
For those that really wanna play, check out ISIP 's ASR project.
For those that are interested in acquiring speech corpora (training data) check out The LDC-online. Get the free guest account, use your perl skills and your imagination, and suddenly the TIMIT corpus is yours
Email me if you're interested in this kinda stuff (or want my timitgrab.pl script)... its not my primary address, but I check it from time to time.
I ate my sig.
Handwriting analysis (like graffiti in Palm Pilots) makes them usable in situations when talking to your PDA could look silly -- I'm guessing that at some point there WILL be some times when it won't look silly! Then.. a headphone/mic jack would allow the little in-ear headsets (for a bit of privacy and improved voice recognition) *AND* would also allow MP3 player apps! OK.. needs handwriting-text entry.. and also audio in/out jacks. IMHO.
Come on son! Surely you can do a better flame than that!
This Is Slipshod.org! Home of the combination fireproof codpiece and jerkin!
Saxo Grammaticus
ARR-EMM SPACE DASH ARR-EFF SPACE STAR
still won't work as it should translate to
rm dashrf star
which should cause no harm at all.
How would Metacharacters be entered anyway? ESCAPE STAR? LITERAL STAR? And how would the ESCAPE or LITERAL be escaped?
Seems to me that voice commanding some appliance is not so easy after all?!?
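One possible answer to the escaping question, sketched in Python. Everything here is made up for illustration: the spoken-token table, the choice of LITERAL as the escape word, and the rule that unknown words pass through unchanged. No real recognizer is known to work this way.

```python
# Hypothetical spoken-token table; none of these names come from a real product.
SPOKEN = {
    "arr": "r", "emm": "m", "eff": "f",
    "space": " ", "dash": "-", "star": "*", "slash": "/",
}
ESCAPE = "literal"  # assumed escape word: the next token is kept as spoken

def transcribe(utterance):
    """Turn a spoken token sequence into the text a shell would receive."""
    out = []
    tokens = iter(utterance.lower().split())
    for tok in tokens:
        if tok == ESCAPE:
            # Keep the following token verbatim. Saying "literal literal"
            # yields the word "literal", which escapes the escape itself.
            out.append(next(tokens, ""))
        else:
            # Words missing from the table pass through unchanged.
            out.append(SPOKEN.get(tok, tok))
    return "".join(out)

command = transcribe("arr emm space dash arr eff space star")
escaped = transcribe("arr emm space literal star")
print(command)
print(escaped)
```

With this scheme the dangerous command only appears if every SPACE and DASH is spoken explicitly, and LITERAL STAR gives you the word instead of the wildcard.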
How about speech to text? Dictate something into your PDA, have it convert to text and edit/share/distribute it? It's P D A, remember? PERSONAL? ---ack, what's the use, you're all closet Luddites.
What's more interesting is using voice recognition for PDA-specific tasks. Like checking addresses. Or phone numbers. "Number John Doe", and it pops up. Brilliant, instead of clicking through menus.
--No sig today, my sense of humor has gone away.
Check freshmeat for linuxconf, it's similar to SAM, SMIT, etc and works great!
-- I speak only for myself.
ARRRR EMMMM DASH ARRR EFFFF SLASH STAR
but what if you're a pirate?
ARRRR MATEY!
would it delete "atey"?
I'm gonna pre-empt the argument about an office full of people talking to their computers being too noisy. Right now everyone is talking on the phone and a couple have radios/CDs playing. It's not too noisy. The only downside I can see is that when Windows crashes people might be tempted to shout obscenities at their computer (as opposed to muttering under their breath like they do now.)
I have no fear of the keyboard though. I don't mind typing. In fact I often find it annoying to reach for the mouse. Some voice recognition capabilities would be nice though--especially at home.
Seriously, I can already picture how I can make my whole apartment voice activated. "Turn on fishtank" would turn on the light in the fish tank (X10/firecracker,) "dim lights 75%", "play sublime 40 badfish."
I doubt I'd ever use voice recognition instead of typing in the shell other than for that kind of thing. But could certainly use it in a standalone app that executes shell commands based on voice commands i've specifically taught it. One of these days I'll get around to doing something like this.
numb
I want voice recognition on my _workstation_! Is anyone listening???
The ViaVoice SDK comes close, but I haven't found any well-done frontends to that, even. I wish Dragon or L&H would release a product for Linux, or at least one that works with Wine.
Those are the figures for speaking at speeds that DNS can accurately transcribe. Yes, you do have to also account for correction, but if I'm writing I combine this step with my normal editing. It adds time but not too much, especially after you've used the system for a while and are getting good performance (98% or so).
When writing a report or whatever, SR is easily faster than typing. It's not perfect, and it doesn't work well for things other than text entry and command and control, but what it does (which incidentally is also nearly everything that people use their keyboard for) it does well and faster.
I just thought of an interesting idea... playing a 1st person shoot-em-up with mouse control for aiming and voice control for most other actions. I guess I'd still want the keyboard for movement; voice control of trying to run or whatever would still be kludgy. Oh well.
Duh,
nothing would happen.
You all forgot to scream
"ENTER"!!!!!!!!!!
Anyone know how this speech recognition will compare to the new kid on the block, Converse`?
Furthermore, when will that 11 node neural net the guys from USC came up with, be used in these kinds of products?
What we need is some voice recognition HARDWARE (a-la 3d video accelerators) that look similar to a keyboard or a keypad to your applications. Then writing all of your apps to use speech input is a simple task. You could even have a speech recognition daemon running (like the infrared control daemons already available). Speech recognition is simply another input "device" It would make sense to make it a piece of hardware, no?
...that Slashdot _IS_ maintained by a "Squadron of Circus Geese".
a l/pda/index.htm )...
;-)
I submitted this story I found on USENET and they
decided not to post it. It's about Samsung's
Linux PDA.
--- Story - Start ---
From: "Amandio J.S. Bacalhau"
Newsgroups: comp.sys.palmtops.pilot,
comp.sys.palmtops
Subject: NEW info about the new Linux PDA from
Samsung !
Date: Sat, 12 Feb 2000 13:21:41 -0800
I received this info about the Samsung Yopy
(http://www.sem.samsung.co.kr/eng/product/digit
[snip]
We are going to show Multimedia PDA YOPY at Cebit
show in Germany for the first time in the world
from Feb. 24 through Mar. 1, 2000.
In regards to specific information(like performance) will be available on the digital
website from the end of Feb.
Of course, we will provide you any new information
on YOPY when we are ready via e-mail.
YOPY will be available in the market from the
second quarter of this year. Then you can meet YOPY
in your area. We are working on launching plan for
the product such as price and sales channels.
[snip]
Anyone going to CeBIT ?
Amandio J.S. Bacalhau
--- Story - End ---
Make of it what you will.
This is basically the last big hurdle on the way to what I call Gear. (The name comes from the short-lived SF series _Earth 2_, where it referred to the heads-up, voice-controlled computer/communicators the humans wore.) Consider:
Morning. Get up. Get dressed. Put on your Baldric, a Miss-Universe-style sash made of trendy-stereo-grey squares, roughly the size of cigarette packets. You're going for state-of-the-art, so your Baldric contains:
- a RAM RAID, four or five Gear Cells of high-capacity, non-volatile memory, redundantly copying each other so that nothing short of a flamethrower will cause memory loss.
- a Jack-In-The-Box, a cell containing a speaker, microphone, infrared and microwave transceivers, all sorts of cable in/outs, and all the software necessary to allow your Gear to communicate with the mobile phone network, internet, infranet, and you.
- a Brain Cell, a pluggable, replaceable processor.
- an Eye Ball, a cell containing a digital camera and a projector; this does most of the visual display work, projecting on a nearby wall, or connecting to your optional heads-up display.
- a Handle, a slightly oversized cell with a chord keyboard _and_ a Palm-style stylus/Graffiti-pad arrangement for quick, quiet text input.
You operate your gear using voice commands, mostly, but like most power users you don't only use English. GearCorp have followed the example of Palm Computing, whose Graffiti is not quite standard handwriting but rather a modified, streamlined version. Knowing that some sounds are easier to detect than others, they invented a language called Glish. So: a casual user might open a work file with the command "Menu File. Open. Section 'Work'. Section 'Memo'. Document 'DailyMemo'." On the other hand, you, as a power user, would say "Fie Oh Dok At 'Work' At 'Memo' At 'DailyMemo'". Rolls off the tongue, and is much quicker for you and the Gear.
Go to work. That is, go to the park, sit there and conduct work in relaxed surrounds. Take calls, write programs or documents, "attend" meetings, all while sitting on a park bench watching the world go by. If you need confidentiality, use the Handle, or speak in Glish. In your briefcase you have a full-sized foldable keyboard and a foldable flatscreen with easel legs, so you can avoid using the Handle and the Eye Ball if you like.
I think it'd work. I think it'll be here within five years. And I think it'll change the computing world more than anything since VisiCalc.
: Fruitbat :
I have discovered a truly remarkable
chmod -R 644 / dosdir/*
Because if you do rm -rf
--------
"I already have all the latest software."
What we should be researching is artificial intelligence (not necessarily artificial consciousness). If we managed that, we'd have vehicles that drive themselves, voice recognition, and, most importantly, fast research in all other fields.
We don't take even partial AI seriously enough.
--------
"I already have all the latest software."
Do you want the new user interface applications developed in open source on Linux, or only on MSWin3K and the occasional Macintosh? Yeah, I thought so... There's also the PDA-like devices that will come from the cell-phone makers, and it'd be nice to have good programming interfaces to them. Some things will be killer apps, others will be toys we get bored with quickly, but open development environments will make it easier for everybody to try things out.
Some user interfaces are just dumb replacements for keyboards on machines that have conventional-sized screens. There are a lot of problems for which this is adequate, including the typing-impaired but also applications where you want hands-free but don't need to be eyes-free, such as information kiosks ("mirror, mirror on the wall, where can I find beer in this airport?"), reference-finders for workers in messy environments ("zoom in on the picture of the carburetor"), etc.
Voice commands can also be mouse/menu substitutes, for people who like them. A long-known safety principle is to limit the commands to a relatively short set of very safe commands. You don't want to have "rm -fr *" there, but "mail" and "phonebook bob smith - yes - dial" are pretty safe. (Ok, there are still risks like that web site with the background sounds saying "phonebook 1-900-RIP-OFFF - dial", but you can decide how much risk management you want. And you want it to ignore almost anything after the keyword "Daddy".) One of my coworkers had a PC-based application; we'd be on a conference call, and he'd occasionally interrupt to tell his computer to fetch a file. He doesn't use it much any more - I'm not sure if the novelty wore off or if he decided to cut down his weirdness quotient on the phone.
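The short-safe-list idea is easy to sketch. The Python below is only an illustration: the allowed keywords, the " - " step separator and the "daddy" cut-off are assumptions lifted from the examples above, not anyone's actual product behaviour.

```python
# Assumed whitelist: only steps starting with one of these words are acted on.
ALLOWED = {"mail", "phonebook", "dial", "yes", "no"}
IGNORE_AFTER = "daddy"  # drop this word and everything said after it

def accept(utterance):
    """Return the list of safe steps, or [] if anything looks unsafe."""
    words = utterance.lower().split()
    if IGNORE_AFTER in words:
        words = words[:words.index(IGNORE_AFTER)]
    # Steps are separated by a spoken pause, written here as " - ".
    steps = [s.strip() for s in " ".join(words).split(" - ") if s.strip()]
    for step in steps:
        if step.split()[0] not in ALLOWED:
            # One unknown keyword rejects the whole utterance.
            return []
    return steps

print(accept("phonebook bob smith - yes - dial"))
print(accept("rm -fr *"))
print(accept("Daddy dial the pizza place"))
```

Rejecting the whole utterance on one bad step is deliberately strict; a real device would probably ask for confirmation instead, and this does nothing about the background-sound trick with the 1-900 number.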
If you're willing to do voice input and output, portability becomes more practical, and computers can be a lot smaller because they don't need screens and keyboards, and more flexible because you can stick them in a pocket or backpack and use a headset. Sure, people will look at you funny walking down the street talking to yourself, but here in San Francisco, half the people on the streets are either talking to their cellphones or their liquor bottles, and society has adjusted to it. A hands-free voice portable makes an interesting combination with a GPS system and datacomm; it can give you directions while you're driving, tell you about nearby restaurants and traffic jams, and maybe let you call nearby cars ("Hey, CA123456, use your &^%&^% turn signal!").
MP3 Players can also benefit from voice interfaces, since it mainly requires adding a bit of storage to the computer you're already carrying. ("Computer, play Dark Side Of The Moon three times, volume low, speakers, order large pizza from Foobaros.").
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
the sad part being that i understood that joke.
I use a Japanese OS now and then, and while I can speak Japanese ok, typing emails in Japanese is a pain in the butt..Japanese speech recognition would be cool, since typing Japanese (even with a Japanese keyboard) is HARD..
Good point.
-- Ender, Duke_of_URL