Ask Slashdot: Linux and Telephony
This one is a doozy. I've received various
submissions from people who were looking for
information on how to make their Linux
box into an answering machine. I've also
received submissions asking about Voice
Synthesis and Speech-To-Text. I have to
admit I haven't found much information on
either while browsing on the net, so I'm
turning the question over to you folks. However,
I wonder if there isn't an issue hidden here.
Can Linux be used as an Interactive Voice
Response (IVR) platform? If not, why not?
First off, let's NOT forget the actual
questions:
Metiu and Sri both want to know if a Linux box with a voice modem can be used as an answering machine.
Gextyr is looking for information on Voice Synthesis packages that are available for Linux.
This Clan AC Member wants to know if there are any applications or APIs for Linux that deal with Speech-To-Text or Text-To-Speech.
Lastly, there have been quite a few submissions asking whether or not Linux can be used as a demand fax server. Can it?
If Linux can be used for all of the things above, what's stopping it from performing as an IVR system? IVR systems are simply systems designed to use a telephone as the computer interface (using both touch tones and voice). IVR systems are used everywhere, from your voice mail, to ordering systems, and corporations are adopting more and more IVR systems for various tasks.
I've seen IVR implemented on DOS systems but most of these have moved to NT. What's preventing Linux from operating in this market? Are there existing IVR projects in progress, or is this another area where Linux falls behind?
There are AT commands to do all this stuff, if you want to roll your own software. You'd have to do the system side (sound, etc.) yourself. Rockwell (now Conexant) supports this through what they call "business audio," which uses half-duplex digital PCM audio data from your computer (over the serial port/ISA slot). They also have an analog path to and from the chip, but that would be trickier: unless you have a speakerphone version, the mic from your PC is probably not hooked up to your modem. Here are a few Rockwell AT commands (they're the MOST common modem chipset manufacturer), including fax and CLID, to get you started:
7.5 CALLER ID COMMANDS
#CID=0 Disable Caller ID.
#CID=1 Enable Caller ID with formatted presentation.
#CID=2 Enable Caller ID with unformatted presentation.
7.6 FAX CLASS 1 COMMANDS
+FCLASS=n Service class.
+FAE=n Data/fax auto answer
+FRH=n Receive data with HDLC framing.
+FRM=n Receive data.
+FRS=n Receive silence.
+FTH=n Transmit data with HDLC framing.
+FTM=n Transmit data.
+FTS=n Stop transmission and wait.
7.7 FAX CLASS 2 COMMANDS
+FCLASS=n Service class.
+FAA=n Adaptive answer.
+FAXERR Fax error value.
+FBOR Phase C data bit order.
+FBUF? Buffer size (read only).
+FCFR Indicate confirmation to receive.
+FCLASS= Service class.
+FCON Facsimile connection response.
+FCIG Set the polled station identification.
+FCIG: Report the polled station identification.
+FCR Capability to receive.
+FCR= Capability to receive.
+FCSI: Report the called station ID.
+FDCC= DCE capabilities parameters.
+FDCS: Report current session.
+FDCS= Current session results.
+FDIS: Report remote capabilities.
+FDIS= Current sessions parameters.
+FDR Begin or continue phase C receive data.
+FDT= Data transmission.
+FDTC: Report the polled station capabilities.
+FET: Post page message response.
+FET=N Transmit page punctuation.
+FHNG Call termination with status.
+FK Session termination.
+FLID= Local ID string.
+FLPL Document for polling.
+FMDL? Identify model.
+FMFR? Identify manufacturer.
+FPHCTO Phase C time out.
+FPOLL Indicates polling request.
+FPTS: Page transfer status.
+FPTS= Page transfer status.
+FREV? Identify revision.
+FSPL Enable polling
+FTSI: Report the transmit station ID.
7.8 VOICE COMMANDS
#BDR Select baud rate (turn off autobaud).
#CLS Select data, fax, or voice.
#MDL? Identify model.
#MFR? Identify manufacturer.
#REV? Identify revision level.
#TL Audio output transmit level.
#VBQ? Query buffer size.
#VBS Bits per sample.
#VBT Beep tone timer.
#VCI? Identify compression method.
#VGT Set playback volume in the command state.
#VLS Voice line select.
#VRA Ringback goes away timer (originate).
#VRN Ringback never came timer (originate).
#VRX Voice receive mode.
#VSD Enable silence deletion (no function, command response only).
#VSK Buffer skid setting.
#VSP Silence detection period (voice receive).
#VSR Sampling rate selection.
#VSS Silence detection tuner (voice receive).
#VTD DTMF/tone reporting.
#VTM Enable timing mark placement.
#VTS Generate tone signals.
#VTX Voice transmit mode.
7.9 VOICEVIEW COMMANDS
+FCLASS=n Service class
-SVV Originate VoiceView data mode
-SAC Accept data mode request
-SIP Initialize VoiceView parameters
-SIC Reset capabilities data to default setting
-SSQ Initiate capabilities query
-SDA Originate modem data mode
-SFX Originate FAX data mode
-SMT Mute telephone
-SDS Disable switchhook status monitoring
-SQR Capabilities query response control
-SCD Capabilities data
-SER? Error status (read only)
-DTP VoiceView transmission speed
-SSR Start sequence response control
+FLO Flow control select
+FPR Serial port rate control
-SSV VoiceView data mode start sequence event
-SFA Facsimile data mode start sequence event
-SMD Modem data mode start sequence event
-SRA Receive ADSI response event
-SRQ Receive capabilities query event
-SRC: Receive capabilities information event
-STO Talk-off event
7.10 DSVD COMMANDS
-SSE=1 Enable DSVD
-SSE=0 Disable DSVD
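If you do roll your own, the basic loop is: write "AT" plus the command and a carriage return, then read lines until the modem says OK or ERROR. Here's a rough Python sketch of that. The function name send_at and the FakeModem stand-in are made up for illustration, and the "OK"/"ERROR" result codes assume the modem is in verbose response mode. On real hardware you would pass in an open serial port object with write() and readline() (for example one from the pyserial package) in place of the fake.

```python
import io


def send_at(port, command, max_lines=20):
    """Send one AT command and collect response lines until OK or ERROR.

    `port` is any object with write(bytes) and readline() -> bytes.
    """
    port.write(("AT" + command + "\r").encode("ascii"))
    lines = []
    for _ in range(max_lines):
        raw = port.readline()
        if not raw:
            break  # timeout or end of stream
        text = raw.decode("ascii", errors="replace").strip()
        if not text:
            continue  # skip the blank lines around responses
        lines.append(text)
        if text in ("OK", "ERROR"):
            break
    return lines


class FakeModem:
    """Hypothetical stand-in for a serial port so the sketch runs without hardware."""

    def __init__(self):
        self.sent = []
        self._replies = io.BytesIO()

    def write(self, data):
        self.sent.append(data)
        # A real modem decides what to answer; the fake always says OK.
        self._replies = io.BytesIO(b"\r\nOK\r\n")
        return len(data)

    def readline(self):
        return self._replies.readline()


if __name__ == "__main__":
    modem = FakeModem()
    # Both commands are taken from the list above.
    for cmd in ("#MFR?", "#CID=1"):
        print(cmd, "->", send_at(modem, cmd))
```

Keeping the port as a plain write/readline object means the same function works against a test double and a real device.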
It's very possible.
;)
I've currently got an old 486/50 DX running Linux 2.2.5 at home that handles voicemail for me using mgetty and some custom shell scripts. (Unfortunately I was never able to get the vgetty Perl module working... it's very old and there are almost no docs for it...)
It's pretty slick. People calling can leave voice messages or faxes. I've got it set up so either one gets packaged up as a MIME attachment to an e-mail and queued to send to me. Next time the system is online it sends them off. If they sit there more than two hours it'll dial itself up, send them, and get back offline. It also archives them so I can get them through a web browser on any system in my apartment, or I can just hit the reset switch on the front of the system (which is plugged into the parallel port) and it plays any new messages for me. The turbo light blinks when I've got new messages.
I can also control all the X10 stuff in my apartment (mostly useful for options #1, turn off all the halogen lights, and #2, turn off the coffee pot, both reducing the chances that my spacing out one morning will result in my apartment burning down).
The last thing I can do is use it to make my network dial up. The system handles my masquerading and internet access as well as voicemail, so when it dials up my entire network is online, and then it e-mails the IP address it got to my PCS phone. A secure SSL web page on that IP address lets me control all those devices directly (especially turning on other PCs), check my messages, or disconnect the network...
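For anyone wanting to try the "package it up and queue it" part without shell scripts, here's a minimal Python sketch using the standard email library. This is not the setup described above, just one way the idea could look. The function names, the MIME types chosen for voice and fax, and the addresses are all assumptions for illustration.

```python
from email.message import EmailMessage

# Assumed attachment types; the real ones depend on how you convert
# the recorded voice data and received fax pages.
MIME_TYPES = {
    "voice": ("audio", "x-wav"),
    "fax": ("image", "tiff"),
}

MAX_QUEUE_AGE = 2 * 60 * 60  # two hours, in seconds


def package_message(sender, recipient, kind, payload, filename):
    """Wrap one recorded message (bytes) in an e-mail with a MIME attachment."""
    maintype, subtype = MIME_TYPES[kind]
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"New {kind} message"
    msg.set_content(f"A new {kind} message is attached as {filename}.")
    msg.add_attachment(payload, maintype=maintype, subtype=subtype,
                       filename=filename)
    return msg


def should_dial_out(oldest_queued_at, now):
    """True once the oldest queued message has waited more than two hours."""
    return now - oldest_queued_at > MAX_QUEUE_AGE


if __name__ == "__main__":
    mail = package_message("box@example.org", "me@example.org",
                           "voice", b"RIFF....", "msg-001.wav")
    print(mail["Subject"])
    print(should_dial_out(oldest_queued_at=0, now=3 * 60 * 60))
```

The resulting message can be written to a spool file or handed to the local mailer, which delivers it the next time the link is up.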
The real limiting factors I'd see in using it as an IVR system are the limited support for multi-line voice products, and the poor documentation and difficult programming for vgetty. I'm not sure there are any options other than vgetty.
Using vgetty in combination with packages like HylaFAX gives you easy ability to do fax-on-demand and other services like that.
I also used a system with three 14.4k voice modems and vgetty as a way of validating information on a system that required users to give their true phone number. We stored the user's claimed phone number and a code in a database, then e-mailed the user the code to punch in. The voice system would use caller ID and compare the code they entered with the code matching that number in the database. Match? Voila! Flag is set, account is activated.
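The matching step is simple enough to sketch in a few lines of Python. This is not the original code. The dict and set stand in for the database, and the names handle_call and normalize_number are hypothetical. It assumes the caller ID number and the touch-tone digits have already been read from the modem.

```python
import hmac


def normalize_number(number):
    """Keep only the digits so '555-1234' and '5551234' compare equal."""
    return "".join(ch for ch in number if ch.isdigit())


def handle_call(pending, activated, caller_id, entered_code):
    """Activate the account if the entered code matches the caller's number.

    pending:   dict mapping claimed phone number -> code e-mailed to the user
    activated: set of phone numbers whose accounts are now active
    """
    number = normalize_number(caller_id)
    expected = pending.get(number)
    # compare_digest avoids leaking how many digits matched via timing.
    if expected is not None and hmac.compare_digest(expected, entered_code):
        activated.add(number)
        del pending[number]  # each code can be used only once
        return True
    return False


if __name__ == "__main__":
    pending = {"5551234": "4821"}
    activated = set()
    print(handle_call(pending, activated, "555-1234", "4821"))
    print(activated)
```

A caller phoning from a different line than the one they registered fails the lookup, which is the whole point of the check.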
Worked great, client never used it though. C'est la vie.