Phoneme Approach For Text-to-Speech in SCIAM

Posted by Hemos on Sunday March 16, 2003 @11:59PM from the understanding-the-language dept.

jscribner writes "Scientific American is running a feature on IBM Research's Text-to-Speech technology. It discusses the current state of affairs in this field, and describes IBM's phoneme-based 'Supervoices' approach. The IBM site provides a demonstration that lets users enter text to be rendered as speech, and also offers several examples in other languages."

TTS is great by jjohn · 2003-03-17 00:31 · Score: 4, Interesting

Last year, I started playing with this IBM tech. I thought it would be cool to have RSS feeds read to you in the middle of streaming music. It's kind of do-it-yourself radio. Although I don't have anything to show for that idea, I did make a few songs with it, like Make the Pie Higher, Plug Nickle and Progress.
mmm. I hope the server can take a slashdotting...
The TTS interface is C++, but it comes with a program that will compile text into AU files. I wrote the following script to change those AU files into mp3s:
#!/bin/bash
# Make a text file a spoken MP3
if [ -z "$1" ] ; then
    echo "usage: $0 <input.txt>"
    exit
fi

base=`basename $1 .txt`
echo "attempting to create $base.mp3"
/home/jjohn/src/c/viavoice/cmdlinespeak/speakfile $1
writewav.pl temp.au temp.wav
lame -h temp.wav $base.mp3
rm -f temp.au temp.wav
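
A batch wrapper around that script might look something like the sketch below. It is only an illustration: the script name txt2mp3.sh, the CONVERTER variable, and both function names are assumptions, not anything from IBM's kit. Because the script writes to the fixed names temp.au and temp.wav, the files are converted one at a time rather than in parallel.

```shell
#!/bin/bash
# Hypothetical batch wrapper. txt2mp3.sh is an assumed name for the
# text-to-MP3 script; override it with the CONVERTER variable.

# Derive the MP3 name the same way the script does (basename minus .txt).
mp3_name_for() {
    echo "$(basename "$1" .txt).mp3"
}

# Convert every .txt file in a directory, sequentially, because the
# script reuses temp.au and temp.wav in the current directory.
convert_dir() {
    local dir=$1
    local converter=${CONVERTER:-./txt2mp3.sh}
    local f
    for f in "$dir"/*.txt; do
        [ -e "$f" ] || continue                     # glob matched nothing
        [ -e "$(mp3_name_for "$f")" ] && continue   # MP3 already exists here
        "$converter" "$f" || echo "failed: $f" >&2
    done
}

# Example (not run automatically): convert_dir ~/feeds
```

Setting CONVERTER=echo gives a dry run that only prints the files that would be converted.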

speakfile is a slightly hacked version of the demo program IBM ships. Unfortunately, /.'s lameness filter doesn't like C++ code. :-(
It's pretty messy C++ hacking on my part, anyway. The Perl program is based on the CPAN module Audio::SoundFile. It's also hacked from a demo script that shipped with the module.
#!/usr/bin/perl
use Audio::SoundFile;
use Audio::SoundFile::Header;

my $BUFFSIZE = 16384;
my $ifile = shift || usage();
my $ofile = shift || usage();
my $buffer;
my $header;

my $reader = new Audio::SoundFile::Reader($ifile, \$header);
$header->{format} = SF_FORMAT_WAV | SF_FORMAT_PCM;
my $writer = new Audio::SoundFile::Writer($ofile, $header);

while (my $length = $reader->bread_pdl(\$buffer, $BUFFSIZE)) {
    $writer->bwrite_pdl($buffer);
}

$reader->close;
$writer->close;
exit(0);

sub usage {
    print <<EOT;
usage: $0 <infile> <outfile>
EOT
    exit(1);
}

mmm. There was indenting in code at one point. Sigh...
Re:comparison to Apple's technology? by aseidl · 2003-03-17 00:56 · Score: 4, Interesting

I'm surprised by how many people (Mac users and otherwise) haven't noticed how long MacOS has come with text-to-speech. It's been included since at least MacOS 7.5, maybe even 7.0 (I was using it on my trusty ol' IIci yesterday). You could use it via SimpleText or even have it speak the text of dialog boxes. The quality of the voices could be better, but they do seem better than Festival. Still, I have to admit it is pretty fun to scare people who don't know about it. One of my friends told me that his mother gets scared if she doesn't click OK or Cancel in a dialog because "those voices are going to come."