Slashdot Mirror


Phoneme Approach For Text-to-Speech in SCIAM

jscribner writes "Scientific American is running a feature on IBM Research's Text-to-Speech technology. It discusses the current state of affairs in this field, and describes IBM's phoneme based 'Supervoices' approach. The IBM site provides a demonstration, allowing users to enter text to be rendered to speech, as well as providing several examples in other languages."

3 of 189 comments

  1. TTS is great by jjohn · · Score: 4, Interesting
    Last year, I started playing with this IBM tech. I thought it would be cool to have RSS feeds read to you in the middle of streaming music. It's kind of do-it-yourself radio. Although I don't have anything to show for that idea, I did make a few songs with it, like Make the Pie Higher, Plug Nickle and Progress.

    mmm. I hope the server can take a slashdotting...

    The TTS interface is C++, but it comes with a program that will compile text into AU files. I wrote the following script to change those AU files into mp3s:

    #!/bin/bash
    # Make a text file a spoken MP3

    if [ -z "$1" ] ;
    then
    echo "usage: $0 <input.txt>" >&2;
    exit 1;
    fi

    # Quote "$1" and "$base" so file names containing spaces survive
    base=`basename "$1" .txt`
    echo "attempting to create $base.mp3"
    /home/jjohn/src/c/viavoice/cmdlinespeak/speakfile "$1"
    writewav.pl temp.au temp.wav
    lame -h temp.wav "$base.mp3"
    rm -f temp.au temp.wav
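
The naming step in the script above can be sketched on its own. This is only an illustration, assuming bash and a standard basename; mp3_name_for is a hypothetical helper that is not part of the original script. It shows why the quoting matters when the text file's path contains spaces.

```shell
#!/bin/bash
# Sketch only: derive the MP3 name from a text file path, as the
# script above does with basename, but with every expansion quoted.

# mp3_name_for is a hypothetical helper name used for illustration.
mp3_name_for() {
    local input=$1
    # basename drops the directory part and the .txt suffix;
    # "--" keeps a path that starts with "-" from being read as an option.
    printf '%s.mp3\n' "$(basename -- "$input" .txt)"
}

# Prints: make the pie higher.mp3
mp3_name_for "/tmp/my notes/make the pie higher.txt"
```

Without the quotes, a path like the one in the example would be split into several arguments before basename ever saw it.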

    speakfile is a slightly hacked version of the demo program IBM ships. Unfortunately, /.'s lameness filter doesn't like C++ code. :-(

    It's pretty messy C++ hacking on my part, anyway. The Perl program is based on the CPAN module Audio::SoundFile. It's also hacked from a demo script that shipped with the module.

    #!/usr/bin/perl
    use Audio::SoundFile;
    use Audio::SoundFile::Header;

    my $BUFFSIZE = 16384;
    my $ifile = shift || usage();
    my $ofile = shift || usage();
    my $buffer;
    my $header;

    my $reader = new Audio::SoundFile::Reader($ifile, \$header);
    $header->{format} = SF_FORMAT_WAV | SF_FORMAT_PCM;
    my $writer = new Audio::SoundFile::Writer($ofile, $header);

    while (my $length = $reader->bread_pdl(\$buffer, $BUFFSIZE)) {
    $writer->bwrite_pdl($buffer);
    }

    $reader->close;
    $writer->close;
    exit(0);

    sub usage {
    print <<EOT;
    usage: $0 <infile> <outfile>
    EOT
    exit(1);
    }

    mmm. There was indenting in code at one point. Sigh...

  2. Re:comparison to Apple's technology? by aseidl · · Score: 4, Interesting

    I'm surprised by how many people (Mac users and otherwise) haven't noticed how long MacOS has come with text to speech. It's been included since at least MacOS 7.5, maybe even 7.0 (I was using it on my trusty ol' IIci yesterday). You could use it via SimpleText or even have it speak the text of dialog boxes. The quality of the voices could be better, but they do seem better than Festival. But, I have to admit it is pretty fun to scare people who don't know about it. One of my friends told me that his mother gets scared if she doesn't click OK or Cancel in a dialog because "those voices are going to come."

  3. Old news by payndz · · Score: 3, Interesting
    Text-to-speech? Come on, this has been around for donkey's years - maybe the computer voice doesn't sound like Majel Barrett yet, but it's hardly new and amazing stuff.

    I want to know what's going on with speech-to-text, and will I be able to dictate rather than type a novel any time soon? (Preferably with some form of intelligent speech recognition, so it doesn't end up with passages like "She, ah... walked, no strode into the room to find, uh, er, dammit, did I say Rob left the tape on the counter or the desk? Oh, bloody hell. Hello? No, I'm not interested in double glazing. How did you get this number anyway? Bye. Where was I? Oh, crap! Computer, pause-")

    --
    You must think in Russian.