Slashdot Mirror


Codec2 — an Open Source, Low-Bandwidth Voice Codec

Bruce Perens writes "Codec2 is an Open Source digital voice codec for low-bandwidth applications, in its first Alpha release. Currently it can encode 3.75 seconds of clear speech in 1050 bytes, and there are opportunities to code in additional compression that will further reduce its bandwidth. The main developer is David Rowe, who also worked on Speex. Originally designed for Amateur Radio, both via sound-card software modems on HF radio and as an alternative to the proprietary voice codec presently used in D-STAR, the codec is probably also useful for telephony at a fraction of current bandwidths. The algorithm is based on papers from the 1980s, and is intended to be unencumbered by valid unexpired patent claims. The license is LGPL2. The project is seeking developers for testing in applications, algorithmic improvement, conversion to fixed-point, and coding to be more suitable for embedded systems."

6 of 179 comments (clear)

  1. what about LATENCY? by Kristopeit,MichaelDa · · Score: 4, Interesting
    why is seemingly the most important aspect of communication technology so often overlooked?

    i assume it's acceptable... but it angers me that someone thought it was relevant to give the exact number of bytes for a seemingly arbitrary 3.5 seconds of audio, but failed to say how long it take to encode that 3.5 seconds of audio, or what average latency can be expected after buffer conditions are met.

  2. Re:Great news by Bruce+Perens · · Score: 4, Interesting

    I think you could cut the sample rate in half and get acceptable performance, but I've not tried. Currently I think it's 25 microsecond frames, and each frame has one set of LSPs and two sets of voicing information so it's interpolated into 12.5 microsecond frames. Those lower bandwidth codecs do 50 microsecond frames. Go forth and hack upon it if you'd like to see. Also, there are some optimizations that are obvious to David and Jean-Marc (and which I barely understand) that haven't been added yet. One is that the LSPs are monotonic and nothing has been done to remove that redundancy. Delta coding or vector quantization might be ways to do that. I understand delta coding but would not be the one to do VQ. Another is that there is a lot of correlation of the LSPs between adjacent frames, so you don't necessarily have to send the entire LSP set every frame. And there is probably lots of other opportunity for compression that I have no concept of.

  3. English only ? by Yvanhoe · · Score: 4, Interesting

    At such high compression rates, one could wonder if the optimizations to transmit clear speech make assumptions about the language used. Does it work well with French ? Arabic ? Chinese ?

    --
    The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
    1. Re:English only ? by Bruce+Perens · · Score: 4, Interesting

      The basic assumptions are based on the mechanics of the vocal tract, and I suspect not high-level enough to differ across languages, but obviously it would be nice to hear from speakers of other languages who test it. We could also use a larger corpus of spoken samples for testing.

  4. Mumble integration ? by Anonymous Coward · · Score: 4, Interesting

    One of the fastest ways to ensure its testing and distribution is to use it in Mumble - the low latency voice chat software (with an iPhone client as well).
    Mumble is typically used by gaming clans for their chat rooms and it Codec2 would be tested in real-life conditions.

    1. Re:Mumble integration ? by Bruce+Perens · · Score: 4, Interesting

      One of the fastest ways to ensure its testing and distribution is to use it in Mumble - the low latency voice chat software (with an iPhone client as well). Mumble is typically used by gaming clans for their chat rooms and it Codec2 would be tested in real-life conditions.

      Is there an existing Mumble developer whom we could get interested in this? It might be that we should take some of the Alpha-isms out of the code first.