Slashdot Mirror


Open Source Codec Encodes Voice Into Only 700 Bits Per Second (rowetel.com)

Longtime Slashdot reader Bruce Perens writes: David Rowe VK5DGR has been working on ultra-low-bandwidth digital voice codecs for years, and his latest quest has been to come up with a digital codec that would compete well with single-sideband modulation used by ham contesters to score the longest-distance communications using HF radio. A new codec records clear, but not hi-fi, voice in 700 bits per second -- that's 88 bytes per second. Connected to an already-existing Open Source digital modem, it might beat SSB. Obviously there are other uses for recording voice at ultra-low-bandwidth. Many smartphones could record your voice for your entire life using their existing storage. A single IP packet could carry 15 seconds of speech. Ultra-low-bandwidth codecs don't help conventional VoIP, though. The payload size for low-latency voice is only a few bytes, and the packet overhead will be at least 10 times that size.
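The figures in the summary can be checked with a little arithmetic. The sketch below is only a back-of-envelope check, not anything from the codec's source: the 700 bit/s rate and the 15-second packet come from the summary, while the 1500-byte Ethernet MTU, the 40 ms packet interval for low-latency VoIP, and the choice of IPv4 + UDP + RTP headers are assumptions made for illustration.

```python
BITRATE_BPS = 700  # codec bit rate quoted in the summary

# 700 bits/s is 87.5 bytes/s, which the summary rounds to 88.
bytes_per_second = BITRATE_BPS / 8

# Payload needed to carry 15 seconds of speech in one packet.
payload_15s_bytes = 15 * bytes_per_second

# Assumption: a typical Ethernet MTU of 1500 bytes.
ETHERNET_MTU_BYTES = 1500
fits_in_one_packet = payload_15s_bytes <= ETHERNET_MTU_BYTES

# Storage for one year of continuous recording, in gigabytes (10**9 bytes).
SECONDS_PER_YEAR = 365 * 24 * 3600
gigabytes_per_year = bytes_per_second * SECONDS_PER_YEAR / 1e9

# Low-latency VoIP case. Assumption: one packet every 40 ms,
# carried over IPv4 (20-byte header) + UDP (8) + RTP (12).
PACKET_INTERVAL_S = 0.040
HEADER_BYTES = 20 + 8 + 12
voip_payload_bytes = bytes_per_second * PACKET_INTERVAL_S
overhead_ratio = HEADER_BYTES / voip_payload_bytes

if __name__ == "__main__":
    print(f"bytes per second:        {bytes_per_second}")
    print(f"15 s payload (bytes):    {payload_15s_bytes}")
    print(f"fits in one 1500 B MTU:  {fits_in_one_packet}")
    print(f"GB per year:             {gigabytes_per_year:.2f}")
    print(f"VoIP payload (bytes):    {voip_payload_bytes:.1f}")
    print(f"header/payload ratio:    {overhead_ratio:.1f}")
```

Under these assumptions the 15-second payload comes to about 1.3 KB, a year of continuous speech to under 3 GB, and the VoIP headers to a bit more than ten times the payload, which is consistent with the claims above.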

8 of 128 comments

  1. Specific to English? by MichaelSmith · · Score: 5, Interesting

    I wonder how it performs on tonal languages like Cantonese.

    1. Re:Specific to English? by Bruce+Perens · · Score: 4, Informative

      I recruited David to work on this because I felt that Amateur Radio operators should not be bound to any locked-down technology but should be able to tinker with all of their technology. At the same time, there is a similar controversy regarding closed codecs on the Internet.

  2. Re:do what now by frovingslosh · · Score: 4, Informative

    The samples don't sound great, and I really wonder how well it does trying to record a conversation rather than one person talking directly into a mic. Still, I would welcome the chance to try an app based on this to see if it could really record your day, although until I can test it I'm a disbeliever.

    --
    I'm an American. I love this country and the freedoms that we used to have.
  3. More than just low storage by Excelcia · · Score: 4, Interesting

    Encoding voice more efficiently has implications far exceeding the amount of storage space required to save it. There's a reason why the article is comparing the new codec to single sideband. When transmitting digital data over radio, it pretty much invariably (nowadays) means some sort of spread spectrum transmission. The fewer bits required per second, the less spectrum you have to spread your signal over, and thus the more concentrated your signal is. A radio transmitter has a fixed power output, so if you are smearing that power over less bandwidth, then you have a stronger signal.
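The fixed-power argument can be put in numbers. This is a rough sketch that assumes the transmitter's power is spread evenly across the occupied bandwidth, so power density is simply power divided by bandwidth. The function name and the example bandwidths are hypothetical and are not measurements of any real modem or of SSB.

```python
import math


def density_gain_db(wide_hz: float, narrow_hz: float) -> float:
    """Gain in power density (dB) when the same transmit power is
    concentrated into narrow_hz instead of wide_hz.

    Assumes a flat spectrum: density = power / bandwidth, so the
    power term cancels and only the bandwidth ratio remains.
    """
    return 10 * math.log10(wide_hz / narrow_hz)


if __name__ == "__main__":
    # Hypothetical example: halving the occupied bandwidth.
    print(f"{density_gain_db(2000, 1000):.2f} dB")
```

Each halving of bandwidth at the same power is worth roughly 3 dB of power density under this simplified model.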

    It is a testament to the amateur radio pioneers of the past that an analog radio transmission mode invented over a hundred years ago is, just now, being possibly rivaled in its efficiency.

  4. Re:Yes, it can! by Bruce+Perens · · Score: 4, Interesting

    Actually, our modems degrade gracefully. The least-protected bits go wrong with low bit-error rates, and the more protected bits survive. It takes a high bit error rate to kill it. So bit errors result in the speech being "off" but not dropping out.

  5. Re: Yes, it can! by Bruce+Perens · · Score: 4, Informative

    It's free software, not for sale.

  6. Re:do what now by Lumpy · · Score: 4, Informative

    It's not for recording.
    It's for giving us voice communication to Mars and back. If you can transmit voice over long distances using less bandwidth, you can add in luxuries like checksums and redundant data, so that the message still arrives intact at an extreme distance where your 10,000 watt transmission is weaker than a dollar store walkie talkie.
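As a toy illustration of "checksums and redundant data", the sketch below appends a CRC-32 to each frame and sends several copies, and the receiver keeps the first copy whose checksum verifies. This is an assumption-laden example, not the scheme used by any actual ham radio modem, which would more likely use proper forward error correction. The names `protect` and `recover` are hypothetical.

```python
import zlib

CRC_BYTES = 4  # size of a CRC-32 checksum


def protect(payload: bytes, copies: int = 3) -> bytes:
    """Append a CRC-32 to the payload and repeat the frame `copies` times."""
    frame = payload + zlib.crc32(payload).to_bytes(CRC_BYTES, "big")
    return frame * copies


def recover(received: bytes, copies: int = 3):
    """Return the payload of the first copy whose CRC-32 verifies.

    Returns None if every copy fails its checksum.
    """
    size = len(received) // copies
    for i in range(copies):
        frame = received[i * size:(i + 1) * size]
        payload, crc = frame[:-CRC_BYTES], frame[-CRC_BYTES:]
        if zlib.crc32(payload).to_bytes(CRC_BYTES, "big") == crc:
            return payload
    return None


if __name__ == "__main__":
    sent = protect(b"hello")
    # Flip every bit of the first byte to simulate noise on the first copy.
    damaged = bytes([sent[0] ^ 0xFF]) + sent[1:]
    print(recover(damaged))
```

The cost is bandwidth: three copies plus checksums more than triple the bits on the air, which is only affordable when the voice payload itself is tiny.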

    Ham radio is where most of the breakthroughs in communication happen. I can see this mode used to allow voice communication with Mars astronauts. We already have PSK31 allowing a ham with 2.5 watts of power to transmit text messages around the globe easily.

    --
    Do not look at laser with remaining good eye.
  7. Re:Close by Bruce+Perens · · Score: 4, Informative

    Lots of people ask about this. If we did pure speech-to-text and text-to-speech, it would take about half the bandwidth but everybody would have the same synthesized voice. Once you start trying to add parameters to the synthesized voice such as pitch, speed, and tonality, those take as much bandwidth as we are using for the entire codec, because they are essentially the same parameters.