Slashdot Mirror


Next-Gen Low-Latency Open Codec Beats HE-AAC

Aldenissin writes "From the Xiph.org developers, Opus is a non-patent-encumbered codec designed for interactive uses that require very low latency, such as VoIP, telepresence, and remote jamming. When they started working on Opus (then known as CELT), they used the slogan 'Why can't your telephone sound as good as your stereo?', and they weren't kidding. Now, test results show that Opus outperforms HE-AAC, one of the strongest (but highest-latency) codecs at this bitrate: it beats two of the most popular and respected encoders for the format on the majority of individual audio samples and receives a higher average score overall. Hydrogenaudio conducted a 64kbit/sec multiformat listening test including Opus, aoTuV Vorbis, two HE-AAC encoders, and a 48kbit/sec AAC-LC low anchor. The test compared 30 diverse samples using the highly sensitive ABC/HR methodology. Opus ran with 22.5ms of total latency, but the codec can go as low as 5ms."

27 of 166 comments

  1. Next level beats by MadAhab · · Score: 3, Funny

    This will be perfect for my next level beats.

    --
    Expanding a vast wasteland since 1996.
    1. Re:Next level beats by Skuto · · Score: 4, Insightful

      Perhaps they could switch to "has not yet been challenged in court for any possible patent infringement". But who would use a codec like that? Besides Google, of course.

      Companies do this all the time. Anyone shipping H.264 has this risk, as the patent pool provides zero guarantee no outside patents will pop up.

      Actually, anyone shipping anything at all has this risk.

      Realistically, it's more like "does not infringe any known patents, or has licenses for them, and is not infringing any other patents that we could find in a patent search".

  2. remote jamming? by mirix · · Score: 4, Informative

    and remote jamming

    Took me a while to figure out they meant in a band. I was wondering how they were going to jam some sort of signal with this codec.

    --
    Sent from my PDP-11
    1. Re:remote jamming? by qpqp · · Score: 2

      Thanks, took me much less time, because of you! We're jammin'...

    2. Re:remote jamming? by glwtta · · Score: 2

      I assumed they meant making preserves in an isolated or inaccessible location.

      --
      sic transit gloria mundi
  3. Re:sorry for being dense, but... by tepples · · Score: 2

    Even back when I used to play games online with voice chat

    Imagine Rock Band with voice chat. Or imagine actually making real music with voice chat.

  4. Re:sorry for being dense, but... by Anaerin · · Score: 4, Insightful

    As mentioned, it's needed for VoIP systems. With a full-duplex system, more than 150ms of lag is audible and noticeably uncomfortable, breaking the flow of conversation (the apparent lag is doubled in a "conversation", because the delay in each direction adds up). For simple half-duplex systems like gaming, more lag is not really noticeable.
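
    A rough sketch of that doubling, in Python. The per-stage delays (codec_ms, network_ms, jitter_buffer_ms) are made-up illustrative values, not measurements; only the idea that the delay is paid once in each direction comes from the comment.

```python
def one_way_delay_ms(codec_ms, network_ms, jitter_buffer_ms):
    """Mouth-to-ear delay in one direction: the sum of each stage's delay."""
    return codec_ms + network_ms + jitter_buffer_ms


def turn_taking_gap_ms(one_way_ms):
    """Extra silence a speaker perceives before hearing a reply.

    The question travels one way and the answer travels back,
    so the one-way delay is paid twice.
    """
    return 2 * one_way_ms


# Hypothetical budgets: same network and jitter buffer, different codec delay.
low_latency = one_way_delay_ms(codec_ms=22.5, network_ms=40, jitter_buffer_ms=40)
high_latency = one_way_delay_ms(codec_ms=100, network_ms=40, jitter_buffer_ms=40)

for name, delay in [("low-latency codec", low_latency),
                    ("high-latency codec", high_latency)]:
    gap = turn_taking_gap_ms(delay)
    print(f"{name}: one-way {delay} ms, turn-taking gap {gap} ms")
```

    With these assumed numbers, the codec delay alone decides whether the one-way figure lands under or over the 150ms mark.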

  5. That's all fine and dandy, but.... by adolf · · Score: 2, Insightful

    Who cares what codec is being used for my VoIP phone at home or on my desk, when anyone I call is still most likely to be connected over the PSTN with g.711 or g.723, or (far worse) a cell phone?

    And don't get me wrong: I want to care; I really do. And maybe I did care, at one point. I was going to build an Asterisk system for home -- I even collected some of the hardware to make it work.

    But I stopped caring when the boy got old enough to properly want a cell phone, the wife got a cell phone, and I had a cell phone. After that, I dropped the home phone line altogether, since it was just a waste of money.

    I have no interest, at this moment, in having any sort of telephony tied to my premises.

    And while I could, I suppose, run some manner of VoIP client on my Droid over cellular, I think that's a complete non-starter at the moment: I had trouble earlier today getting a 64kbps MP3 to stream correctly over 3G Verizon (even though I controlled both ends of the stream), but that was just an inconvenience.

    It'd be a lot more than simply inconvenient if my phone calls were that spotty. I don't care how good it sounds if it doesn't work.

    Is there any good and practical use for this new codec?

    1. Re:That's all fine and dandy, but.... by nog_lorp · · Score: 4, Insightful

      Lol what? You're crazy. I suppose it is never worth inventing a new codec ever, since everyone uses old codecs! /fail argument

    2. Re:That's all fine and dandy, but.... by adolf · · Score: 2

      So, it's something that might be useful for musicians. Maybe.

      100ms of total, round-trip, end-to-end-to-end latency (remember to count both hypothetical DSL connections) is the same as two musicians trying to play together when they are about 56 feet apart. It might be practical, but it doesn't sound very fun for many types of informal "jam"-oriented music: There's a reason the bass player often stands next to the drummer, and it's usually not because he wants more hearing damage.
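
      The 56-foot figure can be checked with a few lines of Python. This is only a sketch: it assumes sound travels at roughly 1125 ft/s (dry air at about room temperature) and that the 100ms is split evenly between the two directions.

```python
SPEED_OF_SOUND_FT_PER_S = 1125.0  # assumed: approximate speed of sound in room-temperature air


def equivalent_distance_ft(round_trip_ms):
    """Distance at which sound through air takes half the round trip to arrive."""
    one_way_s = (round_trip_ms / 2) / 1000.0
    return one_way_s * SPEED_OF_SOUND_FT_PER_S


for rtt_ms in (10, 45, 100):
    print(f"{rtt_ms} ms round trip is like standing "
          f"{equivalent_distance_ft(rtt_ms):.1f} ft apart")
```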

      I just listened to some Beatles (just because of their typical hard-panned stereo separation) with the left channel delayed by 100ms, and found it to be fairly bothersome.

      If I were playing bass with that sort of delay, I'd expect either myself or the drummer to become very annoyed very quickly.

      But at least you answered my question. :)

      Thanks.

    3. Re:That's all fine and dandy, but.... by adolf · · Score: 2

      Massive online game chat: WoW, Mumble/Murmur...

      As if WoW is the most latency-sensitive thing in the world. (It's not.)

      Free WiFi audio/video telephony: Ekiga...

      Ok, sure. It doesn't improve my life at all (with the stated constraints about what I think I should care about), but why not. I guess it does this one thing that folks have already been doing, and has a chance at sounding better in the process.

      (I can't be disagreeable all the time, and hey, at least I learned about a new unpronounceable open-source widget thanks to your suggestion, which I guess should be worth some credit...)

      Radio Amateur digital voice over satellite.

      Can HAMs working amateur satellites actually manage to reliably stuff 64Kbps through that pipe consistently enough to make it useful for realtime voice communications?

      Oh, and the latency...chopping off a few tens-of-milliseconds of latency, which is the best part about this new codec, is pretty well meaningless when working with the RTT of a geosync satellite.
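
      To put numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the best case of a geostationary satellite directly overhead (altitude about 35,786 km) and counts propagation delay only; the 100 ms and 22.5 ms codec delays are the figures quoted elsewhere in this discussion.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
GEO_ALTITUDE_KM = 35_786  # geostationary altitude above the equator


def geo_hop_ms():
    """Minimum ground -> satellite -> ground propagation delay, in milliseconds."""
    return 2 * GEO_ALTITUDE_KM / SPEED_OF_LIGHT_KM_PER_S * 1000


hop_ms = geo_hop_ms()            # one direction of the conversation
round_trip_ms = 2 * hop_ms       # question up and down, answer up and down
codec_saving_ms = 100 - 22.5     # per direction, high-latency vs. low-latency codec

print(f"one hop: {hop_ms:.0f} ms, round trip: {round_trip_ms:.0f} ms")
print(f"codec saving per direction: {codec_saving_ms} ms")
```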

      We've already got codecs that provide good voice audio, at far lower bandwidth than this.

      Digital voice over HF radio.

      No, not at all. There are other codecs which use less bandwidth and provide intelligible voice audio. This one is supposed to be better because it sounds good and has low latency, not because it's particularly bandwidth-efficient. Digital audio over HF is neat, but this isn't the right approach. AMBE might be a good choice for voice audio if it weren't so monetarily expensive.

      GSM would be better for HF, and (IIRC) it is free to use, but even that seems rather impractical for those sorts of minuscule data rates.

      Digital voice over any existing data channel that is already 'full'.

      If it's already 'full', then circuit latency and packet loss will already be a bitch that is better subdued using stuff we've already got. Opus's current claim to fame is that it sounds good and has low latency on good networks, not that it survives broken/overburdened networks (where the latency improvement will be swamped by that of the network itself).

      Digital voice chat and telephony over a LAN, without clogging up the network.

      Show me a modern LAN segment which is alleged to be clogged by voice chat and telephony, and I'll show you both a network admin who just lost his job AND the new BMW M3 that I bought with my new-found fortune. (I'll even let you give it a spin for a few days -- consider it a finder's fee.)

  6. Re:Total Latency by jmv · · Score: 5, Informative

    Yes, 5 to 22.5 ms is the algorithmic delay of the codec. By comparison, codecs like AAC/MP3/Vorbis have more than 100 ms algorithmic delay (you need to give the encoder side more than 100 ms of audio before the decoder side gives you any audio back).
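
    A small Python sketch of what those figures mean in samples and bytes. The 48 kHz sample rate and the split of the delay into "frame length plus 2.5 ms of look-ahead" are assumptions chosen so that the totals match the 5 ms and 22.5 ms quoted above; they are not stated in this thread.

```python
SAMPLE_RATE_HZ = 48_000   # assumed sampling rate
LOOKAHEAD_MS = 2.5        # assumed look-ahead, chosen to reproduce the quoted totals


def samples_per_frame(frame_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of audio samples the encoder must collect for one frame."""
    return int(sample_rate_hz * frame_ms / 1000)


def bytes_per_frame(bitrate_bps, frame_ms):
    """Average compressed size of one frame at a constant bitrate."""
    return bitrate_bps * frame_ms / 1000 / 8


def algorithmic_delay_ms(frame_ms, lookahead_ms=LOOKAHEAD_MS):
    """Audio the encoder needs before the decoder can output anything."""
    return frame_ms + lookahead_ms


for frame_ms in (2.5, 20):
    print(f"{frame_ms} ms frame: {samples_per_frame(frame_ms)} samples, "
          f"{bytes_per_frame(64_000, frame_ms)} bytes at 64 kbit/s, "
          f"{algorithmic_delay_ms(frame_ms)} ms algorithmic delay")
```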

  7. Re:And this 'SILK' codec? by jmv · · Score: 4, Informative

    To be exact, there *are* patents, but they will be available without fee in a way that is compatible with FOSS licences such as the GPL. The main idea behind these patents is that your license terminates if you sue someone by claiming Opus infringes your patents. Almost like a copyleft, but for patents (of course the details are different because copyright != patent).

  8. Re:HE-AAC is worse than LE-AAC in terms of quality by woolpert · · Score: 5, Insightful

    HE-AAC uses SBR to reduce its data footprint. This results in worse reproduction of the source audio than LE-AAC at same bitrate (and often even lower bitrate). The whole deal with HE is that it can maintain good quality at very low bitrate, by giving up accuracy. So far, Apple's LE-AAC encoder in their Core Audio framework is the best choice for digitally non-lossless compression.

    While your rant appears informative if not insightful on its face, it is completely missing the point.

    This is a test of audio codecs at low bitrates.

    I don't know what this "LE-AAC" is you speak of (and rather suspect you don't either) but AAC-LC was actually in this test, as the low anchor.

    At these bitrates (~64kbps) HE-AAC (despite its "low-accuracy" as you put it) is perceptually better sounding than AAC-LC. Lossy audio codecs (even the LE-AAC [sic] encoder in Apple's Core Audio framework you love) can only be judged by how they sound, not how they look. "Accuracy" is not a metric very worthy of discussion.

  9. Technically... by jd · · Score: 2

    ...it can't have been "then known as CELT" since it is a merge of two codecs of which CELT is one and SILK is the other. It's good that it's an IETF standard as that will help some with adoption. It will also help some with getting other implementations. (Hell, Dirac is a great codec for video but because it's not a recognized standard for anything it's not getting used.)

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  10. Re:HE-AAC is worse than LE-AAC in terms of quality by parlancex · · Score: 3, Interesting

    That's the whole idea behind any lossy codec. You're trading mathematical accuracy for psycho-acoustical accuracy; personally, I don't care if the root mean square error is higher, I just need it to sound like the original.
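
    A toy Python illustration of why root mean square error is a poor proxy for how something sounds. It compares a sine tone against (a) the same tone with its polarity flipped, which most listeners cannot tell from the original when heard on its own, and (b) the tone with added noise, which is plainly audible. The tone frequency, sample rate, and noise level are arbitrary choices for the demo.

```python
import math
import random


def rmse(a, b):
    """Root mean square error between two equal-length signals."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


SAMPLE_RATE_HZ = 8000
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ)
        for n in range(SAMPLE_RATE_HZ)]  # one second of a 440 Hz sine

# (a) Polarity flipped: every sample is "wrong", yet it sounds the same alone.
inverted = [-x for x in tone]

# (b) Added white noise: numerically closer to the original, audibly worse.
rng = random.Random(0)  # fixed seed so the demo is repeatable
noisy = [x + rng.uniform(-0.5, 0.5) for x in tone]

print(f"RMSE, polarity flipped: {rmse(tone, inverted):.3f}")
print(f"RMSE, added noise:      {rmse(tone, noisy):.3f}")
```

    The flipped copy scores far worse on RMSE than the noisy one, which is the opposite of how a listener would rank them.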

    Anyway, if this really IS an improvement over HE-AAC, which uses some very advanced techniques, I'll be extremely impressed, and quite pleased that it's patent-free.

  11. Re:And this 'SILK' codec? by jmv · · Score: 3, Informative

    This is the license for the "old" SILK codec. The patent licenses for Opus have nothing to do with that. Please read them:

    Xiph.Org IPR statement: https://datatracker.ietf.org/ipr/1524/
    Broadcom IPR statement: https://datatracker.ietf.org/ipr/1526/
    Skype IPR statement: https://datatracker.ietf.org/ipr/1525/

  12. Not so— by Anonymous Coward · · Score: 3, Informative

    Skype will release their patents under a free software compatible license if the codec is standardized by the IETF: https://datatracker.ietf.org/ipr/1525/

  13. Re:sorry for being dense, but... by parlancex · · Score: 2

    In something like an actual telephone conversation it creates an awkward pause, equal in length to twice the latency, each time a person finishes speaking. High latency codecs also greatly encumber echo cancellation algorithms and hardware, which is extremely important in VoIP as anyone who has had to deal with it would know.

  14. Re:Total Latency by jmv · · Score: 2

    There are "custom modes" that can do that. With those you can go as low as 2.5 ms. The only down side is that you can't switch frame size dynamically when you use these custom modes.

  15. Re:HE-AAC is worse than LE-AAC in terms of quality by woolpert · · Score: 3, Interesting

    The sad thing is it shouldn't be better than HE-AAC. Being low latency does tend to mean one is better at the kind of time-domain issues many find so objectionable, but outside that OPUS is really packing a MUCH smaller toolkit than HE-AAC.

    This is really egg on AAC's face, IMHO, and quite the upset. OPUS is so immature the bitstream isn't even stable yet.

  16. Re:And this 'SILK' codec? by jmv · · Score: 4, Informative

    What makes you say that? If you find a real issue, please raise it -- either on the mailing list: codec@ietf.org, or to me privately (jmvalin@jmvalin.ca). Skype is on the good side on this one. The technology they have contributed is very useful and they're open about resolving any licensing issue.

  17. Re:HE-AAC is worse than LE-AAC in terms of quality by jmv · · Score: 4, Interesting

    If we were talking about a 96 kb/s test, I'd agree with you. But at 64 kb/s, HE-AAC sounds much better than AAC-LC. The guys who organized this test picked the best AAC implementation they could find at the rate the test was run at.

  18. Re:ok but how is dtmf detection? by parlancex · · Score: 3, Interesting

    You do realize that most modern VoIP hardware / software supports out of band DTMF? In fact, the most modern software demands it.
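
    The comment does not say which out-of-band mechanism it has in mind. One widely used one is the RTP "telephone-event" payload (RFC 4733, formerly RFC 2833), where the digit is sent as a small data packet instead of as tones through the audio codec. The Python sketch below packs only the 4-byte event payload, not a full RTP packet; the function name and default volume are illustrative.

```python
import struct

# Event codes for DTMF: 0-9 are the digits, then *, #, and A-D.
EVENT_CODES = {**{str(d): d for d in range(10)},
               "*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15}


def telephone_event_payload(digit, duration, volume=10, end=False):
    """Pack a 4-byte telephone-event payload.

    Layout: event (8 bits) | E bit, R bit, volume (6 bits) | duration (16 bits).
    volume is the tone level in -dBm0 (0..63); duration is in RTP timestamp units.
    """
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    flags_and_volume = (0x80 if end else 0x00) | volume  # R bit left at 0
    return struct.pack("!BBH", EVENT_CODES[digit], flags_and_volume, duration)


# Final packet for the digit "5" after 800 timestamp units (100 ms at 8 kHz).
print(telephone_event_payload("5", duration=800, end=True).hex())
```

    Because the digit travels as data, a lossy speech codec never has to reproduce the two DTMF tones accurately.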

  19. Re:HE-AAC is worse than LE-AAC in terms of quality by Wannabe+Code+Monkey · · Score: 2

    Apple's LE-AAC encoder in their Core Audio framework is the best choice for digitally non-lossless compression

    Yes, but for digitally re-un-non-illossless compression I would go with the Foobar Audio Framework.

    --
    We always knew Comcast was corrupt, here's the proof: http://tech.slashdot.org/comments.pl?sid=1909890&cid=34545432
  20. And that isn't just important online by Sycraft-fu · · Score: 2

    When you are dealing with audio signals in the home, low latency can be needed too. If you are doing something like playing prerecorded video then no, the system can find out the delays of the screen, audio, codecs, etc and insert delays as needed to sync it all up. However not if you are doing something live, like games. That's the reason for stuff like Dolby Digital Live and DTS Interactive. They are made so that you can get low latency encoding so the sound from a game console syncs up with the video.
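
    The "insert delays as needed" step for prerecorded playback amounts to padding every path up to the slowest one. A minimal Python sketch, with hypothetical path names and latencies:

```python
def sync_delays_ms(path_latencies_ms):
    """Extra delay to add to each path so all of them match the slowest one."""
    slowest = max(path_latencies_ms.values())
    return {name: slowest - latency
            for name, latency in path_latencies_ms.items()}


# Hypothetical figures: a slow display pipeline and a fast audio path.
delays = sync_delays_ms({"video": 60, "audio": 15})
print(delays)
```

    This only works when nothing is waiting on the result. For live sources such as games, the added delay is itself the problem, so the only fix is to make each stage faster.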

    It is also important for mobile phones. There's only so much latency you can tolerate in a conversation before things start to sound strange to the people using it. Of course there's already latency from the phone network, so codec latency matters. That is part of the reason why new phone standards aren't using something like AAC to get better sounding audio out of the bandwidth available.

    As such this project has a lot of really cool potential. If it not only offers better per-bit perceptual sound but also is extremely low latency, it can be used in situations the others can't.

  21. Re:HE-AAC is worse than LE-AAC in terms of quality by Anonymous Coward · · Score: 3, Informative

    He was discussing accuracy as being irrelevant because perception is more important in a medium designed to be perceived by a human. You've now apparently converted it to "because fewer people care about accuracy", which was in no way his point. Or: Straw man.