Next-Gen Low-Latency Open Codec Beats HE-AAC
Aldenissin writes "From the Xiph.org developers, Opus is a non-patent-encumbered codec designed for interactive uses that require very low latency, such as VoIP, telepresence, and remote jamming. When they started working on Opus (then known as CELT), they used the slogan 'Why can't your telephone sound as good as your stereo?', and they weren't kidding. Now, test results show how Opus performs against HE-AAC, one of the strongest (but highest-latency) codecs at this bitrate: it bests two of the most popular and respected encoders for the format on the majority of individual audio samples, and receives a higher average score overall. Hydrogenaudio conducted a 64kbit/sec multiformat listening test including Opus, aoTuV Vorbis, two HE-AAC encoders, and a 48kbit/sec AAC-LC low anchor, comparing 30 diverse samples using the highly sensitive ABC/HR methodology. Opus was running with 22.5ms of total latency, but the codec can go as low as 5ms."
This will be perfect for my next level beats.
Expanding a vast wasteland since 1996.
and remote jamming
Took me a while to figure out they meant in a band. I was wondering how they were going to jam some sort of signal with this codec.
Sent from my PDP-11
Even back when I used to play games online with voice chat
Imagine Rock Band with voice chat. Or imagine actually making real music with voice chat.
As mentioned, it's needed for VoIP systems. With a full-duplex system, more than 150ms of lag is audible and noticeably uncomfortable, breaking the flow of conversation (the apparent lag is doubled in a conversation, because the delays in each direction add up). For simple half-duplex systems like gaming, more lag is not really noticeable.
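To put rough numbers on that, here is a minimal Python sketch of the arithmetic. The 150 ms limit is the figure from the comment above, and applying it to the doubled (round-trip) lag is just one reading of that comment. The network and buffering delays and the function names are made-up assumptions, not measurements.

```python
# Threshold quoted in the comment above; lag beyond this feels uncomfortable.
COMFORT_LIMIT_MS = 150.0

def one_way_delay_ms(codec_ms, network_ms, buffering_ms):
    """Mouth-to-ear delay in one direction (all inputs are assumed values)."""
    return codec_ms + network_ms + buffering_ms

def apparent_lag_ms(delay_a_to_b_ms, delay_b_to_a_ms):
    """Gap a speaker notices before hearing a reply: both directions add up."""
    return delay_a_to_b_ms + delay_b_to_a_ms

def is_comfortable(lag_ms, limit_ms=COMFORT_LIMIT_MS):
    """True if the apparent lag stays within the comfort limit."""
    return lag_ms <= limit_ms

# Hypothetical call: 30 ms of network delay and 20 ms of buffering each way.
low = one_way_delay_ms(22.5, 30.0, 20.0)    # 22.5 ms codec delay, as in the summary
high = one_way_delay_ms(100.0, 30.0, 20.0)  # a codec with 100 ms of algorithmic delay

for label, one_way in [("low-delay codec", low), ("high-delay codec", high)]:
    lag = apparent_lag_ms(one_way, one_way)
    print(f"{label}: {lag} ms apparent lag, comfortable={is_comfortable(lag)}")
```

With these assumed figures the low-delay call lands just under the limit, while the high-delay call is about twice over it on the same network.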
Who cares what codec is being used for my VoIP phone at home or on my desk, when anyone I call is still most likely to be connected over the PSTN with G.711 or G.723, or (far worse) a cell phone?
And don't get me wrong: I want to care; I really do. And maybe I did care, at one point. I was going to build an Asterisk system for home -- I even collected some of the hardware to make it work.
But I stopped caring when the boy got old enough to properly want a cell phone, the wife got a cell phone, and I had a cell phone. After that, I dropped the home phone line altogether, since it was just a waste of money.
I have no interest, at this moment, in having any sort of telephony tied to my premises.
And while I could, I suppose, run some manner of VoIP client on my Droid over cellular, I think that's a complete non-starter at the moment: I had trouble earlier today getting a 64kbps MP3 to stream correctly over 3G Verizon (even though I controlled both ends of the stream), but that was just an inconvenience.
It'd be a lot more than simply inconvenient if my phone calls were that spotty. I don't care how good it sounds if it doesn't work.
Is there any good and practical use for this new codec?
Kid-proof tablet..
Yes, 5 to 22.5 ms is the algorithmic delay of the codec. By comparison, codecs like AAC/MP3/Vorbis have more than 100 ms algorithmic delay (you need to give the encoder side more than 100 ms of audio before the decoder side gives you any audio back).
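For a sense of scale, the delays quoted above can be turned into sample counts. This is only an illustrative sketch: the 48 kHz sampling rate and the function name are assumptions, and the 100 ms entry stands in for the "more than 100 ms" class of codecs rather than any specific one.

```python
SAMPLE_RATE_HZ = 48_000  # assumed sampling rate for this illustration

def delay_in_samples(delay_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Samples the encoder must be fed before the decoder can output any audio."""
    return round(delay_ms * sample_rate_hz / 1000)

delays_ms = [
    ("Opus, lowest quoted", 5.0),
    ("Opus, as tested", 22.5),
    ("codec with 100 ms delay", 100.0),
]

for name, delay_ms in delays_ms:
    print(f"{name}: {delay_ms} ms = {delay_in_samples(delay_ms)} samples")
```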
Opus: the Swiss army knife of audio codec
To be exact, there *are* patents, but they will be available without fee in a way that is compatible with FOSS licences such as the GPL. The main idea behind these patents is that your license terminates if you sue someone by claiming Opus infringes your patents. Almost like a copyleft, but for patents (of course the details are different because copyright != patent).
While your rant appears informative if not insightful on its face, it is completely missing the point.
This is a test of audio codecs at low bitrates.
I don't know what this "LE-AAC" is you speak of (and rather suspect you don't either) but AAC-LC was actually in this test, as the low anchor.
At these bitrates (~64kbps) HE-AAC (despite its "low-accuracy" as you put it) is perceptually better sounding than AAC-LC. Lossy audio codecs (even the LE-AAC [sic] encoder in Apple's Core Audio framework you love) can only be judged by how they sound, not how they look. "Accuracy" is not a metric very worthy of discussion.
...it can't have been "then known as CELT" since it is a merge of two codecs of which CELT is one and SILK is the other. It's good that it's an IETF standard as that will help some with adoption. It will also help some with getting other implementations. (Hell, Dirac is a great codec for video but because it's not a recognized standard for anything it's not getting used.)
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
That's the whole idea behind any lossy codec. You're trading mathematical accuracy for psycho-acoustical accuracy; personally, I don't care if the root mean square error is higher, I just need it to sound like the original.
Anyway, if this really IS an improvement over HE-AAC, which uses some very sophisticated techniques, I'll be extremely impressed, and quite pleased that it's patent-free.
This is the license for the "old" SILK codec. The patent licenses for Opus have nothing to do with that. Please read them:
Xiph.Org IPR statement: https://datatracker.ietf.org/ipr/1524/
Broadcom IPR statement: https://datatracker.ietf.org/ipr/1526/
Skype IPR statement: https://datatracker.ietf.org/ipr/1525/
Skype will release their patents under a free software compatible license if the codec is standardized by the IETF: https://datatracker.ietf.org/ipr/1525/
In something like an actual telephone conversation, it creates an awkward pause, equal in length to twice the latency, each time a person finishes speaking. High-latency codecs also greatly encumber echo cancellation algorithms and hardware, which is extremely important in VoIP, as anyone who has had to deal with it would know.
There are "custom modes" that can do that. With those you can go as low as 2.5 ms. The only downside is that you can't switch frame size dynamically when you use these custom modes.
The sad thing is it shouldn't be better than HE-AAC. Being low latency does tend to mean a codec is better at the kind of time-domain issues many find so objectionable, but outside that Opus is really packing a MUCH smaller toolkit than HE-AAC.
This is really egg on AAC's face, IMHO, and quite the upset. Opus is so immature that the bitstream isn't even stable yet.
What makes you say that? If you find a real issue, please raise it -- either on the mailing list: codec@ietf.org, or to me privately (jmvalin@jmvalin.ca). Skype is on the good side on this one. The technology they have contributed is very useful and they're open about resolving any licensing issue.
If we were talking about a 96 kb/s test, I'd agree with you. But at 64 kb/s, HE-AAC sounds much better than AAC-LC. The guys who organized this test picked the best AAC implementation they could find at the rate the test was run at.
You do realize that most modern VoIP hardware / software supports out of band DTMF? In fact, the most modern software demands it.
Yes, but for digitally re-un-non-illossless compression I would go with the Foobar Audio Framework.
We always knew Comcast was corrupt, here's the proof: http://tech.slashdot.org/comments.pl?sid=1909890&cid=34545432
When you are dealing with audio signals in the home, low latency can be needed too. If you are doing something like playing prerecorded video then no, the system can find out the delays of the screen, audio, codecs, etc. and insert delays as needed to sync it all up. That doesn't work if you are doing something live, like games. That's the reason for stuff like Dolby Digital Live and DTS Interactive. They are made so that you can get low-latency encoding, so the sound from a game console syncs up with the video.
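The "insert delays as needed" part can be sketched in a few lines of Python. This is an illustration under assumptions, not how any particular receiver does it: the function name and all the latency figures are hypothetical.

```python
def sync_delays_ms(video_path_ms, audio_path_ms):
    """Return (extra_video_delay, extra_audio_delay) so both paths take equally long.

    Only the faster path is held back; latency can be added but never removed.
    """
    target = max(video_path_ms, audio_path_ms)
    return target - video_path_ms, target - audio_path_ms

# Hypothetical figures: a display with 60 ms of processing in both cases.
# Case 1: audio chain of 110 ms (high-delay encoder plus receiver).
print(sync_delays_ms(60.0, 110.0))  # video has to be held back
# Case 2: audio chain of 15 ms (low-delay encoder plus receiver).
print(sync_delays_ms(60.0, 15.0))   # audio has to be held back
```

For prerecorded video, holding the picture back costs nothing. For a game, any delay added to the picture is felt as input lag, so the only real fix is to keep the audio path from being the slower one in the first place.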
It is also important for mobile phones. There's only so much latency you can tolerate in a conversation before things start to sound strange to the people using it. Of course there's already latency from the phone network, so codec latency matters. That is part of the reason why new phone standards aren't using something like AAC to get better sounding audio out of the bandwidth available.
As such, this project has a lot of really cool potential. If it not only offers better per-bit perceptual sound but is also extremely low latency, it can be used in situations the others can't.
He was discussing accuracy as being irrelevant because perception is more important in a medium designed to be perceived by a human. You've now apparently converted it to "because fewer people care about accuracy", which was in no way his point. Or: Straw man.