Next-Next Generation Video: Introducing Daala
An anonymous reader sends this excerpt from a post by Xiph.org's Monty Montgomery:
"Xiph.Org has been working on Daala, a new video codec for some time now, though Opus work had overshadowed it until just recently. With Opus finalized and much of the mop-up work well in hand, Daala development has taken center stage. I've started work on 'demo' pages for Daala, just like I've done demos of other Xiph development projects. Daala aims to be rather different from other video codecs (and it's the first from-scratch design attempt in a while), so the first few demo pages are going to be mostly concerned with what's new and different in Daala. I've finished the first 'demo' page (about Daala's lapped transforms), so if you're interested in video coding technology, go have a look!"
Daala? That will play well in test/focus groups - I'm certain of it.
It's the submarine patents that are the bigger worry. Since this codec is based on work that's come from some academic research papers, one can imagine a sufficiently wealthy litigation-mad megalocorp paying developers to stay on top of research and file patent applications citing obvious implementation details and then keeping them under the surface until it's sufficiently advantageous to allow them to surface. This process is 100% the opposite of the intention of the Constitutional provision for IP.
BTW, the posted whitepaper is really nicely done - good job Xiph team. In a free world everybody would rejoice and be happy for your efforts.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
If you want a good, solid niche for this (always helps), then:
1 - support a reasonably efficient lossless mode (several others have this, but it is always useful)
2 - support ALPHA channels (32-bit, RGBA, YUVA) - this is not trivial but very worthwhile; basically ZERO of the modern codecs support this.
Alpha channels are a requirement for many composition/editing/video production workflows, and yet the supporting codecs are old, clunky,
and painful to use.
Good alpha channel support is not trivial, but it is usually not a major effort either.
For extra points, support lossless on alpha and lossy on the other channels; that is a VERY good option for many workflows.
Given alpha and lossless, you are pretty much guaranteed professional users.
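To make the "lossless alpha, lossy color" idea concrete, here is a toy sketch in Python. It is not how Daala or any real codec works: the function names and the quantization step are made up for illustration, and uniform quantization merely stands in for whatever lossy coding a real codec would apply to the color channels.

```python
# Toy illustration only: hypothetical helpers, not part of any real codec.
STEP = 8  # assumed quantization step for the lossy color channels

def encode_pixel(r, g, b, a, step=STEP):
    """Quantize R, G, B (lossy) and store alpha verbatim (lossless)."""
    quantize = lambda v: (v + step // 2) // step
    return (quantize(r), quantize(g), quantize(b), a)

def decode_pixel(qr, qg, qb, a, step=STEP):
    """Rebuild approximate color; alpha comes back bit-exact."""
    restore = lambda q: min(255, q * step)
    return (restore(qr), restore(qg), restore(qb), a)

pixels = [(200, 13, 77, 0), (255, 255, 255, 128), (3, 90, 181, 255)]
decoded = [decode_pixel(*encode_pixel(*p)) for p in pixels]

for original, result in zip(pixels, decoded):
    print(original, "->", result)
```

The color values come back within half a quantization step of the originals, while the alpha values survive unchanged, which is the property compositing workflows care about.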
I'm sure it's a worthy project and no doubt based on very solid maths and engineering. However, I'm also sure it'll be forgotten about by this time next week. There are too many codecs fighting it out already - yet another one is just a bit more noise in the background. Sorry, but that's just the way it is.
I'm a bit worried about the big focus on "blocks" for such a video codec. Originally, the blocks used in the JPEG image coder were put there to make sure that you could stream-encode images using reasonably cheap silicon back in the eighties. No one really wants the blocks; they were a necessary limitation. Using the same algorithm as JPEG but removing the blocks gives a serious quality boost.
This codec will never run on hardware that can't handle more than 16 x 16 pixels at once. The lowest-spec devices that will encode these frames will be hand-held cameras, which will have more than enough RAM to buffer at least two full frames, and use a small FPGA for encoding/decoding. Everything else will be decoded by a GPU directly to the framebuffer, and likely encoded by the same GPU. Even server farms have these for processing media.
There's also no issue with streaming as far as I know. Both DCT and wavelet-based coders can packetize the important bits in a frame first, and the less important bits later, so that a slow connection can still decode a degraded image even if not all bits are received. This works without splitting the image into blocks.
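A rough way to see the "important bits first" point is the Python sketch below. It runs an orthonormal DCT over a whole row of samples (no blocks), sends coefficients lowest-frequency first, and decodes from whatever prefix has arrived. Treating low frequencies as "the important bits" is a simplifying assumption, the sample row is made up, and all function names are hypothetical; real progressive coders are considerably more elaborate.

```python
import math

def dct(signal):
    """Orthonormal DCT-II over the whole signal (no block splitting)."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(signal)))
    return out

def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

def decode_prefix(coeffs, received):
    """Decode using only the first `received` coefficients; the rest count as zero."""
    partial = coeffs[:received] + [0.0] * (len(coeffs) - received)
    return idct(partial)

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Made-up row of pixel values: a ramp with a few small bumps.
row = [float(16 + 3 * i + (5 if i % 7 == 0 else 0)) for i in range(32)]
coeffs = dct(row)

received_counts = (1, 4, 16, 32)
errors = [rms_error(row, decode_prefix(coeffs, r)) for r in received_counts]
for r, e in zip(received_counts, errors):
    print(f"{r:2d} of {len(coeffs)} coefficients received: RMS error {e:.3f}")
```

Because the transform is orthonormal, every extra coefficient that arrives can only lower the reconstruction error, so a receiver on a slow link gets a degraded but usable row that sharpens as more data comes in.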
c++;
I'm holding out hope that this codec supports subtitles as filters, so we can bake them in without actually baking them in, if you catch my drift. One of the biggest challenges I've faced when working with video encoding is that container formats are notoriously unsupported across the entire spectrum of video players.
The issue is that whatever inconsistencies you are hitting won't be addressed by shuffling the subtitle track awkwardly into the video stream somehow. A player that can't interpret an SSA/ASS stream won't suddenly be able to render it because it comes in the video part of the container. This just moves the problem from one part to another, and does so without any technical benefit.
If the meat of your gripe is that a lot of things do not understand Matroska, and you are really thinking that such a trick will make .mp4 serviceable, that would be incorrect too. You can already put SSA or VobSub into mp4, but because it isn't part of the mp4 spec, many .mp4 players ignore the track. The exact same thing would happen if Daala somehow, hypothetically, had subtitles in it: Daala in .mp4 would be something that those players would just fail to play.
XML is like violence. If it doesn't solve the problem, use more.
Wow, where do I get my drive?
Huh, the lapped transform is not quite "beyond" next-gen codecs like HEVC. In fact, the [over]lapped transform is already part of the WMV9/VC-1 codec, and it didn't really have a significant impact on coding efficiency (it does tend to look more blurry than blocky at very high quantization, but both cases look like crap at these levels).
Help! I am a self-aware entity trapped in an abstract function!
I think AC's point was that by moving subtitles from a container thing to a stream thing (but still a distinct layer you can enable or disable) the world would be a better place, because codec support is far more common than container support. There are plenty of players that support h.264, but only in an mp4 container, not an mkv container, and not, I suspect, for any technical reason, just marketing requirements written that way.
If you had to support subtitles to claim to support the codec, it would be a win - not because of technical benefit but marketing benefit. Having the video "just fail to play" if the player couldn't do it right would be a huge win.
Socialism: a lie told by totalitarians and believed by fools.
That just creates another layer of container.
I already have to serve 2 versions of a video to make sure everyone can see it on the web. It was not obvious to me that we needed another codec.