Technical Objections To the Ogg Container Format
E1ven writes "The Ogg container format is being promoted by the Xiph Foundation for use with its Vorbis and Theora codecs. Unfortunately, a number of technical shortcomings in the format render it ill-suited to most, if not all, use cases. This article examines the most severe of these flaws."
In related news, I know technical shortcomings in hosting an article on hardwarebug.org...
Wasn't us. We don't read the articles before posting.
Google cache link: http://209.85.229.132/search?q=cache:http://hardwarebug.org/2010/03/03/ogg-objections/&hl=en&strip=1
And then over here in actuality (rather than the conversation), there is Youtube. And Netflix. And Hulu. And so on.
Nerd rage is the funniest rage.
It's not a selling point, it's a starting point. It's a sine qua non. For an application like video on the Web, nothing non-free can even enter the conversation.
Says who? XanC? Any format (and its software requirements) can succeed as long as the users will put up with it.
RealPlayer did very well for many years (say 1995-2000).
Apple Quicktime is used on many sites.
And of course, there's Adobe Flash.
To simply say that "nothing non-free can even enter the conversation" is ridiculous. Are your clothes free or open source? Your car? Your house? Your shampoo, your radio, your computer's processor, your keyboard?
Companies can make excellent closed-source products. Communities can make excellent open-source products.
Battlemaster--Game with friends in medival realms
Because, in general, you don't only want a video stream. You want to be able to bundle multiple streams together (usually of different types, eg an audio stream and a video stream). Vorbis, for example, is audio. Theora would be video. Ogg (the container) exists to bundle them together into a single file, so you can have a movie and sound with it.
This whole thing is really about bad blood between Xiph and the mplayer folks. Once, long ago, I made disparaging remarks about a particular mplayer developer's extensive collection of ass hats, and they declared war. This stopped being about facts or reason years ago. Here's the last blog thread that got completely hijacked by the anti-Ogg container wingnuts. It's a hell of a read:
http://blog.gingertech.net/2010/02/20/googles-challenges-of-freeing-vp8/
So, rehashing this yet again: The Anti-Ogg bullet points [Not going to bother with complete sentences, because I've wasted too much typing on this recently]:
1) A few of the mplayer/x264 hackers are right pissed that Ogg and Theora are getting all this attention when x264 is so obviously superior. That simply cannot stand. Since only America has patents and there are no computers there anyway, nobody should have to worry about them. Stick it to The Man! (How very ironic, Xiph being considered 'The Man' by folks contributing to an h264 encoder).
2) Xiph should immediately drop Ogg for [insert container here], breaking millions of hardware decoders and hundreds of millions of software decoders:
a) the [patented] mp4/MOV container is one suggestion they actually make seriously. Never mind adding 'willful infringement' to breaking the entire installed software/hardware base, this choice would totally redeem Xiph in their eyes. The benefit: by their own figures, it would reduce container overhead from .7% to .3%.
(Except that number is wrong. I found later that DonDiego screwed up his mp4 overhead figures at the link above; I had simply assumed he got his container numbers right. The mp4 file in his example has almost identical container overhead to the Ogg, a shade under 1%. His demultiplexed mpeg audio and video had framing in them, so it made it appear the mp4 container overhead was much smaller when he subtracted their file sizes.)
b) OK, mp4 is patented and no better, fine, Xiph should have just used Matroska from the beginning. Despite the fact that Ogg and Vorbis predated it by about five years (also mkv's not been able to interleave until just recently, which == no streaming). This is not to say you can't put Theora and Vorbis in Matroska. It's even a good idea! I've come to like MKV. But for streaming, Ogg is still much easier to deal with. Ogg was designed to stream, mkv was not.
c) OK, so, mp4 and Matroska are right out for streaming, Xiph should use Nut, which is the system they designed. Nut came ten years after Ogg was already widespread. And looks almost exactly like Ogg. Which is not to say there aren't things about it I like that improved on the Ogg approach. Eg, the packet length encoding is better. It has a conditional checksum coverage feature I had never considered, etc. At some point we'll make those changes when that wouldn't mean completely abandoning any chance we have at adoption just to save a fraction of a percent and add... no new features.
d) but.. but.. even FLV is better! OK, at this point I can't even entertain the arguement.
3) OK, maybe not adopt another container, but Xiph should immediately improve/change Ogg for, breaking millions of hardware decoders and hundreds of millions of software decoders for a 'better' implementation that won't actually give users any features they don't have now. FOSS need _tools_, not us wasting time overoptimizing something they couldn't care less about.
3) 64 bit timestamp! OMG, waste! Wait, mov/mp4 uses 64 bit stamps... Also, plenty of things in Ogg use a full byte instead of one bit because the container assumes octet alignment. Alignment makes it much faster/easier to deal with (you don't need a bitpacker to read pages, and you don't have to repack packet data to embed it into the page). Remember, all the completely unacc
Your objective is to Armchair engineers? Ok, well I'm not an armchair engineer. I've written my own Ogg/Vorbis decoder from scratch in the past (here). I've worked on codecs for about 10 years. I'm a fan of Vorbis and Theora, but Ogg needs to die a horrible death.
Ogg was by far the most bug-inducing part of the code. It's just AWFUL. It's ill-designed. It's incredibly complicated. It's inherently inefficient (copy sometimes required).
In short, it's the worst container format I've used in any serious application, and I've used pretty much all the common ones.
The irony of what you're saying, is that actually Ogg is what you'd end up with if an armchair engineer designed an audio codec container from scratch.
I sadly have to agree, and I've voiced the same objections for a long time. It really is like he tells it: it's just bad at everything it was intended to achieve. It's a source of bugs, it's horrendously complicated to support, and it's horrendously inefficient at anything but audio (and even then, not so good).
It seems to me, most of what went wrong was trying to support concatenation of Ogg streams. This is a nice idea, but actually quite a rare case. It's also incredibly naive for the specification document to request that Ogg implementation detect this. What, I'm supposed to scan the entire file in case that happens? No. I'll just not be compliant to that, thank you very much.
I even wrote my own Ogg/Vorbis decoder from scratch a while back (and dabble every now and then), and found Ogg to be a never-cooling, never-extinguishing steaming pile of hippo crap left over from consuming a dog. It just made everything so difficult to do. Seeking a stream involves divide-and-conquer - not necessarily a bad thing, but when you have huge streams the number of seeks can be bad. Not to mention if your stream has an endpoint the other side of the Atlantic Ocean. Why oh why did they pick timestamps being at the END of a page and indicating the output byte count produced by the END of that page? That little detail alone probably cost me days of debug.
I almost gave up at one point and went to a container format of my own which would have worked much better. Header: 'CONTAINER v1'. Packet: 'MAGIC', 4 byte Length, 4 byte Output pos. Job done. The sad fact is, that's easier than Ogg, smaller than Ogg (unless you're talking really low bit rate), and does entirely the job of Ogg without the complexity.
I'm probably going to add a Matroska container to my codec just to see how easy they are to produce. The spec looks fantastic, but the devil's always in the details - although seeing the praise on various (engineer) forums, it looks like the way to go.
So, Ogg, please die. We need you to get out of the way.
I'll do some analysis for you:
Generalities/codec mapping
The complaint is that there's no up-front header declaring all the streams contained. This is actually absurd - in theory you need to scan the entire file in case someone's just concatenated a video file with an audio file. This was, also absurdly, one of the aims of the Ogg container spec: concatenation. It's awesome to ask implementations to do this.
Overhead
One of Ogg's aims was to try to be less than 1% of the total stream space. It does achieve that, but the 'lacing values' end up looking pretty stupid for anything with large packets. It's like the article says: you end up with long strings of '255' summing up to 32-64KB packets, and hey just for extra complexity's sake, you'll have to split them across multiple not-quite-64KB pages. And then figure out where in that mess you're supposed to stick a timestamp: and here's a hint, you first page in that sequence has timestamp 0xffffffff which is nice if you randomly seeked to that place to find a position. God, what a mess that is to implement.
Then there's decode CPU overhead: the above basically means you end up copying the bitstream, which is a significant few percent overhead when you're talking about video.
Latency
You didn't understand his point. The latency is inherent in Ogg due to the large pages (not packets) required to reduce its size overhead, and in the position of the CRC (at the front of the page rather than the end). Reducing the page size makes the page headers start taking significant percentages of size if it's a low bit rate stream, e.g internet audio.
Random Access
Try pre-caching a 2GB video file. Or try pre-caching a 2GB video stream coming off the internet where the other end of the pipe is the other side of the world. Random access in these two realistic cases (if you'll admit that) requires a look up table, and it's precisely why many containers DO.
Complexity
The lacing values crossing pages, packets crossing pages, position of CRC, position of timestamp between packets/pages especially when cross-page, timestamps between logical streams (elementary streams), and other oddities/idiocies all ADD UP to make it a bloody mess to deal with. You end up just making copies of packets out of the stream, which is inefficient. In fact, that's exactly what the official Xiph codecs do: they make ugly copies. On real world MP3 players (and I've worked on some) that accounts for about 10% of your battery play time right there. I kid you not.
What this guy is expressing is what everyone who's worked on the Ogg container format itself has found out: it's just BAD at EVERYTHING. It needs replacing with something that doesn't suck, and there are free/open alternatives around. Maybe Vorbis 2 should switch container.