Ogg Format Accusations Refuted
SergeyKurdakov sends in a followup to our discussion a couple of months ago on purported shortcomings of the Ogg format. The inventor of the format, Monty "xiphmont" Montgomery of the Xiph.Org Foundation, now refutes those objections in detail, with the introduction: "Earnest falsehoods left unchallenged risk being accepted as fact." The refutation has another advantage besides authoritativeness: it's far better written than the attack.
Summary so far:
Many of the complaints leveled against Ogg were not about its technical merits, but about its inadequate documentation -- a feature Matroska shares. Other complaints were about features of Ogg (such as mappings) which nearly every other container format has as well. ... I've only gotten about a quarter of the way through, so far.
... the last time we discussed this, didn't the consensus eventually become that ogg isn't a fun container to work with, despite the fact that the guy who wrote the rant about it was a moron for wanting to trim headers that contribute fractions of percents to the overall size of files? I know I personally have worked with ogg, and it was a pain in the ass, mostly because (as the author of the format admits) the documentation blows.
Nearly every other container format+codec has exactly two bits that are codec dependent: an identifier (e.g. 'XVID' or "V_MPEG4/ISO/AVC" or a number) and binary private data/codec-specific init data/whatever you want to call it. Some codecs in some containers additionally specify which bitstream format is used, when the codec has more than one possible (H.264).
Timestamps, dimensions, aspect ratio, framerate, samplerate, etc. are stored in codec-independent ways in the container.
Ogg is not like that at all. The only thing it stores in a codec-independent manner is framing. Every other piece of information you might expect a container to have is stored in a codec-dependent manner. Even metadata!
I have no fucking clue why the creator does not see this as the problem that it is for everyone that tries to work with ogg.
My rant with Ogg is not so much the minute details of the format itself but that it works badly in a few common real world cases:
I know it's all been said before, but these are pretty common cases and Ogg isn't great when you have to deal with them. Everything else is nit-picking. I'm not a fan of the minute details of the format either, to be honest, but the above are real world examples of where it falls a little short. I should add that none of these issues make it unusable in any of those situations: just annoying.
And you're ignoring the problem that with ogg you have to hunt down and read the spec of every single codec that you want to implement demuxing support for, and that it is impossible to have, say, a generic lightweight file analyzer that tells you duration, codecs used, metadata, samplerate, framerate, etc.
From the article:
"This is commonly asserted by detractors, but a combination of false and missing the point.
Ogg transport is based entirely on the page structure primitive, described accurately above. There are no other structures in the container transport itself. Higher level structures are built out of pages, not built into them. All Ogg streams conform to this page structure and all Ogg streams are parseable and demuxable without knowing anything about the codec. "Drop the needle" anywhere in an Ogg stream and start demuxing; you get the codec data out without knowing anything about the codec. You possibly won't know what exactly to do with that data without the codec mapping and the data is possibly useless without the codec anyway, but that's true of every container.
To avoid being accused of sidestepping the issue, I posit that the actual [if unstated] objection is that the Ogg container does not fully specify the granule position in the transport specification. Beyond a few requirements, a codec mapping defines the granule position spec for that codec's streams, not the Ogg spec. In theory, this would mean that without codec knowledge or some other place to find the granule position definition, a decoder missing the codec for a given stream would not be able to determine the timestamp on the stream that it is not capable of decoding anyway. In practice, the granule position mapping does in fact exist in the stream metadata within the Skeleton header[7] (as it would be in Matroska or NUT). Additionally, the Ogg design allows implementations to ignore the pretty design theory and just do things the way other containers do by building granule position calculation into the mux implementation.
There's specific considered reasons for the granulepos design which take some space to explain accurately. Because Mr. Rullgard also wrote a lengthy diatribe against Ogg timestamping[8], I'll leave the explanation for there and link to it here when my response to the other article is live."
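For anyone who wants to check the "drop the needle" claim, here is a minimal Python sketch of page-level demuxing. It assumes the 27-byte page header layout from RFC 3533 (capture pattern `OggS`, version, header type, 64-bit granule position, serial number, sequence number, CRC, segment count, then the segment table). The names `find_page` and `parse_page_header` are mine, not from libogg, and the sketch skips the CRC check that a real demuxer would use to reject false sync matches.

```python
import struct

CAPTURE = b"OggS"
FIXED_HEADER_LEN = 27  # bytes before the segment table


def find_page(buf, start=0):
    """Return the offset of the next capture pattern at or after start, or -1."""
    return buf.find(CAPTURE, start)


def parse_page_header(buf, pos):
    """Parse the page header at buf[pos]; needs no knowledge of the codec."""
    if buf[pos:pos + 4] != CAPTURE:
        raise ValueError("no capture pattern at this offset")
    # All multi-byte fields are little-endian; granulepos is signed (-1 = unset).
    version, header_type, granulepos, serial, seq, crc, nsegs = struct.unpack_from(
        "<BBqIIIB", buf, pos + 4
    )
    segment_table = buf[pos + FIXED_HEADER_LEN:pos + FIXED_HEADER_LEN + nsegs]
    return {
        "version": version,
        "header_type": header_type,
        "granulepos": granulepos,  # raw value; its meaning is codec-defined
        "serial": serial,
        "sequence": seq,
        "crc": crc,  # not verified in this sketch
        "header_len": FIXED_HEADER_LEN + nsegs,
        "body_len": sum(segment_table),  # each table entry is one lacing value
    }
```

Note that `granulepos` comes back as a raw integer. Turning it into a timestamp is exactly the part that needs the codec mapping (or a Skeleton stream), which is what the rest of this thread is arguing about.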
Too bad that in practice, I've seen a skeleton header maybe once. And anything optional is guaranteed to be missing in many cases. Thus to demux a new codec you still have to find the codec spec, find the ogg mapping, write the granule demangler, write a parser for the codec headers, etc. instead of adding a single entry to a table like you would for sane containers.
I think this speaks to your own inexperience more than anything else. Here's an ogg video with a Skeleton stream:
http://videos.videoonwikipedia.org/video/275/cell-phone-engineerguyogv
You can find many more with Skeleton streams at http://videos.videoonwikipedia.org or http://openvideo.dailymotion.com or http://www.archive.org or many other sites. I can only conclude that you are not very knowledgeable about ogg usage in practice.
That's because MPEG Transport Streams have an easily-accessible Presentation Time Stamp (PTS) in each GOP header, and it's reasonably easy to calculate the increment between PTSs (which will vary with framerate). The simplistic explanation is that the GOP header has the bit rate* & framerate; you can calculate the PTS increment either from the framerate or examining adjacent blocks, you then check the current PTS, calculate the desired PTS from that, and can then jump to the appropriate part of the file to find the PTS you're after.
(That's assuming you're working with a TS file, where the player can examine the first & last block to determine file length. With streaming, you're restricted to working with what's in the buffer (& hopefully your app knows how long the buffer is, since it allocated it!))
Ogg, AFAIK, doesn't have that info in the block header - IIRC it relies on the bitstream having presentation timing stored in it (i.e. none, in the case of most audio formats), which means you have to decode the block to find it. It was done that way to allow for variable framerates to be stored without having to build a huge index. MKV is a bit better in this respect, but it's a remarkably fragile container.
* It falls down a bit sometimes, particularly where the bitrate in the block header is set to max (15Mbps), or where you're using VBR. With the latter the calculation will usually get you in the ballpark; with both cases, some splitters/decoders calculate the bitrate themselves while playing, store it, and use that for seeking.
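A rough sketch of that estimate in Python, under the assumptions the footnote warns about: constant bitrate, and a file that starts on a packet boundary. The function name and parameters are made up for illustration; the 188-byte packet size and the 90 kHz PTS clock are the standard MPEG-TS values.

```python
TS_PACKET_SIZE = 188   # bytes per transport stream packet
PTS_CLOCK_HZ = 90000   # PTS values tick at 90 kHz


def estimate_offset(current_offset, current_pts, target_pts, bitrate_bps):
    """Guess the byte offset of target_pts, assuming a constant bitrate."""
    delta_seconds = (target_pts - current_pts) / PTS_CLOCK_HZ
    guess = current_offset + int(delta_seconds * bitrate_bps / 8)
    guess = max(guess, 0)
    # Round down to a packet boundary so the reader lands on a sync byte.
    return guess - guess % TS_PACKET_SIZE
```

A player would jump to the guess, read the nearest PTS it finds there, and correct from that point. With VBR the first guess is only in the ballpark, which is why some splitters measure the bitrate themselves.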
From TFA:
An index is only marginally useful in Ogg for the complexity added; it adds no new functionality and seldom improves performance noticeably. Why add extra complexity if it gets you nothing?
You can do seeking without an index:
A binary search is discussed in the spec for ease of comprehension; implementation documents suggest an interpolated bisection search. So far, this is the same as Matroska and NUT.
The only difference being, Matroska implementers tend to be lazy about implementing the indexless seeking properly, and people tend to use indexes, thus propagating this myth even more.
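To make "interpolated bisection" concrete, here is a toy Python sketch. It is not the libvorbisfile code: it runs over an in-memory list of page timestamps standing in for the file, and each call to the hypothetical `probe` helper stands for one physical seek plus a page sync. The only assumption is that timestamps never decrease through the stream.

```python
def interpolated_seek(page_times, target):
    """Return (index, probes) for the last page with timestamp <= target.

    If target precedes the stream, index 0 is returned.
    """
    probes = 0

    def probe(i):
        # Stand-in for "seek to a byte offset and read the next page's time".
        nonlocal probes
        probes += 1
        return page_times[i]

    lo, hi = 0, len(page_times) - 1
    t_lo, t_hi = probe(lo), probe(hi)
    if target < t_lo:
        return lo, probes
    if target >= t_hi:
        return hi, probes

    # Invariant: t_lo <= target < t_hi, so the answer lies in [lo, hi - 1].
    while hi - lo > 1:
        # Guess where the target sits, instead of always halving the range.
        frac = (target - t_lo) / (t_hi - t_lo)
        mid = lo + int(frac * (hi - lo))
        mid = min(max(mid, lo + 1), hi - 1)  # always make progress
        t_mid = probe(mid)
        if t_mid <= target:
            lo, t_lo = mid, t_mid
        else:
            hi, t_hi = mid, t_mid
    return lo, probes
```

When the bitrate is roughly constant the first guess lands very close to the target, so only a handful of probes are needed. Plain bisection would instead need about log2(number of pages) probes every time.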
The Vorbis source distribution includes an example program called 'seeking_example' that does a stress-test of 5000 seeks of different kinds within an Ogg file. Testing here with SVN r17178, 5000 seeks within a 10GB Ogg file constructed by concatenating 22 short Ogg videos of varying bitrates together results in 17459 actual seek system calls. This yields a result of just under 3.5 real seeks per Ogg seek request when doing exact positioning within an Ogg file. Most actual seeking within an Ogg file would be more appropriately implemented by scrubbing with a single physical seek.
And there you go. I don't know WTF is wrong with your players, but really, how can a total of four seeks bring your system to a crawl?
Ah, the gotcha is in the source:
http://svn.xiph.org/trunk/vorbis-tools/ogginfo/ogginfo2.c
Ogginfo's source includes information on how to process the metadata for various codecs.
So, the grandparent's complaint is still valid. Ogginfo appears to need new code, and a recompile, for every codec it is expected to report on inside an Ogg container.
A DVD is MPEG-PS, not MPEG-TS. Your cable system and satellite feed are TS. Both are built on top of the PES layer.
MPEG-2 is the reason I have no hair left on my head.