Notes On The Future of Video on Linux

Posted by timothy on Wednesday February 27, 2002 @06:50AM from the must-be-visible-and-moving dept.

Dina's Dream points out two interesting articles currently running on LinuxPower, and linked from Gnotices (GNOME news site) as well. "The first article is a really good summary of the current state of affairs of video under Linux and the direction we should take. Questions are bounced back between a few very knowledgeable people, including GStreamer developers, SGI people and Alan Cox. The second article is a set of lessons learned by Chris Pirazzi while working at SGI. Chris was involved in a lot of Video API programming at Silicon Graphics, and raises a few very good points based on his experience. All people even remotely working on video drivers or software should read these points and take them to heart."

SSSCA by Anonymous Coward · 2002-02-27 07:06 · Score: 4, Insightful

Well, if SSSCA passes, there won't be any video on linux.
video focus by paulbd · 2002-02-27 07:26 · Score: 4, Insightful

the slashdot headline is more accurate than the article's actual title. the author's approach comes almost entirely from a consideration of video. if he were starting from a primary interest in audio, he would have talked about many different issues, and mentioned different kinds of solutions.

gstreamer is a cool system, but it needs to be stressed over and over that gstreamer is an architecture for building applications. it does not offer any mechanisms for inter-application communication or synchronization. since most people want to do a lot more than write a particular plugin for gstreamer, gstreamer doesn't help us when the challenge is not providing an architecture for a single program, but one for multiple applications on the same (or even networked) system(s). when you want to run a cool video processor along with a really nice FX rack for audio, gstreamer can't help you unless the author of each component has decided to implement his stuff as a gstreamer plugin. since this cramps the GUI style rather considerably, it's unlikely that many people will choose to do this.

finally, i would note that although it has become customary to sync audio to video, this actually makes very little sense when the temporal resolution of audio (22000-96000 frames/sec) is vastly greater than that of video (20-30 frames/sec). it's really just an artifact of the way technological development has happened, of who has the most power in the entertainment "content" business, and of the fact that we generally consider visual data more significant than acoustic data. we'd be in much better shape if the conventional approach was to sync a video stream to audio, since we could easily and uniformly take advantage of the much better clock resolution that audio devices provide.
1. Re:video focus by vektor_sigma · 2002-02-27 09:31 · Score: 5, Insightful
  
  Paul: You mention that it makes little sense to sync audio to video and then state that this is what most applications do.
  
  On the contrary, syncing video output to the audio is much easier, since audio resampling is a big pain. It's much easier to just watch the soundcard buffer and decide approximately when to show the frame as you imply. You only run into big difficulties when you want to sync audio to the video (or more accurately, sync video blits and audio output to the vertical refresh rate of your output device).
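
As a rough illustration of "watching the soundcard buffer" to decide when to show a frame, here is a minimal Python sketch. It assumes the driver can report how many samples have been written and how many are still queued; the function names and that interface are hypothetical, not a real Linux audio API.

```python
from fractions import Fraction

def audio_clock_seconds(samples_written, samples_queued, sample_rate):
    """Playback position: samples handed to the card minus those still buffered."""
    return Fraction(samples_written - samples_queued, sample_rate)

def frame_due(frame_index, frame_rate, audio_time):
    """True once the audio clock has reached this frame's presentation time."""
    return audio_time >= Fraction(frame_index) / frame_rate

def newest_frame_to_show(next_frame, frame_rate, audio_time):
    """Index of the newest frame whose time has passed, or None if nothing is due."""
    newest = int(audio_time * frame_rate)  # truncation equals floor for times >= 0
    return newest if newest >= next_frame else None
```

A player loop would poll the audio position, call `newest_frame_to_show`, and blit that frame, skipping any frames it has fallen behind on. Passing `Fraction(60000, 1001)` as the frame rate keeps 59.94fps material exact.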
  
  There are times when this is very important. You imply that video framerates are between 20-30fps, which is quite small. For playback of video sources we need to handle framerates of 50fps and 59.94fps. At such a high framerate, the monitor refresh must track the video input in order to achieve smooth video. See the link to Dave Marsh's faq on judder I mention in the article.
  
  In these cases, you need to set the monitor to the correct refresh rate and then watch the error between the refresh and the sound card clock, and resample the audio when necessary: playing with the refresh rate on the fly would cause a monitor resync and disturb playback!
  
  If video cards let applications drive the refresh of the monitor (software genlock), we could run it based on the audio clock and get the advantages you describe. However given the current state of the hardware it's better to do these small resamplings. That said, actually doing this in linux is still infeasible without some updates to the APIs and drivers, which we are working on.
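
One way to picture "watch the error between the refresh and the sound card clock" is the sketch below. It assumes the application can count vertical refreshes and consumed audio samples over the same window, which is exactly the kind of API support described as missing; the function and parameter names are made up for illustration.

```python
from fractions import Fraction

def resample_ratio(refreshes_seen, samples_consumed,
                   nominal_refresh_hz, nominal_sample_rate):
    """Output samples to generate per source sample so audio tracks the refresh clock."""
    # Media time the video has advanced, assuming one frame (or field) per refresh.
    video_media_time = Fraction(refreshes_seen) / nominal_refresh_hz
    # Source samples that belong to that stretch of media time.
    source_samples = video_media_time * nominal_sample_rate
    # The card actually consumed samples_consumed, so stretch or shrink to match.
    return Fraction(samples_consumed) / source_samples
```

A ratio above 1 means the sound card is running fast relative to the display and the audio has to be stretched slightly; below 1 means it has to be shrunk.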
  
  -Billy Biggs
2. Re:video focus by Omega+Hacker · 2002-02-27 10:23 · Score: 3, Insightful
  
  First of all, the focus of the article was indeed multimedia, from the point of view of the average person who wants to use Linux to watch DVDs and videos off the net. These people really couldn't care less about perfect sample accuracy of their $10 sound card. Trying to force pro-audio qualities onto every machine is impossible, and highly counterproductive in the face of other more visible issues.
  
  GStreamer is indeed an infrastructure for building multimedia apps. Absolutely everything it does is under the auspices of the elements that it loads and connects together. Claiming that GStreamer has no means of inter-application communication is false, as there are already numerous elements both for communication with sound servers and other applications (network source/sinks, etc.). A set of Jack elements is being written as well.
  
  GStreamer also has nothing to do with the GUI whatsoever. Elements are GObjects that have no GUI in them at all; they focus on doing what they're supposed to. Any GUI exists as a separate entity on top of the processing pipeline as managed by GStreamer.
  
  If your worry is that LADSPA won't be used, don't. LADSPA plugins are fully supported with a shim under GStreamer.
  
  As for whether to sync to audio or video, you actually have it quite backwards. First of all, the most difficult situation is not with progressive video, but with field-based video (which both NTSC and PAL are, BTW), where the vertical rates are 50 or 59.94Hz. Compare this to a CRT, and you have significant problems finding a decent match between them.
  
  As for the "content" business somehow making video more important than audio, you're ignoring the fact that video *is* more important than audio when both are present together in the same stream. There are several orders of magnitude more bits in the video than the audio, anyway.
  
  Now, to the technical reasons it makes vastly more sense to absorb changes in the audio clock:
  
  1) 90+% of computers have sound cards with clocks that can be best described as "kinda sorta correct". They vary wildly around the real 44.1k baseline. If anyone notices this, there's not much they can do about it. More fundamentally, how are you going to maintain any kind of stream clock when the audio rate is changing during the presentation (as it will when temperature and other things change in your computer, since the clock is susceptible to this)?
  
  2) The quantum of video playout is much greater than that of audio, on the order of 500+ times larger. The goal of any good video playback is to synchronize the playout of a video frame (theoretical time unit) to a vertical refresh (physical time unit). If at any point those become desynchronized, you will have a discontinuity of at least one vertical retrace, or on average about one 75th of a second. Specifically, you'll have a video frame that is suddenly displayed for between 50% and 100% longer than it's supposed to be, in the middle of a bunch of correct frames. This is *blatantly* noticeable by anyone with a normal visual cortex.
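
The numbers in point 2 can be checked with a little arithmetic. The sketch below uses the 44.1k sample rate and 75Hz refresh mentioned in the comment; reading "between 50% and 100% longer" as a frame that normally spans two refreshes or one refresh is an assumption made here, not something the comment spells out.

```python
from fractions import Fraction

def samples_per_refresh(sample_rate, refresh_hz):
    """How many audio sample periods fit into one vertical refresh period."""
    return Fraction(sample_rate, refresh_hz)

def extra_hold_percent(refreshes_per_frame):
    """Percent longer a frame stays on screen when it is held one extra refresh."""
    return Fraction(100, refreshes_per_frame)
```

At 44100 samples per second and a 75Hz refresh, one refresh spans 588 sample periods, which matches the "500+ times larger" estimate. A frame that normally lasts one refresh is shown 100% longer when held for an extra refresh, and one that normally lasts two is shown 50% longer.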
  
  Audio, and the human ear, is much more forgiving. If the program simply drops or duplicates a sample every once in a while to maintain minimal drift between the two (video and audio) clocks, it will be altering 1/44,100th of the samples in that second. If done wrong, it will cause a click. Doing it "less wrong" to avoid clicks is trivial.
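
One possible reading of doing it "less wrong" is to spread a one-sample correction across a whole block by linear interpolation instead of cutting or repeating a single sample. The sketch below is only that: an illustrative technique, not necessarily what is meant here, and `nudge_block` is a hypothetical helper name.

```python
def nudge_block(samples, delta):
    """Resample a block to len(samples) + delta samples using linear interpolation."""
    n_in = len(samples)
    n_out = n_in + delta
    if n_in < 2 or n_out < 2:
        raise ValueError("block too short to nudge")
    step = (n_in - 1) / (n_out - 1)  # input positions advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), n_in - 2)  # clamp so lo + 1 stays inside the block
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[lo + 1] * frac)
    return out
```

Calling `nudge_block(block, 1)` stretches the block by one sample and `nudge_block(block, -1)` shrinks it by one. The first and last samples are preserved, so adjacent blocks still join without a jump.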
  
  So the decision is between locking to a highly variable audio clock (think 3.6 seconds per hour per 0.1% off) and having video that jerks and sputters whenever the theoretical and physical frame times disagree, or doing some resampling to the audio where necessary, with the possibility of some loss of quality that is undetectable by 99% of people watching videos on their computer with tinny speakers.
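
The "3.6 seconds per hour per 0.1% off" figure is easy to verify, along with how quickly such an error adds up to one vertical refresh. This is plain arithmetic under the comment's own numbers (0.1% clock error, 75Hz refresh); the function names are illustrative only.

```python
from fractions import Fraction

def drift_seconds(clock_error, elapsed_seconds):
    """Accumulated drift for a clock that is off by the given fraction."""
    return clock_error * elapsed_seconds

def seconds_until_drift(threshold_seconds, clock_error):
    """Playback time needed before the drift reaches the given threshold."""
    return threshold_seconds / clock_error

error = Fraction(1, 1000)                             # 0.1% clock error
per_hour = drift_seconds(error, 3600)                 # 18/5, i.e. 3.6 seconds
one_refresh = seconds_until_drift(Fraction(1, 75), error)  # 40/3, about 13.3 seconds
```

So a 0.1% error drifts by a full 75Hz refresh period roughly every 13 seconds of playback.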
  
  Me, I don't like video-induced headaches, I'll resample the audio.
  
  --
  GStreamer - The only way to stream!
dScaler by Anonymous Coward · 2002-02-27 08:23 · Score: 1, Insightful

For me, one major thing holding me back from using Linux as my home theater PC is the lack of a Linux port of dScaler. My own programming experience isn't enough for me to wade through the Linux video APIs to get the job done. Not to mention the fact that Linux APIs for video change every year or so (frame buffering, Xvideo, v4l, v4l2, etc.).
a refreshing perspective by FrostyWheaton · 2002-02-27 08:49 · Score: 4, Insightful

This has to be the only article I have read in a long time that stresses the importance of doing things the right way, instead of the wrong way or, heaven forbid, the "Max Power" way.

There are many core issues with video on any Unix that need to be hammered out now to ensure that things will go well both now and in the future.

As the author mentions several times, adapting refresh rates to video frame rates and working with the monitor's vertical sync as well as audio sync, etc., are all very important things that need to be implemented before Video for (insert favorite unix here) will become anything more than a glorified hack.

The first logical step is to implement what is needed to do things right, and to implement it in the right (proper) way. The X protocol should be fully implemented in XFree86, and the kernel should be extended to enable applications to be written which can make full use of the hardware, with minimal kludge-work.

Then the focus moves to making the "killer-app" type media production tools and players. The power of Open Source is the ability to build on the work of others. However, stealing someone's hack to adapt refresh rates and jamming it into your own code is not an optimal solution. Focus on doing things right the first time; anything less (especially when dealing with core issues) is just asking for untold headaches and frustration in x years, when we are kicking ourselves for not doing the right thing the first time.

--
Comments should be like skirts. Short enough to keep your attention, but long enough to cover the subject
Grave licensing issue with VP3 by AirLace · 2002-02-27 09:52 · Score: 2, Insightful

VP3 is distributed under a proprietary, non-free license that, whilst it purports to be open, meets neither the Open Source nor Free Software definitions.

The problem lies herein:
(e) Notwithstanding Sections 2.1 (a), (b), and (c) above, no license is granted to You, under any intellectual property rights including patent rights, to modify the code in such a way as to create or accept data that is incompatible with data produced or accepted by the Original Code. By way of example but not limitation, a Modification that adds support for other compression data such as MPEG-1 or MPEG-2 would be permissible, but only if the resulting Larger Work continues to support playback of VP3.2 data. Modifications that provide only playback or encode support are also permissible. However, a Modification that adds support for encoding or playback of any non- VP3.2 compatible files or bitstreams without complementary support for VP3.2 encoding or playback would not be permissible, and no license is granted for such Modification(s).

Basically, this denies users the right to modify the source code to produce binaries that produce a stream incompatible with the original software. It may sound good to some, but I urge developers to think twice before releasing modifications or compiled versions of the VP3 codec because, even if unintended, a compiler bug or error in your modifications to the software could mean that the stream your modified VP3 codec produces is unintentionally incompatible with the VP3 specification, opening you to legal action from On2 Technologies, the proprietors of the codec.

The VP3 codec licensing terms are not only not Open Source, they are a threat to developers, contributors and distributors of VP3 both in source code and compiled form. Please contact On2 Technologies and try to convince them to update their license to remove this dangerous clause, and spread the word to your friends!