Facts and Fiction of GPU-Based H.264 Encoding
notthatwillsmith writes "We've all heard a lot of big promises about how general-purpose GPU computing can greatly accelerate common tasks that are slow on the CPU — like H.264 video encoding. Maximum PC compared the GPU-accelerated Badaboom app to Handbrake, a popular CPU-based encoder. After testing a variety of workloads ranging from archival-quality DVD rips to transcodes suitable for play on the iPhone, Maximum PC found that while Badaboom is significantly faster than X264-powered Handbrake in a few tests that require video resizing, it simply can't compare to the X264-powered Handbrake for archival-quality DVD backups."
Wouldn't archival-quality backups be actual MPEG instead of H.2 or whatever? I mean if you're archiving, why go lossy?
Is it just a badly-designed test?
To begin with, x264 blows Badaboom out of the water in terms of speed when similar settings are used. Badaboom appears to use the rough equivalent of --aq-mode 0 --subme 1 --scenecut -1 --no-cabac --partitions i4x4 --no-dct-decimate in terms of x264 commandline... it's no wonder it's "fast" when they compare it to x264 on far slower settings!
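For anyone who wants to try x264 at roughly those settings, here is a minimal Python sketch that assembles the command line. The flags are the ones quoted above; the file names `input.y4m` and `output.264` and the helper `build_x264_command` are made up for illustration, and nothing is actually executed.

```python
import shlex

# Flags quoted in the comment above as a rough equivalent of Badaboom's settings.
FAST_FLAGS = [
    "--aq-mode", "0",
    "--subme", "1",
    "--scenecut", "-1",
    "--no-cabac",
    "--partitions", "i4x4",
    "--no-dct-decimate",
]


def build_x264_command(source, output, flags=FAST_FLAGS):
    """Return an argv list for x264; this only builds the list, it runs nothing."""
    return ["x264", *flags, "-o", output, source]


# Hypothetical file names, chosen only for the example.
command = build_x264_command("input.y4m", "output.264")
command_line = shlex.join(command)
print(command_line)
```

Passing the list to `subprocess.run(command)` would start the encode, assuming an `x264` binary is on the PATH.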
GPU encoders won't be able to compete with CPU encoders until they either get a lot faster (in which case they'll compete in the "high performance" market) or they get much better quality, since at sane settings x264 unsurprisingly blows Badaboom out of the water quality-wise, too. Until then, the product is not only completely proprietary but furthermore simply inferior, and they're going to have a very hard time marketing it.
The CPU usage of the program when used with a good video card is 25% on my quad-core machine. Since 25% of four cores is one core fully loaded, that suggests it is bound by a single CPU thread right now. That means if they can get the CPU overhead down, even a little bit, they stand to get huge gains.
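The arithmetic behind that reading, as a tiny Python sketch. It assumes the reported percentage is averaged over all cores, which is how most task managers show it; the function name is made up for the example.

```python
def busy_core_equivalents(total_cpu_percent, core_count):
    """Convert a whole-machine CPU percentage into 'number of fully busy cores'."""
    return total_cpu_percent / 100.0 * core_count


# 25% on a quad core works out to one core's worth of work.
cores_busy = busy_core_equivalents(25, 4)
print(cores_busy)  # 1.0
```

A value of exactly one core is what you would expect from a single saturated thread feeding the GPU, though it does not prove that is what is happening.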
This is the most obvious and boring insight they could possibly offer... Everyone with the slightest interest knows this already.
The low quality of hardware-based video encoder cards is a very well-known fact, and those MPEG encoder cards are just ASICs on a PCI card, almost exactly the same hardware as your video card.
The point of offering up APIs for GPUs, and AMD's attempt to integrate the GPU ASIC with the CPU via HyperTransport, is aimed at improving things, however.
x264 does a good job because it's an open source project, with several skilled and interested individuals continually tweaking the code to improve quality and performance. Once hardware-based video encoding routines aren't hidden in closed-source firmware on a dedicated card, the same development effort can step up and improve HARDWARE encoding now, exactly as they have with software.
Not only can quality be significantly improved, you can expect performance to improve significantly as well, even with greater quality. The initial implementation of any codec is always relatively poor performing, and low quality, so this wouldn't even be an insightful observation if it was comparing x264 with any other software based encoder... The only difference is that a new software h.264/AVC encoder would be SLOWER than x264, as well as being much lower quality.
To know how the next pixel should be compressed you must know the statistical likelihoods of the previous pixels. So compression is a really linear operation. You could have threads that work from each keyframe of the video independently but that still isn't ideal for graphics cards.
From the CUDA guide-
"Every instruction issue time, the SIMT unit selects a warp that is ready to execute and issues the next instruction to the active threads of the warp. A warp executes one common instruction at a time, so full efficiency is realized when all 32 threads of a warp agree on their execution path."
So if you have code that isn't SIMD-able, you are really only using 1/32 of the available threads per unit of branching code.
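A toy Python model of that 1/32 figure. It is not CUDA and it makes one big simplifying assumption: every branch path costs the same number of instructions, and the warp runs the distinct paths one after another with only the matching threads active. The function names are made up for the example.

```python
WARP_SIZE = 32  # threads per warp, as in the CUDA guide quote above


def serialized_passes(branch_ids):
    """Number of passes the warp needs: one per distinct execution path."""
    assert len(branch_ids) == WARP_SIZE
    return len(set(branch_ids))


def lane_utilization(branch_ids):
    """Fraction of thread slots doing useful work, assuming equal-cost paths."""
    return 1.0 / serialized_passes(branch_ids)


uniform = [0] * WARP_SIZE                    # all threads agree on the path
two_way = [i % 2 for i in range(WARP_SIZE)]  # a simple if/else split
divergent = list(range(WARP_SIZE))           # every thread takes its own path

print(lane_utilization(uniform))    # 1.0
print(lane_utilization(two_way))    # 0.5
print(lane_utilization(divergent))  # 0.03125, i.e. 1/32
```

Real kernels usually land somewhere between these extremes, since threads reconverge after a branch and not all code is branchy.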
Comparing a GPU, an SIMD (single instruction, multiple data) vector processor, to a CPU, a superscalar sequential processor, is like comparing apples and oranges. Sure, they are both fruits but they don't taste the same. Using the term 'general-purpose' to describe a GPU is pushing the limits of what a GPU is. Certainly, it can run general-purpose programs, but it is much faster at running what it was designed to run: data-parallel applications. A GPU does not have to have a fast clock because it makes up for it by doing a lot of operations in parallel.
A CPU, OTOH, can have a very fast clock but even if it has a superscalar architecture, it cannot come close to the performance of a GPU on data-parallel apps.
It is obvious that neither the GPU nor the CPU is a universal processor and, IMO, that is an unforgivable sin. Having both of these types of processors in the same machine is asking for trouble. They require two incompatible programming models. Programming such a beast is like pulling teeth with a crowbar. Only a few are good at it and that is not good for the industry. What is needed is a fast vector processor that can run in MIMD (multiple instruction, multiple data) mode. This way, it would have no trouble running general-purpose apps just as fast as data-parallel apps. The problem is that such a processor would need a radically different programming model, one that is specifically designed for fine-grain MIMD processing. None of the current programming tools would work with it. Still, that's the future of parallel computing. There is no getting around this.
Heralding the Impending Death of the CPU
It will take at least another 18 months before GPU encoding becomes seamless and the ideal solution for most users.
Intel is working on its own GPU; I am sure that they will exploit multimedia handling capabilities (video/Photoshop) as one of the selling points of that GPU.
So if you had tried Handbrake before posting, you would see you don't need to rip the DVDs first. You wouldn't have to buy SlySoft. You furthermore would be able to choose iPod, PSP, etc. as a setting for output.
So you paid money for a GUI that selects command-line options?
I'm in the wrong line of work.
There's no failure quite as dissatisfying as a complete and total solution to the wrong problem.
There is a huge business made around building payware GUIs that (often silently, without giving any credit, sometimes violating GPL/LGPL) do nothing but use open-source tools to do their work. This is especially true in video encoding where there are almost no cheap proprietary tools--only the extremely widely used open source solutions and extremely expensive "professional" ones (with some rare exceptions like DivX and Nero). Usually these GUIs are much worse than the free ones, but a sucker's born every minute.
Not only that, but x264 is one of the very best h.264 encoders out there. You could compare it to most other CPU based encoders and it would also come up trumps. Does this mean encoding on a CPU is better than encoding on a CPU?
Man I wish it was LGPL instead of GPL though.
LGPL vs GPL is not actually a very big issue in my experience. I spent the summer working at Avail Media, a company that uses x264 for real-time 1080i/720p broadcast encoding for IPTV and cable television (and also funds a large portion of x264 development). They use x264 in their encoding boxes--yet their main application is proprietary! This is done by having an extremely simple open-source wrapper which is statically linked to x264; the raw frames to be encoded are passed to it over a pipe by the main program. This completely bypasses the limitations of the GPL without violating the spirit of it, since anyone who wants to can still read the source code of the wrapper, modify it, and recompile it as necessary and still use it with the main application.
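To make the mechanics concrete, here is a minimal Python sketch of a main program handing raw frames to a separate wrapper process over a pipe. It is not Avail's code: the "wrapper" here is a stand-in child process that only counts the frames it receives, and the 16-byte frame size is a made-up value.

```python
import subprocess
import sys

FRAME_SIZE = 16  # hypothetical tiny "frame" size in bytes, for illustration

# Stand-in for the open-source wrapper: a separate process that reads
# fixed-size raw frames from stdin. A real wrapper would pass each frame
# to the encoder; this one just counts them and prints the total.
WRAPPER_SOURCE = f"""
import sys
count = 0
while True:
    frame = sys.stdin.buffer.read({FRAME_SIZE})
    if len(frame) < {FRAME_SIZE}:
        break
    count += 1
sys.stdout.write(str(count))
"""


def send_frames(frames):
    """Start the wrapper as its own process and stream frames to it over a pipe."""
    wrapper = subprocess.Popen(
        [sys.executable, "-c", WRAPPER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # communicate() writes the data, closes the pipe, and waits for exit.
    out, _ = wrapper.communicate(b"".join(frames))
    return int(out.decode())


frames = [bytes([i]) * FRAME_SIZE for i in range(3)]
frames_received = send_frames(frames)
print(frames_received)  # 3
```

The point of the design is that the two sides share nothing but a byte stream: the main program never links against the wrapper, and either side can be rebuilt or swapped out independently.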
So you paid money for a GUI that selects command-line options? I'm in the wrong line of work.
I appreciate the humor. But seriously, have you seen how long a command can be with all the extra switches and whatnot? It can get up to 180+ characters long!
Sure, I could write batch files or script it. However, I'm always trying new options and feature combinations. For me, a simple GUI that I can run with is what I want.
Simply put. My computer works for me. I shouldn't have to work for it.
Life is not for the lazy.
They're not encoding video. They're transcoding it. They're starting from one compressed representation and outputting another compressed representation. (Now, with twice the artifacts!)
The good test for this is football. The players, ball, and field are all moving in different directions. If the motion compensation gets that right, it's doing a very good job.
This is done by having an extremely simple open-source wrapper which is statically linked to x264; the raw frames to be encoded are passed to it over a pipe by the main program. This completely bypasses the limitations of the GPL without violating the spirit of it, since anyone who wants to can still read the source code of the wrapper, modify it, and recompile it as necessary and still use it with the main application.
More so, that is exactly how proprietary software is supposed to interact with GPL software. See Mere Aggregation, especially the last paragraph:
By contrast, pipes, sockets and command-line arguments are communication mechanisms normally used between two separate programs. So when they are used for communication, the modules normally are separate programs.
I'm pretty sure that wouldn't fly very well in a court. You are still linking to the wrapper, which has to be GPL. IANAL, but you could state that including the binary in your workflow is linking.
Their wrapper is required to be GPL ; but since they don't distribute it, the source distribution clauses are not in effect.
Their commercial software pipelines frames into their wrapper ; they are separate processes, not linked, and thus their use does not violate GPL.
Otherwise you could argue that because you opened a Word document in OOo, that Word was now required to be GPL because it had emitted data that was now being consumed by a GPL application.
Well, if they don't distribute, then the GPL indeed doesn't apply. But if they do, then an argument could definitely be made that the GPL would apply if the GPL code were an essential part of the software as a whole (i.e. it couldn't be replaced), and they were distributing both sets of code together as a single software suite. The GPL license doesn't say anything about code running in a different process being automatically excluded from the licensing requirements; it only talks about "derivative work", without specifying exactly what that means. The same argument exists over the question of closed source kernel modules - many prominent kernel developers believe that they're illegal. The GPL FAQ says:
"I'd like to incorporate GPL-covered software in my proprietary system. Can I do this by putting a 'wrapper' module, under a GPL-compatible lax permissive license (such as the X11 license) in between the GPL-covered part and the proprietary part?
No. The X11 license is compatible with the GPL, so you can add a module to the GPL-covered program and put it under the X11 license. But if you were to incorporate them both in a larger program, that whole would include the GPL-covered part, so it would have to be licensed as a whole under the GNU GPL.
The fact that proprietary module A communicates with GPL-covered module C only through X11-licensed module B is legally irrelevant; what matters is the fact that module C is included in the whole."
There's no simple answer here. As the FSF say in the answer you link to, "This is a legal question, which ultimately judges will decide." And you missed the rest of the answer following your quote: "But if the semantics of the communication are intimate enough, exchanging complex internal data structures, that too could be a basis to consider the two parts as combined into a larger program." It could certainly be argued in a court that distributing an application that performs video encoding by transferring commands and data frames across an IPC link constitutes a derivative work, especially if there's no way for the encoding application to work if the GPL component were removed (as would be the case here). The mere aggregation clause was meant to apply more to clear-cut cases like distributing a Linux distribution, where differently licensed unconnected software like, say, emacs and skype, could be incorporated on the same CD.
Does too violate the spirit of the GPL. The point is that the main program should be GPL as well; if they wanted it to behave the way you describe, it would have been LGPL-licensed.
I am trolling
But if you're choosing a "profile, resolution, and quality," you could use Handbrake for free. It does all of those things. If you don't want to touch command line stuff, don't. Handbrake's GUI will generate it all for you. Plus it's open source.
Tell me when I can get a PCI card with a one or more Cell co-processors to do the heavy lifting.
No, the wrapper is not being linked to. Linking has a very specific meaning--and linking was not being done. Calling a binary via "exec" is not linking. If it was, the GPL would truly be a dangerous license! (And yes, Avail does distribute software containing GPL'd products; it's not internal-use-only.)
You mean, like these?
http://ffmpeg.mplayerhq.hu/shame.html
I happened to look at ConvertXtoDVD the other day. While ffmpeg itself is licensed under the LGPL, ConvertXtoDVD also appears to use both libpostproc and libswscale which are both GPL. The ffmpeg licensing page states, "If those parts get used the GPL applies to all of FFmpeg."
I don't see any LICENSE.txt file nor any mention of the GPL or the LGPL in the version of the product I downloaded. Running strings against the binaries looking for things like "public" doesn't bring it up either.
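For anyone without the `strings` tool handy, the same kind of check can be approximated in a few lines of Python. This is only a rough sketch: it looks for runs of printable ASCII (minimum length 4, like the `strings` default) and filters them by keyword. The keyword list and the sample bytes are made up for the example, and a miss does not prove the text is absent, since strings can be compressed or stored as UTF-16.

```python
import re


def printable_strings(data, min_length=4):
    """Yield runs of printable ASCII at least min_length long, like strings(1)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def find_license_mentions(data, keywords=("public", "license", "gpl")):
    """Return the printable strings that contain any keyword, case-insensitively."""
    hits = []
    for text in printable_strings(data):
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(text)
    return hits


# Made-up bytes standing in for the contents of a binary.
sample = b"\x00\x01GNU General Public License\x00\xff\x02abc\x00no match here\x00"
mentions = find_license_mentions(sample)
print(mentions)  # ['GNU General Public License']
```

To scan a real file, read it with `open(path, "rb").read()` and pass the bytes in.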
Did anyone catch what GPU/graphics card they used? The article mentions they used a Q6600 ($185) as their test CPU but it makes no mention of which GPU they ran with.
Did they run this on a 9800GT? 8800GT? 8600?
To make this a fair comparison they should be running the test on a system with a quadcore and the lowest end GPU for the CPU test. Then run the same comparison on a low end Intel CPU (same price as that low end GPU from above) and a GPU priced about the same as their Q6600.
This would fit better with comparing what NVIDIA's been claiming with their optimized PC campaign.