Transcoding in 1/5 the Time with Help from the GPU

This would be great for MythTV.. Linux support?? by tji · 2005-11-02 05:28 · Score: 4, Insightful

My educated guess is, No, there won't be Linux support..

ATI was the leader in MPEG2 acceleration, enabling iDCT+MC offload to their video processor almost 10 years ago. How'd that go in terms of Linux support, you ask? Well, we're still waiting for that to be enabled in Linux.

Nvidia and S3/VIA/Unichrome have drivers that support XvMC, but ATI is notably absent from the game they created. So, I won't hold my breath on Linux support for this very cool feature.

Slashdotted! by Anonymous Coward · 2005-11-02 05:31 · Score: 4, Funny

I wonder if http://www.gpgpu.org/ could offload some of the Slashdot effect to their GPU?

What I want to see. by Anonymous Coward · 2005-11-02 05:33 · Score: 5, Interesting

Maybe others have had this idea. Maybe it's too expensive or just not practical. Imagine using PCI cards with a handful of FPGAs on board to provide reconfigurable heavy number crunching abilities to specific applications. Processes designed to use them will use one or more FPGAs if they are available, else they'll fall back to using the main CPU in "software mode."

Re:What I want to see. by TooMuchToDo · 2005-11-02 10:51 · Score: 2, Informative

I used to play with this idea 4-5 years ago. A small team was going to look into building FPGA PCI boards that could be used with http://www.distributed.net/ to help crack DES/RC5/*insert-your-choice-encryption-here*.

I'm rarely impressed... by HotNeedleOfInquiry · 2005-11-02 05:35 · Score: 2, Insightful

With tech stuff these days, but this is awesome. A very clever use of technology just sitting in your computer and a huge timesaver. Anyone that does any transcoding will have immediate justification for laying out bucks for a premium video card.

--
"Eve of Destruction", it's not just for old hippies anymore...

Re:I'm rarely impressed... by drinkypoo · 2005-11-02 05:37 · Score: 4, Interesting

I'd like to see it but I wonder what the quality is going to be like as compared to the best current encoders. I mean you can already see a big difference between cinema craft and j. random mpeg2 encoder...

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:I'm rarely impressed... by Dr.+Spork · 2005-11-02 06:04 · Score: 3, Informative

You don't get it. ATI is not releasing a new encoder. The test used standard codecs, which do the very same work when assisted by the GPU, only 5X faster.

Re:This would be great for MythTV.. Linux support? by ceoyoyo · 2005-11-02 05:35 · Score: 5, Interesting

This should be written in Shader Language (or whatever it's called these days) which is portable between cards. There's no reason NOT to release this on any platform. Since it only runs on the latest ATI cards it probably uses some feature that nVidia will have in its next batch of cards as well. If ATI doesn't release it for Linux and the Mac hopefully it won't be that difficult to duplicate their efforts. After all, shader programs are uploaded to the video driver as plain text.... ;)

But is it worth it? by Anonymous Coward · 2005-11-02 05:37 · Score: 3, Interesting

The X1800XT ties almost exactly with the 7800GTX at its stock 430MHz core clock in most gaming benchmarks.

With nVIDIA's 512MB implementation of the G70 core touted to run at a 550MHz core clock, it should theoretically thrash the living daylights out of the X1800XT.

http://theinquirer.net/?article=27400

So the decision is between AVIVO's encode and transcode abilities for H.264, or superior performance from nVIDIA's offering?

Re:But is it worth it? by Dr.+Spork · 2005-11-02 06:16 · Score: 2, Insightful

Well, if you can see the difference between 150fps and 200fps, and you don't mind waiting and don't care about spending an extra $200, you really should wait for the G70.
I don't play the sort of games that need a graphics card over $200 to look good. I never even considered looking at the high end. However, this video encoding improvement will certainly make me do a double take. I was proud of my little CPU overclock that improves my encoding rate by 20%. But the article talks about improvements of over 500%! That's worth a couple of extra bucks.
Of course, by the time the software to do this actually becomes full-featured and useful, the price of the 1800 ATIs will hopefully drop a bit. Still, I have a feeling this will be my next GPU.
Unless nVidia can produce something equally impressive, of course!
Re:But is it worth it? by nine-times · 2005-11-02 06:34 · Score: 2, Insightful

Well, I'm assuming that the hope is that support for encoding/decoding h264 will be put into hardware going forward (meaning it will find its way into low-end cards as well). I know encoding h264 is the longest, most processor intensive task I do with a computer these days, and a hardware solution that would drop any time off that task would be appreciated.

Crippled? by bigberk · 2005-11-02 05:39 · Score: 4, Funny

But will the outputs have to be certified by Hollywood or the media industry? You know, because the only reason for processing audio or video is to steal profits from Sony, BMG, Warner, ... and renegade hacker tactics like A/D conversion should be legislated back to the hell they came from

Re:This would be great for MythTV.. Linux support? by ratboy666 · 2005-11-02 05:49 · Score: 2, Informative

GPU Stream programming can be done with Brook http://graphics.stanford.edu/projects/brookgpu/. Brook supports the nVidia series, so that is what you purchase.

Pick up a 5200FX card (for SVIDEO/DVI output) and then use the GPU to do audio and video transcode. I have been thinking about audio (MP3) transcode as a first "trial" application.

"Heftier" GPUs may be used to assist in video transcode -- but it strikes me that the choice of stream programming system is most important (to allow code to move to other GPUs, driver permitting). I think that nVidia also supports developers using the GPU (there are comments and test results generated by nVidia available on the 'web). So far, not much from ATI, so I think nVidia gets the nod...

Ratboy.

--
Just another "Cubible(sic) Joe" 2 17 3061

Already available.. by LWATCDR · 2005-11-02 05:54 · Score: 2, Insightful

I have seen a combo FPGA/PPC chip for embedded applications. The issue I see with this is how long would it be useful? FPGAs are slower than ASICs. Something like the Cell or a GPU will probably be faster than an FPGA. There are a few companies looking at "reconfigurable" computers. So far I haven't heard about any products from them.

--
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.

Re:Already available.. by tomstdenis · 2005-11-02 06:21 · Score: 3, Interesting

FPGAs aren't always slower than what you can do in silicon. AES [sorry I have a crypto background] takes 1 cycle per round in most designs. You can probably clock it around 30-40MHz if your interface isn't too stupid. AES on a PPC probably takes about the same time as on a MIPS, which is about 1000-1200 cycles.

Your clock advantage is about 10x [say], that is, a typical 400MHz PPC vs. a 40MHz FPGA ... so that 1000 cycles is 100 FPGA cycles. But an AES block takes 11 FPGA cycles [plus load/unload time] so say about 16 cycles. Discounting bus activity [which would affect your software AES anyways] you're still ahead by ~80 FPGA cycles [800 PPC cycles].
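
[Editor's sketch] The arithmetic in the paragraph above can be checked with a few lines of C. Every figure here (400MHz PPC, 40MHz FPGA, about 1000 software cycles per AES block, about 16 FPGA cycles including load/unload) is the rough estimate from this post, not a measurement, and the function names are made up for illustration.

```c
#include <assert.h>

/* Convert a cycle count on one clock into the number of cycles that take
 * the same wall-clock time on another clock. Clock rates are in MHz.
 * Hypothetical helper, not from any library. */
double equivalent_cycles(double cycles, double from_mhz, double to_mhz)
{
    assert(from_mhz > 0.0 && to_mhz > 0.0);
    return cycles * to_mhz / from_mhz;
}

/* FPGA cycles saved per AES block, using the post's rough estimates. */
double fpga_cycles_saved(void)
{
    const double ppc_mhz = 400.0;               /* typical PPC clock (estimate) */
    const double fpga_mhz = 40.0;               /* FPGA clock (estimate) */
    const double ppc_cycles_per_block = 1000.0; /* software AES (estimate) */
    const double fpga_cycles_per_block = 16.0;  /* 11 cycles + load/unload */

    /* 1000 PPC cycles last as long as 100 FPGA cycles. */
    double software_in_fpga_cycles =
        equivalent_cycles(ppc_cycles_per_block, ppc_mhz, fpga_mhz);

    return software_in_fpga_cycles - fpga_cycles_per_block;
}
```

With these inputs the saving works out to 84 FPGA cycles per block, which the post rounds to ~80 (roughly 800 PPC cycles).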

Though the more common use for an FPGA aside from co-processing is just to make a flexible interface to hardware. E.g. want something to drive your USB, LCD and other periphs without paying to go to ASIC? Drop an FPGA in the thing. I assure you controlling a USB or LCD device is much more efficient in an FPGA than in software on a PPC.

Tom

--
Someday, I'll have a real sig.
Re:Already available.. by forkazoo · 2005-11-02 07:28 · Score: 2, Informative

Ummm... Comparing a general purpose CPU to an FPGA is a bit odd. The grand-parent post was talking about ASICs vs. FPGAs. An ASIC can implement exactly the same structure as an FPGA, so it can work just as efficiently, but an ASIC can be made to clock higher than an FPGA. Somebody mod the parent post "non sequitur."
Re:Already available.. by tomstdenis · 2005-11-02 07:55 · Score: 2, Insightful

> It's fine that you're an amateur cryptographer, but that is a completely different field than computer engineering

Which is interesting because I'm the author of some widely deployed cryptographic software, and I worked at an IP design company [for cryptographic cores]. I'd say I'm no longer an amateur when I make enough money to live on my own.

> Apparently, according to you, FPGAs aren't made from silicon, they're made from fluffy bunny pixie dust

There is a strong price difference from PCB design and tapeout. If you're making less than a million devices or so it's cheaper to just use an FPGA because the tapeout alone will cost you millions.

"going to silicon" is a common expression which means to tapeout a design in real hardware. Sure FPGAs are "real silicon" but there is a big difference between using an FPGA and an ASIC in a fielded design.

Your last paragraph is about as far from the truth as can get. The last design I know of where the CPU controlled the peripherals was the Atari 2600. Even the gameboy had dedicated LCD controllers.

You're right I'm not an EE. I never claimed to be. Though what I speak of is from my experience working alongside these folk [as well as I have quite a few friends who design FPGAs for a living].

In other words you're trying to look all cutesy by trying to make me look stupid but really you haven't the first foggiest clue what you are talking about.

Tom

--
Someday, I'll have a real sig.
Re:Already available.. by Jerry+Coffin · 2005-11-02 10:56 · Score: 2, Informative

> There's another problem with general-purpose FPGAs (order of magnitude comparison only):
> Athlon 64 4000 (from pricewatch): $330
> Xilinx 2.4 million gate design (from digikey): $2100-$5000

You haven't specified which FPGAs you're talking about, but at those prices, you should be getting more like 6 million gates or so (e.g. an XC2V6000 goes for about $4000). Perhaps you're looking at something like a Virtex-4 FX? If so, you should be aware that what you're looking at not only includes FPGA gates, but also includes a PowerPC core (or perhaps 2) as well.

> The computing world would look a lot different if there were good $100 high-speed, high-capacity FPGAs. Now, I wouldn't argue with a good ASIC or highspeed DSP implementation for some algorithms...

It depends a bit on what you mean by high capacity and high speed. At around $100US, you can get a 1.5 million gate Spartan 3, or a somewhat smaller Virtex (which will generally run a bit faster).
These obviously aren't the biggest or fastest FPGAs available, but for the right kind of job, they'll still blow away a general purpose CPU pretty easily.
As far as ASICs vs. FPGAs goes, it's really not a contest: ASICs are fast, but have specific purposes. FPGAs are slower, but can be programmed. Given the idea originally stated in this thread, ASICs simply don't seem (to me) like contenders at all.
--
The universe is a figment of its own imagination.

Re:Already available.. by forkazoo · 2005-11-02 12:42 · Score: 2, Informative

Photon317 writes:
The original post never mentioned ASICs that I saw.

Ummm... Okay, here is a quote from the original post again... by LWATCDR:
I have seen a combo FPGA/PPC chip for embedded applications. The issue I see with this is how long would it be useful? FPGAs are slower than ASICs.

And then a quote from tomstdenis:
FPGAs aren't always slower than what you can do in silicon.

tom then goes on to talk about PPC versus FPGAs, as if LWATCDR weren't talking about ASICs. Since this conversation now involves so many people, I hope I've quoted clearly enough. Anyhow, the explanation that an FPGA can be faster than a general purpose CPU was correct, but a complete non sequitur to LWATCDR's point that ASICs are faster than FPGAs.

I do agree that the basic question of general-purpose CPU vs. other is relevant to the article. I couldn't quite bring myself to claim he was off-topic, just non sequitur. Now, to get to something interesting you said, rather than just picking nits about who said what...
Photon317:
The idea of sticking one or more FPGAs into a machine via an I/O bus certainly has merit. I think the main issue is that we don't have compiler toolchains, libraries, and kernels ready to take advantage of it in an intelligent way. The biggest problem is that the FPGA computations need to be able to fallback to the general purpose CPU, which has an entirely different instruction set. A method that might be used, for example, would be to wrap things up such that in the application source code you have two functions with identical call signatures and supposedly identical behavior - one is for the cpu, the other is for fpga offloading. Then the runtime linker and the kernel can work magic together and schedule applications any time they need on the FPGAs and dynamically cause an application to fallback to its cpu code as well.

I like the basic idea, but I'm not so sure how well dynamically sharing a function between an FPGA and a CPU would work in practice. In theory, it would be like a thread migrating from one normal CPU to another.

Since FPGAs have significant latency when they are reconfigured, I have to think that you wouldn't really want the kernel to be dynamically deciding which app gets FPGA time, and which gets CPU time. I think a better interface would be that the programmer has to manually write an optimisation for an FPGA. This eliminates the need to have automatically generated matching functions in hardware and software. The programmer can decide that the specific functions X, Y, and Z should be able to run on an FPGA if it is available. Whether the FPGA programming code comes from a C compiler or from hand coded FPGA specific stuff doesn't matter. There should be some standard interchange format for the FPGA data. gcc should be able to take some C code and output FPGA intermediate programming data from it.

Then, at run time, an app can do something like:
register_fpga(num_gates_needed, programming_data, function_name);

with a matching "unregister_fpga(function_name)"

These two functions would act sort of like a malloc and free for the FPGA, so the kernel could choose to assign any given function to any of the 0 or more FPGA's in the system. Many applications could each allocate chunks of the FPGA(s). The application itself just needs to call the function by a function pointer, so it can use the hardware or software version by swapping the value of the pointer. (Just like we do now with code bases that have optimised versions of functions for various SIMD variants)

calc_foo = soft_calc_foo; // or calc_foo = fpga_calc_foo, or sse_calc_foo, or altivec_calc_foo

With each app only registering the functions that it knows will most benefit from optimisation, there will be more room on the FPGA hardware for other apps...
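
[Editor's sketch] The registration-plus-function-pointer scheme described above might look something like this in C. To be clear about assumptions: register_fpga() is the hypothetical call proposed in this post, not an existing kernel or library API; here it is a stub that simulates a pool of free gates, fpga_calc_foo is a stand-in that does not touch any hardware, and the matching unregister_fpga() is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Signature shared by every implementation of the routine. */
typedef int (*calc_foo_fn)(int);

/* Plain CPU implementation, always available. */
int soft_calc_foo(int x)
{
    return x * x + 1;
}

/* Stand-in for the FPGA-backed version. A real one would hand the work to
 * the device; it must return the same result as soft_calc_foo. */
int fpga_calc_foo(int x)
{
    return x * x + 1;
}

/* Simulated pool of unallocated FPGA gates (0 means "no FPGA present"). */
static long fpga_gates_free = 0;

/* HYPOTHETICAL stub for the register_fpga() call proposed in the post.
 * Returns 0 on success, -1 if the request cannot be satisfied. */
int register_fpga(long num_gates_needed, const void *programming_data,
                  const char *function_name)
{
    assert(function_name != NULL);
    (void)programming_data;   /* a real version would upload this */
    if (num_gates_needed <= 0 || num_gates_needed > fpga_gates_free)
        return -1;
    fpga_gates_free -= num_gates_needed;
    return 0;
}

/* Choose an implementation once; callers then go through the pointer. */
calc_foo_fn select_calc_foo(long num_gates_needed, const void *programming_data)
{
    if (register_fpga(num_gates_needed, programming_data, "calc_foo") == 0)
        return fpga_calc_foo;
    return soft_calc_foo;     /* software fallback */
}
```

An application would call `calc_foo = select_calc_foo(...)` at startup and then use `calc_foo(x)` everywhere, the same way code bases with SIMD-specific variants pick a routine after checking CPU features.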
Re:Already available.. by Hurricane78 · 2005-11-02 19:27 · Score: 2, Informative
> There should be some standard interchange format for the FPGA data. gcc should be able to take some C code and output FPGA intermediate programming data from it.

Smile! This stuff has already existed for years:

You just have to build a library that
- shoves "compiled" logic chunks to the chip
- uses the FPGA-board's upload functionality as a pluggable driver
- does the resource management.
Everything else is already there.
- You can get some FPGA developer board to develop and test your library.
- You can use SPARK to compile your C code to VHDL.
- I guess VHDL can be uploaded directly to the FPGA. If not, maybe stuff like gEDA or similar tools for VHDL will help...
- I am a total n00b in things of hardware design, but I found this in 1-2 hours of investigation and reading via Wikipedia.
The problem is that FPGA boards are pretty expensive... (The least expensive I found was some 66MHz devboard for $150. The most expensive had 500MHz and a price tag of ~$7000!! [including a ton of golden analog contacts and stuff ;])
--
Any sufficiently advanced intelligence is indistinguishable from stupidity.

GPU or CPU? by The+Bubble · 2005-11-02 05:58 · Score: 3, Interesting

Video cards with GPUs used to be a "cheap" way to increase the graphics processing power of your computer by adding a chip whose sole purpose was to process graphics (and geometry, with the advent of 3D accelerators).

Now that GPUs are becoming more and more programmable, and more and more general-purpose, what, really, is the difference between a GPU and a standard CPU? What is the benefit to having a 3D accelerator over having a dual-CPU system with one CPU dedicated to graphics processing?

Re:GPU or CPU? by gr8_phk · 2005-11-02 06:12 · Score: 3, Insightful

"what, really, is the difference between a GPU and a standard CPU? What is the benefit to having a 3D accelerator over having a dual-CPU system with one CPU dedicated to graphics processing?"
In a few years, there will be no real benefit to the GPU. Not too many people write optimized assembly level graphics code anymore, but it can be quite fast. Recall that Quake ran on a 90MHz Pentium with software rendering. It has only gotten better since then. A second core that most apps don't know how to take advantage of will make this all the more obvious.
On another note, as polygon counts skyrocket they approach single pixel size. When that happens, the hardware pixel shaders - that GPUs have so many of - become irrelevant as the majority of the work moves up to the vertex unit. Actually at that point it makes a lot of sense to move to raytracing (something I have fast code for) which is also going to be quite possible in a few more years on the main CPU(s). Ray Tracing is one application that really shows why the GPU is NOT general purpose. You need data structures and pointers mixed with fast math - preferably double precision. You need recursive algorithms. You'll end up wanting a MMU. By the time you're done, the GPU really would need to be general purpose. The problem doesn't map to a GPU at all, and multicore CPUs are nearing the point where full screen, real time ray tracing will be possible. GPUs will not stand a chance.
Re:GPU or CPU? by Jerry+Coffin · 2005-11-02 06:58 · Score: 2

> What is the benefit to having a 3D accelerator over having a dual-CPU system with one CPU dedicated to graphics processing?

That depends on what you mean by the "one CPU dedicated to graphic processing." If you mean something on the order of a second Pentium or Athlon that's dedicated to graphics processing, the advantage is tremendous: a typical current CPU can only do a few floating point operations in parallel, where a GPU has lots of pipes to handle multiple pixels at a time (or multiple vertexes at a time, depending on which part of the pipeline you're looking at), and each pipe (at least potentially) does vector processing to work on all four pixel components at once.
The result of all that is that the GPU has substantially higher overall floating point throughput than the CPU does.
If, OTOH, what you're suggesting is that the second CPU that's dedicated to graphics processing be optimized for that by having lots of floating point hardware, a much larger number of parallel pipelines to process multiple pixels at once, etc., then what you're suggesting really comes down to pretty close to what we have right now, but renaming the "GPU" as "secondary CPU".
In fairness, there are some differences even now. First of all, you program the GPU using a slightly different programming language that includes primitives for working on things like 3- and 4-element vectors, and for doing the kinds of things you typically have to do with them (e.g. compute normals) that require a series of instructions on a normal CPU.
The other obvious difference is that the GPU normally has its own memory, mostly for the sake of improved bandwidth. You could more or less homogenize the memory, using (for example) half a dozen or so DDR channels to your main memory, and have all the processors share them symmetrically -- but that imposes some extra difficulties on design and would probably drive the price up considerably (or limit the overall design).
In particular, the main memory bus normally allows you to plug in varying numbers of varying sizes of memory modules, where the GPU typically has a specific number of modules of known sizes. This makes it much easier to design bus drivers in the GPU because the bus loading is known at design time. That's a large part of the reason motherboards are still transitioning to DDR2 memory while high-end graphics cards are now universally using GDDR3.
The other problem with that would be that it would then require essentially everybody to pay (most of) the price of a high-end graphics system whether they wanted it or not. Given the number of machines sold with (for example) Intel Integrated Graphics, it's pretty clear that most people are willing to sacrifice performance for lower price.
--
The universe is a figment of its own imagination.

Re:GPU or CPU? by LaPoderosa · 2005-11-02 09:01 · Score: 2, Interesting

"In a few years, there will be no real benefit to the GPU"
Nonsense - we're actually going in the other direction, we need more general purpose massively parallel processing units to go beyond current hardware limitations. Dual CPUs do not come close to the level of parallelism we have on GPUs. Rendering a 1600x1200 4X AA scene with full filtering on a top tier dual core system would yield perhaps 1fps with an optimized software path. That gives you an idea of the order of magnitude you gain in performance with parallelizing these tasks on the GPU.
"[GPUs] need data structures and pointers mixed with fast math - preferably double precision. You'll end up wanting a MMU"
Nonsense. GPUs already do everything you need for raytracing. There are demos on the internet. Raytracing is ideally suited to GPUs - there's so much you can parallelize.
"Actually at that point it makes a lot of sense to move to raytracing"
Nonsense. You're off by orders of magnitude. Maybe they just haven't seen your fast code... *rolls eyes*
Re:GPU or CPU? by SlayerDave · 2005-11-02 10:38 · Score: 4, Insightful

You're hallucinating, buddy. Let me count the ways.
1. On another note, as polygon counts skyrocket they approach single pixel size
This is not happening. Not anywhere (except maybe production rendering). It is far too time-consuming, expensive, and labor-intensive to produce huge numbers of high-polygon-count models for games. Vertex pipes are currently under-utilized in most games and applications now. Efforts are underway to allow procedural geometry creation on the GPU to better fill the vertex pipe without requiring huge content creation efforts. See this paper for details.
2. A second core that most apps don't know how to take advantage of will make this all the more obvious.
This undercuts the argument you make in the next paragraph. Also, it's not true. Both the PS3 and XBOX 360 have multiple CPU cores. It's true that current-gen engines aren't optimized for this technology, but next-gen engines will be.
3. multicore CPUs are nearing the point where full screen, real time ray tracing will be possible. GPUs will not stand a chance.
This might be true, but so what? Ray tracing offers few advantages over the current-gen programmable pipeline. I can only think of 2 things that a ray-tracer can do that the programmable pipeline can't: multilevel reflections and refraction. BRDFs, soft shadows, self-shadowing, etc. can all be handled in the GPU these days. Now, you can get great results by coupling a ray-tracer with a global illumination system like photon mapping, but that technique is nowhere near real-time. Typical acceleration schemes for ray-tracing and photon mapping will not work well in dynamic environments, but the GPU couldn't care less whether a polygon was somewhere else on the previous frame.
Hate to break it to you, but the GPU is here to stay. Why? GPUs are specialized for processing 4-vectors, not single floats (or doubles) like the CPU + FPU. True, there are CPU extensions for this, such as SSE and 3DNOW, but typical CPUs have a single SSE processor, compared to a current-gen GPU with 8 vertex pipes and 24 pixel pipes. Finally, do you really want to burden your extra CPU with rendering when it could be handling physics or AI?
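
[Editor's sketch] To make the 4-vector point concrete, here is a small C illustration of the kind of operation a pixel pipe performs. It is plain scalar C, so it only shows the shape of the work, not GPU code; the type and function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* One RGBA pixel or XYZW vertex: the 4-vector a GPU pipe works on. */
typedef struct { float v[4]; } vec4;

/* Component-wise multiply-add: out = a * b + c.
 * Scalar C walks the four lanes one after another; a GPU pipe (or an SSE
 * unit) handles all four lanes in a single vector operation. */
vec4 vec4_madd(vec4 a, vec4 b, vec4 c)
{
    vec4 out;
    for (int i = 0; i < 4; i++)
        out.v[i] = a.v[i] * b.v[i] + c.v[i];
    return out;
}

/* Apply the same operation to every pixel. No iteration depends on any
 * other, which is what lets a GPU spread the loop across its pixel pipes. */
void modulate_pixels(vec4 *pixels, size_t count, vec4 tint, vec4 bias)
{
    assert(pixels != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        pixels[i] = vec4_madd(pixels[i], tint, bias);
}
```

On a CPU with a single SSE unit the outer loop still runs one pixel at a time; the post's point is that a GPU with 24 pixel pipes can have that many iterations in flight at once.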
Re:GPU or CPU? by gr8_phk · 2005-11-02 10:39 · Score: 3, Insightful

"Which approach is going to be most effective but economical for rendering fields of grasses or detailed jungles? How about a snowstorm with snow that gets denser and fog like into the distance? Sand dunes that give way and slide underfoot? Water that breaks around objects and coats them in a wet sheen?"
Most of that stuff can be done with OpenGL/DirectX or ray tracing. Grasses are sometimes done in OpenGL by instancing small clumps. In RT you'd use procedural geometry or instancing.
For the snow, both renderers would probably use similar techniques.
Sand dunes - either method needs an engine with deformable geometry - both can support that.
Water simulation is something I don't know much about. For the FFT methods of simulating waves it's possible that a GPU has an advantage. Once it starts interacting with objects, I don't know how people handle that.
Your questions all point toward vast detailed worlds with lots of polygons. RT scales better with scene complexity. To get more traditional methods to work well, you get into fancy culling techniques (HZB comes to mind) and RT starts to look simpler - because it is.

Yawn... by benjamindees · 2005-11-02 06:04 · Score: 2, Interesting

nVidia has been doing this for a while now. In fact, there are finally getting to be interesting implementations like GNU software radio on GPUs:

An Implementation of a FIR Filter on a GPU

--
"I assumed blithely that there were no elves out there in the darkness"

Re:Yawn... by ehovland · 2005-11-02 06:18 · Score: 2, Interesting

To see the latest generation of this work, check out their sourceforge page:
http://openvidia.sourceforge.net/

Re:GPU advantages over CPU? by tomstdenis · 2005-11-02 06:11 · Score: 4, Informative

GPUs are massively parallel DSP engines. That makes them ideally suited for the task. They can do things like "let's multiply 8 different floats in parallel at once". Which is useful when doing transforms like the iDCT or DCT which are capable of taking advantage of the parallelism.
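
[Editor's sketch] The "multiply 8 floats at once" step is the inner operation of an 8-point DCT: each output coefficient is a dot product of eight samples with one basis row. The C below is only an illustration of that data independence. It assumes the orthonormal DCT-II basis (the post does not say which variant is meant), includes just two of the eight rows to stay short, and uses made-up names.

```c
#include <assert.h>
#include <stddef.h>

#define N 8

/* Eight independent multiplies followed by a sum. A CPU without SIMD does
 * the products one by one; a GPU or vector unit can issue all eight at once. */
float dot8(const float *samples, const float *basis)
{
    assert(samples != NULL && basis != NULL);

    float products[N];
    for (int i = 0; i < N; i++)      /* the parallelisable part */
        products[i] = samples[i] * basis[i];

    float sum = 0.0f;
    for (int i = 0; i < N; i++)
        sum += products[i];
    return sum;
}

/* Two rows of the orthonormal 8-point DCT-II basis (assumed variant).
 * Row 0 (DC) is the constant 1/sqrt(8); row 4 is 0.5*cos((2n+1)*pi/4). */
const float dct_row0[N] = {
    0.35355339f, 0.35355339f, 0.35355339f, 0.35355339f,
    0.35355339f, 0.35355339f, 0.35355339f, 0.35355339f
};
const float dct_row4[N] = {
    0.35355339f, -0.35355339f, -0.35355339f, 0.35355339f,
    0.35355339f, -0.35355339f, -0.35355339f, 0.35355339f
};
```

A full 8-point transform is eight such dot products, and an 8x8 block applies it to the rows and then the columns, so there is plenty of independent work to spread across parallel hardware.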

But don't take that out of context. Ask a GPU to compile the Linux kernel [which is possible] and an AMD64 will spank it something nasty. *GENERAL* purpose processors are slower at these very dedicated tasks but at the same time capable of doing quite a bit with reasonable performance.

By the same token, a custom circuit can compute AES in 11 cycles [1 if pipelined] at 300MHz which when you scale to 2.2GHz [for your typical AMD64] amounts to ~80 cycles. AES on the AMD64 takes 260 cycles. But, ask that circuit to compute SHA-1 and it can't. Or ask it to render a dialog box, etc...

Tom

--
Someday, I'll have a real sig.

But I'd rather have it the other way around! by Macguyvok · 2005-11-02 06:12 · Score: 2, Interesting

I'd rather see GPUs offloading their work to the system CPU. There's no *good* way to do this. So, why not run this in reverse? If it's possible to speed up general processing, why can't they speed up graphics processing? Especially since my CPU hardly does anything when I'm playing a game; it has to wait on the graphics card.

So, what about it ATI? Or will this be an NVIDIA innovation?

--
--Mac "Nine point eight meters per second squared: The Best Damn Windows Accelerator, Ever."

Using their own codecs by no_such_user · 2005-11-02 06:22 · Score: 4, Insightful

It looks like they're using their own codec to produce MPEG-2 and MPEG-4 material. How would you get an existing, x86-only aware application to utilize the GPU, which is not x86 instruction compatible? It's a good bet that codecs will be rewritten to utilize the GPU once code becomes available from ATI, nVidia, etc.

I'd actually be willing to spend more than $50 on a video card if more multimedia apps took advantage of the GPU's capabilities.

lessons of "array processors" from 1980s by peter303 · 2005-11-02 06:25 · Score: 3, Informative

In the scientific computing world there have been several episodes where someone comes up with an attached processor an order of magnitude faster than a general purpose CPU and tries to get the market to use it. Each generation improved the programming interface, eventually using some subset of C (now Cg) combined with a preprogrammed routine library.

All these companies died mainly because the commodity computer makers could pump out new generations about three times faster and eventually catch up. And the general purpose software was always easier to maintain than the special purpose software. Perhaps graphics card software will buck this trend because it's a much larger market than specialty scientific computing. The NVIDIAs and ATIs can ship new hardware generations as fast as the Intels and AMDs.

Re:lessons of "array processors" from 1980s by TheRaven64 · 2005-11-02 08:58 · Score: 3, Interesting

A lot of the improvements in CPU performance recently have come from vector units. On OS X, things like the AAC encoder make heavy use of AltiVec - to the degree that ripping CDs on my PowerBook is limited by the speed of the CD drive, not the CPU.
A GPU is, effectively, a very wide vector unit (1024-bits is not uncommon). What happens when CPUs all include 2048-bit general purpose vector units? What happens when they include a couple on each core in a 128-core package? Sure, a dedicated GPU will still be faster - but it won't be enough faster that people will care. For comparison, take a look at Chromium. Chromium is a software OpenGL implementation that runs on clusters. Even with relatively small clusters, it can compete fairly well with modern GPUs - now imagine what will happen when every machine has a few dozen cores in their CPU.

--
I am TheRaven on Soylent News

Done in Roxio Easy Media Creator 8 by Anonymous Coward · 2005-11-02 06:25 · Score: 2, Informative

fyi this is already done by Roxio in Easy Media Creator 8. they offload a lot of the rendering or transcoding to GPUs that support it. for those that are older they have a software fallback. probably not an increase by such a large factor but still a significant boost on newer PCI-E cards.

Re:Will all x1000 cards do this? by freakyfreak2 · 2005-11-02 06:26 · Score: 2, Informative

It is very specific about this
From the article (second page):
"The application only works with X1000 series graphics cards, and it only ever will. That's the only architecture with the necessary features to do GPU-accelerated video transcoding well."

Apple's core image by acomj · 2005-11-02 06:29 · Score: 3, Informative

Some of Apple's APIs (Core Video/Core Image/Core Audio) use the GPU when they detect a supported card; otherwise they just use the CPU, seamlessly and without fuss. So this isn't new.

http://www.apple.com/macosx/features/coreimage/

Re:This would be great for MythTV.. Linux support? by EpsCylonB · 2005-11-02 06:34 · Score: 2, Informative

When I got my 6600GT, the box it came in said it could do hardware MPEG2 encoding; obviously this is not the case. I remember reading somewhere that nVidia originally wanted the 6XXX series to be able to do loads of on-board video stuff but they couldn't get it working in time. It's a real shame.

There's a CPU in my keyboard too... by Anonymous Coward · 2005-11-02 06:54 · Score: 2, Funny

As I remember from my hardware class...there's an Intel 8051 or similar in most PC keyboards...wouldn't it be cool to somehow be able to use that CPU for something useful (aside from polling the keyboard)

Re:There's a CPU in my keyboard too... by Saffaya · 2005-11-02 08:34 · Score: 5, Interesting

Though I am sure you wrote that as a pure joke, this has already been done long ago. During the fierce competition on the demo scene between the ATARI ST and the Amiga, crews were exploiting every speck of power they could from their machines. The ATARI ST being a general purpose machine compared to the Amiga (which had very advanced sound and graphical custom processors), the programmers who wanted to pull off the same graphical effects went as far as using the processor managing the keyboard (a 68xx 8bit motorola chip) for added computational power.

Keep in mind by Solr_Flare · 2005-11-02 06:58 · Score: 2, Insightful

That while few people will notice the difference between 150fps and 200fps, those numbers are more or less there to help you determine the lifespan of the card itself. While, for current games, both cards will perform extremely well, a 50fps difference means that on future games, the Nvidia card will be able to last longer and run with more graphics options enabled without bottoming out on fps.

While a select few individuals still always buy the latest and the greatest, the majority of buyers look at video cards as long term investments mainly because of the ridiculously inflated prices in the GPU market. All that said, I think you have to look at the card's feature set and make a decision based on that. While, gaming wise, the Nvidia GPU may be superior, the dramatically reduced transcoding times definitely make the ATI card a potentially attractive purchase to people who work a lot with video. Given the amazing rise in popularity of the video iPod and the existing PSP market, the number of people with interests in transcoding video is definitely on the rise, and ATI was smart in tapping that market now.
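To put rough numbers on the 150fps vs 200fps headroom argument, here is a back-of-the-envelope sketch. The 5x "future game" load factor is purely an assumed figure, and real games don't scale this linearly; the function names are made up for illustration.

```python
def frame_time_ms(fps: float) -> float:
    """Time budget per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def fps_under_load(fps_today: float, load_factor: float) -> float:
    """Frame rate if each frame costs load_factor times as much work.

    Assumes perfectly linear scaling, which is a simplification.
    """
    return fps_today / load_factor


# Today the gap is under 2 ms per frame, which nobody will see.
gap_ms = frame_time_ms(150) - frame_time_ms(200)

# Assumed future title that is 5x heavier per frame:
slower_card = fps_under_load(150, 5)  # 30.0 fps
faster_card = fps_under_load(200, 5)  # 40.0 fps
```

Under that assumption the same relative gap is the difference between 30fps and 40fps, which is where it starts to matter.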

--
You are who you are, let no one tell you different. But, never close your mind to a new point of view.

Linux Support by Yerase · 2005-11-02 07:03 · Score: 3, Informative

There's no reason there couldn't be Linux Support. At the IEEE Viz05 Conference there was a nice talk from the guys operating www.gpgpu.org about cross-platform support, and there's a couple of new languages coming out that act as wrappers for Cg/HLSL/OpenGL on both ATI & NVidia, & Windows & Linux... Check out Sh (http://libsh.sourceforge.net/) and Brook (http://brook.sourceforge.net/). Once their algorithm is discovered (Yipee for Reverse engineering), it won't be long.

Re:This would be great for MythTV.. Linux support? by thatshortkid · 2005-11-02 07:05 · Score: 5, Interesting

wow, for once there's a slashdot article i have insight on! (whether it's modded that way remains to be seen.... ;) )

i would actually be shocked if there weren't linux support. the ability to do what they want only needs to be in the drivers. i've been doing a gpgpu feasibility study as an internship and did an mpi video compressor (based on ffmpeg) in school. using a gpu for compression/transcoding is a project i was thinking of starting once i finally had some free time since it seems built for it. something like 24 instances running at once at a ridiculous amount of flops (puts a lot of cpus to shame, actually). if you have a simd project with 4D or under vectors, this is the way to go.

like i said, it really depends on the drivers. as long as they support some of the latest opengl extensions, you're good to go. languages like Cg and BrookGPU, as well as other shader languages, are cross-platform. they can also be used with directx, but fuck that. i prefer Cg, but ymmv. actually, the project might not be that hard, just needs enough people porting the algorithms to something like Cg.

that said, don't expect this to be great unless your video card is pci-express. the agp bus is heavily asymmetric towards data going out to the gpu. as more people start getting the fatter, more symmetric pipes of pci-e, look for more gpgpu projects to take off.

--
The IRS is the one organization that you don't want to fuck with. Remember, these are the guys who took down Al Capone.

funny about memory comments by iamhassi · 2005-11-02 07:19 · Score: 2, Interesting

it's funny to read the article and see them brag about the "very fast RAM":
"This is, after all, one of the fastest CPUs money can buy, paired with very fast RAM.
"1 GB of very low latency RAM "

After the other review posted today about fast memory doing almost nothing for transcoding:
"moving to tighter memory timings or a more aggressive command rate generally didn't improve performance by more than a few percentage points, if at all, in our tests."
"Mozilla does show a difference between the settings, both on its own and when paired with Windows Media Encoder. Still, the differences in performance between 2-2-2-5 and 2.5-4-4-8 timings, and between the 1T and 2T command rates, are only a couple of percentage points."

--
my karma will be here long after I'm gone

Re:GPU advantages over CPU? by DotDotSlasher · 2005-11-02 07:20 · Score: 2, Informative

Wouldn't it just be easier to have multiple CPUs?
Why, yes it would. GPUs fill a For one thing, about 90% of the transistors on a GPU are used for processing. About 60% on a CPU are used for processing (the rest is used for caching).
There are also many more transistors in GPUs these days than CPUs. Graphics processing is inherently parallel and streamed. That's what a GPU does very well, very fast. Grab 8 texture samples simultaneously each clock cycle, the next stage linearly blends these floating point values together in one clock. A CPU would have to work on each of those 8 one at a time.
For parallel, streamed operations - a GPU can speed up a process by 5 or 10 times, like this example. At SIGGRAPH this summer, they had a session on running a ray tracer on a GPU. After 15 minutes of explaining all of the optimizations they performed, they were happy to report that they were only 5x slower than a CPU implementation. Ray tracing is not a very parallel, streamed operation. Rays can bounce 10 times or maybe not at all.
So, let's review: GPUs are significantly faster than a CPU for graphics and some streamable parallelizable processes. CPUs are great for branchable, more random processing.
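A toy sketch of the contrast in the review above, in plain Python (so nothing here actually runs on a GPU). `blend_samples` is the streamable case: every output depends only on its own pair of inputs. `trace_ray` is the branchy case: how long it runs depends on the data. The energy/absorption model is invented purely for illustration and is not how a real ray tracer works.

```python
from typing import List, Sequence


def lerp(a: float, b: float, t: float) -> float:
    """Linear blend between a and b; t=0 gives a, t=1 gives b."""
    return a + (b - a) * t


def blend_samples(left: Sequence[float], right: Sequence[float], t: float) -> List[float]:
    """Streamable work: each lane is independent of every other lane.

    Python walks the pairs one at a time, like the CPU in the comment;
    a GPU could do all eight lanes in the same clock.
    """
    return [lerp(a, b, t) for a, b in zip(left, right)]


def trace_ray(energy: float, absorb: float, max_bounces: int = 10) -> int:
    """Branchy work: the loop count is decided by the data itself.

    Made-up model: each bounce keeps (1 - absorb) of the ray's energy,
    and the ray dies once it drops to 0.1 or below.
    """
    bounces = 0
    while energy > 0.1 and bounces < max_bounces:
        energy *= 1.0 - absorb
        bounces += 1
    return bounces


blended = blend_samples([0.0] * 8, [1.0] * 8, 0.5)
bounce_counts = [trace_ray(1.0, 0.0), trace_ray(1.0, 0.5), trace_ray(0.05, 0.5)]
```

All eight blends take the same path through the code, while the three rays finish after 10, 4 and 0 bounces. Lanes that finish at different times are what make the second kind of work awkward for wide parallel hardware.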

FPGA's cheap. Synthesis EXPENSIVE. by xtal · 2005-11-02 07:31 · Score: 2, Informative

Unfortunately, there are a few problems with this scenario in practice that prevent it from becoming widespread. I worked on optimizations with VHDL destined for FPGAs in a prior life.

- Tools: FPGA tools are getting better, but still suck compared to modern IDEs and software development. This might be me being jaded (VHDL can get nasty), some things like System C and others are in the infancy stage, but long ways to go here.

- Synthesis time: It can take DAYS on a very fast machine to run the synthesis that produces your design for the FPGA. Some designs work out to be impossible to synthesise; you might not find this out until hours or worse into the process. Then your whole design might have to change! Ha ha ha.

- Tool expense: The good tools cost a lot of money. The ones that can do good designs on the fly cost on the order of a new Ferrari or worse. Engineers that are familiar with optimizing and implementing these tools and designs cost a lot too, but sadly, don't get to drive too many Ferraris. (me anyway!)

- CPUs and GPUs are heavily optimized and VERY VERY VERY fast for most tasks. In many cases it is cheaper to go buy a farm than implement on an FPGA, unless you are trying to do something very specialized. FPGAs are more often used for specialty communications brokering, timing, and interfacing tasks where bus speeds on a micro are too low.

Great idea in principle. Wouldn't hold my breath, however.

--
..don't panic

Apple does this now. by ChrisA90278 · 2005-11-02 07:42 · Score: 3, Insightful

Apple does this now. "Core Image" is built into the OS and all "correctly written" applications that need to do graphics use Core Image. Core Image will use the GPU if one is available. This is a very good idea, but the hardest part of getting this to work on a non-Apple platform will be standardizing the API so that we can use any GPU. OK, X11 did this for displays on UNIX and we have OpenGL for 3D graphics, so we can hope something will happen with an API for GPU-based image transformation. The biggest use for this will not be just simple transcoding but editing and display programs for still and moving images. Think "gimp" and "cinelerra".

Re:Apple does this now. by GweeDo · 2005-11-02 09:11 · Score: 2, Interesting

Well, sorta. CoreImage is for video effects in real time. Like window transitions, transparency, shadows, blah blah blah.

The idea behind using your GPU in this case is even more far reaching. While using a GPU for any visual effect is fairly logical...what about SETI@Home? What about Folding? What about for running kalc :)

See the difference?

--
Unstable Apps: Our Android Apps Don't Suck

In the meantime... by Happy+Monkey · 2005-11-02 09:05 · Score: 2, Interesting

Does anyone have any transcoding software recommendations? Nero for some reason keeps losing audio sync after a few minutes of video.

--
__
Do ya feel happy-go-lucky, punk?
