Intel, NVIDIA Take Shots At CPU vs. GPU Performance

Posted by kdawson on Sunday June 27, 2010 @12:11AM from the army-boots dept.

MojoKid writes "In the past, NVIDIA has made many claims of how porting various types of applications to run on GPUs instead of CPUs can tremendously improve performance — by anywhere from 10x to 500x. Intel has remained relatively quiet on the issue until recently. The two companies fired shots this week in a pre-Independence Day fireworks show. The recent announcement that Intel's Larrabee core has been re-purposed as an HPC/scientific computing solution may be partially responsible for Intel ramping up an offensive against NVIDIA's claims regarding GPU computing."

It depends? by aliquis · 2010-06-27 00:23 · Score: 5, Insightful

Isn't it like saying "Ferrari makes the fastest tractors!" (yeah, I know!), which may be true, as long as they can actually carry out the things you want to do.
I don't know about the limits of OpenCL/GPU code (or of the architecture compared to regular CPUs: AMD64 functions, registers, cache, pipelines, whatnot), but I'm sure there are plenty and that someone will tell us.
1. Re:It depends? by jawtheshark · 2010-06-27 00:44 · Score: 5, Informative
  
  Try Lamborghini next time... You do know that Mr Lamborghini originally made his money making tractors. The legend says he wasn't satisfied with what Ferrari offered as sports cars and thus made one himself. Originally, Lamborghini is a tractor brand.... Not kidding. I think they still make them...
  
  --
  Ahhh...the great dumpster continuum. Many a free computer will be found there. -- sowth (748135)
2. Re:It depends? by Sycraft-fu · 2010-06-27 00:59 · Score: 5, Informative
  
  Basically, GPUs are stream processors. They are fast at tasks that meet the following criteria:
  1) Your problem has to be more or less infinitely parallel. A modern GPU will have anywhere in the range of 128-512 parallel execution units, and of course you can have multiple GPUs. So it needs to be something that can be broken down into a lot of pieces.
  2) Your problem needs to be floating point. GPUs push 32-bit floating point numbers really fast. The most recent ones can also do 64-bit FP numbers at half the speed. Anything older is pretty much 32-bit only. For the most part, count on single precision FP for good performance.
  3) Your problem must fit within the RAM of the GPU. This varies: 512MB-1GB is common for consumer GPUs, and 4GB is fairly easy to get for things like Teslas that are built for GPGPU. GPUs have extremely fast RAM connected to them, much faster than even system RAM. 100GB/sec+ is not uncommon. While a 16x PCIe bus is fast, it isn't that fast. So to get good performance, the problem needs to fit on the GPU. You can move data to and from the main memory (or disk) occasionally, but most of the crunching must happen on card.
  4) Your problem needs to have not a whole lot of branching, and when it does branch, multiple paths need to branch the same. GPUs handle branching, but not all that well. The performance penalty is pretty high. Also generally speaking a whole group of shaders has to branch the same way. So you need the sort of thing that when the "else" is hit, it is hit for the entire group.
  So, the more similar your problem is to that, the better GPUs work on it. 3D graphics would be an excellent example of something that meets that precisely, which is no surprise as that's what they are made for. The more you deviate from that, the less suited GPUs are. You can easily find tasks they are exceedingly slow at compared to CPUs.
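  To put rough numbers on criterion 3, here is a back-of-envelope sketch in Python. The 100 GB/s figure is the one quoted above; the 8 GB/s bus figure is an assumption (roughly one direction of a PCIe 2.0 x16 link), and the 4 GB working set is hypothetical. It is an estimate of streaming time only, not a benchmark.

```python
# Back-of-envelope estimate: time to stream a working set over the bus
# versus reading it once from on-card memory.
# Both bandwidth figures are assumptions for illustration, not measurements.

PCIE_GB_PER_S = 8.0       # assumed: roughly PCIe 2.0 x16, one direction
GPU_RAM_GB_PER_S = 100.0  # the "100GB/sec+" figure quoted in the comment


def seconds_to_move(size_gb, bandwidth_gb_per_s):
    """Time to stream size_gb of data at the given bandwidth."""
    return size_gb / bandwidth_gb_per_s


def bus_penalty(size_gb):
    """How many times slower one pass over the bus is than one pass on card."""
    over_bus = seconds_to_move(size_gb, PCIE_GB_PER_S)
    on_card = seconds_to_move(size_gb, GPU_RAM_GB_PER_S)
    return over_bus / on_card


if __name__ == "__main__":
    working_set_gb = 4.0  # hypothetical data set
    print(f"over the bus: {seconds_to_move(working_set_gb, PCIE_GB_PER_S):.3f} s")
    print(f"on the card:  {seconds_to_move(working_set_gb, GPU_RAM_GB_PER_S):.3f} s")
    print(f"penalty:      {bus_penalty(working_set_gb):.1f}x")
```

  Under these assumed figures, every pass that has to cross the bus costs about an order of magnitude more than a pass over data already on the card, which is why the crunching has to stay on the card.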
  Basically modern CPUs tend to be quite good at everything. They have strong performance across the board so no matter what the task, they can do it well. The downside is they are unspecialized; they excel at nothing. The other end of the spectrum is an ASIC, a circuit designed for one and only one thing. That kind of thing can be extremely efficient. Something like a gigabit switch ASIC is a great example. You can have a tiny chip that draws a couple of watts and yet can switch 50+gbit/sec of traffic. However that ASIC can only do its one task, no programmability. GPUs are something of a hybrid. They are fully programmable, but they are specialized into a given field. As such, at the tasks they are good at, they are extremely fast. At the tasks they are not, they are extremely slow.
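  One reason for "extremely slow" is the branching issue in criterion 4 above. A toy cost model in Python shows the idea. It assumes a group of lanes shares one instruction stream, so a group whose lanes disagree runs both sides of the branch one after the other. The group width of 32 and the cycle costs are made-up illustration values, and real hardware is more subtle than this.

```python
# Toy cost model for lockstep (SIMT-style) execution. Assumption: a group of
# lanes shares one instruction stream, so when lanes disagree on a branch the
# group runs every path that at least one lane took, one after another.

GROUP_SIZE = 32  # assumed group width; the real figure depends on the hardware


def group_cost(takes_if, if_cost, else_cost):
    """Cycles for one group, given each lane's branch decision."""
    cost = 0
    if any(takes_if):
        cost += if_cost    # at least one lane needs the "if" side
    if not all(takes_if):
        cost += else_cost  # at least one lane needs the "else" side
    return cost


def total_cost(decisions, if_cost, else_cost):
    """Sum the cost over consecutive groups of GROUP_SIZE lanes."""
    return sum(
        group_cost(decisions[i:i + GROUP_SIZE], if_cost, else_cost)
        for i in range(0, len(decisions), GROUP_SIZE)
    )


if __name__ == "__main__":
    uniform = [True] * 32 + [False] * 32     # each group agrees internally
    mixed = [i % 2 == 0 for i in range(64)]  # every group diverges
    print("uniform groups:", total_cost(uniform, if_cost=10, else_cost=10))
    print("mixed groups:  ", total_cost(mixed, if_cost=10, else_cost=10))
```

  Both inputs take the "if" side for exactly half the lanes. In this model only the arrangement changes the cost, which is the point about the "else" being hit for the entire group.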
You lazy fuckers by drinkypoo · 2010-06-27 00:46 · Score: 5, Interesting

I don't expect slashdot "editors" to actually edit, but could you at least link to the most applicable past story on the subject? It's almost like you people don't care if slashdot appears at all competent. Snicker.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
AMD by MadGeek007 · 2010-06-27 00:49 · Score: 5, Funny

AMD must feel very conflicted...
CPUs and GPUs have different goals by leptogenesis · 2010-06-27 01:04 · Score: 5, Interesting

At least as far as parallel computing goes. CPUs have been designed for decades to handle sequential problems, where each new computation is likely to have dependencies on the results of recent computations. GPUs, on the other hand, are designed for situations where most of the operations happen on huge vectors of data; the reason they work well isn't really that they have many cores, but that the operations for splitting up the data and distributing it to the cores are (supposedly) done in hardware. In a CPU, the programmer has to deal with splitting up the data, and allowing the programmer to control that process makes many hardware optimizations impossible.
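
To make "the programmer has to deal with splitting up the data" concrete, here is a minimal Python sketch of the manual bookkeeping on the CPU side. The helper names `split` and `parallel_sum_of_squares` are hypothetical, and the worker count is arbitrary. It only illustrates the chunking and recombining steps; CPython threads will not actually speed up this arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Cut data into `parts` contiguous chunks of near-equal length."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks get one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(data, workers=4):
    """Split by hand, hand each chunk to a worker, then combine the results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: sum(x * x for x in chunk), chunks)
        return sum(partials)


if __name__ == "__main__":
    print(split(list(range(10)), 3))
    print(parallel_sum_of_squares(list(range(10)), workers=3))
```

Every decision here (chunk size, number of workers, how to merge the partial results) is made in software by the programmer, which is the contrast being drawn with a GPU.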

The surprising thing in TFA is that Intel is claiming to have done almost as well on a problem that NVIDIA used to tout their GPUs. It really makes me wonder what problem it was. The claim that "performance on both CPUs and GPUs is limited by memory bandwidth" seems particularly suspect, since on a good GPU the memory access should be parallelized.
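
For what it's worth, the usual way to reason about a bandwidth limit is the roofline estimate: attainable throughput is the smaller of peak compute and memory bandwidth times the FLOPs performed per byte moved. A small Python sketch follows, with entirely hypothetical chip figures that are not taken from TFA or from any real part.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_per_s, flops_per_byte):
    """Roofline estimate: limited by compute or by memory traffic, whichever is lower."""
    return min(peak_gflops, bandwidth_gb_per_s * flops_per_byte)


if __name__ == "__main__":
    # Hypothetical chips, chosen only to illustrate the shape of the argument.
    cpu = {"peak_gflops": 100.0, "bandwidth_gb_per_s": 30.0}
    gpu = {"peak_gflops": 1000.0, "bandwidth_gb_per_s": 150.0}

    intensity = 0.25  # assumed FLOPs per byte for a streaming-style kernel
    cpu_rate = attainable_gflops(flops_per_byte=intensity, **cpu)
    gpu_rate = attainable_gflops(flops_per_byte=intensity, **gpu)
    print(f"CPU: {cpu_rate} GFLOP/s, GPU: {gpu_rate} GFLOP/s, "
          f"ratio {gpu_rate / cpu_rate:.1f}x")
```

With these made-up numbers the GPU has 10x the peak compute but only 5x the bandwidth, and for a low-intensity kernel the gap comes out at the bandwidth ratio. Whether that is what happened in the benchmark in TFA is a separate question.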

It's clear that Intel wants a piece of the growing CUDA userbase, but I think it will be a while before any x86 processor can compete with a GPU on the problems that a GPU's architecture was specifically designed to address.