Slashdot Mirror


Intel, NVIDIA Take Shots At CPU vs. GPU Performance

MojoKid writes "In the past, NVIDIA has made many claims of how porting various types of applications to run on GPUs instead of CPUs can tremendously improve performance — by anywhere from 10x to 500x. Intel has remained relatively quiet on the issue until recently. The two companies fired shots this week in a pre-Independence Day fireworks show. The recent announcement that Intel's Larrabee core has been re-purposed as an HPC/scientific computing solution may be partially responsible for Intel ramping up an offensive against NVIDIA's claims regarding GPU computing."

27 of 129 comments

  1. first post! by Dynetrekk · · Score: 4, Funny

    I am now posting using my GPU. It's at least 50x faster!

    1. Re:first post! by LordKronos · · Score: 4, Informative

      Awesome. And now maybe you've learned a lesson. While the external processor was faster, sending your data over the bus to the external processor has an inherent delay in it. That's why your first post came in fourth.

    2. Re:first post! by TheLink · · Score: 4, Funny

      The other earlier posts however seem to suffer from some sort of processing or data corruption/error.

  2. It depends? by aliquis · · Score: 5, Insightful

    Isn't it like saying "Ferrari makes the fastest tractors!" (yeah, I know!), which may be true, as long as they can actually carry out the things you want to do.

    I don't know about the limits of OpenCL/GPU-code (or architecture compared to regular CPUs/AMD64 functions, registers, cache, pipelines, what not), but I'm sure there's plenty and that someone will tell us.

    1. Re:It depends? by jawtheshark · · Score: 5, Informative

      Try Lamborghini next time... You do know that Mr Lamborghini originally made his money making tractors. The legend says he wasn't satisfied with what Ferrari offered as sports cars and thus made one himself. Originally, Lamborghini is a tractor brand.... Not kidding. I think they still make them...

      --
      Ahhh...the great dumpster continuum. Many a free computer will be found there. -- sowth (748135)
    2. Re:It depends? by Sycraft-fu · · Score: 5, Informative

      Basically, GPUs are stream processors. They are fast at tasks that meet the following criteria:

      1) Your problem has to be more or less infinitely parallel. A modern GPU will have anywhere in the range of 128-512 parallel execution units, and of course you can have multiple GPUs. So it needs to be something that can be broken down into a lot of pieces.

      2) Your problem needs to be floating point. GPUs push 32-bit floating point numbers really fast. The most recent ones can also do 64-bit FP numbers at half the speed. Anything older is pretty much 32-bit only. For the most part, count on single precision FP for good performance.

      3) Your problem must fit within the RAM of the GPU. This varies, 512MB-1GB is common for consumer GPUs, 4GB is fairly easy to get for things like Teslas that are built for GPGPU. GPUs have extremely fast RAM connected to them, much faster than even system RAM. 100GB/sec+ is not uncommon. While a 16x PCIe bus is fast, it isn't that fast. So to get good performance, the problem needs to fit on the GPU. You can move data to and from the main memory (or disk) occasionally, but most of the crunching must happen on card.

      4) Your problem needs to have not a whole lot of branching, and when it does branch, multiple paths need to branch the same. GPUs handle branching, but not all that well. The performance penalty is pretty high. Also generally speaking a whole group of shaders has to branch the same way. So you need the sort of thing that when the "else" is hit, it is hit for the entire group.

      So, the more similar your problem is to that, the better GPUs work on it. 3D graphics would be an excellent example of something that meets that precisely, which is no surprise as that's what they are made for. The more you deviate from that, the less suited GPUs are. You can easily find tasks they are exceedingly slow at compared to CPUs.

      Basically modern CPUs tend to be quite good at everything. They have strong performance across the board so no matter what the task, they can do it well. The downside is they are unspecialized; they excel at nothing. The other end of the spectrum is an ASIC, a circuit designed for one and only one thing. That kind of thing can be extremely efficient. Something like a gigabit switch ASIC is a great example. You can have a tiny chip that draws a couple of watts and yet can switch 50+ Gbit/sec of traffic. However, that ASIC can only do its one task, with no programmability. GPUs are something of a hybrid. They are fully programmable, but they are specialized for a given field. As such, at the tasks they are good at, they are extremely fast. At the tasks they are not, they are extremely slow.
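
      To make point 4 concrete, here is a toy cost model (a sketch only, not a description of any particular GPU). It assumes a lockstep group of 32 lanes and made-up cycle costs; `group_cost` is a hypothetical helper. The idea is that when every lane branches the same way only one path runs, and when the lanes disagree the group pays for both paths.

```python
# Toy cost model for branch divergence in a lockstep group of GPU lanes.
# All numbers are illustrative assumptions, not measurements of real hardware.

GROUP_SIZE = 32  # assumed number of lanes that execute in lockstep


def group_cost(takes_if, cost_if=10, cost_else=10):
    """Cycles one group spends on an if/else, given each lane's branch choice.

    If every lane agrees, only that path runs. If the lanes disagree,
    the group runs both paths one after the other, with the lanes that
    did not choose the current path sitting idle.
    """
    cost = 0
    if any(takes_if):        # at least one lane needs the "if" path
        cost += cost_if
    if not all(takes_if):    # at least one lane needs the "else" path
        cost += cost_else
    return cost


uniform = [True] * GROUP_SIZE                    # whole group takes the "if"
mixed = [i % 2 == 0 for i in range(GROUP_SIZE)]  # lanes alternate if/else

uniform_cost = group_cost(uniform)
mixed_cost = group_cost(mixed)
print(uniform_cost, mixed_cost)  # 10 20
```

      In this model the divergent group takes twice as long while doing the same amount of useful work per lane, which is the "whole group has to branch the same way" penalty described above.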

    3. Re:It depends? by JanneM · · Score: 4, Informative

      "So to get good performance, the problem needs to fit on the GPU. You can move data to and from the main memory (or disk) occasionally, but most of the crunching must happen on card."

      From what I have seen when people use GPUs for HPC, this, more often than anything else, is the limiting factor. The actual calculations are plenty fast, but the need to format your data for the GPU, send it, then do the same in reverse for the result really limits the practical gain you get.

      I'm not saying it's useless or anything - far from it - but this issue is as important as the actual processing you want to do for determining what kind of gain you'll see from such an approach.
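
      A back-of-the-envelope way to see this, as a sketch under stated assumptions: the function name and every number below are hypothetical, and the model simply adds transfer time to compute time (no overlap of copies with computation).

```python
def effective_speedup(cpu_seconds, kernel_speedup, bytes_moved, bus_bytes_per_sec):
    """Overall speedup once host<->GPU transfer time is included.

    cpu_seconds       -- time the job takes on the CPU alone
    kernel_speedup    -- how much faster the GPU does the pure computation
    bytes_moved       -- total bytes sent to the card plus bytes read back
    bus_bytes_per_sec -- sustained bus throughput (an assumed figure)
    """
    transfer_seconds = bytes_moved / bus_bytes_per_sec
    gpu_seconds = cpu_seconds / kernel_speedup + transfer_seconds
    return cpu_seconds / gpu_seconds


# Hypothetical job: 10 s on the CPU, a kernel that is 50x faster on the GPU,
# 4 GB sent each way over a bus assumed to sustain 4 GB/s.
overall = effective_speedup(
    cpu_seconds=10.0,
    kernel_speedup=50.0,
    bytes_moved=8e9,
    bus_bytes_per_sec=4e9,
)
print(round(overall, 2))
```

      With these made-up numbers the copy takes 2 s against 0.2 s of computation, so a 50x kernel ends up less than 5x faster overall. The fewer round trips over the bus, the closer you get to the kernel's own speedup.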

      --
      Trust the Computer. The Computer is your friend.
    4. Re:It depends? by rahvin112 · · Score: 4, Insightful

      It is no secret (it's stated on both Intel's and AMD's roadmaps) that the plan is to integrate GPU-like programmable FP into the FP units of the general processor. The likely result will be the same general purpose CPU you love, but there will be dozens of additional FP units that excel at mathematics like the parent described, except more flexible. When the Fusion-esque products ramp and GPGPU functionality is integrated into the CPU, Nvidia is out of business. Oh, I don't expect these Fusion products to have great GPUs, but once you destroy the low end and mid range graphics marketplace there is very little $$ wise left to fund R&D (3dfx was the first one into the high end 3D market and they barely broke even on their first sales; the only reason they survived was because they were heavy in arcade sector sales). If Nvidia hasn't been allowed to purchase Via's x86 license by that point, they are quite frankly out of business. Not immediately of course; they will spend a few years evaporating all assets while they try to compete with only the high-end marketplace, but in the end they won't survive. Things go in cycles and the independent graphics chip cycle is going to end very shortly. Maybe in a decade it will come back, but I'm skeptical. CPUs have exceeded the speed needed for 80% of tasks out there.

      When I first started my career, computer runs of my design work took about 5-30 minutes at bare minimum quality. These days I can exceed that bare minimum by 20 times and the run will take seconds. It's to the point where I can model with far more precision than the end product needs with almost no time penalty. In fact additional CPU speed at this point is almost meaningless, and my business isn't alone in this. In fact most of the software in my business is single threaded (and the apps run that fast with single threads). Once the software is multi-threaded there is really no additional CPU power needed, and it may come to the point where my business just stops upgrading hardware beyond what's needed to replace failures. I just don't see a future for independent graphics chip/card producers.

    5. Re:It depends? by pnewhook · · Score: 3, Informative

      GPUs have extremely fast RAM connected to them, much faster than even system RAM

      I'd like to see a citation for that little bit of trivia

      Ok, so my Geforce GTX480 has GDDR5 ( http://www.nvidia.com/object/product_geforce_gtx_480_us.html ) which is based on DDR3 ( http://en.wikipedia.org/wiki/GDDR5 )

      My memory bandwidth on the GTX480 is 177 GB/sec. The fastest DDR3 module is PC3-17000 ( http://en.wikipedia.org/wiki/DDR3_SDRAM ) which gives approx 17000 MB/s which is approx 17GB/sec. So my graphics ram is basically 10x faster than system ram as it should be.

      --
      Tesla was a genius. Edison however was a overrated hack who liked to torture puppies.
    6. Re:It depends? by somenickname · · Score: 2, Interesting

      That's a very good breakdown of what you need to benefit from GPU based computing but, really, only #1 has any relevance vs. an x86 chip.

      #2) Yes, an x86 chip will have a high clock speed but, unless you can use SSE instructions, x86 is crazy slow. Also, most (if not all) architectures will give you half the flops for using the double precision vector instructions vs. the single precision ones.

      #3) This is a problem with CPUs as well except, as you point out, the memory is much slower. Performance is often about hiding latency. You don't need your problem to fit in the L2/L3 cache of a CPU, but, if the compiler/programmer/CPU can prefetch things into L2/L3 before it's accessed, it's a huge win. The same goes for having things in GPU memory before it's needed. The difference is that the GPU has a TON of memory compared to an L2/L3 cache.

      #4) You might be right here. I know that with hyperthreading a CPU will yield to another "thread" when it mispredicts a branch. However, the fact that branch misprediction is a condition in which the CPU will switch to another thread, to me, means that mispredicting a branch on an x86 CPU is also a fairly expensive thing to do. Maybe not as expensive as on a GPU but, expensive nonetheless.

      I suppose it all comes down to what kind of problem you are trying to compute but, if you can make your problem work in a way that is pleasing to #1, using a GPU is probably going to be a win.

    7. Re:It depends? by Spatial · · Score: 2, Interesting

      I haven't seen ANY GPU's that came with on-board RAM that is any different than what you can mount as normal system RAM, however.

      You haven't been looking very hard. Most GPUs have GDDR3 or GDDR5 running at very high frequencies.

      My system for example:
      Main memory: DDR2 400MHz, 64-bit bus. 6,400 MB/sec max.
      GPU memory: GDDR3 1050MHz, 448-bit bus. 117,600 MB/sec max.

      Maybe double the DDR2 figure since it's in dual-channel mode. I'm not sure, but it hardly makes much of a difference in contrast. :)

      That isn't even exceptional by the way. I have a fairly mainstream GPU, the GTX 260 c216. High-end cards like the HD5870 and GTX 480 are capable of pushing more than 158,000 and 177,000 MB/sec respectively.
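
      For anyone wondering where figures like these come from, here is a small sketch of the usual peak-bandwidth arithmetic, assuming two transfers per clock (double data rate) and taking the clocks and bus widths from the system listed above. `peak_bandwidth_mb_s` is just an illustrative helper name.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_bits, transfers_per_clock=2):
    """Theoretical peak bandwidth in MB/s.

    clock (MHz) x transfers per clock x bus width in bytes.
    transfers_per_clock=2 assumes double-data-rate memory.
    """
    return clock_mhz * transfers_per_clock * bus_bits // 8


main_memory = peak_bandwidth_mb_s(400, 64)     # DDR2 at 400 MHz, 64-bit bus
gpu_memory = peak_bandwidth_mb_s(1050, 448)    # GDDR3 at 1050 MHz, 448-bit bus

print(main_memory)               # 6400
print(gpu_memory)                # 117600
print(gpu_memory / main_memory)  # 18.375
```

      Even if dual-channel mode doubles the main memory figure to 12,800 MB/sec, the GPU memory still comes out roughly 9x ahead on paper. These are theoretical peaks, not measured throughput.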

  3. You lazy fuckers by drinkypoo · · Score: 5, Interesting

    I don't expect slashdot "editors" to actually edit, but could you at least link to the most applicable past story on the subject? It's almost like you people don't care if slashdot appears at all competent. Snicker.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  4. AMD by MadGeek007 · · Score: 5, Funny

    AMD must feel very conflicted...

  5. CPUs and GPUs have different goals by leptogenesis · · Score: 5, Interesting

    At least as far as parallel computing goes. CPUs have been designed for decades to handle sequential problems, where each new computation is likely to have dependencies on the results of recent computations. GPUs, on the other hand, are designed for situations where most of the operations happen on huge vectors of data; the reason they work well isn't really that they have many cores, but that the operations for splitting up the data and distributing it to the cores are (supposedly) done in hardware. In a CPU, the programmer has to deal with splitting up the data, and allowing the programmer to control that process makes many hardware optimizations impossible.

    The surprising thing in TFA is that Intel is claiming to have done almost as well on a problem that NVIDIA used to tout their GPUs. It really makes me wonder what problem it was. The claim that "performance on both CPUs and GPUs is limited by memory bandwidth" seems particularly suspect, since on a good GPU the memory access should be parallelized.

    It's clear that Intel wants a piece of the growing CUDA userbase, but I think it will be a while before any x86 processor can compete with a GPU on the problems that a GPU's architecture was specifically designed to address.

  6. Intel says "Buy Nvidia" by Posting=!Working · · Score: 4, Insightful

    What the hell kind of sales pitch is "We're only a little more than twice as slow!"

    [W]e perform a rigorous performance analysis and find that after applying optimizations appropriate for both CPUs and GPUs the performance gap between an Nvidia GTX280 processor and the Intel Core i7 960 processor narrows to only 2.5x on average.

    It's gonna work, too.

    Humanity sucks at math.

    --
    This sentence no verb.
  7. Re:Optimizations Matter by Rockoon · · Score: 3, Informative

    Just to be clear, those same memory reorganizations are required for the GPU. That being specifically the Structure-of-Arrays strategy instead of the Array-of-Structures strategy.

    It's certainly true that most programmers reach for the latter style, but mainly because they aren't planning on using any SIMD.
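
    For anyone unfamiliar with the two layouts, a minimal sketch using made-up particle data. Python only illustrates the shape of the data here; the actual memory-layout benefit shows up in languages like C or CUDA, where an array per field lets SIMD lanes or GPU threads read neighbouring values of the same field.

```python
from array import array

# Array-of-Structures: one record per particle, fields interleaved (x, y, z).
aos = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)]

# Structure-of-Arrays: one contiguous array of floats per field.
soa = {
    "x": array("f", (p[0] for p in aos)),
    "y": array("f", (p[1] for p in aos)),
    "z": array("f", (p[2] for p in aos)),
}


def sum_x_aos(records):
    """Walk every record and pick out x, stepping over y and z each time."""
    return sum(r[0] for r in records)


def sum_x_soa(fields):
    """Read the x values straight through; they sit next to each other."""
    return sum(fields["x"])


print(sum_x_aos(aos), sum_x_soa(soa))  # 12.0 12.0
```

    Both give the same answer; the difference is that in the SoA form all the x values are adjacent, which is the access pattern vector units and GPUs want.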

    --
    "His name was James Damore."
  8. Still trying to keep Larrabee going? by Junta · · Score: 4, Insightful

    On top of being highly capable at massively parallel floating point math (the bread and butter of top500 and most all real world HPC applications), GPU chips benefit from economies of scale by having a much larger market to sell chips to. If Intel has an HPC-only processor, I don't see it really surviving. There have been numerous HPC-only accelerators that provided huge boosts over CPUs and still flopped. GPUs growing into that capability is the first large scale phenomenon in HPC with legs.

    --
    XML is like violence. If it doesn't solve the problem, use more.
  9. Re:AMD by Rockoon · · Score: 4, Informative

    ..they have products in both segments.

    ..and for the record, AMD is still ruling the very high end multi-CPU (aka server) benchmarks and of course, we all know that their GPUs are top notch.

    AMD just isn't doing well in the high end consumer-grade space, but then again the chips that Intel is ruling with in that segment are priced well above consumer budgets.

    --
    "His name was James Damore."
  10. Re:Who cares anymore? by Overzeetop · · Score: 3, Insightful

    Two things: you've been conditioned to accept gaming graphics of yesteryear, and your need for more complex game play now trumps pure visuals. You can drop in a $100 video card, set the quality to give you excellent frame rates, and it looks fucking awesome because you remember playing Doom. Also, once you get to a certain point, the eye candy takes a backseat to game play and story - the basic cards hit that point pretty easily now.

    Back when we used to game, you needed just about every cycle you could get to make basic gameplay what would now be considered "primitive". Middling level detail is great, in my opinion. Going up levels to the maximum detail really adds very little. I won't argue that it's cool to see that last bit of realism, but it's not worth doubling the cost of a computer to get it.

    --
    Is it just my observation, or are there way too many stupid people in the world?
  11. Re:Who cares anymore? by Rockoon · · Score: 3, Informative

    Well, as far as GPUs and gaming go, there are two segments of the population: those with "low resolution" rigs such as 1280x1024 (the most common group according to Steam), and those with "high resolution" rigs such as 1920x1200.

    An $80 video card enables high/ultra settings at 60+ FPS on nearly all games for the "low resolution" group, but not the "high resolution" group.

    --
    "His name was James Damore."
  12. Re:AMD by Junta · · Score: 4, Insightful

    AMD is the most advantaged on this front...

    Intel and nVidia are stuck in the mode of realistically needing one another and simultaneously downplaying the other's contribution.

    AMD can use what's best for the task at hand/accurately portray the relative importance of their CPUs/GPUs without undermining their marketing message.

    --
    XML is like violence. If it doesn't solve the problem, use more.
  13. Re:Optimizations Matter by Junta · · Score: 2

    The difference is the 'naive' code you write to do things in the simplest manner *can* run on a CPU. For the GPU languages, you *must* make those optimizations. This is not to undercut the value of GPU (as Intel concedes, the gap is large), but it does serve to counteract the dramatic numbers touted by nVidia.

    nVidia compared expert-tuned and optimized performance metrics on their product against stock, generic benchmarks on Intel products.

    --
    XML is like violence. If it doesn't solve the problem, use more.
  14. Oh for cryin' out loud by werewolf1031 · · Score: 4, Insightful

    Just kiss and make up already. Intel and nVidia have but one choice: to join forces and try collectively to compete against AMD/ATI. Anything less, and they're cutting their nose off to spite their respective faces.

  15. Big Deal, A Barrel... by jedidiah · · Score: 3, Insightful

    Yeah, speciality silicon for a small subset of problems will stomp all over a general purpose CPU. No big news there.

    Why is Intel even bothering to whine about this stuff? They sound like a bunch of babies trying to argue that the sky isn't blue.

    This makes Intel look truly sad. It's completely unnecessary.

    --
    A Pirate and a Puritan look the same on a balance sheet.
  16. Re:Big Deal, A Barrel... by chriso11 · · Score: 2, Insightful

    The reason that Intel is whining is in the context of large number crunching systems or high end workstations. Rather than Intel selling thousands of chips for the former, Nvidia (and to a lesser extent AMD) gets to sell hundreds of GPU chips. And for the workstations, Intel sells only one chip instead of 2 to 4.

    --
    No, I don't trust in god. He'll have to pay up front, like everybody else.
  17. Re:AMD by Joce640k · · Score: 2, Insightful

    I don't think AMD really cares about competing with top-end Intel processors. It takes a lot of R&D investment with very little return (it's a tiny market segment)

    In the low/mid range AMD rules the roost in terms of value for money.

    --
    No sig today...
  18. That's the big draw of the Teslas by Sycraft-fu · · Score: 2, Informative

    I mean when you get down to it, they seem really overpriced. No video output, their processor isn't anything faster, what's the big deal? Big deal is that 4x the RAM can really speed shit up.

    Unfortunately there are very hard limits to how much RAM they can put on a card. This is both because of the memory controllers, and because of electrical considerations. So you aren't going to see a 128GB GPU or the like any time soon.

    Most of our researchers that do that kind of thing use only Teslas because of the need for more RAM. As you said, the transfer is the limiting factor. More RAM means you have to shuffle data back and forth less often.