Why 'Gaming' Chips Are Moving Into the Server Room

Posted by timothy on Thursday July 15, 2010 @07:37AM from the expense-report-manipulation-++ dept.

Esther Schindler writes "After several years of trying, graphics processing units (GPUs) are beginning to win over the major server vendors. Dell and IBM are the first tier-one server vendors to adopt GPUs as server processors for high-performance computing (HPC). Here's a high-level view of the hardware change and what it might mean for your data center. (Hint: faster servers.) The article also addresses what it takes to write software for GPUs: 'Adopting GPU computing is not a drop-in task. You can't just add a few boards and let the processors do the rest, as when you add more CPUs. Some programming work has to be done, and it's not something that can be accomplished with a few libraries and lines of code.'"

A whole new level of parallelism by TwiztidK · 2010-07-15 07:42 · Score: 4, Insightful

I've heard that many programmers have issues coding for 2- and 4-core processors. I'd like to see how they'll adapt to running "hundreds of threads" in parallel.

1. Re:A whole new level of parallelism by Nadaka · 2010-07-15 08:04 · Score: 4, Insightful
  
  No it isn't. That you think so just shows how much you still have left to learn.
  I am not a high-end programmer either. But I have two degrees in the subject and have been working professionally in the field for years, including optimization and parallelization.
  Many algorithms just won't have much improvement with multi-threading.
  Many will even perform more poorly due to data contention and the overhead of context switches and creating threads.
  Many algorithms just cannot be converted to a format that will work within the restrictions of GPGPU computing at all.
  The stream architecture of modern GPUs works radically differently than a conventional CPU.
  It is not as simple as scaling conventional multi-threading up to thousands of threads.
  Certain things that you are used to doing on a normal processor have an insane cost in GPU hardware.
  For instance, the if statement. Until recently OpenCL and CUDA didn't allow branching. Now they do, but they incur such a huge penalty in cycles that it just isn't worth it.
2. Re:A whole new level of parallelism by Dynetrekk · 2010-07-15 08:08 · Score: 5, Insightful
  
  Believe me, when you change the way you think about how an algorithm works, it doesn't matter if you are using 3 or 10000 processors.
  Have you ever read up on Amdahl's law?
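For readers who have not met Amdahl's law, the question above can be made concrete with a minimal sketch. The function name `amdahl_speedup` and the 95% parallel fraction are illustrative assumptions, not figures from the thread or the article; the formula itself is the standard one, speedup = 1 / ((1 - p) + p / n).

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Upper bound on speedup when only part of a program can run in parallel.

    parallel_fraction: share of the run time that parallelizes (0.0 to 1.0).
    processors: number of processors working on the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many processors exist;
    # only the parallel part shrinks as processors are added.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the run time is parallelizable.
    for n in (3, 10_000):
        print(f"{n} processors: {amdahl_speedup(0.95, n):.2f}x")
```

Under that assumed 95% figure, the bound stays below 20x however many processors are added, because the remaining 5% always runs serially. That is why 3 versus 10000 processors does matter to how an algorithm has to be structured.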
3. Re:A whole new level of parallelism by Chris+Burke · 2010-07-15 08:33 · Score: 4, Informative
  
  Programmers of Server applications are already used to multithreading, and they've been able to make good use of systems with large numbers of processors on them even before the advent of virtualization.
  But don't pay too much attention to the word "Server". Yes the machines that they're talking about are in the segment of the market referred to as "servers", as distinct from "desktops" or "mobile". But the target of GPU-based computing isn't "Servers" in the sense of the tasks you normally think of -- web servers, database servers, etc.
  The real target is mentioned in the article, and it's HPC, aka scientific computing. Normal server apps are integer code, and depend more on high memory bandwidth and I/O, which GPGPU doesn't really address. HPC wants that stuff too, but they also want floating point performance. As much floating point math performance as you can possibly give them. And GPUs are way beyond what CPUs can provide in that regard. Plus a lot of HPC applications are easier to parallelize than even the traditional server codes, though not all fall in the "embarrassingly parallel" category.
  There will be a few growing pains, but once APIs get straightened out and programmers get used to it (which shouldn't take too long for the ones writing HPC code), this is going to be a huge win for scientific computing.
  
4. Re:A whole new level of parallelism by David+Greene · 2010-07-15 17:09 · Score: 4, Interesting
  
  The stream architecture of modern GPUs works radically differently than a conventional CPU.
  True if the comparison is to a commodity scalar CPU.
  
  It is not as simple as scaling conventional multi-threading up to thousands of threads.
  True. Many algorithms will not map well to the architecture. However, many others will map extremely well. Many scientific codes have been tuned over the decades to exploit high degrees of parallelism. Often the small data sets are the primary bottleneck. Strong scaling is hard, weak scaling is relatively easy.
  
  Certain things that you are used to doing on a normal processor have an insane cost in GPU hardware.
  In a sense. These are not scalar CPUs, and traditional scalar optimization, while important, won't utilize the machine well. I can't think of any particular operation that's greatly slower than on a conventional CPU, provided one uses the programming model correctly (and some codes don't map well to that model).
  
  For instance, the if statement.
  No. Branching works perfectly fine if you program the GPU as a vector machine. The reason branches within a warp (using NVIDIA terminology) are expensive is simply because a warp is really a vector. The GPU vendors just don't want to tell you that because either they fear being tied to some perceived historical baggage with that term or they want to convince you they're doing something really new. GPUs are interesting, but they're really just threaded vector processors. Don't misunderstand me, though, it's a quite interesting architecture to work with!
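The "a warp is really a vector" point can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a description of real hardware scheduling: `WARP_SIZE` and `warp_passes` are hypothetical names, and the model only counts how many sides of a single if/else the warp has to execute when all lanes run in lockstep.

```python
WARP_SIZE = 32  # NVIDIA's warp width, used here only as a model parameter


def warp_passes(conditions):
    """Count the branch paths one warp must execute for a single if/else.

    conditions holds one boolean per lane (the result of the `if` test).
    In this model a path costs one pass whenever at least one lane takes it;
    lanes on the other path are masked off and sit idle during that pass.
    """
    if len(conditions) != WARP_SIZE:
        raise ValueError(f"expected {WARP_SIZE} lane conditions")
    takes_if = any(conditions)        # some lane needs the `if` body
    takes_else = not all(conditions)  # some lane needs the `else` body
    return int(takes_if) + int(takes_else)


if __name__ == "__main__":
    uniform = [True] * WARP_SIZE                        # every lane agrees
    divergent = [i % 2 == 0 for i in range(WARP_SIZE)]  # lanes alternate
    print("uniform warp passes:", warp_passes(uniform))
    print("divergent warp passes:", warp_passes(divergent))
```

In this model a branch costs nothing extra when every lane in the warp agrees, and costs both paths when the lanes disagree. That matches the comment's claim that the expense comes from divergence within a warp rather than from branching as such.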
  
Re:Yes, of course by Yvan256 · 2010-07-15 07:54 · Score: 5, Funny

Portal 2? It's something for our Web server. It adds more portals to access the internet.
Re:CUDA by Rockoon · 2010-07-15 08:01 · Score: 4, Interesting

Indeed. With CUDA, DirectCompute, and OpenCL, nearly 100% of your code is boilerplate interfacing with the API.

There needs to be a language where this stuff is a first-class citizen and not just something provided by an API.
