IBM to use Cell in Blade Servers

Posted by Zonk on Thursday February 9, 2006 @05:47AM from the ps3-gets-some-siblings dept.

taskforce writes "IBM announced on Wednesday that it would be putting versions of its Cell processor inside its increasingly popular low-power blade servers by this summer. From the article: 'For Cell to gain wide acceptance, IBM needs to spur outside programmers to write software that takes advantage of Cell's prowess. That could prove more challenging than usual because Cell's architecture is so different. IBM hopes this summer's release of the Cell-based servers kick-starts work by third-party programmers.'" Also covered in a PCPro article.

10 of 159 comments

Sun has 'em beat by AKAImBatman · 2006-02-09 06:02 · Score: 4, Interesting

As I understand it, the various pipelines of the Cell chip tend to be more specialized than the Coolthreads technology Sun is using on their new T1 processor. However, even with 32 full-blown pipelines, Sun is also concerned about whether their chips will be put to good use or not.

I'm not quite sure what IBM is planning to do, but Sun has started a contest to see who can build the coolest program that takes advantage of their new Coolthreads technology. The prize is a cool $50,000, so Sun seems to be serious about this. The results of the contest may very well prove whether the new parallel technologies have a future or not.

--
Javascript + Nintendo DSi = DSiCade
1. Re:Sun has 'em beat by Zantetsuken · 2006-02-09 06:30 · Score: 2, Interesting
  
  Especially since IBM is already laying the groundwork for Cell to be used in supercomputers (for seismic activity, nuclear warhead simulations, etc.) and in rendering 3D MRIs. Reportedly, current image rendering for this is done on Intel Pentium 4s and takes about 4 minutes; when they did the tech demo of it on a Cell platform, it took about 20 seconds.
Your organs are specialized, too. by Orrin+Bloquy · 2006-02-09 06:07 · Score: 5, Interesting

It's a hell of a paradigm shift for programmers to go from writing code that targets one CPU to code that deliberately splinters tasks across a bank of specialized processors.

It's fun to bash the Cell as a general purpose CPU when no one has actually suggested it's designed for that.

All of the above being true, it remains to be seen what gains IBM's POWER/Cell system actually offers above present architectures -- RISC was the next big thing, too, until Intel internalized part of it into the x86 architecture.

Flyover landscape graphics demos are a shopworn rabbit pulled out of a threadbare hat: convert fractals into craggy vertical displacements with extremely primitive lighting/mapping. Show me an architecture that can *realtime* render Incredibles-caliber cloth/hair simulations and I'll get a hard-on while ATI and nVidia executives slit their wrists.

--
"Made up/misattributed quote that makes me look smart. I am on /. and I must look smart."
Good point. Unfortunately ... by vlad_petric · 2006-02-09 06:46 · Score: 4, Interesting

It's *very* difficult to get a compiler to exploit this kind of parallelism. Unless you're writing loop-heavy scientific Fortran code, where things like automatic vectorization/parallelization are much easier, it's almost impossible for the compiler. (Out of curiosity, try the automatic OpenMP parallelization feature in the Intel C Compiler on standard C/C++ code; the results will likely underwhelm you.) Unfortunately, even if you do have scientific code, the slave processing units only do single precision (IIRC).
In my opinion, this thing will run games well, but that's about it. So far I've seen two presentations by IBM about the Cell processor (at (micro-)architecture conferences). Both times, the question on everybody's mind was "How do you program these things?" The answer was pretty much a hand-wavy "oh hmmm, well, blah blah blah manual".
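
To make the distinction in this comment concrete, here is a minimal Python sketch (not Cell code, and not from the original post). It contrasts a loop whose iterations are independent, which can be split across workers mechanically, with a loop that carries a dependency from one iteration to the next. The function names and the four-worker thread pool are illustrative assumptions. The pool only shows the decomposition and is not a performance claim.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_chunk(chunk, factor):
    # Each element depends only on itself: iterations are independent.
    return [x * factor for x in chunk]


def parallel_scale(values, factor, workers=4):
    # Split the input into contiguous chunks, one per (hypothetical) worker.
    size = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda c: scale_chunk(c, factor), chunks))
    # map() preserves chunk order, so concatenation restores the input order.
    return [x for part in parts for x in part]


def running_sum(values):
    # Each output needs the previous output: a loop-carried dependency.
    # Splitting this loop into chunks naively would give wrong answers.
    out, total = [], 0
    for x in values:
        total += x
        out.append(total)
    return out
```

The first function is the easy case the comment attributes to loop-heavy scientific code. The second is the kind of loop that an automatic parallelizer generally has to leave serial unless someone restructures it by hand.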

--
The Raven
Cell will live long, but Niagara may not. by reporter · 2006-02-09 07:00 · Score: 1, Interesting

"The Register" has a recent article about building servers based on the IBM Cell.
Since the Cell is now integrated into the apparatus of the best-funded military in the world, the Cell will live essentially forever. For the same reason, Ada (i.e. the computer language) will live forever even though few people in industry use the language.
By the way, Cell is also IBM's answer to Sun's Niagara. For years, Sun touted Niagara as a new revolution in computing: Niagara is supposedly the first commercially viable processor to use hordes of cores to quickly execute multithreaded applications.
Yet, Cell also uses hordes of cores. Though the Cell is 1 complex general-purpose POWER core plus 8 simple supporting specialized cores, IBM could easily downgrade the 1 complex core to a simple core (thus yielding additional silicon area) and upgrade the 8 simple specialized cores to 8 simple general-purpose cores. The hard part is linking the 9 cores together, but IBM already solved that problem when it created the Cell. (Intel is also working on a processor with hordes of cores.) If Niagara-based servers ever become popular, IBM is already prepared to launch a general-purpose Cell-based server.
The difference between the Cell and the Niagara is that the American military uses Cell, not Niagara. The American military will subsidize research on Cell.
Why SPEs? by Guspaz · 2006-02-09 07:30 · Score: 3, Interesting

Why go with SPEs anyhow? The whole problem with coding for the Cell involves the differences between the PPE and the SPE. The SPE doesn't have branch predictors, making it virtually useless for any sort of flow control.
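
One general way to cope with cores that handle branches poorly is to compute both outcomes and select one arithmetically instead of jumping. The sketch below shows that idea in plain Python. It illustrates the general technique only, it is not SPE code, and the helper names are made up for this example. Python itself gains nothing from writing code this way.

```python
def select(mask, if_true, if_false):
    # mask must be 0 or 1; picks a value with arithmetic instead of a jump.
    return mask * if_true + (1 - mask) * if_false


def clamp_branchy(x, lo, hi):
    # Conventional version: two data-dependent branches.
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x


def clamp_branchless(x, lo, hi):
    # Comparisons yield booleans; int() turns them into 0/1 masks.
    # Assumes lo <= hi, like the branchy version.
    x = select(int(x < lo), lo, x)
    return select(int(x > hi), hi, x)
```

Both versions return the same result for any `x` when `lo <= hi`. The second trades a little extra arithmetic for a fixed instruction sequence.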

Why didn't IBM just pack in a lesser number of PPEs? The PPE already seems to be a very lightweight general purpose processing core, unless I'm missing something. It is about the same size as an SPE. So why not just put 9 PPEs on a Cell chip instead of 1 PPE and 8 SPEs?

If you had 9 PPEs on the chip, any multithreaded code (servers for example) would see massive benefits without having to rewrite it to try to find aspects of the program that could run on what is effectively a DSP. While everybody else was fooling around with 2-core processors, they'd have a 9-core processor on the market. Sure, slower per-core, but 9 of them, with that number going up in the future.

Or am I missing something here?
1. Re:Why SPEs? by Anonymous Coward · 2006-02-09 08:10 · Score: 1, Interesting
  
  You're not missing anything here.
  
  If you want a general-purpose system, go with the 4-6 GHz POWER6 processors they are developing. Those will provide the multiple, very fast 'PPEs' you're looking for.
  
  Ok, so the SPEs don't have 'branch prediction'.. So what? They are so freaking fast at what they do that it probably won't matter.
  
  You're looking at one Cell. A blade isn't going to have one Cell. It's going to have 2-4.
  
  A rackmount of these guys will provide, conservatively, 10 of these blades.
  
  Maxed out 10 blades would be 40 cells, theoretically.
  
  That is 40 PPEs. That is 320 SPEs.
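
The arithmetic above can be checked with a short Python sketch. Every figure in it is the commenter's own estimate (10 blades per rack, up to 4 Cells per blade) combined with the 1 PPE plus 8 SPEs per Cell mentioned elsewhere in the thread. The function and constant names are just for illustration.

```python
# Figures taken from the comment above; all are the commenter's estimates.
CELLS_PER_BLADE = 4      # upper end of the "2-4" range
BLADES_PER_RACK = 10     # the "conservative" rack estimate
PPES_PER_CELL = 1        # one general-purpose POWER core per Cell
SPES_PER_CELL = 8        # eight specialized cores per Cell


def rack_totals(blades=BLADES_PER_RACK, cells_per_blade=CELLS_PER_BLADE):
    # Multiply out the per-rack core counts.
    cells = blades * cells_per_blade
    return {
        "cells": cells,
        "ppes": cells * PPES_PER_CELL,
        "spes": cells * SPES_PER_CELL,
    }


print(rack_totals())                    # {'cells': 40, 'ppes': 40, 'spes': 320}
print(rack_totals(cells_per_blade=2))   # lower end of the "2-4" range
```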
  
  That will give you supercomputer-level number-crunching ability, by today's standards (remember SPEs are NOT vector.. they can do more than just floating point), in roughly the same space and probably electrical usage as a common 24-inch CRT television.
  
  Think about that for a second. Two full racks of this crap side by side would provide enough number crunching power to real-time render a virtual Holodeck.
  
  Look forward to the return of 'software rendering'. Remember that the current MIPS-powered Playstation 2 is fully software rendered...
  
  Early models are already shipping:
  http://linuxdevices.com/news/NS3591350722.html
  
  It's an evaluation system. 1-2 or dual core Cell blades.
  
  Oh and of course it runs Linux. Terra Soft (makers of Yellow Dog Linux) will be selling them. They also will ship with Fedora Core installed.
Two Tutorials by GrEp · 2006-02-09 08:41 · Score: 2, Interesting

IBM needs to release two SIMPLE tutorials if they want programmers to bother porting code specifically to the Cell:

1. A Cell program that solves linear equations Ax=b efficiently using SPEs. This would help those with data-intensive problems.

2. A Cell program that speeds up depth-first search (as used for SAT, graph coloring, max-clique) by using the SPEs. This would help those programming CPU-intensive problems.
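
For reference, the serial baseline that the first tutorial would start from might look like the Python sketch below: Gaussian elimination with partial pivoting. It is an ordinary textbook method written for illustration, not anything IBM published, and `solve_linear` is a made-up name. The comment inside marks the step where work could plausibly be divided among several processing units.

```python
def solve_linear(a, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]

    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]

        # The row updates below do not depend on one another, so this is the
        # part that could in principle be handed to separate workers.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]

    # Back substitution, from the last unknown up to the first.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x
```

For example, `solve_linear([[2, 1], [1, 3]], [3, 5])` returns values close to `[0.8, 1.4]`.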

--

bash-2.04$
bash-2.04$yes "Don't you hate dialup connections?"| write USERNAME
Re:I work in blade development. by fitten · 2006-02-09 10:35 · Score: 2, Interesting

They're a lot different than any other architecture...

Actually, they are similar to a number of DSPs and other discrete solutions from the past. For example:

The TMS 320DM64x series of DSPs from TI, which have an ARM9 and a number of DSPs on them.

The TMS 320DM54x and 55x series of DSPs from TI, which have an ARM7 and a number of DSPs on them.

And a discrete version in the CSPI MAP 1310/11, which had a PPC and multiple multi-core DSP chips on it as early as 1997.
Tutorials 3 and 4 by dch24 · 2006-02-09 12:13 · Score: 2, Interesting

As a long-time reader of the IBM forums, I've seen a lot of similar questions and answers over there.
A couple of examples would be really helpful:
1. An implementation of zlib for the SPE architecture, with a speed comparison to the PPE. (Hopefully, the SPE is very fast...)
2. Examples of direct SPE-to-SPE streaming.
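
As a rough analogy for the second item, the Python sketch below chains two worker stages so that intermediate data flows straight from the first stage to the second without going back through the coordinating thread. This is ordinary Python threading with queues. It says nothing about how real SPEs exchange data, and `stage`, `run_pipeline` and `DONE` are hypothetical names chosen for the example.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream


def stage(func, inbox, outbox):
    # Pull items from the upstream queue, transform them, and pass them on.
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # propagate shutdown downstream
            return
        outbox.put(func(item))


def run_pipeline(items, first, second):
    q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=stage, args=(first, q_in, q_mid)),
        threading.Thread(target=stage, args=(second, q_mid, q_out)),
    ]
    for w in workers:
        w.start()

    # The coordinator only feeds the first stage and drains the last one.
    for item in items:
        q_in.put(item)
    q_in.put(DONE)

    results = []
    while True:
        item = q_out.get()
        if item is DONE:
            break
        results.append(item)
    for w in workers:
        w.join()
    return results
```

With one worker per stage and FIFO queues, results come out in input order, so `run_pipeline([1, 2, 3], lambda x: x + 1, lambda x: x * 10)` returns `[20, 30, 40]`.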