IBM to use Cell in Blade Servers
taskforce writes "IBM announced on Wednesday that it would be putting versions of its Cell processor inside its increasingly popular low-power blade servers by this summer. From the article: 'For Cell to gain wide acceptance, IBM needs to spur outside programmers to write software that takes advantage of Cell's prowess. That could prove more challenging than usual because Cell's architecture is so different.
IBM hopes this summer's release of the Cell-based servers kick-starts work by third-party programmers.'" Also covered in a PCPro article.
As I understand it, the various pipelines of the Cell chip tend to be more specialized than the CoolThreads technology Sun is using on their new T1 processor. However, even with 32 hardware threads (8 cores, 4 threads each), Sun is also concerned about whether their chips will be put to good use or not.
I'm not quite sure what IBM is planning to do, but Sun has started a contest to see who can build the coolest program that takes advantage of their new CoolThreads technology. The prize is a cool $50,000, so Sun seems to be serious about this. The results of the contest may very well prove whether the new parallel technologies have a future or not.
It's a hell of a paradigm shift for programmers to go from writing code that targets one CPU to code that deliberately splinters tasks across a bank of specialized processors.
It's fun to bash the Cell as a general purpose CPU when no one has actually suggested it's designed for that.
All of the above being true, it remains to be seen what gains IBM's POWER/Cell system actually offers above present architectures -- RISC was the next big thing, too, until Intel internalized part of it into the x86 architecture.
Flyover landscape graphics demos are a shopworn rabbit pulled out of a threadbare hat: convert fractals into craggy vertical displacements with extremely primitive lighting/mapping. Show me an architecture that can *realtime* render Incredibles-caliber cloth/hair simulations and I'll get a hard-on while ATI and nVidia executives slit their wrists.
In my opinion, this thing will run games well, but that's about it. So far I've seen two presentations by IBM about the Cell processor (at (micro-)architecture conferences). Both times, the question on everybody's mind was "How do you program these things?" The answer was pretty much a hand-wavy "oh hmmm, well, blah blah blah manual".
Since the Cell is now integrated into the best-funded military apparatus in the world, the Cell will live essentially forever. For the same reason, Ada (the programming language) will live forever even though few people in industry use it.
By the way, Cell is also IBM's answer to Sun's Niagara. For years, Sun touted Niagara as a new revolution in computing: Niagara is supposedly the first commercially viable processor to use hordes of cores to quickly execute multithreaded applications.
Yet, Cell also uses hordes of cores. Though the Cell is 1 complex general-purpose POWER core plus 8 simple supporting specialized cores, IBM could easily downgrade the 1 complex core to a simple core (thus yielding additional silicon area) and upgrade the 8 simple specialized cores to 8 simple general-purpose cores. The hard part is linking the 9 cores together, but IBM already solved that problem when it created the Cell. (Intel is also working on a processor with hordes of cores.) If Niagara-based servers ever become popular, IBM is already prepared to launch a general-purpose Cell-based server.
The difference between the Cell and the Niagara is that the American military uses Cell, not Niagara. The American military will subsidize research on Cell.
Why go with SPEs anyhow? The whole problem with coding for the Cell involves the differences between the PPE and the SPE. The SPE doesn't have branch predictors, making it virtually useless for any sort of flow control.
Why didn't IBM just pack in a lesser number of PPEs? The PPE already seems to be a very lightweight general purpose processing core, unless I'm missing something. It is about the same size as an SPE. So why not just put 9 PPEs on a Cell chip instead of 1 PPE and 8 SPEs?
If you had 9 PPEs on the chip, any multithreaded code (servers for example) would see massive benefits without having to rewrite it to try to find aspects of the program that could run on what is effectively a DSP. While everybody else was fooling around with 2-core processors, they'd have a 9-core processor on the market. Sure, slower per-core, but 9 of them, with that number going up in the future.
Or am I missing something here?
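For what it's worth, the usual answer to "no branch predictor" is to turn short branches into selects rather than to avoid flow control altogether (as far as I know, the SPU instruction set has a select-bits instruction for exactly this). Here is a minimal sketch in plain C, with no Cell SDK calls. The names `clamp_sum_branchy`, `clamp_sum_branchless`, and `select_i32` are made up for illustration, and whether a given compiler really emits a select instead of a branch is an assumption you would have to check in the generated code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branching version: one data-dependent branch per element. */
int32_t clamp_sum_branchy(const int32_t *v, size_t n, int32_t limit) {
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > limit)
            sum += limit;
        else
            sum += v[i];
    }
    return sum;
}

/* Hypothetical helper: returns a if take_a is nonzero, else b.
 * The mask is all ones or all zeros, so no branch is needed. */
static int32_t select_i32(int32_t a, int32_t b, int32_t take_a) {
    int32_t mask = -(int32_t)(take_a != 0);
    return (a & mask) | (b & ~mask);
}

/* Branch-free version: the comparison feeds a select instead of a jump. */
int32_t clamp_sum_branchless(const int32_t *v, size_t n, int32_t limit) {
    assert(v != NULL || n == 0);
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += select_i32(limit, v[i], v[i] > limit);
    return sum;
}
```

Both functions compute the same result; the second simply has no data-dependent jump inside the loop body. That only helps with short, local decisions like this one. Long, irregular control flow is still the kind of code you would want on the PPE.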
IBM needs to release two SIMPLE tutorials if they want programmers to bother porting code specifically to the Cell:
1. A Cell program that solves linear equations Ax=b efficiently using the SPEs. This would help those with data-intensive problems.
2. A Cell program that speeds up depth-first search (as used for SAT, graph coloring, and max-clique) by using the SPEs. This would help those programming CPU-intensive problems.
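To make the second request concrete, here is a rough sketch of how a depth-first search can be cut into independent pieces that could each be handed to a separate core. It is plain C, not Cell code. N-Queens stands in for SAT or graph coloring because it is short, the names `solve_from`, `subtree_count`, and `count_queens` are invented for this example, and the loop in `count_queens` is only a stand-in for dispatching work to SPEs:

```c
#include <assert.h>

/* Depth-first search over rows row..n-1. Bitmasks record which columns
 * and diagonals are already attacked in the current row. */
static unsigned long solve_from(int n, int row, unsigned cols,
                                unsigned d1, unsigned d2) {
    if (row == n)
        return 1;
    unsigned full = (1u << n) - 1u;
    unsigned free_cols = full & ~(cols | d1 | d2);
    unsigned long count = 0;
    while (free_cols) {
        unsigned bit = free_cols & (0u - free_cols); /* lowest free column */
        free_cols ^= bit;
        count += solve_from(n, row + 1, cols | bit,
                            ((d1 | bit) << 1) & full, (d2 | bit) >> 1);
    }
    return count;
}

/* One work unit: the whole subtree with the first queen in first_col.
 * It shares no state with the other subtrees. */
unsigned long subtree_count(int n, int first_col) {
    assert(n > 0 && n < 16 && first_col >= 0 && first_col < n);
    unsigned bit = 1u << first_col;
    return solve_from(n, 1, bit, bit << 1, bit >> 1);
}

/* Sequential driver. Each iteration is independent, so on a multi-core
 * part each one could in principle run on a different core. */
unsigned long count_queens(int n) {
    unsigned long total = 0;
    for (int c = 0; c < n; c++)
        total += subtree_count(n, c);
    return total;
}
```

The hard parts a real tutorial would have to cover are not shown here: balancing subtrees of very different sizes, and fitting the search state into each SPE's local store.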
They're a lot different than any other architecture...
Actually, they are similar to a number of DSPs and other discrete solutions from the past. For example:
The TMS 320DM64x series of DSP from TI which has an ARM9 and a number of DSPs on it.
The TMS 320DM54x and 55x series of DSP from TI which has an ARM7 and a number of DSPs on it.
And a discrete version in the CSPI MAP 1310/11 which had a PPC and multiple multi-core DSP chips on it as early as 1997.
There were a couple that would be really helpful:
1. An implementation of zlib for the SPE architecture, with a speed comparison to the PPE. (Hopefully, the SPE is very fast...)
2. Examples of direct SPE-to-SPE streaming.
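For the streaming case, the shape I have in mind is roughly the following. This is only a sequential sketch in plain C under loud assumptions: the two stages run one after the other instead of on two SPEs, `memcpy` stands in for a local-store-to-local-store DMA transfer, and `CHUNK`, `stage_scale`, `stage_sum`, and `stream_pipeline` are names invented for the example. No Cell SDK functions are used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4 /* elements per transfer; tiny on purpose */

/* Stage 1: stands in for work done on the first SPE. */
static void stage_scale(const int *in, int *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * 2;
}

/* Stage 2: stands in for work done on the second SPE. */
static long stage_sum(const int *in, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += in[i];
    return s;
}

/* Push the input through both stages one chunk at a time. The data goes
 * from stage 1's buffer straight to stage 2's buffer and never returns
 * to the source array in between. */
long stream_pipeline(const int *src, size_t n) {
    assert(src != NULL || n == 0);
    int local_a[CHUNK]; /* stage 1's "local store" buffer */
    int local_b[CHUNK]; /* stage 2's "local store" buffer */
    long total = 0;
    for (size_t off = 0; off < n; off += CHUNK) {
        size_t len = (n - off < CHUNK) ? (n - off) : CHUNK;
        stage_scale(src + off, local_a, len);
        memcpy(local_b, local_a, len * sizeof local_a[0]); /* DMA stand-in */
        total += stage_sum(local_b, len);
    }
    return total;
}
```

A real example would overlap the transfer of one chunk with the computation on the next (double buffering), which is exactly the part I would like to see IBM document.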