Unleashing the Power of the Cell Broadband Engine
An anonymous reader writes "IBM DeveloperWorks is running a paper from the MPR Fall Processor Forum 2005 that explores programming models for the Cell Broadband Engine (CBE) Processor, from the simple to the progressively more advanced. With nine cores on a single die, programming for the CBE is like programming for no processor you've ever met before."
The Sony PS3 seems a good development kit alternative for open source programmers, low-budget laboratories or even startup companies.
It will carry a Cell with a powerful graphics chipset, a hard drive, a good number of ports, and a Linux distribution.
The problem I see, however, is that it is restricted to 256 MB of RAM.
This is very small in comparison with the data processing capabilities of the Cell. Also, it is too little for modern OSes, which usually start working decently above 512 MB.
Virtual memory helps, but the PS3 will use 2.5-inch hard drives, which are quite slow.
My suggestion is that Sony could make a limited-edition PS3 with more memory for development, like 512 MB or 1 GB. After all, if they agreed to open Cell to the industry, why not help with the technology's adoption by selling cheap development kits?
It would be nice if IBM could back this idea, and convince Sony to make it a reality, don't you think?
Please mod me only (+) Underrated or (-) Troll
I read this on digg days ago...
The Cell Architecture grew from a challenge posed by Sony and Toshiba to provide power-efficient and cost-effective high-performance processing for a wide range of applications, including the most demanding consumer appliance: game consoles. Cell - also known as the Cell Broadband Engine Architecture (CBEA) - is an innovative solution whose design was based on the analysis of a broad range of workloads in areas such as cryptography, graphics transform and lighting, physics, fast-Fourier transforms (FFT), matrix operations, and scientific workloads. As an example of innovation that ensures the clients' success, a team from IBM Research joined forces with teams from IBM Systems Technology Group, Sony and Toshiba, to lead the development of a novel architecture that represents a breakthrough in performance for consumer applications. IBM Research participated throughout the entire development of the architecture, its implementation and its software enablement, ensuring the timely and efficient application of novel ideas and technology into a product that solves real challenges.
I just want to draw a flowchart and have the compiler and realtime scheduler distribute processes and data among the hardware resources. If we are getting a new architecture and new "programming models", and therefore new compilers and kernels, how about a new IDE paradigm?
--
make install -not war
Damn you marketing droids! This has nothing to do with broadband at all.
So yes, I want a Cell-based devkit now, 'cuz this sounds like _fun_ :-)
Regards,
John
Falling You - beautiful
From the article: if the PS3 Cell CPU is even half the processor that this monster is, I say that game companies will need a lot of real programmers to make really good games (as if they cared).
And I have prayed unto You, O Lord U**X in the time of the Will of Linux.
It's Saturday night and I'm all alone here, cut me some slack...
Oh man, getting blowjob while reading Slashdot... Is this not every geek's dream :)
It's when you take old code from previous projects and try to do a direct port that you will see performance hits. But if the code is designed from the ground up for a Cell environment (or ANY CPU architecture), it is all in the hands of the few top-level software design architects to properly structure the overall workings of the game's code. Once the structure is correct, handing out the bits and pieces that need to be written to the rest of the code monkeys is no problem; they just need to follow the UML or whatever other design docs they are specifically supposed to implement.
We were all warned a long time ago that MS products sucked, remember the Magic 8 Ball said, "Outlook not so good"
Looks like you officially earned your geek gold card today! Congrats.
What do you think Page 0 was for?
Damn, nothing gets me fired up on a Saturday night like the thought of a nine way!
can it do infinite loops in 5 seconds?
... of the promotional material for the Sega Saturn from a few years back?
I remember right about the time it came out, there was a lot of hype about its architecture. Two main processors and a bunch of dedicated co-processors, fast memory bus, etc., etc. I don't remember any more specifics, but at the time it seemed very impressive. Of course it flopped spectacularly, because apparently the thing was a huge pain in the ass to program for and the games never materialized. Or at least that's the most often spoken reason that I've heard.
Anyway, and I'm sure I'm not the first person to have realized this, Cell is starting to sound the same way. The technical side is being hyped and seems clearly leaps and bounds ahead of the competition, but one has to wonder what MS is doing to prevent themselves from producing another Saturn on the programming side.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
IBM will also be releasing Cell-based Blade servers next year, so pick one up if you're serious about development!
The Cell machines are about equally painful to program, but because they're cheaper, they have more potential applications than the nCube did. Cell phone sites, multichannel audio and video processing, and similar easily-parallelized stream-type tasks fit well with the cell model. It's not yet clear what else does.
Recognize that the cell architecture is inherently less useful than a shared-memory multiprocessor. It's an attempt to get some reasonable fraction of the performance of an N-way shared memory multiprocessor without the expensive caches and interconnects needed to make that work. It's not yet clear if this is a price/performance win for general purpose computing. Historically, architectures like this have been more trouble than they're worth. But if Sony fields a few hundred million of them, putting up with the pain is cost-justified.
It's still not clear if the cell approach does much for graphics. The PS3 is apparently going to have a relatively conventional nVidia part bolted on to do the back end of the graphics pipeline.
I'm glad that I don't have to write a distributed physics engine for this thing.
You can run a 68000 or 80386 emulator in each of the SPUs, or just run lots of native processes in parallel.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Both Sony and MS realized they couldn't make a single true general-purpose CPU with the performance they wanted for a price they could afford to sell in their consoles.
Sony went to a CPU, GPU and 7 co-processors (Cell).
MS went to 3 CPUs with vector-assist and a GPU.
Both companies are going to need to spend a lot of time and money on developer tools to help their developers more easily take advantage of their oddball hardware, or else they will end up right where Saturn did.
I guess the good news for both companies is that there is no alternative (like PS1 was to Saturn) which is straightforward and thus more attractive.
PS2 requires programming a specialized CPU with localized memory (the Emotion Engine) and it seems to get by okay. So developers can adapt, given sufficient financial advantage to doing so.
http://lkml.org/lkml/2005/8/20/95
Time to port the Lego Mindstorms development environment to the Cell processor!
The Cell seems to be great for systems with a limited number of tasks (like a game console), but what about a general OS? Context switches seem to be a big problem, and it looks like this CPU will be very bad in a general desktop computer.
I wonder... is there any processor that is good at task switching (by having, for example, several sets of registers and TLBs, so that a task switch only means using another set instead of saving and loading everything to memory)?
Note to moderators: the user "5, Troll" likes to cut and paste posts from other sites to gain karma. This one was found on the DeveloperWorks site with a quick google search.
The problem for much of the IT industry will be that those making the decisions will ask only one question:
Does it Run Windows?
If the answer is no, then the manager will say something like:
"I don't care if the processor is the most powerful ever developed, costs next to nothing to produce and will allow us to build a powerful computer the size of a pea. If it doesn't run Windows, then I'm not interested".
And that sums up the total IT knowledge of that manager.
IANAGP (game programmer), but it would seem to me that physics and lighting calculations should be easily parallelizable. Each processor can compute the physics for a separate set of objects / pixels / etc. Same for AI for each agent, if the companies actually bothered to put some effort into gameplay over graphics. On the other hand, I would guess that things like fluids (i.e. Far Cry) would be more difficult to do in parallel, due to the less local nature of the interactions.
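As a rough illustration of the "separate set of objects per processor" idea, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names step, split and parallel_step are made up, the bodies are falling points that never interact, and the 7 workers are ordinary threads standing in for co-processors. Python threads will not give a real speedup here; the point is only the data decomposition.

```python
from concurrent.futures import ThreadPoolExecutor

GRAVITY = -9.81  # m/s^2, assumed constant for this sketch


def step(body, dt):
    """Advance one independent body (height, velocity) by one Euler step."""
    y, vy = body
    vy = vy + GRAVITY * dt
    y = y + vy * dt
    return (y, vy)


def split(items, parts):
    """Split items into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def step_chunk(chunk, dt):
    """Work done by one worker: it touches only its own chunk."""
    return [step(body, dt) for body in chunk]


def parallel_step(bodies, dt, workers=7):
    """Hand each worker its own chunk, then stitch the results back in order."""
    chunks = split(bodies, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda chunk: step_chunk(chunk, dt), chunks)
    return [body for chunk in results for body in chunk]
```

This only works so neatly because no body reads another body's state. As soon as objects interact (collisions, fluids), the chunks have to exchange data between steps, which is where the less local cases get harder.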
Mod me down if you wish but I think the CBE architecture is bound to fail. The reason is that you don't design your software model around a new processor. It should be the other way around. You first come up with a software model and then design a processor optimized for the new model. This way you are guaranteed to have a perfect fit. Otherwise, you're asking for trouble.
The primary reason that anybody would want to devise a new software model is to address the single most pressing problem in the computer industry: unreliability. The reason that software is unreliable is that it is based on the algorithm. Switch to a non-algorithmic, signal-based, synchronous model and the problem will disappear. Unfortunately current processor architectures, including the CBE, are optimized for the algorithm. Click on the link below for details on a new software model designed to solve the reliability problem.
If every PS3 was networked and Sony rented out your redundant core to the DoD, how fast would the world's most powerful supercomputer be?
It's called the Revolution.
Twinstiq, game news
The first is that they don't deal well with resource contention. No language, or any other thing for that matter, does.
When you fork N processes on N objects and you have N-M processors, it costs you computationally, which translates into lost efficiency.
It's one thing to think of this situation as a bunch (N) of ball-bearings going through a bunch of holes (N-M), with each ball-bearing having its state information local to it. (Any kind of concept of a sieve can serve as a 'gedanken' experiment.)
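To put a number on the N balls through N-M holes picture, here is a back-of-the-envelope sketch in Python. It assumes something the comment does not state: that every task takes the same time and is fully independent, so the tasks go through in whole "rounds". The function names are made up for illustration.

```python
import math


def rounds_needed(n_tasks, n_procs):
    """Whole rounds needed when each processor runs one task per round."""
    return math.ceil(n_tasks / n_procs)


def efficiency(n_tasks, n_procs):
    """Fraction of processor-rounds that do useful work."""
    return n_tasks / (rounds_needed(n_tasks, n_procs) * n_procs)


# Assumed example: N = 9 tasks, M = 1, so N - M = 8 processors.
# The ninth task forces a second round in which 7 processors sit idle.
print(rounds_needed(9, 8))  # 2
print(efficiency(9, 8))     # 0.5625
```

Under these assumptions one extra task nearly halves the efficiency. Real tasks with unequal run times or shared data will behave differently.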
The situation becomes hopelessly confused when there is any dependency on external data or process sources.
The mechanisms for handling that confusion are all basically ones of reducing the many threads down into a single thread and meting out the shared resource piece-meal.
A sufficiently evolved schema is capable of handling replication of a shared 'read-only' resource but, despite the efficiencies inherent in that situation, it merely shifts the burden of resource access up one level. There will be a stiffer computational penalty to be encountered when 'access starvation' is reached.
Hopefully the replication penalty will be acceptable, and there are ways to mitigate the computational cost of that penalty, but the trade-off is an instance-level, existential sort of thing and exists at run-time and can only be guess-timated at algorithm/method design-time.
The second fault is one of design of the languages themselves.
They are not designed to operate within a schema. Actually no language is so the efficiencies to be gained from using a schema are bolted on to the application and not an inherent part of it.
MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
Most of my friends and I have stopped buying consoles and playing video games since the days of the SNES.
Add your 4, subtract my 4 and you get zero growth.
Although Nintendo isn't even talking about the hardware specs, so we can't be sure.
But I didn't include the Revolution because Nintendo is saying the same thing they did with the Gamecube, that they don't need 3rd party developers. Revolution seems largely like a platform for Nintendo to sell you their older games again. Additionally, if Revolution is sufficiently underpowered compared to the other two, it may be that 3rd parties just plain cannot port their games to this platform, or else have to "dumb down" their game in such a way which might make the game uncompetitive with games that don't work on Revolution.
So, basically, N is downplaying new development so much on the Revolution that I simply left it out as a platform which would attract developers who were fed up with the other two. But probably I shouldn't have done so.
By the way, with all of this, I want to mention I'm a huge N fan. I have three GBAs, a DS and a Gamecube, plus all their other consoles back to the SNES. I just think that N is concentrating on 1st/2nd party development more than 3rd party development.
http://lkml.org/lkml/2005/8/20/95
I haven't really done much programming since college and none of those programs have been multithreaded, so maybe I don't have the right background to comment. But, all I can say is wow. This is crazy compared to the Sparc processors that I learned assembly on. As somebody pointed out, not only do these processors have multiple cores, but apparently each one has 128 registers?! Processor design has come a long way.
That said, I see a lot of comments reflecting on how hard it will be for programmers to adjust to programming on this architecture. While I agree that there may be some learning that will have to take place, shouldn't most of the optimization take place at the compiler level? I mean, that's partly the point of languages such as C/C++: write a minimum amount of architecture-specific code and let the compiler do the rest.
Anyway, I find this new architecture very impressive and can't wait to see devices take advantage of this hardware.
If Murphy's Law can go wrong, it will.
Since most of the inter-processor "interconnects" would be consumer-grade DSL/cable links, it'd have phenomenal capacity to process chunks of data but serious latency issues in distributing work units. Commercial cluster data-processing units probably use gigabit Ethernet or faster connections to get around this.
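A simple cost model makes the latency point concrete: time per work unit is roughly link latency plus size divided by bandwidth. The Python sketch below uses made-up link figures (a 1 Mbit/s, 50 ms DSL uplink and a 1 Gbit/s, 0.1 ms Ethernet link); they are assumptions for illustration, not measurements.

```python
# Assumed link figures, for illustration only.
DSL_BANDWIDTH = 1_000_000        # bits per second
DSL_LATENCY = 0.05               # seconds
GIGE_BANDWIDTH = 1_000_000_000   # bits per second
GIGE_LATENCY = 0.0001            # seconds


def transfer_time(size_bytes, bandwidth_bits_per_s, latency_s):
    """Seconds to deliver one work unit: fixed latency plus serialization time."""
    return latency_s + (size_bytes * 8) / bandwidth_bits_per_s


# A 125 kB work unit under the assumed figures.
dsl = transfer_time(125_000, DSL_BANDWIDTH, DSL_LATENCY)
gige = transfer_time(125_000, GIGE_BANDWIDTH, GIGE_LATENCY)
print(f"DSL: {dsl:.4f} s, gigabit: {gige:.4f} s")
```

With these numbers, a scheme like this only pays off when each work unit carries far more compute time than the second or so it takes to ship it.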
I have been a user for about 10 years. This ends Feb 2014. The site's been ruined. I'm off. Dice, FU
The part of Sony that has been providing Linux kits for the PS2 since 2002.
The console homebrew scene is rather big, and Sony and Microsoft can do nothing about it.
"programming for the CBE is like programming for no processor you've ever met before"
Which is exactly why it will never take off.
Another horrible early processor was the TMS9900, which pretended to have 16 16-bit registers but they were just mapped memory. And that too didn't have a proper subroutine call and return. It really wasn't better in the old days.
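For anyone who never met it, the "registers were just mapped memory" part looks roughly like the toy Python model below. The class and method names are made up; the only hardware facts assumed are that the TMS9900 keeps a workspace pointer (WP) and that register Rn is the big-endian 16-bit word at address WP + 2n in RAM.

```python
class Workspace9900:
    """Toy model: 16 registers that live in RAM at the workspace pointer."""

    def __init__(self, ram_bytes=256):
        self.ram = bytearray(ram_bytes)
        self.wp = 0  # workspace pointer, a byte address into ram

    def _addr(self, n):
        """Byte address of register Rn in the current workspace."""
        if not 0 <= n <= 15:
            raise ValueError("register number must be 0..15")
        return self.wp + 2 * n

    def read(self, n):
        """Read Rn: really a 16-bit memory load."""
        a = self._addr(n)
        return int.from_bytes(self.ram[a:a + 2], "big")

    def write(self, n, value):
        """Write Rn: really a 16-bit memory store."""
        a = self._addr(n)
        self.ram[a:a + 2] = (value & 0xFFFF).to_bytes(2, "big")


cpu = Workspace9900()
cpu.write(3, 0x1234)   # lands in RAM bytes 6 and 7
cpu.wp = 32            # point at a different workspace
print(cpu.read(3))     # 0: a fresh set of "registers"
```

The upside is that switching register sets is just a pointer change; the downside is that every register access is a memory access.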
Pining for the fjords
cellular broadband coverage is spotty at best in my area, and the damned providers charge too much per minute for the airtime. :)
help me i've cloned myself and can't remember which one I am