Slashdot Mirror


Unleashing the Power of the Cell Broadband Engine

An anonymous reader writes "IBM DeveloperWorks is running a paper from the MPR Fall Processor Forum 2005 that explores programming models for the Cell Broadband Engine (CBE) Processor, from the simple to the progressively more advanced. With nine cores on a single die, programming for the CBE is like programming for no processor you've ever met before."

32 of 136 comments

  1. Cell architecture by rd4tech · · Score: 2, Informative

    The Cell Architecture grew from a challenge posed by Sony and Toshiba to provide power-efficient and cost-effective high-performance processing for a wide range of applications, including the most demanding consumer appliance: game consoles. Cell - also known as the Cell Broadband Engine Architecture (CBEA) - is an innovative solution whose design was based on the analysis of a broad range of workloads in areas such as cryptography, graphics transform and lighting, physics, fast-Fourier transforms (FFT), matrix operations, and scientific workloads. As an example of innovation that ensures the clients' success, a team from IBM Research joined forces with teams from IBM Systems Technology Group, Sony and Toshiba, to lead the development of a novel architecture that represents a breakthrough in performance for consumer applications. IBM Research participated throughout the entire development of the architecture, its implementation and its software enablement, ensuring the timely and efficient application of novel ideas and technology into a product that solves real challenges.

  2. Re:New Me by NanoGator · · Score: 4, Funny

    "I just want to draw a flowchart and have the compiler and realtime scheduler distribute processes and data among the hardware resources. If we are getting a new architecture and new "programming models", and therefore new compilers and kernels, how about a new IDE paradigm."

    Bingo, sir.

    --
    "Derp de derp."
  3. Has nothing to do with Broadband by ScottCooperDotNet · · Score: 2

    Damn you marketing droids! This has nothing to do with broadband at all.

    1. Re:Has nothing to do with Broadband by Guilly · · Score: 3, Informative

      I would assume they call it broadband because the 8 SPEs can communicate with each other over a 100GB/s link (called the Element Interconnect Bus -- yes, that's 100GB, not 100Gb) and also because it provides plenty of SIMD instructions.

      Oh yeah. If you read their web page they also mention the Cell processor will be able to handle broadband rich media applications and streaming content:
      The first-generation Cell Broadband Engine (BE) processor is a multi-core chip comprised of a 64-bit Power Architecture processor core and eight synergistic processor cores, capable of massive floating point processing, optimized for compute-intensive workloads and broadband rich media applications.

    2. Re:Has nothing to do with Broadband by ScottCooperDotNet · · Score: 5, Insightful
      Simply because IBM mentions broadband doesn't mean it has anything to do with system-to-system data transmission. This sounds a bit like Intel's marketing of "shiny new Pentiums make the Internet faster."

      "The Pentium III will make the Internet a much more consumer-friendly environment," says Jami Dover, Intel's marketing vice president. Surfing today, Dover maintains, is a limited experience because data-transfer rates over ordinary telephone lines do not allow for high-quality audio, video and 3D graphics. "You take people raised on TV and show them a flat, text [Web] page," says Dover. "It's quite a juxtaposition." I guess Intel was hoping the world could go through a phone line with enough compression.

      To us this is a nitpick; to the general public this is more confusion in a jargon-filled marketplace.

  4. Wow ... by JMZorko · · Score: 4, Interesting

    ... all those _registers_ make me salivate! One of the coolest things about the RCA1802 (the processor I learned on) compared to others in its time was that it had _loads_ of registers when compared to a 6502 or 8085. It spoiled me, though ... when I started exploring those other CPUs, I always thought "Huh? Where are all of the registers?"

    So yes, I want a Cell-based devkit now, 'cuz this sounds like _fun_ :-)

    Regards,

    John

    --
    Falling You - beautiful
  5. ps3 programming by orlyonok · · Score: 3, Insightful

    From the article: if the PS3's Cell CPU is even half the processor that this monster is, I say game companies will need a lot of real programmers to make really good games (as if they cared).

    --
    And I have prayed unto You, O Lord U**X in the time of the Will of Linux.
    1. Re:ps3 programming by MikeFM · · Score: 2, Interesting

      It'd seem to me that a lot of the development trickery will be in getting a proper compiler and specialized libs out there that take advantage of this parallelism without requiring massive changes to how the average developer has to write their code.

      Most of the bitching we've heard from developers so far hasn't been that the Cell sucks but that their existing codebases don't take advantage of its design and they don't want to make a rewrite that locks them into the platform.

      As with every platform, the really good stuff will come out a couple of years after its release. At least with the Cell they are pushing it to go mainstream instead of just for gaming consoles, so we should expect to see development moving along much faster than with a plain console.

      --
      At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
    2. Re:ps3 programming by iota · · Score: 4, Insightful

      From the article: if the PS3's Cell CPU is even half the processor that this monster is, I say game companies will need a lot of real programmers to make really good games (as if they cared).

      1. Some of us do care, actually.
      2. The Cell processor described is exactly the processor in the PS3.
      3. Yes, regardless of what some would like to believe, there is no magic. It's different, but it's the way things are going, so some of us are adapting the way we develop. It'll take work, and maybe a little time, but that's always been our job - we get hardware and we figure out how to do something cool with it.
      4. It is actually really fun to work on and very impressive.

    3. Re:ps3 programming by iota · · Score: 2, Insightful

      It'd seem to me that a lot of the development trickery will be in getting a proper compiler and specialized libs out there that take advantage of this parallelism without requiring massive changes to how the average developer has to write their code.

      Certainly people are working on that very idea. However, it's a long way off and not likely to happen in the lifetime of this version of the processor. Both XLC (IBM's optimizing compiler) and GCC have a very difficult time vectorizing (i.e. taking advantage of the SIMD instruction sets) within a single processor. IBM has released a Cell SDK for managing the PPU/SPU at a higher level, which should make the transition slightly easier for some developers, but on the whole there is no way around the fact that the final algorithms and data design are very different when targeting a Cell.

      Most of the bitching we've heard from developers so far hasn't been that the Cell sucks but that their existing codebases don't take advantage of its design and they don't want to make a rewrite that locks them into the platform.

      These developers that are bitching are just the descendants of the developers that were bitching when games moved from 2D to 3D. That caused a major upheaval as well. We lost a lot of programmers in that transition, and we're bound to lose some here too. But times change and multi-processing has been a long time coming - it's not going anywhere. The Cell may be a hit, or not - but the software techniques will be the basis of what we do for quite a while.

    4. Re:ps3 programming by TheRaven64 · · Score: 2, Interesting
      C is an incredibly bad language for programming a modern CPU. There are many parts of the C language which assume that the target machine looks a lot like a PDP-11. Trying to turn this code into something that runs on a superscalar machine, let alone a vector processor, is incredibly difficult - we can only do it at all because such a huge amount of effort has been invested in research. If you want a good language for programming something like the Cell, then you should take a look at more or less any functional language.
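
      A rough sketch of why that helps, with Python standing in for a functional language (an assumption for illustration, not a claim about any Cell toolchain): when a function is pure, mapping it over data can be handed to any number of workers and the answer cannot change. The names `scale_and_bias` and `parallel_map`, and the worker count of 8, are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_and_bias(x: float) -> float:
    """Pure function: the result depends only on the argument."""
    return 2.0 * x + 1.0


def sequential_map(values):
    """Reference result: apply the function one element at a time."""
    return [scale_and_bias(v) for v in values]


def parallel_map(values, workers=8):
    """Apply the same function using a pool of workers.

    The worker count of 8 is only an illustrative nod to the eight SPEs.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, whichever worker
        # happened to compute each element.
        return list(pool.map(scale_and_bias, values))


data = [float(i) for i in range(1000)]
same = parallel_map(data) == sequential_map(data)
```

      The point here is that scheduling order does not matter, not speed: Python threads will not make this arithmetic faster. A compiler for a pure language gets the same guarantee for free, which is what makes automatic distribution plausible.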

      Oh, and anyone who thinks functional languages are scary should realise that they probably use (a very primitive and unfriendly) one for their build system - make.

      --
      I am TheRaven on Soylent News
  6. 20 core die by Anonymous Coward · · Score: 3, Funny
    Amazing progress. So with 20 cores on a single die, we can play D&D in real time?

    It's Saturday night and I'm all alone here, cut me some slack...

  7. Re:ps3 programming...no, not really by Fallen+Kell · · Score: 2, Insightful
    I say no, they won't need lots of real programmers. They only need 1 or 2 per game team to do the overall design and let the compilers do the rest, since the real guts of it will be compiler optimization. If your lead designers do their job, the compiler will be able to do its job and everything will work like it should.

    It's when you take old code from previous things and then try to do a direct port that you will see some performance hits. But if the code is designed from the ground up for a Cell environment (or ANY CPU architecture), it is all in the hands of the few top-level software design architects to properly structure the overall workings of the game's code. Once the structure is correct, sending the bits and pieces that need to be made to the rest of the code monkeys is no problem; they just need to follow the UML or whatever other design docs they are specifically supposed to implement.

    --
    We were all warned a long time ago that MS products sucked, remember the Magic 8 Ball said, "Outlook not so good"
  8. Re:PS3 Suggestion by spoonboy42 · · Score: 4, Informative

    The PS3 has 512M of memory by default. It is half Rambus XDR and half GDDR3, but both segments of memory can be addressed by both the processor and the GPU.

    --
    Anonymous Luddite: "What do you think of the dehumanizing effects of the Internet?"
    Andy Grove: "Not Much."
  9. Re:PS3 Suggestion by Trigun · · Score: 2, Interesting

    Just cut Sony out of the loop, and have IBM do the work. They could re-revolutionize the desktop PC market.

  10. yea but by rrosales · · Score: 2, Funny

    can it do infinite loops in 5 seconds?

  11. Re:PS3 Suggestion by rpdillon · · Score: 4, Interesting

    Every PS3 hard drive is shipping with Linux onboard.

  12. Remind anyone... by Kadin2048 · · Score: 2, Insightful

    ... of the promotional material for the Sega Saturn from a few years back?

    I remember right about the time it came out, there was a lot of hype about its architecture. Two main processors and a bunch of dedicated co-processors, fast memory bus, etc., etc. I don't remember any more specifics, but at the time it seemed very impressive. Of course it flopped spectacularly, because apparently the thing was a huge pain in the ass to program for and the games never materialized. Or at least that's the most often spoken reason that I've heard.

    Anyway, and I'm sure I'm not the first person to have realized this, Cell is starting to sound the same way. The technical side is being hyped and seems clearly leaps and bounds ahead of the competition, but one has to wonder what MS is doing to prevent themselves from producing another Saturn on the programming side.

    --
    "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
    1. Re:Remind anyone... by Sycraft-fu · · Score: 3, Insightful

      Well, not quite. The odd processors were a problem for the Saturn, but not the major one. The really major problem was that it wasn't good at 3D. The Saturn was basically designed to be the ultimate 2D console, which it was. However 3D was kinda hacked on later and thus was hard to do and didn't look as good as the competition. This was at a time when 3D was new and flashy, and thus an all-important selling point.

      However, you are correct in that having a system with a different development model could be problematic. Game programmers (and I suppose all programmers) can be fairly insular. Many are already whining about the multi-core movement. They like writing single-thread code, a big while loop in essence, since that's the way it's always been done. However, the limitations of technology are forcing new thinking. Fortunately, doing multi-threaded code shouldn't require a major reworking of the way things are done, especially with good development tools.

      Well, the Cell is something else again. It's multi-core to the extreme in one manner of thinking, but not quite, because the Cells aren't full, independent processor cores. So programming it efficiently isn't just having 8 or 9 or however many cores' worth of tasks for it.

      Ultimately, I think the bottom line will come down to the development tools. Game programmers aren't likely to be hacking much assembly code. So if the compiler knows how to optimise their code for the Cell, it should be pretty quick. If it doesn't and requires a very different method of coding, it may lead to it being underutilised.

      Now it may not be all that important. Remember this isn't like the PS2: the processor isn't being relied on for graphics transformations, the graphics chip will handle all that. So even if the processor is underutilised and thus on the slow side, visually stunning games should still be possible.

      However, it is a risk, and a rather interesting one. I'm not against new methods of doing things, but it seems for a first run of an architecture, you'd want it in dev and research systems. Once it's been proven and the tools are more robust, then maybe you look at the consumer market. Driving the first generation out in a mass consumer device seems risky, especially given that the X-box has lead time and thus its development model is already being learned.

    2. Re:Remind anyone... by TheRaven64 · · Score: 2, Interesting
      Many are already whining about the multi-core movement. They like writing single-thread code, a big while loop in essence, since that's the way it's always been done.

      Meanwhile, those of us who have been advocating building large numbers of loosely coupled, message-passing components, all running in their own process space, have enormous grins on our faces at the thought of being able to do the message passing via a shared cache with only a cycle or two penalty...
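
      For anyone who hasn't seen the style, here is a minimal sketch of two loosely coupled components that talk only through message queues. It is an illustration under assumptions: Python threads stand in for separate processes (so the isolation is by convention only), and `squarer`, `accumulator`, `run_pipeline` and the `STOP` sentinel are names invented for this example.

```python
import queue
import threading

STOP = None  # sentinel message that tells a component to shut down


def squarer(inbox, outbox):
    """Component 1: receives numbers and sends their squares downstream."""
    while True:
        msg = inbox.get()
        if msg is STOP:
            outbox.put(STOP)  # pass the shutdown request along
            return
        outbox.put(msg * msg)


def accumulator(inbox, result_box):
    """Component 2: sums what it receives, then reports the total once."""
    total = 0
    while True:
        msg = inbox.get()
        if msg is STOP:
            result_box.put(total)
            return
        total += msg


def run_pipeline(numbers):
    """Wire the two components together and feed them the input."""
    first, second, result = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=squarer, args=(first, second)),
        threading.Thread(target=accumulator, args=(second, result)),
    ]
    for worker in workers:
        worker.start()
    for n in numbers:
        first.put(n)
    first.put(STOP)
    for worker in workers:
        worker.join()
    return result.get()


total = run_pipeline([1, 2, 3, 4])  # 1 + 4 + 9 + 16
```

      Neither component touches the other's state; the only coupling is the message format. That is the property that would let the same design move onto separate cores with a fast interconnect.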

      --
      I am TheRaven on Soylent News
  13. Mambo development by iota · · Score: 4, Informative
    Development for the Cell is open. You are free to download IBM's Cell Simulator.
    Written in C, a significant part of the Full-System Simulator's simulation capability is directly attributed to its simulation multitasking framework component. Developed as a robust, high-performance alternative to conventional process and thread programming, the multitasking framework is a lightweight, multitasking scheduling framework that provides a complete set of facilities for creating and scheduling threads, manipulating time delays, and applying a variety of interthread communication policies and mechanisms to simulation events.
    The simulator runs a Red Hat kernel, so the programming model will be familiar. Also, both SCE's (gcc-based) and IBM's (XLC) compilers are available for both the PPU and SPU.

    IBM will also be releasing Cell-based Blade servers next year, so pick one up if you're serious about development!
  14. Reminds me of programming the nCube by Animats · · Score: 3, Interesting
    The nCube, in the 1980s, was much like this. 64 to 1024 processors, each with 128KB and a link to neighboring processors, plus an underpowered control machine (an Intel 286, surprisingly.)

    The Cell machines are about equally painful to program, but because they're cheaper, they have more potential applications than the nCube did. Cell phone sites, multichannel audio and video processing, and similar easily-parallelized stream-type tasks fit well with the cell model. It's not yet clear what else does.

    Recognize that the cell architecture is inherently less useful than a shared-memory multiprocessor. It's an attempt to get some reasonable fraction of the performance of an N-way shared memory multiprocessor without the expensive caches and interconnects needed to make that work. It's not yet clear if this is a price/performance win for general purpose computing. Historically, architectures like this have been more trouble than they're worth. But if Sony fields a few hundred million of them, putting up with the pain is cost-justified.

    It's still not clear if the cell approach does much for graphics. The PS3 is apparently going to have a relatively conventional nVidia part bolted on to do the back end of the graphics pipeline.

    I'm glad that I don't have to write a distributed physics engine for this thing.

    1. Re:Reminds me of programming the nCube by plalonde2 · · Score: 2, Interesting
      I disagree that the Cell architecture is "inherently less useful than a shared-memory multiprocessor".

      Shared memory is the cause of 80% of the nasty little race conditions programmers leave peppered through their code on parallel machines - it's just too easy to break discipline, particularly considering the crap programming language support we have - C and C++ are just not up to the task because of their assumption that you may touch anything in the address space.

      Cell-like architectures have one other advantage, particularly in performance-sensitive applications: the explicit DMA to local stores *makes* you look at how the busses work on that machine, and they do not really differ from the busses on non-Cell-like modern machines; the structure of your Cell code will be bus-friendly on pretty much any architecture you port it to. And in our modern world, the bus is the bottleneck.
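
      To make the shape of that code concrete, here is a toy simulation of streaming data through a small local buffer with explicit transfers, including the usual double-buffering trick. Everything in it is an assumption for illustration: `dma_get`, `dma_put`, `process_streamed` and `CHUNK` are invented names, not the Cell SDK's API, and plain Python runs the "transfers" one after another instead of overlapping them.

```python
CHUNK = 4  # elements per transfer; stands in for a local-store buffer size


def dma_get(main_memory, start, count):
    """Copy a chunk from 'main memory' into a fresh local buffer."""
    return main_memory[start:start + count]


def dma_put(main_memory, start, local_buffer):
    """Copy a local buffer back out to 'main memory'."""
    main_memory[start:start + len(local_buffer)] = local_buffer


def process_streamed(main_memory, kernel):
    """Apply kernel to every element, touching memory only in chunks."""
    n = len(main_memory)
    if n == 0:
        return
    # Prime the pipeline: fetch the first chunk before the loop starts.
    next_buf = dma_get(main_memory, 0, CHUNK)
    for start in range(0, n, CHUNK):
        current = next_buf
        # Request the following chunk before computing, so that on real
        # hardware the transfer could overlap with the computation.
        if start + CHUNK < n:
            next_buf = dma_get(main_memory, start + CHUNK, CHUNK)
        # The kernel only ever sees the local buffer.
        result = [kernel(x) for x in current]
        dma_put(main_memory, start, result)


memory = list(range(10))
process_streamed(memory, lambda x: x * 2)
```

      The kernel never indexes into main memory directly, so the access pattern is a sequence of contiguous block transfers. That same pattern tends to be friendly to caches and prefetchers on conventional machines, which is the portability argument above.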

  15. they gave up... by YesIAmAScript · · Score: 5, Interesting

    Both Sony and MS realized they couldn't make a single true general-purpose CPU with the performance they wanted for a price they could afford to sell in their consoles.

    Sony went to a CPU, GPU and 7 co-processors (Cell).
    MS went to 3 CPUs with vector-assist and a GPU.

    Both companies are going to need to spend a lot of time and money on developer tools to help their developers more easily take advantage of their oddball hardware, or else they will end up right where Saturn did.

    I guess the good news for both companies is that there is no alternative (like PS1 was to Saturn) which is straightforward and thus more attractive.

    PS2 requires programming a specialized CPU with localized memory (the Emotion Engine) and it seems to get by okay. So developers can adapt, given sufficient financial advantage to doing so.

    --
    http://lkml.org/lkml/2005/8/20/95
    1. Re:they gave up... by thomasscovell · · Score: 2, Interesting

      No alternative? The Nintendo codename-Revolution will be comparatively "under"-powered, but will definitely be a simpler machine to code for and have novel (not novelty!) controller hardware that will afford the kind of possibilities Sony and Microsoft's idea of "next generation" doesn't offer. Just pushing more polygons isn't where it is at. There's been no growth in the size of the gaming market since the SNES era, just more spending by those who do game. Nintendo's next generation model is at least looking to increase the gaming demographic, just the way their Nintendo DS handheld has (senior gamers? Plenty of those in Japan now, thanks to the DS!).

  16. MOD PARENT DOWN by imroy · · Score: 4, Informative

    Note to moderators: the user "5, Troll" likes to cut and paste posts from other sites to gain karma. This one was found on the DeveloperWorks site with a quick google search.

  17. Re:Task switching... by freeduke · · Score: 2, Interesting
    For use with general OSes, what could be interesting would be a dual-core PPE with those 8 SPEs.

    The first core could be the main processor, handling processes, and the second core could just be there to be interrupted by dedicated threads executed on the SPEs, and communicate with them. The main problem would come from the memory bandwidth used by the core which handles the 8 SPEs; it should be designed to minimize the impact on the first core.

    A solution to this could be to have a Cell processor and a traditional single-core processor, both of them using HT to improve memory bandwidth. But it seems to be complicated. Anyway, this Cell processor could be interesting as a thread management unit.

    Another point should be to double the memory for each SPE, and prefetch context switches while another thread is running on it, and, once the context switch is done, retrieve data from the previous thread: this could be managed by the PPE. And if you combine this solution with non-synchronized timer interrupts on each SPE, I bet you can get a pretty good improvement on the memory bandwidth consumption of a Cell unit...

    With all those basic ideas, I think that there is plenty of room to use those Cell processors efficiently.

  18. Re:CBE = Failure by plalonde2 · · Score: 4, Insightful
    You're right - you don't design around a new processor.

    But you should design around the changes in architecture that have been coming at us for the last 5-10 years: the bus is the bottleneck, and the Cell makes this explicit. It goes so far as to deal with the clock-rate limits we've reached by taking the basic "bus is the limit" and exposing it in a way that lets you stack a bunch of processors without excessive interconnect hardware (and associated heat) into a more power-efficient chip.

    I've been working on Cell for nearly a year now, and it's been really nice being forced to pay attention to the techniques that will be required to get performance on all multi-core machines, which in essence means all new processors coming out. Our bus/clockrate/heat problems are all inter-dependent, and Cell is the first project I've seen that gets serious about letting programmers know they need to change to adapt to this new space.

  19. you're probably right by YesIAmAScript · · Score: 2, Interesting

    Although Nintendo isn't even talking about the hardware specs, so we can't be sure.

    But I didn't include the Revolution because Nintendo is saying the same thing they did with the Gamecube, that they don't need 3rd party developers. Revolution seems largely like a platform for Nintendo to sell you their older games again. Additionally, if Revolution is sufficiently underpowered compared to the other two, it may be that 3rd parties just plain cannot port their games to this platform, or else have to "dumb down" their game in such a way which might make the game uncompetitive with games that don't work on Revolution.

    So, basically, N is downplaying new development so much on the Revolution that I simply left it out as a platform which would attract developers who were fed up with the other two. But probably I shouldn't have done so.

    By the way, with all of this, I want to mention I'm a huge N fan. I have three GBAs, a DS and a Gamecube, plus all their other consoles back to the SNES. I just think that N is concentrating on 1st/2nd party development more than 3rd party development.

    --
    http://lkml.org/lkml/2005/8/20/95
  20. interconnect restrictions by CdBee · · Score: 2

    Since most of the inter-processor "interconnects" would be consumer-grade DSL/Cable links, it'd have phenomenal capacity to process chunks of data but serious latency issues in distributing work units. Commercial cluster data-processing units probably use gigabit ethernet or faster connections to get around this.

    --
    I have been a user for about 10 years. This ends Feb 2014. The site's been ruined. I'm off. Dice, FU
  21. Re:HLL like Java & Smalltalk have two faults by TheRaven64 · · Score: 2, Informative

    Java and Smalltalk are both imperative languages and, while I am quite fond of Smalltalk, my post was about functional languages. Most functional languages don't permit aliasing, which dramatically reduces locking issues related to resource contention (and copy-on-write optimisations can make them very fast).
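
    A small sketch of what aliasing does to you, using Python purely for illustration (the function names are made up for this example). The same loop gives different answers depending on whether the output buffer happens to be the input buffer, which is exactly the case a compiler for an imperative language must rule out before it can reorder or vectorise the loop.

```python
def add_neighbor_in_place(dst, src):
    """dst[i] = src[i - 1] + src[i], written one element at a time."""
    for i in range(1, len(src)):
        dst[i] = src[i - 1] + src[i]
    return dst


def add_neighbor_pure(src):
    """Same formula, but the input is never modified (non-empty input)."""
    return [src[0]] + [src[i - 1] + src[i] for i in range(1, len(src))]


values = [1, 2, 3, 4]

# Separate buffers: every read sees the original data.
separate = add_neighbor_in_place(list(values), values)

# Aliased buffers: each write changes what the next iteration reads.
shared = [1, 2, 3, 4]
aliased = add_neighbor_in_place(shared, shared)

# Pure version: there is no output buffer to alias in the first place.
pure = add_neighbor_pure(values)
```

    With separate buffers the iterations are independent and could run in any order. With aliased buffers each iteration depends on the one before it, so the loop is inherently serial. The pure version cannot express the second case at all.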

    --
    I am TheRaven on Soylent News
  22. Re:PS3 Suggestion by TheGSRGuy · · Score: 2, Informative
    2.5" drives don't have to be slow. Most laptops ship with 5400rpm (or even worse, 4200rpm) drives. I paid to upgrade to a 7200RPM drive w/8MB cache in my Dell notebook. Huge speed jump. You can even get 2.5" drives with 16MB caches. That would offer a significant speed boost.

    Frankly, I don't see why they couldn't just use flash memory instead...everyone's doing it these days.