Slashdot Mirror


A Look Into The Cell Architecture

ball-lightning writes "This article attempts to decipher the patent filed by the STI group (IBM, Sony, and Toshiba) on their upcoming Cell technology (most notably going to be used in the PS3). If it's as good as this article claims, the Cell chip could eventually take over the PC market."

2 of 318 comments

  1. Consider a different approach by Space cowboy · Score: 4, Informative


    All the programs that run on PC architectures expect certain things to be in place: they expect a single fast central CPU. They expect that good cache usage is important for performance. They expect to have access to gobs of RAM. Etc. Etc. The PS2 (and by extension the Cell) is completely different.

    Consider a different architecture. You have a job that consists of multiple things to do. Some of these can be easily parallelised, others are mainly sequential. Divide it up so the parallel ones are coded separately, maybe with some IPC to synchronise to some clock.
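    A minimal sketch of that split, in Python rather than anything PS2-specific. The function names (`parallel_part`, `sequential_part`, `run_frame`) and the arithmetic are made up for illustration, and the clock synchronisation mentioned above is left out:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_part(n):
    # Independent per-item work: no item depends on any other.
    return n * n

def sequential_part(values):
    # Order-dependent work: each result needs the one before it.
    total = 0
    running = []
    for v in values:
        total += v
        running.append(total)
    return running

def run_frame(inputs):
    # Fan the independent work out to workers, then do the ordered part.
    with ThreadPoolExecutor(max_workers=4) as pool:
        squared = list(pool.map(parallel_part, inputs))  # map keeps input order
    return sequential_part(squared)

frame = run_frame([1, 2, 3, 4])
```

    The point is only the shape: the independent work is coded separately from the ordered work, so each can be scheduled in the way that suits it.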

    For a sequential part (say rendering the object list of a scene back to front to gain occlusion) the approach that worked for me on the PS2 (which is logically similar, if significantly less powerful) was to divide the job into tasks. Each task (say, one per object in the above) gets its own bit of code and knows about the data that it needs to perform its task.

    The key thing is that the Harvard-style separation of code and data just doesn't exist on a PS2. You set up a DMA chain that loads the program into the processor, then streams the data through the program on the processor. Lather, rinse, repeat. Make the chain self-submitting and you can effectively forget about that chunk of code: it'll just happen.
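    To make the load-then-stream pattern concrete, here is a toy model in Python. It is not PS2 code and uses no real DMA API: `Task` and `run_chain` are hypothetical names, a plain function stands in for the uploaded program, and a list stands in for the data packets. The self-submitting part is not modelled.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Task:
    # One link in the chain: a program plus the data it will process.
    program: Callable[[int], int]  # stands in for code uploaded to the processor
    data: List[int]                # stands in for the packets streamed through it

def run_chain(chain):
    # Walk the chain in order: "upload" each program, then stream its data.
    results = []
    for task in chain:
        loaded = task.program                           # load step
        results.append([loaded(x) for x in task.data])  # stream step
    return results

chain = [
    Task(program=lambda x: x * 2, data=[1, 2, 3]),
    Task(program=lambda x: x + 10, data=[4, 5]),
]
out = run_chain(chain)
```

    Code and data travel down the same chain as ordinary entries, which is the sense in which the two are not kept separate.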

    This is still doing things sequentially (but we've agreed that this is a sequential task, right?) - the point is that it's being done highly efficiently within the architectural constraints. You have a dataflow architecture and even sequential code can hit the performance limits if you code to the architecture.

    The Cell looks even more powerful, in that you can chain execution modules together, so you can load code into APUs 1, 2, 3, 4 and stream the data through 1, 2, 3, 4 automatically before it's considered 'done'. This was possible on the PS2, but ... awkward. It'll keep the effective clocks/instruction down because you're effectively pipelining your software... Nice idea.
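    A rough sketch of that chaining, using Python generators as stand-ins for the four APUs. `stage`, `build_pipeline` and the four lambdas are invented for illustration. The generators only model the dataflow: they run one after another on a single thread, whereas on real hardware each stage would presumably run concurrently on its own unit.

```python
def stage(fn, upstream):
    # One processing unit: apply its loaded code to each item as it arrives.
    for item in upstream:
        yield fn(item)

def build_pipeline(stages, source):
    # Connect the stages so each one consumes the output of the previous one.
    stream = iter(source)
    for fn in stages:
        stream = stage(fn, stream)
    return stream

# Four stand-in "programs", one per unit.
stages = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x - 3,
    lambda x: x * x,
]
result = list(build_pipeline(stages, [1, 2, 3]))
```

    An item only counts as 'done' once it has come out of the last stage, and nothing has to collect intermediate results between stages.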

    Simon

    --
    Physicists get Hadrons!
  2. Re:Well, this could use some more reiteration... by hattig · Score: 4, Informative
    We will find out a whole lot more within the next fortnight: Cell is being described in a lot of detail at ISSCC 2005 in early February.

    Paper Details:

    • The Design and Implementation of a First-Generation CELL Processor (10.2)
    • A Streaming Processing Unit for a CELL Processor (7.4)
    • A 4.8GHz Fully Pipelined Embedded SRAM in the Streaming Processor of a CELL Processor (26.7)
    • A Double-Precision Multiplier with Fine-Grained Clock-Gating Support for a First-Generation CELL Processor (20.3)
    • Clocking and Circuit Design for a Parallel I/O on a First-Generation CELL Processor (28.9)