Slashdot Mirror


Ars Technica's Hannibal on IBM's Cell

endersdouble writes "Ars Technica's Jon "Hannibal" Stokes, known for his many articles on CPU technology, has posted a new article on IBM's new Cell processor. This one is the first part of a series, and covers the processor's approach to caching and control logic. Good read."

6 of 449 comments

  1. Part II is up now by Anonymous Coward · · Score: 5, Informative

    Part II is up as well.

  2. Re:Apple? by Tropaios · · Score: 5, Informative

    From the article:

    The Cell and Apple

    Finally, before signing off, I should clarify my earlier remarks to the effect that I don't think that Apple will use this CPU. I originally based this assessment on the fact that I knew that the SPUs would not use VMX/Altivec. However, the PPC core does have a VMX unit. Nonetheless, I expect this VMX to be very simple, and roughly comparable to the Altivec unit of the first G4. Everything on this processor is stripped down to the bare minimum, so don't expect a ton of VMX performance out of it, and definitely not anything comparable to the G5. Furthermore, any Altivec code written for the new G4 or G5 would have to be completely reoptimized due to the in-order nature of the PPC core's issue.

    So the short answer is, Apple's use of this chip is within the realm of conceivability, but it's extremely unlikely in the short- and medium-term. Apple is just too heavily invested in Altivec, and this processor is going to be a relative weakling in that department. Sure, it'll pack a major SIMD punch, but that will not be a double-precision Altivec-type punch.

  3. Re:How do I code this thing?? by Space+cowboy · · Score: 4, Informative


    The architecture of the Cell looks like a much-improved PS2 system, with the PS2's vu0 and vu1 (vector units 0 and 1) replaced by 8 SPE's. Also, the programmable DMA (with chaining ability, allowing it to sequence multiple DMA events one after the other etc.) looks very similar to the PS2's.

    If that turns out to be the case, then PS2 programming is a hint towards how it'll work. On the PS2, you generally configured the DMA controller to upload mini programs to the vector units, then DMA-chained data as streams from RAM through the just-uploaded program and onto the destination (usually the GS which rasterised the display).

    On the Cell, it looks as though you can DMA-chain code & data through multiple SPE's and ultimately back to RAM/the PPC core/whatever is memory mapped. This is cool - it's software pipelining :-)

    So, my guess is that the PPC acts as a (DMA, IO, etc.) controller (much like the mips chip did in the PS2), and the heavy lifting goes on in the vector units, with code and data being streamed in on demand.
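
    The streaming idea described above can be sketched as a toy model. This is not Cell or PS2 code: `dma_stream`, `chain`, `scale` and `offset` are hypothetical names, and plain Python lists stand in for RAM and for the vector units' local memory. It only shows the shape of the idea, which is that data moves in fixed-size chunks through a sequence of small programs.

```python
from typing import Callable, Iterable, Iterator, List

Chunk = List[float]
Kernel = Callable[[Chunk], Chunk]


def dma_stream(data: List[float], chunk_size: int) -> Iterator[Chunk]:
    """Yield fixed-size chunks, standing in for DMA transfers out of RAM."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def chain(chunks: Iterable[Chunk], kernels: List[Kernel]) -> Iterator[Chunk]:
    """Pass every chunk through each kernel in order (one kernel per vector unit)."""
    for chunk in chunks:
        for kernel in kernels:
            chunk = kernel(chunk)
        yield chunk


def scale(chunk: Chunk) -> Chunk:
    """Hypothetical first-stage kernel: double every value."""
    return [2.0 * x for x in chunk]


def offset(chunk: Chunk) -> Chunk:
    """Hypothetical second-stage kernel: add one to every value."""
    return [x + 1.0 for x in chunk]


def run_pipeline(data: List[float], chunk_size: int = 4) -> List[float]:
    """Stream data through both kernels and collect the result back in 'RAM'."""
    out: List[float] = []
    for chunk in chain(dma_stream(data, chunk_size), [scale, offset]):
        out.extend(chunk)
    return out
```

    On real hardware the stages would run concurrently on separate units, with the DMA controller moving chunks between them; the sequential loop here only models the data flow.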

    It's a different model to normal programming, and as far as I can see it encourages you to be closer to the metal (ie: it's harder, I normally expect my L1 cache to take care of itself...), but assuming they release/port gcc for the SPE's, it might not be too hard if you're used to event-driven highly-threaded programming. Let's just hope they release a Linux port and 'vcl' so we can do something useful with the vector units...

    Oh, and if the xbox was a target for a self-hosting linux solution, I think the Cell will be irresistible :-)

    Simon

    --
    Physicists get Hadrons!

  4. Re:Apple? by prockcore · · Score: 4, Informative

    My old 600mhz g3 ibook runs panther, safari, quicktime, iphoto, itunes and everything else I need on a daily basis pretty well. Try saying that about a five year old PC.

    5 years old? Your 600mhz g3 ibook came out in October 2001. That machine is just a few months over 3 years old.

    In October of 2001, the P4 was at 2.0ghz, and the Athlon 2000+ was just coming out. Are you going to tell me that a 2ghz P4 isn't adequate for browsing the web, listening to mp3s and importing digital photos?!

  5. Re:How do I code this thing?? by adam31 · · Score: 4, Informative

    This is similar to the 'scratchpad' RAM that Sony used in the PS2 and PS1. It's 16kb of on-chip (super-fast) memory that can be loaded and manipulated by the programmer, completely separate from the jurisdiction of the cache (which can cause big headaches-- think cache writeback with stale data).

    We'd do our skeletal animation skinning with this. DMA a bunch of verts to scratchpad, transform and weight them on the VU, DMA back to a display list. The thing is, there's really no high-level language support for this... the onus is on the programmer to schedule and memory map everything, mostly in assembly.
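
    A rough sketch of that batch-through-scratchpad pattern, in Python rather than VU assembly. The 16 KB figure comes from the comment above; everything else is an assumption for illustration: 16 bytes per vertex, 3x4 bone matrices, and the hypothetical names `transform`, `skin_vertex` and `skin_mesh`. List slices stand in for the DMA transfers.

```python
SCRATCHPAD_BYTES = 16 * 1024        # scratchpad size quoted above
BYTES_PER_VERT = 16                 # assumption: four 32-bit floats per vertex
VERTS_PER_BATCH = SCRATCHPAD_BYTES // BYTES_PER_VERT


def transform(matrix, vert):
    """Apply a 3x4 matrix (three rows of [a, b, c, translation]) to a 3D point."""
    x, y, z = vert
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in matrix)


def skin_vertex(vert, bones, influences):
    """Blend the vertex as transformed by each influencing bone, by weight."""
    out = [0.0, 0.0, 0.0]
    for bone_index, weight in influences:
        p = transform(bones[bone_index], vert)
        for i in range(3):
            out[i] += weight * p[i]
    return tuple(out)


def skin_mesh(verts, bones, influences):
    """Process the mesh one scratchpad-sized batch at a time."""
    display_list = []
    for start in range(0, len(verts), VERTS_PER_BATCH):
        scratch = verts[start:start + VERTS_PER_BATCH]      # "DMA in"
        skinned = [
            skin_vertex(v, bones, influences[start + i])
            for i, v in enumerate(scratch)
        ]
        display_list.extend(skinned)                        # "DMA out"
    return display_list
```

    The point of the batching loop is that the working set never exceeds what fits in the scratchpad; on the real machine the programmer would also have to schedule the transfers by hand.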

    The design of the cell-- it's incredible. It's every game programmer's wet dream. I just don't see how it's going to be as useful in other areas though. It's going to be a compiler-writer's nightmare, and to get real performance from the SPEs is going to take a lot of assembly or a high-level language construct that I haven't seen yet.

  6. Top 7 Myths of the New Cell Processor: by Modab · · Score: 5, Informative

    There are so many people saying dumb things about the Cell and the upcoming PS3, I have to set some things straight. Here goes:
    1. The Cell is just a PowerPC with some extra vector processing.
      Not quite. The Cell is 9 complete yet simple CPU's in one. Each handles its own tasks with its own memory. Imagine 9 computers each with a really fast network connection to the other 8. You could probably treat them as extra vector processors, but you'd then miss out on a lot of potential applications. For instance, the small processors can talk to each other rather than work with the PowerPC at all.
    2. Sony will have to sell the PS3 at an incredible loss to make it competitive.
      Hardly. Sony is following the same game plan as they did with their Emotion Engine in the PS2. Everyone thought that they were losing 100-200 bucks per machine at launch, but financial records have shown that besides the initial R&D (the cost of which is hard to figure out), they were only selling the PS2 at a small loss initially, and were breaking even by the end of the first year. By fabbing their own units, they took a huge risk, but they reaped huge benefits. Their risk and reward is roughly the same now as it was then.
    3. Apple is going to use this processor in their new machine.
      Doubtful. The problem is that though the main CPU is PowerPC-based like current Apple chips, it is stripped down, and the Altivec support will be much lower than in current G5s. Unoptimized, Apple code would run like a G4 on this hardware. They would have to commit to a lot of R&D for their OS to use the additional 8 processors on the chip, and redesign all their tweaked Altivec code. It would not be a simple port. A couple of years to complete, at least.
    4. The parallel nature will make it impossible to program.
      This is half-true. While it will be hard, most game logic will be performed on the traditional PowerPC part of the Cell, and thus normal to program. The difficult part will be concentrated in specific algorithms, like a physics engine, or certain AI. The modular nature of this code will mean that you could buy a physics engine already designed to fit into the 128k limitation of the subprocessor, and add the hooks into your code. Easy as pie.
    5. The Cell will do the graphics processing, leaving only rasterization to the video card.
      Most likely false. The high-end video cards coming out now can process the rendering chain as fast as the Cell can, looking at the raw specs of 256Gflops from the Cell, as opposed to about 200GFlops from video cards. In two years, video cards will be capable of much more, and they are already optimized for this, where the Cell is not, so video cards will perform closer to the theoretical limits.
    6. The OS will handle the 8 additional vector processors so the programmer doesn't need to.
      Bwahahaha! No way. This is a delicate bit of coding that is going to need to be tweaked by highly-paid coders for every single game. Letting an OS predictively determine what code needs to get sent to what processor to run is insane in this case. The cost of switching out instructions is going to be very high, so any switch will need to be carefully considered by the designer, or the frame-rate will hit rock-bottom.
    7. The Cell chip is too large to fab efficiently.
      This is one myth that could be correct. The Cell is huge (relatively), and given IBM's problems in the recent past with making large, fast PowerPC chips, it's a huge gamble on the part of all parties involved that they can fab enough of these things.
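
    To make the point in myth 6 about switching costs concrete, here is a toy sketch. It models nothing about real Cell scheduling: the task names, `count_uploads` and `group_by_kernel` are hypothetical, and it assumes the tasks in a frame are independent so they can be reordered freely. It only counts how often one subprocessor would need new code loaded under two orderings.

```python
def count_uploads(task_kernels):
    """Count program loads when one processor runs the tasks in the given order."""
    uploads = 0
    loaded = None
    for kernel in task_kernels:
        if kernel != loaded:
            uploads += 1
            loaded = kernel
    return uploads


def group_by_kernel(task_kernels):
    """Reorder tasks so those sharing a kernel run back to back.

    Kernels keep the order in which they first appear; this is only valid
    under the stated assumption that the tasks do not depend on each other.
    """
    first_seen = {}
    for kernel in task_kernels:
        first_seen.setdefault(kernel, len(first_seen))
    return sorted(task_kernels, key=lambda k: first_seen[k])


# Hypothetical per-frame work list.
frame = ["physics", "audio", "physics", "ai", "audio", "physics"]
naive_uploads = count_uploads(frame)
grouped_uploads = count_uploads(group_by_kernel(frame))
```

    In this made-up frame the naive order reloads code for every task, while the grouped order loads each kernel once. A designer who knows the workload can make that choice up front, which is the kind of decision the comment argues a general-purpose OS scheduler would struggle to make.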