Slashdot Mirror


A Look Into The Cell Architecture

ball-lightning writes "This article attempts to decipher the patent filed by the STI group (IBM, Sony, and Toshiba) on their upcoming Cell technology (most notably going to be used in the PS3). If it's as good as this article claims, the Cell chip could eventually take over the PC market."

6 of 318 comments

  1. Dupe!-Was it as good for you? by Anonymous Coward · · Score: 5, Insightful

    "Timothy do you actually read Slashdot?"

    Here's a better question. If he will not, why should we?

  2. Dataflow squared by Space+cowboy · · Score: 5, Interesting


    The original PS2 design was for a dataflow architecture - the Cell is a continuation (and significant evolution) of the theme. Interestingly enough, if this *does* take off it may be that the best programmers of tomorrow turn out to be the PS2 low-level guys, who've already written the algorithms that are about to be important.

    In the PS2, the MIPS chip was there mainly to do the simple stuff; all the heavy lifting was done on the 2 vector processors, which were designed to have programs uploaded into them and data streamed through them using a very flexible (chainable) DMA engine. That sounds similar (if in a limited sense) to the Cell chip itself.
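
    A rough sketch of the "upload a program, then stream chained data through it" idea, for illustration only. The names `DmaTag`, `VectorUnit`, and `run_chain` are hypothetical and do not correspond to any real PS2 or Cell API; the chain is modelled as a plain linked list.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Block = List[float]


@dataclass
class DmaTag:
    """Hypothetical link in a DMA chain: one data block plus the next tag."""
    data: Block
    next: Optional["DmaTag"] = None


class VectorUnit:
    """Toy stand-in for a vector processor that runs one uploaded program."""

    def __init__(self) -> None:
        self.program: Optional[Callable[[Block], Block]] = None

    def upload(self, program: Callable[[Block], Block]) -> None:
        # "Uploading" here just means remembering which function to run.
        self.program = program

    def process(self, block: Block) -> Block:
        if self.program is None:
            raise RuntimeError("no program uploaded")
        return self.program(block)


def run_chain(head: Optional[DmaTag], unit: VectorUnit) -> List[Block]:
    """Follow the chain tag by tag, streaming each block through the unit."""
    results = []
    tag = head
    while tag is not None:
        results.append(unit.process(tag.data))
        tag = tag.next
    return results


def scale_by_two(block: Block) -> Block:
    return [2.0 * x for x in block]


# Three chained blocks, one uploaded program.
chain = DmaTag([1.0, 2.0], DmaTag([3.0, 4.0], DmaTag([5.0])))
vu = VectorUnit()
vu.upload(scale_by_two)
streamed = run_chain(chain, vu)
print(streamed)
```

    The point of the chain is that the main CPU only sets up the first tag; the rest of the transfer follows the links without further involvement.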

    Simon.

    --
    Physicists get Hadrons!
  3. Some Thoughts by logicnazi · · Score: 5, Insightful

    Well, I think we all recognized that the article was a little overenthusiastic, but it does suggest some interesting possibilities.

    First of all, I want to say I think it is completely possible to make a processor with 8 APUs and so forth. For starters, PowerPC chips already have several separate execution units on them, and I think they use fewer transistors than Intel chips. Moreover, a huge chunk of the transistor budget goes to things like cache consistency or complicated instruction prediction, which is probably not used on the much simpler APUs.

    Of course, it seems like this is primarily of interest to game systems or signal processing applications (note that a 4-threaded, 32-stream processor is just another way of saying 4 Cell processors, each with a PPC core and 8 APUs). However, I would not be so quick to dismiss this for the PC market. While it may be true that many individual applications do not easily multi-thread, it seems we are approaching a point where the biggest complaint is not the maximum processing rate in one application but the ability to run multiple applications at once. On my computers I'm rarely if ever frustrated by the rate some program is running at, but rather by the slowdown in other programs when I run a processor-intensive job or turn on a video. So while drawing a webpage may not be sped up by this processor, drawing several webpages at the same time will be, and that is the sort of thing that makes a big difference for the end user.

    Also, a processor like this offers great possibilities for JIT and VM code. The main thread can dispatch instructions and threads to the APUs dynamically based on what is happening in the system. I also find it interesting that IBM is going the same way as Intel in pushing all the complexity onto the compiler. It makes one wonder if Itanium is really as dead as everyone thinks. Perhaps in 4 years, when AMD can't squeeze anything more out of x86, Intel will be ready to jump in, having worked out all the bugs in their new chip.

    --

    If you liked this thought maybe you would find my blog nice too:

  4. Re:Dupe! by Ohreally_factor · · Score: 5, Insightful

    Timothy do you actually read Slashdot?

    Wouldn't that be like eating from the toilet?

    --
    It's not offtopic, dumbass. It's orthogonal.
  5. What I can't help but think by mcc · · Score: 5, Interesting

    I've had for a very long time the suspicion that the XBox was basically just a big blindside aimed at Sony. The XBox loses a huge amount of money, and looks as if it will continue to lose a huge amount of money right into the XBox 2 line; Microsoft must be doing this for some reason. My personal theory for a while has been that at least one of Microsoft's motivations in spending all this money is that they see the Playstation as a potential future threat; i.e., they feared and fear that at some point the Playstation 2 or 3 or 4 will become so close in power and functionality to a PC that it will begin to supplant the PC for common tasks. This would be disastrous for Microsoft; their lockdown on the PC market is complete, but this doesn't protect them from the PC market itself being slowly eaten away from the bottom by consumer electronics like the ones Sony makes. So to stave off this threat, Microsoft instead begins to grow the PC market it monopolizes downward, so that the PC (as it becomes the "Windows Media Center") begins to slowly suck up the consumer electronics market, competing directly with the Playstation, bringing the fight to Sony's door instead of Microsoft's. Since consumers wouldn't on their own be interested in a PC that supplants consumer electronics, Microsoft basically bribes them into being interested with subsidized hardware; they make a big money blackhole out of the XBox to undercut Sony's ability to maneuver with the Playstation, the way the money blackhole that was MSIE undercut Netscape's ability to maneuver.

    This is, of course, all just conjecture.

    But when I begin to see people seriously talking about the chip from the Playstation 3 eventually potentially being used in PC hardware, I begin to wonder if it's maybe reasonable conjecture...

  6. Re:Looks like we need to throw all computers out by JQuick · · Score: 5, Interesting

    The author had a good grasp of the high level architecture, but beyond that was clueless. His interpretation of the design is way off the mark.

    He seemed astonished by the 1024 bit wide data paths. The Power family is designed with cache fill lines of 128 bytes (1024 bits). So, for instance, the G5 L2 cache already fetches 128 bytes into cache for each main memory read.
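
    To put numbers on that: a small sketch of how many transfers it takes to fill one 128-byte line at different data path widths. Only the 128-byte line and the 1024-bit path come from the discussion; the 64-bit and 128-bit widths are arbitrary comparison points, and `transfers_per_line` is a made-up helper.

```python
CACHE_LINE_BYTES = 128  # cache fill line size quoted in the comment


def transfers_per_line(path_bits: int, line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Number of bus transfers needed to move one full cache line."""
    line_bits = line_bytes * 8
    # Ceiling division, so a partially used final transfer still counts.
    return -(-line_bits // path_bits)


# 64 and 128 are assumed widths for comparison; 1024 is the article's figure.
for width in (64, 128, 1024):
    print(f"{width:5d}-bit path: {transfers_per_line(width)} transfer(s) per line")
```

    With a 1024-bit path the whole line moves in a single transfer, which is why the width is less exotic than it sounds.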

    Similarly, all the talk about doing away with cache and VM is bullshit. Instead of having each vector unit interfere with a shared cache as is done today, they've simply added smaller per-ALU caches to the design, and complemented them with a device that is a souped-up cache controller/MMU (the DMAC). The DMAC apparently will be able to address both memory and other hardware through a virtual address layer, enabling reference to remote cell units as well as local physical hardware. The 64 MB of high-speed Rambus memory may be all that is required for a PS3, but in a workstation implementation that memory is L3 cache.

    Altivec currently has 32 vector registers. Each ALU has 128. It is highly likely that the core opcode architecture will remain similar. The most likely addition will be a few flow control instructions on top of the existing mix.

    Altivec is already powerful, but the biggest limiting factor is latency. Altivec can perform 1 instruction per clock on the G5. However, the pipeline is 8 levels deep, so the overhead involved in fetching data, loading registers, performing a calculation among 1-3 registers, and getting a result is prohibitively expensive. But if you can arrange to submit 8 calculations (or more) in rapid sequence, you can keep Altivec and the CPU busy and reap great benefits.
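
    The latency argument can be made concrete with an idealized cycle count. This is a back-of-the-envelope model, not a description of the real G5: it assumes an 8-stage pipeline that accepts one instruction per clock and ignores memory stalls entirely. Both function names are made up for the example.

```python
PIPELINE_DEPTH = 8  # pipeline depth quoted in the comment


def cycles_dependent(n_ops: int, depth: int = PIPELINE_DEPTH) -> int:
    """Each operation needs the previous result, so it waits out the full pipeline."""
    return n_ops * depth


def cycles_independent(n_ops: int, depth: int = PIPELINE_DEPTH) -> int:
    """Independent operations issue back to back: fill the pipe once, then one per clock."""
    if n_ops == 0:
        return 0
    return depth + (n_ops - 1)


for n in (1, 8, 64):
    dep = cycles_dependent(n)
    ind = cycles_independent(n)
    print(f"{n:3d} ops: {dep:4d} cycles dependent, {ind:3d} cycles independent")
```

    Under these assumptions, 8 dependent operations cost 64 cycles while 8 independent ones cost 15, and the gap widens as the batch grows.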

    The beauty of Cell will be in providing the ALUs with a bit more autonomy (though not much more; they are still basically vector units), and enabling the main CPU to keep doing useful work while a number of ALUs are cranking away. Other novel design features provide for communication and synchronization with other units via remote addressing and timing (that's what those realtime clock signals are all about).
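
    As a software analogy only, the "main CPU keeps working while the units crank away" pattern looks roughly like the sketch below. Ordinary Python threads stand in for the vector units, the count of 8 is the figure mentioned elsewhere in this discussion, and `vector_kernel` and `main_cpu_work` are invented placeholders; none of this reflects actual Cell hardware or tooling.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_UNITS = 8  # assumed unit count, taken from the "8 APUs" figure in the thread


def vector_kernel(block):
    """Placeholder for a program running on one unit: sum of squares."""
    return sum(x * x for x in block)


def main_cpu_work():
    """Placeholder for whatever the main core does in the meantime."""
    return "bookkeeping done"


# One small block of data per unit.
blocks = [[float(i), float(i + 1)] for i in range(NUM_UNITS)]

with ThreadPoolExecutor(max_workers=NUM_UNITS) as units:
    # Hand every block to a worker, then carry on without waiting.
    pending = [units.submit(vector_kernel, b) for b in blocks]
    status = main_cpu_work()
    # Synchronization point: collect the results in submission order.
    results = [job.result() for job in pending]

print(status, results)
```

    The main thread only blocks at the point where it actually needs the results, which is the autonomy being described.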

    This will be very fast, and very cheap. However, all the hand-waving and theorizing this guy does about both hardware and software reads like patent bullshit.