Slashdot Mirror


Prospects For the CELL Microprocessor Beyond Games

News for nerds writes "The ISSCC 2005, the "Chip Olympics", is over and David T. Wang at Real World Technologies has put up a very objective review of the CELL processor (the slides for the briefing are also available), covering all the aspects disclosed at the conference. Besides the much-touted 256 GFlops single-precision floating point performance, the CELL processor has 25-30 GFlops in double precision, which is useful enough for scientific computation. Linus seems interested in CELL, too."

16 of 246 comments

  1. I'll believe it when I see it by chris09876 · · Score: 5, Insightful

    This is a very positive review for the cell processor. It does seem like a really exciting new piece of technology. It promises a lot, and if it will do everything people say it will do, it really has the possibility to give the entire industry a big leap forward.

    That being said, I think it's important not to get too excited about it... it's hard to say if it will live up to everything that people have written about it. I'm a bit skeptical. Until I see some production units doing amazing things, I'm cautiously optimistic.

    1. Re:I'll believe it when I see it by Neil+Watson · · Score: 2, Insightful

      I too am skeptical, especially when I see Rambus mentioned. I keep looking around expecting to see a school of their lawyers circling, biding their time before a patent lawsuit frenzy.

    2. Re:I'll believe it when I see it by BobPaul · · Score: 5, Insightful

      That being said, I think it's important not to get too excited about it... it's hard to say if it will live up to everything that people have written about it. I'm a bit skeptical. Until I see some production units doing amazing things, I'm cautiously optimistic

      I'm a little bit concerned about the PowerPC Element. The article states that it's not simply a POWER5 derivative, but a core designed for high MHz at the cost of per-stage logic depth. To quote the author: "The result is a processing core that operates at a high frequency with relatively low power consumption, and perhaps relatively poorer scalar performance compared to the beefy POWER5 processor core."

      This means the PPE in the CELL @ 4GHz will not perform as well as a POWER5 would if it could reach 4GHz (but since the CELL has 8 SPEs, I would hope it performs better as a whole than a POWER5 at the same frequency). It would be interesting to know at what frequency the two are similar, but since the PPE is integrated into an extended system, this isn't something that can ever really be benchmarked.

    3. Re:I'll believe it when I see it by Anonymous Coward · · Score: 1, Insightful

      >I would hope it performs better as a whole than a POWER5 at the same frequency

      Better vector performance, but not so good for Excel.

  2. Comment removed by account_deleted · · Score: 3, Insightful

    Comment removed based on user account deletion

  3. Re:Cool, as a co-proc by The_Mr_Flibble · · Score: 2, Insightful

    But isn't the point of the Cell processor a distributed model?

    From the reviews I've seen, they are touting it as if the Cell communicates with other Cells to handle all the processor-intensive stuff, so where one Cell would not be as powerful as an x86 CPU, two Cells would be. And the way they have designed the thing is as a separate computer on a chip, so you can basically upgrade your ?? the same way you upgrade your memory.

    Or have I gotten the wrong end of the stick, and they are designing these things for pointless fun?

  4. More Cell reviews? by Anonymous Coward · · Score: 3, Insightful

    Sheesh, /. might as well make a Cell image & category, they post so many articles about it!

  5. Obviously a TROLL by gorim · · Score: 1, Insightful

    And a good one. Someone actually modded this person as Interesting. :)

    Having said that, if the original poster of this thread truly does think it's underpowered, he should provide a bit more elaboration besides a trollish reference to the IBM/Sony marketing machine.

  6. Re:Transmeta by mirko · · Score: 3, Insightful

    These are not buzzwords : ARM have been doing this for years and are a very profitable R&D company.

    --
    Trolling using another account since 2005.

  7. IBM by Anonymous Coward · · Score: 1, Insightful

    don't forget that this time ibm is part of the whole show. they aren't going to risk their reputation with cheap tricks, that's their main business after all

  8. What's the point? by jeif1k · · Score: 4, Insightful

    Unless you are computing digital orreries, whether it has 256GFlops or 256TFlops makes little difference if the memory bandwidth isn't substantially increased, and people don't increase the memory bandwidth because that has expensive consequences all over the system.

    On the whole, my impression is that current mainstream CPUs have a pretty reasonable balance between CPU power and all the other system components. Changing just the CPU without making substantial (and expensive) changes to the rest of the system will not magically give you more performance.
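
    The bandwidth argument above can be put into numbers with a simple roofline-style bound. This is only a sketch: the 256 GFlops peak is the figure from the summary, but the memory bandwidth below is an assumed, illustrative value, and `attainable_gflops` is a made-up helper name.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline bound: the lower of the compute ceiling and the memory ceiling."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


PEAK_GFLOPS = 256.0     # single-precision peak quoted in the summary
BANDWIDTH_GB_S = 25.0   # ASSUMPTION: illustrative memory bandwidth, not from the thread

for intensity in (0.25, 1.0, 4.0, 16.0):
    bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GB_S, intensity)
    print(f"{intensity:5.2f} flops/byte -> at most {bound:6.1f} GFLOPS")
```

    With these assumed numbers, a kernel doing one floating point operation per byte fetched is capped by memory, and raising the peak from GFlops to TFlops would not move that cap at all.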

  9. Re:What software will it run by Anonymous Coward · · Score: 2, Insightful

    Well, it seems to be ccNUMA. The coprocessors can access shared memory but copy to local memory to do the processing. The PPC control processor is there to set up stuff for the special processors since they're not equipped to communicate with the outside world themselves.
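
    The "copy to local memory, then process" pattern described above might look roughly like this. It is a plain-Python sketch, not Cell code: the local store capacity is an assumed toy value and `process_via_local_store` is a hypothetical name.

```python
LOCAL_STORE_ITEMS = 4  # ASSUMPTION: tiny local-store capacity, for illustration only


def process_via_local_store(shared, kernel, capacity=LOCAL_STORE_ITEMS):
    """Stage the shared data through a small local buffer, one chunk at a time."""
    for start in range(0, len(shared), capacity):
        local = shared[start:start + capacity]     # copy in (slicing makes a copy)
        local = [kernel(x) for x in local]         # compute only on the local copy
        shared[start:start + len(local)] = local   # copy the results back out
    return shared


data = list(range(10))
process_via_local_store(data, lambda x: x * x)
print(data)
```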

    The interesting thing which most commentators seem to have missed is the virtualization technology. If you're going to have Cell-based devices job out stuff to execute on any nearby Cell processors on the network, you're going to need a really good sandbox. One that's better than Java's, which isn't that good. IBM's virtualization technology is more secure than anything else I've seen out there.

  10. these are max figures by Anonymous Coward · · Score: 2, Insightful

    folks need to keep in mind these are max figures assuming software is perfectly written to take care of parallelization (does that word exist?). this means that most computer programs will hit nowhere near these rates, but super optimized versions of things like SETI-Home and an mpeg encoder/decoder could take advantage of it.

    just remember how many developers complained about the Emotion Engine from the PS2 and how it was such a bitch to program for, this will be worse. it's first gonna require a special compiler or at least a tool to fill the code to all the independent mini-procs and reorder all the instructions to take advantage of its little quirks. they seem to be a bit different from pipelines, but some of the same concepts with regards to stalls will apply. so if you're working heavily on one set of data, it's quite possible only one of these mini procs will be used, and the rest will stand there and do nothing.

    i think this is something that'll work much better on a video card and maybe a soundcard than as a main processor, except in the cases where mostly only media processing is required: settop boxes, game consoles, tvs, stereo systems, etc.

  11. 250 Gigaflops? by CTho9305 · · Score: 4, Insightful

    People seem to think this is leaps and bounds above everything else, but they're missing the details. In order to obtain that much performance, you'll need a task which parallelizes well so it can be broken up into chunks for the 8 SPEs. Graphics rendering falls into this set of tasks, but a lot of general applications just don't gain that much from parallel processors. Even when you have a task that does parallelize, writing parallel code is quite a bit harder than writing code for just a single thread of execution.
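
    To put a rough number on "parallelizes well", here is a small Amdahl's law sketch for 8 SPEs. The parallel fractions are made-up inputs, not measurements of any real workload.

```python
def amdahl_speedup(parallel_fraction, units):
    """Amdahl's law: overall speedup when only part of the work can be spread out."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / units)


# ASSUMPTION: illustrative parallel fractions, chosen only to show the trend
for p in (0.5, 0.9, 0.99, 1.0):
    print(f"{p:.0%} parallel -> {amdahl_speedup(p, 8):.2f}x on 8 SPEs")
```

    Under this model, a program that is only half parallel gets less than a 2x speedup from all 8 units, which is why peak figures say little about general applications.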

    I've seen a lot of hype about having the Cell in your laptop talk to the Cells in your desktop, microwave, and TiVo, but you have to consider real-world limitations. When you set up a network like that (presumably wireless), you're going to be limited to around 100Mbps. In computer clusters and supercomputers, one of the main limitations of performance is the communication bandwidth available between processors, and the latency of the network. To build a "home supercomputer", you not only need a task that parallelizes well, but one that doesn't require so much inter-node communication that it's held back by a slow network. You can't work around this problem with hardware magic - if the task you're working on requires lots of communication bandwidth, you're going to be held back.
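
    A back-of-the-envelope way to see the network problem is to compare the time to compute a job with the time to ship its data. The 100Mbps link and the roughly 250 GFLOPS peak come from this comment; the job size is an assumed example, and `job_time_s` is a hypothetical helper.

```python
def job_time_s(flop_count, bytes_to_move, node_gflops, link_mbps):
    """Return (compute seconds, transfer seconds) for one job sent to a remote node."""
    compute_s = flop_count / (node_gflops * 1e9)
    transfer_s = bytes_to_move * 8 / (link_mbps * 1e6)
    return compute_s, transfer_s


# ASSUMPTION: a job needing 250 GFLOP of work on 100 MB of input data
compute_s, transfer_s = job_time_s(250e9, 100e6, node_gflops=250, link_mbps=100)
print(f"compute: {compute_s:.1f} s at peak, transfer: {transfer_s:.1f} s over the link")
```

    In this made-up case the remote Cell would spend far longer waiting for the data than working on it, even before latency is counted.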

    So how much beyond a modern PC is 250GFLOPS anyway? Not much! A GeForce FX at 500MHz does 200 gigaflops. An AMD Athlon's peak performance is 2.4 GFLOPS at 600 MHz... if we scale this up to 2.2 GHz (high-end Athlon), that's 8.8GFLOPS (note: As we're talking about theoretical performance, nonlinear factors like bus speeds can be ignored). Basically, if the Cell dedicates most of its power to graphics rendering, you'll have computation power in the same range as a fast PC of today. Given that we're not going to see any products based on the Cell for a while, this isn't going to be the end of the world for Intel and nVidia (let alone the fact that Cell isn't x86).
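
    The clock scaling above is simple enough to check directly. This sketch only reuses the figures quoted in this comment and assumes, as the comment does, that peak throughput scales linearly with clock speed.

```python
def scale_peak(gflops, from_mhz, to_mhz):
    """Scale a theoretical peak linearly with clock frequency."""
    return gflops * to_mhz / from_mhz


athlon = scale_peak(2.4, 600, 2200)   # figures quoted in the comment
print(f"Athlon at 2.2 GHz: {athlon:.1f} GFLOPS peak")
print(f"Cell vs Athlon:     {250 / athlon:.1f}x")
print(f"Cell vs GeForce FX: {250 / 200:.2f}x")
```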

    Consoles using the Cell will have the advantage of only having to render for TV resolutions - at most 1080 lines, while PCs will be rendering at up to 1600x1200, but if you look at recent history, you can compare the xbox to a then-good PC with a GeForce3 (which came out at around the same time) - the xbox looked better, but PCs did catch up and surpass its performance and it didn't take all that long. Consoles have to be very high-end when they're released, because the platform doesn't change for 2-3 years, and they still need to be "good enough" after a couple years, before the next generation is released.

  12. Re:What software will it run by cow-orker · · Score: 2, Insightful

    I don't think the operating system could make much use of the APUs. The best that can be hoped for is an OS that somehow allocates apulets to the APUs, but since the APUs will work best if used as stream processors this allocation is... well... non-trivial.

    However, given a way to allocate these units to userspace programs, there are lots of programs that could benefit. X and mplayer come to mind, provided someone implements the critical code for APUs, which may well mean coding in assembly.

    What you need is a more general concept, probably at the programming language level, in which algorithms can be expressed in such a way that the operating system can detect that they can be loaded into these subsidiary processors to be executed.

    This will remain a dream for "general purpose languages" like C. However, I could imagine Parallel Haskell or something similar for the Cell. That would be way cool and could even work.

    Anyway, the architecture without adequate software is quite useless. I'm still very much interested.

  13. Re:What software will it run by ReelOddeeo · · Score: 3, Insightful

    What software will it run? Software "cells".

    A software cell runs on one of the APU's (or SPU's, or whatever we're currently calling them). It is sandboxed. When the main processor sends a software cell to one of the sub processors, it specifies exactly what memory that the hardware will allow that processor to access.

    You can run a software cell from an untrusted source. The software cell is a combination of code/data. The processor performs some function on it. While running, the sub processor has access only to the memory that the main processor designated.

    Applications like the X Window System, Xine, MPlayer, mpg123, LAME, XMMS, etc., ad infinitum, can be designed with their own software cells. In fact, entire libraries of software cells can be constructed and re-used. Libraries of multiplexors, demultiplexors, encoders, decoders, compositing, FFTs, transcoders, renderers, shaders, GIMP filters (blur, effects, etc.), etc.

    If you're building an application, such as SETI at Home, then you organize your program as software cells. You can farm out as many software cells as you have hardware cell processors to handle.
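
    As a very loose illustration of the idea, here is how code-plus-data work units with a designated memory window could be farmed out. Everything here is hypothetical: `SoftwareCell`, `run_cell` and `farm_out` are invented names, threads merely stand in for the sub processors, and nothing in this sketch enforces the sandbox in hardware the way the comment describes.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SoftwareCell:
    """Hypothetical work unit: some code plus the memory window it may touch."""
    kernel: Callable[[List[float]], List[float]]
    start: int   # first index the cell is allowed to read and write
    stop: int    # one past the last allowed index


def run_cell(cell, memory):
    window = memory[cell.start:cell.stop]   # the cell only ever sees its own window
    return cell.start, cell.kernel(window)


def farm_out(cells, memory, workers=8):
    """Run each cell on a worker (a stand-in for a sub processor), then write back."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda c: run_cell(c, memory), cells))
    for start, out in results:
        memory[start:start + len(out)] = out
    return memory


memory = [float(i) for i in range(16)]
cells = [SoftwareCell(lambda xs: [2 * x for x in xs], i, i + 4) for i in range(0, 16, 4)]
farm_out(cells, memory)
print(memory)
```

    The point of the sketch is only the shape of the program: independent units, each handed an explicit slice of memory, so the number of workers can vary without changing the cells themselves.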

    Cells can be safely shuffled from device to device. Spare cell capacity in your TV or PS3 can run your SETI at Home, or your Xine cells.

    The Cell processor isn't very helpful for, say OpenOffice.org spreadsheets or drawings, or spellchecking. But word processing isn't the function that usually needs super fire-breathing processor power.

    It is not inconceivable that things like spreadsheet calculations can be effectively improved using software cells. But this is not as obvious (at least to me) as the former applications that I mentioned.

    So if you had a 2 GHz main processor and one or more Cell co-processors (a variable, expandable number) you would have a tremendous amount of computing power. The applications that demand extraordinary power would have it -- even with just one cell coprocessor. And this was quite a list of applications I mentioned above. Just about anything audio-visual or doing massive parallel operations on pixels, or 3d.

    --

    Those who would give up liberty in exchange for security and DRM should switch to Microsoft Palladium!