Slashdot Mirror


How Sony's Development of the Cell Processor Benefited Microsoft

The Wall Street Journal is running an article about a recently released book entitled "The Race for a New Game Machine" which details Sony's development of the Cell processor, written by two of the engineers who worked on it. They also discuss how Sony's efforts to create a next-gen system backfired by directly helping Microsoft, one of their main competitors. Quoting: "Sony, Toshiba and IBM committed themselves to spending $400 million over five years to design the Cell, not counting the millions of dollars it would take to build two production facilities for making the chip itself. IBM provided the bulk of the manpower, with the design team headquartered at its Austin, Texas, offices. ... But a funny thing happened along the way: A new 'partner' entered the picture. In late 2002, Microsoft approached IBM about making the chip for Microsoft's rival game console, the (as yet unnamed) Xbox 360. In 2003, IBM's Adam Bennett showed Microsoft specs for the still-in-development Cell core. Microsoft was interested and contracted with IBM for their own chip, to be built around the core that IBM was still building with Sony. All three of the original partners had agreed that IBM would eventually sell the Cell to other clients. But it does not seem to have occurred to Sony that IBM would sell key parts of the Cell before it was complete and to Sony's primary videogame-console competitor. The result was that Sony's R&D money was spent creating a component for Microsoft to use against it."

49 of 155 comments

  1. I have altered the deal by symbolset · · Score: 4, Funny

    Pray I do not alter it further.

    --
    Help stamp out iliturcy.
    1. Re:I have altered the deal by hairyfeet · · Score: 3, Funny

      Here is the video to go with your comment for those of us that are visually oriented. It is probably how Sony is feeling after finding this out, especially the shoes part. ;-)

      --
      ACs don't waste your time replying, your posts are never seen by me.
  2. Jeez by symbolset · · Score: 3, Funny

    I didn't mean to kill the thread with the second comment, but yeah, is there something else that needs to be said to this?

    --
    Help stamp out iliturcy.
  3. a few facts please? by Anonymous Coward · · Score: 3, Interesting

    the cell is not "made from scratch". It's based on powerpc. Minus the branch prediction and some other goodies, and with additional cores specialised for numerics called "SPEs". Without the SPEs it's a piece of junk. And the xbox360's processor doesn't have the SPEs.

    This article is full of shit.

    Big deal if M$ got their hands on a crap, slow design based on the G5 powerpc, and they made it able to execute 2 threads per core and put 3 cores on a die. It has NOTHING LIKE the gigaflops of the cell.

    1. Re:a few facts please? by TheRaven64 · · Score: 5, Interesting
      Even the SPEs aren't exactly built from scratch. They're based on the VMX units from the PowerPC 970 with widened register sets and a modified memory architecture with explicit DMA commands. If the meeting in question took place, I'd imagine IBM showed Microsoft the Cell, the PowerPC 980MP, the 40x, and said 'we can do anything on this spectrum - what requirements do you have?'.

      The chip they sold to Microsoft in the end is more or less the same design as the PPU core in the Cell, but that, in turn, is an in-order variant of the 970 with a few bits from the POWER4 that were originally dropped (the 970 itself was a cut-down POWER4 with a VMX unit bolted on) re-added.

      IBM would be crazy not to reuse parts of old designs on any new one. They've spent hundreds of millions of dollars creating a library of CPU designs that fit anywhere from a mobile phone to a supercomputer. You're very unlikely to have a set of requirements that they can't meet with a tweaked version of one of their existing designs, and if you really need them to work from scratch then you probably can't afford the final product.

      --
      I am TheRaven on Soylent News
    2. Re:a few facts please? by TheRaven64 · · Score: 4, Informative

      They are very different approaches. The 360's CPU is basically a 3-core, 6-context, in-order variant of the POWER4 with a vector unit. In terms of pure number crunching ability, it's pretty pathetic next to the Cell. On the other hand, it is based on a model that we have spent 30 years building compilers for. You only need to write slightly-parallel, conventional code to get close to 100% of the theoretical performance out of it.

      In contrast, the Cell has one PPU which is roughly equivalent to one context on the 360's CPU (somewhere between 1/3 and 1/6 of the speed). It also has 7 SPUs. These are very fast, but they're basically small DSPs. They have very wide vector units and are limited to working on 256KB of data at a time. You can use them to implement accelerator kernels for certain classes of algorithm, but getting good performance out of them is hard.

      In terms of on-paper performance, the Cell is a long way out in front, but it is a long way behind in ease of programming, meaning that you generally get a much smaller fraction of the maximum throughput.

      --
      I am TheRaven on Soylent News
    3. Re:a few facts please? by ZosX · · Score: 2, Informative

      It was the same problem with the PS2. It took developers a few good years to really start to push the hardware. Look at some of the later games that really push the envelope, like, say, Final Fantasy XII or Shadow of the Colossus. The PS2 was certainly capable of some nice visuals, but the other consoles were ultimately superior while basically using off-the-shelf hardware. Developers were pushing the Xbox and the Gamecube nearly from day one. I think the Cell has backfired, but not for the reason that Microsoft shares aspects of their core. Parallel processing is indeed the future, but not in the form of vector units, but rather general purpose chips. The one-size-fits-all approach is inefficient, but at the same time it has been the approach that has worked to fit the needs of modern computer users. Hardware should get easier to program on over time, certainly not harder. What happened to those predictions that in the future the average user will be able to code just by throwing some GUI elements together and maybe even describing the program to the computer a bit and having it generate the program for you? How far away are we from that day? (It seems an awfully long way away, and the visual IDE is not the same as what I am describing here.)

    4. Re:a few facts please? by Mr+Z · · Score: 2, Interesting

      I came here to make pretty much the same point. IBM has a habit of reusing the same microarchitecture with tweaks to run different variants of the POWER or PowerPC instruction set as needed to fit a particular application niche.

      I suspect IBM didn't say specifically "Here's the Cell Broadband Architecture and what it can do." Rather, since the Xbox360's CPU doesn't have any SPEs, I imagine the presentation had more to do with what the PPU would be capable of, and was part of the IBM processor roadmap anyway.

    5. Re:a few facts please? by Michael+Hunt · · Score: 4, Informative

      Do you have even a vague understanding of what 'transform' and 'lighting' actually mean? Allow me to elucidate.

      'transform' refers to the act of converting vertex positions in model space (the coordinate system used in the vertex buffers) to clip (camera) space. This is typically one matrix * vector multiplication per vertex; the vertex's position in model space is multiplied by the world-view-projection matrix. On modern hardware, this is generally done in the vertex program (other things may be done to the vertex's position before or after the co-ord transform, mind you, such as multiplication by a set of bone matrices for hardware animation, etc.)
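      To make the "one matrix * vector multiplication per vertex" concrete, here is a minimal pure-Python sketch. It assumes a column-vector convention and uses a made-up translation matrix in place of a real world-view-projection matrix; the function names are illustrative and do not come from any graphics API.

```python
def mat_vec_mul(m, v):
    """Multiply a 4x4 matrix (given as a list of rows) by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def transform_vertex(wvp, position):
    """Take a model-space (x, y, z) position to clip space using one matrix."""
    x, y, z = position
    # Positions get w = 1 so that the translation column of the matrix applies.
    return mat_vec_mul(wvp, [x, y, z, 1.0])


# Hypothetical stand-in for a world-view-projection matrix:
# a plain translation by (2, 0, -5), chosen only to keep the numbers readable.
EXAMPLE_WVP = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -5.0],
    [0.0, 0.0, 0.0, 1.0],
]

clip = transform_vertex(EXAMPLE_WVP, (1.0, 1.0, 1.0))
print(clip)  # [3.0, 1.0, -4.0, 1.0]
```

      A real vertex program would run this once per vertex with a proper perspective projection folded into the matrix; the arithmetic per vertex is the same.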

      'lighting' refers to the process of deciding the colour of each fragment ('potential pixel'). Before programmable graphics hardware, this was done by taking the dot product of the vertex normal and the light vector (or position, depending on light type), and multiplying it by the light's diffuse colour. The resulting colour intensity was then linearly interpolated across the face between vertices, and used to light the texture in conjunction with the ambient term. With modern programmable hardware, lighting is usually done per-fragment based on a normal map, which is input as a second texture to the fragment program. The light position is converted from object (or world) space into 'tangent' space, which is a coordinate system whose basis vectors are parallel and orthogonal to the plane being lit, and the surface is lit based on the dot product of the light vector in tangent space and the normal from the normal map.
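      The fixed-function diffuse term described above can be sketched in a few lines of Python. This is only the per-vertex "normal dot light vector times diffuse colour" step; the clamp to zero for surfaces facing away from the light is an assumption added here (it is the usual convention), and the interpolation, ambient term and tangent-space normal mapping are left out.

```python
import math


def normalize(v):
    """Scale a 3-component vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def diffuse(normal, to_light, light_colour):
    """Diffuse colour at a vertex: max(0, N . L) times the light's diffuse colour.

    `to_light` points from the surface towards the light. The max(0, ...)
    clamp is an assumption of this sketch, not something stated in the post.
    """
    n_dot_l = max(0.0, dot(normalize(normal), normalize(to_light)))
    return [n_dot_l * c for c in light_colour]


facing = diffuse((0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (1.0, 0.5, 0.25))
away = diffuse((0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (1.0, 0.5, 0.25))
print(facing)  # [1.0, 0.5, 0.25]
print(away)    # [0.0, 0.0, 0.0]
```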

      Back in the bad old days, when men with beards owned IRIX boxes and everybody else had a TNT2 or worse, transform and lighting were done in software for most folks, by a client of the rendering system, before the primitives were submitted as draw calls to the rendering system. Around 1999-2000, cards with hardware T&L, such as the GeForce 256, showed up in the PC space. These cards were the first consumer 3D hardware to perform fixed-function transform and lighting (roughly as I've described it above) in silicon. The API didn't change much, although there was a DirectX version bump (6 to 7). OpenGL programmers didn't really notice; the library itself, obviously, had to know if it was talking to a fixed-function card or a dumb card, but most OpenGLs were provided by hardware vendors in any event, so this wasn't a factor.

      Fast forward to today, everybody's using hardware which allows parts (most, these days) of the rendering pipeline to be replaced entirely with programs written by the engine developer (or even the artist, in some cases.) Transforming vertices can be done in conjunction with all manner of other crap, and lighting can be done using whatever model the programmer/artist desires. Regardless, however, it's all done in the same pipeline on the GPU. If the SPUs, as you suggest, were pre-transforming and pre-lighting vertices before writing them to 3d hardware's vertex buffers, then all you'd get is some really confused 3d hardware. RSX (the 3D chip loosely based on nvidia's G70 architecture) has 8 vertex pipelines and 24 fragment pipelines, all programmable. This is more than enough power to do significantly more with each vertex than simple transformation, and enough power to perform even complex effects such as steep parallax-mapped lighting in the fragment pipeline.

      In conclusion, while Xenos (360's GPU) may or may not be better than RSX, RSX is CERTAINLY more than powerful enough to handle its own T&L. Cell's SPEs are, at least on some level, a compromise between the massively data-parallel yet somewhat braindead pipelines of a GPU and the more-or-less serial yet significantly intelligent threads of a modern CPU. They'd be great for accelerating physics (Bullet, I believe, has a Cell backend) or AI, but really add fuck all to the 3D rendering side of the console.

    6. Re:a few facts please? by Glonk · · Score: 3, Informative

      While it certainly sounds like you know what you're talking about, it's pretty clear to anyone with a game-dev background you do not.

      Cell's SPEs are actually PRIMARILY used as aids to graphics processing (T&L) by most developers. Look into how games like Heavenly Sword use the SPEs as part of its "faux" HDR or games like Killzone 2 use SPEs to implement deferred rendering for awesome smoke effects. The SPEs are, in PRACTICAL TERMS to PS3 game developers, very essential to the 3D rendering side of the console.

      While RSX is "powerful enough" to do its own T&L, it cannot compare to the standalone power of the 360's Xenos chip. There are many reasons for this (6 fixed vertex shaders on RSX vs the unified shaders on the 360, which permit far higher vertex workloads; the RSX's limited bandwidth vs the 360's eDRAM bandwidth; and triangle setup rates). On the PS3, developers need to leverage Cell in intelligent ways to draw comparable graphics to the 360. If an intelligent and determined PS3 developer really leverages Cell, it can make unparalleled graphics in the console world. The problem is, it costs a fortune in time and money to do it and very few developers can. It's simply not worth it to even attempt it for most developers.

      As a sidenote, Cell is not at all good for most game AI for many reasons (not the least of which is the lack of branch predictors in the SPEs).

      Additionally, people keep making the mistake of assuming the PPU in the Cell is basically the same as each core in the 360's CPU. That's not at all true. There are some significant differences, including native Direct3D format support in the 360's CPU and the new VMX128 vector units (which have 128 registers per context per core [6 total], vs 32 on the PPU), as well as additional instructions specifically tailored towards 3D games (like single-cycle dot-product instructions). The combined triple VMX128 units on the 360 are still faster than most quad-core Core i7s in vector processing, so I'm perplexed by the notion, which I've read from some people, that it's somewhat slow or underpowered.

      If you're truly interested in how PS3 games use Cell, check out the Beyond3D community where PS3 developers post in detail about how they do what they do. And Cell is a major factor in 3D rendering on the PS3. It has to be.

  4. I don't think it's quite as they tell it by Anonymous Coward · · Score: 3, Interesting

    I'm not a games console programmer, but I understood that the 'core' of the Cell and the chip used in the XBox 360 are both derivatives of the standard PowerPC chip. This smells like a couple of trolls being mischievous. IBM can do what they like with PowerPC, and that includes selling it to both Microsoft for the XBox 360 and to Nintendo to power the Wii.

    Sony's payback comes when Playstation3 programmers learn to fully utilize the Cell architecture.

    1. Re:I don't think it's quite as they tell it by Bastard+of+Subhumani · · Score: 4, Funny

      Sony's payback comes when Playstation3 programmers learn to fully utilize the Cell architecture.

      It has direct hardware support for rootkits.

      --
      Only three things are certain; death, taxes, and apocryphal quotations - Ben Franklin.
    2. Re:I don't think it's quite as they tell it by sleeponthemic · · Score: 3, Insightful

      Sony's payback comes when Playstation3 programmers learn to fully utilize the Cell architecture.

      That statement is fast becoming another hallowed urban myth of gaming.

      Techspecs aside, do you really believe the hype when absolutely nothing has come out on ps3 that blows the 360's capabilities away? Haven't they had enough time? Where is the practical proof? Folding at home performance? Not really applicable.

      Not to mention the fact that no developer making cross platform games is going to go very much further on a ps3 version. There's just simply no point.

      --
      I record my sleeptalking
    3. Re:I don't think it's quite as they tell it by Richard_at_work · · Score: 2

      Sony's payback comes when Playstation3 programmers learn to fully utilize the Cell architecture.

      When's that going to be? How long is the customer base willing to wait for the developers to get their act together?

    4. Re:I don't think it's quite as they tell it by Galactic+Dominator · · Score: 3, Interesting

      As an owner of both consoles, MGS4 is the only ps3 game that even rivals the good 360 games in terms of graphics. It is nowhere close to blowing 360 games out of the water. However, the ps3 doesn't lock up nearly as much and kicks ass at folding.

      --
      brandelf -t FreeBSD /brain
    5. Re:I don't think it's quite as they tell it by Anonymous Coward · · Score: 2, Interesting

      I'm not a games console programmer, but I understood that the 'core' of the Cell and the chip used in the XBox 360 are both derivatives of the standard PowerPC chip.

      There is no such thing as a "standard PowerPC chip." PowerPC and POWER are architectural specifications, and there are a wide variety of implementations of those specifications, ranging from embedded system-on-chip CPUs all the way to supercomputer processors.

      The story here is that IBM created a specific PowerPC implementation which serves as the "Power Processing Element" in the Cell implementation in the Playstation 3 and then sold the same implementation to Microsoft for the Xenon CPU in the Xbox360. Note that the PS3's PPE and the Xenon's cores are not completely identical. The PPE has the AltiVec SIMD instruction set also found in some of Freescale's and IBM's other PowerPC CPUs, while the Xenon uses a modified/extended version called VMX128.

      Cell, by the way, is also only a specification - the PS3's processor is one implementation of it. There are Cell implementations with fewer SPEs for cost- and power-sensitive applications, and IBM are making CPUs based on an updated Cell spec for supercomputing applications.

  5. And they both stole from Apple and Nintendo? by Sarusa · · Score: 5, Insightful

    This is really kind of misleading. The PowerPC, which is at the core of the Cell and is what MS uses as the cores of the Xbox 360, has been IBM's baby for years.

    The Xbox 360 uses 3 of the cores. The Cell uses one of the cores plus 8 SPEs (6 of which you can actually use in a game). If you will recall, the Wii uses a PowerPC too, a slightly beefed up Gamecube CPU which IBM made for Nintendo even before they made Cell. And of course Apple used to use PowerPCs (and IBM itself did and does, for servers).

    Anyhow, without the Cell's SPEs, there's not a lot to really 'steal'. The lack of SPEs is what makes the Xbox 360 so easy to program for, but the SPEs are what really define the Cell and make it such a floating point crunching monster (better suited for supercomputing than writing video games for in my opinion, and that's not intended as a dis here).

    1. Re:And they both stole from Apple and Nintendo? by snowgirl · · Score: 2, Interesting

      I still think the issue is x86 (windows) compatibility, portable support and never to forget the lack of small time developer support from IBM as the reason of Apple's switch to Intel.

      X86 (windows) compatibility had little to do with it. Apple for a long time was trying to get good Desktop and Laptop performance from the PowerPC architecture. The design is so vastly superior to x86 that it's much easier and more cost-effective. Unfortunately, because windows has dominated so strongly, and has only really been available for the x86 (the x86-64 is just a carry-on of the x86 design), the most money was spent on improving that.

      There are really only three ways to do things: design well upfront and save costs on furthering that design in the future; use something that already exists and just pump money at it when you need to; or get crazy about design and assume that people will pick up the costs later, because the benefits of doing so will outweigh the costs.

      The first is PowerPC, the second is x86, the third is Itanium. Here's the problem, what is already out there is cheap. Throwing money occasionally beats out upfront costs. A good design minimizes later costs, but only if you have the money to throw at it. Lastly, the last one just doesn't work because it's high upfront costs, and high maintenance cost. (The Itanium now goes the way of the chip in this post... to obscurity...)

      Apple doesn't have the market share to throw money at things. As a result, they get better benefit by using what is already out there, even though they had already paid for the design. While the PowerPC design is way better fundamentally, the money poured into the x86 has just been ridiculous.

      Apple was also completely unsatisfied with the PowerPC producers. Motorola wasn't interested in producing anything but embedded chips. The G4 was totally an embedded chip, that's why Apple laptops when the G4 came out were way better than x86 laptops until Intel and AMD wised up and started making power-efficient chips. It's also why the TiBooks had 5 hours of run time compared to the Intel books at like 2-3 hours at the time.

      So, Apple went to IBM to get a Desktop chip. They worked on the G5, finally pushed it out, and it was an awesome chip. It has really good power-per-watt and performance. There's a reason why Apple started transitioning the G4 laptops first. They could get a 4-times speed increase out of moving from the G4 to an Intel chip. The PowerMac however would have only gained a 50% speed boost at the time of the original announcement. Later Intel development pushed this margin up high enough when they switched the desktops, and frankly, no one really cared that much about switching to x86 at that time... it was old news.

      So, the reason Apple went to Intel is because other people, like say.... Microsoft... were pouring money into the x86 development that Apple couldn't match. IBM was more concerned about the big iron aspect of the PowerPC, and Motorola more the embedded aspect of it. Apple just couldn't afford to spend the money on the desktop market. I'm happy to see the consoles taking over PowerPC development, and hope that leads to the same selective pressures that the Intel chips have. Maybe Apple might be tempted to return........

      --
      WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
  6. OTOH by symbolset · · Score: 3, Insightful

    It looks like the engineers who actually make stuff are in charge. I know that's not as good to you as lawyer-based engineering, but some of us prefer physics-based engineering, for spice. OK?

    Please don't sue me.

    --
    Help stamp out iliturcy.
  7. It also helped MS by Sycraft-fu · · Score: 4, Informative

    Because it was a really misdirected effort when it came to a console. Sony really had no idea what the hell they were doing as far as making a chip for their console. Originally, they thought the Cell would be the graphics chip. Ya well turned out not to be near powerful enough for that, so late in the development cycle they went to nVidia to get a chip. Problem was, with the time frame they needed, they couldn't get it very well customized.

    For example in a console, you normally want all the RAM shared between GPU and CPU. There's no reason to have them have separate RAM modules. The Xbox 360 does this, there's 512MB of RAM that is usable in general. The PS3 doesn't, it has 256MB each for the CPU and GPU. Reason is that's how nVidia GPUs work in PCs and that's where it came from. nVidia didn't have the time to make them a custom one for the console, as ATi did for Microsoft. This leads to situations where the PS3 runs out of memory for textures and the 360 doesn't. It also means that the Cell can't fiddle with video RAM directly. Its power could perhaps be better used if it could directly do operations at full speed on data in VRAM but it can't.

    So what they ended up with is a neat processor that is expensive, and not that useful. The SPEs that make up the bulk of the Cell's muscle are hard to use in games given the PS3's setup, and often you are waiting on the core to get data to and from them.

    It's a neat processor, but a really bad idea for a video game console. Thus despite the cost and hype, the PS3 isn't able to outdo the 360 in terms of graphics (in some games it even falls behind).

    I really don't know what the hell Sony was thinking with putting a brand new kind of processor in a console. I'm willing to bet in 10 years there are compilers and systems out there that make real good use of the Cell. However that does you no good with games today.

    Thus we see the current situation of the PS3 having weak sales compared to the 360 and Wii. It is high priced, with the idea that it brings the best performance, but that just doesn't bear out in reality.

    1. Re:It also helped MS by master_p · · Score: 2, Informative

      I don't think the PS3 has the problem you mention (SPEs not being able to work directly on VRAM). It has a huge bus and it can easily stream huge amounts of texture data from the SPE cores to the GPU.

    2. Re:It also helped MS by Sycraft-fu · · Score: 4, Informative

      Specs on it I see show the system bus as being around 2GB/sec. That's comparable to PCIe (about the same as an 8x connection). While that's fine, it isn't really enough to do much in terms of back and forth operations. You'll find on a PC if you try that things get real slow. You need to send the data to the graphics card and have it work on its own RAM.

      Now that isn't to say that you can't do things to the data before you send it, but then that's of limited use. What I'm talking about is doing things like, say, you write code that handles some dynamic lighting that the CPU does. So it goes in and modifies the texture data directly in VRAM. Well you can't really do that over a bus that slow. 2GB sounds like a lot but it is an order of magnitude below the speed that the VRAM works at. It is too slow to be doing the "read data, run math, write data, repeat a couple million times a frame" sort of thing that you'd be talking about.

      You see the same sort of thing on a PC. While in theory PCIe lets you use system memory for your GPU transparently, in reality you take a massive hit if you do. The PCIe bus is just way too slow to keep up with higher resolution, high frame rate rendering.

      So while it's fine in terms of the processor getting the data ready and sending it over to the GPU (which is what is done) it isn't a fast enough bus to have the SPEs act as additional shaders, which is what they'd probably be the most useful for.

    3. Re:It also helped MS by hptux06 · · Score: 2, Informative

      2GB/s? The RSX is on the FlexIO bus, giving it ~20GB/s to play with according to specs.

    4. Re:It also helped MS by Zixx · · Score: 5, Informative

      For example in a console, you normally want all the RAM shared between GPU and CPU. There's no reason to have them have separate RAM modules. The Xbox 360 does this, there's 512MB of RAM that is usable in general. The PS3 doesn't, it has 256MB each for the CPU and GPU. Reason is that's how nVidia GPUs work in PCs and that's where it came from. nVidia didn't have the time to make them a custom one for the console, as ATi did for Microsoft. This leads to situations where the PS3 runs out of memory for textures and the 360 doesn't. It also means that the Cell can't fiddle with video RAM directly. Its power could perhaps be better used if it could directly do operations at full speed on data in VRAM but it can't.

      Being a (former) PS3 and 360 dev, I have to say this is not true. Let's start with the memory split. Both consoles have about 20GB/s of memory bandwidth per memory system. Only the PS3 has two of them, giving it twice the memory bandwidth. The 360 compensates for that by having EDRAM attached to the GPU, which removes the ROP's share from your bandwidth budget. Especially with a lot of overdraw, the bandwidth needed by the ROPs can get huge (20GB/s, anyone?), so this would be a nice solution were it not for two things: the limited EDRAM size and the costs of resolving from EDRAM to DRAM.
      The RSX can also read and write both to XDR (main memory) and DDR (VRAM), giving it access to all of memory. The reason it is tighter on texture memory is because the OS is heavier.

      About access to VRAM, it is true that reading from VRAM is something you don't want the Cell to do. It's a 14MB/s bus, so it is of no practical use for texture data. Writing into VRAM is actually pretty ok, as it's at 5GB/s, which is more or less achievable without trouble. At 60fps that's more than 80MB per frame.
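      The per-frame figure above is just bandwidth divided by frame rate. A quick sketch of that arithmetic, assuming decimal units (1GB = 10^9 bytes, 1MB = 10^6 bytes) and taking the 5GB/s and 14MB/s figures from this post at face value:

```python
MB = 1_000_000          # decimal megabyte, an assumption of this sketch
GB = 1_000_000_000      # decimal gigabyte, likewise


def bytes_per_frame(bandwidth_bytes_per_sec, fps):
    """How many bytes a bus can move during one frame at the given frame rate."""
    return bandwidth_bytes_per_sec / fps


# Cell writing into VRAM at 5GB/s, 60 frames per second.
write_budget_mb = bytes_per_frame(5 * GB, 60) / MB

# Cell reading back from VRAM at 14MB/s, 60 frames per second.
read_budget_mb = bytes_per_frame(14 * MB, 60) / MB

print(f"write budget: {write_budget_mb:.1f} MB per frame")  # about 83.3
print(f"read budget:  {read_budget_mb:.2f} MB per frame")   # about 0.23
```

      So the write path allows a little over 80MB per frame, while the read path allows well under a megabyte, which is why reading texture data back is impractical.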

      In general, both design teams made sound decisions. The 360 has a significant ease-of-use advantage to PC developers with DirectX experience. The PS3 on the other hand is a lot more to-the-metal, but allows for some pretty crazy stuff. Sadly, most development these days is cross-platform, so you won't see a lot of Cell-specific solutions. It's just not cost-effective.

    5. Re:It also helped MS by xero314 · · Score: 4, Interesting

      There are a number of errors in the comment above and a number of oversights.

      First it is true that the Graphics processing of the PS3 was originally intended to be handled by a Cell processor, but this is not the same as saying the Cell processor was built to be a graphics processor. The original specs for the PS3 included 4 fully functional cell processors. This would have meant that there would be no need for dedicated GPU. Time and cost made this configuration prohibitive.

      The reason the PS3 does not have dedicated memory is because it is a very different design. First the PS3 contains a very high speed data bus, which allows the system to keep its lower amount of memory full of the data it needs at any given time, with no need to store data not actively in use. Secondly the GPU in the PS3 has direct access to almost all of the memory in the system (480MB to be exact). It's just not the same picture that some people would like to paint. Dedicated memory has its advantages (which is why all high end PC GPUs have such).

      Now the reason that Sony, Toshiba and IBM designed the Cell and crammed it into a PS3, prematurely, is ingenious, but we won't see this for a number of years. The Cell processor is designed from the ground up to work effectively as a single node of a multi processor system. This means that you can include more than one, utilize the same code, and get a much faster program rate. What this means is that for computing today you can use a single Cell processor and have a fast machine. In the future you can have a machine with 4, 8, 16, or more cell processors and have an unbelievably fast machine. On top of that speed you also get a very energy efficient machine. Take a look at the top 500 supercomputer list to see what a difference the cell processor makes. Putting it in the PS3 on the other hand was a good move because it meant mass production and greatly reducing costs so that they can finally build the system they want in the next console generation.

      Ok I'm too tired to finish this, but as you can see if you look, the cell is an interesting chip with great potential, and has already surpassed other chips in a number of applications.

    6. Re:It also helped MS by hptux06 · · Score: 2, Informative

      The SPEs that make up the bulk of the Cell's muscle are hard to use in games given the PS3's setup, and often you are waiting on the core to get data to and from them.

      While I agree the SPEs are a pain in the neck to program for, one of their redeeming features is that they use asynchronous IO when writing/reading to/from main memory. One of the key design points with Cell was that modern processors spend an enormous amount of time waiting on memory operations to complete, something that gets worse when you introduce extra processors competing for memory cycles. An individual SPE can be reading in one set of data, writing back another, and processing a third all at the same time, there's no need to wait on data transfers.
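      The overlap of transfer and compute described above is usually called double buffering. The following is only a rough analogy in Python, not SPE code: a worker thread stands in for the asynchronous DMA engine, `fetch` is a hypothetical stand-in for a DMA read into local store, and the write-back stage is not modelled.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(data, start, size):
    """Hypothetical stand-in for an asynchronous DMA read of one chunk."""
    return data[start:start + size]


def process_double_buffered(data, chunk_size, kernel):
    """Apply `kernel` to every element while the next chunk is fetched in the background."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as dma:
        # Start the first transfer before any computation begins.
        pending = dma.submit(fetch, data, 0, chunk_size)
        for start in range(0, len(data), chunk_size):
            current = pending.result()  # block only if the transfer has not finished yet
            next_start = start + chunk_size
            if next_start < len(data):
                # Kick off the next transfer, then compute on the current buffer.
                pending = dma.submit(fetch, data, next_start, chunk_size)
            results.extend(kernel(x) for x in current)
    return results


doubled = process_double_buffered(list(range(10)), 4, lambda x: x * 2)
print(doubled)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

      The point of the pattern is that the compute step rarely has to wait, as long as each transfer takes no longer than processing the previous chunk.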

      Granted though, this is only actually useful in limited situations, they're rubbish for general logic operations.

    7. Re:It also helped MS by MemoryDragon · · Score: 2, Interesting

      Well, the idea of Sony was to advance the PS2 design further, in my opinion a broken design having two SIMD vector processors doing everything.
      They probably wanted to reduce the design down to one SIMD processor doing everything.
      Sony's design philosophy seems to be that vector processing is everything and you don't need multithreading anyway. The problem with this approach is that even for modern games you need a good mix of everything: good vector processing for graphical stuff and physics, good general purpose processing, and multiple cores for the application logic. I personally think Microsoft's approach, despite the obvious quality problems of the console (which stem mostly from Microsoft internal management decisions; you can read up on that), is dead on, while Sony's approach was totally broken even on the PS2. You may remember the developers' complaints about how hard it was to handle the PS2; if it weren't for the success of the hardware among customers, developers probably would have neglected it, since pushing application logic into dual vector processors is a rather hard task. But tools covered that over time!

    8. Re:It also helped MS by Otis_INF · · Score: 2, Interesting

      It also means that the Cell can't fiddle with video RAM directly. Its power could perhaps be better used if it could directly do operations at full speed on data in VRAM but it can't.

      Everyone who has written assembler code for an Amiga 500 knows that this isn't true: if you have multiple processors fiddling with data in video memory, they also share a bus, and that sharing is precisely what makes it slow, at least compared to memory that is reserved for one processor.

      Microsoft's 512MB memory runs at a very slow speed compared to the 3GHz frequency the PS3 cpu memory runs on. It's not a surprise why this is: the bus is shared: display hardware, video chip, main cpu, all have to utilize a bus to the same memory. To schedule all these requests, you have to use even/odd cycle schemes or similar, you can't use the bus all for one chip. 'DMA' only helps you if you own the bus to the memory, which is what the PS3 hardware gives you: very fast data crunchers in the CPU space and a videochip which can do whatever it wants in videomemory.

      That the PS3 runs out of texture memory is not really an argument either: one can easily generate/unpack textures in CPU memory for use in video memory, as the speed difference is significant. What happens, though, is that multi-platform engines in general tend to write most stuff in shaders and want to use a big main memory block for texture memory, as that's easier and 'it works on xbox'. Porting it over to PS3 works, unless you need more than a given threshold of texture memory, which causes problems that aren't easy to solve.

      It's partly laziness really: you have to solve it once and you can re-use the engine for multiple games on the multi-platforms you want to support. The question is: do you want to write that special SPU-using optimization code or not? More and more studios are willing to do so. Not because they want to, but because they have to: once Sony starts releasing more and more games exclusively for PS3 developed using their maturing engines (e.g. KZ2, Uncharted 2 etc.), keeping up with that for a multi-platform game really requires that PS3 optimizations are in place, otherwise the multi-platform game will suck in comparison with the PS3 exclusives. As Sony owns more studios than MS and Nintendo combined, this is a matter of time.

      --
      Never underestimate the relief of true separation of Religion and State.
    9. Re:It also helped MS by Otis_INF · · Score: 3, Insightful

      Well the idea of Sony was to advance the PS2 design further, in my opinion a broken design having two SIMD Vector processors doing everything

      It's not broken, it's just an advanced system so a developer who wants to write really fast code has to know how it works. If you look at God of War 2 for example, what the engine can do on a system with 32MB of RAM and a pretty slow CPU, it really shows that a skilled developer who knows what s/he's doing can get the results desired.

      I.o.w.: a 'lamer' can't get the performance desired. Well, what a shame, ain't it? If one really understands what it takes to write 3D engine code, it shouldn't be hard to understand that what the PS3 offers is in theory not really broken, but an opportunity to really get results which are beyond what one could imagine.

      Sure it's hard to write that code, but that's no different from writing solid, performing, scalable data-access code for example. It doesn't require thousands of developers to write that code: only a few are required, they can write the hard part, the rest of the developers can build on top of that. After all, a game is often mostly written in a script-like language of the engine or C/C++ utilizing engine libraries, not a lot of people developing games are really writing engine cores.

      --
      Never underestimate the relief of true separation of Religion and State.
    10. Re:It also helped MS by drinkypoo · · Score: 3, Insightful

      I really don't know what the hell Sony was thinking with putting a brand new kind of processor in a console. I'm willing to bet in 10 years there are compilers and systems out there that make real good use of the Cell. However that does you no good with games today.

      It IS a bit hilarious isn't it? The Playstation murdered the Saturn in part because it was easier to develop for, with one CPU (a MIPS core at that!) and one graphics chip. Then Sony completely blew it with the PS2, made the most complicated video game console to program for ever and Microsoft made huge inroads. Then they blew it again with the PS3. A majority of developers willing to speak on such issues despise both systems. Sony would be out of the video game market completely at this point if it weren't for Xbox 360 RROD.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    11. Re:It also helped MS by Vellmont · · Score: 2


      It's not broken, it's just an advanced system so a developer who wants to write really fast code has to know how it works.

      Presumably spoken by such a developer. That's great if you're a super-star and want to develop specifically for that machine. But what about the game development company who's looking to make a buck? Superstar developers are by definition, rare. Or maybe you ARE a superstar, and care more about cross-platform, or productivity?

      Designing a system that requires the developers to be extraordinary (or at least SOME to be extraordinary) is a broken model from the game company's perspective, and from the gamer's perspective. Harder to develop for = fewer developers = fewer high quality games = less money. So how is this not "broken" for anyone except the highly talented willing and able to learn the specifics of this particular system?

      --
      AccountKiller
    12. Re:It also helped MS by Anonymous Coward · · Score: 2, Interesting

      OP is talking about gigabytes per second (GB/s), not gigabits per second (Gb/s). 20 Gb/s = 2.5 GB/s. Everything the OP says is accurate. 20 gigabit isn't that fast, especially for an internal bus.
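
      If the arithmetic helps, this is all the conversion amounts to. A tiny Python sketch; the function name is just something made up for the example:

```python
BITS_PER_BYTE = 8


def gigabits_to_gigabytes(gbit_per_s):
    """Convert a rate in gigabits per second (Gb/s) to gigabytes per second (GB/s)."""
    return gbit_per_s / BITS_PER_BYTE


print(gigabits_to_gigabytes(20))  # 2.5
```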

    13. Re:It also helped MS by faragon · · Score: 2, Insightful

      Microsoft's 512MB memory runs at a very slow speed compared to the 3GHz frequency the PS3 cpu memory runs on. It's not a surprise why this is: the bus is shared: display hardware, video chip, main cpu, all have to utilize a bus to the same memory. To schedule all these requests, you have to use even/odd cycle schemes or similar, you can't use the bus all for one chip.

      RAM access cycle interleaving works for pre-burst memories (e.g. DRAM, SRAM). Current synchronous RAMs (SDRAM, since the late nineties) operate in bursts, i.e., the address is set on the bus, and then at every clock a read (or write) operation is performed, with the next address incremented implicitly (burst transfer). So my bet is that there is no RAM cycle interleaving for modern synchronous DRAMs, as it would be very complex and nonsensical to add "interleaving logic" in between the DRAM controller and the CPUs or DMA controllers.

    14. Re:It also helped MS by TheNetAvenger · · Score: 3, Informative

      Ya well turned out not to be near powerful enough for that, so late in the development cycle they went to nVidia to get a chip.

      The funny part is that the NVidia chip Sony is using in the PS3 only exists because of Microsoft and Microsoft funding.

      On the original XBox MS Engineers worked with NVidia to create what was the technology behind the Geforce4ti. The GPU created for the XBox was the first (NVidia at least) GPU that had Pixel Shader technology.

      It was the work from the MS engineers and NVidia that created this custom GPU that NVidia took on to become the Geforce4ti (high end) and the GeforceFX (5xxx series) line of GPUs.

      It wasn't until the 8xxx series of NVidia GPUs that they abandoned the architecture that was originally co-designed and funded by Microsoft.

      This is why, when NVidia was asking for more money per GPU for MFR on the XBox GPU, MS basically told them to pound sand, as they had already helped to create and fund the entire line of PC GPUs that was giving NVidia the success they were having.

      So not only did Sony screw themselves by shoving a 'slower' Geforce 7900 into the PS3 that caused their own delays, the Geforce 7900 in the PS3 is based on designs from Microsoft engineers and MS funding that NVidia got during the original XBox development.

      Besides adding the GPU into the PS3 at a late date, Sony screwed themselves with their own problems that were beyond anything IBM was doing.

      Look at the PS3 Development tools. Even if Sony was waiting on parts from IBM, they could have at least had a mature set of development tools, instead even 'after' waiting on IBM or whatever their excuses are, their development tools sucked ass and can be argued to this day still don't properly harness the power of the Cell processor.

      So if it was just waiting on IBM, the development tools would have been done and waiting, instead, the hardware was available before even a realistic or solid set of development tools were available.

      In contrast, MS's development tools for the XBox 360 were ahead of the hardware and developers were using two G5 Macs running a custom version of Windows2003 x64 with a full set of development tools. And when the REAL XBox 360 hardware was made available to developers, they again got updated development tools from Microsoft that directly targeted the tri-core PowerPC and the MS designed ATI GPU that was optimized for the actual hardware.

      Basically MS didn't even have the XBox 360 hardware, but had development tools in the hands of game developers and even found a way to provide these on an emulated hardware configuration. - Sony could have done this, instead they screwed developers and still do, and now they blame IBM for delaying their 'precious' chip. Holy lord of the rings...

      With regard to the poster I am replying to, they are spot on with many things. The PS3 GPU is a slower version of the NVidia 7900 - this means laptops from 2005 have faster GPUs in them than a PS3. How is that for sad and scary...

      Additionally, it was MS designs (that they kept ownership of this time) on the ATI based GPU technology in the XBox 360 that set the standard for all current GPUs on the market today. It was a unified shader technology, with on chip cache for AA, and also was designed to use the shared memory architecture that the Vista WDDM model is built around.

      So every time you see a video card from ATI or NVidia with DX10, the design comes from MS engineers. (Yes NVidia didn't have access to, but used the design specifications behind the DX10 hardware specifications designed and written by Microsoft for their 8xxx and newer GPUs.)

      Technically the GPU in the XBox 360 is a DX11 based GPU that is ahead of the current generation of GPU architectures still, and won't see desktop PC equivalents until you see DX11 GPUs on the shelves. (As it has hardware WDDM 1.1 hooks that current desktop GPUs do not have.)

      I actually think the PS3 is a good gaming system for what it is. It is a good Blu-Ray player too.

      It was

    15. Re:It also helped MS by rtechie · · Score: 2, Informative

      most complicated video game console to program for ever

      Actually, the Saturn was probably the most difficult console to program for ever. Sega basically told developers "Here's some of the system calls and incomplete design docs. Have fun." and it NEVER got any better. To this day there are parts of the Sega Saturn that are basically totally undocumented. Notice how you've never seen any Saturn emulators? This is why.

      The first 2 years of the PS2 were painful, and then much better development tools arrived on the scene that handled much of the fiddly crap. Nowadays it's easy to develop for the PS2.

    16. Re:It also helped MS by SuiteSisterMary · · Score: 2, Insightful

      Almost every console that has ever failed, has failed because they screwed the developers.

      Developers jumped ship from Nintendo to Sony because Nintendo was still requiring carts, huge lead times, limited number of titles to each company, certification, and so on. Sony removed these restrictions, and developers didn't let the door hit their asses on the way out.

      Sega had a history of screwing up with developers; random hardware addons like the Sega CD, 32X and so on; if the Dreamcast had come out without several years of beaten-puppy-syndrome affecting the devs, it would have done much better against the PS2.

      When the Xbox came out, it was widely acknowledged as having the best dev tools, hands down. However, it was the new kid on the block. But Microsoft has always known that developers make the world go 'round; just ask Steve Ballmer.

      Now, Sony has become Nintendo, dictating rather than asking. There's a reason why so many formerly Sony exclusives are winding up on the 360, just like so many Nintendo exclusives wound up on the PS1.

      Of course, it also helps that the 360 is a better game machine than the PS3, for reasons other posters have pointed out; unified memory, better GPU, and so on.

      --
      Vintage computer games and RPG books available. Email me if you're interested.
  8. Well what does Sony really have to do with this? by Anonymous Coward · · Score: 2, Insightful

    I don't see how "Sonys" research money was used or really in question for any of this.
    The PowerPC that both CPUs are based on is the PowerPC 970, the processor Apple used in its G5 series. From there, the difference is that they disabled out-of-order execution and implemented SMT from the POWER5; on Sony's behalf they added 8 newly developed SIMD coprocessors known as SPEs.
    For Microsoft, well, they wrote a new version of VMX called VMX-128, something not to be found on the Cell, which still uses the old VMX (mostly Apple's design).

    If anything worked against Sony, it's been their high unit cost, their total failure to live up to the advertising (full HD at 60 fps, etc.), the absence of games, and so on.

    I bought a PS3 myself, but still today, 3 years later, the only reason I ever kept it was the ability to install Linux so it could be put to some use in the absence of games.

  9. Hmm, really? by Ecuador · · Score: 4, Interesting

    Maybe I have to read the book to get a better picture, it is possible that the article blows things out of proportion. So, I thought that the whole "deal" about the Cell are the SPE's. The Xenon CPU that powers the Xbox 360 is just a custom-made triple core PowerPC. Now, I guess the "customization" of that core is similar to what is done for the PPE of the Cell, so research there could have overlapped, but I would not think that the PPE is the "essence" of the Cell - at least that is what Sony's and IBM's own claims have made me believe.
    Additionally, I have to admit that I always thought the usage of the Cell processor a very bad (or, more precisely, very arrogant) decision. It is not just that it has many "cores"; the fact that they are asymmetric and that SPE's are not your usual general-purpose cores, was bound to make it very hard for developers to utilize them. If you wanted to develop for many platforms there is no way you would want to optimize for the SPE's when all other architectures (PC, Xbox...) use symmetric, general purpose cores. So, in my book, the Microsoft engineers knew much better what they were doing than the Sony ones. I guess they are not the same engineers responsible for gems like Me, Vista or Zune firmware.
    What I would like to know is how the modified core differs from a "classic" PowerPC core. So, if MS had not benefited at all from Cell research and got a triple-core whose cores were closer to the original PowerPC, would it be a much different CPU? Anybody know? If the answer is no, the whole discussion about MS benefiting from Sony is moot...

    --
    Violence is the last refuge of the incompetent. Polar Scope Align for iOS
    1. Re:Hmm, really? by IamTheRealMike · · Score: 2, Informative

      The XBox360 cores don't have any superscalar features, things like branch prediction, instruction re-ordering or speculative execution. That means they use much less power than a regular core (and so generate less heat), but only run branchy game logic type code at around half the speed.

    2. Re:Hmm, really? by MemoryDragon · · Score: 2, Insightful

      I agree here as well, there is nothing from the Cell design which went into the Microsoft PowerPC core. IBM's processor business nowadays is mostly to customize PowerPC processors for various customers. The design which went in from Microsoft is basically just a trimmed down G5 core with three cores, while the Cell's is a trimmed down G5 core with a load of SIMD units!

    3. Re:Hmm, really? by Anonymous Coward · · Score: 2, Interesting

      > instruction re-ordering

      I don't know if any of the current generation of console CPUs has re-ordering... and for good reason. In an ideal world the compiler would schedule instructions well (or, rather, "well enough that dicking with it further in hardware wouldn't be a productive use of silicon") However in the real world 99.999% of users aren't running gentoo — instead, they have binaries which weren't likely compiled for the exact cpu that they're running. A hardware reorderer can make significant gains in instructions/clock on code like this.

      Game consoles live outside this "real world" though — since every user is running the same CPU you can tell developers "hey, use this set of compiler options for the best instruction scheduling"

      In other words, it's easier to make ILP a problem for the software folks.

      > branch prediction

      BP is kind of a mixed bag... a naive BP (backwards=taken, forwards=not taken) still gets it right most of the time. For really critical code, reordering the code blocks to fit those rules isn't too hard. gcc makes it very easy with __builtin_expect(). Not sure if MS's Visual suite has anything similar... does anyone know?
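
      For anyone who wants to see the naive rule in action, here is a small Python sketch. The branch trace is entirely made up (a loop-closing backward branch plus a forward error check), and the addresses and function names are hypothetical; real game code may behave quite differently.

```python
def btfn_predict(branch_pc, target_pc):
    """Static rule: predict 'taken' when the branch jumps backwards."""
    return target_pc < branch_pc


def accuracy(trace):
    """trace: list of (branch_pc, target_pc, actually_taken) tuples."""
    correct = sum(btfn_predict(pc, tgt) == taken for pc, tgt, taken in trace)
    return correct / len(trace)


# Hypothetical trace: a loop branch at 0x40 jumping back to 0x10, taken 9 times
# and then falling through once, plus a forward error-check branch at 0x20
# that is never taken across the 10 iterations.
loop_branch = [(0x40, 0x10, True)] * 9 + [(0x40, 0x10, False)]
error_check = [(0x20, 0x80, False)] * 10

print(accuracy(loop_branch + error_check))  # 0.95
```

      On this toy trace the only miss is the final loop exit, which is the usual argument for why such a simple rule does well on loop-heavy code.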

      > speculative execution

      That's one thing I'm surprised to see missing -- I'd expect it to be a big win for a design like this. I guess they didn't want to do a whole register renaming scheme? I'd be very interested to see what percentage of cycles are lost to mispredicted-branch stalls in real game code. I bet you're right: it's probably high.

      I trust IBM's CPU designers though... I'm sure they simulated it extensively both ways. Either that or it would have really blown the thermal budget for the core.

  10. EXCUSES~1 .. by rs232 · · Score: 2, Interesting

    "Maybe I have to read the book to get a better picture, it is possible that the article blows things out of proportion. So, I thought that the whole "deal" about the Cell are the SPE's. The Xenon CPU that powers the Xbox 360 is just a custom-made triple core PowerPC", Ecuador

    Well unless you know different we'll just have to take the points raised in the article as accurate. And if the CELL was just a custom-made core then why the need to commit $400 million over five years?

    "I agree here as well, there is nothing from the Cell design which went into the Microsoft PowerPC core", IamTheRealMike

    "Xenon .. is based on IBM's PowerPC instruction set architecture, consisting of three independent cores on a single die"

    --
    davecb5620@gmail.com
  11. VMX128 in Xenon is borrowed from the Cell SPU's ! by stephen70 · · Score: 5, Interesting

    Slashdot users, read and learn, because anyone who fails to understand the following is uninformed:

    The SPUs on the Cell and the PPC Altivec unit on the Xenon (X360) are very closely related; never before has IBM done a 128-register, 128-bit Altivec unit. The 128-bit x 128-register Altivec VMX128 unit on the Xenon is the best of any CPU; it is also an almost perfect subset, or cut-down version, of the Cell's SPU!

    In non-branching calculations, and assuming no cache misses, VMX128 performance is equal to the SPU's performance. This is not a coincidence: it's a newly shared design feature in both the instruction sets and silicon fab, and it clearly shows the CPU designers shared a lot.

    The older VMX has only 32 registers. Only the Xenon PPC cores and the Cell's SPUs have this new VMX128 type arrangement with 128 SIMD registers - especially enhanced for multimedia and gaming.

  12. Re:E_TOO_VAGUE by CronoCloud · · Score: 2, Informative

    Cell is Hyperthreaded, as any Linux on the PS3 user can show you:

    [CronoCloud@mideel ~]$ cat /proc/cpuinfo
    processor : 0
    cpu : Cell Broadband Engine, altivec supported
    clock : 3192.000000MHz
    revision : 5.1 (pvr 0070 0501)
     
    processor : 1
    cpu : Cell Broadband Engine, altivec supported
    clock : 3192.000000MHz
    revision : 5.1 (pvr 0070 0501)
     
    timebase : 79800000
    platform : PS3
    model : SonyPS3
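
    For the curious, a quick Python sketch that counts the logical processors in output like the above. The function name is made up and the sample text is simply the listing pasted in; on the PS3 the two entries correspond to the two hardware threads of the single PPE.

```python
def count_logical_processors(cpuinfo_text):
    """Count 'processor : N' entries in /proc/cpuinfo-style text."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


# Sample text copied from the listing above.
SAMPLE = """\
processor : 0
cpu : Cell Broadband Engine, altivec supported
clock : 3192.000000MHz
revision : 5.1 (pvr 0070 0501)

processor : 1
cpu : Cell Broadband Engine, altivec supported
clock : 3192.000000MHz
revision : 5.1 (pvr 0070 0501)

timebase : 79800000
platform : PS3
model : SonyPS3
"""

print(count_logical_processors(SAMPLE))  # 2
```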

  13. The Cell architecture just isn't that useful by Animats · · Score: 5, Interesting

    Sony's payback comes when Playstation3 programmers learn to fully utilize the Cell architecture.

    As someone else pointed out, if that was going to happen, it would have happened by now.

    The fundamental problem with the Cell is that each SPU only has 256KB of RAM. (Not 256MB, 256KB.) Data can be moved in and out of main memory in the background with explicit DMA-like operations. Given that model, you have to turn your problem into a data-flow problem, where a data set is pumped sequentially through a Cell processor. The audio guys love this. It's useful for compression and decompression. It's a pain for everything else.
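
    To put rough numbers on that, here is a back-of-the-envelope Python sketch. Only the 256KB figure comes from the hardware; the function name, the 64KB reserved for code, and the two-buffer scheme are assumptions for illustration, and real DMA size and alignment limits are ignored.

```python
LOCAL_STORE_BYTES = 256 * 1024  # per-SPU local store, as stated above


def chunk_plan(total_bytes, code_bytes, num_buffers):
    """Return (chunk_size, num_chunks) for streaming a data set through
    local store, leaving room for code and keeping num_buffers in flight."""
    if num_buffers <= 0:
        raise ValueError("need at least one buffer")
    budget = LOCAL_STORE_BYTES - code_bytes
    chunk = budget // num_buffers
    if chunk <= 0:
        raise ValueError("code leaves no room for data buffers")
    num_chunks = -(-total_bytes // chunk)  # ceiling division
    return chunk, num_chunks


# Hypothetical case: 16MB of data, 64KB of code, double buffering.
print(chunk_plan(16 * 1024 * 1024, 64 * 1024, 2))  # (98304, 171)
```

    Under those assumptions a 16MB data set has to be cut into 171 pieces of 96KB each, which is why the problem has to be reshaped into a stream.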

    It's not good for graphics. There's not enough memory for a full frame, not enough memory for textures, not enough memory for the geometry, and not enough processors to divide the frame up into squares or bands. Sony had to hang a conventional nVidia GPU on the back to fix that. It's useful for particle systems. If you need snow, or waves, or grenade fragments, the Cell is helpful, because that's a pipelineable problem.

    There are some other special-purpose situations where a Cell SPU is useful. But not many. If each SPU had, say, 16MB, the things might be more useful. But at 256KB, it's like having a DSP chip. The Cell part belongs in a cell phone tower, processing signal streams, not in a game machine. It's a great cryptanalysis engine, though. Cryptanalysis is all crunch, with little intercommunication, so that fits the Cell architecture.

    We're back to a historical truth about multi-CPU architecture - there are only two things that work. Shared-memory multiprocessors ("multi-core" CPUs, or the Xbox 360) work; they're well understood and straightforward to program. Clusters, like Google/Amazon/any web farm, also work; each machine has enough resources to do its own work and can live with limited intercommunication. Everything in between those extremes has historically been a flop: SIMD machines (Illiac IV through Thinking Machines), dataflow machines (tried in the 1980s), and mesh machines (nCube, BBN Butterfly). The only exceptions to this are graphics processors and supercomputers derived from them. That, not the Cell, is cutting edge architecture.

    I've met one of the architects of the Cell processor, and his attitude was "build it and they will come". They didn't.

    1. Re:The Cell architecture just isn't that useful by 91degrees · · Score: 2, Interesting

      I never thought the cell was intended for graphics anyway. 3D hardware is simple SIMD, with very long pipelines. Unless you're after ray tracing, a more general purpose chip would be a waste of resources.

      Cell is probably good for complex physics, and sophisticated AI, but that's a bit of a problem because programmers haven't really worked out how to use the resources efficiently yet. Game developers have a very procedural approach to solving problems.

  14. Again, An Iffy Article.... by EXTomar · · Score: 2, Insightful

    Japan has always been like this. Take a look at the PS3 and Wii. Both offer highly proprietary, custom built, and in some ways convoluted technology for the same problem. But for some reason Sony is treated like idiots while the author sort of forgets that the Wii takes the prize. For whatever reason Japanese engineers like doing this: when no existing technology exactly fits a problem, their engineers tend to build a new one even if there are other pre-existing solutions that almost achieve it. Just like other capital projects, it sometimes pays off and sometimes fails.

    Another thing not considered is the fact that the XBox 360 is the most conservative console of the three. The software and hardware technology in the Wii and PS3 are dramatically different than their predecessors, with features that simply don't exist in the ancestors. On the other hand the XBox 360 is more like a beefier XBox. I think the real story is that Sony gambled on some fundamental technology shifts and it didn't pan out. Microsoft on the other hand "played safe" and iterated. There is nothing wrong with that, but to claim it's some technology shift or special insight, especially given their production and software problems, is a bit much.

  15. The die photos show it pretty clearly... by CTho9305 · · Score: 2, Informative

    At first glance, the Xbox CPU doesn't really resemble Cell, but if you just compare Cell's PPE to one of Xenon's three cores the similarity is striking: Xenon, Cell

  16. Re:The poor graphics is a learning curve issue by Ilgaz · · Score: 2, Informative

    Programming issue as a result of development tools? I have been a Symbian user since the Nokia 7650 (first S60) and I keep getting amazed at developers' love for the iPhone, how a very advanced application like Fring can ship in a matter of months without any kind of help from Apple, and how wisely OpenGL (ES) acceleration was used while it was ignored on my poor UIQ3 Sony Ericsson P1i for years until the Opera 9.5 beta.

    People say SSE only caught up with Altivec with the new Xeons, and yet as a G5 owner I kept wondering why Altivec so often went unused, even by Apple themselves in certain parts. Or why SMP (I have a Quad G5) only reached its full potential with OS X Leopard. It has an easy answer: Intel and AMD offer great support to developers, the entire GNU compiler family, and OS developers.

    If it is a programming issue and both IBM and Sony are involved, I would look to the development tools. Somehow I suspect the development tools, and the support offered for them, are way better on the XBox 360. Compare the Symbian UIQ3 market to the far less mature (in terms of coding/UI) Nokia S60, and finally compare Symbian S60 to the iPhone. Development tools really make a huge difference, and Sony is a hardware company; IBM doesn't really have a clue about the end user, etc.