PS3 Cell Processor 'Broken'?
D-Fly writes "Charlie Demerjian at the Inquirer got a look at some insider specs on the PS3, and says Sony screwed up big time with the Cell processor: the memory read speed on the current devkits is something like 3 orders of magnitude slower than the write speed, and is unlikely to improve much before the ship date. The slide from Sony pictured in the article is priceless: 'Local Memory Read Speed ~16Mbps, No this isn't a Typo.' Demerjian says when the PS3 comes out a full year after the XBox360, it's still going to be inferior: 'Someone screwed up so badly it looks like it will relegate the console to second place behind the 360.'" This is the Inquirer, so take it with a grain of salt. Just the same, it doesn't sound too good for Sony or IBM.
What is this 'local memory'? On-die cache? How the fuck can you screw that up to make it 16Mbit?
PS3 is way overkill for a console anyway. What are they thinking? Not everyone needs a console with 1GB of memory, huge HDD, which also doubles as a DVD Player/Entertainment center/Memory stick player (you betcha sony is already adding THAT feature), oh and can also play some games.
I'm all for Nintendo's new console. It's cheap, it will have amazing games AND they're not trying to make it the center of your digital home.
Microprocessor Online has an interesting analysis. Pay attention to page 8, where the PS2 "Emotion Engine" processor is compared to the PS3 Cell processor. This is an analyst report for the microprocessor industry.
If you really want to dig into the details of the Cell processor, check out Sony's resources. You have to agree to a bunch of things to get to the pdfs but there's a lot of information in them. Another place you can find information is IBM's resource site which contains a lot of stuff including the programming handbook.
My work here is dung.
There is no point in judging a dev kit. X360 kits were shitty too.
[chinese democracy starts now
I'm aware that, in the past, The Inquirer has published questionable articles. However, they've certainly got a revealing picture to back it up here...unless they're outright lying and they photoshopped something, why should we take this story with a grain of salt?
ACs are modded -6. I don't read you, I don't mod you, I don't see you. Don't like it? Don't be a coward.
That Ken Kutaragi let his loser long-lost baby brother design the PS3 without looking at the thing or its price tag until it was unveiled?
Monstar L
So what is the difference between the local memory 16MB/s and the main memory 25GB/s 'reading'?
I assume the local memory is not going to be used much for 'reading' and only main memory is going to be used.
"Only one thing, is impossible for god: to find any sense in any copyright law on the planet." Mark Twain
Ah well, it's nothing a complete recall and price increase can't fix...
This reminds me, I am certain to be crucified for not remembering this bit of trivia, but the PS3 is looking more and more like that car that Homer designed for his brother....
What was that called again?
On Wall Street they say "buy low, sell high" On the pad we say, "buy high, sell high" Isn't that somehow better?
I can't imagine why Sony would add the text "this is not a typo" underneath the below-average local read speed unless they are planning to release the final PS3 public version with much higher read speeds. If you can program a game to run great with the low read barrier then wouldn't you expect it to run even more efficiently with the gates wide open in a final/public PS3 release? my .02c
Pay attention. The article says that SONY is telling the developers to avoid using local memory at all - that means it won't be fixed in the retail version.
No, it means the Inquirer is the digital equivalent of a rag.
I thought we had all boycotted Sony anyway! Or are we on another bandwagon this week?
The subject says it all. It's getting really tedious. Why not just wait for the release and then make comments?
A CC-licensed illustrated horror novel
Noticed the logo on the bottom left of the slide. Maybe it should have read
DeviStation
Open Source Drum Kit, LPLC deve board - mjhdesigns.com
The "Local Memory" is the memory attached to the RSX.
That the read performance for the Cell from this memory is dreadful is no surprise. This is exactly the same architecture that has been traditionally used in PCs. Reading graphics memory from the main processor is usually really really slow.
This memory is where you store textures and other graphics data. The main processor will usually have little need to read from this memory. If it does, then, as apparently Sony says, you just get the RSX to write to main memory instead.
This is a non-story. People have dealt with this for PC games for a long time.
I was just about to post the exact same thing. It amazes me that:
1) The poster had no clue
2) Zonk (and for that matter, the whole /. editing kiddie troupe) seems to have no clue
3) This mistake happens _constantly_ on /., and it's constantly pointed out and constantly ignored
4) Anyone with even a basic understanding of computers wouldn't make this mistake
Just more proof that "IT" != computer science
Does anyone ever bother reading the *IBM* documents for this? Never mind what Sony have managed to do to the cell processor, if you turn to the IBM CBEA developers handbook (page 75), you will see:
"Load and store operations (LS), 6 Clock cycles Latency". And that's the time it takes for the instruction to complete, not to be issued to memory.
(3.2Ghz / 6 cycles) * 16 bytes != 16MB/s
Personally, I'm gonna bet on IBM being right, seeing how they're the ones who made the bloody thing. I don't trust the inquirer anyway, but if those figures are true, the most likely answer is inefficiencies in their benchmarking programs, (Such as instruction starvation, a nasty side effect of using SPU's)
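For anyone who wants the arithmetic behind that one-liner spelled out, here is a minimal sketch. It assumes a 3.2 GHz clock and a deliberately pessimistic model in which every 16-byte load must complete (6 cycles) before the next one is issued; the function name is made up for illustration and none of this comes from the IBM handbook beyond the 6-cycle figure quoted above.

```python
def serial_load_bandwidth(clock_hz, latency_cycles, bytes_per_load):
    """Bytes per second if each load must finish before the next one starts."""
    return clock_hz / latency_cycles * bytes_per_load

# Assumed figures: 3.2 GHz clock, 6-cycle load latency, 16-byte (128-bit) loads.
bw = serial_load_bandwidth(3.2e9, 6, 16)

print(f"{bw / 1e9:.2f} GB/s even with no overlap between loads")
print(f"roughly {bw / 16e6:.0f}x the 16 MB/s on the slide")
```

Even this worst-case model lands in the gigabytes-per-second range, which is why 16 MB/s cannot plausibly be an SPU reading its own local store.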
I've been hearing a lot of chatter about how the PS3 is difficult to program for, developers don't like it, Sony isn't providing quality libraries, blah, blah, blah. These exact same things were said about the PS2 when it first came out six years ago and it still managed to dominate its generation of console gaming. And it certainly wasn't true that developers avoided the PS2 in favor of XBOX or GameCube. As always the winner and losers of the console wars will be decided by the buying public, in the US, Japan, and Europe.
I think being too connected to the online debates about this stuff can make you lose sight of what the more average public thinks and bases their purchase decisions on. That's why the only real argument for the PS3's failure so far is the high price, not questions about performance or developer issues.
It's a dev kit, first off, second off it's the inquirer, which was formed from register rejects and doesn't have BOFH, and third off, I saw a UC Berkeley benchmark with an emulated cell that would seem to indicate this is a production problem, not a design problem.
But seriously, WTF should I care? I really don't care which console wins the virtual pissing match in the "ooooh shiny" department, if I was one of the people that did, the PS3 is already into the realm where $500 video card purchases begin to look slightly reasonable.
I'll judge it by the games, when they're released or playable.
The key to the enjoyment of pop music is to replace any instance of "love" with "C.H.U.D."
No, Sony are telling developers not to read from "local memory" using the Cell. This is not the same thing at all.
There is nothing to fix. This is by design.
The "Local Memory" is the RSX graphics memory. The Cell has no need to read from this.
The Inquirer article is rubbish and that slide is taken out of context. It seems to imply that the Cell can only read "Cell local memory" (whatever that is) at 16MB/s.
Memory transfer bandwidth between each SPU and its SPU Local Memory is something more like 25GB/s (gigabyte per second); sustained actual bandwidth between all SPUs is greater than 100GB/s; peak theoretical is greater than 200GB/s (assuming all 8 SPUs present for simplicity).
If you had access to the full version of the presentation (part of the full Sony PS3 SDK and technotes), you'd realise that that slide is part of a presentation about the RSX (the PS3's GPU). As such, when it refers to "Local Memory", it means RSX's Local Memory (eg graphics memory, video memory, VRAM or whatever you call it in fanboy/ps3/360-is-teh-suck websites). To be understood outside that context, the columns would be better labelled "Main System Memory" and "GPU Local Memory".
The Inquirer article seems to suggest that this figure of 16MB/s (megabyte per second, by the way, what the fuck is it with journalists swapping bits for bytes? why don't they get their shift/capslock keys fixed?) is some kind of show stopper. No it isn't. It simply means that the Cell processor has 16MB/s bandwidth when reading directly from memory-mapped GPU address space. So what? Unless you're planning on calling memcpy() or some shit to bring your data back then it doesn't really matter.
On RSX-initiated transfers you have 20GB/s bandwidth to do the same transfer (from RSX local to main system memory). Cell read bandwidth of GPU memory might as well have 0MB/s (ie no connection at all) and it wouldn't matter a bit.
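To make the difference between the two paths concrete, here is a rough sketch using the two bandwidth figures quoted above (16 MB/s for Cell reads of RSX memory, 20 GB/s for RSX-initiated transfers to main memory). The buffer size is a made-up example (a 1280x720, 32-bit frame), and decimal megabytes and gigabytes are assumed.

```python
def transfer_seconds(num_bytes, bytes_per_second):
    """Time to move num_bytes at a sustained bytes_per_second."""
    return num_bytes / bytes_per_second

FRAME_BYTES = 1280 * 720 * 4   # hypothetical 720p buffer, 4 bytes per pixel
CELL_READ_BPS = 16e6           # Cell reading memory-mapped RSX local memory
RSX_WRITE_BPS = 20e9           # RSX pushing the same data to main memory

slow = transfer_seconds(FRAME_BYTES, CELL_READ_BPS)
fast = transfer_seconds(FRAME_BYTES, RSX_WRITE_BPS)

print(f"Cell reads RSX memory directly: {slow * 1e3:.1f} ms")
print(f"RSX writes to main memory:      {fast * 1e3:.3f} ms")
```

Under these assumptions the direct read takes about a quarter of a second per buffer, while the RSX-initiated transfer takes well under a millisecond, so you simply never do the former.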
About two years ago I decided to leave my post as a reviewer/tester for Sony. I had close ties with them for over 4 years and I began to have major misgivings on the direction and quality (lack thereof) that was being pumped out. I have been around the gaming industry long enough to know the beginnings of massive problems and they began a few years back.
Everyone close to me in the industry said I was crazy and that this would all smooth out and Sony would easily retain its market share if not grow more. I wasn't buying it and stuck to my guns, I'm pretty happy about my decision almost daily since day 1 of E3 this year.
I was against UMD from the beginning, yet everyone claimed that the sales were stellar. Looks like they weren't and they are proprietary, expensive, unwieldy little discs that no one wants to deal with. The "cell" processor was without a doubt my turning point, I have ZERO faith in it or the architecture and it will not become this ubiquitous omnipresent processor as so many claim, even IBM has major problems with it and designing compilers and dev software for their own product. Control schemes have been radically changed from initial proposals, and too quickly to be properly tested... that is a bomb yet to go off. System price and dev costs that are just too high for our current economic situation as well as for widespread adoption. There are more issues, but top it all off with a new unproven media that is also expensive and offers no real consumer advantages and you have the high risk of a catastrophic failure that could hurt Sony and IBM even more than they are already hurting.
The best that can happen is that companies finally lose the DRM/proprietary/Closed nature of their consumer electronics. Stop treating customers as criminals and start to offer them affordable and accessible entertainment that is convenient. I'd actually prefer consoles to standardize and become built into consumer electronics so that developers and consumers can really get to work on a stable and long lasting platform. Imagine the possibilities. There is a lot to be said for standards.
http://teasphere.wordpress.com - A little spot of tea
There is no screw up. This is by design. This is exactly how PC graphics cards work. With the PS3 graphics system based closely on a PC one, it is no surprise that this is the case.
The "Local Memory" is the RSX memory. The Cell doesn't need to read from this.
This isn't the online IT arm of the National Enquirer, you know.
The Inq isn't always right, but what they do tend to have is a lot of news-breaking stuff that they (well, Mike) are willing to publish regardless of the consequences when the corporate heads find out there's a leak. That's why Mike got eased out of The Register when it went more corporate, and formed the Inq in the first place.
Those who have been following it for a while will remember all the appearances of leaked memos from Compaq (ex-DEC) insiders who were willing to leak happily to someone of the old school who was interested in seeing how the whole fiasco was turning out. Compaq/HP even started internal witchhunts looking for the leakers.
Regardless, the only real problem people might have with the Inq is they can't distinguish between an opinion piece and direct reporting, or can't accept that while the information as presented might be correct, it doesn't ensure that interpretive parts also follow.
Nihil Illegitemi Carborvndvm
Er, yes it is. The slide says 16MB/s, not 16Mbps, i.e. megabytes, not megabits... 16Mbps would be pretty slow!
i know it's far fetched, but think for a moment, if you were IBM, a major IT player with lots to gain if you make peace with microsoft (after years of a bitter relationship, see MSs monopoly trial's documents for more info), who would you prefer to help: microsoft or sony ?
i'd bet on MS. making a kick ass CPU for the 360 would make easier for IBM to extract sweeter deals from MS in other areas and to placate bill's wrath in what concerns IBM's linux business. if this means screwing up sony, so be it.
The reason IBM's relationship with MS was bitter was because IBM was simultaneously a competitor and a customer of Microsoft. It was a bitter relationship because IBM did have to do things to make MS happy that IBM would have rather not done. This is the kind of control MS has over every company that depends on their OS for sales, and IBM doesn't like it. They don't want to have to try to extract sweet deals from MS by dancing to their -- a competitor's -- tune.
IBM has for years been trying to extricate themselves from this situation. In recent years these efforts have become even more pronounced. They sold off their PC division, making all of MS's influence on the desktop irrelevant to them. That leaves MS in the server, and a major reason for IBM's investment in Linux is to fend off the advance of Windows into that space (proprietary Unix having proven ineffective at doing so).
So IBM really has no reason to make peace with MS, in so much as it doesn't stop MS from being IBM's customer. This is an arrangement I'm sure they much prefer -- MS is now buying from IBM instead of the other way around, and all they have to do to keep MS happy is provide the processors they want in the quantity they want just like every other customer.
Sony and MS are both just revenue streams as far as IBM's processor division is concerned. If there was any customer they were going to sabotage in order to benefit themselves in another space, it'd be Microsoft.
The enemies of Democracy are
When you see "No this isn't a Typo" on the front page of slashdot, be very skeptical.
Take a closer look at the linked image. The two top columns are CELL. Not RSX, CELL.
And the theoretical bandwidth numbers listed for CELL to main memory are those of the direct XDR interface. You'll note that the RSX has much lower numbers because it accesses main memory through a bridge bus (much like a graphics card on PCIe).
On the Cell, there is only one thing local memory can mean, and that is the local memory of each SPE.
NOTE: this can be a serious issue, because each SPE MUST read instructions and write results to the local memory. It is up to the main processor to load instructions into this memory from main memory, and to copy results from this local memory to main.
Man is the animal that laughs.
And occasionally whores for Karma.
After reading the article, I realize that these are numbers for Cell and RSX local memory. Of course, our stupid submitter wanted to make us think this was the SPE's local memory, and purposefully put a DIRECT LINK to the photo in addition to the article link when he knew it would be taken out of context.
Man is the animal that laughs.
And occasionally whores for Karma.
Either that, or a broken benchmark. Each Cell processor (Synergistic Processing Element -- SPE) shares its instruction fetch port with its data memory port. The SPE can buffer up 80 instructions at a time (2.5 fetch words), plus an additional 32 from a branch target. Fetch will stall if the memory system gets saturated with loads and stores. Properly written memory-intensive code includes explicit fetches to keep these buffers full. Incorrectly written code will cause problems. Still, that doesn't explain a 3 orders of magnitude drop.
If you look at the slides on the page I linked to above, you'll see the SPEs are not connected into the global address space. They connect to a private single ported memory, and to each other through two unidirectional rings. (The ring structure is not apparent from that diagram, but trust me, it's there.) These rings then connect to a DMA engine.
If you wade through this paper, you'll see that the Cell compiler implements a software cache. (The same paper also explains the instruction fetch mechanism mentioned above, BTW.) That is, it emulates a cache in software, using the DMA to actually move memory around. Depending on the nature of the benchmark and how it was written, it could be that the read benchmark spends all its time allocating stuff into this cache and waiting for it to arrive. Writes would be faster because the cache can "write behind" without having to wait for the allocation to happen, if the compiler is smart enough to know that the previous data will be entirely overwritten. So, if the benchmark goofed, then the results are meaningless.
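The read-versus-write asymmetry described above can be shown with a toy model. This is only a sketch of the general idea (a direct-mapped software cache where read misses block on a fetch and whole-line writes allocate without fetching); it is not the Cell compiler's actual implementation, and the class name, line size, and counters are all invented for illustration.

```python
LINE = 128  # bytes per cache line (illustrative value)

class SoftwareCache:
    """Toy direct-mapped cache held in 'local store', backed by 'main memory'."""

    def __init__(self, main_memory, num_lines=4):
        self.main = main_memory            # bytearray standing in for main memory
        self.num_lines = num_lines
        self.tags = [None] * num_lines     # which main-memory line each slot holds
        self.data = [bytearray(LINE) for _ in range(num_lines)]
        self.dirty = [False] * num_lines
        self.blocking_dmas = 0             # transfers the program had to wait for
        self.queued_dmas = 0               # write-backs issued without waiting

    def _evict(self, slot):
        # Write-behind: a dirty line is queued for write-back, no stall counted.
        if self.dirty[slot]:
            start = self.tags[slot] * LINE
            self.main[start:start + LINE] = self.data[slot]
            self.queued_dmas += 1
            self.dirty[slot] = False

    def read_line(self, line_no):
        slot = line_no % self.num_lines
        if self.tags[slot] != line_no:
            self._evict(slot)
            start = line_no * LINE
            self.data[slot][:] = self.main[start:start + LINE]
            self.tags[slot] = line_no
            self.blocking_dmas += 1        # read miss: must wait for the data
        return bytes(self.data[slot])

    def write_line(self, line_no, payload):
        assert len(payload) == LINE        # whole-line overwrite only
        slot = line_no % self.num_lines
        if self.tags[slot] != line_no:
            self._evict(slot)
            self.tags[slot] = line_no      # allocate without fetching old contents
        self.data[slot][:] = payload
        self.dirty[slot] = True

    def flush(self):
        for slot in range(self.num_lines):
            self._evict(slot)

# A streaming "read benchmark" and "write benchmark" over the same 16 lines.
main = bytearray(16 * LINE)
reader = SoftwareCache(main)
for n in range(16):
    reader.read_line(n)

writer = SoftwareCache(main)
for n in range(16):
    writer.write_line(n, bytes([n]) * LINE)
writer.flush()

print("read benchmark blocking transfers: ", reader.blocking_dmas)
print("write benchmark blocking transfers:", writer.blocking_dmas)
```

In this model the streaming read stalls on every line while the streaming write never stalls, which is the kind of gap a naive benchmark could report as "reads are slow" without saying anything about the hardware.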
Fact of the matter is that the SPEs are capable of reading 128 bits a cycle each (128 bytes / cycle across the 8 SPEs). Other benchmarks, such as the article recently posted to Slashdot about using Cell for scientific computation confirm that this thing hauls--and these are bandwidth-intensive tasks. The quoted paper did run some numbers on real silicon and showed numbers similar to their simulation results.
With all this in mind, I find it hard to believe that Cell is broken.
--JoeProgram Intellivision!
It's the Inquirer, who have confirmed allegiance to be Xbox fanboys and slag anything Sony.
Can you provide examples of them "confirming" allegiance to the Xbox? Can you provide a single example?
Because I read The Inquirer quite often and I have never seen anything like that. Is it at all possible that you're just a lying little shit who didn't have any interesting points to make and so decided to make some stuff up?
Actually those are rhetorical questions. The Inquirer is very much anti-MS. Of course it's also anti-Sony, which may be what's confusing you. Try only reading websites which show the proper level of sycophancy towards large multi-national corporations - they may not upset you so much.
Why do you care if Sony are unfairly maligned anyway? If you worked for them or had stock in them it would be understandable, although then your opinions would be null and void. But you don't do you? You don't have any stake at all in Sony and yet here you are bullshitting on their behalf. What the fuck is going on in your mind? Do we even want to know?
It's there for a reason.
My flame:
"I'm sure you'll get a lot of these messages, but hell, you deserve it.
The slow read speed you noted in the slide is for Cell reading from the RSX's local memory. Such accesses are expected to be very slow. If you look at this USENIX article from one of the Linux DRI folks, you can see this quite easily:
DRI article
He shows how painfully slow it is to read from AGP or framebuffer memory (14 and 5 MB/sec, respectively), on a Rage 128 graphics card. For the CPU to framebuffer read, which is the equivalent to what we're talking about here, the read speed is 1/40th the write speed. At 16MB/sec read and 4GB/sec write, the PS3 is actually right in line with what can be expected of modern GPU architectures.
Reading from the framebuffer is just slow unless you have a unified memory architecture. The CPU and the GPU aren't cache-coherent, which means every access to framebuffer memory (or even AGP memory, which is actually a chunk of system memory allocated to the GPU) must be an uncached access. Uncached accesses are just plain slow, on any architecture.
The way your article is written, it makes it seem like Cell reads its local storage at 16 MB/sec. That is, of course, bollocks, since IBM has shown benchmarks of the Cell local storage achieving 98% efficiency. If you had any journalistic integrity at all, you'd post a retraction on your site, and a clarification of the technical issues involved."
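A quick way to compare the two cases in that letter is to look at the write-to-read ratio rather than the raw numbers. This is only a sketch using the figures quoted above: 16 MB/s read and 4 GB/s write for the PS3 devkit, and for the Rage 128 a 5 MB/s framebuffer read with a write speed 40 times that. Decimal units are assumed and the function name is made up.

```python
def write_to_read_ratio(write_bytes_per_s, read_bytes_per_s):
    """How many times faster the CPU can write a memory region than read it."""
    return write_bytes_per_s / read_bytes_per_s

# PS3 devkit figures as quoted above.
ps3_ratio = write_to_read_ratio(4e9, 16e6)

# Rage 128 framebuffer: 5 MB/s read, write speed 40x the read speed.
rage128_ratio = write_to_read_ratio(40 * 5e6, 5e6)

print(f"PS3 devkit: writes ~{ps3_ratio:.0f}x faster than reads")
print(f"Rage 128:   writes ~{rage128_ratio:.0f}x faster than reads")
```

The exact ratios differ, but in both cases CPU reads of graphics memory are far slower than writes, which is the pattern the letter is pointing at.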
A deep unwavering belief is a sure sign you're missing something...
Ahh, so this is the rate at which the Cell can read RSX's local memory? That I'll believe. And I will equally agree "BFD!" The Cell does its work and dumps everything to main memory or the RSX's memory. RSX does its work and if it needs to communicate anything major back to the Cell, it does so through main memory. Makes perfect sense then.
I thought something seemed awful fishy. I thought the slide was summarizing performance of the Cell SPE and RSX, not the Cell's and RSX's ability to communicate with the RSX's local memory. If your statement's true, then this paragraph in TFA is full of it: (Emphasis mine.)
It all begins to make a lot more sense, though, if this is about accesses from Cell or RSX to memory local to RSX. I admit ignorance on the RSX's architecture. I just know in my bones that those numbers aren't for a Cell SPE talking to its local memory.
--JoeProgram Intellivision!
The entertaining thing is that this particular problem is something quite natural to PCs. Framebuffer accesses on PCs have been slow ever since graphics cards started sporting dedicated coprocessors. You take a brand spanking new PC, and start downloading stuff from the framebuffer, your transfer rate will be abysmal.
The problem here isn't just a lack of embedded hardware knowledge. It's just a lack of knowledge.
A deep unwavering belief is a sure sign you're missing something...