Next-Gen Console CPUs Not Up to Hype
rAiNsT0rm writes "Anandtech follows up their initial in-depth coverage of the Xbox 360 and PS3 CPU with the real truth about the next-gen consoles' Poor CPU Performance. From the article: "Speaking under conditions of anonymity with real world game developers who have had first hand experience writing code for both the Xbox 360 and PlayStation 3 hardware (and dev kits where applicable), we asked them for nothing more than their brutal honesty. What did they think of these new consoles? Are they really outfitted with the PC-eclipsing performance we've been led to believe they have? The answer is actually quite frequently found in history; as with anything, you get what you pay for."" Update: 06/30 21:11 GMT by Z : The original article disappeared from Anandtech, so I've changed the link to point to the story as hosted by Google Groups.
The Revolution is for you. Not only are Nintendo games known for being popular with kids, but the Revolution will have downloadable classics that ran on old systems.
The article said that most developers would be using only one of the PS3's processors for most operations. Well, when you're used to designing for one processor, you tend to continue designing for one processor.
Not really surprising; at any rate, it may be essential to get used to this type of architecture/programming, as The Free Lunch Is Over, if this article is to be believed. (This may have featured in /.; I forget where I first saw it).
"Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
It is true that they have not released an MSRP for the next-gen consoles; however, Merrill Lynch business analysts have placed their estimate for the PS3 at $399 USD. What makes this interesting is that each system is expected to cost Sony $494 to build. The full article can be read here.
do.what.promptcmds
No shit. 2-issue and in-order requires hand-tuned coding. Yes, there is a wallop for a "cache miss" (fetching out to main mem) on the SPEs of the Cell processor. But there are ways to code around that. Split the local store up into smaller chunks and fetch data to fill the smaller chunks while the SPE plugs away on the chunks filled with data. That's why the SPE has TWO pipes. One pipe is for memory loads, the other pipe is for data processing.
http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/E815CC047A60914687256FC000734156/$file/ISSCC-07.4-Cell_SPU.PDF
http://research.scea.com/research/html/CellGDC05/15.html
http://research.scea.com/research/html/CellGDC05/16.html
If you don't split up the local store, you're going to incur a 500 cycle penalty while waiting for memory. If you split up the local store, you can fetch to half the mem and process on the other half. This amortizes, if not completely masks, the cost of main memory access.
Correct me if I'm wrong.
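The split-local-store idea above is usually called double buffering. Here is a toy cycle-count model of it in Python, just to show where the savings come from. The function names, the chunk count and the compute cost per chunk are all made up for illustration; only the 500 cycle fetch penalty comes from the comment above, and real DMA scheduling on the Cell is more involved than this.

```python
# Toy model of double buffering on an SPE-like core.
# Assumptions (not from any Cell documentation): every chunk costs the same
# number of cycles to fetch and to process, and one fetch can fully overlap
# with one compute step.

def single_buffer_cycles(chunks, fetch_cycles, compute_cycles):
    """Fetch a chunk, wait for it, process it, repeat. Nothing overlaps."""
    return chunks * (fetch_cycles + compute_cycles)


def double_buffer_cycles(chunks, fetch_cycles, compute_cycles):
    """Process one half of the local store while the other half is filled."""
    if chunks == 0:
        return 0
    # The very first fetch cannot be hidden, and the very last compute step
    # has no fetch left to overlap with. Every step in between costs
    # whichever of the two is slower.
    overlapped_steps = chunks - 1
    return (fetch_cycles
            + overlapped_steps * max(fetch_cycles, compute_cycles)
            + compute_cycles)


if __name__ == "__main__":
    # 100 chunks, 500 cycle fetch, 600 cycles of work per chunk (hypothetical).
    print(single_buffer_cycles(100, 500, 600))  # 110000
    print(double_buffer_cycles(100, 500, 600))  # 60500
```

When the work per chunk takes at least as long as the fetch, the memory latency is almost entirely hidden; when the work is shorter than the fetch, the latency is only partly amortized and the SPE still stalls.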
It's up to the developer to optimize their code and ensure that it is being scheduled properly.
I'd love to hear from a developer that is actually doing everything they can at the low level to optimize data flow. What's their experience with keeping the processors fed with data?
Sony never claimed 160 million polygons per second. These numbers have become fabricated over time. They did claim 70M polygons per second, but nobody ever quotes the context. It was 70M unshaded polygons per second. The number for shaded/textured polygons (in the PS2 datasheet) is 20-30M, and the most recent games are indeed in the (lower end) of that range.
I'd also point out the "Toy Story in Real Time" thing was never a Sony claim either. As far as I've been able to track it down, it was some idiot journalist that made the claim, not a Sony spokesperson or any Sony marketing literature.
A deep unwavering belief is a sure sign you're missing something...
If you still want fun games for kids that don't include lots of graphic violence, and you're on a PC (or Mac) instead of a console - I think almost all the stuff from GameHouse is excellent. My kid is only 3, yet she already loves playing their "Gutterball 3D" game, just to try different colored bowling balls and watch them roll down the lane and knock pins down. And if they're a little older, all the stuff like TextTwist makes you think as well as have fun.
They're inexpensive and downloadable off the net, too... so if you want a new one, you don't even have to go to the store to get it first.
These days, most of the really good, non-violent stuff in PC/Mac gaming comes from web sites marketing their goods online. The small developers haven't "sold out" to Hollywood yet.
"If Sony and MS are overhyping their systems then so is Nintendo."
False! I don't know how old you were last console cycle, but Nintendo was very realistic about the Gamecube's abilities before it was released. They said, "it can render 9 million polygons per second under realistic conditions." Cue Sony: "Well ours can render ONE HUNDRED BILLION polygons per second!" and then Microsoft: "We can do INFINITY BILLION TO THE INFINITY POWER!!!". So it isn't at all clear to me why the fact that Sony and MS overhype indicates that Nintendo overhypes.
http://news.com.com/2100-1040-250632.html?legacy=
"It is accurate that at this time we will not support high-definition [on Revolution]," Kaplan told IGN.com.
It's really hard to tell what will happen by the time it's released. The Gamecube is theoretically capable of 720p output, though the games only utilize 480p. Considering the video hardware that is being used, it's safe to assume that the Revolution is at least as capable as the Gamecube. It's not going to matter all that much, because we're still going to be stuck with 480p DVD movies for a while. 480p is a form of SDTV. Even if it's not "HD", it's still much higher quality than any analog television. Your comment about the RCA analog television is grossly exaggerated. And let's be honest... All three systems will have hardware that's practically the same, regardless of these cracked out specs and numbers (ironic, isn't it, that all three are using what is essentially a next-gen Gamecube with PowerPC and ATI graphics). What it will really boil down to is the games.
On the PS3, we get 256 KB of memory with a vector processor running at 3.2GHz, and people are complaining? And we get a bunch of 'em. Whoever they talked to were not PS2 developers. The same people who made sweet PS2 games will be making sweet PS3 games, trust me. For everyone else it will be harder to get used to.
For one thing, you can kiss your virtual calls goodbye.
Here's your source!
http://cube.ign.com/articles/522/522559p2.html
Q: Is Revolution "two-to-three times more powerful than GameCube"?
A: USA Today reported this news based on a comment from Nintendo of America's vice president of corporate affairs, Perrin Kaplan. The information was later determined to be false. We do not yet know how much more power Revolution wields over its predecessor.
Developers pushed the GC to over 14 million shortly after it was released. (I think it was in one of the Star Wars games). The numbers that Nintendo was putting out were not only realistic; they were slightly conservative.
Contrast with Sony and MS whose claimed performance numbers for the last 6 years have been pure fantasy and/or hype.
(minus page 6 about the GPUs, it got squashed in my cache when I tried linking back after it was pulled)
In our last article we had a fairly open-ended discussion about many of the challenges facing both of the recently announced next-generation game consoles. We discussed misconceptions about the Cell processor and its ability to accelerate physics calculations, as well as touched on the GPUs of both platforms. In the end, both the Xbox 360 and the PlayStation 3 are much closer competitors than you would think based on first impressions.
The Xbox 360's Xenon CPU features more general purpose cores than the PlayStation 3 (3 vs. 1); however, game developers will most likely only be using one of those cores for the majority of their calculations, leveling the playing field considerably.
The Cell processor derives much of its power from its array of 7 SPEs (Synergistic Processing Elements); however, as we discovered in our last article, their purpose is far more specialized than we had thought. Epic Games' head developer, Tim Sweeney, provided a much more balanced view of what sorts of tasks could take advantage of the Cell's SPE array.
The GPUs of the next-generation platforms also proved to be quite interesting. In Part I we speculated as to the true nature of NVIDIA's RSX in the PS3, concluding that it's quite likely little more than a higher clocked G70 GPU. We will expand on that discussion a bit more in this article. We also looked at Xenos, the Xbox 360's GPU and characterized it as equivalent to a very flexible 24-pipe R420. Despite the inclusion of the 10MB of embedded DRAM, Xenos and RSX ended up being quite similar in our expectations for performance; and that pretty much summarized all of our findings - the two consoles, although implementing very different architectures, ended up being so very similar.
So we've concluded that the two platforms will probably end up performing very similarly, but there was one very important element excluded from the first article: a comparison to present-day PC architectures. A comparison to PC architectures is important because it provides a reference point for gauging the expected performance of these next-generation consoles. We've heard countless times that these new consoles would offer better gaming performance than anything we've had on the PC, or anything we would have for a matter of years. Now it's time to actually put those claims to the test, and that's exactly what we did.
Speaking under conditions of anonymity with real world game developers who have had first hand experience writing code for both the Xbox 360 and PlayStation 3 hardware (and dev kits where applicable), we asked them for nothing more than their brutal honesty. What did they think of these new consoles? Are they really outfitted with the PC-eclipsing performance we've been led to believe they have? The answer is actually quite frequently found in history; as with anything, you get what you pay for.
Learning from Generation X
The original Xbox console marked a very important step in the evolution of gaming consoles - it was the first console that was little more than a Windows PC.
The original Xbox was basically a PC
It featured a 733MHz Pentium III processor with a 128KB L2 cache, paired up with a modified version of NVIDIA's nForce chipset (modified to support Intel's Pentium III bus instead of the Athlon XP it was designed for). The nForce chipset featured an integrated GPU, codenamed the NV2A, offering performance very similar to that of a GeForce3. The system had a 5X PC DVD drive and an 8GB IDE hard drive, and all of the controllers interfaced to the console using USB cables with a proprietary connector.
For the most part, game developers were quite pleased with the original Xbox. It offered them a much more powerful CPU, GPU and overall platform than anything had before. But as time went on, there were definitely limitations that developers ran into with the first Xbox.
One of the biggest limitations
On a purely hardware level, ATI's Xbox 360 GPU (codenamed Xenos) is quite interesting. The part itself is made up of two physically distinct silicon ICs. One IC is the GPU itself, which houses all the shader hardware and most of the processing power. The second IC (which ATI refers to as the "daughter die") is a 10MB block of embedded DRAM (eDRAM) combined with the hardware necessary for z and stencil operations, color and alpha processing, and anti-aliasing. This daughter die is connected to the GPU proper via a 32GB/sec interconnect. Data sent over this bus will be compressed, so usable bandwidth will be higher than 32GB/sec. Inside the daughter die, between the processing hardware and the eDRAM itself, bandwidth is 256GB/sec.
At this point in time, much of the bandwidth generated by graphics hardware is required to handle color and z data moving to the framebuffer. ATI hopes to eliminate this as a bottleneck by moving this processing and the back framebuffer off the main memory bus. Main memory is 512MB of GDDR3 on a 128-bit, 700MHz bus (which results in just over 22GB/sec of bandwidth). This is less bandwidth than current desktop graphics cards have available, but by offloading work and bandwidth for color and z to the daughter die, ATI saves itself a good deal of bandwidth. The 22GB/sec is left for textures and the rest of the system (the Xbox implements a single pool of unified memory).
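The "just over 22GB/sec" figure for main memory can be reproduced with a short calculation. The sketch below assumes that GDDR3 transfers data on both clock edges (two transfers per clock) and that 1GB means 10^9 bytes; the function name and its parameters are ours, chosen for illustration.

```python
def peak_bandwidth_gb_per_s(bus_width_bits, clock_mhz, transfers_per_clock=2):
    """Theoretical peak bandwidth of a memory bus in GB/sec (1 GB = 1e9 bytes).

    transfers_per_clock=2 models a double data rate interface, which is an
    assumption here rather than something the article states.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


if __name__ == "__main__":
    # 128-bit bus at 700MHz: 16 bytes x 1.4 billion transfers per second.
    print(peak_bandwidth_gb_per_s(128, 700))  # roughly 22.4
```

This is a theoretical peak only; sustained bandwidth in a real system will be lower.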
The GPU essentially acts as the Northbridge for the system, and sits in the middle of everything. From the graphics hardware, there is 10.8GB/sec of bandwidth up and down to the CPU itself. The rest of the system is hooked in with 500MB/sec of bandwidth up and down. The high bandwidth to the CPU is quite useful as the GPU is able to directly read from the L2 cache. In the console world, the CPU and GPU are quite tightly linked and the Xbox 360 stands to continue that tradition.
Weighing in at 332M transistors, the Xbox 360 GPU is quite a powerful part, but its architecture differs from that of current desktop graphics hardware. For years, vertex and pixel shader hardware have been implemented separately, but ATI has sought to combine their functionality in a unified shader architecture.
What's A Unified Shader Architecture?
The GPU in the Xbox 360 uses a different architecture than we are used to seeing. To be sure, vertex and pixel shader programs will run on the part, but not on separate segments of the hardware. Vertex and pixel processing differ in purpose, but there is quite a bit of overlap in the type of hardware needed to do both. The unified shader architecture that ATI chose to use in their Xbox 360 GPU allows them to pack more functionality onto fewer transistors, as less hardware needs to be duplicated for use in different parts of the chip; the same hardware runs both vertex and pixel shader programs.
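To make the appeal of sharing hardware concrete, here is a deliberately crude model that compares a fixed vertex/pixel split against a unified pool of the same total size. Everything in it is an assumption made for illustration: the 8/40 split, the workloads, and the idea that work divides perfectly evenly across units. It is not a description of how Xenos or any desktop part actually schedules work.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def fixed_split_cycles(vertex_work, pixel_work, vertex_units, pixel_units):
    """Dedicated units: each kind of work can only use its own hardware,
    so the frame takes as long as the slower of the two sides."""
    return max(ceil_div(vertex_work, vertex_units),
               ceil_div(pixel_work, pixel_units))


def unified_cycles(vertex_work, pixel_work, total_units):
    """Unified units: any unit can take either kind of work."""
    return ceil_div(vertex_work + pixel_work, total_units)


if __name__ == "__main__":
    # Hypothetical vertex-heavy frame: 480 units of each kind of work.
    print(fixed_split_cycles(480, 480, vertex_units=8, pixel_units=40))  # 60
    print(unified_cycles(480, 480, total_units=48))                      # 20
```

In this toy model the fixed split leaves the pixel units idle while the vertex units are saturated; the unified pool avoids that, which is the kind of balancing the architecture is aiming for.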
There are 3 parallel groups of 16 shader units each. Each of the three groups can either operate on vertex or pixel data. Each shader unit is able to perform one 4 wide vector operation and 1 scalar operation per clock cycle. Current ATI hardware is able to perform two 3 wide vector and two scalar operations per cycle in the pixel pipe alone. The vertex pipeline of R420 is 6 wide and can do one vector 4 and one scalar op per cycle. If we look at straight up processing power, this gives R420 the ability to crunch 158 components (30 of which are 32bit and 128 are limited to 24bit precision). The Xbox GPU is able to crunch 240 32bit components in its shader units per clock cycle. While this is a 51% increase in the number of ops that can be done per cycle (as well as a general increase in precision), we can't expect these 48 pipelines to act like 3 sets of R420 pipelines. All things being equal, this increase (when only looking at ops/cycle) would be only as powerful as a 24 piped R420.
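The component counts above can be checked with a few lines of arithmetic. The sketch below assumes 16 pixel pipelines for R420, a figure the paragraph does not state but which is implied by its total of 128 (16 pipes x 8 components); the function and constant names are ours.

```python
def components_per_clock(units, vector_ops, vector_width, scalar_ops):
    """Components processed per clock across all units of one kind."""
    return units * (vector_ops * vector_width + scalar_ops)


# R420 pixel pipes: two 3-wide vector ops plus two scalar ops each.
# The pipe count of 16 is an assumption, see the note above.
R420_PIXEL = components_per_clock(16, vector_ops=2, vector_width=3, scalar_ops=2)

# R420 vertex pipes: 6 wide, one 4-wide vector op plus one scalar op each.
R420_VERTEX = components_per_clock(6, vector_ops=1, vector_width=4, scalar_ops=1)

# Xbox 360 GPU: 3 groups of 16 units, one 4-wide vector op plus one scalar op.
XENOS = components_per_clock(3 * 16, vector_ops=1, vector_width=4, scalar_ops=1)

R420_TOTAL = R420_PIXEL + R420_VERTEX
INCREASE = XENOS / R420_TOTAL - 1

if __name__ == "__main__":
    print(R420_PIXEL, R420_VERTEX, R420_TOTAL)  # 128 30 158
    print(XENOS)                                # 240
    print(f"{INCREASE:.1%}")                    # 51.9%
```

The exact ratio is a little under 52%, which matches the article's figure of 51% if that figure was rounded down.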
What will make or break the difference between something like a 24 piped R420 and the unified shaders of the Xbox GPU is ho