NVIDIA GTX 970 Specifications Corrected, Memory Pools Explained
Vigile writes Over the weekend NVIDIA sent out its first official response to the claims of hampered performance on the GTX 970 and a potential lack of access to 1/8th of the on-board memory. Today NVIDIA has clarified the situation again, this time with some important changes to the specifications of the GPU. First, the ROP count and L2 cache capacity of the GTX 970 were incorrectly reported at launch (last September). The GTX 970 has 56 ROPs and 1792 KB of L2 cache compared to the GTX 980 that has 64 ROPs and 2048 KB of L2 cache; previously both GPUs claimed to have identical specs. Because of this change, one of the 32-bit memory channels is accessed differently, forcing NVIDIA to create 3.5GB and 0.5GB pools of memory to improve overall performance for the majority of use cases. The smaller, 0.5GB pool operates at 1/7th the speed of the 3.5GB pool and thus will lower total graphics system performance by 4-6% when added into the memory system. That occurs only when games request MORE than 3.5GB of memory allocation, though, which happens only in extreme cases and combinations of resolution and anti-aliasing. Still, the jury is out on whether NVIDIA has answered enough questions to temper the fire from consumers.
You pay for an airdrop containing the extra ROPS and Cache. It's contested, though, so you may or may not get it.
Sig Follows: "Suppose you were an idiot. And suppose you were a member of Congress. But I repeat myself." -- Mark Twain
How about giving us the option to either always run at maximum speed (disable that last 0.5GiB) or always let the software use the full 4GiB (at the cost of speed if more than 3.5GiB is required)?
What about those users (more than one, anecdotes are data not anomalies!) whose use causes the GPUs to attempt to address more than 3.5GB VRAM causing them to crash out? If what NVidia are claiming here according to TFS is accurate, then this should not be happening. It is happening, the 3.5GB roof is being hit hard and people are feeling it. What say you, NVidia?
Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
Consumers are fine. The only benchmark that matters to a normal consumer is "How fast does it run my games?" and the answer for the 970 is "Extremely damn fast." It offers performance quite near the 980, for most games so fast that your monitor's refresh rate is the limit, and does so at half the cost. It is an extremely good buy, and I say this as someone who bought a 980 (because I always want the highest end toy).
Some people on forums are trying to make hay about this because they like to whine, but if you STFU and load up a game the thing is just great. While I agree companies need to keep their specs correct, the idea that this is some massive consumer issue is silly. The spec heads on forums are being outraged because they like to do that, regular consumers are playing their games happily, amazed at how much power $340 gets you these days.
As usual, AnandTech's article is generally the best technical reporting on the matter
Key takeaways (aka tl;dr version):
* Nvidia's initial announcement of the specs was wrong, but only because the technical marketing team wasn't notified that you could partially disable a ROP unit with the new architecture. They overstated the number of ROPs by 8 (was 64, actually 56) and the amount of L2 cache by 256KB (was 2MB, actually 1.75MB). This was quite unlikely to be a deliberate deception, and was most likely an honest mistake.
* The card effectively has two performance cliffs for exceeding memory usage. Go over 3.5GB, and it drops from 196GB/s to 28GB/s; go over 4GB and it drops from 28GB/s to 16GB/s as it goes out to main memory. This makes it act more like a 3.5GB card in many ways, but the performance penalty isn't quite as steep, and it intelligently prioritizes which data to put in the slower segment.
* The segmented memory is not new; Nvidia previously used it with the 660 and 660 Ti, although for a different reason.
* Because, even with the reduced bandwidth, the card is bottlenecked elsewhere, this is unlikely to cause actual performance issues in real-world cases. The only things that currently show it are artificial benchmarks that specifically test memory bandwidth, and most of those were written specifically to test this card.
* As always, the only numbers that matter for buying a video card are benchmarks and prices. I'm a bigger specs nerd than most, but even I recognize that the thing that matters is application performance, not theoretical. And the application performance is good enough for the price that I'd still buy one, if I were in the market for a high-end but not top-end card.
Not a shill or fanboy for Nvidia - I use and recommend both companies' cards, depending on the situation.
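For anyone who wants to play with the two "cliffs" described in the takeaways above, here is a rough Python sketch. It uses only the figures quoted in that list (196GB/s for the first 3.5GB, 28GB/s for the next 0.5GB, 16GB/s once it spills to main memory). The function names are made up for illustration, and the model assumes the whole working set is read once with the tiers read one after another, which is a pessimistic worst case and ignores the driver keeping rarely used data in the slow segment.

```python
# Tier name, capacity in GB, bandwidth in GB/s, as quoted in the comment above.
# The last tier (system memory) is treated as unlimited for this sketch.
TIERS = [
    ("fast VRAM", 3.5, 196.0),
    ("slow VRAM", 0.5, 28.0),
    ("system memory", float("inf"), 16.0),
]


def split_working_set(size_gb):
    """Fill the tiers in order and return [(tier_name, gb_placed_there)]."""
    remaining = size_gb
    placement = []
    for name, capacity, _bandwidth in TIERS:
        used = min(remaining, capacity)
        placement.append((name, used))
        remaining -= used
    return placement


def sweep_time_ms(size_gb):
    """Milliseconds to read the whole working set once, one tier at a time."""
    seconds = 0.0
    for (_name, used), (_n, _cap, bandwidth) in zip(split_working_set(size_gb), TIERS):
        seconds += used / bandwidth
    return seconds * 1000.0


def effective_bandwidth(size_gb):
    """Average GB/s over one full sweep of the working set."""
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    return size_gb / (sweep_time_ms(size_gb) / 1000.0)


if __name__ == "__main__":
    for size in (3.0, 3.5, 4.0, 4.5):
        print(f"{size} GB: {effective_bandwidth(size):.1f} GB/s average")
```

Under these assumptions a working set that fits in 3.5GB averages the full 196GB/s, while one that uniformly touches all 4GB averages about 112GB/s, because the slow half-gig takes as long to read as the other 3.5GB combined. Real games do not touch memory uniformly, which is why the measured hit is much smaller.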
AMD isn't involved here. This is NVIDIA...
A particular high performance car has a premium 8 cylinder engine and 32 valves at 400 hp. They also sell a non-premium version which is also 8 cylinders but only 30 valves and makes 350 hp but is a lot cheaper. The difference is that one cylinder is missing two valves which lowers its maximum power compared to the premium version. The engine's computer correctly controls the engine to compensate for the one weird cylinder, but someone in the marketing department sold the car as having 32 valves when it only had 30. The 350 hp figure is accurate, but some people complain because if they reprogram the engine control chip to force the one 2-valve cylinder to run at the same conditions as the other 4-valve cylinders, the car only makes 300 hp. But in all normal circumstances the car performs as advertised, only it was initially sold with incorrect details as to how the engine was put together to make it nearly as fast for much cheaper than the premium version.
This coward has just clearly demonstrated that there is in fact such a thing as a dumb question.
I don't hold Nvidia in high regard, but it's still higher than ATI, and besides, how many other graphics vendors do gamers have to go to?
(And before you say it, Intel is only the choice of idiots when it comes to graphics for gamers. Maybe someday they'll get their header out of their asterisk, but it hasn't happened yet.)
If consumers end up getting anything, at best it'll probably be an old game download 3 years from now. Those kinds of lawsuits or corporate "apologies" never seem to be worth it, and yet an old game is still far better than most of them; just look at Sony's "apologies" to its users.
I just bought a GTX970 and I'm chuffed with it, 89% of the performance of a GTX980 for 60% of the price, I couldn't justify spending the thick end of five hundred quid for a graphics card and the GTX970 will run pretty much anything I throw at it.
Captcha : illusion
Wow straight to the "race to the bottom" hey mate? Your masters have trained you well... Why instead of condemning false advertising, do you excuse it as a common practice..?
I thought about shelling out the dough for the 980s, but didn't, because, well - same GPU, lower clock, right?
Wrong.
Less than happy about this.
..don't panic
The reason you don't go out and buy the latest AAA title is because, in recent years, they haven't been living up to the hype. Buggy, unfinished, and not quite the product that is expected. ( You know . . . . playable. )
:|
Wondering if we have to start doing the same thing with hardware now. Let the same folks who pre-order game titles beta-test this stuff for a few months to determine if the marketing claims are legitimate or not, then decide on if you should buy it.
" Just ship the damn thing ! We'll update the drivers later. "
Your statement that the last .5GB is not running at 1150MHz is as factually correct as Nvidia's statement that the card had 64 ROPs...
The issue isn't with the speed of the RAM, it's with the setup of the connections between the RAM and the GM204 GPU: the entire last 1GB of RAM is accessed using a single L2 interface, while the other six L2 interfaces only handle .5GB each.
Each of the seven L2 interfaces in the GPU can handle roughly 22GB/s bandwidth to RAM, and data in RAM is interleaved between interfaces, so to reach peak bandwidth (~150GB/s) the last L2 interface dedicates its full bandwidth to just its first .5GB of RAM. Otherwise the first six L2 interfaces (3GB VRAM) would spend half their time waiting for the last interface (1GB) to catch up, since it's reading or writing twice as much data, netting us a peak bandwidth of only ~75GB/s total. Only when 3.5GB is already used does the seventh L2 interface start splitting its performance to use the remaining .5GB. Fortunately NVidia is at least competent enough to put the least bandwidth-intensive stuff on that last .5GB, so it doesn't slow down the whole system anywhere close to what the worst-case scenario would have predicted.
Source: TFA
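The interleaving argument above can be sketched in a few lines of Python. This is a toy model, not how the hardware is actually scheduled: it assumes seven interfaces at the ~22GB/s figure quoted above, perfect striping, and that a striped transfer finishes only when the busiest interface is done. The function name and the share lists are hypothetical.

```python
PER_INTERFACE_GBPS = 22.0  # rough per-interface figure quoted in the comment above


def striped_bandwidth(shares_gb, per_interface_gbps=PER_INTERFACE_GBPS):
    """Peak GB/s when interface i holds shares_gb[i] of evenly striped data.

    All interfaces run in parallel, so a full pass over the data takes as
    long as the interface with the largest share needs.
    """
    total = sum(shares_gb)
    busiest = max(shares_gb)
    pass_time_s = busiest / per_interface_gbps
    return total / pass_time_s


# 3.5GB pool: seven interfaces, .5GB each, all equally loaded.
fast_pool = striped_bandwidth([0.5] * 7)

# Flat 4GB layout: six interfaces with .5GB, the seventh carrying 1GB.
flat_layout = striped_bandwidth([0.5] * 6 + [1.0])

if __name__ == "__main__":
    print(f"3.5GB pool striped over 7 interfaces: {fast_pool:.0f} GB/s")
    print(f"flat 4GB with one double-loaded interface: {flat_layout:.0f} GB/s")
```

With these assumptions the 3.5GB pool comes out at about 154GB/s and the flat 4GB layout at about 88GB/s, so the simple model lands somewhat above the ~75GB/s quoted; treat both as ballpark figures. The point is the same either way: one double-loaded interface drags down every access, which is the reason for splitting off the last .5GB.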
The factor is 1/7.
Use this formula: 1/7 * speed
The practical effect is still similar to the last 0.5GB running at 1150/7 MHz.
There is a 200 dollar price difference for the cards on Newegg. You don't stay in business by undercutting yourself. Anyone who thought the cards would be identical is beyond retarded.
ever heard of Fire and Quadro series... ?
world was created 5 seconds before this post as it is.
Shouldn't the driver be responsible for making sure that "slower" data ends up in that half gig? Something besides the frame buffer or textures. I bet that better memory management could completely hide the problem.