NVIDIA GTX 970 Specifications Corrected, Memory Pools Explained
Vigile writes Over the weekend NVIDIA sent out its first official response to the claims of hampered performance on the GTX 970 and a potential lack of access to 1/8th of the on-board memory. Today NVIDIA has clarified the situation again, this time with some important changes to the specifications of the GPU. First, the ROP count and L2 cache capacity of the GTX 970 were incorrectly reported at launch (last September). The GTX 970 has 56 ROPs and 1792 KB of L2 cache compared to the GTX 980 that has 64 ROPs and 2048 KB of L2 cache; previously both GPUs claimed to have identical specs. Because of this change, one of the 32-bit memory channels is accessed differently, forcing NVIDIA to create 3.5GB and 0.5GB pools of memory to improve overall performance for the majority of use cases. The smaller, 500MB pool operates at 1/7th the speed of the 3.5GB pool and thus will lower total graphics system performance by 4-6% when added into the memory system. That occurs when games request MORE than 3.5GB of memory allocation though, which happens only in extreme cases and combinations of resolution and anti-aliasing. Still, the jury is out on whether NVIDIA has answered enough questions to temper the fire from consumers.
As usual, AnandTech's article is generally the best technical reporting on the matter.
Key takeaways (aka tl;dr version):
* Nvidia's initial announcement of the specs was wrong, but only because the technical marketing team wasn't notified that you could partially disable a ROP unit with the new architecture. They overstated the number of ROPs by 8 (was 64, actually 56) and the amount of L2 cache by 256KB (was 2MB, actually 1.75MB). This was quite unlikely to be a deliberate deception, and was most likely an honest mistake.
* The card effectively has two performance cliffs for exceeding memory usage. Go over 3.5GB, and it drops from 196GB/s to 28GB/s; go over 4GB and it drops from 28GB/s to 16GB/s as it goes out to main memory. This makes it act more like a 3.5GB card in many ways, but the performance penalty isn't quite as steep, and it intelligently prioritizes which data to put in the slower segment.
* The segmented memory is not new; Nvidia previously used it with the 660 and 660 Ti, although for a different reason.
* Because, even with the reduced bandwidth, the card is bottlenecked elsewhere, this is unlikely to cause actual performance issues in real-world cases. The only things that currently show it are artificial benchmarks that specifically test memory bandwidth, and most of those were written specifically to test this card.
* As always, the only numbers that matter for buying a video card are benchmarks and prices. I'm a bigger specs nerd than most, but even I recognize that the thing that matters is application performance, not theoretical. And the application performance is good enough for the price that I'd still buy one, if I were in the market for a high-end but not top-end card.
Not a shill or fanboy for Nvidia - I use and recommend both companies' cards, depending on the situation.
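To make the two performance cliffs from the takeaways concrete, here is a rough sketch in Python. It is a toy model, not how the driver actually works: the function name and the idea of mapping total allocation straight to a single bandwidth figure are assumptions for illustration only. The bandwidth numbers are the ones quoted above, and the real card prioritizes which data lands in the slow segment, so actual behavior is less abrupt than this.

```python
# Toy model of the GTX 970 memory tiers, using the figures quoted in the comment.
# Assumption: we report the bandwidth of the slowest tier an allocation touches.
FAST_SEGMENT_GBPS = 196.0   # 3.5GB segment
SLOW_SEGMENT_GBPS = 28.0    # 0.5GB segment
SYSTEM_MEMORY_GBPS = 16.0   # spill to main memory, as quoted above

def slowest_tier_bandwidth(allocated_gb):
    """Return the bandwidth (GB/s) of the slowest tier in use (hypothetical helper)."""
    if allocated_gb <= 3.5:
        return FAST_SEGMENT_GBPS
    if allocated_gb <= 4.0:
        return SLOW_SEGMENT_GBPS
    return SYSTEM_MEMORY_GBPS

for gb in (3.0, 3.8, 4.5):
    print(f"{gb} GB allocated -> slowest tier runs at {slowest_tier_bandwidth(gb)} GB/s")

# The slow segment runs at 1/7th the speed of the fast one.
print(FAST_SEGMENT_GBPS / SLOW_SEGMENT_GBPS)
```

The last line is just the 1/7th ratio from the summary: 196 / 28 = 7.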
There's really no point to doing that. If you disable the memory and run at high resolutions with ultra textures and AA that would cause you to break that 3.5 GB barrier, your performance would just tank because you are exchanging with main memory. In other words, the performance of the card is at least what you would get from a 3.5 GB card. That extra 500 MB isn't hurting anything.
>> causing them to crash out
This is a blatantly misleading thing to say. The cards don't crash at all. The only thing that happens as a result of this is a properly handled decrease in real-world performance compared to the 980.
Are you seriously trying to claim that the 970 _should_ have the same performance as the 980?
>> I run three 30" 2560x1600s
That's pretty much irrelevant. GPU RAM isn't used that way at all. It's used to hold the 3D geometry, bitmaps, bump maps, etc. of assets and other processing data, which is largely if not completely independent of screen resolution or number of screens.
Do the math:
2560 x 1600 x 4 (4 bytes per pixel for 32-bit color) = 15.625 MB per screen; x 3 monitors = 46.875 MB total screen buffer for all 3 screens.
Even with triple buffering, your total screen buffer requirement for all 3 monitors is less than 150 MB.
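If you'd rather check the math than take my word for it, a quick Python sketch follows. The function name and parameters are made up for illustration, and it assumes plain uncompressed color buffers with no depth buffer or multisampled targets, which would add to the total.

```python
# Back-of-the-envelope screen buffer size, assuming uncompressed color buffers only.
MIB = 1024 * 1024  # "MB" in the comment above means 2^20 bytes

def screen_buffer_bytes(width, height, bytes_per_pixel=4, screens=1, buffers=1):
    """Bytes needed for the color buffers (hypothetical helper for illustration)."""
    return width * height * bytes_per_pixel * screens * buffers

one_screen = screen_buffer_bytes(2560, 1600) / MIB
three_screens = screen_buffer_bytes(2560, 1600, screens=3) / MIB
triple_buffered = screen_buffer_bytes(2560, 1600, screens=3, buffers=3) / MIB

print(one_screen)       # 15.625
print(three_screens)    # 46.875
print(triple_buffered)  # 140.625
```

So even three triple-buffered 2560x1600 screens take about 141 MB, a small slice of a 3.5 GB pool.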