NVIDIA Previews GF100 Features and Architecture
MojoKid writes "NVIDIA has decided to disclose more information regarding their next generation GF100 GPU architecture today. Also known as Fermi, the GF100 GPU features 512 CUDA cores, 16 geometry units, 4 raster units, 64 texture units, 48 ROPs, and a 384-bit GDDR5 memory interface. If you're keeping count, the older GT200 features 240 CUDA cores, 32 ROPs, and 80 texture units, but the geometry and raster units, as they are implemented in GF100, are not present in the GT200 GPU. The GT200 also features a wider 512-bit memory interface, but the need for such a wide interface is somewhat negated in GF100 because it uses GDDR5 memory, which effectively offers double the bandwidth of GDDR3, clock for clock. Reportedly, the GF100 will also offer 8x the peak double-precision compute performance of its predecessor, 10x faster context switching, and new anti-aliasing modes."
Why more disclosure now? There doesn't seem to be any major AMD or, gasp, Intel product launch in progress...
One that hath name thou can not otter
...but will it run Linux?
For making the better GPU? :P
I'm god, but it's a bit of a drag really...
I understand most of the time people who write about computers aren't exactly literature graduates, but wtf, at least write correctly. Use a spell checker or have someone proofread it.
Oblivion Awaits
Anandtech also has an article up about the GF100. They generally have very well written, in-depth articles: http://www.anandtech.com/video/showdoc.aspx?i=3721
From the article:
"The GPU will also be execute C++ code."
They integrate a C++ interpreter (or JIT compiler) into their graphics chip?
The Tao of math: The numbers you can count are not the real numbers.
is where it's at for scientific computation. Folks are moving their codes to GPUs now, betting the double-precision performance will get there soon. 8x increase in compute performance looks promising, assuming it translates into real world gains.
46 & 2
Why is it that they would stick with a 256-bit memory bus (aside from the fact that, clock for clock, it's really the same speed as a 512-bit bus of slower memory)? Is it just because the rest of the card is a bottleneck? I don't think I can recall another card where, all other things being equal, a wider bus didn't result in a sizable increase in processing power. It was obviously implemented in the previous generation of cards, so why not stick with it, use the GDDR5, and end up with a card that's even faster?
Can anyone explain to me why they would do this (or not do this, depending on how you look at it)?
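For what it's worth, here is a rough sketch of the "clock for clock" arithmetic. It assumes GDDR3 moves 2 bits per pin per memory clock and GDDR5 moves 4 (that is the "double the bandwidth" claim from the summary), and the 1000 MHz clock is a made-up number used only so the three configurations can be compared on equal terms. It is not the clock of any real card.

```python
# Peak memory bandwidth = bus width (bits) x transfers per clock x clock / 8.
# Assumption: GDDR3 moves 2 bits per pin per memory clock and GDDR5 moves 4,
# matching the "double the bandwidth, clock for clock" claim in the summary.
TRANSFERS_PER_CLOCK = {"GDDR3": 2, "GDDR5": 4}


def peak_bandwidth_gb_s(bus_bits, mem_type, clock_mhz):
    """Theoretical peak bandwidth in GB/s (10^9 bytes per second)."""
    bits_per_second = bus_bits * TRANSFERS_PER_CLOCK[mem_type] * clock_mhz * 1e6
    return bits_per_second / 8 / 1e9


CLOCK_MHZ = 1000  # hypothetical clock, identical for every configuration

for bus_bits, mem_type in [(512, "GDDR3"), (256, "GDDR5"), (384, "GDDR5")]:
    bw = peak_bandwidth_gb_s(bus_bits, mem_type, CLOCK_MHZ)
    print(f"{bus_bits}-bit {mem_type}: {bw:.0f} GB/s")
```

Under those assumptions a 256-bit GDDR5 bus matches a 512-bit GDDR3 bus, and a 384-bit GDDR5 bus comes out 50% ahead of both. Real cards run the two memory types at different clocks, so treat this as the shape of the argument rather than a benchmark.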
So we've had this long history with nvidia part numbers gradually increasing. 5000 series, 6000 series, etc. up until the 9000 series. At that point they needed to go to 10000, and the numbers were getting a bit unwieldy. So understandably, they decided to restart with the GT100 series and GT200 series. So now instead of continuing with a 300 series, we're going back to a 100. So we had the GT100 series and now we get the GF100 series? And GF? Seriously? People already abbreviate GeForce as GF, so now when someone says GF we can't be sure what they are talking about. Terrible marketing decision IMHO.
A wide memory bus is expensive in terms of card real estate (wider bus = more lines), which increases cost. It also increases the amount of logic in the GPU and requires more memory chips for the same amount of memory.
What happened to GDDR4?
This monster is already 550 mm^2; I don't think the couple million transistors needed to do a 512-bit bus would be noticed, nor would the cost of the pins to connect to the outside. The more likely explanation is that they aren't memory starved and that trying to route the extra high precision lanes on the board was either too hard or was going to require more layers in the PCB, which would add significant cost.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
They _think_ that the card won't be memory starved at the usual loads. More memory lanes means higher complexity also in assuring the same "distance" (propagation time) for all the memory chips.
People think that the newest AMD card (5970?) is huge, I wonder how big cards with Fermi will be, and how much bigger they should be if needing even more memory chips and memory lanes.
The wider your memory bus, the greater the cost. Reason is that it is implemented as more parallel controllers. So you want the smallest one that gets the job done. Also, faster memory gets you nothing if the GPU isn't fast enough to access it. Memory bandwidth and GPU speed are very intertwined. Have memory slower than your GPU needs, and it'll be bottlenecking the GPU. However have it faster, and you gain nothing while increasing cost. So the idea is to get it right at the level that the GPU can make full use of it, but not be slowed down.
Apparently, 256-bit GDDR5 is enough.
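The bottleneck argument above can be sketched as a toy model in which whichever side is slower sets the pace. Both the function name and the GB/s figures below are hypothetical, picked only to show the three cases (memory too slow, matched, memory faster than needed).

```python
def effective_throughput(gpu_demand_gb_s, memory_supply_gb_s):
    """Toy model: the slower side sets the pace (GB/s actually consumed)."""
    return min(gpu_demand_gb_s, memory_supply_gb_s)


GPU_DEMAND = 150  # hypothetical GB/s the GPU could consume at full speed

for supply in (100, 150, 200):
    used = effective_throughput(GPU_DEMAND, supply)
    unused = supply - used
    print(f"memory {supply} GB/s -> used {used} GB/s, unused {unused} GB/s")
```

In this model, raising the supply from 100 to 150 helps, while raising it from 150 to 200 only adds bandwidth that is paid for and never used, which is the point about sizing the bus to the GPU.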
Not to mention there is certainly not a 1:1 gain in speed from doubling the bandwidth. Double bandwidth is nice for, say, copying blocks of memory, but it doesn't help for performing operations, and sometimes added latencies can make it underperform slower memory. Early DDR3, for instance, had CAS latencies double or more those of DDR2 without a huge gain in bandwidth (800 to 1066) and often could be beaten by much cheaper DDR2. Without a more comprehensive analysis it is hard to say which is faster.
Now that graphics are largely stagnant in between console generations, the PC's graphics advantages tend to be limited to higher resolution, higher framerate, anti-aliasing, and somewhat higher texture resolution. If the huge new emphasis on tessellation in GF100 strikes a chord with developers, and especially if something like it gets into the next console generation, games may ship with much more detailed geometry which will then automatically scale to the performance of the hardware on which they're run. This would allow PC graphics to gain the additional advantage of having an order of magnitude increase in geometry detail, which would make more of a visible difference than any of the advantages it currently has, and it would occur with virtually no extra work by developers. It would also allow performance to scale much more effectively across a wide range of PC hardware, allowing developers to simultaneously hit the casual and enthusiast markets much more effectively.
"I zero-index my hamsters" - Willtor (147206)
Prescott dissipated 105W from only 112 mm^2, or about twice the power density of this chip, so I don't think cooling will be a major problem.
Especially as this announcement came out during winter (for those of us in the northern hemisphere).
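The power-density comparison is easy to check with the numbers quoted in this thread (105W and 112 mm^2 for Prescott, 550 mm^2 for GF100). Note that the GF100 wattage below is only what the "about twice" claim would imply; it is not a published or measured figure.

```python
def power_density(watts, area_mm2):
    """Power density in W/mm^2."""
    return watts / area_mm2


# Prescott figures as quoted in the comment above.
prescott = power_density(105, 112)

# Die size quoted earlier in the thread.
GF100_AREA_MM2 = 550

# If Prescott really is "about twice" as dense, GF100 sits at half that
# density. The wattage this implies is an inference, not a spec.
implied_gf100_watts = (prescott / 2) * GF100_AREA_MM2

print(f"Prescott: {prescott:.2f} W/mm^2")
print(f"Implied GF100 power at half that density: {implied_gf100_watts:.0f} W")
```

So the claim amounts to assuming a board in the neighborhood of 250W. Lower density per mm^2 helps the heatsink, but the total heat to get out of the case is still well over double Prescott's.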
What video card do people recommend you fit in your PC nowadays?
a) on a budget (say £50)
b) average (say £100)
c) with a bigger budget (say £250)
Bonus points if you can recommend a good (fanless) silent video card....
Donte Alistair Anderson Roberts - hi son!
Karma: Chameleon
nor would the cost of the pins to connect to the outside.
Are you kidding? The pin driver pads take up more die real-estate than anything else (and they suck up huge amounts of power as well). Even on now-ancient early 80's ICs, the pads were gargantuan compared to any other logic. E.g. a logic module vs. a pad was a huge difference... like looking at a satellite map of a football field (pad) with a car parked beside it (logic module). These days, that's only gotten orders of magnitude worse as pin drivers haven't shrunk much at all when compared to current logic process sizes.
There's a reason that there's a fair bit of active research towards integrating optical off chip communication directly into current silicon processes. The hope is that such approaches will represent a big improvement in chip die size, power dissipation, and available bandwidth -- all just from removing the pad drivers and pins from the equation.
Huh? The outer left and right rows on this picture are the memory controllers; that's what, 5-10% of the total die area? Adding 1/3rd more pins would add a couple percent to the overall cost of the chip. Now on lower level parts where there are half as many logic units it would be more significant, but there's a reason that lower end parts have less memory bandwidth (and they need less since they can process less per clock).
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
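A back-of-the-envelope version of the "couple percent" estimate above, assuming the memory interface area grows linearly with bus width. That linearity is a simplification made for this sketch, as is the function name. The 5-10% share is the eyeballed range from the comment, and 384 to 512 bits is the "1/3rd more pins" case.

```python
def added_die_area_pct(interface_share_pct, old_bus_bits, new_bus_bits):
    """Extra die area (% of the original die), assuming the memory
    interface area scales linearly with bus width (a simplification)."""
    growth = (new_bus_bits - old_bus_bits) / old_bus_bits
    return interface_share_pct * growth


# 5% and 10% are the low and high ends of the eyeballed interface share.
for share in (5, 10):
    extra = added_die_area_pct(share, 384, 512)
    print(f"interface at {share}% of die -> about {extra:.1f}% more area")
```

That lands at roughly 1.7% to 3.3% more die area. It says nothing about pad-limited layouts or board routing, which are the objections raised elsewhere in the thread.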
from GF100 to MRS100.
I eat only the real part of complex carbohydrates.
also made your PC sound like a vacuum cleaner.
Platform advocacy is like choosing a favorite severely developmentally disabled child.
"But wait, there's still more!"
Increasing the bus size has the effect of increasing the perimeter of the chip, which drives up costs because of the increased die area.
I was disappointed by that article as well. It was obvious even then that they simply weren't exaggerating enough.
Can you be Even More Awesome?!
GF100 to MRS100 is easy, it is the future downgrade from MRS100 to MR50/MS50 that's the real killer ;^)
Graphical hardware power is a problem on consoles not PC. Despite their much touted power the PS3 or Xbox360 cannot do FSAA at 1080p. Most developers have resorted to software solutions (hacks, for all intents and purposes) to get rid of jaggedness.
Most games made for consoles will work the same, if not better, on a low end PC (if they don't do a crappy job of porting, though Xbox to PC is pretty hard to screw up these days). The problem with PC gaming is that it is not utilised to its fullest extent. Most games are console ports or PC games bought up at about 60% completion and then consolised.
PC graphics at 1280x1024 and upwards tend to look pretty good. Compare that to Xbox (720p) or PS3 (1080p), which still look pretty bad at those resolutions. Check out the screenshots of Fallout 3 or Far Cry 2; the PC version always looks better no matter the resolution. According to the latest Steam survey, 1280x1024 is still the most popular resolution, 1680x1050 the second.
If you have the power, why not use it.
Don't get me wrong, however: progress and new ideas are a good thing, but the PC gaming market is far from in trouble.
Calling someone a "hater" only means you can not rationally rebut their argument.