GeForce FX Architecture Explained

Posted by CowboyNeal on Thursday September 11, 2003 @04:18PM from the kicking-the-tires dept.

Brian writes "3DCenter has published one of the most in-depth articles on the internals of a 3D graphics chip (the NV30/GeForce FX in this case) that I've ever seen. The author has based his results on a patent NVIDIA filed last year and he has turned up some very interesting revelations regarding the GeForce FX that go a long way to explain why its performance is so different from the recent Radeons. Apparently, optimal shader code for the NV30 is substantially different from what is generated by the standard DX9 HLSL compiler. A new compiler may help to some extent, but other performance issues will likely need to be resolved by NVIDIA in the driver itself."

I wonder what a structured classroom approach... by tloh · 2003-09-11 16:42 · Score: 5, Interesting

Weird timing. I'm currently writing code for a class on microcontrollers. Most electrical engineering students will at some point come across an advanced digital course on microprocessors, where one learns about different machine architectures and how to write assembly code for them. Are there any /.ers who have systematically studied GPU chips as part of a class, say on graphics algorithms or DSP?

--
Stay sentient. Don't drink bad milk.
On the other hand... by La Temperanza · 2003-09-11 16:58 · Score: 3, Interesting

NVidia has much better Linux drivers than ATI. Support 'em.

--
est modus in rebus
One assumption is probably wrong by hbog · 2003-09-11 16:59 · Score: 5, Interesting

From the article - "Because of the length of the pipeline and the latencies of sampling textures it is possible that the pipeline is full before the first quad reaches its end. In this case the Gatekeeper has to wait as long as is takes the quad to reach the end. Every clock cycle that passes means wasted performance then. An increased number of quads in the pipeline lowers the risk of such pipeline stalls."

I understand that the article writers are trying to come up with reasons that the Nvidia part is wasting performance, but this doesn't make sense. No architect in his right mind would ever design a pipeline that becomes full before the first instruction can exit. That means you are fetching much faster than you are retiring instructions, so you will always have a pipeline stall at the front end and you will always be wasting cycles. I think the designers would have checked something like that. You can't afford pipeline stalls to happen regularly.
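
To make the disagreement concrete, here is a minimal toy model, not the NV30 design. It assumes a single-pass pipeline that takes `depth` cycles to traverse, a gatekeeper that can issue one quad per cycle, and a hard cap `max_in_flight` on quads inside the pipeline. All three names are hypothetical; neither the article nor the patent is the source for them.

```python
from collections import deque


def simulate(depth, max_in_flight, total_quads):
    """Toy model: return (cycles, quads_per_cycle) for a capped pipeline."""
    in_flight = deque()  # exit cycle of each quad currently in the pipeline
    issued = 0
    retired = 0
    cycle = 0
    while retired < total_quads:
        # Retire the quad that reaches the end of the pipeline this cycle.
        if in_flight and in_flight[0] <= cycle:
            in_flight.popleft()
            retired += 1
        # The gatekeeper issues one quad per cycle, but only if a slot is free.
        if issued < total_quads and len(in_flight) < max_in_flight:
            in_flight.append(cycle + depth)
            issued += 1
        cycle += 1
    return cycle, total_quads / cycle


if __name__ == "__main__":
    for cap in (2, 4, 8):
        cycles, rate = simulate(depth=4, max_in_flight=cap, total_quads=8)
        print(f"cap={cap}: {cycles} cycles, {rate:.2f} quads/cycle")
```

In this sketch the gatekeeper only idles when the cap is smaller than the pipeline depth. Once the cap reaches the depth, raising it further changes nothing, which is roughly the balance a designer would be expected to aim for.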
Re:Anand tells the tale by Bob-o-Matic! · 2003-09-11 17:09 · Score: 2, Interesting

damn-- too much sierra nevada pale ale... hosed the hyperlink to the article
GeforceFX by BigFootApe · 2003-09-11 19:17 · Score: 5, Interesting

This article seems to reiterate what everyone has been saying (Carmack, Valve, everyone). The GeforceFX architecture can only be made competitive for 3d engines using modern shaders with herculean effort. This is to be competitive, not dominantly superior.

Honestly, I thought nVidia learned their lesson with the NV1 - don't make weird hardware.

Now, what has to be making GeforceFX owners worried is Gabe Newell's warning that the new Detonator drivers might be making illegitimate 'optimizations' and, furthermore, covering them up by rendering high quality screen captures.
Re:Say what by Anonymous Coward · 2003-09-11 20:45 · Score: 4, Interesting

No need for the tinfoil hat.

The most complex part of a DX8 or DX9 chip is the Pixel Shader, so I'll concentrate on it. Nvidia spearheaded the development of PS1.1 for DX8.

Then ATI stole the show with PS1.4 (DX8.1), which is much closer to PS2.0 than PS1.1. At this point, ATI got Microsoft's ear -- ATI was ahead of Nvidia in implementing programmable shaders in graphics hardware.

So Microsoft had good reason to pay attention to ATI's ideas for DX9 (including what the HLSL should look like and what kind of assembly it should output), long before any Xbox 1 money issues with Nvidia, long before choosing the designer for Xbox 2 graphics/chipset.

I guess ;-)
Re:Say what by Gojira Shipi-Taro · 2003-09-11 23:50 · Score: 2, Interesting

ATI also brings a new level to "Abandoning support for hardware as soon as we think we can get away with it". I've had one video card and one tv tuner from ATI, each of which I bought right around the introduction of a new Windows version. In both cases, ATI dropped support for the card shortly after. In the case of the video card, they never did release WDM drivers. In the case of the tuner, they released only a version or two that "kind of" worked under 2000 pro and XP, then decided the remaining major bugs were too hard and dropped support.

I've never had this problem with Nvidia, so even if they are slower from time to time, I'll stick with a company that doesn't screw me.

I appreciate the fact that ATI is increasing competition these days, but they'll never get another cent of mine.

--
"Oh my God. This is terrible. This is the end of my Presidency. I'm fucked." ~ Donald J. Trump
Re:But can you hack a GeForce like you can hack Ra by top_dog_nuke · 2003-09-12 00:45 · Score: 2, Interesting

Oh, but you are wrong my friend. :) Newegg has these old style cards in stock. I bought one last week. I have all 8 pipelines after the soft mod.
Notice in the picture the arrangement of the memory chips AROUND the core.
http://www.newegg.com/app/ViewProduct.asp?catalog=48&DEPA=1&submit=property&mfrcode=0&propertycode=&propertycodevalue=4396,3668
Re:I wonder what a structured classroom approach.. by SmackCrackandPot · 2003-09-12 03:56 · Score: 2, Interesting

I did a project a while back using TIGA and the associated chips. You might want to take a look at , FPGA and User Interface Guide.

It's all obsolete and legacy now, but it gives you a good idea of how a current-day graphics card is designed. Back then, the various components had to be implemented on separate chips (e.g. RAMDACs, clock oscillators, memory decoding, graphics).

TI also had the TMS34082 vector processor. You could have up to four of those in a slave/master configuration (a bit like the PS2 VU0 processor). The TMS34020 supported 1/2/4/8/16/24/32-bit pixel sizes and had a parallelogram rendering instruction (two of those allow you to render a triangle). If they had kept the product range going and allowed Moore's law to keep going, they would probably have been able to keep up with 3Dfx.

Intel also has the i860 which combined the floating point and graphics processing onto a single chip. The Intel XEON chip still supports this instruction set.

If you can access the IEEE and ACM archives, you'll find out about dozens more such processors.

Presently, you should have a look at the OpenGL extensions ARB_vertex_program (http://oss.sgi.com/projects/ogl-sample/registry/ARB/vertex_program.txt) and ARB_fragment_program (http://oss.sgi.com/projects/ogl-sample/registry/ARB/fragment_program.txt).

Any Google search on these topics will provide an almost infinite list of topics.