Future of 3d Graphics
zymano writes "Extremetech
has this nice article on the future of 3d graphics. The article also mentions that graphics card GPUs can be used for non-traditional, computationally heavy processing like physics. A quote from the article, "GPU can be from 10 to 100 times faster than a Pentium 4 and Scientific computations such as linear algebra, Fast Fourier Transforms, and partial differential equations can benefit". My question - If these cards are getting so powerful at computations then why do we need an Intel/AMD processor at all? Just make a graphics card with more transistors and drop the traditional processor..."
Just make a graphics card with more transistors and drop the traditional processor..."
Apple have done this several years ago. The Newton 2000 and 2100 didn't use a CPU but rather the graphics processor.
Because GPUs are NOT general purpose devices. A normal processor, like a P4, can be programmed to do anything. It might take a real long time, but it is a general purpose processor and so can process anything, including emulating other processors. A GPU is not. It does one thing and one thing only: pushes pixels. Now, more modern GPUs have gained some limited programmability, but they still aren't general purpose processors.
GPUs work with limited precision -- IEEE single precision is typical. This is good enough for 3D graphics -- after all, in the end you'll be limited by the 10-11 bit spatial resolution and 8 bit color resolution -- but not good enough for most scientific problems, which typically require a minimum of double precision.
Simulating higher precision with single precision arithmetic is possible, but the performance penalty is too severe for it to be useful.
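To make the precision point concrete, here is a minimal Python sketch. It assumes we can imitate IEEE single precision by rounding every intermediate result through `struct`; the helper names `f32`, `sum_single`, and `sum_kahan_single` are made up for illustration. The second function uses compensated (Kahan) summation, which is a simpler relative of full double-single arithmetic rather than the real thing, but it shows the same trade: the lost bits come back, at the price of four rounded operations per element instead of one.

```python
import struct

def f32(x):
    """Round a Python float (an IEEE double) to the nearest IEEE single."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_single(values):
    """Accumulate with every intermediate result rounded to single precision."""
    total = 0.0
    for v in values:
        total = f32(total + f32(v))
    return total

def sum_kahan_single(values):
    """Kahan summation in single precision: four rounded operations per element."""
    total = 0.0
    comp = 0.0  # running estimate of the rounding error lost so far
    for v in values:
        y = f32(f32(v) - comp)
        t = f32(total + y)
        comp = f32(f32(t - total) - y)
        total = t
    return total

# 2**24 is where single precision stops being able to represent every integer.
values = [2.0 ** 24] + [1.0] * 10   # the exact sum is 16777226

print(sum_single(values))        # plain single-precision accumulation
print(sum_kahan_single(values))  # compensated single-precision accumulation
print(sum(values))               # ordinary double-precision sum
```

With these inputs the plain single-precision sum never moves off 16777216.0, while the compensated version and the double-precision sum both reach 16777226.0.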
Because it isn't cheaper if you need hundreds of these simple units. The good thing about a DSP (which is what a GPU is, after a fashion) is that because it is specialised to a single operation, it can be highly optimised and do it much quicker than a general purpose CPU. However the good is also the bad: the DSP is highly specialised and can ONLY do that operation, or at least only do it efficiently.
Take digital audio. It used to be that CPUs were too pathetic to do even simple kinds of digital audio ops in realtime, so you had to offload everything to dedicated DSPs. Pro Tools did this: you bought all sorts of expensive, specialised hardware and loaded your Mac full of it so it could do real time audio effects. Now, why bother? It is much cheaper to do it in software since processors ARE fast enough. Also, if a new kind of effect comes out, or an upgrade, all you have to do is load new software, not buy new hardware.
Also, if you like, you can get DSPs to do a number of computationally intensive things. As mentioned, the GPU is real common. They take over almost all graphics calculations (including much animation with things like vertex shaders) from the CPU. Another thing along the games line is a good soundcard. Something like an Audigy 2 comes with a DSP that will handle 3d positioning calculations, reflections, occlusions and such. If you want a video en/decoder those are available too. MPEG-2 decoders are pretty cheap; the encoders cost a whole lot more. Of course the en/decoder only works for the video formats it was built for, nothing else. You can also get processors to help with things like disk operations: high end SCSI and IDE RAID cards have their own processor on board to take care of all those calculations.
GPUs are highly specialized. In graphics processing, you generally perform the same set of operations over and over again. Also, pixels can be rendered concurrently - as such, graphics hardware can be extremely parallel in nature. Also, in graphics hardware, there isn't much (if any) branching in code. Simple shader code just runs through the same set of operations over and over again.
"Normal" code, such as a game engine, compiler, word processor or MP3/DivX encoder does all sorts of different operations, in a different order each time, many which are inherently serial in nature and don't scale well with parallel processing. This type of code is full of branches.
To optimize graphics processing, you can really just throw massively parallel hardware at it. Modern cards do what, 16 pixels/texels per cycle? 4+ pipelines for each stage all doing the EXACT same thing?
Regular code just isn't like that. Because different operations have to happen each time and in each program, you can't optimize the hardware for one specific thing. In serial applications, extra pipelines just go to waste. Also, frequent branch instructions mean that you have to worry about things like branch prediction (which takes up a fair amount of die space). When you do have operations that can happen in parallel (such as make -j 4), the different pipelines are doing different things.
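A rough sketch of that contrast in Python. Everything here is hypothetical: `shade` stands in for a simple branch-free shader, `render_in_pipelines` imitates splitting the work over several pipelines, and `running_state` is a made-up serial recurrence. The per-pixel work can be dealt out to any number of pipelines in any order and still give the same picture, while the recurrence has to be walked one step at a time.

```python
def shade(pixel):
    """A branch-free per-pixel operation: the same arithmetic for every input."""
    return (pixel * 3 + 7) % 256

def render_serial(pixels):
    """One pipeline handling the pixels in order."""
    return [shade(p) for p in pixels]

def render_in_pipelines(pixels, n_pipelines=4):
    """Deal the pixels out to hypothetical pipelines that never talk to each other."""
    out = [None] * len(pixels)
    for lane in range(n_pipelines):
        for i in range(lane, len(pixels), n_pipelines):
            out[i] = shade(pixels[i])
    return out

def running_state(values, seed=1):
    """A serial chain: step i needs the result of step i-1, so extra lanes sit idle."""
    state = seed
    for v in values:
        state = (state * 31 + v) % 65521
    return state

pixels = list(range(16))
print(render_in_pipelines(pixels) == render_serial(pixels))  # same image either way
print(running_state([1, 2, 3]), running_state([3, 2, 1]))    # order matters here
```

The loop over `lane` runs sequentially here; on real hardware those lanes would run at the same time, which is only safe because no call to `shade` depends on another.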
Take your GeForce GPU and P4 and see which can count to 2 billion faster. In a task like this, where both processors can probably do one add per cycle (no parallelizing in this code), the 2GHz P4 will take one second, and the 500MHz GeForce will take four seconds (assuming it can be programmed for a simple operation like "ADD"). Even if you throw in more instructions but the code cannot be parallelized, the CPU will probably win.
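The arithmetic behind those numbers, as a tiny Python sketch. The function name `seconds_needed` is made up, and the 16 operations per cycle in the last line is only an assumption borrowed from the "16 pixels/texels per cycle" guess earlier in this comment; it applies only to work that actually parallelizes.

```python
def seconds_needed(n_ops, clock_hz, ops_per_cycle=1):
    """Time for n_ops operations at a given clock rate and per-cycle throughput."""
    return n_ops / (clock_hz * ops_per_cycle)

N = 2_000_000_000  # count to 2 billion, one add per step

cpu_serial = seconds_needed(N, 2_000_000_000)   # 2GHz P4, one add per cycle
gpu_serial = seconds_needed(N, 500_000_000)     # 500MHz GPU, one add per cycle

# Assumption: the same number of operations, but fully parallel across 16 lanes.
gpu_parallel = seconds_needed(N, 500_000_000, ops_per_cycle=16)

print(cpu_serial, gpu_serial, gpu_parallel)
```

Serially the slower clock loses by the ratio of the clock rates; the GPU only pulls ahead when the work can be spread across its lanes.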
Basically, since you can't target one specific application, a general purpose processor will always be slower at some things - but can do a much wider range of things. Heck, up until recently, "GPUs" were dumb and couldn't be programmed by users at all. I haven't looked at what operations you can do now, but IIRC you are still limited to code with at most 2000 instructions or so.
My question - If these cards are getting so powerful at computations then why do we need an Intel/AMD processor at all? Just make a graphics card with more transistors and drop the traditional processor...
If you'd really like the answer to this question, try programming anything on the GPU and you'll understand. It's hell to do half this stuff. GPUs are highly specialized and make very specific tradeoffs in favor of graphics processing. Of course, some operations, specifically those that can be modeled using cellular automata, map well to this set of constraints. Others, such as ray-tracing, can be shoe-horned in, but if you were to try to write a word processor on the GPU, it'd essentially be impossible. The GPU allows you to do massively parallel computations, but penalizes you heavily for things such as loops of variable length or reading memory back from the card outside of the once-per-cycle frame update, and the price of interrupting computation is prohibitive. Clearing the graphics pipeline can take a long, long time.
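For a feel of why cellular automata fit, here is a minimal Python sketch of one step of Conway's Game of Life on a wrap-around grid. Life is just a convenient example and `life_step` is a made-up name; nothing here is GPU code. The point is the shape of the computation: every cell runs the same fixed-length rule, reads only the previous grid, and writes only its own output, much like a pixel program reading one texture and writing another.

```python
def life_step(grid):
    """Compute the next generation; each cell depends only on the old grid."""
    rows, cols = len(grid), len(grid[0])

    def next_cell(r, c):
        # Always exactly eight neighbour reads: no variable-length loop.
        neighbours = sum(
            grid[(r + dr) % rows][(c + dc) % cols]
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
        )
        alive = grid[r][c] == 1
        return 1 if neighbours == 3 or (alive and neighbours == 2) else 0

    return [[next_cell(r, c) for c in range(cols)] for r in range(rows)]

# A "blinker": three cells in a row flip between horizontal and vertical.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
for row in life_step(blinker):
    print(row)
```

Because no cell reads another cell's new value, all of them could be computed at once, and nothing has to be read back until the whole grid is done.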
Furthermore, while there have been a few papers published claiming the orders of magnitude increase in speed in these sorts of computations, none actually demonstrate this sort of speed-up. Everyone's speculating, but when it comes to it, results are lacking.
Consider it from the point of view of games: even if you could move all the calculations for occluders, all texturing, all graphical effects etc. to the GPU, you would still need advanced AI, networking, the audio system and decoding, resources and data structures, and of course input handling. Not all of these are heavy on calculation, but they require some timing, and most of all the bus architecture of the PC doesn't allow enough flexibility to use the GPU as anything more than additional processing power.
GPUs have had many computational features added, but even occlusion culling needs some work on the CPU at the moment. The hardware implementations of OC are rather limited and naturally require transferring unneeded vertex data to the display card over a slow bus.
SGI has developed some nice bus subsystems in the past, I'd like to see more of them in the mainstream.
OpenGL and Direct3D are the two interfaces that graphics card vendors provide to get at the hardware; there is no lower-level way to get at it. However, these APIs now include ways of describing programs that run on the GPUs directly; you can write programs that run at the per-vertex or per-pixel level with either of those APIs.
These programs can be given to the GPU via specialized low-level assembly languages that have been developed to expose the programmability of GPUs. (They are pretty clean, RISC-like instruction sets.)
Alternatively, you can use a higher level programming language, like NVIDIA's Cg, or Microsoft's HLSL, to write programs to run on the GPU. These are somewhat C-like languages that then compile down to those GPU assembly instruction sets.
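To show the shape of such programs without pretending to write real Cg or HLSL, here is a plain Python imitation. All of the names (`vertex_program`, `pixel_program`, `draw`) and the particular transform and checkerboard pattern are invented for illustration. The idea is that you only write the small function that handles one vertex or one pixel; the API and driver decide how many times it runs and on what.

```python
def vertex_program(position, scale, offset):
    """Runs once per vertex: here, a uniform scale followed by a translation."""
    x, y, z = position
    ox, oy, oz = offset
    return (x * scale + ox, y * scale + oy, z * scale + oz)

def pixel_program(u, v):
    """Runs once per pixel: here, a checkerboard from coordinates in [0, 1)."""
    checker = (int(u * 8) + int(v * 8)) % 2
    return (255, 255, 255) if checker else (0, 0, 0)

def draw(vertices, width, height):
    """Stand-in for the API/driver: it, not your program, loops over the data."""
    transformed = [vertex_program(p, 2.0, (1.0, 0.0, 0.0)) for p in vertices]
    image = [
        [pixel_program(x / width, y / height) for x in range(width)]
        for y in range(height)
    ]
    return transformed, image

triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
transformed, image = draw(triangle, 8, 8)
print(transformed)
print(image[0])
```

Neither function can see its neighbours or keep state between calls, which is roughly the restriction the real per-vertex and per-pixel programs live under.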
Interestingly he thinks it'll be specialized hardware that will do ray-tracing, etc.
http://www.hardwarecentral.com/hardwarecentral/reviews/1721/1/
"Is there a future for radiosity lighting in 3D hardware? Ray-tracing? When would it become available?
Gary: Yes, but probably just in specialized hardware as it's a very different problem. Ray-tracing is nasty because of its non-locality, so fast localized hacks will probably prevail as long as people are clever. Especially for real-time rendering on low-cost hardware. It's interesting that RenderMan has managed to do amazing CGI without ray-tracing. That's an existence proof that a hack in the hand is worth ray-tracing in the bush."
Oh... and for people who haven't seen it before, here's a cool detailed paper about how the pipeline of a traditional 3d accelerator can be tweaked to do ray tracing...
http://graphics.stanford.edu/papers/rtongfx/rtongfx.pdf
Reading that shows how programming a graphics pipeline is quite different (more interesting? more complicated?) than programming a general purpose CPU.
NVIDIA's Cg language is a C-like language for GPUs. You can download the compiler and examples from NVIDIA. Writing for a GPU is not trivial, though, and getting the best use out of it requires quite a bit of knowledge about how a GPU works.
"A language that doesn't affect the way you think about programming, is not worth knowing" - Alan Perlis