Future of 3d Graphics

No processor. by rebeka thomas · 2003-05-18 05:43 · Score: 3, Informative

"Just make a graphics card with more transistors and drop the traditional processor..."

Apple did this several years ago. The Newton 2000 and 2100 didn't use a CPU but rather the graphics processor.

--
RST

Re:No processor. by Karamchand · 2003-05-18 05:50 · Score: 4, Insightful

In my opinion that's just a matter of definition. If it manages the core tasks of running the machine it's a CPU - i.e. a central PU - and not just a GPU since it handles more than the graphics.

That is, as soon as there is no CPU and the GPU handles its tasks, it becomes a CPU by definition!

The head of Nvidia agrees with the poster by AvitarX · 2003-05-18 05:44 · Score: 3, Interesting

The head of Nvidia was written about in Wired a while ago, and he essentially said the same thing.

He was like, our cards ARE the computer, and are becoming far more important than the CPU for the hard-core stuff.

It was interesting, but I totally pooh-poohed it.

Obviously he was smarter than me.

--
Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg

Re:The head of Nvidia agrees with the poster by mmp · 2003-05-18 08:08 · Score: 5, Informative

OpenGL and Direct3D are the two interfaces that graphics card vendors provide to get at the hardware; there is no lower-level way to get at it. However, these APIs now include ways of describing programs that run on the GPUs directly; you can write programs that run at the per-vertex or per-pixel level with either of those APIs.

These programs can be given to the GPU via specialized low-level assembly language that has been developed to expose the programmability of GPUs. (They are pretty clean, RISC-like instruction sets).

Alternatively, you can use a higher level programming language, like NVIDIA's Cg, or Microsoft's HLSL, to write programs to run on the GPU. These are somewhat C-like languages that then compile down to those GPU assembly instruction sets.
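
To make the "per-pixel program" idea concrete, here is a rough sketch in plain Python rather than Cg or HLSL. It only illustrates the programming model: one small function is run independently for every pixel. The names `shade` and `render` are made up for this example and are not part of any graphics API.

```python
def shade(x, y, width, height):
    """Hypothetical per-pixel program: returns an (r, g, b) colour in [0, 1]."""
    u = x / (width - 1)   # horizontal position, 0.0 at the left edge
    v = y / (height - 1)  # vertical position, 0.0 at the top edge
    return (u, v, 0.5)


def render(width, height, program):
    """Run the same program once per pixel; no pixel depends on another."""
    return [[program(x, y, width, height) for x in range(width)]
            for y in range(height)]


image = render(4, 3, shade)
```

Because no call to `shade` depends on any other, a GPU is free to evaluate many pixels at once, which is where the speed comes from.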

Re:The head of Nvidia agrees with the poster by jericho4.0 · 2003-05-18 09:14 · Score: 4, Informative

NVIDIA's Cg language is a C-like language for GPUs. You can download the compiler and examples from NVIDIA. Writing for a GPU is not trivial, though, and getting the best use out of it requires quite a bit of knowledge about how a GPU works.

--
"A language that doesn't affect the way you think about programming, is not worth knowing" - Alan Perlis

We could remove the traditional processor if... by RiverTonic · 2003-05-18 05:45 · Score: 3, Insightful

... everybody used their computer for 3D only, but I know a lot of people never do anything with 3D. And I don't think a computer for office work benefits much from the GPU.

--
This is RiverTonic's sig.

Precision by cperciva · 2003-05-18 05:45 · Score: 3, Informative

GPUs work with limited precision -- IEEE single precision is typical. This is good enough for 3D graphics -- after all, in the end you'll be limited by the 10-11 bit spatial resolution and 8 bit color resolution -- but not good enough for most scientific problems, which typically require a minimum of double precision.

Simulating higher precision with single precision arithmetic is possible, but the performance penalty is too severe for it to be useful.
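
For the curious, here is a minimal sketch of what "simulating higher precision with single precision arithmetic" can look like. It uses the well-known error-free two-sum trick to carry a value as a pair of singles (often called double-single). Single precision is emulated in Python by rounding through `struct`; the helper names are illustrative, not taken from any GPU toolkit.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest IEEE single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def two_sum(a, b):
    """Error-free addition in single precision: s + err equals a + b."""
    s = f32(a + b)
    bb = f32(s - a)
    err = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, err


def ds_add(hi, lo, x):
    """Add the single x to the double-single value (hi, lo)."""
    s, err = two_sum(hi, x)
    err = f32(err + lo)
    new_hi = f32(s + err)
    new_lo = f32(err - f32(new_hi - s))
    return new_hi, new_lo


def naive_sum(x, n):
    """Accumulate x n times in plain single precision."""
    total = 0.0
    for _ in range(n):
        total = f32(total + x)
    return total


def double_single_sum(x, n):
    """Accumulate x n times using a (hi, lo) pair of singles."""
    hi = lo = 0.0
    for _ in range(n):
        hi, lo = ds_add(hi, lo, x)
    return hi, lo


x = f32(0.1)
n = 100_000
exact = x * n  # exact in double: the product fits well within 53 bits
naive_error = abs(naive_sum(x, n) - exact)
hi, lo = double_single_sum(x, n)
ds_error = abs((hi + lo) - exact)
```

The pair is far more accurate than the plain single-precision sum, but each `ds_add` spends about ten single-precision operations where a native add spends one, which is the penalty in question.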

--
Tarsnap: Online backups for the truly paranoid

Re:Precision by Sycraft-fu · 2003-05-18 06:00 · Score: 4, Informative

Just a note: new graphics cards step up the resolution of internal calculations. GeForceFX cards (and I assume new Radeon cards too) can calculate colour with 128-bit precision. However, that doesn't change the fact that the card is designed to push pixels fast, not to do general calculation work.

Re:Precision by cperciva · 2003-05-18 06:10 · Score: 3, Informative

"128-bit" colour precision is just a four dimensional vector of 32-bit elements.

--
Tarsnap: Online backups for the truly paranoid

Re:Precision by Sinical · 2003-05-18 06:38 · Score: 5, Insightful

Not true. Newer cards appear to be IEEE 'extended' (there isn't a definition of the bitsize of extended from the spec, so even Intel's 80-bit format is considered 'extended') at 128 bits wide per color channel. This is pretty much the last word in accuracy as far as I'm concerned. Perhaps numerical analysts can come up with scenarios where 128 bits aren't sufficient, but I don't want to hear about them.

For some of the stuff that we do, we would kill for a slightly faster card. Right now, for simulation of IR imagery, we have to prefly a scenario where the sensor-carrying vehicle (use your imagination) flies a trajectory and we render the imagery along this path. This rendering consists of doing convolutions of background scenes with target information to generate a final image. At the end we have a 'movie'. This can take a few hours to run.

Afterwards, we run the simulation in realtime and play frames from this movie (adjusted in rotation and scaling, etc. because real-time interactions can result in flight paths subtly different from the movie) and show it to a *real* sensor and see what happens.

The point: if we could do real-time convolution inside a graphics card and then get the data back out some way (we usually need to go through some custom interface to present the data to the sensor), then a lot of pain would be saved. First, we could move the video-generating infrastructure into the real-time simulation, which would be simpler. Second, we wouldn't have to worry about rotating and scaling the result, since we'd be generating exactly correct results on the fly. Third, we wouldn't have to worry about allocating huge amounts of memory (gigabytes) to hold the video, with all the attendant concerns about memory latency, bandwidth, and NUMA architectures. Finally (maybe), we could change scenarios on the fly without having to worry about whether we already had a video ready to use.

I think the computational horsepower is almost there, but right now there's no good way to get the data back out of the card. On something like an SGI you get stuff after it's gone through the DACs, which means you now have at most 12 bits per channel (less than we want, although you can use tricks for some stuff to get up to maybe 16 bits for pure luminosity data). What would be sweet in the extreme is to get a 128-bit floating point value for each pixel in the X*Y pixel scene. So if the scene were 640x480, we'd get about 4.7Meg of data per frame, and at, say, 60Hz that's about 281Meg a second to convert and send out.
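
The readback arithmetic can be checked with a few lines. This is only a sketch, assuming 128 bits (four 32-bit floats) per pixel and binary megabytes (1 Meg = 1024 * 1024 bytes); the constant names are illustrative.

```python
WIDTH, HEIGHT = 640, 480   # scene size from the post
BITS_PER_PIXEL = 128       # assumed: four 32-bit floats per pixel
FRAMES_PER_SECOND = 60

MIB = 1024 * 1024          # assumed meaning of "Meg"

bytes_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL // 8
mib_per_frame = bytes_per_frame / MIB
mib_per_second = mib_per_frame * FRAMES_PER_SECOND

print(f"{bytes_per_frame} bytes per frame")
print(f"{mib_per_frame:.2f} MiB per frame, {mib_per_second:.2f} MiB per second")
```

That works out to 4,915,200 bytes per frame, or roughly 4.7 Meg, and a little over 281 Meg a second at 60Hz.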

Life would be sweet. Sadly, this is a pretty special purpose application, so I'm not too hopeful. What's weird is that only NVidia (and perhaps ATI) are coming up with this horsepower because of all the world's gamers, and vendors like SGI are left with hardware that is many, many generations old (although it does have the benefit of assloads of texture memory).

In short: need 1GB of RAM on the card and a way to get stuff back out after we've done the swoopty math.

Old news... by ksheka · 2003-05-18 05:45 · Score: 4, Insightful

...This is what the future releases of DirectX are supposed to address: the use of 3D renderers to render non-graphical elements and do other work.

Good for the end user, but going to be a pain in the ass for software developers to take advantage of, is my guess. :-)

--
alias uptime="echo '5:33pm up 22342352324 days, 6:28, 2124315623 users, load average: 2432.40, 12312.31, 123123.19'"

Specialised hardware by James_Duncan8181 · 2003-05-18 05:46 · Score: 3, Interesting

I do often wonder why specialised hardware is not used more often for tasks that are often performed. I recall that the Mac used to have some add-on cards that sped some Photoshop operations up to modern levels 3-4 years ago.

Why buy a big processor when the only intensive computational tasks are video en/decoding and games, tasks that can easily be farmed off to other, cheaper units?

--
"To any truly impartial person, it would be obvious that I am right."

Re:Specialised hardware by Sycraft-fu · 2003-05-18 05:55 · Score: 4, Informative

Because it isn't cheaper if you need hundreds of these simple units. The good thing about a DSP (which is what a GPU is, after a fashion) is that because it is specialised to a single operation, it can be highly optimised and do it much quicker than a general purpose CPU. However, the good is also the bad: the DSP is highly specialised and can ONLY do that operation, or at least only do it efficiently.

Take digital audio. It used to be that CPUs were too pathetic to do even simple kinds of digital audio ops in realtime, so you had to offload everything to dedicated DSPs. Pro Tools did this: you bought all sorts of expensive, specialised hardware and loaded your Mac full of it so it could do real-time audio effects. Now, why bother? It is much cheaper to do it in software since processors ARE fast enough. Also, if a new kind of effect comes out, or an upgrade, all you have to do is load new software, not buy new hardware.

Also, if you like, you can get DSPs to do a number of computationally intensive things. As mentioned, the GPU is real common. They take over almost all graphics calculations (including much animation with things like vertex shaders) from the CPU. Another thing along the games line is a good soundcard. Something like an Audigy 2 comes with a DSP that will handle 3D positioning calculations, reflections, occlusions and such. If you want a video en/decoder, those are available too. MPEG-2 decoders are pretty cheap; the encoders cost a whole lot more. Of course the en/decoder only works for the video formats it was built for, nothing else. You can also get processors to help with things like disk operations: high-end SCSI and IDE RAID cards have their own processor on board to take care of all those calculations.

Re:Specialised hardware by Sycraft-fu · 2003-05-18 06:21 · Score: 3, Insightful

Doesn't mean you can't offload it to a DSP. It also depends on what your definition of handling absolutely realistic sound is. Sure, I can do a perfectly realistic reverb on a sound source by using an impulse-based reverb, which actually samples a real concert hall and reproduces it. However, that is limited in power. Suppose I have a non-real location, and I want to describe it all mathematically and then have multiple different sound sources, all calculated correctly. That sort of thing is much more complex and intense.

However the real point of a sound DSP is to free up more CPU for other calculations. A game with lots of 3d sounds can easily use up a non-trivial amount of CPU time, even on a P4/AthlonXP class CPU. So no, it isn't critical like a GPU, it can be handled in software, but it does help.

Graphics processor vs. general-purpose CPU by anonymous loser · 2003-05-18 05:48 · Score: 4, Insightful

If these cards are getting so powerful at computations then why do we need a Intel/AMD processor at all? Just make a graphics card with more transistors and drop the traditional processor..

Because GPUs are specialized processors. They are only good at a couple of things: moving data in a particular format around quickly, and linear algebra. It is possible to do general-purpose calculations on a GPU, but that's not what it is good at, so you'd be wasting your time.

This is akin to asking why you shouldn't go see a veterinarian when you get sick. Because veterinarians specialize in animals. Sure, they might be able to treat you, but since their training is with animals you might find their treatments don't help as much as going to see a regular doctor.

Re:Graphics processor vs. general-purpose CPU by Phleg · 2003-05-18 06:02 · Score: 5, Funny

Actually, I see it more like having a brain surgeon as your family doctor.

--
No comment.

Re:CPU's are still neccessary by phelddagrif · 2003-05-18 05:55 · Score: 3, Insightful

Not only do some people not play games, but GPUs also do what they do so well because they are specialized. I think that if they were made to do general functions as well, their efficiency would decrease. As for the comment from the Nvidia guy about the graphics card doing most of the work: in a game there are still physics, AI, and overhead calculations that all need to be done. Few or none of these are covered by the GPU.

I agree completely that offloading tasks from the CPU is good. Look at the Amiga: that was an amazing machine for its time, and a huge part of its power can be credited to its multiple, separate, specialized processors. I think in the future we will see a shift towards that again, as transistor increases become no longer feasible.

But then what do I know..

Umm....GPU?!? by Tokerat · 2003-05-18 05:59 · Score: 4, Funny

Because GPUs are NOT general purpose devices.

General Purpose Unit, duh!

...Huh?

Graphics? I still use a VT100 :-\

--
CAn'T CompreHend SARcaSm?

Re:Umm....GPU?!? by MulluskO · 2003-05-18 09:01 · Score: 4, Funny

GPU: Gossudarstwenoje Polititscheskoje Upravlenije (Russian: National Political Administration; Soviet Secret Service until 1937)

In GPU's Soviet Russia, pixel pushes you!

In GPU's Soviet Russia, graphics card displays you!

In GPU's Soviet Russia, jaggies smooth you!

In GPU's Soviet Russia, adventure game pixel hunts you!

--

Too busy staying alive... ~ R.A.

The difference between a CPU and GPU by CTho9305 · 2003-05-18 05:59 · Score: 4, Informative

GPUs are highly specialized. In graphics processing, you generally perform the same set of operations over and over again. Also, pixels can be rendered concurrently - as such, graphics hardware can be extremely parallel in nature. Also, in graphics hardware, there isn't much (if any) branching in code. Simple shader code just runs through the same set of operations over and over again.

"Normal" code, such as a game engine, compiler, word processor or MP3/DivX encoder, does all sorts of different operations, in a different order each time, many of which are inherently serial in nature and don't scale well with parallel processing. This type of code is full of branches.

To optimize graphics processing, you can really just throw massively parallel hardware at it. Modern cards do what, 16 pixels/texels per cycle? 4+ pipelines for each stage all doing the EXACT same thing?

Regular code just isn't like that. Because different operations have to happen each time and in each program, you can't optimize the hardware for one specific thing. In serial applications, extra pipelines just go to waste. Also, frequent branch instructions mean that you have to worry about things like branch prediction (which takes up a fair amount of space). When you do have operations that can happen in parallel (such as make -j 4), the different pipelines are doing different things.

Take your GeForce GPU and P4 and see which can count to 2 billion faster. In a task like this, where both processors can probably do one add per cycle (no parallelizing in this code), the 2GHz P4 will take one second, and the 500MHz GeForce will take four seconds (assuming it can be programmed for a simple operation like "ADD"). Even if you throw in more instructions but the code cannot be parallelized, the CPU will probably win.
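
The counting example is just operations divided by throughput. Here is a back-of-the-envelope sketch, assuming one add per cycle for the serial case as stated above; the 16-wide figure is borrowed from the "16 pixels/texels per cycle" guess earlier in this comment and is not a measured number. The function name is illustrative.

```python
def seconds_needed(operations, clock_hz, ops_per_cycle=1):
    """Time to finish `operations` at a given clock rate and issue width."""
    return operations / (clock_hz * ops_per_cycle)


COUNT = 2_000_000_000  # count to 2 billion

# Serial counting: each add depends on the previous one, so width is 1.
p4_serial = seconds_needed(COUNT, 2_000_000_000)
geforce_serial = seconds_needed(COUNT, 500_000_000)

# The same number of *independent* operations, assuming 16 per cycle.
geforce_parallel = seconds_needed(COUNT, 500_000_000, ops_per_cycle=16)

print(p4_serial, geforce_serial, geforce_parallel)
```

With these assumptions the P4 needs 1 second and the GeForce 4 seconds for the serial count, while 2 billion independent operations would take the GeForce about a quarter of a second. The clock rate only loses when the work cannot be spread across the pipelines.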

Basically, since you can't target one specific application, a general purpose processor will always be slower at some things - but can do a much wider range of things. Heck, up until recently, "GPUs" were dumb and couldn't be programmed by users at all. I haven't looked at what operations you can do now, but IIRC you are still limited to code with at most 2000 instructions or so.

--
My server

Re:The difference between a CPU and GPU by CTho9305 · 2003-05-18 06:06 · Score: 3, Insightful

Sorry to reply to myself, but a really simple example just occurred to me.

Take your 486SX without a coprocessor... you can get an FPU (coprocessor) which does floating point operations MUCH faster than you can emulate them. However, you can't just use an FPU and ditch the 486, since the FPU can't do anything but floating point ops - it can't boot MS-DOS... it can't run Windows 3.1... it can't fetch values from memory... it can't even add 1+1 precisely!

--
My server

GPU Performance Myths by Shelrem · 2003-05-18 06:01 · Score: 5, Informative

My question - If these cards are getting so powerful at computations then why do we need a Intel/AMD processor at all? Just make a graphics card with more transistors and drop the traditional processor...

If you'd really like the answer to this question, try programming anything on the GPU and you'll understand. It's hell to do half this stuff. GPUs are highly specialized and make very specific tradeoffs in favor of graphics processing. Of course, some operations, specifically those that can be modeled using cellular automata, map well to this set of constraints. Others, such as ray-tracing, can be shoe-horned in, but if you were to try to write a word processor on the GPU, it'd essentially be impossible. The GPU allows you to do massively parallel computations, but penalizes you heavily for things such as loops of variable length or reading memory back from the card outside of the once-per-cycle frame update, and the price of interrupting computation is prohibitive. Clearing the graphics pipeline can take a long, long time.

Furthermore, while there have been a few papers published claiming orders-of-magnitude increases in speed for these sorts of computations, none actually demonstrate this sort of speed-up. Everyone's speculating, but when it comes down to it, results are lacking.

b.c

Re: GPU Performance Myths by Black Parrot · 2003-05-18 08:38 · Score: 4, Insightful

> The GPU allows you to do massively parallel computations, but penalizes you heavilly for things such as loops of variable length or reading memory back from the card outside of the once-per-cycle frame update, and the price of interrupting computation is prohibitive. Clearing the graphics pipeline can take a long, long time.

> Furthermore, while there have been a few papers published claiming the orders of magnitude increase in speed in these sorts of computations, none actually demonstrate this sort of speed-up. Everyone's speculating, but when it comes to it, results are lacking.

I looked into using the GPU for vector * matrix multiplications over my Christmas vacation (yep, a Geek), and everywhere I turned I found people saying that whatever you gained in the number crunching you lost in the latency of sending your numbers to the GPU and reading them back when done. In the end I didn't even bother running an experiment on it.
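
The trade-off can be written down as a toy cost model. This is a sketch only: the speed-up factor and the transfer time below are made-up illustrative numbers, not measurements of any card, and the function names are hypothetical.

```python
def offload_time(cpu_seconds, gpu_speedup, transfer_seconds):
    """Total time if the job is shipped to the GPU and the result read back."""
    return transfer_seconds + cpu_seconds / gpu_speedup


def worth_offloading(cpu_seconds, gpu_speedup, transfer_seconds):
    """True if the round trip through the GPU beats staying on the CPU."""
    return offload_time(cpu_seconds, gpu_speedup, transfer_seconds) < cpu_seconds


SPEEDUP = 10.0    # assumed: GPU crunches the numbers 10x faster
TRANSFER = 0.005  # assumed: 5 ms to send the data and read it back

small_job = worth_offloading(0.001, SPEEDUP, TRANSFER)  # 1 ms of CPU work
large_job = worth_offloading(0.100, SPEEDUP, TRANSFER)  # 100 ms of CPU work

print(small_job, large_job)
```

Under these assumptions the 1 ms job loses (about 5.1 ms via the GPU) and the 100 ms job wins (about 15 ms), so the conventional wisdom holds for small multiplications but not necessarily for large batches.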

But maybe conventional wisdom was wrong; elsewhere in the talkbacks I see links to a couple of .edu sites pushing this kind of thing, so I'm going to look at it some more.

--
Sheesh, evil *and* a jerk. -- Jade

SGI did this (very) long ago by jhzorio · 2003-05-18 06:01 · Score: 3, Interesting

Using the power of the graphic subsystem to handle other kinds of calculations has been done for years, if not decade(s) by Silicon Graphics.
At least for the demos...

Re:We need traditonal processors by cmcguffin · 2003-05-18 06:04 · Score: 5, Informative

While optimized for graphics, GPUs can indeed be used as general-purpose processors. GPUs are effectively stream processors, a class of devices whose architecture and programming model make them particularly efficient for scientific calculation.

> It might take a real long time, but it is a general purpose processor and so can process anything

The same holds true for GPUs. Like CPUs, they are Turing complete.

Integrated GPU/CPU by renehollan · 2003-05-18 06:05 · Score: 4, Insightful

"If these cards are getting so powerful at computations then why do we need a Intel/AMD processor at all? Just make a graphics card with more transistors and drop the traditional processor..."

You mean like: this?

Now, that press release was about two years old, and you can bet that ATI has advanced beyond that point (though I can't provide details).

Also, while that part doesn't integrate a serious 3D graphics GPU, there's no reason this can't be done, except one, and it's the same reason that a powerful CPU isn't integrated: heat dissipation.

But, for a "media processor", it sure is sweet.

--
You could've hired me.

Didn't read the article, but it doesn't matter. by moogla · 2003-05-18 06:08 · Score: 4, Interesting

You keep hearing this logic every once in a while.

Look, for the same price of a $400 graphics engine you can get yourself a dual CPU machine, a cheap graphic card with AGP, and do it in "software" with about the same efficiency, if you know what you're doing.

Because the extra CPU isn't inherently multi-core like most modern GPUs, you need to compensate with a higher clock speed, and use whatever multimedia instructions it has to the fullest extent (e.g. AltiVec, MMX2, etc.).

But of course, the GPU is better suited to the actual drudge work of getting your screen to light up. If there's stuff to be computed and forgotten by it (e.g. particle physics), it's probably better left decoupled to exploit parallelism in that abstraction.

As you get to a limit in computational efficiency, you start adding on DSPs, and this is where FPGAs and grid computing start looking interesting.

So it shouldn't be considered surprising that these companies will say that; they can see that trend and they want a piece of that aux. processor/FPGA action. The nForce is a step in the right direction. They don't want to be relegated to just making graphics accelerators when they are in a unique position to make pluggable accelerators for anything.

But to plan on packaging an FPGA designed for game augmentation and calling it an uber-cool GPU is just a marketing trick. This technology is becoming commercially viable, it seems.

--
Black holes are where the Matrix raised SIGFPE

Wheel of reincarnation by Anonymous Coward · 2003-05-18 06:10 · Score: 3, Insightful

Aloha!

You wrote: My question - If these cards are getting so powerful at computations then why do we need a Intel/AMD processor at all? Just make a graphics card with more transistors and drop the traditional processor...

Congratulations! You have just reinvented Ivan Sutherland's Wheel of Reincarnation, which is exactly about this: normal CPUs are enhanced with specific functions to accelerate a common task, and the enhancements get so big that farming them out into a separate chip/module seems like a good idea. The separate thingy grows in complexity as more flexibility and programmability are needed. Finally you end up with a new CPU. And then someone says... You get the idea.

Here is a good take on Ivan Sutherland's story. And here is Myer and Sutherland's original paper.

Read, think and learn.

The horror! by Ridge · 2003-05-18 06:11 · Score: 4, Funny

"One research group is looking to break the Linpack benchmark world record using a cluster of 256 PCs with GeForce FXs!"

Unfortunately, the researchers have all inexplicably been rendered deaf.

Reconfigurable computing by wfmcwalter · 2003-05-18 06:23 · Score: 4, Interesting

Just make a graphics card with more transistors and drop the traditional processor

There's a lot of work being done on reconfigurable computing, which imagines replacing the CPU, GPU, DSP, soundcard, etc., with a single reconfigurable gate array (like a RAM-FPGA). You'd probably have a small control processor that manages the main array. On this array one could build a CPU (or several) of whatever ISA you needed, and a GPU, DSP, or whatever functionality was called for by the program(s) you're running at the current moment. Shut down UnrealTournament 2009 and open MATLAB, and DynamicLinux will wipe out its shader code and vector pipelines, and grow a bunch of FP units instead. Run MAME and it will install appropriate CPUs and other hardware.

In the initial case, this would be controlled statically, a bit like the way a current OS's VM manages physical and virtual memory. Later, specialist "hardware" could be created, compiled, and optimised, based on an examination of how the program actually runs (a bit like a Java dynamic compiler). So rather than running SETI-at-home, your system would have built a specialist seti-ASIC on its main array. There will be lots of applications where most of the work is done in such a soft ASIC, and only a small proportion is done on a (commensurately puny) soft-CPU.

This all sounds too cool to be true, and at the moment it is. Existing programmable gate hardware is very expensive, of limited size (maybe enough to hold a 386?), runs crazy hot, and doesn't run nearly quickly enough.

--
## W.Finlay McWalter ## http://www.mcwalter.org ##

CPUs and GPUs are competitors by f97tosc · 2003-05-18 06:27 · Score: 3, Insightful

My question - If these cards are getting so powerful at computations then why do we need a Intel/AMD processor at all?

A development this extreme is unlikely. However, what is very real is the fact that GPUs and CPUs are at least partially competitors.

If you are doing a lot of graphics, then the best computer for your money may be one with a great graphics card and a so-so CPU. The better and cheaper the GPUs Nvidia can make, the smaller the demand for state-of-the-art Pentiums.

But unless there is a revolutionary development somewhere, we will probably see computers with both kinds of processors for a good while.

Tor

Interesting to compare the 3DFX perspective... by ron_ivi · 2003-05-18 09:11 · Score: 4, Informative

Gary Tarolli (Chief Technical Officer of 3dfx) has an interesting interview on a similar subject.

Interestingly he thinks it'll be specialized hardware that will do ray-tracing, etc.

http://www.hardwarecentral.com/hardwarecentral/reviews/1721/1/

"Is there a future for radiosity lighting in 3D hardware? Ray-tracing? When would it become available?

Gary: Yes, but probably just in specialized hardware as it's a very different problem. Ray-tracing is nasty because of its non-locality, so fast localized hacks will probably prevail as long as people are clever. Especially for real-time rendering on low-cost hardware. It's interesting that RenderMan has managed to do amazing CGI without ray-tracing. That's an existence proof that a hack in the hand is worth ray-tracing in the bush."

Oh... and for people who haven't seen it before, here's a cool detailed paper about how the pipeline of a traditional 3D accelerator can be tweaked to do ray tracing...

http://graphics.stanford.edu/papers/rtongfx/rtongfx.pdf

Reading that shows how programming a graphics pipeline is quite different (more interesting? more complicated?) than programming a general purpose CPU.

Mandelbrot.ps by TeknoHog · 2003-05-18 11:32 · Score: 3, Interesting

That reminds me, here's a real classic. It computes and draws the famous fractal, and might be quite nasty on shared printers ;-)

%!ps
/iter 60 def
/reso .005 def
/sq { dup mul } def
/mod { 2 copy div floor mul sub } def
/plot { newpath moveto 1 0 rlineto stroke } def
gsave
280 420 translate
260 2 div dup scale
2 260 div setlinewidth
-2 reso 2 {
  /x exch def
  -2 reso 2 {
    /y exch def
    /r 0 def
    /i 0 def
    0 iter {
      r sq i sq add 4 gt { exit } if
      r sq i sq sub x add
      /i 2 r mul i mul y add def
      /r exch def
      1 add
    } repeat
    10 mod .1 mul .1 add setgray
    x y plot
  } for
} for
grestore
showpage
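
For anyone without a PostScript interpreter handy, the inner loop can be sketched in Python. This is a hedged port of the iteration only, not of the drawing code; `escape_count` and `gray_level` are names invented for this sketch.

```python
def escape_count(x, y, max_iter=60):
    """Number of z -> z*z + c steps taken before |z|^2 exceeds 4, c = x + yi."""
    r = i = 0.0
    count = 0
    for _ in range(max_iter):
        if r * r + i * i > 4:
            break
        # Both new values are computed from the old r and i.
        r, i = r * r - i * i + x, 2 * r * i + y
        count += 1
    return count


def gray_level(count):
    """Same mapping as the PostScript: cycle through ten shades of gray."""
    return (count % 10) * 0.1 + 0.1


inside = escape_count(0.0, 0.0)   # the origin never escapes
outside = escape_count(2.0, 2.0)  # far outside the set, escapes at once
```

Every point is computed independently of every other point, which is exactly the kind of work the rest of this thread says graphics hardware is good at.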

--
Escher was the first MC and Giger invented the HR department.
