Using GPUs For General-Purpose Computing

As has been said many times before ... by keltor · 2004-05-08 18:55 · Score: 5, Insightful

GPUs are very fast ... at performing vector and matrix calculations. This is the whole point. If general-purpose CPUs were capable of doing vector or matrix calcs very efficiently, we would probably not have GPUs.

178 Million in the P4EE by 2megs · 2004-05-08 18:57 · Score: 5, Insightful

The Pentium 4 EE actually has 178 million transistors, which puts it in between ATI's and NVIDIA's latest.

In all of this, keep in mind that there's computing and there's computing... the kind of computing power in a GPU is excellent for doing the same numeric computation to every element of a large vector or matrix, not so much for branchy, decisiony type things like walking a binary tree. You wouldn't want to run a database on something structured like a GPU (or an old vector-processing Cray), but something like a weather simulation or molecular modeling could be perfect for it.
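
To make that contrast concrete, here is a minimal Python sketch. Plain Python is only standing in for what the hardware would do, and every name in it (`scale_and_offset`, `Node`, `tree_contains`) is made up for illustration. The first function does the same arithmetic on every element, with no element depending on another; the second walks a binary search tree, where each step depends on the comparison made in the previous one.

```python
# Illustrative sketch only: plain Python standing in for GPU-style work.
# All names here are hypothetical and not taken from the paper.

def scale_and_offset(values, scale, offset):
    """Same computation on every element; no element depends on another,
    so in principle all of them could be computed at the same time."""
    return [v * scale + offset for v in values]


class Node:
    """Node of a binary search tree."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_contains(root, key):
    """Binary search tree lookup: every step branches on a comparison,
    and the next node to visit is unknown until that comparison is done."""
    node = root
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


temperatures = [10.0, 12.5, 15.0]
shifted = scale_and_offset(temperatures, 2.0, 1.0)

tree = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14)))
found = tree_contains(tree, 6)
missing = tree_contains(tree, 7)
```

The list comprehension is the kind of work that maps naturally onto wide parallel hardware; the `while` loop is inherently sequential, however many arithmetic units are available.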

The similarities of a GPU to a vector processing system bring up an interesting possibility...could Fortran see a renaissance for writing shader programs?

Re:178 Million in the P4EE by gunix · 2004-05-08 20:09 · Score: 5, Insightful

Well, it's like UNIX: it's user-friendly, it just selects its friends very carefully.
IMHO, the perfect friend is someone who is interested in maximum performance, knows how to program, and knows something about computer hardware.

Have you looked at Fortran 90, 95 or 2000?

--
Evolution of Language Through The Ages: 6000 BC : ungh, grrf, booga 2000 AD : grep, awk, sed

Re:178 Million in the P4EE by Bender_ · 2004-05-08 21:02 · Score: 3, Insightful

The transistor count on the video cards does not count the ram

How do you know? In fact, modern GPUs require a large amount of small scattered memory blocks. Texture caches, FIFOs for fragment/pixels/texels when they are not in sync, caches for vertex shader and pixel shader programs etc etc..

More recent GPUs are notorious for their incredibly long latencies. Long latencies imply that a lot of data has to be stored on chip.

Re:Not the Point by Amiga+Lover · 2004-05-08 19:08 · Score: 4, Insightful

The whole point of graphic cards is that they have a dedicated purpose. Using the cards for anything that is general purpose is like using a motorcycle to tow a pop-up camper.

What's relevant is that to the processor on a graphics card, its dedicated purpose is simply a bunch of logic. There's no dedicated "this must be used for pixels only, all else is waste" logic inherent in the system. There are MANY purposes for which the same or similar logic that applies in generating 3D imagery can be used, and that seems to be the purpose of this paper. Run THOSE types of operations on the GPU. Some things they won't be able to do well, no doubt, but those they can, they can do extremely well.

This is BIG by macrealist · 2004-05-08 19:18 · Score: 5, Insightful

Creating a way to use the specialized GPUs for vector processing that is not graphics related is ingenious. Like a lot of great ideas, it is sooo obvious AFTER you see someone else do it.

Don't miss the point that this is not intended for general purpose computing. Don't port OoO to the graphics chip.

Where it is huge is in signal processing. FPGAs have begun replacing even the G4s in this area recently because of the huge gains in speed vs. power consumption an FPGA affords. However, FPGAs are not bought and used as is, and end up costing a significant amount (of development time/money) to become useful. Being able to use these commodity GPUs for vector processing creates a very desirable price/processing power/power consumption option. If I were nVIDIA or ATI, I would be shoveling these guys money to continue their work.

--
I am living proof of the Peter Principle

When... by alexandre · 2004-05-08 19:32 · Score: 2, Insightful

...will someone finally port john the ripper to a new video card's graphical pipeline? :)

Not the Point-headbanger. by Anonymous Coward · 2004-05-08 19:40 · Score: 1, Insightful

There is however one thing to keep in mind. Presently our GPUs may have the headroom to play with, but with Apple's Quartz and Microsoft's Longhorn, let alone what's coming with X, that headroom may disappear, and our video cards will have to go back to being video cards.

Re:Not the Point-headbanger. by Amiga+Lover · 2004-05-08 20:00 · Score: 4, Insightful

There is however one thing to keep in mind. Presently our GPUs may have the headroom to play with, but with Apple's Quartz and Microsoft's Longhorn, let alone what's coming with X, that headroom may disappear, and our video cards will have to go back to being video cards.

On those operating systems that require them, that could very well be.

Still, it makes a nice thought that a Linux box without even X installed, but with a kickass graphics card, could crunch away doing something 4 times quicker than any windowed machine.

Maybe time for a new generation of math-processor? by Anonymous Coward · 2004-05-08 19:47 · Score: 4, Insightful

Remember the co-processors? Well, actually I don't (I'm a tad too young). But I know about them.

Maybe it's time to start making co-processing add-on cards for advanced operations such as matrix mults and other operations that can be done in parallel on a low level. Add to that a couple of hundred megs of RAM and you have a neat little helper when raytracing etc. You could easily emulate the cards if you didn't have them (or needed them). The branchy nature of the program itself would not affect the performance of the co-processor since it should only be used for calculations.

I for one would like to see this.

Bass Ackwards? by Anonymous Coward · 2004-05-08 20:01 · Score: 5, Insightful

Perhaps offloading the CPU to the GPU is the wrong way to look at things? With the apparently imminent arrival of commodity (low power) multi-CPU chips, maybe we should be considering what we need to add to perform graphics more efficiently (à la MMX et al)?

While it's true that general purpose hardware will never perform as well or as efficiently as a design specifically targeted to the task (or at least it better not), it is equally true that eventually general purpose/commodity hardware will achieve a price-performance point where it is more than "good enough" for the majority.

Violation of Compartmentalization by BlakeB395 · 2004-05-08 20:13 · Score: 2, Insightful

From a design standpoint, I can imagine a GPU that donates its power to the CPU would be a nightmare. It violates the fundamental tenet that everything should do one thing and do it well. OTOH, that tenet focuses on simplicity and maintainability over performance. Is such a tradeoff worth it?

Re:Violation of Compartmentalization by evilviper · 2004-05-08 21:55 · Score: 3, Insightful

It violates the fundamental tenet that everything should do one thing and do it well.

No, having a CPU that does everything is what violates the tenet.

I don't know about you, but I don't have a chip that does my video processing for me, I don't have a chip that does all the encryption for me, I don't have a chip that handles (en/de)capsulating network traffic, as well as handling interrupts and routing.

Having a second processor that does some specialized work that a CPU isn't good at is an improvement, not a nightmare. I'd love to be able to plug a chip or two into my PC and have them do better-than-realtime MPEG-4 encoding that doesn't affect my processor at all... Who wouldn't?

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant

Re:Wow by PitaBred · 2004-05-08 20:23 · Score: 4, Insightful

You don't seem to understand that GPUs are very specific-purpose computing devices. They aren't a general purpose processor like your CPU. They crunch matrices, and that's about it. Even all the programmable stuff is just putting parameters on the matrix churning.
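
As a rough picture of what "crunching matrices" means here, the sketch below applies one 4x4 matrix to a batch of vertices in homogeneous coordinates. It is plain Python written for illustration, not shader code; the names `transform`, `transform_all`, and `translate` are hypothetical.

```python
# Illustrative sketch only: the per-vertex matrix multiply that graphics
# hardware repeats for every vertex. All names are hypothetical.

def transform(matrix, vertex):
    """Multiply a 4x4 matrix (a list of rows) by a 4-component vertex."""
    return [sum(m * v for m, v in zip(row, vertex)) for row in matrix]


def transform_all(matrix, vertices):
    """Apply the same matrix to every vertex; each result is independent
    of the others, which is what makes the work easy to parallelize."""
    return [transform(matrix, v) for v in vertices]


# Translation by (1, 2, 3), written in homogeneous coordinates.
translate = [
    [1, 0, 0, 1],
    [0, 1, 0, 2],
    [0, 0, 1, 3],
    [0, 0, 0, 1],
]

vertices = [[0, 0, 0, 1], [1, 1, 1, 1]]
moved = transform_all(translate, vertices)
```

Swapping in a different matrix (rotation, scaling, projection) changes the parameters but not the shape of the computation.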

--
My blog. Good stuff (when I remember to update it). Read it.

Re:Maybe that's the answer... by John+Starks · 2004-05-08 20:23 · Score: 3, Insightful

GCC is an inferior compiler for the x86, whether you like it or not. Intel's optimizing C/C++ compiler is much faster according to numerous benchmarks (I'm sorry, it's too late to find the links.) On the other hand, I understand that GCC is great on the Mac, since Apple optimized it properly. (Certainly I appreciate the hard work of the various GCC teams over the years; hopefully new optimizations will continue to improve the quality of the release until it is as fast as Intel's offerings.)

In any case, why do you believe all of Apple's conveniently high numbers, but you don't believe Spec numbers reported by Dell, AMD, etc.? These are not numbers pulled out of a hat; they are standard Spec results. Thus, the numbers should be comparable from company to company. But Apple retested other companies' products and released new numbers without properly optimizing for the x86. Why is it when Microsoft pays for benchmarks, people freak out, but when Apple PERFORMS benchmarks, people believe them instantly?

There are plenty of other links out there that provide similar information. It is patently false advertising for Apple to claim that they use the fastest chip of any PC.

Oh, and re: the Linux issue, you're right. But you'll find that the x86 is faster in Linux with a proper optimizing compiler.

My issue is basically that at best -- at best! -- the results are inconclusive. At worst, Apple blatantly lied. It's foolish to believe Apple blindly just because they're the underdogs and produce a pretty, Unix-based OS. And it's foolish to hold this strange hatred for all that is x86. I don't understand this mentality.

AGP read latency not important when not real time. by anti-NAT · 2004-05-08 20:32 · Score: 2, Insightful

These applications are not likely to generate or process data at such a rate that the slow AGP read speed will matter that much, if at all.

--
The Internet's nature is peer to peer - 20050301_cs_profs.pdf

Re:Unused computing Power? by PitaBred · 2004-05-08 20:37 · Score: 4, Insightful

Lemme try to help:
a) Not equal. Apples and oranges. A GPU will do repeated calculations very, very fast, like matrix transforms and the like. A CPU on the other hand will make decisions based on input, rather than just crunching numbers
b) The main display (the GUI) already uses many tricks on the graphics card. The hard part is making sure that all graphics cards support the features. Things like the xrender extension and such are becoming more common as graphics cards and drivers get "standard" capabilities
c) Your imagination is the limit as to what it could be used for. Just realize that it's a good data processing unit, not a good program execution unit. Use each for their strengths.
d) Modified? With new cards/drivers, all it takes is OpenGL calls to start taking advantage of this power. All it really takes is someone who knows what they're doing and has a bit of inspiration.

--
My blog. Good stuff (when I remember to update it). Read it.

I think I speak for many of us by Sycraft-fu · 2004-05-08 20:42 · Score: 5, Insightful

When I say oh shut the fuck up.

Sorry for the flames, but seriously, I get so damn sick of all the "all new games suck" whiners. Look, there are legit reasons to want new technology. It is nice to have better graphics, more realistic sound, etc. It is NICE to have a game that looks and sounds more like reality. Yes, that doesn't make the game great, but that doesn't mean it's worthless.

What's more, don't pretend like all modern games suck while old games ruled. That's a bunch of bullshit. Sure, there are plenty of modern games that suck, but guess what? There are tons of old games that suck too. Thing is, you just tend to forget about them. You remember the greats that you enjoyed or heard about, the ones that helped shape gaming today. You forget all the utter shit that was released, just as is released today.

So get off it. If you don't like nice graphics, fine. Stick with old games, no one is forcing you to upgrade. But don't pretend like there is no reason to want better graphics in games.

Re:I think I speak for many of us by Tim+C · 2004-05-08 21:16 · Score: 5, Insightful

Hear, hear.

There's something that's always puzzled me a little about this site - attached to every single article about some new piece of PC tech - a faster processor, better graphics card, etc - there are a number of comments bemoaning the advance. All of them saying that people don't need the power/speed they have already, that they personally are just fine with 4 year old hardware, or, in this case, that better graphics don't make for better games. Hell, the same is true for mobile phones - I've lost count of the number of comments bemoaning advances in them, too.

It's funny, but I thought this was supposed to be a site for geeks; aren't geeks supposed to *like* newer, better toys?

To get back on topic - no, better graphics are not sufficient for a better game. However, if the gameplay is there, then they can certainly make the experience more enjoyable. Would Quake have been as much fun if it was rendered in wireframes?

Better graphics help add to the sense of realism, making the game a more immersive experience. The whole point of the majority of games is entertainment and (to an extent) escapism. Additionally, what a lot of people like the grand-parent poster seem to forget is that most of the big-name game engines are licensed for use in a number of games. Let people like id spend their time and money coming up with the most graphically intensive, realistic engine they can. Think Doom 3'll suck because the gameplay will be crap? Fine, then wait for someone to license the engine and create a better game with it. In the meantime, please shut up and remember that there are those of us who like things to be pretty, as well as useful/well made/fun/(good at $primaryPurpose)

Good graphics on their own won't make a good game, but they will help make a good game great.

--
It's official. Most of you are morons.

Re:What comes next. by renoX · 2004-05-08 20:55 · Score: 2, Insightful

Yes, one thing shocked me in their paper: they don't talk much about the precision they use.

Strange, because it is a big problem for using GPUs as coprocessors: scientific computation usually uses 64-bit floats or, on Intel, 80-bit floats!
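
A small experiment shows why precision matters for long-running numeric work. The sketch below adds 0.1 one hundred thousand times, once in Python's native 64-bit floats and once with the running total rounded to 32 bits after every step. The 32-bit format is only an assumed stand-in for whatever precision a given GPU offers (the paper's actual precision is not stated here), and the helper names are hypothetical.

```python
import struct

# Illustrative sketch only: 32-bit floats are an assumed stand-in for
# reduced GPU precision. Helper names are hypothetical.


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, count, single_precision):
    """Add `step` to a running total `count` times, optionally rounding
    the total to 32-bit precision after every addition."""
    total = 0.0
    for _ in range(count):
        total += step
        if single_precision:
            total = to_float32(total)
    return total


exact = 10000.0
sum64 = accumulate(0.1, 100000, single_precision=False)
sum32 = accumulate(to_float32(0.1), 100000, single_precision=True)

error64 = abs(sum64 - exact)
error32 = abs(sum32 - exact)
print(f"64-bit error: {error64:.3e}   32-bit error: {error32:.3e}")
```

The rounding error in the 32-bit total is many orders of magnitude larger than in the 64-bit one, and a simulation that iterates millions of times compounds that kind of drift.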

Re:Maybe time for a new generation of math-process by pe1chl · 2004-05-08 21:11 · Score: 4, Insightful

What I remember about co-processing cards and "intelligent peripheral cards" (like raid controllers or network cards with an onboard processor) is this:

There is a certain overhead because a communications protocol is to be established between the main processor and the co-processor. For simple tasks the main processor often stops and waits for the co-processor to complete the task and retrieves the results. For more complicated tasks, the main processor continues but later an interrupt occurs that the main processor must service.

You must be very careful or the extra overhead of this communication makes the execution of the task slower than without the co-processor. This is certainly going to happen at some time in the future, when you increase central processor power all the time but keep using the same co-processor.

For example, your matrix co-processor needs to be fed the matrix data, start working, and tell it is finished. Your performance would not only be limited by the processor speed, but also by the bus transfer rate, and by the impact those fast bus transfers have on the CPU-memory bandwidth available and the on-CPU cache validity.
When you are unlucky, the next CPU you buy is faster in performing the task itself.
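
The trade-off described above can be put into a back-of-the-envelope model. This is only a sketch: the function names are hypothetical and every number is a made-up assumption chosen to show the shape of the problem, not a measurement of any real bus or card.

```python
# Back-of-the-envelope model of co-processor overhead.
# All names and numbers are illustrative assumptions.


def coprocessor_time(data_bytes, bus_bytes_per_s, setup_s, compute_s):
    """Total time when offloading: a fixed setup/handshake cost, the
    transfer of data to the card and results back, and the card's own
    compute time."""
    transfer_s = 2 * data_bytes / bus_bytes_per_s  # send data, read results
    return setup_s + transfer_s + compute_s


def worth_offloading(cpu_s, data_bytes, bus_bytes_per_s, setup_s, compute_s):
    """True if offloading finishes sooner than doing the work on the CPU."""
    total = coprocessor_time(data_bytes, bus_bytes_per_s, setup_s, compute_s)
    return total < cpu_s


# Hypothetical job: 64 MB of matrix data, a 1 GB/s bus, 1 ms of setup,
# and a co-processor that needs 0.05 s for the calculation itself.
offload_s = coprocessor_time(64e6, 1e9, 0.001, 0.05)

# Against a CPU that needs 0.5 s, the card wins easily...
helps_old_cpu = worth_offloading(0.5, 64e6, 1e9, 0.001, 0.05)

# ...but against a newer CPU that needs 0.1 s, the same card loses,
# because the transfer time alone exceeds the CPU's total.
helps_new_cpu = worth_offloading(0.1, 64e6, 1e9, 0.001, 0.05)
```

In this made-up case the card computes twice as fast as the newer CPU, yet offloading still loses, because moving the data costs more than the computation saves.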

Re:Maybe time for a new generation of math-process by Squant · 2004-05-08 22:47 · Score: 2, Insightful

Math co-processor boards would be great, but still quite fixed-function.

It would be much more efficient to implement a co-processor with an FPGA: first program the FPGA with the functions to execute, then feed the data to it. When the calculation is completed, you just reprogram it to become whatever you want.

This way you would not have a math-only board, but a board that could perform many, many functions. You just need to write algorithms to exploit them.

Re:Link to previous discussion on same/similar sub by cehardin · 2004-05-08 23:08 · Score: 3, Insightful

I think the real reason Apple comes out with newer and better technology is because they have to fight for their user base. After all, if Apple's products were the same as Microsoft's, who would care?

Microsoft can afford to be lazy with their products; they make money either way. I don't think that will last forever though. Sometimes they do try hard, NT for example, but then they pile a bunch of poorly designed stuff on top of it and that ruins it. If you can, check out OS X's directory structure, it's beautiful. Now compare that to Windows' cryptic system...

"Microsoft, as usual, announced the feature after Apple shipped it"

"God I'm tired of hearing that phrase over and over again when 95% of the time it's just because Apple can control the hardware and it would be a total disaster if MS included a technology as fast as they do..."

Re:Audio DSP by SmackCrackandPot · 2004-05-08 23:23 · Score: 3, Insightful

I wonder if it's deliberate, to sell the "pro" cards they use for the rendering farms

No, it's just the way that the OpenGL and DirectX APIs evolved. There never was any need in the past to have substantial data feedback. The only need back then was to read pixelmaps and selection tags for determining when an object had been picked.

All very impressive, but.... by tiger99 · 2004-05-08 23:53 · Score: 3, Insightful

... there are a few snags, such as the fact that a GPU will not have (because it normally does not need) memory management and protection, so it is really only safe to run one task at a time. And, does this not need the knowledge of the architecture and instruction set that Nvidia seem to be unable or unwilling to disclose, hence the continuing controversy over the binary-only Linux drivers?

However I do know that a lot of people had been wondering about this for a while, could it be done, and was it worth attempting, so now we know. Maybe we shall soon see PCI cards containing an array of GPUs, I imagine the cooling arrangements will be quite interesting!

There are other things which are faster than a typical CPU, are not some of the processors in games machines 128-bit? Again, you could in theory put some of these together as a co-processor of some sort.

This was a good piece of work technically, but it says something about society that the fastest mass-produced processors, whether for GPUs or games consoles, exist because people want a higher frame rate in Quake. I can't think of any professional application that needs really fast graphics output, but many that could use faster processing. So why can't Intel and AMD stop putting everything in the one CPU (multiple CPUs with one memory are not really much better), and make co-processors again, which will do fast matrix operations on very large arrays, etc, for those who need them? The ultimate horror of the one CPU philosophy was the winmodem and winprinter, both ridiculous. Silicon is in fact quite cheap, as Nvidia have proved, people's time while they wait for long calculations to finish is not.

Maybe we are going to see an architectural change coming, I expect it will be supported by FOSS long before Longhorn, just like the AMD64.

Ever heard of PCI Express? by Egekrusher2K · 2004-05-09 02:04 · Score: 2, Insightful

Touché. However, with the upcoming advances in bus speeds (read: PCI Express) and the available bandwidth to the PCI bus, we won't have to worry about latency when using a coprocessor-type piece of hardware. There is room to grow with this new bus to almost outlandish amounts of bandwidth. Not a problem we'll run into any time soon.

--
Listen to my experimental-industrial-techno!

Re:Commodore 64 by pommiekiwifruit · 2004-05-09 03:05 · Score: 2, Insightful

I would be interested in a reference for that, since the 1541 serial link was so slow. If you are talking about Mindsmear that was not actually released, but a demo would have to be pretty clever to make the communication time worth while (and accurate with the screen still turned on).

Re:Altivec by Anonymous Coward · 2004-05-09 04:23 · Score: 1, Insightful

I disagree. The performance of Altivec-aware apps doing heavily vectorized computation shows that they beat similar apps on the Wintel side over and over, even at higher MHz (although not by as much as Apple claims). I know that other factors such as optimization, the non-vector code, etc. influence the outcome, but in the absence of true vector computation benchmarking, I can accept that Altivec is better than SSE2.

Now, compared to GPUs, I think SIMD instructions suck. Why do you think 3D games utilize GPUs rather than Altivec or SSE2? In general, you can't compare the performance of one part of a general utility chip to that of a specifically designed chip tuned to gain the highest performance without having to worry about trade-offs.

Re:The day is saved by Metasquares · 2004-05-09 08:07 · Score: 4, Insightful

This way we wouldn't have to keep getting a new video card every time we want to upgrade our system's 3-D performance.

I think you've just answered your own question.

Slashdot Mirror
