Using GPUs For General-Purpose Computing
Paul Tinsley writes "After seeing the press releases from both Nvidia and ATI announcing their next generation video card offerings, it got me thinking about what else could be done with that raw processing power. These new cards weigh in with transistor counts of 220 and 160 million (respectively) with the P4 EE core at a count of 29 million. What could my video card be doing for me while I am not playing the latest 3D games? A quick search brought me to some preliminary work done at the University of Washington with a GeForce4 Ti 4600 pitted against a 1.5GHz P4. My favorite excerpt from the paper:
'For a 1500x1500 matrix, the GPU outperforms the CPU by a factor of 3.2.' A PDF of the paper is available here."
Now I finally have a use for the 20 Voodoo 2 cards I have in a box in the basement. Now I can have my very own supercomputer. I just need some six pci slot motherboards.... Instant cluster!
Humor from a Genetically Molested Mind
Intel's been telling me for years that I need faster hardware from THEM to get the job done...
You mean........ they were lying?!?!?
CRAP!
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
http://developers.slashdot.org/article.pl?sid=03/12/21/169200&mode=thread&tid=152&tid=185
Here's an HTML version of the PDF, thanks to Google.
GPUs are very fast ... at performing vector and matrix calculations. This is the whole point. If general computing CPUs were capable of doing vector or matrix calcs very efficiently, we would probably not have GPUs.
The Pentium 4 EE actually has 178 million transistors, which puts it in between ATI's and NVIDIA's latest.
In all of this, keep in mind that there's computing and there's computing...the kind of computing power in a GPU is excellent for doing the same numeric computation to every element of a large vector or matrix, not so much for branchy decisiony type things like walking a binary tree. You wouldn't want to run a database on something structured like a GPU (or an old vector-processing Cray), but something like a simulation of weather or molecular modeling could be perfect for it.
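To make that contrast concrete, here is a minimal Python sketch. It is not from any of the posts, it runs on the CPU, and the names `saxpy` and `tree_contains` are made up for illustration. The first function does the same arithmetic on every element independently, which is the shape of work a GPU is built for; the second takes a data-dependent branch at every step, so each step has to wait for the one before it.

```python
# Illustrative only: plain Python on the CPU, no GPU involved.
# Function names are hypothetical and chosen for this example.

def saxpy(a, xs, ys):
    """Data-parallel: each output depends only on its own pair of inputs,
    so every element could in principle be computed at the same time."""
    return [a * x + y for x, y in zip(xs, ys)]


def tree_contains(node, key):
    """Branchy: each comparison decides where to look next, so the steps
    form a serial chain. A node is a (value, left, right) tuple or None."""
    while node is not None:
        value, left, right = node
        if key == value:
            return True
        node = left if key < value else right
    return False


if __name__ == "__main__":
    print(saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
    tree = (8, (3, (1, None, None), (6, None, None)), (10, None, None))
    print(tree_contains(tree, 6), tree_contains(tree, 7))
```

A weather or molecular simulation mostly looks like the first function repeated over a huge grid, which is why it fits; a database index lookup looks like the second.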
The similarities of a GPU to a vector processing system bring up an interesting possibility...could Fortran see a renaissance for writing shader programs?
General-purpose computation using graphics hardware has been a significant topic of study for the last few years. Pointers to a lot of papers and discussion on the subject are available at: www.gpgpu.org
There is a course that has been offered at Caltech since last summer on using GPUs for numerical work. The course page is here.
:wq
"Utilize the sheer computing power of your video card!"
New market blitz, hmmmm.
SETI ports their code, and within five days their average completed work units increase 1000 fold. 13 hours later, they have evidence of intelligent life at 30000 locations within one degree.
Microsoft gets the hint, and comes out with a brilliant plan to utilize GPUs to speed up their OS and add bells and whistles to their UI.
And, once again, Apple and Quartz Extreme are ignored.
Before you get excited just remember how asymmetric the AGP bus is. Those GPUs will be put to much better use when we get them as 64-bit PCI cards.
What's interesting about new video cards is their memory capacity, 128 or 256 MB, and that on some new cards this memory is accessible at 900 MHz over a 256-bit data path (which is a lot faster than a CPU with DDR 400 installed).
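A rough back-of-the-envelope check of that comparison, as a Python sketch. Two assumptions here are not stated in the post: that "900 MHz" is the effective transfer rate of the card's memory, and that "DDR 400" means a single 64-bit channel at 400 million transfers per second. The helper name `peak_bandwidth_gb_s` is made up for this example.

```python
# Peak memory bandwidth, under the assumptions named in the lead-in:
# - 900 MHz is the card's effective memory transfer rate (assumption)
# - DDR 400 is one 64-bit channel at 400 million transfers/s (assumption)

def peak_bandwidth_gb_s(transfers_per_second, bus_width_bits):
    """Peak GB/s = transfers per second * bytes moved per transfer / 1e9."""
    return transfers_per_second * (bus_width_bits / 8) / 1e9


gpu = peak_bandwidth_gb_s(900e6, 256)   # video card memory
cpu = peak_bandwidth_gb_s(400e6, 64)    # single-channel DDR 400

if __name__ == "__main__":
    print(f"GPU memory: {gpu:.1f} GB/s")
    print(f"DDR 400:    {cpu:.1f} GB/s")
    print(f"ratio:      {gpu / cpu:.1f}x")
```

Under those assumptions the card's memory peaks at about 28.8 GB/s against about 3.2 GB/s for the CPU side, roughly nine times more. These are theoretical peaks, and a dual-channel DDR setup would halve the gap.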
All that processing power, and the latest games still run at about 22 frames per second, if that.
The CPU can do six billion instructions a second, the GPU can do 18 billion, and every last cycle is being used to stuff a 40MB texture into memory faster. What a waste. Yeah, the walls are even more green and slimy. Whoop-de-fucking-do.
Wouldn't it be great if all that processing power could be used for something other than yet-another-graphics-demo?
Like, maybe some new and innovative gameplay?
Business isn't willing to pay for products, innovation and careers, so we get brands, mortgage commercials and layoffs.
Creating a way to use the specialized GPUs for vector processing that is not graphics related is ingenious. Like a lot of great ideas, it is sooo obvious AFTER you see someone else do it.
Don't miss the point that this is not intended for general purpose computing. Don't port OoO to the graphics chip.
Where it is huge is in signal processing. FPGAs have begun replacing even the G4s in this area recently because of the huge gains in speed vs. power consumption an FPGA affords. However, FPGAs are not bought and used as is, and end up costing a significant amount (of development time/money) to become useful. Being able to use these commodity GPUs for vector processing creates a very desirable price/processing power/power consumption option. If I were nVIDIA or ATI, I would be shoveling these guys money to continue their work.
I am living proof of the Peter Principle
If you have a matrix solver, there is no telling what you can do. And as I remember, these papers show that the GPU is faster than the CPU at the same matrix calculations.
# Linear Algebra Operators for GPU Implementation of Numerical Algorithms
Jens Krüger, Rüdiger Westermann
# Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid
Jeff Bolz, Ian Farmer, Eitan Grinspun, Peter Schröder
# Nonlinear Optimization Framework for Image-Based Modeling on Programmable Graphics Hardware
Karl E. Hillesland, Sergey Molinov, Radek Grzeszczuk
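For anyone wondering what a "matrix solver" of the conjugate gradient kind looks like, here is a minimal sketch in plain Python. It is a textbook CPU version written for this post, not the implementation from any of the papers above, and the helper names `dot` and `matvec` are made up. The thing to notice is that the whole loop is dot products, matrix-vector products and element-wise vector updates, which is exactly the sort of work that maps onto graphics hardware.

```python
# Textbook conjugate gradient for a small dense system A x = b.
# Assumes A is symmetric positive-definite. Illustrative CPU code only.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    """Matrix-vector product, with A stored as a list of rows."""
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Iteratively solve A x = b, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x, which equals b when x = 0
    p = list(r)              # current search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # residual norm small enough: done
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old       # keeps the new direction conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))
```

Every list comprehension in the loop touches each element independently, so on hardware with many parallel units the per-iteration cost is dominated by the matrix-vector product and the reductions inside `dot`.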
Using GPUs For General-Purpose Computing
I'm glad that finally they started to use the General-Purpose Unit. What took them so long?
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
Dude, you obviously have never tried to sleep in a motorcycle.
KFG
Perhaps offloading the CPU to the GPU is the wrong way to look at things? With the apparently imminent arrival of commodity (low power) multi-CPU chips, maybe we should be considering what we need to add to perform graphics more efficiently (à la MMX et al.)?
While it's true that general purpose hardware will never perform as well or as efficiently as a design specifically targeted to the task (or at least it better not), it is equally true that eventually general purpose/commodity hardware will reach a price-performance point where it is more than "good enough" for the majority.
When I say oh shut the fuck up.
Sorry for the flames, but seriously, I get so damn sick of all the "all new games suck" whiners. Look, there are legit reasons to want new technology. It is nice to have better graphics, more realistic sound, etc. It is NICE to have a game that looks and sounds more like reality. Yes, that doesn't make the game great, but that doesn't mean it's worthless.
What's more, don't pretend like all modern games suck while old games ruled. That's a bunch of bullshit. Sure, there are plenty of modern games that suck, but guess what? There are tons of old games that suck too. Thing is, you just tend to forget about them. You remember the greats that you enjoyed or heard about, the ones that helped shape gaming today. You forget all the utter shit that was released, just as is released today.
So get off it. If you don't like nice graphics, fine. Stick with old games, no one is forcing you to upgrade. But don't pretend like there is no reason to want better graphics in games.
QE is cool, but it doesn't do anything similar at all to what they're talking about here. FFTs on an NV30 are only incidentally related to texture mapping window contents. Check out gpgpu.org or BrookGPU. In a sense, the idea is to treat modern graphics hardware as the next step beyond SIMD instruction sets. Incidentally, e17 exploited (hardware) GL rendering of 2D graphics via evas a bit before Apple put that into OS X.
This concept was being used back in 1988. The Commodore 64 (1 MHz 6510, a 6502-like microprocessor) had a peripheral 5.25" disk drive called the 1541, which itself had a 1 MHz 6502 CPU in it, connected via a serial link.
It became common practice to introduce fast loaders: these were partially resident in the C64, and also in the 1541: effectively replacing the 1541's limited firmware.
However, demo programmers figured out how to utilise the 1541: one particular demo uploaded a program to the 1541 at the start, then on every screen redraw uploaded vectors to the 1541, which performed calculations in parallel with the C64. At the end of the screen, the C64 fetched the results from the 1541 and incorporated them into the next frame.
Equally, a GPU provides a similar capability if so used.
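The shape of that trick (hand work to a second processor, keep doing your own work, collect the answer at the end of the frame) can be sketched in a few lines of Python. This is only an illustration of the pattern: a worker thread stands in for the 1541 or the GPU, the names `transform` and `run_frames` are made up, and Python threads will not actually make this arithmetic faster.

```python
# Offload pattern sketch: a worker thread plays the role of the second
# processor. Names and the "work" itself are hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor


def transform(vectors, scale):
    """Stand-in for the calculations handed to the second processor."""
    return [(x * scale, y * scale) for x, y in vectors]


def run_frames(frames, scale=2):
    """Per frame: upload vectors, do local work, fetch results at the end."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as helper:
        for vectors in frames:
            pending = helper.submit(transform, vectors, scale)  # "upload"
            local = len(vectors)       # placeholder for the host's own work
            results.append((local, pending.result()))           # "fetch"
    return results


if __name__ == "__main__":
    print(run_frames([[(1, 2)], [(3, 4), (5, 6)]]))
```

The win only shows up when the host has useful work to do between the upload and the fetch, and when shipping the data across the link costs less than doing the calculation locally. That was true over the C64's serial link for the right workload, and it is the same trade-off with a GPU on the other side of a bus.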