Using GPUs For General-Purpose Computing
Paul Tinsley writes "After seeing the press releases from both Nvidia and ATI announcing their next generation video card offerings, it got me to thinking about what else could be done with that raw processing power. These new cards weigh in with transistor counts of 220 and 160 million (respectively) with the P4 EE core at a count of 29 million. What could my video card be doing for me while I am not playing the latest 3d games? A quick search brought me to some preliminary work done at the University of Washington with a GeForce4 TI 4600 pitted against a 1.5GHz P4. My Favorite excerpt from the paper:
'For a 1500x1500 matrix, the GPU outperforms the CPU by a factor of 3.2.' A PDF of the paper is available here."
http://developers.slashdot.org/article.pl?sid=03/1 2/21/169200&mode=thread&tid=152&tid=18 5
Here's a HTML version of the PDF, thanks to Google.
General-purpose computation using graphics hardware has been a significant topic of study for the last few years. Pointers to a lot of papers and discussion on the subject are available at: www.gpgpu.org
Yes, it's true that it has that many transistors BUT, only 29 million of them are part of the core, the rest is memory. The transistor count on the video cards does not count the ram.
Is a course being offered at caltech since last summer on using gpus for numerical work. Course page is here.
:wq
Before you get excited just remember how asymmetric the APG bus is. Those GPUs will be at much better use when we get them as 64bit pci cards.
If you have a matrix solver, there is no telling what you can do. And i remember, these papers show that the speed is faster than the matrix calculations of the same stuff using the CPU.
# Linear Algebra Operators for GPU Implementation of Numerical Algorithms
Jens Krüger, Rüdiger Westermann
# Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid
Jeff Bolz, Ian Farmer, Eitan Grinspun, Peter Schröder
# Nonlinear Optimization Framework for Image-Based Modeling on Programmable Graphics Hardware
Karl E. Hillesland, Sergey Molinov, Radek Grzeszczuk
QE is cool, but it doesn't do anything similar at all to what they're talking about here. FFTs on an NV30 are only incidentally related to texture mapping window contents. Check out gpgpu.org or BrookGPU. In a sense, the idea is to treat modern graphics hardware as the next step beyond SIMD instruction sets. Incidentally, e17 exploited (hardware) GL rendering of 2D graphics via evas a bit before Apple put that into OS X.