Slashdot Mirror


BrookGPU: General Purpose Programming on GPUs

An anonymous reader writes "BrookGPU is a compiler and runtime system that provides an easy, C-like programming environment (read: No GPU programming experience needed) for today's GPUs. A shader program running on the NVIDIA GeForce FX 5900 Ultra achieves over 20 GFLOPS, roughly equivalent to a 10 GHz Pentium 4. Combine this with the increased memory bandwidth, 25.3 GB/sec peak compared to the Pentium 4's 5.96 GB/sec peak, and you've got a seriously fast compute engine, but programming GPUs has been a real pain. BrookGPU adds simple data-parallel language additions to C which allow programmers to specify certain parts of their code to run on the GPU. The compiler and runtime take care of the rest. Here is the Project Page and Sourceforge page."
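
As a rough sanity check on the figures quoted in the summary, the arithmetic can be worked through directly. The reading that the "10 GHz" equivalence is a plain FLOPS-per-clock extrapolation is an assumption, not something the summary states.

```python
# Figures quoted in the summary.
gpu_gflops = 20.0      # GeForce FX 5900 Ultra shader throughput, GFLOPS
p4_equiv_ghz = 10.0    # claimed "equivalent" Pentium 4 clock, GHz
gpu_bw_gb_s = 25.3     # GPU peak memory bandwidth, GB/sec
p4_bw_gb_s = 5.96      # Pentium 4 peak memory bandwidth, GB/sec

# Assumption: the equivalence is a straight FLOPS-per-clock extrapolation,
# so the comparison implies this many floating-point operations per P4 cycle.
implied_flops_per_cycle = gpu_gflops / p4_equiv_ghz

# Ratio of the two quoted peak bandwidths.
bandwidth_ratio = gpu_bw_gb_s / p4_bw_gb_s

print(f"implied P4 FLOPs per cycle: {implied_flops_per_cycle:.1f}")
print(f"GPU/CPU peak bandwidth ratio: {bandwidth_ratio:.2f}x")
```

Under that assumption the comparison credits the Pentium 4 with two floating-point operations per clock, and the quoted bandwidth figures differ by a bit more than a factor of four.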

32 of 275 comments

  1. Basically like having two processors... by Anonymous Coward · · Score: 4, Interesting

    I wonder how long till we see a (insert worthwhile cause here)-At-Home client that supports this?

  2. Cool ... by torpor · · Score: 5, Interesting

    ... can you say 'software synthesists' wet dream?

    Oh, suddenly, that 'game investment' also gives you a few 100 extra voices of polyphony?

    Sweet ... $5 to the first person to use Brook to make a synthesizer. :)

    --
    ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
  3. first link is incorrect by 2.246.1010.78 · · Score: 5, Informative

    but the link to the project page is correct.

  4. Like the good old days by fiskbil · · Score: 5, Funny

    Reminds me of the good old days when you used the processors in the C64 tapedrive to compute stuff. Wouldn't want to waste those precious cycles.

    I'm sure a lot of old farts will tell me how they used some serial controller to compute stuff back in the 60's and that I'm just a little kid. :)

  5. Re:High Performance for General Purpose? by fidget42 · · Score: 4, Informative

    Actually, since "graphics-related things" are all matrix operations, this would turn the GPU into a high-end vector (matrix) engine.

    --
    The dogcow says "Moof!"
  6. wait a minute by Janek+Kozicki · · Score: 5, Interesting

    A shader program running on the NVIDIA GeForce FX 5900 Ultra achieves over 20 GFLOPS, roughly equivalent to a 10 GHz Pentium 4.

    wait, if there is a technology that allows construction of a GPU that is 3 times faster than the fastest CPUs, why don't Intel and AMD use this technology to build CPUs that are 3 times faster?

    are you sure that you can compare the speed of GPU and CPU?

    --
    #
    #\ @ ? Colonize Mars
    #
    1. Re:wait a minute by the+uNF+cola · · Score: 5, Informative

      You are assuming that GPU technologies can be used in a CPU. Just because something is applicable in one instance doesn't mean it is in all instances. Making some things efficient may take away from the efficiency of others, but in the case of such a specialized chip, it may not matter.

      It may be OK to compare the speed of a GPU and a CPU if they are in fact different. If a GPU were a CPU made with cheaper material, yeah, it would be unfair. But as life goes, they both have their merits... so why not? A GPU is prolly best at some matrix math transforms... or not. :)

      --
      "I'm not bright. Big words confuse me. But Wanda loves me and that should be enough for you." - Cosmo

    2. Re:wait a minute by enigma48 · · Score: 5, Insightful

      Definitely possible - general purpose CPUs have to do everything, whereas graphics cards can specialize and do what little they can, faster.

      Also, good point about comparing GHz to GHz - AMD CPUs do more per cycle than Intel, but are also clocked much lower. You could look at a subset of instructions (ie: FLoating-point OPerations (FLOPS)) but this only gives you a piece of the overall performance picture.

      Without having read the article, my guess is they extrapolated (educated, math-based guess) how fast a 10GHz P4 would perform and compared the results that way.

      I'd LOVE to see this tech built into a SETI or Folding@Home client (steroids version). (Imagine the kids - "Mom, I need the Radeon 9800XT to find a cure for Grandma's cancer!")

    3. Re:wait a minute by Entropy_ajb · · Score: 5, Informative

      Because CPUs are limited to running instructions (for the most part) in serial. GPUs get to run a large number of instructions in parallel. As some posts above mentioned, a lot of the stuff the GPU can do is vector and matrix multiplication, therefore the GPU is really good at multiplying a lot of numbers times a lot of numbers at once. But in everyday life you aren't multiplying a bunch of numbers times a bunch of numbers at once, you are multiplying one number times another, then multiplying the result times a number, and so on. GPUs are built for a specific task, and at that task they are very fast, but outside that task they won't be able to compete with a real CPU. And on top of all of that I can buy 3 2.4GHz P4s for the price of a GeForce FX5950.
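
The distinction drawn in this comment can be sketched in a few lines. This is only an illustration in plain Python: the function names are made up here, and nothing below runs on a GPU. The first function has no dependencies between its multiplications, so they could all happen at once; the second needs each result before it can start the next step.

```python
def elementwise_multiply(xs, ys):
    """Data-parallel work: every product is independent of the others,
    so in principle all of them could be computed at the same time."""
    return [x * y for x, y in zip(xs, ys)]


def dependent_chain(x0, r, steps):
    """Serial work: each step needs the previous result (x -> r*x*(1-x)),
    so the multiplications cannot be spread across parallel units."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x


print(elementwise_multiply([1, 2, 3, 4], [5, 6, 7, 8]))
print(dependent_chain(0.5, 2.0, 10))
```

Hardware built for the first pattern gains little on the second, which is the point being made about general-purpose code.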

    4. Re:wait a minute by mdpye · · Score: 4, Interesting
      And on top of all of that I can buy 3 2.4GHz P4s for the price of a GeForce FX5950

      But you forget the 256MB (at least) RAM on a steaming fast interface that you get with the GeForce... It makes the P4s' cache look pretty paltry in size by comparison.

      MP
    5. Re:wait a minute by Kjella · · Score: 4, Informative

      wait, if there is a technology that allows construction of a GPU that is 3 times faster than the fastest CPUs, why don't Intel and AMD use this technology to build CPUs that are 3 times faster?

      are you sure that you can compare the speed of GPU and CPU?


      Well, yes and no. In the same way, you can take a render farm and say that "this provides the equivalent of a 100GHz Pentium", which might be true for that specific task. You see it already between CPUs: compare Pentium, Xeon, Athlon XP and Athlon 64. Do you get one benchmark "X is 3% faster than Y"? No. Faster at some, slower at others. For a specific benchmark, the difference can be pretty big already among "general" processors.

      A specialized processor like a GPU will show much greater variation. It might really shine on some, really suck on others. Which is why it's no good using a GPU as a CPU. Those numbers tell you that it can be much faster than the fastest CPU around. Or better yet, if you can make it run in parallel to the normal CPU, give you a total performance which may theoretically be about 13GHz (10 + 3), where 3 of those can be general-purpose operations. Or it may be a task the GPU runs like a dog, and isn't even worth the overhead.

      Kjella

      --
      Live today, because you never know what tomorrow brings
    6. Re:wait a minute by barik · · Score: 5, Interesting

      Are you sure that you can compare the speed of GPU and CPU?

      Professor Pat Hanrahan, of Stanford University, made a stab at answering this question in his presentation 'Why is Graphics Hardware so Fast?'. The first half of the presentation focuses on this question, while the second half covers programming languages that utilize this hardware. Specifically, the Stanford Real-Time Shading Language (RTSL) and Brook are discussed. Overall, it's a good presentation that should get you up to speed with the basics of what's happening in this area of research.

  7. How does this look? by adrianbaugh · · Score: 5, Interesting

    I'm completely new to meddling with graphics cards, so apologies if this is a silly question: when programs utilising the GPU for arbitrary calculations are running, does the screen go weird, or is there a way of stopping the output being displayed? A screenful of junk might not matter to a scientist leaving their computer to crunch numbers for a few months, but it wouldn't be good for a general-purpose program.

    --
    "'I pass the test,' she said. 'I will diminish, and go into the West, and remain Galadriel.'"
    - JRR Tolkien.
    1. Re:How does this look? by Anonymous Coward · · Score: 5, Informative

      Nope. Nothing appears on your screen until the contents of the area of memory known as the "frame buffer" are rewritten by a program (on either the GPU or CPU). The GPU can execute math code all day and you won't see the results unless it deliberately modifies the frame buffer.

  8. Re:High Performance for General Purpose? by Anonymous Coward · · Score: 5, Insightful

    "Graphics-related" things include floating point mathematics, linear algebra, and vector operations. If you are doing anything computationally intensive, this might be useful. You don't have to actually use the hardware to do anything graphical if you are just interested in crunching numbers.

  9. I am not an EE, but... by unfortunateson · · Score: 5, Interesting

    It would seem to me that the GPU is not going to be as general-purpose as the CPU, but could still attain the high mathematical throughput with vector-oriented processing.

    Doing string searches, complex logic analyses, etc. would probably suck, but big data manipulations, such as SETI-style wave transformations, molecular analysis, etc., might be able to take advantage of them.

    --
    Design for Use, not Construction!
  10. Good point. by yoshi_mon · · Score: 4, Insightful

    After taking a quick peek at the language part of the project, it seems right now that most of its functions are all about sets of data and how to move them around.

    Makes sense of course, as that is what a GPU is all about. (Yes, I'm vastly over-simplifying here.) So I would gather that it might be used for types of data that are streamed a lot? Maybe used for video editing, real-time video, etc., where you're trying to deal with a lot of data at once that you're trying to move around, and not just store or have to perform some more complicated types of functions upon.

    However, I'm no 3D programmer and I sure would love a more detailed analysis of the potential for this.
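
The "sets of data" idea can be pictured as a small function applied to every element of one or more streams. The sketch below is plain Python, not Brook syntax; `run_kernel` and `saxpy` are hypothetical names chosen for illustration, and the loop here runs sequentially where a GPU could run the invocations in parallel.

```python
def run_kernel(kernel, *streams):
    """Apply `kernel` independently to corresponding elements of each stream.

    No invocation reads another invocation's output, which is what would
    let hardware process many elements at once.
    """
    return [kernel(*elems) for elems in zip(*streams)]


def saxpy(a):
    """Build a per-element kernel computing a*x + y (hypothetical example)."""
    def kernel(x, y):
        return a * x + y
    return kernel


xs = [1.0, 2.0, 3.0]
ys = [10.0, 20.0, 30.0]
result = run_kernel(saxpy(2.0), xs, ys)
print(result)
```

Video filters and similar per-sample work fit this shape well, because each output depends only on its own inputs.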

    --

    Really, I know what I'm doing...Ohhhh, look at the shiny buttons!
  11. The deaf leading the blind... by Kjella · · Score: 4, Informative

    ...but I assume that in any advanced texturing/shading/bump mapping/other GFX function rendering, you apply all the different effects, and when you're done, specifically call that the frame is to be displayed on screen. (E.g. why your FPS != your monitor refresh rate)

    I would assume that this program simply never calls the drawing function, but instead gets the results back from the GPU. The normal screen should be able to run in the meanwhile (I assume you can e.g. build a 3D environment while showing a 2D cutscreen), so I would think you can have a plain GUI, as long as it doesn't need to use anything advanced.

    Kjella

    --
    Live today, because you never know what tomorrow brings
  12. Homepage of GPGPU research by zymano · · Score: 4, Informative


    www.gpgpu.org

    Very cool. Vector/Graphics processors could one day overtake General processors. They are way more energy efficient too.

  13. Re:Fast Fourier Transform by Kazymyr · · Score: 4, Interesting

    Not to mention that you can put several PCI video cards in the same cheap PC. Multiply power by N.

    --
    I hadn't known there were so many idiots in the world until I started using the Internet -Stanislaw Lem
  14. Drawing text with GPU shader units? by jonsmirl · · Score: 4, Interesting
    Has anyone tried drawing text with GPU shader units? It would work something like this:

    1) Each character would have its own shader program.
    2) You would set the shader program, draw a rectangle, and the character would appear.
    3) The shader programs would be automatically generated by processing TrueType files.

    To implement:
    1) Break Truetype outline up into a number of convex curve segments.
    2) Each of these curve segments would be represented as a set of constants in the shader program
    3) For each pixel, test a line from pixel to an edge.
    4) If the number of segments crossed is odd the pixel is black else white.
    The algorithm can be refined to add antialiasing and hinting.
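
Steps 3 and 4 above are the even-odd crossing test. Here is a minimal CPU-side sketch in Python, with two loud simplifications: the outline is made of straight line segments rather than TrueType curve segments, and the per-pixel test is an ordinary function rather than a shader. The function names are hypothetical.

```python
def inside_outline(px, py, outline):
    """Even-odd test: cast a ray from (px, py) towards +x and count the
    outline edges it crosses. An odd count means the point is inside.

    `outline` is a closed polygon given as a list of (x, y) vertices.
    """
    crossings = 0
    n = len(outline)
    for i in range(n):
        x1, y1 = outline[i]
        x2, y2 = outline[(i + 1) % n]
        # Only edges that straddle the horizontal line through the pixel count.
        if (y1 > py) != (y2 > py):
            # x coordinate where the edge meets that horizontal line.
            x_at_py = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_at_py > px:
                crossings += 1
    return crossings % 2 == 1


def rasterize(outline, width, height):
    """Test every pixel centre; '#' is black (inside), '.' is white."""
    return [
        "".join(
            "#" if inside_outline(col + 0.5, row + 0.5, outline) else "."
            for col in range(width)
        )
        for row in range(height)
    ]


square = [(1, 1), (5, 1), (5, 4), (1, 4)]
for line in rasterize(square, 6, 5):
    print(line)
```

Because every pixel is tested independently, the same test could in principle be evaluated per fragment, which is what makes the idea a candidate for shader hardware.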

    What you end up with is text that is clear at any resolution. The size of the text is controlled by the rectangle you draw it in. The text can also be clearly rotated and sheared.

    An obvious optimization is to get the GPU vendors to add a shader instruction to do the calculation for which side of the Bézier curve segment the current point lies.

    While not important for games, drawing text is critical for desktops. And we all know about the current trends to draw desktops with 3D hardware.

  15. Brook by belmolis · · Score: 5, Insightful

    This looks like a straightforward and clean extension that experienced C/C++ programmers won't find difficult to learn, but it isn't entirely clear to me whether just using this language, without any knowledge of GPU architecture, will lead to big improvements in performance. Granted, you don't need to know the details, but you've got to have an idea of what it is that you're trying to do and in a general way how the special constructs of the language allow you to do that. As with other such language extensions, you can nominally write in the language but not really use the extensions (how many "C++" programs have you seen that were really C programs with // comments and a few couts?) or use them in unintended ways that prevent the intended optimization. It seems to me that if the project really is aiming at programmers who are not familiar with GPUs, they need at least to provide a brief introduction to the special properties of GPU architecture and some guidelines as to how to use the features of the language to take advantage of them. At present I don't find this either on the web sites or in the distribution.

  16. Re:Fast Fourier Transform by jonsmirl · · Score: 5, Informative

    http://www.cs.unm.edu/~kmorel/documents/fftgpu/

    The FFT on a GPU
    This page contains supplemental material for the following paper.

    Moreland, K and Angel, E. "The FFT on a GPU." In SIGGRAPH/Eurographics Workshop on Graphics Hardware 2003 Proceedings, pp. 112-119, July 2003.

  17. Research by dfj225 · · Score: 4, Insightful

    I've always wondered why certain research programs (like Folding@home or SETI@home) don't use this type of code. My GPU sees more free time than my CPU, plus it would probably get the work done faster. Also, imagine the speed increase of utilizing both the GPU and the CPU to their fullest potential. Now that's some fast folding!

    --
    SIGFAULT
    1. Re:Research by BiggerIsBetter · · Score: 4, Interesting

      I (and presumably others) have asked some project leaders about this, but it seems to come down to testing and support of various cards. Also, remember that this is relatively unknown technology - Amiga blitting aside ;-) - you have to be pretty sure it's going to give accurate and consistent results before using it seriously. Find-A-Drug was my project of interest, and they have a Linux version too.

      --
      Forget thrust, drag, lift and weight. Airplanes fly because of money.
  18. GPU opcodes by Anonymous Coward · · Score: 4, Informative

    Here is a Beyond3d link that has some opcode info. Look around their site for a NV30 vs R300 architecture document that has lots of great stuff. If you are looking for the best s/n ratio, Beyond3d is one of the best. All meat, little fanboyism.

  19. Re:The future is the past by Total_Wimp · · Score: 4, Interesting

    PCI-X can fix this data bus in other ways as well. Motherboards come with one AGP slot, but PCI-X can and will provide many expansion slots.

    Picture five high end GPUs on the motherboard eclipsing the single high-end cpu for a fraction of the price. Intel and AMD would be forced to cut the asking price of their products to compete. We could finally see some real four-way competition for "processors".

    TW

  20. Re:HP for GP?-AGP Bottleneck. by Nexx · · Score: 5, Insightful

    WARNING: Lots of conjecture involved.

    That said, if you can fit your data sets and your program on to the video memory (128MB isn't uncommon on high-end), and you're doing lengthy calculations on these sets while being only interested in the results (again, not uncommon in HPC), then the relative slowness of reading these results back becomes a nonissue.
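
The amortization argument can be put as a simple inequality: offloading pays off when upload, GPU compute, and readback together take less time than doing the work on the CPU. The sketch below is a toy model, and every timing in it is a made-up number for illustration, not a measurement.

```python
def offload_time(gpu_compute_s, upload_s, readback_s):
    """Total wall time when the work is shipped to the GPU and back."""
    return upload_s + gpu_compute_s + readback_s


def worth_offloading(cpu_compute_s, gpu_compute_s, upload_s, readback_s):
    """True when the GPU path, transfers included, beats the CPU path."""
    return offload_time(gpu_compute_s, upload_s, readback_s) < cpu_compute_s


# Hypothetical numbers: the same fixed transfer cost, two job sizes.
long_job = worth_offloading(cpu_compute_s=600.0, gpu_compute_s=100.0,
                            upload_s=1.0, readback_s=2.0)
short_job = worth_offloading(cpu_compute_s=0.6, gpu_compute_s=0.1,
                             upload_s=1.0, readback_s=2.0)
print(long_job, short_job)
```

With the same transfer cost, the long-running job comes out ahead and the short one does not, which matches the point about lengthy calculations where only the final results are read back.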

    Does that help? :)

  21. Re:High Performance for General Purpose? by BrainInAJar · · Score: 4, Interesting

    would the precision be enough though? as far as i know, GPUs do a lot of rounding off
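
One way to get a feel for the precision question is to repeat the same accumulation at two precisions. The sketch below assumes the GPU works in 32-bit IEEE single precision (an assumption; shader precision differs between cards) and emulates it on the CPU by rounding every intermediate result with Python's `struct` module.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, count, rounder=lambda v: v):
    """Add `step` to a running total `count` times, applying `rounder`
    after every addition to emulate a lower-precision accumulator."""
    total = rounder(0.0)
    step = rounder(step)
    for _ in range(count):
        total = rounder(total + step)
    return total


count = 100_000
expected = count / 10  # the exact sum of 0.1 added `count` times

err64 = abs(accumulate(0.1, count) - expected)
err32 = abs(accumulate(0.1, count, to_float32) - expected)
print(f"64-bit error: {err64:.3g}")
print(f"32-bit error: {err32:.3g}")
```

The single-precision total drifts much further from the exact sum than the double-precision one, so whether the precision is "enough" depends on how many operations are chained and how much error the application can tolerate.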

  22. Re:Fast Fourier Transform by BiggerIsBetter · · Score: 4, Funny

    Multiply power by N.

    You work for Nvidia, don't you?

    --
    Forget thrust, drag, lift and weight. Airplanes fly because of money.
  23. GPU use for scientific programming. by kiniry · · Score: 4, Interesting

    Researchers at Caltech and other institutions have been looking at this for about three years. See "Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid" by Bolz, Farmer, Grinspun and Schroder (SIGGRAPH 2003), for example. The paper, illustrations, and movies are available from Dr. Grinspun's homepage. The primary problem with the approach at the time this work was done was the limited bandwidth of texture-related operations in OpenGL, based upon improper assumptions in pipeline optimization.

    --
    Joseph R. Kiniry
    http://kind.ucd.ie/~kiniry/
    Lecturer
    UCD School of Computer Science and Informatics
  24. Imagine a Beowulf Cluster... no, seriously by billstewart · · Score: 4, Interesting
    There's a cluster of Sony Playstations at UIUC (BBC) that's using the Emotion Engine to do numbercrunching and running Linux on the main processors to do communications and I/O. It's probably not strictly Beowulf, because it's using the Playstation version of Linux.

    This cluster has 70 Playstations (one article said that they'd ordered 100, but only 70 are in the cluster... Obviously the others are being used for "research".)

    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks