Slashdot Mirror


GPU Gems

Martin Ecker writes "Following other entrants in the successful series of graphics and game programming-related "Gems" books, Randima Fernando of NVIDIA has recently released GPU Gems - Programming Techniques, Tips, and Tricks for Real-Time Graphics through Addison-Wesley. As the title indicates, GPU Gems contains a collection of tips and tricks for real-time graphics programming with graphics processing units (GPUs) that are found on modern graphics adapters." Read on for the rest of Ecker's review, and for a few more notes on the book.

GPU Gems – Programming Techniques, Tips, and Tricks for Real-Time Graphics
author: Randima Fernando (Editor)
pages: 816
publisher: Addison-Wesley Publishing
rating: 9
reviewer: Martin Ecker
ISBN: 0321228324
summary: An excellent book containing many "gems" for real-time shader developers.

The book is intended for an audience already familiar with programmable GPUs and high-level shading languages and is divided into six parts that concentrate on particular domains of graphics programming. Each part contains between five and nine chapters, with the entire book containing a total of 42 chapters. Each chapter was written by one or more renowned experts from a gaming company, tool developer, film studio, or the academic community. About half of the contributors are from NVIDIA's Developer Technology group. The chapters focus on effects and techniques that help developers get the most out of current programmable graphics hardware. With approximately twenty pages per chapter, the contributors are able to describe various effects and techniques in depth, as well as delve into the required mathematics.

All the shaders in the book are written in the high-level shading languages Cg and HLSL. The demo programs on the CD-ROM that accompanies the book use both Direct3D and OpenGL as graphics API, depending on the authors' preferences. Even though the shaders are in Cg and HLSL, it should be fairly straightforward for OpenGL programmers who might prefer to use the recently released OpenGL Shading Language to port the shaders, as the syntax is very similar.

The first part of the book deals with natural effects and contains chapters on rendering realistic water surfaces, water caustics, flames, and grass. Two chapters look behind the scenes of NVIDIA's Dawn demo, which shows a dancing fairy with realistically lit skin. There is also a chapter on Perlin noise (improved version) and its implementation on GPUs that was written by Ken Perlin himself.
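As a small illustration of what "improved" refers to: Perlin's improved noise is known for replacing the original cubic interpolation curve with a quintic one whose first and second derivatives vanish at 0 and 1. The sketch below only evaluates the two curves on the CPU; it is not taken from the book's chapter, and the function names are made up for this example.

```python
def fade(t: float) -> float:
    """Quintic curve 6t^5 - 15t^4 + 10t^3 associated with improved Perlin noise."""
    return t * t * t * (t * (t * 6.0 - 15.0) + 10.0)


def old_fade(t: float) -> float:
    """Cubic curve 3t^2 - 2t^3 used by the original noise, shown for comparison."""
    return t * t * (3.0 - 2.0 * t)
```

Both curves map 0 to 0 and 1 to 1; the quintic version is flatter near the endpoints, which avoids visible creases where lattice cells meet.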

The second part of the book concentrates on lighting and shadows. There are chapters from people at Pixar Animation Studios that describe some of the lighting and shadow techniques used in their computer-generated movie productions, as well as a chapter on managing visibility for per-pixel lighting. In the shadow department, the two predominant ways of rendering shadows in real-time, shadow mapping and shadow volumes, are discussed with possible optimizations and improvements. The chapter by Simon Kozlov on methods to improve perspective shadow maps presents some especially interesting new material on the topic.

The third part of the book covers materials and contains chapters on subsurface scattering, ambient occlusion, image-based lighting, and spatial BRDFs and how to use them efficiently in real-time. Part four describes various techniques for image processing, which is being used more and more frequently in computer games, mostly in the form of post-processing filters. The chapters in this part deal with various depth-of-field techniques, a number of filtering techniques using shaders, and the real-time glow effect seen in many of the newer games (especially in Tron 2.0). Not surprisingly, one of the authors of the glow chapter is John O'Rorke from Monolith Productions, a developer of the game. Contributors from Industrial Light & Magic introduce the OpenEXR file format used for storing high-dynamic-range image files (see openexr.org).

Part five, titled "Performance and Practicalities," is a collection of chapters that deal more with software engineering aspects of developing software that uses shaders. In particular, there are chapters on optimizing performance and detecting bottlenecks, using occlusion queries efficiently, integrating shaders into applications and content creation packages (in particular Cinema4D), and how to develop shaders using NVIDIA's FX Composer tool. There is also an interesting chapter on converting shaders written in the RenderMan shading language, a language for offline rendering, to real-time shaders. The chapter uses a fur shader from the movie "Stuart Little" to demonstrate this conversion. With the large increase of GPU processing power, more shaders from the offline rendering world will enter the realm of real-time graphics and it will be useful to re-use already existing resources, such as RenderMan shaders.

The final part of the book deals with a topic that has recently received a lot of attention by graphics researchers - a topic called General Purpose GPU or GPGPU programming, i.e. using the GPU for other things than rendering triangles. This part comprises chapters on performing computations, in particular fluid dynamics, on the GPU, chapters on volume rendering, and a nice chapter on generating stereograms on the GPU. As a side note, there is a website that deals exclusively with news in the GPGPU community at gpgpu.org.

The book contains many images that show the presented effects in action, and also plenty of diagrams and illustrations that explain more complicated techniques in detail. Unlike Randima Fernando's previously released book, The Cg Tutorial, which I have also reviewed in the past on Slashdot, the book and all of its illustrations and images are printed entirely in color. The large number and high quality of the illustrations is probably one of the best features of this book, and it makes even the more advanced effects easily comprehensible.

The book comes with a CD-ROM that contains sample applications for most of the chapters in the book. Some of these applications include the full source code, whereas others, such as NVIDIA's Dawn demo (also described in some of the book's chapters), are included as executables only. It must be noted that all applications run exclusively on Windows, even though some of the samples that are available in source code form and use OpenGL could probably be built to run on other operating systems as well. Furthermore, about half of the samples require what Fernando and Kilgard in The Cg Tutorial call a fourth-generation graphics card to run, in particular, an NVIDIA GeForceFX card. Note that most samples that require a GeForceFX will not run on comparable ATI hardware. This comes as no surprise since GPU Gems is predominantly an NVIDIA book. It should be noted, however, that the techniques, effects, and shaders presented in the book's text are generally applicable to programmable GPUs and are equally useful when working with graphics hardware from vendors other than NVIDIA.

This is a great book that every programmer involved in game development and/or real-time computer graphics should have on his/her shelf. For the game programmer it is critical to stay up-to-date with the latest and greatest effects available with modern GPUs in order to remain competitive when creating the gaming experience. For the graphics developer, it is interesting to see how the immense processing power of current graphics hardware can be exploited in graphics applications. This book offers insight on both of these topics and more, and I highly recommend it.

Reader akalgonov contributes a few more thoughts on the book:

"The sample programs and demos require shader support, Cg, OpenGL, or the latest version of DirectX to run. On the plus side, the majority of the companion topics included pre-compiled binaries (but not the runtime dynamic link libraries) or an AVI illustrating the subject in addition to the source code. While the CD contains over 600 MB of examples from the text, it provided only 23 of the 42 topics covered in the book. Since most of the articles provide an overview and references to a topic, additional material on the CD would have been beneficial.

I found the wide range of subjects quite interesting - and was refreshed that the topics actually seemed "ahead of the curve" in terms of hardware requirements. However in order to provide more subject depth, it seemed that the text could have been split into two volumes in order to expand the existing chapters with sufficient depth. As the material is just enough to get one started, the subject treatment may disappoint some readers seeking to apply the clever and unique techniques presented in the book directly or those hoping to use the book as an opportunity to learn some of the advanced features provided in a programming graphical processing unit."

Martin Ecker has been involved in real-time graphics programming for more than 9 years. He works as a developer of arcade games and contributes to the open source project XEngine. You can purchase GPU Gems -- Programming Techniques, Tips, and Tricks for Real-Time Graphics from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

29 of 116 comments

  1. gems? by lawngnome · · Score: 4, Funny

    no wonder high end cards are expensive!

  2. you can get it cheaper at.... by millahtime · · Score: 4, Insightful
  3. Yawn... by AKAImBatman · · Score: 2, Interesting

    Call me when NVidia and ATI open up their specs so I can finally code that real time raytracing engine I've been dreaming of. Otherwise, you're just tweaking OpenGL or DirectX until the cows come home.

    Actually, I'm a bit surprised that the big names haven't started looking at raytracing. Sure, it has a reputation for being slow, but graphics technology has grown by leaps and bounds. Combined with about 5 billion caching and approximation tricks, and the fact that ray tracing is a highly parallel operation, I'm thinking that we should already have games that are raytraced.

    1. Re:Yawn... by Anonymous Coward · · Score: 4, Interesting

      Ray tracing can be done in real time today, at around a million rays per second on a P4-class host CPU. The real bottleneck is not the CPU, but memory bandwidth. It turns out that conventional PC "random-access" memory does not like to be accessed randomly, but that's exactly what a ray tracer needs to do. Memory performance has become seriously dependent on caching over the past 10-15 years, and ray tracers are about the least cache-friendly class of algorithms in existence.

      So don't look to CPU or GPU manufacturers for help with ray tracing... you want to bitch at the short-bus-riding DRAM people instead.

    2. Re:Yawn... by bradkittenbrink · · Score: 5, Interesting

      Actually, I'm a bit surprised that the big names haven't started looking at raytracing. Sure, it has a reputation for being slow, but graphics technology has grown by leaps and bounds. Combined with about 5 billion caching and approximation tricks, and the fact that ray tracing is a highly parallel operation, I'm thinking that we should already have games that are raytraced.

      I'm not sure that's gonna happen. The fact of the matter is that current graphics hardware is fast approaching the point where raytracing will be irrelevant. The lighting algorithms that can be coded on GPUs will one day match the complexity of raytracers and you won't know the difference. The fact of the matter is that scan conversion is not actually mathematically inferior to raytracing as a rendering technique, it's just a way to quickly generate the first recursive step of the raytracer. That advantage isn't going to go away. In actuality, the end result will probably be something of a hybrid between raytracing and traditional scan conversion techniques and you won't really be able to identify it as one or the other.

    3. Re:Yawn... by gr8_phk · · Score: 2, Informative
      "I'm thinking that we should already have games that are raytraced."

      google for rtChess.

      The ray tracing engine has since seen a 40% performance boost and has added photon mapping and scales nicely with more CPUs - I just haven't written a game with it since. I don't think a GPU implementation will be much faster. nVidia seems to think they make general purpose processors now - HAH what a laugh.

    4. Re:Yawn... by Viking+Coder · · Score: 4, Informative

      Two points:

      First, Why? Most people don't even make movies that are raytraced.

      Second, they already are doing raytracing on the GPU. Purcell had one working in 2002. There was a presentation on it, in a course at SIGGRAPH 2003. The GPU is maybe a little faster than the CPU, right now, for raytracing.

      "Tweaking OpenGL" is kind of like saying "tweaking the CPU", any more. It's fairly close to a generalized stream processor. And their specs already are open enough to have figured this out. Look at GPGPU and read some more about how people are doing amazing stuff on the GPU today. No need to wait for ATI and NVidia to open up any specs - they already did. Cg and GLSlang are fully up to the task.

      And, photon mapping and similar techniques are much more sophisticated than raw raytracing.

      --
      Education is the silver bullet.
    5. Re:Yawn... by hawkstone · · Score: 3, Informative

      Open up their specs so you can write a real-time raytracer? Why can't you use Cg or HLSL like others have done? Why do you need to write to the video card directly? You have full access to the programmability of the GPU through these languages. If not, program the damned thing in their version of assembler through the DirectX or OpenGL APIs. Unless by "tweaking OpenGL or DirectX" you mean "programming the GPU", your statement seems flat-out wrong.

      Don't believe you can do it? Here's a link to some projects that do real-time raytracing, radiosity, photon mapping, and subsurface scattering, all on GPUs. These GPUs are programmable without them opening up their specs.

      (The desire for them to open up their specs is for other reasons, not because they are hiding some functionality from you.)

    6. Re:Yawn... by Speare · · Score: 2, Informative
      Ray tracing can be done in real time today, at around a million rays per second on a P4-class host CPU.

      I fail to see how one million rays per second is "real time" for most images people associate with ray-tracing. Even at one ray per pixel, you're limited to a single 1000x1000 image per second. But the value of ray tracing is the recursion: one ray hits an object, and anywhere between 2 and 200 rays result (counting for any subsequent recursions, lights and diffusions).

      Your budget: 1000000 rays per second. Take a guess at an average of 10 rays per resulting pixel including all recursion, and you're down to a paltry 100x100 pixels at 10fps. You fail on all metrics of expected quality: poor fidelity, poor resolution, and poor framerate. Even on faster CPUs, you haven't made up the difference for what users want to see.
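The budget arithmetic in this comment can be checked with a few lines. This is only a back-of-the-envelope sketch: the function name is made up, and it assumes every pixel costs the same number of rays, which real scenes do not guarantee.

```python
def frames_per_second(rays_per_second: float, width: int, height: int,
                      rays_per_pixel: float) -> float:
    """Frame rate achievable when each pixel costs a fixed number of rays."""
    rays_per_frame = width * height * rays_per_pixel
    return rays_per_second / rays_per_frame


# One million rays per second, 10 rays per pixel, 100x100 image.
small = frames_per_second(1_000_000, 100, 100, 10)

# Same budget, one primary ray per pixel, 1000x1000 image.
large = frames_per_second(1_000_000, 1000, 1000, 1)
```

Plugging in other resolutions or ray counts shows how quickly a fixed ray budget is used up by recursion.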

      --
      [ .sig file not found ]
    7. Re:Yawn... by Ann+Coulter · · Score: 2, Informative

      There are serious investigations into making cache-optimized algorithms. For example, the matrix transposition and array index bit reversal algorithms have been investigated in two papers. Bailey's 4-step and 6-step FFT algorithms are also cache efficient. The latter example shows that a complex algorithm such as an FFT can be made cache efficient with the sacrifice of only a few extra computations. Perhaps it would be prudent to use a hybrid ray-tracer/polygon renderer to section each portion of the screen into regions that will only access a particular portion of memory. In fact, texture mapping is a lot like that. But I propose that we section the geometry into sections that are localized in memory. This will require more computation in the form of checking which ray goes where but it might be possible to create a viable ray tracer/polygon renderer that produces images of ray tracing caliber. By polygon renderer I mean the renderers that we currently use in gaming.

      Some references about cache efficiency.
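To make the matrix transposition example concrete, here is a minimal sketch of the usual blocked (tiled) approach: the matrix is processed in small square tiles so that reads and writes stay within a limited region of memory at a time. It is not taken from the papers mentioned above, the block size of 32 is an arbitrary assumption, and plain Python lists will not show the cache benefit; the code only illustrates the access pattern.

```python
def transpose_blocked(matrix, block=32):
    """Transpose a list-of-lists matrix one block x block tile at a time."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    out = [[None] * rows for _ in range(cols)]
    # Walk the matrix tile by tile instead of row by row.
    for i0 in range(0, rows, block):
        for j0 in range(0, cols, block):
            # Within a tile, both source and destination indices stay close together.
            for i in range(i0, min(i0 + block, rows)):
                row = matrix[i]
                for j in range(j0, min(j0 + block, cols)):
                    out[j][i] = row[j]
    return out
```

In a language with contiguous arrays, the tile size would be tuned so that a source tile and a destination tile fit in cache together.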

    8. Re:Yawn... by Minna+Kirai · · Score: 2, Funny

      Has anyone told you that your choice of nickname is distracting? The real Ann Coulter has a very limited range of output that can be trivially replicated by Markovian string techniques. This leads most readers to skip over anything under her name, because the content is entirely predictable.

    9. Re:Yawn... by NothingToSeeHere · · Score: 2, Informative

      I'm not sure that's gonna happen. The fact of the matter is that current graphics hardware is fast approaching the point where raytracing will be irrelevant.

      Actually, AFAIK the opposite is true.
      Raytracers scale very nicely with geometric complexity: O(log n). So as the virtual environments continue to grow, raytracing should gain popularity over scan conversion. Have a look at this - that's 50 million triangles raytraced at 4-5 fps!

      Most of the current interactive raytracing is still done on parallel computers or PC clusters, but there are a lot of optimizations that can be combined to achieve interactivity even on a single CPU. And hardware architectures are underway as well...

    10. Re:Yawn... by NothingToSeeHere · · Score: 3, Interesting

      I realize 250 Mtriangles/sec aren't quite the 380 stated by ATI for their current generation GPU (Radeon 9800 Pro), but the paper I linked to is from 2001.
      The hardware raytracing site has a nice video of their FPGA-based system rendering about 187 million triangles at about 15 - 40 fps (512x384, 90MHz FPGA).

    11. Re:Yawn... by AKAImBatman · · Score: 2, Interesting

      Remember the three magic words to making high speed 3D graphics work: "Cheat like hell." I'd actually done some research into this area not so long ago (most of which I can't remember) and I found that about 95% of calculations can be stored in lookup tables, or calculated once for all rays. I don't remember all the details of my evil plan (I really need to start writing this stuff down) but I had pivoted the calculations in such a way as to make multiple, pipeline friendly passes.

      The first stage or two got most of the predictable calculations out of the way. Most of these were length calculations for all objects within the bounding area. These calculations were then reused in the next step where the rays were actually cast. Since most of the info had been precalculated at an O(objects) performance, the number of computations for the O(rays) operation was reduced. I think I left myself with a multiplication or two, plus a square root from a lookup table. Obviously the algorithm assumed that the number of spatial objects within the bounding area was significantly less than the number of rays being cast. (A fair assumption in raytracing.)
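One way to picture the split between an O(objects) pass and an O(rays) pass is ray-sphere intersection with a shared ray origin. The sketch below is an assumption about what such a scheme could look like, not the commenter's actual algorithm (which the comment says is mostly forgotten). The function names are hypothetical, and the square-root lookup table is replaced by a plain `math.sqrt` call.

```python
import math


def precompute_spheres(origin, spheres):
    """O(objects) pass: per-sphere terms that do not depend on ray direction.

    spheres is a list of (center, radius) pairs; all rays share one origin.
    """
    table = []
    for center, radius in spheres:
        oc = tuple(o - c for o, c in zip(origin, center))
        # |origin - center|^2 - radius^2 is the same for every ray.
        c_term = sum(v * v for v in oc) - radius * radius
        table.append((oc, c_term))
    return table


def nearest_hit(direction, table):
    """O(rays) pass: distance to the nearest sphere along a unit-length direction.

    Returns None on a miss. Only the near root is tested, so an origin
    inside a sphere is treated as a miss in this simplified sketch.
    """
    best = None
    for oc, c_term in table:
        b = sum(d * v for d, v in zip(direction, oc))
        disc = b * b - c_term
        if disc < 0.0:
            continue
        t = -b - math.sqrt(disc)
        if t > 0.0 and (best is None or t < best):
            best = t
    return best
```

Per ray and per object, what remains is one dot product, one multiply-subtract, and one square root, which is roughly the shape of the inner loop described above.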

      In the end, the point was to not only precalculate as much as possible, but to also avoid any unnecessary jumps in the code. By making the entire algorithm as linear as possible, I planned to make full use of a super-deep pipeline like those present in GPU and DSP chips. The performance results of such a pipeline would put a modern Pentium IV to shame. In the end, I gave up for want of a programmable graphics card. Cg was brand new, and my current NVidia GeForce 2 wasn't programmable enough for experimentation.

      Antialiasing is difficult, but not insurmountable. I don't know if my algorithm would have been fast enough to allow a 4x antialias pass, but I'm not sure it's relevant. If a raytracing standard were deployed to developers today, some games would take advantage of it. These games would then drive the development of better hardware designs for raytracing. Even if the idea flopped, the only risk would be in writing a new set of drivers for a GPU.

    12. Re:Yawn... by captaineo · · Score: 2, Interesting

      Ray tracing does seem to have some on-paper theoretical advantages, but I've always found that it's a render time killer for my scenes. While the asymptotic running time of ray tracing is good, the coefficient is so much higher that "polygon splatting" has won every time so far.

      You also need to consider that the O(log N) figure for ray tracing does not include the cost of building a ray-acceleration data structure, and it also assumes the entire scene fits in RAM. Polygon splatting is O(N), but the coefficient is much smaller, and RAM requirements are much less. Depending on how your scene is organized, and how much you can cull, you may be able to get more like O(log N) behavior from a scanline renderer.

      Also, if you think of "N" not as geometry elements but as output pixels, ray tracing is O(N^2) whereas I claim that scanline rendering scales at somewhat less than O(N^2), because you don't need to shade each and every screen-space sample. (look at how PRMan works for an example of this).
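The point about coefficients can be illustrated with a toy cost model. The constants below are invented purely for illustration; they are not measurements of any renderer, and the model ignores the cost of building the acceleration structure and the memory limits mentioned above.

```python
import math


def raytrace_cost(n: int, coeff: float = 1000.0) -> float:
    """Toy model: cost grows as coeff * log2(n) for n geometry elements."""
    return coeff * math.log2(n)


def scanline_cost(n: int, coeff: float = 1.0) -> float:
    """Toy model: cost grows as coeff * n for n geometry elements."""
    return coeff * n


def crossover(ray_coeff: float = 1000.0, scan_coeff: float = 1.0) -> int:
    """Smallest power-of-two scene size at which the log-time model is cheaper."""
    n = 2
    while raytrace_cost(n, ray_coeff) >= scanline_cost(n, scan_coeff):
        n *= 2
    return n
```

With a thousandfold coefficient penalty the logarithmic model only wins once the scene reaches tens of thousands of elements, which matches the experience that the linear method wins on smaller scenes.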

    13. Re:Yawn... by AKAImBatman · · Score: 2

      I had a go some time back. [snip] I was getting around 15 seconds/frame.

      I never said it was easy. :-) You have to carefully limit your rays as much as possible. With extremely complex scenes, you may even have to render only some of the pixels for each frame. However, things get much better when you get to the GPU. Most of today's cards have at least two pipelines. Some even have 16! Now raytracing is a highly parallel operation, and GPUs tend to have very deep pipelines with excellent floating point support. Combine the two with a properly tuned ray tracer, and the results should be fantastic!

      Oh, and I remember one more thing. I was planning to time limit a frame rendering as if it was a hard real time system. The rays would be cast in a wide pattern so that if time ran out, most of the scene would already exist. Pixels from the previous frame could plug the hole.
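The deadline idea can be sketched as follows. To keep the example deterministic it uses a ray count budget in place of a wall-clock deadline; a real version might compare `time.perf_counter()` against a frame deadline instead. The function names, the stride of 4, and the `trace` callback are all assumptions made for this sketch.

```python
def coarse_to_fine_order(width, height, stride=4):
    """Visit pixels so that a sparse, screen-wide grid is covered first."""
    order = []
    for oy in range(stride):
        for ox in range(stride):
            # Each (ox, oy) offset selects one sparse grid spanning the whole image.
            for y in range(oy, height, stride):
                for x in range(ox, width, stride):
                    order.append((x, y))
    return order


def render_frame(prev_frame, trace, width, height, budget, stride=4):
    """Trace at most `budget` pixels; untouched pixels keep last frame's value."""
    frame = [row[:] for row in prev_frame]  # start from the previous frame
    for count, (x, y) in enumerate(coarse_to_fine_order(width, height, stride)):
        if count >= budget:
            break  # out of budget: the remaining holes are plugged by old pixels
        frame[y][x] = trace(x, y)
    return frame
```

Because the first pass covers a sparse grid across the whole image, running out of budget leaves stale pixels scattered evenly rather than leaving one region of the screen entirely unrendered.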

      It certainly seems to be an active area.

      Indeed. I myself have been trying to figure this out since the days of the original Pentium. It wasn't until about 2 years ago, however, that someone actually broke the barrier and built the first realtime ray tracer. It was pixelated as all get out, but it worked. Its most interesting feature was all the curved geometry that it was capable of. The graphics were so pretty that I would have liked to see a game despite its limitations.

      I think that the future really is ray tracing. Time will tell.

    14. Re:Yawn... by obelixn13 · · Score: 2, Insightful

      Many of the 'pretty' effects that come with raytracing such as reflections and highlights are easily approximated in most games using cheap hacks, eg environment and normal mapping.

      We have become so used to these in games now, that I dare say if you did produce a real-time raytracer you would be hard-pushed to explain to the average gamer what was so cool about it.

      The bar has been raised significantly since ray-tracing was first presented in the 70s. And we've long since started looking beyond what raytracing can deliver, eg soft shadows, colour bleeding, subsurface scattering etc.

      As a lighting model, it's a very blunt instrument mathematically - most /.ers probably remember writing their own for some undergraduate assignment a long time ago ;).

    15. Re:Yawn... by Viking+Coder · · Score: 2, Insightful

      This is like saying "Programming CPUs has become exceedingly complex because they stick to a 'floating-point' model instead of declaring vectors outright."

      I'm not even asking you to do it from scratch yourself - borrow liberally from people like Purcell, and from GPGPU.org, and from BrookGPU and from other stream-processing-on-GPU sources.

      When you say that you want a new "driver," I think you should really consider using a wrapper layer like BrookGPU - or just figure out how to do things the way Purcell did them. Don't get frustrated by the complexity. I'm not frustrated by the complexity of SSE2 instructions - I just use a compiler that takes care of that for me. Or I use a wrapper layer like the Intel Performance Primitives. I think you need to do the same.

      --
      Education is the silver bullet.
  4. The Gems books are classics... by tcopeland · · Score: 4, Interesting

    ...if only to give an appreciation for how hard it is to write 3D games/engines these days. An article on A* will start off with a paragraph or two saying "of course you know A*, and you've read the three papers on A* optimizations, so here's a fourth optimization you may not have seen before".

    A lot of the articles are practical, too, if you're working in the field. When I was fiddling with some fuzzy logic stuff the articles from Game Programming Gems II were very helpful.

  5. Perlin by happyfrogcow · · Score: 3, Funny

    There is also a chapter on Perlin noise (improved version) and its implementation on GPUs that was written by Ken Perlin himself.

    Wow.. there's a person behind Perlin noise? I always thought it was a random noise generator based on the chaos found in Perl programs. Thus, the noise was generated by an http client that has "gone perlin'" -- which means to crawl the web in search of arbitrary bits of Perl.

    who knew!?

  6. Where would we be without shaders by 192939495969798999 · · Score: 3, Interesting

    I love how shaders have taken a very hard step and made it into a much easier step. I can tell you about the days before shaders, and doing something like fur was just unthinkable. Now, thanks to Pixar, et al. you can practically make a whole character from a shader, and not ever have to make anything but spheres with cylinders sticking out of them. I am actually anxious to see what happens when any shader can be a real-time shader!

    --
    stuff |
  7. Also Check Out... by th1ckasabr1ck · · Score: 3, Informative

    If you're interested in this stuff, also check out Real Time Rendering by Tomas Moller and Eric Haines. It's one of my favorites and contains an amazing amount of information.

  8. brings to mind an old question I once had. by tloh · · Score: 3, Interesting

    This post reminds me of a question that I haven't thought about since High School. I was taking programming classes right around the time I was discovering the gaming phenomenon. The dizzying pace of hardware evolution at the time (still going strong as ever many would say) prompted me to ask my computer teacher if computer video hardware was designed in such a way that when graphics were not being processed, the GPU could be used for general number crunching. In other words, if it is possible to do load balancing between the GPU and the CPU. I seem to recall reading something (possibly on /.) about someone investigating this exact thing I was wondering about so long ago. I should probably STFW, but if someone could point me in the proper direction, I would be as grateful as anyone would to have a long-irritated itch finally scratched.

    --
    Stay sentient. Don't drink bad milk.
    1. Re:brings to mind an old question I once had. by kemapa · · Score: 2, Informative

      if computer video hardware was designed in such a way that when graphics were not being processed, the GPU could be used for general number crunching. In other words, if it is possible to do load balancing between the GPU and the CPU.

      While it would probably be possible to use a GPU for general purpose number crunching, I believe it would make the GPU unable to send a signal to your monitor at the same time.

      I asked the same question back in the days of RC5-64 and I was told that it was not feasible for just that video signal reason. I was told I would not be able to use my video card while it was crunching.

      Correct me if I am wrong, though!

  9. Hmmm,. I wonder if it is very nVidia centric? by Assmasher · · Score: 2, Interesting

    lol... Why did they bother to use Cg at all? Could it be because nVidia is putting this book out? Some conflict of interest? Hehe. There are books on HLSL and OSL that are more valuable than this one.

    --
    Loading...
    1. Re:Hmmm,. I wonder if it is very nVidia centric? by ardor · · Score: 2, Insightful

      Cg is still very useful if you intend to develop cross-platform shader-driven graphics apps. Plus, it's also API-independent, which makes it the only viable alternative to rewriting all shaders for each API if you are about to write some API-independent graphics code. Remember, GLSL support is still not widespread. Heck, even the ARB FPrograms aren't supported on cards older than a radeon9500/geforceFX. If you do not want to develop half a dozen different codepaths, use Cg.

      --
      This sig does not contain any SCO code.
  10. I think you're wrong by roystgnr · · Score: 2, Insightful

    Or, at least you're wrong about modern programmable GPUs; you might have been right about the first generations of 3D cards.

    See this paper for some examples which not only use the GPU simultaneously for graphics and number crunching, but which use the graphics to give real-time output of computational fluid results.

    The only remaining problem I remember is that the bandwidth to current video cards is very asymmetric, which is fine for video games that just push a lot of data to the video card but not so good for numerical physics that also wants to ask for a lot of data back. I think at least one of the new PCI-Extended/Express/X/whatever standards is supposed to fix this.

  11. Re:gems? by NanoGator · · Score: 3, Funny

    "no wonder high end cards are expensive!"

    My gf's ex bought her a Diamond video card for their anniversary. I was warned that that little joke was only funny the first time.

    --
    "Derp de derp."
  12. B&N? Ripoff! by TastyWords · · Score: 3, Informative

    Why does everyone insist on considering Amazon and B&N to be the only online bookstores? I have news for you folks: it's almost always cheaper to go to AddAll or BookPool and get a book cheaper including shipping than Amazon and B&N.

    In the case of this book, I've taken the liberty of making your life easier by providing you with urls which will take you directly to the price list for the book. For future reference: AddAll is a shopping 'bot, looking at thirty-six stores. AddAll Results and BookPool

    Now, if you insist upon paying Amazon and B&N prices, let me know. You can PayPal the money to me and I'll order the book for you from AddAll or BookPool and have it shipped to you. (Of course, I'll keep the difference. After all, you were willing to pay the extra price!) If you're willing to waste your money, I'd rather collect the waste than Amazon or B&N.

    p.s. Remember this the next time you see someone post a message saying, "it's -this price- at Amazon!"

    p.p.s.
    Here's the listing from Froogle (just in case you haven't used it yet)