AMD Demonstrates "Teraflop In a Box"
UncleFluffy writes "AMD gave a sneak preview of their upcoming R600 GPU. The demo system was a single PC with two R600 cards running streaming computing tasks at just over 1 Teraflop. Though a prototype, this beats Intel to ubiquitous Teraflop machines by approximately 5 years." Ars has an article exploring why it's hard to program such GPUs for anything other than graphics applications.
That should be Teraflops. Flops is Floating-point operations per second, so always has an s on the end even if singular.
Ars has an article exploring why it's hard to program such GPUs for anything other than graphics applications.
No, Ars has an article blithering that it's hard to program such GPUs for anything other than graphics applications. It doesn't say anything constructive about why.
Here's a reasonably readable tutorial on doing number-crunching in a GPU. The basic concepts are that "Arrays = textures", "Kernels = shaders", and "Computing = drawing". Yes, you do number-crunching by building "textures" and running shaders on them. If your problem can be expressed as parallel multiply-accumulate operations, which covers much classic supercomputer work, there's a good chance it can be done fast on a GPU. A broad class of problems works well on a GPU, but they're generally limited to problems where the outputs of a step have little or no dependency on each other, so all the computations within a single step can run in parallel. If your problem doesn't map well to that model, don't expect much.
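To make the "kernels = shaders" idea concrete, here is a minimal CPU-side sketch in Python. It is not GPU code and not taken from the tutorial; `run_kernel` and `saxpy_kernel` are names made up for this illustration. The point is only the shape of the computation: one small function evaluated once per output element, with no output depending on any other.

```python
# CPU-side sketch of the GPU programming model described above.
# run_kernel and saxpy_kernel are illustrative names, not a real GPU API.

def run_kernel(kernel, size, *arrays):
    """Evaluate kernel(i, *arrays) for every output index i.

    On a GPU these evaluations would run in parallel; a plain loop
    stands in for that here.
    """
    return [kernel(i, *arrays) for i in range(size)]


def saxpy_kernel(i, a, x, y):
    """Multiply-accumulate for one output element: a * x[i] + y[i]."""
    return a * x[i] + y[i]


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]

# Each output element reads only the inputs, never another output,
# so the evaluations could happen in any order or all at once.
result = run_kernel(lambda i, xs, ys: saxpy_kernel(i, 2.0, xs, ys), len(x), x, y)
print(result)
```

By contrast, something like a running sum, where each output needs the previous output, does not fit this naive one-element-at-a-time form and would have to be restructured before it maps onto the same model.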
http://folding.stanford.edu/FAQ-ATI.html
It's still in beta AFAIK, but it has been in development for quite some time.
We've run several PC clusters and IBM mainframes that didn't have 1 TF of capacity. You don't want to know how much power went into them. Yes, our modern blade-based clusters are more condensed, but they're still power hogs for dual and quad core systems.
Blue Gene is considered a power-efficient cluster, and the fastest, but it still draws 7 kW per rack of 1024 CPUs. At 4.71 TF per rack, even Blue Gene pulls about 1.5 kW per teraflop.
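For anyone who wants to check the arithmetic, a quick sketch in Python using only the two figures quoted above (the variable names are just for illustration):

```python
# Figures as quoted in the comment above; nothing here is measured.
RACK_POWER_KW = 7.0      # power draw per Blue Gene rack
RACK_TERAFLOPS = 4.71    # throughput per rack

kw_per_teraflop = RACK_POWER_KW / RACK_TERAFLOPS
print(f"{kw_per_teraflop:.2f} kW per teraflop")
```

That comes out just under 1.5 kW per teraflop, which matches the rounded figure.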
Yes, it's a pair of video cards, and not a general purpose CPU, but your average user doesn't have the ability to program and use a Blue Gene style solution either. They just might get some real use out of this with a game physics engine that taps into this computing power.
This is cool.
The Internet has no garbage collection
Even if Nvidia's CUDA is as hard as the Ars Technica article suggests, I still hope AMD either makes their chips binary compatible, or makes a compiler that works for CUDA code.
From what I saw at the demo, the AMD stuff was running under Brook. As far as I've been able to make out from nVidia's documentation, CUDA is basically a derivative of Brook that has had a few syntax tweaks and some vendor-specific shiny things added to lock you in to nVidia hardware.
What would Lemmy do?
Don't forget that you need at least a 60MHz (yes, sixty megahertz) ADC and DSP pair to do what was suggested. The cost of building useful supporting electronics around a DSP capable of implementing a direct sampling receiver at 60MHz would be prohibitive in the range $ridiculous-$ludicrous.
...priceless
Maybe there aren't any low-cost DSPs available, if you aren't a hardware designer:
400 MHz DSP $10.00 http://www.analog.com/en/epProd/0,,ADSP-BF532,00.
14-bit, 65 MSPS ADC $30.00 http://www.analog.com/en/prod/0,,AD6644,00.html
Catching non-designers talking smack