Ask Slashdot: GPU of Choice For OpenCL On Linux?
Bram Stolk writes: "So, I am running GNU/Linux on a modern Haswell CPU, with an old Radeon HD5xxx from 2009. I'm pretty happy with the open source Gallium driver for 3D acceleration. But now I want to do some GPGPU development using OpenCL on this box, and the old GPU will no longer cut it. What do my fellow technophiles from Slashdot recommend as a replacement GPU? Go NVIDIA, go AMD, or just use the integrated Intel GPU instead? Bonus points for open source solutions. Performance is not really important, but OpenCL driver maturity is."
They're too busy with CUDA to give two shits about decent OpenCL performance.
That's why the HD Radeon series was the mining GPU of choice for Bitcoin.
Intel is your best bet for a mature, open source, OpenCL-compatible GPU, if performance doesn't matter, that is.
The future of GPUs is open standards. GPUs won't take off until all major vendors support the latest (OpenCL 2.0) standards.
Here is the list of conformant products:
https://www.khronos.org/conformance/adopters/conformant-products#opencl
I would go with an nVidia consumer card. They may be more expensive than the AMD ones. On the other hand, they offer CUDA and OpenCL support and are much faster.
For the newer ones (GTX 9xx), though, you will need to wait a little until the driver shipped with CUDA actually supports the cards.
I work in a lab that does CT image reconstruction (all GPGPU computing) as part of what we do. I've been the one to program it using OpenCL under Ubuntu (I insisted on using Linux; Windows was too infuriating), so I'll share my experience.
I have two Nvidia 780 GPUs in my machine (an Alienware Aurora R4), and getting everything running under Linux was actually much smoother than my initial attempt to get OpenCL running under Windows 8, so I don't think you'll have too much trouble there. I use the binary blob from Nvidia and it has been pretty stable, with the occasional driver crash for whatever reason (maybe once in a six-month period, but things just restart and it's fine; it's usually my fault for writing shitty code). I personally really like this setup, and the only thing that could make it better would be more GPUs and a great, solid open source driver.
I would say that if you're going to use Nvidia GPUs for GPGPU computing, consider learning CUDA. Syntactically it's very similar to OpenCL, but the tools you have access to for debugging, profiling, and increasing performance, as well as the overall stability of the programs, seem to be much, much better. I suppose we should expect that, though, from a proprietary language, on proprietary hardware, using a proprietary driver. I've heard that you can get better performance (read: speedups) using CUDA over OpenCL, but I've never tested that for myself or seen proof firsthand.
I've learned OpenCL, and I like its portability and openness, but I look at some of the stuff my friends can do with CUDA and I can't say that I'm not envious. Mainly what I'm referring to is Nvidia's Nsight program, which can do OpenCL if you're willing to pay for the "pro" edition. Also, Nvidia GPUs are scalar-based, so if much of your speedup would come from using OpenCL's vector structures, that won't happen on Nvidia GPUs the same way that it would on AMD. Programming might be more convenient, but performance will stay the same.
Hope that helps. Feel free to ask more questions.
The integrated graphics in your CPU will have modest performance but a stable, open source OpenCL driver. If it proves too slow for your particular project, you will be able to compare benchmarks and get the cheapest card that is fast enough to, say, run your animation at 60fps. If you are planning to distribute your code, you will need several GPUs to test with anyway.
But now I have something to say on the matter: my Nvidia card is no longer supported (that 96.xx line of drivers). So, no proprietary driver for me.
You mean no new proprietary driver for you. The old one still exists. It didn't magically stop working because a new driver came out.
I don't play games and my machine is ok to even play video. I don't need to sacrifice anything except the things I already don't use (like desktop indexing).
You can install an indexed search tool on older Ubuntu versions. But regardless, if newer Ubuntu doesn't support older nVidia drivers, how is that nVidia's fault?
Wait, I just saw this:
And no, I can't go for a more recent distribution, like Ubuntu, because they decided my CPU is too old for them.
CPU without PAE? How quaint. Maybe you should join this millennium. But there are some non-PAE kernels for Ubuntu floating around out there, sometimes.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
I recommend the ocl-icd package to make it easy to switch OpenCL implementations on the fly. Also, download the Intel and AMD OpenCL runtimes which support CPU-based computation using SIMD instructions and multicore parallelism, and try them out as well as GPUs. You can then micro-benchmark your own algorithms on different vendor runtimes quite easily. I have found that the Intel OpenCL does a very decent job of auto-vectorization, so my scalar-based OpenCL code ran almost as fast as my hand-vectorized version that uses OpenCL vector intrinsics.
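To see what the ICD loader is actually exposing before you start benchmarking, something like the following might help. It is only a sketch: it assumes PyOpenCL is installed and at least one vendor runtime is registered, and the helper name list_opencl_devices is made up for this example. If PyOpenCL or a runtime is missing it just returns an empty list.

```python
def list_opencl_devices():
    """Return (platform, device, kind) tuples for every OpenCL device visible.

    Hypothetical helper: returns [] when PyOpenCL is not installed or when
    no OpenCL platform is registered with the ICD loader.
    """
    try:
        import pyopencl as cl
    except ImportError:
        return []  # PyOpenCL itself is missing

    found = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                # device.type is a bitfield, so test individual bits
                if device.type & cl.device_type.GPU:
                    kind = "GPU"
                elif device.type & cl.device_type.CPU:
                    kind = "CPU"
                else:
                    kind = "other"
                found.append((platform.name, device.name, kind))
    except cl.Error:
        return []  # e.g. no platform registered
    return found


if __name__ == "__main__":
    for platform_name, device_name, kind in list_opencl_devices():
        print(f"{kind:5s} {device_name}  [{platform_name}]")
```

With several runtimes installed side by side, each one should show up as its own platform, which is what makes it easy to run the same kernel on each and compare.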
In my case, my image processing algorithms are more memory-bound and a recent 2.4 GHz mobile Intel quad-core outperforms my desktop NVIDIA GTX 760 on the same OpenCL code. Both of these trounce my c. 2010 Xeon E5530. I had no idea how much Intel SIMD performance has improved until I tried this and saw for myself. I think a big advantage is that the CPU doesn't have to transfer the large N-dimensional arrays back and forth over the PCIe bus, but can just get to computing immediately. This may not hold true for some algorithms that crank much longer on a small input or output array.
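A back-of-the-envelope way to look at the PCIe point: the time to offload is roughly the bytes moved divided by bus bandwidth, plus the kernel time. The sketch below uses entirely made-up numbers (array size, bus bandwidth, kernel times) just to show the shape of the trade-off; measure your own before trusting any of it.

```python
def offload_seconds(bytes_moved, kernel_seconds, bus_bytes_per_second):
    """Rough wall time for a discrete GPU: bus transfer plus kernel run.

    Ignores latency, overlap of copy and compute, and pinned-memory effects.
    """
    return bytes_moved / bus_bytes_per_second + kernel_seconds


# All figures below are hypothetical, chosen only for illustration.
BUS = 8e9                    # assumed effective PCIe bandwidth, bytes/s
ARRAY = 512 * 2**20          # a 512 MiB input array, same-size output

# Memory-bound case: short kernel, big arrays copied in and out.
gpu_short = offload_seconds(2 * ARRAY, kernel_seconds=0.02, bus_bytes_per_second=BUS)
cpu_short = 0.10             # assumed CPU time, no bus transfer needed

# Compute-bound case: long kernel, tiny arrays.
gpu_long = offload_seconds(2 * 2**20, kernel_seconds=2.0, bus_bytes_per_second=BUS)
cpu_long = 10.0              # assumed CPU time

if __name__ == "__main__":
    print("memory-bound:  GPU wins" if gpu_short < cpu_short else "memory-bound:  CPU wins")
    print("compute-bound: GPU wins" if gpu_long < cpu_long else "compute-bound: CPU wins")
```

With these assumed numbers the transfer alone costs more than the whole CPU run in the first case, while in the second the transfer is negligible next to the kernel time.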
It is also important to realize that OpenCL parallelism won't save you from poor algorithm choices. You need to be open to experimentation and reevaluating your assumptions as you explore new problems. I work with Python, Numpy, and PyOpenCL so that I can focus on the math first, and then selectively replace the underlying algorithms with different implementations as needed. Being able to work at a high level of abstraction makes it so much easier to explore the math you want to perform, without writing a lot of low-level code that gets thrown away.
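One way to keep that kind of workflow honest is to hold on to a slow, obviously-correct reference and check every faster replacement against it. The sketch below is only an illustration of the idea: the 3-point blur, the BACKENDS registry, and check_backends are all invented for this example, and the NumPy version is registered only if NumPy happens to be installed. A PyOpenCL kernel could presumably be registered the same way.

```python
import math


def blur3_reference(signal):
    """3-point moving average with clamped edges, written for clarity."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append((left + signal[i] + right) / 3.0)
    return out


# Hypothetical registry: name -> implementation with the same signature.
BACKENDS = {"reference": blur3_reference}

try:
    import numpy as np

    def blur3_numpy(signal):
        """Same computation using array slices instead of a Python loop."""
        a = np.asarray(signal, dtype=np.float64)
        padded = np.pad(a, 1, mode="edge")  # repeat the edge samples
        return ((padded[:-2] + padded[1:-1] + padded[2:]) / 3.0).tolist()

    BACKENDS["numpy"] = blur3_numpy
except ImportError:
    pass  # NumPy not available; only the reference backend is registered


def check_backends(signal, tol=1e-9):
    """Compare every registered backend against the reference output."""
    expected = BACKENDS["reference"](signal)
    report = {}
    for name, impl in BACKENDS.items():
        got = impl(signal)
        report[name] = len(got) == len(expected) and all(
            math.isclose(g, e, rel_tol=tol, abs_tol=tol)
            for g, e in zip(got, expected)
        )
    return report


if __name__ == "__main__":
    print(check_backends([1.0, 2.0, 4.0, 8.0]))
```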
If you want to write modern OpenCL code and run it on a GPU, AMD is your only option.
In terms of performance, NVIDIA is actually the best. But they've been stuck at OpenCL 1.1 for years, while everyone else has long since moved to newer versions. Until (if) they add OpenCL 2.0 support, they'll be a bad choice.
Intel doesn't support running OpenCL on the GPU under Linux. See the chart at the end of https://software.intel.com/en-.... You can still write OpenCL programs, but you'll just be running them on your CPU.
"I'm too busy to research this and form an educated opinion, but I do have time to tell everyone my uninformed opinion."
Have a look at this talk, namely 8 min 30 seconds into the talk:
https://www.youtube.com/watch?...
The talk was given at the recent Linux Conf Australia (in New Zealand). It shows that AMD supports OpenCL 2.0, while Nvidia only support version 1.1 (released in 2010). I spoke to the speaker after his talk and he said Nvidia are basically dragging their heels with regard to supporting more recent versions. Nvidia also request that unconventional features be put into the spec, and then never implement those features. Obviously Nvidia are doing well with their own CUDA language and seem to be trying to create a walled garden. It sounds like if you are going for openness and not for speed, then you could look at Intel or AMD (both support version 2.0).
As for a particular model, if double-precision performance is important, go with a 7970 or 280x on the AMD side (or a 7990 if you need dual-GPU in one slot). They do double-precision at 1/4 of their single-precision rate, which is the best you're going to find at consumer-grade pricing. Even more modern or more powerful cards have backed off on double-precision: something like a 290x has almost 50% more shader ALUs than a 280x and will perform better at single-precision workloads, but it only does double-precision at 1/8 rate, so it's actually slower in purely double-precision workloads. All of nVidia's consumer cards are in the ballpark of 1/8 to 1/16 rate too, except the GTX Titan Black, which does 1/3 rate but at $1500 is nearly Quadro pricing anyway.
If money is no object, an AMD FirePro W9100 is the workstation version of the 290x and does double-precision at 1/2 of its single-precision rate. It is the current best of both worlds and will probably remain so for the remainder of the year, but it's a 3-grand price tag or so.
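To make the rate arithmetic concrete, here is a rough sketch using only the ratios quoted above. It assumes equal clocks and treats "almost 50% more ALUs" as exactly 1.5x, so the outputs are relative figures, not real GFLOPS; the CARDS table and function name are made up for this example.

```python
from fractions import Fraction

# (relative single-precision throughput, double-precision rate)
# Single precision is normalized so the 280x = 1, assuming equal clocks.
CARDS = {
    "280x": (Fraction(1), Fraction(1, 4)),
    "290x": (Fraction(3, 2), Fraction(1, 8)),   # "almost 50% more" ALUs
    "FirePro (workstation 290x)": (Fraction(3, 2), Fraction(1, 2)),
}


def relative_double_precision(cards):
    """Relative double-precision throughput = single-precision * DP rate."""
    return {name: sp * rate for name, (sp, rate) in cards.items()}


if __name__ == "__main__":
    for name, dp in relative_double_precision(CARDS).items():
        print(f"{name:28s} DP = {float(dp):.4f} x (280x single-precision)")
```

Under those assumptions the 290x comes out at 3/16 against 1/4 for the 280x, which is the "faster card, slower doubles" effect described above, and the workstation part lands at 3/4.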