Ask Slashdot: GPU of Choice For OpenCL On Linux?

Posted by timothy on Sunday January 25, 2015 @03:38AM from the discriminating-tastes dept.

Bram Stolk writes: So, I am running GNU/Linux on a modern Haswell CPU with an old Radeon HD 5xxx from 2009. I'm pretty happy with the open source Gallium driver for 3D acceleration. But now I want to do some GPGPU development using OpenCL on this box, and the old GPU will no longer cut it. What do my fellow technophiles from Slashdot recommend as a replacement GPU? Go NVIDIA, go AMD, or just use the integrated Intel GPU instead? Bonus points for open source solutions. Performance is not really important, but OpenCL driver maturity is.

You're not going very far with nVidia by Anonymous Coward · 2015-01-25 03:51 · Score: 4, Interesting

They're too busy with CUDA to give two shits about decent OpenCL performance.
That's why the Radeon HD series was the mining GPU of choice for Bitcoin.
1. Re: You're not going very far with nVidia by Anonymous Coward · 2015-01-25 03:58 · Score: 2, Interesting
  
  The OpenCL performance on NVIDIA 900-series GPUs is actually pretty good. They have finally come around.
nVidia Consumer Card by Anonymous Coward · 2015-01-25 03:58 · Score: 2, Interesting

I would go with an nVidia consumer card. They may be more expensive than the AMD ones. On the other hand, they offer CUDA and OpenCL support and are much faster.
For the newer ones (GTX 9xx), you will need to wait a little until the driver shipped with CUDA actually supports the cards, though.
Don't ignore CPU-based OpenCL, consider Numpy by Anonymous Coward · 2015-01-25 06:44 · Score: 3, Interesting

I recommend the ocl-icd package to make it easy to switch OpenCL implementations on the fly. Also, download the Intel and AMD OpenCL runtimes, which support CPU-based computation using SIMD instructions and multicore parallelism, and try them out alongside the GPUs. You can then micro-benchmark your own algorithms on different vendor runtimes quite easily. I have found that the Intel OpenCL runtime does a very decent job of auto-vectorization, so my scalar OpenCL code ran almost as fast as my hand-vectorized version that uses OpenCL vector intrinsics.
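To see which runtimes an ICD loader such as ocl-icd actually exposes, something like the following may help. It is only a sketch and assumes PyOpenCL is installed; the helper name `list_opencl_devices` is made up for illustration. With several vendor runtimes registered, each one should show up as a separate platform.

```python
# Sketch: list every OpenCL platform and device visible through the ICD loader.
# Assumption: PyOpenCL is installed. If it is not, or if no platform is
# registered, the helper simply returns an empty list.

def list_opencl_devices():
    """Return (platform name, device name, device type) tuples, or []."""
    try:
        import pyopencl as cl
    except ImportError:
        return []  # PyOpenCL (or the OpenCL loader library) is not available

    found = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                # device.type is a bitfield; to_string turns it into CPU/GPU/...
                kind = cl.device_type.to_string(device.type)
                found.append((platform.name, device.name, kind))
    except cl.Error:
        # Raised, for example, when the loader finds no vendor runtime at all.
        return found
    return found


if __name__ == "__main__":
    for platform_name, device_name, kind in list_opencl_devices():
        print(f"{platform_name}: {device_name} [{kind}]")
```

Running the same kernel once per listed device is then a reasonable starting point for the micro-benchmarks mentioned above.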
In my case, my image processing algorithms are more memory-bound and a recent 2.4 GHz mobile Intel quad-core outperforms my desktop NVIDIA GTX 760 on the same OpenCL code. Both of these trounce my c. 2010 Xeon E5530. I had no idea how much Intel SIMD performance has improved until I tried this and saw for myself. I think a big advantage is that the CPU doesn't have to transfer the large N-dimensional arrays back and forth over the PCIe bus, but can just get to computing immediately. This may not hold true for some algorithms that crank much longer on a small input or output array.
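A rough way to reason about the transfer cost is a back-of-the-envelope model like the one below. It is a simplification, not a measurement: it ignores latency and any overlap of copies with compute, and every number in the usage part (bus bandwidth, kernel times, array size) is hypothetical and should be replaced with your own timings.

```python
# Sketch: does offloading to a discrete GPU pay off once bus transfers count?
# All figures used below are hypothetical placeholders, not benchmark results.

def offload_seconds(bytes_moved, bus_bytes_per_sec, device_compute_sec):
    """Estimated wall time: move the data over the bus, then compute."""
    return bytes_moved / bus_bytes_per_sec + device_compute_sec


def offload_wins(bytes_moved, bus_bytes_per_sec, device_compute_sec,
                 host_compute_sec):
    """True if transfer plus device compute beats computing on the host."""
    total = offload_seconds(bytes_moved, bus_bytes_per_sec, device_compute_sec)
    return total < host_compute_sec


if __name__ == "__main__":
    moved = 128 * 2**20   # 64 MiB in plus 64 MiB out (assumed array sizes)
    bus = 8e9             # assumed effective bus bandwidth in bytes/second

    # Memory-bound case: a short kernel, so the copies dominate.
    print(offload_wins(moved, bus, device_compute_sec=0.005,
                       host_compute_sec=0.020))

    # Compute-bound case: the kernel cranks for a long time on the same data.
    print(offload_wins(moved, bus, device_compute_sec=0.5,
                       host_compute_sec=4.0))
```

With these made-up inputs the first case favours the CPU and the second favours the GPU, which matches the caveat about algorithms that crank longer on a small array.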
It is also important to realize that OpenCL parallelism won't save you from poor algorithm choices. You need to be open to experimentation and to reevaluating your assumptions as you explore new problems. I work with Python, NumPy, and PyOpenCL so that I can focus on the math first, and then selectively replace the underlying algorithms with different implementations as needed. Being able to work at a high level of abstraction makes it so much easier to explore the math you want to perform, without writing a lot of low-level code that gets thrown away.
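One way to picture the "math first, swap implementations later" workflow is sketched below. The 3-point box blur and all function names are invented for illustration. The idea is to keep a slow, obviously correct reference and check every faster candidate against it; a PyOpenCL kernel could be plugged in as one more candidate in the same way.

```python
# Sketch: keep a simple reference implementation and verify faster backends
# against it. The blur and the function names are illustrative assumptions.

def box_blur_reference(signal):
    """Plain-Python 3-point moving average; endpoints are left unchanged."""
    out = list(signal)
    for i in range(1, len(signal) - 1):
        out[i] = (signal[i - 1] + signal[i] + signal[i + 1]) / 3.0
    return out


def box_blur_numpy(signal):
    """Same math expressed with NumPy slices instead of a Python loop."""
    import numpy as np
    x = np.asarray(signal, dtype=np.float64)
    out = x.copy()
    if x.size >= 3:
        out[1:-1] = (x[:-2] + x[1:-1] + x[2:]) / 3.0
    return out.tolist()


def pick_backend():
    """Use the NumPy version when NumPy is importable, else the reference."""
    try:
        import numpy  # noqa: F401
    except ImportError:
        return box_blur_reference
    return box_blur_numpy


def agrees_with_reference(candidate, signal, tol=1e-12):
    """Check a candidate backend element by element against the reference."""
    expected = box_blur_reference(signal)
    actual = candidate(signal)
    return len(actual) == len(expected) and all(
        abs(a - e) <= tol for a, e in zip(actual, expected)
    )


if __name__ == "__main__":
    blur = pick_backend()
    data = [1.0, 2.0, 4.0, 8.0, 16.0]
    print(blur.__name__, agrees_with_reference(blur, data))
```

Only once a candidate agrees with the reference is it worth timing, so the low-level code that gets written is code that has a chance of staying.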