Slashdot Mirror


An Open Source Compiler From CUDA To X86-Multicore

Gregory Diamos writes "An open source project, Ocelot, has recently released a just-in-time compiler for CUDA, allowing the same programs to be run on NVIDIA GPUs or x86 CPUs and providing an alternative to OpenCL. A description of the compiler was recently posted on the NVIDIA forums. The compiler works by translating GPU instructions to LLVM and then generating native code for any LLVM target. It has been validated against over 100 CUDA applications. All of the code is available under the New BSD license."

11 of 71 comments

  1. Alternative? by Guspaz · · Score: 4, Insightful

    This isn't an alternative to CUDA; it lets CUDA code run on x86, but still doesn't do anything for AMD graphics cards. In other words, your choices as a developer are to use OpenCL and have your code run everywhere (AMD, nVidia, x86 slowly), or use CUDA and have your code run on nVidia or x86 slowly.

    What possible reason could you have to want to be locked into one GPU vendor?

    1. Re:Alternative? by Guspaz · · Score: 3, Insightful

      Progressively more and more.

      Example: Go to "ati.com" and you get redirected to the regular amd.com front page. Go to desktop graphics products and you get a page titled "AMD Graphics for Desktop PCs" inviting you to shop for "AMD Desktop Graphics Cards".

      The actual cards themselves have as product name "ATI Radeon", but describing an "ATI Radeon" as an "AMD graphics card" is accurate.

    2. Re:Alternative? by Sloppy · · Score: 3, Informative

      He means CUDA was here first, and it does (did) lock you into Nvidia. So if you jumped on the bandwagon early, your code is Nvidia-only. If you waited for a standard (OpenCL), or ported your app, then you're cross-platform.

      --
      As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
    3. Re:Alternative? by TheRaven64 · · Score: 4, Informative

      it lets CUDA code run on x86, but still doesn't do anything for AMD graphics cards

      Actually, it does. It lets CUDA code run on any processor that has an LLVM back end. The open source Radeon drivers have an experimental LLVM back end and use LLVM for optimising shader code.

      --
      I am TheRaven on Soylent News
    4. Re:Alternative? by CDeity · · Score: 4, Informative

      The greatest challenges lie in accommodating arbitrary control flow among threads within a cooperative thread array. NVIDIA GPUs are SIMD multiprocessors, but they include a thread activity stack that enables serialization of threads when they reach diverging branches. Without hardware support, this kind of thing becomes difficult on SIMD processors, which is why Ocelot doesn't include support for SSE yet. It is also one of the obstacles to supporting AMD/ATI IL at the moment, though solutions are in order.
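To make the divergence problem concrete, here is a toy sketch in Python of how a stack of activity masks can serialize the two sides of a branch across the lanes of one SIMD group. It is only an illustration: the function name `run_warp_branch`, the mask layout, and the example branch are all assumptions made for this sketch, not NVIDIA's hardware design or Ocelot's code.

```python
def run_warp_branch(values):
    """Toy model: one SIMD group executes
         if v % 2 == 0: v = v // 2
         else:          v = 3 * v + 1
    Lanes that disagree on the branch are run one path at a time."""
    n = len(values)
    out = list(values)
    taken = [v % 2 == 0 for v in values]

    # Hypothetical activity stack: each entry is (path, per-lane mask).
    stack = []
    else_mask = [not t for t in taken]
    then_mask = list(taken)
    if any(else_mask):
        stack.append(("else", else_mask))
    if any(then_mask):
        stack.append(("then", then_mask))

    passes = 0
    while stack:
        path, mask = stack.pop()
        passes += 1  # one pass over the whole group per distinct path
        for lane in range(n):
            if not mask[lane]:
                continue  # masked-off lanes sit idle during this pass
            if path == "then":
                out[lane] = out[lane] // 2
            else:
                out[lane] = 3 * out[lane] + 1
    # Both paths have been drained, so all lanes reconverge here.
    return out, passes
```

When every lane agrees on the branch, only one pass is needed; when lanes disagree, both paths are executed back to back with part of the group idle each time. A plain SIMD unit such as SSE has no such stack, so the compiler would have to generate the masking itself.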

      Translation from PTX to LLVM to multicore x86 does not necessarily throw away information concerning the PTX thread hierarchy initially. The first step is to express a PTX kernel using LLVM instructions and intrinsic function calls. This phase is [theoretically] invertible and no information concerning correctness or parallelism is lost.

      To get to multicore from here, a second phase of transformations inserts loops around blocks of code within the kernel to implement fine-grain multithreading. This is the part that isn't necessarily invertible or easy to translate back to GPU architectures and is what is referenced in the note you are citing.
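The loop-insertion step can be pictured with a small Python sketch. Assume a hypothetical kernel in which each thread writes `shared[tid]`, waits at a barrier, and then reads `shared[n - 1 - tid]`. The function names below are made up for illustration and this is not Ocelot's actual transformation, only the general idea of wrapping each barrier-delimited region in a loop over thread ids.

```python
def reverse_block_serialized(data):
    """Run the hypothetical kernel on one CPU thread by looping over
    thread ids once per region, with the barrier between the loops."""
    n = len(data)
    shared = [None] * n
    out = [None] * n
    # Region before the barrier, executed for every logical thread.
    for tid in range(n):
        shared[tid] = data[tid]
    # The barrier is satisfied here: all threads have finished region 1.
    # Region after the barrier, executed for every logical thread.
    for tid in range(n):
        out[tid] = shared[n - 1 - tid]
    return out


def reverse_block_naive(data):
    """Incorrect variant: runs each thread to completion, ignoring the
    barrier, so early threads read slots that are not written yet."""
    n = len(data)
    shared = [None] * n
    out = [None] * n
    for tid in range(n):
        shared[tid] = data[tid]
        out[tid] = shared[n - 1 - tid]
    return out
```

The naive variant shows why the kernel has to be split at barriers. It also hints at why the result is hard to invert: once the thread loops are in place, the original per-thread structure is no longer explicit in the code.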

      Disclosure: I'm one of the core contributors to the Ocelot project.

  2. Wait wut? by Icegryphon · · Score: 3, Insightful

    Why would you go from CUDA(Fast Floating-points) to x86(slower Floating-points)?
    Is there support yet for double-precision floating point on Nvidia cards?
    This makes as much sense as a Wookiee on the planet Endor.
    Unless the point is portability, but then why write it in CUDA to begin with?

    1. Re:Wait wut? by tepples · · Score: 3, Insightful

      Why would you go from CUDA(Fast Floating-points) to x86(slower Floating-points)?

      For running legacy apps that were developed between the release of CUDA and the release of OpenCL. There aren't many, I'd guess.

  3. Doesn't sound like a compiler by gnasher719 · · Score: 3, Interesting

    Seems to be just a front-end for LLVM. And if it is just a front-end for LLVM, then why doesn't it support ATI graphics cards? That would actually make it useful; there is no need for a second CUDA compiler for NVidia cards.

    1. Re:Doesn't sound like a compiler by MostAwesomeDude · · Score: 3, Informative

      There is no LLVM backend for AMD/ATI cards. Of the few of us that actually understand ATI hardware, most of us are working on other things besides GPGPU. Sorry.

      --
      ~ C.
  4. OpenCL not a magic bullet by smcdow · · Score: 3, Insightful

    A bit off topic, but since I'm seeing posts about OpenCL and portability...

    OpenCL will indeed get you portability between processors; however, OpenCL does not make any guarantees about how well that portable code will run. In the end, to get optimum performance you still have to code to the particular architecture on which your code is going to run. For example, performance on Nvidia chips is extremely sensitive to memory access patterns. You could write OpenCL code that runs very well on Nvidia chips, but runs poorly on a different architecture.
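A rough way to see the sensitivity to memory access patterns is to count how many memory segments one group of threads touches for a given access stride. The Python sketch below is a simplified model: the group size of 32 threads, the 4-byte elements, and the 128-byte segment size are assumptions chosen for illustration, and real hardware rules vary by chip generation and vendor.

```python
def segments_touched(stride, warp_size=32, elem_bytes=4, segment_bytes=128):
    """Count distinct memory segments read when thread `tid` loads
    element `tid * stride`. Fewer segments means fewer transactions
    in this simplified model."""
    addresses = [tid * stride * elem_bytes for tid in range(warp_size)]
    return len({addr // segment_bytes for addr in addresses})
```

Under these assumptions, unit stride touches a single segment, while a stride of 32 elements puts every thread in its own segment. The same source code can therefore behave very differently depending on how an architecture groups threads and memory, which is the point being made above.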

    Not saying that portability isn't a good thing, but a lot of people seem to be thinking that OpenCL will solve all your portability problems. It won't. It will only let code run on multiple architectures. You'll still have to more or less hand-optimize to the architecture.

    --
    In the course of every project, it will become necessary to shoot the scientists and begin production.
  5. Why? by Gregory Diamos · · Score: 3, Informative

    So there seem to be several questions as to why people would want to use CUDA when an open standard exists for the same thing (OpenCL).

    Well, honestly, the reason why I wrote this was because when I started, OpenCL did not exist.

    I have heard the following reasons why some people prefer CUDA over OpenCL:

    • The toolchains for OpenCL are still immature. They are getting better, but are not quite as bug-free and high performance as CUDA at this point.
    • CUDA has more desirable features. For example, CUDA supports many C++ features such as templates and classes in device code that are not part of the OpenCL specification.

    Additionally, I would like to see a programming model like CUDA or OpenCL replace the most widespread models in industry (threads, OpenMP, MPI, etc.). CUDA and OpenCL are each examples of Bulk Synchronous Parallel models, which are explicitly designed with the idea that communication latency and core count will increase over time. Although I think that it is a long shot, I would like to see more applications written in these languages so there is a migration path for developers who do not want to write specialized applications for GPUs, but can instead write an application for a CPU that can take advantage of future CPUs with multiple cores, or GPUs with a large degree of fine-grained parallelism.
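For readers unfamiliar with the Bulk Synchronous Parallel idea, the Python sketch below runs a sum as a series of supersteps: every worker computes from the values visible at the start of the step, and results only become visible to others after the barrier. The function name `bsp_sum` and the tree-reduction pattern are assumptions picked for illustration, and the input is assumed to be non-empty.

```python
def bsp_sum(values):
    """Sum a non-empty list in BSP style. Returns (total, supersteps)."""
    current = list(values)
    supersteps = 0
    stride = 1
    while stride < len(current):
        # Workers read only `current`; writes go to `nxt` and become
        # visible to everyone after the barrier.
        nxt = list(current)
        for worker in range(0, len(current), 2 * stride):
            partner = worker + stride
            if partner < len(current):
                nxt[worker] = current[worker] + current[partner]
        current = nxt  # barrier: publish this superstep's results
        stride *= 2
        supersteps += 1
    return current[0], supersteps
```

The workers inside one superstep are independent of each other, so the inner loop could be spread over two cores or two thousand without changing the program. The number of barriers grows only with the logarithm of the input size in this example.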

    Most of the codebase for Ocelot could be re-used for OpenCL. The intermediate representation for each language is very similar, with the main differences being in the runtime.

    Please try to tear down these arguments, it really does help.