Slashdot Mirror


An Open Source Compiler From CUDA To X86-Multicore

Gregory Diamos writes "An open source project, Ocelot, has recently released a just-in-time compiler for CUDA, allowing the same programs to be run on NVIDIA GPUs or x86 CPUs and providing an alternative to OpenCL. A description of the compiler was recently posted on the NVIDIA forums. The compiler works by translating GPU instructions to LLVM and then generating native code for any LLVM target. It has been validated against over 100 CUDA applications. All of the code is available under the New BSD license."

71 comments

  1. Alternative? by Guspaz · · Score: 4, Insightful

    This isn't an alternative to CUDA; it lets CUDA code run on x86, but still doesn't do anything for AMD graphics cards. In other words, your choices as a developer are to use OpenCL and have your code run everywhere (AMD, nVidia, x86 slowly), or use CUDA and have your code run on nVidia or x86 slowly.

    What possible reason could you have to want to be locked into one GPU vendor?

    1. Re:Alternative? by Anonymous Coward · · Score: 2, Insightful

      I think Cuda was first out there, later on OpenCL occurred. And I see it as a bad thing really, since that binds you to using an Nvidia card. I hope it won't become popular; I don't want to be stuck with Nvidia. (OT: when AMD has Linux drivers open sourced)

    2. Re:Alternative? by Yvan256 · · Score: 2, Interesting

      When did AMD drop the ATI brand?

    3. Re:Alternative? by raftpeople · · Score: 2, Informative

      What possible reason could you have to want to be locked into one GPU vendor?

      The reason is that today CUDA has a headstart and is more mature. Eventually things will probably shift to OpenCL but that takes time and people don't want to sacrifice features today.

    4. Re:Alternative? by Pinky's Brain · · Score: 2, Informative

      I've seen feature requests suggesting they are considering it, but at the moment too much information is lost in the PTX->LLVM step to be able to generate CAL or OpenCL.

    5. Re:Alternative? by Pinky's Brain · · Score: 2, Informative
    6. Re:Alternative? by mrsteveman1 · · Score: 2, Funny

      Wednesday December 23, @02:11PM

    7. Re:Alternative? by Icegryphon · · Score: 1

      What possible reason could you have to want to be locked into one GPU vendor?

      Hardware, libraries, and toolkit.
      CUDA was usable way before anything else.
      At the time CUDA came out, AMD was using CTM,
      which is absolutely painful to use.

    8. Re:Alternative? by Guspaz · · Score: 3, Insightful

      Progressively more and more.

      Example: Go to "ati.com" and you get redirected to the regular amd.com front page. Go to desktop graphics products and you get a page titled "AMD Graphics for Desktop PCs" inviting you to shop for "AMD Desktop Graphics Cards".

      The actual cards themselves have as product name "ATI Radeon", but describing an "ATI Radeon" as an "AMD graphics card" is accurate.

    9. Re:Alternative? by beelsebob · · Score: 1, Informative

      Pardon? OpenCL does not in any way bind you to an nVidia card, it was a standard created by Apple (not nVidia) and pushed to Khronos to manage as an open standard (also not nVidia). ATI have just announced drivers for their cards for OpenCL.

    10. Re:Alternative? by Sloppy · · Score: 3, Informative

      He means CUDA was here first, and it does (did) lock you into Nvidia. So if you jumped on the bandwagon early, your code is Nvidia-only. If you waited for a standard (OpenCL), or ported your app, then you're cross-platform.

      --
      As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
    11. Re:Alternative? by TeXMaster · · Score: 2, Informative

      I think Cuda was first out there, later on OpenCL occurred.

      Yes and no. CUDA and CTM/Brook+/FireStream came to life more or less at the same time, when NVIDIA and ATI realized that GPGPU (General Purpose computing on the GPU) was getting traction in the scientific computing world (originally implemented using OpenGL and shaders).

      OpenCL was essentially an effort (by Apple first and foremost, although obviously with cooperation from both NVIDIA and ATI) to get a standardized interface to SIMD multicore programming. It's actually quite close to low-level CUDA programming, although I'm not sure how close it is to the ATI solution (I've tried going through the ATI docs a couple of times, but their stuff is absolutely abysmal when compared to the NVIDIA docs and SDKs, sadly).

      --
      "I'm never quite so stupid as when I'm being smart" (Linus van Pelt)
    12. Re:Alternative? by TheRaven64 · · Score: 4, Informative

      it lets CUDA code run on x86, but still doesn't do anything for AMD graphics cards

      Actually, it does. It lets CUDA code run on any processor that has an LLVM back end. The open source Radeon drivers have an experimental LLVM back end and use LLVM for optimising shader code.

      --
      I am TheRaven on Soylent News
    13. Re:Alternative? by Yvan256 · · Score: 1

      I'm not sure it would be wise for AMD to drop a known brand name like ATI.

    14. Re:Alternative? by Anonymous Coward · · Score: 0, Offtopic

      How do you mod retarded?

    15. Re:Alternative? by Score Whore · · Score: 2, Funny

      Hard to say, but it must be easy since there are lots of mods that are, at the very least, a bit challenged. If you know what I mean.

    16. Re:Alternative? by Anonymous Coward · · Score: 0

      I didn't say that. OpenCL doesn't, CUDA does.

    17. Re:Alternative? by PitaBred · · Score: 1

      AMD is working on a unification of GPU and CPU. It makes perfect sense to start attaching the AMD name to the GPUs.

    18. Re:Alternative? by Trepidity · · Score: 1

      As far as I can tell, OpenCL is pretty much based on CUDA, not on an attempt to unify CUDA and CTM/Brook+/FireStream. That's partly because ATI's solutions never really caught on, and have been sorta ignored.

    19. Re:Alternative? by Anonymous Coward · · Score: 2, Informative

      OpenCL isn't ALL that close either to CUDA or anything from AMD (CAL, Brook+). The status quo with AMD is that the OpenCL implementation they have is very immature, e.g. it doesn't support a lot of fairly basic and highly desirable OpenCL "extensions" (actually it didn't support ANY until about 2 days ago, and now they're just beta testing a few of the most rudimentary ones). Additionally there are still lots of issues with missing / unclear documentation, missing features, bugs, development / runtime platform portability issues, et al. Most significantly, the OpenCL performance is still a fraction of the performance commonly achievable with Brook+ or CAL in many common scenarios on the AMD platform. This is sometimes / often true for their 58xx series boards, and pervasively so for their older 4xxx series cards (which by architectural limitations as well as by lack of planned OpenCL development toolchain support / optimization will never really perform well with OpenCL).

      On the NVIDIA side, CUDA performance and usage flexibility is still typically and substantially higher than is achievable via OpenCL, since obviously CUDA exists to fairly optimally exploit their GPU architectural capabilities whereas OpenCL is a generic GPU-vendor / architecture "neutral" platform that doesn't give as much card specific control as CUDA (or CAL in AMD's case).

      Development tools and platform portability are still poor in both NVIDIA and AMD cases. NVIDIA, for instance, lacks CUDA/OpenCL support on platforms like Solaris, FreeBSD. AMD AFAIK doesn't even have graphics driver support (much less OpenCL/Stream/CAL/Brook+) on BSD, Solaris, Mac(?), and the support is pretty rocky on LINUX still.

      LINUX Open Source drivers for AMD hardware are still barely at the stage of providing high quality basic 2D functionality for R600/R500 GPUs, R700 isn't there yet, and R800 is farther out still. In none of these cases does anything like Stream / Brook+ / OpenCL work with the open source driver. It seems as if it may take the better part of 2010 to go by before we see even the first good previews of OpenCL and decently useful 3D graphics running on R600/R700/R800 GPUs with Gallium, X.org, Mesa, et al. all coming together with the open source radeon drivers.

      Basically if you want high performance within the next few months, plan on writing GPU model specific code in CUDA for NVIDIA, and deal with platform / software / card portability issues that will come up frequently. If you're targeting AMD, either target R800 generation cards only, or assume that you'll be getting only a fraction of the performance from R700/R600 cards using OpenCL, and even in the case of R800, don't assume there will be production quality comprehensive high performance driver/toolchain support before mid to late 2010.

      If you just want stuff to be "portable" across GPU vendors and do graphics-like computations with the GPUs, use either OpenCL or DX11 (on Windows Vista/7 platforms), or just stick to shaders in DX9/DX10 for even better portability.
      Don't expect OpenCL to be "write once run anywhere" with minimal developer issues or end user runtime configuration / linking issues for at least a few more months in the case of AMD/NVIDIA on Windows. As of now even a lot of developers have issues with DLL compatibility / versioning / paths / capability detection etc.

      I think 18 months from now maybe it will be really a more streamlined experience to use OpenCL across OS platforms and GPU cards, but still probably mostly for GPU generations that are DX11 and beyond only, not really so much the legacy models (which are still 95% of the deployed market).

    20. Re:Alternative? by Elbows · · Score: 2, Informative

      On top of that, the CUDA tools are still much better than OpenCL. OpenCL is basically equivalent to CUDA's low-level "driver" interface, but it has no equivalent to the high-level interface that lets you combine host/device code in a single source, etc. CUDA also supports a subset of C++ for device code (e.g. templates), which I don't believe is the case for OpenCL. CUDA also has a debugger (of sorts), profiler, and in version 3 apparently a memory checker. But I haven't been following OpenCL that closely lately -- it may be catching up on the tool front.

      If you're developing an in-house project where you have control over the hardware you're going to run on, or you know that most of your customers have Nvidia cards anyway, there are still good reasons to go with CUDA.

    21. Re:Alternative? by CDeity · · Score: 4, Informative

      The greatest challenges lie in accommodating arbitrary control flow among threads within a cooperative thread array. NVIDIA GPUs are SIMD multiprocessors, but they include a thread activity stack that enables serialization of threads when they reach diverging branches. Without hardware support, this kind of thing becomes difficult on SIMD processors which is why Ocelot doesn't include support for SSE yet. It is also one of the obstacles for supporting AMD/ATI IL at the moment, though solutions are in order.
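
      A rough sketch of the masking idea in Python, not NVIDIA's hardware or Ocelot's code: lanes share one instruction stream, so a divergent if/else is run one side at a time, and a per-lane activity mask decides which lanes commit results. The name `run_divergent_branch` and its arguments are made up for illustration, and only a single, non-nested branch is handled.

```python
def run_divergent_branch(values, predicate, then_op, else_op):
    """Execute an if/else over SIMD lanes by serializing the two sides.

    Every lane steps through both sides of the branch, but only lanes that
    are active under the current mask commit a result.
    """
    results = list(values)
    taken = [predicate(v) for v in values]   # per-lane branch outcome
    mask_stack = [[not t for t in taken]]    # push the else-side mask for later
    active = taken                            # run the then-side first

    passes = 0
    while True:
        if any(active):                       # skip a side that no lane needs
            passes += 1
            op = then_op if active is taken else else_op
            for lane, lane_is_active in enumerate(active):
                if lane_is_active:
                    results[lane] = op(values[lane])
        if not mask_stack:
            break
        active = mask_stack.pop()             # switch to the other side
    return results, passes
```

      For example, `run_divergent_branch([1, 2, 3, 4], lambda v: v % 2 == 0, lambda v: v * 10, lambda v: v + 1)` returns `([2, 20, 4, 40], 2)`: two serialized passes, because the lanes disagree. If every lane takes the same side, only one pass runs.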

      Translation from PTX to LLVM to multicore x86 does not necessarily throw away information concerning the PTX thread hierarchy initially. The first step is to express a PTX kernel using LLVM instructions and intrinsic function calls. This phase is [theoretically] invertible and no information concerning correctness or parallelism is lost.

      To get to multicore from here, a second phase of transformations inserts loops around blocks of code within the kernel to implement fine-grain multithreading. This is the part that isn't necessarily invertible or easy to translate back to GPU architectures and is what is referenced in the note you are citing.

      Disclosure: I'm one of the core contributors to the Ocelot project.
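
      To picture the loop-insertion phase, here is a small Python sketch. It is an illustration under assumptions, not Ocelot's actual transformation: the kernel (reverse each block through a shared buffer) and the name `reverse_block_serialized` are hypothetical. The barrier splits the kernel body into two regions, and each region gets its own loop over the logical thread IDs.

```python
def reverse_block_serialized(data, block_dim):
    """Run a 'reverse within each block' kernel on a single CPU thread.

    The GPU-style kernel would be:
        shared[tid] = data[base + tid]
        barrier()
        out[base + tid] = shared[block_dim - 1 - tid]
    """
    assert len(data) % block_dim == 0, "sketch assumes whole blocks only"
    out = [None] * len(data)
    for block_id in range(len(data) // block_dim):
        base = block_id * block_dim
        shared = [None] * block_dim

        # Region 1: every logical thread runs the code before the barrier.
        for tid in range(block_dim):
            shared[tid] = data[base + tid]

        # Finishing the loop above plays the role of the barrier: all
        # logical threads have written to shared before anyone reads it.

        # Region 2: every logical thread runs the code after the barrier.
        for tid in range(block_dim):
            out[base + tid] = shared[block_dim - 1 - tid]
    return out
```

      With `block_dim=3`, `reverse_block_serialized([1, 2, 3, 4, 5, 6], 3)` gives `[3, 2, 1, 6, 5, 4]`. Once the kernel is in this looped form, the original per-thread structure is no longer explicit, which is why going back to a GPU target from here is awkward.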

    22. Re:Alternative? by Anonymous Coward · · Score: 0

      This isn't an alternative to CUDA; it lets CUDA code run on x86, but still doesn't do anything for AMD graphics cards. In other words, your choices as a developer are to use OpenCL and have your code run everywhere (AMD, nVidia, x86 slowly), or use CUDA and have your code run on nVidia or x86 slowly.

      What possible reason could you have to want to be locked into one GPU vendor?

      Because the hardware doesn't suck?

    23. Re:Alternative? by Cytotoxic · · Score: 0, Offtopic

      -----Hard to say, but it must be easy since there are lots of mods that are, at the very least, a bit challenged. If you know what I mean.-----

      <The Tick>

                            Nope.

      </The Tick>

      Ok, that one was strictly for the two other people who would get a Warburton/Tick reference. But the three of us laughed our asses off..... "Yes, it is I, Bat-Manuel! I saved them three times later that night, if you know what I mean." Heh, funny... Ok, well, if you had watched the show you'd laugh too. And the stupid thing wouldn't have been canceled after a half-season. So basically it's doubly your fault that you didn't get it.

      Spooooon!

    24. Re:Alternative? by Tycho · · Score: 1

      Because the hardware doesn't suck?

      We'll see about that in 2Q10, the earliest that Fermi could be released. This assumes nVidia avoids bankruptcy and can get the steaming pile of poo known as Fermi to actually work acceptably and reliably enough for general release to "consumers".

      --
      Impersonating Tycho from Penny Arcade since before there was a PA.
    25. Re:Alternative? by triso · · Score: 1

      What possible reason could you have to want to be locked into one GPU vendor?

      Only that the other GPU vendor, AMD/ATI, doesn't have a working Linux driver for 3-d, proprietary or open. In addition there isn't much support for their older cards.

    26. Re:Alternative? by Darundal · · Score: 1

      Not right off the bat, but a slow transition from one brand to the other, like what is happening now, can be quite good for them.

    27. Re:Alternative? by TeXMaster · · Score: 1

      OpenCL isn't ALL that close either to CUDA or anything from AMD (CAL, Brook+). The status quo with AMD is that the OpenCL implementation they have is very immature, e.g. it doesn't support a lot of fairly basic and highly desirable OpenCL "extensions" (actually it didn't support ANY until about 2 days ago, and now they're just beta testing a few of the most rudimentary ones). Additionally there are still lots of issues with missing / unclear documentation, missing features, bugs, development / runtime platform portability issues, et al. Most significantly, the OpenCL performance is still a fraction of the performance commonly achievable with Brook+ or CAL in many common scenarios on the AMD platform. This is sometimes / often true for their 58xx series boards, and pervasively so for their older 4xxx series cards (which by architectural limitations as well as by lack of planned OpenCL development toolchain support / optimization will never really perform well with OpenCL).

      On the NVIDIA side, CUDA performance and usage flexibility is still typically and substantially higher than is achievable via OpenCL, since obviously CUDA exists to fairly optimally exploit their GPU architectural capabilities whereas OpenCL is a generic GPU-vendor / architecture "neutral" platform that doesn't give as much card specific control as CUDA (or CAL in AMD's case).

      I do wonder how much this is because of OpenCL being vendor-neutral and thus 'far' from the underlying architecture, and how much it depends on the quality of the compilers. I suspect that NVIDIA does not have much of an interest in optimizing their OpenCL compiler as much as they do with their CUDA compiler, for the obvious reason that with CUDA they have vendor lock-in and can sell more hardware, whereas with OpenCL there is the (remote) possibility that a better compiler from ATI might lead people to look at the other hardware more.

      --
      "I'm never quite so stupid as when I'm being smart" (Linus van Pelt)
    28. Re:Alternative? by DeKO · · Score: 1

      On the NVIDIA side, CUDA performance and usage flexibility is still typically and substantially higher than is achievable via OpenCL, since obviously CUDA exists to fairly optimally exploit their GPU architectural capabilities whereas OpenCL is a generic GPU-vendor / architecture "neutral" platform that doesn't give as much card specific control as CUDA (or CAL in AMD's case).

      That's not true. I've run many equivalent CUDA and OpenCL kernels on NVIDIA cards, and they both perform the same. Pretty much in accordance with those benchmarks.

      There's no reason for OpenCL code to be any slower than CUDA code (the same compiler is used, only with small changes in the frontend). Maintainability on the other hand... with CUDA you can launch a kernel just like you were calling a function; with OpenCL you have almost a dozen setup steps (reminds me of programming Win32 applications directly with raw Win32 API calls). Function and operator overloading, templates... those are nice things to have at your disposal when you need them. Let's hope they make an "OpenCL++" standard too.

    29. Re:Alternative? by badkarmadayaccount · · Score: 1

      Same reason people stick to Flash: superior development tools. But there is a catch - LLVM has been romancing vector support, and I believe clang is used as an OpenCL frontend, so anything with an LLVM backend == supports OpenCL

      --
      I know tobacco is bad for you, so I smoke weed with crack.
  2. Performance? by pablodiazgutierrez · · Score: 1

    I wonder how the performance of the open source solution is compared to the proprietary compiler by NVidia. If it's good enough, they might be scared.

    1. Re:Performance? by obarthelemy · · Score: 1

      scared ? they should be happy.

      --
      The Cloud - because you don't care if your apps and data are up in the air.
    2. Re:Performance? by Gregory Diamos · · Score: 1

      Here's a graph of performance. The GPU version uses NVIDIA's JIT to generate native instructions for a particular GPU, so the GPU results here should be more or less the same as if the program was compiled with NVIDIA's static compiler.

  3. Wait wut? by Icegryphon · · Score: 3, Insightful

    Why would you go from CUDA(Fast Floating-points) to x86(slower Floating-points)?
    Is there support yet for double-precision floating point on Nvidia cards?
    This makes as much sense as a Wookiee on the planet Endor.
    Unless the point is portability, but then why write it in CUDA to begin with?

    1. Re:Wait wut? by tepples · · Score: 3, Insightful

      Why would you go from CUDA(Fast Floating-points) to x86(slower Floating-points)?

      For running legacy apps that were developed between the release of CUDA and the release of OpenCL. There aren't many, I'd guess.

    2. Re:Wait wut? by Anonymous Coward · · Score: 1, Interesting

      Suppose you have working CUDA code but your dataset is relatively small, say a block of 1000 floating point numbers. Then the overhead of delegating the work to the GPU isn't necessarily worth the trouble.
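
      A back-of-the-envelope way to frame that trade-off, as a Python sketch. The model and the name `gpu_offload_pays_off` are assumptions for illustration, and any timings fed into it are invented placeholders rather than measurements: the GPU pays a fixed launch cost plus a per-item transfer cost before its faster per-item compute can win.

```python
def gpu_offload_pays_off(n_items, cpu_ns_per_item, gpu_ns_per_item,
                         launch_overhead_ns, transfer_ns_per_item):
    """Return True if the modeled GPU time beats the modeled CPU time.

    All timing parameters are caller-supplied guesses in nanoseconds.
    """
    cpu_time = n_items * cpu_ns_per_item
    gpu_time = launch_overhead_ns + n_items * (transfer_ns_per_item + gpu_ns_per_item)
    return gpu_time < cpu_time
```

      With made-up costs of 10 ns per item on the CPU, 1 ns per item on the GPU, 2 ns per item of transfer, and a 20,000 ns launch overhead, 1000 items model as faster on the CPU (10,000 ns against 23,000 ns), while a million items model as faster on the GPU.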

    3. Re:Wait wut? by SpinyNorman · · Score: 2, Insightful

      I can think of a couple of reasons it may be useful on x86 :

      - Better debugging tools
      - Allows CUDA development without buying specialized hardware up-front (a lesson I've learnt - don't buy hardware until the software is ready)

      It's also another option for multi-core programming. If the CUDA API is good, maybe it's an efficient way to develop certain types of parallel apps even if you never intend to use it on a GPU.

    4. Re:Wait wut? by beelsebob · · Score: 2, Informative

      Which is exactly why you should be using OpenCL, not CUDA – because it lets the OpenCL driver decide whether to run it on the CPU or the GPU.

    5. Re:Wait wut? by Trepidity · · Score: 1

      It doesn't really, though. OpenCL "decides" based on some very high-level, high-granularity features of devices it can enumerate. In practice, if you want your code to run reasonably well, you know which parts are going to run on the GPU and which on the CPU. OpenCL isn't an auto-parallelization solution, just a set of primitives for parallel programming--- more like an MPI or OpenMP that also supports GPGPU than the old 70s holy grail of auto-parallelizing, where the compiler or runtime magically figures out how to chunk up your computation and where to send the chunks.

    6. Re:Wait wut? by Anonymous Coward · · Score: 0

      The FPU in modern x86 CPUs is much faster than the ones in GPUs. The difference is that your GPU has hundreds of them and your CPU has only one per core. It's perfect for testing and debugging and will probably also be perfect for when x86 CPUs get hundreds of cores.

    7. Re:Wait wut? by beelsebob · · Score: 1

      Very true, to get that, you need to combine OpenCL with Grand Central Dispatch.

    8. Re:Wait wut? by Midnight Thunder · · Score: 2, Interesting

      For running legacy apps that were developed between the release of CUDA and the release of OpenCL. There aren't many, I'd guess.

      Sounds like there is great potential for a tool that will convert CUDA to OpenCL.

      --
      Jumpstart the tartan drive.
    9. Re:Wait wut? by Tycho · · Score: 1

      This assumes that your GPU can perform (or that you can do without) double precision operations, can carry out renormalization, offer rounding as well as chopping, and properly handle "Not a Number" or Infinity values by following the IEEE754 standards.
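
      For reference, here is what those behaviours look like on an IEEE 754 host, as a Python sketch. It only demonstrates CPU-side behaviour and says nothing about what a particular GPU does. The helper names are made up, and the chopping helper assumes a finite value inside binary32's normal range.

```python
import math
import struct
import sys

def to_float32_nearest(x):
    """Round a binary64 value to binary32 (struct rounds to nearest)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_float32_chopped(x):
    """Truncate a binary64 value toward zero onto the binary32 grid."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    bits &= ~((1 << 29) - 1)   # clear the 29 mantissa bits binary32 lacks
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def ieee754_special_cases():
    """Check NaN, Infinity and subnormal handling on the host."""
    inf = float("inf")
    nan = inf - inf            # an invalid operation yields NaN
    return {
        "nan_from_inf_minus_inf": math.isnan(nan),
        "nan_not_equal_to_itself": nan != nan,
        "one_over_inf_is_zero": 1.0 / inf == 0.0,
        "subnormal_survives": sys.float_info.min / 2 > 0.0,
    }
```

      Rounding and chopping disagree in the last bit for a value like 1/3: `to_float32_nearest(1 / 3)` lands just above 1/3 and `to_float32_chopped(1 / 3)` just below it. Errors of that size can add up over long accumulations, which is one reason the choice of rounding mode matters.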

      --
      Impersonating Tycho from Penny Arcade since before there was a PA.
    10. Re:Wait wut? by Jesus_666 · · Score: 1

      Or for running science-related apps on computers without a NVIDIA GPU. As far as I can tell, computational science is all about CUDA. Even in courses about GPGPU computing you get brief rundowns à la "CUDA is [15 minute explanation]. Then there's also OpenCL and Sh but nobody uses those" and requirements like "everyone needs to use CUDA. If you don't have a supported NVIDIA GPU please buy one or drop the course" because the lecturer is convinced that teaching anything but CUDA would be a waste of time for everyone.

      I don't know if things are different elsewhere but in the science sector CUDA has massive brand recognition whereas OpenCL doesn't.

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
    11. Re:Wait wut? by badkarmadayaccount · · Score: 1

      Radeon cards have an experimental LLVM backend, AFAIK.

      --
      I know tobacco is bad for you, so I smoke weed with crack.
  4. Doesn't sound like a compiler by gnasher719 · · Score: 3, Interesting

    Seems to be just a front-end for LLVM. And if it is just a front-end for LLVM, then why doesn't it support ATI graphics cards? That would actually make it useful; there is no need for a second CUDA compiler for NVidia cards.

    1. Re:Doesn't sound like a compiler by beelsebob · · Score: 1

      Seems to be just a front-end for LLVM. And if it is just a front-end for LLVM, then why doesn't it support ATI graphics cards?
      Because OpenCL already does that job just fine. The only possible use for this is to have legacy CUDA apps actually run while people port them to use OpenCL instead.

    2. Re:Doesn't sound like a compiler by MostAwesomeDude · · Score: 3, Informative

      There is no LLVM backend for AMD/ATI cards. Of the few of us that actually understand ATI hardware, most of us are working on other things besides GPGPU. Sorry.

      --
      ~ C.
    3. Re:Doesn't sound like a compiler by Voline · · Score: 1

      Dude, you are truly most awesome.

    4. Re:Doesn't sound like a compiler by Anonymous Coward · · Score: 0

      There is no LLVM backend for AMD/ATI cards.

      Watch the LLVM tree carefully over the next few months. There may be some interesting checkins on the way.

  5. Recompiler to what back-end? by tepples · · Score: 1

    And if it is just a front-end for LLVM, then why doesn't it support ATI graphics cards?

    That depends on whether LLVM has a back-end for ATI graphics cards. Is the Stream Computing SDK based on LLVM or something else?

  6. Metal Gear Solid Joke Here by Anonymous Coward · · Score: 0

    Ocelot!

    1. Re:Metal Gear Solid Joke Here by HTH NE1 · · Score: 1

      Does it run as a last accessed, last executed, last used, last in, last out queue?

      --
      Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
    2. Re:Metal Gear Solid Joke Here by badkarmadayaccount · · Score: 1

      I just imagined the queue and pointer structure described by the parent. Man I'd kill for some weed right now.

      --
      I know tobacco is bad for you, so I smoke weed with crack.
  7. I'm betting.. by RightSaidFred99 · · Score: 2, Funny

    NVidia isn't real happy about this. No Christmas cards for those guys! In fact the developers should expect some insipid, obvious, and unfunny cartoons will be drawn about them.

  8. CUDA is only fast on some computers by Anonymous Coward · · Score: 1, Insightful

    Why would you go from CUDA(Fast Floating-points) to x86(slower Floating-points)?

    Because if you don't have the right hardware, CUDA isn't fast floats. It's a program that doesn't run at all.

  9. just-in-time compiler? by Snaller · · Score: 0, Troll

    Doesn't that mean "not compiled at all"

    --
    If Google really cared they would fix Android Chrome to reflow text, instead of discriminating
    1. Re:just-in-time compiler? by Anonymous Coward · · Score: 0

      No.

      "Not compiled at all" means that every time through a given set of code, you're translating it to machine code.

      A JIT compiler, on the other hand, when it runs through code, will translate that code into machine instructions and keep that translation around in case there's a next time. So for code that's run once, you translate, and have a bunch of machine code that's never run again... but for code that's run multiple times (function calls, loops, that sort of thing), you translate, and run that bunch of machine code multiple times.
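
      A toy version of that "translate once, keep it around" behaviour, sketched in Python. `TinyJit` is a made-up class, and Python's built-in `compile` (which produces bytecode, not machine code) stands in for the native code generator.

```python
class TinyJit:
    """Translate each source expression once, then reuse the translation."""

    def __init__(self):
        self._cache = {}          # source text -> compiled code object
        self.translations = 0     # how many times we actually translated

    def run(self, source, **variables):
        code = self._cache.get(source)
        if code is None:
            # First encounter: pay the translation cost and remember the result.
            code = compile(source, "<jit>", "eval")
            self._cache[source] = code
            self.translations += 1
        # Every later call skips straight to executing the cached translation.
        return eval(code, {"__builtins__": {}}, dict(variables))
```

      Running `jit.run("x * x + 1", x=i)` for `i` in `range(5)` on one `TinyJit` instance yields 1, 2, 5, 10, 17 while `jit.translations` stays at 1, which is the loop case described above.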

      So for a task that runs through its code just once, there's no difference. But for a task that runs through particular code paths multiple times - which is going to be just about anything that's a reasonable size - JIT is a win. Not as fast as ahead of time compilation, of course, but relatively few applications need that extra performance boost nowadays.

    2. Re:just-in-time compiler? by slimjim8094 · · Score: 1

      No, it means compiled just in time. Shocking, I know.

      --
      I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
  10. OpenCL not a magic bullet by smcdow · · Score: 3, Insightful

    A bit off topic, but since I'm seeing posts about OpenCL and portability...

    OpenCL will indeed get you portability between processors, however OpenCL does not make any guarantees about how well that portable code will run. In the end, to get optimum performance you still have to code to the particular architecture on which your code is going to run. For example, performance on Nvidia chips is extremely sensitive to memory access patterns. You could write OpenCL code that runs very well on Nvidia chips, but runs poorly on a different architecture.

    Not saying that portability isn't a good thing, but a lot of people seem to be thinking that OpenCL will solve all your portability problems. It won't. It only will let code run on multiple architectures. You'll still have to more or less hand optimize to the architecture.

    --
    In the course of every project, it will become necessary to shoot the scientists and begin production.
    1. Re:OpenCL not a magic bullet by Midnight Thunder · · Score: 2, Informative

      Not saying that portability isn't a good thing, but a lot of people seem to be thinking that OpenCL will solve all your portability problems. It won't. It only will let code run on multiple architectures. You'll still have to more or less hand optimize to the architecture.

      Like the argument of assembler vs C, I think as time goes on we will find ourselves with tools that do a better job than the programmer of optimising a block of OpenCL code for a specific processing core. Sure, there will always be specific cases where the programmer can do a better job, but most programmers IMHO would rather write portable code and leave the optimisation to code which does it better than they can, for lack of intrinsic knowledge and time.

      --
      Jumpstart the tartan drive.
    2. Re:OpenCL not a magic bullet by complete loony · · Score: 1

      This is one of the strengths of LLVM. If your hardware performs better with some specific tweaks to the code, then write an optimising pass that makes the appropriate transformations. Then you can keep your back end machine code generator as simple as possible. Even better, write your optimiser in a generic way so anyone else tackling a similar problem can reuse your work. Heck if you're lucky someone else has already done so.

      --
      09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
    3. Re:OpenCL not a magic bullet by Anonymous Coward · · Score: 0

      The fundamental flaw in your assumption that optimizers will solve everything is that the things that make the most difference actually change the algorithm used.

      Example: I can sort a 2-dimensional array by columns, but that will cause a lot of cache misses on most processors. If I change the algorithm to run by rows, or rearrange the data structure so that it is stored (y, x) instead of (x, y), performance improves. The only way an optimiser could do this is by haphazardly undermining the programmer (there may be a reason it was done by columns that can't be changed) or by using an obscenely high level "write me a program that sorts this array" command (largely impractical).
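
      The layout effect is easier to see with plain traversal than with sorting, so this Python sketch uses traversal instead. It does not time anything; it only lists which offsets of a row-major array get touched, in order. The names `flat_index_order` and `strides` are hypothetical.

```python
def flat_index_order(rows, cols, by_rows):
    """Return the row-major offsets touched, in traversal order."""
    if by_rows:
        return [r * cols + c for r in range(rows) for c in range(cols)]
    return [r * cols + c for c in range(cols) for r in range(rows)]

def strides(order):
    """Distance between consecutive accesses; 1 means a neighbouring element."""
    return [b - a for a, b in zip(order, order[1:])]
```

      For a 2x3 array, walking by rows touches offsets `[0, 1, 2, 3, 4, 5]` (every stride is 1), while walking by columns touches `[0, 3, 1, 4, 2, 5]` (strides `[3, -2, 3, -2, 3]`). On a real array with long rows those jumps are what land on different cache lines. Choosing between the two orders is a change to the algorithm or the data layout, not something a peephole optimisation can safely do.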

    4. Re:OpenCL not a magic bullet by TheRaven64 · · Score: 1
      There's only so much an optimizer can do, and it depends on how high-level the language is. With C, for example, the optimizer can't turn three arrays of colour values into an array of structures, which would let it use the vector unit for operations. In a higher-level language, which didn't expose the memory layout to the programmer, this is possible.

      In general, high-level languages have more potential for optimization than low-level ones. In contrast, low-level languages make it easier for the programmer to write optimised code.
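
      The layout change in question, written out by hand as a Python sketch (function names are made up, and Python tuples only stand in for packed structures): three per-channel arrays become one array of pixels, after which a single operation can be applied to all channels of a pixel together.

```python
def interleave_channels(red, green, blue):
    """Turn three per-channel arrays into one array of (r, g, b) pixels."""
    assert len(red) == len(green) == len(blue), "channels must match in length"
    return list(zip(red, green, blue))

def scale_pixels(pixels, factor):
    """Apply one operation to every channel of each pixel at once."""
    return [tuple(channel * factor for channel in pixel) for pixel in pixels]
```

      `interleave_channels([1, 2], [3, 4], [5, 6])` gives `[(1, 3, 5), (2, 4, 6)]`. A C compiler cannot do this rewrite on its own because other code may depend on the addresses of the original three arrays.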

      --
      I am TheRaven on Soylent News
    5. Re:OpenCL not a magic bullet by badkarmadayaccount · · Score: 1

      Do you have an issue with "obscene" languages like Prolog et al.?

      --
      I know tobacco is bad for you, so I smoke weed with crack.
  11. larrabee by Anonymous Coward · · Score: 0

    Had Larrabee turned into a product this xmas, I think a lot of people would have been interested in CUDA to x86.
    I'm sure the people still working on it will be interested in it.

    Next step CUDA to ATI...

  12. Because by Groo Wanderer · · Score: 1

    "What possible reason could you have to want to be locked into one GPU vendor?"

    Perhaps because you are sick and tired of GPUs that don't die an early death, and love sitting on the phone and being told that it isn't covered by warranty by HP, Dell, Apple, Sony, and the rest.

                    -Charlie

  13. We need the opposite... by Anonymous Coward · · Score: 0

    An (Open Source) Compiler From X86-Multicore To CUDA.... This way, the ION3 could completely miss the Atom part of the equation, and we would get one more player in the x86 field.

  14. Why? by Gregory Diamos · · Score: 3, Informative

    So there seem to be several questions as to why people would want to use CUDA when an open standard exists for the same thing (OpenCL).

    Well, honestly, the reason why I wrote this was because when I started, OpenCL did not exist.

    I have heard the following reasons why some people prefer CUDA over OpenCL:

    • The toolchains for OpenCL are still immature. They are getting better, but are not quite as bug-free and high performance as CUDA at this point.
    • CUDA has more desirable features. For example, CUDA supports many C++ features such as templates and classes in device code that are not part of the OpenCL specification.

    Additionally I would like to see a programming model like CUDA or OpenCL replace the most widespread models in industry (threads, OpenMP, MPI, etc...). CUDA and OpenCL are each examples of Bulk Synchronous Parallel models, which explicitly are designed with the idea that communication latency and core count will increase over time. Although I think that it is a long shot, I would like to see more applications written in these languages so there is a migration path for developers who do not want to write specialized applications for GPUs, but can instead write an application for a CPU that can take advantage of future CPUs with multiple cores, or GPUs with a large degree of fine-grained parallelism.
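
    As a small illustration of the bulk-synchronous style (a Python sketch, simulated sequentially, with a made-up function name): each superstep has every worker compute from the previous superstep's state only, and the results become visible to everyone at the barrier. Here the supersteps compute an inclusive prefix sum with one value per worker.

```python
def bsp_prefix_sum(values):
    """Inclusive prefix sum as a series of BSP supersteps."""
    current = list(values)
    n = len(current)
    supersteps = 0
    distance = 1
    while distance < n:
        # Compute phase: worker i reads only the previous superstep's state.
        nxt = [current[i] + (current[i - distance] if i >= distance else 0)
               for i in range(n)]
        # Barrier: the new state is published to all workers at once.
        current = nxt
        supersteps += 1
        distance *= 2
    return current, supersteps
```

    `bsp_prefix_sum([1, 2, 3, 4, 5])` returns `([1, 3, 6, 10, 15], 3)`. The number of supersteps grows with the logarithm of the worker count, so the same program shape keeps working as cores are added.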

    Most of the codebase for Ocelot could be re-used for OpenCL. The intermediate representation for each language is very similar, with the main differences being in the runtime.

    Please try to tear down these arguments, it really does help.

  15. The worlds only Ocelot joke by Lardmonster · · Score: 1

    Q: How do you titillate an ocelot?
    A: Oscillate its tits a lot.

    I thank you.

    --
    The more advanced the technology, the more open it is to primitive attack