Slashdot Mirror


OpenGL 4.4 and OpenCL 2.0 Specs Released

Via Ars comes news that the OpenGL 4.4 and OpenCL 2.0 specifications were released yesterday. OpenGL 4.4 features a few new extensions, perhaps most importantly a few to ease porting applications from Direct3D. With the new bindless textures, shaders have access to the entire virtual address space of the card, and new sparse textures allow streaming tiles of textures too large for the graphics card memory. Finally, the ARB has announced the first set of conformance tests since OpenGL 2.0, so going forward anything calling itself OpenGL must pass certification. The OpenCL 2.0 spec is still provisional, but now features a memory model that is a subset of C11, allowing sharing of complex data between the host and GPU and avoiding the overhead of copying data to and from the GPU (which can often make using OpenCL a losing proposition). There is also a new spec for an intermediate language: "'SPIR' stands for Standard Portable Intermediate Representation and is a portable non-source representation for OpenCL 1.2 device programs. It enables application developers to avoid shipping kernel source and to manage the proliferation of devices and drivers from multiple vendors. OpenCL SPIR will enable consumption of code from third party compiler front-ends for alternative languages, such as C++, and is based on LLVM 3.2. Khronos has contributed open source patches for Clang 3.2 to enable SPIR code generation." For full details see Khronos's OpenGL 4.4 announcement and their OpenCL 2.0 announcement. Update: 07/23 20:17 GMT by U L : edxwelch notes that Anandtech published notes and slides from the SIGGRAPH announcement.

11 of 66 comments

  1. Better Article by edxwelch · · Score: 5, Informative
    1. Re:Better Article by gman003 · · Score: 2

      They're essentially at parity because they're matching, for the most part, the features of the underlying hardware. It's a weird give-and-take - Microsoft likes to dictate features, but their DirectX team is smart enough (now) to dictate features that at least one of the GPU companies is already implementing - essentially, they ask Nvidia what they're adding, ask AMD what they're adding, then add both of them to Direct3D and tell them both to implement it (and since they're both watching each other's moves anyways, this doesn't really change much). OpenGL takes a less leading role in adding features - they're usually added first as a vendor-specific extension (meaning Nvidia gets to dictate how it works and is used, and the extension has their name as a prefix). Then they standardize it as an ARB (Architecture Review Board) extension - make it more generic, less designed for one specific piece of hardware, and thus much easier for others to implement. Finally, if it's becoming a standard feature, they'll add it to the core language. OpenGL takes a more leading role in obsoleting older functionality, or in designing the OpenGL ES variants of the language. But they leave the cutting edge to the hardware people.

      So that's why OpenGL "trails" Direct3D by a hair - they follow the hardware a bit less closely. But they're both close enough that they're essentially feature-compatible, for modern stuff (this is all completely wrong for OpenGL 2.0 / DirectX 8 or lower).
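The vendor-extension, then ARB-extension, then core progression described in this comment shows up directly in extension names. As a rough sketch only: the helper `extension_stage` below is hypothetical (it is not part of any OpenGL binding), and it assumes names follow the usual `GL_<PREFIX>_<feature>` pattern. The `EXT` and `KHR` prefixes are not mentioned in the comment; they are included here as the commonly seen multi-vendor prefixes.

```python
# Hypothetical helper: guess how far along the standardization path an
# OpenGL extension is, purely from the prefix in its name.
MULTI_VENDOR_PREFIXES = ("EXT", "KHR")  # assumption: not discussed in the thread


def extension_stage(name):
    """Return (stage, prefix) for a name shaped like GL_<PREFIX>_<feature>."""
    parts = name.split("_", 2)
    if len(parts) < 3 or parts[0] != "GL":
        raise ValueError("expected a name like GL_<PREFIX>_<feature>: %r" % name)
    prefix = parts[1]
    if prefix == "ARB":
        return ("arb", prefix)           # standardized by the ARB
    if prefix in MULTI_VENDOR_PREFIXES:
        return ("multi-vendor", prefix)  # agreed on by several vendors
    return ("vendor", prefix)            # e.g. NV, AMD: one vendor's design


if __name__ == "__main__":
    for ext in ("GL_NV_bindless_texture", "GL_ARB_bindless_texture",
                "GL_AMD_sparse_texture", "GL_ARB_sparse_texture"):
        print(ext, "->", extension_stage(ext))
```

Once a feature is folded into the core specification it no longer needs an extension name at all, so a prefix check like this only covers the first two stages.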

  2. Re:OpenCL by Anonymous Coward · · Score: 2, Insightful

    If you can't understand C, you have no business touching the GPU or even calling yourself a programmer.

  3. Re:OpenCL by kthreadd · · Score: 3, Insightful

    Nothing wrong with C, but you don't really need to limit yourself to it just because the code is running on the GPU. Have a look at C++ AMP for example.

  4. Re:OpenCL by Musc · · Score: 4, Insightful

    Except for the fact that CUDA only works on nvidia devices, and OpenCL works on everything...

    --
    Hamsters are at least as feathery as penguins. HamLix
  5. Re:OpenCL by cheesybagel · · Score: 2

    In my experience CUDA is not any faster than OpenCL. Frameworks don't solve the problem properly. There are a lot of debugging tools for OpenCL; my guess is you did not look hard enough. You can run OpenCL programs without installing all the cruft required to do CUDA development, since the driver will compile and run code by itself. This means a lot of people don't bother looking for tools, but they are out there.

  6. Re:OpenCL by cheesybagel · · Score: 3, Informative

    There are also several mobile devices (smartphones, tablets) running ARM which have OpenCL support and zero CUDA support. Not to mention that it is also a web standard, namely WebCL.

  7. Conformance tests developed along with spec by GNUThomson · · Score: 2

    That's surprisingly uncommon among standardization organizations. I wish IETF could do the same for RFCs...

  8. NVidia not that important. by tuppe666 · · Score: 2, Interesting

    NVidia, who own the 50% of the GPU market

    Not even close. NVidia has 18% of the GPU market, with Intel at 61.8% and AMD at 20.2%. NVidia is less prolific than you think. Basically 80% of the market can implement it without Nvidia. I don't think they want to do that.

  9. Re:OpenCL by gman003 · · Score: 2

    Well, let's look at the use cases for OpenCL right now:
    * Scientific computing, at levels from workstations to supercomputers
    * Games that need to offload work too parallel for the CPU to handle, or whose code must run on the GPU because its output will be used by other GPU code (streaming texture decompression is a common task)
    * Video transcoders, encoders and decoders
    * Bitcoin miners (obligatory Bitcoin reference: check!)

    All of those are fields where performance is a very high priority - in some cases, above even correctness. They're also fields for experts - if you don't know how to program at essentially the assembly level, you won't make it in the field. So is it harder? Sure. But this is stuff where you can't just wave a magic wand and make it easy - it's tough because massively multi-threaded programming is intrinsically difficult.

  10. Re:OpenCL by K.+S.+Kyosuke · · Score: 2

    No side effects is *key* in being able to parallelize things. Because you can trust that the same input will *always* give the exact same output.

    Actually, that's mostly irrelevant. That could be useful for memoization, but it's not a sufficient condition for parallelization - if you take it to the logical conclusion, you're asking for nothing more than a computer that is reliable, which is an assumption you make for most computer programs, so you're asking for a very weak property. The key to parallel computing is the associativity of individual operations. Other properties that are of lesser help are commutativity, idempotency (basically the thing you've mentioned), and the existence of zeros and identities, but it's associativity that is vital. If you can do (((1+2)+3)+(4+5))+((6+7)+(8+(9+10))) instead of ((((((((1+2)+3)+4)+5)+6)+7)+8)+9)+10, you win big. If you can't, you lose.
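To make the regrouping argument concrete, here is a minimal sketch in plain Python (no GPU involved). The function names `sequential_reduce` and `tree_reduce` are made up for this illustration, and the pairing order differs slightly from the exact parenthesization quoted above. It compares a left fold, where every step waits on the previous one, with a pairwise reduction whose combinations within one level are independent and could in principle run in parallel.

```python
from functools import reduce
from operator import add, sub


def sequential_reduce(op, items):
    """Left fold: (((1 op 2) op 3) op 4) ... each step needs the previous result."""
    return reduce(op, items)


def tree_reduce(op, items):
    """Pairwise reduction. Returns (result, number_of_levels).

    Every op() call inside one level is independent of the others, so a
    parallel machine could do a whole level at once. The result only matches
    the left fold when op is associative.
    """
    level = list(items)
    if not level:
        raise ValueError("tree_reduce needs at least one item")
    depth = 0
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # odd element out is carried to the next level
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return level[0], depth


if __name__ == "__main__":
    nums = range(1, 11)
    print(sequential_reduce(add, nums))   # 55, after 9 dependent steps
    print(tree_reduce(add, nums))         # (55, 4): same sum in 4 levels
    print(sequential_reduce(sub, nums))   # -53
    print(tree_reduce(sub, nums))         # (1, 4): regrouping changes the answer
```

Integer addition regroups freely, so both orders give 55; subtraction does not, so the tree order yields a different answer. Floating-point addition is only approximately associative, so a regrouped sum can differ in the last bits.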

    --
    Ezekiel 23:20