AMD's OpenCL Allows GPU Code To Run On X86 CPUs
eldavojohn writes "Two blog posts from AMD are causing a stir in the GPU community. AMD has created and released the industry's first OpenCL implementation that allows developers to code against AMD's graphics API (normally only used for their GPUs) and run it on any x86 CPU. Now, as a developer, you can divide the workload between the two as you see fit instead of having to commit to either GPU or CPU. Ars has more details."
Good on them. Now how about an API that allows me to run GPU code on the GPU? The day I can play 1080p mkvs from a netbook on AMD/ATI hardware is the day I'll quit buying nvidia.
I am literally 3000 tokens away from the chaotic crossbow --Stephen
In that memory on the card is faster for the card GPU and memory on the CPU is faster for the CPU. Like, I know PCI Express speeds things up, but is it so fast that you don't have to worry about the bottleneck of the system bus?
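A rough way to put numbers on the bus question is below. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the default of 8 GB/s is an assumed theoretical peak for a PCI Express 2.0 x16 link in one direction, which real transfers will not reach.

```python
# Hypothetical helper names; the 8e9 bytes/s default is an assumed
# theoretical PCIe 2.0 x16 one-way peak, not a measured figure.

def transfer_seconds(bytes_moved, bus_bytes_per_s=8e9):
    """Time spent copying data across the bus, ignoring latency."""
    return bytes_moved / bus_bytes_per_s


def offload_pays_off(bytes_moved, cpu_seconds, gpu_seconds, bus_bytes_per_s=8e9):
    """True if GPU compute time plus the bus copy still beats the CPU."""
    total_gpu_seconds = gpu_seconds + transfer_seconds(bytes_moved, bus_bytes_per_s)
    return total_gpu_seconds < cpu_seconds
```

So the bus only stops mattering when the GPU saves more time than the copy costs, which is why small jobs on large data tend to be better off staying on the CPU.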
This is my sig.
Why would anyone ever want to do something well when they can fail at several things?
A bullet may have your name on it but splash damage is addressed "To whom it may concern."
Wouldn't the real benefit be that you wouldn't have to create two separate code-bases to create an application that both supported GPU optimization and could run natively on any system?
Ironically Intel announced that they are going to stop outsourcing their GPUs in Atom processors and include the gpu + cpu in one package, yet nobody knows what happened to the dual core Atom N270...
Actually, this will provide more flexibility in their optimizations. There are some things that the CPU does very well, and others that the GPU handles well. Being able to say "perform THIS function on the CPU and THAT one on the GPU" will free up resources on each chip. Using the CPU for some functions frees up resources on the GPU, and vice versa, which (theoretically) lets you optimize the performance of EACH one for a better overall experience.
So now programmers can write code that will work on either processor and will be optimized on neither. Brilliant. I'm sure this is somehow a great step forward.
-sigh-
Um, what? How does the existence of a compiler that generates x86 code prevent the existence of an optimizing compiler that generates GPU instructions?
Things have been slowly moving in this direction already, since game makers have not been using available cpu horsepower very effectively. A little z-buffer magic and there is no reason why the object space couldn't be separated into completely independent processing streams.
-Matt
I haven't read too much about OpenCL (just a few whitepapers and tutorials), but does anybody know if you can use both the GPU and CPU at the same time for the same kind of task? For example, if I want a single "kernel" run 100 times, can I send 4 to the quad-core CPU and the rest to the GPU? If so, this would be a big win for AMD.
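For what it's worth, the host-side bookkeeping for that kind of split can be sketched as below. This is not OpenCL API code; `split_work` and the throughput weights are hypothetical, and the assumption is that the host program would then submit each share to that device's own command queue.

```python
# Hypothetical helper: divides work items between devices in proportion
# to assumed relative throughput weights (integers chosen by the caller).

def split_work(total_items, throughputs):
    """Return {device_name: item_count}, with the counts summing to total_items."""
    total_rate = sum(throughputs.values())
    shares = {
        name: (total_items * rate) // total_rate
        for name, rate in throughputs.items()
    }
    # Integer division can leave a few items unassigned; give them to
    # the device with the highest weight.
    leftover = total_items - sum(shares.values())
    fastest = max(throughputs, key=throughputs.get)
    shares[fastest] += leftover
    return shares
```

With weights of 4 for the CPU and 96 for the GPU, 100 items split into 4 and 96, which is the case asked about. Picking sensible weights is the hard part and would need measuring on the actual hardware.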
This is old news, Apple has been touting this for a year now, not AMD.
Having a separate compiler that doesn't integrate cleanly with the rest of your toolchain (i.e. uses a different intermediate representation preventing cross-module optimisations between C code and OpenCL) and doesn't integrate with the driver stack is very boring.
Oh, and the press release appears to be a lie:
AMD is the first to deliver a beta release of an OpenCL software development platform for x86-based CPUs
Somewhat surprising, given that OS X 10.6 betas have included an OpenCL SDK for x86 CPUs for several months prior to the date of the press release. Possibly they meant public beta.
I am TheRaven on Soylent News
Ok, I'll feed the troll (this time)
Anyway, Apple was one of the companies that first came up with the OpenCL standard. Apple worked with Khronos to make it a full standard. AMD is one of the first to publicly release a full implementation of OpenCL which is why this is big news.
Yeah, it's amazing how things that can generate executables on multiple platforms, things like C, are so amazingly slow.
Man, why did we ever stop using assembly?
Self proclaimed typo king, and inventor of the bear destroying coffee table (patent not pending).
I suppose it really sucks to code in OpenCL and also take advantage of your CPU. It also really sucks that when you have an nVidia card and the code is made for ATI that you can still use it on your CPU. Seriously...
Here be signatures
Welcome back to the days of the math coprocessor....
nVidia has had a full implementation of OpenCL out for months now.
However, it's beta and only accessible via the "OpenCL Early Access Program" which you have to apply for.
Insightful, funny, best post yet
This idea isn't new. CUDA allows you to execute your GPU code on the CPU. This is just AMD implementing OpenCL, which afaik is sufficiently new that no one else has done this yet. I would have expected it to be another couple of months before we really saw NVIDIA and AMD start pushing OpenCL when they release new hardware. Obviously they're working on it already, it's just a matter of when anyone can do anything with it.
Now that we have CPUs with literally more cores than we know what to do with, it makes sense to use those cores for graphics processing. I think that within a few years, we'll start seeing games that don't require a high-end graphics card- they'll just use a couple of the cores on your CPU. It makes sense, and is actually a good thing. Fewer discrete chips is better, as far as power consumption and heat, ease-of-programming and compatibility are concerned.
For the kind of really high performance stuff OpenCL is targeted to, we didn't. Look at the low level code in GnuMP, for instance.
No they haven't. Only as of last month have they had a release candidate for the developers-only crowd. I think you're thinking of CUDA, which is an nVidia-only technology similar to OpenCL, but differing in implementation (and I believe openness as well). Along with OpenCL, DirectX 11 is also bringing "Compute Shaders" into the DirectX model, making this kind of thing a requirement for a DX11 GPU.
Screw the rules, I have green hair!
The OpenCL spec already allowed for running code on a CPU or a GPU. It's just registered as a different type of device. So basically, they are enabling compiling the OpenCL programming language to the x86? I don't really see the story, here.
Sometimes I doubt your committment to SparkleMotion!
Note that this OpenCL implementation works for CPU only. GPU support is forthcoming.
However, we know that Mac OSX (Snow Leopard) will soon be shipping with an OpenCL implementation.
I think we can expect full OpenCL (CPU & GPU) support from Intel, ATI/AMD, and nVidia sooner rather than later.
The SX is for Sux!
Platform advocacy is like choosing a favorite severely developmentally disabled child.
This is essentially what it comes down to. Does OpenCL make parallel programming of heterogeneous processors easy? The answer is no, of course, and the reason is not hard to understand. Multicore CPUs and GPUs are two incompatible approaches to parallel computing. The former is based on concurrent threads and MIMD (multiple instructions, multiple data) while the latter uses an SIMD (single instruction, multiple data) configuration. They are fundamentally different and no single interface will get around that fact. OpenCL (or CUDA) is really two languages in one. Programmers will have to frequently flip their mode of thinking in order to take effective advantage of both technologies and this is the primary reason that heterogeneous processors will be a pain to program. The other is multithreading, which, as we all know, is a royal pain in the arse in its own right.
Obviously what is needed is a new universal parallel software model, one that is supported by a single *homogeneous* processor architecture. Unfortunately for the major players, they have so much money and resources invested in last century's processor technologies that they are stuck in a rut of their own making. They are like the Titanic on a collision course with a monster iceberg. Unless the big players are willing and able to make an about-face in their thinking (can a Titanic turn on a dime?), I am afraid that the solution to the parallel programming crisis will have to come from elsewhere. A true maverick startup will eventually turn up and revolutionize the computer industry. And then there shall be weeping and gnashing of teeth among the old guard.
Read How to Solve the Parallel Programming Crisis if you're interested in an alternative approach to parallel computing.
I agree that the eventual goal is everything on the CPU. After all, that is the great thing about a computer. You do everything in software, you don't need dedicated devices for each feature, you just need software. However, even as powerful as CPUs are, they are WAY behind what is needed to get the kind of graphics we do out of a GPU. At this point in time, dedicated hardware is still far ahead of what you can do with a CPU. So it is coming, but probably not for 10+ years.
So, where can one obtain an open source OpenCL compiler? (Or, to be more precise, an open source compiler which can take OpenCL compliant code and produce object code that will run on my GPU via the driver stack?)
Hi, I am working on an OpenCL implementation sponsored by Google Summer of Code. It is nearly done supporting the CPU and the Cell processor. This news has come as a blow to me. I have struggled so much with my open source project and now a big company is going to come and trample all over me. boo hoo. http://github.com/pcpratts/gcc_opencl/tree/master
And to take that one step further, both Intel and AMD are planning on integrating the GPU on-die in future products, just like the math coprocessor moved on-die 15-20 years ago.
The problem is typically with how you set up your data structures to solve the problem at hand. When I converted my CPU code to run on a GPU, I had to go through and re-work the problem. I changed the way my data was stored, which was previously optimized for CPU serial processing and caching etc. to something that matched the GPU's model of queuing up read requests of multiple adjacent words while previously read memory is being processed.
These types of changes aren't really optimizations the compiler can do.
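The comment doesn't say exactly which layout was used, so take this as one common instance of that kind of rework rather than the poster's actual code: going from an array of records to one array per field, so that neighbouring work items read neighbouring words. The function name and the (x, y, z) record shape are assumptions for illustration.

```python
# Hypothetical example: each record is assumed to be an (x, y, z) tuple.

def aos_to_soa(records):
    """Convert a list of (x, y, z) records into one list per field."""
    xs = [r[0] for r in records]
    ys = [r[1] for r in records]
    zs = [r[2] for r in records]
    return {"x": xs, "y": ys, "z": zs}
```

The transformation itself is trivial; the point is that every piece of code touching the data has to change with it, which is why it ends up as a manual redesign.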
It was already explained above. CPU and GPU are very different at handling things, meaning that top level algorithms used are very different.
Unless of course you can point at a compiler which can rethink and rewrite the program.
All hope abandon ye who enter here.
Now that we have CPUs with literally more cores than we know what to do with, it makes sense to use those cores for graphics processing. I think that within a few years, we'll start seeing games that don't require a high-end graphics card- they'll just use a couple of the cores on your CPU.
LOL. That's funny, because this is about exactly the opposite -- using the very impressive floating point number crunching power of the GPU to do the work that the CPU used to do. OpenCL is essentially an API for being able to use your GPU for general purpose computing. Not a way to use your CPU to do rendering (OpenGL already does that).
Your CPU, four cores and all, is a LOOOOOOONG way from being able to do what your graphics card does wrt 3d rendering. That's okay, the tradeoffs are different for something that's supposed to be able to run databases just as competently as finite element analysis. But for raw floating point throughput on embarrassingly parallelizable tasks -- which the 3d rendering pipeline is, and thus why GPUs are optimized around it -- the GPU is miles ahead. Thus the motivation to use it instead of the CPU.
It makes sense, and is actually a good thing. Fewer discrete chips is better, as far as power consumption and heat, ease-of-programming and compatibility are concerned.
Well you got that right at least, but the way it's going to happen is that you're still going to have a GPU, but it's going to be on the same piece of silicon as your CPU. Both Intel and AMD have combined CPU/GPU products in the pipe that are supposed to be released in 2011, meaning they have been in development for a number of years now.
Discrete graphics will live on for quite a while though in situations where low power is less important than performance. Both cpu and gpu having separate memory with their own memory controllers optimized for their needs is a big advantage over sharing a memory bus and memory controller. Not having to fit both functions within a single socket's TDP budget is another.
Eventually, the built-in UMA graphics may become good enough that it doesn't make sense to have a separate card. In the meantime, discrete graphics cards will live on, and the GPU in general ain't going anywhere -- it's only becoming even more important!
The enemies of Democracy are
CUDA allows you to easily compile C code to run on the GPU, not the reverse.
If history tells us anything, it's quite the opposite. For years, graphics cards have been getting more and more cores and applications (especially games or anything 3D) have come to rely on them much more than the CPU. I remember playing Half-life 2 with a 5 year old processor and a new graphics card...and it worked pretty well.
The CPU folk, meanwhile, are being pretty useless. CPUs haven't gotten much faster in the past 5 years; they just add more cores. Which is fine from the perspective of a multiprocess OS, but the fact remains that some algorithms you can parallelize, others you can't...and a GPU with hundreds of cores is only going to be as fast at one of these as its fastest core.
We'll see. My bet is if Intel/AMD just keep dumping more cores in the processors, they'll risk becoming irrelevant as we'll have more processors than we know what to do with (see the SGI's Prism...which was terribly slow despite having dozens of processors.)
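The point about algorithms that can't be parallelized is usually put as Amdahl's law. A minimal sketch, with a made-up function name, assuming a fixed fraction of the work scales perfectly across cores and the rest stays serial:

```python
# Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction
# of the work that parallelizes and n is the number of cores.

def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)
```

Even if 90% of a program parallelizes, the speedup stays below 10x no matter how many cores you throw at it, because the serial 10% still runs on one core.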
-- Political fascism requires a Fuhrer.
The PR freaks have always said CUDA could and would work wherever nvidia wants, i.e. CPUs or supercomputers. Look up the Stanford "Computer Systems Colloquium - Winter 2008 - Scalable Parallel Programming with CUDA on Manycore GPUs (February 27, 2008)" talk by John Nickolls from NVIDIA. Video should be on the net somewhere.
"Now that we have CPUs with literally more cores than we know what to do with,"
For many problems, multi-core CPUs aren't even close to having enough power; that's why there is all of this interest in utilizing the GPU's processing power.
They are different ends of a spectrum: CPU generally=fast serial processing, GPU generally=slow serial, fast parallel. Some problems require fast serial processing, some require fast parallel processing and some are in between. Both are valuable tools and neither will replace the other, although merging them onto one chip with shared memory/cache would be great.
AMD obviously has a vested interest in making their scheme an industry standard, so of course they'd want to support Larrabee with their GPGPU stuff. Larrabee has x86 lineage (of some sort, I'm not clear on exactly what or how), so they'd have to have at least some x86 support to be able to use their scheme on Larrabee. It seems to me that if they were going to bake some x86 support in there, they may as well add regular CPUs in as well (if you already wrote 90% of it, why not write the other 10%?).
I don't really know anything about this kind of stuff, but this news strikes me as unsurprising, given the environment.
Stasis is death. Embrace change.
Where is the link to the source tarball?
Can't find it, just some more mumbo jumbo about delivering seamless integration with the goatse paradigm shift, blah, blah, etc.
http://everything2.com/index.pl?node_id=1311164&displaytype=linkview&lastnode_id=1311164
Exactly the same thing.
I said EXACTLY!
[wanders off, muttering and picking bugs out of beard]
My bad, I forgot it was still for developers only. Although frankly it's so easy to become an nVidia "developer" that it may as well be called a public beta.
(Ars makes a similar point:)
the fact that Larrabee runs x86 will be irrelevant; so Intel had better be able to scale up Larrabee's performance
If AMD is working on an abstraction layer that lets OpenCL run on x86, could the reverse be in the works, having x86 code ported to run on CPU+GPGPU as one combined processing resource? AMD may be trying to make its GPUs more like what Intel is trying to achieve with Larrabee - a bridge between CPU and GPU -- yet Intel is originally trying to undermine the GPU as a unique processing platform.
After logging in slashdot still does not take you back to the page you were on. It's been that way for 20 years.
I used to have a 486 40mhz DLC cpu from Texas Instruments. It didn't have a math co-processor... Can you believe it? A TI chip that couldn't do math!
We used to joke that DLC stood for:
Da Low Cost
The DX is for Dux!
"Unless of course you can point at a compiler which can rethink and rewrite the program."
That's exactly what Lisp was invented for.
Pity we abandoned it in the 1980s and left it half-built.
You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
Those types of change aren't all that radical, even though they're not commonly implemented in compilers at the moment, as far as I know.
You're not describing major algorithm changes, just reorganising data to suit different batching requirements, reorganising loops and so on.
Reorganising loops is decades old already.
"why didnt they do this 10 years ago"-moment. Go A-Team!
Winkey shortcut mapping for 64bit windows. WinKeyPlus
Hmmmm what about a *working*, *full-featured* linux driver instead of SDKs?
The advantage of being able to run the same code on a GPU and an x86 multicore is that some parts of some apps run faster on one or the other, and with a compiler that targets both you can easily move apps between them.
GPU architectures are becoming very similar to CPU architectures, enough so that it is becoming possible to write compilers that generate efficient code for each. On the NVIDIA side, my group wrote an emulator for running CUDA on x86 ( http://code.google.com/p/gpuocelot/ ). The step from an emulator like this to a compiler is not huge...
The FX is for...
Dammit, why didn't I know this stuff as teen?
My sig will be released in 2015 third quarter. Rating pending.
Pure functional or dataflow programming FTW!
I know tobacco is bad for you, so I smoke weed with crack.