Microsoft Demos C++ AMP At AMD Developers Summit

Posted by samzenpus on Wednesday June 15, 2011 @02:06PM from the take-a-look-at-this dept.

MojoKid writes "The second day of the AMD Fusion Developer Summit began with a keynote from Microsoft's Herb Sutter, Principal Architect, Native Languages and resident C++ guru. Herb's talk centered on heterogeneous computing and the changes coming with future versions of Visual Studio and C++. One of the main highlights was a demo of a C++ AMP application that took advantage of all of the compute resources within several of the demo systems, from workstations to netbooks. The physics demo switched seamlessly among CPU, integrated GPU, and discrete GPU resources, showcasing the performance capabilities of each. As additional bodies are added, the workload increases, ramping up to over 600 GFLOPS of compute performance."

37 of 187 comments

Re:Where's my C# version? by exomondo · 2011-06-15 14:13 · Score: 3, Funny

Oh but haven't you heard? They're dropping everything else for HTML5/JavaScript ;)
AMP? by c0lo · 2011-06-15 14:28 · Score: 5, Insightful

Gosh, I've come to hate this acronymia that is so endemic in IT.
In this context, AMP doesn't stand for amplifier, adenosine monophosphate or ampere, but for "Accelerated Massive Parallelism". Seems like a Microsoftism for the more traditional term "Massively Parallel Processing".

--
Questions raise, answers kill. Raise questions to stay alive.
1. Re:AMP? by Alex+Belits · 2011-06-15 14:36 · Score: 5, Insightful
  
  Microsoft has a history of inventing names and acronyms that collide with established terms in unrelated areas. I suspect they are trying to get potential users to see a new name as something they have heard of but whose actual meaning they don't know, so the term looks "established" in those people's eyes.
  For example, ".Net".
  
  --
  Contrary to the popular belief, there indeed is no God.
2. Re:AMP? by Jeffrey_Walsh+VA · 2011-06-15 15:05 · Score: 2
  
  I thought it was MS's answer to LAMP: Apache, MySQL, PHP but on Windows.
3. Re:AMP? by IQgryn · 2011-06-15 15:15 · Score: 2
  
  WAMP? I guess it's still better than WinCE...
4. Re:AMP? by Midnight+Thunder · 2011-06-15 15:18 · Score: 2
  
  Surely that would be WISA? Windows, IIS, SQL Server, ASP.
  
  --
  Jumpstart the tartan drive.
5. Re:AMP? by overlordofmu · 2011-06-15 15:33 · Score: 2
  
  Haven't you heard of AFT? Acronyms for techies?
6. Re:AMP? by phantomfive · 2011-06-15 15:38 · Score: 2, Insightful
  
  The worst part is when Microsofties try to get you to accept the term as something real, and that it makes Microsoft better. Example:
  
  Microsoftie: isn't Microsoft great? They have managed code and no one else does.
  Me: Isn't Java the same?
  Microsoftie: No, that's a virtual machine, that's different!
  Me: ..........
  
  --
  "First they came for the slanderers and i said nothing."
7. Re:AMP? by koreaman · 2011-06-15 16:10 · Score: 2
  
  C# is a "scripting language"?
  Maybe when you graduate from high school, you'll learn that how cool you are is unrelated to the height of the language you program in.
  
  --
  Le français vous intéresse?
8. Re:AMP? by slimjim8094 · 2011-06-15 16:28 · Score: 3, Interesting
  
  To be fair, doing any nontrivial assembly will put some serious hair on your balls. But it's just not very good for (almost*) any real work.
  I use C for performance-critical code, C++ for complex performance-significant components (like an OpenGL million-poly renderer), Java or C# (depending on target platform(s)) for large but otherwise-modest programs, and scripting languages (mostly Python) for one-off programs or little tools that don't justify the involvement of a more heavyweight language.
  Use the right tool for the job, as always. Can't go too far wrong with Java, and if you're going to hit its performance wall, you should know up front.
  * Only large-scale assembly coding I've ever had to do was for a compilers class, but there was obviously no way around it. Fascinating to learn and do, but I sure hope I'm done with it...
  
  --
  I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
9. Re:AMP? by Joce640k · 2011-06-15 17:31 · Score: 3, Insightful
  
  How hard is it to write "AMP (Accelerated Massive Parallelism)" in a summary?
  
  --
  No sig today...
10. Re:AMP? by hedwards · 2011-06-15 17:41 · Score: 2
  
  Yes, but I think the relevant question is: how precisely will they kill this one? They have a history of devising cool technology and then managing to fuck it up.
Re:Grand Central Dispatch by Daniel_Staal · 2011-06-15 14:45 · Score: 3, Informative

The most relevant difference is that it automatically uses different types of compute resources for the same task, depending on what's available. Core Image can do some of that, but it's limited to graphics workloads.
So it's Grand Central Dispatch + Core Image + a bit.

--
'Sensible' is a curse word.
Re:Who would be the target customers? by kelemvor4 · 2011-06-15 14:54 · Score: 3, Insightful

Them and pretty much anyone who writes c++ code and wants their software to run faster. I suppose if you're interested in having your software run slower, this may not be for you.
Normally I would say that too by symbolset · 2011-06-15 14:59 · Score: 4, Insightful

This is a key innovation. It looks like an important new step we've needed for a long time. It looks like they have done well with it.
Of course it should be inspected for traps. From these folks there are always traps. But this particular time I think this is important enough that we should look closely at it to see if there isn't something useful we can safely extract, while staying mindful of the traps.
I've been here a long time. I've posted nearly 5,000 comments here over 8 years. Never once before have I said this about a Microsoft technology: This deserves a look.

--
Help stamp out iliturcy.
1. Re:Normally I would say that too by hedwards · 2011-06-15 15:25 · Score: 2
  
  I know that 5,000 comments isn't much to you, the one that has posted probably a million comments, but for the rest of us it's quite a bit.
2. Re:Normally I would say that too by Anonymous Coward · 2011-06-15 15:30 · Score: 5, Interesting
  
  As someone actually at the event, someone who attended both the keynote and the later (and more in-depth) technical session, and someone who is employed as a GPU programmer, I would say that it's being vastly overblown. It is very easy to look at the examples in the keynote (dense matrix multiplication with very little code modification, and an N-body simulation for which the code is not shared) and believe that this is finally some panacea for the difficulties involved in GPU computing and massively parallel computing in general. But the reality is, much like some approaches before it, C++ AMP simply elides some of the verbosity in the CUDA/OpenCL APIs regarding memory allocation, thread configuration, etc. The matrix multiplication example appears dead simple because matrix multiplication on a GPU is dead simple. As soon as you start trying to write more advanced applications with this, you find that you need to take advantage of a fast shared memory to get worthwhile performance gains -- to do that, you add "tiles" to your "grid" (in CUDA terms, "blocks" and "grid", in OpenCL terms, "local workgroups" and "global workgroup"). As soon as your output starts getting more complicated than a nice, deterministic matrix multiplication or N-body simulation, you may find that you have potential race conditions that you have to address yourself. And when you've broken up your problem into a tiled grid, taken fast local shared memory and slow global shared memory into account, and ensured that you have no race conditions, you've basically done all of the work of writing a CUDA or OpenCL kernel. Only now you've done it in a way that is very proprietary, instead of the (comparatively open) CUDA and (way the fuck more open) OpenCL.
  It's unfortunate that it is being sold as this amazing world-changing breakthrough, because although it is not by any stretch that, it is in fact quite a nice concept. This is something, like Microsoft's PPL, that can be used to parallelize existing code very easily provided the code is parallelizable and written in a parallel-friendly manner. It is not something, however, that will do the work of parallelization or even the work of optimizing parallel-friendly code for GPU hardware for you.
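
The tiling step described in the comment above can be sketched on the CPU in plain, standard C++. This is not C++ AMP, CUDA, or OpenCL code, and it does not use any of their APIs. The names `matmul_naive`, `matmul_tiled`, and `TILE` are hypothetical, chosen only for illustration. The sketch shows the restructuring the commenter refers to: the output is split into fixed-size tiles so that each tile works on a small block of the inputs, which on a GPU is the block you would stage in fast shared memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile edge length. On a GPU this would be chosen so that one
// tile of each input fits in fast shared (tile-local) memory.
constexpr std::size_t TILE = 4;

// Reference version: plain triple loop over row-major n x n matrices.
std::vector<float> matmul_naive(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}

// Tiled version: the three outer loops walk over tiles, the three inner
// loops stay inside one tile. Each (i0, j0) output tile is independent of
// the others, which is what would map to one GPU tile/block/workgroup.
std::vector<float> matmul_tiled(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i0 = 0; i0 < n; i0 += TILE) {
        for (std::size_t j0 = 0; j0 < n; j0 += TILE) {
            for (std::size_t k0 = 0; k0 < n; k0 += TILE) {
                // Clamp tile bounds so n need not be a multiple of TILE.
                const std::size_t i1 = std::min(i0 + TILE, n);
                const std::size_t j1 = std::min(j0 + TILE, n);
                const std::size_t k1 = std::min(k0 + TILE, n);
                for (std::size_t i = i0; i < i1; ++i) {
                    for (std::size_t j = j0; j < j1; ++j) {
                        float sum = c[i * n + j];
                        for (std::size_t k = k0; k < k1; ++k) {
                            sum += a[i * n + k] * b[k * n + j];
                        }
                        c[i * n + j] = sum;
                    }
                }
            }
        }
    }
    return c;
}
```

Calling `matmul_tiled(a, b, n)` should give the same result as `matmul_naive(a, b, n)`; only the traversal order changes. The point of the sketch is that the tiling, bounds handling, and data placement are still the programmer's job, whichever API ends up expressing them.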
3. Re:Normally I would say that too by PhrostyMcByte · 2011-06-15 16:18 · Score: 2
  
  An interesting part of AMP is that it is platform-agnostic. Their implementation uses DirectCompute under the hood, but none of that is exposed in the API. This means it could probably be implemented for *nix.
  Believe it or not, Microsoft has also done this a couple other times recently -- with real results -- and it all comes from the native C++ team as part of Microsoft's new-found focus on C++ after so many years in .NET mode.
  The Parallel Patterns Library integrates extremely well. It knows that it's a C++ library and doesn't try to act like a Windows library, and certainly not like a COM library. It's pure, modern C++. So much so, in fact, that Intel's Threading Building Blocks provides a compatible implementation that is cross-platform. If AMP ends up being similar, this could indeed be a very cool thing.
4. Re:Normally I would say that too by The+Master+Control+P · 2011-06-15 18:15 · Score: 2
  
  I'm a PhD student whose job is currently to get an MHD code running on multiple GPUs (getting it to run really fast on /one/ I have not yet quite done), and from my experiences, I sort of figured these were trivially parallel types of things.
  
  There's only one kind of kernel I deal with that actually gets near full utilization without extensive hand-tuning (i.e. that could be written by a computer without human guidance) - the ones that do simple atomic operations on N input arrays and spit out M output arrays. Everything else takes weeks of agonizing hand-holding, tuning, and the occasional use of percussive maintenance before it gets past 10-20% efficiency. As you say, by the time you've solved the "parallel BS" for any nontrivial problem, you may as well have just written the GPU code yourself.
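
The "N input arrays in, M output arrays out" kind of kernel mentioned above can be sketched in plain C++ as follows. This is an illustrative CPU loop, not GPU code, and it assumes that "simple atomic operations" means simple per-element arithmetic. The names `SumProd` and `elementwise_sum_prod` are hypothetical. It takes two inputs and produces two outputs; every index is computed independently of every other, which is why this shape maps onto parallel hardware with little tuning.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: two output arrays (M = 2).
struct SumProd {
    std::vector<float> sum;
    std::vector<float> prod;
};

// Two input arrays (N = 2), two output arrays. Iteration i reads only
// x[i] and y[i] and writes only sum[i] and prod[i], so there are no
// dependencies between iterations and no race conditions to resolve.
SumProd elementwise_sum_prod(const std::vector<float>& x,
                             const std::vector<float>& y) {
    assert(x.size() == y.size());
    SumProd out{std::vector<float>(x.size()), std::vector<float>(x.size())};
    for (std::size_t i = 0; i < x.size(); ++i) {
        out.sum[i] = x[i] + y[i];
        out.prod[i] = x[i] * y[i];
    }
    return out;
}
```

Anything that needs neighbouring elements, reductions, or data-dependent writes breaks this independence, and that is where the hand-tuning described above starts.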
5. Re:Normally I would say that too by shutdown+-p+now · 2011-06-15 18:43 · Score: 2
  
  I have no knowledge of or experience with CUDA or OpenCL (other than the general vague idea of what these are for), so let me clarify something. How easy is it to write a program in either of those that parallelizes across all computational devices available to the system (not just GPU, but also CPU cores), and can change the specific devices being used on the fly, all without recompiling or restarting the binary? My impression from the demo, at least, was that this is the main selling point, rather than it just being easier than OpenCL.
Re:Who would be the target customers? by SanityInAnarchy · 2011-06-15 15:29 · Score: 5, Insightful

Well, assuming your code has embarrassingly parallel components. Otherwise, it's pretty useless.

--
Don't thank God, thank a doctor!
Re:Where's my C# version? by c0lo · 2011-06-15 16:03 · Score: 2

Java is equally garbage.
Mm-yeeaah!... But, at least, it has a garbage collector. :)

--
Questions raise, answers kill. Raise questions to stay alive.
Re:Grand Central Dispatch by inglorion_on_the_net · 2011-06-15 16:30 · Score: 2

Can't speak for others, but in my case it's

Don't know what AMP is and can't understand TFS/TFA
Neither the summary nor the article seems to explain what AMP is.
For the benefit of everyone else who is trying to figure it out, here is a link: Introducing C++ Accelerated Massive Parallelism (C++ AMP). To quote from that page:

I'm excited to announce that we are introducing a new technology that helps C++ developers use the GPU for parallel programming. Today at the AMD Fusion Developer Summit, we announced C++ Accelerated Massive Parallelism (C++ AMP). (…) By building on the Windows DirectX platform, our implementation of C++ AMP allows you to target hardware from all the major hardware vendors. (…)
So, from a cursory look, this seems to be similar in purpose to OpenCL.

--
Please correct me if I got my facts wrong.
Re:Where's my C# version? by c0lo · 2011-06-15 16:31 · Score: 2

Java is equally garbage.
Mm-yeeaah!... But, at least, it has a garbage collector. :)
If only it could collect itself.
You need something recursive for that: try Prolog and/or the "GNU's not UNIX" toolset :)

--
Questions raise, answers kill. Raise questions to stay alive.
CUDA C++ and Thrust by gupg · 2011-06-15 16:37 · Score: 2

This is an awesome development - Microsoft adding support for GPU computing in their mainstream tools and C++.

Today, CUDA C++ already provides a full C++ implementation on NVIDIA's GPUs:
http://developer.nvidia.com/cuda-downloads

And the Thrust template library provides a set of data structures and functions for GPUs (similar in spirit to STL):
http://code.google.com/p/thrust/

- biased NVIDIA employee
1. Re:CUDA C++ and Thrust by drewm1980 · 2011-06-15 17:58 · Score: 2
  
  I am a CUDA C++ programmer. My biggest complaint about programming tools for the GPU is that there are no dense linear algebra libraries that work at the SM level. For my application I had to re-implement a big chunk of BLAS and part of LAPACK from scratch so that each SM runs a different problem instance. On the CPU you can just use OpenMP + single-threaded BLAS to achieve the same granularity of parallelism. The Thrust API does not address this granularity of parallelism. I'm eager to see if the AMP API does.
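
The CPU pattern mentioned above (OpenMP across problem instances, with a single-threaded routine per instance) might look roughly like the sketch below. It is an assumption-laden illustration: `Problem`, `gemv_single`, and `solve_batch` are hypothetical names, and a hand-written matrix-vector product stands in for a real single-threaded BLAS call. The `#pragma omp parallel for` line takes effect only when the compiler is run with OpenMP enabled; otherwise it is ignored and the loop runs serially with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one small, independent problem instance.
struct Problem {
    std::size_t n;          // dimension
    std::vector<float> a;   // n x n matrix, row-major
    std::vector<float> x;   // vector of length n
};

// Single-threaded y = A * x; a stand-in for a single-threaded BLAS routine.
std::vector<float> gemv_single(const Problem& p) {
    assert(p.a.size() == p.n * p.n && p.x.size() == p.n);
    std::vector<float> y(p.n, 0.0f);
    for (std::size_t i = 0; i < p.n; ++i) {
        for (std::size_t j = 0; j < p.n; ++j) {
            y[i] += p.a[i * p.n + j] * p.x[j];
        }
    }
    return y;
}

// Parallelism is across instances, not inside one: each loop iteration
// solves a whole problem and writes only to its own slot in `results`.
std::vector<std::vector<float>> solve_batch(const std::vector<Problem>& batch) {
    std::vector<std::vector<float>> results(batch.size());
    const long count = static_cast<long>(batch.size());
    #pragma omp parallel for
    for (long i = 0; i < count; ++i) {
        const std::size_t idx = static_cast<std::size_t>(i);
        results[idx] = gemv_single(batch[idx]);
    }
    return results;
}
```

The GPU analogue the commenter is asking for would assign one such instance to each SM; nothing in this sketch says whether the AMP API offers that.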
This could push new hardware by jader3rd · 2011-06-15 16:54 · Score: 2

I can see this pushing new hardware. More developers start writing with C++ AMP, because it lowers the bar of entry for writing code that makes use of the GPU, and before we know it every little application will have some C++ AMP. But a lot of older computers which don't have DirectX 11 graphics cards have to emulate the DirectX DirectCompute API on the CPU, which is noticeably glacial. People see an application run blazingly fast on one computer, see it crawl on theirs and ask why it's so slow. Either they find out that they need a new GPU, or they figure that their computer is getting old and they need to buy a new one (which would just happen to have a decent GPU in it).
1. Re:This could push new hardware by Suiggy · 2011-06-15 17:14 · Score: 3, Informative
  
  [quote]But, a lot of older computers which don't have DirectX 11 graphic cards have to emulate the DirectX DirectCompute API on the CPU[/quote].
  They don't really have to emulate anything; most of the kernel (as in "compute kernel") functions and operations in DirectCompute have a one-to-one mapping with most CPUs' SIMD instruction sets, such as x86's SSE/AVX. The primary difference then is that on the CPU you have far fewer cores, while on the GPU you may have thousands of cores/streaming processors, but you have higher memory latencies and at best only an L1 & L2 cache.
Re:Microsoft C++ by Suiggy · 2011-06-15 17:03 · Score: 3, Informative

That was back with MSVC++ 6.0 released in 1998 before the ISO C++ draft was fully ratified. MSVC++ today is one of the more standards compliant compilers, although their template instantiation mechanism is still somewhat broken so that it can still support their legacy MFC crap.
Re:Microsoft C++ by shutdown+-p+now · 2011-06-15 17:45 · Score: 2

A "while ago", gcc didn't support C++ namespaces. So?
So it's OpenCL then by SuperKendall · 2011-06-15 19:02 · Score: 3, Informative

No, it's really a lot more like OpenCL.
Which is not Mac only BTW... but you can use it in OS X or iOS development.
Also Apple's Accelerate library (C library) takes advantage of OpenCL for BLAS and Linpack and so on...

--
"There is more worth loving than we have strength to love." - Brian Jay Stanley
1. Re:So it's OpenCL then by TheRaven64 · 2011-06-15 21:28 · Score: 2
  
  OpenCL allows you to write code in a specialised dialect of C that will run on GPUs or (not spectacularly efficiently) on CPUs. A better comparison is HMPP (currently supported by CAPS and PathScale's C/C++ and Fortran compilers), which allows you to annotate sections of code with pragmas and then have the compiler automatically run them on different processing units, including GPUs and other cores on the CPU.
  
  --
  I am TheRaven on Soylent News
Why go Microsoft? by loufoque · 2011-06-15 22:31 · Score: 2

There are already tons of such tools, most of which are not tied to specific architectures, operating systems, or compilers.
Really, why would you go Microsoft on this at all? Clusters and supercomputers usually don't even run Windows at all.
Re:Yet another attempt at vendor lock-in. by Anonymous Coward · 2011-06-15 23:04 · Score: 2, Insightful

I don't think there are open efforts which are attempting to do this by extending a C++ compiler.
Open efforts (which clearly excludes CUDA) usually invent a new language (usually a subset of C) with much more restricted language features, one that is only loosely integrated with the host code.
I think they are the first and they are doing the right things here.
It is nice to have the host code tightly integrated to the GPU code, and with most of the useful C++ language features there.
Re:Where's my C# version? by fitten · 2011-06-16 00:31 · Score: 2

Yeah, I know you're trolling, but C# is a good language. I've coded millions of lines in C, C++, and C# and I can tell you which I'd rather code in any day of the week and twice on Sunday. Combined with VS, you simply get. stuff. done. very quickly and very easily.
Re:Where's my C# version? by Mongoose+Disciple · 2011-06-16 01:06 · Score: 2

Ahh, the sinking feeling of having written a serious response to a post that's accruing funny mods...
Re:Where's my C# version? by terjeber · 2011-06-16 01:42 · Score: 2

Ah, QNX :-)