Microsoft Demos C++ AMP At AMD Developers Summit

Posted by samzenpus on Wednesday June 15, 2011 @02:06PM from the take-a-look-at-this dept.

MojoKid writes "The second day of the AMD Fusion Developer Summit began with a keynote from Microsoft's Herb Sutter, Principal Architect, Native Languages and resident C++ guru. Herb's talk centered on heterogeneous computing and the changes coming with future versions of Visual Studio and C++. One of the main highlights was a demo of a C++ AMP application that seamlessly took advantage of all of the compute resources within several demo systems, from workstations to netbooks. The physics demo switched seamlessly among CPU, integrated GPU, and discrete GPU resources, showcasing the performance capabilities of each. As additional bodies are added, the workload increases, ramping up to over 600 GFLOPS of compute performance."

13 of 187 comments

Re:Where's my C# version? by exomondo · 2011-06-15 14:13 · Score: 3, Funny

Oh but haven't you heard? They're dropping everything else for HTML5/JavaScript ;)
AMP? by c0lo · 2011-06-15 14:28 · Score: 5, Insightful

Gosh, I've come to hate this acronymia that is so endemic in IT.
In this context, AMP doesn't stand for amplifier, adenosine monophosphate or ampere, but for "Accelerated Massive Parallelism". Seems like a Microsoftism for the more traditional term "Massively Parallel Processing".

--
Questions raise, answers kill. Raise questions to stay alive.
1. Re:AMP? by Alex+Belits · 2011-06-15 14:36 · Score: 5, Insightful
  
  Microsoft has a history of inventing names and acronyms that collide with established terms in unrelated areas. I suspect they are trying to get potential users to see a new name as something they have heard of without knowing its actual meaning, so the term looks "established" in those people's eyes.
  For example, ".Net".
  
  --
  Contrary to the popular belief, there indeed is no God.
2. Re:AMP? by slimjim8094 · 2011-06-15 16:28 · Score: 3, Interesting
  
  To be fair, doing any nontrivial assembly will put some serious hair on your balls. But it's just not very good for (almost*) any real work.
  I use C for performance-critical code, C++ for complex performance-significant components (like an OpenGL million-poly renderer), Java or C# (depending on target platform(s)) for large but otherwise-modest programs, and scripting languages (mostly Python) for one-off programs or little tools that don't justify the involvement of a more heavyweight language.
  Use the right tool for the job, as always. Can't go too far wrong with Java, and if you're going to hit its performance wall, you should know up front.
  * Only large-scale assembly coding I've ever had to do was for a compilers class, but there was obviously no way around it. Fascinating to learn and do, but I sure hope I'm done with it...
  
  --
  I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
3. Re:AMP? by Joce640k · 2011-06-15 17:31 · Score: 3, Insightful
  
  How hard is it to write "AMP (Accelerated Massive Parallelism)" in a summary?
  
  --
  No sig today...
Re:Grand Central Dispatch by Daniel_Staal · 2011-06-15 14:45 · Score: 3, Informative

The most relevant difference is that it automatically uses different types of compute resources for the same task, depending on what's available. Core Image can do some of that, but it's limited to graphics workloads.
So it's Grand Central Dispatch + Core Image + a bit.

--
'Sensible' is a curse word.
Re:Who would be the target customers? by kelemvor4 · 2011-06-15 14:54 · Score: 3, Insightful

Them and pretty much anyone who writes C++ code and wants their software to run faster. I suppose if you're interested in having your software run slower, this may not be for you.
Normally I would say that too by symbolset · 2011-06-15 14:59 · Score: 4, Insightful

This is a key innovation. It looks like an important new step we've needed for a long time. It looks like they have done well with it.
Of course it should be inspected for traps. From these folks there are always traps. But this particular time I think this is important enough that we should look closely at it to see if there isn't something useful we can safely extract, while being mindful of the traps.
I've been here a long time. I've posted nearly 5,000 comments here over 8 years. Never once before have I said this about a Microsoft technology: This deserves a look.

--
Help stamp out iliturcy.
1. Re:Normally I would say that too by Anonymous Coward · 2011-06-15 15:30 · Score: 5, Interesting
  
  As someone actually at the event, someone who attended both the keynote and the later (and more in-depth) technical session, and someone who is employed as a GPU programmer, I would say that it's being vastly overblown. It is very easy to look at the examples in the keynote (dense matrix multiplication with very little code modification, and an N-body simulation for which the code is not shared) and believe that this is finally some panacea for the difficulties involved in GPU computing and massively parallel computing in general. But the reality is, much like some approaches before it, C++ AMP simply elides some of the verbosity in the CUDA/OpenCL APIs regarding memory allocation, thread configuration, etc.
  The matrix multiplication example appears dead simple because matrix multiplication on a GPU is dead simple. As soon as you start trying to write more advanced applications with this, you find that you need to take advantage of a fast shared memory to get worthwhile performance gains -- to do that, you add "tiles" to your "grid" (in CUDA terms, "blocks" and "grid"; in OpenCL terms, "local workgroups" and "global workgroup"). As soon as your output starts getting more complicated than a nice, deterministic matrix multiplication or N-body simulation, you may find that you have potential race conditions that you have to address yourself.
  And when you've broken up your problem into a tiled grid, taken fast local shared memory and slow global shared memory into account, and ensured that you have no race conditions, you've basically done all of the work of writing a CUDA or OpenCL kernel. Only now you've done it in a way that is very proprietary, instead of the (comparatively open) CUDA and (way the fuck more open) OpenCL.
  It's unfortunate that it is being sold as this amazing world-changing breakthrough, because although it is not by any stretch that, it is in fact quite a nice concept. This is something, like Microsoft's PPL, that can be used to parallelize existing code very easily provided the code is parallelizable and written in a parallel-friendly manner. It is not something, however, that will do the work of parallelization or even the work of optimizing parallel-friendly code for GPU hardware for you.
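For readers unfamiliar with the "tiles" idea mentioned above, here is a minimal sketch of it in plain, portable C++ running on the CPU. It is not C++ AMP, CUDA, or OpenCL code: the names `kTile`, `matmul_naive`, and `matmul_tiled` are hypothetical, and the small local arrays only stand in for a GPU's fast on-chip shared memory. The point is just to show how much restructuring tiling asks of the programmer compared with the "dead simple" version.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile edge; on a GPU it would be sized to fit fast on-chip memory.
constexpr std::size_t kTile = 4;

// Naive reference: C = A * B for square n x n row-major matrices.
std::vector<float> matmul_naive(const std::vector<float>& a,
                                const std::vector<float>& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    return c;
}

// Tiled version: each (ti, tj) output tile plays the role of one GPU tile/block.
std::vector<float> matmul_tiled(const std::vector<float>& a,
                                const std::vector<float>& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t ti = 0; ti < n; ti += kTile)
        for (std::size_t tj = 0; tj < n; tj += kTile)
            for (std::size_t tk = 0; tk < n; tk += kTile) {
                // Clamp the tile extents so n need not be a multiple of kTile.
                const std::size_t ih = std::min(kTile, n - ti);
                const std::size_t jh = std::min(kTile, n - tj);
                const std::size_t kh = std::min(kTile, n - tk);

                // Stage one tile of A and one tile of B into small local
                // buffers (the CPU stand-in for fast shared memory).
                float ta[kTile][kTile] = {};
                float tb[kTile][kTile] = {};
                for (std::size_t i = 0; i < ih; ++i)
                    for (std::size_t k = 0; k < kh; ++k)
                        ta[i][k] = a[(ti + i) * n + (tk + k)];
                for (std::size_t k = 0; k < kh; ++k)
                    for (std::size_t j = 0; j < jh; ++j)
                        tb[k][j] = b[(tk + k) * n + (tj + j)];

                // On a GPU, the threads of a tile would synchronize here
                // before reading the staged data; skipping that barrier is
                // one way the race conditions mentioned above creep in.
                for (std::size_t i = 0; i < ih; ++i)
                    for (std::size_t j = 0; j < jh; ++j) {
                        float sum = 0.0f;
                        for (std::size_t k = 0; k < kh; ++k)
                            sum += ta[i][k] * tb[k][j];
                        c[(ti + i) * n + (tj + j)] += sum;
                    }
            }
    return c;
}
```

Calling `matmul_tiled(a, b, n)` should give the same result as `matmul_naive(a, b, n)` up to floating-point summation order. The extra staging and clamping logic is roughly the work that remains the programmer's job whichever API is used.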
Re:Who would be the target customers? by SanityInAnarchy · 2011-06-15 15:29 · Score: 5, Insightful

Well, assuming your code has embarrassingly parallel components. Otherwise, it's pretty useless.

--
Don't thank God, thank a doctor!
Re:Microsoft C++ by Suiggy · 2011-06-15 17:03 · Score: 3, Informative

That was back with MSVC++ 6.0, released in 1998 before the ISO C++ draft was fully ratified. MSVC++ today is one of the more standards-compliant compilers, although their template instantiation mechanism is still somewhat broken so that it can still support their legacy MFC crap.
Re:This could push new hardware by Suiggy · 2011-06-15 17:14 · Score: 3, Informative

[quote]But, a lot of older computers which don't have DirectX 11 graphic cards have to emulate the DirectX DirectCompute API on the CPU[/quote].
They don't really have to emulate anything; most of the kernel (as in "compute kernel") functions and operations in DirectCompute have a one-to-one mapping with most CPUs' SIMD instruction sets, such as x86's SSE/AVX. The primary difference then is that on the CPU you have far fewer cores, while on the GPU you may have thousands of cores/streaming processors, but with higher memory latencies and at best only an L1 and L2 cache.
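As a rough illustration of that mapping, here is a sketch in plain C++ of a per-element "compute kernel" run on the CPU in groups of four floats, the width of one 128-bit SSE register. This is not DirectCompute or intrinsics code: `saxpy_kernel`, `saxpy_cpu`, and `kLanes` are hypothetical names, and whether the inner loop actually compiles to SSE/AVX instructions depends on the compiler and its flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-element kernel body: what one GPU thread would compute.
inline float saxpy_kernel(float a, float x, float y) { return a * x + y; }

// Runs the kernel over every element of x and y, updating y in place.
void saxpy_cpu(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    constexpr std::size_t kLanes = 4;  // floats per 128-bit SSE register
    const std::size_t n = x.size();
    std::size_t i = 0;

    // Main loop: each iteration handles one SIMD-width group of elements.
    // The fixed-size inner loop is the shape a compiler may auto-vectorize.
    for (; i + kLanes <= n; i += kLanes)
        for (std::size_t lane = 0; lane < kLanes; ++lane)
            y[i + lane] = saxpy_kernel(a, x[i + lane], y[i + lane]);

    // Scalar tail for sizes that are not a multiple of kLanes.
    for (; i < n; ++i)
        y[i] = saxpy_kernel(a, x[i], y[i]);
}
```

The same kernel body would be executed once per thread on a GPU. On the CPU the parallelism comes from the lanes of each SIMD register, and possibly from running the outer loop on several cores.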
So it's OpenCL then by SuperKendall · 2011-06-15 19:02 · Score: 3, Informative

No, it's really a lot more like OpenCL.
Which is not Mac-only BTW... but you can use it in OS X or iOS development.
Also Apple's Accelerate library (C library) takes advantage of OpenCL for BLAS and Linpack and so on...

--
"There is more worth loving than we have strength to love." - Brian Jay Stanley