Intel Updates Compilers For Multicore CPUs

Posted by kdawson on Tuesday June 5, 2007 @07:46AM from the what-about-gcc dept.

Threaded writes with news from Ars that Intel has announced major updates to its C++ and Fortran tools. The new compilers are Intel's first that are capable of doing thread-level optimization and auto-vectorization simultaneously in a single pass. "On the data parallelism side, the Intel C++ Compiler and Fortran Professional Editions both sport improved auto-vectorization features that can target Intel's new SSE4 extensions. For thread-level parallelism, the compilers support the use of Intel's Thread Building Blocks for automatic thread-level optimization that takes place simultaneously with auto-vectorization... Intel is encouraging the widespread use of its Intel Threading Tools as an interface to its multicore processors. As the company raises the core count with each generation of new products, it will get harder and harder for programmers to manage the complexity associated with all of that available parallelism. So the Thread Building Blocks are Intel's attempt to insert a stable layer of abstraction between the programmer and the processor so that code scales less painfully with the number of cores."

Anyone want to... by u-bend · 2007-06-05 07:49 · Score: 4, Funny

...briefly translate this article into cretin for me, so that I can understand a bit more of why it's so cool?

--
u-bend
1. Re:Anyone want to... by Trigun · 2007-06-05 07:54 · Score: 5, Informative
  
  The compiler worries about the cores so you don't have to. Is that too cretin?
2. Re:Anyone want to... by BecomingLumberg · 2007-06-05 07:55 · Score: 4, Informative
  
  >>>So the Thread Building Blocks are Intel's attempt to insert a stable layer of abstraction between the programmer and the processor so that code scales less painfully with the number of cores.
  
  They found a way to make the computer able to determine how to use its many CPU cores automagically when you compile a program. It is cool, since it is really hard to figure out how to share a given workload 16 even ways.
  
  --
  If a nation expects to be ignorant and free, in a state of civilization, it expects what never was and never will be.-TJ
3. Re:Anyone want to... by Mockylock · 2007-06-05 07:58 · Score: 5, Funny
  
  The parallelism of the Compiler Fortran and Professional Edition of the uranium core both sport improved auto-vectorizationalism of the fortran and format that can target Intel's new SSE4 extensionalism. For thread-level parallelismisitic quantum theory, the compilers support the use of Intel's Threadtastic Building Block nationalism for objectionism for automatic thread-level optimizationalism that takes place simultaneously with auto-vectorization of parellel universes... Intel is encouraging the widespread use of its Intel Threading quantum physics parallel vectorizationistic Tools as an interface on the enterprise bridge to its Spock multicore processors. As the parallel company raises the vectorized core count with each multitudinal generation of new vector parallel products, it will get harder and harder for programmers to manage the complexity associated with all of that available parallelismistic forces.
  
  See, it's not that hard to understand.
  
  --
  "Please, shut up. Just when I think you can't say anything more stupid, you speak again." -Archie Bunker.
4. Re:Anyone want to... by LWATCDR · 2007-06-05 08:09 · Score: 4, Interesting
  
  SSE4 is the latest and greatest vector instruction set from Intel: MMX->SSE->SSE2->SSE3->SSE4. These instructions speed up things like transcoding video and audio. They are also good for anything that does a lot of floating-point math. The downside is that very few systems have CPUs that support SSE4, and selecting it may hurt systems that don't have SSE4, or the program might not run at all, depending on how the compiler is written. My bet is it will degrade gracefully. Overall, SSE4 is most useful for people who are writing custom software right now, and it will become commonplace in off-the-shelf software once AMD supports it and systems that support it are more common.
  The Threading Building Blocks are yet another attempt to make writing multithreaded code easier. Frankly I don't find pthreads hard but maybe I am just odd.
  Threading is very important because we are not going to see an endless increase in clock speed anymore. Intel, AMD, and IBM are all pushing multiple cores. Adding an extra core or three really does help modern systems at least a little, since we are often running multiple tasks, but current software will not scale as well when the core count starts growing in a Moore-like fashion. Right now we are at four cores; if Moore's law holds, in two years we might see eight, then 16, then 32... As you can see it gets out of hand pretty quickly. Your average desktop will not use four cores very well, much less eight, until software is written to take advantage of more cores.
  Yes I know that Moore said 18 months but I was going for nice round numbers.
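One way a program can "degrade gracefully" on CPUs without SSE4 is to ship both a vector path and a plain fallback and pick between them at run time. The following is only a sketch of that idea, not a description of what the Intel compiler actually emits. The function names (`dot_scalar`, `dot_sse41`, `select_dot`) and the `SKETCH_X86` macro are made up for illustration, and the SSE4.1 path assumes GCC or Clang on x86 (for `__builtin_cpu_supports` and the `target` attribute).

```cpp
#include <cstddef>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <smmintrin.h>  // SSE4.1 intrinsics such as _mm_dp_ps
#define SKETCH_X86 1
#else
#define SKETCH_X86 0
#endif

// Portable fallback: runs on any CPU.
float dot_scalar(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}

#if SKETCH_X86
// SSE4.1 path. The target attribute lets the compiler use SSE4.1 in this
// one function without requiring it for the whole program.
__attribute__((target("sse4.1")))
float dot_sse41(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        // Mask 0xF1: multiply all four lanes, put their sum in lane 0.
        sum += _mm_cvtss_f32(_mm_dp_ps(va, vb, 0xF1));
    }
    for (; i < n; ++i) sum += a[i] * b[i];  // leftover elements
    return sum;
}
#endif

using DotFn = float (*)(const float*, const float*, std::size_t);

// Pick an implementation once, based on what the running CPU reports.
DotFn select_dot() {
#if SKETCH_X86
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.1")) return dot_sse41;
#endif
    return dot_scalar;
}
```

A caller would store the result of `select_dot()` once at startup and call through the pointer afterwards, so the same binary runs on old and new processors.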
  
  --
  See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
5. Re:Anyone want to... by James_Intel · 2007-06-05 12:08 · Score: 5, Informative
  
  Automagical - we try. Vectorization, parallelization - I dare say the Intel compilers are at least as good at it as any compiler ever has been. Bold statement - yeah. I believe it is true.
  
  A more interesting question is "Is that good enough?" For vectorization, the answer is 'usually' - so some additional work/headaches happen when it isn't enough. For parallelization - the answer is at best 'sometimes.' So I'll get flamed two ways: (1) by people very happy with it - who say that I've understated how good it is - and it is all they need, (2) by people with programs which don't get magical auto-parallelism to solve their needs. There are more people in #2 than #1 - but this ain't a 1-size-fits-all-world. Not a bad deal if it solves your problems - otherwise - you got work to do... but that ain't the compiler's fault... parallelism requires work for most of us.
  
  About languages...
  Virtually every Fortran, C and C++ compiler these days supports OpenMP, which is not part of the official language standards - but is there to use. It is loop oriented, and is very Fortran-like and fits into C well enough... but is definitely not C++ like.
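To illustrate what "loop oriented" means here, a minimal OpenMP sketch follows. The function name `sum_of_squares` is made up for the example. The pragma only takes effect when OpenMP is enabled (for instance `-fopenmp` with GCC or Clang); otherwise the compiler ignores it and the loop runs serially.

```cpp
#include <vector>

// Sum of squares over a vector. With OpenMP enabled, the pragma splits the
// loop iterations across threads and the reduction clause combines the
// per-thread partial sums into `sum`.
double sum_of_squares(const std::vector<double>& values) {
    double sum = 0.0;
    const long n = static_cast<long>(values.size());
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += values[i] * values[i];
    }
    return sum;
}
```

The parallelism is attached to the loop as an annotation rather than expressed through classes or templates, which is roughly why it feels natural in Fortran and C but not very C++ like.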
  
  Fortran and C/C++ don't support threading in the language, so you need to write your code to be thread-safe and use a threading package like Windows threads or POSIX threads (pthreads). Boost.Thread offers a portable interface that hits the key threading needs - essentially wrappers for pthreads and Windows threads, etc. - and the standards are likely to add a portable interface officially in the future. That is one thing Java did from the start.
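For readers coming to this later: C++11 did add a portable threading interface (`std::thread`, `std::mutex`) modeled closely on Boost.Thread. A minimal sketch of the "thread-safe" part, assuming a C++11 or newer compiler; the function name `count_with_threads` is made up for the example.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Several threads increment one shared counter. The mutex makes each
// increment atomic with respect to the other threads; without it, updates
// could be lost and the final count would be unpredictable.
long count_with_threads(int num_threads, int increments_per_thread) {
    long counter = 0;
    std::mutex counter_mutex;
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(num_threads));

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                std::lock_guard<std::mutex> lock(counter_mutex);
                ++counter;
            }
        });
    }
    // Wait for every worker before reading the result.
    for (std::thread& worker : workers) worker.join();
    return counter;
}
```

On some toolchains this needs the platform thread library at link time (for example `-pthread` with GCC).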
  
  Intel compilers -> Intel CPUs -> all compatible processors
  The Intel compilers and libraries aim to beat other compilers and libraries regardless of the processor they run on. No one will get it right all the time - so this is not a dare to find a single little code sample to prove me wrong. But if a real program doesn't get the best results from Intel - we want to know. (yeap - I work at Intel - I post for myself)
Intel - The Software Company by Necroman · 2007-06-05 07:56 · Score: 5, Insightful

We see Intel mainly as a CPU/chipset maker, but don't pay much attention to their software side. I believe they are one of the largest software development companies in the world. Between drivers, compilers, and all the other goodies to support all their hardware, they spend a lot of time doing software development.

And as much as they develop compilers to optimize code for Intel CPUs, most of the time the code will see a speed increase on AMD CPUs as well. Who else do you want developing a compiler but the people who made the hardware it's running on?

--
Its not what it is, its something else.
1. Re:Intel - The Software Company by dmoore · 2007-06-05 08:03 · Score: 4, Interesting
  
  I have not tried their compiler, but the Intel Performance Primitives (IPP), a library of useful MMX/SSE-optimized functions written by Intel, explicitly fall back to slow versions of the code if they detect an AMD processor, even if the AMD processor has MMX/SSE/SSE2. This kind of behavior is one reason that you may not want to trust Intel for your compiler needs if you are planning on doing development for more than just Intel-branded CPUs.
2. Re:Intel - The Software Company by James_Intel · 2007-06-05 12:39 · Score: 5, Informative
  
  (Yes - I work for Intel - post for myself - tell it like it is) Cute story if it was true. However - Intel compilers and libraries are designed to use features - but we don't come out every day with an update. The new compilers support SSE 4, but Intel only. AMD support comes after the processors exist that support it. Libraries aren't quite there yet with SSE 4 (I guess we hate Intel processors too - flame us). But AMD support for SSE 3 is there - now that it is in their processors. It wasn't there when we developed version 9 of the compilers. We do test our compilers/libraries on other implementations - because believe it or not - we care if it works. It doesn't always - and we adjust the compiler/library to make it work. We had a beta a few years ago which blew up on Intel processors and worked on AMD processors (yeap - I said it right - imagine the embarrassment when a customer told us about that combination). Oops. I heard that was because we released support before we tested that it worked on that processor. So we learned not to do that too often. By the time we release product - it should work on all processors. I would say "does" or "guaranteed to" - but the lawyers would freak - because nothing in life is guaranteed. We are clearly not trying to screw our customers though - you know... the developers who count on our software. It is annoying when people suggest that might be our goal.
  
  My favorite complaint: Intel checks "CPUID"
  No duh - that's where the feature information is.
  
  Next favorite: Intel checks for "GenuineIntel".
  Another "no duh" - RTFM from Intel or AMD - the feature flags checking has to come AFTER you determine the manufacturer AND family of the processor...
  unless you don't care about running on all processors
  (spare pointing out to me that you can skip the first two checks - look at the SSE flag - and it is usually right - unless say you pick just the right older processor)
  We do the checks the way Intel and AMD manuals say we have to... if that is evil... so be it.
  We even start by testing if the CPUID instruction exists (it didn't before Pentium processors).
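The check order described above (does CPUID exist, then manufacturer, then family, then feature flags) can be sketched as follows. This is an illustration only, not Intel's actual detection code. It assumes GCC or Clang on x86, where `<cpuid.h>` provides `__get_cpuid`; the `CpuInfo` struct, `query_cpu` function, and `SKETCH_HAVE_CPUID` macro are hypothetical names, and on other platforms the function simply reports that CPUID is unavailable.

```cpp
#include <cstring>
#include <string>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>  // GCC/Clang helper header providing __get_cpuid
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

// Hypothetical result type for this sketch.
struct CpuInfo {
    bool cpuid_available = false;  // false on non-x86 builds or pre-CPUID chips
    std::string vendor;            // e.g. "GenuineIntel" or "AuthenticAMD"
    unsigned family = 0;           // display family from leaf 1
    bool sse = false;
    bool sse2 = false;
    bool sse3 = false;
    bool sse4_1 = false;
};

CpuInfo query_cpu() {
    CpuInfo info;
#if SKETCH_HAVE_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    // Step 1: is CPUID usable at all? __get_cpuid returns 0 if not.
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx)) return info;
    info.cpuid_available = true;

    // Step 2: manufacturer. Leaf 0 returns the 12-byte vendor string
    // in EBX, EDX, ECX (in that order).
    char vendor[13] = {};
    std::memcpy(vendor + 0, &ebx, 4);
    std::memcpy(vendor + 4, &edx, 4);
    std::memcpy(vendor + 8, &ecx, 4);
    info.vendor = vendor;

    // Step 3: family, then feature flags, both from leaf 1.
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        const unsigned base_family = (eax >> 8) & 0xFu;
        const unsigned ext_family = (eax >> 20) & 0xFFu;
        info.family = (base_family == 0xFu) ? base_family + ext_family
                                            : base_family;
        info.sse    = ((edx >> 25) & 1u) != 0;
        info.sse2   = ((edx >> 26) & 1u) != 0;
        info.sse3   = ((ecx >> 0) & 1u) != 0;
        info.sse4_1 = ((ecx >> 19) & 1u) != 0;
    }
#endif
    return info;
}
```

A dispatcher would then decide, per vendor and family, which of the reported flags it is willing to trust before selecting an optimized code path.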
Re:Umm.. by Timesprout · 2007-06-05 08:02 · Score: 4, Funny

Intel has added kitten whiskers and pixie dust to its compilers so your ponies can now play on multiple paddocks.

--
Do not try to read the dupe, thats impossible. Instead, only try to realize the truth
What truth?
There is no dupe
The inevitable... by R2.0 · 2007-06-05 08:06 · Score: 4, Funny

Cue "Fortran is Dead" comments in

30
20
10

--
"As God is my witness, I thought turkeys could fly." A. Carlson
1. Re:The inevitable... by TeknoHog · 2007-06-05 08:53 · Score: 4, Insightful
  
  Fortran is dead, and it has had native parallel math since 1990. C is alive and it needs ugly hacks to get parallel math.
  
  --
  Escher was the first MC and Giger invented the HR department.
Re:Umm.. by repvik · 2007-06-05 08:44 · Score: 4, Funny

OMG! PONIES!!!
I dont understand this statement: by JustNiz · 2007-06-05 09:07 · Score: 5, Insightful

>> As the company raises the core count with each generation of new products, it will get harder and harder for programmers to manage the complexity associated with all of that available parallelism.

I'm very surprised and disappointed by the pervasiveness of the incorrect myth that's being promoted, even amongst supposedly technically knowledgeable groups, that:
a) Writing multithreaded code is terribly difficult
b) You need to implement code to have the same number of threads as your target hardware has cores
Neither of these is true, at least for the PC architecture.

The way to develop multithreaded code is to exploit the natural parallelism of the problem itself. If the problem decomposes down most neatly into one, three or 6789 threads, then design and write the implementation that way. Consequently the complexity of the problem does not increase as the number of cores available increases.

In the PC architecture case, attempting to design your code based on the number of cores in your target hardware just leads to a twisted and therefore bad and also non-portable design.

I'm surprised how few developers seem to understand that in fact it's OK, normal and often desirable to have more than one application thread running on the same core. In fact you really can't ensure or even assume that your multi-threaded app will get one core per thread even if the hardware has enough cores, or that it will work best if it does, as core/thread allocation is dynamically scheduled by the OS depending on load. Not to mention there are all sorts of other apps, drivers and operating system tasks running concurrently too, so depending on each core's load, one app thread per core may not actually be the optimal approach anyway.
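A small sketch of the point about decomposing along the problem rather than the hardware: one task per document, however many documents there are, with the OS left to schedule them onto whatever cores exist. It assumes a C++11 or newer compiler; `count_words` and `count_words_per_document` are made-up names for the example.

```cpp
#include <cstddef>
#include <future>
#include <sstream>
#include <string>
#include <vector>

// Count whitespace-separated words in one document.
std::size_t count_words(const std::string& text) {
    std::istringstream in(text);
    std::string word;
    std::size_t n = 0;
    while (in >> word) ++n;
    return n;
}

// The natural unit of work is a document, so launch one task per document.
// Nothing here asks how many cores the machine has.
std::vector<std::size_t> count_words_per_document(
        const std::vector<std::string>& docs) {
    std::vector<std::future<std::size_t>> pending;
    pending.reserve(docs.size());
    for (const std::string& doc : docs) {
        pending.push_back(std::async(std::launch::async,
                                     [&doc] { return count_words(doc); }));
    }

    std::vector<std::size_t> counts;
    counts.reserve(docs.size());
    for (std::future<std::size_t>& f : pending) counts.push_back(f.get());
    return counts;
}
```

The result is the same on one core or sixteen. For very large inputs a real program would probably hand these tasks to a pool rather than start a thread for each, but the decomposition itself would not change.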
Re:Umm.. by Just+Some+Guy · 2007-06-05 09:37 · Score: 4, Funny

Intel has added kitten whiskers and pixie dust to its compilers

You're thinking of IBM.

--
Dewey, what part of this looks like authorities should be involved?
Re:No and yes by smallfries · 2007-06-05 11:31 · Score: 4, Informative

Well, no actually you can't. If you've ever spent any time going through the 1000 page Intel Optimisation Guide for the x86 then you would know that they don't spell out all of the trade offs explicitly. They describe enough to point you in the right direction but they keep a lot back. Partially because the behaviour of these chips in certain usage patterns isn't even defined by the design - it's a side-effect of several other parts of the chip design interacting. So the best that you can do is suck it and see - and in general it changes not between major ISA revisions but on individual models.

Now, if you're Intel then you have the time and the money to work out exactly how to exploit these tradeoffs to schedule threads effectively. But you don't want to give that away for free. From the (very scanty) marketing bullshit that was linked to, it would appear that they've added an Intel-specific threading library (probably with a POSIX interface). Separate to this is a profiling tool and a multi-threaded debugger (the latter of which is non-trivial). While any debugger will let you skip across threads, allowing you to do it in a deterministic manner to look for race hazards is much harder.

The analysis tools sound nice, but the bolt-on library is nothing special. It's purely to win a few synthetic benchmarks and gain some market share for ICC and therefore more "Made for Intel" applications in the market. I'm cynical about the library because what is broken about the threading model in C/C++ would take more than a library to fix. It would require redesigning the language down to the ground and choosing a different set of control constructs.

So finally, when you claim that it's because Intel has "better" coders, you don't know what you're talking about. I know a few guys who code GCC for a living, and they are grade A coders. It is because Intel has moved the goalposts. It's not so much that GCC targets multiple architectures, it's that they are trying to stick to (relatively) standard C whereas Intel is willing to redefine where the semantic gap sits if they can squeeze out a little more performance. Their attitude is screw portable code - talking across different compiler vendors here, rather than chip vendors. If what they need to squeeze into their compiler is no longer "C" strictly speaking, then they don't care. The gcc guys do.

Ah yes, and portable code can be a smaller window than you expect. That weighty 1000 page Intel document is sitting comfortably next to the AMD equivalent, which differs in surprising places.

--
Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php