Slashdot Mirror


Intel Updates Compilers For Multicore CPUs

Threaded writes with news from Ars that Intel has announced major updates to its C++ and Fortran tools. The new compilers are Intel's first that are capable of doing thread-level optimization and auto-vectorization simultaneously in a single pass. "On the data parallelism side, the Intel C++ Compiler and Fortran Professional Editions both sport improved auto-vectorization features that can target Intel's new SSE4 extensions. For thread-level parallelism, the compilers support the use of Intel's Thread Building Blocks for automatic thread-level optimization that takes place simultaneously with auto-vectorization... Intel is encouraging the widespread use of its Intel Threading Tools as an interface to its multicore processors. As the company raises the core count with each generation of new products, it will get harder and harder for programmers to manage the complexity associated with all of that available parallelism. So the Thread Building Blocks are Intel's attempt to insert a stable layer of abstraction between the programmer and the processor so that code scales less painfully with the number of cores."

208 comments

  1. Anyone want to... by u-bend · · Score: 4, Funny

    ...briefly translate this article into cretin for me, so that I can understand a bit more of why it's so cool?

    --
    u-bend
    1. Re:Anyone want to... by Trigun · · Score: 5, Informative

      The compiler worries about the cores so you don't have to. Is that too cretin?

    2. Re:Anyone want to... by BecomingLumberg · · Score: 4, Informative

      >>>So the Thread Building Blocks are Intel's attempt to insert a stable layer of abstraction between the programmer and the processor so that code scales less painfully with the number of cores.

      They found a way to make the computer be able to determine how to use its many CPU cores automagically when you compile a program. It is cool, since it is really hard to figure out how to share a given workload 16 even ways.

      --
      If a nation expects to be ignorant and free, in a state of civilization, it expects what never was and never will be.-TJ
    3. Re:Anyone want to... by CaptainPatent · · Score: 2, Informative

      essentially the compiler will automatically optimize thread splitting (time and number of splits if I'm reading this correctly) which is a very handy feature as it will quickly become nearly impossible to manage future processors with 16+ cores. They do seem to hide a lot of the true features underneath market-speak though.

      --
      Well, back to rejecting software patent applications.
    4. Re:Anyone want to... by u-bend · · Score: 1

      haha! Maybe a little too cretin. I might be able to handle information that's a *tiny* bit more technical.
      :)
      Soooo, at the risk of sounding really stupid, wasn't this sort of thing happening with previous compilers?

      --
      u-bend
    5. Re:Anyone want to... by Mockylock · · Score: 5, Funny

      The parallelism of the Compiler Fortran and Professional Edition of the uranium core both sport improved auto-vectorizationalism of the fortran and format that can target Intel's new SSE4 extensionalism. For thread-level parallelismisitic quantum theory, the compilers support the use of Intel's Threadtastic Building Block nationalism for objectionism for automatic thread-level optimizationalism that takes place simultaneously with auto-vectorization of parellel universes... Intel is encouraging the widespread use of its Intel Threading quantum physics parallel vectorizationistic Tools as an interface on the enterprise bridge to its Spock multicore processors. As the parallel company raises the vectorized core count with each multitudinal generation of new vector parallel products, it will get harder and harder for programmers to manage the complexity associated with all of that available parallelismistic forces.

      See, it's not that hard to understand.

      --
      "Please, shut up. Just when I think you can't say anything more stupid, you speak again." -Archie Bunker.
    6. Re:Anyone want to... by Anonymous Coward · · Score: 0, Funny

      Go back to your PHP and leave this to the real programmers.

    7. Re:Anyone want to... by Applekid · · Score: 1

      Sounds like snake oil to me.

      I can't speak for Fortran but what standard C++ mechanisms are there for threading? If they added stuff to the CLR, shouldn't it have gone through the organizations that maintain them? Weird compiler extensions are bad for cross-compatibility. (Which I guess is the point since Intel compilers -> Intel CPUs -> No other CPU manufacturers).

      Besides, threading is still an OS specific venture. Do these optimizations just work by looking for calls to fork() or the Windows alternative?

      I'd rather they take my code and automagically compile sections to take advantage of SSE invisibly.

      --
      More Twoson than Cupertino
    8. Re:Anyone want to... by andrewd18 · · Score: 1

      Hilarious!

    9. Re:Anyone want to... by Applekid · · Score: 1

      And by CLR I obviously mean CRT. You maniacs! You blew it all to .NET!

      --
      More Twoson than Cupertino
    10. Re:Anyone want to... by Applekid · · Score: 1

      (Also, I obviously didn't RTFS. If you'll excuse me I'll put on the paper bag.)

      --
      More Twoson than Cupertino
    11. Re:Anyone want to... by u-bend · · Score: 3, Funny

      Heh, now that's what I really needed to hear. So crap's going to automatically make use of multiple cores better.

      FYI, not a programmer/developer/etc., not even PHP, just interested in tech, but love the attitude anyway, AC ;)

      --
      u-bend
    12. Re:Anyone want to... by LWATCDR · · Score: 4, Interesting

      SSE4 is the latest and greatest vector instruction set from Intel. MMX->SSE->SSE2->SSE3->SSE4. These instructions speed up things like trans-coding video and audio. They are also good for anything that does a lot of Floating-Point. The downside is very few systems have CPUs that support SSE4 and selecting it may hurt systems that don't have SSE4 or the program might not run at all depending on how the compiler is written. My bet is it will degrade gracefully. Over all SSE4 is most useful for people that are writing custom software right now and will become commonplace in off the shelf software once AMD supports it and systems that support it are more common.
      The Threading Building Blocks are yet another attempt to make writing multithreaded code easier. Frankly I don't find pthreads hard but maybe I am just odd.
      Threading is very important because we are not going to see an endless increase in clock speed anymore. Intel, AMD, and IBM are all pushing multiple cores. While adding an extra core or three really does help modern systems at least a little, since we are often running multiple tasks, current software will not scale as well when the cores start growing in a Moore-like fashion. Right now we are at four cores; if Moore's law holds, in two years we might see eight, then 16, then 32... As you can see it gets out of hand pretty quickly. Your average desktop will not use four cores very well, much less eight, until software is written to take advantage of more cores.
      Yes I know that Moore said 18 months but I was going for nice round numbers.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    13. Re:Anyone want to... by Anonymous Coward · · Score: 0

      They're making it easier for n00b h4x0rs to develop leet apps that take advantage of multi-core chips.

    14. Re:Anyone want to... by digitalunity · · Score: 1

      I lol'd and my cube-neighbor is looking at me funny. Thanks, ass.

      --
      You can't legislate goodness. Let each to his own destiny, by will of his freely made choices.
    15. Re:Anyone want to... by Vo1t · · Score: 1

      I guess, news from Arse are always cool.

    16. Re:Anyone want to... by u-bend · · Score: 2, Funny

      You spelled '1337' wrong. Now that's funny coming from a n00b like me!

      --
      u-bend
    17. Re:Anyone want to... by StikyPad · · Score: 0

      Better wipe that up.

    18. Re:Anyone want to... by StikyPad · · Score: 1

      I'll give it a try..

      Ugh ugh uhh. *Bang* Huh ugh huh mmmm uh buh ugh Fortran.

    19. Re:Anyone want to... by James_Intel · · Score: 5, Informative

      Automagical - we try. Vectorization, parallelization - I dare say the Intel compilers are at least as good at it as any compiler ever has been. Bold statement - yeah. I believe it is true.

      A more interesting question is "Is that good enough?" For vectorization, the answer is 'usually' - so some additional work/headaches happen when it isn't enough. For parallelization - the answer is at best 'sometimes.' So I'll get flamed two ways: (1) by people very happy with it - and say that I've understated how good it is - and it is all they need, (2) by people with programs which don't get magical auto-parallelism to solve their needs. There are more people in #2 than #1 - but this ain't a 1-size-fits-all-world. Not a bad deal if it solves your problems - otherwise - you got work to do... but that ain't the compiler's fault... parallelism requires work for most of us.

      About languages...
      Virtually every Fortran, C and C++ compiler these days support OpenMP, which is not part of the official standard - but is there to use. It is loop oriented, and is very Fortran-like and fits into C well enough... but is definitely not C++ like.

      Fortran and C/C++ don't support threading in the language, you need to write your code to be thread-safe, and you need to use a threading package like Windows threads or POSIX threads (pthreads). Boost threads offer a portable interface to hit on the key threading needs - essentially wrappers for pthreads and Windows threads, etc. - the standards are likely to add a portable interface officially in the future. One thing Java did from the start.

      Intel compilers -> Intel CPUs -> all compatible processors
      The Intel compilers and libraries aim to beat other compilers and libraries regardless of the processor it is run on. No one will get it right all the time - so this is not a dare to find single examples of little code sample to prove me wrong. But if a real program doesn't get the best results from Intel - we want to know. (yeap - I work at Intel - I post for myself)
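
      A minimal sketch of the loop-oriented OpenMP style mentioned above, not taken from the post itself. The function name sum_of_squares is hypothetical. The pragma only takes effect when the compiler's OpenMP switch is enabled; without it the pragma is ignored and the loop runs serially, which is much of the appeal.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example function: sums the squares of the inputs.
// With OpenMP enabled, the pragma splits the loop iterations across threads
// and combines the per-thread partial sums (the "reduction").
// Without OpenMP the pragma is ignored and this is an ordinary serial loop.
double sum_of_squares(const std::vector<double>& values) {
    double total = 0.0;
    // A signed loop index keeps the loop valid for older OpenMP versions.
    const long n = static_cast<long>(values.size());
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; ++i) {
        const double v = values[static_cast<std::size_t>(i)];
        total += v * v;
    }
    return total;
}
```

      Note that the loop body has no dependence between iterations, which is what makes it safe to split. A loop that reads the previous iteration's result would not qualify.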

    20. Re:Anyone want to... by James_Intel · · Score: 3, Interesting

      The compiler will try like crazy to do that - and sometimes it does a great job. Most of the time - you'll have work to do (it won't do it for you). What we've found though - is that anything a programmer can do to express tasks that are splittable - makes the automation more and more possible. OpenMP (11 years old now) has carried that into the multicore world from the world of supercomputing - for loops. Don't have loops? Well, that there would be a tough one.
      Threading Building Blocks is a good option for C++ developers - because it pushes you to rewrite key parts of the code - for thread safety (too bad C++/C doesn't force that) and for this automation of splitting. Often this is easier than you'd think - and then you're in easy city. I'm not saying it is easy, nor a cure-all - but it is useful to look at it and see if it isn't the best idea so far - and see what else we can do.
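
      To illustrate what "expressing tasks that are splittable" can look like, here is a rough sketch in plain standard C++ threads. This is NOT the Threading Building Blocks API; chunked_sum is a hypothetical helper, and a real library would also handle load balancing and choosing the worker count for you.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper (not part of TBB): sums `data` by cutting the index
// range into contiguous chunks and giving each chunk to its own thread.
long long chunked_sum(const std::vector<int>& data, std::size_t workers) {
    if (workers == 0) workers = 1;
    // Chunk size rounded up so the chunks cover the whole range.
    const std::size_t chunk = (data.size() + workers - 1) / workers;
    // One result slot per worker: no two threads write the same element,
    // so no locking is needed.
    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(data.size(), w * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        pool.emplace_back([&data, &partial, w, begin, end] {
            partial[w] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (std::thread& t : pool) t.join();
    // Combine the per-chunk results on the calling thread.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

      The work is thread-safe here only because each chunk is independent and writes to its own slot. That independence is the part the programmer has to provide before any automation can split the work.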

    21. Re:Anyone want to... by Anonymous Coward · · Score: 0

      The parallelism of the charvering Compiler "Butplug" Fortran and Professional "Ass-stitcher" Edition of the uranium core both sport improved auto-vectorizationalism of the fortran and format that can target Intel's new SSE4 extensionalism. For thread-level parallelismisitic quantum theory, the creaming squirts support the use of Intel's Threadtastic "Saggysack" Building Block nationalism for objectionism for automatic thread-level optimizationalism that ballbusts place simultaneously with auto-vectorization of parellel spanks... Intel is encouraging the widespread use of its Intel "Fannyfarmer" Threading quantum felchs parallel vectorizationistic Tools as an interface on the spanking enterprise bridge to its Spock multicore processors. As the parallel company unclefucks the fucking motherfucked core count with each multitudinal generation of new vector parallel motherfucks, it will get harder and harder for felchs to manage the complexity associated with all of that available parallelismistic forces.

      See, it's not that hard to understand.

    22. Re:Anyone want to... by durdur · · Score: 1

      >selecting it may hurt systems that don't have SSE4 or the program might not run
      > at all depending on how the compiler is written

      Intel compilers can generate multiple versions of a function for different processors and code to dispatch to the most optimal one.
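
      A hand-written sketch of the same idea, assuming GCC or Clang on x86 (the target attribute and __builtin_cpu_supports are extensions of those compilers, not something the Intel compiler is confirmed to use internally). The names dot_generic, dot_sse41, select_dot and dot are hypothetical. On other platforms the sketch simply falls back to the generic version.

```cpp
#include <cstddef>

// Baseline version: plain scalar loop, safe on any CPU.
static int dot_generic(const int* a, const int* b, std::size_t n) {
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
// Same source, but the compiler is allowed to use SSE4.1 instructions
// when generating code for this one function.
__attribute__((target("sse4.1")))
static int dot_sse41(const int* a, const int* b, std::size_t n) {
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}
#else
#define HAVE_X86_DISPATCH 0
#endif

using DotFn = int (*)(const int*, const int*, std::size_t);

// Picks an implementation based on what the running CPU reports.
static DotFn select_dot() {
#if HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("sse4.1")) return dot_sse41;
#endif
    return dot_generic;
}

// Public entry point: the choice is made once, on first call.
int dot(const int* a, const int* b, std::size_t n) {
    static const DotFn chosen = select_dot();
    return chosen(a, b, n);
}
```

      Both versions must return the same answer. Only the speed should differ, which is what lets a program built this way degrade gracefully on CPUs without the newer instructions.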

    23. Re:Anyone want to... by GMFTatsujin · · Score: 1

      Where can I invest in this exciting new technology?

    24. Re:Anyone want to... by Mockylock · · Score: 1

      Best Buy.

      --
      "Please, shut up. Just when I think you can't say anything more stupid, you speak again." -Archie Bunker.
    25. Re:Anyone want to... by Anonymous Coward · · Score: 0

      Hmm, your ideas are intriguing to me, and I wish to subscribe to your newsletter.

    26. Re:Anyone want to... by Hoi+Polloi · · Score: 1

      Yah, us LOGO programmers don't care for your type around here.

      --
      It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
  2. OK by Anonymous Coward · · Score: 0

    Translation: "They have made improvements that can better translate regularly programmed code into machine code that can run faster on their CPU's".

    Hopefully, your games will be out faster, and run more realistic (in terms of AI and graphics) because programmers will spend less time making sure their code makes full use of the features of the CPU(s).

  3. Intel - The Software Company by Necroman · · Score: 5, Insightful

    We see Intel mainly as a CPU/chipset maker, but don't pay much attention to their software side. I believe they are one of the largest software development companies in the world. Between drivers, compilers, and all the other goodies to support all their hardware, they spend a lot of time doing software development.

    And as much as they develop compilers to optimize code for Intel CPUs, the code most of the time will also see a speed increase on AMD CPUs as well. Who else do you want developing a compiler but the people who made the hardware it's running on.

    --
    Its not what it is, its something else.
    1. Re:Intel - The Software Company by Tribbin · · Score: 2, Insightful

      "Who else do you want developing a compiler but the people who made the hardware it's running on."

      You mean like nvidia making nvidia drivers for linux?

      --
      If you mod this up, your slashdot background will turn into a beautiful sunset!
    2. Re:Intel - The Software Company by dmoore · · Score: 4, Interesting

      I have not tried their compiler, but for the Intel Performance Primitives (IPP), a library of useful MMX/SSE-optimized functions written by Intel, they explicitly fall-back to slow versions of the code if it detects an AMD processor, even if the AMD processor has MMX/SSE/SSE2. This kind of behavior is one reason that you may not want to trust Intel for your compiler needs if you are planning on doing development for more than just Intel-branded CPUs.

    3. Re:Intel - The Software Company by Chandon+Seldon · · Score: 3, Interesting

      It's really useful for a CPU company to develop an optimizing compiler for their hardware. It forces them to understand how their CPU features actually speed up software, and it gives them the opportunity to prove that certain hard optimizations actually work. It would probably be best for everyone if the compiler were open source, but if Intel thinks they need to sell it as a commercial product to justify it financially we still get all of the benefit on their future processor designs.

      --
      -- The act of censorship is always worse than whatever is being censored. Always.
    4. Re:Intel - The Software Company by 15Bit · · Score: 1
      You can say the same for most of the other major chip makers - IBM and Sun both do the same, and in years gone by DEC used to make an arse-kicking Fortran compiler for the Alpha. In fact, probably the only major chip producer that doesn't make compilers is AMD.

    5. Re:Intel - The Software Company by Elbereth · · Score: 2, Funny

      From the viewpoint of Intel, this is actually good practice. They don't know what features that AMD actually supports (through possibly intentional ignorance), and they don't want to cause someone's system to lock up. While I'd rather see my AMD CPU be supported by Intel's compiler, I can understand why they might be reticent to support certain features, even though the CPU reports support for that feature.

      Anyways, it's not like MMX/SSE are really used for much of anything but benchmarks and voice synthesis. Or, at least, that's what it was like last time I actually cared enough to look.

      When I was a kid, we didn't even have MMX. We made use with math coprocessors, and sometimes we didn't even have that. In fact, I remember using CPUs that didn't even have onboard MMUs or support for protected mode operation. Kids today are spoiled. Try using a VIC 20 or TI 99/4a for a few hours, then tell me how important it is to have your competitor design a compiler that optimizes for your CPU.

    6. Re:Intel - The Software Company by Red+Flayer · · Score: 1

      Try using a VIC 20 or TI 99/4a for a few hours, then tell me how important it is to have your competitor design a compiler that optimizes for your CPU.
      Noob. Try punchcards. At the very least, the Commodore PET. The VIC-20 was an awesomely powerful piece of hardware compared to that.
      --
      "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
    7. Re:Intel - The Software Company by Anonymous Coward · · Score: 0

      I disagree. Intel disabled the optimizations in a very under-handed way and they did not come clean about it until they were 'outed'. There is no compelling technical reason for what they did.

      The best thing to do when you can't trust Intel's compilers is to just not use them. Who knows what other crappy sneaky things they put in their tools?

    8. Re:Intel - The Software Company by Wesley+Felter · · Score: 3, Insightful

      So if an Intel processor reports SSEn support you assume that it works, but if an AMD processor reports the same feature, you assume that it doesn't work? Great idea.

      This matters because the whole purpose of IPP is to take advantage of newer instructions. If you say "new instructions don't matter because no one uses them" it becomes a self-fulfilling prophecy. Optimized libraries could break out of that cycle, but only if they aren't used as competitive weapons.

    9. Re:Intel - The Software Company by Anonymous Coward · · Score: 0

      Who knows what other crappy sneaky things they put in their tools?
      IIRC, the source of icc is available for viewing.
    10. Re:Intel - The Software Company by cyfer2000 · · Score: 1

      I thought every CPU provider does. Or they make CPUs compatible with existing compilers.

      --
      There is a spark in every single flame bait point.
    11. Re:Intel - The Software Company by jimicus · · Score: 1

      Who else do you want developing a compiler but the people who made the hardware it's running on.

      My goodness... you can't mean... that the company which developed the hardware is in a strong position to get a few people from the hardware dev team onto the team developing software for it?! And that these people are well placed to know what's worth optimising, where and how?

      No shit, Sherlock.

      The only amazing thing about this is that it is such a novel insight that it is necessary for you to be modded as such.

    12. Re:Intel - The Software Company by coats · · Score: 1
      But they have major NDAs with the compiler teams from Sun ( http://developers.sun.com/sunstudio/index.jsp ) and PathScale ( http://www.pathscale.com/index.html )...

      --
      "My opinions are my own, and I've got *lots* of them!"
    13. Re:Intel - The Software Company by the_humeister · · Score: 3, Funny

      Or ATI making any sort of drivers?

    14. Re:Intel - The Software Company by Bert64 · · Score: 2, Insightful

      On the contrary, they should check for the presence of the appropriate feature, and then use it...
      They should also let you build binaries without those fallback code paths, as a lot of code will never run on older machines (eg x86 macs, which all have at least sse3).
      If someone's system locks up because AMD claimed to support a feature which they don't actually support, that's AMD's fault and Intel could claim the moral high ground instead of the other way round.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    15. Re:Intel - The Software Company by Anonymous Coward · · Score: 0

      Maybe some of the sources for the std libraries are available, but I don't think sources for the compiler proper are.

      It's pretty much a black box with "crappy sneaky things" in it.

    16. Re:Intel - The Software Company by RedElf · · Score: 1

      And just earlier today I was reading a comment right here on slashdot where someone was ranting because ATI wasn't letting the community do the driver development instead of doing it in-house.

      Talk about polar opposites swimming in the same pond.

      --
      You know, I have one simple request. And that is to have sharks with frickin' laser beams attached to their heads!
    17. Re:Intel - The Software Company by Anonymous Coward · · Score: 1, Informative

      It would probably be best for everyone if the compiler were open source, but if Intel thinks they need to sell it as a commercial product to justify it financially we still get all of the benefit on their future processor designs.

      If it were open source you could modify it to work on AMD processors. In the past, I specced out an Intel workstation rather than AMD specifically because my software used the Intel Math Kernel Libraries. Granted, it was only one computer many years ago (when AMD was faster than Intel) but when you see companies building big Beowulf clusters or considering processor/math intensive apps I bet there's a few extra sales to be made there.

      And yes, the MKL gave me 60x speedup over hand-written matrix algebra. Big deal when things go from an hour to a minute.

    18. Re:Intel - The Software Company by geekoid · · Score: 1

      I'm sure the ATI's driver being 'teh suck' has a lot to do with that. Seriously, they are always bad and do not take advantage of the chips.

      I am pretty sure that if their drivers were well made the call to open-source their drivers would not be so loud.

      --
      The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
    19. Re:Intel - The Software Company by RedElf · · Score: 1

      Ahh yes, I found the comment I was talking about earlier, it can be read right here.

      --
      You know, I have one simple request. And that is to have sharks with frickin' laser beams attached to their heads!
    20. Re:Intel - The Software Company by KingMotley · · Score: 2, Funny

      Who else do you want developing a compiler but the people who made the hardware it's running on.
      Who else do you want developing an office suite but the people who made the operating system it's running on.
    21. Re:Intel - The Software Company by Anonymous Coward · · Score: 0

      You mean like nvidia making nvidia drivers for linux?
      The difference is Intel actually uses Linux. I went to their little show-and-tell in Pittsburgh last year and every single machine there was running Ubuntu, except for one MacBook. No Windows.
    22. Re:Intel - The Software Company by Chandon+Seldon · · Score: 1

      All CPU makers make their CPUs compatible with existing compilers - but that completely ignores new instructions like SSE4. For that sort of thing, either the programmer has to take advantage of it with hand-coded assembly, or someone needs to write a compiler optimized for the new instruction set. If the CPU vendor does it themselves rather than waiting for Microsoft and the GNU project to get around to it they can see results faster and feed information from/to hardware design more quickly and efficiently.

      --
      -- The act of censorship is always worse than whatever is being censored. Always.
    23. Re:Intel - The Software Company by JebusIsLord · · Score: 1

      Or, they can just contribute to GCC, like I believe Apple did in order to get altivec optimizations in there.

      --
      Jeremy
    24. Re:Intel - The Software Company by Tim+Browse · · Score: 1

      The only amazing thing about this is that it is such a novel insight that it is necessary for you to be modded as such.

      And yet, historically it has proven to be incorrect. The usual result of getting hardware developers to write compilers is that you get shitty compilers. The amazing reason for this is that people who spend their career writing compilers turn out to be way better at it than people who spend their career developing hardware.

      The Intel compiler is a notable exception - but it wasn't that long ago that code correctness was not that high on the Intel compiler's list of qualities. The code was fast, but not reliable (compared to, e.g. gcc or msvc). To paraphrase Gerald Weinberg, "I can write a program that executes in zero seconds if the output doesn't have to be correct."

      Just because you designed the hardware doesn't mean you have the best idea of what goes on in most 'real world' software - in some cases, you can be totally blindsided because you thought you knew best. iirc, some versions of the VAX processor had a bunch of instructions put in that were 'useful for compilers'. The compiler writers took a look at them, and said "Er, no thanks." There are other examples in the field's history.

    25. Re:Intel - The Software Company by Anonymous Coward · · Score: 0

      No it's not.

      Perhaps it was a long time ago (seriously doubt it, though).

    26. Re:Intel - The Software Company by jd · · Score: 1
      Maths co-processors... Yeah, I remember those. I wrote a Mandelbrot generator that manipulated the 8087 stack directly, so none of the floating-point values ever had to be transferred to/from main memory. All integers (for loops) were held in 8086 registers for the same reason. It was still slow, but it was a respectable slow.

      IIT's maths co-processor could handle matrix arithmetic directly - it didn't have a simple 1D stack, but a 2D array you could process on. It was also roughly 10x faster than the Intel co-processor. For the time, this was a damn good product, and it's a pity they vanished. (Fractint even had specialized support for it.)

      Support for protected mode - that was a waste. Yeesh. Switching between CPU modes is slow and rarely worth it. Forget the VIC20 - the PET 8096 supported up to 256 Kb of RAM (banked) and had offloading for all print and disk operations. Something that even modern ix86 systems do not generally have. The PetSpeed compiler, from Oxford Computing, was truly amazing - a superb four-pass compiler that could beat to a pulp any one-pass compile/one-pass link solution of the day. They did some amazing things that shouldn't have been possible on the PET in their demo, and their advertising was definitely risque. (I also know that at least one author reads Slashdot, so hi and many thanks.)

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    27. Re:Intel - The Software Company by yahooadam · · Score: 1

      it probably wouldnt be a good demonstration if it crashed though ...

    28. Re:Intel - The Software Company by Anonymous Coward · · Score: 0

      speaking of outing, u r gay.

    29. Re:Intel - The Software Company by ravyne · · Score: 1

      I agree with your point that the optimizations should be taken based on a feature-by-feature basis, however it's likely that code optimized for Intel's processor extensions might be sub-optimal on AMD's extensions. All these instruction sets like SSE, MMX, even x87 and x86 are essentially specs; the implementation can and often does differ. Each new core from each vendor will have different latency and throughput characteristics that will have a bearing on what the optimal code for each platform will look like. An approach that results in a 17-cycle computation on a Core 2 Duo might require 19 on AMD's K8 architecture where a different approach might yield 16-cycle execution. The story might change again with K10, which might very well run that Core 2 Duo codepath even faster than Intel.

      So, while I agree that they should take the better-than-nothing approach of using the SSE code on SSE-supporting AMD processors, I can see how they'd rather avoid the business of writing highly-optimized code for their competitors. By sidelining the competitors' chips to platform-agnostic C/C++ code, they avoid a situation where AMD comes back and complains about their SSE paths being sub-optimal for their CPUs. AMD has to come up with their own Performance Primitives if they want an optimal solution.

      I forget the name of the project, but I've seen a few OpenSource projects aimed at unifying the various vector instruction sets (MMX, 3DNow!, SSE/2/3/4, Altivec) under a common set of compiler-independent "intrinsics" (basically mapping their intrinsics onto each compiler's intrinsics) but of course this doesn't solve the problem of generating optimal code, only the problem of maintaining several optimized code-paths. Maybe it's time for the FOSS community to develop a free and open competitor to Intel's Performance Primitives targeting AMD's extensions, Intel's extensions and possibly Altivec that are API compatible.

    30. Re:Intel - The Software Company by James_Intel · · Score: 5, Informative

      (Yes - I work for Intel - post for myself - tell it like it is) Cute story if it was true. However - Intel compilers and libraries, are designed to use features - but we don't come out every day with an update. The new compilers support SSE 4, but Intel only. AMD support comes after the processors exist that support it. Libraries aren't quite there yet with SSE 4 (I guess we hate Intel processors too - flame us). But AMD support for SSE 3 is there - now that it is in their processors. It wasn't there when we developed version 9 of the compilers. We do test our compilers/libraries on other implementations - because believe it or not - we care if it works. It doesn't always - and we adjust the compiler/library to make it work. We had a beta a few years ago which blew up on Intel processors and worked on AMD processors (yeap - I said it right - imagine the embarrassment when a customer told us about that combination). Oops. I heard that was because we released support before we tested that it worked on that processor. So we learned not to do that too often. By the time we release product - it should work on all processors. I would say "does" or "guaranteed to" - but the lawyers would freak - because nothing in life is guaranteed. We are clearly not trying to screw our customers though - you know... the developers who count on our software. It is annoying when people suggest that might be our goal.

      My favorite complaint: Intel checks "CPUID"
      No duh - that's where the feature information is.

      Next favorite: Intel checks for "GenuineIntel".
      Another "no duh" - RTFM from Intel or AMD - the feature flag checking has to come AFTER you determine the manufacturer AND family of the processor...
      unless you don't care about running on all processors
      (spare pointing out to me that you can skip the first two checks - look at the SSE flag - and it is usually right - unless say you pick just the right older processor)
      We do the checks the way Intel and AMD manuals say we have to... if that is evil... so be it.
      We even start by testing if the CPUID instruction exists (it didn't before Pentium processors).

    31. Re:Intel - The Software Company by clarkn0va · · Score: 1

      From the viewpoint of Intel, this is actually good practice...they don't want to cause someone's system to lock up.
      No. If Intel's only concern in their compiler supporting features such as MMX or SSE on the competitor's hardware was that said hardware might lock up or do something funny, then they would just put a disclaimer on the compiler to let you know that your binaries risk locking up on AMD or any other unsupported hardware.

      The only reason Intel bothers disabling code on a cpu which AMD claims to be supported code is to make the end user feel like AMD is much slower than Intel at running this code. This is much more insidious than just writing a compiler to run on your own hardware and letting the competition fend for themselves. It is subtle, it requires more work on Intel's part, and of the discussed hypothetical options, it is the least likely to make the user sit up and say, "hey, this binary that was created using Intel's compiler doesn't work well on my AMD cpu!"

      --
      I am literally 3000 tokens away from the chaotic crossbow --Stephen
    32. Re:Intel - The Software Company by bill_mcgonigle · · Score: 1

      Noob. Try punchcards. At the very least, the Commodore PET. The VIC-20 was an awesomely powerful piece of hardware compared to that.

      Ha! After writing software for the VIC-20 I wound up on a PET one summer and kept getting these 'out of memory' errors. I had no idea why that might be happening (had to ask the prof). :)

      And today 'hello world' can't even fit in 20K.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    33. Re:Intel - The Software Company by 3p1ph4ny · · Score: 1
      And today 'hello world' can't even fit in 20K.

      tycho@mittens:~/c$ cat temp.c
      #include<stdio.h>
      main(){printf("Hello World\n");}
      tycho@mittens:~/c$ gcc temp.c
      tycho@mittens:~/c$ ./a.out
      Hello World
      tycho@mittens:~/c$ size a.out
         text    data     bss     dec     hex filename
         1103     528       8    1639     667 a.out
    34. Re:Intel - The Software Company by bill_mcgonigle · · Score: 1

      yeah, it's probably even smaller in pure assembly.

      The context was student work, then in BASIC, now typically in Java. As I understand it gcj 4.2 can produce a static binary in as little as a meg.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    35. Re:Intel - The Software Company by Wolfrider · · Score: 1

      Man, my 1st computer was a TI 99/4a... We had teh speech synthesizer, and Parsec, Alpine... And a MONO CASSETTE RECORDER to store Extended Basic progs. (Man it was a b1tch when the little POS froze up because of the sh1tty cartridge implementation, tho.)

      I made a Voltron interactive-text game for it back in the day; easiest way to win quickly was just: " FORM BLAZING SWORD ". :)

      --
      .
      == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
    36. Re:Intel - The Software Company by Nazlfrag · · Score: 1
      That sounds like a much better way for Intel to achieve goals of fast feedback etc. considering that nowadays I'd assume the vast majority of GCC invocations will occur on Intel chips. Features like automagic parallelisation they could keep for themselves, which would keep their own compilers' status intact. All that said, the GCC developers are usually up with or ahead of other compilers in terms of features, and will sail along perfectly well without direct Intel support.

      In other words, Intel could utilize GCC for gain, while GCC probably couldn't care less if Intel joins the party.

    37. Re:Intel - The Software Company by Red+Flayer · · Score: 1

      It does depend on which model PET, though. The original had 4k memory, I think they offered an 8k version later that year (1977). Eventually (1980 or 81, IIRC) the PET had 32k memory, and could handle anything the VIC20 could (except for graphics, of course -- the problem with the PET was that the character library was in ROM, and so it was impossible to create custom graphics instead of PETSCII analogues)...

      At any rate, the VIC-20 had only 3.5k memory available... you must have had one with expanded memory? I think they went up to 40k or thereabouts.

      Ah, good times.

      --
      "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
    38. Re:Intel - The Software Company by at0mjack · · Score: 2, Informative

      Checking for 'GenuineIntel' is fine, but the actual code emitted by the compiler goes straight to 'no additional capabilities' if it detects any other string. In other words, in the 32-bit compiler any non-Intel chip is doomed to run the 'you are a bog-standard 386 with no MMX/SSE/SSE2 support' code path regardless of its actual capabilities. This 'feature' makes less difference in the 64-bit compiler (because the base level is an EM64T with SSE2, as opposed to a 386 with nothing for the 32-bit version), but as new instruction sets come online (SSE3, SSE4 and the like) this artificial crippling of AMD chips will start to show there as well.

      And yes, you say you 'tell it like it is', but I've disassembled the actual code and it doesn't accord with your story. See http://www.swallowtail.org/naughty-intel.html for the gory details. The proof of the pudding is in the eating: if you patch one of our programs compiled with the Intel compiler to remove the Intel check, it runs significantly faster on AMD chips (the unpatched build takes DOUBLE the runtime).

      There is no technical reason for these checks to be there: they are purely a competitive ploy to cripple performance on AMD chips. If Intel released their compiler for free, then I'd say so what: they're allowed to make it a marketing tool. OTOH, they release it as a commercial product and charge me money for it: doing that and then deliberately crippling its performance is IMHO not acceptable.
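
      The difference being argued about can be sketched in a few lines of C. This is only an illustration, not Intel's dispatcher: the function names and the enum are made up here, and the only facts taken from the CPU manuals are the CPUID leaf 1 EDX bit positions for MMX, SSE and SSE2. One function trusts the feature flags alone; the other sends every vendor string except "GenuineIntel" to the generic path first.

```c
#include <string.h>

/* CPUID leaf 1, EDX feature bits as documented by Intel and AMD. */
#define FEAT_MMX  (1u << 23)
#define FEAT_SSE  (1u << 25)
#define FEAT_SSE2 (1u << 26)

/* Hypothetical set of code paths a dispatcher could choose from. */
enum code_path { PATH_GENERIC, PATH_MMX, PATH_SSE, PATH_SSE2 };

/* Pick the best path from the feature flags alone. */
enum code_path path_from_flags(unsigned edx)
{
    if (edx & FEAT_SSE2) return PATH_SSE2;
    if (edx & FEAT_SSE)  return PATH_SSE;
    if (edx & FEAT_MMX)  return PATH_MMX;
    return PATH_GENERIC;
}

/* Hypothetical vendor-gated variant: any vendor string other than
 * "GenuineIntel" gets the generic path, whatever its flags report. */
enum code_path path_vendor_gated(const char *vendor, unsigned edx)
{
    if (strcmp(vendor, "GenuineIntel") != 0)
        return PATH_GENERIC;
    return path_from_flags(edx);
}
```

      With all three flags set, path_from_flags() picks the SSE2 path, while path_vendor_gated() returns the generic path for "AuthenticAMD" and the SSE2 path for "GenuineIntel".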

    39. Re:Intel - The Software Company by bill_mcgonigle · · Score: 1

      Thanks for the info.

      At any rate, the VIC-20 had only 3.5k memory available... you must have had one with expanded memory? I think they went up to 40k or thereabouts.

      Yeah, that sounds right - I only borrowed time on the VIC-20, but I recall it had 24K or so (it's been a while!) - by the time I could buy a computer (imagine that) the C-64 had gotten cheap, and that had more memory than I could ever fill. :)

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    40. Re:Intel - The Software Company by James_Intel · · Score: 1

      The patches you point to will cause the compiled code to fail on older Intel processors and older AMD processors by bypassing checking and forcing code to run regardless of processor. We don't support that nor encourage doing that.

      The 10.0 compilers have SSE3 tested and supported for Intel and AMD - as I said before, the previous version was designed before AMD had SSE3 support. The 10.0 documentation has more information on this topic, worth a read.

    41. Re:Intel - The Software Company by Anonymous Coward · · Score: 0

      You should try ldd: your binary actually depends on a significant amount of shared library code.
      If you compile with gcc -static temp.c then the binary will show its real size (660k on my AMD64/Ubuntu).

      However: diet cc temp.c gives only 2710 bytes and the result is a static binary. Now, if we'd optimize, printf is an extremely stupid way to print something; use write(1,"Hello World\n",12) or something like that instead. You also need to optimize for size: add -Os to the command line.

      BTW, on a VIC-20 the equivalent binary would be a little smaller, as instructions take less space than on newer architectures. (With some exceptions of course, e.g. a 64-bit by 32-bit divide would need a lot more instructions on a VIC-20 than on modern x86 or x86-64.)
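
      The write() suggestion above, spelled out as a minimal sketch. It assumes a POSIX system; say_hello is a name made up for this example. It skips stdio entirely and hands the 12 bytes straight to file descriptor 1.

```c
#include <unistd.h>

/* Write the greeting straight to stdout (fd 1), bypassing stdio.
 * Returns the number of bytes written, or -1 on error. */
ssize_t say_hello(void)
{
    static const char msg[] = "Hello World\n";
    /* sizeof msg counts the trailing NUL, so subtract one. */
    return write(STDOUT_FILENO, msg, sizeof msg - 1);
}
```

      Calling say_hello() from main() in place of the printf line removes the program's only use of stdio, which is what lets a small libc leave that code out of a static link.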

    42. Re:Intel - The Software Company by at0mjack · · Score: 1

      No, that's simply not true. The only check the patch bypasses is the 'is the CPUID GenuineIntel?' one: I don't bypass any of the 'is CPUID supported? What is the maximum level of CPUID supported? What instruction set flags are set?' tests.

      The 'who made this CPU' test is completely irrelevant to all of this: I challenge you to name a CPU for which the _cpu_indicator_init function (or equivalent) would give incorrect results if the vendor ID string check was removed.

      Now, I haven't looked at the 10.0 compilers yet, so it's possible that Intel have removed all this naughtiness in that release. However, given that I first complained to Intel about this "feature" back in version 7.1, and it remained in place for versions 8 and 9, I'm not holding my breath.

    43. Re:Intel - The Software Company by James_Intel · · Score: 1

      There are several processors it will fail for - from several vendors. They are not recent processors, but they exist based on our research. Since they are not Intel processors, I'm not going to list them because I cannot find vendor information acknowledging the issue. Since it actually is not an issue (since that is not the CPUID usage defined as valid) - I'm not surprised it is not documented. At least some were not AMD processors either based on my memory - but I don't have the information at my fingertips any more.

      I do know Intel and AMD document the approved process very carefully. And since our designers have agreed upon the sequence they will support - it is not safe to use a different sequence for detection and assume it will work in the future.
      You are proposing a sequence for detection neither Intel nor AMD agree to support. You observe it works for most processors you know of. I agree that is true.
      I just disagree this is the right approach - I prefer to see the approved and supported sequence used, all processors covered, and no future issues.

    44. Re:Intel - The Software Company by at0mjack · · Score: 1

      If you and AMD have documented the approved process so carefully, why do you only allow 'GenuineIntel' chips to pass through and have their capabilities detected and not 'AuthenticAMD' ones? This still smacks of deliberate crippling of performance on any chips but your own. If all else fails, just document that code compiled with the -ax flags may fail on certain very obscure old CPUs and be done with it.

      I run a computing farm with mixed PIII, P4, Core 2 Duo, Athlon MP and Opteron CPUs in it. How should I compile my code to work in that environment such that the optimal CPU capabilities are used on each CPU?

    45. Re:Intel - The Software Company by James_Intel · · Score: 1

      For the mix you suggest - optimal is a different binary for each (that is true for every compiler - no matter what anyone claims), near optimal (close enough) is one of the -ax options as you suggest since you know you don't have Pentiums, Pentium II, etc. The real headache is you can't build an optimal binary with lots of paths... i-cache kills you... in fact... 2 paths (highly specific and generic) is hard to make run faster, and 3 paths is nearly impossible to make faster reliably with the compiler. Forget a path for every processor - the compiler really gets to make 2. You can force more, but your results are not likely to be good. The options let you pick how specific you want the 'highly specific' to be - the other is generic (the compiler switches let you pick how generic - so you can say 'assume MMX and SSE' so you don't have to be completely braindead). Unfortunately, our default caters to those who don't read the manual - and so generic includes 386 processors. Many of our customers would be upset if we produced binaries which don't run "everywhere." Many would be happy if P6 was the lowest we supported or even Pentium II.
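
      A rough sketch of what "two paths" means in practice, written by hand rather than taken from any compiler's output. All the names are hypothetical, and the "specific" routine is only an unrolled stand-in for code that a real compiler would build from SIMD instructions. The point is the shape: resolve the choice once, then call through a pointer.

```c
#include <stddef.h>

typedef float (*sum_fn)(const float *, size_t);

/* Generic path: has to run on every supported processor. */
float sum_generic(const float *x, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Stand-in for the "highly specific" path. A real compiler would emit
 * SIMD instructions here; this portable sketch only unrolls by four. */
float sum_specific(const float *x, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    for (; i < n; i++)   /* leftover elements */
        s0 += x[i];
    return (s0 + s1) + (s2 + s3);
}

/* Resolve the path once; has_fast_features is a hypothetical
 * result of whatever CPU detection ran at startup. */
sum_fn select_sum(int has_fast_features)
{
    return has_fast_features ? sum_specific : sum_generic;
}
```

      Every extra path is another full copy of the hot loop in the binary, which is where the i-cache cost mentioned above comes from.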

    46. Re:Intel - The Software Company by 3p1ph4ny · · Score: 1

      Yeah, you're right about all that. I was just pointing out that it's possible to fit a simple program into less than 20k.

    47. Re:Intel - The Software Company by cyfer2000 · · Score: 1

      I think you are talking about x86-compatible CPUs. When a CPU with a brand new instruction set is born, the developers make a new compiler together with the CPU.

      --
      There is a spark in every single flame bait point.
    48. Re:Intel - The Software Company by Anonymous Coward · · Score: 0

      I don't know who you are, but you have the patience of a saint.

    49. Re:Intel - The Software Company by gnasher719 · · Score: 1

      '' From the viewpoint of Intel, this is actually good practice. They don't know what features that AMD actually supports (through possibly intentional ignorance), and they don't want to cause someone's system to lock up. While I'd rather see my AMD CPU be supported by Intel's compiler, I can understand why they might be reticent to support certain features, even though the CPU reports support for that feature. ''

      This is actually not true. If AMD chips were incompatible say with SSE2 even though they set the SSE2 flag in the processor, then Intel would _want_ any AMD system to lock up and point to that incompatibility. Since the code _works_ on AMD chips, Intel does what is the second best thing for Intel and makes the code run slow on AMD instead.

    50. Re:Intel - The Software Company by Man+On+Pink+Corner · · Score: 1

      When I was a kid, we didn't even have MMX. You think that's bad? My computer used to interrupt me every 108 minutes to make me enter the result of a floating-point calculation.

    51. Re:Intel - The Software Company by Gazzonyx · · Score: 1

      I don't know who you are, but you have the patience of a saint.

      Agreed. I was just reading this thread thinking the same thing.

      I just took a class in assembly (PPC Solaris w/ multiple cores), and the logistics of it all blew me away. I suddenly realized how complicated my obfuscated C++ code must be to GCC, because I don't ever think about cache coherency and pipeline latency when I'm writing C++ (or any other language). I also never considered that you have to align your memory at the hardware level. It suddenly makes a coder all that much more cautious. I found out that GCC, which is considered a 'good' compiler, is still sloppy at the machine code level because anything that isn't hard coded has to be built blindly, leaving 'holes' in memory where data structures didn't line up on a word boundary. Compilers are complicated, and being the Monday morning quarterbacks that we all are at times won't change this simple truth. Oh, for the record, it's all worth it when you realize what you can do with pass by value-reference on the stack from a function.
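
      The 'holes' are usually alignment padding, and they are easy to see from C. A small sketch with made-up struct names; the exact sizes depend on the platform ABI, so no specific numbers are promised here.

```c
#include <stddef.h>

/* Members ordered so the compiler has to pad around 'b'
 * to keep the int on its alignment boundary. */
struct padded {
    char a;
    int  b;
    char c;
};

/* Same members, largest first: typically fewer padding bytes. */
struct reordered {
    int  b;
    char a;
    char c;
};

/* Bytes of padding ("holes") inside struct padded. */
size_t padded_holes(void)
{
    return sizeof(struct padded)
         - (sizeof(char) + sizeof(int) + sizeof(char));
}
```

      Printing sizeof(struct padded), sizeof(struct reordered) and offsetof(struct padded, b) on your own machine shows where the padding went. On common ABIs the reordered layout is the smaller one, but the C standard does not guarantee that.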

      --

      If I mod you up, it doesn't necessarily mean I agree with what you've said, sorry.

    52. Re:Intel - The Software Company by be-fan · · Score: 1

      Intel's and NVIDIA's software guys are generally recognized as being pretty competent. ATI's guys have a decidedly worse reputation, which really is deserved given their historical performance (or lack thereof).

      --
      A deep unwavering belief is a sure sign you're missing something...
    53. Re:Intel - The Software Company by Hoi+Polloi · · Score: 1

      See what happens when you try to be helpful? Now you've been drafted into tech support and get hostile accusations too. No good deed goes unpunished. :)

      --
      It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
  4. GCC by Anonymous Coward · · Score: 3, Insightful

    Will they add these features to GCC or make docs available so others can?

    1. Re:GCC by Anonymous Coward · · Score: 0

      I think this is mostly done by the gcc people. GCC already has sse4 support in mainline (specifically, support for ssse3, sse4.1, and sse4.2). It also has an autovect-branch (tree-ssa is already in mainline). Intel® Threading Building Blocks seems to be yet another threading library, though only for C++, and only on Intel® machines (unlike, e.g., pthreads or OpenMP). If (for some reason) you still want to use this less-portable threading library, you can use it perfectly well with gcc.

    2. Re:GCC by sofla · · Score: 1

      The Thread Building Blocks look interesting, and are available for multiple compilers (MS, Intel, GCC) and platforms (Win, Linux, Mac). Reading between the lines, it looks like their strategy is to convince you to use TBB for your threading (maybe not a bad idea, by-hand threading code is boring and error prone), and then profile your code with VTune (which I have used before, it's good as profilers go) plus the VTune plugin to do thread performance analysis.

      As far as this actual announcement, there's not much to it. Looks like they upgraded their Fortran and C++ compilers to work better with TBB. TBB and VTune are Intel's strategy when it comes to optimizing for multiple cores. No surprise there - VTune has been the tool for optimizing for Intel for a while now, if you want to optimize for Intel (which != optimizing for x86).

      So depending on what you mean by "add these features to GCC", they already have. If you're looking for GCC extension support in the Intel compiler, though - well, I can't help you there. And likely neither can they, thanks to GPL. You painted yourself in that corner when you chose to code to the GCC compiler (and not ANSI) in the first place. Don't get me wrong, I like GCC, but coding to a specific compiler is always a bad idea, in my book.

  5. Moore's law onto programmers?! by iknownuttin · · Score: 1
    FTFA: I've outlined before how multicore moves the burden of taking advantage of Moore's law from hardware onto developers.

    From Wikipedia: Moore's Law is the empirical observation made in 1965 that the number of transistors on an integrated circuit for minimum component cost doubles every 24 months.

    Alrighty, then. It's been a while since my CS classes. How does that apply to software? Does he mean that instead of increasing transistors on a single chip, the transistors are virtually increasing by using multiple cores?

    --
    I prefer Flambe as apposed flamebait.
    1. Re:Moore's law onto programmers?! by Anonymous Coward · · Score: 0

      Moore's law states that the number of transistors you can squeeze into a certain area effectively doubles every 24 months. Most people interpret this to mean that every 2 years, your performance will double (you have twice as many transistors after all). Just adding a second core will not double your performance unless you have written your programs to actually run on both cores at once. Thus, instead of using the extra transistors to improve performance of the CPU, they will simply add a second core and put the work of optimizing programs on programmers.

    2. Re:Moore's law onto programmers?! by Anonymous Coward · · Score: 0

      Multicore allows you to double your transistor count by just copying the existing design. Thus Moore's law keeps going. The downside is that simple, single-thread programs no longer go twice as fast like they did in previous doublings. Thus, the programmers have to do much more work to take advantage of the extra transistors.

    3. Re:Moore's law onto programmers?! by Anonymous Coward · · Score: 1, Interesting

      I know people like car analogies, bear with me. Let's imagine a CPU is like a horse.

      For years, people have been selectively breeding horses (CPUs) to allow them to go faster and faster. If you wanted a job done quickly, you simply bought a faster horse. That worked great until recently. Now, there isn't much improvement. So, instead of a faster horse, people are offered more horses for the same price. That's great, except that now the challenge becomes managing multiple horses in parallel and figuring out how to accomplish some of the same tasks as before, using all those horses efficiently. Hitching two horses to a wagon is easy. Four, harder. Eight, even more difficult. Eventually there is a practical limit, but in all cases, all you get from the horse breeder is a horse (the hardware). It's up to you to figure out how to put them together to get the job done effectively (software). Worst case, it's as if you had only a single horse that isn't much faster than before. Hence, the burden of an actual performance increase is shifting more to the user of the hardware than has been the case in the past. Adding more horses isn't as easy or simple a solution as before.

      To stretch the analogy even more ridiculously, I guess this announcement is like Intel just released their latest and greatest version of a multi-horse harness especially for scientific stagecoaches (FORTRAN).

    4. Re:Moore's law onto programmers?! by __aaclcg7560 · · Score: 1

      The problem with a multi-horse harness, if not configured properly, is that the extra manure from the horses can cause the stagecoach to slide around and force the driver to do more work to keep everything in line. Plus it can stink to high heaven. ;)

    5. Re:Moore's law onto programmers?! by kybred · · Score: 1

      Moore's Law is the empirical observation made in 1965 that the number of transistors on an integrated circuit for minimum component cost doubles every 24 months

      Alrighty, then. It's been a while since my CS classes. How does that apply to software? Does he mean that instead of increasing transistors on a single chip, the transistors are virtually increasing by using multiple cores?

      No, it means the number of programmers on a project doubles every 24 months.

    6. Re:Moore's law onto programmers?! by sofla · · Score: 1

      FTFA: I've outlined before how multicore moves the burden of taking advantage of Moore's law from hardware onto developers.

      Alrighty, then. It's been a while since my CS classes. How does that apply to software?

      It doesn't. But it's either that, or Intel has to come out and tell us the truth: "We hardware guys are tired of all this focus on Moore's Law. Let's let the software guys bang their heads against Amdahl's Law for a while."
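
      For anyone who hasn't banged their head on it yet, Amdahl's Law fits in one line of C. The function name is made up for this sketch: p is the fraction of the work that can run in parallel and n is the number of cores.

```c
/* Amdahl's Law: speedup on n cores when a fraction p of the work
 * (0 <= p <= 1) can run in parallel and the rest stays serial. */
double amdahl_speedup(double p, int n)
{
    return 1.0 / ((1.0 - p) + p / (double)n);
}
```

      Even with 90% of the program parallel, 16 cores give 1 / (0.1 + 0.9/16) = 6.4 times the speed, not 16. The serial 10% dominates long before the core count does.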

    7. Re:Moore's law onto programmers?! by James_Intel · · Score: 1

      I don't remember programs running twice as fast on a 2GHz processor as a 1GHz processor - oh, unless they fit in cache. I'm not arguing that single threaded gets a lot of benefit from multicore (it gets some from offloading other tasks, from OS services, other services it might use, etc.) - but I can't stand the myth that doubling clock rate was doubling performance of single threaded applications. It drives me NUTS (not far to go?)

      issue 1: Memory wall - memory speed wasn't keeping up with processor speed; multicore slows this nuisance... at least clockspeed-wise... bandwidth is a challenge when caches aren't enough - and solving this isn't easy (at least not on a budget)

      issue 2: ILP wall - out of order execution and more and more exotic hardware to try to maintain a myth of double clock - double perf... at a high cost... multicore stops this rat race

      issue 3: power wall - double clock, double power consumption (okay - this is a lie, double clock means 4X the power... but if you wait 18 months... you can halve the size and get to only 2X) - multi-core stops this (even reversed it a bit)

      So - you got parallelism to deal with. Work - yes. But the three walls in front of us weren't going away.

    8. Re:Moore's law onto programmers?! by adrianmonk · · Score: 1

      Alrighty, then. It's been a while since my CS classes. How does that apply to software? Does he mean that instead of increasing transistors on a single chip, the transistors are virtually increasing by using multiple cores?

      No, it means that previously, the number of transistors was doubling every 24 months and the software developer had to do nothing (or very little) to take advantage of it because the extra transistors were buying you the ability to process a single instruction stream that much faster. Now, the number of transistors is still doubling every 24 months, but what it's buying you is the ability to process more instruction streams (first 1, then 2, now 4) at the same speed as before. That means developers have a new burden to make sure their software has multiple streams of instructions to feed the processor.

    9. Re:Moore's law onto programmers?! by ClosedSource · · Score: 1

      I think we can safely say that, in general, doubling the number of transistors doesn't double performance. It's not as if we've seen a 16X increase in performance since 1999.

    10. Re:Moore's law onto programmers?! by ClosedSource · · Score: 1

      The way I look at it, the heavy lifting was being done in the appropriate place. In order for Intel's customers to pay good money for a new processor, that processor should make their legacy applications run faster. If transistors are being shifted from making a fast single core to making slower multicores, legacy applications may run slower. Hardly a good reason to upgrade.

      It will be interesting to see the extent to which software developers will add the necessary value to multicore processors to make them succeed in the marketplace. Clearly, what Intel and AMD should be working very hard on is new technology to allow them to continue to improve their hardware performance rather than repackage the old technology that is reaching its limits.

      If that day comes, we may look back and laugh at the over-threaded mess we created to try to squeeze the last ounce of performance out of a failed technology.

  6. learn better parallel programming techniques? by sr.+taquito · · Score: 3, Interesting

    If compilers keep abstracting away the interface between the programmer and the cpu, programmers will be less likely to write better code or learn new techniques that take advantage of all the power a few extra cores can provide right? That's just my take on it. Then again, I also think learning parallel programming techniques is fun, and a little more academic than most career programmers might like.

    --
    mr pibb + red vines = crazy delicious
    1. Re:learn better parallel programming techniques? by BoChen456 · · Score: 2, Insightful

      If compilers keep abstracting away the interface between the programmer and the cpu, programmers will be less likely to write better code or learn new techniques that take advantage of all the power a few extra cores can provide right?

      If compilers keep abstracting away the interface between the programmer and the cpu, and getting better at optimization, programmers won't need to write better code or learn new techniques to take advantage of all the power a few extra cores can provide.

      Instead the programmer can concentrate on writing more understandable code.

    2. Re:learn better parallel programming techniques? by Anonymous Coward · · Score: 0

      we've already seen that with 'higher' level languages like Java and C# - programmers today don't really understand how to use memory properly, and as a result apps use up masses of the stuff, because 'the garbage collector will handle it for me'.

      However, in this case the advantages are in better handling of the code you write. you still have to do it right. This is not an excuse for sloppiness.

      Parallel programming is dull: it's all splitting code into separable sections and using the proper synchronisation mechanisms to prevent data corruption. Making it work fast (i.e. with few context switches, few long-held locks, etc.) is more fun. Career programmers like getting their work done well; we don't care at all if parallel programming stays in the university, or is used for specialist applications.

    3. Re:learn better parallel programming techniques? by peragrin · · Score: 1

      have you run windows? Yep that's C with poorly utilised memory.

      Good programmers can write good highly optimized and mostly bug free code.

      unfortunately good programmers are like good Car Drivers. Everyone says they are good, but very few really are.

      --
      i thought once I was found, but it was only a dream.
    4. Re:learn better parallel programming techniques? by BlueCollarCamel · · Score: 2, Insightful

      Actually, I've always thought that telling the compiler what you wanted to do, instead of how to do it, would result in the compiler being able to determine the best path to take for a given task.
      Even more so for interpreted/compiled on the fly languages. They can be dynamically compiled to take advantage of whatever hardware is available on each machine, without the developer having to code for it.

      --
      1&1 - Cheap domain and web hosting.
    5. Re:learn better parallel programming techniques? by Kupek · · Score: 1

      Aside from the auto-vectorizing stuff, most of what Intel is advertising does not happen automatically. Instead, they provide abstractions that make it easier to write high performance multithreaded code. But programmers will still have to do the hard stuff, which is figuring out how best to parallelize their algorithms, distribute their data, and synchronize their threads.
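
      To make the "abstractions, not magic" point concrete, here is a small sketch using an OpenMP pragma. OpenMP is swapped in because TBB is a C++ library and the thread's own snippet is C; parallel_sum is a name made up for the example. The programmer still has to decide that the loop is safe to split and that the total is a reduction; the runtime only handles the threads.

```c
#include <stddef.h>

/* Sum an integer array. Built with OpenMP enabled (e.g. gcc -fopenmp)
 * the iterations are divided among threads and the per-thread totals
 * are combined by the reduction clause. Without OpenMP the pragma is
 * ignored and the loop runs serially with the same result. */
long long parallel_sum(const int *x, size_t n)
{
    long long total = 0;
    long i;
    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < (long)n; i++)
        total += x[i];
    return total;
}
```

      Drop the reduction clause and every thread races on total, which is exactly the kind of data-sharing decision no compiler switch makes for you.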

    6. Re:learn better parallel programming techniques? by Anonymous Coward · · Score: 0

      This is not new at all. Compilers (interpreters) have been abstracting away the interface between the programmer and the cpu since the beginning. Very-high-level languages like LISP and Prolog are from the 1960's and 70's.

      1970's era LISP Systems were built on special hardware to suit LISP's functional programming. They evaluated the code more quickly than standard CPUs. Today, each independent S-expression could be evaluated on a separate CPU in parallel.

      Prolog is an inherently parallel language. A Prolog query can build several knowledge trees at a time for the same variables. On the proposed Intel 80 core chip, up to 80 answers can be found concurrently.

      I favor building parallelism into the compilers and interpreters where possible.

    7. Re:learn better parallel programming techniques? by zakeria · · Score: 0

      All compilers abstract away the interface between the programmer and the CPU; it's the main reason you use the compiler in the first place.

    8. Re:learn better parallel programming techniques? by suggsjc · · Score: 1

      First, I agree (somewhat).
      I've got a couple of thoughts that I'm not sure how to get out, so just see if you can put the pieces together.

      Low-level languages like C are powerful because they can interact (almost) directly with the hardware. Then there are other languages that are built on top of those languages that are designed to hide complexity and allow programmers to code more efficiently at the cost of non-optimized code.
      I didn't RTFA, but if the compilers start taking liberties and "hiding the complexity" of writing multi-threaded code, then unless they are absolutely perfect how would someone truly take advantage of hardware even while programming in C?

      I'm sorry that I don't have the time to refine my thoughts, but basically if low-level languages (lll's) start becoming more like mid/high-level languages then how do you go back to being able to optimize code like you did back when lll's were real lll's?

      --
      When I have a kid, I want to put him in one of those strollers for twins and then run around the mall looking frantic.
    9. Re:learn better parallel programming techniques? by Abcd1234 · · Score: 1

      You're making the tacit assumption, here, that software developers will do a better job optimizing their code. The problem is, this hasn't been true for many many years, now. Heck, even memory management is being taken out of the hands of programmers, and the result is more efficient code (yes, believe it or not, studies show that GCs are generally faster than manually allocating and deallocating memory, as the system can do a better job of judging when and how to do it). I suspect the same will be true of parallel processing.

    10. Re:learn better parallel programming techniques? by StikyPad · · Score: 1

      All I have are Dr. Pepper, Twizzlers, and cavities you insensitive clod.

    11. Re:learn better parallel programming techniques? by Billly+Gates · · Score: 1

      In large projects it's nice to have abstraction, as not all programmers are experienced. It also makes code easier to read. Trying to work out how another programmer is engineering the project or a specific implementation is hard. I like Java, for example, because combined with javadocs you can easily document code and figure out the hierarchy with the abstraction. But that is just my taste.

      But in career programming it's all about time and making deadlines to help make your company more money.

    12. Re:learn better parallel programming techniques? by DragonWriter · · Score: 1

      If compilers keep abstracting away the interface between the programmer and the cpu, programmers will be less likely to write better code or learn new techniques that take advantage of all the power a few extra cores can provide right?


      The people writing the compilers are "programmers" too. If the compiler programmers make the compiler do more effective optimization when turning platform-agnostic source into platform-specific binaries, that just means that the burden of worrying about platform optimization is shifted to them from application programmers, who then are freed to spend more time and effort on the higher-level design and implementation of their application, and less on platform-specific tweaking.

      This is a good thing, I would think, for everyone.
    13. Re:learn better parallel programming techniques? by yahooadam · · Score: 1

      I'm not so sure.

      A compiler has to be general purpose; it has to convert any code into something that will run.
      A human, however, can see better ways to optimise code or do interesting tricks, which is why assembler is capable of being so much faster.

      The trouble is, as the program gets complex it takes far too much time to try to optimise your code, so assembler isn't going to take over (it would take you forever to do even the most basic things). Tools like this are good because they can save a lot of human time, which is expensive, and use machine time, which is pretty cheap in comparison. However, they won't generate the best/smallest/fastest code possible.

    14. Re:learn better parallel programming techniques? by Billly+Gates · · Score: 1

      Believe it or not, it's impossible to talk directly to the hardware like you could in the 386 days. Assembly instructions are translated to internal RISC ones.

      Modern processors have many features for compilers to take advantage of, like pipelines and SSE4, which reduces the need to work at a low level. Compilers do a damn good job, and even JIT compilers like those for Java and .NET can take advantage of modern features. I imagine hardware virtualization will make these even faster.

      I suppose there will always be a need for low level optimization and C will be around for a long time as a result. Many hackers still know how to do these as computers are more affordable to more people than ever including third world countries like India. So I would not worry.

    15. Re:learn better parallel programming techniques? by 644bd346996 · · Score: 2, Insightful

      Almost everybody who can write better assembly than GCC is already working on compilers and optimization. Even GCC is better than most programmers' hand-optimized assembly. I've seen many cases over the past several years where open source projects have thrown away assembly source because the code is faster and more readable in C. (WINE in particular benchmarked their hand-optimized routines and found themselves soundly beaten by GCC.)

      These days, a similar thing is happening with vectorization. If programmers try to do it manually, odds are that they won't do better than the compiler, but they will have wasted a lot of time on it. Eventually, we will probably see the same thing for multi-threading workloads. Compilers aren't stupid, and compiler writers are some of the best programmers around when it comes to optimization.

    16. Re:learn better parallel programming techniques? by EastCoastSurfer · · Score: 1

      Low-level languages like C are powerful because they can interact (almost) directly with the hardware.

      I'm going to guess you're defining 'powerful' as fast? C is barely an abstraction above the underlying hardware. The power of C comes from being able to express what you want the hardware to do with very few leaks to the abstraction.

      Then there are other languages that are built on top of those languages that are designed to hide complexity and allow programmers to code more efficiently at the cost of non-optimized code.

      I think you're making an incorrect assumption that more abstract languages aren't optimized.

      I didn't RTFA, but if the compilers start taking liberties and "hiding the complexity" of writing multi-threaded code, then unless they are absolutely perfect how would someone truly take advantage of hardware even while programming in C?

      I'm not so sure compilers are trying to hide the complexity of multi-threaded code. Instead they are trying to take their normal peephole-style optimization techniques and expand them to use multiple cores. In this manner a program/programmer can receive the advantages of being on a multi-core machine w/o explicitly coding towards it. This is a good thing, b/c of how messed up multi-threaded programming is in most languages based on the von Neumann architecture.

      I'm sorry that I don't have the time to refine my thoughts, but basically if low-level languages (lll's) start becoming more like mid/high-level languages then how do you go back to being able to optimize code like you did back when lll's were real lll's?

      Of course there are exceptions (things like individual algos, embedded, etc...), but with the size and complexity of most systems today, optimizations are now coming from places like system design (which is often easier in a higher-level language) rather than from being able to hand-code one function in assembly. This is not to say those low-level optimizations will ever go away, just that they'll be used/needed less and less (and when they are used, they'll be packaged in a compiler or some library).

  7. Looks like something they rushed out by Doctor+Memory · · Score: 3, Informative

    I was looking at the Thread Building Blocks paper, and it reads like it was somebody's hastily-scribbled draft:

    "The Intel Threading Tools automatically finds correctness and performance issues" (The tools finds?)
    "Along with sufficient task scheduler and generic parallel patterns" (Who has insufficient task scheduler?)
    "automatic debugger of threaded programs which detects many of thread-correctness issues such as data-races, dead-locks, threads stalls" (Sarcasm fails me...)

    And that's just in the first few paragraphs, I haven't even gotten to the real meat of the article!

    I'm used to informative, well-written and reasonably complete technical documentation from Intel — WTF is this?

    --
    Just junk food for thought...
    1. Re:Looks like something they rushed out by Anonymous Coward · · Score: 0

      Who has insufficient task scheduler?

      Apparently you haven't used ......

    2. Re:Looks like something they rushed out by Red+Flayer · · Score: 1

      "The Intel Threading Tools automatically finds correctness and performance issues" (The tools finds?)
      No, the "Intel Threading Tools" is a product, in the singular -- it finds. Maybe Intel threading tools would find, but notice the subtle difference?

      "Along with sufficient task scheduler and generic parallel patterns" (Who has insufficient task scheduler?)
      OK, so it's a bit awkward to parse, but isn't it obvious by the grammar that "sufficient" modifies both "task scheduler patterns" and "generic parallel patterns"?

      "automatic debugger of threaded programs which detects many of thread-correctness issues such as data-races, dead-locks, threads stalls" (Sarcasm fails me...)
      Oh wait, nevermind. This sentence shows that the author truly can't write clearly. Silly me, thinking that the author intentionally used correct grammar, instead of stumbling upon it by accident. Sorry about that.

      I'm used to informative, well-written and reasonably complete technical documentation from Intel -- WTF is this?
      This? This is Slashdot. Maybe Intel decided to go with the flow and make it seem like they are a true Slashdot insider. It's quite a nice bit of trickery, really -- if they really want it to succeed, they'll issue the same press release tomorrow.
      --
      "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
    3. Re:Looks like something they rushed out by presearch · · Score: 2, Informative

      The Intel Compiler Lab is based in two Russian cities - Moscow and Novosibirsk.
      Probably the source of the less than optimal text.

      How's the documentation on -your- compiler coming along?

    4. Re:Looks like something they rushed out by ichigo+2.0 · · Score: 1

      WTF is this?


      WTF? No, this is SPARTA!
    5. Re:Looks like something they rushed out by Doctor+Memory · · Score: 1

      The Intel Compiler Lab is based in two Russian cities - Moscow and Novosibirsk.
      Probably the source of the less than optimal text.

      The point is, whatever tortured, twisted prose was submitted should have been edited and polished before going out with an Intel logo on it. This was a white paper on the corporate web site, not a post on some random Intel engineer's blog; different standards apply.

      Seriously, check out this opening paragraph from the Intel® 64 and IA-32 Architectures Application Note:
      TLBs, Paging-Structure Caches, and Their Invalidation

      The Intel® 64 and IA-32 architectures may accelerate the address-translation process by
      caching on the processor data from the structures in memory that control that process.
      Because the processor does not ensure that the data that it caches are always consistent
      with the structures in memory, it is important for software developers to understand how
      and when the processor may cache such data. They should also understand what actions
      software can take to remove cached data that may be inconsistent and when it should do
      so. The purpose of this application note is to provide software developers information about
      the relevant processor operation. This application note does not comprehend task switches
      and VMX transitions.

      Notice how they even get the fact that "data" is plural right? That's the kind of documentation I'm talking about.
      --
      Just junk food for thought...
    6. Re:Looks like something they rushed out by geekoid · · Score: 1

      A large company develops and release an profession compiler, decent documentation is a reasonable expectation. To say snide comments does not help, and shows that you have no real argument.

      Grow up.

      --
      The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
    7. Re:Looks like something they rushed out by Doctor+Memory · · Score: 1

      develops and release an profession compiler
      Too easy, moving on....

      To say snide comments does not help, and shows that you have no real argument.
      Um, I'm not arguing. I'm making an observation. If you disagree with me, then you're making the argument. Which is fine, just so we know where we stand. Nice non-sequitur, though.

      Grow up.
      So expressing dismay that a respected corporation is showing less-than-professional work is a sign of immaturity? Buy a vowel and solve the puzzle, honey, Real World moves and all...
      --
      Just junk food for thought...
    8. Re:Looks like something they rushed out by Mike1024 · · Score: 1

      "automatic debugger of threaded programs which detects many of thread-correctness issues such as data-races, dead-locks, threads stalls" (Sarcasm fails me...)
      Oh wait, nevermind. This sentence shows that the author truly can't write clearly.


      Couldn't you make that sentence pretty normal sounding just by removing that one errant 'of', i.e.

      "[The software includes an] automatic debugger of threaded programs, which detects many thread-correctness issues such as data-races, dead-locks, threads stalls [...]"

      Although, one would think Intel would be more careful about proof-reading their sales literature; western developers (i.e. the target of English-language technical sales literature) probably prefer not to be reminded of jobs like theirs being exported to places like India (which poorly-written technical documentation may remind them of).
      --
      "Goodness me, how unlike the FBI to abuse the trust of the American public." -- The Onion
    9. Re:Looks like something they rushed out by Glasswire · · Score: 1

      Wrong. Parse the grammar the way Intel does it not only makes sense but is very meaningful. What you're lacking are some font and other cues about what things are compound word nouns as well as some knowledge about what are legitimate threading concepts.

      "The Intel Threading Tools automatically finds correctness and performance issues" (The tools finds?)
      Try italizing the product name: The Intel Threading Tools(product name) automatically finds correctness and performance issues
      and it looks much better gramatically once you realize the product name is singular, not plural (despite the last word ending in s)

      "Along with sufficient task scheduler and generic parallel patterns" (Who has insufficient task scheduler?)
      Look at it as as function: sufficient( task scheduler and generic parallel patterns ).
      It gives you sufficient patterns for both of these. The blurb is, of course, aimed at someone who understands why threading tools need to have this kind of patterns.

      "automatic debugger of threaded programs which detects many of thread-correctness issues such as data-races, dead-locks, threads stalls" (Sarcasm fails me...)
      Perhaps sarcasm fails you because it isn't helping you? Thread-correctness, data-races, dead-locks, and thread stalls are all common terms used in describing threading issues. Just because they sound funny to you doesn't mean they are not meaningful to people this product is aimed at. If you understand what they are this sentence makes perfect sense.

      Please mod the writer down. All he/she is informing us of is his/her ignorance. There's nothing wrong with being unaware of threading concepts, but please don't suggest there's something wrong with things you just don't understand.

    10. Re:Looks like something they rushed out by stuktongue · · Score: 1

      "Please mod the writer down."

      Please do not mod the writer down.

      "All he/she is informing us of is his/her ignorance. There's nothing wrong with being unaware of threading concepts, but please don't suggest there's something wrong with things you just don't understand."

      I think you're missing the point Doctor Memory is making. The sense I'm getting is that he isn't criticizing the technical correctness so much as the quality of the writing. He has come to expect decent writing from Intel (as have I, for that matter) and is simply pointing out how what he read didn't seem to meet that expectation (and I agree there, too).

      From your post, I gather that English is not your first language. Either that, or your skills are only so-so. In any case, it is clear that Doctor Memory's English skills are substantially superior to yours. That you are willing to overlook flaws in the writing (assuming you recognize the flaws) is fine for you, but has little to do with the inherent quality of the writing, which can be important to others.

      If this were the real world, I think you'd owe Doctor Memory an apology.

      Take it easy.

    11. Re:Looks like something they rushed out by stuktongue · · Score: 1

      Hi, Doctor Memory.

      In a post below this, I defended your original position to another poster (not that you asked me to), but here I have to say something to you. I think you're incorrectly wailing on geekoid here; I believe he is commenting on presearch's comments, not yours (check the indents). In other words, he was coming to your aid, in a way.

      Take it easy.

    12. Re:Looks like something they rushed out by James_Intel · · Score: 1

      It's not up to our standards. It doesn't deserve all the abuse (some of that was explained elsewhere), but it's still not our best work. We'll update it (soon). I can't promise we'll fire the person - heck, I write like that on a bad day... I don't think I wrote this piece... but I'd better check.
      Thanks for the encouragement and entertainment. :-)

      P.S. The comments about where our development team is based were not correct... I don't think the sun sets on our team - plenty of our compiler team is in the U.S. - and the threading tools folks are mostly in Illinois (but they are worldwide too). I'm not inclined to believe it was a second language issue.

    13. Re:Looks like something they rushed out by Doctor+Memory · · Score: 1

      But, uh....ummmmm.....

      *sigh* I am such a dickhead...

      Geekoid, I apologize. Your comment does make sense, once I pull my head out far enough that I can tell what part of the thread you're on. I owe you a $BEVERAGE. Feel free to flame me back, I deserve it.

      And thanks, stuktongue, I obviously shouldn't be counted on to figure this kind of stuff out on my own...

      (Damnit! I hate being a fucktard!)

      --
      Just junk food for thought...
    14. Re:Looks like something they rushed out by stuktongue · · Score: 1

      LOL, dude. Now you don't have to be that hard on yourself. :-)

      It is nice to see someone who's capable of apology, though. You don't see much humility on Slashdot, or the Internet, these days. Like the Rev. Rodney King once said, "Can't we all just get a lawn?" :-)

      I do like your use of "fucktard"... one of my favourite words. Though I prefer to use it on others. :-)

      Take it easy, man.

      P.S.: I visited Intel's web site and struggled through their writeup on the threading tools. As a potential customer and user of those tools, I must say that they need to do a better job of presenting them to people. Having said that, it seems to me that most of the compiler documentation is rather weak in comparison to, say, the core processor-related documentation. They really could use an editor in the software tools group.

    15. Re:Looks like something they rushed out by Doctor+Memory · · Score: 1

      the product name is singular, not plural (despite the last word ending in s)
      Then how do you explain the sentence "Intel® Threading Tools consist of the following:"? "Tools" is plural, and it doesn't matter how many adjectives you throw in front of it. The usual convention to make it singular is to suffix the name with a singular qualifier, like "suite" or "collection". Substituting form for content, as you suggest, isn't going to fix it.

      Look at it as as function: sufficient( task scheduler and generic parallel patterns ).
      It's not a function, it's a poorly-written sentence. And if you knew half as much about threading and synchronization as you pretend to, you'd see that the writer actually meant to use the word "specific". So the sentence should start with "Along with specific task scheduler and generic parallel patterns..." (note the distinction between generic patterns and those specific to task scheduling). Make that fix and...you've still got a five-line run-on sentence.

      If you understand what they are this sentence makes perfect sense.
      Um, I do know what they are (although the phrase "race condition" is generally preferred to "data-race", and "dead-lock" is almost exclusively expressed as "deadlock"), and if that sentence makes perfect sense to you, you should contact your ESL teacher and get a refund. The problem is that there is a random "of" masquerading as content and obscuring the meaning of the sentence.

      I don't mind having to double-scan sentences in Slashdot posts, and I usually don't care if I run across something similar on other tech sites. But this is an Intel white paper, and I expect that things like that have been gone over by an editor, not just C&P from some engineer's e-mail. I'm also not faulting whoever wrote it; the content was fine, it just needed to be cleaned up (which is not the engineer's job). The fact that it wasn't is what is annoying me.
      --
      Just junk food for thought...
    16. Re:Looks like something they rushed out by Doctor+Memory · · Score: 1

      I can't promise we'll fire the person
      I hope you don't. As I explained elsewhere, I think the content was fine, it just needed to be cleaned up before it went out, and I don't think that is the engineer's responsibility. I just hope whoever posted it to the support web site comes in tomorrow and thinks "Something's bugging me about that TBB page, I'm just going to take a quick look and make sure..." and realizes they published the original copy instead of the official version.
      --
      Just junk food for thought...
  8. Re:Umm.. by Timesprout · · Score: 4, Funny

    Intel has added kitten whiskers and pixie dust to its compilers so your ponies can now play on multiple paddocks.

    --
    Do not try to read the dupe, thats impossible. Instead, only try to realize the truth
    What truth?
    There is no dupe
  9. OK, I'll Byte by Skjellifetti · · Score: 2, Interesting

    As the company raises the core count with each generation of new products, it will get harder and harder for programmers to manage the complexity associated with all of that available parallelism.

    As a programmer, I already have abstractions such as Active Objects. While this may make it easier for compiler writers or kernel hackers, what benefits does it bring to us ordinary mortals?

  10. The inevitable... by R2.0 · · Score: 4, Funny

    Cue "Fortran is Dead" comments in

    30
    20
    10

    --
    "As God is my witness, I thought turkeys could fly." A. Carlson
    1. Re:The inevitable... by Anonymous Coward · · Score: 0

      Fortran is LIFE!

    2. Re:The inevitable... by TeknoHog · · Score: 4, Insightful

      Fortran is dead, and it has had native parallel math since 1990. C is alive and it needs ugly hacks to get parallel math.

      --
      Escher was the first MC and Giger invented the HR department.
    3. Re:The inevitable... by Duhavid · · Score: 1

      Cue "Basic Lives!" comments in

      10
      20
      30

      --
      emt 377 emt 4
    4. Re:The inevitable... by the_greywolf · · Score: 1

      And let it stay Goddamn dead.

      It needed to be replaced 10 years ago.

      I hate it, I hate it, I hate it.

      --
      grey wolf
      LET FORTRAN DIE!
    5. Re:The inevitable... by Vituperator · · Score: 1

      How does the old saying go? Fortran favors the bald?

    6. Re:The inevitable... by cerberusss · · Score: 1

      The (young) scientists upstairs here are doing quite fine with Fortran, thank you.

      --
      8 of 13 people found this answer helpful. Did you?
  11. You're lucky... by Anonymous Coward · · Score: 1, Funny

    ... the version before this one was in ebonics.

  12. intel's product page by non · · Score: 3, Informative

    The Intel product page has somewhat more detail. It can be found here.

    --
    ...vividly encapsulates that post-Watergate/pre-punk/coked-up moment when you could trust no one, least of all yourself.
  13. And to make vector ops even simpler than in Parent by Anonymous Coward · · Score: 0
    Regular version:

    for(i = 0; i < 10; i++) a[i] = b[i] + c[i];


    Vectorized version is something like this:

    a[0..9] = b[0..9] + c[0..9];

    Obviously, the above isn't valid code, but the idea is there, I hope?
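
    For what it's worth, here is a rough C sketch of the "regular version" written so that a compiler has a fair chance of vectorizing it by itself. This is only an illustration, not something taken from Intel's documentation: the function name vec_add is made up, and the restrict qualifiers are my assumption about what the compiler needs (a promise that the arrays don't overlap). Whether the loop actually becomes SIMD instructions depends on the compiler, its flags, and the target CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element-wise a[i] = b[i] + c[i].
 * `restrict` tells the compiler the three arrays do not overlap,
 * so every iteration is independent of the others. */
void vec_add(float *restrict a, const float *restrict b,
             const float *restrict c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + c[i];   /* no iteration reads another's result */
    }
}
```

    Calling vec_add(a, b, c, 10) gives the same result as the loop in the post. The source stays scalar; any a[0..9]-style vector instructions are the compiler's doing.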
  14. Fortran is dead? by Anonymous Coward · · Score: 0

    I thought it was BSD that was dead?

    It is, you know.

  15. What? That's nothing! by Anonymous Coward · · Score: 0

    There was an ad today on the slashdot mainpage that read: "Want to jump of your version control tool?"

  16. Free for non commercial use? by Anonymous Coward · · Score: 1, Informative
  17. Re:Umm.. by repvik · · Score: 4, Funny

    OMG! PONIES!!!

  18. Re:Umm.. by Anonymous Coward · · Score: 0, Flamebait

    You know, it's not high school here. You don't have to pretend to be stupid. It will actually make people think worse of you, not better.

    And if you simply are ignorant, you could always read about the things in the summary. You might learn something that way.

    But has /. really got to the stage that people think it's somehow clever to be stupid? News for nerds, and all that....

  19. Some of us only want to *USE* it by Anonymous Coward · · Score: 1, Insightful

    If I am writing a quantum physics calculation package, or compiling one (let us say... Molpro 96), I want it to correctly use the many cores I assign it to run on, using a highly parallelized Fortran compiler. I do not want to know how and why it does it; I just want it to do it. I ain't a computer scientist, I am not writing a thesis on computer science, and I don't care an iota about this. I leave that to the computer scientists. Neither does my chief care about academic computer science. Same for Intel multicore. The fun fact is that most of us want to use the power of the many cores, and don't care a bit about the how, why, etc.

  20. Re:And to make vector ops even simpler than in Par by Andrew+Kismet · · Score: 0

    That's interesting, but what if step 11 of the loop is dependent on step 10? How does one vectorise that?
    I can imagine vectorisation of loops working alright for basic loops like the one you described, which'll help in a number of cases, but it's not going to scale exceptionally well. It's good, but it's nothing amazing if I'm reading this right.
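
    To make the question concrete, here is a hedged C sketch of the two cases side by side. The function names scale and running_sum are invented for illustration. In the first loop every iteration stands alone; in the second, each step needs the value produced by the step before it (a loop-carried dependency), so the iterations can't simply be done several at a time. As far as I know, a straightforward auto-vectorizer will generally leave a loop like the second one scalar unless the algorithm itself is restructured.

```c
#include <assert.h>
#include <stddef.h>

/* Independent iterations: out[i] depends only on in[i]. */
void scale(float *restrict out, const float *restrict in, float k, size_t n)
{
    assert(out != NULL && in != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = k * in[i];
    }
}

/* Loop-carried dependency: x[i] needs the x[i-1] that was
 * just written by the previous iteration. */
void running_sum(float *x, size_t n)
{
    assert(x != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        x[i] += x[i - 1];
    }
}
```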

  21. I dont understand this statement: by JustNiz · · Score: 5, Insightful

    >> As the company raises the core count with each generation of new products, it will get harder and harder for programmers to manage the complexity associated with all of that available parallelism.

    I'm very surprised and disappointed by how pervasive these incorrect myths are, even amongst supposedly technically knowledgeable groups:
    a) Writing multithreaded code is terribly difficult
    b) You need to implement code to have the same number of threads as your target hardware has cores
    Neither of these is true, at least for the PC marchitecture.

    The way to develop multithreaded code is to exploit the natural parallelism of the problem itself. If the problem decomposes down most neatly into one, three or 6789 threads, then design and write the implementation that way. Consequently the complexity of the problem does not increase as the number of cores available increases.

    In the PC architecture case, attempting to design your code based on the number of cores in your target hardware just leads to a twisted and therefore bad and also non-portable design.

    I'm surprised how few developers seem to understand that in fact it's OK, normal and often desirable to have more than one application thread running on the same core. In fact you really can't ensure or even assume that your multi-threaded app will get one core per thread even if the hardware has enough cores, or that it will work best if it does, as core/thread allocation is dynamically scheduled by the OS depending on loading. Not to mention there are all sorts of other apps, drivers and operating system tasks running concurrently too, so depending on each core's load, one app-thread per core may not actually be the optimal approach anyway.

    1. Re:I dont understand this statement: by vidarh · · Score: 1
      The problem is that if you have a problem that decomposes neatly into four parts, and you want to be able to take advantage of new systems with far more cores, the work needed to decompose the problem further may be orders of magnitude more complex than getting it to decompose into the original four parts. The problem isn't when it decomposes naturally into more parts than you have cores, but when it decomposes into fewer parts.

      Developers who fail to handle that will be unable to compete with those who can, as the number of cores in relatively low-end systems increases and not parallelizing your apps sufficiently leaves you unable to take advantage of the full potential of the systems they run on.

    2. Re:I dont understand this statement: by bcd · · Score: 1

      >> If the problem decomposes down most neatly into one, three or 6789 threads, then design and write the implementation that way.

      Agreed. But the problem is that most programs are inherently serial in nature. Intel and others are targeting multi-core everywhere, not just the highly parallel scientific community, but the average desktop as well.

      These Intel tools are trying to solve the problem by letting you write an apparently single-threaded application, that the compiler turns into something multi-threaded under the covers. There's no harm in not exploiting the extra parallelism available, but you're missing out on some potential performance if you don't.

      The other approach is to make programmers think about the parallelism. In my experience, most programmers just aren't good at this. Some argue that we need better primitives than just semaphores, queues, etc., but I think it's human nature to think serially and that "thinking parallel" all the time just isn't going to happen.

      Personally, I don't think this will be an issue at all for several more years, because systems are typically running an SMP-aware OS and are running lots of processes/threads at a time anyway (just ps -ef or look at the Windows Task Manager and look how much is there!) Users are also becoming more sophisticated and multitasking at the user level, e.g. web browsing, listening to music, whatever else all at the same time. Parallelism at the top should be exploited first and more fine-grained parallelism can be dealt with later, IMO.

    3. Re:I dont understand this statement: by mandolin · · Score: 1
      In the PC architecture case, attempting to design your code based on the number of cores in your target hardware just leads to a twisted and therefore bad and also non-portable design

      Additionally, at least for "embarrassingly parallel" problems, it is easy enough to get the number of online processors at runtime, and (slightly harder) make the program use that information to decide how many worker threads to use.
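
      A minimal sketch of that idea in C, with assumptions stated up front: sysconf(_SC_NPROCESSORS_ONLN) is a widely available extension (glibc on Linux, macOS, the BSDs) rather than something POSIX requires, and choose_worker_count is a name I made up for this example. It only picks a number; creating and feeding the worker threads is left out.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: use one worker per online processor, but never
 * more workers than there are independent work items. Falls back to a
 * single worker if the processor count cannot be determined. */
long choose_worker_count(long work_items)
{
    assert(work_items > 0);
    long cores = sysconf(_SC_NPROCESSORS_ONLN);  /* -1 on failure */
    if (cores < 1) {
        cores = 1;
    }
    return cores < work_items ? cores : work_items;
}
```

      Capping by the number of work items covers the case raised elsewhere in the thread, where the problem splits into fewer pieces than the machine has cores.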

    4. Re:I dont understand this statement: by Anonymous Coward · · Score: 0

      Your suggestion works in fantasy land where threads are free and switching between them is free also. In the real world, they aren't. They cost stack space and OS data structures. Having lots of threads in a runnable state is not a good thing for most server applications, because you will be spending a lot of time context switching (i.e., bookkeeping) and not as much time getting real work done.

    5. Re:I dont understand this statement: by KingMotley · · Score: 1

      There is a performance (and complexity) hit when you run many more threads than you have cores. Each thread, as it is switched in and out of the core, has some pieces of data that must also be saved and restored to get the core back into the state it was in when that thread was last executing. At a very minimum it would be things like the instruction pointer, the processor flags, and the processor registers. So switching between threads unnecessarily will cause the process to actually run slower than if it were single-threaded (or used fewer threads). The most efficient code will use one thread per core, and each thread will use the processor at near maximum. However, if you cannot get a thread to use the full core, then you are better off having another thread that can be switched in when another is in some kind of wait state. It's a balancing act between being able to use the full potential of multi-core processors and still running efficiently on single-core (or fewer-core) machines.

    6. Re:I dont understand this statement: by ratboy666 · · Score: 2, Insightful

      A couple of points:

      1 - If the communication or thread switching overhead exceeds the thread computation, it is not worth threading.

      2 - It is (unfortunately) easy to build "lock stepping" into otherwise independent threads. These systems scale from 1..n cores; after n cores no further scaling is seen.

      3 - It *is* difficult to build correct parallel systems. Especially with points 1 and 2 in mind (and, yes, I *have* built parallel high-speed device drivers that are lock-free to avoid switching).

      4 - *Proving* that a multi-thread program is correct is quite difficult, especially when using lock-free constructs.

      --
      Just another "Cubible(sic) Joe" 2 17 3061
    7. Re:I dont understand this statement: by Anonymous Coward · · Score: 0

      Writing multithreaded code is terribly difficult

      Are you serious? It most certainly is difficult.

      The way to develop multithreaded code is to exploit the natural parallelism of the problem itself.

      What if there isn't any? This is typically the case.

    8. Re:I dont understand this statement: by sonofagunn · · Score: 1

      Every PC/Server out there today will be doing context switching between processes and threads. Check out how many processes are running on your PC. Many of those processes are multi-threaded. So regardless of how you write your program, there will be context switching done by the one or many cores available. Modern Linux distributions switch by default 1000 times every second, even if you're running a single threaded app. If every other thread/process is idle, your app can still get 99.9% of the runtime of the CPU. Furthermore, without processor affinity, your single threaded app may get swapped around to different cores or processors while all of this stuff is going on. The overhead of context-switching is overrated. If a problem can be parallelized, it should be, even if the target system is a single core system.

    9. Re:I dont understand this statement: by JustNiz · · Score: 1

      I disagree with you that this typically is the case, as that would mean absolutely every operation of your code is fully dependent on the result of the previous one. I've been a professional software developer for 25+ years in a variety of industries and haven't come across a problem like that yet.

      I think the reason that most people think most problems have no parallelism is that they're used to thinking about solving problems in terms of sequential steps and implementing code that way, so they are not experienced or intuitive in identifying possible areas of parallelism and therefore overlook some real possibilities.

      But given the case that there is absolutely no opportunity for parallelism, then accept it and write a single-threaded application as forcing a design that uses multiple threads won't gain you anything except extra inefficiency.

    10. Re:I dont understand this statement: by KingMotley · · Score: 1

      Please, go and write some multi-threaded applications as I have for the past 10 years. Then you can refute your own posts.
      I am fully aware of how thread context-switching affects software performance as I have not only used many of the common ones out there, but I have even written a few of my own. While I agree that problems that can be parallelized are often better if they are written as such, it is important that you always keep in mind the performance hit you take for doing that. Taking a simple application and spawning off 6,000 threads just because it can be parallelized to that extent is a bad idea, and will have serious performance issues.

  22. Yeah, GCC is Key... by mkcmkc · · Score: 1

    I recently downloaded Intel's compiler to see whether my C++ code would run faster on it--I ended up giving up on it (for now) after spending a day trying to get it to work. I'm sure their compiler has many whizzy features in it, but for me, they don't really matter unless they're in GCC. I hope Intel will realize that it's in their interest to migrate these advances there.

    --
    "Not an actor, but he plays one on TV."
  23. Would the OS benefit from using this? by wazzzup · · Score: 3, Interesting

    I know OS X is compiled using GCC but I wonder if Apple would see performance gains by using it? If they did, would it somehow introduce problems? Basically, I'm wondering if there would be a downside to using the Intel optimized compilers as opposed to all-purpose GCC compiler.

    As an aside, Linux is obviously compiled using GCC but I wonder if Microsoft compiles Windows using the Intel compilers?

    1. Re:Would the OS benefit from using this? by Anonymous Coward · · Score: 0

      For applications that spend most of their time in user mode, it won't matter if Apple uses icc. Intel's compiler generally can produce faster code than gcc. Until now, though, gcc was the only way to compile 64-bit EM64T code on a Mac (and then, restricted to command line apps, until 10.5 Leopard comes out to support GUI apps). EM64T/AMD64 code runs faster than X86 code in most applications, due in large part to twice the number of registers being available, so gcc was both the only 64-bit game in town for Macs, and it tended to produce the fastest code on a Mac (as long as it was 64-bit). The new icc can produce both 32-bit X86 and 64-bit EM64T code, though, which should squeeze even better performance out of MacIntels. For apps that spend most of their time in user mode, this will be a big benefit, regardless of whether Apple uses icc.

    2. Re:Would the OS benefit from using this? by niteice · · Score: 1

      Why would Microsoft use icl? They have their own x86/x64-targeted optimizing compiler you know...

      --
      ROMANES EUNT DOMUS
    3. Re:Would the OS benefit from using this? by tuttle · · Score: 1

      Considering that the Intel compiler doesn't support Objective-C, Objective-C++, or the PPC architecture, I would say it would be very difficult to compile universal binaries or anything that touches Cocoa or any of the newer libraries. I suppose if the code in question is lower level C code, you could write a tool chain that used icc for some things and gcc for others, provided the two sets of objects link correctly. It really sounds like a lot more trouble than it's worth; considering Apple already works on features of gcc, it would make more sense for them to help out on the optimizer as well.

    4. Re:Would the OS benefit from using this? by Anonymous Coward · · Score: 0

      A bit.. but not a lot, I'd guess. Where these compilers shine is heavy number crunching tasks, which OS's tend not to have a lot of. They're usually more get-in, do something reasonably generic, get-out. The downsides would be (as mentioned) the lack of objective-C, and other compatibility issues (although I believe that GCC compatibility is something Intel do aim for.)

      But something like a software-RAID could benefit from this which is the most obvious candidate for bulk number crunching in a typical OS kernel. Possibly the audio pipeline - that sort of thing. It's hard to see a VM subsystem significantly benefiting. Image, video and sound manipulation programs would be a more obvious target for using this sort of compiler than a kernel though.

    5. Re:Would the OS benefit from using this? by EvanED · · Score: 1

      I wonder if Microsoft compiles Windows using the Intel compilers?

      Microsoft uses its own cl.exe to build Windows.

    6. Re:Would the OS benefit from using this? by nbritton · · Score: 1

      I wonder if Microsoft compiles Windows using the Intel compilers?

      Microsoft uses its own cl.exe to build Windows.

      GIGO.
  24. No and yes by Sycraft-fu · · Score: 2, Informative

    No, they won't add them to GCC. Intel's compiler competes with GCC and it is the best there ever was. In every test I've ever seen on Intel chips, it comes out ahead and I'm sure they've no interest in changing that. However yes, the docs are out there. Intel's processors are extremely well documented and you can get everything you need. The problem isn't that the GCC people are having to guess how the processors work, the problem is that their coders aren't as good as Intel's at optimising their compiler. This isn't helped by the fact that GCC targets many architectures where the ICC is only for one.

    However don't expect Intel to help GCC out. Their answer will just be "buy the ICC".

    1. Re:No and yes by Anonymous Coward · · Score: 0

      Nope. Intel has multiple times paid for development on GCC. The Itanium, for example, had GCC and full Linux distributions before windows (and I think before wide release).

      Will they port their newest features? Probably not. Will they ensure that GCC advances at a pace where Linux runs at least as well on Intel as on AMD? Probably. They have nothing to gain from closing specs and not publishing their research. Especially as work that is shown to work on GCC can later be reimplemented from scratch on other compilers, providing Intel processors an edge.

    2. Re:No and yes by Anonymous Coward · · Score: 0

      Also, the goals of GCC and ICC are different.

      GCC's goal is to be portable; it may not generate the most efficient code possible, but it will generate code for dozens of platforms.
      ICC's goal is to generate the most efficient code, but only on x86/x86-64.

    3. Re:No and yes by AuMatar · · Score: 1

      In addition, GCC is meant to be a cross-compiler. It works for Alpha, Sparc, RISC, x86, and a dozen others. To get there without writing N different compilers, they need to use techniques that are architecture neutral. That means purposely passing up a lot of optimizations that are incompatible with their intermediate representations. So ICC will always be faster, since it only targets x86.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    4. Re:No and yes by smallfries · · Score: 4, Informative

      Well, no actually you can't. If you've ever spent any time going through the 1000 page Intel Optimisation Guide for the x86 then you would know that they don't spell out all of the trade offs explicitly. They describe enough to point you in the right direction but they keep a lot back. Partially because the behaviour of these chips in certain usage patterns isn't even defined by the design - it's a side-effect of several other parts of the chip design interacting. So the best that you can do is suck it and see - and in general it changes not between major ISA revisions but on individual models.

      Now, if you're Intel then you have the time and the money to work out exactly how to exploit these tradeoffs to schedule threads effectively. But you don't want to give that away for free. From the (very scanty) marketing bullshit that was linked to, it would appear that they've added an Intel-specific threading library (probably with a POSIX interface). Separate from this are a profiling tool and a multi-threaded debugger (the latter of which is non-trivial). While any debugger will let you skip across threads, letting you do it in a deterministic manner to look for race hazards is much harder.

      The analysis tools sound nice, but the bolt-on library is nothing special. It's purely to win a few synthetic benchmarks and gain some market share for ICC and therefore more "Made for Intel" applications in the market. I'm cynical about the library because what is broken about the threading model in C/C++ would take more than a library to fix. It would require redesigning the language down to the ground and choosing a different set of control constructs.

      So finally, when you claim that it's because Intel has "better" coders, you don't know what you're talking about. I know a few guys who code GCC for a living, and they are grade A coders. It is because Intel has moved the goalposts. It's not so much that GCC targets multiple architectures, it's that they are trying to stick to (relatively) standard C, whereas Intel is willing to redefine where the semantic gap sits if they can squeeze out a little more performance. Their attitude is screw portable code - talking across different compiler vendors here, rather than chip vendors. If what they need to squeeze into their compiler is no longer "C" strictly speaking, then they don't care. The gcc guys do.

      Ah yes, and portable code can be a smaller window than you expect. That weighty 1000 page Intel document is sitting comfortably next to the AMD equivalent, which differs in surprising places.

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    5. Re:No and yes by Tickletaint · · Score: 1

      GCC, the lowest common denominator; jack of all architectures, master of none? I'll take whichever compiler is optimized for the architectures I'm targeting, even if that means I have to juggle several different ones.

      --
      Make Slashdot readable! See journal.
    6. Re:No and yes by Anonymous Coward · · Score: 0

      What absolute bullshit - and exactly what you would expect from a success-hating FUD-spreading slashdot noob. Intel gives you the information required; you lack the experience or aptitude (or both) to correctly make use of it. Your assessment of the quality of GCC coders is irrelevant, first as you are probably not qualified to judge that, secondly because GCC must cross-compile for dozens of architectures. As other posters have noted the Intel compiler is a tool designed for one particular job (and gee, it's the best tool for that job!). GCC is like a chisel that you can use to open a box, turn a screw or maybe even carve wood with. Works, but it's not the best choice.

    7. Re:No and yes by suckmysav · · Score: 1

      "Intel has multiple times paid for development on GCC. The Itanium, for example, . . . "

      Good grief, intel were so desperate to get someone, anyone to write code for the Itanic that I'm not at all surprised they would help the GCC guys out. They simply couldn't wait for Itanic support to trickle through at the normal rate.

      Got any better examples?

      --
      "You can't fight in here, this is the war room!"
    8. Re:No and yes by smallfries · · Score: 1

      What absolute bullshit - and exactly what you would expect from a success-hating FUD-spreading slashdot noob.


      Gee, that makes what you have to say insightful, I guess I'll read the rest of your zzzzzzzzzzzzzzzzzzzzz
      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    9. Re:No and yes by Anonymous Coward · · Score: 0

      Wouldn't it be better if you spent that time revisiting your algorithm and data structure choices so that they aren't as fragile in the face of missed optimizations by compilers?

    10. Re:No and yes by gnasher719 · · Score: 1

      '' Intel's processors are extremely well documented and you can get everything you need. ''

      I would prefer getting the low level details from Intel (who should know them) instead of getting them from Agner Fog (whose tremendous work in figuring out lots of details I really really appreciate).

    11. Re:No and yes by Nutria · · Score: 1
      In addition, GCC is meant to be a cross-compiler. It works for Alpha, Sparc, RISC, x86, and a dozen others. To get there without writing N different compilers, they need to use techniques that are architecture neutral. That means purposely passing up a lot of optimizations that are incompatible with their intermediate representations.

      I don't think that's true.

      Many versions ago (4.0?), gcc was refactored to have a front-end parser that created an intermediate language that is then processed by the target CPU code generator. This also made it easy to support multiple source languages, since they each can have a front-end parser that generates the intermediary language.
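
      A toy C++ sketch of that front-end / intermediate language / back-end split. Everything here is invented for illustration (the `Instr` form, both "targets" and their mnemonics); it is not GCC's actual intermediate representation, only the general shape of the design.

```cpp
#include <string>
#include <vector>

// Hypothetical, language-neutral intermediate form: a list of simple ops.
enum class Op { LoadConst, Add };
struct Instr {
    Op op;
    int value;  // only meaningful for LoadConst
};
using IR = std::vector<Instr>;

// "Front end": turns the source expression a + b into IR. A second
// source language would get its own front end but emit the same IR.
IR front_end_add(int a, int b) {
    return {{Op::LoadConst, a}, {Op::LoadConst, b}, {Op::Add, 0}};
}

// Back end for a made-up stack machine "A".
std::string emit_target_a(const IR& ir) {
    std::string out;
    for (const Instr& in : ir) {
        if (in.op == Op::LoadConst) {
            out += "push " + std::to_string(in.value) + "\n";
        } else {
            out += "add\n";
        }
    }
    return out;
}

// Back end for a made-up target "B" with different mnemonics.
std::string emit_target_b(const IR& ir) {
    std::string out;
    for (const Instr& in : ir) {
        if (in.op == Op::LoadConst) {
            out += "LDC #" + std::to_string(in.value) + "\n";
        } else {
            out += "ADD\n";
        }
    }
    return out;
}
```

      Adding a language means writing one new front end, and adding a CPU means writing one new back end, instead of one compiler per language/CPU pair.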

      Of course, DEC pioneered this technique back in the early 1980s, which made it possible to link OBJ files (written in different languages) all into the same EXE.

      --
      "I don't know, therefore Aliens" Wafflebox1
  25. Re:Umm.. by geekoid · · Score: 1

    PONIESOMGBBQ!!!?!!

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  26. Re:Umm.. by Just+Some+Guy · · Score: 4, Funny

    Intel has added kitten whiskers and pixie dust to its compilers

    You're thinking of IBM.

    --
    Dewey, what part of this looks like authorities should be involved?
  27. I got yer dead Fortran... by Anonymous Coward · · Score: 0

    ...hangin' right here.

    Note the caption at the bottom of the photo that says how FORTRAN will make the machine easy to use!

  28. They don't have to. by Ayanami+Rei · · Score: 1

    PGI and Sun both make auto-parallelizing and optimizing Fortran/C/C++ compilers specifically for K8 (and i386).

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  29. On the flip side... by Ayanami+Rei · · Score: 1

    Intel is not trying to kill GCC or anything. They try very hard to make ICC compatible with gcc and g++ (ABI and command line interfaces)... so that you can just set CC=icc in your makefile and be on your way.

    It was a big source of pride for them that they got the linux kernel to build in icc without patching. *eye roll*

    But they don't expect every linux user to buy ICC or anything. They position it for use for performance reasons.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
    1. Re:On the flip side... by EvanED · · Score: 1

      It was a big source of pride for them that they got the linux kernel to build in icc without patching. *eye roll*

      Eye roll? Why?

      The Linux kernel has disclaimers saying that it's made for GCC and they don't care if it works on other compilers, and it makes pretty heavy use of GCC extensions. The kernel is hardly portable across compilers at all.

  30. You teach... by iknownuttin · · Score: 1

    You teach freshman CS, or CE, don't you?

    --
    I prefer Flambe as apposed flamebait.
  31. Re:And to make vector ops even simpler than in Par by James_Intel · · Score: 2, Informative

    You're right - vectorization by itself can't handle step 11 being dependent on step 10. Assuming there isn't a magical way to rewrite the loop to remove the dependence (which is the first thing the compiler will try to discover and do for you, but usually it can't), you need to look at pipelining: software pipelining on a single core, or parallelism on multi-core. But you'll have to have the right processor-to-processor interconnect to match the work to get multiprocessor pipelining to do what you want. Software pipelining can be very effective on loops with dependencies from one iteration to the next.
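
    A small C++ sketch of the distinction, with made-up function names. In `scale`, every iteration is independent, so an auto-vectorizer is free to process several elements per instruction. In `running_sum`, iteration i reads the value that iteration i - 1 just wrote, which is the "step 11 depends on step 10" case. Whether a given compiler actually vectorizes either loop depends on its version and flags, so treat this as an illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Independent iterations: out[i] depends only on in[i].
// A vectorizing compiler may handle several elements at once.
void scale(const std::vector<float>& in, std::vector<float>& out, float k) {
    assert(in.size() == out.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i] * k;
    }
}

// Loop-carried dependence: a[i] needs the a[i - 1] written by the
// previous iteration, so the iterations cannot simply run side by side.
void running_sum(std::vector<float>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        a[i] = a[i] + a[i - 1];
    }
}
```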

  32. It ain't worth it by Anonymous Coward · · Score: 0

    I've tried to use the Intel C++ compiler on dozens of projects, and the gains - if any - are never worth the extra cost and the work required to set up your environment and port around the inconsistencies with gcc. The exception to this rule is scientific code.

    'Course if Intel truly can auto parallelize non-threaded code, that would be enough to be worthy of reconsideration. But I have a feeling that their claim - one of the holy grails of compilers - is going to fall far short of what you really get out of it.

    They do make a mighty fine Fortran compiler, though.

  33. My test data by snikulin · · Score: 1

    I got MKL 9.1 and Fortran 10 today.

    Very basic test (3D DFFT and complex(4) matrix math) on dual X5160 on WinXP32:
    SW                Time(ms)   Relative time(%)
    9.0 MKL + 9.1 F   3100       100.00%
    9.1 MKL + 9.1 F   2900       93.55%
    9.1 MKL + 10.0 F  2720       87.74%

    In sum: 12% improvement, 6% per piece.
    Not bad for my purposes!
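
    For anyone checking the arithmetic, a tiny C++ sketch. The function names are made up; the timings are the ones in the table above, and the percentage column there is each run's time relative to the 3100 ms baseline, so lower is better.

```cpp
#include <cassert>

// Time of a run as a percentage of the baseline time (lower is better).
double relative_time_percent(double baseline_ms, double run_ms) {
    assert(baseline_ms > 0.0);
    return 100.0 * run_ms / baseline_ms;
}

// Percentage of the baseline time that was saved.
double improvement_percent(double baseline_ms, double run_ms) {
    return 100.0 - relative_time_percent(baseline_ms, run_ms);
}
```

    With the numbers above, `improvement_percent(3100, 2720)` comes to about 12.3%, and the MKL-only step `improvement_percent(3100, 2900)` to about 6.5%.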

    ----
    Why ./ does not have <pre></pre>?

  34. Just wait for Automatic Parallelization by Anonymous Coward · · Score: 0

    Once automatic parallelization, the holy grail of parallel computing, comes in, all these tools and models that give users abstractions for parallel programming will be swept away. Automatic parallelization is hard - a lot of research has been done for nearly a decade and a half, but there is nothing production-level yet. However, something is going to come out in the next couple of years.

    Still, parallel programming models/libraries will be around, except that a parallelizing compiler would make use of them instead of the user using them directly. Also, when performance is critical, and depending on the application, a user or a black-belt parallel programmer may want to do the parallelization himself - he knows best where the parallelism is. So, I think automatic parallelization will once and for all settle the problem of bringing the benefits of emerging highly parallel architectures to the users. How tough can it be? It is not more complex than the complexity that exists in a current-day sequential compiler.

  35. Seems like Intel is following Sun's lead on this by vuglusk · · Score: 1

    It seems that the two companies are in an arms race as far as multicore developer tools are concerned. Sun released Sun Studio 12 yesterday and the buzz around the release seems to be very similar. It also seems that Sun Studio 12 is a nicer integrated package overall, with not just the compilers but a bunch of tools and a pretty convincing IDE.

  36. Why don't AMD have a compiler? by beswicks · · Score: 1

    Intel clearly think that it is important to offer compilers specifically for their chips, and I see that as a good thing for users of Intel based systems who want to get the most out of their hardware, so my question is why AMD do not make compiler software for their chips. They do after all have their own set of special extensions, so would they not benefit from creating a compiler armed with their own "inside knowledge"?

    Seems odd to me to make the chips but not the software to allow people to fully utilize them, and if GCC et al are good enough, why do Intel and IBM offer their own compilers for the processors they make?

    1. Re:Why don't AMD have a compiler? by Anonymous Coward · · Score: 0

      Ummm....because compilers don't just spring into existence in the same way that money doesn't just grow on trees. AMD can't afford a compiler team. If Intel's compiler does a good job on AMD CPUs, why bother (hint - it does)?

    2. Re:Why don't AMD have a compiler? by beswicks · · Score: 1

      Well this is a bit old, but apparently it doesn't, and I cannot see it being all that likely that Intel will spend time and money making their compiler kickass on AMD.

      Intel compiler crippled on AMD chips!
      http://www.swallowtail.org/naughty-intel.html

      Clearly it would take money and skills to have a compiler team at AMD, but that doesn't answer why they don't want to spend that money to compete with their main competitor in an area vital to their processors being the best they can be.

  37. Googled by beswicks · · Score: 1

    Top result on google for "amd compiler" is...

    http://www.amd.com/epd/desiging/fusionpartners/prodbytype/3.developme/11.compiler/index.html

    list of 3rd party software from the era of Win98/NT, just a little out of date.

  38. Riddle me this by Anonymous Coward · · Score: 0

    How do you know that SSE3 will be supported in a chip down the line? It could be that to reduce die waste you'll drop it. So how do you handle that? Disable SSE4 for it?

    Or do you ask the CPU what it supports and include what it says it does?

    If you do it that way, then why can't that be so for AMD processors? Ask them what they support and if they say "SSE3" use SSE3 requests.

    It's not as if we're asking you to support AMD's version of an SSE-like library that isn't SSE.
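
    A rough C++ sketch of "ask the CPU what it supports", under some loud assumptions. `__builtin_cpu_init` and `__builtin_cpu_supports` are GCC/Clang builtins for x86; other compilers need their own CPUID wrapper. The `CpuFeatures` struct, the `CodePath` enum and both functions are hypothetical names, and none of this claims to show how Intel's compiler really picks its code paths.

```cpp
// Hypothetical record of what the processor reports about itself.
struct CpuFeatures {
    bool sse2;
    bool sse3;
    bool sse41;
};

enum class CodePath { Generic, SSE2, SSE3, SSE41 };

// Query feature bits instead of looking at the vendor string.
// Outside GCC/Clang on x86 this sketch just reports no extensions.
CpuFeatures query_features() {
    CpuFeatures f = {false, false, false};
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    f.sse2 = __builtin_cpu_supports("sse2") != 0;
    f.sse3 = __builtin_cpu_supports("sse3") != 0;
    f.sse41 = __builtin_cpu_supports("sse4.1") != 0;
#endif
    return f;
}

// Each path lists every extension it relies on, so a chip that
// dropped SSE3 but kept SSE4.1 would fall back to the SSE2 path.
CodePath choose_path(const CpuFeatures& f) {
    if (f.sse41 && f.sse3 && f.sse2) return CodePath::SSE41;
    if (f.sse3 && f.sse2) return CodePath::SSE3;
    if (f.sse2) return CodePath::SSE2;
    return CodePath::Generic;
}
```

    Nothing in `choose_path` looks at who made the chip, which is the point of the question above: the decision rests only on the reported feature bits.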

    1. Re:Riddle me this by James_Intel · · Score: 1

      Right now, we are designed to assume SSE4 support will include SSE3, SSE2, SSE and MMX. Since that fits all processors I know of - it is a reasonable dependency. If that changes in the future, we could revisit it.

  39. No speak technese? by Anonymous Coward · · Score: 0
    What type of nerd on slashdot wouldn't be able to guess:

    "On the data parallelism side, the Intel C++ Compiler and Fortran Professional Editions both sport improved auto-vectorization features that can target Intel's new SSE4 extensions.
    The SSE instructions are Single Instruction, Multiple Data (SIMD): given 1 add and 4 pairs of numbers to be added, all 4 pairs can be "added" at the same time (in parallel). One example of a vector is a direction (amounts right, up and forward, for 3 dimensions) together with a magnitude (amount, size or length). In a 3-D, computer generated scene with objects moving from one frame to the next, computing the new positions of the moving objects uses vector arithmetic. When you have multiple objects or points to update, you need to perform multiple arithmetic operations on each of the vectors -- all of which can be done in parallel. So if you have SIMD instructions that can do 4-8 "ops" in parallel, you can speed up the "op" calculation time by 4-8x. The compiler can further optimize by aligning data to be on a single cache line.
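
    To make the "1 add, 4 pairs of numbers" point concrete, here is a plain C++ sketch with hypothetical names. `Vec4` stands in for one 128-bit SSE register holding four floats, and `add4` spells out lane by lane what a single packed add does in hardware. `advance` is the kind of loop an auto-vectorizer looks for; whether it is actually turned into SSE instructions depends on the compiler and its flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for one 128-bit SSE register: four packed floats.
struct Vec4 {
    float lane[4];
};

// What one packed add does: four independent additions.
// Real SIMD hardware performs these in a single instruction.
Vec4 add4(const Vec4& a, const Vec4& b) {
    Vec4 r;
    for (int i = 0; i < 4; ++i) {
        r.lane[i] = a.lane[i] + b.lane[i];
    }
    return r;
}

// Move every object by its velocity for one frame. No element depends
// on another, so the compiler may process four (or more) per step.
void advance(std::vector<float>& pos, const std::vector<float>& vel) {
    assert(pos.size() == vel.size());
    for (std::size_t i = 0; i < pos.size(); ++i) {
        pos[i] += vel[i];
    }
}
```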

    For thread-level parallelism, the compilers support the use of Intel's Thread Building Blocks for automatic thread-level optimization that takes place simultaneously with auto-vectorization... Intel is encouraging the widespread use of its Intel Threading Tools as an interface to its multicore processors.
    "Intel's Thread Building Blocks, Intel Threading Tools" sounds like some way to tell the compiler which parts of code can be done at the same time and perhaps encourage programmers to think about places in their code or how to structure their code to take advantage of possible parallelism on a variable number of cores. Not sure exactly, but those should be essential points. If I thought Intel was affordable (their competitor, gnu is free) for a home user, part-time student, I'd visit the website to learn more. Who knows, they might have repriced their compiler to be under $100 with $60-80 upgrades, but not betting lunch on it.

    As the company raises the core count with each generation of new products, it will get harder and harder for programmers to manage the complexity associated with all of that available parallelism. So the Thread Building Blocks are Intel's attempt to insert a stable layer of abstraction between the programmer and the processor so that code scales less painfully with the number of cores."
    Code scaling for multi-core machines is "painful" (for most programmers) and is going to get more so. Intel wishes to make it less so. Otherwise end users and consumers will see little benefit in going "higher and higher" in core count. That might make it more difficult to sell newer, faster processor designs except to businesses.

  40. Re:Seems like Intel is following Sun's lead on thi by htd2 · · Score: 1

    Interesting, because Sun Studio 12 is free and the Intel compiler suite is on a time-limited evaluation. You only pay for Sun Studio if you want commercial support for the product.

    Sun Studio supports x86-64 and SPARC with Linux and Solaris but does not support Windows, so you would need to look for something else if you were in the market for a suite that did provide Windows support.

  41. Re:Umm.. by harryk · · Score: 1
    --
    think before you write, it'll save me moderator points.
  42. Re:Umm.. by RallyMedia · · Score: 2, Funny

    With Intel's new enhancements, they are now re-labeled as PWNies!

  43. Re:And to make vector ops even simpler than in Par by LWATCDR · · Score: 1

    Vectors and multithreading are two different things that are kind of related. To answer your question: you cannot, but then that type of loop wouldn't be a vector.
    Not everything can be a vector and multithreading code doesn't eliminate sequential code, but sometimes you still use the vector instructions for floating point ops. Why? Intel has been putting a lot more effort into SSE than FPU so SSE floating point is often faster than the FPU.
    What I want to know is this: when compiling under Windows you can specify Q6 to get the best performance out of the PPRO/P3 family and Q7 to get the best out of Netburst, so which would give you the best performance on the Core? I am leaning towards using Q6 for a while since P3s tend to run slower than P4s and I would rather my code be usably fast on both than super fast on the P4.

    --
    See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
  44. Holy Cow Batman! by ziggit · · Score: 1

    Wow, people still use Fortran? News to me.

  45. Eye roll for a few reasons: by Ayanami+Rei · · Score: 1

    While I'm all for compatibility, since the linux kernel doesn't even _pretend_ to work on other compilers, it's not exactly the sort of data point I would want to have ICC compared against and say: "look, it works". It's liable to over-fit the problem. Moreover the kernel is a self-contained entity with little dependency on external libraries or APIs, and any of the compatibility issues that arise in such cases would not be effectively tested.

    I would rather see them applauding a XX%-success-rate on a test against a corpus of open source applications (with a mixture C/C++ and heavy library usage) and/or the GCC regression test suite.

    Also, just because you can compile the linux kernel with ICC doesn't mean you should. There's no point. It's not like it's going to get any faster what with the handcoded assembly in the tricky parts and the bog standard techniques used elsewhere.
    And what's to say that some new driver that comes along won't break ICC compatibility for whatever reason, since they only do testing with a limited range of GCC releases? It's not like Intel is doing regression checks for third party modules or anything.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON