Comparing G++ and Intel Compilers and Vectorized Code
Nerval's Lobster writes "A compiler can take your C++ loops and create vectorized assembly code for you. It's obviously important that you RTFM and fully understand compiler options (especially since the defaults may not be what you want or think you're getting), but even then, do you trust that the compiler is generating the best code for you? Developer and editor Jeff Cogswell compares the g++ and Intel compilers when it comes to generating vectorized code, building off a previous test that examined the g++ compiler's vectorization abilities, and comes to some definite conclusions. 'The g++ compiler did well up against the Intel compiler,' he wrote. 'I was troubled by how different the generated assembly code was between the 4.7 and 4.8.1 compilers—not just with the vectorization but throughout the code.' Do you agree?"
For better or worse, I've always given the Intel compiler the benefit of the doubt. They have access to documents that the GCC folks don't.
I don't think it's troubling.
Firstly they beat on the optimizer a *lot* between major versions.
Secondly, the compiler does a lot of micro optimizations (e.g. the peephole optimizer) to choose between essentially equivalent snippets. If they change the information about the scheduling and other resources you'd expect that to change a lot.
Plus I think that quite a few interesting problems such as block ordering are NP-hard. If they change the parameters of their heuristic NP-hard solver, that will give very different outputs too.
So no, not that bothered, myself.
SJW n. One who posts facts.
I have worked on a couple of projects that compiled and ran perfectly with GCC 4.6 and 4.7. They no longer run when compiled with the latest versions of GCC. No warnings, no errors during compilation, they simply crash when run. It's the same source code, so something has changed. The same code, when compiled with multiple versions of Clang, runs perfectly. The GCC developers are doing something different and it is causing problems. Now it may be that a very well hidden bug is lurking in the code and the latest GCC is exposing that in some way, but this code worked perfectly for years under older versions of the compiler so it's been a nasty surprise.
And got completely different results!
Asking any audience larger than about 20 to compare the qualitative differences of object code vectorization is statistically problematic as the survey group is larger than the qualified population.
Help stamp out iliturcy.
that is a weird-ass troll (the link is a video of chiptune synthesized pop stars)
One amusing thing I discovered is that GCC 4.8.0 will actually unroll and vectorize this simple factorial function: Just look at that output!
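The snippet itself did not survive in this copy of the thread. As a stand-in only (not the commenter's code), here is a minimal sketch of the kind of simple recursive factorial that could be meant. The name `fact` and the unsigned 32-bit type are assumptions; unsigned arithmetic is used so that overflow wraps instead of being undefined.

```cpp
#include <cstdint>

// Hypothetical stand-in for the missing snippet: a plain recursive factorial.
// Unsigned arithmetic wraps on overflow, so large n gives a well-defined
// (if meaningless) result rather than undefined behaviour.
std::uint32_t fact(std::uint32_t n) {
    return n <= 1 ? 1u : n * fact(n - 1);
}
```

To look at what the compiler makes of it, something like `g++ -O3 -S` on the file and a read through the resulting `.s` should do; whether the loop actually gets unrolled and vectorized will depend on the GCC version and target flags.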
Program Intellivision!
This is 2013 (almost 2014!) why are we talking about vectorization? Why don't people write code in vector notation in the first place anyway? If Matlab and Fortran could implement this 25 years ago, I am sure we are ready to move on now...
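For what it's worth, standard C++ does have something in this direction: `std::valarray` supports whole-array arithmetic without an explicit loop. A minimal sketch, where the helper name `axpy` is made up for illustration and nothing here guarantees the compiler emits SIMD instructions for it:

```cpp
#include <cassert>
#include <valarray>

// Hypothetical helper: computes a*x + y element-wise using valarray's
// whole-array operators instead of a hand-written loop.
std::valarray<double> axpy(double a,
                           const std::valarray<double>& x,
                           const std::valarray<double>& y) {
    assert(x.size() == y.size());  // element-wise ops need equal lengths
    std::valarray<double> result(a * x + y);
    return result;
}
```

Calling `axpy(2.0, x, y)` with `x = {1, 2, 3}` and `y = {10, 20, 30}` gives `{12, 24, 36}`. Whether this ends up faster than a plain loop is something you would still have to check in the generated assembly.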
I just had one of those "WTF was I thinking then" moments. An old C program of mine started dumping core. Long story short: I did one of those things in the "don't do that" category: a function scans a text buffer, stuffs a null byte here and there and returns pointers into the buffer: think tokenizing.
Point is... the buffer was stack allocated in this function! Don't try this at home, I say :-)
A newer gcc saw the buffer wasn't being read within the function and thought "meh, after return this buffer is toast anyway. No need to write those pesky null bytes. No one will see that!".
The compile with the old compiler "just worked" because nobody touched the stack in the meantime: the caller copied away the bits needed. Sheer luck, I'd say.
So it may well be that there are some undefined behaviours in there which fall prey to a more aggressive optimizer. Try compiling with -O0 and see whether there's any difference in behaviour. If there is... happy bug hunting :-)
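For anyone curious what the non-broken version of that tokenizer pattern looks like: a minimal sketch, assuming space-separated tokens and a made-up helper name `tokenize_in_place`. The difference from the buggy version described above is that the caller owns the buffer, so the returned pointers stay valid after the function returns and the null-byte writes are observable, which means the optimizer has to keep them.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: splits `buf` in place on spaces by overwriting each
// separator that ends a token with '\0', and returns pointers to the start
// of each token. The buffer belongs to the caller, so the pointers remain
// valid for as long as the caller keeps the buffer alive.
std::vector<char*> tokenize_in_place(char* buf) {
    assert(buf != nullptr);
    std::vector<char*> tokens;
    char* p = buf;
    while (*p != '\0') {
        while (*p == ' ') ++p;                 // skip leading separators
        if (*p == '\0') break;                 // nothing left but separators
        tokens.push_back(p);                   // start of a token
        while (*p != '\0' && *p != ' ') ++p;   // walk to the end of the token
        if (*p == ' ') *p++ = '\0';            // terminate it in place
    }
    return tokens;
}
```

The buggy variant declares the `char` array inside the function and returns pointers into it; those pointers dangle as soon as the function returns, and using them is undefined behaviour no matter which compiler version happens to "work".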
...do you trust that the compiler is generating the best code for you?...
Trust, but verify.
.
I come from the days when it was the programmer, not the compiler, that optimized the code. So nowadays, I let the compiler do its thing, but I do a lot of double-checking of the generated code.
To be kind.
While I really want to believe that AMD is taking a huge leap with the ubiquitous console chipping, it's still a little too early to declare it a success.
OMG! What's this goatse doing here?? I thought all these images were taken down by a DMCA notice by the original asshole!
Slashdot's name? When my compiler sees
Mantle is a good idea insofar as it should kick Microsoft and/or NVIDIA up the behind. We desperately need someone to cure us of the pain that is OpenGL and the lack of cross platform compatibility that is Direct 3D.
Obviously NVIDIA won't play ball with Mantle but I've got a feeling they might have to eventually given that some AAA games developers are going to code a path for it. When it starts showing up how piss-poor our current high level layers are compared to what the metal can do, they'll have no choice.
'Nuff said.
I've had code generator bugs in a C++ compiler from SGI in 1995. I was trying to do what is now called RAII. The compiler didn't really like objects on the stack, in fact it called the destructor twice, even after the stack frame was reused by another function call. I gave up on C++ for the next 30 years.
When documentation runs to hundreds or thousands of pages, it's hard to read it from cover to cover and reread it when each new version comes out.
the day that AMD came out with Mantle and started leveraging its 100% monopoly in the console market
Among consoles that aren't discontinued or battery-powered, I count Xbox 360, PlayStation 3, Wii U, Xbox One, PlayStation 4, and OUYA. Of these, two have NVIDIA graphics: PlayStation 3 has RSX, and OUYA has the same Tegra 3 that's in the first-generation Nexus 7 tablet. The forthcoming iBuyPower Steam Machine also has NVIDIA graphics.
Was your program dealing with dates or tenses?
sometimes I want to reserve a global variable in a fixed register, but that requires all modules be compiled with the same flag.
That isn't "breaking the rules" as much as creating your own ABI. Classic Mac OS on 68K used to do this, where register A5 was typically reserved as a pointer to the program's global variable segment because the Mac OS ABI used position-independent code.
A compiler going to an assembler today is LAME.
How so? A tool should do one thing well. What an assembler does well is generate relocatable object code in a given format. If you're targeting two platforms, one of which uses ELF and the other COFF or whatever, one could use the same compiler to target both along with two different assemblers, one for each object code format.
That is a photo of a torn up pumpkin...
if (a && (b = f(a)) && (c = g(b))) {
    /* do stuff with a and b and c */
}
If you convert that into the other format then you need to add something like six lines of code and two levels of nested if statements.
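To make the comparison concrete, here is a sketch of both forms side by side. The functions `f` and `g` are placeholders invented for the example, as are the names `chained` and `nested`; the point is only that the two shapes take the same branches.

```cpp
// Hypothetical stand-ins for the f and g in the snippet above.
int f(int a) { return a - 1; }
int g(int b) { return b * 2; }

// Assignment-in-condition form: each step runs only if the previous
// value was non-zero, thanks to short-circuit evaluation of &&.
int chained(int a) {
    int b = 0, c = 0;
    if (a && (b = f(a)) && (c = g(b))) {
        return a + b + c;   // "do stuff with a and b and c"
    }
    return -1;
}

// The same logic without assignments inside the condition:
// extra lines and extra levels of nesting.
int nested(int a) {
    if (a) {
        int b = f(a);
        if (b) {
            int c = g(b);
            if (c) {
                return a + b + c;
            }
        }
    }
    return -1;
}
```

Note the extra parentheses around each assignment in `chained`: without them, `&&` binds tighter than `=` and the condition does not parse the way it reads.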
Ah, so you've just got 11.5 or so years until you can get back to C++. Cool. We'll keep it warm for you!
Well, what are they using in 2025?
while(1) attack(People.Sandy);
We do not live in a society where one must support the competitor. If the processor was not made by Intel, several functions were disabled. That is a perfectly acceptable solution, as no QA was done for non-Intel products. It was also done in the open: if you RTFM you would have discovered that the compiler ONLY supported Intel processors.
As soon as I get my C++ loops to end I'll worry about converting them to ASM.
Having to work for a living is the root of all evil.
I suspect the vectorized version of fact(1000000) is faster than the naive implementation.
Intel's foundations will shake when they see what is in store for them soon
Kaveri is really a 12 core monster. Faulty codes won't save them when the 12 core monster is unleashed.
Think of the waste of intellect that this situation creates. I mean, the people with the smarts to do what you are describing would probably enjoy the process (of reverse engineering), but they could just as well be making things with the functionality instead of having to reverse engineer the hidden functionality and then presumably making things...
But then again, I realize that reverse engineering it would probably engender a deeper knowledge of the functionality than just perusing the documentation.
It sabotages for non-Intel. We're talking about a compiler. It shouldn't matter what brand of CPU is being used.