Overeager Compilers Can Open Security Holes In Your Code
jfruh writes: "Creators of compilers are in an arms race to improve performance. But according to a presentation at this week's annual USENIX conference, those performance boosts can undermine your code's security. For instance, a compiler might find a subroutine that checks a huge bound of memory beyond what's allocated to the program, decide it's an error, and eliminate it from the compiled machine code — even though it's a necessary defense against buffer overflow attacks."
well known for decades that optimizing compilers can produce bugs, security holes, code that doesn't work at all, etc.
This is just as poorly written up as last time. These are truly bugs in the programs using undefined parts of the language. It's silly to blame the compiler.
But according to a presentation at this week's annual UNISEX conference
Any code removal by the compiler can be prevented by correctly
coding the code with volatile (in C) or its equivalent.
This is not really about the existence of bad compiler optimization - it is about a tool called Stack that can be used to detect this, which is known as "unstable" code, and has been used to find lots of vulnerabilities already.
I know that at least GCC will get rid of overflow checks if they rely on checking the value after overflow (without any warning), because C defines that overflow on signed integers is undefined. This is even documented. If anything is declared by the language specification as being undefined, expect trouble.
The kinds of checks that compilers eliminate are ones which are incorrectly implemented (depend on undefined behavior) or happen too late (after the undefined behavior already was triggered). The actual article is reasonable— it's about a tool to help detect errors in programs that suffer here. The compilers are not problematic.
Compilers can also "optimize" away Kahan summation algorithm. See page 6 of How Futile are Mindless Assessments of Roundoff in Floating-Point Computation
The classic example of a compiler interfering with intention, opening security holes, is failure to wipe memory.
On a typical embedded system - if there is such a thing (no virtual memory, no paging, no L3 cache, no "secure memory" or vault or whatnot) - you might declare some local (stack-based) storage for plaintext, keys, etc. Then you do your business in the routine, and you return.
The problem is that even though the stack frame has been "destroyed" upon return, the contents of the stack frame are still in memory, they're just not easily accessible. But any college freshman studying computer architecture knows how to get to this memory.
So the routine is modified to wipe the local variables (e.g. array of uint8_t holding a key or whatever...) The problem is that the compiler is smart, and sees that no one reads back from the array after the wiping, so it decides that the observable behavior won't be affected if the wiping operation is elided.
My making these local variables volatile, the compiler will not optimize away the wiping operations.
The point is simply that there are plenty of ways code can be completely "correct" from a functional perspective, but nonetheless terribly insecure. And often the same source code, compiled with different optimization options, has different vulnerabilities.
C became popular because it was vastly more portable and performant than its predecessors. It still is today. None of those "better" languages that came before it or after it can beat that. And yes, extreme portability does matter when you have 100s of millions if not billions of devices that can't run anything but assembly or C. It's why the people saying that OpenSSL should be written in Java or C# are morons. Care to tell me how that's going to run on a, for example, Linksys WRT54G with only 8 or 16 MB of RAM, 2 to 4 MB of Flash storage and a 125 to 240 mhz MIPS CPU? Yeah, it's not.
The problem is that most programmers have never had to get their hands dirty doing embedded work. They live in a bubble that ignores all the memory/storage/processing-power constrained devices all around them. OpenSSL, for example, as used in something like DD-WRT would be unusable if it was written in anything but C or possibly C++.
Well I'd be pretty pissed as well if my pet language was relegated to the graveyard of obscurity by a language that was usable for real work. Dennis Ritchie was a pragmatist who got shit done not some guy wanking over the greatness and purity of the language he created. People to this day are still jealous of that.
More than that, string and vector are just as fast as the vulnerable C alternatives as long as you pre-allocate the buffers to the expected size. It's not rocket science, but the number of times I've heard "but C++ is too slow, we have to use C" makes me cry. With bounds checking in both (hand-written in C, of course), the speed is the same. Without buffer checking in either, the speed is the same.
Socialism: a lie told by totalitarians and believed by fools.