How Your Compiler Can Compromise Application Security
jfruh writes "Most day-to-day programmers have only a general idea of how compilers transform human-readable code into the machine language that actually powers computers. In an attempt to streamline applications, many compilers actually remove code that they perceive to be undefined or unstable, and, as a research group at MIT has found, in doing so they can make applications less secure. The good news is the researchers have developed a model and a static checker for identifying unstable code. Their checker is called STACK, and it currently works for checking C/C++ code. The idea is that it will warn programmers about unstable code in their applications, so they can fix it, rather than have the compiler simply leave it out. They also hope it will encourage compiler writers to rethink how they can optimize code in more secure ways. STACK was run against a number of systems written in C/C++ and it found 160 new bugs in the systems tested, including the Linux kernel (32 bugs found), Mozilla (3), Postgres (9) and Python (5). They also found that, of the 8,575 packages in the Debian Wheezy archive that contained C/C++ code, STACK detected at least one instance of unstable code in 3,471 of them, which, as the researchers write (PDF), 'suggests that unstable code is a widespread problem.'"
An example of "unstable code":
char *a = malloc(sizeof(char));
*a = 5;
char *b = realloc(a, sizeof(char));
*b = 2;
if (a == b && *a != *b)
{
launchMissiles();
}
A cursory glance at this code suggests missiles will not be launched. With gcc, that's probably true at the moment. With clang, as I understand it, this is not true: missiles will be launched. The reason is that the spec says the first argument of realloc becomes invalid after a successful call, so any use of that pointer has undefined behaviour. Clang takes advantage of this and assumes that *a cannot change after that point. It therefore optimises if (a == b && *a != *b) into if (a == b && 5 != *b). When realloc hands back the same address, a == b is true and *b is 2, so the condition passes and missiles get launched.
The truth here is that your compiler is not compromising application security – the code that relies on undefined behaviours is.
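For comparison, here is a minimal sketch of the same operation written so that it never touches the old pointer once realloc has succeeded. The helper name resize_and_store is made up for illustration and is not from the article or the STACK paper.

```c
#include <stdlib.h>

/* Hypothetical helper: resize a buffer and store a value in its first byte.
 * new_size is assumed to be at least 1 (realloc with size 0 is best avoided).
 * Returns the possibly moved buffer, or NULL on failure, in which case
 * the original buffer is untouched and still owned by the caller. */
char *resize_and_store(char *old, size_t new_size, char value)
{
    char *fresh = realloc(old, new_size);
    if (fresh == NULL) {
        return NULL;            /* old is still valid here */
    }
    /* From this point on, old must not be dereferenced, and it is
     * safest not to compare it either. Only fresh is used. */
    fresh[0] = value;
    return fresh;
}
```

A caller would write b = resize_and_store(a, 1, 2), check b for NULL, and from then on use only b. With no read through a after the call, there is nothing for the optimiser to reinterpret.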
What is "unstable code" and how can a compiler leave it out?
The article is actually using that as an abbreviation for what they're calling "optimization-unstable code": code that is kept at some compiler optimization levels but discarded at higher ones. Basically, they call it unstable because whether it survives depends on the compiler and the optimization level, not because the code itself necessarily behaves randomly.
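A commonly cited shape of this is a null check that comes after the pointer has already been dereferenced. The sketch below is illustrative only; struct device and both function names are hypothetical, not taken from any of the projects mentioned.

```c
#include <stddef.h>

struct device { int flags; };   /* hypothetical type for illustration */

/* Unstable: dev is dereferenced before it is checked. Dereferencing a
 * null pointer is undefined, so a compiler may assume dev is non-null
 * and drop the check below at higher optimization levels. */
int flags_unstable(struct device *dev)
{
    int flags = dev->flags;
    if (dev == NULL) {
        return -1;              /* may be optimized away */
    }
    return flags;
}

/* Stable: the check comes first, so it cannot be assumed away. */
int flags_stable(struct device *dev)
{
    if (dev == NULL) {
        return -1;
    }
    return dev->flags;
}
```

Both functions return the same result for a valid pointer. They only differ when dev is null, which is exactly the case the check was supposed to guard against.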
Another, more common example of code optimizations causing security problems is this pattern:
int a = [some value obtained externally];
int b = a + 2;
if (b < a) {
    // integer overflow occurred ...
}
The C spec says that signed integer overflow is undefined. Without optimization this usually works, because the addition simply wraps around on typical two's-complement hardware. However, the compiler is allowed to assume the overflow never happens, conclude that two more than any number is always larger than that number, and optimize out the entire "if" statement and everything inside it.
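For proper safety, you must do the check before the addition, and write it so the check itself cannot overflow (INT_MAX - a would overflow for negative a, so compare against INT_MAX - 2 instead):
#include <limits.h>
int a = [some value obtained externally];
if (a > INT_MAX - 2) {
    // integer overflow would occur; handle it here and do not fall through ...
}
int b = a + 2;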
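One way to package that pre-check is a small helper that refuses to add when the result would not fit. This is only a sketch; the name add_two_checked is hypothetical, and the comparison is arranged so that it stays in range for every int, including negative values.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + 2 in *out and returns true, or
 * returns false without doing the addition if it would exceed INT_MAX. */
bool add_two_checked(int a, int *out)
{
    /* INT_MAX - 2 never overflows, so this test is well defined
     * for every possible value of a. */
    if (a > INT_MAX - 2) {
        return false;
    }
    *out = a + 2;
    return true;
}
```

Because the overflowing addition is never executed, the compiler has no undefined behaviour to reason from, and the check survives at any optimization level.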
You probably haven't used any desktop compilers.
Just a sampling: