Slashdot Mirror


How Your Compiler Can Compromise Application Security

jfruh writes "Most day-to-day programmers have only a general idea of how compilers transform human-readable code into the machine language that actually powers computers. In an attempt to streamline applications, many compilers actually remove code that they perceive to be undefined or unstable and, as a research group at MIT has found, in doing so can make applications less secure. The good news is the researchers have developed a model and a static checker for identifying unstable code. Their checker is called STACK, and it currently works for checking C/C++ code. The idea is that it will warn programmers about unstable code in their applications, so they can fix it, rather than have the compiler simply leave it out. They also hope it will encourage compiler writers to rethink how they can optimize code in more secure ways. STACK was run against a number of systems written in C/C++ and it found 160 new bugs in the systems tested, including the Linux kernel (32 bugs found), Mozilla (3), Postgres (9) and Python (5). They also found that, of the 8,575 packages in the Debian Wheezy archive that contained C/C++ code, STACK detected at least one instance of unstable code in 3,471 of them, which, as the researchers write (PDF), 'suggests that unstable code is a widespread problem.'"

9 of 470 comments

  1. Re:TFA does a poor job of defining what's happenin by Anonymous Coward · · Score: 5, Informative

    An example of "unstable code":

    char *a = malloc(sizeof(char));
    *a = 5;
    char *b = realloc(a, sizeof(char));
    *b = 2;
    if (a == b && *a != *b)
    {
            launchMissiles();
    }

    A cursory glance at this code suggests missiles will not be launched. With gcc, that's probably true at the moment. With clang, as I understand it, this is not true: missiles will be launched. The reason is that the spec says the first argument of realloc becomes invalid after the call, so any use of that pointer has undefined behaviour. Clang takes advantage of this and assumes that *a will not change after that point. Therefore it optimises if (a == b && *a != *b) into if (a == b && 5 != *b). This clearly then passes, and missiles get launched.

    The truth here is that your compiler is not compromising application security – the code that relies on undefined behaviours is.
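
One way to stay clear of the problem above is never to touch the old pointer once realloc has succeeded. The following is a minimal sketch, not taken from the thread: grow_buffer is a hypothetical helper name, and it assumes new_size is greater than zero.

```c
#include <stdlib.h>

/* Hypothetical helper: resize *buf to new_size bytes (assumed > 0).
 * On success, *buf is replaced by the pointer realloc returned, so the
 * stale pointer is never read again. On failure, *buf is left untouched
 * and still valid, and -1 is returned. */
int grow_buffer(char **buf, size_t new_size)
{
    char *tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;      /* the old block is still allocated */
    *buf = tmp;         /* from here on, only the new pointer is used */
    return 0;
}
```

Callers that keep a single pointer variable and only ever update it through a helper like this have no second, stale copy to compare against or dereference.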

  2. Re:TFA does a poor job of defining what's happenin by Nanoda · · Score: 5, Informative

    What is "unstable code" and how can a compiler leave it out?

    The article is actually using that as an abbreviation for what they're calling "optimization-unstable code", or code that is included at some specified compiler optimization levels, but discarded at higher levels. Basically they think it's unstable due to being included or not randomly, not because the code itself necessarily results in random behaviour.
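
A small sketch of what "optimization-unstable" can look like in practice, assuming the commonly cited pattern of a null check placed after a dereference. The struct and function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical type, for illustration only. */
struct device {
    int flags;
};

/* Unstable: dev is dereferenced before the check. Because dereferencing
 * a null pointer is undefined, a compiler may assume dev is not NULL
 * and drop the check at higher optimization levels. */
int read_flags_unstable(struct device *dev)
{
    int flags = dev->flags;   /* undefined behaviour if dev is NULL */
    if (dev == NULL)
        return -1;            /* may be removed by the optimizer */
    return flags;
}

/* Stable: the check comes first, so no undefined behaviour precedes it
 * and the compiler has nothing to reason the check away with. */
int read_flags_checked(struct device *dev)
{
    if (dev == NULL)
        return -1;
    return dev->flags;
}
```

Both functions behave the same for valid pointers. Only the second has a defined result when passed NULL.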

  3. Re:News flash by Mitchell314 · · Score: 5, Funny

    Code with a finite half-life. Sometimes radiates when it decays. The byproducts tend to be hazardous to health, and most cause symptoms such as headaches, tremors, Carpal Tunnel Syndrome, and Acute Induced Tourette Syndrome. Handle with care. The Daily WTF has an emergency hotline if you or somebody you know has been exposed to unsafe levels of unstable code.

    --
    I read TFA and all I got was this lousy cookie
  4. Really small EXE mystery solved by Tablizer · · Score: 5, Funny

    many compilers actually remove code that they perceive to be undefined or unstable

    No wonder my app came out with 0 bytes.

  5. Re:TFA does a poor job of defining what's happenin by dgatwood · · Score: 5, Informative

    Another, more common example of code optimizations causing security problems is this pattern:

    int a = [some value obtained externally];
    int b = a + 2;
    if (b < a) {
    // integer overflow occurred ...
    }

    The C spec says that signed integer overflow is undefined. If a compiler does no optimization, this works. However, it is technically legal for the compiler to rightfully conclude that two more than any number is always larger than that number, and optimize out the entire "if" statement and everything inside it.

    For proper safety, you must write this as:

    int a = [some value obtained externally];
    if (a > INT_MAX - 2) {
    // integer overflow will occur ...
    }
    int b = a + 2;
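
The same idea can be generalised to any pair of ints, as long as each comparison is made on values that cannot overflow themselves. This is a sketch under that assumption; checked_add is a hypothetical helper name, not something from the thread or the C standard library.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true when the
 * sum fits in an int; returns false and leaves *out untouched otherwise.
 * The tests run before the addition, so no overflow ever happens. */
bool checked_add(int a, int b, int *out)
{
    if (b > 0 && a > INT_MAX - b)
        return false;   /* sum would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return false;   /* sum would fall below INT_MIN */
    *out = a + b;
    return true;
}
```

GCC and Clang also provide __builtin_add_overflow, which may be preferable where portability to other compilers is not a concern.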

    --

    Check out my sci-fi/humor trilogy at PatriotsBooks.

  6. Re:TFA does a poor job of defining what's happenin by lgw · · Score: 5, Funny

    No, the compiler is allowed to do anything it damn well pleases wherever the standard calls behaviour "undefined". One of my favorite quotes ever from a standards discussion:

    When the compiler encounters [a given undefined construct] it is legal for it to make demons fly out of your nose

    Nasal demons can cause code instability.

    --
    Socialism: a lie told by totalitarians and believed by fools.
  7. Re:News flash by EvanED · · Score: 5, Informative

    Yes, you didn't RTFA, because your definition actually makes sense. TFA defines "unstable code" as code with undefined behavior.

    ...and undefined behavior is exactly what causes the things I listed.

    TFA also claims that many compilers simply DELETE such code. I have never seen a compiler that does that, and I seriously doubt it is really common.

    You probably haven't used any desktop compilers.

    Just a sampling:

    • During MS's security push a decade ago, they discovered that the compiler was optimizing away the memset in code such as memset(password, '\0', len); free(password); that was limiting the lifetime of sensitive information, because the assignment to password in the memset was a dead assignment -- it was never read from (not actually undefined behavior, but it is an example of the compiler deleting unused code that was actually there for a purpose)
    • I linked part 3 of this series to you in another response, but the first example in here discusses such an optimization that GCC did which removed security checks in the Linux kernel (see also this series -- look down at "A Fun Case Analysis")
    • The Linux kernel has long been built with -fno-strict-aliasing because GCC optimizations based on the strict aliasing assumption break the kernel (more precisely: code that violates the standard's strict aliasing rules was being "mis-"optimized), though I don't know if it led to security implications
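
For the memset case in the first bullet, one widely used workaround is to write the zeros through a volatile pointer, so the stores are treated as observable rather than dead. A minimal sketch, with secure_zero as a hypothetical helper name:

```c
#include <stddef.h>

/* Hypothetical helper: zero len bytes at buf through a volatile
 * pointer. Volatile accesses count as side effects, so compilers
 * do not, in practice, discard these stores as dead assignments,
 * even when the buffer is freed immediately afterwards. */
void secure_zero(void *buf, size_t len)
{
    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;
}
```

Where the platform offers one, a purpose-built routine such as memset_s (C11 Annex K), explicit_bzero, or SecureZeroMemory on Windows is probably the better choice.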
  8. These bugs exist even *without* signed integers! by Myria · · Score: 5, Interesting

    The first mistake was using signed integers.

    The problem is C's promotion rules. In C, when promoting integers to the next size up, typically to the minimum of "int", the rule is to use signed integers if the source type fits, even if the source type is unsigned. This can cause code that seems to use unsigned integers everywhere to break, because C says signed integer overflow is undefined. Take the following code, for example, which I saw on a blog recently:

    uint64_t MultiplyWords(uint16_t x, uint16_t y)
    {
        uint32_t product = x * y;
        return product;
    }

    MultiplyWords(0xFFFF, 0xFFFF) on GCC for x86-64 was returning 0xFFFFFFFFFFFE0001, and yet this is not a compiler bug. From the promotion rules, uint16_t (unsigned short) gets promoted to int, because unsigned short fits in int completely without loss or overflow. So the multiplication became ((int) 0xFFFF) * ((int) 0xFFFF). That multiplication overflows in a signed sense, an undefined operation. The compiler can do whatever it feels like - including generate code that crashes if it wants.

    GCC in this case assumes that overflow cannot happen, so therefore x * y is positive (when it's really not at runtime). This means the uint32_t cast does nothing, so is omitted by the optimizer. Now, the code generator sees an int cast to uint64_t, which means sign extension. The optimizer this time isn't smart enough to know again that it's positive and therefore can ignore sign extension and use "mov eax, ecx" to clear the high 32 bits, so it emits a "cqo" opcode to do the sign extension.

    So no, avoiding signed integers does not always save you.
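
A possible fix for the function above, sketched under the assumption that the intent was a plain 32-bit unsigned product: convert one operand to uint32_t before multiplying, so the arithmetic is never done in signed int.

```c
#include <stdint.h>

uint64_t MultiplyWords(uint16_t x, uint16_t y)
{
    /* Converting x to uint32_t first forces the multiplication into
     * unsigned arithmetic on platforms with 32-bit int, where overflow
     * wraps instead of being undefined. */
    uint32_t product = (uint32_t)x * y;
    return product;
}
```

With this version the product of the two largest 16-bit values fits in 32 unsigned bits, so no sign extension can creep into the 64-bit result.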

    --
    "Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
  9. Re:These bugs exist even *without* signed integers by Animats · · Score: 5, Interesting

    The problem is C's promotion rules. In C, when promoting integers to the next size up, typically to the minimum of "int", the rule is to use signed integers if the source type fits, even if the source type is unsigned.

    I know. C's handling of integer overflow is "undefined". In Pascal, integer overflow was a detected error. DEC VAX computers could be set to raise a hardware exception on integer overflow, and about thirty years ago, I rebuilt the UNIX command line tools with that checking enabled. Most of them broke.

    In the first release of 4.3BSD, TCP would fail to work with non-BSD systems during alternate 4-hour periods. The sequence number arithmetic had been botched due to incorrect casts involving signed and unsigned integers. I found that bug. It wasn't fun.

    C's casual attitude towards integer overflow is why today's machines don't have the hardware to interrupt on it. Ada and Java do overflow checks, but the predominance of C sloppiness influenced hardware design too much.

    I once wrote a paper, "Type Integer Considered Harmful", on this topic. One of my points was that unsigned arithmetic should not "wrap around" by default. If you want modular arithmetic, you should write something like n = (n + 1) % 65536;. The compiler can optimize that into machine instructions that exploit word lengths when the hardware allows, and you'll get the same result on all platforms.
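
The explicit-modulus style suggested above might look like this in C. It is only a sketch: next_counter is a hypothetical name, and the 16-bit range is taken from the 65536 in the comment.

```c
#include <stdint.h>

/* Hypothetical helper: advance a counter that is meant to live in the
 * range 0..65535, with the wrap-around spelled out in the source
 * instead of being implied by the width of the variable's type. */
uint32_t next_counter(uint32_t n)
{
    return (n + 1u) % 65536u;
}
```

Because the modulus is a power of two, a compiler is free to reduce the operation to a mask or a narrower register, while the source states the intended range on every platform.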