Slashdot Mirror


Finding More Than One Worm In the Apple

davecb (6526) writes "At Guido von Rossum's urging, Mike Bland has a look at detecting and fixing the "goto fail" bug at ACM Queue. He finds the same underlying problem in both in the Apple and Heartbleed bugs, and explains how to not suffer it again." An excerpt: "WHY DIDN'T A TEST CATCH IT? Several articles have attempted to explain why the Apple SSL vulnerability made it past whatever tests, tools, and processes Apple may have had in place, but these explanations are not sound, especially given the above demonstration to the contrary in working code. The ultimate responsibility for the failure to detect this vulnerability prior to release lies not with any individual programmer but with the culture in which the code was produced. Let's review a sample of the most prominent explanations and specify why they fall short. Adam Langley's oft-quoted blog post13 discusses the exact technical ramifications of the bug but pulls back on asserting that automated testing would have caught it: "A test case could have caught this, but it's difficult because it's so deep into the handshake. One needs to write a completely separate TLS stack, with lots of options for sending invalid handshakes.""

24 of 116 comments (clear)

  1. From whence the headline? by SuperKendall · · Score: 4, Insightful

    The "Apple" had only one bug, the Goto Fail bug - since Apple did not use OpenSSL they never had the second bug.

    So why is the headline painting Apple as the source of both bugs?

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
    1. Re:From whence the headline? by Anonymous Coward · · Score: 4, Insightful

      The "Apple" had only one bug, the Goto Fail bug - since Apple did not use OpenSSL they never had the second bug.

      So why is the headline painting Apple as the source of both bugs?

      Dude.. chill, it is an actual apple, as in a fruit -- it is a saying. I didn't read the headline your way at all.

    2. Re:From whence the headline? by jeffmeden · · Score: 3, Insightful

      They are comparing the test methods that might have cought the Apple SSL "goto fail" bug vs the Heartbleed openssl bug (which was unchecked memory access). How do we know there isn't another SSL bug in either? That's right, we don't. And we won't until testing (automated or otherwise) gets better in both places. Ideally we would have a way to find (and fix) lots of worms in both.

    3. Re:From whence the headline? by Anonymous Coward · · Score: 2, Informative

      It's exactly the original title of the article which is:

      "acmqueue - Finding More Than One Worm in the Apple"

    4. Re:From whence the headline? by radtea · · Score: 3, Interesting

      And we won't until testing (automated or otherwise) gets better in both places.

      I'm skeptical of testing (automated or otherwise), and I think point in TFS is well-taken: testing that would have caught this bug would have involved creating tests that virtually duplicated the system under test.

      While some code is susceptible to test-driven development and thorough testing, and that should be done where-ever possible, the resources required to test some code effectively double the total effort required, and maintaining the tests becomes a huge headache. I've worked in heavily-tested environments and spent a significant fraction of my time "fixing" tests that weren't actually failing, but which due to changes in interfaces and design had become out-of-date or inappropriate.

      That's not to say that testing can't be done better, but it's clearly a hard problem, and I've yet to see it done well for the kind of code I've worked on over the past 20 years (mostly algorithmic stuff, where the "right" answer is often only properly computable by the algorithm that is supposed to be under test, although there are constraints on correct solutions that can be applied.)

      So I'm arguing that a culture of professionalism, that implements best-practices including coding standards and code reviews (possibly automated) that check for simple things like open if statements and unchecked memory access would be lower cost and at least as effective as heavier-weight testing.

      This is a static-analysis vs dynamic-analysis argument, and while I certainly agree that dynamic analysis is necessary, both these bugs would have been caught with fairly simple-minded static analyzers checking against well-known coding standards from a decade ago.

      --
      Blasphemy is a human right. Blasphemophobia kills.
    5. Re:From whence the headline? by phantomfive · · Score: 2

      I don't know about the headline, but the other day I ran into a bug on OSX (some commandline tools developed problems with UTF-8). "No problem," I thought, "I'll just report it." I had to create an account to report a bug, which was annoying, but then when I got to the bug reporting website, I found this error message. "LOL" I thought, "but ok, I'll email them." I told them their bug website was having trouble, and they emailed back and said, "please report that through our bug reporting tool."

      --
      "First they came for the slanderers and i said nothing."
    6. Re:From whence the headline? by rasmusbr · · Score: 2

      Okay, but in this case the bug had little to do with the algorithm. The bug was triggered unconditionally for every input.

    7. Re:From whence the headline? by serviscope_minor · · Score: 3, Informative

      both these bugs would have been caught with fairly simple-minded static analyzers checking against well-known coding standards from a decade ago.

      Except they wouldn't. Coverity out right stated that their static analyzer would not have caught the heartbleed bug.

      --
      SJW n. One who posts facts.
    8. Re:From whence the headline? by Greyfox · · Score: 2
      It's no so much that -- if your coupling is loose enough, you should be able to test the API of any component in your system. But you have to come up with that test. Programmers often have blind spots for things where "no one would ever do that." It might be OK for programmers to come up with basic tests that exercise the API and make sure it's at least marginally functioning as designed, but you also really need to throw some guys at the code who just like to break things. Paid software houses don't even do that very often, much less open source projects.

      You also never see software auditing anymore. Everyone says "Oh you don't need that anymore now that we have Fortify," but fortify didn't catch this bug, did it? I did some auditing for Data General back in the '90's and found the buffer overflow in the AT&T telnet server 2 years before the same overflow was found on the Linux one. Fortify might have actually caught that one, since it was a fixed length buffer accepting user input, if anyone had ever thought to run Fortify against that program.

      --

      I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    9. Re:From whence the headline? by Anonymous Coward · · Score: 2, Interesting

      I seem to remember seeing an article on the NASA coding practice, and they do exactly what the summary suggests: every important feature is implemented twice, with two different algorithms, and they are tested against each other to ensure they produce the same result. They also do formal code reviews of every check-in (no matter how minor), and any bug found is treated as a process problem (i.e. how can we fix the process that allowed this bug in), rather than just a software problem.

      As a result they produce code which is as close to perfect as anyone has ever come, and costs about 10x the industry average to develop.

  2. Tests can never catch these bugs by Anonymous Coward · · Score: 3, Insightful

    For the same reason new viruses will always defeat anti-virus software: Each virus is tested against existing anti-virus programs and only released into the wild when it has defeated all of them.

    1. Re:Tests can never catch these bugs by Laxori666 · · Score: 2

      Did you actually read the article? The particular apple bug would have been easily caught by testing. In fact, the handshake code was copy-pasted six times in the code, and only one of the copies had the bug... if the developer had thought about about testing at all, that code would have been factored into one function, and even just by doing so the bug would have been less likely.

  3. Worth repeating... by QuietLagoon · · Score: 4, Interesting
    The ultimate responsibility for the failure to detect this vulnerability prior to release lies not with any individual programmer but with the culture in which the code was produced.

    .
    I've often said that you don't fix a software bug until you've fixed the process that allowed the bug to be created. The above quote is of a similar sentiment.

    1. Re:Worth repeating... by radtea · · Score: 4, Insightful

      I've often said that you don't fix a software bug until you've fixed the process that allowed the bug to be created.

      One of the things that struck me about the goto fail bug was that it was specifically engineered out of coding best practices in the '90's.

      Any reasonable coding standard from that time forbade if's without braces for precisely this reason. And yeah, that's a "no true Scotsman" kind of argument (if a coding standard didn't contain such a clause it was not by my definition "reasonable") but the point still holds: software developers at the time were aware of the risk of open if statements causing exactly this kind of failure, because we had observed them in the wild, and designed coding standards to reduce their occurrence.

      So to be very specific about what kind of processes and culture would have prevented this bug: a reasonable coding standard and code reviews would have caught it (much of the code review process can be automated these days), and a culture of professionalism is required to implement and maintain such things.

      The canonical attribute of professionals is that we worry at least as much about failure as success. We know that failures will happen, and work to reduce them to the bare minimum while still producing working systems under budget and on time (it follows from this that we also care about scheduling and estimation.)

      Amateurs look at things like coding standards and reviews and say, "Well what are the odds of that happening! I'm so good it won't ever affect my code!"

      Professionals say, "The history of my field shows that certain vulnerabilities are common, and I am human and fallible, so I will put in place simple, lightweight processes to avoid serious failures even when they have low probability, because in a world where millions of lines of code are written every day, a million-to-one bug is written by someone, somewhere with each turn of the Earth, and I'd rather that it wasn't written by me."

      It's very difficult to convince amateurs of this, of course, so inculcating professional culture and values is vital.

      --
      Blasphemy is a human right. Blasphemophobia kills.
  4. Neatness counts by mariox19 · · Score: 3, Insightful
    if ((err = SSLHashSHA1.update( &hashCtx, &signedParams)) != 0)
    goto fail;
    goto fail;

    Those familiar with the C programming language will recognize that the first goto fail is bound to the if statement immediately preceding it; the second is executed unconditionally.

    Sorry, but it needs to be said: this is sloppy, he-man coding. Is there a problem with using brackets? Is it your carpal tunnel syndrome? Are you charged by the keystroke?

    This is how mistakes happen. For shame!

    --

    quiquid id est, timeo puellas et oscula dantes.

    1. Re:Neatness counts by Anonymous Coward · · Score: 2, Insightful


      if ((err = SSLHashSHA1.update(
            &hashCtx, &signedParams)) != 0) {

            goto fail;

      }

      goto fail;

      Seems as though it still would/could have happened. Would it have been easier to catch? Likely. Still would have happened, though.

      True, it COULD have happened. But that's a helluva lot more obvious.

      And if you RTFA (I know....), the author really had to contrive a BS example with mismatched braces to make a case against requiring braces on all conditional code even if they're only one line.

      If mismatched braces is your "proof" that a code standard that requires braces all the time doesn't help prevent the creation of bugs like this, you're really desperate.

  5. Re:It takes brains by jeffmeden · · Score: 4, Insightful

    I've been in this field for 20+ years now, and I don't necessarily (in fact, I usually don't) agree with whatever the current trend is (which is probably why my karma is negative). One underlying trend, has been to make software something that can be made by anyone - to remove the requirement of having a special mind that is able to think through algorithms and code. This has generally been accomplished through process, and abstraction. Process - if we can describe a method well enough, then anyone should be able to follow it to it's logical conclusion. Abstraction - we keep adding layers upon layers in an effort to simplify and streamline that which is a complex thing (lots of numbers in sequence to control a microprocessor and it's accompanying hardware). You can probably tell that I'm not a great fan of either - though I'm really really trying to not be a negative type, and to go with the flow more. But I can't help my fundamental feelings that there is just no substitute for a smart individual with a gift of understanding the logic of code. I'm always against process because it takes the gift that i was given and neutralizes it. Personal feelings aside, I just don't think that all the process in the world is ever going to get ahead of the curve that is the battle between perfectly functional software and bugs.

    If you make brilliant code that only you can understand, sorry to be harsh but you aren't that brilliant. We definitely need to value people who can generate and perfect algorithms, but do you think anyone would remember/value the Pythagorean Theorem if it was 40 steps long? No, he thought of a (then brilliant) way to do it simply and easily so that one only needs to understand basic math to pull it off. This is what we need more of; a single elegant algorithm that is so short it is hard to misuse is better than 1,000 algorithms that are all so hard to understand that only the author knows exactly how it works and will be forgotten as soon as the particular language or application fades into the past.

  6. -Wall -Werror by Megane · · Score: 4, Interesting

    Turning on all warnings and forcing them to errors certainly would have caught the bug in Apple's SSL code. Anyone who just lets warnings fly by in C code is an idiot. Even if the warning is mildly silly, getting it out of the way lets the important warnings stand out. Sensible warnings from C compilers are the very reason we don't use lint anymore. Even then you still have to watch out, because some warnings won't appear at low optimization levels, and I recall hearing that there are a few obscure warnings not turned on by -Wall.

    Also, it could have possibly been introduced by a bad merge. One of the things that putting braces on every if/for/while/etc. does is give merges more context to keep from fucking up, or at least a chance to cause brace mismatch.

    As for Heartbleed, just the fact that the code wouldn't work with a compile time option to use the system malloc instead of a custom one should have been enough to raise some red flags. Because rolling your own code to do something "more efficiently" than the system libraries never introduces new problems, right?

    --
    #naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
    1. Re:-Wall -Werror by rabtech · · Score: 2

      Turning on all warnings and forcing them to errors certainly would have caught the bug in Apple's SSL code. Anyone who just lets warnings fly by in C code is an idiot. Even if the warning is mildly silly, getting it out of the way lets the important warnings stand out. Sensible warnings from C compilers are the very reason we don't use lint anymore. Even then you still have to watch out, because some warnings won't appear at low optimization levels, and I recall hearing that there are a few obscure warnings not turned on by -Wall.

      Let me quote from one of the best-tested and most widely used projects out there, SQLite, from http://www.sqlite.org/testing....

      Static analysis has not proven to be especially helpful in finding bugs in SQLite. Static analysis has found a few bugs in SQLite, but those are the exceptions. More bugs have been introduced into SQLite while trying to get it to compile without warnings than have been found by static analysis.

      The bolded part has been my experience unfortunately. Static analysis is nearly useless.

      An appropriate test for something like an SSL stack is a separate test harness that "fuzzes" the stack by exploring large random combinations of values, some with known good certificates and others with randomly generated (and thus broken) ones. These days one can spin up thousands of VMs, run a massive suite of billions of test cases in parallel over a few hours, then spin them down and spend a relatively small sum of money.

      And yes, the test harness for something like this is probably going to exceed the # of lines of code of the actual implementation by an order of magnitude. For really important security-critical stuff like cryptography, SSL/TLS, keychain management, etc it is well worth the effort.

      --
      Natural != (nontoxic || beneficial)
  7. Reeks of a terrible article by rebelwarlock · · Score: 2

    We have some lovely elements coming together right here on the slashdot blurb:

    1. Stupid pun instead of a descriptive title
    2. Full caps in the article excerpt
    3. Trying to bring up coding "culture"
    4. Assertion that it totally could have been caught beforehand, but they aren't sure exactly how.

    Somehow, I don't think I'm missing much by not reading the article.

  8. Merge Conflict by znigelz · · Score: 3, Insightful

    This is clearly the automatic resolution of a merge conflict by the versioning control software. These are such a nightmare to debug and happen all the time. Developers rarely check their entire change visually post merge. Though this can be found using static analysis that force coding standards (such as forcing the use of brackets or proper indentation for the lexical scope). Though the bugs from automatic conflict resolution can only be really improved through better versioning software. These are without question the worst and most frustrating bugs.

  9. Re:It takes brains by Junta · · Score: 2

    If you make brilliant code that only you can understand

    There's a false dichotomy here. He said that only *some* are qualified enough to create solutions to complex problems. You are saying his claim is that only *one* can understand, implying that the problem can't possibly be too hard, and that any hard code to follow is just because the developer is terrible at coding.

    As a counter to your example of the Pythagorean Theorem, what about post-graduate math and science? There are tons of things which would make 40 steps seem easy by comparison. Should society forgo those just because only some people are realistically going to be able to understand and apply that correctly?

    A very ubiquitous situation is that with the 'anyone can understand it or else it shouldn't exist at all' philosophy, there is no way we'd have cryptographic libraries at all.

    I will agree that his stance against processes is a bit too harsh, but I've been around enough to know in some scenarios such a jaded perspective would be perfectly understandable. I've seen some projects that had appropriate and helpful processes that did help quality, but been witness to many many more that had ineffective process that achieved nothing but create busy work while still churning out crap code.

    --
    XML is like violence. If it doesn't solve the problem, use more.
  10. The culture by computational+super · · Score: 2

    Yeah, the "culture" is "hurry up and get it done so you can get on to the next thing because if something takes more than an hour to do it's not worth doing" and it exists in every single software development organization on planet Earth. Until these things actually start costing real money to people with real power, this will continue.

    --
    Proud neuron in the Slashdot hivemind since 2002.
  11. Bug is somewhere else by gnasher719 · · Score: 2

    Ok, writing "goto fail;" twice in a row is a bug. But it's not the real bug. This code was checking whether a connection was safe, and executed a "goto fail;" statement if one of the checks failed. It also executed one "goto fail;" by accident, skipping one of the checks. But one would think that a statement "goto fail;" would make the connecction fail! In that case, sure, there was a bug, but the bug should have led to all connections failing, which would have made it obvious to spot (because the code wouldn't have worked, ever).

    So the real problem is a programming style where executing a statement "goto fail;" doesn't actually fail! If a function returns 0 for success / non-zero for failure like this one, it should have been obvious to add an "assert (err != 0)" to the failure case following the fail: label. And that would have _immediately_ caught the problem. There should be _one_ statement in the success case "return 0;" and _one_ statement in the failure case "assert (err != 0); return err;".