Slashdot Mirror


Why Software Builds Fail

itwbennett writes: A group of researchers from Google, the Hong Kong University of Science and Technology and the University of Nebraska undertook a study of over 26 million builds by 18,000 Google engineers from November 2012 through July 2013 to better understand what causes software builds to fail and, by extension, to improve developer productivity. And, while Google isn't representative of every developer everywhere, there are a few findings that stand out: Build frequency and developer (in)experience don't affect failure rates, most build errors are dependency-related, and C++ generates more build errors than Java (but they're easier to fix).

33 of 279 comments

  1. Because I'm lazy by OzPeter · · Score: 5, Informative

    Half the time when I'm working on any sort of non-trivial program (that is too large to hold in my head all at once) and I need to make a breaking code change (and one that is not easily managed with refactoring tools), I'll make the change where it is obvious to me and then let the compiler tell me where it broke and hence where I need to make my fixes.

    --
    I am Slashdot. Are you Slashdot as well?
    1. Re:Because I'm lazy by Z00L00K · · Score: 2

      The most important thing is not to avoid build failures but to avoid distributing software packages that can't be built.

      However if something can't be built due to a mistake it's often easy to find and correct. The big problems are often not that visible and it can take a while to figure them out.

      What really grinds my gears is that people release source code that is possible to build, but so full of compiler warnings that you can't be certain that it's going to work as intended.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    2. Re:Because I'm lazy by mlts · · Score: 3, Interesting

      When in CS, I had a prof that had one rule that for release (not beta/alpha/dev) code, if the code had even a single warning, it was unshippable unless there was an extremely good reason (which would be part of the README) of why it happened. Yes, this was a PITA, but he was trying to teach something that seems to have been lost.

    3. Re:Because I'm lazy by 0123456 · · Score: 5, Funny

      He's the reason compiler writers invented pragmas to turn off warnings...

    4. Re:Because I'm lazy by turgid · · Score: 2

      When in CS, I had a prof that had one rule that for release (not beta/alpha/dev) code, if the code had even a single warning, it was unshippable unless there was an extremely good reason (which would be part of the README) of why it happened. Yes, this was a PITA, but he was trying to teach something that seems to have been lost.

      You should be compiling with warnings as errors as soon as you start coding, and you should fix each one as they occur before you move on to write the next line of code.

      Putting off fixing these problems leads to bloated and fragile code and wastes much more time debugging and fixing later.

    5. Re:Because I'm lazy by ShanghaiBill · · Score: 2

      What really grinds my gears is that people release source code that is possible to build, but so full of compiler warnings that you can't be certain that it's going to work as intended.

      Where I work, all builds are run with -Wall -Wextra -Werror. So if you check in code that produces a warning, the compiler turns it into an error, and you broke the build. Which means you get to be the build babysitter until someone else breaks it.

    6. Re:Because I'm lazy by geekoid · · Score: 4, Insightful

      NO, he was teaching engineering practices, and a good one.

      People like you are why software is in such a terrible state as an industry.

      --
      The Dunning-Kruger effect explains most posts on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
    7. Re:Because I'm lazy by EvanED · · Score: 5, Insightful

      I'm a fan of warnings as much as the next guy, but there are plenty of times that's not practical. Even if you accept nothing else I say, there are plenty of times where third party code (say, for example, Boost...) has warning-producing stuff in it. Do you fix it and maintain your own branch? Submit it upstream, hope it gets accepted, then wait a month for the new release, then demand everyone upgrade to the latest bleeding-edge? That's often (maybe usually) not feasible, which means you should probably just disable the warning. Fortunately, GCC finally got around to adding #pragmas that sometimes let you disable and re-enable warnings for just the offending headers.
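
      As a minimal sketch of that pragma approach, assuming GCC or Clang: the block below silences one warning around a stand-in for third-party code and then restores the previous settings. `legacy_sum` is a hypothetical name; in real use the push/pop pair would wrap the offending #include instead.

```cpp
// Sketch: scope a warning suppression to one region (GCC and Clang).
#if defined(__GNUC__)
#pragma GCC diagnostic push                          // save current warning state
#pragma GCC diagnostic ignored "-Wunused-variable"   // silence this one warning
#endif

// Stand-in for third-party code that would warn under -Wall.
// (Hypothetical function; a real case would wrap an #include here.)
inline int legacy_sum(int a, int b) {
    int scratch = 0;  // unused on purpose: normally triggers -Wunused-variable
    return a + b;
}

#if defined(__GNUC__)
#pragma GCC diagnostic pop                           // restore previous state
#endif
```

      Code after the pop is checked with the original warning settings again, so the suppression does not leak into your own sources.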

      But beyond that, there's also a reason that compilers have most warnings off by default. And why -Wall doesn't turn on anything close to all warnings. And why even -Wall -Wextra isn't all warnings. Because there's a gradation of false positive/false negative tradeoffs, and what's appropriate for you isn't appropriate for everyone. Do you really compile with all warnings, or do you suppress some? Because I can almost guarantee it's the latter, even if you're suppressing them through inaction.

    8. Re:Because I'm lazy by RabidReindeer · · Score: 3, Insightful

      You should be compiling with warnings as errors as soon as you start coding, and you should fix each one as they occur before you move on to write the next line of code.

      Putting off fixing these problems leads to bloated and fragile code and wastes much more time debugging and fixing later.

      What you should be doing outside the CS class and in the so-called "Real World" is "being productive". That usually means screw the warnings, it has to be completed ASAP or we'll find someone "more productive" than you are.

    9. Re:Because I'm lazy by UnknownSoldier · · Score: 2

      Generally I recommend leaving most warnings on. But sometimes compiler writers go completely overboard.

      When you use MSVC you have to do stupid stuff like this:

            #define _CRT_SECURE_NO_WARNINGS // WIN32:MSVC disable warning C4996: This function or variable may be unsafe.

      The following compiler-specific header suffices to compile code without warnings at the highest warning level.

      #pragma warning( disable: 4061 ) // enum value is not *explicitly* handled in switch
      #pragma warning( disable: 4099 ) // first seen using 'struct' now seen using 'class'
      #pragma warning( disable: 4127 ) // conditional expression is constant
      #pragma warning( disable: 4217 ) // member template isn't copy constructor
      #pragma warning( disable: 4250 ) // inherits (implements) some member via dominance
      #pragma warning( disable: 4251 ) // needs to have dll-interface to be used by clients
      #pragma warning( disable: 4275 ) // exported class derived from non-exported class
      #pragma warning( disable: 4347 ) // "behavior change", function called instead of template
      #pragma warning( disable: 4355 ) // 'this': used in member initializer list
      #pragma warning( disable: 4505 ) // unreferenced function has been removed
      #pragma warning( disable: 4510 ) // default constructor could not be generated
      #pragma warning( disable: 4511 ) // copy constructor could not be generated
      #pragma warning( disable: 4512 ) // assignment operator could not be generated
      #pragma warning( disable: 4513 ) // destructor could not be generated
      #pragma warning( disable: 4610 ) // can never be instantiated user defined constructor required
      #pragma warning( disable: 4623 ) // default constructor could not be generated
      #pragma warning( disable: 4624 ) // destructor could not be generated
      #pragma warning( disable: 4625 ) // copy constructor could not be generated
      #pragma warning( disable: 4626 ) // assignment operator could not be generated
      #pragma warning( disable: 4640 ) // a local static object is not thread-safe
      #pragma warning( disable: 4661 ) // a member of the template class is not defined.
      #pragma warning( disable: 4670 ) // a base class of an exception class is inaccessible for catch
      #pragma warning( disable: 4672 ) // a base class of an exception class is ambiguous for catch
      #pragma warning( disable: 4673 ) // a base class of an exception class is inaccessible for catch
      #pragma warning( disable: 4675 ) // resolved overload was found by argument-dependent lookup
      #pragma warning( disable: 4702 ) // unreachable code, e.g. in header.
      #pragma warning( disable: 4710 ) // call was not inlined
      #pragma warning( disable: 4711 ) // call was inlined
      #pragma warning( disable: 4820 ) // some padding was added
      #pragma warning( disable: 4917 ) // a GUID can only be associated with a class, interface or namespace

      Reference:

      * http://alfps.wordpress.com/201...
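
      A narrower alternative to a global disable list, sketched under the assumption that only a few spots need it: MSVC's `#pragma warning(push)` and `#pragma warning(pop)` limit a suppression to one region. `always_on` is a hypothetical example, and the `_MSC_VER` guard keeps other compilers from seeing the MSVC-only pragmas.

```cpp
// Sketch: suppress C4127 (conditional expression is constant) locally.
#ifdef _MSC_VER
#pragma warning(push)             // remember the current warning state
#pragma warning(disable: 4127)    // applies only until the matching pop
#endif

inline int always_on() {
    if (sizeof(char) == 1) {      // constant condition, the kind C4127 is about
        return 1;
    }
    return 0;
}

#ifdef _MSC_VER
#pragma warning(pop)              // restore the saved warning state
#endif
```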

    10. Re:Because I'm lazy by QilessQi · · Score: 4, Informative

      I've spent many decades in that Real World. Ignoring compiler warnings and failing to write automated unit tests for edge cases can cause production defects and database corruption crises that will eat many, many more hours of productivity than simply addressing all compiler warnings. Not to mention causing poor end-user perception and increasing the workload up and down the software support and delivery chain.

      Developers whose coding habits cause such situations in real world enterprise or commerce systems are ultimately "less productive" than having no developer at all. :-)

    11. Re:Because I'm lazy by Imagix · · Score: 3, Informative

      That one's quite useful. You've declared a variable, and now whoever is reading the code has the additional cognitive load of trying to figure out why that variable exists.

    12. Re:Because I'm lazy by chis101 · · Score: 3, Informative

      If you are talking about C/C++, the variable is *not* null in either case. If you assigned null to it, then it is null. If you never assigned any value to it, then it is whatever happened to be in memory at that location. It's a pretty good warning to let you know you are using a variable without it being assigned a value.

      int* ptr;              // never assigned: holds whatever was in memory
      if( ptr != NULL )
      {
          *ptr = 0;
      }

      This code will at some point crash. Maybe not on the first run, but at some point ptr will not be null, but will not be a pointer to valid memory.
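
      For contrast, a minimal sketch of the same check with the pointer explicitly initialized (the names `store_zero` and `demo` are hypothetical): once the pointer starts out null, the test is meaningful and the write happens only when it refers to a real object.

```cpp
#include <cstddef>

// Writes 0 through ptr only when it points at something.
// Returns true if the write happened.
inline bool store_zero(int* ptr) {
    if (ptr != NULL) {
        *ptr = 0;
        return true;
    }
    return false;
}

// Exercises both cases with initialized pointers.
inline int demo() {
    int value = 42;
    int* unset = NULL;      // explicitly null: the check skips the write
    int* valid = &value;    // points at a live object: the write is safe
    store_zero(unset);
    store_zero(valid);
    return value;           // value was overwritten through `valid`
}
```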

    13. Re:Because I'm lazy by Obfuscant · · Score: 2

      Is "software" a science, an engineering practice, or is it an industry?

      People like you are what gives people who say "people like you" a bad name.

    14. Re:Because I'm lazy by bberens · · Score: 2

      This doesn't work well if/when you do something like upgrade your compiler version and code that previously produced no warnings now produces them. For example, I work on an older Java system that has a few million lines of code. Most of the older code does not use generics, however the modern Java compilers will throw a warning for all that old code. The old code works just fine and there's nothing wrong with it. It would be nice if it used generics but there's not a lot of sense in going back and updating millions of lines of code, some of which may not even work properly with generics. So for that entire codebase we suppress the warnings about generics.

      --
      Check out my lame java blog at www.javachopshop.com
    15. Re:Because I'm lazy by NormalVisual · · Score: 2, Interesting

      Rework the code so that it doesn't warn.

      The problem is that sometimes that's not an option. For instance, a few weeks ago I was working with some code in VS 2010 that used named enums. Even though Intellisense was smart enough to recognize the enum without the explicit name ("enum" instead of "name::enum"), the compiler kept throwing "unknown symbol" errors if the enum was left as-is, and it would throw a warning indicating that the syntax given was only valid under C++11 if I explicitly scoped the enum, which failed the build because we compile with warnings equating to errors. Changing the enum itself to be enclosed in a class or at least a namespace probably was the right way to do it, but it would end up affecting a lot of other code, which in turn meant an extra regression pass for the QA guys. So, the only practical solution at the time was to disable the warning for that block of code, and re-enable it afterwards with a comment explaining the reason.

      --
      Please stand clear of the doors, por favor mantenganse alejado de las puertas
    16. Re:Because I'm lazy by Darinbob · · Score: 3, Insightful

      Which is one reason why no one should ever use Microsoft's code or tools as exemplars. I've got a theory that the first thing Microsoft has a new intern do is write example code to give to customers, and the second thing the intern is asked to do is read the coding guidelines.

  2. LaTeX never fails by Extremus · · Score: 2

    My LaTeX builds rarely fail in MiKTeX. The compiler itself seems to be able to download packages and classes from a common repository (CTAN and its many mirrors).

  3. Re:It's usually a computer problem by matthiasvegh · · Score: 4, Insightful

    Oh, coding error? Well, that's helpful. Misplace a semicolon in a non-trivial meta-program or DSL in C++, and just watch the errors that the compiler spits back at you. None of which will have anything to do with semicolons. I suppose this is why the C++ errors are considered to be easy to fix. Mistype a word, and you get 15000 lines of errors. I suppose it's easy to fix all those errors too. Yes, but figuring out what exactly the coding error was is kind of the point.

  4. Here's a concept to prevent this crap - UNIT TESTS by SirGeek · · Score: 2

    If GOOD/Complete unit tests for code exist and this change would break them, how freaking tough is it to run the unit tests before committing your change to source code control?

  5. Dependencies? by Anonymous Coward · · Score: 3, Insightful

    Dependencies just magnify all other problems. If your code depends on nothing then it won't break unless the compiler changes. Unfortunately such programs don't exist because you can never depend on nothing and do anything useful. In reality if you depended on nothing you'd end up writing your own console, your own I/O, pretty much your own CRT. This sounds great until you realize your dependency is now the hardware itself and it's likely your code won't be portable in any useful sense. That's why we have kernels.

    The problem with C++ is that dependency management is usually file-level and developers 'rarely' care about any file-level constructs (and nor should they, it's an abstract packaging concept). As a result you try to drag in one enum and end up with 100 #includes and 500 new classes you don't care about. This causes bigger object files to be emitted, vastly slower linkage and lots of dependencies you don't expect. All it takes now is for one of those includes to #define something unexpected and BOOM...the house of cards comes crashing down.
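
    One way to trim this kind of include chain, sketched here in a single file for brevity: since C++11, an enum with a fixed underlying type can be declared without its enumerators, so a header that only passes the enum around need not include its full definition. All names (`LogLevel`, `is_verbose`) are hypothetical.

```cpp
// What a lightweight header could contain: an opaque declaration only.
enum class LogLevel : int;            // size is known, enumerators are not
bool is_verbose(LogLevel level);      // already usable in signatures

// What the one heavier header would contain: the full definition.
enum class LogLevel : int { Quiet = 0, Normal = 1, Verbose = 2 };

// Implementation file: needs the enumerators, so it sees the definition.
bool is_verbose(LogLevel level) {
    return level == LogLevel::Verbose;
}
```

    Only the files that name individual enumerators have to pull in the definition; everything else can get by with the one-line declaration.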

    Also, did I mention? The C preprocessor causes a lot of grief when it's abused.

  6. Dependencies Problems = "It builds on my machine" by Virtucon · · Score: 5, Insightful

    Once code is checked in and goes through the standard build process, that's where this is expected to occur, because in my experience it's the local environment where the developer does the coding that's the root problem. Why? Developers don't refresh their build environment because of the potential for other problems it may create. I had one gig to unfuck some code at a company a couple of years ago and found out that setting up a Dev environment in this place could take two weeks or more depending on what team you were on. You had to go through a script, download this, install that, change this... A nightmare. Updating dependencies on a local desktop created panics amongst the developers, who were reluctant to ever change anything they had which "was working" because you could spend days trying to fix what was broken. Naturally, any time they migrated code into test or production (there was no build system) things failed there because of dependency-related issues. Also, depending on who the developer was, they naturally felt that bypassing the Test/QA cycle was a job perk.

    I found dozens of dependencies on desktops that were out of date, deprecated or had major vulnerabilities, and that went for the production systems as well. It was bad all the way around from a best practices perspective. Daily production crashes were the norm; the VP of Dev had a monitor on his desk so he could "troubleshoot" production problems, it was that bad.

    Yes, there are shops like this still out there.

    --
    Harrison's Postulate - "For every action there is an equal and opposite criticism"
  7. Re:It's usually a computer problem by aliensexfiend · · Score: 4, Insightful

    When I see the error avalanche, the first place I check is the first few error messages, and that is usually enough to spot the problem. Typos still make C++ compilers barf way too much crap.

  8. Re:Dependencies Problems = "It builds on my machin by gstoddart · · Score: 2

    I had an experience which was somewhat opposite (though, in a lot of ways pretty much the same).

    At one point, the company went with a big giant universal build system.

    Every piece of software, every module, every final build ... was recompiled from scratch on a nightly basis. It took a massive server farm many hours to do this. Even if no changes had been made.

    What would happen would be someone would break a component. The build of that component, and every downstream dependency broke. The system had no concept of "this is a beta build, not for everybody" and "this is a release, and stable".

    The result was that sometimes you'd have literally dozens of things which were now suddenly broken. It was too stupid of a build system to use the last known good.

    So, all of a sudden you get one trivial change in some module about 4 steps removed from your stuff. But, it was all broken, and your stuff couldn't be properly built until someone fixed their stuff, and the build system went through at least one more cycle, often two.

    Sometimes, companies get themselves into such a borked state with their build system (or lack thereof) that it makes doing any work impossible.

    Some of us started keeping our own local copies, and writing local build scripts, because we couldn't rely on the company wide one to actually work much of the time.

    --
    Lost at C:>. Found at C.
  9. Re:why greed fear ego based 'societies' fail? by HeckRuler · · Score: 2

    Is this some sort of cry for help? Are you ok? Do you need us to call someone?

  10. Re:It's usually a computer problem by Anonymous Coward · · Score: 5, Insightful

    Please, give up the C++ slander.

    Like any compiler output, read the first error. If you are a developer of any calibre, having a few pages of errors shouldn't faze you, and it's not unique to C++ to generate a few erroneous errors. All it requires is a basic level of competence, and if you don't possess that then any programming language that facilitates you generating anything that compiles is doing no one any favours.

  11. Re:Eighteen THOUSAND engineers?! by rolfwind · · Score: 2

    Well if they won't make the beta programs today that will get discontinued tomorrow, who will?

  12. Re:Here's a concept to prevent this crap - UNIT TE by OakDragon · · Score: 2

    If your parents had unit testing you never would've been born.

    I would still have been born, just not so buggy.

  13. Re:Here's a concept to prevent this crap - UNIT TE by EvanED · · Score: 2

    And of course everyone always builds with the same configuration, same compiler, on the same platform.

    (We have CI servers in our environment. They break not infrequently. Why? Because someone commits a change that builds fine on Linux, and when MSVC gets ahold of it, it produces a warning that GCC doesn't catch and so the build fails. Or MSVC accepts some piece of code that is not actually legal C++ because it's too loose, so when the Linux buildbots get ahold of it, they complain.)

  14. Re:Here's a concept to prevent this crap - UNIT TE by Anonymous Coward · · Score: 2, Insightful

    Why not just say that you should always build against the latest official working source before checkin? It has nothing to do with unit testing.

  15. why women don't code also a solution by globaljustin · · Score: 4, Insightful

    I know this is "offtopic" but stay with me and I'll bring it around on-topic...

    A big question that people are throwing Billions of dollars & millions of internet comments about is "How can we get more women into programming/coding?"

    Ok...b/c our industry is by default very complex, it's not unreasonable that to really drill down to an answer to that question might be fairly complex...the answer can be summarized, sure, but to really get at the problem it involves learning a bit.

    Here, in this thread, we find out why...and it affects us **all**, not just women coders, or coders...it affects how the whole company works and the perception of value...witness:

    Why do you assume all warnings are useful?? Some of the compiler warnings are just pedantic and are "noise" such as "variable declared but not used", etc.

    There is a balance between no warnings and pedantic warnings, namely the useful ones.

    One of the things that's nice about the Eclipse IDE is that you can select the importance of selected messages, all the way from "ignore" to "fatal", depending on shop standards and personal paranoia.

    However, the offline builders such as Maven and Ant cannot adopt those preferences, so it's not uncommon for a production build to spit out dozens or hundreds of warnings about things that don't actually matter.

    Working with C/C++ I almost never had clean builds, since even....

    Here we have a central thesis:

    "There is a balance between no warnings and pedantic warnings, namely the useful ones."

    Parent agrees, and describes how using a **proprietary software** (Eclipse) which adds an **extra abstraction layer** to an already ridiculous process...a process which we all know theoretically should be able to be done on a text editor

    the fact that coding, the act of developing, software engineering, the 'real work' has such obtuse solutions, solutions to problems based on...

    PEDANTIC choices...overkill...the lack of discretion...there are many reasons for this but that's another rant

    it's alienating to new people regardless of gender...the only reason many people work jobs as coders is **for the money**

    until we address these fundamental issues, the problems that arise only because some compiler programmer was overly pedantic due to lack of empathy skills will destroy any attempt to get non-traditional types into coding

    right now, you basically have to be a bit autistic, or be able to think that way on command, in order to code...part of it is genetic, but part of it is deliberate...you have to train your mind to think in a "code" instruction manner...why would a woman do all this given other options?

    the solution to pedantic, tone-deaf coding choices is, of course, a fresh perspective that can help get rid of problems from abstractions...

    we need women in coding to help make coding more appealing to women

    so, to make this on-topic, I think **more women in coding** is a long-term solution to problems in TFA

    --
    Thank you Dave Raggett
  16. Clarifications by Afty · · Score: 5, Informative

    Hi, I'm one of the authors of the paper and an engineer at Google. I wanted to clarify some points that have come up in the comments.

    First, we don't believe that failing builds are bad. We wanted to study the typical edit-compile-debug cycle that all developers (at least those writing in compiled languages) use to write code. It's perfectly fine to do something like change the signature of a method, compile, then use the compiler errors to find all places where you need to fix your code. We were interested in what kinds of compile errors people run into, how long it takes them to fix the errors, and how we can help you go from a failed to a successful build more quickly. For example, for one particular class of dependency error, we saw that people were spending too much time fixing it. So we created a tool to automatically fix the error and included the command to run the tool in the error message emitted by the compiler. After that we saw the fix time for that class of error drop significantly.

    Second, this work is not related to checking in broken code. The builds we looked at are work-in-progress builds from Google developers working on their projects, so it's code in intermediate states of development, not code that has been checked in. It's possible that broken code may be checked in, but our continuous build system will catch that quickly and force you to fix the problem. So for all intents and purposes, all of the code checked into our depots builds cleanly.

    Third, by dependency issues we probably don't mean what you think we mean. Within Google we use a custom build system with a custom build file format. Source code is grouped into build targets, and build targets depend on each other, even across languages. You can assume that code checked into the depot builds successfully, and that generally engineers are editing only code in their project and not in their dependencies. The dependency errors we describe in the paper usually result because someone added a source-code-level dependency without adding a matching dependency in the build file, resulting in a "cannot find symbol" error. For example, in a JUnit test you might write the code:

    Assert.assertTrue(foo);

    But if you don't add a dependency on JUnit to the build file, then you will get a compile error because the build system doesn't know where to find the Assert class. We would count that as a dependency error.

    Finally, at Google there is no distinction between "builds on my machine" and "builds on someone else's machine." Our build system requires that all dependencies be explicitly declared, even environmental dependencies like compiler versions and environment variables, so that a build is reproducible on any machine. This is how we are able to distribute our builds. So it's impossible for code to build on a developer's local machine but not on the continuous build system.

    I'm happy to answer further questions if people are interested.

  17. Re:It's usually a computer problem by turgid · · Score: 2

    c++ is great.

    Keep repeating it often enough and people will believe you.