Why Software Builds Fail
itwbennett writes: A group of researchers from Google, the Hong Kong University of Science and Technology and the University of Nebraska undertook a study of over 26 million builds by 18,000 Google engineers from November 2012 through July 2013 to better understand what causes software builds to fail and, by extension, to improve developer productivity. And, while Google isn't representative of every developer everywhere, there are a few findings that stand out: Build frequency and developer (in)experience don't affect failure rates, most build errors are dependency-related, and C++ generates more build errors than Java (but they're easier to fix).
Half the time when I'm working on any sort of non-trivial program (that is too large to hold in my head all at once) and I need to make a breaking code change (and one that is not easily managed with refactoring tools), I'll make the change where it is obvious to me and then let the compiler tell me where it broke and hence where I need to make my fixes.
I am Slashdot. Are you Slashdot as well?
Oh coding error? Well thats helpful. Misplace a semicolon in a non-trivial meta-program or dsl in C++, and just watch the errors that the compiler spits back at you. None of which, will have anything to do with semicolons. I suppose this is why the C++ errors are considered to be easy to fix. Mistype a word, and you get 15000 lines of errors. I suppose it's easy to fix all those errors too. Yes, but figuring out what exactly the coding error was is kind of the point.
Once code is checked in and goes through the standard build process, that's where this is expected to occur because in my experience it's the local environment where the developer does the coding that's the root problem. Why? Developers don't refresh their build environment because of the potential for other problems it may create. I had one gig to unfuck some code at a company a couple of years ago and found out that in order to set up a Dev environment in this place could take two weeks or more depending on what team you were on. You had to go through a script, download this, install that, change this.. A nightmare. Updating dependencies on a local desktop created panics amongst the developers who were reluctant to ever change anything they had which "was working" because you could spend days trying to fix what was broken. Naturally any time they migrated code into test or production (there was no build system) things failed there because of dependency related issues. Also depending on who the developer was, they naturally felt that bypassing the Test/QA cycle was a job perk.
I found dozens of dependencies on desktops that were out of date, deprecated or had major vulnerabilities and that went for the production systems as well. It was bad all the way around from a best practices perspective. Daily production crashes were the norm, the VP of Dev had a monitor on his desk so he could "troubleshoot" production problems it was that bad.
Yes there's shops like this that are still out there.
Harrison's Postulate - "For every action there is an equal and opposite criticism"
When I see the error avalanche the first place I check are the first few error messages and that is usually enough to spot the problem. Typos still make c++ compilers barf way too much crap.
Please, give up the C++ slander.
Like any compiler output, read the first error. If you are a developer of any calibre, having a few pages of errors shouldn't phase you and it's not unique to C++ to generate a few erroneous errors. All it requires is a basic level of competence and if you don't possess that then any programming
language that facilitates you generating anything that compiles is doing noone any favours.
I know this is "offtopic" but stay with me and I'll bring it around on-topic...
A big question that people are throwing Billions of dollars & millions of internet comments about is "How can we get more women into programming/coding?"
Ok...b/c our industry is by default very complex, it's not unreasonable that to really drill down to an answer to that question might be fairly complex...the answer can be summarized, sure, but to really get at the problem it involves learning a bit.
Here, in this thread, we find out why...and it affects us **all** not just woman coders, or coders...it affects how the whole company works and the perception of value...witness:
Here we have a central thesis:
"There is a balance between no warnings and pedantic warnings, namely the useful ones."
Parent agrees, and describes how using a **proprietary software** (Eclipse) which adds an **extra abstraction layer** to an already ridiculous process...a process which we all know theoretically should be able to be done on a text editor
the fact that coding, the act of developing, software engineering, the 'real work' has such obtuse solutions, solutions to problems based on...
PEDANTIC choices...overkill...the lack of discretion...there are many reasons for this but that's another rant
it's alienating to new people regardless of gender...the only reason many people work jobs as coders is **for the money**
until we address these fundamental issues, the problems that arise only because some compiler programmer was overly pedantic due to lack of empathy skills will destroy any attempt to get non-traditional types into coding
right now, you basically have to be a bit autistic, or be able to think that way on command, in order to code...part of it is genetic, but part of it is deliberate...you have to train your mind to think in a "code" instruction manner...why would a woman do all this given other options?
the solution to pedantic, tone-deaf coding choices is, of course, a fresh perspective that can help get rid of problems from abstractions...
we need women in coding to help make coding more appealing to women
so, to make this on-topic, I think **more women in coding** is a long-term solution to problems in TFA
Thank you Dave Raggett
Hi, I'm one of the authors of the paper and an engineer at Google. I wanted to clarify some points that have come up in the comments.
First, we don't believe that failing builds are bad. We wanted to study the typical edit-compile-debug cycle that all developers (at least those writing in compiled languages) use to write code. It's perfectly fine to do something like change the signature of a method, compile, then use the compiler errors to find all places where you need to fix your code. We were interested in what kinds of compile errors people run into, how long it takes them to fix the errors, and how we can help you go from a failed to a successful build more quickly. For example, for one particular class of dependency error, we saw that people were spending too much time fixing it. So we created a tool to automatically fix the error and included the command to run the tool in the error message emitted by the compiler. After that we saw the fix time for that class of error drop significantly.
Second, this work is not related to checking in broken code. The builds we looked at are work-in-progress builds from Google developers working on their projects, so it's code in intermediate states of development, not code that has been checked in. It's possible that broken code may be checked in, but our continuous build system will catch that quickly and force you to fix the problem. So for all intents and purposes, all of the code checked into our depots builds cleanly.
Third, by dependency issues we probably don't mean what you think we mean. Within Google we use a custom build system with a custom build file format. Source code is grouped into build targets, and build targets depend on each other, even across languages. You can assume that code checked into the depot builds successfully, and that generally engineers are editing only code in their project and not in their dependencies. The dependency errors we describe in the paper usually result because someone added a source-code-level dependency without adding a matching dependency in the build file, resulting in a "cannot find symbol" error. For example, in a JUnit test you might write the code:
Assert.assertTrue(foo);
But if you don't add a dependency on JUnit to the build file, then you will get a compile error because the build system doesn't know where to find the Assert class. We would count that as a dependency error.
Finally, at Google there is no distinction between "builds on my machine" and "builds on someone else's machine." Our build system requires that all dependencies be explicitly declared, even environmental dependencies like compiler versions and environment variables, so that a build is reproducible on any machine. This is how we are able to distribute our builds. So it's impossible for code to build on a developer's local machine but not on the continuous build system.
I'm happy to answer further questions if people are interested.