Slashdot Mirror


The Effect of Programming Language On Software Quality

HughPickens.com writes: Discussions whether a given programming language is "the right tool for the job" inevitably lead to debate. While some of these debates may appear to be tinged with an almost religious fervor, most people would agree that a programming language can impact not only the coding process, but also the properties of the resulting product. Now computer scientists at the University of California — Davis have published a study of the effect of programming languages on software quality (PDF) using a very large data set from GitHub. They analyzed 729 projects with 80 million SLOC by 29,000 authors and 1.5 million commits in 17 languages. The large sample size allowed them to use a mixed-methods approach, combining multiple regression modeling with visualization and text analytics, to study the effect of language features such as static vs. dynamic typing, strong vs. weak typing on software quality. By triangulating findings from different methods, and controlling for confounding effects such as team size, project size, and project history, they report that language design does have a significant, but modest effect on software quality.

Quoting: "Most notably, it does appear that strong typing is modestly better than weak typing, and among functional languages, static typing is also somewhat better than dynamic typing. We also find that functional languages are somewhat better than procedural languages. It is worth noting that these modest effects arising from language design are overwhelmingly dominated by the process factors such as project size, team size, and commit size. However, we hasten to caution the reader that even these modest effects might quite possibly be due to other, intangible process factors, e.g., the preference of certain personality types for functional, static and strongly typed languages."

6 of 217 comments (clear)

  1. Take away for me by just_another_sean · · Score: 5, Interesting

    e.g., the preference of certain personality types for functional, static and strongly typed languages.

    My guess is that this has a bigger impact on most projects than actual features of a chosen language. I was thinking it the whole time I read the summary and then, sure enough, it's mentioned as a disclaimer at the end...

    --
    Creationist Textbook Stickers Declared Unconstitutional by CowboyNeal
    1. Re:Take away for me by jythie · · Score: 4, Interesting

      That was my initial thought too. Each language and framework within the language seems to have developed its own developer culture, with people generally gravitating towards languages popular among others like themselves. I am not even sure if it is really 'personality type' of the individuals, but instead seems more likely to be the cultural norms within the developer community and what the 'right' balances when it comes to acceptable practices and such.

    2. Re:Take away for me by Mr+Z · · Score: 4, Interesting

      I finally brought up the PDF. It appears the authors consider C++ weakly typed because it allows type-casting between, say, pointers and integers.

      While this is strictly true, I find myself avoiding such things whenever possible. Main exception: When talking directly to hardware, it's often quite necessary to treat pointers as integers and vice versa.

      I guess to fairly evaluate a language like C++, you need to categorize programs based on how the language was used in the program. If you stick to standard containers and standard algorithms, eschewing casting magic except as needed (and using runtime-checked casts the few places they are), your program is very different than one that, say, uses union-magic and type punning and so on every chance it gets. (I've written both types of programs... again, FORTRAN in any language.)

      One of my more recent projects was written ground-up in C++11. It relies on type safety, standard containers, standard algorithms, smart pointers (shared_ptr, unique_ptr) fairly heavily. It's been quite a different experience to program vs. my years of C programming. Way fewer dangling pointers, use-after-free errors, off-by-one looping errors, etc. But, the paper lumps both languages into the same bucket. That hardly seems fair.

  2. More factors to normalise out. by beelsebob · · Score: 4, Interesting

    It's clear that there are more factors here that need to be normalised out. For example, they found that the category that "had" the most performance bugs was the procedural, static, unmanaged memory category, i.e. C, C++ etc, far outstripping languages like ruby. To me, it's clear that that is caused by people using these languages actually caring about performance, while people using languages who's implementations are many orders of magnitude slower, don't really file (or fix) bugs about perf.

  3. Other factors. by LWATCDR · · Score: 4, Interesting

    Almost no casual programer uses functional languages and do not tend to be used for large FOSS projects.

    --
    See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
  4. Re:KISS by Anne+Thwacks · · Score: 5, Interesting
    I have been responsible for language choice on many occasions, and I can say with confidence, sometimes I made the right choice and sometimes not.

    For a project starting in 1983, and expected to last 8 months, Basic seemed like a good idea. By 1995, it should have been obvious to everyone that a re-write in ANYTHING ELSE was justified (not that I would personally recommend using Perl to write a Basic interpreter to re-interpret the original Basic code or using Snobol4 to translate the Basic into Fortran).

    I could have used C instead - the project would probably have taken a couple of weeks longer, but would have saved countless people years of grief. I have C programs from the 80's that compile on *BSD unchanged, and still work as intended*. It was a toss-up at the time.

    My point is that the language choice may be influenced by incorrect information about the external world - because the external world is subject to massive change.

    * I had to rewrite some C from the 70's cos they were written for Idris and all in capitals :-{ Fortran4 programs from the 70's may compile and run, but you certainly need to re-test them!

    Yes its true: my lawn is written in Fortran, but my Mum's has an Ibjob border.

    --
    Sent from my ASR33 using ASCII