Do Strongly Typed Languages Reduce Bugs? (acolyer.org)
"Static vs dynamic typing is always one of those topics that attracts passionately held positions," writes the Morning Paper -- reporting on an "encouraging" study that attempted to empirically evaluate the efficacy of statically-typed systems on mature, real-world code bases. The study was conducted by Christian Bird at Microsoft's "Research in Software Engineering" group with two researchers from University College London. Long-time Slashdot reader phantomfive writes:
This study looked at bugs found in open source Javascript code. Looking through the commit history, they enumerated the bugs that would have been caught if a more strongly typed language (like Typescript) had been used. They found that a strongly typed language would have reduced bugs by 15%.
Does this make you want to avoid Python?
I suspect that there is something like a "law of conservation of bugs" or something in software - you take away one vector for bugs to originate and you just move them into another place.
Dynamic languages do have an easy way to introduce bugs - especially languages like javascript that simply create new variables if you have a typo.
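To make the typo point concrete, here is a minimal sketch in TypeScript (the language the study used). It uses a misspelled property rather than a misspelled variable, since the mechanism is analogous. The `Counter` interface and both functions are hypothetical names invented for illustration, not anything from the study.

```typescript
// Hypothetical example: a counter object with a single field.
interface Counter {
  total: number;
}

// Dynamic style: the object is treated as an open bag of properties,
// which is how plain JavaScript treats every object.
function addDynamic(counter: Record<string, number>, n: number): void {
  // Typo ("totl"): nothing complains, a new property is silently created.
  counter["totl"] = (counter["total"] ?? 0) + n;
}

// Typed style: the same typo would be rejected before the program runs.
function addTyped(counter: Counter, n: number): void {
  // counter.totl = counter.total + n; // compile error: 'totl' does not exist on type 'Counter'
  counter.total = counter.total + n;
}

const loose: Record<string, number> = { total: 1 };
addDynamic(loose, 2); // loose.total is unchanged, loose.totl now exists

const strict: Counter = { total: 1 };
addTyped(strict, 2); // strict.total is updated as intended

console.log(loose, strict);
```

The dynamic version runs without complaint and leaves `total` untouched, which is the kind of bug that can sit unnoticed for a long time.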
But there is the old adage in statically typed compiled languages "Hey, my code compiles! Now I get to find out where all my bugs really are."
This also applies to other aspects of programming languages. Consider the arguments about manual vs automatic memory management. Managed code still has bugs, just not memory management bugs.
"There are a dozen opinions on a matter until you know the truth. Then there is only one." - CS Lewis (paraphrase)
Urgh, Typescript. Well, MS has created some truly bad languages and is still hard at it. Of course, the only aim MS ever had with its own languages was to chain people to its equally horrible platforms. This study seems to be just another element of this strategy.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
On the contrary - strong typing is preferred by those who have coded a long time and have to maintain systems that have been around for a long time.
As soon as you inherit code written by someone else you will waste a lot of time understanding how it works - and if it's not strongly typed you can easily miss something that previous coders introduced. A strongly typed system will tell you quite fast that the method whose header you changed was actually used by 200 subroutines. On a system written in a language that doesn't force strong typing you may discover that routines you didn't know existed are using it - and they are used only once per year on New Year's Eve - guess who has to put in overtime then?
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
Perhaps so. On the other hand, Microsoft has written tons of software over the years and perhaps this study might be born out of decades of experience?
From my own ~20 years of experience writing software (never at Microsoft, mind you), I'd tend to agree with them that dynamic typing is a very good way to introduce subtle errors that would have been easily detected in a static typed system. God knows how many man hours of my life were burned hunting down such bugs created by people before me in my software engineering career.
On the other hand, static typing generally induces a slow compilation step that you have to wait through hundreds of times when developing a significant application. Dynamically typed languages are generally interpreted and forgo compilation at the expense of some runtime performance.
For whipping out some throwaway code to get something up in a hurry, nothing beats dynamically typed and interpreted. But when I want to make something seriously strong, high performance, and lasting the test of time, I'll reach for my static typed compiler every time, thank you very much.
As usual, use the best tool for the job at hand, whichever tool that may be.
The more you catch at compile time, the less there is to bite you on the ass at runtime. Cheaper in terms of development effort too to fix bugs before customer reports them.
Current python programmer, and former C programmer here. Dynamic typing is great for small to medium sized code bases but I would hate to work on anything really big and mature without static type checking. I have worked on large code bases with maybe 100 megabytes of code in C and Ada, a lot of it dating back 20 years or so. You can't safely refactor code that old, and you can't allow data of the wrong type to get into it, so type checking at compile time is an important line of defense.
You can do that in python with run time checks of data types, and that's what you would have to do at the entry point of your libraries, but doing that eats into your performance. Python is too malleable for static type checking.
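The comment is about Python, but the idea of checking types at run time at a library's entry point can be sketched in TypeScript as well. This is only an illustration under assumptions: `Reading`, `isReading`, and `record` are hypothetical names, and the check shown is about the simplest one possible.

```typescript
// Hypothetical record type accepted by a library entry point.
interface Reading {
  sensor: string;
  value: number;
}

// Runtime check: inspects the actual value, because data arriving from
// outside (JSON, user input) carries no compile-time guarantees.
function isReading(x: unknown): x is Reading {
  if (typeof x !== "object" || x === null) return false;
  const r = x as Record<string, unknown>;
  return typeof r["sensor"] === "string" && typeof r["value"] === "number";
}

// Entry point: validate once here, so the rest of the library can rely on the type.
function record(input: unknown): number {
  if (!isReading(input)) {
    throw new TypeError("expected { sensor: string, value: number }");
  }
  return input.value * 2; // input is narrowed to Reading from here on
}

console.log(record(JSON.parse('{"sensor":"a","value":21}')));
```

The check runs on every call, which is the performance cost the comment mentions; a compile-time check costs nothing once the program is built.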
http://michaelsmith.id.au
Yes, the language it was written in was broken.
Or are you suggesting that the people that actually write in these loose languages "because it saves me typing 5 characters" actually will write documentation saying that said function is called at new years eve? And will spend time on proper architecture? Perhaps in your fantasy world they also wrote unit tests with 90+% coverage?
No. You get the code dumped in your lap, and you will be praising the gods if there's an old completely outdated Confluence site that describes what some junior on the team thought that the software is supposed to do.
If that code is Javascript, then you might as well throw it away and rewrite it, it will be faster. If that code is Java, you can make modifications and refactorings to it and be reasonably sure that you didn't break anything totally unrelated.
Exactly what I was thinking. It isn't just that the end code might have 15% fewer bugs. Development will be quicker and more confident because a bunch of the stupid little mistakes you make while coding are automatically checked for, and squigglies tell you to fix them right away.
Except that weakly typed languages, Python, JS, and Perl, tend to be more concise and quicker to write than strongly typed languages such as C, C++, and Java.
It really depends on the resources you're willing to invest in the project. If you have a good staff and are willing to invest the time then a strongly typed language can give you something more reliable.
But if you're investing fewer resources then weakly typed might be the way to go. You'll miss some dumb bugs due to the typing but you have less complexity overall, and that will give you a more stable product more quickly.
I stole this Sig
This study is one of those "Well, duh!" type studies. Strongly typed languages are easier to refactor, maintain, and debug. It's also easier for someone else to understand as they can see exactly what types of objects are being used at any given point in the code.
Weakly typed languages are easier to do short, quick, dynamic programming. And arguably, that's what they were designed for. I'm not going to haul out a C++ toolchain just to write a few simple REST services when I can write a few short Python/Flask scripts in a fraction of the time.
As always, use the right tool for the right job. Well, unless there are only shitty tools for the job, in which case you're stuck using a shitty tool. I'm looking at you Javascript, you worthless piece of Turing complete trash.
~X~
That used to be true, but I don't think it is so much anymore. Modern languages like Swift, Scala, Kotlin, etc. do a good job of being concise while still keeping full type safety. And that ends up making them faster to write. I also do a lot of Python programming, and spend way too much time running my code over and over just to discover typos and similar mistakes that my editor would have instantly highlighted for me if I'd been working in a statically typed language.
"I'm too busy to research this and form an educated opinion, but I do have time to tell everyone my uninformed opinion."
The question isn't really "Does a strongly typed language reduce bugs?", because the obvious answer is: Yes, it does. If you went to the logical extreme and created a language that only had 3 commands, you could eliminate whole classes of bugs. The more strict the rules, the harder it is to do the wrong thing.
But the question is really: Would developers spend more time fighting against the type system in a strongly typed language or against type related bugs in a dynamic one?
The answer to that question seems much murkier, and I don't think a study looking at the types of bugs checked in on GitHub can answer it.
The idea that you develop more rapidly in weakly typed languages (and, implicitly, the importance of maximizing development time) seems to stem from two erroneous assumptions.
The first is that we spend the majority of our time developing. This may be true for hobbyist programmers and possibly even consulting work for small projects. For any effort beyond that, which is the majority of paid programming work and popular open source products, the majority of time is spent fixing bugs.
The second is that strongly typed languages have to be slower to develop. In almost every domain at almost every level of complexity, I find Scala faster to develop in than Python. It gives me strongly typed error-checking as I write the language by using the presentation compiler (assuming modern IDE) to highlight issues early on. It also gives me type inference so I do not often have to specify the types I'm working in at many declaration sites. Finally, it gives me a terse transformation of collections using the combination of strong typing and type inference. I also spend less time thinking about how to express what I want to do.
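The comment describes Scala. The same combination of type inference and checked collection transforms can be sketched in TypeScript, used here only because the study discussed above was about it. The `commits` data is made up for illustration.

```typescript
// Hypothetical data. No type annotations are written anywhere in this block;
// the compiler infers the element type from the literal.
const commits = [
  { author: "ann", bugfix: true, lines: 12 },
  { author: "bob", bugfix: false, lines: 40 },
  { author: "ann", bugfix: true, lines: 5 },
];

// Inferred as number[]: keep only bug-fix commits, then project the line counts.
const bugfixLines = commits.filter(c => c.bugfix).map(c => c.lines);

// Inferred as number: sum the projected values.
const total = bugfixLines.reduce((sum, n) => sum + n, 0);

console.log(bugfixLines, total);
```

A misspelling such as `c.bugfx` would be flagged in the editor before the code is ever run, even though no types were spelled out.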
For any large scale project, if the IDE can allow you to click through to a definition of a complex type or method definition, this saves oodles and oodles of time because we need to understand what we're calling and reading the code often is faster than reading the manual and deducing where it lied to you. If you're not calling code you didn't write, you're not doing something sufficiently complex as to really be interesting or you're writing something embedded in which case the focus on reducing bugs is even more important. Statically typed languages do a much better job of giving you this reference automatically.
Of course, this is all predicated on the types of development we're doing, because no language anywhere can beat the development speed of another language that has a popular, mature domain-specific library to solve a problem. If the language itself has operators or constructions which excel in a specific domain, it's also difficult to beat that language with any language that lacks that focus.
Reality is a slackware box running on a 386 tucked away in god's sock drawer.
The article talks about static typing, not strong typing; the two are different concepts. Strong typing means that type errors are always caught, static typing means that if type errors are caught, they are caught at compile time. JavaScript is both weakly typed and dynamically typed; weak typing is probably a bigger problem than dynamic typing. In any case, whatever conclusions you derive about type systems from experimenting on particular languages really only apply to that language. TypeScript is nice for JavaScript; that doesn't mean that adding static typing to Python would be as useful.
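The weak-versus-static distinction can be shown in a few lines of TypeScript. This is a sketch, not anything from the study: `any` is used to switch static checking off so that the values behave as they would in plain JavaScript, and all variable names are made up.

```typescript
// `any` disables static checking, so these lines behave as in plain JavaScript.
const five: any = "5";

// Weak typing: the operands are silently coerced instead of raising an error.
const concatenated = five + 1; // string + number becomes concatenation
const multiplied = five * 2;   // string * number converts the string to a number

// Static typing: with a declared type, the multiplication is rejected at compile time.
const typedFive: string = "5";
// const bad = typedFive * 2; // compile error: left-hand side of an arithmetic operation must be a number-like type
const explicit = Number(typedFive) * 2; // the conversion has to be written out

console.log(concatenated, multiplied, explicit);
```

Note that `typedFive + 1` would still compile and still concatenate, so adding static checks on top of JavaScript removes some of the weak-typing surprises but not all of them.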
In addition, there is a price to pay for static typing: software becomes more complex, people tend to implement their own "dynamic type light" libraries, etc. So, even when static typing reduces bugs, it's not clear that it results in a better product at a lower cost, which is what you ultimately care about.
Anyone with a complete picture of their project is working on a toy. Leave college and you might find the world doesn't work the way you want.
I normally try to avoid personal attacks, but you, sir, are an idiot bitch.
The number of times I've fixed a god damn typo in a variable name made by someone who refused to stop coding in vim like it's the god damn dark ages is well beyond what I can count on my fingers, even in binary.
And if the language had been strongly typed, the code never would have been able to build in the first place. But it did. And it silently failed for years.
Terrible programmers always think they're the best and don't need help. Grow the fuck up and realize you need all the help you can get, same as everyone else.
"if (foo = bar)" isn't a bug in the code. It's only a bug in your brain.
bar = ;
if( foo = bar )
{
foo += 2;
}
so foo is bar+2 if bar is true, otherwise foo is the same false as bar, be it undef, zero, null, or blank. And if you add some local scoping, being able to manipulate foo without manipulating bar often makes a lot of sense, especially with complex objects, and especially with functional logic like if(foo = dclone(bar)) -- or the much more routine if( record = dbgetrow(statement) ) which I'm absolutely certain you've done more than 100 times.
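The fetch-a-row idiom mentioned above can be sketched in TypeScript, where assignment inside a condition is also legal. `nextRow` is a hypothetical stand-in for something like `dbgetrow(statement)`, and the in-memory array replaces a real database.

```typescript
// Hypothetical row source standing in for a database cursor.
const rows: string[] = ["alice", "bob"];
function nextRow(): string | undefined {
  return rows.shift(); // returns undefined once the rows run out
}

const seen: string[] = [];
let row: string | undefined;

// Deliberate assignment inside the condition: fetch, store, and test in one step.
// The explicit comparison makes it clear that "=" was intended, not "==".
while ((row = nextRow()) !== undefined) {
  seen.push(row.toUpperCase()); // row is known to be a string inside the loop
}

console.log(seen);
```

Wrapping the assignment in parentheses and comparing the result explicitly is one common way to signal that the single `=` is intentional.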
The bug in your brain is actually not a programming one. It's a visual one. Why are "=" and "==" so visually similar when they are functionally different? I might suggest using <=> instead of == in perl, although the boolean would be reversed. At least cmp covers you for strings.
No, they take slightly less time to get kind-of-working code, and significantly more time fixing the bugs in it, and significantly more time in maintenance to understand what those bugs are and what data an algorithm is working on.
I still have more fans than freaks. WTF is wrong with you people?
This is where you're wrong. The complexity exists, regardless of the language. Type exists, whether your language is loosely or strongly typed. In one you just ignore it, which causes a class of errors. In the other you get free error checking. Pretending that complexity doesn't exist saves you no time, and causes errors. It's a net loss.
I still have more fans than freaks. WTF is wrong with you people?