Do Strongly Typed Languages Reduce Bugs? (acolyer.org)
"Static vs dynamic typing is always one of those topics that attracts passionately held positions," writes the Morning Paper -- reporting on an "encouraging" study that attempted to empirically evaluate the efficacy of statically-typed systems on mature, real-world code bases. The study was conducted by Christian Bird at Microsoft's "Research in Software Engineering" group with two researchers from University College London. Long-time Slashdot reader phantomfive writes:
This study looked at bugs found in open source Javascript code. Looking through the commit history, they enumerated the bugs that would have been caught if a more strongly typed language (like Typescript) had been used. They found that a strongly typed language would have reduced bugs by 15%.
Does this make you want to avoid Python?
Does this make you want to avoid Python?
I suspect that there is something like a "law of conservation of bugs" or something in software - you take away one vector for bugs to originate and you just move them into another place.
Dynamic languages do have an easy way to introduce bugs - especially languages like javascript that simply create new variables if you have a typo.
But there is the old adage in statically typed compiled languages "Hey, my code compiles! Now I get to find out where all my bugs really are."
This also applies to other aspects of programming languages. Consider the arguments about manual vs automatic memory management. Managed code still has bugs, just not memory management bugs.
"There are a dozen opinions on a matter until you know the truth. Then there is only one." - CS Lewis (paraprhase)
Since I have programmed in many different languages I have personally discovered that strongly statically typed languages do solve a lot of problems because the problems are encountered already at compile time, not during runtime. The problems are also less elusive.
There are of course still bugs around, but they are more often on the strategical level than on a pure sloppiness level caused by misspellings and mismatching methods where a method is changed but one caller isn't.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
Microsoft's big platforms are C,C++,C# this really doesn't do much for them.
Lots of Borland/Pascal programmers are saying we told you so though.
On the contrary - strong typing is preferred by those that have coded a long time and have to maintain systems that has been around for a long time.
As soon as you inherit code written by someone else you will waste a lot of time to understand how it works - and if it's not strongly typed you can easily miss something that previous coders did introduce. A strongly typed system will tell you quite fast that the code you changed the method header on was actually used by 200 subroutines. On a system written in a language not forcing strong typing you may discover that routines you didn't know existed are using it - and they are used only once per year at new years eve - guess who has to put in overtime then?
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
This observation doesn't make me wish to ditch a programming language, but it does make me glad I do test-driven development.
Warning: This signature may offend some viewers.
Perhaps so. On the other hand, Microsoft has written tons of software over the years and perhaps this study might be born out of decades of experience?
From my own ~20 years of experience writing software (never at Microsoft, mind you), I'd tend to agree with them that dynamic typing is a very good way to introduce subtle errors that would have been easily detected in a static typed system. God knows how many man hours of my life were burned hunting down such bugs created by people before me in my software engineering career.
On the other hand, static typing generally induces a slow compilation step that you have to wait through hundreds of times when developing a significant application. Dynamically typed languages are generally interpreted and forgo compilation at the expense of some runtime performance.
For whipping out some throwaway code to get something up in a hurry, nothing beats dynamically typed and interpreted. But when I want to make something seriously strong, high performance, and lasting the test of time, I'll reach for my static typed compiler every time, thank you very much.
As usual, use the best tool for the job at hand, whichever tool that may be.
The more you catch at compile time, the less there is to bite you on the ass at runtime. Cheaper in terms of development effort too to fix bugs before customer reports them.
Current python programmer, and former C programmer here. Dynamic typing is great for a small to medium sized code bases but I would hate to work on anything really big and mature without static type checking. I have worked on large code bases with maybe 100 megabytes of code and C and Ada, a lot of it dating back 20 years or so. You can't safely refactor code that old, and you can't allow data of the wrong type to get in to it, so type checking at compile time is an important line of defense.
You can do that in python with run time checks of data types and thats what you would have to do at the entry point of your libraries, but it eats into your performance doing that. Python is too malleable for static type checking.
http://michaelsmith.id.au
FUCK! I had hex-editor in my head and I fucked it up. Fuck me. I failed. And fuck you, too, for sinking to my level. We all fail together.
What is wrong with Typescript?
I used JS for some of my hobby projects, but then switched to TS.
I found that the compiler catches a LOT of bugs. Type-Os, incompatible type assignments, incomplete branching (not all cases covered). At the end I even switched on the NULL checking, and I was astonished how many possible problems were still hiding in my code with regards to possibly NULL objects...
During my use, I did not see any real drawback.
The dependency on the compiler (external tool) was of course a negative point, but I looked at the generated code and I found it good enough that at any point I could abandon using the compiler, and carry on in plain JS.
So what is you negative experience?
Easy enough to add strong typing in Python by adding a type check decorator to each function and method.
-- In the beginning was the WORD, and the WORD was UNSIGNED, and the main(){} was without form and void...
Yes, the language it was written in was broken.
Or are you suggesting that the people that actually write in these loose languages "because it saves me typing 5 characters" actually will write documentation saying that said function is called at new years eve? And will spend time on proper architecture? Perhaps in your fantasy world they also wrote unit tests with 90+% coverage?
No. You get the code dumped in your lap, and you will be praising the gods if there's an old completely outdated Confluence site that describes what some junior on the team thought that the software is supposed to do.
If that code is Javascript, then you might as well throw it away and rewrite it, it will be faster. If that code is Java, you can make modifications and refactorings to it and be reasonably sure that you didn't break anything totally unrelated.
I find it easier to refactor strongly typed languages as the compiler does most of the work for you before even running the program.
I also get the feeling that strongly typed languages have better support for IDE based refactoring as more can be safely inferred about their structure.
Two additional advantages of strongly typed languages: ability to use refactoring tools, and easier code maintenance - particularly in code reading to recover business logic.
I am currently working with a Python (sub)program, part of a large web system that I do not have access to. The data comes in in messages of an undocumented JSON format, so from beginning to end the type of a variable is whatever the type it is, and the name of the 'variable' is whatever the label string associated with the data was. It is very difficult to deduce what any part of the processing means.
Starships were meant to fly, Hands up and touch the sky - Nicky Minaj
The title of this thread incorrectly conflates "strongly typed" with "statically typed".
They are two completely different things.
As someone that has had to program in a number of languages I can say that strongly typed languages can catch a lot of trivial bugs quickly. One example is an if/then statement that allows non-Boolean arguments. If I mistype a comparison in an if/then statement then I should expect an error on compile. If I type an assignment "if (foo = bar)" instead of a comparison "if (foo == bar)" I expect this to get flagged, but some languages don't see this as a problem.
I prefer strongly typed languages as it can catch a lot of typographical errors and sloppy logic. It can also be frustrating at times since it can mean nesting type conversions to near absurdity. VHDL comes to mind in this. It can also be frustrating if trying to do something quickly and the compiler complains on what I would think is a pretty obvious implied type conversion.
It's interesting to see someone try to get an idea on how many errors strongly typed languages would catch. I'm not sure this makes an argument for one language over another. It might make an argument for testing, coding style, and such though.
I am armed because I am free. I am free because I am armed.
Personally, I think strong typing is vastly overrated and those that need it should not be coding professionally.
Oblig. car analogy: those who need road signs shouldn't be driving professionally.
...gis sdrawkcab (usually not responding to ACs; don't bother posting as AC)
Refactoring came out from the Smalltalk subculture so the claim to usability of refactoring tools or the lack thereof appears dubious. Lack of documentation in APIs and protocols is a perennial problem, though.
Ezekiel 23:20
The answer is obvious that for identical code strong typing will reduce bugs. And yet does typing force people to write in ways that lead creating bugs?
More importantly why isn't there some gray scale on typing that I could slowly turn on as my program design matures?
-Wall
-Wall -Werror
-Wall -Werror -Wextra
Also when writing in C++ mark sloppy convenience functions and constructors as deprecated so the scale above will complain and force you to fix them when the program matures.
This study is one of those "Well, duh!" type studies. Strongly typed languages are easier to refactor, maintain, and debug. It's also easier for someone else to understand as they can see exactly what types of objects are being used at any given point in the code.
Weakly typed languages are easier to do short, quick, dynamic programming. And arguably, that's what they were designed for. I'm not going to haul out a C++ toolchain just to write a few simple REST services when I can write a few short Python/Flask scripts in a fraction of the time.
As always, use the right tool for the right job. Well, unless all there are is shitty tools for the job in which case you're stuck using a shitty tool. I'm looking at you Javascript, you worthless piece of Turing complete trash.
~X~
That used to be true, but I don't think it is so much anymore. Modern languages like Swift, Scala, Kotlin, etc. do a good job of being concise while still keeping full type safety. And that ends up making them faster to write. I also do a lot of Python programming, and spend way too much time running my code over and over just to discover typos and similar mistakes that my editor would have instantly highlighted for me if I'd been working in a statically typed language.
"I'm too busy to research this and form an educated opinion, but I do have time to tell everyone my uninformed opinion."
I prefer strongly, statically typed programming than programming in e.g. Python. Like even Java, which is in many ways a horrible language, is more understandable for me. You need to understand the importance of the first two and last two words in that sentence though.
Why is it better for me to catch stuff at compile time? Because I prefer it that way.
Why is it preferable for Pythonistae to do whatever they do? Because they prefer it that way.
I have a very intelligent and productive friend who loves Python and has no problem with the way it handles types. I find Python annoying and avoid it, partially for that reason. But what would it even mean for him to be "wrong", or for me to be "wrong", or for either of us to be "right"? Nothing, that's what.
Stop bitching at each other and having stupid arguments like the "other side" is trying to steal your toys or come in your ice-cream.
Oblig: "LambdaConf 2015 - Dynamic vs Static Having a Discussion without Sounding Like a Lunatic" https://www.youtube.com/watch?v=Uxl_X3zXVAM
The question isn't really "Does a strongly typed language reduce bugs?", because the obvious answer is: Yes, it does. If you went to the logical extreme and created a language that only had 3 commands, you could eliminate whole classes of bugs. The more strict the rules, the harder it is to do the wrong thing.
But the question is really: Would developers spend more time fighting against the type system in a strongly typed language or against type related bugs in a dynamic one?
The answer to that question seems much murkier, and I don't think a study looking at the types of bugs checked in on GitHub can answer it.
The idea that you develop more rapidly in weakly typed languages (and, implicitly, the importance of maximizing development time) seems to stem from two erroneous assumptions.
The first is that we spend the majority of our time developing. This may be true for hobbyist programmers and possibly even consulting work for small projects. For any effort beyond that, which is the majority of paid programming work and popular open source products, the majority of time is spent fixing bugs.
The second is that strongly typed languages have to be slower to develop. In almost every domain at almost every level of complexity, I find Scala faster to develop in than Python. It gives me strongly typed error-checking as I write the language by using the presentation compiler (assuming modern IDE) to highlight issues early on. It also gives me type inference so I do not often have to specify the types I'm working in at many declaration sites. Finally, it gives me a terse transformation of collections using the combination of strong typing and type inference. I also spend less time thinking about how to express what I want to do.
For any large scale project, if the IDE can allow you to click through to a definition of a complex type or method definition, this saves oodles and oodles of time because we need to understand what we're calling and reading the code often is faster than reading the manual and deducing where it lied to you. If you're not calling code you didn't write, you're not doing something sufficiently complex as to really be interesting or you're writing something embedded in which case the focus on reducing bugs is even more important. Statically typed languages do a much better job of giving you this reference automatically.
Of course, this is all predicated on the types of development we're doing, because no language anywhere can beat the development speed of another language that has a popular, mature domain-specific library to solve a problem. If the language itself has operators or constructions which excel in a specific domain, it's also difficult to beat that language with any language that lacks that focus.
Reality is a slackware box running on a 386 tucked away in god's sock drawer.
The article talks about static typing, not strong typing; the two are different concepts. Strong typing means that type errors are always caught, static typing means that if type errors are caught, they are caught at compile time. JavaScript is both weakly typed and dynamically typed; weak typing is probably a bigger problem than dynamic typing. In any case, whatever conclusions you derive about type systems from experimenting on particular languages really only apply to that language. TypeScript is nice for JavaScript; that doesn't mean that adding static typing to Python would be as useful.
In addition, there is a price to pay for static typing: software becomes more complex, people tend to implement their own "dynamic type light" libraries, etc. So, even when static typing reduces bugs, it's not clear that it results in a better product at a lower cost, which is what you ultimately care about.
Anyone with a complete picture of their project is working on a toy. Leave college and you might find the world doesn't work the way you want.
If one uses Strongly Typed Languages, one gets fewer Strongly Typed Language errors; only.
Yes. And?
Catching type errors during compile is worlds better than finding at runtime, perhaps by an end user. Does it prevent logic errors, or memory leaks, or failure to sanitize inputs, or race conditions? No. It doesn't. So what?
If one uses a thermometer to check the temperature of cooked meat, one only reduces one type of infection vector. Are you suggesting the use of thermometers to check cooked meat isn't a good idea, because it won't prevent malaria or syphilis? Because that is how ridiculous your argument is.
I normally try to avoid personal attacks, but you, sir, are an idiot bitch.
The number of times I've fixed a god damn typo in a variable name made by someone who refused to stop coding in vim like it's the god damn dark ages is well beyond what I can count on my fingers, even in binary.
And if the language had been strongly typed, the code never would have been able to build on the first place. But it did. And it silently failed for years.
Terrible programmers always think they're the best and don't need help. Grow the fuck up and realize you need all the help you can get, same as everyone else.
Thats still weakly typed, just a different kind of weak. Strongly types is when you fail; "a" . 1 would throw an exception and say "i dont know how I can possibly proceed" if it was strongly typed.
Strong typing vs weak is a minor detail anyway; the real meat is dynamic vs static.
No they take slightly less time to get kind of working code- and significantly more time fixing the bugs in it and significantly more time in maintenance to understand what those bugs are and what data an algorithm is working on.
I still have more fans than freaks. WTF is wrong with you people?
This is where you're wrong. The complexity exists, regardless of the language. Type exists, whether your language is loosely or strongly typed. In one you just ignore it, which causes a class of errors. In the other you get free error checking. Pretending that complexity doesn't exist saves you no time, and causes errors. It's a net loss.
I still have more fans than freaks. WTF is wrong with you people?
One of my assignments at Apple was cleaning up a framework that had thirty or more coders over the previous ten years. The first time I ran it through the static analyzer, there were over six hundred issues identified. I tracked down and fixed all of them over the next couple of months in the process of cleaning up the code to build under LLVM and pass the App Store rules.
Most of the easier cases were things like parameters of the wrong type being passed to methods, which caused no problem at runtime and hadn't been spotted before for that reason. The more difficult ones were mistakes in memory handling, and in a lot of cases, I found myself saying "holy shit, GCC lets you do THAT?"
Comparing Obj-C to Swift, I would say that easily 90% or more of those bugs can't be written in Swift at all unless you resort to the UnsafeMutablePointer types, which put it right in your face that you're doing something dicey.
I was a huge fan of loose coupling for all the years I was an Obj-c developer, but today I spend almost no time in LLDB, since almost all of the mistakes I make writing Swift are caught at compile time.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."