Guido van Rossum On Strong vs. Weak Typing
Bill Venners writes "In this interview, Java creator James Gosling says, 'There's a folk theorem out there that systems with very loose typing are very easy to build prototypes with. That may be true. But the leap from a prototype built that way to a real industrial strength system is pretty vast.' In this interview, Python creator Guido van Rossum responds with 'That attitude sounds like the classic thing I've always heard from strong-typing proponents. The one thing that troubles me is that all the focus is on the strong typing, as if once your program is type correct, it has no bugs left. Strong typing catches many bugs, but it also makes you focus too much on getting the types right and not enough on getting the rest of the program correct.'"
Check out the recent discussions on lambda the ultimate (a programming languages weblog).
"Strong or Weak Typing doesn't make good programs. Good programmers make good programs."
Daniel
Carpe Diem
It's a great quote, sadly I'm not enough of a programming expert to come up with an equally good comment. I did once write a parser for windows media ASF stream in java, that was an experience I never want to repeat....
I think that this argument is exactly backwards. Strong typing allows mechanical type checking via compilers and incremental type checkers during the development process. Strong typing takes a lot of the load off a human developer, and puts it on mechanical systems that can be made much more reliable than humans. The important thing is to take maximum advantage of strong typing - use an IDE that does incremental compilation, etc.
On the other hand, since it is not possible to mechanically check weakly typed code, weak typing places much more load on the programmer to make sure the types are correct than a strongly typed system does. The result is many more bugs, and bugs caught much later in the development process, when they are more expensive to correct.
well, that would explain all the repeated stories...
I never considered that anyone would find strong typing so hard. Beginning programmers hate to even declare variables, but we teach them to do it because it makes life easier in the long run.
So what other imperfect things should we do away with? Roads with median strips? Safety locks on guns? Metal detectors?
I would argue that strict typing reduces one class of bugs so that I can concentrate on other less tractable classes. Who gave Guido the idea that strong-typing programmers are satisfied with clean compiles? I will be satisfied with clean compiles when the compiler can detect *all* bugs. Until then we (some of us, at least) need to work on improving languages and language tools toward that goal.
Whenever I type strongly my wife complains about the noise and asks that I type more quietly.
Those who say "why would I ever choose not to have certain errors detected at compile time" are missing the big picture. The errors that can be caught by static typing are only the beginning of the errors that can occur in a software system. And they are the ones that will be caught easily in testing. If the uncertainties of runtime typing encourage you to write more tests, so much the better! And have you noticed how much easier it is to whip up tests, and to add the extra code to deal with corner cases, in such languages? In my experience, strongly-typed code tends to be more susceptible to unexpected inputs, just because it's such a pain to handle and test them.
Further, runtime-typed languages are usually better at slaying the real dragons of software developments: the complex errors that are beyond the scope of typing. Guido said a wise thing:
As a maintainer, I would much rather have code that I can see is logically correct with my eyes, than code that a compiler can tell me type checks. High-level languages are much better at expressing complex ideas in clear code; in C and Java, the idea gets lost in the mechanical details.
My experience has convinced me that a strong team can produce more robust code in a runtime-typed language than a similar team using a traditional strongly-typed language, given the same amount of time. The first team also has more leeway in trading robustness for speed of completion, when that counts.
The evaluation of an action as 'practical' . . . depends on what it is that one wishes to practice.
I did once write a parser for windows media ASF stream in java, that was an experience I never want to repeat....
Don't know ASF-streams, but I am still curious what you disliked about Java and you think would have been better in other languages?
Is Python really all the rage among (slashdot) geeks? I've tried it recently, being a bit frustrated by Java's longwindedness, but I dismissed Python, too. Sadly, I forgot the reasons, I think I didn't like the documentation much and some things I wanted didn't seem doable. Perhaps I should try it again.
I had to do a Perl project lately, though, and I know I spent hours chasing bugs that wouldn't have been possible in Java. The annoyance of that was far worse than the little bit of extra typing I have to do in Java. Had more problems with default initialisations than with types, though.
Guido: "Strong typing catches many bugs, but it also makes you focus too much on getting the types right and not enough on getting the rest of the program correct."
Really? Really??? This blanket statement certainly doesn't describe anybody I've worked with. I wonder what information he bases it on...
In a production environment, I've found that writing strongly typed programs always saves time in the long run. It doesn't take much more time and, if you occasionally make a silly mistake (like using == instead of eq in Perl), it can save you hours of aggravation and headache.
For quick one-offs, of course, loosely-typed is always the way to go.
Language creator defends his language over other languages!!!! Whoa! Total mind blower
You don't run unit tests at run time! You run them before you ship.
No practical unit test provides 100 percent coverage of all special cases that can conceivably occur in a sufficiently complex system.
Will I retire or break 10K?
I'll never give up my Model M Keyboard. Feel is everything.
Will I retire or break 10K?
Dynamic typing might well be superior in practice, but it could never work in theory!
Stupidity is mis-underestimated.
I notice types most when I am calling some sort of function, and when I am looking at the body of a function which has parameters.
In a language where parameters have specified types, I can look at the signature for a function I may want to call and see at a glance what it is expecting me to pass. When I am looking in the body of an unfamiliar function at variables which were defined as its parameters, I can see at a glance what type of things those parameters represent.
I find strongly typed languages make it much easier to provide this information I find helpful. I find these things sorely missing in real world use of languages like JavaScript, Python and Perl. That information and more can be provided through some sort of out-of-band documentation mechanism, but I personally like having it right there as part of the language.
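To make the point about signatures concrete, here is a minimal Python sketch using optional type annotations. The function `scale_prices` and its parameters are hypothetical, invented only for illustration. Note that Python does not enforce these annotations at run time; they only put the information in the signature, where a reader or a tool can see it.

```python
import inspect


def scale_prices(prices: list[float], factor: float) -> list[float]:
    """Return a new list with every price multiplied by factor."""
    return [price * factor for price in prices]


# The signature alone tells a caller what to pass and what comes back.
signature = inspect.signature(scale_prices)
print(signature)
```

Without the annotations the same function would still run, but a caller would have to read the body or separate documentation to learn what `prices` and `factor` are meant to be.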
Larry
I'm not sure I'd agree with your analogies.
Weak typing can be incredibly useful for those cases where you'd really like to write some routines or data structures that can ignore type.
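A small Python sketch of such a routine, assuming only that the elements can be compared with each other. The name `largest` is hypothetical, and in Python's case the flexibility comes from types being checked at run time rather than from an absence of type checks.

```python
def largest(items):
    """Return the largest element of any non-empty iterable of comparable values."""
    iterator = iter(items)
    best = next(iterator)
    for item in iterator:
        if item > best:
            best = item
    return best


# The same routine works for numbers and for strings.
print(largest([3, 1, 2]))
print(largest(["pear", "apple"]))
```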
I find myself getting around typing issues in languages like C by using pointers, which I think is a much worse kettle of fish than weak typing, especially when you throw programmers who handle warnings about undefined pointer types by casting the pointer to J. Random Type into the mix.
If you look at it one way, strong and weak typing are different tools for different jobs, and you should use whichever one is appropriate for the task at hand.
If you look at it another way, who cares? The problems raised in the strong vs. weak typing argument are better solved with taking the time to design the damn program correctly in the first place, before you start cutting code, anyway.
Here are some totally unscientific definitions, use at your own risk:
Static typing: Both variables and objects have types. Type checking happens both at compile time and run time.
Dynamic typing: Variables don't have types, but objects do. Type checking happens at run time.
Strong typing: Strict and effective type checks; a string is a string and not a number. Often confused with static typing.
Weak typing: Absent or ineffective type checks. E.g.: everything is a string, or everything is a pointer. Thus, a string could be used as a number or the other way round. Often confused with dynamic typing.
Python, for example, has strong but dynamic typing.
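A short Python sketch of both halves of that claim. The variable names are made up for illustration.

```python
# Dynamic: the name `value` has no fixed type; each object it refers to does.
value = "42"
first_type = type(value).__name__
value = 42
second_type = type(value).__name__

# Strong: Python refuses to treat a string as a number implicitly.
try:
    result = "42" + 1
except TypeError as error:
    result = f"TypeError: {error}"

# An explicit conversion is required instead.
converted = int("42") + 1

print(first_type, second_type, result, converted)
```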
BTW, if you haven't seriously tried a dynamically typed language yet, maybe you should - they are simply much more fun, IMHO.
Stupidity is mis-underestimated.
For strong typing I recommend an old IBM-PS2 clicky keyboard. Those who are more inclined toward weak typing can stick with the Microsoft Natural keyboard.
As somebody who has seen many forum-fights over this issue, I think it is probably subjective. There are complex tradeoffs, and the final score is probably greatly influenced by personal factors. What trips up person A may not trip up person B nearly as much.
I personally like "dynamic" typing. It is easier to read, and it is easier to spot errors that may be missed if there is a lot of formal typing clutter IMO.
However, I think it might be possible to get closer to the best of both worlds by having a "lint"-like utility that points out suspicious usage. For example, it might flag this usage:
x = "foo";
doSomething(x);
y = x + 3;  // suspicious line
If you really want it to do that, then you could tell the utility to ignore that particular line to not distract future inspections. (Assume the above language uses a different operator for string concatenation.)
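A rough sketch of what such a utility could look like, written against Python source using the standard `ast` module. Everything here is an assumption for illustration: the function name `find_suspicious_additions`, the `ignored_lines` parameter, and the very narrow rule (a name whose latest top-level assignment was a string literal, added to a numeric literal). A real checker would need far more than this.

```python
import ast


def find_suspicious_additions(source, ignored_lines=()):
    """Return line numbers where a string-valued name is added to a numeric literal.

    Toy rule: only straight-line, top-level `name = literal` assignments are tracked.
    """
    string_names = set()  # names whose most recent assignment was a string literal
    flagged = []
    for statement in ast.parse(source).body:
        # Look for `name + number` or `number + name` anywhere in this statement.
        for node in ast.walk(statement):
            if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
                operands = (node.left, node.right)
                has_string_name = any(
                    isinstance(o, ast.Name) and o.id in string_names for o in operands
                )
                has_number = any(
                    isinstance(o, ast.Constant)
                    and isinstance(o.value, (int, float))
                    and not isinstance(o.value, bool)
                    for o in operands
                )
                if has_string_name and has_number and node.lineno not in ignored_lines:
                    flagged.append(node.lineno)
        # Update what is known about simple `name = literal` assignments.
        if isinstance(statement, ast.Assign) and len(statement.targets) == 1:
            target = statement.targets[0]
            if isinstance(target, ast.Name):
                value = statement.value
                if isinstance(value, ast.Constant) and isinstance(value.value, str):
                    string_names.add(target.id)
                else:
                    string_names.discard(target.id)
    return flagged


# The example from the comment, transcribed into Python syntax.
example = 'x = "foo"\ndo_something(x)\ny = x + 3\n'
print(find_suspicious_additions(example))
print(find_suspicious_additions(example, ignored_lines={3}))
```

The source is only parsed, never executed, so the undefined `do_something` is harmless. Passing a line number in `ignored_lines` plays the role of telling the utility to stop reporting a line you have already reviewed.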
Table-ized A.I.
> "Strong or Weak Typing doesn't make good programs. Good programmers make good programs."
That's probably true, but my wild guess is that the majority of so-called programmers are not good programmers. Knowing that, what tool (programming language) should be provided to programmers to write better (not necessarily good) programs? I think that's the question to be asked.
If your unit test returns FAIL, how are you supposed to know whether it's a bug in the code being tested or in the unit test itself? Writing a unit test for the unit test results in infinite descent.
Compile-time type checking has the potential to save you from some bugs in your unit tests without infinite descent because the compiler's type checker has probably had more eyeballs than your unit tests.
Will I retire or break 10K?
Strong typing catches many bugs, but it also makes you focus too much on getting the types right and not enough on getting the rest of the program correct.
(Does anyone else find it a little scary that Guido confuses "strong" and "static" typing?)
There's not much substance in this article to actually refute, but I would like to share my experience on this. I have had a lot of experience with static and dynamic, strong and weakly-typed languages, though not much with Python.
I'm a fan of statically-typed functional languages, especially SML and O'Caml. I agree that static typing catches many bugs; ones that would not be caught at compile-time in a dynamic language. However, in my experience, spending time getting the types right is not a distraction but actually a guide in the design of the program. Static typing encourages . Even if I considered all of that time (which amounts to very little once you become good at the languages) a burden, I think static typing would still be worth it. The reason is that compile-time errors are much, much easier to track down and fix than ones that occur only dynamically (or only once you've shipped your program!).
By the way, "strong" typing does not mean writing down a lot of types. (ML and Haskell have type-inference systems where you end up writing less than you would in C or Java, and maybe even less than in Python!) By the time you become an expert in a language like ML, you are hardly encountering type errors (except when you make a typo or actual mistake), and hardly writing down anything having to do with types -- the best of both worlds!
It's interesting that very few comments/analysis/quotes highlight the fact that the language molds your thinking, making one type of error more or less likely.
If you find this strange, I would submit that speaking human languages influences the way bilingual people think, so software programming would draw upon the same phenomenon.
It's the whole "I got a hammer and a nail and a saw and a drill thing..." You can't talk about bas-relief... 'cuz you don't have a chisel. Or if you do, it's in your other toolbox...
Strongly-typed languages may be more effective in preventing type errors by making the programmer think about the type of variables... than by the compiler catching any... And there is also the question of code style to consider... a weakly & dynamically typed language may have just as few bugs if it's coded with strong code guidelines... Of course, then we may just be getting back to the "good programmer makes good code" syndrome
Best of both worlds with pyLint? It claims to do type-checking based on the source, but it seems to be in an early state of development.
Also (though this doesn't check types), pyChecker can track down a few common errors as well, and seems to be more mature than pyLint.
You say
It's been written in the article that qmail and mailman have been written in Python. While I agree for mailman, I just downloaded the qmail source (very small!): no .py file.
No language does this that I know of, but here's how I think type systems should work.
It's pretty much a stronger form of SML-style type-inference.
Variables don't have types, values do, as in dynamic typing. But, variables can have type constraints.
You can optionally specify type constraints for variables and parameters.
From there the compiler/interpreter can go wild with type-inference trying to determine as much as it can about the structure of your program. As the type constraints are optional you can end up with inferred types for variables that are complex unions and disjunctions of types.
Code that looks like:
(defun foo (x) 'hello)
could be assigned the type foo : unknown -> symbol.
The code can then be weakly checked for inconsistencies between inferred types.
You can write a fully dynamic program like this by specifying no types. You can write a fully static program by specifying all types. Or, you can use a loose mix of the two that lets you do weak compile-time type checking where you want it, and avoid the hassle where you don't.
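A loose approximation of the "optional constraints" half of this idea can be sketched in Python. Be aware of what is swapped in: this checks constraints at call time with a decorator, not through the compile-time inference described above, and it does no inference at all. The names `check_constraints` and `repeat` are hypothetical.

```python
import functools
import inspect


def check_constraints(function):
    """Check call arguments against whatever annotations the function declares.

    Parameters without an annotation are left unconstrained.
    """
    signature = inspect.signature(function)

    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            constraint = signature.parameters[name].annotation
            if constraint is inspect.Parameter.empty:
                continue  # no constraint was specified for this parameter
            # Only plain classes are treated as constraints in this sketch.
            if isinstance(constraint, type) and not isinstance(value, constraint):
                raise TypeError(
                    f"{name} must be {constraint.__name__}, got {type(value).__name__}"
                )
        return function(*args, **kwargs)

    return wrapper


@check_constraints
def repeat(text: str, times: int, separator=""):
    # `separator` carries no constraint, so any value is accepted for it.
    return separator.join([text] * times)


print(repeat("ab", 2, separator="-"))
```

Leaving every parameter unannotated gives the fully dynamic end of the spectrum; annotating all of them gives the fully constrained end, although here the check still happens only when the function is called.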
Anyway, I'm just rambling here. I'll likely try to start playing with these ideas in Lisp sometime soon, as I've been meaning to for several years now.
Justin Dubs
It's merely dynamically typed. E.g. you can't add strings and numbers; Python will not automatically convert one to the other for you.
So I guess this thread can be deleted.
Strong typing: Original IBM PC keyboard, requires effort, but very satisfying for coding, data entry, letter writing, or any other purpose requiring text. Your hands will get tired.
Dynamic typing: The old Apple adjustable keyboard or the IBM Butterfly laptop. Breaks easily, but may fit your hands better.
Weak typing: The Atari 400 membrane keyboard. Often too wimpy to handle adult hands.
Static typing: Keyboard has loose wiring and gives you an electric shock. Ouch!
sulli
RTFJ.
I was about to post something along those lines when I noted your post. I wonder if, perhaps, there is some miscommunication and assumption going on here? What really is the difference between strong and weak typing?
"Times have not become more violent. They have just become more televised."
-Marilyn Manson
Everyone knows documentation is crucial for code readability and maintainability. Strong typing, aside from letting the compiler catch countless developer goof-ups, is an invaluable tool for good documentation. Documentation is not just overarching system descriptions or code comments. In my opinion, another _crucial_ element of documentation is the code itself. But code can only be effective as documentation when it is thoughtfully designed. I find that in a well designed system, the method signature says a vast majority of what you need to know about it. The name of the method, when considered with the name of the class it is in, should give you a very good idea of what this method does. The return type, parameters, and declared exceptions should further clarify what this method does and how it does it. Of course you'll need code comments to clarify the finer points, but if you need explicit documentation to describe _what the method does_, you should be more thoughtful about the design. That being said, strong types are crucial. They allow me to specify _exactly_ what I want you to pass to my method, and _exactly_ what I will return. This "forced contract" is what allows huge systems to evolve. In a weakly typed system, you can document this same behavior until your face turns blue, but the compiler won't warn you when your documentation is out of date.
I program in Python quite a bit and love it. The economy of the syntax is wonderful so I seem to be able to do much more with much less typing. Unfortunately when the program starts to get big and I have to refactor something such as adding/removing a method, adding/removing arguments to a method, or changing the interpretation (type) of some of the arguments, I get very paranoid and wonder if I really did catch and fix all the calls of the method. Yes, a recursive grep will find these, but did I skip one? Did I misspell one of the variables? I can't tell you how many times I've gone blind looking at Python code for a bug that is caused by a misspelled variable.
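A minimal illustration of the misspelled-variable problem, using a made-up function. Python accepts this code without complaint, because assigning to a misspelled name simply creates a new variable.

```python
def total_price(prices):
    """Intended to sum the prices, but contains a misspelled variable."""
    total = 0
    for price in prices:
        # Typo: `totl` is a brand-new name, so `total` is never updated.
        totl = total + price
    return total


print(total_price([1, 2, 3]))  # prints 0, not 6
```

In a language that requires variables to be declared, the assignment to `totl` would be rejected at compile time; here nothing fails, and the function quietly returns the wrong answer.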
As a result I usually limit my Python programs to something like 1K lines (I have one that is 2K, oh well). If the program is larger than that I use Java (for static type checking). If the program is less than 2-3 lines I use Perl. If I'm writing something performance heavy (server daemon) I may write it in C/C++.
Nobody has noted this yet, so I guess it's up to me.
Most large systems (and I mean large) provide some kind of scripting capability, and that scripting system is often dynamically typed. Even if the base system is written in a statically typed language, it will, in general, provide a dynamically typed interface at some level, and some of the system may even be written in that dynamically typed scripting system.
Using this approach, you get the benefits of both. The underlying system, where correctness and performance are important, get written in a statically typed language. The upper levels, where performance is not so critical (because you're effectively just gluing together base components) and flexibility is important, get written in a dynamically typed language.
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
Static checking that finds such problems is of modest use during development, substantial use during integration, and extremely valuable during modification. The bigger the program, the more valuable it is.
The type system in C is not particularly expressive. In C++ you would be able to declare a linked list of pointers to T, and have this checked, and avoid casting things to (void *) and back again.
In a language with type classes you could write a function to work on any kind of object and again specialize it to the particular types you are using, and have this checked at compile time. In most cases without having to write any explicit type declarations yourself.
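As a rough stand-in for the typed linked list mentioned above, here is a sketch in Python using `typing.Generic`. This is not C++ templates or type classes: Python does not check these annotations when the code runs, and the checking would have to come from an external static checker such as mypy. The names `Node` and `values` are hypothetical.

```python
from dataclasses import dataclass
from typing import Generic, Iterator, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Node(Generic[T]):
    """One cell of a singly linked list whose elements all have type T."""
    value: T
    next: Optional["Node[T]"] = None


def values(head: Optional[Node[T]]) -> Iterator[T]:
    """Yield each value in the list starting at head."""
    while head is not None:
        yield head.value
        head = head.next


# Declared as a list of strings; no casts are needed to read the values back.
names: Node[str] = Node("a", Node("b"))
print(list(values(names)))
```

The element type is stated once on `names`, and `values` is written once for every `T`, which is the property the comment contrasts with casting to `(void *)` and back in C.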
-- Ed Avis ed@membled.com
I think an advantage of strong typing over weak typing is that it yields more efficient programs. When the type of a variable is known at compile time, the correct functions to use on it can be hard-wired into the binary, saving time (and probably memory) when the program is run. On the other hand, weak typing is more flexible and often saves time on the developer's part.
Please correct me if I got my facts wrong.