Donald Knuth Rips On Unit Tests and More
eldavojohn writes "You may be familiar with Donald Knuth from his famous Art of Computer Programming books but he's also the father of TeX and, arguably, one of the founders of open source. There's an interesting interview where he says a lot of stuff I wouldn't have predicted. One of the first surprises to me was that he didn't seem to be a huge proponent of unit tests. I use JUnit to test parts of my projects maybe 200 times a day but Knuth calls that kind of practice a 'waste of time' and claims 'nothing needs to be "mocked up."' He also states that methods to write software to take advantage of parallel programming hardware (like multi-core systems that we've discussed) are too difficult for him to tackle due to ever-changing hardware. He even goes so far as to vent about his unhappiness toward chipmakers for forcing us into the multicore realm. He pitches his idea of 'literate programming' which I must admit I've never heard of but find it intriguing. At the end, he even remarks on his adage that young people shouldn't do things just because they're trendy. Whether you love him or hate him, he sure has some interesting/flame-bait things to say."
Now, I've no problem with literate programming, but given that even semi-literate practices like "write good comments" haven't caught on in many places, I think Don is flogging a dead horse by suggesting that code should be entirely documentation driven.
I have a lot of respect for Knuth as an algorithms guy, but anything he says about programming needs to be taken with a grain of salt. When he created the TeX language, he lost all credibility - designing a language in 1978 which makes structured programming almost impossible is just insane. TeX gets a lot of praise as being 'bug free,' but that's really only half true. The core of TeX is a virtual machine and a set of typesetting algorithms, both of which are very simple pieces of code (they'd have to be to run on a PDP-10). Most of the bits people actually use are then metaprogrammed on top of the virtual machine, and frequently contain bugs which are a colossal pain to track down because of the inherent flaws in the language (no scoping, for example).
If you want to learn about algorithms, listen to Donald Knuth and you will learn a great deal. If you want to learn about programming, listen to Edsger Dijkstra or Alan Kay.
Using "MULTIPLYBY" instead of "*" is asinine, because both are equally descriptive. Putting a comment above the line telling people why you're doing it isn't.
> But the story during my grad school days was that Knuth offered a $1000 prize to anyone finding a bug in TeX, and he doubled it a couple of times.
The $1000 bounty was from Dan Bernstein with respect to qmail. He's always found a reason to weasel out of ever paying.
Knuth started the bounty at $2.56 (one "hexadecimal dollar") and doubled it every year until it reached $327.68. Several people have claimed it, but most never cashed the checks. One of the first bug finders had his check framed.
The original prize was $2.56 (i.e. 2^8¢), and he doubled it every time someone found a bug until it reached $327.68. Over 400 bugs have been fixed in TeX, and that's just counting the core VM and typesetting algorithms - all of the rest is in metaprogrammed packages, many of which contain numerous bugs. I'm fairly sure that most programmers could write bug-free code if the only place where bugs counted was in a simple VM with a few dozen instructions that interpreted all of the rest of the code...
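For what it's worth, the two figures quoted in this thread are consistent with each other: start at 256 cents, double seven times, and you land on 32768 cents. A quick sketch of that arithmetic (the function name and the cap parameter are made up for illustration, and it says nothing about whether the doubling happened per year or per bug):

```python
def reward_schedule(start_cents=256, cap_cents=32768):
    """Yield reward amounts in cents, doubling until the cap is reached."""
    amount = start_cents
    while amount <= cap_cents:
        yield amount
        amount *= 2

schedule = list(reward_schedule())
# Format each amount as dollars and cents without touching floating point.
print([f"${c // 100}.{c % 100:02d}" for c in schedule])
```

Working in integer cents keeps the doubling exact; 2.56 is not exactly representable as a binary float.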
> ... looks like it falls into the same trap as COBOL. The idea that by making programming languages incredibly verbose, they will somehow become easier to use is a fallacy. Using "MULTIPLYBY" instead of "*" isn't going to make your code easier to read.
From what I've seen (particularly of CWEB), literate programming doesn't change the programming language itself, it just adds a TeX style markup to the comments so that detailed (and nicely typeset) documentation can be generated from the source code. Take a look at some of Knuth's CWEB code, such as his implementation of Adventure:
http://sunburn.stanford.edu/~knuth/programs/advent.w.gz
It appears to be ordinary C once the CWEB documentation is stripped out.
http://en.wikipedia.org/wiki/Knuth_reward_check
Many people save these (usually small) checks as souvenirs. My father -- a frugal mathematician -- received a few $2.56 checks from Knuth, and he promptly cashed each one.
I'm not sure, but I think he's talking personally about his own work on his code. Remember that he comes from an era where people had the goal of mathematically proving that the code is indeed correct. He isn't necessarily doing this now, but my personal guess is that he prefers statically checking the code to checking a running program. In certain kinds of mathematical/scientific applications this could make sense.
Literate programming might be more popular if it had support for interactive debugging, with the standard features common in contemporary interactive development environments.
A better way to handle that is to turn the loop body into a function or group of functions. That makes the code easier to read, and a good compiler will inline the function so there's no performance loss.
Knuth's view of programming seems to be that of clever tricks and fast algorithms. That may have been true when he got into the field, but it isn't anymore. Today, it's about creating big systems that need to be maintained by many programmers, not all of whom are as smart as Knuth. Algorithms come in libraries and are implemented by specialists. So, maybe Knuth doesn't need unit tests for his projects, but real-world projects do.
The thing about unit testing is that it's subject to the law of diminishing returns. A simple test of the basic functionality gets you a lot for minimal effort. Writing dozens of carefully chosen tests to examine boundary conditions etc. gives you a little bit more, but for a great deal more effort. Whether or not it's worth it depends very much on the situation and the nature of the code you're writing.
There's a relevant quirk of Stanford University employee policy. For Stanford academic employees, software usually is considered "work for hire" and an "institutional work", with Stanford holding the copyright. But books and papers are considered to be the property of the author. (Policy on this changed in the late 1990s; there's a long history here, primarily involving the founding of Sun and Cisco.) However, Stanford permits the creator to place a work in the public domain, unless external funding prohibits this.
Knuth's code is open source. But his books are not.
From the interview I see that Knuth thinks multithreading may turn out to be a flop. I agree and I would go even further. I don't think there is any question that the multithreading strategy used by Intel and AMD for multicore programming is a big flop already. Multithreading is the second worst thing to have happened to computing. The worst is single-threading, which is what normal algorithmic programming is based on. Parallelism is the correct approach to computing. Computing should have been parallel from the start, even on single-core processors. If parallelism had been emulated in a processor from the beginning, adding more cores would have been a simple evolutionary transition, a mere engineering problem. My point is that there is a better way to do parallel programming that does not involve threads at all. Cellular automata and neural network programmers have been emulating parallelism for decades without threads.
Essentially, you need to have two buffers and a loop. While the first buffer is being processed, the second buffer is filled with objects to be processed in the next cycle. When the first buffer is done, swap the buffers and start over. Two buffers are used in order to prevent signal racing conditions. It's not rocket science. We just need to take it to the instruction level by changing it from a software mechanism into a hardware mechanism. In other words, the mechanism should reside in the processor, whether single or multicore. This is the correct approach to parallelism. Multithreading is going to be a complete disaster, a multi-billion dollar disaster. Google "Nightmare on Core Street" to find out why multithreading is not part of the future of parallel computing.
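To make the two-buffer loop concrete, here is a rough software sketch in the cellular-automaton flavour. The rule (each cell becomes the XOR of its two neighbours, i.e. Rule 90), the wrap-around edges, and the names `step` and `run` are all assumptions picked for illustration; it emulates the idea in a plain loop and is not the hardware mechanism proposed above.

```python
def step(current, nxt):
    """Compute one generation: read only from `current`, write only to `nxt`."""
    n = len(current)
    for i in range(n):
        left = current[(i - 1) % n]    # wrap around at the edges
        right = current[(i + 1) % n]
        nxt[i] = left ^ right          # Rule 90: XOR of the two neighbours

def run(cells, generations):
    """Advance the automaton, swapping the two buffers after every generation."""
    current = list(cells)              # copy so the caller's list is untouched
    nxt = [0] * len(current)
    for _ in range(generations):
        step(current, nxt)
        current, nxt = nxt, current    # swap buffers and start over
    return current

print(run([0, 0, 0, 1, 0, 0, 0], 2))   # [0, 1, 0, 0, 0, 1, 0]
```

Because every cell reads from the old buffer and writes to the new one, the order in which cells are visited does not matter, which is what makes each generation look "parallel" even in a sequential loop.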
Seriously, if you're "religious" about unit testing and mock objects, then you really need to revise the way you live your life.
It's just a good habit to get into, if you take it seriously and don't just create tests that test silly little things like "is my text box centered where I slapped it on the form with gui form tool" type of stuff. That's kind of the point he's trying to make, that you program intelligently in the first place to avoid having an insane amount of redundant tests to pass each time you build.
I've been doing literate programming (well, as close as you can with C and its derivative languages) for a long time now. I've watched XP coders take that literacy and chop it all up because "it didn't look pretty enough". The idea with making something literate is to make it so clear that you can reduce the total number of tests needed to make that code pass to only ones that test the actual expected outputs of that function. That's something that intelligent coders who don't just follow the Agile rulebook, but apply it effectively, can do. I don't know how many times I've seen a piece of code that did one simple task and had one test to check the output of that task, and then another coder dropped in 3 more tests because they "didn't feel comfortable with only one" without specifying WHY. That is how you get into having redundant tests that muck up your test infrastructure.
The purpose of a unit test testing for failure cases is not to detect places where the code has issues, but to ensure that the code performs as expected given the initial requirements/specifications. When people talk about "failure cases" they mean that they expect the code to return some sort of exception, not that they expect the code to fail in how it operates.
For example consider that a block of code is required to perform a "division" operation. A unit test case to test for a failure case would be one that provides a denominator of zero. In this case we can expect that the code will throw a divide-by-zero exception or return a special error code rather than halt the entire program.
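A minimal sketch of such a failure-case test, using Python's standard `unittest` module rather than JUnit. The `divide` function here is a hypothetical unit under test, and raising `ZeroDivisionError` is just one of the two options mentioned above (the other being a special error code):

```python
import unittest

def divide(numerator, denominator):
    """Hypothetical unit under test: reject a zero denominator explicitly."""
    if denominator == 0:
        raise ZeroDivisionError("denominator must not be zero")
    return numerator / denominator

class DivideTests(unittest.TestCase):
    def test_normal_case(self):
        self.assertEqual(divide(6, 3), 2)

    def test_zero_denominator_is_reported(self):
        # The "failure case": we expect an exception, not a crashed program.
        with self.assertRaises(ZeroDivisionError):
            divide(1, 0)

# Run the suite explicitly so the script keeps going after the tests finish.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DivideTests)
result = unittest.TextTestRunner().run(suite)
```

The second test passes precisely because the code "fails" in the documented way.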
> And after a "unit" of code is written correctly, it shouldn't be changed a lot. Any big changes means a change in semantics, which means the original unit tests are useless anyway.
Not necessarily. An easy example is a developer writes a division function and test cases are written to ensure it operates as expected. A month later we determine that we need to improve performance of the software that utilizes the division function, and the current division function performs poorly. So given the unit test cases, we can re-implement an optimized version of the function, and testing for accuracy of the function is almost free.
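Here is a sketch of what "almost free" looks like in practice. Both implementations and the case table are invented for illustration, and the contract is deliberately narrow (non-negative numerator, integer quotient); the point is only that one set of checks vets both the old code and its replacement.

```python
def divide_slow(n, d):
    """Original version: integer division by repeated subtraction (n >= 0)."""
    if d == 0:
        raise ZeroDivisionError("denominator must not be zero")
    quotient = 0
    while n >= d:
        n -= d
        quotient += 1
    return quotient

def divide_fast(n, d):
    """Replacement version: same contract, delegates to the built-in operator."""
    if d == 0:
        raise ZeroDivisionError("denominator must not be zero")
    return n // d

# (numerator, denominator, expected quotient), written once for the original.
CASES = [(0, 1, 0), (7, 2, 3), (9, 3, 3), (1, 5, 0)]

def check(impl):
    """Run the shared cases against whichever implementation is passed in."""
    for n, d, expected in CASES:
        assert impl(n, d) == expected, (impl.__name__, n, d)
    try:
        impl(1, 0)
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError(f"{impl.__name__} accepted a zero denominator")
    return True

for impl in (divide_slow, divide_fast):
    check(impl)
```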
Another example is a developer starts working on a big software project. 6 months later he conveniently leaves the company without notice and his portion of the project is only 50% complete. Would you rather have 50% of code that you are not sure works or 25% of code that you are not sure works, but 100% of the unit test cases required for the existing code to tell you whether or not it works?
An even more useful example is dealing with bugs. You receive a bug report on some software that performs some function. You eventually find the cause of the bug in a particular function, but realize that it requires a specific state of the system. You fix the bug in the software but it results in "ugly code" that at first glance is unexplainable. How should you ensure that this code is not modified accidentally? A unit test case of course! This is a good example for very special bugs that are the result of hardware issues or other software systems that are out of your control.
There are valid arguments against unit test cases (and automated testing in general). One argument is that in order to do automated testing, you are required to write more code for the test cases. This is a problem because now you have new code that does not have test cases for itself and can have bugs. So if the test case is written incorrectly, then it may point out that the original code is working incorrectly when in fact the test case is the culprit.
Another argument (which is what I think you were trying to say) is that when the specifications for the code change, so must the test cases. This can potentially double the maintenance work required when updating the software.
So I don't think that every piece of code needs a unit test case. But for code that will be reused considerably and is closer to the foundation of your project, you probably should make unit test cases.
It's also out of favor because of how much of the real world of programming works. My very small company does a lot of work for a very, very large company. At my small company, we have one layer of management - the owner of the company. Everyone else is in the level of "not an owner of the company."
At the large company, there are a multitude of layers of management. Any software they build requires extensive specifications and documentation far in advance of laying down any code. They decide all the aspects of the software before it's written. At my company, the boss just gives us a general outline of what he's thinking about and asks us to feel out the idea. We use a RAD environment and will often have a first iteration within a week. This version tends to get completely rewritten, sometimes multiple times. We do not document any of this in advance because the usable version may differ so much from the original ideas.
At the large company, their projects tend to take years and years, go far over budget and typically are much less useful than they had originally hoped. As a bonus, they are usually bug-ridden and unstable. Many times they just eventually get canceled by the new layer of management, who then get awards for this "cost saving measure." At my company, our projects are typically finished far in advance for a tiny price. They are typically of very high quality, with very minor bugs which we fix rather quickly.
This large company frequently hires our company to build software rather than trying to do it internally. They are usually amazed at the things we can do.
Something like "literate programming" is completely anathema to how our company works. If we started trying to write specifications in advance of figuring out what product our clients actually want (as opposed to what they think they want at the start of the process).
Now, I will state that our company only works because we don't hire idiots or slackers. Also, I am fully aware that this is not a good way to, for example, design nuclear power plant software or a baggage control system. But for businesses, all that documentation and "thinking" can just cloak the fact that the people building the software don't know what they are doing.
It actually doesn't sound to me like Knuth has heard of the term 'unit test' before this interview at all. It sounds like he thinks it means prototyping a function before writing the real version. Given that he likes to push his model of documentation-driven programming, I think he might be more sympathetic to unit tests if he understood that they can serve as a kind of formalized documentation.
Nonsense. The problem isn't with the programmers, it's with the languages. Writing object-oriented code in Fortran is too difficult for most programmers, but that doesn't mean that the programmers aren't up to the task, just that the language they are using isn't well suited to the job.
Learn a little Erlang or Haskell to see how easy writing massively parallel programs can be.
--MarkusQ
Testing is not a waste of time. Writing unit tests is. A better use of that time could be learning about http://en.wikipedia.org/wiki/Design_by_contract, and possibly a language that implements it. (Dare I suggest http://en.wikipedia.org/wiki/D_(programming_language) ?)
The problem with unit tests, as someone already pointed out, is that writing small tests to verify the really simple things is relatively quick and easy, but those mistakes are easy to detect and find anyway; even a decent compiler or lint tool will find many of them for you.
Most of the really tricky problems come in the interfacing between units, and no unit test will help you there.
Design-by-contract, on the other hand, can.
First of all, it requires you to give a really clear formal declaration of each component and how it may be used (a contract). This contract can be used to automate testing of all the exotic, but allowed, corner cases your implementation may have missed.
The contract implemented by classes and functions will furthermore function as REALLY clear documentation, spelling out exactly what the allowed input and expected output are.
Finally, in a design-by-contract-enabled language, you can choose to turn on debugging at run-time and enable all kinds of tests, including input/output validation, class invariants and other really handy tests that allow you to test AND debug your entire system at once, but still at a highly granular level.
Design-by-contract will in no way prevent you, or discourage you, from writing test-code (in fact, it encourages it), but it can help you spend your time writing those tests more sensibly, and drive your code in much more life-like scenarios.
All in much less time spent, and much more return than the traditional ways of unit-testing.
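For readers who haven't seen the idea, here is a rough approximation in Python, which has no built-in contract support. The `contract` decorator and the `divide` example are invented for illustration and are not how D or Eiffel implement it; the only real language feature relied on is that `assert` statements are dropped when Python runs with `-O`, which stands in for switching contract checking off.

```python
import functools

def contract(pre=None, post=None):
    """Attach a precondition on the arguments and a postcondition on the result."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if pre is not None:
                # Checked on every call; removed entirely under `python -O`.
                assert pre(*args, **kwargs), f"precondition of {func.__name__} violated"
            result = func(*args, **kwargs)
            if post is not None:
                assert post(result, *args, **kwargs), f"postcondition of {func.__name__} violated"
            return result
        return wrapper
    return decorate

@contract(pre=lambda n, d: d != 0,
          post=lambda r, n, d: abs(r * d - n) < 1e-9)
def divide(n, d):
    return n / d

print(divide(6, 3))  # 2.0
```

The lambdas double as documentation: the allowed input (non-zero denominator) and the expected output (result times denominator gives back the numerator) are spelled out next to the function.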
I know this problem very well from the dark days when I was still writing Java.
There doesn't seem to be a satisfactory solution, it's always a tradeoff.
While reading this thread I realized a funny thing: This particular annoyance
totally vanished from my day-to-day headaches when I switched to Python about
a year ago.
It's a bit weird because Python doesn't even use braces so one would expect
it to be even harder to identify where a block begins and where it ends.
But the opposite has been the case for me: The clean syntax and language
design has led me to write, on average, shorter blocks with very little
nesting.
In other words, he considered the question, and provided the answer that described what he found for his work. Not what you'll find for your work, or anything like that. Just what he finds works for him.
From the summary, you'd think he was a trash-talking pro-wrestler. Actually, after reading the article, I did find him to be a bit preachy. Apparently you and everyone else find him unquestionably correct in all his statements from that interview. I'll admit to having generally found him right. When he's talking about a subject he's not an expert on, he mentions that. He says why his opinion isn't the best on all topics. He seems to be generally careful about what he's saying. Where what he says is likely to be personal bias, he mentions that. So weigh it as you think is relevant, given who he is.
Yes, there's stuff in there I reckon is probably wrong. There's also an awful lot that's probably right, and he's generally been careful and thoughtful about what he's saying.
> And also, people are claiming he said these things "in passing." Which I find to be a phrase used when you want to avoid owning up to something you said. If I call you a "whiney bitch in passing" that doesn't lessen it one bit. Knuth claims no one should listen to him. Why is he publishing books if no one should listen to him?
No, he said, quite regularly, that his opinion on some things wasn't going to be very useful, as he wasn't that good at them. His books are about the topics he is good at. He doesn't write them about software development best practice, or economics, or parallel computation. He deals with the topics he knows, and that's all he expects to be listened to about.
> The guy said some inflammatory comments. If you read the following posts, you'll realize that I wasn't the only one that found them inflammatory or controversial.
No. Knuth generally was cautious about his statements. Do please find me some actually inflammatory comments from there. Go on, direct quotes, not just rephrasings or spin.
"Not an actor, but he plays one on TV."
If you feel like experimenting with literate programming, try finding the 'Leo' editor (written in Python).
This is why good programmers must be good writers as well. The ability to explain the structure of code, primarily with clear, precise names for classes, functions, and variables and secondarily with comments, is as important as the ability to decompose a program logically. I have seen programs that were twice as complex as they should have been, yet made perfect sense and were easy to hack on, because the code and comments made it easy to understand the original programmer's understanding.
In the interview, Knuth gave one reason why literate programming isn't popular: few people are good programmers, few people are good writers, and literate programming only works for people who fall into both categories. I think he's right. Encouraging people to write more unstructured prose than they feel a need to results in worthless verbiage, and most programmers naturally limit their comments to sentence fragments and the occasional short paragraph. If you have to force people to write, you need to provide structure and clear expectations. That's why Javadoc and similar schemes are so much more valuable in practice than literate programming.
I've worked with code written in the "pages-long function" style. One thing I've found consistently in such code is duplicated functionality. I've even personally seen how this happens: A programmer implements a very large, complex piece of functionality as a function. Later, the need arises to implement a variation on this functionality. It turns out that the variation is quite structurally different, so to implement it by parameterizing the original function would result in spaghetti code. So, a second enormous function is written. The programmer scans the two functions and finds no large sections of code that are duplicated between the two functions, so he calls the job done.
Naturally, code developed in this manner degenerates into an unmanageable mess as more and more variations are added. Anyone trying to program this way in the business world, for example, will quickly drown. However, people tend to get by with this style in the scientific computing world, where there are relatively few variations that cannot be handled by parameterization.
The first thing that must be done to get out of this mode of programming is to assume that such variations will arise. Treat the first problem you solve as merely the first of many variations that you will be asked to solve in the future. (Either that, or make the opposite assumption and then completely rewrite your code when the second variation arises.) Then, analyze each variation in terms of the things it must accomplish. You can categorize things using problem domain concepts, solution domain concepts, whatever, as long as you can put names on things.
(The tendency to think in human terms and give natural language names to things is a hindrance to mathematical insight, but it often generates effective ways to decompose code. Perhaps this is another reason why the "big function" style tends to plague scientific computing.)
Once you have split the original task into conceptually separate chunks that can be given reasonable natural language names, split the program up into pieces that have those names.
Voila, now you have a program made of small chunks that can be understood by other programmers. And, since each chunk has a reasonable name, programmers can limit themselves to reading the name and documentation of most pieces of the program, rather than reading every single line of code. This makes the code effectively shorter and easier to maintain, even if you never reuse the chunks to compose a new variation.