Donald Knuth Rips On Unit Tests and More
eldavojohn writes "You may be familiar with Donald Knuth from his famous Art of Computer Programming books but he's also the father of TeX and, arguably, one of the founders of open source. There's an interesting interview where he says a lot of stuff I wouldn't have predicted. One of the first surprises to me was that he didn't seem to be a huge proponent of unit tests. I use JUnit to test parts of my projects maybe 200 times a day but Knuth calls that kind of practice a 'waste of time' and claims 'nothing needs to be "mocked up."' He also states that methods to write software to take advantage of parallel programming hardware (like multi-core systems that we've discussed) are too difficult for him to tackle due to ever-changing hardware. He even goes so far as to vent about his unhappiness toward chipmakers for forcing us into the multicore realm. He pitches his idea of 'literate programming' which I must admit I've never heard of but find it intriguing. At the end, he even remarks on his adage that young people shouldn't do things just because they're trendy. Whether you love him or hate him, he sure has some interesting/flame-bait things to say."
Now, I've no problem with literate programming, but given that even semi-literate practices like "write good comments" hasn't caught on in many places, I think Don is flogging a dead horse by suggesting that code should be entirely documentation driven.
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
I have a lot of respect for Knuth as an algorithms guy, but anything he says about programming needs to be taken with a grain of salt. When he created the TeX language, he lost all credibility - designing a language in 1978 which makes structured programming almost impossible is just insane. TeX gets a lot of praise as being 'bug free,' but that's really only half true. The core of TeX is a virtual machine and a set of typesetting algorithms, both of which are very simple pieces of code (they'd have to be to run on a PDP-10). Most of the bits people actually use are then metaprogrammed on top of the virtual machine, and frequently contain bugs which are a colossal pain to track down because of the inherent flaws in the language (no scoping, for example).
If you want to learn about algorithms, listen to Donald Knuth and you will learn a great deal. If you want to learn about programming, listen to Edsger Dijkstra or Alan Kay.
I am TheRaven on Soylent News
Literate programming is an old friend for developers of functional programming languages. I see it like "code for the human mind": it provides a source code that is well adjusted to the needs of the developer, not just the machine.
It interleaves code and documentation in the same files, and provides a specialized compilator to tell the two kinds of codes apart. Just like Doxygen and Javadoc can extract the comments from a source code project, the "tangle" process can extract all the code from a Literate program and pass it to a clasic compiler.
Now that C and C++ seem to have a declining popularity, maybe we can look for better ways of getting away from the bare metal (which, don't forget it, is why those languages become popular at the beginning). Don't get me wrong, they served us well for 36 years, but I think it's time again to begin caring more for the developers' requirements and less for the hardware requirements.
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
the prize was not US$1000. it started out very small. Knuth did indeed pay out, and indeed doubled it, several times. From wikipedia: "The award per bug started at $2.56 (one "hexadecimal dollar"[24]) and doubled every year until it was frozen at its current value of $327.68. This has not made Knuth poor, however, as there have been very few bugs claimed. In addition, people have been known to frame a check proving they found a bug in TeX instead of cashing it."
Using "MULTIPLYBY" instead of "*" is asinine, because both are equally descriptive. Putting a comment above the line telling people why you're doing it isn't.
The reason for his dismissive attitude of unit tests - that he knows exactly how all of his code works, and what impact a change will have - is exactly the reason you need them. In the real world, most programmers do in fact have to share their code with others. You're not always going to know the ramifications of refactoring a particular block of code, even if you wrote it yourself. And if you can keep all of that in your head at once, either your program is trivial, or you are some sort of supergenius. Now while I think the TDD guys are a little bit overzealous sometimes with their "100% coverage or die" attitude, unit testing is still a good habit to get into, regardless of what Knuth thinks.
... looks like it falls into the same trap as COBOL. The idea that by making programming languages incredibly verbose, they will somehow become easier to use is a fallacy.Using "MULTIPLYBY" instead of "*" isn't going to make your code easier to read. From what I've seen (particularly of CWEB), literate programming doesn't change the programming language itself, it just adds a TeX style markup to the comments so that detailed (and nicely typeset) documentation can be generated from the source code. Take a look at some of Knuth's CWEB code, such as his implementation of Adventure:
http://sunburn.stanford.edu/~knuth/programs/advent.w.gz
It appears to be ordinary C once the CWEB documentation is stripped out.
That's not literate programming at all. A tad more research on your part is required. I actually remember when "web" in a computing context a literate programming tool rather than that thing you're surfing right now.
Literate Programming interleaves the documentation (written in TeX, naturally) and code into a single document. You then run that (Web) document through one of two processors (Tangle or Weave) to produce code or documentation respectively. The code is then compiled, and the documentation built with your TeX distribution. The documentation includes the nicely formatted source code within.
You can use literate programming in any language you want. I even wrote rules for Microsoft C 7.0's Programmer's Workbench to use it within the MSC environment.
I've frequently thought about going back. Javadoc and/or Sandcastle are poor alternatives.
http://en.wikipedia.org/wiki/Knuth_reward_check
Many people save these (usually small) checks as souvenirs. My father -- a frugal mathematician -- received a few $2.56 checks from Knuth, and he promptly cashed each one.
I'd kill any colleague of mine who wrote such a vacuous comment. With a golf club, in front of its cow-orkers to drive the lesson home,
Watch this Heartland Institute video
I'd kill any colleague of mine who wrote such a vacuous comment. With a golf club, in front of its cow-orkers to drive the lesson home, I sometimes add comments like that if the brace is closing a deeply nested block, but then the comment indicates which particular loop or block is ending.
The snippets have markup to indicate when some snippet needs to come textually before another to keep a compiler happy, but mostly this is figured out automatically. But in general, the resulting C code is in a different order than it appears in the source documentation. For instance, the core algorithm might come first, with all the declarations and other housekeeping at the end. (With documentation about why you're using this supporting library and not that, of course.)
He's right about unit tests... sort of. Just as most coders shouldn't be designing interfaces, most coders don't know how to test. It can often be more work writing the unit tests than writing the code.
If you have a function that multiplies two integers, most coders will write a test that multiplies two numbers. That's not good enough. You need to consider boundary conditions. For example, can you multiply MAX_INT by MAX_INT? MAX_INT by -MAX_INT? Etc. With real world functions you are going to have boundaries up the whazoo. In addition, if you have a function that takes data coming from the user, check for invalid cases even if another function is validating. Check for null or indeterminate values. Write tests that you expect to fail.
Don't blame me, I didn't vote for either of them!
I concur. Comments should tell WHY the while block is there and what it DOES. Not where it starts and where it ends, the code tells us that descriptively enough.
I've met code blocks several hundred lines long and it was never ambigious where they started and ended.
The headline is misleading. Donald Knuth represents the epitome of the solitary super-programmer stereotype, so it's only natural that he sees no need for unit tests to catch mistakes or extreme programming to improve team development practices. I don't think he's necessarily saying that those things are without value for ordinary programmers.
I somewhat disagree with what you and... *sigh* Monkeybaister posted. Yes, there are many times when long stretches of code should be broken out into functions. But I tend to do that mostly when the same bit of code is used in several different cases. The reason being is that when you start modularizing off all your while loops that are more than a dozen lines long, you create a whole new type of spaghetti code. I'm going to coin a term and call it "spaghetti-O code." You try to track down a bug and what would have been a straightforward couple pages of code now has all kinds of functions at different places in the code. As such, it can often make debugging or forming a mental map of the code much harder.
There are a few techniques to be mastered and using a language designed with parallelism in mind helps hugely with the picky details.
Assuming you work an 8 hour day, that means you are testing your code every 2 minutes and 24 seconds. Given that most of your tests will take this long to run (you've got a suite of them right?), that leaves you with zero time to actually do the work you are testing.
Frankly, if you are using Unit Tests you should be using them after major chunks of work, not in a trial and error fashion. Now if you were using them in a trial and error fashion - "lets change this, run the tests and see if they pass, no that didn't work, lets try this", etc, I could understand how you hit the 200 times per day mark. That is a completely respectable position to take. I used to work in this way during college. My several thousand unit tests do take 2-3 minutes to build. My estimate of 200 times was probably over shot and should be more like 50-75 times a day. I would also like to point out that I can continue developing while the tests run. I use Maven2 as a build tool and enjoy it immensely, it helps me do test driven development. Since you are obviously far superior to me, I will assume you know what this means but point it out to the rest of the idiots like myself. TDD is where you write your unit tests before you code. Then you satisfy your unit tests with code. When you need to change code, you change the tests and then you change the code to fix the tests. Crazy waste of time right? If you are coding in a trial and error fashion and using unit tests that way, I'd advise getting some tuition or changing career. Thanks, I love you too.
But the thing is that my employer loves my work and my code rarely breaks. Now why is that? Perhaps because I'm regression testing at all times? Perhaps it's because I take the time to think about things before I do them and, as a result, I really begin to understand what it is that I'm writing.
An added benefit is that I've found I can look at my or others unit tests and really understand what was going through their mind when the first wrote the method that I am expanding. It's quite interesting, but I'm sure you are a supreme being like Knuth and don't bother with such trivialities. I'm a highly productive individual
Here's a question: how much time do you spend working out what happened when your code breaks? TDD is a trivial amount of time compared to that. I am concerned about my software in the present and future. I wish others were also. What counts is that when you run the unit tests, they pass, and that they accurately test the conditions that need testing. I disagree with you. I run unit tests that fail all the time--on purpose.
I know it will most likely result in a swift abrasive response but I implore your highness to really spend some time understanding how unit tests can help the really stupid coders (like me).
My work here is dung.
I myself solve the problem using this construct:
...
#define BeginWhile {
#define EndWhile }
#define BeginFor {
#define EndFor }
Literate Programming is not about making programming languages incredibly verbose; it's about *describing* your program in a normal, human way, by explaining it step by step and quoting bits and pieces of the program. Sounds ideal from a documentation point of view, right ? only that if that was all there was with Literate Programming, it would be a stupid idea, as documentation has a nasty habit to not follow up with code modification.
... and then remove all the green inputs:
:-P
The really cool idea with LP is that the code snippets you use in the documentation are then weaved together to generate the "real" code of your program. So a LP document is BOTH the documentation and the code. A code snippet can contains references ("include") to other code snippets, and you can adds stuff to an existing code snippet.
Let me show you an example in simple (invented) syntax:
{my program}
{title}My super program{/title}
Blablabla we'd need to have the code organized in the following loop:
{main}:
{for all inputs}:
{filter inputs}
{apply processing on the filtered inputs}
{/}
{/}
The {for all inputs} consist in the following actions:
{for all inputs}:
some code
{/}
The filtering first remove all blue inputs:
{filter inputs}:
remove all blue inputs
{/}
{filter inputs}+:
remove all green inputs
{/}
etc.
{/}
The above is purely to illustrate the idea, the actual CWEB syntax is a bit different. But you can see how, starting with a single source document, you could generate both the code and the documentation of the code, and how you can introduce and explain your code gradually, explaining things in whichever way that makes the most sense (bottom-up, top-down, a mix of those..).
In a way, Doxygen or JavaDoc have similar goals: put documentation and code together and generate documentation. But they take the problem in reverse from what literate programming propose; with Doxygen/JavaDoc, you create your program, put some little snippets of documentation, and you automatically generate a documentation of your code. With LP, you write your documentation describing your program and you generate the program.
Those two approaches produce radically different results -- the "documentation" created by Doxygen/JavaDoc is more a "reference" kind of documentation, and does little to explain the reason of the program, the choice leading to the different functions or classes, or even something as important as explaining the relationships between classes. With some effort it's possible to have such doc system be the basis of nice documentation (Apple Cocoa documentation is great in that aspect for example), but 1/ this requires more work (Cocoa has descriptive documents in addition to the javadoc-like reference) 2/ it really only works well for stuff like libraries and frameworks.
LP is great because the documentation is really meant for humans, not for computers. It's also great because by nature it will produces better documentation and better code. It's not so great as it poorly integrates with the way we do code nowadays, and it poorly integrates with OOP.
But somehow I've always been thinking that there is a fundamentally good idea to explore there, just waiting for better tools/ide to exploit it
(also, the eponymous book from Knuth is a great read)
The summary sounds like it was written by the headline-producing monkeys at Fox, CNN -- or hell, at the Jerry Springer show. Donald Knuth is not "playing hardball." Nobody needs to call the interview "raw and uncut," or "unplugged."
The interview has almost nothing to do with unit testing and the little Knuth does have to say about the practice is hardly "ripping."
When will people stop sullying peoples' good names by sensationalizing everything they say?
Knuth is a well-respected figure who makes moderate, thoughtful statements. From the summary, you'd think he was a trash-talking pro-wrestler.
I don't think there's anything elite about writing short concise functions and breaking things up. The problem is when people first go into programming, they make these kinds of mistakes unless there are proper code reviews/training (things which often don't happen). When you are at university, the programs you write tend to be quite short and because of that, you don't realise how bad a programmer you actually are at that stage. It's only when you leap into the workplace and start writing a lot that your inadequacies become evident.
To me, programming is a discipline requiring a fair bit of intelligence, but all to often companies hire programmers like they are just hiring shelf-stackers or something. I think there is a lot more professionalism in Open Source projects than in many software houses.
http://unix-archive.kalwun.de/PDP-11/Trees/V7/usr/src/cmd/sh/mac.h
No offence meant, but I think your preconceptions may be clouding your judgement here.
You claim that today's programming field is not about clever tricks and fast algorithms. I claim that if more people understood these old-fashioned concepts, we would have much better software today. For a start, anyone developing those "libraries implemented by specialists" you mentioned had better be very good, since a lot of other people's code is going to depend on them. Having worked in groups that develop various kinds of library, I can assure you that a little more general programming knowledge about clever tricks and fast algorithms wouldn't go amiss.
You claim that today's programming field is about big systems with many programmers. I claim that this is because management and technical leadership in most places isn't competent enough to divide up a big system in modular fashion and allow smaller, more flexible teams to solve the little problems before multiplying them all up to solve the big ones. Instead, the guys at the top tend to reduce all problems to the least common denominator, "throw enough people at it and we'll win eventually" philosophy. This explains how a small company with a few dozen employees can produce software that is better in every way than the competing offering from a larger company with hundreds of developers. You don't even need to have a dew dozen genius programmers; you just need to understand the concept that there are O(n^2) lines of communication between n individuals working in a single large team, but if your project is divided hierarchically then logarithms start coming into play, and if you can split a problem into several properly independent smaller ones this becomes a constant factor overhead. This elementary mathematics seems to be beyond a lot of senior management in the software business, and that has far more to do with the need to build monolithic systems maintained by zillions of developers than any actual project requirements do.
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
"Not an actor, but he plays one on TV."
I have actually seen code like this:
// end while-loop // end if-loop
}
}
You read that right. if-loop.
Done with slashdot, done with nerds, getting a life.