Donald Knuth Rips On Unit Tests and More
eldavojohn writes "You may be familiar with Donald Knuth from his famous Art of Computer Programming books but he's also the father of TeX and, arguably, one of the founders of open source. There's an interesting interview where he says a lot of stuff I wouldn't have predicted. One of the first surprises to me was that he didn't seem to be a huge proponent of unit tests. I use JUnit to test parts of my projects maybe 200 times a day but Knuth calls that kind of practice a 'waste of time' and claims 'nothing needs to be "mocked up."' He also states that methods to write software to take advantage of parallel programming hardware (like multi-core systems that we've discussed) are too difficult for him to tackle due to ever-changing hardware. He even goes so far as to vent about his unhappiness toward chipmakers for forcing us into the multicore realm. He pitches his idea of 'literate programming' which I must admit I've never heard of but find it intriguing. At the end, he even remarks on his adage that young people shouldn't do things just because they're trendy. Whether you love him or hate him, he sure has some interesting/flame-bait things to say."
... looks like it falls into the same trap as COBOL. The idea that by making programming languages incredibly verbose, they will somehow become easier to use is a fallacy.
Using "MULTIPLYBY" instead of "*" isn't going to make your code easier to read.
Now, I've no problem with literate programming, but given that even semi-literate practices like "write good comments" haven't caught on in many places, I think Don is flogging a dead horse by suggesting that code should be entirely documentation driven.
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
It is probably folklore. But the story during my grad school days was that Knuth offered a $1000 prize to anyone finding a bug in TeX, and he doubled it a couple of times. And it was never claimed. If that was true, it is very unlikely he was just flame baiting.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
I misread the headline. I was looking forward to a good Fart story from Donald "Structured Programming" Knuth. Oh well...
I have a lot of respect for Knuth as an algorithms guy, but anything he says about programming needs to be taken with a grain of salt. When he created the TeX language, he lost all credibility - designing a language in 1978 which makes structured programming almost impossible is just insane. TeX gets a lot of praise as being 'bug free,' but that's really only half true. The core of TeX is a virtual machine and a set of typesetting algorithms, both of which are very simple pieces of code (they'd have to be to run on a PDP-10). Most of the bits people actually use are then metaprogrammed on top of the virtual machine, and frequently contain bugs which are a colossal pain to track down because of the inherent flaws in the language (no scoping, for example).
If you want to learn about algorithms, listen to Donald Knuth and you will learn a great deal. If you want to learn about programming, listen to Edsger Dijkstra or Alan Kay.
I am TheRaven on Soylent News
Literate programming is an old friend for developers of functional programming languages. I see it like "code for the human mind": it provides a source code that is well adjusted to the needs of the developer, not just the machine.
It interleaves code and documentation in the same files, and provides a specialized compiler to tell the two kinds of content apart. Just like Doxygen and Javadoc can extract the comments from a source code project, the "tangle" process can extract all the code from a literate program and pass it to a classic compiler.
Now that C and C++ seem to have a declining popularity, maybe we can look for better ways of getting away from the bare metal (which, don't forget, is why those languages became popular in the beginning). Don't get me wrong, they served us well for 36 years, but I think it's time again to begin caring more for the developers' requirements and less for the hardware requirements.
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
Using "MULTIPLYBY" instead of "*" is asinine, because both are equally descriptive. Putting a comment above the line telling people why you're doing it isn't.
Dead. Sad. But Knuth is knocking on that door now, too. I pre-ordered IV about 5 years ago on Amazon. It's a race between it and Duke Nukem (3D was my fav game) For/n/ever.
Long Live COBOL !!
The reason for his dismissive attitude of unit tests - that he knows exactly how all of his code works, and what impact a change will have - is exactly the reason you need them. In the real world, most programmers do in fact have to share their code with others. You're not always going to know the ramifications of refactoring a particular block of code, even if you wrote it yourself. And if you can keep all of that in your head at once, either your program is trivial, or you are some sort of supergenius. Now while I think the TDD guys are a little bit overzealous sometimes with their "100% coverage or die" attitude, unit testing is still a good habit to get into, regardless of what Knuth thinks.
No, that isn't arguable.
TeX got started in 1977, after Unix (1974), well after SPICE (1973), and about even with BSD.
I'm not sure, but I think he's talking personally about his own work on his code. Remember that he comes from an era where people had the goal of mathematically proving that the code is indeed correct. He isn't necessarily doing this now, but my personal guess is that he prefers statically checking the code to checking a running program. In certain kinds of mathematical/scientific applications this could make sense.
He's written some checks. Few of them are cashed - (pdf) On page 10 of this document he explains one.
Help stamp out iliturcy.
The snippets have markup to indicate when some snippet needs to come textually before another to keep a compiler happy, but mostly this is figured out automatically. But in general, the resulting C code is in a different order than it appears in the source documentation. For instance, the core algorithm might come first, with all the declarations and other housekeeping at the end. (With documentation about why you're using this supporting library and not that, of course.)
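To make the reordering concrete, here is a minimal sketch of a "tangle" step. It is a toy, not Knuth's WEB: the `<<name>>=` / `@` chunk markers, the function names, and the sample document are all assumptions made up for illustration. Documentation outside the chunks is ignored, and chunks are emitted in the order the root chunk asks for them, not the order they appear in the text.

```python
import re

# Toy chunk syntax (an assumption, loosely modelled on noweb-style markers):
#   <<name>>=   starts a code chunk; a line containing only "@" ends it
#   <<name>>    on its own line inside a chunk is replaced by that chunk's body
CHUNK_DEF = re.compile(r"^<<(.+?)>>=\s*$")
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def parse_chunks(source):
    """Collect named code chunks; everything outside a chunk is documentation."""
    chunks = {}
    current = None
    for line in source.splitlines():
        definition = CHUNK_DEF.match(line)
        if definition:
            current = definition.group(1)
            chunks.setdefault(current, [])
        elif line.strip() == "@":
            current = None
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, name, indent=""):
    """Expand chunk `name` recursively into compiler-ready source lines."""
    out = []
    for line in chunks[name]:
        ref = CHUNK_REF.match(line)
        if ref:
            out.extend(tangle(chunks, ref.group(2), indent + ref.group(1)))
        else:
            out.append(indent + line)
    return out


# Hypothetical literate source: the core idea is explained first,
# the housekeeping last, yet the include must be emitted first.
DOC = """\
The core idea comes first: print a greeting.
<<main>>=
int main(void) {
    puts("hello");
    return 0;
}
@
The root chunk fixes the order the compiler will see.
<<program>>=
<<includes>>
<<main>>
@
Housekeeping is explained at the end.
<<includes>>=
#include <stdio.h>
@
"""

if __name__ == "__main__":
    print("\n".join(tangle(parse_chunks(DOC), "program")))
```

Tangling the `program` chunk puts the `#include` line ahead of `main`, even though the document discusses it last.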
Are you saying that TeX is the first thing that Knuth wrote?
i wouldn't waste time on unit tests, either.
He's right about unit tests... sort of. Just as most coders shouldn't be designing interfaces, most coders don't know how to test. It can often be more work writing the unit tests than writing the code.
If you have a function that multiplies two integers, most coders will write a test that multiplies two numbers. That's not good enough. You need to consider boundary conditions. For example, can you multiply MAX_INT by MAX_INT? MAX_INT by -MAX_INT? Etc. With real world functions you are going to have boundaries up the whazoo. In addition, if you have a function that takes data coming from the user, check for invalid cases even if another function is validating. Check for null or indeterminate values. Write tests that you expect to fail.
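As a rough illustration of the boundary cases listed above, here is a sketch in Python. Python integers do not overflow, so `checked_multiply`, the 32-bit `INT_MAX`/`INT_MIN` range, and the `raises` helper are all assumptions invented for this example.

```python
INT_MAX = 2**31 - 1   # assumed 32-bit signed range for this sketch
INT_MIN = -2**31


def checked_multiply(a, b):
    """Multiply two ints, raising OverflowError outside the 32-bit range."""
    if a is None or b is None:
        raise TypeError("operands must not be None")
    result = a * b
    if not INT_MIN <= result <= INT_MAX:
        raise OverflowError(f"{a} * {b} does not fit in 32 bits")
    return result


def raises(exc_type, func, *args):
    """Return True if func(*args) raises exc_type, False otherwise."""
    try:
        func(*args)
    except exc_type:
        return True
    return False


def run_boundary_tests():
    # The test most people stop at:
    assert checked_multiply(6, 7) == 42
    # Values sitting exactly on the boundary must still work:
    assert checked_multiply(INT_MAX, 1) == INT_MAX
    assert checked_multiply(INT_MIN, 1) == INT_MIN
    assert checked_multiply(INT_MAX, 0) == 0
    # Values just past the boundary are expected to fail:
    assert raises(OverflowError, checked_multiply, INT_MAX, INT_MAX)
    assert raises(OverflowError, checked_multiply, INT_MAX, -INT_MAX)
    assert raises(OverflowError, checked_multiply, INT_MIN, -1)
    # Null input, even if some caller is supposed to validate it:
    assert raises(TypeError, checked_multiply, None, 3)
    return True
```

Only the first assertion is the "multiply two numbers" test; the rest are the boundary and invalid-input cases.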
Don't blame me, I didn't vote for either of them!
.... use a spell checker.
Ultimately literate programming is a matter of translation.
When you boil it all down to what the machine understands, it comes out binary.
To achieve the literate programming goal, it's clear there needs to be a programming language designed for it and a translator, be it a compiler or interpreter, that can take the results and convert them to machine-understandable binary that runs as intended by the programmer/writer.
No he doesn't. He says they're unnecessary in most cases, and that he only uses them when he's ``feeling my way in a totally unknown environment''. Otherwise he simply doesn't need to.
The ALGOL on punch cards story is quite separate.
"Lisp
Making a program parallel will always be too hard for most programmers. But that's exactly why you don't have normal programmers do it... have the libraries do it automatically. Functions like qsort(3) are already black boxes, so they can be made to always run in parallel when the input is large enough. Other functions like, say, Ruby's .collect can run in parallel. For things like .each there can be a parallel and a sequential version, and the programmer can pick whichever is appropriate.
But to do this we need operating systems that can efficiently and reliably schedule code across cores. Add an ability to 'bind' threads together, so that they always schedule at the same time but on separate real processors. This gives the program the ability to know when running an operation split between these threads will always complete faster than sequentially, without vagaries of scheduling possibly starving one thread and making it run much slower.
Once you have this then you can automatically get some speedups from multiple cores on programs that are designed to only run sequentially, and more speedup on programs with just minor tweaks. You aren't going to get perfect scaling this way, but you will get substantial improvements at virtually no cost to the programmer.
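A minimal sketch of the "let the library decide" idea, using Python's standard `concurrent.futures`. The name `collect`, the `PARALLEL_THRESHOLD` cut-off, and the worker count are assumptions chosen for illustration. Note that in CPython a thread pool mostly helps I/O-bound work, so this shows the shape of the interface more than a guaranteed speedup.

```python
from concurrent.futures import ThreadPoolExecutor

PARALLEL_THRESHOLD = 1000  # hypothetical cut-off; a real library would tune this


def collect(func, items, workers=4):
    """Map func over items, switching to a thread pool only for large inputs."""
    items = list(items)
    if len(items) < PARALLEL_THRESHOLD:
        # Small input: the overhead of a pool is not worth it.
        return [func(x) for x in items]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so callers
        # see the same result as the sequential version.
        return list(pool.map(func, items))
```

The caller writes `collect(f, data)` either way; whether the work is split up is a decision hidden inside the library.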
The headline is misleading. Donald Knuth represents the epitome of the solitary super-programmer stereotype, so it's only natural that he sees no need for unit tests to catch mistakes or extreme programming to improve team development practices. I don't think he's necessarily saying that those things are without value for ordinary programmers.
Literate programming might be more popular if it had support for interactive debugging, with the standard features common in contemporary interactive development environments.
That's a brave stance. He's old, but he hasn't reached his dotage yet. The good doctor has contributed more to the science of information than most, and almost certainly more than you.
One of the reasons why we're reinventing so much over and over with nuisances like VB and C# is that developers are architecting grand toolchains based on ideas that were proven incorrect in the 1960s. They get a lot of profit from their workarounds, and then we burn it all down and start over because they all contain the same fatal flaws.
That would be because you haven't installed Vista on it yet.
Having watched this tragedy unfold for a quarter century I've often shaken my head and wondered what y'all were thinking. And then I remember that I once thought my parents were fools too. If you can read TAOCP and understand a good fraction of it you will come away with a firmer foundation for the way all things work. It's a tough slog, though, and not everybody is capable.
Help stamp out iliturcy.
Erlang is the balm to ease your pain, Mr. Knuth.
Try it, you'll like it.
-AC
"I've only proven that it works, I haven't tested it" - Knuth
Knuth's view of programming seems to be that of clever tricks and fast algorithms. That may have been true when he got into the field, but it isn't anymore. Today, it's about creating big systems that need to be maintained by many programmers, not all of whom are as smart as Knuth. Algorithms come in libraries and are implemented by specialists. So, maybe Knuth doesn't need unit tests for his projects, but real-world projects do.
Unix(TM) was never open source. The source was available under certain restrictions, but it failed on the "free distribution" requirement at the very least.
"Slow down, Cowboy! It has been 3 years, 7 months and 26 days since you last successfully posted a comment."
and their unit test. in my days, if you needed a language, you wrote your own assembly. and when you couldn't document it, you wrote your own mark up language.... and your own fonts. phew... multiple cores. who needs them?!
Any guest worker system is indistinguishable from indentured servitude.
It's pretty inane to end the post with "whether you love him or hate him". Who hates him? For what? Maybe you're confusing him with someone else. He's not a controversial figure in the CS world, and he's not part of the Free Software or Open Source punditry.
I hope he lives to 100+ so he can keep writing.
Well, if you are testing your code 200 times a day, you are almost certainly wasting time. Let's run some numbers:
Assuming you work an 8 hour day, that means you are testing your code every 2 minutes and 24 seconds. Given that most of your tests will take this long to run (you've got a suite of them right?), that leaves you with zero time to actually do the work you are testing.
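For anyone checking the arithmetic, a quick sketch (the function name is made up for this example):

```python
def seconds_between_runs(hours_per_day, runs_per_day):
    """Average gap between test runs, in seconds, over a working day."""
    return hours_per_day * 3600 / runs_per_day


interval = seconds_between_runs(8, 200)   # 28800 s spread over 200 runs
minutes, seconds = divmod(interval, 60)   # split into minutes and seconds
print(f"{int(minutes)} min {int(seconds)} s between runs")
```

That works out to 144 seconds, which is the 2 minutes and 24 seconds quoted above.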
Frankly, if you are using Unit Tests you should be using them after major chunks of work, not in a trial and error fashion. Now if you were using them in a trial and error fashion - "let's change this, run the tests and see if they pass, no that didn't work, let's try this", etc, I could understand how you hit the 200 times per day mark.
If you are coding in a trial and error fashion and using unit tests that way, I'd advise getting some tuition or changing career.
If you aren't doing a trial and error, then I'd suggest that you've perhaps exaggerated the 200 times figure.
I'm a highly productive individual in terms of writing software, and I'd guess I have cause to get out the unit tests anywhere from a few times a day down to maybe once every few days. The rest of the time I'm actually implementing the design that's in my head, on paper, etc.
I believe in incremental development, but that doesn't mean blowing huge wads of time needlessly running unit tests. Which means, by implication, the unit tests ARE NOT part of the build process. They are something I run at times of my choosing when I think the work I'm doing is at a point that may benefit from such tests.
Making the unit tests part of the build process is like requiring a roadworthiness test for your car every mile you drive it. Sure, the car is safe, but it's not very productive at getting you from A to B; you could walk faster.
What counts is that when you run the unit tests, they pass, and that they accurately test the conditions that need testing.
Erlang is the balm for your multicell blues.
Try it, you'll like it.
There's a relevant quirk of Stanford University employee policy. For Stanford academic employees, software usually is considered "work for hire" and an "institutional work", with Stanford holding the copyright. But books and papers are considered to be the property of the author. (Policy on this changed in the late 1990s; there's a long history here, primarily involving the founding of Sun and Cisco.) However, Stanford permits the creator to place a work in the public domain, unless external funding prohibits this.
Knuth's code is open source. But his books are not.
It's been some time, but it seems to me the notion of literate programming is that anyone who knows the language can read the code and understand what's going on.
So, whether you use "*" or "MULTIPLYBY" is no more a mark of literacy than if you say "Thank you" or "Merci".
Verbosity has nothing to do with it. Cobol code that is readable is better than C code that's not.
-- Slashdot: When Public Access TV Says "No"
It has always been about trying to avoid making big systems. Unit tests have many of the limitations of spell checking; they are know replacement four vigorous review.
The headline and the summary both have quite a lot of spin in them. If you want to know what Knuth actually says, you should RTFA. Yes, I know this is /., but you won't regret it; it's a good read.
By the way, Knuth uses Ubuntu.
Try it, you'll like it.
Erlang is the balm for your multicell pain, Mr. Knuth.
FVWM on Ubuntu Linux. Emacs with special modes using a homemade bitmap font. Mac OSX for Illustrator and Photoshop...
Now that's breadth AND depth.
"...but here we are in 2008 with no punch cards..."
Yes and no. Yes, the physical punch cards are gone, but they live on in financial institutions in the form of Automated Clearing House (ACH) debits and credits which use the 96 column IBM punch card format. So, the next time you use your credit card, ATM card, e-check or pay a bill online through some company's web site, on the backend they are probably using ACH upload files (aka NACHA format) which was based on IBM's 96 column punch card to transfer financial data. Magnetic tape may be used on a contingency basis but it has to have an additional header record, be EBCDIC encoded and use 9 track tape. The IRS and many state tax agencies use ACH transfers, as an option, to refund personal income taxes instead of sending taxpayers a physical check.
From the interview I see that Knuth thinks multithreading may turn out to be a flop. I agree and I would go even further. I don't think there is any question that the multithreading strategy used by Intel and AMD for multicore programming is a big flop already. Multithreading is the second worst thing to have happened to computing. The worst is single-threading, which is what normal algorithmic programming is based on. Parallelism is the correct approach to computing. Computing should have been parallel from the start, even on single-core processors. If parallelism had been emulated in a processor from the beginning, adding more cores would have been a simple evolutionary transition, a mere engineering problem. My point is that there is a better way to do parallel programming that does not involve threads at all. Cellular automata and neural network programmers have been emulating parallelism for decades without threads.
Essentially, you need to have two buffers and a loop. While the first buffer is being processed, the second buffer is filled with objects to be processed in the next cycle. When the first buffer is done, swap the buffers and start over. Two buffers are used in order to prevent signal racing conditions. It's not rocket science. We just need to take it to the instruction level by changing it from a software mechanism into a hardware mechanism. In other words, the mechanism should reside in the processor, whether single or multicore. This is the correct approach to parallelism. Multithreading is going to be a complete disaster, a multi-billion dollar disaster. Google "Nightmare on Core Street" to find out why multithreading is not part of the future of parallel computing.
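The two-buffers-and-a-loop scheme described above can be sketched in software as follows. This is one possible reading of it, using a one-dimensional cellular automaton; the names `step`, `run`, and `rule90`, and the choice of wrap-around edges, are assumptions for illustration, not anything the poster specified.

```python
def step(current, nxt, rule):
    """Compute one generation from `current` into `nxt`, never in place."""
    n = len(current)
    for i in range(n):
        left = current[(i - 1) % n]    # neighbours wrap around the edges
        mid = current[i]
        right = current[(i + 1) % n]
        nxt[i] = rule(left, mid, right)


def run(cells, rule, generations):
    """Advance the automaton, swapping the two buffers after each cycle."""
    current = list(cells)
    nxt = [0] * len(current)
    for _ in range(generations):
        step(current, nxt, rule)
        # Every cell read only the old buffer, so the order in which
        # cells are visited cannot change the result: no race conditions.
        current, nxt = nxt, current
    return current


def rule90(left, mid, right):
    """Elementary cellular automaton rule 90: XOR of the two neighbours."""
    return left ^ right
```

Because each cell is computed from the old buffer alone, the inner loop could be handed to any number of workers without changing the answer, which is the property the poster is relying on.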
I somewhat disagree with what you and... *sigh* Monkeybaister posted. Yes, there are many times when long stretches of code should be broken out into functions. But I tend to do that mostly when the same bit of code is used in several different cases. The reason is that when you start modularizing off all your while loops that are more than a dozen lines long, you create a whole new type of spaghetti code. I'm going to coin a term and call it "spaghetti-O code." You try to track down a bug and what would have been a straightforward couple pages of code now has all kinds of functions at different places in the code. As such, it can often make debugging or forming a mental map of the code much harder.
Seriously, if you're "religious" about unit testing and mock objects, then you really need to revise the way you live your life.
It's just a good habit to get into, if you take it seriously and don't just create tests that test silly little things like "is my text box centered where I slapped it on the form with gui form tool" type of stuff. That's kind of the point he's trying to make, that you program intelligently in the first place to avoid having an insane amount of redundant tests to pass each time you build.
I've been doing literate programming (well, as close as you can with C and its derivative languages) for a long time now. I've watched XP coders take that literacy and chop it all up because "it didn't look pretty enough". The idea with making something literate is to make it so clear that you can reduce the total numbers of tests needed to make that code pass to only ones that test the actual expected outputs of that function. That's something that intelligent coders who don't just follow the Agile rulebook, but apply it effectively, can do. I don't know how many times I'd see a piece of code that did one simple task, had one test to test the output of that test, then another coder drops 3 more tests because they "didn't feel comfortable with only one" without specifying WHY. That is how you get into having redundant tests that muck up your test infrastructure.
I'll forgive you for being a Java developer, but the fathers of C have always cited readability first (The C Programming Language ~1978). They don't call it "literate programming", which is simply a trendy buzzword, but the idea of programming for readability has been around for an extremely long time.
"Assuming you work an 8 hour day, that means you are testing your code every 2 minutes and 24 seconds. Given that most of your tests will take this long to run (you've got a suite of them right?), that leaves you with zero time to actually do the work you are testing."
"Frankly, if you are using Unit Tests you should be using them after major chunks of work, not in a trial and error fashion. Now if you were using them in a trial and error fashion - 'let's change this, run the tests and see if they pass, no that didn't work, let's try this', etc, I could understand how you hit the 200 times per day mark."
That is a completely respectable position to take. I used to work in this way during college. My several thousand unit tests do take 2-3 minutes to build. My estimate of 200 times was probably overshot and should be more like 50-75 times a day. I would also like to point out that I can continue developing while the tests run. I use Maven2 as a build tool and enjoy it immensely; it helps me do test driven development. Since you are obviously far superior to me, I will assume you know what this means but point it out to the rest of the idiots like myself. TDD is where you write your unit tests before you code. Then you satisfy your unit tests with code. When you need to change code, you change the tests and then you change the code to fix the tests. Crazy waste of time, right?
"If you are coding in a trial and error fashion and using unit tests that way, I'd advise getting some tuition or changing career."
Thanks, I love you too.
But the thing is that my employer loves my work and my code rarely breaks. Now why is that? Perhaps because I'm regression testing at all times? Perhaps it's because I take the time to think about things before I do them and, as a result, I really begin to understand what it is that I'm writing.
An added benefit is that I've found I can look at my own or others' unit tests and really understand what was going through their mind when they first wrote the method that I am expanding. It's quite interesting, but I'm sure you are a supreme being like Knuth and don't bother with such trivialities.
"I'm a highly productive individual"
Here's a question: how much time do you spend working out what happened when your code breaks? TDD is a trivial amount of time compared to that. I am concerned about my software in the present and future. I wish others were also.
"What counts is that when you run the unit tests, they pass, and that they accurately test the conditions that need testing."
I disagree with you. I run unit tests that fail all the time--on purpose.
I know it will most likely result in a swift abrasive response but I implore your highness to really spend some time understanding how unit tests can help the really stupid coders (like me).
My work here is dung.
There is. It's called WEB, and, while it was originally written for Pascal code, has been extended to a variety of languages. Derivatives exist in several more languages, and more general systems for any language you like.
Incidentally, guess who's the author. A hint, you just read an interview with the guy.
"Lisp
Unit testing has its place when designing bits of code for the first time, but experienced programmers will know how to deal with things like (for instance) missing parameters on returned web forms, passed-in classes that have not been initialised, and possible arithmetic overflow/divide by zero.
The commonest problems that I see in the field are:
I suspect that anybody intelligent enough to have ideas about how code should be documented and tested that make any kind of sense, is also likely to produce readable and testable code. People without those skills will not use testing and documentation efficiently.
At the end of the day there is no substitute for designing a proper end of the line test suite, and applying it to the application as a whole, with sufficiently good error handling that bugs can be traced.
From scarped cliff or quarried stone she cries "A thousand types are gone, I care for nothing, no not one."
It's also out of favor because of how much of the real world of programming works. My very small company does a lot of work for a very, very large company. At my small company, we have one layer of management - the owner of the company. Everyone else is in the level of "not an owner of the company."
At the large company, there are a multitude of layers of management. Any software they build requires extensive specifications and documentation far in advance of laying down any code. They decide all the aspects of the software before it's written. At my company, the boss just gives us a general outline of what he's thinking about and asks us to feel out the idea. We use a RAD environment and will often have a first iteration within a week. This version tends to get completely rewritten, sometimes multiple times. We do not document any of this in advance because the usable version may differ so much from the original ideas.
At the large company, their projects tend to take years and years, go far over budget and typically are much less useful than they had originally hoped. As a bonus, they are usually bug-ridden and unstable. Many times they just eventually get canceled by the new layer of management, who then get awards for this "cost saving measure." At my company, our projects are typically finished far in advance for a tiny price. They are typically of very high quality, with very minor bugs which we fix rather quickly.
This large company frequently hires our company to build software rather than trying to do it internally. They are usually amazed at the things we can do.
Something like "literate programming" is completely anathema to how our company works. If we started trying to write specifications in advance of figuring out what product our clients actually want (as opposed to what they think they want at the start of the process).
Now, I will state that our company only works because we don't hire idiots or slackers. Also, I am fully aware that this is not a good way to, for example, design nuclear power plant software or a baggage control system. But for businesses, all that documentation and "thinking" can just cloak the fact that the people building the software don't know what they are doing.
The summary sounds like it was written by the headline-producing monkeys at Fox, CNN -- or hell, at the Jerry Springer show. Donald Knuth is not "playing hardball." Nobody needs to call the interview "raw and uncut," or "unplugged."
The interview has almost nothing to do with unit testing and the little Knuth does have to say about the practice is hardly "ripping."
When will people stop sullying people's good names by sensationalizing everything they say?
Knuth is a well-respected figure who makes moderate, thoughtful statements. From the summary, you'd think he was a trash-talking pro-wrestler.
After initially being a proponent, I've come to the same conclusion about unit tests myself. I don't think they're worthless, but the time you spend developing or maintaining unit tests could be more profitably spent elsewhere. Especially maintaining.
That's my experience, anyway. I suppose it's pretty heavily dependent on your environment, your customers, and exactly how tolerant your application is of bugs. Avionics software for a new jet fighter has a different set of demands than ye olde "display records from the database" business application. More applications fall in the second category than the first.
Someone brings you a complaint ( bug report ).
Write test which produces output exhibiting bug.
Fix bug.
Test passes, as bug has been fixed -- provably.
Keep test around. If you're into the CB-thing, ( Continuous Build ) then you'll already have a testing infrastructure to examine reports, email the person who checked in the code which broke the build, etc and you *will* build every hour... If you're not, you have a test suite to ensure no future changes will regress the code to the old bug.
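The steps above might look like this with Python's standard `unittest`. The `average` function and the "crashes on an empty list" bug report are hypothetical, invented only to have something to regress against.

```python
import unittest


def average(values):
    """Hypothetical function from the bug report: it used to crash on []."""
    if not values:
        # The fix. Before it, an empty list fell through to the
        # division below and raised ZeroDivisionError.
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_bug_report(self):
        # Written first, to reproduce the report; it failed until the fix.
        self.assertEqual(average([]), 0.0)

    def test_normal_input_still_works(self):
        # Guards against the fix breaking the ordinary case.
        self.assertEqual(average([1, 2, 3]), 2.0)


if __name__ == "__main__":
    unittest.main()
```

Once the fix is in, the test stays in the suite, so any later change that brings the old bug back fails the build instead of reaching the person who reported it.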
Technology -- No Place For Wimps! Grateful Dead and Jerry Garcia Chatroom -- http://www.wemissjerry.org
WTF? You have to type shit like l | c || r | to tell LaTeX how many columns you want and how to format them? This is ugly, especially for large tables. |||||||||| my ass!
Example 2: Content mixed with Presentation
We know how stupid it was to write things like
In LaTeX it's even worse. No support for stylesheets and defining or even getting colors is a pain in the ass. Just look at that:
My Eyes!
Example 3: Compiler Warnings aren't helpful
Okay, I lied, no example here. But trust me: If you ever had difficulties interpreting the output of validator.w3.org you will hate LaTeX.
I really don't get why LaTeX is still so popular among students, especially with those who don't have to type formulas all the time. It's not like there are no alternatives. DocBook looks nice, for example.
And now you can mod me flamebait or even troll, I don't care. That's why I'm posting anonymously after all
Knuth doesn't go out of his way to promote his ideas or anything. The interviewer asks him about literate programming and he answers. I don't understand the reason for anti-Knuth zealotry. Anyone care to comment?
echo 'cat sig | sh' > sig
That's a mischaracterization of literate programming.
The whole idea of literate programming is to basically write good technical documentation -- think (readable) academic CS papers -- that you can in effect execute. What many people do with Mathematica and Maple worksheets is effectively literate programming.
It has nothing to do with what language you use, and is certainly not about making your code more COBOL-esque.
Maybe think of it this way: Good documentation should accurately describe what your code does. In literate programming, the computer code is just the "comments" you add to your documentation so that the computer can execute it.
See this post, for instance.
It was TDD that he was talking about rather than unit tests. And what he actually said was that he didn't find TDD appropriate for the sort of things he did.
But why let the facts get in the way of a sensationalised headline here on Slashdot, the respectable face of tabloid technology journalism!
Nonsense. The problem isn't with the programmers, it's with the languages. Writing object oriented code in Fortran is too difficult for most programmers, but that doesn't mean that the programmers aren't up to the task, just that the language they are using isn't well suited to the job.
Learn a little Erlang or Haskell to see how easy writing massively parallel programs can be. --MarkusQ
"my dualcore laptop really has no problem with that"
Moore's law says (well, indirectly at least) that machines from 2007 should be roughly 256 times as powerful as machines from 1995.
Somehow, the actual performance difference (starting the computer, starting a web-browser, editing text etc.) in running Win95 on hardware from its time, compared to running Vista on today's hardware, seems to be nowhere near a 256-times improvement.
I can only conclude that while the hardware industry has improved itself again and again, the software industry has eaten almost all of those improvements, instead of giving them to the users.
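The 256-times figure follows if you assume the popular "doubling every 18 months" reading of Moore's law. The poster does not state that doubling period, so it is an assumption here, as is the function name:

```python
def moore_factor(start_year, end_year, months_per_doubling=18):
    """Growth factor implied by one doubling every `months_per_doubling` months."""
    doublings = (end_year - start_year) * 12 / months_per_doubling
    return 2 ** doublings


print(moore_factor(1995, 2007))        # 12 years = 8 doublings
print(moore_factor(1995, 2007, 24))    # with a 24-month doubling period instead
```

Twelve years at 18 months per doubling is eight doublings, i.e. a factor of 256. With a 24-month period the same span gives only six doublings, a factor of 64, so the exact number depends heavily on which version of the "law" you pick.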
I expected much, much more from Knuth than what I've just seen in that interview and after reading the design of MMIX.
Knuth dismisses multi-core and multi-threading as a fad and an excuse by manufacturers to stop innovating. I'm amazed someone of his intelligence has managed not to read up on exactly WHY this is happening:
So he dismisses the technical problems that manufacturers have been falling over for the last few years as merely a lack of imagination. No - parallelism is here to stay, and people need to realise it rather than just wishing up some magical world where physics aren't a problem.
He dismisses multi-threading as too hard. It isn't, if you're not unfair to the concept. Nobody is getting 100% out of their single-threaded algorithms. You always have stalls due to cache misses, branching, the CPU not having exactly the right instructions you need, linkage, whatever. Nobody EXPECTS you to get 100% of 1 CPU's theoretical speed. So why do people piss all over multi-core/multi-threading when it doesn't achieve perfect speed-ups?
If you achieve only a 50% speed-up using 2 cores compared to 1, you've done a good job, in my opinion. That means you could have a dual-core 3GHz CPU or a single-core 4.5GHz CPU. Spot which of those actually exists. Getting a 25-50% speed-up from multi-core is easy. The 100% speed-up is HARD. If you stop concentrating on perfection, you'll notice that multi-threading is a) actually not hard to implement, and b) worthwhile.
Then there's MMIX. Knuth thinks that simplicity has to work all the way down to the CPU design. Yes, but not simplicity by way of having instructions made up of an 8-bit opcode and three 8-bit register indexes. A CPU doesn't give a crap how elegant that looks. It's also BAD design - 256 registers makes for a SLOW register file. It'll either end up being the slow critical path in the CPU (limiting top clock speed) or taking multiple cycles to access. There's also no reason to have 256 opcodes. He should have a look at ARM - it manages roughly the same functionality with far fewer opcodes.
It almost pains me to see the MMIX design and how it's a) not original, b) done better in existing systems already on the market, e.g. ARM, and c) doesn't solve any of the performance limit problems he complains about. What's going on with Knuth?
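One way to put numbers on this is Amdahl's law. The post doesn't name it, so treat this as a simplified model brought in for illustration; it ignores cache effects and synchronisation overhead. A minimal Python sketch, with function names invented for the example:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speed-up predicted by Amdahl's law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def parallel_fraction_for(speedup: float, cores: int) -> float:
    """Invert Amdahl's law: which parallel fraction yields this speed-up?"""
    # From 1/s = (1 - p) + p/n it follows that p = (1 - 1/s) / (1 - 1/n).
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    # A 50% speed-up on 2 cores needs about two thirds of the work in parallel.
    print(parallel_fraction_for(1.5, 2))
    # The 3GHz dual-core then matches a single core clocked 1.5x higher.
    print(3.0 * amdahl_speedup(2.0 / 3.0, 2), "GHz single-core equivalent")
```

Under this model the "easy" 25-50% range corresponds to having somewhere between 40% and two thirds of the run time parallelised, while a full 100% speed-up on 2 cores needs the serial part to vanish entirely.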
C and C++ only seem to be declining because they are the de facto standards and everyone knows how to use them. There is no need for news articles, blog entries, etc. They may not be hip but they power everything and are more used than anything else.
You can not have heard of Knuth without also hearing of literate programming. Well OK, you can't have read his work anyway.
Remember that MMIX is not designed to be a practical hardware computer architecture. It's designed to illustrate algorithms written in assembly language. It's optimized for humans to read and write, not for computers to execute quickly. I'm glad that he's keeping assembly as part of his books, and that he's updated them to a 64-bit RISC architecture. Reading MMIX assembly programs is the closest to hardware that some readers will ever get, so he has one chance to show those readers how computers actually work. It had better be as simple as possible for people to understand.
What a fool believes, he sees, no wise man has the power to reason away.
I think you're missing the point of MMIX. It was never intended to be an example of great processor design. It was intended to be a construct for teaching how to program, without being an Intel, or ARM, or Motorola, with the way these "lock you in" to the patterns that make most sense for that platform.
...which the compiler can't discover, because foreach describes a mechanism (looping through a sequence, in order), and not a high-level transformation.
Compare foreach with map. Map is a higher order function that takes a function and a collection, and results in a collection of the same size and structure as the original, but with each element replaced by the result of applying the supplied function to it. Note that the value of each element in the result depends only on the corresponding element of the input. It's trivial to parallelize map.
You can parallelize map easily because it has a favorable contract that specifies the relationship between its inputs and its outputs, and it just so happens that this contract is amenable to parallel execution. A smart compiler, upon seeing a use of map, can trivially tag it as a parallelism candidate.
But since foreach specifies a sequential looping mechanism, there are no guaranteed relations between the input and output (in fact, not even any simple way to determine what should be treated as inputs and outputs). When you write a foreach loop to perform the equivalent of a map, you're underspecifying the transformation you're performing on your collection, and overspecifying the mechanism. That's bad programming.
You mention Parallel LINQ, and this is very relevant. LINQ is based on operations similar to map, that transform sets into sets. LINQ queries, since they abstractly describe the relation between an input and the desired output, can be executed in a number of ways: (a) the system can translate them into SQL queries and send them to a database server to execute; (b) the system can execute them serially; (c) the system can execute them in parallel.
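To make the map/foreach contrast concrete, here is a small sketch in Python. The helper names (parallel_map, running_total, square) are invented for the example, and it uses a thread pool from the standard library, which is not what a parallelising compiler would do. Note also that CPython threads won't actually speed up CPU-bound work, so this only illustrates the contract, not the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x: int) -> int:
    return x * x


def parallel_map(fn, values, workers: int = 4):
    """map-style transform: each output depends only on its own input,
    so the calls can be handed to workers in any order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns the results in input order.
        return list(pool.map(fn, values))


def running_total(values):
    """foreach-style loop: each step reads state left by the previous
    step, so the iterations cannot simply be run independently."""
    total = 0
    out = []
    for v in values:
        total += v
        out.append(total)
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(parallel_map(square, data))
    print(running_total(data))
```

The point is that parallel_map can be swapped in for the built-in map without the caller noticing, because the contract says nothing about evaluation order. Nothing in the loop body of running_total tells a compiler whether the same would be safe there.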
Are you adequate?
If you define "bug" to mean "unexpected undocumented behaviour", as Knuth seems to, then it's not surprising that there have been very few bugs claimed, since TeX is so very well documented.
But most people -- and certainly the majority of open source projects these days -- define "bug" as "undesirable behaviour"; and by that standard, TeX is chock full of bugs. To pick a couple of obvious examples: incorrect handling of ASCII 0x22 quotation marks, and treating "etc." as the end of a sentence. These aren't "bugs" to Knuth since the incorrect behaviour is well documented, but by many people's standards they are.
What's purple and commutes? An Abelian grape.
Donald Knuth said, "Get off my lawn!"
I can see why Knuth doesn't like unit testing. Reading his books you see that you really need to understand your algorithm properly and identify all key sections. You need to spend a bit of time sitting down with pen and paper and drawing the flow; derive what happens at the limits rather than guess or 'have faith in the gods'. By designing your algorithm properly, the unit test is kinda implied.
Donald Knuth does not rip on unit testing. He says unit testing is of no value to him. That's a completely different statement.
Unit tests, strictly speaking, are only the beginning. You're also going to want functional tests, and in general, you're going to want to test everything you can, unless you have a good reason for believing it doesn't need testing -- for example, you probably don't need to test things that are built-in to the language; not only is it probably better tested than any code you could write, but there's not a lot you could do if it somehow failed.
And after a "unit" of code is written "correctly", you may not be done. You may still refactor it at some point, or rewrite it with a different algorithm, and unit tests help you do that without having to worry as much about whether you broke it. More importantly, unit tests (and functional tests) also mean that you avoid the situation where you change some completely unrelated code, and it breaks your test.
They're not going to protect you 100% of the time, but they don't have to. Any bug they catch is probably going to save you enough time to be worth it -- especially because it will catch that bug early.
And then there's test-driven design -- first write a test to describe how the system is supposed to work, and then write the code that passes that test. Here, it helps you know when you're done with that particular piece of code -- that is, it prevents you from writing too much at once.
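As a rough illustration of that test-first order, here is a tiny Python sketch using the standard unittest module. The clamp function is a made-up example, not anything from the interview. The test class is written first and fails until the function below it exists.

```python
import unittest


# Step 1: written first. These tests describe how clamp() is supposed
# to behave before any implementation exists.
class ClampTest(unittest.TestCase):
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_value_below_range_is_raised_to_low(self):
        self.assertEqual(clamp(-3, 0, 10), 0)

    def test_value_above_range_is_lowered_to_high(self):
        self.assertEqual(clamp(42, 0, 10), 10)


# Step 2: written second. Just enough code to make the tests pass;
# once they are green, this piece is done.
def clamp(value, low, high):
    return max(low, min(value, high))
```

Saved as a file, this can be run with `python -m unittest <file>`. Anything not demanded by a test (say, handling low > high) is deliberately left out until someone writes a test for it.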
Don't thank God, thank a doctor!
Seastead this.
Remember that MMIX is not designed to be a practical hardware computer architecture. It's designed to illustrate algorithms written in assembly language. It's optimized for humans to read and write, not for computers to execute quickly. I'm glad that he's keeping assembly as part of his books, and that he's updated them to a 64-bit RISC architecture. Reading MMIX assembly programs is the closest to hardware that some readers will ever get, so he has one chance to show those readers how computers actually work. It had better be as simple as possible for people to understand.
But that's the problem - it's illustrating an architecture that's impractical. It gives the false impression that you don't have to worry about register spillage, a slightly non-orthogonal instruction set or most of the other issues that real world CPUs have to deal with. That's exactly the point of teaching people about a CPU in the first place, and unfortunately MMIX hides you from it.
And worse still - he DOES keep referring to MMIX whenever he talks about how multi-core or multi-threading is bunk.
This is circular reasoning at its finest. The main reason the programs that people write and use aren't written in a style to which a language such as Haskell or Erlang is suited is that the people who programmed them wrote them in a style more suited to the languages in which they were written.
There's nothing about the tasks themselves that make them unsuited for massively parallel solutions; the actual dependency and sequences in the problems (as opposed to in the programs that are currently used to solve them) are few and far between.
For a few examples:
It may be easier to turn the question around: what exactly are the "kinds of programs that people generally write and use" that couldn't benefit from massively parallel solutions because the problems are full of "dependencies and sequences"?
--MarkusQ
> He pitches his idea of "literate programming" which I must admit I've never heard of...
Wow. Simply, wow.
Before you design for reuse, make sure to design it for use.
Those might be valid criticisms if his books were about assembly language programming or computer architecture. But they are not. They are about algorithms, and his MMIX programs illustrate how those algorithms are implemented by a computer in terms of the instructions it executes, and the registers, cache, and memory. If a reader wants to learn about real-world CPUs, they can read Hennessy and Patterson.
What a fool believes, he sees, no wise man has the power to reason away.
I think most people who post here don't know what literate programming is. It's more like writing a textbook explaining how your code works, but you can strip away the text and actually have runnable code. This code can be in any language of your choice. It makes sense from Knuth's point of view, but for many of us, we don't write textbooks for a living.
Knuth also doesn't need unit testing because he probably runs (or type checks) the program in his head. Again, for most of us, seeing the program run provides additional assurance that it works. Unit tests also provide a specification of your program. It doesn't have to be just b = f(a). For example, if your code implements a balanced binary search tree, a unit test could check the depth of all subtrees to make sure the tree is balanced. Another unit test would check if the tree is ordered. You can prove by the structure of your program that these properties hold, but a layman doesn't want to write proofs for the code he writes, so the second best alternative is to use unit tests.
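A sketch of what those two property checks might look like in Python. Everything here is an assumption for illustration: the Node class, the build_balanced helper, and "balanced" taken to mean the AVL-style rule that sibling subtrees differ in height by at most one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_balanced(sorted_keys: List[int]) -> Optional[Node]:
    """Build a tree from already-sorted keys by recursive halving."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2
    return Node(sorted_keys[mid],
                build_balanced(sorted_keys[:mid]),
                build_balanced(sorted_keys[mid + 1:]))


def height(node: Optional[Node]) -> int:
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def is_balanced(node: Optional[Node]) -> bool:
    """Check every subtree: sibling heights may differ by at most one."""
    if node is None:
        return True
    return (abs(height(node.left) - height(node.right)) <= 1
            and is_balanced(node.left)
            and is_balanced(node.right))


def in_order(node: Optional[Node]) -> List[int]:
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


def is_ordered(node: Optional[Node]) -> bool:
    """An in-order walk of a search tree must be strictly increasing."""
    keys = in_order(node)
    return all(a < b for a, b in zip(keys, keys[1:]))
```

A unit test would then build a tree through the real insert/delete code and assert both is_balanced and is_ordered afterwards. Neither check says anything about how the tree got that way, which is what makes them a specification rather than a b = f(a) comparison.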
About parallel programming, Knuth is actually right. Many high-performance parallel programs are actually very involved with the underlying architecture. But we can write a high-level essentially-sequential program that uses libraries to compute things like FFT and matrix multiplication in parallel. This tends to be the trend anyways.
I once had a signature.
Remember that some things in engineering and geophysics are still fully CPU bound and that even a 1% improvement saves you more than an hour per process per week - if it's only running a dozen times that saves twelve hours. With 3D graphics, a fairly small improvement from a clever trick or a fast algorithm can also be the difference between whether the thing can be done acceptably on the hardware or not. With encryption it makes a big difference, especially if you are doing something like talking to a heart pacemaker that only has a Z80 to run the code on (an example from a far better programmer than myself). Then there's the increasingly important mobile device space - they are getting more capable but still don't have fast hardware or a lot of memory. In a lot of situations the library cannot be treated as a black box and may be designed to run in different conditions to what you want.
Uh, if you've never heard of literate programming & you're surprised by what Knuth said, it's because you're not listening very closely.
-Bill
SlashSig Karma: Excellent (mostly affected by moderatio
I've never seen any interesting and useful software that is ever "finished". You always need to add and change things, and there is always far more functionality wanted than you can produce.
If you clean up and refactor as you go, rather than at "the end", what you have described is Agile/XP development.
If you have to continuously re-run your tests then something is wrong, even with TDD.
You write tests up front. Fine. For this you need to know the API and/or the interfaces you are working against. This is normal and good.
You make those tests reasonable and small, and functional. That's good too.
Then you develop your feature/functionality.
Then when you've hacked away for some hours and think it's ready, you re-run the tests.
If something is broken, you fix it.
Otherwise it works as it should.
Yes, TDD is good and you need to run the tests, but don't be anal about it. Running the tests too often (like your 200 or even 50 times a day) is just a waste of time.
What a very distasteful article which seems only about bragging rights about his own very experimental system. Only the last paragraph seems to be getting to the point. The system this guy is devising does make an interesting read, but it unleashes many questions about the feasibility of such a system. It's certainly not something that AMD or Intel can currently adopt as a strategy. Actually, it seems a bit about bragging rights about -your- very experimental system. In the kill file with you.
Why do you think Knuth's work should be relevant to getting your day-to-day tasks done as a programmer? Ever tried to read any volume of Knuth's "The Art of Computer Programming"? I tried as a CS/Engineering student at university and found them very esoteric. I've been a professional programmer for 10+ years and have never used, or heard of anyone else using, his books as a reference. "If you think you're a really good programmer. . . read [Knuth's] Art of Computer Programming.... You should definitely send me a resume if you can read the whole thing." -Bill Gates
nuf said..
That presents him as unqualified to have a convincing opinion on parallelism.
There is a time dimension to the unit test. You start with the tests that look like they meet requirements and don't cost too much to write. Over time, as users of the code complain about broken boundary cases, the unit tests are upgraded. Eventually, every boundary case that anyone has demanded your code handle correctly is covered.
At this point, marketing will decide it is time to completely rewrite the product.
If you're developing a library that other people depend on, then unit testing is invaluable. It's difficult to know in advance which ways other people will try to abuse your code. Unit testing (including all the obvious boundary cases) makes for much more reliable code. The unit tests are also handy when the code needs to be re-factored - which will happen eventually, to pretty much any code that is used for more than one application.
Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana