Literate Programming and Leo

← Back to Stories (view on slashdot.org)

Posted by ryuzaki0 on Wednesday August 28, 2002 @04:59AM from the pod-comparisons-inevitable dept.

jko9 writes "First proposed almost 20 years ago by Donald Knuth, the idea of Literate Programming is basically that of making program documentation primary, and embedding code in the documentation, rather than vice versa. Despite some obvious advantages apparent to anyone who has struggled to understand a poorly documented program, literate programming never really caught on. That all could change, however, with the release of a new program called Leo, written by Edward K. Ream. Leo supports standard literate programming languages like noweb and CWEB, but with a crucial difference - Leo adds outlines. The effect is striking: overall organization of a program is always visible and explicit. Much of the narrative of the documentation gets placed in the outline, making documentation simpler, and allowing viewers to approach the code at various levels of detail. Screenshots and tutorials for Leo are here - if that site gets slashdotted, you can download the visual tutorials in .chm form or html form from Leo's Sourceforge site. Leo is an open source program written in Python. Any current practioners of Literate Programming techniques out there? People who have tried it and given it up? Can the addition of outlines to Literate Programming make it more powerful / popular?"

3 of 358 comments (clear)

Min score:

Reason:

Sort:

Re:Inline Documentation is evil by gwernol · 2002-08-28 05:44 · Score: 5, Insightful

If your code requires massive documentation within the code to make it understandable, then your code likely needs to be rewritten.

I think you're missing the point. All code can be described at several different levels. At the highest level, you might describe a program as (for example) "an online banking application", which is a complete description of the app. However there are obviously a lot of details below this level of description :-)

Different people need to understand a program at different levels of description. The CEO may only need to know the highest level description. At the other end of the spectrum, someone working on the optimal algorithm for maintining user session should be isolated from the implementation details of other parts of the program. The architect should be concentrating on the interconnection of modules within the code, not the implementation itself.

The code itself is good at describing some levels of description and very poor at describing others. The example you give doesn't need any documentation to understand what those two lines do, but it will need documentation to understand their relevance to the higher levels of the system.

Programmers tend to see the details and often miss the larger context. This can lead to unstated and often false assumptions about what role the code fulfills and how it interacts with the rest of the system These are the hardest bugs to find and fix.

There are many ways to solve this "levels of description" problem. Inline documentation is one very valuable tool. Of course it shouldn't be:

// Adds two numbers together
a = b + c;

It should describe the functional role of the code in relation to the higher-level components of the system.

As you point out, abstraction and encapsulation are good mechanisms for constructing higher-level descriptions of functionality. Why stop there? Why not try to build up beyond these levels as well? Perhaps we will evolve to high-level languages that can express these high-level designs. Until then inline docuemntation and literate programming are excellent tools to help you achieve these goals.

--
Sailing over the event horizon
Re:Inline Documentation is evil by Viking+Coder · 2002-08-28 05:51 · Score: 5, Insightful

I can't tell what your code should do if it can't find a person named Harry.

I can't tell what your code should do if it finds multiple people named Harry.

I can't tell how to use your code to find a person whose name requires Unicode to represent it.

I can't tell if .name returns a char * that I'm supposed to free or delete [], if it returns a const char *, if it returns a string that I can modify but won't modify the original Person, if it returns a string reference which I can use to modify the original Person's name, if it returns a wstring reference which I can use to modify the original Person's name, if it returns a const string reference, or if it returns a const wstring reference, or if it uses some other string representation like a Qt one, or some custom one - heck, it could even use an MFC-style CString.

I don't like that the function you've called is named "findPerson" - wouldn't it be far better to call it something like "findPersonByFirstName"? Or "findFirstPersonWithFirstName"? For that matter, why am I calling "Person::findPerson"? Isn't that slightly redundant? Wouldn't "Person::find" be just as clear, and less verbose? Therefore, the function should be something like "Person::findFirstWithFirstName". Wouldn't that be much more highly documented than what you've got?

While we're on it, if it is returning the "first", by which method is it sorted? Shouldn't I be able to pass in a parameter which describes the order in which I want the results returned? And shouldn't you get an iterator instead of a reference, anyway?

Back to "name", is that their entire given name? Is it a nickname? Is it in last-name first format? Is there some additional identifier in the name if two people have the same name?

And I still don't know if I'll get a special Person which is supposed to be a Non-Person, if it can't find "Harry", or if this is going to throw an exception.

I don't like that your code uses a hard coded-value, "Harry".

I don't like that your code has the variable "p". Granted, you've got a pretty amazingly short scope in your example, but code tends to grow. It would be better if the variable had a slightly longer name.

There are all sorts of things to nit-pick about, that a new coder could be confused about, or bugs which might be on the verge of instantiation, even in code as simple as yours.

But my real point is this :

If I've just walked in to your code, I don't know what behavior it's SUPPOSED to have, since you haven't documented that. All I can tell is what it DOES do. And since code changes over time, it's impossible for me to distinguish between the two, unless you document it.

--
Education is the silver bullet.
Re:Just giving it a name... by gorilla · 2002-08-28 06:06 · Score: 5, Insightful

That's not overcommenting, that's commenting wrong. You should be commenting why you are doing something, not what the code does.
// Default Minimum to be same as Maximum
min = max
// We have finished this data cell, Move onto next data cell
i++;
Is good commenting, even though it's the same number of comments.