Slashdot Mirror


Literate Programming and Leo

jko9 writes "First proposed almost 20 years ago by Donald Knuth, the idea of Literate Programming is basically that of making program documentation primary, and embedding code in the documentation, rather than vice versa. Despite some obvious advantages apparent to anyone who has struggled to understand a poorly documented program, literate programming never really caught on. That all could change, however, with the release of a new program called Leo, written by Edward K. Ream. Leo supports standard literate programming languages like noweb and CWEB, but with a crucial difference - Leo adds outlines. The effect is striking: overall organization of a program is always visible and explicit. Much of the narrative of the documentation gets placed in the outline, making documentation simpler, and allowing viewers to approach the code at various levels of detail. Screenshots and tutorials for Leo are here - if that site gets slashdotted, you can download the visual tutorials in .chm form or html form from Leo's Sourceforge site. Leo is an open source program written in Python. Any current practioners of Literate Programming techniques out there? People who have tried it and given it up? Can the addition of outlines to Literate Programming make it more powerful / popular?"

7 of 358 comments (clear)

  1. Literate Programming by bigjocker · · Score: 4, Interesting

    My previous employer had a strict rule concerning code: you first write the JavaDoc for all the project, then implement it. It's useful as hell ... and if you mix that with UML design before the documentation, its a killer technique.

    --
    Life isn't like a box of chocolates. It's more like a jar of jalapenos. What you do today, might burn your ass tomorrow.
  2. Programs as flat text files - why? by Animats · · Score: 4, Interesting
    It's wierd, when you think about it, that programming is still done in flat text files. Almost nothing else is still done that way. One could argue for programs in HTML, with the code bracketed in XML so that the compiler could find it.

    Few systems even allow multiple fonts in program text, although the original Bravo editor for the Xerox Alto did.

  3. Curing unmaintainable code by gwernol · · Score: 5, Interesting

    Roedy Green has written an excellent, humorous online article on writing unmaintainable code. This relates directly to Literate Programming, especially Roedy's points about maintaining existing code. He writes (here): "[the maintainence programmer] views your code through a toilet paper tube. He can only see a tiny piece of your program at a time. You want to make sure he can never get at the big picture from doing that. You want to make it as hard as possible for him to find the code he is looking for. But even more important, you want to make it as awkward as possible for him to safely ignore anything. "

    Literate programming in general, and Leo in particular, would be the ultimate cure for this. It allows you to easily navigate between multiple levels of description of a program. This is critically important if you are coming fresh to an existing piece of code. You need to constantly cross-reference the high-level design and low-level implementations (and the various levels of description between these extremes).

    --
    Sailing over the event horizon
  4. The right balance by teetam · · Score: 4, Interesting
    Too much documentation is just as bad as too little documentation, even when the documentation is good. It is very difficult to strike a balance.

    For example, many of the core java apis are well written and well documented. If you see the HTML javadocs, you can get a pretty good idea of the class.

    However, when you open the source code of the same class, it is not good looking anymore. Why? Because each method is preceded with dozens of lines of javadoc, each of which is embedded with HTML markup. That is good when the javadoc HTML pages are finally generated, but not so good when you look at the source itself. C# is worse with its XML based documentation!

    When I look at the source code, I want to see the flow of the code easily. All the documentation in the source should only aid this and not hinder this. Javadoc does both. The explanation part of the javadoc can be very useful in understanding what the author's intent was when he/she wrote the method, but I am not so sure about the rest. The param, return and exception tags are no doubt useful, but often developers don't explain these very well. Plus, these are the tags that can easily become outdated.

    I would prefer short and succint pieces of information documenting the code, preferrably close to the line of code that it documents.

    --
    All your favorite sites in one place!
  5. Literate programming versus continuing development by Phronesis · · Score: 5, Interesting
    Although literate programming has a lot of potential, all too often literate projects become completely ossified. M.D. McIlroy's criticism of Knuth's literate programs (CACM 29, 471-83 (1986)), that they tend to be like "industrial strength Faberg eggs" as opposed to reusable tools, is still quite valid.

    For a project I am working on, I needed to extend CWEB to do some things Knuth hadn't thought of, and I found that excessive cleverness in the data structures made it much more difficult to extend than it should have been, so that Knuth could demonstrate clever data structures that probably add a few percent to the performance over what he could have achieved with more prosaic ones (Knuth does not document why he made these excessively clever design choices, nor whether the performance advantages they offer were significant).

    Similarly, a recent thread on comp.text.tex recently asking about the extensibility of TEX produced a number of comments from those who know about how unextensible and unreusable TEX really is.

    So, while I use literate programming (CWEB) to document a lot of my own code, I don't believe in all these years, that I have ever seen a good example of literate-programming that looks towards the future (refactoring, extending, reusing) as opposed to generating a fossil with that comes with a good story of its life and times.

  6. Re:Been there, done that by RelentlessWeevilHowl · · Score: 4, Interesting

    IBM's Visual Age for Java used something similar, adapted from their Visual Age Smalltalk. My problem with VAJ was that you couldn't do anything in their environment except what they had specifically designed for you to do. If you have files in disk, you can run whatever tools you want on them. But in VAJ or Visual Studio .NET? "I dunno, what's in the context menu?"

    To avoid flat text files, you'd need an interactive scripting language powerful enough to perform any task you'd care to think of (viz., Emacs). Plus you'd need enough support libraries available to you to interact with third-party utilities, and finally bindings for the abstract syntax trees of all the languages you want to program in, so you could manipulate them programatically.

  7. The creator's view of Leo by edream · · Score: 5, Interesting
    Hi. I am the creator of Leo and I'd like to say here what my own view of Leo is. Joe Orr has contributed greatly to Leo, and I would not characterize Leo exactly as he did in his original article. In this posting I hope to clear up misconceptions about what Leo is, what it can do, and the relationship of Leo to literate programming.

    I would like to distinguish between the techniques of literate programming and the practice of literate programming (LP) as it has always been done before Leo (traditional LP). The key technique of LP is what might be called "functional pseudocode." For example, here is a fragment of code that can be written in Leo:

    def spam():
    done = false ; result = None
    while not done:
    << do something complicated >>
    return result
    The line: << do something complicated >> is a section reference. It works pretty much like a macro call. In particular, the code in the defintion of << do something complicated >> has access to the done and result variables. This is almost the entire content of noweb, one form of literate programming. It turns out that this technique can be extremely useful, as simple as it seems. Leo creates one or more "derived" files from an outline automatically when the outline is written, and Leo can update the outline from changes made to derived files when Leo reads the outline.

    In contrast to the technique of literate programming, the practice of traditional LP has focused on the central role of comments, and lots of them. Here is where Leo radically parts company with the LP tradition.

    One's view of the proper role of documentation in a project hardly matters to Leo. You are free to use comments as you always did, though you will probably find that LP as implemented in Leo helps you out in unexpected ways. I discuss at length and in great detail the relationship between traditional LP, comments and Leo here. In short, discussions about the role of comments in programming (literate or not) do not get to the heart of Leo.

    In fact, Leo often reduces the need for comments. Indeed, it is good style to organize Leo outlines like a reference book. Well-designed Leo outlines act both like self-updating tables of contents and self-updating indices. This is in marked contrast to the "stream-of-consciousness" or "narrative" style typically employed in traditional literate programming.

    In my view, the essence of Leo is this: Leo makes outline organization the most important part of a program or a project. Both code and documentation could be considered secondary. At every moment, the overall big picture of a function, class, module, file or project is always at hand. Moreover, Leo makes outlines structure a part of the computer language. For example, I often define a Python class as follows:

    class myClass:
    << declarations of myClass >>
    @others

    The @others directive acts as a reference to all the text in all the outline nodes which are descendents of the node containing this class declaration. Such nodes are copied to the output (derived) file in the order in which they appear in the outline. The reference << declarations of myClass >> ensures that those declarations precede the methods. There are several other ways that outline structure is important in Leo; I won't discuss them here.

    Leo fully exploits the organizational power of outlines. A single outline typically organizes an entire project. Outlines can handle large amounts of data with ease. Moreover, it is possible to clone any part of an outline so that changes to one clone affect all other clones. This is feature makes it possible for a single outline to contain multiple views of a project. For example, when fixing a bug, I clone all nodes related to the bug and gather them in a new part of the outline, called a task node. This task node effectively becomes a view of the project that focuses exclusively on the bug. Any changes I make to code are propagated to all other clones.

    Earlier I mentioned that a well designed Leo outline acts like self-updating tables of contents and self-updating indices. Tables of contents you get for free: an entire outline is the table of contents. Clones create self-updating indices. For example, each task node acts like the index entry for that particular task.

    - Edward K. Ream