Slashdot Mirror


Literate Programming and Leo

jko9 writes "First proposed almost 20 years ago by Donald Knuth, the idea of Literate Programming is basically that of making program documentation primary, and embedding code in the documentation, rather than vice versa. Despite some obvious advantages apparent to anyone who has struggled to understand a poorly documented program, literate programming never really caught on. That all could change, however, with the release of a new program called Leo, written by Edward K. Ream. Leo supports standard literate programming languages like noweb and CWEB, but with a crucial difference - Leo adds outlines. The effect is striking: overall organization of a program is always visible and explicit. Much of the narrative of the documentation gets placed in the outline, making documentation simpler, and allowing viewers to approach the code at various levels of detail. Screenshots and tutorials for Leo are here - if that site gets slashdotted, you can download the visual tutorials in .chm form or html form from Leo's Sourceforge site. Leo is an open source program written in Python. Any current practitioners of Literate Programming techniques out there? People who have tried it and given it up? Can the addition of outlines to Literate Programming make it more powerful / popular?"

23 of 358 comments

  1. Finally by maf212 · · Score: 0, Interesting

    It's nice to see people trying to help out with slashdotting.
    Maybe we can get other posters to get a few backup links in their posts to try to alleviate the load on these poor sites.

    --
    --Note to self. Add witty sig here, someday...
  2. Literate Programming by bigjocker · · Score: 4, Interesting

    My previous employer had a strict rule concerning code: you first write the JavaDoc for the whole project, then implement it. It's useful as hell ... and if you mix that with UML design before the documentation, it's a killer technique.

    --
    Life isn't like a box of chocolates. It's more like a jar of jalapenos. What you do today, might burn your ass tomorrow.
    1. Re:Literate Programming by bigjocker · · Score: 3, Interesting

      In that scenario (and in my school's freshman CS classes, which is where I got the idea), what would be useful would be a utility that parses valid JavaDocs, and outputs a subsequent Java class with all of the data members declared, and the methods stubbed out, like reversing the javadoc util.


      Try XDoclet for that. It's still in beta, but a lot of people (including me) use it for production.

      --
      Life isn't like a box of chocolates. It's more like a jar of jalapenos. What you do today, might burn your ass tomorrow.
  3. Programs as flat text files - why? by Animats · · Score: 4, Interesting
    It's weird, when you think about it, that programming is still done in flat text files. Almost nothing else is still done that way. One could argue for programs in HTML, with the code bracketed in XML so that the compiler could find it.

    Few systems even allow multiple fonts in program text, although the original Bravo editor for the Xerox Alto did.

    1. Re:Programs as flat text files - why? by GusherJizmac · · Score: 3, Interesting

      Because it works. It is a logical and physical way to break up your code. Why else would it be in use for almost the entire existence of programming? Also, you say "Almost nothing else is still done that way". HTML is done in flat files. You just break it up according to however you want. XML files are just "flat text files" when you get down to it. The few things that aren't "flat text files", are binary proprietary formats to the detriment of everyone. MS-Word isn't a flat text file, and as such, it's very difficult to read.

      And furthermore, what does putting code in XML give you that you can't do now? Why do you need different fonts? Fonts are for layout and presentation, not for communicating instructions to the computer. Most editors support syntax highlighting, which is all you need.

      --
      http://www.naildrivin5.com/davec
  4. i dont get it.. by Anonymous Coward · · Score: 2, Interesting

    what does leo do for me?

    it looks like the oldschool windows help browser with code samples pasted into it.

    I'm not trolling - I really want to understand how this makes for better code. And my employer's definition of better is faster/cheaper - they could give a rat's ass about structure and good documentation. They couldn't read a program design in English any better than they could in the most cryptic C syntax I can muster.

    Something like this could help a beginner or student break down code and learn to think logically, but unfortunately I had to move to the 'real world'..

    Sometimes I can't document something until I figure out how it's going to be done.. And I figure out how to do it by writing code that works. Then I document the code.

    So far this brand of rapid prototyping is the only thing that gets results fast enough to keep my bosses happy. They care not for proper technique and well-structured code and attention to detail at the design phase. 'Design' around here is no more than a vague definition of the problem to be solved. They just want it out the door.

    I'm sure I'm not alone.. How does leo help me?

  5. must use his nifty GUI ..... by Shaleh · · Score: 3, Interesting

    Yuck. Leo is a "nifty" GUI which helps you do the outline. As I comment on another thread -- we programmers like our text editors thank you very much. I am ok with a visualization program but not one which takes over my workflow.

  6. A good example: by El_Smack · · Score: 3, Interesting


    The main.cf config file of Postfix. Without the comments it's maybe 30 lines of actual settings. With comments it's 540 lines, and it's clear enough that a relative n00b like myself got it up and running in 1 hr with minimal trips to the website. Good documentation was a major factor in my picking Postfix over Sendmail. No dis to Sendmail, you understand. :)

    --


    There are 01 kinds of cars in the world. The General Lee, and everything else.
  7. Inline Documentation is evil by lkaos · · Score: 3, Interesting

    If your code requires massive documentation within the code to make it understandable, then your code likely needs to be rewritten.

    With most languages, the code itself is ample documentation. For instance:

    Person &p = Person::findPerson("Harry");

    cout << p.name() << endl;

    Is pretty self-explanatory. Anyone can tell the output of this code. It's not that programmers need more documentation, rather they need better abstraction and encapsulation (insert your favorite argument for object oriented programming here).

    --
    int func(int a);
    func((b += 3, b));
    1. Re:Inline Documentation is evil by starbirdman · · Score: 2, Interesting

      I agree with the overall point that you are trying to make. However, your main argument against this snippet of code seems to boil down to you don't know how the function is supposed to behave. Shouldn't that be commented on the function itself and not the function call?

    2. Re:Inline Documentation is evil by lkaos · · Score: 3, Interesting

      I can't tell what your code should do if it can't find a person named Harry.

      Good point. The code was a quick example. It likely would have expanded to include error checking if the item wasn't found.

      I can't tell what your code should do if it finds multiple people named Harry.

      Assume that the list is unique.

      I can't tell how to use your code to find a person whose name requires Unicode to represent it.

      And indeed you shouldn't know how. I don't see how commenting would help this situation. If the code snippet supported Unicode, then there would be special Unicode handling classes that likely would be explanatory.

      I can't tell if .name returns a char * that I'm supposed to free or delete [], if it returns a const char *, if it returns a string that I can modify but won't modify the original Person, if it returns a string reference which I can use to modify the original Person's name, if it returns a wstring reference which I can use to modify the original Person's name, if it returns a const string reference, or if it returns a const wstring reference, or if it uses some other string representation like a Qt one, or some custom one - heck, it could even use an MFC-style CString.

      Of course, this is C++ and therefore would return a std::string as all C++ programs should.

      I don't like that the function you've called is named "findPerson" - wouldn't it be far better to call it something like "findPersonByFirstName"? Or "findFirstPersonWithFirstName"? For that matter, why am I calling "Person::findPerson"? Isn't that slightly redundant? Wouldn't "Person::find" be just as clear, and less verbose? Therefore, the function should be something like "Person::findFirstWithFirstName". Wouldn't that be much more highly documented than what you've got?

      Again though, how would commenting help this? This only goes to prove my point that properly written code doesn't need commenting. It also reinforces the idea that commenting may lead to laziness on the part of symbol naming.

      While we're on it, if it is returning the "first", by which method is it sorted? Shouldn't I be able to pass in a parameter which describes the order in which I want the results returned? And shouldn't you get an iterator instead of a reference, anyway?

      You're assuming that the container is not unique. That is a bad assumption.

      I don't like that your code uses a hard coded-value, "Harry".

      Life's a bitch. Constants are only good if they are going to be used multiple times and represent some abstract concept. To have a constant HARRY or something similar would be silly.

      I don't like that your code has the variable "p". Granted, you've got a pretty amazingly short scope in your example, but code tends to grow. It would be better if the variable had a slightly longer name.

      There are a certain set of variables reserved for local semi-anonymous operations. For me, these are things like ptr, i, p, j, etc. It makes more sense to an experienced programmer to use variables like this since it is obvious that the variable isn't important.

      There are all sorts of things to nit-pick about, that a new coder could be confused about, or bugs which might be on the verge of instantiation, even in code as simple as yours.

      Why must we always write code to be indestructible by a "new coder"?


      If I've just walked in to your code, I don't know what behavior it's SUPPOSED to have, since you haven't documented that. All I can tell is what it DOES do. And since code changes over time, it's impossible for me to distinguish between the two, unless you document it.


      The code is the behavior it's SUPPOSED to have. The maintainability nightmare arises when there are two sources of behavior (i.e. a comment says code should be doing one thing while the code is doing something else). The code is always describing what the program's doing, whereas no one really knows what the comments mean or were meant to mean.

      Comments are inferior to code because they 1) are not syntactically verified by a compiler, 2) are not tested in any way, and 3) have no effect on runtime behavior.

      The real problem isn't that experienced programmers don't comment well enough, it's that beginner programmers expect comments to allow them to not learn the underlying language. A new-hire programmer is going to learn more (and be less productive in the short term) by reading code without any comments. In the long term, this translates to higher productivity. The question is: are we going to make this investment in our industry?

      --
      int func(int a);
      func((b += 3, b));
    3. Re:Inline Documentation is evil by Viking Coder · · Score: 3, Interesting
      It likely would have to be expanded to include error checking if the item wasn't found.

      That task would either have to be performed by the original coder, or by someone else. In either case, documentation would help. Something along the lines of:

      // TODO : error check if it can't find a person named Harry.

      Wouldn't you agree?

      Assume that the list is unique.

      Well, that would be a good thing to document, now, wouldn't it? Otherwise, when a new coder comes in, they'll be all paranoid about the possible existence of other People with the same first name. And if the requirements of your program change to encompass the possibility of multiple People with the same name, wouldn't it be good to have a comment along the lines of:

      // ASSUMPTION: assumes uniqueness of Person

      Granted, your code could be bloated to actually test all of these conditions in each use case - but I'm just asking for comments at the top of the Person class, for instance. I think it would be more useful to document in each function that you're making such an assumption.

      And indeed you shouldn't know how.

      I agree with another poster that you could potentially overload each function that takes a string to take both a string and a wstring, for instance, in order to handle Unicode input. What I was actually suggesting was that it would be better to call your function like this:

      Person::findPerson(L"Harry")

      Of course, this is C++ and therefore would return a std::string as all C++ programs should.

      Actually, I would argue that your function should return either a "const std::string&" or a "const std::wstring&", so that it's clear that you can't modify the output, and so that less useless byte-copying is performed. Granted, string is pretty light-weight, but it's a good coding practice to get into.

      Again though, how would commenting help this?

      Doing away with comments doesn't magically make existing code better. Many people have argued with me - saying that adding comments does make code worse. I think they're crazy. Code will always have mistakes, but documentation gives you insight into the mind of the coder like code cannot. Especially when you see something like "// FIX THIS" sprinkled around. =)

      This only goes to prove my point that properly written code doesn't need commenting.

      I would argue that by your definitions the only "properly written code" would be code that meets at least one of these two criteria:

      1. It was written by someone with total recall. (In other words, they could recall the initial requirements perfectly, and had no need to write them down for posterity.)
      2. It can be proven to contain no bugs.
      Since neither criterion is very common, I would argue that almost no code is "properly written". I use your initial snippet as an example. Even something as simple as that had, in my mind, many problems. And you even agreed with one of my complaints! Therefore, your code was not properly written! COMMENT IT!

      It also reinforces the idea that commenting may lead to laziness on the part of symbol naming.

      Bad habits will always exist. One good habit is documenting unfinished code. Another good habit is documenting the design of any code, and the expected results under outlier conditions.

      You're assuming that the container is not unique. That is a bad assumption.

      If you'd documented your code better, I would have realized that. That sounds like a communication problem between two coders. One way to address that (not "solve", but "address") is that each coder try to document their assumptions, where it makes sense to do so. "At least once" would be nice.

      Constants are only good if they are going to be used multiple times and represent some abstract concept.

      Or, if their value ever needs to be changed in the future. (Such as making it Unicode compliant.) Or if the existence of the constant itself needs to be documented. Or if the constant itself comes from an original source, such as a paper describing an algorithm, or requirements specifications. Or if the constant needs to be translated into multiple languages. Or if the behavior of the constant needs to be checked by regression tests. I could go on, but I think that I've shown that your statement was rubbish.

      There are a certain set of variables reserved for local semi-anonymous operations.

      Who reserves them? Oh, you do. What about every other coder who'll have to look at your code? Do they get reserved variables, too?

      If you've ever written code like this:

      for (int i=0; i<max_i; i++)
      { ...
      } ...
      for (i=0; i<max_i; i++)
      { ...
      }

      Then you're guilty of writing non-portable code. The variable "i" is neither reserved by the compiler, nor do all compilers check to make sure that "i" is properly in scope in the same manner.

      ...since it is obvious that the variable isn't important.

      I believe you meant to say "since it is obvious that the variable name isn't important."

      I kind of like the rule that the length of a variable name should be proportional to the log of the length of its scope. *shrug* I know what you're getting at, but you must agree that as soon as the usage conditions on "p" become greater, it should probably be renamed. *shrug* Not really one of my main arguments.

      Why must we always write code to be indestructible by a "new coder"?

      Good code is a journey, not a destination. I think everyone should at least make an attempt to constantly improve their technique. If I didn't care what other people think or do, I wouldn't bother to argue with you.

      The code is the behavior it's SUPPOSED to have.

      Wait just a minute. Let me go back and quote you to you, again:

      It likely would have expanded to include error checking if the item wasn't found.

      Well, WHICH IS IT? That code was either SUPPOSED to crash, if the item wasn't found, or it "likely would have to be expanded to include error checking."

      This really pisses me off. Can't you see how dumb you sound, here? I know that you're an intelligent person - you're making pretty good arguments - they just happen to be incorrect. But these two statements here, more than anything else, prove that your argument contains inconsistencies.

      The maintainability nightmare arises when there are two sources of behavior

      Let me list sources of behavior:

      1. What the user thinks they want
      2. What the user really does want
      3. How the conditions will change in the future
      4. How the coder meant to type in the code (typing in an algorithm it's possible to have typos - it's VERY useful to CITE your sources, so they can be checked, later. Otherwise, I have to figure out, by hand, what's wrong with the code you typed in.)
      Since there are always multiple "sources of behavior", I think it would be far better to document the choices that the coder made, than to leave them up in the air, undocumented.

      Comments are inferior to code

      Code without comments is inferior to code with comments.

      Granted, I'm expecting a certain level of maturity in the people writing the comments, but your assertion seems to be that the code is somehow BETTER if you intentionally REMOVE intelligent comments. That is an untenable position.

      I disagree with your summation of "the real problem", in your parting paragraph.

      I think "the real problem" is that it's impossible for the computer to understand the intention of a coder. It is only possible to verify the intended behavior of code, by having another human read the code. That process is aided by good documentation. I agree with your assertion that bad documentation is misleading. However, code with documentation is guaranteed to be AT LEAST AS GOOD as code without documentation. It is always possible for a human to remove documentation, and look at just code. At the very least, cite your sources for algorithms that you implement - that alone would dramatically improve the quality of a lot of code.
      --
      Education is the silver bullet.
  8. Curing unmaintainable code by gwernol · · Score: 5, Interesting

    Roedy Green has written an excellent, humorous online article on writing unmaintainable code. This relates directly to Literate Programming, especially Roedy's points about maintaining existing code. He writes (here): "[the maintenance programmer] views your code through a toilet paper tube. He can only see a tiny piece of your program at a time. You want to make sure he can never get at the big picture from doing that. You want to make it as hard as possible for him to find the code he is looking for. But even more important, you want to make it as awkward as possible for him to safely ignore anything."

    Literate programming in general, and Leo in particular, would be the ultimate cure for this. It allows you to easily navigate between multiple levels of description of a program. This is critically important if you are coming fresh to an existing piece of code. You need to constantly cross-reference the high-level design and low-level implementations (and the various levels of description between these extremes).

    --
    Sailing over the event horizon
  9. It still won't take off.. by Da VinMan · · Score: 3, Interesting

    I've tried Leo in the past, and while I support the author's ideas and the idea of literate programming in general, I do not believe that the practice will become significantly more common in the near future.

    There are two reasons I believe this:

    1. More and more modern IDEs support the idea of folding sections of code at multiple levels. Combine this with some well placed comments, and you achieve a very high degree of readability. This nullifies the primary benefit of Leo and ensures that most developers won't ever look at literate programming tools.

    2. Changing over to literate programming is, at least superficially, a large change. It's a large change because it requires that developers switch their primary environment. That's a big deal. Even if developers had the tools for literate programming in their preferred programming language already in their hands, they probably wouldn't use it.

    I do hope I'm wrong about the above though. I think a shift in the industry (even for a relatively short time) to literate programming would give us new ways of thinking about systems design, development, and would greatly ease long term maintenance.

    --
    Please mod this post only if you think others should/n't read this. I have enough ego^H^H^Hkarma. Thanks!
  10. Re:Been there, done that by kawika · · Score: 2, Interesting

    Every compiler vendor who has sold a mainstream language compiler/IDE using a "program database" or some other such approach has tanked.

    Well, except for Microsoft. Visual Studio 6 didn't go far enough in that direction, but it was a start.

    Visual Studio.NET does a lot more. In addition to its own use of the database, the IDE is built so that third parties can hook into it and add their own functionality. For example, one vendor will be releasing an add-in that takes UML and creates source for the appropriate C# or VB classes. If you later change the classes in source, it updates the UML.

    Sorry to sound like a marketing pitch.

  11. The right balance by teetam · · Score: 4, Interesting
    Too much documentation is just as bad as too little documentation, even when the documentation is good. It is very difficult to strike a balance.

    For example, many of the core java apis are well written and well documented. If you see the HTML javadocs, you can get a pretty good idea of the class.

    However, when you open the source code of the same class, it is not good looking anymore. Why? Because each method is preceded with dozens of lines of javadoc, each of which is embedded with HTML markup. That is good when the javadoc HTML pages are finally generated, but not so good when you look at the source itself. C# is worse with its XML based documentation!

    When I look at the source code, I want to see the flow of the code easily. All the documentation in the source should only aid this and not hinder this. Javadoc does both. The explanation part of the javadoc can be very useful in understanding what the author's intent was when he/she wrote the method, but I am not so sure about the rest. The param, return and exception tags are no doubt useful, but often developers don't explain these very well. Plus, these are the tags that can easily become outdated.

    I would prefer short and succinct pieces of information documenting the code, preferably close to the line of code that it documents.

    --
    All your favorite sites in one place!
  12. Literate programming versus continuing development by Phronesis · · Score: 5, Interesting
    Although literate programming has a lot of potential, all too often literate projects become completely ossified. M.D. McIlroy's criticism of Knuth's literate programs (CACM 29, 471-83 (1986)), that they tend to be like "industrial strength Fabergé eggs" as opposed to reusable tools, is still quite valid.

    For a project I am working on, I needed to extend CWEB to do some things Knuth hadn't thought of, and I found that excessive cleverness in the data structures made it much more difficult to extend than it should have been, so that Knuth could demonstrate clever data structures that probably add a few percent to the performance over what he could have achieved with more prosaic ones (Knuth does not document why he made these excessively clever design choices, nor whether the performance advantages they offer were significant).

    Similarly, a recent thread on comp.text.tex asking about the extensibility of TeX produced a number of comments from those who know about how unextensible and unreusable TeX really is.

    So, while I use literate programming (CWEB) to document a lot of my own code, I don't believe that, in all these years, I have ever seen a good example of literate programming that looks towards the future (refactoring, extending, reusing) as opposed to generating a fossil that comes with a good story of its life and times.

  13. Re:look at the screenshot of pg 10 from the perl s by edream · · Score: 2, Interesting
    I am the creator of Leo. Leo shows that the "stream of consciousness" style typically associated with literate programming can be replaced with a more effective organization, one that is more like a reference book.

    So you could say that Leo turns literate programmers into reference librarians ;-)

    -Edward K. Ream

  14. Literate programming caveats by majordomo · · Score: 2, Interesting

    I've used both CWEB and noweb, the latter for a large scientific computing project involving (among other things) a large number of tensor operations. While I've thus found the TeX math typesetting features invaluable, literate programming has some serious drawbacks.

    The most common problem for me has been the function/code chunk dichotomy. You might have a code chunk like "Set initial conditions" and only later realize that your chunk is too long and you need a function: set_initial_conditions(). Literate programming makes it so easy to write chunks of code without wrapping them in functions that your code ends up with too many chunks. If you do take the time to make functions then you vitiate much of the advantage of your literate programming chunks, since you end up just deleting the chunks and replacing them with descriptive function names.

    Another serious problem is that it is very difficult to invert a literate program into human-readable source code; i.e., if you decide to junk CWEB and go back to C source and header files, you are in big trouble, since the machine-readable source code is horrendous -- not to mention stripped of all comments! So you really make a huge commitment if you decide to go the literate route.

    Having used lit. prog. for several small projects and one big project I appreciate some of its advantages, but on balance I think that well-documented standard code is better. The only thing I really miss in standard coding is TeX math typesetting, but this is easy to rectify. I just wrote a simple program to convert a regular source file into LaTeX. I use a Qt-style //! or /*! */ comment and then some TeX formatting in my source code and strip it out later to make my documentation.

    einstein.cpp
    ...
    /*! Einstein's equation
    is $G^{\alpha\beta} = 8\pi T^{\alpha\beta}$.
    */
    for (int i = 0; i != 4; ++i)
      for (int j = 0; j != 4; ++j)
        G[i][j] = 8*pi*T[i][j];

    ...

    The commands
    % simple_doc einstein.cpp > einstein.tex
    % latex einstein
    then produce a typeset version, with C++ code in typewriter font and the tensor equation in beautiful TeX math fonts.
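
    The poster's simple_doc program is not shown, so the following is only a rough sketch of the idea under stated assumptions: the function name source_to_latex is hypothetical, only /*! ... */ blocks are handled (not //! lines), and code is wrapped in a plain verbatim environment rather than whatever formatting the real tool used.

```python
import re

# Hypothetical stand-in for the poster's simple_doc program, which is not
# shown in the post. Only /*! ... */ blocks are handled; //! lines are not.
DOC_COMMENT = re.compile(r"/\*!(.*?)\*/", re.DOTALL)


def wrap_code(code):
    """Wrap a run of source code in a LaTeX verbatim environment."""
    return "\\begin{verbatim}\n" + code + "\n\\end{verbatim}"


def source_to_latex(source):
    """Emit /*! ... */ comments as LaTeX text and all other code as verbatim."""
    parts = []
    position = 0
    for match in DOC_COMMENT.finditer(source):
        code = source[position:match.start()].strip("\n")
        if code.strip():
            parts.append(wrap_code(code))
        # The comment body is passed through untouched, so TeX math survives.
        parts.append(match.group(1).strip())
        position = match.end()
    code = source[position:].strip("\n")
    if code.strip():
        parts.append(wrap_code(code))
    body = "\n\n".join(parts)
    return ("\\documentclass{article}\n\\begin{document}\n"
            + body + "\n\\end{document}\n")


example = r"""
/*! Einstein's equation
is $G^{\alpha\beta} = 8\pi T^{\alpha\beta}$.
*/
for (int i = 0; i != 4; ++i)
  for (int j = 0; j != 4; ++j)
    G[i][j] = 8*pi*T[i][j];
"""

latex = source_to_latex(example)
print(latex)
```

    The source file stays ordinary C++ that any compiler accepts, which is the point of this approach compared with a CWEB file: the documentation is extracted from the code rather than the code from the documentation.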

    Lit. prog. might be good for some large, mainly single-author projects such as TeX or Mathematica, but it adds a layer of considerable complexity to your code base, forcing everyone who uses it to learn your system. It will also never make good programmers out of bad ones, and in some ways actually encourages sloppy code by making it easy to write chunks of code without good modular design. As a result, after my current project I'll probably not use a literate programming system again.

    -Michael

  15. Re:Literate programming versus continuing developm by Peter S. Housel · · Score: 2, Interesting

    In one respect, literate programs are a lot easier to maintain in the long term than illiterate programs because it's much easier to come back to them after a few months away.

    Since Pascal didn't support modules and separate linking, TeX and WEB weren't designed with any sort of reusability in mind. I don't think that there's anything inherent about literate programming that causes inseparable blobs of code like TeX and METAFONT to be produced.

    I generally program so that one document == one reusable library. The Monday Status page contains links to some of the literate libraries written for the Monday Project.

  16. Re:Been there, done that by RelentlessWeevilHowl · · Score: 4, Interesting

    IBM's Visual Age for Java used something similar, adapted from their Visual Age Smalltalk. My problem with VAJ was that you couldn't do anything in their environment except what they had specifically designed for you to do. If you have files in disk, you can run whatever tools you want on them. But in VAJ or Visual Studio .NET? "I dunno, what's in the context menu?"

    To avoid flat text files, you'd need an interactive scripting language powerful enough to perform any task you'd care to think of (viz., Emacs). Plus you'd need enough support libraries available to you to interact with third-party utilities, and finally bindings for the abstract syntax trees of all the languages you want to program in, so you could manipulate them programmatically.
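
    As one illustration of what such syntax-tree bindings look like for a single language, here is a small sketch using Python's standard ast module. The sample source and its function names are made up for the example; it only inspects the tree, and a real environment would need equivalent bindings for every language it supports.

```python
import ast

# Made-up sample program to inspect; the names are purely illustrative.
source = """
def find_person(name):
    return name

def spam():
    return find_person("Harry")
"""

tree = ast.parse(source)

# Every function definition in the tree, as (line number, name) pairs.
functions = sorted(
    (node.lineno, node.name)
    for node in ast.walk(tree)
    if isinstance(node, ast.FunctionDef)
)

# Names of functions that are called directly by simple name.
calls = [
    node.func.id
    for node in ast.walk(tree)
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
]

print(functions)
print(calls)
```

    Queries like these work on program structure rather than on lines of text, which is the kind of access a non-flat-file environment would have to expose to third-party tools.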

  17. The creator's view of Leo by edream · · Score: 5, Interesting
    Hi. I am the creator of Leo and I'd like to say here what my own view of Leo is. Joe Orr has contributed greatly to Leo, and I would not characterize Leo exactly as he did in his original article. In this posting I hope to clear up misconceptions about what Leo is, what it can do, and the relationship of Leo to literate programming.

    I would like to distinguish between the techniques of literate programming and the practice of literate programming (LP) as it has always been done before Leo (traditional LP). The key technique of LP is what might be called "functional pseudocode." For example, here is a fragment of code that can be written in Leo:

    def spam():
        done = False ; result = None
        while not done:
            << do something complicated >>
        return result
    The line: << do something complicated >> is a section reference. It works pretty much like a macro call. In particular, the code in the defintion of << do something complicated >> has access to the done and result variables. This is almost the entire content of noweb, one form of literate programming. It turns out that this technique can be extremely useful, as simple as it seems. Leo creates one or more "derived" files from an outline automatically when the outline is written, and Leo can update the outline from changes made to derived files when Leo reads the outline.
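
    To make the macro-like behaviour concrete, here is a minimal sketch of section-reference expansion. It is not Leo's or noweb's actual code: the SECTION_REF pattern, the expand function and the sections dictionary are all hypothetical names invented for illustration, and the body given to << do something complicated >> is made up.

```python
import re

# Hypothetical pattern for a line holding only a reference such as
# "<< do something complicated >>", with optional leading indentation.
SECTION_REF = re.compile(r"^(\s*)<<\s*(.+?)\s*>>\s*$")


def expand(lines, sections):
    """Replace each section reference with its definition, keeping the reference's indentation."""
    out = []
    for line in lines:
        match = SECTION_REF.match(line)
        if match is None:
            out.append(line)
            continue
        indent, name = match.groups()
        # Definitions may contain references themselves, so expand recursively.
        # (This sketch does not guard against a section that references itself.)
        for body_line in expand(sections[name], sections):
            out.append(indent + body_line if body_line else body_line)
    return out


outline_body = [
    "def spam():",
    "    done = False ; result = None",
    "    while not done:",
    "        << do something complicated >>",
    "    return result",
]

# Assumed definition of the section; it uses done and result from the enclosing code.
sections = {
    "do something complicated": [
        "result = 42",
        "done = True",
    ],
}

derived = "\n".join(expand(outline_body, sections))
namespace = {}
exec(derived, namespace)  # the expanded text is ordinary Python
```

    Because the expansion is purely textual, the section body ends up inside the while loop and sees done and result exactly as a hand-written body would.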

    In contrast to the technique of literate programming, the practice of traditional LP has focused on the central role of comments, and lots of them. Here is where Leo radically parts company with the LP tradition.

    One's view of the proper role of documentation in a project hardly matters to Leo. You are free to use comments as you always did, though you will probably find that LP as implemented in Leo helps you out in unexpected ways. I discuss at length and in great detail the relationship between traditional LP, comments and Leo here. In short, discussions about the role of comments in programming (literate or not) do not get to the heart of Leo.

    In fact, Leo often reduces the need for comments. Indeed, it is good style to organize Leo outlines like a reference book. Well-designed Leo outlines act both like self-updating tables of contents and self-updating indices. This is in marked contrast to the "stream-of-consciousness" or "narrative" style typically employed in traditional literate programming.

    In my view, the essence of Leo is this: Leo makes outline organization the most important part of a program or a project. Both code and documentation could be considered secondary. The overall big picture of a function, class, module, file or project is always at hand. Moreover, Leo makes outline structure a part of the computer language. For example, I often define a Python class as follows:

    class myClass:
        << declarations of myClass >>
        @others

    The @others directive acts as a reference to all the text in all the outline nodes which are descendants of the node containing this class declaration. Such nodes are copied to the output (derived) file in the order in which they appear in the outline. The reference << declarations of myClass >> ensures that those declarations precede the methods. There are several other ways that outline structure is important in Leo; I won't discuss them here.
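
    A rough sketch of how an @others expansion could work is shown here. It is only an illustration under stated assumptions, not Leo's implementation: the Node class and write_node function are hypothetical, every descendant is treated as ordinary text (section-definition nodes are not modelled), and a node without @others is simply followed by its descendants.

```python
class Node:
    """Hypothetical outline node: a headline, body lines, and child nodes."""

    def __init__(self, headline, body, children=()):
        self.headline = headline
        self.body = list(body)
        self.children = list(children)


def write_node(node):
    """Return the derived-file lines for a node, expanding @others in outline order."""
    out = []
    expanded = False
    for line in node.body:
        if line.strip() == "@others":
            # Descendant text inherits the indentation of the @others line.
            indent = line[: len(line) - len(line.lstrip())]
            for child in node.children:
                for text in write_node(child):
                    out.append(indent + text if text else text)
            expanded = True
        else:
            out.append(line)
    if not expanded:
        # Assumption of this sketch: without @others, descendants follow the body.
        for child in node.children:
            out.extend(write_node(child))
    return out


methods = [
    Node("__init__", ["def __init__(self):", "    self.count = 0"]),
    Node("bump", ["def bump(self):", "    self.count += 1", "    return self.count"]),
]
root = Node("class myClass", ["class myClass:", "    @others"], methods)

derived = "\n".join(write_node(root))
namespace = {}
exec(derived, namespace)  # the derived text is a normal class definition
```

    Reordering the child nodes in the outline reorders the methods in the derived text, which is the sense in which outline structure becomes part of the language.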

    Leo fully exploits the organizational power of outlines. A single outline typically organizes an entire project. Outlines can handle large amounts of data with ease. Moreover, it is possible to clone any part of an outline so that changes to one clone affect all other clones. This feature makes it possible for a single outline to contain multiple views of a project. For example, when fixing a bug, I clone all nodes related to the bug and gather them in a new part of the outline, called a task node. This task node effectively becomes a view of the project that focuses exclusively on the bug. Any changes I make to code are propagated to all other clones.
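
    One simple way to picture clones is as several outline positions that share a single underlying node. The sketch below assumes exactly that; the Node class, clone_into function and headlines are hypothetical, and nothing here is claimed to match how Leo actually stores clones.

```python
class Node:
    """Hypothetical outline node holding a headline, a body string, and children."""

    def __init__(self, headline, body="", children=None):
        self.headline = headline
        self.body = body
        self.children = children if children is not None else []


def clone_into(parent, node):
    """In this sketch a clone is a second reference to the same node object."""
    parent.children.append(node)
    return node


# The node lives in its normal place in the project outline...
parse = Node("parse", "def parse(text): return text.split()")
source_file = Node("parser.py", children=[parse])

# ...and is also gathered under a task node while fixing a bug.
task = Node("Task: fix whitespace bug")
clone_into(task, parse)

# Editing through the task view changes the one shared node.
task.children[0].body = "def parse(text): return text.strip().split()"
```

    The task node then acts as a view: it holds no copies of the code, so an edit made there is the same edit everywhere the node appears.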

    Earlier I mentioned that a well-designed Leo outline acts like self-updating tables of contents and self-updating indices. Tables of contents you get for free: an entire outline is the table of contents. Clones create self-updating indices. For example, each task node acts like the index entry for that particular task.

    - Edward K. Ream

  18. Re:Literate programming versus continuing developm by jaoswald · · Score: 3, Interesting

    This is absolutely on the mark.

    I believe that WEB was a great improvement over Pascal at the time that Knuth began to use it. However, it does not solve the underlying software engineering problem. Knuth's style at the time of TeX, etc., involved very little abstraction.

    The biggest problem this causes is that the major data structures in TeX do not have well-defined or factored interfaces that allow them to be easily changed or extended. Furthermore, important details of these data structures are basically undocumented, and often cause interdependencies between different portions of a WEB that are not at all obvious.

    If you wish to see the problem face-to-face, look through TeX: The Program at the "inner loop" and see how many different sections of the WEB you would have to understand.

    A similar problem is his use of enumerations with certain magic values, where the magic is documented (or becomes apparent, while still undocumented) some distance away from the point of definition.

    Another serious problem with WEB is that it allows one to completely obscure the sequential nature of the program. Many times, one chunk depends on initialization that was performed by another chunk. If Knuth decided to make some laconic comment rather than remind you of that initialization, good luck reconstructing the sequential dependencies.

    If one is writing monolithic programs, writing them like a Russian novel might be easier to comprehend than one large unformatted source file. However, if one has the alternative of writing a highly modular program with clean interfaces, I don't really see any advantage to breaking up and rearranging the underlying code.