Literate Programming and Leo
jko9 writes "First proposed almost 20 years ago by Donald Knuth, the idea of Literate Programming is basically that of making program documentation primary, and embedding code in the documentation, rather than vice versa. Despite some obvious
advantages apparent to anyone who has struggled to understand a poorly
documented program, literate programming never really caught on.
That all could change, however, with the release of a new program called Leo,
written by Edward K. Ream.
Leo supports standard literate programming
languages like noweb and
CWEB, but with a crucial
difference - Leo adds outlines. The effect is striking: overall
organization of a program is always visible and explicit. Much of the narrative of the documentation gets placed in the outline, making documentation simpler, and allowing viewers to approach the code at various levels of detail. Screenshots and tutorials for Leo are here - if
that site gets slashdotted, you can download the visual tutorials in .chm
form or html form from Leo's
Sourceforge site. Leo is an open source program written in Python. Any current practitioners of Literate Programming techniques out there? People
who have tried it and given it up? Can the addition of outlines to Literate
Programming make it more powerful / popular?"
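To make "embedding code in the documentation" concrete: in a noweb file, a code chunk starts with a line of the form `<<name>>=` and ends at a line beginning with `@`, and everything else is prose. Below is a minimal sketch of the extraction ("tangle") step under that convention only. The function name `collect_chunks` is made up for illustration, and the sketch does not expand chunk references the way the real tools do.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: gather the code chunks of a noweb-style document.
// A chunk starts at a line "<<name>>=" and runs until a line starting with '@'.
// Chunks with the same name are concatenated; references are NOT expanded.
std::map<std::string, std::string> collect_chunks(const std::string& doc) {
    std::map<std::string, std::string> chunks;
    std::istringstream in(doc);
    std::string line;
    std::string current;  // empty while we are reading prose
    while (std::getline(in, line)) {
        const bool is_definition =
            line.size() > 5 &&
            line.compare(0, 2, "<<") == 0 &&
            line.compare(line.size() - 3, 3, ">>=") == 0;
        if (is_definition) {
            current = line.substr(2, line.size() - 5);  // text between << and >>=
        } else if (!line.empty() && line[0] == '@') {
            current.clear();                            // back to prose
        } else if (!current.empty()) {
            chunks[current] += line + "\n";             // code line of the open chunk
        }
    }
    return chunks;
}
```

The prose never reaches the compiler; only the collected chunks do, which is what lets the document be ordered for the reader rather than for the language.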
i'll stick to JavaDoc
it's nice to see people trying to help out with slashdotting.
Maybe we can get other posters to get a few backup links in their posts to try to alleviate the load on these poor sites.
--Note to self. Add witty sig here, someday...
My previous employer had a strict rule concerning code: you first write the JavaDoc for all the project, then implement it. It's useful as hell ... and if you mix that with UML design before the documentation, it's a killer technique.
Life isn't like a box of chocolates. It's more like a jar of jalapenos. What you do today, might burn your ass tomorrow.
Few systems even allow multiple fonts in program text, although the original Bravo editor for the Xerox Alto did.
The Cox plan is not a T1, you newb. It's 1.5mbit down, 128 up. A T1 is 1.5mbit both ways. Wake up, ass.
Finally, math books without any of that base 6 crap in them.
Did ANYONE learn (sic.) pseudo code ???
When I learned programming, writing pseudo code was SUCH a big deal to the teacher that by the end of the year, without even thinking, I would write out the whole program in pseudo code and then, under each line of English, add one line of code.
And has it ever paid off!
Now when I want to look at my own documentation, I just grep my java files and pull out all lines that begin with '//'
Now when I am writing 20 pages of java code and all my boss sees are comments, I can tell him I'm just writing Literate code!
Good day to you sir.
i never seen so many freakin task bar icons!
i assume leo turns illiterate users into literate programmers?
Good lord, I get ticked if I have more than 3 or 4!!!
No I didnt spell check this post...
literate, without literate programming :)
Oh, little you know. I'm in a "special" condo unit with Fibernet. I just checked DSL reports. I'm getting 1.4 up and down to Linkline. Sorry to disappoint you.
what does leo do for me?
it looks like the oldschool windows help browser with code samples pasted into it.
I'm not trolling - I really want to understand how this makes for better code. And my employer's definition of better is faster/cheaper - they couldn't give a rat's ass about structure and good documentation. They couldn't read a program design in English any better than they could in the most cryptic C syntax I can muster.
Something like this could help a beginner or student break down code and learn to think logically, but unfortunately I had to move to the 'real world'..
Sometimes I can't document something until I figure out how it's going to be done. And I figure out how to do it by writing code that works. Then I document the code.
So far this brand of rapid prototyping is the only thing that gets results fast enough to keep my bosses happy. They care not for proper technique and well-structured code and attention to detail at the design phase. 'Design' around here is no more than a vague definition of the problem to be solved. They just want it out the door.
I'm sure I'm not alone.. How does leo help me?
Yuck. Leo is a "nifty" GUI which helps you do the outline. As I comment on another thread -- we programmers like our text editors thank you very much. I am ok with a visualization program but not one which takes over my workflow.
The main.cf config file of Postfix. Without the comments it's maybe 30 lines of actual settings. With comments it's 540 lines, and it's clear enough that a relative n00b like myself got it up and running in 1 hr with minimal trips to the website. Good documentation was a major factor in my picking Postfix over Sendmail. No dis to Sendmail, you understand.
There are 01 kinds of cars in the world. The General Lee, and everything else.
If your code requires massive documentation within the code to make it understandable, then your code likely needs to be rewritten.
With most languages, the code itself is ample documentation. For instance:
Person &p = Person::findPerson("Harry");
cout << p.name() << endl;
Is pretty self-explanatory. Anyone can tell the output of this code. It's not that programmers need more documentation, rather they need better abstraction and encapsulation (insert your favorite argument for object oriented programming here).
int func(int a);
func((b += 3, b));
When we build systems, we work directly with the client and we are able to describe the system in three equal, but very different ways. Depending on the documentation required and the target audience, we can describe the system in a way that allows everyone involved to communicate effectively. This is an advantage I don't want to lose.
From what I've read, literate programming seems to be a discipline that works best when the programmers are isolated from the client. How it works when the programmers and the client closely interact is something I simply don't understand.
No Zen is good zen
So your condo is wired with fiber? Well, unless Cox has a CO inside your living room, I really don't think that matters.
Finally, math books without any of that base 6 crap in them.
Roedy Green has written an excellent, humorous online article on writing unmaintainable code. This relates directly to Literate Programming, especially Roedy's points about maintaining existing code. He writes (here): "[the maintenance programmer] views your code through a toilet paper tube. He can only see a tiny piece of your program at a time. You want to make sure he can never get at the big picture from doing that. You want to make it as hard as possible for him to find the code he is looking for. But even more important, you want to make it as awkward as possible for him to safely ignore anything."
Literate programming in general, and Leo in particular, would be the ultimate cure for this. It allows you to easily navigate between multiple levels of description of a program. This is critically important if you are coming fresh to an existing piece of code. You need to constantly cross-reference the high-level design and low-level implementations (and the various levels of description between these extremes).
Sailing over the event horizon
Really? Why? I have ~56 icons in my tray (for all the major programs I use). I find it's a lot faster to click the icon than find it in the Start Menu, and I don't clutter up my work space (the desktop).
-Ed
Ed Wedig
Graphic design services
docbrown.net
All that matters to me is the near 1.5 up and down. Sure I have to share it with hogs occasionally, but I can hog it too, serving a game now and then. (And yes, I'm a posting newb, posted to the wrong story even.)
I've tried Leo in the past, and while I support the author's ideas and the idea of literate programming in general, I do not believe that the practice will become significantly more common in the near future.
There are two reasons I believe this:
1. More and more modern IDEs support the idea of folding sections of code at multiple levels. Combine this with some well placed comments, and you achieve a very high degree of readability. This nullifies the primary benefit of Leo and ensures that most developers won't ever look at literate programming tools.
2. Changing over to literate programming is, at least superficially, a large change. It's a large change because it requires that developers switch their primary environment. That's a big deal. Even if developers had the tools for literate programming in their preferred programming language already in their hands, they probably wouldn't use it.
I do hope I'm wrong about the above though. I think a shift in the industry (even for a relatively short time) to literate programming would give us new ways of thinking about systems design, development, and would greatly ease long term maintenance.
Please mod this post only if you think others should/n't read this. I have enough ego^H^H^Hkarma. Thanks!
Every compiler vendor who has sold a mainstream language compiler/IDE using a "program database" or some other such approach has tanked. (Note that I mean program database as the primary means of storing the code -- a replacement of flat files, not an addition to them.) So far, it's not really been a technological lack, it's just that programmers don't like it.
I recall reading some papers written by the major language guys a decade ago, and one of the things they all wanted to see was per-function recompilation (instead of per-translation-unit), better program information (like "where is this function used?") and other things that would require a more database-like format. Still hasn't happened except in research environments. (Pity.)
One could, but one would be a lunatic.
(I'm too tired to write it all down now, but I'll just summarize by saying that XML is not a silver bullet.)
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
Any sort of program that you plan on using long term will almost undoubtedly need updating. Do you really think you are going to recall how some program you wrote a year or more ago was structured? That's ignoring the very likely possibility that it isn't the original programmer that is updating the code. It may be faster to just build a new program from the ground up.
Without proper documentation or comments, code is (almost) completely useless.
No, you're more than a posting newb if you think just because your house is wired with fiber that you're going to get speeds like that. Because from the nearest CO, that shit is all coax, so unless you're running fiber all the way there, which can be some distance away, you're full of shit.
Finally, math books without any of that base 6 crap in them.
The biggest problem with literate programming is that most people don't write programs that are worthy of exposition. Most programs are written under extreme time constraints to solve immediate or practical problems, and their complexity arises from handling exceptions, special cases, and last minute or ill conceived extensions. Documenting these with prose actually doesn't help very much, as the prose reads pretty much as the code does: as a set of ill conceived exceptions rather than bold themes. Making the prose flow well is just work that could be used to make the code better.
If your code doesn't have these faults, then the code is already an expression of the program ideas, and one that you can execute, so in that case literate programming techniques are needed to a much smaller degree.
There is no doubt that literate programming (like extreme programming) has its benefits, but its principal benefit is to encourage an attitude of critical evaluation toward your coding efforts. That attitude is encouraged by literate programming but is not unique to that approach.
There is much pleasure to be gained in useless knowledge.
If your goal is "getting it right the first time", you've missed with this post.
No, my car doesn't have little notes, but it doesn't need them, because I am a *user*. But I'd be willing to bet that the manufacturer has beaucoup documentation about each part. How it was designed, who built it, where it goes, etc.
If you don't document the code, where are the references that others will need to understand the code? Frankly, you sound like someone who doesn't really believe that they'll ever miss a spec, mis-code an algorithm, or make a mistake. Perhaps you don't understand that the purpose of a program is to accomplish some goal. To achieve the goal, it must be written by someone. In order for someone to write it correctly, they have to be able to effectively parse the logic.
I'm not sure that Literate Programming is the answer, but your argument makes little sense. "Read it a story"? Hell, why use a language at all? Just get a hex editor and start punching bits in pure machine code. Surely you're that good, but some of us lesser mortals might have trouble.
Simmer down... There's no fiber in the house. It's all CAT-5 from the distribution box outside. There is a COX Fibernet box out there. (seen it) I know the BW is there for a fact. I've monitored it extensively.
No no, I have a lot in my quick launch bar too, I mean actual running programs, by the time in the lower right. Unless you happen to be sporting gigs of ram, why in the world would you have all that stuff start up every time you turn on your computer?? :)
No I didnt spell check this post...
Yeah, but do you honestly think you're "pulling one over" on Cox?
Finally, math books without any of that base 6 crap in them.
Hey Jack - I think your example is actually more bogus than what you are complaining about. Let me yank this one section out, and put things in perspective...
The goal of a programming language is to provide a machine with a set of instructions, not to sit down and read it a story. Do you expect your car to be made of parts which have little embedded notes explaining how they were engineered? Of course not, that's just silly.
And, when you look at your compiled program, you don't see comments or documentation inside of it either. The compiler strips it out, as it should. However, when you code, you document. When a car builder designs a machine, they document it into such detail level it makes programming documentation look sparse (most of the time - I've seen it be overdone before ;-) It doesn't matter what you do, building cars, wiring offices, or programming, you better be documenting what you do - and those who don't regret it later, and lack of planning up front causes serious issues.
I probably shouldn't pick on your example - but it was a really nasty example.
Now, I don't completely disagree with your opinion that it's gimmicky, but, this provides yet another process for people to adopt if they so choose to. Any method that people feel comfortable with for software engineering or documentation that gets them to DO IT, well, sounds like a good idea to me.
Davis Ray Sickmon, Jr - looking for something to read? Check out my three free novels at MidnightRyder.org
Literate programming is how Wolfram wrote Mathematica. Perhaps because of that, it has remarkable structural consistency. I doubt it'll catch on, though, because it takes more discipline and brains to determine the structure before coding, than to code it up, and then explain it.
No. They just don't seem to care. There are two COX employees in the same complex. Perhaps they have something to do with it. I don't push my luck though. Well, not very often.
I don't think what he has is bad, but I think there are better ways to achieve cleaner code.
Many people have mentioned that writing cleaner code is the best form of documentation. This is definitely true; unfortunately, you still have people who use single letters for significant variables (i.e. not loop indexes), who don't format their code, or who try to do too much in one line of code.
I think a better approach to documentation is the test driven approach that is used in XP and with packages such as JUnit and Cactus. Basically, you write your test cases first, which will force you to pin down the exact functionality for your components. These unit tests are essentially documentation on how your components should work. Granted, this doesn't document the specific code, but I think that one of the reasons why so much code is hard to read is because the functionality was not clearly thought through.
I also think API documentation is more important. A lot of times I am trying to use an open source package and I have a hard time understanding how to use the API to achieve certain functionality. I can read the code just fine but it isn't clear how to use the objects themselves.
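A rough sketch of the test-first idea described above, using plain `assert` rather than JUnit or Cactus. The `split` component is hypothetical and only there to have something to test. The point is that each assertion records one decision about behaviour, so the test also answers the "how do I use this API?" question.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical component, not taken from any real package:
// splits text into fields on a single-character delimiter.
std::vector<std::string> split(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::string current;
    for (char c : text) {
        if (c == delim) {
            parts.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    parts.push_back(current);  // the last field, possibly empty
    return parts;
}

// Written before split() itself: the test is the specification.
void test_split() {
    assert(split("a,b,c", ',') == (std::vector<std::string>{"a", "b", "c"}));
    assert(split("", ',') == (std::vector<std::string>{""}));               // empty input gives one empty field
    assert(split("a,,b", ',') == (std::vector<std::string>{"a", "", "b"})); // empty fields are kept
    assert(split("a,", ',') == (std::vector<std::string>{"a", ""}));        // trailing delimiter gives a trailing empty field
}
```

Someone reading only `test_split` learns what the component does with the awkward inputs without opening the implementation.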
"The goal of a programming language is to provide a machine with a set of instructions, ..."
No. From SICP: "...a computer language is not just a way of getting a computer to perform operations but rather that it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute."
For example, many of the core java apis are well written and well documented. If you see the HTML javadocs, you can get a pretty good idea of the class.
However, when you open the source code of the same class, it is not good looking anymore. Why? Because each method is preceded with dozens of lines of javadoc, each of which is embedded with HTML markup. That is good when the javadoc HTML pages are finally generated, but not so good when you look at the source itself. C# is worse with its XML based documentation!
When I look at the source code, I want to see the flow of the code easily. All the documentation in the source should only aid this and not hinder this. Javadoc does both. The explanation part of the javadoc can be very useful in understanding what the author's intent was when he/she wrote the method, but I am not so sure about the rest. The param, return and exception tags are no doubt useful, but often developers don't explain these very well. Plus, these are the tags that can easily become outdated.
I would prefer short and succinct pieces of information documenting the code, preferably close to the line of code that it documents.
All your favorite sites in one place!
At least the idea is nice. Attempt to keep the doc in sync with the code. Except in our environment most of the doc is actually in presentation forms, some diagrams, word documents, etc. These also need to be kept in synch with the code.
As for the extraction/folding schemes most editors probably do this to some extent. Visual SlickEdit does folds, organizes things by method, class, file or whatever you want.
As for the text comments, the problem with text seems to be the tagging, and search/replace algorithms. Maybe I'm the last person to still use C++, but sometimes VSlick gets confused with templates, and can't follow pointer->method calls when the method is inherited (sometimes). It *would* be nice to have the code intelligently digested, paying attention to preprocessor symbols, doc, etc.
As for documentation, my hat's off to those who do this. Unfortunately, most if not all languages don't force this, because it's either difficult or impossible. I'm not sure that Leo forces anything more either.
I agree and disagree with Jack.
Leo seems to support only a linear tree structure. That seems restrictive. I would argue that a directed flow diagram is a better way to show the overall structural relationships of code modules. Thus a more graphic IDE would be better, something like Visio combined with an editor. I used to create prototypes in pseudocode diagrammed in Micrografx Flowcharter, which supported hyperlinks from graphic objects. So I could draw flowcharts and click on a node and jump to the underlying code text.
I do not agree with Jack's philosophy. If the only goal of a programming language were to provide instructions, we'd all be writing assembly language. The power of computers is that they support creating intellectual power tools. Why should a human have to remember 1001 esoteric rules in order to write bugfree code? Let the machine do the heavy lifting. An ultra high-level PL environment would let you tell it a story, and it would fill in the details for you. Which would you rather do: drive 3000 miles with a map in one hand every step of the way, or simply get on a plane?
I think a programming language and its usage environment should provide an intellectual buffer between the ideas of a programmer and the specific mechanization of those ideas. They should be power tools for expressing thoughts and organizing them. Documentation should be a natural part of that environment, integrated into the expression of the code.
I'm not liking EJBs very much, and I think this post points to some of the reasons why...there's so much damn boiler plate code there, just to do the same simple tasks. (and making it worse, I have a general distrust of those fancy-shmansy editors that try to do all that bean stuff for you.) Approach someone else's code and you first have to figure out where to begin...naming conventions can help, but still.
EJBs (especially entity beans; session beans (especially stateless ones) are ok, though for 90% of uses regular java classes and static classes could do the same thing) seem to be the antiperl in some respects; it makes the easy jobs difficult and the difficult jobs impossible.
SO YOU'RE GOING TO DIE: The Comic for Dealing with Death
Another idea which would work well in this respect would be altering the language itself to be more reader friendly. Much C code is written whose syntax is a greater barrier to understanding the code than any concept underneath. Separating some aspects of the language from regular syntax would help. Take pointer notation: sure, it's simple in theory, but in practice it takes a fantastic long-term memory to remember whether you are witnessing a pointer being set to a memory address or a value being placed into a variable, without flicking around the source code or using a third party utility, which just slows you down and interrupts your thought process. Sure, an experienced coder can decipher obfuscated code, but if the language makes it one step easier, that's a little more brain power for the question "Why the hell did the original code do that?" and a little less spent on the question "What the hell does this code do?".
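For anyone who hasn't been bitten by this, here are the two operations being contrasted, side by side. The function `pointer_demo` is just a made-up container for the example. Note that the `*` in the declaration and the `*` in the assignments mean different things, which is a large part of the confusion.

```cpp
#include <cassert>

// Hypothetical demo function; returns the final value of `a`.
int pointer_demo() {
    int a = 1;
    int b = 2;
    int* p = &a;  // declaration: '*' is part of the type, p now holds a's address
    *p = 10;      // dereference: stores 10 in a, p itself is unchanged
    p = &b;       // re-point: p now holds b's address, a and b are unchanged
    *p = 20;      // stores 20 in b, a keeps its 10
    assert(b == 20);
    return a;
}
```

Without the declaration of `p` in view, `p = x;` and `*p = x;` differ by one character and do entirely different things.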
It's been a long time.
..and i suggest anyone who wants to turn their 5-line useless programs that look like shit into a 500-line useless program that looks like you know something use it..honestly, why would you possibly want to spend that much time commenting? if you really have that much trouble understanding your code i suggest you either learn the language you're using, or if you think you already did that, find another profession.
I've seen this before, but typically from the other way around: browse databases generated from C++ source. You also have XMLDoc in languages like C#, where every class, method, etc. is prefixed by a section of XML that describes that item and each piece of that item. And then you have intentional programming, which is the concept of programming to a database instead of a flat file, where the database contains all of the definitions for symbols for various languages and can flip between them without effort. For example, you can change from C++ source to flowchart to circuit schematics and back into C++.
Really, it depends. Some code has a pretty short life-cycle prior to a complete rewrite. The company I used to work for generally produced this kind of software. Requirements for a product might change pretty radically within a year or two.
The people I work for currently, on the other hand, expect to get a lot more life out of a given piece of code. In this case, the extra effort on design and code documentation, as well as the work required to keep it up to date, is easily warranted by the need for long-term maintenance.
Roving Web-Teleoperated Robot
One of the advantages of Literate Programming (at least in my experience) is that one can start with a general idea of what needs to be done, and then fill in the details down to the end, as it suits the programmer. For example, when writing a sorting routine, at some point I know I'll need code to swap the contents of two pointers. I can (in CWEB) just put a placeholder in and write it later or, if I've got the code in my head, just write it down directly.
This method models the way that (for me at least) code is thought about. That's the key idea in LitProg - to put the source code / documentation down in a manner that models the thought processes of the programmer.
I don't have a full, firm, outline in mind right at the start. That's not to say I don't think about it - but it's not final. Using an outliner at the start would not work well with me. CWEB forces me to document the thought behind each step of the algorithm, and presents it in logical order, even though it was not written in that order.
Maybe if I had a cast in stone plan for the code before I start, I'd write better code. But I work well enough with CWEB &c that I don't see the addition of an outliner assisting.
Frankly, looking at the web page, it looks just like an outlining code editor - nothing that dramatic, and I'd rather stick to vi + CWEB.
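For readers who haven't used CWEB, the placeholder workflow from the sorting example above can be roughly approximated in plain C++ with a forward declaration. This is only an analogy, not CWEB's named-section mechanism, and the names `sort_ascending` and `swap_contents` are made up for the sketch.

```cpp
#include <cassert>
#include <cstddef>

// Placeholder: we know the sort will need this; the body can wait.
void swap_contents(int* a, int* b);

// The routine we actually care about, written first (a simple bubble sort).
void sort_ascending(int* data, std::size_t n) {
    for (std::size_t pass = 0; pass + 1 < n; ++pass) {
        for (std::size_t i = 0; i + 1 < n - pass; ++i) {
            if (data[i] > data[i + 1]) {
                swap_contents(&data[i], &data[i + 1]);
            }
        }
    }
}

// Filled in later, once the overall shape of the sort was settled.
void swap_contents(int* a, int* b) {
    assert(a != nullptr && b != nullptr);
    const int tmp = *a;
    *a = *b;
    *b = tmp;
}
```

The difference is that C++ dictates where the declaration has to appear, while a literate tool lets the explanation decide the order and rearranges the code for the compiler afterwards.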
Something I didn't put in the original notice but now regret leaving out: Leo has another new feature, more difficult to describe, which solves the problem several people have mentioned about not wanting to abandon an existing text editor or tool. Leo can embed an outline structure in comments, so that one programmer can work with the file in JBuilder or Emacs, and another programmer can still work with the program(s) in Leo. In effect, Leo is a meta-text editor. When Leo opens an outline containing a file that has been edited with another editor, all of the edits are retained. This is a further extension of LP because you are getting code read back into the documentation, which means that LP techniques can be used for understanding and/or teaching existing programs. It also means that Leo allows LP to be a secondary technique to add additional structure and documentation, rather than necessarily being the primary technique. This is explained in more detail in the tutorials and Leo docs.
It is true that there are other IDEs that allow folding, e.g. Visual Studio .NET, but this ability to separate the outline from the program is something new, as far as I know. Also, unmentioned in the original article is the idea of having clone nodes, which means your outline can put the same code section into different branches simultaneously.
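To give a feel for how an outline can ride along inside ordinary comments, here is a small sketch. The marker format (`//@<depth> <title>`) is invented purely for illustration and is not Leo's actual file format; `OutlineNode` and `read_outline` are hypothetical names as well. Any other editor sees these lines as plain comments and leaves them alone.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

struct OutlineNode {
    int depth;          // 1 = top level
    std::string title;
};

// Invented marker format for illustration only (NOT Leo's real one):
// a line of the form "//@<depth> <title>", e.g. "//@2 Tokenizer".
std::vector<OutlineNode> read_outline(const std::string& source) {
    std::vector<OutlineNode> outline;
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        if (line.size() < 6 || line.compare(0, 3, "//@") != 0) {
            continue;  // ordinary code or an ordinary comment
        }
        const char d = line[3];
        if (d < '1' || d > '9' || line[4] != ' ') {
            continue;  // not a well-formed marker
        }
        outline.push_back({d - '0', line.substr(5)});
    }
    return outline;
}
```

Because the structure lives in the file itself, the outline can be rebuilt after someone edits the code elsewhere, which is the property the post above describes.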
The following statements will be highly inflammatory to many people. They are not intended to be inflammatory but, rather, a simple observation.
Basically, Leo is yet another tool to automate the documentation of programming code. There are dozens, possibly hundreds, of programs available for this task. Yet the problem that these tools were designed to solve remains very prevalent, if not pervasive.
The reason that the problem remains and that Leo will not solve the problem either is relatively simple. Simply put, the problem is garbage-in, garbage-out (GIGO). These tools are not able to determine the purpose of the code or the intent of the programmer that is writing it. These tools cannot read the minds of the programmers. The tools rely on the programmer to write out their thoughts and the intended purpose of the code.
Most programmers are unwilling or incapable of performing this critical step thoroughly. All too often, they use shorthand and expect the reader to understand what they mean. Or, they believe that the reader should be able to understand their thought process by reading the code itself. Furthermore, they assume that if the reader can't do this, they are simply not a good programmer (1337).
To go a step further, many programmers are not capable of clearly expressing their thoughts in their native tongue. These people are quite brilliant and can do amazing things with their code but, they can't express their thoughts to another person unless that person is indeed, able to read and comprehend the code itself.
Now, in fairness to the programmers, we have to look at what they do and what they are taught. Most programming languages are all about efficiency. They rely heavily on abbreviations and aliases; why do you think it's called code? They are designed to require a minimum of typing while providing a maximum of functionality. The programmers themselves are always striving for increased efficiency both in their code and in the way they get the code done. They always try to put out more, which leads to further shortcuts and abbreviations. This all tends to make programmers minimalists, and their documentation clearly reflects this.
So, Leo is unlikely to provide any documentation breakthroughs. The old rules still apply: garbage-in, garbage-out. The best idea I've seen was an earlier post, where the documentation is written first and then the code is developed to match the documentation. But, honestly, which of us is going to do it that way? That's a lot of work and our ingrained habits are going to be hard to break.
This wouldn't be so bad if you actually were funny, Jack. But regrettably, you are not, and if someone laughs, it is you they are laughing at.
Someone mod this up! You've hit the nail right on the head.
IOOC 911.11? Would that be the International Olive Oil Council, or the Iranian Offshore Oil Company?
Not to feed the troll, but for the benefit of any impressionable young programmers:
The goal of a programming language is to provide a machine with a set of instructions, not to sit down and read it a story.
Programming languages intended for use by humans (as opposed to languages intended primarily for machine generation) have multiple goals, three of which are to be human-writable, human-readable, and human-maintainable.
Literate programming may not be a perfect solution, but it's addressing a real issue. Current programming languages tend to be pretty horrible at expressing abstractions in a human readable way. The ideal programming language would be one that allowed you to express abstractions at the level of the problem domain, yet was able to translate that into something as efficiently executable, or close to it, as something written in a lower-level language. Literate programming allows you to do something along these lines, although it still involves a fair amount of "manual intervention" on the part of the programmer.
This seems like it should be easy to implement as an emacs mode of some sort. Anybody know if it's been done / is being done? I'd be remiss to give up XEmacs, thanks.
I am sure most of you have seen this but this is some of the most clear code I have ever seen.
While on the topic of tools for development, what's the best open source editor out there with support for multiple languages (C, C++, Java, Perl, Python, etc) that runs on multiple platforms (Linux/*BSD/Windows) and supports basic stuff like syntax highlighting and more advanced stuff like code block folding/collapsing, etc?
You are sorely wrong and obviously do not work anywhere for a living. I spend more time reading code than I ever will writing it.
It's ugly, but it seems to help me out.
Need a Linux consultant in New Orleans?
While I agree that Literate Programming is a promising concept and Leo is a promising approach to that concept, and while it's great to see some buzz for LP, Leo is not really a "new program".
As far as I remember, it has been around in some form for more than 5 years, although the Python incarnation may indeed be new (Leo started its life as a Macintosh application).
I am more of a technical writer than a programmer (well, really, I'm not much of a programmer at all), but it was always clear to me that 90% of the software development headaches I lived with at various companies could have been resolved with minimal effort early in the project... IF anyone cared about using a methodical approach to project documentation.
But nobody likes documentation. Writing it. Reading it. Just the word makes some people itch. For some reason, this is something that BOTH business managers and programmers don't get: documentation saves work. It is a way to produce a testable set of requirements, then a testable architecture/design, then a way to match up features and metrics in production and testing.
I mean, why does everybody think writing the manual is the LAST thing you do when you make software? With all the snarky "RTFM" comments I hear from geeks, I should start a new variant...
"PUHLEASE! BEFORE YOU START CODING, WTFM!"
I've got a bad attitude and karma to burn. Go ahead. Mod me down.
Though sometimes they think they can.
That can cost sales when prospective clients read the hideous grammar and the glaring misuse of words.
For a project I am working on, I needed to extend CWEB to do some things Knuth hadn't thought of, and I found that excessive cleverness in the data structures made it much more difficult to extend than it should have been. It seemed that Knuth wanted to demonstrate clever data structures that probably add a few percent to the performance over what he could have achieved with more prosaic ones. (Knuth does not document why he made these excessively clever design choices, nor whether the performance advantages they offer were significant.)
Similarly, a recent thread on comp.text.tex asking about the extensibility of TeX produced a number of comments from those who know about how unextensible and unreusable TeX really is.
So, while I use literate programming (CWEB) to document a lot of my own code, I don't believe that, in all these years, I have ever seen a good example of literate programming that looks towards the future (refactoring, extending, reusing) as opposed to generating a fossil that comes with a good story of its life and times.
At least the idea is nice. Attempt to keep the doc in sync with the code.
;-)
I hope you meant "keep the code in sync with the doc".
in our environment most of the doc is actually in presentation forms, some diagrams, word documents, etc. These also need to be kept in synch with the code.
Ummm... You mean the code has to be kept in sync with these docs, right?! Please?
From what I've skimmed of Leo, it's certainly not designed to generate/update docs after you wrote code. Thank goodness. Having to update docs to match the code can be a serious symptom. There are exceptions, of course, but in my opinion, if you're updating your docs -after- your code has already changed so often that you need a -tool- for it, welll....
might include scoping (i.e. putting warts on globals and statics) and reminding you something is a pointer (pData). But none of this lpsz crap...
I can NOT believe this post got moderated up so high. Slashdot amazes me sometimes.
...".
The original post has nothing to do with what you are blathering on about. The original code makes perfect sense but is taken out of context. The complaints you have about that snippet cannot possibly be justified for such a short piece of code.
I could just as easily say "DAMN, that code won't run at all, there's no function declaration or anything. Hell, you can't just make some call like that, you need a main()
You know, I hate to nitpick your nitpick, but either way should be about as fast as the other since wq is one stroke if the ring and pinky move at almost the same time and that they are slightly more accessible than the x is (home row is easier than top row is easier than bottom row rule).
</rant>
the trick here is to integrate texinfo (standard GNU documentation source format) generation with automake methodology (which assumes texinfo is hand-maintained). in the vein of foo2bar naming convention, the TWERP (Texinfo With Eval-Requiring Predilections :-) file is processed to .texi with twerp2texi (which also handles indexing, automatic dependency tracking a la depcomp, and Makefile prep).
here's a simple example (from doc/ref/scheme-compound.twerp):
The @twerpdoc directives expand to documentation on hashq (from Scheme) and scm_hashq (from C), mined from libguile/hashtab.c, and so forth. modify hashtab.c, do a "make" and the .texi (and .info and .html if enabled) is regenerated.
this differs from the article's system in that outline info (and document organization in general) is maintained in .twerp files, which "pull in" reference docstrings and other bits as needed from source, rather than adhering to "one source" doctrine. probably we will introduce more @twerpFOO directives (e.g., to do bit-field diagram or embedded DAG layout) in the future. for more info, see documentation on twerp2texi itself in the guile docs (in tarball above).
So you could say that Leo turns literate programmers into reference librarians ;-)
-Edward K. Ream
Want something easier in the short and long terms? How about not hiring arrogant programmers whose grasp of everyday language is so poor that they can't post a coherent Web form?
C# allows you to do this very thing. Basically the comments/documentation can be embedded right in the code itself in XML format. You can then compile the source code into documentation, as opposed to compiling it into binaries.
I've used both CWEB and noweb, the latter for a large scientific computing project involving (among other things) a large number of tensor operations. While I've thus found the TeX math typesetting features invaluable, literate programming has some serious drawbacks.
The most common problem for me has been the function/code chunk dichotomy. You might have a code chunk like "Set initial conditions" and only later realize that your chunk is too long and you need a function: set_initial_conditions(). Literate programming makes it so easy to write chunks of code without wrapping them in functions that your code ends up with too many chunks. If you do take the time to make functions then you vitiate much of the advantage of your literate programming chunks, since you end up just deleting the chunks and replacing them with descriptive function names.
Another serious problem is that it is very difficult to invert a literate program into human-readable source code; i.e., if you decide to junk CWEB and go back to C source and header files, you are in big trouble, since the machine-readable source code is horrendous -- not to mention stripped of all comments! So you really make a huge commitment if you decide to go the literate route.
Having used lit. prog. for several small projects and one big project I appreciate some of its advantages, but on balance I think that well-documented standard code is better. The only thing I really miss in standard coding is TeX math typesetting, but this is easy to rectify. I just wrote a simple program to convert a regular source file into LaTeX. I use a Qt-style //! or /*! */ comment and then some TeX formatting in my source code and strip it out later to make my documentation.
einstein.cpp
...
/*! Einstein's equation
is $G^{\alpha\beta} = 8\pi T^{\alpha\beta}$.
*/
for (int i = 0; i != 4; ++i)
  for (int j = 0; j != 4; ++j)
    G[i][j] = 8*pi*T[i][j];
...
The commands
% simple_doc einstein.cpp > einstein.tex
% latex einstein
then produce a typeset version, with C++ code in typewriter font and the tensor equation in beautiful TeX math fonts.
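The simple_doc program itself is not shown in the post, so here is only a rough sketch of what such a converter might look like, written in Python. The function name source_to_latex and the choice of a verbatim environment for the code are assumptions for illustration; it handles only /*! ... */ blocks, not //! lines.

```python
import re

# Matches Qt-style documentation blocks: /*! ... */
DOC = re.compile(r"/\*!(.*?)\*/", re.DOTALL)

def source_to_latex(src):
    """Turn /*! ... */ comments into LaTeX prose and all other text into verbatim code."""
    parts = []
    pos = 0
    for m in DOC.finditer(src):
        code = src[pos:m.start()].strip("\n")
        if code.strip():
            parts.append("\\begin{verbatim}\n" + code + "\n\\end{verbatim}")
        # Collapse the comment's line breaks so it reads as one paragraph.
        parts.append(" ".join(m.group(1).split()))
        pos = m.end()
    tail = src[pos:].strip("\n")
    if tail.strip():
        parts.append("\\begin{verbatim}\n" + tail + "\n\\end{verbatim}")
    body = "\n\n".join(parts)
    return "\\documentclass{article}\n\\begin{document}\n" + body + "\n\\end{document}\n"

example = r"""/*! Einstein's equation
is $G^{\alpha\beta} = 8\pi T^{\alpha\beta}$.
*/
for (int i = 0; i != 4; ++i)
  for (int j = 0; j != 4; ++j)
    G[i][j] = 8*pi*T[i][j];
"""

print(source_to_latex(example))
```

The point of the design is that the source file stays an ordinary compilable C++ file; the documentation is derived from it, rather than the other way around.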
Lit. prog. might be good for some large, mainly single-author projects such as TeX or Mathematica, but it adds a layer of considerable complexity to your code base, forcing everyone who uses it to learn your system. It will also never make good programmers out of bad ones, and in some ways actually encourages sloppy code by making it easy to write chunks of code without good modular design. As a result, after my current project I'll probably not use a literate programming system again.
-Michael
There's an old saying (was on a "Murphy's Laws of Computing" poster I used to have): "make it easy for programmers to write in English, and you'll find that programmers can't write in English."
Others have pointed out the all-too-common case where the code gets edited but the comments don't. This is bad, but not as bad as another common case: the programmer tries to comment the code, but his/her grasp of English isn't up to the task. This may be because English is a second language, or simply because the person specializes in computer languages, not human ones. In any case, the result is frequently misleading or incomprehensible comments that either do no good, or worse than no good. And, of course, deadline pressures never help.
I think Literate programming is a wonderful idea, but I don't think it's a practical one in many (most?) real-world environments.
... programming is all about generating code. The necessity and depth of documentation is determined by the context - if you're working as part of a development team, you will be strongly concerned about maintainability and reasonable code documentation will be necessary. If you're hacking together a bunch of assembly-optimized modules, then you might want to attach a minimalistic blurb to your routines, but nothing as comprehensive as most coding standards require. In the end, code documentation is all about maintainability and minimizing the 'discovery' curve for future work on your code.
What is of paramount importance is documenting your design, which is where any engineer expends 90% of their energy. Design work is what separates an engineer from a programmer; indeed, the ability to design complex software systems is what stratifies the field of software development into its levels: CTO, architect, engineer, programmer, etc. Design work is where your college student loan payments are justified, since it really does take an advanced degree to design certain classes of systems. Implementation just requires programming drones, but some sort of design document must exist to guide their efforts. The point here is this: code documentation is to be encouraged, and lexical programming is a good way to do it. But the significance of that pales in comparison to that of properly documenting your design.
HelpUsObi 1
Apparently literate programming was not enough to allow the developers of evisa.com to avoid making yet another site that only works with IE 5.5+.
Unimpressive.
Wagner LLC Consulting Co. - Getting it right the first time
And losing customers along the way!
Minimal comments and a language that creates documentation for you is much better. With Eiffel your classes automatically have their public members documented, and with the design by contract model the interaction between classes is obvious.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Imatix, the makers of the awesome open source webserver Xitami, has a similar product called 'Libero'. It allows you to chart and build your code as you go.
http://www.imatix.com/html/libero/
You say things that offend me and I can deal with it. Can you?
I am not that old, and I have seen several cases of someone cutting and pasting some similar code, comments and all, and then not updating the comments with the minor changes. When the comment references one register, and the code a different one, the comment is useless. Even though the code is similar, you can be sure that something is different, otherwise the two functions would be combined into a single one. But what logically is different between the two? What was missed?
When the original was written 25 years ago in assembly for a different CPU (previous model, old code will still run) by a guy who is dead, you are in trouble. (I'm thinking of a real case here)
Great documentation also doesn't help when it covers the wrong thing. I read the documentation for one module I needed to make minor changes in and discovered nothing about the code; instead I found the rough draft for a book: Advanced tricks with internal OOA process (Don't look on Amazon, it never got further, and in any case is specific enough to that company's old process that it wouldn't apply elsewhere).
The problem with documentation is that good documentation rarely exists, not that it is hard to get at. Literate programming sounds good, and it would be if everyone wrote good documentation, but nobody could find it afterwards. Instead nobody writes good documentation, but at least it is in accessible places. (Generally company specific, but most companies do a fairly good job of keeping it)
I know they are 'old fashioned', but box-and-line flowcharts are my program outline, and terse comments in the code correlate to the boxes and lines on the flowcharts. I can show people a flowchart, and most can understand what is happening faster, or more fully, than with any other type of documentation.
Programming is half Rubik's-cube-like puzzle solving and half multiplication-table recital. (Those of us doing it for a living must be slightly [at least] off-kilter.) Documentation is a personal skill/habit; the way for a program to teach you this isn't by forcing you to do both simultaneously. The only way to learn this habit (if you haven't already) is to have your final work gone over and problems pointed out, and be willing to work on it. It seems like stuff like Leo is great for people with the habits already, but yields little more than confusing documentation to go along with confusing code for the rest of us.
You've got to develop some kind of outline first, then write the code, then test, test, and document (with flowcharting, for me) what's actually happening.
All pass beyond reach of medicine. None pass beyond the reach of love.
the above example was a bit simplistic, but if you've been in the workforce for some time, you've for sure been in the unfortunate position of 'chief archeological engineer'.
By this I mean that you've been handed a coupla-tens-of-thousands (hundreds of thousands if you're esp. unlucky) piece of spaghetti code that started its life when somebody (usually the founder) of the company 'prototyped' an idea years ago, and that got perverted and 'enhanced' countless times by countless different people, and you're asked to implement a new major 'feature' on top of it with a two-weeks deadline and everybody breathing down your neck.
While in a 5 line function the line 'i++ means go to the next cell' is obvious, in a 2000 line function with multiple exit points that can be called directly, as a callback, via pointer reference and so on, that i++ is not going to look very enlightening by itself.
[rant]
In every single profession other than software development, when something you've built before can't do the job you need it to do, you tear it down and rebuild to the new specs.
Why is it that with software there is the assumption that just because you have a cubicle shop full of caffeinated programmers you can take your old code for, say, a word processor, and with a few mandatory overtime weeks make it also be a SQL database?
How long before laws are enacted to mandate all software's source code to be open and audited before a software product is put on the market? With programs used for more and more critical applications, where lives are at stake, where Windows NT is used to guide a naval destroyer, it is really incredible that companies are allowed to sell the buggy crap that these days passes for software.
And don't tell me that it's 'too hard' to QA software well: do you think it's easy to QA the project for a skyscraper with all the structural, electrical, hydraulical and so on work that it needs?
Ship now and fix later and 'no warranties' EULAs should be made illegal IMHO, and if this adds another 6 months of QA to a new release, so be it, better 6 months of QA for the manufacturer, than 6 months of buggy hell by the client. It's their product, why should I be on the hook for bugs I wasn't responsible for?
[end rant]
As a metaphor, let's take the five-paragraph essay. This is a simple form of basic literacy. There is a head paragraph with a few statements leading to a strong thesis. There are three paragraphs that argue for those statements in an effort to validate the thesis. Finally, there is a conclusion paragraph where the thesis is formally validated. A literate person is expected to be able to apply this structure to a hypothetical issue. This, however, does not mean that the person will have enough respect for this structure to use it in communicating on a daily basis. Often, the bogus literati believe that it is more important to be complicated in an effort to create a perceived intelligence, rather than to be direct and allow statements to be judged on their own merit.
The same is true for code. Code has its own vocabulary, grammar, and idioms. It also has a structure that can be generalized for all code, as well as structure that is unique for each language and application. It is the application of these structures that creates legible code.
This was very clear to me several months ago when I was wading through some code written in VC++. The person who wrote the code, though likely to do well on a VC++ test, was totally ignorant of the standard structures and grammar of not only OO code, but even structured programming. Repeated tasks were not converted into a generalized function. Variables were ambiguously retrieved from the registry. Identical conversions were done differently in various areas of the code. Related variables and functions were not encapsulated into classes. This had nothing to do with literacy. It had everything to do with a lack of respect for structure. I was able to take this code that worked and convert it into code that was legible without the inclusion of foreign syntax.
Two last things. First, there are situations where rules of structure and grammar must be broken. There are even times when it is fun to do so. That said, it is one thing to intentionally break well known rules, and a totally different thing to be too ignorant to realize their importance. Second, translation tables are still needed between human languages and computer readable languages. The trick is to create code using existing structures and idioms in an effort to make the translation tables as simple as possible.
"She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
Ha ha. You closed a tag instead of a tag!
Banach-Tarski Overdrive
In one respect, literate programs are a lot easier to maintain in the long term than illiterate programs because it's much easier to come back to them after a few months away.
Since Pascal didn't support modules and separate linking, TeX and WEB weren't designed with any sort of reusability in mind. I don't think that there's anything inherent about literate programming that causes inseparable blobs of code like TeX and METAFONT to be produced.
I generally program so that one document == one reusable library. The Monday Status page contains links to some of the literate libraries written for the Monday Project.
For a project I am working on, I needed to extend CWEB to do some things Knuth hadn't thought of, and I found that excessive cleverness in the data structures made it much more difficult to extend than it should have been, so that Knuth could demonstrate clever data structures that probably add a few percent to the performance over what he could have achieved with more prosaic ones
Generally, collection management APIs should be "wrapped" such that you can change the implementation with little or no change to the application code that uses collection management.
Whether there is a performance penalty to such wrapping is hard to say. Generally, there will be some performance penalty for the "indirection" needed for hiding implementation.
For many domains, making the software easier for programmers to maintain is more important than speed. Some programmers get obsessive over speed for no good reason, and make stupid (change-unfriendly) code as a result. They should be embedded systems programmers if they get off on that.
(BTW, you don't need OOP to wrap collection handlers.)
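A minimal sketch of that kind of wrapping without any classes, in Python. The customer_* function names and the _store variable are made up for illustration; the idea is only that application code calls the wrapper functions and never touches the underlying container.

```python
# Hypothetical application-facing API for a "customers" collection.
# Callers use only these functions and never touch _store directly,
# so _store could later become a list, a file, or a database table.
_store = {}

def customer_add(cust_id, name):
    """Insert or replace one customer record."""
    _store[cust_id] = name

def customer_get(cust_id, default=None):
    """Look up one customer by id, returning default if absent."""
    return _store.get(cust_id, default)

def customer_ids():
    """Return all known ids in ascending order."""
    return sorted(_store)

customer_add(2, "Grace")
customer_add(1, "Ada")
print(customer_ids(), customer_get(1))
```

The extra function call is the "indirection" cost mentioned above; in exchange, swapping the dictionary for something else means editing three small functions instead of every call site.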
Table-ized A.I.
The following things always make me take an extra coffee break.
- If you write a library, please don't just write a class reference with your doxygen. I can see what the classes are and what they are derived from. Also write a cookbook that explains how you think this library should be used.
- If you want to write something using a certain technique, please learn it first before attempting it on a time-critical task that everybody else is relying on. Too many times have I seen somebody attempt to write an STL-style template library for some trivial problem.
- Draw pictures for goodness sake... Two boxes that connect to each other via a line, whether UML or just some Word scribble, tell me more in a second than two paragraphs of jibba jabba fancy CS lingo...
Anyhow, I don't think the documentation and code should mix at all. The things that make debugging easier are logging (the ability to turn logging on to a high level is priceless) and simple code. The fancy distributed OO monsters are the worst things to maintain -- adding heaps of description text would just make this worse. Since I've had to work with other people's code my whole programming world view has changed. Good code isn't clever or fashionable.
Incidentally, the most useful tools I've found for problem solving an unknown mass of program are gnu grep (grep -ri) and find (find -mtime -1).
In the real world, development teams are made up of people from places where they don't learn English as a first language.
Looking around my current team, I see people from China, India, Norway, Japan, Germany, Malaysia, Iran. All are very competent developers, but many of them have writing skills that suck.
Personally, I would prefer to look at their code rather than some tortured syntax that is pretending to be the English language.
the idea of Literate Programming is basically that of making program documentation primary, and embedding code in the documentation, rather than vice versa.
This is not anything like what Knuth meant, even if it is what people are calling literate programming these days. (I'm not even sure what the above means.)
Literate programming is writing your code so that it doesn't require comments and documentation to be understood. The main points of this are clear formatting and use of whitespace (not such a common notion back then), descriptive and accurately named variables and functions (and objects), logical flow, 1 task per function, etc.
A well written program should be able to be read without understanding all the details, the goal being that programmer A can understand and modify programmer B's code easier, safer, and quicker.
what a clarification--
*mainstream language compiler*
such languages were not designed for gui and rad
such languages were created on legacy file systems
out here in the jungle though drones are performing ghoulish rituals like clustering macs and developing in-house apps driven by marketing (irad or die baby)
there are many niche products
it's like a star trek universe
and in this diverse eco-schism there are compiler/ides (out of the mainstream) which keep businesses going, and the vendors of these tools are *hanging on*
they may be decomposing, but they composing still
not all of these un-mainstream systems are a horror, i actually enjoy programming 4d (even though i have to export methods to text for personal archives)
I would like to distinguish between the techniques of literate programming and the practice of literate programming (LP) as it has always been done before Leo (traditional LP). The key technique of LP is what might be called "functional pseudocode." For example, here is a fragment of code that can be written in Leo:
The line: << do something complicated >> is a section reference. It works pretty much like a macro call. In particular, the code in the definition of << do something complicated >> has access to the done and result variables. This is almost the entire content of noweb, one form of literate programming. It turns out that this technique can be extremely useful, as simple as it seems. Leo creates one or more "derived" files from an outline automatically when the outline is written, and Leo can update the outline from changes made to derived files when Leo reads the outline.
In contrast to the technique of literate programming, the practice of traditional LP has focused on the central role of comments, and lots of them. Here is where Leo radically parts company with the LP tradition.
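To make the macro-like behavior of section references concrete, here is a small Python sketch of the expansion step. It is not Leo's or noweb's actual code: the SECTIONS table, the expand function, and the body given to << do something complicated >> are all invented for illustration.

```python
import re

# Hypothetical chunk table: section name -> body text.
SECTIONS = {
    "root": (
        "done = False\n"
        "result = None\n"
        "while not done:\n"
        "    << do something complicated >>\n"
        "print(result)\n"
    ),
    "do something complicated": (
        "result = 6 * 7\n"
        "done = True\n"
    ),
}

# A line consisting only of a section reference, with optional indentation.
REF = re.compile(r"^(\s*)<<\s*(.+?)\s*>>\s*$")

def expand(name, sections, indent=""):
    """Return the text of section `name` with all references expanded inline."""
    out = []
    for line in sections[name].splitlines():
        m = REF.match(line)
        if m:
            # Splice in the referenced section, keeping the reference's indentation.
            out.append(expand(m.group(2), sections, indent + m.group(1)))
        else:
            out.append(indent + line)
    return "\n".join(out)

derived = expand("root", SECTIONS)
print(derived)
```

Because the section body is pasted in textually, it can read and assign done and result exactly as if it had been typed inside the loop, which is what distinguishes a section from a function call.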
One's view of the proper role of documentation in a project hardly matters to Leo. You are free to use comments as you always did, though you will probably find that LP as implemented in Leo helps you out in unexpected ways. I discuss at length and in great detail the relationship between traditional LP, comments and Leo here. In short, discussions about the role of comments in programming (literate or not) do not get to the heart of Leo.
In fact, Leo often reduces the need for comments. Indeed, it is good style to organize Leo outlines like a reference book. Well-designed Leo outlines act both like self-updating tables of contents and self-updating indices. This is in marked contrast to the "stream-of-consciousness" or "narrative" style typically employed in traditional literate programming.
In my view, the essence of Leo is this: Leo makes outline organization the most important part of a program or a project. Both code and documentation could be considered secondary. At every moment, the overall big picture of a function, class, module, file or project is always at hand. Moreover, Leo makes outline structure a part of the computer language. For example, I often define a Python class as follows:
The @others directive acts as a reference to all the text in all the outline nodes which are descendants of the node containing this class declaration. Such nodes are copied to the output (derived) file in the order in which they appear in the outline. The reference << declarations of myClass >> ensures that those declarations precede the methods. There are several other ways that outline structure is important in Leo; I won't discuss them here.
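As a rough illustration of how @others and a named section could combine when a derived file is written, here is a simplified Python sketch. The Node class and write_node function are hypothetical and are not Leo's implementation; in particular, this sketch reaches deeper descendants only through a child's own @others line, which is an assumption made to keep it short.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    headline: str
    body: str = ""
    children: list = field(default_factory=list)

def is_section(node):
    """A node whose headline is << name >> defines a named section."""
    h = node.headline.strip()
    return h.startswith("<<") and h.endswith(">>")

def write_node(node):
    """Expand one node's body into a list of derived-file lines."""
    out = []
    for line in node.body.splitlines():
        stripped = line.strip()
        indent = line[: len(line) - len(line.lstrip())]
        if stripped == "@others":
            # All non-section children, in outline order.
            for child in node.children:
                if not is_section(child):
                    out.extend(indent + l for l in write_node(child))
        elif stripped.startswith("<<") and stripped.endswith(">>"):
            # Named reference: find the child that defines this section.
            target = next(c for c in node.children if c.headline.strip() == stripped)
            out.extend(indent + l for l in write_node(target))
        else:
            out.append(line)
    return out

cls = Node(
    "class myClass",
    "class myClass:\n    << declarations of myClass >>\n    @others\n",
    [
        Node("method1", "def method1(self):\n    return self.count\n"),
        Node("<< declarations of myClass >>", "count = 0\n"),
        Node("method2", "def method2(self):\n    return self.count + 1\n"),
    ],
)

derived = "\n".join(write_node(cls))
print(derived)
```

Note that the declarations node sits between the two methods in the outline, yet its text lands first in the output because the explicit reference comes before @others in the class body.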
Leo fully exploits the organizational power of outlines. A single outline typically organizes an entire project. Outlines can handle large amounts of data with ease. Moreover, it is possible to clone any part of an outline so that changes to one clone affect all other clones. This feature makes it possible for a single outline to contain multiple views of a project. For example, when fixing a bug, I clone all nodes related to the bug and gather them in a new part of the outline, called a task node. This task node effectively becomes a view of the project that focuses exclusively on the bug. Any changes I make to code are propagated to all other clones.
Earlier I mentioned that a well designed Leo outline acts like self-updating tables of contents and self-updating indices. Tables of contents you get for free: an entire outline is the table of contents. Clones create self-updating indices. For example, each task node acts like the index entry for that particular task.
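One way to picture clones is as the same node object appearing under more than one parent. The Python sketch below assumes that model; the OutlineNode class and the node names are invented for illustration and say nothing about how Leo stores clones internally.

```python
class OutlineNode:
    """A node with a headline, body text, and child nodes (hypothetical)."""
    def __init__(self, headline, body=""):
        self.headline = headline
        self.body = body
        self.children = []

project = OutlineNode("project")
parser = OutlineNode("parse_args", "def parse_args(argv): ...")
report = OutlineNode("write_report", "def write_report(rows): ...")
project.children += [parser, report]

# A task node gathers clones: the *same* objects, not copies.
task = OutlineNode("task: fix bug in argument parsing")
task.children.append(parser)
project.children.append(task)

# Editing through the task view changes the node everywhere it appears.
task.children[0].body = "def parse_args(argv):\n    return argv[1:]"
print(project.children[0].body)
```

Here the task node plays the role of an index entry: it lists exactly the nodes relevant to one bug, and it stays current because there is only one underlying node to edit.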
- Edward K. Ream
i read some of the LP examples and got totally lost. how the heck can i understand the overall picture with so many more words around.
if you want to document how to multiply matrices, pls give a good book name so i can understand it. the compiler doesn't need to understand matrices to do its work.
ADD 1 TO COUNTER GIVING INDEX.
or one of my personal favorites:
PERFORM UNUSUAL-ACTS UNTIL IT-STOPS-FEELING-GOOD
All of the schemes under discussion seem to define documentation as text. Is there any system which also includes graphics? Not necessarily the much-maligned flow-chart, but rather support for the kind of quick sketches often found on the whiteboard in a programmer's work space -- ie the diagram useful to the person writing the code in the first place, but which never finds its way into documentation.
Please read my posting, "The creator's view of Leo."
Edward K. Ream
None of that is true for technical writing. It's a discipline unto itself. It's not just about good writing. (I've known computer scientists who'd written award-winning papers and articles, but couldn't write technical docs worth beans.) It's about understanding your audience and the (often painfully boring) task of writing in the clearest possible language.
Not every project needs technical writers. If you're a small software shop, and you're building a set of components with an uncomplicated API, and hiring a professional writer isn't cost effective -- then yeah, use Javadoc or some other LP tools.
But for big projects... Back in 1998, I was in charge of production for the doc set of a large Java framework. Having the API docs embedded in the source code was a nightmare. Javadoc was supposed to allow any of the engineers who wanted to to do their own API docs -- but many botched it, because they didn't understand Javadoc or HTML very well. We had professional writers, but many of them couldn't be trusted with source code. Hell, some of them didn't understand why they couldn't edit the SCCS archives!
Worst of all was when the release cycle entered code freeze. Document freeze is always later than code freeze -- but you cannot let people modify the release code base during code freeze. The only solution was to split the source, then merge the docs back in after release. Very painful.
still haunts us poor programmers. Backwards compatibility is hard to sacrifice.
Do you believe in life after death?
This is absolutely on the mark.
I believe that WEB was a great improvement over Pascal at the time that Knuth began to use it. However, it does not solve the underlying software engineering problem. Knuth's style at the time of TeX, etc., involved very little abstraction.
The biggest problem this causes is that the major data structures in TeX do not have well-defined or factored interfaces that allow them to be easily changed or extended. Furthermore, important details of these data structures are basically undocumented, and often cause interdependencies between different portions of a WEB that are not at all obvious.
If you wish to see the problem face-to-face, look through TeX: The Program at the "inner loop" and see how many different sections of the WEB that you would have to understand.
A similar problem is his use of enumerations with certain magic values, where the magic is documented (or becomes apparent, while still undocumented) some distance away from the point of definition.
Another serious problem with WEB is that it allows one to completely obscure the sequential nature of the program. Many times, one chunk depends on initialization that was performed by another chunk. If Knuth decided to make some laconic comment rather than remind you of that initialization, good luck reconstructing the sequential dependencies.
If one is writing monolithic programs, writing them like a Russian novel might be easier to comprehend than one large unformatted source file. However, if one has the alternative of writing a highly modular program with clean interfaces, I don't really see any advantage to breaking up and rearranging the underlying code.
... is the code itself.
;-), and definitely over simplified (you shouldn't have to read the entire source code of Windows to know what one of the API calls does / is meant to do (although if some recent posts are anything to go by, it would probably be wise...)
Probably a highly controversial opinion
But for me, over documentation is *far* worse than under documentation... code is a natural form of documentation - why? because it is a set of instructions that are followed blindly to the letter... it describes *exactly* what the computer is doing (although when calling some third party library / API, you are trusting the function to do *exactly* what the author claims it is supposed to)...
What can be better than that? Certainly not half a page of drivel, of what someone vaguely thinks the code is doing...
Of course, there are caveats - for starters, sensible, descriptive (although hopefully not too finger-blisteringly long) names for functions and variables are necessary, as is a clear, consistent layout of the code (and that means none of the old K&R style opening a program block on the line of an if / while / etc statement - block starts / ends should *always* be on clear lines at the same indentation level!!)
Javadoc-style comments are handy and useful though - handy because they put space in between functions for easier navigation of a file, and for automatic generation of your API documentation... useful, as it ties the 'specification' (ie. what the code was meant to do) with the code itself - so when it doesn't do what it is supposed to, you don't have to go hunting for the specs...
Other than that, if you need much more in the way of comments, then you can be pretty sure that the structure of the code is wrong... (unless of course speed is critical, and that is the best optimised code)
Leo really isn't all that much about either literate programming or documentation; it's about structure and clarity. I explain what Leo is in my posting, "The creator's view of Leo."
Edward K. Ream
Amen to that!
Here is one such problem in CWEB:
If you want to add new category codes for the tokenizer, Knuth has allocated all 127 lower-ASCII codes to signify the 126 possible input characters, plus a special sentinel value for 127 (DEL).
Codes above 127 are interpreted in a special way as a tricky encoding of pointers into a symbol table. If I want to introduce a new value, I must make it larger than 127, and then add lots of tests (if statements) at various places in the code to check for my special new value and not interpret it as an encoded pointer. There is not a single centralized place to do this, and it would have been very easy to implement a simpler two-stage token-lookup routine (first check whether the catcode == encoded_pointer, then look up the pointer via an ancillary value).
The code uses a lot of "magic numbers" that are neither #defined symbolically nor explained in a central point in the documentation, but scattered throughout, so the programmer must comb it looking for places at which bugs can arise.
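For illustration only, here is a minimal Python sketch of the two-stage lookup described above, with symbolic names in place of magic numbers. Every name in it (CatCode, Token, SYMBOL_TABLE, lookup) and every value is hypothetical; none of it is taken from CWEB or TeX.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class CatCode(IntEnum):
    """Hypothetical category codes; names and values are illustrative only."""
    LETTER = 1
    DIGIT = 2
    DELIMITER = 3
    ENCODED_POINTER = 4   # the token refers to a symbol-table entry
    MY_NEW_CATEGORY = 5   # a new code is added in exactly one place


@dataclass(frozen=True)
class Token:
    catcode: CatCode
    value: int  # a character code, or a symbol-table index for ENCODED_POINTER


SYMBOL_TABLE = ["begin", "end", "while"]  # stand-in symbol table


def lookup(token: Token) -> Optional[str]:
    """Stage 1: check the category. Stage 2: follow the pointer, if any."""
    if token.catcode == CatCode.ENCODED_POINTER:
        return SYMBOL_TABLE[token.value]
    return None  # ordinary tokens carry no symbol-table entry
```

Because the category is stored separately from the value, a new category never collides with the pointer range, and the "is this a pointer?" test lives in one function instead of being scattered through the code.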
As you say, you don't need OOP, but a better job of structured programming would have been nice. Then again, this is a program by someone who wrote a whole paper on "Structured Programming with go to Statements."
Maybe it *would* work, if we created programs in a different manner than we presently do. A lot of the process of creating a program is internalized.
Going from vague idea to finished tool. How about going from idea management, through tools that help decompose, down to output of code and documentation? Keep in mind the highly iterative (as well as cooperative) process all up and down the chain, resisting the urge to work from bottom to top until the refinement phase (optimization) nears.
*All steps available for inspection.*
Also bring the social methods of engineers over.
Instead of thinking "Oh this is cool to try". Think instead "this is an interesting idea, but it could have a hidden cost". Wouldn't want that "code" bridge to fail under pressure.
True, however, there isn't a particular process that *forces* the programmer to write "what he means".
Or at the very least makes the process of documentation transparent enough that it gets done.
Make coding indistinguishable from documenting and we'll have solved the problem.
I find this entire discussion pretty depressing as it demonstrates pretty dramatically the extremely poor state of software developer documentation.
Good developer documentation facilitates a great deal of those things that our managers keep crying for... quality software, produced quickly and efficiently, that does what our customers want.
Literate Programming was a terribly elegant way to integrate the documentation that we all should be writing into code, in a way that suggests it would be far less likely to fail to keep documentation and code up to date and synchronized. This is VERY different from the current models of extractable embedded documentation, which I frankly don't find add much. The problem is that we frequently look to tools to automatically generate the documentation when only a fully skilled human developer can really explain what the intent and thought is behind each piece of code. Why is this the algorithm chosen? What others were examined and discarded, and for what reasons? These are all far more than describing the members of a class and assuming the reader can guess the USE of that class.
That's the idea ... try to lay out your design and flow before whacking at it with code. Not enough coders put a lot of thought into the design before they start the problem.
Maybe the reason Literate Programming never took off is because it wastes the time of good developers. I rarely have trouble reading code written by myself or the other developers at my work, and I can even completely rewrite someone else's software and get it almost all right.
I guess it would be useful for novice programmers who do not know how to write useful comments (that is, 'self-documenting' code, as well as actual comments).
If I spent the time writing document outlines and program plans and crap beforehand it would be just that, spending time for no reason; not to mention the design changes that often go on as you are actually writing code; the last thing in the world I want would be to have to go back and change all these plans because I decided to change an aspect of what I was doing.
As a computer science tutor, I see a lot of code, and people ask me to fix it. Sometimes I will read their comments, but usually I just look at the code because it is easier to understand. A good programmer shouldn't need to put a comment on every line, and even comments like this can be annoying:
/*switch a and b*/
temp = b;
b = a;
a = temp;
Hopefully your comments clarify some subtle algorithm or explain the general outline of what you've done. They shouldn't become a dissertation!
Knuth not only writes good code, he also writes interesting code, and usually that's too time consuming for regular mortals.
I seem to remember Knuth submitting some literate programs for a pair of Jon Bentley's "Programming Pearls" columns [collected in "Literate Programming", online here and here].
The latter contains Knuth's solution to an assignment by Bentley: "Given a text file and an integer k, print the k most common words in the file (and the number of occurrences) in decreasing frequency."
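To make the assignment concrete, here is a small Python sketch of the task as stated. It is neither Knuth's literate program nor McIlroy's pipeline; it leans on the standard library's Counter, and the function name top_k_words and the rule that a "word" is a run of ASCII letters are assumptions made for this example.

```python
import re
from collections import Counter


def top_k_words(text: str, k: int) -> list[tuple[str, int]]:
    """Return the k most common words with their counts, most frequent first."""
    # Assumption: a word is a maximal run of ASCII letters, case-insensitive.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common(k)


if __name__ == "__main__":
    sample = "the cat and the hat and the bat"
    for word, count in top_k_words(sample, 2):
        print(word, count)
```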
Knuth's submission is a beautiful work of exposition, introducing and explaining an unusual data structure (the hash trie), and in general would not look out of place framed and hanging on the wall above one's dining table.
However, in his critique of the program, Doug McIlroy provides a solution to the problem using a simple unix pipeline that takes up less than a paragraph. McIlroy finishes his critique with the following remarks:
Simon
I just skimmed through the first two introduction slidesets for Leo, and I thought it was quite cool to split your program into nodes, so you can quickly and easily skip sections of code you're not working on...
But what about when compiling? Your compiler gives you a line number for the source file, and you need to edit that file for the line number to be any use, and then I think you'd need to manually re-locate the right node based on the code around the error... not fun in a larger program. Does Leo provide any way to translate the error line numbers back?
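One way such a translation could work, sketched in Python: keep a table of which generated-file lines came from which node, and search it for the line the compiler reports. This is purely a hypothetical illustration and says nothing about what Leo actually does; NodeSpan and locate are invented names.

```python
from bisect import bisect_right
from typing import NamedTuple


class NodeSpan(NamedTuple):
    name: str        # outline node the lines came from (hypothetical)
    first_line: int  # 1-based line in the generated file where the node starts
    length: int      # number of lines the node contributed


def locate(spans: list[NodeSpan], file_line: int) -> tuple[str, int]:
    """Map a generated-file line to (node name, 1-based line within that node).

    `spans` must be sorted by first_line.
    """
    starts = [s.first_line for s in spans]
    i = bisect_right(starts, file_line) - 1
    if i < 0:
        raise ValueError(f"line {file_line} precedes every node")
    span = spans[i]
    offset = file_line - span.first_line
    if offset >= span.length:
        raise ValueError(f"line {file_line} does not belong to any node")
    return span.name, offset + 1


if __name__ == "__main__":
    spans = [NodeSpan("imports", 1, 3), NodeSpan("parse", 4, 10), NodeSpan("main", 14, 5)]
    print(locate(spans, 7))  # a compiler error reported at line 7 of the flat file
```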
This sig is part of your complete breakfast.
I suppose the reason why Literate Programming has not caught on is simply that no major programming language forces you to do it.
Given the right tools within a programming language---say, a documentation tool such as JavaDoc, and some code and commenting conventions---and proper understanding of some software engineering methods concerning the thoughtful design of your software, it is quite possible to achieve what Literate Programming tries to achieve.
But Java, for example, doesn't require you to build a proper UML model, follow the code conventions, and JavaDoc everything in a way understandable for others. But nobody stops you from using those methods right now. The problem is just that doing Literate Programming---or, for that matter, any kind of proper, thorough documentation---eats up a lot of time, since easily around 50 percent or more of the total time spent on a project is concerned with documentation. And for most programmers, including me, it requires quite some effort to be disciplined enough to do such "proper" software development thoroughly.
In other words, it might be helpful to use a Literate Programming tool that forces you to document your stuff, but it is still up to you to create a proper design and documentation for your software.
Alright then, how about wear and tear on the keyboard. Not to mention the huuuge bandwidth saving. ;)
You are a well-known advocate of collections, especially associative or keyed collections, as an alternative to OO.
But unless your collections are primitives in your language system, aren't object classes (or perhaps C++ templates) a proper way of managing that separation?
I believe most code is simply not commented properly. I have been trying for a while to teach myself to do some basic stuff in Python with little success. The main problem is that most code is poorly commented if it is commented at all. There is no telling how many times I've seen things like this:
# Now we tie up some loose ends
or
# I inserted this to fix a few things
No explanation is given as to what the code actually does, or if there is one, it doesn't say specifically what is going on. I know that most programmers aren't writing comments for the sake of students trying to learn from their code, but I believe if they did so, things would be much clearer for everyone.
Smeghead every day of the week.
What, like structured programming? Or - God forbid! - Object Oriented Programming! That crap's all just too damn slow to run and too damn difficult to code!
Sure, LP's not perfect now... but if you bother to learn only what's currently in fashion, by the time you've learnt it, the Next Big Thing has arrived...
That said... you're perfectly correct when you say that no tool - be it OOP, LP, or any other silly acronym - can prevent bad programmers writing bad code - it can only help good programmers write better code.
There's no $$$ in 'team'...
www..--..net - for incisive, w
All right, I agree with most of what you said, but lay off the indentation style! Some people actually like K&R style blocks! Some people even think *gasp* putting the opening of the block on a new line is ugly, annoying, and useless!
I have one friend who hates it so much that if he has to work with code like that, he'll spend half an hour reformatting to K&R style.
Perhaps that's a bit of overkill, but really, it just doesn't matter!
I'm not a big fan of abusing Java's interfaces (an interface for each implementation hierarchy), but in a big project that has to be properly documented and strictly specified, this would seem to help.
The interface is after all closer to the specification level, so your documentation can be strictly about the specification. Then you can let the programmers code, document and freeze the implementation independently from the interfaces.
Since an interface doesn't have any implementation sourcecode, writers could be trusted with the files, and since the interface API per se is frozen at design, they can keep modifying the Javadocs without affecting the coders.
If the writers have to modify the API per se and recompile an interface, they are changing the specification (re-design) and of course the coders are forced to adapt their code to those changes.
But otherwise there would be no need to "split the source" and then "merge". All you would have to do is provide the Javadocs for your interfaces (plus a manual based on this, perhaps) and the Javadocs for your implementation (if implementation details are to be exposed, such as efficiency guarantees, etc).
If anything, I would think the split would improve documentation.
Freedom is the freedom to say 2+2=4, everything else follows...
I may be a bit confused here:
.NET?
What exactly is it that you cannot do with your source code in Visual Studio?
I don't mean that the wisdom of MS has allowed them to put all functionality ever needed in Visual Studio.
I mean that every time I checked my source code was still there in a flat file, and I could modify it with a text editor, a perl script, or whatever I wanted.
I haven't seen any repository system from which I have to import/export source code or anything like that. Am I missing something?
Freedom is the freedom to say 2+2=4, everything else follows...
I _am_ a programmer, and what you suggest is also the way I prefer to write code, only it's perhaps slightly less experimental in my case and I don't have to reach for the language reference manual quite so often.
What a long, strange trip it's been.
Repeat until enlightened: Unless you have an unlimited amount of programmer time to expend, long-term performance is a consequence of maintainability.
The "big" performance improvements are generally algorithmic in nature. They're what you get when you replace an O(N^2) algorithm with an O(N log N) algorithm, or when you replace an O(N log N) algorithm with an O(N log N log log N) algorithm which exploits cache coherency much better.
When you find that you need to replace one algorithm or data structure with another (and you will), you might think twice before doing that to an unencapsulated implementation, whereas doing it to an encapsulated implementation will be much cheaper. Moreover, with an encapsulated implementation, you can play with several different implementations until you find the best one for the situation at hand.
Yes, your code may have performed better had you written an unencapsulated version of the most appropriate algorithm/data structure to begin with. What a wonderful world it would be if software engineers were also clairvoyants. :-)
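A rough Python sketch of the point about encapsulation, under invented names (ListSet, SortedSet, count_shared): the calling code only uses add() and the `in` operator, so a linear-scan container can be swapped for a binary-search one without touching the caller.

```python
from bisect import bisect_left


class ListSet:
    """Unsorted list: each membership test is a linear scan, O(N)."""

    def __init__(self) -> None:
        self._items: list[int] = []

    def add(self, x: int) -> None:
        if x not in self._items:
            self._items.append(x)

    def __contains__(self, x: int) -> bool:
        return x in self._items


class SortedSet:
    """Sorted list: membership is a binary search, O(log N).

    Insertion still shifts elements, so this is only a sketch of the idea.
    """

    def __init__(self) -> None:
        self._items: list[int] = []

    def add(self, x: int) -> None:
        i = bisect_left(self._items, x)
        if i == len(self._items) or self._items[i] != x:
            self._items.insert(i, x)

    def __contains__(self, x: int) -> bool:
        i = bisect_left(self._items, x)
        return i < len(self._items) and self._items[i] == x


def count_shared(a: list[int], b: list[int], make_set=ListSet) -> int:
    """Count elements of b that also occur in a, using only add() and `in`."""
    seen = make_set()
    for x in a:
        seen.add(x)
    return sum(1 for x in b if x in seen)
```

Calling count_shared(a, b, make_set=SortedSet) changes the cost of the lookups but not the result, which is what makes experimenting with several implementations cheap.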
No, you don't. You at least need abstraction and at most you need encapsulation. These are a subset of the problems that OOP solves, but OOP is not the only solution to those problems.
So while you don't need OOP, if you're working with a system which already has it, you might as well use it unless there's a good reason not to.
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
Of course it should. Anywhere that the code does not "document" the program, the programmer needs to provide explicit documentation or write clearer code. That's very very simple if you just consider that the code *is* the program. What documentation could be more correct than the code if you want to know what the program does?
Very nice idea, but I've never seen this work in practice. The problem is basically this: You just introduced an entirely different vector under which bugs can occur. These new bugs do not break the program in any visible manner, they just destroy the productivity of a maintenance programmer.
Example... Client decides to change the behavior of a component of a system after the program has been released. Programmer goes in and changes the code, but never changes the documentation. While making the change, the programmer notices that another requested change will cause the first piece of code to break in a very subtle way if it had not been changed.
6 months later, another programmer makes some changes and notices that a line of code is clearly not doing what the documentation says it should be doing. Feeling almighty and powerful, he modifies the line so it fits the documentation, thus overwriting the change from 6 months ago and introducing a subtle bug.
What happened? If documentation is not kept up to date (one more bug vector to worry about), the maintenance programmer will either lose time verifying which is correct or will "fix" a bug that shouldn't be fixed. Neither case is a good thing. Documentation bugs are especially bad because they are so transparent.
So which is worse? I would warrant that the second is far worse. Any documentation that can be embedded directly into the statements of the language through variable names, function names, language constructs, data structures, etc. is a great benefit, as it cuts down on the number of instances where doc bugs can occur.
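As a small hypothetical illustration in Python (the names and the 30-minute timeout are made up for this example), compare a function whose meaning lives in a comment with one whose meaning lives in its names. Both behave identically, but only the first has a comment that can silently go stale.

```python
def check(u, t):
    # returns True if the user has been idle for more than 30 minutes
    return t - u > 1800


IDLE_TIMEOUT_SECONDS = 30 * 60


def is_idle(last_activity_ts: float, now_ts: float) -> bool:
    return now_ts - last_activity_ts > IDLE_TIMEOUT_SECONDS
```

If the timeout later changes, the second version has one constant to edit and no separate sentence that must be remembered and kept in step with it.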
Nonetheless, it is definitely necessary to document particularly nasty chunks of code, but keep the documentation at the highest level possible. If it is a system overview comment, it belongs in a design doc. If it describes how a black box *should* work, it belongs at the highest level of that box (ie, class or function). If it is a single line genius piece of code, by all means, document it. But be assured that anyone who changes it will probably have a tough time understanding it, will have to read the documentation, and will likely remember to update the documentation out of appreciation if nothing else.
If this is tough to accept, look at one of your own statements..
But - every time I add NEW requirements, you'll have to massively modify the code. As the code becomes more and more complex, it will have more opportunities to gain bugs.
Really? And so as it becomes more complex, it will likely need documentation to describe the additions that are being glued to the side of the original design, right? Oh, right. Maybe it will remain constant?
All this while, though, the documentation might remain constant, if it's written clearly enough.
Yes, it might. Are you going to remember to check it, especially if it is right *most* of the time...
So, there you go
I will make one concession to this, in that it can depend on the editor you are using...
If your editor does brace matching, then it *may* not matter about how you indent/format blocks...
If you ever have to look at code as just a plain text document, then K&R blocks can be a *severe* problem...
K&R style blocks make it much harder to see where braces match... this is especially true if you ever use default program blocks (ie. an if statement with a single statement in the 'then' clause, without the use of any braces)
Yes, you can mitigate the problems greatly by always explicitly defining your blocks (ie. always using braces, even where they are not strictly required by the language), and being absolutely strict about how you indent code (esp. with regards to spaces vs. tabs)... however the second point can be quite hard to enforce, when you may need to use a variety of editors in different circumstances, and you have a reasonable size team working on the project...
As one of the most important aspects to understanding how code works is to be able to see where the program blocks are, and exactly what they are doing - not to mention the number of bugs caused by errors in the definition of program blocks - K&R style formatting can be the cause of many headaches, and result in a lot of wasted time / effort.
At least the link provided requires it to get to any actual information.
I think I'll just remain an illiterate programming Luddite, thanks. It's worked so far.
Well, it does make programs more extensible, but if you are having race conditions or other delicate problems, tough. Another problem is that I often make 2000-line functions unwittingly --- although they are no harder to understand, they take much more time to compile with gcc.
I think I've arrived here a little late, but here's my contribution:
It seems to me that a lot of people who've posted hold the opinion that documentation and coding can (or should) be done at the same time. I think that what is most important in a programming project (as in any other engineering project*) is the design. Documentation != Design. The design is an essential part to most projects bigger than "Hello World!" Without properly designing your project before beginning the coding/documentation phases you will end up with a mess of code and only an idea of how it all ties back together in your head.
* Yes, programming is a form of engineering, no you're not designing a building or a processor, but when you're talking about a large scale application or operating system it might as well be a physical structure. Without using proper techniques most projects will end in failure.
In addition IE is required for the pages that have speech, since that uses MS Agent, and there is no comparable technology for Netscape.
You used MS-Agent and technology in the same sentence! I can think of a portable system which provides text-to-speech (festival), and it's certainly possible to provide the page without "speech", or with a few sound clips in ogg or mp3.
Also, text-to-speech is trivial for some languages, such as Japanese. (trivial as in a perl script and a directory with sound clips could probably do it in real time)
Well, mostly my problem is that I have no way of getting any version of IE going, and I really wanted to try the language sections, so I am bummed
3 characters versus 4 every so often, though I guess it's worth it if you're using 300 baud (or less) where every character counts.
The reason geeks don't like writing too much documentation is simple. It's not laziness (well not always), it's just tedium with human language (or poor speeling and grammer to).
Project completion includes documentation.
Professionalism demands it.
Always.
Full steam ahead.
The worst cause of feature creep and software bloat is delineated in your rationalization. The "simple" change should be to your documentation, and then the code should be updated to reflect that change. Note that that change in the spec is usually considered a valid reason for an increase in the revision number.
The worst thing in the world for a software company (profit motivated) is a moving feature set, and never reaching 1.0.
Of course the documentation will change. I forgot to mention how important docs are in the process of "change management". It's like war: no battle plan survives the first engagement with the enemy, but that is no reason not to have one. As long as you change the docs to reflect the new features/behaviour, there is no problem with docs "getting out of date" with respect to the code.
Besides, if you are writing the docs and someone notices a glaring issue, you can resolve it before telling someone to start coding. The earlier, the better. And you WILL do the same changes later in the project anyhow, with a few hundred percent increase in the workload.
This is why (most) programmers make horrible tech writers: they are too involved in the code to be concerned about issues that affect usability and project management.
I've got a bad attitude and karma to burn. Go ahead. Mod me down.
Leo is great! It works with whatever toolchain you are already using, and doesn't get in the way.
You could write the software, help manual, and project manage it all from file exported through Leo.
Bravo, I am definitely going to try this one out.
Because it's a text markup format, not a programming language.
I think I'll go write a 10MB technical report as one big nested S-expression. Yeah, right.
One clue too few...
People already want the computer to do what they want instead of what they say. I wonder if they still will complain 'that is not what I said.'
Young Jedi,
Why waste your time with these losers? Enjoy the last moments of summer before it retreats.
-Master
If you REALLY need to do something that the IDE isn't capable of then you can write your own tools. VAJ provides a Java API for this and it is really easy to use. You can then make the IDE do anything that you want, as long as you are able to code that behavior in Java. You can also use vi and Emacs to edit your code in VAJ if you really want to.
Lasers Controlled Games!