Writing Documentation: Teach, Don't Tell
Programmer Steve Losh has written a lengthy explanation of what separates good documentation from bad, and how to go about planning and writing documentation that will actually help people. His overarching point is that documentation should be used to teach, not to dump excessive amounts of unstructured information onto a user. Losh takes many of the common documentation tropes ("read the source," "look at the tests," "read the docstrings") and makes analogies with learning everyday skills to show how silly they can be. "This is your driving teacher, Ms. Smith. ... If you have any questions about a part of the car while you’re driving, you can ask her and she’ll tell you all about that piece. Here are the keys, good luck!" He has a similar opinion of API documentation: "API documentation is like the user’s manual of a car. When something goes wrong and you need to replace a tire it’s a godsend. But if you’re learning to drive it’s not going to help you because people don’t learn by reading alphabetized lists of disconnected information." Losh's advice for wikis is simple and straightforward: "They are bad and terrible. Do not use them."
There's a difference between a tutorial and documentation? Who'd have thunk!
From TFA: The purpose of technical documentation is to take someone who has never seen your project, teach them to be an expert user of it, and support them once they become an expert.
No. Experts in their field shouldn't need to be taught how to understand your system; that's part of being an expert (or indeed, even a professional). All documentation should be doing is explaining the sticky bits and providing details and/or examples (whichever is relevant).
Just my opinion of course. But having stepped into countless networks/codebases, I can tell you that I just get annoyed when the documentation gets in the way of the information I need to complete my job.
Mod me down with all of your hatred and your journey towards the dark side will be complete!
Do you really want to read the source code for ssh every time you forget whether it's -p or -P to specify the port? (It's -p for ssh and -P for scp...)
As an author of three successful dead-tree programming books, I have a few observations.
1) I use the electronic versions myself because of easy search (better than an index) and copy/paste.
2) In book format, it's possible to lead a reader through topics in a sensible order that builds on prior topics.
3) The challenge with electronic/on-line documentation is that there is no expectation that readers will approach the material in any particular order. Readers type a search term into Google and up pops a page or two of documentation. How can the author make safe assumptions about the definitions of terms and prior conceptual knowledge the reader will have? Adding links to the definitions of terms and links to chapter-oriented conceptual documentation doesn't usually help because readers are impatient, and there is no good place in the middle of the documentation to start.
4) Many readers don't know the terms to type into Google and therefore aren't led to the relevant conceptual documentation even if they would have read it had they known.
The article misses something critical: the use of common language.
The biggest example of this is HIS use of "user". A "user", for all my time in this business, is the person sitting in front of the computer using the software to perform a business function... not programming a business function.
This leads to a second issue: a developer is a person who uses an embedded function written by another developer, generally one with a higher skill, or at least a skill that peaked before the new developer. That developer is trying to come up to speed on the cause and effect of using a given piece of code, trusting that the original developer actually did his job right.
How can you trust a person who says what documentation should be but cannot teach by following his own rules? The first rule of any teaching is defining the terms you are going to use, so that others know what you mean. You see this in legal documents, to prevent confusion when common terms are used in a more defined or limited manner.
In my opinion, Stack Overflow is most often the blind leading the blind. There will be 20 wrong answers, 10 answers to the wrong question, 2 suboptimal solutions, and if you are in luck there will be 1 good solution. Now, tell me which is which. It seems to me that the good answer is almost always buried under crap.
Stack Overflow questions are often badly stated and difficult to find even with more correct search terms. If you don't even know the search terms, the site is useless.
There have been a few times when Stack Overflow saved me a lot of time. There have been many times when Stack Overflow has been a pointless time sink.
I agree with your superficial point...who would have thunk it indeed
However, I think it is still a distinction without a difference, one that only causes more confusion.
You and I know what 'documentation' in the computing sense means, but it's not a logical concept for a non-techie.
'documentation' in computing can be as simple as the coder showing his/her work and making a formal log of changes and bug reports/fixes
however, and here is where the problem happens...what constitutes 'sufficient documentation' for a coder is *not* sufficient for a user!
the problem is, that to bridge the gap, most programmers (who are not at all schooled in education theory & by nature tend anti-social) must create some sort of 'document' beyond the 'documentation' for the end user
sometimes this takes the form of a 'tutorial'
a 'tutorial' is not full instructions...it is a real-time step-by-step *demonstration* which may have supplementary material that is actually instructions
ex: I can make a video with the steps to start a car, put it in gear and how the brake and throttle work...a person, with *nothing* else but that video and factory plans of the car *could* learn to use it...but calling a basic video and factory plans 'instructions' is insulting!
'documentation' can be helpful
'tutorials' can be helpful
'help' menus can be helpful
even so, it's not a full user manual that an end user in other industries would expect
the computing industry has decades of work in this area I fear...so many have gotten used to doing it a bad way
Thank you Dave Raggett
peaked before the new developer.
Freudian slip much?
Have gnu, will travel.
Since when did the car owner's manual teach the owner how to drive?
I work for an airline. We train pilots on our aircraft and our procedures. We certainly do not teach them how to fly.
I hate you! You're one of those co-workers that urgently e-mails me at 1 AM asking me how to use some utility I wrote. In the morning I reply, "Use the -h switch, you mother f*cker." Followed by my usual disclaimer: "Every utility I write has an -h switch, which describes the switches option-by-option, followed by a short description of the function of the utility, plus gives links to additional documentation."
And if you think you're going to find the -p switch in OpenSSH source code, good luck. Option argument handling is strewn about in several different files. I know, because I've had to hack on it and add options, as well as fix the parsing of forwarding option arguments, among other things. I've seen worse, but it's a long way from some utilities, where getopt and getopt_long processing is concise and easily readable.
Pro tip: readable source code has nothing to do with methods, classes, functions, or variables per se. It's the overall structure that counts, even if it's a single 10,000 SLoC function. Most C++ apps are harder to read than a gigantic ASM app.
Most people organize their code by what it literally does--by the components they learned in school or a textbook. They tediously break down blocks into a myriad functions and classes based on their algorithmic role. Or they farm out "parse_int", but then have a 200-line chunk of code processing a dozen different kinds of ints (ints for timeouts, ints for userid, etc).
I don't have many simple tips for alternatives. I just know that most people are doing it horribly wrong. I like to think my code is fairly easy to read--and people have told me that--but I know I could get better.
Okay.. one simple thing people could do more often--use fewer source files, fewer classes, etc.
Also, people abstract too early, before they understand what the meaningful abstractions are. So they end up with too many abstractions, creating too much complexity. People should begin to write their applications as quickly as possible, without worrying about structure--just functionality. It's only once you're about one third or even halfway through that you have an idea of how the whole application should be structured. That's when you start over, before it's too late to re-architect, but after you have a concrete idea of what's necessary and what's superfluous.
One problem I encounter all the time is what level of competence should be assumed? If I write "try ping host xyz" should I assume they can successfully pingtest something and interpret the results? For ping, yes maybe I should assume that, but what about grep? Grep isn't officially supported by the organization so...
I feel like I'm wasting my time writing instructions for simple tasks, but I also feel that I have to write as though a monkey is the intended audience. I hate to say it, but it's the godawful truth that there are too many people in IT who can only read-and-do.
Copyright 2010. All rights reserved. This comment may not be copied in any way including, but not limited to caching.
-h? Next time, use all three of these: -?, -help, --help. I'm probably not going to try throwing -h at a program without having a clue what it might do.
Command line utilities should print their documentation for any unrecognized or erroneous command line arguments (unless that's so lengthy they need to only print a subset).
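One way to get that behavior, as a minimal sketch: Python's standard argparse calls the parser's error() method for unrecognized or malformed arguments, so overriding it lets the utility print its full help instead of a one-line complaint. The utility name "app" and its options are hypothetical, made up only for illustration.

```python
import argparse
import sys


class HelpOnErrorParser(argparse.ArgumentParser):
    """ArgumentParser that prints the full help text on any usage error."""

    def error(self, message):
        # argparse calls error() for unrecognized, missing, or malformed
        # arguments; the default only prints a short usage line.
        sys.stderr.write(f"error: {message}\n\n")
        self.print_help(sys.stderr)
        self.exit(2)


def build_parser():
    # "app", --port, and host are hypothetical names for this sketch.
    parser = HelpOnErrorParser(prog="app",
                               description="Hypothetical example utility.")
    parser.add_argument("-p", "--port", type=int, default=22,
                        help="port to connect to (default: 22)")
    parser.add_argument("host", help="host to contact")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    print(f"would contact {args.host} on port {args.port}")
```

Calling main(["--bogus", "example.org"]) prints the error followed by the whole help text and exits with status 2. Because host is required in this sketch, running with no arguments at all takes the same path.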
Socialism: a lie told by totalitarians and believed by fools.
There's also something to be said for "sell, don't tell". I seem to recall that the Commodore 64 Programmer's Reference Manual was written as if they were enthusiastically pitching the product. For some reason, it was much easier to retain the information when the author gushed about it.
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
My last owner's manual was about 400 pages; but contained only about 10 pages of useful information.
10 pages of useful information to you maybe. Other people will find some other 10 pages useful.
My mathematics textbook is 600 pages but there is only 1 page with the information that I need to solve a problem. Doesn't mean the book has only 1 page of useful information.
To be honest, when I say that "not all wikis are Wikipedia", I'm still being a bit unfair to Wikipedia. Even that project - which puts, for all practical purposes, zero dollars into content creation - still has recognized that at least some regulation is a good thing, and that even if you give everyone read access, you don't have to give everyone write access. Further, with their Pending Changes process, not all changes to a page go live immediately--for certain pages a reviewer has to approve edits before they appear on the outward-facing Wikipedia article.
A private company or organization can appoint gatekeepers to control who can edit what, and who can approve changes. Moreover, they can pay people to edit documentation, and can impose requirements and standards across the project. Wiki software can provide a lot of 'back-end' support for creating complex, multi-page, potentially multi-media documents, using markup that is relatively straightforward to learn. It can provide clear, complete records of who changed what, when--and who is responsible for breaking stuff.
Sure, waiting for people to randomly show up and write documentation, and then accepting everything they give you without any validation or quality control is a recipe for failure. But that's not the only way to use a wiki. Linus doesn't let just anyone check stuff into the kernel.
~Idarubicin
I've been writing technical documents since the early 1970s.
You can't expect one piece of documentation to serve everyone ... it's like buying a "vehicle" and trying to use it to race, haul hogs, and climb Pikes Peak.
A - Ordinary users don't give a shit how the stuff works, they want it to do something for them ... tell them how to make whatever it is work as a tool for them. Run through the common use cases, screen by screen, showing them how to make the widget-smasher do its thing.
Start with things the way they should work, then give them some basic troubleshooting, maintenance, etc.
B - Administrative users: They need all of "A", and how to handle the other users. Add, remove and monitor users.
Start with things the way they should work, then give them some basic troubleshooting, maintenance, etc. for the added functions.
C - Service techs, sysadmins, and those who will touch the sacred code: All of A and B (by reference to the appropriate manual or section thereof) and then feel free to pile on the technocrapobabble.
Each detailed explanation should start with a very brief "statement of purpose" ... when will this command be needed, or what does the bit of machine do. Why would you use it?
Then explain how to use it, and the expected results if you used it right, the expected results if you screwed up, and how to recover from an error.
======== ... chronologically, what touches it and what is supposed to happen?
You need to explain for each level of user what happens to a transaction, or data, or a part being manufactured, as it passes through the process or the program.
What will you see if there is a failure?
How do you recover from the failure?
It's not difficult, you just have to make sure that each level of user can get their task done efficiently.
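For the "statement of purpose, then usage, then expected results, then recovery" layout described above, here is a rough sketch of what one entry could look like as a Python docstring. The function backup_file and the section headings are hypothetical, chosen only to show the shape; they are not taken from any particular style guide.

```python
import shutil
from pathlib import Path


def backup_file(path):
    """Copy a file to '<name>.bak' in the same directory.

    Purpose:
        Use this before editing a configuration file by hand, so the
        previous version can be restored.

    Usage:
        backup_file("settings.ini")

    If it worked:
        Returns the Path of the new copy. The original is unchanged.

    If it failed:
        FileNotFoundError means the path does not exist or is not a
        regular file.

    Recovery:
        Check the spelling of the path and run it again. To restore,
        copy the '.bak' file back over the original.
    """
    source = Path(path)
    if not source.is_file():
        raise FileNotFoundError(f"nothing to back up: {source}")
    target = source.with_name(source.name + ".bak")
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    return target
```

The point of the layout is that a reader who only needs "why would I use this?" can stop after the first section, while someone recovering from an error can jump straight to the last two.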
That's how I got my entree into writing about Linux. Programmers are very smart, but not very eloquent and they are also very poor teachers.
There are any number of rules and guidelines for writing documentation, most of which are ignored since documentation is often the red-headed stepchild of the project.
Documentation should tell a story clearly and help the reader understand the 'why' and 'how' as well as the 'what'.
"I believe in Karma. That means I can do bad things to people all day long and I assume they deserve it." : Dogbert
Yeah, this is one of the big annoyances with *nix man(1) pages.
Between "Name" and "Synopsis", they're all missing the "Typical usage examples" section. I shudder to think how much time has been wasted through this poor design choice.
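A utility's own built-in help can at least carry the examples that a man page lacks. As a small sketch using Python's argparse: the epilog argument plus RawDescriptionHelpFormatter keeps a hand-formatted examples block intact in the --help output. The "app" utility and its options are hypothetical, and argparse prints the epilog at the end of the help rather than near the top.

```python
import argparse

# Hypothetical examples for a hypothetical utility called "app".
EXAMPLES = """\
typical usage:
  app example.org            contact example.org on the default port
  app -p 2222 example.org    contact example.org on port 2222
"""


def build_parser():
    parser = argparse.ArgumentParser(
        prog="app",
        description="Hypothetical example utility.",
        epilog=EXAMPLES,
        # Without this formatter argparse would re-wrap the examples
        # into a single paragraph.
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument("-p", "--port", type=int, default=22,
                        help="port to connect to (default: 22)")
    parser.add_argument("host", help="host to contact")
    return parser
```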
I produce a lot of documentation along with my coding, and the one thing that makes it palatable (even to me, re-reading it) is illustrations.
I'm not talking about UML class or activity diagrams, although those things are great where appropriate. It could be anything relevant to getting your point across, like a fragment of a database table showing sample data so people can visualize how a group of tables will work together. Screen grabs with arrows and circles.
My rule of thumb: if I ever find myself drawing a picture on a whiteboard as I'm explaining my module to someone, I immediately stop and take a picture of the diagram I just drew, and ASAP afterwards I turn that picture into an illustration in the user docs. Then next time I can just whip out the docs and point to the illustration.
Koans and fables for the software engineer
And for no arguments. Or at least print what is required to get help
C:\>app
Crappy app 0.0.0.1a
GPL 2 (If you don't like it fix it yourself
For help type -?
C:\>app /?
Crappy app 0.0.0.1a
GPL 2 (If you don't like it fix it yourself
I said enter -?, not /? This program was barely ported using cygwin, so you have to use *NIX arguments
Don't like it, fix it yourself
C:\>_
Not documenting is a more or less conscious technique developers and projects use to increase their market value. In those projects where the business model is consulting, you better believe that unless it's a public API, it's got zero documentation.
I know this is true. If software developers wanted to, they could write a nice book about how their source code works (as opposed to how their program works or how to write a plugin for their program).
They don't. This is not merely a case of being rushed for time or whatever. They don't want anyone else to understand it really. You control what only you understand; every developer knows this intuitively. Going to great pains to write a book-style learning aid to your actual code so that *just anyone* could take your place.. well... do I really need to finish that sentence?
Devs (don't) do it so they can get some job security. Companies (don't) do it to get consulting gigs. No one will admit that this is what's going on, but it is. I can hear the rebuttals already.. I can see the down-modding about to happen... go ahead.. flamebait me down. .. knock yourself out.
The closest analogy I can think of is magicians guarding their secrets. That worked very well for a long long time. The incentive was the same. If you give this away, if you make it understandable, you're going to be out of work. Slightly different dynamic, but you get the idea.
This is one of the product spaces we want to move into, but as you might imagine we have come to the conclusion that we have to move carefully, and even obliquely.
This is not exclusive to software engineering either. I know for a fact (which means I have multiple, full confessions) that mathematics teachers are far less clear than they could be with their students. Some have said they do it to winnow out the weak. Others have admitted they do it to classes of students they don't like. They know it makes a huge difference how you explain something, and that no one can accuse them of anything. That's basically a kind of power or force you can wield for or against people who please or displease you.
Just sharing what I know.
You don't write the ping documentation yourself. You refer to it in the system reference manual. Somebody already wrote acceptable documentation for ping. You should study the ping man entry in, for example, the BSD user manual to see one way to write intelligible, useful documentation.
If you're advising the user to "try ping host xyz" you need to explain why and what to do if it returns the expected or unexpected results and what conclusions he can draw from them.
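The same idea can be carried into a script: rather than only running ping, say what the result suggests doing next. This is a rough sketch, not a diagnostic tool. The function name explain_ping and the wording of the advice are made up for illustration, and it assumes that an exit status of zero means a reply was received, which is how common Linux ping implementations behave but is worth verifying on your own platform.

```python
import platform
import subprocess


def explain_ping(host, runner=subprocess.run):
    """Ping `host` once and say what the result suggests doing next.

    `runner` exists only so the function can be exercised without a
    network; by default it is subprocess.run.
    """
    # Windows ping takes the packet count with -n; most Unix-like
    # pings take it with -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = runner(["ping", count_flag, "1", host],
                    capture_output=True, text=True)
    # Assumption: exit status 0 means at least one reply came back.
    if result.returncode == 0:
        return (f"{host} replied, so the network path is up; "
                "look at the service on that host next.")
    return (f"{host} did not reply (exit status {result.returncode}); "
            "check the host name, cabling, or firewall first.")
```

Calling explain_ping("xyz") then gives the reader both the observation and the conclusion to draw from it, which is the part a bare "try ping host xyz" leaves out.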
Oh cool! I know that shutdown -r -t 600 works on Windows when I expect it to finish installing an update and I'm ready to go for a coffee, but I never remember what it is in Linux. Thanks to your tip, I now know I can use shutdown -h but I know the Linux guy had to put a number, so let me try shutdown -h 0 and see what it tells me about how
-h? Next time, use all three of these: -?, -help, --help. I'm probably not going to try throwing -h at a program without having a clue what it might do.
Then use the damn manual. That's why we write them. If you want to know how to use the manual, use the manual:
$ man man
$ man woman
No manual entry for woman
Yep. It knows everything!
It is not the responsibility of the student to fix a broken lesson plan. For fuck’s sake, the entire point of having a teacher is that they know what the students need to learn and the students don’t!
This. I've lost count of the number of times as a medical student when I showed up in a pompous consultant's teaching session, (arranged with great difficulty, no less), and the first sentence was "So, what would you like me to teach you today?".
If I knew I'd have gone and read about it myself rather than waste time here with you, thank you very much you arrogant prick!
-h? Next time, use all three of these: -?, -help, --help. I'm probably not going to try throwing -h at a program without having a clue what it might do.
For non-Windows systems:
-h is Valid
-? is Invalid as '?' is a wildcard character that may be expanded by the shell
-help is Invalid on GNU/Linux (though used often by ported applications)
--help is Invalid on older Unix systems.
For newer Windows systems:
-? is Valid (and mandatory)
-Help is Valid (and mandatory)
--help is Invalid
-help is Valid
-h is Valid
-h is the safest option
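A utility does not have to pick just one spelling. As a minimal sketch with Python's argparse, the built-in help option can be switched off and re-registered under several aliases. The utility name "app" and the --verbose option are hypothetical. On a POSIX shell, -? may still need quoting, since an unquoted ? can be expanded as a wildcard before the program ever sees it.

```python
import argparse


def build_parser():
    # add_help=False stops argparse from registering its own -h/--help,
    # so the help action can be registered with extra aliases.
    parser = argparse.ArgumentParser(
        prog="app",
        add_help=False,
        description="Hypothetical example utility.",
    )
    parser.add_argument(
        "-h", "--help", "-help", "-?",
        action="help",
        default=argparse.SUPPRESS,
        help="show this help message and exit",
    )
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print more detail")
    return parser
```

With this parser, app -h, app --help, app -help, and app '-?' all print the same help text and exit with status 0.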
[Rent This Space]
I was told how much my documentation sucks because the information they needed to operate the device wasn't in the manual. I challenged the person complaining to tell me something that was left out. For each item, I turned to the page in the manual and pointed to the answer that had supposedly been left out.
Eventually the person complaining finally said "It's a bad manual because you have to read it to use it."
That's exactly the answer I was looking for.
My current place of employment does not want educational manuals. They want step-by-step instructions. I haven't written another manual for these people since. At my previous places of employment they raved about how good my educational documentation was. As a matter of fact, the support manuals I wrote for an ISP in early 2000, which included modem and email troubleshooting, leaked to other ISPs and outsourced call centers, who began to implement the pirated copies into their own resources. I finally put a documentation GPL on it and released it all to the public (I've had people criticize me for not using Creative Commons, but this was a couple of years before that hit).
The lesson I learned - technicians want to learn. Monkeys only want "monkey push button".
The preceding post was not a Slashvertisement.
reminds me, I hate man pages. there, I said it. they are the devil. i have never learned anything important from a man. rtfm, you say? well stfu!
With regard to technical documentation, wikis are where knowledge goes to die a horrible and lingering death.
Il n'y a pas de Planet B.
"If you're advising the user to "try ping host xyz" you need to explain why and what to do if it returns the expected or unexpected results and what conclusions he can draw from them."
This.
You can show and tell what and even how all day long to little avail; telling why educates both writer and user and lets one get on with solving problems.
Given the number of bugs in most code, I'd suggest that it is pretty poor documentation for what the code is SUPPOSED to do.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
That's nice. If I ever get attacked by a switch in the wild I'll know I can use -h to tell me what that switch is.
Now that I know what every switch is, tell me how to use them to achieve my goals.
That's what really makes the difference between being a reference and being instructions.
If I have been able to see further than others, it is because I bought a pair of binoculars.
Grep is a special case because regexes are complicated. But a tool like ls? Read the manpage.
Why do you claim --help is invalid on Windows?
It will work (at the moment) but goes against the conventions set by Microsoft.
Microsoft could, at any time, change the way it interprets a double hyphen, breaking your program.
It is safe at the moment because Windows passes the entire parameter string via ARGV