Slashdot Mirror


What is Well-Commented Code?

WannaBeGeekGirl queries: "What exactly is well-commented code anyway? Can anyone suggest resources with insight into writing better comments and making code more readable? After about six years in the software development industry I've seen my share of other people's code. I seem to spend a lot of time wishing the code had better (sometimes _any_) comments. The comments can be frustrating to me for different reasons: too vague, too specific, incoherent, pointing out the obvious while leaving the non-obvious to my imagination, or just plain incorrect. Poorly or mysteriously named variables and methods can be just as confusing. In a perfect world everyone would follow some sort of coding standards, and hopefully those standards would enforce useful comments. Until then, any suggestions for what you, as a programmer, consider to be good/useful/practical comments? Any suggestions for what to avoid? Also, I usually work with C++ so any resources/comments specific to that language would be too."

21 of 802 comments (clear)

  1. K&P's "Practice of programming" by Anonymous Coward · · Score: 1, Interesting

    This book is a must-read on programming style. It also contains a perfect chapter on how to comment the code clearly.

  2. It's been a long time but.. by NewtonsLaw · · Score: 5, Interesting

    It's been quite a while since I wrote any significant amount of code but after spending far too many years cutting code too early in the development process I eventually woke up to the fact that coding is the *last* thing you do (apart from testing and debugging that is).

    First-up you need a good spec -- and the spec should include the user-interface details to the extent that you could actually write the user-manual from that spec.

    Indeed -- if you can't write the user-manual from the spec then the spec is incomplete.

    From the spec the programmer should develop the structure of the code in another document.

    That structure document is repeatedly refined in a top-down process until you (eventually) reach a point where you're actually cutting code.

    I was always surprised just how much easier it was when the code was written as the lowest level of the structure documentation.

    Not only could you comment out the program structure document so that the compiler would ignore it -- but you ended up with absolutely accurate and comprehensive documentation built into that source.

    Project managers love this technique (and when I was in a project management role I demanded it of my team) -- it ensures that technical and end-user documentation are no longer the bits that get left until last and thus are either very shoddily thrown together or, if the project goes really over-budget, not produced at all.

    Of course, as we all know, there's a huge amount of temptation to just leap into coding at the earliest possible stage and leave the documentation until later -- because some stupid managers use number of code-lines completed as a metric of project performance -- duh!

    If you're smart and use good tools you can selectively collapse and expand the in-source documentation so that when you're trying to get familiar with a module that someone else has written, you can descend down the structure tree one level at a time without the meaning being diluted by stuff that is at a lower level.

    Unlike the days of interpreted BASIC, there's very little overhead involved in integrating documentation and code these days -- so there's no excuse not to do it.

    If required, the documentation can be automatically extracted from the source -- but by keeping the master copy in the code it becomes easier to ensure synchronization as changes and updates are made during the lifecycle of the project.

    1. Re:It's been a long time but.. by emag · · Score: 3, Interesting

      This is a technique I hope to follow with a few "personal" projects that I'm about to start. I'm refusing to write any code until I get down at least a spec for what each section is supposed to be doing.

      Alas, since the UI is wholly divorced from the back-end (I discovered I'd designed in my head an n-tier system just by seeing what I hated about both similar and unrelated software I've been forced to work with), I'm not sure I could follow everything precisely the way you suggest. After all, in the overall design, there will actually be several clients, each displaying data appropriate to its interface (web vs. fat client vs. text vs. scripted API vs. ???). I know what I can, and in fact must, do is define just how the "back end" (in terms of the clients) tells the clients what to display, what to get from the user, etc.

      I've also got to point out that in the past, I've been guilty of the "jump right in and code immediately" method of development. Unfortunately, I didn't have much in the way of control, since we were under pressure from both upper management and the customer (they have a lot of boats and guns...) to produce code for testing immediately.

      Even though this was for a "next generation" of a system already in place, it was a complete rewrite in a language the customer wasn't all that familiar with (C++), and so they were concerned that we wouldn't be able to guarantee anything would even work, let alone within the tight timing constraints required by parts of the overall system that weren't being changed.

      We had to deal with collecting all sorts of assinine metrics, such as the SLOC (Source Lines of Code) you've mentioned. Of course, the number of comments was also a metric that needed to be collected, and so we somehow had to find time to do those while frantically trying to keep pace with specs that seemed to change arbitrarily on a weekly or monthly schedule. Assinine? Yes. Real life? Yes. The only saving graces were that we'd managed to convince everyone who made decisions that some ridiculously low SLOC count was what we were responsible for in terms of monthly performance (giving us some breathing room to do intelligent design), and regular (weekly) code reviews amongst all the developers, spec writers, and the customer, to verify that what we were doing was, in fact, correct. (Though it seemed at times that we spent the majority of the meetings teaching all of the non-developers how to code in C++)

      --
      "The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
    2. Re:It's been a long time but.. by noom · · Score: 3, Interesting


      That style of programming leads to unreadable and unmaintainable code. The problem is that, in many cases, the programmer that required reams of documentation in order to write the code also requires that documentation in order to read it. The system-level become inappropriate to the evironment and language being used since the design was done "in your head" instead of "in the code.

      IMHO, detailed specs should only be done when there can be a GAURANTEE that it won't change for at least 2 years. The is is the usually the case with specs produced by groups like W3C, IETF, or the OMG, and a lot of programmers think they should copy the activities of these beauricratic organizations in their own work, where requirements can be expected to change continuously. In this environment, maintaining a detailed specification that is separate from the code is far too impractical. It's better that programmers make the code and the focus of their attention and make sure their code *deserves* to be the focus of attention (i.e. it's well written, elegant, easy on the eyes and mind, fun to take walks on the beach with, etc...).

      -n00m

    3. Re:It's been a long time but.. by sbeitzel · · Score: 3, Interesting
      Well, it sort of depends on what kind of code you're writing, doesn't it? In applications development, the dev cycle can be 1 year long or longer, with support cycles that run about 3 to 6 months.

      In web development, your dev cycle is often 3 months or fewer, with support cycles measured in days or even hours. The practice of shifting requirements up until the ship date is one that we, as professionals, have a duty to curb. If you're implementing new requirements during the final 10% of the project, then you're allowing the customer to break the project and blame you.

      Unfortunately, the nature of programming doesn't really change between those extremes: it's still a question of figuring out what the problem (the product requirements) is, designing a solution to the problem (writing a spec and, hopefully, designing it so that three cycles from now when you get a requirement to change the product, you can), and then writing the code that implements the design.

      The comments we always see in these discussions along the lines of, "comments are for the weak; real programmers don't use comments..." don't take into consideration the fact that the odds are very good that you won't be supporting your own code in a year, you'll be dealing with someone else's crap. As professionals, it behooves us to provide as many clues as we can to the poor sods who'll follow after us -- because what goes around, comes around.

      When I'm doing really fast web development, the spec is often a drawing on a white board -- so I take a picture. The design is often a doodle in a notebook -- which I label and keep on my bookshelf. And when I start banging out code, the first thing I do is pseudocode in comments, then interleave the real thing. That way, when I'm interrupted in the process, I can pick up again quite easily.

      I've handed off a lot of code to other people, and I've never gotten any complaints about too many comments (or about useless ones). I have gotten comments about how easy it is to follow what I'm doing, and that's enough. Do what you need to do to get the job done, but keep in mind that the job doesn't stop with getting the thing to compile and link.

      --
      Oh, go on, check out my job.
  3. Document the function's contract by IvyMike · · Score: 5, Interesting

    Take a look at this function, and tell me if there's a bug:

    void foo(void) {
    int* x = 0;
    int y = *x;
    }

    Easy, the bug's the SEGV, right? Take a look at the same function, this time with comments:

    // Function: cause_segv
    // Description: Causes a SEGV for testing purposes
    void cause_segv(void) {
    int* x = 0;
    int y = *x;
    }

    The point? A bug is unwanted behaviorm, but that only makes sense if you've defined what the correct behavior is. My example is trivial, but often this is a real concern. Function "bar(int,int)" returns null whenever one of the arguments is negative--is that a bug or a feature? Your function has a goal in life, a contractual obligation to do something; make sure it's clear what that something is.

    Note that if you choose good function and good variable names, a simple one or two line comment at the beginning is usually sufficient to document whe function's intended behavior.

    I also find that an "assert()" or two on the arguments at the top of the function makes it clear what values the function accepts, and which one the function doesn't handle. It's an easy way to document the contractual obligations of the function.

    Stuff not to put in comments is stuff that's easily devised from the code. Check this out:

    // Function: square
    // Inputs: int x
    // Outputs: int
    // Used by: pythagorean(int,int)
    // Description: returns x squared
    int square(int x) { return x*x; }

    Did the "Inputs" or "Outputs" add any value? That information appears again, two lines below in the function definition, and it's guaranteed to be correct there (unlike the comment which will be out-of-date and wrong when we change "square" to work on longs). The "Used by" might have added some value, if it was correct, but as it turns out it's out of date, and 15 other functions now use "square". Any information better derived looking at the code should be left off. Any information which can be better found using "grep" or "find in files" should be left off. Any information that will probably be out of date at some point should be left off. Heck, in this situation even the description is probably extra verbiage, since it doesn't really help anyone. (I'd probably put it in out of habit anyway, though...so sue me:)

  4. Re:Describe before you apply by ranulf · · Score: 2, Interesting
    The problem with this is that
    while (1) {...}
    or more commonly
    for (;;) {...}
    is a well known construct for infinite loop. If you turn such a simple construct into six lines of source, then I dread to think how much commenting you'll use when you actually get down to solving the problem in hand.

    for (;;) { // infinite loop
    is far better - it reminds people what you're doing and if someone sees your code and doesn't understand that construct then they know what it does from the comment and they can go and find out how it works and learn it for next time.

    While you're at it, you should probably think about hiring real programmers who know basic constructs in their chosen language...

  5. PDL it is good no? by ZanshinWedge · · Score: 3, Interesting

    Personally, I like documenting backwards. Start with the requirements, work to the architecture, then get into writing PDL (Program Design Language). Essentially, you write out as detailed instructions on what the routine does as you can, without getting to the nitty gritty. It describes the intent of the code, not the code itself. It morphs into excellent comments when you expand it out into full code, and it also has the nice little advantage that it's at a high enough level that it's applicable to multiple languages (if you should desire to switch).

  6. Check out how not to do it. by chrestomanci · · Score: 2, Interesting

    See: How To Write Unmaintainable Code by Roedy Green

    Every time I read it, I laugh from all the crazy examples of how not to do things:

    eg:

    16: Names From Mathematics:
    Choose variable names that masquerade as mathematical operators, e.g.:
    openParen = (slash + asterix) / equals;

  7. Re:Simple rule of thumb by nagora · · Score: 4, Interesting
    Write code that is easy to understand and comment about wierd / unusual sections

    Nice idea; never works in practice. The reason is that what you think is easy to understand is not always what other people think is easy to understand.

    The code you are writing now might have to be modified in the future by someone just out of university which means, generally, someone with very little experience. Your red-black binary tree might be "easy to understand" for you and a novelty to them.

    Also, mature highly-factored, optimised code that has been improved over several years can be very hard to follow even when the original code was quite straight-forward (but perhaps too slow).

    Finally, as a philosophical point, source code is supposed to be terse in comparison to natural language so it should take longer to describe the code in your own language than in the programming language.

    TWW

    --
    "Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
  8. Comments Considered Harmful by Bazzargh · · Score: 3, Interesting

    This has come up before - in Martin Fowler's book, "Refactoring", he makes the controversial claim that sometimes comments are indicative of a need to change the code.

    Consider the different types of comment:
    - boilerplate comment at the top of a file: helps noone but lawyers.
    - change history comment: better use your source control tool to maintain this.
    - comment before a class: does this mean the class is badly named, or too complex?
    - comment before a method: ditto.
    - comment inside a method: could be a smaller method screaming to get out.
    Also heavily commented code is quite commonly just explaining away stupid code tricks.

    Nobody's suggesting that all comments are bad, just that a lot of the time adding comments is a poor substitute for fixing whats wrong with the code. Of course sometimes its the language thats the problem :)

    -Baz

    1. Re:Comments Considered Harmful by Sajma · · Score: 2, Interesting
      Actually, I've found Fowler's technique for extracting methods pretty useful for clarifying code. Basically, he says that if a block of code is preceded by a comment, extract that block into a method whose name is the same as that comment, and remove the comment.

      This works pretty well, and leaves you with two kinds of methods: top-level ones that look like simple lists of tasks (enqueueEvent, sendReply, etc.), and small lower-level methods that do only a specific task. And yes, these small methods usually don't need to be commented, because the method name serves as the comment.

      This technique also works well for replacing ugly boolean formulas in if-statements. Compare

      // check if message is late, and if so, do blah
      if (System.currentTimeMillis() > (message.timestamp() + Message.TIMEOUT) {
      // blah
      }

      to

      if (message.isLate()) {
      // blah
      }

      I'd argute that the latter doesn't really need a comment.
    2. Re:Comments Considered Harmful by Bazzargh · · Score: 3, Interesting
      I see you've gone back on the post a little later but its worth answering some of your points.

      Yunno, I'm really getting sick of Fowler's "worse is better" nonsense trying to win back the day for "cowboy coders" who can't handle any discpline at all and want to treat critical production code as a playground.

      Fowler's stuff isnt cowboy at all. It actually takes a lot of discipline to follow what he actually says in his book (writing tests to ensure that each refactoring is safe). His book is a series of recipes for how to introduce changes safely, not a rally call for changing code because you feel like it today. The Refactoring book can help people working in any methodology (except perhaps the 'Personal Software Process' and its variants, which want you to learn to write code right first time, and are less realistic for code maintenance). XP gets a mention but its not an XP book.

      Meaninglessly vague. Is javadoc boilerplate?

      Meaning 1 on wordnet is "a standard formulation of legal documents or news stories". Now look at this (code from Tomcat). See the boilerplate comment at the top? Its NOT THE JAVADOC - its the license. There are tools like jalopy which help you maintain this cruft but like I say it helps noone but lawyers.

      This guy must have a massively cool source control tool that actually shows him the changelog on a per-function basis, and automatically senses and shows only the changes that are significant, like an interface change.

      The 'change history comment' is the old practice of writing in a history of changes at the top of the file (not on a function by function basis as you suggest) duplicating the comment recording in the source code control system, while not necessarily recording all changes because it isnt /driven/ by that system. VSS, CVS, etc can show you the changelog externally, or you can include a $Keyword$ to get the changelog included in the comment and maintained for you. As for knowing which change is significant - developers don't know this either. The change which breaks things is the significant one and they usually don't realise they've done it at the time. Interfaces changes, while high cost, are actually less significant causes of error since they are easily caught by static checks. It is the changes in behaviour that will get you.

      We don't always get to write everything from scratch...

      This is true. often you'll not get to rename the methods of some dumbly named third party interface. And in this case the comment is inevitable. Note I did say not all comments are bad. I'm asking you to question them.

      Another smaller, uncommented method no doubt.

      Yes, exactly. If the smaller method is 4 lines long, has a blindingly obvious name, and is in the private interface of the class (often the case for extracted methods like this), the comment is superfluous.

  9. Work from the top down by Daetrin · · Score: 2, Interesting
    I recently ran into two functions in the code base I'm using that were titled "ObjectTrackDirection" and "ObjectTrackToDirection". The similarity in names was annoying, but the criminal part was that neither function had comments indicating what they did, or what the difference between the two was. In fact, the only comment in either section of code was on one line that was duplicated in both pieces of code, and which said "not sure why this is needed". This did not give me a great deal of confidence as I started out trying to decipher what exactly these functions were supposed to accomplish differently.

    No matter how clear you think you made the name of the function, there should be a comment explaining what the fuction is supposed to be doing. If the function accepts a lot of flags or variables you should briefly explain what they're each used for.

    Knowing what the function is supposed to accomplish is a big step forward, even if there are no other comments at all.

    If you're still willing to keep at it, start commenting the big blocks of code in the same manner. What are you trying to do with this loop? Why are you testing for these cases in this if statement, and if it succeeds, what are you trying to do inside of it?

    Always go in favor of more comments. I would rather have to skim by a dozen comments that I don't need to read than be left hanging for the lack of one comment when something goes wrong.

    And finally, always use whatever comment system your source control program uses! Even if it's just "I did some stuff to fix some problems with A," because if I later find out that a particular case of A is broken, I don't want to have to do a diff on every single code change made since the last time I knew that case of A worked.

    --
    This Space Intentionally Left Blank
  10. Re:Variable Names by Grab · · Score: 5, Interesting

    Sometimes, from other ppl. If I see it, it goes right back in review, and I won't pass the review until the fuckwit responsible has removed them. If you're writing code for yourself, then fine, please yourself. If you're writing code that anyone else will see, *especially* the customer, then hell no.

    Thing is, there's two essential things that a reviewer/maintainer has to understand about a program: what it does; and why it does it. It should be possible to work out the first one of these just from the code, so long as the variables and functions are named sensibly. The second can be worked out from code with some effort, or the coder can add comments to explain why they're doing things that way and make it easier for maintainers.

    But if someone has deliberately given all the variables names which don't reflect what they do, then it's utterly impossible to work out what the code is doing, and it's therefore also impossible to work out why it's doing it. So the code is unmaintainable - it isn't possible for anyone else to pick it up and work out what it does, except with massive work. If in 6 months time your company says "oh, we've got this code we can use with slight modifications, let's quote 1 month to do this contract" and then they find out you've made the code utterly obscure, then they'll crash and burn. And if that happens, the company *will* fire (or at least formally discipline) the person who wrote the original code, bcos they've been grossly negligence in doing their job. And you can kiss goodbye to any reference from them, so you'll be SOL in finding your next job.

    Grab.

  11. Re:Describe before you apply by spongman · · Score: 3, Interesting
    how about while (true) { ?

    1) It reads right in english.
    2) It's type-safe in C++ for -Wall.
    3) It doesn't use antequated or obfuscated C-isms.

    I can't stand things like:
    #define ever ;;
    ...
    for (ever) {

  12. Re:Variable Names by Squashee · · Score: 2, Interesting
    Besides naming your variables meaningfully I have an additional suggestion. I allways (when applicable) add a prefix to my variables that identifies the type. Name then becomes sName.
    • s - String
    • i - Integer
    • f - Float
    • r - Reference
    • a - Array
    • etc...
    I know this does not work for everybody, but for me this has done wonders when it comes to understanding my own stuff a couple of months later.
    --
    When in doubt, act determined. Business 101
  13. My list by edp · · Score: 4, Interesting

    One purpose of comments is to explain the code to another engineer (including oneself in the future). Another purpose is to demonstrate the code works, whether an informal argument that the code does what it should or a mathematical proof. These two purposes have different needs.

    For the former case, standard writing rules apply. Decide who the audience is. I often figure the audience is an engineer who knows the type of programming at hand, but doesn't know what is done by this particular code, and may or may not be familiar with the product, depending on circumstances. Knowing the audience tells you what assumptions to make and what has to be explained, either by prose or by giving directions to reference material.

    Write complete, grammatically correct sentences. This goes a long way to making comments comprehensible. Sometimes a little phrase won't be understood because the reader can't fill in the unwritten parts, or because there's ambiguity in the wording. It is okay to use short phrases when describing objects being defined or declared (e.g., "number of links to this object" or "dollars owed this customer), but keep the context in mind. Introduce the compound object with sentences where appropriate.

    "Dollars owed this customer" reminds me -- use units. Don't write "Money owed this customer" or "time since last update." Specify seconds or milliseconds, not time. Document how the object models whatever it is modeling. That may be a physical thing like time or a conceptual thing. E.g., if a pointer connects one object to another, document the relationship that represents. If a "debt" class contains a pointer to a "person," don't document it as "person associated with this class." Document the relationship -- this particular pointer may represent the debtor, the creditor, the escrow agent, or somebody else.

    Give context. I have seen thousands of modules that just leap into code with no explanation of what they are. Even if the comments say what a function does, a reader might not really understand it until they know what it is used for. Document where the code fits into the bigger scheme and what it is used for. Give the reader context so the purpose of the function makes sense. Even if a complete mathematical description of a function is given, so that the reader can precisely predict its behavior in every situation, it might not make sense to the human mind until they have a mental image or model of it.

    For the second purpose, demonstrating the code works, explain how the code implements an algorithm. It's not enough to explain what the steps are doing; you need to show how the total result comes out of the algorithm, unless it is something simple or familiar. E.g., a formal description of the long division taught in elementary school would generally be incomprehensible. "Find the largest digit d such that d times q is less than r[i]. Subtract d*q from r[i] to get r[i+1]. Append d to output..." Nobody seeing that for the first time would understand what it is doing, even if all the steps were clear. Even if you explained each step and explained the result, it won't be clear to some readers how the steps produce the result, so explain that.

    Document alternatives that weren't chosen, and the reasons why. If you were tempted to implement algorithm X but found you had to do Y because some error might occur, record that information. Otherwise, somebody working on the code next year might see your longer code for Y and change it to X without realizing the problem.

    This isn't intended to be a complete list, just what occurred to me at the moment.

  14. Re:Good Comments by jamieo · · Score: 3, Interesting

    I agree with the "increment loop counter" comment, that isn't worthwhile at all - but that's the difference between good and bad comments ;)

    However, I completely disagree with your premise about this being a maintenance nightmare and doubling workload.

    It's the exact opposite of a maintenance nightmare - it helps maintenance (certainly for long running large projects with developer turnover).

    It's also very little overhead. If you are a professional developer, just count how many hours you really write code in a week of working. It's not a great deal really, and the added time to add good comments is very little. The rewards of doing it are far greater than any costs.

    This is a complete mindset thing, just like coding standards - if you get in the mindset, it's easy and no cost, if you moan and complain and fight them all the time it's a pain and loads of work.

  15. Re:Timeless Prof D.Knuth says it best... by Anonymous Coward · · Score: 1, Interesting

    The trouble that I have with literate programming is that most of the examples I can think of turned out to be unreadable by anything *but* a compiler. Some people can do wonderful things with it, but the ones that need to be reached will just find new ways of being sloppy.

  16. You are a moron by RelliK · · Score: 3, Interesting

    1. Your "improved" code is much less readable than the original. Whoever has to maintain it will need more time to comprehend it.

    2. You introduced a bug on line 3 (null pointer dereferencing).

    Yes, I have personally seen code like it and I wanted to shoot the fucking idiot who wrote it.

    --
    ___
    If you think big enough, you'll never have to do it.