What is Well-Commented Code?
WannaBeGeekGirl queries: "What exactly is well-commented code anyway? Can anyone suggest resources with insight into writing better comments and making code more readable? After about six years in the software development industry I've seen my share of other people's code. I seem to spend a lot of time wishing the code had better (sometimes _any_) comments. The comments can be frustrating to me for different reasons: too vague, too specific, incoherent, pointing out the obvious while leaving the non-obvious to my imagination, or just plain incorrect. Poorly or mysteriously named variables and methods can be just as confusing. In a perfect world everyone would follow some sort of coding standards, and hopefully those standards would enforce useful comments. Until then, any suggestions for what you, as a programmer, consider to be good/useful/practical comments? Any suggestions for what to avoid? Also, I usually work with C++ so any resources/comments specific to that language would be too."
Good code comments should describe the intention of the code. Write them *before* you write the code in a function/method to describe it's purpose. This will make you think exactly what you want it to do, and will allow for others to find/fix bugs easier when the implementation doesn't meet the intention.
I then write inline comments in the code describing it's flow. It's only then do I actually write the code.
Comments at file/class level should describe what it does and is used for. It should also describe how it fits in with the big picture of it's packages and the classes around it - give a reader some architectual scope to what they're looking at.
Get into a habit, even for trivial functions/methods and you'll soon not realized you're doing it.
Some people say code shouldn't need commenting, and the code itself should be enough. In a perfect world of no bugs and only populated by wizard programmers, this is fine, but not in the world I live in. You write some code and someone else (maybe yourself) will have to debug it at some point - maybe 3-4 years down the line. Even with a "neat" language like Java, working out how things work is much more time consuming without comments.
The same goes for 'amusing' comments in the code, or CVS logs.
For your sake in the future, and your coworkers' sake now, please stop it.
PS. Did I mention how fucking annoying it is?
are the best way to comment it all.
//end if" or something.
One day you're commenting on what variables do, the next you try to explain functions, etc.
I just switched to Java from C++ and neatness is the most important thing I've acquired, not in code per se, but in variable naming. I've gotten used to doingThisWithVariableNames and DoingThisWithClassNames, while keeping THE_CONSTANTS capitalized. Ok, this isn't comments? But you'll be surprised at how much better it is to browse a new language like Java and see the norms of style in it, because old languages use too many confusing double_StandardslikeWritingThis_way.
Comments go at the top of a page, with the coder's name and date, as well as a small bug report and if you can, a brief function list for those without a visual IDE like JBuilder. You then put a like with PRE: and POST conditions in your code and try to keep one liner comments to a min.
I learned to comment the end of if structures and function blocks to make the code easier to follow... just add " }
Comments should be a paragraph long so that they make some sense. And comments, since they look different from the code sections, should be embelished with ===============, stars, and some
nice spacing and vertical bars.
Good comments to me mean good-looking comments, even if they don't have that much substance. Just my 2 cents. They're better than no comments at all.
"Wireless : LAN
I can absolutely recommend a book called Code Complete [amazon.com]. Yes, it is published by Microsoft
Yes, that's on my bookshelf -- but, given the fact that they go to great lengths to point out the importance of checking for buffer over/under-runs and fencepost errors, one can't help wondering if (in the wake of all those critical bugs in IE/Outlook/IIS) any of Microsoft's own programmers have read it.
More "do as we say, not as we do" from Microsoft?
Personally I think the linux kernel is very well documented, at least the scheduling part, which is what I've looked at. Linus has a style of inserting huge comment blocks that explain exactly what's going on, then he'll have a page of code that does it, with little or no comments.
A style suggested in Code Complete (I forget what they call it) is to write a method completely in pseudo code, make sure it's correct, then insert the actual programming code under each line of pseudo code. This technique, while clever I find leads to many useless comments like "loop through the employee records" and "increment the counter".
A good test to see if the comments are working is through a code review, people will very often not know what's going on, or point out confusing comments or code that needs a better explanation. Code Reviews really improves your idea of what good comments are and teaches you what works and what doesn't.
this is my sig.
On one of the last projects I worked on, the specs we received from the customer were horrendous. Actually, it wasn't the customer themselves who had done the specs, but another contracting firm. Spending 5 months on the project, and finding repeated errors in the "data maps" (it was apparently too bloody difficult for us to be supplied with a schema for the DBs we were supposed to be accessing and updating), I'd finally had enough.
Querying the DBs directly showed that the data maps were works of pure fantasy in several spots, or would lead to outright data loss if followed precisely. In a fit of pure...creativity...I ended up setting a "$workAroundFuckups" variable, and in the sections where it was needed, had a false evaluation do precisely what thee datamaps said, which would corrupt data. If the variable was true (ie, non-zero), it would work correctly, which meant ignoring the data maps and doing what was needed to have the data be entered correctly.
I ended up getting moved to another customer (due to the limited resources *we* had, not because of my creativity), so I don't know if the remaining folks on the project removed it after I left. When I added it, I explained to them precisely why I'd added it, and since they'd had similar experiences with what we were given to work with, were behind me 100%.
This wasn't even the *only* part of the project which was FUBARed, but it was unfortunately what I spent many a 15+ hour day dealing with, so I was rather familiar with it. Had I access to the server that *read* the data and used it, I probably would have just gone in and redesigned everything "for free", just to avoid having to deal with such a horrible layout.
This is also the client where, after a few months of an irksomely out of sync clock (off by 12 hours...made figuring out when something happened a bit of a PITA), I finally went in and set the damned clock to the proper time. Not surprisingly, the same folks who made that wonderful novel for us were the ones admining the dev server we were working on. AFAIK, no one ever noticed that the time suddenly became "correct" either.
"The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
char* foo, bar;
was good coding practice, while
char *foo, bar;
wasn't, because the code was declaring two pointers, and so the * should be with the type and not the variable name.
Even pulling out K&R, and writing sample code showing the sizeof(foo); vs the sizeof(bar); wouldn't convince him that he was wrong.
Unfortunately, I don't think it was ever "officially" settled. Nor were several of the other corrections that I immediately made to his "proposed" coding standards document he handed out at the first meeting.
Thankfully, my manager at the time listened to me (and also, helpfully, knew C and C++), so when we got the coding standards, they were filed with the rest of the useless paperwork we got, and we kept on writing things properly, including:
Three guesses as to which project was ahead of schedule. (Of course, not entirely fair, since we also didn't force code generation via Rational Rose. We instead reverse-engineered all of our final UML from the code we'd written and tested, and knew worked the way it was supposed to...)
"The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
By all means, read Code Complete--its suggestions are sensible. But the real culprit when it comes to poor software are time and resource pressures, feature creep, and other environmental factors. Maybe at least the book will let you recognize when your project is doomed and leave; McConnell seems to have done that--he isn't at Microsoft anymore.
Here's the rules I use:
If you were blocking sigs, you wouldn't have to read this.
In any situation where I see the need for code commentary, I try first to find a way to make the code clearer. If the source code is sufficiently clear, comments are unnecessary. This also avoids the risk that the comments will diverge from the code - making claims that were once true, but no longer reflect the code's actual logic.
// empty or null uiInitializerClassName means this task is not
// defined for use in this interface. Skip it.
... do something ...
... do something ...
This is poorly commented code (despite the fact that the comment is clear and accurate):
aClassName = aTask.getUiInitializerClassName();
if( aClassName != null && ! aClassName.equals( "" ) ) {
}
This is well commented code (despite the fact that there are no comments at all):
initializerName = aTask.getUiInitializerClassName();
boolean isNotNull = initializerName != null;
boolean isNotEmpty = ! initializerName.equals( "" );
boolean definedForThisUi = isNotNull && isNotEmpty;
if( definedForThisUi ) {
}
Of course, this doesn't work in all situations, but I find that I can improve the clarity and accuracy of seventy to eighty percent of my commentary this way.
Stop-Prism.org: Opt Out of Surveillance
Yes, we can tell it's been quite a while.
Right now, businesses go to their IT teams and say "We need software that does X. Deliver it in 8 weeks."
It would take 8 weeks to write a structure document in the manner you are describing (especially going through the various top-down iterations to reach code).
Instead we have to write the basic framework of the solution, get that ready in four weeks, and then change and tweak all the little pieces of functionality that have changed in the requirements since we started. Because businesses don't know what they want, and all that is definite is the go live date (usually because the Marketing team have a fixed launch date)
How are you meant to pre-document a requirement that arrives three days before go-live? With a suitable methodology you can add it, test it, and regression test the rest of the system to be sure you haven't broken it. But then generating user documentation too? Dream on, that's going to have to wait.
~Cederic
Sorry, but that's just not true.
You need fewer comments if your identifiers are well-chosen, certainly. But I've never seen a significant piece of code that would be adequately described by well-chosen identifiers alone.
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Literate programming is to comments what a high level language like Pascal is to assembly language.
Programmers have the weakness of thinking that they are writing for a compiler. They should be imagining that they are writing for another human reader. This is the essence of literate programming, which is, in my opinion, the only good idea that anyone has ever had about program documentation.
If the programmer is restricted to interspersing comment lines between lines of code, then the elaboration of the documentation must follow the sequence of the code. To use a simple example, if I write a class definition, then I must declare all the member variables within the block of text that begins "class foo" and ends with "};". However, that may not be the best way to explain the significance of those variables. Maybe it would make more sense (for the human reader) to delay the declaration of a member variable until the place where it is first used. This can easily be done with CWeb. This simple example is hardly sufficient to reveal the power of LP. It is well worth taking a serious look.
After the literate program is written, it is processed by two other programs. One produces a file suitable for submission to a compiler. The other produces a TeX file, which outputs a properly formatter version to be read by a human.
The code itself is never the best documentation. It's not documentation at all, except in the most trivial cases.
Any programmer can write a literate program. It's just a matter of understanding who his audience is.
There's a third thing the maintainer needs to know which is "what it's *supposed* to do. Comments are invaluable for that. Consider the following C code fragment:
;
;
;
for (i = 1 ; i ARRAY_SIZE ; ++i)
{
do_something_to (array [i])
}
Why isn't it doing something with element 0?
Now look at these two fragments
/* Do something to all elements in array */
for (i = 1 ; i ARRAY_SIZE ; ++i)
{
do_something_to (array [i])
}
and
/* Do something to all elements in array except */
/* the first one because... */
for (i = 1 ; i ARRAY_SIZE ; ++i)
{
do_something_to (array [i])
}
Just by adding a one line comment, a bug has been exposed, or the maintainer has been prevented from inserting a bug in the second instance.
As a maintainer, I'd want to be able to see what the code does (well set out, good structure, descriptive names etc) and what the programmer meant it to do, i.e. good comments.
Anybody who puts jokey unhelpful comments in their code should be aware that these will inspire feelings of hatred and extreme violence towards them in the maintainer who has two hours to fixe the air traffic control system before the 747s start falling out of the sky.
All I want is a secure system where it's easy to do anything I want. Is that too much to ask ~~ Randall Munroe
Some variation of the methods described in "Literate Programming" by Donald Knuth are a good place to start. In summary, Knuth says that you should be able to extract from the same source both machine instructions, and a human parsable document, with unusually high importance placed on the later. Whether or not you want to imbed LaTeX into your document is up to you (I never have bothered), but on the whole find something that will make your code and algorithms understandable to another programmer who's never met you (because that's probably who will be either grading or maintaining your code at some point).
So, it's one thing to say "if they change the code, they MUST change the comments", but realize that unless you have the ability to force the issue (a tool to make you change comments before saving changes, managers who care more about firing programmers who don't follow code standards than avoiding schedule slippage -- hint: I've never seen one of these, EVER), 9 times out of 10, they just won't do it. It's like teaching abstinance as a method of reducing teen pregnancy.
Thus, the practice of having comments which are redundant w/ the code is simply setting the project up for failure as the parent poster pointed out.
It's less of an issue w/ Javadoc and Doxygen comments (which is embedded in the code) than external documentation, but the fact is that managers reward code changes, not documentation changes, and programmers are lazy.
Until you can change these basic, simple facts, what are you going to do? One strategy is to encourage self-documenting coding standards as well as encourage documentation updates. But people NEED to remain aware of the basic principle that the only authoritative documentation is the source code itself.
Unfortunately, this is rarely practical, but I find the most useful comments are written when I'm going back through code I wrote over a week ago. The reasons for doing things are no longer on the surface, and thus if there's something I look at and have to dig for understanding, then it needs better explanation.
Functions should have their arguments documented-- what they mean and what they do, and their return values documented. That is sufficient.
Inline comments have a place, but they are way over used. If you are telling me how your program works using inline comments, I will ignore those comments becuase if there is a problem, your code may not behave as advertized.
Instead inline comments should document WHY you do something a certain way and help me to understand what problems caused a particular piece of code to us a particularly clumsy algorythm or why a seeminly extranious bit of code was added. Don't tell me how-- that is what the code is for.
And use whitespace as your friend to break things up into logical chunks which are easily readable and logically connected. This is the reason for indenting your code, but the same principle can be used by adding additional line breaks to separate logical chunks of code (this makes more sense then meaninglessly breaking up functions).
I think that these are relatively language independent advice. I use it in Perl and PHP, and when I read C and C++, I appreciate these tips as well.
LedgerSMB: Open source Accounting/ERP