The Importance of Commenting and Documenting Code?
mrtrumbe asks: "The company I work for is in the process of creating a development standard to be applied to all projects. The topics being considered range from dictating the formatting of the code (an issue on which there is widespread agreement), to creating a standard for commenting and documenting the code (a far more contentious issue). On the issue of commenting and documenting, there are two extreme views being considered with most employees' opinions falling somewhere between them." To comment, or not to comment. And if you do choose to comment, what's the best way to standardize it, company-wide?
"The first view is that commenting and documentation will protect the firm from bad programmers or a programmer abruptly leaving, make the code far easier to understand to someone unfamiliar with the codebase, and are necessary for all public, private and test code. The opposing view is that there are more effective ways to mitigate the risk of bad and disappearing programmers (like mandated shared ownership of code and sufficient oversight), that comments are not necessary for clarity and can be dangerous if not kept up to date (which is considered likely), and that documentation is necessary only for public code. Where does Slashdot stand on this issue? Please share any success stories and recommendations for a company-wide standard on commenting and documentation of code."
Never comment your code. That way everyone needs to ask you for a fix when things break. Think of it as "employment insurance..."
A brief description of the object/class and then simple comments on any methods. That's a minimum, but I would also go for single-line comments for conceptually difficult pieces of code that you know you will forget in a couple of weeks. Not overly rigorous, but easy enough that people do follow it.
A good model for me would be the Java SDK docs and the javadocs tool but that's just me.
Stop it! Stop it! Stop it! The Noise. Make it stop!
No, seriously, you cannot comment your code and enforce that as policy. You can't impose standards and enforce them! It doesn't work.
You either know how to program/code, and commenting is part of that, or you don't. Either your staff knows same or doesn't.
Go ahead and establish "guidelines", you'll feel better. But I've been in this industry for over 20 years and applying "standards" for coding and "comments" has never worked.
Write un-obfuscated code, have peer reviews and walkthroughs, and have staff that know how to create... It's really all you need.
(As an anecdotal experience -- we had "standards" on a major project, and I accidentally created a Class without the proper capitalization. A peer came to me and confronted me on said transgression and wondered what I intended to do about it. I said I intended to let it slide and would try to be better in the future. He insisted we "fix" this problem and we spent (and I'm NOT making this up!) the next day's worth of time re-factoring the code (the IDE wasn't up to speed for this -- thanks Microsoft) to "correct" the "problem". Sigh)
If your code is not commented it's not complete. My advice is to fire every developer that doesn't think that comments are necessary.
Just read this thread... http://developers.slashdot.org/article.pl?sid=05/11/30/1544256&tid=156&tid=8
Comments won't protect you against bad programmers; they'll write bad/confusing code and comments no matter what.
However, I've found that writing semi-structured comments for each module and function (or object/method, if that's your poison) using something like doxygen is worthwhile for ongoing maintenance. It helps others see what the intent is, and provides a basis for writing unit tests. It even helps the original coder when they come back to the module 6 months later. It's not a matter of whether it's public code, just basic internal docs.
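As a rough illustration of the point above (a sketch, not anyone's actual code): the function `clamp` and its test are made up for this example, and the comment is a plain Python docstring with Args/Returns/Raises sections rather than any particular doxygen markup. Each line of the structured comment turns into one assertion in the test.

```python
def clamp(value, low, high):
    """Limit value to the closed range [low, high].

    Args:
        value: number to limit.
        low: smallest allowed result.
        high: largest allowed result; must be >= low.

    Returns:
        low if value < low, high if value > high, otherwise value.

    Raises:
        ValueError: if low > high.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


def test_clamp():
    # Each assertion mirrors one line of the documented contract.
    assert clamp(5, 0, 10) == 5
    assert clamp(-3, 0, 10) == 0
    assert clamp(42, 0, 10) == 10
    try:
        clamp(1, 10, 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")


test_clamp()
```

If the comment and the test ever disagree, at least one of them gets noticed the next time the tests run.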
I use Pig Latin to comment my code. Job security, you know.
Commenting and documenting code is something all programmers should do. Not doing it is highly unprofessional and should not be allowed in any self-respecting firm. Making sure the documentation/comments are up to date is included in that statement.
On the other hand, just because code is well documented, that doesn't mean it's easily maintainable. There are various techniques used to generate good maintainable code. But without documentation, any code more complex than 'hello world' tends to be a pain to maintain no matter what techniques you use.
I personally also find the formatting of code (and comments) just as important as commenting it. Reading code formatted in a way you're not used to can be a pain, and reading code formatted in different ways doubly so. So a company-wide standard for formatting code/comments would be a good idea.
There is one problem with comments, but it is a show stopper as far as I'm concerned.
Computers never read the comments, while programmers tend to read comments rather than code. The first part is obvious, and the second is easy to demonstrate. Together, they are a recipe for disaster.
Uncommented code has a number of disadvantages, but the overriding (IMHO) advantage is that both the computer and the programmer are dealing with the same thing, the code. On the other hand, with commented code they are dealing with two similar but distinct things, that are related in exactly the same way as a fine-print contract (the code) and the car salesman's verbal promises (the comments). When push comes to shove, the salesman's words mean nothing and the contract is what matters. So why even listen to the salesman?
-- MarkusQ
P.S. This is not to say that I never comment code; only that I do so sparingly and never trust the comments.
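A small made-up example of the salesman problem described above. The function name, the rates, and the comment are all hypothetical; the only point is that the interpreter runs the code and ignores the comment.

```python
def discounted_price_cents(price_cents):
    # Apply a 10% discount.
    # (Stale on purpose: assume the rate was later changed to 15% and
    # nobody updated the line above. The interpreter never notices.)
    return price_cents * 85 // 100


# A reader who trusts the first comment expects 900; the code says otherwise.
print(discounted_price_cents(1000))  # prints 850
```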
In the real world, you work on a project for a time, then move on to something else. Then you or someone else is assigned to revisit your old code. You don't have time to relearn the code, and you certainly don't have time to sit down the guy called in to fix it and transfer your understanding of the project. (If you did, you would've documented the code properly in the first place, right?)
When companies don't comment and don't document their code properly, they begin this vicious cycle of rewriting old code because no one knows how it should or does work and no one has the time to figure it out. Let me explain why.
Imagine you find a software package on the internet licensed in a way that suits your needs. Now imagine that software package, with very few modifications, will do exactly what you need it to do for your project. You have a choice: (1) Take that software, modify it, and deploy it, or (2) write your own from scratch.
There is only ONE determining factor in whether you inevitably choose (1) or (2), and that is DOCUMENTATION.
Now remember that software you find in your own company is no better or worse than software you find on the internet, only it has a much more liberal license for your purposes. But does that change the fact that in order to make use of it you have to understand it?
On my job, I have an approach to undocumented software. I start writing documentation for it, whether or not the author wants me to and whether or not there is really enough time for it. If I have questions, I find the author, and approach him with pen and paper. We sit down and write documentation together. Inevitably, documenting what I find in other people's code ends up saving me more time than if I wrote the code myself, documented it, and debugged it. So I have been able to finish a great number of projects ahead of schedule because I don't write code: I READ it. (And this is a Perl world too!) And in the end, others are able to come and read my documents and notes and reuse the software as well.
If somebody asks you to code something (and you can get away with it) tell them this, "okay that is X hours for just the code and X*3 for the code and proper documentation."
Yes, I made the *3 up. You know why? Because I have always had the misfortune of working on the kinds of projects where I either didn't get the time needed or the guy before me didn't do the documentation.
If you want to take a ride in your car you should walk around it making sure it is in proper working order like all the lights working. It is a law and enforced by people with guns. Now how many of you do it?
Okay, nobody. So now you are under time pressure, you are underpaid and overworked, and you've got a choice: either deliver on time or tell your boss you're still writing documentation on the installer.
When I was still young and fresh I thought that following procedures was the way to do it. Boy, was I wrong. The secret? Code fast and ugly and make sure you have moved on before the shit hits the fan. Oh, and never ever be lumbered with a maintenance project. I have never even seen documentation which was up to date.
The entire discussion on whether or not to document is wrong. The discussion should be whether you will allot enough time to non-coding work. It applies to so many things: peer review of code, sharing and re-use of self-made libraries, layout standards, knowledge sharing, etc.
The larger the company, the more time can successfully be spent on non-coding things, which however are always badly reviewed during your evaluation. Oh yeah, very nice, you taught everyone else how to code securely and made sure nobody else has bugs in their code. Now how many lines did you write? Oh, no pay rise for you.
So simply ask this of the people in favor of proper documentation. How will they find the time?
And ask the non-documentation people if they will do the maintenance on their own projects 10 years into the future.
My experience? I neatly predict I need X to write code and then Y to write the proper documentation. I deliver the code and get the next project, and if I protest that I am still working on the documentation then I am told that it can wait. I am still waiting. Oh, and the risk of doing it properly? You get lumbered with maintenance and with writing the documentation for everyone else because you're good at it but a slow coder. ARGH!
Just comment the basics, point out in a readme.txt where to start reading, and tell them which bar in the neighbourhood serves hard liquor during lunch. Oh, and if you comment some code out, come back later and delete it. It can be very confusing if you have to wade through a problem where 2/3 is old code.
The answer (40 years of experience with this) is not to set a standard on how much commenting is needed; it's to have walk-throughs of the code with an intelligent reader who isn't directly involved with the code. If they can read and understand the code, it's enough.
Look into Fagan reviews for details on an effective way to handle this.
Document at the function level (javadoc style is nice). It's easy to remember and it helps you refactor. If you are documenting the internal magic, then the magic could probably be moved out into its own function, which then gets its own documentation. Voila.
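A minimal sketch of that idea, with invented function names. Instead of an inline comment explaining a tricky expression inside `days_in_february`, the expression moves into its own small function, and the explanation becomes that function's documentation.

```python
def is_leap_year(year):
    """Return True if year is a leap year in the Gregorian calendar.

    Years divisible by 4 are leap years, except century years, which
    must also be divisible by 400.
    """
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_february(year):
    """Return the number of days in February of the given year."""
    return 29 if is_leap_year(year) else 28
```

The caller now reads almost like prose, and the "magic" has exactly one documented home.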
If you need a documentation/commenting consultant, I am available to guide your team through this process.
Not only write misleading comments, but also write variable and method names both generically and misleadingly too. For example:
ArrayList aStrPtr = new ArrayList()
If you are writing in C or C++ use macros to transform your code to look like another language, but incorrectly:
#define begin }
#define loop if
and so on
Doxygen is a documentation system for C++, C, Java, Objective-C, Python, IDL (Corba and Microsoft flavors) and to some extent PHP, C#, and D.
:(
No Perl?
I keep seeing all these arguments either against commenting or against verbose languages because, supposedly, they slow development.
Now, maybe I've just been programming too long and have gotten too good at it, but typing is never ever a slow-point in coding; heck, even learning a new language doesn't slow you down too much!
The slow part is designing your code correctly so that it's fully factored and as bug-free as you can manage--this takes thought and a bit of time, but nowhere near as much time as it would take to do the same release with cut & paste (I've seen it many times).
So I'm trying to figure this out: why are people making these arguments? Is it that for inexperienced people it truly is harder to put comments in with your code? Maybe they don't know how they did their magic and don't want others to figure them out? Maybe they never took a typing class and it truly takes more time to code than think? I'm really at a loss here.
Oh, and as for the author's question, you have a FANTASTIC opportunity to improve your company tenfold. Take notes of those arguing against commenting. As soon as you've collected all the votes, throw them away and FIRE anyone who was against documentation--they should not be working in any company, at least not as a programmer! If you hired people who understood programming and the development cycle, that question would have never come up.
Didn't you get the memo? - Perl is self-documenting.
I've been programming for most of my life and have only documented about 3 pieces of my work where things got so complex, it was making my head hurt. Even then, my comment says "I feel sorry for the next person who has to figure this out..." or "Don't ask what I was thinking here...".
IMO, if someone comes up and asks for documentation, they need to be fired! They obviously either 1) don't know how to read code and shouldn't be programming; 2) Don't understand the problem the code is trying to solve.
Code is like a foreign language - you either know it or you don't. Comments are for people who don't know it; and if they don't know it, they need to find another job or learn the language.
When I program, I get in this "state" where I can't stop. When I get to that state, I am a VERY FAST programmer. If I were to document my coding, it would take me 5 times as long to write it because I would never get in that "state". On the rare occasion that I look at code and can't figure it out, I rewrite it because, obviously, the code sucks. To keep my code from sucking, I have very strict guidelines that I use when programming (in order of importance):
1. MOST IMPORTANT - use tabs in routines to show where routines start and stop
2. use tabs in routines to show where routines start and stop
3. If I do comment (yea right), don't put parentheses, squigglies, or brackets in the comments - it screws up vi's % command.
4. use tabs in routines to show where routines start and stop
5. Make variables' and functions' names intuitive.
6. use tabs in routines to show where routines start and stop
and last, use tabs in routines to show where routines start and stop
If you use these rules and have a decent programmer, there's probably very little need for comments.
Keith
Doxygen is nice because it standardizes a particular commentation style over multiple languages, so that whatever you use for a project (or within a project), you comment in roughly the same way, using the same commands, etc.
Personally, however, I very much prefer the XML-oriented way of doing it found in Javadoc and .NET's comment/documentation scheme.
If you can't keep comments up to date in the code you're responsible for, you're not competent to be responsible for the code.
There shouldn't be any debate on the need for documentation. Document your code or hit the road. The only issue is where it goes, in separate docs or in comment blocks. (Doxygen and similar systems make it easy to generate separate docs from comment blocks. Recommended.)
Code reviews are the best enforcement - if you go in and everyone's asking "what the hell does this block do???", you need to comment it.
Tom Swiss | the infamous tms | my blog
Code, like a math proof, is written in a specialized language that people outside of the field are unlikely to understand well.
Try removing all text from a sufficiently complex math proof, leaving only the mathematical notation, and see if you can still figure out what the mathematician is doing.
Now try to publish a paper like that.
No matter how amazing your results, such a proof will not be accepted by the mathematical community. I've run across some very good papers that were discarded because no one, including the author, could understand what all that math was supposed to *do* anymore.
You should be writing code the same way as you'd write a good proof. You don't need to explain why 1+1=2, but you definitely do not want to skip over critical parts of a proof that are necessary to understand before reaching the conclusion.
I see your point regarding using source code control for change comments. The issue that I have run into putting change comments in the code itself is one that also happens over time and multiple changes.
Here is an example, let's say you change line 188 to fix defect 2287. Next week, another developer needs to change the same line to fix defect 3012. Does that developer append on to your comment or overwrite your comment? What if the developer completely changed line 188 so that your changes were lost?
I guess that there is no perfect answer, so you end up putting change comments in both the code and in the CVS (or similar) system. The downside of that is the wasted resources and the potential for error in the duplicated effort.
I wrote what I thought was a pretty decent article on comments a while back:
http://freshmeat.net/articles/view/238/
The gist of it is that the source tells you what the code does, and comments tell you what it's supposed to do, why it looks that way, how it connects to other parts of the program, any weird gotchas, and so forth.
Comments help you zero in on the part of the code you're looking for when you're trying to fix a bug; and they help confirm that the code really does what you think it does.
All projects, no matter how simple, require comments.
The comment (or documentation) defines the supported API for the method or function. It is effectively the informal contract between the person writing the code and the person calling it.
The importance of the design contract is that it allows you to refactor code effectively, rather than having to reproduce every single side effect and internal detail of the code in order to avoid unknown amounts of breakage elsewhere.
And I'm with the previous guy in the thread. If you don't understand why all functions need comments, you shouldn't be writing anything even remotely important.
And yes, even code you write for yourself should be commented, so that you can come back to it a year later and refactor.
For example, take a very simple piece of code: something in a math library to add two vectors together. Suppose you implement it, and your initial implementation is generic and happens to work with complex numbers, rationals, dates, even strings. Well, that's great, but then you profile and discover it's a major bottleneck in your 3D graphics application. You want to refactor it to a high speed piece of inline assembler. You only intended to use the code for vectors of floats--but if you have no design contract, people might be using the routine with all kinds of data types, because it happened to give the result they wanted—and your hopes of a quick and easy refactoring are dashed. You end up having to define a new fastfloatvectoradd(), replace calls all over your code, and maybe end up with the original add() as dead code as far as your application is concerned.
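To make the vector example concrete, here is a sketch of what such a design contract might look like. The name `add_vectors` and the exact wording of the contract are my own assumptions, not taken from any real math library. The generic implementation happens to accept strings as well, but the docstring says only floats are supported, so a later rewrite is free to break the string case.

```python
def add_vectors(a, b):
    """Add two vectors element by element.

    Contract: a and b are equal-length sequences of floats. Behaviour
    for any other element type is unspecified and may change.

    Returns a new list of floats; raises ValueError on a length mismatch.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x + y for x, y in zip(a, b)]


# Inside the contract: safe to rely on after any refactoring.
print(add_vectors([1.0, 2.0], [0.5, 0.5]))  # [1.5, 2.5]

# Outside the contract: works today only by accident of the implementation.
print(add_vectors(["a", "b"], ["c", "d"]))  # ['ac', 'bd']
```

With the contract written down, anyone who relied on the second call was on notice, and the fast float-only replacement can keep the original name.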
On the other hand, in the real world with real code, tools like BicycleRepairMan work great.
Of course, I could have code that evals a string, or changes the base class of a live object, or alters the inheritance hierarchy, or whatever in my Python code. That's incredibly uncommon in good code, though, and it's up to me as a developer to know when I'm playing games like that and account for them.
In practice, for renaming a class/method/attribute, pulling up/pushing down methods and variables, etc., BRM works fine on 99% of my code--and the times that it doesn't are times that are obvious to anyone who knows anything about the system (it's not as though it silently fails in cases you'd reasonably expect it to work). And it integrates nicely with emacs and vim.
A friend of mine told me a story about someone at a large engineering company here in the Northeast (details are being left vague to protect the guilty and/or insane). Here's what this amazing, completely mad person supposedly actually did:
In a huge software project, he named every single variable after a notable warship in United States History. If the variable was really important, it would be named after the lead ship in a battle group, like for instance an aircraft carrier or (WWI era) a dreadnaught. If the variable was just a little local variable, it would be named after something tiny, like a P.T. boat. Variables that participated in the same battles were used in the same modules. Variables' relationships mirrored their relationships in real life, so for instance, one variable would be named after a destroyer and a helper variable would be named after a tender.
Think about how utterly brilliant and devious this is!
The ONLY people who would have any chance at all of understanding the program would be anal-retentive naval history buffs! And the scope of it was supposedly amazing. If my friend was to be believed, this was an old-fashioned, NON-OO, structured-programming project with hundreds or maybe thousands of variables, all spaghetti code, everything named after fucking BOATS!
It's priceless.
How to comment your project and thoroughly preserve your sanity:
1. Ignore any standards anyone tries to force on you. Mostly such people are full of hot air, playing a role instead of just BEING a programmer. Things don't have to be buttoned-down. So, ignore the anal retentives and RELAX.
2. Start sneaking around. Gather up everything you can get your hands on, from original user specs to whatever else. Everything you can beg, borrow, or steal, put in a folder in your desk. When you have some free time, digest it and produce short, easy-to-understand summaries. And, summarize EVERYTHING: business rules, expectations, requirements, EVERYTHING. A short, clearly written summary is worth ten pounds of worthless suit-speak memos.
3. As you code, start each chunk of code (function, procedure, class, whatever) with a brief paragraph explaining, in your own words, what the purpose of the code is. Just briefly say "this is what I'm about to do, and this is why". Be brief, but specific. Mention anything weird, like odd parameters or whatever. If you have to return a weird string because Joe the Programmer is expecting it, explain it (without being cruel).
4. Within your code, use self-documenting variables and make sure your indentation, etc (style) is clear and easy to read. I know I bitched about "standards" but it doesn't hurt to read a short book like "the Elements of Java Style". It's a good book. Make your code clean and easy on the eyes. It only takes a minute. USE WHITESPACE!!! Don't clump everything together like a core dump, add some extra lines here and there. A carriage return is only a byte (two if you're on Windows). It ain't gonna kill you.
5. Whenever you do anything in your code that is non-obvious, like testing a column you got out of a database because there's junk data in there sometimes, EXPLAIN it. Just take a couple of lines to say "The import process sometimes sticks garbage in this variable, so we're doing a sanity check on it". You don't have to comment every single thing you do, but comment everything NON-OBVIOUS you do.
And, that's about it. I think it's as easy as that. There's no need for company-wide training, or workshops, or any of that stuff. Just a little common sense, and a little effort, and your code's clear to everyone.
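Points 3 and 5 above might look something like this in practice. This is only a sketch: the function name `parse_quantity` and the kinds of junk values mentioned are invented for illustration.

```python
def parse_quantity(raw):
    """Convert the quantity column from an imported row to an int.

    Purpose: downstream totals need a number, but the column arrives as
    text. Returns None when the field is unusable so the caller can
    decide whether to skip or report the row.
    """
    # The import process sometimes sticks garbage in this column (blank
    # strings, "N/A", or nothing at all), so we do a sanity check on it
    # instead of letting int() blow up halfway through a batch.
    text = (raw or "").strip()
    try:
        return int(text)
    except ValueError:
        return None
```

The opening paragraph says what the code is for, and the one inline comment covers the only non-obvious thing in it. Nothing else needs explaining.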
Ninety percent of my work when working on old programs is TRYING TO FIGURE OUT THE DATA STRUCTURES. Not the code. The Data.
How about the 25 different letters used in the field cryptically named "F-STATUS"? OR a date in a field named "D-Date"?
Document your DATA structures you code-monkeys!
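For what it's worth, here is one way that could look. Everything in this sketch is hypothetical: the three status letters, their meanings, and the interpretation of "D-Date" are invented, since the real fields are exactly the kind of thing nobody wrote down.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class FileStatus(Enum):
    """Letters stored in the legacy F-STATUS field (hypothetical meanings)."""

    ACTIVE = "A"   # record is live and may be updated
    CLOSED = "C"   # record is final; updates are rejected
    ON_HOLD = "H"  # updates suspended pending review


@dataclass
class AccountRecord:
    """One row of the legacy file, with each field's meaning spelled out."""

    status: FileStatus  # decoded from F-STATUS
    opened_on: date     # legacy "D-Date"; assumed here to be the opening date


record = AccountRecord(status=FileStatus("A"), opened_on=date(2005, 11, 30))
print(record.status.name)  # prints ACTIVE
```

An unknown letter raises ValueError when decoded, which is usually a better place to find out about the 26th status code than deep inside a report.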
Replaceable you must be, or promoted you won't.