What is Well-Commented Code?
WannaBeGeekGirl queries: "What exactly is well-commented code anyway? Can anyone suggest resources with insight into writing better comments and making code more readable? After about six years in the software development industry I've seen my share of other people's code. I seem to spend a lot of time wishing the code had better (sometimes _any_) comments. The comments can be frustrating to me for different reasons: too vague, too specific, incoherent, pointing out the obvious while leaving the non-obvious to my imagination, or just plain incorrect. Poorly or mysteriously named variables and methods can be just as confusing. In a perfect world everyone would follow some sort of coding standards, and hopefully those standards would enforce useful comments. Until then, any suggestions for what you, as a programmer, consider to be good/useful/practical comments? Any suggestions for what to avoid? Also, I usually work with C++ so any resources/comments specific to that language would be helpful too."
I can absolutely recommend a book called Code Complete. Yes, it is published by Microsoft, but it is an invaluable language-agnostic guide to writing software that includes heavy doses of common sense regarding commenting, coding styles etc.
I have to scan through code regularly, and the biggest problem is the variable names. I realize that they must mean something to the coder, but to us maintainers they're most times akin to Sanskrit. Function (method) comments are nice too.
In a time of universal lies, Telling the Truth is a revolutionary act - George Orwell
ie,
while (1) {
}
Code Complete by Steve McConnell
Writing Solid Code by Steve Maguire
I have been pwned because my
I guess the only real solution is to give a specific coding standard for every project. Before you begin coding, make up a standard that every developer has to follow, for comments, code layout, etc.
A good standard for C++:
http://www.possibility.com/Cpp/CppCodingStandard.html
The practitioner of literate programming can be regarded as an essayist, whose main concern is with exposition and excellence of style. Such an author, with thesaurus in hand, chooses the names of variables carefully and explains what each variable means. He or she strives for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding, using a mixture of formal and informal methods that reinforce each other. Donald Knuth. "Literate Programming (1984)" in Literate Programming. CSLI, 1992, pg. 99.
This post was compiled with `% gec -O`. email me if you need the sources
I only work with Perl. When I'm looking at someone else's code all I ask is that they outline the basic function of a particular section of code so when I need to change/enhance/debug something I can find the right area to start looking as quickly as possible.
I've never had to deal with 'obfuscated' code so I don't know about obscure variables, etc., or how much more complicating they could be to my task.
Just point me in the right direction. Anything else is going to be too much or too little... and if I don't already know what the code is supposed to be doing I probably should be talking to someone who does before I sit down to work on the code itself.
Obviously reverse engineering of software is a whole different beast.
A fool throws a stone into a well and a thousand sages can not remove it.
Good code comments should describe the intention of the code. Write them *before* you write the code in a function/method to describe its purpose. This will make you think about exactly what you want it to do, and will make it easier for others to find and fix bugs when the implementation doesn't meet the intention.
I then write inline comments in the code describing its flow. Only then do I actually write the code.
Comments at file/class level should describe what it does and what it is used for. They should also describe how it fits in with the big picture of its packages and the classes around it - give a reader some architectural scope for what they're looking at.
Get into the habit, even for trivial functions/methods, and you'll soon not realize you're doing it.
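As a rough sketch of that order of work (the function name and its tie-breaking rule are made up for illustration): the header comment states the intention first, the inline comments outline the flow, and the statements are filled in last.

```cpp
#include <string>
#include <vector>

// Intention: return the longest name in `names`, or an empty string if
// `names` is empty. Ties go to the name that appears first.
// (longest_name is a hypothetical example, not code from this thread.)
std::string longest_name(const std::vector<std::string>& names) {
    // Start from the "nothing found" answer so the empty case needs no branch.
    std::string best;

    // Walk the names once, keeping the longest seen so far.
    for (const std::string& name : names) {
        // Strictly greater, so an earlier name wins a tie.
        if (name.size() > best.size()) {
            best = name;
        }
    }
    return best;
}
```

If the function ever returns the last of two equally long names, the header comment makes it clear that this is a bug and not a design choice.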
Some people say code shouldn't need commenting, and the code itself should be enough. In a perfect world of no bugs and only populated by wizard programmers, this is fine, but not in the world I live in. You write some code and someone else (maybe yourself) will have to debug it at some point - maybe 3-4 years down the line. Even with a "neat" language like Java, working out how things work is much more time consuming without comments.
I use doxygen++ for C++; it's great
about 1/4 of my lines are comments -- most all of which are incorporated into doxygen descriptions -- and the rest only appear in the source listings
see http://www.doxygen.org/
Well commented code should definitely contain a liberal smattering of four-letter expletives, eg:
// no fucking idea how this works
obj.doMagic();
or...
//bet those fucking lazy cunts in the QA team don't pick this up
fileSystem.delete();
When your code is released as open source and becomes famous, people can amuse themselves by searching through the source code to find all the hidden expletives, sort of like easter eggs. If you work for a commercial organisation, you can sit back and enjoy the panic as the QA and release teams sweat it out trying to track down every last filthy utterance before shipping to a fucker...errr..customer.
Tools like javadoc, or maybe better in your case doxygen can really help when it comes to commenting code... the idea is pretty much that you place a documentation comment before each function, or class, and so on, which usually makes the entire thing much easier. Having done that, I've found that only a few more non-obvious parts have to be commented within the actual functions.
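For anyone who hasn't seen one, a documentation comment in doxygen's Javadoc-like style looks roughly like this. The function itself (count_char) is just a made-up example; @brief, @param, @return and @p are standard doxygen commands.

```cpp
#include <cstddef>
#include <string>

/**
 * @brief Counts how many times a character occurs in a string.
 *
 * @param text   The string to search; may be empty.
 * @param target The character to look for.
 * @return The number of occurrences of @p target in @p text.
 */
std::size_t count_char(const std::string& text, char target) {
    std::size_t count = 0;
    for (char c : text) {
        if (c == target) {
            ++count;
        }
    }
    return count;
}
```

The body needs no further comments here; the block above it says everything a caller has to know.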
Tomorrow will be cancelled due to lack of interest
I also have seen my share of other people's code.
Quality of comments varies.
I've seen code from the 'hardcore hacker', who believes that the statements themselves suffice as comments - 'the code is intuitively obvious, and it comments itself'.
I've also seen code from complete lamers, who dilute the code terribly with irrelevant shit:
// increment i
i++;
Over the years, I've noticed that composition of code, and commenting/documentation of code, tend to draw on two different parts of the brain.
Often, I find myself in a 'zone', where the code flows freely, and where commenting code feels like a total distraction.
Other times - for instance, when I'm hunting an elusive bug, I find a different part of the brain kicking in - and at that stage, I find it easier, even pleasurable, to add meaningful comments, to change indenting, variable names etc, as if I'm narrating the code to someone else.
I guess it's a matter of balance, and using the right mental faculties at the right time.
A good rule of thumb is to imagine that someone else is sitting beside you, someone less acquainted with the task than yourself (eg a non-technical manager). Imagine you're explaining to him/her how the code works, and put these explanations in the code as succinct yet clear comments. Imagine this person asking you, 'what's that variable'. Don't be afraid of global search'n'replace of identifier names across all the applicable files. And imagine this person sometimes getting up and leaving you in peace, so you can have those precious moments to hack to your heart's content.
In conclusion, I feel that much of a person's personality can be read from one's code. Is someone fundamentally easygoing and helpful, and caring about others? Or is someone a complete egotist, emotionally shut down almost to the point of autism? In my mind, the ability of code to communicate its intent and methods to other programmers is almost as important as the code successfully performing its task, since its communicability directly affects the ability and interest of others in working on it, and thus its openness to manpower leverage.
-- In the beginning was the WORD, and the WORD was UNSIGNED, and the main(){} was without form and void...
It's been quite a while since I wrote any significant amount of code but after spending far too many years cutting code too early in the development process I eventually woke up to the fact that coding is the *last* thing you do (apart from testing and debugging that is).
First-up you need a good spec -- and the spec should include the user-interface details to the extent that you could actually write the user-manual from that spec.
Indeed -- if you can't write the user-manual from the spec then the spec is incomplete.
From the spec the programmer should develop the structure of the code in another document.
That structure document is repeatedly refined in a top-down process until you (eventually) reach a point where you're actually cutting code.
I was always surprised just how much easier it was when the code was written as the lowest level of the structure documentation.
Not only could you comment out the program structure document so that the compiler would ignore it -- but you ended up with absolutely accurate and comprehensive documentation built into that source.
Project managers love this technique (and when I was in a project management role I demanded it of my team) -- it ensures that technical and end-user documentation are no longer the bits that get left until last and thus are either very shoddily thrown together or, if the project goes really over-budget, not produced at all.
Of course, as we all know, there's a huge amount of temptation to just leap into coding at the earliest possible stage and leave the documentation until later -- because some stupid managers use number of code-lines completed as a metric of project performance -- duh!
If you're smart and use good tools you can selectively collapse and expand the in-source documentation so that when you're trying to get familiar with a module that someone else has written, you can descend down the structure tree one level at a time without the meaning being diluted by stuff that is at a lower level.
Unlike the days of interpreted BASIC, there's very little overhead involved in integrating documentation and code these days -- so there's no excuse not to do it.
If required, the documentation can be automatically extracted from the source -- but by keeping the master copy in the code it becomes easier to ensure synchronization as changes and updates are made during the lifecycle of the project.
Unless the code is Perl ;)
My variable names usually have to be changed after a code review by my peers...
They don't have that funny bone when the code is going to be in production software and maintained by others.
It's a professional image thing...
Money cannot buy happiness, but can buy something soo darn close, that you can't really tell the difference
The same goes for 'amusing' comments in the code, or CVS logs.
For your sake in the future, and your coworkers' sake now, please stop it.
PS. Did I mention how fucking annoying it is?
Take a look at this function, and tell me if there's a bug:
Easy, the bug's the SEGV, right? Take a look at the same function, this time with comments:
The point? A bug is unwanted behavior, but that only makes sense if you've defined what the correct behavior is. My example is trivial, but often this is a real concern. Function "bar(int,int)" returns null whenever one of the arguments is negative--is that a bug or a feature? Your function has a goal in life, a contractual obligation to do something; make sure it's clear what that something is.
Note that if you choose good function and good variable names, a simple one or two line comment at the beginning is usually sufficient to document the function's intended behavior.
I also find that an "assert()" or two on the arguments at the top of the function makes it clear what values the function accepts, and which ones the function doesn't handle. It's an easy way to document the contractual obligations of the function.
Stuff not to put in comments is stuff that's easily derived from the code. Check this out:
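A minimal sketch of the assert idea, using a hypothetical average() function (not the example from this post): the one-line header says what the function is for, and the asserts spell out which arguments it refuses to handle.

```cpp
#include <cassert>

// Returns the average of the first `count` entries of `values`.
// Contract: `values` must not be null and `count` must be positive;
// the asserts below state that contract in executable form.
double average(const double* values, int count) {
    assert(values != nullptr);
    assert(count > 0);

    double sum = 0.0;
    for (int i = 0; i < count; ++i) {
        sum += values[i];
    }
    return sum / count;
}
```

Keep in mind that assert() compiles to nothing when NDEBUG is defined, so it documents and checks the contract in debug builds but is no substitute for real error handling.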
Did the "Inputs" or "Outputs" add any value? That information appears again, two lines below in the function definition, and it's guaranteed to be correct there (unlike the comment which will be out-of-date and wrong when we change "square" to work on longs). The "Used by" might have added some value, if it was correct, but as it turns out it's out of date, and 15 other functions now use "square". Any information better derived looking at the code should be left off. Any information which can be better found using "grep" or "find in files" should be left off. Any information that will probably be out of date at some point should be left off. Heck, in this situation even the description is probably extra verbiage, since it doesn't really help anyone. (I'd probably put it in out of habit anyway, though...so sue me:)
comes this advice:
Comments are good, but there is also a danger of over-commenting. NEVER try to explain HOW your code works in a comment: it's much better to write the code so that the _working_ is obvious, and it's a waste of time to explain badly written code.
Generally, you want your comments to tell WHAT your code does, not HOW. Also, try to avoid putting comments inside a function body: if the function is so complex that you need to separately comment parts of it, you should probably go back to chapter 4 for a while. You can make small comments to note or warn about something particularly clever (or ugly), but try to avoid excess. Instead, put the comments at the head of the function, telling people what it does, and possibly WHY it does it.
--- Hot Shot City is particularly good.
are the best way to comment it all.
One day you're commenting on what variables do, the next you try to explain functions, etc.
I just switched to Java from C++ and neatness is the most important thing I've acquired, not in code per se, but in variable naming. I've gotten used to doingThisWithVariableNames and DoingThisWithClassNames, while keeping THE_CONSTANTS capitalized. Ok, this isn't comments? But you'll be surprised at how much better it is to browse a new language like Java and see the norms of style in it, because old languages use too many confusing double_StandardslikeWritingThis_way.
Comments go at the top of a page, with the coder's name and date, as well as a small bug report and, if you can, a brief function list for those without a visual IDE like JBuilder. You then put a line with PRE: and POST: conditions in your code and try to keep one-liner comments to a minimum.
I learned to comment the end of if structures and function blocks to make the code easier to follow... just add "} //end if" or something.
Comments should be a paragraph long so that they make some sense. And comments, since they look different from the code sections, should be embellished with ===============, stars, and some nice spacing and vertical bars.
Good comments to me mean good-looking comments, even if they don't have that much substance. Just my 2 cents. They're better than no comments at all.
"Wireless : LAN
Personally, I like documenting backwards. Start with the requirements, work to the architecture, then get into writing PDL (Program Design Language). Essentially, you write out as detailed instructions on what the routine does as you can, without getting to the nitty gritty. It describes the intent of the code, not the code itself. It morphs into excellent comments when you expand it out into full code, and it also has the nice little advantage that it's at a high enough level that it's applicable to multiple languages (if you should desire to switch).
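A small sketch of how PDL can morph into comments. The routine (sum_above) is invented for illustration; each comment line was a line of PDL before any C++ existed, and none of them mentions a language feature.

```cpp
#include <vector>

// Sum the order totals that are above a threshold.
// (sum_above and its parameters are hypothetical.)
int sum_above(const std::vector<int>& totals, int threshold) {
    // Start with an empty running total.
    int sum = 0;

    // Consider each order total in turn.
    for (int total : totals) {
        // Ignore totals at or below the threshold.
        if (total <= threshold) {
            continue;
        }
        // Count the rest towards the running total.
        sum += total;
    }

    // Report the running total.
    return sum;
}
```

Because the comments only state intent, the same outline would still make sense if the body were rewritten in Java or Perl.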
Personally I think the linux kernel is very well documented, at least the scheduling part, which is what I've looked at. Linus has a style of inserting huge comment blocks that explain exactly what's going on, then he'll have a page of code that does it, with little or no comments.
A style suggested in Code Complete (I forget what they call it) is to write a method completely in pseudo code, make sure it's correct, then insert the actual programming code under each line of pseudo code. This technique, while clever, I find leads to many useless comments like "loop through the employee records" and "increment the counter".
A good test to see if the comments are working is a code review: people will very often not know what's going on, or will point out confusing comments or code that needs a better explanation. Code reviews really improve your idea of what good comments are and teach you what works and what doesn't.
this is my sig.
On one of the last projects I worked on, the specs we received from the customer were horrendous. Actually, it wasn't the customer themselves who had done the specs, but another contracting firm. Spending 5 months on the project, and finding repeated errors in the "data maps" (it was apparently too bloody difficult for us to be supplied with a schema for the DBs we were supposed to be accessing and updating), I'd finally had enough.
Querying the DBs directly showed that the data maps were works of pure fantasy in several spots, or would lead to outright data loss if followed precisely. In a fit of pure...creativity...I ended up setting a "$workAroundFuckups" variable, and in the sections where it was needed, had a false evaluation do precisely what the data maps said, which would corrupt data. If the variable was true (ie, non-zero), it would work correctly, which meant ignoring the data maps and doing what was needed to have the data be entered correctly.
I ended up getting moved to another customer (due to the limited resources *we* had, not because of my creativity), so I don't know if the remaining folks on the project removed it after I left. When I added it, I explained to them precisely why I'd added it, and since they'd had similar experiences with what we were given to work with, were behind me 100%.
This wasn't even the *only* part of the project which was FUBARed, but it was unfortunately what I spent many a 15+ hour day dealing with, so I was rather familiar with it. Had I access to the server that *read* the data and used it, I probably would have just gone in and redesigned everything "for free", just to avoid having to deal with such a horrible layout.
This is also the client where, after a few months of an irksomely out of sync clock (off by 12 hours...made figuring out when something happened a bit of a PITA), I finally went in and set the damned clock to the proper time. Not surprisingly, the same folks who made that wonderful novel for us were the ones admining the dev server we were working on. AFAIK, no one ever noticed that the time suddenly became "correct" either.
"The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
It's real simple. If the reader can't tell what the code does from reading it, either it's written badly, or the reader is incompetent. In either case, comments won't help. If I see too many obvious comments, it's a clue that the author was clueless and I should probably just throw the code away because it will cause more trouble than it fixes.
Where comments are useful is in filling in information that is obvious to the author but not obvious to anyone else reading the code. When the author wrote the class/function he knew why he was doing it and why the function was needed at all; this kind of information allows a new developer to get an overall understanding of the project much faster.
http://rareformnewmedia.com/
If the code is well structured, variables, classes, methods, etc. well-named and well-conceived, it will explain itself to a large degree, and won't require an English play-by-play of every friggin detail. Generally, it's a good idea to have high-level comments that say "this chunk of code does X", but lower-level comments are often a waste of time, and only serve to clutter the code. Having said that, sometimes code is unavoidably hairy, and you have to recognize cases where the code needs some lower-level explanation, and provide it. First, avoid complexity, failing that, manage it. Generally speaking, I think code comments serve the purpose of helping s/w people to develop a mental map of the code. Code should have as few comments as possible, but no fewer :-)
OTOH, char* foo is arguably more logical than char *foo. You are declaring foo as being of type "character pointer". You are not, in fact, declaring a char with a pointer to it named foo (you never declared the char, only the pointer), which is what is implied by your recommended form.
i-name =twylite [http://public.xdi.org/=twylite], see idcommons.net
I explain what each function is supposed to do, along with any assumptions I'm making and any desired side-effects that the coder should expect. If I'm using algorithms beyond a basic for loop, I'll stick pseudo code with example inputs and outputs in the comments as well.
Function and variable names that make sense go without saying - they may be a pain to type in, but in the end they don't hamper code efficiency, so make them as self-explanatory as possible (with exceptions like using i and j for a 5 line for loop.) Remember, whitespace and good, CONSISTENT formatting is just as important as good commenting. Funky, inconsistent formatting pisses me off just as badly as cryptic commenting and 6-character all-capital variable and function names.
I comment under the assumption that the next time I look at this code, it will be years later and I'll have forgotten much of the programming language I originally wrote the code in (basically, I assume the next time I look at the code, I'll need to port it.) In instances like this, basic descriptions of what a function is supposed to do, along with pseudocode of the algorithm with sample inputs and outputs are EXACTLY what I need - not to mention, they serve as a road map for when I'm writing and debugging the code the first time. Typically, I'll have just as much commenting as actual code for simple stuff, for anything beyond that I'll have double as much commenting as code (readable english is less efficient than clean code, so what do you expect?)
You should always look over and polish both your code and your comments before you shelve your code. If you're leading a team, it should be your code that sets the bar for good logic, commenting and formatting style. Even if you're not, good maintainable code is what they're paying you to write (I hope.) Of course, if they aren't paying you enough to write clean, commented code, then they get what they're paying for...
I thought that commenting was only supported to prevent a block of code from being compiled (or even better, dozens of little fragments of code).
What is all this about using comments to document what you are doing?
Every file should start with a preamble giving module name, version, author, and maybe revision history. Most of this can be generated automatically by your version control system.
Then there should be anything from a few sentences to a few paragraphs saying what problem this module solves and how it does it. Refer to any other documentation (e.g. UML diagrams, textbook for the algorithm) that might help illuminate what is going on.
Each function or data structure should have a similar comment explaining what it does.
Avoid comments that say "this routine is used by the Foo Function to update the Bar structure". Instead just say "This routine updates the Bar structure such that...". If the routine makes no sense on its own then it probably shouldn't be on its own.
Paul.
You are lost in a twisty maze of little standards, all different.
Comments are vital, but like all programming tools require judicious use to be effective.
It is easy for large comments to fall out of sync with the code, so large comments should generally be reserved for high-level documentation of the kind one would expect to find in a literate program. Prefer a pointer (say, a URL) to a document explaining an algorithm over a block of text explaining the algorithm.
Brief comments on types and data constructors are vital where their use is not obvious. The same goes for functions, methods and procedures.
Function bodies should be small: it is better to have several small, easily understood functions (with names that clearly convey what they do) than one large block of code.
Use of formal language in comments keeps them short and clear. Compare the following:
The term 'list' can be omitted if the function has an explicit type signature (even if your language is untyped, specifying types somewhere is invaluable documentation.) Another point to note is that these comments make clear what happens when N > length(Xs) - you should specify what happens in all circumstances, even if that just means saying "if these conditions are not met then the behaviour is unspecified."
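To illustrate the general idea in C++ (take_prefix is a hypothetical function chosen for this sketch, not the comparison referred to above), a formally worded comment can pin down every case, including N > length(Xs), in two lines:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// take_prefix(n, xs) = the first min(n, xs.size()) elements of xs, in order.
// In particular, n > xs.size() yields a copy of all of xs; xs is not modified.
std::vector<int> take_prefix(std::size_t n, const std::vector<int>& xs) {
    const std::size_t len = std::min(n, xs.size());
    return std::vector<int>(xs.begin(),
                            xs.begin() + static_cast<std::ptrdiff_t>(len));
}
```

The type signature already says that xs is a list of ints, so the comment does not need to repeat it.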
Including sanity checks in code is a useful alternative to documentation.
It is a very good idea to annotate code with the invariants that should obtain at key points.
Eschew clever code. Nobody will be impressed and it's a maintenance nightmare. And you'd better be very sure you got it right...
Don't cut corners. Include *all* the error checks. Learning how to write elegant robust code is what distinguishes real programmers from cowboys. Managing both is an acquired skill.
Avoid globals. Seriously.
State and mutable update lead to unreusability, bugs, madness and divorce. Learn a functional programming language (add smiley if that helps).
Someone once said words to the effect that if you find yourself writing a bulky comment, ask yourself if you could restructure your code so as to make the comment unnecessary.
- Ralph
char* foo, bar;
was good coding practice, while
char *foo, bar;
wasn't, because the code was declaring two pointers, and so the * should be with the type and not the variable name.
Even pulling out K&R, and writing sample code showing the sizeof(foo); vs the sizeof(bar); wouldn't convince him that he was wrong.
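For the record, the claim can be checked at compile time. This is a sketch in C++11 rather than the sample code from the meeting: the static_asserts pass only because bar is a plain char and not a pointer.

```cpp
#include <type_traits>

char* foo, bar;  // despite the spacing, only foo is a pointer

static_assert(std::is_same<decltype(foo), char*>::value, "foo is a char*");
static_assert(std::is_same<decltype(bar), char>::value, "bar is a plain char");
static_assert(sizeof(bar) == 1, "sizeof(char) is 1 by definition");
```

sizeof(foo) is the size of a pointer on the target platform, which is typically 4 or 8 and not 1.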
Unfortunately, I don't think it was ever "officially" settled. Nor were several of the other corrections that I immediately made to his "proposed" coding standards document he handed out at the first meeting.
Thankfully, my manager at the time listened to me (and also, helpfully, knew C and C++), so when we got the coding standards, they were filed with the rest of the useless paperwork we got, and we kept on writing things properly, including:
Three guesses as to which project was ahead of schedule. (Of course, not entirely fair, since we also didn't force code generation via Rational Rose. We instead reverse-engineered all of our final UML from the code we'd written and tested, and knew worked the way it was supposed to...)
"The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
For the uninitiated, this means that all variable names have to be prefixed with letters that indicate its type. i for int, f for float, ch for char, etc.
The Linux CodingStyle file (in the Documentation directory in every kernel source kit since who knows when) slates Hungarian notation thus:
If you don't have Linux kernel sources on your machine, you can get a copy of CodingStyle here (from the 2.0 kernels).
As a result of all this mucking about, because I tend to look after most of the dynamic memory, linked lists, and low level bit and byte-bashing operations, I end up with variable names with more prefix letters than letters in the name. I really detest this coding standard (which for some reason also forbids the underscore character on the grounds that it looks like a minus). Do you get foo_bar and foo - bar confused? I don't.
I don't agree with all parts of the Linux CodingStyle, especially the bit about brace placement, but it's a good starting point for any C coding standard. Unfortunately, ours was designed by Microsoft-centric folk who think that
Anyhow, back to topic, the Linux CodingStyle also contains the distilled wisdom:
See: How To Write Unmaintainable Code by Roedy Green
Every time I read it, I laugh from all the crazy examples of how not to do things:
eg:
16: Names From Mathematics:
Choose variable names that masquerade as mathematical operators, e.g.:
openParen = (slash + asterix) / equals;
Nice idea; never works in practice. The reason is that what you think is easy to understand is not always what other people think is easy to understand.
The code you are writing now might have to be modified in the future by someone just out of university which means, generally, someone with very little experience. Your red-black binary tree might be "easy to understand" for you and a novelty to them.
Also, mature highly-factored, optimised code that has been improved over several years can be very hard to follow even when the original code was quite straight-forward (but perhaps too slow).
Finally, as a philosophical point, source code is supposed to be terse in comparison to natural language so it should take longer to describe the code in your own language than in the programming language.
TWW
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
This has come up before - in Martin Fowler's book, "Refactoring", he makes the controversial claim that sometimes comments are indicative of a need to change the code.
:)
Consider the different types of comment:
- boilerplate comment at the top of a file: helps no one but lawyers.
- change history comment: better use your source control tool to maintain this.
- comment before a class: does this mean the class is badly named, or too complex?
- comment before a method: ditto.
- comment inside a method: could be a smaller method screaming to get out.
Also heavily commented code is quite commonly just explaining away stupid code tricks.
Nobody's suggesting that all comments are bad, just that a lot of the time adding comments is a poor substitute for fixing what's wrong with the code. Of course sometimes it's the language that's the problem
-Baz
A couple of words, OCL
This is in the interface, rather than an implementation, and you won't get the code to the impl, so what does it do ?
/**
* Get the Bug description for the given Id
* @pre id must be > 0 and less than BugList.lastId(), the highest bug number
* @post The return must not be null
* @invariant does not change the number of bugs
*/
Well that is the comment block, not all of it because there is some OCL in there, but I thought I'd leave something for Google. The point is that the description describes exactly what the method does, it also says what the caller must do or face the consequences, and what the caller can rely upon when the method returns.
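Since the question was about C++, here is a rough C++ rendering of the same idea. Everything in it is an assumption made for the sketch: the class names, the choice of std::string as the return type, and reading the upper bound as inclusive (id <= lastId()). The contract lives on the interface; the implementation merely checks it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface, loosely modelled on the comment block above.
class BugList {
public:
    virtual ~BugList() = default;

    /// Highest bug id currently in the list (0 if the list is empty).
    virtual int lastId() const = 0;

    /**
     * Get the description for the bug with the given id.
     * @pre  0 < id && id <= lastId()
     * @post the returned string is not empty
     * @invariant the number of bugs is unchanged by this call
     */
    virtual std::string description(int id) const = 0;
};

// One possible implementation that checks the documented contract.
class InMemoryBugList : public BugList {
public:
    explicit InMemoryBugList(std::vector<std::string> descriptions)
        : descriptions_(std::move(descriptions)) {}

    int lastId() const override {
        return static_cast<int>(descriptions_.size());
    }

    std::string description(int id) const override {
        assert(id > 0 && id <= lastId());  // @pre
        const std::string& text = descriptions_[static_cast<std::size_t>(id - 1)];
        assert(!text.empty());             // @post
        return text;
    }

private:
    std::vector<std::string> descriptions_;
};
```

A caller who only ever sees BugList still knows what to pass in and what comes back, which is the whole point of commenting the interface.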
And for anyone who says comments aren't required if you write the code well enough... you are a muppet. Interaction via interfaces is a basic tenet of coding,
An Eye for an Eye will make the whole world blind - Gandhi
It is not only executable code that can benefit from comments. In particular any numeric fields should have a units comment (e.g. m or m/s). It can be quite time consuming to deduce the units from the code.
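A short sketch of what that looks like in practice; the struct and its fields are invented for illustration.

```cpp
// Hypothetical record; each numeric field states its unit.
struct Projectile {
    double mass;      // kg
    double velocity;  // m/s
    double height;    // m, measured from the ground
};

// Kinetic energy in joules (kg * m^2 / s^2).
double kinetic_energy(const Projectile& p) {
    return 0.5 * p.mass * p.velocity * p.velocity;
}
```

With the units written next to the fields, a reader can check the formula dimensionally without tracing where the values come from.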
If it takes splitting hairs to get the scopes right, then better split hairs.
char* foo, bar;
looks like: (char*) ((foo),( bar));
behaves like: (char)((* foo),( bar));
You get the same effect from:
y = x * a+b;
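Spelled out as a sketch (the function names are made up), the spacing in that expression suggests one grouping while the language applies another:

```cpp
// The spacing suggests x * (a + b), but * binds tighter than +.
int misleading(int x, int a, int b) {
    return x * a+b;      // parsed as (x * a) + b
}

int intended(int x, int a, int b) {
    return x * (a + b);  // parentheses make the grouping explicit
}
```

With x = 2, a = 3, b = 4 the first returns 10 and the second 14, just as `char* foo, bar;` looks like two pointers and declares one.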
Quick explanations are rarely precise explanations when programmers are involved.
Anyway, keeping documentation longer than the code encourages shorter code more than it encourages rambling text, unless you know programmers that like typing more than I do.
Short code with a good explanation is always better than long code with poor explanation. In fact, short code is always better, period. Every line of code beyond what is needed for the task should be rooted out; it is a source of bugs and inefficiency.
TWW
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
1. Adopt some set of coding conventions. For instance, always return 0 on success/in the normal case.
2. Use informative variable and function names. Short names are preferred, but make sure it's clear what you mean. If it's impractical to fit all the required info into the variable or function name, add a comment explaining the intended purpose of the variable/function.
3. Use small functions! Split actions up into logical steps. In combination with 2 this will help make your code a lot more readable, removing the need for many comments. Like Linus says: "The maximum length of a function is inversely proportional to the complexity and indentation level of that function."
4. Document any abnormal behaviour. For instance, if you've adopted the convention that functions return -1 on errors and you have a function that differentiates between different errors by returning either -1 or -2, document what the abnormal return values mean.
5. If the overall purpose of a group of functions (e.g. in one source file) isn't obvious, add a general comment that explains the big picture. Code is much more readable if you know what it's trying to do.
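Points 1 and 4 together might look something like this. It is only a sketch: copyName and its error codes are invented for the example, not taken from any real library.

```cpp
#include <cassert>
#include <cstddef>

// Convention for this (hypothetical) module: 0 on success, -1 on error.

// Copies src into dst, including the terminating '\0'.
// Returns  0 on success,
//         -1 if dst or src is null,
//         -2 if dst (dst_size bytes) is too small; dst is left untouched.
// The -2 case departs from the module convention, hence this note.
int copyName(char* dst, std::size_t dst_size, const char* src) {
    if (dst == nullptr || src == nullptr) return -1;

    // Measure src first so a short buffer is detected before any write.
    std::size_t len = 0;
    while (src[len] != '\0') ++len;
    if (len + 1 > dst_size) return -2;

    for (std::size_t i = 0; i <= len; ++i) dst[i] = src[i];
    return 0;
}
```

The normal case needs no explanation because it follows the convention; only the odd one out gets documented.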
If there is hope, it lies in the trolls.
No matter how clear you think you made the name of the function, there should be a comment explaining what the function is supposed to be doing. If the function accepts a lot of flags or variables you should briefly explain what they're each used for.
Knowing what the function is supposed to accomplish is a big step forward, even if there are no other comments at all.
If you're still willing to keep at it, start commenting the big blocks of code in the same manner. What are you trying to do with this loop? Why are you testing for these cases in this if statement, and if it succeeds, what are you trying to do inside of it?
Always go in favor of more comments. I would rather have to skim by a dozen comments that I don't need to read than be left hanging for the lack of one comment when something goes wrong.
And finally, always use whatever comment system your source control program uses! Even if it's just "I did some stuff to fix some problems with A," because if I later find out that a particular case of A is broken, I don't want to have to do a diff on every single code change made since the last time I knew that case of A worked.
This Space Intentionally Left Blank
i++; //increments the variable i
I think that they are unclear and do not properly explain the situation. Remember, you're writing so people can UNDERSTAND the code, not so that you can impress them with how smart you are. Instead, strive for a comment like this:
i++; /* changes the value stored in the space referred to by i to be the sum of the old value stored in the space referred to by i and the constant 1. Note: In C, this may cause what is known as a "silent overflow" if the value is too large, and go so far as to make a large positive value into a larger negative one. Oh my! */
This way, people who read your code not only understand your program, but all programs. I really think that each function you write should repeat a semester's worth of computer science theory and programming practice, so that anyone who reads your code will learn from it. Remember, not everyone knows idioms, and why should they? And since we all write open source on slashdot, many novices are going to have their introduction to any computing environment by looking at the code you write at any point.
Your most humble and obedient servant,
Dan
The best practice, IMHO, is just to follow whatever the language's author prefers. He or she is likely to have quite a bit of experience, right? And presumably if you like the language then you already agree that its designer has some degree of good taste.
This means: for C, follow the style in K&R; for C++ get a copy of Stroustrup's book and follow the style he uses; for Perl read the perlstyle(1) manual page; and for Java follow Sun's conventions. This gives you the best chance of interoperating with other people's code.
Of course, if you're doing development within a larger project, such as adding code to an existing program or writing a new utility to add to OpenBSD, then you should follow the local style conventions. Just find what the local 'source of authority' is, and follow that.
-- Ed Avis ed@membled.com
By all means, read Code Complete--its suggestions are sensible. But the real culprits when it comes to poor software are time and resource pressures, feature creep, and other environmental factors. Maybe at least the book will let you recognize when your project is doomed and leave; McConnell seems to have done that--he isn't at Microsoft anymore.
*foo means 'foo dereferenced'. In a type declaration, 'int foo' means 'foo is an int', so 'int *foo' means 'foo dereferenced is an int'. And, therefore, foo is a pointer to an int.
So, it's actually quite logical. In this: 'char foo, *bar', we declare that two things have type 'char': foo, and the thing that bar points to.
i++; /* increment i */
/* save the value of b */
a = b;
/* this function calculates theta. */
float theta(char **p, int d, float *(*fn)(int))
{
...
}
There was once an article linked from Slashdot ('tips for C programmers' or the like) which explained this clearly - but I can't find the link now :-(. Essentially, you have to consider the C declaration syntax as a kind of logical puzzle.
:-).
Take 'char *c'. This says that *c is a char. So what is the type of c? Working backwards, it must be pointer to char. Or with 'char (*c)(int)' you can see that (*c)(int) has type char, so *c must be a function taking int and returning char, so c is a pointer to such a function. The cdecl tool can help with figuring out the more complex cases.
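If you want the compiler to confirm a reading of a declaration, a static_assert will do it. A small C++11 sketch follows; upper, fp and g are names I made up for the example, and note how much the position of the parentheses matters.

```cpp
#include <cassert>
#include <type_traits>

// Toy function for fp to point at; only meaningful for 'a'..'z' in ASCII.
char upper(int code) { return static_cast<char>(code - 32); }

char *c = nullptr;         // *c is a char, so c is a pointer to char
char (*fp)(int) = &upper;  // (*fp)(int) is a char, so fp points to a function
char *g(int);              // *g(int) is a char, so g is a function returning char*

static_assert(std::is_same<decltype(c), char*>::value,
              "c is a pointer to char");
static_assert(std::is_same<decltype(fp), char (*)(int)>::value,
              "fp is a pointer to a function taking int, returning char");
static_assert(std::is_same<decltype(g), char*(int)>::value,
              "g is a function taking int, returning char*");
```

If any of the comments were wrong, the file would not compile, which is more than you can say for most comments.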
-- Ed Avis ed@membled.com
One way of doing this is to have a comment block introduce each class and each function. If these comments are not in a standardized format, at least make them consistent. If you're using a non-obvious algorithm, this is where you should describe it, in full. If it is spectacularly non-obvious, provide a reference to a separate doc. And if your project has created design documents prior to coding, provide references back to those docs.
Obviously, the way code is "chunked" in item (1) has a lot to do with how it gets documented in item (2), and vice-versa; I put them in that order because it made it easier to explain, though in practice much of (2) is done before (1). But the two have to be taken as a whole and ultimately completed as a whole.
I don't know how many times I've seen comment-as-you-go code where the comments disagreed with the code, and it wasn't clear just what a function was trying to accomplish. But if I understand the goals of a piece of code, and reasonable care has been taken in naming and organizing it, in-line comments just get in the way.
Use inline comments sparingly. Write complete, descriptive sentences at the block level. It's also good to put blank lines before and after comment blocks.
Basically someone had been going through code and found an entire subroutine commented out with the rider "This doesn't work". The original poster went on to say (s)he'd initially missed the point and thought the commenter was dumb, until the penny dropped - this would be a massive time saver if someone else thought of the same routine.
I have to admit, I'm not sure if this commenting practice would have occurred to me - until I read this I'd always deleted broken code. It's definitely something to bear in mind next time you waste a few hours working on a flawed algorithm.
UNIX? They're not even circumcised! Savages!
Has this progressed much to supporting languages other than TeX, Pascal, and C?
I have "Literate Programming" by Knuth, but there always seems to be something more important coming along that I need to read "first".
"The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
You should seriously consider
The problem with comments is that they explain what the code does, but all too often the "why" - the structure of the program - is not obvious by looking at the comments - it's like trying to work out a streetmap by looking at the names in the phonebook.
I took over a large project from a major consulting firm; much of the code was immaculately commented, but the overall structure of the design was almost impossible to fathom; the documentation was out of date and incomplete, and everybody had a slightly different view of how things worked. Whenever we fixed a bug or made a trivial change, we'd hold our breath just in case the trivial change had unforeseen consequences somewhere else in the system. A simple class diagram and database schema would have been more useful than most of the comments. Unit tests would have saved us literally hundreds of hours of pain...
Code Complete, a book by Steve McConnell is a great read on this subject; I also recommend "Agile Development" by Cockburn.
It's all very well in practice, but it will never work in theory.
I write server code that will be run by another part of the organisation, on machines that I don't have access to. As a result it's useful to be able to turn on extremely detailed tracing so that the guys running the service can send us traces of sessions that didn't behave right.
I find that the trace calls in the code are often as good as comments, in that someone browsing the code could use the traces to work out what's going on.
e.g.
for(lc=0;bytes_left>0;lc++)
{
trace("In main parsing loop, chunk %d, %d bytes remaining to process",lc,bytes_left);
}
... tells us the purpose of the loop, and the purpose of two variables. Clearly putting a real /* comment */ next to the trace line would be redundant.
I think even if I weren't tracing, I would place comments in pretty much the same places. Of course this isn't all the commenting I do -- there are certainly comments required above and beyond these, such as detailed descriptions of what a function should do above the declaration.
Sometimes, from other ppl. If I see it, it goes right back in review, and I won't pass the review until the fuckwit responsible has removed them. If you're writing code for yourself, then fine, please yourself. If you're writing code that anyone else will see, *especially* the customer, then hell no.
Thing is, there's two essential things that a reviewer/maintainer has to understand about a program: what it does; and why it does it. It should be possible to work out the first one of these just from the code, so long as the variables and functions are named sensibly. The second can be worked out from code with some effort, or the coder can add comments to explain why they're doing things that way and make it easier for maintainers.
But if someone has deliberately given all the variables names which don't reflect what they do, then it's utterly impossible to work out what the code is doing, and it's therefore also impossible to work out why it's doing it. So the code is unmaintainable - it isn't possible for anyone else to pick it up and work out what it does, except with massive work. If in 6 months time your company says "oh, we've got this code we can use with slight modifications, let's quote 1 month to do this contract" and then they find out you've made the code utterly obscure, then they'll crash and burn. And if that happens, the company *will* fire (or at least formally discipline) the person who wrote the original code, because they've been grossly negligent in doing their job. And you can kiss goodbye to any reference from them, so you'll be SOL in finding your next job.
Grab.
Certainly the most important comments are those that say WHY something
is being done rather than WHAT is being done.
If the code is written clearly, in sensible size functions and with
meaningful variable and function names it is easy to see what is happening.
It's the WHY that often escapes even the original author some
years down the line.
a guy at my work did that with filenames regarding projects (not coding)...the day he was out of the office and the boss was looking over a coworker's shoulder as they desperately pulled the stuff they needed....well...it wasn't a good day for him
I'm out of my mind right now, but feel free to leave a message.....
Here's the rules I use:
If you were blocking sigs, you wouldn't have to read this.
$StringPlusOne = $DollarDivideBy * $HashSemicolon + 8;
print "$EndQuote Semicolon new line";
getURL "http$Colon$Slash$SlashSlashDot${dot} org$slash";
...
If a support developer doesn't know what a little used third party function does and I don't comment it, who loses? Short answer: everyone. Support development team loses, I lose, QA and/or tech support loses as they wait for a fix.
Now, what if I (like so many other programmers and yourself apparently) have my head firmly up my ass so I think all my code is obvious in function and I only need to explain my intent? Well, after being humbled a few times, I'll figure out that it is more useful for everyone if I just use lots of comments, even things that may seem obvious once written, the role of support is vastly improved and my need to help with support is greatly reduced.
Of course, maybe you like doing support programming, it takes all kinds I guess.
I would say that this is part of the problem with code that you might create - you're hiding the implementation section.
The best way to produce the code is to create a clear division between functional elements, a clear division of data elements, and a clear division of implementation and error checking. What I mean by "division" is very dependent upon the language. If you've got an OO language it's clearly easier to define the difference between data types using objects than it is in a weakly typed functional language.
However, all of these things can be done with any language. If error checking is taking a lot of space, put it in a separate function, or at the very least put some sort of divider that makes it obvious where the code begins and the debug stuff ends.
The best comment is often well structured code. Comments only make it easier to understand those rare algorithms that can be explained in a non-algorithmic way. (Actually these aren't extremely rare. FFT comes to mind.)
Mod me down and I will become more powerful than you can possibly imagine!
A couple years ago, I took a programming class at a local community college. The whole class got failing grades for the first few assignments, even though the program did what it was supposed to do and had 4 lines of comments per routine.
Turns out no one got any higher than a C until they made a whole page of comments for each line of code. On top of that, the teacher demanded the code be printed out.... I remember that I ended up turning in a 100 page document once, whereas the program was only about 90 lines.
I think that's a little too much commenting, but he still said more comments needed to be made. I understand where he was coming from (he used to program in Cobol, and this was in 1998, when everyone was scrambling for patching uncommented Y2K code), but there's such a thing as overcommenting.
First off: never underestimate the value of putting research notes in your comments! A simple "This averages O(NlogN), but is worse if the data is presorted" can really make somebody's day.
Now, the long rambling description of how I like to see comments:
Every file should have, right after the boilerplate (after copyright, before #includes etc) a brief description of that file.
Name your functions something concise, and accurate, but not necessarily precise. You don't need sort-sequence-on-predicate when sort will do just as nicely.
The same goes for variables. Using i and j for numeric iterators is fine, but you rarely should use foo and bar. Most variables should also have a short (usually <20 char) comment, although globals should have a longer comment, possibly describing how they're used.
Every function should have a docstring. In some languages, such as C, this is normally comments at the beginning of the function. In some, such as Lisp, there are conventions for including the docstring in a manner that the compiler will recognize.
The first line of the docstring should be a self-contained sentence that tells what the function does. The rest can go into detail about how to call it. The purpose of this bit is "what does a caller of this function need to know". Wait on the implementation notes for now... we want the caller information to be all one, tidy package.
If your language does have docstring support (either directly or through an external tool), then implementation notes belong in comments, not the docstring. Either way, put them after the caller information, so that somebody who just wants to use your function doesn't need to read them.
As an extension of this idea, functions should begin with a brief overview of how they work. (For extremely simple functions, this may be omitted.) If the function implements a formal algorithm, such as a sort or hash algorithm, then a formal description is certainly not out of place, or give a reference. This is also a good place to note any behavioral characteristics.
Divide your code into "paragraphs", between 5 and 20 lines long. Skip a blank line between paragraphs. This also helps find areas that are good candidates for factoring out into separate functions.
Each paragraph may start with a block comment describing either what that paragraph does, or at least the program's state at that point in execution. Feel free to make these as descriptive as you like; they're the landmarks for somebody reading the code.
Loop guards frequently should have a one-line comment describing what they're testing for, in terms of the algorithm as a whole.
If a line's meaning isn't immediately clear, then clarify it with a one-liner.
Any one-liners may be expanded to multi-liners if you need to:
Mark areas that need investigation or more work with a comment of "FIXME"; that makes grepping later on much easier.
You can also use XXX for a similar purpose, or (what I do) to mark areas of grave significance.
Don't worry about descriptive comments being too verbose. Descriptions of program state, why something exists, research notes, or prose all have their place in comments. The only time I've read a file and thought "Gee, this is overcommented" was when it had template comments with changelogs and argument lists.
Now what do you not use comments for?
Don't keep a changelog in your code. Your source-control system does a much better job.
Don't repeat the argument list. The argument list itself does a perfectly fine job of that, and will always be up-to-date. If the purpose of each argument isn't clear by its name, then add a one-line comment to the arguments. (Note: In Perl, don't shift the arguments off as needed. Get all the arguments off in the first line of code if possible, such as my ($self,$filename,$options) = @_; so that somebody calling the function has an argument list available!)
Don't document the language (eg, explaining ++). This not only is unhelpful, but distracts from useful comments. If somebody doesn't know what ++ does, they can look it up.
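Pulling several of these rules into one place, a function commented this way could look roughly like the following. It is a sketch in C++ rather than one of the original examples, and findSorted is a name invented for it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first element equal to key, or -1 if absent.
//
// Callers: values must be sorted ascending; duplicates are allowed.
//
// Implementation: binary search for the lower bound, O(log N) comparisons.
// FIXME: the int return type limits this to vectors shorter than INT_MAX.
int findSorted(const std::vector<int>& values, int key) {
    // Search the half-open range [lo, hi); everything before lo is < key.
    std::size_t lo = 0;
    std::size_t hi = values.size();

    while (lo < hi) {  // range not yet empty
        std::size_t mid = lo + (hi - lo) / 2;  // avoids overflow of lo + hi
        if (values[mid] < key) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }

    // lo is now the first position whose value is >= key, if there is one.
    if (lo < values.size() && values[lo] == key) {
        return static_cast<int>(lo);
    }
    return -1;
}
```

The first line is the self-contained sentence, caller information comes before implementation notes, each paragraph has a landmark comment, and nothing explains what ++ does.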
Credits: Examples snipped and adapted from my own development, as well as CMUCL and FreeBSD sources.
- s - String
- i - Integer
- f - Float
- r - Reference
- a - Array
- etc...
I know this does not work for everybody, but for me this has done wonders when it comes to understanding my own stuff a couple of months later.
When in doubt, act determined. Business 101
The biggest problem I have with going through code (as a contractor, I tend to read a lot of new "to me" code) is the case where you're dealing with 2nd or third generation code, the original developers are long gone, and the intent of the routines has changed but the header/interface comments have not. One has to read and understand the code in order to make the changes and while the comments might be helpful, they also might be outdated and wrong. Too many times I've found a comment beside a routine call that says one thing and when I go into the routine, find it's something totally different. I am especially careful about comments that talk about expected return values or side effects. These generally aren't a problem on release 1.0 code but there's a lot of jobs out there dealing with release 5 versions of things where the comments might not have kept pace with the product.
Many times a group is unwilling to submit comment/non-code changes for a module (due to the return values changing say) to source control because they don't want to have so many modules change for what they see as a "trivial" fix in another module. It's tough to get some of those changes through many review procedures and this is what causes comments in "callers" code to be outdated. Use the comments as a guide, not as gospel.
What happens when you recast it to something else?
I've had enough abrasive sigs. Kittens are cute and fuzzy.
You missed my point entirely. *foo is not a character. *foo is a pointer, and under ideal circumstances it is going to point to a character. *foo is not a character because you are not declaring a char variable, you are declaring a pointer variable.
Saying that *foo is a character implies that you can do something like *foo = 'A';. Which, surprising as it may seem, you can't. This is usually referred to as a "segfault" or "bug". But if you first assign a value to the pointer so that it actually points to something, then you can access a character at *foo.
i-name =twylite [http://public.xdi.org/=twylite], see idcommons.net
The one true brace style as defined by K&R puts the braces on the same line as the else and if statements since they control the flow and the braces are just there for grouping.
// code
// more code
int foo(bar)
{
if (something) {
} else {
}
}
That's 8 spaces to the tab for older stuff and 4 for newer stuff.
Other things, such as always including { and } in C, and putting them alone on their own line
Phew, I thought I was alone. I'm glad it makes sense to someone other than myself to actually have the braces line up vertically.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
hehe, if one was to view the source of a big web site that I made, they'd probably run into quite a few JS variables with names like "ihateExplorer" or "ihateNavigatorfour"
"Things are more moderner than before- bigger, and yet smaller- it's computers-- San Dimas High School football RULES!"
My favorite example of well commented code, to the extreme, was a text editor w/ assembly code published in a Byte mag about 1983 or so (VDO Video Display Oriented). The source listing was nicely broken down into functions with a paragraph explaining exactly what was going on and why for maybe every 2 or 3 instructions! Anyway, the amount of English text far outweighed the actual code by maybe 10 to 1.
try { do() || do_not(); } catch (JediException err) { yoda(err); }
I would argue that int *foo means foo dereferenced is a segfault unless you've assigned a meaningful value to the pointer foo.
Or, to take your conclusion: In this: 'char foo, *bar', we declare that two things have type 'char': foo, and the thing that bar points to. Wrong, you haven't declared the thing that bar points to as type char. You haven't declared it at all. Which is precisely the sort of misunderstanding that I'm saying your style of declaration leads to.
In char *bar you have declared a pointer, and some assumptions about the pointer. You have not declared a char. There is no memory allocated to that char, nor any type checking implied about the value that bar may point to unless accessing it through bar. In other words, you have declared nothing at all about what bar points to, but rather a constraint on the use of the pointer bar.
In other words, char* foo clearly declares foo as a pointer, whereas char *foo defines *foo as a char, and by implication foo as a pointer, but no char is ever actually declared.
i-name =twylite [http://public.xdi.org/=twylite], see idcommons.net
I'm currently developing a project where I have to modify existing source code developed by someone else. I spend most of the time trying to figure out WHY the previous programmer did something the way he did it, and what the hell he was thinking. Write these into your comments. Also, write down in comments small tips for the programmers that might come after you. Comments in the style of:
/* This function expects an object that has been fully filled and checked for errors before. Be sure to never send it a NULL one! */
or
/* We're invoking the call to the database with the "mode" flag set to zero because currently it won't use it - but it COULD be used in the future. */
help a lot.
I also have to modify Java source, where I have generic classes that inherit from previous ones a lot of attributes. For example:
public class A {
public String s;
public double d;
public long l;
}
and
public class B extends A {
public String ss;
}
I prefer to write class B like this:
public class B extends A {
// public String s; : A
// public double d; : A
// public long l; : A
public String ss;
}
That way I can see quickly all of the attributes class B contains.
"Trust me - I know what I'm doing."
- Sledge Hammer
In any situation where I see the need for code commentary, I try first to find a way to make the code clearer. If the source code is sufficiently clear, comments are unnecessary. This also avoids the risk that the comments will diverge from the code - making claims that were once true, but no longer reflect the code's actual logic.
This is poorly commented code (despite the fact that the comment is clear and accurate):
// empty or null uiInitializerClassName means this task is not
// defined for use in this interface. Skip it.
aClassName = aTask.getUiInitializerClassName();
if( aClassName != null && ! aClassName.equals( "" ) ) {
... do something ...
}
This is well commented code (despite the fact that there are no comments at all):
initializerName = aTask.getUiInitializerClassName();
boolean isNotNull = initializerName != null;
boolean isNotEmpty = isNotNull && ! initializerName.equals( "" );
boolean definedForThisUi = isNotNull && isNotEmpty;
if( definedForThisUi ) {
... do something ...
}
Of course, this doesn't work in all situations, but I find that I can improve the clarity and accuracy of seventy to eighty percent of my commentary this way.
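Since the original question was about C++, here is roughly the same idea in that language. Task and isDefinedForThisUi are stand-ins I invented for the example, and the field is a plain C string so that the null case still exists.

```cpp
#include <cassert>

// Hypothetical stand-in for the task object.
struct Task {
    const char* uiInitializerClassName;  // may be null or ""
};

// True when the task names an initializer class for this interface.
bool isDefinedForThisUi(const Task& task) {
    const char* name = task.uiInitializerClassName;
    const bool isNotNull = name != nullptr;
    // Only look at the first character once we know name is not null.
    const bool isNotEmpty = isNotNull && name[0] != '\0';
    return isNotEmpty;
}
```

The one comment left inside the function explains an ordering constraint that the names alone cannot express, which fits the seventy to eighty percent estimate.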
Stop-Prism.org: Opt Out of Surveillance
Top of a huge listing of nasty code: /* I don't do comments */
I knew that tears would be my only comfort in the coming days.
Of course, if you stick to these guidelines, you're not writing comments, but patents.. So better hide those from your coworkers, otherwise they'll try to claim royalties..
SCO employee? Check out the bounty
- Persistent and temporary file formats
- User interface
- Network protocols
- System and architectural design
- Relationships between data elements (or objects if you think that way)
Some of the above are addressed by UML and associated tools, but for things like network protocols the RFC-type format is the hands-down winner. A particular implementation of a good idea in code might last a couple of years, but the protocols for a truly revolutionary idea will live for decades. (Look at Mosaic and HTTP/HTML for a good recent example.)
IMHO, comments are often confusing. A source code that has plenty of comments becomes unreadable. And comments aren't always in sync with the code when changes are made.
A source code with no comment, but whose structure is very simple is way easier to understand. When the compiler sees long and overcomplicated expressions, it painfully transforms them into more, but basic expressions. So why not write simple code?
I often see complicated lines with plenty of ternary operator usage. Why? Write the same code with simple 'if' statements. The generated code will be exactly the same, but the source code will be easy to understand.
Another very confusing thing (IMHO) is the usage of expressions without explicit braces in loops and conditional statements. I.e. things like
if (ready())
while (*++take)
if (*take == 4)
foo();
Without indentation, it's very confusing. Worse : what if I want to add an 'else' here? If I want to add an 'else' to the second if, I really have to properly indent it to avoid confusion. If I want to add an 'else' to the first if, I have to add braces. This sort of thing doesn't ease the usage of macros and can give very nasty bugs when cutting/pasting blocks without carefully understanding where implicit braces are. So why not simply write
if (ready() != 0) {
while (*++take != 0) {
if (*take == 4) {
foo();
}
}
}
The generated code will be exactly the same, but there's no possible confusion here.
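To make the 'else' trap concrete, here is a small sketch with two made-up functions. The first is indented as if the else belonged to the outer if, but the language pairs an else with the nearest unmatched if, so it belongs to the inner one (many compilers warn about exactly this).

```cpp
#include <cassert>

// Without braces: the else binds to the inner if, whatever the
// indentation suggests.
int withoutBraces(bool outer, bool inner) {
    int result = 0;
    if (outer)
        if (inner)
            result = 1;
    else
        result = 2;
    return result;
}

// With braces: the else visibly and actually belongs to the outer if.
int withBraces(bool outer, bool inner) {
    int result = 0;
    if (outer) {
        if (inner) {
            result = 1;
        }
    } else {
        result = 2;
    }
    return result;
}
```

With outer false, withoutBraces returns 0 while withBraces returns 2; with outer true and inner false it is the other way round. Same tokens apart from the braces, different program.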
Also, 'goto' is not bad. Really. When you have to break from several loops, or just to avoid deep nesting of statements, a well placed 'goto end' is way clearer and faster than useless functions and silly 'flag_to_see_if_we_have_to_exit' variables. Don't forget that after compilation, any program will have 70% of goto-like assembler opcodes.
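A sketch of the kind of goto meant here, leaving two loops at once without a flag variable. The function findCell is invented for the example, and plenty of people would factor the loops into a helper and use return instead, so take it as one option rather than the one true way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true and sets row/col to the first cell equal to target
// (row-major order); returns false and leaves them untouched otherwise.
bool findCell(const std::vector<std::vector<int>>& grid, int target,
              std::size_t& row, std::size_t& col) {
    // Declared before the loops so the goto never jumps past a declaration.
    std::size_t r = 0;
    std::size_t c = 0;

    for (r = 0; r < grid.size(); ++r) {
        for (c = 0; c < grid[r].size(); ++c) {
            if (grid[r][c] == target) {
                goto found;  // leave both loops at once
            }
        }
    }
    return false;

found:
    row = r;
    col = c;
    return true;
}
```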
Comments can be interesting to note bugs, or TODO stuff.
Also have a look at the style(9) adopted by the OpenBSD team. There are good ideas.
We only care about the type of *foo, not its value. We can talk about the type of *foo even if foo doesn't point to a valid address, just as we can do sizeof(*foo) without causing a segfault.
You could argue that it's confusing to specify the type of one thing and the storage of another, but this model has the advantage that it actually gives the correct answer. Unlike 'char* foo, bar', which the other model fails to cope with.
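For anyone who doubts it, the compiler will settle the 'char* foo, bar' question. A small C++11 sketch, where baz and qux are extra names added for contrast:

```cpp
#include <cassert>
#include <type_traits>

char* foo, bar;   // reads like two pointers, but only foo is one
char *baz, *qux;  // each declarator carries its own '*'

static_assert(std::is_same<decltype(foo), char*>::value, "foo is char*");
static_assert(std::is_same<decltype(bar), char>::value, "bar is plain char");
static_assert(std::is_same<decltype(baz), char*>::value, "baz is char*");
static_assert(std::is_same<decltype(qux), char*>::value, "qux is char*");

// The type of *foo is known without dereferencing anything:
// sizeof does not evaluate its operand.
static_assert(sizeof(*foo) == 1, "sizeof(char) is 1 by definition");
```

This compiles cleanly even though foo never points at anything, because nothing here ever reads through it.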
There is no easier way to make code ugly and unreadable than to use hungarian notation. nCount is not in any way more readable than cnt, it clutters the code and is very annoying. If your code really needs to encode the data type in the variable name then there is something horribly wrong with it.
The difference between Canada and the USA is that in Canada healthcare is a right and gun ownership is a privilege.
One of my web sites uses a Javascript function called isCrappyBrowser(). It does exactly as the name implies, returns true if the user's using a crappy browser :)
Of course, what is crappy browser is up for debate, but it's easy to add or remove browsers from the list.
Sure, some of your stuffy cow-orkers may challenge this practice, but it's all for a good cause.
If you define a structure with everything public by default, why the hell are you using 'class'? Use 'struct'. The only difference between 'class' and 'struct' is that 'class' has private members by default, while 'struct' has them public by default.
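In C++ terms that looks like the sketch below (PointS and PointC are made-up names). For completeness, the same default also applies to base classes: a struct inherits publicly unless told otherwise, a class privately.

```cpp
#include <cassert>

// struct: members are public unless stated otherwise.
struct PointS {
    int x;
    int y;
};

// class: members are private unless stated otherwise.
class PointC {
    int x = 0;
    int y = 0;

public:
    int sum() const { return x + y; }
};
```

Code outside PointS can read p.x directly; the same access on a PointC would not compile, so callers have to go through sum().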
Sorry, but that's just not true.
You need fewer comments if your identifiers are well-chosen, certainly. But I've never seen a significant piece of code that would be adequately described by well-chosen identifiers alone.
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
I get peeved at how many people leave out this critical information. Let's say they have a small section of code that is changing information in a database on the fly as it is displayed to a client. They will comment that the code changes the value but nothing else. Oh how I cringe. Not to mention people who don't comment changes they made.
(Of course, not entirely fair, since we also didn't force code generation via Rational Rose. We instead reverse-engineered all of our final UML from the code we'd written and tested, and knew worked the way it was supposed to...)
Any particular reason you didn't use Rose, that you want to share? I am just getting acquainted with Rose now, and I am pretty enthusiastic.
-Kraft
Live and let live
That's clutching at straws. You may not have defined what it means yet, but the type of *foo is char.
As opposed to, say, writing
char a;
char b = a;
where because a is a char, it automatically has a value that makes sense, right? Oh. Oops. :-(
/* This was hard to write, it should be hard to understand */
M@
Krispy Cream is people
Not at all. According to the grammar of C and C++, char *foo is the natural form of the declaration. According to the interpretation of others here, that *foo has type char, this also makes sense. The char* foo form is horrible, because it encourages you to think that char* foo, bar would be two pointers-to-char.
Why is everyone here ignoring the analogous C++ case of references, where char &x does not imply that &x is a char, though? That sinks the most popular argument here for using *foo. The sounder argument is based simply on the grammar of the languages.
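For what it's worth, the compiler can settle what each declaration means. This is only a sketch; foo, bar and ref are throwaway names, and the checks use std::is_same from <type_traits>.

```cpp
#include <type_traits>

char* foo, bar;   // foo is char*, bar is a plain char: the * binds to the declarator
char &ref = bar;  // ref refers to bar; the expression &ref is a char*, not a char

static_assert(std::is_same<decltype(foo), char*>::value, "foo is a pointer to char");
static_assert(std::is_same<decltype(bar), char>::value, "bar is a plain char");
static_assert(std::is_same<decltype(&ref), char*>::value, "&ref is a pointer, not a char");
```

If any of those assumptions were wrong, the file would simply fail to compile.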
First of all, keep your comments confined mostly to your header files. That is,
This is very bad. The comment above Foo::bar belongs in foo.h, not foo.cc. I should never have to read your .cc/.cpp/.cp files in order to use them. The comments in foo.cc should only explain clever tricks, workarounds, etc. There are two purposes for comments: (1) defining interfaces to reuse some code later and (2) describing unusual or unexpected ways of doing things. (1) belongs only in header files. (2) belongs mostly in .cc files, but can also go in headers if there's something particularly nasty afoot. This makes it easy to look at a particular project and get a good idea of how things interact, without worrying about details.
I work more with C than with C++, but the exact same rule applies (which is not always true - there is no such language as "C/C++" and what is idiomatic for one is completely awkward in the other). Modern idiomatic C dictates that you separate your code into "modules" with a .h and a .c file for each "module." .c files should only be found alone if they don't export any functions that are used in another .c file (e.g., they only "use" modules rather than implement anything). The distinction should be made explicit by declaring module-local functions as 'static'.
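Roughly what I mean, squeezed into one listing. The "counter" module and all of its function names are invented for the example; in a real project the two halves would be separate files.

```cpp
// ---- counter.h (hypothetical header): interface comments live here ----

// Returns how many times counter_bump() has been called so far.
int counter_value();

// Adds one to the counter and returns the new value.
int counter_bump();

// ---- counter.cc (hypothetical implementation file) ----

// Module-local state and helper. 'static' gives them internal linkage,
// so no other translation unit can refer to them by name.
static int g_count = 0;

static int clamp_non_negative(int v) { return v < 0 ? 0 : v; }

int counter_value() { return g_count; }

int counter_bump() {
    // Implementation note (the kind of comment that stays out of the header):
    // the clamp is purely defensive and never triggers in this sketch.
    g_count = clamp_non_negative(g_count + 1);
    return g_count;
}
```

Someone who only wants to use the module reads the top half and never has to open the bottom half.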
Another thing I find quite useful is using the "MVC" paradigm (Model, View, Controller). A concrete example: any web-based database application, like slashdot, for instance. The "Model" part is the actual database schemas. The "View" part (the interface) generates HTML. The "Controller" part is the only part that actually constructs any SQL statements. Your controller has good APIs if you can completely change your database schemas without changing any code in the "View" part. Generally, well-designed systems like this require comments to define the "Controller" APIs, and don't have too many comments elsewhere.
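A tiny sketch of that split, assuming a made-up "stories" table with id and title columns. The function names are hypothetical too, and HTML escaping is left out to keep it short.

```cpp
#include <string>

// Model: the schema itself, e.g. a table stories(id, title).
// Only the controller below knows these names.

// Controller: the only layer that builds SQL text.
// The id is an int, so nothing user-supplied is pasted into the query string.
std::string story_title_query(int story_id) {
    return "SELECT title FROM stories WHERE id = " + std::to_string(story_id);
}

// View: turns already-fetched data into HTML; knows nothing about tables or SQL.
// (Escaping of the title is omitted in this sketch.)
std::string render_title(const std::string& title) {
    return "<h1>" + title + "</h1>";
}
```

Rename the table or the column and only story_title_query changes; render_title never notices.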
Someone else has already mentioned The Practice of Programming and I'd like to chime in that this is an excellent book.
I've seen some of these in real world code, and they're not too annoying (unless you're trying to figure out what the code does :-).
//Shareef don't like it!. Another programmer named a miscellaneous callback YouWantFriesWithThat(). I was guilty of leaving around the comment // printf -- output function of kings for no reason whatsoever.
:-)
One of my favorites was a temporary function pointer named funcSoulBrotherInDaHouseNow. The same guy made a function named BashTheProcessor() with the comment
None of these symbols are exported, so the world at large was spared until now.
Staying anonymous to protect the identity of my coworkers.
The best way to ensure readability is to test readability. Show your code to two other programmers; whenever they have a question, revise the code to include the answer. After you make this a regular practice, you may find yourself anticipating reviewer comments as you code. Then you will understand readability in a way that no lone-wolf pundit possibly can.
You know that your code is well-commented when the moderators give it a +5 Informative.
IMO, the most useful and frequently overlooked element in documenting code is:
Meaningful variable/function/method/class names!
So many developers are satisfied with instance names like the ubiquitous "temp" rather than more meaningful ones like "jobStagingList".
Well-chosen and expressive variable names go a long way towards making code self-documenting.
The only thing that we learn from history is that nobody learns anything from history.
I believe peer pressure is the most effective way to improve code readability within a programming team. Coding standards can only address low-level issues, and only if they are enforced. Regular peer review focuses everyone's attention on true readability issues. A senior programmer can learn a lot from a junior programmer this way.
Now I'm not saying that all programmers who use HN are idiots. This is just in my experience, which is a relatively small cross section of a much greater pool of code. Seeing the style does set an expectation with me now, though, since all my experiences thus far have been bad. Perhaps one day I'll be pleasantly surprised.
Hmm... asking about opinions on HN might make a good interview question (on either side of the table) now that I think about it. I'll have to add that to my list of questions for when they ask "Do you have any questions?"
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
At one stage I had to do a quick and dirty with some software that was hitting priority inversion problems. I was working with AC, who had caused the issue by sloppy coding - I commented the fix with a warning that this was a "Kludge" and needed a proper cleanup later. AC petitioned for and got the "Kludge" word removed.
See my journal, I write things there
Go read K&R on this.
And use 8 space tabs - that makes code blocks more obvious than anything else, and it even warns you when you're getting ridiculously deeply nested control structures.
Another good thing to read is the Linux kernel coding style documentation (Documentation/CodingStyle). It's a good discussion of a lot of this stuff.
himi
My very own DeCSS mirror.
For advice on commenting, as well as so many other issues that differentiate a journeyman software engineer from a master, I highly recommend The Pragmatic Programmer. When I start a new job, I often wish my coworkers had read it, and by the time I leave the job, I try and make sure that as many as possible have.
All right, I know that sounds like a commercial, but it's really a great book.
Kevin Fox
There's a third thing the maintainer needs to know, which is what it's *supposed* to do. Comments are invaluable for that. Consider the following C code fragment:
;
;
;
for (i = 1 ; i < ARRAY_SIZE ; ++i)
{
    do_something_to (array [i]);
}
Why isn't it doing something with element 0?
Now look at these two fragments
/* Do something to all elements in array */
for (i = 1 ; i < ARRAY_SIZE ; ++i)
{
    do_something_to (array [i]);
}
and
/* Do something to all elements in array except */
/* the first one because... */
for (i = 1 ; i < ARRAY_SIZE ; ++i)
{
    do_something_to (array [i]);
}
Just by adding a one line comment, a bug has been exposed, or the maintainer has been prevented from inserting a bug in the second instance.
As a maintainer, I'd want to be able to see what the code does (well set out, good structure, descriptive names etc) and what the programmer meant it to do, i.e. good comments.
Anybody who puts jokey unhelpful comments in their code should be aware that these will inspire feelings of hatred and extreme violence towards them in the maintainer who has two hours to fix the air traffic control system before the 747s start falling out of the sky.
All I want is a secure system where it's easy to do anything I want. Is that too much to ask ~~ Randall Munroe
Bah, he was hard to work with, he was easy to terminate.
I say this for one simple reason. The code tells me what you actually did. It doesn't tell me what you intended to do. Auditing someone else's code is a royal PITA when they haven't bothered to tell you what a function is supposed to be doing.
Beagle Bros! *eyes tearing* I remember those halcyon days! So innocent...so much in so little RAM.
It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
Ummmm . . . No, the type of *foo in char *foo; is a char, and foo is a pointer to a char. There's no notion of forcing anything here - you're /declaring/ the variable, not dereferencing it.
In any case, even when you /are/ actually dereferencing, unless you explicitly cast the variable to a different type somewhere along the line the compiler will pick up the mismatched type error. Since C isn't strongly typed, this often gets you a warning and nothing else, but that reflects the fact that C lets you treat a chunk of memory however you want, because it was designed as an OS implementation language where you need direct manipulation of memory. But the language itself /does/ have a strong notion of the type of variables - strong enough that you can do a lot of static analysis. It breaks down around pointers and typecasts because you can throw away information if you choose, but again, the language's notion of types is well defined, with loopholes that make playing games with memory easier.
I could compile C to Caml (a strongly typed language) if I wanted to, and the only difficult bit would be providing generalised typecasts (Caml doesn't allow casting in the general case, only for things like ints to floats or chars to ints, where there are clear, well defined conversions). Most of the type handling would pass straight through untouched.
Don't make the mistake of looking at the runtime behaviour of a language and thinking that defines its semantics. In this case, a construct that's entirely meaningful will cause a runtime error if some other stuff isn't done, but that doesn't change the fact that it /is/ meaningful, and that the language properly defines that meaning.
himi
My very own DeCSS mirror.
Why bother.
Explain WHY the code works as it does, not WHAT it does.
Anybody can tell what it's doing. The reasons behind the design are another matter entirely. Here are some examples.
Just make sure that coders know that their comments are to be factual, not emotional. Comments like "I'm doing it this way because Bob said I had to" are useless at best, and inflammatory at worst.
Code Reviews - We love them, we hate them, but they work WONDERS. To have another programmer read over your code and have to try to figure it out is absolutely invaluable. Besides finding really obvious bugs or questionable places, your commentary can be critiqued as well.
For those of you that program by yourself, do the "let-it-ferment" thing. Write some code, then stick it at the bottom of your stack and pull it out in a week. If the comments and code still make easy reading and good sense, keep it. Otherwise, assume if you can't read it in a week, no one else will be able to read it the first time, either.
In an actual programming dept., if you're the manager/boss, make a set time each week to review code by other members and stick to it. Your programmers may complain but they'll complain worse if they have to be there till 2am fixing a really dumb mistake.
Blog,Twitter
Very true.
In _The Mythical Man Month_, Brooks wrote (paraphrasing here): Comment your code and I will remain confused; comment your data structures and your code will become obvious.
I use Vim primarily, and I'm beginning to switch to using Emacs for coding.
Vim has the most wonderful autocomplete hotkeys; type the beginning of the function/variable name, then press Ctrl-p to search up and Ctrl-n to search down through the file, buffers, etc. Now, long variable names are actually usable for 80wpm typists like me. (I'm around 40-50 for plain text).
Does anyone know what the equivalent (or at least sorta-equivalent) commands are in Emacs?
-Chris
My favorite was actually an error message; a piece of (commercial!) software crashed, with the error message:
"Dave says this case can't happen."
Actually, as a part-time sysadmin, most-time coder, I agree.
The point of this whole thread is to understand what the code does, right? If your code is unclear, you can aid people in reading it through comments and variable names, but if your code structure is easy to understand, then the goal has already been achieved!
Sure, it can be difficult to decide without bias how understandable your code is. Never overestimate the intelligence of the person who will maintain your code. However, to me, that's simply another argument for keeping your code structure clear to begin with.
Paranoid
Bwaahahahahaa.
In comments, I think it's most helpful to explain the "why's" of the code, i.e.:
SomeClass someObj = null;
// log but do nothing, conditional block will
// handle this along with related problem 2
// don't show the user our stack traces
try
{
    someObj = someFactoryMethod(someInput1);
}
catch(SomeExpectedException expected)
{
    LogMgr.logWarning(expected);
}
catch(Exception unexpected)
{
    LogMgr.logError(unexpected);
    rethrowUserSanitizedError("User-sanitized message");
}
if(someObj == null || !someObj2.someCondition(someInput2))
{
    showInvalidInputMessage(someInput1, someInput2);
}
I think this helps prevent another programmer's incomplete understanding of the "what" causing them to overlook consequences of making code changes.
One purpose of comments is to explain the code to another engineer (including oneself in the future). Another purpose is to demonstrate the code works, whether an informal argument that the code does what it should or a mathematical proof. These two purposes have different needs.
For the former case, standard writing rules apply. Decide who the audience is. I often figure the audience is an engineer who knows the type of programming at hand, but doesn't know what is done by this particular code, and may or may not be familiar with the product, depending on circumstances. Knowing the audience tells you what assumptions to make and what has to be explained, either by prose or by giving directions to reference material.
Write complete, grammatically correct sentences. This goes a long way to making comments comprehensible. Sometimes a little phrase won't be understood because the reader can't fill in the unwritten parts, or because there's ambiguity in the wording. It is okay to use short phrases when describing objects being defined or declared (e.g., "number of links to this object" or "dollars owed this customer"), but keep the context in mind. Introduce the compound object with sentences where appropriate.
"Dollars owed this customer" reminds me -- use units. Don't write "Money owed this customer" or "time since last update." Specify seconds or milliseconds, not time. Document how the object models whatever it is modeling. That may be a physical thing like time or a conceptual thing. E.g., if a pointer connects one object to another, document the relationship that represents. If a "debt" class contains a pointer to a "person," don't document it as "person associated with this class." Document the relationship -- this particular pointer may represent the debtor, the creditor, the escrow agent, or somebody else.
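To make that concrete, here is a sketch of what such declarations might look like. Person, Debt and whole_dollars_owed are invented for illustration, and storing money as cents is my own assumption, not a rule.

```cpp
#include <cstdint>

struct Person {
    int id;
};

// One outstanding debt between two people. (Hypothetical type.)
struct Debt {
    const Person* debtor;       // the person who owes the money (not owned by Debt)
    const Person* creditor;     // the person the money is owed to (not owned by Debt)
    std::int64_t amount_cents;  // amount still owed, in cents
    std::int64_t age_ms;        // milliseconds since this debt was last updated
};

// Returns the amount owed in whole dollars; any fraction of a dollar is discarded.
std::int64_t whole_dollars_owed(const Debt& d) {
    return d.amount_cents / 100;
}
```

Compare that with two pointers both commented "person associated with this class" and a field called "time".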
Give context. I have seen thousands of modules that just leap into code with no explanation of what they are. Even if the comments say what a function does, a reader might not really understand it until they know what it is used for. Document where the code fits into the bigger scheme and what it is used for. Give the reader context so the purpose of the function makes sense. Even if a complete mathematical description of a function is given, so that the reader can precisely predict its behavior in every situation, it might not make sense to the human mind until they have a mental image or model of it.
For the second purpose, demonstrating the code works, explain how the code implements an algorithm. It's not enough to explain what the steps are doing; you need to show how the total result comes out of the algorithm, unless it is something simple or familiar. E.g., a formal description of the long division taught in elementary school would generally be incomprehensible. "Find the largest digit d such that d times q is no greater than r[i]. Subtract d*q from r[i] to get r[i+1]. Append d to output..." Nobody seeing that for the first time would understand what it is doing, even if all the steps were clear. Even if you explained each step and explained the result, it won't be clear to some readers how the steps produce the result, so explain that.
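Here is one way the long division example might be commented so that the reader sees why the steps add up to the answer. It is only a sketch: long_divide is a made-up name, the input is assumed to contain decimal digits only, and q is assumed to be positive and small enough that 10*q fits in an int.

```cpp
#include <string>
#include <utility>

// Schoolbook long division of a decimal digit string by q.
// Returns {quotient digits (leading zeros kept), remainder}.
//
// Why it works: after the first k digits have been consumed,
//     (number formed by those k digits) == (quotient so far) * q + r,  0 <= r < q.
// Bringing down the next digit multiplies both sides by 10 and adds the digit.
// Choosing the largest d with d*q <= r and subtracting d*q restores r < q,
// and appending d to the quotient keeps the equation true.
std::pair<std::string, int> long_divide(const std::string& digits, int q) {
    std::string quotient;
    int r = 0;
    for (char c : digits) {
        r = r * 10 + (c - '0');  // bring down the next digit
        int d = r / q;           // largest d with d*q <= r; d <= 9 because the old r < q
        r -= d * q;              // remainder carried to the next step
        quotient.push_back(static_cast<char>('0' + d));
    }
    return {quotient, r};
}
```

Dividing "1234" by 7 this way gives the quotient digits "0176" and remainder 2, and 176 * 7 + 2 is 1234.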
Document alternatives that weren't chosen, and the reasons why. If you were tempted to implement algorithm X but found you had to do Y because some error might occur, record that information. Otherwise, somebody working on the code next year might see your longer code for Y and change it to X without realizing the problem.
This isn't intended to be a complete list, just what occurred to me at the moment.
Code should, so far as possible in context and language, be self-documenting, which is to say: the code by itself should lead me as far as it can in understanding. Good, well-designed OODLs make this somewhat easier than more traditional straight-line imperative languages, but all high-level programming languages are capable of such practices.
This means coding standards -- all code doing substantially the same thing should probably look the same. Indentation style should be sound and consistent, and unless strong countervailing arguments exist, all procedures should be very short, very well-named, and be very straightforward. As others have noted, variable names should be as precise and accurate as possible -- and, believe it or not, naming variables and procedures properly all the time is VERY hard.
When code gets written, and you start smelling the "smells" of imperfect code, this means it is time to refactor: move variables out of objects to where they belong; collapse classes where appropriate, or break them up where appropriate; rename procedures; break procedures up to clarify (and rename); consolidate procedures to clarify (and rename); and so forth.
An excellent book on refactoring is "Refactoring" by Fowler. Refactor until you shouldn't refactor any more. Then the code, sans smells, should read like a charm.
Then, and only then, will you know what the comments need to say. Comments should be things not possible to express in the code per se, or which couldn't reasonably be understood without deeper analysis. For good, well-factored and self-documenting code, most comments are usually tight, short, rare and VERY helpful. Indeed, most comments in good well-factored code are usually unnecessary for experienced readers -- and too many comments actually get in the way of reading the code.
One last thing -- keep your code comments current. Too often, changes are made in code without corresponding updates in the comments (since compilers tend not to pick up such things). The single worst possible thing is a comment that is wrong. Worse than not commenting at all. And both are heinous. Almost as bad is commenting superfluously or too much.
My view -- write your code right. Then, the comments, when necessary, will be apparent -- kinda' like the code became apparent from writing right.
I do that too!
and so on. Oh, wait. :)
Secession is the right of all sentient beings.
...who make reviewers like me stare at computer screens for endless hours trying to figure out how the hell your computer code is supposed to work.
Comment sparsely. Do not sprinkle your code with comments. Especially do not use comments like
Yea, I can already picture your programming style. You'd make a 200-line function with the only comment being "// Creates hash table". Question: Where does that leave me? When I find out that there's some problem in the hash algorithm, I have to dig through 200 lines of code to find some freakin' bug that is described only by "Creates hash table." Your example of why comments don't need to be made is a poor one:
// increment loop counter
loopCounter++;
That is adding zero value.
Yes, because it's one line of code, and the code is described through the variable. But when sifting through lines of code, you often find beautiful works like iHateMyJob++; or fuckMyBoss--; to name a few. And needless to say, they're uncommented in the code. Until computer code can be written bug free in complete English sentences (aka Never), the rest of your team of workers needs to understand what your code does.
Personally, I make sure every function says what goes into it, what comes out of it, and what setup (variables, etc.) needs to be done before it can be called. I do not comment every single line of code, but I do make sure that every line is accounted for by descriptive sentences, explaining the task that I wish to accomplish as well as what variables / registers / actions I take to accomplish the task.
Every time someone has to change some code, you've just forced them to double their workload, and change some comments too.
Okay, this just pisses me off. You didn't mean what you said. Here's what you meant to say:
Every time I have to change some code, you've just forced me to double my workload, and change some comments too.
I can assure you, from a reviewer's point of view, comments SAVE my time from trying to understand what each piece of code is trying to accomplish. Commented code may make you work extra time to detail the lines of code (I do admit, some programmers are quite talented at keeping track of every single line of code in their head as they work on it on the computer), but it saves tremendous amounts of time once that chunk of code needs to be integrated with other chunks of code into the final product.
Some variation of the methods described in "Literate Programming" by Donald Knuth is a good place to start. In summary, Knuth says that you should be able to extract from the same source both machine instructions and a human-parsable document, with unusually high importance placed on the latter. Whether or not you want to embed LaTeX into your document is up to you (I never have bothered), but on the whole find something that will make your code and algorithms understandable to another programmer who's never met you (because that's probably who will be either grading or maintaining your code at some point).
I once wrote some code in Command level CICS-COBOL that used GETMAIN and FREEMAIN to do dynamic memory allocation so I could take a data structure in a file (by order.item.warehouse.quantity,) and stand it on its ear (by warehouse.order.item.quantity,) to get at the data that way.
It worked fine and I documented it and the BLL cell layouts properly but the CONCEPT of dynamic memory allocation proved to be the stumbling block.
I ended up giving a course on my hack and they still didn't get it. I eventually left the company and left them with working code that they were not intellectually capable of modifying. I didn't feel good about that.
Sometimes they don't understand one concept (object instance relationships) and sometimes it's another (state machines), but I always seem to encounter people having problems that they are too busy struggling with to see that the problem lies with their application of an inappropriate solution.
AND THEY AREN'T MENTALLY EQUIPPED TO PERCEIVE THE TRUE DIMENSION OF THE PROBLEM. (Not its metric, [I didn't say SCALE of the problem,] but its topology.)
There is bugger all to be done in those cases.
Being ignorant of the uses and applicability of state machines leads to combinatorial explosion of GUI (and supporting objects) code to handle different state transitions.
Rather than maintaining the state machines, (not even the engines but the state transition arcs and nodes,) they were ending up replicating GUI code.
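For anyone who hasn't seen the table-driven style, a bare-bones sketch follows. The states and events (Idle, Editing, Saving; Open, Save, Done) are invented for illustration and have nothing to do with the actual project.

```cpp
// Hypothetical dialog states and user events.
enum State { Idle, Editing, Saving, kStateCount };
enum Event { Open, Save, Done, kEventCount };

// The whole behavior lives in this table: one row per state, one column per event.
// Changing a transition means editing one arc here, not copying handler code.
static const State kNext[kStateCount][kEventCount] = {
    //               Open     Save    Done
    /* Idle    */ { Editing, Idle,   Idle    },
    /* Editing */ { Editing, Saving, Idle    },
    /* Saving  */ { Saving,  Saving, Editing },
};

// Returns the state reached from s when event e arrives.
State step(State s, Event e) {
    return kNext[s][e];
}
```

One generic handler calls step() for every screen; the alternative is a separate copy of the GUI code for each combination.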
That was really fucked up. I was asked to resign. I did, heaving sighs of relief.
(And then somebody used the building as a landing strip. Yeah. I'm still fucked up about that...)
MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
One of the best examples from my personal experience is a fairly large 8-bit assembly language project I did (for an embedded PIC16C76-based product). The project has approx 25000 lines of assembly code distributed in about two dozen files (which built about a dozen different flavors of the product, though virtually no code was duplicated due to common code being used by includes... it really was about 25k lines of assembly!)
The first time another programmer worked on it, he of course complained that it wasn't commented very well and lacked documentation.
In fact, a quick grep ';' *.src | wc compared to a wc of all the files revealed that 34% of the lines had a comment (the vast majority the whole line was a comment, as I tend to write blocks of prose above sections to explain what they do, rather than a comment for each register move).
But if 8500 lines of comments wasn't enough, there was an 11-page document I wrote about the design of this firmware... about half of it was a "roadmap" that described the "larger picture" of the firmware and how it was arranged into the various files. The other half documented specific tricks (like a 6-instruction sequence including a skip-past-a-skip that achieves a 16 bit add/subtract in that PIC part which lacks carry input). There was also a lengthy discussion of the overall strategy for managing the various bank-swapped memory of that processor and some other stuff about the real-time performance (that was a hard real time application).
Learning is painful for most people, and learning someone else's code seems to be absolute agony for many engineers & programmers. They always complain that you didn't document/comment the code enough, even in an extreme case like this!
PJRC: Electronic Projects, 8051 Microcontroller Tools
Lots of posts on quantity of comments, and there's a bundle of good arguments for both the more-is-better folks and the don't-overcomment advocates. Similarly, the drive to make your variable names meaningful is worthwhile, but the one addition to any code, be it Perl, C or anything else, that makes maintaining it easier has to be the humble newline.
The important thing is not how many comments of what type but the overall layout of the source so that it is consistently understandable on reading through it. If a comment is required to accomplish that then insert one. If, OTOH, all you need to do is break up and indent the lines a bit more intuitively then do that rather than trying to explain the more awkward structure in a comment.
Sure, you can easily pack a fully functional Perl script into a 4-line .sig if you want, but a 100-line script that's as squeezed together as it possibly can be becomes unreadable no matter how many comments are inserted into it. If a single line of code does more than one step in your program then consider breaking it up. If it absolutely has to be one line in order to work then backslashes are your friend. The guy that reads your code to find out how you did that after you've moved on to bigger and better things might be an entry-level hire who has enough of a learning curve to cope with without wrapping his/her head around tightly compacted code as well.
Remember how simple you kept it when you first started learning a language? Keep it that way when you're more experienced unless there's a reason to do otherwise.
I had a
I've noticed that the experience of the programmer, like most aspects of software design, is only an asset if the programmer has done a variety of tasks. I've seen veteran programmers who have only written code and actually never maintained it. They never actually learn what maintainers need to make modifications, so the comments only help them write the code.
My best advice is to do as many tasks in the software development process as possible. This includes testing, maintaining, and working with users and even technical support (gasp!). Don't get stuck doing one thing. You won't get better and better at it, you'll become more out of touch and therefore do an even worse job.
Experience is best measured in diversity, not years.
-- Ken Kinder ken@_nospam_kenkinder.com http://kenkinder.com/
If you define a structure with public stuff by default, why the hell are you using 'class'? Use 'struct'. The only difference between 'class' and 'struct' is that 'class' has private members by default, while 'struct' has them public by default.
Why should he? He's not writing C. Why should he have to double the syntax when, as you said, 'class' would do the exact same thing? Maybe it's better that he explicitly has to make everything public. Maybe he thinks, as do programmers in most other languages that support classes and struct, of structs as being mere aggregated data and classes being actual object definitions. Or maybe he just likes browsing through classes by searching for "class " in his editor. Just because the language supports something doesn't mean he has to do it, especially when it actually reduces the value of the code.
I've finally had it: until slashdot gets article moderation, I am not coming back.
Robert Pirsig in Zen and the Art of Motorcycle Maintenance commented on how as a tech writer assigned to write user manuals, usually the guy assigned to teach him how the thing worked was the least competent and knowledgeable on the whole team. Nobody minded _him_ being taken away from his work to do documentation....
I suspect coding or documentation standards are often the victim of this same practice, squared.
So, it's one thing to say "if they change the code, they MUST change the comments", but realize that unless you have the ability to force the issue (a tool to make you change comments before saving changes, managers who care more about firing programmers who don't follow code standards than avoiding schedule slippage -- hint: I've never seen one of these, EVER), 9 times out of 10, they just won't do it. It's like teaching abstinence as a method of reducing teen pregnancy.
Thus, the practice of having comments which are redundant w/ the code is simply setting the project up for failure as the parent poster pointed out.
Think of your code as a document that describes how something is done. Always imagine that it is being read by an intelligent person who doesn't already know how or why this thing is done in a certain way.
Oh, and never define words in terms of themselves. This is not helpful:
Consider: if you saw a definition like this in a dictionary, you'd laugh out loud. Every definition should either (1) completely avoid all the words in the term being defined, or if that is too cumbersome, (2) have a reference to a document/glossary/whatever that describes those words in more detail. If your project has no such glossary, it probably needs one.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
It's simple:
Use COBOL!
Stupid people will be persecuted to the fullest extent allowed by law.
I do believe however that if you increase the complexity with a few if() conditional statements that the second form would be better BUT the support function declarations should be more compartmentalise by a
/*
SUPPORT FUNCTIONS, MAIN EXECUTION BELOW
*/
What I HATE is unnecessary function calls that give you a 10-level deep stack trace for a simple "Hello world"
A caveman dreams of being us, the incalculable power and riches. We dream of being Q, then what?
Hm ... I find the following observation interesting:
.... writing program code is mentally a thing where you fiddle with the HOW.
... even if the code is hard to understand.
... if we need x == y, suddenly we should write WHY we need that)
// increment i
...
.... coders should do the same.
... and dropped it some 4 or 5 years ago.)
:-(
Most programmers (especially those without formal education) find programming more an art than an engineering task.
If you talk to them about a "software project" they immediately tell you HOW to do it and skip the "analysis phase" in which you usually try to figure out WHAT is needed.
Another thing, similarly interesting, is: how do you teach, or how do you learn?
The best way is to start with the WHAT. That means simple facts, like: "a programming language supports the concept of storage and the concept of instructions (usually)".
Supported with simple examples.
Then you teach the HOW. How to use the concept of storage? How to use the concept of instructions?
Supported with simple examples.
Finally you teach the WHY. That means you teach the principles which lead to "the physical laws", efficiency, beauty or in-depth insights. E.g. why to use a guard element at the end of an array while searching the array with a loop.
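The guard-element example can be sketched in C++ roughly as follows. This is only an illustration: the function name findWithGuard and the choice of std::vector are assumptions, not something from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first element equal to `wanted`,
// or values.size() if there is none.
// Takes a non-const reference because the guard is appended temporarily.
std::size_t findWithGuard(std::vector<int>& values, int wanted) {
    // WHY a guard: appending `wanted` guarantees the loop terminates,
    // so the loop needs one comparison per element instead of two
    // (no separate "index < size" bounds check).
    values.push_back(wanted);

    std::size_t index = 0;
    while (values[index] != wanted) {
        ++index;
    }

    values.pop_back();               // restore the caller's data
    assert(index <= values.size());  // equals the size when not found
    return index;
}
```

The WHAT (a search) is in the name, the HOW is in the loop, and only the WHY of the odd-looking push_back needs a comment.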
If we go building a new application for a customer we usually do an OO analysis and an OO design. (If we are engineers and not artists.)
That means we first analyze the WHAT. What is the application supposed to do? Then we can go deeper and analyze HOW the application is supposed to do it.
So far we are far away from code. We have only found classes, some attributes and a few methods, because we looked only at the WHAT. The HOW we have looked at so far was only the user's point of view.
If we shift to design, to actually get closer to the code, we concentrate more and more on asking ourselves HOW. HOW will it work, HOW can it work, HOW should/will it be implemented to work. From the domain constraints of the application we shift to the technical constraints of the run time environment and the implementation.
So
How to get a new customer into the DB? The code describes how to do it
The first goal in coding (the foundation is laid by the analysis) is using good class names and good method names (or the equivalents if you do not program OO).
This gives you an insight into WHAT you want to achieve.
While you code you fill in methods and add methods (if you would otherwise write methods that are too big) and craft HOW to achieve it.
So, what should a comment do now? The comment is needed to fill the gap, or to express the third point of the mental model we make: WHY.
OOPS? Yes, we know WHAT we want, we know HOW to get it, but WHY do we get it in that way?
Comments are needed to express WHY we in certain cases drop a habit. (e.g. most for loops in C/C++/Java have for(;x < y;)
Several posters pointed out this is a dumb comment:
i++;
Right! But it is not dumb because it is obvious what i++ does, or because it is superfluous.
It is dumb because it explains the WHAT. It says what the line of code is doing.
Unfortunately comments like that are used in language teaching books, to TEACH coders that "++" means (post-)increment.
New coders learn from that: you have to comment like this.
So again: don't comment the WHAT. Comment the WHY.
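A small C++ sketch of the difference, assuming a made-up function countDataRows over lines of a file whose first line is a header (both the function and the file layout are hypothetical):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts the non-empty data rows of a file whose first line is a header.
std::size_t countDataRows(const std::vector<std::string>& lines) {
    std::size_t dataRows = 0;

    // A WHAT comment would say "start i at 1 and increment it".
    // The WHY comment is the useful one: line 0 is the column header,
    // so this loop deliberately drops the usual start-at-zero habit.
    for (std::size_t i = 1; i < lines.size(); ++i) {
        if (!lines[i].empty()) {
            ++dataRows;
        }
    }
    return dataRows;
}
```

The comment earns its place only because the loop departs from what a reader expects.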
In teaching OO my rule of thumb for method sizes is: it needs to fit on the screen of my laptop.
As I project from my laptop onto the wall, I use a big font, like 18. Methods then have a size of 8 lines or so.
If a method gets longer, you usually make sections and write a comment on top of each section, don't you?
Well, that's an excellent way to extract a method name from that comment and to put that section into a "protected" method.
With more methods, some of them protected and factored out of otherwise too big methods, we suddenly get more freedom to "modify/extend" derived classes. And: oops, we are suddenly close to using a well known design pattern: Template Method. The big method is small and calls several small methods. The big method is the Template Method, the smaller ones are hook methods. In a derived class I can suddenly modify a single step of the algorithm by overriding a hook method.
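As a rough C++ sketch of where that refactoring ends up (the Report and SalesReport classes are invented for illustration, not taken from the post):

```cpp
#include <string>

class Report {
public:
    virtual ~Report() = default;

    // Template Method: fixes the order of the steps and stays small.
    std::string render() const {
        return header() + body() + footer();
    }

protected:
    // Hook methods: each was once a commented section of one big method,
    // and the section comment became the method name.
    virtual std::string header() const { return "[report]\n"; }
    virtual std::string body() const { return "no data\n"; }
    virtual std::string footer() const { return "[end]\n"; }
};

// A derived class changes a single step by overriding one hook.
class SalesReport : public Report {
protected:
    std::string body() const override { return "sales: 42\n"; }
};
```

Calling SalesReport().render() keeps the header and footer of the base class and swaps only the body step.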
So, now you probably have a class with 25 methods?
Probably you should divide the class up then
The art of OO Analysis gives you the insight to realize early that a class mixes two concepts and therefore should be split up into two classes.
E.g. keeping all attributes of an author in the book class just lets the class explode and is wrong anyway.
The ART OF OO PROGRAMMING is to anticipate such refactoring points and to use them wisely when appropriate (and not to use them when not).
Programming in itself is no art. Just like painting in itself is no art. Artists usually study 10 years and more under the guidance of other artists
My programs have nearly no comments. Except for classes, WHAT is it for. Methods and their parameters, WHAT are they for, and sometimes HOW do they work.
If a method body is not clear, or if I feel I should do it, I explain WHY I do certain things like I do.
Except for counters all variables have long explaining names. Methods and classes anyway.
I do not use index variables like "bookCounter" because a loop like:
for (int bookCounter = 0; bookCounter < MAX; bookCounter++) {
    if (found(book[bookCounter]))
        return book[bookCounter];
}
meets my criteria of: it does not give any benefit. I see WHAT is going on, and there is no special need to explain WHY something is happening.
A simple 'i' is good enough. (I used for more than 10 years variables like bookCounter
The hint, pointed out by several posters, to use assert() instead of comments is absolutely valid. Drawback: tools like doxygen and javadoc, do not transfer those asserts into the outside method documentation. So by browsing HTML docs you do not see it.
If you have to dig into the code anyway, an assert is much better than a comment. Especially if you have a test suite triggering your asserts.
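A minimal sketch of an assert standing in for a comment, using the standard assert macro from <cassert>. The function average is a made-up example:

```cpp
#include <cassert>
#include <vector>

// Returns the arithmetic mean of `samples`.
double average(const std::vector<double>& samples) {
    // Instead of a comment saying "samples must not be empty",
    // the assert states the precondition and checks it in debug builds
    // (it is compiled out when NDEBUG is defined).
    assert(!samples.empty());

    double sum = 0.0;
    for (double sample : samples) {
        sum += sample;
    }
    return sum / static_cast<double>(samples.size());
}
```

Unlike a comment, the assert cannot silently go out of date: a test that violates it fails loudly.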
Too bad, really too bad, that SUN went for an assert facility in Java 1.4 and did not use pre- and postconditions like Eiffel
Regards,
angel'o'sphere
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
It's less of an issue w/ Javadoc and Doxygen comments (which are embedded in the code) than external documentation, but the fact is that managers reward code changes, not documentation changes, and programmers are lazy.
Until you can change these basic, simple facts, what are you going to do? One strategy is to encourage self-documenting coding standards as well as encourage documentation updates. But people NEED to remain aware of the basic principle that the only authoritative documentation is the source code itself.
I write all of my comments in haiku.
comments in haiku
not for any good reason
i was really bored
Ok, not a really good example, but you know what? No one's ever said anything about it.
I use habits learned from Code Complete every day, such as taking the time to choose a good name (no, I'm not talking about Hungarian notation). The single most important point of the book is that people read code, too.
I found Writing Solid Code, another Microsoft Press book, to be less enjoyable. It was, however, worth its price for one C habit I've retained: put the rvalue on the left in equality tests. For example: if( 0 == foo ). Why wait until run time to find that you've accidentally assigned, instead of compared?
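A small sketch of that habit in C++ (the function name isZero is just for illustration):

```cpp
// Returns true when `foo` is zero.
constexpr bool isZero(int foo) {
    // With the constant on the left, the classic typo `0 = foo`
    // does not compile, because a literal cannot be assigned to.
    // The mirrored typo `foo = 0` would compile and always yield false.
    return 0 == foo;
}
```

Many compilers can also warn about an assignment used as a condition (GCC, for example, does so with -Wall), so the habit matters most where such warnings are off or ignored.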
As to "junk food" in programming, I prefer to categorize all of that as "rapid prototyping."
This is a good framework. Your code is probably very pleasant to read. I might argue with your position about marking changes, however. There are times that it's highly useful to see why and when a particular block of code was inserted. Basically, if it's relevant to understanding or troubleshooting that chunk of logic, then by all means leave fingerprints.
One other suggestion you didn't make was: Write the comments first. In The Old Days this was much more important, when working e.g. in assembly language, C6, or FORTRAN. But it's still very helpful. As you start writing the code, implementing your abstract design, you pass through all the major decision-points that shape the detailed implementation. It's relatively easy to capture those decisions as you rough out the module, explaining macro control and data flow, algorithm design, etc. Later comments tend to focus on the gory details rather than the big picture.
Another suggestion is: Revise old comments. It's easy to let old comments go out of date, particularly module-level and block comments. Part of the revision discipline should be reading through the existing documentation and making sure it's (still) accurate. If not, fix it.
Two more: Design and use a set of standards for naming, module headers, error handling, diagnostic messages, etc. These things don't evolve; they have to be implemented by choice and agreement. And Use whitespace intelligently to ensure that important things are easy to see.
-- We all have enough strength to endure the misfortunes of other people. La Rochefoucauld
char *foo, bar;
IMO, the syntax used to define two pointers on a single line in C is broken. I'd guess that it's a mistake in the original spec and simply hasn't been corrected because the fix would break some code.
Think about it, the real type is "char*" "bar" not "char" "*bar". This can be easily demonstrated with the type casting syntax: you write "(char*)(foo)" instead of (char)(*foo) -- and notice how those mean different things. I always write "char* foo;" and define only one pointer per line. I define loop counters like i, j, k, m and n on a single line, though.
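For anyone who wants to check what that declaration really declares, here is a compile-time sketch using std::is_same from <type_traits> (C++11 or later). The names first and second are made up:

```cpp
#include <type_traits>

char *foo, bar;  // the one-line declaration under discussion

// The `*` binds to the declarator `foo`, not to the type `char`,
// so only `foo` is a pointer.
static_assert(std::is_same<decltype(foo), char*>::value, "foo is char*");
static_assert(std::is_same<decltype(bar), char>::value, "bar is plain char");

// One pointer per line avoids the trap entirely.
char* first = nullptr;
char* second = nullptr;
```

If either static_assert were wrong, the file would not compile.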
_________________________
Spelling and grammar mistakes left as an exercise for the reader.
Spreadsheet
flight
simulator
I tend not to use multiple declarations per line; the problems outweigh the positives (With multiple declarations: You cannot comment variable usage later, it's (slightly) more typing to refactor code by removing or changing variables, and there's the problem you bring up).
Once you've adopted that rule, it's a lot clearer (and closer to the truth) to use "type*", since the actual type of "var1" in your example is "pointer to type", not "type".
It's also the style Bjarne uses in his code, if that makes any difference to you.
Use a tool like doxygen to formalize your comment formats. The Mozilla project has a page for their doxygen-generated documents at the Mozilla/SeaMonkey Code Documentation and Cross-Reference.
In the Java world, Sun has a couple of documents on how to comment code, How to Write Doc Comments for the Javadoc Tool and Requirements for Writing Java API Specifications. Note that the latter references Object Class Specification by Edward V. Berard, Essays on Object-Oriented Software Engineering, 1993 Simon pp. 131-162., which is an excellent read in general.
All that said, please read and live Martin Fowler's comments on coding style and comments in "Refactoring: Improving the Design of Existing Code"
All variable names should be variations on "xyzzy". Examples are "xyzzy1", "xyzzy2", "xyzzy3". When you get too many of those, you can start putting the digits in the middle: "xy1zzy", "xyz3zy". And you can also vary the capitalization: "xYzzy".
Finally, if you still need more names, you can use "plugh".
These also make good passwords.
why the hell are you using 'class'?
Because this is Java and Java lacks structs
"We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
I can only blame the fact that I was posting to slashdot late at night, and also not in the environment of an editor, because I would really format functions like this (at least in C++):
Briefly, the important points are these:
I've been programming in Java a lot recently, though, where all your functions are in a class scope and therefore the "ram it up against the left" set of tricks doesn't really work. In this case, I tend to format like my original example. Mostly, I think, just to save vertical space.
My favorite "useless" comment was one that was part of a (thank gawd) proprietary OS which had in the header to a chunk of assembly language "DOES NOT CHANGE REGISTER HL". The first thing it did was change register HL.
My point is that comments can be as much a problem as they can be a solution. If you have time to change the code, but no time to change the comment, then in essence you may as well have NO COMMENTS. Which really defeats the purpose.
I have found, for the most part, that GOOD function and variable names are FAR better than a half page of comments (as an example, take a page of C++ code and change all the variables to one character... see how hard that is to read?).
Second, ignore the law that says we can't exceed 80 columns. It is dumb. It is old. It defeats the purpose of having a HUGE SCREEN with little teeny lines. 80 columns are the size of IBM punch cards. AKA dinosaurs.
Third, kill whomever sez that Hungarian notation helps. It doesn't. It is the SECOND DUMBEST thing to come out of Microsoft. People who adopt it are mindless beasts of burden. You don't want to be one of THEM do you?
Fourth, BE FANATICAL about taking what reads well from others and discarding what doesn't. I used to make boxes with slashes and dashes, etc. When I realized I spent more time "refixing boxes" I got rid of the boxes.
Fifth, is the best size for a bottle of gin. Gin and tonics may not help with coding, but they take the pain away from reading others' code. Also good when the boss says "so, what the hell does this mean?".
Sixth, "standards" doesn't mean squat in the real world. Getting code out is far more important. Learn what is absolutely required for comments.
Seventh, if it is a trick TELL THE READER! Your audience is the next guy who has to support your dreck...er...wonderful creation. He is probably going to be less brilliant. If you use a trick of the compiler, LET HIM KNOW. I have worked on (and written) code that dies a mysterious death when you "optimize it" - when in fact it IS optimized.
Eighth, learn from the screwups of others. If something you picked up reads like crap, then figure out how to make it better. Does it need to be indented more? Better variable names? Etc. Surprisingly, others probably had the same problem.
Ninth, NEVER write code with comments like "YES". I knew of a HUGE chunk (when printed it stood three feet tall) which had five comments in it. One was "YES". The contractor had to sit for several days with it remembering what it did before he could modify it. Mindblowing, isn't it?
Tenth, Keep evolving. Writing comments is a lot like writing code, you get better at it as you get older. And the style eventually gets more terse, but more reasonable.
Eleventh, EVEN if it seems obvious, sometimes it may not be. Be prepared to defend what may seem like simplistic comments.
Twelfth, This is the most important. BE CONSISTENT.
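To illustrate the seventh point (tell the reader about the trick), here is a sketch with a well-known bit trick. The function name isPowerOfTwo is an assumption for the example:

```cpp
// Returns true when `value` is a power of two.
constexpr bool isPowerOfTwo(unsigned value) {
    // TRICK: a power of two has exactly one bit set. Subtracting 1
    // clears that bit and sets every bit below it, so the AND is zero.
    // Zero is excluded explicitly because it would pass the bit test.
    return value != 0 && (value & (value - 1)) == 0;
}
```

Without the comment, the next maintainer might "simplify" the expression and quietly break the zero case.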
IANAL, but I've seen actors play them on TV
(it was long ago, but I'm still bitter.)
Take a look at these files. This project is basically an example of what not to do. It's faggotted up like a twelve-year-old schoolgir's notebook, to borrow a phrase from The Onion. In particular,
That should be obvious from the "public static void main (String argv[])".
Unfortunately, the precedence tables for operators says you're wrong.
"The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
Unfortunately, this is rarely practical, but I find the most useful comments are written when I'm going back through code I wrote over a week ago. The reasons for doing things are no longer on the surface, and thus if there's something I look at and have to dig for understanding, then it needs better explanation.
Let's take for example, C++. If a variable is passed by pointer or reference, you need to document whether it is being passed that way because it is altering the contents of the object or for efficiency (Yes, I know about const. Do you know about the debate over logical const-ness, that destroys the usefulness of the idea?). You have to document allocation and deallocation strategies. You have to document how array limits are passed. If something points to the interior of a structure, you have to document that, too. Old coding habits pop up - like allocating booleans as bits in a word - you need to document those. And, oh yeah, make sure that you document whether the int you are returning is really being interpreted as a boolean, too. There are more examples, but that's enough to make the point. C++ requires more documentation because it has a richer set of ways to do things at the micro-level of the system, any of which can be translated to the macro-level, where it has very little effect on the logic of the system, but a huge effect on the way the system is coded and interpreted.
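A sketch of what such documentation might look like in practice. The Account type and the three functions are hypothetical, chosen only to show one comment per concern from the list above:

```cpp
#include <cassert>
#include <cstddef>

struct Account {
    long balanceCents = 0;
};

// `account` is passed by const reference purely for efficiency;
// it is never modified.
bool isOverdrawn(const Account& account) {
    return account.balanceCents < 0;
}

// `account` is an in/out parameter: it must be non-null, it is modified
// in place, and the caller keeps ownership (nothing is freed here).
void deposit(Account* account, long amountCents) {
    assert(account != nullptr);
    assert(amountCents >= 0);
    account->balanceCents += amountCents;
}

// `values` points at `count` contiguous ints owned by the caller.
// The result is a real boolean, not an int error code.
bool allPositive(const int* values, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        if (values[i] <= 0) return false;
    }
    return true;
}
```

Some of this can be moved from comments into the type system (bool instead of int, references instead of nullable pointers), which leaves less to document.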
So my first recommendation is to use a language where less needs to be documented. Lisp or Smalltalk come to mind immediately; Java or Python in a pinch.
That is all.
1. Your "improved" code is much less readable than the original. Whoever has to maintain it will need more time to comprehend it.
2. You introduced a bug on line 3 (null pointer dereferencing).
Yes, I have personally seen code like it and I wanted to shoot the fucking idiot who wrote it.
___
If you think big enough, you'll never have to do it.
When I write code (even code that nobody else will see) I write it as pseudo-code in comments first. That lets me get it straight in my head before I write the code. It seems to help -- it makes my coding go quicker.
My comments don't go as far as your examples. Instead of "loop through the employee records" and "print the record" I'd write "print all the records".
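A sketch of how that might look once the code is filled in under the comments. The Employee type and payrollReport function are invented for the example; the three comment lines are the pseudo-code that would have been written first:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type, used only for this illustration.
struct Employee {
    std::string name;
    int id;
};

std::string payrollReport(const std::vector<Employee>& employees) {
    std::ostringstream out;

    // print a header line
    out << "id name\n";

    // print all the records
    for (const Employee& employee : employees) {
        out << employee.id << ' ' << employee.name << '\n';
    }

    // print the number of records
    out << employees.size() << " record(s)\n";

    return out.str();
}
```

The comments stay at the level of intent, so they survive changes to how each step is implemented.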
Software sucks. Open Source sucks less.
One of the best ideas that I ever encountered was that the length of variable names should be proportional to the size of their scope. So, on the one hand, this is OK:
for (int i=0; i<size; ++i) { ... }
(assuming that the "..." doesn't contain any braces), but function/method arguments should be longer, like "name" and "parts" in this:
int createWidget(string name, billOfMaterials parts) {...}
and globally-visible items (like the class and function/method names above) get the longest names.
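Put together in one place, the rule might look like this sketch (countPartsWithPrefix and its parameters are made-up names):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Widest scope: a long, descriptive function name.
std::size_t countPartsWithPrefix(const std::vector<std::string>& partNames,
                                 const std::string& prefix) {
    // Function scope: a medium-length name.
    std::size_t matches = 0;

    // Narrowest scope: a one-letter index that lives for a few lines.
    for (std::size_t i = 0; i < partNames.size(); ++i) {
        if (partNames[i].compare(0, prefix.size(), prefix) == 0) {
            ++matches;
        }
    }
    return matches;
}
```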
Nothing for 6-digit uids?
To quote our good friend Martin Fowler (in reference to Smelly Code):
"Comments are a sweet smell in code. But sometimes they are used as a deodorant, intended to mask a foul smell. If you need comments to explain a section of code, it's time to Refactor".
Comments will not solve your software maintenance problems. Refactoring will. If you haven't read Martin's book, then drop everything and read it. Refactoring, as well as the Gang of Four book, are the most influential books on Software Engineering.
Revolution = Evolution
Functions should have their arguments documented-- what they mean and what they do, and their return values documented. That is sufficient.
Inline comments have a place, but they are way overused. If you are telling me how your program works using inline comments, I will ignore those comments because if there is a problem, your code may not behave as advertised.
Instead inline comments should document WHY you do something a certain way and help me to understand what problems caused a particular piece of code to use a particularly clumsy algorithm or why a seemingly extraneous bit of code was added. Don't tell me how-- that is what the code is for.
And use whitespace as your friend to break things up into logical chunks which are easily readable and logically connected. This is the reason for indenting your code, but the same principle can be used by adding additional line breaks to separate logical chunks of code (this makes more sense than meaninglessly breaking up functions).
I think that this is relatively language-independent advice. I use it in Perl and PHP, and when I read C and C++, I appreciate these tips as well.
LedgerSMB: Open source Accounting/ERP
If you want one that covers those culprits, read Rapid Development by the same author - it talks about the management end of software engineering. Code Complete is meant to cover the programming end.
Comments should, from what I remember from my CS courses:
a) Describe in detail what the function does, or what the variable(s) store
b) Describe the preconditions of the function (what has to happen for the function to be called)
c) Describe the post conditions (what the end result of the function is: what does it return and where does it send the data)
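The three points above could be written as a Doxygen-style header comment, for instance. This is a sketch: takeLargest is a made-up function, and the @pre/@post/@return tags are Doxygen's, which is an assumption about tooling, not something from the course:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

/**
 * Removes and returns the largest value in `values`.
 *
 * @pre  `values` is not empty.
 * @post `values` has one element fewer than before the call.
 * @return the largest value that was stored in `values`.
 */
int takeLargest(std::vector<int>& values) {
    assert(!values.empty());  // the precondition, checked in debug builds
    auto largest = std::max_element(values.begin(), values.end());
    int result = *largest;
    values.erase(largest);
    return result;
}
```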
T Money
World Domination with a plastic spoon since 1984
Why is this important? When you change the comment, you must think about the comment. You must think about the change you've done and how it fits in with the rest of the code, and what the rest of the code is trying to do.
:-) (now don't get your panties in a bunch, I'm just yanking your chain)
Contrast this with changing the code, which can safely be done willy-nilly. Thank you for saving us from random code changes by adding blocks of unverifiable natural language which may have a vaguely similar meaning to anyone other than the original author.
But if the objective is to mitigate the risk of change, might I suggest trying unit testing?
Test Infected
I actually don't worry about the potential adverse side effects in some distant section that might result from improving this section. If it passes the tests, it is right. If it is wrong, it will not pass the tests. If it passes the tests and a bug is later found, it indicates an opportunity for improvement of the unit tests. Enhance the unit tests to reflect the newly discovered requirement, then do the minimal improvements to the code necessary to pass the improved unit test.
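A tiny sketch of that cycle, using plain assert rather than a test framework. The function clampToRange and the "bug report" are hypothetical:

```cpp
#include <cassert>

// Function under test: limits `value` to the range [low, high].
int clampToRange(int value, int low, int high) {
    if (value < low) return low;
    if (value > high) return high;
    return value;
}

// The unit test is the executable statement of the requirements.
void testClampToRange() {
    assert(clampToRange(5, 0, 10) == 5);    // inside the range
    assert(clampToRange(-3, 0, 10) == 0);   // below the range
    assert(clampToRange(42, 0, 10) == 10);  // above the range

    // Added after a (hypothetical) bug report about the boundaries:
    // the newly discovered requirement becomes a test before any fix.
    assert(clampToRange(0, 0, 10) == 0);
    assert(clampToRange(10, 0, 10) == 10);
}
```

In a real project a framework would replace the bare asserts, but the workflow is the same.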
I will make you two promises: 1. It will feel extremely unnatural at first. 2. If you really commit to it in version n, you will experience fewer bugs in version n+1. Unfortunately, it is impossible to perceive a bug that never exists, so you will have to have a fairly solid feel for how many bugs to anticipate to fully appreciate the improvement.
As anecdotal evidence; when I build up a new package, I start with the unit tests for the smallest components first, and gradually build my way up. At the end, I start wiring all the subcomponents together. It is not uncommon for me to spend two weeks developing a collection of classes, and less than an hour debugging them in the final integration.
Stop-Prism.org: Opt Out of Surveillance
Exactly, my whole coding philosophy is to name my variables and procedures so that the code reads as much like English as possible. If I do my job correctly my code should be halfway comprehensible to people who don't even know wtf c/c++/java/insertlanguagehere is
"The United States has no right, no desire, and no intention to impose our form of government on anyone else." - Bush 05
Actually, they weren't. I can't quite say who they were, but you're about as far away as you can get from what they do...
"The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
The documentation should ideally be in a separate document which explains each subroutine/function. But this much discipline is way beyond most programmers!
One of the reasons I like Forth is that any decent Forth system allots an equal amount of space for comments as code but in a way which does not interfere with the code. A Forth editor can then display a block of code with the corresponding block of comments beside it. Also, since the comment block is assigned anyway you feel you should put something in it so I find I comment Forth programs quite well as I go along.
TWW
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
Hmm. The code I delivered last week (50 java classes, another 45 unit test classes using JUnit) has approx. 1 Javadoc comment per public method.
That comment contains detail on what the inputs need to be, what the outputs will be, and what error conditions can occur.
The code itself uses sensible variable names (there are no single letter variable names in my code) that describe what's going on. As only two of my methods are more than 15 lines long, this means that the code is extremely easy to read and very maintainable.
Those two methods are non-trivial btw - and both of them were ones I just lacked the time to refactor, or they would be shorter. And both contain inline comments, because the code is not clear. Oddly enough, I hate this, but I've done it because I recognise the necessity. One day I'll get the time to write an internal rate of return calculation that's properly refactored and doesn't need comments, and still performs adequately, but until then I guess I'll stick with the comments.
In my experience, people do not update the comments. They do not update the technical spec, and they do not update the functional spec. I focus on delivering workable maintainable code, and putting in comments that people will neglect and will (at a later date) mislead people does not make the code maintainable. If you work in an environment where comments are regularly updated, congratulations - I don't, and I've never met anybody who does.
~Cederic
Fashion designers?
> You missed my point entirely. *foo is not a character. *foo is a pointer,
No, foo is a pointer. *foo is a dereferenced pointer, which means, since foo is defined as a pointer to char, it is a char. The fact that the pointer must be properly defined before it can be dereferenced does not change that.
> Saying that *foo is a character implies that you can do something like *foo = 'A';
No, it doesn't imply that. Saying that *foo is a static character variable implies that, but of course, it isn't, and we never said it was.
Chris Mattern
I once worked with a guy whose favorite variable name was asdf (just roll your left hand's fingers across the keyboard), and when he needed another he'd roll his fingers the other way (fdsa). It was maddening to debug. I find it useful to comment each section with what the code chunk expects in terms of data in, what it does with this data, then how and where it returns the data. I sometimes also note my thought process in writing the code this way. There are also times when I don't comment the code at all... especially if it's a chunk for an embedded box with limited storage. I do write docs for it, but they're stored elsewhere. -Just my $.02
Why worry? Each of us is wearing an unlicensed "nucular" accelerator on his back.
Sig changed for readability by G.W.
I always (when applicable) add a prefix to my variables that identifies the type. Name then becomes sName, etc. I know this does not work for everybody,
Hehe, why are programmers so passionate about HN. Seems like coders tend to fall into 2 camps - they hate it, or they see the underlying merits, and use a modified version.
Years ago I absolutely hated Hungarian Notation. IMHO it was far too meticulous - you spent more time trying to keep your variables up to date than coding! (Ok, it wasn't that bad, but close.)
You can see how painful raw HN is:
Microsoft Hungarian Notation
Does one *really* need to be that perverse about specifying the difference between byte, char, short, long, dword? Sheesh! (Notice how this pattern shows up when you take any good idea / methodology and become a "zealot" about it, but I digress.) Most of the time you don't need to split hairs over ints, floats, bools, chars, strings, etc. You're just making busy-work for yourself.
That said, here's a heavily modified HN (namely a lot less restrictive) for C++ that I find extremely handy:
There are a few variations on this: Microsoft uses m_ (I don't care for this style, as this is too "wordy" IMHO), some of the well known C++ gurus (forgot if it was Eckel or Meyers) use a trailing _ (again too easy to miss IMHO)
- p pointers
(s, g, _, p are always in this order)I find these to be a good compromise:
And since I'm dealing with so much 3D math I also use:
To keep track of what I'm converting to/from.
~ 2 ~ Having a standardized coding standard at work has been a Good Thing (TM) in the long run. I didn't like it at first, (habits and all :) but it's made working on other people's code easier.
~ 3 ~ I can't stress this next tip enough: Use *descriptive* variable names.
A descriptive name almost takes away the need for comments (almost ;-)
i.e.
If you are iterating over a set, or range, say it!
for( iCard = 0; iCard < nCards; iCard++ )
~ 4 ~ Use whitespace. Align the columns of your tables up. There is a reason tables exist - to make it easier to read.
~ 5 ~ If you have a function that does some complex calculation, document where you got the formula from, e.g. See Graphics Gems Volume N, Pages nn-nn
~ 6 ~ As you come up with ideas for how the block of code you're working on could be made better, more robust, more efficient, etc, put a comment: // TODO: rewrite so is cleaner
This way you can do a search to see which parts of the code need to be rewritten.
Cheers
--
"The issue today is the same as it has been throughout all history, whether man shall be allowed to govern himself or be ruled by a small elite."
- Thomas Jefferson
I use them every day. Some of the code I deal with performs complicated mathematical algorithms. The functions run to 100+ lines. The algorithm is a complete entity in itself. It has no logical subparts. It does not decompose into meaningful steps. It just does a whole bunch 'o maths. Are you suggesting that I should refactor this code into separate parts at some arbitrary points just to keep the code shorter than 20 lines per function?
All the evidence I've ever seen says that not only do longer functions not harm readability (up to a limit of around 150-200 lines, at least), arbitrary decomposition such as you advocate does harm readability, because it forces the reader to jump around for no good reason.
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
I don't think you need to be quite so anal about it. Comments or variable names that are uninformative are bad, whether they're intended to be funny or not. Comments or variable names that are informative are good, and making a dreary chore out of anything discourages people from doing it. I'll take good comments with bad humor over no comments at all any day.
Slashdot - News for Herds. Stuff that Splatters.