Best and Worst Coding Standards?
An anonymous reader writes "If you've been hired by a serious software development house, chances are one of your early familiarization tasks was to read company guidelines on coding standards and practices. You've probably been given some basic guidelines, such as gotos being off limits except in specific circumstances, or that code should be indented with tabs rather than spaces, or vice versa. Perhaps you've had some more exotic or less intuitive practices as well; maybe continue or multiple return statements were off-limits. What standards have you found worked well in practice, increasing code readability and maintainability? Which only looked good on paper?"
yes, putting the else on a new line makes it a bit more readable; you know that line marks the beginning of an else:
if( condition ) {
    statement1;
}
else {
    statement2;
}
reading this kind of code tells you that there is an else branch there. having a leading closing brace means you have to read into the line to see what's going on. I know it's 2 characters, but when scanning code for structure, it helps to have it on a new line.
If you are using your computer right, it does not only enable you to do things, it does the boring things for you, automatically.
Checkstyle is one of the tools in a company toolkit that is often overlooked but in fact VERY handy. It enables you to define a ruleset for your source code, finding stuff which is incompatible with the coding practice in your company/team/project/whatever. Moreover, you can stick it into Eclipse using the free Eclipse-CS plugin, so it will automagically mark the places which need to be changed. Last but not least, you can put Checkstyle as an Ant task in your build environment (and in your continuous integration toolkit) so committed code that does not conform to certain standards does not build.
As for the rules themselves, we've found these to be the most successful:
Of course, we let developers add suppressions for the 1% of false positives. In fact, there are very few suppression rules set.
Build a tool even an idiot can use and only an idiot will want to use it. -S.O.B.
There are several tools that can detect cut and paste code:
Simian: http://www.redhillconsulting.com.au/products/simian/
PMD: http://pmd.sourceforge.net/
DuplicateFinder: http://www.codeplex.com/DuplicateFinder
And probably others
My Karma: ran over your Dogma
StrawberryFrog
For anyone coding C or C++, I strongly recommend CERT's secure coding standards:
https://www.securecoding.cert.org/confluence/display/seccode/CERT+Secure+Coding+Standards
wParam
You were doing it wrong. The compiler knows it's a fucking word or long int or whatever, use the prefixes to encode something the compiler DOESN'T know about.
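A rough sketch of what that can look like in C, assuming a made-up convention where cch means "count of characters" and cb means "count of bytes". The function and variable names are invented for illustration. Both kinds of value are plain size_t, so the compiler can't tell them apart, but the prefixes make a mix-up visible on the page:

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Prefix convention assumed for this sketch (names are illustrative):
 *   cch = count of characters, cb = count of bytes.
 * Both are plain size_t, so the compiler cannot tell them apart.
 */

/* Convert a character count to the byte count of a wchar_t buffer. */
size_t cb_from_cch(size_t cch)
{
    return cch * sizeof(wchar_t);
}

/* Convert a byte count back to a whole number of wchar_t characters. */
size_t cch_from_cb(size_t cb)
{
    return cb / sizeof(wchar_t);
}

/* Bytes needed to store a name of cchName characters plus its terminator. */
size_t cb_for_name(size_t cchName)
{
    size_t cchTotal = cchName + 1;            /* cch + 1 is still a cch */
    size_t cbTotal  = cb_from_cch(cchTotal);  /* cb on both sides: consistent */
    /* size_t cbWrong = cchTotal;   cb assigned from a cch: compiles, but looks wrong */
    return cbTotal;
}
```

The commented-out line is the kind of bug the prefix is supposed to catch: it type-checks fine, yet the names on the two sides disagree.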
Some programmers think they should be able to do anything they want.
That might be OK if you live in your parents' basement and code for yourself, but in the real world it's a bad (and selfish) idea.
Strict adherence to a standard is helpful in code review and in cases where a component is taken over by a new maintainer.
This is always important, but it's particularly important in a genuinely open, community-driven project.
The Drupal project is an example. It has a coding standard derived from the PEAR project that applies to any code submitted for inclusion in the core.
Contrib authors are encouraged, but not required, to follow it. The good ones do.
The Drupal Coder module does a very good job of nagging at you until you get the formatting right, and also helps with code migration and updating when the API changes. And it finds some common bonehead mistakes that can create security issues.
Adhering to a standard doesn't have to be painful. Using a properly configured text editor helps. There is good support for Drupal standards and conventions in OpenKomodo and the commercial Komodo IDE, as well as some other editors.
We found that it doesn't really help to enforce a *formatting* style on developers because everyone has their own. The only thing you really should be enforcing is tabs vs spaces (and it should be spaces) because mixing the two can produce some really ugly results.
We have a much better rate of return running tools like JSLint or PMD to catch issues that are syntactically valid but are sure to cause problems down the line.
That is the Indian Hill C Coding Standard. It is almost mandatory to learn for certain areas of computer programming such as device driver development and applications programming. Mainly because it is the first documented coding standard that came out and was used by universities and corporations.
I've known some companies to make it a priority that all programmers used this standard when their greatest threat to survival was keeping up technologically with their competitors.
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
Like many Python evangelists, you seem to have a remarkably limited experience of computer languages.
Here's a language with no braces:
Here's another:
Here's another:
Here's another:
or, more extensibly,
And the list goes on. Maybe you should try learning some other languages. Broaden your mind a bit. There's a lot out there that isn't Python or C/C++/Java. Some of it is quite interesting.
If you are working for a company that claims to engineer software and that starts by handing you a hundred-page coding rules guide which speaks only of indentation, of whether to uppercase function names according to their scope, or of how to format function name prefixes according to the module they supposedly belong to, then RUN AWAY! (1)
There are plenty of utilities that can do all that typographic work for you automatically.
Do not confuse coding rules with presentation style. Coding rules focus on adding formal information about symbols or instruction blocks that is not already implied by the semantics of the programming language in use.
For instance, in C, if you use macros as a substitute for inline functions, then it is good to provide a prototype for the implicit function and, within the macro definition, a syntactic tag that makes explicit how the macro will be used.
This way, a home-made static code analyser is less likely to get confused by macros that look like functions.
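One possible reading of that idea, as a sketch only. FUNCTION_LIKE, max_int, MAX_INT and larger_of are all names invented here, not an established convention. The tag expands to nothing and exists purely so a tool has something to search for:

```c
/*
 * FUNCTION_LIKE is a hypothetical tag invented for this sketch. It expands
 * to nothing; it only gives a home-made analyser something to search for.
 */
#define FUNCTION_LIKE

/*
 * Prototype of the function the macro stands in for. It is never defined
 * or called; it only records the intended argument and result types.
 */
FUNCTION_LIKE int max_int(int a, int b);

/* The macro itself. Each argument may be evaluated twice. */
#define MAX_INT(a, b) ((a) > (b) ? (a) : (b))

/* Example use with side-effect-free arguments, where double evaluation is harmless. */
int larger_of(int x, int y)
{
    return MAX_INT(x, y);
}
```

A tool that sees the tagged prototype next to the macro can check call sites against the declared types instead of guessing what the macro expects.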
Once such formal information is put within your code, any serious text editor will be able to reformat it and present it the way you like best.
And the way code is presented can differ depending on whether you are currently writing it, or you are an integrator who has to merge two versions of a file, or you are reviewing something written by one of your coworkers, or you are rewriting a piece of code you wrote some months ago, etc.
Continuing with C, what kinds of macros will you allow: only constant definitions? Replacements for inline functions?
More complicated: if you use critical sections over data or code, what kind of information do you add to let a tool verify that you do not jump out of your section without realising it?
Presentation style can be useful, but coding rules are about making a link with the design and test plan constraints and requirements.
If, in a software company, people debate presentation, indentation and typography rather than formatting tools and static analysers, well, they just don't get it, or more precisely they lack the competence and the talent to formalize the relation between the information they would like to see presented within the code and the semantics of that information. In this case, you will see the presentation style change depending on who is in charge of deciding it and on the coding fashion of the moment, while the rank-and-file coders will still be complaining that they do the same thing year after year, hit the same kinds of problems on the same kinds of code, and that they expect code review to be about the semantics of their code and not about typography.
(1) well, I guess it depends on how much you are paid, after all ;)
Now why would anybody do this? I've always assumed code like this was basically what inexperienced people would use:
Why not just return immediately if any basic conditions or assumptions are not met and prevent that completely unnecessary indentation? The only advantage I can see is that you could miss the return statement when reading the code.
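A made-up sketch of the two shapes being compared. copy_name_nested and copy_name_early are hypothetical names, not code from the post above. Both do the same checks; the second returns as soon as an assumption fails, so the real work stays at one level of indentation:

```c
#include <stddef.h>
#include <string.h>

/* Both functions copy src into dst; they return 0 on success, -1 on error. */

/* Nested style: every check adds one more level of indentation. */
int copy_name_nested(char *dst, size_t dst_size, const char *src)
{
    int result = -1;
    if (dst != NULL) {
        if (src != NULL) {
            if (strlen(src) < dst_size) {
                strcpy(dst, src);
                result = 0;
            }
        }
    }
    return result;
}

/* Early-return style: bail out as soon as an assumption is not met. */
int copy_name_early(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL)
        return -1;
    if (src == NULL)
        return -1;
    if (strlen(src) >= dst_size)   /* no room for the terminating NUL */
        return -1;

    strcpy(dst, src);
    return 0;
}
```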
:/- spoon(_).
The Linus says:
"If you need more than 3 levels of indentation, you're screwed anyway, and should fix your program."
from http://en.wikiquote.org/wiki/Linus_Torvalds
From a personal perspective that happens to tie in with the coding practices at my last company:
The second example (GNU style) I have found to be quite cumbersome in writing, unless tabs are set to 2 with braces indented once and content twice (company mandated four with one indent for content in the block), in which case I would be frustrated with the extra keypresses involved.
The first example (Allman style) I used to use until I moved over to Kernighan-Ritchie style (opening brace on same line as control statement, with functions (and classes in OOP languages) braces the exception; these are written in Allman style). This allows me to scrunch more onto the screen vertically.
FWIW I never liked the '} else {' style of elses but at the same time, I never found it difficult to read so it was never a real issue. It makes sense to me to have the else begin at the same column as the if to which it belongs.
This may be of interest to you.
"Three eyes are better than one" -- Lieutenant Columbo
This exposes an implementation detail to the user of the Object
In many languages, an interface is not an object.
and makes it difficult to refactor
Refactor how, and how often do you do this? Reply under this comment, please.
My Karma: ran over your Dogma
StrawberryFrog
The prohibition on "multiple exits" or returns comes from a misunderstanding of early program proving technology. As one of the few people who ever built a real proof of correctness system, I know that's just not a problem.
There are some topological restrictions on program proving, but you can't violate them with "break" or "return". You need "goto" to really screw up. The actual topological constraint is that backwards control paths must not cross.
The basic requirement for proving loops is that there must be some section in the loop through which control must pass on every iteration. Somewhere in that section must go the loop invariant and the termination measure.
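To make that concrete, here is a small sketch in C with the invariant and the termination measure written as runtime asserts. The divide function is a made-up example, and asserts only check at run time, which is far weaker than what a real prover does. The asserts sit in the part of the loop body that every iteration passes through:

```c
#include <assert.h>

/* Hypothetical example: compute n / d by repeated subtraction, d > 0. */
unsigned divide(unsigned n, unsigned d, unsigned *remainder)
{
    unsigned q = 0;
    unsigned r = n;

    assert(d > 0);
    while (r >= d) {
        /* Control passes through here on every iteration. */
        unsigned measure_before = r;   /* termination measure: r */
        assert(n == q * d + r);        /* loop invariant */

        r -= d;
        q += 1;

        assert(r < measure_before);    /* the measure strictly decreases */
    }

    /* Invariant plus the negated loop condition gives the result. */
    assert(n == q * d + r && r < d);
    *remainder = r;
    return q;
}
```

Because r is unsigned and strictly decreases, the loop cannot run forever, and the invariant together with r < d at exit is exactly the definition of quotient and remainder.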
Nobody does this for software any more, although, interestingly, full-scale machine proof of correctness of hardware logic in VHDL is not that unusual. There are commercial tools for that.
Strangely enough, Hungarian worked quite well for the problem it was originally intended to solve.
I worked at Xerox in the late 70's and my manager was Charles Simonyi, inventor of this notation. The project was BravoX (grandparent of MS Word) and was written in BCPL. BCPL basically has one type: integer. How that integer is treated is purely a function of how you reference it. E.g. fooFirst>>fooNext means "use the variable 'fooFirst' as a pointer to a structure of type FOO, one of whose elements is (from the naming convention) a pointer to some other FOO." Whereas fooFirst+1 adds one to an integer and (almost certainly) yields an invalid pointer that will blow up when you try to use it. (It's been 30 years since I wrote anything in it, so I probably screwed up the example.)
Since there was only one type, the compiler didn't/couldn't perform type checking. Hungarian was a way of putting the type into the name of the variable so that the programmer could perform visual type checking. There were 9 of us on the project and the consistency/readability across the code base was impressive. Any of us could go into anyone else's code and almost immediately see what was going on.
I still use a light variant of it in my own code, but when in someone else's code I try to stick to their naming/formatting convention.
Like so many good ideas, it worked well in its original context but became twisted out of shape when used for something never intended/envisioned by the original developers (even though the person doing the twisting was, in fact, the original developer!). Another example of this is the Third Eye Software symbol table format I created for my debugger, CDB, but which was then used and abused by Mips to create a complete piece of crap. What they did still has people swearing at me 20+ years after the fact. (More on this at Third Eye Software and the MIPS symbol table)
Even in languages that recurse properly that'll overflow on big numbers. To not overflow in properly recursing languages, you need:
That second one is often called the gnu style. Here are some reasons it is not used:
1) It is cumbersome to write all those extra tabs.
2) It uses a lot of horizontal space. Every indent is 2 tabs over. Nested ifs and fors and such will quickly approach the edge of the screen.
3) It takes just as much vertical space as the Allman style, but the braces don't line up with anything at all.
I find it odd that you refer to this as the most readable, most obvious way to write the code. Most people retch at the sight of it. To each his own.
Btw, I haven't voted yet. I used to prefer the Allman style (largely because the first C++ book I ever read used it), but few projects use it, and several that I have worked with used the Kernighan-Ritchie style, and now I prefer it.
Write your own Choose Your Own Adventure. http://www.freegameengines.org/gamebook-engine/
Uh, no, not really. Especially when driven by specific profiling. I've gotten some pretty serious gains in tight loops on modern hardware with intelligent unrolling. If you've got stuff organized properly to stream out of memory and into the processor, you can get rid of pipeline stalls and significantly improve throughput for certain classes of loops.
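For anyone who hasn't seen it, a minimal sketch of the kind of unrolling meant here. sum_simple and sum_unrolled are made-up names, and whether this beats what your compiler already generates is something only profiling on your hardware can tell you. The four independent accumulators mean each add doesn't have to wait on the previous one:

```c
#include <stddef.h>

/* Straightforward loop: each add depends on the result of the previous one. */
long sum_simple(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by four, with independent accumulators and a cleanup loop. */
long sum_unrolled(const int *a, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)   /* leftover elements when n is not a multiple of 4 */
        s0 += a[i];

    return s0 + s1 + s2 + s3;
}
```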
The ringing of the division bell has begun... -PF
Hmm, it looks to me like the main difference w.r.t. readability is that you added an extra blank line before and after the else in the C/C++ version, which you didn't have in the Perl version. That makes it stand out quite a bit.
I also much prefer aligning the {}'s vertically.
Preferences -> Java\Editor\Save Actions -> Check "Format Source Code". Go around and smack people until they set that option and you'll never have to worry again. Anytime they do work, it will automatically format for them when they save.
Actually, reading in columns is easier than in long lines. Most papers use several columns and many blogs use an artificially narrow column.