Best and Worst Coding Standards?
An anonymous reader writes "If you've been hired by a serious software development house, chances are one of your early familiarization tasks was to read company guidelines on coding standards and practices. You've probably been given some basic guidelines, such as gotos being off limits except in specific circumstances, or that code should be indented with tabs rather than spaces, or vice versa. Perhaps you've had some more exotic or less intuitive practices as well; maybe continue or multiple return statements were off-limits. What standards have you found worked well in practice, increasing code readability and maintainability? Which only looked good on paper?"
I've worked where we were supplied a full IDE and a 17" CRT, and the coding standard forced so much white space vertically that you had to basically remember all the code.
I can't stand seeing the closing brace of an if statement sharing a line with an else, like so:
if( condition ) {
    statement1;
} else {
    statement2;
}
I've always found the Joint Strike Fighter's coding standards document an interesting read. It is available from Bjarne Stroustrup's website (pdf)
This sounds like a fairytale but I work for a very large IT firm which is very well known. Serious company doesn't mean good however.
In certain files (not all, apparently) all constant variables have to be declared globally. We are talking C++ here.
Think what you want, but I don't like it. The reason given for placing the variables there is so "that they will be easy to find".
My new standard comes from a 1950s comp sci book.
"Programs consist of input, output, processing and storage."
Lose focus of that and the project will be late, over budget and most likely broken in ways no one will understand for years.
One of my friends worked at a place where you'd have to insert whitespace to place certain elements (variables, evals, etc.) to begin at a specific col in the code within every line; in addition to standard indentation of the line. At one level, I see the concept, but seriously - highlighting and search is made to solve the same problem there.
He left that job quickly.
Returned Peace Corps IT Volunteer
I also find the I prefix in .NET a really bad practice for marking interfaces like ICollection. What happens when you decide to turn an interface into an abstract class?
Well. The whole point of having interfaces is allowing the implementation of a certain method set to the world, which later can be used in your APIs using polymorphism. If you later decide to break the contract and make an interface a class, then probably a name change (made also automatically in tools like Eclipse or NetBeans) won't be any worse.
As for the Hungarian notation, the standard form is indeed worthless. But we tend to use simple maximum three letter abbreviations of Swing components, to know that we are taking the username from txtLogin and listening for pressing btnOK. Code is more often read than written and this quasi-Hungarian style actually works pretty well.
In fact, having interfaces named like "IPasswordProvider" is something very similar. It enables easy reading of your code and when you want to make a change, you instantly see that this type is an interface, therefore you cannot instantiate it directly, but you can implement this interface in any arbitrary class you may already have written, etc. Plus, Sun coding standards encourage you to name interfaces in a passive adjective way like "Serializable", "Comparable", etc. To comply with this format is not very natural for interfaces like "IPasswordProvider" or "IModelContext".
Build a tool even an idiot can use and only an idiot will want to use it. -S.O.B.
The worst example I ever saw was some IBM Federal Systems code written in the 1970s -- they enforced the "one page module" rule using nested includes -- NINE LEVELS OF NESTED INCLUDES!! Of course, this was coded in a FORTRAN-like MIL-SPEC language with eight character filenames, so you can guess how hard it was to find a particular module. It was at least three levels down to find the first page of global variable declarations.
I worked for a company that was destroyed by a bad coding standard.
This was a small company, that, back in '96, was awarded the contract for a POS application for a regional store chain, with back-office servers that would be updated nightly by modem.
The guys who ran the company weren't programmers (though one of them knew enough to be dangerous); they were technical salesmen. They were also big fans of Microsoft, with "MVP" plaques on the walls, and every employee except me having Microsoft certs.
I worked for them part-time while also working for another company. I advocated Unix (mostly BSDI and SunOS at the time), and always argued with them about why Unix was better (technical superiority vs. potential for big profits).
When their big project was well underway, they brought me in to do the communications part of it, where the POS terminals would contact one of several servers by modem each night ("why not just ethernet them together, get a dialup PPP connection, and use IP? the interface is so much more reliable..." Request denied).
The app was Visual Basic, with third-party "custom controls" for things like talking to modems. My part went fairly smoothly, and I was eventually asked to help out with the main application, which was suffering from unexplained crashes. When I looked at the code, I found something... strange.
For error handling, they had elected to use a program called "VB Rig" (the name came from the rigging used on sailing ships, which prevents a sailor from falling to his death. Sometimes.) What this program did was to examine the source code, and then add error handling boilerplate at the start and end of each and every function. It inserted the exact same error handling code into every function.
Because the error handler had to be all purpose, it was about 20 lines of code per function - sometimes much larger than the regular part of the function. And, worse, because it was the same for every function, and it made use of the same variable names, that meant either every variable had to be global, or you'd have to declare the ten or so standard variable names at the start of every function (they opted for the "everything is global" approach).
Which led to things like this (forgive the syntax errors, it's been years since I've touched VB):
On Error goto my_data_file_read_function_VBRIG_TRAP
open MyDataFile for writing ...
goto my_data_file_read_function_VBRIG_CLEANUP
my_data_file_read_function_VBRIG_TRAP:
on error 101 'Permission Denied
delete MyDataFile
resume
on error 102 'File Not Found
MessageBox 'Cannot read ' + MyConfigFile
resume
my_data_file_read_function_VBRIG_CLEANUP:
blah blah
my_data_file_read_function = SUCCESS ' return
As you see, the error handling code - which had to be exactly the same for every function - made use of global variables (names like DataFile1, MyFile1, UserName, etc.) to figure out what to do for each error. That meant, that if there was any possibility you might have a "File Not Found", you had to expect the filename where that might happen to be in a particular global variable - say, MyFile1 - and hope that the calling function wasn't using that name too, for the same reasons.
Naturally, files were being created and deleted at random, and the programmers often spent hours on the phone with the customer trying to figure out why the Access database had disappeared *again*.
I asked if we could just write the error handling by hand, and use appropriate local variables; or take the standard VBRig error handling and trim out the lines that weren't relevant for a particular function (as subsequent VBRig runs wouldn't touch its code region if it saw that it had been customized).
Request Denied. "This is our coding standard. We carefully reviewed the options before making the decision to use t
On the strange side is the omission of vowels in function and variable names to save text space (it's not required, but should be consistent for similarly named objects). It sounds weird, but is still quite readable.
Apparently, the bastardized version of Hungarian Notation got popular: http://www.joelonsoftware.com/articles/Wrong.html
multiple return statements were off-limits
Despite the fact that it's not part of the coding standard where I work, I have a few coworkers who take this to the extreme. They surround every single function they write with:
do{ ... } while(0);
And then, inside the "do" block, they just put "break" in any place where they would have otherwise put "return." It drives me insane; they insist that having a single exit point from your function makes it easier to debug, but I just don't get it. I've never even seen them use gdb, anyway, so I think that abusing "printf" is their idea of "debugging"...
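For anyone who hasn't run into the idiom, here is a minimal C sketch of what that looks like next to the plain early-return version. The function names and the digit-finding task are made up for illustration; they are not from my coworkers' code.

```c
#include <stddef.h>

/* Hypothetical example: return the first decimal digit found in s,
   or -1 if there is none. Single-exit version: each "break" stands in
   for what would otherwise be an early return. */
int first_digit_single_exit(const char *s)
{
    int result = -1;
    do {
        if (s == NULL)
            break;              /* would have been: return -1; */
        while (*s != '\0' && (*s < '0' || *s > '9'))
            s++;                /* skip everything that is not a digit */
        if (*s == '\0')
            break;              /* would have been: return -1; */
        result = *s - '0';
    } while (0);
    return result;              /* the only exit point */
}

/* The same logic written with ordinary early returns. */
int first_digit_early_return(const char *s)
{
    if (s == NULL)
        return -1;
    for (; *s != '\0'; s++)
        if (*s >= '0' && *s <= '9')
            return *s - '0';
    return -1;
}
```

One thing to watch with the do/while(0) version: a break inside a nested loop or switch only leaves that inner construct, not the outer "do" block.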
One thing in our coding standard that I do like is that all variables that store units must have a unit specification at the end of their name -- in other words, all frequencies might have "Hz" or "MHz", distances might have "m" or "mm", times have "sec" or "msec", and so on. This is really helpful in my field -- it's not uncommon for me to open up a file that I've never looked at before and need to make modifications to it, and if the units that things are stored in weren't immediately obvious, I'd have to go track down somebody and ask them. The annoying thing here is when people decide not to follow this standard because they think it should be obvious...
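A rough C sketch of how the unit-suffix convention reads in practice. The function and variable names here are invented for illustration, not taken from any real codebase; the point is only that every conversion is visible in the names.

```c
/* Hypothetical helpers; the unit suffix on each name is the point. */

/* A 1 MHz signal has a period of 1 microsecond, so MHz -> usec
   needs no extra scale factor. */
double period_usec_from_freq_MHz(double freq_MHz)
{
    return 1.0 / freq_MHz;
}

/* Wavelength in metres for a frequency given in MHz. */
double wavelength_m_from_freq_MHz(double freq_MHz)
{
    const double speed_of_light_m_per_sec = 299792458.0;
    const double freq_Hz = freq_MHz * 1.0e6;   /* MHz -> Hz made explicit */
    return speed_of_light_m_per_sec / freq_Hz;
}
```

With names like these, passing a value in Hz to a parameter called freq_MHz at least looks wrong at the call site.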
Karma: Terrifying (mostly affected by atrocities you've committed)
I work for a major software vendor. The particular group I work in wrote the application framework for a suite of apps. Our code is mostly quite nice. There were about 20 people working on it during development and there are a few pieces that are crap, but for the most part, it's quite well designed and written.
Now, there are other groups that use this framework. One group in particular, has pretty much the same standards that our group does. The difference is, however, that their manager never had them do code reviews and so people pretty much ignored the standards. I've now been tasked with working with that group and their code is a complete nightmare. For example, a single form class with something like 16 tab pages (spread among 3 or 4 tab controls), over 200 controls, and over 9000 lines of spaghetti code.
Had this group done code reviews, this class never would have passed, and it wouldn't be such a nightmare to deal with. At this point, we're already shipping the second version, so a complete rewrite of the various nightmare components of this app is out of the question, which is too bad because it's going to be a nightmare to maintain, especially when the guy who wrote it leaves.
I've always hated doing code reviews, but this experience has made it abundantly clear to me how important they are for minimizing the damage a single clueless programmer can get away with.
I've never been convinced by any hard-and-fast coding stylistics. Sure, it's possible to write good code and bad code, readable code and unreadable code, but beauty is very much in the eye of the beholder, and, also, it depends a lot what you are trying to do. Insisting on one inflexible set of stylistics works about as well as telling people never ever to split infinitives or never ever to use the word 'said'.
Last night I came across this in the documentation for CPAN's Net::Server (you probably guessed from the above that I'm not a Pascal programmer):
You may get and set properties in two ways. The suggested way is to access properties directly via
my $val = $self->{server}->{key1};
Accessing the properties directly will speed the server process - though some would deem this as bad style. A second way has been provided for object oriented types who believe in methods. The second way consists of the following methods:
my $val = $self->get_property( 'key1' );
$self->set_property( key1 => 'val1' );
This struck me as remarkably sensible - the author of the module puts his prejudices on the table, but tells you how to do it a different way if you like. (And, personally, I prefer the first example, because it's just as clear, faster, and I've never managed to take OOP in perl entirely seriously - a problem that Larry Wall appears to have too.)
You judge good style in any particular case by looking at the overall work, not by nit-picking about the punctuation in isolation.
Virtually serving coffee
Now that we're talking about 'languages that invite bad coding practices'... Well, one of the best programming books I've read is 'Perl Best Practices'. Not only does it list out best practices but it tries to explain (well I might add) why you should code a certain way and why other ways aren't good to follow.
One of the habits I picked up from 'Perl Best Practices' is:
instead of:
The else tends to get 'lost' when just following the closing bracket.
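A sketch of the two layouts being compared, assuming the habit in question is the book's advice not to "cuddle" an else. It is written in C rather than Perl, since the brace placement works the same way, and the function names are made up.

```c
/* Hypothetical functions; both behave identically, only the layout differs. */

/* "Uncuddled": the else starts its own line, lined up under the if. */
int sign_uncuddled(int n)
{
    if (n < 0) {
        return -1;
    }
    else {
        return n > 0;   /* 1 for positive, 0 for zero */
    }
}

/* "Cuddled": the else follows the closing brace on the same line. */
int sign_cuddled(int n)
{
    if (n < 0) {
        return -1;
    } else {
        return n > 0;   /* 1 for positive, 0 for zero */
    }
}
```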
Forget this pointless stuff about tabs and spacing, I've seen some really brain-dead policies.
1) Source Control Substitute
At one shop, there were designers who edited XML + image files (kinda like web pages, but not quite). There was a compiler that built this all into a single executable. They were not permitted to edit the source directly, and had to work on copies. And those copies must be on the network instead of their local drive. And source control was not allowed.
So instead of people having local copies and then committing their work, everyone made a duplicate copy on the network for each thing they did. It took hours to make the copies, and the compile times went from a few minutes to 45 minutes. Plus, the network drive kept running out of space due to all the gzillions of copies of everything.
2) Making the "minimal" change required
I worked for a US government contractor and they wanted each change to have the minimal impact on the system that was possible. So, basically nobody ever removed code, only added. One time I encountered a huge nested if statement that spanned hundreds of lines. Upon looking at the cases, I noticed that many of them were the same. Like:
if (a)
    if (b)
        do x
    else
        do y
else
    if (b)
        do x
    else
        do x
which can, of course, be simplified to:
if (a and not b) do y else do x
This was because people had to make the MINIMAL change required each time a change was made. And removing a level of the if statements was more lines of code modified than just changing "do y" to "do x"
Imagine this, but with dozens of cases spanning hundreds of lines. I spent almost a day to build a chart listing what each combination of variables did, and finally chopped hundreds of lines of code to about 10 lines. Turns out that after years of changes, most of the cases now did the same thing.
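The chart-building step can also be done mechanically. Below is a small C sketch for the two-variable toy example above, checking that the nested form and the simplified form agree on every combination of inputs. The names (action, nested, simplified, forms_agree) are invented for illustration; the real code had far more variables.

```c
#include <stdbool.h>

/* Outcomes named after the pseudocode above. */
enum action { DO_X, DO_Y };

/* The nested form, as it stood after years of minimal changes. */
enum action nested(bool a, bool b)
{
    if (a) {
        if (b) return DO_X;
        else   return DO_Y;
    } else {
        if (b) return DO_X;
        else   return DO_X;
    }
}

/* The simplified form. */
enum action simplified(bool a, bool b)
{
    return (a && !b) ? DO_Y : DO_X;
}

/* Returns true if both forms agree on all four input combinations. */
bool forms_agree(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            if (nested(a, b) != simplified(a, b))
                return false;
    return true;
}
```

An exhaustive check like this is only practical while the number of boolean inputs stays small, since the table doubles with each added variable.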
I fully agree, but of course there are always exceptions. If you need code that runs fast then it can be better to just copy and paste (speed isn't something people seem to care about these days anymore, and then they complain about needing a faster computer every now and then just to run the same kind of program).
If you are using your computer right, it does not only enable you to do things, it does the boring things for you, automatically.
Exactly. Use the tools.
In the .net world, check out
fxCop: http://msdn.microsoft.com/en-us/library/bb429476(vs.80).aspx
StyleCop: http://code.msdn.microsoft.com/sourceanalysis
These can both be used to prevent code building if it doesn't meet standards. Sadly, the first task for me is usually to turn on "warnings as errors" and get the code up to that minimal standard.
Also check out Resharper: http://www.jetbrains.com/resharper/
for flagging some code problems.
The problem with code standards is that your best coders are probably using a standard already; and while the worst can be dragged onto a standard, they will write bad code even with it.
My Karma: ran over your Dogma
StrawberryFrog
In a way, the languages, tools, and libraries prescribed (if any), also constitute a sort of coding practice, in the sense that they impose limits on how you can structure your code.
- The language you work with gives you certain language constructs. These constructs vary per language, and determine how you must express things and what abstractions are available to you. This has a huge impact on the structure of your code.
- Most tools like to structure and format your code a certain way, particularly when the tool generates the code. This is usually a great boon, because it will make it easy for programmers to adhere to the same coding standard and hard for them to deviate. Of course, if what you want is not what the tool wants, the tool starts getting in the way.
- The libraries you work with determine the APIs available to you. This also has a strong influence on the structure of your code. It also interacts with the language constructs available to you, as they may or may not make it easy to build an API you like to work with on top of the API that a library exposes.
Abstraction is particularly important. If a language offers powerful enough abstractions, you can structure your program so that it is easy for humans to understand what it does, and have the compiler translate it to whatever the libraries make available to you. Better abstractions also make your code more reusable.
As an example, in C, strings are character arrays. Arrays in C don't have a size associated with them. The end of a string is indicated by a character with value 0. Furthermore, the type of an array of characters is actually the same as a pointer to a character. C also doesn't have automatic memory management. Suppose now that you wanted to concatenate two strings. There are various ways to do so, but the most obvious one is the strcat function:
char *strcat(char *dest, const char *src);
This function appends src to dest and returns dest (a pointer to the concatenated string).
That is, provided there was actually enough space in dest to hold the combined contents of dest and src, and the terminating NUL. If there wasn't, the function overwrites whatever came after dest, which will usually lead to your program crashing or executing code supplied by a cracker attacking your program.
The correct way to use strcat, then, is something like:
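A minimal sketch of one such approach, assuming the caller wants a freshly allocated result and will free it afterwards. The helper name concat is hypothetical; first and second are the two strings to join.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding
   first followed by second, or NULL if allocation fails.
   The caller must free() the result.
   It does not guard against NULL arguments. */
char *concat(const char *first, const char *second)
{
    /* room for both strings plus the terminating NUL */
    size_t len = strlen(first) + strlen(second) + 1;
    char *dest = malloc(len);
    if (dest == NULL)
        return NULL;
    strcpy(dest, first);   /* dest now holds a valid, terminated string */
    strcat(dest, second);  /* safe: dest has room for second and the NUL */
    return dest;
}
```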
But wait! That's not all! Since the type of an array is actually a pointer, and pointers are allowed to be NULL in C, first and second in the above could actually be NULL. If either one of them is, the program will crash. So we need to add extra code to check for that ...
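Roughly, and again only as a sketch: the version below treats a NULL argument as an error and returns NULL, which is just one possible policy. The name concat_checked is made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: concatenates first and second into a newly
   allocated string. Returns NULL if either argument is NULL or if
   allocation fails. The caller must free() a non-NULL result. */
char *concat_checked(const char *first, const char *second)
{
    if (first == NULL || second == NULL)
        return NULL;                    /* the extra check */
    char *dest = malloc(strlen(first) + strlen(second) + 1);
    if (dest == NULL)
        return NULL;
    strcpy(dest, first);
    strcat(dest, second);
    return dest;
}
```

Note that the caller now cannot tell a NULL input from an out-of-memory condition, so there is yet another thing to remember.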
All those many things to remember to concatenate two strings. It doesn't have to be that way. In OCaml, for example, a string is a string, not a pointer to a character, and never null. You don't have to worry about allocating a large enough block of memory, because memory will be allocated as needed, and reclaimed when no longer reachable. As it happens, OCaml also has an operator that concatenates strings. That is beside the point here, but I had to tell you that to explain what the code looks like in OCaml. Namely:
first ^ second
Not only is it much shorter than the C code, it's also easier to understand what it does, and more robust.
I think this sort of thing matters a lot more than how you format or indent your code, and pretty much everything else that normally falls under the name of "coding standards".
Please correct me if I got my facts wrong.
Why does everybody do it that way? That is, with the opening brace on the "if" line? I have always found that difficult to read. Why not
if (something)
{
    stuff
}
else
{
    other stuff
}
or maybe even
if (something)
    {
    stuff
    }
else
    {
    other stuff
    }
This last has always seemed to me to be the most readable, most obvious way to write the code. Can anyone explain why it is not used? (other than some well-known guru prefers the other?)
Teen Angel - a Ghost Story
Thanks for the link! I found a style there that appeals to me that I've not tried before which they say could be called "Lisp style" (with my slight mods here):
for (i = 0; i < 10; i++) {
    if (0 == (i % 2)) {
        doSomething(i); }
    else {
        doSomethingElse(i); }}
At the place that I work, I got so sick of everyone's Java code being so lazily formatted (or not at all) that I FORCED people to code in a certain format. I configured the Eclipse formatter a certain way and made sure everyone on my team had it imported.
Of course people can still be lazy and find it too hard to press Ctrl+Shift+F, but when I get given a code review, if they haven't formatted it, I instantly fail it.
Some examples are:
* Large chunks of whitespace between tokens (ranging from 0 to 3 spaces between an operator and a number/String in some cases),
* Lack of or inconsistency of indentation using COMBINATIONS of spaces and tabs, and
* Braces on the ends of lines in some places, and on the next line in others.
Admittedly, programmers can be forgiven for messing up braces or whitespace occasionally (nobody's perfect), but when it's riddled throughout the code, it's time to hit Ctrl+Shift+F.
Homonyms are fun!
You're driving your car, but they're riding their bikes there.
Let's not use exceptions but use a C++ integer named "ok" with an initial value of 1. Now write code like this:
int ok = 1; ...
if(ok)
    ok = MethodCall();
if(ok)
    ok = expression;
return ok;
They argued it was easier to step over in the debugger and it saved you from checking code exits (code exits only allowed in the last statement).
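To make the comparison concrete, here is a small sketch of the "ok" flag style next to the same logic with early returns. It is written in plain C, and the validation task, the size limit, and all names are hypothetical stand-ins for MethodCall() and friends.

```c
/* Hypothetical step: succeeds (returns 1) only for non-negative input. */
static int check_non_negative(int value)
{
    return value >= 0;
}

/* The "ok flag" style: every statement is guarded, one exit at the end. */
int validate_ok_flag(int width, int height)
{
    int ok = 1;
    if (ok)
        ok = check_non_negative(width);
    if (ok)
        ok = check_non_negative(height);
    if (ok)
        ok = (width * height <= 1000000);   /* arbitrary example limit */
    return ok;
}

/* The same logic with early returns, for comparison. */
int validate_early_return(int width, int height)
{
    if (!check_non_negative(width))
        return 0;
    if (!check_non_negative(height))
        return 0;
    return width * height <= 1000000;
}
```

Both versions stop doing real work after the first failure; the flag version just keeps re-testing ok on the way down to the single return.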
For those Java programmers missing this brilliant feature: in Eclipse you can click the return value and all the method's exit points show up (including those for checked exceptions).