Open Source Code Maintainability Analyzed
gManZboy writes "Four computer scientists have done a formal analysis of five Open Source software projects to determine how being "Open Source" contributes to or inhibits source code maintainability. While they admit further research is needed, they conclude that open source is no magic bullet on this particular issue, and argue that Open Source software development should strive for even greater code maintainability." From the article: "The disadvantages of OSS development include absence of complete documentation or technical support. Moreover, there is strong evidence that projects with clear and widely accepted specifications, such as operating systems and system applications, are well suited for the OSS development model. However, it is still questionable whether systems like ERP could be developed successfully as OSS projects. "
I've worked on a major product in CRM market, and let me tell you, don't want to know what goes into sausage. If you knew, you wouldn't touch this code with a 10 foot pole much less bet your company on it.
I'm sure it's the same with ERP. It's just a huge polished turd, but because you don't have the source code you don't know it's a turd. You only see the polish.
Well, the importance of a test suite rises dramatically with the complexity of the project. The difficulty of making a test suite increases with the amount of hardware that you need to implement it. When I think of "big" open source projects that aren't very hardware dependant - for example, ITK (the Insight Toolkit), they tend to have nice test suites. Naturally, the little ones don't, but little projects of most things don't have test suites.
I agree, though, that automated test suites are underused. Also, not enough programmers (OSS or otherwise) seem to understand the importance of refactoring.
A message to coders: People, if your function is more than 10 lines long, you should start to consider splitting it. If it's more than 100 lines long, you're probably doing something wrong. If you have the same code written with slight modifications two or more different ways, you're probably doing something wrong. Use templates rather than repeating code if your language supports them. If you ever feel "this should probably be commented more", don't comment it - split it up into functions and let the functions be their own comments (if you have to, comment the functions as well). Use const as much as physically possible (in supporting languages). Use array objects that clean themselves up instead of arrays allocated on-the-fly whenever physically possible. If you find certain variables being used often together, group them into an object. If you find a set of functions operating on an object and only that object, make them member functions. Etc.
Just doing basic refactoring can make code far more organized and readable.
"Well, then fire it up and show me what this..." (sigh)
OSS is no silver bullet. Their last point is "OSS code quality seems to suffer from the very same problems that have been observed in CSS projects." Er, big surprise, they're all software.
- David A. Wheeler (see my Secure Programming HOWTO)
Now, I'm all for emperical data, but that is just bistromatics and totally insane.
Metrics are already "black magic." This one is no worse or better than any other dimensionless metric I've seen.
Obviously the input is in radians. The argument to a trig function is always assumed to be radians unless otherwise specified. Now, the sqrt(mumble percent) can only range from 0 to 1, so what we're looking at here is the graph of the sin function from 0 to 2.4 radians.
Do it now. Graph it. Graph the function sin(sqrt(2.4*x)) from x=0..1
Notice that this function (you might call it a transfer function) ramps up and peaks at 0.43 radians. That corresponds to a comment percentage of 3%. Then it begins to go down again. What does this mean? It means that there is a point beyond which more comments are not useful. If more than 3% of your code is comments, there's something wrong. That's all that part of the equation means!
You only classify it as "bistromatics" because you're too lazy to do the thinking and figure out what it's for.
We can't use JSP's, there hard to maintain!
We can't use Javascript, it's loosely typed!
We have to use an Object Broker, SQL is not maintainable!
All the projects that I have been on where code maintainability has been the primary goal have one thing in common. They all failed.
If that is their idea of "maintainable", they didn't fail because they shot for maintainable, they failed because they drank the kool-aide and trapped themselves into software paradigms that only work when oodles of resources are thrown at them. Smaller teams require more agile methods to get results, and that is also the mechanism whereby smaller teams can produce software where larger teams failed. (It goes both ways, I'm not claiming that as an absolute. But that small teams can and have beaten much bigger ones is an unassailable fact.)
Certainly you've got some good facts at hand to learn from, but I think you're taking the wrong lesson away. Projects that simply ignore maintainability fail, too. Can you imagine Mozilla with no concern for maintainability, or the Linux kernel?
The focus should always be on product quality, not code quality.
If you don't have quality code, you don't have a quality product. You may have an adequate product. You may be in a situation where an adequate product is all you need. I have an adequate set of knives in my kitchen, because I can not afford quality knives. But I do not pretend that they are therefore quality knives.
You're calling for a classic short-term focus, and you can and will suffer the classic penalties for short-term focus. I know, I've seen it first hand and dragged software products out of their local optima by the sweat of my brow. It's not easy, but either it happens or the product dies a code-quality death.
You need to use the proper metric for quality. Inappropriately using and paying for a strong type system is anti-quality in my book; that goes for your other two examples as well, when done correctly. (SQL and JSP code both need to be rationally minimized via the application of Once and Only Once, but they are not the cause of the unmaintainability; the abandonment of Once and Only Once is. Once and Only Once is one of the most important aspects of any proper quality metric.) Your quality metric should have functionality built into it.