Open Source Code Maintainability Analyzed
gManZboy writes "Four computer scientists have done a formal analysis of five Open Source software projects to determine how being "Open Source" contributes to or inhibits source code maintainability. While they admit further research is needed, they conclude that open source is no magic bullet on this particular issue, and argue that Open Source software development should strive for even greater code maintainability." From the article: "The disadvantages of OSS development include absence of complete documentation or technical support. Moreover, there is strong evidence that projects with clear and widely accepted specifications, such as operating systems and system applications, are well suited for the OSS development model. However, it is still questionable whether systems like ERP could be developed successfully as OSS projects."
If they excluded PERL.
by Ioannis Samoladas, Ioannis Stamelos, Lefteris Angelis, Apostolos Oikonomou ...it was all Greek to me.
I just keep my code on one 3 1/2 inch floppy.
Haven't had a problem yet....
Was this really a surprise? Did anyone think that open source software is, as a general rule, well documented, or documented as well as many commercial projects that have project management (for better or worse) and technical writers on staff to do internal as well as external documentation?
GNU General Public License (GPL)
Berkeley Software Distribution (BSD)
are all defined in the article.
But not ERP.
Go figure.
At least with Open Source Software you CAN maintain it if necessary. With closed source, there is no way to make any changes to old software...and much too often, the companies that make some of the obscure CAD stuff (my field, once) are out of business. At least having it open makes it possible to change something...even if you don't.
One need only peruse the source code of 5 randomly picked source forge projects to figure this one out.
And it's often not.
Many of us have worked and are working in the "real world" out there, and I've been less than impressed with most documentation on large products.
Not to mention design documents, which end up being dead documents that are outdated as soon as the first line of code is written. To many corporations, there's no big incentive to spend so much money on these types of activities when you can have people just churning out code and finishing the darned product in the end.
I'm not saying commercial development is any worse, but I can't say it's any better for sure either.
- sigs are for wimps.
In spite of drives towards a uniform consistent design, the OSS community still has a long way to go in terms of interface design, which is the defining factor in acceptance of packages like ERP. In "The Art of Computer Programming", Knuth notes that programmers hate I/O programming.
After nearly 35 years, it is still so. OSS remains an extreme case-in-point.
You lazy young whippersnappers and your precious Perl! You probably think you INVENTED write-only code. In my day, we wrote APL, and nobody liked it!!!!
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff
What more documentation do you need than the source code? Seems plenty enough to me, seeing as by and large only developers would look at it anyway. Even if a non-programmer wanted to spin their propeller on it, the original author is only an email away. Seems rather complete to me. Of course the analysis would not be complete without an equation. 43 sounds about right to me..... it's one better than THE answer.
My karma is not a Chameleon.
...dared to challenge this article.
(insert rousing action-series music) Hercules!
You can hold down the "B" button for continuous firing.
The disadvantages of OSS development include absence of complete documentation or technical support.
Yeah, it's nothing like closed source software, which always has complete documentation. I mean, look at Windows itself. All of that documentation about all of those API calls, lots of useful specifications about interoperating with the underlying kernel, plenty of specifications about the NTFS file system...
Oh, wait. It's all kept "secret". Nevermind.
I'd like to see the same approach applied to closed source projects. Since the focus here was on open source specifically, it wasn't really well balanced, and it didn't tell us anything new. Anybody who's browsed SourceForge could have told you that open source development has its share of problems.
The real question is whether or not closed source projects are all that better off.
And this differs from commercial software, how?
I've spent 20 hours trying to figure out how undocumented or broken features behave in Rational's Enterprise Product Suite 2003. And that's expensive software.
I'll choose the software whose source code I can examine any day of the week. Granted, I'm a developer. But it's much worse to lack both documentation and source code.
The more high-profile OSS projects mostly do have quite extensive documentation and various mailing lists and forums to support it. Plus, if official support is lacking, it is always possible to get some sort of support from a third-party company as they have exactly the same access to the software as the original developers. In other words: the spectrum of support you *can* get is much larger, even if the support you *do* get (on the smaller projects) may be lower on average.
see a Text Widget
It takes being interested in a project for one to pour himself into it. Most hackers/programmers have a thing for operating systems and programming tools, so it's no surprise that OS projects are doing better. The same goes for programming tools: GCC, editors, programming languages, databases... I love to program, but I could never find myself programming an ERP system, just for some company to make money off. How is it going to meet my personal need? There has to be something in it for me!
This is why accounting software, office software and lots of general use applications "suck" in the OSS world. The "motivation" is not there, even "ego" is not a good enough motivation. My fellow hackers will give me more props for some lousy 500 line python hack which does something weird and not so useful than a complete accounting software suite.
What would be interesting is to see a group of companies start an OSS project from the ground up, pour their own money, pay programmers. But then again, there is no motivation for that! Big companies are only interested in jumping on OSS projects that happen to have gained fame...
------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind
I've worked on a major product in the CRM market, and let me tell you, you don't want to know what goes into the sausage. If you knew, you wouldn't touch this code with a 10 foot pole, much less bet your company on it.
I'm sure it's the same with ERP. It's just a huge polished turd, but because you don't have the source code you don't know it's a turd. You only see the polish.
However, it is still questionable whether systems like ERP could be developed successfully as OSS projects.
Yeah, it's also still questionable whether systems like ERP can be developed successfully at all. I'd like to see statistics on the number of ERP implementations that go horribly wrong and wind up crippling or even bankrupting companies.
GNU | Enterprise
Brent J. Nordquist N0BJN
No joking here. An old question, what's the best accountant's answer to "how much is 2+2" is "whatever you'd like it to be."
Custom Enterprise Resource Planning software sometimes includes parts no boss would want the IRS or other authorities to know. With Open Source they become blatantly obvious. In this case Security Through Obscurity is the only safe model.
Sure a HONEST resource planning software can be open source. But it won't ever make the company as successful as one with some... extras.
I always thought that if you have enough people "chewing" (working) on the same module that it should eventually self-standardize into a least common denominator of maintainability. Which, if not the most maintainable code, should be as maintainable as possible given the design and interoperability constraints (with other modules). Evolutionarily speaking... it HAS to be maintainable or it "dies" (becomes unmaintained and then unused or superseded by another implementation).
On the flip side, a closed source module could be built "top down" to a unified set of coding standards that would help maintainability. But it's not a requirement. I've seen plenty of code bases built just this way that were horrific... But still maintained and not changed because management was willing to throw enough money to keep things going (but not enough money to make it more interoperable).
YMMV.
You can find a description of the maintainability index here.
If you look at the description, you'll see that the equation was mainly "calibrated" based on a bunch of projects at HP. But fitting such an equation to a handful of self-selected projects doesn't give you any idea of how statistically valid it is.
Furthermore, the maintainability index contains measures that you would expect to go up as software systems become bigger; therefore, it isn't even a meaningful comparison of software systems of different size (or a single software system over time): maintaining a 1MLOC project is just a lot harder than maintaining a 100kloc project, but you may be doing equally well on both of them.
Particularly amusing is a term of 50 * sin (sqrt(2.4 * perCM)) in the maintainability index, where perCM is the average percentage of comment lines per module. As perCM ranges from 0 to 100, the argument for the sin(.) function will range from 0 to 15 (ponder for yourself what that means about how much you should comment).
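To make the point about the comment term concrete, here is a minimal sketch that evaluates only the term quoted above for a few values of perCM. It assumes the sine argument is in radians (the quoted formula does not say) and that perCM is a percentage from 0 to 100, as described in the comment; the function name `comment_term` is made up for illustration and is not taken from the article.

```python
import math


def comment_term(per_cm: float) -> float:
    """Comment term as quoted above: 50 * sin(sqrt(2.4 * perCM)).

    Assumption: the sine argument is in radians and per_cm is a
    percentage between 0 and 100.
    """
    return 50.0 * math.sin(math.sqrt(2.4 * per_cm))


if __name__ == "__main__":
    # Print the term for a spread of comment percentages.
    for per_cm in (0, 1, 2, 5, 10, 20, 40, 60, 80, 100):
        print(f"perCM={per_cm:3d}  term={comment_term(per_cm):7.2f}")
```

Under the radian assumption the term peaks at roughly 1% comments, crosses zero again near 4%, and then swings between positive and negative values as the comment percentage grows, so more comments can both raise and lower the index.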
Until someone produces sound maintainability data with hundreds of software projects, the use of MI is just bullshit.
From the conclusions:
Using tools such as MI derived for measuring CSS quality, OSS code quality appears to be at least equal and sometimes better than the quality of CSS code implementing the same functionality.
So, apparently, the authors think that OSS is as a general rule better than CSS from a maintainability point of view.
However, it is still questionable whether systems like ERP could be developed successfully as OSS projects.
I could be mistaken, but isn't Compiere an established OSS ERP implementation?
I think the question shouldn't be: 'Can software like ERP be developed as OSS?' But rather: 'Are there enough people in the OSS community interested enough to develop this kind of software without any form of financial support?' I think the answer has turned out to be 'no'. The same goes for things like (good) financial software, and anything that would require heaps of work, high precision and coordination, but no spectacular result for the common man to brag about.
Not knocking inline documentation - I think it is a great idea, but you have to make sure that developers buy into it.
Really there is a lot of common sense that can go into coding standards to help reduce recurring bugs in "problem functions". Rules for initializing and using globals, rules for maximum method length, code ownership, and small group code walkthroughs can do a lot to prevent the kind of problems you mention.
[Set Cain on fire and steal his lute.]
Good corporations understand the value of corporate alliances. Often, the cost of doing something by yourself isn't worth the payout. Business support software is one of those. Companies don't make money from selling their internally developed software. OSS provides a means for lots of small companies to get together to create this kind of software, without having to create a formal agreement. Sure, some companies are going to take advantage, but if it is open, then every company can add the features that it wants.
The problem with a software company filling this role is that their system is proprietary and unmodifiable by the client. Most companies *do* have the resources to hire a programmer or a contractor to add a feature to a piece of OSS.
Anyone have any ideas on how to prevent abuse of such a system? That is, too many people using the system and not enough people contributing?
the lack of technical support for open source software. I have gotten more help on more issues by searching Google than I have EVER gotten from some "central" help center for any closed source application.
Right, and what the hell does "Enterprise Resource Planning" mean?
It used to mean the combination of MRP ("Material Requirements Planning") + Accounting. Then along came PeopleSoft and kinda changed it to HR + Accounting. Then along came Siebel and everyone scurried to make it MRP + HR + accounting + CRM (not quite there yet, though). Then they noticed Kronos and they all scurried to make it MRP + HR + Accounting + CRM + Time & Attendance. And failed, because Time & Attendance is a big pain in the butt. Heh. So they partnered with Kronos instead.
The march of "embrace and extend" continues. Next app up: Expense Reporting (say bye-bye to Concur, etc., that's an easy app). Already on deck: data warehousing (say bye-bye to Cognos, Business Objects, etc., say hello to SAP BW). Soon to come: business process automation (say bye-bye to Ariba, etc.)
And so on, if you believe the pundits.
"ERP" has become a meaningless acronym, an umbrella under which every business app known to man is rammed into the same stinking pile of multi-million dollar shit. At some point it will probably implode from its own weight, and we'll go right back to the "best of breed" interoperable software model.
But it will be a while yet. I suspect in the meantime there will be some Open Source alternatives. I sure hope so.
I stopped reading at that point.
If they think they're so smart, those 4 guys are welcome to fork whatever project they want and do it themselves.
Have you guys looked at the formula ?
They take sin(sqrt(mumble_percent)).
Now, I'm all for empirical data, but that is just bistromatics and totally insane.
They don't even say if the argument to the sine function is in degrees or radians and one is left to wonder if they even know themselves...
I have no doubt that if you take a piece of code and do a before-and-after check around some major rewriting, it may tell you something.
But comparing two different pieces of code with this formula is just plain bogus.
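The degrees-versus-radians question is easy to check numerically. The sketch below evaluates the same sine term both ways for comment percentages 0 to 100 and tests whether each curve only ever goes up. The 50 and 2.4 constants are taken from the formula as quoted elsewhere in this discussion; the helper names are invented for illustration.

```python
import math


def sine_term(per_cm: float, degrees: bool = False) -> float:
    """50 * sin(sqrt(2.4 * perCM)), reading the argument as radians or degrees."""
    arg = math.sqrt(2.4 * per_cm)
    if degrees:
        arg = math.radians(arg)  # treat the argument as degrees
    return 50.0 * math.sin(arg)


def is_monotonic_increasing(values: list[float]) -> bool:
    """True if no value is smaller than the one before it."""
    return all(b >= a for a, b in zip(values, values[1:]))


# One sample per whole comment percentage, 0..100 inclusive.
radian_curve = [sine_term(p) for p in range(101)]
degree_curve = [sine_term(p, degrees=True) for p in range(101)]

if __name__ == "__main__":
    print("radians monotonic:", is_monotonic_increasing(radian_curve))
    print("degrees monotonic:", is_monotonic_increasing(degree_curve))
```

Read as degrees, the argument never exceeds about 15.5 degrees, so the term rises steadily with the comment percentage; read as radians, it oscillates. The two readings give very different scores for the same code, which is why the unit matters.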
Poul-Henning
Poul-Henning Kamp -- FreeBSD since before it was called that...
Quality is still a happy user. Users like software that works well and hopefully doesn't need a lot of documentation to make it work well. Great software tends to teach the user how to make it perform or at least motivate the user to want to invest the time to master the software for a particular use.
These guys need to understand that this approach to quality applies to all software, irrespective of the development model behind it. A software product with a lot of customers creates the momentum to maintain and enhance that product. An OSS product can be infused with similar energy due to acceptance by a large community of users (esp. if many are programmers too). The feedback from the users incents the programmers to maintain and enhance the product.
New models can be built from hybrids of OSS (donated programming in the commons) and products that one must buy. If there emerges an ERP OSS app then there will be a business opportunity to document/train, support/fix/enhance/customize that application... and Oracle will feel the same frustration competing with that model that MS does competing with Linux.
These complaints against OSS as a model (no option to buy support or docs) are a business opportunity that has been put into play by JBOSS, MySQL, and soon to be hundreds of others. The low barrier to entry is the key to high usage... It's try and don't buy (unless you'd like some training, customization, focused product enhancements, etc).
Volume, usage and effectiveness drive the software world. Quality just makes the ride more comfortable. And OSS gets more comfortable every time the train pulls through the station.
OSS is no silver bullet. Their last point is "OSS code quality seems to suffer from the very same problems that have been observed in CSS projects." Er, big surprise, they're all software.
- David A. Wheeler (see my Secure Programming HOWTO)
I have seen many a software project disregard performance, features, and development speed all in the name of maintainability.
We can't use JSPs, they're hard to maintain!
We can't use Javascript, it's loosely typed!
We have to use an Object Broker, SQL is not maintainable!
All the projects that I have been on where code maintainability has been the primary goal have one thing in common. They all failed.
If you spend all of your time worrying about how the code looks, you will never finish the project. Talk to people who have built successful software. (The ones that sold millions of copies.) Very few of them are proud of the code they wrote, but they are happy with the product.
The focus should always be on product quality, not code quality.
- C++ is more readable than assembler ...
- C# and Java are more readable than C++
- At the end of this list are functional programming languages.
If you can read source more easily, then maintainability will be better.
This article will tell you why you should be interested in functional programming languages. If you're smart and open minded, you will be convinced.
The best functional languages are Haskell and Erlang (click "next" at the bottom of the page).
For example, with Java you prevent bugs by statically typing variables, for example:
int numberOfTries = 3;
If you later try to fill "numberOfTries" with a string, the compiler will warn you of a bug and you'll have prevented it.
With Haskell, you don't have to type int. Haskell will figure out the type for you, you get the benefit of preventing bugs with the convenience of not having to type variables.
The reason I chose Erlang is because with purely functional programming languages like Erlang, you can automatically multitask your program over several CPUs (or this will take minimal effort). Nice feature to have in the future because every CPU manufacturer is going multi-core now. Also, you can easily make a server that never goes down with Erlang because your server is automatically clustered. Just plonk down a couple of networked PCs and if one dies, the server cluster will just keep on going (a bit slower) until you have replaced the power supply of the broken PC.
There are tons of other advantages but, as I said, the above links will convince you if you're smart. Haskell is a bit more academic in nature, they're figuring out the best possible language and Erlang is more polished and ready to go. It was invented by Ericsson to create ultra reliable realtime servers.
- -- Truth addict for life.
I don't follow. How is that any more or less clear than:
object->doThis(that);
object->forThis(that);
Are you trying to say that the former is better because it looks more like English? Weird argument to make, considering the majority of the world's population doesn't speak English as their first language, so it's irrelevant that it happens to look like English.
Any language (except maybe assembler) can be written in a readable way. You just have to learn the idioms.
Are there standard methodologies for making non-OSS code maintainable? If there are, it's news to me; every place I've worked has been utterly bass ackwards with their source code. Redundant libraries that do the same functions (one written by Bob, the other by Fred). Documentation that is years out of date with reality. And all the dead objects floating around (it's safer to leave them in than pull them out). With non-OSS you get a pretty user's manual, maybe that is what people call maintainable. Not to say there can't be sloppy OSS code. I think a great topic for discussion would be just plain maintainability, whether it's OSS or not.
The article mentions doubt on whether an ERP system can be built as OSS, why not? Are they planning on giving every end user the source code and ability to recompile the company's ERP? When I install Linux and friends on my mother-in-law's computer I don't plan on giving her the source code. Is it implied that OSS is less maintainable because you cannot tell if someone has an altered version? It's just freaking code!
I found that progress of open source software can often be measured by the availability of generic code. Once that code has been written as a generic library, collaboration is easy and higher level code is more maintainable. This is where Open Source software can have one huge advantage: Collaboration can take place on the libraries, and many applications profit from that work *between different projects*. You have many more possibilities than in commercial software projects, where you seldom share much work between companies.
That also makes generic code even more important for OS. We have seen huge progress in the last years and most of the lower level stuff has appeared or become more mature.
Unfortunately though, this is also what I still found to be the most lacking in higher level libraries. Most projects go like this:
"We need a really cool search front-end for Google. Let's see whether there's a cool library for this. Nothing yet? Ok, then let's just start from scratch."
And then a library is implemented, with an API for searching using Google. It is perfectly usable for exactly what the developer wanted.
Now this seems reasonable at first, but think about this: In OS development, we are not bound to time lines. Why should we choose a half-hearted attack on a problem that will surely hit others after us again? Instead, the developer could also ask, "if I create a generic library, how generic can I make it? Do we have to limit its capabilities to requesting stuff from Google? Or maybe I could even create a library that allows querying any search engine? Or maybe, provide an API that lets you query anything, like search engines, your local hard disk, p2p networks and a local database."
In commercial software development, this would be completely unreasonable overkill. In OS however, it's a great way to collaborate. And once implemented, the foundation for a lot of applications has been laid. And it's also fun - writing libraries is fun. It is a great component model too. And it is also a pleasure to see when the work is done, because once the foundation is set, writing the actual application is as easy as plugging components together.
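As a rough illustration of the "query anything" idea, here is a minimal sketch of what such a generic search API could look like. Every name in it (`Result`, `SearchBackend`, `InMemoryBackend`, `search_all`) is hypothetical and invented for this example; it does not describe any existing library. The in-memory backend is a stand-in for a real search engine, disk index, p2p network or database.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Result:
    source: str    # name of the backend that produced the hit
    title: str
    location: str  # URL, file path, or any other locator


class SearchBackend(Protocol):
    """Anything that can answer a text query; all names here are hypothetical."""

    name: str

    def search(self, query: str) -> Iterable[Result]: ...


class InMemoryBackend:
    """Toy backend standing in for a search engine, disk index, or database."""

    def __init__(self, name: str, documents: dict[str, str]) -> None:
        self.name = name
        self._documents = documents  # maps location -> title

    def search(self, query: str) -> Iterable[Result]:
        needle = query.lower()
        for location, title in self._documents.items():
            if needle in title.lower():
                yield Result(self.name, title, location)


def search_all(backends: Iterable[SearchBackend], query: str) -> list[Result]:
    """Send one query to every backend and collect the hits in backend order."""
    results: list[Result] = []
    for backend in backends:
        results.extend(backend.search(query))
    return results


if __name__ == "__main__":
    web = InMemoryBackend("web", {"http://example.org/mi": "Maintainability metrics"})
    disk = InMemoryBackend("disk", {"/home/user/notes.txt": "Notes on metrics"})
    for hit in search_all([web, disk], "metrics"):
        print(hit.source, hit.title, hit.location)
```

The application only talks to `search_all`, so adding a new kind of source means writing one more backend class rather than touching the callers, which is the "plugging components together" effect described above.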
There are also good examples, of course. GStreamer is one. But there is still so much potential. I would like to see this kind of thinking more and more.
what about star office -> open office? Isn't that sort of close to what you want?
But ya, I know what you mean. I'm not a coder, just a consumer, but I would be more than willing to pay good cash (not x -hundreds, but maybe x-couple/3 dozen dollars) for an OS (once a year, not 3-4 times a year scamola) that paid all the developers, was still open source, had fewer apps but all of them *worked* and if I had a question or problem still remaining, I could go post on their bulletin board and get an actual for-real honest informed reply, from a paid company person, that answered my question or resolved the problem or at least determined that it was in fact pretty close to impossible to be resolved. Tell me the truth in other words, I can live with it. I'm getting sorta tired of "yep, we gots us a real community volunteer effort and it's gonna be the bestus and it almost works, just you wait until the next release!!11!1" stuff.
There's an open source business model that might work to sell a real joe consumer desktop and still pay the coders. I'll pay for 20 really well written apps with extensive user accessible written in hoo-mann speech docs, said apps that work and serve basic normal computing functionality for this "the masses" guy (moi and probably millions more), but I am not going to pay for two hundred or one thousand "almost works" volunteer apps on 6 cds and a dvd plus download even more!, etc , including the no credible help even if they have a so called wiki or board or mailing list or irc channel. Nope. I used to believe that but not any more. Not prudent any longer. This is 2005. We have entered the "better work right now right when it's installed" era, not the "coming soon" era, that was LAST century.
signed, joe consumer
Although you intended to flamebait, the joke invites the consideration that some languages make it harder to write clear specifications and some make it easier.
In Perl's case specifically, the language lends itself to quick scripting and shorthand. This is great for little tasks, but as everyone knows, it doesn't scale well. This isn't Perl's or Larry Wall's fault. It was designed to /also/ allow for "quick and dirty" style which is ideal for little tasks. If you maintain that style for larger programs, it is exclusively the fault of your lack of talent, methodology and wits as a programmer.
However, I would like to point out that some languages actually enforce a clean specification. In particular, the functional languages almost literally *force* you to it. I am thinking here of the likes of SML, Haskell, OCaml, Clean, etc. This is not the place to discuss the fact that they also have a sane type system, where others have a broken one (e.g., C - see: http://perl.plover.com/yak/typing/). Language advocacy sucks, anyway ( http://www.perl.com/pub/a/2000/12/advocacy.html).
There's this story a guy posted once on comp.lang.functional about how once he wrote a Haskell program, handed it to the manager and he said: "great, you wrote a specification, now write the program." (!) The manager actually thought what was executable Haskell code was just a "spec." Haskell fully supports Knuth's literate programming approach.
If you ever tried writing something in those languages you know how they force you to write clean code. It is simply the easier way to break code into small functions. "Factoring", I believe it's called.
My point is that I feel the OSS community could greatly benefit from non-mainstream languages. These languages have seen nearly 2 decades of intense research. Arguments pertaining to performance just don't hold anymore vis-a-vis other mainstream languages like C#, Java, Python, Perl, etc. Clean has a video-game library to prove the point (http://www.cs.ru.nl/~clean/). OCaml, SML and Clean approach C performance (OCaml coming second to C in some "language shootouts" for most benchmarks, for all it's worth). Also, some of these have been used in large "real world" problems. OCaml has been deployed in the C code verification for correctness of the flight control software for the Airbus 340 airplane, for instance (http://www.astree.ens.fr/). It is widely known that Ericsson uses Erlang for their telephony switches (www.erlang.org). CIL (Infrastructure for C Program Analysis and Transformation), developed in Berkeley, is written in OCaml; if it were more widely known, it could prevent the weekly flow of buffer overflows, and it can be used *right now* in large OSS projects (http://manju.cs.berkeley.edu/cil/). They've tested it in the Linux kernel, for example (whether they sent patches or not, I don't know).
Of course, you have to have an open mind, and be willing to learn and throw some old tricks out and work using a different approach/mindset. Learning things thoroughly is always hard work, but the OSS community shouldn't dismiss functional languages as "academic" - and for that matter, other serious approaches, like Squeak Smalltalk, for instance.
My 2 cents.
PS: I'm no expert at programming, just a beginner, but I offer my opinion here because I feel some people just haven't been introduced to some facts and haven't heard of some stuff. Not everyone has a big company to promote their language.
Main difference between the BSD license and the GPL license: one is from California and the other is from Massachusetts
We can't use JSPs, they're hard to maintain!
We can't use Javascript, it's loosely typed!
We have to use an Object Broker, SQL is not maintainable!
All the projects that I have been on where code maintainability has been the primary goal have one thing in common. They all failed.
If that is their idea of "maintainable", they didn't fail because they shot for maintainable, they failed because they drank the Kool-Aid and trapped themselves into software paradigms that only work when oodles of resources are thrown at them. Smaller teams require more agile methods to get results, and that is also the mechanism whereby smaller teams can produce software where larger teams failed. (It goes both ways, I'm not claiming that as an absolute. But that small teams can and have beaten much bigger ones is an unassailable fact.)
Certainly you've got some good facts at hand to learn from, but I think you're taking the wrong lesson away. Projects that simply ignore maintainability fail, too. Can you imagine Mozilla with no concern for maintainability, or the Linux kernel?
The focus should always be on product quality, not code quality.
If you don't have quality code, you don't have a quality product. You may have an adequate product. You may be in a situation where an adequate product is all you need. I have an adequate set of knives in my kitchen, because I can not afford quality knives. But I do not pretend that they are therefore quality knives.
You're calling for a classic short-term focus, and you can and will suffer the classic penalties for short-term focus. I know, I've seen it first hand and dragged software products out of their local optima by the sweat of my brow. It's not easy, but either it happens or the product dies a code-quality death.
You need to use the proper metric for quality. Inappropriately using and paying for a strong type system is anti-quality in my book; that goes for your other two examples as well, when done correctly. (SQL and JSP code both need to be rationally minimized via the application of Once and Only Once, but they are not the cause of the unmaintainability; the abandonment of Once and Only Once is. Once and Only Once is one of the most important aspects of any proper quality metric.) Your quality metric should have functionality built into it.
While they admit further research is needed,
It's not usually all that hard to get people to "admit" that they'd like more funding.
disclaimer: that was not meant as a rant, I work in science myself. But "more research is needed" is a running joke in the community. It doesn't detract from the work, but every publication on the planet includes it, and every serious reader treats it as a mere formality and silently ignores it.
sudo ergo sum
So I've been told, sometimes by some of the biggest names in programming. Unfortunately, a firm belief among the industry doesn't make them right.
Rather than debunking this one here, I'll simply refer you to Steve McConnell's excellent Code Complete. McConnell cites a large amount of hard data to show that longer routines can be at least as good on both development time and error count grounds as shorter routines, and indeed exceptionally short routines (the 10-liners you're advocating) are amongst the worst on both metrics.
As an aside for general interest, since I'm sure a lot of people reading this comment also found that book very good, it seems a second edition has recently been published, updating the examples by a decade or so and putting much more emphasis on recent coding approaches, particularly OO. Whether that is an improvement remains to be seen, but I'll certainly be buying a copy. I guess if he's reversed his position based on more recent studies, I'll have to eat my words, too, but I doubt it. ;-)
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.