Ask Slashdot: How Do You Assess the Status of an Open Source Project?
Chrisq writes: "Our software landscape includes a number of open source components, and we currently assume that these components will follow the same life-cycle as commercial products: they will have a beta or test phase, a supported phase, and finally reach the end of life. In fact, a clear statement that support is ended is unusual. The statement by Apache that Struts 1 has reached end of life is almost unique. What we usually find is:
- Projects that appear to be obviously inactive, having had no updates for years
- Projects that are obviously not going to be used in any new deployments because the standard language, library, or platform now has the capability built in
- Projects that are rapidly losing developers to some more-trendy alternative project
- Projects whose status is unclear, with some releases and statements in the forums that they are 'definitely alive,' but which seem to have lost direction or momentum.
- Projects that have had no updates but are highly stable and do what is necessary, but are risky because they may not interoperate with future upgrades to other components.
By treating open source in the same way as commercial software, we only start registering risks when there is an official announcement. We have no metric we can use to accurately gauge the state of an open source component, but there are a number of components that we have a 'bad feeling' about. Are there any standard ways of assessing the status of an open source project? Do you use the same stages for open source as commercial components? How do you incorporate these in a software landscape to indicate at-risk components and dependencies?"
SourceForge, GitHub, and other major open source project hosts show last-updated dates and, when a project is discontinued, often a notice saying so. Ultimately, some responsibility is placed on the author(s) and maybe even the community for managing this. Search engine rankings take care of the rest. And of course, there is no way to bat 100% here; some will be missed with this and just about any other method.
Try to find someone looking for help using it online. See what people say to them. If there are lots of recent problems, and the responses don't suggest using other products, it's likely in a good state to use.
If no one is looking for help using the library, it's either not in use or way too easy to use (has that ever happened?).
One thing to look out for is that if something works well, it might not need updates very often (or at all, depending on what it is). Don't abandon something simply because it's old or not being updated. Now, if it's not being updated, has lots of open issues, and has no users, that's a problem.
You can also look for some issues/tickets, and see the response times on them.
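As a rough illustration of the two checks above, here is a minimal Python sketch. It assumes you have already pulled ticket dates and counts out of whatever tracker the project uses; the `Ticket` type, the function names, and the thresholds (two years, 50 open issues) are all hypothetical and should be tuned to taste.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Ticket:
    opened: datetime
    first_response: Optional[datetime]  # None means nobody has replied yet


def median_response_days(tickets):
    """Median days between a ticket being opened and its first reply.

    Unanswered tickets are skipped; returns None if nothing was answered.
    """
    waits = [
        (t.first_response - t.opened).total_seconds() / 86400
        for t in tickets
        if t.first_response is not None
    ]
    return median(waits) if waits else None


def looks_abandoned(last_release, open_issues, recent_help_threads, now,
                    stale_after=timedelta(days=730), many_issues=50):
    """True only when all three warning signs line up: no recent updates,
    lots of open issues, and no visible users. Thresholds are assumptions."""
    no_updates = (now - last_release) > stale_after
    return no_updates and open_issues >= many_issues and recent_help_threads == 0
```

Note that an old project with few open issues is deliberately not flagged: age on its own is not treated as a problem.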
This isn't a problem that is unique to open source. Several commercial libraries that we have used in the past have entered the twilight zone where the developer is neglecting them, and refuses to release any sort of roadmap or EOL announcement. Eventually, you just have to make your own call based on how much work it will be to move to a new library vs the risk of staying with the current one. At least with open source if you get stuck with a dead library you can choose to take over maintaining it on your own either as a long term strategy or a short-term stop-gap until you can move onto something else.
One metric yielding interesting results is the concept of "technical debt", a term coined by Ward Cunningham and popularized by Martin Fowler. SonarSource, for example, measures this metric very well. A project that has seen neither an increase (recently taken risk) nor a decrease (recent moves toward stabilization) may very well be dead. I recently used it on our own software of 580 KSLOC. The interesting conclusion: core stable, some utilities half dead or worse, much life springing up at the functional fringes. This also holds for e.g. Tomcat. The tactical and strategic conclusions one may draw from such considerations are fascinating.
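The "neither increase nor decrease" test can be sketched in a few lines of Python. This assumes you have exported a series of debt measurements for a module (oldest first, in whatever unit your tool reports); the function name and the 2% tolerance are hypothetical, and this is not any tool's API.

```python
def debt_trend(samples, tolerance=0.02):
    """Classify a series of technical-debt measurements (oldest first).

    Returns "rising", "falling", or "flat". "flat" means the relative
    change between the first and last sample is within `tolerance`.
    """
    if len(samples) < 2:
        raise ValueError("need at least two measurements")
    first, last = samples[0], samples[-1]
    if first == 0:
        # Avoid dividing by zero: any debt appearing from nothing is a rise.
        change = 0.0 if last == 0 else float("inf")
    else:
        change = (last - first) / first
    if change > tolerance:
        return "rising"
    if change < -tolerance:
        return "falling"
    return "flat"
```

A "flat" result over a long window is only a hint that nobody is touching the code; it says nothing about whether the code still does its job.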
The first thing I do with regard to investigating any OSS is to find their developer list and skim the last few months of it. It's a good way to see the level of activity, responsiveness, and how cohesive or combative the core is.
Another good technique is to search Stack Overflow for questions about the project you are considering. Look at both the number of questions asked and the quality of the answers. Especially look for questions like "Should I be using XYZ?" and "XYZ vs {Alternative to XYZ}".
Stack Overflow is moderated somewhat like Slashdot, so the best answers will usually bubble to the top.
Yeah you want to be careful with activity metrics. Awk hasn't seen many updates in the last two years. Mostly because it hasn't NEEDED much in the last ten or twenty years. That means it's already rock solid, not that it should be avoided.
0) If the project does what you need today, USE IT. Don't get so bound up in "future-proofing" your technology stack that you get paralyzed looking for "the perfect product that will do exactly what we want forever and never let us down."
1) Define your standard software stack. Mandate that all software written internally using open source components use these standard components & versions, or coordinate making a new version available to all projects if there's a particular new feature of a new version that is absolutely mandatory;
2) Always, always, always, download source for the version of the package you're installing (even if you just grab binary-only distributions to install & run), and archive it for posterity in some location YOU control and back up - DO NOT rely on "the internet" to help you find an old version of software; this allows you to fix (or hire someone to fix) any problem you have down the road in case of real critical issues where no active project maintainers can be found/hired/worked with.
3) Every few months (we shoot for ~6 months), review your stack and grab the latest versions of each component and make it available in your dev / testing environments;
4) If a component starts getting stale (no updates for 2 or more of these cycles), we'll start thinking about replacements for that component, and investigate likely alternatives, and bump this item up into the "needs monitoring" risk category - no production impact yet, but as soon as you need to release a patch of that production version using the outdated component, you're gonna be in trouble.
5) Periodically (nightly if you have resources - get something like Jenkins or similar for this sort of thing) ensure that you can build these components from source successfully. Especially as they get 'stale,' you'll run into issues - system libs, headers, etc. will change over time, and there will come a point where you are no longer able to build the software without code modification. At that point, if any of your software is still using the version, then you should start raising alarms and bump the risk level up to "severe." This could cripple your production env.
6) If a crisis comes up and a dead project is the culprit... well, we've got the code and can always modify it ourselves, if we haven't found any suitable alternative.
There's really no magic to it - just make sure that developers aren't downloading "every version under the sun," and ensure that the versions you're using are reproducible, available, and actively managed on your end. Risk management is paramount.
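The escalation rules in steps 4 and 5 are simple enough to write down. A minimal Python sketch, assuming you track for each component how many review cycles have passed without an upstream release and whether the latest scheduled build from source succeeded; the `Component` fields, the `Risk` names, and `assess` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    OK = "ok"
    NEEDS_MONITORING = "needs monitoring"
    SEVERE = "severe"


@dataclass
class Component:
    name: str
    cycles_without_update: int  # review cycles (~6 months each) with no upstream release
    builds_from_source: bool    # result of the latest periodic build
    in_use: bool                # does any of our software still depend on this version?


def assess(component, stale_cycles=2):
    """Apply steps 4 and 5: a broken build of something still in use is severe;
    two or more quiet cycles means the component needs monitoring."""
    if not component.builds_from_source and component.in_use:
        return Risk.SEVERE
    if component.cycles_without_update >= stale_cycles:
        return Risk.NEEDS_MONITORING
    return Risk.OK
```

The build check comes first on purpose: a component that no longer builds is a problem no matter how recently upstream released something.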
I've had a couple of cases where I needed a feature, that there had been lots of requests for, in existing software whose development had slowed or stopped. I offered to hire the developer, bounty style, but they weren't interested.
I hired professional programmers to add the feature or make necessary changes to the existing code. I then submitted the code as patches to the original developer, hoping that they would accept the patches so I didn't have to patch and compile every time there was an update or distro change. My patches were always GPL and there were no restrictions on them, so if the developer didn't like the style or specific implementation, they could use my patch as a starting point or model and change whatever they chose.
In all cases, the developers have not incorporated the patch. In most cases, they have done nothing at all. I'd likely have been better off just buying Windows COTS.
Have there been any updates at all since you submitted your patch? If not and the time period is long enough to believe there never will be, then your best course of action is to fork. As one with enough vested in the project to pay for further development, you are probably in a better position to steward the project than the original developers, who likely have no more use for the program.
If there have been updates, then you are in a stickier position. Most likely, the maintainers considered your patches to be too narrowly applicable, at least relative to the difficulty required to integrate and maintain them. At that point, you are pretty much stuck re-integrating your patches with each release.
Windows COTS wouldn't necessarily solve your problem either. It just takes away the option to patch your own. If the company is not interested in making the changes you request, there isn't much you can do about it. The exception would be if the commercial software is more popular and better maintained, but that's true in the open source world too. If you have a choice between two projects, both of which can do the job with adjustments, you are most likely better off contributing to the one that is actively maintained than the one that isn't, even if the required changes are more extensive.
the typical example that i give here is "python htmltmpl". htmltmpl was written to solve a very specific problem: minimalist templating of HTML by allowing dictionaries of key-value pairs to substitute into HTML (value text replaces the key when named) and to do likewise for lists of dictionaries in order to e.g. create tables.
very very simple.
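to show how small that problem really is, here is a toy sketch in python. it is NOT htmltmpl's code or its template syntax: the `{{key}}` and `{{#list}}...{{/list}}` markers and the `render` function are made up purely for illustration. a dictionary fills in the keys, and a list of dictionaries repeats a block, e.g. for table rows.

```python
import re
from html import escape

# One pattern, two alternatives: a loop block {{#name}}...{{/name}},
# or a plain variable {{name}}. This syntax is invented for this sketch.
_TOKEN = re.compile(r"\{\{#(\w+)\}\}(.*?)\{\{/\1\}\}|\{\{(\w+)\}\}", re.DOTALL)


def render(template, values):
    """Substitute values into the template; lists of dicts repeat a block."""
    def expand(match):
        loop_name, loop_body, var_name = match.groups()
        if loop_name is not None:
            # Render the block once per row, using that row's dictionary.
            rows = values.get(loop_name, [])
            return "".join(render(loop_body, row) for row in rows)
        # Missing keys become empty text; values are HTML-escaped.
        return escape(str(values.get(var_name, "")))

    return _TOKEN.sub(expand, template)


page = render(
    "<h1>{{title}}</h1><table>{{#rows}}<tr><td>{{name}}</td></tr>{{/rows}}</table>",
    {"title": "Parts", "rows": [{"name": "bolt"}, {"name": "nut"}]},
)
```

substituted text is never re-scanned for markers, and a loop nested inside another loop of the same name is not supported. that is roughly the whole scope: once it works, there is nothing left to change.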
the problem is this: the actual scope of the work required means that the actual programming required was extremely straightforward. i.e. it was done, completed - problem solved. the scope of the work required is clear; the scope of the work required does not change; the scope of the work required does not *NEED* to change.
therein lies the problem, namely that the fact that python-htmltmpl has "not had any development" means that, as far as sourceforge is concerned, the project is "dead". look at the release dates - 2001 for god's sake!
http://htmltmpl.sourceforge.net/
the point is: just because a project hasn't had any development done on it, that DOES NOT automatically mean that it doesn't do the job. correlation != causation. python-htmltmpl *clearly* does the job it's intended to do.
i mention this case specifically because i have seen a large number of HTML "templating" languages come and go. the php-inspired one which used syntax. zope with the dreadful and insane embedding of python in templates and templates in python. many many more, all of which caused me to despair when i saw them, so much so that i was inspired to talk at one UKUUG conference at some length about best practices of keeping programming languages declarative i.e. *never* embedding programming languages into HTML (even if it's php).
and once you follow the sanity-restoring rule of keeping a programming language declarative (e.g. in the php case beginning the file with as the last two characters and AT NO POINT EVER NOT FOR ANY REASON WHATSOEVER FALLING BACK TO OR PERMITTING STATIC HTML TO BE OUTPUT IMPLICITLY)... ... once you follow that rule, then you find that you need a templating system such as php-htmltmpl or any of the others that exist. and, once you've looked closely at what you actually need out of an HTML templating language, then actually, htmltmpl provides a *really* good very simple system which covers pretty much everything you'll need. need to do an expression which is a mixture of variables and HTML? generate it explicitly in php, put it into the array - don't for god's sake try to use a god-awful mix of print, echo, dots and christ knows what else. just.. don't.
so i'm putting this out there because in certain cases, what you find is that the code that you need appears "dead", but that's not actually the case: the failure of sourceforge and github's "metrics" has relegated perfectly good and *completed* code to obscurity.
you are therefore encouraged to participate in *unfinished* projects, with their constant changes, moving targets and massive contributions which may or may not be correctly managed, because it is those projects that have "99% activity". does that sound like a good thing to you?
I mean no disrespect to someone with a UID that is low enough to... have done many things.
But I've been in some FOSS projects (small ones) -- and there's a lot of...issues I've seen with submitters you didn't cover. I guess the OP should get it...but I figure since you're the person explaining things...
1) Being a FOSS dev, you may still be commercially paid and have a noncompete in place.
2) The project you're on may not be GPL. Thanks for submitting stuff with an incompatible license I can't absorb. Even if you said no restrictions, if you put GPL on it, I'm now SOL and have a god-awful license tracking nightmare. Thanks for nothing. Please resend with "public domain" and a signature.
3) Many times I've received patches 'in the wrong place' in the stack. Things requiring changes that should be submitted to another library and were mangled as a fix in my platform.
4) Poor fit. Wrong option, rare case, you changed lots of whitespace because you don't know how to use your editor. Wrong style guide, you name it.
5) Bugfix submitted without test case.
Now admittedly, I'd always reply and let people know how to fix these. But depending on the problems...I've seen cases where it wouldn't have been worth it.
Lastly, the hard one -- sometimes people's fixes are just in the wrong spot and paradigm. They're written in an OO message-passing philosophy in something using a reactor/worker queue. It's not /just/ that it's work to integrate and maintain it, it's that the solution is just 'wrong for us' and the problem it fixes is not a priority. That's a really big risk if you pick up joe-random-developer that knows a language but not a platform.
FOSS is and should be inclusive, but sometimes the submitter has to ask a few questions to fit into the software.
The OP indicates they hired professional programmers, but they did not say what they hired them /for/. If you hire me to 'fix a bug in a program', you're getting a very different fix than if you hire me to 'submit a bugfix for reintegration into mainline' or to 'write a plugin doing X for application Y'
In both cases I'll ask about the quality of work you expect, what you believe is a fair price, and check what you intend to do with it. However, if like many small businesses you just want it done fast and working -- the software may very appropriately /not/ be up to standards. It's their right as a hiring manager to choose.
More relevantly in the context of a freelancer, it's my professional pride and reputation at stake to choose my implementation in the absence of terms to the contrary.
If you're clearly a penny pincher and want fast results, I will place in comments that it's a quick and dirty hack, and give you your four hour turn around with advice and a quote for a proper and full fix. And the maintainers would have every right to say 'fuck that submission'.
The length of time to wait is much longer than you want. The original author of the project still owns the copyright and the rights to the name of the project. The best option is to fork the project and start fresh.