Ask Slashdot: How To Encourage Better Research Software?
An anonymous reader writes "There is a huge amount of largely overlapping but often incompatible medical imaging research software funded by the US taxpayer (see e.g. NITRC or I Do Imaging). I imagine the situation may be similar in other fields, but it is pronounced here because of the glut of NIH funding. One reason is historical: most of the well-funded, big, software-producing labs/centers have been running for 20 or more years, since long before the advent of git, hg, and related sites promoting efficient code review and exchange; so they have established codebases. Another reason is probably territorialism and politics. As a taxpayer, this situation seems wasteful. It's great that the software is being released at all, but the duplication of effort means quality is much lower than it could be given the large number of people involved (easily in the thousands, just counting a few developer mailing list subscriptions). No one seems to ask: why are we funding X different packages that do 80% of the same things, but none of them well?"
"No one seems to ask: why are we funding X different packages that do 80% of the same things, but none of them well?"
When I think about this, I'd rather have that than one single package, if only for the reason that without competition, I'd not be able to know if it was doing anything well or not.
Pragmatism here says plurality is probably better than some kind of Stalinist central control.
"And the meaning of words; when they cease to function; when will it start worrying you?"
Not only are most researchers not proficient in a programming language, they also shape their code more like prototypes so that they can modify it easily as the science progresses. Conventional programmers will be frustrated with this approach since they want every single spec set in stone, which will never happen in a research setting, since research progresses very rapidly and specs can change dramatically in most cases. If you can set the spec in stone, it is usually a sign that the field has matured and is transitioning to engineering-type problems. Once the transition happens, it's no longer research, it's engineering. Then you can "make the code better".
--
Error 500: Internal sig error
This problem is widespread in almost every discipline which uses any form of computation. I think the best way is for major funding sources like the NIH, NSF etc. to build into the grant terms which coding language and existing libraries are to be used. Alternatively, how the software will be developed, and what software, should be used as an additional metric for deciding which proposals to accept. Proposals which are strong otherwise but do not state in clear terms how software will be built should be sent back to be modified to include such information. Pre-existing, well-designed, modular software architectures should be extended rather than building architectures from scratch, which is a waste of funds and time. Funding organizations must also recognize that developing good software takes time and money and set aside budgets in the grant for hiring dedicated programmers. (Scientists are very often not good software engineers; they are more interested in trying things out quickly to see if they work at all.) Such programmers can then take hacky research code from the scientists and turn it into great reusable code.
I'm a computer scientist in the middle of getting my BA, but for research experience or in the process of taking an elective, I've spent time with grad students in other departments- mostly biology and linguistics- and the software they write. Smart people? Absolutely- they're experts in their field. But they can't write code to save their lives. I've seen things that make me want to run screaming to TheDailyWTF and the quality software engineering on display there ;)
I don't think this is a bad thing, myself. Most of this code is single-use only, being written for a specific purpose (or a specific thesis paper), and will never be used again. Not to mention they're taking enough time to get their degrees as it is- I don't think it's reasonable to ask them to become expert software engineers as well. OP claims that taxpayer dollars are being wasted, but think how much waste there'd be if every researcher had to get a CS degree before they started in their own field, too.
Dislike the Electoral College? Lobby your state to join the National Popular Vote Interstate Compact.
If you're not happy with what's out there, you need to roll your own. If what's out there is open source, you can pick the best of each of them and build the solid system you're looking for. With research projects, once the stated goal has been reached they are done - until a follow-up grant for further work is awarded. That seems to be what research is about - showing that things can be done or done a different way - not producing a useful software product. Once they show what and how, it's up to someone else to take that and make something great from all the pieces. Unfortunately that means sifting through all the duplicate stuff and finding the best approach and possibly reimplementing it to fit in with everything else you're doing.
For example, you may find Kalman filters, genetic algorithms, neural networks, GPU implementations, etc. all able to solve a particular problem. For real-world software you really don't care about all that, you just want the ONE that works best in your application. Of course then there will be papers on "extensible frameworks" with "plugins" that can handle any of those implementations... Again, for real software you pick the one that works "best" for your definition of best and go with that. To make this happen, you need to get an ego-less (read non-PhD) software team to pull it all together.
Are you seriously trying to tell us that these big labs are not using version control while developing their systems?
That's a lot more common than any sane programmer would suspect.
There is a legend that this is what happens at Intel and Microsoft. It used to be said that every odd-numbered Intel chip was not much of an improvement. It has been true since Windows 1.0 that every other release of Windows has sucked. It was perfectly predictable that Vista would tank. (No, I don't hate Microsoft. Even people that love Microsoft can see this has become a "law".)
In both cases the supposed explanation is that there are two different teams working at the same time. The better one gets the first release and the second one patches their changes into it for the sucky intervening release.
No idea if that is true in practice.
Some drink at the fountain of knowledge. Others just gargle.
I track a lot of scientific software on Freshmeat. You'd be amazed at the redundancy. Medical stuff isn't as bad as some areas.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)