Ask Slashdot: How To Encourage Better Research Software?

Pay For It by WrongSizeGlass · 2011-04-29 07:14 · Score: 1

How To Encourage Better Research Software?

Use serious grant money to pay for it - even if it's paying students at a university.

Re:Pay For It by Anonymous Coward · 2011-04-29 09:19 · Score: 0

That won't help. If anything, it will hurt.
If you want good software, pay professional software developers to develop it. There's a reason why almost none of the software we use on a daily basis started off in a research lab. Software produced by researchers is usually made just good enough to write a paper or thesis about, then dropped. Software quality doesn't matter for publications, so it doesn't happen.
(Note: I say this as a grad student, who is currently writing software that will most likely be dropped as soon as I finish my thesis.)
Re:Pay For It by RightwingNutjob · 2011-04-29 14:17 · Score: 1

Amen. And not just in grad school. Even deep down in the bowels of the military industrial complex, fancy software is only ever as fancy as it needs to be, and the bug/feature line is sometimes just as blurry as it is at 11:58pm before the publication submission deadline.
Re:Pay For It by guruevi · 2011-04-29 23:43 · Score: 1

They already do, don't worry. The main problem is partly in the article: no code reviews, and only recently have some of them started using version control systems. The other problem is that you have PhDs in whatever scientific field writing software with minimal knowledge of Computer Science. I work in the field with grants from NIH but the software quality is simply awful because no-one has the slightest clue about good coding practices.

--
Custom electronics and digital signage for your business: www.evcircuits.com

Pragmatism? by gilgongo · 2011-04-29 07:21 · Score: 2

"No one seems to ask: why are we funding X different packages that do 80% of the same things, but none of them well?"

When I think about this, I'd rather have that than one single package, if only for the reason that without competition, I'd not be able to know if it was doing anything well or not.

Pragmatism here says plurality is probably better than some kind of Stalinist central control.

--
"And the meaning of words; when they cease to function; when will it start worrying you?"

Re:Pragmatism? by gilgongo · 2011-04-29 07:23 · Score: 1

(oh dear replying to my own post...) .. of course, if you were talking open source, then that would be another matter :-)

--
"And the meaning of words; when they cease to function; when will it start worrying you?"
Re:Pragmatism? by goombah99 · 2011-04-29 07:37 · Score: 5, Insightful

The original article is clueless about the difference between research products and production software. In research there is no a priori omniscience about what is best. What you see at the end is the few survivors of an evolutionary competition of zillions of efforts. You don't see the three planned outcomes that we had known could have been written from a well thought out requirements document.
There is a decades-old saying that scientists develop the next generation of algorithms using last year's computers. Computer scientists write last year's algorithms on next year's computers. It is still true.

--
Some drink at the fountain of knowledge. Others just gargle.
Re:Pragmatism? by Anonymous Coward · 2011-04-29 08:17 · Score: 1

I work in this field as well. Displaying 2d cross sections and rotatable 3d views with surface meshes has been a solved problem for a long time. There's no original research there. Yet there are in the neighborhood of a dozen major (and many more minor) viewers to do just this and only this, written with lab time and research money.
Re:Pragmatism? by Anonymous Coward · 2011-04-29 08:48 · Score: 0

One problem with viewers is that many of them implement different minor features. One problem is that these things all seem easy until you build up all the little niggling details to get a functional tool. Also, of those dozen major viewers, nearly all of them will have horrendous documentation or will be written in ways that are hard to extend.

Not going to happen by robbyjo · 2011-04-29 07:21 · Score: 5, Insightful

Not only that most researchers are not proficient in programming language, they shape their codes more like prototypes so that they can modify the codes easily as the science progress. Conventional programmers will be frustrated with this approach since they want every single spec set in stone, which will never happen in research setting since research progresses very rapidly and specs can change dramatically in most cases. If you can set the spec in stone, it is usually a sign that the field has matured and is getting transitioned to engineering-type problems. Once the transition happens, it's no longer research, it's engineering. Then you can "make the code better".

--
Error 500: Internal sig error

Re:Not going to happen by Anonymous Coward · 2011-04-29 07:37 · Score: 0

You're describing the perfect use case for git. I've used a lot of SCMs over the years (even writing my own homebrew system on a couple of platforms where nothing else was available) and git (with a GUI of some sort to make it usable) excels at their development process.
Re:Not going to happen by sockman · 2011-04-29 07:41 · Score: 2

Didn't you just describe why agile came about? Because we, as software professionals, realize that specifications are not set in stone and the system should be easy to adapt and modify for future requirements.
Re:Not going to happen by oneiros27 · 2011-04-29 07:49 · Score: 1

I completely agree with the not being proficient in programming languages. And they should be required to take some security classes if they're going to be writing any significant code that runs as a CGI or acts as a service. For some reason they don't like it when I refuse to run their shell script CGIs. ... but I'd argue that they don't 'shape [it] ... so that they can modify the codes easily' ... Unless 'easily' means an attempt at a find & replace in 40+ places when they should've used a function in the first place ... but unfortunately introducing some mistake that results in a team of people spending a week doing nothing, while a couple of people try to figure out what went wrong.
Yes, there are parts that are still 'science' and have floating specifications ... but then there's stuff like storing data in an archive and retrieving it. You might write some special cataloging system for it, but there is *no* reason for scientists who aren't archivists, and only barely programmers to be writing something from the ground up. It's a waste of effort, and it's a waste of tax payer's money that these things keep getting funded.
We've at least had some standardizations on file formats in the various disciplines. I'm hoping we can get some standardization in data systems to support data transport and mirroring as part of the NSF DataNet grants (although, I'm looking towards OODT, myself)
I'm looking forward to data browsing/ visualization standards (how many re-implementations do we really need to plot a few lines, map a grid onto an image, or allow someone to filter a table of data?).
Because really ... if we make these tools universal, the scientists can get back to doing science, rather than wasting time re-implementing yet another tool that doesn't really do anything better than the other stuff out there. (but they just haven't surveyed what the other stuff is out there, as they haven't looked outside their discipline)
disclaimer: I'm a programmer, working at a science data archive, in case it wasn't clear. And I've given a few presentations about how scientists need to stop implementing stuff on their own without getting help from programmers/informaticians/archivists/etc.

--
Build it, and they will come^Hplain.
Re:Not going to happen by Anonymous Coward · 2011-04-29 08:15 · Score: 3, Informative

I do medical imaging as my day job. The parent understates the "spec" problem -- it's just as much a testing problem. The typical spec I work against is "create a tool that distinguishes this disease state from some other disease state and from healthy normals with optimal power". Optimal power is, of course, only defined by the results you get or against other software (probably that measures different facets of disease). Moreover, the spec gets driven by log10 increases in image numbers --- that is 1:10:100:1000:10000 images. So the original spec is generally an idea for a few images -- then as the idea gels the sample battery size is increased. A lot of places don't have 100+ image sets -- particularly for cutting edge imaging methods. There's also a catch-22 -- in general if you know how to detect algorithm failure you'd build that into the code. By the time you get to testing on 1000 subjects there's enough code in place that it's hard to justify a rewrite using "proper SDLC". (Go off and re-read Joel on Software about the value in "rewriting" software!) Besides, do you want the creative people managing software development or do you want them moving on to the next great idea?
As far as the original poster's whine -- I don't buy the "didn't have git and hg". SCCS for example wasn't pretty but certainly worked in the particle physics community which was globally distributed from the 1980's. There was a lack of sharing for two reasons: 1) if you are competing for customers and/or grant money you publish the idea but don't give away the code (it's your competitive edge) 2) if you have a new idea it's often the case that the available code you could find wasn't worth the effort to merge. Now one of the problems is that there is a huge buy-in for most of the toolkits -- it's hard, for example, to simply lift a function out of ITK to use elsewhere. If you want to use ITK you have to buy-in and create ITK apps. It's also non-trivial to drag a function from some other framework into ITK. (This is not to pick on ITK, it's a good toolkit; it applies to most other frameworks too.) Moreover, there are a couple of different classes of image processing users -- those who are worried about whether software works (or seems to work) and those who worry about whether it's right. Ideally you want both, but testing for "works" is different than testing for "right".
Heck even up until 6-7 years ago many labs had their own image format used in processing. DICOM data comes off the imaging device -- but DICOM is a very flexible standard. (Here flexibility means that about half the stuff you need to know to really do large scale processing is stored in well defined locations; the rest is vendor specific and vendor software revision specific.) So most toolkits munge the incoming data into some standard format -- simple formats sound great, but can often lack sufficient detail for a particular analysis. The Mayo Analyze 7.5 format, for example, spent years as a ubiquitous standard, but couldn't sanely store oblique images. It's at least settling down to a handful of decent storage formats which helps with interop.
Medical image research is not software engineering.
Re:Not going to happen by squidfood · 2011-04-29 08:50 · Score: 1

specs can change dramatically in most cases
Moreover, the very act of scientific progress is questioning and experimenting with the assumptions in the spec.
Re:Not going to happen by robbyjo · 2011-04-29 09:16 · Score: 1

Does "agile" software development allow scrapping 100% of the code and radically change the spec (and thereby everything else) every about 6 months just because of new scientific publication? It may sound extreme, but this often happens in research. If we take time to "structure" our code, before we know it, we have to redo it all over again. We do use libraries like GSL, BLAS, ATLAS, etc. to make our lives easier. These won't change, but whatever we build on top of these often gets scrapped on a regular basis. So, we really don't have incentives to "beautify" the code.

--
Error 500: Internal sig error
Re:Not going to happen by Anonymous Coward · 2011-04-29 09:19 · Score: 0

The problem is it's Agile on speed. The speed is such that you can't afford the time to continually try and download your ideas into a conventional programmer's head. During active development, by the time I explain a cutting edge algorithm idea to a (BS level) programmer and teach them enough math to implement it correctly, there's a net time loss. It still takes as long for me to figure out what happened when things go off the rails.
Re:Not going to happen by joe_frisch · 2011-04-29 09:20 · Score: 1

We've developed a lot of in-house code here at SLAC. Often we have had better success with the prototype code developed by scientists than with the code rigorously written by software engineers. This is because at a research lab the requirements are constantly changing (if we knew what we were doing it wouldn't be research), and the design cycle for specify / write / test / debug / deploy is too slow. Having the people directly involved with the experiment writing the code in real time gets better results faster for some types of problems. (For some systems, the standard engineering approach works better and we use it.)
We also have the issue of multiple codes that do basically the same thing. My group recently looked into trying to merge the various codes for simulating and tuning accelerators, and we concluded that it would be more effort to merge them than to continue to develop separately. This was for a group of people who were now all in the same group and collaborating - there was no political or career pressure on this.
I think the research environment is fundamentally different from a commercial environment. In many software projects the requirements are continually changing. This is not a result of poor planning by the people requesting the software, but rather the desire to take best advantage of new scientific information as it becomes available. The resulting informal code development is very efficient for the project, but produces code that is difficult to transport to other projects.
The code we write is available to the public, but is un-documented and unsupported, and so mostly useless to others.
Re:Not going to happen by Anonymous Coward · 2011-04-29 09:48 · Score: 0

Mod parent up.

Yes, there are parts that are still 'science' and have floating specifications ... but then there's stuff like storing data in an archive and retrieving it. You might write some special cataloging system for it, but there is *no* reason for scientists who aren't archivists, and only barely programmers to be writing something from the ground up. It's a waste of effort, and it's a waste of tax payer's money that these things keep getting funded.
This is the problem exactly. Most of this software poorly reinvents the same wheel.
Re:Not going to happen by Puff_Of_Hot_Air · 2011-04-29 10:41 · Score: 0

Does "agile" software development allow scrapping 100% of the code and radically change the spec (and thereby everything else) every about 6 months just because of new scientific publication?
Yes! Every iteration (month) you can throw the whole lot away and start again. You won't though, because there are always certain building blocks you can re-use.

If we take time to "structure" our code, before we know it, we have to redo it all over again. ... So, we really don't have incentives to "beautify" the code.
I see these arguments in all kinds of contexts. They are all just excuses for poor work. The funny thing is that doing things 'right' the first time generally ends up taking less time overall. I wonder how much time you waste fixing broken code, or mistaken logic? Getting two hacks to work together? Perhaps if you stopped cutting corners constantly, you would learn some of the techniques that make rapidly changing prototyping quick and painless. Nobody has time to "beautify" their code, but you are making the age-old mistake of assuming your circumstance is "unique", and forgetting to learn from others. Your circumstance is not unique; you are essentially doing what any start-up/web based company would do code wise. You need techniques like TDD, patterns such as 'inversion of control', and agile methodologies. That's if you wanted to go faster and have reliable code. It would require you to consider that you could do things better, and others may have something to teach. It might require you to admit that you could be wrong...
Re:Not going to happen by Puff_Of_Hot_Air · 2011-04-29 11:06 · Score: 1

How this was modded up in a place like /. astounds me. Let me address your points one by one.

Not only that most researchers are not proficient in programming language, they shape their codes more like prototypes so that they can modify the codes easily as the science progress.
News flash! Everyone who is writing software that is "new" is "prototyping". This is not a new problem, this is why we have design patterns and TDD.

Conventional programmers will be frustrated with this approach since they want every single spec set in stone
I take it you know very few programmers, maybe at IBM? Managers want "specs", many of us developers want to work on something interesting. If there were all these lovely specs set out in stone, it wouldn't be much of a challenge now would it?

...will never happen in research setting since research progresses very rapidly and specs can change dramatically in most cases
Hey! That sounds just like the software world! Look up "agile software development".

If you can set the spec in stone, it is usually a sign that the field has matured and is getting transitioned to engineering-type problems. Once the transition happens, it's no longer research, it's engineering. Then you can "make the code better"
Mistakenly thinking that software development was engineering is what has caused more than one company to fail. Software development is team based creative problem solving. Waterfall is dead for a reason. Still, there are ways to rapidly prototype while creating high quality, easily adapted code. This is what all modern software development techniques are about. Your problems are nothing special, you're just ignorant of the solutions.
Re:Not going to happen by Anonymous Coward · 2011-04-29 11:45 · Score: 0

Now I am sure... I wonder how I could have survived outside the true way. Thanks for clarifying the gospel for me, oh divine perfection.
Re:Not going to happen by Fulcrum+of+Evil · 2011-04-29 11:49 · Score: 1

Yes! Every iteration (month) you can throw the whole lot away and start again. You won't though, because there are always certain building blocks you can re-use.
No! Agile is about iteration and isn't really suited to stringing a bunch of prototypes together. I can drive a nail with my fist, but that isn't a good idea.

--
"We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
Re:Not going to happen by Anonymous Coward · 2011-04-29 11:53 · Score: 1

I call BS --- I work in a medical imaging lab. In the past two years we've had three CS "professional programmers" added to the staff. They all came in with "methodologies that will save the day" spouting the crap the parent poster is spouting .... they are still around but let's say they are much more pragmatic. And hmmm after about 6 months they all agreed that it is indeed a unique situation.
Re:Not going to happen by Anonymous Coward · 2011-04-29 12:00 · Score: 0

Enough of your xtreme/agile fanatic religious nonsense... stop calling names on others who do not share your faith. Get a clue.
Re:Not going to happen by Anonymous Coward · 2011-04-29 12:23 · Score: 0

I call BS on you. I bet those "professional programmers" gave up on trying to help your sorry "unique situation" and settled for being employed in a down economy. Good code is always a good idea. Good programmers tend to have good methods. Bad programmers tend to call good advice "crap".
Re:Not going to happen by Puff_Of_Hot_Air · 2011-04-29 12:34 · Score: 1

Agile is just a big catch all word that has lost all meaning. Let's say "not waterfall". Whatever you thought I meant is irrelevant, but the fact remains that "rapid prototyping" is all these guys are doing. This is not a unique situation. I used the term 'agile' to indicate that there are a range of processes and development techniques, that have come about to help improve speed and quality in these situations (and they happen to fall under the umbrella of 'agile'). If your work involves regular creation of software, it would be smart to get familiar with these techniques.
Re:Not going to happen by Puff_Of_Hot_Air · 2011-04-29 12:48 · Score: 1

Every work-based situation is unique. I obviously shouldn't have used the term 'agile' as it has become polluted by XP evangelists and so on in the past. Pragmatism would suggest that you look at what you are doing and work on improving the processes to allow you more structured control of your 'agility'. Normally in the software world we are trying to do the reverse, go from a rigid inflexible system to one that is more 'flexible' (and again, this is achieved only by looking at all processes and improving them one by one, with greater agility as the end goal. Forget the hype). In this case the situation is reversed; there is plenty of agility, but low quality. Fortunately, many of the techniques that have been developed to enhance 'agility' in software development are applicable in these situations as well. I refer to Test Driven Development (which is a simple technique that many of us employed long before it was formalized, it's just little test programs basically), different design patterns that allow easily changing things without having to re-write everything (inversion of control is great here, because it solves the spaghetti wiring problem), and dividing work into chunks (iterations just help keep you focused on the goals, and ask you to review your processes regularly. It's a feedback loop looking at how you're working, that's all). This stuff is all light weight, and will not slow you down, just help with some of the problems. You should always be looking at how you work with an eye to improving.
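For anyone who hasn't seen the two techniques named above, here is a minimal sketch in Python. It is only an illustration under assumptions: the names (mean_filter, identity, analyze) and the toy "analysis" are hypothetical and don't come from anyone's real lab code. The smoothing step is passed in by the caller (inversion of control), so it can be swapped without touching the pipeline, and the "little test programs" are just functions full of asserts.

```python
from typing import Callable, List, Sequence


def mean_filter(signal: Sequence[float]) -> List[float]:
    """Hypothetical smoothing step: 3-point moving average, shorter window at the edges."""
    n = len(signal)
    out = []
    for i in range(n):
        window = signal[max(0, i - 1):min(n, i + 2)]
        out.append(sum(window) / len(window))
    return out


def identity(signal: Sequence[float]) -> List[float]:
    """Stand-in step that changes nothing; handy for testing the pipeline alone."""
    return list(signal)


def analyze(signal: Sequence[float],
            smooth: Callable[[Sequence[float]], List[float]] = mean_filter) -> float:
    """Toy pipeline: the smoothing step is injected by the caller, not hard-wired."""
    return max(smooth(signal))


def test_pipeline_without_smoothing():
    # With the identity step, the peak is simply the largest sample.
    assert analyze([1.0, 5.0, 2.0], smooth=identity) == 5.0


def test_mean_filter_leaves_constant_signal_alone():
    assert mean_filter([2.0, 2.0, 2.0]) == [2.0, 2.0, 2.0]


if __name__ == "__main__":
    test_pipeline_without_smoothing()
    test_mean_filter_leaves_constant_signal_alone()
    print("all tests passed")
```

When the next paper changes the smoothing method, only a new function needs to be written and passed to analyze(); the tests for the rest keep working.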
Re:Not going to happen by Puff_Of_Hot_Air · 2011-04-29 13:15 · Score: 2

I think the research environment is fundamentally different from a commercial environment. In many software projects the requirements are continually changing. This is not a result of poor planning by the people requesting the software, but rather the desire to take best advantage of new scientific information as it becomes available. The resulting informal code development is very efficient for the project, but produces code that is difficult to transport to other projects.
Your situation is not different to many commercial environments. In fact, this is one of the largest problems in software development (notice I use the word development, not engineering). There are ways to write quality, flexible, extendible, maintainable programs in these environments, but it is much harder. I'm not talking out of my arse here, I've been in this game for many years now, and have seen approaches that work, and ones that fail. If the resultant program is truly "use once, then throw away", then continue as you are. If you find that you want to build on it later, or give it to others, then there are existing techniques that can assist. The smart approach is to add some of these ideas that look as though they would help, one at a time, and only keeping them if they help. You're right when you say that your environment is "fundamentally different", my experience has been that everyone's situation is unique, but there are certain tools, techniques, and strategies that pre-exist, and may save you time if you spend a little time investigating whether they're right for your environment.
Re:Not going to happen by Anonymous Coward · 2011-04-29 14:16 · Score: 0

Enough of your xtreme/agile fanatic religious nonsense... stop calling names on others who do not share your faith. Get a clue.
Patience, my friend. We let him go on about his objects, design patterns, modeling languages, model integration, and unified processes. Now he's become enlightened and become all eXtreme with his pant line down at crotch level shooting us a shaka while a guitar solo plays in the background. We'll wait for him to break his scrum to let us know how THIS TIME (the other times didn't count) this one true way of coding will solve all of our problems. The problem with you, my friend, is that the code that you wrote in C or FORTRAN, the one that works extremely well for what you needed it to do, you should have spent time rewriting it in C++, then Python (yes, it will run slower, but don't worry, processor speed is increasing all the time), then added databases to it, then rewriting it in PHP so you could add that GUI you never really used but that one guy wanted, then web enabled it so that that other guy could use it, but he only tried it once, then you should have pulled in a few more people so you could refactor it, but don't forget you need to meet every other day to decide what you'll do that next day.
Re:Not going to happen by mjwalshe · 2011-04-29 22:55 · Score: 1

Well, you obviously haven't worked in technical programming or at a cutting-edge R&D place where quite often you are dealing with the world no. 1 expert in a field.

Quite often you are writing single-use specialized programs and you accept some limitations - I recall one wave tank experiment where if you entered the wrong parameters you could cause the computer controlling the experiment to create such a powerful wave it would have broken the tank and flooded the lab.

I remember one guy whose program to control a mixing experiment required typing in integers to a command line where the only prompt was "?" - his comment: "well, I remember what I need to type" - I actually went in and added code that told you what the options were - mainly because they had scaled up to full size experiments and the cost of materials for one experimental run approached that of a small house.
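A minimal sketch of the kind of fix described above, in Python. Everything here is assumed for illustration: the parameter (a wave height), the limit MAX_WAVE_HEIGHT_M and the function names are hypothetical and are not taken from the actual experiment software. The point is only that the prompt says what it expects and that out-of-range values are rejected before they reach the hardware.

```python
MAX_WAVE_HEIGHT_M = 0.5  # hypothetical safe limit for the tank, in metres


def parse_wave_height(text: str, limit: float = MAX_WAVE_HEIGHT_M) -> float:
    """Convert user input to a wave height, rejecting anything outside (0, limit]."""
    try:
        value = float(text)
    except ValueError:
        raise ValueError(f"expected a number, got {text!r}") from None
    # This comparison also rejects NaN and infinity.
    if not 0.0 < value <= limit:
        raise ValueError(f"wave height must be in (0, {limit}] m, got {value}")
    return value


def prompt_wave_height(read=input) -> float:
    """Keep asking until a safe value is entered; the prompt states the valid range."""
    while True:
        answer = read(f"Wave height in metres (0 < h <= {MAX_WAVE_HEIGHT_M}): ")
        try:
            return parse_wave_height(answer)
        except ValueError as err:
            print(f"Rejected: {err}")
```

Passing the reader in as a parameter (read=input by default) keeps the check testable without anyone sitting at the keyboard.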
Re:Not going to happen by Anonymous Coward · 2011-04-30 03:15 · Score: 0

This is just to show that you, sir, don't know what scientific programming really is. No matter how "agile" your "methodology" are, you will resort to scrapping everything (sans the libraries) and start over again and again and again.
Re:Not going to happen by dkf · 2011-04-30 08:14 · Score: 1

by the time I explain a cutting edge algorithm idea to a (BS level) programmer and teach them enough math to implement it correctly
There are two things wrong with that statement. Firstly, you're trying to get rapid results out of a BS level programmer (i.e., a total greenhorn with no real experience) and secondly, they probably don't understand that much math to begin with either (i.e., did you specify that when hiring them?) If you'd done your hiring more sanely, you'd have someone who could actually support you properly. Yes, they'd cost more than just a tyro but that'd be money well spent. Remember, you're not getting someone who's taking part of their pay in training to be a researcher; you're getting someone who's doing a specialized job that they can do in many other places too.

--
"Little does he know, but there is no 'I' in 'Idiot'!"
Re:Not going to happen by angel'o'sphere · 2011-04-30 23:50 · Score: 1

Does "agile" software development allow scrapping 100% of the code and radically change the spec (and thereby everything else) every about 6 months just because of new scientific publication?
Yes it does. That is exactly the point about it. A good team will always be able to craft its libraries/frameworks for this project in a way that even for a full rewrite they still have useful code left to be reused.
angel'o'sphere

--
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
Re:Not going to happen by angel'o'sphere · 2011-04-30 23:59 · Score: 1

News flash! Everyone who is writing software that is "new" is "prototyping". This is not a new problem, this is why we have design patterns and TDD.
Well, I only prototype if I need/want to. Even in an "agile" environment in every iteration we deliver a "customer ready" end product. Not a prototype.

Mistakenly thinking that software development was engineering is what has caused more than one company to fail.
Software Development is Engineering, hence the word Development. If you can not apply software engineering principles to your development, then you are likely in unknown waters and are researching.
angel'o'sphere

--
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
Re:Not going to happen by hawkfish · 2011-05-02 05:14 · Score: 1

Didn't you just describe why agile came about? Because we, as software professionals, realize that specifications are not set in stone and the system should be easy to adapt and modify for future requirements.
There is a big difference between "set in stone" and "unconstrained". Put another way, "XP is aimed at customers who don't know what they want."

--
You will not drink with us, but you would taste our steel? - Walter Matthau, The Pirates
Re:Not going to happen by Anonymous Coward · 2011-05-02 10:00 · Score: 0

The real skill in programming is the ability to write code so that it can be modified.
I don't know if you've heard of the Model View Controller pattern? It's a conventional way of structuring a program so that you can change the appearance, functionality, or storage separately without having to mess with the other bits. Programs change all the time, especially web applications, and if we had to start again every time we couldn't afford to make so much stuff. That's an example, and applying best practice isn't a skill, but writing a program that can be made radically different by just replacing a component... is. It takes experience I guess.
I think where software is scrapped and rewritten in research it's often because a new student or postdoc takes it over, has trouble understanding their predecessor's undocumented code, and wants to start again because it gives a feeling of control.
Anyway, do think about structuring your code, even if it's just breaking it up into smaller units. You'll save yourself a lot of typing.
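To make the separation described above concrete, here is a rough sketch in Python. It is a loose, MVC-flavoured illustration rather than a textbook implementation, and every name in it (InMemoryStore, summarize, render_text, run) is made up for the example. Storage, computation and presentation live in separate pieces, so any one of them can be replaced without touching the others.

```python
from statistics import mean


class InMemoryStore:
    """Storage: could be swapped for a file or database backend with the same load()."""

    def __init__(self, samples):
        self._samples = list(samples)

    def load(self):
        return list(self._samples)


def summarize(samples):
    """Model: pure computation, no I/O and no formatting."""
    return {"n": len(samples), "mean": mean(samples)}


def render_text(summary):
    """View: presentation only."""
    return f"n={summary['n']} mean={summary['mean']:.2f}"


def run(store, view=render_text):
    """Controller: wires storage, model and view together."""
    return view(summarize(store.load()))


if __name__ == "__main__":
    print(run(InMemoryStore([1.0, 2.0, 3.0])))
```

Swapping the output format means writing another view function and passing it to run(); the analysis in summarize() stays as it is.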

What the hell are you talking about? by M.+Baranczak · 2011-04-29 07:27 · Score: 1

Git and Mercurial (hg) are not sites, they're programs. They have nothing to do with code review. You could say that they do promote "efficient code exchange", but so does any other VCS. Are you seriously trying to tell us that these big labs are not using version control while developing their systems?

Re:What the hell are you talking about? by blueg3 · 2011-04-29 07:55 · Score: 2

Are you seriously trying to tell us that these big labs are not using version control while developing their systems?
That's a lot more common than any sane programmer would suspect.
Re:What the hell are you talking about? by Anonymous Coward · 2011-04-29 08:02 · Score: 1

advent of git, hg, and related sites
ie github, bitbucket, gitorious, etc. IMHO there is something fundamentally different about kind of sharing on newer dVCS as compared to traditional (cvs, svn) VCS. Not to mention the benefits of in-line code-review (on github and others).
Re:What the hell are you talking about? by Anonymous Coward · 2011-04-29 11:03 · Score: 0

This is because scientists learn their field (physics, medicine, etc) and not programming. Best practices like version control and code review are not taught. In my department at school (physics), not one of the computational research groups uses version control. And they have used Fortran since punch cards (which they still have).
Re:What the hell are you talking about? by Anonymous Coward · 2011-04-29 14:13 · Score: 0

Yes. I work there. For internal (developer-only) source code control, CVS (which doesn't really support branching) is common. SVN (kinda old, non-distributed) is common, as is Perforce. Then you have tarballs being SCP'd around for distribution outside the development group. I do know some people who don't use any revision control at all. I don't know anyone at my site (it's in the Ks of persons, so I don't have complete knowledge) using a truly modern VCS.

Same happens in the private world by Prien715 · 2011-04-29 07:28 · Score: 1

It's probably a good thing that there's at least two different groups working on the same thing. Competition creates incentives for those within it to write better code so that it's more widely adopted and they get more funding. Why do we have Chrome and Firefox?

This happens in private companies too. I heard a story about a private company that hired two different offshore contractors to write the same software independently of one another -- they were on a tight deadline and had actually read the Mythical Man Month. So when one of them was terribly buggy by deadline, they had something that worked.

I've written software for too long to believe that any one approach works for everyone, and that by not putting your eggs in one basket, the investor (in this case, the American people) comes out ahead.

--
-- Political fascism requires a Fuhrer.

Re:Same happens in the private world by Anonymous Coward · 2011-04-29 07:38 · Score: 0

This happens in private companies too. I heard a story about a private company that hired two different offshore contractors to write the same software independently of one another -- they were on a tight deadline and had actually read the Mythical Man Month. So when one of them was terribly buggy by deadline, they had something that worked.
How does this "strategy" prevent two broken products from being developed?

I've written software for too long to believe that any one approach works for everyone, and that by not putting your eggs in one basket, the investor (in this case, the American people) comes out ahead.
If TFS is correct, the investor is not coming out ahead. The whole point is that resources are wasted on multiple sub-optimal solutions. Your attitude requires not only that not any one approach works for everyone, but that no approach works for anyone. Otherwise, it makes more sense to figure out a solution that will work in a given domain and then concentrate all available resources on that approach.

Terms of grant must specify coding standards by diabolicalrobot · 2011-04-29 07:29 · Score: 2

This problem is widespread in almost every discipline which uses any form of computation. I think the best way is for major funding sources like the NIH, NSF etc to build into the grant terms which coding language and existing libraries are to be used. Or how the software will be developed, and what software, should be used as an additional metric for deciding which proposals to accept. Proposals which are strong otherwise but do not state in clear terms how software will be built should be asked to modify their proposals to include such information. Pre-existing, well-designed, modular software architectures should be extended rather than building architectures from scratch, which is a waste of funds and time. Funding organizations must also recognize that developing good software takes time and money and set aside budgets in the grant for hiring dedicated programmers. (Scientists are very often not good software engineers and they are interested rather in trying things out quickly to see if it works at all.) Such programmers can then take hacky research code from the scientists and turn it around into great reusable code.

Re:Terms of grant must specify coding standards by Anonymous Coward · 2011-04-29 07:34 · Score: 1

I wouldn't say that scientists are not good software engineers. The main problem is that they aren't paid to be software engineers. To get funding, scientists have to publish in peer reviewed conferences/journals. In order to do that it is enough to get the programs into a rough shape. Spending an extra year on polishing the software is just not going to happen as this is not easily publishable in a journal.
This sucks, but until this is changed, don't expect great software engineering out of software released by scientists. (Well, it happens, but it is rare.)
Re:Terms of grant must specify coding standards by Ezubaric · 2011-04-29 08:20 · Score: 1

This is already part of NSF:
http://www.nsf.gov/news/news_summ.jsp?cntn_id=116928
(Although it's called "Data Management," it also applies to software generated in the course of research.)

--

----------
I am an expert in electricity. My father held the chair of applied electricity at the state prison.
Re:Terms of grant must specify coding standards by Anonymous Coward · 2011-04-29 09:22 · Score: 0

I agree. I didn't mean to say that scientists are not good at software engineering. I meant to say that it is not worth their time and effort. No one is giving me a best paper award for the beauty of my code!

You're trying to reinvent the wheel by Anonymous Coward · 2011-04-29 07:33 · Score: 0

If you really were in a position to make a positive impact in this area you would surely also be in a position to direct one of the existing projects so that they would do better and force the other guy(s) to step up their game.

Not so fast. by Anonymous Coward · 2011-04-29 07:33 · Score: 1

As a Ph.D. candidate who writes scientific software at a large research university in the US under NIH grant funds, I can say that simply adding more developers to a scientific software project is an unrealistic solution to the problem. Having been to a conference of developers for a well-known chemistry software package (which will remain nameless) I have seen firsthand how seemingly good intentions can quickly turn into an epic battle of conquest and control of the software. Add in the huge egos and arrogance of these scientists (some very famous in their field), and you wind up with software that no one wants to develop due to problems that have nothing to do with funding or lack of qualified developers. This is probably one of the main reasons new scientific software is created in the first place.

Re:Not so fast. by Khopesh · 2011-04-29 08:33 · Score: 1

As a Ph.D. candidate who writes scientific software at a large research university in the US under NIH grant funds, I can say that simply adding more developers to a scientific software project is an unrealistic solution to the problem. Having been to a conference of developers for a well-known chemistry software package (which will remain nameless) I have seen firsthand how seemingly good intentions can quickly turn into an epic battle of conquest and control of the software. Add in the huge egos and arrogance of these scientists (some very famous in their field), and you wind up with software that no one wants to develop due to problems that have nothing to do with funding or lack of qualified developers. This is probably one of the main reasons new scientific software is created in the first place.
There are examples of problems in every solution. Zotero is a great example of F/OSS working beautifully, exemplifying researcher collaboration to develop a research collaboration tool. With some standardization and better communication between the F/OSS community and the research industry, I think we can open the door to more of Zotero and less of your chemistry software example.

--
Use my userscript to add story images to Slashdot. There's no going back.

Govt. doesn't "get" open source by mspohr · 2011-04-29 07:38 · Score: 1

Software paid for by the government is supposed to be free in the public domain. However, there are two problems with the way this rule is implemented.

A surprising number of researchers work around this restriction and keep the software proprietary (or at least secret) by contracting the software out and purchasing outside services.

Even when the software is public domain, there is no uniform requirement to make it openly available. Often you have to write to the principal investigator and after delay and obfuscation you may get some undocumented compiled code but no source or incomplete source.

The government should adopt an open source software policy which stated that software created by the government must be put in a public repository (along with documentation) and this should be verified and enforced.

This would help build a government open source ecosystem where researchers can build on the ideas of other researchers.

--
I don't read your sig. Why are you reading mine?

Re:Govt. doesn't "get" open source by oneiros27 · 2011-04-29 07:56 · Score: 1

Actually, no it's not.
There's a few issues ... the first of which is called 'Dual Use' ... basically, there's software that can be used for military purposes, which will *never* be put into the public domain.
Then there's stuff that might be able to be licensed, and there's a whole gaggle of lawyers who we have to get our stuff cleared through, so we can get it certified that our work has no value, and the government isn't going to make any money off of it.
And then there's the security concerns ... your stuff runs on a server? If you release it, and something's wrong, could it make it easier to hack your system? Well, then it's safer not to release it.
Right now, it's a challenge to get anything released. I've heard of people having their code reviewed for *years* before they could release it under GPL or BSD.
And as for the issues in the quality of the software being released for re-use, there's been an effort in the earth sciences to rank software for re-use, in the Reuse Readiness Levels.

--
Build it, and they will come^Hplain.
Re:Govt. doesn't "get" open source by blueg3 · 2011-04-29 08:00 · Score: 1

Software paid for by the government is supposed to be free in the public domain.
Not really. "Paid for" is a very unspecific term. Software that is developed as the result of a government-funded grant at a non-government institution is not "supposed" to be public domain. (Individual grant agencies may make such stipulations, though.)
On the other hand, software that is the work product of government employees is supposed to be public domain. Some agencies aren't particularly helpful about distributing the source (although if only one person who cares is able to obtain it, since it's public domain, they can simply distribute it themselves to interested parties). Some agencies, though, like NIST, are very good about properly distributing code that is the result of their work.
One of the reasons research code often isn't distributed well is that it's not production-ready software and is largely useless to people who aren't specialists in the field.
Re:Govt. doesn't "get" open source by Desler · 2011-04-29 08:05 · Score: 1

Software paid for by the government is supposed to be free in the public domain.
And this has been codified in which act of Congress?
Re:Govt. doesn't "get" open source by mspohr · 2011-04-29 08:20 · Score: 1

Since you're too lazy to Google it, here is a good summary of the "rules".
https://journal.thedacs.com/stn_view.php?stn_id=56&article_id=180

--
I don't read your sig. Why are you reading mine?
Re:Govt. doesn't "get" open source by Desler · 2011-04-29 08:35 · Score: 1

Nowhere in there does it say that any software paid for by the government is in the public domain. In fact, the phrase "public domain" doesn't even exist in that article. Your link only specifies the circumstances when something can be released as OSS, which is not the same as what you claimed.

This article summarizes when the U.S. federal government or its contractors may publicly release, as open source software (OSS), software developed with government funds.

Clearly you didn't even read the link you posted. Now, can you provide the actual law that says that software paid for by the government has to be in the public domain?
Re:Govt. doesn't "get" open source by Desler · 2011-04-29 08:39 · Score: 1

Also, since I actually do work at a company that contracts for the government, almost none of the software we create can be released by the government as open source or public domain, as the company reserves the copyright. This is also very typical for a lot of government contracting. So your entire premise is absolutely false. Not to mention your false conflation of "public domain" with "open source".
Re:Govt. doesn't "get" open source by Desler · 2011-04-29 08:40 · Score: 1

What's also funny is that your link even cites statutory law that completely contradicts you:

It is true that 10 U.S.C. 2320(a)(2)(F) states that “a contractor or subcontractor (or a prospective contractor or subcontractor) may not be required, as a condition of being responsive to a solicitation or as a condition for the award of a contract, to sell or otherwise relinquish to the United States any rights in technical data [except in certain cases, and may not be required to ] refrain from offering to use, or from using, an item or process to which the contractor is entitled to restrict rights in data”
Re:Govt. doesn't "get" open source by mspohr · 2011-04-29 09:12 · Score: 1

A lot of nit-picking here today so I'll have to take these objections one at a time.
The handy reference chart that I linked to states that the default and usual contract clause (FAR 52.227-14 or DFARS 252.227-7014) is for the government to reserve copyright for works created at public expense. This has been my experience. In my original post I also stated that many government contractors get around this by "contributing" some of their own funding or IP to the contract and thus establish an exception to the rules. Your employer is taking advantage of this exception and this is common.
It is not possible to copyright a work of the US Government (employees), hence this is in the "public domain" (or more technically "noncopyright").
Open source has no fixed precise definition but "public domain" software could certainly be considered "open source" (but not vice versa).

--
I don't read your sig. Why are you reading mine?
Re:Govt. doesn't "get" open source by mspohr · 2011-04-29 09:15 · Score: 1

The actual laws are referred to in the article. They are the government contracting regulations (FAR and DFARS) which are referenced in the article.

--
I don't read your sig. Why are you reading mine?
Re:Govt. doesn't "get" open source by mspohr · 2011-04-29 09:19 · Score: 1

I know this is a bit technical lawyerese talk so you may not understand it but the bit you quoted states that the government cannot compel a private party which holds existing IP rights to relinquish these rights to the government. This is a different situation from that where the government is contracting for the creation of new IP.

--
I don't read your sig. Why are you reading mine?

Science wants novelty, not quality by naroom · 2011-04-29 07:39 · Score: 1

The trouble is that you don't get grants for software development. You get them for original research, i.e. novelty. All you need to publish a paper is a hackish implementation that works once. After that's done, there's no reward for improving your code and iterating further. If you're trying to stay competitive, you move on to the next thing.

Developing good software is for industry to do. Unlike academia, industry can get massive rewards for making a well-implemented toolkit. No academically-developed software will be able to compete with that in the long run.

Re:Science wants novelty, not quality by Anonymous Coward · 2011-04-29 10:43 · Score: 0

There is a lot of software in this field which is written for the purpose of enabling end-users to apply already-developed algorithms to clinical problems. These are funded via "infrastructure" and "enabling" grants for software with tremendous overlap in functionality. In the broad strokes, the problem settings are exactly the same.
The problem is, maybe 20% of a package will be novel, and the other 80% will be overlapping crap. The bigger problem is, because of incompatibility it is a dreary slog to bring the interesting bits needed from each package into an applied setting.
Re:Science wants novelty, not quality by backwardMechanic · 2011-04-29 23:22 · Score: 1

I think you're missing two points:

First, as an academic group, it is important to make your software usable for other groups. It brings collaborators to you - researchers who want to do something new, that your software almost supports. It's faster for them to work with you than start from scratch - more for your expertise in the field than for your coding abilities.

Second, industrial software isn't open source, and for niche markets is often of terrible quality at a very high price. Open-source is slowly pushing industrial software out of science - biomedical imaging being a very good example. With OS, if the package doesn't do what you need, you can at least extend it yourself. Often more importantly, you can also check it for bugs. Less out-and-out coding bugs, more subtle bias in the way the data is processed. Super-secret-magic-algorithms do not make for reliable science.

that's why they're called NIH grants by Anonymous Coward · 2011-04-29 07:42 · Score: 0

The other guy's stuff simply won't do for serious, cutting edge science.

Because researchers aren't programmers by AdmiralXyz · 2011-04-29 07:42 · Score: 3, Insightful

I'm a computer scientist in the middle of getting my BA, but for research experience or in the process of taking an elective, I've spent time with grad students in other departments- mostly biology and linguistics- and the software they write. Smart people? Absolutely- they're experts in their field. But they can't write code to save their lives. I've seen things that make me want to run screaming to TheDailyWTF and the quality software engineering on display there ;)

I don't think this is a bad thing, myself. Most of this code is single-use only, being written for a specific purpose (or a specific thesis paper), and will never be used again. Not to mention they're taking enough time to get their degrees as it is- I don't think it's reasonable to ask them to become expert software engineers as well. OP claims that taxpayer dollars are being wasted, but think how much waste there'd be if every researcher had to get a CS degree before they started in their own field, too.

--
Dislike the Electoral College? Lobby your state to join the National Popular Vote Interstate Compact.

Re:Because researchers aren't programmers by blueg3 · 2011-04-29 08:06 · Score: 1

You think that's bad, you should see the software written by engineers that's used to perform many important engineering tasks!
Re:Because researchers aren't programmers by Anonymous Coward · 2011-04-29 09:42 · Score: 0

I heard something great about the general approaches to computers in various fields, from a buddy of mine:
"I went to a conference last weekend, and the operating systems were pretty much broken up by specialty. All the physicists used Macs, all the computer scientists used Linux, all the network admins used BSD, and all the electrical engineers used Windows 95."
Re:Because researchers aren't programmers by Anonymous Coward · 2011-05-01 07:20 · Score: 0

think how much waste there'd be if every researcher had to get a CS degree before they started in their own field, too.
That's why we should teach computer science in high school. Every field benefits from it, and it's a cheap and great exercise for young minds.

Requirements by Anonymous Coward · 2011-04-29 07:43 · Score: 0

Requirements are the hardest part of software.

Imagine two rooms, right next door to one another. Both have quite large whiteboards in them. In the left room, a group has drawn an urban scene with buildings, cars, people walking on the street, etc. Now, imagine that a group in the right room is attempting to duplicate the design on their whiteboard, but the only method of communication between the two is plain old text messages that don't contain any graphics or URLs.

The task we've described above is easier than getting the concept of software from one brain to many others, even with all the diagrams, charts, code, and tools we can bring to bear.

Now you should understand the answer to your question.

Convert research into useful by gr8_phk · 2011-04-29 07:47 · Score: 3, Insightful

If you're not happy with what's out there, you need to roll your own. If what's out there is open source, you can pick the best of each of them and build the solid system you're looking for. With research projects, once the stated goal has been reached they are done - until a follow-up grant for further work is awarded. That seems to be what research is about - showing that things can be done or done a different way - not producing a useful software product. Once they show what and how, it's up to someone else to take that and make something great from all the pieces. Unfortunately that means sifting through all the duplicate stuff and finding the best approach and possibly reimplementing it to fit in with everything else you're doing.

For example, you may find Kalman filters, genetic algorithms, neural networks, GPU implementations, etc. all able to solve a particular problem. For real-world software you really don't care about all that, you just want the ONE that works best in your application. Of course then there will be papers on "extensible frameworks" with "plugins" that can handle any of those implementations... Again, for real software you pick the one that works "best" for your definition of best and go with that. To make this happen, you need to get an ego-less (read non-PhD) software team to pull it all together.

Linux by goombah99 · 2011-04-29 07:47 · Score: 1

The article description sounds like a perfect description of the state of all the Linux distros, all the Linux desktop managers, and all the Linux word processors. That is, there is a proliferation of not quite compatible products that do 80% of the job well.

So I guess the article is saying we should take this shining example from computer engineering and use it to reform how scientific packages are developed.

Wow. Glass houses much?

--
Some drink at the fountain of knowledge. Others just gargle.

Re:Linux by Anonymous Coward · 2011-04-29 10:37 · Score: 0

Erm, except it's not the same. There are a lot of incompatible implementations of things (there are jokes about how many different package managers there are), but there are also compatibilities in various places. You can convert between rpm and deb. The free desktop project defines standards to let programs code to one API to support KDE and Gnome (and XFCE, etc.). Diversity can be good, but needlessly replicating code and making things incompatible is bad.

The problem is not ... by wisnoskij · 2011-04-29 07:48 · Score: 1

... wasted effort, but the allocation of the money and the people involved.
It is not all that hard to create very good, powerful and even big applications, but it becomes a hell of a lot harder if you throw tons of money and people at it.
And yes I have worked in university physiology and they have horrendous software as well, but there are simply no other alternatives and very few real programmers working in the field, so no one who could fill the hole knows about it.

--
Troll is not a replacement for I disagree.

You are wrong by Anonymous Coward · 2011-04-29 07:53 · Score: 0

Competing implementations and competition lead to better code, and a status quo that constantly moves forward.

You are arguing FOR government-gifted monopolies in health care?

You astroturfing shill. I hope your little "medical records" startup fails, you jackass.

Intel and Microsoft by goombah99 · 2011-04-29 07:55 · Score: 2

There is a legend that this is what happens at Intel and Microsoft. It used to be said that every odd numbered Intel was not much of an improvement. It's still true since Windows 1.0 that every other release of Windows has sucked. It was perfectly predictable that Vista would tank. (No I don't hate Microsoft. Even people that love Microsoft can see this has become a "law".)

In both cases the supposed explanation is that there are two different teams working at the same time. The better one gets the first release and the second one patches their changes into it for the sucky intervening release.

No idea if that is true in practice.

--
Some drink at the fountain of knowledge. Others just gargle.

Wrong incentives by Anonymous Coward · 2011-04-29 08:07 · Score: 0

Having worked on a few of the imaging packages you refer to, I can tell you from first hand experience that the NIH does not give any incentive to write good code, nor to share with other researchers. The NIH awards money based on research publications. Most of the software produced by these labs was written by grad students whose only incentive was to publish, and then graduate. Their code is usually not of the highest quality, and they have no incentive to make it better, nor to share it with anyone else.

There are a few exceptions, people who 'get it'. Have a look at the ITK effort (http://www.itk.org), Slicer (http://www.slicer.org) and the CTK (http://www.commontk.org/index.php/Main_Page) effort. It is rare to find a research lab willing to use others' software...

Cheers

Not really. by jd · 2011-04-29 08:07 · Score: 2

I track a lot of scientific software on Freshmeat. You'd be amazed at the redundancy. Medical stuff isn't as bad as some areas.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)

Re:Not really. by xtracto · 2011-04-29 19:09 · Score: 1

I agree with this. Look at the number of frameworks to do Agent Based Modelling and Simulation and you will see there are at least 5 that are the same thing (Repast, Mason, NetLogo, StarLogo, Swarm) and a lot more that are very similar.

--
Ubuntu is an African word meaning 'I can't configure Debian'

Solution : Make it findable by oneiros27 · 2011-04-29 08:09 · Score: 1

Although I admit, some of it's the 'not invented here' problem, one of the big reasons there isn't better collaboration is that most scientists don't know that someone's working on something similar.

I deal with software that works on FITS files. The two main fields that use FITS -- medical imaging and astronomy. Do you think the two collaborate? Hell no.

And even if you do find out about some great new tool ... it's after it's been released. Which might be two or more years after they started development, so you've been duplicating work for all of that time.

I'm of the opinion that we need some sort of a cross-discipline registry for science software. List all of the stuff that can make plots ... what formats they read, what file formats they can export to, what special features they have, etc. Let me search for all of the software that can read NetCDF files ... or rice-compressed FITS, etc.

And then we need to get people to list their software (and most would, because if people use their software, and give acknowledgements in peer-reviewed papers, they can use that to justify continued funding), and in the future, we need people to list projects striving to make new software, so we know about it, and rather than expend effort to make something new, we can contribute to a project working on something similar.
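To make the registry idea a bit more concrete, here is a rough sketch of the sort of lookup I have in mind. Everything in it is an assumption made for illustration: the Package record, the find_readers helper, and the three package entries are all invented, and a real registry would need far richer metadata than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """One registry entry; all fields here are hypothetical."""
    name: str
    reads: frozenset       # file formats the package can read (lowercase)
    writes: frozenset      # file formats the package can export to (lowercase)
    features: frozenset = frozenset()


def find_readers(registry, fmt):
    """Return the sorted names of packages that list `fmt` as readable."""
    wanted = fmt.lower()
    return sorted(p.name for p in registry if wanted in p.reads)


# Invented entries, only to show the shape of the data.
REGISTRY = [
    Package("plotter_a", frozenset({"netcdf", "fits"}), frozenset({"png"})),
    Package("plotter_b", frozenset({"fits"}), frozenset({"png", "pdf"})),
    Package("plotter_c", frozenset({"csv"}), frozenset({"svg"})),
]
```

With something like this, find_readers(REGISTRY, "NetCDF") would pick out only the packages that declare NetCDF support, which is the search I keep wishing I could run across disciplines.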

--
Build it, and they will come^Hplain.

Re:Solution : Make it findable by Anonymous Coward · 2011-04-29 08:31 · Score: 0

and in the future, we need people to list projects striving to make new software, so we know about it, and rather than expend effort to make something new, we can contribute to a project working on something similar.
One problem with this is that it's not just about the software -- it's about figuring out how to formulate the medical research question with the appropriate human subject collections and getting the software to answer the question. That "figuring out how to formulate the medical research question" is what separates the top research labs from the rest -- that information is held dear. If the measurement algorithm is novel then it's not going to see the light of day until the medical piece is well on the way to publication.

Center grant and modularity by smoothnorman · 2011-04-29 08:12 · Score: 1

Two things seem to track with successful scientific software:

1. A center grant. These are understandably difficult to get, and typically require some venerable central Dumbledore (preferably with a Nobel), but they get around the problem raised in the insightful "Science wants novelty, not quality" comment. That is, a center grant is designed to allow money and publications to flow for development without overt novelty.

2. Keep the software modular, well-documented, and open. Publish everything about every interface, file format, and provide a dazzling array of examples and tutorials. As an excellent example of this see the CCP4 suite of software http://en.wikipedia.org/wiki/Collaborative_Computational_Project_Number_4/

As other commenters have sensibly pointed out, the way good software is typically produced these days requires a large commercial effort; but this represents the current path of least resistance, and not historical impossibility. Good academic software, albeit rare, is often the most amazing in scope and design.

Grant System to Blame by Anonymous Coward · 2011-04-29 08:13 · Score: 0

As a PhD student in bioinformatics/computational biology I can tell you first hand that the grant system we work under is what causes the problem.

The vast majority of this type of software is written by grad students and our goal is not to make software, it is to get results so we can write papers. Our PIs encourage this approach. I have been told point-blank "...we are not working on tools, we are working on methods. Tools are the result of that." Which, of course, anyone who has worked in software engineering knows is complete crap. So anyone looking for a tool for a particular purpose in their field is likely to find crap that is difficult to use, has outdated dependencies, no documentation, etc. To the point where it is easier for them to write their own.

Until the effort to develop usable tools, document them, and publish them, is recognized and funded by the grants that pay for the work to begin with, the current situation will continue to hold and probably get worse.

Re:Grant System to Blame by wjousts · 2011-04-29 08:25 · Score: 1

Related is that the audience for most tools is so narrow that there's no commercial market for it. Even where there is some commercial market (e.g. statistical analysis software) the software produced, while powerful, is still often not that good or easy to use, and really expensive.

Simple problem by wjousts · 2011-04-29 08:21 · Score: 1

It's a simple problem: research software is either written by scientists who don't program or programmers who don't understand the science. So either you end up with powerful and technically correct software that has an interface that is completely cryptic, confusing and generally unusable, or you get nice glossy-looking software that doesn't do what it's supposed to do.

It's really hard to find people who can do both.

Re:Simple problem by Anonymous Coward · 2011-04-29 09:16 · Score: 0

I have written a lot of scientific software, i.e., software for research. It is much faster to write it than to search for it already written. Such code, like just about everything else in modern research, is very specific and would need major editing even if you tried to use it in another lab that is studying exactly the same problem, due to different (non-computer) hardware, different analyses, different ways to slice the pie. And this is not all bad. One of the main ideas of science is independent verification.
Nobody in science gives a shit about software engineering. There's no time for it.
Re:Simple problem by wjousts · 2011-05-02 00:54 · Score: 1

All of which is just expanding on my point. Scientists don't program and don't care about software engineering and best practices; as you say, they don't have time for it. Also, a lot of what they create isn't intended to live for a long time; it's often put together as fast as possible for use in a single experiment (or a small set of them). The problem comes when those one-off programs mutate into larger programs that somebody then decides they can make money off by selling to other scientists.

Here is how by Anonymous Coward · 2011-04-29 08:47 · Score: 0

Don't fund the authors. If you must fund something, fund buyers. Let the market create the product; purchasers will quickly and accurately weed out the crap.

Re:Here is how by uid7306m · 2011-04-29 12:11 · Score: 1

Yeah, right. Scientific software is not at all like a word processor. In a word processor, you can tell immediately if it is behaving wrong: you know what it is supposed to do. But in some scientific computation, it's just the reverse. It tells you that the answer is 3.184, and it is not immediately obvious whether the answer is right or wrong. That's the difference.
"To weed out the crap" as you put it, you need to understand the computation in detail, design some test cases that are relevant to your own problem, test it, and think about the approximations that are being made. This can take weeks, months, or even years.
And, then after you've decided if it works well enough for *your* research problem, the next guy is probably going to run it on a slightly different problem. Some of those questions will raise their ugly heads again. Are those approximations still good enough? Do I need different test cases?
Basically, the difference is that with commercial software, you write it once and people run it a billion times. With research software, you write it once, and run it once. Maybe, you'll use it again, once, or maybe not. Maybe three other guys (girls) will use it too. But that's about it. Once the program works, your research question is answered, and that might be the end of it.
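To make the "design some test cases" step concrete, here is a minimal sketch in Python. It assumes a toy stand-in for a scientific routine (a trapezoid-rule integrator, not any package discussed here) and checks it against a problem whose exact answer is known: the integral of sin(x) from 0 to pi is 2. The function names are hypothetical.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def check_against_known(n):
    """Return the approximation of the integral of sin on [0, pi]
    and its absolute error against the exact value, 2."""
    approx = trapezoid(math.sin, 0.0, math.pi, n)
    return approx, abs(approx - 2.0)

if __name__ == "__main__":
    # Watch how the error shrinks as the grid is refined.
    for n in (10, 100, 1000):
        approx, err = check_against_known(n)
        print(f"n={n:5d}  approx={approx:.8f}  error={err:.2e}")
```

A check like this only covers the case it tests. Whether the same approximation is good enough for the next person's slightly different problem still has to be decided by that person.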

Sparklix take on scientific software by guznik · 2011-04-29 08:48 · Score: 1

The high cost of scientific software and its lack of accommodation to what scientists really need were the reason we came up with Sparklix electronic lab notebook and its business model.

Apparently a lot of today's scientific software is developed by engineers who know nothing about the scientific areas they are targeting, trying to create yet another CRUD or CRM-like application with a scientific flavor. What we attempted to do with Sparklix is to give researchers an experience as close as possible to their paper notebook, because this is the way they think and how they expect their software to work. To get good results our team is composed of both developers and scientists, and this makes the software fit the actual need and not remain in "thoughtland".

Regarding cost we made it simple - the product is free, and additional services (such as on-site installations) cost money. This is the direction the internet is going; we see no reason to charge money for the basic software in today's world of Google, Facebook and Twitter.

Standardize on efficient data representations by rlseaman · 2011-04-29 09:42 · Score: 1

Regarding FITS (Flexible Image Transport System), if this is used in significant ways in medical imaging, the astronomical FITS user community would love to know about it and collaborate. Regarding Rice-compressed FITS, I (and undoubtedly my coauthors) would be beyond fascinated to learn of either medical imaging use cases or compression tools for this purpose. Alternatively, any FITS-based medical imaging applications should be aware of the astronomical data compression work accessible through http://heasarc.nasa.gov/fitsio/fpack (hopefully I'm not slashdotting myself :-) Another field planning to use FITS is digital manuscript archiving per the Vatican ( http://bit.ly/aagZxN ). Regarding the topic of this thread, the comments here emphasize that the real issue is standardizing on data formats. The richer the community (and none are richer than health and medicine), the richer the software ecosystem.

Re:Standardize on efficient data representations by backwardMechanic · 2011-04-29 23:39 · Score: 1

What's the relationship between FITS, HDF and NetCDF? I've looked into the last two, but eventually decided they were far more complicated than I needed them to be - and did the evil thing that so many of us do - invented my own simple format that is 'just enough' for my own needs :-). But FITS is new to me.
Re:Standardize on efficient data representations by rlseaman · 2011-04-30 06:23 · Score: 1

FITS is the ubiquitous data format in astronomy, see http://fits.gsfc.nasa.gov/ - it has idiosyncrasies that arise from its origins in the 1970s, but is extremely portable and forgiving of a wide range of host operating systems and development environments. The specification has also been published in the refereed astronomical literature, making it suitable for very long term (even in astronomical terms) archival storage. Hence the interest of the Vatican in using this for their manuscripts. Recent data compression work is quite state of the art (if I do say so myself), and would be applicable to other scientific image or table formats, including your homebrew format.
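For anyone new to FITS, a rough sketch of where the portability comes from: a FITS header is a run of 80-character ASCII "cards" (keyword in columns 1-8, "= " in columns 9-10, then the value), terminated by an END card and padded with spaces to a multiple of 2880 bytes. The Python below builds and parses such a header in memory. The helper names are hypothetical, and the parser is deliberately simplified (it ignores quoted strings, CONTINUE cards and the like), so real work should go through CFITSIO or a similar library.

```python
CARD = 80      # bytes per header card
BLOCK = 2880   # FITS files are organised in 2880-byte blocks

def make_card(keyword, value):
    """One fixed-format card: keyword in columns 1-8, '= ' in 9-10,
    value right-justified so that it ends in column 30."""
    return f"{keyword:<8}= {value:>20}".ljust(CARD)

def make_header(cards):
    """Join cards, append END, and pad with spaces to a whole block."""
    text = "".join(make_card(k, v) for k, v in cards) + "END".ljust(CARD)
    padding = (-len(text)) % BLOCK
    return (text + " " * padding).encode("ascii")

def parse_header(raw):
    """Simplified parser: returns {keyword: value string}.
    Assumption: values contain no '/' (which starts a comment)."""
    result = {}
    for i in range(0, len(raw), CARD):
        card = raw[i:i + CARD].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] == "= ":
            result[keyword] = card[10:].split("/", 1)[0].strip()
    return result

if __name__ == "__main__":
    header = make_header([("SIMPLE", "T"), ("BITPIX", "16"), ("NAXIS", "0")])
    print(len(header), parse_header(header))
```

Because the header is plain fixed-width ASCII, it can be read with almost any language on almost any machine, which is a large part of the archival appeal.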

In-house approval exemption by memoryhole · 2011-04-29 11:43 · Score: 1

Medical imaging is a special case.

The FDA has an "in-house" exemption for things like software-based medical solutions (for example, the software that calculates the best way to deliver a radiation blast to your tumor, or the software that identifies tumors in MRI results, or whatever). In essence, to share software you've developed, you have to go through a lengthy and expensive approval process. Once you've written something, no matter how nice it is, there's a huge threshold of liability, expense, and hassle you have to overcome in order to be legally allowed to give it to other institutions.

Don't do it by Anonymous Coward · 2011-04-29 12:28 · Score: 0

Eliminating duplication of effort sounds good but it would be a total disaster.

The great advantage of the American research establishment is that there is no central authority deciding what to research. There are many different funding sources and many different establishments. Central authorities which decide on the "right" approach to pursue have no place in science --- it cannot be done by consensus.

Get ImageJ by Anonymous Coward · 2011-04-29 13:11 · Score: 0

Perhaps NIH itself knows how to produce good open source imaging software with tons of plugins. Try ImageJ, from http://rsbweb.nih.gov/ij/

What can slashdot do? by jowilkin · 2011-04-29 14:01 · Score: 1

It's kind of useless to post this on Slashdot honestly. What good is it going to do? If you have an idea for how to solve this and are a researcher then talk to the NIH. Otherwise this is just a lot of hot air.

If the software is mature enough to be widely useful, then a company should try to commercialize it under an SBIR or an STTR or something. If it's not mature then it should stay what it is - a research prototype. Most research prototypes end up being useless in my experience and it's not worth the effort at all to try to pull them all together. If there are a bunch of useful ones like you seem to be claiming, then submit a grant and do something about it.

Get a grant to create guidelines and clearinghouse by GrantRobertson · 2011-04-29 14:14 · Score: 1

Drop whatever you are doing right now and start writing a grant proposal. Here is what you will propose to do:

Create a set of guidelines to encourage reusability of software. These will include:
- General guidelines as to modularity, reusability, licensing, and documentation rather than specific instructions about languages.
- General guidelines as to revision control, and the posting of resulting software, similar to the Data Management Plans referred to by another commenter.
- Minimum standards as to openness of licenses and future availability of resulting software.
- Restrictions against going around the rules to produce proprietary software.
Create a clearinghouse web site with the following features:
- Explanations of the above guidelines with help for researchers to follow the guidelines, including forums, online courses, and perhaps seminars.
- A repository where researchers can publish their software in such a way that the version they used for their research is maintained in "stasis" in perpetuity, while also allowing others to fork, branch, borrow, or part out that software as their needs require.
  (While common revision management applications and schemes allow only a hierarchical, "branching," structure for the repositories, this repository will likely require the ability to track a very complex "graph" of the evolution of software and all the ways that researchers mix, combine, and improve the software.)
- A system which allows users to rate and review software placed into the repository for both usefulness and adherence to the guidelines. (Researchers who regularly post software which does not adhere to the guidelines will be less likely to get grants in the future.)

If you play your cards right you could end up with a lifetime career managing this clearinghouse. Why am I not writing this proposal? Because I am working on something else which I consider to be even more important.

BS by Anonymous Coward · 2011-04-29 14:42 · Score: 0

I would like to see the statement "why are we funding X different packages that do 80% of the same things, but none of them well". We have X different packages because they do what they were written to do very well. That 80% overlap is applying mostly the same standard processing tools. One tool doesn't fit all.

Re:BS by dkf · 2011-05-01 10:44 · Score: 1

We have X different packages because they do what they were written to do very well.
But a significant fraction of that X are programs written to work on one version of one specific dataset. There can even be sane reasons for this; some datasets change format between versions. Genetics data is uniformly terrible this way. I worked on a project last year to take one of these hyper-specific packages and turn it into something that another person (i.e., anyone other than the PhD candidate who wrote it) would consider using at all; it was a huge amount of work from a talented team of about 15 software engineers (plus me!) to mash that code into shape. In the process, the code became much faster, safer, more correct (we found some scientific errors; luckily for the original author, he'd had his viva by then), more useful, and more usable.
What works best is when you've got scientists doing the scientific side and software engineers taking the scientific work and turning it into products (whether OSS or not). The scientists are the domain experts, the SEs are the people who can transform a vision into reality; a scientist says what relationships need to be present in the results, an SE turns that into a database schema with appropriate FK constraints and indices. (Sure, each could learn the other's craft, but that's not as useful as having specialists working together.)
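As a rough illustration of that division of labour, suppose the scientist's stated relationship is "every measurement belongs to exactly one sample". A sketch of how that might become a schema, using Python's built-in sqlite3 module (the table and column names are invented for the example, not taken from the project described above):

```python
import sqlite3

# Hypothetical schema: the foreign key encodes "every measurement
# belongs to a sample"; the index speeds up per-sample lookups.
SCHEMA = """
CREATE TABLE sample (
    id    INTEGER PRIMARY KEY,
    label TEXT NOT NULL UNIQUE
);
CREATE TABLE measurement (
    id        INTEGER PRIMARY KEY,
    sample_id INTEGER NOT NULL REFERENCES sample(id),
    value     REAL NOT NULL
);
CREATE INDEX idx_measurement_sample ON measurement(sample_id);
"""

def open_db():
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def orphan_is_rejected(conn):
    """Try to insert a measurement for a sample that does not exist."""
    try:
        conn.execute(
            "INSERT INTO measurement (sample_id, value) VALUES (?, ?)",
            (999, 1.0),
        )
    except sqlite3.IntegrityError:
        return True
    return False

if __name__ == "__main__":
    conn = open_db()
    conn.execute("INSERT INTO sample (label) VALUES ('s1')")
    conn.execute("INSERT INTO measurement (sample_id, value) VALUES (1, 3.184)")
    print("orphan rejected:", orphan_is_rejected(conn))
```

The point of the constraint is that the database refuses data that violates the scientist's stated relationship, rather than relying on every script that writes to it to get this right.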

--
"Little does he know, but there is no 'I' in 'Idiot'!"

Much ado by RandCraw · 2011-04-29 16:51 · Score: 1

In fact there are several well-designed user-extensible medical image processing frameworks available already. ImageJ, MIPAV, and ITK were funded by the NIH and fill the very void suggested by the OP. Many more mature medical imaging tools that serve a variety of niches are freely available, many of which include free source code.

Frankly, I think the OP's main thesis is fundamentally wrong. Medical imaging research is about inventing or improving IP techniques and algorithms, not implementing and distributing software tools. Asking researchers to deliver more than a design or perhaps benchmark results would be counterproductive and a poor use of research funds. If better software tools are the goal, then some more constructive questions might be: Who best should manage such an effort? Who should fund it? And how could we fund and coordinate such endeavors better?

Personally, I'd like to see, as part of any publication, the software, data, and runtime parameters be part of the submission. "Unreproducible research considered harmful", should be the new maxim. But I digress.
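One lightweight way to approach this, sketched in Python under the assumption that the code and data are available as bytes and the runtime parameters as a plain dictionary: record a checksum of each, plus the parameters, in a small manifest that travels with the submission. The function names and manifest fields are made up for illustration, not any journal's requirement.

```python
import hashlib
import json
import platform

def sha256_bytes(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def build_manifest(code: bytes, data: bytes, params: dict) -> str:
    """Return a JSON manifest tying a result to the exact code,
    data and runtime parameters that produced it."""
    manifest = {
        "code_sha256": sha256_bytes(code),
        "data_sha256": sha256_bytes(data),
        "params": params,
        "python_version": platform.python_version(),
    }
    # sort_keys makes the output stable for identical inputs.
    return json.dumps(manifest, sort_keys=True, indent=2)

if __name__ == "__main__":
    print(build_manifest(b"print(1)", b"1,2,3", {"threshold": 0.5}))
```

A reviewer who is handed the same code and data can recompute the checksums and confirm they are looking at what the authors actually ran.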

IMHO, the current state of gov't funded medical imaging research tools is doing quite nicely, thank you. If the OP really does in fact know a better way, then he should write up his grand plan and submit a grant proposal of his own.

submitter is a commercial programmer by mjwalshe · 2011-04-29 22:57 · Score: 1

this is the same crap you get on quora where OO freaks question why Fortran is used rather than C++ for a lot of technical programming.

code hell by Anonymous Coward · 2011-04-30 14:07 · Score: 0

yay, a few thousand medstudents / medical engineers with an interest in programming contributing to a common code base. The thing would become a monolithic mess inside of a week, and every new developer could waste 12 months learning how the x million lines of code are cobbled together. Every new addition will then destroy 5 things that used to work....
