Slashdot Mirror


Mozilla Plan Seeks To Debug Scientific Code

ananyo writes "An offshoot of Mozilla is aiming to discover whether a review process could improve the quality of researcher-built software that is used in myriad fields today, ranging from ecology and biology to social science. In an experiment being run by the Mozilla Science Lab, software engineers have reviewed selected pieces of code from published papers in computational biology. The reviewers looked at snippets of code up to 200 lines long that were included in the papers and written in widely used programming languages, such as R, Python and Perl. The Mozilla engineers have discussed their findings with the papers’ authors, who can now choose what, if anything, to do with the markups — including whether to permit disclosure of the results. But some researchers say that having software reviewers looking over their shoulder might backfire. 'One worry I have is that, with reviews like this, scientists will be even more discouraged from publishing their code,' says biostatistician Roger Peng at the Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland. 'We need to get more code out there, not improve how it looks.'"

5 of 115 comments

  1. Wrong objective. by smart_ass · · Score: 5, Insightful

    I don't know the actual objective ... but if the concern is "'We need to get more code out there, not improve how it looks.'" ... the objective is bad.

    Shouldn't this be about catching subtle logic / calculation flaws that lead to incorrect conclusions?

    Agree ... if this is about indenting and which method of commenting ... then yeah ... bad idea.

    But this has the possibility of being so much more. I would see it as free editing by qualified people. Seems like a deal.

    --
    Ouch ... did I just say that.
    1. Re: Wrong objective. by old man moss · · Score: 5, Interesting

      Yes, totally agree, as someone who has tried to reproduce other people's results (in the field of image processing) with mixed success. It can be incredibly time consuming trying to compare techniques which appear to be described accurately in journals but omit "minor" details of implementation that actually turn out to be critical. I have also had results of my own which seemed odd and were ultimately due to coding errors which inadvertently improved the result. Given the opportunity, I would have published all my academic code.

      --
      rt
  2. Hell Yes! by Garridan · · Score: 5, Insightful

    Where do I sign up? If I could get a "code reviewed by third party" stamp on my papers, I'd feel a lot better about publishing the code and the results derived from it. Maybe mathematicians are weird like that -- I face stigma for using a computer, so anything I can do to make it look more trustworthy is awesome.

    1. Re:Hell Yes! by JanneM · · Score: 5, Insightful

      Problem is, at least in this trial they're reviewing already published code, when it's too late for the original writer to gain much benefit from the review. A research project is normally time-limited after all; by the time the paper and data are public, the project is often done and people have moved on.

      There's nobody with the time or inclination to, for instance, create and release a new improved version of the code at that point. And unless there are errors which lead to truly significant changes in the analysis, nobody would be willing to publish any kind of amended analysis either.

      --
      Trust the Computer. The Computer is your friend.
  3. researcher vs. software developer by Anonymous Coward · · Score: 5, Informative

    People doing scientific research and software developers are really doing very different things when they write code. For software developers or software engineers, the code is the end goal. They are building a product that they are going to give to others. It should be intuitive to use, robust, produce clear error messages, and be free of bugs and crashes. The code is the product. For someone doing scientific or engineering research, the end goal is testing an idea, or running an experiment. The code is a means to an end, not the end itself; it needs only to support the researcher, it only needs to run once, and it only needs to be bug free in the cases that are being explored. The product is a graph or chart or sentence describing the results that is put into a paper that gets published; the code itself is just a tool.

    When I got my Ph.D. in the 1990s, I didn't understand this, and it brought me a lot of grief when I went to a research lab and interacted with software developers and managers, who didn't understand this either. The grief comes about because of the different approaches used during the development of each type of code. Software developers describe their process variously as a waterfall model, agile development model, etc. These processes describe a roadmap, with milestones, and a set of activities that visualize the project at its end, and lead towards robust software development. The process a researcher uses is related to the scientific method: based on the question, they formulate a hypothesis, create an experiment, test it, observe the results, and then ask more questions. They do not always know how things will turn out, and they build their path as they go along. Very often, the equivalent "roadmap" in a researcher's mind is incomplete and is developed during the process, because this is part of what is being explored.

    In my organization, this creates tremendous conflict between software developers, who want a careful, process-driven model to produce robust code, and researchers, who are seeking to answer more basic questions and explore unknown territory in a way that has a great deal of uncertainty and cannot always easily deliver the specific milestones and schedule clarity that are often desired.

    It is worse when the research results in a useful algorithm; of course, the researcher often wants to make it available to the world so that others can use it. This is more of a grey area; if the researcher knows how to do software engineering, they may go through the process to create a more robust product, but this takes effort and time. The fact that Mozilla wants to help debug scientific code is a very good thing; it often needs more serious debugging and re-architecting than other software that is openly available.

    I wish more people understood this difference.