Call For Scientific Research Code To Be Released
Pentagram writes "Professor Ince, writing in the Guardian, has issued a call for scientists to make the code they use in the course of their research publicly available. He focuses specifically on the topical controversies in climate science, and concludes with the view that researchers who are able but unwilling to release programs they use should not be regarded as scientists. Quoting: 'There is enough evidence for us to regard a lot of scientific software with worry. For example Professor Les Hatton, an international expert in software testing resident in the Universities of Kent and Kingston, carried out an extensive analysis of several million lines of scientific code. He showed that the software had an unacceptably high level of detectable inconsistencies. For example, interface inconsistencies between software modules which pass data from one part of a program to another occurred at the rate of one in every seven interfaces on average in the programming language Fortran, and one in every 37 interfaces in the language C. This is hugely worrying when you realise that just one error — just one — will usually invalidate a computer program. What he also discovered, even more worryingly, is that the accuracy of results declined from six significant figures to one significant figure during the running of programs.'"
Particularly if the research is publicly funded.
Great!
I'm getting somewhat tired of reading articles where there is little or no information regarding program accuracy, total running time, memory used, etc.
And in some cases, I'm actually questioning whether the proposed algorithms actually work in practical situations...
If Pandora's box is destined to be opened, *I* want to be the one to open it.
seem to understand the very idea of scientific methods or processes, or the reasoning behind empiricism and careful management of precision.
It's a failure of education, not so much in science education, I think, as in philosophy. Formal and informal logic, epistemology and ontology, etc. People appear increasingly unable to understand why any of this matters and they essentialize the "answer" as always "true" for any given process that can be described, so science becomes an act of creativity by which one tries to create a cohesive narrative of process that arrives at the desired result. If it has no intrinsic breaks or obvious discontinuities, it must be true.
If another study that contradicts it also suffers from no breaks or discontinuities, they're both true! After all, everyone gets to decide what's true in their own heart!
STOP . AMERICA . NOW
Much quantitative academic and scientific work could benefit from the use of tools like Sweave, which allows you to embed the code used to produce statistical analyses within your LaTeX document. This makes your research easier to reproduce, both for yourself (when you've forgotten what you've done six months from now) and others.
What other kinds of tools like this are /.ers familiar with?
"Anyone who [rips a CD] is probably engaging in copyright infringement." - David O. Carson
I've always been a big fan of releasing my academic work under a BSD licence. My work is funded by the taxpayers, so I think the taxpayers should be able to do what they like with my software. So I fully agree that all software should be released. It is not always enough to just publish a paper, but you should release your code so others can fully review the accuracy of your work.
The scientific community needs to get as far as we can from the policies of companies like Gaussian Inc., who will ban you and your institution for simply publishing any sort of comparative statistics on calculation time, accuracy, etc. from their computational chemistry software.
I can't imagine what they'd do to you if you started sorting through their code...
Hey mate, spare a sig?
Please apply Hanlon's razor before leaping to conspiracy theories. Or Occam's razor might inform you that a conspiracy among thousands of scientists is a highly improbable occurrence; look for a solution that doesn't involve a perfect lid of secrecy among a group of (frequently) socially inept people.
$_ = "wftedskaebjgdpjgidbsmnjgcdwatb"; tr/a-z/oh, turtleneck Phrase Jar!/; print
One significant figure?
What? Scientists showing their work for peer-review? It's MADNESS I tell you. MADNESS !
"Why should I make the data available to you, when your aim is to try and find something wrong with it"
-Prof. Jones CRU
My bet is there is a simple explanation...namely that scientists outside of computer science are too busy in their respective fields to know anything about code, or even care. The egocentric Slashdot-worldview strikes at the heart of logic yet again.
I got my PhD in fluid mechanics funded by NASA, and as such my findings are easily publishable and shared with others. My analysis code (such as it was) was and is available for those who would like to use it. More importantly my experimental data is available as well.
This represents the classical pure research side of research where we all get together and talk about our findings and there really aren't any secrets. But even with this open example, there are still secrets when it comes to ideas for future funding. You only tip your cards when it comes to things you've already done, not future plans.
But more importantly, there are whole areas of research that are very closed off. Pharma is a good example. Sure there are lots of peer reviewed articles published and methods discussed, but you'll never really get into their shorts like this guy wants. There's a lot that goes on behind that curtain. And even if you are a grad student with high ideals and a desire to share all your findings, you may find that the rules of your funding prevent you from sharing.
Sheldon
The scientific process is to invalidate a study if the results cannot be reproduced by anyone else. That way you can eliminate all potential problems like coding errors, invalid assumptions, faulty equipment, mistakes in procedures, and a hundred other things that can produce dodgy results.
It can be misleading to search through the code for mistakes when you don't know which code was eventually used in the final results (or in which order). I have accumulated quite a lot of snippets of code that I used to fix a particular need at the time. I am sure that many of these hacks were ultimately unused because I decided to go down a different path in data processing. Or the temporary tables used during processing are no longer around (or are in a changed format since the code was written). There is also the problem of some data processing being done by commercial products.
It's just too hard. The best solution is to let science work the way it has found to be the best. Sure you will get some bad studies, but these will eventually be fixed over time. The system does work, whether vested interests like it or not.
Nobody said conspiracy, just plain crappy code. You don't need a conspiracy if you are "trying to prove" something, your crappy code spits out what you want to see and you run with it. You just need plain old incompetence.
As it is written, the editorial is saying that if there is any error at all in a scientific computer program, the science is usually invalid. What a lot of bull hunky! If this were true, then scientific computing would be impossible, especially with regards to programs that run on Windows.
Scientists have been doing great science with software for decades. The editorial is full of it.
Not that it would be bad for scientists to make their software open source. And not that it would be bad for scientists to benefit from some extra QA.
I'm perfectly OK with the fact that their job is science and not coding, but would they go to the satellite assembly guys and start gluing parts at random ?
Non-Linux Penguins ?
As it happens, my students and I are about to release a fairly specialized code - we discussed license terms, and eventually settled on the BSD (and explicitly avoided the GPL), which requires "citation" but otherwise leaves anyone free to use it.
That said, writing a scientific code can involve a good deal of work, but the "payoff" usually comes in the form of results and conclusions, rather than the code itself. In those circumstances, there is a sound argument for delaying any code release until you have published the results you hoped to obtain when you initiated the project, even if these form a sequence of papers (rather than insisting on code release with the first published results).
Thirdly, in many cases scientists will share code with colleagues when asked politely, even if it is not in the public domain.
Fourthly, I fairly regularly spot minor errors in numerical calculations performed by other groups (either because I do have access to the source, or because I can't reproduce their results) -- in almost all cases these do not have an impact on their conclusions, so while the "error count" can be fairly high, the number of "wrong" results coming from bad code is overestimated by this accounting.
Back in college, I did some computer vision research. Most people provided open source code for anyone to use. However, aside from the code being of questionable quality, it was mostly written in Matlab with C handlers for optimization.
In order to properly test all of the software out there you would need:
1. A license for every version of Matlab.
2. Windows
3. Linux
4. Octave
I had our school's Matlab, but none of the code we found was written on that version. Some was Linux, some Windows (the machine I had was a Windows box with Matlab), so we had to play with Cygwin...
I mean, basically, you need to distribute a straight-up VM if you want your results to be reproducible. (which naturally rules out Windows or Matlab or anything else proprietary being at the core.)
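Short of shipping a whole VM, a lighter first step is to record which environment produced a given result, so a later reader at least knows what to try to reconstruct. The following is only a minimal sketch in Python; the function name `describe_environment` is made up for illustration, and it does not capture Matlab versions, compiler flags, or library versions.

```python
import json
import platform


def describe_environment():
    """Collect basic facts about the machine and interpreter running an analysis."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    # Save this next to the results so the run can be matched to a setup later.
    print(json.dumps(describe_environment(), indent=2))
```

This only documents the setup; it does not make the run reproducible by itself, which is why the VM suggestion still stands.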
I'm working on my dissertation proposal, and I'd like to be able to re-run the benchmarks that are shown in some of the papers I'm referencing. But most of the source code for those papers has disappeared into the aether. Without their code, it's impossible for me to rerun the old benchmark programs on modern computers so that I and others can determine whether or not my research has uncovered a better way of doing things. This is very far from the idealized notion of the scientific method, and significantly calls into question many of the things that we think we know based on published research.
If all scientists run their results through the same analytical software, using the same code as the first researcher, they are not providing confirmation, they are merely cloning the results. That doesn't give the original results either the confidence that they've been independently validated, or that they have been refuted.
What you end up with is no-one having any confidence in the results, as they have only ever been produced in one way, and arguments that descend into a slanging match between individuals and groups of vested interests who try to "prove" that the same results show they are right and everyone else is wrong.
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
I am suspicious of the interface reference. Are they counting things where an enumeration got used as an int, or there was an implicit cast from a 32-bit float to a 64-bit one? From a recent TV show: "A difference that makes no difference is no difference." Stepping back a bit, there will be howls from OO/Functional/FSM zealots who look at a program and declare that its inferior architecture, lack of maintainability, etc. indicate its results are wrong. These are programs written to be run once to turn one set of data into a more understandable and concise one. A truth test set run through it is good enough; they don't need ISO-compliant, triply refactored, perfectly architected code to get the right answer. I don't think any of my CS profs would have cared about such inane drivel; they barely paid attention to what language we each picked to solve the assignment in. My software engineering prof would have yelled about comment density and coding standards compliance, but I consider that a different discipline primarily applicable to widely used and/or safety-critical code.
*However*
Keeping track of digit precision through a calculation isn't CS, it's fundamental grade school science. That is only one step from forgetting to do unit analysis for a sanity check. If they are forgetting that, they are probably also not looking at numerical conditioning, or trying to get by with doubles when they need bignums. None of this is CS egocentrism, it's stuff we learn in math and science courses.
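The conditioning point can be seen in a few lines of Python. This is only an illustrative sketch, not anything taken from the Hatton study: the function (1 - cos x) / x^2 and the names `naive` and `stable` are chosen here as an example. Both functions are algebraically identical and should approach 0.5 as x shrinks, but the first subtracts two nearly equal doubles.

```python
import math


def naive(x):
    # For small x, cos(x) agrees with 1.0 in almost every digit,
    # so the subtraction cancels nearly all the significant figures.
    return (1.0 - math.cos(x)) / (x * x)


def stable(x):
    # Same quantity via the identity 1 - cos(x) = 2*sin(x/2)**2,
    # which avoids subtracting nearly equal numbers.
    s = math.sin(x / 2.0)
    return 2.0 * s * s / (x * x)


if __name__ == "__main__":
    for x in (1e-2, 1e-5, 1e-8):
        print(f"x={x:g}  naive={naive(x):.6f}  stable={stable(x):.6f}")
```

By x = 1e-8 the naive form no longer has a single correct digit, while the rearranged form still returns 0.5 to within rounding. Nothing overflowed and no warning was raised; the precision simply went away.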
refactor the law, it's bloated, confusing and unmaintainable.
I'm finishing a program that inverts GRACE data to reveal fluctuations in gravity such as those caused by melting glaciers. This program will eventually be released as open source software under the GPLv3. It's largely built on open source libraries like the GNU Scientific Library, but snippets of proprietary code from JPL found their way into the program years ago, and I'm currently trying to untangle them. The program can't be made open source until I succeed because of an NDA that I had to sign in order to work at JPL.
It's impossible to say how long it will take to banish the proprietary code. While working on this project, my research is at a standstill. There's very little academic incentive to waste time on this idealistic goal when I could be increasing my publication count.
Annoyingly, the data itself doesn't belong to me. Again, I had to sign an NDA to receive it. So I can't release the data. This situation is common to scientists in many different fields.
Incidentally, Harry's README file is typical of my experiences with scientific software. Fragile, unportable, uncommented spaghetti code is common because scientists aren't professional programmers. Of course, this doesn't invalidate the results of that code because it's tested primarily through independent verification, not unit tests. Scientists describe their algorithms in peer-reviewed papers, which are then re-implemented (often from scratch) by other scientists. Open source code practices would certainly improve science, but he's wrong to imply that a single bug could have a significant impact on our understanding of the greenhouse effect.
Real scientists don't use simulators with incomplete equations and fudge factors to match highly manipulated historic data to "prove" their case with game machines that have no predictive capability or other external validation. That simply is not the way you build a valid fundamentals based model starting from the equations of motion. IPCC reports previously noted whole terms in the equations' energy terms that were inadequately described or represented, then have done no research to fill the terms, modellers just zeroing them out or putting in small constants for significant *variables*. These are not real scientists, their processes and practices have been clearly shown to be antithetical to valid science.
These models are just primitive speculative tools, often reflecting personal biases in data selection and derivation, NOT fundamental equations. The models are NOT valid physics data or experiments.
On prediction failure, Hansen's 1988 "A,B,C" forecasts of rising temperature are rapidly diverging from the cooling we are actually experiencing right now, where case C assumed we massively limited CO2 also. Missed the side of a barn with a shotgun, tsk, tsk, tsk.
As part of publication and peer review all data and provenance of the data as well as any additional formulas, algorithms, and the exact code that was used to process the data should be placed online in a neutral holding area.
Neutral area needs to be independent and needs to show any updates and changes, preserving the original content in the process.
If your data and code (readable and compilable by other researchers) isn't available then peer review and reproduction of results is foolish. If you can't look in the black box then you can't trust it.
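One way a holding area could show that the original deposit is unchanged is to publish a cryptographic digest of every file at submission time. Here is a rough Python sketch of that idea; the file names, contents, and the helper names `make_manifest` and `changed_entries` are all hypothetical, and a real archive would also need versioning and timestamps.

```python
import hashlib


def make_manifest(files):
    """Map each file name to the SHA-256 hex digest of its bytes."""
    return {name: hashlib.sha256(content).hexdigest()
            for name, content in files.items()}


def changed_entries(original, current):
    """Names in `current` whose digest differs from, or is absent in, `original`."""
    return sorted(name for name in current
                  if original.get(name) != current[name])


# Hypothetical deposit: in-memory bytes stand in for the real data and code files.
deposit = {
    "data.csv": b"year,temp\n2000,14.3\n",
    "analysis.py": b"print('hello')\n",
}
original = make_manifest(deposit)

# A later, silent edit to the data shows up as a digest mismatch.
deposit["data.csv"] = b"year,temp\n2000,14.4\n"
print(changed_entries(original, make_manifest(deposit)))  # prints ['data.csv']
```

Anyone holding the original manifest can rerun the check without having to trust whoever hosts the files.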
Ward
. Silence! Be thankful thy species is unpalatable! .
Code should be released, but this should not be confused with replicating scientific results. If you want to replicate research, you need to write your own code according to the methods described in the research. Your answer then needs to match the original to test the code.
Scientists need to realize that if they're going to get public support, they really need to be very careful with their choice of wording. Like it or not, the scare mongers, and I mean scare mongers in the sense that there are people who are trying to scare folks into believing that Global Warming is some sort of wealth redistribution scheme by the socialists, are going to use any hint, real or not, that scientists are making up their findings.
Scare mongers? Let's take a look at some of these "hints" that scientists are making up their findings. From May 7, 2002
Dozens of mountain lakes in Nepal and Bhutan are so swollen from melting glaciers that they could burst their seams in the next five years and devastate many Himalayan villages, warns a new report from the United Nations.
From January 17, 2010:
In the past few days the scientists behind the warning have admitted that it was based on a news story in the New Scientist, a popular science journal, published eight years before the IPCC's 2007 report.
It has also emerged that the New Scientist report was itself based on a short telephone interview with Syed Hasnain, a little-known Indian scientist then based at Jawaharlal Nehru University in Delhi.
Hasnain has since admitted that the claim was "speculation" and was not supported by any formal research.
Do I need to pull the quotes that claim NY and Florida will be underwater?
As for the "fear mongers" saying that GW is a socialist wealth redistribution scheme.
Some officials from the United States, Britain and Japan say foreign-aid spending can be directed at easing the risks from climate change. The United States, for example, has promoted its three-year-old Millennium Challenge Corporation as a source of financing for projects in poor countries that will foster resilience. It has just begun to consider environmental benefits of projects, officials say.
Industrialized countries bound by the Kyoto Protocol, the climate pact rejected by the Bush administration, project that hundreds of millions of dollars will soon flow via that treaty into a climate adaptation fund.
Strange. When did Rush and Hannity start writing for the NY Times?
There is no "I disagree" mod for a reason. Flamebait, Troll, and Overrated are not substitutes.
The more important the research, the larger the item under study, the more rigorous the investigation should be, the more carefully the data should be checked. This isn't just for public policy reasons but for general scientific understanding reasons. If your theory is one that would change the way we understand particle physics, well then it needs to be very thoroughly verified before we say "Yes, indeed this is how particles probably work, we now need to reevaluate tons of other theories."
So something like this, both because of the public policy/economic implications and the general understanding of our climate, should be subject to extreme scrutiny. Now please note that doesn't mean saying "Look this one thing is wrong so it all goes away and you can't ever propose a similar theory again!" However it means carefully examining all the data, all the assumptions, all the models and finding all the problems with them. It means verifying everything multiple times, looking at any errors, any deviations, and figuring out why they are there and if they impact the result and so on.
Really, that is how science should be done period. The idea of strong empiricism is more or less trying to prove your theory wrong over and over again, and through that process becoming convinced it is the correct one. You look at your data and say "Well ok, maybe THIS could explain it instead," and test that. Or you say "Well my theory predicts if X happens Y will happen, so let's try X and if Y doesn't happen, it's wrong." You show your theory is bulletproof not by making sure it is never shot at, but by shooting at it yourself over and over and showing that nothing damages it.
However, getting this process right becomes more important the bigger the issue is. If you aren't right on a theory that relates to migratory habits of a subspecies of bird in a single state, ok well that probably doesn't have a whole lot of wider implications for scientific understanding, or for the way the world is run. However if you are wrong on your theory of how the climate works, well that has a much wider impact.
Scrutiny is critical to science, it is why science works. Science is all about rejecting the ideas that because someone in authority said it, it must be true, or that a single rigged demonstration is enough to hang your hat on. It is all about testing things carefully and figuring out what works, and what doesn't.
Only if the real programmers out there promise to be nice to us scientists.
Most scientists will know a lot about, well, science... but not much about writing code or optimizing code.
Like my scripts. All correct, all working... lots of formulas... but probably a horribly inefficient way to calculate what I need. :-)
The last thing I need is someone to come to me and tell me that the outcome is correct but that my code sucks.
(And no, I am not interested in a course to learn coding - unless it's a 1-week crash course).
This is hugely worrying when you realise that just one error -- just one -- will usually invalidate a computer program.
Back in the 1970s, a bunch of CompSci guys at the university where I was a grad student did a software study with interesting results. Much of the research computing was done on the university's mainframe, and the dominant language of course was Fortran. They instrumented the Fortran compiler so that for a couple of months, it collected data on numeric overflows, including which overflows were or weren't detected by the code. They published the results: slightly over half the Fortran jobs had undetected overflows that affected their output.
The response to this was interesting. The CS folks, as you might expect, were appalled. But among the scientific researchers, the general response was that enabling overflow checking slowed down the code measurably, so it shouldn't be done. I personally knew a lot of researchers (as one of the managers of an inter-departmental microcomputer lab that was independent of the central mainframe computer center). I asked a lot of them about this, and I was appalled to find that almost every one of them agreed that overflow checking should be turned off if it slowed down the code. The mainframe's managers reported that almost all Fortran compiles had overflow checking turned off. Pointing out that this meant that fully half of the computed results in their published papers were wrong (if they used the mainframe) didn't have any effect.
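To make the trade-off concrete, here is a small Python sketch of what "overflow checking turned off" can look like. The story above does not say whether the overflows were integer or floating point, so this assumes 32-bit signed integer arithmetic purely for illustration; `add_unchecked` and `add_checked` are made-up names, and Python's own integers never overflow, so the wraparound is simulated.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def add_unchecked(a, b):
    # Simulate two's-complement 32-bit addition: the result silently wraps.
    return (a + b + 2**31) % 2**32 - 2**31


def add_checked(a, b):
    # Same addition, but refuse to return a value that does not fit in 32 bits.
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{a} + {b} does not fit in a signed 32-bit integer")
    return result


if __name__ == "__main__":
    print(add_unchecked(INT32_MAX, 1))  # wraps to the most negative value
    try:
        add_checked(INT32_MAX, 1)
    except OverflowError as err:
        print("caught:", err)
```

The unchecked version is cheaper and looks fine for ordinary inputs, which is presumably why it appealed to the researchers; the wrong answer only appears once a value crosses the limit, and nothing in the output says so.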
Our small cabal that ran the microprocessor lab reacted to this by silently enabling all error checking in our Fortran compiler. We even checked with the vendor to make sure that we'd set it up so that a user couldn't disable the checking. We didn't announce that we had done this; we just did it on our own authority. It was also done in a couple of other similar department-level labs that had their own computers (which was rare at the time). But the major research computer on campus was the central mainframe, and the folks running it weren't interested in dealing with the problem.
It taught us a lot about how such things are done. And it gave us a healthy level of skepticism about published research data. It was a good lesson on why we have an ongoing need to duplicate research results independently before believing them.
It might be interesting to read about studies similar to this done more recently. I haven't seen any, but maybe they're out there.
Those who do study history are doomed to stand helplessly by while everyone else repeats it.
While I am a fan of open source and this idea in general, for climatology, this is a non-issue. Look here: http://www.realclimate.org/index.php/data-sources/
There's more code out there than one amateur can eat for life. And you know what? From the experience of people who wrote these programs, there aren't actually many people looking at it. I doubt that any scientific code will get many eyeballs. This is more a PR exercise.
It seems to me that what's important is the theory being modeled, the algorithms used to model it, and of course the data. The code itself isn't really useful for replicating an experiment, because it's just a particular - and possibly faulty - implementation of the model and as such is akin to the particular lab bench practices that might implement a standard protocol. Replicating a modeling experiment should involve using - and writing, if necessary - code that implements the model the original investigators intended to implement, but distinct from that which they actually used.
Running the same code on the same data demonstrates very little, and finding bugs in the original code tells you nothing about what results would/should have been achieved had the model been implemented correctly. But of course it's great for throwing stones and "discrediting" a result without actually adding anything constructive to the issue at hand.
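A toy version of that kind of replication, in Python: two implementations of the same "model" (here just the sample variance, picked as a stand-in) written along different lines, then compared to a tolerance. The data values and function names are invented for illustration and have nothing to do with any real climate code.

```python
import math


def variance_two_pass(data):
    # Textbook definition: compute the mean first, then sum squared deviations.
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def variance_welford(data):
    # Independent route: Welford's single-pass running update.
    mean = 0.0
    m2 = 0.0
    for i, x in enumerate(data, start=1):
        delta = x - mean
        mean += delta / i
        m2 += delta * (x - mean)
    return m2 / (len(data) - 1)


if __name__ == "__main__":
    sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up numbers
    a = variance_two_pass(sample)
    b = variance_welford(sample)
    print(a, b, math.isclose(a, b, rel_tol=1e-12))
```

Agreement between the two says more than rerunning either one twice, since a bug would have to be made the same way in both to go unnoticed.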
I hate to break it to you but all programming is highly specialized. Climatology is in no way special in this regard.
Neither do programmers have to understand the abstract model of the program to write it or evaluate it. The vast majority of professional programmers do not understand the abstract model of the code they create. You do not have to be a high-level accountant to write corporate accounting software and you don't have to be a doctor to write medical software. Most programmers spend most of their time implementing models created by non-programmers from fields of which the programmers have no detailed knowledge.
Does that mean that programmers can't spot crappy code just because they don't understand the details of the model? No, it does not. Most software errors don't arise from the model but from sloppy practices in the management of the software project itself. An experienced programmer doesn't even have to know the language of the project to see that its creation and maintenance was incompetently handled.
You don't have to be a climatologist to know that the CRU software was utter crap that would produce sound outputs only by divine intervention. For any experienced programmer, it was immediately obvious that it was a great reeking gob of amateur coding with no structure, no plan and no standards. In my experience, most scientific software is like the CRU software. It evolves in an ad hoc manner over many years with no governing organizational structure.
Commercial software developers have created a wide range of tools and procedures to manage large, vital projects. In the main, scientists use none of these tools and most of them appear unaware they even exist, much less why and when they are needed. As a result most scientific software project management is completely amateurish. If most scientific software were written for commercial applications, the developers would be sued or imprisoned for fraud.
Scientists tend to be arrogant and dismissive of the work of others, especially those who work in the commercial sector. You believe that because you understand climatology that you therefore understand all the tools you are using. Well you don't. You think that because no one can understand your abstract model that therefore they cannot find significant errors in your code. Well, they can. You think we should reengineer our entire civilization based on your unquestioned and unexamined computerized ivory tower auguries.
Well, we won't.
You're just going to have to suck it up and withstand at least the same scrutiny we give important commercial software.
In the past ten years, we've seen warming of 0.18 degrees Celsius, which is less than the 0.25 degrees Celsius that was predicted, but it certainly hasn't been cooling. This is why the Arctic ice and Antarctic ice are melting. Yes, stop the presses, the globe is warming!
What a fool believes, he sees, no wise man has the power to reason away.