Open v. Closed Source-Climate Change Research
theidocles writes "The ongoing debate over the 'hockey stick' climate graph has an interesting side note.
McKitrick & McIntyre (M&M), the critics, have published their complete source code and it's written using the well-known R statistics package (covered by the GPL). Mann, Bradley & Hughes, the defenders, described their algorithm but have only released part of their source code, and refuse to divulge the rest, which really makes it look like they have some errors/omissions to hide (they did publish the data they used). There's an issue of open source vs closed source as well as how much publicly-funded researchers should be required to disclose - should they be allowed to generate 'closed-source' solutions at the taxpayers' expense?"
should they be allowed to generate 'closed-source' solutions at the taxpayers' expense?
No. I paid for it; I want to see it. How else will we know if it works the way they say it works?
I'm a virgo and on Slashdot. Coincidence? Yes.
how much publicly-funded researchers should be required to disclose
All of it, baby. We're paying for it -- we should have the right to:
a) Know what you're spending our money on
b) Have the right to make it better ourselves
c) Learn of security flaws early so we can correct them
Especially when there is some doubt about the nature of the results in the closed source model from Mann et al.
The dangers of knowledge trigger emotional distress in human beings.
The problem with most of these studies is that they refuse to release the raw data.
... but no thanks !!!
A lot of times they select subsets of the data and then normalize or otherwise massage the data.
Thanks
Science, like government, should be transparent. The public should be able to see and evaluate every part. Any science, or government, that hides its implementation is inherently suspect of corruption.
Closed science is half a step from religion. You are expected to have faith in the researcher's methodologies, analysis, assumptions, and motives. Sorry, but good science does not rely on faith.
I'm of the opinion that anything that gets published should be published in its entirety, at least at some point. For example, people who publish protein structures can put the coordinates "on hold" for up to 18 months.
And to say it should be open because the research is done with "taxpayers' money" is missing the point: If you can't reproduce every step, it's voodoo, not science. And we make policy decisions based on science, not voodoo (I hope).
While I would like all works performed for the government that are not of National Security importance to be more open, I don't think it is necessary.
A lot of work performed for government agencies is contractual with businesses. These same businesses employ tricks of the trade and such to deliver what is required. To have them detail how the work is done is just suicidal. The same goes for software they develop for use by the government. Unless specifically addressed in the contract, I do not believe there is a right to disclose the code, let alone make it available to the public.
That last part is key. Even if they disclose the source to the government there is no obligation on either party to make it public.
This argument that they have something to hide is childish. It is designed to provide no leeway. Simply put, once labeled as such, what option other than disclosure exists? You might as well say "You have to release it, it's for the children" and then proceed to use the whole "hates kids, wants kids to die" guilt trip that is far too common in politics today.
Summary. Release it only if it's an upfront requirement of the project and agreed upon by both parties. In the future, a requirement by law that all government projects must be fully disclosed, including the source of any software, may be nice, but I bet it would have so many exceptions written into it that it would result only in a "feel-good" law.
* Winners compare their achievements to their goals, losers compare theirs to that of others.
That Global Warming is a manmade, real phenomenon is accepted by 99.9% of scientists in the fields involved. To trot out the "only a theory", "some experts dispute" etc. routine is like getting the Flat Earth Society involved every time someone talks about circumnavigation. "Heads in the sand" is going to be on our culture's gravestone when the next lot of intelligent life evolves here and starts wondering why parts of Nevada are 10,000 times the normal radiation level.
The Slashdot Paradox: "100% Overrated"
It shouldn't. But of course there are two ways to resolve this inconsistency:
Option 2 please.
Cheers,
Ian
Most of the "arguments against open-source science" mentioned here are not about science at all. The secrecy surrounding commercial and national-security "science" is good only in a financial or political sense: it does not help science, per se, at all. And personality conflicts are a factor as well: I suspect that Mann et al.'s reluctance to release source stems from an extreme personal frustration at McKitrick et al.'s persistent and (in my view) not always well-supported attacks.
"Their argument since the beginning has essentially not been about methodological issues at all, but about 'source data' issues [...] Only if you remove significant portions of the data do you get a different (and worse) answer."
You're over-trivializing a DRAMATICALLY IMPORTANT POINT. The original study is focused on North American data almost exclusively for certain time periods. That data (from a single species of tree) skews the results in such a way as to make the current trend seem unique and drastic. On the other hand, if you treat that data source in such a way as to balance it with the other data that is available, you see a VERY DIFFERENT TREND!
The response has been to claim that weighting the data in this way reduces the number of data points unacceptably (I would agree, but that doesn't make MBH98 right).
That's the whole point here, and the other side continues to say, "you're throwing away data" when any competent researcher would have thrown it out in the first place (note: there's an exception. If you produced a report that was specific to N. America, MBH98 would be your model, and it seems to be a fine model for that... N. America is seeing record warming as compared with the last few centuries, and that's all you can extract from MBH98).
Also keep some perspective in mind here. We're in a period where temperatures could rise MORE than ANYONE is predicting and not make a dent in the graph over the last 10 million years. If you graph out the last 10 million years, you see that temperatures over the last 10,000 years have been part of a huge, cyclical spike in temperatures. We're at what is likely the peak of a drastic temperature swing, and it WILL plummet again into a new ice age (unless we decide to and are capable of coming up with a way to prevent it). I'm not drawing any conclusions from that, just pointing out that there are natural forces at work here, capable of making temperature changes that a) we cannot yet conclusively explain and b) no human has ever experienced.
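To make the weighting point concrete, here is a minimal sketch in Python. It does not reproduce MBH98 or the M&M analysis; the series, the weights, and the helper name `weighted_composite` are all invented for illustration. It only shows the arithmetic of how one heavily weighted series can change the shape of a combined curve.

```python
# Toy illustration only: the numbers below are made up, not real proxy data.

def weighted_composite(series, weights):
    """Return the weighted average across all series at each time step."""
    total = sum(weights)
    length = len(series[0])
    return [
        sum(w * s[t] for s, w in zip(series, weights)) / total
        for t in range(length)
    ]

flat_a = [0.0, 0.0, 0.0, 0.0]   # hypothetical series with no trend
flat_b = [0.0, 0.0, 0.0, 0.0]   # another hypothetical series with no trend
rising = [0.0, 0.0, 0.0, 4.0]   # hypothetical series with a late upswing

series = [flat_a, flat_b, rising]

# Every series counts the same.
equal = weighted_composite(series, [1, 1, 1])

# The rising series is given eight times the weight of each other series.
skewed = weighted_composite(series, [1, 1, 8])

print("equal weights: ", equal)
print("skewed weights:", skewed)
```

With equal weights the final value of the composite is a third of the upswing; with the skewed weights it is most of it. Which weighting is appropriate for the real data is exactly what the two sides are arguing about, and this sketch takes no position on that.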
It's important to keep a sense of perspective and to remember that we have very impressive climate models... all of which might be wrong.
Where's that wealth of information about the secret US wars in Central America in the 1980s? Or in Angola in the 1970s? Or in Chile in the 1970s? Or in Cambodia and Laos in the 1960s? Iran in the 1950s? These secret wars are secret largely for *political* purposes - the military secrecy benefits evaporate within months. But the political purposes - covering liability for abuse, war crimes, and just plain lying about the causes, effects, and benefits of the war - those last forever.
--
make install -not war
How many times have you asked someone, "What does your code do to solve this problem?" and got a description of an algorithm which, when you finally get to see the source, does not match the code?
In my case, the answer to that question is, "Lots." I have had it happen in pure science (neutrino physics), applied science (medical physics) and software development (database programming, data analysis, etc.)
I am painfully aware that my own published descriptions of algorithms have often left out minor details that may be critical in some applications, but page limits in peer-reviewed journals make such omissions necessary. It is not uncommon to get a call from someone doing similar work asking for details about what you've done, how you've done it, and in some cases, asking to look at source code.
In contentious areas of science such requests are not always met with full disclosure, which is a sign that the people involved are no longer doing science. They are doing politics. This happens a lot, and it brings the scientific process to a halt on the question at issue.
In the case at hand, the original authors have done a very poor job of describing what they have done, and an extremely poor job of defending their work. Their refusal to publish their source code for their analysis gives credibility to their critics.
There are certainly legitimate cases where code ought not be published. If a lab has spent many, many years developing a framework for solving a certain type of problem and wants to get the most advantage out of that framework before releasing it, they may reasonably want to limit its dissemination for a while. But those sorts of reasons don't apply in this case, and the source should be made available to anyone who wants to reproduce their actual results. That would just be good science.
--Tom
Blasphemy is a human right. Blasphemophobia kills.
One's interest in keeping clients does not entitle one to make a scientific claim that cannot be peer reviewed. If a paper such as Mann's is now regarded as fact, and indeed makes policy, despite the obvious sloppiness regarding its data management process, then what is the point of science anyway?
Science is supposed to be about peer review and rigor, so that every assumption behind every assertion can be challenged. If all we have as our science is that someone with a PhD can claim to have a fact, then what is the point of even trusting them?
Without independent verification and an open process, there's nothing to separate scientists from creationists, and the people are going to pick whoever makes the most attractive sales pitch.
This is my sig.
The whole point of science is not so that we can trust the opinions of scientists, it is so that scientists can give us repeatable steps to demonstrate a new point.
This whole notion of "it makes the scientists happy so we should just trust them" goes against every single thing that we in the West have fought for since the Renaissance.
Your whole argument illustrates this problem precisely. You argue that, "well, even though the key piece of statistical evidence in global warming is questionable, we should still believe in the conclusion."
This is so wrong.
Maybe if scientists published all of their data in a uniform format, to a uniform site, with exact steps to reproduce, all of their source data, and how they draw conclusions from them, then you might have a field that is useful. But right now, you have hyper-expensive journals all over the place as a repository for articles that only sketch out a discovery rather than actually demonstrate it, and that simply is not good enough to be taken as credible.
The scientific process is excellent. But today's scientific product sucks.
This is my sig.