Call For Scientific Research Code To Be Released

Posted by Soulskill on Tuesday February 9, 2010 @02:41AM from the but-then-people-will-see-how-awful-it-is dept.

Pentagram writes "Professor Ince, writing in the Guardian, has issued a call for scientists to make the code they use in the course of their research publicly available. He focuses specifically on the topical controversies in climate science, and concludes with the view that researchers who are able but unwilling to release programs they use should not be regarded as scientists. Quoting: 'There is enough evidence for us to regard a lot of scientific software with worry. For example Professor Les Hatton, an international expert in software testing resident in the Universities of Kent and Kingston, carried out an extensive analysis of several million lines of scientific code. He showed that the software had an unacceptably high level of detectable inconsistencies. For example, interface inconsistencies between software modules which pass data from one part of a program to another occurred at the rate of one in every seven interfaces on average in the programming language Fortran, and one in every 37 interfaces in the language C. This is hugely worrying when you realise that just one error — just one — will usually invalidate a computer program. What he also discovered, even more worryingly, is that the accuracy of results declined from six significant figures to one significant figure during the running of programs.'"

103 of 505 comments

Seems reasonable by NathanE · 2010-02-09 02:44 · Score: 4, Insightful

Particularly if the research is publicly funded.
1. Re:Seems reasonable by fuzzyfuzzyfungus · 2010-02-09 03:19 · Score: 5, Insightful
  
  The "The public deserves access to the research it pays for" position seems so self-evidently reasonable that further debate is simply unnecessary (though, unfortunately, the journal publishers have a strong financial interest in arguing the contrary, so the "debate" actually continues, against all reason). Similarly, the idea that software falls somewhere in the "methods" section and is as deserving of peer review as any other part of the research seems wholly reasonable. Again, I suspect that getting at the bits written by scientists, with the possible exception of the ones working in fields (oil geology, drug development, etc.) that also have lucrative commercial applications, will mainly be a matter of developing norms and mechanisms around releasing it. Academic scientists are judged, promoted, and respected largely according to how much (and where) they publish. Getting them to publish more probably won't be the world's hardest problem. The more awkward bit will be the fact that large amounts of modern scientific instrumentation, and some analysis packages, include giant chunks of closed source software; but are also worth serious cash. You can absolutely forget getting a BSD/GPL release, and even a "No commercial use, all rights reserved, for review only, mine, not yours." code release will be like pulling teeth.
  
  On the other hand, I suspect some of this hand-wringing of being little more than special pleading. "This is hugely worrying when you realise that just one error — just one — will usually invalidate a computer program." Right. I know that I definitely live in the world where all my important stuff: financial transactions, recordkeeping, product design, and so forth are carried out by zero-defect programs, delivered to me over the internet by routers with zero-defect firmware, and rendered by a variety of endpoint devices running zero-defect software on zero-defect OSes. Yup, that's exactly how it works. Outside of hyper-expensive embedded stuff, military avionics, landing gear firmware, and FDA approved embedded medical widgets (that still manage to Therac people from time to time), zero-defect is pure fantasy. A very pleasant pure fantasy, to be sure; but still fantasy. The revelation that several million lines of code, in a mixture of Fortran and C, most likely written under time and budget constraints, isn't exactly a paragon of code quality seems utterly unsurprising, and utterly unrestricted to scientific areas. Code quality is definitely important, and science has to deal with the fact that software errors have the potential to make a hash of their data; but science seems to attract a whole lot more hand-wringing when its conclusions are undesirable...
2. Re:Seems reasonable by apoc.famine · 2010-02-09 04:12 · Score: 5, Insightful
  
  As someone doing a PhD in a climate related area, I can see both sides of the issue. The code I work with is freely and openly available. However, 99.9% or more of the people in the world wouldn't be able to do a damn thing with it. I look at my classmates - we're all in the same degree program, yet probably only 5% of them would really be able to understand and do anything meaningful with the code I'm using.
  
  Why? We're that specialized. Here, I'm talking 5% of people studying atmospheric and oceanic sciences being able to make use of my code without taking several years to get up to speed. What's the incentive to release it? Why bother with the effort, when the audience is soooo small?
  
  Release the code, and if some dumbass decides to dig into it, you either are in the position of having to waste time answering ignorant questions, or you ignore them, giving them ammo for "teh code is BOGUS!!!!" Far easier to just keep the code in-house, and hand it out to the few qualified researchers who might be interested. Unsurprisingly, a lot of scientific code is handled this way.
  
  However, I do very much believe in completely transparent discourse. My research group has two major comparison studies of different climate models. We pulled in data from seven models from seven different universities, and analyzed the differences in CO2 predictions, among other things. The data was freely and openly given to us by these other research groups, and they happily contributed information about the inner workings of their models. This, in my book, is what it's all about. The relevant information was shared with people in a position to understand it and analyze it.
  
  It'd be a whole different story if the public wasn't filled with a bunch of ignorant whack-jobs, trying to smear scientists. When we're trying to do science, we'd rather do science than defend ourselves against hacks with a public soapbox. If you want access to the data and the code, go to a school and study the stuff. All the doors are open then. The price of admission is just having some vague idea wtf you're talking about.
  
  --
  Velociraptor = Distiraptor / Timeraptor
3. Re:Seems reasonable by Sir_Sri · 2010-02-09 04:14 · Score: 5, Informative
  
  And it's not like the people writing this code are, or were trained in computer science, assuming computer science even existed when they were doing the work.
  Having done an undergrad in theoretical physics, but being in a PhD in comp sci now I will say this: The assumption in physics when I graduated in 2002 was that by second year you knew how to write code, whether they've taught you or not. Even more recently it has still been an assumption that you'll know how to write code, but they try and give you a bare minimum of training. And of course it's usually other physical scientists who do the teaching, not computer scientists, so bad information (or out of date information or the like) is propagated along. That completely misses the advanced topics in computer science which cover a lot more of the software engineering sort of problems. Try explaining to a physicist how a 32 or 64 bit float can't exactly replicate all of the numbers they think it can and watch half of them have their eyes glaze over for half an hour. And then the problem is what do you do about it?
  Then you get into a lab (uni lab). Half the software used will have been written in F77 when it was still pretty new, and someone may have hacked some modifications in here and there over the years. Some of these programs last for years, span multiple careers and so on. They aren't small investments but have had grubby little grad student paws on them for a long time, in addition to incompetent professor hands.
  None of scientific computing is done particularly well, they expect people with no training in software development to do the work, assuming it was done when software development existed, and there isn't the funding to pay people who might do it properly.
  On top of all that it's not like you want to release your code to the public right away anyway. As a scientist you're in competition with groups around the world to publish first. You describe in your paper the science you think you implemented, someone else who wants to verify your results gets to write a new chunk of code which they think is the same science and you compare. Giving out a scientist's code for inspection means someone else will have a working software platform to publish papers based on your work, and that's not so good for you. For all the talk of research for the public good, ultimately your own good, of continuing to publish (to get paid) trumps a public need. That's a systemic problem, and when you're competing with a research group in Brazil, and you're in Canada, their rules are different than yours, and so you keep things close to the chest.
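
For readers who have not run into the float issue mentioned in the comment above, here is a minimal sketch in Python. It is an illustration only, not code from any of the research programs under discussion; the helper name `to_float32` is made up for this example, and it uses the standard `struct` module to round a value to 32 bits.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float and back.

    Hypothetical helper for illustration only.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]

# 0.1 and 0.2 have no exact binary representation, so their sum is not exactly 0.3.
print(0.1 + 0.2 == 0.3)                  # False
print(abs((0.1 + 0.2) - 0.3) < 1e-12)    # True: compare with a tolerance instead

# The same decimal value stored at 32 bits and at 64 bits is not the same number.
print(to_float32(0.1) == 0.1)            # False
# Values that are sums of powers of two survive unchanged.
print(to_float32(0.5) == 0.5)            # True
```

As for "what do you do about it", the usual first step is to compare with an explicit tolerance rather than `==`, and to pick that tolerance from what the science needs.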
4. Re:Seems reasonable by Troed · 2010-02-09 04:20 · Score: 3, Informative
  
  Your comment clearly shows you know nothing about software. I'm able to audit your source code without having a slightest clue as to what domain it's meant to be run in.
  
  --
  it's in my head
5. Re:Seems reasonable by TheTurtlesMoves · 2010-02-09 04:30 · Score: 5, Insightful
  
  You're not the F***** pope. You don't get to tell people they are not worthy enough to look at your code/data. You don't like it, don't do science. But this attitude of only cooperating with a "vetted" group of people is causing far more problems than you think you are solving by doing it. You are not as smart as you think you are.
  
  Want to make a claim/suggestion that has very real economic and political ramifications for everyone, you provide the data/models for everyone. Otherwise, have a nice hot cup of shut the frak up.
  
  --
  The Grey Goo disaster happened 3 billion years ago. This rock is covered in self replicating machines!
6. Re:Seems reasonable by mjwalshe · 2010-02-09 04:36 · Score: 2, Insightful
  
  Well, up to a point; however, it's the model you have to validate. Years ago I helped write some code to model the behavior of pumps and one of the tests we did was to run the model and compare it to real life and also run the model in reverse to see if we got back to the same point we started from. Without knowing a ton about CS/mathematics and the modeling methods used, and without access to the original data, a non-specialist is not going to get very far.
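
The "run the model in reverse" test described above can be sketched as follows. This is not the pump model, which is not available here; it is a toy unit-mass spring stepped with velocity Verlet, chosen only because that scheme is time-reversible. All names (`step`, `run`, `round_trip_error`) and parameter values are assumptions made for the illustration.

```python
def step(x, v, dt, k=1.0):
    """One velocity-Verlet step for a unit-mass spring with stiffness k (toy model)."""
    a = -k * x
    x_new = x + v * dt + 0.5 * a * dt * dt
    a_new = -k * x_new
    v_new = v + 0.5 * (a + a_new) * dt
    return x_new, v_new

def run(x, v, dt, n):
    """Advance the state n steps of size dt (dt may be negative)."""
    for _ in range(n):
        x, v = step(x, v, dt)
    return x, v

def round_trip_error(x0, v0, dt=0.01, n=1000):
    """Run forward n steps, then backward n steps, and report the largest drift."""
    x1, v1 = run(x0, v0, dt, n)
    x2, v2 = run(x1, v1, -dt, n)   # same scheme with time reversed
    return max(abs(x2 - x0), abs(v2 - v0))

print(round_trip_error(1.0, 0.0) < 1e-9)
```

A large round-trip error would point at a bug in the stepping code or at a scheme that is not reversible in the first place, and telling those two apart is where the domain knowledge comes in.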
7. Re:Seems reasonable by apoc.famine · 2010-02-09 04:42 · Score: 4, Insightful
  
  Of all the stuff that's important in scientific computing, the code is probably one of the more minor parts. The science behind the code is drastically more important. If the code is solid and the science is crap, it's useless. Likewise, the source data that's used to initialize a model is far more important than the code. If that's bogus, the entire thing is bogus.
  
  Sure, you could audit it, and find shit that's not done properly. At the same time, you wouldn't have a damn clue what it's supposed to be doing. Suppose I'm adding a floating point to an integer. Is that a problem? Does it ruin everything? Or is it just sloppy coding that doesn't make a difference in the long run? Understanding what the code is doing is required for you to do an audit which will produce any useful results.
  
  Unless you're working under the fallacy that all code must be perfect and bug free. Nobody gives a shit if you audit software and produce a list of bugs. What's important is that you be able to quantify how important those bugs are. And you can't do that without knowing what the software is supposed to be doing. When it's something as complicated as fluid dynamics or biological systems, a code audit by a CS person is pretty much worthless.
  
  --
  Velociraptor = Distiraptor / Timeraptor
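
The "adding a floating point to an integer" question above has a concrete answer in at least one setting. The sketch below uses Python, where the integer is converted to a 64-bit float before the addition; the function name `mixing_error` is made up for this example, and other languages have different conversion rules.

```python
def mixing_error(n):
    """Return how far int(n + 0.0) lands from the original integer n.

    Hypothetical helper: adding a float forces n through a 64-bit float.
    """
    return int(n + 0.0) - n

# Small integers pass through a float unchanged, so the mix is harmless.
print(mixing_error(3))            # 0

# Above 2**53 not every integer has a float representation, so one is lost.
print(mixing_error(2**53 + 1))    # -1
```

Whether an off-by-one at that magnitude matters is exactly the kind of question that needs someone who knows what the number represents.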
8. Re:Seems reasonable by Troed · 2010-02-09 05:00 · Score: 4, Insightful
  
  Your argument is void. A bug is a bug. Either it affects the outcome of the program run or it doesn't - and I still don't need to know anything about what it's supposed to do to verify that. You just need to re-run the program with a specified set of inputs and check the output - also known as verified against its own test suite.
  (Yes, I'm a Software Engineer by education)
  
  --
  it's in my head
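
A rough sketch of what "re-run the program with a specified set of inputs and check the output" can look like. Everything here is hypothetical: `mean_anomaly` stands in for some analysis step, and the reference cases are invented numbers that are easy to check by hand.

```python
import math

def mean_anomaly(readings, baseline):
    """Hypothetical analysis step: average deviation of readings from a baseline."""
    return sum(r - baseline for r in readings) / len(readings)

# Reference cases as ((inputs), expected output); values chosen to be checkable by hand.
REFERENCE_CASES = [
    (([14.0, 15.0, 16.0], 14.0), 1.0),
    (([10.0, 10.0], 10.0), 0.0),
]

def run_regression(func, cases, tol=1e-9):
    """Return the cases whose output differs from the stored expectation."""
    failures = []
    for args, expected in cases:
        got = func(*args)
        if not math.isclose(got, expected, abs_tol=tol):
            failures.append((args, expected, got))
    return failures

print(run_regression(mean_anomaly, REFERENCE_CASES))   # [] when everything matches
```

The harness itself needs no domain knowledge, but somebody still has to decide what the expected values are, which is roughly where the two sides of this sub-thread disagree.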
9. Re:Seems reasonable by Pinky's+Brain · 2010-02-09 05:02 · Score: 2, Interesting
  
  Let's assume for a moment you publish your code in the reproducible research sense, this will mean you also publish all the code necessary to compute the graphs in your papers ... at that point I can at the very least determine if what you thought was significant in the initial results as explained in your papers is still there.
10. Re:Seems reasonable by professionalfurryele · 2010-02-09 05:05 · Score: 2, Interesting
  
  Having worked in academia I can attest to the very poor code quality, at least in my area. The reason is very simple: the modern research scientist is often a jack-of-all-trades. A combination IT professional, programmer, teacher, manager, recruiter, bureaucrat, hardware technician, engineer, statistician, author and mathematician as well as being an expert in the science of their own field. Any one of these disciplines would take years of experience to develop professional skills at. Most scientists simply don't have time to do that, so they wing it. I think publishing code would be a good idea as scrutiny would help quality, but a big chunk of this code is never going to be of professional quality because it isn't written by professional programmers.
11. Re:Seems reasonable by MikeBabcock · 2010-02-09 05:13 · Score: 3, Insightful
  
  Both are issues. If your code is buggy, the output may also be buggy. If the code is bug-free but the algorithms buggy, the output will also be buggy.
  The whole purpose of publishing in the scientific method is repeatability. If the software itself is just re-used without someone looking at how it works or even better, writing their own for the same purpose, you're invalidating a whole portion of the method itself.
  As a vastly simplified example, I could posit that 1 + 2 = 4. I could say I ran my numbers through a program as such:
  print f(1, 2); f (a, b): print $b + $b;
  If you re-ran my numbers yourself through MY software without validating it, you'd see that I'm right. Validating what the software does and HOW it does it is very much an important part of science, and unfortunately overlooked. While in this example anyone might pick out the error, in a complex system it's quite likely most people would miss one.
  To the original argument, just because very few people would understand the software doesn't mean it doesn't need validating. Lots of peer review papers are truly understood by a very small segment of the scientific population, but they still deserve that review.
  
  --
  - Michael T. Babcock (Yes, I blog)
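
The one-line example in the comment above is pseudocode; a runnable rendering in Python might look like the sketch below. The names `f_published` and `f_independent` are made up here, and the second function is simply what someone would write from the stated method (add the two inputs) without looking at the first.

```python
def f_published(a, b):
    """The flawed routine from the example: it ignores a and doubles b."""
    return b + b

def f_independent(a, b):
    """A separate reimplementation written from the stated method (a + b)."""
    return a + b

# Re-running the original program only reproduces the original mistake.
print(f_published(1, 2))     # 4, which "confirms" the claim that 1 + 2 = 4

# An independent implementation disagrees, which is the signal worth chasing.
print(f_independent(1, 2))   # 3
```

Re-running shared code shows the result is repeatable; only reading it or rewriting it shows whether it is right.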
12. Re:Seems reasonable by bmajik · 2010-02-09 05:18 · Score: 5, Insightful
  
  However, 99.9% or more of the people in the world wouldn't be able to do a damn thing with it. I look at my classmates - we're all in the same degree program, yet probably only 5% of them would really be able to understand and do anything meaningful with the code I'm using.
  I think the world is very lucky that Linus Torvalds wasn't as narrow-sighted and conceited as you are.
  
  Why? We're that specialized. Here, I'm talking 5% of people studying atmospheric and oceanic sciences being able to make use of my code without taking several years to get up to speed. What's the incentive to release it? Why bother with the effort, when the audience is soooo small?
  Release the code, and if some dumbass decides to dig into it, you either are in the position of having to waste time answering ignorant questions, or you ignore them, giving them ammo for "teh code is BOGUS!!!!" Far easier to just keep the code in-house, and hand it out to the few qualified researchers who might be interested. Unsurprisingly, a lot of scientific code is handled this way.
  However, I do very much believe in completely transparent discourse. My research group has two major comparison studies of different climate models. We pulled in data from seven models from seven different universities, and analyzed the differences in CO2 predictions, among other things. The data was freely and openly given to us by these other research groups, and they happily contributed information about the inner workings of their models. This, in my book, is what it's all about. The relevant information was shared with people in a position to understand it and analyze it.
  It'd be a whole different story if the public wasn't filled with a bunch of ignorant whack-jobs, trying to smear scientists. When we're trying to do science, we'd rather do science than defend ourselves against hacks with a public soapbox. If you want access to the data and the code, go to a school and study the stuff. All the doors are open then. The price of admission is just having some vague idea wtf you're talking about.
  Have you heard of "ivory tower"? You're it.
  Your position basically boils down to this: "unless you read all the same things I read, talked to all the same people I talked to, went to all the same schools I did... you're not qualified to talk to me".
  That is _the_ definition of monocultural isolationism.. i.e. the Ivory Tower of Academia problem.
  Here's the problem: if your requirement is that anyone you consider a "peer" must have had all of the same inputs and conditionings that you had... what basis do you have for allowing them to come out of the other side of that machine with a non-tainted point of view?
  As a specific counterpoint to your way of thinking:
  My dad is an actuary.. one of the best in the world. He regularly meets with the top handful of insurance regulators in foreign governments. He manages the risk of _billions_ of dollars. The maths involved in actuarial science embarrass nearly any other branch of applied mathematics. I have an undergraduate math degree and I could only understand his problem domain in the crudest, rough-bounding box sort of fashion. Furthermore, he's been a programmer since the System/360 days.
  Yet his code, while there is a lot of it, is something I am definitely able to help him with. We talk about software engineering and specific technical problems he is having on a frequent basis.
  You don't need to be a problem domain expert in order to demonstrate value when auditing software.
  Furthermore, as a professional software tester, I happen to find that occasionally, not over-familiarizing myself with the design docs and implementation details too early allows me to ask better "reset" questions when doing design and code reviews. "Why are you doing this?" And as the developer talks me through it, they understand how shaky their assumptions are. If I had been "travelling" with them in lock step
  
  --
  My opinions are my own, and do not necessarily represent those of my employer.
13. Re:Seems reasonable by Starlet+Monroe · 2010-02-09 05:38 · Score: 2, Insightful
  
  This is a conundrum for me. My research is in the world of radiation physics, where results can definitely be life-changing. I absolutely respect the amount of impact small discrepancies can have on outcomes, but I also struggle to find a balance. The project I'm on right now is a retrospective analysis, so the results we report won't directly affect anyone. If policy changes are made from what we determine, the results will.
  My role is to conduct some fairly complex calculations against a data set, for which I've built some custom software and a database. The software isn't great software...it's good enough to get the job done. I validate the input...a little bit. Just enough to make sure we're using the right file. I confirm that the data I need exists in our input, but I don't do any boundary checking on it. Why should I? There's only one data file that gets analyzed, and as we collect more data, we run it again. I'll probably use this code in "production" four times in the course of the study. Are there stupid bugs that crop up if strings show up in the data instead of floats? Sure. But there won't ever be strings in the data, and the code won't ever be used after we run the data through. We don't have the budget for me to spend the time to write it "right", the way I would if it was for enterprise use. And we sure can't afford to QA it, too.
  I respect the idea that all code should go through a complete development cycle before use in production, and I think it's certainly important for that to happen in science, but I think there have to be limits. Sometimes the object is to get something done, and the difference between doing it "best" and "good enough" doesn't mean the difference between "right" and "wrong."
  
  --
  ++
14. Re:Seems reasonable by Troed · 2010-02-09 06:17 · Score: 2, Insightful
  
  You're not doing science if you're not performing work that can be falsified (and replicability is a cornerstone in that).
  I'd rather have you do science.
  
  --
  it's in my head
15. Re:Seems reasonable by khellendros1984 · 2010-02-09 06:19 · Score: 3, Informative
  
  It may not be possible to actually "discuss" the topic, but it's certainly possible to find bugs that may or may not influence the output of the program. And given the original input data, it's possible to remove the bugs, run the corrected program against the original input data, and see if the output is different. It would take someone with knowledge in the target topic to analyze the output data and decide if any difference is significant, but the actual check for bugs could certainly be done by anyone that "speaks" the language the program was written in.
  
  Even with something like a Global Warming argument, a person with a strong grasp of both English and logic might not be able to verify claims in an argument, but they can certainly analyze the argument for certain logical fallacies. Perhaps the fallacious section of the argument doesn't invalidate the argument as a whole. You can't trust this generic English-speaker to accurately make that determination, but they're certainly able to identify and remove a strawman, an ad hominem, etc.
  
  --
  It is pitch black. You are likely to be eaten by a grue.
16. Re:Seems reasonable by STRICQ · 2010-02-09 06:24 · Score: 2, Informative
  
  Careful, you are getting dangerously close to the conceited, "Holier than thou" attitude that many climate scientists are spewing out. You really don't know what you're talking about when you say the op doesn't know what he's talking about. I'm a software engineer, finding bugs, even when you don't know what the code is doing, is a lot easier than you would think.
17. Re:Seems reasonable by Explodo · 2010-02-09 06:29 · Score: 2, Interesting
  
  You seem to assume that your code is correct. What if, by allowing others to audit it, bugs were found that significantly altered the output. Wouldn't that be something that you'd be interested in? Or, what if you spent years working on your doctoral thesis but at the last second found an error in your software that was what allowed your results to be in line with your assumptions and theory work? Would you scrap your years of work, or would you ignore it since you're freaking tired of working on it and want to be done already? Now assume that the results of your work are used to set public policy somewhere down the road...would you be honest enough to stand up and say it was fraudulent?
18. Re:Seems reasonable by bmajik · 2010-02-09 06:29 · Score: 5, Insightful
  
  there are well funded lobby groups and others with too much time on their hand looking for ANYTHING that is wrong.
  Errors are only errors if they are reported by the "right" people?
  Do you want to know how many questions Linus Torvalds has answered for me? Zero.
  I actually _have_ gotten personal responses from Theo DeRaadt on some OpenBSD issues but they all have the general form of "you're not interesting, don't waste my time".
  Nevertheless, I rely on OpenBSD. The fact that Theo has neither the time nor the interest in having a deep meaningful conversation with me about his code neither changes the quality of his code nor prevents him from releasing every 6 months, on schedule.
  I don't think that there is an expectation that scientists stop doing their day jobs to do software support for people. I think there is an expectation that publicly funded research used to set public policy be easily available to all comers.
  I'm a bit frustrated by the apparent contradiction. For the first time perhaps in history in the USA, you have armchair folks trying to do technical audits of scientific tools, research, and publications -- for free.
  I thought the "normal" problem in America is that the population is too apathetic to care and too stupid to provide any critical analysis. And yet we see this happening more and more frequently and the climate-science establishment is circling the wagons instead of celebrating the fact that there are a handful of people that for once give a damn about interesting research tools and methods.
  I must concede that there are some downsides to discussing your opinions and findings with others: When people disagree with you, it ends up taking some of your time.
  
  --
  My opinions are my own, and do not necessarily represent those of my employer.
19. Re:Seems reasonable by bdwlangm · 2010-02-09 06:45 · Score: 2, Insightful
  
  If I find code that will cause heap corruption in your code (e.g. you wrote past the end of an array in C), then there is a bug in that code whether you do fluid simulations, or make 3D games. I worked as an undergraduate RA under some guys doing ocean modelling, and found several small bugs before I had the foggiest idea what most of the code was meant to do. Yes there will be many problems someone without your background can't find in your model, but that is not an argument for closed source science.
  A more important concern is that someone else who does have your background should have access to your code. That would be part of "peer review". Otherwise they're taking your computations on faith, with no way to reproduce.
20. Re:Seems reasonable by Pentagram · 2010-02-09 06:53 · Score: 2, Insightful
  
  You assume far too much. I don't trust an analysis of anything, by anyone, who doesn't know what they are actually looking at. In your example you can look and analyze but you don't need to understand what it is....
  If the code is freely available and so are the data used, what is stopping you rerunning the experiment with the same data if you find a bug? No analysis comes into it: if the results are significantly different, you can show that the program is running incorrectly.
  
  I'm seeing a pretty clear parallel between your view of how the code can be analyzed and the AGW ignoramus skeptic view of AGW science as a whole. I don't trust arguments for or against AGW that aren't by people with educations to demonstrate they at least *might* know what they are talking about.
  A mathematician could point out flaws in the calculations of climate science, a physicist could point out problems with the understanding of the physics, a chemist could point out issues with the understanding of the chemistry... you don't have to understand an entire issue to notice problems with a subset of the science. I speak as someone who accepts the majority expert view of climate change.
21. Re:Seems reasonable by bdwlangm · 2010-02-09 06:59 · Score: 3, Insightful
  
  just wait until some amateur gets a hold of the code, runs it, and claims that all global warming data is questionable because this model has a bug or produces weird output
  The onus is on the researcher to demonstrate/argue that for the inputs given the code produces meaningful results. If you don't like that then stop doing research with computations? Idiots can always misrepresent you, no matter how you publish. Most of us understand that simulations are limited.
  
  Second, it will waste the researchers time releasing the code and then responding to questions when people are like "lolz this code blows"
  What makes you think that there will be more people trying out that code and not understanding it, than currently there are people reading the paper and not understanding it? Personally I'm not going to waste my spare time downloading complex simulations that I know nothing about and try to invalidate them.
  
  That being said, it should definitely be available as a part of the peer review process if something is really called into question.
  So make it available and reference it in your paper. No one's asking you to tell everyone on the planet about it.
22. Re:Seems reasonable by Pentagram · 2010-02-09 07:05 · Score: 3, Insightful
  
  Maybe it's a bug that only pops up on certain inputs. Maybe the researcher knows this and avoids those inputs (or wrote the program without intending to go anywhere near the input range where the code fails). This seems fine to me...researcher needs a one-off set of statistics and writes some quick and dirty code that does it even if it isn't robust or even efficient.
  Sorry, but I wouldn't trust any code that fails on certain inputs!
  I can accept code that isn't efficient, that's just not necessary. I can accept bugs in peripheral code (such as an added-on GUI) but the code that actually does the science really should be as good as the scientist can write. If it has known bugs they should be fixed before any research is published that is based on the code.
  I speak as someone who has written code for scientific research.
  
  Releasing this code is probably bad for two reasons. If the researcher is not aware of bugs outside of the exact inputs they used, they probably aren't going to disclose them--just wait until some amateur gets a hold of the code, runs it, and claims that all global warming data is questionable because this model has a bug or produces weird output.
  Good. That means researchers will be more careful about the code they are writing, and we can all have more confidence in the science.
  
  I don't expect researchers to write great code for everything...it may be repetitive or inefficient but they can usually tell from the result (and comparing it to other models) whether or not something went wrong.
  
  Comparing it to other models? What if they are wrong too? Perhaps that's how they verified their results. Trying to tell if the program is correct from the results is even worse. You end up fixing bugs until the code produces the result you want.
  
  I know that I write code at work (IANAClimate Researcher) that is quite sloppy or wasteful because I just want to see what the result looks like (and will never run the program again)
  That's exploratory programming, and is quite fair enough (in fact I think people should do it more), but you shouldn't use such code to do anything important. Throw it away and start again.
23. Re:Seems reasonable by philipgar · 2010-02-09 07:18 · Score: 3, Insightful
  
  Actually, I'm pretty sure everyone is fairly close with the current data they're generating to prevent other groups from beating you out the door with your idea. The exceptions to this rule are when professors trust one another, and know that the other wouldn't use the information you're supplying them with to do the same research you are already working on.
  As a graduate student, you definitely don't want to share code you've developed immediately. You may spend 2 or 3 years of a PhD writing code, and get a couple papers out of it, but with the code base in place you plan on getting a handful more. More to the point, these papers become relatively easy to generate, because you spent those years developing the program that allows you to do it. Writing papers, generating results, analyzing them, etc. takes time, so you can't do everything at once. Releasing your code too early means other groups can do these other experiments, and you, the grad student who spent so many years setting up the code or experiment for them, still wouldn't be able to graduate, because you have not produced enough original research, and instead only developed the tools others used to pump out results.
  As a student nears graduation, they might be more willing to release their code, as then competition is less of a concern. Someone won't pick up your code and be releasing a paper based on it in 2 or 3 months; it just takes too long to get up to speed. However, the BIGGEST impediment to releasing software in academia is the support that you have to give to your software if anyone is going to use it. You first need to audit and clean up your code, a non-trivial task. You have to supply documentation on how to use the software, another non-trivial task, and then provide documentation on the basics of how it works etc. All of this stuff takes a lot of time, and doesn't tend to help a student graduate. Also, once code is released, there's an expectation that you'll be providing some level of help with questions. Granted, that rarely happens (as the author has gone on to do other things, and hasn't touched the code in years). It just becomes a difficult thing to do.
  Phil
24. Re:Seems reasonable by Urkki · 2010-02-09 07:48 · Score: 2, Insightful
  
  You assume far too much. I don't trust an analysis of anything, by anyone, who doesn't know what they are actually looking at. In your example you can look and analyze but you don't need to understand what it is....
  I'm seeing a pretty clear parallel between your view of how the code can be analyzed and the AGW ignoramus skeptic view of AGW science as a whole. I don't trust arguments for or against AGW that aren't by people with educations to demonstrate they at least *might* know what they are talking about.
  You're basically saying you're qualified to analyze and discuss a topic you do not understand simply because you know a language. That is just B.S.
  So if the scientist who wrote the computer model isn't a qualified software engineer and doesn't have intimate knowledge of the workings of processor architectures, computer languages and all that, then any results he gets using a computer program of his own making are not to be trusted?
  I think you just threw out a significant portion of latest science...
25. Re:Seems reasonable by Xyrus · 2010-02-09 08:12 · Score: 2, Insightful
  
  The discussion in scientific circles is constructive.
  The quasi/anti-science mad-dog drivel that makes up almost all the rest of the discussion is what they're circling the wagons about. It's like having Joe Sixpack come into your workplace screaming that your professional work is bullshit and you should be fired.
  And actually having people in power listen to him.
  ~X~
  
  --
  ~X~
26. Re:Seems reasonable by Troed · 2010-02-09 09:06 · Score: 2, Insightful
  
  Yes - but the fact that there are classes of errors (especially those pertaining to the construction of the model) that would be hard to find without domain knowledge does not invalidate the fact that you'll be able to find other classes of errors.
  Errors such as those detailed in the article.
  
  --
  it's in my head
27. Re:Seems reasonable by pod · 2010-02-09 09:14 · Score: 3, Insightful
  
  Exactly, although I echo the sentiment that the presentation could have been better.
  Everywhere we turn there are people who think they are smart telling us what to do and what to think, because they know what is best for us. They're the experts with years of training, and we know nothing. Do not question the high priests, do not pay attention to the man behind the curtain.
  This is just following the general trend of late, culminating in "this time, it's different, trust us". We think we're smarter, we're better, we have more tools, we have more knowledge, we have more insight, and that things are somehow fundamentally different, and that today we can fix all the problems that our predecessors have been unable to fix in centuries past. In the end, the more we "fix", the more we break.
  As a lay person, I know we cannot predict what the weather will be like next week, and all I see around me is global climate hysteria. I don't see science, I don't see deliberation, I don't see openness, I don't see debate. I see politics and dogma. Enough of this "you're not smart enough to understand so just trust me" nonsense. Enough of this "science by consensus". It doesn't exist, and it's not scientific anyways even if it did.
  Show everyone the science, open up the process, accept opposing data (heck, accept ALL legitimate data to begin with), interpretations and views, so we can all see why it is that we need to undertake a complete reorganization of economy, society and personal life, at a cost of trillions of dollars and undoubtedly much resulting misery and suffering.
  It was global cooling and visions of frozen wastelands and a new ice age. Where did that go? Then it was the ozone hole that would fry anyone not wearing SPF1000 sunblock. Where did that go? Then it was global warming and sea level rise that would make disaster movies seem like documentaries. Where did that go? Now we have the amorphous all-encompassing "climate change".
  But THIS TIME, it's different. Really. This time, we're smarter, and we have better science, and we've learned, and we know better, we know for sure. Trust us.
  Well, sorry. You're gonna have to do better than that.
  
  --
  "Hot lesbian witches! It's fucking genius!"
28. Re:Seems reasonable by wealthychef · 2010-02-09 09:21 · Score: 3, Insightful
  
  Just release your god damned code and don't worry about it. What are you afraid of? The sky will not fall. Your reputation will not crumble. Of course it's not perfect, duh. The point of releasing it is not to have people check for perfection, it's to see if there is a bug that could explain your surprising results. It's part of defending your results. Deal with it. I don't trust you.
  
  --
  Currently hooked on AMP
29. Re:Seems reasonable by CptNerd · 2010-02-09 09:30 · Score: 2, Interesting
  
  What it does is, it eliminates one possible cause of errors. Software that doesn't do bounds checking, for instance, is like uncalibrated measuring instruments. Writing 100 numbers into an array of 10 integers will cause 90 numbers to be written into memory past the end of the array, and you can't be guaranteed that they aren't affecting other parts of your model, including parts that have been calculated previously and which are now overwritten by false values. I saw something just like this when converting a legacy communications package from Fortran to C. All through the code the previous programmers had defined 16 character strings and were writing 256 characters into them, due to a change in one constant that wasn't used to define the array bounds. Fortunately the problem caused the C code to crash, but the Fortran code would occasionally produce strange results, caused by this coding error.
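
  A minimal sketch of that kind of mismatch, written in Python for brevity rather than the Fortran or C of the original package. Every name here (MAX_MESSAGE_LEN, BUFFER_LEN, checked_write, try_write) is hypothetical and only illustrates the idea: the buffer size was hard-coded instead of being derived from the constant that controls how much gets written, and an explicit bounds check turns silent corruption into a loud error.

```python
# Hypothetical illustration: the write length follows one constant, but the
# buffer size was hard-coded separately and never updated.

MAX_MESSAGE_LEN = 256   # the constant that was later changed
BUFFER_LEN = 16         # hard-coded size that nobody tied to the constant


def checked_write(buffer: bytearray, data: bytes) -> None:
    """Copy data into buffer, refusing to write past its end."""
    if len(data) > len(buffer):
        raise ValueError(
            f"write of {len(data)} bytes exceeds buffer of {len(buffer)} bytes"
        )
    buffer[: len(data)] = data


def try_write(buffer_len: int, message_len: int) -> str:
    """Return 'ok' if the write fits, or the error text if it is rejected."""
    buffer = bytearray(buffer_len)
    try:
        checked_write(buffer, b"x" * message_len)
    except ValueError as err:
        return str(err)
    return "ok"
```

  Under these assumptions, try_write(BUFFER_LEN, MAX_MESSAGE_LEN) reports the oversized write instead of letting it through, while sizing the buffer from the same constant, try_write(MAX_MESSAGE_LEN, MAX_MESSAGE_LEN), succeeds.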
  
  I've been a programmer for 30 years, and a science geek for longer than that, and I would assume that a scientist would be in favor of eliminating as many potential errors as possible in the instruments they use, whether the instruments are hardware or software.
  
  --
  By the taping of my glasses, something geeky this way passes
30. Re:Seems reasonable by mathfeel · 2010-02-09 09:37 · Score: 2, Interesting
  
  Your argument is void. A bug is a bug. Either it affects the outcome of the program run or it doesn't - and I still don't need to know anything about what it's supposed to do to verify that. You just need to re-run the program with a specified set of inputs and check the output - also known as verified against its own test suite.
  Unlike many pure-software cases, scientific simulation can and MUST be checked against theory/simplified model/asymptotic behavior. The latter requires specialized understanding of the underlying science. The kind of coding bug you are talking about will usually (not always) result in a damningly unphysical result, which would then immediately prompt any good student to check for the said bug. Heck, my boss usually refuses to look at my code while I am working on it (besides advising on general structure) so that even if my code got the expected result, he can still perform an independent inspection.
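
  A minimal sketch of checking a simulation against theory, assuming a toy problem that is not from any real model: exponential decay, dy/dt = -k*y, which has the exact solution y0*exp(-k*t). The function names and parameter values are made up for illustration. The check has two parts: the numerical answer should be close to the analytic one, and halving the step size should shrink the error by roughly the factor the method's order predicts (about 4 for a second-order scheme).

```python
import math


def simulate_decay(y0: float, k: float, t_end: float, steps: int) -> float:
    """Integrate dy/dt = -k*y with the explicit midpoint (second-order) method."""
    dt = t_end / steps
    y = y0
    for _ in range(steps):
        y_mid = y + 0.5 * dt * (-k * y)   # half step to the midpoint
        y = y + dt * (-k * y_mid)         # full step using the midpoint slope
    return y


def exact_decay(y0: float, k: float, t_end: float) -> float:
    """Analytic solution of the same equation."""
    return y0 * math.exp(-k * t_end)


def relative_error(steps: int) -> float:
    """Relative error of the simulation for fixed, arbitrary test parameters."""
    approx = simulate_decay(1.0, 0.5, 4.0, steps)
    exact = exact_decay(1.0, 0.5, 4.0)
    return abs(approx - exact) / exact
```

  A coding slip in the update step (a wrong sign, a dropped factor of 0.5) would typically show up as a failed convergence ratio even when the final number still looks plausible.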
  
  --
  The only possible interpretation of any research whatever in the 'social sciences' is: some do, some don't
31. Re:Seems reasonable by quanticle · 2010-02-09 10:06 · Score: 2, Interesting
  
  A more important concern is that someone else who does have your background should have access to your code. That would be part of "peer review". Otherwise they're taking your computations on faith, with no way to reproduce.
  I fully agree. Perhaps something that scientific journals could do is to create a source code repository that allows researchers to publish the source code used to create the results along with the results themselves. At the very least, other researchers would be able to look at the code and see if there are any glaring errors or omissions.
  
  --
  We all know what to do, but we don't know how to get re-elected once we have done it
32. Re:Seems reasonable by Xyrus · 2010-02-09 10:55 · Score: 2
  
  This is the reason why scientists feel they are wasting their time engaging the public.
  You clearly already have your mind set in stone. You've got the whole thing figured out backed by 100% solid conspiracy theories. No amount of data, review, validation or verification will change your mind.
  It doesn't matter if the code is open. You don't care. You're not going to look at it because it is so much easier to wrap yourself in a blanket of your own crazy world view and discredit anything that might just impact your comfy existence.
  In fact, the arctic could melt completely and 20% of Florida could go underwater and you'd still deny anything was happening or that anything should be done about it.
  Scientists would just be wasting valuable resources in dealing with people like yourself. Just like democrats would be wasting their time talking at a Tea Party convention, or Republicans would be wasting their time at a MoveOn.org convention.
  You will not be convinced. Ever. Like most followers of McIntyre and Watts. It doesn't matter what the scientists do it will never be enough. Even God itself would not convince you otherwise.
  Fortunately, the research will continue with or without your support.
  ~X~
  
  --
  ~X~
33. Re:Seems reasonable by Thiez · 2010-02-09 11:31 · Score: 2, Insightful
  
  > Then it was the ozone hole that would fry anyone not wearing SPF1000 sunblock. Where did that go?
  We stopped using the CFCs that were identified as a major contributor to the problem and it appears that is working. Oh sorry, I don't think that supports your argument.
34. Re:Seems reasonable by ChrisMaple · 2010-02-09 14:21 · Score: 3, Insightful
  
  Something like a climate model has a very exclusive audience
  The final audience of a climate model is (economically) every person alive. If the models are as good as some climatologists claim, the final audience is every living thing on earth.
  Making their code public doesn't mean they have to answer their phone. But they're going to have to answer to someone if it can be shown that their code deliberately produced false results, as was the case with the "hockey stick" scandal.
  
  --
  Contribute to civilization: ari.aynrand.org/donate
35. Re:Seems reasonable by apoc.famine · 2010-02-09 18:37 · Score: 2, Insightful
  
  Spoken like a Software Engineer!
  
  A bug isn't just a bug. Either it affects the outcome of the program run or it doesn't. The issue is that if you don't know what the outcome should be, you won't be able to tell. Nobody in scientific computing just "re-run(s) the program with a specified set of inputs and check(s) the output". The input is 80% of the battle. We just ran across a paper which showed that the input can often explain 80%+ of the variance in the output of models similar to the one we use.
  
  So there's our dilemma - what we feed the model is very, very, VERY limited. If something crashes or returns an anomalous result when fed a string instead of an integer, we'll never notice. Why? Because we'll NEVER feed it a string. If all the climatological data we get to feed the model is from NCAR reanalysis, we'll make damn sure the model can handle that data input. Might there be serious issues if another format is fed to it? Sure. But that will probably never happen.
  
  Scientific programming is garbage, by and large. Perform a code audit on it, and you'll find a lot of bugs. But largely, the parts that are in active use are relatively bug free. Why? Because we compare our output with that of other modeling groups. In my office there are two posters comparing seven models from seven different universities. I can tell you who treats oceanic uptake of carbon the same as our group does, and who treats it differently. If one model was a major outlier, we'd have identified that, and asked them what code they use to calculate oceanic carbon uptake.
  
  This is science, not Software Engineering. We troubleshoot and find bugs by comparing OUTPUT, not CODE. It's only when we find that output is significantly different that we look to code to figure out why. It's akin to having 7 browsers all try to render a page. If 5 of them render the same thing, one is close, and one doesn't look anything like the others, your first guess is to take a look at what that one oddball is doing. The same goes for scientific code.
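
  The output comparison described above can be sketched roughly as follows. This is an illustrative stand-in, not any group's actual procedure: the model names and numbers are invented, and the median absolute deviation (MAD) test is just one common way to single out the value that disagrees with the rest.

```python
from statistics import median


def flag_outliers(outputs: dict[str, float], threshold: float = 3.0) -> list[str]:
    """Return the names whose value lies more than `threshold` MADs from the median."""
    values = list(outputs.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # All but a minority agree exactly; anything different is flagged.
        return [name for name, v in outputs.items() if v != center]
    return [name for name, v in outputs.items() if abs(v - center) / mad > threshold]


# Invented numbers standing in for one output quantity from seven models.
example_outputs = {
    "model_a": 2.1,
    "model_b": 2.3,
    "model_c": 2.2,
    "model_d": 2.0,
    "model_e": 2.4,
    "model_f": 2.2,
    "model_g": 5.9,
}
```

  With these made-up values only model_g is flagged. Note what the check does and does not show: it identifies the model that disagrees with the others, but agreement among the rest says nothing about whether they share a common error.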
  
  The people writing it aren't software engineers, by a long shot. But if they really screw up, everybody knows. It's not through a code audit - it's because their output doesn't match either what's observed in nature, or what other models output. Would rigorous code audits make our code better? Sure. Is CS volunteering to come do it for us? No. Would we have the time to deal with their nit-picking? No. We validate output, not code. And largely, it works.
  
  --
  Velociraptor = Distiraptor / Timeraptor
36. Re:Seems reasonable by Troed · 2010-02-09 21:59 · Score: 2, Insightful
  
  Sorry, no. You're just displaying your ignorance above. You cannot look at the output and say that just because it fits with your preconceived notions it's therefore correct. You do not know if you have problems in a Fahrenheit to Celsius conversion, a truncation when casting between units etc (yes, examples chosen on purpose). You might get a result that's in the right ballpark. You might believe you have four significant digits when you only have three. Your homebrewed statistical package might not have been audited by a statistician etc.
  You simply do not know all the things you claim above that you do know.
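
  A minimal sketch of the "right ballpark" problem, using invented readings and hypothetical function names. The buggy variant converts to an integer too early and then uses integer division, so every value loses its fractional part, yet the averaged result still looks like a perfectly reasonable temperature.

```python
from statistics import mean


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Correct conversion, done entirely in floating point."""
    return (temp_f - 32.0) * 5.0 / 9.0


def fahrenheit_to_celsius_truncating(temp_f: float) -> int:
    """Buggy variant: truncates to an integer, then uses integer division."""
    return int(temp_f - 32.0) * 5 // 9


# Invented readings in degrees Fahrenheit.
readings_f = [57.2, 58.1, 59.0, 59.9]

mean_correct = mean(fahrenheit_to_celsius(t) for t in readings_f)
mean_buggy = mean(fahrenheit_to_celsius_truncating(t) for t in readings_f)
```

  For these readings the two means differ by about half a degree Celsius: small enough to pass an eyeball check of the output, large enough to matter if the quantity of interest is a trend of a few tenths of a degree.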
  
  --
  it's in my head
37. Re:Seems reasonable by TheTurtlesMoves · 2010-02-09 23:22 · Score: 2, Interesting
  
  And when your code output does not match theirs, it's a bug in your code... because you know, we know it's not a bug in our code. Trust me! To replicate the results code should be available. It is a requirement to provide the source in many journals already.
  
  Science does not require trust. It requires transparency. Closed source is not transparent.
  
  --
  The Grey Goo disaster happened 3 billion years ago. This rock is covered in self replicating machines!
38. Re:Seems reasonable by Troed · 2010-02-10 02:53 · Score: 2, Insightful
  
  If I write a program to model ocean currents, and it spits out a map of oceans very, very similar to what's been well observed in the ocean, I can assume my code is good enough.
  No. As long as you believe that, you're not doing science.
  
  --
  it's in my head
great! by StripedCow · 2010-02-09 02:47 · Score: 3, Insightful

Great!
I'm getting somewhat tired of reading articles where there is little or no information regarding program accuracy, total running time, memory used, etc.
And in some cases, I'm actually questioning whether the proposed algorithms actually work in practical situations...

--
If Pandora's box is destined to be opened, *I* want to be the one to open it.
More to the point, people increasingly don't by aussersterne · 2010-02-09 02:49 · Score: 4, Insightful

seem to understand the very idea of scientific methods or processes, or the reasoning behind empiricism and careful management of precision.
It's a failure of education, not so much in science education, I think, as in philosophy. Formal and informal logic, epistemology and ontology, etc. People appear increasingly unable to understand why any of this matters and they essentialize the "answer" as always "true" for any given process that can be described, so science becomes an act of creativity by which one tries to create a cohesive narrative of process that arrives at the desired result. If it has no intrinsic breaks or obvious discontinuities, it must be true.
If another study that contradicts it also suffers from no breaks or discontinuities, they're both true! After all, everyone gets to decide what's true in their own heart!

--
STOP . AMERICA . NOW
1. Re:More to the point, people increasingly don't by bsDaemon · 2010-02-09 03:05 · Score: 4, Insightful
  
  I think a lot of it has to do not just with failures in education, but also with the way science (in particular, but everything in general) is reported in the media. One week a study saying coffee will kill you gets reported, then a couple of days later a story saying another study says coffee will make you immortal is reported on, both with equal voracity, neither with expert commentary or perspective. C+ students who look good on camera banter back and forth about it, laughing jocularly, and through their own dismissal and misunderstanding pass that attitude on to their viewers.
  
  It's come to the point where many, many people just dismiss the whole business of science. "They can't even make up their minds!" they say, as if the point of science is to make up one's mind. Of course, this is where the failure of education to actually educate comes into play. Classical liberalism has been turned over, spanked and made into the servant of corporate mercantilism and we're all just now supposed to sit down and shut up. Science is, in its essence, a libertarian (note small 'l') pursuit through which one questions all authority, up to and including the fabric of existence itself -- all assumptions are out the window and any that cannot pass muster are done away with.
  
  But, just like socio-political anarchism (libertarian socialism), the spirit of rebellion and anti-authoritarianism inherent in science has been packaged and sold in a watered down and safe-for-children package at the local shopping mall only to be taken out of the box when the powers that be feel that they can use it for their own purposes. Not to be a downer or anything, it's just I really do think this is bigger than just science. It's to do with people willingly leading themselves as sheep to the slaughter on behalf of the farmer to make the dog's job easier.
2. Re:More to the point, people increasingly don't by phantomfive · 2010-02-09 04:36 · Score: 2, Interesting
  
  people increasingly don't seem to understand the very idea of scientific methods or processes, or the reasoning behind empiricism and careful management of precision.
  Surprisingly, not true. In fact, it's getting better, despite what Idiocracy claims.
  
  An easy way to see this is to compare High School Musical to Grease. Both of them were roughly the same movie, separated by a few decades. In Grease, the smart kids were shown as dorks, and the cool kids were the ones who were most likely to drop out of school. In High School Musical, the 'brainiac' kids weren't portrayed as better or worse than the jocks, just different. So perceptions are changing.
  
  Mainly I don't know when this golden time period was that everyone understood formal and informal logic, epistemology, and ontology. At least now, most everyone understands [citation needed]; twenty years ago most people had trouble with that (and Wikipedia spoils me: now when I read a newspaper I keep wanting to find the link to click on for the citation of their assertions).
  
  --
  Qxe4
Stuff like Sweave by langelgjm · 2010-02-09 02:49 · Score: 3, Interesting

Much quantitative academic and scientific work could benefit from the use of tools like Sweave, which allows you to embed the code used to produce statistical analyses within your LaTeX document. This makes your research easier to reproduce, both for yourself (when you've forgotten what you've done six months from now) and others.
What other kinds of tools like this are /.ers familiar with?

--
"Anyone who [rips a CD] is probably engaging in copyright infringement." - David O. Carson
1. Re:Stuff like Sweave by xtracto · 2010-02-09 03:51 · Score: 2, Interesting
  
  Should there be a universal language,
  It is called Z notation. I have seen it used in several articles and in at least one book on multi-agent systems.
  
  --
  Ubuntu is an African word meaning 'I can't configure Debian'
2. Re:Stuff like Sweave by John+Hasler · 2010-02-09 04:06 · Score: 2, Insightful
  
  > This raises the question in what programming language the scientific code
  > should be published.
  The one it was written in. What should be published is the exact code that was compiled and run to generate the data. Think of it as similar to making the raw data available.
  
  --
  Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
3. Re:Stuff like Sweave by ColdWetDog · 2010-02-09 06:33 · Score: 2, Funny
  
  I think it should be Perl. Then it would be uniformly incomprehensible which would level the playing field. Nothing else would be as fair.
  
  --
  Faster! Faster! Faster would be better!
It should be released and under a free licence! by bramp · 2010-02-09 02:50 · Score: 3, Interesting

I've always been a big fan of releasing my academic work under a BSD licence. My work is funded by the taxpayers, so I think the taxpayers should be able to do what they like with my software. So I fully agree that all software should be released. It is not always enough to just publish a paper, but you should release your code so others can fully review the accuracy of your work.
About time! by sackvillian · 2010-02-09 02:50 · Score: 5, Informative

The scientific community needs to get as far as we can from the policies of companies like Gaussian Inc., who will ban you and your institution for simply publishing any sort of comparative statistics on calculation time, accuracy, etc. from their computational chemistry software.
I can't imagine what they'd do to you if you started sorting through their code...

--
Hey mate, spare a sig?
1. Re:About time! by je+ne+sais+quoi · 2010-02-09 04:11 · Score: 2, Informative
  
  One thing to point out is that there are now plenty of open source codes available for doing things similar to Gaussian, so it can be avoided with relative ease. Two that come to mind are the Department of Energy funded codes: NWChem for ab initio work and LAMMPS for molecular dynamics. I use the NIH funded code VMD for visualization. The best part about those codes is that they're designed to be compiled using gcc and run on Linux, so you can get off the non-open source software train altogether if you wish.
  
  --
  Gentlemen! You can't fight in here, this is the war room!
Re:Why release it? by ShadowRangerRIT · 2010-02-09 02:50 · Score: 2, Insightful

Please apply Hanlon's razor before leaping to conspiracy theories. Or Occam's razor might inform you that a conspiracy among thousands of scientists is a highly improbable occurrence; look for a solution that doesn't involve a perfect lid of secrecy among a group of (frequently) socially inept people.

--
$_ = "wftedskaebjgdpjgidbsmnjgcdwatb"; tr/a-z/oh, turtleneck Phrase Jar!/; print
Engineering Course Grade = F by BoRegardless · 2010-02-09 02:51 · Score: 4, Interesting

One significant figure?
1. Re:Engineering Course Grade = F by natoochtoniket · 2010-02-09 03:30 · Score: 2, Interesting
  
  That actually surprised me, too. Loss of precision is nothing new. When you use floats to do the arithmetic, you can lose precision in each operation, and particularly when you add or subtract two numbers with different scales (exponents). The thing that surprised me was not that a calculation could lose precision. It was the assertion that any precision would remain, at all.
  Numeric code can be written using algorithms that minimize loss of precision, or that are able to quantify the amount of precision that is lost (and that remains) in the final answers. But, if you don't use those algorithms, or don't use them correctly and carefully, you really cannot assert _any_ precision in the result.
  If you know your confidence interval, you can state your result with confidence. But, if you don't bother to calculate the confidence interval, or if you don't know what a CI is, or if you are not careful, it usually ends up being plus-or-minus 100 percent of the scale.
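
  A small sketch of the effect, using only the Python standard library. The helper name naive_sum and the sample values are chosen for illustration. A plain running total can lose a small term entirely when it is added to a much larger one, while math.fsum, which tracks the rounding error of each partial sum, recovers the correctly rounded result.

```python
import math


def naive_sum(values):
    """Left-to-right running total, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total


# The 1.0 is smaller than the spacing between doubles near 1e16,
# so adding it to 1e16 leaves the running total unchanged.
mixed_scale = [1.0e16, 1.0, -1.0e16]

lost = naive_sum(mixed_scale)      # the 1.0 is rounded away
kept = math.fsum(mixed_scale)      # error-compensated summation keeps it

# Repeated addition of a value that is not exactly representable in binary.
tenths = [0.1] * 10
drifted = naive_sum(tenths)
compensated = math.fsum(tenths)
```

  The same idea scales up: over millions of accumulations in a long model run, uncompensated rounding can eat significant figures unless the algorithm is chosen to control it.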
2. Re:Engineering Course Grade = F by khayman80 · 2010-02-09 03:53 · Score: 2, Informative
  
  Yes, sounds like someone didn't read What Every Computer Scientist Should Know About Floating-Point Arithmetic.
MaDnEsS ! by Airdorn · 2010-02-09 02:53 · Score: 4, Funny

What? Scientists showing their work for peer-review? It's MADNESS I tell you. MADNESS !
This is not science. by Coolhand2120 · 2010-02-09 02:59 · Score: 4, Insightful

"Why should I make the data available to you, when your aim is to try and find something wrong with it"
-Prof. Jones CRU
1. Re:This is not science. by Cyberax · 2010-02-09 03:01 · Score: 2, Insightful
  
  His colleague was _sued_ (by a crank) based on released FOIA data. It might explain a certain reluctance to disclose data to known trolls.
2. Re:This is not science. by Idiot+with+a+gun · 2010-02-09 03:10 · Score: 5, Insightful
  
  Irrelevant. If you can't take some trolls, maybe you shouldn't be in such a controversial topic. The accuracy of your data is far more significant than your petty emotions, especially if your data will be affecting trillions of dollars worldwide.
3. Re:This is not science. by jgtg32a · 2010-02-09 03:50 · Score: 2, Insightful
  
  Shit like this is why I'm hesitant about going along with Climate Change. I'm in no way qualified to review scientific data, but I can tell when someone is shady, and I don't trust shady people.
4. Re:This is not science. by crmarvin42 · 2010-02-09 04:01 · Score: 4, Interesting
  
  1) Do you seriously think that the whole climate science depends on one scientist's data?
  No, but his work does include suggestions that regulators pay close attention to based on his status within the community. If he were posting on this very same topic, but was not being used as a primary source by regulators then I could see your point. However, that is not the case and theoretical situations are not really relevant.
  
  2) CRU was trolled by FOIA requests. They are a nuisance to deal with, as far as I was told.
  Then hire someone to handle them for you, or have grad students do it.
  
  I could say the same thing about publishing and peer review. It's a major PITA to get formatting done just right, making sure that those outside of my small sphere of research can understand what I did without getting lost in all of the jargon. Suck it up! It is an unfortunate, but necessary part of doing research at a public institution.
  
  3) Scientists are people, people have emotions. That's why peer review is used.
  Not sure what this has to do with anything. Peer review is valuable and necessary, but it has never pretended to be about accuracy of the data. It's about cleaning up the presentation so that it is clear, reproducible, and free from OBVIOUS error.
  
  As a reviewer, I don't know what exactly was done, but if a list of numbers that should add up to 100 instead adds up to 120, then I can catch that. Whether the problem is due to a typo, or sloppy data fabrication, or a computer error is not something I can ascertain. I have to trust that the authors' explanation and fix are true and accurate, in which case I am trusting that they are honest, competent and attentive. The more of their data and methodology that they expose to scrutiny, the less faith I have to have and the more I can ascertain for myself directly.
  
  --
  Bureaucracy expands to meet the needs of the expanding bureaucracy.-Oscar Wilde
5. Re:This is not science. by acoustix · 2010-02-09 04:27 · Score: 5, Insightful
  
  "Why should I make the data available to you, when your aim is to find something wrong with it?"
  That used to be what Science was. Of course, that was when truth was the goal.
  
  --
  "A plan fiendishly clever in its intricacies"- Homer Simpson
6. Re:This is not science. by ae1294 · 2010-02-09 04:52 · Score: 3, Insightful
  
  1) Do you seriously think that the whole climate science depends on one scientist's data?
  Irrelevant, if you use public money to do your research your boss gets all that work.
  
  2) CRU was trolled by FOIA requests. They are a nuisance to deal with, as far as I was told.
  Irrelevant, FOIA requests are part of the deal when you take public money. Don't like it? Don't take public money. The whole idea that FOIA requests can be labeled troll sounds like a very bad idea. I for one don't want to start hearing the government claim that the EFF are trolls and thus are ignoring their FOIA requests.
  
  3) Scientists are people, people have emotions. That's why peer review is used.
  Irrelevant, ???
Slashdot Egocentrism. by stewbacca · 2010-02-09 03:00 · Score: 2, Insightful

My bet is there is a simple explanation...namely that scientists outside of computer science are too busy in their respective fields to know anything about code, or even care. The egocentric Slashdot-worldview strikes at the heart of logic yet again.
1. Re:Slashdot Egocentrism. by quadelirus · 2010-02-09 03:22 · Score: 2, Interesting
  
  Unfortunately computer science is pretty closed off as well. Too few projects end up in freely available open code. It hinders advancement (because large departments and research groups can protect their field of study from competition by having a large enough body of code that nobody else can spend the 1-2 years required to catch up) and it hinders verifiability (because they make claims in papers about speed/accuracy/whatever and we basically have to take it on their word and reputation and whether it SEEMS plausible--this also means that surprising results from lesser known researchers might be less likely to get published).
  
  I think it is our duty as scientists to ALWAYS release the code, even if it is uncommented and unclean. I'm very glad to be researching under an advisor who requires that we always release our code as open source after papers have been published so that other groups can build on what we've done. This should absolutely be universal.
2. Re:Slashdot Egocentrism. by FlyingBishop · 2010-02-09 03:26 · Score: 2, Insightful
  
  What's your point? If a Biologist has no understanding of code, they have no business running a simulation of an ecological system. If a physicist has no understanding of code, they have no business writing software to simulate atomic processes. If a Geneticist has no understanding of code, they have no business writing software that does pattern matching across genes.
  Those who don't want to write software to aid in their research may continue not to do so (and continue to lose relevance). But if they're going to use software, they have to use best practices. Doing otherwise will likewise make their work quickly fade in relevance.
3. Re:Slashdot Egocentrism. by AlXtreme · 2010-02-09 03:41 · Score: 3, Interesting
  
  > that scientists outside of computer science are too busy in their respective fields to know anything about code, or even care.
  If their code results in predictions that affect millions of lives and trillions of dollars, perhaps they should learn to care.
  What I've personally seen of scientists is a frantic determination to publish papers anywhere and everywhere, no matter how well-founded the results in those papers are. The IPCC-gate is merely a symptom of a deeper problem within scientific research.
  If scientists are too busy because of publication quotas and funding issues to focus on delivering proper scientific research, maybe we should question our current means of supporting scientific research. Currently we've got quantity, but very little quality.
  
  --
  This sig is intentionally left blank
4. Re:Slashdot Egocentrism. by Rising Ape · 2010-02-09 04:05 · Score: 4, Insightful
  
  Nonsense, they're not trying to produce code, they're trying to produce science. It doesn't matter how ugly the code is, or how inefficient, as long as it produces correct answers. Since software engineering "best practices" seem to change every week (and do not prove program correctness in any case), what are they supposed to do, spend huge amounts of time learning as much as a professional software engineer would? Do you do that for all the tools you use?
  Does anyone have any evidence that the code is *wrong*? I.e. does it actually produce significantly wrong answers? I suspect not - this is just the latest FUD-spreading trick.
  This is just typical programmer "when your tool's a hammer" mentality. Software's not the most important thing in the world, and science has better ways to verify correctness - have several independent analyses of the same thing for example, or different ways of measuring the same thing to check for consistency.
5. Re:Slashdot Egocentrism. by c_sd_m · 2010-02-09 04:35 · Score: 2, Interesting
  
  > What I've personally seen of scientists is a frantic determination to publish papers anywhere and everywhere, no matter how well-founded the results in those papers are. The IPCC-gate is merely a symptom of a deeper problem within scientific research.
  
  They're trained for years on a publish or perish doctrine. Either they have enough publications or they get bounced out of academia at some threshold (getting into a PhD, getting a post doc, getting a teaching position, getting tenure, ...). Under that pressure you end up with just the people who churn out lots of papers making it into positions of power. In some fields you're also expected to pull in significant research funding and there are few opportunities to do so without corporate partnerships. So if you're going to fund students to publish papers, you need to accept limits on what you can publish. The only alternative is to leave the field.
  There's no shortage of problems with the research community these days.
Peer Review vs. Funding by stokessd · 2010-02-09 03:04 · Score: 2, Informative

I got my PhD in fluid mechanics funded by NASA, and as such my findings are easily publishable and shared with others. My analysis code (such as it was) was and is available for those who would like to use it. More importantly my experimental data is available as well.
This represents the classical pure research side of research where we all get together and talk about our findings and there really aren't any secrets. But even with this open example, there are still secrets when it comes to ideas for future funding. You only tip your cards when it comes to things you've already done, not future plans.
But more importantly, there are whole areas of research that are very closed off. Pharma is a good example. Sure there are lots of peer reviewed articles published and methods discussed, but you'll never really get into their shorts like this guy wants. There's a lot that goes on behind that curtain. And even if you are a grad student with high ideals and a desire to share all your findings, you may find that the rules of your funding prevent you from sharing.
Sheldon
1. Re:Peer Review vs. Funding by PhilipPeake · 2010-02-09 03:54 · Score: 4, Insightful
  
  ... and this is the problem. The move from direct government grants to research to "industry partnerships".
  Well, (IMHO) if industry wants to make use of the resources of academic institutions, they need to understand the price: all the work becomes public property. I would go one step further, and say that one penny of public money in a project means it all becomes publicly available.
  Those that want to keep their toys to themselves are free to do so, but not with public money.
That's all wrong by Gadget_Guy · 2010-02-09 03:06 · Score: 2, Interesting

The scientific process is to invalidate a study if the results cannot be reproduced by anyone else. That way you can eliminate all potential problems like coding errors, invalid assumptions, faulty equipment, mistakes in procedures, and 100 of the other things that can produce dodgy results.
It can be misleading to search through the code for mistakes when you don't know which code was eventually used in the final results (or in which order). I have accumulated quite a lot of snippets of code that I used to fix a particular need at the time. I am sure that many of these hacks were ultimately unused because I decided to go down a different path in data processing. Or the temporary tables used during processing are no longer around (or are in a changed format since the code was written). There is also the problem of some data processing being done by commercial products.
It's just too hard. The best solution is to let science work the way it has found to be the best. Sure you will get some bad studies, but these will eventually be fixed over time. The system does work, whether vested interests like it or not.
Conspiracy? by Coolhand2120 · 2010-02-09 03:06 · Score: 2, Insightful

Nobody said conspiracy, just plain crappy code. You don't need a conspiracy if you are "trying to prove" something, your crappy code spits out what you want to see and you run with it. You just need plain old incompetence.
1. Re:Conspiracy? by obarthelemy · 2010-02-09 03:13 · Score: 4, Insightful
  
  Yes and no. Which assertion do you think more probable:
  1- "These are not the desired results. Check your code".
  2- "These are the desired results. Check your code".
  No conspiracy, but a conspiracy-like end result.
  
  --
  The Cloud - because you don't care if your apps and data are up in the air.
2. Re:Conspiracy? by bunratty · 2010-02-09 03:23 · Score: 2, Insightful
  
  Let's think through what would really happen if scientists released their code. The code has bugs, as all code does. People with an ulterior motive would point to the bugs and say "Look here! A bug! The science cannot be trusted!" And millions of sheeple would repeat "Yes! The code has bugs! And therefore I refuse to believe it!" It won't matter whether the bugs are relevant to the science; the fact that there are any bugs at all will cause people who want to disagree to say there's doubt about the results. Meanwhile, they will go about their business using computer systems that are riddled with bugs, but function well enough the vast majority of the time they're not even aware of the bugs.
  
  --
  What a fool believes, he sees, no wise man has the power to reason away.
3. Re:Conspiracy? by crmarvin42 · 2010-02-09 03:41 · Score: 4, Insightful
  
  And then they fix the bug and either...
  
  A. The results change, thus indicating that the bug was important in some way. In this case, fixing the bug gained us not only silencing the critics, but improving our understanding.
  
  or
  
  B. The results don't change, thus indicating that the bug, while still a bug, was not important to the final result. In this case, we've fixed a bug that the critics were using as a banner, and shown that they were mistaken about its importance. We don't get the improved understanding, but we do get a chance to politely say STFU to the more vocal/less qualified critics.
  
  Either way looks like win/win to me.
  
  --
  Bureaucracy expands to meet the needs of the expanding bureaucracy.-Oscar Wilde
4. Re:Conspiracy? by xtracto · 2010-02-09 03:47 · Score: 3, Informative
  
  Agreed 100%.
  You would not believe the amount and crappy quality of the code produced during "research projects", especially when the research is in a field completely unrelated to Comp. Sci. or Soft. Eng.
  I have personally seen software related to Agronomy, Biology (Ecology) and Economics. The problem with a lot of that code is that sometimes researchers want to use the power of computers (say, for simulation) but do not know how to code, so they read a bit about some programming language and implement their program as they are learning.
  The result? You can imagine.
  
  --
  Ubuntu is an African word meaning 'I can't configure Debian'
5. Re:Conspiracy? by bunratty · 2010-02-09 04:10 · Score: 2, Insightful
  
  From recent events, I think both A and B are wrong. When an error is pointed out in research that shows AGW is happening, people use that error as an excuse not to believe any research that AGW is happening, even years after the error is corrected. When an error is pointed out in the IPCC report about a minor effect of climate change, people use that error to doubt all effects of climate change. Correcting the errors or pointing out they don't change the results will not silence the critics. It will only make the critics claim that their opinion is being suppressed even though the science has been indisputably proven to be flawed and therefore cannot be trusted!
  
  --
  What a fool believes, he sees, no wise man has the power to reason away.
6. Re:Conspiracy? by pavon · 2010-02-09 05:14 · Score: 2, Interesting
  
  Yes, there are stubborn idiots that will believe what they want regardless of the evidence. There are self-entitled people that complain no matter how good of a service you provide. There are unreasonable assholes in this world.
  However, since nothing I do will appease them, why should I give a moment's consideration to them whatsoever? I am going to base my actions on what will best convince/serve the reasonable people, on top of what makes the best science. Hiding data and not being responsive to criticisms is counterproductive to those goals.
  Case in point. The recent inclusion of data that had not been peer reviewed in the IPCC report didn't convince me that everything in the report was garbage, but it meant that everything in there had to be weighed on its own merits, as I couldn't trust the vetting process done by the IPCC. It didn't discredit climate change itself, but it did undermine the ability of the IPCC to act as a credible distiller of the state of climate change research.
  These are the issues that you need to be concerned about, not how the ideologues and pundits are going to react.
one error will invalidate a computer program?!?!? by Anonymous Coward · 2010-02-09 03:11 · Score: 2, Insightful

As it is written, the editorial is saying that if there is any error at all in a scientific computer program, the science is usually invalid. What a lot of bull hunky! If this were true, then scientific computing would be impossible, especially with regards to programs that run on Windows.
Scientists have been doing great science with software for decades. The editorial is full of it.
Not that it would be bad for scientists to make their software open source. And not that it would be bad for scientists to benefit from some extra QA.
I concur by dargaud · 2010-02-09 03:16 · Score: 4, Interesting

As a software engineer who has spent 20 years coding in research labs, I can say with certainty that the code written by many, if not most, scientists is utter garbage. As an example, a colleague of mine was approached recently to debug a piece of code: "Oh, it's going to be easy, it was written by one of our postdocs on his last day here...". 600 lines of code in the main, no functions, no comments. He's been at it for 2 months.
I'm perfectly OK with the fact that their job is science and not coding, but would they go to the satellite assembly guys and start gluing parts at random ?

--
Non-Linux Penguins ?
1. Re:I concur by Rising Ape · 2010-02-09 04:13 · Score: 2, Insightful
  
  > 600 lines of code in the main, no functions, no comments
  Does that make it function incorrectly?
  Looking pretty and being correct are orthogonal issues. Code can be well-structured but wrong, after all.
2. Re:I concur by Rising Ape · 2010-02-09 04:23 · Score: 3, Insightful
  
  >So, while it is perfectly understandable that, say, physicists can't spend 5 years learning CS, at the very least they should be made aware that it requires trained people to write sane code and that they must hand the job to specialists, and spend their valuable time doing what they're skilled at.
  And where will they get these specialists, and who will pay for them?
  Add the overhead of explaining exactly what the code is supposed to do, and the fact that the specialist won't know the physics purpose of it all, and I wouldn't be surprised if there were more errors this way, not fewer. Most science code is fairly short, so all the fuss about "structured programming" (or is it OOP these days?) isn't as important.
Observations... by kakapo · 2010-02-09 03:17 · Score: 4, Informative

As it happens, my students and I are about to release a fairly specialized code - we discussed license terms, and eventually settled on the BSD (and explicitly avoided the GPL), which requires "citation" but otherwise leaves anyone free to use it.
That said, writing a scientific code can involve a good deal of work, but the "payoff" usually comes in the form of results and conclusions, rather than the code itself. In those circumstances, there is a sound argument for delaying any code release until you have published the results you hoped to obtain when you initiated the project, even if these form a sequence of papers (rather than insisting on code release with the first published results)
Thirdly, in many cases scientists will share code with colleagues when asked politely, even if it is not in the public domain.
Fourthly, I fairly regularly spot minor errors in numerical calculations performed by other groups (either because I do have access to the source, or because I can't reproduce their results) -- in almost all cases these do not have an impact on their conclusions, so while the "error count" can be fairly high, the number of "wrong" results coming from bad code is overestimated by this accounting.
Code isn't good enough. by FlyingBishop · 2010-02-09 03:19 · Score: 2, Interesting

Back in college, I did some computer vision research. Most people provided open source code for anyone to use. However, aside from the code being of questionable quality, it was mostly written in Matlab with C handlers for optimization.
In order to properly test all of the software out there you would need:
1. A license for every version of Matlab.
2. Windows
3. Linux
4. Octave
I had our school's Matlab, but none of the code we found was written on that version. Some was Linux, some Windows, (the machine I had was a Windows box with Matlab) consequently we had to play with Cygwin...
I mean, basically, you need to distribute a straight-up VM if you want your results to be reproducible. (which naturally rules out Windows or Matlab or anything else proprietary being at the core.)
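Short of shipping a whole VM, a lighter first step is to record the environment a result was produced in and publish that record alongside the code. The Python sketch below is only an illustration of that idea, not something from the projects mentioned above: the function names and the output path are hypothetical, and it captures only what the Python standard library can see (it knows nothing about Matlab or Octave versions).

```python
import json
import platform
import sys


def capture_environment():
    """Collect basic facts about the machine and interpreter running an analysis."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "argv": list(sys.argv),
    }


def write_environment(path):
    """Save the environment record as JSON next to the results (path is hypothetical)."""
    record = capture_environment()
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)
    return record


if __name__ == "__main__":
    # Print the record instead of writing a file, so the sketch has no side effects.
    print(json.dumps(capture_environment(), indent=2, sort_keys=True))
```

A record like this does not make the code run elsewhere, but it at least tells the next person which OS and interpreter version they are trying to match.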
Also true for CS research by DoofusOfDeath · 2010-02-09 03:22 · Score: 2, Interesting

I'm working on my dissertation proposal, and I'd like to be able to re-run the benchmarks that are shown in some of the papers I'm referencing. But most of the source code for those papers has disappeared into the aether. Without their code, it's impossible for me to rerun the old benchmark programs on modern computers so that I and others can determine whether or not my research has uncovered a better way of doing things. This is very far from the idealized notion of the scientific method, and significantly calls into question many of the things that we think we know based on published research.
Not a good idea by petes_PoV · 2010-02-09 03:23 · Score: 5, Insightful

The point about reproducible experiments is not to provide your peers with the exact same equipment you used - then they'd get (probably / hopefully) the exact same results. The idea is to provide them with enough information so that they can design their own experiments to measure the same things and then to analyze their results to confirm or disprove your conclusions.
If all scientists run their results through the same analytical software, using the same code as the first researcher, they are not providing confirmation, they are merely cloning the results. That doesn't give the original results either the confidence that they've been independently validated, or that they have been refuted.
What you end up with is no-one having any confidence in the results - as they have only ever been produced in one way - and arguments that descend into a slanging match between individuals and groups of vested interests who try to "prove" that the same results show they are right and everyone else is wrong.

--
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
1. Re:Not a good idea by petes_PoV · 2010-02-09 06:00 · Score: 2, Insightful
  
  > Experiments produce results
  Errrm, experiments produce data. It's the analysis of that data plus the insight and knowledge of the analysts and scientists that turn it into results. The problem is that if everyone uses the same software they'll never notice any systemic failures in the processing it performs.
  
  --
  politicians are like babies' nappies: they should both be changed regularly and for the same reasons
2. Re:Not a good idea by Rising Ape · 2010-02-09 08:24 · Score: 2, Insightful
  
  >Do it the same way you do it in industry: document the model number of the oscilloscope, the firmware revision and every important setting you can get your hands on.
  There's no reason on earth why you'd even do that. Just say "the voltage was measured to be x +- y". Results from science experiments should *not* depend on specifics of equipment any more than they should depend on a specific scientist. In fact, the wider the variety of equipment, code and analysis methods used to measure the same thing, the better - it makes the result more robust.
  In your example, both people should recheck their results independently, perhaps try different methods, even do another experiment.
  There are some situations where seeing the code is useful, but only after all other methods to reproduce the result have failed. Sharing code is just inviting common errors.
  In your hypothetical scenario below, the result could be reproduced by writing a new program to do the same thing.
Nothing to do with CS by nten · 2010-02-09 03:39 · Score: 2, Insightful

I am suspicious of the interface reference. Are they counting things where an enumeration got used as an int, or there was an implicit cast from a 32bit float to a 64bit one? From a recent TV show "A difference that makes no difference is no difference." Stepping back a bit there will be howls from OO/Functional/FSM zealots that look at a program and declare its inferior architecture, lack of maintainability etc. indicate its results are wrong. These are programs written to be run once to turn one set of data into a more understandable and concise one. A truth test set run through it is good enough, they don't need ISO-compliant, triply refactored, perfectly architectured code to get the right answer. I don't think any of my CS profs would have cared about such inane drivel; they barely paid attention to what language we each picked to solve the assignment in. My software engineering prof would have yelled about comment density and coding standards compliance, but I consider that a different discipline primarily applicable to widely used and/or safety critical code.
*However*
Keeping track of digit precision through a calculation isn't CS, it's fundamental grade school science. That is only one step from forgetting to do unit analysis for a sanity check. If they are forgetting that, they are probably also not looking at numerical conditioning, or trying to get by with doubles when they need bignums. None of this is CS egocentrism, it's stuff we learn in math and science courses.
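For anyone who hasn't watched significant figures vanish, here is a small Python sketch of the kind of precision loss meant above. It is a textbook case of catastrophic cancellation chosen purely for illustration, not taken from any of the code Hatton analysed; the function names are made up. Both functions compute (1 - cos x) / x^2, whose true value approaches 0.5 as x shrinks.

```python
import math


def naive(x):
    """(1 - cos x) / x^2 computed directly; subtracts two nearly equal numbers."""
    return (1.0 - math.cos(x)) / (x * x)


def stable(x):
    """Same quantity via the identity 1 - cos x = 2 sin^2(x/2); no cancellation."""
    s = math.sin(x / 2.0)
    return 2.0 * s * s / (x * x)


if __name__ == "__main__":
    # As x shrinks, the naive form loses digits while the rewritten form does not.
    for x in (1e-1, 1e-4, 1e-8):
        print(f"x={x:g}  naive={naive(x):.15f}  stable={stable(x):.15f}")
```

The two forms are algebraically identical, so no amount of code review for style would flag the first one. You only catch it by tracking conditioning, which is the math-class skill being argued for here.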

--
refactor the law, its bloated, confusing and unmaintainable.
Not that simple by khayman80 · 2010-02-09 03:47 · Score: 3, Interesting

I'm finishing a program that inverts GRACE data to reveal fluctuations in gravity such as those caused by melting glaciers. This program will eventually be released as open source software under the GPLv3. It's largely built on open source libraries like the GNU Scientific Library, but snippets of proprietary code from JPL found their way into the program years ago, and I'm currently trying to untangle them. The program can't be made open source until I succeed because of an NDA that I had to sign in order to work at JPL.
It's impossible to say how long it will take to banish the proprietary code. While working on this project, my research is at a standstill. There's very little academic incentive to waste time on this idealistic goal when I could be increasing my publication count.
Annoyingly, the data itself doesn't belong to me. Again, I had to sign an NDA to receive it. So I can't release the data. This situation is common to scientists in many different fields.
Incidentally, Harry's README file is typical of my experiences with scientific software. Fragile, unportable, uncommented spaghetti code is common because scientists aren't professional programmers. Of course, this doesn't invalidate the results of that code because it's tested primarily through independent verification, not unit tests. Scientists describe their algorithms in peer-reviewed papers, which are then re-implemented (often from scratch) by other scientists. Open source code practices would certainly improve science, but he's wrong to imply that a single bug could have a significant impact on our understanding of the greenhouse effect.
Re:Don't use that word by harvey the nerd · 2010-02-09 03:53 · Score: 3, Interesting

Real scientists don't use simulators with incomplete equations and fudge factors to match highly manipulated historic data to "prove" their case with game machines that have no predictive capability or other external validation. That simply is not the way you build a valid fundamentals based model starting from the equations of motion. IPCC reports previously noted whole terms in the equations' energy terms that were inadequately described or represented, then have done no research to fill the terms, modellers just zeroing them out or putting in small constants for significant *variables*. These are not real scientists, their processes and practices have been clearly shown to be antithetical to valid science.

These models are just primitive speculative tools, often reflecting personal biases in data selection and derivation, NOT fundamental equations. The models are NOT valid physics data or experiments.

On prediction failure, Hansen's 1988 "A,B,C" forecasts of rising temperature are rapidly diverging from the cooling we are actually experiencing right now, where case C assumed we massively limited CO2 also. Missed the side of a barn with a shotgun, tsk, tsk, tsk.
Peer Review / publication process by Wardish · 2010-02-09 03:55 · Score: 2, Insightful

As part of publication and peer review all data and provenance of the data as well as any additional formulas, algorithms, and the exact code that was used to process the data should be placed online in a neutral holding area.
Neutral area needs to be independent and needs to show any updates and changes, preserving the original content in the process.
If your data and code (readable and compilable by other researchers) isn't available then peer review and reproduction of results is foolish. If you can't look in the black box then you can't trust it.
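One cheap way to let a holding area prove that the archived data and code are the exact ones used is to publish a checksum manifest with the paper. The Python sketch below is a minimal illustration under my own assumptions: the file names and contents are hypothetical in-memory stand-ins, and SHA-256 is simply a common choice, not something the proposal above specifies.

```python
import hashlib
import json


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of some bytes as a hex string."""
    return hashlib.sha256(payload).hexdigest()


def build_manifest(files: dict) -> str:
    """Map each file name to the SHA-256 of its bytes; return stable JSON text."""
    digests = {name: sha256_hex(data) for name, data in sorted(files.items())}
    return json.dumps(digests, indent=2, sort_keys=True)


# Hypothetical stand-ins for a data file and the analysis script that read it.
files = {
    "measurements.csv": b"t,value\n0,1.0\n1,1.5\n",
    "analysis.py": b"print('placeholder analysis')\n",
}
manifest = build_manifest(files)

if __name__ == "__main__":
    print(manifest)
```

Any later edit to the data or the code changes its digest, so updates show up as a new manifest rather than silently replacing the original content.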

--
Ward

. Silence! Be thankful thy species is unpalatable! .
Re:Seems reasonable - up to a point by Jaydee23 · 2010-02-09 04:11 · Score: 2, Interesting

Code should be released, but this should not be confused with replicating scientific results. If you want to replicate research, you need to write your own code according to the methods described in the research. Your answer then needs to match the original to test the code.
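As a toy illustration of "write your own and check that the answers match", the Python sketch below computes the same statistic two independent ways and compares them within a tolerance. Everything in it is assumed for the example: the data, the function names, and the tolerance are made up, and a real replication would compare against the published numbers rather than a second function in the same file.

```python
import math


def stddev_two_pass(values):
    """Sample standard deviation: compute the mean first, then squared deviations."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def stddev_welford(values):
    """Same statistic via Welford's one-pass running update."""
    count, mean, m2 = 0, 0.0, 0.0
    for v in values:
        count += 1
        delta = v - mean
        mean += delta / count
        m2 += delta * (v - mean)
    return math.sqrt(m2 / (count - 1))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
a = stddev_two_pass(data)
b = stddev_welford(data)

if __name__ == "__main__":
    # Agreement within a tolerance is the check; bit-for-bit equality is not expected.
    print(a, b, math.isclose(a, b, rel_tol=1e-9))
```

If the two disagree beyond rounding, at least one implementation (or the written description of the method) is wrong, which is exactly the information a copied program can never give you.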
Re:Don't use that word by ArcherB · 2010-02-09 04:57 · Score: 4, Interesting

Scientists need to realize that if they're going to get public support, they really need to be very careful with their choice of wording. Like it or not, the scare mongers, and I mean scare mongers in the sense that there are people who are trying to scare folks into believing that Global Warming is some sort of wealth redistribution scheme by the socialists, are going to use any hint, real or not, that scientists are making up their findings.
Scare mongers? Let's take a look at some of these "hints" that scientists are making up their findings. From May 7, 2002

Dozens of mountain lakes in Nepal and Bhutan are so swollen from melting glaciers that they could burst their seams in the next five years and devastate many Himalayan villages, warns a new report from the United Nations.
From January 17, 2010:

In the past few days the scientists behind the warning have admitted that it was based on a news story in the New Scientist, a popular science journal, published eight years before the IPCC's 2007 report.
It has also emerged that the New Scientist report was itself based on a short telephone interview with Syed Hasnain, a little-known Indian scientist then based at Jawaharlal Nehru University in Delhi.
Hasnain has since admitted that the claim was "speculation" and was not supported by any formal research.
Do I need to pull the quotes that claim NY and Florida will be underwater?
As for the "fear mongers" saying that GW is a socialist wealth redistribution scheme.

Some officials from the United States, Britain and Japan say foreign-aid spending can be directed at easing the risks from climate change. The United States, for example, has promoted its three-year-old Millennium Challenge Corporation as a source of financing for projects in poor countries that will foster resilience. It has just begun to consider environmental benefits of projects, officials say.
Industrialized countries bound by the Kyoto Protocol, the climate pact rejected by the Bush administration, project that hundreds of millions of dollars will soon flow via that treaty into a climate adaptation fund.

Strange. When did Rush and Hannity start writing for the NY Times?

--
There is no "I disagree" mod for a reason. Flamebait, Troll, and Overrated are not substitutes.
Precisely by Sycraft-fu · 2010-02-09 05:02 · Score: 3, Insightful

The more important the research, the larger the item under study, the more rigorous the investigation should be, the more carefully the data should be checked. This isn't just for public policy reasons but for general scientific understanding reasons. If your theory is one that would change the way we understand particle physics, well then it needs to be very thoroughly verified before we say "Yes, indeed this is how particles probably work, we now need to reevaluate tons of other theories."
So something like this, both because of the public policy/economic implications and the general understanding of our climate, should be subject to extreme scrutiny. Now please note that doesn't mean saying "Look this one thing is wrong so it all goes away and you can't ever propose a similar theory again!" However it means carefully examining all the data, all the assumptions, all the models and finding all the problems with them. It means verifying everything multiple times, looking at any errors, any deviations and figuring out why they are there and if they impact the result and so on.
Really, that is how science should be done period. The idea of strong empiricism is more or less trying to prove your theory wrong over and over again, and through that process becoming convinced it is the correct one. You look at your data and say "Well ok, maybe THIS could explain it instead," and test that. Or you say "Well my theory predicts if X happens Y will happen, so let's try X and if Y doesn't happen, it's wrong." You show your theory is bulletproof not by making sure it is never shot at, but by shooting at it yourself over and over and showing that nothing damages it.
However that this process is done right becomes more important the bigger the issue is. If you aren't right on a theory that relates to migratory habits of a sub species of bird in a single state, ok well that probably doesn't have a whole lot of wider implications for scientific understanding, or for the way the world is run. However if you are wrong on your theory of how the climate works, well that has a much wider impact.
Scrutiny is critical to science, it is why science works. Science is all about rejecting the ideas that because someone in authority said it, it must be true, or that a single rigged demonstration is enough to hang your hat on. It is all about testing things carefully and figuring out what works, and what doesn't.
Only if... by captainpanic · 2010-02-09 05:03 · Score: 2, Insightful

Only if the real programmers out there promise to be nice to us scientists.
Most scientists will know a lot about, well, science... but not much about writing code or optimizing code.
Like my scripts. All correct, all working... lots of formulas... but probably a horribly inefficient way to calculate what I need. :-)
The last thing I need is someone to come to me and tell me that the outcome is correct but that my code sucks.
(And no, I am not interested in a course to learn coding - unless it's a 1-week crash course).
Comment removed by account_deleted · 2010-02-09 05:13 · Score: 2, Insightful

Comment removed based on user account deletion
It's an old story by jc42 · 2010-02-09 05:14 · Score: 4, Informative

> This is hugely worrying when you realise that just one error -- just one -- will usually invalidate a computer program.
Back in the 1970s, a bunch of CompSci guys at the university where I was a grad student did a software study with interesting results. Much of the research computing was done on the university's mainframe, and the dominant language of course was Fortran. They instrumented the Fortran compiler so that for a couple of months, it collected data on numeric overflows, including which overflows were or weren't detected by the code. They published the results: slightly over half the Fortran jobs had undetected overflows that affected their output.
The response to this was interesting. The CS folks, as you might expect, were appalled. But among the scientific researchers, the general response was that enabling overflow checking slowed down the code measurably, so it shouldn't be done. I personally knew a lot of researchers (as one of the managers of an inter-departmental microcomputer lab that was independent of the central mainframe computer center). I asked a lot of them about this, and I was appalled to find that almost every one of them agreed that overflow checking should be turned off if it slowed down the code. The mainframe's managers reported that almost all Fortran compiles had overflow checking turned off. Pointing out that this meant that fully half of the computed results in their published papers were wrong (if they used the mainframe) didn't have any effect.
Our small cabal that ran the microcomputer lab reacted to this by silently enabling all error checking in our Fortran compiler. We even checked with the vendor to make sure that we'd set it up so that a user couldn't disable the checking. We didn't announce that we had done this; we just did it on our own authority. It was also done in a couple of other similar department-level labs that had their own computers (which was rare at the time). But the major research computer on campus was the central mainframe, and the folks running it weren't interested in dealing with the problem.
It taught us a lot about how such things are done. And it gave us a healthy level of skepticism about published research data. It was a good lesson on why we have an ongoing need to duplicate research results independently before believing them.
It might be interesting to read about studies similar to this done more recently. I haven't seen any, but maybe they're out there.

--
Those who do study history are doomed to stand helplessly by while everyone else repeats it.
For climatology, this is a non-issue by azgard · 2010-02-09 05:17 · Score: 2, Informative

While I am a fan of open source and this idea in general, for climatology, this is a non-issue. Look here: http://www.realclimate.org/index.php/data-sources/
There's more code out there than one amateur can chew through in a lifetime. And you know what? From the experience of the people who wrote these programs, there aren't actually many people looking at it. I doubt that any scientific code will get many eyeballs. This is more a PR exercise.
Models & Algorithms by drooling-dog · 2010-02-09 05:39 · Score: 2, Insightful

It seems to me that what's important is the theory being modeled, the algorithms used to model it, and of course the data. The code itself isn't really useful for replicating an experiment, because it's just a particular - and possibly faulty - implementation of the model and as such is akin to the particular lab bench practices that might implement a standard protocol. Replicating a modeling experiment should involve using - and writing, if necessary - code that implements the model the original investigators intended to implement, but distinct from that which they actually used.
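The idea of replicating with independently written code can be sketched as follows. This is only an illustration under stated assumptions: the "model" is plain population variance, the two functions stand in for two teams' implementations, and the names `variance_one_pass`, `variance_welford`, and `agree`, along with the tolerance and the data, are all made up for the example.

```python
import math


def variance_one_pass(xs):
    """Textbook formula E[x^2] - (E[x])^2, which is prone to cancellation."""
    n = len(xs)
    mean = sum(xs) / n
    mean_sq = sum(x * x for x in xs) / n
    return mean_sq - mean * mean


def variance_welford(xs):
    """Welford's running update, written independently of the formula above."""
    mean = 0.0
    m2 = 0.0
    for count, x in enumerate(xs, start=1):
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    return m2 / len(xs)


def agree(a, b, rel_tol=1e-6):
    """Report whether two independent results match within a tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-12)


small = [1.0, 2.0, 3.0, 4.0]
shifted = [x + 1e9 for x in small]  # same spread, large offset

print(agree(variance_one_pass(small), variance_welford(small)))      # True
print(agree(variance_one_pass(shifted), variance_welford(shifted)))  # False
```

Both functions implement the same model, yet they disagree on the shifted data because the one-pass formula loses its significant figures to cancellation. Rerunning either implementation on its own would never reveal that; comparing two distinct implementations does.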
Running the same code on the same data demonstrates very little, and finding bugs in the original code tells you nothing about what results would/should have been achieved had the model been implemented correctly. But of course it's great for throwing stones and "discrediting" a result without actually adding anything constructive to the issue at hand.
1. Re:Models & Algorithms by Troed · 2010-02-09 06:13 · Score: 2, Interesting
  
  Wrong. You're taking two separate issues and trying to claim that, since there are two, one is irrelevant. Of course it's not - both are relevant. However, verifying the model DOES take more domain knowledge than verifying the implementation. We're currently discussing verifying the implementation, which is still important.
  
  --
  it's in my head
Arrogant Scientists Are Not Project Managers by Shannon+Love · 2010-02-09 05:50 · Score: 2, Insightful

I hate to break it to you but all programming is highly specialized. Climatology is in no way special in this regard.
Neither do programmers have to understand the abstract model of the program to write it or evaluate it. The vast majority of professional programmers do not understand the abstract model of the code they create. You do not have to be a high-level accountant to write corporate accounting software and you don't have to be a doctor to write medical software. Most programmers spend most of their time implementing models created by non-programmers from fields of which the programmers have no detailed knowledge.
Does that mean that programmers can't spot crappy code just because they don't understand the details of the model? No, it does not. Most software errors don't arise from the model but from sloppy practices in the management of the software project itself. An experienced programmer doesn't even have to know the language of the project to see that its creation and maintenance were incompetently handled.
You don't have to be a climatologist to know that the CRU software was utter crap that would produce sound outputs only by divine intervention. For any experienced programmer, it was immediately obvious that it was a great reeking gob of amateur coding with no structure, no plan and no standards. In my experience, most scientific software is like the CRU software. It evolves in an ad hoc manner over many years with no governing organizational structure.
Commercial software developers have created a wide range of tools and procedures to manage large, vital projects. In the main, scientists use none of these tools, and most of them appear unaware that they even exist, much less why and when they are needed. As a result, most scientific software project management is completely amateurish. If most scientific software were written for commercial applications, the developers would be sued or imprisoned for fraud.
Scientists tend to be arrogant and dismissive of the work of others, especially those who work in the commercial sector. You believe that because you understand climatology, you therefore understand all the tools you are using. Well, you don't. You think that because no one can understand your abstract model, they therefore cannot find significant errors in your code. Well, they can. You think we should reengineer our entire civilization based on your unquestioned and unexamined computerized ivory tower auguries.
Well, we won't.
You're just going to have to suck it up and withstand at least the same scrutiny we give important commercial software.
Re:Don't use that word by bunratty · 2010-02-09 06:38 · Score: 3, Informative

On prediction failure, Hansen's 1988 "A,B,C" forecasts of rising temperature are rapidly diverging from the cooling we are actually experiencing right now, where case C assumed we massively limited CO2 also
In the past ten years, we've seen warming of 0.18 degrees Celsius, which is less than the 0.25 degrees Celsius that was predicted, but it certainly hasn't been cooling. This is why the Arctic ice and Antarctic ice are melting. Yes, stop the presses, the globe is warming!

--
What a fool believes, he sees, no wise man has the power to reason away.