UK University Researchers Must Make Data Available
Sara Chan writes "In a landmark ruling, the UK's Information Commissioner's Office has decided that researchers at a university must make all their data available to the public. The decision follows from a three-year battle by mathematician Douglas J. Keenan, who wants the data to do his own analysis on it. The university researchers have had the data for many years, and have published several papers using the data, but had refused to make the data available. The data in this case pertains to global warming, but the decision is believed to apply to any field: scientists at universities, which are all public in the UK, can now not claim data from publicly-funded research as their private property."
There's more at the BBC, at Nature Climate Feedback, and at Keenan's site.
The public pays for gathering the data, the public should have access to that data. Kinda hard to find fault with that.
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
no, peer review is good. It helps to point out mistakes or inconsistencies. Getting rid of scientific journals is quasi-good (less profit motive in science, but also less chance to get work out there).
"There is a way that seems right to a man, but its end is the way of death." Proverbs 16:25 (NKJV)
On the other hand, this will likely produce a whole stream of deliberately inaccurate analyses with ulterior motives behind them.
But with the data public, it'll be easier to shoot them down for picking, choosing, skewing, and what else.
There is no reason why this kind of data should ever be "secret"
Phil Willis, a Liberal Democrat MP and chairman of the Science and Technology Select Committee, said that scientists now needed to work on the presumption that if research is publicly funded, the data ought to be made publicly available.
That doesn't seem unreasonable to me. Appendices with raw data are often included already in the online editions of journals. Of course, if the ruling applies to all data generated in the course of a study, whether it is used in publications or not, it could be onerous indeed.
Errr... not always.
Putting data into the hands of people who aren't experts often leads to bad things. See every non-expert who believed the Wakefield study because they didn't understand how to interpret data. In that case kids died, and kids are still dying.
In principle I agree with you, but we live in an era where everyone thinks they are a qualified expert in anything. That simply isn't true, and no good will come out of this.
The data won't show a flaw in the study because it wasn't used, but he will inevitably cherry-pick data to 'prove' the study is wrong. And people like Hannah Devlin are always happy to publish claims without proper study. So no good can come from this, and people need to understand that.
It's a hard problem to solve.
The Dunning-Kruger effect explains most posts on
If people cannot replicate your results it isn't science.
And with Climate Science part of the process is showing how you collected and interpreted the data. If you are not willing to share the raw data so other researchers can attempt to replicate your methods and results then don't bother publishing.
As opposed to the proselytizers who are funded by the NGOs and the new "Green" capitalists and rent-seekers.
One of the more interesting bits of the Climategate emails showed that Mann was happy to share his data EXCEPT to people who he thought would disagree with his methods and results.
And in this case Mann was also the recipient of the tree ring data, showing again that if you agreed with the owner's ideas he had no problem getting you copies of what you needed.
Opening the data up for free access means that other groups, who have more interest in scooping than being right, have more ability to do that scooping. That leaves the people who did the work in the cold.
That is not hard to achieve: someone has to make an FoI request, the cost to prepare the data has to be estimated, someone has to get hired to collect and format the data and then the data is released. That can take a considerable amount of time.....but that's not the only issue. In my field of particle physics raw data is generally useless unless you understand how it was collected and how to analyse it.
Even assuming that you had several petabytes of disk/tape available to store it, raw data from ATLAS would be completely useless to you unless you really understand the detector "warts and all". Trying to understand this data without access to the detector itself and the ability to test and cross-check ideas looking at (and sometimes carefully tweaking) the hardware is literally impossible....and that is before you get into the thorny international issues about who did what and so whether it falls under any one country's laws.
These issues were discussed on a previous experiment I worked on in the US and the conclusion was that it did not serve the public to have data released in just about any form: the raw data was useless and even the processed data still had considerable "quirks" which required understanding (e.g. acceptance drops at detector boundaries etc.). This was aptly demonstrated by a pilot project which resulted in no interest at all from the public but which worryingly attracted a few nutters who were more interested in proving their pet theory than in doing science.
So while I am very sympathetic to the "the public paid for it the public should be able to access it" argument I do not think that the public's interest is best served by releasing raw data in all (most?) cases. The best way to serve the public interest is to ensure that results and ideas arising from that research are freely available to all and allow the public to build on that.
Simply generating massive amounts of data isn't considered science - figuring out what it means is. I say this as someone who is very good at generating data quickly, but not particularly good at interpreting it.
Spot on. I have a PhD in Comp. Sci. (Multi-Agent Systems / Market Based Control). One of the things you learn (maybe in your university degree courses or in your first paper presentation) is that data does not mean *anything*; what matters is the interpretation of such data.
Nevertheless, I am of the opinion that programs used for the generation / manipulation of such data should also be free / open to scrutiny. Especially those developed during the research, as they are also paid for with taxpayers' money.
In the field I am working in now (agent-based computational economics) a lot of people do these so-called agent-based simulations, then they write a nice paper about what their simulations showed and try to publish it. The problem is that they keep their code! In that respect they are definitely removing a good chunk of the "methods" part of their research. It is absolutely impossible to duplicate that work without the code.
Ubuntu is an African word meaning 'I can't configure Debian'
That is indeed an issue. Presumably the methodology is already published, as is the rule for scientific papers.
There is at least one case, involving two climate research papers, where what the methodologies claimed was impossible because the data to do it didn't even exist. This didn't come out for 16 years, and was only discovered because a FOI request was finally honored.
In this case, the authors of the papers had claimed that the station data that they used was from stations that had "few, if any, changes in instrumentation, location or observation times." (quote from one paper) and "selected stations have relatively few, if any, changes in instrumentation, location, or observation times" (quote from the other paper)
"Hey! We only used great data!"
Now, these two authors used the same data, and one of these authors was actually a co-author of the other paper. These authors are Jones (hello Climategate) and Wang.
Now, they finally sourced the data as being from the Chinese Academy of Sciences, which coincidentally had co-published a report with the US Department of Energy at about the same time as those two research papers, stating quite specifically that DATA OF THAT QUALITY DID NOT EXIST. The report was specifically about the quality of the Chinese climate record.
Both papers concluded that the Urban Heat Island effect was minimal. Too bad that they didn't actually have data good enough to draw that conclusion. They said they did, tho.
None of this would have come out if it wasn't for the Freedom of Information Act. Jones and Wang both obstructed the release of the data (denying FOI requests, etc) for nearly 2 decades.
This all came out several years ago, but the media didn't give a fuck. They did care about hacked emails tho. Go figure. Now, as it turns out, it probably wasn't Jones who was lying his ass off. Wang was a co-author on Jones's paper and supplied the "data." Jones gets credit for having his email hacked.
"His name was James Damore."