Slashdot Mirror


UK University Researchers Must Make Data Available

Sara Chan writes "In a landmark ruling, the UK's Information Commissioner's Office has decided that researchers at a university must make all their data available to the public. The decision follows from a three-year battle by mathematician Douglas J. Keenan, who wants the data to do his own analysis on it. The university researchers have had the data for many years, and have published several papers using the data, but had refused to make the data available. The data in this case pertains to global warming, but the decision is believed to apply to any field: scientists at universities, which are all public in the UK, can now not claim data from publicly-funded research as their private property." There's more at the BBC, at Nature Climate Feedback, and at Keenan's site.

86 of 352 comments

  1. Sudden Outbreak of Common Sense by nacturation · · Score: 5, Insightful

    The public pays for gathering the data, the public should have access to that data. Kinda hard to find fault with that.

    --
    Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
    1. Re:Sudden Outbreak of Common Sense by nacturation · · Score: 2, Informative

      Now if only the same rules were applied to the fraudsters who promote evolutionism...

      Responding to a troll, I know... but if you really want the data on evolution (as opposed to foaming at the mouth and making up words to make yourself feel better about the mythology you chose that tells you that faith is when you blindly believe while being unable to show any data [Hebrews 11:1, bitches]): http://talkorigins.org/

      --
      Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
    2. Re:Sudden Outbreak of Common Sense by c++0xFF · · Score: 2, Insightful

      Opening the data will encourage further research. The data will be available for others to use, instead of forcing constant duplication.

      "Standing on the shoulders of giants" means to build on what has been done before. Hiding the source data shows just how "little" you are.

    3. Re:Sudden Outbreak of Common Sense by NeutronCowboy · · Score: 4, Insightful

      Creationists regularly mangle papers, taking quotes out of context and all.

      Get ready for an onslaught of mangled data analysis, with data being taken out of context, the results published to some blog, and people making policy decisions based on those blog postings.

      the media will focus on the new controversies this will spawn

      That's a guarantee. While in theory, I welcome this development, I suspect that in practice it will lead to more chaos than before. Not because the data is shoddy, but because some meteorologist will think that running a data set through an excel curve fitting algorithm is science.

      --
      Those who can, do. Those who can't, sue.
    4. Re:Sudden Outbreak of Common Sense by smoothnorman · · Score: 2

      There is no question that having the data released eventually should be the rule. It shouldn't even be considered proven science until it can be thoroughly recreated. However, the tricky bit is mandating exactly by when it must be released. If a lab has spent a long time, let's say 10 years, accumulating some hard fought data, they should be allowed the benefit of a few publications before releasing all the data, so that better (likely privately) funded labs don't do the easy rapid analysis and 24/7 postdoc tag-team writing abuse and thus steal all the reward. Give them say... the sqrt(years_to_collect_the_data) out of encouragement to continue to do the heavy lifting. (my experience of this situation comes from protein crystallography and deposition of the hard won data there)
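
      As a rough worked example of the square-root rule floated in this comment, the sketch below turns a collection time into an embargo length. The function name embargo_years is hypothetical and this is only an illustration of the arithmetic, not any actual funding or journal policy.

```python
import math


def embargo_years(years_to_collect: float) -> float:
    """Hypothetical embargo length: the square root of the collection time in years."""
    if years_to_collect < 0:
        raise ValueError("collection time cannot be negative")
    return math.sqrt(years_to_collect)


# Ten years of data collection would earn a little over three years of exclusive use.
print(round(embargo_years(10), 2))  # 3.16
```

      Under this rule the exclusive period grows much more slowly than the collection effort: a 25-year dataset would get 5 years, not 25.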

    5. Re:Sudden Outbreak of Common Sense by jacksonj04 · · Score: 4, Insightful

      You know the multi-billion dollar LHC? Guess what they did their first physics on. Not finding new exotic particles, but proving that what we think we know so far still stands up. Duplicating data is exactly how things get proven and disproven. If Group A and Group B use exactly the same source data there's no possibility of Group B proving Group A's research wrong.

      --
      How many people can read hex if only you and dead people can read hex?
    6. Re:Sudden Outbreak of Common Sense by Anonymous Coward · · Score: 5, Insightful

      some meteorologist will think that running a data set through an excel curve fitting algorithm is science.

      Nope -- it's only science if you adjust and filter the data first to make it match your truth. Resist releasing your data though, others may adjust and filter it other ways to make it match their truth. All science in the world of research driven by political agendas and egotistical arrogance.

      Disclose, when in doubt disclose more. Anything less in scientific arenas where others can't repeat your experiments is just a symptom of fear, insecurity, and lack of confidence that your conclusions will stand up to the view and study of many brains (some better than yours, some worse).

      Same argument for why FOSS is better - many eyes reviewing (in theory) and rapid fixes.

    7. Re:Sudden Outbreak of Common Sense by maxume · · Score: 3, Informative

      Even worse, some hack might shove the data through some perl code:

      http://www.timesonline.co.uk/tol/news/uk/article7028418.ece

      --
      Nerd rage is the funniest rage.
    8. Re:Sudden Outbreak of Common Sense by interkin3tic · · Score: 3, Insightful

      some meteorologist will think that running a data set through an excel curve fitting algorithm is science.

      Nope -- it's only science if you adjust and filter the data first to make it match your truth.

      I don't think that's what he was saying. He's saying this will lend itself to overly simplistic interpretations. Which is a good prediction in climatology, considering what people got out of "climategate."

    9. Re:Sudden Outbreak of Common Sense by thepike · · Score: 5, Insightful

      I totally agree. If people just start looking at each other's data instead of verifying it, a lot of mistakes (or fraudulent data) will never be caught.

      Also, I have to wonder what the timeline for releasing data is. My research is funded with government money (NIH and NSF) but it can take years to get enough data to make a worthwhile paper. If I have to release my data before then it will hurt my ability to publish papers without getting scooped. You could end up with a whole closet industry of people just data mining the data others have had to disclose. And, here's the main catch, if you don't have to release results you haven't yet reported on, the problem isn't solved at all because I could just choose to "not yet publish" any results that don't agree with what I want to say. Nothing says I ever have to publish results I get, so why wouldn't I just sit on them?

      Not that sitting on data just because it doesn't agree is a good thing, but it happens. And plenty of good data goes unpublished (experiments fail, uninteresting results happen, journals don't publish negative results very often etc) so what about that data? Overall this law isn't going to help anything, and will just cause issues.

    10. Re:Sudden Outbreak of Common Sense by Obfuscant · · Score: 4, Insightful

      If Group A and Group B use exactly the same source data there's no possibility of Group B proving Group A's research wrong.

      Wrong. If Group B cannot duplicate Group A's analysis of the data, that proves that Group A did something wrong and probably came to the wrong conclusion.

      If Group B cannot duplicate the experiment and get the same data (and knowing that means being able to compare both sets) that calls the experiment as a whole into question.

      There is more to science than simply applying equation A to data B and getting number C.

      This hubbub all came about because of the difficulty in prying the source data out of the hands of the guy who produced the "hockey stick" figures. It's covered in the book "Broken Consensus" I think it's called. The "hockey stick" is not the "source data", the source data is all of the individual readings from all the instruments, prior to corrections for sampling errors or known issues. One cannot verify the quality of the "hockey stick" result without having the source data and being able to verify the processing steps that were done to it.

      The downside to free and open access to all data is that research groups get grants to collect AND process the data to come up with results. Opening the data up for free access means that other groups, who have more interest in scooping than being right, have more ability to do that scooping. That leaves the people who did the work in the cold. There is good reason to delay opening the data until the group being paid to collect it has a chance to use it.

    11. Re:Sudden Outbreak of Common Sense by Anonymous Coward · · Score: 2, Interesting

      Creationists regularly mangle papers, taking quotes out of context and all.

      Would you *really* expect anything else? They do the same thing to the Bible, and they LIKE the Bible.

      Personally, I'm not of the opinion that anyone who seriously believes the planet was instantly created 6000 years ago should be permitted to SPEAK in the debate on climate change. How can you argue about the interpretation of 20k year old ice core data with someone who believes that core was put there by THE DEVIL to confuse people?

    12. Re:Sudden Outbreak of Common Sense by SETIGuy · · Score: 2, Insightful

      Sure, I'll give you the data. But I wasn't funded to put the data in a format that's easy to understand. I've also got a job, and I don't get paid to support a competitor's data analysis attempts. Good luck.

    13. Re:Sudden Outbreak of Common Sense by michaelwv · · Score: 5, Insightful

      Absolutely. The public should have access to the data. Public grants then also need to pay for curating the data. Libraries aren't free, archives aren't free, and packaging data in an actually useful form takes time, which is a scientist's most precious resource. Having data in a form that is useful to the 25 people in your research group is very different from providing data that can be used by thousands of people. It's analogous to the difference between the quick bash script you have that backs up your movies to your external hard drive, and having something that you're willing to distribute to 1000 people and provide support.

    14. Re:Sudden Outbreak of Common Sense by finarfinjge · · Score: 4, Insightful

      I'll probably get flamebait or troll for this too, but this has always been the danger of the over-advocacy of climate change. Climate science is not even close to "settled". Nor is evolution, nor is physics. Well established and able to make verifiable predictions, yes. Settled, no. The direct result of making the absurd claim that some cutting edge field of science is settled is this: some complete moron then says "see, global warming wasn't settled, so evolution is bunk too". I've seen similar idiotic comments about plate tectonics as well. A number of years ago (far enough back it hasn't been cached), I wrote here that as scientists, we had better be right about climate change. Now we reap what we have sown. If it annoys you that idiots make claims like "global warming wasn't settled, so how can you be sure about evolution", look to the strident supporters of the cause. They (I'm talking about realclimate etc., here) are as responsible as Beck. By hammering any and all dissent without any concern as to the validity of the claims, they have made this type of comment inevitable. We will be seeing much more of it and we have only ourselves to blame.

    15. Re:Sudden Outbreak of Common Sense by martin-boundary · · Score: 3, Insightful

      Who cares? Are you arguing for science, or for little confidentiality fiefdoms?

      There is literally no point in doing Science (with a capital S) if the data isn't available for scrutiny by everyone. Without scrutiny, it's all he said/she said, rumours and bullshit.

      As to signing confidentiality agreements etc, there comes a time when a researcher has to decide: does he want to contribute to human knowledge (=> don't sign) or does he just want to wank around with secret data (=> sign it)?

      It sucks to be unable to use purportedly available data, just because it can't be divulged, but it's better that way in the long run.

      Unsupported data is worse than useless, it's a cancer that grows every time someone else quotes the unsupported result, until it gets to the level of unchallenged folk wisdom within the community.

    16. Re:Sudden Outbreak of Common Sense by Rogerborg · · Score: 5, Interesting

      If a lab has spent a long time, let's say 10 years, accumulating some hard fought data

      If a lab has been spending my tax money for 10 years, I want my employees to give me my data right Goddamn now.

      The "reward" for doing publicly funded research is that you keep getting funded. I don't care one whit what you think you're entitled to: if you're taking my money, you work for me.

      --
      If you were blocking sigs, you wouldn't have to read this.
    17. Re:Sudden Outbreak of Common Sense by the gnat · · Score: 4, Insightful

      my experience of this situation comes from protein crystallography and deposition of the hard won data there

      Ah, a fellow crystallographer. Welcome, brother!

      I was about to post a similar comment. However, I only agree with you up to a point. Once you publish a paper reporting the structure, all of the raw data should be made publicly available (including diffraction images - although deposition of those isn't quite feasible yet). I would apply the same standard to any other field: you shouldn't publish until you are comfortable releasing the underlying data. I don't care if you're still working on some super-secret follow-up paper, as far as I'm concerned your publication is useless if I can't go to the PDB and download the coordinates. And if you're using public resources to solve your structure (like NIH funding, or one of the DOE's synchrotrons), your results are public property.

      There was once intense resistance to even mandating coordinate deposition (long before I got started in the field), which just sounds insane now. Some of the people doing the most complaining were in fact some of the best funded. A decade later, the field went through the same bullshit whining with regard to reflection data. Now most journals require both coordinates and reflections, and not only has the field not suffered in the slightest, many more studies are now possible and the majority of structures can be solved without experimental phasing. If we'd left things the way the naysayers wanted it, every group attempting to study, say, ribosome structure would have to either plead with more senior groups for coordinates in order to solve their structures (and, almost certainly, further bloat the author lists and potentially cede some control over their project - which, I imagine, would have suited the senior faculty just fine), or waste half a decade making heavy metal derivatives. It is difficult to convey to non-crystallographers how huge a waste of time and money - most of it coming from tax dollars - this scenario would be.

      Now, where it gets messy is situations where you have to release data ASAP, instead of waiting until publication. American structural genomics groups do this (it may be a requirement of the NIH), but PDB deposition is more of an endpoint in itself for them, and no one is going to bother trying to scoop them on most of those proteins. Genomics centers also do this. A grad school classmate of mine worked on a sequencing project where much of the gruntwork was performed by the DOE, and they had extremely strict release rules. She complained that other groups (of bioinformaticists) could start analyzing the data before she'd had a chance to complete her own studies, because the outsiders didn't have to spend a lot of time thoroughly annotating the genome before publishing. (I don't think it held her back in the end - she graduated with several papers in Science.) In many situations like this, to obtain the data you need to agree to an embargo on publications, to prevent that sort of underhanded behavior. I saw an article retraction recently where the scientific content was undisputed, but the investigators had (unintentionally, it appeared) broken an embargo by submitting the paper when they did.

      In general, I think the scientific community - especially the part funded by the public - should err on the side of maximum disclosure of data, and I don't have much sympathy for the researchers in this story (and I'm not particularly sympathetic towards "climate skeptics" either). I do worry that rules will be used to harass researchers in supposedly controversial fields (Richard Lenski's adventures with Conservapedia are a particularly nauseating example), but as a scientist, I also think the benefits of making massive amounts of data available to anyone are far too important to let these risks bother us, and the drawbacks of keeping such data private are much worse than having to fight off the occasional knuckle-dragging lunatic.

    18. Re:Sudden Outbreak of Common Sense by the gnat · · Score: 2, Interesting

      If a lab has been spending my tax money for 10 years, I want my employees to give me my data right Goddamn now.

      Okay, but does that mean you should get to see the data before they're done analyzing it, before they can write a paper on their results? If we instituted such a rule, there would be nothing to stop scientists from bombarding their competitors with FOIA requests, and using the released data to scoop them. At the very least we'd need embargo rules, but even that won't entirely prevent abuses of the system. Most basic research isn't just a system of data factories; careful analysis by experts is essential for interpreting the results, and if scientists don't have some assurance that they'll be permitted to publish these analyses before their competitors stomp all over them, the academic system would simply break down. (Or is that what you want?)

    19. Re:Sudden Outbreak of Common Sense by interkin3tic · · Score: 4, Funny

      Creationists regularly mangle papers, taking quotes out of context and all.

      Get ready for an onslaught of mangled data analysis, with data being taken out of context, the results published to some blog, and people making policy decisions based on those blog postings.

      Hmm... I think you've brought up another valid point: some researchers might take the data, rehash it and publish it as their own, getting credit for it, much as you have taken my point, restated it with minor additions, and got all the mod points for it.

      Which is to say, I see what you did there ;)

    20. Re:Sudden Outbreak of Common Sense by mkiwi · · Score: 2, Funny

      A mathematician, an engineer, and a computer scientist are the final candidates for the top tech spot at a major corporation. They are summoned one by one to be interviewed.

      The mathematician goes to the interview. The person interviewing him is the CEO of the company. Only one question is asked: "What is 1+1?"
      The mathematician pulls out a pen and paper, makes a few scribbles, and says "This is proof that 1+1=2!"

      The engineer goes to the interview next. The CEO asks him the same question, "What is 1+1?"
      The engineer promptly grabs a calculator from his pocket, types in 1+1 and presses the equals sign. He shows the result to the CEO: "This calculation proves that 1+1=2!"

      The computer scientist is last. He is nervous, but fairly calm. The CEO asks him the same question.
      The computer scientist pauses, scratches his head for a second, and pulls out his laptop and asks "What do you want it to be?"

    21. Re:Sudden Outbreak of Common Sense by martin-boundary · · Score: 2, Interesting

      The downside to free and open access to all data is that research groups get grants to collect AND process the data to come up with results. Opening the data up for free access means that other groups, who have more interest in scooping than being right, have more ability to do that scooping. That leaves the people who did the work in the cold. There is good reason to delay opening the data until the group being paid to collect it has a chance to use it.

      Why do you think that delaying is necessarily the correct solution to the scooping problem? There are plenty of alternatives that could solve the problem.

      For example, the lab could license the data to anyone on the condition that someone in the lab (or the lab itself) is listed as a co-author of any paper that uses its data. That way, no scooping is possible, and outside researchers could still analyse the data as soon as they want.

    22. Re:Sudden Outbreak of Common Sense by nacturation · · Score: 4, Insightful

      Sure, I'll give you the data. But I wasn't funded to put the data in a format that's easy to understand. I've also got a job, and I don't get paid to support a competitor's data analysis attempts. Good luck.

      Your so-called competitors will be sure to mention your viewpoints when your funding runs out and you apply for more. Not only is your research not easy to understand and you don't let others analyze the data to attempt to reproduce your conclusions, but you think that other members in the scientific community are competitors and you feel a need to sabotage their efforts by making it difficult for them to use taxpayer-funded data to advance science. If science is such a business to you, then how about you fund it all yourself from the profits you make?

      --
      Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
    23. Re:Sudden Outbreak of Common Sense by HungryHobo · · Score: 5, Insightful

      not really.
      Your problems with these possible situations are based on the deeply flawed system we have in place now.

      Give academics the respect and credit they deserve for collecting vast quantities of high quality data rather than merely for the 2 page paper they write about some interesting statistical anomalies they found in said data and this ceases to be a problem.

      The way papers are written, reviewed and published today and the way academics are given credit is based on a system hundreds of years old, from when it was costly to print hundreds of pages of boring figures.

      Now data is cheap beyond words. Publishing a few hundred words or a gigabyte is little different when your audience is fairly small and the way academics publish should reflect that but it's too hidebound and dogmatic to do that.

      A professor who does nothing but produce a high quality and hard to acquire dataset deserves credit even if he comes to no conclusions at all.

      The problem is with the system and with the way academics think.
      Not with this possible change.

      Fix your system.

    24. Re:Sudden Outbreak of Common Sense by NeutronCowboy · · Score: 5, Funny

      I think you've brought up another valid point: some researchers might take the data, rehash it and publish it as their own, getting credit for it, much as you have taken my point, restated it with minor additions, and got all the mod points for it.

      I stand on the shoulder of giants. ;)

      --
      Those who can, do. Those who can't, sue.
    25. Re:Sudden Outbreak of Common Sense by the gnat · · Score: 4, Insightful

      Give academics the respect and credit they deserve for collecting vast quantities of high quality data rather than merely for the 2 page paper they write about some interesting statistical anomalies they found in said data and this ceases to be a problem.

      The problem is that interpreting raw scientific data is enormously time-consuming, because there's so much information available that we can't possibly assimilate it all. I have a PhD in biochemistry and advanced training in crystallography, but I couldn't look at a ribosome structure and easily figure out what it meant, because I don't know very much about ribosomes. The people solving the structure, on the other hand, have exactly the background necessary to perform detailed analyses, and they will undoubtedly notice things that completely escape me. And I think you're understating the value of the scientific literature. A 2 page paper on statistical anomalies won't get you a faculty position at a major university, but a well-written 10 page paper on the meaning of a crystal structure certainly can. This is even more the case if they took additional time to perform non-crystallographic experiments to verify new hypotheses.

      I don't deny that there are issues with our system, but you're completely missing the point of writing papers. Simply generating massive amounts of data isn't considered science - figuring out what it means is. I say this as someone who is very good at generating data quickly, but not particularly good at interpreting it. Now I write data analysis software instead, and leave the question-asking to more suitable minds.

    26. Re:Sudden Outbreak of Common Sense by T Murphy · · Score: 2, Insightful

      People making bad conclusions from good data is better than making (any) conclusions from no or bad data. By using good data, it helps give the proper scientists a chance to use logic and reason to correct people. We can't change the minds of creationists because we are not drawing our conclusions from the same 'data'. People believe in global warming because of data, now deny it because of doubt in the data. They may be impulsive and believe whoever speaks the loudest, but it does imply we can bring them to the next step and compare analyses, not just data.

    27. Re:Sudden Outbreak of Common Sense by Vornzog · · Score: 4, Insightful

      I work for a government lab that produces DNA sequences. We are obligated to release our data into a public database as soon as it has been verified for any samples that come from the US, and we release most of our foreign data, too, unless the other country involved gets pissy.

      Nothing good comes of that speed. We get crackpots thinking they've made major discoveries (not one real one yet), we get scooped for major papers (think Science), sometimes by our own collaborators using only our data and none of theirs, and we generally spend a lot of time, effort and *more money* on media spin control. There is such a thing as releasing the raw data too fast.

      We get a *ton* of FoI requests, too - people think we are withholding the good data, or being stubborn by not providing them composite statistics in exactly the format they want to see. The truth is, up until I got involved, the data management technology was so far behind the current bog-standard capabilities of the rest of the world, we couldn't actually answer the questions that were being asked, barring Herculean effort.

      Don't get me wrong, I think we *should* be releasing all of this data - delayed by just a bit. That way the people who generate it would have a better shot to get recognition/credit for their work, the crackpots would have less ammo for their rants, the press would be more likely to get the facts right the first time, and the scientific integrity of the whole process would be upheld, as everyone would get the raw data to review. It'd probably save a ton of money.

      The "reward" for doing publicly funded research is that you keep getting funded.

      Collecting good data is hard work, and the payoff is big publications, which you need if you want to continue getting funded. Once you've got that big publication in your pocket, though, you'd better be coughing up that data set. Otherwise, everything you say is suspect. Kudos to the UK for getting this half-way right, but they'd better set some reasonable constraints on the timing of these required data releases, or face any number of frivolous lawsuits from conspiracy theorists and 'data analysis specialists' who don't want to do any of the hard work themselves...

      I don't care one whit what you think you're entitled to: if you're taking my money, you work for me.

      I don't care if you are a ditch digger or a particle physicist. Doing all the hard work and getting none of the credit sucks regardless of what we are discussing or who is paying the bills. So put up or shut up. Would you be willing to do all of the grunt work in your job, but take none of the recognition? Most people wouldn't - those are the kinds of jobs that make people go 'Postal'. If you aren't doing it (and even if you are), do you really expect anyone else to?

      --

      -V-

      Who can decide a priori? Nobody.
      -Sartre

    28. Re:Sudden Outbreak of Common Sense by LingNoi · · Score: 3, Insightful

      some researchers might take the data, rehash it and publish it as their own, getting credit for it

      While making reference to the original data? That's called science.

      While not referencing the original data? That's called plagiarism, it's happened in science before and usually ends your career.

    29. Re:Sudden Outbreak of Common Sense by budgenator · · Score: 2, Interesting

      What if Group B notices that a temperature station reports a temperature of -12.4 C and then, 10 minutes later, +12.4 C? On 2010-Apr-21 22:10, drifting buoy 48534 did just that, and that's an automated report; imagine the fun and games when human error gets added in! The data is bad, there are a lot of bad data points in the records, and the records were never intended for the purpose they are being used for, so quality control is even more critical. We really need a large number of human eyeballs looking at the data to find these problems.
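
      The kind of check described here can be partly automated. The following is a minimal sketch, assuming readings arrive as (timestamp, temperature in C) pairs; the function name flag_jumps, the 10-degree threshold, the 30-minute window and the sample values are all illustrative assumptions, not the actual buoy record or any agency's quality-control procedure.

```python
from datetime import datetime, timedelta


def flag_jumps(readings, max_delta_c=10.0, window=timedelta(minutes=30)):
    """Return consecutive reading pairs that change by more than max_delta_c
    degrees within the given time window (both limits are assumptions)."""
    ordered = sorted(readings, key=lambda r: r[0])
    flagged = []
    for (t0, c0), (t1, c1) in zip(ordered, ordered[1:]):
        if t1 - t0 <= window and abs(c1 - c0) > max_delta_c:
            flagged.append(((t0, c0), (t1, c1)))
    return flagged


# Illustrative values only, modelled on the sign flip described in the comment.
readings = [
    (datetime(2010, 4, 21, 22, 0), -12.4),
    (datetime(2010, 4, 21, 22, 10), 12.4),
    (datetime(2010, 4, 21, 22, 20), 12.1),
]
suspect = flag_jumps(readings)
print(len(suspect))  # 1
```

      A filter like this only surfaces candidates; deciding whether a flagged jump is a sensor fault or a real event still takes the human eyeballs the comment asks for.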

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    30. Re:Sudden Outbreak of Common Sense by HungryHobo · · Score: 2, Insightful

      The original objection was that if the data is hard to come by then it's unfair to academics who wouldn't get the credit after gathering the data.

      Of course simply generating massive amounts of data isn't science but it is a very very very important part of science.

      Is an academic who can write that well-written 10 page paper on the meaning of a crystal structure any less mentally capable because he didn't have the funds or facilities to gather the data he's looking at?

      If you open up the data then someone will undoubtedly notice things that completely escape the handful who got the data in the first place.

      The obvious solution is to give credit where credit is due and respect the ability of some people to perform good experiments.

      If economic systems were run the way academics operate then you'd end up with something like this:

      Nobody gets paid for raw lumber.
      Nobody gets paid for seasoned wood.
      Finished wooden items would be worth a fortune.

      And as a result anyone who wanted to make things from wood would have to own an area of forestry, logging equipment, a saw mill, a kiln and finally any tools for the final step.

    31. Re:Sudden Outbreak of Common Sense by AJWM · · Score: 5, Insightful

      Examining old data has one value and one value alone - verifying that the claim made for the data matches up with the data. [...] Access to raw data for any other reason is pointless.

      Hardly. One could analyse the raw data looking for something other than what the original researchers were looking for. There might even be some interesting signal buried in the data that the original team, focusing on something else, disregarded as noise. Minute timing errors in, say, solar wind data returned from a spacecraft might turn up some oddity of orbital mechanics, for example. The researcher focusing on the sensor data rather than the timestamps will miss it, but it's all part of the raw data. How many biologists discarded moldy Petri dishes as ruined, without recording that, before Fleming thought to investigate why bacteria didn't grow near the mold?

      --
      -- Alastair
    32. Re:Sudden Outbreak of Common Sense by TapeCutter · · Score: 5, Interesting

      "This hubbub all came about because of the difficulty in prying the source data out of the hands of the guy who produced the "hockey stick" figures. It's covered in the book "Broken Consensus" I think it's called. The "hockey stick" is not the "source data", the source data is all of the individual readings from all the instruments, prior to corrections for sampling errors or known issues. One cannot verify the quality of the "hockey stick" result without having the source data and being able to verify the processing steps that were done to it."

      I threw away some mod points because it irks me how unskeptical the garden variety climate skeptic actually is when it comes to accepting that the hockey stick has been discredited. Here are a few points you should consider with your skeptic's hat on...

      1. Mann's original hockey stick was published in the journal Nature, which is not well known for publishing shoddy work.

      2. A senate inquisition was held on Mann's paper in which the National Academies of Science were called in to give expert testimony on the veracity of Mann's paper. As you will no doubt learn when reading the testimony, the NAS came down firmly in favour of Mann, although they did highlight some minor technical problems.

      3. Given that the NAS were able to agree with Mann's conclusions under oath at a hostile inquisition, how did they do so without access to the data?

      4. The journal Science is also not well known for publishing shoddy work. So why did NAS then publish a follow up study by Mann in their journal Science if they were not satisfied he had not only addressed the minor technical problems in the original but also greatly increased the robustness of the findings?

      5. Why can't I find a listing for a book called "broken consensus" which you cite as a source? Shouldn't you at least adhere to your own standards of evidence?

      6. How do you explain the links to the data and methods found in an article called Dummies guide to the hockey stick on Mann's website?

      7. Why do people believe that some difficult to obtain data (ie: time consuming) from a few nations means that the other 99.99999% of the raw data available on the web is insufficient to recreate the hockey stick?

      8. Why is McIntyre only interested in "auditing" climate science that disagrees with his opinion? Could this be because his own paper did not stand up to the traditional auditing method called "the test of time"?

      If the above points do not at least cause you to question your sources then I can only conclude your skeptic's hat must have slipped down over your eyes...

      --
      And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
    33. Re:Sudden Outbreak of Common Sense by the_womble · · Score: 2, Insightful

      I totally agree. If people just start looking at each other's data instead of verifying it, a lot of mistakes (or fraudulent data) will never be caught.

      On the other hand, a lot of errors in interpretation and statistical analysis will be caught.

    34. Re:Sudden Outbreak of Common Sense by TeknoHog · · Score: 2, Funny

      If I have been able to see further, it is due to being surrounded by midgets.

      --
      Escher was the first MC and Giger invented the HR department.
    35. Re:Sudden Outbreak of Common Sense by Xest · · Score: 3, Informative

      The issue with the FoIA in the UK is that there is a clause under which bodies only have to comply with a request if the cost of fulfilling it is no more than around £450.

      I've seen first hand local government abuse this by claiming that collation of the data would take 18 hours and that their FoI officer is paid £25 an hour, and hence the cost of providing the data is too high. Quite why it requires someone paid £50k a year to collate some basic data that they should already have collated anyway I've no idea, but still, they use this excuse, and the information commissioner allows such abuse of it.
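
      To put rough numbers on that, here is a minimal sketch in Python using only the figures quoted above (£25 an hour, 18 hours, a limit of around £450). The 2,000-hour working year behind the £50k figure is an assumption, and the function names are made up for illustration.

```python
# Figures quoted in the comment above. The working-year length is an assumption.
HOURLY_RATE_GBP = 25
CLAIMED_HOURS = 18
COST_LIMIT_GBP = 450
WORKING_HOURS_PER_YEAR = 2000  # assumed: 40 hours a week for 50 weeks


def request_cost(hours, rate=HOURLY_RATE_GBP):
    """Estimated cost in pounds of collating data for one request."""
    return hours * rate


def implied_salary(rate=HOURLY_RATE_GBP, hours_per_year=WORKING_HOURS_PER_YEAR):
    """Annual salary in pounds implied by an hourly rate."""
    return rate * hours_per_year


if __name__ == "__main__":
    cost = request_cost(CLAIMED_HOURS)
    print("estimated cost:", cost)
    print("over the limit:", cost > COST_LIMIT_GBP)
    print("implied salary:", implied_salary())
```

      On these figures an 18-hour estimate lands exactly on the limit, so any estimate even slightly above 18 hours would exceed it.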

      So although, as you say, it's a great theoretical win, I believe it'll make no difference in practice either way, given the ease with which public bodies are able to sidestep FoI requests.

    36. Re:Sudden Outbreak of Common Sense by hazem · · Score: 2, Insightful

      Agreed. Asimov wrote in the foreword of one of his robot books, "If knowledge poses a dangerous problem, I can't believe that ignorance is the solution." I think it applies aptly here.

      Sure, some people will accidentally misuse the data, and others (hopefully fewer) will intentionally misuse the data, but for many, having that data available has a great potential for increasing the understanding we all have.

    37. Re:Sudden Outbreak of Common Sense by finarfinjge · · Score: 2, Interesting

      Point 1, Nature was still publishing articles supporting Piltdown man within 2 years of it being finally accepted as a hoax. They have been fooled before.
      Point 2, Senate "inquisition" slammed Mann et al. (if you are talking about Wegman, he called Mann's work obscure and incomplete with conclusions not supported by the data).
      Point 3, Not sure how you came to the conclusion that calling the conclusions unsupported by the data is "agreeing".
      Point 4, See point 1.
      Point 5, guess you didn't try too hard http://books.google.com/books?id=8WqYkGxvPlAC&dq=%E2%80%9CShattered+Consensus:+The+True+State+of+Global+Warming%E2%80%9D&printsec=frontcover&source=bl&ots=veoYFgaLg9&sig=khERol_VbglL4JwcNuzN5JbaLJo&hl=en&ei=UxggS90fksmUB7CyxegF&sa=X&oi=book_result&ct=resul#v=onepage&q&f=false

      Point 6, Given that this article is about the FIRST, RELUCTANT, release of some of the very data that Mann used in constructing the hockey stick, I find it a bit rich that you are willing to claim that the raw data is all available
      The temperature data is available, but the 'hidden decline' tree ring data . . . Not so much
      Point 7, McIntyre's site is quite well known for being especially hard on skeptics who post idiotic comments. Of course, Steve is not advocating a point of view that requires the expenditure of billions of dollars. He is just asking for some discipline in the work being done.
      In his real job, he is required to audit data under rules that would make climate scientists crap. In mining you have to publish your data (drill records) as soon as practicable. Competition?? Tough. Don't show your raw numbers and interpretation methods?? Go to jail. McIntyre audits a lot of data other than climate scientists' data, under rules that people like Jones and Mann would whine miserably about.

      Maybe you might want to take off your own blinders and find a site in addition to realclimate for information.

  2. yro my ass by meow27 · · Score: 2

    making publicly funded non-military research public has nothing to do with privacy. Public money is spent for the public good and there is no good justifiable reason to keep it hidden from the public... especially if it's meant for the betterment of society.

    if you want your data to be private, get your own privately funded money

    1. Re:yro my ass by geekoid · · Score: 4, Insightful

      errr... not always.

      Putting data into the hands of people who aren't experts often leads to bad things. See every non expert who believed the Wakefield study because they didn't understand how to interpret data. In that case kids died, and kids are still dying.

      In principle I agree with you, but we live in an era where everyone thinks they are a qualified expert in anything. That simply isn't true, and no good will come out of this.

      The data won't show a flaw in the study because it wasn't used, but he will inevitably cherry pick data to 'prove' the study is wrong. And people like Hannah Devlin are always happy to publish claims without proper study. So no good can come from this, and people need to understand that.

      It's a hard problem to solve.

      --
      The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
    2. Re:yro my ass by DAldredge · · Score: 2, Insightful

      "No good"?

      None?

      Are you quite sane?

  3. Many academes want this too... by Improv · · Score: 2, Insightful

    Science journals have long fought this, because their profit model is strongest when they own copyright and are the exclusive publishers of a paper. Peer review and scientific principles don't mesh well with copyright though, and many academes have either "published" their papers on their own websites or found other ways to try to work around the journals.

    Ridding peer review and science of copyright would be a great improvement.

    --
    For every problem, there is at least one solution that is simple, neat, and wrong.
    1. Re:Many academes want this too... by DarkKnightRadick · · Score: 4, Insightful

      no, peer review is good. It helps to point out mistakes or inconsistencies. Getting rid of scientific journals is quasi-good (less profit motive in science, but also less chance to get work out there).

      --
      "There is a way that seems right to a man, but its end is the way of death." Proverbs 16:25 (NKJV)
    2. Re:Many academes want this too... by Obfuscant · · Score: 2, Insightful
      Wikipedia has proven that peer review can be supported for almost nothing.

      And that's the value you get from it. Allowing everyone to "peer review" everything results in the "truth" being the result of a majority vote, not the result of it being true.

      Peer review requires peers, not random people off the street.

      The storage and administrative costs for all research papers should cost at most $50/researcher,

      You're confusing the cost of "peer review" with the cost of archiving a paper. Peer review takes place prior to publishing the paper. The value of many journals, compared to "random website" is that there IS peer review, and you are less likely to find random babbling and incoherent thought in the journal.

      However, Open Access is the wave of the future, so you will eventually get peer reviewed work online, like you want.

      Of course, this discussion is about the data behind the papers, not the papers themselves. I don't know of a single paper that includes the "raw source" data it was based on. That's the purpose of the paper, to analyze and theorize.

    3. Re:Many academes want this too... by melikamp · · Score: 2, Interesting

      This is still an overestimate, I think. The Wiki says there are 5758 higher education institutions in the US alone. The entire budget of the Wikimedia foundation hovers around $10,000,000 a year, which is ~ $1736 per year per institution. We can have a project that costs 10 times as much as Wikipedia, containing, most likely, more than a hundred times more data, for a measly $17360 per year per institution. This is about as much as one lucky teaching fellow gets paid. This is such a trivial sum of money for academia as a whole that Harvard alone could afford it for several decades if they so wished, although it would make quite a blip on their balance sheet.
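
      For anyone who wants to check the division, here is a minimal Python sketch using the two figures quoted above (a budget of roughly $10,000,000 a year and 5758 institutions). The function name and the scale parameter are made up for illustration.

```python
# Figures quoted in the comment above; both are rough.
WIKIMEDIA_BUDGET_USD = 10_000_000
US_INSTITUTIONS = 5758


def cost_per_institution(budget, institutions, scale=1):
    """Yearly cost per institution of a project costing `scale` times `budget`."""
    return scale * budget / institutions


if __name__ == "__main__":
    base = cost_per_institution(WIKIMEDIA_BUDGET_USD, US_INSTITUTIONS)
    tenfold = cost_per_institution(WIKIMEDIA_BUDGET_USD, US_INSTITUTIONS, scale=10)
    print(f"Wikipedia-sized project: ${base:,.2f} per institution per year")
    print(f"Ten times that size:     ${tenfold:,.2f} per institution per year")
```

      The quoted $1736 is the first result with the cents dropped, and $17360 is ten times that truncated figure, so the exact tenfold value comes out a few dollars higher.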

      Whatever, Wikimedia is already doing it with textbooks, so that part is taken care of. It would be nice, though, to have a big ass research exchange, kind of like famed JSTOR, but where everything comes with the source attached (original LaTeX, raw media, raw data, etc), and everything is available to everyone (public domain or, better yet, copyleft).

  4. Re:Good and bad by oldhack · · Score: 3, Insightful

    "Scientists" scared of goofy analysis are priests, not scientists. Take their funding away and use their PhD parchment for toilet paper.

    --
    Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
  5. Re:Publicly funded by Monkey-Man2000 · · Score: 2, Informative

    Free if you pay your TV tax or pirate I believe

    --
    This post was generated by a Cadre of Uber Monkeys for Monkey-Man2000 (603495).
  6. Re:Good and bad by Monkeedude1212 · · Score: 4, Insightful

    On the other hand, this will likely produce a whole stream of deliberately inaccurate analyses with ulterior motives behind them.

    But with the data public, it'll be easier to shoot them down for picking, choosing, skewing, and what else.

    There is no reason why this kind of data should ever be "secret"

  7. Awful summary by Protoslo · · Score: 4, Insightful
    It turns out that "the data" are measurements of petrified tree rings, which were collected in the course of (presumably) a government grant-funded study. Now Queen's University researchers must compile the data for release because of the (UK) Freedom of Information Act. The scientists quoted in TFA apparently did not use the ring data for anything relating to climate studies, but Keenan has that purpose in mind.

    Phil Willis, a Liberal Democrat MP and chairman of the Science and Technology Select Committee, said that scientists now needed to work on the presumption that if research is publicly funded, the data ought to be made publicly available.

    That doesn't seem unreasonable to me. Appendices with raw data are often included already in the online editions of journals. Of course, if the ruling applies to all data generated in the course of a study, whether it is used in publications or not, it could be onerous indeed.

    1. Re:Awful summary by blair1q · · Score: 2, Interesting

      Now Queen's University researchers must compile the data for release because of the (UK) Freedom of Information Act.

      Seems unreasonable. They should charge the requester for any effort needed to "compile" or transmit the data. No reason the public should foot the bill for any particular formatting or delivery.

    2. Re:Awful summary by pkphilip · · Score: 3, Informative

      Michael Mann used the same tree ring data as temperature proxies for his studies and has published papers on this. But now the very same scientists who collected the tree ring data claim that the data cannot be used as a temperature proxy, even though they haven't mentioned a word about how this would invalidate Michael Mann's work.

      http://climateaudit.org/2010/04/21/mann-of-oak/#more-10811

  8. NSF by martas · · Score: 2, Interesting

    does anyone know if the NSF has similar requirements?

    1. Re:NSF by imidan · · Score: 3, Interesting

      The NSF has recently taken more of an interest in research data management. They're definitely starting to make it a requirement of grant funding that the research data be digitally stored, backed up, and, after a cooling-off period to allow the principal researchers to publish, made available to the public. I'm working on a research data management group at my university, and the researchers generally seem open to the idea, though they're loath to put in any extra effort to make it work.

    2. Re:NSF by guruevi · · Score: 2, Informative

      Yes it does, kinda. Thanks to our publishing overlords, however, 'making available' is more difficult than just publishing the data online. The data cannot be made available as long as a publishing house holds copyright on it, and the publishing house usually takes copyright on all the work for years, including data that it does not directly publish, especially when the work is or becomes popular. However, NSF/NIH grants usually carry a requirement to release all data to the public a couple of years (usually around 10 or 25 years depending on the grant) after collection or publishing. But if you don't publish through one of the big names, your career as a scientist usually doesn't go much of anywhere. Also, a lot of machinery (multi-million dollar machines) can't be afforded on any grant but a government's: the device that collects the data could be funded by the NIH, with a grant that requires releasing data 10 years after collection. However, in order to make money to keep the system running, the institution needs funds from other sources, each with their own constraints.

      Disclaimer: I am not a scientist but I manage about 60TB of collected data owned or funded by a combination of private/individual funds, internal funds, corporate funds, publishing houses, NIH, NSF and other grants which should or should not be made available to the public.

      --
      Custom electronics and digital signage for your business: www.evcircuits.com
  9. Re:There are problems with this by Sparr0 · · Score: 2, Interesting

    Yes, yes, and yes. What is the problem? If they are racing, there is obviously something worth racing TO. If both teams have all the data, that goal will be reached no later, probably sooner.

  10. Re:There are problems with this by xilmaril · · Score: 3, Informative

    Does this mean every biology, chemistry, physics, and engineering research group (I'm talking about grad students and postdocs, here) would have to open their lab notebooks to anyone who asked?

    Researchers who ply their trade on the cutting edge of science live in perpetual fear of being "scooped" by another group who publishes their discovery first. These are sometimes literally "races." So now a group at one university could demand access to the notebooks of a group at another university? And vice versa?

    Not at all.

    It means they have access to each others results and source data when published (once the group is done researching this phase, and is ready to publish). There's no "opening notebooks", simply because that's a terrible metaphor for how data is collected these days.

  11. Re:There are problems with this by sourcerror · · Score: 3, Informative

    You only have to publish your data after publishing your article, which means "you won". You don't have to publish data for research in progress.

  12. Not always feasible... by Anonymous Coward · · Score: 2, Informative

    We can agree that the whole scientific process does not make much sense if we have to believe in the interpretations without seeing the actual data. From this perspective it is crucial for all scientific data to be open.

    The other perspective comes from the individual scientist. It might take years to put together a complete data set of a particular phenomenon via experiments, literature review, digging in the ground or looking at the stars. So after looking for something special you finally discover something new and write a small article about it. This will just be something along the lines of: "hey, there is something interesting going on here." Now you go back and look carefully at all your data for similar events, filter out noise because you have a better idea what to look for, and then hopefully publish more about it. So the next article will not only contain more information but also some analysis about the possible origins of the phenomenon and so forth.

    Imagine you had to open your carefully put together data right the second after you recorded it. Other people might grab your stuff and your research might not even be cited because they just looked at all the steps that you took that were not successful and repeated the experiments or used other available data.

    This interest in keeping your data private cannot be avoided with the current system of judging a scientist by his or her publications.

  13. Re:Good and bad by Rising+Ape · · Score: 3, Insightful

    That doesn't matter. The important thing is that the attacks are made. Even if every one is shown to be completely wrong, people will still remember all those (erroneous) anti-global warming reports. Especially since the media will enthusiastically report the initial attack and relegate the news of its rebuttal to a small paragraph on page 34, if they report it at all.

  14. Re:Formatting Standards? by Rising+Ape · · Score: 2, Insightful

    I am more concerned with the time and effort it will take to format data for external users.
    An accompanying more detailed methodology will surely have to be provided for the data to be used correctly.

    That is indeed an issue. Presumably the methodology is already published, as is the rule for scientific papers. What could happen is that competent scientists have to waste their time debunking incompetent analyses by axe-grinding cranks.

    Actually, if the requirement is specified up front as terms for the grant, I'm not opposed to it. I don't think it'll do any good, mind you, as a rule all that's useful is published, and scientists are generally happy to cooperate if you need more, as long as you have honest intent. But the current system is a charter for arseholes using FoI requests to harass scientists.

  15. Teaching? by jfw · · Score: 2, Interesting

    OK, why does this argument not also apply to teaching? I am paid to teach and do research from the public purse. My teaching is available to anyone who meets certain standards and pays a user fee. Access to data should be the same.

  16. Re:Good and bad by interkin3tic · · Score: 3, Insightful

    But with the data public, it'll be easier to shoot them down for picking, choosing, skewing, and what else.

    Not sure what regulations are on "release all data to the public" but seems like there are loopholes big enough to drive a bus through. For instance, in my field, no one but me knows how many cells I looked at. Maybe that thing I said happens in these cells happens in all those cells. Maybe I looked at 300 before seeing one doing what I said, took a picture of that one, and that was that. All my data would be that one cell I cherrypicked.

    Even if I did take pictures of all 300, no one knows but me. Those other 299 can disappear.

    If I'm -not- evil though, this could hurt me. If I looked at say 3000 cells, and 10 were doing a thing that I thought was significant, I could have my reasons. Maybe the other 2990 were the wrong cell type or something. Being the expert, that might be obvious to me just from looking at them. A non expert looking at them might not see that. They would just see that out of 3000 cells, I chose the 10 that supported my data. They might call foul without bothering to have me explain myself.

    There's no reason the data should be secret, but most data doesn't stand on its own, and writing up supporting information to -all data gathered- just isn't going to happen.

  17. So a non expert by geekoid · · Score: 2, Interesting

    wants to use data that wasn't used for climate change and models in order to prove that the studies that didn't use them are flawed.

    Add to that a reporter who continually overstates anything the climate change denialists say, and I'm sure it will confuse even more people.
    This should be fun.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  18. Re:Good news from the UK by geekoid · · Score: 2, Insightful

    You say that until he gets on a major talk show, talks about his improperly interpreted results and suddenly 20 million people are parroting his incorrect results.

    Suddenly it's not a good thing because those same outlets will not give the same time to actual experts.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  19. Re:Good and bad by geekoid · · Score: 2, Interesting

    Scientists are always concerned when people who have no idea what they are doing try to interpret data. It has nothing to do with being scared.
    For example:
    Let's say this guy cherry picks some data to support his belief and Oprah finds out about his 'findings' and puts him on the air. Suddenly 25 million people who aren't qualified to judge his assessment are now hounding politicians over incorrect data.
    I just spent about 10 years watching this very thing happen to vaccines. Some idiot's bad study gets on Oprah, and a year later people are dying.

    It's a real and serious problem, and the people causing it (the media) are doing nothing to fix it.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  20. Re:Formatting Standards? by Bigjeff5 · · Score: 2, Insightful

    What could happen is that competent scientists have to waste their time debunking incompetent analyses by axe-grinding cranks.

    It's much more likely that incompetent scientists will be debunked by more competent analysis, because as soon as there is any controversy regarding a study the scientific community swarms to verify one way or the other.

    Also, it's just as important to know what data was disregarded, and why (there are a plethora of valid reasons, but there are even more invalid reasons) as it is to know what was included. The GP's point about the tree ring data that was collected but never used, why wasn't it used? Was it simply because they weren't interested in doing a tree-ring study, and used the data for something else entirely? Or did it make their model not work quite right so they tossed it out? How is anybody to know if they can't look at the data they collected?

    Furthermore, if the raw data is not provided, you cannot verify that the models and statistical conclusions are correct. What if there is a problem with the model the researchers were using? Well, if you plug the data into a better model, or even just a different model, you'll see a big difference if one of them is wrong. Climate science relies heavily on computer models, and often multiple researchers will use the exact same model in their study, so it's not hard to get a systemic error across multiple studies.

    In other words, how can you verify anybody's science without the original data they observed to begin with? I'm never going to look at this data, I wouldn't have a clue what to do with it, but I know there are a lot of climate researchers who are chomping at the bit to verify these studies.

    --
    Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
  21. It's a step in the right direction... by El+Fantasmo · · Score: 2, Interesting

    I believe that all public universities (in the US) that cannot prove public money was not used on research, should be required to release the findings/data to the public shortly after it is published. Of course there are exceptions for things involving national security and what not.

  22. Conflicting Laws? by Sir+Mal+Fet · · Score: 2, Interesting

    I wonder how this conflicts with the laws about Privacy of Data. For example, suppose a company shares a dataset that contains sensitive information with a University (this CAN be done, at least in my country, with a contract. We commit to safeguarding the data and to not violating Privacy laws; consultants also do this everywhere) for the purpose of developing a model or some other application that needs the data. The professor then publishes a paper with the main (non-corporate secret) results, and uses public funds for an undergrad or something. Does this mean the professor can be sued into giving the information away? Doing this clearly violates the laws on privacy, but refusing would conflict directly with the Freedom of Information Act. Complying with one law contradicts the other... I do not think that this can be upheld in court for EVERY case; instead it would have to be analyzed on a case-by-case basis using (possibly costly) lawsuits. Then again, IANAL, so maybe I'm wrong... (Full disclosure: I am a researcher in data mining)

  23. Re:Good and bad by Areyoukiddingme · · Score: 2, Informative

    If I'm -not- evil though, this could hurt me. If I looked at say 3000 cells, and 10 were doing a thing that I thought was significant, I could have my reasons. Maybe the other 2990 were the wrong cell type or something.

    Of course you would. And if you truly did find such a strange sample set, you would document those reasons with just such a sentence. Maybe they WERE the wrong cell type, and in your paper you would be expected to say precisely that. Odds are fairly good you would have a citation concerning why those cells are the wrong type, if not several, since any such assertion that 99.7% of your sample set is junk would be unusual enough to require justification of your methodology. Perhaps no better cell culture or separation method is available. This should be easy enough to document and explain. It took me just one sentence fragment, after all.

    Most likely, if there really isn't any better method, there have been multiple papers describing what the limitations are and why, in an effort to formulate a better method. There is probably also active, ongoing research into creating a better method, since any line of inquiry with such poor sources is bound to attract attention. To paraphrase Heinlein: To score an academic coup, find out what everyone agrees is impossible. Then do it.

    Yes, you're probably going to experience an uptick in the noise floor if you're a UK researcher. The timecube guys are out there, and audible now. But complying with the UK directive is easy. Provide the data. You're not required to address specific demands of every crank who claims your data proves the existence of aliens. The people who control your tenure/salary/book deals/whatever read your paper, saw your cite, and moved on. The guy worried about aliens doesn't affect much of your life. Just your inbox.

  24. Re:good news by zz5555 · · Score: 3, Interesting

    I don't know. The USA (and a lot of other countries) might not be too happy, since it means the UK is saying it's OK for these scientists to release the USA's proprietary data. So I guess you're right in that those jerks like the USA (and a lot of other countries) that wanted to profit from this data will get their comeuppance, but I wonder if we now need to increase taxes in order to pay for these services that used to make a profit. So that means that we all need to pay more money because of this.

    I also wonder what it means for the university to release data that is illegal for them to release. I mean, on one side the court says they need to release it, but on the other side other courts say it's illegal to release it. Should be interesting in the UK for a while.

  25. Re:Formatting Standards? by michaelwv · · Score: 2, Informative

    Science makes progress through experiments. You design an experiment; you figure out what measurements you need to make; you make those measurements according to the requirements and specifications of your experiment; what do you need to control for? what calibrations are important? how much data do you need for a statistically significant sample? The answers to all of these questions are different depending on the experiment you want to do. Using data from someone else's experiment means you have to go through all of these steps and then try to account for the fact that the way the data were gathered isn't quite right for what you want to do, you need to control for different things than the original experimenters, etc. This generally takes expertise in both the original scientific question and the new one. I get enough citations and questions from good-intentioned, responsible astronomers who use our data in published papers in subtly, but significantly, incorrect ways. I try to deal with such occurrences helpfully, but it often takes a long time to guide the interested fellow astronomer through the relevant literature explaining why what they did isn't quite right. When I write about something in a field that's new to me, I'm quite sensitive to this and try to check extensively that I'm not making a classic 1st-year graduate student mistake in that field. Don't even get me started on all of the email I get with re-analyses of our data by retired engineers.

  26. Re: data retention now required too? by sl149q · · Score: 4, Insightful

    If people cannot replicate your results it isn't science.

    And with Climate Science part of the process is showing how you collected and interpreted the data. If you are not willing to share the raw data so other researchers can attempt to replicate your methods and results then don't bother publishing.

  27. Re:Good and bad by finarfinjge · · Score: 3, Insightful

    MOD PARENT UP!!

    The problem that the climate scientists have created for themselves is that they are hiding the data from everyone. Up until a few months ago, these requests were relatively rare. Some of the requesting parties actually have fairly strong credentials. Steve McIntyre may be hated by the folk at realclimate, but he is an IPCC reviewer. To stonewall him is a little different than refusing to provide it to Jenny McCarthy.

  28. Re:Peers? by sl149q · · Score: 5, Insightful

    As opposed to the proselytizers who are funded by the NGOs and the new "Green" capitalists and rent-seekers.

    One of the more interesting bits of the Climategate emails showed that Mann was happy to share his data EXCEPT to people who he thought would disagree with his methods and results.

    And in this case Mann was also the recipient of the tree ring data, showing again that if you agreed with the owner's ideas he had no problem getting you copies of what you needed.

  29. This is definitely a good thing by OrwellianLurker · · Score: 2, Insightful

    I am a pretty big cynic, and I remain unconvinced that AGW is a significant problem. It doesn't help that the raw data isn't disclosed. I wish scientists would go back to doing science and quit trying to be policy makers.

    --
    'Political power grows out of the barrel of a gun.' - Mao Tse-tung
  30. Re:Good news from the UK by mjwx · · Score: 2, Insightful

    You say that until he gets on a major talk show, talks about his improperly interpreted results and suddenly 20 million people are parroting his incorrect results.

    The problem is that we don't apply the same standard to a talk show as we do to a scientific institution.

    If a talk show spreads incorrect information absolutely nothing happens; if a scientific institution does the same there will be a royal commission, investigation, scrutiny and even if they are found innocent someone's career is still ruined.

    What we need is to get rid of the double standard. Let's just say if Box News makes a deliberately misleading statement about the Australian Hoop Snake they should be investigated and charged, and the editor, producer and reporter fired and barred from working in the media field again. If we started treating news agencies with the same scrutiny and punishments as universities then the level of misinformation would drop dramatically.

    Published scientific reports should also have the data published publicly, however there should be severe punishments for the misuse of this data to spread misinformation and attempts to ruin careers.

    --
    Calling someone a "hater" only means you can not rationally rebut their argument.
  31. Re:Good and bad by martin-boundary · · Score: 2, Informative

    That doesn't matter. The important thing is that the attacks are made. Even if every one is shown to be completely wrong, people will still remember all those (erroneous) anti-global warming reports.

    People don't matter. Science doesn't advance by asking what Aunt Rosie from Ohio thinks about a particular result. Those who matter are scientists, and scientists read peer reviewed journals. Peer review is all about filtering out all those attacks so that nobody who matters needs to read them.

  32. Raw data can be useless by Roger+W+Moore · · Score: 4, Interesting

    Opening the data up for free access means that other groups, who have more interest in scooping than being right, have more ability to do that scooping. That leaves the people who did the work in the cold.

    That is not hard to achieve: someone has to make an FoI request, the cost to prepare the data has to be estimated, someone has to get hired to collect and format the data and then the data is released. That can take a considerable amount of time.....but that's not the only issue. In my field of particle physics raw data is generally useless unless you understand how it was collected and how to analyse it.

    Even assuming that you had several petabytes of disk/tape available to store it, raw data from ATLAS would be completely useless to you unless you really understand the detector "warts and all". Trying to understand this data without access to the detector itself and the ability to test and cross-check ideas looking at (and sometimes carefully tweaking) the hardware is literally impossible....and that is before you get into the thorny international issues about who did what and so whether it falls under any one country's laws.

    These issues were discussed on a previous experiment I worked on in the US and the conclusion was that it did not serve the public to have data released in just about any form: the raw data was useless and even the processed data still had considerable "quirks" which required understanding (e.g. acceptance drops at detector boundaries etc.). This was aptly demonstrated by a pilot project which resulted in no interest at all from the public but which worryingly attracted a few nutters who were more interested in proving their pet theory than in doing science.

    So while I am very sympathetic to the "the public paid for it the public should be able to access it" argument I do not think that the public's interest is best served by releasing raw data in all (most?) cases. The best way to serve the public interest is to ensure that results and ideas arising from that research are freely available to all and allow the public to build on that.

  33. Think about what you are asking by Roger+W+Moore · · Score: 2, Insightful

    If a lab has been spending my tax money for 10 years, I want my employees to give me my data right Goddamn now. .....if you're taking my money, you work for me.

    Just stop and think for a second about exactly what it is that we scientists are being paid to do. We are NOT being paid to collect data; we are being paid to figure out how the world works and how to apply that knowledge for the betterment of mankind. The data is a means towards that end.

    Now, do you REALLY want us to spend a serious fraction of our time and money preparing and making available the raw data in a form which will probably be useless to you instead of analysing and coming up with results which you are far more likely to find useful? Is that REALLY the best way for us to serve the public interest?

    Examples of how this could go horribly wrong immediately come to mind: it could delay finding medical cures as researchers spend time releasing, instead of analysing, data; companies could request the data and develop/patent drugs which YOU will then pay through the nose for; nutters will start horribly misrepresenting the data to "prove" their pet theory on warp drive, etc. How does any of this serve the public interest?

    If you want an even clearer example: taxpayers fund each country's intelligence agencies. So does this mean that since you own all the data every taxpayer should be able to request to see it whenever they want? Obviously not, because it would not be in YOUR best interest for such data to be public. While the reasons are different the conclusion is the same for scientific data. It may be your data but you are paying us to collect it, analyse it and come up with results which ultimately improve your, and everyone else's, standard of living.

    1. Re:Think about what you are asking by rjiy · · Score: 2, Insightful

      Now, do you REALLY want us to spend a serious fraction of our time and money preparing and making available the raw data..

      Nope. We expect you folks to spend some time thinking up a way so that you don't spend any time at all on "preparing" the supposedly "raw" data _and_ still make it available to the desirous public. Like, you know, putting up a file on a website with some footnotes. I hear universities have some websites.

    2. Re:Think about what you are asking by qc_dk · · Score: 2, Informative

      I don't think you understand how scientific funding works. I am not given a lump sum and then told to go figure something out. This is how it works in the EU:

      I am given a sum of money. This has to be accounted for. There are a number of predefined areas where I can spend this money. During this project I will have to fill in time sheets detailing what I'm spending my time doing. All the different work areas will have spending limits, i.e. I can't just put some more time into community outreach (like preparing data) at the cost of research and development time. There will be a number of milestones I have to reach along with something called deliverables. Deliverables can be reports or code or raw data if the EU has decided it was of interest. At the same time I also have to prepare papers and so on to keep my position at the university.

      Where do I account for the time spent on giving out data because some random person wants it? Answer: I can't, so it will have to come out of my own pocket, or I could commit fraud and put my time down against one of the projects, and newsflash: I'm not going to risk jail time.

      The fact is that while the public is funding the science, what they are funding is SPECIFICALLY not the data distribution. You don't believe you should have access to Pfizer's raw data even though it is funded by the public (through the purchase of medicine), do you? Because what you paid for was a pill and not the data.

      If you believe in this so strongly, please lobby research grant givers to always include funding for public data dissemination, no matter whether the grant giver thinks the data will be of use and has included it as an accountable deliverable.

  34. Re:The data spans 40 years by Ctrl-Alt-Del · · Score: 3, Insightful

    Unfortunately, Climategate proved that, at least in the field of climate research, "peer review" is worthless; Mann et al were actively conspiring to ensure that only "friendly" eyes carried out the reviews; anyone thought to be showing signs of scepticism was blacklisted, whether an individual or a publication.

    To add to that, Glaciergate proved that much of what was claimed to be peer-reviewed was actually just regurgitated propaganda, often based on anecdotal evidence (reminiscences of mountaineers published in a student rag? Puh-lease!)

    So, appeals to authority ("oh but all this research has been peer reviewed") just don't hold any more. Not until all the data and all the methods used to arrive at the results are made available, and the results can be independently confirmed or denied, can we say whether the research was worth the weight of mouldy notebooks it was archived on.

    --
    "Life is like a sewer - what you get out of it depends on what you put into it" - Tom Lehrer
  35. NOT ONLY DATA, METHODS WANT TO BE FREE!! by xtracto · · Score: 4, Insightful

    Simply generating massive amounts of data isn't considered science - figuring out what it means is. I say this as someone who is very good at generating data quickly, but not particularly good at interpreting it.

    Spot on. I have a PhD in Comp. Sci. (Multi-Agent Systems / Market Based Control). One of the things you learn (maybe in your university degree courses or in your first paper presentation) is that data does not mean *anything*; what matters is the interpretation of such data.

    Nevertheless, I am of the opinion that programs used for the generation / manipulation of such data should also be free / open to scrutiny. Especially those developed during the research, as they are also paid for with taxpayers' money.

    In the field I am working in now (agent-based computational economics) a lot of people do these so-called agent-based simulations, then they write a nice paper about what their simulations showed and try to publish it. The problem is that they keep their code! In that respect they are definitely removing a good chunk of the "methods" part of their research. It is absolutely impossible to duplicate that work without the code.

     

    --
    Ubuntu is an African word meaning 'I can't configure Debian'
  36. Re:Formatting Standards? by Rockoon · · Score: 4, Informative

    That is indeed an issue. Presumably the methodology is already published, as is the rule for scientific papers.

    There is at least one case, in two climate research papers, where what the methodologies claimed was impossible because the data to do it didn't even exist. This didn't come out for 16 years, and was only discovered because a FOI request was finally honored.

    In this case, the authors of the papers had claimed that the station data that they used was from stations that had "few, if any, changes in instrumentation, location or observation times." (quote from one paper) and "selected stations have relatively few, if any, changes in instrumentation, location, or observation times" (quote from the other paper)

    "Hey! We only used great data!"

    Now, these two authors used the same data, and one of these authors was actually a co-author of the other paper. These authors are Jones (hello Climategate) and Wang.

    Now, they finally sourced the data as being from the Chinese Academy of Sciences, which coincidentally had co-published a report with the US Department of Energy at about the same time as those two research papers, stating quite specifically that DATA OF THAT QUALITY DID NOT EXIST. The report was specifically about the quality of the Chinese climate record.

    Both papers concluded that the Urban Heat Island effect was minimal. Too bad that they didn't actually have data good enough to draw that conclusion. They said they did, tho.

    None of this would have come out if it weren't for the Freedom of Information Act. Jones and Wang both obstructed the release of the data (denying FOI requests, etc.) for nearly 2 decades.

    This all came out several years ago, but the media didn't give a fuck. They did care about hacked emails tho. Go figure. Now, as it turns out it probably wasn't Jones who was lying his ass off. Wang was a co-author on Jones's paper and supplied the "data." Jones gets credit for having his email hacked.

    --
    "His name was James Damore."
  37. More details by Sara+Chan · · Score: 2, Informative

    I am the story's submitter. My original submission included a link to the mathematician's web page about this; the page has many more details. There have also been other news stories, e.g. at the BBC.

    The UK Freedom of Information Act has exemptions for data that has not yet been used in publications, vexatious requests, etc.

  38. How's this for bad? by Roger+W+Moore · · Score: 3, Interesting

    But you haven't given a reason why it's actually bad

    It wastes scientists' time that would be better spent analysing the data rather than releasing it, it wastes money collecting and disseminating the data, it pollutes the real scientific results with those of nutters trying to prove their pet theory and, in the case of commercially useful data, it risks having companies use the data to develop something commercially useful that will then be locked away behind patents and the public will be charged through the nose for.

    There is also the more subjective, human issue that if you don't let people who have worked like crazy to get the data have at least the first shot at analysis then recruiting scientists is going to become extremely hard and motivating them to perform large-scale experiments will be even harder if they just have to give the data away - why would you bother if you can just sit around and get the data as soon as it is collected?

    Is that bad enough? There are ways you could mitigate some of the above but the bottom line is that nothing is free: it will cost more money to make the data publicly available and, as a taxpayer myself, I see no real benefit from doing it and some serious potential pitfalls.