Major Scientific Journal Publisher Requires Public Access To Data
An anonymous reader writes "PLOS — the Public Library of Science — is one of the most prolific publishers of research papers in the world. 'Open access' is one of their mantras, and they've been working to push the academic publishing system into a state where research isn't locked behind paywalls and subscription services. To that end, they've announced a new policy for all of their journals: 'authors must make all data publicly available, without restriction, immediately upon publication of the article.' The data must be available within the article itself, in the supplementary information, or within a stable, public repository. This is good news for replicating experiments, building on past results, and science in general."
And not just the data that was cherry-picked to support the hypothesis?
This is bad news for ecologists and others with long-term data sets. Some of these data sets take decades and millions of dollars to produce, and the principal investigators want to use the data they've generated for multiple projects. Current data licensing for PLOS ONE (and, as far as I know, all other journals that insist on complete data archiving) means that when you publish your data set, it is out there for anyone to use for free for any purpose they wish, not just for verification of the paper in question. There are plenty of scientists out there who poach free online data sets and mine them for additional findings.
Requiring full accessibility of data makes many people reluctant to publish in such a journal, because it means giving away the data they were planning to use for future publications. A scientist's publication list is linked not only to their job opportunities and their pay grade, but also to the funding they can get from future grants. And of course those grants are what keep funding the long-term project that produced the data in the first place.
What is needed is a new licensing model for published data that says "anyone is free to use these data to replicate the results of the current study; however, they CANNOT be used as a basis for new analyses without written consent of the principal investigator of this paper or until [XX] years after publication." Journals would also need to agree not to accept any publications based on data that were used without consent.
It seems to me that this arrangement would satisfy the need to make data publicly available while respecting the scientists who produced it in the first place.