Scientific Data Disappears At Alarming Rate, 80% Lost In Two Decades
cold fjord writes "UPI reports, 'Eighty percent of scientific data are lost within two decades, disappearing into old email addresses and obsolete storage devices, a Canadian study (abstract, article paywalled) indicated. The finding comes from a study tracking the accessibility of scientific data over time, conducted at the University of British Columbia. Researchers attempted to collect original research data from a random set of 516 studies published between 1991 and 2011. While all data sets were available two years after publication, the odds of obtaining the underlying data dropped by 17 per cent per year after that, they reported. "Publicly funded science generates an extraordinary amount of data each year," UBC visiting scholar Tim Vines said. "Much of these data are unique to a time and place, and is thus irreplaceable, and many other data sets are expensive to regenerate."' More at The Vancouver Sun and Smithsonian."
And in 20 years, these results too shall be lost.
Trying to ignore that a paper about the unavailability of scientific data is locked behind a paywall.
This is nothing new, though. I occasionally convert data from ancient formats, and people need to pay better attention. Imagine trying to read an 8" CP/M floppy today.
As libraries move to digital storage rather than the dead tree that's been fine for thousands of years, they are inviting a catastrophe, possibly only one well-aimed coronal mass ejection away from massive data loss.
Whichever side of the "data is" vs. "data are" argument one falls on, I hope we can all agree that mixing both forms within the same sentence is definitely wrong.
No, but it is amazing what NEW science you can do with OLD data. I've worked with the Transportable Array project, for example (http://www.usarray.org/researchers/obs/transportable). It's over a decade old, and scientists are still discovering new ways to take advantage of the data and will likely be doing so for decades to come. On the other hand, a lot of data is just junk due to poor-quality metadata: when was that instrument calibrated? I dunno. Damn. At least in geophysics we have the National Geophysical Data Center (http://www.ngdc.noaa.gov/) to curate this stuff, at least until Congress cuts its funding.
-73, de n1ywb
www.n1ywb.com