Storing Data For the Next 1,000 Years
An anonymous reader writes "This may be an interesting take on creating long-term storage technologies. A team of researchers at UCSC claims to have come up with a power-efficient, scalable way to reliably store data for a theoretical 1,400 years with regular hard drives. TG Daily has an article describing this technology and it sounds intriguing as it uses self-contained but networked storage units. It looks like a complicated solution, but the approach is manageable and may be an effective solution to preserve your data for decades and possibly centuries." Nice to see research on this using the kinds of real-world figures for disk lifetimes that recent studies have been turning up.
No, not punch cards... but close!
Stone and chisel. That's the way to store data for 1,000 years. The reason why I say this is simple. The more "religious" the world's populations become, the closer to the dark ages we become. (The reverse is true as well as history illustrates.) I expect there will be a second "dark ages" at which point all other technologies will simply not be available.
Since there will be many holes shot into this theory, let me be one of the first to fire a shot. Electricity (as we know it) may not be around then. I am not predicting the dark ages, but who's to say that, that far in the future, there will still be a live socket?
Any storage device that relies on outside power cannot be guaranteed for 100 years, let alone 1400. I would have more faith in a stone tablet.
This is a fine example of "academic" research dollars at work.
Given the media, specifications and some time and money, a trio of engineering, electronics and CS students will make a machine that will read any old tape, punchcard, early HDD, etc. A CD is laughably simple technology; an engineer 100 years from now will build a player (in a way that may not look anything like our current players) in no time at all.
Today's technology is even better documented and certainly not beyond the capabilities of future generations to make readers for.
If you find an old tape and want to do it in an afternoon, you are out of luck. If you are an historian that really, really wants to get to the data, it is not all that hard.
First, it ignores physics. MTBF can't be used in reverse. Yes, it is possible that the MTBF on a newish disc is 300K hours or more. Put differently, if you've got 1000 such discs running, then every 300 hours, or about every 2 weeks, one will die.
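The fleet arithmetic in that comment can be checked with a few lines of Python. This is only a sketch: it assumes a constant failure rate and independent discs (the usual reading of an MTBF figure), and the function name is made up for illustration.

```python
# Rough check of the "one disc dies every 300 hours" figure.
# Assumption: constant failure rate and independent discs, so a fleet of N
# discs sees failures N times as often as a single disc.

MTBF_HOURS = 300_000   # per-disc MTBF quoted in the comment
FLEET_SIZE = 1_000     # number of discs running at once


def hours_between_failures(mtbf_hours, fleet_size):
    """Expected hours between failures somewhere in the fleet."""
    return mtbf_hours / fleet_size


interval_hours = hours_between_failures(MTBF_HOURS, FLEET_SIZE)
interval_days = interval_hours / 24

print(f"one failure every {interval_hours:.0f} hours ({interval_days:.1f} days)")
```

That works out to 12.5 days, which is where the "about every 2 weeks" comes from.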
This does however:
It would of course if degradation in idle state was -ZERO-. If aging made -ZERO- difference and if the MTBF-rates quoted are realistic AND constant over centuries (i.e. older discs DON'T start to fail more often, not even if they're centuries old)
In short: bullshit. It's overwhelmingly likely that not a single disc out of 1000 will remain functional after a millennium, even if it is powered down 97% of the time. At which point no amount of redundancy, distributed or not, will help.
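For scale, here is roughly what the idealized model being rejected would predict. It is a sketch under exactly the assumptions the comment calls unrealistic: zero wear while powered down, and a constant failure rate (exponential lifetimes) while powered up. The 300K-hour MTBF and the 3% powered-on share are the figures quoted in this thread; the function name is hypothetical.

```python
import math

HOURS_PER_YEAR = 24 * 365.25


def survival_fraction(years, mtbf_hours, duty_cycle):
    """Fraction of discs still alive after `years`.

    Assumes (unrealistically) that a disc only ages while powered on and
    that its failure rate never changes, i.e. exponential lifetimes.
    """
    powered_hours = years * HOURS_PER_YEAR * duty_cycle
    return math.exp(-powered_hours / mtbf_hours)


p = survival_fraction(years=1000, mtbf_hours=300_000, duty_cycle=0.03)
print(f"idealized survival after 1000 years: {p:.0%}")
```

Under those assumptions a sizeable fraction of discs would still be alive after a millennium, which is the point of the objection: the rosy number comes entirely from pretending that idle discs never age and that failure rates stay flat for centuries.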
Also, the exercise is pointless. As long as storage-capacities keep growing exponentially, nearly the entire cost of storing a set of data is in the first few years. If you've paid what it costs to safeguard data for a decade, you've already paid 95% or thereabouts of what it costs to store it forever.
So, storing something safely for a very long time is actually an easy task, all you need to do is:
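The "first decade is nearly everything" claim follows from a geometric series. As a rough sketch, assume the yearly cost of keeping a fixed data set falls continuously and halves every 18 months (the halving period is an assumption borrowed from the doubling rate mentioned elsewhere in this thread, and the function name is made up). The share of the forever-cost paid by year T is then 1 - 2^(-T/1.5).

```python
def fraction_paid_by(years, halving_period_years=1.5):
    """Share of the total cost of storing data forever that has been paid
    after `years`, assuming the yearly cost halves every
    `halving_period_years` and never stops falling."""
    return 1.0 - 2.0 ** (-years / halving_period_years)


for horizon in (1.5, 5, 10):
    print(f"after {horizon:>4} years: {fraction_paid_by(horizon):.1%} of the forever-cost")
```

With these assumptions the ten-year share comes out above 95%, so the figure in the comment is, if anything, conservative. A slower price decline would lower it.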
Yeah, this -does- mean that data that nobody cares about will die. Tough luck.
For example, if you -currently- have a petabyte you want stored, you could buy 3 petabyte enterprise storage-servers, at a cost of perhaps $3 million. You host these at three separate companies, say one in Europe, one in Japan, one in the USA. For this you may pay $300,000/year. Total cost for the first 5 years: $4.5 million
After 5 years you buy 3 new entry-level storage-servers. Storage/dollar has doubled every 18 months, or a factor of about 10 over 5 years. The servers now cost let's say $300K, and they're 4U-units rather than complete racks now, so hosting costs are down to $50,000/year.
Total cost for years 5-10: $550,000
After 10 years you buy 3 new 1U "small office" servers. They cost $21K in total. Hosting is $10K/year. Total cost for years 10-15: $71K.
After 15 years you sign up for the needed amount of space on 3 separate servers and pay $3K/year, or $15K for the period.
After 20 years you put the data on 3 thumbdrives and store them however one can cheaply store a thumbdrive, total cost perhaps $1000
Or you sign up with 3 separate el-cheapo hosting-providers and pay $300/year.
After 25, you send the data as an attachment to your choice of 3 free email-providers, they all come with at least 500PB free storage anyway, it's not as if you'll notice the extra 1PB attachment.
More likely though, you've got much MORE data to take care of in the future, so you're still paying $1million/year. Only now that buys you a storage-solution where the old 1PB-archive is a completely trivial file, taking up so minute a fraction of the array that it's not even noticeable and the incremental cost is essentially zero.
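Adding up the schedule above shows how front-loaded it is. The sketch below uses only the dollar figures from this comment, takes the thumbdrive option ($1000) for years 20-25, and treats everything after year 25 as free; the variable names are made up for illustration.

```python
# (label, up-front hardware cost, yearly hosting cost), each period is 5 years.
# All figures are the comment's own guesses, not real prices.
PERIODS = [
    ("years 0-5",   3_000_000, 300_000),
    ("years 5-10",    300_000,  50_000),
    ("years 10-15",    21_000,  10_000),
    ("years 15-20",         0,   3_000),
    ("years 20-25",     1_000,       0),  # three thumbdrives, no hosting
]
YEARS_PER_PERIOD = 5

period_costs = [
    (label, hardware + YEARS_PER_PERIOD * hosting)
    for label, hardware, hosting in PERIODS
]
total = sum(cost for _, cost in period_costs)

running = 0
for label, cost in period_costs:
    running += cost
    print(f"{label:<12} ${cost:>9,}  cumulative share: {running / total:.1%}")

print(f"total over 25 years: ${total:,}")
```

On these numbers the first five years account for most of the total and the first decade for nearly all of it, which is the same conclusion as the "95% or thereabouts" estimate.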
It's easy to build distributed, reliable storage that theoretically lasts thousands of years if you assume that you can just keep going down to the corner computer store and buy replacement parts that more or less work like today's parts, that operating systems keep doing what they have always been doing, and that networks keep working the way they always have. But those are bad assumptions.
What kind of data, that would otherwise be lost, do we have to back up for posterity? I mean, come on, no one is going through your perl-scripts, c++ classes, 10000 digital holiday pictures, diaries of what you had for breakfast, or IRC logfiles. You are not that important! Although it would be fun to speculate what kind of information would have been in the caveman-wiki.
no one is going through your perl-scripts, c++ classes, 10000 digital holiday pictures, diaries of what you had for breakfast, or IRC logfiles
I'm sure that the people in the 11th century would have said the same thing about their accounts and letters, and yet historians and archeologists depend on them to tell us what life was like 1000 years ago.