Slashdot Mirror


Storing Data For the Next 1,000 Years

An anonymous reader writes "This may be an interesting take on creating long-term storage technologies. A team of researchers at UCSC claims to have come up with a power-efficient, scalable way to reliably store data for a theoretical 1,400 years with regular hard drives. TG Daily has an article describing this technology and it sounds intriguing as it uses self-contained but networked storage units. It looks like a complicated solution, but the approach is manageable and may be an effective solution to preserve your data for decades and possibly centuries." Nice to see research on this using the kinds of real-world figures for disk lifetimes that recent studies have been turning up.

21 of 243 comments

  1. Sometimes old tech is best by erroneus · · Score: 5, Insightful

    No, not punch cards... but close!

    Stone and chisel. That's the way to store data for 1,000 years. The reason why I say this is simple. The more "religious" the world's populations become, the closer to the dark ages we become. (The reverse is true as well as history illustrates.) I expect there will be a second "dark ages" at which point all other technologies will simply not be available.

    1. Re:Sometimes old tech is best by Anonymous Coward · · Score: 2, Insightful

      Umm, stone carvings aren't immune to little things like weathering effects. Microscopic etching isn't going to be any better at retaining data for long periods of time than a stamped CD (which is essentially the same thing).

      The reason why ancient carvings are durable is because they're macroscopic, and hence inherently have lots of built-in redundancy. (The shape of a letter, for example, uses vast quantities of atoms shaped in a precise way to convey very little information; 5 to 7 bits worth, for Latin-style alphabets.)

      Even then, they weren't designed with modern error correction codes, so a missing letter here or a defaced section there has to be guessed from context.

      We could probably do stone carving better with modern technology, from a data density perspective, but part of what makes the traditional approach effective is its low coding efficiency, and intuitive simplicity.
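The "5 to 7 bits" figure can be checked with a quick calculation. This is only a sketch: it assumes every symbol is equally likely, and the alphabet sizes below are illustrative choices rather than anything from the comment.

```python
import math

def bits_per_symbol(alphabet_size: int) -> float:
    """Information carried by one symbol drawn uniformly from an alphabet."""
    return math.log2(alphabet_size)

# Illustrative alphabet sizes (assumptions, not from the comment).
alphabets = [
    ("26 letters", 26),
    ("upper and lower case", 52),
    ("letters, digits and punctuation", 96),
]

for name, size in alphabets:
    print(f"{name}: {bits_per_symbol(size):.1f} bits per carved symbol")
```

However many atoms go into carving one letter, it still carries only a handful of bits, which is where the built-in redundancy comes from.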

  2. But what about... by bigredradio · · Score: 3, Insightful

    Since there will be many holes shot into this theory, let me be one of the first to fire a shot. Electricity (as we know it) may not be around then. I am not predicting the dark ages, but who's to say that far in advance there is still a live socket.

    Any storage device that relies on outside power cannot be guaranteed for 100 years, let alone 1400. I would have more faith in a stone tablet.

    This is a fine example of "academic" research dollars at work.

    1. Re:But what about... by fucket · · Score: 2, Insightful

      Good point. Also, in 1400 years there may no longer be any humans on earth to read the tablets you store so you might want to lock a human or two in the vault with your data.

    2. Re:But what about... by superdave80 · · Score: 2, Insightful

      "Electricity (as we know it) may not be around then."

      I'm not sure how you expect electricity to 'change' in the future.

      If a civilization can't generate electricity, then they wouldn't have the technical knowledge to even know what to do with digital data, so the whole point would be moot.

    3. Re:But what about... by Ihmhi · · Score: 2, Insightful

      It's electricity, not Greek Fire. It's not some big mystery on how to generate it. Even if we're using microscopic black holes to generate power, it would not be hard to set up a windmill and some copper wire.

      The bigger issue would be being able to actually read the data.

    4. Re:But what about... by Anonymous Coward · · Score: 1, Insightful

      Eventually the connectors of today will be obsolete,...
      How long would it take to rig up some sort of temporary connector from bits of metal and insulator? (about 5 mins tops). The fundamental principle of electricity (the two conductor circuit) isn't going to change because it's a basic part of physics. Even if consumer goods all have some weird wireless data/power transfer, physics/electronics labs will still have basic facilities for experimenting etc.

      the voltages of today will be unsupported
      Even if they are no longer standard, don't you think that maybe some sort of 'variable voltage transformer' might still exist, even if only normally used for lab work etc.?

      and likely forgotten
      If you think this is likely, inscribe a ceramic plate with the voltage/power requirements and screw this plate to the equipment. I'd gamble on SI units like volts not changing in 1000 years (and you could always inscribe a mathematical derivation of the units from fundamental physical principles if you're really worried).

      Power is NOT going to be a problem because its basis is so fundamental. We still know how to smelt iron/bronze/copper etc. with a small, simple setup even though thousands of years have passed and industrial methods are normally used.

  3. Rotate your media by profplump · · Score: 2, Insightful

    Wouldn't it be a lot easier to simply keep the archive on a live system, and rotate it to new media from time to time as the old media dies and new storage systems become available? After all, if no one is looking after this system, what's to keep it from being forgotten in the basement of a long-abandoned building?

    In addition to taking advantage of the falling cost of storage for a fixed-size data set -- making future replacement media purchases much cheaper than redundant media purchases today -- you also have the opportunity to re-process the data into new formats, so that you'll still be able to read it when you want it.

  4. Uh, what? by ChePibe · · Score: 1, Insightful

    The more "religious" the world's populations become, the closer to the dark ages we become. (The reverse is true as well as history illustrates.)

    I realize that taking swipes at religion at /. is simply common fare and is an easy way to boost karma, but seriously, what? Where is this link between religion - one would assume all religion, as the OP discusses the population of the entire world - and this surge to the dark ages?

    From the demographic viewpoint, a simple look at the high rate of belief in deity/practice of religion in the United States - the world economic leader, and still, in spite of some losses in this area, the center of innovation in all (well, at least most) things technological - would seem to indicate that the causal link between a belief in religion and a return to the "dark ages" is tenuous at best. For fun, compare the rate of technological advance in the U.S. with that of the devoutly non-religious Soviet Russia or Communist China throughout the cold war.

    Then, one could look at individuals - Mendel, Newton, a wide assortment of Muslim mathematicians and astronomers, etc. Even a look at more mundane topics, such as engineers and inventors shows a broad array of other religious folks as well. As a Mormon, the first two that come to mind are Browning, a perhaps unrivaled genius to this day in the design of firearms, and Farnsworth - largely responsible for the electronic television.

    Now, I'll be the first to concede the point that several religious groups have shown less technological advance over time, Wahabi Muslims in particular come to mind, but so do numerous others. Some groups have eschewed technology altogether, such as the Amish, but these are exceptional cases. But to argue that the act of being religious at all is somehow tied to a magical turn to the dark ages is absurd, and to argue that a lack of religion has always led to some drive away from the dark ages is no better.

    1. Re:Uh, what? by Hal_Porter · · Score: 2, Insightful

      For fun, compare the rate of technological advance in the U.S. with that of the devoutly non-religious Soviet Russia or Communist China throughout the cold war.

      I think you could make an argument that Russia and China were theocracies for much of the Stalinist period. For example I read that Mao apparently gave a speech which was interpreted as him saying that quarks were the fundamental constituent of matter. After that Chinese physicists were careful not to publish papers that might contradict the great man. In Russia Lysenkoism was famously the officially supported theory of agriculture. And in Nazi Germany relativity and quantum mechanics were denounced as "Jewish physics" and physicists studying them were fired, which in a rare instance of poetic justice was probably not very helpful to the German atom bomb project. I think the Nazis would have messed up science far more if they had stayed in power longer and created the sort of Dark Ages agricultural slave empire they obviously planned.

      Obviously Chinese, German or Russian social scientists were under much more obvious pressure to publish ideologically orthodox papers during their respective theocracies than physicists or biologists. Regardless of whether Nazism and Communism count as religions, they behaved like intolerant monotheisms socially. In fact they were probably far worse since they existed in an age where orthodoxy could be enforced, rather than mere orthopraxy. This by the way is what Orwell was worried about - the ability of 20th Century totalitarianism to get inside people's heads.

      By contrast America has lots of religion, but more importantly it has lots of religions, possibly because the Constitutional prohibition on an established state church allows them to survive. In China or Russia lots of believers in the official religion ended up being crushed by the State because they were on the wrong side of a doctrinal dispute.

      So at the risk of stating the obvious I'd say that a theocracy leads to science being suppressed, not a large number of competing religions. Competition is good, and that is something that atheists, Communists and Gaia worshippers should understand as well as the believers in older traditional religions.
      --
      echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
    2. Re:Uh, what? by Anonymous Coward · · Score: 1, Insightful

      Competition is good, and that something that atheists, Communists and Gaia worshippers should understand as well as the believers in older traditional religions.

      Something I never understood while living in the USA. If one believes in one god, how come there are a hundred different ways to believe in that god? All claim to follow the bible, yet most have their own "version" of the bible ... makes no sense. Oh, and let's not forget the part where Americans pick and choose which one of god's laws to abide by at the moment...

      Love and peace my ass:
      - "When you go to war against your enemies and you see a beautiful woman and find her desirable, you may take her. If she ceases to please you send her away." Deut. 21:10

      - "They waged war as god had commanded them and killed every male. But they kept the women as captives and took their wealth as spoil. Moses was enraged. 'So you spared the women? Kill every woman who has had sexual intercourse and kill every little boy, but keep the virgin girls for yourself. Divide them up evenly.'" Num. 31:7, 14

      - "I saw that the people were marrying foreigners. Their children were even learning foreign languages. I called down curses on them. I struck them and tore the hair out of their heads and made them swear by god, 'you will not marry foreigners.'" Neh. 13:23 "So I purged them of everything foreign. I drew up regulations defining everyone's duty. Remember me, oh god, for my happiness." Neh. 13:30

      I am now reading the Qur'an which is just as hateful as it dictates that non believers have no rights and should be punished which is exactly what is happening right now.
  5. Try harder by daBass · · Score: 4, Insightful

    (Ever tried to get data off an obsolete tape backup

    There are loads of people that can make this work. The most important thing is having the specs of what is on it and how it was recorded. (Even just a few hints and some knowledge of how computer systems in that era might have recorded data is enough.) That the machine used is no longer functioning and had an interface that doesn't work with your USB-only modern PC anyway is of no relevance.

    Given the media, specifications and some time and money, a trio of engineering, electronics and CS students will make a machine that will read any old tape, punchcard, early HDD, etc. A CD is laughably simple technology; an engineer 100 years from now will build a player (in a way that may not look anything like our current players) in no time at all.

    Today's technology is even more well documented and certainly not beyond the capabilities of future generations to make readers for.

    If you find an old tape and want to do it in an afternoon, you are out of luck. If you are an historian that really, really wants to get to the data, it is not all that hard.
  6. Re:Only half the problem by complete+loony · · Score: 2, Insightful
    --
    09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
  7. Emulation may work better by prattp · · Score: 2, Insightful

    I agree that virtual machines are a solution to file formats becoming obsolete, but I think that emulation may be more appropriate than virtualization for this purpose. VMware can only be used on x86 computers, and even on x86 computers future processors may have subtle differences that could affect old virtual machines. An emulation of an entire computer, including the processor, can be ported to any computer, and have exactly identical behavior.

    Also, it may not be necessary to layer virtual machines inside each other, if you have an emulator that is easy to port to new machines, such as by being open source and relatively simple. That is a large part of the motivation for the Macintosh Plus emulator I maintain.

  8. Idiotic by Eivind · · Score: 4, Insightful
    This is completely idiotic.

    First, it ignores physics. MTBF can't be used in reverse. Yes, it is possible that the MTBF on a newish disc is 300K hours or more. Put differently, if you've got 1000 such discs running, then every 300 hours, about every 2 weeks, one will die.

    This does however:

    • NOT imply that an average disc will last for 300K hours of operation, i.e. about 34 years.
    • NOT imply that a disc that is idle 90% of the time will last for about 340 years.
    • NOT imply that a disc that is idle 97% of the time will last for a millennium.


    It would, of course, if degradation in the idle state was -ZERO-, if aging made -ZERO- difference, and if the MTBF rates quoted are realistic AND constant over centuries (i.e. older discs DON'T start to fail more often, not even if they're centuries old).

    In short: bullshit. It's overwhelmingly likely that not a single disc out of 1000 will remain functional after a millennium, even if it is powered down 97% of the time. At which point no amount of redundancy, distributed or not, will help.
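For reference, the naive extrapolation being rejected here can be written out as a rough sketch. The function names are made up for illustration, a year is taken as 8,760 hours, and the lifetime function deliberately bakes in the unrealistic assumptions named above (no aging, no wear while idle, a failure rate that stays constant forever).

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap days ignored

def hours_between_failures(mtbf_hours: float, fleet_size: int) -> float:
    """Expected time between failures across a fleet of identical discs."""
    return mtbf_hours / fleet_size

def naive_lifetime_years(mtbf_hours: float, duty_cycle: float) -> float:
    """The flawed reading of MTBF: treat it as the lifespan of one disc
    and stretch it by the fraction of time the disc is powered on."""
    return mtbf_hours / duty_cycle / HOURS_PER_YEAR

print(hours_between_failures(300_000, 1000))      # fleet-wide failure interval
for duty_cycle in (1.0, 0.10, 0.03):
    years = naive_lifetime_years(300_000, duty_cycle)
    print(f"powered on {duty_cycle:.0%} of the time: {years:.0f} years (naive)")
```

The first function is what MTBF is actually good for: a failure rate over a large population of young discs. The second is the reverse reading, which only holds if none of the real-world aging effects exist.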

    Also, the exercise is pointless. As long as storage-capacities keep growing exponentially, nearly the entire cost of storing a set of data is in the first few years. If you've paid what it costs to safeguard data for a decade, you've already paid 95% or thereabouts of what it costs to store it forever.

    So, storing something safely for a very long time is actually an easy task, all you need to do is:

    • Create multiple copies at geographically distinct sites.
    • Regularly transfer the copies to newer, larger media.


    Yeah, this -does- mean that data that nobody cares about will die. Tough luck.

    For example, if you -currently- have a petabyte you want stored, you could buy 3 petabyte enterprise storage-servers, at a cost of perhaps $3 million. You host these at three separate companies, say one in Europe, one in Japan, one in the USA. For this you may pay $300,000/year. Total cost for first 5 years: $4.5 million.

    After 5 years you buy 3 new entry-level storage-servers. Storage/dollar has doubled every 18 months, or a factor of about 10 over 5 years. The servers now cost let's say $300K, and they're 4U-units rather than complete racks now, so hosting costs are down to $50,000/year.
    Total cost for years 5-10: $550,000.

    After 10 years you buy 3 new 1U "small office" servers. They cost $21K in total. Hosting is $10K/year. Total cost for years 10-15: $71K.

    After 15 years you sign up for the needed amount of space on 3 separate servers and pay $3K/year, or $15K for the period.

    After 20 years you put the data on 3 thumbdrives and store them however one can cheaply store a thumbdrive, total cost perhaps $1000
    Or you sign up with 3 separate el-cheapo hosting-providers and pay $300/year.

    After 25, you send the data as an attachment to your choice of 3 free email-providers, they all come with at least 500PB free storage anyway, it's not as if you'll notice the extra 1PB attachment.

    More likely though, you've got much MORE data to take care of in the future, so you're still paying $1 million/year. Only now that buys you a storage-solution where the old 1PB-archive is a completely trivial file, taking up such a minute fraction of the array that it's not even noticeable and the incremental cost is essentially zero.
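The migration schedule above can be tallied in a few lines. This is a back-of-the-envelope sketch using only the dollar figures quoted in the comment; the thumbdrive option is taken for years 20-25, and everything after year 25 is treated as free.

```python
# (period, one-off hardware cost, hosting cost per year, years in period)
periods = [
    ("years 0-5",   3_000_000, 300_000, 5),
    ("years 5-10",    300_000,  50_000, 5),
    ("years 10-15",    21_000,  10_000, 5),
    ("years 15-20",         0,   3_000, 5),
    ("years 20-25",     1_000,       0, 5),  # thumbdrive option
]

def period_cost(hardware: int, hosting_per_year: int, years: int) -> int:
    """Total spend for one migration period."""
    return hardware + hosting_per_year * years

costs = [period_cost(hw, hosting, yrs) for _, hw, hosting, yrs in periods]
total = sum(costs)
first_decade_share = (costs[0] + costs[1]) / total

# Storage per dollar doubling every 18 months, compounded over 60 months.
growth_over_5_years = 2 ** (60 / 18)

for (label, *_), cost in zip(periods, costs):
    print(f"{label}: ${cost:,}")
print(f"total: ${total:,}; first decade share: {first_decade_share:.1%}")
print(f"storage/dollar growth over 5 years: {growth_over_5_years:.1f}x")
```

On these figures the first decade accounts for well over 95% of the 25-year total, which is the point being made: the long tail costs next to nothing.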
  9. missing the point by nguy · · Score: 3, Insightful

    It's easy to build distributed, reliable storage that theoretically lasts thousands of years if you assume that you can just keep going down to the corner computer store and buy replacement parts that more or less work like today's parts, that operating systems keep doing what they have always been doing, and that networks keep working the way they always have. But those are bad assumptions.

  10. Re:Only half the problem by jimicus · · Score: 2, Insightful

    Simple: You use only formats that are openly specified and free software. HTML and everything XML-based actually is text.

    Even if it's text you're not 100% out of the woods. EBCDIC, ASCII, Unicode plus however many others have existed over the years.

    While it's not generally too awkward to convert from one character encoding to another, "just text" is a slight oversimplification.
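A small illustration of why "just text" depends on knowing the encoding. This is only a sketch: the sample string is arbitrary, and `cp037` is one of several EBCDIC code pages that Python ships codecs for.

```python
text = "Hello, world"

# The same characters stored under three different encodings.
ebcdic = text.encode("cp037")        # one EBCDIC code page
ascii_bytes = text.encode("ascii")
utf16 = text.encode("utf-16-le")     # a Unicode encoding, 2 bytes per char here

print(ebcdic.hex())
print(ascii_bytes.hex())
print(utf16.hex())

# Reading EBCDIC bytes with the wrong assumption gives garbage, not an error.
misread = ebcdic.decode("latin-1")
print(repr(misread))

# With the right codec the conversion is trivial.
recovered = ebcdic.decode("cp037")
print(recovered)
```

The conversion itself is a one-liner; the hard part for a future reader is knowing which of the three byte sequences they are looking at.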
  11. No one is interested... by HetMes · · Score: 3, Insightful

    What kind of data that will be lost otherwise do we have to back-up for posterity? I mean, come on, no one is going through your perl-scripts, c++ classes, 10000 digital holiday pictures, diaries of what you had for breakfast, or IRC logfiles. You are not that important! Although it would be fun to speculate what kind of information would have been in the caveman-wiki.

  12. Nobody but historians? by argent · · Score: 4, Insightful

    no one is going through your perl-scripts, c++ classes, 10000 digital holiday pictures, diaries of what you had for breakfast, or IRC logfiles

    I'm sure that the people in the 11th century would have said the same thing about their accounts and letters, and yet historians and archeologists depend on them to tell us what life was like 1000 years ago.

  13. Re:Very little is laughably simple by daBass · · Score: 2, Insightful

    If you're unconvinced, go study the maths on auto-focusing and pit-tracking lasers, not to mention D/A conversion, Reed-Solomon error correction etc.

    Why would you use such arcane methods 100 years from now? If they looked closely at the disc, they would see the patterns. Knowing (as they will) that we used to use "binary", they'll quickly assume they represent 1 and 0. Take a quick scan of the entire disc and do the rest in memory. Somehow I doubt they'll have much of a problem with D/A conversion either. (It is so simple, they'll figure that one out too. Understanding the data is supposed to be audio, they'll quickly put two and two together. Most likely, they will actually convert it to the way their computers store sound and let 22nd century D/A converters do the job.)

    Just because you want the data off the disc, doesn't mean you need to create a player the same way we do now!

    Try finding someone now who could build a decent siege engine or longbow that would be good enough to fight a medieval battle. Hell, even finding someone these days who can rebuild steam engines is tough!

    There seems no shortage of such people on the Discovery Channel!
  14. Re:Very little is laughably simple by daBass · · Score: 2, Insightful

    You make it sound hard, but considering people nowadays slice open completely proprietary computer chips running proprietary code and reverse engineer the thing using a microscope and some simulation software, the CD isn't going to be too hard to do 100 years from now.

    You have to remember that it is going to be pretty obvious for anyone that the original use was to play back music. Most likely, they will find them in places where the player is still next to it - even if it doesn't work. Even without the red book spec, there will be loads of cues about how the data might be on there.

    And who knows what computing will be up to? Is giving a computer an electron microscope scan of a CD and telling it: "it's supposed to be sound, probably in binary encoding and it will have some error correction data in there" so hard to imagine? I don't think it is if technology keeps advancing like it does now.

    Will they do it in a weekend? Probably not, but what makes you think that if you can't do it in a weekend, everybody is just going to walk away and say: "not worth it, it's too hard"? That is not how humans worked a thousand years ago, not how they work now and nor will they in the 23rd century.