Slashdot Mirror


Storing Data For the Next 1,000 Years

An anonymous reader writes "This may be an interesting take on creating long-term storage technologies. A team of researchers at UCSC claims to have come up with a power-efficient, scalable way to reliably store data for a theoretical 1,400 years with regular hard drives. TG Daily has an article describing this technology and it sounds intriguing as it uses self-contained but networked storage units. It looks like a complicated solution, but the approach is manageable and may be an effective solution to preserve your data for decades and possibly centuries." Nice to see research on this using the kinds of real-world figures for disk lifetimes that recent studies have been turning up.

5 of 243 comments

  1. Only half the problem by Raindance · · Score: 4, Informative

    Part of the solution to very long-term storage, of course, has to involve a method to read the data you've archived.

    I tend to think systems such as the one described in the article aren't good long-term solutions. If their math works on the failure rates, that's fantastic, but just try to hook up a 2028 computer to one of these things to pull the data off.*

    (Ever tried to get data off an obsolete tape backup?)

    I think the most reliable archival system is going to be an active one, where data is saved on modern storage hardware and always copied to more modern tech as it arrives.
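    One way to picture the "active" approach is a migration step that copies everything to the new medium and refuses to trust a copy until its checksum matches the original. This is only a minimal sketch, not anything from the article: the function names `sha256_of` and `migrate` are made up for illustration, and it assumes both old and new storage are mounted as ordinary directories, with the new root outside the old one.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(old_root: Path, new_root: Path) -> dict:
    """Copy every file from old_root to new_root and verify each copy.

    Returns a manifest mapping relative path -> SHA-256, which can be
    stored next to the data and re-checked at the next migration.
    Hypothetical helper; assumes new_root is not inside old_root.
    """
    manifest = {}
    for src in sorted(old_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(old_root)
        dst = new_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies contents and timestamps
        before, after = sha256_of(src), sha256_of(dst)
        if before != after:
            raise IOError(f"checksum mismatch after copying {rel}")
        manifest[rel.as_posix()] = after
    return manifest
```

    Keeping the manifest is the design choice that matters here: each later migration can verify against it before copying, so silent corruption is caught at the hop where it happened rather than several media generations later.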

    The other side of this is, for anything more advanced than text: given that you can get at the data, what do you open it with? File types die over time, and it's already close to impossible to find programs that open certain files, let alone ones that run on a modern OS. I think the answer to this has to be virtualization. Store the data *and* the programs that can open the file types you need inside a portable virtual machine (e.g., a Windows VMware image). Over time, you may have to layer virtual machines inside virtual machines as OSes grow obsolete. But that's okay: virtualization is only going to become more elegant, and the end result is that you'd have your data in its original environment, completely accessible by native programs.

    *Some elements of this problem could be solved by having backup servers use wireless and filesharing protocols that might stand the test of time, e.g., 802.11n and Samba. No need to just pick one 'most likely to be future-proof' combination, either: run Bluetooth and serial access, WebDAV and an HTTP file server, etc. Still, *not* storing data on modern hardware is always going to be a risky kludge.
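    As a rough illustration of how little it takes to expose an archive over one of those protocols, here is a sketch of a read-only HTTP file server using only Python's standard library. The helper name `serve_archive` is hypothetical, it binds to localhost only, and it is a toy for showing the idea rather than something hardened for real use.

```python
import functools
import http.server
import threading


def serve_archive(directory: str, port: int = 0) -> http.server.ThreadingHTTPServer:
    """Serve `directory` read-only over HTTP in a background thread.

    port=0 asks the OS for any free port; the chosen port is available
    afterwards as server.server_address[1]. Hypothetical helper.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

    The same directory could be exported over other protocols side by side; plain HTTP GET is simply the one most likely to have a client on whatever machine shows up.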

    There's probably room for a lucrative business based around this-- figuring out the most elegant way to archive and retain meaningful access to data under various computing/disaster scenarios. Hey, I do consulting. :)

    1. Re:Only half the problem by LoudMusic · · Score: 4, Informative

      (Ever tried to get data off an obsolete tape backup?)

      I think the most reliable archival system is going to be an active one, where data is saved on modern storage hardware and always copied to more modern tech as it arrives.

      Oh man, the headaches involved here. It only takes five years and archived data is obsolete. And yes, virtualization can help, but in the past I've resorted to keeping an entire system available, off-line, to guarantee that the client would be able to open their data. Sometimes you get lucky and there's either a plug-in for the old app to export to the new app, or one for the new app to import from the old app. But even on the rare chance that one is available, I've never seen a 100% conversion, even on simple stuff.

      Maybe old data was meant to die.
      --
      No sig for you. YOU GET NO SIG!
  2. Re:What about filling it up? by Blkdeath · · Score: 4, Informative

    Unless 10 PB (petabytes) means something other than what I think (10,000 terabytes), where did they get the $4700 number? I even read their definition of static cost (You have to go up a few paragraphs) and I still don't know.

    Table 3: Comparison of system and operational costs for 10 PB of storage. All costs are in thousands of dollars and reflect common configurations. Operational costs were calculated assuming energy costs of $0.20/kWh (including cooling costs).

    Does $4.7 million sound a bit more realistic?
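    The unit conversion spelled out, as a quick sketch. The only inputs are the 4700 figure from the parent post, the "thousands of dollars" note in the table caption, and 10 PB taken as 10,000 TB; the per-terabyte figure is just derived from those and is not a number quoted from the paper.

```python
# Table 3 lists costs in thousands of dollars, so the raw entry must be scaled.
TABLE_UNIT_DOLLARS = 1_000   # "All costs are in thousands of dollars"
table_entry = 4_700          # the figure the parent post read as "$4700"
capacity_tb = 10 * 1_000     # 10 PB, taking 1 PB = 1,000 TB as the parent does

total_dollars = table_entry * TABLE_UNIT_DOLLARS
dollars_per_tb = total_dollars / capacity_tb

print(f"total:  ${total_dollars:,}")
print(f"per TB: ${dollars_per_tb:,.2f}")
```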

    --
    BD Phone Home!

    Shameless plug. Like you weren't expecting it.

  3. Re:Sometimes old tech is best by evanbd · · Score: 4, Informative

    You could, of course, update the technology a bit: Rosetta Project. High density, readable with a high quality microscope, and partially readable with the naked eye -- the spiral of shrinking text should make the usage instructions obvious: "get a magnifying glass, there's more here."

  4. Re:Uh, what? by hypnagogue · · Score: 3, Informative

    Well the Old Testament was written by backward Taliban types in the dark ages. What do you expect? Something I didn't realise about the Old Testament until recently is that when they talk of the Philistines binding Samson in 'chains of iron' it's because the Philistines had managed to master the technology to use iron but the Israelites hadn't.

    1 Samuel 13:19: Now there was no blacksmith to be found throughout all the land of Israel, for the Philistines said, "Lest the Hebrews make themselves swords or spears."

    It wasn't a technology gap, it was arms control enforced during centuries of oppression. They certainly did have the technology, as the technology itself is described in dozens of passages. (Deu 4:20, 1 Sa 12:31)

    There's lots of other signs that they were not exactly academically inclined either, like the biblical value of 3 for Pi which was less accurate than the value the competing civilisations knew.

    1) A round bathtub is not the same thing as a circular bathtub, 2) even if it was circular you forgot to account for the annulus.

    But don't worry your arrogant little head about it. Other people are stupid and you are smart.
    --
    Liberty you never use is liberty you lose.