How Long Do You Want Digital Media To Last?
spamfiltertest writes "CNET asks 'Would you like your digital-storage media to last 20 years, 25 years, 30 years, 35 years or 40 years?' If you're an organization or government agency, the U.S. government and an optical-disc industry group would like you to answer that question in a quick survey. I would think that we would like our data to last forever, but maybe it's just me."
I work in the records department of a two-year tech college. We use document imaging hardware and software to store student files permanently on WORM optical media, and then we destroy the physical paper files over time.
We expect that our digital media will far outlast what we have on other permanent storage mediums, such as microfiche, which go back to 1972. If the "antiquated" microfiche can hold up that long why not our records stored on the digital media?
We realize that no storage method is 100% foolproof (e.g. you can misfile microfiche, lose physical files, misplace pages, etc.) but we have put a lot of faith into the setup we currently have. If time has a negative effect on both the originals and backups we could find ourselves reverting to tried and true methods used in years past.
It's mildly humorous to me that long term data integrity (i.e. "forever") is never mentioned when companies present you with all the benefits of a digital setup. The benefits of the system are great (such as easy access to student information at various sites without any reproduction necessary, security features, etc) but will our microfiche outlast our digital media? I may never know but currently, based on recent discussions about the degradation of digital media over time, it appears that it may.
I feel sorry for the poor bastards that would have to go back to storing and reproducing everything to and from microfiche if and when we find out that digital media might not have the necessary longevity we require.
Make it last as long as possible. Any media set to self-destruct after a set date is no use to anyone. Make the best you can and keep improving it.
I like muppets.
i want it locked up in some archaic and obsolete drm so that i can't get at it anyway.
sum.zero
In 25 - 30 years, the data on that disk probably won't be readable by the software available then. It's just like that 8-track you will never find a car to play it in. To keep your data current you'd have to convert and re-archive it every so many years.
It must at least last until you are sure you don't need the data anymore.
If you mod this up, your slashdot background will turn into a beautiful sunset!
While data is obviously stored on media, talking about the lifetime of data is not the same as talking about the lifetime of media. So, the original poster's "forever" comment is unrelated to the survey he links to.
If you have media that you know won't last over 30 years, just copy it onto new media at the 20-25 year point. In most cases, that's not that big of a deal. Besides, by the time that 20-25 year mark rolls around, it's very likely that you'd want to convert to a faster "online" media anyway, like holographic storage.
You have enemies? Good. That means you've stood up for something, sometime in your life. --Winston Churchill
I'm not sure it's realistic. One nice thing about digital storage is you can copy it to new media with no loss at all. A book, or painting, or photograph, might last longer (in theory). But when it does wear out it can't be magically duplicated like bits can.
So if you want stuff to last forever, each generation of people needs to convert the old stuff into a new format. But if you are only doing this once a generation, it's not that big of a deal. You could even make it a family tradition, the passing of the old to the new. Assuming of course that you actually care about keeping something 'forever'.
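For what it's worth, the "copy it to new media with no loss" step can be checked rather than assumed. Here is a minimal sketch in Python, assuming the old and new media are both mounted as ordinary file paths; `file_digest` and `migrate_file` are hypothetical helper names made up for this illustration, not part of any archival tool.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def migrate_file(src: Path, dst: Path) -> str:
    """Copy src to dst and confirm the copy is bit-for-bit identical.

    Hypothetical helper: raises OSError if the digests differ,
    otherwise returns the digest so it can be recorded for next time.
    """
    before = file_digest(src)
    shutil.copyfile(src, dst)
    after = file_digest(dst)
    if before != after:
        raise OSError(f"copy of {src} does not match the original")
    return after


if __name__ == "__main__":
    # Demo with throwaway files standing in for old and new media.
    with tempfile.TemporaryDirectory() as d:
        old = Path(d) / "old_media.bin"
        new = Path(d) / "new_media.bin"
        old.write_bytes(b"family photos" * 1000)
        print(migrate_file(old, new))
```

If you write down the returned digest at each migration, the next generation can confirm that what they inherited is what you stored.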
As long as the data can be transferred quickly (no CD swapping) I don't need the hardware to last for decades if I can move the data over to another system without a problem before it fails. The whole point of digital data is that it can be replicated and transferred, not that the hardware lasts forever.
The whole point of storing data on WORM media is to prove that the data remained unaltered during storage.
You want to be able to have an audit trail that shows any modifications (timestamps included) to the records. You also want to make sure that images that were stored were unaltered ("photoshopped"). You want to make sure that an exact copy of the information was stored and remains exact for the life of the media.
If it's not stored on write once media then that can't be guaranteed.
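As a rough illustration of the "remains exact" requirement, here is a sketch in Python of recording a digest when a record is stored and checking it later. This is an assumption about how one might detect alteration in software; it does not replace the write-once guarantee, since the recorded digest itself has to be kept somewhere it cannot be changed. The names `sha256_digest` and `is_unaltered` are hypothetical.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a stored record."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, recorded_digest: str) -> bool:
    """True if the record still matches the digest taken at storage time."""
    return sha256_digest(data) == recorded_digest


# At storage time: record the digest alongside the audit trail entry.
scanned_page = b"student record: scanned page 1"
recorded = sha256_digest(scanned_page)

# Years later: any change to the bytes, however small, is detected.
tampered_page = b"student record: scanned page 2"
print(is_unaltered(scanned_page, recorded))
print(is_unaltered(tampered_page, recorded))
```

This only tells you that something changed, not what or when, which is why the audit trail with timestamps still matters.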
You might have actually hit the nail on the head here. If the archival media doesn't last longer than copyright, the material may never enter the public domain. We're already seeing this loss with film and books.
So for me I think of it this way. My parents and grandparents have only a few pics of the generations that came before. Some really old pictures we have came from around 1910. The pictures are for the most part not in very good shape. I see these pictures of these people who were loved deeply by the people I love and I wish I could know them better.
Now I have a nice digital camera (Canon Digital Rebel) that was expensive, but I got it for a good reason. I am about to get married and do the whole family thing. I hope that someday my great-grandkids, or maybe even someone further down the line, will be able to look at all the pictures I will take and maybe understand a little better where they came from, what the world was like, and how pretty their great-grandma was :)
Professional archivists tend to recommend that data be turned over onto new media every 5 years regardless of how well it's weathering the years.
But the truth is that, paradoxically, the most critical data tends to be the least likely to be refreshed, because access to it is typically quite limited.
Our own Department of Defense doesn't know where it stashed all of its nuclear materials over the years. Why? Because they recorded it on a magnetic tape, put the tape in a vault, and had someone stand in front of the vault with a gun for 40 years. Now the tape has turned to goo, and in other cases the tape seems readable but there is no technology available to read it.
We should always strive for and recommend rigorous archival policies, but we should also strive for media that can possibly withstand the ages should some knucklehead put it in a concrete box or just forget about it completely for a few decades instead.
This is just like television, only you can see much further.
Your experience is the opposite of mine. 1.2MB floppies use the majority of their theoretical storage capacity and as such are quite fragile. The most durable floppies, in fact, are 360kB DS/DD 5.25", as they store the least data per unit of area. (Or, of course, single sided 180kB discs, which are basically the same thing.)
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
That is why I have always used plain text for the most part.
The files I wrote in the late '70s are still around today, and completely readable and editable.
Whatever editor I used (I used edlin for a very short time, and then WordPerfect and MS Word - as well as several no-name apps - on DOS machines - later I found vi and emacs under Unix - and have dabbled in OpenOffice and Abiword on my Linux systems) I made sure it had the option of saving the files as plain text - or I quickly stopped using it.
Nowadays I am using XML for anything significant (that I think I may want to publish - on the web or in print) - and plain text for everything else (and XML is really plain text from a software standpoint).
I don't have any software incompatibilities because I don't use proprietary formats to begin with.
Not everyone thinks like that; however, I am trying to educate as many as possible. Some yahoo sent me a Visio drawing the other day; I sent him a message saying, "save it as jpeg or png so I can read it". He did, and I was happy (not to mention I could easily incorporate his drawing in my own documentation/notes or translate it to some other format if needed).
This happens all the time: someone sends me a Microsoft Project file, or some other format that I do not have software for. I force them to change it to an open format - and after a while they learn (at least to send me data that fits my open model - or that I can translate to something open). When their tools are dead and their files are useless, I will still be able to reference information from the past.
So, your argument about software is only valid if you use proprietary file formats (only readable by one software application). I do not - and so your argument is not valid for me.
No one should use proprietary file formats for this reason - proprietary file formats hinder the migration of data from one technology to another (hardware or software).
Lodragan Draoidh
The more you explain it, the more I don't understand it. - Mark Twain
There are several problems with P2P data storage techniques:
1) The data over time becomes corrupted. This can be from ordinary memory copy errors (a stray cosmic ray turns a 1 into a 0 or the other way around), or when you send a packet over the network somehow the checksum works out even with corrupted data (it does happen quite often... especially over many generations of data). It happens, so get used to it, and over thousands of years it will be a huge issue. I've found that bit rot over even 10-15 years is incredibly huge for most magnetic media, and optical media, while slightly better than magnetic media, still has some serious problems over time. Electronic memory (RAM) is even worse.
2) P2P data stores are based on popularity. Data that is frequently requested will always be available. The problem is with the data that may only have occasional usefulness, but when it is needed it is very valuable. This is BTW a problem of the ages as well, as even dead-tree librarians struggle with this same issue, where you have to discard genuine garbage from time to time, and have to decide if it truly is garbage or something that has long term value. The difference between a dead-tree library and a P2P system is that this cycle is 5 to 10 years for a dead-tree library but only on the order of days or hours for a P2P repository, and stuff gets discarded much more quickly.
3) Trusted sources of data are hard to identify. This is an issue even larger for P2P systems. The point of a decentralized P2P system is that taking down any one node won't kill the network or even lose the data (hopefully). The problem here is that with all nodes being (supposedly) equal you can't tell real data from forged and/or modified data (avenues for censorship of all kinds and forms). Just because you have 10 copies from 10 sources that say one piece of data is a certain way doesn't mean that the one lone server that says differently is wrong. What are the criteria to show which data packet should be ignored? Again this is a dead-tree library issue as well, but there you have publisher reputations and "original manuscripts" to compare against that are not available in a P2P environment.
While a neat idea, there is quite a bit more work to be done addressing these and other problems with P2P networks. There are valid uses for the technology, and some of these issues are being dealt with in various degrees, but you can't ignore the fundamental problems with the technology and information storage issues in general.
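To make point 3 concrete, here is a small Python sketch of why counting copies is not enough. It assumes, purely for illustration, that a digest of the genuine data was published somewhere you already trust (the P2P equivalent of an "original manuscript"); that trusted digest is exactly the thing a flat P2P network does not give you for free. The function names are hypothetical.

```python
import hashlib
from collections import Counter
from typing import List


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a copy."""
    return hashlib.sha256(data).hexdigest()


def majority_copy(copies: List[bytes]) -> bytes:
    """Pick whichever version the most peers agree on."""
    return Counter(copies).most_common(1)[0][0]


def verified_copies(copies: List[bytes], trusted_digest: str) -> List[bytes]:
    """Keep only the copies that match a digest from a trusted source."""
    return [c for c in copies if digest(c) == trusted_digest]


genuine = b"original manuscript"
forged = b"original manuscript (edited)"

# Ten peers serve the forged version, one lone server has the real one.
copies = [forged] * 10 + [genuine]
trusted = digest(genuine)  # assumed to come from outside the network

print(majority_copy(copies) == forged)
print(verified_copies(copies, trusted) == [genuine])
```

Majority voting picks the forgery here; only the out-of-band digest singles out the lone correct copy.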