The Digital Dark Age
zygan wrote to mention a Fairfax Digital article about the possibility of a digital dark age, as a result of the increasingly short-term lifespan of digital storage. From the article: "It is 2045, he suggests, and his grandchildren are exploring the attic of his old house when they come across a CD-ROM and a letter, which explains that the disk contains a document that provides directions to obtaining the family fortune. The children are excited. 'But they've never seen a CD before - except in old movies - and, even if they found a suitable disk drive, how will they run the software necessary to interpret the information on the disk? How can they read my obsolete digital document?'"
Scary article. But probably too true.
In my opinion data archival screams to be handled in as simple and lowest-common-denominator a way as possible. For me, that means text for documents, and picture formats that would seem guaranteed to be around for a long time, if not forever. I'm guessing a good candidate for pictures would be something like jpg. I can't imagine jpg going away or ever being a non-decipherable picture format. Video might be a tougher nut to crack but I would guess some flavor of mpg.
Note that none of these flavors (text, jpg, or mpg) include or imply any reliance on vendor proprietary formats (yes, I know there's a certain proprietary tinge to the picture and video forms, but they're pretty universal). So, storing and archiving for historical purposes rules out Microsoft and all of their formats. This would especially make sense considering there are already huge compatibility issues with Microsoft documents among the various versions of their products.
Also, for retrieval assurance it no longer makes sense to me to use "dead" or "inert" methods for storage, e.g., tapes, cds, dvds, etc. Instead, at least for my purposes I maintain multiple physical and current storage devices for all of my important data. This has been a recent (last three years) development for me when I started reading about early failures of the supposedly rugged storage.
So, that being the case, it introduced the need to devise a strategy for forward migration of all of my data so nothing got left behind. Fortunately, this has been mostly easy since right now the "active" storage du jour seems to be hard disk drives, and the capacity has grown sufficiently with each new generation of drives that I have been able to simply roll my old data forward onto the new drives alongside the new data, with plenty of room to spare.
This shouldn't be an approach foreign to companies with reasonably competent data shops either. But maybe a philosophical change. All is not lost, and hopefully all will not be.
Just my $.02. ~
Really, who knows what the future holds? And who says we won't be able to trace history back to these days and even further? Just because we don't use a medium anymore doesn't mean it is forgotten and that no one will ever be able to read it again. I mean, if one did some digging, I bet he/she would find the information needed to read even punch cards. Just my 2 cents.
Sometimes I comment just to hear myself typing.
Each moment arises out of the moment before - call it 'dependent arising'. No object exists in perpetuity - even black holes evaporate over long time spans.
This being said, our digital storage systems, in a collective sense, are becoming more like a brain and less like an archive. 'Memories' of some importance are in multiple locations and accessible via different search methods. They're also being changed, just as memories of our pasts acquire a patina as we age. Someone took something I wrote in the early 90s on Usenet and added it to their humor site. My flickr content is spreading if the hits are any indication, as are my contributions to YouTube.
Public records are an important thing, but understand the other, positive things that are happening in the background as the internet acts less like a database and more like a neural net with each passing day.
I am very easy to get along with, but I don't have time to waste being nice to people who are being stupid. -Theo
Subject of a Cowboy Bebop episode. This is why I watch anime. They actually take some time to examine an idea like where to find a Betamax player 150 years from now. http://rfblues.aaanime.net/Sessions/session18.htm
Sorry about the writing. Robot fingers, you know? Cliff Steele in DOOM PATROL #23
Just give me the document. I'll print off a hard copy today, that new fangled paper technology looks promising (Assume acid free paper, no sunlight, etc, for you picky individuals). Just leave them a cd with my contact info. I will give them the directions to the family fortune, I promise. You can trust me, I'm a [insert political party of choice here].
I read an article about 10 months ago about the "death of history" due to the electronic age.
In a nutshell, as we've moved to more digital forms of communication (phone and email), one of the primary methods historians use to piece together older eras is going extinct - the written correspondence from one person to the next.
It was an excellent article; my google-fu sucks apparently because I can't find hide nor hair of it. Curses. No +5 Informative for me.
You better watch out, there may be dogs about . .
Reminds me of a discussion I once got into about analog vs. digital storage. Some of the people on the analog side argued that the myth of digital media being everlasting is false -- which it is. Digital media, on their own, should be seen as temporary storage. The true virtue of digital media isn't even the media itself -- it's the content. Content is what can be copied over and over again with no degradation.
;P
Like oral traditions, the chain of copying needs to remain unbroken for any information to truly last forever, outliving "mere mortal" media. As long as P2P networks continue to exist, I can die happily knowing that the sum of mankind's knowledge will be floating around there somewhere... even if it is buried under millions of terabytes worth of lesbian porn.
Skype is too convoluted... Now I'm reverse-engineering the Kyoto Protocol.
Why do people keep saying CDs die in 2 years/5 years/x years? Has anyone actually had a CD die on them? I have CDs in front of me at this very moment that are over 10 years old and still work great (yes, I did in fact test them). Is there some conspiracy by the blank CD manufacturers to make you think all your CDs are going to die so you need to keep transferring the contents from one disk to another forever?
Heh, that's something I didn't understand in the scenario mentioned in the summary: why would someone write a letter explaining a document on a CD, but then not bother to print out the document itself? Seems a bit weird to be combining "formats" like that, if you will. More than likely what would happen is that the grandchildren would find a spindle of CDs that may contain old family photographs and throw them out, not knowing what they contained (priceless family memories, or they could just be Leisure Suit Larry games).
Monstar L
Seriously, if you want to think in terms of 100-150 years, this is a solved problem, and without the need for stone tablets. Pigment-based inks on acid-free paper. Silver-based black and white photo chemistry on acid-free paper. Stitched bindings, not glue. Store in a trunk where there's negligible light. Put the trunk in the attic of a house where it's reasonably safe from large amounts of water (rain or flood). Civil War documents using these techniques have survived nicely to the present day. The Bell Labs archives have Alexander Graham Bell's original laboratory notebooks, still easily legible. To date, there are no reliable archival media for this length of time for audio or moving pictures. Write it down. Sketch it (as silver-based photographic materials are getting harder and harder to find). And you can be the source material for the historians of 2155 :^)
Only if you expect to be in the situation of having no software to read JPG, and no specification. That's a slightly extreme scenario? Since your data has been, obviously, carried forward. You could always carry forward source code or specifications too, along with your JPG corpus. Or am I missing something?
you had me at #!
Idea #1
What about a semi-intelligent expert system daemon that, given two document formats, could figure out how to convert one to the other?
Consider this: I would like to archive a set of CAD documents, but they're in archaic format X. Modern CAD formats are A, B, and C. CAD programs typically have ancestors that can convert from past versions for migration purposes.
So consider an interlinked set of CAD converters:
#1 can convert formats F, G, H to formats D and E.
#2 can convert formats W, Y, X, and Z to formats I, J, K, L, and F.
#3 can convert formats D and E to formats A, B, and C.
Consider then a daemon that continuously monitors a filesystem looking for documents that aren't in a current format. It then fires up the converters and performs the conversion while archiving all past versions.
So in the example, the daemon fires up converters 2, then 1, and finally 3.
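For what it's worth, picking the order of converters is just a shortest-path search over formats. Here is a minimal Python sketch of that one piece, using the three converters from the example. The table layout and the names (CONVERTERS, CURRENT_FORMATS, conversion_chain) are made up for illustration; a real daemon would also have to actually run the converters and watch the filesystem.

```python
from collections import deque

# Converter table from the example: id -> (input formats, output formats).
# The data structure itself is an assumption made for this sketch.
CONVERTERS = {
    1: ({"F", "G", "H"}, {"D", "E"}),
    2: ({"W", "Y", "X", "Z"}, {"I", "J", "K", "L", "F"}),
    3: ({"D", "E"}, {"A", "B", "C"}),
}
CURRENT_FORMATS = {"A", "B", "C"}


def conversion_chain(source, converters=CONVERTERS, targets=CURRENT_FORMATS):
    """Return the shortest list of converter ids taking `source` to a current format.

    Returns [] if the source is already current and None if no chain exists.
    """
    if source in targets:
        return []
    queue = deque([(source, [])])  # breadth-first search over formats
    seen = {source}
    while queue:
        fmt, chain = queue.popleft()
        for conv_id in sorted(converters):
            inputs, outputs = converters[conv_id]
            if fmt not in inputs:
                continue
            for out in sorted(outputs):
                if out in targets:
                    return chain + [conv_id]
                if out not in seen:
                    seen.add(out)
                    queue.append((out, chain + [conv_id]))
    return None


print(conversion_chain("X"))  # prints [2, 1, 3]
```

Breadth-first search is used so the chain has as few conversion steps as possible, since every lossy conversion is a chance to drop information.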
It could also cryptographically sign the files to provide a chain-of-custody.
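A rough sketch of what that chain-of-custody might look like, in Python: each record stores the hash of the file, plus the authentication code of the previous record, so editing any earlier entry breaks everything after it. Big caveat: this uses an HMAC with a shared secret from the standard library as a stand-in for real public-key signatures, and SECRET_KEY, add_custody_record, and verify_custody_log are all names invented for this example.

```python
import hashlib
import hmac
import json

# Hypothetical key for illustration only; a real archive would use proper
# public-key signatures and managed keys rather than a hard-coded secret.
SECRET_KEY = b"example-archive-key"


def add_custody_record(log, file_bytes, note):
    """Append a record that commits to the file contents and the previous record."""
    record = {
        "sha256": hashlib.sha256(file_bytes).hexdigest(),
        "note": note,
        "prev": log[-1]["mac"] if log else "",
    }
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["mac"] = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    log.append(record)
    return record


def verify_custody_log(log):
    """Return True only if every record is intact and linked to its predecessor."""
    prev_mac = ""
    for record in log:
        body = {key: value for key, value in record.items() if key != "mac"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
        if record["prev"] != prev_mac:
            return False
        if not hmac.compare_digest(record["mac"], expected):
            return False
        prev_mac = record["mac"]
    return True


custody_log = []
add_custody_record(custody_log, b"drawing in format X", "original file")
add_custody_record(custody_log, b"drawing in format F", "after converter 2")
print(verify_custody_log(custody_log))
```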
It also maintains a set of applications and an emulator for different operating systems. When one needs to open an archaic dataset, one can either look at the converted files or call the daemon directly to seamlessly pass an emulated application session to the user if you want to look at it in the original form.
Idea #2
Documents could contain their own viewers. Yes, I know that's a bad idea making document objects executables, but hear me out. The document custodian daemon could also maintain a sandbox for document viewers to run in - it could even be a standardized virtual machine written in something like Java. This is getting a little out of my area of expertise, but I'll ask my girlfriend about it. It would get interesting after several levels of emulated virtual machines.
This year, hard drives became cheaper than tape for the first time in terms of $/GB. RAID with NFS should be way better than tape backup in terms of retention and nearline access, but I'm not really an IT guy.
I'm sure there's a business model in there somewhere.
'Be always mindful, even when ditch-digging.' --D. T. Suzuki
Copy it to a new format. That is the real beauty of digital. Since it can be perfectly duplicated easily and quickly it's no problem to move it to a newer format. I have data on my drives now that was originally on 5.25" floppy. It has just been recopied many times. Some of it has been converted to new formats, some of it is unmodified. Either way, it's still here despite being decades old.
I don't know where this silly idea comes from that somehow digital is really fragile and we'll just lose all of it later. Sure, we lose tons of it all the time, but it's worthless, by and large. The byproduct of the information age is that we produce so much of it that it is not only impossible to archive all of it, it's undesirable. To have more information than you could ever sift through would be almost as bad as having none at all.
Also, what's this stupid notion that we'll forget how to read things? That's like saying that we'll forget how to build sailing ships, now that we have motors. Of course that's not the case; the knowledge is preserved, and in the case of sailboats, they are still made.
This is even more clear for computers, since emulation is a major project for many people. We have emulators for all kinds of old systems. That means if you find data for one of them, you just load up said emulator and it'll get at it.
Digital actually seems to be the ultimate prevention against a dark age. The ease of copying information and archiving it in multiple spots means that it's difficult for a single catastrophe to wipe out large amounts of data forever. There was a lot of work in the past, for example the Mayan Codexes, that was destroyed and is totally unrecoverable. It was fragile precisely because it was hard to copy and thus there wasn't much of it around. Now, of the original hundreds of thousands of Codexes, we have but 3.
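The "perfectly duplicated" part is easy to check rather than assume. A minimal Python sketch of one migration step, assuming nothing fancier than the standard library: copy the file to the new drive, then compare SHA-256 hashes of the original and the copy. The function names (sha256_of, migrate_file) and the directory names are just for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_file(source, dest_dir):
    """Copy `source` into `dest_dir` and fail loudly if the copy differs."""
    source = Path(source)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / source.name
    shutil.copy2(source, target)  # copy2 also keeps timestamps where it can
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"copy of {source} does not match the original")
    return target


# Demo with throwaway directories standing in for the old and new drives.
with tempfile.TemporaryDirectory() as tmp:
    old_file = Path(tmp) / "old_drive" / "notes.txt"
    old_file.parent.mkdir()
    old_file.write_text("data first saved on a 5.25-inch floppy")
    new_file = migrate_file(old_file, Path(tmp) / "new_drive")
    print(new_file.read_text())
```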
I think it's just a bunch of alarmism.
I have 10-year-old CD-Rs, gold backing with the dark purple / blue phenol? dye, burnt on a 1X SCSI Plextor in 1995. Those still read flawlessly. I also have some cheap Al / yellow dye ones. They lasted about 3-4 years before starting to generate checksum errors.
It all depends on the media and storage conditions. Conditions here are very dry, 20% humidity most of the time, and the discs are stored at room temp.
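If you want to catch that kind of decay before the drive starts reporting errors, one option is to record a hash of every file when the disc is burned and re-check it every so often. A minimal Python sketch of that idea, with invented names (build_manifest, find_damage) and a temporary directory standing in for the disc:

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(root):
    """Map each file under `root` (by relative path) to its SHA-256 hash."""
    root = Path(root)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_damage(root, manifest):
    """List files that are missing or no longer match the stored manifest."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )


# Demo: a temporary directory plays the part of the disc.
with tempfile.TemporaryDirectory() as disc:
    (Path(disc) / "photo.jpg").write_bytes(b"pretend image data")
    (Path(disc) / "letter.txt").write_bytes(b"directions to the family fortune")
    manifest = build_manifest(disc)
    (Path(disc) / "letter.txt").write_bytes(b"directions to the fam\x00ly fortune")
    print(find_damage(disc, manifest))
```

The manifest itself has to live somewhere other than the disc it describes, otherwise it rots along with the data.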
I was thinking of how you could store data that would really stand up to the test of time. History provides us some examples: things cut into stone seem to do pretty well. Paper isn't bad, providing you store it well. Animal skins, not so good. Celluloid isn't either (evidenced by the old movies and cartoons that are degrading).
However, glass is really good, and while it might not have the proven track record that stone tablets do, it can also support a much higher data density. For example, Ansel Adams's original glass plate negatives are in some cases just as sharp as the day they were shot, and they should stay like that for the foreseeable future providing they're well taken care of. But even they are dependent on the chemicals used in processing -- whether the silver sticks to the glass over time, etc.
So here's what I was thinking: what if you used some sort of photographic process to physically etch a pattern of bits into glass: use a fairly strong acid and get the etching pretty deep, or maybe etch the bits at the bottom of phonograph-like grooves so that light surface touching wouldn't destroy them. If you could make something like this that could be read with a regular CD-ROM drive, that would be even better.
I think some sort of process like this is used on metal (or is it actually glass?) to make the dies for stamping CDs. Basically I'm suggesting just make and retain the masters, but don't degrade them by stamping anything.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
For optical media, it's very easy... assuming the media actually survives, it's the same way this guy plays vinyl LPs using a flatbed scanner:
http://wired-vig.wired.com/news/digiwood/0,1412,57769,00.html
Obviously, in the future, ultra-high resolution optical input will put the current scanning/video technology to shame; they will just need to scan the thing in and run a program against the data to get the contents of the media back.
I work with a bunch of library science and archivist types who worry about this all the time.
It's such a pain taking care of books that are a few hundred years old. But they miss the point when it comes to digital.
For example, data I had on 5.25" floppies was moved to 3.5" floppies, then to a 20MB hd, then to a CD-ROM, then onto my current system.
If it's that important you transition it to new media.
I'm sure I read the same article several years ago. I cannot remember where, maybe in Scientific American or such. After a search on sciam.com I have found this, dated January 1995, more than ten years ago. Are we reading the oldest news ever posted on slashdot?
this post contain no useful information, no need to mod it down
This question is akin to somebody in 1900 asking what the world would be like in 2000 when the population kept growing and everybody had horses on the street - "think of all that manure accumulation - how will we walk without stepping in crap?"
The point is - the question is irrelevant. In 100 years, assuming the continued growth of storage mediums, the average personal user will have access to terabytes, if not more, for personal use. I imagine that the most basic of ISPs (if such an entity continues to exist separately from other existing utilities) will provide users with gigabytes of personal space online to store and back up their data. The only reason to put things on physical mediums will be for short-term backups.
I think a more pressing question is "will we be able to find the needle in the haystack?" Sure - Google does a decent job of indexing the internet now but even they are not 100%. Also the fact that while they may not be 'EVIL' today, it only takes 1 CEO change for them to become what most other companies are and then it's up to the next do-gooder to start an index from scratch. Then, assuming you can find stuff, you'll have to break the 200Mb encryption key. Luckily, the local Kinkos will have a quantum computer that you can use for $7.50/hour.