Slashdot Mirror


The Digital Dark Age

zygan wrote to mention a Fairfax Digital article about the possibility of a digital dark age, as a result of the increasingly short lifespan of digital storage. From the article: "It is 2045, he suggests, and his grandchildren are exploring the attic of his old house when they come across a CD-ROM and a letter, which explains that the disk contains a document that provides directions to obtaining the family fortune. The children are excited. 'But they've never seen a CD before - except in old movies - and, even if they found a suitable disk drive, how will they run the software necessary to interpret the information on the disk? How can they read my obsolete digital document?'"

16 of 413 comments

  1. this should be soluble. by yagu · · Score: 5, Interesting

    Scary article. But probably too true.

    In my opinion data archival screams to be handled in as simple and lowest-common-denominator a way as possible. For me, that means text for documents, and picture formats that would seem guaranteed to be around for a long time, if not forever. I'm guessing a good candidate for pictures would be something like jpg. I can't imagine jpg going away or ever being a non-decipherable picture format. Video might be a tougher nut to crack but I would guess some flavor of mpg.

    Note that none of these flavors (text, jpg, or mpg) includes or implies any reliance on vendor proprietary formats (yes, I know there's a certain proprietary tinge to the picture and video forms, but they're pretty universal). So, storing and archiving for historical purposes rules out Microsoft and all of their formats. This would especially make sense considering there are already huge compatibility issues with Microsoft documents among the various versions of their products.

    Also, for retrieval assurance it no longer makes sense to me to use "dead" or "inert" methods for storage, e.g., tapes, CDs, DVDs, etc. Instead, at least for my purposes, I maintain multiple physical and current storage devices for all of my important data. This has been a recent (last three years) development for me, starting when I began reading about early failures of the supposedly rugged storage.

    So, that being the case, it introduces (introduced) the need to devise a strategy for forward migration of all of my data so nothing gets left behind. Fortunately, this has been mostly easy, since right now the "active" storage du jour seems to be hard disk drives, and the capacity has grown sufficiently with each new generation of drives that I have been able to simply roll my old data forward onto the new drives alongside the new data, with plenty of room to spare.

    This shouldn't be an approach foreign to companies with reasonably competent data shops either. But it may take a philosophical change. All is not lost, and hopefully all will not be.

    Just my $.02. ~

    1. Re:this should be soluble. by Anonymous Coward · · Score: 4, Funny

      this should be soluble

      That could be a problem. At least a CD won't get damaged by water.

    2. Re:this should be soluble. by merreborn · · Score: 5, Informative

      I'd think bmp would be preferable to jpg. bmp is to images what .txt is to text (and while ASCII is arbitrary, it's a single substitution cypher, and therefore easily crackable) -- the simplest, uncompressed format. I've written 1-bit (black and white) bitmaps by hand. I couldn't ever hope to do the same in jpeg.
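
      For anyone wondering what "by hand" amounts to, here is a minimal sketch in Python, assuming the common 40-byte BITMAPINFOHEADER layout and a two-entry black/white palette. The function name make_1bit_bmp and the input format (strings of '0' and '1') are made up for illustration.

```python
import struct

def make_1bit_bmp(rows):
    """Build a 1-bit BMP from equal-length strings of '0'/'1' (1 = white)."""
    height = len(rows)
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    row_bytes = (width + 7) // 8           # 8 pixels per byte
    padded = (row_bytes + 3) // 4 * 4      # each row is padded to a multiple of 4 bytes
    pixel_data = bytearray()
    for row in reversed(rows):             # BMP stores the bottom row first
        bits = row.ljust(row_bytes * 8, "0")
        packed = int(bits, 2).to_bytes(row_bytes, "big")  # leftmost pixel = highest bit
        pixel_data += packed.ljust(padded, b"\x00")
    # Palette entries are (blue, green, red, reserved): index 0 black, index 1 white.
    palette = bytes([0, 0, 0, 0, 255, 255, 255, 0])
    offset = 14 + 40 + len(palette)        # file header + info header + palette
    file_header = struct.pack("<2sIHHI", b"BM", offset + len(pixel_data), 0, 0, offset)
    info_header = struct.pack(
        "<IiiHHIIiiII",
        40, width, height,                 # header size, width, height (positive = bottom-up)
        1, 1,                              # colour planes, bits per pixel
        0, len(pixel_data),                # no compression, size of pixel data
        2835, 2835,                        # about 72 dpi, in pixels per metre
        2, 2,                              # palette colours used / important
    )
    return file_header + info_header + palette + bytes(pixel_data)

checker = make_1bit_bmp(["10", "01"])      # 2x2 checkerboard
```

      Writing the returned bytes to a file opened in "wb" mode should give something an ordinary image viewer will open. The whole format fits in a couple of dozen lines, which is the point.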

    3. Re:this should be soluble. by jd · · Score: 5, Interesting
      I would personally opt for PNG for images, to avoid loss of data. Video almost has to be MPEG, as neither MNG nor APNG has really gone anywhere at this time and the BBC's high definition format isn't getting much adoption yet either. For audio, MP4 would seem the best choice - less loss of data, but more likely to be readable in the far future than Ogg Vorbis (which is a shame) or AIFF (yay! AIFF's gonna die!)


      No matter what form you store the data in, if you want it readable in the far future, you've got to remember two things - there's no guarantee ANY specific technology will exist, and there's no guarantee ANY specific timeframe for the reading to take place.


      What you want, then, is to do the reverse of the language decoding that has taken place over the years. Imagine yourself faced with a puzzle every bit as baffling as Egyptian hieroglyphics, only stored at a vastly greater information density and probably in an electronic format. What would you want/need, to be able to recover the data?


      Well, there would seem to be a few things that are essential. First, the explorer in the future will need to know the data is there and in what form. So, if you're using optical storage, make that clear (along with frequency). If you're using N-state logic, make it clear what N is. If there are M layers, tell them the value of M. You don't need to spell out all of the technical information, because all they need is where to start looking.


      Secondly, the information needs to be correctly indexed. Languages get deciphered because types of information can be grouped and identified. The same will be true here. So, produce a contents list with corresponding data formats and/or MIME types, along with the offsets within the medium.
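
      A rough sketch of such a contents list, in Python, assuming the "medium" is just one flat run of bytes with the files laid end to end. The names build_archive and format_index are hypothetical, and the MIME types are only guessed from file extensions with the standard mimetypes module.

```python
import mimetypes

def build_archive(files):
    """Concatenate (name, data) pairs; return the blob and an index of its contents."""
    blob = bytearray()
    index = []
    for name, data in files:
        mime, _encoding = mimetypes.guess_type(name)
        index.append({
            "name": name,
            "mime": mime or "application/octet-stream",  # fallback for unknown extensions
            "offset": len(blob),                         # where this file starts in the blob
            "length": len(data),
        })
        blob += data
    return bytes(blob), index

def format_index(index):
    """Render the index as plain text, suitable for printing on paper."""
    lines = ["offset      length      type                        name"]
    for entry in index:
        lines.append(
            f"{entry['offset']:<10}  {entry['length']:<10}  "
            f"{entry['mime']:<26}  {entry['name']}"
        )
    return "\n".join(lines)
```

      The printed index is what goes in the box next to the disk: each entry tells the reader what kind of thing to expect and exactly where it starts and ends.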


      Thirdly, a key is a REALLY good idea - something analogous to the Rosetta Stone. Let's say you're using binary logic and a fairly rudimentary FS on the storage medium with text-based directories. The key would be a printout of the root directory in binary, again in ASCII and a third time as a set of records describing the logical layout. The printout would also need the offset of the directory. From this, it would be trivial for someone in the year 3000 to determine how offsets were calculated, how the data was laid on the disk and how the data is connected.
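
      As a sketch of what the first two columns of that printout might look like, here is a bit of Python that renders the same bytes as binary and as ASCII, with the offset in front. The function name rosetta_lines and the layout are my own invention, and the third rendering (the records describing the logical layout) is left out because it depends entirely on the filesystem.

```python
def rosetta_lines(data, offset=0, width=4):
    """Return printable lines showing offset, bits and ASCII for the same bytes."""
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        bits = " ".join(f"{byte:08b}" for byte in chunk)
        # Bytes outside printable ASCII are shown as '.' so columns stay readable.
        text = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append(f"{offset + i:08d}  {bits}  {text}")
    return lines

for line in rosetta_lines(b"README.TXT", offset=512):
    print(line)
```

      Seeing 01000001 sitting next to "A" a few dozen times is about the quickest way to let someone work out the encoding without being told.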


      If physical storage is going to be used, ensure the various media used will last about the same length of time. So, if you're aiming for a hundred years, CDs may just about work. But you must NOT have the CD in contact with sulphides or anything else which will destroy the surface. The CD must be kept cold (but not so cold it is damaged) to slow decomposition. It should also be kept somewhere where accidental exposure to UV is impossible.


      If you're keeping paper notes with the data, as I've suggested, the paper must be acid-free and the inks must be long-lasting. Most modern paper is of very low grade, as are most modern inks.


      If you're looking more at a time capsule that is for the FAR future (we're talking something that happens AFTER Star Trek), then you've got to be extra careful but it should still be possible. I see no reason why you couldn't have physical storage under ideal conditions which could be retrievable after a thousand years or so. You just have to be very careful about what you choose to use. Same with paper. If you're looking to produce the next Beowulf (no, not the clustering technology), then you're probably going to want to look at vellum or some other extremely high-quality medium. I'd also look up early inks on the Internet and modify a recipe that could be used as a refill for a printer ink cartridge. Many early inks are highly stable (iron oxide is one example) and fade more by damage to the medium than decay of the ink.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    4. Re:this should be soluble. by Pharmboy · · Score: 4, Informative

      I have heard the same for photographs. Today's photographic paper isn't the same as the older stuff: it has less silver, and it tends to fade quicker. While we can rely on 100 year old photographs, our descendants may not. Most paper nowadays is relatively acidic as well, so it breaks down faster with any exposure. This would mean books as well. While there is good paper that is better than the old stuff, most is made to be cheap, not high quality.

      --
      Tequila: It's not just for breakfast anymore!
    5. Re:this should be soluble. by kabz · · Score: 4, Funny

      Yeah, I can just imagine ...

      You find the CD buried in a box in the garden.
      You see the Microsoft logo. An old, long-dead company.
      You scrape some dust off the CD.
      You read through the logos and fine print on the CD.
      You see the logo 'PlaysForSure' (tm)
      You groan and throw the CD in the trash.

      --
      -- "It's not stalking if you're married!" My Wife.
  2. dark age by foxhound01 · · Score: 5, Funny

    They'll take it to that crazy old guy in the corner house with uncut grass in his lawn, for he was once a great programming guru and has a ton of still functioning archaic equipment that requires insanely large amounts of power.

    --


    Linux is to the internet as Duct Tape is to the Universe.
  3. easy by DrSkwid · · Score: 4, Informative

    perhaps the same way I would read a wax cylinder today

    visit a specialist

    a good place to start would be here :

    http://www.bl.uk/collections/sound-archive/wtmcylinder.html

    --
    There are places where the networks are not touching, and there are places where they are-Boeing's Lori Gunter
  4. Huh? by tktk · · Score: 4, Funny
    ...they come across a CD-ROM and a letter...

    \/\/H47'$ 4 L3773r?

  5. SESSION #18 - SPEAK LIKE A CHILD by infonography · · Score: 4, Interesting

    Subject of a Cowboy Bebop episode. This is why I watch anime. They actually take some time to examine an idea like where to find a Betamax player 150 years from now. http://rfblues.aaanime.net/Sessions/session18.htm

    --
    Sorry about the writing. Robot fingers, you know? Cliff Steele in DOOM PATROL #23
  6. The format is probably not relevant by hungrygrue · · Score: 4, Informative

    as the CD probably couldn't be read regardless. CDs do not last forever. http://www.warehousephoto.com/How_Permanent_is_your_CD-R.htm In fact many will be unreadable in as little as 2 years. If you want to archive, print it with good ink on acid-free archival paper.

  7. I think that.. by slapout · · Score: 4, Insightful

    ..a more likely outcome is that patents and DRM will lead to a digital dark age.

    --
    Coder's Stone: The programming language quick ref for iPad
  8. Re:The times they are a changing by dfjunior · · Score: 5, Funny

    Hell no!
    Zip disks are the *only* reliable way to archive digital data indefinitely

  9. Similar issues with old movies by Alien54 · · Score: 4, Insightful

    All too often these are literally rotting away in storage, because the originals are decaying, and the movie companies are unwilling to invest money to rescue them, even though they would sue you for millions if you published these on your own.

    --
    "It is a greater offense to steal men's labor, than their clothes"
  10. Yeah, but so what? by ottffssent · · Score: 4, Insightful

    The example's contrived. I don't like contrived examples unless they illustrate an important principle, which this one doesn't really do. Such data loss has already started happening even in my own life, but I don't think that's a bad thing. The fairly minimal effort required to keep data up-to-date is a natural impediment to a policy of keeping everything. Data which isn't worth a new hard drive and an rsync dies. Data which isn't worth the effort of importing and re-saving in a newer format dies. This isn't bad. It's not new either.

    Data goes the way of the dodo not because of technological obstacles, but because of a decision made or not made to preserve it. We don't know how the great pyramids were built, the obelisks shaped and erected, etc. not because there was no way to preserve that information, but because it wasn't important enough to justify the effort. The same is true of 10-yr-old WP documents I made to bill people when I mowed lawns for spending money, or a million other things that get saved or trashed every day.

    If you're serious about the problem, then it's not a technical hurdle. Data storage is cheap. Emulators are good. Batch document conversion is possible. The problem, if you're willing to call it that, is that the benefit has to outweigh the cost. Lowering the cost of data preservation only increases the cost of data searching and real information retrieval. And very quickly it becomes a philosophical argument about the value of preserving irrelevant knowledge in a world that has moved on. Yet the argument is couched in terms of data storage and manipulation, which is really the tiniest corner of the issue.

  11. Easier by Sycraft-fu · · Score: 4, Interesting

    Copy it to a new format. That is the real beauty of digital. Since it can be perfectly duplicated easily and quickly, it's no problem to move it to a newer format. I have data on my drives now that was originally on 5.25" floppy. It has just been recopied many times. Some of it has been converted to new formats, some of it is unmodified. Either way, it's still here despite being decades old.
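
    "Perfectly duplicated" is something you can check rather than trust. A minimal sketch in Python of a copy-and-verify pass, assuming plain files in a directory tree; the function names sha256_of and migrate are invented here, and a real migration would want logging and retries on top.

```python
import hashlib
import shutil
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def migrate(src_dir, dst_dir):
    """Copy every file under src_dir into dst_dir; return sources whose copy differs."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    mismatches = []
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file():
            continue
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)                 # copies contents and timestamps
        if sha256_of(src) != sha256_of(dst):   # re-read both sides to confirm the copy
            mismatches.append(src)
    return mismatches
```

    An empty list back from migrate means every copy hashed the same as its original, which is the digital equivalent of holding the new print up against the old one.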

    I don't know where this silly idea comes from that somehow digital is really fragile and we'll just lose all of it later. Sure, we lose tons of it all the time, but it's worthless, by and large. The byproduct of the information age is that we produce so much of it that it is not only impossible to archive all of it, it's undesirable. To have more information than you could ever sift through would be almost as bad as having none at all.

    Also, what's with this stupid notion that we'll forget how to read things? That's like saying that we'll forget how to build sailing ships now that we have motors. Of course that's not the case: the knowledge is preserved, and in the case of sailboats, they are still made.

    This is even more clear for computers, since emulation is a major project for many people. We have emulators for all kinds of old systems. That means if you find data for one of them, you just load up said emulator and it'll get at it.

    Digital actually seems to be the ultimate prevention against a dark age. The ease of copying information and archiving it in multiple spots means that it's difficult for a single catastrophe to wipe out large amounts of data forever. There was a lot of work in the past, for example the Mayan Codexes, that was destroyed and is totally unrecoverable. It was fragile precisely because it was hard to copy and thus there wasn't much of it around. Now, of the original hundreds of thousands of Codexes, we have but 3.

    I think it's just a bunch of alarmism.