Slashdot Mirror


Fault Tolerant Archive Solutions?

Bob Washburne asks: "Does anyone know of a file system or storage protocol which allows you to recover a file even when sections of the media have become corrupt? This would be for archival storage - tapes, CD-Rs, etc. - rather than on-line. I have been doing a lot of digital restoration/preservation work: digitising home recordings from the '50s, photos from the 1890s, etc., and cleaning them up. I now have several gigabytes of files - and it will continue to grow - representing hundreds of hours of work, and I'm starting to get nervous about losing it to wear or bad media."

"There are several solutions for on-line storage: RAID, UPS, and frequent backups. As I fill a CD-R I make several copies of it and send them to relatives who live out of state. So I am fairly well protected against local disaster.

But what happens when the CD-R itself becomes degraded - possibly from scratches or bad lamination - and cannot be read by the normal file system? Murphy's Law would guarantee that all the backup CDs were from the same bad batch or were lost, etc. So I am left with a CD that is 90% good, but that ugly 10% prevents me from getting my file.

I remember studying N-dimensional parity and Hamming codes in Comp Sci class, so I know that it is possible to store a file with significant error correction capabilities. But has any such scheme actually been implemented?
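
For readers who have not met Hamming codes, here is a minimal sketch of the classic Hamming(7,4) code, which stores 4 data bits in 7 and can repair any single flipped bit. The function names are made up for illustration, and an archival scheme would presumably protect far larger blocks than this toy does.

```python
def hamming74_encode(d):
    """Encode 4 data bits as 7 bits, laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(c):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three check results spell out the 1-based position of the bad bit,
    # or 0 when every check passes.
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # simulate one bit of media damage
recovered = hamming74_decode(word)  # the original four data bits
```

The cost is the size vs. robustness trade-off in its simplest form: 3 extra bits for every 4 stored, in exchange for surviving one bad bit per 7.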

I would expect any such scheme to include the ability to adjust the degree of recoverability (size vs. robustness) and to be able to span volumes. Since most physical damage is contiguous, you would hope that the storage would be non-contiguous. And you would think that this would either represent a unique file system or a custom raw storage methodology usable only by the storage application.

Thanks for your insights."

5 of 18 comments

  1. Raid on a single disk by crow · · Score: 3

    You could do RAID on a single disk (or, presumably, disc, if you're using a CD). Since you're assuming that most of the media will be good, you simply treat the disc as a collection of, say, ten 70M regions, using nine (or eight) for data and the remaining one (or two) for parity, which would allow for reconstruction of any one (or two) damaged regions.

    Of course, RAID assumes that your errors are self-detecting, but I suspect that this is also true of CD media failure.

    Now the trick is to implement it. You could encode it such that it looks like a normal ISO-9660 CD, except for special "garbage" written to the last 70M or so. You would need a special version of mkisofs, as well as special recovery tools.

    In theory, you could have mkisofs figure out exactly how much extra space you have left on the CD and use it for parity of each previous block of that size. If you have more than some threshold of free space, it could use more than one parity block for multi-block failure recovery. Then if you cram the disc to within a meg or two of being full, you are still protected from failure, but only if the failure covers a small area.
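
    The single-parity case can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are invented, the regions are assumed to be equal-length byte strings, and plain XOR parity recovers exactly one lost region. Surviving two lost regions would need a stronger code than this.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def add_parity(regions):
    """Return the data regions with one parity region appended at the end."""
    return list(regions) + [xor_blocks(regions)]


def rebuild(regions_with_parity, lost_index):
    """Reconstruct the single region at lost_index from all the others."""
    survivors = [r for i, r in enumerate(regions_with_parity) if i != lost_index]
    return xor_blocks(survivors)


# Three tiny "regions" standing in for the 70M regions on the disc.
stored = add_parity([b"home", b"tape", b"1950"])
recovered = rebuild(stored, 1)  # pretend region 1 was unreadable
```

    Note that rebuild() has to be told which region is bad, which is the "self-detecting" assumption again.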

    This sounds like a good project for a data storage class.

  2. I've done this with floppy disks... by Anonymous Coward · · Score: 3

    Because of the relatively high failure rate of floppies (in my experience), I would always make duplicate images of an original on several disks.

    First off, if you're using CD-R, buy quality recordable media. If you're using the 100-pack that you bought for $30, I'm just going to laugh at you. Make an ISO of the filesystem you want to burn, then make, say, five copies of it. You might want to check that they are all really identical (using dd to re-extract the raw ISO from each disc); otherwise this isn't going to work.

    So then, say you are using an archived CD and it's failing to read a particular file. You can 'dd' the image from the CD (turn off 'terminate on read error', and make sure it puts empty blocks in where it wasn't able to read the data). Just record the data sectors that it couldn't read, and splice them in from one of your copies (all blocks are identical; use 'dd' to extract the ones you need). Rewrite the image to a new disc, and you're done.

    "But that seems like an awful lot of work." And you're right. Chances are that at least one of your five or so discs will work "out of the box" without any splicing needed, but what happens when all of them have issues? A redundant filesystem is a great idea, adding parity so that it "just works" when blocks go bad. But if you're using CD-Rs and want an ISO 9660 filesystem, you get the ISO 9660 filesystem, which doesn't have the parity. The method I outlined above works: I've used it on floppies, I know others who have used it on DAT tapes, and it will even fit in with the "make lots of copies and send them to relatives" approach you already have.
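
    The splicing step can also be done on the extracted images rather than with repeated dd runs. The sketch below is only an illustration: splice_images is a made-up helper, it assumes 2048-byte data sectors, and it assumes you already have the list of unreadable sector numbers (for example, noted from the read errors while imaging with something like GNU dd's conv=noerror,sync).

```python
SECTOR = 2048  # assumed sector size of a data CD image


def splice_images(damaged, bad_sectors, good_copy):
    """Return `damaged` with each listed sector replaced from `good_copy`."""
    if len(damaged) != len(good_copy):
        raise ValueError("images must be the same length")
    repaired = bytearray(damaged)
    for n in bad_sectors:
        start = n * SECTOR
        repaired[start:start + SECTOR] = good_copy[start:start + SECTOR]
    return bytes(repaired)


# Toy three-sector images; sector 1 of the damaged one was read as zeros.
good = bytes([1]) * SECTOR + bytes([2]) * SECTOR + bytes([3]) * SECTOR
damaged = good[:SECTOR] + bytes(SECTOR) + good[2 * SECTOR:]
repaired = splice_images(damaged, [1], good)
```

    This only helps if the copy is readable at the sectors where the first disc is not, so it is worth repeating against each copy until no bad sectors remain.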

  3. My own solutions. by Christopher+Thomas · · Score: 3
    My own system is the following. It isn't perfect, but it's pretty robust:

    • Use a RAID 1 (drive mirroring) for day-to-day data storage.
    • Back things up on to CDROM.
    • Burn two (or more) copies of everything per CD (just copy the directory tree two or more times, so that the copies are widely separated physically).
    • Burn multiple copies of each backup CD, and store them in different locations (ideally different buildings).


    I'm blithely pretending that CDs will last forever. If they don't, then I should check the integrity of all of my backup CDs every couple of years, and copy the data from failing sets to new sets. This involves doing something like calculating a CRC code for all files in the archive, and storing a copy of this with each copy of the backup tree.
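
    A minimal sketch of that integrity check, assuming CRC-32 from Python's zlib module and a plain dictionary as the stored "copy of the CRCs". The function names are made up for illustration, and a real archive would write the manifest to a file kept alongside each backup tree.

```python
import os
import zlib


def crc32_of_file(path, chunk_size=1 << 20):
    """Compute the CRC-32 of a file without loading it all into memory."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc


def build_manifest(root):
    """Map each file's path (relative to root) to its CRC-32."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = crc32_of_file(full)
    return manifest


def find_damaged(root, manifest):
    """Return the sorted relative paths that are missing or no longer match."""
    current = build_manifest(root)
    return sorted(p for p, crc in manifest.items() if current.get(p) != crc)
```

    Build the manifest when the set is burned, then run find_damaged against each copy every couple of years. Anything it lists on one copy can be restored from a copy where the CRC still matches.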

    I also keep a paper copy of anything really important (that will fit on paper, at least).

    There's also an active system that I'm interested in trying, but it would have to be continually maintained (CDs and tapes can be left unattended for years, if they're stored well). The active system would be a bunch of servers with RAID drives that stored the files to be preserved, along with CRC information for the files. These servers actively mirror content from each other, trying to each keep a complete set of the data (updates propagate through the mirroring network). They'd also perform integrity checks on their own data and data from other servers (let the other server know that its data differs from the local copy). As long as the servers are maintained and swapped out when they fail, the data should be preserved intact forever.

    The catch is that, while you could in principle preserve storage media for a century, I wouldn't want to bet on a server network being maintained (in whatever form) for a century.
  4. CD-R already has error correction by micromoog · · Score: 3
    The CD-R format has significant error correction built in. Many of the CDs you have may already have suffered considerable damage, but still work because of the error correction.

    More info: geeky, geekier, geekiest. An interesting tidbit is that the data is interleaved serially, meaning the data and the parity codes are spread across wide arcs of the disc. That's why it's recommended to clean discs from the center out, not around the disc (so if you scratch it, you damage unrelated segments).
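
    To see why interleaving helps, here is a toy block interleaver. It is not the cross-interleaved scheme that CDs actually use, just a sketch of the idea with invented function names: symbols from several codewords are shuffled together before being written, so a contiguous scratch takes only a little from each codeword.

```python
def interleave(data, depth):
    """Write `data` into `depth` rows, then read it out column by column."""
    if len(data) % depth:
        raise ValueError("length must be a multiple of depth")
    width = len(data) // depth
    return [data[r * width + c] for c in range(width) for r in range(depth)]


def deinterleave(data, depth):
    """Invert interleave(): put every symbol back in its original position."""
    width = len(data) // depth
    return [data[c * depth + r] for r in range(depth) for c in range(width)]


# Three 4-symbol codewords, stored interleaved.
original = list(range(12))
stored = interleave(original, 3)

# A scratch wipes out three neighbouring symbols on the media.
stored[4:7] = [None, None, None]

# After deinterleaving, each codeword has lost only one symbol.
restored = deinterleave(stored, 3)
codewords = [restored[i:i + 4] for i in range(0, 12, 4)]
```

    A code that can repair one bad symbol per codeword would survive this scratch, whereas without interleaving the same scratch would wipe out most of a single codeword.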

    So, I think the idea of duplicating your CD-Rs and sending them to your relatives is a good one. For more fault tolerance, just send some more copies to some more relatives.

  5. Re:don't forget tapes by Tower · · Score: 3

    Another reason is that DLT drives and DAT autoloaders have *vastly* larger capacities than a CD-R. A small DAT cartridge could have six 12/24GB DDS-3 tapes in it. At only a couple of bucks per tape, that's dirt cheap, and you can reasonably store 72-120GB for each cartridge load. Less changing, more automation == nice.

    DLT drives are great, but definitely toward the high end of the scale. A basic DDS-3 DAT will only set you back a few hundred, and gives you a lot of room to work with.

    --
    "It's tough to be bilingual when you get hit in the head."