Slashdot Mirror


One Way To Save Digital Archives From File Corruption

storagedude points out this article about one of the perils of digital storage, the author of which "says massive digital archives are threatened by simple bit errors that can render whole files useless. The article notes that analog pictures and film can degrade and still be usable; why can't the same be true of digital files? The solution proposed by the author: two headers and error correction code (ECC) in every file."

8 of 257 comments

  1. Too much reinvention by DarkOx · · Score: 5, Interesting

    If this type of thing is implemented at the file level, every application is going to have to do its own thing. That means too many implementations, most of which won't be very good or well tested. It also means application developers will be busy slogging through error correction data in their files rather than the data they actually wanted to persist for their application. I think the article offers a number of good ideas, but it would be better to do most of them at the filesystem and perhaps some at the storage layer.
    Also, if we can present the same logical file to the application when it is read, even if every 9th byte on the disk is parity, that is a plus, because it means legacy apps get the enhanced protection as well.
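
A minimal sketch of the "every 9th byte is parity" idea, assuming a plain XOR parity byte per 8 data bytes (the comment does not say which scheme it has in mind). The names `to_disk` and `from_disk` are hypothetical. A single XOR byte can only detect that a group is damaged, not say which byte is wrong, so real protection would need a stronger code; the point here is only that the application sees the same logical bytes either way.

```python
from functools import reduce

GROUP = 8  # assumed: 8 data bytes followed by 1 parity byte


def xor_bytes(chunk: bytes) -> int:
    """XOR every byte of the chunk into a single parity byte."""
    return reduce(lambda a, b: a ^ b, chunk, 0)


def to_disk(logical: bytes) -> bytes:
    """Append one XOR parity byte after every GROUP data bytes."""
    out = bytearray()
    for i in range(0, len(logical), GROUP):
        chunk = logical[i:i + GROUP]
        out += chunk
        out.append(xor_bytes(chunk))
    return bytes(out)


def from_disk(stored: bytes) -> bytes:
    """Strip the parity bytes and return the logical file.

    Raises IOError if any group fails its parity check.
    """
    out = bytearray()
    for i in range(0, len(stored), GROUP + 1):
        group = stored[i:i + GROUP + 1]
        chunk, parity = group[:-1], group[-1]
        if xor_bytes(chunk) != parity:
            raise IOError(f"parity mismatch in group at stored offset {i}")
        out += chunk
    return bytes(out)
```

A legacy app would call only the equivalent of `from_disk` through the filesystem and never see the extra bytes.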

    --
    Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    1. Re:Too much reinvention by paradxum · · Score: 5, Insightful

      It already exists; it's called ZFS on Solaris boxxen. Each block uses ECC, it can correct itself on each read, and generally can indicate a failing disk. This truly is the filesystem every other one is playing catch-up with.
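
A rough sketch of the correct-on-read behaviour described above. ZFS does this with per-block checksums plus redundant copies (mirrors or RAID-Z); the code below is a simplified stand-in, not ZFS code, and `self_healing_read` is a hypothetical name. It assumes the expected checksum is stored somewhere other than the block itself and that the copies are mutable `bytearray` objects.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    """Checksum of one block (SHA-256 chosen here for illustration)."""
    return hashlib.sha256(block).digest()


def self_healing_read(copies: list, expected: bytes) -> bytes:
    """Return the first copy that matches the checksum and repair the rest.

    copies: list of bytearray, one per mirror (assumption of this sketch).
    expected: checksum recorded when the block was written.
    """
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        raise IOError("every copy fails its checksum; block is unrecoverable")
    for c in copies:
        if bytes(c) != good:
            c[:] = good  # overwrite the damaged mirror with the verified data
    return good
```

A copy that keeps needing repair is the kind of signal that points at a failing disk.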

    2. Re:Too much reinvention by MrNaz · · Score: 5, Insightful

      Ahem. RAID anyone? ZFS? Btrfs? Hello?

      Isn't this what filesystem devs have been concentrating on for about 5 years now?

      --
      I hate printers.
    3. Re:Too much reinvention by Hatta · · Score: 5, Interesting

      Don't forget PAR2. I never burn a DVD without 10%-20% redundancy as par2 files. Even if the filesystem gets too damaged to read, I can usually dd the whole disk and let par2 recover the files.

      --
      Give me Classic Slashdot or give me death!
  2. par files by ionix5891 · · Score: 5, Informative

    include par2 files

  3. It's that computer called the brain. by commodore64_love · · Score: 5, Interesting

    >>>"...analog pictures and film can degrade and still be usable; why can't the same be true of digital files?"

    The ear-eye-brain connection has ~500 million years of development, and has learned the ability to filter out noise. If, for example, I'm listening to a radio, the hiss is mentally filtered out, or if I'm watching a VHS tape that has wrinkles, my brain can focus on the undamaged areas. In contrast, when a computer encounters noise or errors, it panics and says, "I give up," and the digital radio or digital television goes blank.

    What we need is a smarter computer that says, "I don't know what this is supposed to be, but here's my best guess," and displays noise. Let the brain then take over and mentally remove the noise from the audio or image.
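
One small illustration of "give up" versus "best guess", using text rather than audio or video as a stand-in. Python's built-in UTF-8 decoder supports both behaviours; `strict_read` and `best_guess_read` are hypothetical wrapper names. The damaged byte is shown as the replacement character U+FFFD and the reader's brain fills in the rest.

```python
def strict_read(raw: bytes) -> str:
    """Refuse the whole input if any byte is invalid (raises UnicodeDecodeError)."""
    return raw.decode("utf-8")


def best_guess_read(raw: bytes) -> str:
    """Decode what can be decoded and mark damaged bytes with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    damaged = bytearray("analog film degrades gracefully".encode("utf-8"))
    damaged[7] = 0xFF  # simulate one corrupted byte; 0xFF is never valid UTF-8
    print(best_guess_read(bytes(damaged)))
    try:
        strict_read(bytes(damaged))
    except UnicodeDecodeError as err:
        print("strict decoder gave up:", err)
```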

    --
    "I disapprove of what you say, but I will defend to the death your right to say it." - historian Evelyn Beatrice Hall
  4. Re:What files does a single bit error destroy? by Rockoon · · Score: 5, Insightful

    Most modern compression formats will not tolerate any errors. With LZ, a single bit error could propagate over a long expanse of the uncompressed output, while with arithmetic coding the remainder of the file following the single bit error will be completely unrecoverable.

    Pretty much only the prefix-code style compression schemes (Huffman for one) will isolate errors to short segments, and then only if the compressor is not of the adaptive variety.
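
A rough way to see this fragility, assuming zlib (DEFLATE, which is LZ77 plus Huffman coding) as a stand-in for "LZ". The sketch flips each bit of a compressed stream in turn and counts how many flips still decompress to the original data. Note that zlib also verifies an Adler-32 checksum, so a damaged stream is usually rejected outright rather than returned with the damage visible. The helper names are hypothetical.

```python
import zlib


def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    out = bytearray(data)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)


def try_decompress(compressed: bytes):
    """Return the decompressed bytes, or None if zlib rejects the stream."""
    try:
        return zlib.decompress(compressed)
    except zlib.error:
        return None


def count_survivors(original: bytes) -> tuple:
    """Flip every bit of the compressed stream once.

    Returns (intact, total): how many single-bit flips still yield the
    original data, out of how many were tried.
    """
    compressed = zlib.compress(original)
    total = len(compressed) * 8
    intact = sum(
        1 for bit in range(total)
        if try_decompress(flip_bit(compressed, bit)) == original
    )
    return intact, total


if __name__ == "__main__":
    sample = "".join(f"record {i}: payload {i * i}\n" for i in range(200)).encode()
    intact, total = count_survivors(sample)
    print(f"{intact} of {total} single-bit flips left the data intact")
```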

    --
    "His name was James Damore."
  5. Parchive: Parity Archive Volume Set by khundeck · · Score: 5, Interesting

    Parchive: Parity Archive Volume Set

    It basically lets you create an archive that is larger by an amount you choose, but contains enough parity that you can have XX% corruption and still 'unzip.'

    "The original idea behind this project was to provide a tool to apply the data-recovery capability concepts of RAID-like systems to the posting and recovery of multi-part archives on Usenet. We accomplished that goal." [http://parchive.sourceforge.net/]
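
The RAID-like idea in that quote can be sketched with plain XOR parity across equal-sized blocks. This is a deliberate simplification: PAR2 itself uses Reed-Solomon codes, which can rebuild several lost blocks, whereas the hypothetical `make_parity` and `recover` below can rebuild exactly one.

```python
def make_parity(blocks: list) -> bytes:
    """XOR equal-sized blocks together into one parity block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            parity[i] ^= b
    return bytes(parity)


def recover(blocks: list, parity: bytes) -> list:
    """Rebuild the single missing block, marked as None in the list.

    XOR-ing the parity block with every surviving block leaves the
    contents of the missing one.
    """
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one missing block")
    rebuilt = bytearray(parity)
    for block in blocks:
        if block is not None:
            for i, b in enumerate(block):
                rebuilt[i] ^= b
    result = list(blocks)
    result[missing[0]] = bytes(rebuilt)
    return result
```

With one parity block per N data blocks, the overhead is 1/N of the archive size, which is the knob behind the "XX%" figure above.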

    KPH