Slashdot Mirror


Error-Proofing Data With Reed-Solomon Codes

ttsiod recommends a blog entry in which he details steps to apply Reed-Solomon codes to harden data against errors in storage media. Quoting: "The way storage quality has been nose-diving in the last years, you'll inevitably end up losing data because of bad sectors. Backing up, using RAID and version control repositories are some of the methods used to cope; here's another that can help prevent data loss in the face of bad sectors: Hardening your files with Reed-Solomon codes. It is a software-only method, and it has saved me from a lot of grief..."

8 of 196 comments

  1. ZFS? by segfaultcoredump · · Score: 3, Interesting

    Uh, is this not one of the main features of the ZFS file system? It checksums every block written and will reconstruct the data if an error is found (assuming you are using either raid-z or mirroring; otherwise it will just tell you that you had an error).
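The checksum-and-repair behaviour described above can be sketched in a few lines. This is a toy two-way mirror, not ZFS code: the `MirroredStore` class, its method names, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Digest used to detect silent corruption (SHA-256 is an assumption here)."""
    return hashlib.sha256(block).hexdigest()


class MirroredStore:
    """Hypothetical two-way mirror that keeps one checksum per block."""

    def __init__(self):
        self.copies = [{}, {}]   # two independent "disks"
        self.sums = {}           # block id -> expected checksum

    def write(self, block_id, data: bytes) -> None:
        for copy in self.copies:
            copy[block_id] = data
        self.sums[block_id] = checksum(data)

    def read(self, block_id) -> bytes:
        expected = self.sums[block_id]
        # Find the first copy whose contents still match the checksum.
        good = next(
            (c[block_id] for c in self.copies if checksum(c[block_id]) == expected),
            None,
        )
        if good is None:
            # No redundancy left: the error can only be reported, not fixed.
            raise IOError(f"block {block_id}: every copy fails its checksum")
        # Self-heal: overwrite any copy that differs from the verified one.
        for copy in self.copies:
            if copy[block_id] != good:
                copy[block_id] = good
        return good


if __name__ == "__main__":
    store = MirroredStore()
    store.write(0, b"important data")
    store.copies[0][0] = b"important dat\x00"   # simulate a bad sector
    print(store.read(0))                        # served from the healthy copy
```

With a single copy and no mirror, the same check would still detect the corruption but could only raise the error, which matches the "it will just tell you" case in the comment.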

  2. Speed? by grasshoppa · · Score: 3, Interesting

    My question is one of speed; this seems a promising addition to anyone's backup routine. However, most folks I know have 100s of gigs of data to back up. While differentials could be involved, right now tar'ing to tape works fast enough that the backup is done before the first staff shows up for work.

    I assume we're beating the hell out of the processor here, so I'm wondering: how painful is this in terms of speed?

    --
    Mod me down with all of your hatred and your journey towards the dark side will be complete!
    1. Re:Speed? by xquark · · Score: 4, Interesting

      The speed of encoding and decoding directly relates to the type of RS and the amount of FEC required. Generally speaking, erasure-style RS can go as low as O(n log n) (essentially inverting and solving for a Vandermonde- or Cauchy-style matrix). A more general code that can correct errors (the difference between an error and an erasure is that for an erasure you know the location of the error but not its magnitude) may require a more complex process, something like syndrome computation followed by Berlekamp-Massey and Forney, which is about O(n^2).
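To make the erasure case above concrete, here is a minimal sketch of Reed-Solomon-style erasure coding in the polynomial-evaluation view. It is a toy: it works over the prime field GF(257) rather than the GF(2^8) used by practical RS(255,223) codecs, it uses plain O(k^2) Lagrange interpolation rather than a fast algorithm, and the function names are made up for this example.

```python
P = 257  # prime modulus; assumption: a toy field, real codecs use GF(2^8)


def interpolate(points, x):
    """Evaluate at x the unique polynomial (mod P) through the given (xi, yi) points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Systematic code: k data symbols at x = 0..k-1, parity symbols at x = k..n-1."""
    k = len(data)
    points = list(enumerate(data))
    return list(data) + [interpolate(points, x) for x in range(k, n)]


def recover(codeword, k):
    """Rebuild the k data symbols; erased positions are marked with None."""
    known = [(x, y) for x, y in enumerate(codeword) if y is not None]
    if len(known) < k:
        raise ValueError("too many erasures to recover")
    return [interpolate(known[:k], x) for x in range(k)]


if __name__ == "__main__":
    data = [72, 101, 108, 108, 111]
    codeword = encode(data, 9)          # 5 data + 4 parity symbols
    for lost in (0, 2, 5, 7):           # erase any 4 known positions
        codeword[lost] = None
    print(recover(codeword, 5) == data)
```

Because the decoder is told which positions are missing, any k surviving symbols suffice. Correcting errors at unknown positions is the harder problem that the syndrome-based pipeline in the comment addresses, and this sketch does not attempt it.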

      It is possible to buy specialised h/w (or even use GPUs) to perform the encoding steps (getting roughly 100+ MB/s), and most software encoders can do about 50-60+ Mb/s for RS(255,223). YMMV.

      --
      Arash Partow's Philosophy: Be a person who knows what they don't know, and not a person who doesn't know.
  3. Re:Drives already do this by Marillion · · Score: 4, Interesting

    My biggest failed prediction in the world of computers was the CD-ROM.

    I was an audio CD early adopter and I knew from articles I read that audio CDs often had a certain defect rate. The defect rate was usually such that you would never hear it. One artist even published all the defects in the liner notes.

    Based upon this, I presumed that you would never get the defect rate to zero and that no one would trust a data medium with anything less than perfection - and thus predicted the CD-ROM would never catch on.

    They don't have to get the rate to zero. Just close enough to zero for the RS to function.

    --
    This is a boring sig
  4. Re:Drives already do this by xquark · · Score: 3, Interesting

    My understanding is that it is possible to drill a few holes no larger than 2mm in diameter, equally spread over the surface of an "audio cd", and with the help of h/w RS erasure decoding, channel interleaving and channel prediction (e.g., probabilistically reconstructing a missing right channel from the known left channel) one can produce a near perfect reconstruction. That's what usually happens to overcome scratches and other kinds of simple surface defects.
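The role interleaving plays here can be shown with a small sketch: a physical burst of damage is spread thinly over many codewords, so each one stays within what its erasure decoder can repair. This is a plain block interleaver written for illustration, not the cross-interleaved scheme (CIRC) that audio CDs actually use, and all function names are hypothetical.

```python
def interleave(symbols, depth):
    """Treat `symbols` as `depth` codewords (rows) and emit them column by column."""
    width, rem = divmod(len(symbols), depth)
    if rem:
        raise ValueError("length must be a multiple of depth")
    rows = [symbols[r * width:(r + 1) * width] for r in range(depth)]
    return [rows[r][c] for c in range(width) for r in range(depth)]


def deinterleave(stream, depth):
    """Inverse of interleave: put each symbol back into its original row."""
    width = len(stream) // depth
    rows = [[None] * width for _ in range(depth)]
    for i, symbol in enumerate(stream):
        c, r = divmod(i, depth)
        rows[r][c] = symbol
    return [s for row in rows for s in row]


def burst_damage(n_codewords, codeword_len, start, length):
    """Count erased symbols per codeword after a burst hits the interleaved stream."""
    symbols = list(range(n_codewords * codeword_len))
    stream = interleave(symbols, n_codewords)
    for i in range(start, start + length):
        stream[i] = None                      # the "drilled hole"
    restored = deinterleave(stream, n_codewords)
    return [
        restored[r * codeword_len:(r + 1) * codeword_len].count(None)
        for r in range(n_codewords)
    ]


if __name__ == "__main__":
    # A 4-symbol burst across 4 interleaved codewords costs each one symbol.
    print(burst_damage(4, 6, 8, 4))   # [1, 1, 1, 1]
```

Without interleaving, the same 4-symbol burst would land entirely inside one codeword, which is far more likely to exceed its correction capacity.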

    --
    Arash Partow's Philosophy: Be a person who knows what they don't know, and not a person who doesn't know.
  5. Re:Drives already do this by femto · · Score: 4, Interesting

    Another view is that everything is a code in a noisy environment, so there is no way to talk about "the underlying device" as it itself is just another type of coding. Magnetic recording can be viewed as a way of encoding information onto the underlying (thermal) noisy matter. There is some very deep stuff happening in information theory. Let's take the empty universe as a noisy channel. Now every structure in the universe (including you and me) becomes information encoded over the empty universe. One gets the feeling that any "ultimate theory" won't be expressed in terms of forces and fields but some underlying, unifying, concept of information.

  6. Re:You are all dumb as there is only one way. by billcopc · · Score: 3, Interesting

    "Reed-Solomon is ancient compared to par2."

    No, you're dumb. Par2 IS Reed-Solomon. Silly me to expect an AC to fact-check the most trivial subjects of a post.

    The procedure explained in TFA is basically adapting a different tool to behave more or less like single-file par2. That makes it redundant (in the /. sense, not the data-recovery sense).

    There is one thing I would love to see, and that's local disk checksumming. That's right, take a 500GB disk, chop it into slices and do RAID-5 on them as if they were individual spindles. It's been years since I've had a hard drive actually die on me, but I've seen bit-errors more often than I'd like. Having self-checking built into the filesystem (or low-level disk access) would help ensure 100% data integrity, and you could still do RAID-1 on top of it for safety.
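The slice idea above boils down to XOR parity, which a short sketch can illustrate. This is a simplification: it keeps a single fixed parity slice (real RAID-5 rotates parity across members), it assumes equally sized slices, and it assumes something else, such as a per-slice checksum, has already identified which slice is bad. The function names are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def add_parity(slices):
    """Return the data slices followed by one parity slice."""
    return list(slices) + [xor_blocks(slices)]


def rebuild(stripe, lost_index):
    """Reconstruct the slice at lost_index from all the other slices."""
    survivors = [s for i, s in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    slices = [b"abcd", b"efgh", b"ijkl"]     # three slices of one disk
    stripe = add_parity(slices)
    print(rebuild(stripe, 1) == slices[1])   # slice 1 recovered from the rest
```

Since every slice lives on the same physical disk, this guards against bad sectors but not against the whole drive dying, which is why the comment still suggests RAID-1 on top.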

    --
    -Billco, Fnarg.com
  7. Re:Drives already do this by Wowsers · · Score: 4, Interesting

    I loved my DAT (for audio) portable recorder. It employed double Reed-Solomon error correction; you would have to do some serious hammering to the side of the recorder to get the tape to "skip" in a way the error correction could not fix, so that you'd hear it drop out. Running and recording was NOT out of the question, though.

    Now what do the consumers have for recorders - cr*ppy, cheap, nasty, low bitrate, overcompressed MP3 recorders. The recording industry killed off an excellent (but expensive) format to palm off rubbish compressed audio to the masses. (Proper PCM recorders are no different in price to the DAT decks).

    --
    Take Nobody's Word For It.