Slashdot Mirror


Recoverable File Archiving with Free Software?

Viqsi asks: "Back in my Win32 days, I was a very frequent user of RAR archives. I've had them get hit by partial hardware failures and still be recoverable, so I've always liked them, but they're completely non-Free, and the mini-RMS in my brain tells me this could be a problem for long-term archival. The closest free equivalent I can find is .tar.bz2, and while bzip2 has some recovery ability, tar is (as far as I have ever been able to tell) incapable of recovering anything past the damaged point, which is unacceptable for my purposes. I've recently had to pick up a copy of RAR for Linux to dig into one of those old archives, so this question's come back up for me again, and I still haven't found anything. Does anyone know of a file archive type that can recover from this kind of damage?"

12 of 80 comments

  1. Re:where have you been? by jason.stover · · Score: 5, Informative

    Here's the parchive sourceforge site .. Links to PAR2 utils, spec, etc...

  2. Try apio by innosent · · Score: 4, Informative

    There used to be a cpio-like archiver called apio that was designed for exactly these situations. That might not be much help on non-Unix systems (unless you plan on running it under Cygwin), but I remember having great success with it on the old QIC tapes, which were in my experience the worst backup medium ever for important data (better to have no backup than to think you have a good one but have a dead tape).

    --
    --That's the point of being root, you can do anything you want, even if it's stupid.
    1. Re:Try apio by innosent · · Score: 4, Informative

      Sorry, I believe it was afio

      --
      --That's the point of being root, you can do anything you want, even if it's stupid.
  3. Par2 works great by dozer · · Score: 5, Informative

    Store the recovery information outside the archive. Par2 works really well. You can configure how much redundancy you want (2% should be fine for occasional bit errors, 30% if you burn it to a CD that might get mangled, etc.). It's a work in progress, but it's already really useful.
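To put those percentages in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the PAR2 model in which the data is split into equal-sized source blocks and each recovery block can stand in for one damaged or missing source block. The function name, the 1 MiB block size, and the round-up rule are illustrative assumptions; the real par2 tools choose block sizes and rounding their own way.

```python
def recovery_blocks(total_bytes, block_bytes, redundancy_pct):
    """Estimate (source_blocks, recovery_blocks) for a given redundancy percentage.

    Hypothetical helper for illustration only, not par2's actual rounding.
    """
    # Number of source blocks, rounding up so a partial last block counts.
    source_blocks = -(-total_bytes // block_bytes)
    # Recovery blocks as a percentage of source blocks, rounded up.
    recovery = (source_blocks * redundancy_pct + 99) // 100
    return source_blocks, recovery


# A 700 MiB CD image split into hypothetical 1 MiB blocks.
cd_image = 700 * 2**20
print(recovery_blocks(cd_image, 2**20, 2))   # (700, 14)
print(recovery_blocks(cd_image, 2**20, 30))  # (700, 210)
```

Under these assumptions, 2% redundancy tolerates damage touching about 14 blocks of the image, while 30% tolerates about 210, which is why the higher figure suits media that may lose whole regions.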

    1. Re:Par2 works great by Stubtify · · Score: 4, Informative
      Allow me to second this. Par2 is everything the first PAR files were and more. No matter what has gone wrong, I've always been able to recover with a 10% parity set (even that seems like overkill, except on USENET). Interestingly enough, PAR files have revolutionized USENET; I can't remember the last time I needed a fill.

      good overview here: PAR2 files

      comparison between v1 and 2: here

  4. cpio by Kevin+Burtch · · Score: 5, Informative


    True, tar cannot handle a single error... all files past that error are lost.

    On the other hand, cpio (and clones) can handle missing/damaged data without losing the undamaged portions that follow (you only lose the archived file that contains the damage). It is the only common/free format I can think of (from the top of my head) that is capable of this.

    --
    - Preferences: Solaris 10 (servers), Ubuntu (desktops), Solaris 11 (personal servers) -
  5. Re:Are you sure tar is unacceptable? by wiswaud · · Score: 5, Informative

    Say you make a big tar, bzip2 it, and store the file on a CD.
    Two years later you want the data back.
    There's a read error somewhere within the .tar.bz2, and it gives you some garbage data.
    bzip2 (through its bzip2recover tool) can actually recover all the other 900kB blocks of the original tar file, except for the damaged block or part of it.
    tar will just choke at that point, and you lose everything past the read error. bzip2 was able to recover the data past the error, but tar can't use it.
    It's quite frustrating.
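The block independence described above can be illustrated with a small Python sketch. This is not how bzip2recover works internally (it scans a single .bz2 file for block boundaries). As a stand-in, the sketch compresses each chunk as its own bzip2 stream, which gives the same property: damage to one chunk leaves the others decodable. The chunk size and helper names are assumptions made for the demo.

```python
import bz2

CHUNK = 100_000  # hypothetical chunk size, kept small for the demo


def compress_chunks(data, chunk=CHUNK):
    """Compress each chunk of data as an independent bzip2 stream."""
    return [bz2.compress(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def recover_chunks(blobs):
    """Decompress every chunk that is still valid; damaged ones become None."""
    recovered = []
    for blob in blobs:
        try:
            recovered.append(bz2.decompress(blob))
        except (OSError, ValueError):
            # bz2 reports an invalid or incomplete stream with these exceptions.
            recovered.append(None)
    return recovered


original = b"".join(b"line %d\n" % i for i in range(50000))
blobs = compress_chunks(original)

# Simulate a read error: one compressed chunk comes back as zeros.
blobs[2] = bytes(len(blobs[2]))

recovered = recover_chunks(blobs)
print([chunk is not None for chunk in recovered])
```

Only the zeroed chunk should come back as None. The catch the poster describes is the next step: the recovered pieces are fragments of a tar stream with a hole in it, and a stock tar stops at the hole.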

  6. Re:wow man by Viqsi · · Score: 4, Funny

    Y'know, I would've done that a long time ago, but my health care provider doesn't cover ideologuectomies. They claim that it doesn't threaten your physical life, just your social one. The bastards.

    :D

    --
    viqsi - See "vixen"
    If we do not change our direction we are likely to end up where we are headed.
  7. Yes... by caesar79 · · Score: 4, Funny

    It's an amazing technology... only quite involved.
    Basically you concatenate all the files together (cat should do), print it out on good 32lb paper, get a professor's signature and file it in a college lib...heard those things stick around for centuries

  8. tar/gzip recovery toolkit by wotevah · · Score: 4, Informative
    A quick google search turns up the link shown at the end of this post, from which I quote:

    The gzip Recovery Toolkit

    The gzip Recovery Toolkit has a program - gzrecover - that attempts to skip over bad data in a gzip archive and a patch to GNU tar that enables that program to skip over bad data and extract whatever files might be there. This saved me from exactly the above situation. Hopefully it will help you as well.
    [...]
    Here's an example:

    $ ls *.gz
    my-corrupted-backup.tar.gz
    $ gzrecover my-corrupted-backup.tar.gz
    $ ls *.recovered
    my-corrupted-backup.tar.recovered
    $ tar --recover -xvf my-corrupted-backup.tar.recovered > /tmp/tar.log 2>&1 &
    $ tail -f /tmp/tar.log

    http://www.urbanophile.com/arenn/hacking/gzrt/gzrt.html
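The same skip-over-bad-data idea can be tried without patching GNU tar. The sketch below is not gzrt. It uses the `ignore_zeros` option of Python's standard `tarfile` module, which is documented as skipping empty and invalid blocks when reading damaged archives. The file names, contents, and helper functions are made up for the demo, and the damage is simulated by zeroing one 512-byte header block.

```python
import io
import tarfile

BLOCK = 512  # tar lays out headers and data in 512-byte blocks


def build_archive(files):
    """Return an uncompressed tar archive (as bytes) for a name -> bytes mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_members(archive, tolerant):
    """List member names, optionally skipping over invalid blocks."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:",
                      ignore_zeros=tolerant) as tar:
        return tar.getnames()


archive = bytearray(build_archive({
    "a.txt": b"alpha",
    "b.txt": b"bravo",
    "c.txt": b"charlie",
}))

# Each small file takes one header block plus one data block,
# so the header of b.txt is block 2. Wipe it to simulate a read error.
archive[2 * BLOCK:3 * BLOCK] = bytes(BLOCK)

strict_names = list_members(bytes(archive), tolerant=False)
tolerant_names = list_members(bytes(archive), tolerant=True)
print("strict:  ", strict_names)
print("tolerant:", tolerant_names)
```

In the strict read, everything after the damaged header is lost. The tolerant read steps forward block by block until it finds the next valid header, so only the damaged member goes missing.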
  9. RAR Archives by vasqzr · · Score: 4, Funny


    Back in my Win32 days, I was a very frequent user of RAR archives.

    Babelfish translation: I was a huge warez kiddie.

    On a related note, were there any widespread, legitimate uses of .RAR? I only remember .ARJ and .ZIP

  10. Re:Yeah by sasami · · Score: 4, Insightful

    Par archives is just a scam popularized by cluless usnet abusers. Think about it, if those files really could reconstruct a corrupt rar archive, why not post only the smaller par files ... Get youself double copies and you'll be far better off

    Ignore this post. It's either a troll or an idiot.

    PAR files substitute for missing pieces. They don't regenerate the whole file by themselves. Go look up how RAID 5 parity works. They're not called PAR files for nothing.

    Just because you don't understand how something works has no bearing on the fact that it does work. Except in certain performance-sensitive cases, doubling up is the least intelligent way of adding redundancy.
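For anyone who wants to see the RAID 5 analogy in action, here is a minimal Python sketch of single-block XOR parity. It illustrates only the analogy: PAR2 itself uses Reed-Solomon codes, which generalize this to many recovery blocks. The block contents and function names are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(blocks, parity):
    """Rebuild the one block marked None from the survivors plus the parity block."""
    survivors = [block for block in blocks if block is not None]
    return xor_blocks(survivors + [parity])


data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)          # one extra block, the size of a single data block

damaged = [data[0], None, data[2]]  # the middle block was lost
print(rebuild_missing(damaged, parity))  # b'BBBB'
```

The parity block is the size of one data block, not of the whole set. It can fill in any single missing block, but it cannot regenerate the data on its own, which is exactly why posting only the PAR files would be useless.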

    ---
    Dum de dum.

    --
    Freedom is not the license to do what we like, it is the power to do what we ought.