Slashdot Mirror


A Look at Data Compression

With the new year fast approaching, many of us face the unenviable task of backing up last year's data to make room for more of the same. With that in mind, rojakpot has taken a look at some of the data compression programs available and has a few insights that may help when looking for the best fit. From the article: "The best compressor of the aggregated fileset was, unsurprisingly, WinRK. It saved over 54MB more than its nearest competitor - Squeez. But both Squeez and SBC Archiver did very well, compared to the other compressors. The worst compressors were gzip and WinZip. Both compressors failed to save even 200MB of space in the aggregated results."

2 of 252 comments

  1. Re:Why compress in the first place? by Master of Transhuman · Score: 0, Troll


    Compressing files intended for BACKUP, as opposed to DOWNLOAD, DOES increase the chance of losing the entire file. That was the poster's point and it is entirely correct.
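
To make that claim concrete, here is a rough sketch that damages one byte in a plain copy and one byte in a compressed copy of the same data, then checks how much survives. It assumes zlib as a stand-in for whatever compressor a backup tool uses; the sample data and the helper names (`make_sample`, `damage`, `restore_compressed`, `intact_prefix`) are made up for illustration, and other formats may behave differently.

```python
import zlib

def make_sample(n_records=2000):
    """Build a deterministic, text-like payload (hypothetical stand-in for a backup file)."""
    return "".join(f"record {i}: value {i * i}\n" for i in range(n_records)).encode("ascii")

def damage(blob, position):
    """Simulate a tiny media fault by inverting every bit of one byte."""
    corrupted = bytearray(blob)
    corrupted[position] ^= 0xFF
    return bytes(corrupted)

def intact_prefix(original, recovered):
    """Count the leading bytes of `recovered` that still match `original`."""
    count = 0
    for a, b in zip(original, recovered):
        if a != b:
            break
        count += 1
    return count

def restore_compressed(blob, chunk_size=64):
    """Decompress in small chunks, keeping whatever came out before zlib gave up."""
    decomp = zlib.decompressobj()
    out = bytearray()
    try:
        for start in range(0, len(blob), chunk_size):
            out += decomp.decompress(blob[start:start + chunk_size])
        out += decomp.flush()
    except zlib.error:
        pass  # the stream is unusable from the point of damage onward
    return bytes(out)

original = make_sample()
packed = zlib.compress(original)

# One damaged byte in the middle of each copy.
plain_damaged = damage(original, len(original) // 2)
packed_damaged = damage(packed, len(packed) // 2)

plain_wrong = sum(a != b for a, b in zip(original, plain_damaged))
packed_good = intact_prefix(original, restore_compressed(packed_damaged))

print("plain copy: bytes lost =", plain_wrong, "of", len(original))
print("compressed copy: bytes recovered intact =", packed_good, "of", len(original))
```

The plain copy loses only the byte that was hit. With a simple stream decoder like this one, the compressed copy generally yields nothing trustworthy past the damaged point, which is the risk the parent post was describing.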

    NEVER use compression on a backup unless you have PAR files you can use to recover the lost data if a bad sector on a CD or DVD, or a bad block on a tape, is discovered during restoration.

    The Disk Archive (DAR) program is one of the few backup programs that can generate PAR files during the backup.
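
The idea behind recovery data can be sketched with plain XOR parity. This is NOT how PAR files work internally (PAR and PAR2 use Reed-Solomon codes and can repair several damaged blocks); it is a deliberately simpler substitute that can rebuild exactly one lost block. The block size and function names are assumptions made up for the example.

```python
from functools import reduce

def split_blocks(data, block_size):
    """Cut data into equal-sized blocks, zero-padding the last one."""
    padded = data + b"\x00" * (-len(data) % block_size)
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]

def xor_blocks(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(blocks):
    """The parity block is the XOR of every data block."""
    return reduce(xor_blocks, blocks)

def recover_block(blocks, parity, lost_index):
    """Rebuild one missing block by XOR-ing the parity with all surviving blocks."""
    survivors = [b for i, b in enumerate(blocks) if i != lost_index]
    return reduce(xor_blocks, survivors, parity)

# Illustrative run: pretend block 2 sat on a bad sector.
data = b"backup payload that spans several blocks"
blocks = split_blocks(data, 8)
parity = make_parity(blocks)
lost = 2
rebuilt = recover_block(blocks, parity, lost)
print("recovered block matches:", rebuilt == blocks[lost])
```

The cost is one extra block of storage per parity set, which is the space-for-safety trade that any recovery scheme makes.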

    --
    Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
  2. Re:Because it makes a hell of a lot of sense. by Master of Transhuman · Score: 0, Troll

    "While your point is true that it MAY be more difficult to recover from a corrupt file, that's not the right methodology. If your backups are that valuable, you'd make multiple copies - plain and simple."

    Two problems with your response:

    1) If your data is that valuable, compressing makes it more likely to lose it.

    2) If your data is that valuable, making two copies takes twice the time and space - even with compression - and if you use compression and get a bad sector, fifty percent of your backup is now useless. Sure, the odds are good that you can recover from the second backup - but if IT has a bad sector - even in a different place - possibly because your device is going bad - then you've lost the second backup as well.

    If you back up more than once UNCOMPRESSED, you can recover almost anything because it is VERY unlikely that a bad sector will occur in the exact same spot or even in the same file (assuming the one file does not take up most of the specific media).
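
A back-of-the-envelope version of that argument, under assumptions that are mine and not measured: every sector goes bad independently with the same probability, a compressed archive is treated as a total loss if any one of its sectors is bad (the pessimistic premise used in this thread), and the sector count, failure rate and compression ratio below are invented numbers. A dying drive breaks the independence assumption, as noted above.

```python
def p_loss_two_plain_copies(sectors, p_bad):
    """Chance that at least one sector position is bad in BOTH uncompressed copies.

    Only the data in that sector is lost, not the whole backup.
    """
    return 1.0 - (1.0 - p_bad ** 2) ** sectors

def p_loss_two_compressed_copies(sectors, ratio, p_bad):
    """Chance that BOTH compressed archives contain at least one bad sector.

    Assumes (pessimistically) that any bad sector makes an archive unusable.
    `ratio` is compressed size divided by original size.
    """
    packed = max(1, round(sectors * ratio))
    p_archive_dead = 1.0 - (1.0 - p_bad) ** packed
    return p_archive_dead ** 2

# Hypothetical numbers: one million sectors, one-in-a-million bad sectors, 2:1 compression.
plain = p_loss_two_plain_copies(1_000_000, 1e-6)
compressed = p_loss_two_compressed_copies(1_000_000, 0.5, 1e-6)
print("two plain copies, some sector lost in both:", plain)
print("two compressed copies, both archives hit:  ", compressed)
```

Under these assumptions the two-plain-copies case comes out far safer, but the result depends entirely on the "one bad sector kills the archive" premise; a format with per-block recovery would narrow the gap.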

    If your data is valuable, back up twice uncompressed. If your data is only so-so valuable, back up twice compressed. If your data is easily replaced, back up once uncompressed. NEVER back up once compressed - you might as well not back up at all then.

    Alternatively, use PAR files to recover - as long as you're willing to add the extra space and time - which sort of obviates the advantage of compression, doesn't it?

    And if the only valid argument for compression is saving the cost of media, then obviously your data is less valuable than you think it is - in which case why bother backing it up at all (other than legal requirements)? The cost of media simply is not a factor in comparison to the cost of the time required to back it up, the cost of the time to restore if needed, and the value of the data itself. That is being "penny-wise and pound-foolish" - a typical attitude among geeks who are obsessed with efficiency over effectiveness. Save a few gigabytes of space and lose the data - yeah, that's real smart...

    If you want to back up quickly and securely, have two devices backing up simultaneously uncompressed - or two devices backing up simultaneously compressed with PARs. You can't lose - it's that simple.

    --
    Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!