Slashdot Mirror


A Look at Data Compression

With the new year fast approaching, many of us face the unenviable task of backing up last year's data to make room for more of the same. With that in mind, rojakpot has taken a look at some of the data compression programs available and has a few insights that may help when looking for the best fit. From the article: "The best compressor of the aggregated fileset was, unsurprisingly, WinRK. It saved over 54MB more than its nearest competitor - Squeez. But both Squeez and SBC Archiver did very well, compared to the other compressors. The worst compressors were gzip and WinZip. Both compressors failed to save even 200MB of space in the aggregated results."

4 of 252 comments

  1. horrible site interface by the_humeister · · Score: 0, Redundant

    Is it just me or is that site really difficult to navigate amongst all those ads? Speed of compression would have been nice too.

    1. Re:horrible site interface by the_humeister · · Score: 0, Redundant

      Looks like I posted too fast. There's a speed comparison somewhere around there...

  2. common compression utilities benchmarks by qazwsx789 · · Score: 0, Redundant

    I did a small test of the common Linux compression commands back in 2000. Here are the results. (Note that some of the command options have changed since then; for example, tar now uses -j for bzip2.)

    THE COMPRESSION UTILITY TEST

    Compression utilities tested: zip, rar, gzip, bzip2, tgz (tar with the z flag invoked). Each test was run three times. The system was rebooted after each completed test. Hardware used: Pentium II 350 MHz, 256 MB RAM. OS: Linux Mandrake 7.1. The system load was minimal. The "time" command was used to measure the elapsed time, the "ls -l" command was used to determine the size, and a script was used to determine the total size of the gzip files.

    Note: gzip packs individual files recursively. For bzip2, the command invoked was tar -cvIf file.bz2 dir (in GNU tar, the I flag invokes bzip2). For tgz, tar with the z flag invokes gzip.

    TEST 1 - compressing multiple files

    total size of the dir: 91.621.857 bytes, total files: 3540 (most of these files are ascii and html, but there are a few gifs and jpgs too.)

    default compression settings:

    tool   time elapsed  MB/s  compressed to (bytes)  time elapsed uncompressing
    gzip   1m.44s        0.88  24.884.124             37s
    zip    1m.10s        1.3   25.813.958             41s
    rar    3m.25s        0.44  20.784.489             48s
    bzip2  3m.54s        0.39  17.399.561             1m.17s
    tgz    1m.09s        1.32  23.821.446             36s

    maximum compression settings:

    tool   time elapsed  MB/s  compressed to (bytes)  time elapsed uncompressing
    gzip   2m.00s        0.76  24.670.516             36s
    zip    1m.42s        0.89  25.593.448             39s
    rar    10m.12s       0.14  18.698.710             1m.02s
    bzip2  n/a (the compression level cannot be specified through tar; is the maximum the default?)
    tgz    n/a (the compression level cannot be specified through tar; is the maximum the default?)

    CONCLUSION: use tgz (tar with the z flag) if time is an issue; otherwise use bzip2 (tar with the I flag).
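The MB/s column in the Test 1 tables is consistent with dividing the uncompressed input size by the elapsed compression time, taking 1 MB as 10^6 bytes. That definition is inferred from the numbers; the post does not state it. A minimal C sketch of the arithmetic follows, with function names that are illustrative and not from the original post.

```c
#include <assert.h>

/* Throughput in MB/s, assuming 1 MB = 1,000,000 bytes
 * (an inference from the table, not stated in the post). */
double throughput_mb_per_s(long long input_bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)input_bytes / seconds / 1e6;
}

/* Compressed size as a fraction of the original size (smaller is better). */
double compressed_fraction(long long compressed_bytes, long long input_bytes)
{
    assert(input_bytes > 0);
    return (double)compressed_bytes / (double)input_bytes;
}
```

For the gzip row at default settings, throughput_mb_per_s(91621857, 104.0) comes to about 0.88, and compressed_fraction(24884124, 91621857) to about 0.27.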

    TEST 2 - compressing 1 ascii file

    size of the ascii file: 53.819.786 bytes (the file was taken out of my mailbox)

    default compression settings:

    tool   time elapsed  MB/s  compressed to (bytes)  time elapsed uncompressing
    gzip   42s           1.28  15.560.144             15s
    zip    41s           1.31  15.560.261             17s
    rar    1m.57s        0.45  11.507.387             17s
    bzip2  1m.58s        0.45  10.788.502             39s
    tgz    54s           0.99  15.560.907             8s

    maximum compression settings:

    tool   time elapsed  MB/s  compressed to (bytes)  time elapsed uncompressing
    gzip   44s           1.22  15.486.842             15s
    zip    45s           1.19  15.486.959             16s
    rar    6m.40s        0.08  09.582.810

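  3. Multiply Packing by gagge · · Score: 0, Redundant

    // Packs five values, each in the range 0..2, into one byte (3^5 = 243 fits in 0..255).
    void SendVariables(int *var)
    {
        unsigned char c = 0;
        for (int x = 0; x < 5; x++)
            c = c*3 + var[x];
        SendChar(c);
    }

    // Reverses the packing; the last value packed is the first one recovered.
    void RetrieveVariables(int *var)
    {
        // ReceiveChar() is an assumed counterpart to SendChar();
        // the original post does not show where c comes from.
        unsigned char c = ReceiveChar();
        for (int x = 4; x >= 0; x--)
        {
            var[x] = c%3;
            c = c/3;
        }
    }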
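The packing above works because five values that each take one of three states have 3^5 = 243 combinations, which fits in a single byte. Below is a self-contained sketch of the same idea without the post's SendChar call. The names pack_trits and unpack_trits are hypothetical, and the assert reflects the assumption that every input is 0, 1, or 2.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_VARS 5  /* five values per byte, as in the post */
#define BASE     3  /* each value is assumed to be 0, 1, or 2 */

/* Pack NUM_VARS base-3 digits into one byte; the largest result is 242. */
uint8_t pack_trits(const int *var)
{
    unsigned c = 0;
    for (int x = 0; x < NUM_VARS; x++) {
        assert(var[x] >= 0 && var[x] < BASE);
        c = c * BASE + (unsigned)var[x];
    }
    return (uint8_t)c;
}

/* Unpack in reverse order: the last digit packed is the first remainder. */
void unpack_trits(uint8_t c, int *var)
{
    for (int x = NUM_VARS - 1; x >= 0; x--) {
        var[x] = c % BASE;
        c /= BASE;
    }
}
```

Packing {2, 0, 1, 2, 1} yields the byte value 178, and unpacking 178 restores the same five values. Storing each value in 2 bits would need 10 bits for the same data, so the base-3 encoding saves 2 bits per group of five.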