Slashdot Mirror


Best Format for Archive Distribution?

Meostro asks: "I'm looking for the best format to use to distribute arbitrary datasets. Tarballs compressed with gzip seem to be the most common thing out there, with zip coming in a close second. What advanced compression packages are the most widely recognized or available on the widest array of systems? Cross-platform compatibility is my most important goal, followed by compression ratio, decompression time, compression time and extra features (solid archives, support for multiple files, etc.). I'm starting up a free data site to provide test data for anything you can imagine: images for compression and format interpretation, text and audio for language processing, programming language examples to test parsing, and more. I hope this will grow to be a significant (read: multi-gigabyte) archive, so I want to start off right with my distribution format. Right now the plan is data.tar.bz2, but I'm open to anything that will give me better compression as long as it's available for Linux, Windows and Mac."
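
For reference, the data.tar.bz2 plan from the question can be sketched with Python's standard `tarfile` module, which behaves the same on Linux, Windows and Mac. This is only a minimal sketch: the helper names `pack` and `unpack` are made up for illustration and are not part of the original post.

```python
import tarfile
from pathlib import Path


def pack(src_dir, archive_path):
    """Write the directory src_dir into a bzip2-compressed tarball."""
    src = Path(src_dir)
    # "w:bz2" asks tarfile to compress the stream with bzip2 while writing.
    with tarfile.open(archive_path, "w:bz2") as tar:
        # arcname keeps only the directory name, not the full local path.
        tar.add(str(src), arcname=src.name)


def unpack(archive_path, dest_dir):
    """Extract a bzip2-compressed tarball into dest_dir."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        # Only extract archives you trust; member paths are not checked here.
        tar.extractall(dest_dir)
```

Calling `pack("data", "data.tar.bz2")` followed by `unpack("data.tar.bz2", "out")` should recreate the tree under `out/data`.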

2 of 109 comments

  1. CPIO by DarkDust · Score: 3, Interesting

    I prefer .cpio.bz2 because, unlike tar, cpio can handle special devices just fine (or am I missing some switch for tar that lets it handle devices and links?). Since it's also in the POSIX standard, it should be pretty portable as well.

  2. Re:RAR's license is garbage by gowen · Score: 3, Interesting
    md5 probably provides enough integrity checking for test data & split/cat make splitting/reassembling ANYTHING easy.
    But RAR/PAR encoding uses a Reed-Solomon error-correction scheme, so you can send a few additional files, and the data blocks within them can be used to replace *any* lost data blocks from the original set. It's really crafty.
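
To illustrate the idea of sending extra blocks that can stand in for lost ones, here is a rough sketch using plain XOR parity. This is deliberately NOT Reed-Solomon: it is a much simpler scheme that can rebuild only one missing block per set, and it assumes all blocks have the same length. The function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(blocks):
    """Build one parity block from the data blocks (assumed equal length)."""
    return xor_blocks(blocks)


def recover(blocks, parity):
    """Rebuild the single missing block, which is marked as None in blocks."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    present = [block for block in blocks if block is not None]
    repaired = list(blocks)
    # XOR of the parity with every surviving block yields the lost block.
    repaired[missing[0]] = xor_blocks(present + [parity])
    return repaired
```

Reed-Solomon generalizes this: with several recovery blocks it can replace several lost blocks, which is what makes the PAR approach described above attractive.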

    You're right about the license, though.
    --
    Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
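
The md5 plus split/cat approach quoted in the comment above can be sketched in Python's standard library as follows. This is a minimal in-memory sketch under stated assumptions: the function names are invented for illustration, and a real multi-gigabyte archive would be read and written in streamed pieces rather than held in memory.

```python
import hashlib


def md5_hex(data):
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


def split_bytes(data, chunk_size):
    """Cut data into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def join_and_verify(chunks, expected_md5):
    """Concatenate chunks (like cat) and check the result against the digest."""
    data = b"".join(chunks)
    if md5_hex(data) != expected_md5:
        raise ValueError("checksum mismatch: data is corrupt or incomplete")
    return data
```

The checksum only detects damage; unlike the recovery blocks discussed in the comment, it cannot repair anything.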