Best Format for Archive Distribution?
Meostro asks: "I'm looking for the best format to use to distribute arbitrary datasets. Tarballs compressed with gzip seem to be the most common thing out there, with zip coming in a close second. What advanced compression packages are the most widely recognized or available on the widest array of systems? Cross-platform compatibility is my most important goal, followed by compression ratio, decompression time, compression time and extra features (solid archives, support for multiple files, etc.). I'm starting up a free data site to provide test data for anything you can imagine: images for compression and format interpretation, text and audio for language processing, programming language examples to test parsing, and more. I hope this will grow to be a significant (read: multi-gigabyte) archive, so I want to start off right with my distribution format. Right now the plan is data.tar.bz2, but I'm open to anything that will give me better compression as long as it's available for Linux, Windows and Mac."
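For what it's worth, the data.tar.bz2 plan can be written and read back with nothing but the Python standard library, which runs on Linux, Windows and Mac. This is only a rough sketch of that one format, not a recommendation over the alternatives; the helper names (pack, list_members, read_member) and the "data" top-level folder inside the archive are made up for illustration.

```python
import tarfile


def pack(src_dir, archive_path):
    """Write src_dir into a bzip2-compressed tarball (hypothetical helper)."""
    # "w:bz2" selects bzip2 compression; arcname sets the top-level
    # folder name stored inside the archive (assumed to be "data" here).
    with tarfile.open(archive_path, "w:bz2") as tar:
        tar.add(src_dir, arcname="data")


def list_members(archive_path):
    """Return the sorted member names stored in the archive."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        return sorted(tar.getnames())


def read_member(archive_path, member_name):
    """Return the raw bytes of one regular file without unpacking to disk."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        handle = tar.extractfile(member_name)
        if handle is None:
            raise ValueError(member_name + " is not a regular file")
        with handle:
            return handle.read()
```

Something like pack("my_dataset", "data.tar.bz2") would build the archive, and list_members("data.tar.bz2") shows what a downloader would get before extracting anything.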
One problem with this is that it is not a common format. For limited use (one-time distribution, short-term backup), that is OK. But what about long-term archives?
If you want to decompress this stuff in 10 or 20 years, will you be able to find software that can handle it? Especially if the new Cell processors somehow become popular, will Windows BOHICA 2025 edition be able to run 20-year-old binaries in order to read this thing?
If the source is available, the job is easier on Linux, but if the format is not actively maintained, it may take a lot of work to modify the program to run on whatever Linux looks like in 20 years.
"-1 Troll" is apparently the same as "-1 I disagree with you."