A Look at Data Compression
With the new year fast approaching, many of us face the unenviable task of backing up last year's data to make room for more of the same. To that end, rojakpot has taken a look at some of the data compression programs available and has a few insights that may help when looking for the best fit. From the article: "The best compressor of the aggregated fileset was, unsurprisingly, WinRK. It saved over 54MB more than its nearest competitor - Squeez. But both Squeez and SBC Archiver did very well, compared to the other compressors. The worst compressors were gzip and WinZip. Both compressors failed to save even 200MB of space in the aggregated results."
WinRK may have won only because the author used the fast compression setting on all the compressors he tested. Results for the default and best compression settings are TBA.
A key benefit of the PKZIP and tarball formats is that they will be accessible for decades or hundreds of years. These formats are open (non-proprietary), widely implemented, and supported by free (as in freedom) software.
The same can't be said for WinRK. Therefore, if you want access to your data for a long period of time, you should carefully consider whether the format will still be accessible.
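To illustrate the "widely implemented" point: a minimal sketch, assuming only a stock Python install, that writes and reads back both a gzip'd tarball and a ZIP using nothing but the standard library. The helper names (make_archives, read_tar, read_zip) are made up for this example.

```python
import io
import tarfile
import zipfile


def make_archives(files):
    """Pack {name: bytes} into an in-memory .tar.gz and .zip; return both as bytes."""
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))

    zip_buf = io.BytesIO()
    with zipfile.ZipFile(zip_buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)

    return tar_buf.getvalue(), zip_buf.getvalue()


def read_tar(blob):
    """Return {name: bytes} for every regular file in a .tar.gz blob."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return {m.name: tar.extractfile(m).read()
                for m in tar.getmembers() if m.isfile()}


def read_zip(blob):
    """Return {name: bytes} for every entry in a .zip blob."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}
```

No third-party package or licensed tool is involved at any step, which is the whole argument for these formats as a long-term archive.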
I did a short review and benchmark of Unix compressors that people might be interested in.
Mouse powered Chips, Open source Processors and Lego
Why mess around with compressing individual files? DiskDoubler is definitely the way to go. Hell, I even have it set up to automagically compress files I haven't used in a week.
It's running perfectly fine on my Mac IIci.
Know what I like about atheists? I've yet to meet one that believes God is on their side.
"I just don't understand the desire for compression in the first place."
Sometimes, people have to download things.
I rarely criticize things I don't care about.
If you look at the methodology, all the results were obtained with the software set to the fastest mode, not the best compression mode.
So, I would consider gzip the best performer by this criterion. After all, if I cared most about space savings I'd have picked the best mode, not the fast mode. All this article suggests is that a few archivers are REALLY lousy at doing FAST compression.
If my requirements were realtime compression (maybe for streaming multimedia) then I wouldn't bother with some mega-compression algorithm that takes 2 minutes per MB to pack the data.
Might I suggest a better test? If you're interested in best compression, then run each program in a mode which optimizes purely for compression ratio. On the other hand, if you're interested in realtime compression, then take each algorithm and tweak the parameters so that they all run in the same time (a relatively fast time), and then compare compression ratios.
With the huge compression of multimedia files I'd also want the reviewers to state explicitly that the compression was verified to be lossless. I've never heard of some of these proprietary apps, but if they're getting significant ratios out of .wav and .mp3 files I'd want to do a binary compare of the restored files to ensure they weren't just run through a lossy codec...
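The lossless check asked for here is easy to sketch. This is only an illustration using the compressors that ship with Python (zlib, bz2, lzma), not the proprietary archivers from the article, which would have to be driven through their own tools. The function names are hypothetical.

```python
import bz2
import hashlib
import lzma
import zlib

# Stand-ins for the archivers under test: (compress, decompress) pairs
# from the Python standard library.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def sha256(data):
    """Hex digest used as a compact stand-in for a byte-for-byte compare."""
    return hashlib.sha256(data).hexdigest()


def verify_lossless(original):
    """Return {codec: bool}, True when the restored bytes hash the same as the input."""
    want = sha256(original)
    results = {}
    for name, (compress, decompress) in CODECS.items():
        restored = decompress(compress(original))
        results[name] = sha256(restored) == want
    return results
```

A compressor that quietly re-encoded audio through a lossy codec would come back False here, no matter how good its ratio looked.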
3 hours 47 minutes with WinRK versus gzipping in 3 minutes 16 seconds. Is it really worth watching the progress bar for a file that's 200 megs smaller?
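Putting a number on that trade-off, taking the two times and the rough 200 MB figure from this comment at face value (the function names are just for illustration):

```python
def seconds(hours=0, minutes=0, secs=0):
    """Convert an h/m/s duration to seconds."""
    return hours * 3600 + minutes * 60 + secs


def extra_seconds_per_mb(slow_s, fast_s, mb_saved):
    """Extra wall-clock time paid for each additional megabyte saved."""
    return (slow_s - fast_s) / mb_saved


winrk = seconds(hours=3, minutes=47)      # 13620 s
gzip_time = seconds(minutes=3, secs=16)   # 196 s
cost = extra_seconds_per_mb(winrk, gzip_time, 200)
print(f"{cost:.2f} extra seconds per MB saved")  # 67.12
```

So each extra megabyte saved costs a bit over a minute of waiting, if those figures hold.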
They do it to sell more ad impressions. Each time you go to the next page you load a new ad.
I can't believe TFA made /. The only thing more defective than the benchmark data set (Hint: who cares how much a generic compressor can save on JPEGs?) is the absolutely hilarious part where the author just took "fastest" for each compressor and then tried to compare the compression. Indeed, StuffIt did what I consider the only sensible thing for "fastest" in an archiver, which is to just not even try to compress content that is unlikely to get significant savings. Oddly, the list for fastest compression is almost exactly the reverse of the list for best compression on every test. The "efficiency" is a metric that illuminates nothing. An ROC plot of rate vs compression for each test would have been a good idea; better would be to build ROC curves for each compressor, but I don't see that happening anytime soon.
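For what it's worth, the per-compressor data behind such a time-vs-compression curve is cheap to collect. A rough sketch, assuming the standard-library codecs as stand-ins for the archivers in the article and three arbitrary levels (1, 6, 9); timings from a single run are noisy and the sample input is synthetic:

```python
import bz2
import lzma
import time
import zlib


def measure(data):
    """Return rows of (codec, level, seconds, compressed_size / original_size)."""
    candidates = []
    for level in (1, 6, 9):
        # Bind the level as a default argument so each lambda keeps its own value.
        candidates.append(("zlib", level, lambda d, l=level: zlib.compress(d, l)))
        candidates.append(("bz2", level, lambda d, l=level: bz2.compress(d, l)))
        candidates.append(("lzma", level, lambda d, l=level: lzma.compress(d, preset=l)))

    rows = []
    for name, level, fn in candidates:
        start = time.perf_counter()
        packed = fn(data)
        elapsed = time.perf_counter() - start
        rows.append((name, level, elapsed, len(packed) / len(data)))
    return rows


if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog\n" * 20000
    for name, level, elapsed, ratio in measure(sample):
        print(f"{name:5s} level={level} time={elapsed:.4f}s ratio={ratio:.4f}")
```

Plotting time against ratio for each codec gives one curve per compressor, instead of a single "fastest" point each.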
I wouldn't try to draw any conclusions from this "study". Given the methodology, I wouldn't wait with bated breath for parts two and three of the study, where the author actually promises to try to set up the compressors for reasonable compression, either.
Ouch.
Since the original site seems to be really slow and split into a billion pages, those who aren't aware of it might want to look at MaximumCompression since it has tests for several file formats and also has a multiple file compression test that is sorted by efficiency. A program called SBC does the best, but the much more common WinRAR comes in a respectable third.
http://www.popularculturegaming.com -- my blog about the culture of videogame players
It's interesting to note that Stuffit produces worthwhile compression of JPG images, something long thought to be impossible.
I'd heard the makers of Stuffit were claiming this, but I was sceptical; it's good to see independent confirmation.
Quidquid Latine dictum sit, altum videtur (anything said in Latin sounds important)
if you download a file over gprs and each megabyte costs you 3$, then saving 200 megabytes means saving 600$, which is the price of a low-end pc or almost a laptop.
another case is if you only have 100 megabytes you can use and only a zzzxxxyyy archiver can compress it into the 100mb while gzip -9 leaves you with 102mb.
so it really depends if you need it or not. sometimes you need it, mostly you don't.
but bashing on the issue "like nobody ever needs it" is certainly wrong.
I'd tell you the chances of this story being a dupe, but you wouldn't like it.