Exhaustive Data Compressor Comparison
crazyeyes writes "This is easily the best article I've seen comparing data compression software. The author tests 11 compressors: 7-zip, ARJ32, bzip2, gzip, SBC Archiver, Squeez, StuffIt, WinAce, WinRAR, WinRK, and WinZip. All are tested using 8 filesets: audio (WAV and MP3), documents, e-books, movies (DivX and MPEG), and pictures (PSD and JPEG). He tests them at different settings and includes the aggregated results. Spoilers: WinRK gives the best compression but operates slowest; ARJ32 is fastest but compresses least."
Nothing to see. High compression = slow and low compression = fast. umm duh?
Which compression format are you going to send the article....
I never would have guessed that there was a tradeoff between the quality and speed of compression! No way! Next they'll be saying things like 1080p HD offers quality at the expense of computational power required!
Screw speed and size reduction. All I want is compatibility with other OSs (i.e., fewest things that have to be installed on a base OS to use it). For that, I'd have to say Zip and/or gzip wins.
So that's why smaller computers are slower, right?
I fill an old station wagon with backup tapes, and then put it in the crusher.
http://www.maximumcompression.com/ ?
Not every piece of software achieves maximum efficiency. It is perfectly imaginable that a compressor could be both slow and bad. It is nice to see that these compressors did not suffer that fate.
As it's slashdotted, this site
http://www.maximumcompression.com/
has been up for years and performs tests on all the compressors with various input sources. It is much more comprehensive.
S'all she wrote, Jim. Coral cache of it works, though.
http://www.techarp.com.nyud.net:8080/showarticle.aspx?artno=4&pgno=0
I remember people did MUCH more exhaustive (30+ programs) comparisons back in the BBS days. Yes... it was a much simpler time.
Bit hard to have a spoiler when the article isn't available.
These posts express my own personal views, not those of my employer
These two formats are still widely used out there, and why are we compressing MP3's?
I read this earlier today through the firehose. It was interesting, but the graphs are what struck me. It seems to me all the graphs should have been XY plots instead of pairs of histograms. That way you could easily see the relationship between compression ratio and time taken. Their "metric" for showing this, basically multiplying the two numbers, is pretty bogus and isn't nearly as easy to compare. With the XY plot the four corners are all very meaningful: one is slow with no compression, two are good on only one axis (compression or time), and one is the sweet spot of good compression and good time. It's easy to tell those on two opposing corners apart (good compression vs good time), whereas with the article's metric they could look very similar.
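To make the point about a single combined score concrete, here is a minimal sketch. The three entries and their numbers are hypothetical, not the article's measurements, and `score` is just one plausible way of folding two numbers into one, not necessarily the article's formula. The Pareto filter is a stand-in for eyeballing the corners of an XY plot.

```python
# Hypothetical measurements: (name, fraction of space saved, seconds taken).
results = [
    ("fast_weak", 0.10, 100.0),
    ("slow_strong", 0.20, 200.0),
    ("slow_weak", 0.10, 200.0),
]

def score(saved, seconds):
    """One possible combined metric: space saved per second of work."""
    return saved / seconds

def pareto_front(rows):
    """Keep the rows that no other row beats on both compression and time."""
    front = []
    for name, saved, secs in rows:
        dominated = any(
            s >= saved and t <= secs and (s > saved or t < secs)
            for _, s, t in rows
        )
        if not dominated:
            front.append(name)
    return front

for name, saved, secs in results:
    print(name, score(saved, secs))
print("worth considering:", pareto_front(results))
```

With these made-up numbers the fast-but-weak and slow-but-strong entries get the same score even though they sit in opposite corners of the plot, while the two-axis view keeps them apart and only discards the entry that loses on both axes.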
Still, interesting to see. The popular formats are VERY well established at this point (ZIP in Windows and Mac (stuffit seems to be fading fast), and GZIP and BZIP2 on Linux). They are so common (especially with ZIP support built into Windows since XP and also built into OS X) I don't think we'll see them replaced any time soon. Of course, with CPU power getting cheaper and cheaper we are seeing formats that are more and more compressed (MP3, H264, Divx, JPEG, etc) so these utilities are becoming less and less necessary. I no longer need to stuff files on floppies (I've got the net, DVD-Rs, and flash drives). Heck, if you look at some of the formats they "compressed" (at like 4% max) you almost might as well use TAR.
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
Which compressors on the list run on non windows platforms?
You have gotta be kidding me, article is posted and there are no best compression test results! Lame!
Nice comparison, but there's really only two that matter (at least on PCs):
ZIP for cross-platform compatibility (and for simplicity for less technically-minded users).
RAR for everything else (at 3rd in their "efficiency" list, it's easy to see why it's so popular, not to mention ease of use for splitting archives, etc).
"Nothing strengthens authority so much as silence." - Charles de Gaulle
This is a poor article on several points. First, the entropy of the data in the files isn't quantified. Second, the strategy used for compression isn't described at all. If WinRK compresses so well on very high entropy data, there must be some filetype specific strategies used.
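For anyone wanting to quantify the entropy of a fileset, a rough sketch follows. It measures only the order-0 (single byte) Shannon entropy, which is an assumption on my part about what "quantified" should mean here; it ignores repeated strings and other structure that real compressors exploit, so treat it as a floor on how random the data looks, not a prediction of any tool's ratio.

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Usage: values near 8 bits per byte suggest already-compressed or random data.
print(byte_entropy(b"aaaaaaaa"))
print(byte_entropy(bytes(range(256))))
```

Running it over each test file before compressing would at least show which filesets were close to incompressible from the start.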
Versions of the programs aren't given, nor the compile-time options (for the open source ones).
Finally, Windows Vista isn't a suitable platform for conducting the tests. Most of these tools target WinXP in their current versions and changes to Vista introduced systematic differences in very basic things like memory usage, file I/O properties, etc.
The idea of the article is fine, it's just that the analysis is half-baked.
What's the point of compressing JPEG,MP3,DivX etc since they already do the compression? The streams are close to random (with max information) and all you could compress would be the headers between blocks in movies or the ID3 tag in MP3.
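A quick way to see this for yourself, as a sketch: random bytes stand in for an already-compressed stream (an assumption; real MP3 or JPEG data still has headers and some residual structure), and Python's `zlib` module stands in for a general-purpose DEFLATE compressor.

```python
import os
import zlib

def compressed_fraction(data: bytes, level: int = 9) -> float:
    """Return compressed size divided by original size for DEFLATE (zlib)."""
    return len(zlib.compress(data, level)) / len(data)

text = b"the quick brown fox jumps over the lazy dog. " * 2000
noise = os.urandom(len(text))  # stand-in for already-compressed media

print(f"repetitive text: {compressed_fraction(text):.3f}")
print(f"random bytes:    {compressed_fraction(noise):.3f}")
```

The repetitive text shrinks to a small fraction of its size, while the random input comes out at roughly its original size or slightly larger because of container overhead.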
They didn't think their cunning plan to create more ad revenue by creating a shitload of pages all the way through...
the most interesting thing about text compression is that there is only about 20% information in the english language (or less). yes, that means that 4/5ths of it is meaningless filler. filled up with repetitive patterns. as you can see, i really didn't need four sentences to tell you that, either.
i wonder how other languages compare, and if there is a way to communicate much more efficiently.
Some people are sending huge graphics files and paying for bandwidth and/or sending to people with slow connections, so they actually have a use for maximal compression.
I have to agree that for most people (myself included), compatibility is all that matters. I'm so glad Macs now can natively zip. But there are valid reasons to want compression over compatibility.
7-zip cribsheet:
Weak on pointless things to zip, like WAV files (use FLAC), MP3s, JPEGs and DivX movies.
7-Zip does quite well on documents (2nd) and e-books (2nd), and is 3rd on MPEG video and 2nd on PSD.
Also, I expect 7-Zip will improve at higher-end compression settings. When possible I give it hundreds of megs, and unlike commercial apps 7-Zip can be configured well into the "insane" range.
Snowden and Manning are heroes.
These days, file compression is pretty much only used for large downloads. In those instances, you really have to use either gzip, pkzip, or bzip2 format, so that your users can extract the file.
Yes, having a good compression algorithm is nice, but unless you can get it to partially supplant zip, you'll never make much money off it. Also, most things these days don't need to be compressed. Video and audio are already encoded with lossy compression, web pages are so full of crap that compressing them is pointless, and hard drives are big enough. Although, I haven't seen any research lately about whether compression is useful for entire filesystems to reduce the bottleneck from hard drives. Still, I suspect that it is not worth the effort.
All I want is compatibility with other OSs (i.e., fewest things that have to be installed on a base OS to use it). For that, I'd have to say Zip and/or gzip wins.
Sure, but there's also the issue of finding the files you really want to share and there KDE has very nice front ends. There's a nice find in Konqueror, with switches for everything including click and drool regular expressions. Krename copies or links files with excellent renaming. Finally, Konqueror has an archive button. The slick interface does not preclude the use of command line tools because the rename and archive programs will take piped input. The GUI is nice for review of the output and easy further processing.
Friends don't help friends install M$ junk.
because then they can use those graphs to pump their sponsor (WinRK)
Snowden and Manning are heroes.
UM, yeah, the dataset includes WAV files. Try flac. Then you will have exhausted a little more of the compression programs available.
This is the First Post compressed really well, so it took until after a few posts to show up.
Have you read my journal today?
I read the article, got shocked at the time spent comparing the compression of MP3s and DiVX, and didn't read much further.
Google's top hit turns up this site which is chock full of data on every compressor you ever & never heard of:
http://www.maximumcompression.com/index.html
Wikipedia has nice charts to quickly see features and OS support for a handful of common compressors:
http://en.wikipedia.org/wiki/Comparison_of_file_archivers
The newsgroup comp.compression has been around awhile, and is maintaining an excellent FAQ:
http://datacompression.dogma.net/index.php?title=Comp.compression_FAQ
I have to admit I switched over/back to ZIP about a year ago for everything for exactly this reason. yeah, it meant a lot of my old archives increased in size (sometimes by quite a bit), but knowing that anything anywhere can read the archive makes up for it. ZIP creation and decoding is supported natively by Mac and Windows and most Linux distros right from the GUI, so it makes it brain-dead simple to deal with.
Recursive: Adj. See Recursive.
Meanwhile, I noticed they didn't include the latest winner of the Hutter Prize, which is unfortunate since its latest entry looks like it will come in at nearly a 10% improvement over all prior text compressors using novel semantic modeling techniques.
Seastead this.
rzip is like bzip2 on steroids. Works great for me.
See http://en.wikipedia.org/wiki/Rzip
I use a single finger
The L-Zip project at http://lzip.sourceforge.net/ seems to be down right now but it should be included in any file compression comparison. It could reduce files to 0% of their original size and it was quick too.
It was so good at what it did that I bet Microsoft bought them out and are going to incorporate the technology into Windows.
Looks like the server was /.'ed. Mirrors: MirrorDot and Network Mirror.
Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
I use MS DOS 6 with doublespace doubling my hard drive space. I store stuff on both C and H drives, zipping, arjing and rarring all my jpgs and divx files. I figure with the amount of compression I'm using, I'll have roughly 20 times as much room as regular plebs. Suckers!
You should also see my 133t power strip setup. I don't need extra sockets, with daisy chaining I can fit as many devices as I want! LOL LOL Unfortunately my faulty circuit breaker keeps switching off at the most inconvenient times, I'll have to get that seen to.
If I have seen further it is by stealing the Intellectual Property of giants.
See also: the Archive Comparison Test. Covers 162 different archivers over a bunch of different file types.
It hasn't been updated in a while (5 years), but have the algorithms in popular use changed much? I remember caring about compression algorithms when I was downloading stuff from BBSs at 2400 baud, or trading software with friends on 3.5" floppies. But in these days of broadband, cheap writable CDs, and USB storage, does anyone care about squeezing the last few bytes out of an archive? zip/gzip/bzip2 are good enough for most people for most uses.
Repton.
They say that only an experienced wizard can do the tengu shuffle.
http://compression.ca/act/ has a much more exhaustive test, and no ads either.
It seems odd that they didn't include executables/dlls in the comparison (where maximumcompression.com does). I also find it odd that they are compressing items that normally don't compress very well with most data compression programs (divx/mpegs/jpegs/etc). I'm guessing this is why 7-zip ranked a bit lower than most.
I did some comparisons last year, and found 7-zip to do the best job for what I needed (great compression ratio without requiring days to complete). The article also doesn't take into account the network speed at which the file is going to be transmitted. I use 7-zip for pushing application updates and such to remote offices (most over 384k/768k WAN links). Compressing w/ 7-zip has saved users quite a bit of time compared to winrar or winzip.
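A back-of-the-envelope version of the network-speed argument, as a sketch: total delivery time is compression time plus transfer time over the link. The file size, compressed fractions, and compression times below are hypothetical, not measurements of 7-zip, WinRAR or WinZip; only the 384 kbit/s link speed comes from the comment above.

```python
def delivery_seconds(size_bytes, compressed_fraction, compress_seconds, link_bits_per_second):
    """Time to compress once plus time to push the result over the link."""
    transfer = size_bytes * compressed_fraction * 8 / link_bits_per_second
    return compress_seconds + transfer

# Hypothetical figures: a 100 MB update sent over a 384 kbit/s WAN link.
SIZE = 100_000_000
LINK = 384_000
quick = delivery_seconds(SIZE, 0.40, 60, LINK)      # fast, weaker compressor
thorough = delivery_seconds(SIZE, 0.30, 180, LINK)  # slower, stronger compressor
print(f"quick: {quick:.0f} s, thorough: {thorough:.0f} s")
```

On a slow link the extra minutes spent compressing are repaid by the shorter transfer, and the gap widens when the same archive goes out to several offices, since the compression cost is paid only once.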
I would definitely recommend checking out maximumcompression.com (As others have, as well) over this article. It goes into a lot greater detail.
I have been thinking about creating a new language with about 60 or so words. The idea is that you don't need a lot of words when you can figure out the meaning by context. Strong points are that the language would be very easy to pick up, and you would get that invigorating feeling of talking like a primitive cave man.
As an example of the concept, we have the words walk and run. They are a bit too similar to be worth wasting one of our precious few 60 words. Effectively, one could be dropped with the other taking on a broader meaning without any real repercussions. The words sit and shit are also fairly similar. When you have a guest over, you can say something like, "Please, shit down." Because of context, it would be all okay. Just remember, there is a difference between shitting on the toilet and shitting in the toilet.
Once you start despising the jerks, you become one.
Er... did ya check out the comparisons? As you can see here, jpeg at least can be compressed considerably with Stuffit. According to this the program can "(partially) decode the image back to the DCT coefficients and recompress them with a much better algorithm then default Huffman coding." I've no idea what that means, but it does seem to be more thorough and complex than what you wrote.
Mod parent up! I noticed that too, very interesting - I wonder whether a jpg compressed as efficiently as the JPEG standard allows could still be improved upon by StuffIt, or whether it just takes advantage of the inefficiency of most jpg compression code..
With Quantum computing perhaps we'll start to see really elegant compression, like 2d checksums with bitshifting. If you can make all the data relate to each other then each bit of compressed file cuts the possibilities in half, get it down to maybe 1,000,000,000 possibilities and then tell it that it needs to be able to play in winamp and... well, use a lot of processing power.
It's closed sourced and proprietary though. Someone needs to make an open-source RAR compressor - the problem is you can't use the official code to do that (as it's specifically in the licence), but you could use unrarlib as a basis...
File compression is also very important for backups, both for capacity and backup/restore speed. But you know what? In backups, you want to ensure that the archives are going to be recognisable and readable by as wide a variety of software as possible, so your disaster recovery options are open. Sure, you probably encrypt them, but there too, portable and fairly standard tools are a better idea than some compression & archival app's built-in half-baked password protection.
As for compressing whole file systems, it doesn't work well because data compresses by variable amounts. It's hard to get a layout that handles this well - when a program overwrites a few blocks of a file, those blocks might grow and force everything to move, or force fragmentation of the file. That sort of thing. You might say to compress the data but store it in the original block layout - which works and solves the above problem, but loses you your performance gains because the drive will generally read a whole block if part of it is needed, so you have no net change. This doesn't mean that efficient read/write compressing file systems aren't possible, just that they are hard, and probably won't perform as well as you might initially expect. They'll also have very _different_ performance characteristics because of the changes required to make them work without insane levels of fragmentation or lots of block copying.
Compressing file systems are amazing for backups, though, where files are written, read, or truncated, but rarely appended to or partially overwritten. I'd LOVE a widely supported r/w compressing FS for our backups here, but have to make do with compressed archives at the moment. Tape drives compress, but I don't have the cash for an SDLT here and we need that kind of capacity.
It's a waste of time using a general purpose compressor on data that's already been compressed by domain specific audio or video compressors.
Even if the amount of additional compression is insignificant, ZIP, RAR, etc. are still very useful as container formats for MP3, JPG, etc. files since it's easier to distribute 1 or 2 .ZIP files than it is 1000 individual .JPG files. And if you're going to package up a bunch of files into a single file for distribution, why not use the opportunity to save a few kilobytes here and there if it doesn't require much more time to do that?
"People that quote themselves in their signatures bother me" - athakur999
I am forever amazed that originating servers & mirrors of oft-released (minor releases of large) EXEs, ISOs, etc. do NOT - by default - compress their files, ie, before the first-requested transfer happens.
PROPOSAL (not likely to be so new, I suppose):
Whenever a requested file is NOT already compressed:
1. On the Server-Side:
- [FTP or HTTP] file-transfer programs/protocols should (by default) compress them (using the best compressor for that type of file), and
- save the now-compressed version of the big file on the server (in case of future requests for the same file).
2. On the Client-Side:
- User can be asked (unless there's been a default reply saved) in which form the file should be saved (ie, compressed or decompressed), and
- the received file is saved in the form requested by User.
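The server-side half of the proposal above can be sketched in a few lines. This is only an illustration under stated assumptions: `compressed_copy` and the cache directory are hypothetical names, gzip is used for every file instead of "the best compressor for that type of file", and no real FTP or HTTP server is involved. (HTTP itself already lets a client and server negotiate gzip transfer through the Accept-Encoding and Content-Encoding headers.)

```python
import gzip
import shutil
from pathlib import Path

def compressed_copy(original: Path, cache_dir: Path) -> Path:
    """Return a gzip copy of `original`, creating it only on the first request."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / (original.name + ".gz")
    # Recompress only if no cached copy exists or the original is newer.
    if not cached.exists() or cached.stat().st_mtime < original.stat().st_mtime:
        with open(original, "rb") as src, gzip.open(cached, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return cached
```

A server would call this when a big file is first requested and send the cached `.gz` from then on; the client-side choice of keeping it compressed or unpacking it stays a local decision.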
We, in Australia, need such compression, as we've recently had significant INCREASES in our Internet service DATA costs... either because ISPs are just beginning to need to invest in ADSL-2+ DSLAMS -or- we're now using data for VoIP applications (and ISPs figure they're entitled to some of the $'s we save) -or- due to greed?
Others may also have high data costs.
In any case, I'm sure no one would mind some server-side changes that would reduce the sheer quantity of data that needs to be transferred.
Interesting algorithms; I suppose that the patents have expired. Key items:
- Tail-biting LZ77.
- Lempel-Ziv-Yokoo LZY 1992, Kiyohara and Kawabata 1996.
- LZ78SEP.
- LZWEP.
- LZYEP.
No War, Peace Again!
Yes. Jpeg includes lossless compression. First, it discards information (which is the part that you can tune), and then it losslessly compresses the result of that step. Stuffit backs out the standard lossless compression and uses some other, better algorithm. If you are worried about it, use Jpeg 2000 or something similar; they are better at discarding information.
Nerd rage is the funniest rage.
Save yourself 24 pages of crap, here's the punchline:
...
...
Aggregate Results
Overall, WinRK was the champion at compressing the filesets. It had an average compression rate of 23.2%. It was 9% better at overall compression than its closest rival, SBC Archiver which had an average compression rate of 21.3%.
The poorest compressors overall, at default settings, were the trio of WinZip, gzip and ARJ32. They only had average compression rates of about 13%.
However, gzip was the undisputed speed champion. It took just over 121 seconds to process the complete fileset collection, which weighed in at over 1.6GB. It was over a third faster than the runners-up, ARJ32 and WinZip.
The other compressors were pretty slow at their normal compression settings. However, WinRK was extremely slow, compared to the others. It took almost 1.5 hours to compress the entire fileset collection.
The most efficient data compressor for the aggregated results was gzip. Its super-fast compression speed, coupled with its average compression rate allowed it to become the undisputed overall efficiency champion. ARJ32 and WinZip were also very efficient compressors. They were more than twice as efficient as their nearest rivals, StuffIt and bzip2.
The other compressors may have been good at certain files, but overall, they were pretty inefficient. The most inefficient compressor overall was WinRK, by a large margin. No matter how good it was at compressing files, its extremely slow compression speed totally killed its efficiency ratings.
Conclusion
WinRK was the best compressor in most filesets it encountered. So, it was not surprising that it was the overall compression champion. However, its performance was offset by its abysmally slow performance. Even with a really fast system, it still took ages to compress the filesets. On several occasions, it took more than 18 minutes to compress just 200MB of files. Thanks to this flaw, it had the dubious honour of being the most inefficient compressor as well.
SBC Archiver, which was just slightly poorer than WinRK at compression was much faster at the job. Although it was nowhere near the top of the speed rankings, its faster speed allowed it to attain a moderate efficiency ranking.
WinRAR, which is a favourite of many Internet users, displayed a surprisingly bland performance at default settings. Although it had a pretty good overall compression rate of just under 19%, it was very slow at its default settings. That made it the third most-inefficient compressor. Surprising, isn't it?
In contrast, another perennial favourite, WinZip which had a lower overall compression rate of 13% managed to attain a much higher efficiency rating because it was able to compress the filesets much faster than WinRAR. Quite surprising since many users have abandoned it for WinRAR in view of its rather dated compression algorithm.
StuffIt is a dark horse. It has a pretty good compression rate overall but with an unimpressive compression speed. However, its amazing performance with JPEG files cannot be denied. JPEG files are undeniably StuffIt's forte. No other compressor even comes within a light year of it.
gzip and ARJ32 are both the fastest and the worst compressors of the lot. They have unimpressive overall compression rates but more than make up for it with their tremendous compression speeds. Therefore, it isn't surprising to see them garner the top two spots in compressor efficiency. However, we would still recommend GUI alternatives like WinZip. It is almost as efficient as gzip and ARJ32 and far more user-friendly.
Based on our results, we can only come to one conclusion. If you do not like to change the settings of your data compressors and want a good, fast and user-friendly data compressor, then WinZip is the best one for the job.
So there you have it - the results of the Normal Compression Test.
PK....k..6]..Y..Q...zip.huSmk.A..~..&.K!..3...GYo. s../..w.^..3...rw.na.sT.9..,$z..Tf..K..os..r.i.saS ..a..O.7...*.._BP.8.W!.`9..*..k..R;.".0.^..;.'..*. o.~L_.7.. T(w.J...6t..i..X.]...u.+..W..?.r..K...Y.O..{.."}.. *,.;..Zp..WZ).YQ.0~2)xE..59C..m+.Vk..t
-William Brendel
By default, Stuffit won't even bother to compress MP3 files. That's why it shows an increase in file size (for the archive headers) and why it has the fastest throughput (it's not trying to compress). If you change the option, the results will be different.
I imagine some other codecs also have similar options for specific file types.
The only thing more pathetic than a PC user is a PC user trying to be a Mac user. We have a name for you people: switcheurs.
We have a name for you people too. Unfortunately, it can't be repeated in mixed company.
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.Web.HttpException: Server Too Busy
Engineering is the art of compromise.
Still, in both cases, it works; who can argue with that.
wtf? How is this highly compatible? gzip has a much larger install base.
Engineering is the art of compromise.
Back in the early/mid 90s I was pretty obsessed with data compression because I was always short on hard drive space (and short on money to buy new hard drives with); as a result I tended to compress things using whatever the format du jour was if it could get me an extra percentage point or two. Man, was that a mistake.
Getting stuff out of some of those formats now is a real irritation. I haven't run into a case yet that's been totally impossible, but sometimes it's taken a while, or turned out to be a total waste of time once I've gotten the archive open.
Now, I try to always put a copy of the decompressor for whatever format I use (generally just tar + gzip) onto the archive media, in source form. The entire source for gzip is under 1MB, trivial by today's standards, and if you really wanted to cut size and only put the source for deflate on there, it's only 32KB.
It may sound tinfoil-hat, but you can't guarantee what the computer field is going to look like in a few decades. I had self-expanding archives, made using Compact Pro on a 68k Mac, thinking they'd make the files easy to recover later, which didn't help me at all now -- a modern (Intel) Mac won't touch it (although to be fair a PPC Mac will run OS 9 which will, and allegedly there's a Linux utility that will unpack CPP archives, although maybe not self-expanding ones).
Given the rate at which bandwidth and storage space are expanding, I think the market for closed-source, proprietary data compression schemes should be very limited; there's really no good reason to use them for anything that you're storing for an unknown amount of time. You don't have to be a believer in the "infocalypse" to realize that operating systems and entire computing-machine architectures change over time, and what's ubiquitous today may be unheard of in a decade or more.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
There's also another, rather uncommon format that wasn't tested and that is somewhat important: UHARC, file extension UHA. It is dog slow, but offers better compression than probably any of the others. It is still used by software pirates with their custom install scripts, and I have seen it in official software install routines as well.
You can keep Rar and zip and toss out the others, but the UHA extension (or a dummy extension) will probably exist on your computer at some point in time.
Even those who arrange and design shrubberies are under considerable economic stress at this period in history.
RAR irritates me though. It's rare enough that I usually have to dig up a decompresser for it and install it special for just one file and then I never use it again. I just don't like having to deal with files that require me to install new software just so I can use that one file. In that vein, I really don't think the article is relevant. I certainly won't use novelty file formats unless it looks like it has "legs". It's not like I want to make a file that becomes useless when the maintainer of the decompression utility loses interest and it goes away.
Why is "ease of splitting archives" considered to be important? You can do it with zip automatically, or with any other archive format you care to choose by using, for instance, split -d -b 2048m - filename, to split the output stream of any compressor into files no larger than 2 gig, with names starting with filename00.
How many systems don't have any form of cat?
Can you be Even More Awesome?!
There is no mention of the file format that is being used for the compression. I would have liked to see the test done comparing all the common formats as well as each program's specialty format.
It makes a lot of sense, considering how my eyelids feel after reading what the article is about.
Andrew Tridgell's rzip wasn't on there either.
http://samba.org/junkcode/
Tridge is one of the smart guys behind samba. And rzip is pretty clever for certain things. Just ask google.
Is TFA available in zip format?
This site uses *23* pages. Does anyone else hate pagination? Sure, it has its uses (decreasing bandwidth consumption) if you're planning on using it as a reference, but if it's an ARTICLE meant for one-off reading, please, for Pete's sake, just use single page format and save us all the hassle of waiting 5 annoying seconds as the next page loads with all your shitty design!
At the end of the day, if you want to use a "quick link" system to quickly get someone to the conclusion, use this handy thing invented back in "the day". It's called Hypertext - use a hyperlink, use it in a table of contents, and save us all a crapload of wasted time.
Well, on Windows I use 7Zip. It is registered to the following extensions: 001, 7z, arj, bz2, cab, cpio, deb, gz, iso, rar, rpm, tar, z and zip. All those I have used worked just fine for decompression (meaning arj, bz2, gz, iso, rar, tar, z and zip). It can only create 7z, zip and tar though.
Ahhh...the great dumpster continuum. Many a free computer will be found there. -- sowth (748135)
For those that don't know how to join files on Windows, it would be:
The /b parameter is very important because it indicates to join the files in binary format. That said, I do not know how to split files on Windows.
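For the splitting half, one portable option is a few lines of Python, which behaves the same on Windows, Linux and Mac. This is a minimal sketch: `split_file` and `join_files` are hypothetical helper names, and the `.000`, `.001` suffix scheme is just a choice made here. Each chunk is read fully into memory, so pick a chunk size your machine can hold.

```python
from pathlib import Path

def split_file(path: Path, chunk_bytes: int) -> list[Path]:
    """Write path.000, path.001, ... each at most chunk_bytes long."""
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_bytes)
            if not chunk:
                break
            part = path.with_name(f"{path.name}.{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts

def join_files(parts: list[Path], output: Path) -> None:
    """Concatenate the parts in binary mode, in the order given."""
    with open(output, "wb") as dst:
        for part in parts:
            dst.write(part.read_bytes())
```

Everything is opened in binary mode for the same reason the /b switch matters: text-mode handling could alter or truncate the bytes of an archive.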
Ahhh...the great dumpster continuum. Many a free computer will be found there. -- sowth (748135)
The article conveniently forgets to mention whether the compression tools are cross-platform (OSX, Linux, BSD) and/or open source or not.
That makes a lot of them utterly useless for lots of people. Yet another windows-focussed review, bah.
The "winners" have special compression modes for .wav files, etc. (lossless audio algorithms) so of course they "won" for datasets which include those files.
On the other hand, "zip" works everywhere. I can send a zip file to my granny and she could find things inside it.
No sig today...
http://www.maximumcompression.com/
THERE is the most exhaustive data compressor comparison. Including many different tests, and scores of compressors.
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
This is easily the WORST article I've ever seen on data compressors.
Does anybody zip their mp3 or avi files?
As for their "Compression Efficiency" calculation, that's about as much use as a chocolate teapot.
No sig today...
Does it address the number of overflows, smashed stacks, and tap-dancing-on-the-heap vulnerabilities? I monitor software vulnerabilities for my employer, and there has been a steady flow of exploitable bugs in archiving software (everything from zlib to WinZip). (Who knew WinZip includes an ActiveX control, allowing users to be owned via a website?!) Many anti-virus apps have also been vulnerable to issues in unpackers of various flavours.
Everything I needed to know about life, I learnt from Blake's Seven
Poor article. Even the Wikipedia article is more "exhaustive." http://en.wikipedia.org/wiki/Data_compression
Even two minutes googling for "data compression" will get you more useful and better "compressed" information.
http://www.ics.uci.edu/~dan/pubs/DataCompression.html
http://datacompression.info/
http://www.maximumcompression.com/
http://www.compression-links.info/Link/248_Markov_Predictive_Coders_PPMZ.htm
I found a similar article written in Italian: http://www.amdplanet.it/archivio/articoli/131/ The results are the same...
I don't care about which compression mechanism works the fastest or produces the smallest files. I care about usefulness. The format has to be open and widely-used, and the algorithms have to be reasonably fast. That means I either use .zip, .tar.gz, or .tar.bz2. Goofy formats like .rar and .ace just aren't worth the headache.
http://outcampaign.org/
This is easily the best article I've seen comparing data compression software
Hardly exhaustive. There is no mention of rzip.
Have you got your LWN subscription yet?
Give it an MD5 hash and a file length and it will compute all the possible files that could have produced the hash. Automatically filter out the invalid files and the set you're left with can't be that large.
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
for general purpose lossless compression. Most modern compression utilities out there mix and match the same algorithms which do the same thing.
With the exception of compressors that use arithmetic coding (which has patents out the wazoo covering just about every form of it), virtually all compressors use some form of Huffman compression. In addition, many use some form of LZW compression before executing the Huffman compression. That is pretty much it for general purpose compression.
Of course, if you know the nature of the data you are compressing you can come up with a much better compression scheme.
For instance, with XML, if you have a schema handy, you can do some really heavy optimization since the receiving side of the data probably already has the schema handy which means you don't need to bother sending some sort of compression table for the tags, attributes, element names, etc.
Likewise, with FAX machines, run length encoding is used heavily because of all the sequential white space that is indicative of most fax documents. Run length encoding of white space can also be useful in XML documents that are pretty printed.
Most compression algorithms that are very expensive to compress are usually pretty cheap to decompress. If you are providing a file for millions of people to download, it doesn't matter if it takes 5 days to compress the file if it still only takes 30 seconds for a user to decompress it. However, when doing peer to peer communication with rapidly generated data, you need the compression to be fast if you use any at all.
Nevertheless, most general purpose lossless compression formats are more or less clones of each other once you get down to analyzing what algorithms they use and how they are used.
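The run-length encoding mentioned above for fax documents can be sketched in a few lines. This is only an illustration of the idea, not the actual fax (ITU T.4) coding; the function names and the sample "scanline" are made up for the example.

```python
from itertools import groupby

def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse each run of identical bytes into a (byte_value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(data)]

def rle_decode(pairs: list[tuple[int, int]]) -> bytes:
    """Expand (byte_value, run_length) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for value, count in pairs)

# Hypothetical scanline: mostly white (0x00) with a short black (0xff) mark.
scanline = b"\x00" * 60 + b"\xff" * 3 + b"\x00" * 37
pairs = rle_encode(scanline)
print(pairs)  # three pairs stand in for 100 bytes
```

The scheme only pays off when runs are long; on data without runs, every byte turns into a pair and the output grows.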
How well do the compression algorithms compress the other compression algorithms files? :)
It has poor compression and slows down the filesystem viciously, mostly due to fragmentation; I've seen 200000 fragments in a single file!
I think the compression algorithm it uses is LZW; you're lucky to get 1.5:1 in the best cases.
There are other issues, like a 20Gb compressed file giving fake disk errors (on a drive with 40Gb of free space) but generally the poor compression and performance is enough to ensure that you don't want to use it.
Explain this.
MP3 (aka MPEG-1 Layer 3), MPEG-1/2/4 video, DivX, JPEG, and non-PCM WAV are all compressed media files. Using a file compression utility meant for documents on these formats is a silly waste of time.
I mean, why not compress your already compressed media with all eleven tools in a chain while you're at it?
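A quick way to see the point about already-compressed data is to run a compressor over its own output. The sketch below uses Python's zlib (deflate) as a stand-in for the archivers in the article; the word list and seed are arbitrary choices for the example.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
words = ["zip", "gzip", "bzip2", "archive", "stream", "block", "file", "ratio"]
text = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")

once = zlib.compress(text, 9)   # plain text: shrinks a lot
twice = zlib.compress(once, 9)  # compressed data: little or nothing left to gain

print("original:", len(text))
print("compressed once:", len(once))
print("compressed twice:", len(twice))
```

The first pass removes most of the redundancy, so the second pass has almost nothing to work with. The same applies to MP3, DivX and JPEG files fed to a general-purpose compressor.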
"Mine goes to eleven."
/dev/null beats them all in time and space.
Without that there is "Nothing to see", "Move along, move along"
Perhaps to http://compression.ca/act/act-calgary.html
Maybe RAR includes special features for multi-part archives: seeing the archive contents when you only have a single archive, or even extracting as much as possible from only a subset of the archives. Or even something like PAR, letting you get all of the data when you are missing one or two archives by adding error correction data. I don't know that RAR has any of these features, though, except for the first one.
Switch back to Slashdot's D1 system.
This reminds me of... pkunzip.zip
The saddest poem
Back then it was a case of trying to compress all the source for a project (in Turbo Basic) onto a single floppy for a quick backup. I vaguely remember that ARJ gave the best compression then. I suspect we were comparing with ZIP and LHA.
:)
We also went through various sorts of DOS (MS, DR) trying to find the one that gave us the maximum free RAM so we could compile the project.
Happy days
*imagines parent comment spoken in the voice of comicbook store guy off the simpsons* . . .
heh
Yes, LZMA is good, and more importantly it's free (would you really trust your data to some binary blob implementing a secret algorithm?). On Windows there's the excellent 7-zip (also free) and on Unix you can use LZMA Utils to get a gzip-style single file compressor, though it's still a bit developmental and it doesn't have gzip or bzip2's advantage of being well-known and installed everywhere.
However, the very best lossless compression, not mentioned in the article, is probably lrzip which combines LZMA compression with a pre-compression stage of shuffling around the data somehow (a bit technical I know, but bear with me). It likes to gobble memory but it tends to be either much smaller than bzip2, much faster than bzip2, or both.
-- Ed Avis ed@membled.com
Here is another comparison on the Linux Journal which compares tools such as rzip, lzop, lzma and 7za in addition to bzip2 and gzip.
Gzip and bzip2 compress only one file into one package*, and the common method thus is to use tar+gzip or tar+bzip2. While this may not make any difference for video and audio, I think that it makes at least some difference for the documents. The article does not say anything about tar, so I wonder if that might have changed the results, at least a bit. The man page of gzip at least says that using an archiver such as tar before compressing improves the results.
.tar.gz and .tar.bz2 are such a common combination that, in a sense, they can be considered archive formats in their own right, imho.
On the other hand, tar could be used in combination with other packages as well - maybe it would have changed their results, too..? But still,
*) Although for bzip2, you can concatenate the packages (you can concatenate even gzipped files, but they can't be uncompressed into separate files, and you don't usually want that). I haven't tried this ever, though, but that's what the man page says.
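The footnote about concatenating gzipped files can be checked with Python's gzip module, which reads multi-member streams. This is a small sketch with made-up contents; it shows that the members decompress as one continuous stream, so the file boundaries are lost, which is why tar is normally used first.

```python
import gzip

first = b"contents of the first file\n"
second = b"contents of the second file\n"

# Equivalent in spirit to: cat first.gz second.gz > both.gz
both_gz = gzip.compress(first) + gzip.compress(second)

joined = gzip.decompress(both_gz)
print(joined.decode("ascii"), end="")
```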
It was more important back in the floppy days where you could also do things like 'split to use all the free space on this disk'. RAR was the first one (or first one I saw) that implemented that right. Might still be useful e.g. to split huge datasets to write to multiple DVDs.
cat: not all compressors/decompressors accept streams (e.g. rzip), and cat won't work across volumes: taking the DVD dataset example again you couldn't use a streaming cat decompress unless you've got as many DVD drives as you have data discs whereas using RAR et al you could use a single drive that would let you switch DVDs in and out. Or I suppose you could write a disk-change-capable cat, though.
It may not be an Open Source license, but its source is available and it is portable. ftp://ftp.rarlabs.com/rar/unrarsrc-3.7.3.tar.gz.
7-zip, multiplatform, superior speed and compression, open source.
I only skimmed the article but what with all the hullabaloo about dual/quad core chips, why didn't they use "exhaustive" as an excuse to check out the parallelisability (if that's a word) of each compression algorithm? IIRC they didn't list the hardware they used or any of the switches they used, which is a glaring omission in my book.
;)
Of all the main compression utils I use, 7-zip, RAR and bzip2 (in the form of pbzip2) all have modes that will utilise multiple chips, often giving a pretty huge speedup in compression times. I'm not aware of any SMP branches for gzip/zlib but seeing as it appears to be the most efficient compressor by miles it might not even need it
It's mainly academic for me now though anyway, since almost all of the compression I use is inline anyway, either through rsync or SSH (or both). Not sure if any inline compressors are using LZMA yet, but the only time I find myself making an archive is for emailing someone with file size limits on their mail server. All of the stuff I have at home is stored uncompressed because a) 90% of it is already highly compressed and b) I'd rather buy slightly bigger hard drives than attempt to recover a corrupted archive a year or so down the line. Mostly I'm just concerned about decompression time these days.
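For the curious, the general idea behind multi-core bzip2 tools can be sketched as follows: split the input into chunks, compress each chunk as an independent bzip2 stream on a worker, and concatenate the results. This is a rough illustration under my own assumptions (chunk size, worker count, function name), not how pbzip2 is actually implemented.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

def parallel_bz2(data: bytes, chunk_size: int = 900_000, workers: int = 4) -> bytes:
    """Compress fixed-size chunks independently and concatenate the bzip2 streams."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the chunks in their original order.
        streams = list(pool.map(bz2.compress, chunks))
    return b"".join(streams)

payload = b"some moderately repetitive log line\n" * 80_000
packed = parallel_bz2(payload)

# bz2.decompress accepts a concatenation of several streams.
restored = bz2.decompress(packed)
print(len(payload), len(packed), restored == payload)
```

The price is a slightly worse ratio, since each chunk is compressed without knowledge of the others.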
Moderation Total: -1 Troll, +3 Goat
I suppose while CPUs are becoming faster the things that we want to zip are becoming larger. I think the average time spent zipping things up has stayed the same over the years and it's just the .zip files that have grown bigger.
With CPUs these days is there any reason not to default to max compression?
I have excellent Karma and I am not afraid to Troll it.
While the main thrust of JPEG is to do "lossy" compression, the final stage of creating a JPEG is to do lossless compression on the data. There are two different official methods you can use: Huffman Coding and Arithmetic Coding.
Both methods do the same thing: they statistically analyse all the data, then re-encode it so the most common values are encoded in a smaller way than the least common values.
Huffman's main limitation is that each value compressed needs to consume at least one bit. Arithmetic coding can fit several values into a single bit. Thus, arithmetic coding is always better than Huffman, as it goes beyond Huffman's self-imposed barrier.
However, Huffman is NOT patented, while most forms of arithmetic coding, including the one used in the JPEG standard, ARE patented. The authors of Stuffit did nothing special - they just paid the patent fee. Now they just unpack the Huffman-encoded JPEG data and re-encode it with arithmetic coding. If you take some JPEGs that are already compressed with arithmetic coding, Stuffit can do nothing to make them better. But 99.9% of JPEGs are Huffman coded, because it would be extortionately expensive for, say, a digital camera manufacturer, to get a JPEG arithmetic coding patent license.
So Stuffit doesn't have remarkable code, they just paid money to get better compression that 99.9% of people specifically avoid because they don't think it's worth the money.
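The "at least one bit per value" limit is easiest to see on a very skewed source. The sketch below computes the Shannon entropy of a made-up message (95 of one symbol, 5 of another) and compares it with the one bit per symbol that a two-symbol Huffman code has to spend. An arithmetic coder can approach the entropy figure; the numbers here are an idealised bound, not measurements of any real JPEG coder.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(message: str) -> float:
    """Shannon entropy of the symbol frequencies, in bits per symbol."""
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

message = "A" * 95 + "B" * 5
h = entropy_bits_per_symbol(message)

huffman_bits = len(message) * 1  # two symbols: codes "0" and "1", one bit each
ideal_bits = len(message) * h    # what an arithmetic coder can approach

print(f"entropy: {h:.4f} bits/symbol")
print(f"Huffman: {huffman_bits} bits, ideal: {ideal_bits:.1f} bits")
```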
Does my bum look big in this?
http://www.maximumcompression.com/ is better source
Simpletron may be what you require, pizzach. It refactors the English language such that there is only one word for any given concept. All colours are purple, all distances are a mile, all numbers are seven, etc. It's quite handy!
But by that metric Opera's doing well.
http://www.google.com/trends?q=firefox%2C+internet+explorer%2C+opera&ctab=0&geo=all&date=all
By any other metric it isn't doing particularly well, despite it being a fairly competent browser.
Also FatPhil on SoylentNews, id 863
In its efficiency graphs they order the negative scoring ratios wrong! After all, they consider something that adds 1MB in 2 seconds to be worse than one that increases the size by 1MB in 2 minutes. So doing the same thing *slower* actually ranks it ABOVE the other one. Plus, what matters, even for large files, is NOT the time for compression. What you REALLY want to compare is the ratio and the time for EXTRACTION on those settings. Any file will be compressed once, decompressed thousands of times. A minute longer to produce means little. A minute longer to extract for everyone extracting it matters a lot.
That was my point. Google Trends is a really great way of seeing what people are searching for on Google, but it isn't a good way to learn what software people are using.
In many jurisdictions, mathematical processes are absolutely not patentable. Are payware compression tools using "patented" algorithms cheaper in those countries where zero royalties are owed, or are British consumers being shafted up the arse again?
I only need a faster decompressor and slower compressor, not a slower decompressor and faster compressor.
zip: OK.
gzip : OK.
bzip2: Fatal.
7zip: OK.
With disk space becoming less and less expensive with each passing week, any of the compressors would work fine for nearly everyone's need.
The article seems to be measuring the compression speed of each program with its native algorithm; it would have been better to test a set of programs with each algorithm first. As the article is comparing two variables at once (how good the algorithm is and how good the implementation in that program is), the results are slightly meaningless.
Having said that, do I really care in practice that much about if algorithm A is 5% faster than algorithm B? I personally do not; I care if the person receiving them can open them. So the second problem with the article is that it is one computer user on his own; in the real world you would just distribute .zip and .tar.gz because you know people will be able to open them. Proprietary algorithm X may be really efficient but if no one can open it, who cares?
My little Linux and tech blog
Apache 2 has mod_deflate for compressing data on the fly as it's sent. There are caveats with that though. Some browsers don't support it well, and like anything else, compressing files that are already compressed doesn't buy you anything.
I was actually expecting to get moderated funny instead of interesting...but there is still time. I suppose my post had shown even more thought than I had even thought it did. Choosing the number 60 was a bit extreme and random, but I was trying to emphasize what I am aiming for. It's called shooting for the stars and landing on the moon. I don't know how many words the language would come out to be if/when I start creating it.
One thing that I didn't expect was the stream of very informative posts. Thanks to the people who replied with constructive comments! 118 word toki pona shows that you can do a lot with a little. Simpleton is a bit more extreme than I was aiming for though. (laugh) You don't have to have a concept for everything in a language. Klingon is a language whose words tend to relate to war. Toki pona focuses on "the good things in life."
The idea for a language where one word can have a lot of meanings actually came from studying Japanese. Here are some examples from the Japanese English edict dictionary on Jim Breen's site:
Once you start despising the jerks, you become one.
Real men upload their code to public FTP servers and let the world mirror it.
I want to delete my account but Slashdot doesn't allow it.
Things are less blue.
(And I'm not speaking of the sky)
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
What is this kgb compression I'm starting to see? I found the site and couldn't install from source under Linux. The Windows software installed but I haven't been able to open a damn thing with it. The docs are pretty crappy.
Anybody had any good experience with kgb-compressed files?
Hardly a spoiler!
America, Home of the Brave.
As linked by other folks on this thread, maximumcompression.com will show that WinRK (proprietary) and PAQ8 (GPL) take the crown in compression. The free PAQ series (wiki, homepage) kick some serious butt...
(Tested on a Project Gutenberg text "The Man who was Thursday")
79105___thurs.paq8l-7
79112___thurs.paq8l
96495___thurs.bz2-9
96708___thurs.rz
107583__thurs.7z
123847__thurs.gz-9
320553__thurs.txt
--
Slashcode bug # 497457 - unfixed since December 2001 - Go look it up!
o/~ Join us now and share the software
...and then you have to join the split-up files again before you can extract. RAR (and other archivers with split archive support) automagically extracts without wasting time and disk space on a 'join' operation.
Coffee-driven development.
"The idea for a language where one word can have a lot of meanings actually came from studying Japanese"
You don't need japanese for that: http://dictionary.reference.com/browse/set
The Mac may have started out as a machine for artists, but along the way Apple figured out artists never have much money and hang on to computers for decades past when they should have been replaced.
So Apple started selling computers to regular people and business people and sysadmins and anyone else who can afford one.
Starving artists? Who gives a shit about them when Apple can sell a 4 grand PC to somebody who just wants to have one? Compression method? It doesn't fucking matter WHAT compression method they use. SIT, DMG, ZIP, whatever.
Use the one that suits your needs and shut the hell up. Oh and get a real job you artist bum.
http://code.google.com/p/zipmt/
Wow, that's great compression. What's the status of the decompression support?
Well, maybe it's related to this! :-)
The Tao of math: The numbers you can count are not the real numbers.
http://www.freebyte.com/hjsplit/#win32
Nerd rage is the funniest rage.
No shit! AJR32 isn't even the name of a program. No wonder it executes fast, and your faith must be stronger than Yoda's if it has any effect on your file size at all.
RAR has recovery records (settable percentage of each archive dedicated to ECC, default off) and recovery volumes (dedicated files with PAR-like recovery capabilities). "Keep broken files" can be used to extract from broken or truncated archives.
The thing that I find the strangest is that modern compressors also have special modes for JPEG files.
Either they detect them quickly so they can skip compressing them entirely and achieve superior speed.
Or some compressors use a special mode, where the software decompresses the JPEG data back to the DCT stage and then uses a more modern and efficient algorithm than the original Huffman code to store the DCT data.
It's strange because although their suite of software included StuffIt, they completely failed to demonstrate it.
(Instead, apparently StuffIt went for the "avoid compression to gain speed" route)
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Put it another way, if I compress a file whose SHA1 sum is: I do want the decompressed file to give the same SHA1 sum. Otherwise we are not talking about compression/decompression and it's really comparing apples to oranges.
Another thing: JPEG is (usually) lossy.
Say I've got my nice picture and save it from Photoshop (or The Gimp) using a "80% efficiency". And I've got a lossy JPEG that I consider OK. The last thing I want is a compressor adding *additional* compression artefacts to my pictures. I know JPEG in my case is lossy, but that doesn't mean I want to have a picture crappier than my already lossy "original". So do these fake compressors (as explained before, if the uncompressed file isn't bit-for-bit identical to the original, we're not talking about a compression program) add other artefacts to JPEG pictures or are they smart enough to "replace" the Huffman with another algo that produces exactly the same output?
Otherwise not only are we not talking about a compression program but we're not talking either about a program correctly encoding a picture.
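The bit-for-bit requirement above is easy to test for any lossless compressor: hash the input, round-trip it, and hash the output. A minimal sketch using Python's lzma module as the compressor under test (the payload is a placeholder, not a real JPEG):

```python
import hashlib
import lzma

original = b"Any payload will do, e.g. the bytes of a JPEG file. " * 100

restored = lzma.decompress(lzma.compress(original))

same = hashlib.sha1(original).hexdigest() == hashlib.sha1(restored).hexdigest()
print("bit-for-bit identical:", same)
```

A tool that recompresses JPEGs internally would have to pass the same check on the extracted file to count as lossless in this sense.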
I don't know if I'd call foul, but according to the article, the default configuration that they test gzip with was 'gzip -5 -v' but according to http://www.gnu.org/software/gzip/manual/gzip.html and every other version of the gzip manual I've read, the default compression level is -6. This will make gzip's default setting appear to compress less and run faster than the real default settings. This is incorrect.
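To get a feel for how much one level step matters, the deflate levels can be compared directly. The sketch below uses Python's zlib module as a rough stand-in for the gzip command line (same deflate algorithm, same 1 to 9 level scale, but not the same program the article tested); the sample data is made up, so the sizes say nothing about the article's filesets.

```python
import zlib

# Hypothetical sample: some repetitive text plus a block of less regular bytes.
sample = ("The quick brown fox jumps over the lazy dog. " * 200).encode("ascii")
sample += bytes(range(256)) * 20

sizes = {level: len(zlib.compress(sample, level)) for level in (1, 5, 6, 9)}
for level, size in sizes.items():
    print(f"level {level}: {size} bytes")
```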
Does this mean that the jpeg you put into a Stuffit archive may not match the jpeg you pull out of it, or does it recode to Huffman when you extract?
"No one likes working in a hamster wheel, and your shop smells of cedar shavings from here." - TaleSpinner
The person that did the comparison does not understand the significance of compression in practice. Files that are not really compressible by general-purpose compressors do not form a useful benchmark set. For example, trying to compress JPEG is an exercise in futility and only demonstrates an astonishing degree of incompetence. One typical thing that should have been done is to eliminate all tests where compression to at least 70% of the original size was not reached, since compression makes no sense in these cases at all. Another thing that definitely should have been in there is compression of a typical HDD backup! That is where I, and probably many other people, use compression most.
Bottom line: 90% useless, at least 90% of the useful test-cases missing.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
I have written one such compression algorithm myself. And it really does work. I'm going to ship it, just as soon as I fix this one lingering bug in the decompression routine . . .
I also have a different but equally revolutionary compression algorithm under development. It can compress a file of any size down to one byte. In my proof-of-concept tests, it successfully compressed and then decompressed 256 different files, some of which were over 100GB in size! I'm working on adding support for more than 256 files, but I've got more research to do first.
FATMOUSE + YOU = FATMOUSE
Try 7 Zip. The 7z compression format is totally unknown, but 7Zip will manage to compress and uncompress into a wide variety of formats (including zip, rar). It has a decent GUI and shell integration and to top it all, it is open source.
My other OS is the MCP!
for example bzip uses the burrows wheeler transform, which dates to about 1992, and is interesting to read about. basically it transforms blocks (say 200k at a time) so the block is "sorted" and stores the transform required to reverse this. then applies a simple runlength/delta style compression to the sorted block. this turns out to produce compression ! huffman and arithmetic compression are about as old as information theory itself.
http://en.wikipedia.org/wiki/Burrows-Wheeler_transform
one advantage is that if there is corruption in a block then you only lose that block, instead of the entire file.
of course there are many ways of encoding redundancy/recovery info. some much more sophisticated. i remember doing a math unit which involved the group properties of elliptic curves. which are cubic in one var and quadratic in the other. they look like a circle and a hyperbola from memory and you define a group operation by using the property that two points form a line that intersects somewhere else (then you flip on an axis). they can be used for encryption, but also to provide "global" "hologram style" redundancy. i think i have read about using it for some next gen optical disks (they use several layers of error correction), this was years ago, so its possible it has already been implemented in hddvd or bd.
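The Burrows-Wheeler transform mentioned above can be shown with a deliberately naive implementation: sort all rotations of the block and keep the last column. Real bzip2 uses far more efficient sorting and follows the transform with further coding stages; the "$" end marker and function names here are assumptions for the sketch, and the marker must not occur in the input.

```python
def bwt(text: str, end: str = "$") -> str:
    """Return the last column of the sorted rotations of text + end marker."""
    s = text + end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def inverse_bwt(last: str, end: str = "$") -> str:
    """Rebuild the rotation table column by column, then pick the original row."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    original = next(row for row in table if row.endswith(end))
    return original[:-1]

transformed = bwt("banana")
print(transformed)               # identical letters end up next to each other
print(inverse_bwt(transformed))
```

The transform itself saves nothing; it just groups similar characters so that the simple coders applied afterwards do much better.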
Thanks for the free pr0n!
Text? XML? Source code? Executable code? GIF? PNG? AAC? Flash SWF? JAR files? Actual E-Books (MS LIT, eReader, Plucker)? Which version of Office docs are those? Are they XML? Quality level of JPEGs? Could the ratios of HTML and Office documents be any more arbitrary?
Yes, it recodes it into Huffman after extraction. Here is Aladdin's white paper on how they do it.
Does my bum look big in this?
I'm no expert, but I don't think this is accurate. When you use stuffit on a jpeg you get a stuffit file (.sitx) not a jpeg. They are using their own algorithm to compress the jpeg, not simply changing it to another form of jpeg. A small project like PAQ8 has similar (but not as good) jpeg compression but I seriously doubt that they paid a patent fee. Both are compressing the data with their own algorithms and then decompressing them and converting them back to the original file.
Again, I'm no expert, so if I'm wrong please let me know.
http://www.popularculturegaming.com -- my blog about the culture of videogame players
It's true that LZMA often flies below the radar and not many people are aware of it (just try Googling for it, or looking for research papers about it--there's not much).
However, it is the algorithm used in 7-Zip. It is represented in this test.
Speaking as a person with interest in 64K intros, LZMA is an awesome, awesome algorithm if you need fast decompression and *small decompression code*. A carefully hand-tuned implementation of an LZMA decompressor would be less than 2K of assembly code, and could perhaps be crammed into 1K by a sufficiently clever hacker. This is an order of magnitude smaller than most algorithms that can give comparable compression performance.
The high compression of LZMA comes from combining two basic, well-proven compression ideas: sliding dictionaries i.e. LZ77/78, and markov models (i.e. the thing used by every compression algorithm that uses an arithmetic encoder or similar order-0 entropy coder as its last stage). LZMA is awesome because the contexts used in its model are segregated according to what the bits are used for. Folding that knowledge right into the model results in a simple but very effective compression scheme.
Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
You get to pretend to be interested in, nay like computers, yet at the same time deride, insult and attempt to belittle your fellow comrades that share your interests. Behold the wonders of modern technology as you too can become a power user without actually having to care about computers, algorithms or hacking! You don't need to learn any obscure and arcane mumbo jumbo to run your new limited range of overpriced junk.. *ahem* feature-full all in one complete deluxe no-need-to-ever-upgrade top of the line hardware.
So don't wait any longer. Tilt your head high and stare down your nose at the pathetic dweebs that have no class. Get superiority, snobbishness and a dash of self loathing as a Mac geek!
--
(apologies to real Mac users)
Yes, they convert JPEGs to .SITX files. But the data inside the .SITX file is basically that JPEG data unpacked and then repacked with an arithmetic coder. The actual method used is called "Arsenic" and uses arithmetic coding, RLE and a block sorting compressor based on the Burrows-Wheeler Transform.
Does my bum look big in this?
huh i don't get this, since a bit is a computers' smallest unit of information, how does one fit several values into it ?
seriously i would like to know that
It seems PAQ8L does something similar to Stuffit while being free. Did they pay for the patent as well, but decided to distribute their code for free anyways?
What I don't get is, why is the *nix community attached to tar+gzip or tar+bzip? It seems like a couple of outdated formats with bad archiving features. Why would anyone want to go through the trouble of processing files twice just to compress a directory?
I was more thinking of a native way ;-)
Ahhh...the great dumpster continuum. Many a free computer will be found there. -- sowth (748135)
I'm having a hard time following you, as I am not a compression algorithm expert. Can you explain how to "fit several values into a single bit"? Where I come from, that would be considered a very good trick indeed.
The basic premise is that a specific sequence of symbols, based on probability, boils down into a thin fractional range. The Wikipedia article on arithmetic coding explains it quite well.
When you add a symbol, sometimes the binary representation is not precise enough to represent that range, so you add a bit (or several bits) to make it more precise. At other times, the binary fraction is already precise enough to represent the updated range after you've added a symbol, and in those cases you don't need to add any more bits to the output value.
Does my bum look big in this?
Looking at their compression rate for documents the data looks highly suspicious. Experiment after experiment reports text compression in the order of 20-30% of original size using bzip2, yet they only get 60% of original size??? This is half as good as widely reported figures!
>>file compression is pretty much only used for large downloads
not entirely correct -compression is also used to reduce bandwidth/maximize data in streaming data applications such as financial market trade data.
The company I work for is switching backend feeds, and although the new feed is 'richer' we need to have it compressed so that clients who have been using dedicated 512kbps circuits will not have to upgrade to fractional T-3 circuits at great expense.
So we are compressing on the fly at the server end and decompressing at the client side -which except for a slight delay is transparent to the customers.
We were going to use PKzip, but the license was too dear, so we settled for gzip
-I'm just sayin'
With arithmetic encoding, however, you'd encode each character according to the exact probability it has of occurring and write it as a fractional number between 0 and 1. For example, if you want to encode an "A", you'd pick a number between 0.0 and 0.2 (the lower 20% of our number); if you want to encode a "B", you'd use a number between 0.2 and 1.0 (the upper 80% of our number).
What you keep track of during encoding is the upper and lower bound of this number. So, when I want to encode the first "A", my lower bound is 0.0 and the upper bound is 0.2. The next character to encode is "B". We already know the range we can pick from is 0.0 - 0.2, but to encode a "B" we need to pick a number in the upper 80% of this range, so 0.04 to 0.20 (picking a number between 0.0 and 0.04 would encode another "A").
The next letter, another "B", would use a range 0.072 - 0.200. The 3rd "B" would narrow the range to 0.0976-0.2000. The 4th "B" narrows it to 0.11808 - 0.2000.
At some point, the upper and lower bound will have a few most significant digits in common that cannot change anymore. When this occurs, you can start writing these out as part of your compressed stream. For example, when we encode the 6th character (the 2nd "A"), the range becomes 0.118080 - 0.134464. The first two digits (0.1) can't change anymore now, so we can write them out, and just continue narrowing the range further for subsequent data to be compressed.
At some point, there'll be no more data to be compressed, and you then just pick a number (as convenient as possible) between the upper and lower bound you have established, write it out and end the stream. The process is the same when doing this with binary floating point numbers.
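The interval narrowing walked through above can be reproduced in a few lines. The sketch uses exact fractions instead of floating point so the bounds match the decimal figures in the text; the model (A = lower 20%, B = upper 80%) and the message "ABBBBA" come from the example, while the function and variable names are my own.

```python
from fractions import Fraction

# Each symbol owns a slice of the unit interval: A = [0, 0.2), B = [0.2, 1).
MODEL = {
    "A": (Fraction(0), Fraction(1, 5)),
    "B": (Fraction(1, 5), Fraction(1)),
}

def narrow(message: str, model=MODEL):
    """Return the (symbol, low, high) bounds after each encoded symbol."""
    low, high = Fraction(0), Fraction(1)
    history = []
    for symbol in message:
        width = high - low
        sym_low, sym_high = model[symbol]
        # Keep only the part of the current range that belongs to this symbol.
        low, high = low + width * sym_low, low + width * sym_high
        history.append((symbol, low, high))
    return history

history = narrow("ABBBBA")
for symbol, low, high in history:
    print(symbol, float(low), float(high))
```

A real coder works with fixed-width integers and shifts out the settled leading digits as it goes, as described above, rather than keeping ever-growing fractions.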
SBC archiver is worth the extra hassle... If you're dealing with billable network transfers. Someone needs to reverse engineer the application so we can implement it on *nix systems.
UHARC (http://en.wikipedia.org/wiki/UHarc) is missing from that list. It's best known for game rips; it compresses multimedia files really well but also takes a lot of time to do it.
Well, the first thing that I notice in this so-called report is the constantly rotating adverts. They refresh the page every second. So I believe this was the only reason for writing this report.
Then, let's take a look at the data. WinRAR - the fastest method was "fastest", ok, but why the "default" was also "fastest" and not "Normal" as it is in reality? And where is its "Slowest" method?
My opinion - this report was bought by WinRK or what is its name. Typical FUD.
So what I take from those figures is that the best in the field are only just better than half again as effective at compression as my old stalwart compressors (gzip and a copy of WinRAR I bought 4 years ago), and less than 10% better if you take speed into account.
Not really an incentive to buy these 'advanced' and considerably more expensive compressors. I'll stick with the free and old versions thanks.
Yes, but if you design a compressor to split files well, it will do that, and if you want it to do something else, you'll have to program that in as well.
If you design it to work on streams, it can do anything with streams that you have utilities for. Including splitting across volumes. For instance, iirc, you can pipe through tar and get the ability to change media. Or, growisofs will, i believe, do the same thing for DVDs, with a bit of command line fu.
Can you be Even More Awesome?!