PKWare Zips to Growth
Rob Kennedy writes "The Milwaukee Journal Sentinel has a story about PKWare's new business plan. It talks about the investment group that bought the company after founder Phil Katz's death in 2000, and the plan for PKWare to produce what president and COO Timothy H. Kennedy (no relation) calls 'the next generation of zip' by adding various security features."
PGP compresses files during the encryption process.
Hah. He took the established ARC format, which had copyrighted free-as-in-beer public domain routines in C, and rewrote them in x86 asm for speed... and then sold PKARC (Phil Katz ARC) as a commercial product. The original inventors of ARC sued him and won - he even kept the same misspellings in the strings, for fuck's sake. He settled for a lump sum in court, then ended up making a couple of changes to the ARC format and renaming it PKZip.
That, and if you actually look at the ZIP format, you'll notice that it's all routines invented by other people. "Shrink" is dynamic LZW, "Reduce" is RLE with a second-pass probabilistic encoder, and "Implode" is a sliding dictionary with post-compression using Huffman/SF-tree encoding.
Katz was an excellent promoter and had good networking skills. I admire him for that much, and for establishing a de facto format that scaled nicely to 64-bit sizes and arbitrary-length Unicode filenames. HOWEVER, he was hardly a pioneer in compression algorithm design. Give him credit where credit is due.
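For anyone who hasn't seen LZW, here is a rough sketch of the textbook version in Python. It is plain LZW with an unbounded table and integer codes, not ZIP's actual "Shrink" variant (which, as far as I know, adds variable code widths and partial table clearing); the function names are mine.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW: emit a code for the longest phrase already in the table."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate              # keep extending the phrase
        else:
            codes.append(table[current])     # emit the longest known phrase
            table[candidate] = next_code     # learn the new, longer phrase
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the same table on the fly while decoding."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the phrase being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)
```

Repetitive input comes out as fewer codes than input bytes, and the decoder never needs the table transmitted, which is the whole trick.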
For comparison purposes, I downloaded cs94_002.zip and recompressed it with the latest version of WinRAR (3.10 beta 3), set to maximum compression. The result:
cs94_002.rar (Source) 9.4MB (9,407,157 bytes)
WinRAR appears to compress much better than bzip2; however, it isn't free. Interestingly, as good as WinRAR is, even it doesn't come that close to having the best compression ratio out there.
For lots of useful statistics on the relative capabilities of virtually every compression engine in the world, check out Jeff Gilchrist's Archive Comparison Test. A lot of progress is still being made in compression technology, so the state of the art keeps changing.
Well, the other reason for doing encryption after compression is to mitigate dictionary attacks, so the cost of breaking in by brute force includes decompressing as well as decrypting.
Programs that encrypt computer files tend to make the files much larger, gobbling up valuable room on a hard drive or ...
This is bullshit. I do not know of even a single cipher which makes the files larger. Indeed all ciphers commonly used today for file-archiving are block-ciphers which transform a fixed-size (typically 64 bit) cleartext-block into an identically sized ciphertext-block. Examples of such ciphers include DES, IDEA, Blowfish, 3-DES, AES, Twofish and many others.
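To put a number on it: about the only growth you get from a block cipher is padding up to the next block boundary (plus an IV in some modes). A quick sketch, assuming PKCS#7-style padding, which is my assumption and not something the article specifies; the function name is made up:

```python
def padded_length(plaintext_len: int, block_size: int) -> int:
    """Size after PKCS#7-style padding (assumed scheme).

    This scheme always appends between 1 and block_size bytes, so the
    result is the next multiple of block_size strictly above the input size.
    """
    return (plaintext_len // block_size + 1) * block_size


# A 9,407,157-byte file with a 64-bit (8-byte) block cipher.
size = 9_407_157
overhead = padded_length(size, 8) - size
print(f"padding overhead: {overhead} bytes")
```

The overhead never exceeds one block, no matter how large the file is.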
Combining encryption with data compression is a natural, said Stephen Crawford, vice president of marketing.
The vice-president of marketing is not typically a good person to ask about technical issues. In this case he is correct, though: it is a good idea to compress files prior to encryption. This both saves space and makes certain attacks a little bit harder, because the compressed plaintext has more entropy than the plaintext itself.
Unfortunately for him, this idea is so obvious that it's been implemented in typical encryption programs for ages. Both PGP and GPG, for example, compress the plaintext by default prior to encrypting it. This is hardly novel.
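If you want to see why the order matters, here is a small Python sketch. The "cipher" is a toy XOR keystream built from SHA-256 purely so the example runs on the standard library; it is NOT secure and is not what PGP or GPG use. The key and sample text are made up.

```python
import hashlib
import zlib


def toy_keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || offset) blocks.

    Illustration only, NOT secure. Applying it twice with the same key
    returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


plaintext = b"the quick brown fox jumps over the lazy dog\n" * 2000
key = b"hypothetical demo key"

# Compress first, then encrypt: the redundancy is removed before it is hidden.
compress_then_encrypt = toy_keystream_xor(zlib.compress(plaintext, 9), key)

# Encrypt first, then compress: the ciphertext looks random, so zlib finds nothing.
encrypt_then_compress = zlib.compress(toy_keystream_xor(plaintext, key), 9)

print(len(plaintext), len(compress_then_encrypt), len(encrypt_then_compress))
```

Compressing first shrinks the repetitive input dramatically, while compressing the ciphertext gains essentially nothing, which is why the compression step has to come before the encryption step.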