FEAD Compressing Compressed Files by 50-75%?
An anonymous reader asks: "I just installed Acrobat Reader and found that it was using FEAD, which claims: 'FEAD© Optimizer© significantly reduces the size of application programs on average by 50% (in some cases up to 75%, depending on the specific software), even when they are already compressed with common compression technology like ZIP or CAB.' It seems that they optimize each application individually at their labs. But an average of 50% compression on already compressed binary files seems to be too good to be true. Anyone familiar with how someone may be able to achieve this?"
Other than that? Probably just marketing.
Software should be free as in speech, but if we also get some free beer, all the better.
This is a common hoax. Maybe 2 years ago, another Slashdot editor posted this hoax. So, it's a repeat hoax for Slashdot, too.
The thing they tout as FEAD is basically a load-over-network-on-demand thingy. They haven't actually developed anything that does compression; they're just storing some of the app on a server somewhere to be downloaded on demand. The hype at their site misled you, like it was meant to do.
11*43+456^2
It's decompressing the file that's hard.
You can compress all your files down to a single bit using this patented two step process:
1. Discard all zeros.
2. Use one to represent any length sequence of ones.
This is as reliable a compression scheme as most backups to tape I've ever seen, and you can fit a huge number of files onto a single floppy.
Some people have a way with words, and some people, um, thingy.
I think that it reads as you interpret: if you put some stuff in a .ZIP, it will further compress it. But, on a very close reading, they are only comparing sizes, and not necessarily saying they are compressing the zip file.
From the article: "Netopsystems specialists combine and customize these tools and processes for each individual software product so that optimal size reduction results are achieved."
Note the following from the whitepaper: "Usually software producers compress their data by generating cabinet files or the like...Applying a conventional compression tool like WinZip or WinRAR on such data does not lead to appreciable - often negative - results."
Read strictly, this says what we know: compressing a compressed file generally doesn't work. They aren't saying they compress the compressed file here.
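To see why compressing a compressed file generally doesn't work, here is a minimal sketch using Python's zlib (the DEFLATE method that ZIP commonly uses). The word list and sample text are made up for the illustration and say nothing about what FEAD does.

```python
import random
import zlib

# Made-up vocabulary; any moderately redundant input would do.
WORDS = ["compress", "archive", "installer", "binary", "cabinet", "zip",
         "optimize", "download", "size", "ratio", "file", "data"]

def sizes_after_two_passes(data: bytes) -> tuple:
    """Return (original size, size after one pass, size after two passes)."""
    once = zlib.compress(data, 9)
    twice = zlib.compress(once, 9)  # compress the already compressed bytes
    return len(data), len(once), len(twice)

rng = random.Random(0)  # fixed seed so runs are repeatable
sample = " ".join(rng.choice(WORDS) for _ in range(20_000)).encode("ascii")

original, once, twice = sizes_after_two_passes(sample)
print(f"original={original} once={once} twice={twice}")
```

The first pass removes most of the redundancy, so the second pass has almost nothing left to work with and the output stays about the same size.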
Note that towards the bottom, they are comparing 'lossless compressed' data to what they do.
So, here's my bet: they probably do something like crack open a cab or zip, parse a PDF, for example, for 'magic things' that can be ignored without changing the functionality ('lossy' but nothing of significance lost), or take an HTML file and strip all spaces and newlines between tags. Similar things could be done for other file types: removing quotes and, instead, magic-quoting commas in a CDF. Etc., ad infinitum.
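The HTML case guessed at above could look something like this. It is only a sketch of the "strip whitespace between tags" idea; the function name and the sample page are hypothetical, and nothing here is known to be what Netopsystems actually does.

```python
import re

def strip_between_tags(html: str) -> str:
    """Remove whitespace-only runs that sit between '>' and '<'."""
    return re.sub(r">\s+<", "><", html)

# Hypothetical indented markup, as a WYSIWYG editor might emit it.
page = "<table>\n  <tr>\n    <td>FEAD</td>\n  </tr>\n</table>\n"
small = strip_between_tags(page)
print(len(page), "->", len(small))
```

This is only "nothing of significance lost" for markup where the whitespace really is insignificant. Whitespace between inline elements can affect rendering, and the contents of a pre element are whitespace-sensitive, so a real tool would have to parse the document rather than rely on one regular expression.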
All in all, it's lame, but so is most software.
If you have a gigantic amount (hundreds of gigs to terabytes) of different files to back up or move around, with so many file formats that you can't keep them straight, then it might be worth it. If you are lazy and it's cheap, it might be worth it. Other than that, I fail to see the real utility here - disk is cheap, bandwidth is getting cheaper, and reasonably assuming the bulk of this data is generated (an adequate assumption), you can do very similar things by fiddling around with the output formatting in code.
J
"Lying through one's teeth" comes to mind...
by employing the latest in smoke and mirrors technology. they've invented a new mirror that reflects 110% of all light. neat huh?
That Site (c) is an Eyesore (c). I wonder if these Dipshits (c) realize that all those "(c)" marks make their Site (c) Difficult (c) to Read (c).
Sounds to me like an EXE compressor like UPX - they can compress EXE files better than a ZIP archive can (by taking advantage of known aspects of executable files); so by unzipping, EXE-compressing, and re-zipping, one can reduce the size of an already existing ZIP archive.
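The unzip, recompress, re-zip procedure described above can be sketched with Python's zipfile module. As an assumption for the demo, LZMA stands in for an executable-aware compressor such as UPX, which is not available in the standard library; the archive and member name are hypothetical.

```python
import io
import zipfile

def repack(zip_bytes: bytes, method: int = zipfile.ZIP_LZMA) -> bytes:
    """Unpack every member of a ZIP and store it again using `method`."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as src, \
         zipfile.ZipFile(out, "w", compression=method) as dst:
        for info in src.infolist():
            # src.read() returns the decompressed bytes; dst recompresses them.
            dst.writestr(info.filename, src.read(info))
    return out.getvalue()

# Hypothetical archive built in memory for the demonstration.
original = io.BytesIO()
with zipfile.ZipFile(original, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("app.bin", bytes(range(256)) * 400)

repacked = repack(original.getvalue())
print(len(original.getvalue()), "->", len(repacked))
```

The point is that the gain comes from working on the decompressed contents, not from compressing the archive itself. Whether the repacked archive ends up smaller depends entirely on the data and on how much better the second method models it.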
Omnes arx vestrum sunt adiuncta nobis.
When you use an executable compressor, like PKLITE, on an executable file, it can't compress all the data. This is because EXEs will dynamically load more data, and if that data is compressed, the code can't read it.
I suspect these guys are going in and manually altering the code to perform a decompression. This would certainly produce a benefit.
Here's something for you to try: Take an executable and zip it. If it compresses, then there's probably SOME give in it. And most executables I see are compressible.
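A quick way to run that experiment without leaving a Python prompt is sketched below. It uses zlib at level 9 as a rough stand-in for zipping, and it picks the Python interpreter itself as the test executable purely for convenience.

```python
import os
import sys
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more 'give')."""
    if not data:
        raise ValueError("cannot measure an empty input")
    return len(zlib.compress(data, 9)) / len(data)

# sys.executable is just a convenient executable to test with; it may be a
# small launcher on some systems, so treat the printed number as anecdotal.
if sys.executable and os.path.isfile(sys.executable):
    with open(sys.executable, "rb") as f:
        print(f"{sys.executable}: {compression_ratio(f.read()):.2f}")
```

A ratio well below 1.0 means the file still has redundancy that a general-purpose compressor can find; a ratio near 1.0 suggests it is already packed.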
Using ZIP -9 gives a 20MB file.
So, FEAD offers slightly better compression. (I know there's other crap, including the installer, registry settings, icons, ...)
Still, is it worth the annoyance of the greatly increased install time?
Also, how is FEAD saying they are 50% better than other compressors?
Nothing to see here; Move along.
I think thumperward is trying to tell us that he thinks it is possible to compress already compressed files by another 50%.
I have no idea how FEAD works, but here's how I'd do something similar:
A large portion of shipped executables are blocks of standard code from the compiler. If you're using a Microsoft compiler, you can strip out the standard chunks and pull those chunks in from the binaries that are already in Windows.
If you're using another compiler, you can still probably do the same kind of thing: some intelligent block compression with the included code that'll do better than the "dumb" compression from "zip" or other algorithms.
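One way to picture "don't ship what the target machine already has" is compression with a preset dictionary, which zlib supports through its zdict parameter. This is a sketch under loud assumptions: random bytes stand in for machine code, and `runtime` is a hypothetical block of standard library code that both the packer and the installer can already see.

```python
import random
import zlib

def random_bytes(rng: random.Random, n: int) -> bytes:
    """Deterministic incompressible filler standing in for machine code."""
    return rng.getrandbits(8 * n).to_bytes(n, "big")

rng = random.Random(1)
runtime = random_bytes(rng, 16_000)  # hypothetical code already on the target machine
app = random_bytes(rng, 2_000) + runtime + random_bytes(rng, 2_000)

# Ordinary compression has to ship the runtime bytes inside the app.
plain = zlib.compress(app, 9)

# With a preset dictionary the runtime bytes become cheap back-references.
packer = zlib.compressobj(level=9, zdict=runtime)
with_dict = packer.compress(app) + packer.flush()

# The receiving side needs the same dictionary to restore the app.
unpacker = zlib.decompressobj(zdict=runtime)
restored = unpacker.decompress(with_dict) + unpacker.flush()

print(len(app), len(plain), len(with_dict), restored == app)
```

The saving only exists if the receiving side really does have a byte-identical copy of the shared block, which is the fragile part of the idea in practice.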
You could combine that with compiler space optimization tricks, too: loop RE-rolling, for example. A lot of compilers do tricks to make code faster. Not many do things to explicitly make code smaller. A "shrinking" compiler (or a disassembler/reassembler) that produced small code combined with an "expander" app that made the code bigger but faster could make very small apps.
Forward, retransmit, or republish anything I say here. Just don't misquote me.
> It seems that they optimize each application individually at their labs. But an average of 50% compression on already compressed binary files seems to be too good to be true. Anyone familiar with how someone may be able to achieve this?
Maybe they're just removing the bloat. I've read on comp.risks about a guy who disassembled Windows regedit and found embedded strings and even images, which were not actually used in the application program.
But the link is not at all clear about what they are actually doing. For that matter their basic claim about how much compression they're getting vis-a-vis ordinary methods is very vaguely worded.
Sheesh, evil *and* a jerk. -- Jade
At a previous company I worked at, we had developed a (proprietary) method of compressing x86 binaries, which yielded on average a 5:1 compression ratio, when zlib usually only yields 2:1.
I have a far superior algorithm in both time and space complexity. Start with 1. Then simply transform it to the requisite number of 1s and 0s, a la 1101101001001. Bah to your two-step process. :-)
I could not justify my existence if I were a turkey farmer. Would I terminate myself? Undoubtably, yes.
Most of the standard compressors work at the byte level, and work only with small chunks of the file, for speed reasons. Obviously, releasing the above constraints may yield improvements on the order that they claim, but compression might take hours...
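The "small chunks" constraint can be illustrated with two standard-library codecs. This sketch assumes a contrived input (the same random block repeated 100 KB apart) chosen to exaggerate the effect: DEFLATE can only look back 32 KiB, while Python's lzma module uses a much larger dictionary by default.

```python
import lzma
import random
import zlib

rng = random.Random(2)
block = rng.getrandbits(8 * 100_000).to_bytes(100_000, "big")
data = block * 3  # same 100 KB block three times; repeats are 100 KB apart

small_window = zlib.compress(data, 9)  # DEFLATE looks back at most 32 KiB
large_window = lzma.compress(data)     # default preset uses a multi-megabyte dictionary

print(len(data), len(small_window), len(large_window))
```

zlib never sees the repeats, so its output is about as large as the input, while lzma encodes the second and third copies as references to the first. Real installers are nowhere near this clean, so this shows the direction of the effect and not its size.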
Lossy compression eh? Get LZip for all your lossy file compression needs! It can reduce your file sizes up to 100%!
I couldn't say for sure, but it's possible that they're just using a better coding scheme. ZIP et al. use (as far as I know) variations on the LZ type compression algorithms. These are fast, but definitely not the best entropy removal methods available. Arithmetic coding, OTOH, is very effective and removes more entropy than LZ, LZW, or Huffman, but is slow because it needs to collect statistics on the entire file before compression. I dunno about decompression speed, though. Arithmetic coding is patented, same as LZW, so not just anyone can use it. Just my $0.02...
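The gap between Huffman and arithmetic coding is easiest to see on a very skewed input. The sketch below computes the order-0 Shannon entropy of a made-up byte string and compares it with the one-bit-per-symbol floor that any Huffman code has; it does not implement either coder. (Adaptive models can also gather their statistics as they go, so a full first pass over the file is not strictly required.)

```python
import math
from collections import Counter

def entropy_bits_per_symbol(data: bytes) -> float:
    """Order-0 Shannon entropy: average information per byte, in bits."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

skewed = b"a" * 990 + b"b" * 10  # made-up input: 99% one symbol
h = entropy_bits_per_symbol(skewed)

huffman_bits = len(skewed) * 1           # a Huffman code spends at least 1 bit per symbol
ideal_bits = math.ceil(len(skewed) * h)  # what an ideal order-0 arithmetic coder approaches

print(f"{h:.4f} bits/symbol, Huffman {huffman_bits} bits, ideal about {ideal_bits} bits")
```

On input this lopsided, a coder that can spend fractions of a bit per symbol has roughly a twelvefold advantage over Huffman. On typical files the difference is far smaller, and LZ-style matching exploits repetition that this order-0 measure ignores.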
my sig could kick your sig's arse...
My company uses freelance designers to create HTML templates for the sites we build and operate. The designers we work with typically do their graphics work first, then cut the graphics into pieces and build HTML-ized versions of the layout with WYSIWYG HTML editors such as Dreamweaver or GoLive. Having just recently realized the number of tabs, newlines, and spaces that the average WYSIWYG editor inserts into HTML documents, I've started a crusade to begin optimizing our sites one by one.
So far, I've only "de-bloated" one of our sites. I was able to cut the size of the index.html from 14,719 bytes down to 9,252 bytes just by eliminating unnecessary spaces, tabs, and newlines. This particular template was built using Dreamweaver, and the designer apparently has his Dreamweaver preferences set to indent HTML. In terms of filesize, that translates to a great deal of wasted bytes (spaces used for indentation) in just about every "nested" HTML element there is, especially tables.
Considering we get approximately 10,000 hits to this particular site each day, the savings add up. The initial 14,719 bytes minus the optimized 9,252 bytes means 5,467 fewer bytes per pageload, all of which were spaces, tabs, and newlines - junk as far as any browser is concerned. At 10K pageloads, that means more than 50 megs of saved bandwidth per day for this site alone. And that's just the index.
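The arithmetic behind those figures, using only the numbers quoted in the post (the variable names are just labels for this worked example):

```python
ORIGINAL_BYTES = 14_719      # index.html before stripping whitespace
OPTIMIZED_BYTES = 9_252      # index.html after stripping whitespace
PAGELOADS_PER_DAY = 10_000   # approximate daily hits quoted in the post

saved_per_load = ORIGINAL_BYTES - OPTIMIZED_BYTES
saved_per_day = saved_per_load * PAGELOADS_PER_DAY
saved_fraction = saved_per_load / ORIGINAL_BYTES

print(saved_per_load)                    # 5467
print(saved_per_day)                     # 54670000
print(f"{saved_per_day / 1e6:.2f} MB")   # 54.67 MB
print(f"{saved_fraction:.1%}")           # 37.1%
```

That 37.1% applies to the HTML of this one page only; images and other assets served alongside it are not part of the calculation.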
We run more than 100 sites; by the time I get done stripping extraneous whitespace out of all of them, I seriously expect our bandwidth to be cut in half. If you're running a homepage on Geocities, sure, who cares... But when you're running a dedicated server and doing 100+ gigs a month of transfer, stop and think about how many of those gigs are useless. Spaces, tabs, newlines which are invisible to your visitors' browsers.
I'd be willing to bet that the majority of websites on the internet could reduce their monthly bandwidth consumption by 25% or more if they'd remove unnecessary whitespace from the HTML files they're serving. Don't underestimate the waste that's taking place!
AFAIK upx does pretty much the same thing, for free
http://upx.sourceforge.net
It generally gets 50-75% too. IIRC it prepends a really fast (faster than a HD read) decompressor to the compressed program.
The only thing I can think of to do better is to actually rearrange the binary in a more efficient order / go in at the assembler level and replace any repeated 5 instruction or more sequence with a function call.
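Finding candidates for that "replace repeated sequences with a function call" trick amounts to looking for repeated windows of instructions. A toy sketch, assuming instructions are already available as a list of strings (the mnemonics below are hypothetical, and a real tool would also have to check registers, branches, and calling overhead):

```python
from collections import Counter

def repeated_sequences(instructions: list, length: int = 5) -> dict:
    """Map each window of `length` instructions seen more than once to its count."""
    windows = Counter(
        tuple(instructions[i:i + length])
        for i in range(len(instructions) - length + 1)
    )
    return {seq: n for seq, n in windows.items() if n > 1}

# Hypothetical instruction stream with one five-instruction run used twice.
program = ["push", "mov", "add", "shl", "pop", "jmp",
           "push", "mov", "add", "shl", "pop", "ret"]
print(repeated_sequences(program))
```

Overlapping windows are counted separately here, so a long repeated run shows up as several shorter hits; picking which ones to factor out is the hard part and is not attempted.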
I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment