New Binary Diffing Algorithm Announced By Google
bheer writes "Today Google's Open-Source Chromium project announced a new compression technique called Courgette geared towards distributing really small updates. Courgette achieves smaller diffs (about 9x in one example) than standard binary-diffing algorithms like bsdiff by disassembling the code and sending the assembler diffs over the wire. This, the Chromium devs say, will allow them to send smaller, more frequent updates, making users more secure. Since this will be released as open source, it should make distributing updates a lot easier for the open-source community."
An interesting approach - I wonder if this would also work as well on compiled bytecode like .NET or Java uses?
Google has to pay the cost for maintaining servers and handling bandwidth for all the OS updates they push out. The more efficient they are in this process, the more money the save.
The good news is that the same benefits could be applied to Red Hat, Ubuntu, openSUSE, etc. Lower costs helps the profitability of companies trying to make a profit on Linux.
The end users also see benefits in that their packages download quicker. I'd be honestly pretty disappointed in any major distro that doesn't start implementing a binary diff solution around this.
http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
That's potentially very misleading. I can compress any document, down to a single but, if my compression algorithm is sufficiently tailored to that document. For example:
if (compressed_data[0] == 0):
return = get_Magna_Carta_text()
else:
return unzip(compressed_data[1:])
What we need to know is the overall distribution of compression ratios, or at least the average compression ratio, over a large population of documents.
But, if the compilers are similar enough to create the same pseudocode/bytecode/ASM, or smart enough to save the source code, and use it for future comparisons, then wouldn't one patch be just as portable as the original source code?
It's a good theory and you're a smart person, but:
Portability doesn't appear to be Google's primary concern, though. They seem to be keen on the idea of delivering binaries over the wire (real binaries, not bytecode) -- see Google Native Client.
Breakfast served all day!