Delta Compression for Linux Security Patches?
cperciva asks: "For people without fast internet connections, it is often impractical to download large security patches. In order to reduce patch sizes, some operating systems -- starting with FreeBSD over a year ago, and recently followed by Mac OS X and Windows XP SP2 -- have started to use delta compression (also known as binary diffs, which constitutes a portion of my doctoral thesis), and can often reduce patch sizes by over a factor of 50. In light of the obvious benefits, I have to ask: When will Linux vendors follow suit?"
I disagree. I've used smartversion on Windows for a couple of years now for making versioned archives of important files, and I wish Linux had something comparable. It's like having a portable single tar.gz of an entire CVS repository without all the headaches...
SUSE already does this.
RPM in general, however, doesn't nicely support this feature. Either RPM needs to be extended/modified, or a new format needs to be made. While I favor a new format for many reasons other than this, modifying RPM is probably the best solution in order to provide backwards compatibility.
It works beautifully but I can't help but think it is a waste of bandwidth.
http://www.daemonology.net/bsdiff/
bsdiff and bspatch are tools for building and applying patches to binary files. By using suffix sorting (specifically, Larsson and Sadakane's qsufsort) and taking advantage of how executable files change, bsdiff routinely produces binary patches 50-80% smaller than those produced by Xdelta, and 15% smaller than those produced by
http://sourceforge.net/projects/diffball
A general delta compression/differencing suite for any platform that supports autoconf/automake, written in C, with built-in support for reading, writing, and converting between multiple file formats, and an easy framework for dropping in new algorithms.
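For anyone wondering what a binary delta actually contains, here is a minimal sketch in Python. It is not bsdiff's or diffball's algorithm (there is no suffix sorting and no special handling of executables); it leans on the standard library's difflib to find matching runs, and the function names and test data are made up for illustration. The patch is just a list of "copy this range from the old file" and "insert these literal bytes" instructions.

```python
import random
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Encode `new` as a list of operations against `old`.

    ("copy", offset, length) reuses bytes already present in `old`;
    ("insert", data) carries literal bytes that have to be shipped.
    """
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))
        # "delete" means bytes exist only in `old`, so nothing is emitted.
    return ops


def apply_delta(old: bytes, ops: list) -> bytes:
    """Rebuild the new file from the old file plus the operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += old[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops: list) -> int:
    """Count the bytes that must travel over the wire as literals."""
    return sum(len(op[1]) for op in ops if op[0] == "insert")


# Hypothetical "binary": 2000 pseudo-random bytes in which a 10-byte
# region is replaced by a 7-byte fix.
rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(2000))
new = old[:1000] + b"PATCHED" + old[1010:]

ops = make_delta(old, new)
rebuilt = apply_delta(old, ops)
print(len(new), "bytes in new file,", literal_bytes(ops), "literal bytes in delta")
```

Real tools also compress the instruction stream and cope with the way a small source change shifts addresses throughout an executable, which this sketch ignores.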
Especially since the license of bsdiff is not even close to a BSD license (don't let the name "BSD Protection License" fool you). Unless the license is changed to something like BSD, bsdiff is not going to be implemented anywhere except in closed source software. Debian cannot even package this software because it is non-free.
I guess the bottom line is: if you want to have something accepted in open source *and* in proprietary software, you want to license it under BSD. If you cater to one group (closed source in this case), you will lose the other.
You point out TeTeX at 14+ MB, which is as bare as it gets for TeTeX. Then come TeTeX-Doc and TeTeX-Extra, and by now we're up to over 50 MB.
Oh, and here is the real kicker. Debian has updated 2.02 three if not four times this month. So now that's 150+ MB to over 200 MB of fixes? Nope. SP2 looks a bit smaller now, don't it?
And that doesn't even touch the -1,-2,..-20 Debian patches they keep spewing out for project after project.
The only plus for 56k access is they don't cap your downloads on a monthly basis. The downside obviously is bandwidth, but for me it's the time spent waiting for important packages like TeTeX to update.
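To put a number on that wait, here is a back-of-the-envelope sketch. It assumes a nominal 56 kbit/s line rate with no protocol overhead, and takes "MB" to mean 10^6 bytes; real modem throughput is usually lower, so treat the results as a best case. The "factor of 50" figure is the one quoted in the story, not a measurement.

```python
def download_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time: payload bits divided by the line rate."""
    return size_bytes * 8 / link_bits_per_second


MODEM_BPS = 56_000          # assumption: nominal 56k line rate
FULL_SIZE = 50 * 10**6      # the "over 50 MB" of TeTeX packages
DELTA_FACTOR = 50           # the reduction factor quoted in the story

full = download_seconds(FULL_SIZE, MODEM_BPS)
delta = download_seconds(FULL_SIZE / DELTA_FACTOR, MODEM_BPS)

print(f"full packages: {full / 3600:.1f} hours")
print(f"delta patch:   {delta / 60:.1f} minutes")
```

Under those assumptions the full download takes about two hours per update, versus a couple of minutes for a delta of one fiftieth the size.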
Taking an SVN-style approach to patching systems makes sense. Or a CVS-style one, if you prefer a different version control system.
It's already been said, but it is worth repeating, especially when one runs KDE or GNOME. Just build a freakin' base package, then update us with only what changes: binary images that are new or replaced, documentation that is new or revised, and the executables, libraries, and so on that actually change, not the mountains of inert parts that don't.
You can't tell me KDELIBS and KDEBASE need to be completely rebuilt for each .x or -x revision by Debian, and by completely rebuilt I mean all the inert files that don't actually get touched during the build process other than to make sure some wallpaper image still exists. Hell, the wallpaper backdrops, etc. should be add-ons, not part of the distributions. But then again, I suppose everyone thinks we all have T1 access.
K.I.S.S.
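A rough sketch of what "only ship what changed" could look like, assuming each package version can be viewed as a mapping from file path to contents. This is not how dpkg or RPM work today; the paths and the helper names are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def digest(data: bytes) -> str:
    """Content hash used to decide whether a file changed."""
    return hashlib.sha256(data).hexdigest()


def changed_files(
    old_pkg: Dict[str, bytes], new_pkg: Dict[str, bytes]
) -> Tuple[Dict[str, bytes], List[str]]:
    """Return (files to ship, paths to delete) for an old -> new update."""
    old_digests = {path: digest(data) for path, data in old_pkg.items()}
    to_ship = {
        path: data
        for path, data in new_pkg.items()
        if old_digests.get(path) != digest(data)  # new file or new contents
    }
    to_delete = sorted(set(old_pkg) - set(new_pkg))
    return to_ship, to_delete


# Hypothetical package contents: one binary gets a fix, the wallpaper
# is untouched and so never gets downloaded again.
old_pkg = {
    "usr/bin/app": b"binary v1",
    "usr/share/wallpapers/default.png": b"\x89PNG" + b"\x00" * 1000,
}
new_pkg = {
    "usr/bin/app": b"binary v1 with security fix",
    "usr/share/wallpapers/default.png": b"\x89PNG" + b"\x00" * 1000,
}

to_ship, to_delete = changed_files(old_pkg, new_pkg)
print(sorted(to_ship), to_delete)
```

The changed files themselves could then be sent as binary deltas rather than whole, which is where the real savings would come from.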
Enter "deltup", a tool that looks at two tarballs and gives you a diff between the two that you can use to "transform the old tarball to an exact copy of the new tarball"; it even preserves MD5 checksum compatibility. Now some enterprising Gentoo user has created a "dynamic deltup server" that automates the creation of these delta files, so people can reuse the delta files that other people have already requested.
Using this technique in combination with Gentoo Portage, people can reduce their traffic by 75% on average.
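The MD5 part matters because a rebuilt tarball is only useful if it is byte-for-byte identical to the one upstream published. A minimal sketch of that acceptance check follows; the function names are made up, and this is not deltup's own code.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """MD5 digest as a hex string, the form checksums are usually published in."""
    return hashlib.md5(data).hexdigest()


def accept_rebuilt(rebuilt: bytes, published_md5: str) -> bool:
    """Accept a tarball rebuilt from old + delta only if it matches the published checksum."""
    return md5_hex(rebuilt) == published_md5


# Hypothetical stand-in for a real tarball's bytes.
original = b"pretend this is foo-1.1.tar.gz"
published = md5_hex(original)

print(accept_rebuilt(original, published))            # exact copy
print(accept_rebuilt(original + b"\x00", published))  # one extra byte
```

If the check fails, the sensible fallback is to discard the result and fetch the full tarball.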
Have a look at the following URLs for more information:
http://forums.gentoo.org/viewtopic.php?t=215262
m e.html
http://linux01.gwdg.de/~nlissne/deltup-status.ati
Rigolo