Slashdot Mirror


Delta Compression for Linux Security Patches?

cperciva asks: "For people without fast internet connections, it is often impractical to download large security patches. In order to reduce patch sizes, some operating systems -- starting with FreeBSD over a year ago, and recently followed by Mac OS X and Windows XP SP2 -- have started to use delta compression (also known as binary diffs, which constitutes a portion of my doctoral thesis), and can often reduce patch sizes by over a factor of 50. In light of the obvious benefits, I have to ask: When will Linux vendors follow suit?"

7 of 289 comments

  1. Re:Doesn't make as much sense to use for Linux by bluesguy_1 · · Score: 3, Interesting

    I disagree. I've used smartversion on Windows for a couple of years now for making versioned archives of important files, and I wish Linux had something comparable. It's like having a portable single tar.gz of an entire cvs repository without all the headaches...

  2. SUSE by DreadSpoon · · Score: 3, Interesting

    SUSE already does this.

    RPM in general, however, doesn't nicely support this feature. Either RPM needs to be extended/modified, or a new format needs to be made. While I favor a new format for many reasons other than this, modifying RPM is probably the best solution in order to provide backwards compatibility.

  3. Gentoo Portage by WamBamBoozle · · Score: 4, Interesting
    I wonder why Gentoo doesn't do this. Gentoo, as far as I can tell, always distributes a bzip2'ed tar of any particular distribution.

    It works beautifully but I can't help but think it is a waste of bandwidth.

  4. It's already doing it. by Mongo222 · · Score: 3, Interesting



    http://www.daemonology.net/bsdiff/

    bsdiff and bspatch are tools for building and applying patches to binary files. By using suffix sorting (specifically, Larsson and Sadakane's qsufsort) and taking advantage of how executable files change, bsdiff routinely produces binary patches 50-80% smaller than those produced by Xdelta, and 15% smaller than those produced by .RTPatch (a $2750/seat commercial patch tool).

    http://sourceforge.net/projects/diffball

    A general delta compression/differencing suite for any platform that supports autoconf/automake, written in C, with built-in support for reading, writing, and converting between multiple file formats, and an easy framework to drop in new algorithms.
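
    To make the idea of a binary patch concrete, here is a minimal sketch of delta encoding in Python. It is not the bsdiff or diffball algorithm (bsdiff uses suffix sorting, as described above); it swaps in the standard library's difflib.SequenceMatcher to find matching byte runs, and the function names make_delta, apply_delta and literal_bytes are made up for this illustration. The delta records which ranges to copy from the old file and which new bytes to insert.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Describe `new` as copies from `old` plus literal inserted bytes."""
    delta = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # This run already exists in the old file: store offsets only.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # These bytes are new: they must travel inside the patch.
            delta.append(("insert", new[j1:j2]))
        # "delete" needs no entry: the old bytes are simply never copied.
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    """Rebuild the new file from the old file and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, start, end = op
            out += old[start:end]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(delta: list) -> int:
    """Count the bytes of new data the delta has to carry."""
    return sum(len(op[1]) for op in delta if op[0] == "insert")


if __name__ == "__main__":
    old = b"The quick brown fox jumps over the lazy dog. " * 20
    new = old.replace(b"lazy", b"sleepy", 1)
    delta = make_delta(old, new)
    assert apply_delta(old, delta) == new
    print(len(new), "bytes in the new file,", literal_bytes(delta), "literal bytes in the delta")
```

    The patch only has to carry the inserted bytes plus a few offsets, which is why a small change to a large file yields a small patch. Real tools additionally compress the delta and handle shifted addresses inside executables far better than this sketch does.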

  5. License of BSDiff by gnuman99 · · Score: 3, Interesting
    Certainly for your primary commercial auto-updated Linux distributions it does, but for anything else it usually doesn't.

    Especially since the license of bsdiff is not even close to a BSD license (don't let the name of BSD Protection License fool you). Unless the license is changed to something like BSD, BSDiff is not going to be implemented anywhere except in closed source software. Debian cannot even package this software because it is non-free.

    I guess the bottom line is that if you want to have something accepted in open source *and* in proprietary software, you want to license it under BSD. If you cater to only one group (closed source in this case), you will lose the other.

  6. Re:Several reasons, but not all technical by tyrione · · Score: 3, Interesting
    Well congratulations.

    You point out TeTeX at 14+ MB, which is as bare as it gets for TeTeX. Then come TeTeX-Doc and TeTeX-Extra, by which point we're up to over 50 MB.

    Oh, and here is the real kicker. Debian has updated 2.02 three if not four times this month. So now it's 150 MB+ to over 200 MB of fixes? Nope. SP2 looks a bit smaller now, don't it?

    And that doesn't even touch the -1,-2,..-20 Debian patches they keep spewing out for project after project.

    The only plus of 56k access is that they don't cap your downloads on a monthly basis. The downside, obviously, is bandwidth, but for me it's the time spent waiting for important packages like TeTeX to update.

    Having a SVN approach to patching systems makes sense. Or CVS if you prefer a different versioning system approach.

    It's already been said but it is worth repeating, especially when one runs KDE or GNOME. Just build a freakin' base package and update us with the binary images that are new or replaced, the documentation that is new or revised, and the executables, libraries, and so on that change, not the mountains of inert parts that don't.

    You can't tell me KDELIBS and KDEBASE need to be completely rebuilt for each .x revision or -x revision by Debian, and by completely rebuilt I mean all the inert files that don't actually get touched during the build process other than to make sure some wallpaper image still exists. Hell, the wallpaper backdrops, etc. should be add-ons, not part of the distribution. But then again, I suppose everyone thinks we all have T1 access.

    K.I.S.S.
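
    The "only ship what changed" idea in this comment can be sketched as a manifest comparison. This is a hedged illustration, not how Debian's tools or any real package manager work: a package is modelled as a plain dict of path to file contents, and the names manifest and changed_paths, as well as the example paths, are hypothetical.

```python
import hashlib


def manifest(files: dict) -> dict:
    """Map each path to the SHA-256 digest of its contents."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def changed_paths(old_manifest: dict, new_manifest: dict) -> list:
    """Paths that are new, or whose contents differ, in the new package."""
    return sorted(
        path
        for path, digest in new_manifest.items()
        if old_manifest.get(path) != digest
    )


if __name__ == "__main__":
    # Hypothetical package contents: only the library changes between revisions.
    old_pkg = {"lib/libexample.so": b"v1 code", "share/wallpaper.png": b"image data"}
    new_pkg = {"lib/libexample.so": b"v2 code", "share/wallpaper.png": b"image data"}
    print(changed_paths(manifest(old_pkg), manifest(new_pkg)))
```

    Under this model an update only needs to carry the files returned by changed_paths; the unchanged wallpaper never gets downloaded again.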

  7. Gentoo now has "source delta's" reducing traffic by rigolo · · Score: 3, Interesting
    Well, Gentoo is known for the fact that you download the source of every program and then start compiling. These sources are distributed as .tar.gz or .tar.bz2 files and can be very large. A version change (even a change from .0.0.1 to .0.0.2) has its own tarball and is therefore downloaded again completely, even though the real changes between the two can be small.

    Enter "deltup", a tool that looks at two tarballs and gives you a diff between the two that you can use to "transform the old tarball into an exact copy of the new tarball"; it even preserves MD5 checksum compatibility. Now some enterprising Gentoo user has created a "dynamic deltup server" that automates the creation of these delta files, and people can reuse the delta files that other people used.


    Using this technique in combination with Gentoo Portage, people can reduce their traffic by 75% on average.


    Have a look at the following URLs for more information:

    http://forums.gentoo.org/viewtopic.php?t=215262

    http://linux01.gwdg.de/~nlissne/deltup-status.atime.html


    Rigolo
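
    Two points from this comment lend themselves to a small sketch: checking that a tarball rebuilt from a delta matches the MD5 digest published for the full download, and working out how much traffic a delta saves. This is an assumption-laden illustration using Python's hashlib, not deltup's own code or interface; the function names and sizes are hypothetical.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a blob, as commonly published next to a tarball."""
    return hashlib.md5(data).hexdigest()


def verify_rebuilt(rebuilt: bytes, expected_md5: str) -> bool:
    """True if the rebuilt tarball is byte-identical as far as MD5 can tell."""
    return md5_hex(rebuilt) == expected_md5.lower()


def percent_saved(full_size: int, delta_size: int) -> float:
    """Share of the full download avoided by fetching the delta instead."""
    return 100 * (full_size - delta_size) / full_size


if __name__ == "__main__":
    # Hypothetical data standing in for a real source tarball.
    new_tarball = b"contents of the new release tarball"
    published = md5_hex(new_tarball)      # digest the distribution would publish
    rebuilt = bytes(new_tarball)          # stand-in for old tarball + delta
    print(verify_rebuilt(rebuilt, published))
    print(percent_saved(full_size=20_000_000, delta_size=5_000_000))
```

    Because the rebuilt file has to match the original digest exactly, a delta-based download can slot into an existing checksum-verifying workflow without changing the published digests.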