Slashdot Mirror


Subversion as Automatic Software Upgrade Service?

angel'o'sphere asks: "I'm working on a contract where the customer wants a automated, Internet-based check-for-updates, update and install system. So far we've considered a Subversion based solution. The numbers are: a typical upgrade is about 10MB in size. Usually it's about 30 to 50 new files (which have an average size of about 200kB) and 2 database files (which can be anywhere from 500MB to 2GB) that change regularly. Upgrades are released about every 3 months, and this will probably become more frequent as the system matures. The big files are the problem as we estimate about 100-300 changes in every file. The total user base is currently 2000 users, creeping up to probably 5000 over the next year, and might be finally end up at some 30,000 users. Any suggestions from the crowd about setting up a meaningful test environment? How about calculating the estimated throughput of our server farm? Does anyone know of projects that have tried something similar using an RCS or a configuration management system?" "We want to support as many concurrent users as possible (bandwith is not an issue). We use an Apache front end as a load balancer and as many Subversion servers as necessary on the backend. My largest worry, from my calculations, is disk access on the Subversion server. We could not run meaningful tests, because a typical PC kills itself if you try to run more than 4 or 5 parallel Subversion clients doing an upgrade (due to insanely high disk IO, and high seek times)."

4 of 41 comments (clear)

  1. Rsync? by Karora · · Score: 3, Informative

    Wouldn't Rsync be better for what you want? Why do you need to be able to choose different versions to fetch?

    If the files contains parts that are constant along with parts that vary then rsync will in many cases only transfer the partial file. With Subversion that won't apply for binary files, but rsync will still recognise partial matches even on those.

    --

    ...heellpppp! I've been captured by little green penguins!
  2. times two by Lord+Bitman · · Score: 3, Informative

    remember that svn always uses more than double the actual space required to hold the files for a "working copy". For "one-way" updates, svn is _NOT_ the answer.

    --
    -- 'The' Lord and Master Bitman On High, Master Of All
  3. Re:rsync by commanderfoxtrot · · Score: 3, Informative

    Subversion uses binary diffs in a similar way to rsync. The original poster pointed out bandwidth was not an issue- therefore any bandwidth advantages rsync gives (and yes, there are plenty) are meaningless.

    Subversion gives excellent control (tags anyone?) of binary installations. We use it at for things way beyond the usual source code storage.

    I have also found disk IO is the main killer. I would suggest looking in to caching. The subversion client sends straightforward HTTP commands to the server. I have a custom PostgreSQL backend which does some caching- in his place, I would have a Squid set up to cache some basic data fetches- obviously, you need to be careful to not cache old data but that's not hard.

    So yes, Subversion is excellent for this, and with a little thought, the heavy disk IO can be reduced. Cache, cache, cache.

    --
    http://blog.grcm.net/
  4. Re:Some clarifications, especially about rsync by eklitzke · · Score: 3, Informative

    You may be interested in the Unison project. More info can be found here: http://www.cis.upenn.edu/~bcpierce/unison/

    --
    #include ".signature"