Slashdot Mirror


Performance Tuning Subversion

BlueVoodoo writes "Subversion is one of the few version control systems that can store binary files using a delta algorithm. In this article, senior developer David Bell explains why Subversion's performance suffers when handling binaries and suggests several ways to work around the problem."

200 comments

  1. Why binaries? by janrinok · · Score: 2, Interesting

    I know it can handle binaries, but I cannot think why I would want to. Can anyone help?

    --
    Have a look at soylentnews.org for a different view
    1. Re:Why binaries? by janrinok · · Score: 0, Redundant

      And yes, I have RTFA. I just cannot think why I would want to do what they suggest...

      --
      Have a look at soylentnews.org for a different view
    2. Re:Why binaries? by autocracy · · Score: 3, Informative

      First answer: Images. Many other possible answers... :)

      --
      SIG: HUP
    3. Re:Why binaries? by eikonoklastes · · Score: 1, Insightful

      Tracking images/graphics while developing a web site?

    4. Re:Why binaries? by Anonymous Coward · · Score: 0

      To keep DLL/SO files you're linking to locally? I don't know, just guessing.

    5. Re:Why binaries? by teknopurge · · Score: 4, Informative

      release management - you can store _compiled_ application bundles, ready-to-go.

    6. Re:Why binaries? by Anonymous Coward · · Score: 5, Informative

      putting a toolchain under CM control, so that you can go back to not only an earlier version of your own code, but the version of the toolchain you used to compile the code at that point in time. Absolutely necessary to be able to recreate the full software environment of a past build, without relying on that version of the toolchain still being publicly available (not to mention including any patches/mods you made to the public toolchain).

    7. Re:Why binaries? by Anonymous Coward · · Score: 0

      Art assets for games. We do this at work (the games aren't too big).

      For some reason they use Visual Sourcesafe at work, eurgh.

      I tried to persuade them to use Perforce, but the boss was having none of it.

    8. Re:Why binaries? by janrinok · · Score: 1

      Thank you. This is an answer that I can relate to. Still not what I need, but I can see why some people might want to do it.

      --
      Have a look at soylentnews.org for a different view
    9. Re:Why binaries? by jfengel · · Score: 5, Insightful

      It's really nice to be able to have your entire product in one place and under version control. Third party DLLs (or .so's or jars), images, your documentation... just about anything that's part of your product.

      That way it's all in one place and easily backed up. If you get a new version of the DLL/jar/so you can just drop it into a new branch for testing. If your customer won't upgrade from version 2.2 to version 3.0, you can recreate the entire product to help fix bugs in the old version rather than just saying, "We've lost it, you've got to upgrade."

      Basically, by putting your entire project under version control, you know that it's all in one place, no matter what version it is you want. Even if the files don't change, you know how to reconstruct a development installation without having to dig around in multiple locations (source in version control, DLLs in one directory on the server, etc.)

      Yeah, so it costs some extra disk to store it. Disk is cheap.

    10. Re:Why binaries? by autocracy · · Score: 4, Insightful

      Oh, I shouldn't feed trolling... but he does have an account... The target audience and main users of Subversion are not "high level network techs." Software developers / coders is where you want to look. That said, I'm disappointed in the article... I was hoping for tweaks rather than "use a tarball." The information / stats provided was interesting, though.

      --
      SIG: HUP
    11. Re:Why binaries? by janrinok · · Score: 1

      I wasn't trying to troll. I had a genuine question. It has now been answered.

      --
      Have a look at soylentnews.org for a different view
    12. Re:Why binaries? by javaxman · · Score: 4, Insightful

      1) you want deployment without the need to build
      2) you have proprietary build tools limited to developer use, or release engineers unable to build for whatever reason ( similar to #1, I know... )
      3) images, of course.
      4) Word, Excel, other proprietary document formats are all binary.
      5) third-party binary installation packages, patches, dynamic libs, tools, etc.

      You're just not trying, or you're thinking of version control as something that only programmers would use, and that they'd only use it to store their text source. There are as many reasons to store binary files in version control as there are reasons to have binary files...

    13. Re:Why binaries? by Anonymous Coward · · Score: 0

      For some reason they use Visual Sourcesafe at work, eurgh.

      I pity the fool that chooses VSS. Not even Microsoft themselves use that horrible excuse for a version control system. It's as if someone wrote it over the weekend and showed it to their boss and he said "let's release it!"

    14. Re:Why binaries? by Anonymous Coward · · Score: 0

      Click the "parent" link, he was replying to a post by CogDissident (951207), not to you.

    15. Re:Why binaries? by janrinok · · Score: 1

      My bad. Shuts mouth and leaves quietly.....

      --
      Have a look at soylentnews.org for a different view
    16. Re:Why binaries? by Aaron+Denney · · Score: 1

      Yeah, so it costs some extra disk to store it. Disk is cheap.

      Disk is cheap, but bandwidth (and latency!) is not. Being able to send deltas over the wire is very nice.
    17. Re:Why binaries? by janrinok · · Score: 1

      I was trying, but as a hobby programmer I used it exactly as you described. I use it to store my scripts, source code and documentation.

      --
      Have a look at soylentnews.org for a different view
    18. Re:Why binaries? by jfengel · · Score: 2, Interesting

      That's certainly true. It's tolerable when I'm on the LAN with the server. When I'm working via VPN from home, I get up and watch some TV when doing a full checkout of my system. (Some of that is the binaries, though much is just the sheer number of files and the latencies caused by the SSH.)

    19. Re:Why binaries? by norton_I · · Score: 1

      I use subversion to track latex documents, which have figures in them. I usually store both the original source file (often a binary) as well as the .eps version of figures (text, but might as well be binary) in svn, since I can't regenerate them from a script.

      I don't understand why the author of the article wants to do what he is, but lots of people have good (or good enough) reasons for wanting to track binary files.

      I always hope I don't have to keep binaries in svn, but since so many people seem to love them (for reasons passing understanding) I often end up with binary files under VC. Not sure why I would want it to track deltas... most binary files I can think of would not be likely to generate large common regions between versions, but I am sure it could be useful for someone.

    20. Re:Why binaries? by poopdeville · · Score: 1

      Serialized objects to use as a cache.

      --
      After all, I am strangely colored.
    21. Re:Why binaries? by Anonymous Coward · · Score: 1, Informative

      For this you should create a software repository that you store your jars / exes / binary files and include their version number in the name or directory. Then back it up.

      Version Control is for when you can actually see a difference in versions.

      If you have jars checked into CVS / SVN you should move to using something like Maven so you can store your internal jars on a web server.

    22. Re:Why binaries? by daeg · · Score: 5, Interesting

      Not just images in the sense of PNGs and JPGs, but original source documents as well (PSD, AI, SVG, etc). We track several large (40MB+) source files and I've seen some slowness but nothing to write home about.

      We give our outside designers access to their own SVN repository. When we contract out a design (for a brochure, for instance), I give them the SVN checkout path for the project, along with a user name and password. They don't get paid until they commit the final version along with a matching PDF proof.

      This solves several issues:

      (a) The tendency for design studios to withhold original artwork. Most of them do this to ensure you have to keep coming back to them like lost puppies needing the next bowl of food. It also eliminates the "I e-mailed it to you already!" argument, removes insecure FTP transfers, and can automatically notify interested parties upon checkin. No checkin? No pay. Period.

      (b) Printers have to check out the file themselves using svn. They have no excuse to print a wrong file, and you can have a complete log to cross-check their work. They said it's printing? Look at the checkout/export log and see if they actually downloaded the artwork and how long ago.

      (c) The lack of accountability via e-mail and phone. We use Trac in the same setup, so all artwork change requests MUST go through Trac. No detailed ticket? No change.

      (d) Keeps all files under one system that is easy to back up.

      You may have a little difficulty finding someone at both the design and print companies that can do this, but a 1 page Word document seems to do the trick just fine.

    23. Re:Why binaries? by IWannaBeAnAC · · Score: 2, Interesting

      What you want is a makefile that will track the dependencies in the latex documents, and generate .eps files from the figures. There are a few around on the web, but I haven't yet seen a 'does everything' version. What program do you use to generate the .eps ?

    24. Re:Why binaries? by javaxman · · Score: 1

      I was trying, but as a hobby programmer I used it exactly as you described. I use it to store my scripts, source code and documentation.

      Exactly... all you need for hobby projects. You don't have to worry about someone else needing your binaries. You don't have lots of images or proprietary-format documents ( some of which can get really large... think of a database as a document... ), and you don't even have to worry about someone being able to build your project w/o this dynamic library or that compiler or this other tool.

      We have to worry about all of that stuff, and want every aspect of it to be in version control so we have a record of who changed what, and when... I'm sure you get it now. Think enterprise. Recent press releases from CollabNet typically include lines like "More than 300 industry leading companies use CollabNet's solutions today, including Reuters, Philips Medical System, Federal Express, Cap Gemini and Barclays Global Investors among others."... you can bet some of those folks have a binary file or two which need to be tracked in version control.

    25. Re:Why binaries? by jbreckman · · Score: 3, Interesting

      We use it for version control and sharing of powerpoint/audio files. It keeps things considerably saner than a shared drive.

      And yes, for a 250mb audio file, it is VERY slow.

    26. Re:Why binaries? by rblancarte · · Score: 4, Insightful

      I was thinking the same - especially since I use Subversion.

      But taking a quick look at the article, I get an idea - storing your binaries at different version levels w/ it. Say I am developing a software package and use SVN for each level of revisions. With major releases I could store the produced binaries with the package to prevent the need to recompile when I am pulling down a version. Basically it would truly version control your binaries as well.

      In some ways the article makes me wish I did that with the project I am currently working on. I might start doing it now.

      -R

      --
      It is human nature to take shortcuts in thinking.
    27. Re:Why binaries? by norton_I · · Score: 1

      Unfortunately, for drawings I usually use Illustrator or Canvas on Windows. I also generate figures in Matlab, which can be automated, but is a major pain to do so. Ideally, I would switch to Inkscape for the drawing, but last time I looked at it (quite some time ago) it was not ready. I hear it is much better now, but I am not going to learn a new program halfway through writing my thesis.

      Thanks

    28. Re:Why binaries? by XO · · Score: 1

      checkout/export log? I have searched for something like that, and have found no such option. Also told on #svn, that svnserve doesn't log accesses. How do you set that up?

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
    29. Re:Why binaries? by Anonymous Coward · · Score: 0

      eh. It was a decent question and on topic. No need to leave :)

    30. Re:Why binaries? by autocracy · · Score: 1

      Yeah, I just went up the tree to your original post (which I did answer) and with the current setup for the mod system, it DOES look like my post was saying you were trolling. Guess it'll be a bug report now... *clicky*

      --
      SIG: HUP
    31. Re:Why binaries? by aled · · Score: 1

      Subversion has hooks where it calls scripts, like pre-commit, post-commit, etc. You can implement what you want in those scripts. Some are contributed on the Subversion site.
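
      As a rough illustration of the hook idea above, here is a minimal sketch of a post-commit hook written in Python. It assumes Subversion invokes the script with the repository path and the new revision number as its first two arguments, and that `svnlook` is on the PATH. The log file location and the helper names are made up for the example.

```python
#!/usr/bin/env python3
"""Sketch of a Subversion post-commit hook that appends each commit to a log file."""
import subprocess
import sys

LOG_FILE = "/var/log/svn-commits.log"  # hypothetical location, not a Subversion default


def parse_changed(svnlook_output):
    """Turn `svnlook changed` output into (status, path) pairs."""
    changes = []
    for line in svnlook_output.splitlines():
        if not line.strip():
            continue
        # Status flags come first; the path is everything after the whitespace.
        status, path = line.split(None, 1)
        changes.append((status, path.strip()))
    return changes


def svnlook(subcommand, repos, rev):
    """Run one svnlook subcommand against a specific revision."""
    result = subprocess.run(
        ["svnlook", subcommand, "-r", str(rev), repos],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def main(argv):
    repos, rev = argv[1], argv[2]
    author = svnlook("author", repos, rev).strip()
    changes = parse_changed(svnlook("changed", repos, rev))
    with open(LOG_FILE, "a") as log:
        for status, path in changes:
            log.write(f"r{rev} {author} {status} {path}\n")


if __name__ == "__main__" and len(sys.argv) >= 3:
    main(sys.argv)
```

      Note that this only covers commits. As far as I know there is no equivalent hook for checkouts, so a checkout log would have to come from somewhere else.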

      --

      "I think this line is mostly filler"
    32. Re:Why binaries? by Bassman59 · · Score: 1

      I know it can handle binaries, but I cannot think why I would want to. Can anyone help?

      I use Subversion for my FPGA sources. When I go to release a version of a design, I include the binary build results (.bit files from the place and route tools, and the .mcs file used to program the config EPROM) in my release tag. This is so checksums and such match exactly, and this is important because the production people put stickers on the EPROMs that display the version, part number and checksum. While I can certainly rebuild from the source, if the toolchain changes then the resulting bit file may change too (even though the result is functionally correct and meets timing). Tagging with the build result makes things simple.
    33. Re:Why binaries? by Make · · Score: 1

      Storing JAR files in SVN is wrong most of the time. Either the JAR file is built from the sources in this same repository, and build results should really not be stored in SVN. Or it is required to build the software - then it should be part of the local computer's software installation, and your ~/build.properties should point to the exact location.

    34. Re:Why binaries? by cibyr · · Score: 1

      It's worth noting that TortoiseSVN (a Windows shell-extension GUI for Subversion) can diff Word files. That's right, pull up your log for a file/directory, right click on a commit and click "show differences" - the doc opens up in Word in reviewing mode, with changes annotated and highlighted in red.

      SVN rocks for a whole bunch of other reasons, but that's the best thing about it that I've discovered by accident.

      --
      It's not exactly rocket surgery.
    35. Re:Why binaries? by TheRaven64 · · Score: 1

      For most figures, that's almost exactly what I do. Plots are generated from data using GNUplot. Diagrams are generated by the Makefile calling an AppleScript to tell OmniOutliner to export the latest version as a PDF. Some images, however, can't be generated in this way. Photographs are one example, as are rendered images from an overnight ray-tracing job. These go into SVN, and can be checked out when I want to look at an old version.

      --
      I am TheRaven on Soylent News
    36. Re:Why binaries? by gpinzone · · Score: 1

      Then again, you could simply zip or tar up the binaries and put them on a fileserver. The filenames will be easily identifiable and unique. Subversion (or any other CVS type system) is the wrong tool for the job.

    37. Re:Why binaries? by daeg · · Score: 1

      You also have Apache access logs. Unfortunately, when you log the HTTP REPORT commands, it logs everything, e.g., a REPORT is checkouts, updates, AND commits all in one. Unfortunately, there isn't a SVN Hook for "pre-checkout" and "post-checkout" yet. However, there is for "post-commit", so when you combine them, the HTTP REPORT items from Apache are the browse/checkout items.

      You can also set up your own authentication mechanism through Apache. We use Django, for instance, which then logs that at least they authenticated.

      While there is no "true" log, you can make at least a semblance of one.
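
      The Apache-log approach described above can be sketched roughly like this. It is a minimal example that assumes the access log uses the common or combined log format. The sample log lines, user names and repository paths in the usage note are invented for illustration.

```python
import re

# Matches the leading fields of an Apache common/combined log line.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z-]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)


def report_requests(lines):
    """Yield (time, user, path) for every successful REPORT request."""
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        if match.group("method") == "REPORT" and match.group("status").startswith("2"):
            yield match.group("time"), match.group("user"), match.group("path")
```

      Feeding it lines such as `10.0.0.5 - printer1 [12/May/2007:09:14:02 +0000] "REPORT /svn/artwork/!svn/vcc/default HTTP/1.1" 200 51234` gives you the time, the authenticated user and the path. That is still only a semblance of a checkout log, since a REPORT does not tell you which operation the client was running.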

    38. Re:Why binaries? by pAnkRat · · Score: 1

      Svnserve is ok for a few developers accessing the repository.

      Once you let external people in to play, using apache+mod_dav_svn+mod_dav makes sense.
      You can have ACLs on arbitrary directory levels in the repository with apache; this is not possible with svnserve.

      Since apache is handling the requests, the logs are right there.

      We have used svn here exclusively since 0.9, for development in java.
      We store everything, from java source files, jars, images, database dumps, or whatever else, in svn, as far as the project needs it.
      I have never encountered anybody here complaining that "binary" files are slower than text files.

      I also think the article is wrong where it says that svn only stores deltas.
      AFAIK for each file, svn stores a base revision, then all the deltas, and a "current" or "HEAD" copy of the file.
      That way, svn uses at least twice the filesize in disk space to store one file.
      Not that diskspace matters anymore....

      One of the design axioms in the svn team was that diskspace is cheap, and bandwidth needs to be conserved.

      Happy hacking.

      --
      we need an "-1 Plain wrong" moderation option!
    39. Re:Why binaries? by Anonymous Coward · · Score: 0

      What tool do you use to keep successive versions of pictures you retouch?

      Maybe your source material isn't binary? But if it were?

    40. Re:Why binaries? by Anonymous Coward · · Score: 0

      It's probably just calling the "Compare and Merge documents" tool that is built-in to Word. It's in the Tools menu.

    41. Re:Why binaries? by Anonymous Coward · · Score: 0

      I use subversion to track latex documents, which have figures in them. I usually store both the original source file (often a binary) as well as the .eps version of figures (text, but might as well be binary) in svn, since I can't regenerate them from a script.
      Use MetaPost!
    42. Re:Why binaries? by WuphonsReach · · Score: 1

      That's certainly true. It's tolerable when I'm on the LAN with the server. When I'm working via VPN from home, I get up and watch some TV when doing a full checkout of my system. (Some of that is the binaries, though much is just the sheer number of files and the latencies caused by the SSH.)

      Our solution is that the main trunk is always kept up-to-date on the workstations. I have a script that runs at 4am each morning that does a "svn up" on that part of the repository tree. When I'm ready to use my laptop a few hours later, I have very few worries that I don't already have the latest. And when I do a quick "svn up" on a particular section that I'm working on, odds are high that I already have the latest. I have yet to see "svn up" do something bad to my local working copy. In cases where there are conflicts, it brings down the conflicting revision as a new file (with the rev# tacked onto the end). Other users run a "svn up" as soon as they log in at the start of the day.
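
      For anyone wanting to copy the idea, a nightly update script along those lines might look something like the sketch below. This is not the actual script, just a minimal version. The working-copy path and the function names are hypothetical, and `--non-interactive` is there so an unattended run never stops to prompt.

```python
import subprocess

# Hypothetical working-copy locations; adjust to your own checkout paths.
WORKING_COPIES = ["/home/me/work/trunk"]


def update_command(path):
    """Build the `svn update` invocation for one working copy."""
    return ["svn", "update", "--non-interactive", path]


def update_all(paths, run=subprocess.run):
    """Run `svn update` on each path and return the paths that failed."""
    failed = []
    for path in paths:
        result = run(update_command(path))
        if result.returncode != 0:
            failed.append(path)
    return failed
```

      Calling `update_all(WORKING_COPIES)` from cron or a scheduled task at 4am reproduces the setup. The `run` parameter exists only so the logic can be exercised without a real working copy.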

      I also don't find that SSH introduces much latency (we're using SSH pub keys, Pageant and TortoiseSVN). It takes a few seconds to set up the initial connection for a TSVN update command (or a "svn up" at the command line), after which it runs at full tilt until finished. On a busy day, I might do around a dozen commits and a small handful of "svn up" commands (since I pull everything down daily).

      Now, as I said before, we're an odd shop. We don't really use SVN to do traditional development work. We're almost more of a service bureau where each person works on half a dozen jobs where they do 99% of the work on each job. The main reason we use SVN (and previously, VSS+SourceOffSite) is to keep in sync with each other (and to have a file history). So that if a project director has a question on a project, I can go look in my working copy (maybe do a quick "svn up") and see what is checked in. Or I can provide assistance to another person by looking at their project files.

      --
      Wolde you bothe eate your cake, and have your cake?
    43. Re:Why binaries? by Dr.+Trevorkian · · Score: 1

      That's probably a side-effect of serving svn via Apache. http://svnbook.red-bean.com/en/1.0/ch06s04.html

    44. Re:Why binaries? by aled · · Score: 1

      Storing JAR files in SVN is wrong most of the time. Either the JAR file is built from the sources in this same repository, and build results should really not be stored in SVN. Or it is required to build the software - then it should be part of the local computer's software installation, and your ~/build.properties should point to the exact location.


      For most simple projects it's easier to just commit the jars within the project. That way one can just checkout and compile. With your way, where is the programmer supposed to get the jars? And what is the problem with keeping them in the SCM?
      --

      "I think this line is mostly filler"
    45. Re:Why binaries? by aled · · Score: 1

      Then again, you could simply zip or tar up the binaries and put them on a fileserver. The filenames will be easily identifiable and unique. Subversion (or any other CVS type system) is the wrong tool for the job.


      Maybe, but then you may want to have a tool to automate adding new versions, then you would want something to automate comparing versions, and if you have limited bandwidth it would be useful to send only deltas, especially with big files. To make it short, you end up with a Subversion clone. Because Subversion is at its core a versioned filesystem.
      --

      "I think this line is mostly filler"
    46. Re:Why binaries? by RoloDMonkey · · Score: 1

      Um, because a lot of us use it to code things other than command line executables. For instance, I do web development, and it is really great to have the development and staging sites in the repository, including images, SWFs, etc. That way everyone on the team can fiddle on their computer all they want, but when they are done they commit, and I know that I am looking at the same thing they are.

      Also, it protects against screw-ups. Imagine that I open the only copy that we have of a high quality photo, crop, resize, and compress it, and then hit "Save" instead of "Save As". Let me tell you, that's a nice time to have options like "Revert" and "Save Revision As".

      --
      Long live the Speaker Bracelet
      Rolo D. Monkey
  2. Re:first by janrinok · · Score: 0

    Not quite - but thanks for your contribution to an intelligent discussion....

    --
    Have a look at soylentnews.org for a different view
  3. SVN will not replace CVS (IMO) by Anonymous Coward · · Score: 1, Interesting

    Subversion fails to follow symbolic links that point to code that other projects share for the sake of a minority that still develops using Windows (which doesn't have real symbolic links).

    CVS http://en.wikipedia.org/wiki/Concurrent_Versions_System has proven itself to be superior and far more intuitive.

    You have code that many projects share, like multi-platform-compatibility-layers? Just use symbolic links and CVS will follow them.

    In SVN you have to create a repository for these shared source files and write config files by hand to make it include these files in your repository.

    I hardly see SVN reach the point of flexibility CVS has. They support Windows (which doesn't have symbolic links) and give up usability.

    Except for this difference SVN and CVS are the same. There are marginal differences in features but these affect no real world use. So if you want a version control system where you don't need to write config files by hand you choose CVS. If you want the latest hype you choose SVN.

    There wasn't really a need for SVN.

    1. Re:SVN will not replace CVS (IMO) by scribblej · · Score: 4, Informative

      You ever try to move a directory structure full of source code from one place to another in CVS -- or even to move or rename a single file...?

      HINT: When you do it the way CVS provides, you will lose all of your revision history.

      SVN does not have this fatal flaw.

    2. Re:SVN will not replace CVS (IMO) by OverlordQ · · Score: 3, Insightful

      Subversion fails to follow symbolic links that point to code that other projects share for the sake of a minority that still develops using Windows (which doesn't have real symbolic links).

      I am an SVN newbie, but that kinda sounds like Externals.

      --
      Your hair look like poop, Bob! - Wanker.
    3. Re:SVN will not replace CVS (IMO) by mountie · · Score: 0, Troll

      Headline:

      SVN Fixes an implementation flaw of CVS, worsens others, completely ignoring the big picture!

    4. Re:SVN will not replace CVS (IMO) by Anonymous Coward · · Score: 1, Informative

      You ever try to move a directory structure full of source code from one place to another in CVS -- or even to move or rename a single file...?

      HINT: When you do it the way CVS provides, you will lose all of your revision history.

      SVN does not have this fatal flaw.


      What the hell are you talking about? You just log into the CVS server and move the directory/file in the repository.

      Having to write config files by hand to route around non-existent symbolic links support on a platform that does support symbolic links is what I call a "fatal flaw".

      If SVN is so great... why is the majority not using it? It's not like it is entirely new.

      I can tell you why. Because developers are still angry with that wet-script-kiddie-dream-called-autoconf and its self-important complaints about M4-here and can't-find-AC_blablabla-there. They don't want to run into the next self-important barrier on their way to actually getting their project done. CVS just WORKS! For many years now. And if you have problems moving files/directories because your project is hosted on SF then that's the consequence of your choice and not CVS's fault.

      But maybe it's more about configuring the projects development environment these days than getting work done.

    5. Re:SVN will not replace CVS (IMO) by Crazy+Taco · · Score: 4, Interesting

      And you can ALSO save space by version controlling ANY type of file because of its binary delta features. My software team often would place .doc files or other sorts of documentation into our projects, and CVS would save full copies of each document to version control them, chewing up space like crazy. If you work on a big software project, where you can run into things like 1000 page word specification files, you do NOT want a version control system that doesn't use binary differencing. This is another reason why SVN WILL replace CVS.
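
      To make the space argument concrete, here is a toy sketch of what binary differencing buys you. It is NOT Subversion's actual delta algorithm or storage format. It uses Python's difflib purely as a stand-in, and the function names are invented for the example.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Describe `new` as copies from `old` plus literal inserted bytes."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))
        # "delete" needs no instruction: those bytes are simply never copied.
    return ops


def apply_delta(old, ops):
    """Rebuild the new version from the old version and a delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops):
    """Count the bytes that must be stored verbatim in the delta."""
    return sum(len(op[1]) for op in ops if op[0] == "insert")
```

      For a big document with a small edit, the literal bytes in the delta come to a small fraction of the file, which is the saving over keeping a full copy per revision. Files that change wholesale between versions (recompressed images, for instance) will not benefit nearly as much.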

      --
      Beware of bugs in the above code; I have only proved it correct, not tried it.
    6. Re:SVN will not replace CVS (IMO) by Anonymous Coward · · Score: 1, Interesting

      You ever try to move a directory structure full of source code from one place to another in CVS -- or even to move or rename a single file...?
      HINT: When you do it the way CVS provides, you will lose all of your revision history.
      SVN does not have this fatal flaw.


      Yeah, that is a problem with CVS. Your revision history is there, you just can't trace it since a move is a delete and recreate. So if in your move/rename commit comment you say where you are moving it from, you can manually trace (though this is a huge pain).

      We have moved all our CVS repositories to SVN at work. As much as I like the revision history problem being gone, I would've pushed harder to stick with CVS (I didn't think SVN was ready at the time, and I still don't). CVS is way more stable, solid, and trouble free, and clients for it are also very stable. SVN has numerous issues that keep popping up, mostly in the clients (the working copy metadata gets corrupted all the time), but some that might even be server-side corruption (didn't quite figure out why, but everyone's working copy got corrupted in the same place).

      Are there any SVN-to-CVS conversion utilities out there for those of us who want to go back to CVS?

    7. Re:SVN will not replace CVS (IMO) by Vellmont · · Score: 2, Insightful


      If SVN is so great... why is the majority not using it? It's not like it is entirely new.

      Momentum for the most part. CVS is good enough 95% of the time, so it takes some reason to change over. I've recently started using svn after using cvs for years. I'm still not as familiar with svn as I am with CVS.

      Personally I don't really like the different branching/tagging behavior in subversion, but I also think I just don't know it as well. Someday I'll have to find some decent documentation on how to use it properly.

      --
      AccountKiller
    8. Re:SVN will not replace CVS (IMO) by Crazy+Taco · · Score: 5, Informative

      For many open source projects, finding good documentation is hard. In the case of Subversion, it couldn't be easier. In fact, the Subversion team has taken documentation to such a level that they should be considered THE model for documentation in the open source community. They have written a book (published in print by O'Reilly, but maintained and posted for free by them on the Internet) that documents their system, and it is very good. My job at the last company I worked for was to write wizards for the Eclipse platform that would automate several of the most common tasks that a Subversion user would try to do, and that book was the only reference I needed. You can find the book on their site here: http://svnbook.red-bean.com/ . They even do nightly builds of the book, so not only is their documentation complete and useful, it is also incredibly thorough and up to date.

      If anyone on here hasn't read it, DO IT, because the first half will teach you why you want Subversion rather than CVS or some other alternative, and how to use it and how to get the most out of it (second half is lower level stuff you may not care about). It even includes best practices. Once you really learn how to use Subversion, you won't want to use anything else. And this is the way to get started.

      --
      Beware of bugs in the above code; I have only proved it correct, not tried it.
    9. Re:SVN will not replace CVS (IMO) by gbjbaanb · · Score: 1

      the biggest flaw with CVS is the fact it's a client->files system, ie your client writes data directly to the repository. If you lose your network connection halfway through writing... uh-oh. SVN fixes this by making it all a client-server model instead. It is better.

      However, SVN does have some disadvantages (and some say these are so bad it's not worth using it). SVN only manages whole directories - you cannot operate on a single file, try checking out a single file and you'll find you cannot. SVN also has a strange tagging/branching system, based on a kind of 'symbolic link' system that is efficient, but seems to be an engineering solution to a problem that needs to be solved by something more intuitive to a human operator. (eg. you cannot 'tag' a set of files, you have to make a branch and name the branch to the tag you want, it is great back at the server, but useless from the client POV).

      So, SVN is better in many respects than CVS, but it is worse in others, and unfortunately it is not the ultimate source-control system I wish it was.

      (oh, and as for the article - good stats, but ultimately useless - what's the point of tarring your files and storing them just so checkins go faster? Now if he'd supplied a patch to the client that tarred all files you were checking in, and to the server to untar them before checkin, now *that* would make an excellent article.)

    10. Re:SVN will not replace CVS (IMO) by KarlKFI · · Score: 1

      I love svn, but I would REALLY love it if it had svn:internals that allowed for linking within the same repository.

      I currently do it with svn:externals for internal revision stopped links, but when you update it reconnects for each external... slow as hell.

    11. Re:SVN will not replace CVS (IMO) by warriorpostman · · Score: 1

      Totally agree with you on this point. It's a point that's not made often enough with regard to using either open source or proprietary software development tools. Documentation is SO important. And the parent is correct: Subversion's technical documentation is unusually lucid.

    12. Re:SVN will not replace CVS (IMO) by Anonymous Coward · · Score: 0

      Subversion doesn't have tags or branches at all. What it has is a copy. The copy is created with Copy On Write (COW) semantics. Want a "tag"? Create a copy in the /tags directory. Want a branch? Copy to /branches. Want to branch from a tag on a branch? No problem.

      This is significantly easier than the way CVS does it. One of the development teams here is still using CVS and has huge, complex Perl scripts to handle merging on their development branches. They need these scripts because CVS makes it complex. On SVN it would be very simple.

      Our developers have had no problems at all using Subversion, whether they have come from using CVS or Visual SourceSafe. There is certainly no confusion about "branches" and "tags".
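
      To make the "a tag is just a copy" point concrete, here is a minimal sketch that only builds the `svn copy` command lines for a tag and for a branch taken from that tag. The repository URL, the path names and the helper `svn_copy_command` are made up for illustration; only `svn copy SRC_URL DST_URL -m MESSAGE` itself is the real command.

```python
import shlex


def svn_copy_command(repo_url, source, dest, message):
    """Build the argv for a server-side `svn copy` inside one repository.

    Tags and branches are both plain copies; only the destination
    directory (/tags vs /branches) differs, and that is a convention.
    """
    base = repo_url.rstrip("/")
    return ["svn", "copy", f"{base}/{source}", f"{base}/{dest}", "-m", message]


# Hypothetical repository URL and names, for illustration only.
REPO = "https://svn.example.com/repos/project"

tag_cmd = svn_copy_command(REPO, "trunk", "tags/release-1.0", "Tag release 1.0")
branch_cmd = svn_copy_command(
    REPO, "tags/release-1.0", "branches/1.0-fixes", "Branch from the 1.0 tag"
)

# Print the commands instead of running them, so nothing touches a server.
print(shlex.join(tag_cmd))
print(shlex.join(branch_cmd))
```

      Nothing here runs svn; paste the printed lines into a shell (with a real URL) if you want to try it.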

    13. Re:SVN will not replace CVS (IMO) by WuphonsReach · · Score: 1

      So, SVN is better in many respects than CVS, but it is worse in others, and unfortunately it is not the ultimate source-control system I wish it was.

      On the upside, it's being actively developed and some of those issues are being addressed (others are central design themes which will probably not change until v2.0 or v3.0, if ever).

      (I came from the VSS world... where the tool was dead and no longer under active development. I much prefer a tool that is still in active development. Even if it has some quirks still.)

      --
      Wolde you bothe eate your cake, and have your cake?
  4. performance not the biggest problem by hpoul · · Score: 3, Interesting

    for me performance is (currently) the least of my problems with subversion..
    more that you lose changes without any warning whatsoever during merging .. http://subversion.tigris.org/servlets/ReadMsg?listName=users&msgNo=65992 .. and no one seems to be too bothered..

    (don't get me wrong, i love subversion .. and i use it for my open source projects.. but currently CVS is way better.. just because of the tools and a few unnecessary annoyances less)

    --
    Find me at http://herbert.poul.at
    1. Re:performance not the biggest problem by Anonymous Coward · · Score: 0

      It has binary and that is a big no-no in the community.

    2. Re:performance not the biggest problem by PatrickThomson · · Score: 1

      Yes, because a bug report that's 9 days old is indicative of a deep flaw in the developer structure. You should have at least said you were the one who filed it in the interests of full disclosure. Anyway, it's safe practice to check in the trunk modifications before you merge.

      --
      I am one of many. My idea is not unique, nor do I expect my voice alone to sway you. I speak in a chorus of opinion.
    3. Re:performance not the biggest problem by eli173 · · Score: 3, Informative

      Anyway, it's safe practice to check in the trunk modifications before you merge.

      I think you missed his point... he'd committed all his changes. The problem is that if you merge a file or directory deletion in, where that file or directory had modifications committed, Subversion won't tell you about the conflict, but will delete the file or directory including the new modifications.

      You wanted to delete it, so who cares, right?

      Subversion represents renames as a copy & delete. So now, you rename a file or directory, and do the same dance as above, and the renamed file or directory does not have changes that were made on trunk under their previous names. So renaming a file can re-introduce a bug you already fixed.

      No big deal, the devs will fix it soon, right? Wrong and wrong again.

      That is the problem.

    4. Re:performance not the biggest problem by hpoul · · Score: 1

      take a look at the issue tracker.. there it was reported as a problem with renaming directories (months ago .. not by myself)..
      so there is an announcement that there might be better renaming support in svn 1.5 ..

      anyway .. i haven't found a similar report for just the case of deleting directories.. (since renaming = copy + delete) .. so i figured i would ask in the forum (not a bug report btw.) if this is at least known .. but anyhow ..

      you want to know other annoyances ? how about .. the need to store plain text passwords if you want to use svnserve ? (afaik this is also announced for 1.5)

      --
      Find me at http://herbert.poul.at
    5. Re:performance not the biggest problem by hpoul · · Score: 1

      exactly .. and.. my biggest complaint is.. that i actually recommended subversion to everyone who asked because of its cool advantage over CVS - the versioning of directories ... how cool is that .. you can move a file and still have all the history of the file.. so .. no problem with refactoring a big project..

      after all .. it's one of the biggest features.. see http://subversion.tigris.org/ "Subversion versions not only file contents and file existence, but also directories, copies, and renames."

      well .. of course it all works great.. until you merge...

      but as i said .. i'll keep using it for my smaller open source projects.. but i can't honestly recommend it to bigger projects for a company ..

      --
      Find me at http://herbert.poul.at
    6. Re:performance not the biggest problem by PatrickThomson · · Score: 1

      Yes, but you've not actually lost any data, you can pull the deleted files out of the repository. So at worst it would reintroduce a bug you would be able to find and fix later - but who merges without checking it worked?

      --
      I am one of many. My idea is not unique, nor do I expect my voice alone to sway you. I speak in a chorus of opinion.
    7. Re:performance not the biggest problem by eli173 · · Score: 1

      Yes, but you've not actually lost any data, you can pull the deleted files out of the repository. So at worst it would reintroduce a bug you would be able to find and fix later - but who merges without checking it worked?

      Reintroducing a bug is a very bad thing. And if you've only worked on projects with 100% test coverage, and automated execution of said tests, you're going to be in for a real rude awakening when you get a job.



      Um... sorry, let me set this flamethrower down here, turn it off, and I'll just back slowly away...

    8. Re:performance not the biggest problem by LionMage · · Score: 2, Informative

      So at worst it would reintroduce a bug you would be able to find and fix later - but who merges without checking it worked?

      What if the merges are done by someone who isn't familiar with all the code changes and the expected associated application behaviors? What if there are dozens or even hundreds of code changes in a branch being merged to trunk? What if your QA work is being done by people who are not developers and who have no involvement in the merge process?

      These are not just hypothetical issues. I work on a team which espouses the agile methodology, and many times we've missed bug fixes in merges because of the way Subversion treats moves (copy + delete instead of truly changing the parent directory of a given file), or because Subversion's merge facility got confused (especially when changes were made both to the branch and trunk versions of a file).

      Recently, I was put in charge of merging a branch to the trunk for my team's project, and discovered that some methods were duplicated because one of our programmers had deleted the original version of a given method, then pasted in a completely different implementation into a different location in the same source file. It was easy enough to catch this with Java classes (since they won't compile correctly if you have two instances of the same method signature in the same class), but JavaScript was a slightly different story...
    9. Re:performance not the biggest problem by pohl · · Score: 1

      But if you're working on a project with 100% test coverage, you can afford to revert, can't you? It's the case where you have 0% test coverage that reverting is most dangerous, and on that end of the spectrum it really is your fault anyway.

      --

      The "cue the foo posts in 3, 2, 1..." posts will commence with no subsequent foo posts in 3, 2, 1...

  5. What about git? by Anonymous Coward · · Score: 1, Interesting

    In short: Use git-svn

    Long version: The small-factor speedup described in the article is blown away by the several orders of magnitude you get by using git. Then there are all the other goodies, like real branches and merges, git-bisect, and visualization with gitk. Subversion is just for people who are forced to use it, or those not exploring all their options these days.

    1. Re:What about git? by koreth · · Score: 2, Interesting

      Hear hear. git-svn makes Subversion tolerable. The only reason I'd ever choose native Subversion over a newer system like git or Mercurial is if I needed some tool that had builtin Subversion integration and didn't support anything else. Absent that criterion, IMO if you choose Subversion it's a sign you don't really understand version control too well.

    2. Re:What about git? by javaxman · · Score: 2, Insightful

      The only reason I'd ever choose native Subversion over a newer system like git or Mercurial is if I needed some tool that had builtin Subversion integration and didn't support anything else. Absent that criterion, IMO if you choose Subversion it's a sign you don't really understand version control too well.

      What if you have a bunch of developers working with some ( unfortunately, let me say that ) Windows-only tools for historical reasons ? Are you really saying that I should have a team of VisualStudio users install cygwin on their systems ?

      git is great for Linux kernel developers, but 'install this massive compatibility layer to use this product' will fail to make you a lot of friends, especially in a Windows-friendly corporate environment. I say that as an avid, daily CygWin user and longtime Windows hater. We could have maybe picked Mercurial, but a year ago when we looked, it didn't even hit our radar as a possibility.

      Subversion has some little issues, but it's getting lots of attention, and the problems aren't bad. I'm a little suspicious that the performance claims of Mercurial might not be measuring apples-to-apples... an 'svn commit' is both an 'hg commit' and 'hg push', if you want to be fair.

    3. Re:What about git? by statusbar · · Score: 1

      Very interesting! Does git-svn work on Win32 and Mac OS X?

      Newer schematic and circuit board layout programs support svn directly (via svn.exe on windows) - Would there be a git-svn.exe to replace the svn.exe with the same command line set?

      --jeffk++

      --
      ipv6 is my vpn
    4. Re:What about git? by Anonymous Coward · · Score: 0

      I think it is true that cygwin is required for using git.

      But, git is definitely NOT just for Linux kernel developers. I've used cvs and svn (neither extensively). After a brief introduction to git (half a day), I moved all of my projects under it. I was up and running very quickly. There were minor pains, I'll admit, but minor not major.

      It has now been nearly 5 months and...

      > Are you really saying that I should have a team of VisualStudio users install cygwin on their systems ?

      I'm not the OP but YES. I would recommend it. Git changes your work flow. Branching, merging, committing, are effortless and git encourages you to do them often.

      A person experienced with cvs surely doesn't find those things difficult, but when they are effortless, there is a difference.

      -brandon

    5. Re:What about git? by XO · · Score: 1

      Part of the reason I started using SVN was because of the ease of use of TortoiseSVN. There's nothing that I can find that could be easier for people who aren't into this kinda stuff.

      The other people I am working with are game level builders, not code geeks. So, they've never used anything like this.

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
    6. Re:What about git? by sam_vilain · · Score: 1

      What if you have a bunch of developers working with some ( unfortunately, let me say that ) Windows-only tools for historical reasons ? Are you really saying that I should have a team of VisualStudio users install cygwin on their systems ?

      Not at all. Please, continue to bury your head in the sand about the evolution of version control. By the time the Java implementation is fully complete, the C# .net implementation works on all platforms and derived git protocols show that git is a version control protocol, not a tool, your favourite tool will probably have adopted it under the hood.

      Or you could just stop making objections for the sake of it and use it. your choice :-)

    7. Re:What about git? by rgravina · · Score: 1

      I use git on Mac OS X, so I assume git-svn will work too.

    8. Re:What about git? by mibus · · Score: 1

      Not at all. Please, continue to bury your head in the sand about the evolution of version control. By the time the Java implementation is fully complete, the C# .net implementation works on all platforms and derived git protocols show that git is a version control protocol, not a tool, your favourite tool will probably have adopted it under the hood.

      Or you could just stop making objections for the sake of it and use it. your choice :-)


      I think you're missing the point - people want working version control, easily usable in Windows, and they want it now. The company I work at uses SVN all over the place for exactly that reason (tortoisesvn!). When we have tortoisegit or equivalent, and it offers good performance and stability, we'll probably switch. (I was using git before we had a shared repository, and loved it. SVN just makes me want to hide every time I try to merge!)
    9. Re:What about git? by demallien2 · · Score: 1

      Wait! You mean that there are actually Windows developers out there that DON'T have cygwin installed?!?! I'm ... speechless. What do you do when you need access to all that Posix-y programming goodness that is out there? A port? Yikes! And scripting - without cygwin, you're obliged to learn the Windows Way(tm), which means you're left out in the cold if you ever find yourself on one of the ever growing unix systems out there.

      Personally, I have never bothered with learning how to manipulate Windows. You learn unix, and then you install Cygwin if you should ever find yourself working on a Windows system.

      Of interest, I have worked for 4 different companies in my programming career. All worked on Windows systems, because - well, until very recently, a Linux desktop was a pain in the butt, and Macs cost too much compared with a stripped down Dell. But every single company used Cygwin as part of the standard developer environment.

    10. Re:What about git? by javaxman · · Score: 1

      Of interest, I have worked for 4 different companies in my programming career. All worked on Windows systems, because - well, until very recently, a Linux desktop was a pain in the butt, and Macs cost too much compared with a stripped down Dell. But every single company used Cygwin as part of the standard developer environment.

      Congratulations on never having had to work in a _real_ Windows shop. I'd never touch a Windows machine without immediately installing CygWin myself, but 'real' windows developers buy into the Microsoft Way and use Visual Studio or some other MS tool and look at you funny if you talk about anything else.

  6. svn+ssh and master mode ssh. by frodo+from+middle+ea · · Score: 4, Informative
    My solution: use svn+ssh and keep an ssh connection to the svn server in Master mode. All svn+ssh activity tunnels through this master connection, so there is no need for an ssh handshake each time or, for that matter, even to open a socket each time.

    Plus, if the master connection is set to compress data (-C), then you get transparent compression.
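
    For anyone who wants to try this, a minimal sketch of the two pieces involved: the command that opens the compressed master connection, and the `SVN_SSH` value that makes svn reuse its control socket. The host name, the socket path and both helper functions are assumptions for illustration; the ssh options (-M, -N, -f, -C, -S) and the `SVN_SSH` environment variable are the real ones.

```python
import shlex


def master_command(host, socket_path):
    """argv that opens a background ssh master connection with compression.

    -M  run as connection master        -N  do not run a remote command
    -f  go to background after auth     -C  compress the connection
    -S  path of the control socket the later connections will reuse
    """
    return ["ssh", "-M", "-N", "-f", "-C", "-S", socket_path, host]


def svn_ssh_value(socket_path):
    """Value for the SVN_SSH environment variable, so that svn+ssh://
    operations go through the existing control socket."""
    return shlex.join(["ssh", "-q", "-S", socket_path])


# Hypothetical host and socket path, for illustration only.
HOST = "svn.example.com"
SOCKET = "/home/me/.ssh/svn-master.sock"

# Print what you would run in a shell; nothing is executed here.
print(shlex.join(master_command(HOST, SOCKET)))
print("export SVN_SSH=" + shlex.quote(svn_ssh_value(SOCKET)))
```

    The same effect can be had with ControlMaster and ControlPath entries in ~/.ssh/config, in which case SVN_SSH does not need to be touched.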

    Now if only I could expand all this to fit 2 pages....Profit!!!

    --
    for the last time people, I am "frodo from middle eaRTH", not "middle eaST".
    1. Re:svn+ssh and master mode ssh. by Abcd1234 · · Score: 1

      Wow, I didn't even know that existed. Thanks!

  7. git by bartman · · Score: 1

    A really great way to optimize your SCM is to upgrade to git.

    --
    -- bartman
  8. Use GForge or GForge/AS or customize them by aisnota · · Score: 1


    Well, you could use a customized version of GForge, or the Advanced Server edition, linked to some content engine or even something more basic like NCftp at a professional level.

    CVS and Subversion both work with GForge according to their web site.

    Customized editions are available as far as I know. They have a roster of high end clientele, Cisco, MIT and others, so they must work pretty smart and well for people.

    --
    http://www.aisnota.com/slashdot/ Welcome to Logic and the Future
  9. why import/export by Anonymous Coward · · Score: 1, Interesting

    why use subversion only as import/export? That's the complaint here right? (the slow import/export speeds?) I thought the point in using revision control is to checkout then do commit/update commands???

    1. Re:why import/export by pearlmagic · · Score: 1

      Amen to that. TFA was really missing some useful information. The only place I can see doing clean export/checkout is on a build machine where they need to guarantee there isn't anything leftover that could compromise the build. I'm one of the SVN admins at our company and this is something that we're going to have to keep an eye on.

    2. Re:why import/export by flink · · Score: 1

      Creating a Subversion working copy (checkout) involves a lot of irrelevant local disk overhead. Same thing for check in. Since import and export don't create a working copy you can measure the attribute this benchmark is looking at more accurately: how fast is svn at reading and writing binary files in its repository.

  10. It may have performance problems, but... by Crazy+Taco · · Score: 5, Interesting

    It is still the wave of the future. I've worked in it extensively, and it is still the best version control system I've ever used. Because of its other strengths, it is continuing to expand its user base and gain popularity. You can tell this because Microsoft is now actively attempting to copy Subversion's concepts and ways of doing things. Ever used Team Foundation Server? It is just like Subversion, only buggier (and without a good way to roll back a changeset... you have to download and install Team Foundation Power Tools to do it). I'm a new employee at my company (which uses Microsoft technology), and yet I've been explaining how the TFS system works to seasoned .Net architecture veterans. The reason I can do this? I worked extensively with Subversion, read the Subversion book a few times (the O'Reilly book maintained by the Subversion team), and worked on a project for my previous company that basically had the goal of making versions of the TFS wizards for Subversion on the Eclipse platform. It only took me about one day of using TFS to be able to predict how it would respond, what its quirks would be, etc, because its technical underpinnings are just like Subversion. So even with performance issues, if even Microsoft is abandoning its years of efforts on Source Safe and jumping all over this, you can know that its strengths still make it worth adopting over the other alternatives. After all, if Microsoft was going to dump source safe, it had its pick of other systems to copy, as well as the option of trying to make something new. What did it pick? Subversion.

    --
    Beware of bugs in the above code; I have only proved it correct, not tried it.
    1. Re:It may have performance problems, but... by GrievousMistake · · Score: 4, Interesting

      Honestly, if you think Subversion is the wave of the future, you haven't been paying much attention. It fixes some fundamental flaws in CVS, which is nice, but elsewhere there's exciting stuff like Monotone, darcs and many others. It seems people aren't looking hard enough for source control options, when they'll go wild over things like SVN, or more recently GIT.

      I suppose one has to be conservative with deployment of this stuff, you don't want to have code locked away in unmaintained software, or erased by immaturity bugs, but it's still an interesting field.

      --
      In a fair world, refrigerators would make electricity.
    2. Re:It may have performance problems, but... by Crazy+Taco · · Score: 1

      I do agree, and I'm open to experimentation, but I've used SourceSafe, Team Foundation Server, ClearCase, CMVC, CVS, and Subversion, and I've found Subversion to be the best by far, as well as very reliable. Is Subversion absolutely the best out there, or the very best system possible? Perhaps not, but the problem is that there are so many systems out there, and so many of them (nearly all) are inferior, that you can't be surprised or blame people for jumping for joy at Subversion and becoming huge fans. I think people (myself included) are getting tired of having to keep searching for a better version control system. CVS was good, but it had some pretty big flaws that Subversion fixed. Subversion made a good version control system great, and I think most people will probably start using that long term just like they used CVS long term until a major reason to switch comes along. Subversion is familiar (if you've used CVS), stable, well documented and fully featured. Are other systems better? Possibly, but Subversion is so good that the marginal utility of switching to yet another new version control system is pretty low, especially given Subversion's edge with users in the area of familiarity.

      --
      Beware of bugs in the above code; I have only proved it correct, not tried it.
    3. Re:It may have performance problems, but... by cching · · Score: 1

      You should really have a look at monotone. Better quality code and features you'd probably only *dream* about. They're a bit slow on getting to that 1.0, but it's a very solid RCS right now. I'm piloting it where I work for a project, hopefully I can convince my team to adopt it. The only shortcoming I'm running into right now is the toolset that we've built/found based on CVS (Bonsai, Codestriker, and some others). I just can't match those yet.

    4. Re:It may have performance problems, but... by Anonymous Coward · · Score: 0

      For me, one of the most useful features of Subversion is its ability to act as a plugin to Apache. This lets me integrate it into my company infrastructure very easily (HTTP/s protocol, LDAP authentication, etc). I see monotone is more of a distributed rcs, how well does it implement these types of features when acting as a server?

    5. Re:It may have performance problems, but... by maxume · · Score: 1

      None of the tools you mention are 'distributed', they are all 'hosted' (to host a distributed repository, you generally just put it somewhere). Svk is supposed to be pretty nice, and it knows how to talk to subversion, so you get the best of both. I use bazaar for a bunch of small projects, it's great, fast enough, backing up is as simple as copying the directory.

      --
      Nerd rage is the funniest rage.
    6. Re:It may have performance problems, but... by cching · · Score: 1

      It doesn't act as a server, it truly is a distributed RCS. Yes, you probably want a central repository where you do integration and builds, but that's a CMS issue, not an RCS issue.

      As for being able to pull over HTTP, that isn't in, but I think it's been discussed. Honestly, though, I don't find the need to pull over HTTP that important, but, of course, your needs vary. I'm just tired of being tied to a central repository in order to do commits. I work disconnected a lot and monotone fits the bill. Not to mention we have distributed developers who complain all the time of the need to connect to a central repository to do simple commits.

    7. Re:It may have performance problems, but... by XO · · Score: 1

      I am pretty frustrated with Subversion, and all I do is manage a few pieces of source code with it. It's always farking my things up, it seems. Well, at least once a month or two, I'm mucking around with the internals of the repository to fix crap that svn did.

      Also, SVN doesn't -have- a way to rollback.

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
    8. Re:It may have performance problems, but... by XO · · Score: 1

      oh, and about 100mb of binaries, also .. but it never messes up the binaries, it just takes forever to update/commit/checkout on those

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
    9. Re:It may have performance problems, but... by GrievousMistake · · Score: 2, Insightful

      Monotone is my current favourite also, but it's pretty different from the CVS/SVN style of work, and not nearly as widespread, which makes it harder to use in a team project. Git borrows a lot from it and gets exposure from being used for Linux kernel VC.

      There are still some reasons for choosing SVN over monotone though, the major one for me is partial checkout, which you learn to appreciate once you've been stuck behind dialup or on a cell phone. (On the other hand, SVN doesn't do complete checkouts.)

      People tread carefully when dealing with their version control. I think both Sourceforge and Gnome only relatively recently went from CVS to SVN. If you're still using CVS for current projects (or God forbid, Visual SourceSafe), it may make sense to get them switched over to SVN, and use monotone for small sandbox projects until you can make a good case for using it in a new, bigger project (especially one where you anticipate a lot of branched work, maybe with branches maintained in parallel).
      It seems simpler to develop and integrate tools with monotone than with CVS, and there's development going on for things like trac support, so I have high hopes for the eventual availability of a large number of tools for working with monotone.

      --
      In a fair world, refrigerators would make electricity.
    10. Re:It may have performance problems, but... by cching · · Score: 1

      There are still some reasons for choosing SVN over monotone though, the major one for me is partial checkout

      The monotone team are planning on addressing this. I'd have agreed with you when I first started looking into monotone, but, since I started using it, it's become less important to me. However, I do know that some people on my team would really find this feature important.

    11. Re:It may have performance problems, but... by Anonymous Coward · · Score: 0

      It is still the wave of the future. I've worked in it extensively, and it is still the best version control system I've ever used.

      Good Lord, what else have you been using?? Sure, SVN is a great replacement for *CVS* but that's about all I can say for it. It doesn't even have real branching, tagging, or merging.

      Have you tried git? Mercurial? Darcs? Hell, if McVoy wasn't such a **** I'd include BitKeeper too.

      Try git, and then imagine it with more consistent UI and better documentation. Now *THAT* would be the best version control system.

      Here's something I just did in git just now:

      Started work on a couple new features. Made several commits. Realized that it would be bigger than I thought, and would touch more modules than I thought. So I quickly checked in my "work in progress". Then I created a branch off a previous point in the history of the current branch, and then transplanted (i.e., created an equivalent set of commits) the current branch (starting from this new branchpoint) as a second branch off mainline. Later on, I'll merge it all back together. I then checked out the new branch, moved the head of the branch back one revision (basically "popping" the work in progress commit off the branch and leaving the changes in the working directory for continued work).

      And all the old commits were still stored in the history, but not pointed to from any active branch. Eventually they will drop out of the logs and get garbage collected, but I can put everything back the way it was if I really want to.

      And the graphical "gitk" browser lets me see all the branches and merges graphically so I don't get too confused.

      Git makes this kind of fuckery SUPER EASY, and it's VERY FAST, and it uses simple files and directories in the .git dir to manage tags and branches and so forth. So you can just use shell scripts to manipulate things if you want, use Unix command pipelines to do awesome things, even use "echo SHA1ID > .git/../appropriate_file" to create branches.

      Subversion just doesn't come close. Its idea of a merge is to do a diff and apply. Yuck. I had a project in CVS with multiple parallel branches and moving merge points and it was easier to maintain in CVS than in SVN. Because, you see, Subversion... doesn't... have... tags!! What a clusterfuck. Eventually I found the "svnmerge" script which made things easier, but still clutters everything with spurious changes because it has no clue about the history of each branch.

      So, uhm, yeah Subversion is okay for simple linear projects but come on, it's just CVS++.

      Try this sometime: check out your subversion project, then go into the working dir and initialize it with git (or one of the other version control systems). Use git to incrementally develop new features, and then rearrange them into clean patches, one concept per patch, etc., and then roll back to the beginning and apply each patch one at a time and check into subversion, then delete the .git dir. I do this a lot when forced to use CVS or SVN. I *hate* it when I have to "go back" and add a fix to the history of a branch out of order. With git, I can collapse all the changes related to one concept into a single changeset, etc., very cool

      Oh yeah, forgot to mention, git interfaces with svn repositories (basically treating the svn repo as remote branch), which makes this stuff even easier.
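
      For the git-on-top-of-svn route, a minimal sketch of the usual round trip, written as a list of command lines rather than something that runs them. The URL, the directory name and the helper `git_svn_round_trip` are made up for illustration, and this is the stock `git svn` workflow, not necessarily the exact steps described above.

```python
import shlex


def git_svn_round_trip(svn_url, workdir):
    """Return the command lines for a basic git-svn cycle.

    clone   : import the Subversion history into a local git repository
    rebase  : fetch new svn revisions and replay local commits on top
    dcommit : send each local git commit to Subversion as one revision
    """
    return [
        ["git", "svn", "clone", svn_url, workdir],
        ["git", "svn", "rebase"],
        ["git", "svn", "dcommit"],
    ]


# Hypothetical repository URL and directory, for illustration only.
steps = git_svn_round_trip("https://svn.example.com/repos/project/trunk", "project")

# Print the commands; the local hacking, branching and patch
# rearranging in plain git happens between the first and second step.
for argv in steps:
    print(shlex.join(argv))
```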

      Try it my man, and soon you'll be singing the praises!

    12. Re:It may have performance problems, but... by etrusco · · Score: 1

      I guess you didn't search very well, or else you would have found cvsnt, which was "a better cvs than cvs" way before svn ever existed.
      Here:
      http://cvsnt.org/wiki or http://www.march-hare.com/cvspro/

    13. Re:It may have performance problems, but... by Crazy+Taco · · Score: 1

      Also, SVN doesn't -have- a way to rollback.
      Yes it does. In fact, that is one of its biggest strengths. You use the merge command for it (and you can also use some other ones in combination with each other). Your trouble with Subversion probably has more to do with a lack of understanding than the working of the tool. The tool itself is solid (and no one would build a version control system without a way to roll back a change). Subversion does have a bit of a learning curve, in large part because it was one of the first (if not the first) version control systems to fix the problems it does, and as a consequence of fixing them it had to depart a bit from how things on CVS and other systems worked in the past. If you are having difficulty, please read the first few chapters of the O'Reilly Subversion book. It is written by the Subversion team, and as I pointed out in another post, is provided free by them on the Internet. It is thorough and extremely up to date (they do nightly builds on the book). You can find it here: http://svnbook.red-bean.com/ . The first few chapters not only teach you how to do all these things you are complaining about correctly, they also lay out best practices for project organization. Happy reading! This will help if you do it.
      --
      Beware of bugs in the above code; I have only proved it correct, not tried it.
    14. Re:It may have performance problems, but... by WuphonsReach · · Score: 1

      That's pretty surprising that you see performance issues. We run a few tens of thousands of files in one of our repositories (a 10-20GB working copy, but only 3GB in the repository). We find performance in SVN to be quite good. Lots and lots of binaries (and SVN is a darned sight better than VSS/SourceOffSite at storing those).

      All told, our total repository space is 20GB spread over about 2 dozen repositories (the 3GB one is the largest and where most work occurs).

      It's working exceedingly well for us. We run svn+ssh connections using TSVN+Pageant. There are a few things that I wish it did better (such as partial working copy checkouts or making the working copies smaller at the expense of more network/server traffic).

      --
      Wolde you bothe eate your cake, and have your cake?
    15. Re:It may have performance problems, but... by WuphonsReach · · Score: 1

      Honestly, if you think Subversion is the wave of the future, you haven't been paying much attention. It fixes some fundamental flaws in CVS, which is nice, but elsewhere there's exciting stuff like Monotone, darcs and many others. It seems people aren't looking hard enough for source control options, when they'll go wild over things like SVN, or more recently GIT.

      It may or may not be the wave of the future, but after looking at version control systems for almost 2 years before switching to SVN last year, it has a good bit of momentum.

      - Needed to work with multiple client OSs (Win32, OS X, Linux, etc)
      - Open-source tools
      - Easy to use (which is why we like TortoiseSVN)
      - Efficient network traffic
      - Efficient storage of binary files
      - Command-line tools in addition to GUI tools

      After almost 9 months of solid use, we're pretty happy with it and are looking forward to v1.5 and v1.6 to fix some things we don't like still.

      Now, we're an oddball shop, because 90% of what we use SVN for is not development work, but for keeping documents & project files synchronized across people working in multiple locations. Once v1.5/v1.6 allow for sparse/partial working copies, we will probably move towards getting rid of our central file server.

      --
      Wolde you bothe eate your cake, and have your cake?
    16. Re:It may have performance problems, but... by XO · · Score: 1

      svn merge does not rollback. svn merge changes your file to equal whatever previous version you give it, and then you re-commit it to make the change like it never happened. If you want to undo the commit that you just did, the only way to do that is to go and edit the repository directly, and hope that no one did an update/checkout in the time it takes you to do that.

      A better way for it to work, saying that you didn't want to erase the last commit, would be to put a command in the current HEAD "revert file.c to revision 802".
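
      For reference, the merge-then-recommit undo being argued about here looks roughly like the sketch below. The function name and the SVN variable are illustrative helpers, not part of Subversion; file.c and 802 are borrowed from the example above.

```shell
# Command to run; set SVN="echo svn" to print the commands instead.
SVN=${SVN:-svn}

# Reverse-merge everything after revision REV out of PATH, then commit
# the result as a new revision. History is kept; nothing is erased.
revert_to_revision() {
    rev=$1
    path=$2
    $SVN update "$path" &&
    $SVN merge -r "HEAD:$rev" "$path" &&
    $SVN commit -m "Revert $path to r$rev" "$path"
}

# Usage (inside a working copy):
#   revert_to_revision 802 file.c
```

      The undo shows up as one more commit on top of HEAD, which is why clients that already updated stay consistent.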

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
    17. Re:It may have performance problems, but... by jrumney · · Score: 1

      The ability to pull over HTTP is an advantage for open source projects, but for most commercial development, it would probably be seen more as a security hole. Sure you can set up HTTPS with client cert authentication, but then you might as well use SSH.

  11. Oh thank god by Richard+McBeef · · Score: 0, Troll

    My version control system is so fucking slow. It pisses me off to no end. I mean I'm all like trying to check stuff in and it takes forever. Thank god someone took the time to speed these bitches up.

  12. Store them differently by Tankko · · Score: 4, Interesting

    I've been using Subversion for 2 years on game-related projects. Most of our assets are binary (Photoshop files, images, 3D models, etc.), plus all the text-based code. I love Subversion. Best thing out there that doesn't cost $800/seat.

    What I don't like about this article is that it implies I should have to restructure my development environment to deal with a flaw in my version control. The binary issue is huge with Subversion, but most of the people working on Subversion don't use binary storage as much as game projects do. Subversion should have an option to store the head as a full file, not a delta, and this problem would be solved. True, it would slow down the commit time, but commits happen a lot less often than updates (at least for us). Also, the re-delta-ing of the head-1 revision could happen on the server in the background, keeping commits fast.

    1. Re:Store them differently by XO · · Score: 2, Interesting

      I probably need to seriously set up a development environment to examine this, but it seems that there are probably some pretty serious program inefficiencies, if throwing a processor upgrade at the problem decreases the time 14x, as the article seemed to indicate.

      It's like when I added 2,457 files to a VLC play list. It took 55 minutes to complete the operation. I immediately downloaded the VLC code, and went looking through it...

      It loops, while(1), through a piece of code that is commented "/* Change this, it is extremely slow */", or some such. The moment I have a C/C++ Linux development environment functioning, I am going to fix that, if it hasn't been already, as well as looking into the SVN problem.

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
    2. Re:Store them differently by mibus · · Score: 1

      I probably need to seriously set up a development environment to examine this, but it seems that there are probably some pretty serious program inefficiencies, if throwing a processor upgrade at the problem decreases the time 14x, as the article seemed to indicate.


      I wish it were so simple. They moved from a dual 500MHz, 500MB RAM machine, shared amongst tasks, to a 3.2GHz 2GB RAM machine solely doing SVN. That's no small upgrade, and isn't at all telling which of the three main variables (CPU, RAM, shared-or-not) actually made the difference. Use a better computer, it's better. Duh.
    3. Re:Store them differently by XO · · Score: 1

      I still can't believe that the -server- is having problems with processing like that, though. Since SVN stores each changeset as a separate file, all it should have to do is send out the changeset. Instead, the server sits there doing -something- for 50% of the time, then spends the other 50% of the time sending it.

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
  13. What's wrong with version control? by shirai · · Score: 4, Interesting

    Okay, I know this is completely off-topic but I'd really like to get some responses or some discussion going on what makes version control suck.

    I mean, is it just me or is revision control software incredibly difficult to use? To put this into context, I've developed software that builds websites with an integrated shopping cart, dozens of business features, email integration, domain name integration, over 100,000 sites built with it (blah blah blah), but I find revision control HARD.

    It feels to me like there is a fundamentally easier way to do revision control. But, I haven't found it yet or know if it exists.

    I guess for people coming from CVS, Subversion is easier. But with subversion, I just found it disgusting (and hard to manage) how it left all these invisible files all over my system and if I copied a directory, for example, there would be two copies linked to the same place in the repository. Also, some actions that I do directly to the files are very difficult to reconcile with the repository.

    Since then, I've switched our development team to Perforce (which I like much better), but we still spend too much time on version control issues. With the number, speed of rollouts and need for easy accessibility to certain types of rollbacks (but not others), we are unusual. In fact, we ended up using a layout that hasn't been documented before but works well for us. That said, I still find version control hard.

    Am I alone? Are there better solutions (open source or paid?) that you've found? I'd like to hear.

    --
    Sunny

    Be my Friend

    1. Re:What's wrong with version control? by Cee · · Score: 3, Insightful

      Yes, version control is more difficult than not using any tool at all, but that goes for most stuff in life. There are certainly areas where usability can be improved.

      Fiddling with stuff you are not supposed to fiddle with is generally a no-no when using source control. I found though that I got used to the Subversion way to do things (learned that the hard way). For example Subversion on the client side does not really handle server side rollbacks of the complete repository since the files are cached and hashed locally. One way to make source control more transparent to the user could be to let the filesystem handle it.

    2. Re:What's wrong with version control? by norton_I · · Score: 2, Interesting

      You are not alone, but I think the problem is intrinsic (or nearly so). VC is one more thing you have to worry about that is not actually doing your work. It is easy as long as you don't want to do anything with VC you couldn't do otherwise. If all you do is linear development of a single branch, it is pretty easy. Memorize a few commands for import, checkout, and checkin and you are fine, but all you really get is a backup system. As soon as you want to branch and merge and so forth, it becomes much more complicated.

      I think the only way to make it work really well is to have an administrator whose job it is to be a VC expert, rather than a programming expert. You need someone with some serious scripting skills and a deep understanding of the structure of the VC filesystem. With the proper scripts in place, you can really streamline the process for your specific project and enforce your coding practices, but maintaining the system is a separate skill from programming. Also, when performing non-standard merges or whatever, you would probably need a coder to work with the admin to make sure you don't do it in a way that will hamstring you later. Of course, most projects can't afford that, and many programmers don't want to leave their code in the hands of some script monkey, or won't believe that someone else can do something as "trivial" as vc better than them :)

    3. Re:What's wrong with version control? by jayp00001 · · Score: 1

      Nope, version control in general stinks. MS Team Foundation Server is an attempt to make it easier because Microsoft controls both client and server aspects. I'd say they were marginally successful in making it easier than a separate version control system. A lot of the problems stem from the fact that version control should be invisible and isn't, and many people have different ideas about version control. You mentioned that perforce allowed you to directly make changes to files and later reconcile them. Generally speaking that's a version control nightmare (as it's expected that you check in and check out copies) and many users do exactly that.

    4. Re:What's wrong with version control? by Anonymous Coward · · Score: 1, Insightful

      You mentioned that perforce allowed you to directly make changes to files and later reconcile them. Generally speaking that's a version control nightmare (as it's expected that you check in and check out copies) and many users do exactly that.

      You sound like someone who's only used to the VSS way of doing things. Lock-Edit-Release. Try this with a parallel development shop where different teams are on different continents, throw in production support and bug fixing, and you'll quickly see where the true nightmare lies. (Especially if you don't do any branching or tagging)

      SVN/CVS users normally do optimistic locking, i.e. Copy-Edit-Merge.

      I personally prefer to have my local copy completely disconnected from source control, allowing me to edit files willy-nilly. (maybe to test some changes or do some debugging)

      Generally speaking, it's only a nightmare if you don't know what you're doing, or don't know how to merge.

    5. Re:What's wrong with version control? by jgrahn · · Score: 2, Insightful

      You are not alone, but I think the problem is intrinsic (or nearly so). VC is one more thing you have to worry about that is not actually doing your work.

      If it isn't about doing your work, then why do you do it?

      Of course it is about doing your job. If you're a programmer, it's analogous to asking your C compiler not to suppress warnings. You would have to find those bugs anyway, and you would do a much worse job without the help.

      In my work, version control (or whatever fancy name ending in "management" you like to put on it) relieves me of enormous burdens. It lets me do separate work in isolation. It lets me plan and replan my work, reschedule so that feature B gets delivered before feature A. It lets me review other people's changes, and it lets others review mine. It lets me track the root cause of a bug, created years ago. It lets me know exactly what I delivered to some poor guy.

      Note though that you need more than a tool. You need to have a common view on how to use it in your environment.

      And you cannot have people who think it's useless non-productive non-work, because they won't care -- and quite soon they will turn it into useless non-productive non-work by taking "a few shortcuts" which negate all the positive effects of version control, making it analogous to wearing an expensive Armani suit and leaving the fly open.

    6. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      "VC is one more thing you have to worry about that is not actually doing your work".

      If that is true for your work, you should not use version control. However, keeping a trail of what they did and why, and clearly tagging releases _is_ actually doing their work for many people, and it probably should be for even more of them.

    7. Re:What's wrong with version control? by greed · · Score: 1

      Rolling back the repository on the server is a very, very bad idea unless you're recovering from a major "OH DAMN!". Much better, in Subversion, is to just copy the old, good one that you want to the latest version, then the clients will know to update.

      CVS can get so badly lost you have to manually hack the entries file if you start making revisions vanish on the server.

    8. Re:What's wrong with version control? by 0xABADC0DA · · Score: 2, Interesting

      But that's the problem with subversion... the things that one might normally do all of a sudden are 'fiddling with stuff you are not supposed to fiddle with' and a big 'no-no'.

      1) You want to make a copy of trunk to send to somebody:

          tar cvf project.tar .

      With svn you have to go through a bunch of magic to do this or you end up giving them an original copy when you may have local changes (you tweaked some config option or whatever), your username, time, svn repo address and structure, etc. If you do svn export it makes a copy of what is in HEAD, not what is in your folder, so there is no way to do this without going back and weeding out this junk.

      2) You want to export something

          # svn export svn:something /tmp
          svn: '/tmp' already exists

      Really, you think?

      3) You make a copy of a file and then decide to rename it (or other cases).

          # svn cp /other/file.c file.c
          # svn mv file.c newname.c
          svn: Use --force to override
          svn: Move will not be attempted unless forced
          # svn --force mv file.c newname.c
          svn: Cannot copy: it is not in repo yet; try committing first

      Svn says you *must* do a bogus commit because you wanted to rename a file, or alternatively you can revert the new file and lose it? wtf? dumb.

      4) You want to do the same thing on lots of files

          # svn mkdir newdir
          # svn cp *.c newdir
          svn: Client error in parsing arguments

      That's right you have to break out your bash/perl script skills to do this. Lame.
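
      The workaround hinted at in item 4 is a one-line loop. A rough sketch, where svn_cp_all and the SVN variable are made-up helper names rather than anything svn ships:

```shell
# Command to run; set SVN="echo svn" to print the commands instead.
SVN=${SVN:-svn}

# Copy each source into the destination directory, one svn cp per file,
# stopping at the first failure.
svn_cp_all() {
    dest=$1
    shift
    for f in "$@"; do
        $SVN cp "$f" "$dest/" || return 1
    done
}

# Usage (inside a working copy):
#   svn mkdir newdir && svn_cp_all newdir *.c
```

      The shell expands *.c before the function sees it, so the loop only ever hands svn one source at a time.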

      There's a *lot* to dislike about svn. It's basically just 'icky' all throughout. The checkouts are huge and ugly, many operations are slow (compared to monotone), it's really annoying to have a private repo that you sync occasionally so you end up with zillions of tiny commits or losing work because you didn't commit enough. And the repo itself is very large -- I converted a 2GB repo from svn to monotone preserving revisions and even with straight add/del instead of renames/moves the monotone database was a small fraction of the size, about 1/6th. Incidentally, the monotone version was much faster in pretty much every way.

      Monotone is technically much better than subversion, except for one problem that you can't checkout only a subset of a repo. Maybe they have fixed that by now and if so it would be crazy to use svn instead of it IMO. I'm sure there are also many others out there better than svn.

    9. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      Is it just me or do computers suck? To put this into context, I've written 3 novels with pens and typewriters, but I find word processors HARD.

    10. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      You guys are just begging for this book... Pragmatic Version Control, and there is both a CVS and SVN version. It takes a common sense, keep it simple type of attitude towards version control. After many years, I finally "get it" with VC after reading just a few chapters in this book.

    11. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      What I don't understand is why SVN (and CVS) rely on having a metadata directory in every directory of your local copy.

      Why not just store that information separately? It would make many filesystem operations much simpler.

    12. Re:What's wrong with version control? by norton_I · · Score: 1

      If it isn't about doing your work, then why do you do it?

      I thought what I meant was clear. I never intended to claim that VC was useless, non-productive, or non-work. In fact, part of my point was that it is work, which many people don't understand. By "not doing your work" I meant simply that your final product is a program, not a tree of revisions. Time spent on VC does not directly result in satisfying customer needs, rather it makes it easier to create reliable software more quickly, and with less risk of losing information you need.

      Of course VC is an incredibly useful tool, essentially required in many applications. I use it all the time, and almost every time I think something is not worth adding to VC, I regret it.
    13. Re:What's wrong with version control? by Anonymous Coward · · Score: 1, Insightful

      *sighs* So learn to use the tools you use properly then.

      >1) You want to make a copy of trunk to send to somebody:
      >
      > tar cvf project.tar .

      tar cvf project.tar --exclude .svn .

      That excludes the subversion metadata. But, if you do that, you are most definitely doing the wrong thing. Never, ever, send things to third parties without checking things in and noting the revision of the stuff you send. Doing otherwise defeats one of the purposes of version control, to keep track of what is happening. If you send somebody a copy of your working tree, three weeks later you have absolutely no idea of what you actually sent him. If whatever you have checked out is broken and you don't want to break the trunk, create a branch, commit on that branch, and do an export from that branch instead. Branches are almost free in Subversion (only some metadata gets copied) and the advantages of knowing what you sent off are immense.
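
      To see the exclude in action without a real checkout, here is a small self-contained sketch. The directory tree and file contents are fake stand-ins for a working copy; only the tar invocation matters.

```shell
# Build a throwaway tree with fake .svn metadata directories in it.
work=$(mktemp -d)
mkdir -p "$work/project/.svn" "$work/project/src/.svn"
echo 'int main(void) { return 0; }' > "$work/project/src/main.c"
echo 'fake metadata' > "$work/project/.svn/entries"
echo 'fake metadata' > "$work/project/src/.svn/entries"

# Archive the tree, skipping every .svn directory at any depth.
tar -cf "$work/project.tar" --exclude=.svn -C "$work/project" .

# List the archive: src/main.c is there, the .svn directories are not.
tar -tf "$work/project.tar"
```

      This keeps local modifications in the tarball, which is exactly why the warning above about not knowing what you sent still applies.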

    14. Re:What's wrong with version control? by adrianmonk · · Score: 1

      But with subversion, I just found it disgusting (and hard to manage) how it left all these invisible files all over my system

      It only puts them in your working copy. Most development practices include the assumption that you wouldn't deploy your working copy simply by copying it directly. There are several models of how to generate something to be deployed. One of the most common ones is to have a script or build tool that operates on the working copy and generates something that can be deployed. That something could be a tarball, or it could be an RPM, or it could be a directory tree that can be used directly, or it could be a source distribution. Another common way to do it is to have a script that actually installs directly from the source tree onto a system. But it's really not very common to simply use the source tree (or a copy of it) directly.

      and if I copied a directory, for example, there would be two copies linked to the same place in the repository

      That would only happen if you copied the files directly instead of using svn copy as the documentation says to do. If you really want to use a GUI to do it, and you're on Windows, get TortoiseSVN and use that. There are probably other similar tools for other systems.

    15. Re:What's wrong with version control? by NuShrike · · Score: 1

      That's what makes CVS great. The total transparency that lets you hack a few lines of Entries and you're done in record time.

      No mucking with database files, no mucking with rollback points, so on.

    16. Re:What's wrong with version control? by totally+bogus+dude · · Score: 1

      Agreed, and I would suggest that the best way would be to use svn export (rather than checkout) to grab a copy from the repository without the housekeeping directories. That is exactly why the function exists, after all.

      While I'm sure shirai has valid complaints, complaining about the dot files as if they're some mysterious magical thing just makes it sound like they've not bothered to read even a very basic introduction to Subversion. That's not entirely unreasonable: they don't want to have to learn to use RCS, they just want it to work; and that's the ultimate goal for all software. But I don't think there's any source control software out there that "just works" without requiring some understanding of how it actually works.

      The only real quirk is that you have to tell the RCS when you're moving/renaming, deleting, or creating files and folders, rather than just doing it and letting it work out what happened. But it's fairly consistent: changing contents you can do without special actions, but changing the file structure requires you to be explicit.

    17. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      *sigh* so learn to understand comments you are replying to.

      What you are trying to ignore so you can score points is that SVN makes the user have to go out of their way so it behaves like one would expect it should. It's not about just 'learning the tools', it's about the tool svn being retarded and annoying for no good reason. For instance, what option to scp excludes the .svn folders for sending to a host-only vmware image? Oh just use rsync instead or make a tar then copy that or use some other work-around because subversion has an annoying file system? That's dumb. Sometimes in cases like that it's okay to just copy the svn files, but not always.

      Branches are almost free in Subversion

      That's a pretty bold and clearly wrong claim. Branches in subversion don't cost much on the server, but on the client they are ridiculously expensive. When you make a branch you take up more than twice the space in every checked out copy of the repo even if nothing at all is changed. For instance we had an svn repo with about 200-300 megs of material but it took up several gigs on developers' machines because of branches. And unless you have a separate repo per project then the recommended structure makes it impractical to not have most branches reside on the local system.
    18. Re:What's wrong with version control? by chthon · · Score: 1

      Version control is part of the software development process.

      If you are building a simple program on your own, then the basic thing to do with it is versioning in a straight line.

      However, if your program architecture becomes more complex and/or more people are working on it, then version control becomes a synchronisation system.

      When you have a more elaborate development process, version control is tied in with change control and tracking.

      So, yes, version control is hard.

      I am a full-time VC administrator for a group of about 80 developers. The basic thing we plan for when we release something is that we should be able to reproduce the build using the tools in the VC system (which here is Continuus/Telelogic, by the way).

      For my personal projects, I particularly like the combination SVN + trac.

    19. Re:What's wrong with version control? by chthon · · Score: 1

      1) You want to make a copy of trunk to send to somebody:

      tar cvf project.tar .

      With svn you have to go through a bunch of magic to do this or you end up giving them an original copy when you may have local changes (you tweaked some config option or whatever), your username, time, svn repo address and structure, etc. If you do svn export it makes a copy of what is in HEAD, not what is in your folder, so there is no way to do this without going back and weeding out this junk.

      In any VC/CM system, you should not export things which have not been committed. If you have local changes, they should have a reason, and they should be committed. Afterwards you can export them properly without all the .svn overhead.

      I think that I see where you are heading with 2), you want the same functionality as cp/mv/... If that is the case, maybe you should file a bug report.

      3) is difficult to reason about. If I did the same in Continuus, I would also be obliged to commit my transactions before doing the rename. In effect, the rename would consist of deleting the old file and adding the new renamed file (history will be preserved). Since both systems work with a database/workarea separation, I presume that they share some of the same design trade-offs.

      4) bit me this week as well. I think this should be considered a bug. Command-line parsing should be consistent across applications, and thus svn should be able to parse multiple arguments and decide that the last one in the row is the destination.

    20. Re:What's wrong with version control? by aled · · Score: 1

      That's what makes CVS great. The total transparency that lets you hack a few lines of Entries and you're done in record time.

      No mucking with database files, no mucking with rollback points, so on.


      No atomic commits, no renames, no http, no delta of binary files, no safe handling of binary files...
      --

      "I think this line is mostly filler"
    21. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      It seems you're just too fucking stupid to be trusted to use your tools correctly. Do you bitch and whine because your C compiler "leaves .o files everywhere" and you have to exclude them if you want to tar up your source code?

      As for 0xABADC0DA's other objections (#2, 3 & 4): they're equally as dumb. You need to commit the change you've just made before you try to perform a totally separate operation? No way, say it ain't so?!

    22. Re:What's wrong with version control? by gstein · · Score: 1

      Holy crap. You really have no idea on how to use Subversion, do you? As the previous poster said... take a little time to learn your tool, rather than blaming the tool. It is *always* a bad idea to check out the root like you're doing. Of *course* it will take several gigs if you also check out all the tags and branches.

      Per project? The book covers that:
      http://svnbook.red-bean.com/nightly/en/svn.branchmerge.maint.html#svn.branchmerge.maint.layout
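
      In day-to-day terms that layout boils down to something like the sketch below. The repository URL, the function names and the SVN variable are all hypothetical; only the svn subcommands themselves (checkout, copy, switch) are real.

```shell
# Command to run; set SVN="echo svn" to print the commands instead.
SVN=${SVN:-svn}
# Hypothetical repository root containing /trunk, /branches and /tags.
REPO=${REPO:-http://svn.example.com/repos/project}

# Check out only trunk, so branches and tags never land on the client.
checkout_trunk() {
    $SVN checkout "$REPO/trunk" "${1:-project}"
}

# Create a branch as a server-side copy; no file data is transferred.
make_branch() {
    $SVN copy "$REPO/trunk" "$REPO/branches/$1" -m "Create branch $1"
}

# Repoint the current working copy at a branch instead of checking it out again.
switch_to_branch() {
    $SVN switch "$REPO/branches/$1"
}
```

      With this pattern a working copy holds one line of development at a time, so the disk cost of a branch on the client is whatever switch has to change.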

    23. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      Yeah I guess blaming the user is much better than blaming the tool to you. Sorry, but that's just pathetic.

      What is the reason for .svn folders? There is none, it's just laziness and poor choices by the subversion developers. Other systems work fine without this holdover from cvs.

      Poster said 'just make a branch retard it's free'. How is it free unless you don't have tags/ checked out and are typing in a long svn: url?? How annoying is that. It's not free, in fact it has a high cost.

      You say 'only ignorant people check out the whole root'. First of all, who said it was the root? That's an assumption that *you* made because you need some excuse to apologize for svn. A single project within a repo with lots of files or large files can easily take up gigabytes because of branches. Second, yes there *are* ways to minimize the huge overhead that subversion adds by jumping through hoops, but it should not be necessary and it isn't necessary with other more capable version control systems. Third, why shouldn't somebody check out the whole root other than because svn explodes if they do? If somebody wants to work that way because it is more convenient for them then this is no problem with most other systems besides svn.

      Another poster said I'm just 'too fucking stupid' to use svn. Gotta love the class displayed in this thread by subversion apologists. Instead of flying off the handle, jumping to conclusions, and disparaging people because you have some sort of emotional connection to your version control system maybe actually try some of the better alternatives that people have mentioned.

    24. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      What is the reason for .svn folders?

      Meta-fucking-data, you blittering retard.

      Poster said 'just make a branch retard it's free'. How is it free unless you don't have tags/ checked out and are typing in a long svn: url?? How annoying is that. It's not free, in fact it has a high cost.

      What the holy fucking crap are you talking about? It's real fucking hard to type:

      svn co http://svn.example.com/tags/my_tag_1_1

      Wow, I think I better go lay down after that!

      single project within a repo with lots of files or large files can easily take up gigabytes because of branches.

      DON'T. CHECKOUT. THE. FUCKING. BRANCHES. YOU. FUCKING. RETARD. Branches are created some place else. The documentation even suggests one: /branches! It's dead fucking simple to have /trunk, /tags and /branches. Unless you're legally brain-dead I guess, then you might just have an excuse.

      Third, why shouldn't somebody check out the whole root other than because svn explodes if they do? If somebody wants to work that way because it is more convenient for them then this is no problem with most other systems besides svn.

      You shouldn't check out the root because as you so brilliantly point out you'll end up checking out all the tags and branches. If someone wants to work that way, they're a fucking retard, but sure they're welcome to do that. However you can't then try and claim it could be "more convenient for them" at the exact same time that you're complaining about doing exactly that.

      Another poster said I'm just 'too fucking stupid' to use svn. Gotta love the class displayed in this thread by subversion apologists. Instead of flying off the handle, jumping to conclusions, and disparaging people because you have some sort of emotional connection to your version control system maybe actually try some of the better alternatives that people have mentioned.

      You're right. I take that back. I should have said you're too fucking stupid to use any revision control system, or touch anything important, ever, under any circumstances, because your failure to understand basic concepts, read a manual or listen to people who actually do know better than you could quite likely cause serious injury or death to innocent bystanders. I hope to God that you're a code monkey in some inconsequential corporate job where your incompetence can not cause any serious or lasting damage. You belong on TheDailyWTF.

    25. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      >What is the reason for .svn folders? There is none, it's just
      >laziness and poor choices by the subversion developers. Other
      >systems work fine without this holdover from cvs.

      There is a very good reason to have the .svn directories to store metadata. The metadata has to be stored somewhere: it can be stored in a directory alongside the files, as CVS and Subversion do; in a directory at the top of the checkout, as Bitkeeper, Quilt or Git do; or on the server, the way Perforce does it. All those methods have advantages and disadvantages.

      The Perforce way is nice because it keeps the metadata away from the working copy. On the other hand, it also means that if you rename your working directory Perforce will be hopelessly lost: whenever you have to keep two views of the same thing in sync (the Perforce client state stored on the server and the files on disk) you risk inconsistencies. Keeping the metadata on the server also means that almost all operations have to talk to the server, making disconnected development a real pain.

      Having the metadata in the working copy is very nice in another way. It makes it harder for the metadata and the working copy to get out of sync since they are located closer to each other. Let's say I'm doing some lengthy development and want to test an alternate way of doing things: with CVS or Subversion I can just make a local copy of my working tree (or a part of my working tree) and do some development in the copy. If it turns out to be good, I can commit from the copy, or from the original working tree I had. Since the metadata resides in the same directory it just works. But it also means that every directory is littered with CVS or .svn directories.

      Having a directory at the top is nice because it keeps it in one place, and doing an export is as easy as checking things out and then doing "rm -rf .git". It has some of the same advantages as keeping metadata in every directory. But it means that you have to keep your working directory together at all times: you can't check out "foo" and then expect to be able to make a local copy of just "foo/bar" and be able to check in things from that directory. On the other hand neither Git nor Bitkeeper (I believe) has support for checking out partial trees anyway.
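
      The difference between the per-directory layout and the single top-level layout can be sketched in a few lines. This is only an illustration of the lookup logic, not how any of these tools is actually implemented; the function names are made up for this example, and the marker names are just the directory names mentioned above.

```python
import os


def find_per_directory(path, marker=".svn"):
    """Per-directory layout (CVS, Subversion as described above).

    The metadata sits right next to the files, so a copied subdirectory
    still carries everything it needs and no searching is required.
    """
    candidate = os.path.join(path, marker)
    return candidate if os.path.isdir(candidate) else None


def find_top_level(path, marker=".git"):
    """Single top-level layout.

    Walk up towards the filesystem root until a directory containing
    the marker is found. A subdirectory copied out of the tree loses
    its link to the metadata because the walk never reaches it.
    """
    current = os.path.abspath(path)
    while True:
        candidate = os.path.join(current, marker)
        if os.path.isdir(candidate):
            return candidate
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent
```

      With the per-directory lookup, a copy of "foo/bar" made anywhere on disk still resolves its own metadata; with the top-level lookup, the same copy only resolves if it stays underneath the original checkout root.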

      In my opinion the Subversion way is superior to the Perforce way. I use Perforce at work every day and am amazed at how clumsy it is to use compared to Subversion. Yes, Perforce has superior branching and merging support, but I do that once a week, I do basic checkins and checkouts many times a day, so which is most important for me? And Subversion has made the choice to waste a lot more disk space on keeping an extra copy of the repository files in the .svn directory, but that gives me quite nice disconnected support, so for me that tradeoff is definitely a good one.

      But to mindlessly say "svn sucks because it has all those .svn directories", isn't that a bit stupid? It's a tradeoff as everything else.

    26. Re:What's wrong with version control? by NuShrike · · Score: 1

      Atomic commits would be nice, renames easily done on the repo with recoverable side-effects, why http? otherwise cvsweb, no binary delta is GOOD because binary-patches are SLOW and unnecessary red-herring (CVS has obliterate; what does SVN have?), handled gigs of mission-critical binaries safely so far.

    27. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      > What is the reason for .svn folders? Meta-fucking-data, you blithering retard.

      No, that's the reason for a .svn folder, not for .svn folders. Monotone for example makes a single MT folder in the checkout root. This is the smart way to do it. There's no reason for thousands of .svn folders -- it's just stupid, and it actually causes problems.

      > DON'T. CHECKOUT. THE. FUCKING. BRANCHES. YOU. FUCKING. RETARD.

      You either check out the branches or you use the URL addresses a lot. You obviously prefer to use URLs and use svn commands to list repo folders, branches, etc rather than tab completion and ls. Not everybody would like to work that way, and what you are saying is that subversion is a really crappy system for them -- far worse than even CVS for instance which *can* work that way with no problems.

      And if you aren't checking out branches anyway then why especially should they be just like making a copy? In some systems a branch is just a file with the changeset or hashes of what branched. Those systems are technically superior for branching.

      > You shouldn't check out the root because as you so brilliantly point out you'll end up checking out all the tags and branches.

      Exactly the point. If you don't want to check out specific small sections of a repo (individual projects, a single branch at a time) then what you are saying is that subversion is a really, really bad version control system to use. In this respect the majority of version control systems are far better than subversion.

      Seriously you must be extremely mentally unbalanced to have such an irrational and emotional connection to a version control system. Seek help.
    28. Re:What's wrong with version control? by WuphonsReach · · Score: 1

      But with subversion, I just found it disgusting (and hard to manage) how it left all these invisible files all over my system and if I copied a directory, for example, there would be two copies linked to the same place in the repository. Also, some actions that I do directly to the files are very difficult to reconcile with the repository.

      That's actually a very strong advantage of SVN. Working copies do not have to map 1:1 with repositories. But it was a big change compared to how Visual SourceSafe works. In fact, svn:externals makes heavy use of this.

      Now, for us, we prefer to keep things very similar across our workstations. For instance, we have a "company-jobs" repository that we use to store our projects. That repository is always checked out to C:\Company\Jobs and we pull down the entire repository (we even have a "svn up" batch file that runs when you login to make sure you're in sync). Below that point, our structure is /svn/company-jobs/prefix/projectnumber/ - where "prefix" is the 3 character company code and "projectnumber" is the billing code and project name.

      We also have a standard set of folders that we create in each project (one of these days, I'll script this). Such as "docs/", "data/", "media/", etc.

      Now, I have a love/hate relationship with the .svn/_svn folders being stored in the working copy tree. I like it because I can easily backup a working copy, or move it to another folder, and not have to remember what the SVN checkout path was. It's also fairly resilient. In SourceOffSite, we had a bad issue where SOS would trash its record of what you had checked out, losing status. In SVN, when part of the working copy gets corrupted, we delete those folders and re-do a SVN update (in cases where "svn cleanup" fails).

      --
      Wolde you bothe eate your cake, and have your cake?
    29. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      VC is one more thing you have to worry about that is not actually doing your work.

      Any programmer who is unwilling to learn how to use their VC tool - is a fool.

      Using VC is part and parcel of programming and development. Not using VC is like not knowing how to use the inline debugger, or the IDE, or what the basic compile switches are.

      Now, branch & merge are inherently complex. Assuming that you can just muddle your way through it is a big mistake. Akin to trying to write an entire module without doing *any* pre-planning or design.

    30. Re:What's wrong with version control? by WuphonsReach · · Score: 1

      The idiot put 35 megs of logs into the repository.

      SVN actually works pretty well at storing log files (it's very efficient at it - both storing and sending the changes), especially for distribution and secure storage of said log files. Because SVN doesn't support purge, it makes a good WORM-style solution for logs. And svn+ssh, with the commands that a particular SSH key can run restricted, makes it fairly secure from tampering.
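
      For anyone wondering what "restricting the commands that can be run with a particular SSH key" looks like, it is done with a forced command in the server account's authorized_keys file. The helper below is a hypothetical sketch that just builds such a line; the user name and key in any call are placeholders, and it assumes svnserve is on the server's PATH. The option names themselves are the standard OpenSSH ones and svnserve's own -t and --tunnel-user switches.

```python
def restricted_key_entry(username, public_key, svnserve="svnserve"):
    """Build one authorized_keys line that forces svnserve tunnel mode.

    Whatever command the client asks for, sshd runs the forced command
    instead, so this key can only ever talk to the repository.
    """
    if not username or any(ch in username for ch in ' "\n'):
        raise ValueError("username must be a single word without quotes")
    options = [
        'command="{} -t --tunnel-user={}"'.format(svnserve, username),
        "no-port-forwarding",
        "no-agent-forwarding",
        "no-X11-forwarding",
        "no-pty",
    ]
    # The options are comma-separated and precede the key itself.
    return ",".join(options) + " " + public_key.strip()
```

      The resulting line goes into ~/.ssh/authorized_keys of the account that owns the repository, one line per client key.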

      Yeah, it's probably overkill. But it's a legitimate use of the tool for cases where you want to transport log files quickly and securely to an offsite location. Our servers push the log files once an hour.

      --
      Wolde you bothe eate your cake, and have your cake?
    31. Re:What's wrong with version control? by aled · · Score: 1

      Atomic commits would be nice, renames easily done on the repo with recoverable side-effects, why http? otherwise cvsweb, no binary delta is GOOD because binary-patches are SLOW and unnecessary red-herring (CVS has obliterate; what does SVN have?), handled gigs of mission-critical binaries safely so far.


      CVS doesn't handle renames: you have to do it manually with add and delete, and there is no link from the new file to the old. SVN isn't perfect (rename = copy + delete), but at least it supports copies that preserve history, and true rename support is planned.
      HTTP/HTTPS is good for flexibility if you need access from the internet, your security infrastructure uses client certificates or LDAP authentication, or you need to pass through a proxy. It's not for everyone, but it is good to support different use cases.
      Some people seem to disagree over binary deltas.
      AFAIK there is no obliterate command in CVS, nor in SVN. You must carry out a manual procedure (dump/filter/restore in SVN's case).
      Many CVS projects are migrating to SVN (GCC, all of Apache, GNOME, KDE, etc). I don't hear of any of them going back.
      In fact CVS is very inefficient at storing binary files: it stores the full version in the ,v file. SVN stores deltas, so it can reduce the storage size.
      But if you are happy with CVS, more power to you. It served me well, but it looks older each day.

      --

      "I think this line is mostly filler"
    32. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      > Yes, Perforce has superior branching and merging support, but I do that once a week, I do basic checkins and checkouts many times a day, so which is most important for me?

      And if you like that, how about being able to commit changes at any time to your own local repository without having to push out your state to everybody else? You can commit things to your local repo with monotone as much as you want -- and if the code is bad it still only breaks your compile. You could set it up to automatically check in your changes every 15 minutes if you wanted.

      > And Subversion has made the choice to waste a lot more disk space on keeping an extra copy of the repository files in the .svn directory, but that gives me quite nice disconnected support, so for me that tradeoff is definitely a good one.

      Not really, when you consider that for about the same amount of space you could have all the revisions of what you have checked out instead of just HEAD. And you can do commits, reverts, diffs, etc across versions.

      > But to mindlessly say "svn sucks because it has all those .svn directories", isn't that a bit stupid? It's a tradeoff as everything else.

      To mindlessly say "svn sucks because it has all those .svn directories" would be a bit stupid. It would also be what we call a straw-man. The .svn folders are just one of many reasons why subversion sucks, and as you point out there are many real downsides to it and few positive points that defend it. If you look back at the thread, I mention it as a relatively minor concern -- which some subversion apologists went crazy about (I guess it is a touchy subject for them).

      I do find it interesting that in your long post about .svn folders you make only one actual point in favor of lots of .svn folders, that one can copy a folder locally and make changes and then if it works then check that back in. But can you actually check it back in? Only if it is up-to-date with current revision, so either you could just commit the changes in the main folder and copy the new files back in or you are going to be checking something in that breaks your working main copy. If your two separate folders are so different that you cannot reconcile them then you need to make a branch, in which case you can just branch HEAD and copy the other set of files into it. So being able to just clone a small part of the repo is only a very small actual benefit, and then only if informally using version control (which the other posters on this thread would lambaste you for, if they had any sort of real convictions instead of just being fanboi's).
    33. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      You either check out the branches or you use the URL addresses a lot. You obviously prefer to use URLs and use svn commands to list repo folders, branches, etc rather than tab completion and ls. Not everybody would like to work that way, and what you are saying is that subversion is a really crappy system for them -- far worse than even CVS for instance which *can* work that way with no problems.

      You make so many idiotic points I can't be bothered, but this is the most stupid. So you're saying that other VC systems magically locate individual directories within the repository for you? I'm guessing this must be done via some form of mind-computer interface, as you apparently don't even need to tell it the location of the directory you want!

      What does "you either check out the branches or use the URL addresses a lot" even mean? It makes about as much sense as "You either check out the branches or use the -r switch a lot." for CVS. How does CVS or Perforce or Clearcase allow you to "use tab completion and ls"? It's nonsense. Just how often are you checking out branches, anyway?

      I just can't make sense of your argument. How else should it do it? GUI tools are available that can browse a remote repository. Hell, if you're using WebDAV you can use a web browser to do it! Other VCS use similar methods. What's so special about the way SVN does it?

      You're just confirming, over and over again, that you don't understand any form of VCS, let alone Subversion, and it's quite apparent that you don't understand it because you're an idiot.

    34. Re:What's wrong with version control? by Anonymous Coward · · Score: 0

      > I can't be bothered ... I just can't make sense of your argument ... You're just confirming, over and over again, that you don't understand

      You don't understand an argument and so you feel like that's somebody else's fault. Sounds like short-man's syndrome to me.
    35. Re:What's wrong with version control? by NuShrike · · Score: 1

      CVS rename:
      ssh repository
      cd /cvs
      mv foo,v bar,v
      (done)

      Can also create symlinks to share code between different projects, and preserves all of history. Not transparent, but it's available and it works.

      CVS obliterate dates back to RCS and is done with:
      cvs admin -o 1.1 (for specific version) , or :1.10 (for everything up to 1.10 inclusive), 1.2:1.10 (for deleting 1.2 to 1.10). You'll need a fancy script to handle branches.
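
      To make the range forms above concrete, here is a small model of which trunk revisions each single-colon spec selects. It is a sketch of the selection rule only, under the assumption that all revisions are plain trunk numbers like 1.N; it does not call CVS, does not model the '::' forms, and ignores branches, locks and tags. The function names are invented for the example.

```python
def rev_key(rev):
    """Turn '1.10' into (1, 10) so revisions compare numerically.

    Plain string comparison would sort '1.10' before '1.2'.
    """
    return tuple(int(part) for part in rev.split("."))


def outdated(revisions, spec):
    """Return the revisions that a 'cvs admin -o SPEC' range would select.

    Supported forms: 'REV', ':REV', 'REV:' and 'REV1:REV2', with both
    ends of a range inclusive.
    """
    if "::" in spec:
        raise ValueError("'::' forms are not modelled in this sketch")
    if ":" not in spec:
        return [r for r in revisions if r == spec]
    low, high = spec.split(":")
    selected = []
    for r in revisions:
        key = rev_key(r)
        if low and key < rev_key(low):
            continue
        if high and key > rev_key(high):
            continue
        selected.append(r)
    return selected
```

      For a file with revisions 1.1 through 1.12, the spec ":1.10" selects the first ten revisions and "1.2:1.10" selects nine, leaving 1.1, 1.11 and 1.12 in place.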

      man it. SVN is still too immature if it can't do this.

      Efficiency of storing binaries is very inconsequential when storage is free these days. Not only that, but efficiency of delta'ing varies and wasteful in time (lookup bsdiff) so why bother? And with the obliterate option, it makes the issue moot, and SVN unusable for binary CM and deployments.

      I can access everything from CVS over ssh and VPN tunnels just fine from anywhere in the world. I don't really see a need for a complicated HTTP overhead although it can be nice, but basically says there's a no-trust relationship going on.

      I just wish CVS would maybe use BerkeleyDB or something faster for the file db in the backend so tagging isn't so slow, handle file histories over 4GB better (isn't there a better way than a full copy/replace of the ,v file on commits?), and fix some other nits. For the enterprise development I do, CVS or Perforce are the only choices that don't have issues.

    36. Re:What's wrong with version control? by aled · · Score: 1
      The way you do a rename in CVS doesn't preserve history. Yes, the file will have a new name from now on, but it breaks any builds of previous versions. YMMV depending on your language, tools, etc. It doesn't work for many use cases and, more importantly, it isn't clear in the CVS logs. CVS just doesn't support it.

      It seems that cvs admin -o isn't a very "mature" or robust option. On the contrary it is very manual and can be dangerous.

      From the manual:

      None of the revisions to be deleted may have branches or locks. If any of the revisions to be deleted have symbolic names, and one specifies one of the '::' syntaxes, then CVS will give an error and not delete any revisions. If you really want to delete both the symbolic names and the revisions, first delete the symbolic names with cvs tag -d, then run cvs admin -o. If one specifies the non-'::' syntaxes, then CVS will delete the revisions but leave the symbolic names pointing to nonexistent revisions. This behavior is preserved for compatibility with previous versions of CVS, but because it isn't very useful, in the future it may change to be like the '::' case. Due to the way CVS handles branches rev cannot be specified symbolically if it is a branch. See section 5.5 Magic branch numbers, for an explanation. Make sure that no-one has checked out a copy of the revision you outdate. Strange things will happen if he starts to edit it and tries to check it back in. For this reason, this option is not a good way to take back a bogus commit; commit a new revision undoing the bogus change instead (see section 5.8 Merging differences between any two revisions).


      Subversion has an established procedure for doing this, it just doesn't do it on an online repository:
      svndumpfilter exclude calc calc-dumpfile
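
      For context, svndumpfilter works as a filter: it reads a dump stream on standard input and writes the filtered stream to standard output, so the whole procedure is dump, filter, load into a fresh repository. A rough sketch of driving that from a script follows. The repository paths are hypothetical, the "calc" prefix is taken from the command above, and the helper names are made up for this example; test on a copy of the repository first.

```python
import subprocess


def obliterate_steps(old_repo, new_repo, path_prefix):
    """Return the commands of the dump/filter/load procedure as argv lists."""
    return {
        "create": ["svnadmin", "create", new_repo],
        "dump": ["svnadmin", "dump", old_repo],
        "filter": ["svndumpfilter", "exclude", path_prefix],
        "load": ["svnadmin", "load", new_repo],
    }


def run_obliterate(old_repo, new_repo, path_prefix):
    """Stream dump -> filter -> load without an intermediate dump file."""
    steps = obliterate_steps(old_repo, new_repo, path_prefix)
    subprocess.run(steps["create"], check=True)
    dump = subprocess.Popen(steps["dump"], stdout=subprocess.PIPE)
    filt = subprocess.Popen(
        steps["filter"], stdin=dump.stdout, stdout=subprocess.PIPE
    )
    dump.stdout.close()  # let dump see a broken pipe if the filter exits
    load = subprocess.Popen(steps["load"], stdin=filt.stdout)
    filt.stdout.close()
    codes = [proc.wait() for proc in (load, filt, dump)]
    if any(codes):
        raise RuntimeError("dump/filter/load failed: {}".format(codes))
```

      The old repository is left untouched; the new one simply never contained the excluded path, which is why this is an offline maintenance step rather than something done on a live repository.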

      Efficiency of storing binaries is very inconsequential when storage is free these days. Not only that, but efficiency of delta'ing varies and wasteful in time (lookup bsdiff) so why bother? And with the obliterate option, it makes the issue moot, and SVN unusable for binary CM and deployments.
      You seem to have a need to delete old binary files. I don't see the point of doing that, since you agree that storage is cheap. I think most people would prefer to keep history unless hard pressed by storage limits. If you google it, or look through the comments here, you'll see people with multi-gigabyte repositories and no plans to obliterate any part of them. The SVN procedure is enough for this as a repository maintenance option. Most users probably don't need to obliterate regularly.

      HTTP may have no use for you, but for other people its absence may be a show stopper. CVS doesn't support alternate protocols. SVN also supports SSH with svnserve, so it's more flexible.

      I just wish CVS would maybe use BerkeleyDB or something faster for the file db in the backend so tagging isn't so slow

      I think that is a design flaw of CVS. You may want to check http://www.march-hare.com/cvspro/. It supports some niceties that CVS lacks, and in some cases even SVN lacks, though their comparison seems a little outdated. Disclaimer: I haven't tried it :-)
      --

      "I think this line is mostly filler"
    37. Re:What's wrong with version control? by NuShrike · · Score: 1

      The way you do a rename in CVS doesn't preserve history. Yes, the file will have a new name from now on, but it breaks any builds of previous versions.
      cp instead of mv fixes that. Then cvs remove the previous version when the rename is complete. Yes, it's clunky and you can't directly link the two files unless you symbolically link them, but it's only a minor annoyance -- not a show stopper.

      As for cvs admin -o, I have a Perl script that generates the necessary commands I need to delete the versions. It can even be done visually with WinCVS, so I don't consider this dangerous nor lacking; just incomplete.

      However, SVN's method is IMO very complex, and basically unsupported, when SVN is supposed to improve upon CVS, not remove features. Even db-based Perforce has full support for it, as do some other SCSs.

      We have a need for binary obliteration because we store every single "test" framework release build. When the final version goes live, there is no need to keep any of the previous prototype builds. For us, SVN isn't really the improved CVS.

      Thanks for the CVSNT link. Although I've been peripherally aware of its improvements over CVS, it almost matches SVN for many features and is worth a harder look for the server setup. Thanks!

  14. More about tuning your processes by weinerofthemonth · · Score: 3, Informative

    Based on the headline, I was expecting some great method for tuning Subversion for increased performance. This article was about performance tuning your processes, not Subversion.

  15. But why deltas? by Weston+O'Reilly · · Score: 1

    The reasons given here are valid and pretty obvious reasons why you'd want to store binaries in version control. But what is the big advantage of storing deltas of binaries, instead of complete files like CVS? Is it just disk space savings?

  16. Is this a solution in search of a problem? by Anonymous Coward · · Score: 0

    Disk space is stupidly cheap.

    1. Re:Is this a solution in search of a problem? by Anonymous Coward · · Score: 0

      > Disk space is stupidly cheap.

      Yeah, and soon programmers will be too. When that happens, the wasted time spent uploading and downloading binaries (which was the point of the article) will no longer matter.
  17. running the toolchain... by iangoldby · · Score: 3, Insightful

    If you put the toolchain into CM, do you also put the operating system in? Just as the sourcecode is no good if you don't have the right toolchain to build it, the toolchain is no good if you don't have the right OS to run it.

    I suspect the answer (if you really need it) is to save a 'Virtual PC' image of the machine that does the build each time you make an important baseline (or each time the build machine configuration changes). Since the image is likely to be in the GB size range, you might want to store it on a DVD rather than in your CM system.

    1. Re:running the toolchain... by 19thNervousBreakdown · · Score: 1

      Don't forget, the VM is useless without the hypervisor/player/whatever, so you need to check that in too. Of course, that's generally useless without the OS, so check that in too. Even if you have an OS/hypervisor, that's useless without the hardware, so you need to check that in too.

      Or, rather than trying to figure out how to version control hardware, you could write portable code and use open standards, and not worry about all this mess.

      --
      <xml><I><am><so><damn>Web 2.0</damn></so></am></I></xml>
    2. Re:running the toolchain... by NuShrike · · Score: 1

      The OS is irrelevant when the machines being deployed to are the same OS and flavor. Windows, Linux, etc.

      Eventually, the OS updates but the tool chain updates with it. Release Management is about handling the NOW and not the whatif. Especially since stable and mature OSs don't really change that much, and aren't a toolchain dependency (unlike you RedHat people).

      You have some other granular, fault tolerant, and centralized release tracking model that works better and doesn't rely on different directories, tarballs, or some other hack method?

    3. Re:running the toolchain... by iangoldby · · Score: 1
      Actually, my point was about the inconsistency of putting the toolchain into CM but not the OS. Personally I don't think it is the right thing to do.

      Eventually, the OS updates but the tool chain updates with it.
      If you put the toolchain (but not the OS) into CM, you can't guarantee that the toolchain will still work after the OS has updated. But if you try the updated toolchain, then why did you put the old toolchain into CM in the first place?
    4. Re:running the toolchain... by NuShrike · · Score: 1

      Why have file-system journals?

      Instant traceable recovery of the tool-chain, especially when spread out between multiple developers over long periods of time, as long as the core dependencies don't change.

      Let me give you an example: developers have been building this game with tools and related artwork. Once the release A has been made, all sources and binaries are checked-in. A year later, developers come back to localize this release for international release, but meanwhile the source and tool chain had moved on.

      Do you attempt to destabilize the certified release by updating all sources and tool chain to the latest, or reduce work and time by working with the snapshot of what release A was developed in and only change artwork and displayed text?

      Sometimes, if the tool chain API is stable, it should be no problem to update to the latest, but not if it's custom in-house.

      How often does the OS change and does it really affect the tool chain? For example, Win32's API has been stable for 6-7 years now.

      I'm sorry all of you Red Hat/Fedora/Linux people are screwed in the meantime.

  18. Developers will not do these workarounds by javaxman · · Score: 3, Informative
    At least in a general case, I couldn't expect the developers I work with to gzip their binaries before checking them into version control.

    Doing so means you have to unzip them to use them. Not very handy. Most users want to use Subversion the way they should be able to use version control- a checkout should give you all of the files you need to work with on a given project, with minimal need to move/install pieces after checkout. Implementing the 'best' suggested workaround would mean needing a script or other way to get the binaries unpacked. Programmers are often annoyed enough by the extra step of *using* version control, now you have to zip any binaries you commit to the repository?
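
    As a rough idea of what that extra unpack step would look like, here is a minimal post-checkout script. It is only a sketch: it assumes the binaries were committed as individual .gz files sitting where the unpacked file should end up, and the function name is made up for this example.

```python
import gzip
import os
import shutil


def unpack_gzipped(root):
    """Decompress every *.gz under root next to its archive.

    Returns the sorted list of files that were written.
    """
    unpacked = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Subversion's own metadata directories.
        dirnames[:] = [d for d in dirnames if d not in (".svn", "_svn")]
        for name in filenames:
            if not name.endswith(".gz"):
                continue
            source = os.path.join(dirpath, name)
            target = source[:-3]  # strip the ".gz" suffix
            with gzip.open(source, "rb") as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            unpacked.append(target)
    return sorted(unpacked)
```

    Everyone on the team would have to run something like this after each checkout or update, and the unpacked files would need to be ignored by version control, which is exactly the kind of friction being objected to here.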

    I'm unimpressed by their performance testing methodology... they give shared server and desktop performance numbers, but have no idea what 'else' those machines were doing? Pointless. I'd like more details regarding what they're doing in their testing. Their tests were done with a "directory tree of binary files", but don't say what size or how many files?

    My tests on our server show a 28MB binary checkout ( LAN, SPARC server, Pentium M client ) takes ~20 seconds. Export takes ~2sec. That must be a big set of files to cause a 9 minute *export*... several gigs, am I wrong? It'd be nice for them to say. Most of us, even in a worst case, won't have more than a few hundred MB in a single project.

    The only *real* solution will be a Subversion configuration option which lets you say "please, use all my disk space, speed is all I care about when it comes to binary files". CollabNet is focused enough on getting big-business support contracts that it shouldn't be long before we see this issue addressed in one manner or another. You -know- they're reading this article!

    1. Re:Developers will not do these workarounds by Crazy+Taco · · Score: 1

      My tests on our server show a 28MB binary checkout ( LAN, SPARC server, Pentium M client ) takes ~20 seconds. Export takes ~2sec. That must be a big set of files to cause a 9 minute *export*... several gigs, am I wrong? It'd be nice for them to say. Most of us, even in a worst case, won't have more than a few hundred MB in a single project.

      What you say is true, but the problem is that that isn't how any organization I've seen tends to use Subversion (or Subversion like) version control systems (TFS comes to mind). What they typically do is have a root directory in the repository, and the roots of ALL projects are stored under that. Therefore, every project in the organization is in the same file tree in the Subversion repository. Most all of those projects are probably pretty small like yours, but all of them combined can be equal to many gigabytes of space. Despite these performance issues, most organizations still need that sort of structure so that they don't have tons of repositories sitting around. This way, they only have one repository and they can store every single project in it. But unfortunately, this is also where the performance hit comes in.

      For those using Subversion (or TFS) at most organizations, you can see what I mean by looking at the revision or changeset (check in) numbers. Check something in. Note the number. Wait about an hour without checking something in. Look at the repository's latest revision. Odds are, the revision number has gone up. Why? Someone else on some other project stored in some other folder out on the tree checked something in, and the repository keeps track of binary deltas of the entire tree! That's a good thing because that is what lets you move and rename files without losing their history (even changing the name of your project or moving it), but again, it is also how you get gigabyte upon gigabyte into the repository and start crippling performance.
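
      The repository-wide numbering described above can be shown with a toy model. This is not how Subversion stores anything; it only illustrates that one counter is shared by every project in the tree while each path remembers the revision that last touched it. The class and the paths are invented for the example.

```python
class ToyRepository:
    """One revision counter shared by every path in the tree."""

    def __init__(self):
        self.head = 0           # repository-wide revision number
        self.last_changed = {}  # path -> revision that last touched it

    def commit(self, paths):
        """Record a commit touching the given paths; return the new revision."""
        self.head += 1
        for path in paths:
            self.last_changed[path] = self.head
        return self.head

    def changed_since(self, prefix, revision):
        """Paths under prefix that were touched after the given revision."""
        return sorted(
            path
            for path, rev in self.last_changed.items()
            if path.startswith(prefix) and rev > revision
        )
```

      A commit under /projB bumps the head revision for everyone, but asking what changed under /projA since your last update still returns nothing, which is why the other project's check-in shows up in the revision number without giving you anything to download.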

      --
      Beware of bugs in the above code; I have only proved it correct, not tried it.
    2. Re:Developers will not do these workarounds by pearlmagic · · Score: 1

      The performance hit for having everything in one repository "shouldn't" be as bad as you think. Each revision is its own separate file, and the cost of determining which files have changed in that set is minimal, so doing an update on your repository when some other project checked something in should be negligible, as files in your project didn't actually change. Of course larger repositories will start to take longer doing checkouts/exports, and project "cleansing" should always be the first line of defense against these performance hits. TFA didn't talk about update/commit at all, which is what people should be using instead of checkout/export. If they did that, I think the performance hit would suddenly go away. He also didn't say which DB backend it used (BDB or FSFS).

    3. Re:Developers will not do these workarounds by XO · · Score: 1

      My "server" is my desktop machine, AMD Athlon XP 2000+. 30mb files take around 12 minutes or so to checkout. That's to a computer on the LAN. Add 8 or 9 of those to my project, and we're spending hours doing checkouts. Fortunately most of those files never change, but when they do, that's an automatic 10-15 minutes of time that's going to be spent waiting by every person that gets an update after that.

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
    4. Re:Developers will not do these workarounds by chthon · · Score: 1

      I think this is the case for all VC systems.

      I once did a test between Continuus and Subversion for checking out the same tree. The results were the same. Why? Not because of the version control system, but because of the speed of bringing updates over the network to the disk. Creating files and directories is expensive.

      In my automated builds (using Continuus), reconfigures (updates) take from 5 to 20 minutes, depending upon the size of the tree and the amount of changes done. With big changes, add 5 to 10 minutes to the previous figures.

      You need at most 2 or 3 checkouts; most developers have enough with one. All the rest should be done by updating your work area to bring your development in sync with the main trunk or the particular branch you're working against.

    5. Re:Developers will not do these workarounds by WuphonsReach · · Score: 1

      My "server" is my desktop machine, AMD Athlon XP 2000+. 30mb files take around 12 minutes or so to checkout.

      You *really* need to examine your setup to find out why that is happening.

      A 300 MB checkout (maybe a few dozen files) should only take about a minute to prep and then however long it takes to move over the wire. Since we're using svn+ssh, things are extremely efficient (SSH pub-keys restricted to running the svnserve tool in tunnel mode combined with PuTTY and TortoiseSVN). Our largest repository is a few gigabytes with close to 100,000 files (but I haven't counted lately).

      Our server is simply a 4-disk RAID10 system, 2GHz Athlon64 X2 CPU, and we're running inside a Xen DomU. Last time I looked, I think we're only giving the SVN DomU around 512MB of RAM. We're using the FS-based repository format instead of Berkeley DB.

      Most of the time, our bottleneck is the WAN connection. And Xen's network performance (in that version of Xen) isn't all that great to write home about either. But we get a good 2-3 MBps across the gigabit connection which is good enough for now. Since most of our users are on the far side of a T1 line (which is only 150KBps) it's not an urgent issue.
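
      To put the figures quoted in this thread side by side, here is a rough back-of-the-envelope sketch in Python. It assumes 1 MB = 1024 KB and ignores protocol overhead, compression and server-side processing; the function names are made up for illustration.

```python
def transfer_seconds(size_mb, rate_kb_per_s):
    """Seconds to move size_mb megabytes at rate_kb_per_s KB/s (1 MB = 1024 KB)."""
    return size_mb * 1024 / rate_kb_per_s


def effective_rate_kb_per_s(size_mb, seconds):
    """Average throughput in KB/s for a transfer that took `seconds`."""
    return size_mb * 1024 / seconds


# A 300 MB checkout over the T1 link quoted above (about 150 KB/s).
t1_minutes = transfer_seconds(300, 150) / 60
# The same checkout at 2 MB/s on the local network.
lan_seconds = transfer_seconds(300, 2 * 1024)
# The parent poster's 30 MB file taking about 12 minutes.
parent_rate = effective_rate_kb_per_s(30, 12 * 60)

print(f"T1: {t1_minutes:.0f} min, LAN: {lan_seconds:.0f} s, parent: {parent_rate:.0f} KB/s")
```

      Under those assumptions the T1 checkout comes to roughly 34 minutes, the LAN checkout to 150 seconds, and the parent's setup works out to about 43 KB/s, which is far below what a LAN should deliver and supports the suggestion that something other than the wire is the bottleneck there.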

      --
      Wolde you bothe eate your cake, and have your cake?
    6. Re:Developers will not do these workarounds by XO · · Score: 1

      Well, I initially thought that this was the painful long process of processing a binary diff, on the receiving computer. This article, as well as glancing at the CPU meter shooting through the roof when someone does an update involving one of those 30mb binaries indicates that the svnserve is doing something distressingly weird.

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
  19. What do you find hard? by sheldon · · Score: 1

    In its simplest form... just keeping a history of changes, it really isn't that bad.

    Where it becomes complicated is when you start talking about branching, merging, or trying to deal with dependencies across projects, etc.

    But if done well, version control helps more than hurts.

  20. Re:Utter garbage by Anonymous Coward · · Score: 0

    Visual SourceSafe destroys all open sores garbage. Visual SourceSafe destroys anything you put in it :)
  21. Comment removed by account_deleted · · Score: 2, Interesting

    Comment removed based on user account deletion

  22. Vesta is better by ebunga · · Score: 3, Interesting

    If you actually care about your code and making proper releases, use Vesta. Transparent version control that even tracks changes between proper check-ins (real "sub" versions). Built-in build system that beats the pants off of Make. It even has dependency tracking to the point that you not only keep your code under version control, but the entire build system. That's right. You can actually go back and build release 21 with the tools used to build release 21. It's sort of like ClearCase but without all the headache. Did I mention it's open source?

    The first time I used Vesta, it was a life-changing experience. It's nice to see something that isn't a rehash of the 1960s.

  23. Notice.. by sudog · · Score: 2, Interesting

    .. that the article is glaringly absent *actual check-in times.* Or, where *actual check-in times* are available, the details of whether it's the same file as in previous tests is glaringly absent. This leaves open the question as to whether the data set they were working on was identical or whether it was different between the various tests.

    Questions that remain:

    1. Does the algorithm simply "plainly store" previously-compressed files, and is this the reason why that is the most time-efficient?
    2. What exactly was the data for the *actual check-in* times? (What took 28m? What took 13m?)
    3. Given that speedier/efficient check-in requires a large tarball format, how are artists supposed to incorporate this into their standard workflow? (Sure, there's a script for check-in, but the article is absent any details about actually using or checking-out the files thus stored except to say it's an unresolved problem regarding browsing files so stored.)

    The amount of CPU required for binary diff calculation is pretty significant. For an artistic team that generates large volumes of binary data (much of it in the form of mpeg streams, large lossy-compressed jpeg files, and so forth) it would be interesting to find out what kind of gains a binary diff would provide, if any.

    Document storage would also be an interesting and fairer test. Isn't .ODF typically stored in compressed form? If not, then small changes wouldn't necessarily affect the entirety of the file (as it would in a gzip file if the change were at the beginning) and SVN might be able to store the data very efficiently. Uncompressed PDF would certainly benefit.

  24. SVN does not have obliterate by NuShrike · · Score: 1

    Btw, CVS does do binary difference storage. Ever done a diff between two versions of .doc files?

    SVN will never beat CVS on space-efficiency in the long run.

    SVN does not have granular history obliterate whereas CVS/Perforce does, so CVS might be bigger initially, but you can always delete very old versions. These old binary versions are the ones you can rebuild from source or you really don't care anymore.

    It exists forever in SVN.

    1. Re:SVN does not have obliterate by WuphonsReach · · Score: 1

      It exists forever in SVN.

      Which is both a blessing and a curse. In CVS (or even VSS which had a "destroy" option), a rogue developer could kill off large amounts of project history which might go unnoticed for long periods. In SVN, it's simply not possible. In VSS, we had an extremely limited subset of people who were allowed to "destroy".

      Our old VSS server's repository was around 20-30GB after close to 5 years of use. Lots and lots of binary files in there, which ate up tremendous amounts of disk space due to inefficient storage of binaries.

      In SVN, because binaries are stored efficiently (compression + deltas), we're not seeing anywhere near the rate of growth that I was seeing with VSS. And we're not even bothering to compress easily compressed binaries anymore (such as MDBs or huge DOC files).

      (With SVN, there's always the option to do a dump/restore of your SVN repository, where you filter things out during the restore, resulting in a smaller repository. Probably not much worse than the old VSS Archive/Restore tedium.)

      --
      Wolde you bothe eate your cake, and have your cake?
  25. Yes, use Mercurial or another distributed tool! by Richard+Mills · · Score: 1

    Although Subversion does a great job of being a better CVS than CVS, yes, it is hard to use. Let me clarify: It is easy to use for a small project with just a few developers. But for large projects with many developers scattered all over, it, or any centralized revision control system becomes a nightmare (to me, anyway). The biggest problem I have with Subversion/CVS-type systems is that eventually managing the branches becomes a nightmare, and it becomes really easy to screw stuff up.

    My work became a lot easier when I started using distributed revision control systems. My favorite is Mercurial, a very fast and lightweight system written in Python. The main reason that I like it is that it is by far the easiest to use revision control system that I have worked with. In addition to being fast, intuitive, supporting completely disconnected operation, and other great features, branching and merging is a breeze. And, most importantly, it makes it very easy for the developers on the large projects I work on to keep from stepping on each other's toes because everything is a branch. Whenever I checkout ("clone") the repository that we consider the "central" (or "trunk") repository, all of my commits happen in my local mirror of the repository, and when I am finished I "hg push" those changes back, merging them back into the "trunk". (My explanation may seem a little confusing, but the Mercurial development model is explained pretty well here.) The great thing about this model is that branching is the most natural thing in the world (in fact, everyone essentially always works on their own "branch") so it actually gets used. I have experienced too many cases with CVS or Subversion where something should have happened on its own branch but didn't because it was too confusing, too slow (with the bottleneck of the central server), etc.

    Although Mercurial is still pretty young, it is mature enough that some very large projects (e.g., Mozilla) have moved to it. I urge everyone who is looking for a powerful, but intuitive and easy-to-use revision control system to take a look at it. I have used several revision control systems and Mercurial is the first one that really makes me feel more productive.

    1. Re:Yes, use Mercurial or another distributed tool! by cerberusss · · Score: 1

      You're talking about branching being hard. But as far as I read the GP, she actually meant that without branching she finds it hard. I don't know where she got that from, though. Without branching, it's not hard at all.

      --
      8 of 13 people found this answer helpful. Did you?
  26. Subversion is Sex by N8F8 · · Score: 1

    You don't like sex?

    --
    "God fights on the side with the best artillery." - Napoleon, Marshal of France - speaking truth to power
  27. Correction by achurch · · Score: 1

    You ever try to move a directory structure full of source code from one place to another in CVS -- or even to move or rename a single file...?

    HINT: When you do it the way CVS provides, you will lose all of your revision history.

    s/lose/have to check another file for/

    Yes, I'm working on a 100k-line project (in CVS) that's undergone significant directory restructuring, and no, I've never found this to be a problem. If anything was to push me to Subversion, it'd be the fact that the CVS logs are split up among files in the first place, so I can't get a concise log of changes to the project as a whole (without maintaining a separate ChangeLog, and then why do VC in the first place?).

    My main beef with Subversion, from what I've read of it so far (correct me if I'm wrong), is that it insists on using some form of database to store the project data, rather than using ordinary files as CVS does. This may improve the efficiency of accesses, but it also makes it harder to recover the data when catastrophic failure occurs. With CVS, even if part of the repository gets nuked, I can still recover anything that's left, at worst by just comparing the ,v file in the repository with my working copy; I'd be pretty nervous about using a VC system in which that sort of last-ditch fallback wasn't available.

    1. Re:Correction by WuphonsReach · · Score: 1

      My main beef with Subversion, from what I've read of it so far (correct me if I'm wrong), is that it insists on using some form of database to store the project data, rather than using ordinary files as CVS does.

      Not sure if you're talking about server-side (the repository) or client-side (the working copy).

      On the server-side, you have a choice of either FS or BDB for storing the repository. I prefer FS. There's also the SVN mirror scripts and other backup options. The FS format stores each revision # in a separate file (or sets of files). So there are ways to work around a corrupt revision #. And, depending on when the corruption happened, you could probably pull the old tape backups and replace just that broken file (because old rev# files are never touched again in the FS repository). I suspect you could even symlink old rev# files to a read-only medium and things would still work.

      On the client-side, I stay out of my .svn folder (other than peeking periodically out of curiosity).

      --
      Wolde you bothe eate your cake, and have your cake?
    2. Re:Correction by achurch · · Score: 1

      I was referring to the server side. So using FS does keep the file data in an easily accessible format? I'd been under the impression that it didn't. Thanks for the tip.

      (I'm not so worried about the client side--if your CVS or .svn or whatever control directory gets messed up, you can always save your changed files away somewhere and pull a new copy of the repository to clean that up.)

    3. Re:Correction by WuphonsReach · · Score: 1

      I was referring to the server side. So using FS does keep the file data in an easily accessible format? I'd been under the impression that it didn't. Thanks for the tip.

      Well, it's at least a bit more accessible than the BDB format. Here's what a FS repository looks like. The interesting folder is the "db" folder.

      /var/svn/web # tree -L 2
      .
      |-- README.txt
      |-- conf
      |   |-- authz
      |   |-- passwd
      |   `-- svnserve.conf
      |-- dav
      |-- db
      |   |-- current
      |   |-- format
      |   |-- fs-type
      |   |-- revprops
      |   |-- revs
      |   |-- transactions
      |   |-- uuid
      |   `-- write-lock
      |-- format
      |-- hooks
      |   |-- post-commit.tmpl
      |   |-- post-lock.tmpl
      |   |-- post-revprop-change.tmpl
      |   |-- post-unlock.tmpl
      |   |-- pre-commit.tmpl
      |   |-- pre-lock.tmpl
      |   |-- pre-revprop-change.tmpl
      |   |-- pre-unlock.tmpl
      |   `-- start-commit.tmpl
      `-- locks
          |-- db-logs.lock
          `-- db.lock


      In the "revs" and "revprops" folders there is one file per revision committed to the server. The "revprops" files are plain text and look like:

      # cat 92
      K 10
      svn:author
      V 3
      wuphon
      K 8
      svn:date
      V 27
      2006-11-05T02:20:12.153488Z
      K 7
      svn:log
      V 15
      import from VSS
      END
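
      The listing above follows a simple length-prefixed layout: a "K n" line, then n bytes of key, a "V m" line, then m bytes of value, and a final END. A minimal Python sketch of a reader for that layout follows. It is based only on what the listing shows, not on Subversion's own code, and parse_revprops is a made-up name. One caveat: the listing has "V 3" in front of a six-character author name (presumably edited by hand before posting), so the sample below uses "V 6" to keep the lengths consistent.

```python
import io


def parse_revprops(data: bytes) -> dict:
    """Parse 'K len / key / V len / value ... END' records into a dict."""
    stream = io.BytesIO(data)
    props = {}
    while True:
        header = stream.readline().rstrip(b"\n")
        if header == b"END":
            return props
        kind, length = header.split()      # raises ValueError on truncated input
        if kind != b"K":
            raise ValueError(f"expected a K line, got {header!r}")
        key = stream.read(int(length))
        stream.read(1)                     # newline that follows the key bytes
        kind, length = stream.readline().rstrip(b"\n").split()
        if kind != b"V":
            raise ValueError("expected a V line after the key")
        value = stream.read(int(length))
        stream.read(1)                     # newline that follows the value bytes
        props[key.decode("utf-8")] = value.decode("utf-8")


# Sample modeled on the listing above, with the author length set to 6.
SAMPLE = (
    b"K 10\nsvn:author\nV 6\nwuphon\n"
    b"K 8\nsvn:date\nV 27\n2006-11-05T02:20:12.153488Z\n"
    b"K 7\nsvn:log\nV 15\nimport from VSS\n"
    b"END\n"
)

props = parse_revprops(SAMPLE)
print(props["svn:author"], "|", props["svn:log"])
```

      Reading values by byte count rather than by line is what lets a log message contain newlines without confusing the reader.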


      However, the contents of each individual revision file in the "revs" directory is a lot more complex. It's a binary format, with content being compressed and delta'd off a previous revision (not always THE previous revision, SVN uses an algorithm to pick a revision lower down to delta off of). It would be possible to reconstruct the data, but would probably be easiest to use the SVN libraries to do so.

      But as I said, assuming that the rev file was written correctly and then backed up, you can easily restore a rev file that gets corrupted at a later date. And since the rev files never change, you could even use tripwire or some other checksum utility to keep a guarded eye on those files. The biggest risk would be that SVN corrupts things as it's writing to the revision file. That failure would be noticeable fairly quickly by other users of the repository.

      I can't find my notes (but this comes close) on how SVN picks base versions to create deltas off of for a particular revision. IIRC, to re-construct revision #100 of a file that had been revisioned 100 times, it only has to look at O(log N) previous revisions instead of O(N). So it may need to read revision files {100, 92, 80, 65, 32, 5, 1} - or something like that. So if you were trying to retrieve the 100th revision of a file, but the 99th revision was corrupt, you'd still be able to get the contents of the 100th revision. You would only run into issues if the Nth revision was based off of a revision that had been corrupted.
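
      The "checksum utility" idea can be sketched in a few lines of Python. This is only an illustration of the approach, not a replacement for tripwire: snapshot and changed_files are made-up names, whole files are read into memory (large rev files would want chunked reads), and the directory walk is recursive on the assumption that some repository layouts nest rev files in subdirectories.

```python
import hashlib
from pathlib import Path


def snapshot(revs_dir):
    """Map each file under revs_dir (relative path) to its SHA-256 hex digest."""
    root = Path(revs_dir)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(old, new):
    """Names recorded in `old` whose digest differs, or which are missing, in `new`."""
    return sorted(name for name, digest in old.items() if new.get(name) != digest)
```

      Take a snapshot after each backup, store it somewhere safe, and compare it against a fresh snapshot later. New rev files are expected and are ignored; any name returned by changed_files is a rev file that was modified or removed after it was recorded, which should never happen if old rev files are write-once.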

      So, for heavily edited files, there's a bit of redundancy built in.
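
      As a hedged illustration of how such a chain can stay short: my recollection of Subversion's skip-delta design notes is that the delta base for a file's Nth change is found by clearing the lowest set bit of N. The Python sketch below assumes exactly that rule and nothing more; N counts changes to one file (not repository revision numbers), 0 stands for the file's first stored version, and real repositories may tune or mix in other strategies. The function names are made up.

```python
def delta_base(n):
    """Change number used as the delta base for change n (n >= 1):
    n with its lowest set bit cleared."""
    return n & (n - 1)


def delta_chain(n):
    """All change numbers that must be read to reconstruct change n."""
    chain = [n]
    while n > 0:
        n = delta_base(n)
        chain.append(n)
    return chain


print(delta_chain(100))
print(delta_chain(99))
```

      Under this rule, change 100 is rebuilt from changes 100, 96, 64 and 0, so a corrupt change 99 does not get in the way, which matches the point made above. The exact numbers differ from the guessed list in the comment, but the shape is the same: the chain length is bounded by the number of bits in N, so it grows with log N.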

      --
      Wolde you bothe eate your cake, and have your cake?
  28. CVS lacks useful features by Frankie70 · · Score: 1

    Haven't used subversion yet, but have used Perforce, clearcase & CVS.

    Frankly CVS just doesn't cut it for me. It lacks too many features.
    1) Atomic checkins/submits
      I am trying to submit changes in 5 files as a single bugfix.
    A submit/checkin should either succeed for all 5 or fail for all 5.
    CVS doesn't do this. The end result is that I may end up submitting
    a change in the header without submitting a corresponding change in the
    implementation file.

    2) Changelists
      After checking in multiple files together, at any point in time, I should
    be able to find out all the changes that were checked in at the same time.
    CVS has no way of doing this - Submitting 5 files together is the same as
    submitting 5 files separately as far as CVS is concerned.

    3) More Changelist features for non-submitted changes
    Let us say I am working on 3 different bugfixes. Perforce allows me
    group together my changes in different changelists even before I
    submit the changes. That is I can create changelist A B & C.
    In changelist A - I have files a.c & a1.c changed, in changelist
    B, I have b.c & b1.c changed & so on. So I decide I am done with
    all the changes required in the subset A, I can submit it very easily
    or undo all changes in changelist B.

    4) Merges
    Merges between branches are a breeze with Perforce. With CVS it's
    a pain. Perforce stores a lot of information about merges which have
    already happened which is invaluable. In CVS, merges between branches
    are very little more than changes manually copied from one branch to
    another.
    I can do a lot of stuff which I can't do with CVS
    - I can very trivially merge Bugfix 1111 (comprising of 5 files
    checked into changelist XXXX) from a branch to another branch or
    the main trunk.
    - Because Perforce stores information about merges, I can do periodic
    single command merges very easily between a branch & the trunk - perforce
    will not try to merge in changes which have already been merged the last
    time I did a merge.

    I could go on & on, but the point is that something like Perforce makes
    a developer's life so much easier. I could work around all these
    things in CVS (i.e. do it in multiple steps) but the ease is something
    worth paying for I think.

    I haven't used subversion, so I can't comment on it.

  29. Merge issues much worse than Performance issues by stoicfaux · · Score: 1

    I've worked in it extensively, and it is still the best version control system I've ever used.

    SVN would be great if it had merge tracking (and true renames.) As much as I like SVN, the merge issues are a deal breaker:

    • No merge tracking. You have to manually record merge information in the checkin comment, which is inherently error prone. If merge tracking isn't done or is done incorrectly (e.g. merge -r 100:HEAD) there's no way to recover except to redo the merge with extra double checking.
    • The svnmerge.py merge tracking script only considers the current directory. It doesn't do any recursive analysis so you want to do all your merges at the project's root dir to be accurate.
    • Lack of true renames. When you rename or move a file, it does a delete + add, which leaves you open to missed merges. Ex: Branch. Rename branch/a.java to branch/b.java. Make an enhancement change to branch/b.java. Make a bug fix in trunk/a.java. Merge branch to trunk. SVN will delete a.java (which has the bug fix) and add b.java. Congrats, you just lost the bug fix change. SVN should have merged b.java with a.java.
    • Bi-directional merges. When you merge between branches multiple times, any merge conflicts resolved in previous merges get re-flagged as conflicts, thus giving you an ever increasing number of spurious merge conflicts that hide the real, new merge conflicts. The workaround is to skip merge revisions, which has the drawbacks of requiring multiple merges, and any extraneous changes made during a merge (such as a quick and simple bug fix) are not merged.
    • Serious training to understand merges. You basically need a merge-meister or two who understands the implications and pitfalls of SVN's merging and merge-tracking.
    • There's also no way to 'lock down' merges via hooks/triggers. (Such as requiring svnmerge.py to be used for all merges.)
    Once merge tracking is added to SVN (maybe in 1.5?) it would be great. Until then, I wouldn't use it except on small teams using Agile and few, short lived branches in order to minimize the merge issues.
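
    To make "merge tracking" concrete, here is a deliberately tiny Python sketch of the bookkeeping involved: remember which source revisions have already been merged into a target, and offer only the remaining ones next time. This is an illustration of the idea under simple assumptions (revisions are plain integers, one source branch, one target); it is not how svnmerge.py, Perforce or any later Subversion release implements it, and both function names are made up.

```python
def eligible_revisions(source_revs, already_merged):
    """Source revisions that have not yet been merged, in ascending order."""
    return sorted(set(source_revs) - set(already_merged))


def record_merge(already_merged, revs):
    """Return a new set that also marks `revs` as merged."""
    return set(already_merged) | set(revs)


merged = set()

# First merge: everything committed on the branch so far is eligible.
first = eligible_revisions([101, 105, 110], merged)
merged = record_merge(merged, first)

# Later merge: only the revision committed since then is offered.
second = eligible_revisions([101, 105, 110, 120], merged)
print(first, second)
```

    Without this record, the only place the merged range lives is the check-in comment, which is the error-prone manual step the first bullet complains about.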
    1. Re:Merge issues much worse than Performance issues by aled · · Score: 1
      SVK is based on SVN and claims history-sensitive merging:

      svk is a decentralized version control system built with the robust Subversion filesystem. It supports repository mirroring, disconnected operation, history-sensitive merging, and integrates with other version control systems, as well as popular visual merge tools.
      --

      "I think this line is mostly filler"
  30. Questions, questions by vrmlguy · · Score: 1

    Neither the article nor the replies tell me anything useful. .tar.gz files are small, meaning they are fast to move through a network, but do they diff well? Good compression algorithms turn data into statistically random streams of bits, so I suspect that different generations of uncompressed .tar files would have smaller deltas than the compressed versions. Similar questions abound for GIF and JPEG files.
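
    The suspicion about compressed data can be checked with a small experiment. The Python sketch below uses the standard zlib module as a stand-in for gzip (an assumption; both use deflate) and a crude position-by-position comparison rather than a real binary delta algorithm, so it only shows the direction of the effect, not what Subversion would actually store. The generated test data is arbitrary.

```python
import zlib


def differing_bytes(a, b):
    """Count positions where a and b differ, plus any difference in length."""
    common = min(len(a), len(b))
    mismatches = sum(1 for i in range(common) if a[i] != b[i])
    return mismatches + abs(len(a) - len(b))


# Arbitrary, highly compressible sample data.
original = b"".join(b"record %d: payload %d\n" % (i, i * 7) for i in range(2000))

# Flip one bit in a single byte near the start of the data.
edited = bytearray(original)
edited[10] ^= 0x01
edited = bytes(edited)

raw_diff = differing_bytes(original, edited)
packed_diff = differing_bytes(zlib.compress(original), zlib.compress(edited))
print("uncompressed bytes changed:", raw_diff)
print("compressed bytes changed:  ", packed_diff)
```

    The uncompressed versions differ in exactly one byte. The compressed versions differ in more places than that, because the change ripples through the encoded stream and its trailing checksum, which is why deltas between compressed files tend to be poor.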

    --
    Nothing for 6-digit uids?
    1. Re:Questions, questions by Anonymous Coward · · Score: 0

      Well, as it happens, I once looked into that. The diffs between compressed files are so big, it makes much more sense to save the compressed file directly instead of saving diffs of compressed files (at least for .tar.gz of the files I work on).

      That's because the diff is actually bigger than the compressed file and has the overhead of needing to restore the original file from the diff.

    2. Re:Questions, questions by WuphonsReach · · Score: 1

      Back when we used VSS+SourceOffSite (all traffic went through SOS, which kept our repository from getting damaged), we would zip up our .MDB (MSAccess database files) prior to checking them in. VSS/SOS was extremely inefficient at storing binaries (every time you checked in a 20MB binary, the repository size would increase by 20MB), so zip'ing the files saved us a lot of space. Plus SOS didn't do a very good job of over-the-wire compression, so a zip'd MDB would download a lot faster than storing the MDB natively.

      In SVN, we no longer bother to zip up our MDBs (or other large, easily compressed, binary files). Adding a 20MB MDB to a SVN repository only adds a few MB to the total size and transmits exceedingly quickly. Especially on subsequent check-ins where it can do a diff and only store / transmit a delta. Pulling down a 500MB MDB where you know someone changed a single record is no longer a big deal if you have a recent version in your local working copy.

      Same thing for other binary formats (AVI, MPEG, JPG, TIFF). We simply don't bother trying to zip them up. It's a hassle that we're happy to no longer have to deal with.

      --
      Wolde you bothe eate your cake, and have your cake?
  31. How to ignore disk thrashing by Anonymous Coward · · Score: 0

    Sorry, but this whole paper points at I/O-bound problems, and they didn't even look at what is, with a 99% chance, the root issue (see the numbers for the dedicated workstation).

    just painful.

  32. Git can delta-compress binaries, too by Anonymous Coward · · Score: 0

    Its delta-compression algorithm doesn't treat line breaks any different from other bytes, so if the files contain large chunks of duplicate data, they can be delta-compressed.

    Of course, data compression masks similarities, so pre-compressed objects defeat this, but it still manages to work on a lot of things.

    Anyway, just FYI.

  33. They forgot the most obvious thing to do! by shplorb · · Score: 1

    License Perforce.