Slashdot Mirror


Making Sense of Revision-Control Systems

ChelleChelle writes "During the past half-decade there has been an explosion of creativity in revision-control software, complicating the task of determining which tool to use to track and manage the complexity of a project as it evolves. Today, leaders of teams are faced with a bewildering array of choices ranging from Subversion to the more popular Git and Mercurial. It is important to keep in mind that whether distributed or centralized, all revision-control systems come with a complicated set of trade-offs. Each tool emphasizes a distinct approach to working and collaboration, which in turn influences how the team works. This article outlines how to go about finding the best match between tool and team."

18 of 268 comments

  1. Git and Mercurial? by capnchicken · · Score: 4, Insightful

    Git and Mercurial are more popular than Subversion? That's big news to me, all snarkiness aside. I'd best be getting out of my bubble.

    --
    A libertarian shat on my carpet once. Claimed the free market would sort it out. -Ford Prefect(8777)
    1. Re:Git and Mercurial? by Vanders · · Score: 5, Interesting

      I share your scepticism. Given the vast number of CVS repositories that exist and the ease with which you can transition to Subversion, I don't think its popularity is going to wane any time soon. It also has some of the widest range of plugins for IDEs such as Visual Studio and Eclipse, and the largest number of tools and clients, which makes it a popular choice for a lot of new projects. Outside of Linux development Git is almost unheard of, although it may gain popularity. And although I've worked with Mercurial professionally, I've yet to see it used anywhere for Open Source development.

    2. Re:Git and Mercurial? by Vanders · · Score: 3, Insightful

      it would be crazy not to use Mercurial in a new project

      Mercurial is a distributed system, Subversion is centralised. They suit almost totally different workflows and teams. If you're a large group of Open Source developers in different countries and timezones Mercurial may suit you. If you're a small group of developers in the same office doing rapid development, Subversion may be better for you.

    3. Re:Git and Mercurial? by Vanders · · Score: 3, Informative

      All you have to do is set up an extra server and say "Hey, this is the central server now".

      Yeah. I know. In fact I did just that at my last job when we implemented Mercurial. The problem is training developers to push their local changesets to the central repository, and stopping developers from pulling from someone else rather than from the central repository. There was at least one incident a week where developers doing things like that led to divergent codebases, which required significant effort from one of the developers to merge and fix conflicts. I have no doubt these problems could have been fixed given time, but it was an uphill battle.

    4. Re:Git and Mercurial? by RiotingPacifist · · Score: 3, Insightful

      you can do that with either centralised or distributed systems, i fail to see your point.

      --
      IranAir Flight 655 never forget!
    5. Re:Git and Mercurial? by ckaminski · · Score: 4, Insightful

      coupled with its designers' absolute refusal to support deleting contents from the repository

      I don't necessarily disagree with you, but in places I've worked, if we removed code in such a fashion and an audit found out about it, we'd get pummeled. Especially if it was discovered after a public release. It's one thing to ship code copyrighted by someone else, it's something completely different to go about covering up the fact.

      So I'm torn on this "feature."

    6. Re:Git and Mercurial? by orzetto · · Score: 4, Informative

      [Subversion's] designers' absolute refusal to support deleting contents from the repository [is bad]

      That is one great feature of Subversion: absolutely no way to screw up stuff that was committed. Revision control is about keeping track of stuff; any model that allows a user to remove information from a repository is a disaster quietly waiting to happen. Sorry you did not understand that.

      If you absolutely need to remove something from an SVN repository, you can do that with svndumpfilter (filtering a dump of the repository and reloading it), meaning you have to ask the repository's administrator. That's a good safeguard against accidental deletions.

      "throwing useless things away makes cleaner code"

      For "cleaner code" you just need svn delete.

      --
      Victims of 9/11: <3000. Traffic in the US: >30,000/y
  2. Re:Perforce by xmundt · · Score: 4, Insightful

    P4 is awesome and works great for huge repos with lots of developers.

    However it is getting stale. I can't think of a single new feature added to it since I started using it in 1999.

    Greetings and Salutations...
    Funny... I tend to think of software more like a truck than a stalk of celery, so staleness never really popped up on my radar. What new features would add to the capabilities of a package that you describe as "awesome"?
    Not flaming, I am really curious, as I have done some software development myself and wonder where the line is between actually adding good functionality to a tool and "creeping featuritis" that adds bells, whistles and complications but no real increase in usability.
    regards
    dave mundt

    --
    YAB - http://blog.beemandave.com/
  3. Errata by kabloom · · Score: 5, Informative

    Because Subversion offers working out of a shared branch as the path of least resistance, developers tend to do so blindly without understanding the risk they face. In fact, the risks are even subtler: suppose that Alice's changes do not textually conflict with Bob's; she will not be forced to check out Bob's changes before she commits, so she can commit her changes to the server unimpeded, resulting in a new tree state that no human has ever seen or tested.

    This statement is incorrect. Subversion requires you to update your working copy before committing whenever you have modified a file that has changed in the repository.

    1. Re:Errata by forkazoo · · Score: 3, Insightful

      This statement is incorrect. Subversion requires you to update your working copy before committing whenever you have modified a file that has changed in the repository.

      Yes and no. It is possible to update or check in only at a certain level in the directory hierarchy, and miss a change to a header outside of the scope you are interested in. You have to be slightly belligerent to get into such a situation, but it can happen.
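
      The disagreement above comes down to how fine-grained the out-of-date check is. A toy Python model, assuming the check is applied per file as kabloom describes (the Repo and WorkingCopy classes are invented for illustration and are not Subversion's implementation), shows how two non-overlapping commits can produce a tree that nobody has tested:

```python
class WorkingCopy:
    """Toy working copy: remembers the revision each file was checked out at."""

    def __init__(self, repo):
        self.base_rev = {path: repo.rev for path in repo.files}


class Repo:
    """Toy central repository: tracks the last revision that changed each file."""

    def __init__(self, files):
        self.rev = 1
        self.files = dict(files)                        # path -> content
        self.last_changed = {path: 1 for path in files}  # path -> revision

    def checkout(self):
        return WorkingCopy(self)

    def commit(self, wc, changes):
        # Per-file check (an assumption of this sketch): reject the commit only
        # if a file being committed changed in the repository after the
        # working copy's base revision for that file.
        for path in changes:
            if self.last_changed.get(path, 0) > wc.base_rev.get(path, 0):
                raise RuntimeError(f"{path} is out of date; update first")
        self.rev += 1
        for path, content in changes.items():
            self.files[path] = content
            self.last_changed[path] = self.rev
            wc.base_rev[path] = self.rev
        return self.rev


repo = Repo({"util.h": "int f(int);", "main.c": "f(1);"})
alice, bob = repo.checkout(), repo.checkout()

# Bob changes the header; Alice changes a caller without updating first.
repo.commit(bob, {"util.h": "int f(int, int);"})
repo.commit(alice, {"main.c": "f(2);"})  # accepted: main.c itself was not stale

print(repo.rev, repo.files)
```

      In this model both commits are accepted, yet the latest revision pairs Bob's new header with Alice's old-style call for the first time. Had Alice tried to commit util.h instead, the check would have refused.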

  4. No they don't. by SanityInAnarchy · · Score: 4, Informative

    Each tool emphasizes a distinct approach to working and collaboration, which in turn influences how the team works.

    Ok, yes, some tools do. For example, subversion supports trivial branching, but sucks at merging, so it encourages people to work on a common "trunk" branch. It also only supports a central server, so it "encourages" developing with a central server.

    Git, on the other hand, "encourages" people to not put multi-gigabyte files in version control.

    However, Git can be used to talk to an SVN repository. It can also talk to a central repository, or work purely via ssh between workstations, or with something like Gitjour, in a truly distributed fashion. Github is a strange and wonderful mutation of the two.

    Perhaps, by making branches and merges so awesomely fast, Git "encourages" lots of little local branches, and keeping a neat patch history. But to sum it up:

    SVN can handle large binary files and Windows better than Git, and is better integrated into IDEs.

    Git is better at everything else, ever. Seriously -- 99% of projects that are hosted on SVN would make more sense on Git.

    --
    Don't thank God, thank a doctor!
    1. Re:No they don't. by russotto · · Score: 4, Interesting

      Git is better at everything else, ever. Seriously -- 99% of projects that are hosted on SVN would make more sense on Git.

      When I first looked at git, it wasn't even clear how simple revision control tasks could be done, e.g. simply checking in a file, or reverting changes to it. Looked like you had to deal with bizarre syntax and long hex numbers for the simplest things (and it's not just because it's distributed, as mercurial was much more straightforward). I assume that's changed as people aside from Linus actually use the thing, but it was very off-putting in the beginning.

    2. Re:No they don't. by SanityInAnarchy · · Score: 4, Informative

      It has changed, somewhat -- but mostly, I think there's just better documentation.

      But, for example...

      Looked like you had to deal with bizarre syntax and long hex numbers for the simplest things

      That is pretty fundamental to the design -- it's a SHA1 hash. It's also not incredibly difficult -- cut and paste. When your SVN revisions hit four and five digits, they don't really have much more meaning than that hash, do they?
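
      For the curious, the hex number is nothing more exotic than a hash of the object's content plus a small header. A minimal Python sketch, assuming a repository using Git's SHA-1 object format, reproduces the ID Git gives to a file's contents (a "blob"); the function name git_blob_id is made up for this example:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a header of the form "blob <size>\0" followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
print(empty_id)      # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(empty_id[:7])  # the abbreviated form you usually type or paste
```

      Commit IDs are built the same way from the commit's own content, which is why identical content always gets the same name on every machine.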

      Generally, you learn to use relative terms, instead -- for example, HEAD^ to refer to the revision just behind HEAD.

      mercurial was much more straightforward

      I thought so, too...

      I think I tried mercurial, and then bzr, and eventually settled on Git for three reasons:

      1. It's obscenely fast
      2. Everyone's doing it, which has a network effect (github)
      3. I can hold its data model comfortably in my head.

      I should clarify that last part... Maybe some things are cryptic, and I'm sure I don't know all of the possible commands I could run -- but at a very basic level, I know exactly what's going on, just like I did in SVN.

      Just for fun, here's the data model in a paragraph: There are commits. Each commit has a parent commit that it includes, except for merges, which have two parents. A branch is just a pointer to a commit.

      That's it.

      And knowing that, everything else starts to make sense... but it's more than I want to get into in a Slashdot post.
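
      The paragraph-sized data model above fits in a few lines of Python. This is only a sketch of the idea, not how Git stores anything; the Commit class, the first_parent helper and the branch names are invented for illustration, and the very first commit is assumed to have no parent:

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Commit:
    message: str
    parents: Tuple["Commit", ...] = ()  # empty for the root, two for a merge


def first_parent(commit: Commit, steps: int = 1) -> Commit:
    """Follow first-parent links, in the spirit of HEAD^ (one step back)."""
    for _ in range(steps):
        commit = commit.parents[0]
    return commit


root = Commit("initial import")
fix = Commit("fix typo", (root,))
feature = Commit("add feature", (root,))
merge = Commit("merge feature", (fix, feature))  # a merge has two parents

# A branch is just a name pointing at a commit.
branches: Dict[str, Commit] = {"master": merge, "feature": feature}
head = branches["master"]
print(first_parent(head).message)  # fix typo

# Committing on a branch creates a new commit and moves the pointer.
branches["master"] = Commit("bump version", (head,))
```

      Nothing is ever edited in place here: new commits point back at old ones, and only the branch pointers move.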

      --
      Don't thank God, thank a doctor!
  5. TortoiseSVN by ImustDIE · · Score: 4, Insightful

    I am a bit jealous of some Git features, but the place I work uses SVN, as do I for my personal projects, for one big reason: TortoiseSVN. It is a great interface to version control, and many of the people who need to contribute (probably the majority) are not programmers and have no idea about command line interfaces, ssh, branching, merging, etc.

    I am aware of TortoiseGit, but it has not reached a stable release, so it is not up for consideration in a serious environment.

    There are other things to keep in mind too; SVN is much more tailored to our repo structure than Git, so that's a big plus for SVN -- at least for us.

  6. No mention of ClearCase? by gillbates · · Score: 3, Informative

    What I find interesting is there's no mention of ClearCase. Maybe the author is unaware of it, or considers it obsolete? Then again, the author didn't seem that experienced with the debacles into which one can get with revision control SW. The example he posits is the least of the problems which can crop up.

    I've used both ClearCase and CVS. First, CVS:

    1. I instinctively save files. And this is a bad thing to do with CVS; when I do a commit, my otherwise unchanged file can overwrite another engineer's more recent changes because I happened to save the file at a later date than him. The interesting thing is that this is not immediately apparent to either of us until we check out a fresh copy of the repository and he notices his changes are gone. And then I'm listed as the last modifier, and he comes to me...
    2. You can't (or shouldn't) copy one directory to another within a source tree. Nor should you do it between repositories. CVS will commit your changes to the copied directory back to the original repository, unless you delete all of the CVS folders. This little quirk cost a few of my colleagues a few hours of debugging to figure out why their changes kept disappearing...
    3. CVS does not (or did not when I used it) enforce strict version control protocol. I can commit an entire repository back to mainline even if I have outdated files. Even if others have made more recent updates. I didn't know this was happening for a good few months of use...

    Now for ClearCase

    1. ClearCase can manage extraordinarily large codebases spread across several geographical locations.
    2. It can be integrated with version control and bug tracking databases.
    3. It allows two or more developers to work on the same file at the same time, with the last one to commit having to perform a manual merge *only when there are conflicts*. Most of the time, it gets the merges right.
    4. With proper tagging procedures, I can always reproduce the last build bit-exact. No matter how badly an engineer subsequently mangles the codebase, I can always build from the last tag. My impending release can't be sabotaged by another developer committing code-breaking-but-it-compiles-on-my-machine-oh-silly-me-I-forgot-the-headers kind of changes.
    5. It does have problems with cache-coherency. Modifying files on machines other than the build machine may end up with stale files being linked...
    6. It has dynamic views, which don't require a full copy of the source tree on the local machine. There are some big advantages to this, among them being not having to worry so much about the theft of a developer's laptop, and using the server's storage pool for building, rather than the local hard disk. From a developer perspective, it is nice not to have to wait an hour or so for the repository download should I need to make a change to an older codebase. I can work on multiple versions of the same code base at the same time, without having to maintain a separate local copy of the entire tree for each of them.
    7. Managing ClearCase is an administrative position. Yes, it is exceedingly complex.
    8. Suppose I merge several bug fixes for a build. And later, one of those fixes needs to be backed out (didn't fix the problem, conflicts with other SW, etc...). I can do that with ClearCase rather easily, without having to reconstruct all of the interim versions between the two.
    9. I can apply the same bugfix to two different branches of a source tree without checking out and modifying both branches. That is, I can check the changes into one branch, and merge them into another branch (or just pick them up) without having to checkout the repository from the other branch.

    Now, granted, a lot of FOSS products are not trying to be SEI level 5*. They don't have to demonstrate a repeatable process. They often don't incorporate bug fixes into older releases, or maintain several concurrent branches of the same codebase. It is also important to show which

    --
    The society for a thought-free internet welcomes you.
    1. Re:No mention of ClearCase? by Ztream · · Score: 3, Insightful

      I use Subversion on a daily basis, and I believe everything positive you said about ClearCase holds true for Subversion, except point 9. There are some philosophical objections to 9 (you should test the resulting code before committing it anyway), but I don't know if it's a design decision or a missing feature.

      That's not to say that Subversion doesn't have problems of its own though, but using CVS as a representation of the state of version control systems is like judging proprietary software on the basis of Windows 95.

    2. Re:No mention of ClearCase? by ztransform · · Score: 3, Insightful

      Essentially everyone who knows anything about modern version control considers CVS obsolete.

      Clarification: everyone who thinks they know everything about modern version control considers CVS obsolete.

      CVS still has advantages, in my opinion:
      - simple underlying storage structure that any administrator can understand
      - ability to simply administratively repair obscene or damaging check ins (investigate the cvs admin -o command, few other version control systems can do this)
      - simple file numbering scheme

      At the end of the day your needs may be more complex (regular branching, regular directory moves, etc.) but in some commercial situations simplicity and ease of administration can be valuable points (and I think they often outweigh the perceived benefits of SVN).

      As for Git, with its steep learning curve of at least a week: sometimes you have not just programmers contributing to a project but front-end designers and template producers who have never seen a version control system in their lives. Subjecting them to Git can be both cruel and potentially uneconomic, particularly if they are all on short-term contracts.

  7. Savana - transactional workspaces on top of SVN by SashaMan · · Score: 3, Interesting

    Friends of mine have open-sourced savana (http://savana.codehaus.org/), a thin layer on top of Subversion that makes it easy to do all work in private branches before promoting to the trunk. A common workflow is:

    sav createuserbranch mybranch -- calls svn copy under the covers to create a user branch named mybranch
    ... normal checkins using svn commit go to YOUR private branch ...
    ... when you are ready to promote your changes back to the trunk:
    sav sync -- pulls in any changes made to trunk since your private branch was created so you can test locally
    sav promote -- merges your changes back into the trunk

    The thing I like about this thin "workspace managing" layer on top of Subversion is that for the most part you can take advantage of existing tool support for Subversion (like integrated IntelliJ IDEA and Eclipse support), as all of the savana commands just call svn commands under the covers.
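
    To make the branch / sync / promote cycle concrete, here is a toy Python model of it. Everything in it is an assumption made for illustration: the Workspace class and its methods are not Savana's code or API, file deletions are ignored, and the rule that promote refuses to run when trunk has moved since the last sync is this sketch's reading of the workflow above.

```python
class Workspace:
    """Toy stand-in for a private user branch; not Savana's real implementation."""

    def __init__(self, trunk):
        self.trunk = trunk        # shared dict: path -> content
        self.base = dict(trunk)   # trunk snapshot taken at branch or sync time
        self.files = dict(trunk)  # contents of the private branch

    def commit(self, path, content):
        # Ordinary check-ins land on the private branch only.
        self.files[path] = content

    def sync(self):
        """Pull in trunk changes made since the branch was created or last synced."""
        for path, content in self.trunk.items():
            if content == self.base.get(path):
                continue  # trunk did not change this file
            mine_changed = self.files.get(path) != self.base.get(path)
            if mine_changed and self.files.get(path) != content:
                raise RuntimeError(f"conflict in {path}")
            self.files[path] = content
        self.base = dict(self.trunk)

    def promote(self):
        """Publish the branch to trunk, refusing if trunk moved since the last sync."""
        if self.trunk != self.base:
            raise RuntimeError("trunk has changed; sync and retest first")
        self.trunk.clear()
        self.trunk.update(self.files)
        self.base = dict(self.trunk)


trunk = {"a.txt": "1", "b.txt": "1"}
mine = Workspace(trunk)    # roughly: sav createuserbranch mybranch
mine.commit("a.txt", "2")  # private check-in
trunk["b.txt"] = "2"       # meanwhile, someone else's change reaches trunk
mine.sync()                # roughly: sav sync, then test locally
mine.promote()             # roughly: sav promote
print(trunk)
```

    The point of the design is that the combined state (your change plus everyone else's) exists in your private branch, where you can test it, before it ever appears on the trunk.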