Slashdot Mirror


Making Sense of Revision-Control Systems

ChelleChelle writes "During the past half-decade there has been an explosion of creativity in revision-control software, complicating the task of determining which tool to use to track and manage the complexity of a project as it evolves. Today, leaders of teams are faced with a bewildering array of choices ranging from Subversion to the more popular Git and Mercurial. It is important to keep in mind that whether distributed or centralized, all revision-control systems come with a complicated set of trade-offs. Each tool emphasizes a distinct approach to working and collaboration, which in turn influences how the team works. This article outlines how to go about finding the best match between tool and team."

6 of 268 comments

  1. Errata by kabloom · · Score: 5, Informative

    Because Subversion offers working out of a shared branch as the path of least resistance, developers tend to do so blindly without understanding the risk they face. In fact, the risks are even subtler: suppose that Alice's changes do not textually conflict with Bob's; she will not be forced to check out Bob's changes before she commits, so she can commit her changes to the server unimpeded, resulting in a new tree state that no human has ever seen or tested.

    This statement is incorrect. Subversion requires you to update your working copy before committing whenever you have modified a file that has changed in the repository.
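    The check under discussion can be modelled in a few lines. This is a toy sketch, not Subversion's code: the `Repo`, `WorkingCopy` and `OutOfDate` names are invented for illustration, and it assumes the out-of-date check is applied per modified file, which is how the comment above words it. Under that assumption a commit touching the same file as a newer revision is rejected, while a commit touching a different file goes through without an update.

```python
class OutOfDate(Exception):
    """Raised when a modified file has a newer revision in the repository."""


class Repo:
    """Toy central repository (hypothetical model, not Subversion's API)."""

    def __init__(self, files):
        self.revision = 1
        self.files = dict(files)                    # path -> content
        self.last_changed = {p: 1 for p in files}   # path -> revision of last change

    def checkout(self):
        return WorkingCopy(self)

    def commit(self, base_revs, changes):
        # Assumption: only the files being committed are checked for staleness.
        for path in changes:
            if self.last_changed[path] > base_revs[path]:
                raise OutOfDate(path)
        self.revision += 1
        for path, content in changes.items():
            self.files[path] = content
            self.last_changed[path] = self.revision
        return self.revision


class WorkingCopy:
    def __init__(self, repo):
        self.repo = repo
        self.files = dict(repo.files)
        self.base_revs = dict(repo.last_changed)    # revision each file was fetched at
        self.modified = set()

    def edit(self, path, content):
        self.files[path] = content
        self.modified.add(path)

    def commit(self):
        changes = {p: self.files[p] for p in self.modified}
        rev = self.repo.commit(self.base_revs, changes)
        for p in changes:
            self.base_revs[p] = rev
        self.modified.clear()
        return rev


repo = Repo({"a.c": "original", "b.c": "original"})
alice, bob = repo.checkout(), repo.checkout()

bob.edit("b.c", "bob's change")
bob.commit()                          # becomes revision 2

alice.edit("a.c", "alice's change")
alice.commit()                        # revision 3, accepted without an update

# The repository tree now differs from every working copy that produced it.
untested_tree = repo.files != alice.files and repo.files != bob.files

# Bob now edits a.c, which changed in revision 3: this commit is refused.
bob.edit("a.c", "bob's second change")
try:
    bob.commit()
    rejected = False
except OutOfDate:
    rejected = True
```

    In this model the parent post's rule holds for the second commit (same file), while the article's scenario corresponds to the first one (different files, no textual overlap).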

  2. No they don't. by SanityInAnarchy · · Score: 4, Informative

    Each tool emphasizes a distinct approach to working and collaboration, which in turn influences how the team works.

    Ok, yes, some tools do. For example, Subversion supports trivial branching, but sucks at merging, so it encourages people to work on a common "trunk" branch. It also only supports a central server, so it "encourages" developing with a central server.

    Git, on the other hand, "encourages" people to not put multi-gigabyte files in version control.

    However, Git can be used to talk to an SVN repository. It can also talk to a central repository, or work purely via ssh between workstations, or with something like Gitjour, in a truly distributed fashion. Github is a strange and wonderful mutation of the two.

    Perhaps, by making branches and merges so awesomely fast, Git "encourages" lots of little local branches, and keeping a neat patch history. But to sum it up:

    SVN can handle large binary files and Windows better than Git, and is better integrated into IDEs.

    Git is better at everything else, ever. Seriously -- 99% of projects that are hosted on SVN would make more sense on Git.

    --
    Don't thank God, thank a doctor!
    1. Re:No they don't. by SanityInAnarchy · · Score: 4, Informative

      It has changed, somewhat -- but mostly, I think there's just better documentation.

      But, for example...

      Looked like you had to deal with bizarre syntax and long hex numbers for the simplest things

      That is pretty fundamental to the design -- it's a SHA1 hash. It's also not incredibly difficult -- cut and paste. When your SVN revisions hit four and five digits, they don't really have much more meaning than that hash, do they?

      Generally, you learn to use relative terms, instead -- for example, HEAD^ to refer to the revision just behind HEAD.

      mercurial was much more straightforward

      I thought so, too...

      I think I tried mercurial, and then bzr, and eventually settled on Git for three reasons:

      1. It's obscenely fast
      2. Everyone's doing it, which has a network effect (github)
      3. I can hold its data model comfortably in my head.

      I should clarify that last part... Maybe some things are cryptic, and I'm sure I don't know all of the possible commands I could run -- but at a very basic level, I know exactly what's going on, just like I did in SVN.

      Just for fun, here's the data model in a paragraph: There are commits. Each commit has a parent commit that it includes, except for merges, which have two parents. A branch is just a pointer to a commit.

      That's it.
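      The paragraph above maps almost directly onto code. What follows is a minimal sketch of that model only, not of Git's internals: the `Commit` and `Repo` classes and their methods are hypothetical names, commits are identified by message rather than by SHA1 hash, the very first commit is assumed to have no parent, and `merge` always creates a two-parent commit. `resolve` shows how a relative name such as HEAD^ is just "follow the first parent once".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple = ()   # no parents for the first commit, two for a merge


class Repo:
    """Toy model: commits plus named pointers (hypothetical, not Git's API)."""

    def __init__(self):
        self.branches = {"master": Commit("initial")}   # branch name -> commit
        self.head = "master"                            # currently checked-out branch

    def commit(self, message):
        new = Commit(message, (self.branches[self.head],))
        self.branches[self.head] = new                  # the branch pointer moves forward
        return new

    def branch(self, name):
        self.branches[name] = self.branches[self.head]  # a new pointer, nothing is copied

    def checkout(self, name):
        self.head = name

    def merge(self, other):
        new = Commit("merge " + other,
                     (self.branches[self.head], self.branches[other]))
        self.branches[self.head] = new
        return new

    def resolve(self, ref):
        """Resolve 'HEAD', a branch name, or either followed by one or more '^'."""
        name = ref.rstrip("^")
        steps = len(ref) - len(name)
        commit = self.branches[self.head if name == "HEAD" else name]
        for _ in range(steps):
            commit = commit.parents[0]                  # '^' follows the first parent
        return commit


repo = Repo()
repo.commit("add parser")
repo.branch("topic")
repo.checkout("topic")
repo.commit("fix typo")
repo.checkout("master")
repo.commit("update docs")
merge_commit = repo.merge("topic")

print(repo.resolve("HEAD").message)     # merge topic
print(repo.resolve("HEAD^").message)    # update docs
print(repo.resolve("HEAD^^").message)   # add parser
```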

      And knowing that, everything else starts to make sense... but it's more than I want to get into in a Slashdot post.

      --
      Don't thank God, thank a doctor!
  3. Re:Git and Mercurial? by Vanders · · Score: 3, Informative

    All you have to do is set up an extra server and say "Hey, this is the central server now".

    Yeah. I know. In fact I did just that at my last job when we implemented Mercurial. The problem is training developers to push their local changesets to the central repository, and stopping developers from pulling from someone else instead of the central repository. There was at least one incident a week where a conflict arose because developers did things like that. The result was divergent codebases, which required significant effort from one of the developers to merge and fix conflicts. I have no doubt these problems could have been fixed given time, but it was an uphill battle.

  4. No mention of ClearCase? by gillbates · · Score: 3, Informative

    What I find interesting is there's no mention of ClearCase. Maybe the author is unaware of it, or considers it obsolete? Then again, the author didn't seem that experienced with the debacles into which one can get with revision control SW. The example he posits is the least of the problems which can crop up.

    I've used both ClearCase and CVS. First, CVS:

    1. I instinctively save files. And this is a bad thing to do with CVS; when I do a commit, my otherwise unchanged file can overwrite another engineer's more recent changes because I happened to save the file at a later date than him. The interesting thing is that this is not immediately apparent to either of us until we check out a fresh copy of the repository and he notices his changes are gone. And then I'm listed as the last modifier, and he comes to me...
    2. You can't (or shouldn't) copy one directory to another within a source tree. Nor should you do it between repositories. CVS will commit your changes to the copied directory back to the original repository, unless you delete all of the CVS folders. This little quirk cost a few of my colleagues a few hours of debugging to figure out why their changes kept disappearing...
    3. CVS does not (or did not when I used it) enforce strict version control protocol. I can commit an entire repository back to mainline even if I have outdated files. Even if others have made more recent updates. I didn't know this was happening for a good few months of use...

    Now for ClearCase

    1. ClearCase can manage extraordinarily large codebases spread across several geographical locations.
    2. It can be integrated with version control and bug tracking databases.
    3. It allows two or more developers to work on the same file at the same time, with the last one to commit having to perform a manual merge *only when there are conflicts*. Most of the time, it gets the merges right.
    4. With proper tagging procedures, I can always reproduce the last build bit-exact. No matter how badly an engineer subsequently mangles the codebase, I can always build from the last tag. My impending release can't be sabotaged by another developer committing code-breaking-but-it-compiles-on-my-machine-oh-silly-me-I-forgot-the-headers kind of changes.
    5. It does have problems with cache-coherency. Modifying files on machines other than the build machine may end up with stale files being linked...
    6. It has dynamic views, which don't require a full copy of the source tree on the local machine. There are some big advantages to this, among them being not having to worry so much about the theft of a developer's laptop, and using the server's storage pool for building, rather than the local hard disk. From a developer perspective, it is nice not to have to wait an hour or so for the repository download should I need to make a change to an older codebase. I can work on multiple versions of the same code base at the same time, without having to maintain a separate local copy of the entire tree for each of them.
    7. Managing ClearCase is an administrative position. Yes, it is exceedingly complex.
    8. Suppose I merge several bug fixes for a build. And later, one of those fixes needs to be backed out (didn't fix the problem, conflicts with other SW, etc...). I can do that with ClearCase rather easily, without having to reconstruct all of the interim versions between the two.
    9. I can apply the same bugfix to two different branches of a source tree without checking out and modifying both branches. That is, I can check the changes into one branch, and merge them into another branch (or just pick them up) without having to checkout the repository from the other branch.
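    Point 3 in the list above (manual work only when there are conflicts) is the usual three-way merge rule. The sketch below is a generic illustration and makes no claim about how ClearCase implements it. The `three_way_merge` function is a made-up name, and it assumes edits never add or remove lines so that the three versions can be compared line by line; a real tool aligns the versions with a diff first.

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited versions of a file against their common ancestor.

    Returns (merged_lines, conflicting_line_indexes). A conflicting line is
    left as None in merged_lines for a human to resolve.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles edits that keep the line count")

    merged, conflicts = [], []
    for index, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:            # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:          # only the other developer changed it
            merged.append(t)
        elif t == b:          # only I changed it
            merged.append(m)
        else:                 # both changed it, differently: manual merge
            merged.append(None)
            conflicts.append(index)
    return merged, conflicts


base = ["int x = 1;", "int y = 2;", "return x + y;"]

# Two developers touch different lines: merged automatically.
mine = ["int x = 10;", "int y = 2;", "return x + y;"]
theirs = ["int x = 1;", "int y = 2;", "return x * y;"]
clean_result, clean_conflicts = three_way_merge(base, mine, theirs)

# Both developers change the same line differently: flagged for a manual merge.
mine_2 = ["int x = 1;", "int y = 3;", "return x + y;"]
theirs_2 = ["int x = 1;", "int y = 4;", "return x + y;"]
conflict_result, conflict_lines = three_way_merge(base, mine_2, theirs_2)
```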

    Now, granted, a lot of FOSS products are not trying to be SEI level 5*. They don't have to demonstrate a repeatable process. They often don't incorporate bug fixes into older releases, or maintain several concurrent branches of the same codebase. It is also important to show which

    --
    The society for a thought-free internet welcomes you.
  5. Re:Git and Mercurial? by orzetto · · Score: 4, Informative

    [Subversion's] designers' absolute refusal to support deleting contents from the repository [is bad]

    That is one great feature of Subversion: absolutely no way to screw up stuff that was committed. Revision control is about keeping track of stuff; any model that allows a user to remove information from a repository is a disaster quietly waiting to happen. Sorry you did not understand that.

    If you absolutely need to remove something from a SVN repository, you can do that with svndumpfilter, meaning you have to ask the repository's administrator. That's a good safeguard against accidental deletions.

    "throwing useless things away makes cleaner code"

    For "cleaner code" you just need svn delete.

    --
    Victims of 9/11: <3000. Traffic in the US: >30,000/y