Slashdot Mirror


Linus on GIT and SCM

An anonymous reader sends us to a blog posting (with the YouTube video embedded) about Linus Torvalds' talk at Google a few weeks back. Linus talked about developing GIT, the source control system used by the Linux kernel developers, and exhibited his characteristic strong opinions on subjects around SCM, by which he means "Source Code Management." SCM is a subject that coders are either passionate about or bored by. Linus appears to be in the former camp. Here is his take on Subversion: "Subversion has been the most pointless project ever started... Subversion used to say, 'CVS done right.' With that slogan there is nowhere you can go. There is no way to do CVS right."

2 of 392 comments (clear)

  1. Re:git by Anonymous Coward · · Score: 5, Interesting

    He is only human. Just because he is the head of a huge software project doesn't make him infallible.

    Just look at the whole 'RMS vs Linus' thing.

    His opinions should carry some weight, especially since he should know more than anyone what the limitations of SCM software is when it comes to larger projects like the linux kernel. But a lot of SCM comes down to the way a project is managed, the preferences of the people involved, and how they deal with their project. I doubt there is a blanket solution... a 'one SCM package to rule them all' so to speak.

    Especially in the software industry you can always find someone just as good as yourself that strongly holds opinions that are the polar opposite of yours.

  2. git is pretty cool, take a closer look by Anonymous Coward · · Score: 5, Interesting

    I took a look at git a while ago and was completely underwhelmed. The UI was so bad it was useless, and it didn't "seem" to do anything that Darcs didn't do. (I used to love Darcs because of the automatic patch dependency computations).

    Now that all the "next generation" SCM tools have matured somewhat, I took a look at all of them again. I had to stop using Darcs because of the "patch of death" problem, which basically is this: after using Darcs on a project with long-lived parallel branches, the repository may eventually enter a wedged state you can't get out of, due to exponentially complex patch dependencies. Oops.

    At this point I had an idea of what an SCM should do, how it should work, what the "mental model" should be. I want to create changesets, add them to branches, combine multiple branches (and keep track of renames and so forth between branches), re-order changesets, collapse multiple changesets into one, discard old branches, etc.

    Of course, CVS and close cousin Subversion are SO UTTERLY USELESS I didn't even consider them. Seriously, Subversion is like gold-plated shit. Looks nice but it's still shit. Reading people say stuff like "Subversion is awesome" makes me wince. How can something that doesn't have "real" branches, and doesn't have tags OF ANY KIND, be useful for anything? How do you keep track of multiple merges between branches? Answer: you don't. Or you keep track of revision numbers using svnmerge and pray it all works. Even the Subversion docs sortof hand-wave this away. I.e., they hand-wave away one of the FUNDAMENTAL ASPECTS of source code management: branching and merging. It's like hearing people talk about OO databases. They mean well but they just don't comprehend the generality of the underlying problem.

    That's why I was so excited about Darcs: the author "gets it". Unfortunately the implementation is flawed.

    I checked out a few more (Mercurial, bzr) but finally settled on git because it let me do all the things I needed to do, and it did them FAST. Once I figured out the underlying model I was pretty impressed. Git can be viewed at many levels: very low-level plumbing, or UI-level, or in between. The UI and documentation is still pretty shitty, but thankfully they are working on improving it and are moving away from the idea of having interchangeable UIs. Just focus on improving "core git".

    One great thing about git is that so much of it is just files in the .git dir and shell scripts that combine very simple low-level functions. For instance, you can create a branch just by saving the SHA1 ID of the tip into a file in .git. You can branch off any point in the history this way, including branches you've deleted in the past (git keeps all the old commit objects by default, even ones that aren't pointed to by any branch or tag.. this is very simple and understandable model, like reference-counting in a way).

    The other great thing about git is how easy it is to sling changes around and reorder them and combine them. For instance let's say you add a file to your project as commit "A". Then you add some code that uses this file as commit "B". Then you fix a bug in the file as commit "C". So you have A-B-C. Now you'd like to combine A and C into a single patch A', and put B on top of it, like this: A'-B. In git, this is super-easy. I can think of two ways to do it off the top of my head.

    I was checking into a CVS project the other day (for a client) and wanted to do this. Then I realized, you can't move things around in CVS like this *twitch*. So nowdays I do everything in git and only after the changes are beautiful and self-contained and well-commented do I check them into CVS one at a time.

    Okay so they point is, check out git (or honestly? Checkout out ANYTHING that isn't CVS or svn). Even if you think Linus is an asshole (which he is) or you don't like the git UI (it's not that bad now), check it out anyway.

    And if you don't use SCM at all? You suck. Start learning. It's a best practice that you can't live without, once you start.