Linus On Branching Practices

Posted by CmdrTaco on Tuesday November 30, 2010 @05:18AM from the not-the-wood-kind dept.

rocket22 writes "Not long ago Linus Torvalds made some comments about issues the kernel maintainers were facing while applying the 'feature branch' pattern. They are using Git (which is very strong with branching and merging) but still need to take care of the branching basics to avoid ending up with unstable branches due to unstable starting points. While most likely your team doesn't face the same issues that kernel development does, and even if you're using a different DVCS like Mercurial, it's worth taking a look at the description of the problem and the clear solution to be followed in order to avoid repeating the same mistakes. The same basics can be applied to every version control system with good merge tracking, so let's avoid religious wars and focus on the technical details."

Re:RELIGIOUS? by Anonymous Coward · 2010-11-30 05:22 · Score: 4, Funny

Agreed. Except when it comes to mercurial which is the sux0rs.
Yeah by truthsearch · 2010-11-30 05:23 · Score: 3, Funny

so let's avoid religious wars and focus on the technical details
Hahahaha... Good one!

--
Developers: We can use your help.
1. Re:Yeah by Monkeedude1212 · 2010-11-30 05:31 · Score: 5, Funny
  
  I know. Linus is such a Linux Fanboy. It's so obvious.
2. Re:Yeah by Anonymous Coward · 2010-11-30 05:59 · Score: 4, Funny
  
  him and his blanket
This all sounds complicated by Anrego · 2010-11-30 05:24 · Score: 4, Insightful

Which I imagine makes sense, as the kernel is very complicated from a dev standpoint.
For most projects I’ve been involved with, the path to success is keeping the trunk in a stable state, and using _that_ as the baseline. Dev code should never be in the trunk imo... the trunk should always be in a ready to release (or proceed to formal testing, or whatever) state. Everyone branches from the trunk.. everyone can update their branch to the latest trunk.. and everyone merges back down into the trunk when it’s good and ready.
Resisting the temptation to make “quick fixes” in the trunk is also important. Additionally, dev platforms should be set up so the system can be run from any branch as easily as from the trunk (making it a pain to test out the system from a branch is a great way to ensure unstable code ends up in your trunk).
Obviously in the case of the kernel.. they probably have branches off branches off branches, but I think for most reasonably sized projects, that shouldn’t be necessary.
1. Re:This all sounds complicated by assemblyronin · 2010-11-30 05:37 · Score: 3, Insightful
  
  I think you actually restated the point that Linus made in the original thread. Which was: Don't branch and start new development from an unknown state.
  For you, the stable baseline is equal to the trunk. For Linus, the stable baseline is equal to the labeled release build node.
2. Re:This all sounds complicated by Anonymous Coward · 2010-11-30 05:48 · Score: 2, Insightful
  
  You merge several branches together into an "integration" branch, then test that and merge it to the trunk if it passes.
3. Re:This all sounds complicated by gstoddart · 2010-11-30 06:05 · Score: 4, Informative
  
  For most projects I’ve been involved with, the path to success is keeping the trunk in a stable state, and using _that_ as the baseline. Dev code should never be in the trunk imo... the trunk should always be in a ready to release (or proceed to formal testing, or whatever) state. Everyone branches from the trunk.. everyone can update their branch to the latest trunk.. and everyone merges back down into the trunk when it’s good and ready.
  He's also saying that everybody should branch from the exact same point along the branch or trunk. That way everybody has a set of diffs against the same baseline to merge back in.
  If you always branch from trunk, then as more stuff gets added, each new branch starts from a different point than the ones before it.
  The specifically labeled "point in time" means that three separate changes can more readily be integrated as they'll be all from the exact same baseline.
  If the trunk is ready for formal testing, and it affects your other branches, you have a harder time if you fix things and need to push them back into those branches.
  
  --
  Lost at C:>. Found at C.
4. Re:This all sounds complicated by MtHuurne · 2010-11-30 06:32 · Score: 4, Interesting
  
  I think that the development process should be selected to match the particular project and the stage it is in. There is no perfect process that applies to every project, or even to one project forever. A team of 4 in a single room working on a demo for a new product idea will have very different requirements from a team of 20 working in two locations on an improved version of a product that is already in production...
  
  There are two conflicting goals: to avoid breaking the main branch (trunk) and to get changes out to the other developers soon. A broken main branch wastes the time of other developers on the project. But integrating changes late has its own inefficiencies: Problems in the modifications will only be raised after the work is done. It is more likely for one set of modifications to conflict with another set if both are being developed in parallel for a longer time. Other developers might have to wait for a full set of changes to arrive while they only need a subset, or they might start merging the subset from each other's development branches, creating a confusing mix of versions.
  
  Committing directly into trunk can be acceptable and even desirable depending on the project. It depends on how likely commits are to break the code: How many developers are there? How many mistakes do they make? (a combination of experience and carefulness) Is there decent test coverage before committing? How fragile is the code base; are there many unexpected side effects? And it depends on how much damage a broken main branch does: How long does it typically take to find and fix a problem? How modular is the code base: will a bug in one part be a nuisance to developers working on another part? And it also depends on how much there is to gain from early merging: Is the project in the start-up phase where it is likely that other developers are waiting for new core functionality, or is the code base mature and are most changes done on the edges of the program? Are all design decisions made before code is written or are developers doing design and implementation work at the same time?
5. Re:This all sounds complicated by i_ate_god · 2010-11-30 06:42 · Score: 3, Interesting
  
  I go in the reverse.
  Trunk is dev, branches are stable. We haven't had much trouble with this setup at all.
  
  --
  I'm god, but it's a bit of a drag really...
6. Re:This all sounds complicated by kbielefe · 2010-11-30 07:00 · Score: 3, Informative
  
  I hate to break it to you, but even if your trunk is clean, you will still have this problem in some other branch. Let's examine a very common situation where you have an interface being changed, one or more implementations of that interface, and one or more users of that interface. Developers are working simultaneously on both sides of that interface in order to meet a deadline.
  Because of your clean trunk rule, none of the changes can be checked into the trunk until all of the changes are ready, but they still need to be shared among the people working on it, or they will have no idea if it is "good and ready." So those developers create their own branch, which of necessity is sometimes in a temporarily broken state. You might not think of it as a branch, if it's John's working directory and the "checkout" procedure is him emailing files around, but it's conceptually a branch nonetheless.
  Linus is simply acknowledging that temporary brokenness is inevitable when multiple people integrate changes to the same code, and therefore whatever branch contains that messy integration should use tags to communicate the best branch points. I'm not saying keeping a clean trunk isn't a good idea, just that you have to deal with broken branch points one way or another, even if it's just John deciding when the best time is to email out the new header files to his team.
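As a rough illustration of using tags to communicate branch points, here is a minimal Python sketch. It is not how git or any other SCM implements this; the `Commit` type, the `latest_stable_point` function, and the sample history are hypothetical names invented for the example. The idea is only that new work starts from the newest tagged commit rather than from whatever happens to be at the tip:

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Commit:
    sha: str
    # A tag marks a commit as a known-good branch point (for example a
    # release label). Intermediate integration commits carry no tag.
    tag: Optional[str] = None


def latest_stable_point(history: Sequence[Commit]) -> Optional[Commit]:
    """Return the newest tagged commit in a history ordered oldest to newest.

    Returns None when nothing has been tagged yet, in which case there is
    no agreed-upon stable baseline to branch from.
    """
    for commit in reversed(history):
        if commit.tag is not None:
            return commit
    return None


# Hypothetical history: two tagged baselines with messy integration
# commits in between and after them.
history = [
    Commit("c1", tag="stable-1"),
    Commit("c2"),
    Commit("c3", tag="stable-2"),
    Commit("c4"),  # mid-integration, possibly broken
    Commit("c5"),  # tip of the branch, state unknown
]

branch_point = latest_stable_point(history)
```

With this sample history, `branch_point` is the commit tagged `stable-2`, not the untagged tip `c5`, so everyone starting new work at that moment shares the same baseline.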
  
  --
  This space intentionally left blank.
7. Re:This all sounds complicated by Anonymous Coward · 2010-11-30 07:27 · Score: 3, Interesting
  
  We did this where I worked previously too. It was also the MO for the artists building the artwork for the game.
  Your trunk is the "Main line", a boiling pot of all the changes, and can change on a minute-by-minute basis right near crunch time. This is good because you fail early if your change is not compatible with other changes, instead of at the end of the day or whatever. This is very important for artwork.
  The last known good build is tagged/labelled (or branched, if you prefer) and was generated either by an automated build process that ran tests or by the lead/QA department.
  Also, an admin user (project manager, lead developer etc) could lock the whole project with exceptions for themselves or other specified users (automated build machine user etc), do a build and test cycle, and then mark it as a known good release.
  The source control system for the artists also allowed changes to be kept separate from the "Main Line" but kept in source control; this is like a branch if you will, but with the difference that you could allow other users or custom defined groups access to your WIP changes. This in effect is like being able to get the latest of the trunk and changes from other users' branches at once.
  Very useful when a group of artists (3D modeller, texturer, animator) is working on a new area, as they can refine it without checking in non-working bits to the main line everyone else is working on.
Re:ClearCase solved these problems years ago by assemblyronin · 2010-11-30 05:45 · Score: 2, Interesting

Never underestimate the stupidity of some people. I've seen some VOBS get royally hosed and take a day or two to go through the version-trees of individual elements to untangle their merge history. This was all due to two things: 1) OzPeter's Point 2) Lazy CM that didn't want to provide simple scripts and lock down a standard method for view/config-spec management.
Isn't this kind of obvious? by syousef · 2010-11-30 05:49 · Score: 2, Insightful

The whole story seems to be summed up by: "Don't just branch from some random point. Wait until your code is stable and branch from that." and "Create these stable points from which you can branch as often as is practical". I'm sorry but I've never been tempted to branch from an unstable point, and I'd be horrified if anyone on my team tried to do so.
As for only adding features to a stable release I find that depends on the size, complexity and maturity of the project. Early on nothing is feature complete and everyone tends to work on an unstable head/trunk/master/whatever-your-scm-calls-it. Once development has settled down and there's been a release, it's much more controlled and people do tend to add their code from a stable point.
I'm sorry I just don't see any life changing revelations here.

--
These posts express my own personal views, not those of my employer
Should have linked to the actual article by Shandalar · 2010-11-30 05:50 · Score: 5, Informative

Here is the actual article that the submitter should have linked to. It's Linus's post. Instead, the submitter linked to his or her advert site, which is a blog that has ads which hawk their own, non-git source control system, all of which you get to read before you are given the link to Linus's actual post.
Heisenberg as applied to SW development by vlm · 2010-11-30 05:53 · Score: 4, Insightful

Some devs know where STABLE is located, some devs know what direction their new code is going, and a successful merge is where a dev violates the Heisenberg Uncertainty Principle and accomplishes both at the same time.

--
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
Re:comment from original page by Americano · 2010-11-30 05:55 · Score: 4, Informative

Yep, this is standard practice if your scm support knows what they're doing. The only reason it's not "desirable" to only branch off of stable, 'known-good' baselines is developer laziness. It can take more time setting up the branch, and sometimes that quick checkout-edit-checkin on the trunk is just SOOOO tempting as a shortcut. I see this a lot in groups working on new products, too - "it's never been released to production, so we'll just branch from wherever, and call it a day." Usually they grow out of this type of practice after they spend a few days untangling a mess they've created, but there are some die-hards who just hate having to deal with anybody else, and insist on doing their own thing.
This is why it's important to have:
1) Management / leadership that understands the value of proper configuration management, and expects good practices to be used;
2) Support for your SCM system that knows how to set up these practices and is empowered to enforce them;
3) Mature developers who understand that "fastest" isn't always "best";
(Full disclosure: part of my role in my current job involves ClearCase admin, and I've also worked with svn, cvs, pvcs, and (shudder) vss in varying capacities)
Re:comment from original page by gstoddart · 2010-11-30 06:15 · Score: 3, Insightful

Usually they grow out of this type of practice after they spend a few days untangling a mess they've created, but there are some die-hards who just hate having to deal with anybody else, and insist on doing their own thing.
And those people get smacked on the knuckles with a ruler. If they keep failing to abide by your policies, you smack 'em on the ass with the ruler. If they keep going like that, you get rid of them.
There are very few things more destructive to a development team than some prima donna who won't follow the rules and procedures. In the long run, if they won't play by the rules laid down, they'll do more harm than good.
Source Code Management and "cowboys" can't really coexist if you want to be able to have maintainable software. I've seen someone who would apply changes to any old branch and more or less decree it was someone else's problem to get them onto main -- buh bye, if you're sabotaging the build process, we don't need you.

--
Lost at C:>. Found at C.
Re:comment from original page by Americano · 2010-11-30 06:29 · Score: 3, Insightful

I agree, and if the choice were mine, there are some people I work with who would be pink-slipped immediately... but, politics at a large-ish company being what they are, it's a matter of demonstrating to managers that the actions are counter-productive and costing us time and money... then letting them draw the proper conclusions. In a well-run meritocracy, these people would be gone for violating the "No Asshole" rule.
The problem is, some of the managers are over-promoted cowboys themselves - I've heard, no exaggeration, the following from a manager when I was arguing for locking down one of our production systems because people kept making changes live: "I know it's good policy, but as soon as policy slows down my developers, the policy goes out the window."
The technical problems are easy. It's this political maneuvering that requires the patience of a saint.
Re:branch/merge is sux by Mordstrom · 2010-11-30 06:31 · Score: 2, Interesting

Do you really want to do all the validation testing every time you put back to the trunk? Including Installation testing? There are only so many scenarios you can catch with test-first development. The rest are usually discovered by the testers. That's why they still have jobs and why you can actually accomplish anything in a CI environment. There will always be the heroic tales of development teams that are on version 11000 on the trunk and have never busted a customer. Maybe you are him/her? >.>
For the rest of us simpletons, we prefer a validated release to start baselines from.
If you employ a fully featured VCS like ClearCase (or others) you can even run multiple simultaneous baselines releasing feature content while hardening your previous releases...and then merge between them. 0.0 Yes, good SCM teams let you do this and protect you from the worst of your merge nightmares.
Re:comment from original page by gstoddart · 2010-11-30 06:34 · Score: 4, Insightful

I've heard, no exaggeration, the following from a manager when I was arguing for locking down one of our production systems because people kept making changes live: "I know it's good policy, but as soon as policy slows down my developers, the policy goes out the window."
Run. Run fast, run far.
If managers are going to support the notion of un-tracked changes on a production server in the name of getting things done, then eventually someone will be looking to lay blame for something that went horribly wrong.
Failure to understand why people have change procedures for live systems is pretty significant. And, depending on your industry ... un-tracked fixes and tweaks can actually get you in legal trouble. Think Sarbanes-Oxley.
In almost any sane shop, failure to follow the change procedures can be grounds for immediate dismissal.

--
Lost at C:>. Found at C.
it's simply ignorance! by bogolisk · 2010-11-30 06:52 · Score: 2, Interesting

The kernel devs don't do development on master! However, git's fast-forward merge will, by default, push development/intermediate commits onto master. Those intermediate commits are extremely useful for code inspection/code review and bisect-based debugging. They're not meant for starting a new dev branch, and that's why they're not tagged! There's nothing new or interesting in that article other than a bunch of stupid comments at the bottom. The whole thing smells like disguised advertising for PlasticSCM.

--
Bogus
The original email complained about failed bisect by Chirs · 2010-11-30 07:05 · Score: 2, Interesting

git allows you to bisect from known-good and known-bad kernels to try and find the source of the problem. The original complaint was that some of the intermediate changes don't build.
The problem here is not necessarily branching/merging, but that maintainers and developers do something along the lines of "commit bad change, notice problem, commit fix" in their own private branch. Then, rather than clean up their private branch, that whole history gets merged into the main kernel tree.
This has the advantage of showing more details of development, but has the downside that a bisect that hits the "bad change" commit won't build and will require some manual action to select a "nearby" commit that will build.
As I view it, it's less about rebase/merge and more about developers/maintainers being more diligent about keeping their trees clean before merging back to the mainline.
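The bisect problem described above can be sketched roughly as follows. This is a simplified Python model of a linear history, not git's actual algorithm; the function names, the `"good"`/`"bad"`/`"skip"` verdicts, and the sample history are assumptions made for illustration. It shows why a commit that doesn't build can leave the search unable to name a single culprit:

```python
from typing import Callable, List, Sequence, Set

GOOD, BAD, SKIP = "good", "bad", "skip"


def bisect(commits: Sequence[str], test: Callable[[str], str]) -> List[str]:
    """Narrow down the first bad commit in a linear history.

    `commits` is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. `test` returns GOOD, BAD, or SKIP (the
    commit could not be built or tested). The result lists every commit
    that could still be the first bad one: a single entry when the search
    is conclusive, several when untestable commits get in the way.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    skipped: Set[int] = set()
    while True:
        candidates = [i for i in range(lo + 1, hi) if i not in skipped]
        if not candidates:
            # Only untestable commits (or nothing) remain between lo and hi.
            return list(commits[lo + 1 : hi + 1])
        middle = (lo + hi) // 2
        # Probe the testable commit closest to the midpoint.
        probe = min(candidates, key=lambda i: abs(i - middle))
        verdict = test(commits[probe])
        if verdict == SKIP:
            skipped.add(probe)
        elif verdict == GOOD:
            lo = probe
        else:
            hi = probe


def make_tester(commits: Sequence[str], first_bad: str, unbuildable: Set[str]):
    """Build a fake test function for the example (hypothetical helper)."""
    def test(commit: str) -> str:
        if commit in unbuildable:
            return SKIP
        return BAD if commits.index(commit) >= commits.index(first_bad) else GOOD
    return test


history = ["v1", "a", "b", "c", "d", "v2"]  # "b" is a commit that does not build

conclusive = bisect(history, make_tester(history, first_bad="d", unbuildable={"b"}))
ambiguous = bisect(history, make_tester(history, first_bad="c", unbuildable={"b"}))
```

In this model, `conclusive` is `["d"]`, because the broken commit is far enough from the bug that it can be stepped around. `ambiguous` is `["b", "c"]`: the unbuildable commit sits right before the real culprit, so the search cannot tell which of the two introduced the bug.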
Re:comment from original page by Americano · 2010-11-30 07:20 · Score: 2, Insightful

I'm sorry, how does "automated testing of the main line via a CI tool after the changes are committed to the main line" assure that your main line stays stable?
"Virtually stable" is not "stable". When you work for a financial services firm whose livelihood depends on the market data and trading systems your team builds, "virtually stable" is nowhere near "stable" and doesn't even begin to approach "good enough".
Re:comment from original page by wrench+turner · 2010-11-30 07:46 · Score: 2, Insightful

It's not the CI tool that assures that mainline is stable; it's the quality of the regressions.
Re:comment from original page by Americano · 2010-11-30 09:15 · Score: 2, Informative

You can do it right with a 4-line config spec. The config spec needs to include that /main/LATEST clause at the bottom because new elements being added to the branch aren't labeled with the baseline you're branching from.
The config spec should take the form of:
element * CHECKEDOUT
element * .../branch/LATEST
element * BASELINE -mkbranch branch
element * /main/LATEST -mkbranch branch
The only time the /main/LATEST rule will ever be evaluated is if an element is added to the branch after the BASELINE is applied, and even then, it will force development out to the branch.
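To make the rule ordering concrete, here is a toy Python model of the four-line config spec above. It assumes the usual first-match-wins reading (each element is checked against the rules from top to bottom, and the first rule that matches selects the version) and ignores everything else ClearCase does. The `Element` class and `matching_rule` function are hypothetical names for this sketch, not part of any ClearCase API:

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Element:
    name: str
    checked_out: bool = False          # checked out in the current view
    has_branch_version: bool = False   # already has a version on the dev branch
    labels: Set[str] = field(default_factory=set)  # labels applied to the element


def matching_rule(element: Element, baseline: str = "BASELINE") -> str:
    """Return the config spec rule that would select this element's version.

    Simplified model: rules are tried in order and the first match wins.
    """
    if element.checked_out:
        return "CHECKEDOUT"
    if element.has_branch_version:
        return ".../branch/LATEST"
    if baseline in element.labels:
        # Labeled with the baseline: check-outs branch from the labeled version.
        return f"{baseline} -mkbranch branch"
    # Not labeled (e.g. added after the baseline was applied): fall back to
    # the latest version on main, still branching on check-out.
    return "/main/LATEST -mkbranch branch"


labeled = Element("old_file.c", labels={"BASELINE"})
added_later = Element("new_file.c")  # created after BASELINE was applied

rule_for_labeled = matching_rule(labeled)
rule_for_new = matching_rule(added_later)
```

Under these assumptions, `old_file.c` is selected by the `BASELINE` rule and only `new_file.c`, which never received the label, falls through to the `/main/LATEST` rule.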