Git Adoption Soaring; Are There Good Migration Strategies?

← Back to Stories (view on slashdot.org)

Git Adoption Soaring; Are There Good Migration Strategies?

Posted by Soulskill on Friday January 9, 2009 @10:15PM from the git-while-the-gittin's-good dept.

Got To Get Me A Git writes "Distributed version control systems (DVCS) seem to be the next big thing for open source software development. Many projects have already adopted a DVCS and many others are in the process of migrating. There are a lot of major advantages to using a DVCS, but the task of migrating from one system to another appears to be a formidable challenge. The Perl Foundation's recent switch to Git took over a year to execute. The GNOME project is planning its own migration strategy right now after discovering that a significant majority of the project's developers favor Git. Perhaps some of the projects that are working on transitions from other mainstream version control systems can pool their resources and collaborate to make some standardized tools and migration best practices documentation. Does such a thing already exist? Are any folks out there in the Slashsphere working on migrating their own project or company to a DVCS? I'd appreciate some feedback from other readers about what works and what doesn't."

10 of 346 comments (clear)

Min score:

Reason:

Sort:

Re:Git links by Bananenrepublik · 2009-01-09 23:51 · Score: 4, Interesting

whygitisbettertanx.com claims that mercurial doesn't have cheap branching -- the only advantage he sees git having over hg if leaving aside github. I'm surprised by this statement because I use hg branches everyday. The things he describes can all be done straightforwardly with hg, so I'm asking: can anybody in the know tell me if and how git branches are in any way more powerful than hg branches?
FTR I love hg, and I see no reason to switch to git, even though the whole bandwagon movement seems to have jumped on the git train.
Re:Why is it soaring? by bheading · 2009-01-10 00:00 · Score: 5, Interesting

I think it's more popular for one of the same reasons that Bitkeeper initially became popular - it's being used by Linus for the kernel. Getting Linus to use one of your tools is one of the best marketing coups you can land. Outside of this, Bitmover is a small company and it's hard to see how they would have gotten the kind of exposure that they did with the kernel. That said, they seem to be surviving just fine today.
The other reason that it's popular is because it's free. This is fine for open source projects. In the commercial land, managers tend to underestimate the importance of good revision control tools and processes, and the importance of tools which make it easy to build and enforce those processes. Bitkeeper (and some of it's competitors) go to a lot of effort to provide both tools and processes. Git is not so good at this. Other tools that are not good at this include Clearcase (although UCM is an attempt, albeit a controversial one) and CVS.
And I wouldn't say that Bitkeeper and Git are the same. The underlying design concept - distributed version control, changesets, and the benefits that flow out of this eg proper merge tracking and a greater degree of determinism - are the same. Bitkeeper has much better GUI tools, and it's a lot more user-friendly; the command interface is coherent and consistent, and the commands are simpler and easier to remember, options that do similar things are the same across different commands. For example, the "-r" flag always refers to a changeset number, in any command that accepts this parameter. I used BK on a project with between 20 and 45 users; it never once corrupted the repository and there wasn't a single time when the server went down. There were a few times when things were weird when new users unfamiliar with the tool broke their repos, but that stopped after a couple of weeks. The real benefit is that it makes it very easy to see who broke what, and how - whether it was during development or during a merge.
Git isn't friendly or forgiving at all, and you need to really know what you are doing. There are operations that are very dangerous, like the rebase operation; BK does have an equivalent but it incorporates some basic measures to stop someone from messing up the repository they are pushing to.
Additionally, Git will break things in unexpected ways. Try pushing a change into another Git repository, then navigate to that repository and run "git status" - git does not auto-checkout changes in the destination repository during a push. It's the user's responsibility to detect this and deal with it. I find that design approach - the idea that the user is expected to spot and deal with the internal behaviour of the tool - to be pretty bad.
Linus says that anyone who thinks Git is hard to use is an idiot. Idiocy is not the problem here. The developers in the organization I work in do not want to have to know or care about how the internals of the tool work. They want to cut their code, merge it and integrate it as quickly and as effectively as possible. BK easily beats Git on this measure. On the other hand, Git is far and away the superior open-source revision control tool. Anyone who thinks that Subversion is better just doesn't get it.
Re:If it looks like a tree, you'll probably be fin by IamTheRealMike · 2009-01-10 00:10 · Score: 4, Interesting

Right, that was always the weakness of git, and although it's improved I still have problems with its usability (or lack of it). For all the dumping Linus does on Subversion/Perforce and its ilk, they are easy to understand and it's basically always clear what you're doing. I haven't used git for a while, but last time I did it was like a box of sharp knives. Although hard to mess up the remote copy, messing up your local copy was much easier.
Git + Eclipse by david.given · 2009-01-10 00:14 · Score: 5, Interesting

I'm trying to use git as much as possible --- I'm still pretty crappy at doing anything even slightly complicated with it, but even with minimal skills it's brilliant at keeping track of changes to local directories.
The only problem is that I'd really, really like a decent Eclipse git plugin. I'm used to using Subclipse for SVN, which is fantastic: I can point at a file or directory, say 'Synchronise with repository', and then get a graphical diff of every change and the ability to quickly and easily revert or commit changes on a per-change, per-file, per-group-of-file basis, etc. (And you can do this with any revision, which makes backing out one specific change very easy.) Doing the same with git's command line tools seems terribly clunky by comparison, especially when I'm struggling to remember the syntax, and the fundamentally unfamiliar workflow.
I do use the Eclipse git plugin at git.or.cz, but it's still very crude. The file decoration is invaluable, which lets me see at a glance which files are new/changed/pristine in the Eclipse project view, but actually trying to *do* anything with it is deeply unpleasant --- no synchronise view, no graphical diff, and some weird behaviour like if you point at a file, say 'commit this', you get a dialogue prompting you to commit *all* files. Which is not what I want. And there's lots of UI clunkiness all round, due to simple immaturity.
I've had some luck with giggle, but the UI is pretty bad, and some changes (I forget what; new files, perhaps) don't show up in it, which is a bit awkward. I've had a play with some other GUI frontends but they're all pretty nasty by comparison with Subclipse. Still, the git plugin is getting better with time --- I'm just hoping that Synchronise shows up soon...
My (short) experience with git so far by JensR · 2009-01-10 00:24 · Score: 5, Interesting

I used to use cvs, subversion and perforce. After switching to git, it feels a lot more powerful, at the cost of more things that can go wrong.
My workflow with subversion was:
- regular update: update, check/fix conflicts, continue work
- commit: update, pick files I want to commit with TortoiseSVN, verify the changes in the diff view, write log message, commit, continue work
On GIT:
- regular update: stash my changes, change to master branch, pull, check for errors or dirty files (mostly endian problems), switch to work branch, rebase from master, check for errors or dirty files, unstash my changes, check for errors or dirty files, continue work
- commit: update, stage the files I want to commit, commit them, verify the changes, push
At several stages some obscure thing could go wrong that I needed to look up in the manual or on the internet, or needed to ask someone who used it for longer. That doesn't mean I think GIT is bad, I just feel it takes more time to be fully productive with compared to older systems. And I miss a few minor things from svn, like keyword expansion or properties.
Mercurial vs. Git by rpp3po · 2009-01-10 01:50 · Score: 5, Interesting

I'm working on the OpenJDK source tree through Mercurial. I couldn't be more satisfied. The tools are well structured, very easy to use, stable, fast and well documented. I don't miss any feature. Could anybody, who tried both and prefers Git, list some advantages of it over Mercurial? To me it just seems like a Git done right without the hype and too complex UI.
Re:Git links by Bromskloss · 2009-01-10 03:02 · Score: 3, Interesting

For problem two: this isn't a real problem with git, but rather with your organization. Multiple projects don't belong in the same repository, it's as simple as that.
I have been wanting to start with Git, but I find it too hard to know what should go into different repositories and what should be in the same.
First example: I might be writing a book in book/ and keep all images in a subdirectory book/images/. I think it is not far-fetched that I might want to work on only the images without downloading all the other, possibly huge, subdirectories.
Second example: Say I write a scientific article for which I compute a lot of numerical data. Then I write a second article, which builds upon the same data. Should the two articles go into the same repository, so that I can easily pull and compile everything at once with all dependencies in place, or should they be kept separately, so that I can work on the first article without dragging the other one along?

--
Swedish plasma phys. PhD student; MSc EE; knows maths, programming, electronics; finance interest; seeks opportunities
Git Adoption Soaring? by cjjjer · 2009-01-10 03:30 · Score: 3, Interesting

Funny the OP only mentioned two projects and one is only planning. Sometimes /. can be a bigger hype machine than money grabbing corps. that we all love to bash.
How does it work with non-static IPs? by pauljlucas · 2009-01-10 05:36 · Score: 3, Interesting

The thing I don't understand about any distributed VCS is how (for example) others could pull stuff from my repository if I don't have a static IP. Also note that I don't mean "don't have a static IP" only in the usual sense of "having a dynamic IP at home": I also mean in the sense that my development machine is my laptop and I often work at coffee shops and other places and so I'm often behind NATs.

--
If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.
Re:Meanwhile... by ThePhilips · 2009-01-10 09:00 · Score: 3, Interesting

You can't work with CC effectively without 20-30 helper scripts.
Where do people get ideas like this? I use CC effectively with one trivial Perl script. It converts "my feature is on this branch off this label" descriptions into config specs -- raw config specs are too complicated to handle, so you need a layer above them which matches your CM process. Yes, IBM/Rational should explain that to their customers. Or maybe make UCM not suck.
What about normal diff? CC still doesn't allow to use external diff program. And "ct diff" insists on two files - it can't diff hijacked file against original.
What about normal recursive diff for two branches?
What about patch generator? So that you can back up you unchecked-in changes.
What about change log? Recursive change log showing changes for all files in directory?
How about converting change history into set of patches? To allow easier investigation of regressions.
The moronism with R/O files? All extracted/"ct get -to" files are marked R/O.
And this is from top of my head. For all of that I have scripts. And with the scripts, I'd say, CC isn't half bad.
But to the point of original question, with Git I would not need any of the scripts.

Hijacked/checked-out files is major pain.
Then you're not branching, like you're supposed to do, and a hijacked file is the *least* of your problems. You cannot use CC as if it was CVS; a dynamic view is not a sandbox if you set it up to silently show other people's possibly incomplete changes.
We do branching and hijacked files are not problem per se. It is just better half CC tools, when given as parameter hijacked file, would simply say "f-off, this is view private file."
In some situations checked-out files are even worse since CC treats checked out files like files on a special branch. Consequently half of CC tools accept the file as parameter, yet show dick but no information about the file.
Git doesn't draw any difference between the files and files in repo. At any time you can do whatever you like with any accessible file/revision.

Dynamic views are great feature yet are completely useless.
Use them correctly for a few years, then report back.
Care to elaborate on "correct" usage pattern then?
People tried them in company few years ago and pretty much abandoned them. They are still accessible, yet generally unused. Our CC admins would be happy to know the "correct" usage for them.
You can't index dynamic view - because it contains all possible vobs and all possible files. And I do not want to deal with 150K files of the whole project, I need only 3.5K files belonging to my part.
You can't compile in dynamic view - because even if only dozen of people compile simultaneously, CC server simply dies under load.
Heck, simple "ls" spits on screen bunch of errors every time, because dynamic view can't properly show branch, but shows all files on all branches (readdir() lists all of them). And if file did happen to be not on the branch of the dynamic view, stat()ing it would give you an error.
If you can't do development with them, what else can you do with the dynamic views?
I used in past dynamic views solely for porting semi-automatically (with script) trivial fixes into many branches. For more than that dynamic views are useless.
Please, reveal me the secret: how do I use dynamic view "correctly"? Many people in my company would be happy to know it too.

--
All hope abandon ye who enter here.