Slashdot Mirror


Ask Slashdot: Selecting a Version Control System For an Inexperienced Team

An anonymous reader writes: I have been programming in Python for quite a while, but so far I have not used a version control system. For a new project, a lot more people (10-15) are expected to contribute to the code base, many of whom have never written a single line of Python, only C, LabVIEW or Java. This is a company decision that can be seen as a Python vs. LabVIEW comparison: if it is successful, the company is willing to migrate all code to Python. The code will be mostly geared towards data acquisition and data analysis leading to reports. At the moment I have the feeling that managing that data (= measurements + reports) might be done within the version control system, since this would generate an audit trail on the fly. So far I have been trying to select a version control system; based on Google, I guess it should be Git or Mercurial. I get the feeling that they are quite similar for basic things. I expect that the differences will show up when more sophisticated topics/problems are addressed, so to pick one I would have to learn both. What are your suggestions? Read below for more specifics. These are the requirements I can see so far:
- Server running locally (as opposed to in the cloud) on Windows (IT department's choice, non-negotiable)
- Good/easy to use Windows clients (IT department's choice / company policy, again non-negotiable)
- Use Windows credentials (maybe single sign-on)
- Open source server/client (personal preference)
- Well-established project that will not disappear or become unmaintained in the foreseeable future
- Do basic tests on the code (syntax errors, pytest/nose or similar with test coverage, coding style checks)
- email notifications
- good documentation
- reasonable price for 5-10 users: free to 500€
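
As a rough illustration of the "basic tests on the code" requirement above, a check-in hook could start with something like the Python sketch below. It covers only the syntax-error part, using the standard library; coverage and style checks would need extra tools such as pytest. The function names `syntax_errors` and `check_files` are made up for this example and are not part of any version control system.

```python
import ast
from pathlib import Path


def syntax_errors(source, filename="<string>"):
    """Return a list of readable syntax errors (empty if the code parses)."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return [f"{filename}:{exc.lineno}: {exc.msg}"]
    return []


def check_files(paths):
    """Parse every .py file in `paths` and collect the syntax errors found.

    A hook script (hypothetical) would call this with the files of a commit
    and reject the commit if the returned list is not empty.
    """
    problems = []
    for path in paths:
        if str(path).endswith(".py"):
            text = Path(path).read_text(encoding="utf-8")
            problems.extend(syntax_errors(text, filename=str(path)))
    return problems
```

How such a script gets wired in depends on the system chosen (Subversion, Git and Mercurial each have their own hook mechanism), so that part is left out here.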

Things that would be great ...
- web interface (like GitHub) would be nice
- integration of bug tracking / bug reports
- possibility to do and print out a code review
- some kind of Jupyter / IPython integration

Things I am not sure I will need but seem to be a good idea at the time of writing...
- Include other files/ file types for measurement data, documentation and user manuals (docx, xml, xlsx, gz, ...)
- When thinking about measurement data / reports, it would be great to have digital signatures (--> FDA compliant). I know this is extremely hard; if it exists I would love it, and if not, that is fine. Somehow this feels like mixed document/version control, but I would love to have data + code + text = report in the same place, to easily find the implications of a bug: which data has to be re-evaluated and so on.
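
One hedged way to picture the "audit trail" idea for measurement data is a checksum manifest: a small text-friendly record of SHA-256 hashes that can be committed next to the code, whether or not the data files themselves live in the repository. The sketch below is an assumption about how that might look, not a digital signature scheme and not a claim of FDA compliance; `sha256_of`, `build_manifest` and `changed_files` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large measurement files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under `root` (relative POSIX path) to its SHA-256 hash."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old, new):
    """List files that were added, removed, or modified between two manifests."""
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

Comparing the manifest stored with a report against a freshly built one shows which data files have changed since that report was generated.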

8 of 325 comments

  1. CVS or Subversion by benjfowler · · Score: 5, Insightful

    As far as I can tell, you're describing the classic CVS or Subversion small team setup. You can run a server on the network (via Apache, or via SSH), run ViewCVS, set up checkin hooks, and give your clients a nice client like TortoiseCVS/TortoiseSVN built into Windows Explorer.

    If you want integration with bug tracking tools, then have a look at Bugzilla and Bonsai.

    All your users need to know about is check-in and checkout, so the cognitive overhead is low.

    It would take one engineer half a day to set all this stuff up on a spare machine, and you could try it out fairly quickly.

    And best of all, this setup is gratis as well as Free. This has worked really nicely for me in both an academic and a commercial environment.

    1. Re:CVS or Subversion by rainmaestro · · Score: 4, Insightful

      Agreed. For small / inexperienced teams, we've always recommended VisualSVN. GUI to manage most of the project admin tasks, easy integration with AD for user/group auth, and a fairly simple workflow.

      Git has some really nice features, but I wouldn't push it onto a team with no VCS experience.

    2. Re:CVS or Subversion by gonz · · Score: 5, Insightful

      For a small-to-medium team that has easy access to a centralized server, choosing Subversion instead of Git could save you a TON of time. In my experience, Git has a constant overhead of messed up merges, "brown bag" discussions to educate new devs about various gotchas, and ongoing debates about the right usage strategy (merging versus rebasing, branch management, how to keep histories from growing too large, etc).

      By contrast, I've also worked at several different companies that used Subversion, and basically you just show new devs how to sync and commit, and they figure out the rest themselves. The reason is that having a single always-up-to-date master is an order of magnitude simpler than Git's model of working-copy/branch/master on your local PC and then also branch/master on a remote PC and push/pull/fetch/merge between them.

      With Subversion you still have to manage branches sometimes, but there is typically a maintainer person who handles that. Whereas the model of Git is that every dev is doing merge algebra from day 1.

    3. Re:CVS or Subversion by angel'o'sphere · · Score: 4, Insightful

      SVN might have drawbacks; one is its name. However, this: "SVN doesn't work." is simply wrong.

      Linus had special requirements, hence he wrote git. If he claimed SVN does not work, he is not as smart as he looks.

      That does not mean that SVN etc. does not work. CVS is another thing. Having non atomic commits (how retarded is that anyway????) is a huge problem.

    4. Re:CVS or Subversion by swillden · · Score: 4, Interesting

      So now master is the development branch! master is the release branch! THAT is terrifying. Although they do tag releases, but still.

      Developing on and releasing from master has its risks, but given appropriate QA, including code reviews, extensive automated unit, functional and integration tests, and extensive release tests, it can work very well. That's what Google does. 25,000 engineers, one source repository, 45,000 commits per day, developing on and releasing from HEAD.

      Well, almost. Developers create local branches for their work and don't commit into master until code review is complete -- including review of the automated tests. The actual commit into master doesn't go in unless the commit and everything else that could possibly depend on it builds and passes all of the tests (the build/test/submit cycle is automated; engineers kick it off and then get informed of the results). Releases are branched off to freeze them while release testing is done, and sometimes a few commits are cherry-picked into a release to fix issues, but mostly the release either passes the tests and goes out, or fails the tests and is abandoned. Most projects operate on a weekly release cycle, so the impact of abandoning a release is small, as long as it doesn't happen too often.
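
The "everything else that could possibly depend on it" rule can be pictured as a walk over a dependency graph. The Python sketch below is only a toy illustration under assumed inputs (a dict mapping each target to the targets it depends on, and a `run_tests` callback); it is not a description of Google's actual tooling, and the names `affected_targets` and `may_submit` are hypothetical.

```python
from collections import deque


def affected_targets(changed, depends_on):
    """Return the changed targets plus everything that transitively depends on them.

    `depends_on` maps a target name to the set of targets it depends on.
    """
    # Invert the graph: for each target, who uses it directly?
    dependents = {}
    for target, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(target)

    # Breadth-first walk from the changed targets through their users.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen


def may_submit(changed, depends_on, run_tests):
    """Allow the commit only if every affected target passes its tests."""
    return all(run_tests(t) for t in sorted(affected_targets(changed, depends_on)))
```

With a graph where `app` depends on `lib` and `lib` depends on `base`, a change to `base` would require `base`, `lib` and `app` to pass before the commit is accepted.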

      Note that I'm speaking of the web properties; search, Gmail, etc. Obviously other groups have different approaches. For example, I currently work on Android, which has a roughly annual release cycle. That drives a very different strategy. One with lots of branching, actually.

      Also note that I'm not claiming that this is a good strategy for every team or company. I'm just pointing out that it can work, if you manage it well. Of course, the same is true of virtually every development process, though different processes are better suited to different contexts.

  2. Re:So many options by benjfowler · · Score: 4, Informative

    Oh God, stay the fuck away from SourceSafe.

    SourceSafe is an absolutely terrible choice, since it is actively user-hostile, and has the alarming habit of eating your source code at the worst possible time. Rational ClearCase is almost as bad.

  3. What will you ACTUALLY be doing? by GrantRobertson · · Score: 5, Insightful

    First we need to take a step back and figure out what you are actually doing. You have pulled up with a "software version control" bandwagon and everyone just jumped on without looking to see if it would take you where you wanted to go.

    Do you want to keep track of the versions of your code, or the reports generated by that code, or the data that the code used to generate the reports? Each type of information is best suited to a different kind of versioning system. Are the reports generated only by the code or are they written by humans? Trying to use a code versioning system to keep track of modifications to reports or data is a losing game. Don't make the mistake of thinking every problem is a nail just because you have a hammer.

  4. Re:git by Pseudonymous Powers · · Score: 4, Informative

    I was with a small team of very experienced developers, and even for us going to git had a bunch of surprises. For me it's not so much the UI tools, it's understanding what's going on, and why git does what it does.

    That's what I mean when I say there's no simple formula of "do these 3 commands to do this, those 2 commands to do that". You have to understand WHY the commands are doing what they are doing.

    That's certainly a common view of Git, but after using it for the last few years, I think that a lot of the problems that beginners have with it are happening because of this assumption. That is, when a developer asks how to merge their code into the shared Git repo for the first time, the wise old Git gurus point them at a site that explains how Git works at the molecular level, called The Git Book. This is almost never helpful, because your average Joe C. Programmer doesn't have time in his schedule to read an entire book, and even if he reads it over the weekend instead of, you know, having a life, he just ends up with his head full of crazy circles-and-arrows diagrams, which, divorced from any concrete, hands-on practice, only serve to confuse the issue more.

    What the inexperienced Gitsperson actually needs at that point is a short and to-the-point workflow that he can use to get his goddamn code in the goddamn repo, like (commands for illustration purposes only, I use a Fisher-Price GUI): "git clone MyRepo; git switch master; git pull; git branch MyFeature; git switch MyFeature; [implement the code changes]; git commit; git push; git switch master; git pull; git merge MyFeature; [fix conflicts, resolve, commit again if necessary]; git push". And for the love of God, Newbie, please don't try to use "rebase", you'll just cripple our entire product at 5:30 pm on a Friday.

    There's documentation of that kind out there, admittedly, but it's really hard to find among all the indistinguishable-from-autogenerated-prank-nonsense man pages and fifteen-part seminars on how the version hashing algorithm works.
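
The workflow quoted in this comment could be turned into a one-page cheat sheet with a short Python sketch like the one below. It only builds and prints the command sequence and does not run Git. It assumes the repository has already been cloned and is the current directory, and it fills in two steps the comment leaves implicit (`git add --all` before the commit, and `--set-upstream` on the first push of a new branch); those additions, the default names `master` and `origin`, and the helper names are all assumptions for illustration.

```python
import shlex


def start_feature(feature, main="master"):
    """Commands to run before editing: update main, then branch off it."""
    return [
        ["git", "switch", main],
        ["git", "pull"],
        ["git", "branch", feature],
        ["git", "switch", feature],
    ]


def finish_feature(feature, message, main="master", remote="origin"):
    """Commands to run after editing: commit, publish, merge back into main."""
    return [
        ["git", "add", "--all"],                        # assumption: stage everything
        ["git", "commit", "-m", message],
        ["git", "push", "--set-upstream", remote, feature],
        ["git", "switch", main],
        ["git", "pull"],
        ["git", "merge", feature],                      # fix conflicts here if any
        ["git", "push"],
    ]


def render(commands):
    """Format the commands as shell lines suitable for a printed cheat sheet."""
    return "\n".join(shlex.join(cmd) for cmd in commands)


print(render(start_feature("MyFeature")))
print("# ... implement the code changes ...")
print(render(finish_feature("MyFeature", "Describe the change")))
```

Keeping the commands as argument lists rather than one shell string means the same lists could later be handed to `subprocess.run` if a team wanted to script the steps instead of typing them.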