Ask Slashdot: Selecting a Version Control System For an Inexperienced Team

Posted by samzenpus on Sunday October 11, 2015 @04:37AM from the help-please dept.

An anonymous reader writes: I have been programming in Python for quite a while, but so far I have not used a version control system. For a new project, a lot more people (10-15) are expected to contribute to the code base; many of them have never written a single line of Python, only C, LabVIEW or Java. This is a company decision that can be seen as a Python vs. LabVIEW comparison: if successful, the company is willing to migrate all code to Python. The code will be mostly geared towards data acquisition and data analysis leading to reports. At the moment I have the feeling that managing that data (= measurements + reports) might be done within the version control system, since this would generate an audit trail on the fly. So far I have been trying to select a version control system; based on Google, I guess it should be Git or Mercurial. I get the feeling that they are quite similar for basic things. I expect that the differences will show up when more sophisticated topics/problems are addressed, so to pick one I would have to learn both. What are your suggestions? Read below for more specifics. These are the requirements I can see so far:
- Server running locally (as opposed to in the cloud) on Windows (IT department's choice, non-negotiable)
- Good/easy-to-use Windows clients (IT department's choice / company policy, again non-negotiable)
- Use Windows credentials (maybe single sign-on)
- Open source server/client (personal preference)
- Well-established project that will not disappear or go unmaintained in the foreseeable future
- Do basic tests on the code (syntax errors, pytest/nose or alike with test coverage, coding style checks)
- email notifications
- good documentation
- reasonable price for 5-10 users: free to 500€
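
For the "basic tests on the code" requirement, the syntax-error part could look something like the sketch below. It is only an illustration: the function names `find_syntax_errors` and `main` are made up here, it uses nothing beyond the Python standard library, and wiring it into a commit hook or build server is left to whichever VCS gets picked.

```python
import ast
from pathlib import Path


def find_syntax_errors(root):
    """Return (path, line, message) for every .py file under root that does not parse."""
    problems = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            source = path.read_text(encoding="utf-8")
            ast.parse(source, filename=str(path))
        except (SyntaxError, ValueError) as exc:
            # ValueError covers undecodable files and embedded null bytes.
            problems.append((str(path), getattr(exc, "lineno", None), str(exc)))
    return problems


def main(argv):
    """Print one line per broken file; return 1 if any were found, else 0."""
    errors = find_syntax_errors(argv[0] if argv else ".")
    for path, line, message in errors:
        print(f"{path}:{line}: {message}")
    return 1 if errors else 0
```

A hook script would call something like `sys.exit(main(sys.argv[1:]))`, so that a non-zero exit code rejects the change. Running pytest and a style checker would be separate steps after this one.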

Things that would be great ...
- web interface (like github) would be nice
- integration of bug tracking / bug reports
- possibility to do and print out a code review
- some kind of jupyter / ipython integration

Things I am not sure I will need but seem to be a good idea at the time of writing...
- Include other files/ file types for measurement data, documentation and user manuals (docx, xml, xlsx, gz, ...)
- When thinking about measurement data /reports it would be great to have digital signatures (--> FDA compliant). I know this is extremely hard, if this exists I would love it, if not I am fine. Somehow this feels like mixed document/version control, but I would love to have data + code + text = report at the same place to easily find implications of a bug — which data has to be re-evaluated and so on.
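
To make the "which data has to be re-evaluated" idea concrete, here is a rough sketch of a checksum manifest that could be committed next to a report. Note the assumptions: this is plain SHA-256 hashing, not a digital signature, and by itself it does nothing for FDA compliance; it only records which bytes a report was built from. The function names and the JSON layout are invented for the illustration.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large measurement files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file path (relative to data_dir) to its SHA-256 hex digest."""
    root = Path(data_dir)
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """Return names whose digest differs or that appear in only one manifest."""
    names = set(old_manifest) | set(new_manifest)
    return sorted(n for n in names if old_manifest.get(n) != new_manifest.get(n))


def save_manifest(manifest, path):
    """Write the manifest as sorted JSON so that it diffs cleanly in a VCS."""
    text = json.dumps(manifest, indent=2, sort_keys=True) + "\n"
    Path(path).write_text(text, encoding="utf-8")
```

Comparing the manifest stored with an old report against a freshly built one via `changed_files` would list the measurement files that no longer match what the report was generated from.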

CVS or Subversion by benjfowler · 2015-10-11 04:43 · Score: 5, Insightful

As far as I can tell, you're describing the classic CVS or Subversion small team setup. You can run a server on the network (via Apache, or via SSH), run ViewCVS, set up checkin hooks, and give your clients a nice client like TortoiseCVS/TortoiseSVN built into Windows Explorer.
If you want integration with bug tracking tools, then have a look at Bugzilla and Bonsai.
All your users need to know about is check-in and checkout, so the cognitive overhead is low.
It would take one engineer half a day to set all this stuff up on a spare machine, and you could try it out fairly quickly.
And best of all, this setup is gratis as well as Free. This has worked really nicely for me in both an academic and a commercial environment.
1. Re:CVS or Subversion by rainmaestro · 2015-10-11 05:02 · Score: 4, Insightful
  
  Agreed. For small / inexperienced teams, we've always recommended VisualSVN. GUI to manage most of the project admin tasks, easy integration with AD for user/group auth, and a fairly simple workflow.
  Git has some really nice features, but I wouldn't push it onto a team with no VCS experience.
2. Re:CVS or Subversion by gonz · 2015-10-11 05:02 · Score: 5, Insightful
  
  For a small-to-medium team that has easy access to a centralized server, choosing Subversion instead of Git could save you a TON of time. In my experience, Git has a constant overhead of messed up merges, "brown bag" discussions to educate new devs about various gotchas, and ongoing debates about the right usage strategy (merging versus rebasing, branch management, how to keep histories from growing too large, etc).
  By contrast, I've also worked at several different companies that used Subversion, and basically you just show new devs how to sync and commit, and they figure out the rest themselves. The reason is that having a single always-up-to-date master is an order of magnitude simpler than Git's model of working-copy/branch/master on your local PC and then also branch/master on a remote PC and push/pull/fetch/merge between them.
  With Subversion you still have to manage branches sometimes, but there is typically a maintainer person who handles that. Whereas the model of Git is that every dev is doing merge algebra from day 1.
3. Re:CVS or Subversion by sugar+and+acid · 2015-10-11 05:49 · Score: 3, Informative
  
  It is perfectly possible to branch in SVN and manage it. Git is better for branching and developing in complex and large team environments. But this is not the case here. They probably have max 3 guys maintaining and max 3 guys on a development branch. SVN is more than capable of handling that.
4. Re:CVS or Subversion by Antique+Geekmeister · 2015-10-11 06:09 · Score: 3, Interesting
  
  > classic CVS or Subversion small team setup
  Yes, but I'd recommend _really strongly_ against either today. Both have considerable difficulty establishing disaster recovery or failover, and the tendency to set either of them up with the passwords stored locally in the user's home directory presents profound security problems. And neither of them allows developers to make their own branches, record their changes locally on their own systems, and submit them only when needed. The result can be a profound amount of clutter in the main repository, especially if anyone accidentally commits bulky binaries to a branch. CVS at least allows deletion of accidentally committed bulky objects; Subversion does not, not without extraordinary effort.
  I'm afraid that building your own bug tracking systems from scratch, even with tools like Bugzilla or Bonsai or RT or any of the major toolkits, is a blackhole of support work. Git has proven _very_ good for developers, because it allows them to branch, and to merge, far more cleanly, with very good mechanisms to make a "pull request" and get code review, and much more reliable and verifiable GPG signed tags. For small private repositories, github.com has proven very robust and resilient, with very good tools for Wikis and bug reports and integration with build systems.
  The only compelling reason I see to use Subversion today is the very, very good "TortoiseSVN" interface for Windows users. "TortoiseGit" simply does not work well enough, and the X-based GUIs aren't as good.
  > It would take one engineer half a day to set all this stuff up on a spare machine, and you could try it out fairly quickly.
  And it can take a full day every week to support just this one service, even in a small shop, with backup, high availability, bug fixes, security updates, end user support, and the hand management of user access and privilege management that is common to these small setups.
  > And best of all, this setup is gratis as well as Free. This has worked really nicely for me in both an academic and a commercial environment.
  I've unfortunately had to clean up from a number of "free as in beer" source control systems mismanaged over the long term.
5. Re:CVS or Subversion by angel'o'sphere · 2015-10-11 08:47 · Score: 4, Insightful
  
  SVN might have drawbacks; one is its name. However, the claim that "SVN doesn't work" is simply wrong.
  Linus had special requirements, hence he wrote git. If he claimed SVN does not work, he is not as smart as he looks.
  That does not mean that SVN etc. does not work. CVS is another thing. Having non atomic commits (how retarded is that anyway????) is a huge problem.
  
  --
  Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
6. Re:CVS or Subversion by Maxmin · 2015-10-11 10:18 · Score: 3, Interesting
  
  The ONLY reason git gained popularity is Linus Torvalds. If an unknown engineer released a VCS with similarly confusing, incoherent command-line semantics? It would NEVER have taken off. git survived because Linus Torvalds. That's it.
  But git is the lingua franca. It has a learning curve, but because of that there is a virtually unlimited selection of learning materials out there. There is NO EXCUSE for not having some expertise in git, as engineers. Why rebase? How to cherry-pick? Only stubborn engineers don't know these things, and it's odd because they're smart enough to grasp the concepts, do the katas and gain proficiency.
  I work with a bunch of engineers that refuse to branch in git! They're terrified of it! Because, once they branched, worked out of that branch for three months, then had a disastrous merge to master (of course). So now master is the development branch! master is the release branch! THAT is terrifying. Although they do tag releases, but still.
  
  --
  O lord, bless this thy holy hand grenade, that with it thou mayest blow thine enemies to tiny bits, in thy mercy.
7. Re:CVS or Subversion by jrumney · 2015-10-11 13:07 · Score: 3, Informative
  
  I never thought I would see a recommendation for CVS in 2015. The OP is on the right track looking at git and mercurial to start with. The only problem with his requirements is the Windows server. Maybe a virtual machine running on the Windows server would be acceptable to IT? While it is possible to run a git or mercurial server on Windows, there are a lot of good tools that would give the "things that would be great" that are not supported on Windows. On the client side, TortoiseGit and TortoiseHg are available, giving the same Explorer integration as TortoiseSVN/TortoiseCVS.
8. Re:CVS or Subversion by swillden · 2015-10-11 13:44 · Score: 4, Interesting
  
  > So now master is the development branch! master is the release branch! THAT is terrifying. Although they do tag releases, but still.
  Developing on and releasing from master has its risks, but given appropriate QA, including code reviews, extensive automated unit, functional and integration tests, and extensive release tests, it can work very well. That's what Google does. 25,000 engineers, one source repository, 45,000 commits per day, developing on and releasing from HEAD.
  Well, almost. Developers create local branches for their work and don't commit into master until code review is complete -- including of automated tests. The actual commit into master doesn't go in unless the commit and everything else that could possibly depend on it builds and passes all of the tests (the build/test/submit cycle is automated; engineers kick it off and then get informed of the results). Releases are branched off to freeze them while release testing is done, and sometimes a few commits are cherry-picked into a release to fix issues, but mostly the release either passes the tests and goes out, or fails the tests and is abandoned. Most projects operate on a weekly release cycle, so the impact of abandoning a release is small. As long as it doesn't happen too often.
  Note that I'm speaking of the web properties; search, Gmail, etc. Obviously other groups have different approaches. For example, I currently work on Android, which has a roughly annual release cycle. That drives a very different strategy. One with lots of branching, actually.
  Also note that I'm not claiming that this is a good strategy for every team or company. I'm just pointing out that it can work, if you manage it well. Of course, the same is true of virtually every development process, though different processes are better suited to different contexts.
  
  --
  Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:So many options by benjfowler · 2015-10-11 04:57 · Score: 4, Informative

Oh God, stay the fuck away from SourceSafe.
SourceSafe is an absolutely terrible choice, since it is actively user-hostile, and has the alarming habit of eating your source code at the worst possible time. Rational Clearcase is almost as bad.
What will you ACTUALLY be doing? by GrantRobertson · 2015-10-11 05:21 · Score: 5, Insightful

First we need to take a step back and figure out what you are actually doing. You have pulled up with a "software version control" bandwagon and everyone just jumped on without looking to see if it would take you where you wanted to go.
Are you wanting to keep track of the versions of your code or the reports generated by that code or the data that the code used to generate the reports? Each type of information is best suited for a different kind of versioning system. Are the reports generated only by the code or are they written by humans? Trying to use a code versioning system to keep track of modifications to reports or data is a losing game. Don't make the mistake of thinking every problem is a nail just because you have a hammer.
Re:UH oh by __aaclcg7560 · 2015-10-11 05:25 · Score: 3, Insightful

Or a hardware company being run by a marketing hack: "Python is new and popular! Let's get all our programmers and code base on Python yesterday!"
As the summary makes no justification for switching away from C and Java, I'm just assuming the worst possible reason for switching programming languages.
OUCH!!! by LostMyBeaver · 2015-10-11 05:57 · Score: 3, Informative

I'll start by answering your question. Use Git. It's the most widely supported system at this time and it works really well.

Next let me be a typical slashdot asshole that makes abrasive comments that may be well intended but will come off as being a dick. I'll explain that I already see endless problems coming from this.

If you're working with a team of 10-15 developers who all lack experience with version control, you have a major problem with out-of-date programmers and you're throwing them into a hell called Python. If you generally accomplish projects using C and LabView, the developers you have more than likely lack a modern development skill set and coding in a language like Python will produce some of the worst code ever written. If C is like shooting yourself in the leg and C++ is like blowing the whole damned leg off, Python is like dropping a nuke. You will have an endless supply of options for writing terribly bad code in the worst ways possible. The only redeeming feature will be it will have nice uniform spacing.

I would highly recommend doing what always works best, which is to hire a Python developer with good Git skills who can lay the majority of the foundation of the project and create a uniform set of coding standards for the project, and then bring the other developers on 3 at a time and perform constant code review. Focus heavily on test-driven development and use a system like Scrum for lifecycle management. If you want to teach old dogs new tricks, don't just throw them in the fire and tell them to figure it out. The programming paradigms are so drastically different between your old method and new that without some sort of leader with experience, it will turn out to be a disaster and a jungle of crap code. I personally avoid Python projects not because the language is bad, but because they tend to be like this.

You should of course know by now that if you are traditionally a LabView shop, you're going to sacrifice a massive number of really important features to save a buck. Python has great support for multi-threading, but it's not an awesome environment for event-driven programming like LabView is. You of course can accomplish all the same things, but even with the thousands of toolkits/libraries out there, you'll have to write the entire underlying architecture yourselves and you'll lose almost all the visualization you've come to depend on.
1. Re:OUCH!!! by Njovich · 2015-10-11 06:31 · Score: 3, Insightful
  
  The one thing I agree with is that Git is the obvious choice as it is the current standard. For the rest I guess you are fairly inexperienced. If you really believe it's easier to shoot (or nuke) yourself with Python than with C you are extremely wrong. Obviously you can write bad code in any language, but Python is no worse than most others.
  In a couple of hours most Git basics can be taught to any reasonable programmer. It can be worthwhile to make sure they set aside some time to read up on Git usage. Especially with a GUI it's not exactly rocket science (and any programmer worth their salt should have no problems with the CLI, some annoyances notwithstanding). Making your hiring decision for a Python programmer based on Git skill is a bit weird, as there are much more important factors to choose a programmer on. I have seen good and bad Git usage across all ages and skill levels, it mostly just depends on what exactly they worked on in recent years.
  As far as massively changed programming paradigms, unless you just time traveled from the 70's, that's BS.
  As for Scrum and Test Driven Development, you would need to know more about this project before you can make a decision like that. I don't see anything in this description that would give you enough information to advise on that.
Re:git by ATMAvatar · 2015-10-11 06:32 · Score: 3, Insightful

I don't know that I share the same experience. There are plenty of UI tools that help make git easier to work with, such that I wouldn't have much hesitation in making it the first VCS for a team.
I certainly don't expect them to be doing rebasing, bisecting, or force pushes anytime soon, nor would I suggest they start by setting each other as remotes to take advantage of the distributed aspect. However stage, commit, merge, pull, and push operations on a central origin are all pretty simple, and not much different than they would be doing with any other VCS.

--
"They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
Re:git by Electricity+Likes+Me · 2015-10-11 07:04 · Score: 3, Insightful

No you're not? With Git it's not at all distributed unless you really really work at it. The simplest and most naive git model is "get latest head, edit, commit and push". This is what everyone is going to be doing with any other tool.
The difference is, when they get more advanced, you'll be in the good company of the *massive* git ecosystem and featureset which will make your life a lot easier. If you're dealing with people who don't know version control, then it doesn't matter what you pick - they are not going to understand it and you will be doing a lot of support.
Re:git by Pseudonymous+Powers · 2015-10-11 08:14 · Score: 4, Informative

> I was with a small team of very experienced developers, and even for us going to git had a bunch of surprises. For me it's not so much the UI tools, it's understanding what's going on, and why git does what it does.
> That's what I mean when I say there's no simple formula of "do these 3 commands to do this, those 2 commands to do that". You have to understand WHY the commands are doing what they are doing.
That's certainly a common view of Git, but after using it for the last few years, I think that a lot of the problems that beginners have with it are happening because of this assumption. That is, when a developer asks how to merge their code into the shared Git repo for the first time, the wise old Git gurus point them at a site that explains how Git works at the molecular level, called The Git Book. This is almost never helpful, because your average Joe C. Programmer doesn't have time in his schedule to read an entire book, and even if he reads it over the weekend instead of, you know, having a life, he just ends up with his head full of crazy circles-and-arrows diagrams, which, divorced from any concrete, hands-on practice, only serves to confuse the issue more.
What the inexperienced Gitsperson actually needs at that point is a short and to-the-point workflow that he can use to get his goddamn code in the goddamn repo, like (commands for illustration purposes only, I use a Fisher-Price GUI): "git clone MyRepo; git switch master; git pull; git branch MyFeature; git switch MyFeature; [implement the code changes]; git commit; git push; git switch master; git pull; git merge MyFeature; [fix conflicts, resolve, commit again if necessary]; git push". And for the love of God, Newbie, please don't try to use "rebase", you'll just cripple our entire product at 5:30 pm on a Friday.
There's documentation of that kind out there, admittedly, but it's really hard to find among all the indistinguishable-from-autogenerated-prank-nonsense man pages and fifteen-part seminars on how the version hashing algorithm works.
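
For a team that would rather script the workflow quoted above than memorize it, a sketch along these lines might do. It departs from the quoted commands in two places, both of which are assumptions added here: it stages files with `git add --all` before committing, and it sets an upstream on the first push, because a plain `git push` on a brand-new branch is refused under Git's default settings. It also assumes the repository is already cloned, and `git switch` needs Git 2.23 or newer. The function names and the `origin`/`master` defaults are illustrative.

```python
import subprocess


def start_feature(feature, base="master"):
    """Commands to update the base branch and create a feature branch from it."""
    return [
        ["git", "switch", base],
        ["git", "pull"],
        ["git", "branch", feature],
        ["git", "switch", feature],
    ]


def publish_feature(feature, message, remote="origin"):
    """Commands to commit the work and push the feature branch for the first time."""
    return [
        ["git", "add", "--all"],  # assumption: stage every change in the working tree
        ["git", "commit", "-m", message],
        ["git", "push", "--set-upstream", remote, feature],
    ]


def merge_feature(feature, base="master"):
    """Commands to bring the feature branch back into the base branch."""
    return [
        ["git", "switch", base],
        ["git", "pull"],
        ["git", "merge", feature],
        ["git", "push"],
    ]


def run_all(commands, repo_dir):
    """Run each command inside repo_dir and stop at the first one that fails."""
    for argv in commands:
        subprocess.run(argv, cwd=repo_dir, check=True)
```

Because `run_all` uses `check=True`, a merge conflict stops the sequence with an exception; the conflict is then fixed by hand before pushing, just as the bracketed step in the quoted workflow says.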
EasyMercurial by Immerman · 2015-10-11 09:59 · Score: 3, Informative

I'll second the recommendation of Mercurial. There's also EasyMercurial, a nice little dead-simple GUI front end that lets you do a handful of the most common things (checkpoints, history overview, version comparisons, reversions, branching and merges) without having to learn much about the details. Very nice for beginners to get their feet wet with version control systems, and if/when they need something more powerful they can always use the command-line tools directly, or migrate to a more feature-rich GUI. But honestly, for a bunch of people without prior experience even the heavily restricted feature set will be a huge step forward.
There are probably similar "beginner" GUIs available for most of the major VCSes, but EasyMercurial is the only one I can vouch for. I would also lean towards recommending a distributed VCS that offers easy branching and merging for a bunch of VCS beginners - they seem to offer far less conceptual/procedural overhead to the "lone wolf" work flow, and thus are more likely to actually get used effectively.

--
--- Most topics have many sides worth arguing, allow me to take one opposite you.
Re:UH oh by Megane · 2015-10-11 10:10 · Score: 3, Insightful

LabView is not only proprietary, it's a visual programming language (connect a bunch of boxes with lines) that stores its stuff in binary blobs. So you can't do version control on it, or even diff it. If someone changes one of those little boxes in a big LabView project, you will likely never know who did it or when it happened, and good luck finding where it was changed. Or you might not even know that it happened at all, just things start acting screwy.
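
For contrast, this is roughly what a line-by-line diff of plain-text source looks like, which is the thing a binary blob cannot give you. The sketch uses `difflib` from the Python standard library; the file name `acquire.py` and the two versions of its contents are made up for the example.

```python
import difflib


def text_diff(old, new, name="acquire.py"):
    """Return a unified diff of two versions of a text file as a list of lines."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
            lineterm="",
        )
    )


# Two hypothetical versions of the same source file.
old_version = "gain = 10\nsamples = 1000\n"
new_version = "gain = 12\nsamples = 1000\n"

for line in text_diff(old_version, new_version):
    print(line)
```

The changed `gain` line shows up once with a `-` prefix and once with a `+` prefix, and a VCS attaches an author and a date to exactly that kind of change.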

--
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
Re:UH oh by __aaclcg7560 · 2015-10-12 15:00 · Score: 3, Interesting

I gave a trivial example where I wrote a Python script that took 123 seconds to do one million dice rolls. I then used Cython to convert the dice rolls into a C extension, which resulted in a Python script that executed in two seconds. I didn't convert the entire Python script into a C extension. If the goal is to standardize the code base from C to Python, Cython can fix the performance issues.
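
The script itself is not shown in the comment, so the following is only a hypothetical stand-in for the kind of pure-Python hot loop that is usually moved into a compiled extension. The function names are invented, and no claim is made that this reproduces the 123-second or two-second figures.

```python
import random
import time


def roll_dice(n_rolls, sides=6, seed=None):
    """Roll a die n_rolls times and return how often each face came up."""
    rng = random.Random(seed)
    counts = [0] * sides
    for _ in range(n_rolls):
        counts[rng.randint(1, sides) - 1] += 1
    return counts


def time_rolls(n_rolls):
    """Return the wall-clock seconds taken by roll_dice(n_rolls)."""
    start = time.perf_counter()
    roll_dice(n_rolls)
    return time.perf_counter() - start
```

Timing `time_rolls(1_000_000)` before and after compiling `roll_dice` would be one way to check whether an extension pays off on a given machine, while the rest of the script stays ordinary Python.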