Slashdot Mirror


Ask Slashdot: Selecting a Version Control System For an Inexperienced Team

An anonymous reader writes: I have been programming in Python for quite a while, but so far I have not used a version control system. For a new project, a lot more people (10-15) are expected to contribute to the code base, many of them have never written a single line of Python but C, LabVIEW or Java instead. This is a company decision that can be seen as a Python vs. LabVIEW comparison — if successful the company is willing to migrate all code to Python. The code will be mostly geared towards data acquisition and data analysis leading to reports. At the moment I have the feeling, that managing that data (=measurements + reports) might be done within the version control system since this would generate an audit trail on the fly. So far I have been trying to select a version control system, based on google I guess it should be git or mercurial. I get the feeling, that they are quite similar for basic things. I expect, that the differences will show up when more sophisticated topics/problems are addressed — so to pick one I would have to learn both — what are your suggestions? Read below for more specifics. These are the requirements I can see so far:
- __Server_running_locally__ (as opposed to in the cloud) on windows (IT departments choice, non-negotiable)
- Good/easy to use Windows clients (IT departments choice / company policy, again non-negotiable)
- Use windows credentials (maybe, single sign on)
- Open source server/client (personal preference)
- Well established Project that will not disappear/ get unmaintained within a foreseeable future
- Do basic test on the code (Syntax errors, pytest/nose/or alike with coverage (of tests), check coding style)
- email notifications
- good documentation
- reasonable price for 5 — 10 users : free — 500€

Things that would be great ...
- web interface (like github) would be nice
- integration of bug tracking / bug reports
- possibility to do and print out a code review
- some kind of jupyter / ipython integration

Things I am not sure I will need but seem to be a good idea at the time of writing...
- Include other files/ file types for measurement data, documentation and user manuals (docx, xml, xlsx, gz, ...)
- When thinking about measurement data /reports it would be great to have digital signatures (--> FDA compliant). I know this is extremely hard, if this exists I would love it, if not I am fine. Somehow this feels like mixed document/version control, but I would love to have data + code + text = report at the same place to easily find implications of a bug — which data has to be re-evaluated and so on.

8 of 325 comments (clear)

  1. Re:So many options by benjfowler · · Score: 4, Informative

    Oh God, stay the fuck away from SourceSafe.

    SourceSafe is an absolutely terrible choice, since it is actively user-hostile, and has the alarming habit of eating your source code at the worst possible time. Rational Clearcase is almost as bad.

  2. Stay away from git for "inexperienced team" by iamacat · · Score: 2, Informative

    Source control with git is like using (char *) &myStruct in C. Very flexible, but impossible to explain to someone who wants to do simple tasks, and most commands result in corrupting your work. Including correct commands accidentally used twice. Worked with it for two years, still regularly find things that baffle me.

    Better to start with a comprehensible tool like svn and a good IDE with a source control plugin such as IntelliJ. You might migrate several years down the road, but by that time either team will become experienced enough to use git, or hopefully something better comes along.

  3. Re:CVS or Subversion by sugar+and+acid · · Score: 3, Informative

    It is perfectly possible to branch in SVN and manage it. Git is better for branching and developing in complex and large team environments. But this is not the case here. They probably have max 3 guys maintaining and max 3 guys on a development branch. SVN is more than capable of handling that.

  4. OUCH!!! by LostMyBeaver · · Score: 3, Informative

    I'll start by answering your question. Use GIT. It's the most widely supported system at this time and it works really well.

    Next let me be a typical slashdot asshole that makes abrasive comments that may be well intended by will come off as being a dick. I'll explain that I already see endless problems coming from this.

    If you're working with a team of 10-15 developers who all lack experience with version control, you have a major problem with out-of-date programmers and you're throwing them into a hell called Python. If you generally accomplish projects using C and LabView, the developers you have more than likely lack a modern development skill set and coding in a language like Python will produce some of the worst code ever written. If C is like shooting yourself in the leg and C++ is like blowing the whole damned leg off, Python is like dropping a nuke. You will have an endless supply of options for writing terribly bad code in the worst ways possible. The only redeeming feature will be it will have nice uniform spacing.

    I would highly recommend doing what always works best which is to hire a Python developer with good GIT skills that can lay the majority of the foundation of the project and create a uniform set of standards of coding for the project and then bring the other developers on 3 at a time and perform constant code review. Focus heavily on test driven development and use a system like SCRUM for lifecycle management. If you want to teach old dogs new tricks, don't just throw them in the fire and tell them to figure it out. The programming paradigms are so drastically different between your old method and new that without some sort of leader with experience, it will turn out to be a disaster and jungle of crap code. I personal avoid Python projects not because the language is bad, but instead because they tend to be like this.

    You should of course know by now that if you are traditionally a LabView shop, you're going to sacrifice a massive number of really important features to save a buck. Python has great support too multi-threading but it's not an awesome environment for event driven programming like LabView is. You of course can accomplish all the same things, but even with the thousands of toolkits/libraries out there, you'll have to write the entire underlying architecture yourselves and you'll lose almost all visualization you've come to depend on.

  5. Ok, others seem to have missed this by Anonymous Coward · · Score: 2, Informative

    Knowing what industry you're working in helps to understand what you're trying to do and what you actually need. Others have seemingly missed this, but you did mention FDA compliance at the end of your extended list of requirements. So, I'm guessing you're a life sciences, pharma or drug discovery company working with microarrays and other data acquisition hardware where the data WAS passed to LabView and then processed there or in one of the Java, C, or other apps you mentioned that were developed in-house. The processed data is then documented with reports.

    You actually have three problems and not one. You have a code versioning issue that requires version control for better debugging and maintenance going forward, a data cataloging problem and a document management problem. I would imagine, that some groups either have or will have visualization and graphics data to manage as well, but we'll leave that out for now.

    You mention that the developers have no background using an off-the-shelf version control system. This, in the modern development era, is ... scary! As was mentioned above in several places, a simple check-in, check-out system would probably be a good place to start with the plan to migrate to something more sophisticated in a three to four year time window. You know these developers are going to want more features down the road, but aren't ready for the full enchilada on day one as that would be a lot messier to deal with for a longer period of time than having a migration plan and starting them off slowly. I would recommend a CVS or SVN variant for the initial launch and then have a migration plan to something like Git. This gives you time to get the initial setup running and the developers have a flatter on-ramp to usage while you (and IT) come up to speed on Git in the background before a planned deployment a couple years down the road. In the end, a plan like this will save you time and pain in the long run, oh, and management will save $$$ in the process.

    Data cataloging may not be that great an issue, but it does require some thought. Not sure how data is being cataloged now, but there are a few commercial products that will help with this process. Some are desktop only and then there are those tied to commercial database systems like Oracle. You'd really have to do some research based on your actual needs to find one that works best for the researchers. Can't touch this topic sight unseen.

    Then, there's the document management problem. Being a Windows shop there are Microsoft options for this, as well as commercial and open source tools. Again, not being familiar with your internal workflow and budgets I can't be more specific than that.

    Trying to do everything with a version control system is just foolish. You're only going to create more headaches for yourself, the IT team and the researchers/developers than will be mitigated by a single system trying to manage different verticals within a workflow. DON'T DO IT!!! Use the Keep It Simple, Stupid rule along with the right-tool for the right job approach. In the end, you, your IT team and your users will be happier and more productive, that will make management very happy and cost a lot less in the long run.

    Cheers and good luck.

  6. Re:git by Pseudonymous+Powers · · Score: 4, Informative

    I was with a small team of very experienced developers, and even for us going to git had a bunch of surprises. For me it's not so much the UI tools, it's understanding what's going on, and why git does what it does.

    That's what I mean when I say there's no simple formula of "do these 3 commands to do this, those 2 commands to do that". You have to understand WHY the commands are doing what they are doing.

    That's certainly a common view of Git, but after using it for the last few years, I think that a lot of the problems that beginners have with it are happening because of this assumption. That is, when a developer asks how to merge their code into the shared Git repo for the first time, the wise old Git gurus point them at a site that explains how Git works at the molecular level, called The Git Book. This is almost never helpful, because your average Joe C. Programmer doesn't have time in his schedule to read an entire book, and even if he reads it over the weekend instead of, you know, having a life, he just ends up with his head full of crazy circles-and-arrows diagrams, which, divorced from any concrete, hands-on practice, only serves to confuse the issue more.

    What the inexperienced Gitsperson actually needs at that point is a short and to-the-point workflow that he can use to get his goddamn code in the goddamn repo, like (commands for illustration purposes only, I use a Fischer Price GUI): "git clone MyRepo; git switch master; git pull; git branch MyFeature; git switch MyFeature; [implement the code changes]; git commit; git push; git switch master; git pull; git merge MyFeature; [fix conflicts, resolve, commit again if necessary]; git push". And for the love of God, Newbie, please don't try to use "rebase", you'll just cripple our entire product at 5:30 pm on a Friday.

    There's documentation of that kind out there, admittedly, but it's really hard to find among all the indistinguishable-from-autogenerated-prank-nonsense man pages and fifteen-part seminars on how the version hashing algorithm works.

  7. EasyMercurial by Immerman · · Score: 3, Informative

    I'll second the recommendation of Mercurial. There's also EasyMercurial, a nice little dead-simple GUI front end that lets you do a handful of the most common things (checkpoints, history overview, version comparisons, reversions, branching and merges) without having to learn much about the details. Very nice for beginners to get their feet wet with version control systems, and if/when they need something more powerful they can always use the command-line tools directly, or migrate to a more feature-rich GUI. But honestly, for a bunch of people without prior experience even the heavily restricted feature set will be a huge step forward.

    There's probably similar "beginner" GUIs available for most of the major VCSes, but EasyMercurial is the only one I can vouch for. I would also lean towards recommending a distributed VCS that offers easy branching and merging for a bunch of VCS beginners - they seem to offer far less conceptual/procedural overhead to the "lone wolf" work flow, and thus are more likely to actually get used effectively.

    --
    --- Most topics have many sides worth arguing, allow me to take one opposite you.
  8. Re:CVS or Subversion by jrumney · · Score: 3, Informative

    I never thought I would see a recommendation for CVS in 2015. The OP is on the right track looking at git and mercurial to start with. The only probem with his requirements are the Windows server. Maybe a virtual machine running on the Windows server would be acceptable to IT? While it is possible to run a git or mercurial server on Windows, there are a lot of good tools that would give the "things that would be great" that are not supported on Windows. On the client side, TortoiseGit and TortoiseHg are available, giving the same Explorer integration as TortoiseSVN/TortoiseCVS.