Ask Slashdot: Selecting a Version Control System For an Inexperienced Team
An anonymous reader writes: I have been programming in Python for quite a while, but so far I have not used a version control system. For a new project, a lot more people (10-15) are expected to contribute to the code base; many of them have never written a single line of Python, only C, LabVIEW or Java. This is a company decision that can be seen as a Python vs. LabVIEW comparison: if successful, the company is willing to migrate all code to Python. The code will be mostly geared towards data acquisition and data analysis leading to reports. At the moment I have the feeling that managing that data (= measurements + reports) might be done within the version control system, since this would generate an audit trail on the fly. So far I have been trying to select a version control system; based on Google, I guess it should be git or mercurial. I get the feeling that they are quite similar for basic things. I expect that the differences will show up when more sophisticated topics/problems are addressed, so to pick one I would have to learn both. What are your suggestions? Read below for more specifics.
These are the requirements I can see so far:
- Server running locally (as opposed to in the cloud) on Windows (IT department's choice, non-negotiable)
- Good/easy to use Windows clients (IT department's choice / company policy, again non-negotiable)
- Use Windows credentials (maybe single sign-on)
- Open source server/client (personal preference)
- Well established project that will not disappear or become unmaintained in the foreseeable future
- Do basic tests on the code (syntax errors, pytest/nose or alike with test coverage, coding-style checks)
- email notifications
- good documentation
- reasonable price for 5 — 10 users : free — 500€
Things that would be great ...
- web interface (like github) would be nice
- integration of bug tracking / bug reports
- possibility to do and print out a code review
- some kind of jupyter / ipython integration
Things I am not sure I will need but seem to be a good idea at the time of writing...
- Include other files/ file types for measurement data, documentation and user manuals (docx, xml, xlsx, gz, ...)
- When thinking about measurement data /reports it would be great to have digital signatures (--> FDA compliant). I know this is extremely hard, if this exists I would love it, if not I am fine. Somehow this feels like mixed document/version control, but I would love to have data + code + text = report at the same place to easily find implications of a bug — which data has to be re-evaluated and so on.
You sound like a hardware company. Nothing worse than getting EEs to see the logic in versioning. They'll all be in their corner doing it their way because it's better....
As far as I can tell, you're describing the classic CVS or Subversion small team setup. You can run a server on the network (via Apache, or via SSH), run ViewCVS, set up checkin hooks, and give your clients a nice client like TortoiseCVS/TortoiseSVN built into Windows Explorer.
If you want integration with bug tracking tools, then have a look at Bugzilla and Bonsai.
All your users need to know about, is check in, and checkout, so the cognitive overhead is low.
It would take one engineer half a day to set all this stuff up on a spare machine, and you could try it out fairly quickly.
And best of all, this setup is gratis as well as Free. This has worked really nicely for me in both an academic and a commercial environment.
Perforce is what I would recommend, but it may not fit your budget.
Although, anything but git will work.
We use in my company. Excellent merging, and good support for branch per task. Also user friendly, and good support. If you are allowed open source, git is the obvious choice.
Easy to learn, cross platform, enterprise grade, supports atomic commits (unlike CVS), free and open source, well documented and (even better) easy to google the answer to "how do I..." questions.
The capstone of the evolution of 30 years of traditional version control development.
Git. Everyone is using it now. Next question.
For side projects I use git (github, but there are other hosted options, or you can run your own server). At work we use TFS. I've been at this a while and have seen CVS, SVN, Source Safe, and there are many more out there. Try a few out, pilot one or two, find what fits. Read a book or two about whatever you choose so you get the most out of it and learn the best practices for using it.
Except
That's not really a function of version control
Fossil: http://fossil-scm.org/index.html/doc/trunk/www/index.wiki
In fact, what you want is several different tools, at least a VCS and an integrated build:
https://en.wikipedia.org/wiki/...
At my company, we are using CruiseControl.NET, which is free and open-source, but seems discontinued.
It's sufficient for our needs.
We use a SVN server to make our commits (with TortoiseSvn on the clients), it's dead simple to install and use.
Configuring CruiseControl is more tedious, but you'll get automated builds, along with code coverage and unit tests.
A better tool may exist, but we use this one.
I think git can meet all of your needs, and personally I love it.
- It's a free, well-established, and well-documented open source project.
- There are plenty of GUIs.
- For inexperienced developers, there are tutorials like this one.
- Here's a decent guide to getting password-less authentication via ssh working on Windows to connect to a server running locally on a Windows box (as long as it's running OpenSSH, maybe via cygwin).
- You can use Git hooks to do notifications, run syntax checks, etc.
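To make the hooks point concrete, here is a minimal sketch of a pre-commit hook that refuses a commit when a staged Python file does not even compile. It assumes the script is saved as .git/hooks/pre-commit and that a Python 3 interpreter is on the PATH; the function names are made up for illustration, and it only catches syntax errors, not style or test failures.

```python
#!/usr/bin/env python3
"""Sketch of a Git pre-commit hook that rejects Python syntax errors."""
import subprocess
import sys


def syntax_error_in(source, filename="<staged>"):
    """Return a one-line error description, or None if the source compiles."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None


def staged_python_files():
    """List added/copied/modified .py files in the index.

    Returns an empty list if git is not installed or this is not a repository.
    """
    try:
        result = subprocess.run(
            ["git", "diff", "--cached", "--name-only", "-z", "--diff-filter=ACM"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return []
    # -z separates file names with NUL bytes, so odd characters survive intact.
    return [name for name in result.stdout.split("\0") if name.endswith(".py")]


def check_staged_files():
    """Collect syntax errors from the working-tree copies of the staged files."""
    problems = []
    for name in staged_python_files():
        # Simplification: this reads the working-tree file, which can differ
        # from the staged content when a file is only partially staged.
        with open(name, encoding="utf-8") as handle:
            error = syntax_error_in(handle.read(), name)
        if error:
            problems.append(error)
    return problems


if __name__ == "__main__":
    found = check_staged_files()
    for line in found:
        print(line, file=sys.stderr)
    if found:
        sys.exit(1)  # a non-zero exit status makes git abort the commit
```

Keep in mind that client-side hooks are not copied by a clone, so each developer would have to install this themselves; for anything you want enforced, a server-side hook or a CI job is probably the safer place.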
I'd recommend running Git rather than Mercurial. Use SourceTree as GUI and it will be great for inexperienced users. Using a DVCS requires some initial effort from each team member to learn the basic concepts of versioning.
I further recommend working with the Atlassian stack with Jira (and possibly Bamboo and Stash later on).
git
gerrit (especially how it is implemented by LibreOffice/Openstack)
Launchpad (what Ubuntu uses) also just added git support, and it's $250 flat a year for a proprietary project. This would not be hosted locally though (https://help.launchpad.net/CommercialHosting). I wouldn't recommend starting a new project with Bazaar today.. but if this was 5+ years ago it might be perfect for your use case.. (great Windows client)
I'm always going to recommend git as the version control system of choice. It scales well, and you can learn how it works without mucking with servers to start. Plus github.com has some good tutorials, and there are several web interfaces available. If you could convince your IT department to let you use a cloud based system, github would actually be perfect. Also, the speed. Don't underestimate how important that is.
Here's a list of reasons to use it instead of SVN or CVS: http://www.gitguys.com/topics/...
Almost all of the requested features are possible with most version control systems, but, like back end infrastructure, require someone knowledgeable about that particular system to set things up. For instance, there are commit hooks to handle sending E-Mails and doing code checking, but that requires editing the right file.
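As a rough idea of what "editing the right file" can look like for the e-mail case, here is a sketch of the pieces a Git post-receive hook would need. Git feeds that hook one "old-sha new-sha refname" line per updated ref on standard input; everything else here (the function names, the addresses, the "[repo]" subject prefix, an SMTP server on localhost) is an assumption for illustration.

```python
"""Sketch of the building blocks for a Git post-receive e-mail notification."""
from email.message import EmailMessage
import smtplib


def parse_ref_updates(lines):
    """Parse the '<old> <new> <ref>' lines git passes to post-receive on stdin."""
    updates = []
    for line in lines:
        parts = line.split()
        if len(parts) == 3:  # silently skip anything that is not a ref update
            old, new, ref = parts
            updates.append({"old": old, "new": new, "ref": ref})
    return updates


def build_notification(update, sender, recipients):
    """Build a plain-text message describing one ref update."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"[repo] {update['ref']} updated to {update['new'][:8]}"
    msg.set_content(
        f"{update['ref']} moved\nfrom {update['old']}\nto   {update['new']}\n"
    )
    return msg


def send(msg, smtp_host="localhost"):
    """Hand the message to an SMTP server (host name is an assumption)."""
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)
```

The hook script itself would then loop over parse_ref_updates(sys.stdin) and call send(build_notification(...)) for each entry. Hosted stacks such as GitLab do this for you, which is part of why people recommend them.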
You cannot go wrong with Git. It's free, most popular, and quite honestly the best VCS due to its simplicity and distributed nature. Just because it's most common in open source projects doesn't mean you can't use it for private enterprise. Check out gitlab...it's an enterprise version of github.
You'd be doing yourself and your colleagues a disservice if you picked an outdated VCS like CVS, Subversion, or Perforce. The rest of the world is already on the Git train; you can hop on or get left behind.
I recommend git. It's fast, it's easy, it's decentralized so a code cowboy can't burn your project. And there are GUIs for it for Windows as well: https://git-scm.com/download/g...
Since IT has set the policy to a Windows operating system only server, you've had your hands tied as to what technology you can use. Fortunately for you, you can run Docker on Windows: https://docs.docker.com/instal..., which means you'll have access to tens of thousands of Docker containers for various purposes such as gitlab: https://github.com/sameersbn/d...
For basic test on the code (Syntax errors, pytest/nose/or alike with coverage (of tests), check coding style) it sounds like what you're looking for might be jenkins: https://en.wikipedia.org/wiki/... and you can create a docker container for running jenkins on your server: https://github.com/jenkinsci/d... or https://wiki.jenkins-ci.org/di...
Use a GUI like Atlassian's "SourceTree". It's what we use at work, and it works pretty well. You'll still want at least one Git expert on the team for when someone does something stupid, but you'll need that for whatever platform you choose.
Super easy to set up, take a day with your team to learn the main functionality and you are good to go.
As your team gets more experienced, you will be happy you made that choice.
You can't handle the truth.
I'd go with Subversion. It's older and has a centralized repository rather than Git's distributed-repositories approach, but that won't be a problem for your team since they aren't spread out across multiple locations. It's got better support for running on Windows (CollabNet sells a supported commercial Windows-based server plus the whole TeamForge line), has Windows clients (both integrated into Explorer and stand-alone) and has supported integration with Visual Studio. Older means that almost every development tool out there for Windows understands how to interact with it. It's also easier for people who aren't familiar with version control to grasp SVN's model and how you interact with it (a commit is a commit, they don't have to understand the differences between their local copy of the repository and the origin copy on the Git server). Finally, SVN offers a degree of centralized control that makes management happy (eg. mandating commit comments in a certain form, controlling individual access to different parts of the directory tree).
If you can't do it in EMACS you're just not trying, poseur.
if successful the company is willing to migrate all code to Python
Sounds like a recipe for failure. While Python can do some amazing things, it's not a total replacement for C and Java in all use cases. The summary makes no mention as to why Python should be the only programming language for this project. Maybe Python programmers are cheaper than C and Java programmers these days?
Git. The best. Whether your company has one employee, or hundreds of thousands.
Source control with git is like using (char *) &myStruct in C. Very flexible, but impossible to explain to someone who wants to do simple tasks, and most commands result in corrupting your work. Including correct commands accidentally used twice. Worked with it for two years, still regularly find things that baffle me.
Better to start with a comprehensible tool like svn and a good IDE with a source control plugin such as IntelliJ. You might migrate several years down the road, but by that time either the team will have become experienced enough to use git, or hopefully something better will have come along.
I have used tortoise svn at a couple of jobs. One of which was in an electrical engineering group. We had files strewn about and everything was complete chaos. SVN allowed us to organize all of our files so anyone could find any revision of any file and track changes.
I remember it took a couple weeks to get our format set up, get all files in order, and train everyone on how to use it. We also did a weekly export as a backup/search file.
First we need to take a step back and figure out what you are actually doing. You have pulled up with a "software version control" bandwagon and everyone just jumped on without looking to see if it would take you where you wanted to go.
Are you wanting to keep track of the versions of your code or the reports generated by that code or the data that the code used to generate the reports? Each type of information is best suited for a different kind of versioning system. Are the reports generated only by the code or are they written by humans? Trying to use a code versioning system to keep track of modifications to reports or data is a losing game. Don't make the mistake of thinking every problem is a nail just because you have a hammer.
This isn't necessarily an endorsement but just an option that I have experience with. Gitlab, as far as I know, is just a knockoff of Github with slightly fewer features, but it's probably fairly close for most use cases.
We use it at my company and one of our offices has a very Windows-focused group of devs, while many of us in our office lean more toward Linux/BSD. The web interface is alright, it gets the job done. And I'm pretty sure you can self-host for free, but there are plenty of sites to check up on that. My experience isn't from an admin perspective but rather a user perspective, and it seems ok if your users aren't complete morons.
Mercurial: I personally haven't seen any other VCS easier on windows (tortoise hg is way better than most windows git alternatives).
For continuous integration you can try many of the choices out there but jenkins is good enough IMO.
Mind that the computer on which you will host the mercurial server and/or jenkins should be maintained by someone and this might pose some challenges for an "inexperienced team".
As far as your basic requirements are concerned, pretty much any major (git, svn, mercurial) open source version control system will cater for them, with some third party (mostly) free tools. Local server, well established, open source, email notification via hooks, extensive (if not easy to read) documentation ... all of these would be covered by the VCS itself. Single sign on integration with Active Directory (AD) can probably be set up using an LDAP extension. Many windows clients exist, most catering to several VCSs at once; which are good and which are bad, I often find is a matter of personal taste. Tortoise* and sourcetree seem to be the most popular at the moment. Tests are generally a matter for the project itself, i.e. part of the code, and automating testing based on source control activity (e.g. test on new commits) can also be done using scripting hooks, although you might prefer some kind of continuous integration system like jenkins.
For your 'nice-to-haves', you would be looking at a third party stack. I personally would recommend gitlab. It comes with baked in issue tracking, project wikis for documentation/planning, email notifications without you having to script hooks, LDAP/AD integration (iirc, never used it myself), merge/pull requests (i.e. a form of code review). You can attach/upload files of any type to issues/comments/wiki pages, not sure if that's what you are looking for. Alternatively, you could look at gitstack, which just fits into your price range and covers most of the maintenance/admin headaches by the looks of it. I've never used it; I found it by googling.
Finally, git (and possibly mercurial and svn) has a way to sign off commits using a GPG key. This work flow is also accessible through gitlab. Basically, a change is made and committed to a branch which is then pushed to the gitlab server. This generates a pull request to some pre-designated branch (e.g. trunk/development/whatever). When the pull request is approved, it can be signed using the approver's GPG key. I'm not sure if this covers your specific use case; I'm afraid I'm not sure exactly what you want from the signing part of your requirements.
DISCLAIMER: This advice is based exclusively on personal experience, does not constitute legal advice, makes no guarantee of merchantability or fitness to a particular purpose implied or otherwise, did not harm any kittens in the making thereof, and may cause the reader distress by making them learn something.
Use a GUI like Atlassian's "SourceTree". It's what we use at work, and it works pretty well. You'll still want at least one Git expert on the team for when someone does something stupid, but you'll need that for whatever platform you choose.
With an inexperienced team you'd probably be better off with SVN. It's easier for complete noobs to understand and a bunch of noobs is not likely to need the extra features you get with Git. By the time they have gotten comfortable with SVN and you feel that your team is ready for more complex work you can always upgrade to Git.
Having used both HG and Git, I can safely say that in a Windows environment, HG wins hands down in terms of GUI. The main client for HG, TortoiseHG, is slick and polished and miles ahead of its Git competitor TortoiseGit - so far ahead, in fact, that I never had to drop into the command-line with HG, whereas I spend most of my time there with Git.
I know all of Slashdot will tell you to use Git because Linus Torvalds invented it and therefore it's cool, but in my experience, Mercurial is by far the better product.
I bet you could just set up ZFS and snapshot it frequently. Run it in FreeNAS mounted through your preferred file sharing system. It's not really version control, but if you want to hide from real version control, it might work.
If you want real version control and something very simple, do SVN with no branches (branches can become complex). If you actually want branching, use git.
I'll start by answering your question. Use GIT. It's the most widely supported system at this time and it works really well.
Next let me be a typical slashdot asshole who makes abrasive comments that may be well intended but will come off as being a dick. I'll explain that I already see endless problems coming from this.
If you're working with a team of 10-15 developers who all lack experience with version control, you have a major problem with out-of-date programmers and you're throwing them into a hell called Python. If you generally accomplish projects using C and LabView, the developers you have more than likely lack a modern development skill set and coding in a language like Python will produce some of the worst code ever written. If C is like shooting yourself in the leg and C++ is like blowing the whole damned leg off, Python is like dropping a nuke. You will have an endless supply of options for writing terribly bad code in the worst ways possible. The only redeeming feature will be it will have nice uniform spacing.
I would highly recommend doing what always works best, which is to hire a Python developer with good GIT skills who can lay the majority of the foundation of the project and create a uniform set of coding standards for the project, and then bring the other developers on 3 at a time and perform constant code review. Focus heavily on test driven development and use a system like SCRUM for lifecycle management. If you want to teach old dogs new tricks, don't just throw them in the fire and tell them to figure it out. The programming paradigms are so drastically different between your old method and new that without some sort of leader with experience, it will turn out to be a disaster and a jungle of crap code. I personally avoid Python projects not because the language is bad, but because they tend to be like this.
You should of course know by now that if you are traditionally a LabView shop, you're going to sacrifice a massive number of really important features to save a buck. Python has great support for multi-threading but it's not an awesome environment for event driven programming like LabView is. You of course can accomplish all the same things, but even with the thousands of toolkits/libraries out there, you'll have to write the entire underlying architecture yourselves and you'll lose almost all the visualization you've come to depend on.
Really about 1/4th of what you need is a VCS, and 3/4s of what you need is a sensible, documented and enforced process that requires unit tests and reproducible builds. I've been in the industry for 25 years now and have only seen this a couple of times. Sun's was very strict and required an 11 page form to be filled out so that the version control branch you were updating to could be unlocked. You had to include what feature or bug the update addressed, a description of what your code did, a sign off from a code review board and the diffs for the commit. Once you checked it all in, an automated system would pick up the changes, build and test them and send out an E-Mail blaming you if the build broke. Rogue Wave software also had an automated build and test system which would do nightly builds of their libraries over all the systems they supported (Which was damn close to all the systems that were ever invented.)
At most other places, the build process was an afterthought that was thrown together by the developers on the team. This could be anything from some hastily-assembled makefiles to home-rolled shell scripts. Java projects would typically use ant or maven. Or occasionally ant AND maven. I've encountered one or two java projects that used make. I've also encountered one or two projects where they couldn't guarantee or had forgotten how to do a reproducible build. These ranged from "Oh just run make 3 or 4 times in the top level until all the build errors from missing libraries go away" to "Steve was running a jenkins server on his personal workstation and it got shut down when he was laid off. Can you fix that for us while you're at it?" If any of that sounds like where you're at, the first step to recovery is admitting you have a problem.
The more I try to pick apart what you're trying to do, the more convinced I am that you should have multiple tools for the different problems you're trying to solve. That being said, if you want to do everything in one tool, I honestly think TFS is your best solution.
_PLEASE_ don't use TFS.
As far as distributed vs centralized source control goes, I would recommend using distributed, since it means you won't pick up any hard-to-break thought patterns from centralized source control. DVCS is a little bit more complicated, but it does merging so much better that I believe it's always worth it. Plus, to paraphrase Linus, you get backups by sending your data across the network! The downside is that neither git nor mercurial does well with large binary (rather than text) files out of the box.
Mercurial has better windows clients, a more beginner friendly interface, and is written in python, so if you need something fancy it would be easy to write an extension. Git is more commonly used (so you'd be developing skills more likely to benefit you later), is more powerful, and is much more command line focused. As far as local hosting goes, http://www.jeremyskinner.co.uk/mercurial-on-iis7/ has a slightly old but still fairly accurate list of what to do to set up mercurial on IIS. Presumably your IT department will be able to configure IIS to use AD credentials.
As far as digital signatures go, both git and mercurial support signing commits with GPG. Git has this functionality out of the box in version 1.7.9+, and mercurial supports it through either the GpgExtension or the CommitsigsExtension.
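On the git side, commits are signed with "git commit -S", and the log format placeholder %G? prints a one-letter signature status per commit (G for a good signature, N for none, B for bad, and a few others). Below is a small sketch that uses this to list commits without a good signature. The function names are made up for illustration, and treating every status other than G as a failure is my own strict assumption. Whether GPG signatures are enough for an FDA-style audit trail is a separate question I can't answer.

```python
"""Sketch: list commits in a range that do not carry a good GPG signature."""
import subprocess


def parse_signature_log(text):
    """Map commit hash -> status letter from 'git log --format="%H %G?"' output."""
    statuses = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            commit, status = parts
            statuses[commit] = status
    return statuses


def unsigned_or_bad(statuses):
    """Return the commits whose status is anything other than 'G' (good)."""
    return sorted(commit for commit, status in statuses.items() if status != "G")


def signature_log(rev_range="HEAD"):
    """Run git and return the raw '%H %G?' listing for the given range."""
    result = subprocess.run(
        ["git", "log", "--format=%H %G?", rev_range],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling unsigned_or_bad(parse_signature_log(signature_log("master"))) from a CI job would give you a list to complain about. Note that git can only verify signatures for keys present in the keyring of the account running the check.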
For compiling code, running tests, and (possibly) running reports, you really want a Continuous Integration server. Based on your experience level, I'd recommend TeamCity. It has an impressive number of integrations, and you can administer all of the configuration in an easy web UI. I would expect the free tier (3 agents to do the work, up to 20 configurations) to work for you. If Open Source is a big deal, Jenkins is the CI of choice, but it requires significantly more expertise in continuous integration to get up and running.
Finally, while I'm not sure how you're planning on getting data, I'd recommend against including data and reports in the same repository as your source code. While I'm not sure what they'll be, I'm sure it will cause you headaches because of the amount of data you'll be adding to source control. I haven't ever had to deal with storing versions of non-text data before, so I don't have a good solution for that, but I've seen what happens when too many things get stuck in source control, and it isn't pretty.
Just go with GIT. The truth is no version control or SCM software is very good; they all suck in a lot of ways and have very limited strengths. One to totally stay away from is Perforce: it's unstable, unsafe and you run a serious chance of losing your code in a corrupted mess of memory errors.
Be precise and thorough about your check-in process. Each developer's modifications should pass tests under varying sets of real world inputs. I develop Python code for our observatories using ASCOM hardware simulators for the DAQ process. Convenient, but not the same as the real thing.
Knowing what industry you're working in helps to understand what you're trying to do and what you actually need. Others have seemingly missed this, but you did mention FDA compliance at the end of your extended list of requirements. So, I'm guessing you're a life sciences, pharma or drug discovery company working with microarrays and other data acquisition hardware where the data WAS passed to LabView and then processed there or in one of the Java, C, or other apps you mentioned that were developed in-house. The processed data is then documented with reports.
You actually have three problems and not one. You have a code versioning issue that requires version control for better debugging and maintenance going forward, a data cataloging problem and a document management problem. I would imagine, that some groups either have or will have visualization and graphics data to manage as well, but we'll leave that out for now.
You mention that the developers have no background using an off-the-shelf version control system. This, in the modern development era, is ... scary! As was mentioned above in several places, a simple check-in, check-out system would probably be a good place to start with the plan to migrate to something more sophisticated in a three to four year time window. You know these developers are going to want more features down the road, but aren't ready for the full enchilada on day one as that would be a lot messier to deal with for a longer period of time than having a migration plan and starting them off slowly. I would recommend a CVS or SVN variant for the initial launch and then have a migration plan to something like Git. This gives you time to get the initial setup running and the developers have a flatter on-ramp to usage while you (and IT) come up to speed on Git in the background before a planned deployment a couple years down the road. In the end, a plan like this will save you time and pain in the long run, oh, and management will save $$$ in the process.
Data cataloging may not be that great an issue, but it does require some thought. Not sure how data is being cataloged now, but there are a few commercial products that will help with this process. Some are desktop only and then there are those tied to commercial database systems like Oracle. You'd really have to do some research based on your actual needs to find one that works best for the researchers. Can't touch this topic sight unseen.
Then, there's the document management problem. Being a Windows shop there are Microsoft options for this, as well as commercial and open source tools. Again, not being familiar with your internal workflow and budgets I can't be more specific than that.
Trying to do everything with a version control system is just foolish. You're only going to create more headaches for yourself, the IT team and the researchers/developers than will be mitigated by a single system trying to manage different verticals within a workflow. DON'T DO IT!!! Use the Keep It Simple, Stupid rule along with the right-tool for the right job approach. In the end, you, your IT team and your users will be happier and more productive, that will make management very happy and cost a lot less in the long run.
Cheers and good luck.
For $10, you can BUY a 10 seat licensed, self-installed Bitbucket Server that provides everything github does with full admin control, LDAP integration, SSH, Pull Requests, etc. And it's all git underneath, baby.
Head over to Atlassian, you won't regret it.
There's also the modest Fossil SCM by SQLite author D. Richard Hipp. https://en.wikipedia.org/wiki/Fossil_(software)
For such a small team, it might do the job.
Use it simply at first and you'll have no issues. Make 2 branches (dev and master) so you can make sure your release branch doesn't get screwed up by an inexperienced dev. I've had to detangle a few of these in my time, and 30 seconds of someone admitting they didn't know how to do something would have saved me 2 hours of work on every occasion.
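If you want the server to enforce the dev/master split rather than rely on good behaviour, a server-side "update" hook can do it. Git calls that hook once per pushed ref with three arguments (ref name, old SHA, new SHA) and rejects the ref if the hook exits non-zero. The sketch below shows only the decision logic; the maintainer names are hypothetical, and how you identify the pushing user depends on your setup (with per-user SSH accounts, getpass.getuser() is one option).

```python
"""Sketch of the decision logic for a server-side Git 'update' hook."""

PROTECTED_REF = "refs/heads/master"   # the release branch from the advice above
MAINTAINERS = {"alice", "bob"}        # hypothetical account names


def push_allowed(refname, user, maintainers=MAINTAINERS):
    """Anyone may push to other branches; only maintainers may update master."""
    if refname != PROTECTED_REF:
        return True
    return user in maintainers


def main(argv, user):
    """argv mirrors what git passes to 'update': script, refname, old, new."""
    if len(argv) != 4:
        print("usage: update <refname> <old-sha> <new-sha>")
        return 2
    refname = argv[1]
    if push_allowed(refname, user):
        return 0
    print(f"{user} may not push to {refname}; push to dev and ask for a merge")
    return 1
```

To wire it up, the hook file would end with something like sys.exit(main(sys.argv, getpass.getuser())). GitLab and Bitbucket Server offer protected branches in their web UIs, which is probably less work than maintaining this by hand.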
Git is one of few open source projects that uses a pun that is apt and isn't awful - even VB developers can use Git successfully.
I have been in the situation where we wanted to introduce versioning to newer or more junior programmers. What didn't work well was the open source solution. We used Subversion; it was quirky and unintuitive for them. (Of course, someone on this forum is going to hotly contest that Subversion is anything but intuitive and easy to use. Well, to that, you probably aren't representative of most programmers, and most likely, neither are your friends, if you have any.) And my experience is that most open source solutions are difficult to use--especially for novices--because they are in fact quirky or require a high level of expertise to use right, or both.
But when we switched to a commercial vendor (in our case, CA's Harvest), the team picked up the versioning system much faster than they did with Subversion. Harvest was more intuitive for them and had a LOT fewer quirks and bugs to deal with. That is, things more often than not worked as they were expected to in Harvest than in Subversion.
So my advice would be to look at some commercial vendor packages, pay them the bucks they are asking for, and enjoy the professional support they give you, the training, and quicker turn-around time for your project deliverables.
Don't select $foo for inexperienced, new users, etc. unless there's a valid pedagogical reason for doing so. Putting people in simulators on their first day of flight school is valid.
Starting a project with a VCS based on the fact that people are inexperienced is probably not valid. It seems like the learning curves will be roughly similar for all of them.
I have known both svn and git for many years and used them in dozens of projects, and git is clearly the better choice.
I would suggest 1-2 hours training by some git guru first.
But once I learned a bit of git I never wanted to use svn again.
For easier start try gitlab.
Either SVN/Trac for a small team and a moderate size code base.
SVN and Trac for bug handling works really well and Subversion is pretty easy to pickup if you don't already have any version control experience.
Otherwise go with Git/GitLab if your people prefer it. I find working with git to be more arcane but then I kind of grew up on CVS/SVN.
If you like the distributed model but have issues with Git then Mercurial is your next best option.
I would strongly advise against:
Perforce, ClearCase, Team Foundation Server, AccuRev on cost alone.
Their proprietary server based setups just make them horribly less functional than their open counterparts.
Rolling out CVS in the age of SVN is just silly. RCS and PVCS are just that much more ludicrous.
Hear me out. I know git is more popular—I prefer it myself—but mercurial has a much simpler conceptual model, is easier to learn, and offers nearly all of the benefits of git.
With git you really need to learn about the difference between "add" and "commit" and how the staging area works. That's a very useful feature, but it also complicates the teaching, and for basic day-to-day stuff, doesn't offer huge benefits. And git just has _so_ many commands. They're powerful, but intimidating to a newbie.
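To make the "add" versus "commit" distinction concrete, here is a toy model in Python. It is not how git is implemented; `ToyRepo` and the file names are made up for illustration. It only shows the one idea that trips up newcomers: a commit records what was staged, not what is currently on disk.

```python
class ToyRepo:
    """Toy model of three areas: working tree, staging area (index), history."""

    def __init__(self):
        self.working = {}   # path -> content currently on disk
        self.index = {}     # path -> content the next commit will contain
        self.history = []   # list of (message, snapshot) tuples

    def write(self, path, content):
        """Edit a file on disk; nothing is staged or committed yet."""
        self.working[path] = content

    def add(self, path):
        """Stage the current content of one file (the role of `git add`)."""
        self.index[path] = self.working[path]

    def commit(self, message):
        """Record a snapshot of the staging area (the role of `git commit`)."""
        self.history.append((message, dict(self.index)))


repo = ToyRepo()
repo.write("acquire.py", "v1")
repo.write("report.py", "draft")
repo.add("acquire.py")               # only this file is staged
repo.write("acquire.py", "v2")       # edited again after staging
repo.commit("Add acquisition script")

print(repo.history[-1][1])           # {'acquire.py': 'v1'}
```

The commit contains `acquire.py` as it was when staged ("v1"), and `report.py` is not in it at all. Mercurial skips the staging step by default, which is the simplification being praised here.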
Mercurial, on the other hand, has most of the power of git, but it's a lot more straightforward for the most part. The lack of a fast-forward capability means you end up with a lot more merge commits in your history, but that's not a huge deal. At least not at first. And it's fairly easy to migrate from mercurial to git later, once your team is more comfortable with the way the system works. So it's not like you're making a lifetime commitment.
Mercurial is less powerful than git overall, but it's a great introduction to the whole model of DVCS. And for day-to-day stuff, mercurial is definitely more than adequate.
Both git and mercurial are vastly superior to svn, especially for performance. Having to make network round trips for all but the most basic examinations of history is a serious disadvantage of svn. If you're just testing a script, for example, a bibisect can be many orders of magnitude faster with git or mercurial. And you can do it even if you're sitting in a hotel room and don't want to pay the outrageous wifi fees. You don't need a network at all. Using SVN in this day and age is simply inexcusable. There are absolutely no benefits—only disadvantages.
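For anyone who has not seen bisecting before, the idea is a binary search over history for the first revision where a test script fails. The sketch below is a plain Python illustration of that search, not the code behind `git bisect` or `hg bisect`; the revision numbers and the `is_bad` check are hypothetical, and it assumes a linear history whose first revision is good and whose last is bad.

```python
def first_bad(commits, is_bad):
    """Return (first bad commit, number of checks run).

    Assumes commits are ordered oldest to newest, commits[0] is good
    and commits[-1] is bad.
    """
    lo, hi = 0, len(commits) - 1     # commits[lo] is good, commits[hi] is bad
    checks = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        checks += 1
        if is_bad(commits[mid]):
            hi = mid                 # the breakage is at mid or earlier
        else:
            lo = mid                 # the breakage is after mid
    return commits[hi], checks


history = list(range(1000))          # hypothetical revision numbers
culprit, checks = first_bad(history, lambda rev: rev >= 637)
print(culprit, checks)
```

With 1000 revisions the loop needs at most 10 checks. Each check means checking out an old revision, which is where a local copy of the full history pays off compared with a network round trip per step.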
If you just want to get up and running with a vcs that will offer great benefits with minimal floundering while people learn the ins and outs of the system, mercurial is a pretty darn good place to start. If you have a little more time to spare getting everyone up to speed, though, it might be just as well to leap straight to git. *shrug*
"I have been programming in Python for quite a while, but so far I have not used a version control system."
Don't mean to be a pain but if you have no experience with versioning, not sure why you seem to be the one making critical choices (like dictating the language the team uses or what version control makes sense...)
Short answer is just use git. It's dominant. It's got some weird alien brains but there's going to be plenty of help and good examples. I find smart people manage and also it's sufficiently well designed that if someone really screws up you can usually fix stuff. Also your existing programmers will learn a skill they find valuable when they start applying for jobs somewhere else (usually the first thing people do when they are told to change languages)
Best of luck with the company decision to force all your existing programmers to flush their current skills in favor of some other language ;)
Peace, or Not?
Don't discount cultural inertia.
Moving away from labview will attract criticism (whether it be to python or any other language)
Adopting revision control will attract criticism (doing it at all, and the solution chosen specifically)
You've tied these two projects together, so unless everything works brilliantly any critic of any change
will be able to use any shortcomings to derail BOTH projects.
Pick one fight at a time else the group that like labview will ally with the group that dislike whatever
VCS you pick and defeat you.
Visual source safe is an excellent version control system for an inexperienced team.
A Microsoft product. Not expensive to deploy. And, it's actually a very good product.
B*llshit.
Git is difficult to grasp because
* it makes simple things complicated
* you need to read an entire book before using it
* and its manual reads like this: http://git-man-page-generator.lokaltog.net/
If you're switching from LabView to Python because it's an open, popular, well-supported language that's easy to find experienced programmers for, I can't recommend learning anything other than git for version control. SVN works, yes, and lots of people have previous experience with it, yes, but git is ubiquitous and there are lots of excellent tutorials that should have your team up and running on it in less than an hour. My biggest difficulty in moving from SVN to git was not git, but rather un-learning SVN and the assumptions and mindset needed to use it effectively.
There's the added bonus of finding extremely inexpensive ($25/month) git hosting (GitHub) for private teams of unlimited size.
For my personal use I am a fan of git, but if you have a team with the possibility for a constant server, and all team members have a decent network connection to it (and that is a big if), go for a subversion server and buy supported client tools for the applications where tortoiseSVN does not seem to fit.
My reasoning (favoring subversion over git) is as follows:
* in a beginner team, the reduced choices and the standard layout of a project in subversion are an advantage
* Migrating to git can happen at any time when you decide that these limits are not ok for you (typically they are)
* Subversion tools are typically better integrated on all platforms
* Integration in eclipse is better
But seriously, get a consultant who analyzes your actual requirements and sets up the system in the most productive way for you (believe me, the $1000 this may cost is money well invested if you avoid stupidly restricted or deformed workflows in your team)
I don't have a clear picture of what your organization will be doing, but your comment about "managing that data (=measurements + reports)" made me wonder if you will want to use the IPython notebook.
http://ipython.org/notebook.html
When people work to analyze measurements (make plots, etc. and make decisions) and then write new code, if they do so step by step in an IPython Notebook, and then other scientists can peer-review the notebooks, this might be even more useful to you than version control. It would give you a history of how the analysis was done and why the reports were made the way they were.
In my job, I do some analysis and work in databases, and I seriously want to start using IPython Notebook as my SQL client, and save my notebooks for later review. It would document the queries I ran and the results I got, so later I could find the queries again to re-run them, and see how they worked out before re-running them.
https://github.com/catherinedevlin/ipython-sql
lf(1): it's like ls(1) but sorts filenames by extension, tersely
I'll second the recommendation of Mercurial. There's also EasyMercurial, a nice little dead-simple GUI front end that lets you do a handful of the most common things (checkpoints, history overview, version comparisons, reversions, branching and merges) without having to learn much about the details. Very nice for beginners to get their feet wet with version control systems, and if/when they need something more powerful they can always use the command-line tools directly, or migrate to a more feature-rich GUI. But honestly, for a bunch of people without prior experience even the heavily restricted feature set will be a huge step forward.
There's probably similar "beginner" GUIs available for most of the major VCSes, but EasyMercurial is the only one I can vouch for. I would also lean towards recommending a distributed VCS that offers easy branching and merging for a bunch of VCS beginners - they seem to offer far less conceptual/procedural overhead to the "lone wolf" work flow, and thus are more likely to actually get used effectively.
--- Most topics have many sides worth arguing, allow me to take one opposite you.
You may be tempted to use a centralized one like SVN, and although it will be easy to use in the short term, down the road you will usually want to switch to decentralized.
Mercurial and git are similar enough that learning one allows you to use the other, so I would just say pick one. There are some awesome GUI tools as well -- TortoiseHg should suit your team's needs.
Pick something hosted that you don't have to manage. There are only a couple well supported systems.
I'm going to lose credibility with my peeps but....
VisualStudio.com is probably the easiest to spin up. We use it when we have to do a multi-company PoC or joint project. It comes with task management, scrum boards and other bits. You can set up your repositories as either TFS or as Git. They treat GIT as a first class citizen.
GIT is an expert's tool. There are several hosted repositories: GitHub, AWS CodeCommit, Bitbucket, the previously mentioned VisualStudio.com, etc.
The DesignSync and ProjectSync components of Dassault Systems Enovia Synchronicity will do almost all of what you ask, including versioning of text/binary files, windows client software, web based interface, integration with its bug tracking system or its customizable process flows such as reviews/approvals, customizable data sets, triggers, scripts, email alerts etc and excellent documentation to boot. Probably the only piece of software I have seen that does it all. Just a happy user for the last 10 years.
http://www.3ds.com/products-se...
... under version management? Do you re-do measurements, and then the new version replaces the old? It doesn't seem to make too much sense. Also, separation of code and data?
Subversion is pretty easy for everyone to understand. The single-rev-number defining a release is a lot better than the CVS tagging
system. We moved from CVS to Subversion a couple of years back and have been very happy with the decision.
This is an old-school company where everyone comes to work. The servers are in the closet.
Use Collab.net SVN. TortoiseSVN if you want an Explorer builtin client. Otherwise I'm sure, hope, your development IDE has some sort of SVN support.
First off, if you are doing LabVIEW then avoid the llb files and commit each VI to the repo individually. That way you can track which SubVI changed on an individual basis. Also, llbs will blow up the size of the repo as they are usually huge when compared to the size of the SubVI that you are actually changing. Having the individual VI's in the repo allows the commits to be small.
Secondly, SVN is great when everyone is in the same building, but if you are working remotely, then "git clone" can be critical for your offline work. You can "git clone" while you are on site, then future pulls are not so terrible when you are on a slow VPN. If you have no connection to your corporate network, you can still track your stuff. We're in a distributed world, and you will suffer without the distributed capability that Git offers.
I am finding that 100% of my clients starting new projects are using Git. SVN is only being used by people who set up their repos in the early 2000's.
SourceSafe is an abomination. We discovered that when we added PDFs, they came back out corrupted. We lost a bunch of schematics that way.
There's a learning curve, but skip the SVN and go for Git. Don't think about VSS.
Cheers!
Use git. For your server stuff you can use gitlab, which can be installed locally. I do not know if it is easy to install on Windows; however, what runs on the server is not important from a user's perspective. It works well on Linux.
Seems to promise the benefits of Git without having to manage the implementation. Not necessarily all that's wanted in the submitter's question, but thought I'd ask since I'm thinking of using it.
I've been a configuration manager and tools administrator for two decades now so hopefully you'll trust my judgement.
First, don't use svn or CVS. They're antiquated. That would be like walking around with an iPhone 1 right now in 2015. Actually, worse than that. It would be like walking around with a Blackberry. CVS and svn are previous generation tools, they don't hold up to modern code needs nor do they scale at all. Just say no to svn and CVS (but especially CVS.)
Git is modern, it works, it's reliable, it's used by everyone, is supported by just about every other tool that needs a version control hook, and you should consider it first if it fits your needs. It's distributed but it doesn't have to be used that way. Pair it up with GitLab and you have a lot more control (or more accurately, your users have more control, and you can enjoy less admin overhead) over what people can do with the code, and who can access what. It has lightweight bug tracking and pull/merge requests. Your inexperienced (which I assume is a euphemism for 'recent college grads') developers don't know anything, so you would be doing them a favor by teaching them a tool that people actually use instead of saddling them with 15 year-old knowledge. In its basic form, git isn't any more 'difficult' than svn or cvs. I personally would put up a very strong argument that you would be doing your developers actual harm in using an outdated version control tool. Mercurial falls into this category, too.
That being said, my main quibble with git is that it doesn't scale very well. I'm talking about the overall size of your stored code history. If you plan on submitting a lot of binaries, and keeping history of them, git will break down for you after a while without the help of third-party tools or clever 'shallow' cloning, etc. If it's just code, it all compresses well and it'll be a long time before you outgrow it (if ever), but there are ways to make git unbearable by putting lots of binary content in it.
Full disclosure, I'm a Perforce admin professionally. I don't work for the company, I'm just a cheerleader. If you're keeping lots of binary data in your version control tool, Perforce is hard to beat. I won't go into detail about it, but suffice to say it scales very, very large, with little to no performance degradation. The current server I maintain (several of them replicated, actually) has over 2 million changes on it with data that is approaching 14 TB. Perforce chews through that like it's nothing.
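One cheap way to see whether this will bite you is to list the big files in a working copy before they are ever committed. A minimal Python sketch follows, assuming a 5 MB threshold picked arbitrarily for illustration; `large_files` is a made-up helper, not part of git or any library.

```python
import os


def large_files(root, limit_bytes=5 * 1024 * 1024):
    """Yield (path, size) for files under root bigger than limit_bytes."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into the repository's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue             # skip symlinks, which may be dangling
            size = os.path.getsize(path)
            if size > limit_bytes:
                yield path, size


if __name__ == "__main__":
    for path, size in large_files("."):
        print(f"{size / 1e6:8.1f} MB  {path}")
```

Run from the top of a working copy, it prints candidates such as raw measurement dumps or generated reports that might be better kept outside the repository.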
Tying it all together, Perforce has a tool called GitFusion that acts as a layer between Perforce and git clients. This is especially useful when you have a business that stores large binary files but not everyone needs access to them. Your git users can use git for their smaller repos, your documentation folks can use Perforce for their big docs (or images, or iso's or ROMs or whatever), and everyone's happy. And all your assets are backed by Perforce, regardless of whether they choose to use git or Perforce as a client.
Perforce also has a product that's based on GitLab with a Perforce storage engine.
Under certain circumstances (up to 20 users) Perforce is free, so it should at least be on your short list of tools to evaluate.
So in summary, if you're mostly going to be storing code with not a lot of big binaries and their history, go with git. If you think you'll have heavier storage needs, take a look at Perforce+git.
Lastly, if I ever worked somewhere that was adamant about supporting only one platform (even if it was Linux or Solaris), I'd quit immediately. This Windows mandate is ridiculous and points to a pretty amateur IT team. Some of the things you have in your requirements sound fishy to me and I wonder if the organization is forward-thinking enough to keep the place afloat. Linux, Windows, Solaris, BSD, they all have their strengths and saddling someone with a mandate on the back-end platform is, to be blunt, asinine. If I didn't know any better, I'd think your marketing team calls the shots with IT. Ungh.
I use git at work. It does the job. At home I use subversion because I understand it better, and like the tools.
Point is, both allow you to make changes, undo changes, merge with each other without worrying too much about breaking stuff. Pick one and stop worrying about it. The project will not succeed or fail based on the VCS.
Use git or mercurial (seriously though, use git). You can even run gitlab community edition in a linux VM on your windows server. It's easy to set up and manage. And you have code review if you want it, and a wiki and all sorts of really GREAT STUFF THAT YOU SHOULD LEARN.
Is it hard? No, really it isn't. Learn. Train. It takes 2 hours and a good tutorial (and there are many very good interactive tutorials on the web) to be able to use git efficiently. And it is WORTH it. A million times.
Working on a software project with a team of 15 people who know nothing about version control is SUICIDE ! You will be in a world of pain for the rest of the project.
FYI, gitk does almost everything that sourcetree does, and it's part of the official git source tree.
p.s. Caution: There's a serious bug in gitk that makes gitk almost completely useless for any patch containing angle brackets (apparently nobody else that uses gitk has ever viewed patches containing greater-than or less-than, C++, or XML, so I might actually be the only person on earth that actually uses gitk). Anyway, here's the 2 line super hack I've been using to eliminate the bug on my local machine:
That way whatever stupid decision you make won't really matter.
We went with git in our shop of inexperienced VCS users precisely because it was the market leader. People can find a ton of resources on how to use git, and it seems like everyone and their brother offers git integration into their app. We liked the look of Github and Sourcetree but were not willing to pay for it, so instead we went with Gitlab - basically it's an open source Github clone.
https://about.gitlab.com/
Visual SourceSafe.
If you have a Mac, then use Voodoo. :-*
git for everything.
Sign up for a free repo at VisualStudio.com. You can set up your project as GIT or as TFS.
The Microsoft TFS team has been upgrading their GIT repo to have feature parity with native TFS.
I worked at a place with millions of LoC in Perforce. It was a nice system.
I worked in SW development so this is something I actually know about. You probably know CM tools have the same religious attachment that editors and operating systems do. The level of people's advocacy highly amplifies the merits of any particular tool. First get the team together and canvass for opinions. Out of a dozen or so staff it is possible there is somebody who wants some CM duties and can do it. If the team is stuck with a choice they think is poor, the life of the project will be spent complaining about this choice. If anybody rants and raves and can give no better reason why they don't want a specific tool than "it sucks" they are a candidate for career change. It's not a group decision but the opinion of the group must be considered. Lastly whatever you choose (subversion is great) it is more important to use the tool well. Find someone on the team who wants to be an expert on the tool and has the skills to do so. Yes, there are people who want to do this. Given your team size, it is about a half-time job or a little less. Develop a plan for using the tools that fits your practice and don't count on vendor support for anything.
Here is a helpful, free book that describes git: https://git-scm.com/book/en/v2.
In the recent past I have used SVN, mercurial, perforce and git. All of them will work for a project like yours but I prefer git for the following reasons:
1. It is a distributed VCS so developers are not tied to a central repository.
2. Branches are really easy which makes non-linear development a breeze. In this context the term "non-linear" means fixing bugs or adding features in parallel.
3. It is widely used so there is lots of help available.
4. It is fast.
5. I find it easy to understand and use. I have heard many folks complain about how arcane git is. If you read the book and still find it arcane, then you should consider using something else.
6. It is really easy to incorporate into your software. If you need to offer a version control service in your tools, git is a great way to do it. It integrates well with languages like python, java and C.
In addition to using git as your VCS, I would recommend using the github development model so that your developers become familiar with the open-source development model. It is well suited to small projects: http://readwrite.com/2013/09/30/understanding-github-a-journey-for-beginners-part-1.
If you do choose git, please note that you never need to use the main branch for development. Learn to use branches for features and issues. Also make use of tags. It will make your life much easier in the long run.
One side note, I have also used git as a perforce client which worked pretty well. I understand that it can also be used as a SVN and a mercurial client but I have never done that. My guess is that it would work very well with mercurial because the branching models are somewhat similar but that it would be difficult to handle client side branches in SVN because SVN uses a linear development paradigm.
PlasticSCM (https://www.plasticscm.com/) is the best version control system I have ever seen or used bar none. However it is a commercial product and is not open source. It does have one of the lowest costs when compared to other commercial VCS systems ($9.95/user/month). I would *strongly* recommend staying as far away from Subversion as possible. This broken VCS is extremely likely to screw up your code if you ever try to have more than 1 version at a time (attempt to branch the code). Although, if you only ever have 1 main branch it will only screw up your code and conflict with itself some of the time. I have not used git myself, but if you are looking for an open source solution, I would recommend git over subversion any day. Where I work, we use Clearcase (which is *really* expensive and hard to maintain) and some teams have started to migrate to subversion. Some of the subversion projects are entering maintenance and I dread when someone attempts to merge code :(
I also use PlasticSCM and Subversion on other projects outside of my primary job. When using subversion, it often fails to check in files due to conflicts, and this is 1 person working in trunk. Oftentimes the only way to get subversion to work is to delete the working copy and start over. This has never happened since we switched to PlasticSCM.
GitHub or gitlab, making sure to prevent force pushes and use an alt branch to prevent idiocy
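GitHub and GitLab expose this as a protected-branch setting. On a plain git server, roughly the same rule can be approximated with a `pre-receive` hook, which gets one "old new ref" line per updated ref on stdin and rejects the whole push by exiting non-zero. The Python sketch below is an illustration under assumptions: the protected ref name is a guess at your layout, and `rejected_refs` and `main` are made-up names. The only git command it calls is `git merge-base --is-ancestor`.

```python
import subprocess
import sys


def is_zero(sha):
    """git sends an all-zero id when a ref is being created or deleted."""
    return set(sha) == {"0"}


def is_ancestor(old, new):
    """True if old is reachable from new, i.e. the update is a fast-forward."""
    result = subprocess.run(["git", "merge-base", "--is-ancestor", old, new])
    return result.returncode == 0


def rejected_refs(lines, protected=("refs/heads/master",), check=is_ancestor):
    """Return protected refs that the push would delete or rewrite."""
    bad = []
    for line in lines:
        old, new, ref = line.split()
        if ref not in protected or is_zero(old):
            continue                 # unprotected ref, or a brand-new branch
        if is_zero(new) or not check(old, new):
            bad.append(ref)          # deletion or non-fast-forward update
    return bad


def main(stdin=sys.stdin):
    """Return the hook's exit status: 0 accepts the push, 1 rejects it."""
    bad = rejected_refs(stdin)
    for ref in bad:
        print(f"rejected: history rewrite on {ref}", file=sys.stderr)
    return 1 if bad else 0

# Installed as hooks/pre-receive, the script would end with:
#     sys.exit(main())
```

The ancestry check is passed in as a parameter so the parsing logic can be tried out without a repository at hand.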
I still use RCS because it has the in-line markup to keep track of the revision you have. And it is so simple to set up and use that a 1 page cheat-sheet is usually enough for most people that can type without looking at their fingers. Put it on a ZFS filesystem and take hourly snapshots. Don't worry about network access, since that is how you are going to lose your repo. People can login to a server to edit and rsync to make remote copies. Easy and safe (using ssh for example). I always display the $Id$ string in the version output for each module under -V or --version: that means you can know for sure that you have the latest version before you test/release.
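Since the project in question is Python, here is roughly what the -V / --version trick could look like there. This is a sketch, not the poster's code: the expanded `$Id$` value, the module description and the helper names are made up, and it assumes the file is checked out with RCS keyword expansion on so the string gets rewritten on checkout.

```python
import argparse
import re

# RCS rewrites this string on checkout; the value shown is a made-up example.
RCS_ID = "$Id: acquire.py,v 1.7 2015/09/12 10:31:02 alice Exp $"


def revision_from_id(rcs_id):
    """Return the revision number from an expanded $Id$ string, else None."""
    match = re.search(r",v\s+(\d+(?:\.\d+)+)\s", rcs_id)
    return match.group(1) if match else None


def build_parser():
    """Command-line parser that prints the $Id$ string for -V / --version."""
    parser = argparse.ArgumentParser(description="Hypothetical acquisition module")
    parser.add_argument("-V", "--version", action="version", version=RCS_ID)
    return parser


print(revision_from_id(RCS_ID))      # 1.7
```

Calling `build_parser().parse_args()` at the start of a script then makes `-V` print the full `$Id$` line and exit, so whoever runs the module can see which revision they actually have.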
Interesting. I've never even _heard_ of gitk before. That bug sounds rather scary, though.
We're using SourceTree where I work as it's an Atlassian product that integrates well with the rest of the Atlassian suite, which we're also using.
Perhaps the best choice is Mercurial running on a $500 server in house.
Backed up with another $500 server perhaps in another building managed
by a second individual such that there are two sets of master passwords to
two servers.
Lots can be done with good desk top boxes today...
My preference is to run Ubuntu, Centos or Fedora on the inexpensive hardware.
Clients work on WindowZ...
For any system to work a policy is needed. Check text in each day
perhaps on a "work-in-progress" (WIP) branch.
Managers need to know how to use and monitor it.
Also find a bug system to track progress, features, bugs.
Some checked in changes will be tagged to a feature request
or a bug.
I am a fan of RCS for learning how revision control works.
With NFS and some wrapper scripts it can be scaled to hundreds of engineers
as long as they stick to their own projects.
You also need a documentation system and plan.
Any system can be modeled with colored 4x6" cards over a conference table.
Pass cards.. into and out of piles to and from people.
If you cannot model your system with colored cards across a table it is not understood
or just too complex.
Lots of folk begin to understand revision systems and live locks by sharing a check book.
Truth is stranger than fiction, but it is because Fiction is obliged to stick to possibilities; Truth isn't. Mark Twain.
If you are developing in Eclipse (PyDev), use Perforce. It's perfect for small teams where you always have a central repository.
Text is searchable, skimmable, and can be read in far less time than it takes to say it. Which is one reason that there's always a dozen Slashdotters screaming for the editor's head on a plate if a video is posted without a transcript. It's also more accessible to people who may not have high speed Internet access everywhere they go.
Programming is a text-based job; video offers no advantages. It's not like code examples can be conveyed better with sight gags or interpretive dance. If you think video talks are superior to written documentation you are simply wrong.
Those who advocate genocide deserve every protection afforded by law, and none afforded by common human decency.