The Future of Subversion
sciurus0 writes "As the open source version control system Subversion nears its 1.5 release, one of its developers asks, what is the project's future? On the one hand, the number of public Subversion DAV servers is still growing quadratically. On the other hand, open source developers are increasingly switching to distributed version control systems like Git and Mercurial. Is there still a need for centralized version control in some environments, or is Linus Torvalds right that all who use it are 'ugly and stupid'?" The comments on the blog post have high S/N.
I run the IT systems for my small software company and frankly Subversion is a great tool for the job. I don't *want* a distributed VC system because I don't want the hassle of trying to ensure that everyone's modifications to the code tree are backed up correctly and stored safely somewhere. I want it in a central spot I can back up and manage without my employees having to worry about it.
:-)
Basically Subversion is not suited for development with a diverse population of loosely connected individuals, each with their own private branches. Frankly, for corporate work, I don't understand why you would want the backup and integrity hassles of a distributed version control system. But maybe that's because I'm ugly and stupid
--- Nick, hard at work
Git out of here.
-Matthew Riley "TofuMatt" MacPherson
I have a website
Give developers a personal directory on a shared network drive. Developer repositories/branches stay there(*). Backup shared network drive. Done.
(*) Don't know about any other systems than Bazaar, but I expect they have similar functionality: Bazaar has "checkouts" which enable you to have a local branch (i.e. on C:/Projects/Whatever) while also auto-committing to the "source" (i.e. the shared directory/drive) when you do a local commit.
HAND.
While Git obviously has its strong sides, often quite overwhelming, there are cases when it sucks compared to SVN:
1. timestamps. Subversion doesn't do that by default, but it has good enough metadata support than timestamps can be hacked in easily. For Git, metastore is nearly worthless. If you store a program source, you risk just skewed builds -- for other types of data, lack of timestamps is often a deal breaker.
2. move tracking: trying to move a directory branch from one dir to another means you lose history. Rename tracking works only sometimes, often it will take a random file from somewhere else, etc.
3. large files. Take a 100MB binary file into SVN, change one byte, commit. Change one byte again. And again. Git will waste the freaking 100MB for every single commit.
4. partial checkouts. If there's a 5GB repository, you'll often want to check out just a single dir. With Git, there is no such option.
5. ease of use. I see that ordinary users, web monkeys and so on can learn SVN easily; with Git, even I had some troubles initially.
On the other hand, SVN used to have no merge tracking (I wonder what that "limited merge tracking" in 1.5 means...) which for distributed development of program sources is worse than points 1..5 together.
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
If you're in a highly-distributed development environment like Linux, where the developers are spread across multiple continents and have very little shared infrastructure and a high need to work independently of each other (either because of preference or because they don't want their work stalled by another undersea cable cut half a world away), then yes using a centralized VCS like Subversion is stupid. But if you're a developer on a project where all the developers are in a common location sharing common infrastructure, often literally within speaking distance of each other, then a decentralized VCS like Git is stupid. It's harder to maintain and, in that situation, yields none of the offsetting benefits.
Analogy: a fleet of Chevy vans vs. a freight train. The vans are far more flexible, they can travel any route needed whereas the freight train's limited to fixed tracks, and their smaller size and lower cost each let you buy a lot of them and dedicate each one to just a few deliveries in a particular area without a lot of overhead. You can fan the vans out all over the city, sending just what you need where it's needed and rerouting each one to adapt to changes without upsetting the others. But if your only delivery each day is 1000 tons of a single product from one warehouse to another 600 miles away, you're better off with that one big freight train.
... and is primarily focussed on kernel development. Some would even say it is the only thing he knows how to do. That is fine, but it does not make him an authority on version control systems for other types of projects. Kernel development has very specific needs, not mirrored by other projects. Personally I find SVN perfectly adequate for small teams, and not only for program source code, but also for texts.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
You do realize that a distributed VCS can perfectly be used like a centralized VCS, don't you ? Declare any repository as the "central" one and decide that everybody should push/pull to/from it. That's their power: discributed VCS don't force you into a specific workflow, you choose how you want to use them.
I run the IT systems for my small software company and frankly, we still use CVS.
People still seem to miss the point of something like git or BK in the corporate world. The main thing it does is to let developers organize their work before it is moved into the main central repository. Don't kid yourself, most serious devs working with something like SVN in any sort of environment with pre-checkin process will have a large bulk of work that is sitting around invisible to the SCM. Stuff that is waiting to clear the pre-checkin hurdles and be ready for commit.
:P
The longer it takes to execute your companies QA process, the more desperately you need a distributed SCM.
git lets the developer organize this stuff very quickly, lets them share it with other people working on related stuff and records important meta-data like 'This is the exact thing I tested'.
If you have ever sat in a meeting where group A was arguing that group B needed to immediately checkin their not quite done thing because their work depends on it while group C was arguing that it wasn't done enough yet you just wasted an hour because you aren't using a distributed SCM..
If you ever had to argue with your CVS-czar that you needed a branch you just wasted an hour because you are not using a distributed SCM.
Thats ignoring the fact that merging doesn't work in SVN
Seems to me that most of the people promoting DVCS have used them and have seen the light. Once you use a DVCS on a project you don't want to go back to the bad old way of doing things.
Most of the people knocking DVCS or saying they can't see the benefits haven't actually used them on any projects. They have built up a framework in their minds of How Things Should Work, but unfortunately that model was defined by the limitations of their tools.
IDE integration:
SVN is currently integrated with our IDEs (all 3), one of the main selling points in choosing a VCS.
Ease of backups:
We archive our repositories every day, IT loves being able to simply tgz the SVN directory and not have to worry about anything else, regardless of the state of any current projects (all groups use SVN).
Simplicity:
SVN/Trac training (client use, management, backend workings) takes less than 10 minutes. In another 15 minutes I can have someone setting up their own SVN repositories+Trac, without needing to pull up a single reference document, primarily because the an SVN setup methodology is trivial to memorize.
Often wrong but never in doubt.
I am Jack9.
Everyone knows me.
- Merges in a typical environment become effectively anonymous. Let's say you have a build manager and individual developers working on different changes in parallel. The build manager can't merge the changes without those changes taking on his identity, that is, all identifying information about the originator of the changes is lost.
- So-called "best practice" for SVN branching means building new branches with every new release. That is, it's not recommended to build one branch and merge changes from the trunk into it as you're incrementally changing things on that branch, noooo. You have to keep polluting the repository with needless hair by making new branches every week, and sometimes, multiple ones per day.
These are just two I'm aware of that bite us in the ass on a regular basis. The first issue is supposed to be fixed in one of the near-term mods to SVN, but the fact that the second even exists tells me that the guys developing SVN don't really work in the same world as a lot of the bigger commercial development environments do.Dog is my co-pilot.
...getting all the other IT people in the office to use it. Even better? Getting them recognize why version control is so useful in the first place. :-D
-Rob
Biblical fiscal responsibility
If you are one of those MSDN shops where you went to subversion because VSS is weak, you can go with Team System. We were using VSS before and Team system is way better for version control plus it has defect tracking and a build agent. Also, they have caching server to aid distributed environments. Considering it's free (sort of) with your MSDN license (I think it's with their "team edition" licenses) it's tough to beat. It's tough to beat that, especially for larger shops. The only things I don't like in some ways is how it represents the branching and the source control permissions and that was comparatively minor problems. I work in Config Management/Build and so I don't know what Joe developer would like/dislike about it.
If you're not an MSDN shop then what to use is more problematic. I really liked subversion but that was years back. From what I recall for what it was it does well. The only product I was ever "wow'ed" with was Accurev. That product deals with branching like a breeze and I think they might have a defect tracking module now.
Oops, how did this get here?
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
This probably should have been in the summary -- merge tracking is being added in 1.5, so bouncing changes from one branch to another is now easy. This is a huge feature, and something as I recall Linus specifically complained about in his talk.
http://blogs.open.collab.net/svn/2007/09/what-subversion.html
BTW, they did a really nice job of mapping out the use cases and whatnot before implementing the feature. I guess source control people are natural planners.
http://subversion.tigris.org/merge-tracking/requirements.html
Anyway, I'm sure the world will continue to have need for both distributed and client/server source control systems, and Subversion is a nice example of the latter.
What I don't see mentioned very often, if at all, is the implicit assumption in distributed systems such as git, that a single person has ultimate integration responsibility and authority in order to form the official/mainline release. That is, given a single tree that is considered the main one from which all others ultimately derive (Linus' tree in the Linux case), there is absolutely no way for tools such as git to allow collaborative maintenance of that tree. In the end, the owner of that tree must perform all checkins to the tree, and must resolve all merge conflicts themself. This is a dual problem in that it wastes the time of a potentially talented developer (e.g. Linus) doing the mundane work of merging and integration, and the additional problem that if this mainline tree owner is not an expert in some particular area of the code, they are likely to make mistakes when resolving conflicts or performing other integration tasks.
Contrast this with a centralized source model where all developers have the ability to check in to the tree, optionally coupled with a peer review process, enforced either through convention or through mechanisms in the tools. Under this model each developer is responsible for their own integration and merging efforts, not wasting the time of a centralized authority. Not only is the central authority freed from routine tree maintenance work, but each developer can make the best and wisest decisions regarding the particular area of the codebase in which they are an expert, and not have to become involved in areas they have little experience with. Granted, for larger projects there is still a need for some management of checkin authorization, particularly to avoid conflicts during large tree merge operations and the like, but it's more of a coordination role than an authorization role.
This second model is what my employer uses, and our homegrown source control system is well-tailored to it (it actually has capabilities for more centralized control, but they are by and large unused). Perhaps this is unusual, as my experience with other employers is minimal, and mostly took the form of "copy your code into this directory once in a while" (i.e. "Source control? Why would we need that?"). However, given adequately diligent and intelligent developers, I have to say it works marvelously.
Cyrano de Maniac
... management system right now. One problem is, Subversion won't work because I need to have totally clean checkout trees. Subversion inserts tracking files in the checkout trees. So I guess I have to look for something else.
now we need to go OSS in diesel cars
There is nothing in the distributed systems that precludes one from using them in a centralized fashion if so desired. The term decentralized, as used in this context, simply means there is no inherent central point. It does not mean one cannot be established.
FUD. I think you may have had a bad experience with Git, which is known to be somewhat difficult to learn, but in general the easy things you do with a CVCS can just as easily be done with a DVCS (commits, diffs, tags...). And the things that are complex/impossible to do with a CVCS become easy/possible with a DVCS (low-cost branching, working out of the office with no network/VPN, central repo not a single point of failure, etc). You can benefit from these advantages even in small dev teams. Case in point: I use a DVCS (Mercurial) for some private projects even when the single repo is on my laptop and when I am the only developer mostly because it couldn't be simpler:
I think an incremental move of Subversion to a distributed model would be the best way forward. Svk shows it can be done, Subversion should just "do it right", rather than providing it as an add-on. I think if Subversion did that, it would instantly become the de-facto standard for version control, distributed or otherwise.
If Subversion doesn't move to a distributed model, I'm probably going to switch my projects incrementally over to Mercurial (Git isn't really an option).
Comment removed based on user account deletion
and do handle merging much better than svn (which hardly manages merging at all) however, a lot of companies use perforce, because perforce does handle branching and merging well, whereas it also allows for a centrol source repository.
Mercurial and get are mostly built around the model of sending your patches in via email. They make some noise about how you *could* build a central repository, but it's not like that out of the box, and things like permissions often aren't there yet.
Basically, the development model that the open source guys use to do their work isn't the model that guys at big corporations need to use, and distributed VC tools are so heavily geared towards the open source model (in the case of git, not even supporting windows that well) that they can't really be drop in replacements for tools like perforce.
Comment removed based on user account deletion