Slashdot Mirror


The Future of Subversion

sciurus0 writes "As the open source version control system Subversion nears its 1.5 release, one of its developers asks, what is the project's future? On the one hand, the number of public Subversion DAV servers is still growing quadratically. On the other hand, open source developers are increasingly switching to distributed version control systems like Git and Mercurial. Is there still a need for centralized version control in some environments, or is Linus Torvalds right that all who use it are 'ugly and stupid'?" The comments on the blog post have high S/N.

22 of 173 comments

  1. Well *I'm* ugly and stupid... by Wulfstan · · Score: 5, Insightful

    I run the IT systems for my small software company and frankly Subversion is a great tool for the job. I don't *want* a distributed VC system because I don't want the hassle of trying to ensure that everyone's modifications to the code tree are backed up correctly and stored safely somewhere. I want it in a central spot I can back up and manage without my employees having to worry about it.

    Basically Subversion is not suited for development with a diverse population of loosely connected individuals, each with their own private branches. Frankly, for corporate work, I don't understand why you would want the backup and integrity hassles of a distributed version control system. But maybe that's because I'm ugly and stupid :-)

    --
    --- Nick, hard at work :->
    1. Re:Well *I'm* ugly and stupid... by mweather · · Score: 3, Insightful

      I run the IT systems for my small software company and frankly Subversion is a great tool for the job. I don't *want* a distributed VC system because I don't want the hassle of trying to ensure that everyone's modifications to the code tree are backed up correctly and stored safely somewhere. I want it in a central spot I can back up and manage without my employees having to worry about it.

      You can do that with distributed version control, too, and still have the flexibility for alternative workflows.
    2. Re:Well *I'm* ugly and stupid... by peragrin · · Score: 3, Insightful

      I would say it is variable. I can see the point of both.

      Subversion is good for small projects, or larger projects with a limited number of developers.

      Once you get into the hundreds and thousands of developers working on the same project though you need to think a bit differently in terms of needs of the individual developer, and the group as a whole.

      --
      i thought once I was found, but it was only a dream.
    3. Re:Well *I'm* ugly and stupid... by EricR86 · · Score: 5, Insightful

      Frankly, for corporate work, I don't understand why you would want the backup and integrity hassles of a distributed version control system.

      Correct me if I'm wrong, but isn't this the major selling point of distributed revision control? The idea being that since it is a distributed repository, everyone has a "backup" of someone else's repository (depending where they got their code from). No distributed copy is necessarily considered more important than another. However in a corporate environment I would imagine it works out quite well since there's an inherent hierarchy. Those "higher up" can pull changes from those "below". Those "higher" repositories you could (and probably should) backup.

      As far as integrity goes, I think one of the main goals of both Mercurial and Git was to protect against corruption (using a SHA1 hash). You're much more likely to get corruption through CVS and SVN, which is awful considering it's in a central location.
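
      For illustration, here is a minimal Python sketch of the content-hashing idea, assuming Git's blob format (a "blob <size>" header, a NUL byte, then the file bytes, all hashed with SHA-1). The function names are made up for this example, and Mercurial computes its hashes differently, so this only covers the Git case:

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 object id Git would assign to a blob with these bytes."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def is_intact(data: bytes, expected_id: str) -> bool:
    """Hypothetical helper: True if the content still matches its recorded id."""
    return git_blob_id(data) == expected_id


original = b"int main(void) { return 0; }\n"
recorded_id = git_blob_id(original)

# Simulate corruption by changing a single character of the stored content.
corrupted = original.replace(b"0", b"1")

print(is_intact(original, recorded_id))
print(is_intact(corrupted, recorded_id))
```

      Any change to the bytes changes the id, so silent corruption shows up as a mismatch instead of going unnoticed.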

    4. Re:Well *I'm* ugly and stupid... by Wulfstan · · Score: 5, Insightful

      I'm using the terms backup and integrity in slightly different ways than you are.

      By backup - I mean a tape or location where I know I can look to find the "good" copy that contains the official tree of code that represents what is going into my product. What you are describing is copies of repositories sitting in various locations, which isn't really the same as a backup. It's also a bit upside-down - I don't want to be "pulling" fixes from engineers, I want engineers "pushing" fixes into a known-good integration environment.

      By integrity - I mean ensuring that you have all of the fixes you want to have from everyone who should be making changes on a project. NOT file corruption.

      --
      --- Nick, hard at work :->
    5. Re:Well *I'm* ugly and stupid... by Wulfstan · · Score: 2, Insightful

      Yes, but the point is that it encourages and allows behaviour that is not desirable in a corporate development environment - local checkins. You CAN push your changes to it but equally you CAN just check stuff in locally. In some contexts this is great - but I think in corporate environments it promotes risky behaviour.

      Look - it's a tool - you can use it responsibly or use it irresponsibly - with the right set of rules and processes I'm sure it can be made to work. Local checkins are what really get my goat ;-)

      --
      --- Nick, hard at work :->
    6. Re:Well *I'm* ugly and stupid... by EricR86 · · Score: 5, Insightful

      ...a tape or location where I know I can look to find the "good" copy that contains the official tree of code that represents what is going into my product.

      In a distributed environment usually there's someone's (or a group's) repository that's considered more important than others. In a software setting this could be a Lead Engineer's/QA/Certification's repository. Depending on what your definition of the "good" repository is, you would take the copy from the right place. It opens up flexibility in terms of what code you actually want to work with: the upcoming released version of your software from QA, the next-generation stuff that developers are working on, or maybe a new feature that you hear so-and-so is working on...

      I don't want to be "pulling" fixes from engineers, I want engineers "pushing" fixes into a known-good integration environment.

      But you have someone who needs to approve a change to a central repository that everyone shares. Right? That person would probably want to examine those changes before they're committed. The only difference between distributed and centralized, in this case, is that it's a required step. Everyone is responsible for their own repository.

      By integrity - I mean ensuring that you have all of the fixes you want to have from everyone who should be making changes on a project

      Again, in a centralized system, someone has to have the responsibility that all "fixes" have been made which isn't much different from a distributed model. And technically anyone is free to make changes to a project locally on their own machine. They just have to notify the "higher" person saying "Hey I've got a fix for so-and-so", and in a controlled manner they can decide whether or not to accept the changes into their own repository.

      I'm no expert on distributed revision control, so anyone please feel free to correct me.

    7. Re:Well *I'm* ugly and stupid... by gbjbaanb · · Score: 1, Insightful

      One of the best things is you can checkin changes, roll back to previous versions, branch, merge, etc... all on your local repository while you're on the plane or beach where there is no network access

      no, that's the WORST thing about it in a corporate environment. See, if I've paid you $5000 a month to write software, I don't mind it written on a laptop on the beach as long as you check it into the central repository. I seriously do mind if you write it on your laptop on the beach, check it in to your local repository and then get your laptop stolen (or covered in margaritas). This is such a deal-breaker that I would say 'no beach coding' to all developers and make them sit in cubicles instead. Now if they could check code into the central, secure, backed-up repository then I'm fine with wherever they want to code.

      Now branching... that's another story and is possibly why this article should be talking about the differences between MS Team Foundation System and Clearcase.

  2. Depends on the environment by Todd+Knarr · · Score: 4, Insightful

    If you're in a highly-distributed development environment like Linux, where the developers are spread across multiple continents and have very little shared infrastructure and a high need to work independently of each other (either because of preference or because they don't want their work stalled by another undersea cable cut half a world away), then yes using a centralized VCS like Subversion is stupid. But if you're a developer on a project where all the developers are in a common location sharing common infrastructure, often literally within speaking distance of each other, then a decentralized VCS like Git is stupid. It's harder to maintain and, in that situation, yields none of the offsetting benefits.

    Analogy: a fleet of Chevy vans vs. a freight train. The vans are far more flexible, they can travel any route needed whereas the freight train's limited to fixed tracks, and their smaller size and lower cost each let you buy a lot of them and dedicate each one to just a few deliveries in a particular area without a lot of overhead. You can fan the vans out all over the city, sending just what you need where it's needed and rerouting each one to adapt to changes without upsetting the others. But if your only delivery each day is 1000 tons of a single product from one warehouse to another 600 miles away, you're better off with that one big freight train.

  3. Linus has a big mouth... by gweihir · · Score: 5, Insightful

    ... and is primarily focussed on kernel development. Some would even say it is the only thing he knows how to do. That is fine, but it does not make him an authority on version control systems for other types of projects. Kernel development has very specific needs, not mirrored by other projects. Personally I find SVN perfectly adequate for small teams, and not only for program source code, but also for texts.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    1. Re:Linus has a big mouth... by nuzak · · Score: 3, Insightful

      Linus has a long history of flapping his jaw about kernel development topics he knows nothing about as well. To his credit, they often become topics that he didn't know about at the time that he then becomes well-educated on (SMP and /dev/poll to name a couple) but sometimes it's on things he religiously refuses to learn anything further about (microkernel architectures).

      He's an excellent assembly hacker, a fast learner, and at least a majority of the time a nice guy, so most people overlook it.

      --
      Done with slashdot, done with nerds, getting a life.
  4. Distributed VCS can be used like this by this+great+guy · · Score: 3, Insightful

    You do realize that a distributed VCS can perfectly well be used like a centralized VCS, don't you ? Declare any repository as the "central" one and decide that everybody should push/pull to/from it. That's their power: distributed VCS don't force you into a specific workflow, you choose how you want to use them.
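
    To make that concrete, here is a toy Python model of the workflow, not how any real DVCS is implemented. The Repo class, its methods, and the commit ids are all invented for this sketch; a "commit" is just a string, and push/pull simply copy missing commits between repositories:

```python
class Repo:
    """Toy repository: an ordered list of commit ids (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.commits = []

    def commit(self, commit_id):
        # A local commit: recorded only in this repository.
        self.commits.append(commit_id)

    def push(self, other):
        # Copy every commit the other repository does not have yet.
        for commit_id in self.commits:
            if commit_id not in other.commits:
                other.commits.append(commit_id)

    def pull(self, other):
        # Pulling from `other` is the same transfer in the opposite direction.
        other.push(self)

    def unpushed(self, other):
        # Commits that exist here but not in `other`.
        return [c for c in self.commits if c not in other.commits]


# By convention only, one repository is declared "central".
central = Repo("central")
alice = Repo("alice")
bob = Repo("bob")

alice.commit("a1")
alice.commit("a2")
alice.push(central)   # Alice publishes her work

bob.pull(central)     # Bob picks it up
bob.commit("b1")      # Bob commits locally but has not pushed yet

print(central.commits)
print(bob.unpushed(central))
```

    Nothing in the model makes "central" special except the agreement to push to it and pull from it. The unpushed() check is the kind of thing a team could use to spot work that only lives on one machine.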

    1. Re:Distributed VCS can be used like this by Wulfstan · · Score: 2, Insightful

      What worries me is that it encourages behaviour which leaves valuable changes sitting on a disk which may not be backed up. I see changes being made to a codebase like valuable little bits of gold which need to be kept somewhere nice and safe, which is not on individual machines but on the server (RAID, redundant power, UPS, etc)

      Yes, if you are disciplined about how you use it then I'm sure you can use it like any centralised VC. It is a tool - it is not evil - it just encourages what I see as risky behaviour in my particular environment. But I can fully understand that in other contexts it may be useful.

      --
      --- Nick, hard at work :->
    2. Re:Distributed VCS can be used like this by this+great+guy · · Score: 2, Insightful

      What worries me is that it encourages behaviour which leaves valuable changes sitting on a disk which may not be backed up.

      Huh ? If you don't push to the main repo, nobody sees your commits. Don't you think this is sufficient to remind DVCS users that they need to push regularly ?

    3. Re:Distributed VCS can be used like this by this+great+guy · · Score: 5, Insightful

      How do you force your cvs/svn users to commit ? You can't, you expect them to be responsible and do it. This isn't much different from a DVCS.

      What if a user wants his work to be backed up but doesn't want to commit because his changes are not ready to be published ? A centralized VCS forces them to commit with the side-effect of making their unfinished work immediately visible in the central repo, while a DVCS lets them commit to a private repo that you can back up independently.

      Your backup requirements can be solved 2 different ways:

      1. With any VCS (centralized or distributed), put the users' working directories on private NFS/Samba shares. This way everybody's work, committed or not, is on the file server which can be backed up.
      2. Use a DVCS. The users' private repos and working directories can remain on fast local storage on their workstations. A file server contains the main repo as well as private spaces that can be used by the users to periodically push to private repos, so they can be backed up without interfering with the main repo.

      Besides, in this debate, you are completely ignoring the other major advantages of DVCS over centralized ones: scalability, no single point of failure, possibility to work offline and have full access to all of the features of your VCS, usually faster than centralized VCS, low-cost branching/merging, etc.

  5. Don't knock it till you try it by burris · · Score: 5, Insightful

    Seems to me that most of the people promoting DVCS have used them and have seen the light. Once you use a DVCS on a project you don't want to go back to the bad old way of doing things.

    Most of the people knocking DVCS or saying they can't see the benefits haven't actually used them on any projects. They have built up a framework in their minds of How Things Should Work, but unfortunately that model was defined by the limitations of their tools.

  6. My goal regarding the future of Subversion... by lpangelrob · · Score: 3, Insightful

    ...getting all the other IT people in the office to use it. Even better? Getting them to recognize why version control is so useful in the first place. :-D

  7. Re:Git vs Subversion by KiloByte · · Score: 4, Insightful

    Regarding #4, If you're only checking out a single directory and allowed to make a commit, how did you build/test your 5GB project?

    Note that I specifically mentioned "for things other than program sources". Most other pieces of software do not require builds, nor are they monolithic.
    To commit a change to the Linux kernel, you do need to build the whole thing. That's a monolithic thing.
    To commit a change to a webpage, a graphical project, a set of biochem data, you don't need that. Do you need to check out the countless megs of Wesnoth to update your changes to a campaign? That's a modular thing.

    If that directory was an independent piece, it should be in a separate repository since it's logically independent. If that directory is part of a larger whole, you shouldn't be allowed to work with just that one piece. (IMO)

    If I want to modify a 5GB webpage, why would I want to check out unrelated pieces? And having every subpage in a separate repository would be counterproductive.
    --
    The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
  8. Re:Git vs Subversion by 0xABADC0DA · · Score: 2, Insightful

    As a programmer, what pisses me off most about subversion is... well just check out their codebase and look around a bit. Yeah, it works and it does 90% of what people want it to, but the code is a giant piece of shit. That svn has been developed as a total hack job and they seemingly have spent no effort over time trying to clean it up, as a programmer, offends me. I don't know how anybody can have confidence in svn when they can't even do simple changes to it.

    They've been working for years to do simple things like just updating their folder structure so it doesn't leave ".svn" folders everywhere. Or just providing an option to not store a second copy of your 2 gig repository just so you can do restore to Head (and that's all) without asking the svn server which is probably over in the closet on the gigabit ethernet anyway. They can't do this with their current code... it's so bad that they are trying to scrap the local store code entirely.

    ... and then there are the even simpler things like why tf can't I say "svn mv *.[ch] newfolder/" or any of the other commands that you have to use shell scripting to accomplish? That kind of thing should be simple. There are a lot of these kinds of problems in svn that never get fixed (despite having a guy at google that is apparently paid to hack on svn).
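
    As a rough illustration of the scripting workaround described above, here is a minimal Python sketch that expands the pattern itself and builds one "svn mv SRC DST" command per match. The helper name build_svn_mv_commands is hypothetical, and the sketch only constructs the command lists rather than running them:

```python
import glob
import os


def build_svn_mv_commands(pattern, dest, directory="."):
    """Expand a shell-style pattern and return one `svn mv` command per file.

    Hypothetical helper for illustration: it does not call svn, it only
    returns argument lists of the form ["svn", "mv", SRC, DST].
    """
    matches = sorted(glob.glob(os.path.join(directory, pattern)))
    return [["svn", "mv", path, dest] for path in matches if os.path.isfile(path)]


if __name__ == "__main__":
    for command in build_svn_mv_commands("*.[ch]", "newfolder/"):
        print(" ".join(command))
```

    Each returned list could then be handed to subprocess.run from inside a working copy, assuming the destination folder is already under version control.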

  9. Code integration assumptions by Cyrano+de+Maniac · · Score: 4, Insightful

    What I don't see mentioned very often, if at all, is the implicit assumption in distributed systems such as git, that a single person has ultimate integration responsibility and authority in order to form the official/mainline release. That is, given a single tree that is considered the main one from which all others ultimately derive (Linus' tree in the Linux case), there is absolutely no way for tools such as git to allow collaborative maintenance of that tree. In the end, the owner of that tree must perform all checkins to the tree, and must resolve all merge conflicts themself. This is a dual problem in that it wastes the time of a potentially talented developer (e.g. Linus) doing the mundane work of merging and integration, and the additional problem that if this mainline tree owner is not an expert in some particular area of the code, they are likely to make mistakes when resolving conflicts or performing other integration tasks.

    Contrast this with a centralized source model where all developers have the ability to check in to the tree, optionally coupled with a peer review process, enforced either through convention or through mechanisms in the tools. Under this model each developer is responsible for their own integration and merging efforts, not wasting the time of a centralized authority. Not only is the central authority freed from routine tree maintenance work, but each developer can make the best and wisest decisions regarding the particular area of the codebase in which they are an expert, and not have to become involved in areas they have little experience with. Granted, for larger projects there is still a need for some management of checkin authorization, particularly to avoid conflicts during large tree merge operations and the like, but it's more of a coordination role than an authorization role.

    This second model is what my employer uses, and our homegrown source control system is well-tailored to it (it actually has capabilities for more centralized control, but they are by and large unused). Perhaps this is unusual, as my experience with other employers is minimal, and mostly took the form of "copy your code into this directory once in a while" (i.e. "Source control? Why would we need that?"). However, given adequately diligent and intelligent developers, I have to say it works marvelously.

    --
    Cyrano de Maniac
    1. Re:Code integration assumptions by RedWizzard · · Score: 2, Insightful

      What I don't see mentioned very often, if at all, is the implicit assumption in distributed systems such as git, that a single person has ultimate integration responsibility and authority in order to form the official/mainline release. That is, given a single tree that is considered the main one from which all others ultimately derive (Linus' tree in the Linux case), there is absolutely no way for tools such as git to allow collaborative maintenance of that tree. In the end, the owner of that tree must perform all checkins to the tree, and must resolve all merge conflicts themself. This is a dual problem in that it wastes the time of a potentially talented developer (e.g. Linus) doing the mundane work of merging and integration, and the additional problem that if this mainline tree owner is not an expert in some particular area of the code, they are likely to make mistakes when resolving conflicts or performing other integration tasks.

      The reason you don't see it mentioned very often is because it's not really an issue. You just need that single person to be able to trust at least some of the developers they get changes from. Get the trusted "lieutenants" to do the merging for their particular areas. They can even delegate responsibility further. Since in a centralised system you have to trust all the developers to do merging this is no worse, and potentially better.
  10. Re:we use SVN by Jack9 · · Score: 2, Insightful

    I personally use Tortoise but the IDEs tend to not be change-aware unless I'm using the integrated tool.

    --

    Often wrong but never in doubt.
    I am Jack9.
    Everyone knows me.