Slashdot Mirror


Ease Into Subversion From CVS

comforteagle writes "While you have a nice leisurely Sunday afternoon/evening you might want to read this fine article on easing into Subversion from CVS. Written by versioning admin Mike Mason, it talks about the philosophy and design behind Subversion (now 1.0), how it improves upon CVS, and how to get started using it."

17 of 130 comments

  1. Is there demand? by cookiepus · · Score: 4, Interesting

    I've read the linked article (really!) and I think Subversion sounds like a good idea. Primarily, I like the fact that everything you can do with CVS you can do with Subversion in the same way as with CVS.

    I am really curious how much demand there is for Subversion's new features, however.

    Do developers out there voice the need to store binaries? I can imagine this being needed for web developers and such, but I think programmers can just build their binaries from CVS.

    Also, have there been many problems that required atomic commits? Can someone explain why this is important? I mean, the idea is you'll need to merge one way or another. I can see the point being in that what you commit at any given time will compile (presuming you're committing completed code) but realistically, does anyone not fix their up-to-date checks as soon as they happen?

    Also, Subversion says that it is much faster at things like tagging, but tagging is not a very frequent operation...

    To me it sounds like a great product but I am not able to see a compelling reason why most development shops out there who are currently in CVS would rush to switch.

    Not a flame btw, just an opinion.

    1. Re:Is there demand? by dietz · · Score: 5, Interesting

      Before reading this, let the record show that I am a subversion fanboy. But I am only a Subversion fanboy because it solved almost all of my complaints about CVS. I am not involved with the project at all.

      Do developers out there voice the need to store binaries?

      Uh, most projects of any size will have at least a few binary files in their repository... icons, etc. But you could store those in CVS without too many problems.

      Also, have there been many problems that required atomic commits? Can someone explain why this is important?

      Rolling back changes without atomic commits is a pain in the fucking ass. Have you ever had to do it? You have to track down every file that you changed (somehow... hopefully you can remember), check which version was the version prior to your commit, and get all those versions of files. For example "Okay, I need version 1.7 of foo.c and version 1.8 of barf.c and version 1.13 of foo.h." It's totally annoying.

      Plus atomic commits just make it much, much easier to keep track of what changes have gone in. This is my biggest, biggest complaint about CVS. File-level commits just make no sense. There is no time, ever, that I can think of when the ability to commit an entire changeset at once isn't better than committing a single file at a time.
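
      To make the rollback difference concrete, here is a rough toy model in Python. It is only a sketch: the class names `PerFileRepo` and `ChangesetRepo` are made up for illustration, and neither reflects how CVS or Subversion actually store data. It just contrasts "every file has its own history" with "every commit gets one repository-wide revision number".

```python
# Toy model only: neither class mirrors how CVS or Subversion store data.

class PerFileRepo:
    """Each file has its own independent revision history (CVS-like)."""

    def __init__(self):
        self.history = {}  # path -> list of contents, one entry per file revision

    def commit(self, changes):
        # Files are recorded one by one; nothing notes that they belong together.
        for path, content in changes.items():
            self.history.setdefault(path, []).append(content)

    def revert_file(self, path):
        # Undo the latest revision of a single file.
        self.history[path].pop()


class ChangesetRepo:
    """One repository-wide revision number per commit (Subversion-like)."""

    def __init__(self):
        self.revisions = [{}]  # revisions[n] is the full tree at revision n

    def commit(self, changes):
        tree = dict(self.revisions[-1])
        tree.update(changes)
        self.revisions.append(tree)
        return len(self.revisions) - 1  # the new revision number

    def revert(self, rev):
        # Undo everything that commit `rev` changed, as a new revision.
        before, after = self.revisions[rev - 1], self.revisions[rev]
        changed = [p for p in after if before.get(p) != after[p]]
        tree = dict(self.revisions[-1])
        for path in changed:
            if path in before:
                tree[path] = before[path]
            else:
                del tree[path]
        self.revisions.append(tree)
        return changed


files = {"foo.c": "v1", "barf.c": "v1", "foo.h": "v1"}
bad = {"foo.c": "broken", "foo.h": "broken"}

per_file = PerFileRepo()
per_file.commit(files)
per_file.commit(bad)
# Per-file rollback: you have to remember which files the bad commit touched.
for path in ("foo.c", "foo.h"):
    per_file.revert_file(path)

changeset = ChangesetRepo()
changeset.commit(files)
bad_rev = changeset.commit(bad)
# Changeset rollback: the revision number already identifies the files.
reverted = changeset.revert(bad_rev)
```

      In the per-file model the caller supplies the list of files from memory; in the changeset model the list falls out of the revision number.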

      Also, Subversion says that it is much faster at things like tagging, but tagging is not a very frequent operation...

      Depends on your development process. During beta periods, it's common to make a tag or two per day, and if each tag takes ten minutes, well... it's not a big thing, but it's certainly annoying.

      To me it sounds like a great product but I am not able to see a compelling reason why most development shops out there who are currently in CVS would rush to switch.

      Certainly not every shop is going to "rush to switch". But, regardless, I imagine that every shop will switch eventually. It may take years, but subversion's advantages are significant enough that in my opinion it will become the new version control standard.

      Also note that CVS was crufty and adding new features was almost impossible. Subversion targeted CVS features as their 1.0 milestone. But more importantly, the Subversion code base is a much better baseline to work from when adding new features. So you can expect that it will only get better in the future.

    2. Re:Is there demand? by Ninja+Programmer · · Score: 3, Interesting
      Do developers out there voice the need to store binaries? I can imagine this being needed for web developers and such, but I think programmers can just build their binaries from CVS.
      CVS lets you check in binaries. But it doesn't use any diff algorithm -- it just stores each instance. So it's just inefficient. Any application that uses media will commonly have binary data.

      The other thing is that Unicode source data is typically not stored in a purely ASCII compatible form. Moving forward, people are going to be using Unicode source data which at a low level can be considered essentially binary.

      Also, have there been many problems that required atomic commits? Can someone explain why this is important?
      Once you get to above about two dozen developers working on the same code base, you will end up with erroneous check-in collisions. Detecting and reversing out of these is a lot of fun.

      I mean, the idea is you'll need to merge one way or another.
      If you check in multiple files, then everything will be checked in except where there are conflicts. When you fix the "conflicts" you end up with an image that nobody actually tested. If you test it before checking in the fixes for the conflicts, then you leave the source tree exposed in a state where only part of your check-in is there (and with enough developers there is an arbitrary number of partial check-ins that the tree might be containing at any one time.)

      These are all standard "race condition" problems. Commits have to be atomic for the same reason that transactions are atomic in databases, and mutexes/semaphores exist in operating systems.
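
      The partial check-in scenario can be sketched in a few lines of Python. This is a toy model of the situation described above, not of the real behaviour of CVS or Subversion; the function names, the `Conflict` exception, and the file names are all invented for the example. `head` maps each path to a (version, content) pair, and `base` records the version the developer originally checked out.

```python
class Conflict(Exception):
    """Raised when an atomic commit is refused as a whole."""


def commit_per_file(head, base, changes):
    """Non-atomic: each file is checked and written on its own."""
    conflicts = []
    for path, content in changes.items():
        if head[path][0] != base[path]:
            conflicts.append(path)  # someone else got there first; skip it
        else:
            head[path] = (head[path][0] + 1, content)
    return conflicts


def commit_atomic(head, base, changes):
    """Atomic: verify every file first, then write all of them or none."""
    stale = [p for p in changes if head[p][0] != base[p]]
    if stale:
        raise Conflict(stale)
    for path, content in changes.items():
        head[path] = (head[path][0] + 1, content)


def fresh_head():
    # api.h was already changed by another developer (it is at version 2).
    return {"api.h": (2, "int f(int);"), "api.c": (1, "int f(int x) {...}")}


base = {"api.h": 1, "api.c": 1}  # what our developer checked out
changes = {"api.h": "int f(long);", "api.c": "int f(long x) {...}"}

head_a = fresh_head()
left_out = commit_per_file(head_a, base, changes)
# api.c went in, api.h did not: the tree now holds half of one change.

head_b = fresh_head()
try:
    commit_atomic(head_b, base, changes)
    rejected = False
except Conflict:
    rejected = True
# The atomic commit was refused, and the tree is exactly as it was.
```

      After the per-file commit, the tree pairs the new `api.c` with the other developer's `api.h`, a combination nobody built or tested. The atomic version leaves the tree untouched until the whole change can go in.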

      IMHO, this issue alone is more important than all the others combined.

      Also, Subversion says that it is much faster at things like tagging, but tagging is not a very frequent operation...
      Chicken and egg? If tagging were fast, wouldn't people be more likely to use it? Tagging is a way test people, release people, and even marketing people interact with the development results in a way that makes sense to them. Tagging is a very useful thing. Having numbered check-ins like Perforce makes this slightly less important, but why map your milestone ordinals to some homebrew scheme, when your source control can do it for you?
    3. Re:Is there demand? by Jacek+Poplawski · · Score: 2, Interesting

      CVS lets you check in binaries. But it doesn't use any diff algorithm -- it just stores each instance. So it's just inefficient. Any application that uses media will commonly have binary data.

      CVS stores binaries but it is not so trivial. When we put some binary data into our CVS tree we realized Windows users can't access it (need some setting in repository). CVS behaves differently in Linux and in Windows in this case.

    4. Re:Is there demand? by spongman · · Score: 2, Interesting
      Yeah, I love the fact that there's a revision number that's global to the whole repository.

      We embed that number into each build of our product and our testers file bugs against a particular revision. If I can't repro a bug against my current code, I can just create a new branch at the given revision, compile, and I know I'm using exactly the same code that the tester was running.
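
      As a small illustration of that workflow, here is one way the build label could be put together and taken apart again. The `+r<number>` label format and both function names are assumptions made up for this sketch; the comment above does not say how the number is actually embedded.

```python
import re


def build_label(version, revision):
    """Combine a product version with a repository-wide revision number."""
    return f"{version}+r{revision}"


def revision_from_label(label):
    """Recover the revision number from a reported label, or None if absent."""
    match = re.search(r"\+r(\d+)$", label)
    return int(match.group(1)) if match else None


label = build_label("2.1.2", 4168)            # what the tester sees in the build
reported = revision_from_label(label)          # what the developer branches from
unlabelled = revision_from_label("2.1.2")      # a build without a revision suffix
```

      The recovered number is then the single value needed to check out or branch the exact tree the tester was running.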

    5. Re:Is there demand? by 0x0d0a · · Score: 2, Interesting

      That's because you checked in the binary in text format instead of binary, and the linefeed translation chewed up your binaries when switching between platforms.

      This is particularly annoying with text-like formats, like Visual Studio 6's .dsw files -- they look like text files, they smell like text files, and CVS autodetects them as text files, but Visual Studio 6 throws a tantrum if you try to hand it a .dsw file with LF line endings.
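
      A short Python sketch shows why line-ending translation is harmless for text and destructive for binaries. The two helper functions are deliberately naive stand-ins for "treat this file as text" and are not taken from CVS; the sample bytes are made up.

```python
def to_crlf(data: bytes) -> bytes:
    """Naive Unix-to-Windows text translation: every LF becomes CRLF."""
    return data.replace(b"\n", b"\r\n")


def to_lf(data: bytes) -> bytes:
    """Naive Windows-to-Unix text translation: every CRLF becomes LF."""
    return data.replace(b"\r\n", b"\n")


text = b"line one\nline two\n"
# Made-up binary payload that happens to contain 0x0D and 0x0A bytes.
binary = bytes([0x89, 0x50, 0x0D, 0x0A, 0x1A, 0x0A, 0x00, 0xFF])

# Text survives a trip to Windows and back.
text_round_trip = to_lf(to_crlf(text))

# A binary stored from Unix and checked out on Windows grows extra 0x0D bytes.
checked_out_on_windows = to_crlf(binary)

# A binary checked in from Windows loses the 0x0D in front of each 0x0A,
# and translating back on checkout does not restore the original bytes.
stored_from_windows = to_lf(binary)
back_on_windows = to_crlf(stored_from_windows)
```

      Here the 8-byte payload becomes 10 bytes in one direction and 7 in the other, and `back_on_windows` differs from `binary`, which is enough to break most binary formats.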

    6. Re:Is there demand? by Textbook+Error · · Score: 3, Interesting

      if some customer has a bug with version 2.1.2.4 of Foofware, the company can just check that out, instead of figuring out (and hoping to get it right) how to build it

      Your build system is seriously broken if this is the case. The whole point of revision control is that you can get back to a previous build just by fetching a specific tag or branch. If that means that you need to keep your entire dev environment (IDE+tools straight off the CD, headers, runtime libraries, etc) under revision control then that's what you should do.

      Builds have to be deterministic if you want to have reliable QA, and making the build process reproducible is at least as important as using source control. The alternative is you end up checking out a build from 6 months ago that crashes, yet when you try and build the equivalent source the crash goes away. Having to say "um, this should be the same build but this one works and that one doesn't and I can't tell you why" is a sign that something pretty serious has gone wrong in your process.

      There are plenty of other good reasons to keep binary data in a revision control system (images, sound, models, data for regression tests, materials for installers, etc) but trying to avoid having to have a deterministic build process shouldn't be one of them.

      Third party libraries that you never build yourself can obviously be checked in as-is, but anything that you build from source should always be buildable from source on a brand new workspace. No ifs, no buts - if you can't produce a reliable build on demand, how do you know what's going into any of your builds?

      --

      Nae bother
  2. Windows server? by Xtifr · · Score: 1, Interesting

    It's nice that you can run a subversion server on MSWin server systems, I suppose, if that's the sort of thing that floats your boat. But how on earth is the option to spend hundreds of extra dollars on proprietary operating system software and the more-expensive hardware it requires "significantly lower[ing] the barrier to entry?"

    There may be a minor barrier, in Win-only shops (although I would say that it's the Win-only policy that is the barrier, not the other way around). Like I say, Win support is a perfectly nice thing. But "significant"?

  3. All your files are belong to us by wayne606 · · Score: 4, Interesting

    It bothers me a bit that all the files are now in a big database. A good thing about CVS is that you can see what files and modules are available using regular unix tools, and if things get messed up in some way you can always fall back to the rcs commands or in the worst case edit the ,v file by hand and extract the latest version. With a database, if things were to get corrupted enough (I have no evidence that this happens often, but still...) you are stuck. Just like with the windows registry, where if it gets messed up you lose big.

    Any opinions on this?

    1. Re:All your files are belong to us by ray-auch · · Score: 2, Interesting

      Personally, all the data in Oracle, (SQL Server even) or PostgreSQL wouldn't bother me, MySQL might worry me a little, MS Jet / Access worries me a lot. Berkeley DB I'm not sure about, I know a little of its heritage on unix but would be a lot less sure on other platforms.

      A lot of people's experience with source control and DBs will be coloured by Visual Source Safe and Jet (which it uses). It is ok until it gets corrupted, and then you are hosed. Keeping everything in readable files CVS-style is a BIG plus point once you've been in that situation.

      I'm confused on your corruption statement - you seem to say both that it never happens, and that subversion never does it but other things ("Usually it was the person using multiple servers...") do. Which is it ? And if the latter, what recovery options are there ?

      I am also wary of database-based products which are tied to one particular database - makes me worried there are low level hacks being relied on. I think a lot of people (well, me for one) would like to run _one_ rdbms on _one_ db-optimised server managed by _one_ dba - not a dozen different ones all over the place which all have to be managed differently (backup Oracle here, backup Exchange (yuk) here, backup MySQL here for appY, backup SQL here for AppX and now add another special here for source control...).

      With stuff in one rdbms it is also easy to relate stuff together in queries (query source control operations related to versions in a trouble tickets app, for example).

      If it supported multiple (at least two) rdbms from the outset configurable via odbc/jdbc/etc., preferably also with an open schema and "just use sql like this to get file x version y from project z" - then it would give me (for one) far more confidence that it was worth looking into further.

      PS. I haven't had sourcesafe (still have to use it for some stuff) corrupt a db in the past two years either - the horror of seeing >5yrs of the whole team's code history suddenly inaccessible (shortly after tape drive problems...) stays fresh in your mind a lot longer.

  4. database is a dependency by ajagci · · Score: 1, Interesting

    The problem is that putting the stuff into a database creates another dependency on a non-trivial piece of software. That creates all sorts of risks.

    On UNIX/Linux systems, the file system is more than sufficient for handling this kind of storage and transactioning, so this dependency and risk is unnecessary.

    I suspect Subversion uses a database because it may be intended to run on operating systems with less powerful file systems.

  5. how do you migrate? by DeadSea · · Score: 2, Interesting

    I can't switch unless we can convert our repository from cvs. Are there tools for doing this?

  6. It helps just a little by r6144 · · Score: 4, Interesting
    I have used Subversion in quite a few (small, mostly one-man) research projects during the last six months. Before then I used RCS/CVS. Subversion does make me somewhat more comfortable, and I have little to complain about, which means I probably won't ever look back.

    However, if there were no free software like Subversion, I'd rather make do with CVS than use non-free stuff, even if someone else paid for it. For example, CVS does not have atomic commits, so I use tags instead (ironic since CVS does tagging quite slowly, but still acceptable for one-man projects). Other weak points of CVS can also be worked around. It isn't pretty, but not THAT painful either. Actually, before I discovered RCS, I just did version control manually by saving a tarball after each day's work, which is tedious but still bearable.
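
    For what it's worth, the tarball-per-day approach is easy to sketch with Python's standard `tarfile` module. The `snapshot` function, the directory names, and the date below are hypothetical, chosen only to make the example self-contained.

```python
import tarfile
import tempfile
from datetime import date
from pathlib import Path


def snapshot(project_dir, archive_dir, day=None):
    """Save the whole project tree as <name>-YYYY-MM-DD.tar.gz; return its path."""
    project_dir = Path(project_dir)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    day = day or date.today()
    target = archive_dir / f"{project_dir.name}-{day.isoformat()}.tar.gz"
    with tarfile.open(str(target), "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the project name.
        tar.add(str(project_dir), arcname=project_dir.name)
    return target


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "thesis"
    project.mkdir()
    (project / "main.c").write_text("int main(void) { return 0; }\n")

    archive = snapshot(project, Path(tmp) / "snapshots", day=date(2004, 3, 7))
    archive_name = archive.name
    with tarfile.open(str(archive)) as tar:
        members = sorted(tar.getnames())
```

    Each snapshot is a complete copy, so there are no diffs, logs, or merges; that is the tedium a real version control tool removes.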

    Of course, for large projects, version control is much more important.

  7. Graph? by aled · · Score: 2, Interesting

    Is there any client front end for subversion that makes a graphical tree of versions, like wincvs or cervisia? It's a very useful feature and I would like to have something equivalent for subversion.

    --

    "I think this line is mostly filler"
  8. Any GUI Clients? by tjmsquared · · Score: 2, Interesting

    Are there any GUI clients like wincvs for subversion yet? It looks like a much better tool, but I don't see my group switching unless there is a client that is at least as good as wincvs.

  9. Re:Consider GCC by devphil · · Score: 2, Interesting


    The person who tried it reported it wasn't working for certain branches off the main trunk. *shrug* Haven't tried it personally since the 1.0 release.

    --
    You cannot apply a technological solution to a sociological problem. (Edwards' Law)
  10. Meta data and Moves by irontiki · · Score: 2, Interesting

    I've been using CVS in professional development environments for about 5 years at several different employers. I love CVS but have been watching Subversion closely and with some anticipation.

    The atomic commits will be nice but honestly the lack of them has never been a huge problem for my teams (it is probably less of a problem with 6-8 people). The things that do bug me about CVS that Subversion is supposed to address:

    1. the ability to move or rename a file w/o losing the history

    2. the ability to set file permissions

    3. ability to remove unused directories

    I know that these things can be achieved by tweaking CVS's files manually but that's a long way from elegant. It's been a stumbling block when I'm trying to introduce a new team to CVS.