Multi-User Subversion
chromatic writes "Rafael Garcia-Suarez has just penned an article about adopting Subversion for multi-user projects. (He also has a previous article on Single-User Subversion). With the recent release of Subversion 0.16 (see the File sharing link), the successor to CVS looks very good."
Actually I would like to know how this "subversion" compares to what Linus is using?
I really hope that building ancillary tools like nice clients (wincvs) and useful add-ons (bonsai) is easy enough to do, because that's really where the critical mass is wrt widespread adoption.
There aint no pancake so thin it doesn't have two sides.
A good discussion of this was in Linux Journal recently.
Briefly, CVS lacks version control across file renames, has some issues with binary files, and the CVS code base has serious design issues.
The fundamental design of CVS is flawed, and this leads to anomolous behavior. Becase the problem with CVS is in its design, not its current implementation, a re-design and corresponding re-write is reqiured.
For example, CVS stores its repository in a series of diffs in a directory structure, and the directory structure parallels the development tree. The difficulty is, directories in the cvs backend are therefore not versioned! Thus, moving files around, and re-working the tree, are not handled correctly in cvs. In subversion, the entire repository (dirs, files, and all) is stored as a single coherent revision; a subversion repository is an array of coherent tree/file groupings. As such, correct handling of directories occurs automatically. Also, atomic commits (something cvs lacks) are handled much more easily in this model.
Let me also say that the design of subversion is absolutely excellent. The design is properly decoupled and properly abstracted. Architecturally, it is greatly superior to cvs, and substantially superior to several commercial alternatives. I would imagine that the low-end source control products (PVCS, SourceSafe) will have significant difficulty staying alive once Subversion is widely deployed and tested.
One thing I like to do is put in the CVS $Revision$ substition variable in each of my text-based project files. I use them to help me know which version I've got on my live site...for if I've made bugfixes to that branch, etc.
If there aren't any changes to the file in my new tagged & branched release, that $Revision$ variable will stay the same between releases. This is irrelvent for all my other files that aren't web-related, but the ones that are can remain cached in a user's browser as long as there haven't been changes to the file (this is especially helpful if I have large javascript library files that would otherwise slow down loadtimes a lot).
However, I don't see how I can do this using Subversion. It looks like the version for all project files is incremented everytime a single file checked-in.
Is there an alternative or better technique than what I'm doing by using Subversion? I like its advantages over CVS and would like to experiment with it more!
Another use of atomic checkins is checking in more than one file at a time, especially when checking in two files whose changes depend on each other. In my experience at several CVS shops, the build sometimes breaks simply because of unfortunate timing: checking out the tree while someone else is in the middle of checking in a series of files. Most shops deal with this manually, by having certain rules, such as no checkins allowed at all during a certain hour of the night.
If there was a way in CVS to begin a transaction, then do multiple checkins, then commit/rollback that transaction atomically, that would be wonderful. It would be impossible for one person to check out someone else's partial checkin.
If Subversion is getting this feature, or even has this already, it will be one of the best "selling points" to replace CVS!
Dr. Demento On The 'Net!
Damn you. That thread got me to download subversion source and read it - mistake I won't repeat any time soon. I've spent several months wading through fairly disgusting code - block device drivers are not pretty, ditto for devfs. I had more than once found myself grabbing Lovecraft to read something that would be less nightmare-inducing. But _THAT_ takes the fscking cake - I don't _care_ what Larry (or anybody else for that matter) does to people who had excreted that code. No, wait - I _do_ care. I want video of the... event. I don't use BK, but you can be damn sure that I won't touch SVN. Ever.
found on The Linux Kernel Mailing-List
What's really new about arch is that it makes version control decentralized, rather than merely distributed, as CVS and Subversion do. In arch, any branch, any developer's private work area can be regarded as a repository of its own, with a global name space for developers, repositories, and branches, to manage this.
With CVS and Subversion, the repository is still a centralized entity. Compared to arch, these systems are still at a similar conceptual level, while arch adds something that is really new.
In comparison, CVS over ssh is secure and works pretty much everywhere. Many machines don't need to run a web server, let alone Apache 2, while ssh pretty much runs everywhere.
I wasn't going to bother, but the previous comment mentioning arch has been modded up to 4, so I'll speak a tiny bit of my peace.
... well, just follow the dev list closely.
SVN is a huge and complex system that aims, for its 1.0 release, to be just a tiny bit more featureful than CVS. There's quite a large number of bugs. The complexity for repository administrators is pretty high. The developers are willfully postponing consideration of a lot of deep issues in revision control. If you follow the dev list closely
arch is a tiny, simple system that aims, for its 1.0 release, to be way more featureful than CVS. Although I don't think its ready for deployment in large-scale situations, early adopters tell me that they enjoy using it. arch, unlike svn, is very well positioned to compete (with just a bit more development) with BitKeeper, ClearCase, and others. arch can do a hell of a lot for the commercial free software world with just a bit of investment.
And, I don't know how you should interpret this, but svn has a handful of paid developers -- arch has none and, in fact, I'm literally days away from homelessness.
Go figure.
Subversion does look somewhat better and cleaner than CVS. But there are lots of add-on tools for CVS that will need to get ported (GUIs, servers, web interfaces, IDE integration, etc.). Just the retraining required to get people to use it in a multi-user environment is pretty daunting--CVS is used by many people who are not primarily developers, and the switch wouldn't be easy for them.
:)
Good point, but this is also a big concern of the Subversion folks. This is why subversion looks so much like CVS. The commands and aliases are almost all the same, and a great part of the comportment the users see also is.
The ViewCVS scripts has already been ported to SVN, though it's not perfect yet, it does work. The GUI is pretty much in development indeed : RapidSVN is a working one, yet not complete either. An Emacs mode, similar to the CVS mode, shouldn't bee too hard to code I suppose, this is just a matter of time, will, and knowledge of elisp
There was talkings about using SVN as a backend for a wiki too, this could be fun and really nice. A first draft had been coded by Greg Stein (if I'm not mistaken), but it was mostly test stuff.
Subversion still needs help and contributors. People keep whining about CVS not handling file renaming etc, and they also keep using complicated tricks to avoid those flaws. I know, I've done it too. The very same people look at subversion and say "mh, nice, but not mature yet, let's wait it grows up a little". I doubt it'll grow quickly on its own, it just needs some help from all these coders who *will* use it in a few months/years !
Believe me, once you've switched to svn, it just looks life is *so* easier. Try it, it won't bite, and you'll most likely love it !
theefer
improving Arch instead of blathering about it people might give a shit. I mean this in the nicest possible way, of course. God knows you've got the time - over 100 lengthy posts to the GCC list in the last week alone.
I'd love to use SVN. I once tried to set it up on my own system (long ago, when it was much more difficult than it is now).
Then I decided that if I wanted to get other people involved in my project, using SourceForge would be simpler for me and them. And SF uses CVS.
If SF offered an SVN option (does it? will it?), I might move to that. I don't much like CVS for normal development (refactoring's a bitch), but the alternative is to host all the stuff that SF does on my own. I'm a developer, not a sysadmin; I don't have time for that.
CVS operates on a per-file model. Each file has its own history, branches, and so on. Operations on a set of files actually visit each file, perform the operation, and move on.
This leads to the following problems:
Creating a branch or a tag visits (and writes on) every file in the source tree, so it takes a long time. For example, the gcc folks would like to create periodic snapshots of their source tree and publish the snapshots. One step in doing this is tagging all the files. Well, creating that tag writes information into every source file and takes HOURS.
Renaming a file is not supported. All the history information in CVS is associated with "foo.c". If you want to rename "foo.c" to "bar.c", you actually have to create bar.c and then delete foo.c. This loses all the history associated with the old foo.c.
Directories are even worse. There is no way to delete a directory in the CVS repository (that's what all the "prune on checkout" kludgery is for, to delete empty directories in the client work area that should not even exist in the first place).
When I edit source, I often edit more than one file. I might edit 10 files in 5 separate directories. CVS has no notion that my changes are one "unit of activity". The GNU project uses ChangeLog files, which manually tie the 10 changes together and actually work very well. But it would be even better if CVS knew that when I committed 10 files, it's all part of one changeset, not 10 separate changes. That makes it a lot easier to backport patches from development branches to stable branches, to figure out what some other guy did (hmmm he changed foo.h, I wonder what went along with that change?)
These are all well-known problems to people who use CVS a lot. Newer source control systems (bitkeeper, subverrion, arch) all have the idea of changesets in some form or other, and all have better ways of implementing whole-tree operatings like tagging and branching.
These are just the "data model" problems. The standard CVS server has other implementation problems -- that is, problems that could be fixed just by improving the server, without changing the millions of cvs clients in the world. One big problem is that CVS needs write access to the files, even for read-only operations such as anonymous checkout, and does excessive disk I/O, even for read-only operations. This is particularly annoying because CVS doesn't guarantee checkout consistency across a whole tree anyways, but only a single directory! This is no big deal in a departmental cluster but it becomes a serious issue for public open source servers that are trying to scale up to serve the whole world and do it with limited resources.
For a start, the CVS command line client is non interactive, which is a pain in the ass if you are using SSH authentication and have to enter your passphrase every time you want to do something.
/ path/to/cvs/root checkout file
That's what's ssh-agent is for, you upload your public key to the machine running CVS and the agent running on your machine authenticates you without a password.
cvs -d:pserver:anonymous@cvs.project.sourceforge.net:
Oh wow, yeah, now thats so obvious isn't it?
1. You can get rid of everything up to the "checkout" by putting it in your CVSROOT variable.
2. No subsequent updates/checkins ever need this information again as it's stored in the CVS data in the directory. So it's a one time deal.
Subversion replaces the cryptic CVSROOT with a "normal" url, so you'll be typing something like "svn co http://someserver/repository module".