Multi-User Subversion
chromatic writes "Rafael Garcia-Suarez has just penned an article about adopting Subversion for multi-user projects. (He also has a previous article on Single-User Subversion). With the recent release of Subversion 0.16 (see the File sharing link), the successor to CVS looks very good."
Sorry, I'm not exactly a professional developer but:
What's the advantage of killing the CVS standard, which seems to work well today? I mean, do the features of this "Subversion" make it worth the switchover?
I really hope that building ancillary tools like nice clients (wincvs) and useful add-ons (bonsai) is easy enough to do, because that's really where the critical mass is wrt widespread adoption.
There aint no pancake so thin it doesn't have two sides.
If something goes wrong during a CVS checkin, then all hell can break loose.
Sex - Find It
The fundamental design of CVS is flawed, and this leads to anomalous behavior. Because the problem with CVS is in its design, not its current implementation, a re-design and corresponding re-write is required.
For example, CVS stores its repository in a series of diffs in a directory structure, and the directory structure parallels the development tree. The difficulty is, directories in the cvs backend are therefore not versioned! Thus, moving files around, and re-working the tree, are not handled correctly in cvs. In subversion, the entire repository (dirs, files, and all) is stored as a single coherent revision; a subversion repository is an array of coherent tree/file groupings. As such, correct handling of directories occurs automatically. Also, atomic commits (something cvs lacks) are handled much more easily in this model.
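To make the "array of coherent trees" idea concrete, here is a minimal Python sketch. It is only a toy model and not how Subversion is actually implemented: the Repository class, its commit and move methods, and the one-dict-per-revision layout are all made up for illustration.

```python
# Toy model (hypothetical, not Subversion's real storage format):
# a repository is a list of whole-tree snapshots, one per revision.
class Repository:
    def __init__(self):
        self.revisions = [{}]  # revision 0 is the empty tree

    def commit(self, changes):
        """Apply all changes as ONE new revision.

        changes maps path -> new content, or path -> None to delete.
        """
        tree = dict(self.revisions[-1])  # work on a private copy of the latest tree
        for path, content in changes.items():
            if content is None:
                del tree[path]  # a bad path raises KeyError before anything is published
            else:
                tree[path] = content
        self.revisions.append(tree)  # the single step that publishes the revision
        return len(self.revisions) - 1

    def move(self, old, new):
        """A move is just a delete plus an add inside the same revision."""
        content = self.revisions[-1][old]
        return self.commit({old: None, new: content})
```

Because the new tree is built on the side and appended in one step, a commit either shows up completely or not at all, and moving a file into a different directory is recorded like any other change. A real system would share unchanged data between revisions instead of copying the whole tree.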
Let me also say that the design of subversion is absolutely excellent. The design is properly decoupled and properly abstracted. Architecturally, it is greatly superior to cvs, and substantially superior to several commercial alternatives. I would imagine that the low-end source control products (PVCS, SourceSafe) will have significant difficulty staying alive once Subversion is widely deployed and tested.
One thing I like to do is put the CVS $Revision$ substitution variable in each of my text-based project files. I use them to help me know which version I've got on my live site, in case I've made bugfixes to that branch, etc.
If there aren't any changes to the file in my new tagged & branched release, that $Revision$ variable will stay the same between releases. This is irrelevant for all my other files that aren't web-related, but the ones that are can remain cached in a user's browser as long as there haven't been changes to the file (this is especially helpful if I have large JavaScript library files that would otherwise slow down load times a lot).
However, I don't see how I can do this using Subversion. It looks like the version for all project files is incremented every time a single file is checked in.
Is there an alternative or better technique than what I'm doing by using Subversion? I like its advantages over CVS and would like to experiment with it more!
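One possible way to get a per-file number out of a global revision counter, sketched in Python. This assumes the repository can tell you which paths each revision touched; the log format and the last_changed function are hypothetical and are not a Subversion API.

```python
def last_changed(log):
    """Return path -> number of the last revision that touched that path.

    log is a list of (revision_number, paths_changed) pairs; this format
    is an assumption made for the sketch.
    """
    result = {}
    for rev, paths in sorted(log, key=lambda entry: entry[0]):
        for path in paths:
            result[path] = rev  # later revisions overwrite earlier ones
    return result


log = [
    (1, ["index.html", "lib.js"]),
    (2, ["index.html"]),
    (3, ["about.html"]),
]
per_file = last_changed(log)
```

Here lib.js keeps the number 1 even though the repository as a whole has moved on to revision 3, which is the property the browser-caching trick relies on.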
Of course, no discussion of CVS's strengths and weaknesses vs. Subversion would be complete without mention of the third contender, arch. Link 1, Link 2.
I hereby adopt subversion 0.16.
I am now Dimwit 22.16.
(Subversion - sub-version...get it? Ha ha? Ahhh...nobody has a sense of humor these days...)
...but it's being eaten...by some...Linux or something...
If you're a coder, and have never used CVS, try it. It's absolutely lovely. "Oh, introduced a bug there...let's just diff against a known good version." "Oh, it looks like *Bob* was the one to commit that broken code." "Why did I add *that* code? Let me check my CVS log..."
Yeah, there are probably things about CVS that could be better. But if you've never used it, and aren't already using a competitor, it's really good.
May we never see th
The only problem I have with subversion is its dependence on apache, which just gets in the way in a local project, unlike cvs, which can be used over rsh/ssh, cvsserver, and locally. The other problem with apache is that they use the HEAD of httpd and apr as their base, which is wrong for use with Darwin (Mac OS X). Also, apache is big and is too modular for a project like subversion.
I've been moving all of my CVS development over to Subversion over the past few months, including a couple development servers at my company.
Since Subversion is now in Debian unstable, it's really easy to install. Compiling from source is a bit of a hassle due to all the dependencies, especially on the apache2 libs.
So far, I've not had a bad experience. No data loss or anything. And I'm very, very happy that I can finally get rid of pserver.
Subversion impressed our company developers with its speed (subjectively, considerably faster than CVS for comparable operations) and its user-friendliness. It's the details, stuff like automatic detection of binary files, that make life easier for the dev people.
For the admin, the fact that it runs via apache2 makes your life much easier, especially when it comes to firewalling and access control (user and passwords, etc.) - in a corporate network, you could easily plug it right into your LDAP server for authentication, for example.
Two things are holding Subversion back right now, IMHO:
a) lack of a wincvs/tkcvs equivalent. Rapidsvn is making progress, but it's still very much alpha.
b) a couple things still missing, like understanding symlinks.
Assorted stuff I do sometimes: Lemuria.org
see the File sharing link directly from the freekin frontpage. This was meant for humor, but obviously the moderation has no sense of humor. Off topic was a bogus moderation though.
Ignore the "p2p is theft" trolls, they're just uninformed
Check out Aegis - http://aegis.sourceforge.net. It's better than Subversion. It's older than Subversion. It's more stable than Subversion. It has atomic multi-file commits. Branching to any depth. Multi-user support. Distributed support. Applying change sets to multiple repositories. And much, much more.
Does anyone have any experience with integrating Subversion with, say, NetBeans? Does it work as well as the CVS support in NetBeans?
:-)
I'll probably try it anyway, I'm just lazy.
Being bitter is drinking poison and hoping someone else will die
Yes, try reading the FAQ.
http://subversion.tigris.org/project_faq.html#cvs2svn
So far, it's "only" a Python script. Very much in beta, the usual warnings apply, but they claim it's working ok.
As an example, the whole first year of Subversion's own history was converted from CVS into a 3000+ revision svn repository. It took about 30 minutes.
http://svn.collab.net/repos/svn/trunk/tools/cvs2svn/README
Being bitter is drinking poison and hoping someone else will die
Damn you. That thread got me to download subversion source and read it - mistake I won't repeat any time soon. I've spent several months wading through fairly disgusting code - block device drivers are not pretty, ditto for devfs. I had more than once found myself grabbing Lovecraft to read something that would be less nightmare-inducing. But _THAT_ takes the fscking cake - I don't _care_ what Larry (or anybody else for that matter) does to people who had excreted that code. No, wait - I _do_ care. I want video of the... event. I don't use BK, but you can be damn sure that I won't touch SVN. Ever.
Short and concise as ever.
http://marc.theaimsgroup.com/?l=linux-kernel&m=103402696209262&w=2
Generally, Subversion's interface to a particular feature is similar to CVS's, except where there's a compelling reason to do otherwise.
How about this for a compelling reason: CVS's interface is HORRENDOUS!
Look, CVS has fantastic features, to be sure. But it has a horrible interface that's far more complex than it needs to be. I haven't even found a GUI front-end that can make it easy to use.
It's great to have powerful features, but not everyone needs all that power. 9 times out of 10, all I need is simple check-in and check-out with revision control. I don't need encryption. I don't need a million options for checking in and checking out.
I just find all this other stuff gets in the way. I'm a firm believer that if you want to use software at its simplest levels, it should be simple to use. As you get to more advanced features, it's okay for it to get more difficult to use. But to make it difficult to do the most basic things just doesn't make sense.
I don't mean to slam CVS, but I'd just really like to see a simple to use alternative to it. Too many times I've gotten lost with CVS wondering exactly what the hell I had done.
In comparison, CVS over ssh is secure and works pretty much everywhere. Many machines don't need to run a web server, let alone Apache 2, while ssh pretty much runs everywhere.
Subversion does look somewhat better and cleaner than CVS. But there are lots of add-on tools for CVS that will need to get ported (GUIs, servers, web interfaces, IDE integration, etc.). Just the retraining required to get people to use it in a multi-user environment is pretty daunting--CVS is used by many people who are not primarily developers, and the switch wouldn't be easy for them.
It's been years since we have had any significant problems with CVS; it seems to be just ticking along, doing its thing. So, I'm not convinced switching to Subversion would be enough of an advantage to outweigh the risks and retraining costs associated with it. I think it would take a number of large and very visible open source development projects switching to Subversion to give me enough confidence in it to try it.
I wasn't going to bother, but the previous comment mentioning arch has been modded up to 4, so I'll say a tiny bit of my piece.
SVN is a huge and complex system that aims, for its 1.0 release, to be just a tiny bit more featureful than CVS. There's quite a large number of bugs. The complexity for repository administrators is pretty high. The developers are willfully postponing consideration of a lot of deep issues in revision control. If you follow the dev list closely ... well, just follow the dev list closely.
arch is a tiny, simple system that aims, for its 1.0 release, to be way more featureful than CVS. Although I don't think it's ready for deployment in large-scale situations, early adopters tell me that they enjoy using it. arch, unlike svn, is very well positioned to compete (with just a bit more development) with BitKeeper, ClearCase, and others. arch can do a hell of a lot for the commercial free software world with just a bit of investment.
And, I don't know how you should interpret this, but svn has a handful of paid developers -- arch has none and, in fact, I'm literally days away from homelessness.
Go figure.
Programming can be fun again. Film at 11.
CVS operates on a per-file model. Each file has its own history, branches, and so on. Operations on a set of files actually visit each file, perform the operation, and move on.
This leads to the following problems:
Creating a branch or a tag visits (and writes on) every file in the source tree, so it takes a long time. For example, the gcc folks would like to create periodic snapshots of their source tree and publish the snapshots. One step in doing this is tagging all the files. Well, creating that tag writes information into every source file and takes HOURS.
Renaming a file is not supported. All the history information in CVS is associated with "foo.c". If you want to rename "foo.c" to "bar.c", you actually have to create bar.c and then delete foo.c. This loses all the history associated with the old foo.c.
Directories are even worse. There is no way to delete a directory in the CVS repository (that's what all the "prune on checkout" kludgery is for, to delete empty directories in the client work area that should not even exist in the first place).
When I edit source, I often edit more than one file. I might edit 10 files in 5 separate directories. CVS has no notion that my changes are one "unit of activity". The GNU project uses ChangeLog files, which manually tie the 10 changes together and actually work very well. But it would be even better if CVS knew that when I committed 10 files, it's all part of one changeset, not 10 separate changes. That makes it a lot easier to backport patches from development branches to stable branches, and to figure out what some other guy did (hmmm, he changed foo.h, I wonder what went along with that change?).
These are all well-known problems to people who use CVS a lot. Newer source control systems (BitKeeper, Subversion, arch) all have the idea of changesets in some form or other, and all have better ways of implementing whole-tree operations like tagging and branching.
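Since CVS only records per-file commits, the "unit of activity" has to be guessed after the fact. Below is a rough Python sketch of one common heuristic: treat per-file commits with the same author and log message, made close together in time, as one changeset. The FileCommit record, the group_changesets function, and the 300-second window are assumptions for illustration, not part of CVS or of any particular tool.

```python
from collections import namedtuple

# Hypothetical record for one per-file commit, as a per-file system would log it.
FileCommit = namedtuple("FileCommit", "path author message timestamp")


def group_changesets(commits, window=300):
    """Group per-file commits into guessed changesets.

    Two commits land in the same changeset when they share author and
    log message and the gap to the previous commit in that changeset
    is at most `window` seconds.
    """
    changesets = []
    latest = {}  # (author, message) -> most recent changeset for that pair
    for commit in sorted(commits, key=lambda c: c.timestamp):
        key = (commit.author, commit.message)
        current = latest.get(key)
        if current is not None and commit.timestamp - current[-1].timestamp <= window:
            current.append(commit)
        else:
            current = [commit]
            latest[key] = current
            changesets.append(current)
    return changesets


history = [
    FileCommit("foo.h", "alice", "fix parser", 100),
    FileCommit("foo.c", "alice", "fix parser", 102),
    FileCommit("bar.c", "bob", "typo", 101),
    FileCommit("foo.c", "alice", "fix parser", 5000),
]
guessed = group_changesets(history)
```

A system with real changesets records this grouping at commit time, so nothing has to be reconstructed and there is no risk of guessing wrong.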
These are just the "data model" problems. The standard CVS server has other implementation problems -- that is, problems that could be fixed just by improving the server, without changing the millions of cvs clients in the world. One big problem is that CVS needs write access to the files, even for read-only operations such as anonymous checkout, and does excessive disk I/O, even for read-only operations. This is particularly annoying because CVS doesn't guarantee checkout consistency across a whole tree anyways, but only a single directory! This is no big deal in a departmental cluster but it becomes a serious issue for public open source servers that are trying to scale up to serve the whole world and do it with limited resources.
If I use the start of the mailing list archives as a guide, I would say both are about the same age (around April 2000). Both are still in alpha (check Subversion's status).
Personally, I would like a better comparison of these two.
Subversion is indeed already a giant step better than CVS in all the areas where CVS was painful, while having a good migration path. Arch, OpenCM, and PRCS2 could be in the running, and Arch has that multi-repository support going for it. But I'd say Subversion is the best thing going as of right now.
I have a listing of all known SCM software for Linux at http://linuxmafia.com/~rick/linux-info/scm.html, in case it will help.
Rick Moen
rick@linuxmafia.com
I haven't actually used Aegis, but I assume that since the submitter also submits a test case, the submitter could "invent a test" as you put it that roughly corresponded to the way the new code is more "elegant."
I don't know if Aegis has this, but I'd like to see "benchmarks" in addition to "test cases." A test case basically either passes or fails, but a benchmark would return a score, so one could then submit a change that did not break anything but resulted in smaller line count, fewer branches, smaller object size, faster execution, whatever without breaking other test cases or worsening other benchmarks. You'd probably also need to define a utility function that would combine benchmark scores to determine which combination was better, and I expect people would frequently submit changes or even branches to that utility function as their thinking evolved on how they'd like to optimize various trade-offs.
Actually, as I understand it, that's not entirely true. While Aegis does have support for "a process", it is not mandatory. Rather, steps can be skipped and ignored. Furthermore, you can elect to adopt only the portions of the process that you like (test cases, etc). That means you only have to use as much of the process as you desire or none at all. In other words, it allows for a process framework to be used where developer supplied content fills in the framework or you can have null steps for each part of that framework leaving you with only the toolset behind.
Long story short, I believe that you've been misinformed. Feel free to correct as needed.
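A minimal Python sketch of what such a utility function could look like. Nothing here is an Aegis feature: the benchmark names, the weights, and the utility and accept_change functions are all hypothetical, and a weighted sum is just one simple way to combine scores.

```python
def utility(scores, weights):
    """Combine benchmark scores into one number (lower is better).

    Uses a plain weighted sum; other combinations are equally possible.
    """
    return sum(weights[name] * scores[name] for name in weights)


def accept_change(tests_pass, old_scores, new_scores, weights):
    """Accept a change only if every test passes and the combined score improves."""
    if not tests_pass:
        return False
    return utility(new_scores, weights) < utility(old_scores, weights)


weights = {"lines": 1.0, "runtime_ms": 2.0}
before = {"lines": 100, "runtime_ms": 50}
after = {"lines": 90, "runtime_ms": 52}  # shorter code, slightly slower
decision = accept_change(True, before, after, weights)
```

Changing the weights changes which trade-offs count as improvements, which is why the utility function itself would probably end up under revision control too.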
Case in point: Quite often, during code reviews or programming sessions, I come across bugs or bad programming methods that exemplify a certain fundamental lack of experience or understanding on the part of the author. Using cvs annotate I can determine exactly who wrote the line(s) in question, discuss the problem with the culprit and, if I do my job right, hopefully ensure that the mistake is not repeated. Without the annotation feature, I would have to ask each team member whether they wrote the code in question. Too often it happens that they don't remember. We have had some major directory reorganization the last few months, and at one point all of our files lost their history simply because of a single directory renaming operation.
The remove/add renaming trick damages the project's collective memory. You end up with bits of the past that are simply missing.
This is very much like TeamNet's "checkpoint" concept. In TeamNet, you made changes in your work area at will, and any time you wanted to remember the state of the project, you'd freeze a checkpoint.
This was a *very* cheap operation, since it consisted of supplying a name to the current state of the world, and creating a new "open" checkpoint.
To create a Work Area in TeamNet, you'd make a Virtual Copy, or VCP (effectively a symlink) to some checkpoint in the repository, which TeamNet referred to as the "Baseline". Your work area would only use physical storage for the files that you'd actually changed.
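The "only store what you changed" behaviour can be sketched in a few lines of Python using the standard library's ChainMap. This is an analogy for illustration and not TeamNet's actual mechanism; the make_work_area function and the file names are made up.

```python
from collections import ChainMap


def make_work_area(checkpoint):
    """Return a work area layered over a frozen checkpoint.

    Reads fall through to the checkpoint; writes land only in the
    (initially empty) local layer, so unchanged files cost nothing.
    """
    return ChainMap({}, checkpoint)


baseline = {"main.c": "v1", "util.c": "v1"}  # the frozen checkpoint
work = make_work_area(baseline)
work["main.c"] = "v2"  # stored locally, baseline untouched
```

After the edit, work.maps[0] holds only main.c, while util.c is still read straight from the baseline.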
I'm glad to see Subversion dealing with the need to know the entire state of the project, instead of leaving this burden on the developer. There are lots of times when adding a feature or fixing a bug means changes to a bunch of files, which I need to apply or roll back as a set.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."