Ease Into Subversion From CVS
comforteagle writes "While you have a nice leisurely Sunday afternoon/evening you might want to read this fine article on easing into Subversion from CVS. Written by versioning admin Mike Mason, it talks about the philosophy and design behind Subversion (now 1.0), how it improves upon CVS, and how to get started using it."
Remember, when changing software components, it's a good idea to back up first!
The article is a summary.
How we know is more important than what we know.
I've read the linked article (really!) and I think Subversion sounds like a good idea. Primarily, I like the fact that everything you can do with CVS you can do with Subversion in the same way as with CVS.
I am really curious how much demand there is for Subversion's new features, however.
Do developers out there voice the need to store binaries? I can imagine this being needed for web developers and such, but I think programmers can just build their binaries from CVS.
Also, have there been many problems that required atomic commits? Can someone explain why this is important? I mean, the idea is you'll need to merge one way or another. I can see the point being that what you commit at any given time will compile (presuming you're committing completed code), but realistically, does anyone not fix their up-to-date checks as soon as they happen?
Also, Subversion says that it is much faster at things like tagging, but tagging is not a very frequent operation...
To me it sounds like a great product but I am not able to see a compelling reason why most development shops out there who are currently in CVS would rush to switch.
Not a flame btw, just an opinion.
Ecce Europa - Web Design for Business
See the project front page
Subversion
It's nice that you can run a subversion server on MSWin server systems, I suppose, if that's the sort of thing that floats your boat. But how on earth is the option to spend hundreds of extra dollars on proprietary operating system software and the more-expensive hardware it requires "significantly lower[ing] the barrier to entry?"
There may be a minor barrier, in Win-only shops (although I would say that it's the Win-only policy that is the barrier, not the other way around). Like I say, Win support is a perfectly nice thing. But "significant"?
It bothers me a bit that all the files are now in a big database. A good thing about CVS is that you can see what files and modules are available using regular unix tools, and if things get messed up in some way you can always fall back to the rcs commands or in the worst case edit the ,v file by hand and extract the latest version. With a database, if things were to get corrupted enough (I have no evidence that this happens often, but still...) you are stuck. Just like with the windows registry, where if it gets messed up you lose big.
Any opinions on this?
Ok, I saw some questions about why people should switch from CVS to Subversion. The article does a nice job of covering what features Subversion adds, but people still seem to wonder why these are important.
Atomic Commits:
As stated in the article, if something goes wrong in the middle of a CVS commit (e.g. the network goes down), it can leave the commit only partially complete. This can be a problem if changes in multiple files depend on each other. Say I add a function to an API, then call it in another file. If the call gets committed and the API change doesn't, the code in CVS won't compile. With atomic commits, if the connection were dropped, the commit would simply roll back. Then when my network came back up I could try to commit again, but the repository would never be left in a state where it didn't compile.
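To make the all-or-nothing idea concrete, here is a toy sketch in Python. It is not how Subversion is implemented; `ToyRepository` and the `fail_after` knob are made up purely for illustration. Changes are staged on a private copy and only published once every file has arrived.

```python
class ToyRepository:
    """Toy in-memory repository: a commit applies fully or not at all."""

    def __init__(self):
        self.files = {}    # path -> contents
        self.revision = 0  # one counter for the whole repository

    def commit(self, changes, fail_after=None):
        """Apply all changes atomically and return the new revision.

        fail_after is a hypothetical knob that simulates the network
        dropping after that many files have been transferred.
        """
        staged = dict(self.files)  # work on a private copy
        for count, (path, contents) in enumerate(changes.items()):
            if fail_after is not None and count >= fail_after:
                raise ConnectionError("network dropped mid-commit")
            staged[path] = contents
        # Only reached if every file arrived: publish in one step.
        self.files = staged
        self.revision += 1
        return self.revision


repo = ToyRepository()
repo.commit({"api.c": "v1"})
try:
    # The API change arrives, the caller does not: nothing is published.
    repo.commit({"api.c": "v2", "caller.c": "uses v2"}, fail_after=1)
except ConnectionError:
    pass
```

After the failed attempt the repository still holds only the first commit, so a later retry starts from a consistent state.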
Constant Time Tagging/Branching:
In Subversion tagging and branching are fundamentally the same: they're both executed as a "copy" command. I'm not sure what the execution time is for these operations in CVS, though I believe it's linear in the size of the repository. In Subversion this is an O(1) operation. One of the posts commented that tagging is an infrequent operation; that may be true, but why not let it be fast anyway? And no matter how often you tag, constant-time branching is nice. I can at any time quickly create my own branch of a project to work from. Working in my own branch means that I can keep very granular track of my changes by committing frequently, without worrying about breaking something else. Once I'm satisfied with my changes I can merge my branch with the main code.
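Here is a rough Python sketch of why a "copy" can be constant time. It is a toy model, not Subversion's actual storage format, and the class and method names are invented for the example. A tag or branch is just a new name pointing at an existing tree, and a tree is only duplicated when someone writes to it.

```python
class CheapCopyRepo:
    """Toy model: branches and tags are names that point at shared trees."""

    def __init__(self, files):
        self.trees = {"trunk": dict(files)}

    def copy(self, src, dst):
        # O(1): dst refers to the very same tree object as src.
        # No per-file work happens, however large the tree is.
        self.trees[dst] = self.trees[src]

    def write(self, name, path, contents):
        # Copy-on-write: build a new tree for this name only, so any
        # tag or branch still pointing at the old tree is unaffected.
        # (This toy copies the whole dict on write; a real system
        # would share the unchanged parts.)
        new_tree = dict(self.trees[name])
        new_tree[path] = contents
        self.trees[name] = new_tree


repo = CheapCopyRepo({"main.c": "v1", "util.c": "v1"})
repo.copy("trunk", "tags/release-1.0")  # tagging
repo.copy("trunk", "branches/my-work")  # branching is the same operation
repo.write("branches/my-work", "main.c", "experimental")
```

The tag keeps seeing the old `main.c` while the branch sees the new one, and neither copy had to touch every file.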
Storing Binaries:
"Binaries" does not necessarily mean compiled code. There are plenty of things that can benefit from this. Anywhere you use graphics: web programming, GUI programming, or, say, game or other 3D programming where you want to store your models. Or you can store documentation in the repository: PDFs, Word docs, spreadsheets, etc.
Finally, the barrier to switching isn't all that high. The command line program has quite similar syntax, so switching is pretty easy, and the other interfaces such as the web viewer, TortoiseCVS, and IDE integrations generally have counterparts for Subversion.
Well, that's all I can think of for now. I'm actually going to try to get my company to switch over to Subversion from the commercial software they were using when we start on our new product. We're using a Java applet to interface with the repository now, and it's not very nice. CVS would work, since the main thing I want is integration with Eclipse and IntelliJ IDEA, but there are plugins to support this with Subversion as well. However, Subversion has nice features CVS doesn't, so I don't see any reason to use CVS over Subversion.
But using a database DOES provide advantages, as stated in the article. Mostly speed advantages, but also the ability to do live backups. If you try backing up an online (as in live) CVS server's files, there's nothing stopping people from doing commits, thus possibly botching your backup (you're no longer backing up the files you thought you were).
And when it comes down to it, backups are really where your safety lies. In the last CVS project I worked on, the repository was hosed twice. Once due to a careless admin, and once due to the hard drive dying. While we had some down time, virtually no work was lost, largely due to our nightly backups. The fact that CVS stored its data as plain text files certainly didn't protect us.
Be sure to talk to your programmers before you pull the switch on them. Not telling them would be rather subversive...
He who laughs last is stuck in a time dilation bubble.
Once a week, a snapshot release is made. That means a tag is added. This operation takes, on average, 40 minutes, because the GCC source tree is large.
Every time someone makes a branch, they create a tag just before branching (for use later on, with diffs and merging). 40 minutes to tag, another 40 minutes to branch.
All because these are, stupidly, O(n) operations instead of O(1). We'd like to move to Subversion, but can't, until they get annotate ('svn blame') fully working, because GCC developers spend a lot of time doing "revision-control archaeology".
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
The problem is that putting the stuff into a database creates another dependency on a non-trivial piece of software. That creates all sorts of risks.
On UNIX/Linux systems, the file system is more than sufficient for handling this kind of storage and transactioning, so this dependency and risk is unnecessary.
I suspect Subversion uses a database because it may be intended to run on operating systems with less powerful file systems.
Subversion Good Points:
- Finger feel is very similar to CVS
- Flexible directory layout & tagging
- Extremely stable development.
Subversion Bad Points:
- Database & log files take up a LOT of space.
- Quite hard to share repositories
- No way to mark your branches (if you accidentally check out the directory containing your branches, you just got 50 gigs of 99.9% identical files...)
- No distributed development
- Pretty weak merging
Arch Good Points:
- Extremely good distributed development
- Super easy to share repositories
- Pretty strong merging.
- Very stable development
Arch Bad Points:
- Forces you to give your projects weird names ("my-project--branch-1--1.1").
- Forces each branch into a different top-level directory in your archive ("my-project--branch-2--1.1").
- Doesn't feel anything like CVS.
- Pretty slow (but they're working on it).
- Somewhat difficult to resolve merge conflicts
I wish I could love Arch because distributed development absolutely rules. I could tolerate its bizarre command set, but I simply won't accept arbitrary (and ugly) constraints on what I name my projects and branches.
Verdict: I'm still using CVS. Subversion is very close to pleasing me enough to switch... I'll probably ditch CVS some time this year.
Do developers out there voice the need to store binaries?
There are definitely reasons for storing binary (non-text) files in a version control system:
WWTTD?
I can't switch unless we can convert our repository from cvs. Are there tools for doing this?
However, IF there were no free software like Subversion, I'd rather make do with CVS than use non-free stuff, even if someone else paid for it. For example, CVS does not have atomic commits, so I use tags instead (ironic, since CVS does tagging quite slowly, but still acceptable for one-man projects). Other weak points of CVS can also be worked around. It isn't pretty, but it's not THAT painful either. Actually, before I discovered RCS, I just did version control manually by saving a tarball after each day's work, which is tedious but still sufferable.
Of course, for large projects, version control is much more important.
Is there any client front end for subversion that makes a graphical tree of versions, like wincvs or cervisia? It's a very useful feature and I would like to have something equivalent for subversion.
"I think this line is mostly filler"
Are there any GUI clients like wincvs for subversion yet? It looks like a much better tool, but I don't see my group switching unless there is a client that is at least as good as wincvs.
That's a pretty good question in my opinion, and TortoiseSVN's Windows shell-extension doesn't cut it. ("-1, Redundant" my ass.) If you're looking for something more like WinCVS, check out RapidSVN.
If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
Actually, atomic commits means something totally different from global revision numbers. Having atomic commits means that a software failure during repository-modifying activities leaves everything in a well-defined state. That is, if you are committing your changes and the network connection dies, your computer dies, the OS or source control client crashes, you kill the source control client, etc., then your repository should not be corrupted (and it will be as though you never committed at all). With CVS, some files may have gotten committed to, and others not, leaving the repository in an unknown/inconsistent state.
I've been using CVS in professional development environments for about 5 years at several different employers. I love CVS but have been watching Subversion closely and with some anticipation.
The atomic commits will be nice but honestly the lack of them has never been a huge problem for my teams (atomic commits are probably less a problem with 6-8 people). The things that do bug me about CVS that Subversion is supposed to address:
1. the ability to move or rename a file w/o losing the history
2. the ability to set file permissions
3. ability to remove unused directories
I know that these things can be achieved by tweaking CVS's files manually but that's a long way from elegant. It's been a stumbling block when I'm trying to introduce a new team to CVS.
Someone want to forward this to the guys at SF? I'd like to know what became of their, "we'll add subversion once it matures enough" claim.
I spent all weekend playing around with svn, actually, cvs2svn is still converting my 3 GB cvs repo...
There are two things holding me back currently: the long (and possibly broken, since cvs2svn.py is not 1.0) conversion process, and the lack of decent support for the cvs modules file.
I think I _might_ be able to convince the rest of the developers that a clean switch might be ok, but there's no way around the heavy use of the modules file.
What we do is have the projects that you check out as modules in the modules file. Each of those includes the common parts that are shared across the two platforms, as well as across projects. We also build some of our components as libraries, and we also include third-party libraries. When we're ready to ship something, we tag that module in cvs, which tags not only the source for that project, but all the common code, our libraries, and external libraries. This makes it very easy to share code across projects, yet retain an easy checkout/build/rebuild. I don't see how I can do this with subversion and the externals file... =(
--patiently waiting on the svn:externals...
Hot backups to plain text make the live data storage format largely irrelevant. See `svnadmin dump --incremental`, `svnadmin hotcopy` (and its wrapper script `hot-backup.py`) as documented in the open source "Version Control with Subversion" book (another fine O'Reilly tome written by some of the core developers).
http://svnbook.red-bean.com/html-chunk/ch05s03.html#svn-ch-5-sect-3.6
Sure, most of us have edited ,v files by hand at one time or another, but Subversion has built-in commands replacing almost every non-corruption use case for that insanity. The operational procedure for handling data corruption -- which has never happened to date -- is backups, not hacking at the raw data storage format and praying.
Daniel Rall
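For anyone who wants to script the two backup commands mentioned above, here is a minimal Python sketch. The function names and every path are placeholders I made up; only the `svnadmin hotcopy` and `svnadmin dump --incremental -r` invocations come from the tool itself. The sketch assumes `svnadmin` is on the PATH when the dump is actually run.

```python
import subprocess


def hotcopy_command(repo_path, backup_path):
    """Argument list for a live copy of the whole repository."""
    return ["svnadmin", "hotcopy", repo_path, backup_path]


def incremental_dump_command(repo_path, first_rev, last_rev):
    """Argument list for a plain-text dump of a revision range."""
    return [
        "svnadmin", "dump", repo_path,
        "--incremental", "-r", f"{first_rev}:{last_rev}",
    ]


def run_incremental_dump(repo_path, first_rev, last_rev, dump_file):
    """Run the dump; svnadmin writes the dump stream to stdout."""
    with open(dump_file, "wb") as out:
        subprocess.run(
            incremental_dump_command(repo_path, first_rev, last_rev),
            stdout=out,
            check=True,  # raise if svnadmin exits with an error
        )
```

A nightly job could call `run_incremental_dump` with the range of revisions committed since the previous run, and do a full `hotcopy` less often.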
Yes you can ease right into subversion like Taco eases right into Timothy's backside.
If you speed up CVS, I'll have no time to read /.
Yes, if you've used TortoiseCVS before, you might want to check out TortoiseSVN...
It integrates into Windows Explorer and allows you to do all the updates, commits, etc with right mouse clicks.
Also, have there been many problems that required atomic commits? Can someone explain why this is important?
Well, to database developers, the thought of having SQL scripts committed WITHOUT atomic commits is very scary. I use CVS to record the SQL DDL scripts for database generation (and backup). If I committed a new table.sql script, for example, and that conflicted with a sequences.sql which was not committed atomically, my database keys could completely melt down...
Fortunately, we don't have enough developers with CVS for that to be a problem, but I plan to move us to Subversion soon.
Yeah, I love the fact that there's a revision number that's global to the whole repository. We embed that number into each build of our product and our testers file bugs against a particular revision.
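A rough Python sketch of how a build script might pick up that number: it pulls the `Revision:` line out of `svn info` output and turns it into a build label. The function names, the product name, and the sample output are all made up for illustration, and real `svn info` output contains more fields than shown here.

```python
import re


def parse_revision(svn_info_output):
    """Return the revision number from the text printed by `svn info`."""
    match = re.search(r"^Revision:\s*(\d+)\s*$", svn_info_output, re.MULTILINE)
    if match is None:
        raise ValueError("no Revision line found")
    return int(match.group(1))


def build_label(product, revision):
    """Label that testers can quote when filing a bug."""
    return f"{product}-r{revision}"


# Illustrative sample only, not captured from a real working copy.
sample = "Path: .\nURL: http://svn.example.com/repo/trunk\nRevision: 1234\n"
label = build_label("myapp", parse_revision(sample))
```

In a real build the text would come from running `svn info` in the working copy rather than from a hard-coded string.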
Has anyone done that for Subversion with some Java build tools like Ant or Anthill? Do you incorporate the build number into your WAR or EAR file?