Ease Into Subversion From CVS
comforteagle writes "While you have a nice leisurely Sunday afternoon/evening you might want to read this fine article on easing into Subversion from CVS. Written by versioning admin Mike Mason, it talks about the philosophy and design behind Subversion (now 1.0), how it improves upon CVS, and how to get started using it."
Remember, when changing software components, it's a good idea to back up first!
But using a database DOES provide advantages, as stated in the article. Mostly speed advantages, but also the ability to do live backups. If you try backing up an online (as in live) CVS server's files, there's nothing stopping people from doing commits, thus possibly botching your backup (you're no longer backing up the files you thought you were).
And when it comes down to it, backups are really where your safety lies. In the last CVS project I worked on, the repository was hosed twice. Once due to a careless admin, and once due to the hard drive dying. While we had some down time, virtually no work was lost, largely due to our nightly backups. The fact that CVS stored its data as plain text files certainly didn't protect us.
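For anyone who wants to script the live backup mentioned above, here is a minimal sketch built around `svnadmin hotcopy`, which copies a repository while it stays in use. The repository path, backup directory, and dated naming scheme are assumptions for illustration only, not something taken from the article.

```python
from __future__ import annotations

import subprocess
from datetime import date
from pathlib import Path


def hotcopy_command(repo: Path, backup_root: Path, day: date) -> list[str]:
    """Build the argv for a dated `svnadmin hotcopy` of `repo`.

    The "<repo name>-<YYYY-MM-DD>" target name is a hypothetical convention.
    """
    target = backup_root / f"{repo.name}-{day.isoformat()}"
    return ["svnadmin", "hotcopy", str(repo), str(target)]


def nightly_backup(repo: Path, backup_root: Path) -> None:
    """Run the hot copy; requires `svnadmin` on the PATH."""
    # check=True raises CalledProcessError if svnadmin exits with an error.
    subprocess.run(hotcopy_command(repo, backup_root, date.today()), check=True)


# Example call with made-up paths, e.g. from a nightly cron job:
# nightly_backup(Path("/var/svn/project"), Path("/backup/svn"))
```

Building the command separately from running it makes it easy to log or inspect what a cron job would execute before trusting it with real data.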
How about individuals wanting source control on their at-home projects? I'm sure not going to spend the money on the MS control, but I don't have a *ix box up 24/7 either. (I use my laptop nearly exclusively, and my laptop hardware supports Windows better.)
When you use PostgreSQL, MySQL, or Oracle, does it bother you that your data is in a big database? Why do you worry so much about Subversion then?
A good thing about CVS is that you can see what files and modules are available using regular unix tools, and if things get messed up in some way you can always fall back to the rcs commands or in the worst case edit the ,v file by hand and extract the latest version.
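As a rough sketch of the "fall back to the rcs commands" route: RCS's `co -p` prints a revision to standard output instead of writing a working file, and with no revision given it takes the latest one on the default branch. The wrapper below and the example path are hypothetical; only the `co` flags come from RCS itself.

```python
from __future__ import annotations

import subprocess


def latest_revision_command(rcs_file: str) -> list[str]:
    """Build the argv that prints the newest revision stored in a ,v file."""
    # -q suppresses diagnostics; -p sends the revision to stdout.
    return ["co", "-q", "-p", rcs_file]


def read_latest_revision(rcs_file: str) -> bytes:
    """Return the file contents of the latest revision; requires RCS `co`."""
    result = subprocess.run(
        latest_revision_command(rcs_file), check=True, capture_output=True
    )
    return result.stdout


# Example call with a made-up repository path:
# data = read_latest_revision("/cvsroot/project/main.c,v")
```

This reads straight from the repository file without going through the CVS server, which is the whole point of the plain-text fallback.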
It is a good thing that you were able to hand-edit CVS repositories when they got corrupted -- because corrupt CVS repositories are a dime a dozen.
I've been using Subversion since January 2002 (yes, a full two years before 1.0 came out), and I have never, ever, ever seen a corrupt repository or heard about one on the mailing lists. When someone did claim that Subversion had corrupted their repository, the Subversion devs dropped everything to make sure this wasn't the case. AFAIK, it has never happened. Usually the cause was the person using multiple servers to access their repo, or putting their repo on a network share (Berkeley DB doesn't work over NFS/AFS/CIFS).
Let me quote a Slashdot posting of mine from a couple of years ago:
My opinion has not changed in the past two years.
Thomas
A filesystem should not be used to hold multiple versions of a file as well as the meta-data associated with it. Let's not forget the associations of multiple files that become a project. This is the work of a database, hence Berkeley DB. If you are concerned about "repairing" a file (aka db), there are command-line tools for just such an event, but you will probably find that you just won't ever need them. Just my 5000 shekels.
A filesystem should not be used to hold multiple versions of a file as well as the meta-data associated with it. Let's not forget the associations of multiple files that become a project. This is the work of a database, hence Berkeley DB.
The UNIX file system is a database. That's what it is designed to be, that's what it is used as, and that's what it is good at. It has an extensive set of tools for manipulating it and lots of excellent GUIs for dealing with it. Some current UNIX/Linux file system implementations are, in fact, implemented just like database software.
You are just mindlessly repeating what generations of Windows hackers stuck on flaky FAT file systems have told you.
If you are concerned about "repairing" a file (aka db), there are command-line tools for just such an event, but you will probably find that you just won't ever need them. Just my 5000 shekels.
No, I'm "concerned" about having to use an entirely different set of tools to manipulate data in a DB, about DB performance for blobs, and about a Subversion installation breaking when I upgrade the DB shared library. None of those are theoretical problems; they actually happen in practice. And on UNIX, they are completely unnecessary problems and risks.
As I was saying, the Berkeley DB decision may make sense if the Subversion server is supposed to run on Windows or on MacOS. But as far as UNIX and Linux are concerned, it's a no-brainer: this kind of data belongs directly in the file system, not in databases.