Pragmatic Version Control Using CVS
What's the approach? The philosophy of this series is summed up on the Starter Kit website:
Software development is difficult enough; if you try to build on a shaky foundation it can make development almost impossible (which might account for the fact that about 50% of all software projects fail). You need a firm foundation: The Pragmatic Starter Kit is a set of basic, common-sense practices applicable in all software development environments. The techniques given in these three books are not expensive to implement and are not hard to learn, but can make the difference between being a success and being a statistic.
The first book in the series covers the what, why and how of software versioning, using CVS for the examples. It walks you through installing CVS clients, setting up your server, and using basic commands, then teaches advanced concepts. It is the new CVS handbook that can be used by both beginners and veterans.
Target Audience This book, like The Pragmatic Programmer, should have very broad appeal. It should be required reading for any junior developers or CVS administrators, and it should be a bookshelf reference book for mid-level to senior developers. It is slanted heavily towards CVS, but given that CVS is free and widely used, that shouldn't prevent anyone from using the book to learn the concepts, even if their company uses another versioning system for production work.
What's to like? As is usual for Thomas and Hunt's books, this one is a very easy read. The concepts are clearly laid out, with plenty of working examples throughout. There is good coverage of the fundamentals as well as very advanced topics. Unlike most CVS books or tutorials, this text is clear and straightforward. It's easy to understand and follow. It's got the best coverage of CVS branching and merging that I've ever read!
What's to hate? Honestly, there is not a lot here that I don't like. The introductory chapters are a little too basic, but since the book is (partly) aimed at beginners, that's okay.
Why bother reading this book? I've been using CVS for over six years now (including being the CVS admin at two companies) and this book covered a few very useful advanced topics that I had never even heard of. An example of this is the use of vendor tags (Chapter 10). Using this feature, you can have a local copy of your favorite open source project in your company's CVS server and make changes to it. You can then merge your local project with the new releases of the public project, and CVS will handle merging your changes with the public baseline. This feature is incredibly useful, but I didn't even know it existed until I read this book.
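For readers who have not seen the vendor-branch workflow, here is a minimal sketch of it as I understand it from the CVS manual; it is not an excerpt from the book. The module path thirdparty/libfoo, the vendor tag LIBFOO, the release tags and the directory names are all made up for illustration. The run helper only prints each command unless RUN=1 is set, and a real run also assumes CVSROOT points at your repository.

```shell
#!/bin/sh
# Dry-run helper: print each command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# Import an unpacked vendor release onto the vendor branch.
# $1 = directory holding the release, $2 = release tag (hypothetical names).
import_release() {
    (
        run cd "$1" || exit 1
        run cvs import -m "Import $2" thirdparty/libfoo LIBFOO "$2"
    )
}

# Merge the changes between two vendor releases into a fresh working copy.
# $1 = previous release tag, $2 = new release tag.
merge_release() {
    run cvs checkout -j "$1" -j "$2" thirdparty/libfoo
    # Resolve any conflicts in the working copy, then commit from inside it.
}

import_release libfoo-1.0 LIBFOO_1_0
# ... check out thirdparty/libfoo, make local changes, commit them ...
import_release libfoo-1.1 LIBFOO_1_1
merge_release LIBFOO_1_0 LIBFOO_1_1
```

The second import puts the new public release on the vendor branch, and the two -j options ask CVS to apply the difference between the two releases on top of the local changes.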
This book is a great introduction if you've never used a versioning system. By the time you've finished the book, you'll have installed CVS (client and server), created projects, created new files, merged changes, etc. If you already use versioning software, it can remind you about the features you've forgotten about (or never knew existed). This book is a great introduction and a great refresher too.
Where to buy?
Not so long ago in another Slashdot article, Andy and Dave suggested that in order to compete in the new global economy, we should all diversify our skill sets. To that end, this book is published under their new publishing company, The Pragmatic Bookshelf. You can buy copies from the Pragmatic Programmer's web site in both dead tree ($29.95) and PDF ($20.00) formats.
Summary As we have come to expect from Andy and Dave, this is another great book. The technical content is rich and clear but it won't put you to sleep. It has appeal to both newbies and veteran developers. I give it '10 out of 10 slashes.'
Richardson met Hunt while he and Thomas were finishing up The Pragmatic Programmer and has reviewed each book that they have written since -- he makes no bones about liking their work.
This site has more reviews for this book.
For me, I thought Code Complete was the book for learning good coding.
On another note, does anyone else want to scream every time someone says 'best practice'?
Time to bury CVS, not to praise it.
Check out Arch.
Perforce is the only version control software worth talking about. CVS just doesn't have the features or the robustness to be really useful. I wish CVS would go away, in favor of perforce, or better yet, an OSS equivalent. What's happening with subversion? Is it useable yet?
cvs checkout -r mytag repository
cvs log -rmytag -d 'yyyy-mm-dd'
two -r switches but... the first one has a space before the tag, the second one doesn't. when you look at the cederquist doco online the html really doesn't make this clear.
if this book addresses this one quirk it's worth a hundred bucks.
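If I read the Cederquist manual correctly, the difference comes from how each subcommand declares the switch: checkout requires an argument to -r, while log treats the argument as optional (a bare cvs log -r means the latest revision on the default branch), so log only sees the tag when it is attached to the switch. A small sketch that just prints the two spellings; mytag and mymodule are placeholder names:

```shell
#!/bin/sh
tag=mytag          # placeholder tag name
module=mymodule    # placeholder module name

# checkout: -r requires an argument, so a space before the tag is fine.
checkout_cmd() { echo "cvs checkout -r $tag $module"; }

# log: -r takes an optional argument, so keep the tag attached to the switch.
log_cmd() { echo "cvs log -r$tag"; }

checkout_cmd
log_cmd
```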
2 1337 4 u!
Unless the source control software has a complicated GUI from which you can cut and paste stuff into PowerPoint, and makes checking in a file a longer process than software development, our bosses won't go for it.
CVS is great for version control. Don't get tempted by Rational's ClearCase product.
A full build of a sample project with CVS takes me 30 seconds. CC takes 7 min, 30 sec.
CVS doesn't need multi-site repositories; ClearCase does if you have a lot of remote development.
CVS doesn't integrate with the kernel, so if CVS crashes it doesn't take your whole machine.
CVS has better add-on GUI tools for branching and comparison.
It is easy to create and apply patch files with CVS, something not easy to do with CC.
With CC, when you check out a file, you can't actually write to it. You have to loop and keep checking for the file to be 'writable' after check out. Even then, sometimes when CC marks the file as writable, it really isn't.
A batch update in CVS is easy; with CC you have to check out individual files. I have a script for this. A batch update takes about 20 minutes compared to 45 seconds in CVS.
CVS is free.
CVS doesn't require as much training or support time as ClearCase.
ClearCase does have excellent command-line tools. It also has a lot more features. But you can probably live without them.
Judging by his piss-poor auction site and other spam-based referral farms, I'd say "certainly not enough".
I've always found CVS to be more trouble than it's worth. I do small-time development with Mac OS X (previously Project Builder, now Xcode) and like the *idea* behind CVS. But the articles/tutorials I've read are either how to install (which I have) and just go over the commands, or they're geared toward the expert. I haven't found much info on conceptual/fundamental questions, like on integrating with IDEs, for instance "do I check the entire development tree into CVS, or just the text files?" If it's just the text files, that seems like a lot of work. "How do I put my web site HTML files into a repository and still have the web server be able to access it?" Overview stuff like that.
My current way of version control is the old way of just zipping up each release!
is using debian apt-get
As someone who's been desperately trying to get a grasp on some advanced CVS concepts lately, especially vendor tags and tracking of third party sources, I'm a little disappointed at the slow start the book gets off to; it feels just a bit belabored reading another introduction to the basics, but I'm glad to hear there's good stuff further on. Guess I'll get back to it.
"No more rhymes now, I mean it!" "Anybody want a peanut?"
who uses cvs anymore? *giggle*
By the way, backports.org has a wonderful woody backport of subversion.
were you expecting to see a sig here? perhaps you'd rather see the inside of an ambulance!
Subversion still seems to have a few serious issues, including that rename is not atomic.
Also, CVS, Subversion and Arch all require unique UNIX user accounts to access their repository - this sucks from an administrative and security point of view. I just want contributors to read from and commit to the repo - not have UNIX access of any kind. Is there a free RCS that just runs as a server and does not require monkeying around with the box's /etc/passwd file?
Aegis, GNU Arch (my personal favorite), Subversion, BitKeeper... all of these work around CVS's worst failings. What's unfortunate is how many people have had their expectations of what a revision control system should do set far too low by CVS.
A few examples of features one should expect of a modern revision control system:
Bitkeeper? I've seen a few good OSS projects use this and read some good things about it. Anyone, anyone? Bueller??
...pragmatic versioning. I have no idea what it's about. If it's another shortcut that makes programmers more productive at the expense of customers and implementers, I'm against it.
I can't really dump CVS until there is support for the major IDEs (Including EMACS!).
Interestingly, Subversion support for Eclipse and Netbeans is available.
-- ac at work
Open Source Development with CVS by Karl Fogel is a great online CVS manual and reference. I use it all the time.
JP
I concur with this. I highly recommend Code Complete for developers of all experience levels. It provides a nice basis for best practices, with explanations as to why the recommendations are made etc. Very useful information in there! I've had countless coworkers and developer acquaintances read it.
Please consider subversion. It rocks, and the project is managed in a professional manner.
... even when you don't include price as a factor. There are a lot of favorable comments on the svn mailing lists from former Perforce users.
Actually, CVS stands for "CVS Versioning System"
I've only met one person who used CVS and he used it to version control his mbox files! Obviously, CVS can be used on any text file but is it really useful for non-programmers?
--
"Most people would sooner die than think; in fact, they do so."
Bertrand Russell.
Does Subversion really handle the repeated merge problem now? I have heard that Arch does do this and I don't really know about bitkeeper. I'd say that this is my personal biggest beef with CVS (aside from its ridiculously inefficient storage scheme).
Last time I checked, repeated merge was a post-1.0 issue, but for me, it's the only reason not to move to Subversion.
Look up commitinfo in the CVS manual. It does both of these things.
He may in fact speak "some" truth, but something about developing your own filesystem (forced kernel integration) just so you can back out of mistakes rubs me the wrong way. Why not integrate into pre-established file systems that already do snapshotting? Such as Network Appliance, which already does it better and faster. Any time you need to buy disk storage and a separate server to run a revision control system, you end up painting yourself into a corner.
Does Subversion really handle the repeated merge problem now?
Hmm -- I don't know for sure on Subversion; perhaps someone else here will comment. I'm positive that Arch does (it's what I use personally), and pretty sure that BitKeeper does.
Just curious -- any particular reason you're not considering Arch?
I distinctly recall commitinfo not being useful for this in actual practice. It was a while ago, so I'm not sure why -- perhaps it was running on the client rather than the server? (Of course, what we *really* need is a 3rd machine set up as the canonical dev environment running the tests, neither the client nor the revision control server -- something which tla-pqm makes trivial).
I can ask the IT lead why that was, if you're really curious; his memory's better than mine.
I *do* like the pragmatic guys, but why on earth are they going to introduce newbies to CVS when Subversion is out Q1 next year? It's like 100000 times better and simpler in EVERY respect...
I develop dozens of projects concurrently. Keeping all the development details straight can be difficult, particularly when reapproaching a project that has been untouched for several months. The ability to back out non-obvious design mistakes, start speculative development branches, and distribute projects across multiple machines, depending on where I am at any point in time, has made single version control a necessity for me.
Starting a project in CVS is simple.
Create a directory. Maybe add a descriptive text file or two. Run cvs import from within the directory. You'll then need to do a checkout, and you're ready to begin adding makefiles, source code, autoconf delights/madness. For what it's worth, I always rename the original directory before the checkout in case CVS has a glitch. Older CVS balked at overwriting the files; maybe that's still the case.
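The steps above can be sketched roughly as follows. This is only an illustration under assumptions: myproj is a placeholder project name, vendor and start are placeholder vendor and release tags for cvs import, and CVSROOT is assumed to point at an existing repository. The run helper prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run helper: print each command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# Create a project directory, import it, then check out a working copy.
# $1 = project name (hypothetical).
start_project() {
    proj="$1"
    run mkdir "$proj"
    (
        run cd "$proj" || exit 1
        # ... add a README or other descriptive files here ...
        run cvs import -m "Initial import" "$proj" vendor start
    )
    # Keep the original directory around in case the checkout goes wrong.
    run mv "$proj" "$proj.orig"
    run cvs checkout "$proj"
}

start_project myproj
```

After the checkout, new source files go in with cvs add followed by cvs commit.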
The last reason that I use source control even for projects on which I am the sole developer is that I have, on occasion, deleted critical files by mistake and had to reimplement entire classes/modules under duress. It's a rare event, but with source control, I'm less stressed over the possibility of screwing up.
-Hope
So.. does this book explain vendor branches *well*?
:-)
Does it explain why files stay on the vendor branch until you change them? (Which means if you change a couple files, they are on your main line, but the rest stay on the vendor branch).
That really bugs me (i.e., shouldn't the vendor branch be tagged with only vendor's version numbers, and your main line be only tagged with YOUR tags, instead of mixed on both branches?)
I found out that "cvs admin -b" will move the vendor code to the main line, so I always run that command after importing vendor branches. But it really doesn't make sense.
It's much more logical to do vendor branching in Subversion (even though you are technically doing them "by hand", like all tagging and branching in svn).
Anybody know what I'm talking about, or is this like "super-advanced CVS for anal pricks"? Maybe.
Anyway I hope this book is good. I found the O'Reilly book to be awful. I love their Ruby book.
or
Generally, you may wish to start with a simple project and check it in before beginning your work in earnest. The fewer files the better, and if you've already run configure and have hundreds of non-source build files in the directory already, it can be time consuming to remove them, despite helpful make targets. I try to start with as few files as possible, and add all my source files explicitly.
From here, one merely needs to checkout the project, add, and commit source files.
Obviously, a good bit of the first part could be scripted... I don't bother since I find project setup somewhat zen anyway. I enjoy the ceremony of it, and it's not a daily task anyway.
-Hope
Stop modding this spammer up (notice all the referral links as 'ccats-20')
Life isn't like a box of chocolates. It's more like a jar of jalapenos. What you do today, might burn your ass tomorrow.
It's pretty amazing that no one has mentioned TortoiseCVS yet. If you are using CVS and are stuck on the Windows platform, then TortoiseCVS is a godsend.
Corporate Gadfly
Jonathan Archer: the most beaten up Enterprise captain in Star Trek history
Is the 'tla' binary all you need?
I thought it shelled out to tar, diff, rcs, apache and other processes to do the "real" work.
Not very pragmatic if I can't buy it at Amazon. I don't live in the US and I don't have a credit card.
You can also set a flag (-kb I think) so that you can version control non-text files, and if you are using a GUI like WinCVS, a great front-end to CVS, then it will usually handle that for you automatically.
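For what it's worth, -kb is the flag the CVS manual gives for binary files. A minimal sketch of the two usual cases, with made-up file names; the run helper only prints the commands unless RUN=1 is set, and a real run has to happen inside a checked-out working copy:

```shell
#!/bin/sh
# Dry-run helper: print each command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# Add new files as binary so CVS skips keyword expansion and line-ending conversion.
add_binary() {
    run cvs add -kb "$@"
}

# Repair files that were already added as text: change the stored mode,
# then refresh the working copy so it picks up the new mode.
fix_binary() {
    run cvs admin -kb "$@"
    run cvs update -A "$@"
}

add_binary logo.png       # hypothetical new file
fix_binary old-icon.gif   # hypothetical file added earlier without -kb
```

After fix_binary you would normally copy in a known-good version of the file and commit it, since the copy stored as text may already have been mangled.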
Arch avoids the whole issue by never rewriting or removing files which have been added to the repository
I don't think this can be emphasized enough. The most important thing a revision control system can do for me is guarantee the safety of my code (as it's my work product and the most valuable thing I've got). Knowing that the history of my project is accurate because it is never modified (by the arch tools, anyway) is very important to me.
-- The world is watching America, and America is watching TV.
commitinfo runs on the server, the CVSROOT directory is not normally even checked out to the clients. And ssh (or less secure equivalents) make it easy to use a dedicated test machine.
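To make that concrete, here is a rough sketch of a commitinfo hook. A line in CVSROOT/commitinfo pairs a regular expression for the repository directory with a program, and a non-zero exit status aborts the commit. The script name check-commit, the ^myproj pattern and the conflict-marker check are all invented for illustration, and the sketch assumes the files being committed are readable from the directory the hook runs in.

```shell
#!/bin/sh
# Hypothetical hook, installed with a CVSROOT/commitinfo line such as:
#     ^myproj   /usr/local/bin/check-commit
# CVS runs it with the repository directory as $1 and the file names after it.

# Return non-zero if any named file still contains a merge conflict marker.
check_files() {
    status=0
    for f in "$@"; do
        if [ -f "$f" ] && grep -q '^<<<<<<< ' "$f"; then
            echo "conflict marker left in $f" >&2
            status=1
        fi
    done
    return $status
}

# When invoked as a hook: drop the repository path, then check the files.
if [ $# -gt 0 ]; then
    shift
    check_files "$@" || exit 1
fi
```

The same slot could instead ssh to a dedicated test machine and run a build there, which is the setup described above.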
One can set up CVS so multiple client accounts are mapped onto a single or a few Unix accounts. There are plenty of interesting files in CVSROOT ;-)
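The file in question, as far as I know, is CVSROOT/passwd, which applies to pserver access: each line holds a CVS login, a crypt()-style password hash, and optionally the system account to run as. A tiny sketch that prints lines in that format; the logins, the shared cvsuser account and the hash placeholder are made up:

```shell
#!/bin/sh
# Print one CVSROOT/passwd line: cvs-login:crypted-password:system-account
passwd_line() {
    printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# Two CVS logins that both run as the single system account "cvsuser".
# '<crypt-hash>' stands in for a real crypt()-style password hash.
passwd_line alice '<crypt-hash>' cvsuser
passwd_line bob   '<crypt-hash>' cvsuser
```

With this arrangement only cvsuser needs an entry in /etc/passwd; alice and bob exist only as far as CVS is concerned.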
I've always wanted to come up with a scheme that would take the entire /etc directory of a Linux box, commit it to CVS, then be able to similarly commit a number of similar machines as branches and keep all of them up to date by committing after each change. That would make it easy to see the configuration differences between any two machines and/or two points in time. However, I've never come up with a non-destructive way to get this started with existing machines. Is there one?
Aegis has been around for about 10 years - in that time people could have recognized its great features, design and implementation. Why didn't they?
I suspect that most people tend to prefer more primitive solutions for the same reasons they stick to Windows. They can get started quickly, but they don't really care about upcoming problems.
When I think about the huge popularity of Windows and CVS, I begin to be disappointed in humankind.
Less is more !
My god, how can you recommend Arch when there isn't even a production-quality release yet?
All I see is 'arch-pre1' 'arch-pre9', etc at their website:
http://ftp.gnu.org/gnu/gnu-arch/
Thanks for the detailed description, in particular for your clear description for what you might want to do with a distributed version control... I've read a number of pro and anti bitkeeper flames, but none of them have made that clear.
It's like 100000 times better and simpler in EVERY respect...
Except in implementation..
A versioned file system layered on top of Berkeley DB? EXCUSE ME????? Berkeley DB is key/value database .. how the heck do you get TREES out of it?
The great thing about svn is the simple CVS-like interface. That's it. I predict it will collapse under its own weight by 2006.
Unfortunately I can't really come up with any better alternatives. I like the underlying implementation of Arch but the interface is overly complex. The other choices are commercial and/or from companies with dicks at the helm.
Honestly, CVS will be my weapon of choice for longer than I'd like.
Yes, but CVS is unfortunately "good enough" (and just barely).. all those other systems have their own shortcomings:
arch: confusing and non-intuitive interface .. many things take multiple intermediate steps instead of one .. versions and revisions have an odd syntax with "--" in between the components. Directories/URLs with "{}" in the name are problematic. Wonderfully pragmatic storage mechanism though, the best of "hackable" CVS repositories and atomic databases.
subversion: grotesquely bloated opaque "versioned filesystem" built on top of Berkeley DB... not really based on changesets, it can't remember what you've merged.. has a nice CVS-like interface though. I personally believe that versioned filesystem business will become a huge albatross once they run into the practical problems of computing and storing deltas *efficiently*, and handling the changeset concept properly.
bitkeeper: proprietary, issues on Linux kernel list leave a bad taste in my mouth (I don't care about "programmers paying their mortgage", I care about getting my work done). Can eat up RAM.
perforce: proprietary, though actually I kind of like this one, but too expensive for me.
I'm waiting for the perfect version control system, none of these are it. So with a heavy heart I continue to use CVS. For my purposes it does the job. I'd love to see a CVS-like interface on Arch though, or at least a *simpler* interface. The help screen is 182 lines long!! for CVS it's like 40.
Doing an update in a tree where one file has changed is as simple as downloading the single changeset -- unlike CVS, where the status of every single file needs to be queried to find the one that's altered. In a 30,000-file tree, this is a major performance improvement.
Bah -- neither CVS nor ClearCase can really scale.
For an example of my "CVS can't scale" assertion in action, just look at the pains SourceForge has had to go through, and contrast that with a system like Arch (where the repositories can be served over a completely unmodified web server using standard HTTP load-balancers, multi-layer distributed caching and all the other techniques used by high-volume web sites serving static content).
Arch doesn't need GUI add-on tools for branching -- indeed, the ease and power of its branching and merging operators is one of the things that best recommends it.
CVS may be adequate version control for those with few needs -- but by no means is it "great". See my other post for more on that.
I'm not going to defend the others, because I don't particularly like them either, but let me throw in a word or two about Arch here:
The interface to Arch is only unintuitive until one's played with it enough, or crawled inside Tom's head for a bit. Given this effort, Arch just Makes Sense -- quite a lot of the interface reflects the underlying implementation concepts and such quite cleanly.
Further, the "extra steps" argument is less true than it used to be. For instance, tla 1.1's archive-setup command makes into a single step what used to be a 3-step process (creating a new category, branch and version in the archive for a fresh project or branch), and star-merge has a syntax that's vastly simplified.
Tom has played around with the idea of writing "itla" -- an interactive client for TLA -- or overarch (a tool for writing project-specific arch scripts). In the meantime, though, there are a fair number of 3rd-party frontends available -- I don't use any of them, but Samium Gromoff has been actively looking for user feedback on "tlator", so it's probably a good place to start.
re archive-setup, let me add that init-tree can take a previously seven-step process:
create-archive
create-category
create-branch
create-version
init-tree
make-log
import
down to three:
create-archive
init-tree
import -S -s "initial import of project foo"
I hope you don't mind, I've saved your quote about trees. It cracks me up, along with other Subversion developers.
If you want a serious answer: we have several db tables. The values in the tables are lisp-like "s-expressions" that hold real data and keys to other tables. It's a way of getting "columns" when you don't have a real SQL backend. And by the way, it works extremely well.
(We do have future plans, however, to give the repository a real SQL backend someday.)
A number of posters ask why one would even want to use CVS considering its known faults. Although I'm personally partial to Subversion, choosing CVS doesn't seem unreasonable. CVS is no less useful today than it was (say) two years ago, before Arch or Subversion or any of the other new kids on the block were ready. If you want something that does the job and whose problems are known, CVS is not a bad choice.
But on to the real reason for my post:
Some of the posts here make simply wrong statements about Subversion. Below are corrections.
1. Subversion does not require a Unix user account per vc user. Furthermore, this is true not just with the WebDAV (http://) access method, but also with the svnserve (svn://) method. Some posts said otherwise; not sure why.
2. Subversion does *not* require Apache, nor does it require you to use the WebDAV protocol for repository access. Apache/WebDAV is one of two entirely independent network access methods. The other is a custom protocol (think of it like CVS pserver) using a custom server (svnserve) and its own URL space, "svn://".
3. Subversion is not in Alpha anymore, it is in Beta. This was a recent transition, so it's understandable that people wouldn't have known about it.
4. Someone said you can't make client-side graphical reports about revisions and differences, because the client doesn't have access to the right information. I think this person must have read and misunderstood a highly technical mailing list thread. The client does have access to the necessary information already (see 'svn log -v' for starters).
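As a rough illustration of points 2 and 4, here is how the two access methods and the log query look from the command line. The host svn.example.com, the repository paths and the /srv/svn root are placeholders, not real servers; the run helper prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run helper: print each command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# Start the standalone server as a daemon, serving repositories under $1.
serve_repo() {
    run svnserve -d -r "$1"
}

# Check out over a given scheme: svn (svnserve) or http (Apache/WebDAV).
# $1 = scheme, $2 = host, $3 = path on the server.
checkout_via() {
    run svn checkout "$1://$2/$3"
}

# Show history with the paths changed in each revision.
verbose_log() {
    run svn log -v "$1"
}

serve_repo /srv/svn
checkout_via svn svn.example.com project
checkout_via http svn.example.com repos/project
verbose_log svn://svn.example.com/project
```

Neither checkout needs a Unix login on the server; the scheme only selects which server process answers the request.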
People with further questions about Subversion should please come to users@subversion.tigris.org, or irc.freenode.net, channel #svn. Hope to see you there!
By the way, I have not used Arch in a long time, so can't comment on the differences between it and Subversion.
http://www.red-bean.com/kfogel