Pragmatic Version Control Using Subversion
Chapters on repository layouts, integrating third party code (into your source tree and products) and conflict resolution all help raise this book from just being a single application tutorial into a best practices guide that you'll come back to long after you've gained confidence with Subversion itself.
Pragmatic Version Control Using Subversion is very similar to Pragmatic Version Control Using CVS, but this is in no way a criticism! The previous book was the best introduction to CVS that I've read, and this related volume manages to retain the winning formula while adding useful sections, such as CVS hints, to help people migrating across.
While the book has a broad appeal, the ideal audience are those developers who know they should be doing version control but have heard it's too complex, have been burnt by previous mistakes, or just don't know where to start. Seasoned developers will also find this book useful, but in different ways. For instance, using it as an easy to scan and follow reference, handing it down to less experienced colleagues, or even just for quickly bringing themselves up to speed when moving from CVS to Subversion.
Considering the book's slim size (or quick download, if you purchase the PDF version) it packs in surprisingly wide coverage of the important topics. The first two chapters provide an overview and sell the benefits of using a version control system. They cover what should and shouldn't be under version control, and clearly explain the terminology required to understand both the technology in general and the book's later chapters.
Chapters 3, 4, and 5 get you working from your own Subversion repository and introduce the essential commands. They show how to create, add and import your projects in a clear, easy-to-understand way. Once you have some files to work with, they take you through a well-paced tour of the simple operations; checking out, committing and accessing the files in different ways.
Following these, Chapter 6, "Common Subversion Commands," shows some of the more complex but essential tasks you'll want to perform in Subversion; setting properties, looking at changes and their associated history and how to handle merge conflicts. These are all presented in short sections that provide enough information to be useful on a day-to- day basis while not leaving beginners bogged down in the minutiae.
Jumping ahead slightly, we leave the part of the book that everybody using Subversion should read and move onto the more powerful, and complex, functionality such as "Using Tags and Branches" (Chapter 8) and the more abstract topics of "Organising Your Repository" (Chapter 7) and dealing with "Third Party Code" (Chapter 10).
Chapter 8 stands alone in the second half of the book due to its coverage of a very technical subject; chapters 7, 9 and 10 are more abstract. Tagging and branching are one of the more notorious areas of version control, but this book -- much like the CVS book before it -- manages to explain not only when and how to use both tags and branches, but also provides enough guidance to allow the reader to 'smell' when something's wrong and adding them would make it worse.
Chapters 7, 9 and 10 logically combine to cover the issues surrounding setting up your own project, including the project's structure, the integration of third party code, external projects, and binary libraries such as Nunit or Java mock libraries. Considering the amount of maintenance coding (as opposed to new projects) that happens in the world, these chapters might not be immediately useful to a fair chunk of the readership. I don't think they should be removed, though -- better to leave them in and show best practices and experience-driven common sense than remove them and let people make the same mistakes over and over again.
It's worth noting that the appendices are a lot more useful than the filler material typically found lurking at the back of a book -- they cover a couple of topics that don't fit elsewhere and help round out both the book's coverage and appeal.
Appendix A is more relevant to system administrators than developers. It shows how to install Subversion on the server. It then gives a brief introduction to configuring, serving (using either the native svnserve, svn over SSH or via Apache) and adding basic security to your repositories. It finishes off with a short, but useful, digression into backing up your hard work.
This appendix provides a valuable, quick guide to getting a Subversion install in place. It's a good starting point for anyone who needs to actually run and maintain a Subversion server.
The remaining appendices vary in usefulness. Appendix B is a concise introduction to migrating a CVS repository to Subversion; this is something you either need desperately or won't care about. Most of Appendix C shows how to perform common tasks using the TortoiseSVN extension for Windows Explorer; this won't appeal to the Unix/Linux crowd but might help sway Windows developers away from the hell that is Visual Source Safe.
In short, whether you're new to version control in general or just Subversion itself, this book is highly recommended. Clear, concise and crammed full of useful, important and dare I say, pragmatic, advice and information. An excellent book in its own right and a worthy addition to the Starter Kit Series.
Dean Wilson is a System Administrator at Outcome Technologies. His personal site is unixdaemon.net. You can purchase Pragmatic Version Control Using Subversion from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page
The way I like to put it to point out why ClearCase and others of its ilk are such a beast to work with compared to CVS, is that broken merges are fixed BY the people that cause them, BEFORE they screw up the repository.
With the "normal" source control systems that use the reserve/checkin style, a programmer may work on several files - perhaps they even work on them unreserved to be nice to others (as is becoming policy here).
You still have the issue of "The Merge" That is, the programmer doing the development is nt getting changes made to those files while he is working, and others are not seing his work.
So when it comes time to check in all the files, a prigrammer checks most of them in - but then possibly runs into issues with merges in the last few files. Lots of commercial merge tools seem poorly designed to help the average programmer deal with issues, sometimes they simply automatically hose the merge without the programmer even knowing.
Using a CVS system, the programmer is able to keep in sync all through development by constantly updating files. That means that issues caused by him will also be resolved by him on the fly, instead of someone else discovering a merge went wrong later. It makes the day of checkin no longer something to fear, since you are already synchronized and can be reasonably sure the system works BEFORE you do the checkin, instead of checking after and possibly having a broken build.
Sure that checkin might cause someone else issues, but they will tend to be isolated to a developer and not affect the whole team at once.
So basically CVS style operation encourages programmers to keep in sync with what other people are doing - I honestly believe that if it did not work this way and people were forced to deal with reserved files, the whole OpenSource movement would be a fraction of its current size and success.
Yes I know ClearCase can kind of do something like that, but not very well and I have seen clear case totally bungle automatic merges before.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
It's much better to have repository-wide revision numbers, because it means that a revision number identifies the state of an entire project at a specific point in time, not just the state of specific file.
This doesn't make it any harder to track changes to individual files. When you run "svn log" on a file or directory, you only see the log messages/revisions listed where that file or directory was changed.
It's really quite an elegant system.
This space intentionally left blank.
When I was initially getting involved with Subversion I found that most of its features reminded me a lot of Perforce. The repository-wide revision numbers, database backend, and general "feel" of Subversion is very similar to Perforce.
I think Perforce is better than Subversion if you're doing a lot of branching -- the merge point tracking that Perforce performs is really well implemented and saves you from a lot of manual tracking. Overall though if you're looking for a free alternative to Perforce I'd highly recommend Subversion
Well, assuming that you're declaring computer science superior over the pragmatic approach, I've started wondering about that recently.
Let me start off by saying that I'm firmly in the scientific camp, intending to start a PhD in CS this (northern) summer. It seems that an awful lot of popular things in IT are despised by computer scientists. Linux, as a monolithic kernel, is a famous example, as is C++, and I recently saw something about Perl being evil as well.
Now, these scientists have good reasons to call these things ugly, but people still use them. That means that either people are stupid, or the computer scientists are missing something. I think that it's mostly the latter.
In my software engineering course I was taught that the first and foremost thing you do in a project is gather requirements. It seems to me that computer scientists need to get out and ask people who work in IT what they actually expect from their kernels, languages, and development systems. Then they can try and create theories of how it all works or should work to fulfill those wishes, and use those theories to improve those real-world systems.
The alternative, sitting in your ivory tower inventing things that you think are pretty and everyone else thinks are useless, doesn't seem to be working too well.
It's interesting to note that while many things are stated by computer scientsts as evil, they are also practical. There are reasons why these languages are 'efficient' in a business sense: Linux is a widely used and widely known free *nix kernel. C++ leverages the existing C knowledge base; Perl is wonderful when you're trying to analyze ad-hoc log files. Personally, I haven't learned Python or Ruby, because the amount of time I spend hacking perl (maybe 10-15 hours a year) isn't enough to ever justify learning a programming language to replace it. [note I'm not a programmer; thus don't rely on programming knowledge to hold a job; thus my self-education time is better spent elsewhere; otherwise this might be a useful broadening of my horizons]
What you have to realize is that something won't be (rapidly) adopted unless it is significantly better than a previous product (or has a near-zero learning curve). For example, one of the members of the IEEE that sits on some of the Ethernet standards committees mentioned to me that one of the reasons Ethernet jumps by magnitudes of 10X is to ensure the next generation provides significant benefit over the previous one (of course, there are other reasons, too).
The point is, you can find a significant number of languages in use, but few that have been displaced. Those that were widely adopted (Fortran, Cobol, C, Perl, Java) all have specialties (scientific, business, fast/portable, efficient scripting, the web) that significantly differentiate them from previous languages. The dynamics of Python and Ruby is actually an interesting case study of where Perl is being very slowly knocked out of some of its domains by a similar, less "ugly" language (or such is my perception [both Perl being knocked out and the others being less 'ugly']). However, that transition is, I would argue, a bit glacial.
Also, to say that computer science (and studying algorithms) has very little effect is not all that accurate. Developments in rendering 2D and 3D systems, graph theory, and countless other areas are the result of those individuals. Even my understanding of how to organize code to ensure my OO code is a directed acyclic graph is a result of those "CS" folks and having that knowledge filter down. It doesn't necessarily affect the languages, but it changes how we use them. More recently, we have seen discussion of aspects, though (personally) I still haven't figured those out [I actually think it's because I work in problem domains where aspects are not efficient].
In addition, continually pointing out the weaknesses of the existing systems helps those that will design the next set of systems avoid the same mistakes. Even though that advocate may appear to be in an 'ivory tower' and ignoring what is going on in the 'real world', if a few architects listen and learn (and apply to the next generation of systems), then the computer scientist has served a purpose.
One other note. In respect to: "It seems to me that computer scientists need to get out and ask people who work in IT what they actually expect from their kernels, languages, and development systems.", try it sometime. Many times, people have NO IDEA what they are looking for. That's the whole basis of prototyping - to get something in front of the person and get feedback. Many other times, you will get completely contradictory answers. Detailed-oriented people will want a revision control system, code tags, and everything javadoc'ed. Others will be hacking out code that hopefully self-documents, but only has a comment to identify the license and copyright of the code. The challenge when dealing with people is that you have wildly different learning mechanisms, personality types (e.g. Myers-Briggs), brain operation (e.g. Hermann), and Communication Styles (e.g. DiSC). These differences lead to d
Other CS topics are dependent on physical devices with complex emergent behavior (computer systems and their constituent parts) and thus don't admit axiomatic proof; but are still "hard" in that repeatable controlled experiments can be conducted to generate solid empirical evidence.
However, many parts of "Computer Science" are dependent on economics and human behavior. It is at this junction that emotions are thrown into the mix and religion ensues.
"Hard" sciences do not have dependencies on soft sciences; the results of physics research do not depend on psychology or economics. Given this, I consider CS to be a soft science.
One, a vague sense of security, that if a file gets corrupted, I can at least make an attempt at manual repair.
The problem is that sense of security is very misplaced. CVS doesn't do any integrity checking. So you can easily have corruption problems and not know it until it is way too late. And if you add a binary that you haven't configured CVS for, well, he's dead, Jim. A scrambled text file isn't going to be any more recoverable than a scrambled binary.
You might not like binary formats. But I don't see how you can avoid them if you are really going to handle binary data well. Otherwise you are ducking the issue.
As far as using grep on a repository, yeah, I have done that too. It's ok if you have small projects. But for larger projects that is not a useful benefit. My current employer has an 11GB SourceUnsafe Repo that has to be a disaster in the making. And of course a $0 budget to move it to something else.
Subversion isn't the cureall either. It's got some bad design in it that has got me holding back from recommending it. What I want is stuff like keywork expansion in Unicode. Merge tracking. etc.
But at least it isn't a one man project like darcs. That would never fly for any sane corporation.