Pragmatic Version Control Using Subversion
Chapters on repository layouts, integrating third party code (into your source tree and products) and conflict resolution all help raise this book from just being a single application tutorial into a best practices guide that you'll come back to long after you've gained confidence with Subversion itself.
Pragmatic Version Control Using Subversion is very similar to Pragmatic Version Control Using CVS, but this is in no way a criticism! The previous book was the best introduction to CVS that I've read, and this related volume manages to retain the winning formula while adding useful sections, such as CVS hints, to help people migrating across.
While the book has a broad appeal, the ideal audience is those developers who know they should be doing version control but have heard it's too complex, have been burnt by previous mistakes, or just don't know where to start. Seasoned developers will also find this book useful, but in different ways: as an easy to scan and follow reference, as something to hand down to less experienced colleagues, or even just for quickly bringing themselves up to speed when moving from CVS to Subversion.
Considering the book's slim size (or quick download, if you purchase the PDF version) it packs in surprisingly wide coverage of the important topics. The first two chapters provide an overview and sell the benefits of using a version control system. They cover what should and shouldn't be under version control, and clearly explain the terminology required to understand both the technology in general and the book's later chapters.
Chapters 3, 4, and 5 get you working from your own Subversion repository and introduce the essential commands. They show how to create, add and import your projects in a clear, easy-to-understand way. Once you have some files to work with, they take you through a well-paced tour of the simple operations: checking out, committing and accessing the files in different ways.
Following these, Chapter 6, "Common Subversion Commands," shows some of the more complex but essential tasks you'll want to perform in Subversion: setting properties, looking at changes and their associated history, and handling merge conflicts. These are all presented in short sections that provide enough information to be useful on a day-to-day basis while not leaving beginners bogged down in the minutiae.
Jumping ahead slightly, we leave the part of the book that everybody using Subversion should read and move onto the more powerful, and complex, functionality such as "Using Tags and Branches" (Chapter 8) and the more abstract topics of "Organising Your Repository" (Chapter 7) and dealing with "Third Party Code" (Chapter 10).
Chapter 8 stands alone in the second half of the book due to its coverage of a very technical subject; chapters 7, 9 and 10 are more abstract. Tagging and branching are among the more notorious areas of version control, but this book -- much like the CVS book before it -- not only explains when and how to use both tags and branches, but also provides enough guidance to allow the reader to 'smell' when something's wrong and adding them would make it worse.
Chapters 7, 9 and 10 logically combine to cover the issues surrounding setting up your own project, including the project's structure, the integration of third party code, external projects, and binary libraries such as NUnit or Java mock libraries. Considering the amount of maintenance coding (as opposed to new projects) that happens in the world, these chapters might not be immediately useful to a fair chunk of the readership. I don't think they should be removed, though -- better to leave them in and show best practices and experience-driven common sense than remove them and let people make the same mistakes over and over again.
It's worth noting that the appendices are a lot more useful than the filler material typically found lurking at the back of a book -- they cover a couple of topics that don't fit elsewhere and help round out both the book's coverage and appeal.
Appendix A is more relevant to system administrators than developers. It shows how to install Subversion on the server. It then gives a brief introduction to configuring, serving (using either the native svnserve, svn over SSH or via Apache) and adding basic security to your repositories. It finishes off with a short, but useful, digression into backing up your hard work.
This appendix provides a valuable, quick guide to getting a Subversion install in place. It's a good starting point for anyone who needs to actually run and maintain a Subversion server.
The remaining appendices vary in usefulness. Appendix B is a concise introduction to migrating a CVS repository to Subversion; this is something you either need desperately or won't care about. Most of Appendix C shows how to perform common tasks using the TortoiseSVN extension for Windows Explorer; this won't appeal to the Unix/Linux crowd but might help sway Windows developers away from the hell that is Visual SourceSafe.
In short, whether you're new to version control in general or just Subversion itself, this book is highly recommended. Clear, concise and crammed full of useful, important and dare I say, pragmatic, advice and information. An excellent book in its own right and a worthy addition to the Starter Kit Series.
Dean Wilson is a System Administrator at Outcome Technologies. His personal site is unixdaemon.net. You can purchase Pragmatic Version Control Using Subversion from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page
For those looking for Subversion documentation, there's also an excellent Subversion book, with electronic copy available for free, at http://svnbook.red-bean.com/
My only problem with Microsoft is the severity of bugs in their software.
...Subversion support is one of the most requested features on RubyForge.
Is there a StatCVS-type reporting tool for Subversion? I suppose StatCVS could be modified to support Subversion... there's been some discussion of it on their mailing list...
The Army reading list
Pragmatic Version Control Using Subversion
as
Pragmatic Version Control Using subterfuge ?
Maybe I've just been doing too much sneaky stuff today...
The Subversion Book seems to have most of what you need to know and it's free as in speech and beer.
From the review it does seem to have a couple of chapters about general project organization that aren't in TSB. Otherwise the list of topics seems to be right out of the O'Reilly book.
they've begun to gain a reputation for writing, editing and finding book authors
Good for them, how do you edit a book author though? Remove a finger or two if they don't send you their rough draft?
Executive summary:
By the way, the GCC team is starting to make experiments with svn, and it looks like they might switch in 2 or 3 months.
So as a crusty old fart who hates changing tools just because they are new and cool, and am pretty much happy with CVS, what is it that subversion does better than CVS that should make me want to switch?
And this is a real question asked for the purposes of gaining information, not just a snide "here is a nickel go buy a real computer" kind of remark.
The book sounds very interesting, but the way it is described makes it seem like it only goes through the basics. What if you want more in-depth reading on tagging and other simple necessities that you can't go without knowing well?
The book isn't even available yet (see the link from bn.com). One must presume the reviewer got an advance copy from the publisher, who may have given it away expecting a favorable review.
What reason do we have to believe that this review isn't complete astroturf? What is his relation that caused him to get an advance copy?
It may well be an honest review, but I'm inclined to be a lot more skeptical. Especially since there is no mention in the review of the fact that he got an advance copy.
Sample excerpts from the book are available in PDF format from the book website. You can also download the full Using Tags and Branches chapter from artima.com (free login required).
<gripe>Most tech books these days have a page on the publisher's website, and some offer a sample chapter for download. Book reviewers should include a direct link to this book page, and note what excerpts/chapters are available for preview, if any (and prevent people like me karma whoring).</gripe>
We are switching from PVCS to subversion. Besides being pretty crappy and expensive, PVCS uses the lock/modify/checkin paradigm. Every time I convert a PVCS user to Subversion they are scared because of the edit/conflict/merge idea. "OHMYGOD I have to lock my files or anything can happen". (Because I am not articulate) I have a hard time conveying the benefit of the CVS/subversion way.
My rationale is that if two people need to modify a file, the conflicts exist independent of the version control mechanism. It's just that locking serializes modifications, while merging has you modifying in parallel and then fixing it all at once, which is more efficient than making someone wait. Not to mention the idiots who lock everything and then go on vacation.
I usually just say 'try it and you will see that subversion is way easier to use and the rare conflicts are easy to merge'.
Any recommendations?
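One way to make the edit/merge argument concrete is a toy three-way merge. The sketch below is only an illustration, not Subversion's actual merge algorithm: `merge3` and the sample lines are made up for the example, and it assumes both people only edit lines in place (no inserted or deleted lines). Edits to different lines combine on their own; only a line that both people changed differently is reported as a conflict.

```python
def merge3(base, mine, theirs):
    """Line-wise three-way merge.

    Simplifying assumption (not how real tools work): all three versions
    have the same number of lines, i.e. edits only change lines in place.
    Returns (merged_lines, indices_of_conflicting_lines).
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "mine" changed this line
            merged.append(m)
        else:             # both changed it differently: a human must decide
            merged.append(m)
            conflicts.append(i)
    return merged, conflicts


base = ["int a = 1;", "int b = 2;", "int c = 3;"]
alice = ["int a = 10;", "int b = 2;", "int c = 3;"]  # edits line 0
bob = ["int a = 1;", "int b = 2;", "int c = 30;"]    # edits line 2

merged, conflicts = merge3(base, alice, bob)
print(merged, conflicts)
```

Here the two edits touch different lines, so nobody had to wait for a lock and nothing needs fixing by hand. A lock would only have helped in the case where both people changed the same line, and that is the case where they need to talk to each other anyway.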
- Atomic Commits
- Faster tagging / branching
- Natively client/server
- Directory structure versioning
- File rename versioning
- etc.
Anyone have success stories in moving from CVS to Subversion? Any caveats?
I've played with cvs, subversion, arch, darcs and I honestly think darcs is easiest to use and best of all of them I've played with.
The only thing I would change in darcs is the way it handles binary files. It can't apply patches to binary files, it has to save full copies of them. Not very conducive to projects with lots of binaries.
Subversion, on the other hand handles this better.
But I still like darcs better... its features are sweet.
Just install Subversion, configuration and maintenance is a breeze. Which makes me happy because I have to use it and administer it (or not in the case of svn:)
I love it so much, I am actually considering installing svn on my family's computer so they can keep track of their most beloved digital documents as well.
JsD
Yes, um, sir, we would like to switch to a new version control system. No, it's not because of a problem with SourceSafe. Actually we stopped using SourceSafe several years ago. Yes, sorry we didn't inform you. Yes, we realize that cvs is not sold by Microsoft. Yes we would like to switch now. The new tool? It's another open source tool. Actually, it's called "Subversion".
There are also a few rants by Greg Hudson and Tom Lord about changeset vs tree-history. Search google
The general idea is that in a given set of interrelated files, it does not make sense to think about revisions on the file level, as a change to one file is really a change to an entire project. Simply assigning revision numbers on a repository-wide basis simplifies the revision number system and does away with one bit of weirdness in CVS. It's a bit strange if you're coming from a system that works with per-file rather than per-repository versions, but it makes a lot of sense when you get used to it.
As to difficulty tracking the files -- no, it's not difficult. Finding the revisions associated with changing a file is easy, so tracking the changes is no more difficult than in CVS.
How can we continue to believe in a just universe and freedom to eat crackers if we have no ale?
I wonder if SourceForge will ever move from CVS over to subversion. Maybe they could set up a temp program to allow people to choose which one they use. The benefits subversion brings to OSS source control are really amazing.
But it is hard to say how well subversion would handle the load. I'm guessing sf.net has done a lot of tweaking to get CVS to handle the number of projects they currently have, and moving everyone over... or just supporting both, is greater than the effort they want to put in.
Maybe some day...
Its not what it is, its something else.
It depends on your goals. I prefer SVN's revisioning system (having used both CVS and CMVC, which use the same revisioning). The nice thing about it is if you extract revision n, you'll know all the files you extract from the repo are as the repo looked when revision n was made. This is non-trivial with CVS or CMVC's revisioning system unless you know the version number of all the files in the entire repo and extract those version numbers.
It's still easy to tell when individual files were changed using something like the 'svn log' command.
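A toy model can make both points visible. This is only a sketch of the idea of repository-wide revision numbers: the `ToyRepo` class, its methods, and the file names are invented for illustration, and it says nothing about how Subversion actually stores data. Each commit produces one new revision of the whole tree, checking out revision n gives the tree as it looked at that point, and a per-file log lists just the revisions where that file changed.

```python
class ToyRepo:
    """Toy model of repository-wide revision numbers (illustration only)."""

    def __init__(self):
        # Revision 0 is the empty tree; each later entry is a full snapshot.
        self.revisions = [{}]

    def commit(self, changes):
        """Apply a {path: content} dict as ONE new revision; return its number."""
        tree = dict(self.revisions[-1])
        tree.update(changes)
        self.revisions.append(tree)
        return len(self.revisions) - 1

    def checkout(self, rev):
        """Return the whole tree exactly as it looked at revision `rev`."""
        return dict(self.revisions[rev])

    def log(self, path):
        """Return the revisions in which `path` was added or changed."""
        return [
            r for r in range(1, len(self.revisions))
            if self.revisions[r].get(path) != self.revisions[r - 1].get(path)
        ]


repo = ToyRepo()
repo.commit({"schema.sql": "v1", "app.py": "v1"})  # revision 1
repo.commit({"app.py": "v2"})                      # revision 2
repo.commit({"schema.sql": "v2", "app.py": "v3"})  # revision 3

snapshot = repo.checkout(2)
schema_history = repo.log("schema.sql")
print(snapshot)        # the whole tree as of revision 2
print(schema_history)  # only the revisions that touched schema.sql
```

The single number 2 pins down the content of every file at once, while `log("schema.sql")` skips revision 2 because that commit did not touch the file.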
Oolite: Elite-like game. For Mac, Linux and Windows
The way I like to put it to point out why ClearCase and others of its ilk are such a beast to work with compared to CVS, is that broken merges are fixed BY the people that cause them, BEFORE they screw up the repository.
With the "normal" source control systems that use the reserve/checkin style, a programmer may work on several files - perhaps they even work on them unreserved to be nice to others (as is becoming policy here).
You still have the issue of "The Merge". That is, the programmer doing the development is not getting changes made to those files while he is working, and others are not seeing his work.
So when it comes time to check in all the files, a programmer checks most of them in - but then possibly runs into issues with merges in the last few files. Lots of commercial merge tools seem poorly designed to help the average programmer deal with issues; sometimes they simply automatically hose the merge without the programmer even knowing.
Using a CVS system, the programmer is able to keep in sync all through development by constantly updating files. That means that issues caused by him will also be resolved by him on the fly, instead of someone else discovering a merge went wrong later. It makes the day of checkin no longer something to fear, since you are already synchronized and can be reasonably sure the system works BEFORE you do the checkin, instead of checking after and possibly having a broken build.
Sure that checkin might cause someone else issues, but they will tend to be isolated to a developer and not affect the whole team at once.
So basically CVS style operation encourages programmers to keep in sync with what other people are doing - I honestly believe that if it did not work this way and people were forced to deal with reserved files, the whole OpenSource movement would be a fraction of its current size and success.
Yes, I know ClearCase can kind of do something like that, but not very well, and I have seen ClearCase totally bungle automatic merges before.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
I've been using Perforce for awhile for a personal project (their "trial version" is a perpetual 2-user free-as-in-beer license) and I have to admit, I'm hooked on the speed. CVS on the LAN at work is an order of magnitude SLOWER for edit/commit operations than Perforce on a 512K upstream DSL connection.
I've thought about moving to Subversion just so it would be cheaper if I ever had to scale my "personal project" up past two people. But honestly, I think Perforce is well worth the US$750/seat for the sheer speed it offers.
Anybody have any idea how SVN compares?
It's much better to have repository-wide revision numbers, because it means that a revision number identifies the state of an entire project at a specific point in time, not just the state of a specific file.
This doesn't make it any harder to track changes to individual files. When you run "svn log" on a file or directory, you only see the log messages/revisions listed where that file or directory was changed.
It's really quite an elegant system.
This space intentionally left blank.
When I was initially getting involved with Subversion I found that most of its features reminded me a lot of Perforce. The repository-wide revision numbers, database backend, and general "feel" of Subversion are very similar to Perforce.
I think Perforce is better than Subversion if you're doing a lot of branching -- the merge point tracking that Perforce performs is really well implemented and saves you from a lot of manual tracking. Overall though if you're looking for a free alternative to Perforce I'd highly recommend Subversion
Subversion isn't the only choice for people looking for relief from SourceUnsafe. CVSNT is an evolved and mature CVS that you should look at too.
http://www.cvsnt.com/cvspro/ for the server
and http://www.wincvs.org/ for a gui client
Mergepoints in cvsnt are very cool and wincvs is a powerful client. Since cvsnt runs on Windows and many unixes, you also have your choice of platform as well.
cvsnt is a project that has been around over five years (at my reckoning) and has a good following. Plus you can get commercial support for it from March... what more can you ask for from Free software?
Do not spread "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" over the internet, thank you.
Well, assuming that you're declaring computer science superior over the pragmatic approach, I've started wondering about that recently.
Let me start off by saying that I'm firmly in the scientific camp, intending to start a PhD in CS this (northern) summer. It seems that an awful lot of popular things in IT are despised by computer scientists. Linux, as a monolithic kernel, is a famous example, as is C++, and I recently saw something about Perl being evil as well.
Now, these scientists have good reasons to call these things ugly, but people still use them. That means that either people are stupid, or the computer scientists are missing something. I think that it's mostly the latter.
In my software engineering course I was taught that the first and foremost thing you do in a project is gather requirements. It seems to me that computer scientists need to get out and ask people who work in IT what they actually expect from their kernels, languages, and development systems. Then they can try and create theories of how it all works or should work to fulfill those wishes, and use those theories to improve those real-world systems.
The alternative, sitting in your ivory tower inventing things that you think are pretty and everyone else thinks are useless, doesn't seem to be working too well.
I prefer to use intimidation:
"Touch my code and I will beat you silly with a rolled copy of the original spec docs."
Seems to work for me.
-- TMK
That's fine, and if it's working for you you don't need to change it, but personally I find it hard to remember that in version 1.15 I added a new table and that branched version 1.2.4.15 corresponds to the current production code.
In Subversion you'd use symbolically named tags, which are copies of directory trees, in order to remember that kind of thing. So you might have a bunch of files in your repository corresponding to your released software, in directory
In your example a single revision number is useful because a change to SQL code usually involves a change to other code. For example, if you rename a column you'll need to change application code that accesses that column. If you commit all these changes in one go, logically grouping them together, it makes things a lot clearer when reviewing changes later on. Once you have changes grouped together as a unit you can move them around, apply them to other branches, or even back them out if they don't work.
Grouping changes like this together probably also implies some stuff like having database people sit with programmers and pair on their changes, but I'll avoid going off the deep-end and evangelising XP too much...
Subversion now has a file system version as well, so you aren't stuck with Bezerkly db if you don't like it.
TortoiseSVN is an excellent front-end, and there are a bunch of IDE plugins for integration with things like Eclipse, IntelliJ, Visual Studio...
http://tortoisesvn.tigris.org/
Since SVN people are about, could someone answer these two questions? How's the branch switching code looking nowadays, and what's the status of read-only files in the repository?
We tested Subversion in November against our working CVS system. It was fun, and we were all really happy with it until...
When switching between two branches, a file that was moved caused the switch to fail. The local sandbox was broken to the extent that nothing short of re-checking out the repository would fix it. All subsequent commits, updates, or attempts to switch back would not work.
The second thing that bit us really bad was that we have applications that set source code controlled files read-only. This is intentional and necessary; if they are not read-only, the files will be changed automatically when certain tests are run, which is something we do not want. Despite this, there are times when those files need to be updated. SVN crashed and burned trying to check out changes over read-only files. All the research I did reading the mailing lists indicated that the prevailing thought at the time was that read-only meant read-only from SVN.
That's not how it works in CVS at all, where we currently use watches and locks to get this functionality. Read-only is an attribute of the file. When someone checks out the file, it must be read-only when it arrives in their sandbox. The file and the sandbox are managed by source control, so aside from user permissions, the last word on whether a file can be modified is that of the sandbox manager, not the filesystem. In short, if CVS or SVN need to write over a read-only file they should be able to do it so long as the file is read-only when the job is done.
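The behaviour being asked for can be sketched in a few lines. This is a hypothetical helper written for illustration, not anything CVS or Subversion actually ships: `overwrite_read_only` and the file name are made up, and it assumes the user owns the file and can change its permissions. It lifts the read-only bit just long enough to write, then puts the file back to read-only.

```python
import os
import stat
import tempfile


def overwrite_read_only(path, data):
    """Replace a read-only file's contents, leaving it read-only afterwards."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IWUSR)  # temporarily allow the owner to write
    try:
        with open(path, "w") as f:
            f.write(data)
    finally:
        # Clear every write bit again, even if the write itself failed.
        no_write = mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)
        os.chmod(path, no_write)


with tempfile.TemporaryDirectory() as sandbox:
    path = os.path.join(sandbox, "fixture.txt")
    with open(path, "w") as f:
        f.write("old")
    os.chmod(path, stat.S_IRUSR)  # the file arrives read-only in the sandbox

    overwrite_read_only(path, "new")

    with open(path) as f:
        content = f.read()
    still_read_only = not (os.stat(path).st_mode & stat.S_IWUSR)

print(content, still_read_only)
```

The `finally` clause is the important part: whatever happens during the write, the file ends up read-only again, which is the guarantee described above.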
With the read-only detail and the sandbox corruption issues open, we had no choice but to return to CVS. I am seriously looking forward to what Subversion has to offer in the future though.
Hope
It's my understanding that the file system repository is also binary. I really like CVS's text repositories for two reasons. One, a vague sense of security, that if a file gets corrupted, I can at least make an attempt at manual repair. Two, I can do a quick read-only edit of a repository file to see some code, I don't have to waste time with checking it out, just go look. Grep also works on it. If I want to find some code that was put into Attic a long time ago, grep works wonders. I don't know how you'd do it with Subversion.
I also had a minor scare with moving subversion (with a Berkeley DB repository) from a Pentium to an Opteron system. The repository size, using dump and restore as recommended by the subversion docs, grew by (IIRC) a factor of around ten. I do not understand this. Merely going to a longer word size would have made no difference to a CVS or darcs repository, being text, and I could think of no reason to more than double for a binary repository. That is what prompted me to look for alternatives, and how I found darcs, and I won't go back to either CVS or Subversion.
Infuriate left and right
PuTTY converts to svn:/ svn.ht ml
http://www.chiark.greenend.org.uk/~sgtatham
(much) smaller project than mono, obviously; interesting all the same.
It's been a very long time. I seem to remember writing a short script that invoked findmerge and clearmerge to do it all. Definitely not as simple as CVS.
:-)
As for the screwing up on whitespace, no idea. But CVS does sometimes, too.
And no, I don't prefer clearcase... that thing was like a tank without an engine: you'd just have a bunch of people inside turning the wheels.
I can explanate how to administrate your network. You must configurate and segmentate it, so it can computate.
I introduced the company I work for to Subversion. They now use it for all new projects. All prior existing projects still use CVS. I also created a full featured (including per project, directory level permissions with inheritance capabilities) web-based client in PHP that is tied into dotProject. Most web based interfaces to SVN that I found back when I started the project failed to consider that some people need restricted access.
Most people just use Tortoise though. The web-interface is nice for browsing repositories and downloading single files but when you need more stuff done, then Tortoise is ideal.
Work Safe Porn
If you've got the money and the need for a SCM tool like Perforce, you should also be looking at Borland Star Team (comes with defect tracking utility) and Clearcase.
If costs like $1000 per seat scare you off, and/or you don't have a need for planet-wide team development, then maybe Subversion would be suitable.
Chip H.
About two years ago, I switched a large project from CVS (three years of revision history) to Subversion.
We were able to migrate it all easily. We have developers using both WinXP and Linux. The Eclipse client was kind of broken at first, but recent versions have been acceptable. I've been able to forget all the workarounds and weird issues that caused us headaches with CVS.
Overall a very good experience - I would say Subversion doesn't add anything groundbreaking to revision control, but rather is CVS done really really well.
It's interesting to note that while many things are stated by computer scientists as evil, they are also practical. There are reasons why these languages are 'efficient' in a business sense: Linux is a widely used and widely known free *nix kernel. C++ leverages the existing C knowledge base; Perl is wonderful when you're trying to analyze ad-hoc log files. Personally, I haven't learned Python or Ruby, because the amount of time I spend hacking perl (maybe 10-15 hours a year) isn't enough to ever justify learning a programming language to replace it. [note I'm not a programmer; thus don't rely on programming knowledge to hold a job; thus my self-education time is better spent elsewhere; otherwise this might be a useful broadening of my horizons]
What you have to realize is that something won't be (rapidly) adopted unless it is significantly better than a previous product (or has a near-zero learning curve). For example, one of the members of the IEEE that sits on some of the Ethernet standards committees mentioned to me that one of the reasons Ethernet jumps by magnitudes of 10X is to ensure the next generation provides significant benefit over the previous one (of course, there are other reasons, too).
The point is, you can find a significant number of languages in use, but few that have been displaced. Those that were widely adopted (Fortran, Cobol, C, Perl, Java) all have specialties (scientific, business, fast/portable, efficient scripting, the web) that significantly differentiate them from previous languages. The dynamics of Python and Ruby is actually an interesting case study of where Perl is being very slowly knocked out of some of its domains by a similar, less "ugly" language (or such is my perception [both Perl being knocked out and the others being less 'ugly']). However, that transition is, I would argue, a bit glacial.
Also, to say that computer science (and studying algorithms) has very little effect is not all that accurate. Developments in rendering 2D and 3D systems, graph theory, and countless other areas are the result of those individuals. Even my understanding of how to organize code to ensure my OO code is a directed acyclic graph is a result of those "CS" folks and having that knowledge filter down. It doesn't necessarily affect the languages, but it changes how we use them. More recently, we have seen discussion of aspects, though (personally) I still haven't figured those out [I actually think it's because I work in problem domains where aspects are not efficient].
In addition, continually pointing out the weaknesses of the existing systems helps those that will design the next set of systems avoid the same mistakes. Even though that advocate may appear to be in an 'ivory tower' and ignoring what is going on in the 'real world', if a few architects listen and learn (and apply to the next generation of systems), then the computer scientist has served a purpose.
One other note. In respect to: "It seems to me that computer scientists need to get out and ask people who work in IT what they actually expect from their kernels, languages, and development systems.", try it sometime. Many times, people have NO IDEA what they are looking for. That's the whole basis of prototyping - to get something in front of the person and get feedback. Many other times, you will get completely contradictory answers. Detail-oriented people will want a revision control system, code tags, and everything javadoc'ed. Others will be hacking out code that hopefully self-documents, but only has a comment to identify the license and copyright of the code. The challenge when dealing with people is that you have wildly different learning mechanisms, personality types (e.g. Myers-Briggs), brain operation (e.g. Hermann), and Communication Styles (e.g. DiSC). These differences lead to d
Oh happy Christ, Borland must release two tools under the same name, because you and I can't be using the same StarTeam. All you need to know is that StarTeam assigns files to the following status codes: "Current", "Out of Date", "Not in View" and "Unknown". How the hell does it not know? We spent like $50,000 on it (by "we," I thankfully don't mean me) and that's about the only thing making it difficult to convert us over to svn. I'm hoping to use Trac as a Trojan Horse here.
The one thing that has me mystified is how svn seems to be corrupting my binary files.
I do a
svn add *
and it recognizes that a particular file (a CAD file) is binary, but when I commit, I see that the file size has increased and now my CAD software can't read the file!
What is svn adding to binary files? Surely not those properties you set with svn propset etc?
Other CS topics are dependent on physical devices with complex emergent behavior (computer systems and their constituent parts) and thus don't admit axiomatic proof; but are still "hard" in that repeatable controlled experiments can be conducted to generate solid empirical evidence.
However, many parts of "Computer Science" are dependent on economics and human behavior. It is at this junction that emotions are thrown into the mix and religion ensues.
"Hard" sciences do not have dependencies on soft sciences; the results of physics research do not depend on psychology or economics. Given this, I consider CS to be a soft science.
No, it can do exactly that, and very well. If it is set up correctly, basically designers can create their own side-stream, and do repeated merges into it to keep up to date.
So where can you find more information on ClearCase side streams and bringing in external changes? I have ever been frustrated by a lack of good ClearCase documentation, and whoever set up our ClearCase systems at work sure does not know about this.
However, I still find setting up a sidestream (private branch? Google had nothing on ClearCase sidestreams) a lot of bother when under CVS I can simply edit files and merge in all changes as they come. It literally is the least amount of work possible for the best workflow, and if the sidestream thing works like private branches then I fear the version tree would be a nightmare from daily merges into a private branch over time... I'll reserve judgement though until I get more details and try it out. If all you do is create a private branch once then I guess the overhead is not too bad... as long as the merging works well.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
It is coming along quite nicely... recently basic synchronize features were introduced, and the JavaSVN layer is improving at a rapid pace (using it to avoid JavaHL). Installation is a breeze now thanks to a new update site for the JavaSVN layer by Alex Kiatev: just point Eclipse to it and watch it get installed. But yes... the plugin has bugs, but there is only a handful of maintainers and getting everything to work correctly is a huge task (the devs definitely need a helping hand to get the rest of the functions in and improve performance, which is a drag once the project gets bigger)... *IBM anyone?*
We use Subversion here at my company and have had a great time with it. When we moved from CVS we had a few headaches but the documentation is so good that we had an easy time figuring it out. Now we have all of our projects in Subversion repositories. Life is good....
Then we loaded a project onto SourceForge. Back to CVS. I screwed up the initial import and had to remove a bunch of files. Do you know how this is done? By sending a request into SourceForge.
Already I miss Subversion.....
--
Dan