Designing a New Version Control System?
tekvov asks: "When Linus Torvalds decided to use BitKeeper as the version control system for Linux there seemed to be a lot of controversy and many challenges to create a better system than CVS. My question is exactly what would this 'better system' look like? How is the subversion project, Tigris, doing at creating a new version control system? Basically, does the Open Source Community need new tools in this aspect of development? And if so, how should these new tools look?"
Shurely, the Tigris project subversion (http://subversion.tigris.org/)??
"Elmo knows where you live!" - The Simpsons
Heh, but it doesn't have to be that way. Look at Clearcase. Good gui and gobs of functionality. Too bad it's so expensive.
Subversion isn't the only open-source revision control system out there. Check out these projects as well:
OpenCM
arch
Stellation
PRCS
I've used Clearquest for over a year and I can tell you that it is a nightmare.
Completely unintuitive GUI and it is difficult to enter bugs into. Maybe they have fixed this in a later version but the email bug alerts don't have clickable links for developers to go straight to the bug in question.
Clearquest was designed by project managers FOR project managers. Any good developer would not go within a mile of it if they had the choice.
Regardless of where your particular allegiances lie---whether it be with arch or subversion or opencm or bitkeeper or whatever---it does seem obvious that the open source community is asking things of CVS that it is just not able to deliver. One need only look at some of the problems large projects like GCC have with it to realize that some alternative is needed.
And if that doesn't convince you, well, it's not for nothing that some of the primary developers of CVS are now working on subversion.
Now, of the new crop of tools, the only one I've played with extensively is subversion---but I am absolutely blown away by how well it seems to make common operations simple. Even with its documentation in a very rough state, and despite its many architectural differences from CVS (with which I have several years of experience), I was able to figure out how to maintain a vendor branch and local modifications, perform updates on both, merge them, tag releases, etc., very quickly and easily. Its developers have obviously looked at CVS to see what things it does not do well that people do frequently, and designed accordingly.
Is subversion for you? Who knows. But if you use CVS a lot---especially if you find yourself cursing CVS a lot---you should do yourself a favor and look at some of the alternatives. A lot of lessons have been learned, and you should avail yourself of the benefits.
Urgle.
Clearcase might cut it for most corporate types. Sure it's got tons of features, but you'll get a grotesque design and a bad implementation for free.
(Try scaling a clearcase server and you'll see how bad the design is... Hint: Adding more CPUs won't help you.) No, most people won't care, but you do if you need to scale it to several thousands of active developers.
Even though I dislike the product, it has more functions than you'll ever need. Integration with different platforms and products is superb, if you're willing to pay...
However it lacks in areas where the developer isn't fully connected (i.e. with LAN access to the view and vob servers).
IMNHO, what the open source community, by definition, needs is something that'll work in a distributed (and disconnected) environment. Clearcase does NOT come even close to delivering that, CVS does, but functionality wise, BitKeeper blows them both away.
I haven't looked at SubVersion in a long time (before it was self hosting) and it looked promising, but IIRC it lacked some of the more advanced functionality that BitKeeper has.
Personally I'd much prefer using a completely free version. Not because I don't like to support the BitKeeper team (I'll buy the product if I use it commercially!), but because of the open logging function.
It just comes down to the fact that I like my privacy...
-oswa
I agree, Perforce is a very solid product indeed. And all the commandline tools are there for Linux, and so are the servers. I've used it both on Windows and Linux (both servers and clients) and it works like a charm.
And in case you don't like their forthcoming linux GUI (I hadn't heard about that before, thanks WPIDalamar) they do provide you with an API so you can make one of your own (KPerforce ^_^), which shouldn't take that long really.
The pricing seems very high for an individual, but their pricing is real cheap for this kind of software (for companies) and you can use it without a license but then with max two client specifications. They also have good support (something that is not common unfortunately).
http://www.perforce.com -- go there and check it out. If you hate paying and want to make your own set of tools you can learn a lot from Perforce.
And I agree, source safe is icky, and so is CVS and source offsite. I haven't had a reason to try out BitKeeper so I unfortunately don't know how it stacks compared to Perforce.
Look at Continuus Versioning Control System: every change on an object belongs to a certain so-called "task" (or more than one task). Tasks belong to a certain task-folder and task-folders can be used for releases. Continuus has a nice flexible database, but the disadvantage of this is that it makes it complicated for people to learn it fast.
- fake out CVS by doing a remove/add pair on every file you want to move (which means you lose the revision history of each such file!), or
- manually move files around in the repository (which entirely defeats the purpose of using a revision control system in the first place!)
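To make the history-loss point concrete, here is a toy sketch in Python. It is not how CVS (or any tool named in this thread) actually stores revisions; both classes and all their names are made up for illustration. The first keys history by path, so a remove/add pair starts the new name from scratch. The second keys history by a stable file id, so a rename is just a relabel.

```python
class PathKeyedRepo:
    """Toy store where history hangs off the file's path (hypothetical)."""

    def __init__(self):
        self.history = {}  # path -> list of committed contents

    def commit(self, path, content):
        self.history.setdefault(path, []).append(content)

    def rename(self, old, new):
        # Equivalent of a remove/add pair: the new path starts a fresh
        # history, seeded only with the latest content of the old path.
        self.history[new] = [self.history[old][-1]]

    def log(self, path):
        return self.history[path]


class IdKeyedRepo:
    """Toy store where history hangs off a stable file id (hypothetical)."""

    def __init__(self):
        self.history = {}  # file id -> list of committed contents
        self.ids = {}      # path -> file id

    def commit(self, path, content):
        # A path seen for the first time gets the next unused id.
        file_id = self.ids.setdefault(path, len(self.history))
        self.history.setdefault(file_id, []).append(content)

    def rename(self, old, new):
        # Only the label moves; the revisions stay attached to the id.
        self.ids[new] = self.ids.pop(old)

    def log(self, path):
        return self.history[self.ids[path]]
```

With two commits to old.c followed by a rename to new.c, the path-keyed log for new.c holds one revision while the id-keyed log still holds both.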
If anyone out there creates a successor to CVS, please fix this fatal flaw!
There is quite a mature native Windows port of cvs that we've been using for quite some time.
john
I'm a ClearCase specialist so I'm biased.
However ClearCase has some -very- good features, and here is what I would arrive at (ideally):
1) Make your repository a mountable file system, supporting multiple types of connection, NFS, SMB, Active Directory, FTP, etc.. When connecting you must specify a profile to be used.
2) Make every user have a number of profiles (Min:1) (like ClearCase views), these profiles contain -all- the info needed to access file versions correctly. They should allow sharing ('base my profile X on the profile Y created by user Z'). And support concepts such as labelling, conditional branching, etc..
3) All profiles are managed from a central server (redundancy?) via a web interface (to achieve cross-platform conformity) and command-line interface (SSH based) for scripting/power-users.
I could go on forever, but I think the three above points are the things that matter most to me. Obviously you also need security, administration, storage, etc.. but I think that making files available simultaneously via many common file sharing protocols would produce the greatest benefit.
Finally: MAKE DIRECTORIES VERSIONABLE/BRANCHABLE!, yes it causes some potential headaches, but its benefits easily outweigh them.
"Oops, I always forget the purpose of competition is to divide people into winners and losers." - Hobbes
Also, they do open source licensing. If you are a certified open source project (I guess they just checkout the project???), you can get licenses for free.
1;
I think in designing a source control system you have to be aware that there are two different usage models.
CVS has been designed and mostly used for the latter. Tools like SourceSafe and Clearcase were designed, and are almost exclusively used for the former.
One of the obvious differences in approach is file locking on checkout. Obviously there are others as well.
I don't see any reason why one tool couldn't address both models, with suitable ruleset configuration for the administrator. But you have to recognise, and design for, one or the other or both models from the start.
Never trust a man in a blue trench coat, Never drive a car when you're dead
I love perforce too. And if you are either an open source developer or you just want to use it for personal use (two users max) you can use it for free. Check out their licensing here (scroll down for open source info) and here
Plus it's so easy to install on a linux server. There's a bit of a learning curve with how the system works but in less than a day you'll be checking in and out and branching without a problem.
"Karma can only be portioned out by the cosmos." -Homer Simpson
I am working at a really large corporation and we are using clearcase. All I can tell you is that CC really sucks. CC always needs some maintenance. We have a dedicated CC expert and IT for maintaining CC. And CC is painfully slow if you compile something from the repository (at least on Solaris). So, we use the snapshot-view feature which is more like CVS. In short, we use CC as if it were CVS.
Of course, corporate policy forbids us to use CVS.
Government cannot make man richer, but it can make him poorer. - Ludwig von Mises
Could not disagree more. At my previous job, it took a team of 5 CC guys to support a team of 50 coders. It never worked right. The *nix support was OK but the NT support was a nightmare. The view concept of CC is great in theory, but in practise, it's a disaster: hours before the final build, everybody is scrambling to verify that indeed the build team is pulling the right files from the right view, etc.
As in many other areas of sw development, simplicity is often the best choice you can ever make. CVS is simple. It works. It's easy to audit, it's really cross platform and with so many OSS projects using it, it's worth learning the 5 or 6 commands you'll ever need. The thing that the cvs documentation explains really well is that the key to successful CM lies not in the tool but in the processes and the communication between team members.
there's no place like ~
From a licensing standpoint, they have a problem with the code that validates you. We went through some layoffs and backed off the number of users upon renewal. Even though users didn't exist in the database, the licensing reports said they were there. It took me a few days to demonstrate this was actually happening and get them to admit to it -- don't trust their logs!
This one also looks good:
http://www.opencm.org/
Here is a quick list of key features of OpenCM.
* For-real configurations! It's just amazing what CVS doesn't know.
* Ability to rename files without losing their history
* Access controls on lines of development (branches).
* Cryptographic authentication. This provides the ability to give developers accounts on the OpenCM repository without giving them an account on the underlying machine (OS), and makes multi-organization collaborations possible.
* End-to-end integrity controls. If a server has a bad file, or a replicating server actively attempts to replace the proper content, the end user can detect the error or substitution.
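For anyone wondering how an end user can detect a bad or substituted file without trusting the server, one common way is to name content by a cryptographic hash of its bytes and re-check the hash on every fetch. The Python sketch below shows only that general idea. It is an assumption for illustration, not OpenCM's actual scheme; SHA-256 and every name in it (UntrustedServer, fetch_verified, and so on) are hypothetical.

```python
import hashlib


def content_name(data: bytes) -> str:
    """Name a blob by the SHA-256 digest of its bytes (assumed hash choice)."""
    return hashlib.sha256(data).hexdigest()


class UntrustedServer:
    """Stand-in for a repository or replica the client does not fully trust."""

    def __init__(self):
        self.blobs = {}  # content name -> bytes

    def put(self, data: bytes) -> str:
        name = content_name(data)
        self.blobs[name] = data
        return name

    def get(self, name: str) -> bytes:
        return self.blobs[name]


def fetch_verified(server: UntrustedServer, name: str) -> bytes:
    """Fetch a blob and refuse it unless its hash matches the requested name."""
    data = server.get(name)
    if content_name(data) != name:
        raise ValueError("content does not match its name: " + name)
    return data
```

As long as the client got the name from somewhere it trusts (a signed configuration, say), a replica that swaps the bytes is caught at fetch time.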
In future releases (coming soon), OpenCM will provide:
* Repository replication
* Disconnected commits (ever screwed up a code base on an airplane or a vacation and wished you could back out?)
* Advisory access controls at the file level.
We have had these features working at various points (so we know they will work), but elected to remove them to make the 0.1 release available sooner.
Look, CVS is king.
Yes, King.
I would not hesitate to say that it has its share of difficulties, but there is no way anything is going to replace it anytime soon. There are many meta-features of CVS that make it unable to be replaced:
1. Multi-platform: I don't mean 3 or 4 or even 5 or 6, bla bla bla. I mean EVERYWHERE. I've seen CVS in more places than anything besides emacs and gcc. And really, anyplace gcc or emacs goes, cvs is the third guy there.
2. Massive Acceptance: CVS is everywhere. 10 million people use it with sourceforge. Another few million elsewhere. It is the common thread that binds us together (kinda like the force!)
3. Massive, Massive Tool support: This is my favorite. You can use it about a hundred different ways. Not 1 gui, but 50!. It goes into command line apps like great!. Show me another tool that has integration with the windows explorer (via TortoiseCVS) like it has. You Can't. (Don't even try that god-awful Bitkeeper's integration:yuk!)
4. Simplicity: It's REALLY simple to use. It's not that complicated. If CVS throws you for a loop, maybe software development really isn't where you should be working. The incompetence among developers is what makes all software look bad.
5. Protocols: You can run CVS thru SSH, RSH, PServer, File Access, and more... It fits into every environment. It works across any damn network. It can jump tall buildings in a single bound!
Really, until someone makes something that trounces CVS in all those areas, AND provides features that "I can't live without" CVS will Rule.
"...In your answer, ignore facts. Just go with what feels true..."
(Try scaling a clearcase server and you'll see how bad the design is... Hint: Adding more CPUs won't help you.) Try adding more memory and distributing your server duties over a cluster and that will help.
Blaze a trail to the New World
What about purchasing one of the WalMart machines, sneaking it into the building one day, and just adding it to the network. You could even configure the machine to refuse connections except from your work group to try and keep it off of your ServerNazi's radar screens.
Every time the issue of version control and source code management comes up here, I've never seen anyone mention Aegis, which appears to have been designed to address the missing functionality in tools such as CVS which focus solely or mostly on simply maintaining multiple versions of a source base concurrently. Here's an excerpt from the CVS comparison in the CVS Transition Guide:
1.5.1. Why should I change from CVS to Aegis?
The software seems to be pretty mature (currently at version 4.5, first released in 1991). Has anyone here used it?
In Soviet Russia, Jesus asks: "What Would You Do?"
Everybody who thinks javadoc is the same as literate programming (web) really needs to do some google searching or something to find out what literate programming is about.
- Scratch pad versions. Ever needed to play around
with a piece of code (put in debug statements,
or change part of it temporarily to help debug
something) but didn't want to check it out
and have the threat of making the changes
accidentally permanent? Envy had the ability
to make a "scratch" version of a file - letting
you edit it, but not worry about accidental
check ins, or forgetting that you had made a
file writeable.
- Version/Releases. Not only could you label
a specific state of an application a "version"
but you could also label a version of an
app a "release". This allowed some subtle
distinctions between "ok here's a workable
version we can get back to (demo)" and "here's
the real, outgoing released version".
- Manager. Code could be given specific people
that were the manager, or "owner" of a piece
of code. If you wanted to enter your changes
into the code base in general, you had to get
the owner to do it. This control could be
anywhere from every check-in, to version or
releases. An owner could give permissions to
other people as well.
- Multiple checkouts. Envy recognized that
sometimes people have to work on the same
file, as much as it's best prevented. So,
it allowed multiple check-outs, with facilities
to integrate the files back together on
check-in.
It was quite complex, but looking back at it I now understand why many of the facilities were there and would die to have them for my team. We're using SourceSafe (blech), and it works ok, but something like Envy would be great.
There is also an Emacs client for Perforce. It's not completely full-featured, but it means you don't have to bother with any of the standalone clients when doing basic editing and version control tasks (You are using Emacs as your editor, aren't you?). When I need to do something fancy, I still need to use one of the standalone clients (GUI, commandline, or web).
Besides the Emacs integration, there are integrations for Developer Studio, Windows explorer, Perl, Python, Forte, Eclipse, etc. These are all available at Perforce User Interfaces & Integrations.
Aegis enforces a development process which requires that change sets "work" before they may be integrated into the project baseline. "Works" includes requiring that change sets build successfully, and (optionally) that they include and pass tests. It also ensures that code reviews have been performed.
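The gate described in that excerpt can be sketched in a few lines of Python. This is only a toy model of the idea (review, build, then tests before anything reaches the baseline); it says nothing about how Aegis itself is implemented or invoked, and ChangeSet, integrate, and IntegrationRefused are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Tree = Dict[str, str]  # path -> file contents


class IntegrationRefused(Exception):
    """Raised when a change set fails one of the gates."""


@dataclass
class ChangeSet:
    files: Tree
    tests: List[Callable[[Tree], bool]] = field(default_factory=list)
    reviewed: bool = False


def integrate(baseline: Tree, change: ChangeSet,
              build: Callable[[Tree], bool],
              require_tests: bool = True) -> Tree:
    """Return a new baseline only if the change is reviewed, builds, and passes."""
    candidate = {**baseline, **change.files}  # the baseline itself is untouched
    if not change.reviewed:
        raise IntegrationRefused("change set has not been reviewed")
    if not build(candidate):
        raise IntegrationRefused("change set does not build")
    if require_tests and not change.tests:
        raise IntegrationRefused("change set includes no tests")
    for test in change.tests:
        if not test(candidate):
            raise IntegrationRefused("change set fails its tests")
    return candidate
```

The require_tests flag stands in for the "(optionally)" in the excerpt: a project can decide whether a change set without tests is acceptable.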
--- Fox
plus tortoise which has been working very well for myself and some co-workers.
four-oh-four
Subversion is, basically, changeset based. Their storage model is a bit.... weird. But they capture the set of changes in a checkin as a single, atomic change unit.
On the other hand, when we started implementing Stellation, we used PRCS. To say that it's a frustrating, annoying, glitchy mess
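To show what "a single, atomic change unit" buys you, here is a small Python toy. It is not Subversion's storage model or API; the Repo class, its whole-tree-per-revision layout, and the Conflict exception are assumptions made for the sketch. The point is only that a checkin touching several files either lands as one new revision or does not land at all.

```python
class Conflict(Exception):
    """Raised when a path changed in the repository since the committer's base."""


class Repo:
    """Toy repository: each revision is a complete path -> content mapping."""

    def __init__(self):
        self.revisions = [{}]  # revision 0 is the empty tree

    def head(self) -> int:
        return len(self.revisions) - 1

    def commit(self, base_rev: int, changes: dict) -> int:
        """Apply every change in `changes` as one new revision, or none of them.

        `changes` maps a path to its new content, or to None to delete it.
        """
        head_tree = self.revisions[-1]
        base_tree = self.revisions[base_rev]
        # Check every path first, before anything is modified.
        for path in changes:
            if head_tree.get(path) != base_tree.get(path):
                raise Conflict(path)
        new_tree = dict(head_tree)
        for path, content in changes.items():
            if content is None:
                new_tree.pop(path, None)
            else:
                new_tree[path] = content
        self.revisions.append(new_tree)  # the single point where the commit lands
        return self.head()
```

If one file in a checkin conflicts, the other files in the same checkin are not committed either, so the repository never holds half of a change.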
is an epic understatement. When we finally moved to self-hosting
(inside IBM; we don't yet have strong enough access control to
put up a public Stellation server, so we shadow our internal
repos to an external CVS for the moment), it was an enormous
relief.
PRCS generates thousands upon thousands of unnecessary questions,
phrased in obtuse and easily misinterpreted ways. It requires you
to manually maintain the map between filenames and repository
artifacts, by manually editing a cryptic configuration file. It
constantly mucks up that configuration file, adding hundreds of carriage
returns for no apparent reason. Echh. The versioning model
is very nice; the implementation is solid, but frustrating.
-Mark
See:
http://www.perforce.com/perforce/branch.html
Which explains the merge process in Perforce.
-Stu
A couple other posts have mentioned Vesta, which goes a long way towards meeting the requirements you lay out. (For the sake of disclosure, it's only fair to mention that I am currently the primary maintainer of Vesta, and am somewhat responsible for getting it released as free software.)
Vesta absolutely does this.
Vesta does not explicitly provide this, however it's very easy to get with a simple diff command. The Vesta repository has a filesystem interface which makes it possible to directly access all versions past and present. A simple diff -r will show exactly what changed between any two versions. This also has other fun uses (e.g. grepping across versions).
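If every version is just a directory you can read, a script can do the same job as diff -r. A rough Python sketch, assuming two version directories are visible as ordinary paths containing text files; the function name and the directory layout are hypothetical and not taken from Vesta:

```python
import difflib
from pathlib import Path
from typing import List


def diff_versions(old_root: Path, new_root: Path) -> List[str]:
    """Unified diff of every text file that differs between two version trees."""
    old_files = {p.relative_to(old_root) for p in old_root.rglob("*") if p.is_file()}
    new_files = {p.relative_to(new_root) for p in new_root.rglob("*") if p.is_file()}
    out: List[str] = []
    for rel in sorted(old_files | new_files):
        # A file missing on one side is treated as empty, so adds and
        # removes show up as all-plus or all-minus hunks.
        old_lines = ((old_root / rel).read_text().splitlines(keepends=True)
                     if rel in old_files else [])
        new_lines = ((new_root / rel).read_text().splitlines(keepends=True)
                     if rel in new_files else [])
        out.extend(difflib.unified_diff(old_lines, new_lines,
                                        fromfile=str(old_root / rel),
                                        tofile=str(new_root / rel)))
    return out
```

Identical files produce no output at all, so the result lists only what actually changed between the two versions.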
Vesta's access controls are essentially UNIX file permissions. Through the repository's filesystem interface, you can control who can read and write (commit new versions) at a variety of granularities with chown, chgrp, and chmod.
Vesta provides no direct help here, but again, because of the filesystem interface to the repository, it's easy to apply standalone diff/merge/conflict resolution tools.
Vesta's unit of version control is a directory. Between versions, files and subdirectories can be added, removed, renamed, etc.
Not built-in, but already implemented on top.
Vesta includes sophisticated cross-site features, including replication and remote checkout/checkin. It's been successfully applied before by a team spread across the US east and west coasts with hundreds of megabytes of sources.
Vesta really shines in these areas. Vesta is also a build tool, and every build necessarily includes a complete specification of the set of immutable versions it uses. Builds are also themselves immutably versioned. This makes it easy to determine which source components have changed between two versions of a build. Also, since it's as easy to select any historical version as it is the latest one, rolling back changes is trivial.
We're still working on it (at the moment just Linux and Tru64 work), but hey, it's free software, and we'd love to have more developers/porters.
At this point there's a command-line interface and some rudimentary support for repository operations in the web interface. Again, it's free software, and we'd love to have somebody contribute more layered tools.
Nothing built-in yet, but we've been talking about doing it, and it may happen soon.
There's a short summary of Vesta's excellent capabilities on its web site (which includes several points not mentioned here), that I would recommend anybody interested in better version control/configuration management check out.
CVS is teh suck. Use Vesta instead.
We use perforce, too. We've been less than satisfied with it. I don't know the size of the companies people with positive views of perforce are working for, but with a couple hundred developers, and on the order of a couple of thousand different code lines, the perforce server often grinds to a halt. More hardware has been thrown at it, more disk, etc, and no one can seem to figure out where the bottleneck is. It's very unpleasant when checkouts are taking 10 minutes, in the middle of the day.
(I'm not a developer, just an observer of the project.)
Subversion is nearing a first final release. The alpha version is just around the corner, and development is very active.
Subversion is basically CVS done right. Natively client/server, atomic commits, proper handling of binary files, proper versioning of everything - including directories and file metadata, and much more. Go read about it at http://subversion.tigris.org
/ Peter Schuller
--
peter.schuller@infidyne.com
http://www.scode.org
ccache.samba.org
So look at cvsnt. It meets your requirements:
A major downside to perforce is that it is terrible at working in a disconnected environment, thus, it isn't suitable to open source development.
I can only assume you're talking about some other Vesta from the one I'm familiar with, because:
CVS is teh suck. Use Vesta instead.
We also had some performance problems with perforce. They've made a few updates to their software on our recommendation and things generally seem a lot better. I'm not an expert, but here are a few things to keep in mind:
Avoid 'p4 dirs //*'.
Avoid remote depots. Perforce's implementation may slow down commands such as "p4 dirs"
Do not create clientspecs or branchspecs with "//someProduct/*/main/..." or a similar file spec. It may block all update access (p4 edit and p4 submit) to all of Perforce for all users for five to fifteen minutes. Explicitly list depot, project, and codeline in all branch specs.
That's a short list, our release engineering dept. has a long page of things not to do with perforce. I have to say, we've gotten it to run pretty quickly now, after several months of getting used to it.
Personally, I just don't ever plan to give people local access to my servers. One of the huge benefits of being Apache-based is that our authentication occurs through Apache, and that it does not need system accounts (it can integrate with existing auth systems (PAM, LDAP, NTLM, etc) or use text files, databases, etc). Therefore, the only way somebody could alter the author property is if, say, I wrote a CGI script that allowed a person to go and tweak it.
So the general answer is: no, a developer is not entirely trusted, and they cannot change the recorded author on revisions.
Are you possibly thinking of Arch again? I'm not sure how it records authors, or whether other developers can tweak that information.
Seconded, we use CVSNT and Tortoise, and it works really well. It works well for us as we are a geographically distributed group of developers with erratic interconnectivity, all working off laptops that move with us. CVS is brilliant for this as you don't need to check a file out to make a change.
The only down side for tortoise is that it can make explorer a bit slow if it can't reach the CVS server.