An Illustrated Version Control Timeline
rocket22 writes "Most software developers are supposed to be using the latest in tech and see themselves as living on the edge of software innovation. But, are they aware of how old some of the tools they use on a daily basis are? There are teams out there developing iPad software and checking in code inside arcane CVS repositories. Aren't we in the 21st century, the age of distributed version control? The blog post goes through some of the most important version control systems of the last three decades and while it doesn't try to come up with an extremely detailed thesis, it does a good job creating a catalog of some of the most widely spread or technologically relevant SCMs."
There are teams out there developing iPad software and checking in code inside arcane CVS repositories
But does it work for them? If so, great! Why switch to something else if you have no real need for all those features?
It's also an advertisement. Sigh.
SCCS predates RCS. Why isn't it on the list?
Geez, he's right, I've been using things like grep and gcc just because I'm familiar with them and they perform the task at hand. Time to upgrade to the hip and new version that does the same damn thing in a slightly different way!
Not that I'm against progress, but it's a matter of weighing the hassle against the gains. Forcing the new kids to learn the old tools can be annoying, but good for them. Likewise, showing grandpa that there's a diff with side-by-side comparisons is probably a good idea.
~500 megahertz K6 laptop w/ ~192K RAM
-single core P4 desktop w/ 512K RAM
I bow before your leet minimalist systems!
We use TFS here. Because some suit that shouldn't have been making the decisions he did, who was also probably wined and dined by some MS suit. Was told it was the best thing since sliced bread. Every developer to a man hates it. It sucks. god knows how much this 'privilege' costs us.
Speaking of modern CVS repositories, one thing I haven't seen is version-ing for web based applications and sites. The one I use is a custom job built by our corporate office and it works very well but it's missing a lot of features. Every once in a while I google for any projects like this and haven't had anything turn up. Anyone know of any?
check out the Mp3 Garbler I built!
Finally, I'll go out and say it: the hate-on for SVN is overrated. Apart from a few world-class developers who, for some reason, have shitty network connections and a thousand patches to sort through, it's mostly cool kids overcomplicating their lives solving "problems" they never had while developing their blogging software, not working in an office, and imagining they are Linus Torvalds.
While I agree svn isn't bad to work with, having a local repository can be VERY handy once you're accustomed to the workflow.
Enough so that when dealing with svn repositories it's rather nice to be able to import a complete copy of the repo using git-svn, albeit it takes somewhat longer etc.
With svn I could not do version control based things during a long commute such as finding out who wrote a specific line of code so I can make a note to yell at them, with git I can. One of many things.
I'm stuck using SCCS, you insensitive clod!
Nothing for 6-digit uids?
Too Many Acronyms
He who knows best knows how little he knows. - Thomas Jefferson
If it's anything like Hudson's graphical build timeline, it's a cool feature that made me go "neat" but I have honestly never used it to look stuff up on our CI server.
Orwell was an optimist.
All of the "big" companies I've worked for use ancient out-of-date source control. The first one used VSS (late 90's, so it wasn't so unusual at the time) but then around 2000 moved to PVCS. All the developers assumed that someone got kickbacks because there's no reason to move to an older, more expensive, inferior product. Now I work at a Fortune 500 company that also uses PVCS. Their reason: not a soul in the building has ever used anything else. I explain about the features of modern source control and people look at me with either marvel (it can do that!!??), or disdain (how dare you question my source control system).
I don't know why this one piece of software evokes such illogical responses. Oh well.
Second, the article doesn't even mention SCCS, developed in 1972 (and still available), so his history lesson is lacking some completeness and perspective.
Third, remarking, "It [CVS] is outdated now, but it worked in the 90s! (If you have it, just walk away and go on to something else!)" -- as well as the other snobbish comments about other (older) systems -- is simply narrow-minded. CVS is completely satisfactory for many, many projects. Contrary to later comments in the article, I've used, and still use, CVS in several commercial products and it works just fine.
Real lesson: Newer is not always better; more features are not always needed.
It must have been something you assimilated. . . .
also leaves out SCCS (a precursor to RCS), as well as layered tools/integration of tools ONTO the SCM (like eclipse -> SVN, etc.)
The article was a good intro to the history of SCM; however, I am amazed they didn't just do a box and put their tool in the upper right-hand corner...
I couldn't agree more. In my previous job, I had a colleague who wanted to convert me from SVN to Bazaar (http://bazaar-vcs.org/).
He told me "it was very simple to use, you only have to..." and then started drawing a very complicated diagram on my whiteboard.
Personally, I thought it was complete overkill for the two-man project we were working on.
Are you able to use any software written in the last decade? Modern desktop applications seem to soak up memory like water.
supposed to be using the latest in tech and see themselves as living on the edge of software innovation.
I do. I live on the tail edge:
~500 megahertz K6 laptop w/ ~192K RAM -single core P4 desktop w/ 512K RAM
192K? 512K? Kilobyte? You sure about that? Not Megabyte?
"Ayn Rand is a bloody socialist compared to me." - Robert A. Heinlein
It's extremely useful even for a single developer.
You can still use DVCS systems as if they were centralized, without any inconvenience.
I'm not even going to bother to list them myself, as wikipedia does a fine job already.
http://en.wikipedia.org/wiki/Distributed_revision_control
On every single project I've worked on that didn't use DVCS, I missed it a lot. DVCS is how it should be done, and CVCS is inherently flawed.
However, I also think the article sucked.
You're so right!!
Brings back memories of (barely) running the first version of Visual C++ on a 386SX.
I'm not a lawyer, but I play one on the Internet.
I hear all the time how terrible Subversion is at branching and merging, but I can't really see any issues with it. Am I missing something, or is this all based on pre-1.5, when it didn't have merge tracking? Granted, it was fairly brain-dead to not track what revision a branch occurred in or what revision it was last merged to a particular other branch (or the head), but as far as I can tell, comparing it to AccuRev which I use at work heavily and is supposed to be fantastic at merging (it's ... ok), there's little difference beyond the terminology.
Can somebody explain what it handles so badly? I feel like I'm not missing something I should be. I like Subversion, probably just because I know it, and use it for my home projects, but if there was an actual benefit (and decent cross-platform tools; TortoiseSVN is fantastic, I love working on my linux box but doing graphical diffs on the same working copy over a Samba share) I'd love to switch to something better. I know I said I like Subversion, but it's more like how you like a kevlar vest: it's better than the alternative, which in this simile is bullet holes in my torso.
<xml><I><am><so><damn>Web 2.0</damn></so></am></I></xml>
Nope. Distributed systems are ALWAYS more useful than centralized ones for source code control.
Only sometimes their advantages are not that significant.
I do agree, and sometimes it's not even about being distributed; it's because the older systems suck in terms of branching and merging.
I was hoping for a visual timeline of distributed git repos, or something that would make using git easier. Git is likely a better way to do version control, but it is better because it is fundamentally different. Those differences have not worked their way into Eclipse's abstraction of version control far enough, yet.
I wrote parts of this stuff
Ooops...
~500 megahertz K6 laptop w/ ~192[M] RAM
-single core P4 desktop w/ 512[M] RAM
You can tell you're old when you remember when 512K was considered a lot of memory. "I got a Commodore Amiga 500. I could do BASIC or C programming for an entire month and still not fill-up all the space! Wow." ----- Now we have computers with 1000 times that amount and they won't run Win7 or OS X out-of-the-box.
"I disapprove of what you say, but I will defend to the death your right to say it." - historian Evelyn Beatrice Hall
"If it ain't broken, don't fix it."
Right. And CVS is horribly broken. So it's been fixed. :P
A host is a host from coast to coast...
Unless it's down, or slow, or fails to POST!
What matters is the total build and regression story, which encompasses configuration management and revision control--that topic gets far too little attention.
DVCS can use the exact same workflow as non-distributed VCS, and they're faster in many cases - including not having to connect to a central server for each and every commit.
If it costs you to change now, don't, but if you're starting a new project, I see no reason to choose CVS.
Dilbert RSS feed
Bullshit. The three developers in my (real) example must always work inside the company, on the local network. Name one advantage of a distributed system for them.
even stinky ol' subversion does branch and merge well enough for small project with a few devs.
Full of troll, and incorrect in some spots. For example, TFS doesn't do branching and merging? It may do a crappy job of branching and merging, but that functionality is prominently there.
I quit using SVN just because I found the Xcode integration to be flaky at best, and remote work was less than seamless. It otherwise seems to work fine, and what it lacks are things that are just poseur points for most shops (quick, list two problems for your shop that only a DVCS can solve).
Git worked for me when I was doing work on the bus to and from my day job, allowing granular commits instead of the big mixed-up commit when I got home. I like it for a lot of other reasons even after doing my own thing full-time. But there's no way I'd get on a religious soapbox about it, starting with the learning curve (first time a merge or a push goes wrong, break out the google).
But hey, use what you want as long as it's not VSS. Because even a tabs vs. spaces flamewar interests me more than source control debates.
Mod parent up.
Have you ever tried to administer a central hg or git code repository accessed by a couple thousand developers and maintain access controls on it, like you would in a corporate environment? I have, and it's a complete pain.
The distributed nature of a tool like git is nice in that you don't have locking and it's easy to get all the code you need. However, it's worthless to a company unless that can be managed, and you realistically need something like Gerrit (git code review, access control system) or github:fi (costs a metric crapload to help Scott Chacon buy a fleet of Ferraris).
I'm mostly a single-person team, and I find DVCS quite useful (the reasons I won't bore anyone with). For my workflow, centralized (SVN in my case) was limiting me.
Central machine with repo is down - nobody works.
They can commit locally many times and merge that to a single commit per feature before pushing to the central system.
Git uses much less space than SVN - Mozilla's repo size: CVS - 3GB, SVN - 12GB, Git - 300MB.
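To make the "many local commits, one commit per feature" point concrete, here is a toy sketch in Python. It is not how git or any real tool stores history; the change format (a dict mapping path to an (old, new) pair, with None meaning "file absent") and the function name squash are assumptions made up for illustration.

```python
def squash(changes):
    """Combine an ordered list of changes into a single change.

    Each change maps path -> (old_content, new_content); None means the
    file does not exist. This representation is hypothetical.
    """
    combined = {}
    for change in changes:
        for path, (old, new) in change.items():
            if path in combined:
                # Keep the oldest "before" state and the newest "after" state.
                first_old, _ = combined[path]
                combined[path] = (first_old, new)
            else:
                combined[path] = (old, new)
    # Drop paths that ended up exactly where they started.
    return {p: (o, n) for p, (o, n) in combined.items() if o != n}


# Three small work-in-progress commits made locally (made-up content).
wip = [
    {"f.py": (None, "draft")},
    {"f.py": ("draft", "fixed"), "tmp.log": (None, "debug")},
    {"tmp.log": ("debug", None)},
]
feature = squash(wip)
print(feature)
```

The temporary file that was added and later deleted disappears from the combined change, so the central system would only ever see the finished feature.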
I agree, subversion is not terrible. However, after getting a laptop, I definitely see the advantages of a DVCS. git's not the friendliest of tools, but regardless of the reason, there's a lot of momentum out there and supporting tools, so I prefer using git as my DVCS.
In addition, with git, I also have gotten extremely comfortable with creating a new local branch for any separate task I want to do. This makes my commits much cleaner and virtually eliminates the problem I had with svn where I was working on a feature then got interrupted with a high priority bug.
The git-svn bridge also comes in extremely handy, and is a great way to get the benefits of both worlds.
I have to say though, that I think git not handling directories as real objects is a big step backwards. And Subversion's use of metadata can also be pretty handy sometimes.
Feature branches and isolation of production and development code : branch a repository and work on it in isolation, then merge it later back in the mainline. I use it for myself in that way. There are things that sometimes need to be done fast, while in parallel I also need to be able to implement larger changes.
You can always use a DVCS like a centralized one. The opposite is not true.
For personal projects, I've mostly been happy running RCS on my own server. I use CVS from time to time, and bits of it annoy me, but I can get work done with it.
Maybe for huge open source projects with teams all over the planet who can't communicate very tightly and who don't have a unified SDLC, some of these newer tools are worthwhile. But if you've got a small team and an adequate process, what's the compelling argument for switching to one of them if what you've got in place is working fine?
While I agree svn isn't bad to work with
SVN is barely tolerable. I'm never going back - git is so much better it's not even fair to consider it in the same league as SVN.
The secret to creativity is knowing how to hide your sources. - Albert Einstein
A distributed system is handy because you don't need to be net-connected to do work, so you can take it on your laptop and work on it while on the plane, bus, etc. without worrying about a net connection.
You can pass half-done changes to your coworkers for evaluation without checking them in to the central server.
If doing any work with the linux kernel, git is the most efficient tool to use simply because everyone else is using it.
And of course you can still have a central server to act as the "official" repository. It's just that you can also bypass it when desired.
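A toy Python model of the points above, for anyone who hasn't used a DVCS. The Repo class, its methods and the commit messages are all invented for this sketch, and it assumes a strictly linear history where the remote is always a prefix of the local history; real tools handle far more than this.

```python
class Repo:
    """Toy repository: just an ordered list of commit messages."""

    def __init__(self, name):
        self.name = name
        self.commits = []

    def commit(self, message):
        # Purely local operation; no network involved.
        self.commits.append(message)

    def push(self, remote, online=True):
        """Send the commits the remote lacks; fails when offline."""
        if not online:
            raise ConnectionError("no network; commits stay local for now")
        missing = self.commits[len(remote.commits):]
        remote.commits.extend(missing)
        return len(missing)


official = Repo("official")   # the agreed-upon central repository
laptop = Repo("laptop")
coworker = Repo("coworker")

# Working on the plane: committing works, pushing does not.
laptop.commit("start parser")
laptop.commit("half-done parser")
try:
    laptop.push(official, online=False)
except ConnectionError as err:
    print("offline:", err)

# Hand the half-done work to a coworker, bypassing the central server.
laptop.push(coworker)

# Later, finish the work and publish it to the "official" repository.
laptop.commit("finish parser")
pushed = laptop.push(official)
print("pushed", pushed, "commits to", official.name)
```

Nothing stops the team from treating the official repository as the only place that counts; the difference is that the laptop still has full history when that server is unreachable.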
The author appears to believe that old version control systems are bad because they are old.
I have used ( and administered ) projects using RCS, CVS, SVN, Perforce, Clearcase, Git and VSS.
RCS - Advantage: no setup necessary. I used RCS to track changes to my 140-page thesis (LaTeX) during the year of writing. I can still take that tar archive, extract it on any workstation or PC (Windows, Mac or Linux), and have full access to the revision history. No setup, dirt simple. (Of course, since RCS was never designed to handle more than a single person modifying a file at a time, concepts like branching and merging don't exist, but for simple single-person projects this is far better than nothing, and vastly better than manually archiving copies when you remember to.)
CVS - Advantage: Supports multiple users, branching and merging (same server; the DCVS variant provides some concept of distributed operation but should be avoided). Relatively easy to set up, and when restricted to ssh-only access can be relatively secure. Disadvantage: no distributed support, very coarse security (if you have access to the server and repository directory you have access; multiple projects on the same server are clumsy to secure).
SVN - Better than CVS, but harder to set up (less obvious?). Distributed support (sort of), but no concept of locking checkouts, so not suitable for code that is not easily merged (VHDL and Verilog can get ugly when you try to merge what appear to be trivial changes).
(CVS and SVN are pretty well supported via integration with many IDEs out of the box).
Clearcase - Great big bag of hurt. Avoid this if at all possible. Advantage: Large companies (government contractors) use this tool. Ratio of administrators to users is typically 1:10, so expensive manpower. Provides distributed(ish) support using Multi-Site. License costs very high. Security is laughable. Any user with network access to the server ports, an installed licensed client (access to the license server) and the ability to assume root on a unix/linux machine can perform any administrative-level operations on the files. The client-reported username and group membership are trusted by the server to determine access privilege.
Perforce - Despite the author's grouping, P4 provides very good distributed support for controlled development projects. Using proxy servers, remote access to files is pretty fast. The only tool listed so far that supports atomic checkins. If any file in the set you are submitting fails to check in, the whole checkin fails. This may sound like a bad thing if you have never had to fix a problem where one file didn't get checked in, breaking the build. Security (access to parts of the repository) is controlled within the tool, so a fine level of granularity can be achieved. Account management can be done directly in Perforce by the admin (passwords stored locally), or can be set up to use LDAP/Kerberos/Active Directory for added trust.
VSS - Small bag of hurt. Small bag because it worked so poorly that we never used it for large projects. Nothing good to say about this, just say no.
Git - I haven't used this enough to know if I like it or not. Having the repository replicated at each remote leaf (user) is nice for the distributed development, but for projects requiring close control of the source code this can be nightmarish. Since every remote site has a copy of the whole history, fixing the problem when Johnny accidentally checks in code from projectX that contractually cannot be shared with projectY can suck.
Actually Bazaar and Mercurial do work with the much simpler feel of RCS in that there is no real effort needed to set things up. It can be refreshing to not have to mess around with repository configuration for simple projects.
I am becoming gerund, destroyer of verbs.
Completely agree that git is leagues ahead, but svn is usable in the sense I could use ed as my ide if I wished, while nowhere near as useful it is still functional, for a certain need.
We purely use subversion at work (after years of trying to convince them, prior to this all work was done on live server, by multiple devs at once). I've switched to git for most personal projects. Of all the benefits and downsides of git vs svn, I just feel more comfortable in a distributed VCS workflow. My home directory is still subversion though. Seems to address the problem better for wanting to keep my different home directories in sync. Don't want to login to a server and go "oh crap, I forgot to push my changes from other computer".
I've tried converting work to git but it just isn't going to happen. I just snicker when something happens in subversion that I know git can do easier.
(rant)To revert changes in subversion you do a non-intuitive reverse merge? With git it's... git revert. Anytime someone needs to revert a commit in subversion they ask me and I always have to look up the syntax because it doesn't make sense(/rant)
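For what it's worth, both tools end up doing the same basic thing here: applying the inverse of an earlier change as a new change. A minimal Python sketch of that idea follows; the change format (path mapped to an (old, new) pair, None meaning "file absent") and the function names are made up for illustration and are not how either tool represents history.

```python
def apply(tree, change):
    """Return a new tree with the change applied.

    tree maps path -> content; change maps path -> (old, new), where None
    means the file does not exist. Both shapes are hypothetical.
    """
    result = dict(tree)
    for path, (old, new) in change.items():
        if result.get(path) != old:
            raise ValueError(f"conflict on {path}")
        if new is None:
            result.pop(path, None)   # the change deletes the file
        else:
            result[path] = new       # the change adds or modifies the file
    return result


def invert(change):
    """Swap old and new so that applying the result undoes the change."""
    return {path: (new, old) for path, (old, new) in change.items()}


base = {"a.txt": "v1"}
bad_commit = {"a.txt": ("v1", "v2"), "b.txt": (None, "new")}

after = apply(base, bad_commit)
reverted = apply(after, invert(bad_commit))
print(reverted == base)
```

The history keeps both the bad change and its inverse, which is why a revert is recorded as a new commit rather than as an erased one.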
Ooh, PlasticSCM is free (as in beer, but not as in liberty) for up to fifteen users for a lifetime! Git is free. (period). Why do we even put these on the same chart?
It's extremely useful even for a single developer. You can still use DVCS systems as if they were centralized, without any inconvenience.
For a one-man team, wouldn't centralized and distributed VCS kind of merge, if you'll allow me a version control pun?
I am not a crackpot.
Nope. Distributed systems are ALWAYS more useful than centralized ones for source code control.
Only sometimes their advantages are not that significant.
I'd have to disagree. There are many instances where DVCS are always superior. However, there are times - such as in corporate environments - where you simply do not want that kind of information floating around the organization. In those instances a Centralized VCS is superior as the main advantage (distributed version control) is in those situations the biggest disadvantage of a DVCS.
Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
This Plastic SCM thing claims to do that... I just downloaded and will see.
Their description of Subversion is almost blatantly wrong, and misses much of the improvements SVN brought about. It would have helped them to at least have read some of the Subversion Documentation - or even just the chapter on Subversion's Delta Editor in the book Beautiful Code.
The false sense of security you get with VSS is worse than having nothing at all.
Tell that to all the idiots who blame Windows XP users for using an OS from 2001.
Spaces not tabs. How can you even pose that question?
Try gitolite. Good stuff.
You are not the customer.
It's a nice timeline of some key milestones but it's worth noting that they're advertising something, it'd be nice if that had been clearer from the article.
Also, I was disappointed not to see GNU arch / tla get a mention as I think they might have been first to decentralised operation. They were most certainly one of the first and as such I suspect they had a certain amount of influence on those that followed, even though the user experience was reputed to be lacking from what I heard (actually, I thought that bzr evolved out of it too, so it may also have a more direct connection with the modern-day main players)
Fraud Alert: This Slashdot story was written to make a new commercial version control system, Plastic, seem as though it is the best, in my opinion.
Was a Slashdot editor paid to run this story? Is it Slashdot company policy to allow sneaky advertising???
You can judge a company's products by its morals: Oracle, Microsoft (huge hassles with products being unfinished), AOL (misleading accounting), and Enron (misleading accounting) are examples that come to mind.
A couple of months ago? As mentioned in another post, I'm no source control fanatic. However, I think someone dictating VSS use without legacy reasons borders on negligent. "Negligent" as in the shareholders get to sue, and the executive team fires your ass when the inevitable, unrecoverable, and well-documented database corruption eats your company's IP. (Yeah, I know you could pull the backup tapes, but that would disrupt an otherwise good rant.) People knew VSS sucked even before Microsoft bought it, there's certainly no reason to use it now.
I've started using Mercurial, and I LOVE it! Its lightweight nature means I can make ANY directory a repo, and keep track of changes within that directory, using nothing more than that directory. This makes it feasibly easy to use source control in places I never would have with SVN - such as admin scripts buried in /usr/local, or all the system settings in /etc...
It has solved problems I never thought truly possible to solve instantly, easily, and accurately.
I have no problem with your religion until you decide it's reason to deprive others of the truth.
I'd like to take this opportunity to push my very own brand new project for a VCS for large binary files. Most VCSs so far have been developed for code. Nothing wrong with that, but there are many other kinds of data that also deserve a safe, version-controlled repository they can call home. My tool is named "boar", it's written in Python and I need users and help to make it perfect.
To quote the project page: "Boar aims to be the perfect way to make sure your most important digital information, like pictures, movies and documents, are stored safely. If you are familiar with vcs software such as Subversion, you might think of boar as "version control for large binary files". But keep reading, because there is more to it..."
Check it out! (pun intended)
http://code.google.com/p/boar/
Couldn't agree more! I can't begin to tell you how much data I've lost with VSS. The worst thing is, you typically don't know or understand how completely boned you are until you actually attempt to check out your source.
Users who willingly use VSS have likely experienced some sort of de-evolution. I wouldn't be surprised if their knuckles actually drag the ground.
The better way to look at a lot of the SCM systems boils down to:
- How technical are your users?
- Do you want something centralized or decentralized or a mix?
- What tools do you have and do they play nicely with the SCM?
- Does the SCM play nicely in your environment?
- Is the product worth the licensing cost (vs a free solution)?
For instance, SVN is definitely better than CVS, but it's centralized, which has some advantages and disadvantages. It has very nice tools (TortoiseSVN, FSVS) and is easy for end-users to wrap their heads around. Merging works and is undergoing constant improvement, but may not be suitable to all styles of development.
For our particular shop, SVN simply works. Couple that with being able to use FSVS to version-control our servers (mostly for tracking changes to the file system), and I'm happy enough that it's not worth moving. (Considering our prior SCM was SourceSafe on top of VSS, nearly any SCM was a better choice. SVN was the natural upgrade path back in the 2004-2006 timeframe. They were there, the tools were ready, and it played nicely with our environment.)
If we needed decentralized repositories, then we'd go look at git, Mercurial or one of the others.
At the end of the day, it's more important that you use at least some sort of SCM, rather than which SCM you use.
Wolde you bothe eate your cake, and have your cake?
That's a lie.
A commit in a non-DVCS is much more than a commit in a DVCS.
With DVCS, they rename the fast, easy part to "commit." The actual merging of your repository with another across a network is one of those items they never discuss, as it is to be performed later. Benchmark a Commit and a Push in a DVCS against a Commit in a non-DVCS, and you'll start gaining my respect as a person who's not fudging the numbers.
If you are a single person, why don't you have the SVN repository in your working environment? You know that you can use "svnadmin create" to create a file based repository without the need for a SVN server. You just checkout of the file:/// URL and you never have to worry about the network being down.
Mercurial's SVN re-education page does a good job of describing the advantages.
Or maybe the fact that svn takes over 10 minutes to bring up log files for us off the server with a repo of about 30 "websites" each with 250+ files and a total of ~3000 revisions across all of it. Compare that to git where git log runs (like everything else) locally, so I'm not killing the same server everyone else is using. Oh, and its results are back in less than 15 seconds most of the time. I used to use svn, and resisted git as it was so CLI oriented. (And yes, I did learn the CLI commands.) But when TortoiseGit came out, that really lowered the bar and we moved - there's no going back.
Define sqrt(x) as something really evil like (x / rand()), and bury it deep. Watch your coworkers go nuts.
Yeah, repository corruption was a silent killer. Not to mention the whole "you must give write access to the entire repository directory" to VSS clients. (Note: I haven't used VSS since about 2004.)
However, it was actually functional if you put it behind a SourceOffSite server and only used the SOS client to talk to the SOS server (which then talked to the local VSS repository directory on the disk). The SOS server, since it was on the same machine, was able to isolate the VSS repository from a lot of the stupidities that would cause corruption. Having a network error between the SOS client and SOS server no longer caused corruption issues in the VSS repository.
Ultimately, I got tired of paying the licensing costs and having to track licenses for SOS, so we switched away to SVN back in '04 or '05.
Awesome site! That was exactly what I needed. Even down to the point where I admitted that I've been braindamaged by a lifetime of using CVS.
If I have been able to see further than others, it is because I bought a pair of binoculars.
Because I work on multiple machines, and would frequently do work on the bus (meaning I'd have to somehow sync later with the home boxes) when I worked a regular day job. If SVN could be made to fit that smoothly, I didn't see it. Admittedly, it could have been my ignorance. But after reading up on that "git" these kids today are using, I thought it was worth at least a look (along with Hg).
Git solved my problems out of the box. I now have other reasons to like git (or just DVCS in general), and the more I use it the more I'm glad I ditched SVN.
If only PHBs had a clue. We're forced to use Visual Source Safe at work. I would claim it a legacy system but they just put it in a couple of months ago. I think any version control is better than nothing, but I'm not sure Visual Source Safe beats the file system's snapshots that are automatically created.
For the love of god, if you have to use MS software, show the PHB Team Foundation Server. It does a lot more than source control and is superior in every way to VSS. If you have MSDN then it might even be free (beer).
"I'd have to disagree. There are many instances where DVCS are always superior. However, there are times - such as in corporate environments - where you simply do not want that kind of information floating around the organization."
That's just an empty string of words. What do you mean by 'not floating around'? Access controls? So use them on per-repository level, duh.
Information leaks? Developers can just use working copies for that.
1) Local history. It's much faster than in SVN.
2) Fast operations - git is faster, though in this case it's not that much important.
3) Ability to work offline and to have private branches.
And 4) git is not worse than SVN.
Benchmark a Commit and a Push in a DVCS against a Commit in a non-DVCS, and you'll start gaining my respect as a person who's not fudging the numbers.
As part of my workflow, what I actually want to get done is a commit to record history many times per day, and I only sync once a day or so -- so why should the DVCS scores for a common action be crippled just because the non-DVCS forces you to do extra work?
It's not comparing apples to oranges, it's comparing an apple to an apple plus an orange -- I'm only interested in apples, so the first one does deserve to win for being less wasteful
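The disagreement in this subthread is really about which operations you count. A back-of-the-envelope Python sketch, where every number is a placeholder invented for illustration (not a benchmark of any tool) and the function name is hypothetical:

```python
def daily_vcs_seconds(commits_per_day, commit_seconds,
                      syncs_per_day=0, sync_seconds=0.0):
    """Total seconds per day spent waiting on version control."""
    return commits_per_day * commit_seconds + syncs_per_day * sync_seconds


# Placeholder timings, NOT measurements: a networked commit is assumed to
# take 5 s, a local commit 0.5 s, and a push/sync 10 s.
centralized = daily_vcs_seconds(20, 5.0)
sync_once = daily_vcs_seconds(20, 0.5, syncs_per_day=1, sync_seconds=10.0)
sync_every_commit = daily_vcs_seconds(20, 0.5, syncs_per_day=20, sync_seconds=10.0)

print(centralized, sync_once, sync_every_commit)
```

With these made-up numbers the distributed workflow wins if you sync once a day and loses if you push after every single commit, so any benchmark needs to state the sync frequency it assumes.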
I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
useless for others e.g. a three-person development team making an rpc-xml insurance claim submitter.
Speaking as half of a two-man team, I've found that switching from subversion to git decimated the amount of VCS-related fuss we have to put up with -- we can now both make many small commits all day long (small patches = easier to trace which patch introduced a bug), with merge conflicts only coming up at agreed merge times when we are both available to discuss conflicts.
Plus, even if you use git in centralised mode, it's simply better as a bit of software :P (local copy of history = faster browsing; the ability to automatically run through a set of commits testing each to see where a bug started; nice bits of polish like coloured output and automatically piping through less when appropriate; better branching and merging; etc etc etc)
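The "run through a set of commits testing each to see where a bug started" feature is, at heart, a binary search over history. A minimal Python sketch of the idea, assuming a linear history ordered oldest to newest where the bug, once introduced, stays in every later commit; the function name, commit labels and check are hypothetical, and this is not git's actual implementation.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str],
                     is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the last commit is bad,
    and every commit after the first bad one is also bad.
    """
    if not commits:
        raise ValueError("need at least one commit")
    lo, hi = 0, len(commits) - 1   # invariant: commits[hi] is known bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid               # bug was introduced at mid or earlier
        else:
            lo = mid + 1           # bug was introduced after mid
    return commits[lo]


# Made-up history: the bug appears in "c5" and persists afterwards.
history = [f"c{i}" for i in range(1, 9)]
tested = []


def check(commit: str) -> bool:
    tested.append(commit)          # record which commits had to be tested
    return int(commit[1:]) >= 5


culprit = first_bad_commit(history, check)
print(culprit, "found after testing", len(tested), "of", len(history), "commits")
```

Because each test halves the remaining range, the number of builds you have to run grows with the logarithm of the history length, which is what makes this practical when every commit is available locally.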
"I'd have to disagree. There are many instances were DVCS are always superior. However, there are times - such as in corporate environments - where you simply do not want that kind of information floating around the organization."
That's just an empty string of words. What do you mean by 'not floating around'? Access controls? So use them at the per-repository level, duh.
Information leaks? Developers can just use working copies for that.
With DVCS systems, like git, every node in the system hosts the whole repository. Aside from the initial access to the system, there's not much you can do in terms of access controls - that's the nature of distributed systems where the information exists in the "cloud" and there is no centralized location.
You can really only do access controls when you control the whole system - thus you require a centralized system instead of a decentralized system.
Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
Git would still be much faster for almost every operation than Subversion.
We have a small team but switching to Git from Subversion meant that check outs actually took measurably less time despite being a full copy of the repository rather than just the most recent revision.
Admittedly Subversion is probably a particularly bad example of non-DVCS but still...
So what? The push is done unattended when I'm finished and go do something else, not while I'm actually programming, so it doesn't waste my time.
A commit in a DVCS does what needs to be done. Non-DVCS wastes my time doing useless stuff in each commit.
Dilbert RSS feed
Yes but Git is still much much better. :)
So?
Just split data into separate repositories. Problem solved.
If SVN could be made to fit that smoothly, I didn't see it. Admittedly, it could have been my ignorance.
Aside from putting the repo on your laptop and using it as a server which you connect to from your desktops, and being screwed if you leave your laptop behind, it wasn't just your ignorance.
The closest to what you want that you get is a tool called svk, which is a (heavy) wrapper around Svn that sort of turns it into a DVCS. By the time you're messing with that you might as well just be using Git if you have the choice.
Personally, I thought it was complete overkill for the two-man project we were working on.
With git, setting up a new project is
cd myproj
git init .
git add .
git commit -m "initial commit"
and you're off--no setting up a server. And if you want to share it with one or two people, ssh will do.
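To make the sharing step concrete, here is a rough sketch. It uses a bare repository in a local directory as a stand-in for the shared copy so it can run anywhere; over ssh you would create the bare repository on the other machine and use an address like user@host:path/shared.git instead (that address is hypothetical, as are the directory names).

```shell
#!/bin/sh
# Sketch: publish a project to a shared bare repository and clone it elsewhere.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# An existing project with one commit, as in the steps above.
mkdir "$work/myproj"
cd "$work/myproj"
git init -q
echo "hello" > README
git add .
git commit -q -m "initial commit"

# The shared copy: a bare repository (no working files, just history).
# A local path stands in for an ssh address here.
git init -q --bare "$work/shared.git"
git remote add origin "$work/shared.git"
branch=$(git rev-parse --abbrev-ref HEAD)
git push -q origin "$branch"

# A colleague clones the shared copy and gets the full history.
git clone -q "$work/shared.git" "$work/colleague"
cloned_subject=$(git -C "$work/colleague" log -1 --format=%s)
echo "colleague sees: $cloned_subject"
```

The only thing the other machine needs is git and an ssh login; there is no separate server process to configure.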
I suspect bzr is similar.
So at least for me the low-overhead setup makes them more attractive for small projects too.
I'll admit they require some initial investment to learn, but it's very much worth it.
Our legacy system was a "hey only one person can write files to this directory but you can copy the files to this directory for developmenstuction" setup. When I mentioned source control I was looked at like I was the new kid telling them what to do because they had no problems with their current system (in their minds). Rough stuff.
Bullshit. The three developers in the (real) example I gave must always work inside the company on the local network. Name one advantage of a distributed system for them.
Simpler setup, no need to configure a new network service just to get started, much faster history browsing (even on a fast local network), ability to try out lines of development privately without giving up completely on version control for them, ability to keep working when the server goes down, simpler backup of the history (just backup any repo), ....
I've always heard of git as a distributed version control system, and as such I've been ignoring it. Ie, all my coworkers are in the office usually, not distantly separated and loosely connected kernel devs. Examples I've seen of where people love git tend to be nearly exclusively open source projects composed of distributed developers.
So where does having a local repository (and the huge space this entails) actually help the person who's sitting in a cube with a fast lan to the server? Almost never have I wanted to have an entire repository local, having 3 or 4 branches is great, and then only a subset of the full directory tree, and very rarely I'll do a diff against an archaic branch (and I can afford the 1 second wait that takes).
So what does it do for a corporate user that SVN does not do? For background, I'm still with CVS at work, and we're debating about migrating to either SVN or Perforce (I like perforce, and it's what the rest of the company uses, but one person has vetoed it for some reason). I know why Linus hates it, and his reasons are correct for his uses but it's not necessarily true for all.
I love change sets; I often have multiple ongoing tasks in the same source directory, or the same task that I want to split into multiple commits. CVS has zero support for this. Perforce has full support. SVN lets me group into change-lists, but none of this is remembered at the back end from what I can see. Git has this at the back end so you can do things like cherry pick, but you can't group your working files into different sets and then say "commit this set only". So for me neither is better than Perforce for work flow. But that's just me; I'm certain other people see the distributed stuff as the must-have feature.
But many distributed systems I've seen are not supersets of centralized systems; they have different feature sets. For example, if you had a distributed version control system that could not do branching, it would most definitely not be more useful than the centralized system. I'm sure you can build a crappy distributed CVS system that people would hate to use and which would be inferior to SVN in almost all ways except one.
FOR YOU.
I think CVS is better, since I don't have to deal with the big-freaking URLs all of the time. I wish they truly had made a (much-closer-to) superset of CVS. Doing integration with CVS is much easier than with subversion, since I can bump the version and do the tag right BEFORE I submit it, not at the beginning of the integration process.. so I have to do everything basically 'backwards' in subversion.
Before OS X, Macs had a bit of trouble with version control. No command-line, see. In fact, I only knew of two that were reasonably-priced or free: MacCVS and Voodoo. I never got a chance to use Voodoo, but its approach and feature-set looked cool. Can anyone comment on how well it worked? Or what version-control system you used on classic Macs?
i'd hit it so hard, if you pulled me out you'd be the king of britain [bash.org]
My first experience with source safe was: Install on NT server, put all my code in, make a few check ins, can't get anything back out because it broke.
In two weeks, the fucking thing broke and stole all of my changes in a proprietary something or other. It scared me off of scm for... 8 years.
Now, I use subversion at home, because there is no branching or merging, it's just my stuff. I do a hotcopy, then apply Par2 to the copy, and burn to a CD, whenever I remember to do so. TortoiseSVN and par2, that's what I trust. And CD burning, I suppose, although not entirely. I no longer fear making big changes to stuff I've written, I just verify my previous CD with quickpar or whatever it is and then go ahead. Database scripts, code, VBS quickies, Greasemonkey scripts, it's all safe.
Work code is different, of course, we have rules and things, but if you're one person working on something on Windows it's hard to beat the Tortoise.
The Linux model is totally different, and I agree with Linus that svn is not appropriate for Linux, but I'm one guy and I have no problem manually comparing every little thing, no automated branching or merging here.
Actually ENVY Developer was waaay more sophisticated than any of the tools presented here. Back in the early 90s you could have distributed development, anybody could make their own branches of the code, and you could pick and choose what version of each class you wanted to build with. Also it eschewed flat files and instead used a more powerful transactional database store.
The article clearly has a bias to text-only solutions.
In theory, I completely agree. A DVCS can do everything a centralized one can do and more. In practice however things look a little different. All Open Source DVCS currently still lack proper support for narrow, shallow and sparse checkouts, thus you can't download only a single directory, you have to always download the whole repository. Which makes some basic workflows that you get with SVN impossible or at least highly impractical to implement with a DVCS.
Git will be faster and (much) more reliable.
CVS can lose your history without you noticing, Git won't do that. I've had subversion repositories become corrupted. On the other hand a) if a Git repository is corrupted then you know and b) in a worst case you can restore from one of the clones (as every developer has a clone of the central repository).
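The "then you know" part can be tried out with git fsck, which re-checks every stored object against its checksum. The sketch below deliberately damages one object in a throwaway repository; the file name notes.txt is made up, and overwriting the object file by hand is just a crude stand-in for real disk corruption.

```shell
#!/bin/sh
# Sketch: git verifies object checksums, so damage is reported rather than silent.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "important history" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# A healthy repository passes the integrity check (exit status 0).
if git fsck --full >/dev/null 2>&1; then before=clean; else before=damaged; fi

# Simulate disk damage by overwriting the stored copy of notes.txt.
# Loose objects live under .git/objects/<first 2 hex chars>/<rest>.
blob=$(git rev-parse HEAD:notes.txt)
dir=$(printf '%s' "$blob" | cut -c1-2)
file=$(printf '%s' "$blob" | cut -c3-)
chmod u+w ".git/objects/$dir/$file"
echo "garbage" > ".git/objects/$dir/$file"

# The same check now fails, so the damage does not go unnoticed.
if git fsck --full >/dev/null 2>&1; then after=clean; else after=damaged; fi
echo "before: $before, after: $after"
```

Recovery is then a matter of cloning again from any intact copy, as described above.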
Thinker: v. to tinker while cogitating.
I like it! And I'll use it. (If refudiate is good enough for OED, thinker's great, too.)
I'm not a lawyer, but I play one on the Internet. Blog
And another point: "the age of distributed version control"?
I work in an office. I have a gigabit network between my workstation and the version control server, which runs on a RAID array significantly faster than the disks under my desk. The connection is always on, always works, and is so fast I don't notice it. In what way could I possibly benefit from a distributed system? And why would I use a distributed system when every one I've ever tried requires a two-step approach to sending my changes to the other developers (synchronize my working copy with the local version control, push changes from local to the rest of the team) rather than just one (commit changes)?
Git would still be much faster for almost every operation than Subversion.
My main problem with git (as a non-expert user who tried it for about an hour, once, so I'll readily admit I could be wrong about this) is that it seems that it changes sharing my changes with other developers from a two-step process (dev1$ svn commit; dev2$ svn update) to a three step process (either dev1$ git commit; manually inform dev2 that they need to pull changes from dev1; dev2$ git pull dev1, or dev1$ git commit; dev1$ git push; dev2$ git pull centralserver). While this may not sound so bad in a situation with just 2 developers, it seems to get linearly more complicated with each developer added.
Now, if you have somebody whose role in the project is to review other developers' changesets and then decide whether or not to include them (i.e. the situation of most open source projects, and what git was specifically designed to be good at) this problem disappears; everyone simply submits their changes and pulls from a single source and there's no issue, but that isn't how most professional development teams work. We trust each other to make changes. We automatically incorporate each others' changes without oversight. We don't want to have to manually pull in all the other developers' work all the time. We just want to perform a single merge of all the recent changes before we perform a final run of the tests and then commit.
Now, I might be wrong, git might support a good way of doing this. But I don't see it mentioned in any of the tutorials I read, and it isn't obvious from the manual pages, or anything like that.
Yes, I'll happily accept that each individual operation is faster than using svn. And if I could find a sensible way of working with it that didn't involve more developer work (even if it is just clicking a sequence of GUI buttons where once I only had to click one of them), and hence wasted more of my time than it saves, I'd love to switch. But it seems to be more complicated to use, and therefore not worth the effort (at least for my team here).
Simpler setup
Really? When I tried to set up git for a three-developer situation earlier this year, I came away puzzled and not entirely sure I'd understood it right. Sure, for a single developer it might be easier, but setting up push/pull aliases using egit (the integration for eclipse, the IDE we use here) involved a number of obscure dialog boxes with options that I don't understand and weren't explained in any of the tutorials I read. Whereas with subversion, all we need to do is use svnadmin to create a repository on the server, get one team member to start a new project and share it to the appropriate URL and the others then import from the same URL.
no need to configure a new network service just to get started
SVN tunnels over SSH, which is enabled by default on every Linux install I've used in the last 15 years.
much faster history browsing (even on a fast local network)
I'll grant this is true, but as getting the history of an entire project on my server here takes less than 3 seconds, and it's something I do at most once a week, I don't really see the issue.
ability to try out lines of development privately without giving up completely on version control for them
I don't really understand the point here. What's wrong with just creating a branch and using that?
ability to keep working when the server goes down
While this is true, at least where I am, one of the dev team would be responsible for fixing the server anyway, and the others would probably not be able to achieve an awful lot because the same server hosts the bug tracker, documentation wiki, task list, email server, and basically all the other tools we use for the work we do. Not to mention the virtual server we deploy to for interactive testing. And even with SVN, they'd still be able to continue working until they next needed to commit, i.e. they can finish what they're currently working on.
simpler backup of the history (just backup any repo),
As opposed to backing up the central repo, which is guaranteed to contain all of the history, rather than potentially having some of it missing if people are working on private branches?
This! I'm using git-svn at work purely because I want branches that can actually be merged back without sacrificing a few barnyard animals at a satanic altar.
Branching, directories, merging...
Some things are just broken in CVS...
I love change sets; I often have multiple ongoing tasks in the same source directory, or the same task that I want to split into multiple commits. CVS has zero support for this. Perforce has full support. SVN lets me group into change-lists, but none of this is remembered at the back end from what I can see. Git has this at the back end so you can do things like cherry pick, but you can't group your working files into different sets and then say "commit this set only".
Actually, yes you can. Selecting which files to commit is actually even part of the normal workflow.
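A small sketch of what that looks like, assuming two unrelated edits sitting in the same working directory (the file names and commit messages are invented for the example):

```shell
#!/bin/sh
# Sketch: commit one of two pending changes and leave the other uncommitted.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "base" > parser.c
echo "base" > menu.c
git add .
git commit -q -m "initial commit"

# Two unrelated tasks in progress in the same directory.
echo "bug fix" >> parser.c
echo "half-finished work" >> menu.c

# Stage only the file that belongs to the finished task, then commit it.
git add parser.c
git commit -q -m "Fix parser bug"

# The commit contains parser.c only; menu.c is still a pending change.
committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
pending=$(git status --porcelain)
echo "committed: $committed"
echo "pending:  $pending"
```

For two changes inside the same file, git add -p lets you stage individual hunks instead of whole files.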
-rozzin.
I've always heard of git as a distributed version control system, and as such I've been ignoring it. Ie, all my coworkers are in the office usually, not distantly separated and loosely connected kernel devs. Examples I've seen of where people love git tend to be nearly exclusively open source projects composed of distributed developers.
So where does having a local repository (and the huge space this entails) actually help the person who's sitting in a cube with a fast lan to the server?
If you actually compare the stats for space used by different systems, you'll find that, between all of the different DVCS tools (Git, Bazaar, Mercurial, etc.) and Subversion, Subversion checkouts actually tend to use the most space: yes, a Subversion checkout without a local repository--which means that it only gives you immediate access to one version, and doesn't allow you to do annotations or anything else without round-tripping through the server--actually tends to use more space than a local DVCS repository that stores all of the history and allows you to batch and group commits, do fast annotates, deal with merge-conflicts more easily, be immune to server/network reliability issues, etc.
Try a comparison--do a svn checkout of some project, then import the project into the DVCS of your choice and compare the space used by each. I usually use Bazaar with bzr-svn for this, with "bzr branch " to import just the trunk, or with "bzr svn-import " to import all of the branches. Bazaar is the DVCS that everyone wails on for `using more space than Git', so I was initially hesitant to use it; but then I realised that it still uses only half as much space as a checkout from Subversion.
-rozzin.
>I work in an office. I have a gigabit network between my workstation and the version control server, which runs on a RAID array significantly faster than
>the disks under my desk. The connection is always on, always works, and is so fast I don't notice it. In what way could I possibly benefit from a
>distributed system?
If this is your work environment, quite little. In practise, one of these things tends to happen: need to work *not* in the same office, team size increases (server load increases), server goes down. In all of them, the fast server isn't so fast anymore.
>And why would I use a distributed system when every one I've ever tried requires a two-step approach to sending my changes to the other developers
>(synchronize my working copy with the local version control, push changes from local to the rest of the team) rather than just one (commit changes)?
The push can be automated if you want. But usually you don't necessarily want that (and having the possibility is an advantage).
True, but selecting the files is a manual process. You can't just say something like "git commit bugfix-23" as far as I know.
I believe the boring parts are exactly what this discussion needs. There are enough people saying "X works for me".
I read your follow-up comment about developing on two separate machines. Did this simply get solved by using GitHub? If there had been a third-party SVNHub site would that have fixed your limitations as well?
I have. It's better than gitosis, but not good enough to really make large businesses notice.
We use Gerrit and contribute back improvements and fixes since we get the functionality we need and the developer time in fixing Gerrit is less than the cost of buying commercial SCM software.
I was trying to keep the reply constrained to arguing against parent Troll Boy's ridiculous premise. But since you asked...
Part of the problem was solved with GitHub (which could have been done with my own external-facing machine), and later Unfuddle. But to solve the problem of keeping the machines synchronized, something external-facing wasn't needed at all. I could have just pushed/pulled to the other machine(s) when I got home. One of the differences with git (or other DVCS) is that each machine has a copy of the repository. The end result is that you can push/pull the changes to any other repository that resides on another machine. As far as I can tell with SVN, there is the repository. As someone else pointed out, the SVN repository could be kept on the laptop, but it would have to be running when I'm on the desktop and screwed if the laptop isn't around.
Which brings us to the second part of how a DVCS solved my problem. Something like SVNHub does exist in the form of Unfuddle (I use them for git repositories, and would recommend them). But Unfuddle's SVN support wouldn't fix the problem of working on the bus. Back before my own thing turned into my day job, I was in the mode of "start a side company nights and weekends" outside of my day job. I'd fix bugs on my 45 minute bus ride. In bug-fix mode, I could knock out a few in that 90 minute round trip. Here are the modes for SVN (as I understand SVN) and git:
SVN: fix three bugs, two of which I fixed over eight hours ago. Get home, go to check in. Check-in comment is something like: "three bug fixes. Null pointer dereference, menu fix and one that I can't remember because I've worked a full day since fixing it". All fixes are crammed into one big check-in. If it breaks, back all of it out. No one-to-one relationship between check-ins and bugs (or features, or what have you).
git: fix those same three bugs in the same time frame. Do a commit (SVN's "check-in") after each bug fix, with descriptive message. I can do this because I have a local copy of the repository and therefore can do granular commits even on the bus. Get home, push to Unfuddle or a local machine and build. If it breaks, back out the one bad commit/check-in.
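The git mode can be sketched roughly like this. The three files and commit messages are invented, and each fix touches its own file so that backing one out is clean; everything here happens in the local repository, with the push left for later.

```shell
#!/bin/sh
# Sketch: one commit per bug fix, then back out only the fix that broke things.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
for f in parser.c menu.c pager.c; do echo "base" > "$f"; done
git add .
git commit -q -m "initial commit"

# Three fixes on the bus, each recorded separately with its own message.
echo "fixed" > parser.c
git commit -q -a -m "Fix null pointer dereference"
echo "fixed" > menu.c
git commit -q -a -m "Fix menu ordering"
echo "fixed" > pager.c
git commit -q -a -m "Fix pager off-by-one"

# Back home the build breaks because of the menu fix (second-to-last commit).
# Revert just that one; the other two fixes stay in place.
git revert --no-edit HEAD~1 >/dev/null

count=$(git rev-list --count HEAD)
echo "commits in history: $count"
echo "menu.c is back to: $(cat menu.c)"
```

The revert is itself a new commit, so the history still shows what was tried and why it was backed out.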
Okay, so I'm working out of an office in my house. No more bus rides (or at least not as many), would I still recommend git for the single developer? Yup. Anybody that's worked at home knows you sometimes have to get out of the house. I pick up the laptop (after pulling changes) and go to the coffee shop. Wife takes the laptop someplace, I use the desktop. Point is, I never worry about where the repository lives, because it lives on every machine. For me, that simplifies things a lot.
Not directly related to the question, but git's branching and merging are trivial compared to SVN, at least to me. Working on a feature, create a local branch, get it working, merge to mainline and push it out. OTOH, if I went completely the wrong direction, delete the branch and start over. All of it is one or two entries on the command line, and I don't give a second thought to creating a branch for the most trivial things (bug fixes, for instance). SVN just didn't seem as easy for the same tasks. It wasn't why I started using git, but it's one of the things keeping me using it.
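For reference, the branch workflow described here comes down to a handful of commands. This is only a sketch with invented branch and file names ("feature", "dead-end"); the mainline branch name is read from the repository because the default differs between git versions.

```shell
#!/bin/sh
# Sketch: create a local branch, merge it to mainline, and throw away a dead end.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "base" > app.txt
git add .
git commit -q -m "initial commit"
main=$(git rev-parse --abbrev-ref HEAD)

# Work on a feature in its own local branch.
git checkout -q -b feature
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# It works: merge to mainline and delete the branch.
git checkout -q "$main"
git merge -q feature
git branch -q -d feature

# Another attempt goes the wrong direction: delete it without merging.
git checkout -q -b dead-end
echo "bad idea" > bad.txt
git add bad.txt
git commit -q -m "Experiment that did not pan out"
git checkout -q "$main"
git branch -q -D dead-end

branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
echo "remaining branches: $branches"
```

Note that -d refuses to delete a branch that has not been merged, while -D deletes it regardless, which is what you want for the abandoned experiment.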
Finally (and I thank you for even reading this far), why Unfuddle vs. GitHub? IIRC, Unfuddle gives more bang for the buck (as in actual US bucks; I use private repositories for the commercial stuff). But it was mainly because of better bug tracking and tying bugs to commits/check-ins. There's some lightweight project tracking, too, that's sufficient for my needs. Some may scoff that it's overkill for a one-man team. But I'm old and forgetful, and tracking customer bugs via email (and internal stuff via PostIt) isn't going to work. I also don't plan to remain a one-man team forever.
Sure, those are all pretty reasonable comments. I'll describe two ways of working with Git that I think address this somewhat.
I guess first up I would say that Git doesn't really push you towards a specific way of working so if there is a particular way that you want to do things then chances are you can do that with Git (that said there are definitely ways to do things that would be less worthwhile than others).
We switched to Git from Subversion where I work about 6 months ago and it's worked extremely well for us. We have a small team but it is spread across two or three countries (some of us move around a bit). We use a central model where there is a single 'authoritative' repository which we all push to and pull from. So I don't share my repository directly with the other developers; we all push to and pull from the same central one (which we integrate with Hudson for CI).
So really we're not using the distributed part for the sake of sharing code directly between us, but because having access to the full repository (and did I mention that Git is fast?) allows us to do stuff we couldn't before. Once you've done the initial clone Git is very very fast (and things like svn log are much much faster with git because you don't have to do any network operations). Essentially we still have a central repository that we update from and commit to, similarly to SVN; it's just that we can continue committing without a network connection and split our commits into sensible chunks of work that go into the central repository etc.
Now, in terms of doing something really distributed, you can do that too, more easily than you described (though I've not really done it in any seriousness so I'm not sure whether it is worth setting up). If you give me access to your repository, I give you access to mine and we both have access to a third repository, we can set up our own repositories to pull from both the others - so it is not so much more difficult to fetch the work of the other repositories into our own working tree. There is a tiny bit of extra work here - generally Git commands are a bit more fine-grained than something like Subversion so you do end up with more commands. For my use (and for my team) I've written some simple scripts that tie some stuff together (so I run 'git up', which simulates 'svn up' or 'cvs up'; you could easily do the same for fetching from multiple repositories if you wanted ... ).
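As a rough illustration of the "pull from both the others" setup (this is not the 'git up' script mentioned above, which isn't shown here), each of the other repositories can be registered as a named remote and fetched in one go. The names alice, bob and mine are made up, and local directories stand in for what would normally be ssh or http addresses.

```shell
#!/bin/sh
# Sketch: one repository fetching from several peers via named remotes.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

root=$(mktemp -d)

# Alice starts the project.
git init -q "$root/alice"
cd "$root/alice"
echo "a" > a.txt
git add .
git commit -q -m "alice: start project"
branch=$(git rev-parse --abbrev-ref HEAD)

# Bob clones it and adds his own work.
git clone -q "$root/alice" "$root/bob"
cd "$root/bob"
echo "b" > b.txt
git add .
git commit -q -m "bob: add b.txt"

# My repository knows about both of them as separate remotes.
git clone -q "$root/alice" "$root/mine"
cd "$root/mine"
git remote rename origin alice
git remote add bob "$root/bob"

# One command fetches from every configured remote.
git fetch -q --all

# Bring Bob's work into my branch (a fast-forward in this case).
git merge -q "bob/$branch"
echo "files in my tree: $(ls | tr '\n' ' ')"
```

Fetching only downloads the others' commits; nothing in your own branch changes until you decide to merge.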
Generally I wouldn't really recommend working that way for a small group. What would make more sense would be if you had two teams working on a product (maybe a team doing support and a team doing a product release); the release team might want to pull changes from the support team as they're finished and make sure that they're incorporated into the next release. Generally I think you'd have one person doing that and pushing it into a central repository for the release team (so each team would have a central repository but you might pull changes from one into the other). All this is far easier with Git than with Subversion (and I've worked in situations in the past where something like what I just described would have saved a lot of heartache and angst between teams).
But generally your point about Git being more complicated to use is absolutely correct and, sadly, when you ask for help you don't always get a nice answer (sometimes it's more along the lines of 'well, the manuals say you shouldn't do that, so you did it anyway, now you're screwed and it serves you right for not understanding the internals of our magically wondrous system'). So... Git *is* awesome, I'm really glad that my company uses it but it does have a very steep learning curve.
In terms of ease of use I quite like the look of Kiln which is some proprietary code wrapped around Mercurial and looks relatively easy to use (in general my impression is that Mercurial is easier to use). That said - Git is free...
I hope that helps I'm certainly not an expert with the finer points of Git but I'm happy to tell you what I know if you're interested (even if you stay with whatever you use if you're more informed then you're still going to make better decisions)....
-mark
Thank you for typing all of this out; it was very informative. The ability with git to commit much more freely without affecting others using the same repository seems like a very big plus. Immediately being able to "save off" the three separate bugs you mentioned is nice, and on long multi-day coding sessions I could see myself "saving off" the results at the end of each day. It's like having a local SVN repository on your machine with a very simple way to push to the repository later on.
At our company we're still working under the old check-out/change/check-in model, and the idea of allowing multiple people to make changes to a file at the same time (let alone merge those changes together when they're both done) would probably terrify many here. So we're still many steps away from "hey let's all use git"! At least now I understand some good reasons why we might consider git.
Sometimes you want to do some experimental work that is complicated enough to be version-controlled, but not stable enough for other developers to see yet.
In SVN you have to use a branch, and SVN branches are not that convenient to use. Otherwise, when another developer checks out the tree or commits his own changes, he will see the unstable changes you have committed.
In Git you can simply commit your experimental changes and push them (or let others pull them) when they are ready.
And if you frequently want to push immediately after commit, just make such a shortcut.
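One way to make such a shortcut is a git alias. The sketch below is an assumption about how you might set it up, not a built-in feature: the alias name "cpush" is invented, and a local bare repository stands in for the team's server.

```shell
#!/bin/sh
# Sketch: a hypothetical "cpush" alias that commits and then pushes in one step.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

root=$(mktemp -d)

# A shared bare repository and a working clone of it.
git init -q --bare "$root/shared.git"
git clone -q "$root/shared.git" "$root/work" 2>/dev/null
cd "$root/work"

# First commit, pushed the usual two-step way; -u records the upstream branch.
echo "one" > file.txt
git add file.txt
git commit -q -m "first change"
git push -q -u origin HEAD

# Define the alias: pass all arguments to commit, then push if that succeeded.
git config alias.cpush '!f() { git commit "$@" && git push; }; f'

# From now on a single command records the change and shares it.
echo "two" >> file.txt
git cpush -q -a -m "second change" 2>/dev/null

remote_subject=$(git -C "$root/shared.git" log -1 --format=%s)
echo "latest commit on the shared repository: $remote_subject"
```

Add --global to the git config line if you want the alias in every repository rather than just this one.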
Plastic does a good job here because it can work in centralized mode and still with much better branching and merging than the non-DVCS.